
What is: ASGD Weight-Dropped LSTM?

Source: Regularizing and Optimizing LSTM Language Models
Year: 2017
Data Source: CC BY-SA - https://paperswithcode.com

ASGD Weight-Dropped LSTM, or AWD-LSTM, is a type of recurrent neural network that employs DropConnect on the hidden-to-hidden weight matrices for regularization, and NT-ASGD (non-monotonically triggered averaged SGD) for optimization, which switches from plain SGD to averaged SGD when validation performance stops improving and returns the average of the weights from the most recent iterations. Additional regularization techniques include variable-length backpropagation sequences, variational dropout, embedding dropout, weight tying, independent embedding and hidden sizes, activation regularization (AR), and temporal activation regularization (TAR).
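
Below is a minimal PyTorch sketch of the weight-drop idea, not the paper's implementation; the class name `WeightDropLSTM` and the weight initialization are illustrative assumptions. The key point it shows is that DropConnect is applied to the hidden-to-hidden weight matrix once per sequence, so every timestep reuses the same thinned recurrent weights rather than dropping activations.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class WeightDropLSTM(nn.Module):
    """Single-layer LSTM with DropConnect on the hidden-to-hidden weights (sketch)."""
    def __init__(self, input_size, hidden_size, weight_dropout=0.5):
        super().__init__()
        self.hidden_size = hidden_size
        self.weight_dropout = weight_dropout
        # Input-to-hidden and hidden-to-hidden weights for the 4 gates (i, f, g, o).
        self.weight_ih = nn.Parameter(torch.randn(4 * hidden_size, input_size) * 0.1)
        self.weight_hh = nn.Parameter(torch.randn(4 * hidden_size, hidden_size) * 0.1)
        self.bias = nn.Parameter(torch.zeros(4 * hidden_size))

    def forward(self, x):
        # x: (batch, seq_len, input_size)
        batch, seq_len, _ = x.shape
        h = x.new_zeros(batch, self.hidden_size)
        c = x.new_zeros(batch, self.hidden_size)
        # DropConnect: zero random elements of the recurrent weight matrix.
        # The mask is drawn once per forward pass, so all timesteps share it.
        w_hh = F.dropout(self.weight_hh, p=self.weight_dropout, training=self.training)
        outputs = []
        for t in range(seq_len):
            gates = F.linear(x[:, t], self.weight_ih, self.bias) + F.linear(h, w_hh)
            i, f, g, o = gates.chunk(4, dim=-1)
            c = torch.sigmoid(f) * c + torch.sigmoid(i) * torch.tanh(g)
            h = torch.sigmoid(o) * torch.tanh(c)
            outputs.append(h)
        return torch.stack(outputs, dim=1), (h, c)

# Example usage (random data, shapes only):
lstm = WeightDropLSTM(input_size=32, hidden_size=64, weight_dropout=0.5)
out, (h, c) = lstm(torch.randn(8, 20, 32))   # out: (8, 20, 64)
```

Because the dropout acts on weights rather than on hidden states, the technique remains compatible with fast fused LSTM kernels in practice: the masked weight matrix can be handed to a standard LSTM implementation once per sequence.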