AICurious Logo

What is: Long Short-Term Memory?

Year1997
Data SourceCC BY-SA - https://paperswithcode.com

An LSTM is a type of recurrent neural network that addresses the vanishing gradient problem in vanilla RNNs through additional cells, input and output gates. Intuitively, vanishing gradients are solved through additional additive components, and forget gate activations, that allow the gradients to flow through the network without vanishing as quickly.

(Image Source here)

(Introduced by Hochreiter and Schmidhuber)