Understanding LSTM
and
its Diagrams
Shi Yan
-
Slides written by Park JeeHyun
22 DEC 2017
Understanding LSTM and its diagrams
• Theoretically the naively connected neural network,
so called recurrent neural network, can work.
But in practice, it suffers from two problems:
• vanishing gradient and
• exploding gradient, which make it unusable.
• Then later, LSTM (long short term memory) was invented
to solve this issue by explicitly introducing a memory
unit, called the cell into the network.
Understanding LSTM and its diagrams
Understanding LSTM and its diagrams
• Memory pipe
• Forget valve
• New memory valve
• Generate new output
• Output valve
Memory pipe
• An element-wise multiplication
• if you multiply the old memory C_t-1 with a vector that is close to 0, that m
eans you want to forget most of the old memory.
• You let the old memory goes through, if your forget valve equals 1.
• A piece-wise summation
• New memory and the old memory will merge by this operation.
• How much new memory should be added to the old memory is
controlled by another valve, below.
Memory pipe : forget valve
• An element-wise multiplication
• if you multiply the old memory C_t-1 with a vector that is close to 0, that m
eans you want to forget most of the old memory.
• You let the old memory goes through, if your forget valve equals 1.
Memory pipe : new memory valve
• A piece-wise summation
• New memory and the old memory will merge by this operation.
• How much new memory should be added to the old memory is
controlled by another valve, below.
Generate new output : output valve
• This step has an output valve that is controlled by the new
memory, the previous output h_t-1, the input X_t and a bias
vector.
• This valve controls how much new memory should output to
the next LSTM unit.
Annex : LSTM diagram
Annex : LSTM diagram
References
• “Understanding LSTM and its diagrams” by Shi Yan
• https://medium.com/mlreview/understanding-lstm-and-its-
diagrams-37e2f46f1714
• “Understanding LSTM Networks” from colah's blog
• http://colah.github.io/posts/2015-08-Understanding-LSTMs/
Understanding lstm and its diagrams

Understanding lstm and its diagrams

  • 1.
    Understanding LSTM and its Diagrams ShiYan - Slides written by Park JeeHyun 22 DEC 2017
  • 2.
    Understanding LSTM andits diagrams • Theoretically the naively connected neural network, so called recurrent neural network, can work. But in practice, it suffers from two problems: • vanishing gradient and • exploding gradient, which make it unusable. • Then later, LSTM (long short term memory) was invented to solve this issue by explicitly introducing a memory unit, called the cell into the network.
  • 3.
  • 4.
    Understanding LSTM andits diagrams • Memory pipe • Forget valve • New memory valve • Generate new output • Output valve
  • 5.
    Memory pipe • Anelement-wise multiplication • if you multiply the old memory C_t-1 with a vector that is close to 0, that m eans you want to forget most of the old memory. • You let the old memory goes through, if your forget valve equals 1. • A piece-wise summation • New memory and the old memory will merge by this operation. • How much new memory should be added to the old memory is controlled by another valve, below.
  • 6.
    Memory pipe :forget valve • An element-wise multiplication • if you multiply the old memory C_t-1 with a vector that is close to 0, that m eans you want to forget most of the old memory. • You let the old memory goes through, if your forget valve equals 1.
  • 7.
    Memory pipe :new memory valve • A piece-wise summation • New memory and the old memory will merge by this operation. • How much new memory should be added to the old memory is controlled by another valve, below.
  • 8.
    Generate new output: output valve • This step has an output valve that is controlled by the new memory, the previous output h_t-1, the input X_t and a bias vector. • This valve controls how much new memory should output to the next LSTM unit.
  • 9.
    Annex : LSTMdiagram
  • 10.
    Annex : LSTMdiagram
  • 11.
    References • “Understanding LSTMand its diagrams” by Shi Yan • https://medium.com/mlreview/understanding-lstm-and-its- diagrams-37e2f46f1714 • “Understanding LSTM Networks” from colah's blog • http://colah.github.io/posts/2015-08-Understanding-LSTMs/