Machine Learning
Recurrent Neural Networks
Created by Andrew Ferlitsch
September, 2017
Recurrent vs Feed Forward
• Feed Forward Network
• Inputs enter the network and progress forward through the layers.
• There is no retention (memory) of past inputs, states, or time.
• Recurrent Neural Network
• May have only one layer.
• Outputs are cycled back in as inputs (a fixed number of cycles).
• The outputs from the previous state are added to the inputs of the next
state, maintaining memory of a fixed number of past states (see the sketch below).
[Diagram: Feed Forward Network – inputs are fed forward through the layers to the outputs.]
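Below is a minimal sketch (not from the slides) of the difference, in NumPy. The weight matrices W and U, the bias b, and the tanh activation are illustrative assumptions: the feed-forward step depends only on the current input, while the recurrent step also adds the previous state back in.

import numpy as np

def feed_forward_step(x, W, b):
    # Feed forward: the output depends only on the current input x.
    return np.tanh(W @ x + b)

def recurrent_step(x, h_prev, W, U, b):
    # Recurrent: the previous state h_prev is cycled back in with the
    # current input x, so the output retains memory of past states.
    return np.tanh(W @ x + U @ h_prev + b)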
Recurrent Neural Network
Outputs from each step are added to the inputs of the next step.
[Diagram: a single layer cycled over states S0 through Sx; the outputs of the current state are retained and added to the inputs of the next state Sx+1.]
This is what is meant by an RNN retaining memory.
Useful in NLP for remembering context, and for other forecasting
problems that depend on time series.
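A minimal sketch (illustrative only; the weights and tanh squashing are assumptions) of how the retained output of one state is added to the inputs of the next state S0, S1, ..., Sx:

import numpy as np

def run_rnn(inputs, W, U, b, h0):
    # Apply one recurrent layer over a sequence of input vectors.
    h = h0                               # state retained from the previous step
    outputs = []
    for x in inputs:                     # states S0, S1, ..., Sx in order
        h = np.tanh(W @ x + U @ h + b)   # previous output added to current input
        outputs.append(h)                # retain this output for the next state
    return outputs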
Time Series
Time Progression in Recurrent Neural Network
[Diagram: the network unrolled over time steps T0 through T3; the outputs for each time Tt are fed in with the inputs for Tt+1.]
View of the neural network unrolled, i.e., as if each cycle were a separate layer.
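A minimal Keras sketch of the unrolled view (the framework choice and layer sizes are assumptions, not from the slides); unroll=True expands the recurrent loop so each time step is computed as if it were a separate layer:

import tensorflow as tf

T, FEATURES = 4, 8                        # e.g., time steps T0 through T3
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(T, FEATURES)),
    # unroll=True expands the loop into one copy of the cell per time step,
    # matching the unrolled view on this slide.
    tf.keras.layers.SimpleRNN(16, activation="tanh", unroll=True),
    tf.keras.layers.Dense(1),
])
model.summary()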
Prediction – Short Term Memory
Making Predictions in Sequenced Data (e.g., NLP)
[Diagram: the inputs at St and the past prediction at St-1 pass through a squashing function (tanh) to give the prediction at St.]
Example: Predict the next word in a sentence (or search query) based on the last word seen.
This is short-term memory because we only remember the last prediction.
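A minimal sketch of this next-word prediction step (the vocabulary projection V, biases, and shapes are illustrative assumptions): the only memory carried forward is the previous state h_prev.

import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def predict_next_word(x_t, h_prev, W, U, V, b, c):
    # Squash the current input and the past prediction state with tanh ...
    h_t = np.tanh(W @ x_t + U @ h_prev + b)
    # ... then score every word in the vocabulary for the next position.
    probs = softmax(V @ h_t + c)
    return probs, h_t    # h_t is all that is remembered for the next step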
Long Short-Term Memory (LSTM)
• Long Short-Term Memory is a type of RNN.
• Adds a layer, typically between the input layer and the first
hidden layer (see the sketch below).
• Retains some memory of past outputs (long) and a means to
forget (short).
[Diagram: the LSTM layer sits between the inputs and the hidden layer(s), which produce the outputs.]
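A minimal Keras sketch of this placement (the framework, vocabulary size, and layer widths are assumptions): the LSTM layer sits between the input (embedding) layer and the first hidden layer.

import tensorflow as tf

VOCAB, SEQ_LEN = 10000, 20
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(SEQ_LEN,), dtype="int32"),
    tf.keras.layers.Embedding(VOCAB, 64),                 # input layer (word vectors)
    tf.keras.layers.LSTM(128),                            # the LSTM layer
    tf.keras.layers.Dense(64, activation="relu"),         # first hidden layer
    tf.keras.layers.Dense(VOCAB, activation="softmax"),   # output layer
])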
LSTM Layer
LSTM Details (i.e., the memory cell)
[Diagram of the memory cell at time Tt:
Xt – inputs for Tt (outputs from the input layer).
ht – calculated output (hidden state) for Tt, passed to the next layer.
ht-1 – previous output from the LSTM layer.
Ct – constant (cell) value carried at time Tt.
Ct-1 – previous constant from the LSTM layer.]
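A sketch of a standard LSTM cell update for reference (the gate weights and biases are assumptions; the slide only names Xt, ht, ht-1, Ct, and Ct-1): the cell value Ct carries the long-term memory, and the forget gate provides the means to forget.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_cell(x_t, h_prev, c_prev, Wf, Wi, Wc, Wo, bf, bi, bc, bo):
    z = np.concatenate([h_prev, x_t])     # previous output ht-1 plus current input Xt
    f = sigmoid(Wf @ z + bf)              # forget gate: what to drop from Ct-1
    i = sigmoid(Wi @ z + bi)              # input gate: what new information to store
    c_tilde = np.tanh(Wc @ z + bc)        # candidate values for the cell
    c_t = f * c_prev + i * c_tilde        # new cell value Ct
    o = sigmoid(Wo @ z + bo)              # output gate
    h_t = o * np.tanh(c_t)                # new hidden state ht, passed to the next layer
    return h_t, c_t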
LSTM Constant Values
• Examples of constant values in an LSTM memory cell:
• Time – How much time has passed.
• Location – What is my direction?
• Speed – What is my acceleration?
Prediction – Long Term Memory
Split Neural Network – Inputs are passed through two duplicated NNs.
[Diagram: the inputs at St and the past prediction at St-1 feed two parallel paths – one without the memory cell, giving the short-term prediction at St, and one through the memory cell, giving the long-term prediction at St.]
Inputs are split across two neural networks in parallel. In one, the prediction is made without memory (short term). In the other,
the prediction is made with memory (long term). A final prediction is made from the short-term and long-term predictions.
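A minimal Keras sketch of this split (the framework and layer sizes are assumptions): one branch predicts from the current step only (short term), the other runs the sequence through an LSTM memory cell (long term), and the final prediction combines both.

import tensorflow as tf

SEQ_LEN, FEATURES = 20, 8
inputs = tf.keras.Input(shape=(SEQ_LEN, FEATURES))

# Short-term branch: prediction from the current step only, no retained memory.
last_step = tf.keras.layers.Lambda(lambda t: t[:, -1, :])(inputs)
short_term = tf.keras.layers.Dense(32, activation="tanh")(last_step)

# Long-term branch: the LSTM memory cell carries state across the sequence.
long_term = tf.keras.layers.LSTM(32)(inputs)

# Final prediction made from the short-term and long-term predictions.
merged = tf.keras.layers.Concatenate()([short_term, long_term])
outputs = tf.keras.layers.Dense(1)(merged)
model = tf.keras.Model(inputs, outputs)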
