2. LSTM – LONG SHORT-TERM MEMORY
Has memory mechanisms that allow it to store information better than plain RNNs.
Maintains state over time, and is therefore well suited to modelling time-series data.
Stock market price data is used to investigate this LSTM variant of the RNN architecture.
3. LSTM model explained
Contains the following: forget gate, candidate layer, input gate,
output gate, hidden state, memory state and outputs from the
LSTM cell.
The previous memory state is multiplied element-wise with the forget gate (and the gated candidate values are added) to decide the present memory state.
The model used in the code takes 60 time steps as input and produces 1 output.
The data is reshaped accordingly (see the sketch below), the model is trained, and the trained model is then used to predict the test data.
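A minimal sketch of this data preparation, assuming the scaled training prices live in a 1-D NumPy array; the array name and the synthetic placeholder data are assumptions, not taken from the slides:

import numpy as np

# Placeholder for the scaled training prices; in the real code this would come
# from the stock-price data after scaling (assumed, not shown in the slides).
training_set_scaled = np.random.rand(1000)

# Build sliding windows: each sample uses the 60 previous prices (X)
# to predict the next price (y) -- 60 time steps, 1 output.
X_train, y_train = [], []
for i in range(60, len(training_set_scaled)):
    X_train.append(training_set_scaled[i - 60:i])
    y_train.append(training_set_scaled[i])

X_train, y_train = np.array(X_train), np.array(y_train)

# Reshape to (samples, time steps, features) as expected by Keras recurrent layers.
X_train = X_train.reshape((X_train.shape[0], 60, 1))

Each row of X_train holds the 60 previous prices used to predict the single value in y_train.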
4. The architecture uses time-series data and overcomes the vanishing gradient problem.
The model is trained for 50 epochs.
The predicted stock prices are plotted against the actual stock market prices.
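A minimal Keras sketch of such an LSTM regressor, reusing the X_train/y_train windows from the sketch above; the layer sizes, optimizer and batch size are assumptions, only the 60-step input, single output and 50 epochs come from the slides:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Input, LSTM, Dense

# Stacked LSTM regressor: windows of 60 time steps in, 1 predicted price out.
model = Sequential([
    Input(shape=(60, 1)),
    LSTM(50, return_sequences=True),   # units=50 is an assumed layer size
    LSTM(50),
    Dense(1),
])
model.compile(optimizer='adam', loss='mean_squared_error')

# Train for the 50 epochs mentioned on the slide.
model.fit(X_train, y_train, epochs=50, batch_size=32)

# Predict on test windows built the same way as X_train (test preparation not shown).
# predicted_prices = model.predict(X_test)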
5. GRU – GATED RECURRENT UNIT
Works using the same principle as the LSTM,
only it is streamlined and thus cheaper to run.
It is a trade-off between computational and representational power.
The backend used for this application is the TensorFlow core library.
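A quick way to confirm the backend in use (a small sketch, assuming a standard TensorFlow/Keras install):

import tensorflow as tf
from tensorflow.keras import backend as K

print(tf.__version__)   # version of the TensorFlow core library
print(K.backend())      # prints 'tensorflow' when Keras runs on the TensorFlow backend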
6. GRU Model explained
It is unique in that it does not use separate memory units to control the flow of data.
We therefore have fewer parameters to train, so the model takes less time to train.
It has two gates: the reset gate and the update gate.
The reset gate determines how the new input is combined with the previous memory.
The update gate determines how much of the previous state should be kept.
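A minimal GRU counterpart of the LSTM sketch above (the layer sizes are again assumptions); swapping the layer type is the only structural change:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Input, GRU, Dense

# Same 60-step window and single output as the LSTM model, but with GRU layers,
# which have fewer parameters per unit and therefore train faster.
gru_model = Sequential([
    Input(shape=(60, 1)),
    GRU(50, return_sequences=True),
    GRU(50),
    Dense(1),
])
gru_model.compile(optimizer='adam', loss='mean_squared_error')
gru_model.summary()   # compare the parameter count with the LSTM model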
7. The GRU has the reset gate and the update gate.
Prediction based on the GRU model.
8. REFERENCE LIST
1. Huang, Y., Chen, C. H., & Huang, C. J. (2019). Motor fault detection and
feature extraction using RNN-based variational autoencoder. IEEE
Access, 7, 139086-139096.
2. Zhang, X., Chen, M. H., & Qin, Y. (2018, September). NLP-QA
Framework Based on LSTM-RNN. In 2018 2nd International Conference
on Data Science and Business Analytics (ICDSBA) (pp. 307-311). IEEE.
3. Kim, Y. J., & Chi, M. (2018, January). Temporal Belief Memory: Imputing Missing Data during RNN Training. In Proceedings of the 27th International Joint Conference on Artificial Intelligence (IJCAI-2018).
Editor's Notes
Includes the memory cell and the gates. Contents of the memory cell are modulated by the input and forget gates. Gating allows information to be retained over many time steps, which is how the LSTM model overcomes the vanishing gradient problem. (Kim & Chi, 2018)
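In standard notation (a textbook formulation, not taken from the slides), the gated cell update is:

f_t = \sigma(W_f [h_{t-1}, x_t] + b_f)
i_t = \sigma(W_i [h_{t-1}, x_t] + b_i)
\tilde{c}_t = \tanh(W_c [h_{t-1}, x_t] + b_c)
c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t
o_t = \sigma(W_o [h_{t-1}, x_t] + b_o)
h_t = o_t \odot \tanh(c_t)

Because c_t is carried forward largely additively, gradients can flow across many time steps without vanishing.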
Since LSTMs store a long-term memory state, we create a data structure with 60 time steps and 1 output.
So for each element of the training set, we have the 60 previous training-set elements.
Gated recurrent unit (GRU) layers work using the same principle as LSTM, but they’re somewhat streamlined and thus cheaper to run (although they may not have as much representational power as LSTM). This trade-off between computational expensiveness and representational power is seen everywhere in machine learning. (Huang, Chen, & Huang, 2019)
The major difference is that the GRU fully exposes its memory content using only leaky integration (but with an adaptive time constant controlled by the update gate). The GRU was inspired by the LSTM unit but is considered simpler to compute and implement. (Huang, Chen, & Huang, 2019)
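The "leaky integration" can be written out as (standard GRU equations, notation assumed):

z_t = \sigma(W_z [h_{t-1}, x_t])
r_t = \sigma(W_r [h_{t-1}, x_t])
\tilde{h}_t = \tanh(W [r_t \odot h_{t-1}, x_t])
h_t = (1 - z_t) \odot h_{t-1} + z_t \odot \tilde{h}_t

Here the update gate z_t acts as the adaptive time constant: it decides how much of the old state leaks through versus how much of the new candidate is written in.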