RNN & LSTM
DR. ANINDYA HALDER
DEPT. OF CSIT
COTTON UNIVERSITY, GUWAHATI-01
1
Recurrent Neural Networks (RNN):
The RNN is a widely preferred model, especially for sequential data.
Every node at a time step receives an input from the previous node, and the network proceeds using a feedback loop.
 In an RNN, each node generates a current hidden state, and its output is obtained from the given input and the previous hidden state as follows:
Fig: Compressed (left) and unfolded (right) basic Recurrent Neural Network.
2
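The hidden-state update above can be sketched in plain Python. This is a minimal toy cell, not the slide's exact network: the weight matrices `W_xh`, `W_hh` and bias `b` are made-up illustrative values, and the cell computes h_t = tanh(W_xh·x_t + W_hh·h_(t-1) + b), reusing the same weights at every time step.

```python
import math

def rnn_step(x_t, h_prev, W_xh, W_hh, b):
    """One RNN time step: h_t = tanh(W_xh @ x_t + W_hh @ h_prev + b)."""
    h_t = []
    for i in range(len(b)):
        s = b[i]
        s += sum(W_xh[i][j] * x_t[j] for j in range(len(x_t)))
        s += sum(W_hh[i][j] * h_prev[j] for j in range(len(h_prev)))
        h_t.append(math.tanh(s))
    return h_t

# Toy 2-unit cell; the same weights are reused at every step.
W_xh = [[0.5, 0.1], [0.2, 0.4]]
W_hh = [[0.3, 0.0], [0.0, 0.3]]
b = [0.0, 0.0]

h = [0.0, 0.0]                      # initial hidden state
for x in [[1.0, 0.0], [0.0, 1.0]]:  # the input sequence, one vector per step
    h = rnn_step(x, h, W_xh, W_hh, b)
```

Each pass feeds the previous hidden state back in, which is exactly the feedback loop shown in the unfolded diagram.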
How a Recurrent Neural Network works
• An RNN processes the sequence of vectors one by one.
• While processing, it passes the previous hidden state to the next step of the sequence. The hidden state acts as the neural network's memory: it holds information on previous data the network has seen.
Figure: Processing sequence one by one.
3
Cont…
• First, the input and previous hidden state are combined to form a vector.
• That vector now has information on the current input and previous inputs. The vector goes
through the tanh activation, and the output is the new hidden state, or the memory of the
network.
Figure: Passing hidden state to next time step. Figure: RNN Cell
4
Drawbacks of RNN:
Recurrent Neural Networks suffer from short-term memory. If a sequence is long enough, they'll have a hard time carrying information from earlier time steps to later ones.
During backpropagation, recurrent neural networks suffer from the vanishing gradient problem. Gradients are the values used to update a neural network's weights. The vanishing gradient problem occurs when the gradient shrinks as it backpropagates through time. If a gradient value becomes extremely small, it doesn't contribute much learning.
5
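The vanishing-gradient effect can be illustrated numerically. In this sketch, the single recurrent weight `w` and fixed pre-activation `z` are hypothetical toy numbers: backpropagation through time multiplies one factor of w·tanh′(z) per step, and when that factor is below 1 in magnitude the gradient shrinks exponentially with sequence length.

```python
import math

def tanh_grad(z):
    # derivative of tanh: 1 - tanh(z)^2, always in (0, 1]
    return 1.0 - math.tanh(z) ** 2

w = 0.5     # hypothetical recurrent weight
z = 1.0     # hypothetical pre-activation at every step
grad = 1.0
for _ in range(50):           # backpropagate through 50 time steps
    grad *= w * tanh_grad(z)  # one factor per step; here |factor| < 1

# grad is now astronomically small: early time steps receive almost no learning signal
```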
Pros and Cons of RNN:
The pros and cons of a typical RNN architecture are summed up in the table below:

Advantages:
• Possibility of processing input of any length
• Model size not increasing with size of input
• Computation takes into account historical information
• Weights are shared across time

Drawbacks:
• Computation being slow
• Difficulty of accessing information from a long time ago
• Cannot consider any future input for the current state
6
Applications of RNN:
•Prediction problems.
•Machine Translation.
•Speech Recognition.
•Language Modelling and Generating Text.
•Video Tagging.
•Generating Image Descriptions.
•Text Summarization.
•Call Center Analysis.
7
Long Short-Term Memory (LSTM):
Long short-term memory is a type of RNN designed to prevent the signal flowing through the network from either exploding or decaying as it passes through the feedback loops, helping it capture long-term dependencies in the input.
8
Activation Functions of LSTM
In LSTM architecture, two types of activation functions are used:
 Tanh activation function
Sigmoid activation function
9
Cont..
Tanh:
 LSTM gates contain tanh activations.
Tanh is a non-linear activation function.
 It regulates the values flowing through the network, keeping them between -1 and 1.
Figure: Tanh squishes values to be between -1 and 1.
10
Cont…
Sigmoid:
LSTM gates contain sigmoid activations.
The sigmoid function squishes values between 0 and 1.
 That is helpful for updating or forgetting data, because any number multiplied by 0 becomes 0, causing values to disappear or be "forgotten." Any number multiplied by 1 keeps its value, so that value stays the same or is "kept."
Using the sigmoid activation function, the network can learn which data is unimportant and can be forgotten, and which data is important to keep.
Figure: Sigmoid squishes values to be between 0 and 1.
11
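The keep/forget arithmetic described above can be shown directly. In this sketch the gate pre-activations (10, -10, 0) are arbitrary values chosen only to push the sigmoid near 1, near 0, and exactly 0.5:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

values = [0.9, -0.4, 0.7]                          # candidate data
gate   = [sigmoid(z) for z in [10.0, -10.0, 0.0]]  # ≈ [1, 0, 0.5]

# element-wise multiplication by the gate keeps, forgets, or attenuates
gated = [v * g for v, g in zip(values, gate)]
# first value is kept, second is forgotten, third is halved
```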
Gates of LSTM
12
Cont…
Forget Gate
• This gate decides what information should be thrown away or kept.
• Information from the previous hidden state and information from the current input is passed through the sigmoid function.
• Values come out between 0 and 1. The closer to 0, the more is forgotten; the closer to 1, the more is kept.
Figure: Forget Gate.
13
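As a sketch, the forget gate computes f_t = sigmoid(W_f·[h_(t-1); x_t] + b_f). The weights `W_f`, `b_f` and the vectors below are toy values, not learned parameters:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forget_gate(h_prev, x_t, W_f, b_f):
    # f_t = sigmoid(W_f @ [h_prev; x_t] + b_f), one value in (0, 1) per cell unit
    concat = h_prev + x_t  # concatenate hidden state and current input
    return [sigmoid(b_f[i] + sum(W_f[i][j] * concat[j] for j in range(len(concat))))
            for i in range(len(b_f))]

h_prev = [0.1, -0.2]   # previous hidden state
x_t    = [1.0, 0.5]    # current input
W_f    = [[0.4, 0.1, 0.3, -0.2], [0.0, 0.2, -0.1, 0.5]]
b_f    = [0.0, 0.0]
f_t = forget_gate(h_prev, x_t, W_f, b_f)   # each entry says how much to keep
```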
Cont…
Input Gate
• The goal of this gate is to determine what new information should be added to the network's long-term memory (cell state), given the previous hidden state and new input data.
• The input gate is a sigmoid-activated network which acts as a filter, identifying which components of the 'new memory vector' are worth retaining. This network outputs a vector of values in [0, 1].
• The hidden state and current input are also passed into the tanh function, which squishes values between -1 and 1 to help regulate the network.
Figure: Input Gate.
14
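The two halves of the input-gate step can be sketched together: a sigmoid filter i_t and a tanh candidate vector. All weights here are illustrative toy values:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def dense(W, b, v):
    # affine map W @ v + b for plain Python lists
    return [b[i] + sum(W[i][j] * v[j] for j in range(len(v))) for i in range(len(b))]

def input_gate(h_prev, x_t, W_i, b_i, W_c, b_c):
    concat = h_prev + x_t
    i_t   = [sigmoid(z) for z in dense(W_i, b_i, concat)]    # filter, values in (0, 1)
    c_hat = [math.tanh(z) for z in dense(W_c, b_c, concat)]  # candidates, values in (-1, 1)
    return i_t, c_hat

h_prev = [0.1, -0.2]
x_t    = [1.0, 0.5]
W_i = [[0.2, -0.1, 0.4, 0.0], [0.3, 0.1, -0.2, 0.5]]
b_i = [0.0, 0.0]
W_c = [[-0.3, 0.2, 0.1, 0.4], [0.1, 0.0, 0.5, -0.1]]
b_c = [0.0, 0.0]
i_t, c_hat = input_gate(h_prev, x_t, W_i, b_i, W_c, b_c)
```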
Cont…
Cell State
• The next step is to decide which information from the new state to store in the cell state.
• The previous cell state C(t-1) gets multiplied element-wise by the forget vector f(t). Where the outcome is 0, values are dropped from the cell state.
• Next, the network takes the gated output of the input gate i(t) and performs point-by-point addition, which updates the cell state, giving the network a new cell state C(t).
Figure: Cell State.
15
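The update C(t) = f(t) ⊙ C(t-1) + i(t) ⊙ Ĉ(t) is pure element-wise arithmetic. The vectors below are hand-picked toy values so the effect is easy to read: the first unit is fully forgotten and replaced by its candidate, the second is kept and nudged.

```python
C_prev = [0.8, -0.3]  # previous cell state C(t-1)
f_t    = [0.0, 1.0]   # forget gate: drop unit 0 entirely, keep unit 1
i_t    = [1.0, 0.5]   # input gate: admit all of candidate 0, half of candidate 1
c_hat  = [0.6, 0.2]   # candidate values from the tanh branch

# element-wise: forget part of the old state, then add the gated candidate
C_t = [f * c + i * ch for f, c, i, ch in zip(f_t, C_prev, i_t, c_hat)]
# C_t[0] = 0*0.8 + 1*0.6 = 0.6 ; C_t[1] = 1*(-0.3) + 0.5*0.2 = -0.2
```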
Cont…
Output Gate
 The output gate decides what the next hidden state should be. The hidden state contains information on previous inputs, and is also used for predictions.
Figure: Output Gate.
16
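Putting it together, the output gate computes o_t = sigmoid(W_o·[h_(t-1); x_t] + b_o) and the new hidden state h_t = o_t ⊙ tanh(C_t). The weights and vectors below are again toy values for illustration:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def output_gate(h_prev, x_t, C_t, W_o, b_o):
    # o_t = sigmoid(W_o @ [h_prev; x_t] + b_o); h_t = o_t * tanh(C_t)
    concat = h_prev + x_t
    o_t = [sigmoid(b_o[i] + sum(W_o[i][j] * concat[j] for j in range(len(concat))))
           for i in range(len(b_o))]
    return [o * math.tanh(c) for o, c in zip(o_t, C_t)]

h_prev = [0.1, -0.2]
x_t    = [1.0, 0.5]
C_t    = [0.6, -0.2]   # new cell state from the previous step
W_o = [[0.3, 0.0, 0.2, -0.4], [0.1, 0.5, -0.3, 0.2]]
b_o = [0.0, 0.0]
h_t = output_gate(h_prev, x_t, C_t, W_o, b_o)   # next hidden state, bounded by tanh
```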
Applications of LSTM:
17
Reference:
1. https://www.pluralsight.com/guides/introduction-to-lstm-units-in-rnn
2. https://www.geeksforgeeks.org/introduction-to-recurrent-neural-network/
3. https://towardsmachinelearning.org/recurrent-neural-network-architecture-explained-in-detail/
4. https://towardsdatascience.com/illustrated-guide-to-lstms-and-gru-s-a-step-by-step-explanation-44e9eb85bf21
18
19
Thank You
