RNN & LSTM
DR. ANINDYA HALDER
Recurrent Neural Networks (RNN):
The RNN is a highly preferred model, especially for sequential data.
Every node at a time step receives an input from the previous node, and processing proceeds through a feedback loop.
 In an RNN, each node generates a current hidden state, and its output is obtained from the given input and the previous hidden state as follows:
Fig: Compressed (left) and unfolded (right) basic Recurrent Neural Network.
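Written out, the recurrence sketched in the figure takes the standard form below (the weight names W_{xh}, W_{hh}, W_{hy} and the biases are the usual textbook symbols, assumed here since the slide does not define them):

h_t = \tanh(W_{xh} x_t + W_{hh} h_{t-1} + b_h)
y_t = W_{hy} h_t + b_y

where h_{t-1} is the previous hidden state, x_t the current input, and y_t the output at time step t.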
How a Recurrent Neural Network works
• An RNN processes the sequence of vectors one by one.
• While processing, it passes the previous hidden state to the next step of the sequence. The hidden state acts as the neural network's memory: it holds information about the data the network has seen so far.
Figure: Processing sequence one by one.
Cont…
• First, the current input and the previous hidden state are combined into a single vector.
• That vector now carries information about the current input and all previous inputs. It goes through the tanh activation, and the output is the new hidden state, i.e. the memory of the network.
Figure: Passing hidden state to next time step. Figure: RNN Cell
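As a minimal sketch of this step (in NumPy; the weight matrices, dimensions, and toy random sequence below are illustrative assumptions, not taken from the slides):

import numpy as np

def rnn_cell_step(x_t, h_prev, W_xh, W_hh, b_h):
    # Combine the current input with the previous hidden state and squash with tanh.
    return np.tanh(W_xh @ x_t + W_hh @ h_prev + b_h)

rng = np.random.default_rng(0)
W_xh, W_hh, b_h = rng.normal(size=(8, 4)), rng.normal(size=(8, 8)), np.zeros(8)

h = np.zeros(8)                       # initial hidden state (the network's "memory")
for x_t in rng.normal(size=(5, 4)):   # toy sequence of 5 input vectors
    h = rnn_cell_step(x_t, h, W_xh, W_hh, b_h)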
Drawbacks of RNN:
 Recurrent Neural Networks suffer from short-term memory. If a sequence is long enough, they have a hard time carrying information from earlier time steps to later ones.
 During backpropagation, recurrent neural networks suffer from the vanishing gradient problem. Gradients are the values used to update a neural network's weights. The vanishing gradient problem occurs when the gradient shrinks as it is backpropagated through time. If a gradient value becomes extremely small, it contributes very little to learning.
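A quick numerical sketch of the effect: backpropagation through time multiplies one derivative factor per step, and with tanh (whose derivative is at most 1) and a small recurrent weight the product shrinks rapidly. The scalar recurrent weight used below is purely illustrative.

import numpy as np

w, h, grad = 0.5, 0.0, 1.0        # illustrative recurrent weight, hidden state, gradient
for t in range(20):               # unroll 20 time steps
    h = np.tanh(w * h + 1.0)
    grad *= w * (1.0 - h ** 2)    # d tanh(z)/dz = 1 - tanh(z)^2, so each factor is < 1
print(grad)                       # vanishingly small: early time steps barely influence learning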
Pros and Cons of RNN:
The pros and cons of a typical RNN architecture are summed up below:
Advantages:
• Possibility of processing input of any length
• Model size does not increase with the size of the input
• Computation takes historical information into account
• Weights are shared across time
Drawbacks:
• Computation is slow
• Difficulty in accessing information from a long time ago
• Cannot consider any future input for the current state
Applications of RNN:
• Prediction problems.
• Machine Translation.
• Speech Recognition.
• Language Modelling and Generating Text.
• Video Tagging.
• Generating Image Descriptions.
• Text Summarization.
• Call Center Analysis.
Long Short-Term Memory (LSTM):
Long short-term memory is a type of RNN designed to handle long-term dependencies: it keeps the signal passed through the feedback loops from either exploding or decaying as it propagates across many time steps for a given input.
Basic LSTM Unit and corresponding individual blocks
Forget Gate: decides whether a piece of information should be remembered or forgotten.
Cell State: carries the network's memory; it is governed by the gates, admitting relevant input and forgetting irrelevant information.
Input Gate: decides what new information needs to be stored in the cell state.
Output Gate: decides the values of the next hidden state of the network.
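Putting the four blocks together, the standard LSTM update (with \sigma the sigmoid, \odot element-wise multiplication, and the usual textbook weight/bias names, which the slide does not spell out) is:

f_t = \sigma(W_f [h_{t-1}, x_t] + b_f)            (forget gate)
i_t = \sigma(W_i [h_{t-1}, x_t] + b_i)            (input gate)
\tilde{C}_t = \tanh(W_C [h_{t-1}, x_t] + b_C)     (candidate cell values)
C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t   (cell state)
o_t = \sigma(W_o [h_{t-1}, x_t] + b_o)            (output gate)
h_t = o_t \odot \tanh(C_t)                        (hidden state)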
Activation Functions of LSTM
In LSTM architecture, two types of activation functions are used:
 Tanh activation function
Sigmoid activation function
Cont..
Tanh:
 LSTM gates contain tanh activations.
 Tanh is a non-linear activation function.
 It regulates the values flowing through the network, keeping them between -1 and 1.
Figure: Tanh squishes values to be between -1 and 1.
Cont…
Sigmoid:
 LSTM gates contain sigmoid activations.
 The sigmoid function squishes values to between 0 and 1.
 That is helpful for updating or forgetting data: any number multiplied by 0 becomes 0, so the value disappears or is "forgotten"; any number multiplied by 1 keeps the same value, so it stays the same or is "kept".
 Using the sigmoid activation function, the network can learn which data is not important and can therefore be forgotten, and which data is important to keep.
Figure: Sigmoid squishes values to be between 0 and 1.
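A toy illustration of this gating behaviour (the numbers below are made up):

import numpy as np

values = np.array([3.0, -1.5, 2.0])
gate   = np.array([0.0,  1.0, 0.9])   # sigmoid outputs: near 0 forgets, near 1 keeps
print(gate * values)                  # [ 0.  -1.5  1.8] -> the first entry is "forgotten"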
Gates of LSTM
Cont…
Forget Gate
• This gate decides what information should be thrown away or kept.
• Information from the previous hidden state and information from the current input are passed through the sigmoid function.
• Values come out between 0 and 1: the closer to 0, the more the information is forgotten; the closer to 1, the more it is kept.
Figure: Forget Gate.
Cont…
Input Gate
• The goal of this gate is to determine what new information should be added to the network's long-term memory (cell state), given the previous hidden state and the new input data.
• The input gate is a sigmoid-activated network that acts as a filter, identifying which components of the 'new memory vector' are worth retaining. This network outputs a vector of values in [0, 1].
• The hidden state and current input are also passed through the tanh function, which squishes values between -1 and 1 to help regulate the network.
Figure: Input Gate.
Cont…
Cell State
• The next step is to decide which information from the new state to store in the cell state.
• The previous cell state C(t-1) is multiplied point-wise by the forget vector f(t). Where the outcome is 0, the corresponding values are dropped from the cell state.
• Next, the network takes the output of the input gate i(t) and performs point-by-point addition, which updates the cell state, giving the network a new cell state C(t).
Figure: Cell State.
Cont…
Output Gate
 The output gate decides what the next hidden state should be. The hidden state contains information on previous inputs and is also used for predictions.
Figure: Output Gate.
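The four gates just described can be combined into one step of a minimal NumPy LSTM cell. This is an illustrative sketch only: the weight layout (one stacked matrix sliced into forget, input, candidate, and output parts) and all sizes are assumptions, not the architecture used later in the case study.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_cell_step(x_t, h_prev, c_prev, W, b):
    H = h_prev.size
    z = W @ np.concatenate([h_prev, x_t]) + b
    f = sigmoid(z[0*H:1*H])        # forget gate: what to drop from the cell state
    i = sigmoid(z[1*H:2*H])        # input gate: which new values to admit
    g = np.tanh(z[2*H:3*H])        # candidate cell values, squashed to [-1, 1]
    o = sigmoid(z[3*H:4*H])        # output gate: what to expose as the hidden state
    c = f * c_prev + i * g         # point-wise cell-state update
    h = o * np.tanh(c)             # next hidden state
    return h, c

# Illustrative sizes: 4-dimensional input, 8-dimensional hidden/cell state.
rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(4 * 8, 8 + 4))
b = np.zeros(4 * 8)
h, c = np.zeros(8), np.zeros(8)
for x_t in rng.normal(size=(5, 4)):
    h, c = lstm_cell_step(x_t, h, c, W, b)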
Applications of LSTM:
Autoencoder
Cont…
An autoencoder consists of three layers:
1. Encoder
2. Compressed representation layer
3. Decoder
• The Encoder layer compresses the input image into a latent-space representation. It encodes the input image as a compressed representation in a reduced dimension.
• The Compressed representation layer holds the compressed input that is fed to the decoder layer.
• The Decoder layer decodes the encoded image back to the original dimension.
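A minimal sketch of this three-part structure using Keras (the layer sizes, the 140-sample input length, and the choice of dense layers are illustrative assumptions, not the architecture of the case study that follows):

from tensorflow.keras import layers, models

input_dim, latent_dim = 140, 16                                     # illustrative dimensions

inputs  = layers.Input(shape=(input_dim,))
encoded = layers.Dense(64, activation="relu")(inputs)               # encoder
latent  = layers.Dense(latent_dim, activation="relu")(encoded)      # compressed representation
decoded = layers.Dense(64, activation="relu")(latent)               # decoder
outputs = layers.Dense(input_dim, activation="linear")(decoded)     # back to the original dimension

autoencoder = models.Model(inputs, outputs)
autoencoder.compile(optimizer="adam", loss="mse")                   # trained to reconstruct its own input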
Case study
A deep LSTM autoencoder for detecting
anomalous ECG
Block diagram
Detailed architecture of ECG-NET
The concept behind ECG-NET
ECG data sets used:
Publicly available ECG5000 dataset.
The original dataset was released by Eamonn Keogh and Yanping Chen and was downloaded from PhysioNet.
This database is a 20-hour-long ECG recording from the BIDMC Congestive Heart Failure Database (chfdb), reported as ''chf07'', which is publicly available in the UCR Time Series Classification archive (Bagnall et al., 2018).
http://www.timeseriesclassification.com/description.php?Dataset=ECG5000
The available ECG5000 dataset is preprocessed; the preprocessing was performed in two steps: the first step extracts each heartbeat, and the second step makes each heartbeat of equal length using interpolation.
The dataset comprises the ECG of both normal patients and patients who have severe congestive heart failure.
Details of Data Splitting
Data Source: ECG5000 dataset released by Eamonn Keogh and Yanping Chen, downloaded from PhysioNet.
Threshold technique used:
Manual threshold
Automated threshold (Kapur’s thresholding procedure)
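The role of the threshold can be sketched as follows: reconstruction errors are collected on normal training beats, a cut-off is chosen, and test beats whose error exceeds it are flagged as anomalous. The mean + 3*std rule and the placeholder error values below are illustrative only; in the automated variant the cut-off would instead come from Kapur's entropy-based thresholding of the loss histogram.

import numpy as np

# Placeholder reconstruction errors; in the case study these come from the trained
# LSTM autoencoder evaluated on normal training beats and on the mixed test beats.
rng = np.random.default_rng(0)
train_err = rng.gamma(shape=2.0, scale=0.01, size=2000)
test_err  = rng.gamma(shape=2.0, scale=0.03, size=500)

threshold  = train_err.mean() + 3 * train_err.std()   # illustrative manual threshold
is_anomaly = test_err > threshold                      # poorly reconstructed beats are flagged
print(threshold, is_anomaly.mean())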
Hyper-parameter settings
Result and analysis section
Training and validation
Reconstruction of the training and
validation normal ECG signals
Training ECG samples
Validation ECG samples
Test Result:
Reconstruction of the test ECG signals
Correctly and incorrectly classified ECG
test samples using Manual Thresholding
procedure
Correctly and incorrectly classified ECG
test samples using Automated
Thresholding procedure
Comparison of results
Advantages of the Proposed Method
Challenges involved in automated ECG arrhythmia detection include, but are not limited to, the following:
 limited availability of annotated ECG signals to train the model;
 the data imbalance problem in ECG datasets, where normal ECG signals predominate over anomalous ECGs.
The above challenges are judiciously handled in the proposed (LSTM-based autoencoder) ECG-NET:
 The ECG-NET method requires only normal ECG signals for training (no anomalous ECG signals are required during the training phase), yet it achieves good accuracy during the testing phase on both normal and anomalous test ECG signals.
 Thus, the data imbalance problem and the limited availability of annotated samples (particularly anomalous ECG signals) are both handled.
 An automated reconstruction-loss threshold selection approach for the testing phase is proposed, based on Kapur's histogram thresholding.
 Three LSTM-based autoencoder architectures are proposed, which yield better accuracy than existing architectures.
Reference:
1. https://www.pluralsight.com/guides/introduction-to-lstm-units-in-rnn
2. https://www.geeksforgeeks.org/introduction-to-recurrent-neural-network/
3. https://towardsmachinelearning.org/recurrent-neural-network-architecture-explained-in-detail/
4. https://towardsdatascience.com/illustrated-guide-to-lstms-and-gru-s-a-step-by-step-explanation-44e9eb85bf21
Thank You
