Recurrent Neural Networks
CNN
• Nature of CNNs
  - classification, object recognition, pattern matching, clustering
• Limitation
  – CNNs generally don’t perform well when the input data is interdependent in a sequential pattern.
  – No correlation is assumed between the previous input and the next input.
  – Each output depends only on its own input.
Example: If you run 100 different inputs through a CNN, none of the outputs is biased by the previous output.
Why RNN?
Imagine a scenario like sentence generation or
text translation.
Why RNN?
• The words generated are dependent on the words generated before.
• In this case, we need some bias based on the previous output.
• This is where RNNs shine.
• RNNs include a sense of memory about what happened earlier in the sequence of data.
Why RNN?
• RNNs are good at processing sequence data for predictions. But how?
  - The sequence should contain interdependent data.
  - Examples: time-series data, informative pieces of text, conversations, etc.
Sequence of data?
• A sequence is a particular order in which one thing follows another.
• For example, given a series of snapshots of a ball in order, you can tell that the ball is moving to the right.
• Sequence data comes in many forms.
Audio sequence
• Audio is a natural sequence. You can chop an audio spectrogram into chunks and feed those chunks into an RNN.
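As an illustration, a minimal sketch of chopping a spectrogram into chunks, assuming a NumPy array of spectrogram frames; the frame counts and chunk length are arbitrary choices:

```python
import numpy as np

# Hypothetical spectrogram: 1000 time frames x 80 frequency bins
spectrogram = np.random.rand(1000, 80)

chunk_len = 50  # frames per chunk (arbitrary choice)
chunks = [spectrogram[i:i + chunk_len]
          for i in range(0, len(spectrogram) - chunk_len + 1, chunk_len)]
# Each chunk (50 x 80) becomes one step (or sub-sequence) of the RNN input.
```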
Text sequence
• Text is another form of sequence. You can break text up into a sequence of characters or a sequence of words.
  - “I” “am” “writing” “a” “letter”
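For example, a minimal sketch of turning the sentence above into token sequences; the vocabulary mapping is just an illustration of how tokens become network inputs:

```python
sentence = "I am writing a letter"

word_tokens = sentence.split()   # ['I', 'am', 'writing', 'a', 'letter']
char_tokens = list(sentence)     # ['I', ' ', 'a', 'm', ...]

# A small vocabulary mapping, so each word can be fed to a network as an index.
vocab = {word: idx for idx, word in enumerate(sorted(set(word_tokens)))}
indices = [vocab[w] for w in word_tokens]
```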
Sequential Memory?
Sequential memory is a mechanism that
makes it easier for your brain to recognize
sequence patterns.
-I am writing a ……
Recurrent Neural Networks
• How do RNNs replicate this sequential-memory concept?
• Feed-forward Networks
• Recurrent Networks
• Recurrent Neuron
• Backpropagation Through Time (BPTT)
Traditional feed-forward neural
network
• In a feed-forward network, an image shown to the classifier during the test phase does not alter the weights, so the next decision is not affected by it.
• This is one very important difference between feed-forward networks and recurrent nets.
Note: Feed-forward nets don’t remember historic input data at test time, unlike recurrent networks.
• Feed-forward Networks
• Recurrent Networks
• Recurrent Neuron
• Backpropagation Through Time (BPTT)
Recurrent Networks
• How do we get a feed-forward neural network to use previous information to affect later outputs?
• An RNN has a looping mechanism that acts as a highway, allowing information to flow from one step to the next.
Recurrent Networks
• Recurrent networks, on the other hand, take as
their input not just the current input, but also
what they have perceived previously in time.
A Simple Multilayer Perceptron
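As a point of comparison, a minimal sketch of a multilayer perceptron forward pass in NumPy; the layer sizes and random initialisation are illustrative assumptions. Note that the output depends only on the current input:

```python
import numpy as np

rng = np.random.default_rng(0)

# Arbitrary sizes for illustration: 10 inputs, 16 hidden units, 3 outputs
W1, b1 = rng.standard_normal((16, 10)), np.zeros(16)
W2, b2 = rng.standard_normal((3, 16)), np.zeros(3)

def mlp_forward(x):
    h = np.tanh(W1 @ x + b1)   # hidden layer
    return W2 @ h + b2         # output layer (logits)

x = rng.standard_normal(10)
print(mlp_forward(x))  # same x always gives the same output: no memory of past inputs
```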
• Feed-forward Networks
• Recurrent Networks
• Recurrent Neuron
• Backpropagation Through Time (BPTT)
Recurrent Neuron
• A recurrent neuron maintains a hidden state summarising the inputs from previous steps and merges that information with the current step’s input.
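A minimal sketch of this update in NumPy; the weight shapes, tanh nonlinearity, and random initialisation are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)

hidden_size, input_size = 8, 4
W_x = rng.standard_normal((hidden_size, input_size)) * 0.1   # input-to-hidden weights
W_h = rng.standard_normal((hidden_size, hidden_size)) * 0.1  # hidden-to-hidden weights
b = np.zeros(hidden_size)

def recurrent_step(x_t, h_prev):
    # Merge the current input with the previous hidden state
    return np.tanh(W_x @ x_t + W_h @ h_prev + b)

h = np.zeros(hidden_size)                           # initial state
for x_t in rng.standard_normal((5, input_size)):    # a 5-step input sequence
    h = recurrent_step(x_t, h)                      # h carries information forward in time
```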
How do recurrent neural networks work?
• Feed-forward Networks
• Recurrent Networks
• Recurrent Neuron
• Backpropagation Through Time (BPTT)
How do recurrent neural networks work?
• So now we understand how an RNN actually works, but how does the training actually work?
• How do we decide the weights for each connection? And how do we initialise the weights for the hidden units?
• The purpose of recurrent nets is to accurately classify sequential input. We rely on backpropagation of error and gradient descent to do so.
• But standard backpropagation, as used in feed-forward networks, can’t be used here directly.
How do recurrent neural networks work?
• The problem with RNNs is that they are cyclic graphs, unlike feed-forward networks, which are directed acyclic graphs.
• In feed-forward networks we can calculate the error derivatives from the layer above. In an RNN we don’t have such layering.
Recurrent Neural Networks
• Replicate the RNN’s hidden units for every time step.
• Each replication through a time step is like a layer in a feed-forward network.
• The layer at time step t connects to the layer at time step t+1.
• Thus we randomly initialise the weights, unroll the network, and then use backpropagation to optimise the weights in the hidden layer.
• Initialisation is done by passing parameters to the lowest layer.
• These parameters are also optimised as part of backpropagation.
Recurrent Neural Networks
• An outcome of the unrolling is that each layer now maintains different weights and thus ends up being optimised differently.
• The errors calculated w.r.t. the weights are not guaranteed to be equal across layers.
• So each layer can have different weights at the end of a single run.
• We definitely don’t want that to happen, because the time steps are meant to share the same weights.
• The easy solution is to aggregate the errors across all the layers in some fashion.
• We can average the errors or sum them up.
• This way, the layers at all time steps maintain the same weights.
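A minimal sketch of this idea, reusing the NumPy recurrent step from earlier with a toy squared-error loss on the final state; the gradient for each shared weight matrix is accumulated (summed) over all time steps rather than kept separate per "layer":

```python
import numpy as np

rng = np.random.default_rng(2)
hidden_size, input_size, T = 4, 3, 6

W_x = rng.standard_normal((hidden_size, input_size)) * 0.1
W_h = rng.standard_normal((hidden_size, hidden_size)) * 0.1
b = np.zeros(hidden_size)

xs = rng.standard_normal((T, input_size))   # toy input sequence
target = rng.standard_normal(hidden_size)   # toy target for the final state

# Forward pass: unroll the network, keeping every hidden state
hs = [np.zeros(hidden_size)]
for t in range(T):
    hs.append(np.tanh(W_x @ xs[t] + W_h @ hs[-1] + b))

loss = 0.5 * np.sum((hs[-1] - target) ** 2)

# Backward pass (BPTT): one gradient per shared weight matrix,
# summed over all time steps.
dW_x, dW_h, db = np.zeros_like(W_x), np.zeros_like(W_h), np.zeros_like(b)

dh = hs[-1] - target                     # dLoss/dh_T
for t in reversed(range(T)):
    dz = dh * (1.0 - hs[t + 1] ** 2)     # backprop through tanh
    dW_x += np.outer(dz, xs[t])          # accumulate across time steps
    dW_h += np.outer(dz, hs[t])
    db += dz
    dh = W_h.T @ dz                      # pass gradient to the previous step

learning_rate = 0.1
W_x -= learning_rate * dW_x
W_h -= learning_rate * dW_h
b -= learning_rate * db
```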
Recurrent Network Architecture
(Diagram: input x enters a recurrent block A, which produces output h and feeds its state back into itself.)
Architecture for an RNN
http://colah.github.io/posts/2015-08-Understanding-LSTMs/
(Diagram: an unrolled RNN mapping a sequence of inputs, framed by start-of-sequence and end-of-sequence markers, to a sequence of outputs; some information is passed from one subunit to the next.)
Architecture for a 1980s RNN
…
Problem with this: it’s extremely deep and very hard to train.
Breaking up a sentence into word
sequences
How to feed data into RNN?
• The first step is to feed “What” into the RNN.
• The RNN encodes “What” and produces an
output.
How to feed data into RNN?
• For the next step, feed the word “time” and the hidden state from the previous step.
• The RNN now has information on both the words “What” and “time.”
How to feed data into RNN?
• Repeat this process until the final step.
• By the final step, the RNN has encoded information from all the words in the previous steps.
How to feed data into RNN?
• Since the final output was created from the whole sequence, it summarises the sentence.
• Take the final output and pass it to a feed-forward layer to classify the intent.
How to feed data into RNN?
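A minimal sketch of this whole loop for the example query “What time is it?”, reusing the recurrent step from earlier; the vocabulary, embeddings, and intent labels are purely illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(3)

words = "What time is it ?".split()
vocab = {w: i for i, w in enumerate(words)}          # toy vocabulary
embed = rng.standard_normal((len(vocab), 4)) * 0.1   # toy word embeddings

hidden_size = 8
W_x = rng.standard_normal((hidden_size, 4)) * 0.1
W_h = rng.standard_normal((hidden_size, hidden_size)) * 0.1
b = np.zeros(hidden_size)

# Feed the words one at a time, carrying the hidden state forward
h = np.zeros(hidden_size)
for w in words:
    x_t = embed[vocab[w]]
    h = np.tanh(W_x @ x_t + W_h @ h + b)

# Pass the final hidden state to a feed-forward layer to score intents
intents = ["ask_time", "greeting", "other"]           # hypothetical intent labels
W_out = rng.standard_normal((len(intents), hidden_size)) * 0.1
logits = W_out @ h
print(intents[int(np.argmax(logits))])
```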
Limitation
• Theoretically, RNNs have infinite memory: the capability to look back indefinitely.
• But in practice, they can only look back a few steps (the problem of long-term dependencies).
Vanishing Gradient
• Final hidden state of the RNN
• Short-term memory is caused by the infamous
vanishing gradient problem.
Vanishing Gradient
• As the RNN processes more steps, it has trouble retaining information from previous steps.
• The information from the words “what” and “time” is almost non-existent at the final time step.
• Short-term memory and the vanishing gradient are due to the nature of backpropagation, the algorithm used to train and optimize neural networks.
Vanishing Gradient in a Backpropagation Network
• Training a neural network has three major steps.
• First, it does a forward pass and makes a prediction.
• Second, it compares the prediction to the ground truth using a loss function. The loss function outputs an error value, which is an estimate of how poorly the network is performing.
• Last, it uses that error value to do backpropagation, which calculates the gradients for each node in the network.
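A minimal sketch of these three steps, written with PyTorch as an illustrative assumption; the model and data are toy placeholders:

```python
import torch
import torch.nn as nn

# Toy model and data, just to show the three steps
model = nn.Sequential(nn.Linear(10, 16), nn.Tanh(), nn.Linear(16, 3))
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

x = torch.randn(32, 10)                 # a batch of inputs
y = torch.randint(0, 3, (32,))          # ground-truth class labels

prediction = model(x)                   # 1. forward pass: make a prediction
loss = loss_fn(prediction, y)           # 2. compare prediction to ground truth

optimizer.zero_grad()
loss.backward()                         # 3. backpropagation: compute gradients for each node
optimizer.step()                        # use the gradients to adjust the weights
```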
Vanishing Gradient in a Backpropagation Network
• The gradient is the value used to adjust the network’s internal weights, allowing the network to learn.
• The bigger the gradient, the bigger the adjustments, and vice versa.
• Here is where the problem lies.
• During backpropagation, each node in a layer calculates its gradient from the gradients of the layer after it (the layer closer to the output, which backpropagation processes first).
• So if the gradients flowing into a layer are small, the gradients for the current layer will be even smaller.
Vanishing Gradient in a Backpropagation Network
• This causes gradients to shrink exponentially as they propagate backward through the network.
• The earlier layers fail to do any learning, as their internal weights are barely adjusted due to the extremely small gradients.
• And that’s the vanishing gradient problem.
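A tiny numeric illustration of this exponential shrinkage; the per-layer factor of 0.5 is an arbitrary assumption for a layer whose local gradient is less than 1:

```python
# If each layer scales the gradient by a factor < 1 (here 0.5),
# the gradient reaching a layer k steps from the output shrinks as 0.5**k.
grad_at_output = 1.0
per_layer_factor = 0.5

for depth in (1, 5, 10, 20):
    print(depth, grad_at_output * per_layer_factor ** depth)
# depth 20 -> ~9.5e-07: the earliest layers receive almost no learning signal
```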
Gradients shrink as they back-propagate down the layers
Gradients shrink as they back-propagate through time
Impact of Gradient in BPNN
• The gradient is used to make adjustments to the neural network’s weights, thus allowing it to learn.
• Small gradients mean small adjustments. That causes the early layers not to learn.
Vanishing Gradient
• Because of vanishing gradients, the RNN doesn’t learn the long-range dependencies across time steps.
• That means there is a possibility that the words “what” and “time” are not considered when trying to predict the user’s intention.
• The network then has to make its best guess with “is it?”.
• That’s pretty ambiguous and would be difficult even for a human.
• So not being able to learn from earlier time steps causes the network to have a short-term memory.
LSTMs and GRUs
• To mitigate short-term memory, two specialized recurrent neural networks were created.
• One is called Long Short-Term Memory, or LSTM for short. The other is the Gated Recurrent Unit, or GRU.
• LSTMs and GRUs essentially function just like RNNs, but they’re capable of learning long-term dependencies using mechanisms called “gates.”
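A minimal sketch of one LSTM step in NumPy, following the standard gate equations; the weight shapes and random initialisation are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(4)
input_size, hidden_size = 4, 8

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# One weight matrix and bias per gate, acting on the concatenated [h_prev, x_t]
def make_params():
    return rng.standard_normal((hidden_size, hidden_size + input_size)) * 0.1, np.zeros(hidden_size)

W_f, b_f = make_params()   # forget gate
W_i, b_i = make_params()   # input gate
W_o, b_o = make_params()   # output gate
W_c, b_c = make_params()   # candidate cell state

def lstm_step(x_t, h_prev, c_prev):
    z = np.concatenate([h_prev, x_t])
    f = sigmoid(W_f @ z + b_f)        # how much of the old cell state to keep
    i = sigmoid(W_i @ z + b_i)        # how much new information to write
    o = sigmoid(W_o @ z + b_o)        # how much of the cell state to expose
    c_tilde = np.tanh(W_c @ z + b_c)  # candidate new content
    c = f * c_prev + i * c_tilde      # updated cell state (the long-term memory)
    h = o * np.tanh(c)                # updated hidden state
    return h, c

h = c = np.zeros(hidden_size)
for x_t in rng.standard_normal((5, input_size)):
    h, c = lstm_step(x_t, h, c)
```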
Where to use an RNN?
• Language Modelling and Generating Text
• Machine Translation
• Speech Recognition
• Generating Image Descriptions
• Video Tagging
• Stock Prediction
Real World Applications
• Neural Machine Translation
Real World Applications
• Sentiment Analysis
References
• https://hackernoon.com/rnn-or-recurrent-neural-network-for-noobs-a9afbb00e860
• https://medium.com/cracking-the-data-science-interview/recurrent-neural-networks-the-powerhouse-of-language-modeling-f292c918b879
• https://towardsdatascience.com/animated-rnn-lstm-and-gru-ef124d06cf45
Thank you!