A Brief Introduction on Recurrent Neural Network and Its Application

  1. A Brief Introduction on Recurrent Neural Network and Its Application
     Qiang Gan
     All contents are collected online, listed in the Reference page.
     For Nanjing Deep Learning Meetup Only
  2. Outline
     1. RNN
        o Model structure
        o Parameters
        o Learning algorithm
     2. Long-Term Dependencies & Vanishing Gradient Problem
        o LSTM / GRU
     3. Neural Machine Translation
        o Encoder-decoder framework
     4. Attention Mechanism
        o Extract the information needed from the source
     5. Other RNN applications
        o Image captioning
        o Question answering
  3. Before we start …
     All contents are collected online, listed in the Reference page.
  4. Memory
     • We are all familiar with the song "Two Tigers"
       o Two tigers, two tigers …
     • What is the 10th word?
     • We learned them as a sequence, a kind of conditional memory.
     • More examples: driving steps, movie scenes, …
  5. "Memory" in Neural Network
     • Traditional Neural Network
       o Output relies only on current input
       o input -> hidden -> output
     • Network with "Memory"
       o Output relies on current input and history information
       o (input + prev_hidden) -> hidden -> output
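To make the "(input + prev_hidden) -> hidden -> output" step in slide 5 concrete, here is a minimal NumPy sketch of one recurrent step (my own toy formulation; the weight names W_xh, W_hh, W_hy are not from the slides):

```python
import numpy as np

def rnn_step(x, h_prev, W_xh, W_hh, W_hy, b_h, b_y):
    """One step of a vanilla RNN: (input + prev_hidden) -> hidden -> output."""
    h = np.tanh(W_xh @ x + W_hh @ h_prev + b_h)  # new hidden state, the network's "memory"
    y = W_hy @ h + b_y                           # output read from the hidden state
    return h, y
```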
  6. "Memory" in Neural Network
     • Four Steps in Network with "Memory"
       1. (input + empty_hidden) -> hidden -> output
          • Memory only contains blue information
       2. (input + prev_hidden) -> hidden -> output
          • Memory contains blue and red information
       3. (input + prev_hidden) -> hidden -> output
          • Memory contains blue, red and green information
       4. (input + prev_hidden) -> hidden -> output
          • Memory contains blue, red, green and purple information
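The four steps in slide 6 are just this update applied in a loop, with the hidden state carried forward. A self-contained toy run (random weights and inputs of my own choosing, purely illustrative):

```python
import numpy as np

input_size, hidden_size, output_size = 3, 5, 2
rng = np.random.default_rng(0)
W_xh = rng.normal(size=(hidden_size, input_size))
W_hh = rng.normal(size=(hidden_size, hidden_size))
W_hy = rng.normal(size=(output_size, hidden_size))
b_h, b_y = np.zeros(hidden_size), np.zeros(output_size)

xs = [rng.normal(size=input_size) for _ in range(4)]  # the blue/red/green/purple inputs
h = np.zeros(hidden_size)                             # step 1 starts from an empty hidden state

for t, x in enumerate(xs, start=1):
    h = np.tanh(W_xh @ x + W_hh @ h + b_h)  # the hidden state now mixes every input seen so far
    y = W_hy @ h + b_y
    print(f"step {t}: output = {y}")
```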
  7. Recurrent Neural Network
  8. Recurrent Neural Network
     • Previous example
  9. Recurrent Neural Network
  10. Recurrent Neural Network
      • Learning algorithm (Backpropagation Through Time)
        o Unfold the RNN into a DNN (weights shared)
        o Black is the prediction, errors are bright yellow, derivatives are mustard colored.
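Below is a compact NumPy sketch of Backpropagation Through Time for the vanilla RNN above, using a per-step squared-error loss (my own toy setup, not the exact formulation behind the slide's figure). It illustrates the points on slide 10: the network is unrolled, the forward hidden states are cached, and because the weights are shared, every time step adds its contribution to the same gradient accumulators.

```python
import numpy as np

def bptt(xs, ys, W_xh, W_hh, W_hy, b_h, b_y):
    """Backpropagation Through Time for a vanilla RNN with a per-step squared-error loss."""
    T = len(xs)
    hs, y_hats = {-1: np.zeros_like(b_h)}, {}
    loss = 0.0
    # Forward pass: unroll the RNN and remember every hidden state.
    for t in range(T):
        hs[t] = np.tanh(W_xh @ xs[t] + W_hh @ hs[t - 1] + b_h)
        y_hats[t] = W_hy @ hs[t] + b_y
        loss += 0.5 * np.sum((y_hats[t] - ys[t]) ** 2)
    # Backward pass: walk the unrolled network from the last step to the first.
    dW_xh, dW_hh, dW_hy = np.zeros_like(W_xh), np.zeros_like(W_hh), np.zeros_like(W_hy)
    db_h, db_y = np.zeros_like(b_h), np.zeros_like(b_y)
    dh_next = np.zeros_like(b_h)
    for t in reversed(range(T)):
        dy = y_hats[t] - ys[t]
        dW_hy += np.outer(dy, hs[t]); db_y += dy
        dh = W_hy.T @ dy + dh_next          # gradient from this step's output plus from the future
        dz = (1.0 - hs[t] ** 2) * dh        # back through tanh
        dW_xh += np.outer(dz, xs[t]); dW_hh += np.outer(dz, hs[t - 1]); db_h += dz
        dh_next = W_hh.T @ dz               # pass the gradient one step further back in time
    return loss, (dW_xh, dW_hh, dW_hy, db_h, db_y)
```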
  11. Long-Term Dependencies Problem
      • Consider trying to predict the last word in the text "I grew up in France… I speak fluent French."
      • We need the context of France, from further back.
  12. Vanishing Gradient Problem
      w_1, w_2, … are the weights, b_1, b_2, … are the biases, and C is some cost function.
      a_j = σ(z_j), where σ is the activation function and z_j = w_j a_{j-1} + b_j is the weighted input to the neuron.
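With these definitions (the single-neuron-per-layer chain used in Neural Networks and Deep Learning, reference 4), the gradient with respect to an early bias is a product of one weight and one σ′(z) factor per layer; for a four-layer chain:

```latex
\frac{\partial C}{\partial b_1}
  = \sigma'(z_1)\, w_2\, \sigma'(z_2)\, w_3\, \sigma'(z_3)\, w_4\, \sigma'(z_4)\,
    \frac{\partial C}{\partial a_4}
```

For the logistic sigmoid |σ′(z)| ≤ 1/4, and for tanh it is at most 1, so unless the weights are large this product shrinks quickly with depth; the same kind of product appears across time steps in an unrolled RNN, which is the vanishing gradient problem.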
  13. Vanishing Gradient Problem
      • Tanh and its derivative
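For completeness, the function plotted on slide 13 and its derivative are:

```latex
\tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}},
\qquad
\frac{d}{dx}\tanh(x) = 1 - \tanh^{2}(x)
```

The derivative peaks at 1 at x = 0 and decays toward 0 as |x| grows, so long chains of these factors drive the backpropagated gradient toward zero.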
  14. Long Short-Term Memory
      • Standard RNN
      • LSTM
        o Forget gate, input gate, output gate, cell state
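For reference, the gate equations in the notation of Understanding LSTM Networks (reference 5) are as follows, with σ the logistic sigmoid and ⊙ element-wise multiplication:

```latex
\begin{aligned}
f_t &= \sigma\!\left(W_f \cdot [h_{t-1}, x_t] + b_f\right) &&\text{forget gate}\\
i_t &= \sigma\!\left(W_i \cdot [h_{t-1}, x_t] + b_i\right) &&\text{input gate}\\
\tilde{C}_t &= \tanh\!\left(W_C \cdot [h_{t-1}, x_t] + b_C\right) &&\text{candidate cell state}\\
C_t &= f_t \odot C_{t-1} + i_t \odot \tilde{C}_t &&\text{cell state update}\\
o_t &= \sigma\!\left(W_o \cdot [h_{t-1}, x_t] + b_o\right) &&\text{output gate}\\
h_t &= o_t \odot \tanh(C_t) &&\text{hidden state / output}
\end{aligned}
```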
  15. Long Short-Term Memory
  16. Long Short-Term Memory
  17. LSTM / GRU
      • LSTM vs. GRU (GRU has fewer parameters)
      [1] An Empirical Exploration of Recurrent Network Architectures
      [2] Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling
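For comparison, the GRU (again roughly in the notation of reference 5) merges the cell state and hidden state and uses only an update gate and a reset gate, which is where the smaller parameter count comes from:

```latex
\begin{aligned}
z_t &= \sigma\!\left(W_z \cdot [h_{t-1}, x_t]\right) &&\text{update gate}\\
r_t &= \sigma\!\left(W_r \cdot [h_{t-1}, x_t]\right) &&\text{reset gate}\\
\tilde{h}_t &= \tanh\!\left(W \cdot [r_t \odot h_{t-1}, x_t]\right) &&\text{candidate state}\\
h_t &= (1 - z_t) \odot h_{t-1} + z_t \odot \tilde{h}_t &&\text{hidden state}
\end{aligned}
```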
  18. Neural Machine Translation
      • Encoder-decoder
        o Input reversing ("Sequence to Sequence Learning with Neural Networks")
        o Input doubling ("Learning to Execute")
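A toy sketch of the encoder-decoder idea, using a plain tanh RNN instead of the multi-layer LSTMs in the cited papers (all names and shapes are mine, and the decoder here just emits vectors rather than sampling words):

```python
import numpy as np

def step(x, h, W_x, W_h, b):
    """One tanh RNN step; stands in for the LSTM used in the actual papers."""
    return np.tanh(W_x @ x + W_h @ h + b)

def encode_decode(src_vectors, max_len, params):
    """Toy encoder-decoder: read the source (reversed), seed the decoder with the
    encoder's final hidden state, then emit one output vector per step."""
    W_ex, W_eh, b_e, W_dx, W_dh, b_d, W_out, b_out = params
    h = np.zeros_like(b_e)
    for x in reversed(src_vectors):            # the "input reversing" trick
        h = step(x, h, W_ex, W_eh, b_e)
    outputs, y = [], np.zeros(W_dx.shape[1])   # stand-in for a start-of-sequence token
    for _ in range(max_len):
        h = step(y, h, W_dx, W_dh, b_d)        # decoder conditions on its previous output
        y = W_out @ h + b_out                  # output dim must equal the decoder input dim
        outputs.append(y)
    return outputs
```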
  19. Attention Mechanism in NMT
      Neural Machine Translation by Jointly Learning to Align and Translate. ICLR 2015
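Concretely, in the cited paper the decoder at output step i scores every encoder hidden state h_j with a small alignment network a, normalizes the scores with a softmax, and takes the weighted sum as a context vector (s_{i-1} is the previous decoder state, T_x the source length):

```latex
e_{ij} = a(s_{i-1}, h_j), \qquad
\alpha_{ij} = \frac{\exp(e_{ij})}{\sum_{k=1}^{T_x} \exp(e_{ik})}, \qquad
c_i = \sum_{j=1}^{T_x} \alpha_{ij}\, h_j
```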
  20. Visualization of Attention Matrix
      • Translating from English to French
      • Elements in each row add up to 1
      • In grayscale (0: black, 1: white)
      • Alignments found
        o La Syrie -> Syria
      Neural Machine Translation by Jointly Learning to Align and Translate. ICLR 2015
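The "rows add up to 1" property is simply the softmax normalization of the attention weights; a tiny NumPy check with toy random scores (my own example):

```python
import numpy as np

def softmax(scores):
    """Row-wise softmax: each row of the attention matrix sums to 1."""
    e = np.exp(scores - scores.max(axis=-1, keepdims=True))  # subtract the max for stability
    return e / e.sum(axis=-1, keepdims=True)

scores = np.random.default_rng(0).normal(size=(4, 6))  # 4 target words x 6 source words (toy)
attn = softmax(scores)
print(attn.sum(axis=1))  # -> [1. 1. 1. 1.], one value per target word
```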
  21. RNN Applications
      • Image captioning
        o Encode the image with a CNN, and decode the embedded information into a description with an RNN.
      Fei-Fei Li, Stanford Vision Lab
  22. RNN Applications
      • Question answering
        o Encode the document and the query with an RNN, and predict the answer token.
      Attentive Reader, from Teaching Machines to Read and Comprehend. NIPS 2015
  23. Summary
      1. RNN
         o Model structure
         o Parameters
         o Learning algorithm
      2. Long-Term Dependencies & Vanishing Gradient Problem
         o LSTM / GRU
      3. Neural Machine Translation
         o Encoder-decoder framework
      4. Attention Mechanism
         o Extract the information needed from the source
      5. Other RNN applications
         o Image captioning
         o Question answering
  24. Reference
      1. Anyone Can Learn To Code an LSTM-RNN in Python
      2. Recurrent Neural Network Tutorial, WildML
      3. Attention and Memory in Deep Learning and NLP, WildML
      4. Neural Networks and Deep Learning
      5. Understanding LSTM Networks
      6. Sequence to Sequence Learning with Neural Networks. NIPS 2014
      7. Teaching Machines to Read and Comprehend. NIPS 2015
  25. Thanks!
