A Brief Introduction on Recurrent
Neural Network and Its Application
Qiang Gan
All contents are collected online, listed in Reference page.
For Nanjing Deep Learning Meetup Only
Outline
1. RNN
o Model structure
o Parameters
o Learning algorithm
2. Long-Term Dependencies & Vanishing Gradient Problem
o LSTM / GRU
3. Neural Machine Translation
o Encoder-decoder framework
4. Attention Mechanism
o Extract information needed from source
5. Other RNN applications
o Image captioning
o Question Answering
Before we start …
Memory
• We are all familiar with the song 《Two Tigers》
o Two tigers, two tigers …
• What is the 10th word?
• We learned the words as a sequence, a kind of
conditional memory.
• More examples: driving steps, movie scenes, …
“Memory” in Neural Network
• Traditional Neural Network
o Output relies only on current input
o input -> hidden -> output
• Network with “Memory”
o Output relies on the current input and on past (history) information
o (input + prev_hidden) -> hidden -> output
“Memory” in Neural Network
• Four Steps in Network with “Memory”
1. (input + empty_hidden) -> hidden -> output
• Memory only contains blue information
2. (input + prev_hidden) -> hidden -> output
• Memory contains blue and red information
3. (input + prev_hidden) -> hidden -> output
• Memory contains blue, red and green information
4. (input + prev_hidden) -> hidden -> output
• Memory contains blue, red, green and purple information
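The four steps above can be sketched in a few lines of NumPy. This is a minimal illustration, not code from the slides: the layer sizes, the random weights, and the names `rnn_step` and `run_rnn` are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
input_size, hidden_size, output_size = 3, 4, 2

# Illustrative random parameters (assumed, not taken from the slides).
W_xh = rng.normal(scale=0.5, size=(hidden_size, input_size))
W_hh = rng.normal(scale=0.5, size=(hidden_size, hidden_size))
W_hy = rng.normal(scale=0.5, size=(output_size, hidden_size))

def rnn_step(x, h_prev):
    """(input + prev_hidden) -> hidden -> output."""
    h = np.tanh(W_xh @ x + W_hh @ h_prev)
    y = W_hy @ h
    return h, y

def run_rnn(inputs):
    h = np.zeros(hidden_size)  # step 1 starts from the empty hidden state
    hidden_states, outputs = [], []
    for x in inputs:
        h, y = rnn_step(x, h)  # the same weights are reused at every step
        hidden_states.append(h)
        outputs.append(y)
    return hidden_states, outputs

# Four inputs, standing in for the blue, red, green and purple steps.
inputs = rng.normal(size=(4, input_size))
hidden_states, outputs = run_rnn(inputs)
```

Because each hidden state is computed from the previous one, changing the first input also changes the hidden state at the fourth step, which is the "memory" the slide describes.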
Recurrent Neural Network
Recurrent Neural Network
• Previous example
Recurrent Neural Network
Recurrent Neural Network
• Learning algorithm (Backpropagation Through Time)
o Unfold the RNN through time into a deep network (DNN) whose layers share weights
o In the figure: black is the prediction, errors are bright yellow, derivatives
are mustard colored.
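A minimal sketch of backpropagation through time, assuming a scalar RNN with update h_t = tanh(w_x * x_t + w_h * h_{t-1}) and a squared-error loss on every step. The model, the loss, and the sample numbers are assumptions chosen to keep the example small; the point is only that the gradients of the shared weights are accumulated over the unfolded steps.

```python
import numpy as np

def forward(w_x, w_h, xs):
    """Unfold the RNN over the sequence; hs[0] is the initial state."""
    hs = [0.0]
    for x in xs:
        hs.append(np.tanh(w_x * x + w_h * hs[-1]))
    return hs

def loss(w_x, w_h, xs, ys):
    hs = forward(w_x, w_h, xs)
    return 0.5 * sum((h - y) ** 2 for h, y in zip(hs[1:], ys))

def bptt(w_x, w_h, xs, ys):
    """Walk the unfolded network backwards, summing shared-weight gradients."""
    hs = forward(w_x, w_h, xs)
    grad_w_x = 0.0
    grad_w_h = 0.0
    dh_next = 0.0  # gradient arriving from later time steps
    for t in range(len(xs), 0, -1):
        dh = (hs[t] - ys[t - 1]) + dh_next  # local error plus future error
        dz = dh * (1.0 - hs[t] ** 2)        # back through tanh
        grad_w_x += dz * xs[t - 1]          # same w_x at every step
        grad_w_h += dz * hs[t - 1]          # same w_h at every step
        dh_next = dz * w_h                  # pass gradient to step t-1
    return grad_w_x, grad_w_h

# Illustrative values (assumed).
xs = [0.5, -0.3, 0.8, 0.1]
ys = [0.2, 0.0, 0.4, 0.3]
w_x, w_h = 0.7, -0.4
grad_w_x, grad_w_h = bptt(w_x, w_h, xs, ys)
```

A finite-difference check on `loss` is a quick way to confirm that the accumulated gradients are right.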
Long-Term Dependencies Problem
• Consider trying to predict the last word in the text “I
grew up in France… I speak fluent French.”
• We need the context of France, from further back.
Vanishing Gradient Problem
$w_1, w_2, \dots$ are the weights, $b_1, b_2, \dots$ are the biases,
$C$ is some cost function.
$a_j = \sigma(z_j)$, where $\sigma$ is the activation function,
$z_j = w_j a_{j-1} + b_j$ is the weighted input to neuron $j$.
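With these definitions, the gradient with respect to the first bias follows from the chain rule. The sketch below assumes a chain of $n$ single-neuron layers whose final activation $a_n$ feeds the cost; the chain length $n$ is an assumption, since the slide's figure is not reproduced here.

```latex
\frac{\partial C}{\partial b_1}
  = \sigma'(z_1)\,
    \Bigl(\prod_{j=2}^{n} w_j\,\sigma'(z_j)\Bigr)\,
    \frac{\partial C}{\partial a_n}
```

Each layer contributes one factor $w_j\,\sigma'(z_j)$. When these factors have magnitude below 1, the product shrinks geometrically with $n$, so the gradient reaching the early layers vanishes. For the logistic sigmoid $\sigma'(z) \le 1/4$, and for tanh $\sigma'(z) \le 1$.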
Tanh and derivative
Vanishing Gradient Problem
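A small numeric illustration of the effect, assuming tanh activations, one shared weight, and the same pre-activation at every step. These simplifications and the helper names `tanh_derivative` and `backprop_factor` are assumptions made for the example.

```python
import numpy as np

def tanh_derivative(z):
    """d/dz tanh(z) = 1 - tanh(z)^2, which is at most 1 (reached at z = 0)."""
    return 1.0 - np.tanh(z) ** 2

def backprop_factor(weight, pre_activations):
    """Product of weight * tanh'(z) over a chain of steps."""
    factor = 1.0
    for z in pre_activations:
        factor *= weight * tanh_derivative(z)
    return factor

short_chain = backprop_factor(0.9, [0.5] * 5)
long_chain = backprop_factor(0.9, [0.5] * 50)
print(short_chain, long_chain)
```

The factor for the 50-step chain is many orders of magnitude smaller than for the 5-step chain, which is why an RNN struggles to learn from context far back in the sequence.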
Long Short-Term Memory
• Standard RNN
• LSTM
o Forget gate, input gate, output gate, cell state
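One LSTM step can be sketched as follows. This assumes the common formulation in which the four blocks (forget gate, input gate, output gate, candidate cell values) are stacked in one weight matrix; the stacking order and the name `lstm_step` are choices made for this example.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, b):
    """One LSTM step.

    W has shape (4 * hidden, input + hidden) and b has shape (4 * hidden,).
    Row blocks, in order: forget gate, input gate, output gate, candidate.
    """
    hidden = h_prev.shape[0]
    z = W @ np.concatenate([x, h_prev]) + b
    f = sigmoid(z[0 * hidden:1 * hidden])  # forget gate: what to keep of c_prev
    i = sigmoid(z[1 * hidden:2 * hidden])  # input gate: how much new content to write
    o = sigmoid(z[2 * hidden:3 * hidden])  # output gate: what to expose as h
    g = np.tanh(z[3 * hidden:4 * hidden])  # candidate cell values
    c = f * c_prev + i * g                 # cell state update
    h = o * np.tanh(c)                     # new hidden state
    return h, c

# Illustrative sizes and random parameters (assumed).
rng = np.random.default_rng(0)
input_size, hidden_size = 3, 4
W = rng.normal(scale=0.5, size=(4 * hidden_size, input_size + hidden_size))
b = np.zeros(4 * hidden_size)
h, c = lstm_step(rng.normal(size=input_size),
                 np.zeros(hidden_size), np.zeros(hidden_size), W, b)
```

The cell state update is additive: when the forget gate is near 1 and the input gate near 0, `c` is carried forward almost unchanged, which gives gradients a path that does not shrink at every step.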
Long Short-Term Memory
Long Short-Term Memory
LSTM / GRU
LSTM vs. GRU (the GRU has fewer parameters)
[1] An Empirical Exploration of Recurrent Network Architectures
[2] Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling
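The "fewer parameters" remark can be made concrete with a rough count. The sketch assumes the textbook formulations, where an LSTM layer has four weight blocks (three gates plus the candidate) and a GRU layer has three (two gates plus the candidate), each with one bias vector. Library implementations may differ slightly, for example by keeping two bias vectors per block.

```python
def gated_rnn_params(input_size, hidden_size, num_blocks):
    """Parameters in one layer: each block has a weight matrix and a bias."""
    per_block = hidden_size * (input_size + hidden_size) + hidden_size
    return num_blocks * per_block

def lstm_params(input_size, hidden_size):
    return gated_rnn_params(input_size, hidden_size, num_blocks=4)

def gru_params(input_size, hidden_size):
    return gated_rnn_params(input_size, hidden_size, num_blocks=3)

print(lstm_params(10, 20))  # 2480
print(gru_params(10, 20))   # 1860
```

Under these assumptions a GRU layer has three quarters of the parameters of an LSTM layer of the same size.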
Neural Machine Translation
• Encoder-decoder
o Input reversing
• 《Sequence to Sequence Learning with Neural Networks》
o Input doubling
• 《Learning to Execute》
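Both tricks are preprocessing steps applied to the source sequence before it reaches the encoder. A minimal sketch, with hypothetical helper names `reverse_source` and `double_source` and a made-up token list:

```python
def reverse_source(tokens):
    """Input reversing: feed the source tokens to the encoder in reverse order."""
    return list(reversed(tokens))

def double_source(tokens):
    """Input doubling: present the source sequence to the encoder twice."""
    return list(tokens) + list(tokens)

source = ["A", "B", "C"]
print(reverse_source(source))  # ['C', 'B', 'A']
print(double_source(source))   # ['A', 'B', 'C', 'A', 'B', 'C']
```

With reversing, the first source tokens end up close to the first target tokens the decoder has to produce, which shortens those dependencies.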
Attention Mechanism in NMT
Neural Machine Translation by Jointly Learning to Align and Translate. ICLR 2015
Visualization of Attention Matrix
• Translating from English to French
• Elements in each row add up to 1
• Shown in grayscale (0: black, 1: white)
• Alignments found
• La Syrie -> Syria
Neural Machine Translation by Jointly Learning to Align and Translate. ICLR 2015
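An attention matrix with rows that add up to 1 can be produced by applying a softmax over the source positions. The sketch below uses a plain dot product as the score, which is a simplification: the cited paper learns an additive scoring function instead. The state sizes and random values are assumptions for illustration.

```python
import numpy as np

def softmax(scores, axis=-1):
    shifted = scores - scores.max(axis=axis, keepdims=True)  # numerical stability
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)

def attention(decoder_states, encoder_states):
    """Return the attention matrix and one context vector per target step."""
    scores = decoder_states @ encoder_states.T  # shape (target_len, source_len)
    weights = softmax(scores, axis=1)           # each row sums to 1
    contexts = weights @ encoder_states         # weighted sum of source states
    return weights, contexts

# Illustrative random states (assumed): 5 source positions, 3 target steps.
rng = np.random.default_rng(0)
encoder_states = rng.normal(size=(5, 8))
decoder_states = rng.normal(size=(3, 8))
weights, contexts = attention(decoder_states, encoder_states)
```

Row `t` of `weights` says how much each source position contributes when producing target word `t`. Plotting it in grayscale gives the kind of alignment picture described above.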
RNN Applications
• Image captioning
o Encode the image with CNN, and decode the embedded
information into description with RNN.
Fei-Fei Li, Stanford Vision Lab
RNN Applications
• Question answering
o Encode the document and the query with RNNs, and predict
the answer token.
Teaching Machines to Read and Comprehend. NIPS 2015
Attentive Reader
Summary
1. RNN
o Model structure
o Parameters
o Learning algorithm
2. Long-Term Dependencies & Vanishing Gradient Problem
o LSTM / GRU
3. Neural Machine Translation
o Encoder-decoder framework
4. Attention Mechanism
o Extract information needed from source
5. Other RNN applications
o Image captioning
o Question Answering
Reference
1. Anyone Can Learn To Code an LSTM-RNN in Python
2. Recurrent Neural Network Tutorial. WildML
3. Attention and Memory in Deep Learning and NLP. WildML
4. Neural Networks and Deep Learning
5. Understanding LSTM Networks
6. Sequence to Sequence Learning with Neural Networks. NIPS 2014
7. Teaching Machines to Read and Comprehend. NIPS 2015
Thanks!
