SlideShare a Scribd company logo
1 of 18
Download to read offline
Long Short Term Memory Neural Networks
Short Overview and Examples
Ralph Schlosser
https://github.com/bwv988
February 2018
Ralph Schlosser Long Short Term Memory Neural Networks February 2018 1 / 18
Overview
Agenda
RNN
Vanishing / Exploding Gradient Problem
LSTM
Keras
Outlook
Demo
Links
Git repo: https://github.com/bwv988/lstm-neural-net-tests
Demo: https://www.kaggle.com/ternaryrealm/
lstm-time-series-explorations-with-keras
Ralph Schlosser Long Short Term Memory Neural Networks February 2018 2 / 18
RNN
Recurrent Neural Networks (RNN) are an extension to traditional feed
forward NN.
Original application: Sequence data, e.g.:
Music, video
Words in a sentence
Financial data
Image patterns
Main advantage over traditional (D)NNs: Can retain state over a
period of time.
There are other tools to model sequence data, e.g. Hidden Markov
Models.
But: Becomes computationally unfeasible for modelling large time
dependencies.
Today, RNNs often outperform classical sequence models.
Ralph Schlosser Long Short Term Memory Neural Networks February 2018 3 / 18
Elements of a simple RNN
Input layer: x with weight θx .
Hidden, recursive layer (feeds back into itself): h with weight θ.
Output layer: y with weight θy .
Arbitrary (e.g. RELU) activation function φ(·).
ht = θφ(ht−1) + θx xt
yt = θy φ(ht)
Ralph Schlosser Long Short Term Memory Neural Networks February 2018 4 / 18
Unrolling the Recursion
We can see how this is applicable to sequence data when unrolling the
recursion:
Ralph Schlosser Long Short Term Memory Neural Networks February 2018 5 / 18
Vanishing / Exploding Gradient Problem
Training the RNN means: Perform backpropagation to find optimal
weights.
Need to minimize the error (or loss) function E wrt., say, parameter θ.
Optimization problem for S steps:
∂E
∂θ
=
S
t=1
∂Et
∂θ
Applying the chain rule gives that for a particular time step t, and
looking at θk happening in layer k:
∂Et
∂θ
=
t
k=1
∂Et
∂yt
∂yt
∂ht
∂ht
∂hk
∂hk
∂θ
Ralph Schlosser Long Short Term Memory Neural Networks February 2018 6 / 18
Vanishing / Exploding Gradient Problem
The issue is with the term ∂ht
∂hk
.
Further maths shows (omitting many, many details):
∂ht
∂hk
≤ ct−k
Here: c is some constant term related to θ and the choice of the
activation function φ.
Problem:
c < 1: Gradients tend to zero (vanish).
c > 1: Gradients will tend to infinity (explode).
Impact of vanishing gradients to RNN: Can’t “remember” impacts of
long sequences.
Ralph Schlosser Long Short Term Memory Neural Networks February 2018 7 / 18
LSTM
Variant of RNNs that introduce a number of special, internal gates.
Internal gates help with the problem of learning relationships between
both long and short sequences in data.
Con: Introduces many more internal parameters which must be learned.
– Time consuming
Pro: Introduces many more internal parameters which must be learned.
– Flexible
Source: https://blog.statsbot.co/time-series-prediction-using-recurrent-neural-networks-lstms-807fa6ca7f
Ralph Schlosser Long Short Term Memory Neural Networks February 2018 8 / 18
LSTM Gates
Input gate i:
Takes previous output ht−1 and current input xt.
it ∈ (0, 1)
it = σ(θxi xt + θhtht−1 + bi )
Forget gate f :
Takes previous output ht−1 and current input xt.
ft ∈ (0, 1)
ft = σ(θxf xt + θhf ht−1 + bf )
If ft = 0: Forget previous state, otherwise pass through prev. state.
Read gate g:
Takes previous output ht−1 and current input xt.
gt ∈ (0, 1)
gt = σ(θxg xt + θhg ht−1 + bg )
Ralph Schlosser Long Short Term Memory Neural Networks February 2018 9 / 18
LSTM Gates
Cell gate c:
New value depends on ft, its previous state ct−1, and the read gate gt.
Element-wise multiplication: ct = ft ct−1 + it gt.
We can learn whether to store or erase the old cell value.
Output gate o:
ot = σ(θxoxt + θhoht−1 + bo)
ot ∈ (0, 1)
New output gate h:
ht = ot tanh(ct)
Will be fed as input into next block.
Intuition:
We learn when to retain a state, or when to forget it.
Parameters are constantly updated as new data arrives.
Ralph Schlosser Long Short Term Memory Neural Networks February 2018 10 / 18
Practical Part
Let’s see this in action sans some of the more technical details. ;)
The practical examples are based on Keras: https://keras.io/
First a few words on Keras.
Ralph Schlosser Long Short Term Memory Neural Networks February 2018 11 / 18
Keras
Consistent and simple high-level APIs for Deep Learning in Python.
Focus on getting stuff done w/o having to write tons of lines of code.
Fantastic documentation!
Has abstraction layer for multiple Deep Learning backends:
Tensorflow
CNTK
Theano (has reached its final release)
mxnet (experimental?)
Comparable in its ease of use to sklearn in Python, or mlr in R.
Ralph Schlosser Long Short Term Memory Neural Networks February 2018 12 / 18
Keras Runs Everywhere
On iOS, via Apple’s CoreML (Keras support officially provided by
Apple).
On Android, via the TensorFlow Android runtime. Example: Not
Hotdog app.
In the browser, via GPU-accelerated JavaScript runtimes such as
Keras.js and WebDNN.
On Google Cloud, via TensorFlow-Serving.
In a Python webapp backend (such as a Flask app).
On the JVM, via DL4J model import provided by SkyMind.
On Raspberry Pi.
Ralph Schlosser Long Short Term Memory Neural Networks February 2018 13 / 18
Keras & Ranking in the ML Community
Source: https://keras.io/why-use-keras/
Ralph Schlosser Long Short Term Memory Neural Networks February 2018 14 / 18
Outlook
Some interesting, more recent advances with LSTM.
LSTMs are Turing-complete.
As a result: Can produce any output a human-made computer program
could produce, given sufficient units and weights (and of course time,
money, computational power).
DNNs are often called universal function approximators; LSTMs are
universal program approximators.
Ralph Schlosser Long Short Term Memory Neural Networks February 2018 15 / 18
O M G
Is the end of human-made software nigh???? ;)
Neural Turing Machines: LSTMs and other techniques can be leveraged to
learn (as of yet simple) algorithms from data:
https: // arxiv. org/ pdf/ 1410. 5401. pdf
Ralph Schlosser Long Short Term Memory Neural Networks February 2018 16 / 18
Demo
Let’s run this one on Kaggle:
https://www.kaggle.com/ternaryrealm/
lstm-time-series-explorations-with-keras
Ralph Schlosser Long Short Term Memory Neural Networks February 2018 17 / 18
References
Main source for this presentation – Nando de Freitas brilliant lecture:
https://www.youtube.com/watch?v=56TYLaQN4N8
Ilya Sutskever PhD thesis: http://www.cs.utoronto.ca/~ilya/
pubs/ilya_sutskever_phd_thesis.pdf
“A Critical Review of Recurrent Neural Networks for Sequence
Learning”: https://arxiv.org/abs/1506.00019
Why RNNs are difficult to train:
https://arxiv.org/pdf/1211.5063.pdf
Original LSTM paper: https://www.mitpressjournals.org/doi/
abs/10.1162/neco.1997.9.8.1735
Keras documentation: https://keras.io/
Nice blog post explaining LSTMs: https://blog.statsbot.co/
time-series-prediction-using-recurrent-neural-networks-lst
Ralph Schlosser Long Short Term Memory Neural Networks February 2018 18 / 18

More Related Content

What's hot

Time series predictions using LSTMs
Time series predictions using LSTMsTime series predictions using LSTMs
Time series predictions using LSTMsSetu Chokshi
 
Recurrent neural networks rnn
Recurrent neural networks   rnnRecurrent neural networks   rnn
Recurrent neural networks rnnKuppusamy P
 
Long Short Term Memory (Neural Networks)
Long Short Term Memory (Neural Networks)Long Short Term Memory (Neural Networks)
Long Short Term Memory (Neural Networks)Olusola Amusan
 
Introduction to Recurrent Neural Network
Introduction to Recurrent Neural NetworkIntroduction to Recurrent Neural Network
Introduction to Recurrent Neural NetworkKnoldus Inc.
 
RNN and its applications
RNN and its applicationsRNN and its applications
RNN and its applicationsSungjoon Choi
 
Recurrent Neural Networks. Part 1: Theory
Recurrent Neural Networks. Part 1: TheoryRecurrent Neural Networks. Part 1: Theory
Recurrent Neural Networks. Part 1: TheoryAndrii Gakhov
 
Recurrent Neural Network (RNN) | RNN LSTM Tutorial | Deep Learning Course | S...
Recurrent Neural Network (RNN) | RNN LSTM Tutorial | Deep Learning Course | S...Recurrent Neural Network (RNN) | RNN LSTM Tutorial | Deep Learning Course | S...
Recurrent Neural Network (RNN) | RNN LSTM Tutorial | Deep Learning Course | S...Simplilearn
 
Introduction to Recurrent Neural Network
Introduction to Recurrent Neural NetworkIntroduction to Recurrent Neural Network
Introduction to Recurrent Neural NetworkYan Xu
 
Recurrent Neural Networks
Recurrent Neural NetworksRecurrent Neural Networks
Recurrent Neural NetworksSharath TS
 
An Introduction to Long Short-term Memory (LSTMs)
An Introduction to Long Short-term Memory (LSTMs)An Introduction to Long Short-term Memory (LSTMs)
An Introduction to Long Short-term Memory (LSTMs)EmmanuelJosterSsenjo
 
Deep Learning: Recurrent Neural Network (Chapter 10)
Deep Learning: Recurrent Neural Network (Chapter 10) Deep Learning: Recurrent Neural Network (Chapter 10)
Deep Learning: Recurrent Neural Network (Chapter 10) Larry Guo
 
Seq2Seq (encoder decoder) model
Seq2Seq (encoder decoder) modelSeq2Seq (encoder decoder) model
Seq2Seq (encoder decoder) model佳蓉 倪
 
Recurrent Neural Networks (RNN) | RNN LSTM | Deep Learning Tutorial | Tensorf...
Recurrent Neural Networks (RNN) | RNN LSTM | Deep Learning Tutorial | Tensorf...Recurrent Neural Networks (RNN) | RNN LSTM | Deep Learning Tutorial | Tensorf...
Recurrent Neural Networks (RNN) | RNN LSTM | Deep Learning Tutorial | Tensorf...Edureka!
 
An introduction to reinforcement learning
An introduction to reinforcement learningAn introduction to reinforcement learning
An introduction to reinforcement learningSubrat Panda, PhD
 
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
BERT: Pre-training of Deep Bidirectional Transformers for Language UnderstandingBERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
BERT: Pre-training of Deep Bidirectional Transformers for Language UnderstandingMinh Pham
 
Training Neural Networks
Training Neural NetworksTraining Neural Networks
Training Neural NetworksDatabricks
 
Sequence Modelling with Deep Learning
Sequence Modelling with Deep LearningSequence Modelling with Deep Learning
Sequence Modelling with Deep LearningNatasha Latysheva
 
Deep learning - A Visual Introduction
Deep learning - A Visual IntroductionDeep learning - A Visual Introduction
Deep learning - A Visual IntroductionLukas Masuch
 

What's hot (20)

Time series predictions using LSTMs
Time series predictions using LSTMsTime series predictions using LSTMs
Time series predictions using LSTMs
 
Recurrent neural networks rnn
Recurrent neural networks   rnnRecurrent neural networks   rnn
Recurrent neural networks rnn
 
Long Short Term Memory (Neural Networks)
Long Short Term Memory (Neural Networks)Long Short Term Memory (Neural Networks)
Long Short Term Memory (Neural Networks)
 
Recurrent neural network
Recurrent neural networkRecurrent neural network
Recurrent neural network
 
Introduction to Recurrent Neural Network
Introduction to Recurrent Neural NetworkIntroduction to Recurrent Neural Network
Introduction to Recurrent Neural Network
 
RNN and its applications
RNN and its applicationsRNN and its applications
RNN and its applications
 
Recurrent Neural Networks. Part 1: Theory
Recurrent Neural Networks. Part 1: TheoryRecurrent Neural Networks. Part 1: Theory
Recurrent Neural Networks. Part 1: Theory
 
Recurrent Neural Network
Recurrent Neural NetworkRecurrent Neural Network
Recurrent Neural Network
 
Recurrent Neural Network (RNN) | RNN LSTM Tutorial | Deep Learning Course | S...
Recurrent Neural Network (RNN) | RNN LSTM Tutorial | Deep Learning Course | S...Recurrent Neural Network (RNN) | RNN LSTM Tutorial | Deep Learning Course | S...
Recurrent Neural Network (RNN) | RNN LSTM Tutorial | Deep Learning Course | S...
 
Introduction to Recurrent Neural Network
Introduction to Recurrent Neural NetworkIntroduction to Recurrent Neural Network
Introduction to Recurrent Neural Network
 
Recurrent Neural Networks
Recurrent Neural NetworksRecurrent Neural Networks
Recurrent Neural Networks
 
An Introduction to Long Short-term Memory (LSTMs)
An Introduction to Long Short-term Memory (LSTMs)An Introduction to Long Short-term Memory (LSTMs)
An Introduction to Long Short-term Memory (LSTMs)
 
Deep Learning: Recurrent Neural Network (Chapter 10)
Deep Learning: Recurrent Neural Network (Chapter 10) Deep Learning: Recurrent Neural Network (Chapter 10)
Deep Learning: Recurrent Neural Network (Chapter 10)
 
Seq2Seq (encoder decoder) model
Seq2Seq (encoder decoder) modelSeq2Seq (encoder decoder) model
Seq2Seq (encoder decoder) model
 
Recurrent Neural Networks (RNN) | RNN LSTM | Deep Learning Tutorial | Tensorf...
Recurrent Neural Networks (RNN) | RNN LSTM | Deep Learning Tutorial | Tensorf...Recurrent Neural Networks (RNN) | RNN LSTM | Deep Learning Tutorial | Tensorf...
Recurrent Neural Networks (RNN) | RNN LSTM | Deep Learning Tutorial | Tensorf...
 
An introduction to reinforcement learning
An introduction to reinforcement learningAn introduction to reinforcement learning
An introduction to reinforcement learning
 
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
BERT: Pre-training of Deep Bidirectional Transformers for Language UnderstandingBERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
 
Training Neural Networks
Training Neural NetworksTraining Neural Networks
Training Neural Networks
 
Sequence Modelling with Deep Learning
Sequence Modelling with Deep LearningSequence Modelling with Deep Learning
Sequence Modelling with Deep Learning
 
Deep learning - A Visual Introduction
Deep learning - A Visual IntroductionDeep learning - A Visual Introduction
Deep learning - A Visual Introduction
 

Similar to LSTM Tutorial

Hs java open_party
Hs java open_partyHs java open_party
Hs java open_partyOpen Party
 
Microservices, containers, and machine learning
Microservices, containers, and machine learningMicroservices, containers, and machine learning
Microservices, containers, and machine learningPaco Nathan
 
Porting a Streaming Pipeline from Scala to Rust
Porting a Streaming Pipeline from Scala to RustPorting a Streaming Pipeline from Scala to Rust
Porting a Streaming Pipeline from Scala to RustEvan Chan
 
Automating the Hunt for Non-Obvious Sources of Latency Spreads
Automating the Hunt for Non-Obvious Sources of Latency SpreadsAutomating the Hunt for Non-Obvious Sources of Latency Spreads
Automating the Hunt for Non-Obvious Sources of Latency SpreadsScyllaDB
 
Distributed Postgres
Distributed PostgresDistributed Postgres
Distributed PostgresStas Kelvich
 
GraphX: Graph analytics for insights about developer communities
GraphX: Graph analytics for insights about developer communitiesGraphX: Graph analytics for insights about developer communities
GraphX: Graph analytics for insights about developer communitiesPaco Nathan
 
Pacemaker+DRBD
Pacemaker+DRBDPacemaker+DRBD
Pacemaker+DRBDDan Frincu
 
Machine Learning and Hadoop
Machine Learning and HadoopMachine Learning and Hadoop
Machine Learning and HadoopJosh Patterson
 
Data Analytics and Machine Learning: From Node to Cluster on ARM64
Data Analytics and Machine Learning: From Node to Cluster on ARM64Data Analytics and Machine Learning: From Node to Cluster on ARM64
Data Analytics and Machine Learning: From Node to Cluster on ARM64Ganesh Raju
 
BKK16-404B Data Analytics and Machine Learning- from Node to Cluster
BKK16-404B Data Analytics and Machine Learning- from Node to ClusterBKK16-404B Data Analytics and Machine Learning- from Node to Cluster
BKK16-404B Data Analytics and Machine Learning- from Node to ClusterLinaro
 
BKK16-408B Data Analytics and Machine Learning From Node to Cluster
BKK16-408B Data Analytics and Machine Learning From Node to ClusterBKK16-408B Data Analytics and Machine Learning From Node to Cluster
BKK16-408B Data Analytics and Machine Learning From Node to ClusterLinaro
 
Provenance for Data Munging Environments
Provenance for Data Munging EnvironmentsProvenance for Data Munging Environments
Provenance for Data Munging EnvironmentsPaul Groth
 
Graph Analytics in Spark
Graph Analytics in SparkGraph Analytics in Spark
Graph Analytics in SparkPaco Nathan
 
2023-02-22_Tiberti_CyberX.pdf
2023-02-22_Tiberti_CyberX.pdf2023-02-22_Tiberti_CyberX.pdf
2023-02-22_Tiberti_CyberX.pdfcifoxo
 
SCALING THE HTM SPATIAL POOLER
SCALING THE HTM SPATIAL POOLERSCALING THE HTM SPATIAL POOLER
SCALING THE HTM SPATIAL POOLERijaia
 
Comp7404 ai group_project_15apr2018_v2.1
Comp7404 ai group_project_15apr2018_v2.1Comp7404 ai group_project_15apr2018_v2.1
Comp7404 ai group_project_15apr2018_v2.1paul0001
 

Similar to LSTM Tutorial (20)

Hs java open_party
Hs java open_partyHs java open_party
Hs java open_party
 
Microservices, containers, and machine learning
Microservices, containers, and machine learningMicroservices, containers, and machine learning
Microservices, containers, and machine learning
 
Literature Review
Literature ReviewLiterature Review
Literature Review
 
Porting a Streaming Pipeline from Scala to Rust
Porting a Streaming Pipeline from Scala to RustPorting a Streaming Pipeline from Scala to Rust
Porting a Streaming Pipeline from Scala to Rust
 
Automating the Hunt for Non-Obvious Sources of Latency Spreads
Automating the Hunt for Non-Obvious Sources of Latency SpreadsAutomating the Hunt for Non-Obvious Sources of Latency Spreads
Automating the Hunt for Non-Obvious Sources of Latency Spreads
 
An Optics Life
An Optics LifeAn Optics Life
An Optics Life
 
Distributed Postgres
Distributed PostgresDistributed Postgres
Distributed Postgres
 
GraphX: Graph analytics for insights about developer communities
GraphX: Graph analytics for insights about developer communitiesGraphX: Graph analytics for insights about developer communities
GraphX: Graph analytics for insights about developer communities
 
Pacemaker+DRBD
Pacemaker+DRBDPacemaker+DRBD
Pacemaker+DRBD
 
rooter.pdf
rooter.pdfrooter.pdf
rooter.pdf
 
Machine Learning and Hadoop
Machine Learning and HadoopMachine Learning and Hadoop
Machine Learning and Hadoop
 
Data Analytics and Machine Learning: From Node to Cluster on ARM64
Data Analytics and Machine Learning: From Node to Cluster on ARM64Data Analytics and Machine Learning: From Node to Cluster on ARM64
Data Analytics and Machine Learning: From Node to Cluster on ARM64
 
BKK16-404B Data Analytics and Machine Learning- from Node to Cluster
BKK16-404B Data Analytics and Machine Learning- from Node to ClusterBKK16-404B Data Analytics and Machine Learning- from Node to Cluster
BKK16-404B Data Analytics and Machine Learning- from Node to Cluster
 
BKK16-408B Data Analytics and Machine Learning From Node to Cluster
BKK16-408B Data Analytics and Machine Learning From Node to ClusterBKK16-408B Data Analytics and Machine Learning From Node to Cluster
BKK16-408B Data Analytics and Machine Learning From Node to Cluster
 
Provenance for Data Munging Environments
Provenance for Data Munging EnvironmentsProvenance for Data Munging Environments
Provenance for Data Munging Environments
 
Graph Analytics in Spark
Graph Analytics in SparkGraph Analytics in Spark
Graph Analytics in Spark
 
2023-02-22_Tiberti_CyberX.pdf
2023-02-22_Tiberti_CyberX.pdf2023-02-22_Tiberti_CyberX.pdf
2023-02-22_Tiberti_CyberX.pdf
 
eScience Cluster Arch. Overview
eScience Cluster Arch. OvervieweScience Cluster Arch. Overview
eScience Cluster Arch. Overview
 
SCALING THE HTM SPATIAL POOLER
SCALING THE HTM SPATIAL POOLERSCALING THE HTM SPATIAL POOLER
SCALING THE HTM SPATIAL POOLER
 
Comp7404 ai group_project_15apr2018_v2.1
Comp7404 ai group_project_15apr2018_v2.1Comp7404 ai group_project_15apr2018_v2.1
Comp7404 ai group_project_15apr2018_v2.1
 

Recently uploaded

➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men 🔝Mathura🔝 Escorts...
➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men  🔝Mathura🔝   Escorts...➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men  🔝Mathura🔝   Escorts...
➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men 🔝Mathura🔝 Escorts...amitlee9823
 
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...Valters Lauzums
 
Call Girls In Attibele ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Attibele ☎ 7737669865 🥵 Book Your One night StandCall Girls In Attibele ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Attibele ☎ 7737669865 🥵 Book Your One night Standamitlee9823
 
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...ZurliaSoop
 
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 nightCheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 nightDelhi Call girls
 
DATA SUMMIT 24 Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24  Building Real-Time Pipelines With FLaNKDATA SUMMIT 24  Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24 Building Real-Time Pipelines With FLaNKTimothy Spann
 
Capstone Project on IBM Data Analytics Program
Capstone Project on IBM Data Analytics ProgramCapstone Project on IBM Data Analytics Program
Capstone Project on IBM Data Analytics ProgramMoniSankarHazra
 
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...amitlee9823
 
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...amitlee9823
 
hybrid Seed Production In Chilli & Capsicum.pptx
hybrid Seed Production In Chilli & Capsicum.pptxhybrid Seed Production In Chilli & Capsicum.pptx
hybrid Seed Production In Chilli & Capsicum.pptx9to5mart
 
➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men 🔝Dindigul🔝 Escor...
➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men  🔝Dindigul🔝   Escor...➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men  🔝Dindigul🔝   Escor...
➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men 🔝Dindigul🔝 Escor...amitlee9823
 
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% SecureCall me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% SecurePooja Nehwal
 
Call Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night StandCall Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night Standamitlee9823
 
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...amitlee9823
 
Thane Call Girls 7091864438 Call Girls in Thane Escort service book now -
Thane Call Girls 7091864438 Call Girls in Thane Escort service book now -Thane Call Girls 7091864438 Call Girls in Thane Escort service book now -
Thane Call Girls 7091864438 Call Girls in Thane Escort service book now -Pooja Nehwal
 

Recently uploaded (20)

CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men 🔝Mathura🔝 Escorts...
➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men  🔝Mathura🔝   Escorts...➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men  🔝Mathura🔝   Escorts...
➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men 🔝Mathura🔝 Escorts...
 
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
 
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
 
Call Girls In Attibele ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Attibele ☎ 7737669865 🥵 Book Your One night StandCall Girls In Attibele ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Attibele ☎ 7737669865 🥵 Book Your One night Stand
 
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
 
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 nightCheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
 
DATA SUMMIT 24 Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24  Building Real-Time Pipelines With FLaNKDATA SUMMIT 24  Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24 Building Real-Time Pipelines With FLaNK
 
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts ServiceCall Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
 
Capstone Project on IBM Data Analytics Program
Capstone Project on IBM Data Analytics ProgramCapstone Project on IBM Data Analytics Program
Capstone Project on IBM Data Analytics Program
 
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
 
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
 
hybrid Seed Production In Chilli & Capsicum.pptx
hybrid Seed Production In Chilli & Capsicum.pptxhybrid Seed Production In Chilli & Capsicum.pptx
hybrid Seed Production In Chilli & Capsicum.pptx
 
➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men 🔝Dindigul🔝 Escor...
➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men  🔝Dindigul🔝   Escor...➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men  🔝Dindigul🔝   Escor...
➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men 🔝Dindigul🔝 Escor...
 
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% SecureCall me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
 
CHEAP Call Girls in Rabindra Nagar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Rabindra Nagar  (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Rabindra Nagar  (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Rabindra Nagar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
Call Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night StandCall Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night Stand
 
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get CytotecAbortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
 
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
 
Thane Call Girls 7091864438 Call Girls in Thane Escort service book now -
Thane Call Girls 7091864438 Call Girls in Thane Escort service book now -Thane Call Girls 7091864438 Call Girls in Thane Escort service book now -
Thane Call Girls 7091864438 Call Girls in Thane Escort service book now -
 

LSTM Tutorial

  • 1. Long Short Term Memory Neural Networks Short Overview and Examples Ralph Schlosser https://github.com/bwv988 February 2018 Ralph Schlosser Long Short Term Memory Neural Networks February 2018 1 / 18
  • 2. Overview Agenda RNN Vanishing / Exploding Gradient Problem LSTM Keras Outlook Demo Links Git repo: https://github.com/bwv988/lstm-neural-net-tests Demo: https://www.kaggle.com/ternaryrealm/ lstm-time-series-explorations-with-keras Ralph Schlosser Long Short Term Memory Neural Networks February 2018 2 / 18
  • 3. RNN Recurrent Neural Networks (RNN) are an extension to traditional feed forward NN. Original application: Sequence data, e.g.: Music, video Words in a sentence Financial data Image patterns Main advantage over traditional (D)NNs: Can retain state over a period of time. There are other tools to model sequence data, e.g. Hidden Markov Models. But: Becomes computationally unfeasible for modelling large time dependencies. Today, RNNs often outperform classical sequence models. Ralph Schlosser Long Short Term Memory Neural Networks February 2018 3 / 18
  • 4. Elements of a simple RNN Input layer: x with weight θx . Hidden, recursive layer (feeds back into itself): h with weight θ. Output layer: y with weight θy . Arbitrary (e.g. RELU) activation function φ(·). ht = θφ(ht−1) + θx xt yt = θy φ(ht) Ralph Schlosser Long Short Term Memory Neural Networks February 2018 4 / 18
  • 5. Unrolling the Recursion We can see how this is applicable to sequence data when unrolling the recursion: Ralph Schlosser Long Short Term Memory Neural Networks February 2018 5 / 18
  • 6. Vanishing / Exploding Gradient Problem Training the RNN means: Perform backpropagation to find optimal weights. Need to minimize the error (or loss) function E wrt., say, parameter θ. Optimization problem for S steps: ∂E ∂θ = S t=1 ∂Et ∂θ Applying the chain rule gives that for a particular time step t, and looking at θk happening in layer k: ∂Et ∂θ = t k=1 ∂Et ∂yt ∂yt ∂ht ∂ht ∂hk ∂hk ∂θ Ralph Schlosser Long Short Term Memory Neural Networks February 2018 6 / 18
  • 7. Vanishing / Exploding Gradient Problem The issue is with the term ∂ht ∂hk . Further maths shows (omitting many, many details): ∂ht ∂hk ≤ ct−k Here: c is some constant term related to θ and the choice of the activation function φ. Problem: c < 1: Gradients tend to zero (vanish). c > 1: Gradients will tend to infinity (explode). Impact of vanishing gradients to RNN: Can’t “remember” impacts of long sequences. Ralph Schlosser Long Short Term Memory Neural Networks February 2018 7 / 18
  • 8. LSTM Variant of RNNs that introduce a number of special, internal gates. Internal gates help with the problem of learning relationships between both long and short sequences in data. Con: Introduces many more internal parameters which must be learned. – Time consuming Pro: Introduces many more internal parameters which must be learned. – Flexible Source: https://blog.statsbot.co/time-series-prediction-using-recurrent-neural-networks-lstms-807fa6ca7f Ralph Schlosser Long Short Term Memory Neural Networks February 2018 8 / 18
  • 9. LSTM Gates Input gate i: Takes previous output ht−1 and current input xt. it ∈ (0, 1) it = σ(θxi xt + θhtht−1 + bi ) Forget gate f : Takes previous output ht−1 and current input xt. ft ∈ (0, 1) ft = σ(θxf xt + θhf ht−1 + bf ) If ft = 0: Forget previous state, otherwise pass through prev. state. Read gate g: Takes previous output ht−1 and current input xt. gt ∈ (0, 1) gt = σ(θxg xt + θhg ht−1 + bg ) Ralph Schlosser Long Short Term Memory Neural Networks February 2018 9 / 18
  • 10. LSTM Gates Cell gate c: New value depends on ft, its previous state ct−1, and the read gate gt. Element-wise multiplication: ct = ft ct−1 + it gt. We can learn whether to store or erase the old cell value. Output gate o: ot = σ(θxoxt + θhoht−1 + bo) ot ∈ (0, 1) New output gate h: ht = ot tanh(ct) Will be fed as input into next block. Intuition: We learn when to retain a state, or when to forget it. Parameters are constantly updated as new data arrives. Ralph Schlosser Long Short Term Memory Neural Networks February 2018 10 / 18
  • 11. Practical Part Let’s see this in action sans some of the more technical details. ;) The practical examples are based on Keras: https://keras.io/ First a few words on Keras. Ralph Schlosser Long Short Term Memory Neural Networks February 2018 11 / 18
  • 12. Keras Consistent and simple high-level APIs for Deep Learning in Python. Focus on getting stuff done w/o having to write tons of lines of code. Fantastic documentation! Has abstraction layer for multiple Deep Learning backends: Tensorflow CNTK Theano (has reached its final release) mxnet (experimental?) Comparable in its ease of use to sklearn in Python, or mlr in R. Ralph Schlosser Long Short Term Memory Neural Networks February 2018 12 / 18
  • 13. Keras Runs Everywhere On iOS, via Apple’s CoreML (Keras support officially provided by Apple). On Android, via the TensorFlow Android runtime. Example: Not Hotdog app. In the browser, via GPU-accelerated JavaScript runtimes such as Keras.js and WebDNN. On Google Cloud, via TensorFlow-Serving. In a Python webapp backend (such as a Flask app). On the JVM, via DL4J model import provided by SkyMind. On Raspberry Pi. Ralph Schlosser Long Short Term Memory Neural Networks February 2018 13 / 18
  • 14. Keras & Ranking in the ML Community Source: https://keras.io/why-use-keras/ Ralph Schlosser Long Short Term Memory Neural Networks February 2018 14 / 18
  • 15. Outlook Some interesting, more recent advances with LSTM. LSTMs are Turing-complete. As a result: Can produce any output a human-made computer program could produce, given sufficient units and weights (and of course time, money, computational power). DNNs are often called universal function approximators; LSTMs are universal program approximators. Ralph Schlosser Long Short Term Memory Neural Networks February 2018 15 / 18
  • 16. O M G Is the end of human-made software nigh???? ;) Neural Turing Machines: LSTMs and other techniques can be leveraged to learn (as of yet simple) algorithms from data: https: // arxiv. org/ pdf/ 1410. 5401. pdf Ralph Schlosser Long Short Term Memory Neural Networks February 2018 16 / 18
  • 17. Demo Let’s run this one on Kaggle: https://www.kaggle.com/ternaryrealm/ lstm-time-series-explorations-with-keras Ralph Schlosser Long Short Term Memory Neural Networks February 2018 17 / 18
  • 18. References Main source for this presentation – Nando de Freitas brilliant lecture: https://www.youtube.com/watch?v=56TYLaQN4N8 Ilya Sutskever PhD thesis: http://www.cs.utoronto.ca/~ilya/ pubs/ilya_sutskever_phd_thesis.pdf “A Critical Review of Recurrent Neural Networks for Sequence Learning”: https://arxiv.org/abs/1506.00019 Why RNNs are difficult to train: https://arxiv.org/pdf/1211.5063.pdf Original LSTM paper: https://www.mitpressjournals.org/doi/ abs/10.1162/neco.1997.9.8.1735 Keras documentation: https://keras.io/ Nice blog post explaining LSTMs: https://blog.statsbot.co/ time-series-prediction-using-recurrent-neural-networks-lst Ralph Schlosser Long Short Term Memory Neural Networks February 2018 18 / 18