SlideShare a Scribd company logo
Introduction to theano
Case study of word embedding models
Shashank
Gupta
MS, SIEL IIIT-H
Numpy is fast, why theano?
Bound by CPU
Bound by Python interpreter (less scope of runtime optimization like lazy
evaluation etc.)
Lack of symbolic differentiation
Finite difference method exists, but prone to numerical errors
SymPy supports symbolic differentiation but expression graphs not optimized as
theano
What’s theano?
Math expression compiler
Compiles a math expression to optimized C code
Targeted for GPU (CPU if GPU not supported)
Strongly typed as compared to python’s dynamically typed system
Works as GPU metaprogrammer (abstraction on top of GPU programming)
Alternative to theano
CUDA (PyCUDA it’s python wrapper)
Asynchronous Stochastic Gradient Descent - smart way to parallelize SGD
Main idea - Don’t create lock for one update, let threads run in async.
Could speed up ML algos more than matrix-vector parallelization
Automatic Differentiation
Crude way - Finite difference method
df / dx = (f(x + h) - f(x - h)) / 2*h
Prone to coding bugs
Prone to numerical error
Symbolic differentiation
Calculate gradient analytically
For a given math expression construct symbolic computation graph
Ref:
http://cs231n.github.io/optimization-2/
Backprop using symbolic differentiation
Once graph is computed forward pass is just flow of input to compute output
Backprop is going backwards from last node to compute gradients locally at each
node and accumulate those local gradients to compute global gradient (using
chain rule)
Ex :
Cont.
Each entity in the computation graph is a symbolic variable (a node in
computational graph)
Each operation is an ‘op’ node (theano specific)
‘Op’ node takes some inputs and produces some output
‘Apply’ node which applies op to inputs
‘Type’ nodes with associated type information of symbolic variable
Cont.
This is kind of abstract representation of math expression
Can think as intermediate representation in standard compiler phase
This is how theano represents the computational graph internally
Cont.
Ref: http://deeplearning.net/software/theano/extending/graphstructures.html
Optimization
Once this ‘intermediate representation’ is generated theano optimizes this graph
for efficient computation
Can think of it as compiler optimization phase
It reorders, remove redundant expression etc. to generate an ‘equivalent’ graph
which gives same output as input
Shared variables
Symbolic variables with predefined values
In Machine learning used to define parameters of the models with pre-defined
values (W, b matrices in NN)
Sends these values to host GPU with optimised storage
Theano functions
Compiles the ‘abstract’ computation graph to optimised C code
Compiles it targeting GPU
C code is highly optimized for numeric computation
Sort of interface between theano code and it’s calling python code
Case study : Word embedding models
1st Model : Autoencoder based word embedding
Ref : http://arxiv.org/pdf/1412.4930v2.pdf
Cont.
U is the embedding matrix which is learned by optimizing this objective
Code
Refer to github gist :
https://gist.github.com/shashankg7/aec2303803e7b39b150a9f78cb59db09
Only theano part included, I/O and preprocessing omitted
Model 2 - GloVe word embedding
Optimizes squared loss function
Challenges in practical implementation in theano
Main thing to remember - VECTORIZED implementation
How to handle large matrices
Refer to github gist :
https://gist.github.com/shashankg7/aec2303803e7b39b150a9f78cb59db09
Model 3 : Skip-gram negative sampling
Alpha stage: Not able to figure out vectorization of loss function
Refer to github gist :
https://gist.github.com/shashankg7/aec2303803e7b39b150a9f78cb59db09
THANKS !!!

More Related Content

What's hot

Scientific Python
Scientific PythonScientific Python
Scientific Python
Eueung Mulyana
 
Introduction To TensorFlow | Deep Learning Using TensorFlow | CloudxLab
Introduction To TensorFlow | Deep Learning Using TensorFlow | CloudxLabIntroduction To TensorFlow | Deep Learning Using TensorFlow | CloudxLab
Introduction To TensorFlow | Deep Learning Using TensorFlow | CloudxLab
CloudxLab
 
TensorFlow Tutorial | Deep Learning With TensorFlow | TensorFlow Tutorial For...
TensorFlow Tutorial | Deep Learning With TensorFlow | TensorFlow Tutorial For...TensorFlow Tutorial | Deep Learning With TensorFlow | TensorFlow Tutorial For...
TensorFlow Tutorial | Deep Learning With TensorFlow | TensorFlow Tutorial For...
Simplilearn
 
Deep Learning, Keras, and TensorFlow
Deep Learning, Keras, and TensorFlowDeep Learning, Keras, and TensorFlow
Deep Learning, Keras, and TensorFlow
Oswald Campesato
 
Predicting organic reaction outcomes with weisfeiler lehman network
Predicting organic reaction outcomes with weisfeiler lehman networkPredicting organic reaction outcomes with weisfeiler lehman network
Predicting organic reaction outcomes with weisfeiler lehman network
Kazuki Fujikawa
 
Dsp lab pdf
Dsp lab pdfDsp lab pdf
Dsp lab pdf
shareslidesplease
 
Dsp lab manual 15 11-2016
Dsp lab manual 15 11-2016Dsp lab manual 15 11-2016
Dsp lab manual 15 11-2016
Gopinath.B.L Naidu
 
"PyTorch Deep Learning Framework: Status and Directions," a Presentation from...
"PyTorch Deep Learning Framework: Status and Directions," a Presentation from..."PyTorch Deep Learning Framework: Status and Directions," a Presentation from...
"PyTorch Deep Learning Framework: Status and Directions," a Presentation from...
Edge AI and Vision Alliance
 
Digital Signal Processing Lab Manual
Digital Signal Processing Lab Manual Digital Signal Processing Lab Manual
Digital Signal Processing Lab Manual
Amairullah Khan Lodhi
 
Learning stochastic neural networks with Chainer
Learning stochastic neural networks with ChainerLearning stochastic neural networks with Chainer
Learning stochastic neural networks with Chainer
Seiya Tokui
 
Deep Learning with PyTorch
Deep Learning with PyTorchDeep Learning with PyTorch
Deep Learning with PyTorch
Mayur Bhangale
 
DSP Mat Lab
DSP Mat LabDSP Mat Lab
Introduction to PyTorch
Introduction to PyTorchIntroduction to PyTorch
Introduction to PyTorch
Jun Young Park
 
Introduction to Deep Learning, Keras, and TensorFlow
Introduction to Deep Learning, Keras, and TensorFlowIntroduction to Deep Learning, Keras, and TensorFlow
Introduction to Deep Learning, Keras, and TensorFlow
Sri Ambati
 
Introduction to TensorFlow
Introduction to TensorFlowIntroduction to TensorFlow
Introduction to TensorFlow
Ralph Vincent Regalado
 
Differences of Deep Learning Frameworks
Differences of Deep Learning FrameworksDifferences of Deep Learning Frameworks
Differences of Deep Learning Frameworks
Seiya Tokui
 
Attention mechanisms with tensorflow
Attention mechanisms with tensorflowAttention mechanisms with tensorflow
Attention mechanisms with tensorflow
Keon Kim
 
Dive Into PyTorch
Dive Into PyTorchDive Into PyTorch
Dive Into PyTorch
Illarion Khlestov
 
Pytorch for tf_developers
Pytorch for tf_developersPytorch for tf_developers
Pytorch for tf_developers
Abdul Muneer
 
Tensor flow
Tensor flowTensor flow
Tensor flow
Nikhil Krishna Nair
 

What's hot (20)

Scientific Python
Scientific PythonScientific Python
Scientific Python
 
Introduction To TensorFlow | Deep Learning Using TensorFlow | CloudxLab
Introduction To TensorFlow | Deep Learning Using TensorFlow | CloudxLabIntroduction To TensorFlow | Deep Learning Using TensorFlow | CloudxLab
Introduction To TensorFlow | Deep Learning Using TensorFlow | CloudxLab
 
TensorFlow Tutorial | Deep Learning With TensorFlow | TensorFlow Tutorial For...
TensorFlow Tutorial | Deep Learning With TensorFlow | TensorFlow Tutorial For...TensorFlow Tutorial | Deep Learning With TensorFlow | TensorFlow Tutorial For...
TensorFlow Tutorial | Deep Learning With TensorFlow | TensorFlow Tutorial For...
 
Deep Learning, Keras, and TensorFlow
Deep Learning, Keras, and TensorFlowDeep Learning, Keras, and TensorFlow
Deep Learning, Keras, and TensorFlow
 
Predicting organic reaction outcomes with weisfeiler lehman network
Predicting organic reaction outcomes with weisfeiler lehman networkPredicting organic reaction outcomes with weisfeiler lehman network
Predicting organic reaction outcomes with weisfeiler lehman network
 
Dsp lab pdf
Dsp lab pdfDsp lab pdf
Dsp lab pdf
 
Dsp lab manual 15 11-2016
Dsp lab manual 15 11-2016Dsp lab manual 15 11-2016
Dsp lab manual 15 11-2016
 
"PyTorch Deep Learning Framework: Status and Directions," a Presentation from...
"PyTorch Deep Learning Framework: Status and Directions," a Presentation from..."PyTorch Deep Learning Framework: Status and Directions," a Presentation from...
"PyTorch Deep Learning Framework: Status and Directions," a Presentation from...
 
Digital Signal Processing Lab Manual
Digital Signal Processing Lab Manual Digital Signal Processing Lab Manual
Digital Signal Processing Lab Manual
 
Learning stochastic neural networks with Chainer
Learning stochastic neural networks with ChainerLearning stochastic neural networks with Chainer
Learning stochastic neural networks with Chainer
 
Deep Learning with PyTorch
Deep Learning with PyTorchDeep Learning with PyTorch
Deep Learning with PyTorch
 
DSP Mat Lab
DSP Mat LabDSP Mat Lab
DSP Mat Lab
 
Introduction to PyTorch
Introduction to PyTorchIntroduction to PyTorch
Introduction to PyTorch
 
Introduction to Deep Learning, Keras, and TensorFlow
Introduction to Deep Learning, Keras, and TensorFlowIntroduction to Deep Learning, Keras, and TensorFlow
Introduction to Deep Learning, Keras, and TensorFlow
 
Introduction to TensorFlow
Introduction to TensorFlowIntroduction to TensorFlow
Introduction to TensorFlow
 
Differences of Deep Learning Frameworks
Differences of Deep Learning FrameworksDifferences of Deep Learning Frameworks
Differences of Deep Learning Frameworks
 
Attention mechanisms with tensorflow
Attention mechanisms with tensorflowAttention mechanisms with tensorflow
Attention mechanisms with tensorflow
 
Dive Into PyTorch
Dive Into PyTorchDive Into PyTorch
Dive Into PyTorch
 
Pytorch for tf_developers
Pytorch for tf_developersPytorch for tf_developers
Pytorch for tf_developers
 
Tensor flow
Tensor flowTensor flow
Tensor flow
 

Similar to Introduction to theano, case study of Word Embeddings

Migration To Multi Core - Parallel Programming Models
Migration To Multi Core - Parallel Programming ModelsMigration To Multi Core - Parallel Programming Models
Migration To Multi Core - Parallel Programming Models
Zvi Avraham
 
Parallel Programming Primer
Parallel Programming PrimerParallel Programming Primer
Parallel Programming Primer
Sri Prasanna
 
Parallel Programming Primer 1
Parallel Programming Primer 1Parallel Programming Primer 1
Parallel Programming Primer 1
mobius.cn
 
parellel computing
parellel computingparellel computing
parellel computing
katakdound
 
Unmanaged Parallelization via P/Invoke
Unmanaged Parallelization via P/InvokeUnmanaged Parallelization via P/Invoke
Unmanaged Parallelization via P/Invoke
Dmitri Nesteruk
 
Parallel Programming on the ANDC cluster
Parallel Programming on the ANDC clusterParallel Programming on the ANDC cluster
Parallel Programming on the ANDC cluster
Sudhang Shankar
 
Tom Peters, Software Engineer, Ufora at MLconf ATL 2016
Tom Peters, Software Engineer, Ufora at MLconf ATL 2016Tom Peters, Software Engineer, Ufora at MLconf ATL 2016
Tom Peters, Software Engineer, Ufora at MLconf ATL 2016
MLconf
 
C Programming Unit-1
C Programming Unit-1C Programming Unit-1
C Programming Unit-1
Vikram Nandini
 
Writing Efficient Code Feb 08
Writing Efficient Code Feb 08Writing Efficient Code Feb 08
Writing Efficient Code Feb 08
Ganesh Samarthyam
 
Compiler gate question key
Compiler gate question keyCompiler gate question key
Compiler gate question key
ArthyR3
 
25-MPI-OpenMP.pptx
25-MPI-OpenMP.pptx25-MPI-OpenMP.pptx
25-MPI-OpenMP.pptx
GopalPatidar13
 
Cpcs302 1
Cpcs302  1Cpcs302  1
Cpcs302 1
guest5de1a5
 
Automatic Task-based Code Generation for High Performance DSEL
Automatic Task-based Code Generation for High Performance DSELAutomatic Task-based Code Generation for High Performance DSEL
Automatic Task-based Code Generation for High Performance DSEL
Joel Falcou
 
Integrative Parallel Programming in HPC
Integrative Parallel Programming in HPCIntegrative Parallel Programming in HPC
Integrative Parallel Programming in HPC
Victor Eijkhout
 
Target updated track f
Target updated   track fTarget updated   track f
Target updated track f
Alona Gradman
 
Chip Ex2010 Gert Goossens
Chip Ex2010 Gert GoossensChip Ex2010 Gert Goossens
Chip Ex2010 Gert Goossens
Alona Gradman
 
Machine learning on streams of data
Machine learning on streams of dataMachine learning on streams of data
Machine learning on streams of data
Tomasz Sosiński
 
1588147798Begining_ABUAD1.pdf
1588147798Begining_ABUAD1.pdf1588147798Begining_ABUAD1.pdf
1588147798Begining_ABUAD1.pdf
SemsemSameer1
 
Prograamção Paralela Baseada em Componentes (Introduzido o Modelo #)
Prograamção Paralela Baseada em Componentes (Introduzido o Modelo #)Prograamção Paralela Baseada em Componentes (Introduzido o Modelo #)
Prograamção Paralela Baseada em Componentes (Introduzido o Modelo #)
Heron Carvalho
 
Compiler Construction introduction
Compiler Construction introductionCompiler Construction introduction
Compiler Construction introduction
Rana Ehtisham Ul Haq
 

Similar to Introduction to theano, case study of Word Embeddings (20)

Migration To Multi Core - Parallel Programming Models
Migration To Multi Core - Parallel Programming ModelsMigration To Multi Core - Parallel Programming Models
Migration To Multi Core - Parallel Programming Models
 
Parallel Programming Primer
Parallel Programming PrimerParallel Programming Primer
Parallel Programming Primer
 
Parallel Programming Primer 1
Parallel Programming Primer 1Parallel Programming Primer 1
Parallel Programming Primer 1
 
parellel computing
parellel computingparellel computing
parellel computing
 
Unmanaged Parallelization via P/Invoke
Unmanaged Parallelization via P/InvokeUnmanaged Parallelization via P/Invoke
Unmanaged Parallelization via P/Invoke
 
Parallel Programming on the ANDC cluster
Parallel Programming on the ANDC clusterParallel Programming on the ANDC cluster
Parallel Programming on the ANDC cluster
 
Tom Peters, Software Engineer, Ufora at MLconf ATL 2016
Tom Peters, Software Engineer, Ufora at MLconf ATL 2016Tom Peters, Software Engineer, Ufora at MLconf ATL 2016
Tom Peters, Software Engineer, Ufora at MLconf ATL 2016
 
C Programming Unit-1
C Programming Unit-1C Programming Unit-1
C Programming Unit-1
 
Writing Efficient Code Feb 08
Writing Efficient Code Feb 08Writing Efficient Code Feb 08
Writing Efficient Code Feb 08
 
Compiler gate question key
Compiler gate question keyCompiler gate question key
Compiler gate question key
 
25-MPI-OpenMP.pptx
25-MPI-OpenMP.pptx25-MPI-OpenMP.pptx
25-MPI-OpenMP.pptx
 
Cpcs302 1
Cpcs302  1Cpcs302  1
Cpcs302 1
 
Automatic Task-based Code Generation for High Performance DSEL
Automatic Task-based Code Generation for High Performance DSELAutomatic Task-based Code Generation for High Performance DSEL
Automatic Task-based Code Generation for High Performance DSEL
 
Integrative Parallel Programming in HPC
Integrative Parallel Programming in HPCIntegrative Parallel Programming in HPC
Integrative Parallel Programming in HPC
 
Target updated track f
Target updated   track fTarget updated   track f
Target updated track f
 
Chip Ex2010 Gert Goossens
Chip Ex2010 Gert GoossensChip Ex2010 Gert Goossens
Chip Ex2010 Gert Goossens
 
Machine learning on streams of data
Machine learning on streams of dataMachine learning on streams of data
Machine learning on streams of data
 
1588147798Begining_ABUAD1.pdf
1588147798Begining_ABUAD1.pdf1588147798Begining_ABUAD1.pdf
1588147798Begining_ABUAD1.pdf
 
Prograamção Paralela Baseada em Componentes (Introduzido o Modelo #)
Prograamção Paralela Baseada em Componentes (Introduzido o Modelo #)Prograamção Paralela Baseada em Componentes (Introduzido o Modelo #)
Prograamção Paralela Baseada em Componentes (Introduzido o Modelo #)
 
Compiler Construction introduction
Compiler Construction introductionCompiler Construction introduction
Compiler Construction introduction
 

Recently uploaded

Advanced control scheme of doubly fed induction generator for wind turbine us...
Advanced control scheme of doubly fed induction generator for wind turbine us...Advanced control scheme of doubly fed induction generator for wind turbine us...
Advanced control scheme of doubly fed induction generator for wind turbine us...
IJECEIAES
 
哪里办理(csu毕业证书)查尔斯特大学毕业证硕士学历原版一模一样
哪里办理(csu毕业证书)查尔斯特大学毕业证硕士学历原版一模一样哪里办理(csu毕业证书)查尔斯特大学毕业证硕士学历原版一模一样
哪里办理(csu毕业证书)查尔斯特大学毕业证硕士学历原版一模一样
insn4465
 
KuberTENes Birthday Bash Guadalajara - K8sGPT first impressions
KuberTENes Birthday Bash Guadalajara - K8sGPT first impressionsKuberTENes Birthday Bash Guadalajara - K8sGPT first impressions
KuberTENes Birthday Bash Guadalajara - K8sGPT first impressions
Victor Morales
 
spirit beverages ppt without graphics.pptx
spirit beverages ppt without graphics.pptxspirit beverages ppt without graphics.pptx
spirit beverages ppt without graphics.pptx
Madan Karki
 
Certificates - Mahmoud Mohamed Moursi Ahmed
Certificates - Mahmoud Mohamed Moursi AhmedCertificates - Mahmoud Mohamed Moursi Ahmed
Certificates - Mahmoud Mohamed Moursi Ahmed
Mahmoud Morsy
 
Casting-Defect-inSlab continuous casting.pdf
Casting-Defect-inSlab continuous casting.pdfCasting-Defect-inSlab continuous casting.pdf
Casting-Defect-inSlab continuous casting.pdf
zubairahmad848137
 
Textile Chemical Processing and Dyeing.pdf
Textile Chemical Processing and Dyeing.pdfTextile Chemical Processing and Dyeing.pdf
Textile Chemical Processing and Dyeing.pdf
NazakatAliKhoso2
 
官方认证美国密歇根州立大学毕业证学位证书原版一模一样
官方认证美国密歇根州立大学毕业证学位证书原版一模一样官方认证美国密歇根州立大学毕业证学位证书原版一模一样
官方认证美国密歇根州立大学毕业证学位证书原版一模一样
171ticu
 
ML Based Model for NIDS MSc Updated Presentation.v2.pptx
ML Based Model for NIDS MSc Updated Presentation.v2.pptxML Based Model for NIDS MSc Updated Presentation.v2.pptx
ML Based Model for NIDS MSc Updated Presentation.v2.pptx
JamalHussainArman
 
Optimizing Gradle Builds - Gradle DPE Tour Berlin 2024
Optimizing Gradle Builds - Gradle DPE Tour Berlin 2024Optimizing Gradle Builds - Gradle DPE Tour Berlin 2024
Optimizing Gradle Builds - Gradle DPE Tour Berlin 2024
Sinan KOZAK
 
Engineering Drawings Lecture Detail Drawings 2014.pdf
Engineering Drawings Lecture Detail Drawings 2014.pdfEngineering Drawings Lecture Detail Drawings 2014.pdf
Engineering Drawings Lecture Detail Drawings 2014.pdf
abbyasa1014
 
Literature Review Basics and Understanding Reference Management.pptx
Literature Review Basics and Understanding Reference Management.pptxLiterature Review Basics and Understanding Reference Management.pptx
Literature Review Basics and Understanding Reference Management.pptx
Dr Ramhari Poudyal
 
john krisinger-the science and history of the alcoholic beverage.pptx
john krisinger-the science and history of the alcoholic beverage.pptxjohn krisinger-the science and history of the alcoholic beverage.pptx
john krisinger-the science and history of the alcoholic beverage.pptx
Madan Karki
 
Comparative analysis between traditional aquaponics and reconstructed aquapon...
Comparative analysis between traditional aquaponics and reconstructed aquapon...Comparative analysis between traditional aquaponics and reconstructed aquapon...
Comparative analysis between traditional aquaponics and reconstructed aquapon...
bijceesjournal
 
Use PyCharm for remote debugging of WSL on a Windo cf5c162d672e4e58b4dde5d797...
Use PyCharm for remote debugging of WSL on a Windo cf5c162d672e4e58b4dde5d797...Use PyCharm for remote debugging of WSL on a Windo cf5c162d672e4e58b4dde5d797...
Use PyCharm for remote debugging of WSL on a Windo cf5c162d672e4e58b4dde5d797...
shadow0702a
 
Hematology Analyzer Machine - Complete Blood Count
Hematology Analyzer Machine - Complete Blood CountHematology Analyzer Machine - Complete Blood Count
Hematology Analyzer Machine - Complete Blood Count
shahdabdulbaset
 
Curve Fitting in Numerical Methods Regression
Curve Fitting in Numerical Methods RegressionCurve Fitting in Numerical Methods Regression
Curve Fitting in Numerical Methods Regression
Nada Hikmah
 
The Python for beginners. This is an advance computer language.
The Python for beginners. This is an advance computer language.The Python for beginners. This is an advance computer language.
The Python for beginners. This is an advance computer language.
sachin chaurasia
 
132/33KV substation case study Presentation
132/33KV substation case study Presentation132/33KV substation case study Presentation
132/33KV substation case study Presentation
kandramariana6
 
NATURAL DEEP EUTECTIC SOLVENTS AS ANTI-FREEZING AGENT
NATURAL DEEP EUTECTIC SOLVENTS AS ANTI-FREEZING AGENTNATURAL DEEP EUTECTIC SOLVENTS AS ANTI-FREEZING AGENT
NATURAL DEEP EUTECTIC SOLVENTS AS ANTI-FREEZING AGENT
Addu25809
 

Recently uploaded (20)

Advanced control scheme of doubly fed induction generator for wind turbine us...
Advanced control scheme of doubly fed induction generator for wind turbine us...Advanced control scheme of doubly fed induction generator for wind turbine us...
Advanced control scheme of doubly fed induction generator for wind turbine us...
 
哪里办理(csu毕业证书)查尔斯特大学毕业证硕士学历原版一模一样
哪里办理(csu毕业证书)查尔斯特大学毕业证硕士学历原版一模一样哪里办理(csu毕业证书)查尔斯特大学毕业证硕士学历原版一模一样
哪里办理(csu毕业证书)查尔斯特大学毕业证硕士学历原版一模一样
 
KuberTENes Birthday Bash Guadalajara - K8sGPT first impressions
KuberTENes Birthday Bash Guadalajara - K8sGPT first impressionsKuberTENes Birthday Bash Guadalajara - K8sGPT first impressions
KuberTENes Birthday Bash Guadalajara - K8sGPT first impressions
 
spirit beverages ppt without graphics.pptx
spirit beverages ppt without graphics.pptxspirit beverages ppt without graphics.pptx
spirit beverages ppt without graphics.pptx
 
Certificates - Mahmoud Mohamed Moursi Ahmed
Certificates - Mahmoud Mohamed Moursi AhmedCertificates - Mahmoud Mohamed Moursi Ahmed
Certificates - Mahmoud Mohamed Moursi Ahmed
 
Casting-Defect-inSlab continuous casting.pdf
Casting-Defect-inSlab continuous casting.pdfCasting-Defect-inSlab continuous casting.pdf
Casting-Defect-inSlab continuous casting.pdf
 
Textile Chemical Processing and Dyeing.pdf
Textile Chemical Processing and Dyeing.pdfTextile Chemical Processing and Dyeing.pdf
Textile Chemical Processing and Dyeing.pdf
 
官方认证美国密歇根州立大学毕业证学位证书原版一模一样
官方认证美国密歇根州立大学毕业证学位证书原版一模一样官方认证美国密歇根州立大学毕业证学位证书原版一模一样
官方认证美国密歇根州立大学毕业证学位证书原版一模一样
 
ML Based Model for NIDS MSc Updated Presentation.v2.pptx
ML Based Model for NIDS MSc Updated Presentation.v2.pptxML Based Model for NIDS MSc Updated Presentation.v2.pptx
ML Based Model for NIDS MSc Updated Presentation.v2.pptx
 
Optimizing Gradle Builds - Gradle DPE Tour Berlin 2024
Optimizing Gradle Builds - Gradle DPE Tour Berlin 2024Optimizing Gradle Builds - Gradle DPE Tour Berlin 2024
Optimizing Gradle Builds - Gradle DPE Tour Berlin 2024
 
Engineering Drawings Lecture Detail Drawings 2014.pdf
Engineering Drawings Lecture Detail Drawings 2014.pdfEngineering Drawings Lecture Detail Drawings 2014.pdf
Engineering Drawings Lecture Detail Drawings 2014.pdf
 
Literature Review Basics and Understanding Reference Management.pptx
Literature Review Basics and Understanding Reference Management.pptxLiterature Review Basics and Understanding Reference Management.pptx
Literature Review Basics and Understanding Reference Management.pptx
 
john krisinger-the science and history of the alcoholic beverage.pptx
john krisinger-the science and history of the alcoholic beverage.pptxjohn krisinger-the science and history of the alcoholic beverage.pptx
john krisinger-the science and history of the alcoholic beverage.pptx
 
Comparative analysis between traditional aquaponics and reconstructed aquapon...
Comparative analysis between traditional aquaponics and reconstructed aquapon...Comparative analysis between traditional aquaponics and reconstructed aquapon...
Comparative analysis between traditional aquaponics and reconstructed aquapon...
 
Use PyCharm for remote debugging of WSL on a Windo cf5c162d672e4e58b4dde5d797...
Use PyCharm for remote debugging of WSL on a Windo cf5c162d672e4e58b4dde5d797...Use PyCharm for remote debugging of WSL on a Windo cf5c162d672e4e58b4dde5d797...
Use PyCharm for remote debugging of WSL on a Windo cf5c162d672e4e58b4dde5d797...
 
Hematology Analyzer Machine - Complete Blood Count
Hematology Analyzer Machine - Complete Blood CountHematology Analyzer Machine - Complete Blood Count
Hematology Analyzer Machine - Complete Blood Count
 
Curve Fitting in Numerical Methods Regression
Curve Fitting in Numerical Methods RegressionCurve Fitting in Numerical Methods Regression
Curve Fitting in Numerical Methods Regression
 
The Python for beginners. This is an advance computer language.
The Python for beginners. This is an advance computer language.The Python for beginners. This is an advance computer language.
The Python for beginners. This is an advance computer language.
 
132/33KV substation case study Presentation
132/33KV substation case study Presentation132/33KV substation case study Presentation
132/33KV substation case study Presentation
 
NATURAL DEEP EUTECTIC SOLVENTS AS ANTI-FREEZING AGENT
NATURAL DEEP EUTECTIC SOLVENTS AS ANTI-FREEZING AGENTNATURAL DEEP EUTECTIC SOLVENTS AS ANTI-FREEZING AGENT
NATURAL DEEP EUTECTIC SOLVENTS AS ANTI-FREEZING AGENT
 

Introduction to theano, case study of Word Embeddings

  • 1. Introduction to theano Case study of word embedding models Shashank Gupta MS, SIEL IIIT-H
  • 2. Numpy is fast, why theano? Bound by CPU Bound by Python interpreter (less scope of runtime optimization like lazy evaluation etc.) Lack of symbolic differentiation Finite difference method exists, but prone to numerical errors SymPy supports symbolic differentiation but expression graphs not optimized as theano
  • 3. What’s theano? Math expression compiler Compiles a math expression to optimized C code Targeted for GPU (CPU if GPU not supported) Strongly typed as compared to python’s dynamically typed system Works as GPU metaprogrammer (abstraction on top of GPU programming)
  • 4. Alternative to theano CUDA (PyCUDA it’s python wrapper) Asynchronous Stochastic Gradient Descent - smart way to parallelize SGD Main idea - Don’t create lock for one update, let threads run in async. Could speed up ML algos more than matrix-vector parallelization
  • 5. Automatic Differentiation Crude way - Finite difference method df / dx = (f(x + h) - f(x - h)) / 2*h Prone to coding bugs Prone to numerical error
  • 6. Symbolic differentiation Calculate gradient analytically For a given math expression construct symbolic computation graph Ref: http://cs231n.github.io/optimization-2/
  • 7. Backprop using symbolic differentiation Once graph is computed forward pass is just flow of input to compute output Backprop is going backwards from last node to compute gradients locally at each node and accumulate those local gradients to compute global gradient (using chain rule) Ex :
  • 8. Cont. Each entity in the computation graph is a symbolic variable (a node in computational graph) Each operation is an ‘op’ node (theano specific) ‘Op’ node takes some inputs and produces some output ‘Apply’ node which applies op to inputs ‘Type’ nodes with associated type information of symbolic variable
  • 9. Cont. This is kind of abstract representation of math expression Can think as intermediate representation in standard compiler phase This is how theano represents the computational graph internally
  • 11. Optimization Once this ‘intermediate representation’ is generated theano optimizes this graph for efficient computation Can think of it as compiler optimization phase It reorders, remove redundant expression etc. to generate an ‘equivalent’ graph which gives same output as input
  • 12. Shared variables Symbolic variables with predefined values In Machine learning used to define parameters of the models with pre-defined values (W, b matrices in NN) Sends these values to host GPU with optimised storage
  • 13. Theano functions Compiles the ‘abstract’ computation graph to optimised C code Compiles it targeting GPU C code is highly optimized for numeric computation Sort of interface between theano code and it’s calling python code
  • 14. Case study : Word embedding models 1st Model : Autoencoder based word embedding Ref : http://arxiv.org/pdf/1412.4930v2.pdf
  • 15. Cont. U is the embedding matrix which is learned by optimizing this objective
  • 16. Code Refer to github gist : https://gist.github.com/shashankg7/aec2303803e7b39b150a9f78cb59db09 Only theano part included, I/O and preprocessing omitted
  • 17. Model 2 - GloVe word embedding Optimizes squared loss function Challenges in practical implementation in theano Main thing to remember - VECTORIZED implementation How to handle large matrices Refer to github gist : https://gist.github.com/shashankg7/aec2303803e7b39b150a9f78cb59db09
  • 18. Model 3 : Skip-gram negative sampling Alpha stage: Not able to figure out vectorization of loss function Refer to github gist : https://gist.github.com/shashankg7/aec2303803e7b39b150a9f78cb59db09