Tutorial:
Deep learning
implementations and frameworks
Seiya Tokui*, Kenta Oono*,
Atsunori Kanemura+,
Toshihiro Kamishima+
*Preferred Networks, Inc. (PFN)
{tokui,oono}@preferred.jp
+National Institute of Advanced Industrial Science and
Technology (AIST)
atsu-kan@aist.go.jp, mail@kamishima.net
2016-04-19 DLIF Tutorial @ PAKDD2016
Introduction
Atsunori Kanemura
AIST, Japan
Objective
•  Get started with deep learning research and practice
•  1) Learn the building blocks that are common to
most deep learning frameworks
–  Review key technologies.
•  2) Understand the differences between the various
implementations
–  How specific DL frameworks differ
–  Useful to decide which framework to start with
•  Not a goal: learning coding know-how (although
coding examples will be given).
Target audience
•  Want to use neural networks
•  Want to model neural network architectures for
practical problems
•  Expected background:
–  Basics of computer science and numerical
computation
–  General machine learning terminology (in
particular around supervised learning)
–  Basic knowledge of, or hands-on experience with,
neural networks (recommended)
–  Basic knowledge of the Python programming
language (recommended)
Overview
•  1st session (8:30 – 10:00)
– Introduction (AK)
– Basics of neural networks (AK)
– Common design of neural network
implementations (KO)
•  2nd session (10:30 – 12:30)
– Differences of deep learning frameworks
(ST)
– Coding examples of frameworks (KO & ST)
– Conclusion (ST)
Frameworks to be (and not to be) explained
•  Deeply explained with coding examples
–  Chainer - Python
–  Keras - Python
–  TensorFlow – Python
•  Also compared
–  Torch.nn – Lua
–  Theano – Python
–  Caffe – C++ & Python & Matlab
–  MXNet – Many
–  autograd – Python & Lua
•  Others not explained
–  Cloud computing, Matlab toolboxes, DL4J, H2O, CNTK
–  Wrappers: Lasagne, Blocks, skflow
–  TensorBoard, DIGITS (only mention their names)
Basics of Neural Networks
Atsunori Kanemura
AIST, Japan
Artificial neural networks
•  Biologically inspired
–  A biological neuron is a nonlinear unit
connected with synapses at
the dendrites (input) and
the axon (output)
•  A building block for pattern recognition
systems (and more)
Why neural networks?
•  Superior performance
–  Image recognition
•  ImageNet Large Scale Visual Recognition Challenge
(ILSVRC) – Exceeds human performance
–  Playing games
•  AlphaGo – Defeated human experts
•  Extended to other problems
–  Images and text
•  Show & Tell – Generate texts from images with
intermediate representations (“embeddings”)
–  Learn artist styles
–  Many others (translation, speech recognition, …)
Technical internals of NNs
•  Layered processing with
linear transformation (a.k.a. matrix multiplication,
affine transformation)
+ nonlinear operation (a.k.a. activation function)
•  Adapt to data
Mathematical model for a neuron
•  Compare the inner product of the input and the
weights (parameters) with a threshold $b$
–  Plasticity of the neuron
= The change of parameters $w$ and $b$
$$y = f\Big(\sum_{d=1}^{D} w_d x_d - b\Big) = f(w^\top x - b)$$
–  $f$: nonlinear transform
[Figure: inputs $x_1, x_2, \ldots, x_D$ are weighted by $w$, summed (∑), offset by $b$, and passed through $f$ to give the output $y$]
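The neuron model above can be sketched in a few lines of plain Python. This is only an illustration: the slide leaves $f$ generic, so the logistic function used here is an assumed choice, and the names `logistic` and `neuron` are hypothetical.

```python
import math

def logistic(a):
    """Logistic sigmoid, one common choice for the nonlinear transform f (assumed here)."""
    return 1.0 / (1.0 + math.exp(-a))

def neuron(x, w, b, f=logistic):
    """Return y = f(w^T x - b) for an input list x, weight list w and threshold b."""
    activation = sum(wd * xd for wd, xd in zip(w, x)) - b
    return f(activation)

# The weighted sum 0.5*1.0 + (-0.25)*2.0 equals the threshold 0.0,
# so the neuron sits exactly at its decision boundary.
y = neuron(x=[1.0, 2.0], w=[0.5, -0.25], b=0.0)
```

Raising the threshold `b` pushes the output toward 0, and lowering it pushes the output toward 1.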
Generalized linear discriminant
•  Generalized linear discriminant: $f(w^\top x)$
–  $f(\cdot)$: Nonlinear transformation
–  ⇒ Logistic (classical), Probit, etc.
$$\hat{y} = f\Big(\sum_{d=1}^{D} w_d x_d - b\Big) = f(w^\top x - b)$$
$$y_n = \begin{cases} 1 & (x_n \text{ is positive}) \\ 0 & (x_n \text{ is negative}) \end{cases}$$
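As a rough sketch of how the choice of $f$ changes the discriminant, the snippet below evaluates $f(w^\top x)$ with two candidate transforms. The logistic sigmoid and the standard normal CDF (the response function of a probit model) are standard definitions; the helper names and the example numbers are hypothetical.

```python
import math

def logistic(a):
    """Logistic sigmoid: 1 / (1 + exp(-a))."""
    return 1.0 / (1.0 + math.exp(-a))

def normal_cdf(a):
    """Standard normal CDF, used as f in a probit model."""
    return 0.5 * (1.0 + math.erf(a / math.sqrt(2.0)))

def discriminant(x, w, f):
    """Generalized linear discriminant f(w^T x)."""
    return f(sum(wd * xd for wd, xd in zip(w, x)))

x, w = [1.0, 2.0], [0.5, 0.5]           # illustrative values, w^T x = 1.5
p_logistic = discriminant(x, w, logistic)
p_probit = discriminant(x, w, normal_cdf)
```

Both transforms map the same score to a value in (0, 1) and agree on which side of 0.5 it falls; they differ in how quickly they saturate.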
Learning with loss minimization
•  Learn from many samples: $\{x_n, y_n^*\}_{n=1}^{N}$
•  Binary output
$$y_n^* = \begin{cases} 1 & (x_n \text{ is positive}) \\ 0 & (x_n \text{ is negative}) \end{cases}$$
•  Define the loss function (squared error)
$$J(w) = \frac{1}{2} \sum_{n=1}^{N} \big(f(w^\top x_n) - y_n^*\big)^2$$
•  Minimize $J$ to learn (estimate) the
parameters
$$w^* = \arg\min_{w} J(w)$$
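To make the squared-error loss concrete, here is a minimal sketch that evaluates $J(w)$ on a toy dataset. It assumes the logistic function for $f$; the two-sample dataset and the function names are made up for illustration.

```python
import math

def logistic(a):
    return 1.0 / (1.0 + math.exp(-a))

def squared_error_loss(w, xs, targets):
    """J(w) = 1/2 * sum_n (f(w^T x_n) - y*_n)^2 with a logistic f (assumed)."""
    total = 0.0
    for x, target in zip(xs, targets):
        y = logistic(sum(wd * xd for wd, xd in zip(w, x)))
        total += (y - target) ** 2
    return 0.5 * total

xs = [[1.0, 0.0], [0.0, 1.0]]   # toy inputs (hypothetical)
targets = [1, 0]                 # binary labels y*_n

loss_zero_w = squared_error_loss([0.0, 0.0], xs, targets)   # every output is 0.5
loss_good_w = squared_error_loss([3.0, -3.0], xs, targets)  # outputs near the labels
```

Learning amounts to searching for the `w` that makes this number as small as possible.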
Neural networks
•  Multi-layered
$$y^{1} = f_1(W^{10} x)$$
$$y^{2} = f_2(W^{21} y^{1})$$
$$y^{3} = f_3(W^{32} y^{2})$$
$$\vdots$$
$$y^{L} = f_L(W^{(L)(L-1)} y^{L-1})$$
※ $f$ works element-wise
•  Minimize the loss to learn the parameters
$$J(\{W\}) = \frac{1}{2} \sum_{n=1}^{N} \big(y^{L}(x_n) - y_n^*\big)^2$$
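The layered computation can be sketched as a loop over weight matrices. For brevity this sketch assumes the same logistic $f$ in every layer (the slide allows a different $f_l$ per layer) and stores each matrix as a list of rows; all names and numbers are illustrative.

```python
import math

def logistic(a):
    return 1.0 / (1.0 + math.exp(-a))

def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector (list)."""
    return [sum(wij * vj for wij, vj in zip(row, v)) for row in W]

def forward(weights, x, f=logistic):
    """Compute y^L by applying y^l = f(W^l y^(l-1)) layer by layer."""
    y = x
    for W in weights:
        y = [f(a) for a in matvec(W, y)]  # f works element-wise
    return y

W10 = [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]  # 3x2: input -> hidden
W21 = [[1.0, 1.0, 1.0]]                      # 1x3: hidden -> output
out = forward([W10, W21], [0.3, -0.7])
```

With all-zero first-layer weights every hidden unit outputs 0.5, so the final output is `logistic(1.5)` regardless of the input.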
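Gradient descent
•  The gradient of the loss for the 1-layer model is
(the last step uses $f' = f(1 - f)$, which holds for the logistic $f$)
$$\begin{aligned}
\nabla_w J(w) &= \frac{1}{2} \sum_{n=1}^{N} \nabla_w \big(f(w^\top x_n) - y_n^*\big)^2 \\
&= \sum_{n=1}^{N} \big(f(w^\top x_n) - y_n^*\big) \nabla_w f(w^\top x_n) \\
&= \sum_{n=1}^{N} \big(f(w^\top x_n) - y_n^*\big) f(w^\top x_n) \big(1 - f(w^\top x_n)\big) x_n
\end{aligned}$$
•  The update rule ($r$ is a constant learning rate)
$$w \leftarrow w - r \nabla_w J(w) = w - r \sum_{n=1}^{N} h(x_n, w) x_n$$
$$h(x_n, w) \overset{\mathrm{def}}{=} \big(f(w^\top x_n) - y_n^*\big) f(w^\top x_n) \big(1 - f(w^\top x_n)\big)$$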
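The update rule translates almost directly into code. The sketch below assumes a logistic $f$ and uses a small made-up dataset whose first input component is fixed to 1 so that it acts as a bias; the learning rate and step count are arbitrary illustrative values.

```python
import math

def logistic(a):
    return 1.0 / (1.0 + math.exp(-a))

def dot(w, x):
    return sum(wd * xd for wd, xd in zip(w, x))

def loss(w, xs, targets):
    return 0.5 * sum((logistic(dot(w, x)) - t) ** 2 for x, t in zip(xs, targets))

def gradient(w, xs, targets):
    """Sum over n of h(x_n, w) * x_n, with h = (f - y*) * f * (1 - f)."""
    grad = [0.0] * len(w)
    for x, t in zip(xs, targets):
        y = logistic(dot(w, x))
        h = (y - t) * y * (1.0 - y)
        for d, xd in enumerate(x):
            grad[d] += h * xd
    return grad

def gradient_descent(w, xs, targets, r=0.5, steps=200):
    """Repeat w <- w - r * grad J(w) for a fixed number of steps."""
    for _ in range(steps):
        g = gradient(w, xs, targets)
        w = [wd - r * gd for wd, gd in zip(w, g)]
    return w

xs = [[1.0, 2.0], [1.0, 1.5], [1.0, -1.0], [1.0, -2.0]]  # hypothetical data
targets = [1, 1, 0, 0]
w_initial = [0.0, 0.0]
w_final = gradient_descent(w_initial, xs, targets)
```

Comparing `loss(w_initial, xs, targets)` with `loss(w_final, xs, targets)` shows how far the loss has dropped.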
Backprop
•  Use the chain rule to derive the gradient
•  E.g. 2-layer case
$$y_n^{1} = f(W^{10} x_n), \quad y_n^{2} = f(w^{21} \cdot y_n^{1})$$
$$J(W^{10}, w^{21}) = \frac{1}{2} \sum_{n} \big(y_n^{2} - y_n^*\big)^2$$
$$\frac{\partial J}{\partial W^{10}_{kl}} = \sum_{n,i} \frac{\partial J}{\partial y^{1}_{ni}} \frac{\partial y^{1}_{ni}}{\partial W^{10}_{kl}}$$
–  ⇒ Calculate gradient recursively from top to
bottom layers
•  Cf. Gradient vanishing, ReLU
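For the 2-layer case, a hand-written backward pass might look like the sketch below, checked against a finite-difference estimate. It assumes a logistic $f$ in both layers; the weights, data and function names are invented for the example.

```python
import math

def logistic(a):
    return 1.0 / (1.0 + math.exp(-a))

def forward(W10, w21, x):
    y1 = [logistic(sum(wkl * xl for wkl, xl in zip(row, x))) for row in W10]
    y2 = logistic(sum(wk * yk for wk, yk in zip(w21, y1)))
    return y1, y2

def loss(W10, w21, xs, targets):
    return 0.5 * sum((forward(W10, w21, x)[1] - t) ** 2 for x, t in zip(xs, targets))

def backprop(W10, w21, xs, targets):
    """Return (dJ/dW10, dJ/dw21), computed from the top layer downwards."""
    grad_W10 = [[0.0] * len(row) for row in W10]
    grad_w21 = [0.0] * len(w21)
    for x, t in zip(xs, targets):
        y1, y2 = forward(W10, w21, x)
        # Top layer: derivative of J w.r.t. the output pre-activation.
        delta2 = (y2 - t) * y2 * (1.0 - y2)
        for k, y1k in enumerate(y1):
            grad_w21[k] += delta2 * y1k
            # Chain rule one layer down: dJ/dy1_k = delta2 * w21_k.
            delta1 = delta2 * w21[k] * y1k * (1.0 - y1k)
            for l, xl in enumerate(x):
                grad_W10[k][l] += delta1 * xl
    return grad_W10, grad_w21

W10 = [[0.1, -0.2], [0.4, 0.3]]   # hypothetical weights
w21 = [0.5, -0.6]
xs = [[1.0, 2.0], [-1.0, 0.5]]    # hypothetical data
targets = [1, 0]
grad_W10, grad_w21 = backprop(W10, w21, xs, targets)

# Central finite difference for a single entry, W10[0][1], as a sanity check.
eps = 1e-6
W_plus = [row[:] for row in W10]
W_minus = [row[:] for row in W10]
W_plus[0][1] += eps
W_minus[0][1] -= eps
numeric = (loss(W_plus, w21, xs, targets) - loss(W_minus, w21, xs, targets)) / (2 * eps)
```

The analytic entry `grad_W10[0][1]` and the estimate `numeric` should agree to several decimal places; a mismatch usually points to a mistake in the chain-rule bookkeeping.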
Automatic Differentiation
•  The math for backprop is obvious (but
tedious) if the NN architecture has been
defined
•  Can be automatically calculated after
defining the NN model
•  This is called automatic differentiation
(which is a general concept that makes use
of the chain rule)
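A toy reverse-mode automatic differentiation engine fits in a few dozen lines, which may help show what frameworks do once the model is defined. This is a minimal sketch for scalars only, not how any particular framework implements it; the `Var` class and helper names are hypothetical.

```python
import math

class Var:
    """A scalar node that remembers its parents and the local derivatives."""
    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents  # tuple of (parent Var, d(self)/d(parent))
        self.grad = 0.0

    def __add__(self, other):
        return Var(self.value + other.value, ((self, 1.0), (other, 1.0)))

    def __mul__(self, other):
        return Var(self.value * other.value,
                   ((self, other.value), (other, self.value)))

def sigmoid(v):
    s = 1.0 / (1.0 + math.exp(-v.value))
    return Var(s, ((v, s * (1.0 - s)),))

def backward(output):
    """Propagate d(output)/d(node) to every node with the chain rule."""
    order, seen = [], set()

    def visit(node):
        if id(node) not in seen:
            seen.add(id(node))
            for parent, _ in node.parents:
                visit(parent)
            order.append(node)  # parents are appended before their consumers

    visit(output)
    output.grad = 1.0
    for node in reversed(order):  # consumers are processed before their parents
        for parent, local in node.parents:
            parent.grad += node.grad * local

# Model definition: y = sigmoid(w * x + b). No derivative is written by hand.
w, x, b = Var(2.0), Var(3.0), Var(-5.0)
y = sigmoid(w * x + b)
backward(y)
```

After `backward(y)`, the attributes `w.grad`, `x.grad` and `b.grad` hold the derivatives of `y` with respect to each input.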
Parameter update
•  Gradient Descent (GD)
$$w \leftarrow w - r \nabla_w J(w) = w - r \sum_{n=1}^{N} h(x_n, w) x_n$$
$$h(x_n, w) \overset{\mathrm{def}}{=} \big(f(w^\top x_n) - y_n^*\big) f(w^\top x_n) \big(1 - f(w^\top x_n)\big)$$
•  Stochastic Gradient Descent (SGD)
–  Take several samples (say, 128) from the
dataset (mini-batch) and estimate the gradient from them.
–  Theoretically motivated as the Robbins-Monro
algorithm
•  From SGD to general gradient-based algorithms
–  Adam, AdaGrad, etc.
–  Use momentum and other techniques
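Mini-batch SGD with momentum can be sketched as follows. Several choices here are assumptions made for the example rather than anything stated on the slide: a logistic $f$, the gradient averaged (not summed) over the mini-batch, classical momentum, and a synthetic 1-D dataset whose label is 1 when the feature is positive.

```python
import math
import random

def logistic(a):
    return 1.0 / (1.0 + math.exp(-a))

def dot(w, x):
    return sum(wd * xd for wd, xd in zip(w, x))

def minibatch_gradient(w, batch):
    """Gradient estimate from one mini-batch (averaged over its samples)."""
    grad = [0.0] * len(w)
    for x, target in batch:
        y = logistic(dot(w, x))
        h = (y - target) * y * (1.0 - y)
        for d, xd in enumerate(x):
            grad[d] += h * xd
    return [g / len(batch) for g in grad]

def sgd_momentum(data, w, r=0.5, momentum=0.9, batch_size=8, epochs=100, seed=0):
    rng = random.Random(seed)
    data = list(data)
    velocity = [0.0] * len(w)
    for _ in range(epochs):
        rng.shuffle(data)  # visit the samples in a new random order each epoch
        for start in range(0, len(data), batch_size):
            batch = data[start:start + batch_size]
            grad = minibatch_gradient(w, batch)
            velocity = [momentum * v - r * g for v, g in zip(velocity, grad)]
            w = [wd + v for wd, v in zip(w, velocity)]
    return w

# Synthetic data (hypothetical): x = [1, x1], label 1 if x1 > 0 else 0.
data_rng = random.Random(1)
data = []
for _ in range(64):
    x1 = data_rng.uniform(-2.0, 2.0)
    data.append(([1.0, x1], 1 if x1 > 0 else 0))

w = sgd_momentum(data, [0.0, 0.0])
accuracy = sum((logistic(dot(w, x)) > 0.5) == (t == 1) for x, t in data) / len(data)
```

Setting `momentum=0.0` recovers plain mini-batch SGD, which makes it easy to compare the two on the same data.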
Overfitting and generalization error
•  The goal of learning is to decrease the
generalization error, which is the error for
previously unseen data
•  A low error on the data at hand is
not enough (and chasing it can even be harmful)
–  We can achieve 0% error by memorizing all the
examples in the training data
–  Complicated models (i.e., NNs with many
parameters and layers) can achieve this (if the
learning algorithm is clever enough).
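The memorization argument can be illustrated with a deliberately naive "model" that is just a lookup table. The data below is invented (the label is 1 when the two features sum to a positive number), and the fallback prediction of 0 for unknown inputs is an arbitrary choice.

```python
def memorizing_classifier(training_pairs, default=0):
    """Memorize every training example; answer `default` for anything unseen."""
    table = {tuple(x): y for x, y in training_pairs}
    return lambda x: table.get(tuple(x), default)

def error_rate(predict, pairs):
    return sum(predict(x) != y for x, y in pairs) / len(pairs)

# Hypothetical data: label is 1 if x[0] + x[1] > 0, else 0.
train_pairs = [([1, 2], 1), ([-1, -3], 0), ([2, -1], 1), ([-2, 1], 0)]
unseen_pairs = [([3, 1], 1), ([1, 1], 1), ([-1, -1], 0), ([4, -2], 1)]

predict = memorizing_classifier(train_pairs)
train_error = error_rate(predict, train_pairs)    # perfect on memorized data
unseen_error = error_rate(predict, unseen_pairs)  # poor on new data
```

The training error is zero by construction, yet the table has learned nothing about the rule that generated the labels.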
Training procedure
•  Avoid overfitting
•  Split the data into two parts
–  Training dataset
•  We optimize the parameters using this training dataset
–  Validation dataset
•  We evaluate the performance of the learned NN with
this validation dataset
•  Optional: Test errors
–  If you want to estimate the generalization error,
use three-way splitting of the data and use the
last one, the test dataset, to measure
generalization error
[Figure: the available data split into a Train part and a Validation part]
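A three-way split as described above might be sketched like this. The 60/20/20 proportions, the fixed seed and the function name are assumptions chosen for the example, not recommendations from the tutorial.

```python
import random

def three_way_split(samples, train_frac=0.6, val_frac=0.2, seed=0):
    """Shuffle the samples, then cut them into train / validation / test parts."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train_frac)
    n_val = round(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # touched only once, at the very end
    return train, val, test

train, val, test = three_way_split(range(100))
```

Parameters are fitted on `train`, model choices are compared on `val`, and `test` is held back to estimate the generalization error.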
Extra topics implemented
by most of the frameworks
•  Weight initialization
–  Random
–  Pretraining
–  Transfer from another trained network
•  Techniques for avoiding overfitting
–  Dropout
–  Batch normalization
–  ResNet
•  Convolution
•  Visualization
–  Deconvolution
Summary of this Part
•  A neural network is a computational model
that stacks neurons, i.e., nonlinear
computational units
•  The gradients of the loss w.r.t. the
parameters are recursively calculated from
top to bottom by backprop
•  Care must be taken to avoid overfitting by
following validation procedures
PAKDD2016 Tutorial DLIF: Introduction and Basics

  • 1. Tutorial: Deep learning implementations and frameworks
      Seiya Tokui*, Kenta Oono*, Atsunori Kanemura+, Toshihiro Kamishima+
      *Preferred Networks, Inc. (PFN), {tokui,oono}@preferred.jp
      +National Institute of Advanced Industrial Science and Technology (AIST), atsu-kan@aist.go.jp, mail@kamishima.net
      2016-04-19, DLIF Tutorial @ PAKDD2016
  • 3. Objective
      •  Get into deep learning research and practice
      •  1) Learn the building blocks that are common to most deep learning frameworks
          –  Review key technologies
      •  2) Understand the differences between the various implementations
          –  How specific DL frameworks differ
          –  Useful for deciding which framework to start with
      •  The aim is not to teach coding know-how (although coding examples will be given)
  • 4. Model audience
      •  Want to use neural networks
      •  Want to model neural network architectures for practical problems
      •  Expected background:
          –  Basics of computer science and numerical computation
          –  General machine learning terminology (in particular around supervised learning)
          –  Basic knowledge of, or practical experience with, neural networks (recommended)
          –  Basic knowledge of the Python programming language (recommended)
  • 5. Overview
      •  1st session (8:30 – 10:00)
          –  Introduction (AK)
          –  Basics of neural networks (AK)
          –  Common design of neural network implementations (KO)
      •  2nd session (10:30 – 12:30)
          –  Differences between deep learning frameworks (ST)
          –  Coding examples of frameworks (KO & ST)
          –  Conclusion (ST)
  • 6. Frameworks to be (and not to be) explained
      •  Explained in depth with coding examples
          –  Chainer – Python
          –  Keras – Python
          –  TensorFlow – Python
      •  Also compared
          –  Torch.nn – Lua
          –  Theano – Python
          –  Caffe – C++ & Python & Matlab
          –  MXNet – Many
          –  autograd – Python & Lua
      •  Others not explained
          –  Cloud computing, Matlab toolboxes, DL4J, H2O, CNTK
          –  Wrappers: Lasagne, Blocks, skflow
          –  TensorBoard, DIGITS (mentioned by name only)
  • 7. Basics of Neural Networks
      Atsunori Kanemura, AIST, Japan
  • 8. Artificial neural networks
      •  Biologically inspired
          –  A biological neuron is a nonlinear unit connected to other neurons by synapses at the dendrites (input) and the axon (output)
      •  A building block for pattern recognition systems (and more)
  • 9. Why neural networks?
      •  Superior performance
          –  Image recognition
              •  ImageNet Large Scale Visual Recognition Challenge (ILSVRC): exceeds human performance
          –  Playing games
              •  AlphaGo: has defeated human experts
      •  Extended to other problems
          –  Images and text
              •  Show & Tell: generates text from images via intermediate representations (“embeddings”)
          –  Learning artist styles
          –  Many others (translation, speech recognition, …)
  • 10. Technical internals of NNs
      •  Layered processing: a linear transformation (a.k.a. matrix multiplication, affine transformation) followed by a nonlinear operation (a.k.a. activation function)
      •  Adapt to data
  • 11. Mathematical model for a neuron
      •  Compare the product of input and weights (parameters) with a threshold $b$:
         $$ y = f\Big(\sum_{d=1}^{D} w_d x_d - b\Big) = f(w^\top x - b) $$
         where $f$ is a nonlinear transform
          –  Plasticity of the neuron = the change of the parameters $w$ and $b$
      [Figure: the inputs $x_1, x_2, \dots, x_D$ are weighted by $w$, summed (∑), compared with the threshold $b$, and passed through $f$ to give the output $y$]
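The neuron model above can be sketched in a few lines of plain Python. This is only an illustration: the function names `sigmoid` and `neuron` and the example numbers are assumptions made here, and the logistic function is just one possible choice for the nonlinearity $f$.

```python
import math

def sigmoid(a):
    """Logistic function, one common choice for the nonlinearity f."""
    return 1.0 / (1.0 + math.exp(-a))

def neuron(x, w, b, f=sigmoid):
    """Compute y = f(w^T x - b) for a single input vector x."""
    a = sum(wd * xd for wd, xd in zip(w, x)) - b
    return f(a)

# Hypothetical inputs and weights: here w^T x - b = 0, so the logistic output is 0.5.
y = neuron(x=[1.0, 2.0], w=[0.5, -0.25], b=0.0)
```

Passing a step function as `f` turns the same code into a hard threshold unit.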
  • 12. Generalized linear discriminant
      •  Generalized linear discriminant: $f(w^\top x)$
          –  $f(\cdot)$: nonlinear transformation
          –  ⇒ Logistic (classical), probit, etc.
      $$ \hat{y} = f\Big(\sum_{d=1}^{D} w_d x_d - b\Big) = f(w^\top x - b), \qquad y_n = \begin{cases} 1 & (x_n \text{ is positive}) \\ 0 & (x_n \text{ is negative}) \end{cases} $$
  • 13. Learning with loss minimization
      •  Learn from many samples: $\{x_n, y^*_n\}_{n=1}^{N}$
      •  Binary output:
         $$ y^*_n = \begin{cases} 1 & (x_n \text{ is positive}) \\ 0 & (x_n \text{ is negative}) \end{cases} $$
      •  Define the loss function (squared error):
         $$ J(w) = \frac{1}{2} \sum_{n=1}^{N} (f(w^\top x_n) - y^*_n)^2 $$
      •  Minimize $J$ to learn (estimate) the parameters:
         $$ w^* = \arg\min_w J(w) $$
  • 14. Neural networks
      •  Multi-layered (※ each $f_l$ works element-wise):
         $$ y^1 = f_1(W^{10} x), \quad y^2 = f_2(W^{21} y^1), \quad y^3 = f_3(W^{32} y^2), \quad \dots, \quad y^L = f_L(W^{(L)(L-1)} y^{L-1}) $$
      •  Minimize the loss to learn the parameters:
         $$ J(\{W\}) = \frac{1}{2} \sum_{n=1}^{N} (y^L(x_n) - y^*_n)^2 $$
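The layered computation above amounts to a loop over weight matrices. The following is a minimal sketch under stated assumptions: matrices are plain lists of rows, every layer uses the same logistic nonlinearity, and the helper names (`matvec`, `forward`) and the weights are hypothetical.

```python
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def matvec(W, v):
    """Multiply matrix W (a list of rows) by vector v."""
    return [sum(wij * vj for wij, vj in zip(row, v)) for row in W]

def forward(weights, x, f=sigmoid):
    """Apply y^l = f(W^l y^(l-1)) layer by layer, with f applied element-wise."""
    y = x
    for W in weights:
        y = [f(a) for a in matvec(W, y)]
    return y

# Hypothetical 2-3-1 network with hand-picked weights.
weights = [
    [[0.1, -0.2], [0.4, 0.3], [-0.5, 0.2]],  # W^{10}: 3 x 2
    [[0.3, -0.1, 0.2]],                      # W^{21}: 1 x 3
]
output = forward(weights, [1.0, 2.0])
```

The frameworks compared in this tutorial perform the same loop on multi-dimensional arrays, usually on a GPU, rather than on Python lists.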
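  • 15. Gradient descent
      •  The gradient of the loss for the 1-layer model is
         $$ \nabla_w J(w) = \frac{1}{2} \sum_{n=1}^{N} \nabla_w (f(w^\top x_n) - y^*_n)^2 = \sum_{n=1}^{N} (f(w^\top x_n) - y^*_n) \nabla_w f(w^\top x_n) = \sum_{n=1}^{N} (f(w^\top x_n) - y^*_n) f(w^\top x_n) (1 - f(w^\top x_n)) x_n $$
         (the last step takes $f$ to be the logistic function, for which $f' = f(1 - f)$)
      •  The update rule ($r$ is a constant learning rate):
         $$ w \leftarrow w - r \nabla_w J(w) = w - r \sum_{n=1}^{N} h(x_n, w) x_n, \qquad h(x_n, w) \overset{\mathrm{def}}{=} (f(w^\top x_n) - y^*_n) f(w^\top x_n) (1 - f(w^\top x_n)) $$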
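As a rough illustration of the update rule above, the sketch below runs full-batch gradient descent for a single logistic neuron. The toy data, the learning rate `r=0.5`, the step count, and all function names are assumptions chosen for this example; the threshold $b$ is folded into the weights by appending a constant 1 to every input.

```python
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def dot(w, x):
    return sum(wd * xd for wd, xd in zip(w, x))

def loss(w, xs, targets):
    """J(w) = 1/2 * sum_n (f(w^T x_n) - y*_n)^2."""
    return 0.5 * sum((sigmoid(dot(w, x)) - t) ** 2 for x, t in zip(xs, targets))

def gradient(w, xs, targets):
    """sum_n h(x_n, w) x_n with h = (f - y*) f (1 - f)."""
    grad = [0.0] * len(w)
    for x, t in zip(xs, targets):
        y = sigmoid(dot(w, x))
        h = (y - t) * y * (1.0 - y)
        for d, xd in enumerate(x):
            grad[d] += h * xd
    return grad

def gradient_descent(w, xs, targets, r=0.5, steps=200):
    """Repeat w <- w - r * grad J(w) for a fixed number of steps."""
    for _ in range(steps):
        g = gradient(w, xs, targets)
        w = [wd - r * gd for wd, gd in zip(w, g)]
    return w

# Toy data: the last input component is fixed to 1, so its weight plays the role of -b.
xs = [[0.0, 1.0], [1.0, 1.0], [2.0, 1.0], [3.0, 1.0]]
targets = [0, 0, 1, 1]
w0 = [0.0, 0.0]
w_learned = gradient_descent(w0, xs, targets)
```

Comparing `loss(w_learned, xs, targets)` with `loss(w0, xs, targets)` shows whether the chosen learning rate actually reduced the loss on this data.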
  • 16. Backprop
      •  Use the chain rule to derive the gradient
      •  E.g., the 2-layer case:
         $$ y^1_n = f(W^{10} x_n), \qquad y^2_n = f(w^{21} \cdot y^1_n), \qquad J(W^{10}, w^{21}) = \frac{1}{2} \sum_{n} (y^2_n - y^*_n)^2 $$
         $$ \frac{\partial J}{\partial W^{10}_{kl}} = \sum_{n,i} \frac{\partial J}{\partial y^1_{ni}} \frac{\partial y^1_{ni}}{\partial W^{10}_{kl}} $$
          –  ⇒ Calculate the gradient recursively from the top layer to the bottom layer
      •  Cf. vanishing gradients, ReLU
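The 2-layer case above can be written out by hand as follows. This is a sketch assuming logistic activations in both layers; the function names, weights, and data are hypothetical. Comparing the result against finite differences of `loss` is a common way to sanity-check a hand-written backprop.

```python
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def forward(W10, w21, x):
    """y1 = f(W10 x), y2 = f(w21 . y1)."""
    y1 = [sigmoid(sum(wkl * xl for wkl, xl in zip(row, x))) for row in W10]
    y2 = sigmoid(sum(wi * yi for wi, yi in zip(w21, y1)))
    return y1, y2

def loss(W10, w21, xs, targets):
    return 0.5 * sum((forward(W10, w21, x)[1] - t) ** 2 for x, t in zip(xs, targets))

def backprop(W10, w21, xs, targets):
    """Return (dJ/dW10, dJ/dw21) by applying the chain rule from top to bottom."""
    gW10 = [[0.0] * len(row) for row in W10]
    gw21 = [0.0] * len(w21)
    for x, t in zip(xs, targets):
        y1, y2 = forward(W10, w21, x)
        # Top layer: derivative of J w.r.t. the pre-activation a2 = w21 . y1.
        delta2 = (y2 - t) * y2 * (1.0 - y2)
        for i, y1i in enumerate(y1):
            gw21[i] += delta2 * y1i
            # Propagate down: dJ/da1_i = delta2 * w21_i * f'(a1_i).
            delta1 = delta2 * w21[i] * y1i * (1.0 - y1i)
            for l, xl in enumerate(x):
                gW10[i][l] += delta1 * xl
    return gW10, gw21

# Hypothetical weights and two training samples.
W10 = [[0.2, -0.4], [0.7, 0.1]]
w21 = [0.5, -0.3]
xs = [[1.0, 2.0], [-1.0, 0.5]]
targets = [1, 0]
gW10, gw21 = backprop(W10, w21, xs, targets)
```

The factors `y2 * (1.0 - y2)` and `y1i * (1.0 - y1i)` are at most 0.25, which hints at why gradients can shrink as they pass through many logistic layers.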
  • 17. Automatic Differentiation
      •  The math for backprop is straightforward (but tedious) once the NN architecture has been defined
      •  The gradients can therefore be calculated automatically after defining the NN model
      •  This is called automatic differentiation (a general concept that makes use of the chain rule)
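To make the idea concrete, here is a toy reverse-mode automatic differentiation for scalars. It is a hypothetical minimal design written for this note (the class `Var` and the functions `sigmoid` and `backward` are not taken from any of the frameworks discussed): each operation records its inputs together with the local derivative, and `backward` applies the chain rule over the recorded graph.

```python
import math

class Var:
    """Scalar node in a computational graph (hypothetical minimal design)."""

    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents  # tuple of (parent Var, local derivative)
        self.grad = 0.0

    def __add__(self, other):
        return Var(self.value + other.value, ((self, 1.0), (other, 1.0)))

    def __sub__(self, other):
        return Var(self.value - other.value, ((self, 1.0), (other, -1.0)))

    def __mul__(self, other):
        return Var(self.value * other.value,
                   ((self, other.value), (other, self.value)))

def sigmoid(v):
    s = 1.0 / (1.0 + math.exp(-v.value))
    return Var(s, ((v, s * (1.0 - s)),))

def backward(output):
    """Accumulate d(output)/d(node) into node.grad for every node in the graph."""
    order, seen = [], set()

    def visit(node):
        # Post-order traversal: a node is appended only after all its inputs.
        if id(node) not in seen:
            seen.add(id(node))
            for parent, _ in node.parents:
                visit(parent)
            order.append(node)

    visit(output)
    output.grad = 1.0
    # Reverse order: every node is processed after all nodes that consume it.
    for node in reversed(order):
        for parent, local in node.parents:
            parent.grad += node.grad * local

# J = 1/2 * (f(w * x) - t)^2 for a single sample.
w, x, t, half = Var(0.3), Var(2.0), Var(1.0), Var(0.5)
err = sigmoid(w * x) - t
J = half * (err * err)
backward(J)
```

After `backward(J)`, `w.grad` holds the same value as the hand-derived expression $(f - y^*) f (1 - f) x$, without that formula having been written down anywhere in the code.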
  • 18. Parameter update
      •  Gradient Descent (GD):
         $$ w \leftarrow w - r \nabla_w J(w) = w - r \sum_{n=1}^{N} h(x_n, w) x_n, \qquad h(x_n, w) \overset{\mathrm{def}}{=} (f(w^\top x_n) - y^*_n) f(w^\top x_n) (1 - f(w^\top x_n)) $$
      •  Stochastic Gradient Descent (SGD)
          –  Take several samples (say, 128) from the dataset (a mini-batch) and estimate the gradient from them
          –  Theoretically motivated as the Robbins-Monro algorithm
      •  From SGD to general gradient-based algorithms
          –  Adam, AdaGrad, etc.
          –  Use momentum and other techniques
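A mini-batch version of the update can be sketched as below. The assumptions are worth stating: the gradient is averaged over the mini-batch (a scaling choice made here, not something the slide specifies), the data are a hypothetical 1-D toy problem with a constant 1 appended to each input, and the batch size, learning rate, and seed are arbitrary.

```python
import math
import random

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def predict(w, x):
    return sigmoid(sum(wd * xd for wd, xd in zip(w, x)))

def loss(w, data):
    return 0.5 * sum((predict(w, x) - t) ** 2 for x, t in data)

def minibatch_gradient(w, batch):
    """Average of h(x_n, w) x_n over the mini-batch: a noisy gradient estimate."""
    grad = [0.0] * len(w)
    for x, t in batch:
        y = predict(w, x)
        h = (y - t) * y * (1.0 - y)
        for d, xd in enumerate(x):
            grad[d] += h * xd / len(batch)
    return grad

def sgd(w, data, r=0.5, batch_size=4, epochs=50, seed=0):
    """Shuffle the data every epoch and update w once per mini-batch."""
    rng = random.Random(seed)
    data = list(data)
    for _ in range(epochs):
        rng.shuffle(data)
        for start in range(0, len(data), batch_size):
            batch = data[start:start + batch_size]
            g = minibatch_gradient(w, batch)
            w = [wd - r * gd for wd, gd in zip(w, g)]
    return w

# Toy data: label 1 if the first input component is positive, 0 otherwise.
data = [([k / 4.0, 1.0], 1 if k > 0 else 0) for k in range(-8, 9) if k != 0]
w_sgd = sgd([0.0, 0.0], data)
```

Methods such as momentum, AdaGrad, or Adam keep the same loop and replace only the line that turns the gradient estimate `g` into a parameter update.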
  • 19. Overfitting and generalization error
      •  The goal of learning is to decrease the generalization error, which is the error on previously unseen data
      •  Having a low error on the data at hand is not enough (and chasing it can even be harmful)
          –  We can achieve 0% error by memorizing all the examples in the training data
          –  Complicated models (i.e., NNs with many parameters and layers) can achieve this (if the learning algorithm is clever enough)
  • 20. Training procedure
      •  Avoid overfitting
      •  Split the data into two parts
          –  Training dataset
              •  We optimize the parameters using this training dataset
          –  Validation dataset
              •  We evaluate the performance of the learned NN with this validation dataset
      •  Optional: test error
          –  If you want to estimate the generalization error, split the data three ways and use the last part, the test dataset, to measure the generalization error
      [Figure: the available data is divided into a training part and a validation part]
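The splitting procedure above might look like the following sketch. The function name `split_dataset`, the 60/20/20 proportions, and the fixed seed are assumptions for illustration; setting `test_fraction=0.0` gives the plain two-way split.

```python
import random

def split_dataset(samples, validation_fraction=0.2, test_fraction=0.2, seed=0):
    """Shuffle once, then cut the data into train / validation / test parts."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    n_val = int(round(len(shuffled) * validation_fraction))
    n_test = int(round(len(shuffled) * test_fraction))
    validation = shuffled[:n_val]
    test = shuffled[n_val:n_val + n_test]
    train = shuffled[n_val + n_test:]
    return train, validation, test

# Hypothetical dataset of 100 sample indices.
data = list(range(100))
train, validation, test = split_dataset(data)
```

Parameters are fitted on `train`, modelling choices are compared on `validation`, and `test` is touched only once at the end so that it remains an honest estimate of the generalization error.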
  • 21. Extra topics implemented by most of the frameworks
      •  Weight initialization
          –  Random
          –  Pretraining
          –  Transfer from another trained network
      •  Techniques for avoiding overfitting
          –  Dropout
          –  Batch normalization
          –  ResNet
      •  Convolution
      •  Visualization
          –  Deconvolution
  • 22. Summary of this Part
      •  A neural network is a computational model that stacks neurons, i.e., nonlinear computational units
      •  The gradients of the loss w.r.t. the parameters are calculated recursively from top to bottom by backprop
      •  Care must be taken to avoid overfitting by following validation procedures