[course site]
Day 2 Lecture 1
Multilayer Perceptron
Elisa Sayrol
Acknowledgements
Antonio Bonafonte
Kevin McGuinness
kevin.mcguinness@dcu.ie
Research Fellow
Insight Centre for Data Analytics
Dublin City University
Xavier Giro-i-Nieto
xavier.giro@upc.edu
…in our last lecture
Single Neuron Model (Perceptron)
The perceptron can address either regression or classification problems, depending
on the chosen activation function
Linear Regression (e.g. 1D input - 1D output)
$f(\mathbf{x}) = \mathbf{w}^T \mathbf{x} + b$
Binary Classification (e.g. 2D input, 1D output)
$f(\mathbf{x}) = g(\mathbf{w}^T \mathbf{x} + b)$, with $g$ the sigmoid
MultiClass: Softmax
Non-linear decision boundaries
Linear models can only produce linear
decision boundaries
Real-world data often needs a non-linear
decision boundary
• Images
• Audio
• Text
Learn a suitable representation space from the data by using deep neural networks
Example: X-OR.
AND and OR can be generated with a single perceptron
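[Figure: perceptron for AND (inputs x1, x2; weights 2, 2; bias -3; output y1) and perceptron for OR (inputs x1, x2; weights 2, 2; bias -1; output y2), each with its decision boundary in the (x1, x2) plane]
$y_1 = g(\mathbf{w}^T \mathbf{x} + b) = u\left(\begin{bmatrix} 2 & 2 \end{bmatrix} \cdot \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} - 3\right)$
$y_2 = g(\mathbf{w}^T \mathbf{x} + b) = u\left(\begin{bmatrix} 2 & 2 \end{bmatrix} \cdot \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} - 1\right)$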
Input vector (x1,x2)   Class AND   Class OR
(0,0)                  0           0
(0,1)                  0           1
(1,0)                  0           1
(1,1)                  1           1
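The two perceptrons can be checked with a few lines of Python. This is a minimal sketch, assuming the activation is a unit step that returns 1 for a strictly positive argument and 0 otherwise; the names `step` and `perceptron` are chosen here for illustration.

```python
def step(z):
    """Unit step activation (assumed convention: 1 if z > 0, else 0)."""
    return 1 if z > 0 else 0


def perceptron(x, w, b):
    """Single neuron: step applied to the weighted sum of the inputs plus the bias."""
    return step(sum(wi * xi for wi, xi in zip(w, x)) + b)


inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]

# Weights (2, 2) with bias -3 give AND; the same weights with bias -1 give OR.
and_out = [perceptron(x, (2, 2), -3) for x in inputs]
or_out = [perceptron(x, (2, 2), -1) for x in inputs]

print(and_out)  # [0, 0, 0, 1]
print(or_out)   # [0, 1, 1, 1]
```

Only the bias changes between the two gates: it shifts the same line across the (x1, x2) plane.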
Example: X-OR
X-OR is a non-linearly separable problem, so it cannot be
generated with a single perceptron
[Figure: the four XOR inputs in the (x1, x2) plane]
Input vector (x1,x2)   Class XOR
(0,0)                  0
(0,1)                  1
(1,0)                  1
(1,1)                  0
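Example: X-OR. However…
[Figure: two hidden perceptrons, h1 (weights -2, 2; bias -1) and h2 (weights 2, -2; bias -1), with their decision boundaries in the (x1, x2) plane, and an output perceptron y (weights 2, 2; bias -1) with its decision boundary in the (h1, h2) plane, where the points are labelled by their inputs (0,0), (1,1), (0,1), (1,0)]
$h_1 = g(\mathbf{w}_{11}^T \mathbf{x} + b_{11}) = u\left(\begin{bmatrix} -2 & 2 \end{bmatrix} \cdot \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} - 1\right)$
$h_2 = g(\mathbf{w}_{12}^T \mathbf{x} + b_{12}) = u\left(\begin{bmatrix} 2 & -2 \end{bmatrix} \cdot \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} - 1\right)$
$y = g(\mathbf{w}_{2}^T \mathbf{h} + b_{2}) = u\left(\begin{bmatrix} 2 & 2 \end{bmatrix} \cdot \begin{bmatrix} h_1 \\ h_2 \end{bmatrix} - 1\right)$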
Example: X-OR. However…
Input vector (x1,x2)   h1   h2
(0,0)                  0    0
(0,1)                  1    0
(1,0)                  0    1
(1,1)                  0    0

Input vector (h1,h2)   y
(0,0)                  0
(0,1)                  1
(1,0)                  1

Input vector (x1,x2)   y
(0,0)                  0
(0,1)                  1
(1,0)                  1
(1,1)                  0
Example: X-OR. Finally
[Figure: the three perceptrons combined into a single network with an Input layer (x1, x2), a Hidden layer (h1 with weights -2, 2 and bias -1; h2 with weights 2, -2 and bias -1) and an Output layer (y with weights 2, 2 and bias -1)]
Three-layer network:
- Input layer
- Hidden layer
- Output layer
2-2-1 fully connected topology
(all neurons in a layer are
connected to all neurons in the
following layer)
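The 2-2-1 network can be written compactly with one weight matrix per layer. The sketch below uses the weights and biases from the example; the unit step convention (1 for a strictly positive argument) and the helper names `layer` and `xor_net` are assumptions made here.

```python
def step(z):
    """Unit step activation (assumed convention: 1 if z > 0, else 0)."""
    return 1 if z > 0 else 0


def layer(x, W, b):
    """Fully connected layer: one step unit per row of W."""
    return [step(sum(w * xi for w, xi in zip(row, x)) + bi)
            for row, bi in zip(W, b)]


# Hidden layer: h1 uses weights (-2, 2), h2 uses weights (2, -2), both with bias -1.
W1, b1 = [[-2, 2], [2, -2]], [-1, -1]
# Output layer: weights (2, 2) and bias -1, i.e. an OR of h1 and h2.
W2, b2 = [[2, 2]], [-1]


def xor_net(x):
    h = layer(x, W1, b1)        # new representation (h1, h2)
    return layer(h, W2, b2)[0]  # final output y


inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
hidden = [layer(x, W1, b1) for x in inputs]
outputs = [xor_net(x) for x in inputs]
print(hidden)   # [[0, 0], [1, 0], [0, 1], [0, 0]]
print(outputs)  # [0, 1, 1, 0]
```

The hidden layer maps (0,0) and (1,1) to the same point (0,0), which is what makes the problem linearly separable for the output neuron.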
Another Example: Star Region (Univ. Texas)
https://www.cs.utexas.edu/~teammco/misc/mlp/
Neural networks
A neural network is simply a composition of
simple neurons into several layers
Each neuron simply computes a linear
combination of its inputs, adds a bias, and
passes the result through an activation
function g(x)
The network can contain one or more hidden
layers. The outputs of these hidden layers can
be thought of as a new representation of the
data (new features).
The final output is the target variable (y = f(x))
Multilayer perceptrons
When each node in each layer is a linear
combination of all inputs from the previous
layer then the network is called a multilayer
perceptron (MLP)
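Weights can be organized into matrices.
Forward pass computes
$\mathbf{h}^{(1)} = g(\mathbf{W}^{(1)} \mathbf{h}^{(0)} + \mathbf{b}^{(1)})$
g: activation function, e.g. sigmoid
f: target function, e.g. softmax
[Figure: fully connected network computing y = f(x); depth is the number of layers and width is the number of neurons per layer]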
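Multilayer perceptrons
Forward pass computes
$\mathbf{h}^{(1)} = g(\mathbf{W}^{(1)} \mathbf{h}^{(0)} + \mathbf{b}^{(1)})$, with
$\mathbf{W}^{(1)} = \begin{bmatrix} w_{11} & w_{12} & w_{13} & w_{14} \\ w_{21} & w_{22} & w_{23} & w_{24} \\ w_{31} & w_{32} & w_{33} & w_{34} \\ w_{41} & w_{42} & w_{43} & w_{44} \end{bmatrix}$, $\mathbf{h}^{(0)} = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix}$, $\mathbf{b}^{(1)} = \begin{bmatrix} b_1 \\ b_2 \\ b_3 \\ b_4 \end{bmatrix}$
$h_{11} = g(\mathbf{w}_1 \mathbf{x} + b_1)$, where $\mathbf{w}_1 = [w_{11}, w_{12}, w_{13}, w_{14}]$ is the first row of $\mathbf{W}^{(1)}$
$h_{12} = g(\mathbf{w}_2 \mathbf{x} + b_2)$, where $\mathbf{w}_2 = [w_{21}, w_{22}, w_{23}, w_{24}]$ is the second row of $\mathbf{W}^{(1)}$
[Figure: fully connected network with Layer 0 (inputs x1, x2, x3, x4; h0), Layer 1 (h1), Layer 2 (h2) and Layer 3 (outputs y1, y2; h3)]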
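The layer computation above can be sketched in plain Python. The weight and bias values below are hypothetical, chosen only so that the result is easy to check by hand, and the sigmoid is assumed as the activation g; `dense` is a name introduced here.

```python
import math


def sigmoid(z):
    """Logistic sigmoid, used here as the activation g (an assumption)."""
    return 1.0 / (1.0 + math.exp(-z))


def dense(h_prev, W, b, g=sigmoid):
    """One fully connected layer: h = g(W h_prev + b), computed row by row."""
    return [g(sum(w * h for w, h in zip(row, h_prev)) + bi)
            for row, bi in zip(W, b)]


# Hypothetical 4x4 weight matrix W1 and bias vector b1 (not taken from the slides).
W1 = [[0.1, 0.2, 0.3, 0.4],
      [0.0, 0.0, 0.0, 0.0],
      [1.0, -1.0, 1.0, -1.0],
      [0.5, 0.5, 0.5, 0.5]]
b1 = [0.0, 0.0, 0.0, -1.0]

h0 = [1.0, 1.0, 1.0, 1.0]  # layer 0 is the input vector x
h1 = dense(h0, W1, b1)     # h1[0] plays the role of h11, h1[1] of h12, and so on
print(h1)
```

Each entry of `h1` uses one row of the matrix and one bias, so stacking more layers is just repeated calls to `dense`.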
Universal approximation theorem
Universal approximation theorem states that “the standard multilayer feed-forward network with a single hidden layer,
which contains finite number of hidden neurons, is a universal approximator among continuous functions on compact
subsets of Rn, under mild assumptions on the activation function.”
If a 2-layer NN is a universal approximator, then why do we need deep nets?
The universal approximation theorem:
• Says nothing about how easy or difficult it is to fit such approximators
• Needs a “finite number of hidden neurons”: finite may be extremely large
In practice, deep nets can usually represent more complex functions with fewer total neurons (and
therefore, fewer parameters)
…Learning
Linear regression – Loss Function
[Figure: plot with axes x and y]
Loss function is the squared (Euclidean) loss
Logistic regression
Activation function is the sigmoid
Loss function is cross entropy
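[Figure: decision boundary in the (x1, x2) plane. The boundary is the line g(wTx + b) = ½, with normal vector w; g(wTx + b) > ½ on the class 1 side and g(wTx + b) < ½ on the class 0 side]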
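A small sketch of the logistic regression pieces: the sigmoid output, the ½ decision threshold and the binary cross entropy for one example. The weights, bias and input points are hypothetical values picked for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(x, w, b):
    """Model output g(w^T x + b) with g the sigmoid."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def cross_entropy(y, p):
    """Binary cross entropy for one example with label y in {0, 1}."""
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))


w, b = (1.0, 1.0), -1.0  # hypothetical parameters

p_boundary = predict_proba((0.5, 0.5), w, b)  # w^T x + b = 0, so the point lies on the boundary
p_pos = predict_proba((1.0, 1.0), w, b)       # w^T x + b = 1, on the class 1 side
p_neg = predict_proba((0.0, 0.0), w, b)       # w^T x + b = -1, on the class 0 side

print(p_boundary, p_pos, p_neg)
print(cross_entropy(1, p_pos), cross_entropy(1, p_neg))
```

The loss is small when the predicted probability agrees with the label and grows as the prediction moves to the wrong side of the boundary.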
Fitting linear models
E.g. linear regression
Need to optimize L
Gradient descent
[Figure: loss function L plotted against w, with the tangent line at w_t and the resulting step to w_{t+1}]
α: learning rate (aka step size)
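As an illustration, gradient descent for 1D linear regression with the squared loss can be sketched as follows. The toy data, the learning rate of 0.1 and the number of steps are assumptions made here, not values from the lecture.

```python
def gradient_descent(xs, ys, alpha=0.1, steps=500):
    """Fit y = w*x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Residuals of the current model on every training example.
        err = [w * x + b - y for x, y in zip(xs, ys)]
        # Gradient of L = (1/n) * sum(err^2) with respect to w and b.
        grad_w = 2.0 / n * sum(e * x for e, x in zip(err, xs))
        grad_b = 2.0 / n * sum(err)
        # Step against the gradient, scaled by the learning rate alpha.
        w -= alpha * grad_w
        b -= alpha * grad_b
    return w, b


# Toy data generated from y = 2x + 1 (hypothetical).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
w, b = gradient_descent(xs, ys)
print(w, b)  # close to 2 and 1
```

A learning rate that is too large makes the iterates diverge, and one that is too small makes convergence slow; 0.1 happens to be stable for this toy data.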
Training
Estimate parameters $\theta(W^{(t)}, b^{(t)})$ from training examples given a loss function
$\theta^{*} = \arg\min_{\theta} \mathcal{L}(f_{\theta}(x), y)$
• Iteratively adapt each parameter
Basic idea: gradient descent.
• Dependencies are very complex.
Global minimum: challenging. Local minima: can be good enough.
• Initialization influences the solution.
Training
Gradient Descent: Move the parameter $\theta$ in small steps in the direction opposite to the sign of the
derivative of the loss with respect to $\theta$.
$\theta^{(t)} = \theta^{(t-1)} - \alpha^{(t-1)} \cdot \nabla_{\theta} \mathcal{L}(y, f(x))$
Stochastic gradient descent (SGD): estimate the gradient with one sample, or better, with a
minibatch of examples.
Several strategies have been proposed to update the weights: Adam, RMSProp, Adamax, etc.,
known as optimizers.
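A minimal sketch of minibatch SGD, again on a 1D linear regression so that the result is easy to verify. The data, batch size, learning rate, number of epochs and random seed are all hypothetical choices, and a fixed learning rate is used instead of a schedule.

```python
import random


def sgd(xs, ys, alpha=0.05, epochs=500, batch_size=2, seed=0):
    """Fit y = w*x + b with minibatch stochastic gradient descent."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    idx = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(idx)  # visit the examples in a new random order each epoch
        for start in range(0, len(idx), batch_size):
            batch = idx[start:start + batch_size]
            # Gradient of the mean squared error estimated on the minibatch only.
            err = [w * xs[i] + b - ys[i] for i in batch]
            grad_w = 2.0 / len(batch) * sum(e * xs[i] for e, i in zip(err, batch))
            grad_b = 2.0 / len(batch) * sum(err)
            w -= alpha * grad_w
            b -= alpha * grad_b
    return w, b


# Toy data generated from y = 2x + 1 (hypothetical).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
w, b = sgd(xs, ys)
print(w, b)  # close to 2 and 1
```

Optimizers such as Adam or RMSProp keep this loop structure and change only how the parameter update is computed from the gradient estimate.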
Gradient descent examples
Linear regression
http://nbviewer.jupyter.org/github/kevinmcguinness/ml-examples/blob/master/notebooks/GD_Regression.ipynb
https://github.com/kevinmcguinness/ml-examples/blob/master/notebooks/GD_Regression.ipynb
Logistic regression
http://nbviewer.jupyter.org/github/kevinmcguinness/ml-examples/blob/master/notebooks/GD_Classification.ipynb
https://github.com/kevinmcguinness/ml-examples/blob/master/notebooks/GD_Classification.ipynb
MNIST Example
Handwritten digits
• 60,000 training examples
• 10,000 test examples
• 10 classes (digits 0-9)
• 28x28 grayscale images (784 pixels)
• http://yann.lecun.com/exdb/mnist/
The objective is to learn a function that predicts the digit from the image
MNIST Example
Model
• 3-layer neural network (2 hidden layers)
• Tanh units (activation function)
• 512-512-10
• Softmax on top layer
• Cross entropy Loss
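To get a feel for the size of this model, the number of trainable parameters can be counted layer by layer. The sketch assumes fully connected layers with one bias per neuron and an input of 784 pixels; `mlp_param_count` is a helper name introduced here.

```python
def mlp_param_count(layer_sizes):
    """Weights (n_in * n_out) plus biases (n_out) for each pair of consecutive layers."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))


sizes = [784, 512, 512, 10]  # input pixels, two hidden layers, ten classes
per_layer = [n_in * n_out + n_out for n_in, n_out in zip(sizes, sizes[1:])]
total = mlp_param_count(sizes)

print(per_layer)  # [401920, 262656, 5130]
print(total)      # 669706
```

Most of the parameters sit in the first layer, because every one of the 784 input pixels is connected to every one of the 512 hidden units.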
MNIST Example
Training
• 40 epochs using mini-batch SGD
• Size of the mini-batch: 128
• Learning rate: 0.1 (fixed)
• Takes 5 minutes to train on GPU
Accuracy Results
• 98.12% (188 errors in 10,000 test examples)
There are ways to improve accuracy…
Metrics
$\text{Accuracy} = \dfrac{TP + TN}{TP + TN + FP + FN}$
there are other metrics….
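The accuracy formula translates directly into code. The counts in the example are hypothetical and serve only to show the arithmetic.

```python
def accuracy(tp, tn, fp, fn):
    """Fraction of correct predictions: (TP + TN) / (TP + TN + FP + FN)."""
    return (tp + tn) / (tp + tn + fp + fn)


# Hypothetical confusion counts for a binary classifier evaluated on 100 examples.
acc = accuracy(tp=40, tn=50, fp=5, fn=5)
print(acc)  # 0.9
```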
Summary
• Multilayer Perceptron Networks allow us to build non-linear decision boundaries
• Multilayer Perceptron Networks are composed of the input layer, hidden layers and the
output layer. All neurons in one layer are connected to all neurons in the previous
layer and in the layer that follows
• Multilayer Perceptron Networks have a large number of parameters that have to be
estimated through training with the goal of minimizing a given loss function
• With Multilayer Perceptrons we need to find the gradient of the loss function with
respect to all the parameters of the model (W(t), b(t))
Assignment D2L2.1
Given the following network to obtain an XNOR operation, indicate which parameters are correct:
● w111=-2, w112=2, w121=2,w122=-2,b1=-1,w211=2,w221=2,b2=-1
● w111=-2, w112=2, w121=2,w122=-2,b1=-1,w211=2,w221=2,b2=1
● w111=-2, w112=2, w121=2,w122=-2,b1=-1,w211=-2,w221=-2,b2=1
● w111=-2, w112=2, w121=2,w122=-2,b1=-1,w211=-2,w221=-2,b2=-1
Assignment D2L1.2
Given the following Fully Connected Network, with an input of 256 elements, 2 hidden layers and an
output layer, how many parameters do you need to estimate?

Multilayer Perceptron - Elisa Sayrol - UPC Barcelona 2018

  • 1. [course site] Day 2 Lecture 1 Multilayer Perceptron Elisa Sayrol
  • 2. Acknowledgements Antonio Bonafonte Kevin McGuinness kevin.mcguinness@dcu.ie Research Fellow Insight Centre for Data Analytics Dublin City University Xavier Giro-i-Nieto xavier.giro@upc.edu
  • 3. …in our last lecture
  • 4. Single Neuron Model (Perceptron) The perceptron can address both regression or classification problems, depending on the chosen activation function
  • 5. Linear Regression (eg. 1D input - 1D ouput) ! " = $% " + '
  • 6. Binary Classification (eg. 2D input, 1D ouput) MultiClass: Softmax ! " = $ " = $(&'" + ))Sigmoid
  • 7. Non-linear decision boundaries Linear models can only produce linear decision boundaries Real world data often needs a non-linear decision boundary Images Audio Text Learn a suitable representation space from the data by using Deep neural Networks
  • 8. Example: X-OR. AND and OR can be generated with a single perceptron g -3 x1 x2 2 2 y1 x1 x2 AND 0 0 1 1 g -1 x1 x2 2 2 y2 OR 0 0 x2 1 x1 1 !" = $ %&' + ) = *( 2 2 · ." ./ − 3) !/ = $ %&' + ) = *( 2 2 · ." ./ − 1) Input vector (x1,x2) Class OR (0,0) 0 (0,1) 1 (1,0) 1 (1,1) 1 Input vector (x1,x2) Class AND (0,0) 0 (0,1) 0 (1,0) 0 (1,1) 1
  • 9. Example: X-OR. X-OR, a non-linearly separable problem, cannot be generated with a single perceptron. Truth table for XOR, input vector (x1,x2) → class: (0,0) → 0, (0,1) → 1, (1,0) → 1, (1,1) → 0. [Figure: the four XOR points in the (x1, x2) plane]
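  • 10. Example: X-OR. However… $h_1 = g(\mathbf{w}_{11}^T\mathbf{x} + b_{11}) = g\left(\begin{bmatrix} -2 & 2 \end{bmatrix} \cdot \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} - 1\right)$, $h_2 = g(\mathbf{w}_{12}^T\mathbf{x} + b_{12}) = g\left(\begin{bmatrix} 2 & -2 \end{bmatrix} \cdot \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} - 1\right)$, $y = g(\mathbf{w}_{2}^T\mathbf{h} + b_{2}) = g\left(\begin{bmatrix} 2 & 2 \end{bmatrix} \cdot \begin{bmatrix} h_1 \\ h_2 \end{bmatrix} - 1\right)$. [Figure: neurons h1, h2 and y with their weights and biases; the four inputs in the (x1, x2) plane and their images (0,0), (1,1), (0,1), (1,0) in the (h1, h2) plane]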
  • 11. Example: X-OR. However… Input vector (x1,x2) → h1: (0,0) → 0, (0,1) → 1, (1,0) → 0, (1,1) → 0. Input vector (x1,x2) → h2: (0,0) → 0, (0,1) → 0, (1,0) → 1, (1,1) → 0. Input vector (h1,h2) → y1: (0,0) → 0, (0,1) → 1, (1,0) → 1. Input vector (x1,x2) → y1: (0,0) → 0, (0,1) → 1, (1,0) → 1, (1,1) → 0. [Figure: neurons h1, h2 and y with their weights and biases; the four inputs in the (x1, x2) plane and their images in the (h1, h2) plane]
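  • 12. Example: X-OR. Finally: $h_1 = g(\mathbf{w}_{11}^T\mathbf{x} + b_{11}) = g\left(\begin{bmatrix} -2 & 2 \end{bmatrix} \cdot \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} - 1\right)$, $h_2 = g(\mathbf{w}_{12}^T\mathbf{x} + b_{12}) = g\left(\begin{bmatrix} 2 & -2 \end{bmatrix} \cdot \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} - 1\right)$, $y = g(\mathbf{w}_{2}^T\mathbf{h} + b_{2}) = g\left(\begin{bmatrix} 2 & 2 \end{bmatrix} \cdot \begin{bmatrix} h_1 \\ h_2 \end{bmatrix} - 1\right)$. Three-layer network: input layer, hidden layer, output layer. 2-2-1 fully connected topology (all neurons in a layer are connected to all neurons in the following layer). [Figure: the complete network, with inputs x1, x2, hidden neurons h1, h2 and output y]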
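The 2-2-1 network above can be evaluated on all four inputs to confirm that it reproduces the XOR truth table. This is a small sketch that again assumes g is a unit step; the names `W1`, `b1`, `w2`, `b2` and `xor_net` are chosen here for illustration, while the numeric weights and biases are the ones given on the slides.

```python
import numpy as np

def step(z):
    """Unit step activation: 1 where z > 0, else 0 (assumed form of g)."""
    return (np.asarray(z) > 0).astype(int)

# Hidden layer: row 0 holds the weights of h1, row 1 those of h2.
W1 = np.array([[-2, 2],
               [2, -2]])
b1 = np.array([-1, -1])

# Output neuron y.
w2 = np.array([2, 2])
b2 = -1

def xor_net(X):
    """Forward pass of the 2-2-1 network for every row of X."""
    H = step(X @ W1.T + b1)  # hidden representation (h1, h2)
    y = step(H @ w2 + b2)    # output
    return H, y

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
H, y = xor_net(X)
for x, h, out in zip(X.tolist(), H.tolist(), y.tolist()):
    print("x =", x, "h =", h, "y =", out)
```

In the hidden space both (0,0) and (1,1) map to the same point (0,0), so a single line in the (h1, h2) plane is enough for the output neuron to separate the two classes.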
  • 13. Another Example: Star Region (Univ. Texas) https://www.cs.utexas.edu/~teammco/misc/mlp/
  • 14. Neural networks. A neural network is simply a composition of simple neurons into several layers. Each neuron simply computes a linear combination of its inputs, adds a bias, and passes the result through an activation function g(x). The network can contain one or more hidden layers. The outputs of these hidden layers can be thought of as a new representation of the data (new features). The final output is the target variable (y = f(x)).
  • 15. Multilayer perceptrons. When each node in each layer is a linear combination of all inputs from the previous layer, the network is called a multilayer perceptron (MLP), or fully connected network. Weights can be organized into matrices. Forward pass computes $\mathbf{h}^{(1)} = g(\mathbf{W}^{(1)}\mathbf{h}^{(0)} + \mathbf{b}^{(1)})$. g: activation function, e.g. sigmoid. f: target function, e.g. softmax. y = f(x). [Figure: fully connected network, annotated with its depth and width]
  • 16. Multilayer perceptrons. Forward pass computes $h_{11} = g(\mathbf{w}\mathbf{x} + b)$, where $\mathbf{W}_1$ is the 4×4 matrix with entries $w_{11}, \dots, w_{44}$, $\mathbf{h}_0 = (x_1, x_2, x_3, x_4)^T$ and $\mathbf{b}_1 = (b_1, b_2, b_3, b_4)^T$. [Figure: network with Layer 0 (h0, inputs x1 to x4), Layer 1 (h1), Layer 2 (h2) and Layer 3 (h3, outputs y1, y2)]
  • 17. Multilayer perceptrons. Forward pass computes $h_{11} = g(\mathbf{w}\mathbf{x} + b)$ and $h_{12} = g(\mathbf{w}\mathbf{x} + b)$, with the same $\mathbf{W}_1$, $\mathbf{h}_0$ and $\mathbf{b}_1$. [Figure: network with Layer 0 (h0, inputs x1 to x4), Layer 1 (h1), Layer 2 (h2) and Layer 3 (h3, outputs y1, y2)]
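The layer-by-layer computation described on these slides can be written as a short loop over weight matrices. The sketch below is illustrative only: the 4-4-4-2 layer sizes follow the figure's 4 inputs and 2 outputs but the width of Layer 2 is an assumption, the weights are random, and the choice of sigmoid for hidden layers and softmax for the output follows the examples named on slide 15.

```python
import numpy as np

def sigmoid(z):
    """Element-wise logistic activation."""
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    """Softmax over a vector, shifted by its max for numerical stability."""
    e = np.exp(z - np.max(z))
    return e / e.sum()

def forward(x, weights, biases):
    """Forward pass: h_k = g(W_k h_{k-1} + b_k), softmax on the last layer."""
    h = x
    for W, b in zip(weights[:-1], biases[:-1]):
        h = sigmoid(W @ h + b)
    return softmax(weights[-1] @ h + biases[-1])

# Hypothetical sizes: Layer 0 (4 inputs), two hidden layers of 4, 2 outputs.
sizes = [4, 4, 4, 2]
rng = np.random.default_rng(0)
weights = [rng.normal(size=(n_out, n_in))
           for n_in, n_out in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(n_out) for n_out in sizes[1:]]

x = np.array([1.0, 0.5, -0.5, 2.0])  # arbitrary example input
y = forward(x, weights, biases)
print(y, y.sum())
```

Each row of a weight matrix holds the incoming weights of one neuron, so `W @ h + b` evaluates every neuron of a layer at once, which is the reason for organizing the weights into matrices.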
  • 18. Universal approximation theorem. The universal approximation theorem states that “the standard multilayer feed-forward network with a single hidden layer, which contains finite number of hidden neurons, is a universal approximator among continuous functions on compact subsets of $\mathbb{R}^n$, under mild assumptions on the activation function.” If a 2-layer NN is a universal approximator, then why do we need deep nets? The universal approximation theorem: (1) says nothing about how easy or difficult it is to fit such approximators; (2) needs a “finite number of hidden neurons”, and finite may be extremely large. In practice, deep nets can usually represent more complex functions with fewer total neurons (and therefore fewer parameters).
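  • 20. Linear regression: Loss Function. Loss function is square (Euclidean) loss. [Figure: data points and fitted line, axes x and y]
  • 21. Logistic regression. Activation function is the sigmoid. Loss function is cross entropy. [Figure: (x1, x2) plane with the decision boundary $g(\mathbf{w}^T\mathbf{x} + b) = \tfrac{1}{2}$ and the weight vector w; the regions $g(\mathbf{w}^T\mathbf{x} + b) > \tfrac{1}{2}$ and $g(\mathbf{w}^T\mathbf{x} + b) < \tfrac{1}{2}$ lie on either side, labelled 1 and 0]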
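To make the sigmoid and cross-entropy pairing concrete, here is a minimal sketch for a single 2D logistic-regression neuron. The weights `w = (1, 1)` and bias `b = -1` are hypothetical values picked for illustration, and the binary cross-entropy form used is the standard one, assumed rather than taken from the slide.

```python
import math

def sigmoid(z):
    """Logistic activation g(z) = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

def cross_entropy(y, p):
    """Binary cross-entropy for a label y in {0, 1} and predicted probability p."""
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

# Hypothetical parameters: the boundary is the line x1 + x2 = 1.
w = (1.0, 1.0)
b = -1.0

def predict(x):
    """g(w^T x + b) for a 2D input x."""
    return sigmoid(w[0] * x[0] + w[1] * x[1] + b)

p_boundary = predict((0.5, 0.5))  # on the boundary, w^T x + b = 0
p_pos = predict((2.0, 2.0))       # on the positive side of the boundary

print(p_boundary, cross_entropy(1, p_boundary))
print(p_pos, cross_entropy(1, p_pos), cross_entropy(0, p_pos))
```

A point on the boundary gets probability ½ whatever its label, while a confident prediction is penalized lightly when the label agrees with it and heavily when it does not.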
  • 22. Fitting linear models. E.g. linear regression: need to optimize L. Gradient descent. α: learning rate (aka step size). [Figure: loss function L plotted against w, with the tangent line at $w_t$ and the next iterate $w_{t+1}$]
  • 23. Training. Estimate the parameters $\theta$ (W(t), b(t)) from training examples given a loss function: $\theta^* = \arg\min_\theta \mathcal{L}(f_\theta(x), y)$ • Iteratively adapt each parameter. Basic idea: gradient descent. • Dependencies are very complex. Global minimum: challenging. Local minima: can be good enough. • Initialization influences the solution.
  • 24. Training. Gradient descent: move the parameter $\theta$ in small steps in the direction opposite to the derivative of the loss with respect to $\theta$: $\theta^{(t)} = \theta^{(t-1)} - \alpha^{(t-1)} \cdot \nabla_\theta \mathcal{L}(y, f(x))$, where $\alpha^{(t-1)}$ is the learning rate. Stochastic gradient descent (SGD): estimate the gradient with one sample or, better, with a minibatch of examples. Several strategies, known as optimizers, have been proposed to update the weights: Adam, RMSProp, Adamax, etc.
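The update rule above can be illustrated on the simplest case, a 1D linear regression with squared loss. This is a sketch under stated assumptions: the data are synthetic (generated from y = 3x + 2 without noise), the learning rate 0.1 and the 500 iterations are arbitrary choices, the learning rate is kept constant, and the gradient is computed on the full dataset rather than on a minibatch.

```python
import numpy as np

# Synthetic, noise-free data from a known line (assumption for illustration).
x = np.linspace(-1.0, 1.0, 50)
y = 3.0 * x + 2.0

w, b = 0.0, 0.0  # initial parameters
alpha = 0.1      # constant learning rate (step size)

for t in range(500):
    err = (w * x + b) - y            # prediction error per example
    grad_w = 2.0 * np.mean(err * x)  # d/dw of the mean squared error
    grad_b = 2.0 * np.mean(err)      # d/db of the mean squared error
    w -= alpha * grad_w              # step against the gradient
    b -= alpha * grad_b

print(w, b)
```

Replacing the full-data means with means over a random subset of examples at each iteration would turn this into minibatch SGD.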
  • 25. Gradient descent examples Linear regression http://nbviewer.jupyter.org/github/kevinmcguinness/ml-examples/blob/master/notebooks/GD_Regression.ipynb https://github.com/kevinmcguinness/ml-examples/blob/master/notebooks/GD_Regression.ipynb Logistic regression http://nbviewer.jupyter.org/github/kevinmcguinness/ml-examples/blob/master/notebooks/GD_Classification.ipynb https://github.com/kevinmcguinness/ml-examples/blob/master/notebooks/GD_Classification.ipynb
  • 26. MNIST Example. Handwritten digits • 60,000 training examples • 10,000 test examples • 10 classes (digits 0-9) • 28x28 grayscale images (784 pixels) • http://yann.lecun.com/exdb/mnist/ The objective is to learn a function that predicts the digit from the image.
  • 27. MNIST Example. Model • 3-layer neural network (2 hidden layers) • Tanh units (activation function) • 512-512-10 • Softmax on top layer • Cross-entropy loss
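As a rough check on the size of this model, the number of trainable parameters of a fully connected network can be counted layer by layer. The sketch assumes the 784-pixel input from the previous slide and one bias per neuron; the helper name `count_parameters` is hypothetical.

```python
def count_parameters(layer_sizes):
    """Weights plus biases of a fully connected network.

    layer_sizes lists the number of units per layer, input first.
    Each layer contributes n_in * n_out weights and n_out biases.
    """
    total = 0
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        total += n_in * n_out + n_out
    return total

# 784 inputs (28x28 pixels), two hidden layers of 512, 10 outputs.
n_params = count_parameters([784, 512, 512, 10])
print(n_params)  # 669706
```

Most of the parameters sit in the first layer (784 × 512 weights), because the input is much wider than the output.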
  • 28. MNIST Example. Training • 40 epochs using mini-batch SGD • Size of the mini-batch: 128 • Learning rate: 0.1 (fixed) • Takes 5 minutes to train on GPU. Accuracy results • 98.12% (188 errors in 10,000 test examples); there are ways to improve accuracy… Metrics: $\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$; there are other metrics…
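The reported figure follows directly from the error count, and the accuracy formula can be evaluated the same way. In this short sketch the function names are illustrative and the TP/TN/FP/FN counts in the second example are made up purely to exercise the formula; only the 188 errors in 10,000 test examples come from the slide.

```python
def accuracy(tp, tn, fp, fn):
    """Accuracy = (TP + TN) / (TP + TN + FP + FN)."""
    return (tp + tn) / (tp + tn + fp + fn)

def accuracy_from_errors(n_errors, n_total):
    """Fraction of correctly classified examples."""
    return (n_total - n_errors) / n_total

# From the slide: 188 errors in 10,000 test examples.
acc_mnist = accuracy_from_errors(188, 10_000)
print(f"{acc_mnist:.2%}")  # 98.12%

# Hypothetical binary confusion counts, for illustration only.
acc_demo = accuracy(tp=40, tn=50, fp=6, fn=4)
print(acc_demo)  # 0.9
```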
  • 29. Summary • Multilayer perceptron networks allow us to build non-linear decision boundaries. • Multilayer perceptron networks are composed of the input layer, hidden layers and the output layer. All neurons in one layer are connected to all neurons in the previous layer and in the layer that follows. • Multilayer perceptron networks have a large number of parameters that have to be estimated through training, with the goal of minimizing a given loss function. • With multilayer perceptrons we need to find the gradient of the loss function with respect to all the parameters of the model (W(t), b(t)).
  • 30. Assignment D2L2.1. Given the following network to obtain an XNOR operation, indicate which parameters are correct: ● w111=-2, w112=2, w121=2, w122=-2, b1=-1, w211=2, w221=2, b2=-1 ● w111=-2, w112=2, w121=2, w122=-2, b1=-1, w211=2, w221=2, b2=1 ● w111=-2, w112=2, w121=2, w122=-2, b1=-1, w211=-2, w221=-2, b2=1 ● w111=-2, w112=2, w121=2, w122=-2, b1=-1, w211=-2, w221=-2, b2=-1
  • 31. Assignment D2L1.2. Given the following fully connected network, with an input of 256 elements, 2 hidden layers and an output layer, how many parameters do you need to estimate?