Neural Networks to Deep Learning
Introduction
16.03.11 You Sung Min
Contents
1. Perceptron
2. Multilayer Perceptron (MLP)
3. Algorithm of Neural Networks
4. Deep Networks
5. AlphaGo
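Perceptron
Structure of perceptron (developed in the 1950s)
A simple model to emulate a single neuron
A perceptron takes binary inputs ($x_1, x_2, x_3, \dots$) and produces a single binary output (0 or 1)
$$\text{output} = \begin{cases} 0 & \text{if } \sum_j \omega_j x_j \le T \\ 1 & \text{if } \sum_j \omega_j x_j > T \end{cases}$$
[Figure: binary inputs, weighted by $\omega_1, \omega_2, \omega_3$, are summed as $\sum_j \omega_j x_j$ and compared with the threshold $T$]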
Perceptron
Realistic example
• Suppose the weekend is coming up
• There is a cheese festival in your city
• And you like cheese
→ Decide whether to go or not
1. Is the weather good? ($x_1 = 1$ or $0$)
2. Does your girlfriend want to accompany you? ($x_2 = 1$ or $0$)
3. Is the festival near public transit? ($x_3 = 1$ or $0$)
The decision depends on the output value
$$\text{output} = \begin{cases} 0 \ (\text{don't go}) & \text{if } \sum_j \omega_j x_j \le \text{threshold} \\ 1 \ (\text{go}) & \text{if } \sum_j \omega_j x_j > \text{threshold} \end{cases}$$
Perceptron
Affecting factors (inputs)
1. Weather ($x_1 = 1$ or $0$)
2. Girlfriend ($x_2 = 1$ or $0$)
3. Public transit ($x_3 = 1$ or $0$)
Weight, Threshold and Output (Decision)
If $\omega_1 = 6$, $\omega_2 = 2$ and $\omega_3 = 2$:
→ Threshold = 5: the decision depends only on the weather (6 > 5, but 2 + 2 = 4 ≤ 5)
→ Threshold = 3: more willing to go to the festival (girlfriend and public transit together give 2 + 2 = 4 > 3)
Go (1): $\sum_j \omega_j x_j > \text{threshold}$
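The cheese-festival decision can be sketched in a few lines of Python. This is only a minimal illustration using the weights and thresholds given on the slide; the function name `perceptron` is a hypothetical helper, not something from the original deck.

```python
def perceptron(inputs, weights, threshold):
    """Return 1 if the weighted sum of the binary inputs exceeds the threshold, else 0."""
    weighted_sum = sum(w * x for w, x in zip(weights, inputs))
    return 1 if weighted_sum > threshold else 0


# Weights from the slide: weather, girlfriend, public transit.
weights = [6, 2, 2]

# Threshold 5: only good weather (weight 6) can push the sum above the threshold.
print(perceptron([1, 0, 0], weights, threshold=5))  # → 1
print(perceptron([0, 1, 1], weights, threshold=5))  # → 0

# Threshold 3: company plus public transit (2 + 2 = 4) is now enough to go.
print(perceptron([0, 1, 1], weights, threshold=3))  # → 1
```

Lowering the threshold makes the perceptron fire for more input combinations, which is what "more willing to go" means here.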
Recognizing handwritten digits
Handwritten digits (examples shown: 5 0 4 1 9 2)
Rule-based approach
• “9” has a loop at the top, and a vertical stroke in the bottom right
• Rules become complicated and have exceptions
Recognizing handwritten digits
Neural Networks approach
• Use large training samples (handwritten digits data)
• Develop a system able to learn from those examples
- Automatically infer rules for recognizing digits
Multilayer Perceptron (MLP)
Four-layer net (with two hidden layers)
Input: intensities of pixels (e.g., 4096 input neurons for a 64-by-64 grayscale image of “9”)
Output: < 0.5 for “not 9”; > 0.5 for “9”
Not efficient
[Figure label: binary input (intensity of a pixel)]
Multilayer Perceptron (MLP)
Three-layer net to recognize each digit
Input: handwritten digit as a 28 by 28 pixel image
Desired output for “5”:
$$y(x) = (0, 0, 0, 0, 1, 0, 0, 0, 0)^T$$
[Figure label: binary input (intensity of a pixel)]
Learning of Neural Network
(Quadratic) Cost function
Input “5” is fed to a randomly initialized network
Output vector for “5”: $\text{output} = (a_1, a_2, \dots, a_{10})^T$
Target vector (desired output): $y(x) = (0, 0, 0, 0, 1, 0, 0, 0, 0)^T$
Cost function (minimize the difference between target and output):
$$C(\omega, b) = \frac{1}{2n} \sum_x \| y(x) - \text{output} \|^2$$
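As a rough illustration of the quadratic cost, the sketch below averages the squared distance between target and output vectors over $n$ training examples. The helper name `quadratic_cost`, the 10-entry one-hot target (assuming entry index equals digit, 0 to 9), and the output values are all assumptions made for the example; they are not taken from the slides.

```python
def quadratic_cost(targets, outputs):
    """C = 1/(2n) * sum over training examples of ||y(x) - output||^2."""
    n = len(targets)
    total = 0.0
    for y, a in zip(targets, outputs):
        total += sum((y_j - a_j) ** 2 for y_j, a_j in zip(y, a))
    return total / (2 * n)


# Hypothetical one-hot target for "5", assuming entry i stands for digit i (0..9).
target_5 = [0, 0, 0, 0, 0, 1, 0, 0, 0, 0]
# Made-up network output, for illustration only.
output = [0.1, 0.0, 0.0, 0.0, 0.0, 0.8, 0.0, 0.0, 0.1, 0.0]

cost = quadratic_cost([target_5], [output])
print(cost)  # about 0.03: (0.1^2 + 0.2^2 + 0.1^2) / 2
```

The cost is zero only when every output vector equals its target, so minimizing it pushes the network toward the desired outputs.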
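Learning of Neural Network
Find weights to approximate $y(x)$ for all $x$
• (Quadratic) Cost function
$$C(\omega, b) = \frac{1}{2n} \sum_x \| y(x) - \text{output} \|^2$$
• Gradient descent
$$\Delta C \approx \frac{\partial C}{\partial v_1} \Delta v_1 + \frac{\partial C}{\partial v_2} \Delta v_2$$
$$\Delta C \approx \nabla C \cdot \Delta v$$
Gradient vector:
$$\nabla C \equiv \left( \frac{\partial C}{\partial v_1}, \frac{\partial C}{\partial v_2} \right)^T$$
• Learning algorithm
$$\Delta v = -\eta \nabla C \quad (\eta: \text{learning rate})$$
$$v \to v' = v - \eta \nabla C$$
• Gradient descent applied to weights & biases
Weight:
$$\omega_k \to \omega_k' = \omega_k - \eta \frac{\partial C}{\partial \omega_k}$$
Bias:
$$b_l \to b_l' = b_l - \eta \frac{\partial C}{\partial b_l}$$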
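The update rule $v \to v' = v - \eta \nabla C$ can be tried on a toy cost. The sketch below is only an illustration: the cost $C(v_1, v_2) = v_1^2 + v_2^2$, the starting point, and the names `gradient_descent` and `grad_C` are assumptions chosen for the example, not part of the slides.

```python
def gradient_descent(grad, v, eta=0.1, steps=100):
    """Repeatedly apply the update v -> v - eta * grad(v)."""
    for _ in range(steps):
        g = grad(v)
        v = [v_i - eta * g_i for v_i, g_i in zip(v, g)]
    return v


def grad_C(v):
    """Gradient of the toy cost C(v1, v2) = v1^2 + v2^2 (illustrative only)."""
    return [2 * v[0], 2 * v[1]]


v_final = gradient_descent(grad_C, [3.0, -4.0])
print(v_final)  # both components end up very close to 0, the minimum of C
```

With $\eta = 0.1$ each step scales $v$ by 0.8 for this cost, so the iterates shrink steadily toward the minimum; a much larger learning rate would overshoot instead.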
Backpropagation algorithm
• Computation of partial derivatives $\frac{\partial C}{\partial b_j^l}$ and $\frac{\partial C}{\partial \omega_{jk}^l}$
[Figure: error backpropagation path]
Deep Network
(Fully-connected) Multi-layer network
More complex networks, more complicated problems?
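Vanishing gradient problem
Computation of partial derivatives (backpropagation) through neurons with weighted inputs $z_1, z_2, z_3, z_4$:
$$\frac{\partial C}{\partial b_1} = \sigma'(z_1)\, \omega_2\, \sigma'(z_2)\, \omega_3\, \sigma'(z_3)\, \omega_4\, \sigma'(z_4)\, \frac{\partial C}{\partial a_4}$$
Each factor $\omega_j \sigma'(z_j)$ is smaller than 1/4 in magnitude, because $|\omega_j| < 1$ and $\sigma'(z_j) \le \frac{1}{4}$
[Figure: plot of $\sigma'(z)$, which peaks at 0.25 at $z = 0$]
→ Hard to train a deep architecture network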
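To get a feel for how quickly this product shrinks, the following sketch multiplies the $\sigma'(z_j)$ and $\omega_j$ factors for a chain of single-neuron layers. It assumes the sigmoid activation; the weight value 0.9, the choice $z_j = 0$ (the most favorable case), and the function names are hypothetical and picked only for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def sigmoid_prime(z):
    s = sigmoid(z)
    return s * (1.0 - s)


def bias_gradient_factor(weights, zs):
    """sigma'(z_1) * product over j >= 2 of w_j * sigma'(z_j).

    `zs` holds z_1..z_L and `weights` holds w_2..w_L (one entry fewer).
    """
    factor = sigmoid_prime(zs[0])
    for w, z in zip(weights, zs[1:]):
        factor *= w * sigmoid_prime(z)
    return factor


print(sigmoid_prime(0.0))  # → 0.25, the maximum of the sigmoid derivative

# Hypothetical weights of 0.9 and z = 0 everywhere (best case for the gradient).
for depth in (4, 8, 16):
    print(depth, bias_gradient_factor([0.9] * (depth - 1), [0.0] * depth))
```

Even in this best case the factor multiplying $\partial C / \partial a$ drops by more than a factor of four per added layer, which is why the early layers of a deep sigmoid network learn so slowly.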
Deep Networks
Approaches for learning a deep architecture network
• Convolutional Neural Network (CNN)
• Deep Belief Net (DBN)
• Stacked Auto Encoder (SAE)
• Recurrent Neural Network (RNN)
Convolutional Neural Networks
3 Characteristics of CNN
• Local receptive field (connectivity)
- Reduce connections between neurons
• Shared weights
- Reduce total number of weights and biases
• Pooling layer
- Simplify (condense) information
Convolutional Neural Networks
• Local receptive field (connectivity)
2D convolution: a 5 by 5 kernel (window) with weights $w_{11}, w_{12}, \dots, w_{55}$ slides over the 28 by 28 input and produces a 24 by 24 layer (28 − 5 + 1 = 24 positions per side)
1. Detect local information (features) (e.g., edge, shape)
2. Reduce connections between layers
• Fully connected network
→ 28 × 28 × 24 × 24 connections
• Locally connected network
→ 5 × 5 × 24 × 24 connections
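The saving from local connectivity can be checked with a little arithmetic. The sketch below assumes the kernel moves one pixel at a time (stride 1) with no padding, so a 5 by 5 kernel on a 28 by 28 input fits in 28 − 5 + 1 = 24 positions per side; the helper name `conv_output_size` is hypothetical.

```python
def conv_output_size(input_size, kernel_size, stride=1):
    """Number of kernel positions along one side, assuming no padding."""
    return (input_size - kernel_size) // stride + 1


input_side, kernel_side = 28, 5
hidden_side = conv_output_size(input_side, kernel_side)

# Every hidden neuron connected to every input pixel.
fully_connected = input_side * input_side * hidden_side * hidden_side
# Every hidden neuron connected only to its 5-by-5 window.
locally_connected = kernel_side * kernel_side * hidden_side * hidden_side

print(hidden_side)        # → 24
print(fully_connected)    # → 451584
print(locally_connected)  # → 14400
```

Under these assumptions the locally connected layer needs roughly 31 times fewer connections than the fully connected one.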
Convolutional Neural Networks
• Shared weights
1. Detect the same feature in other positions
2. Reduce total number of weights and biases
3. Construct multiple feature maps (kernels)
$$\text{output} = \sigma\left( b + \sum_{l=0}^{4} \sum_{m=0}^{4} \omega_{l,m}\, a_{j+l,\,k+m} \right)$$
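The shared-weight formula can be written out directly as nested loops: the same kernel and bias are reused at every position $(j, k)$ of the input. This is a plain-Python sketch for readability rather than an efficient implementation, and the all-zero image and kernel are placeholder values used only to check the output shape.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def feature_map(image, kernel, bias):
    """Compute sigma(b + sum_l sum_m w[l][m] * a[j+l][k+m]) at every position (j, k)."""
    size = len(kernel)
    rows = len(image) - size + 1
    cols = len(image[0]) - size + 1
    out = []
    for j in range(rows):
        row = []
        for k in range(cols):
            z = bias
            for l in range(size):
                for m in range(size):
                    z += kernel[l][m] * image[j + l][k + m]
            row.append(sigmoid(z))
        out.append(row)
    return out


# Placeholder data: a 28x28 all-zero image and a 5x5 all-zero kernel.
image = [[0.0] * 28 for _ in range(28)]
kernel = [[0.0] * 5 for _ in range(5)]
fmap = feature_map(image, kernel, bias=0.0)
print(len(fmap), len(fmap[0]))  # → 24 24
```

Because one kernel is shared across all positions, this feature map has only 25 weights and 1 bias no matter how large the image is; detecting several features means using several such kernels.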
Convolutional Neural Networks
• Pooling layer
1. Simplify (condense) information in the feature map
2. Reduce connections (weights and biases)
Max-pooling: output only the maximum activation
[Figure: convolution layer followed by pooling layer]
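Max-pooling is easy to sketch: split the feature map into blocks and keep only the largest activation in each. The 2 by 2 block size and the example values below are assumptions for illustration; the slides do not state a pooling size.

```python
def max_pool(feature_map, size=2):
    """Non-overlapping max-pooling: keep the largest activation in each size-by-size block."""
    rows = len(feature_map) // size
    cols = len(feature_map[0]) // size
    return [
        [
            max(
                feature_map[r * size + i][c * size + j]
                for i in range(size)
                for j in range(size)
            )
            for c in range(cols)
        ]
        for r in range(rows)
    ]


# Made-up 4x4 feature map.
fmap = [
    [0.1, 0.3, 0.2, 0.0],
    [0.5, 0.2, 0.1, 0.4],
    [0.0, 0.1, 0.9, 0.3],
    [0.2, 0.6, 0.7, 0.8],
]
print(max_pool(fmap))  # → [[0.5, 0.4], [0.6, 0.9]]
```

The pooled map has a quarter of the entries of the original, so the next layer needs correspondingly fewer connections.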
Deep Networks Application
AlphaGo
• Artificial intelligence algorithm for the game of Go (Korean: baduk)
• Google DeepMind
• Convolutional Neural Network (CNN)
• Monte Carlo Tree Search (MCTS)
• Achieved a 99.8% winning rate against other Go programs
• Defeated the human European Go champion 5:0
Silver, David, et al. "Mastering the game of Go with deep neural networks and tree search." Nature 529.7587 (2016): 484-489.
AlphaGo
Convolutional Neural Network in AlphaGo
The board position is the input, as a 19 by 19 image
Policy Network
• Decides where to place a stone by calculating the probability
Value Network
• Evaluates the current state of board positions (winning probability)
Supervised learning (SL)
• Learn from human expert moves (30 million board states)
Reinforcement learning (RL)
• Self-learning to reinforce the policy network
AlphaGo
Convolutional Neural Network in AlphaGo
Input: 19 by 19 board status
13 Convolutional Layers
Policy Network output: probability distribution (PD) for next move
Value Network output: value of game state
Computation: 30 billion computation
Implementation: CPU 1202, GPU 176, Threads 64
AlphaGo
Game Tree Search Algorithm
AlphaGo
Monte Carlo Tree Search (MCTS)
[Figure: search tree from the current state, with fast rollout]
References
Image source: http://neuralnetworksanddeeplearning.com
Silver, David, et al. "Mastering the game of Go with deep neural networks and tree search." Nature 529.7587 (2016): 484-489.
SPRi Issue Report 2016-002 (2016)
Appendix
Human Visual Pathway
Recognizing handwritten digits
The human visual system recognizes handwritten digits easily
Connections within the visual cortex
• E.g. V1 (140 million neurons, tens of billions of connections)
Functional structure of layers
• V1 / V2: Basic visual features
• V3 / V5: Spatial localization
• V3: Shape perception
• V4: Color vision
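Backpropagation algorithm
• Output of each neuron
$$a_j^l = \sigma\left( \sum_k \omega_{jk}^l\, a_k^{l-1} + b_j^l \right)$$
where $\omega_{jk}^l$ is the weight of the connection, $a_k^{l-1}$ is the output of the prior neuron, and $b_j^l$ is the bias of the current neuron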
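The output of one layer follows directly from this formula: each neuron $j$ takes a weighted sum of the previous layer's activations, adds its bias, and applies $\sigma$. The sketch below assumes the sigmoid for $\sigma$; the weights, biases and activations are made-up values and `layer_output` is a hypothetical helper name.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def layer_output(weights, biases, prev_activations):
    """a_j^l = sigma(sum_k w_jk^l * a_k^(l-1) + b_j^l) for every neuron j in layer l."""
    return [
        sigmoid(sum(w_jk * a_k for w_jk, a_k in zip(w_j, prev_activations)) + b_j)
        for w_j, b_j in zip(weights, biases)
    ]


# Made-up layer: 2 neurons, each fed by 3 activations from the previous layer.
weights = [[0.5, -0.5, 0.0], [1.0, 1.0, 1.0]]
biases = [0.0, -3.0]
prev = [1.0, 1.0, 1.0]
print(layer_output(weights, biases, prev))  # → [0.5, 0.5]
```

Feeding the result of one call in as `prev_activations` for the next layer gives the full forward pass, whose final output $a^L$ is what the cost function compares with the target.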
Backpropagation algorithm
• Cost function
$$C(\omega, b) = \frac{1}{2n} \sum_x \| y(x) - a^L(x) \|^2$$
• Cost function for a single training example
$$C = \frac{1}{2} \| y - a^L \|^2 = \frac{1}{2} \sum_j \left( y_j - a_j^L \right)^2$$

Neural Networks to Deep Learning: Perceptrons, MLPs, CNNs and AlphaGo

  • 1. Neural Networks to Deep Learning Introduction 16.03.11 You Sung Min
  • 2. 1. Perceptron 2. Multilayer Perceptron(MLP) 3. Algorithm of Neural Networks 4. Deep Networks 5. AlphaGo Contents
  • 3. Structure of perceptron (Developed in 1950s) A simple model to emulate a single neuron A perceptron takes binary inputs (𝒙 𝟏, 𝒙 𝟐, 𝒙 𝟑 … ) and produce a single binary output (0, 1) Perceptron = 𝟎 𝒊𝒇 𝒋 𝝎𝒋 𝒙𝒋 ≤ 𝑻 𝟏 𝒊𝒇 𝒋 𝝎𝒋 𝒙𝒋 > 𝑻 𝝎 𝟏 𝝎 𝟐 𝝎 𝟑 𝒋 𝝎𝒋 𝒙𝒋Binary Inputs Threshold T
  • 4. Realistic example  Suppose the week end is coming up  There is a cheese festival in your city  And you like cheese → Decide to go or not to go ? 1. Is the weather good? (i.e., 𝒙 𝟏 = 𝟏 𝒐𝒓 𝟎) 2. Does your girlfriend want to accompany you? (𝒙 𝟐 = 𝟏 𝒐𝒓 𝟎) 3. Is the festival near public transit? (𝒙 𝟑 = 𝟏 𝒐𝒓 𝟎) The decision is depend on the output value Perceptron 𝒐𝒖𝒕𝒑𝒖𝒕 = 𝟎 (𝒅𝒐𝒏𝒕′ 𝒈𝒐) 𝒊𝒇 𝒋 𝝎𝒋 𝒙𝒋 ≤ 𝒕𝒉𝒓𝒆𝒔𝒉𝒐𝒍𝒅 𝟏 (𝒈𝒐) 𝒊𝒇 𝒋 𝝎𝒋 𝒙𝒋 > 𝒕𝒉𝒓𝒆𝒔𝒉𝒐𝒍𝒅
  • 5. Affecting factors (input) 1. Weather (𝒙 𝟏 = 𝟏 𝒐𝒓 𝟎) 2. Girlfriend (𝒙 𝟐 = 𝟏 𝒐𝒓 𝟎) 3. Public transit (𝒙 𝟑 = 𝟏 𝒐𝒓 𝟎) Weight, Threshold and Output (Decision) If 𝜔1 = 6, 𝜔2 = 2 and 𝜔3 = 2 → Threshold = 5, Depends on only the weather → Threshold = 3, More willing to go to the festival Perceptron Go (1): 𝒋 𝝎𝒋 𝒙𝒋 > 𝒕𝒉𝒓𝒆𝒔𝒉𝒐𝒍𝒅
  • 6. Recognizing handwritten digits Handwritten digits Rule-based approach  “9” has a loop at the top, and a vertical stroke in the bottom right  Rules are complicated; exceptions 5 0 4 1 9 2
  • 7. Neural Networks approach Use large training samples (handwritten digits data) Develop a system able to learn from those examples - Automatically infer rules for recognizing digits Recognizing handwritten digits
  • 8. Multilayer Perceptron (MLP): four-layer net (with two hidden layers). Input: intensities of pixels (e.g., 4096 input neurons for a 64-by-64 grayscale image of a “9”); binary input (intensity of a pixel). Output: < 0.5 for “not 9”; > 0.5 for “9”. Not efficient.
  • 9. Multilayer Perceptron (MLP): three-layer net to recognize each digit. Input: handwritten digit as a 28-by-28 pixel image; binary input (intensity of a pixel). Desired output for “5”: $y(x) = (0, 0, 0, 0, 1, 0, 0, 0, 0)^T$
  • 10. Learning of Neural Network: (quadratic) cost function. The network is randomly initialized. For input “5”, the output vector is $\text{output} = (a_1, a_2, \dots, a_{10})^T$ and the target vector (desired output) is $y(x) = (0, 0, 0, 0, 1, 0, 0, 0, 0)^T$. Cost function (minimize the difference): $C(\omega, b) = \frac{1}{2n} \sum_x \| y(x) - \text{output} \|^2$
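
The quadratic cost above can be written out directly. The sketch below uses plain Python lists; the helper names `one_hot` and `quadratic_cost` are illustrative. It also assumes a 10-component target in which index i stands for digit i (0 to 9), an indexing convention the slide does not specify.

```python
def one_hot(digit, num_classes=10):
    """Target vector y(x): 1.0 at the position of the digit, 0.0 elsewhere.

    Assumption: index i corresponds to digit i (0..9).
    """
    return [1.0 if i == digit else 0.0 for i in range(num_classes)]


def quadratic_cost(targets, outputs):
    """C = 1/(2n) * sum over training examples x of ||y(x) - output||^2."""
    n = len(targets)
    total = 0.0
    for y, a in zip(targets, outputs):
        total += sum((y_i - a_i) ** 2 for y_i, a_i in zip(y, a))
    return total / (2 * n)


target = one_hot(5)
perfect_cost = quadratic_cost([target], [target])        # output equals target
all_zero_cost = quadratic_cost([target], [[0.0] * 10])   # network outputs all zeros
print(perfect_cost, all_zero_cost)
```

A randomly initialized network typically produces an arbitrary output vector, so its cost starts well above zero; training aims to push it toward zero.
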
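  • 11. Learning of Neural Network: find weights to approximate $y(x)$ for all $x$. (Quadratic) cost function: $C(\omega, b) = \frac{1}{2n} \sum_x \| y(x) - \text{output} \|^2$. Gradient descent: $\Delta C \approx \frac{\partial C}{\partial v_1} \Delta v_1 + \frac{\partial C}{\partial v_2} \Delta v_2$, that is, $\Delta C \approx \nabla C \cdot \Delta v$, with gradient vector $\nabla C \equiv \left( \frac{\partial C}{\partial v_1}, \frac{\partial C}{\partial v_2} \right)^T$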
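  • 12. Learning of Neural Network: learning algorithm. $\Delta v = -\eta \nabla C$ ($\eta$: learning rate), so $v \to v' = v - \eta \nabla C$. Gradient descent applied to weights and biases: weight: $\omega_k \to \omega_k' = \omega_k - \eta \frac{\partial C}{\partial \omega_k}$; bias: $b_l \to b_l' = b_l - \eta \frac{\partial C}{\partial b_l}$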
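
The update rule $v \to v' = v - \eta \nabla C$ can be tried on a toy problem. The sketch below does not train a network: it assumes a stand-in cost $C(v_1, v_2) = v_1^2 + v_2^2$, chosen only because its gradient $(2 v_1, 2 v_2)$ is easy to verify. The function names, starting point and learning rate are illustrative.

```python
def cost(v):
    """Toy cost C(v1, v2) = v1^2 + v2^2 (an assumption for illustration)."""
    return v[0] ** 2 + v[1] ** 2


def grad_cost(v):
    """Gradient vector (dC/dv1, dC/dv2) of the toy cost."""
    return [2 * v[0], 2 * v[1]]


def gradient_descent(v, eta, steps):
    """Repeatedly apply v -> v' = v - eta * grad C."""
    for _ in range(steps):
        gradient = grad_cost(v)
        v = [v_i - eta * g_i for v_i, g_i in zip(v, gradient)]
    return v


v_start = [3.0, -4.0]
v_final = gradient_descent(v_start, eta=0.1, steps=100)
print(cost(v_start), cost(v_final))  # the cost drops from 25.0 to nearly 0
```

In a neural network the same rule is applied to every weight $\omega_k$ and bias $b_l$, with the partial derivatives supplied by backpropagation.
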
  • 13. Backpropagation algorithm: computation of the partial derivatives $\frac{\partial C}{\partial b_j^l}$ and $\frac{\partial C}{\partial \omega_{jk}^l}$. (Figure: error backpropagation path)
  • 14. (Fully-connected) Multi-layer network Deep Network More complex networks, more complicated problems ?
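  • 15. Backpropagation: vanishing gradient problem. Computation of partial derivatives for a chain of neurons with weighted inputs $z_1, z_2, z_3, z_4$: $\frac{\partial C}{\partial b_1} = \sigma'(z_1) \, \omega_2 \, \sigma'(z_2) \, \omega_3 \, \sigma'(z_3) \, \omega_4 \, \sigma'(z_4) \, \frac{\partial C}{\partial a_4}$. Because $|\omega_j| < 1$ and $\sigma'(z_j) \le 1/4$ (the sigmoid derivative peaks at 0.25 at $z = 0$), each factor $\omega_j \sigma'(z_j)$ is smaller than $1/4$ in magnitude, so the gradient shrinks toward the early layers. Hard to train a deep architecture network.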
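
The shrinking product on slide 15 can be checked numerically. The sketch below evaluates the slide's expression for a chain of four sigmoid neurons; the function names and the particular weights, $z$ values and $\partial C / \partial a_4 = 1$ are assumptions picked for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def sigmoid_prime(z):
    """Derivative of the sigmoid; its maximum is 0.25, reached at z = 0."""
    s = sigmoid(z)
    return s * (1.0 - s)


def bias_gradient(zs, ws, dC_da):
    """dC/db1 = sigma'(z1) * w2 * sigma'(z2) * ... * w4 * sigma'(z4) * dC/da4.

    zs holds [z1, z2, z3, z4]; ws holds [w2, w3, w4].
    """
    grad = sigmoid_prime(zs[0])
    for w, z in zip(ws, zs[1:]):
        grad *= w * sigmoid_prime(z)
    return grad * dC_da


# Best case for the sigmoid (every z_j = 0) with weights just below 1.
grad_b1 = bias_gradient(zs=[0.0, 0.0, 0.0, 0.0], ws=[0.9, 0.9, 0.9], dC_da=1.0)
print(sigmoid_prime(0.0), grad_b1)
```

Even in this favourable setting the gradient reaching the first layer is below $(1/4)^4$, and each additional layer multiplies in another factor smaller than $1/4$.
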
  • 16. Deep Networks: to learn deep architecture networks: Convolutional Neural Network (CNN), Deep Belief Net (DBN), Stacked Auto Encoder (SAE), Recurrent Neural Network (RNN)
  • 17. Convolutional Neural Networks: 3 characteristics of CNN. 1. Local receptive field (connectivity) - reduces connections between neurons 2. Shared weights - reduce the total number of weights and biases 3. Pooling layer - simplifies (condenses) information
  • 18. Convolutional Neural Networks: local receptive field (connectivity). 2D convolution of a 28-by-28 input with a 5-by-5 kernel (window, weights $w_{11}, w_{12}, \dots, w_{55}$) gives a 24-by-24 hidden layer (28 − 5 + 1 = 24 with stride 1 and no padding). 1. Detect local information (features) (e.g., edge, shape) 2. Reduce connections between layers: fully connected network → $28 \times 28 \times 24 \times 24$ connections; locally connected network → $5 \times 5 \times 24 \times 24$ connections
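
The connection counts for a local receptive field follow from simple arithmetic. The sketch below assumes a stride of 1 and no padding, in which case a 5-by-5 window sliding over a 28-by-28 input has 28 − 5 + 1 = 24 positions per axis; the function name `conv_output_size` is illustrative.

```python
def conv_output_size(input_size, kernel_size, stride=1):
    """Number of window positions along one axis (no padding assumed)."""
    return (input_size - kernel_size) // stride + 1


input_size = 28
kernel_size = 5
hidden_size = conv_output_size(input_size, kernel_size)

# Every hidden neuron connected to every input pixel.
fully_connected = input_size * input_size * hidden_size * hidden_size
# Every hidden neuron connected only to its 5-by-5 window.
locally_connected = kernel_size * kernel_size * hidden_size * hidden_size

print(hidden_size, fully_connected, locally_connected)  # prints: 24 451584 14400
```

Under these assumptions, local connectivity cuts the number of connections by a factor of 784 / 25, roughly 31.
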
  • 19. Convolutional Neural Networks: shared weights. 1. Detect the same feature at other positions 2. Reduce the total number of weights and biases 3. Construct multiple feature maps (kernels). $\text{output} = \sigma\left( b + \sum_{l=0}^{4} \sum_{m=0}^{4} \omega_{l,m} \, a_{j+l,k+m} \right)$
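
The shared-weights formula can be turned into a small feature-map computation. This is a sketch in plain Python rather than an optimized implementation: it assumes a square image stored as a list of rows, a sigmoid activation, stride 1 and no padding, and the name `feature_map` is illustrative.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def feature_map(image, kernel, bias):
    """output[j][k] = sigmoid(b + sum_l sum_m w[l][m] * a[j + l][k + m]).

    The same kernel weights and bias are reused at every position (j, k).
    """
    size = len(kernel)
    out_size = len(image) - size + 1
    result = []
    for j in range(out_size):
        row = []
        for k in range(out_size):
            z = bias
            for l in range(size):
                for m in range(size):
                    z += kernel[l][m] * image[j + l][k + m]
            row.append(sigmoid(z))
        result.append(row)
    return result


image = [[0.0] * 28 for _ in range(28)]   # blank 28-by-28 input
kernel = [[1.0] * 5 for _ in range(5)]    # one shared 5-by-5 set of weights
hidden = feature_map(image, kernel, bias=0.0)
print(len(hidden), len(hidden[0]))
```

One feature map needs only 25 weights and 1 bias no matter how large the image is; several kernels give several feature maps.
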
  • 20. Convolutional Neural Networks: pooling layer. 1. Simplify (condense) information in the feature map 2. Reduce connections (weights and biases). Max-pooling: output only the maximum activation. (Figure: Conv. → Pooling)
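
Max-pooling can be sketched as follows. The 2-by-2 window size is an assumption (the slide does not state one), as is the function name `max_pool`; the feature map is a list of rows whose side lengths are multiples of the window size.

```python
def max_pool(feature_map, size=2):
    """Keep only the maximum activation in each non-overlapping size-by-size block."""
    rows = len(feature_map) // size
    cols = len(feature_map[0]) // size
    return [
        [
            max(
                feature_map[r * size + i][c * size + j]
                for i in range(size)
                for j in range(size)
            )
            for c in range(cols)
        ]
        for r in range(rows)
    ]


activations = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
pooled = max_pool(activations)
print(pooled)  # prints: [[6, 8], [14, 16]]
```

With a 2-by-2 window, a 24-by-24 feature map condenses to 12-by-12, so the following layer needs a quarter as many inputs.
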
  • 21. Deep Networks Application: AlphaGo. Artificial intelligence algorithm for the game of Go (바둑), developed by Google DeepMind. Uses a Convolutional Neural Network (CNN) and Monte Carlo Tree Search (MCTS). Achieved a 99.8% winning rate against other Go programs. Defeated the human European Go champion 5:0. Silver, David, et al. "Mastering the game of Go with deep neural networks and tree search." Nature 529.7587 (2016): 484-489.
  • 22. AlphaGo: Convolutional Neural Network in AlphaGo. Input: the board position as a 19-by-19 image. Policy network: decides where to place a stone by calculating probabilities. Value network: evaluates the current state of the board position (winning probability). Supervised learning (SL): learn from human expert moves (30 million board states). Reinforcement learning (RL): self-learning to reinforce the policy network
  • 23. AlphaGo: Convolutional Neural Network in AlphaGo. Policy network and value network: 13 convolutional layers. Input: 19-by-19 board status. Output: probability distribution (PD) for the next move (policy network); value of the game state (value network). Computation: 30 billion computations. Implementation: CPU 1202, GPU 176, threads 64
  • 25. AlphaGo: Monte Carlo Tree Search (MCTS). (Figure: current state; fast rollout)
  • 26. References: Image source: http://neuralnetworksanddeeplearning.com. Silver, David, et al. "Mastering the game of Go with deep neural networks and tree search." Nature 529.7587 (2016): 484-489. SPRi Issue Report 2016-002 (2016).
  • 27. Appendix: Human Visual Pathway. Recognizing handwritten digits: the human visual system recognizes them easily. Connections in the visual cortex: e.g., V1 (140 million neurons, tens of billions of connections). Functional structure of the layers: V1 / V2: basic visual features; V3 / V5: spatial localization; V3: shape perception; V4: color vision
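  • 28. Appendix: Backpropagation algorithm. Output of each neuron: $a_j^l = \sigma\left( \sum_k \omega_{jk}^l \, a_k^{l-1} + b_j^l \right)$, where $\omega_{jk}^l$ is the weight of the connection, $a_k^{l-1}$ is the output of the prior neuron, and $b_j^l$ is the bias of the current neuron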
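
The neuron-output formula describes one layer of the forward pass. A minimal sketch, assuming a sigmoid activation and weights stored as a list of rows (row $j$ holds $\omega_{jk}$ for all $k$); the name `layer_output` and the example numbers are illustrative.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def layer_output(weights, biases, prev_activations):
    """a_j^l = sigmoid(sum_k w_jk^l * a_k^(l-1) + b_j^l) for every neuron j in layer l."""
    return [
        sigmoid(sum(w * a for w, a in zip(row, prev_activations)) + b)
        for row, b in zip(weights, biases)
    ]


# Layer with 2 neurons fed by 3 activations from the prior layer.
weights = [[0.0, 0.0, 0.0],
           [1.0, -1.0, 0.5]]
biases = [0.0, 0.0]
prev_activations = [1.0, 1.0, 0.0]
activations = layer_output(weights, biases, prev_activations)
print(activations)
```

Feeding each layer's result into the next `layer_output` call gives the network output $a^L$ used in the cost function.
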
  • 29. Appendix: Backpropagation algorithm. Cost function: $C(\omega, b) = \frac{1}{2n} \sum_x \| y(x) - a^L(x) \|^2$. Cost function for a single training example: $C = \frac{1}{2} \| y - a^L \|^2 = \frac{1}{2} \sum_j (y_j - a_j^L)^2$
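
For the single-example cost, the partial derivatives that backpropagation computes can be written in closed form in the simplest possible case. The sketch below assumes a network consisting of one sigmoid neuron with one input, so $a = \sigma(\omega x + b)$ and $C = \frac{1}{2}(y - a)^2$; the chain rule then gives $\partial C / \partial b = (a - y)\,\sigma'(z)$ and $\partial C / \partial \omega = (a - y)\,\sigma'(z)\,x$. The function names and example numbers are illustrative.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def single_example_cost(w, b, x, y):
    """C = 1/2 * (y - a)^2 for one sigmoid neuron with output a = sigmoid(w*x + b)."""
    a = sigmoid(w * x + b)
    return 0.5 * (y - a) ** 2


def gradients(w, b, x, y):
    """Analytic (dC/dw, dC/db) via the chain rule, using sigma'(z) = a * (1 - a)."""
    a = sigmoid(w * x + b)
    delta = (a - y) * a * (1.0 - a)
    return delta * x, delta


def numeric_gradients(w, b, x, y, eps=1e-5):
    """Central finite differences, used only to sanity-check the analytic result."""
    dw = (single_example_cost(w + eps, b, x, y)
          - single_example_cost(w - eps, b, x, y)) / (2 * eps)
    db = (single_example_cost(w, b + eps, x, y)
          - single_example_cost(w, b - eps, x, y)) / (2 * eps)
    return dw, db


analytic = gradients(w=0.6, b=0.9, x=1.0, y=0.0)
numeric = numeric_gradients(w=0.6, b=0.9, x=1.0, y=0.0)
print(analytic, numeric)
```

Comparing analytic and finite-difference gradients is a common way to check a backpropagation implementation before trusting it on a full network.
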

Editor's Notes

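  1. How can such a network be trained? The network has the same structure as described earlier. Initially, the weights and biases of the network are randomly initialized. When one training example is fed in, the output we want is the target vector shown above. Because the network is randomly initialized, however, it is very unlikely to produce that result, so its current output will be some arbitrary vector. The cost function for this network is defined as follows: the average, over all training data, of the difference between the output vector and the target vector. Since the goal stated earlier is to minimize the difference between target and output, training the network is treated as the problem of minimizing the cost function.
  2. Computing the output of the 13-layer convolutional neural network requires about 30 billion operations.
  3. Selection: from the current state, search the upcoming moves along a particular path (to a fixed depth, i.e., a set number of moves ahead). Expansion: expand the last node of the best move among those searched from the current state. Evaluation: predict the final result (end of the game) from the expanded move (fast rollout: judges only a local 3-by-3 region rather than the situation on the whole board). Backup: update the weights along the path from the current state to the expanded node.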