Case Study of CNN
from LeNet to ResNet
NamHyuk Ahn @ Ajou Univ.
2016. 03. 09
Convolutional Neural Network
Convolution Layer
- Convolve the image with a filter (a 3-dim dot product at each position)
- Stack several filters in one layer (see the blue and green outputs;
each output is called a channel)
Convolution Layer
- Local Connectivity
• Instead of connecting every pixel to every neuron, connect each neuron
only to a local region of the input (called the receptive field)
• This greatly reduces the number of parameters
- Parameter sharing
• To reduce parameters further, every position within an output channel
uses the same filter (# of filters == # of output channels)
Convolution Layer
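- Example: 1st conv layer in AlexNet
• Input: [224, 224, 3], filter: [11x11x3] x 96, stride 4, output: [55, 55, 96]
• (The arithmetic gives exactly 55 for a 227x227 input; 224 is the size quoted in the paper)
- Each filter extracts a different feature (e.g. horizontal
edge, vertical edge…)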
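The output size in the AlexNet example follows from the usual convolution arithmetic. A minimal sketch, assuming no padding and a stride of 4 (taken from the AlexNet paper, not from the slide); the helper names are hypothetical:

```python
def conv_output_size(input_size, filter_size, stride, padding=0):
    """Spatial output size of a convolution: (W - F + 2P) / S + 1."""
    span = input_size - filter_size + 2 * padding
    if span % stride != 0:
        raise ValueError("filter does not tile the input evenly")
    return span // stride + 1


def conv_param_count(filter_size, in_channels, num_filters):
    """Weights of all filters plus one bias per filter."""
    return filter_size * filter_size * in_channels * num_filters + num_filters


# A 227x227 input gives the 55x55 output; 224 does not divide evenly.
print(conv_output_size(227, 11, 4))   # 55
print(conv_param_count(11, 3, 96))    # 34944
```

Because the filter is shared over all positions, the layer needs only about 35k parameters regardless of the image size.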
Pooling Layer
- Downsample the feature map to reduce computation and the parameters of later layers
- Usually max pooling is used (take the maximum value in each
region)
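As a small illustration of max pooling, the sketch below pools a 4x4 map with a 2x2 window and stride 2. The window size and stride are assumptions chosen for the example; the function name is hypothetical.

```python
def max_pool_2x2(feature_map):
    """2x2 max pooling with stride 2 on a 2-D list (even height and width assumed)."""
    pooled = []
    for r in range(0, len(feature_map), 2):
        row = []
        for c in range(0, len(feature_map[0]), 2):
            window = (feature_map[r][c], feature_map[r][c + 1],
                      feature_map[r + 1][c], feature_map[r + 1][c + 1])
            row.append(max(window))
        pooled.append(row)
    return pooled


fmap = [[1, 3, 2, 4],
        [5, 6, 7, 8],
        [3, 2, 1, 0],
        [1, 2, 3, 4]]
print(max_pool_2x2(fmap))  # [[6, 8], [3, 4]]
```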
ReLU, FC Layer
- ReLU
• An activation function, used in place of sigmoid, tanh…
- Fully-connected Layer
• Same as in an ordinary neural network
Convolutional Neural Network
Training CNN
1. Calculate the loss function with forward-prop
2. Optimize the parameters w.r.t. the loss function with back-prop
• Use a gradient descent method (SGD)
• The gradient of each weight is calculated with the chain rule of partial derivatives
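The two training steps can be sketched on a toy problem. This is not a CNN: it fits a single weight w so that w * x matches y, with per-sample updates in a fixed order. The data, learning rate and epoch count are made up for illustration.

```python
# Fit y = w * x to data generated with w = 2.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
w = 0.0
learning_rate = 0.05

for epoch in range(200):
    for x, y in data:
        pred = w * x                    # forward pass
        loss = (pred - y) ** 2          # squared-error loss
        dloss_dpred = 2 * (pred - y)    # d(loss)/d(pred)
        dpred_dw = x                    # d(pred)/d(w)
        grad = dloss_dpred * dpred_dw   # chain rule
        w -= learning_rate * grad       # gradient descent step

print(round(w, 4))
```

In a real network the same chain rule is applied layer by layer, from the loss back to every weight.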
ILSVRC trend
AlexNet (2012)
(ILSVRC 2012 winner)
AlexNet
- ReLU
- Data augmentation
- Dropout
- Ensemble of CNNs (top-5 error: 1 CNN 18.2%, 7 CNNs 15.4%)
AlexNet
- Other methods (not covered today)
• SGD + momentum (+ mini-batch)
• Multiple GPU
• Weight Decay
• Local Response Normalization
Problems of sigmoid
- Vanishing gradient
• When the gradient passes through a sigmoid it can vanish,
because the local gradient of the sigmoid is almost
zero when the unit saturates.
- Output is not zero-centered
• This makes optimization less efficient and hurts performance
ReLU
- SGD converges faster than with sigmoid-like activations
- Computationally cheap
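To make the vanishing-gradient point concrete, the sketch below compares the local gradient of the sigmoid with that of ReLU at a few inputs. The function names are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_grad(x):
    s = sigmoid(x)
    return s * (1.0 - s)   # at most 0.25, reached at x = 0


def relu_grad(x):
    return 1.0 if x > 0 else 0.0


for x in (0.0, 5.0, 10.0):
    print(x, sigmoid_grad(x), relu_grad(x))
```

The sigmoid gradient never exceeds 0.25 and shrinks quickly as the input grows, so a product of many such factors tends toward zero; the ReLU gradient stays at 1 for any positive input.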
Data augmentation
- Randomly crop [224, 224] patches from [256, 256] images
- At test time, take 5 crops and average their predictions
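A sketch of the crop positions, assuming the five test-time crops are the four corners and the center (as in the AlexNet paper; the slide does not say which five). Only crop coordinates are computed; loading images and running the network are left out, and all names are hypothetical.

```python
import random


def random_crop_origin(image_size=256, crop_size=224, rng=random):
    """Top-left corner of a random crop_size x crop_size window (training)."""
    limit = image_size - crop_size
    return rng.randint(0, limit), rng.randint(0, limit)


def five_crop_origins(image_size=256, crop_size=224):
    """Top-left corners of the four corner crops and the center crop (test)."""
    limit = image_size - crop_size
    center = limit // 2
    return [(0, 0), (0, limit), (limit, 0), (limit, limit), (center, center)]


def average_predictions(predictions):
    """Average per-class scores over the crops."""
    n = len(predictions)
    return [sum(scores) / n for scores in zip(*predictions)]


print(five_crop_origins())  # [(0, 0), (0, 32), (32, 0), (32, 32), (16, 16)]
```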
Dropout
- Similar to bagging (an approximation of bagging)
- Acts as a regularizer (reduces overfitting)
- Instead of using all neurons, randomly “drop out” some neurons
(usually with probability 0.5)
Dropout
• At test time, do not drop neurons; use all of them, with their
outputs scaled by the keep probability (usually 0.5)
• The scaled output is the expected value of each neuron's output under dropout
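The train/test behaviour of dropout can be sketched on a plain list of activations. This follows the scheme on the slide (scale at test time); the function names are hypothetical.

```python
import random


def dropout_train(activations, keep_prob=0.5, rng=random):
    """Training: zero each activation with probability 1 - keep_prob."""
    return [a if rng.random() < keep_prob else 0.0 for a in activations]


def dropout_test(activations, keep_prob=0.5):
    """Test: keep every neuron, scaled by its probability of being kept."""
    return [a * keep_prob for a in activations]


print(dropout_train([1.0, 2.0, 3.0, 4.0]))  # some entries zeroed at random
print(dropout_test([1.0, 2.0, 3.0, 4.0]))   # [0.5, 1.0, 1.5, 2.0]
```

Averaged over many random masks, the training-time output of a neuron equals its test-time output, which is why the scaling factor is the keep probability.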
Architecture
- conv - pool - … - fc - softmax (similar to LeNet)
- Use large filters (e.g. 11x11)
Architecture
- Weights must be initialized randomly
• If not, all neurons receive the same gradient
• Usually a Gaussian distribution with std = 0.01 is used
- Use mini-batch SGD with momentum to
update the weights
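A sketch of the initialization and the momentum update on a flat list of weights. The learning rate and momentum defaults are illustrative assumptions, and the function names are hypothetical.

```python
import random


def init_weights(n, std=0.01, rng=random):
    """Small Gaussian initial weights, so that neurons start out different."""
    return [rng.gauss(0.0, std) for _ in range(n)]


def momentum_step(weights, grads, velocity, learning_rate=0.01, momentum=0.9):
    """One momentum update: v <- m * v - lr * g, then w <- w + v."""
    new_velocity = [momentum * v - learning_rate * g
                    for v, g in zip(velocity, grads)]
    new_weights = [w + v for w, v in zip(weights, new_velocity)]
    return new_weights, new_velocity


weights = init_weights(3)
velocity = [0.0] * 3
weights, velocity = momentum_step(weights, [0.5, -0.5, 0.0], velocity)
```

In mini-batch SGD, `grads` would be the gradient averaged over one mini-batch rather than over the whole training set.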
VGGNet (2014)
(ILSVRC 2014 2nd)
VGGNet
- Use small kernels (always 3x3)
• Stacking them allows more non-linearities (e.g. ReLU)
• Fewer weights to train
- Heavier data augmentation (more than AlexNet)
- Ensemble of 7 models (ILSVRC submission 7.3%)
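The "fewer weights" claim can be checked by comparing three stacked 3x3 layers with a single 7x7 layer, which covers the same receptive field. The channel width of 64 is an arbitrary choice for illustration, biases are ignored, and the helper names are hypothetical.

```python
def conv_weights(kernel, in_channels, out_channels):
    """Weight count of one conv layer, biases ignored."""
    return kernel * kernel * in_channels * out_channels


def stacked_receptive_field(kernel, num_layers):
    """Receptive field of num_layers stacked stride-1 convolutions."""
    return num_layers * (kernel - 1) + 1


C = 64  # illustrative channel width
three_3x3 = 3 * conv_weights(3, C, C)
one_7x7 = conv_weights(7, C, C)
print(stacked_receptive_field(3, 3), three_3x3, one_7x7)  # 7 110592 200704
```

The stack needs 27 C^2 weights against 49 C^2 for the single layer, and it applies three non-linearities instead of one.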
Architecture
- Most of the memory is needed in the early layers; most of the
parameters are in the fc layers.
GoogLeNet - Inception v1 (2014)
(ILSVRC 2014 winner)
GoogLeNet
Inception module
- Use 1x1, 3x3 and 5x5 conv in parallel to capture
a variety of structures
- 1x1 captures dense structure; 3x3 and 5x5 capture
more spread-out structure
- Computationally expensive
• Use 1x1 conv layers to reduce the dimension
(details are explained later, in ResNet)
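A rough cost estimate shows why the 1x1 reduction helps. The sizes below (a 28x28 map, 192 input channels, reduction to 16 channels, 32 output channels) are assumptions chosen for illustration, and the helper name is hypothetical.

```python
def conv_multiply_adds(height, width, kernel, in_channels, out_channels):
    """Multiply-adds of a stride-1 convolution that preserves spatial size."""
    return height * width * out_channels * kernel * kernel * in_channels


# Illustrative sizes (assumed, not from the slide): 28x28 map, 192 -> 32 channels.
direct = conv_multiply_adds(28, 28, 5, 192, 32)
reduced = (conv_multiply_adds(28, 28, 1, 192, 16)    # 1x1 reduction to 16 channels
           + conv_multiply_adds(28, 28, 5, 16, 32))  # 5x5 on the reduced map
print(direct, reduced)  # 120422400 12443648
```

With these sizes the reduced path costs roughly a tenth of the direct 5x5 convolution.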
Auxiliary Classifiers
- A deep network raises concern about how effectively the
gradient propagates in backprop
- The auxiliary losses are added to the total loss (weighted by
0.3); the auxiliary classifiers are removed at test time
Average Pooling
- Proposed in Network in Network (also used in
GoogLeNet)
- Problems of the fc layer
• Needs lots of parameters, easy to overfit
- Replace the fc layer with average pooling
Average Pooling
- Make the last conv layer output as many channels as there are classes
- Compute the average of each channel and pass it to softmax
- Reduces overfitting
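A sketch of global average pooling followed by softmax, with one small feature map per class. The inputs are made-up numbers and the function names are hypothetical.

```python
import math


def global_average_pool(feature_maps):
    """Average each channel (a 2-D list) down to a single number."""
    return [sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
            for channel in feature_maps]


def softmax(scores):
    """Numerically stable softmax over a list of class scores."""
    shift = max(scores)
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Two classes, so the last conv layer is assumed to output two channels.
maps = [[[1, 3], [5, 7]],
        [[0, 0], [0, 4]]]
scores = global_average_pool(maps)
print(scores)           # [4.0, 1.0]
print(softmax(scores))
```

The pooling step has no parameters at all, which is where the saving over an fc layer comes from.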
MSRA ResNet (2015)
(ILSVRC 2015 winner)
Before ResNet
- Need to know about
• PReLU
• Xavier Initialization
• Batch Normalization
PReLU
- Adaptive version of ReLU
- The slope of the function for x < 0 is learned
- Slightly more parameters (# of layers x # of channels)
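PReLU for a single input can be written as below. The slope value 0.25 in the usage lines is an illustrative starting value, and the function names are hypothetical.

```python
def prelu(x, slope):
    """PReLU: identity for x > 0, learned slope for x <= 0."""
    return x if x > 0 else slope * x


def prelu_slope_grad(x):
    """d prelu / d slope, used to train the slope by backprop."""
    return 0.0 if x > 0 else x


print(prelu(2.0, 0.25))    # 2.0
print(prelu(-2.0, 0.25))   # -0.5
print(prelu(-2.0, 0.0))    # -0.0, i.e. plain ReLU when the slope is zero
```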
Xavier Initialization
- With a small-std Gaussian init, the outputs of neurons
become nearly zero when the network is deep
- If std is increased (1.0), the outputs saturate at -1 or 1
- Xavier init chooses the initial scale from the number of input
neurons
- It works well, but it assumes a linear
activation, so it does not suit ReLU-like networks
[Figure labels: output is saturated; output is vanished]
Xavier Initialization / 2
[Figure labels: Xavier Initialization; Xavier Initialization / 2]
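The two schemes differ only in the standard deviation of the Gaussian. A sketch, assuming the fan-in variant of Xavier init (variance 1 / fan_in) and reading "Xavier / 2" as halving fan_in, which gives variance 2 / fan_in as proposed for ReLU networks in the PReLU paper; the function names are hypothetical.

```python
import math
import random


def xavier_std(fan_in):
    """Xavier (fan-in variant): Var(w) = 1 / fan_in."""
    return math.sqrt(1.0 / fan_in)


def xavier_over_2_std(fan_in):
    """'Xavier / 2' for ReLU: fan_in is halved, so Var(w) = 2 / fan_in."""
    return math.sqrt(2.0 / fan_in)


def init_layer(fan_in, fan_out, std, rng=random):
    """fan_in x fan_out weight matrix drawn from N(0, std^2)."""
    return [[rng.gauss(0.0, std) for _ in range(fan_out)] for _ in range(fan_in)]


weights = init_layer(512, 256, xavier_over_2_std(512))
print(xavier_std(512), xavier_over_2_std(512))
```

The factor of 2 compensates for ReLU zeroing roughly half of the inputs, which would otherwise halve the variance at every layer.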
Batch Normalization
- Make the outputs follow a unit Gaussian distribution; full
normalization is costly, so simplify:
• Compute mean and variance for each dimension independently (assume the
dimensions are uncorrelated)
• Compute mean and variance over the mini-batch (not the entire set)
- Normalization constrains the non-linearity, and assuming
uncorrelated dimensions constrains the network
• So apply a linear transform to the normalized output (scale and shift are learned parameters)
Batch Normalization
- At test time, use the mean and variance of the entire set (estimated
with a moving average)
- BN acts as a regularizer (Dropout may not be needed)
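A sketch of batch normalization for a single feature, covering the mini-batch statistics, the learned scale and shift (called gamma and beta here), and the moving average used at test time. The eps and momentum values are illustrative assumptions, and the function names are hypothetical.

```python
import math


def batch_norm_train(batch, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize one feature over a mini-batch, then scale and shift."""
    n = len(batch)
    mean = sum(batch) / n
    var = sum((x - mean) ** 2 for x in batch) / n
    out = [gamma * (x - mean) / math.sqrt(var + eps) + beta for x in batch]
    return out, mean, var


def update_running(running, batch_stat, momentum=0.9):
    """Moving average of a batch statistic, kept for test time."""
    return momentum * running + (1.0 - momentum) * batch_stat


def batch_norm_test(x, running_mean, running_var, gamma=1.0, beta=0.0, eps=1e-5):
    """Test time: normalize with the stored statistics, not the batch."""
    return gamma * (x - running_mean) / math.sqrt(running_var + eps) + beta


out, mean, var = batch_norm_train([1.0, 2.0, 3.0])
running_mean = update_running(0.0, mean)
running_var = update_running(1.0, var)
```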
ResNet
ResNet
Problem of degradation
- More depth gives more accuracy, but gradients in a deep
network can vanish/explode
• BN, Xavier init and Dropout can handle this (up to ~30 layers)
- With even more depth, the degradation problem occurs
• It is not just overfitting: the training error also increases
Deep Residual Learning
- Add F(x) and the shortcut connection element-wise, then
pass the sum through a ReLU non-linearity
- When the dims of x and F(x) differ (the number of channels changes),
linearly project x to match (done by 1x1 conv)
- Similar to LSTM
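The residual connection can be sketched on plain vectors. Here `residual_fn` stands in for the stacked conv layers F and `project` for the 1x1 conv on the shortcut; both are hypothetical placeholders supplied by the caller.

```python
def relu(v):
    return [max(0.0, x) for x in v]


def residual_block(x, residual_fn, project=None):
    """y = ReLU(F(x) + shortcut(x)); project stands in for the 1x1 conv."""
    fx = residual_fn(x)
    shortcut = project(x) if project is not None else x
    if len(fx) != len(shortcut):
        raise ValueError("F(x) and the shortcut must have the same dimension")
    return relu([f + s for f, s in zip(fx, shortcut)])


# If F outputs zeros, the block reduces to ReLU(x).
print(residual_block([1.0, -2.0], lambda v: [0.0] * len(v)))  # [1.0, 0.0]
```

Because the block only has to learn the residual F(x), driving F toward zero is enough to pass the input through almost unchanged.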
Deeper Bottleneck
- To reduce training time, the block is changed to a bottleneck design
(purely for economical reasons)
• 3x3x64x64 + 3x3x64x64 = 73,728 (left)
• 1x1x256x64 + 3x3x64x64 + 1x1x64x256 = 69,632 (right)
• More width (channels) on the right, but a similar number of parameters
• A similar method is also used in GoogLeNet
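The comparison can be reproduced by counting weights as kernel x kernel x input channels x output channels, with biases ignored. Multiplying every term by a common factor leaves the comparison unchanged. The helper name is hypothetical.

```python
def conv_weights(kernel, in_channels, out_channels):
    """Weight count of one conv layer, biases ignored."""
    return kernel * kernel * in_channels * out_channels


# Left: two 3x3 layers on 64 channels.
plain = conv_weights(3, 64, 64) + conv_weights(3, 64, 64)

# Right: 1x1 down to 64 channels, 3x3 on 64 channels, 1x1 back up to 256.
bottleneck = (conv_weights(1, 256, 64)
              + conv_weights(3, 64, 64)
              + conv_weights(1, 64, 256))

print(plain, bottleneck)  # 73728 69632
```

The bottleneck block works on four times as many channels at its input and output for about the same number of weights.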
ResNet
- Data augmentation as in AlexNet
- Batch Normalization (no dropout)
- Xavier / 2 initialization
- Average pooling
- Structure follows the VGGNet style
Conclusion
Top-5 error
- AlexNet (2012): 15.31%
- VGGNet (2014): 7.32%
- Inception-V1 (2014): 6.66%
- Human: 5.1%
- PReLU-net (2015): 4.94%
- BN-Inception (2015): 4.82%
- ResNet-152 (2015): 3.57%
- Inception-ResNet (2016): 3.1%
Conclusion
- Dropout, BN
- ReLU-like activation (e.g. PReLU, ELU..)
- Xavier initialization
- Average pooling
- Use pre-trained model :)
Reference
- Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton. "Imagenet classification with deep
convolutional neural networks." Advances in neural information processing systems. 2012.
- Simonyan, Karen, and Andrew Zisserman. "Very deep convolutional networks for large-scale image
recognition." arXiv preprint arXiv:1409.1556 (2014).
- Lin, Min, Qiang Chen, and Shuicheng Yan. "Network in network." arXiv preprint arXiv:1312.4400 (2013).
- He, Kaiming, et al. "Delving deep into rectifiers: Surpassing human-level performance on imagenet
classification." Proceedings of the IEEE International Conference on Computer Vision. 2015.
- He, Kaiming, et al. "Deep Residual Learning for Image Recognition." arXiv preprint arXiv:1512.03385
(2015).
- Szegedy, Christian, Sergey Ioffe, and Vincent Vanhoucke. "Inception-v4, Inception-ResNet and the
Impact of Residual Connections on Learning." arXiv preprint arXiv:1602.07261 (2016).
- Gu, Jiuxiang, et al. "Recent Advances in Convolutional Neural Networks." arXiv preprint
arXiv:1512.07108 (2015). (good for tutorial)
- Also Thanks to CS231n, I used some figures in CS231n lecture slides. 

see http://cs231n.stanford.edu/index.html

More Related Content

What's hot

Cnn
CnnCnn
Machine Learning - Convolutional Neural Network
Machine Learning - Convolutional Neural NetworkMachine Learning - Convolutional Neural Network
Machine Learning - Convolutional Neural Network
Richard Kuo
 
Convolutional Neural Network and Its Applications
Convolutional Neural Network and Its ApplicationsConvolutional Neural Network and Its Applications
Convolutional Neural Network and Its Applications
Kasun Chinthaka Piyarathna
 
Cnn method
Cnn methodCnn method
Cnn method
AmirSajedi1
 
Convolutional Neural Networks : Popular Architectures
Convolutional Neural Networks : Popular ArchitecturesConvolutional Neural Networks : Popular Architectures
Convolutional Neural Networks : Popular Architectures
ananth
 
AlexNet
AlexNetAlexNet
AlexNet
Bertil Hatt
 
Deep Learning - CNN and RNN
Deep Learning - CNN and RNNDeep Learning - CNN and RNN
Deep Learning - CNN and RNN
Ashray Bhandare
 
Variational Autoencoder
Variational AutoencoderVariational Autoencoder
Variational Autoencoder
Mark Chang
 
Introduction to CNN
Introduction to CNNIntroduction to CNN
Introduction to CNN
Shuai Zhang
 
[PR12] You Only Look Once (YOLO): Unified Real-Time Object Detection
[PR12] You Only Look Once (YOLO): Unified Real-Time Object Detection[PR12] You Only Look Once (YOLO): Unified Real-Time Object Detection
[PR12] You Only Look Once (YOLO): Unified Real-Time Object Detection
Taegyun Jeon
 
Batch normalization presentation
Batch normalization presentationBatch normalization presentation
Batch normalization presentation
Owin Will
 
Hadoop Pig
Hadoop PigHadoop Pig
Hadoop Pig
Madhur Nawandar
 
Sequence to Sequence Learning with Neural Networks
Sequence to Sequence Learning with Neural NetworksSequence to Sequence Learning with Neural Networks
Sequence to Sequence Learning with Neural Networks
Nguyen Quang
 
Variational Autoencoder Tutorial
Variational Autoencoder Tutorial Variational Autoencoder Tutorial
Variational Autoencoder Tutorial
Hojin Yang
 
Convolutional Neural Networks
Convolutional Neural NetworksConvolutional Neural Networks
Convolutional Neural Networks
milad abbasi
 
backpropagation in neural networks
backpropagation in neural networksbackpropagation in neural networks
backpropagation in neural networks
Akash Goel
 
Naive Bayes
Naive BayesNaive Bayes
Naive Bayes
CloudxLab
 
CNN and its applications by ketaki
CNN and its applications by ketakiCNN and its applications by ketaki
CNN and its applications by ketaki
Ketaki Patwari
 
CNN Algorithm
CNN AlgorithmCNN Algorithm
CNN Algorithm
georgejustymirobi1
 
NAIVE BAYES CLASSIFIER
NAIVE BAYES CLASSIFIERNAIVE BAYES CLASSIFIER
NAIVE BAYES CLASSIFIER
Knoldus Inc.
 

What's hot (20)

Cnn
CnnCnn
Cnn
 
Machine Learning - Convolutional Neural Network
Machine Learning - Convolutional Neural NetworkMachine Learning - Convolutional Neural Network
Machine Learning - Convolutional Neural Network
 
Convolutional Neural Network and Its Applications
Convolutional Neural Network and Its ApplicationsConvolutional Neural Network and Its Applications
Convolutional Neural Network and Its Applications
 
Cnn method
Cnn methodCnn method
Cnn method
 
Convolutional Neural Networks : Popular Architectures
Convolutional Neural Networks : Popular ArchitecturesConvolutional Neural Networks : Popular Architectures
Convolutional Neural Networks : Popular Architectures
 
AlexNet
AlexNetAlexNet
AlexNet
 
Deep Learning - CNN and RNN
Deep Learning - CNN and RNNDeep Learning - CNN and RNN
Deep Learning - CNN and RNN
 
Variational Autoencoder
Variational AutoencoderVariational Autoencoder
Variational Autoencoder
 
Introduction to CNN
Introduction to CNNIntroduction to CNN
Introduction to CNN
 
[PR12] You Only Look Once (YOLO): Unified Real-Time Object Detection
[PR12] You Only Look Once (YOLO): Unified Real-Time Object Detection[PR12] You Only Look Once (YOLO): Unified Real-Time Object Detection
[PR12] You Only Look Once (YOLO): Unified Real-Time Object Detection
 
Batch normalization presentation
Batch normalization presentationBatch normalization presentation
Batch normalization presentation
 
Hadoop Pig
Hadoop PigHadoop Pig
Hadoop Pig
 
Sequence to Sequence Learning with Neural Networks
Sequence to Sequence Learning with Neural NetworksSequence to Sequence Learning with Neural Networks
Sequence to Sequence Learning with Neural Networks
 
Variational Autoencoder Tutorial
Variational Autoencoder Tutorial Variational Autoencoder Tutorial
Variational Autoencoder Tutorial
 
Convolutional Neural Networks
Convolutional Neural NetworksConvolutional Neural Networks
Convolutional Neural Networks
 
backpropagation in neural networks
backpropagation in neural networksbackpropagation in neural networks
backpropagation in neural networks
 
Naive Bayes
Naive BayesNaive Bayes
Naive Bayes
 
CNN and its applications by ketaki
CNN and its applications by ketakiCNN and its applications by ketaki
CNN and its applications by ketaki
 
CNN Algorithm
CNN AlgorithmCNN Algorithm
CNN Algorithm
 
NAIVE BAYES CLASSIFIER
NAIVE BAYES CLASSIFIERNAIVE BAYES CLASSIFIER
NAIVE BAYES CLASSIFIER
 

Similar to Case Study of Convolutional Neural Network

Hands on machine learning with scikit-learn and tensor flow by ahmed yousry
Hands on machine learning with scikit-learn and tensor flow by ahmed yousryHands on machine learning with scikit-learn and tensor flow by ahmed yousry
Hands on machine learning with scikit-learn and tensor flow by ahmed yousry
Ahmed Yousry
 
Paper study: Learning to solve circuit sat
Paper study: Learning to solve circuit satPaper study: Learning to solve circuit sat
Paper study: Learning to solve circuit sat
ChenYiHuang5
 
Deep Feed Forward Neural Networks and Regularization
Deep Feed Forward Neural Networks and RegularizationDeep Feed Forward Neural Networks and Regularization
Deep Feed Forward Neural Networks and Regularization
Yan Xu
 
Hardware Acceleration for Machine Learning
Hardware Acceleration for Machine LearningHardware Acceleration for Machine Learning
Hardware Acceleration for Machine Learning
CastLabKAIST
 
Convolutional Neural Networks - Veronica Vilaplana - UPC Barcelona 2018
Convolutional Neural Networks - Veronica Vilaplana - UPC Barcelona 2018Convolutional Neural Networks - Veronica Vilaplana - UPC Barcelona 2018
Convolutional Neural Networks - Veronica Vilaplana - UPC Barcelona 2018
Universitat Politècnica de Catalunya
 
Lecture 5: Convolutional Neural Network Models
Lecture 5: Convolutional Neural Network ModelsLecture 5: Convolutional Neural Network Models
Lecture 5: Convolutional Neural Network Models
Mohamed Loey
 
Network Deconvolution review [cdm]
Network Deconvolution review [cdm]Network Deconvolution review [cdm]
Network Deconvolution review [cdm]
Dongmin Choi
 
14_cnn complete.pptx
14_cnn complete.pptx14_cnn complete.pptx
14_cnn complete.pptx
FaizanNadeem10
 
Cerebellar Model Articulation Controller
Cerebellar Model Articulation ControllerCerebellar Model Articulation Controller
Cerebellar Model Articulation Controller
Zahra Sadeghi
 
Parallel Computing 2007: Bring your own parallel application
Parallel Computing 2007: Bring your own parallel applicationParallel Computing 2007: Bring your own parallel application
Parallel Computing 2007: Bring your own parallel application
Geoffrey Fox
 
nural network ER. Abhishek k. upadhyay
nural network ER. Abhishek  k. upadhyaynural network ER. Abhishek  k. upadhyay
nural network ER. Abhishek k. upadhyay
abhishek upadhyay
 
2017 (albawi-alkabi)image-net classification with deep convolutional neural n...
2017 (albawi-alkabi)image-net classification with deep convolutional neural n...2017 (albawi-alkabi)image-net classification with deep convolutional neural n...
2017 (albawi-alkabi)image-net classification with deep convolutional neural n...
ali hassan
 
Lec 6-bp
Lec 6-bpLec 6-bp
Lec 6-bp
Taymoor Nazmy
 
Resnet
ResnetResnet
Introduction to deep learning
Introduction to deep learningIntroduction to deep learning
Introduction to deep learning
Junaid Bhat
 
08 neural networks
08 neural networks08 neural networks
08 neural networks
ankit_ppt
 
Loop parallelization & pipelining
Loop parallelization & pipeliningLoop parallelization & pipelining
Loop parallelization & pipelining
jagrat123
 
convolutional_neural_networks in deep learning
convolutional_neural_networks in deep learningconvolutional_neural_networks in deep learning
convolutional_neural_networks in deep learning
ssusere5ddd6
 
ML Module 3 Non Linear Learning.pptx
ML Module 3 Non Linear Learning.pptxML Module 3 Non Linear Learning.pptx
ML Module 3 Non Linear Learning.pptx
DebabrataPain1
 

Similar to Case Study of Convolutional Neural Network (20)

Hands on machine learning with scikit-learn and tensor flow by ahmed yousry
Hands on machine learning with scikit-learn and tensor flow by ahmed yousryHands on machine learning with scikit-learn and tensor flow by ahmed yousry
Hands on machine learning with scikit-learn and tensor flow by ahmed yousry
 
Paper study: Learning to solve circuit sat
Paper study: Learning to solve circuit satPaper study: Learning to solve circuit sat
Paper study: Learning to solve circuit sat
 
Deep Feed Forward Neural Networks and Regularization
Deep Feed Forward Neural Networks and RegularizationDeep Feed Forward Neural Networks and Regularization
Deep Feed Forward Neural Networks and Regularization
 
CNN.pptx
CNN.pptxCNN.pptx
CNN.pptx
 
Hardware Acceleration for Machine Learning
Hardware Acceleration for Machine LearningHardware Acceleration for Machine Learning
Hardware Acceleration for Machine Learning
 
Convolutional Neural Networks - Veronica Vilaplana - UPC Barcelona 2018
Convolutional Neural Networks - Veronica Vilaplana - UPC Barcelona 2018Convolutional Neural Networks - Veronica Vilaplana - UPC Barcelona 2018
Convolutional Neural Networks - Veronica Vilaplana - UPC Barcelona 2018
 
Lecture 5: Convolutional Neural Network Models
Lecture 5: Convolutional Neural Network ModelsLecture 5: Convolutional Neural Network Models
Lecture 5: Convolutional Neural Network Models
 
Network Deconvolution review [cdm]
Network Deconvolution review [cdm]Network Deconvolution review [cdm]
Network Deconvolution review [cdm]
 
14_cnn complete.pptx
14_cnn complete.pptx14_cnn complete.pptx
14_cnn complete.pptx
 
Cerebellar Model Articulation Controller
Cerebellar Model Articulation ControllerCerebellar Model Articulation Controller
Cerebellar Model Articulation Controller
 
Parallel Computing 2007: Bring your own parallel application
Parallel Computing 2007: Bring your own parallel applicationParallel Computing 2007: Bring your own parallel application
Parallel Computing 2007: Bring your own parallel application
 
nural network ER. Abhishek k. upadhyay
nural network ER. Abhishek  k. upadhyaynural network ER. Abhishek  k. upadhyay
nural network ER. Abhishek k. upadhyay
 
2017 (albawi-alkabi)image-net classification with deep convolutional neural n...
2017 (albawi-alkabi)image-net classification with deep convolutional neural n...2017 (albawi-alkabi)image-net classification with deep convolutional neural n...
2017 (albawi-alkabi)image-net classification with deep convolutional neural n...
 
Lec 6-bp
Lec 6-bpLec 6-bp
Lec 6-bp
 
Resnet
ResnetResnet
Resnet
 
Introduction to deep learning
Introduction to deep learningIntroduction to deep learning
Introduction to deep learning
 
08 neural networks
08 neural networks08 neural networks
08 neural networks
 
Loop parallelization & pipelining
Loop parallelization & pipeliningLoop parallelization & pipelining
Loop parallelization & pipelining
 
convolutional_neural_networks in deep learning
convolutional_neural_networks in deep learningconvolutional_neural_networks in deep learning
convolutional_neural_networks in deep learning
 
ML Module 3 Non Linear Learning.pptx
ML Module 3 Non Linear Learning.pptxML Module 3 Non Linear Learning.pptx
ML Module 3 Non Linear Learning.pptx
 

More from NamHyuk Ahn

Supporting Time-Sensitive Applications on a Commodity OS
Supporting Time-Sensitive Applications on a Commodity OSSupporting Time-Sensitive Applications on a Commodity OS
Supporting Time-Sensitive Applications on a Commodity OS
NamHyuk Ahn
 
TensorFlow Tutorial
TensorFlow TutorialTensorFlow Tutorial
TensorFlow Tutorial
NamHyuk Ahn
 
Generative Adversarial Network (+Laplacian Pyramid GAN)
Generative Adversarial Network (+Laplacian Pyramid GAN)Generative Adversarial Network (+Laplacian Pyramid GAN)
Generative Adversarial Network (+Laplacian Pyramid GAN)
NamHyuk Ahn
 
Single Shot Multibox Detector
Single Shot Multibox DetectorSingle Shot Multibox Detector
Single Shot Multibox Detector
NamHyuk Ahn
 
Multimodal Residual Learning for Visual QA
Multimodal Residual Learning for Visual QAMultimodal Residual Learning for Visual QA
Multimodal Residual Learning for Visual QA
NamHyuk Ahn
 
Google's Multilingual Neural Machine Translation System
Google's Multilingual Neural Machine Translation SystemGoogle's Multilingual Neural Machine Translation System
Google's Multilingual Neural Machine Translation System
NamHyuk Ahn
 
DeconvNet, DecoupledNet, TransferNet in Image Segmentation
DeconvNet, DecoupledNet, TransferNet in Image SegmentationDeconvNet, DecoupledNet, TransferNet in Image Segmentation
DeconvNet, DecoupledNet, TransferNet in Image Segmentation
NamHyuk Ahn
 

More from NamHyuk Ahn (7)

Supporting Time-Sensitive Applications on a Commodity OS
Supporting Time-Sensitive Applications on a Commodity OSSupporting Time-Sensitive Applications on a Commodity OS
Supporting Time-Sensitive Applications on a Commodity OS
 
TensorFlow Tutorial
TensorFlow TutorialTensorFlow Tutorial
TensorFlow Tutorial
 
Generative Adversarial Network (+Laplacian Pyramid GAN)
Generative Adversarial Network (+Laplacian Pyramid GAN)Generative Adversarial Network (+Laplacian Pyramid GAN)
Generative Adversarial Network (+Laplacian Pyramid GAN)
 
Single Shot Multibox Detector
Single Shot Multibox DetectorSingle Shot Multibox Detector
Single Shot Multibox Detector
 
Multimodal Residual Learning for Visual QA
Multimodal Residual Learning for Visual QAMultimodal Residual Learning for Visual QA
Multimodal Residual Learning for Visual QA
 
Google's Multilingual Neural Machine Translation System
Google's Multilingual Neural Machine Translation SystemGoogle's Multilingual Neural Machine Translation System
Google's Multilingual Neural Machine Translation System
 
DeconvNet, DecoupledNet, TransferNet in Image Segmentation
DeconvNet, DecoupledNet, TransferNet in Image SegmentationDeconvNet, DecoupledNet, TransferNet in Image Segmentation
DeconvNet, DecoupledNet, TransferNet in Image Segmentation
 

Recently uploaded

NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...
NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...
NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...
Amil Baba Dawood bangali
 
Nuclear Power Economics and Structuring 2024
Nuclear Power Economics and Structuring 2024Nuclear Power Economics and Structuring 2024
Nuclear Power Economics and Structuring 2024
Massimo Talia
 
Gen AI Study Jams _ For the GDSC Leads in India.pdf
Gen AI Study Jams _ For the GDSC Leads in India.pdfGen AI Study Jams _ For the GDSC Leads in India.pdf
Gen AI Study Jams _ For the GDSC Leads in India.pdf
gdsczhcet
 
Student information management system project report ii.pdf
Student information management system project report ii.pdfStudent information management system project report ii.pdf
Student information management system project report ii.pdf
Kamal Acharya
 
AP LAB PPT.pdf ap lab ppt no title specific
AP LAB PPT.pdf ap lab ppt no title specificAP LAB PPT.pdf ap lab ppt no title specific
AP LAB PPT.pdf ap lab ppt no title specific
BrazilAccount1
 
Cosmetic shop management system project report.pdf
Cosmetic shop management system project report.pdfCosmetic shop management system project report.pdf
Cosmetic shop management system project report.pdf
Kamal Acharya
 
Final project report on grocery store management system..pdf
Final project report on grocery store management system..pdfFinal project report on grocery store management system..pdf
Final project report on grocery store management system..pdf
Kamal Acharya
 
一比一原版(UofT毕业证)多伦多大学毕业证成绩单如何办理
一比一原版(UofT毕业证)多伦多大学毕业证成绩单如何办理一比一原版(UofT毕业证)多伦多大学毕业证成绩单如何办理
一比一原版(UofT毕业证)多伦多大学毕业证成绩单如何办理
ydteq
 
English lab ppt no titlespecENG PPTt.pdf
English lab ppt no titlespecENG PPTt.pdfEnglish lab ppt no titlespecENG PPTt.pdf
English lab ppt no titlespecENG PPTt.pdf
BrazilAccount1
 
ML for identifying fraud using open blockchain data.pptx
ML for identifying fraud using open blockchain data.pptxML for identifying fraud using open blockchain data.pptx
ML for identifying fraud using open blockchain data.pptx
Vijay Dialani, PhD
 
Governing Equations for Fundamental Aerodynamics_Anderson2010.pdf
Governing Equations for Fundamental Aerodynamics_Anderson2010.pdfGoverning Equations for Fundamental Aerodynamics_Anderson2010.pdf
Governing Equations for Fundamental Aerodynamics_Anderson2010.pdf
WENKENLI1
 
J.Yang, ICLR 2024, MLILAB, KAIST AI.pdf
J.Yang,  ICLR 2024, MLILAB, KAIST AI.pdfJ.Yang,  ICLR 2024, MLILAB, KAIST AI.pdf
J.Yang, ICLR 2024, MLILAB, KAIST AI.pdf
MLILAB
 
weather web application report.pdf
weather web application report.pdfweather web application report.pdf
weather web application report.pdf
Pratik Pawar
 
Standard Reomte Control Interface - Neometrix
Standard Reomte Control Interface - NeometrixStandard Reomte Control Interface - Neometrix
Standard Reomte Control Interface - Neometrix
Neometrix_Engineering_Pvt_Ltd
 
The Benefits and Techniques of Trenchless Pipe Repair.pdf
The Benefits and Techniques of Trenchless Pipe Repair.pdfThe Benefits and Techniques of Trenchless Pipe Repair.pdf
The Benefits and Techniques of Trenchless Pipe Repair.pdf
Pipe Restoration Solutions
 
Fundamentals of Electric Drives and its applications.pptx
Fundamentals of Electric Drives and its applications.pptxFundamentals of Electric Drives and its applications.pptx
Fundamentals of Electric Drives and its applications.pptx
manasideore6
 
block diagram and signal flow graph representation
block diagram and signal flow graph representationblock diagram and signal flow graph representation
block diagram and signal flow graph representation
Divya Somashekar
 
CFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptx
CFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptxCFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptx
CFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptx
R&R Consult
 
ethical hacking in wireless-hacking1.ppt
ethical hacking in wireless-hacking1.pptethical hacking in wireless-hacking1.ppt
ethical hacking in wireless-hacking1.ppt
Jayaprasanna4
 
Pile Foundation by Venkatesh Taduvai (Sub Geotechnical Engineering II)-conver...
Pile Foundation by Venkatesh Taduvai (Sub Geotechnical Engineering II)-conver...Pile Foundation by Venkatesh Taduvai (Sub Geotechnical Engineering II)-conver...
Pile Foundation by Venkatesh Taduvai (Sub Geotechnical Engineering II)-conver...
AJAYKUMARPUND1
 

Recently uploaded (20)

NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...
NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...
NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...
 
Nuclear Power Economics and Structuring 2024
Nuclear Power Economics and Structuring 2024Nuclear Power Economics and Structuring 2024
Nuclear Power Economics and Structuring 2024
 
Gen AI Study Jams _ For the GDSC Leads in India.pdf
Gen AI Study Jams _ For the GDSC Leads in India.pdfGen AI Study Jams _ For the GDSC Leads in India.pdf
Gen AI Study Jams _ For the GDSC Leads in India.pdf
 
Student information management system project report ii.pdf
Student information management system project report ii.pdfStudent information management system project report ii.pdf
Student information management system project report ii.pdf
 
AP LAB PPT.pdf ap lab ppt no title specific
AP LAB PPT.pdf ap lab ppt no title specificAP LAB PPT.pdf ap lab ppt no title specific
AP LAB PPT.pdf ap lab ppt no title specific
 
Cosmetic shop management system project report.pdf
Cosmetic shop management system project report.pdfCosmetic shop management system project report.pdf
Cosmetic shop management system project report.pdf
 
Final project report on grocery store management system..pdf
Final project report on grocery store management system..pdfFinal project report on grocery store management system..pdf
Final project report on grocery store management system..pdf
 
一比一原版(UofT毕业证)多伦多大学毕业证成绩单如何办理
一比一原版(UofT毕业证)多伦多大学毕业证成绩单如何办理一比一原版(UofT毕业证)多伦多大学毕业证成绩单如何办理
一比一原版(UofT毕业证)多伦多大学毕业证成绩单如何办理
 
English lab ppt no titlespecENG PPTt.pdf
English lab ppt no titlespecENG PPTt.pdfEnglish lab ppt no titlespecENG PPTt.pdf
English lab ppt no titlespecENG PPTt.pdf
 
ML for identifying fraud using open blockchain data.pptx
ML for identifying fraud using open blockchain data.pptxML for identifying fraud using open blockchain data.pptx
ML for identifying fraud using open blockchain data.pptx
 
Governing Equations for Fundamental Aerodynamics_Anderson2010.pdf
Governing Equations for Fundamental Aerodynamics_Anderson2010.pdfGoverning Equations for Fundamental Aerodynamics_Anderson2010.pdf
Governing Equations for Fundamental Aerodynamics_Anderson2010.pdf
 
J.Yang, ICLR 2024, MLILAB, KAIST AI.pdf
J.Yang,  ICLR 2024, MLILAB, KAIST AI.pdfJ.Yang,  ICLR 2024, MLILAB, KAIST AI.pdf
J.Yang, ICLR 2024, MLILAB, KAIST AI.pdf
 
weather web application report.pdf
weather web application report.pdfweather web application report.pdf
weather web application report.pdf
 
Standard Reomte Control Interface - Neometrix
Standard Reomte Control Interface - NeometrixStandard Reomte Control Interface - Neometrix
Standard Reomte Control Interface - Neometrix
 
The Benefits and Techniques of Trenchless Pipe Repair.pdf
The Benefits and Techniques of Trenchless Pipe Repair.pdfThe Benefits and Techniques of Trenchless Pipe Repair.pdf
The Benefits and Techniques of Trenchless Pipe Repair.pdf
 
Fundamentals of Electric Drives and its applications.pptx
Fundamentals of Electric Drives and its applications.pptxFundamentals of Electric Drives and its applications.pptx
Fundamentals of Electric Drives and its applications.pptx
 
block diagram and signal flow graph representation
block diagram and signal flow graph representationblock diagram and signal flow graph representation
block diagram and signal flow graph representation
 
CFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptx
CFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptxCFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptx
CFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptx
 
ethical hacking in wireless-hacking1.ppt
ethical hacking in wireless-hacking1.pptethical hacking in wireless-hacking1.ppt
ethical hacking in wireless-hacking1.ppt
 
Pile Foundation by Venkatesh Taduvai (Sub Geotechnical Engineering II)-conver...
Pile Foundation by Venkatesh Taduvai (Sub Geotechnical Engineering II)-conver...Pile Foundation by Venkatesh Taduvai (Sub Geotechnical Engineering II)-conver...
Pile Foundation by Venkatesh Taduvai (Sub Geotechnical Engineering II)-conver...
 

Case Study of Convolutional Neural Network

  • 1. Case Study of CNN from LeNet to ResNet NamHyuk Ahn @ Ajou Univ. 2016. 03. 09
  • 3.
  • 4.
  • 5. Convolution Layer - Convolution (3-dim dot product) image and filter - Stack filter in one layer (See blue and green output, called channel)
  • 6. Convolution Layer - Local Connectivity • Instead connect all pixels to neurons, connect only local region of input (called receptive field) • It can reduce many parameter - Parameter sharing • To reduce parameter, each channel have same filter. (# of filter == # of channel)
  • 7. Convolution Layer - Example) 1st conv layer in AlexNet • Input: [224, 224], filter: [11x11x3], 96, output: [55, 55] - Each filter extract different features (i.e. horizontal edge, vertical edge…)
  • 8. Pooling Layer - Downsample image to reduce parameter - Usually use max pooling (take maximum value in region)
  • 9. ReLU, FC Layer - ReLU • Sort of activation function (e.g. sigmoid, tanh…) - Fully-connected Layer • Same as normal neural network
  • 11. Training CNN 1. Calculate loss function with foward-prop 2. Optimize parameter w.r.t loss function with back- prop • Use gradient descent method (SGD) • Gradient of weight can calculate with chain rule of partial derivate
  • 18. AlexNet - ReLU - Data augmentation - Dropout - Ensemble of CNNs (top-5 error: 1 CNN 18.2%, 7 CNNs 15.4%)
  • 19. AlexNet - Other methods (not covered today) • SGD + momentum (+ mini-batch) • Multiple GPUs • Weight decay • Local Response Normalization
  • 20. Problems of sigmoid - Vanishing gradient • When the gradient passes back through a sigmoid, it can vanish because the local gradient of the sigmoid is almost zero once the unit saturates - Output is not zero-centered • This slows down gradient-descent convergence
  • 21. ReLU - SGD converges faster than with sigmoid-like activations - Computationally cheap
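To make the vanishing-gradient point concrete, the sketch below compares the local gradient of the sigmoid with that of ReLU at a few inputs. It is a minimal illustration; the sample inputs and the depth of ten layers are arbitrary choices, not values from the slides.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_grad(x):
    """Local gradient of the sigmoid: s(x) * (1 - s(x)), at most 0.25."""
    s = sigmoid(x)
    return s * (1.0 - s)


def relu_grad(x):
    """Local gradient of ReLU: 1 for positive inputs, 0 otherwise."""
    return 1.0 if x > 0 else 0.0


for x in (0.0, 2.0, 5.0, 10.0):
    print(f"x={x:5.1f}  sigmoid'={sigmoid_grad(x):.6f}  relu'={relu_grad(x):.1f}")

# Even in the best case (x = 0 at every layer) ten stacked sigmoids
# scale the gradient by 0.25 ** 10, which is below 1e-6.
best_case_chain = sigmoid_grad(0.0) ** 10
```

ReLU passes the gradient through unchanged for positive inputs, which is one reason SGD tends to converge faster with it.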
  • 22. Data augmentation - Randomly crop [224, 224] patches from [256, 256] images - At test time, take 5 crops and average their predictions
  • 23. Dropout - Similar to bagging (an approximation of bagging) - Acts like a regularizer (reduces overfitting) - Instead of using all neurons, randomly “drop out” some of them (usually with probability 0.5)
  • 24. Dropout • At test time, no neurons are dropped; instead each output is scaled by the keep probability (usually 0.5) • The scaled output equals the expected value of that neuron's output during training
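The train/test behaviour of dropout described above can be sketched as follows. This is a minimal standard-library illustration with made-up activations, not the AlexNet implementation; it averages many training-time passes to show that they approach the scaled test-time output.

```python
import random


def dropout_train(activations, keep_prob, rng):
    """Training: zero each activation independently with probability 1 - keep_prob."""
    return [a if rng.random() < keep_prob else 0.0 for a in activations]


def dropout_test(activations, keep_prob):
    """Test: keep every neuron and scale by keep_prob (the expected mask value)."""
    return [a * keep_prob for a in activations]


rng = random.Random(0)
acts = [1.0, 2.0, 3.0, 4.0]   # hypothetical activations
keep_prob = 0.5

# Average many training-time passes over the same activations.
trials = 20000
sums = [0.0] * len(acts)
for _ in range(trials):
    for i, v in enumerate(dropout_train(acts, keep_prob, rng)):
        sums[i] += v
train_mean = [s / trials for s in sums]
test_out = dropout_test(acts, keep_prob)
print(train_mean, test_out)
```

Many frameworks instead divide by the keep probability during training ("inverted dropout") so that the test-time pass needs no scaling; the expected value is matched either way.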
  • 25. Architecture - conv - pool - … - fc - softmax (similar to LeNet) - Uses large filters (e.g. 11x11)
  • 26. Architecture - Weights must be initialized randomly • If not, all neurons receive the same gradient • Usually a Gaussian distribution with std = 0.01 - Use mini-batch SGD with momentum to update the weights
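A momentum update of the kind mentioned above might look like the sketch below. The learning rate, momentum value and the toy loss f(w) = (w - 3)^2 are assumptions chosen for illustration; they are not AlexNet's settings.

```python
def sgd_momentum_step(w, grad, velocity, lr=0.1, momentum=0.9):
    """One momentum update: the velocity accumulates a decaying sum of past gradients."""
    velocity = momentum * velocity - lr * grad
    return w + velocity, velocity


def toy_grad(w):
    """Gradient of the toy loss f(w) = (w - 3)^2."""
    return 2.0 * (w - 3.0)


w, velocity = 0.0, 0.0
for _ in range(200):
    w, velocity = sgd_momentum_step(w, toy_grad(w), velocity)
print(w)  # close to the minimum at w = 3
```

In a real network `grad` would be the mini-batch gradient obtained by back-prop, and `w` and `velocity` would be arrays of the same shape.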
  • 28. VGGNet - Use small kernels (always 3x3) • Stacking them allows multiple non-linearities (e.g. ReLU) • Fewer weights to train - Heavier data augmentation (more than AlexNet) - Ensemble of 7 models (ILSVRC submission: 7.3% top-5 error)
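The "fewer weights" claim can be checked with a little arithmetic: three stacked 3x3 convolutions cover the same 7x7 receptive field as a single 7x7 convolution but use 27C² instead of 49C² weights. The sketch below assumes stride 1, equal input and output channels, and ignores biases; the channel count of 64 is a hypothetical value.

```python
def stacked_receptive_field(kernel, num_layers):
    """Receptive field of num_layers stacked stride-1 convs with the same kernel size."""
    return 1 + num_layers * (kernel - 1)


def conv_weights(kernel, channels, num_layers=1):
    """Weights (biases ignored) of convs with `channels` inputs and outputs."""
    return num_layers * kernel * kernel * channels * channels


C = 64  # hypothetical channel count
rf_stack = stacked_receptive_field(3, 3)   # 7, same as one 7x7 conv
stack_weights = conv_weights(3, C, 3)      # 27 * C^2 = 110592
single_weights = conv_weights(7, C)        # 49 * C^2 = 200704
print(rf_stack, stack_weights, single_weights)
```

The stack also applies three ReLUs instead of one, which is the "multiple non-linearities" point on the slide.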
  • 29. Architecture - Most of the memory is needed in the early layers; most of the parameters are in the fc layers.
  • 30. GoogLeNet - Inception v1 (2014) (ILSVRC 2014 winner)
  • 32. Inception module - Use 1x1, 3x3 and 5x5 convs in parallel to capture a variety of structures - 1x1 captures dense structure; 3x3 and 5x5 capture more spread-out structure - Computationally expensive • Use 1x1 conv layers to reduce dimension (explained in detail later, in ResNet)
  • 33. Auxiliary Classifiers - A deep network raises concerns about how effectively the gradient propagates in backprop - The auxiliary losses are added to the total loss (weighted by 0.3) and removed at test time
  • 34. Average Pooling - Proposed in Network in Network (also used in GoogLeNet) - Problems of the fc layer • Needs lots of parameters, easy to overfit - Replace the fc layer with average pooling
  • 35. Average Pooling - Make the number of channels in the last conv layer equal to the number of classes - Compute the average of each channel and pass it to softmax - Reduces overfitting
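Global average pooling as described above can be sketched in a few lines: one feature map per class, one mean per map, then softmax. The 3-class, 2x2 feature maps below are made-up values for illustration only.

```python
import math


def global_average_pool(feature_maps):
    """feature_maps[c][i][j] -> one mean value per channel c."""
    return [sum(sum(row) for row in fmap) / (len(fmap) * len(fmap[0]))
            for fmap in feature_maps]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical last conv output: 3 channels (one per class), each 2x2.
maps = [
    [[1.0, 3.0], [2.0, 2.0]],   # mean 2.0
    [[0.0, 0.0], [0.0, 4.0]],   # mean 1.0
    [[5.0, 5.0], [5.0, 5.0]],   # mean 5.0
]
scores = global_average_pool(maps)
probs = softmax(scores)
print(scores, probs)
```

The pooling step has no trainable parameters, which is why it overfits less than an fc layer.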
  • 37. Before ResNet - Need to know about • PReLU • Xavier Initialization • Batch Normalization
  • 38. PReLU - Adaptive version of ReLU - The slope of the function for x < 0 is learned - Slightly more parameters (# of layers x # of channels)
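A scalar sketch of PReLU and the gradient used to learn its slope is shown below. The initial slope of 0.25 is the value used in the PReLU paper (He et al., 2015); the function names are illustrative.

```python
def prelu(x, slope):
    """PReLU: identity for x > 0, learned slope for x <= 0."""
    return x if x > 0 else slope * x


def prelu_grad_wrt_slope(x):
    """d prelu / d slope: 0 on the positive side, x on the negative side."""
    return 0.0 if x > 0 else x


slope = 0.25  # initial value from the PReLU paper; updated by back-prop in practice
outputs = [prelu(x, slope) for x in (-2.0, -1.0, 0.5, 3.0)]
print(outputs)
```

With slope fixed at 0 this reduces to ReLU, and with a small fixed slope it is Leaky ReLU; PReLU differs only in that the slope is trained, one per channel.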
  • 39. Xavier Initialization - If weights are initialized from a Gaussian with a small std, neuron outputs become nearly zero when the network is deep - If the std is increased (1.0), outputs saturate at -1 or 1 - Xavier init sets the initial scale from the number of input neurons - Looks fine, but this method assumes a linear activation, so it does not work well in ReLU-like networks
  • 41. Xavier Initialization / 2 [figure panels: Xavier Initialization vs. Xavier Initialization / 2]
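The effect of the two initializations in a ReLU network can be simulated with a small forward pass. The sketch below reads "Xavier / 2" as the He et al. initialization, Var(w) = 2 / n_in (n_in divided by 2), versus Var(w) = 1 / n_in for Xavier; that reading, and the width, depth and batch size, are assumptions for illustration.

```python
import math
import random


def relu_forward_rms(scale_numerator, width=64, depth=10, batch=8, seed=0):
    """RMS activation after `depth` ReLU layers with weight std sqrt(scale_numerator / width)."""
    rng = random.Random(seed)
    std = math.sqrt(scale_numerator / width)
    # Random unit-Gaussian inputs, one row per example.
    x = [[rng.gauss(0.0, 1.0) for _ in range(width)] for _ in range(batch)]
    for _ in range(depth):
        # w[j] holds the incoming weights of output unit j.
        w = [[rng.gauss(0.0, std) for _ in range(width)] for _ in range(width)]
        x = [[max(0.0, sum(xi * wi for xi, wi in zip(row, unit))) for unit in w]
             for row in x]
    total = sum(v * v for row in x for v in row)
    return math.sqrt(total / (batch * width))


xavier_rms = relu_forward_rms(1.0)   # Var(w) = 1 / n_in
he_rms = relu_forward_rms(2.0)       # Var(w) = 2 / n_in ("Xavier / 2")
print(xavier_rms, he_rms, he_rms / xavier_rms)
```

Both runs share a seed, so the weights differ only by a factor of sqrt(2) per layer; since ReLU is positively homogeneous, the final activations differ by sqrt(2)^10 = 32. With Xavier the ReLU halves the signal's second moment at every layer, and the extra factor of 2 compensates for that.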
  • 42. Batch Normalization - Make each layer's output follow a unit Gaussian distribution; full normalization is costly, so • Compute mean and variance per dimension (assuming the dimensions are uncorrelated) • Compute mean and variance over the mini-batch (not the entire set) - Normalizing constrains the non-linearity, and the uncorrelated-dimensions assumption constrains the network • So apply a linear transform to the normalized output (its scale and shift are learned parameters)
  • 43. Batch Normalization - At test time, use the mean and variance of the entire set (estimated with a moving average) - BN acts like a regularizer (Dropout may not be needed)
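The train/test difference in batch normalization can be sketched for a single feature dimension as below. The class name, the momentum of 0.9 and the epsilon are illustrative assumptions; the scale (gamma) and shift (beta) are the learned linear transform, held fixed here.

```python
import math


class BatchNorm1Feature:
    """Batch normalization for one feature dimension (one scalar per example)."""

    def __init__(self, momentum=0.9, eps=1e-5):
        self.gamma = 1.0           # learned scale (held fixed in this sketch)
        self.beta = 0.0            # learned shift (held fixed in this sketch)
        self.momentum = momentum
        self.eps = eps
        self.running_mean = 0.0    # moving averages used at test time
        self.running_var = 1.0

    def forward_train(self, batch):
        """Normalize with mini-batch statistics and update the moving averages."""
        mean = sum(batch) / len(batch)
        var = sum((v - mean) ** 2 for v in batch) / len(batch)
        m = self.momentum
        self.running_mean = m * self.running_mean + (1 - m) * mean
        self.running_var = m * self.running_var + (1 - m) * var
        denom = math.sqrt(var + self.eps)
        return [self.gamma * (v - mean) / denom + self.beta for v in batch]

    def forward_test(self, batch):
        """Normalize with the stored moving averages instead of batch statistics."""
        denom = math.sqrt(self.running_var + self.eps)
        return [self.gamma * (v - self.running_mean) / denom + self.beta for v in batch]


bn = BatchNorm1Feature()
train_out = bn.forward_train([2.0, 4.0, 6.0, 8.0])
test_out = bn.forward_test([5.0])
print(train_out, test_out)
```

Because the test-time pass uses fixed statistics, the output for one example no longer depends on which other examples share its batch.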
  • 46. Problem of degradation - More depth gives more accuracy, but in a deep network the gradient can vanish or explode • BN, Xavier init and Dropout can handle this (up to ~30 layers) - With even more depth, the degradation problem occurs • It is not just overfitting: the training error also increases
  • 47. Deep Residual Learning - Element-wise addition of F(x) and the shortcut connection, then pass through a ReLU non-linearity - If the dims of x and F(x) differ (the number of channels changes), linearly project x to match the dims (done by a 1x1 conv) - Similar to LSTM
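The residual computation y = ReLU(F(x) + x) can be sketched on plain vectors as below. This is an illustration only: `residual_fn` stands in for the stacked conv layers F, and the optional `projection` matrix plays the role of the 1x1 conv used when dimensions differ; both names are hypothetical.

```python
def relu(values):
    return [max(0.0, v) for v in values]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def residual_block(x, residual_fn, projection=None):
    """y = ReLU(F(x) + shortcut), where shortcut is x or a linear projection of x."""
    shortcut = x if projection is None else matvec(projection, x)
    fx = residual_fn(x)
    if len(fx) != len(shortcut):
        raise ValueError("F(x) and the shortcut must have the same dimension")
    return relu([f + s for f, s in zip(fx, shortcut)])


x = [1.0, -2.0, 3.0]

# If F outputs zeros, the block simply passes x through (followed by the ReLU).
identity_out = residual_block(x, lambda v: [0.0] * len(v))

# Dimension change 3 -> 2: the shortcut is projected to match F(x).
proj = [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
projected_out = residual_block(x, lambda v: [0.5, -4.0], projection=proj)
print(identity_out, projected_out)
```

The first call shows why extra residual layers are easy to train: driving F toward zero leaves the block close to an identity mapping.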
  • 48. Deeper Bottleneck - To reduce training time, the block is changed to a bottleneck design (purely for economy) • (3x3x3)x64x64 + (3x3x3)x64x64 = 221184 (left) • (1x1x3)x256x64 + (3x3x3)x64x64 + (1x1x3)x64x256 = 208896 (right) • The right block is wider (more channels) but has a similar number of parameters • A similar method is also used in GoogLeNet
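The parameter comparison can be reproduced in code. The sketch below counts conv weights as kernel x kernel x in_channels x out_channels (biases ignored), which is an assumption about what the slide intends; every term on the slide carries an extra factor of 3, so the slide's totals are exactly three times these counts and the ratio between the two blocks is unchanged.

```python
def conv_weights(kernel, in_channels, out_channels):
    """Weights of a kernel x kernel conv layer, biases ignored."""
    return kernel * kernel * in_channels * out_channels


# Left: two 3x3 convs on 64 channels.
plain = 2 * conv_weights(3, 64, 64)                      # 73728

# Right: 1x1 reduce 256 -> 64, 3x3 on 64, 1x1 restore 64 -> 256.
bottleneck = (conv_weights(1, 256, 64)
              + conv_weights(3, 64, 64)
              + conv_weights(1, 64, 256))                # 69632

SLIDE_FACTOR = 3  # extra factor present in each term on the slide
print(plain, bottleneck, SLIDE_FACTOR * plain, SLIDE_FACTOR * bottleneck)
```

Either way of counting, the bottleneck block handles 256 channels at its input and output for slightly fewer weights than the plain block needs for 64.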
  • 49. ResNet - Data augmentation as in AlexNet - Batch Normalization (no dropout) - Xavier / 2 initialization - Average pooling - Structure follows the VGGNet style
  • 52. Conclusion - Dropout, BN - ReLU-like activation (e.g. PReLU, ELU…) - Xavier initialization - Average pooling - Use a pre-trained model :)
  • 53. Reference - Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton. "Imagenet classification with deep convolutional neural networks." Advances in neural information processing systems. 2012. - Simonyan, Karen, and Andrew Zisserman. "Very deep convolutional networks for large-scale image recognition." arXiv preprint arXiv:1409.1556 (2014). - Lin, Min, Qiang Chen, and Shuicheng Yan. "Network in network." arXiv preprint arXiv:1312.4400 (2013). - He, Kaiming, et al. "Delving deep into rectifiers: Surpassing human-level performance on imagenet classification." Proceedings of the IEEE International Conference on Computer Vision. 2015. - He, Kaiming, et al. "Deep Residual Learning for Image Recognition." arXiv preprint arXiv:1512.03385 (2015). - Szegedy, Christian, Sergey Ioffe, and Vincent Vanhoucke. "Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning." arXiv preprint arXiv:1602.07261 (2016). - Gu, Jiuxiang, et al. "Recent Advances in Convolutional Neural Networks." arXiv preprint arXiv:1512.07108 (2015). (good as a tutorial) - Thanks also to CS231n; some figures are taken from the CS231n lecture slides, see http://cs231n.stanford.edu/index.html