Convolutional Neural Network for Visual Recognition
Outline
• Quick overview of Artificial Neural Network (ANN)
• What is convolution? What is a Convolutional Neural Network (CNN), and why use one?
• How does it work?
• Demo
• Code
• References
• Discussion
7/24/18 Creative Commons BY-SA-NC
Neural Network
source: http://www.kurzweilai.net/images/neuron_structure1.jpg and https://theclevermachine.files.wordpress.com/2014/09/perceptron2.png
Feedforward and Backpropagation
source: https://theclevermachine.wordpress.com/2014/09/11/a-gentle-introduction-to-artificial-neural-networks/
Activation Function
image source: https://www.gabormelli.com/RKB/Neuron_Activation_Function
Why Convolutional Neural Networks?
Image source: https://www.coursera.org/lecture/convolutional-neural-networks/why-convolutions-Xv7B5
• Reduce the number of weights required for training (the same filter weights are shared across all image locations).
• Use filters to capture local information: a more meaningful search that moves from pixel recognition to pattern recognition.
• Sparsity of connections: each output value depends only on a small local region of the input, so most input-to-output connections are absent (equivalent to zero weights). This can improve space and time efficiency.
What is Convolution?
Image source: https://www.youtube.com/watch?v=cOmkIsWfAcg
• In mathematics, a convolution is the integral measuring how much two functions overlap as one passes over the other.
• A convolution is a way of mixing two functions: one is flipped and shifted across the other, and at each shift the pointwise products are summed (or integrated).
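For reference, the two bullets above correspond to the standard textbook definitions of convolution in continuous and discrete form. The variable names t, τ, n, and m are conventional notation and do not come from the slide.

```latex
(f * g)(t) = \int_{-\infty}^{\infty} f(\tau)\, g(t - \tau)\, d\tau
\qquad
(f * g)[n] = \sum_{m=-\infty}^{\infty} f[m]\, g[n - m]
```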
Image Convolution
image source: https://ujjwalkarn.me/2016/08/11/intuitive-explanation-convnets/
• Original image: function f
• Filter: function g
• Image convolution f * g
[Figure: Example: 8. Filters g1, g2, ..., gn are each convolved with the image f to give f * g.]
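A minimal sketch of image convolution as described on this slide, in plain Python. The 5x5 image `f` and the 3x3 filter `g` are small illustrative values, not data from the deck, and the function name `conv2d_valid` is hypothetical. As in most CNN libraries, the filter is slid over the image without being flipped, which is strictly cross-correlation; for a symmetric filter like this one the result is the same.

```python
def conv2d_valid(f, g):
    """Slide filter g over image f (no padding, stride 1).

    At each position, multiply the overlapping values elementwise and sum
    them. The filter is not flipped, as is usual in CNN implementations.
    """
    fh, fw = len(f), len(f[0])
    gh, gw = len(g), len(g[0])
    out = []
    for i in range(fh - gh + 1):
        row = []
        for j in range(fw - gw + 1):
            row.append(sum(f[i + a][j + b] * g[a][b]
                           for a in range(gh) for b in range(gw)))
        out.append(row)
    return out

# Illustrative 5x5 binary image and 3x3 filter (assumed values).
f = [[1, 1, 1, 0, 0],
     [0, 1, 1, 1, 0],
     [0, 0, 1, 1, 1],
     [0, 0, 1, 1, 0],
     [0, 1, 1, 0, 0]]
g = [[1, 0, 1],
     [0, 1, 0],
     [1, 0, 1]]

feature_map = conv2d_valid(f, g)
for row in feature_map:
    print(row)
```

With no padding and stride 1, a 5x5 image and a 3x3 filter give a 3x3 feature map.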
Approach
image source: cs231n_2017_lecture5.pdf slide-38
Convolution
image source: cs231n_2017_lecture5.pdf slide-39
CNN Layers
source: partially from cs231n_2017
A simple ConvNet for CIFAR-10 classification could have the architecture
[INPUT - CONV - RELU - POOL - FC].
In more detail:
• INPUT [e.g. 32x32x3]
• Holds the raw pixel values of the image: width 32, height 32, and three color channels R, G, B.
• CONV layer [32x32x6]
• Holds the output of neurons that are connected to local regions in the input, each computing a dot product between its weights and the small region it is connected to in the input volume. This may result in a volume such as [32x32x6] if we decide to use 6 filters.
• RELU layer [32x32x6]
• Applies an elementwise activation function, such as the max(0,x) thresholding at zero. This leaves the size of the volume unchanged ([32x32x6]).
• POOL layer [16x16x6]
• Performs a downsampling operation along the spatial dimensions (width, height), resulting in a volume such as [16x16x6].
• FC (i.e. fully-connected) layers [400x1] > [120x1] > [84x1]
• Compute the class scores, resulting in a volume of size [1x1x10], where each of the 10 numbers corresponds to a class score, such as among the 10 categories of CIFAR-10. As with ordinary Neural Networks and as the name implies, each neuron in this layer is connected to all the numbers in the previous volume.
Note: the original cs231n notes use 12 filters; this example uses 6 filters.
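The volume sizes listed above can be checked with the usual output-size formula (W - F + 2P) / S + 1. The sketch below assumes hyperparameters the slide does not state: 5x5 filters with padding 2 and stride 1 for CONV, and a 2x2 window with stride 2 for POOL. Under these assumptions the flattened POOL output has 16*16*6 = 1536 values, so the [400x1] input to the FC layers on the slide presumably comes from a deeper stack than the single CONV/POOL pair described here; that reading is an assumption.

```python
def conv_output_size(size, filter_size, stride=1, padding=0):
    """Spatial output size of a CONV or POOL layer: (W - F + 2P) / S + 1."""
    return (size - filter_size + 2 * padding) // stride + 1

# Assumed hyperparameters (not given on the slide).
width = 32          # CIFAR-10 images are 32x32x3
num_filters = 6

conv = conv_output_size(width, filter_size=5, stride=1, padding=2)
pool = conv_output_size(conv, filter_size=2, stride=2, padding=0)

conv_shape = (conv, conv, num_filters)   # RELU leaves this shape unchanged
pool_shape = (pool, pool, num_filters)
flattened = pool * pool * num_filters

print("CONV/RELU:", conv_shape)
print("POOL:     ", pool_shape)
print("Flattened:", flattened)
```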
Convolution
source cs231n
Calculation Demo:
http://cs231n.github.io/convolutional-networks/
image source: cs231n_2017_lecture5.pdf slide-39
Activation Function - ReLU
• Removes negative values (outputs max(0, x)).
• When using ReLU, watch for dead units in the network (units that never activate). If many units are dead during training, consider using leaky ReLU instead.
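A small sketch of the two activation functions mentioned above. The slope `alpha=0.01` for leaky ReLU is a commonly used default and is an assumption here, not a value from the slide.

```python
def relu(x):
    """ReLU: negative inputs become 0, positive inputs pass through."""
    return max(0.0, x)

def leaky_relu(x, alpha=0.01):
    """Leaky ReLU: negative inputs are scaled by a small slope alpha
    instead of being zeroed, so the unit still passes a gradient."""
    return x if x > 0 else alpha * x

activations = [-2.0, -0.5, 0.0, 1.5]
relu_out = [relu(x) for x in activations]
leaky_out = [leaky_relu(x) for x in activations]

print(relu_out)    # negatives are replaced by 0.0
print(leaky_out)   # negatives are scaled by alpha
```

A unit whose input is negative for every training example always outputs 0 under ReLU and stops learning; leaky ReLU keeps a small non-zero response in that case.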
Max-Pooling
Image source: cs231n
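Since the figure for this slide is not reproduced here, the following is a minimal sketch of max-pooling in plain Python. The 2x2 window with stride 2 and the 4x4 input values are illustrative assumptions, and the function name `max_pool2d` is hypothetical.

```python
def max_pool2d(x, size=2, stride=2):
    """Keep the maximum of each size x size window, moving by `stride`."""
    out = []
    for i in range(0, len(x) - size + 1, stride):
        row = []
        for j in range(0, len(x[0]) - size + 1, stride):
            row.append(max(x[i + a][j + b]
                           for a in range(size) for b in range(size)))
        out.append(row)
    return out

# Illustrative 4x4 feature map (assumed values).
x = [[1, 1, 2, 4],
     [5, 6, 7, 8],
     [3, 2, 1, 0],
     [1, 2, 3, 4]]

pooled = max_pool2d(x)
print(pooled)
```

Each 2x2 block is reduced to one number, so the width and height are halved while the depth is unchanged.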
Architecture Example
source: https://medium.com/machine-learning-bites/deeplearning-series-convolutional-neural-networks-a9c2f2ee1524
Conv Layer
image source: cs231n_2017_lecture5.pdf slide-39
Operation – Convolution
image source: https://ujjwalkarn.me/2016/08/11/intuitive-explanation-convnets/
Operation – Activation
Image source: https://ujjwalkarn.me/2016/08/11/intuitive-explanation-convnets/
Operation – Pooling
image source: https://ujjwalkarn.me/2016/08/11/intuitive-explanation-convnets/
Architecture Example
AlexNet - Trained Filters
source: cs231n
Example filters learned by Krizhevsky et al.
Each of the 96 filters shown here is of size
[11x11x3], and each one is shared by the
55*55 neurons in one depth slice. Notice
that the parameter sharing assumption is
relatively reasonable: If detecting a
horizontal edge is important at some location
in the image, it should intuitively be useful at
some other location as well due to the
translationally-invariant structure of images.
There is therefore no need to relearn to
detect a horizontal edge at every one of the
55*55 distinct locations in the Conv layer
output volume.
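The saving from parameter sharing can be made concrete with the numbers in the paragraph above (96 filters of size [11x11x3], 55*55 neurons per depth slice). The sketch assumes one bias per filter when weights are shared and one bias per neuron when they are not; the bias terms are an assumption, as the slide only mentions the filter weights.

```python
filter_h, filter_w, in_depth = 11, 11, 3   # each filter is [11x11x3]
num_filters = 96                           # depth of the Conv output volume
out_h = out_w = 55                         # neurons per depth slice

weights_per_filter = filter_h * filter_w * in_depth

# With sharing: every neuron in a depth slice reuses the same filter.
shared_params = num_filters * (weights_per_filter + 1)

# Without sharing: every neuron would learn its own weights and bias.
num_neurons = out_h * out_w * num_filters
unshared_params = num_neurons * (weights_per_filter + 1)

print("weights per filter:", weights_per_filter)
print("with sharing:      ", shared_params)
print("without sharing:   ", unshared_params)
```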
Summary
source: partially from cs231n_2017_lecture5.pdf slide-76
• Workflow
1. Initialize all filter weights and parameters with random numbers.
2. Use original images as input,
2.1 Apply Filters to Original Image > Conv layer
2.2 Apply Activation Function (e.g. ReLU) to Conv layer > Feature Map
2.3 Apply Pooling Filter to Feature Map > Smaller Feature Map (optional)
2.4 Flatten the Feature Map > Fully Connected Network (FC)
2.5 Apply ANN training (forward and backward propagation) to FC
2.6 Calculate the error, adjust the weights (including the filter weights), and loop over the original images until the probability of the correct class is high.
3. Test the result; if satisfactory, save the filters (weights and parameters) for future use, else loop.
• ConvNets stack CONV, POOL, FC layers
[(CONV-RELU)*N-POOL?]*M-(FC-RELU)*K, SOFTMAX
where N is usually up to ~5, M is large, 0 <= K <= 2
- Trend towards smaller filters and deeper architectures
- Trend towards getting rid of POOL/FC layers (just CONV)
• But:
- Recent advances such as ResNet/GoogLeNet challenge this paradigm.
- The proposed Capsule Neural Network aims to overcome some shortcomings of ConvNets.
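The layer pattern [(CONV-RELU)*N-POOL?]*M-(FC-RELU)*K, SOFTMAX in the summary above can be expanded mechanically. The helper below is a hypothetical illustration that only builds the list of layer names for given N, M, and K; it does not construct or train a network.

```python
def build_pattern(n, m, k, pool=True):
    """Expand [(CONV-RELU)*N-POOL?]*M-(FC-RELU)*K, SOFTMAX into layer names.

    n: CONV-RELU pairs per block, m: number of blocks,
    k: FC-RELU pairs before the final SOFTMAX,
    pool: whether each block ends with the optional POOL layer.
    """
    layers = []
    for _ in range(m):
        layers += ["CONV", "RELU"] * n
        if pool:
            layers.append("POOL")
    layers += ["FC", "RELU"] * k
    layers.append("SOFTMAX")
    return layers

pattern = build_pattern(n=2, m=2, k=1)
print("-".join(pattern))
```

Setting `pool=False` and `k=0` gives the "just CONV" style mentioned in the trend bullets.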
Various CNN Architectures
From https://www.jeremyjordan.me/convnet-architectures/
These architectures serve as rich feature extractors which can be used for image
classification, object detection, image segmentation, and many other
advanced tasks.
Classic network architectures (included for historical purposes)
• [LeNet-5](https://www.jeremyjordan.me/convnet-architectures/#lenet5)
• [AlexNet](https://www.jeremyjordan.me/convnet-architectures/#alexnet)
• [VGG 16](https://www.jeremyjordan.me/convnet-architectures/#vgg16)
Modern network architectures
• [Inception](https://www.jeremyjordan.me/convnet-architectures/#inception)
• [ResNet](https://www.jeremyjordan.me/convnet-architectures/#resnet)
• [DenseNet](https://www.jeremyjordan.me/convnet-architectures/#densenet)
Network Performance
Source: https://www.semanticscholar.org/paper/An-Analysis-of-Deep-Neural-Network-Models-for-Canziani-Paszke/28ee688947cf9d31fc48f07a0497cd75200a9485 and
https://arxiv.org/pdf/1605.07678.pdf
References
• [How to Select Activation Function for Deep Neural Network](https://engmrk.com/activation-function-for-dnn/)
• [Using Convolutional Neural Networks for Image Recognition](https://ip.cadence.com/uploads/901/cnn_wp-pdf)
• [Activation Functions: Neural Networks](https://towardsdatascience.com/activation-functions-neural-networks-1cbd9f8d91d6)
• [Convolutional Neural Networks Tutorial in TensorFlow](http://adventuresinmachinelearning.com/convolutional-neural-networks-tutorial-tensorflow/)
• [Rethinking the Inception Architecture for Computer Vision](https://arxiv.org/pdf/1512.00567.pdf)
Demo
[Demo - filtering](https://ujjwalkarn.me/2016/08/11/intuitive-explanation-convnets/): filtering a building image
[Demo – cs231n](http://cs231n.stanford.edu/): end-to-end architecture in real time
[Demo – convolution calculation](http://cs231n.github.io/convolutional-networks/): dot product
[Demo – cifar10](https://cs.stanford.edu/people/karpathy/convnetjs/demo/cifar10.html): filters and ReLU in detail
Code
[image classification with Tensorflow](https://github.com/rkuo/ml-tensorflow/blob/master/cnn-cifar10/cnn-cifar10-keras-v0.2.0.ipynb): uses TensorFlow locally
[image classification with Keras](https://github.com/rkuo/ml-tensorflow/blob/master/cnn-cifar10/cnn-cifar10-keras-v0.2.0.ipynb): uses Keras locally
[catsdogs](https://github.com/rkuo/fastai/blob/master/lesson1-catsdogs/Fastai_2_Lesson1.ipynb): uses fastai with pre-trained model = resnet34
[tableschairs](https://github.com/rkuo/fastai/blob/master/lesson1-tableschairs/Fastai_2_Lesson1a-tableschairs.ipynb): same approach with the data switched
Image Classification with Tensorflow
Image Classification with Keras
TablesChairs with Fastai
Catsdogs Model with Fastai
Supplement Slides
Why Convolutional Neural Networks?
Image source: https://www.youtube.com/watch?v=QsxKKyhYxFQ
• Reduce the number of weights required for training (the same filter weights are shared across all image locations).
• Use filters to capture local information: a more meaningful search that moves from pixel recognition to pattern recognition.
• Sparsity of connections: each output value depends only on a small local region of the input, so most input-to-output connections are absent (equivalent to zero weights). This can improve space and time efficiency.
LeNet 5
source: Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, Gradient-based learning applied to document recognition, Proc. IEEE 86(11): 2278–2324, 1998.
- 2 Conv
- 2 Subsampling
- 2 FC
- Gaussian connections
Inception v3

BookNet Canada
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance
 

Recently uploaded (20)

FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
 
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMsTo Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
 
IOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptx
IOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptxIOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptx
IOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptx
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
 
Connector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a buttonConnector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a button
 
JMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and GrafanaJMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and Grafana
 
Knowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and backKnowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and back
 
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdfSmart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
 
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
 
UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
 
Accelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish CachingAccelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish Caching
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
 
PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)
 
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
 
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
 
Key Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdfKey Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdf
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
 

Machine Learning - Convolutional Neural Network

  • 1. Convolution Neural Network for Visual Recognition
  • 2. Outline • Quick overview of Artificial Neural Network (ANN) • What is Convolution? Convolutional Neural Network (CNN)? Why? • How it works? • Demo • Code • References • Discussion
  • 3. Neural Network source: http://www.kurzweilai.net/images/neuron_structure1.jpg and https://theclevermachine.files.wordpress.com/2014/09/perceptron2.png
  • 4. Forward Feed and Back Propagation source: https://theclevermachine.wordpress.com/2014/09/11/a-gentle-introduction-to-artificial-neural-networks/
  • 5. Activation Function image source: https://www.gabormelli.com/RKB/Neuron_Activation_Function
  • 6. Why Convolution Neural Network? Image source: https://www.coursera.org/lecture/convolutional-neural-networks/why-convolutions-Xv7B5 • Reduces the number of weights required for training. • Uses filters to capture local information; the search becomes more meaningful, moving from pixel recognition to pattern recognition. • Sparsity of connections: each output depends on only a small region of the input, so most of the weights of an equivalent fully connected layer are 0. This improves space and time efficiency.
  • 7. What is Convolution? Image source: https://www.youtube.com/watch?v=cOmkIsWfAcg • In mathematics, a convolution is the integral measuring how much two functions overlap as one passes over the other. • A convolution is a way of mixing two functions by multiplying them.
  • 8. Image Convolution image source: https://ujjwalkarn.me/2016/08/11/intuitive-explanation-convnets/ • Original image: function f • Filter: function g • Image convolution f * g [Figure labels: Example: 8; f * g; g; g1, g2, ..., gn]
  • 9. Approach image source: cs231n_2017_lecture5.pdf slide-38
  • 10. Convolution image source: cs231n_2017_lecture5.pdf slide-39
  • 11. CNN Layers source: partially from cs231n_2017
    A simple ConvNet for CIFAR-10 classification could have the architecture [INPUT - CONV - RELU - POOL - FC]. In more detail:
      • INPUT [e.g. 32x32x3]: holds the raw pixel values of the image, here of width 32, height 32, and with three color channels R, G, B.
      • CONV layer [32x32x6]: holds the output of neurons that are connected to local regions in the input, each computing a dot product between its weights and the small region it is connected to in the input volume. This may result in a volume such as [32x32x6] if we decide to use 6 filters.
      • RELU layer [32x32x6]: applies an elementwise activation function, such as the max(0,x) thresholding at zero. This leaves the size of the volume unchanged ([32x32x6]).
      • POOL layer [16x16x6]: performs a downsampling operation along the spatial dimensions (width, height), resulting in a volume such as [16x16x6].
      • FC (i.e. fully-connected) layer [400x1] > [120x1] > [84x1]: computes the class scores, resulting in a volume of size [1x1x10], where each of the 10 numbers corresponds to a class score, such as among the 10 categories of CIFAR-10. As with ordinary Neural Networks and as the name implies, each neuron in this layer is connected to all the numbers in the previous volume.
    Note: the original cs231n notes use 12 filters; this slide uses 6.
  • 13. Image source: cs231n_2017_lecture5.pdf slide-39
  • 14. Activation Function - ReLU • Replaces negative values with zero. • When we use ReLU, we should watch for dead units in the network (units that never activate). If there are many dead units while training our network, we might want to consider using leaky ReLU instead.
  • 15. Max-Pooling Image source: cs231n
  • 17. Conv Layer image source: cs231n_2017_lecture5.pdf slide-39
  • 18. Operation – Convolution image source: https://ujjwalkarn.me/2016/08/11/intuitive-explanation-convnets/
  • 19. Operation – Activation Image source: https://ujjwalkarn.me/2016/08/11/intuitive-explanation-convnets/
  • 20. Operation – Pooling image source: https://ujjwalkarn.me/2016/08/11/intuitive-explanation-convnets/
  • 22. Alexnet - Trained Filters source: cs231n Example filters learned by Krizhevsky et al. Each of the 96 filters shown here is of size [11x11x3], and each one is shared by the 55*55 neurons in one depth slice. Notice that the parameter sharing assumption is relatively reasonable: If detecting a horizontal edge is important at some location in the image, it should intuitively be useful at some other location as well due to the translationally-invariant structure of images. There is therefore no need to relearn to detect a horizontal edge at every one of the 55*55 distinct locations in the Conv layer output volume.
  • 23. Summary source: partially from cs231n_2017_lecture5.pdf slide-76
      • Workflow
        1. Initialize all filter weights and parameters with random numbers.
        2. Use the original images as input:
           2.1 Apply filters to the original image > Conv layer
           2.2 Apply an activation function (e.g. ReLU) to the Conv layer > Feature Map
           2.3 Apply a pooling filter to the Feature Map > smaller Feature Map (optional)
           2.4 Flatten the Feature Map > Fully Connected Network (FC)
           2.5 Apply ANN training (forward and backward propagation) to the FC
           2.6 Optimize the weights: calculate the error, adjust the weights, and loop over the original images until the probability of the correct class is high.
        3. Test the result; if satisfied, save the filters (weights and parameters) for future use, else loop.
      • ConvNets stack CONV, POOL, FC layers: [(CONV-RELU)*N-POOL?]*M-(FC-RELU)*K, SOFTMAX, where
        - N is usually up to ~5, M is large, 0 <= K <= 2
        - Trend towards smaller filters and deeper architectures
        - Trend towards getting rid of POOL/FC layers (just CONV)
      • But:
        - Recent advances such as ResNet/GoogLeNet challenge this paradigm.
        - The newly proposed Capsule Neural Network may overcome some shortcomings of ConvNets.
  • 24. Various CNN Architectures From https://www.jeremyjordan.me/convnet-architectures/
    These architectures serve as rich feature extractors which can be used for image classification, object detection, image segmentation, and many other more advanced tasks.
      • Classic network architectures (included for historical purposes)
        - [LeNet-5](https://www.jeremyjordan.me/convnet-architectures/#lenet5)
        - [AlexNet](https://www.jeremyjordan.me/convnet-architectures/#alexnet)
        - [VGG 16](https://www.jeremyjordan.me/convnet-architectures/#vgg16)
      • Modern network architectures
        - [Inception](https://www.jeremyjordan.me/convnet-architectures/#inception)
        - [ResNet](https://www.jeremyjordan.me/convnet-architectures/#resnet)
        - [DenseNet](https://www.jeremyjordan.me/convnet-architectures/#densenet)
  • 26. Reference
      • [How to Select Activation Function for Deep Neural Network](https://engmrk.com/activation-function-for-dnn/)
      • [Using Convolutional Neural Networks for Image Recognition](https://ip.cadence.com/uploads/901/cnn_wp-pdf)
      • [Activation Functions: Neural Networks](https://towardsdatascience.com/activation-functions-neural-networks-1cbd9f8d91d6)
      • [Convolutional Neural Networks Tutorial in TensorFlow](http://adventuresinmachinelearning.com/convolutional-neural-networks-tutorial-tensorflow/)
      • [Rethinking the Inception Architecture for Computer Vision](https://arxiv.org/pdf/1512.00567.pdf)
  • 27. Demo
      • [Demo - filtering](https://ujjwalkarn.me/2016/08/11/intuitive-explanation-convnets/): building image
      • [Demo – cs231n](http://cs231n.stanford.edu/): end-to-end architecture in real time
      • [Demo – convolution calculation](http://cs231n.github.io/convolutional-networks/): dot product
      • [Demo – cifar10](https://cs.stanford.edu/people/karpathy/convnetjs/demo/cifar10.html): filter/ReLU in detail
  • 28. Code
      • [image classification with Tensorflow](https://github.com/rkuo/ml-tensorflow/blob/master/cnn-cifar10/cnn-cifar10-keras-v0.2.0.ipynb): uses TensorFlow, local
      • [image classification with Keras](https://github.com/rkuo/ml-tensorflow/blob/master/cnn-cifar10/cnn-cifar10-keras-v0.2.0.ipynb): uses Keras, local
      • [catsdogs](https://github.com/rkuo/fastai/blob/master/lesson1-catsdogs/Fastai_2_Lesson1.ipynb): uses fastai with pre-trained model = resnet34
      • [tableschairs](https://github.com/rkuo/fastai/blob/master/lesson1-tableschairs/Fastai_2_Lesson1a-tableschairs.ipynb): switches the data
  • 29. Image Classification with Tensorflow
  • 30. Image Classification with Keras
  • 32. Catsdogs Model with Fastai
  • 34. Why Convolution Neural Network? Image source: https://www.youtube.com/watch?v=QsxKKyhYxFQ • Reduces the number of weights required for training. • Uses filters to capture local information; the search becomes more meaningful, moving from pixel recognition to pattern recognition. • Sparsity of connections: each output depends on only a small region of the input, so most of the weights of an equivalent fully connected layer are 0. This improves space and time efficiency.
  • 35. LeNet 5 source: Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, Gradient-based learning applied to document recognition, Proc. IEEE 86(11): 2278–2324, 1998. - 2 Conv - 2 Subsampling - 2 FC - Gaussian connections
  • 36. Inception v3
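
The layer shapes listed on slide 11 (INPUT [32x32x3], CONV [32x32x6], RELU [32x32x6], POOL [16x16x6]) can be traced with a small forward pass. This is only a minimal NumPy sketch, not the code from the linked notebooks: the 5x5 filter size, the zero padding of 2, the 2x2 pooling window, the random weights, and the function names `conv2d`, `relu`, and `max_pool` are all assumptions chosen so that the shapes match the slide.

```python
import numpy as np

def conv2d(x, filters, biases, pad, stride=1):
    """Naive conv layer. x: (H, W, C); filters: (K, F, F, C); biases: (K,)."""
    k, f, _, _ = filters.shape
    # Zero-pad height and width only, never the channel axis.
    xp = np.pad(x, ((pad, pad), (pad, pad), (0, 0)))
    h2 = (x.shape[0] - f + 2 * pad) // stride + 1
    w2 = (x.shape[1] - f + 2 * pad) // stride + 1
    out = np.zeros((h2, w2, k))
    for i in range(h2):
        for j in range(w2):
            patch = xp[i * stride:i * stride + f, j * stride:j * stride + f, :]
            # Dot product of every filter with the local region, plus bias.
            out[i, j, :] = np.tensordot(filters, patch, axes=([1, 2, 3], [0, 1, 2])) + biases
    return out

def relu(x):
    """Elementwise max(0, x); the volume size is unchanged."""
    return np.maximum(0, x)

def max_pool(x, size=2):
    """Non-overlapping max-pooling over height and width."""
    h, w, c = x.shape
    x = x[:h - h % size, :w - w % size, :]  # drop rows/columns that do not fit
    return x.reshape(h // size, size, w // size, size, c).max(axis=(1, 3))

rng = np.random.default_rng(0)
image = rng.random((32, 32, 3))                      # stand-in for a CIFAR-10 image
filters = rng.standard_normal((6, 5, 5, 3)) * 0.1    # 6 hypothetical 5x5x3 filters
biases = np.zeros(6)

conv_out = conv2d(image, filters, biases, pad=2)     # (32, 32, 6)
relu_out = relu(conv_out)                            # (32, 32, 6)
pool_out = max_pool(relu_out)                        # (16, 16, 6)
print(conv_out.shape, relu_out.shape, pool_out.shape)
```

The explicit loops are kept for readability; a real framework would vectorize the convolution and learn the filter values by backpropagation instead of leaving them random.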

Editor's Notes

  1. Convolution Neural Network for Visual Recognition
  2. Max-Pooling. Use 6 filters of size 5 x 5 x 3. 3072 x 3072 = 9.43M vs 156 x 4704 = 733,824. Stride.
  3. 9 + 1 + (-2) + 1 (bias) = 9
     Hyper-parameters. The Conv layer:
       • Accepts a volume of size W1×H1×D1.
       • Requires four hyper-parameters: the number of filters K, their spatial extent F, the stride S, and the amount of zero padding P.
       • Produces a volume of size W2×H2×D2, where:
           W2 = (W1 − F + 2P)/S + 1
           H2 = (H1 − F + 2P)/S + 1 (i.e. width and height are computed equally by symmetry)
           D2 = K
       • With parameter sharing, introduces F⋅F⋅D1 weights per filter, for a total of (F⋅F⋅D1)⋅K weights and K biases.
       • In the output volume, the d-th depth slice (of size W2×H2) is the result of performing a valid convolution of the d-th filter over the input volume with a stride of S, then offset by the d-th bias.
     A common setting of the hyper-parameters is F=3, S=1, P=1.
  4. For consistency, function f should be g
  5. Max-Pooling: http://www.ais.uni-bonn.de/papers/icann2010_maxpool.pdf shows that max-pooling is effective.
  6. Source cs231n: Example Architecture: Overview: We will go into more details below, but a simple ConvNet for CIFAR-10 classification could have the architecture [INPUT - CONV - RELU - POOL - FC]. In more detail: INPUT [32x32x3] will hold the raw pixel values of the image, in this case an image of width 32, height 32, and with three color channels R,G,B. CONV layer will compute the output of neurons that are connected to local regions in the input, each computing a dot product between their weights and a small region they are connected to in the input volume. This may result in volume such as [32x32x12] if we decided to use 12 filters. Use 6 here. RELU layer will apply an elementwise activation function, such as the max(0,x) thresholding at zero. This leaves the size of the volume unchanged ([32x32x12]). POOL layer will perform a downsampling operation along the spatial dimensions (width, height), resulting in volume such as [16x16x12]. FC (i.e. fully-connected) layer will compute the class scores, resulting in volume of size [1x1x10], where each of the 10 numbers correspond to a class score, such as among the 10 categories of CIFAR-10. As with ordinary Neural Networks and as the name implies, each neuron in this layer will be connected to all the numbers in the previous volume.
  7. Each Filter Generates One Feature Map
  8. In particular, pooling:
       • makes the input representations (feature dimension) smaller and more manageable;
       • reduces the number of parameters and computations in the network, therefore controlling overfitting [4];
       • makes the network invariant to small transformations, distortions and translations in the input image (a small distortion in the input will not change the output of pooling, since we take the maximum / average value in a local neighborhood);
       • helps us arrive at an almost scale invariant representation of our image (the exact term is “equivariant”). This is very powerful since we can detect objects in an image no matter where they are located (read [18] and [19] for details).
  9. [INPUT – [CONV – RELU]*2 – POOL]*3 – [FC]*2 - SoftMax
  10. Alexnet - https://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf We trained a large, deep convolutional neural network to classify the 1.2 million high-resolution images in the ImageNet LSVRC-2010 contest into the 1000 different classes. On the test data, we achieved top-1 and top-5 error rates of 37.5% and 17.0% which is considerably better than the previous state-of-the-art. The neural network, which has 60 million parameters and 650,000 neurons, consists of five convolutional layers, some of which are followed by max-pooling layers, and three fully-connected layers with a final 1000-way softmax
  11. Concept: Find a set of filters (function g, a matrix of weights) and parameters that create proper feature maps and cause the activation functions to fire at different layers so that the correct class has the highest probability. f*g*a*p*fc -> max-y. This should include the option of DROPOUT. Given an image function f, find a filter g, an activation function a, and a pooling function p that lead to the maximum y value (associated with f). Analogy: if we use a red glass filter to look at a red letter A written on white paper, we will see a white letter A written on black paper.
  12. Source cs231n: Example Architecture: Overview: We will go into more details below, but a simple ConvNet for CIFAR-10 classification could have the architecture [INPUT - CONV - RELU - POOL - FC]. In more detail: INPUT [32x32x3] will hold the raw pixel values of the image, in this case an image of width 32, height 32, and with three color channels R,G,B. CONV layer will compute the output of neurons that are connected to local regions in the input, each computing a dot product between their weights and a small region they are connected to in the input volume. This may result in volume such as [32x32x12] if we decided to use 12 filters. RELU layer will apply an elementwise activation function, such as the max(0,x) thresholding at zero. This leaves the size of the volume unchanged ([32x32x12]). POOL layer will perform a downsampling operation along the spatial dimensions (width, height), resulting in volume such as [16x16x12]. FC (i.e. fully-connected) layer will compute the class scores, resulting in volume of size [1x1x10], where each of the 10 numbers correspond to a class score, such as among the 10 categories of CIFAR-10. As with ordinary Neural Networks and as the name implies, each neuron in this layer will be connected to all the numbers in the previous volume.
  13. Demo: http://cs231n.stanford.edu/
  14. Max-Pooling. Use 6 filters of size 5 x 5 x 3. 3072 x 3072 = 9.43M vs 156 x 4704 = 733,824. Stride.
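
The output-size and parameter-count formulas in note 3 can be checked with a few lines of arithmetic. The sketch below is illustrative only: the helper names `conv_output_shape` and `conv_param_count` are hypothetical, and the 227x227x3 input size used for the AlexNet check is an assumption (the slides state only the 96 filters of size [11x11x3] and the 55*55 output), as is the stride of 4 with no padding.

```python
def conv_output_shape(w1, h1, d1, k, f, s, p):
    """Return (W2, H2, D2) for a conv layer, following note 3:
    W2 = (W1 - F + 2P)/S + 1, H2 = (H1 - F + 2P)/S + 1, D2 = K.
    d1 does not affect the output shape; it is kept to mirror the note."""
    if (w1 - f + 2 * p) % s != 0 or (h1 - f + 2 * p) % s != 0:
        raise ValueError("filter, stride and padding do not tile the input evenly")
    w2 = (w1 - f + 2 * p) // s + 1
    h2 = (h1 - f + 2 * p) // s + 1
    return w2, h2, k

def conv_param_count(d1, k, f):
    """Return (weights, biases) with parameter sharing:
    F*F*D1 weights per filter, K filters, one bias per filter."""
    return f * f * d1 * k, k

# The common setting F=3, S=1, P=1 preserves width and height.
print(conv_output_shape(32, 32, 3, k=6, f=3, s=1, p=1))       # (32, 32, 6)

# First AlexNet conv layer, assuming a 227x227x3 input, stride 4, no padding.
print(conv_output_shape(227, 227, 3, k=96, f=11, s=4, p=0))   # (55, 55, 96)
print(conv_param_count(d1=3, k=96, f=11))                     # (34848, 96)
```

The explicit divisibility check reflects the fact that the formula only yields a valid layer when (W1 − F + 2P) is a multiple of S; otherwise the filter positions do not fit across the input.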