1. Convolutional Neural Network (CNN)
In the name of God
Mehrnaz Faraz
Faculty of Electrical Engineering
K. N. Toosi University of Technology
Milad Abbasi
Faculty of Electrical Engineering
Sharif University of Technology
2. CNN
• A supervised deep learning algorithm
• Not a fully connected neural network
• Suitable for big data and tensors
– Tensor: Multidimensional array
• Uses relatively little pre-processing compared to other
algorithms
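As a small illustration of "tensor: multidimensional array", the sketch below builds arrays of increasing rank with NumPy. The shapes (a 32x32 image, a batch of 8) are assumptions chosen only for illustration.

```python
import numpy as np

# Hypothetical shapes, chosen only for illustration.
gray_image = np.zeros((32, 32))          # 2-D tensor: height x width
color_image = np.zeros((32, 32, 3))      # 3-D tensor: height x width x channels
image_batch = np.zeros((8, 32, 32, 3))   # 4-D tensor: batch x height x width x channels

print(gray_image.ndim, color_image.ndim, image_batch.ndim)  # 2 3 4
```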
3. Using CNN
• Computer vision
– Face recognition
– Scene labelling
– Image classification
– Action recognition
– Human pose estimation
– Document analysis
• Natural Language Processing
– Speech recognition
6. CNN
• Uses convolutional layers
• Uses pooling layers
• Uses multiple filters in a layer
– Creates different outputs in a layer
• Suitable for image data
7. Convolutional Layer
• An example input volume in red (e.g. a 32x32x3 image)
– Color image: Height, Width, Depth (Channels)
– Each pixel has 3 channels (R, G and B)
Input image: 32x32x3
Filter: 5x5x3
[Figure: a 32x32x3 input volume (height x width x depth) and a 5x5x3 filter]
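To make the 32x32x3 input and 5x5x3 filter concrete, here is a minimal sketch of one filter sliding over the image. It assumes stride 1 and no padding, and uses random values and a made-up bias; the function name `conv2d_single_filter` is illustrative, not from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)
image = rng.random((32, 32, 3))   # height x width x depth (R, G, B)
kernel = rng.random((5, 5, 3))    # filter depth matches the input depth
bias = 0.1                        # hypothetical bias value

def conv2d_single_filter(x, w, b):
    """Slide one filter over x (no padding, stride 1) and return a 2-D feature map."""
    kh, kw, _ = w.shape
    out_h = x.shape[0] - kh + 1
    out_w = x.shape[1] - kw + 1
    out = np.empty((out_h, out_w))
    for r in range(out_h):
        for c in range(out_w):
            patch = x[r:r + kh, c:c + kw, :]    # local 5x5x3 region of the input
            out[r, c] = np.sum(patch * w) + b   # dot product plus bias
    return out

feature_map = conv2d_single_filter(image, kernel, bias)
print(feature_map.shape)  # (28, 28)
```

Each output value depends only on a 5x5x3 patch of the input, and the same 75 weights plus one bias are reused at every position.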
12. Convolutional Layer
• Convolutional layer is NOT fully connected
– Each neuron is connected only to a local spatial region of the input
volume
13. Convolutional Layer
• Increasing the number of neurons increases the number of
parameters and the computational burden
• Parameter sharing
– Sharing of weights by all neurons in a particular feature map
– Reduces the number of parameters
• Local connectivity
– Each neuron is connected only to a subset of the input image
14. Number of Parameters
• Input: 256x256x3
– Parameters for one neuron connected to the whole input:
256*256*3+1 = 196,609
• Kernel: 128x128x3, with parameter sharing
– Parameters: 128*128*3+1 = 49,153
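The two counts on this slide can be reproduced with a short calculation. This is a minimal sketch: the function name is illustrative, and the "without sharing" figure assumes stride 1 and no padding, which the slide does not state.

```python
def neuron_params(height, width, depth):
    """Weights for one neuron that sees a height x width x depth region, plus one bias."""
    return height * width * depth + 1

full_input = neuron_params(256, 256, 3)     # neuron connected to the whole input
shared_kernel = neuron_params(128, 128, 3)  # one 128x128x3 kernel shared by a feature map

# Assumption (not on the slide): stride 1 and no padding, so the kernel
# fits at 256 - 128 + 1 = 129 positions along each spatial axis.
positions = (256 - 128 + 1) ** 2
without_sharing = positions * shared_kernel  # if every position had its own weights

print(full_input)     # 196609
print(shared_kernel)  # 49153
print(without_sharing > shared_kernel)  # True
```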
17. Padding
• The size of the feature map is smaller than the input
• To maintain the same dimensionality
– Use padding to surround the input with zeros
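Zero padding can be sketched with NumPy's `np.pad`. The 3x3 input and the padding width of 1 are assumptions for illustration; with a 3x3 kernel and stride 1, that padding keeps the feature map the same size as the input.

```python
import numpy as np

x = np.arange(1, 10).reshape(3, 3)   # hypothetical 3x3 input
padded = np.pad(x, pad_width=1, mode="constant", constant_values=0)

print(padded.shape)  # (5, 5)

# A 3x3 kernel with stride 1 fits at 5 - 3 + 1 = 3 positions per axis,
# so the feature map is 3x3 again, the same size as the unpadded input.
kernel_size = 3
output_size = padded.shape[0] - kernel_size + 1
print(output_size)   # 3
```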
19. Example
• Size of feature map:
$$ o = \left\lfloor \frac{i + 2p - k}{s} \right\rfloor + 1 $$
– i: size of input
– k: size of kernel
– p: padding
– s: stride
– o: size of feature map
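The size relation o = (i + 2p - k)/s + 1, rounded down when the division is not exact, can be checked with a small helper. The function name and the sample values are illustrative assumptions.

```python
def feature_map_size(i, k, p=0, s=1):
    """Output size along one axis for input size i, kernel k, padding p, stride s."""
    return (i + 2 * p - k) // s + 1

print(feature_map_size(32, 5))            # 28: no padding, stride 1
print(feature_map_size(32, 5, p=2))       # 32: padding keeps the input size
print(feature_map_size(32, 5, p=0, s=2))  # 14: stride 2 roughly halves the size
```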
20. Non-linearity
• Add ReLU after each convolutional layer
• Introduces nonlinearity into a system that has so far computed
only linear operations in the conv layers
• ReLU does not saturate for positive inputs
[Figure: input image, feature maps, and the convolutional layer as stacked feature maps]
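ReLU itself is a one-line operation: it replaces negative values with zero and leaves positive values unchanged. A minimal sketch on a made-up 2x2 feature map:

```python
import numpy as np

def relu(x):
    """Element-wise ReLU: max(x, 0)."""
    return np.maximum(x, 0)

feature_map = np.array([[-2.0, 0.5],
                        [3.0, -0.1]])   # hypothetical conv output
activated = relu(feature_map)
# Negatives become 0; positive values pass through unchanged.
print(activated)
```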
22. Pooling Layer
• Or subsampling layer
• Inserted periodically between conv layers in a ConvNet
• Reduces the number of parameters, the size of the data, and the
computation in the network
• Control overfitting
• Types of pooling:
– Stride
– Mean pooling
– Max pooling
– Sum pooling
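The max, mean and sum variants differ only in how each window is reduced. The sketch below assumes a 2x2 window with stride 2 on a made-up 4x4 feature map; the function name `pool2d` is illustrative.

```python
import numpy as np

def pool2d(x, size=2, stride=2, mode="max"):
    """Pool a 2-D feature map with a size x size window."""
    reducers = {"max": np.max, "mean": np.mean, "sum": np.sum}
    reducer = reducers[mode]
    out_h = (x.shape[0] - size) // stride + 1
    out_w = (x.shape[1] - size) // stride + 1
    out = np.empty((out_h, out_w))
    for r in range(out_h):
        for c in range(out_w):
            window = x[r * stride:r * stride + size,
                       c * stride:c * stride + size]
            out[r, c] = reducer(window)   # one value per window
    return out

x = np.array([[1, 3, 2, 4],
              [5, 6, 1, 2],
              [7, 2, 9, 0],
              [1, 8, 3, 4]], dtype=float)   # hypothetical feature map

max_pooled = pool2d(x, mode="max")    # values: 6, 4 / 8, 9
mean_pooled = pool2d(x, mode="mean")  # values: 3.75, 2.25 / 4.5, 4.0
sum_pooled = pool2d(x, mode="sum")    # values: 15, 9 / 18, 16
print(max_pooled.shape)  # (2, 2)
```

In every case the 4x4 map shrinks to 2x2, which is where the reduction in data size and computation comes from.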
24. CNN Overview
• CNNs have two components:
– The Hidden layers/Feature extraction part
• Performs a series of convolution and pooling operations
• The convolution is performed on the input data with a
filter (kernel) to produce a feature map
– The Classification part
• Assigns a probability that the object in the image is
what the algorithm predicts it to be
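The two components can be sketched end to end in NumPy: a feature extraction part (convolution, ReLU, max pooling) followed by a classification part (a linear layer with softmax). Everything specific here is an assumption for illustration: 4 filters of size 5x5, 10 classes, and random untrained weights, so the probabilities are meaningless beyond showing the data flow.

```python
import numpy as np

rng = np.random.default_rng(0)

def conv2d(x, filters, bias):
    """Valid convolution, stride 1. x: (H, W, D); filters: (N, k, k, D); bias: (N,)."""
    n, k, _, _ = filters.shape
    out_h, out_w = x.shape[0] - k + 1, x.shape[1] - k + 1
    out = np.empty((out_h, out_w, n))
    for r in range(out_h):
        for c in range(out_w):
            patch = x[r:r + k, c:c + k, :]
            # One dot product per filter gives one value per feature map.
            out[r, c, :] = np.tensordot(filters, patch, axes=([1, 2, 3], [0, 1, 2])) + bias
    return out

def max_pool(x, size=2):
    """Non-overlapping max pooling over the two spatial axes."""
    h2, w2 = x.shape[0] // size, x.shape[1] // size
    x = x[:h2 * size, :w2 * size, :]
    return x.reshape(h2, size, w2, size, x.shape[2]).max(axis=(1, 3))

def softmax(z):
    e = np.exp(z - z.max())   # subtract the max for numerical stability
    return e / e.sum()

# Feature extraction part (hypothetical sizes, untrained random weights).
image = rng.random((32, 32, 3))
filters = rng.standard_normal((4, 5, 5, 3)) * 0.1
features = max_pool(np.maximum(conv2d(image, filters, np.zeros(4)), 0))

# Classification part: flatten, linear layer, softmax over 10 assumed classes.
flat = features.reshape(-1)
weights = rng.standard_normal((10, flat.size)) * 0.01
probs = softmax(weights @ flat + np.zeros(10))

print(features.shape, probs.shape)  # (14, 14, 4) (10,)
```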
29. Common Architectures in CNN
• Classic network architectures:
– LeNet-5
– AlexNet
– VGG16
• Modern network architectures:
– Inception (GoogLeNet)
– ResNet
– ResNeXt
– DenseNet
30. LeNet-5
– 7 layers
– 3 convolutional layers (C1, C3 and C5)
– 2 sub-sampling (pooling) layers (S2 and S4), using mean pooling
– 1 fully connected layer (F6)
– About 60,000 parameters
Introduced by LeCun et al. in 1998
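The figure of roughly 60,000 parameters can be approximated with a short count. The layer sizes below are not given on the slide and are assumptions based on the commonly described LeNet-5 layout: a 32x32x1 input, 5x5 kernels, 6/16/120 feature maps in C1/C3/C5, 84 units in F6, and 10 outputs. The sketch also assumes C3 sees all 6 input maps and that the pooling layers have no trainable parameters, which simplifies the original design.

```python
def conv_params(num_filters, kernel, in_depth):
    """Each filter has kernel*kernel*in_depth weights plus one bias."""
    return num_filters * (kernel * kernel * in_depth + 1)

def dense_params(num_units, num_inputs):
    """Each unit has one weight per input plus one bias."""
    return num_units * (num_inputs + 1)

# Assumed layer sizes (not stated on the slide).
layers = {
    "C1": conv_params(6, 5, 1),
    "C3": conv_params(16, 5, 6),     # assumes full connectivity to all 6 maps
    "C5": conv_params(120, 5, 16),
    "F6": dense_params(84, 120),
    "Output": dense_params(10, 84),
}

total = sum(layers.values())
print(total)  # 61706
```

Under these assumptions most of the parameters sit in C5 and F6, not in the early convolutional layers.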
31. AlexNet
– The general architecture is quite similar to LeNet-5
– This model is considerably larger than LeNet-5
– Opened the way for deep learning in computer vision tasks
– About 60 million parameters
Introduced by Alex Krizhevsky et al. in 2012
32. VGG16
– Offers a deeper yet simpler variant of the convolutional
structures
– About 138 million parameters
Introduced in 2014
33. GoogLeNet
– Composed of a basic unit referred to as an "Inception cell"
Introduced in 2014 by researchers at Google