Cnn

Convolutional Neural Network (CNN)
In the name of God
Mehrnaz Faraz
Faculty of Electrical Engineering
K. N. Toosi University of Technology
Milad Abbasi
Faculty of Electrical Engineering
Sharif University of Technology

CNN
• A supervised deep learning algorithm
• Not a fully connected neural network
• Suitable for big data and tensors
– Tensor: Multidimensional array
• Uses relatively little pre-processing compared to other
algorithms

Using CNN
• Computer vision
– Face recognition
– Scene labelling
– Image classification
– Action recognition
– Human pose estimation
– Document analysis
• Natural Language Processing
– Speech recognition

Using CNN
[Figure: examples of face recognition, scene labelling, human pose estimation, and document analysis]

Using CNN
• Classification
• Object detection
• Segmentation

CNN
• Using convolutional layers
• Using pooling layers
• Using multiple filters in a layer
– Creates different outputs in a layer
• Suitable for image data

Convolutional Layer
• An example input volume (e.g. a 32x32x3 image)
– Color image: Height, Width, Depth (Channels)
– Each pixel has 3 channels (R, G and B)
• Input image: 32x32x3 (height x width x depth)
• Filter: 5x5x3

Convolutional Layer
• Convolving input with a filter
– Convolution: Sum of element-wise multiplications
– Example:
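A minimal sketch of the "sum of element-wise multiplications" idea, assuming no padding and a stride of 1. The function name `conv2d_valid` and the small arrays are illustrative only. As is common in deep learning, the filter is not flipped, so strictly speaking this computes a cross-correlation.

```python
import numpy as np

def conv2d_valid(x, w):
    """Slide w over x (no padding, stride 1).

    Each output value is the sum of the element-wise products
    between the filter and the patch of x it currently covers.
    """
    kh, kw = w.shape
    out_h = x.shape[0] - kh + 1
    out_w = x.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for r in range(out_h):
        for c in range(out_w):
            patch = x[r:r + kh, c:c + kw]
            out[r, c] = np.sum(patch * w)
    return out

x = np.array([[1, 0, 2, 1],
              [0, 1, 3, 0],
              [2, 1, 0, 1],
              [1, 0, 1, 2]], dtype=float)
w = np.array([[1, 0],
              [0, -1]], dtype=float)

feature_map = conv2d_valid(x, w)
print(feature_map)
# [[ 0. -3.  2.]
#  [-1.  1.  2.]
#  [ 2.  0. -2.]]
```

With this particular filter, each output is simply the top-left value of the patch minus its bottom-right value, which makes the result easy to check by hand.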

Convolutional Layer
[Figure: input (x), filter (w), and feature map; stacked feature map with 10 different filters]
• A neuron (a single number in the feature map): $w^T x + b$
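The following sketch shows how 10 filters applied to a 32x32x3 input could produce a stacked feature map, with every neuron computed as $w^T x + b$ over a flattened 5x5x3 patch. The random values, the absence of padding, and the stride of 1 are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

x = rng.standard_normal((32, 32, 3))          # input volume (height, width, depth)
filters = rng.standard_normal((10, 5, 5, 3))  # 10 filters, each 5x5x3
biases = rng.standard_normal(10)              # one bias per filter

out_size = 32 - 5 + 1                         # 28 positions along each axis
stacked = np.zeros((out_size, out_size, 10))

for f in range(10):
    w = filters[f].ravel()                    # filter as a vector of 75 weights
    for r in range(out_size):
        for c in range(out_size):
            patch = x[r:r + 5, c:c + 5, :].ravel()
            stacked[r, c, f] = w @ patch + biases[f]   # one neuron: w^T x + b

print(stacked.shape)  # (28, 28, 10)
```

Each filter yields one 28x28 feature map, and the 10 maps are stacked along the depth axis.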

Convolutional Layer
• Stacked feature map:
[Figure labels: Input, Filter, Filter, Feature Map]

Convolutional Layer
• A convolutional layer is NOT fully connected
– Each neuron is connected only to a spatially local region of the
input volume

Convolutional Layer
• Increasing the number of neurons increases the number of parameters
and the computational burden
• Parameter sharing
– Sharing of weights by all neurons in a particular feature map
– Reduces the number of parameters
• Local connectivity
– Each neuron is connected only to a subset of the input image

Number of Parameters
Input: 256x256x3
– One neuron connected to the whole input: 256*256*3+1=196,609 parameters
– Kernel: 128x128x3 with parameter sharing: 128*128*3+1=49,153 parameters
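The two counts can be reproduced with a small helper. The function name `neuron_params` is hypothetical; it simply counts one weight per input value in the connected region plus one bias.

```python
def neuron_params(height, width, depth):
    """Weights covering a height x width x depth region, plus one bias."""
    return height * width * depth + 1

full_input = neuron_params(256, 256, 3)     # neuron connected to the whole input
shared_kernel = neuron_params(128, 128, 3)  # one 128x128x3 kernel

print(full_input)     # 196609
print(shared_kernel)  # 49153
```

With parameter sharing, the kernel's count stays the same no matter how many neurons the feature map contains, because all of them reuse the same weights.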

Stride
• Specifies how much we move the convolution filter at
each step
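As a rough one-dimensional illustration, the stride determines which starting positions the filter visits. The helper `filter_positions` and the sizes used here are assumptions for the sake of the example, with no padding.

```python
def filter_positions(input_size, kernel_size, stride):
    """Start indices of the filter along one axis (no padding)."""
    return list(range(0, input_size - kernel_size + 1, stride))

positions_s1 = filter_positions(7, 3, 1)
positions_s2 = filter_positions(7, 3, 2)

print(positions_s1)  # [0, 1, 2, 3, 4]
print(positions_s2)  # [0, 2, 4]
```

A larger stride visits fewer positions, so the resulting feature map is smaller.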

Padding
• Without padding, the feature map is smaller than the input
• To maintain the same dimensionality:
– Use padding to surround the input with zeros
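A short sketch of zero padding using NumPy's `np.pad`. The 3x3 input and the padding width of 1 are illustrative choices.

```python
import numpy as np

x = np.arange(1, 10).reshape(3, 3)

# Surround the input with a border of zeros, one pixel wide on every side.
padded = np.pad(x, pad_width=1, mode="constant", constant_values=0)

print(padded)
# [[0 0 0 0 0]
#  [0 1 2 3 0]
#  [0 4 5 6 0]
#  [0 7 8 9 0]
#  [0 0 0 0 0]]
```

A 3x3 filter moved with stride 1 over this 5x5 padded input visits 5 - 3 + 1 = 3 positions per axis, so the feature map keeps the original 3x3 size.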

Example
[Figure: convolution examples with P=0, S=1; P=2, S=1; P=1, S=2; P=1, S=2]

Example
• Size of feature map:
$$o = \left\lfloor \frac{i + 2p - k}{s} \right\rfloor + 1$$
– i: size of input
– k: size of kernel
– p: padding
– s: stride
– o: size of feature map
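The feature-map size can be computed with a one-line helper. This is a sketch that assumes the usual floor-division form of the relation, $o = \lfloor (i + 2p - k)/s \rfloor + 1$, applied along one axis; the function name `conv_output_size` is hypothetical.

```python
def conv_output_size(i, k, p=0, s=1):
    """Feature-map size along one axis.

    i: input size, k: kernel size, p: padding, s: stride.
    """
    return (i + 2 * p - k) // s + 1

print(conv_output_size(32, 5))            # 28  (32x32 input, 5x5 filter)
print(conv_output_size(32, 5, p=2))       # 32  (padding keeps the size)
print(conv_output_size(5, 3, p=1, s=2))   # 3
```

The second call shows how padding of 2 with a 5x5 filter and stride 1 preserves the input size.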
  
   

Non-linearity
• A ReLU is applied after each convolutional layer
• It introduces nonlinearity into a system that has so far computed
only linear operations in the conv layers
• ReLU does not saturate for positive inputs
[Figure labels: Input Image; Feature Maps; Convolutional Layer / Stacked feature map]
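ReLU itself is a simple element-wise operation, $\max(0, z)$. A minimal sketch on a made-up 2x2 feature map:

```python
import numpy as np

def relu(z):
    """Element-wise max(0, z): negative values become 0, positive values pass through."""
    return np.maximum(z, 0)

feature_map = np.array([[-2.0, 1.5],
                        [0.0, -0.5]])
activated = relu(feature_map)

print(activated)
# [[0.  1.5]
#  [0.  0. ]]
```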

Non-linearity
• Convolution + ReLU

Pooling Layer
• Also called a subsampling layer
• Inserted periodically between Conv layers in a ConvNet
• Reduces the number of parameters, the size of the data, and the
computation in the network
• Helps control overfitting
• Types of pooling:
– Stride
– Mean pooling
– Max pooling
– Sum pooling

Pooling Layer
• Mean pooling
• Max pooling
[Figure: mean pooling and max pooling, with stride 2]
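A small sketch of mean and max pooling, assuming a 2x2 window moved with stride 2. The helper `pool2d` and the 4x4 input are illustrative.

```python
import numpy as np

def pool2d(x, size=2, stride=2, mode="max"):
    """Apply max or mean pooling over size x size windows."""
    out_h = (x.shape[0] - size) // stride + 1
    out_w = (x.shape[1] - size) // stride + 1
    reduce = np.max if mode == "max" else np.mean
    out = np.zeros((out_h, out_w))
    for r in range(out_h):
        for c in range(out_w):
            window = x[r * stride:r * stride + size,
                       c * stride:c * stride + size]
            out[r, c] = reduce(window)
    return out

x = np.array([[1, 3, 2, 4],
              [5, 7, 6, 8],
              [4, 2, 0, 1],
              [3, 1, 2, 5]], dtype=float)

max_pooled = pool2d(x, mode="max")
mean_pooled = pool2d(x, mode="mean")

print(max_pooled)   # [[7. 8.]
                    #  [4. 5.]]
print(mean_pooled)  # [[4.  5. ]
                    #  [2.5 2. ]]
```

Both variants reduce the 4x4 input to 2x2; they differ only in how each window is summarized.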

CNN Overview
• CNNs have two components:
– The hidden layers / feature extraction part
• Performs a series of convolution and pooling operations
• Each convolution applies a filter (kernel) to the input data
to produce a feature map
– The classification part
• Assigns a probability that the object in the image is
what the algorithm predicts it is
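The two components can be sketched as a tiny forward pass. Everything here is an assumption for illustration: the 8x8 single-channel input, the four random 3x3 filters, the three output classes, and the use of a dense layer followed by softmax to turn scores into probabilities (the slides do not name the classifier). The weights are random, so the probabilities carry no meaning.

```python
import numpy as np

rng = np.random.default_rng(0)

def conv_relu(x, w, b):
    """Valid convolution of a square 2-D input with one square filter, then ReLU."""
    k = w.shape[0]
    n = x.shape[0] - k + 1
    out = np.array([[np.sum(x[r:r + k, c:c + k] * w) + b for c in range(n)]
                    for r in range(n)])
    return np.maximum(out, 0)

def max_pool(x, size=2):
    """Non-overlapping max pooling (stride equal to the window size)."""
    n = x.shape[0] // size
    return x[:n * size, :n * size].reshape(n, size, n, size).max(axis=(1, 3))

def softmax(z):
    """Turn raw scores into probabilities that sum to 1."""
    e = np.exp(z - z.max())
    return e / e.sum()

image = rng.standard_normal((8, 8))
filters = rng.standard_normal((4, 3, 3))
biases = np.zeros(4)

# Feature extraction part: convolution + ReLU + pooling, once per filter.
features = [max_pool(conv_relu(image, w, b)) for w, b in zip(filters, biases)]
flat = np.concatenate([f.ravel() for f in features])   # 4 maps of 3x3 -> 36 values

# Classification part: dense layer followed by softmax.
num_classes = 3
W = rng.standard_normal((num_classes, flat.size))
probs = softmax(W @ flat)

print(flat.size)     # 36
print(probs.sum())   # sums to 1 (up to floating-point rounding)
```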

Training
• Back propagation

Common Architectures in CNN
• Classic network architectures:
– LeNet-5
– AlexNet
– VGG16
• Modern network architectures:
– Inception (GoogLeNet)
– ResNet
– ResNeXt
– DenseNet

LeNet-5
– 7 layers
– 3 convolutional layers (C1, C3 and C5)
– 2 sub-sampling (pooling) layers (S2 and S4), using mean pooling
– 1 fully connected layer (F6), followed by the output layer
– About 60,000 parameters
– Introduced by LeCun et al. in 1998
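The order of magnitude of the parameter count can be checked with a rough tally. This sketch relies on several assumptions not stated on the slide: the commonly cited layer sizes (6, 16 and 120 filters of size 5x5, then 84 units and 10 outputs), full connectivity between S2 and C3, pooling layers without trainable parameters, and a plain dense output layer. The original network differs in some of these details, so its exact count is slightly different.

```python
def conv_params(kernel, in_channels, out_channels):
    """(kernel*kernel*in_channels weights + 1 bias) per filter."""
    return (kernel * kernel * in_channels + 1) * out_channels

def dense_params(n_in, n_out):
    """(n_in weights + 1 bias) per output unit."""
    return (n_in + 1) * n_out

layers = {
    "C1": conv_params(5, 1, 6),       # 156
    "C3": conv_params(5, 6, 16),      # 2,416 (assumes full connectivity)
    "C5": conv_params(5, 16, 120),    # 48,120
    "F6": dense_params(120, 84),      # 10,164
    "output": dense_params(84, 10),   # 850
}
total = sum(layers.values())

print(total)  # 61706
```

Under these assumptions the total is 61,706, in line with the figure of roughly 60,000 parameters.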

AlexNet
– The general architecture is quite similar to LeNet-5
– This model is considerably larger than LeNet-5
– Opened the way for deep learning in computer vision tasks
– About 60 million parameters
– Introduced by Alex Krizhevsky et al. in 2012

VGG16
– Offers a deeper yet simpler variant of the convolutional
structures
– About 138 million parameters
– Introduced in 2014

GoogLeNet
– Built from a basic unit referred to as an "Inception cell"
– Introduced in 2014 by researchers at Google
