A convolutional neural network (CNN) is a regularized type of feed-forward neural network that learns feature engineering by itself through filter (or kernel) optimization. Vanishing and exploding gradients, seen during backpropagation in earlier neural networks, are prevented by using regularized weights over fewer connections. For example, each neuron in a fully connected layer would require 10,000 weights to process an image sized 100 × 100 pixels. With cascaded convolution (or cross-correlation) kernels, by contrast, only 25 learnable weights are required, because the same 5 × 5 kernel is slid across every 5 × 5 tile of the image. Higher-layer features are extracted from wider context windows, compared to lower-layer features.
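The parameter counts above can be checked directly: a fully connected neuron needs one weight per input pixel, while a shared convolution kernel needs one weight per kernel element, regardless of image size.

```python
# Weights per neuron in a fully connected layer: one per input pixel.
image_h, image_w = 100, 100
fc_weights_per_neuron = image_h * image_w   # 10,000

# Weights in a single 5x5 convolution kernel, shared across all positions.
kernel_h, kernel_w = 5, 5
conv_weights = kernel_h * kernel_w          # 25

print(fc_weights_per_neuron, conv_weights)
```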
CNNs are also known as Shift Invariant or Space Invariant Artificial Neural Networks (SIANN), based on the shared-weight architecture of the convolution kernels or filters that slide along input features and provide translation-equivariant responses known as feature maps. Counter-intuitively, most convolutional neural networks are not invariant to translation, due to the downsampling operation they apply to the input.
Feed-forward neural networks are usually fully connected networks; that is, each neuron in one layer is connected to all neurons in the next layer. The "full connectivity" of these networks makes them prone to overfitting data. Typical ways of regularizing, or preventing overfitting, include penalizing parameters during training (such as weight decay) and trimming connectivity (skip connections, dropout, etc.). Robust datasets also increase the probability that CNNs will learn the generalized principles that characterize a given dataset rather than the biases of a poorly populated set.
Convolutional networks were inspired by biological processes in that the connectivity pattern between neurons resembles the organization of the animal visual cortex. Individual cortical neurons respond to stimuli only in a restricted region of the visual field known as the receptive field. The receptive fields of different neurons partially overlap such that they cover the entire visual field.
CNNs use relatively little pre-processing compared to other image classification algorithms. This means that the network learns to optimize the filters (or kernels) through automated learning, whereas in traditional algorithms these filters are hand-engineered. This independence from prior knowledge and human intervention in feature extraction is a major advantage.
Convolutional Neural Networks (CNN)
1. Convolutional Neural Networks (CNN)
2. DATA SCIENCE — Convolutional Neural Network
Bangladesh University of Professionals (BUP)
(Figure: Deep Learning is a field of Machine Learning, which in turn is a field of Artificial Intelligence.)
3. (Figure: Machine Learning vs. Deep Learning. Machine Learning: Input → Feature Extraction → Classification → Output (Car / Not Car). Deep Learning: Input → Feature Extraction + Classification → Output (Car / Not Car).)
4. Neural Network: analogous to the human brain's network of neurons.
5. Neural Network types:
Artificial Neural Network (ANN) — Regression and Classification
Convolutional Neural Network (CNN) — Computer Vision
Recurrent Neural Network (RNN) — Time Series Analysis
6. (Figure: example architectures of an ANN, a CNN, and an RNN.)
7. Applications of CNN
Convolutional Neural Networks (CNNs) have the potential to revolutionize many industries and improve our daily lives. CNNs have a wide range of applications, such as:
Object detection.
Face recognition.
Medical imaging.
Analyzing documents.
Understanding climate.
Grey areas.
Advertising.
Self-driving cars.
Security systems.
8. Applications
Dense image captioning example: "snow covered ground, a blue and a white jacket, a black and a gray jacket, two people standing on skis, woman wearing a blue jacket, a blue helmet, a blue helmet on the head, blue jeans on the girl, a man and a snow covered mountain, a ski lift, white snow on the ground, trees covered in snow, a tree with no leaves."
Optical Character Recognition (OCR) is designed to convert your handwriting into text.
10. Introduction
Convolutional Neural Networks (CNNs) learn multilevel features and a classifier in a joint fashion, and perform much better than traditional approaches for various image classification and segmentation problems.
11. CNN – what do they learn?
12. CNN components
There are four main components in a CNN:
1. Convolution
2. Non-linearity (ReLU)
3. Pooling or sub-sampling
4. Classification (fully connected layer)
13. Input
An image is a matrix of pixel values. In an RGB image, each pixel has the combined values of R, G and B. In a grayscale image, the value of each pixel in the matrix ranges from 0 to 255. (Figure: what we see vs. what the computer sees.)
14. Convolution
The primary purpose of convolution in a CNN is to extract features from the input image.
Image (5 × 5):
1 1 1 0 0
0 1 1 1 0
0 0 1 1 1
0 0 1 1 0
0 1 1 0 0

Filter / Kernel / Feature Detector (3 × 3):
1 0 1
0 1 0
1 0 1

Convolved Feature / Activation Map / Feature Map (3 × 3):
4 3 4
2 4 3
2 3 4

(The kernel slides over the image; at each position the overlapping values are multiplied element-wise and summed.)
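Sliding a kernel over an image can be sketched in plain Python. This is a minimal valid (no-padding), stride-1 cross-correlation, as CNN layers actually compute; the 5 × 5 binary image and 3 × 3 kernel are the standard illustrative example.

```python
def convolve2d(image, kernel):
    """Valid (no-padding) 2D cross-correlation with stride 1."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Multiply the kernel element-wise with the image patch, then sum.
            s = sum(image[i + di][j + dj] * kernel[di][dj]
                    for di in range(kh) for dj in range(kw))
            row.append(s)
        out.append(row)
    return out

image = [[1, 1, 1, 0, 0],
         [0, 1, 1, 1, 0],
         [0, 0, 1, 1, 1],
         [0, 0, 1, 1, 0],
         [0, 1, 1, 0, 0]]
kernel = [[1, 0, 1],
          [0, 1, 0],
          [1, 0, 1]]
feature_map = convolve2d(image, kernel)  # 3x3 feature map
```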
15. Convolution…
(Figure: a filter convolved over an input image.)
16. Convolution…
• The size of the output volume is controlled by three parameters that we need to decide before the convolution step is performed:
Depth: the number of filters we use for the convolution operation.
Stride: the number of pixels by which we slide our filter matrix over the input matrix.
Zero-padding: sometimes it is convenient to pad the input matrix with zeros around the border, so that we can apply the filter to the bordering elements of our input image matrix.
With zero-padding: wide convolution. Without zero-padding: narrow convolution.
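These three choices fix the spatial output size via the standard formula (W − F + 2P) / S + 1, where W is the input size, F the filter size, P the zero-padding, and S the stride:

```python
def conv_output_size(input_size, filter_size, padding=0, stride=1):
    """Spatial output size of a convolution: (W - F + 2P) / S + 1."""
    return (input_size - filter_size + 2 * padding) // stride + 1

# Narrow convolution (no zero-padding): the output shrinks.
print(conv_output_size(5, 3))             # 3
# Wide convolution (zero-padding of 1 preserves the 5x5 size).
print(conv_output_size(5, 3, padding=1))  # 5
```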
17. Non-linearity (ReLU)
• ReLU replaces all negative pixel values in the feature map with zero.
• Its purpose is to introduce non-linearity into the CNN, since most real-world data is non-linear.
• Other non-linear functions, such as tanh (range (-1, 1)) or sigmoid (range (0, 1)), can also be used instead of ReLU, whose output is max(0, input).
Input feature map:
15 20 -10 35
18 -110 25 100
20 -15 25 -10
101 75 18 23

After the ReLU transfer function, output = max(0, input):
15 20 0 35
18 0 25 100
20 0 25 0
101 75 18 23
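Applying ReLU element-wise over a feature map is a one-liner; a minimal sketch with illustrative values:

```python
def relu(feature_map):
    """Replace every negative value in the feature map with zero."""
    return [[max(0, v) for v in row] for row in feature_map]

fmap = [[15, 20, -10, 35],
        [18, -110, 25, 100],
        [20, -15, 25, -10],
        [101, 75, 18, 23]]
rectified = relu(fmap)
```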
18. Pooling
Pooling reduces the dimensionality of each feature map but retains the most important information. Pooling can be of different types: max, average, sum, etc.
(Figure: max pooling with a 2 × 2 region applied to a 4 × 4 feature map, producing a 2 × 2 output.)
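Max pooling with a 2 × 2 region and stride 2 can be sketched as follows; the 4 × 4 input below is an illustrative example, not the slide's exact values.

```python
def max_pool(feature_map, size=2, stride=2):
    """Keep the maximum over each size x size region of the feature map."""
    out = []
    for i in range(0, len(feature_map) - size + 1, stride):
        row = []
        for j in range(0, len(feature_map[0]) - size + 1, stride):
            row.append(max(feature_map[i + di][j + dj]
                           for di in range(size) for dj in range(size)))
        out.append(row)
    return out

fmap = [[1, 1, 2, 4],
        [5, 6, 7, 8],
        [3, 2, 1, 0],
        [1, 2, 3, 4]]
pooled = max_pool(fmap)  # each 2x2 region collapses to its maximum
```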
19. Story so far…
• Together these layers extract the useful features from the images.
• The output from the convolutional and pooling layers represents high-level features of the input image.
Feature extraction pipeline: input (32 × 32) → 5 × 5 convolution → C1 feature maps (28 × 28) → 2 × 2 subsampling → S1 feature maps (14 × 14) → 5 × 5 convolution → C2 feature maps (10 × 10) → high-level features.
20. Fully Connected Layer
• A traditional multi-layer perceptron.
• The term "fully connected" implies that every neuron in the previous layer is connected to every neuron in the next layer.
• Their activations can hence be computed with a matrix multiplication followed by a bias offset.
• The purpose of the fully connected layer is to use the high-level features for classifying the input image into various classes based on the training dataset.
21. Fully Connected Layer (figure)
22. Fully Connected Layer (figure)
24. Overall CNN Architecture
Feature extraction: input (32 × 32) → 5 × 5 convolution → C1 feature maps (28 × 28) → 2 × 2 subsampling → S1 feature maps (14 × 14) → 5 × 5 convolution → C2 feature maps (10 × 10) → 2 × 2 subsampling → S2 feature maps (5 × 5).
Classification: fully connected layers n1 and n2 → output classes (0, 1, …, 8, 9).
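The feature-map sizes in this pipeline follow from the output-size formula (W − F + 2P) / S + 1 for the convolutions and a halving for each 2 × 2 subsampling; a quick check of the chain:

```python
def conv_out(w, f, p=0, s=1):
    """Spatial output size of a convolution: (W - F + 2P) / S + 1."""
    return (w - f + 2 * p) // s + 1

size = 32               # input 32x32
c1 = conv_out(size, 5)  # 5x5 convolution -> C1: 28x28
s1 = c1 // 2            # 2x2 subsampling -> S1: 14x14
c2 = conv_out(s1, 5)    # 5x5 convolution -> C2: 10x10
s2 = c2 // 2            # 2x2 subsampling -> S2: 5x5
print(c1, s1, c2, s2)
```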
25. Putting it all together – training using backpropagation
Step 1: We initialize all filters and parameters/weights with random values.
Step 2: The network takes a training image as input, goes through the forward propagation step (convolution, ReLU and pooling operations, along with forward propagation in the fully connected layer) and finds the output probabilities for each class.
Step 3: Calculate the total error at the output layer (summation over all 4 classes).
• Let's say the output probabilities for the boat image above are [0.2, 0.4, 0.1, 0.3].
• Since weights are randomly assigned for the first training example, the output probabilities are also random.
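Step 3 can be made concrete. Assuming a squared-error loss E = Σ ½(target − output)² (the slide does not name the exact loss), the total error for the example probabilities is:

```python
target = [0, 0, 1, 0]           # true class vector for the boat image
output = [0.2, 0.4, 0.1, 0.3]   # random initial output probabilities

# Total error summed over all 4 classes (assumed squared-error loss).
total_error = sum(0.5 * (t - o) ** 2 for t, o in zip(target, output))
print(total_error)  # close to 0.55
```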
26. Step 4: Use backpropagation to calculate the gradients of the error with respect to all weights in the network, and use gradient descent to update all filter values/weights and parameter values to minimize the output error.
• The weights are adjusted in proportion to their contribution to the total error.
• When the same image is input again, the output probabilities might now be [0.1, 0.1, 0.7, 0.1], which is closer to the target vector [0, 0, 1, 0].
• This means that the network has learnt to classify this particular image correctly by adjusting its weights/filters so that the output error is reduced.
• Parameters like the number of filters, the filter sizes, and the architecture of the network are all fixed before Step 1 and do not change during training; only the values of the filter matrices and connection weights get updated.
Step 5: Repeat steps 2–4 with all images in the training set.
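Step 4's gradient-descent update can be sketched for a single weight; the learning rate and the toy error function below are illustrative, not from the slides.

```python
def gradient_step(weight, gradient, learning_rate=0.1):
    """Move the weight against its error gradient."""
    return weight - learning_rate * gradient

# Toy example: minimizing E(w) = (w - 3)^2, whose gradient is 2*(w - 3).
w = 0.0
for _ in range(50):
    w = gradient_step(w, 2 * (w - 3))
# w converges toward the minimum at w = 3.
```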
27. Types of CNN Architectures
28.
Year | CNN Architecture | Developed By
1998 | LeNet | Yann LeCun et al.
2012 | AlexNet | Alex Krizhevsky, Geoffrey Hinton and Ilya Sutskever
2013 | ZFNet | Matthew Zeiler and Rob Fergus
2014 | GoogLeNet | Google
2014 | VGGNet | Simonyan and Zisserman
2015 | ResNet | Kaiming He et al.
2017 | DenseNet | Gao Huang, Zhuang Liu, Laurens van der Maaten and Kilian Q. Weinberger
29. GoogLeNet (architecture diagram)
30. Limitations
CNNs also have some drawbacks that limit their performance and applicability. One of the main disadvantages of CNNs is that they require a large amount of labeled data to train effectively, which can be costly and time-consuming to obtain and annotate.
Coordinate frame: CNNs lack a built-in notion of an object's coordinate frame, which hampers classification of images of objects in unusual positions. (Figures: classification of images with different positions; the dismantled components of a face.)