2. What do Neural Networks Learn?
Every input to a neural network is a feature.
The neural network outputs whether the data point corresponding to the input features belongs to class A or class B.
In simple terms, a neural network builds a mapping from a nonlinear combination of the inputs (features) to the outputs (class labels).
In the case of image recognition, the input is the image itself. Usually every pixel is a separate input to the neural network.
The MLP learns a nonlinear combination of pixel values to predict a class label.
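As a minimal sketch (names, shapes, and the single sigmoid hidden layer are illustrative assumptions, not from the slides), an MLP for classification treats each pixel as one input feature:

```python
import math

def mlp_forward(pixels, w_hidden, b_hidden, w_out, b_out):
    """One hidden layer of sigmoid units; returns one score per class.
    Every entry of `pixels` is a separate input feature."""
    sigmoid = lambda z: 1.0 / (1.0 + math.exp(-z))
    hidden = [sigmoid(sum(p * w for p, w in zip(pixels, ws)) + b)
              for ws, b in zip(w_hidden, b_hidden)]
    return [sum(h * w for h, w in zip(hidden, ws)) + b
            for ws, b in zip(w_out, b_out)]
```

The class with the highest output score is the predicted label.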
Amit Praseed Classification November 5, 2019 2 / 31
3.–5. Neural Networks and Image Recognition (figure slides)
6. Neural Networks and Image Recognition
MLPs perform poorly at recognizing images.
MLPs cannot learn spatial correlations between pixels within an image.
They are prone to overfitting due to the extremely large number of inputs and weights.
There is no concept of "features" beyond raw pixel values, because each input is a single pixel.
How do we specify features in an image?
7.–10. Neural Networks and Image Recognition (figure slides)
11. Neural Networks and Image Recognition
It would be marvelous if the network could learn "features" by itself...
which is exactly what a CNN does.
12. Basic Idea behind CNN (figure slide)
13. Filters and Convolutions
An image is basically a matrix of pixel values.
So it makes sense to define all operations on images in terms of operations on matrices as well.
Operations such as smoothing, sharpening, blurring, edge detection, etc. can all be defined in terms of matrix operations.
For this, an operation is represented by a smaller matrix called a filter.
The filter is moved over the image from left to right and top to bottom; at each position, the corresponding elements of the image and the filter are multiplied and added, and the resulting value becomes the pixel value of the modified image.
This operation is called convolution.
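The sliding multiply-and-add can be sketched in plain Python (a minimal illustration, not from the slides; the names `image` and `kernel` and the no-padding "valid" output size are assumptions):

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image (left to right, top to bottom),
    multiply corresponding elements, and sum them to get each output pixel."""
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    oh, ow = ih - kh + 1, iw - kw + 1   # "valid" output size, no padding
    out = [[0] * ow for _ in range(oh)]
    for r in range(oh):
        for c in range(ow):
            out[r][c] = sum(image[r + i][c + j] * kernel[i][j]
                            for i in range(kh) for j in range(kw))
    return out
```

For example, convolving with a vertical-edge kernel such as [[1, 0, -1], [1, 0, -1], [1, 0, -1]] gives large-magnitude outputs wherever pixel values change sharply from left to right.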
16.–18. Edge Detection using Convolution (figure slides)
19. Learning Low Level Features in CNN
The first step in a CNN is to learn low-level features, such as edges, from the input image.
This is done using a Convolutional Layer.
For this, a filter is moved over the entire image, as in convolution.
Are the filter weights static?
NO.
The filter weights are randomly initialized and then updated using backpropagation until they stabilize.
This means we have no idea in advance which feature (horizontal edge, vertical edge, slanted line...) a particular filter will learn to recognize.
A CNN uses a number of filters, and each filter learns to recognize a particular feature.
Each filter is said to output a Feature Map.
20. Peculiarities of the Convolutional Layer
Local Receptive Fields: not fully connected, unlike an MLP
Shared Weights: a filter learns to recognize a feature irrespective of its absolute location; far fewer parameters, hence less prone to overfitting
ReLU activation function
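The saving from shared weights can be made concrete with a back-of-the-envelope count (the 28×28 input, 30 hidden neurons, and 20 filters of size 5×5 are assumed figures for illustration, not from the slides):

```python
# Fully connected hidden layer on a 28x28 image: every one of the 784 pixels
# connects to every one of the 30 hidden neurons, plus one bias per neuron.
mlp_params = 28 * 28 * 30 + 30      # 23,550 parameters

# Convolutional layer: 20 filters, each a shared 5x5 weight matrix + 1 bias,
# reused at every image position.
cnn_params = 20 * (5 * 5 + 1)       # 520 parameters
```

The shared filter is applied at every position, so the parameter count is independent of the image size.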
24. Pooling
The convolutional layers recognize the presence of features in the image.
However, the output of these layers also contains positional information, i.e. where those features were found.
Exact positional information is usually a burden in classification: we want the relative positions of features, not the absolute position of each feature.
The Pooling Layer removes positional information from the output of the Convolutional Layers.
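Max pooling, one common way a Pooling Layer discards positional information, can be sketched as follows (a minimal illustration; the function name and the 2×2 window are assumptions):

```python
def max_pool(fmap, size=2):
    """2x2 max pooling: keep only the strongest activation in each window,
    discarding its exact position inside that window."""
    return [[max(fmap[r + i][c + j] for i in range(size) for j in range(size))
             for c in range(0, len(fmap[0]) - size + 1, size)]
            for r in range(0, len(fmap) - size + 1, size)]
```

Each 2×2 window collapses to a single value, so the feature map shrinks by a factor of two in each dimension while the strongest responses survive.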
26. That's It!!!
A CNN essentially consists of multiple convolutional and pooling layers, one after the other.
Each successive layer recognizes more sophisticated features built from the low-level features detected by the previous layers.
28. A Note on the Output Layer
While all the other layers are only partially connected, the output layer is fully connected.
The number of nodes in the output layer is usually equal to the number of classes in the classification problem. For example, if you want to classify cats, dogs, wolves and foxes, the output layer will have four nodes.
The nodes in the output layer have a special activation function, called the Softmax Activation Function.
$$a^L_j = \frac{e^{x^L_j}}{\sum_k e^{x^L_k}}$$
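The softmax formula can be sketched directly (a minimal illustration; subtracting the maximum before exponentiating is a standard numerical-stability trick, not something shown on the slide):

```python
import math

def softmax(x):
    """a_j = exp(x_j) / sum_k exp(x_k).
    Shifting by max(x) leaves the result unchanged but avoids overflow."""
    m = max(x)
    exps = [math.exp(v - m) for v in x]
    total = sum(exps)
    return [e / total for e in exps]
```

The outputs are all positive and sum to 1, which is what lets them be read as class probabilities.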
29. Softmax Activation
Softmax Activation forms a probability distribution: $a^L_j$ gives the probability that the given input belongs to class j.
Along with a new log-likelihood cost function given by
$$C = -\ln a^L_j$$
the network can counter learning slowdown as well.
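The pairing of softmax with the log-likelihood cost can be sketched as follows (illustrative names; the closing remark about the gradient is the standard textbook result, not derived on the slide):

```python
import math

def softmax(x):
    """a_j = exp(x_j) / sum_k exp(x_k), shifted by max(x) for stability."""
    m = max(x)
    exps = [math.exp(v - m) for v in x]
    total = sum(exps)
    return [e / total for e in exps]

def log_likelihood_cost(x, j):
    """C = -ln a_j, where j is the index of the true class."""
    return -math.log(softmax(x)[j])
```

With this pairing, the output-layer error simplifies to a_k − y_k, with no activation-derivative factor that could shrink toward zero, which is why learning does not slow down even when the output is badly wrong.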
30. A Note on Overfitting
Even though a CNN uses far fewer weights than an MLP, it can still suffer from overfitting.
Techniques to counter overfitting, such as regularization, validation, and acquiring new data, can still be used here.
Another technique commonly used to reduce the effects of overfitting is the use of Ensemble Classifiers.
As with Random Forests, we can take a number of neural networks (CNNs or MLPs), train them separately, and use majority voting to decide the class during testing.
However, MLPs and CNNs need far more time to train, so maintaining multiple models is infeasible.
Instead, there is a technique that uses only one physical model but trains multiple virtual models within it.
31. Dropout
The idea behind dropout is to randomly disable or drop 50% of the neurons during different stages of training.
This is done so that the neural network as a whole becomes more robust.
Virtually, we are training multiple neural networks for the same input, which can help in reducing overfitting.
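Dropout on one layer's activations can be sketched as follows (the "inverted dropout" formulation, which scales survivors by 1/(1−p) at training time so that no rescaling is needed at test time, is a common convention assumed here, not stated on the slide):

```python
import random

def dropout(activations, p=0.5, training=True):
    """During training, zero each activation with probability p and scale
    the survivors by 1/(1-p); at test time, pass activations through."""
    if not training:
        return list(activations)
    return [0.0 if random.random() < p else a / (1.0 - p)
            for a in activations]
```

Each training step samples a fresh mask, so each step effectively trains a different "virtual" sub-network of the one physical model.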
32. How does a CNN overcome the difficulties in training Deep Networks?
Learning Slowdown → Softmax activation function in the output layer + log-likelihood cost
Vanishing Gradient → ReLU activation function in the convolutional layers
Overfitting → Shared weights and biases, regularization, dropout