The document discusses Convolutional Neural Networks (CNNs). It explains that CNNs are a type of neural network that use convolutional operations in at least one layer. CNNs are well-suited for image classification and segmentation problems. The key layers in a CNN are convolutional layers, pooling layers, flattening layers, and fully connected layers. Convolutional layers act as feature extractors, pooling layers reduce spatial size, flattening layers transform pooled features into a vector, and fully connected layers are for classification.
2. DEEP LEARNING
There are multiple definitions, but they all have the following in common:
Multiple layers of processing units
Supervised or unsupervised learning of feature representations in each layer, with the layers forming a hierarchy from low-level to high-level features
3. DEEP LEARNING
Deep learning has proved to be a very powerful tool because of its ability to handle large amounts of data
Interest in using hidden layers has surpassed interest in traditional techniques, especially in pattern recognition
5. What Are CNNs?
A Convolutional Neural Network (CNN) is a multi-layer neural network
It uses a convolution operation in at least one of its layers
CNNs perform much better than traditional approaches on various image classification and segmentation problems
A CNN reduces images into a form that is easier to process, without losing features that are critical for getting a good prediction
6. Applications of CNN
Face Recognition
Image Classification
Object Detection
Segmentation
Self-driving cars that use CNN-based vision systems
And many more
7. Drawbacks of CNN
A convolutional neural network can be significantly slower than simpler models because of the many operations it performs, such as convolution and max pooling
If the CNN has several layers, training takes a long time unless the computer has a good GPU
A ConvNet requires a large dataset to train the network
9. Convolutional Layer
The convolutional layer acts as a feature extractor that extracts features of the input such as edges, corners, and endpoints
This image shows what a convolution is
We take a filter/kernel (a 3×3 matrix), slide it over the input image, and at each position sum the element-wise products to get the convolved feature
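The sliding-window step described above can be sketched in a few lines. This is a minimal illustration, not a production implementation: the function name convolve2d, the 5×5 input, and the kernel values are assumptions chosen for the example, and it uses no padding and a stride of 1. As in most deep learning libraries, the kernel is not flipped, so strictly speaking this computes a cross-correlation.

```python
def convolve2d(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and sum the products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Multiply the kernel with the patch under it, element by element.
            total = 0
            for di in range(kh):
                for dj in range(kw):
                    total += image[i + di][j + dj] * kernel[di][dj]
            row.append(total)
        output.append(row)
    return output


# Illustrative 5x5 binary image and 3x3 kernel (values are assumptions).
image = [
    [1, 1, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 0],
    [0, 1, 1, 0, 0],
]
kernel = [
    [1, 0, 1],
    [0, 1, 0],
    [1, 0, 1],
]

convolved = convolve2d(image, kernel)
for row in convolved:
    print(row)
```

A 5×5 input with a 3×3 kernel gives a 3×3 convolved feature, because the kernel fits in only three positions along each axis.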
13. Normalization
Once we get the output, we apply the Rectified Linear Unit (ReLU) activation function to each element of the output
After applying ReLU, a stack of images becomes a stack of images with no negative values
ReLU(y) = max(y, 0)
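As a small sketch of the rule above, ReLU can be applied element by element to a feature map. The function name relu and the sample values are assumptions made for illustration.

```python
def relu(feature_map):
    """Replace every negative value with 0 and keep the rest unchanged."""
    return [[max(value, 0) for value in row] for row in feature_map]


# Illustrative feature map containing negative values (values are assumptions).
feature_map = [
    [-3, 2, 0],
    [5, -1, 4],
]

activated = relu(feature_map)
print(activated)
```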
14. Pooling Layer
The pooling layer is responsible for reducing the spatial size of the convolved feature
Reducing the dimensions decreases the computation required to process the data
There are two types of pooling:
Max pooling
Average pooling
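How much the spatial size shrinks depends on the filter size and the stride. The helper below is a hedged sketch of the usual output-size rule for pooling without padding. The name pooled_size and the sample sizes are assumptions, not something taken from the slides.

```python
def pooled_size(input_size, filter_size, stride):
    """Number of window positions along one axis when no padding is used."""
    return (input_size - filter_size) // stride + 1


# A 5-wide input with a 3-wide filter and stride 1 (illustrative sizes).
size_stride_1 = pooled_size(5, 3, 1)

# A 4-wide input with a 2-wide filter and stride 2 (illustrative sizes).
size_stride_2 = pooled_size(4, 2, 2)

print(size_stride_1, size_stride_2)
```

A larger stride skips positions, so it reduces the output size faster than a stride of 1.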
15. Max Pooling
The most common approach used in pooling is max pooling
In max pooling, we take the maximum value from the portion of the image covered by the kernel
Max pooling also acts as a noise suppressant
Pooling filter size = 3 × 3
Stride = 1
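Max pooling with the 3 × 3 filter and stride of 1 given above can be sketched as follows. The function name max_pool2d and the 5×5 feature map values are assumptions chosen for illustration.

```python
def max_pool2d(feature_map, size=3, stride=1):
    """Keep the largest value in each size x size window of the feature map."""
    out_h = (len(feature_map) - size) // stride + 1
    out_w = (len(feature_map[0]) - size) // stride + 1
    pooled = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Collect the values covered by the window at this position.
            window = [
                feature_map[i * stride + di][j * stride + dj]
                for di in range(size)
                for dj in range(size)
            ]
            row.append(max(window))
        pooled.append(row)
    return pooled


# Illustrative 5x5 feature map (values are assumptions).
feature_map = [
    [1, 3, 2, 0, 1],
    [4, 6, 5, 1, 2],
    [7, 2, 0, 3, 0],
    [1, 0, 4, 8, 5],
    [2, 1, 3, 6, 9],
]

max_pooled = max_pool2d(feature_map, size=3, stride=1)
for row in max_pooled:
    print(row)
```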
16. Avg Pooling
Average pooling returns the average of all the values from the portion of the image covered by the kernel
Average pooling suppresses noise only as a side effect of dimensionality reduction, whereas max pooling discards noisy activations altogether
For this reason, max pooling often performs better than average pooling
Pooling filter size = 3 × 3
Stride = 1
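For comparison, here is a minimal sketch of average pooling with the same 3 × 3 filter and stride of 1. The function name avg_pool2d and the feature map values are assumptions made for the example.

```python
def avg_pool2d(feature_map, size=3, stride=1):
    """Average the values in each size x size window of the feature map."""
    out_h = (len(feature_map) - size) // stride + 1
    out_w = (len(feature_map[0]) - size) // stride + 1
    pooled = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Collect the values covered by the window at this position.
            window = [
                feature_map[i * stride + di][j * stride + dj]
                for di in range(size)
                for dj in range(size)
            ]
            row.append(sum(window) / len(window))
        pooled.append(row)
    return pooled


# Illustrative 5x5 feature map (values are assumptions).
feature_map = [
    [1, 3, 2, 0, 1],
    [4, 6, 5, 1, 2],
    [7, 2, 0, 3, 0],
    [1, 0, 4, 8, 5],
    [2, 1, 3, 6, 9],
]

avg_pooled = avg_pool2d(feature_map, size=3, stride=1)
for row in avg_pooled:
    print([round(value, 2) for value in row])
```

Every value in a window contributes to its average, so a single strong activation is diluted rather than kept as it would be with max pooling.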
18. Flattening Layer
Once the pooled feature map is obtained, the next step is to flatten it
We flatten the output of the convolutional layers to create a single long feature vector
Flattening transforms the entire pooled feature map matrix into a single column, which is then fed to the neural network for processing
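Flattening a single pooled feature map can be sketched as reading the matrix row by row into one list. The function name flatten and the sample values are assumptions. With a stack of feature maps, the same idea would be applied to each map and the results joined end to end.

```python
def flatten(pooled_feature_map):
    """Read the matrix row by row into a single feature vector."""
    return [value for row in pooled_feature_map for value in row]


# Illustrative 3x3 pooled feature map (values are assumptions).
pooled = [
    [7, 6, 5],
    [7, 8, 8],
    [7, 8, 9],
]

feature_vector = flatten(pooled)
print(feature_vector)
```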
19. Flattening
We literally flatten the pooled feature map into a column, as in the image below
The flattened column is then connected to the final classification model, which is called a fully connected layer
21. Fully Connected Layer
The fully connected layer is a traditional Multi-Layer Perceptron
The term “Fully Connected” implies that every neuron in the previous layer is connected to every neuron in the next layer
The purpose of the fully connected layer is to use the high-level features to classify the input image into various classes based on the training dataset
The fully connected layer acts as the classifier
22. Fully Connected
The sum of the products of inputs and weights at each output node determines the final prediction
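The weighted sum at each output node can be sketched as below. This is only an illustration: the function name fully_connected, the weights, and the inputs are made-up values, and the bias term is an assumption that the slides do not mention. The class with the highest score is taken as the prediction.

```python
def fully_connected(inputs, weights, biases):
    """One score per output node: sum of input * weight products, plus a bias."""
    return [
        sum(x * w for x, w in zip(inputs, node_weights)) + bias
        for node_weights, bias in zip(weights, biases)
    ]


# Illustrative flattened feature vector (values are assumptions).
inputs = [1.0, 2.0, 3.0]

# One row of weights per output node, i.e. per class (values are assumptions).
weights = [
    [0.5, -1.0, 0.0],  # output node for class 0
    [0.0, 0.5, 1.0],   # output node for class 1
]
biases = [0.0, -1.0]  # bias per output node (an assumption, not in the slides)

scores = fully_connected(inputs, weights, biases)
predicted_class = scores.index(max(scores))
print(scores, predicted_class)
```

In a trained network the weights and biases are learned from the training dataset rather than written by hand.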