Roozbeh Sanaei
Convolutional Neural Networks
Convolution
Kernel = Filter = Feature Detector
https://stackoverflow.com/questions/52067833/how-to-plot-an-animated-matrix-in-matplotlib
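As a minimal sketch of what a kernel does, the function below slides a small kernel over a 2D image and sums the element-wise products at each position. The names `conv2d`, `image`, and `vertical_edge` and the 4x4 example values are illustrative assumptions, not taken from the slides. Like most deep learning libraries, the sketch does not flip the kernel, so strictly speaking it computes a cross-correlation.

```python
def conv2d(image, kernel):
    """Valid (no padding), stride-1 sliding-window product of a 2D image and kernel."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Multiply the kernel element-wise with the patch under it, then sum.
            row.append(sum(image[i + di][j + dj] * kernel[di][dj]
                           for di in range(kh) for dj in range(kw)))
        output.append(row)
    return output


# Hypothetical 4x4 image: bright left half, dark right half.
image = [[10, 10, 0, 0]] * 4
# A vertical edge detector: responds where intensity drops from left to right.
vertical_edge = [[1, 0, -1]] * 3
response = conv2d(image, vertical_edge)
```

On a uniform image the same kernel returns zeros everywhere, which is why it acts as a detector for one specific feature.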
Convolution Over Volumes
https://towardsdatascience.com/the-most-intuitive-and-easiest-guide-for-convolutional-neural-network-3607be47480
Convolution Over Volumes
CNNs: Kernel = Filter = Feature Detector
https://mmuratarat.github.io/2019-01-17/implementing-padding-schemes-of-tensorflow-in-python
Padding
https://ai.stackexchange.com/questions/17004/convolutional-neural-network-does-each-filter-in-each-convolution-layer-create
Strided Convolution
https://ai.stackexchange.com/questions/17004/convolutional-neural-network-does-each-filter-in-each-convolution-layer-create
Figure panels: stride-1 convolution ("non-strided") and stride-2 convolution ("strided")
Convolution Operation Arithmetic
https://towardsdatascience.com/pytorch-basics-how-to-train-your-neural-net-intro-to-cnn-26a14c2ea29
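The arithmetic behind padding and stride can be sketched with the standard output-size relation: for input size n, kernel size f, padding p on each side, and stride s, the output size is floor((n + 2p - f) / s) + 1. The function and parameter names below are illustrative assumptions, not taken from the slides.

```python
def conv_output_size(n, f, padding=0, stride=1):
    """Output size along one spatial dimension of a convolution.

    n: input size, f: kernel size, padding: zeros added on each side,
    stride: step between consecutive kernel positions.
    """
    return (n + 2 * padding - f) // stride + 1


def same_padding(f):
    """Padding that keeps the size unchanged at stride 1 (odd kernel sizes only)."""
    return (f - 1) // 2


valid_size = conv_output_size(6, 3)                           # no padding
same_size = conv_output_size(6, 3, padding=same_padding(3))   # size preserved
strided_size = conv_output_size(7, 3, stride=2)               # stride-2 convolution
```

The same relation applies to pooling layers, with f taken as the pooling window size.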
Max Pooling and Average Pooling Layers
Yani, Muhamad. "Application of transfer learning using convolutional neural network method for early detection of terry’s nail." Journal
of Physics: Conference Series. Vol. 1201. No. 1. IOP Publishing, 2019.
Why Convolutions?
Coursera Deep Learning Specialization, Convolutional Neural Networks
Parameter Sharing: A feature detector (such as a vertical edge detector)
that is useful in one part of the image is probably useful in another part
of the image.
Sparsity of Connections: In each layer, each output value depends only
on a small number of inputs.
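To make the saving from parameter sharing and sparse connections concrete, the sketch below counts the parameters of a convolutional layer and of a fully connected layer that produce an output of the same size. The layer sizes (a 32x32x3 input, six 5x5 filters, a 28x28x6 output) and the function names are assumptions chosen for illustration.

```python
def conv_params(f, c_in, c_out):
    """Weights plus one bias per filter for c_out filters of size f x f x c_in."""
    return (f * f * c_in + 1) * c_out


def dense_params(n_in, n_out):
    """Weights plus one bias per output unit for a fully connected layer."""
    return (n_in + 1) * n_out


# Assumed example: 32x32x3 input mapped to a 28x28x6 output.
conv_count = conv_params(f=5, c_in=3, c_out=6)
dense_count = dense_params(n_in=32 * 32 * 3, n_out=28 * 28 * 6)
```

Under these assumptions the convolutional layer needs a few hundred parameters, while the fully connected layer needs more than fourteen million.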
LeNet-5
Yani, Muhamad. "Application of transfer learning using convolutional neural network method for early detection of
terry’s nail." Journal of Physics: Conference Series. Vol. 1201. No. 1. IOP Publishing, 2019.
Overlap Pooling
http://datahacker.rs/tf-alexnet/
https://neurohive.io/en/popular-networks/alexnet-imagenet-classification-with-deep-convolutional-neural-networks/
AlexNet
https://androidkt.com/keras-vgg16-model-example/
VGG-16
https://realelectricwizard.wordpress.com/2019/03/30/the-mystery-of-vanishing-and-exploding-gradients/
Vanishing Gradient Problem
https://towardsdatascience.com/residual-blocks-building-blocks-of-resnet-fd90ca15d6ec
Residual Blocks
https://towardsdatascience.com/residual-blocks-building-blocks-of-resnet-fd90ca15d6ec
Residual Networks
https://towardsdatascience.com/a-comprehensive-introduction-to-different-types-of-convolutions-in-deep-learning-669281e58215
1×1 Convolution
https://www.kdnuggets.com/2017/08/intuitive-guide-deep-network-architectures.html
Inception Module
http://datahacker.rs/building-inception-network/
Inception Network
https://towardsdatascience.com/a-comprehensive-introduction-to-different-types-of-convolutions-in-deep-learning-669281e58215
Spatially Separable Convolution
https://towardsdatascience.com/a-comprehensive-introduction-to-different-types-of-convolutions-in-deep-learning-669281e58215
Depthwise separable convolution
https://towardsdatascience.com/a-comprehensive-introduction-to-different-types-of-convolutions-in-deep-learning-669281e58215
MobileNet V1 and V2
https://ai.googleblog.com/2019/05/efficientnet-improving-accuracy-and.html
EfficientNet
https://medium.com/analytics-vidhya/weakly-supervised-learning-for-object-localization-4b73d4f4f4a6
Object Classification, Localization, Detection, Segmentation
https://towardsdatascience.com/evolution-of-object-detection-and-localization-algorithms-e241021d8bad
Object Classification and Localization
https://stackoverflow.com/questions/50575301/yolo-object-detection-how-does-the-algorithm-predict-bounding-boxes-larger-than
YOLO
http://datahacker.rs/deep-learning-intersection-over-union/
Ground Truth
Prediction
https://www.analyticsvidhya.com/blog/2020/08/selecting-the-right-bounding-box-using-non-max-suppression-with-implementation/
Ground Truth
Prediction
Non-Max Suppression
Step 1: Select the box with the highest objectness score
Step 2: Compute the overlap (intersection over union, IoU) of this box with every other box
Step 3: Remove the bounding boxes whose IoU with the selected box is greater than 50%
Step 4: Move to the remaining box with the next-highest objectness score
Step 5: Repeat steps 2-4 until no boxes remain
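The steps above can be sketched in plain Python as follows. The corner format (x1, y1, x2, y2), the function names, and the example boxes and scores are assumptions made for illustration; production detectors normally use a vectorised library routine instead.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(boxes, scores, iou_threshold=0.5):
    """Return the indices of the boxes kept, highest score first."""
    # Candidates sorted by descending score.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)          # Step 1 / Step 4: highest remaining score
        keep.append(best)
        # Steps 2-3: drop every remaining box that overlaps the selected one too much.
        order = [i for i in order if iou(boxes[best], boxes[i]) <= iou_threshold]
    return keep


# Hypothetical detections: two overlapping boxes and one separate box.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = non_max_suppression(boxes, scores)
```

In this example the second box overlaps the first with an IoU of about 0.68, so it is suppressed, while the third box does not overlap and is kept.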
https://towardsdatascience.com/region-proposal-network-a-detailed-view-1305c7875853
Ground Truth
Prediction
Anchor Points/Boxes
1. Pre-train a CNN on image classification
2. Propose class-independent ROI candidates through selective search
3. Warp each ROI candidate to the input size the CNN requires
4. Fine-tune the CNN on the warped ROIs for K + 1 classes (the additional class is background)
5. Feed the CNN-generated features to a binary SVM for each class
6. Train a regression model to correct the predicted window
RCNN
Generate region proposals that may contain objects:
1. Create the initial regions through image segmentation
2. Iteratively:
   a. Calculate the similarities between all neighboring regions
   b. Merge the two most similar regions
Selective Search
Predicted bounding box coordinates
Ground truth box coordinates
As a result, each bounding box correction function
can take any real value in (-∞, +∞).
Bounding Box Regression
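The correction functions are not reproduced in the text, so the sketch below assumes the parameterisation used in the R-CNN paper: centre offsets scaled by the predicted box size, and log-ratios for width and height. Boxes are assumed to be (centre x, centre y, width, height); the names `encode` and `decode` and the example boxes are illustrative.

```python
import math


def encode(p, g):
    """Regression targets that map predicted box p onto ground truth box g."""
    px, py, pw, ph = p
    gx, gy, gw, gh = g
    return ((gx - px) / pw,        # t_x: scale-invariant shift in x
            (gy - py) / ph,        # t_y: scale-invariant shift in y
            math.log(gw / pw),     # t_w: log-space width ratio
            math.log(gh / ph))     # t_h: log-space height ratio


def decode(p, t):
    """Apply corrections t to predicted box p."""
    px, py, pw, ph = p
    tx, ty, tw, th = t
    return (px + pw * tx, py + ph * ty, pw * math.exp(tw), ph * math.exp(th))


# Hypothetical predicted and ground truth boxes.
predicted = (50.0, 50.0, 20.0, 40.0)
ground_truth = (55.0, 48.0, 30.0, 30.0)
targets = encode(predicted, ground_truth)
recovered = decode(predicted, targets)
```

Because the shifts are divided by the box size and the scale terms pass through a logarithm, none of the four targets is bounded, which matches the unbounded range stated above.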
• Find false positive samples during the training loops
• Include them in the training data to improve the classifier
Hard Negative Mining
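One possible way to pick the false positives is sketched below: among samples whose true label is background (0), select those the classifier scores above a threshold, most confident first. The function name, the threshold of 0.5, and the example scores are assumptions for illustration only.

```python
def mine_hard_negatives(scores, labels, threshold=0.5, max_samples=None):
    """Indices of background samples that the classifier wrongly scores as objects.

    scores: predicted object probabilities; labels: 1 for object, 0 for background.
    """
    false_positives = [i for i, (s, y) in enumerate(zip(scores, labels))
                       if y == 0 and s >= threshold]
    # The most confidently wrong samples are the "hardest" negatives.
    false_positives.sort(key=lambda i: scores[i], reverse=True)
    if max_samples is not None:
        false_positives = false_positives[:max_samples]
    return false_positives


# Hypothetical classifier outputs for five samples.
scores = [0.9, 0.2, 0.7, 0.95, 0.6]
labels = [1, 0, 0, 0, 1]
hard = mine_hard_negatives(scores, labels)
```

The returned samples would then be added to the training set as explicit negatives before the next training round.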
Instead of extracting a separate CNN feature vector for each region proposal, all proposals
share one CNN forward pass over the entire image. The pre-trained CNN is
altered as follows:
• The last max pooling layer of the pre-trained CNN is replaced with an RoI pooling layer.
• The last fully connected layer and the last softmax layer (K classes) are replaced with a fully connected layer and a
softmax over K + 1 classes.
• The last layer also branches out to a bounding-box regression model, which predicts offsets relative to the original
RoI for each of the K classes.
Fast RCNN
The loss function sums the cost of classification and the cost of bounding box prediction; the bounding box
term is ignored for the "background" class
RCNN Loss function
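The formula itself is not reproduced in the text. One form that matches the description is the multi-task loss of the Fast R-CNN paper, given here as an assumption about what the slide shows: p is the predicted class distribution, u the true class (u = 0 for background), t^u the predicted box offsets for class u, v the target offsets, and λ a balancing weight.

```latex
\begin{aligned}
L(p, u, t^{u}, v) &= L_{\text{cls}}(p, u) + \lambda \, [u \ge 1] \, L_{\text{loc}}(t^{u}, v) \\
L_{\text{cls}}(p, u) &= -\log p_{u} \\
L_{\text{loc}}(t^{u}, v) &= \sum_{i \in \{x, y, w, h\}} \operatorname{smooth}_{L_1}\!\left(t^{u}_{i} - v_{i}\right) \\
\operatorname{smooth}_{L_1}(x) &=
\begin{cases}
0.5\,x^{2} & \text{if } |x| < 1 \\
|x| - 0.5 & \text{otherwise}
\end{cases}
\end{aligned}
```

The indicator [u ≥ 1] equals 0 for the background class, which is how the bounding box term is switched off for background regions.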
https://deepsense.ai/region-of-interest-pooling-explained/
ROI Pooling
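A minimal single-channel sketch of ROI max pooling follows: the region is divided into a fixed grid of bins and the maximum of each bin is kept, so regions of any size yield an output of the same size. The function name, the end-exclusive ROI format, and the bin-edge rule (floor for the start, ceiling for the end) are assumptions; library implementations differ in how they round bin edges.

```python
def roi_max_pool(feature_map, roi, out_h, out_w):
    """Max-pool one region of interest into a fixed out_h x out_w grid.

    feature_map: 2D list of activations (a single channel).
    roi: (row0, col0, row1, col1) in feature-map cells, end-exclusive.
    Assumes the ROI covers at least one cell.
    """
    row0, col0, row1, col1 = roi
    h, w = row1 - row0, col1 - col0
    pooled = []
    for i in range(out_h):
        # Bin edges: floor for the start, ceiling for the end, so no bin is empty.
        r_start = row0 + (i * h) // out_h
        r_end = row0 + ((i + 1) * h + out_h - 1) // out_h
        out_row = []
        for j in range(out_w):
            c_start = col0 + (j * w) // out_w
            c_end = col0 + ((j + 1) * w + out_w - 1) // out_w
            out_row.append(max(feature_map[r][c]
                               for r in range(r_start, r_end)
                               for c in range(c_start, c_end)))
        pooled.append(out_row)
    return pooled


# Hypothetical 4x4 feature map.
feature_map = [[1, 2, 3, 4],
               [5, 6, 7, 8],
               [9, 10, 11, 12],
               [13, 14, 15, 16]]
full = roi_max_pool(feature_map, (0, 0, 4, 4), 2, 2)
partial = roi_max_pool(feature_map, (1, 1, 4, 4), 2, 2)
```

Both calls return a 2x2 grid even though the second region is 3x3 and does not divide evenly into bins.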
https://arxiv.org/pdf/1611.10012.pdf
Region Proposal Network
https://towardsdatascience.com/region-proposal-network-a-detailed-view-1305c7875853
Anchor point: every point in the feature map generated by the backbone network
Anchor boxes: candidate boxes generated at every anchor point with different scales and aspect ratios
Anchor Points/Boxes
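The relationship between anchor points and anchor boxes can be sketched as follows: every feature-map cell is mapped back to a centre in image coordinates, and one box per scale and aspect ratio is placed there. The function name, the stride of 16, and the default scales and ratios are assumptions chosen for illustration, not values stated on the slides.

```python
import math


def generate_anchors(fm_h, fm_w, stride, scales=(128, 256, 512), ratios=(0.5, 1.0, 2.0)):
    """Anchor boxes (x1, y1, x2, y2) in image coordinates for an fm_h x fm_w feature map."""
    anchors = []
    for row in range(fm_h):
        for col in range(fm_w):
            # Anchor point: centre of this feature-map cell in the input image.
            cx = (col + 0.5) * stride
            cy = (row + 0.5) * stride
            for scale in scales:
                for ratio in ratios:
                    # ratio = width / height; the area stays close to scale ** 2.
                    w = scale * math.sqrt(ratio)
                    h = scale / math.sqrt(ratio)
                    anchors.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return anchors


anchors = generate_anchors(fm_h=2, fm_w=3, stride=16)
```

With three scales and three ratios, each anchor point contributes nine boxes, so the 2x3 feature map above produces 54 anchors.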
https://towardsdatascience.com/region-proposal-network-a-detailed-view-1305c7875853
Region Proposal Network (RPN)
https://towardsdatascience.com/region-proposal-network-a-detailed-view-1305c7875853
Full Network Architecture
Source: "Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks" (local slide PDF)
RPN Loss
(SSD)
Feature Pyramid
Reconstructed layers are semantically strong, but their localization is inaccurate
due to down-sampling and up-sampling.
Lateral connections between the reconstructed layers and the corresponding
feature maps improve localization.
Feature Pyramid
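One top-down step with a lateral connection can be sketched on single-channel maps: the coarser reconstructed map is upsampled by a factor of 2 and added element-wise to the finer feature map from the backbone. This is a simplification under stated assumptions: real feature pyramid networks typically apply a 1x1 convolution to the lateral input and a 3x3 convolution after the sum, both omitted here, and the function names are illustrative.

```python
def upsample2x(feature_map):
    """Nearest-neighbour 2x upsampling of a 2D list."""
    upsampled = []
    for row in feature_map:
        wide = [value for value in row for _ in range(2)]
        upsampled.append(wide)
        upsampled.append(list(wide))
    return upsampled


def merge_top_down(top_down, lateral):
    """Add the upsampled coarse map to the finer lateral map, element-wise."""
    up = upsample2x(top_down)
    return [[u + l for u, l in zip(up_row, lat_row)]
            for up_row, lat_row in zip(up, lateral)]


# Hypothetical 1x1 coarse map and 2x2 backbone feature map.
merged = merge_top_down([[1]], [[10, 20], [30, 40]])
```

The sum combines the coarse map's semantic signal with the spatial detail that the finer map still carries.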
Transposed Convolution
https://towardsdatascience.com/transposed-convolution-demystified-84ca81b4baba
UNet
https://towardsdatascience.com/unet-line-by-line-explanation-9b191c76baf5
Siamese network and binary classification
https://www.pyimagesearch.com/2020/11/30/siamese-networks-with-keras-tensorflow-and-deep-learning/
Siamese network and triplet loss
https://omoindrot.github.io/triplet-loss
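As a minimal sketch, the triplet loss for one (anchor, positive, negative) triple of embeddings is max(d(a, p) - d(a, n) + margin, 0). The use of squared Euclidean distance, the margin of 0.2, and the function names are assumptions made for illustration.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two embedding vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Zero once the negative is farther from the anchor than the positive by the margin."""
    return max(squared_distance(anchor, positive)
               - squared_distance(anchor, negative) + margin, 0.0)


# Hypothetical 2D embeddings.
easy = triplet_loss([0.0, 0.0], [0.0, 1.0], [3.0, 4.0])   # negative already far away
hard = triplet_loss([0.0, 0.0], [3.0, 4.0], [0.0, 1.0])   # negative closer than positive
```

Only triples that violate the margin contribute to the loss, so training concentrates on pairs the embedding still confuses.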
Gram Matrix
https://towardsdatascience.com/introduction-to-neural-style-transfer-with-tensorflow-99915a5d624a
Style Transfer
https://towardsdatascience.com/introduction-to-neural-style-transfer-with-tensorflow-99915a5d624a
Figure labels: content image, style image, generated image
G represents the Gram matrix
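As a sketch of how a Gram matrix is computed, each entry G[i][j] below is the inner product of the flattened activations of channels i and j of one layer, so G records which features tend to fire together regardless of position. The input layout (a list of channels, each a flat list of H*W activations) and the function name are assumptions; some implementations also divide by the number of positions.

```python
def gram_matrix(features):
    """Channel-by-channel inner products of flattened feature maps.

    features: list of C channels, each a flat list of H*W activations.
    Returns a C x C symmetric matrix as nested lists.
    """
    return [[sum(a * b for a, b in zip(channel_i, channel_j))
             for channel_j in features]
            for channel_i in features]


# Hypothetical layer output with 2 channels and 2 spatial positions.
G = gram_matrix([[1, 2], [3, 4]])
```

A style loss would then compare the Gram matrices of the style image and the generated image layer by layer.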
