Convolutional Neural Networks (CNNs) have long
been the go-to model for object recognition
• They’re strong models that are easy to control and even
easier to train.
• They don’t overfit at alarming levels when trained
on millions of images.
• The only problem: they’re expensive to apply to
high-resolution images.
AlexNet
• The architecture consists of eight layers: five convolutional layers and
three fully-connected layers.
• ReLU Nonlinearity. AlexNet uses Rectified Linear Units (ReLU)
instead of the tanh function, which was standard at the time. ReLU’s
advantage is in training time: a CNN using ReLU reached 25% training
error on the CIFAR-10 dataset six times faster than an equivalent
CNN using tanh.
• Overlapping Pooling. CNNs traditionally “pool” outputs of
neighboring groups of neurons with no overlap. However, when the
authors introduced overlap (a pooling window larger than its stride),
they saw error drop by about 0.5% and found that models with
overlapping pooling are slightly harder to overfit.
The Overfitting Problem. AlexNet had 60 million parameters, making overfitting a major
concern. Two methods were employed to reduce overfitting:
• Data Augmentation. The authors used label-preserving transformations to make their
data more varied. Specifically, they generated image translations and horizontal
reflections, which increased the training set by a factor of 2048. They also performed
Principal Component Analysis (PCA) on the RGB pixel values to alter the intensities of
the RGB channels, which reduced the top-1 error rate by more than 1%.
• Dropout. This technique consists of “turning off” neurons with a predetermined
probability (e.g. 50%). Every training iteration therefore samples a different
sub-network, which forces each neuron to learn more robust features that remain useful
in combination with many different subsets of the other neurons. However, dropout also
increases the training time needed for the model to converge.
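Dropout itself fits in a few lines. The sketch below is a hypothetical pure-Python version using “inverted” dropout scaling (surviving activations are divided by the keep probability, the modern equivalent of the original paper’s test-time rescaling); the names and values are made up for illustration.

```python
import random

def dropout(activations, p=0.5, training=True, rng=random):
    # During training, zero each activation with probability p and
    # scale survivors by 1/(1-p) ("inverted dropout"), so the expected
    # activation is unchanged and test time needs no rescaling.
    if not training:
        return list(activations)
    keep = 1.0 - p
    return [a / keep if rng.random() < keep else 0.0
            for a in activations]

acts = [0.3, 1.2, 0.8, 2.0]
print(dropout(acts, p=0.5, rng=random.Random(0)))  # roughly half zeroed, rest doubled
print(dropout(acts, p=0.5, training=False))        # unchanged at test time
```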
What Now? AlexNet is an incredibly powerful model capable of achieving high accuracies
on very challenging datasets. However, removing any of the convolutional layers
drastically degrades its performance. AlexNet is a leading architecture for
object-recognition tasks and has huge applications across the computer vision sector of
artificial intelligence. In the future, deep CNNs like AlexNet may be adopted even more
widely for image tasks.
VGGNet Architecture
There are a total of 13 convolutional layers and 3 fully connected layers in the VGG16
architecture. VGG uses smaller filters (3×3) with more depth instead of large filters:
a stack of three 3×3 convolutional layers has the same effective receptive field as a
single 7×7 convolutional layer, but with fewer parameters and more nonlinearities.
Another variation of VGGNet, VGG19, has 19 weight layers, consisting of 16 convolutional
layers and 3 fully connected layers, with the same 5 pooling layers. Both variations of
VGGNet end with two fully connected layers of 4096 channels each, followed by another
fully connected layer with 1000 channels to predict the 1000 labels. The last fully
connected layer feeds a softmax for classification.
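The claim about 3×3 stacks can be checked with simple parameter arithmetic. A minimal sketch, assuming 256 input and output channels (an arbitrary example) and a made-up helper `conv_params`:

```python
def conv_params(k, c_in, c_out, bias=True):
    # One k x k convolution layer holds k*k*c_in*c_out weights (+ biases).
    return k * k * c_in * c_out + (c_out if bias else 0)

c = 256  # example channel count; input and output channels kept equal

one_7x7   = conv_params(7, c, c)      # 3,211,520 parameters
three_3x3 = 3 * conv_params(3, c, c)  # 1,770,240 parameters
print(one_7x7, three_3x3)
# Same 7x7 effective receptive field, but the 3x3 stack needs roughly
# 45% fewer parameters (and applies three nonlinearities instead of one).
```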
Architecture walkthrough:
• The first two layers are convolutional layers with 3×3 filters; both use 64 filters,
and since “same” convolutions are used, the result is a 224×224×64 volume. The
filters are always 3×3 with a stride of 1.
• After this, a max-pooling layer of size 2×2 and stride 2 reduces the height and
width of the volume from 224×224×64 to 112×112×64.
• This is followed by 2 more convolutional layers with 128 filters, resulting in a
new dimension of 112×112×128.
• After another pooling layer, the volume is reduced to 56×56×128.
• Three more convolutional layers with 256 filters each are added, followed by a
down-sampling (max-pool) layer that reduces the size to 28×28×256.
• Two more stacks, each of 3 convolutional layers with 512 filters, are separated by
a max-pool layer.
• After the final pooling layer, the 7×7×512 volume is flattened into fully connected
(FC) layers with 4096 channels and a softmax output over 1000 classes.
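The walkthrough above can be double-checked by tracing the spatial size through VGG16’s five convolutional blocks; a minimal sketch (the block and channel counts come from the text, the function name is made up):

```python
def vgg16_shapes(size=224):
    # (number of 3x3 "same" convs, filters) per VGG16 block.
    blocks = [(2, 64), (2, 128), (3, 256), (3, 512), (3, 512)]
    shapes = []
    for n_convs, channels in blocks:
        # The n_convs same-padded 3x3 convolutions leave `size` unchanged;
        # the 2x2 stride-2 max-pool at the end of each block halves it.
        size //= 2
        shapes.append((size, size, channels))
    return shapes

print(vgg16_shapes())
# [(112, 112, 64), (56, 56, 128), (28, 28, 256), (14, 14, 512), (7, 7, 512)]
# The final (7, 7, 512) volume is what gets flattened into the FC layers.
```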
CNN-AlexNet-ResNet presentation describing their architectures