ResNet
By Ashwin Joseph Kurian
Why ResNet?
• ResNets address the well-known vanishing gradient problem.
• With ResNets, gradients can flow directly backwards through the skip connections, from the later layers to the initial filters.
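A quick way to see why (a worked illustration, not from the slides): for a residual block y = F(x) + x, the chain rule gives the gradient a direct identity path around the stacked layers.

```latex
% For a residual block y = F(x) + x, backpropagating a loss L gives:
\frac{\partial L}{\partial x}
  = \frac{\partial L}{\partial y}\left(\frac{\partial F}{\partial x} + I\right)
% The identity term I carries the gradient straight back to earlier layers,
% so it does not vanish even when \partial F / \partial x is small.
```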
ResNet Architecture
• The ResNet (the one on the right in the accompanying figure) consists of one convolution and pooling step (shown in orange) followed by 4 layers of similar behavior.
Different Layers used in ResNet:
• Conv2D
• Batch Normalization
• Activation (ReLU)
• Zero Padding
• Max Pooling
• Global Average Pooling
• Dense
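These names correspond directly to layer classes in Keras. A minimal sketch of the mapping (assuming TensorFlow/Keras, which the slide's layer names suggest but do not state; all sizes are placeholders):

```python
# Hypothetical one-to-one mapping of the listed layer types onto tf.keras classes.
from tensorflow.keras import layers

conv    = layers.Conv2D(64, kernel_size=3, padding="same")  # Conv2D
bn      = layers.BatchNormalization()                       # Batch Normalization
relu    = layers.Activation("relu")                         # Activation (ReLU)
pad     = layers.ZeroPadding2D(padding=1)                   # Zero Padding
maxpool = layers.MaxPooling2D(pool_size=3, strides=2)       # Max Pooling
gap     = layers.GlobalAveragePooling2D()                   # Global Average Pooling
dense   = layers.Dense(1000, activation="softmax")          # Dense (classifier head)
```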
Shortcut connections
• The formulation F(x) + x can be realized by feed-forward neural networks with “shortcut connections” (Fig. 2). Shortcut connections are those that skip one or more layers. In our case, the shortcut connections simply perform identity mapping, and their outputs are added to the outputs of the stacked layers.
• Identity shortcut connections add neither extra parameters nor computational complexity.
• They help models converge earlier. [2]
• They help increase the accuracy of deep networks. [2]
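A minimal sketch of such a block in Keras (my own illustration; filter counts are placeholders): the stacked layers compute F(x), and the Add layer realizes F(x) + x.

```python
# Sketch of an identity shortcut: output = F(x) + x.
from tensorflow.keras import layers

def identity_block(x, filters=64):
    """Two stacked 3x3 convolutions compute F(x); x is added back unchanged."""
    f = layers.Conv2D(filters, 3, padding="same")(x)
    f = layers.BatchNormalization()(f)
    f = layers.Activation("relu")(f)
    f = layers.Conv2D(filters, 3, padding="same")(f)
    f = layers.BatchNormalization()(f)
    out = layers.Add()([f, x])  # the shortcut: no parameters, no extra compute
    return layers.Activation("relu")(out)
```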
About the plain network:
• The convolutional layers mostly have 3×3 filters and
follow two simple design rules:
• (i) for the same output feature map size, the layers
have the same number of filters
• (ii) if the feature map size is halved, the number of
filters is doubled so as to preserve the time complexity
per layer.
• Downsampling is performed directly by convolutional layers that have a stride of 2.
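Rule (ii) is what keeps the per-layer cost constant; the arithmetic checks out as follows (a worked illustration):

```latex
% Cost of a k x k convolution on an H x W map with C_in -> C_out channels:
%   \text{cost} \propto H \cdot W \cdot C_{in} \cdot C_{out} \cdot k^2
% Halving the feature map size while doubling the filters:
%   \frac{H}{2} \cdot \frac{W}{2} \cdot (2C_{in})(2C_{out}) \cdot k^2
%     = H \cdot W \cdot C_{in} \cdot C_{out} \cdot k^2
% so the time complexity per layer is preserved.
```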
Turning the Plain Network into its Residual Counterpart
• Insert shortcut connections, which turn the network into its residual counterpart.
• The identity shortcuts can be used directly when the input and output have the same dimensions (solid-line shortcuts).
• When the shortcuts go across feature maps of two different sizes, they are performed with a stride of 2.
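When the shapes differ, the shortcut itself must change shape too. A minimal sketch (my own illustration) using the common projection option, a 1x1 convolution with stride 2 on the shortcut path:

```python
# Sketch of a downsampling residual block: the main path halves the feature map
# and doubles the filters; a 1x1, stride-2 projection makes the shortcut match.
from tensorflow.keras import layers

def downsample_block(x, filters):
    f = layers.Conv2D(filters, 3, strides=2, padding="same")(x)  # halves H and W
    f = layers.BatchNormalization()(f)
    f = layers.Activation("relu")(f)
    f = layers.Conv2D(filters, 3, padding="same")(f)
    f = layers.BatchNormalization()(f)
    shortcut = layers.Conv2D(filters, 1, strides=2)(x)  # project to the new shape
    shortcut = layers.BatchNormalization()(shortcut)
    out = layers.Add()([f, shortcut])
    return layers.Activation("relu")(out)
```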
Conv1 block
• Conv1 consists of a convolution + batch normalization + max pooling operation.
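A minimal sketch of this stem in Keras (the 7x7/64/stride-2 convolution and 3x3/stride-2 pooling are the standard ResNet values from the original paper, not stated on the slide):

```python
# Sketch of the Conv1 stem: convolution -> batch normalization -> max pooling.
from tensorflow.keras import layers

def conv1_stem(x):
    x = layers.ZeroPadding2D(3)(x)
    x = layers.Conv2D(64, 7, strides=2)(x)       # 7x7, 64 filters, stride 2
    x = layers.BatchNormalization()(x)
    x = layers.Activation("relu")(x)
    x = layers.ZeroPadding2D(1)(x)
    return layers.MaxPooling2D(3, strides=2)(x)  # 3x3 max pool, stride 2
```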
ResNet Layers
• When ResNets go deeper, they normally do it by increasing the number of operations within a block, while the total number of layer groups remains the same: 4.
• An operation here refers to a convolution, a batch normalization, and a ReLU activation applied to an input, except for the last operation of a block, which does not have the ReLU.
• The blocks are of 2 types: blocks with 2 operations (Basic Block) and blocks with 3 operations (Bottleneck Block), as sketched below.
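A minimal sketch of both block types (my own illustration; the 1x1-3x3-1x1 structure of the bottleneck follows the original paper):

```python
# Sketch of the two block types. Each "operation" is conv + batch norm (+ ReLU);
# the final ReLU of a block is applied only after the addition.
from tensorflow.keras import layers

def basic_block(x, filters):
    """Basic Block: 2 operations, both 3x3 convolutions."""
    f = layers.Conv2D(filters, 3, padding="same")(x)
    f = layers.BatchNormalization()(f)
    f = layers.Activation("relu")(f)
    f = layers.Conv2D(filters, 3, padding="same")(f)
    f = layers.BatchNormalization()(f)  # no ReLU before the add
    out = layers.Add()([f, x])
    return layers.Activation("relu")(out)

def bottleneck_block(x, filters):
    """Bottleneck Block: 3 operations (1x1 reduce, 3x3, 1x1 restore).
    Assumes the input already has 4*filters channels, as inside a ResNet stage."""
    f = layers.Conv2D(filters, 1)(x)                  # 1x1: reduce width
    f = layers.BatchNormalization()(f)
    f = layers.Activation("relu")(f)
    f = layers.Conv2D(filters, 3, padding="same")(f)  # 3x3 at reduced width
    f = layers.BatchNormalization()(f)
    f = layers.Activation("relu")(f)
    f = layers.Conv2D(4 * filters, 1)(f)              # 1x1: restore width
    f = layers.BatchNormalization()(f)
    out = layers.Add()([f, x])
    return layers.Activation("relu")(out)
```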
Parameters Affecting ResNet Performance:
• Number of Layers:
• The 50/101/152-layer ResNets were found to be more accurate than the 34-layer ones by considerable margins. No degradation problem was observed, so these models enjoy significant accuracy gains from considerably increased depth. The benefits of depth are witnessed for all evaluation metrics.
YOLO: A Variant of ResNet
• The softmax layer is replaced by a 1x1 convolutional layer with a logistic function.
• A logistic (sigmoid) function is used to cope with multi-label classification, where a single prediction may be assigned more than one class label.
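A minimal sketch of the difference (my own illustration; num_classes is a placeholder): softmax makes classes compete, while independent sigmoids let one prediction carry several labels.

```python
# Sketch: multi-label head using a 1x1 convolution with independent logistic
# (sigmoid) outputs instead of a mutually exclusive softmax.
from tensorflow.keras import layers

num_classes = 80  # placeholder

def multilabel_head(feature_map):
    # One score per class at each spatial location; sigmoid squashes each
    # score independently to [0, 1], so several classes can fire at once.
    logits = layers.Conv2D(num_classes, kernel_size=1)(feature_map)
    return layers.Activation("sigmoid")(logits)
```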
References
• Understanding and Visualizing ResNets, https://towardsdatascience.com/understanding-and-visualizing-resnets-442284831be8
• K. He, X. Zhang, S. Ren and J. Sun, Deep Residual Learning for Image Recognition (the original ResNet paper) [2015]
• Ç.F. Özgenel and A. Gönenç Sorguç, Performance Comparison of Pretrained Convolutional Neural Networks on Crack Detection in Buildings [2018]
• A. Canziani, E. Culurciello and A. Paszke, An Analysis of Deep Neural Network Models for Practical Applications [2017]
