Comprehension
of deep-learning
- CNN from VGG to DenseNet
19.07.18 You Sung Min
1. Review of Deep learning
(Convolutional Neural Network)
2. Residual network (ResNet)
He, Kaiming, et al. "Deep residual learning for image recognition." Proceedings
of the IEEE conference on computer vision and pattern recognition. 2016.
3. Densely connected convolutional network
(DenseNet)
Huang, Gao, et al. "Densely connected convolutional networks." Proceedings of
the IEEE conference on computer vision and pattern recognition. 2017.
Contents
Structure of Neural Networks
 A simple model to emulate a single neuron
 This model produces a binary output
Review of Deep learning
output = 0 if ∑_j ω_j x_j ≤ T
         1 if ∑_j ω_j x_j > T
[Figure: inputs x_j weighted by ω_1, ω_2, ω_3 feed the weighted sum ∑_j ω_j x_j, which is compared against the threshold T; the perceptron (1958) as a model of a biological neuron]
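The threshold rule above can be sketched in a few lines of Python (a minimal illustration; the function name and the AND-gate example weights are assumptions, not from the slides):

```python
# Perceptron sketch: output 0 if the weighted input sum is at or below
# threshold T, output 1 otherwise (the binary rule on this slide).
def perceptron(inputs, weights, T):
    s = sum(w * x for w, x in zip(weights, inputs))
    return 0 if s <= T else 1

# Example: two binary inputs with equal weights behave as an AND gate
# when the threshold is set to 1.5.
print(perceptron([1, 1], [1.0, 1.0], 1.5))  # -> 1
print(perceptron([1, 0], [1.0, 1.0], 1.5))  # -> 0
```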
Review of Deep learning
Multilayer Perceptron (MLP)
 A network model consisting of layers of perceptrons
 This model produces vectorized outputs
Multilayer Perceptron (MLP)
Review of Deep learning
Handwritten digit as a 28 by 28 pixel image
Binary input (intensity of a pixel)
Input layer: 784 (= 28 ∗ 28) units
Desired output for “5”
𝒚(𝒙) = (0, 0, 0, 0, 0, 1, 0, 0, 0, 0)ᵀ
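The one-hot target above can be produced with a short helper (hypothetical name, assuming the usual 10 digit classes):

```python
# One-hot encoding sketch: a 10-way target vector with a single 1
# at the index of the desired digit.
def one_hot(digit, num_classes=10):
    y = [0] * num_classes
    y[digit] = 1
    return y

print(one_hot(5))  # -> [0, 0, 0, 0, 0, 1, 0, 0, 0, 0]
```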
Convolutional Neural Network
 Convolution layer
 Subsampling (Pooling) layer
 Rectified Linear Unit (ReLU)
Review of Deep learning
Feature Extractor Classifier
Review of Deep learning
 Local receptive field (connectivity)
2D Convolution: a 5 by 5 kernel (window) slides over the 28 by 28 input,
producing a 24 by 24 feature map
1. Detect local information (features)
(e.g., edges, shapes)
2. Reduce connections between layers
• Fully connected network → 28 ∗ 28 ∗ 24 ∗ 24 connections
• Locally connected network → 5 ∗ 5 ∗ 24 ∗ 24 connections
[Figure: 5 by 5 kernel weights w11, w12, …, w55]
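The connection counts above can be checked with a few lines (assuming stride 1 and no padding, i.e. a "valid" convolution):

```python
# Output size of a valid convolution: each output pixel sees one k x k window.
def valid_output_size(n, k):
    return n - k + 1

out = valid_output_size(28, 5)           # 24
fully_connected = 28 * 28 * out * out    # every input unit to every output unit
locally_connected = 5 * 5 * out * out    # one 5x5 window per output unit
print(out, fully_connected, locally_connected)  # -> 24 451584 14400
```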
Review of Deep learning
 Shared weights
1. Detect same feature
in other positions
2. Reduce total number of
weights and bias
3. Construct multiple feature
maps (kernels)
output = σ(b + ∑_{l=0}^{4} ∑_{m=0}^{4} ω_{l,m} · a_{j+l, k+m})
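A minimal sketch of the shared-weight unit above, assuming σ is the logistic sigmoid (the slide does not specify which nonlinearity σ is):

```python
import math

# One output unit of a shared-weight convolution at position (j, k):
# the same 5x5 kernel w and bias b are applied at every position.
def conv_unit(a, w, b, j, k):
    s = b + sum(w[l][m] * a[j + l][k + m] for l in range(5) for m in range(5))
    return 1.0 / (1.0 + math.exp(-s))  # logistic sigmoid (assumption)

a = [[1.0] * 28 for _ in range(28)]  # toy input of ones
w = [[0.0] * 5 for _ in range(5)]    # all-zero kernel
print(conv_unit(a, w, 0.0, 0, 0))    # -> 0.5 (sigmoid of zero)
```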
Review of Deep learning
 Pooling layer
1. Simplify (condense)
information in the feature
map
2. Reduce connections
(weights and biases)
Max-pooling:
Output only the maximum activation in each pooling window
Conv. Pooling
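Max-pooling as described above can be sketched as follows (assuming a 2 by 2 window with stride 2, the common configuration):

```python
# Max-pooling sketch: keep only the maximum activation in each 2x2 window,
# halving the height and width of the feature map.
def max_pool_2x2(fmap):
    rows, cols = len(fmap), len(fmap[0])
    return [[max(fmap[i][j], fmap[i][j + 1],
                 fmap[i + 1][j], fmap[i + 1][j + 1])
             for j in range(0, cols, 2)]
            for i in range(0, rows, 2)]

print(max_pool_2x2([[1, 3, 2, 0],
                    [4, 2, 1, 1],
                    [0, 0, 5, 6],
                    [1, 2, 7, 8]]))  # -> [[4, 2], [2, 8]]
```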
Convolutional Neural Network
Review of Deep learning
y = max(x,0)
Convolutional Neural Network
Review of Deep learning
Feature map
 Deeper neural networks are more difficult to train
 Vanishing (or exploding) gradient problem
 Degradation problem
Residual network (ResNet)
 Residual learning
 Desired underlying mapping: ℋ(𝒙)
 Nonlinear stacked-layer mapping: ℱ(𝒙) ≔ ℋ(𝒙) − 𝒙
 ∴ ℋ(𝒙) = ℱ(𝒙) + 𝒙
Residual network (ResNet)
Residual mapping
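The residual identity ℋ(x) = ℱ(x) + x can be sketched with a stand-in residual function (an assumption for illustration; in the paper ℱ is a stack of convolutional layers):

```python
# Residual block sketch: the stacked layers learn the residual F(x),
# and the identity shortcut adds the input x back to the output.
def residual_block(x, F):
    return F(x) + x

# If the residual function learns zero, the block is an identity mapping,
# which is why extra residual layers cannot easily hurt the network.
print(residual_block(3.0, lambda x: 0.0))      # -> 3.0
print(residual_block(2.0, lambda x: x * 0.5))  # -> 3.0
```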
Degradation problem
Residual network (ResNet)
 A: zero-padding shortcuts for increasing dimensions
 B: projection shortcuts for increasing dimensions; other shortcuts are identity
 C: all shortcuts are projections
 Dense block
 Short paths from early layers to later layers
Densely connected convolutional network
 Connect all layers (with matching feature-map size) directly
 Combine features by concatenation
 ∴ an L-layer block has L(L + 1)/2 connections
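The connection count can be verified directly: layer l receives inputs from all l preceding feature maps, so the total is 1 + 2 + … + L = L(L + 1)/2.

```python
# Number of direct connections in a dense block of L layers.
def dense_connections(L):
    return L * (L + 1) // 2

print(dense_connections(4))   # -> 10
print(dense_connections(12))  # -> 78
```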
 DenseNet
 ResNet output: 𝒙_𝒍 = 𝑯_𝒍(𝒙_{𝒍−𝟏}) + 𝒙_{𝒍−𝟏}
 DenseNet output: 𝒙_𝒍 = 𝑯_𝒍([𝒙_𝟎, 𝒙_𝟏, …, 𝒙_{𝒍−𝟏}])
where 𝑯_𝒍 is BN + ReLU + 3 × 3 convolution
Densely connected convolutional network
Transition layer: 1 by 1 Conv followed by 2 by 2 average pooling (AVP)
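The two update rules above can be contrasted on toy feature vectors (H_l here is a stand-in doubling function, an assumption for illustration only):

```python
# ResNet combines features by element-wise addition; shapes must match.
def resnet_step(H, x_prev):
    return [h + x for h, x in zip(H(x_prev), x_prev)]

# DenseNet combines features by concatenating all earlier outputs,
# then applying H_l to the concatenation.
def densenet_step(H, features):
    concatenated = [v for f in features for v in f]
    return H(concatenated)

double = lambda xs: [2 * v for v in xs]        # stand-in H_l (assumption)
print(resnet_step(double, [1, 2]))             # -> [3, 6]
print(densenet_step(double, [[1], [2], [3]]))  # -> [2, 4, 6]
```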
 Collective knowledge: very narrow layers (e.g., growth rate k = 12)
 Bottleneck layer: 1 by 1 conv before each 3 by 3 conv
 Compression: reduce the number of feature-maps at transition layers
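Compression at a transition layer can be sketched as follows (the DenseNet paper keeps ⌊θm⌋ of the m feature maps and uses θ = 0.5 for its compressed models):

```python
import math

# Compression sketch: a transition layer keeps floor(theta * m) of the
# m incoming feature maps, with 0 < theta <= 1.
def compressed_maps(m, theta=0.5):
    return int(math.floor(theta * m))

print(compressed_maps(256))  # -> 128
```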
Densely connected convolutional network
References
 Image source: https://deeplearning4j.org/convolutionalnets
 Zeiler, Matthew D., and Rob Fergus. "Visualizing and understanding
convolutional networks." European Conference on Computer Vision.
Springer International Publishing, 2014.
 Jia-Bin Huang, “Lecture 29 Convolutional Neural Networks”,
Computer Vision Spring 2015
 He, Kaiming, et al. "Deep residual learning for image
recognition." Proceedings of the IEEE conference on computer vision
and pattern recognition. 2016.
 Huang, Gao, et al. "Densely connected convolutional
networks." Proceedings of the IEEE conference on computer vision
and pattern recognition. 2017.