DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs (A.K.A. DeepLabv2)
J. Miguel Valverde
@jmlipman
14/02/2021
Semantic Segmentation
Source: https://pixabay.com/photos/girl-kid-coloring-colors-art-2586719/
The DeepLab family
DeepLabv1
DeepLabv2
DeepLabv3
DeepLabv3+
...
DeepLabv2 tackles three problems
1) Problem: Reduced feature resolution
Causes:
- Max pooling (Source: computersciencewiki.org)
- Conv, strides = 2 (Source: https://github.com/vdumoulin/conv_arithmetic)
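To make the loss of resolution concrete, the sketch below applies the standard output-size formula for convolution and pooling layers along one axis. The helper name `output_size` and the input sizes (512 and 64) are illustrative assumptions, not values taken from the slides.

```python
def output_size(n, kernel, stride, padding=0, dilation=1):
    """Spatial output size of a conv/pooling layer along one axis."""
    effective_kernel = dilation * (kernel - 1) + 1
    return (n + 2 * padding - effective_kernel) // stride + 1


# Five 2x2 max-pooling layers with stride 2, applied one after another
# (hypothetical input width of 512 pixels).
size = 512
sizes = [size]
for _ in range(5):
    size = output_size(size, kernel=2, stride=2)
    sizes.append(size)
print(sizes)  # each stage halves the resolution

# A 3x3 convolution with stride 2 also halves the resolution ...
strided = output_size(64, kernel=3, stride=2, padding=1)
# ... while a 3x3 atrous convolution (rate 2, stride 1) keeps it.
atrous = output_size(64, kernel=3, stride=1, padding=2, dilation=2)
print(strided, atrous)
```

Under these assumptions, every stride-2 layer halves the feature map, whereas the dilated layer with stride 1 leaves the size unchanged.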
1) Problem: Reduced feature resolution
[Figure: atrous conv. vs. regular conv.]
2) Problem: Existence of multiple-scale objects
Objects of the same class appear at different sizes, so they need different fields of view (FoV).
3) Problem: Reduced accuracy (at object borders)
Cause: Downsampling and max pooling are useful to achieve invariance in classification, but in segmentation we want to preserve spatial information.
DeepLabv2 proposes three methods
1) Atrous convolution (or dilated convolution)
2) Atrous Spatial Pyramid Pooling (ASPP)
3) Conditional Random Fields (CRFs)
Pipeline: Backbone → ASPP → Upsample x8 → CRF
1) Atrous convolution (or dilated convolution)
[Figure: dilated vs. regular convolution, filter = 3x3]
Dilation increases the field of view.
(Field of view, reminder)
Operation                               Field of view (size)
Conv 3x3                                3
Conv 3x3 (stacked on the previous one)  3 + 2 = 5
...                                     ...
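The field-of-view arithmetic in the reminder can be sketched as a small recurrence: each layer adds `(effective_kernel - 1) * jump` to the field of view, where `jump` is the product of the strides seen so far. The function name `field_of_view` and the layer lists are illustrative assumptions.

```python
def field_of_view(layers):
    """Return the field of view after each layer.

    `layers` is a list of (kernel, stride, dilation) tuples, applied in order.
    """
    fov, jump = 1, 1
    history = []
    for kernel, stride, dilation in layers:
        effective_kernel = dilation * (kernel - 1) + 1
        fov += (effective_kernel - 1) * jump
        jump *= stride
        history.append(fov)
    return history


# Three stacked regular 3x3 convolutions: 3, then 3 + 2, then 5 + 2.
stacked = field_of_view([(3, 1, 1)] * 3)
# A regular 3x3 convolution followed by a 3x3 atrous convolution with rate 2.
with_dilation = field_of_view([(3, 1, 1), (3, 1, 2)])
print(stacked, with_dilation)
```

In this sketch, two layers with dilation reach the same field of view as three regular layers.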
Dilated vs. regular convolution, filter = 3x3:
● Same number of parameters
● Same amount of computation
● Adjust FoV with `rate`
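A minimal 1D sketch of the three bullet points above, in plain Python. The function `atrous_conv1d`, the toy signal, and the weights are assumptions chosen for illustration; deep learning frameworks expose the same idea through a dilation parameter on their convolution layers.

```python
def atrous_conv1d(signal, weights, rate=1):
    """'Valid' 1D atrous convolution (cross-correlation, as in deep learning).

    The filter taps are spaced `rate` samples apart; rate=1 is a regular conv.
    """
    span = rate * (len(weights) - 1)  # distance between first and last tap
    return [
        sum(w * signal[start + k * rate] for k, w in enumerate(weights))
        for start in range(len(signal) - span)
    ]


signal = list(range(10))  # toy input: 0, 1, ..., 9
weights = [1, 0, -1]      # the same 3 parameters for both calls

regular = atrous_conv1d(signal, weights, rate=1)  # FoV = 3
dilated = atrous_conv1d(signal, weights, rate=2)  # FoV = 2 * (3 - 1) + 1 = 5
print(regular)
print(dilated)
```

Both calls use three weights and three multiplications per output value; only the spacing between the taps, and therefore the field of view, changes with `rate`.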
2) Atrous Spatial Pyramid Pooling (ASPP)
[Figure: Inceptionv3 block with parallel Conv branches; FoV: 1, 3, 3, 5]
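The idea behind ASPP can be sketched in 1D: the same input goes through several parallel atrous convolutions with different rates, and the branch outputs are fused. The rates (6, 12, 18, 24) are the ones reported for the larger ASPP variant in the DeepLabv2 paper; the helper names, the all-ones weights, the impulse input, and fusion by plain summation are assumptions of this sketch, not the authors' implementation.

```python
def atrous_conv1d_same(signal, weights, rate):
    """1D atrous convolution with zero padding (output length == input length)."""
    half = len(weights) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(weights):
            j = i + (k - half) * rate  # tap position, `rate` samples apart
            if 0 <= j < len(signal):   # positions outside the signal count as zero
                acc += w * signal[j]
        out.append(acc)
    return out


def aspp1d(signal, branch_weights, rates):
    """Run one atrous branch per rate and fuse the branches by summation."""
    branches = [
        atrous_conv1d_same(signal, w, r) for w, r in zip(branch_weights, rates)
    ]
    return [sum(values) for values in zip(*branches)]


rates = (6, 12, 18, 24)
branch_weights = [[1.0, 1.0, 1.0] for _ in rates]  # hypothetical weights

signal = [0.0] * 50
signal[25] = 1.0  # a single impulse shows which positions each branch "sees"

fused = aspp1d(signal, branch_weights, rates)
touched = [i for i, v in enumerate(fused) if v != 0.0]
print(touched)
```

With an impulse as input, the non-zero outputs show that each branch samples the input at a different distance from the centre, which is how a single layer covers several fields of view at once.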
3) Conditional Random Fields (CRFs)
Image → Graph
Short-range / local CRFs vs. fully-connected CRFs
Source: http://www.jonathanfischer.net/lets-build-gameplaykit-grid-pathfinding/
[Equation: CRF energy function; the unary term is computed from the CNN output]
3) Conditional Random Fields (CRFs): pairwise term
p = pixel coordinates
I = RGB intensity values
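For reference, the fully connected CRF energy as given in the DeepLab paper can be written with the symbols above. Here $P(x_i)$ is the label probability predicted by the CNN at pixel $i$, and $w_1$, $w_2$, $\sigma_\alpha$, $\sigma_\beta$, $\sigma_\gamma$ are hyperparameters.

```latex
\[
E(\mathbf{x}) = \sum_i \theta_i(x_i) + \sum_{ij} \theta_{ij}(x_i, x_j),
\qquad
\theta_i(x_i) = -\log P(x_i)
\]
\[
\theta_{ij}(x_i, x_j) = \mu(x_i, x_j) \left[
  w_1 \exp\!\left(
    -\frac{\lVert p_i - p_j \rVert^2}{2\sigma_\alpha^2}
    -\frac{\lVert I_i - I_j \rVert^2}{2\sigma_\beta^2}
  \right)
  + w_2 \exp\!\left(
    -\frac{\lVert p_i - p_j \rVert^2}{2\sigma_\gamma^2}
  \right)
\right]
\]
\[
\mu(x_i, x_j) =
\begin{cases}
  1 & \text{if } x_i \neq x_j \\
  0 & \text{otherwise}
\end{cases}
\]
```

The first kernel depends on both position and colour, the second on position only, and $\mu$ switches the whole pairwise term off when the two pixels share a label.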
Squared distances between coordinates (p) and between intensities (I): always positive (or zero)
Exponents of the Gaussian kernels (the negated, scaled squared distances): always negative (or zero)
Small distance, similar intensities
Small negative values in the exponent → large penalty
But such penalty → only if labels are different!
Large distance, different intensities
Large negative values in the exponent → very small penalty
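The two cases above (near and similar pixels vs. far and different pixels) can be checked numerically with a small sketch of the pairwise term. The function name `pairwise_penalty` and all hyperparameter values (`w1`, `w2`, `sigma_alpha`, `sigma_beta`, `sigma_gamma`) are hypothetical and chosen only for illustration; in practice they are tuned on validation data.

```python
import math


def pairwise_penalty(label_i, label_j, p_i, p_j, I_i, I_j,
                     w1=1.0, w2=1.0,
                     sigma_alpha=10.0, sigma_beta=10.0, sigma_gamma=3.0):
    """Pairwise CRF penalty between two pixels (hypothetical hyperparameters)."""
    if label_i == label_j:
        return 0.0  # the penalty applies only if the labels are different
    d_pos = sum((a - b) ** 2 for a, b in zip(p_i, p_j))  # squared distance, >= 0
    d_col = sum((a - b) ** 2 for a, b in zip(I_i, I_j))  # squared RGB difference, >= 0
    # The exponents are <= 0, so each kernel lies in (0, 1].
    appearance = w1 * math.exp(-d_pos / (2 * sigma_alpha ** 2)
                               - d_col / (2 * sigma_beta ** 2))
    smoothness = w2 * math.exp(-d_pos / (2 * sigma_gamma ** 2))
    return appearance + smoothness


# Neighbouring pixels with almost the same colour, different labels.
near = pairwise_penalty(0, 1, (0, 0), (0, 1), (100, 100, 100), (101, 100, 100))
# Distant pixels with very different colours, different labels.
far = pairwise_penalty(0, 1, (0, 0), (200, 200), (0, 0, 0), (255, 255, 255))
# Same pixels as `near`, but with identical labels.
same = pairwise_penalty(1, 1, (0, 0), (0, 1), (100, 100, 100), (101, 100, 100))
print(near, far, same)
```

Under these assumed settings, `near` is close to the maximum `w1 + w2`, `far` is practically zero, and `same` is exactly zero.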
Experiments
Learning rate policy (vs. decreasing with a fixed step size)
Different FoV in the ASPP: Larger → Better
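The DeepLabv2 paper reports using a "poly" learning rate policy, in which the learning rate is multiplied by (1 - iter / max_iter)^power with power = 0.9. The sketch below compares it with a fixed-step decay; the function names, the base learning rate, the number of iterations, and the step settings are assumptions for illustration.

```python
def poly_lr(base_lr, iteration, max_iter, power=0.9):
    """'Poly' policy: smooth decay from base_lr down to 0 at max_iter."""
    return base_lr * (1 - iteration / max_iter) ** power


def step_lr(base_lr, iteration, step_size, gamma=0.1):
    """Step policy: multiply the learning rate by gamma every step_size iterations."""
    return base_lr * gamma ** (iteration // step_size)


base_lr, max_iter = 0.001, 20000  # hypothetical training setup
checkpoints = [0, 5000, 10000, 15000, 20000]

poly_values = [poly_lr(base_lr, it, max_iter) for it in checkpoints]
step_values = [step_lr(base_lr, it, step_size=5000) for it in checkpoints]

for it, p, s in zip(checkpoints, poly_values, step_values):
    print(f"iter {it:>5}: poly={p:.6f} step={s:.6f}")
```

The poly schedule decreases at every iteration, while the step schedule stays constant between drops.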
Different architectures: ResNet-101 > VGG-16
CRFs: With > Without
The code!
The original ideas were old
1) Atrous convolution (or dilated convolution): 1989
2) Atrous Spatial Pyramid Pooling (ASPP): 2014
3) Conditional Random Fields (CRFs): 2001
DeepLabv2 paper: 2017
Do you want to learn more about CRFs?
- Original work: “Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data”. J. Lafferty et al. (2001)
- Great introduction: “An Introduction to Conditional Random Fields”. C. Sutton and A. McCallum (2011). https://homepages.inf.ed.ac.uk/csutton/publications/crftut-fnt.pdf
- Coursera course
