These slides cover the problem of deep semantic segmentation, surveying the different approaches taken, from hourglass autoencoders to pyramid networks.
Slides by Thomas Delteil
Presentation for the Berlin Computer Vision Group, December 2020 on deep learning methods for image segmentation: Instance segmentation, semantic segmentation, and panoptic segmentation.
The document describes two feature extraction methods: attention-based and statistics-based. The attention-based method models how human vision finds salient regions, using an architecture that decomposes images into channels, creates image pyramids, and then combines the information to generate saliency maps. This method was applied to face recognition but had problems with pose and expression changes. The statistics-based method aims to select a subset of important features using criteria based on how well the features represent the original data.
Image Segmentation
Types of Image Segmentation
Semantic Segmentation
Instance Segmentation
Types of Image Segmentation Techniques based on the image properties:
Threshold Method.
Edge Based Segmentation.
Region-Based Segmentation.
Clustering Based Segmentation.
Watershed Based Method.
Artificial Neural Network Based Segmentation.
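The threshold method above can be illustrated with Otsu's classic algorithm, which picks the intensity threshold that maximizes between-class variance. A minimal NumPy sketch (an illustrative implementation, not a library routine):

```python
import numpy as np

def otsu_threshold(image, levels=256):
    """Return the threshold that maximizes between-class variance (Otsu)."""
    hist, _ = np.histogram(image, bins=levels, range=(0, levels))
    p = hist / hist.sum()                   # intensity probabilities
    omega = np.cumsum(p)                    # class-0 probability per threshold
    mu = np.cumsum(p * np.arange(levels))   # cumulative mean
    mu_t = mu[-1]                           # global mean
    # Between-class variance for every candidate threshold; guard 0/0 bins.
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = (mu_t * omega - mu) ** 2 / (omega * (1 - omega))
    sigma_b = np.nan_to_num(sigma_b)
    return int(np.argmax(sigma_b))

# Bimodal toy image: dark background around 50, bright object around 200.
rng = np.random.default_rng(0)
img = np.concatenate([
    rng.normal(50, 10, 500), rng.normal(200, 10, 500)
]).clip(0, 255).astype(np.uint8)
t = otsu_threshold(img)
mask = img > t   # the binary segmentation
```

The threshold lands between the two intensity modes, so the mask cleanly separates "object" pixels from background.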
You Only Look Once: Unified, Real-Time Object Detection (DADAJONJURAKUZIEV)
YOLO is a new approach to object detection in which a single neural network predicts bounding boxes and class probabilities directly from full images in one evaluation.
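The single-evaluation framing has a concrete shape: in the original YOLO paper, the network's output is an S x S x (B*5 + C) tensor, with S = 7, B = 2 boxes per cell, and C = 20 classes for PASCAL VOC. A quick arithmetic sketch:

```python
# Each grid cell predicts B boxes (x, y, w, h, confidence) plus C class
# probabilities, so one forward pass emits every prediction at once.
S, B, C = 7, 2, 20            # settings from the original YOLO paper
per_cell = B * 5 + C          # 2*5 + 20 = 30 numbers per grid cell
output_size = S * S * per_cell
print(output_size)            # 7 * 7 * 30 = 1470 values in one evaluation
```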
Object detection is an important computer vision technique with applications in several domains, such as autonomous driving and personal and industrial robotics. The slides below cover the history of object detection, from before deep learning to recent research, along with future directions and some guidelines for choosing which type of object detector to use for your own project.
The document discusses various techniques for image segmentation including discontinuity-based approaches, similarity-based approaches, thresholding methods, region-based segmentation using region growing and region splitting/merging. Key techniques covered include edge detection using gradient operators, the Hough transform for edge linking, optimal thresholding, and split-and-merge segmentation using quadtrees.
This document discusses object detection using the Single Shot Detector (SSD) algorithm with the MobileNet V1 architecture. It begins with an introduction to object detection and a literature review of common techniques. It then describes the basic architecture of convolutional neural networks and how they are used for feature extraction in SSD. The SSD framework uses multi-scale feature maps for detection and convolutional predictors. MobileNet V1 reduces model size and complexity through depthwise separable convolutions. This allows SSD with MobileNet V1 to perform real-time object detection with reduced parameters and computations compared to other models.
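The parameter reduction from depthwise separable convolutions is simple arithmetic: a standard k x k convolution costs k*k*c_in*c_out weights, while the depthwise-plus-pointwise factorization costs k*k*c_in + c_in*c_out. A sketch with an illustrative MobileNet-like layer size:

```python
def conv_params(k, c_in, c_out):
    # Standard convolution: one k x k x c_in filter per output channel.
    return k * k * c_in * c_out

def depthwise_separable_params(k, c_in, c_out):
    # Depthwise (one k x k filter per input channel) + pointwise (1x1) conv.
    return k * k * c_in + c_in * c_out

# Illustrative layer size, similar to mid-network MobileNet layers.
std = conv_params(3, 128, 128)                  # 147456 weights
sep = depthwise_separable_params(3, 128, 128)   # 1152 + 16384 = 17536 weights
ratio = std / sep                               # roughly 8x fewer parameters
```

This factorization is why SSD with a MobileNet backbone can run in real time with far fewer parameters than the same detector on a heavier backbone.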
[PR12] You Only Look Once (YOLO): Unified Real-Time Object Detection (Taegyun Jeon)
The document summarizes the You Only Look Once (YOLO) object detection method. YOLO frames object detection as a single regression problem, directly predicting bounding boxes and class probabilities from the full image in one pass, which enables detection speeds of 45 frames per second. Because a single feedforward convolutional neural network sees the entire image, YOLO can leverage contextual information and predict boxes and class probabilities for all classes with one network.
The Hough transform is a feature extraction technique used in image analysis and computer vision to detect shapes within images. It works by detecting imperfect instances of objects of a certain class of shapes via a voting procedure. Specifically, the Hough transform can be used to detect lines, circles, and other shapes in an image if their parametric equations are known, and it provides robust detection even under noise and partial occlusion. It works by quantizing the parameter space that describes the shape and counting the number of votes each parametric description receives from edge points in the image.
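The voting procedure can be sketched for the line case: each edge point votes for every (rho, theta) pair consistent with it, and collinear points pile their votes into one accumulator cell. A minimal NumPy sketch (the bin sizes and degree resolution are illustrative choices):

```python
import numpy as np

def hough_lines(points, shape, n_theta=180):
    """Accumulate votes in (rho, theta) space; peaks correspond to lines."""
    h, w = shape
    diag = int(np.ceil(np.hypot(h, w)))   # rho can range over +/- diagonal
    thetas = np.deg2rad(np.arange(n_theta))
    acc = np.zeros((2 * diag, n_theta), dtype=int)
    for y, x in points:
        # rho = x*cos(theta) + y*sin(theta), one vote per theta bin
        rhos = np.round(x * np.cos(thetas) + y * np.sin(thetas)).astype(int)
        acc[rhos + diag, np.arange(n_theta)] += 1
    return acc, diag

# Edge points lying on the horizontal line y = 5 in a 10 x 20 image.
pts = [(5, x) for x in range(20)]
acc, diag = hough_lines(pts, (10, 20))
rho_idx, theta_idx = np.unravel_index(acc.argmax(), acc.shape)
# The peak sits near theta = 90 degrees with rho = 5 (since rho = y there).
```

All 20 collinear points land in the same accumulator cell, which is what makes the method robust to noise and partial occlusion: spurious points scatter their votes instead of concentrating them.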
Image Segmentation Using Deep Learning: A Survey (NUPUR YADAV)
1. The document discusses various deep learning models for image segmentation, including fully convolutional networks, encoder-decoder models, multi-scale pyramid networks, and dilated convolutional models.
2. It provides details on popular architectures like U-Net, SegNet, and models from the DeepLab family.
3. The document also reviews datasets commonly used to evaluate image segmentation methods and reports accuracies of different models on the Cityscapes dataset.
Residual neural networks (ResNets) address the vanishing gradient problem through shortcut connections that allow gradients to flow directly through the network. The ResNet architecture consists of repeating blocks of convolutional layers with shortcut connections that perform identity mappings, adding each block's input to the output of its convolutional layers. This helps networks converge earlier and increases accuracy. Variants include basic blocks with two convolutional layers and bottleneck blocks with three. Parameters like the number of layers affect ResNet performance, with deeper networks showing improved accuracy. YOLO, for example, replaces the softmax layer with a 1x1 convolutional layer and a logistic function for multi-label classification.
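The shortcut idea is small enough to sketch. In the toy block below, dense matrix multiplies stand in for the convolutional layers (an illustrative simplification of a real basic block, not the actual ResNet layers):

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0)

def basic_block(x, w1, w2):
    """ResNet-style basic block: out = relu(f(x) + x).
    The identity shortcut adds the input back after the weight layers,
    so the gradient has a direct path around f."""
    residual = relu(x @ w1) @ w2   # two weight layers standing in for convs
    return relu(residual + x)      # identity shortcut addition

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
# With zero weights, f(x) = 0 and the block degenerates to relu(x):
# the identity path alone carries the signal through.
out = basic_block(x, np.zeros((8, 8)), np.zeros((8, 8)))
assert np.allclose(out, relu(x))
```

This degenerate case is exactly why deep ResNets train well: a block can always fall back to (near-)identity, so stacking more blocks cannot easily hurt the signal path.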
The document discusses using the Hough transform for edge detection and boundary linking in images. [1] The Hough transform is a technique that can find edge points that lie along a straight line or curve without needing prior knowledge about the position or orientation of lines in the image. [2] It works by transforming each edge point in the image space to a line in the parameter space, and the intersection of lines corresponds to parameters of the line on which multiple edge points lie. [3] The Hough transform can handle cases like vertical lines that pose problems for other edge linking techniques.
Image segmentation techniques
More information on this research can be found in:
Hussein, Rania, and Frederic D. McKenzie. "Identifying Ambiguous Prostate Gland Contours from Histology Using Capsule Shape Information and Least Squares Curve Fitting." International Journal of Computer Assisted Radiology and Surgery (IJCARS), vol. 2, no. 3-4, pp. 143-150, December 2007.
This document provides an agenda for a presentation on deep learning, neural networks, convolutional neural networks, and interesting applications. The presentation will include introductions to deep learning and how it differs from traditional machine learning by learning feature representations from data. It will cover the history of neural networks and breakthroughs that enabled training of deeper models. Convolutional neural network architectures will be overviewed, including convolutional, pooling, and dense layers. Applications like recommendation systems, natural language processing, and computer vision will also be discussed. There will be a question and answer section.
The document summarizes the U-Net convolutional network architecture for biomedical image segmentation. U-Net improves on Fully Convolutional Networks (FCNs) by introducing a U-shaped architecture with skip connections between contracting and expansive paths. This allows contextual information from the contracting path to be combined with localization information from the expansive path, improving segmentation of biomedical images which often have objects at multiple scales. The U-Net architecture has been shown to perform well even with limited training data due to its ability to make use of context.
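The skip connection itself is just a channel-wise concatenation of the two paths' feature maps; a shape-level NumPy sketch (the 64-channel, 56x56 sizes are illustrative, and shapes here are (channels, H, W)):

```python
import numpy as np

# Contracting-path features carry high-resolution localization detail;
# the upsampled expansive-path features carry context. U-Net concatenates
# them along the channel axis before the next convolution.
encoder_features = np.zeros((64, 56, 56))   # from the contracting path
decoder_features = np.ones((64, 56, 56))    # upsampled expansive path
merged = np.concatenate([encoder_features, decoder_features], axis=0)
print(merged.shape)   # (128, 56, 56): channels doubled by the skip connection
```

The convolutions that follow the merge can then draw on both precise location and broad context, which is what lets U-Net segment objects at multiple scales from limited training data.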
This document discusses various frequency domain image filtering techniques. It outlines the basic steps for filtering in the frequency domain which includes centering the Fourier transform, computing the discrete Fourier transform, multiplying by a filter function, computing the inverse transform and canceling centering operations. Specific filters are then described including low pass, high pass, ideal filters and Butterworth filters. Examples of applying these filters to images are provided to demonstrate the effects. Homomorphic filtering is also introduced as a technique for illumination correction.
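The filtering steps listed above map directly onto NumPy's FFT routines: transform, center, multiply by a filter mask, un-center, inverse transform. A minimal ideal low-pass sketch (the cutoff radius is an arbitrary illustrative choice):

```python
import numpy as np

def ideal_lowpass(image, cutoff):
    """Frequency-domain filtering: FFT, center the spectrum, zero out
    frequencies beyond the cutoff radius, then invert."""
    F = np.fft.fftshift(np.fft.fft2(image))        # centered spectrum
    h, w = image.shape
    yy, xx = np.ogrid[:h, :w]
    dist = np.hypot(yy - h / 2, xx - w / 2)        # distance from DC term
    mask = dist <= cutoff                          # ideal low-pass disc
    return np.real(np.fft.ifft2(np.fft.ifftshift(F * mask)))

# Sanity check: a constant image has only the zero (DC) frequency,
# which any low-pass filter keeps, so it passes through unchanged.
img = np.full((32, 32), 7.0)
out = ideal_lowpass(img, cutoff=4)
assert np.allclose(out, img)
```

On a real image the same mask blurs fine detail; the ringing artifacts the slides attribute to ideal filters come from the mask's hard edge, which Butterworth filters soften.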
This document discusses and compares different methods for deep learning object detection, including region proposal-based methods like R-CNN, Fast R-CNN, Faster R-CNN, and Mask R-CNN as well as single shot methods like YOLO, YOLOv2, and SSD. Region proposal-based methods tend to have higher accuracy but are slower, while single shot methods are faster but less accurate. Newer methods like Faster R-CNN, R-FCN, YOLOv2, and SSD have improved speed and accuracy over earlier approaches.
This presentation explains CNNs through the image classification problem and was prepared from the perspective of understanding computer vision and its applications. I tried to explain CNNs as simply as possible, to the best of my understanding. The presentation helps beginners get a brief idea of the CNN architecture and its different layers, with an example. Please refer to the references in the last slide for a better idea of how CNNs work. It also discusses several types of CNNs (not all) and applications of computer vision.
Slides by Míriam Bellver at the UPC Reading group for the paper:
Liu, Wei, Dragomir Anguelov, Dumitru Erhan, Christian Szegedy, and Scott Reed. "SSD: Single Shot MultiBox Detector." ECCV 2016.
Full listing of papers at:
https://github.com/imatge-upc/readcv/blob/master/README.md
This document discusses edge detection and image segmentation techniques. It begins with an introduction to segmentation and its importance. It then discusses edge detection, including edge models like steps, ramps, and roofs. Common edge detection techniques are described, such as using derivatives and filters to detect discontinuities that indicate edges. Point, line, and edge detection are explained through the use of filters like Laplacian filters. Thresholding techniques are introduced as a way to segment images into different regions based on pixel intensity values.
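A second-derivative filter such as the Laplacian illustrates discontinuity-based detection: it is zero on flat regions and responds strongly at points and edges. A minimal NumPy sketch with a hand-rolled "valid" convolution:

```python
import numpy as np

# 4-neighbour Laplacian kernel: a discrete second derivative.
laplacian = np.array([[0,  1, 0],
                      [1, -4, 1],
                      [0,  1, 0]])

def convolve2d(img, kernel):
    """Minimal 'valid' 2-D convolution (the kernel is symmetric, so
    correlation and convolution coincide here)."""
    kh, kw = kernel.shape
    h, w = img.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

# A flat image with one bright point: the response is zero everywhere
# except at and around the discontinuity.
img = np.zeros((7, 7))
img[3, 3] = 1.0
resp = convolve2d(img, laplacian)
```

The strong negative response at the point itself (with positive lobes beside it) is the signature that point- and edge-detection thresholds look for.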
OpenCV is an open-source library for computer vision and machine learning. The document discusses OpenCV's features including its modular structure, common computer vision algorithms like Canny edge detection, Hough transform, and cascade classifiers. Code examples are provided to demonstrate how to implement these algorithms using OpenCV functions and data types.
Mask R-CNN extends Faster R-CNN by adding a branch that predicts segmentation masks in parallel with bounding box recognition and classification. It introduces a new layer, RoIAlign, to address misalignment issues in Faster R-CNN's RoIPool layer. RoIAlign improves mask accuracy by 10-50% by removing quantization and properly aligning the extracted features. Mask R-CNN runs at 5 fps, with only a small overhead compared to Faster R-CNN.
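The quantization difference can be shown on a single fractional coordinate: RoIPool snaps it to an integer cell, while RoIAlign bilinearly interpolates the four neighbours. A toy sketch (one sample point, not the full pooling over RoI bins):

```python
import numpy as np

def bilinear(feat, y, x):
    """Sample a feature map at a fractional location, RoIAlign-style."""
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    y1 = min(y0 + 1, feat.shape[0] - 1)
    x1 = min(x0 + 1, feat.shape[1] - 1)
    dy, dx = y - y0, x - x0
    return (feat[y0, x0] * (1 - dy) * (1 - dx)
            + feat[y0, x1] * (1 - dy) * dx
            + feat[y1, x0] * dy * (1 - dx)
            + feat[y1, x1] * dy * dx)

feat = np.arange(16, dtype=float).reshape(4, 4)
# RoIPool-style quantization snaps (1.5, 1.5) to cell (1, 1) -> value 5.0,
# discarding the sub-pixel offset.
pooled = feat[int(1.5), int(1.5)]
# RoIAlign interpolates the four neighbours (5, 6, 9, 10) -> 7.5,
# preserving the sub-pixel alignment that masks depend on.
aligned = bilinear(feat, 1.5, 1.5)
```

That preserved half-pixel of alignment is negligible for coarse box classification but significant for pixel-accurate masks, which is why RoIAlign's gains show up mainly in mask accuracy.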
This document discusses the Fourier transformation, including:
1) It defines continuous and discrete Fourier transformations and their properties such as separability, translation, periodicity, and convolution.
2) The fast Fourier transformation (FFT) improves the computational complexity of the discrete Fourier transformation from O(N^2) to O(N log N).
3) FFT works by rewriting the DFT calculation in a way that exploits symmetry and reduces redundant computations.
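The symmetry trick can be sketched as a recursive radix-2 Cooley-Tukey FFT in a few lines (power-of-two lengths only; illustrative, not optimized):

```python
import cmath

def fft(x):
    """Recursive radix-2 FFT: split the DFT into even- and odd-indexed
    halves and reuse each half-size result twice. Each twiddle product is
    computed once and used with both signs, which is the symmetry that
    cuts the work from O(N^2) to O(N log N). len(x) must be a power of 2."""
    n = len(x)
    if n == 1:
        return list(x)
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle            # first half of the spectrum
        out[k + n // 2] = even[k] - twiddle   # same product, sign flipped
    return out

# Sanity check: the DFT of a unit impulse is flat (all ones).
result = fft([1, 0, 0, 0])
```

The recurrence T(N) = 2T(N/2) + O(N) solves to O(N log N), matching the complexity claim above.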
Deep learning based object detection basics (Brodmann17)
The document discusses different approaches to object detection in images using deep learning. It begins by describing detection as classification, where an image is classified into categories according to the objects present. It then discusses approaches that separate detection into a classification head and a localization head. The document also covers improvements like R-CNN, which uses region proposals to generate candidate object regions before running classification and bounding box regression on those regions using CNN features. This helps address issues with earlier approaches, such as the cost of running a CNN over the entire image at many locations and scales.
This document describes a wearable AI device that uses computer vision and speech synthesis to help blind individuals. The device uses a Raspberry Pi with a camera to perform three main functions: facial recognition using convolutional neural networks and linear discriminant analysis, optical character recognition (OCR) to convert text to speech using a text-to-speech system, and object detection. The facial recognition and text are conveyed to the blind user through a speaker. The system is designed to be portable and help blind people identify faces, read text, and detect objects to assist them in daily life.
- Fabric for Deep Learning (FfDL) is an open source project that aims to make deep learning accessible and scalable across multiple frameworks like TensorFlow, Caffe, PyTorch, and Keras.
- FfDL provides a consistent way to deploy, train, and visualize deep learning jobs on Kubernetes clusters using microservices. This allows for resilience, scalability, and multi-tenancy.
- FfDL forms the core of IBM's deep learning service in Watson Studio, which provides tools to support the full AI workflow from designing models to deployment and monitoring.
This document summarizes a lecture on computer vision given by Dr. Eng. Mahmoud Shams at Kafrelsheikh University. It defines computer vision as the field concerned with how computers understand digital images and videos, seeking to automate tasks of the human visual system. The lecture covers the classification of AI, the evaluation of computer vision algorithms, common tasks like localization and segmentation, and why benchmarks are important. It also discusses sources of noise in images, performance metrics such as mean square error and confusion matrices, negative results in computer vision research, and top computer vision tools for 2020 including OpenCV, TensorFlow, Keras, and YOLO.
Cloud Camp Milan 2K9 Telecom Italia: Where P2P? (Gabriele Bozzi)
1. The document discusses the potential for peer-to-peer (P2P) computing as an alternative or complement to the traditional client-server model, especially in the context of cloud computing.
2. P2P systems can harness unused distributed resources, but their lack of centralized control makes reliability, performance, and security difficult to ensure, and leaves them open to freeloading.
3. Emerging autonomic and cognitive networking approaches aim to address these challenges by enabling self-configuration, healing, optimization, and protection of distributed resources.
4. Future networking approaches like DirecNet envision high-speed mobile mesh networks that could further enable wide-scale distributed computing architectures.
Detecting emotion from facial expressions has become an urgent need because of its immense applications in artificial intelligence, such as human-computer collaboration, data-driven animation, and human-robot communication. Since it is a demanding and interesting problem in computer vision, several works have been conducted on this topic. The objective of this project is to develop a facial expression recognition system based on a convolutional neural network with data augmentation. This approach classifies seven basic emotions from image data: angry, disgust, fear, happy, neutral, sad, and surprise. A convolutional neural network with data augmentation leads to higher validation accuracy (96.24%) than other existing models and helps to overcome their limitations.
System for Detecting Deepfake in Videos – A Survey (IRJET Journal)
This document provides a survey of systems for detecting deepfake videos. It begins with an abstract discussing how freely available deep learning software can generate highly realistic fake content, and the need to develop detection methods to mitigate the negative impacts. The document then reviews several techniques for creating face-based manipulated videos, including face swapping, attribute manipulation, and expression transfer. It also examines popular deepfake generation tools like FaceSwap, Deepfakes, and Face2Face. Several datasets used for deepfake detection are presented, and detection methods based on convolutional neural networks, recurrent neural networks, and generative adversarial networks are explored. Key deep learning techniques for both generating and detecting deepfakes are summarized.
IRJET- Recognition of Handwritten Characters based on Deep Learning with Tens... (IRJET Journal)
This paper proposes a convolutional neural network model to recognize handwritten digits using the MNIST dataset. The model is built with TensorFlow and consists of convolutional, pooling, and fully connected layers. It is trained on 60,000 images and tested on 10,000, achieving 98% accuracy on the training set and a low error rate of 0.03% on the test set. Previous methods for handwritten digit recognition are discussed, and the CNN approach is shown to provide superior performance with faster training times than other models.
IRJET- Car Defect Detection using Machine Learning for Insurance (IRJET Journal)
This document discusses using machine learning and convolutional neural networks to detect defects in cars from images for insurance purposes. The proposed system would use transfer learning with pre-trained models to classify car damage in images. A larger dataset of car damage images with detailed labels is needed to train more accurate models. The system architecture includes preprocessing techniques like color conversion, feature extraction using CNN models, and classifying damage types. Preliminary results show 99% accuracy can be achieved through transfer learning, but a larger dataset is required to develop more robust models for car defect detection.
IRJET - Gender and Age Prediction using Wideresnet Architecture – IRJET Journal
This document describes a gender and age prediction system using the WideResnet convolutional neural network architecture. The system is trained on the IMDB dataset containing over 500,000 images of faces with labeled gender and age information. The proposed system takes an input face image, passes it through the WideResnet model to classify the gender as male or female and estimate an age range. WideResnet is chosen to improve the performance and accuracy of existing gender and age prediction systems by reducing issues caused by lighting conditions and image capture angles. The system is implemented using TensorFlow and Keras frameworks and evaluated on the IMDB test dataset.
Performance evaluation of GANs in a semisupervised OCR use case – Florian Wilhelm
This document discusses using generative adversarial networks (GANs) for a semi-supervised optical character recognition (OCR) use case involving vehicle identification numbers (VINs). It describes the text spotting pipeline, challenges with limited training data, data augmentation techniques, and implementing a GAN for character detection. Evaluation shows the semi-supervised GAN approach outperforms other methods, achieving over 99% accuracy on VIN detection and recognition from images using only 85 labeled examples. Key learnings include that custom solutions can outperform off-the-shelf tools for specific tasks, and GANs are well-suited for problems with limited labeled data when combined with data augmentation.
Performance evaluation of GANs in a semisupervised OCR use case – inovex GmbH
Online vehicle marketplaces are embracing artificial intelligence to ease the process of selling a vehicle on their platform. The tedious work of copying information from the vehicle registration document into some web form can be automated with the help of smart text-spotting systems, in which the seller takes a picture of the document, and the necessary information is extracted automatically.
Florian Wilhelm details the components of a text-spotting system, including the subtasks of object detection and optical character recognition (OCR). Florian elaborates on the challenges of OCR in documents with various distortions and artifacts, which rule out off-the-shelf products for this task. After offering an overview of semisupervised learning based on generative adversarial networks (GANs), Florian evaluates the performance gains of this method compared to supervised learning. More specifically, for a varying amount of labeled data, he compares the accuracy of a convolutional neural network (CNN) to a GAN that uses additional unlabeled data during the training phase, showing that GANs significantly outperform classical CNNs in use cases with a lack of labeled data.
What you'll learn:
Understand how semisupervised learning with GANs works
Explore beneficial semisupervised methods based on GANs for use cases with a limited amount of labeled data
Gain insight into an interesting OCR use case of an online vehicle marketplace
Event: O'Reilly Artificial Intelligence Conference, London, 11.10.2018
Speaker: Dr. Florian Wilhelm
More tech talks: www.inovex.de/vortraege
More tech articles: www.inovex.de/blog
This document provides an overview of grid computing and image transformation using grids. It discusses the history of grids, including early projects in the 1990s. It defines what a grid is, including definitions from experts in the field. It describes different types of grids like computational grids, data grids, and access grids. It also summarizes contemporary grid projects and products, benefits of grid computing, and provides examples of applications like distributed supercomputing and high-throughput applications.
Automatic gender and age classification has become quite relevant with the rise of social media platforms. However, existing methods have not been completely successful at achieving this. Through this project, an attempt has been made to determine gender and age from a single frame of a person. This is done using deep learning and OpenCV, which is capable of processing real-time frames. The frame is given as input, and the predicted gender and age are given as output. It is difficult to predict the exact age of a person from one frame due to facial expressions, lighting, makeup and so on, so various age ranges are used and the predicted age falls into one of them. The Adience dataset is used as it is a benchmark for face photos and includes various real-world imaging conditions such as noise and lighting.
This document proposes enhancing social network security through smart credentials. It discusses how current social networks have weak authentication that allows identity cloning attacks. The document then presents using discrete wavelet transform for data hiding and watermarking uploaded images to help prevent clone attacks. It provides block diagrams of the proposed transmitter and receiver systems. When a user uploads an image, it would be watermarked with their credentials. This allows detecting if another user tries to use the same image to create a fake profile. Overall, the proposed system aims to provide more secure authentication and prevent clone attacks on social networks.
IRJET- Python Libraries and Packages for Deep Learning - A Survey – IRJET Journal
This document summarizes a survey of Python libraries and packages that are commonly used for deep learning. It discusses several popular deep learning frameworks like TensorFlow, Keras, Caffe, PyTorch, and Theano that can be used with Python. It also summarizes several research papers that utilized these Python deep learning libraries and packages to implement applications like image classification on embedded devices, mobile edge caching using deep learning, and high performance text recognition. The document highlights the benefits of using Python for deep learning due to its extensive library support, simplicity, reliability, and ease of developing applications.
Challenges of Deep Learning in Computer Vision Webinar - Tessellate Imaging – Adhesh Shrivastava
Slides from the webinar on Challenges of Deep Learning in Computer Vision presented by Tessellate Imaging and powered by E2E Networks.
The webinar discusses the growth and applications of Computer Vision in modern-day real life. Challenges with implementing and developing Deep Learning and Computer Vision projects for both enterprises and developers.
We introduce MonkAI (https://monkai.org) an Open Sourced Deep Learning wrapper library for Computer Vision development and talk about features tackling some of the challenges in Deep Learning.
IJEEE 16-19 – Digital Media Hidden Data Extracting – Kumar Goud
Abstract — Magnetic Resonance Imaging (MRI) brain tumor image classification is a difficult task due to the variance and complexity of tumors. This paper presents efficient techniques for the classification of magnetic resonance brain images. In this work, MR images are taken as input; the MRI is directed into the internal cavity of the brain and gives a complete image of the brain. The proposed technique consists of two stages. In the first stage, the discrete wavelet transform is used for dimensionality reduction and feature extraction. In the second stage, classification is performed using a probabilistic neural network. The classifier has been used to classify real MR images as benign (non-cancerous) or malignant (cancerous). A probabilistic neural network (PNN) with image and data processing techniques is employed to implement automated brain tumor classification. The use of artificial intelligence techniques has shown great potential in this field.
Index Terms — Brain tumors, Feature extraction, Classification, MRI, Probabilistic neural network, Dimensionality reduction, Discrete wavelet transform.
Similar to Image Segmentation: Approaches and Challenges (20)
Recent Advances in Natural Language Processing – Apache MXNet
The document provides an overview of recent advances in natural language processing (NLP), including traditional methods like bag-of-words models and word2vec, as well as more recent contextualized word embedding techniques like ELMo and BERT. It discusses applications of NLP like text classification, language modeling, machine translation and question answering, and how different models like recurrent neural networks, convolutional neural networks, and transformer models are used.
Fine-tuning BERT for Question Answering – Apache MXNet
This deck covers the problem of fine-tuning a pre-trained BERT model for the task of Question Answering. Check out the GluonNLP model zoo here for models and tutorials: http://gluon-nlp.mxnet.io/model_zoo/bert/index.html
Slides: Thomas Delteil
GluonNLP is a deep learning toolkit for Natural Language Processing. These slides covers the motivation behind the creation of the toolkit and what is available in it. Go try it at https://gluon-nlp.mxnet.io!
Introduction to object tracking with Deep Learning – Apache MXNet
The document discusses object tracking using deep learning. It defines object tracking as locating an object across consecutive video frames. It notes applications in security, road safety, and entertainment. Object tracking differs from object detection in that the object class is unknown during training and tracking considers objects across time rather than individual frames. Challenges include objects leaving the screen or changing pose. The document discusses metrics for evaluating trackers, including accuracy and robustness, and surveys popular modern trackers.
This presentation introduces the topic of computer vision, especially through the lens of deep learning.
Go build! https://gluon-cv.mxnet.io
Slides: Thomas Delteil
Generative Adversarial Networks (GANs) using Apache MXNet – Apache MXNet
The document provides an overview of generative adversarial networks (GANs) using Apache MXNet. It introduces GANs and deep learning concepts. It then demonstrates how to implement GANs using MXNet with examples like DCGAN. Finally, it discusses other GAN models and provides resources for using MXNet on AWS.
Deep Learning With Apache MXNet On Video by Ben Taylor @ ziff.ai – Apache MXNet
This talk will go over using Apache MXNet on video streams such as security footage from Ring, or live XBOX video data to perform inference and indexing. This can be used to classify video events, detect anomalies in normal behavior, and search. This talk will focus on using FFMPEG for feeding Apache MXNet models for fast inference throughput and performance. This talk will also discuss the difference between frame level inference, and frame buffer inference (comprehending a temporal video event).
Links to videos on the slides:
IntelAct: Winner, Visual Doom AI Competition, Full Deathmatch: https://www.youtube.com/watch?v=947bSUtuSQ0
GPU assisted call of duty processing, prep for AI auto-play: https://www.youtube.com/watch?v=gTXOYzSC_ZE
Presented at https://www.meetup.com/deep-learning-with-mxnet/events/258901722/
Using Java to deploy Deep Learning models with MXNet – Apache MXNet
The document discusses deep learning and the Apache MXNet framework. It provides an introduction to deep learning concepts like neural networks and machine learning. It then describes MXNet as an open source deep learning framework that supports multiple languages including Java. It outlines how to get started with MXNet's Java API and discusses some technical challenges around Java memory management when using deep learning models.
MXNet is a flexible and efficient deep learning framework that is programmable in multiple languages and scalable across multiple GPUs and machines. It originated from the DMLC community and is now an Apache incubating project. MXNet provides low-level NDArray and Symbol APIs as well as high-level Gluon APIs and has additional toolkits like GluonCV for computer vision tasks. MXNet supports distributed training across multiple machines using parameter servers and can serve trained models for low-latency inference using the MXNet Model Server.
This document provides an overview of recurrent neural networks (RNNs) and long short-term memory (LSTM) networks. It discusses how RNNs can be used for sequence modeling tasks like sentiment analysis, machine translation, and speech recognition by incorporating context or memory from previous steps. LSTMs are presented as an improvement over basic RNNs that can learn long-term dependencies in sequences using forget gates, input gates, and output gates to control the flow of information through the network.
What is Deep Learning
Rise of Deep Learning
Phases of Deep Learning - Training and Inference
AI & Limitations of Deep Learning
Apache MXNet History, Apache MXNet concepts
How to use Apache MXNet and Spark together for Distributed Inference.
The document discusses Apache MXNet, an open-source deep learning framework. It provides an overview of MXNet's history and key features, including support for multiple programming languages, an ecosystem of tools like GluonCV and GluonNLP, and model serving capabilities. It also describes MXNet's use of ONNX for model interchange, integration with Keras, and performance optimization using technologies like CUDA, MKL, and TVM. The document highlights MXNet's large community and adoption by customers.
In this talk ONNX (Open Neural Network eXchange) is introduced, and the ONNX Model Zoo is used as the base for fine-tuning with AWS SageMaker and Apache MXNet's Gluon API. With a fine-tuned model trained on Caltech101, AWS GreenGrass is discussed for edge deployments and the TVM Stack is suggested as a method for optimising the inference of models on edge devices.
Presented by: Thom Lane at Linaro Connect Vancouver 2018 on 19th September 2018.
Distributed Inference with MXNet and Spark – Apache MXNet
Deep learning has become ubiquitous with the abundance of data and the commoditization of compute and storage. Pre-trained models are readily available for many use cases. Distributed inference has many applications, such as pre-computing results offline and backfilling historic data with predictions from state-of-the-art models. Inference on large-scale datasets comes with many challenges prevalent in distributed data processing. This presentation will show how to efficiently run deep learning prediction on large data sets, leveraging Apache Spark and Apache MXNet (incubating).
This presentation describes two major papers in multi-variate time-series using deep neural networks. The first paper, DeepAR was developed at Amazon to deal with forecasting of millions of items where the same model can be applied to millions of products. DeepAR is implemented as a built-in algorithm of Amazon SageMaker. Code example is provided.
The second paper, Long- and Short-Term Temporal Patterns with Deep Neural Networks, was developed at CMU and introduces a novel way to detect both short-term and long-term seasonality in data through the introduction of a skip-RNN.
A Gluon implementation of the paper is provided in the presentation.
Inference at the edge is of ever-increasing importance for companies, and thus it is crucial to be able to make models smaller. Compressing models can be loss-less or can result in a loss of accuracy. This presentation provides a survey of compression techniques for deep learning models. It then describes different architectures of AWS IoT/GreenGrass to combine on-device inference and GPU inference in a hub model. Additionally, the presentation introduces MXNet, which has a small footprint and is efficient both for inference and for training in distributed settings.
Building Content Recommendation Systems using MXNet Gluon – Apache MXNet
The Netflix competition triggered a flurry of research on recommendation engines. This presentation provides a survey of techniques and models for creating a recommender system. It covers Matrix Factorisation, Factorisation Machines, Distributed Factorisation Machines, and DSSM networks, and provides code examples for developing Matrix Factorisation in Gluon. At the end, the presentation provides tips and tricks for large-scale, real-time recommender engines.
3. Applications: Brain tissue segmentation
U-Net: Convolutional Networks for Biomedical Image Segmentation – Olaf Ronneberger, Philipp Fischer, Thomas Brox, 2015
9. How does it work?
Trained to minimize the softmax cross-entropy loss for each pixel (i, j), with predictions over the N different classes:

$$\mathrm{loss} = -\sum_{i,j}^{H,W} \sum_{c}^{N} y_{i,j,c} \log(p_{i,j,c})$$

Since the one-hot target $y_{i,j,c}$ selects only the true class, this simplifies to:

$$\mathrm{loss} = -\sum_{i,j}^{H,W} \log\!\left(p_{i,j,\,c=y_{i,j}}\right)$$
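The simplified form above can be checked with a minimal NumPy sketch; the function name and array shapes are illustrative, not from the slides:

```python
import numpy as np

def pixelwise_cross_entropy(logits, labels):
    """Softmax cross-entropy summed over all pixels.

    logits: (H, W, N) raw class scores per pixel.
    labels: (H, W) integer class index y_{i,j} per pixel.
    """
    # Numerically stable log-softmax over the class axis
    shifted = logits - logits.max(axis=-1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    h, w = labels.shape
    # Pick log p_{i,j,c=y_{i,j}} for each pixel and sum, as in the
    # simplified one-hot form of the loss
    return -log_probs[np.arange(h)[:, None], np.arange(w)[None, :], labels].sum()
```

With uniform logits, every class has probability 1/N, so the loss reduces to H·W·log N.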
11. Source: DeepLabV3 – Rethinking Atrous Convolution for Semantic Image Segmentation, Chen et al. 2017
Strategies for capturing multi-scale context
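One such strategy, atrous (dilated) convolution, enlarges the receptive field without extra parameters or downsampling. A 1-D NumPy sketch under assumed naming (the real DeepLab operations are 2-D):

```python
import numpy as np

def dilated_conv1d(x, kernel, rate):
    """'Same'-padded 1-D atrous convolution: kernel taps are spaced
    `rate` samples apart, so a length-k kernel covers a window of
    (k - 1) * rate + 1 input samples."""
    k = len(kernel)
    pad = (k - 1) * rate // 2
    xp = np.pad(x, pad)
    out = np.zeros(len(x))
    for i in range(len(x)):
        for j in range(k):
            out[i] += kernel[j] * xp[i + j * rate]
    return out
```

Stacking such layers with increasing rates (as in the ASPP module) captures context at multiple scales while keeping the feature map resolution fixed.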
12. Architectures: HourGlass
Architecture of the full network. The convolution network is based on the VGG16 architecture; the deconvolution network uses unpooling and deconvolution layers. Source: H. Noh et al. (2015)
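The unpooling step of such a decoder can be sketched in NumPy: max pooling records where each maximum came from (the "switches"), and unpooling places values back at those locations. The helper names are illustrative:

```python
import numpy as np

def max_pool_with_indices(x, size=2):
    """Non-overlapping max pooling that also records the argmax
    ('switch') location of each window as a flat index into x."""
    h, w = x.shape
    pooled = np.zeros((h // size, w // size))
    idx = np.zeros((h // size, w // size), dtype=int)
    for i in range(0, h, size):
        for j in range(0, w, size):
            patch = x[i:i + size, j:j + size]
            k = patch.argmax()
            pooled[i // size, j // size] = patch.flat[k]
            idx[i // size, j // size] = (i + k // size) * w + (j + k % size)
    return pooled, idx

def max_unpool(pooled, idx, shape):
    """Place each pooled value back at its recorded location, zeros elsewhere."""
    out = np.zeros(shape)
    out.flat[idx.ravel()] = pooled.ravel()
    return out
```

Unlike plain upsampling, this preserves the spatial position of strong activations, which helps recover sharp object boundaries.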
15. Architectures: DeepLab V3+
Source: Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation, Liang-Chieh Chen, Yukun Zhu, George Papandreou, Florian Schroff, and Hartwig Adam, 2018
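The atrous separable convolution in the title factors a standard convolution into a per-channel (depthwise) filter followed by a 1x1 (pointwise) channel mixer. A minimal 'valid'-padding NumPy sketch with assumed names, omitting the dilation for brevity:

```python
import numpy as np

def depthwise_separable_conv(x, dw_kernels, pw_weights):
    """Depthwise conv (one kernel per channel, 'valid' padding)
    followed by a 1x1 pointwise conv that mixes channels.

    x: (H, W, C) input, dw_kernels: (C, kh, kw), pw_weights: (C, C_out).
    """
    h, w, c = x.shape
    kh, kw = dw_kernels.shape[1:]
    oh, ow = h - kh + 1, w - kw + 1
    dw = np.zeros((oh, ow, c))
    for ch in range(c):          # each channel is filtered independently
        for i in range(oh):
            for j in range(ow):
                dw[i, j, ch] = (x[i:i + kh, j:j + kw, ch] * dw_kernels[ch]).sum()
    return dw @ pw_weights       # 1x1 conv = matmul over the channel axis
```

The factorization costs roughly C·k² + C·C_out multiplications per output pixel instead of C·C_out·k², which is part of what makes the DeepLabV3+ encoder efficient.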
16. Architectures: and more
See this Medium blog post: Review of deep learning algorithms for semantic segmentation
Fully Convolutional Network
ParseNet
Feature Pyramid Network
Pyramid Scene Parsing network (PSPNet)
Path Aggregation Network (PANet)
Context Encoding Network (EncNet)