Image segmentation hj_cho

Image Segmentation
DeepBio
Hyungjoo Cho

Pixel-wise prediction (classification)

Image Segmentation
• CNNs
• RNNs
• GANs

Fully Convolutional Network
http://www.cv-foundation.org/openaccess/content_cvpr_2015/papers/Long_Fully_Convolutional_Networks_2015_CVPR_paper.pdf
• End-to-end, Pixel-to-pixel prediction
• Backwards convolution for up-sampling
• Per-pixel multinomial logistic loss
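
As a rough illustration of the per-pixel multinomial logistic loss, the sketch below applies a softmax cross-entropy independently at every pixel and averages the result. The function names, the nested-list layout, and the toy 2x2 score map are assumptions made for this example; plain Python lists are used so it runs without dependencies.

```python
import math

def softmax(scores):
    """Numerically stable softmax over one pixel's class scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def per_pixel_loss(score_map, label_map):
    """Mean cross-entropy over all pixels.

    score_map[i][j] is the list of class scores for pixel (i, j);
    label_map[i][j] is the integer ground-truth class of that pixel.
    """
    losses = []
    for score_row, label_row in zip(score_map, label_map):
        for scores, label in zip(score_row, label_row):
            probs = softmax(scores)
            losses.append(-math.log(probs[label]))
    return sum(losses) / len(losses)

def predict(score_map):
    """Pixel-wise prediction: index of the highest score at every pixel."""
    return [[max(range(len(s)), key=s.__getitem__) for s in row]
            for row in score_map]

# Hypothetical 2x2 image with 3 classes (values chosen for illustration).
scores = [[[2.0, 0.0, 0.0], [0.0, 3.0, 0.0]],
          [[0.0, 0.0, 1.0], [1.0, 1.0, 1.0]]]
labels = [[0, 1], [2, 0]]
loss = per_pixel_loss(scores, labels)
pred = predict(scores)
```

A pixel with uniform scores contributes log(number of classes) to the loss, which is a handy sanity check for an untrained network.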

Limitations
• Fixed size receptive field
• Structure is too simple to capture fine details

Deconvolution Network
http://www.cv-foundation.org/openaccess/content_iccv_2015/papers/Noh_Learning_Deconvolution_Network_ICCV_2015_paper.pdf
• Combines unpooling, deconvolution (with crop), and ReLU
• Reconstruction of the detailed structure of an object in finer resolution
• Batch-normalization
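
Unpooling can be sketched as follows: max pooling records where each maximum came from (the "switches"), and unpooling puts each pooled value back at that position, leaving zeros elsewhere. This is a single-channel, 2x2, stride-2 sketch with hypothetical function names, not the paper's implementation.

```python
def max_pool_2x2(x):
    """2x2 max pooling with stride 2 on a 2D list.

    Also returns the switches: the (row, col) position of each maximum.
    """
    h, w = len(x), len(x[0])
    pooled, switches = [], []
    for i in range(0, h, 2):
        pooled_row, switch_row = [], []
        for j in range(0, w, 2):
            window = [(x[i + di][j + dj], (i + di, j + dj))
                      for di in range(2) for dj in range(2)]
            value, pos = max(window)  # ties resolve to the later position
            pooled_row.append(value)
            switch_row.append(pos)
        pooled.append(pooled_row)
        switches.append(switch_row)
    return pooled, switches

def unpool_2x2(pooled, switches, h, w):
    """Place each pooled value back at its recorded position; zeros elsewhere."""
    out = [[0] * w for _ in range(h)]
    for pooled_row, switch_row in zip(pooled, switches):
        for value, (i, j) in zip(pooled_row, switch_row):
            out[i][j] = value
    return out

x = [[1, 2, 0, 0],
     [3, 4, 0, 5],
     [0, 0, 7, 0],
     [6, 0, 0, 0]]
pooled, switches = max_pool_2x2(x)
restored = unpool_2x2(pooled, switches, 4, 4)
```

The unpooled map is sparse; in the deconvolution network the subsequent deconvolution layers are what densify it.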

Limitations
• Difficult to learn
• Still loses spatial information

U-Net
https://arxiv.org/pdf/1505.04597.pdf
• Does not use unpooling (only up-convolution)
• Skip connections (with concatenation)
• No fully connected layers
• Elastic deformation
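
A skip connection with concatenation can be sketched like this: encoder feature maps are stacked onto the decoder feature maps along the channel axis, so the channel count grows. The center crop is only needed when the two maps differ in spatial size (as happens with unpadded convolutions); the function names and toy shapes are assumptions for illustration.

```python
def center_crop(channel, out_h, out_w):
    """Crop a 2D channel (list of rows) around its center."""
    h, w = len(channel), len(channel[0])
    top, left = (h - out_h) // 2, (w - out_w) // 2
    return [row[left:left + out_w] for row in channel[top:top + out_h]]

def concat_skip(encoder_maps, decoder_maps):
    """Concatenate encoder and decoder feature maps along the channel axis.

    Both arguments are lists of 2D channels; encoder channels are cropped
    to the decoder's spatial size first.
    """
    out_h, out_w = len(decoder_maps[0]), len(decoder_maps[0][0])
    cropped = [center_crop(c, out_h, out_w) for c in encoder_maps]
    return cropped + decoder_maps

# Hypothetical shapes: encoder has 2 channels of 4x4, decoder 3 channels of 2x2.
encoder = [[[c * 100 + i * 4 + j for j in range(4)] for i in range(4)]
           for c in range(2)]
decoder = [[[0, 0], [0, 0]] for _ in range(3)]
merged = concat_skip(encoder, decoder)  # 2 + 3 = 5 channels of 2x2
```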

Limitations
• Does not use batch normalization
• A VGG-style encoder is not the best choice for feature extraction

Deep contextual networks
http://www.aaai.org/ocs/index.php/AAAI/AAAI16/paper/view/11789
• Auxiliary connections and auxiliary classifiers
• Ensemble
• Lower memory consumption

FusionNet
• Skip connections (with summation)
• Residual blocks (shortcut connections)
• Elastic deformation
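
The two ideas above share one operation, element-wise addition. A minimal sketch, assuming feature maps stored as nested lists (channels, rows, columns) and a hypothetical `halve` function standing in for the convolution layers inside a residual block:

```python
def add_maps(a, b):
    """Element-wise sum of two feature maps with identical shapes."""
    return [[[p + q for p, q in zip(row_a, row_b)]
             for row_a, row_b in zip(chan_a, chan_b)]
            for chan_a, chan_b in zip(a, b)]

def residual_block(x, transform):
    """Shortcut connection: output = x + transform(x)."""
    return add_maps(x, transform(x))

def sum_skip(encoder_maps, decoder_maps):
    """Skip connection with summation: the channel count stays the same."""
    return add_maps(encoder_maps, decoder_maps)

def halve(maps):
    """Hypothetical stand-in for the learned layers of a residual block."""
    return [[[0.5 * v for v in row] for row in chan] for chan in maps]

encoder = [[[1.0, 2.0], [3.0, 4.0]]]
decoder = [[[10.0, 20.0], [30.0, 40.0]]]
res = residual_block(encoder, halve)
merged = sum_skip(encoder, decoder)
```

Unlike concatenation, summation requires both inputs to have the same number of channels and does not widen the following layer.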

Limitations
• Memory: high memory consumption is the main limitation

Pyramid Scene Parsing Net
• Pre-trained FCN with ResNet (feature map at 1/8 of the input size)
• Pyramid pooling & 1x1 conv
• Bilinear interpolation
• Avg pooling is better than Max pooling
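
The pyramid pooling step can be sketched on a single channel: average-pool the feature map into a few coarse grids, upsample each back to the original size, and stack the results with the input. The bin sizes (1, 2, 3, 6) are an assumption taken from common descriptions of the method, nearest-neighbour upsampling is swapped in for bilinear interpolation to keep the code short, and the 1x1 conv is omitted.

```python
def adaptive_avg_pool(channel, bins):
    """Average-pool a 2D channel into a bins x bins grid.

    Assumes height and width are divisible by `bins`.
    """
    h, w = len(channel), len(channel[0])
    sh, sw = h // bins, w // bins
    return [[sum(channel[i][j]
                 for i in range(bi * sh, (bi + 1) * sh)
                 for j in range(bj * sw, (bj + 1) * sw)) / (sh * sw)
             for bj in range(bins)]
            for bi in range(bins)]

def upsample_nearest(channel, h, w):
    """Nearest-neighbour upsampling (a simplification of bilinear)."""
    bh, bw = len(channel), len(channel[0])
    return [[channel[i * bh // h][j * bw // w] for j in range(w)]
            for i in range(h)]

def pyramid_pooling(channel, bin_sizes=(1, 2, 3, 6)):
    """Return the input plus one upsampled pooled map per pyramid level."""
    h, w = len(channel), len(channel[0])
    branches = [channel]
    for bins in bin_sizes:
        pooled = adaptive_avg_pool(channel, bins)
        branches.append(upsample_nearest(pooled, h, w))
    return branches

# Hypothetical 6x6 feature map with values 0..35.
feature = [[i * 6 + j for j in range(6)] for i in range(6)]
stacked = pyramid_pooling(feature)  # 1 input + 4 pyramid levels
```

The coarsest level (1 bin) carries a global average, which is how the module injects scene-level context into every pixel.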

Multi-Dimensional RNNs
• Proposed by Alex Graves et al. (GOD GRAVES!!)
• 1D RNNs (including bi-directional RNNs) cannot model images well
• Each pixel needs access to the surrounding context in all directions
• N-dimensional data: at least 2^N hidden layers (one per scan direction)
• The input layer has size 3 (RGB), 1 (grayscale), or the patch size; the output layer (softmax) has one unit per class
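
The 2^N count comes from choosing, for each of the N axes, whether to scan forwards or backwards. A small sketch that enumerates these directions and the resulting visiting order (the function names are made up for this example):

```python
from itertools import product

def scan_directions(n_dims):
    """All 2^N combinations of forward (+1) / backward (-1) per axis."""
    return list(product((+1, -1), repeat=n_dims))

def scan_order(shape, direction):
    """Order in which positions are visited for one scan direction."""
    ranges = [range(size) if step == 1 else range(size - 1, -1, -1)
              for size, step in zip(shape, direction)]
    return list(product(*ranges))

directions_2d = scan_directions(2)  # 4 directions for images
directions_3d = scan_directions(3)  # 8 directions for volumes
order_from_top_left = scan_order((2, 2), (+1, +1))
order_from_bottom_right = scan_order((2, 2), (-1, -1))
```

Each direction gets its own hidden layer, so that every pixel ends up seeing context from all corners of the image.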

Assume a 3x3 image:
A00 A01 A02
A10 A11 A12
A20 A21 A22

Input sequence:
A00 A01 A02 A10 A11 A12 A20 A21 A22
Output sequence:
O00 O01 O02 O10 O11 O12 O20 O21 O22
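
For the 3x3 example, one scan direction of a 2D RNN can be sketched as below: the hidden state at (i, j) depends on the input pixel and on the hidden states above and to the left. The scalar weights and the tanh unit are illustrative assumptions; a real MD RNN uses weight matrices (or LSTM cells) and repeats this for every scan direction.

```python
import math

def md_rnn_forward(image, w_in=0.5, w_up=0.3, w_left=0.3):
    """One top-left to bottom-right pass of a scalar 2D RNN.

    image[i][j] plays the role of A_ij; the returned grid plays the
    role of O_ij. Weights are hypothetical scalars.
    """
    h, w = len(image), len(image[0])
    hidden = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            up = hidden[i - 1][j] if i > 0 else 0.0
            left = hidden[i][j - 1] if j > 0 else 0.0
            hidden[i][j] = math.tanh(
                w_in * image[i][j] + w_up * up + w_left * left)
    return hidden

image = [[0.1, 0.2, 0.3],
         [0.4, 0.5, 0.6],
         [0.7, 0.8, 0.9]]
outputs = md_rnn_forward(image)
```

In this single pass O22 depends on every pixel, while O00 depends only on A00, which is why the other scan directions are needed.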

Scene Labeling with LSTM
http://www.cv-foundation.org/openaccess/content_cvpr_2015/papers/Byeon_Scene_Labeling_With_2015_CVPR_paper.pdf
• Non-overlapping patches as input
• Four separate 2D-LSTM blocks, combined by summation
• The size of the layer corresponds to the number of feature maps
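
Splitting an image into non-overlapping patches can be sketched as follows; each patch is flattened into one input vector, and the grid of patches is what the 2D-LSTM blocks scan over. The function name and the single-channel 4x4 example are assumptions for illustration.

```python
def split_into_patches(image, patch_h, patch_w):
    """Cut a 2D image into a grid of flattened, non-overlapping patches.

    Assumes the image size is divisible by the patch size.
    """
    h, w = len(image), len(image[0])
    if h % patch_h or w % patch_w:
        raise ValueError("image size must be divisible by patch size")
    grid = []
    for i in range(0, h, patch_h):
        grid_row = []
        for j in range(0, w, patch_w):
            patch = [image[i + di][j + dj]
                     for di in range(patch_h) for dj in range(patch_w)]
            grid_row.append(patch)
        grid.append(grid_row)
    return grid

# Hypothetical 4x4 single-channel image with values 0..15.
image = [[i * 4 + j for j in range(4)] for i in range(4)]
patches = split_into_patches(image, 2, 2)  # 2x2 grid of 4-element vectors
```

Working on patches rather than pixels shortens the sequences the LSTM has to process by a factor of patch_h * patch_w.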

Turned MD RNNs
• Standard MD RNNs are not easy to parallelize
• Rotate 45 degrees
• Easy to parallelize!!
• This introduces context gaps

PyraMiD RNNs
• Fill the blanks
• More spatial information
• For 3D images: PyraMiD needs only 6 pyramids, while the standard approach needs 8 cubes
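
The context gaps and how they get filled can be illustrated by computing which pixels can influence a target pixel under different connection patterns. The offsets below are illustrative assumptions: the "turned" pattern connects each pixel to its two diagonal predecessors in the previous row, and the pyramid pattern adds the pixel directly above. Because all predecessors sit in the previous row, a whole row can be computed in parallel.

```python
def context(target, offsets, height, width):
    """Set of pixels that can influence `target` through repeated
    application of the predecessor offsets."""
    seen, stack = set(), [target]
    while stack:
        i, j = stack.pop()
        for di, dj in offsets:
            p = (i + di, j + dj)
            inside = 0 <= p[0] < height and 0 <= p[1] < width
            if inside and p not in seen:
                seen.add(p)
                stack.append(p)
    return seen

# Assumed connection patterns (row offset, column offset).
TURNED = [(-1, -1), (-1, +1)]
PYRAMID = [(-1, -1), (-1, 0), (-1, +1)]

target = (4, 2)  # bottom-center pixel of a 5x5 image
turned_ctx = context(target, TURNED, 5, 5)
pyramid_ctx = context(target, PYRAMID, 5, 5)
gaps = pyramid_ctx - turned_ctx  # pixels the turned pattern never sees
```

With only diagonal connections the context forms a checkerboard, so for example the pixel directly above the target is missed; the extra connection turns it into a solid pyramid.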

Grid LSTM
• Co-authored by Alex Graves (GOD GRAVES)
• Connections along depth dimension as well as temporal dimension
• 3D Grid LSTM = Multi-dimensional LSTM + memory connection

Pix2Pix
https://arxiv.org/pdf/1611.07004v1.pdf
• Pixel to pixel translation
• U-Net + conditional GAN loss
• Also performs well on segmentation tasks

Adversarial Networks for the Detection of Aggressive Prostate Cancer
• Pix2pix structure
• Conditional GAN loss
• Instance norm instead of batch norm
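
The difference between the two normalizations can be sketched as follows: instance norm computes mean and variance per sample and per channel, while batch norm shares them across all samples in the batch. This sketch uses training-style statistics only and leaves out the learnable scale and shift; the data layout (`batch[n][c]` is a flat list of pixel values) is an assumption for the example.

```python
import math

def _normalize(values, eps=1e-5):
    """Zero-mean, unit-variance normalization of a flat list."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [(v - mean) / math.sqrt(var + eps) for v in values]

def instance_norm(batch):
    """Normalize every (sample, channel) pair with its own statistics."""
    return [[_normalize(channel) for channel in sample] for sample in batch]

def batch_norm(batch):
    """Normalize every channel with statistics pooled over the batch."""
    n_channels = len(batch[0])
    out = [[None] * n_channels for _ in batch]
    for c in range(n_channels):
        pooled = [v for sample in batch for v in sample[c]]
        normed = _normalize(pooled)
        size = len(batch[0][c])
        for n in range(len(batch)):
            out[n][c] = normed[n * size:(n + 1) * size]
    return out

# Two hypothetical single-channel images with very different brightness.
batch = [[[0.0, 1.0, 2.0, 3.0]], [[100.0, 101.0, 102.0, 103.0]]]
inst = instance_norm(batch)
bn = batch_norm(batch)
```

With instance norm both images end up with the same normalized values, whereas batch norm keeps the brightness difference between them; that per-image behaviour is one reason instance norm is popular with small batches.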
