Image classification requires learning only local features, whereas image segmentation also requires learning positional information, so the two tasks differ in the features to be learned. In this study, we propose SE-U-Net++, which efficiently learns both local features and positional information by incorporating SE blocks, together with a transfer learning algorithm that bridges this difference between the tasks by comparing the parameters of the convolutional layers.
1. Transfer Learning Model for Image Segmentation by Integrating U-Net++ and SE Block
1st Yuta Suzuki, Graduate School of Natural Science and Technology, Kanazawa University
2nd Satoshi Yamane, Institute of Science and Engineering, Kanazawa University
8. ImageNet
• Over 14 million images
• Each image is annotated with the names of the objects it contains (class labels)
• Over 20,000 distinct object names (class labels)
ImageNet is a very large dataset for image classification.
[Diagram: models pre-trained on ImageNet are applied to new tasks via transfer learning]
9. Transfer Learning
Transfer learning is a method of applying a model trained on one task to another task (a minimal sketch follows below).
• Often effective for CNNs
• The pre-training dataset needs to be large
• Useful when you cannot prepare much training data for the new task
[Diagram: an ImageNet-trained model is re-trained with new data]
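As a minimal sketch of re-training an ImageNet model on a new task, assuming Keras/TensorFlow (the slides do not name a framework) and a hypothetical 10-class target:

```python
# Transfer-learning sketch (framework choice is an assumption).
# Load VGG16 weights pre-trained on ImageNet, attach a new head,
# and re-train on the new task's data.
import tensorflow as tf
from tensorflow.keras import layers, models

base = tf.keras.applications.VGG16(
    weights="imagenet",      # start from ImageNet-trained parameters
    include_top=False,       # drop the 1000-class classification head
    input_shape=(224, 224, 3),
)

# New head for the target task (hypothetical 10-class example).
x = layers.GlobalAveragePooling2D()(base.output)
out = layers.Dense(10, activation="softmax")(x)
model = models.Model(base.input, out)

model.compile(optimizer="adam", loss="categorical_crossentropy")
# model.fit(new_task_images, new_task_labels)  # re-training with new data
```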
10. Fine-Tuning
Fine-tuning is a transfer learning technique commonly used in CNNs for image classification tasks.
• The lower layers of a CNN learn general features ➡ freeze their training (see the sketch below)
• It often does not work well for segmentation
[Diagram: the lower layers are frozen while the remaining layers are re-trained with new data]
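A sketch of the freezing step, again assuming Keras; the split point `n_frozen = 10` is an illustrative choice, not a value from the slides:

```python
import tensorflow as tf
from tensorflow.keras import layers, models

base = tf.keras.applications.VGG16(weights="imagenet", include_top=False,
                                   input_shape=(224, 224, 3))

# Freeze the lower layers, which hold general features.
# n_frozen = 10 is a hypothetical split point for illustration.
n_frozen = 10
for layer in base.layers[:n_frozen]:
    layer.trainable = False   # keep ImageNet weights fixed
for layer in base.layers[n_frozen:]:
    layer.trainable = True    # re-train these with the new data

x = layers.GlobalAveragePooling2D()(base.output)
out = layers.Dense(10, activation="softmax")(x)
model = models.Model(base.input, out)
model.compile(optimizer="adam", loss="categorical_crossentropy")
# model.fit(...)  # only the unfrozen layers are updated
```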
12. U-Net
U-Net is a representative model for segmentation.
• Learns local features by convolution
• Learns location information by skip connections (a sketch follows below)
O. Ronneberger, P. Fischer and T. Brox, "U-Net: Convolutional Networks for Biomedical Image Segmentation," arXiv:1505.04597, 2015.
[Figure: U-shaped architecture with an Encoder (contracting path) and a Decoder (expanding path)]
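A minimal sketch of one encoder/decoder level, assuming Keras (channel sizes are illustrative): the skip connection concatenates the encoder's high-resolution features back into the decoder, carrying location information.

```python
import tensorflow as tf
from tensorflow.keras import layers

# One encoder level and the matching decoder level of a U-Net.
inp = layers.Input(shape=(256, 256, 1))
enc = layers.Conv2D(64, 3, padding="same", activation="relu")(inp)
down = layers.MaxPooling2D(2)(enc)               # encoder: local features

mid = layers.Conv2D(128, 3, padding="same", activation="relu")(down)

up = layers.Conv2DTranspose(64, 2, strides=2)(mid)
skip = layers.Concatenate()([up, enc])           # skip connection:
                                                 # re-injects spatial detail
dec = layers.Conv2D(64, 3, padding="same", activation="relu")(skip)
out = layers.Conv2D(1, 1, activation="sigmoid")(dec)  # per-pixel mask
```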
13. Transfer Learning in U-Net
• Use a pre-trained model for the encoder part of U-Net
Vladimir Iglovikov and Alexey Shvets, "TernausNet: U-Net with VGG11 Encoder Pre-Trained on ImageNet for Image Segmentation," arXiv:1801.05746, 2018.
[Figure: the U-Net encoder is replaced by a model pre-trained on ImageNet]
14. U-Net++
U-Net++ is an improved model of U-Net.
Zongwei Zhou, Md Mahfuzur Rahman Siddiquee, Nima Tajbakhsh and Jianming Liang, "UNet++: A Nested U-Net Architecture for Medical Image Segmentation," arXiv:1807.10165, 2018.
• Decodes from the encoder at each scale and connects the results to the decoder by skip connections (a sketch of one nested node follows)
• Reduces the feature-map gap between encoder and decoder
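A hedged sketch of one nested node, following the paper's X^{i,j} notation (Keras assumed, filter counts illustrative):

```python
import tensorflow as tf
from tensorflow.keras import layers

def conv_block(x, filters):
    return layers.Conv2D(filters, 3, padding="same", activation="relu")(x)

# Nested node X^{0,1}: combines the same-scale encoder output X^{0,0}
# with the upsampled output of the deeper encoder node X^{1,0}.
inp = layers.Input(shape=(256, 256, 1))
x00 = conv_block(inp, 32)                            # encoder, scale 0
x10 = conv_block(layers.MaxPooling2D(2)(x00), 64)    # encoder, scale 1

up10 = layers.Conv2DTranspose(32, 2, strides=2)(x10)
x01 = conv_block(layers.Concatenate()([x00, up10]), 32)  # nested decode
# Later nodes X^{0,2}, X^{0,3}, ... are built the same way, densely
# concatenating all earlier X^{0,k} outputs before convolving.
```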
15. Problems to be solved (1)
Segmentation tasks are expensive to create training data for.
➡ Transfer learning with a model trained on ImageNet (a classification task)
However, classification and segmentation differ in the features they learn.
16. Problems to be solved (2)
[Figure: a classifier predicts "Rabbit 98%" from local features alone, whereas segmentation predicts a per-pixel mask from local features plus location information]
We need to learn to bridge this difference.
18. Proposed SE-U-Net++
• SE blocks are attached to the encoder part of U-Net++
➡ Efficient learning of both local features and location information
19. SE-Block
Jie Hu, Li Shen, Samuel Albanie, Gang Sun and Enhua Wu, "Squeeze-and-Excitation Networks," arXiv:1709.01507v3, 2018.
• Weights the channels of a CNN feature map
➡ Emphasizes high-value feature maps and suppresses low-value ones
[Figure: channel weighting lets some feature maps emphasize location information and others emphasize local features]
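The SE block itself is compact; below is a minimal sketch assuming Keras, with the reduction ratio r = 16 taken from the SE-Net paper:

```python
import tensorflow as tf
from tensorflow.keras import layers

def se_block(x, ratio=16):
    """Squeeze-and-Excitation: re-weight the channels of a feature map."""
    channels = x.shape[-1]
    # Squeeze: global spatial average -> one value per channel.
    s = layers.GlobalAveragePooling2D()(x)
    # Excitation: two FC layers produce a (0, 1) weight per channel.
    s = layers.Dense(channels // ratio, activation="relu")(s)
    s = layers.Dense(channels, activation="sigmoid")(s)
    s = layers.Reshape((1, 1, channels))(s)
    # Scale: emphasize high-value channels, suppress low-value ones.
    return layers.Multiply()([x, s])
```

In SE-U-Net++, a block like this would be attached after each encoder stage of U-Net++.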
20. Proposed Transfer Learning Algorithm
• Use cosine similarity to visualize the difference in learned features between ImageNet and segmentation
• Freeze training in the areas whose learned features are similar (a sketch of the comparison follows)
➡ This makes fine tuning applicable to models with a U-Net structure
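A hedged sketch of the layer-by-layer comparison; NumPy and the 0.9 threshold are assumptions, since the slides specify cosine similarity but no threshold:

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two flattened convolution kernels."""
    a, b = a.ravel(), b.ravel()
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def choose_layers_to_freeze(pretrained_kernels, segmentation_kernels,
                            threshold=0.9):
    """Freeze conv layers whose ImageNet and segmentation parameters
    remain similar; leave dissimilar layers trainable."""
    freeze = []
    for w_pre, w_seg in zip(pretrained_kernels, segmentation_kernels):
        freeze.append(cosine_similarity(w_pre, w_seg) >= threshold)
    return freeze  # freeze[i] == True -> freeze conv layer i
```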
22. Experiment Data
Kaggle's 2018 Data Science Bowl
• The task is to detect cell nuclei
• 670 training images
[Figure: an input microscopy image and the predicted nucleus mask]
https://www.kaggle.com/c/data-science-bowl-2018/data
23. Experiment Models
The models used are U-Net, U-Net++, and SE-U-Net++, each with a VGG16 encoder consisting of 13 convolutional layers (enumerated in the sketch below).
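For reference, the 13 convolutional layers can be counted on the stock Keras VGG16; this tooling is an assumption, not the authors' code:

```python
import tensorflow as tf

vgg = tf.keras.applications.VGG16(weights="imagenet", include_top=False)
conv_layers = [l for l in vgg.layers
               if isinstance(l, tf.keras.layers.Conv2D)]
print(len(conv_layers))  # 13 convolutional layers serve as the encoder
```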
24. Result of SE-U-Net++

Mean IoU after 200 epochs:
            unsupervised  supervised
U-Net       0.9246        0.9336
U-Net++     0.9302        0.9443
SE-U-Net++  0.9363        0.9445

Mean IoU after 1000 epochs:
            unsupervised  supervised
U-Net       0.9735        0.9756
U-Net++     0.9754        0.9788
SE-U-Net++  0.9769        0.9786

➡ SE-U-Net++ is more efficient in learning
25. The Difference in Learned Features between ImageNet and Segmentation
In the first half of the encoder, the features learned by the U-Net-structured models differ greatly from those learned on ImageNet.
26. Learned Features of SE-U-Net++ vs. Traditional Fine Tuning
Freezing an area with a large difference in learned features reduces performance.
[Figure: the area frozen by traditional fine tuning overlaps the layers whose learned features differ most]
27. Result of Proposed Transfer Learning Algorithm

Mean IoU after 200 epochs:
            supervised  proposed
U-Net       0.9336      0.9341
U-Net++     0.9443      0.9453
SE-U-Net++  0.9445      0.9454

Mean IoU after 1000 epochs:
            supervised  proposed
U-Net       0.9756      0.9757
U-Net++     0.9788      0.9789
SE-U-Net++  0.9786      0.9787

➡ Fine tuning has been successfully applied to models with a U-Net structure
29. Conclusion
• We proposed SE-U-Net++, which efficiently learns both local features and location information by attaching SE blocks.
• We also proposed a transfer learning algorithm that bridges the difference between the tasks by comparing parameters in the convolutional layers.
• As a result, SE-U-Net++ showed better performance than U-Net++.
• We were also able to apply fine tuning to models with a U-Net structure.