1. The document describes Osamu Akiyama's participation in the 2018 Data Science Bowl competition, where he ranked 9th in Stage 1 and 71st of 3,634 in Stage 2.
2. It then discusses the competition format which involved a two-stage structure with limited training data, requiring strong generalization. Two popular approaches for the instance segmentation task were Mask R-CNN and U-Net.
3. Akiyama's solution was based on Deep Watershed Transform, using an ensemble of segmentation networks. However, his attempt to use GANs for semi-supervised learning was unsuccessful. The top solution used an enhanced U-Net model with techniques like touching border prediction and combined loss functions.
kaggle Tokyo Meetup #4 Lightning Talk 2018 Data Science Bowl
1. 2018.05.12 @osciiart Stage 1: 9th, Stage 2: 71st /3634
2. Who am I?
秋山理 Osamu Akiyama @osciiart
Biography
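• Master's degree in Life Sciences, Kyoto University
• 5th-year medical student, Osaka University Faculty of Medicine (age 31)
• Research: neuroscience, brain-machine interfaces (BMI)
• AI Medical Study Group (AIメディカル研究会, AIMS)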
paper
• Akiyama O. ASCII Art Synthesis with Convolutional Networks. NIPS 2017 Workshop, Machine Learning for Creativity and Design. 2017.
• ASCII.jp: How far can deep learning improve the accuracy of ASCII art?
• VICE MOTHERBOARD: This Machine Learning Algorithm Can Turn Any Line Drawing Into ASCII Art
Kaggle status (@osciiart)
• 3 Silver, 1 Bronze
Other competition results
• DeepAnalytics, Bayer Yakuhin pharmaceutical information text mining: 2nd / 127
• Bioinformatics Contest 2018 20th
4. Evaluation
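Pred vs. Label: IoU > threshold -> True Positive
Average Precision (AP): computed at each IoU threshold
mean AP (mAP): mean of AP over the thresholds 0.50, 0.55, 0.60, 0.65, 0.70, 0.75, 0.80, 0.85, 0.90, 0.95
[Figure: AP (0.00 to 1.00) plotted against the IoU threshold; mAP summarizes the curve]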
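The evaluation rule above can be sketched in code. This is a minimal sketch, assuming the competition's published definition AP(t) = TP(t) / (TP(t) + FP(t) + FN(t)) at each IoU threshold t, with mAP as the mean over the ten thresholds. The function names and the handling of an image with no objects are assumptions made for illustration, not taken from the slides. The competition score additionally averages this value over all test images.

```python
import numpy as np

# IoU thresholds listed on the slide.
THRESHOLDS = (0.50, 0.55, 0.60, 0.65, 0.70, 0.75, 0.80, 0.85, 0.90, 0.95)


def average_precision(pred, label, threshold):
    """AP at one IoU threshold: TP / (TP + FP + FN).

    pred, label: 2-D integer arrays, 0 = background, 1..N = instance ids.
    """
    pred_ids = [i for i in np.unique(pred) if i != 0]
    label_ids = [i for i in np.unique(label) if i != 0]
    matched = set()
    tp = 0
    for p in pred_ids:
        p_mask = pred == p
        for g in label_ids:
            if g in matched:
                continue
            g_mask = label == g
            inter = np.logical_and(p_mask, g_mask).sum()
            union = np.logical_or(p_mask, g_mask).sum()
            if inter / union > threshold:  # IoU > threshold -> True Positive
                tp += 1
                matched.add(g)
                break
    fp = len(pred_ids) - tp   # predictions with no matching label
    fn = len(label_ids) - tp  # labels with no matching prediction
    total = tp + fp + fn
    # No objects on either side: undefined; 0.0 is an arbitrary choice here.
    return tp / total if total else 0.0


def mean_average_precision(pred, label):
    """Mean of AP over the ten IoU thresholds."""
    return float(np.mean([average_precision(pred, label, t) for t in THRESHOLDS]))
```

Because instance masks do not overlap and every threshold is at least 0.5, a label can match at most one prediction, so the greedy matching above is sufficient.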
5. 2 Stage Competition
Strong generalization required
Train data: 665
Stage 1 Test data: 65
Stage 2 Test data: 3019 (most of them are fake)
6. Mask R-CNN vs U-Net
Mask R-CNN: 2-stage detector
• Separates the detection and segmentation processes
• High accuracy (state of the art)
• Handles occlusion and class imbalance
• Hard to train
U-Net: 1-stage detector
• Cannot separate instances as-is
• Simple and fast
• Weak against occlusion and class imbalance
• Easy to ensemble
Ronneberger, O., Fischer, P., Brox, T. U-Net: Convolutional Networks for Biomedical Image Segmentation. arXiv. 2015.
He, K., Gkioxari, G., Dollár, P., Girshick, R. Mask R-CNN. arXiv. 2017.
7. The Organizer Stands Like God
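Allen, one of the organizers, held 1st place by a wide margin from the start of the competition
In the end, nobody overtook Allen in Stage 1
Because Allen actively shared his method, the competition turned into a contest over how well his method could be reproduced
Because Allen's method was Mask R-CNN, many participants focused on Mask R-CNN
[Figure: Stage 1 leaderboard (LB)]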
8. My Solution: (based on) Deep Watershed Transform
• 3 nets in series -> in parallel (for simplification)
• Binned depth classification -> normalized depth regression (for size augmentation)
Bai M, Urtasun R. Deep Watershed Transform for Instance Segmentation. arXiv. 2016.
[Diagram: network architecture; labels: SegNet, Direction Net, Depth Net, DeepLab V3+']
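The change from binned depth classification to normalized depth regression can be illustrated with a regression target. This is a minimal sketch, assuming that "normalized" means each instance's depth map is divided by its own maximum, so that small and large nuclei share the range [0, 1]. The slides do not state the exact normalization. The depth here is a city-block distance obtained by repeated 4-neighbour erosion, used as a dependency-free stand-in for a Euclidean distance transform. Function names are hypothetical.

```python
import numpy as np


def erode4(mask):
    """One step of binary erosion with a 4-neighbour (cross) element."""
    padded = np.pad(mask, 1, mode="constant", constant_values=False)
    return (
        padded[1:-1, 1:-1]
        & padded[:-2, 1:-1]
        & padded[2:, 1:-1]
        & padded[1:-1, :-2]
        & padded[1:-1, 2:]
    )


def normalized_depth(label):
    """Per-instance depth map scaled to [0, 1].

    label: 2-D integer array, 0 = background, 1..N = instance ids.
    """
    depth = np.zeros(label.shape, dtype=np.float64)
    for inst_id in np.unique(label):
        if inst_id == 0:
            continue
        mask = label == inst_id
        inst_depth = np.zeros(label.shape, dtype=np.float64)
        current = mask.copy()
        while current.any():
            inst_depth += current      # surviving pixels gain one depth level
            current = erode4(current)  # peel off the outermost layer
        # Assumption: normalize by the instance's own maximum depth.
        depth[mask] = inst_depth[mask] / inst_depth.max()
    return depth
```

With this target the centre of every instance is 1.0 regardless of its size, which is consistent with the "for size augmentation" note: resizing an image changes raw depths but leaves the normalized target's range unchanged.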
• Augmentation
  • Random cropping
  • Resize (0.5 to 2.0)
  • Rotation (-180° to 180°)
  • Flip
  • Hue, Saturation, Lightness
• TTA (test-time augmentation)
  • Mean diameter (25, 30, 35, 40, 45 pixels)
  • Flip
  • Rotation (0°, 90°,…, 270°)
Marvelous Article: Applying Deep Watershed Transform to Kaggle Data Science Bowl 2018
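The flip and rotation parts of the TTA list can be sketched as below. This is an illustration only: `predict` is a hypothetical callable standing in for the trained network, and the mean-diameter rescaling is left out because it needs an image-resizing library. The sketch assumes the network outputs a scalar per-pixel map (such as depth); a direction-vector output would also need its vector components rotated and flipped.

```python
import numpy as np


def tta_predict(predict, image):
    """Average predictions over 4 rotations x {no flip, horizontal flip}.

    predict: hypothetical callable mapping an (H, W) or (H, W, C) array
             to an (H, W) per-pixel map aligned with its input.
    """
    outputs = []
    for flip in (False, True):
        for k in range(4):  # 0, 90, 180, 270 degrees
            aug = np.fliplr(image) if flip else image
            aug = np.rot90(aug, k, axes=(0, 1))
            out = predict(aug)
            out = np.rot90(out, -k, axes=(0, 1))  # undo the rotation first
            if flip:
                out = np.fliplr(out)              # then undo the flip
            outputs.append(out)
    return np.mean(outputs, axis=0)
```

The inverse transforms are applied in the reverse order of the forward ones, so every prediction is aligned with the original image before averaging.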
9. My Solution: Semi-supervised by GAN
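(didn't work)
• Generator (labeled): G maps Input to Prediction; trained with MSE Loss against True Label plus Adv Loss from D, with "Real Pair" as the target
• Discriminator: D receives (Input, True Label) or (Input, Prediction) and classifies the pair as Real Pair or Fake Pair (Adv Loss)
• Generator (unlabeled): G maps Input to Prediction; trained with Adv Loss from D only, with "Real Pair" as the target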
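The three training signals named in the diagram (MSE Loss, Adv Loss, Real Pair or Fake Pair) can be written as loss functions. This is a minimal sketch of a common adversarial semi-supervised setup, not the author's code: it assumes the discriminator D outputs the probability that an (Input, label map) pair is a real pair, and that the adversarial loss is binary cross-entropy. `adv_weight` and all function names are hypothetical. The networks are omitted; the functions take their outputs as arrays.

```python
import numpy as np


def bce(prob, target):
    """Binary cross-entropy between probabilities and a constant 0/1 target."""
    prob = np.clip(prob, 1e-7, 1.0 - 1e-7)
    return float(np.mean(-(target * np.log(prob)
                           + (1.0 - target) * np.log(1.0 - prob))))


def generator_loss_labeled(pred, true_label, d_on_pred, adv_weight=0.1):
    """Labeled data: MSE against the true label plus an adversarial term.

    d_on_pred: D's output on (Input, Prediction); G wants it judged "real".
    adv_weight: hypothetical weighting, not given in the slides.
    """
    mse = float(np.mean((pred - true_label) ** 2))
    return mse + adv_weight * bce(d_on_pred, 1.0)


def generator_loss_unlabeled(d_on_pred):
    """Unlabeled data: no true label exists, so only the adversarial term."""
    return bce(d_on_pred, 1.0)


def discriminator_loss(d_on_true, d_on_pred):
    """D learns (Input, True Label) -> real and (Input, Prediction) -> fake."""
    return bce(d_on_true, 1.0) + bce(d_on_pred, 0.0)
```

The unlabeled branch is what makes the scheme semi-supervised: the generator gets a gradient from the discriminator even on images that have no ground-truth masks.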
10. 1st place solution: U-Net on Steroids
• targets - we predict touching borders along with the masks to solve the problem as instance segmentation
• loss function - that combines cross-entropy and soft dice loss in such a way that pixel imbalance doesn't affect the results
• very deep encoder-decoder architectures that also achieve state-of-the-art results in other binary segmentation problems (SpaceNet, Inria and others)
• tricky postprocessing that combines watershed, morphological features and second-level model with Gradient Boosted Trees (increased 0.015)
• task specific data augmentations
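The loss-function bullet can be illustrated with a small example. This is a minimal sketch, assuming the combination is a weighted sum of binary cross-entropy and (1 - soft dice). The slides do not give the exact form used by the winning team, and `bce_weight`, `dice_weight` and the function names are hypothetical.

```python
import numpy as np


def bce_loss(prob, target, eps=1e-7):
    """Pixel-wise binary cross-entropy, averaged over all pixels."""
    prob = np.clip(prob, eps, 1.0 - eps)
    return float(np.mean(-(target * np.log(prob)
                           + (1.0 - target) * np.log(1.0 - prob))))


def soft_dice_loss(prob, target, eps=1e-6):
    """1 - soft dice coefficient, computed on probabilities (no thresholding)."""
    inter = np.sum(prob * target)
    denom = np.sum(prob) + np.sum(target)
    return float(1.0 - (2.0 * inter + eps) / (denom + eps))


def combined_loss(prob, target, bce_weight=1.0, dice_weight=1.0):
    """Weighted sum of cross-entropy and soft dice loss (weights are assumptions)."""
    return (bce_weight * bce_loss(prob, target)
            + dice_weight * soft_dice_loss(prob, target))
```

The dice term is why pixel imbalance matters less: on a target with 1 foreground pixel out of 100, predicting all background gives a cross-entropy of roughly 0.16 because the 99 correct pixels dominate the average, while the soft dice loss stays close to 1.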