Instance Segmentation
The first independent seminar #8
Bar Vinograd / 25.03.2018 / Tel Aviv University
Agenda
● What is Instance Segmentation?
● Mask R-CNN Overview
● Instance Embedding (3 papers)
● Summary
What is instance segmentation?
http://cs231n.stanford.edu/index.html
Datasets
● Stills
○ CVPPP leaf segmentation
○ PASCAL VOC
○ COCO
○ CityScapes
○ KITTI Vehicles
○ ...
● Video
○ DAVIS
○ CityScapes
MASK R-CNN
Kaiming He, Georgia Gkioxari, Piotr Dollár, Ross Girshick @ FAIR
https://arxiv.org/abs/1703.06870
DensePose: Dense Human Pose Estimation In The Wild
Rıza Alp Güler, Natalia Neverova, Iasonas Kokkinos @ FAIR
Replaces the Mask R-CNN mask head with a body part / position head
https://arxiv.org/abs/1802.00434
Problems with Mask R-CNN
● Slow: ~5 fps on a GTX 1080 Ti at 800x1100
● There may be more than one instance in each box
● Performs poorly on objects with a low box fill rate (chair, bicycle)
● A pixel may be shared by multiple objects
● Multi-step: complex to implement and tweak
RetinaNet : Focal Loss for Dense Object Detection
Tsung-Yi Lin, Priya Goyal, Ross Girshick, Kaiming He, Piotr Dollár @ FAIR
https://arxiv.org/abs/1708.02002
THE FUTURE
Instance Embedding
Give every pixel an n-dimensional “color” in an embedding space and cluster in that space
Papers for today:
● 1703.10277 - Semantic Instance Segmentation via Deep Metric Learning
● 1708.02551 - Semantic Instance Segmentation with a Discriminative Loss Function
● 1712.08273 - Recurrent Pixel Embedding for Instance Grouping
Example: the 2018 Data Science Bowl on Kaggle (instance segmentation of cell nuclei).
[Figure panels: original image, semantic segmentation, first 7 dimensions of the embedding space]
Semantic Instance Segmentation via Deep Metric Learning
Alireza Fathi, Zbigniew Wojna, Vivek Rathod, Peng Wang, Hyun Oh Song, Sergio Guadarrama, Kevin P. Murphy
Google / UCLA
https://arxiv.org/abs/1703.10277
Pairwise pixel loss
The pair weights are set so that large and small objects are balanced, and they sum to 1.
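A minimal sketch of such a pairwise loss. The similarity function 2 / (1 + exp(d²)), the cross-entropy form, and the inverse-instance-size weighting are assumptions drawn from arXiv:1703.10277, not something stated on the slide. The function name and arguments are hypothetical.

```python
import numpy as np

def pairwise_embedding_loss(emb, inst):
    """emb: (N, D) embeddings of N sampled pixels; inst: (N,) instance ids."""
    emb = np.asarray(emb, dtype=float)
    inst = np.asarray(inst)
    # Squared Euclidean distance between every pair of pixels.
    diff = emb[:, None, :] - emb[None, :, :]
    d2 = np.minimum(np.sum(diff ** 2, axis=-1), 50.0)  # clip to avoid overflow in exp
    # Assumed similarity in (0, 1]: 1 for identical embeddings, near 0 when far apart.
    sim = 2.0 / (1.0 + np.exp(d2))
    same = (inst[:, None] == inst[None, :]).astype(float)
    # Assumed weighting: each pixel is weighted inversely to its instance size,
    # so large and small objects contribute equally; pair weights sum to 1.
    _, inverse, counts = np.unique(inst, return_inverse=True, return_counts=True)
    w = 1.0 / counts[inverse]
    w_pair = np.outer(w, w)
    w_pair /= w_pair.sum()
    eps = 1e-12
    # Weighted cross-entropy: same-instance pairs want sim -> 1, others sim -> 0.
    ce = same * np.log(sim + eps) + (1.0 - same) * np.log(1.0 - sim + eps)
    return float(-np.sum(w_pair * ce))
```

In practice the loss would be computed on a sampled subset of pixels per image, since the pair matrix grows quadratically with the number of pixels.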
Training the seeds
Pick K (= 10) pixels at random and grow a mask around each of them with various thresholds τ.
If a mask has a sufficient intersection with a ground-truth object, the pixel is assigned that object's class.
Picking the seeds
Unlike NMS, diversity in embedding space is encouraged, rather than spatial diversity.
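A rough sketch of greedy seed selection that rewards diversity in embedding space. The scoring rule (log seediness plus alpha times the mean squared embedding distance to the seeds already chosen) is an assumption modelled on arXiv:1703.10277. The function name, arguments, and the default alpha are hypothetical.

```python
import numpy as np

def pick_seeds(emb, seediness, num_seeds, alpha=1.0):
    """emb: (N, D) pixel embeddings; seediness: (N,) scores in (0, 1]."""
    emb = np.asarray(emb, dtype=float)
    log_s = np.log(np.asarray(seediness, dtype=float) + 1e-12)
    chosen = [int(np.argmax(log_s))]  # first seed: highest seediness
    while len(chosen) < min(num_seeds, len(emb)):
        # Mean squared embedding distance from each pixel to the chosen seeds.
        d2 = ((emb[:, None, :] - emb[chosen][None, :, :]) ** 2).sum(-1).mean(1)
        # Assumed score: trade off seediness against embedding-space diversity.
        score = log_s + alpha * d2
        score[chosen] = -np.inf  # never pick the same pixel twice
        chosen.append(int(np.argmax(score)))
    return chosen
```

Note that nothing here looks at pixel coordinates: two seeds can be spatially adjacent as long as their embeddings are far apart.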
● DeepLab v2 (ResNet-101) backbone, pre-trained on COCO
● Training starts with the classification/seediness loss weight at 0, which is gradually increased to 0.2
● The backbone is applied to an image pyramid (scales 0.25, 0.5, 1, 2) and the results are fed to the embedding and seediness models
● Evaluated on PASCAL VOC 2012
Semantic Instance Segmentation with a Discriminative Loss Function
Bert De Brabandere, Davy Neven, Luc Van Gool
ESAT-PSI, KU Leuven
https://arxiv.org/abs/1708.02551
https://github.com/DavyNeven/fastSceneUnderstanding
● Very similar to the previous paper
● Uses a discriminative loss
● Each class is embedded independently
● Pulling force: weight α = 1, pull threshold 0.5
● Pushing force: weight β = 1, push threshold 1.5
● Regularization: weight γ = 0.001
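A minimal sketch of the three-term loss for a single image and class, using the weights and thresholds listed above. The exact hinge forms (distance to centre minus the pull threshold, and twice the push threshold minus the centre-to-centre distance, both squared) are assumptions drawn from arXiv:1708.02551. The function and argument names are hypothetical.

```python
import numpy as np

def discriminative_loss(emb, inst, delta_pull=0.5, delta_push=1.5,
                        alpha=1.0, beta=1.0, gamma=0.001):
    """emb: (N, D) pixel embeddings; inst: (N,) instance ids."""
    emb = np.asarray(emb, dtype=float)
    inst = np.asarray(inst)
    ids = np.unique(inst)
    centers = np.stack([emb[inst == i].mean(axis=0) for i in ids])
    # Pull: hinge on each pixel's distance to its own instance centre.
    pull = 0.0
    for c, i in zip(centers, ids):
        dist = np.linalg.norm(emb[inst == i] - c, axis=1)
        pull += np.mean(np.maximum(dist - delta_pull, 0.0) ** 2)
    pull /= len(ids)
    # Push: hinge on the distance between every pair of distinct centres.
    push = 0.0
    if len(ids) > 1:
        cdist = np.linalg.norm(centers[:, None] - centers[None, :], axis=-1)
        hinge = np.maximum(2.0 * delta_push - cdist, 0.0) ** 2
        off_diag = ~np.eye(len(ids), dtype=bool)
        push = hinge[off_diag].mean()
    # Regularisation: keep centres close to the origin.
    reg = np.mean(np.linalg.norm(centers, axis=1))
    return float(alpha * pull + beta * push + gamma * reg)
```

Because both hinges are exactly zero once clusters are tight and well separated, only the small regularization term remains for a converged embedding.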
Parsing | mean-shift clustering
1. Pick an unlabeled pixel and assume its embedding is the mean of the instance
a. Find all pixels that are close (below threshold) to the current mean
b. Compute the mean of the new set in embedding space
c. Go to step a. and repeat until convergence (the mean stops changing)
2. Go to step 1. if any unlabeled pixels remain
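The steps above can be sketched as follows. This is an illustrative implementation only: the function name, the default threshold, and the iteration cap are assumptions, and a real pipeline would first restrict `emb` to pixels that the semantic segmentation marks as foreground.

```python
import numpy as np

def cluster_embeddings(emb, threshold=1.5, max_iter=100):
    """emb: (N, D) pixel embeddings. Returns an (N,) array of instance labels."""
    emb = np.asarray(emb, dtype=float)
    labels = np.full(len(emb), -1)  # -1 means "unlabeled"
    next_id = 0
    while np.any(labels == -1):
        unlabeled = labels == -1
        # Step 1: take an unlabeled pixel's embedding as the initial mean.
        mean = emb[np.flatnonzero(unlabeled)[0]]
        for _ in range(max_iter):
            # Step a: unlabeled pixels within the threshold of the current mean.
            member = unlabeled & (np.linalg.norm(emb - mean, axis=1) < threshold)
            # Step b: recompute the mean over that set.
            new_mean = emb[member].mean(axis=0)
            # Step c: stop once the mean no longer moves.
            if np.allclose(new_mean, mean):
                break
            mean = new_mean
        # Step 2: label the converged set, then continue with what is left.
        labels[member] = next_id
        next_id += 1
    return labels
```

A threshold equal to the push threshold is a natural starting point when the loss has converged, since instance centres are then at least twice that distance apart.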
● A semantic segmentation mask should be trained alongside; clustering runs only on pixels that are considered part of an object
● Probably best to set the pull threshold to 0
● Unlike a loss on similar/different pixel pairs, with the contrastive loss information flows between all pixels
● Semantic segmentation matters a lot
● No need to balance instance sizes
Recurrent Pixel Embedding for Instance Grouping
Shu Kong, Charless Fowlkes
University of California, Irvine
https://arxiv.org/abs/1712.08273
https://github.com/aimerykong/Recurrent-Pixel-Embedding-for-Instance-Grouping
● Embedding on an n-dimensional sphere
● Pairwise pixel loss, cosine distance
● Main contribution: mean-shift clustering is part of the model and differentiable
● Calibrated cosine distance
○ Weighted by the size of the instances
○ Use α = 0.5
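A sketch of a pairwise loss built on the calibrated cosine similarity. The calibration (1 + cos) / 2, the use of α as a margin for pixels from different instances, and the inverse-instance-size weighting are assumptions drawn from arXiv:1712.08273. The function and argument names are hypothetical.

```python
import numpy as np

def cosine_pairwise_loss(emb, inst, alpha=0.5):
    """emb: (N, D) pixel embeddings; inst: (N,) instance ids."""
    emb = np.asarray(emb, dtype=float)
    inst = np.asarray(inst)
    unit = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    # Calibrated cosine similarity, rescaled from [-1, 1] to [0, 1].
    sim = 0.5 * (1.0 + unit @ unit.T)
    same = inst[:, None] == inst[None, :]
    # Same instance: push similarity to 1. Different: penalise only above the margin.
    per_pair = np.where(same, 1.0 - sim, np.maximum(sim - alpha, 0.0))
    # Assumed weighting: each pixel is weighted inversely to its instance size.
    _, inverse, counts = np.unique(inst, return_inverse=True, return_counts=True)
    w = 1.0 / counts[inverse]
    w_pair = np.outer(w, w)
    return float(np.sum(w_pair * per_pair) / np.sum(w_pair))
```

With α = 0.5 the penalty for different instances vanishes as soon as their embeddings are orthogonal, so instances do not need to be pushed to opposite sides of the sphere.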
● Gaussian Blurring Mean Shift (GBMS)
● Cubic convergence guarantees
● May be applied only to neighbourhoods or any subset of the whole image
More on GBMS: http://www.cs.cmu.edu/~aarti/SMLRG/miguel_slides.pdf
● A Gaussian kernel is not appropriate because distances should be measured with the cosine distance
● Use the von Mises-Fisher distribution instead: a “Gaussian” on the sphere surface
● L2 normalization should be performed after each iteration
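A minimal sketch of blurring mean shift on the unit sphere with a von Mises-Fisher kernel and L2 normalization after every step. The kernel form exp(delta * cosine similarity), the concentration value, and the iteration count are assumptions for illustration. The paper's recurrent module may parameterise the update differently.

```python
import numpy as np

def vmf_mean_shift(emb, delta=5.0, num_iter=10):
    """emb: (N, D) embeddings. Returns (N, D) unit vectors after blurring."""
    x = np.asarray(emb, dtype=float)
    x = x / np.linalg.norm(x, axis=1, keepdims=True)  # start on the sphere
    for _ in range(num_iter):
        # von Mises-Fisher kernel: exp(delta * cosine similarity).
        k = np.exp(delta * (x @ x.T))
        # Blurring step: every point moves to the kernel-weighted mean of all points.
        x = (k / k.sum(axis=1, keepdims=True)) @ x
        # Project back onto the unit sphere after every iteration.
        x = x / np.linalg.norm(x, axis=1, keepdims=True)
    return x
```

Every operation here is differentiable, which is what allows the clustering step to sit inside the network and receive gradients.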
Uses the von Mises-Fisher distribution (a Gaussian on the sphere surface) instead of the Gaussian kernel.
● Computing the similarity matrix is expensive, so only a subset of the pixels (~50%) participates in this phase
● The loss is backpropagated at each iteration of the module
● The iterative application can be seen as analogous to hard negative mining
● DeepLab-v3 is used as the backbone
Comparison
● Semantic Instance Segmentation via Deep Metric Learning
○ Embedding loss: pairwise sigmoid-like loss, Euclidean distance
○ Seeds: learned seediness score
○ Parsing: expand mask around seeds
● Semantic Instance Segmentation with a Discriminative Loss Function
○ Embedding loss: center ≠ center, point -> center, Euclidean distance
○ Seeds: random
○ Parsing: mean-shift around seeds
● Recurrent Pixel Embedding for Instance Grouping
○ Embedding loss: pairwise pixel + GBMS, cosine distance
○ Seeds: random
○ Parsing: GBMS proposals and simple LR + mean shift
Other papers
● End-to-End Instance Segmentation with Recurrent Attention
https://arxiv.org/abs/1605.09410
● Deep Watershed Transform for Instance Segmentation
https://arxiv.org/abs/1611.08303
● Associative Embedding: End-to-End Learning for Joint Detection and Grouping
http://ttic.uchicago.edu/~mmaire/papers/pdf/affinity_cnn_cvpr2016.pdf
● SGN: Sequential Grouping Networks for Instance Segmentation
https://www.cs.toronto.edu/~urtasun/publications/liu_etal_iccv17.pdf
Takeaways
● Use the contrastive loss with a pulling threshold of 0
● Either learn a seediness model or implement GBMS
● The accuracy/speed trade-off is achieved almost exclusively by replacing the backbone
● Pretrain on COCO
● No one needs more than 64 dimensions of embedding space
● When all else fails, use Mask R-CNN
Questions?
Thank You!
me@barvinograd.com

Instance Segmentation with Embedding | Bar Vinograd
