2020/07/02
Working Group
Survey on Self-Supervised Image Segmentation
Arithmer R3 Div. R3 - Yann Le Guilly
Working Group: Survey on Self-Supervised Image Segmentation
Self-Introduction: Yann Le Guilly
● Education
○ Undergraduate in physics, University of Rennes 1 (France)
○ Master’s degree in applied mathematics, IRMAR laboratory, University of Rennes 1 (France)
○ Research Student at Tokyo Institute of Technology, Murata Laboratory
● Former Job
○ Machine Learning Engineer at Abeja, Inc.
■ Development of AI-based products and proofs of concept
● Current Job
○ Machine Learning Engineer
■ Development of AI-based products and proofs of concept
Agenda
● Introduction
○ Image Segmentation
○ Problems
○ Self-supervised learning
○ Content of this presentation
● Survey
○ Pseudo Labeling
○ Class Activation Map
○ Image Depth Information
○ How about using videos
● Some comments
Introduction: Image Segmentation
Introduction: Problems
CVAT: open source annotation tool
- very hard to annotate (= expensive)
- pixel-wise annotation
- different types of objects
- boundaries are often blurry
- easy to make mistakes
- lots of unclear cases
- need to stay focused for a long time
How can we change this situation?
Introduction: Self-supervised learning
● Research topic pushed by the “godfathers of AI”, especially Yann LeCun
● Pre-train a feature extractor in an unsupervised way, then train the classifier with annotated data
● Recent papers reach SOTA with only 10% of the labels (and are starting to go beyond)
● Methods coming from NLP
Slide from Yann LeCun’s (recurring) presentation
Introduction: Content of this presentation
- 4 very different directions
  - Using pseudo labeling
  - Using activation maps (like Grad-CAM)
  - Using depth information
  - Leveraging frames in videos
- High-level explanations
  - For more details, I provided notes for some papers
  - For even more details, please read the original papers
Spoiler: no self-supervised semantic segmentation method is very convincing...
Pseudo Labeling: Introduction
- Usually uses the softmax score to separate “strong predictions” from “weak predictions”
- Keeps only the “strong predictions” as pseudo-labels
- Popular research topic
Issue:
- The top softmax score is not the best criterion, since it takes the best score without considering the other scores (which might also be promising)
Simple illustration of what pseudo labeling is
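The thresholding step described on this slide can be sketched in a few lines. This is only an illustration: the `IGNORE` marker, the `pseudo_labels` helper and the 0.9 threshold are assumptions made for the example, not values from any of the surveyed papers.

```python
import math

IGNORE = -1  # hypothetical marker for "no pseudo-label assigned"


def softmax(logits):
    """Numerically stable softmax over a list of class scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def pseudo_labels(pixel_logits, threshold=0.9):
    """Keep the argmax class only where the top softmax score is 'strong'."""
    labels = []
    for logits in pixel_logits:
        probs = softmax(logits)
        best = max(range(len(probs)), key=probs.__getitem__)
        labels.append(best if probs[best] >= threshold else IGNORE)
    return labels


# Three pixels, three classes: confident, ambiguous, confident.
logits = [[8.0, 0.0, 0.0], [1.0, 0.9, 0.8], [0.0, 0.0, 6.0]]
labels = pseudo_labels(logits)
```

Pixels marked `IGNORE` would simply be left out of the loss when the model is retrained on its own predictions.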
Pseudo Labeling: Entropy-guided (1/2)
My notes:
https://arithmer.co.jp/wp-content/uploads/pdf/notes_ESL_Entropy-guided_Self-supervised_Learning_for_Domain_Adaptation_in_Semantic_Segmentation.pdf
Original paper: https://arxiv.org/abs/2006.08658v1
Repo: https://github.com/liyunsheng13/BDL
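One way to read "entropy-guided" is to score each pixel by the entropy of its full class distribution instead of its top softmax score alone. The sketch below is a generic illustration of that idea, not the procedure from the paper; the function names and the `max_entropy` threshold are assumptions.

```python
import math


def normalized_entropy(probs):
    """Shannon entropy of a class distribution, scaled to [0, 1]."""
    h = -sum(p * math.log(p) for p in probs if p > 0.0)
    return h / math.log(len(probs))


def entropy_filter(pixel_probs, max_entropy=0.5):
    """Return a keep/drop flag per pixel based on prediction entropy."""
    return [normalized_entropy(p) <= max_entropy for p in pixel_probs]


# Both pixels have the same top score (0.6) but different remaining scores.
peaked = [0.6, 0.4, 0.0, 0.0]     # remaining mass on a single class
spread = [0.6, 0.14, 0.13, 0.13]  # remaining mass spread over three classes
keep = entropy_filter([peaked, spread], max_entropy=0.6)
```

A top-score threshold treats these two pixels identically, while the entropy criterion separates them because it looks at all the scores.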
Pseudo Labeling: Entropy-guided (2/2)
Model pre-trained on the GTA5 dataset, with the method applied to Cityscapes: it consistently improves the final accuracy, but by less than 1% mIoU.
State-of-the-art for fully supervised learning: 85.1% (IT = image translation)
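For readers unfamiliar with the metric, mIoU (mean intersection over union) can be computed as in the minimal sketch below, which works on flattened label maps. The helper name `mean_iou` is an assumption for illustration; benchmarks typically accumulate the counts over the whole validation set rather than per image.

```python
def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union over classes present in pred or target."""
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # skip classes absent from both maps
            ious.append(inter / union)
    return sum(ious) / len(ious)


# Flattened 2x3 label maps with two classes.
target = [0, 0, 1, 1, 1, 0]
pred = [0, 1, 1, 1, 0, 0]
score = mean_iou(pred, target, num_classes=2)
```

In this toy case each class overlaps on 2 pixels out of a union of 4, so both per-class IoUs and the mean are 0.5.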
Class Activation Map: Introduction
Original paper: https://arxiv.org/abs/1610.02391
- Uses the gradient information flowing into the neurons of a specific CNN layer to identify regions of activation
- A step forward for interpretability
- Work from 2017 (latest version from December 2019)
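The core Grad-CAM computation is short enough to sketch directly: each feature map is weighted by the spatial average of its gradient, the weighted maps are summed, and a ReLU is applied. The example below assumes the activations and gradients have already been extracted from a network; the toy values are hand-written for illustration.

```python
def grad_cam(activations, gradients):
    """Grad-CAM heat map from one layer's feature maps and class-score gradients.

    activations[k][i][j]: feature map k at position (i, j)
    gradients[k][i][j]:   d(class score) / d(activations[k][i][j])
    """
    height, width = len(activations[0]), len(activations[0][0])
    # Channel weight = global average of the gradient over spatial positions.
    weights = [sum(sum(row) for row in g) / (height * width) for g in gradients]
    cam = [[0.0] * width for _ in range(height)]
    for w, fmap in zip(weights, activations):
        for i in range(height):
            for j in range(width):
                cam[i][j] += w * fmap[i][j]
    # ReLU keeps only regions with a positive influence on the class score.
    return [[max(0.0, v) for v in row] for row in cam]


# Toy example: two 2x2 feature maps with hand-written gradients.
acts = [[[1.0, 0.0], [0.0, 2.0]], [[0.0, 3.0], [1.0, 0.0]]]
grads = [[[1.0, 1.0], [1.0, 1.0]], [[-1.0, -1.0], [-1.0, -1.0]]]
heatmap = grad_cam(acts, grads)
```

In practice the heat map is upsampled to the input resolution; thresholding it gives the coarse localization that weakly supervised segmentation methods start from.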
Class Activation Map: Equivariant Attention Mechanism
My notes:
https://arithmer.co.jp/wp-content/uploads/pdf/notes_Self-supervised_Equivariant_Attention_Mechanism_for_Weakly_Supervised_Semantic_Segmentation.pdf
Original paper: https://arxiv.org/abs/2004.04581
Repo: https://github.com/YudeWang/SEAM
1. Needs image-level annotation (class)
2. Combines 3 loss functions (ECR, ER and cls)
   a. Both CAM outputs should be similar despite the affine transform
   b. But the CAM degenerates (converges to a trivial solution), so ECR regularizes the PCM outputs with the original CAM
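The consistency constraint in (a) can be illustrated with a toy example: the CAM of a transformed image should match the transformed CAM of the original image. The sketch below is an assumption-heavy simplification: a horizontal flip stands in for the affine transform, and two hand-written functions stand in for the network. It is not the SEAM implementation.

```python
def hflip(grid):
    """Horizontal flip, standing in for the affine transform."""
    return [list(reversed(row)) for row in grid]


def l1_distance(a, b):
    """Mean absolute difference between two equally sized 2D maps."""
    diffs = [abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb)]
    return sum(diffs) / len(diffs)


def equivariance_loss(cam_fn, image, transform):
    """Compare CAM(transform(image)) with transform(CAM(image))."""
    return l1_distance(cam_fn(transform(image)), transform(cam_fn(image)))


def pixelwise_cam(image):
    """Toy 'network' acting on each pixel alone, so it commutes with flips."""
    return [[min(1.0, v / 4.0) for v in row] for row in image]


def left_biased_cam(image):
    """Toy 'network' that only responds in the left column, so it does not."""
    base = pixelwise_cam(image)
    return [[v if j == 0 else 0.0 for j, v in enumerate(row)] for row in base]


image = [[0.0, 4.0], [2.0, 0.0]]
loss_equivariant = equivariance_loss(pixelwise_cam, image, hflip)
loss_biased = equivariance_loss(left_biased_cam, image, hflip)
```

The equivariant toy model gets a loss of zero, while the position-biased one is penalized; minimizing such a term pushes the two CAM outputs to agree.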
Class Activation Map: Equivariant Attention Mechanism
Main contribution: PCM module
- Uses attention to refine the mask from Grad-CAM
- Attention can capture contextual information
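A rough sketch of attention-based refinement: each pixel's activation is replaced by an average of all activations, weighted by how similar the pixels' features are. The cosine-similarity affinity and the `refine_cam` helper below are simplifying assumptions meant to convey the idea; the actual PCM module in the repo has more components.

```python
import math


def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0.0 else 0.0


def refine_cam(features, cam):
    """Refine per-pixel CAM scores with a feature-affinity weighted average.

    features[i]: feature vector of pixel i (image flattened to a list)
    cam[i]:      raw activation score of pixel i
    """
    refined = []
    for f_i in features:
        # ReLU on the similarity drops dissimilar pixels from the average.
        affinity = [max(0.0, cosine(f_i, f_j)) for f_j in features]
        total = sum(affinity)
        weighted = sum(a * c for a, c in zip(affinity, cam))
        refined.append(weighted / total if total > 0.0 else 0.0)
    return refined


# Pixels 0-2 share one appearance, pixel 3 belongs to a different region.
features = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
cam = [0.9, 0.0, 0.6, 0.1]  # pixel 1 was missed by the raw CAM
refined = refine_cam(features, cam)
```

Pixel 1, missed by the raw map, picks up activation from the pixels that look like it, which is the contextual effect the slide refers to.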
Class Activation Map: Experiments
Pascal VOC 2012 val with image-level annotation only (no mask), using ResNet38 as the backbone.
Fully supervised learning SOTA: 90.0% (the backbone is huge in this case, though)
Comparison on the Pascal VOC 2012 dataset with other weakly supervised methods
Image Depth Information: HN labels
- HN labels generation
  - Computing angles and height relative to the floor plane using RGB-D
  - Everything is binned to create labels
- Training the segmentation model on HN labels
- Fine-tuning on the real dataset
The point is more to show:
pre-training on HN labels >> pre-training on ImageNet
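The binning step can be sketched as follows, assuming a height above the floor and an angle to the floor normal have already been computed for each pixel from the RGB-D data. The bin counts, the height range and the helper names are hypothetical choices for illustration, not the paper's settings.

```python
def bin_index(value, low, high, num_bins):
    """Index of the equal-width bin containing value, clamped to the range."""
    if value <= low:
        return 0
    if value >= high:
        return num_bins - 1
    return int((value - low) / (high - low) * num_bins)


def hn_label(height_m, angle_deg, height_bins=5, angle_bins=3, max_height_m=2.5):
    """Combine a height bin and a normal-angle bin into one class id."""
    h = bin_index(height_m, 0.0, max_height_m, height_bins)
    n = bin_index(angle_deg, 0.0, 180.0, angle_bins)
    return h * angle_bins + n


# (height above floor in metres, angle between surface normal and floor normal)
pixels = [(0.0, 0.0), (1.2, 90.0), (2.4, 180.0)]
labels = [hn_label(h, a) for h, a in pixels]
```

The resulting class ids need no human annotation, so they can serve as targets for pre-training a segmentation network before fine-tuning on real labels.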
Image Depth Information: Experiments
Dataset: NYUv2, trained for 50 epochs only. HN labels come from NYUv2. ImageNet is 25x bigger.
How about using videos: Self-supervised Video Object Segmentation
Original paper: https://arxiv.org/abs/2006.12480v1
- Learns, without supervision, to track similar pixels across frames
- Then, given an initial mask (in this illustration, for frame 0), the model can infer the same object in the following frames
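The inference step, carrying an initial mask to later frames by matching similar pixels, can be sketched as below. The per-pixel features would come from the self-supervised model; here they are hand-written, and the `propagate_mask` helper and its `temperature` parameter are assumptions for illustration rather than the paper's exact formulation.

```python
import math


def propagate_mask(ref_features, ref_mask, tgt_features, temperature=0.1):
    """Copy mask values from a reference frame to a target frame.

    Each target pixel takes a softmax-weighted average of the reference mask,
    weighted by feature similarity (dot product) to every reference pixel.
    """
    out = []
    for f_t in tgt_features:
        sims = [sum(a * b for a, b in zip(f_t, f_r)) / temperature
                for f_r in ref_features]
        m = max(sims)
        weights = [math.exp(s - m) for s in sims]
        total = sum(weights)
        out.append(sum(w * v for w, v in zip(weights, ref_mask)) / total)
    return out


# Frame 0: pixels 0 and 1 belong to the object (mask = 1), pixel 2 is background.
ref_features = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
ref_mask = [1.0, 1.0, 0.0]
# Frame 1: the object has moved, so object-like features are now at pixels 1 and 2.
tgt_features = [[0.0, 1.0], [1.0, 0.0], [1.0, 0.0]]
soft_mask = propagate_mask(ref_features, ref_mask, tgt_features)
hard_mask = [1 if v > 0.5 else 0 for v in soft_mask]
```

The mask follows the object to its new position without any segmentation label beyond the first frame.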
Results on YouTube-VOS val. Metrics: J (Mean), J (Recall), F (Mean), F (Recall). The models in the bottom part of the table are trained with full supervision.
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
 
Varsha Sewlal- Cyber Attacks on Critical Critical Infrastructure
Varsha Sewlal- Cyber Attacks on Critical Critical InfrastructureVarsha Sewlal- Cyber Attacks on Critical Critical Infrastructure
Varsha Sewlal- Cyber Attacks on Critical Critical Infrastructure
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
 
Generative AI - Gitex v1Generative AI - Gitex v1.pptx
Generative AI - Gitex v1Generative AI - Gitex v1.pptxGenerative AI - Gitex v1Generative AI - Gitex v1.pptx
Generative AI - Gitex v1Generative AI - Gitex v1.pptx
 

Survey on self supervised image segmentation

  • 1. 2020/07/02 Working Group Survey on Self-Supervised Image Segmentation Arithmer R3 Div. R3 - Yann Le Guilly
  • 2. Self-Introduction: Yann Le Guilly
      ● Education
        ○ Undergraduate in physics, University of Rennes 1 (France)
        ○ Master’s degree in applied mathematics, IRMAR laboratory, University of Rennes 1 (France)
        ○ Research Student at Tokyo Institute of Technology, Murata Laboratory
      ● Former Job
        ○ Machine Learning Engineer at Abeja, Inc.
          ■ Development of AI-based products and proofs of concept
      ● Current Job
        ○ Machine Learning Engineer
          ■ Development of AI-based products and proofs of concept
  • 3. Agenda
      ● Introduction
        ○ Image Segmentation
        ○ Problems
        ○ Self-supervised learning
        ○ Content of this presentation
      ● Survey
        ○ Pseudo Labeling
        ○ Class Activation Map
        ○ Image Depth Information
        ○ How about using videos
      ● Some comments
  • 4. Introduction: Image Segmentation
  • 5. Introduction: Problems
      CVAT: open source annotation tool
      - very hard to annotate (= expensive)
      - pixel-wise annotation
      - different types of objects
      - boundaries are often blurry
      - easy to make mistakes
      - lots of unclear cases
      - need to keep focus for a long time
      How can we change this situation?
  • 6. Introduction: Self-supervised learning
      ● Research topic pushed by the “godfathers of AI”, especially Yann LeCun
      ● Pre-train a feature extractor in an unsupervised way, then train the classifier with annotated data
      ● The latest papers reach SOTA with only 10% of the labels (starting to go beyond)
      ● Methods coming from NLP
      Slide from Yann LeCun’s (recurrent) presentation
  • 7. Introduction: Content of this presentation
      - 3 very different directions
        - Using pseudo labelling
        - Using Activation Map (like Grad-CAM)
        - Using depth information
      - Leverage frames in videos
      - High level explanations
      - For more details, I provided notes for some papers
      - For even more details, please read the original papers
      Spoiler: there is no self-supervised learning semantic segmentation method that is very convincing...
  • 8. Pseudo Labeling: Introduction
      - Usually uses the softmax score to separate “strong predictions” from “weak predictions”
      - Keeps only the “strong predictions” as pseudo-labels
      - Popular research topic
      Issue:
      - Softmax is not the best criterion, since it takes the best score without considering the other scores (which might also be promising)
      Figure: simple illustration of what pseudo labeling is
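The thresholding idea on this slide can be sketched in a few lines. This is a minimal illustration, not the method of any paper in this survey: the threshold of 0.9, the ignore value 255, and the toy logits are all assumptions chosen for the example. The entropy helper is included only to show the issue raised above, namely that a top-1 score alone says nothing about how the remaining probability mass is distributed.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of class scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(probs):
    """Shannon entropy (in nats) of a probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def pseudo_label(logits, threshold=0.9, ignore_index=255):
    """Keep the top-1 class only when its softmax score is 'strong'.

    threshold and ignore_index are illustrative assumptions.
    """
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return best if probs[best] >= threshold else ignore_index

# Three toy "pixels", each with logits for 3 classes (made-up values).
pixels = [[8.0, 1.0, 0.5], [2.0, 1.9, 0.1], [0.2, 0.1, 6.0]]
labels = [pseudo_label(p) for p in pixels]
uncertainties = [entropy(softmax(p)) for p in pixels]
```

The second pixel is dropped because its two best classes are almost tied; its entropy is also the highest of the three, which is the kind of signal an entropy-guided criterion can use instead of the top-1 score alone.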
  • 9. Pseudo Labeling: Entropy-guided (1/2)
      My notes: https://arithmer.co.jp/wp-content/uploads/pdf/notes_ESL_Entropy-guided_Self-supervised_Learning_for_Domain_Adaptation_in_Semantic_Segmentation.pdf
      Original paper: https://arxiv.org/abs/2006.08658v1
      Repo: https://github.com/liyunsheng13/BDL
  • 10. Pseudo Labeling: Entropy-guided (2/2)
      Model pre-trained on the GTA5 dataset, method applied to Cityscapes: it consistently improves the final accuracy, but by less than 1% mIoU
      State of the art for fully supervised learning: 85.1%
      (IT = image translation)
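For readers unfamiliar with the metric quoted above, here is a minimal sketch of mean Intersection-over-Union (mIoU) on flattened label lists. The ignore value 255 and the rule of skipping classes absent from both prediction and ground truth are common conventions assumed here, not details taken from the paper.

```python
def mean_iou(pred, target, n_classes, ignore_index=255):
    """Mean IoU over classes; pixels labelled ignore_index are skipped."""
    ious = []
    for c in range(n_classes):
        inter = 0
        union = 0
        for p, t in zip(pred, target):
            if t == ignore_index:
                continue
            if p == c and t == c:
                inter += 1
            if p == c or t == c:
                union += 1
        if union > 0:  # class appears in prediction or ground truth
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0

# Toy example with 3 classes and one ignored pixel (made-up labels).
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 255]
score = mean_iou(pred, target, n_classes=3)
```

Here the per-class IoUs are 1/2, 2/3 and 1, so the mean is 13/18, roughly 72.2%.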
  • 11. Class Activation Map: Introduction
      Original paper: https://arxiv.org/abs/1610.02391
      - Uses the gradient information flowing into the neurons of a specific CNN layer to identify regions of activation
      - A step forward for interpretability
      - Work from 2017 (last version from last December)
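The Grad-CAM computation itself is short once the activations of the chosen layer and the gradients of the class score with respect to them are available. The sketch below assumes both have already been extracted from a network (how to do that depends on the framework) and uses hand-made 2x2 values: each channel weight is the spatial mean of its gradient map, and the heatmap is the ReLU of the weighted sum of the activation maps.

```python
def grad_cam(activations, gradients):
    """Grad-CAM heatmap from per-channel activation and gradient maps.

    Both arguments are nested lists of shape [channels][height][width].
    """
    n_ch = len(activations)
    h = len(activations[0])
    w = len(activations[0][0])
    # Channel weight = global average of the gradient map of that channel.
    weights = [sum(sum(row) for row in g) / (h * w) for g in gradients]
    cam = [[0.0] * w for _ in range(h)]
    for k in range(n_ch):
        for i in range(h):
            for j in range(w):
                cam[i][j] += weights[k] * activations[k][i][j]
    # ReLU keeps only regions that push the class score up.
    return [[max(0.0, v) for v in row] for row in cam]

# Two channels of a 2x2 feature map (made-up values).
activations = [[[1.0, 0.0], [0.0, 2.0]],
               [[0.0, 3.0], [1.0, 0.0]]]
gradients = [[[0.5, 0.5], [0.5, 0.5]],
             [[-1.0, -1.0], [-1.0, -1.0]]]
heatmap = grad_cam(activations, gradients)
```

The second channel has a negative weight, so the locations it responds to are zeroed out by the ReLU and only the regions supported by the first channel remain.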
  • 12. Class Activation Map: Equivariant Attention Mechanism
      My notes: https://arithmer.co.jp/wp-content/uploads/pdf/notes_Self-supervised_Equivariant_Attention_Mechanism_for_Weakly_Supervised_Semantic_Segmentation.pdf
      Original paper: https://arxiv.org/abs/2004.04581
      Repo: https://github.com/YudeWang/SEAM
      1. Needs image-level annotation (class)
      2. 3 loss functions combined (ECR, ER and cls)
        a. Both CAM outputs should be similar despite the affine transform
        b. But the CAM degenerates (converges to a trivial solution), so ECR regularizes the PCM outputs with the original CAM
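Point 2a can be illustrated with a toy consistency check: apply the transform to the CAM of the image, compute the CAM of the transformed image, and penalize the difference. This is only a sketch of the principle, not the SEAM implementation. A horizontal flip stands in for the affine transform, the L1 mean stands in for the loss, and `equivariant_cam` and `biased_cam` are hypothetical stand-ins for a network.

```python
def hflip(grid):
    """Horizontal flip, used here as a simple stand-in for an affine transform."""
    return [list(reversed(row)) for row in grid]

def l1_mean(a, b):
    """Mean absolute difference between two same-shaped 2D grids."""
    n = sum(len(row) for row in a)
    return sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb)) / n

def equivariance_loss(cam_fn, image, transform):
    """Compare transform(CAM(image)) with CAM(transform(image))."""
    return l1_mean(transform(cam_fn(image)), cam_fn(transform(image)))

def equivariant_cam(image):
    """Hypothetical CAM that acts per pixel, so it commutes with the flip."""
    return [[2.0 * v for v in row] for row in image]

def biased_cam(image):
    """Hypothetical CAM that depends on column position, so it does not."""
    return [[v * (j + 1) for j, v in enumerate(row)] for row in image]

image = [[1.0, 2.0], [3.0, 4.0]]
loss_equivariant = equivariance_loss(equivariant_cam, image, hflip)
loss_biased = equivariance_loss(biased_cam, image, hflip)
```

The loss is zero for the map that commutes with the transform and positive for the one that does not, which is the signal used to push the two CAM outputs toward each other.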
  • 13. Class Activation Map: Equivariant Attention Mechanism
      Main contribution: PCM module
      - uses attention to refine the mask from Grad-CAM
      - attention can capture contextual information
  • 14. Class Activation Map: Experiments
      Pascal VOC 2012 val with image-level annotation only (no mask), using ResNet38 as backbone.
      Fully supervised SOTA: 90.0% (the backbone is huge in this case, though)
      Comparison on the Pascal VOC 2012 dataset with other weakly supervised methods
  • 15. Image Depth Information: HN labels
      - HN labels generation
        - Computing angles and height relative to the floor plane using RGB-D
        - Everything is binned to create labels
      - Training the segmentation model on HN labels
      - Fine-tuning on the real dataset
      The point is more to show: pre-training on HN labels >> pre-training on ImageNet
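The binning step can be sketched as follows, assuming the height above the floor and the angle between the surface normal and the floor normal have already been computed per pixel from the RGB-D data. The number of bins, the height range, and the way the two bin indices are combined into one label are illustrative assumptions, not values from the paper.

```python
def bin_index(value, low, high, n_bins):
    """Map a value in [low, high] to one of n_bins equal-width bins."""
    clipped = min(max(value, low), high)
    idx = int((clipped - low) / (high - low) * n_bins)
    return min(idx, n_bins - 1)  # value == high falls into the last bin

def hn_label(height_m, angle_deg, n_height_bins=5, n_angle_bins=4,
             max_height_m=2.5):
    """Combine a height bin and a normal-angle bin into one class label.

    All default parameters are hypothetical choices for this example.
    """
    h = bin_index(height_m, 0.0, max_height_m, n_height_bins)
    a = bin_index(angle_deg, 0.0, 180.0, n_angle_bins)
    return h * n_angle_bins + a

# (height above floor in metres, angle to the floor normal in degrees)
samples = [(0.0, 0.0),    # floor-like pixel
           (1.2, 90.0),   # wall-like pixel at mid height
           (2.5, 180.0)]  # ceiling-like pixel
hn_labels = [hn_label(h, a) for h, a in samples]
```

With these settings there are 5 x 4 = 20 possible labels, and they come for free from the depth channel, which is what makes them usable as a pre-training target.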
  • 16. Image Depth Information: Experiments
      Dataset: NYUv2, trained for 50 epochs only. HN labels are from NYUv2. ImageNet is 25x bigger
  • 17. How about using videos: Self-supervised Video Object Segmentation
      Original paper: https://arxiv.org/abs/2006.12480v1
      - Learns, without supervision, to track similar pixels across frames
      - Then, given an initial mask (in this illustration, on frame 0), the model can infer the same object in the following frames
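The inference step described above can be sketched as label propagation through feature similarity: each pixel of the new frame receives a soft vote from the pixels of the reference frame, weighted by how similar their features are. This is a generic illustration under stated assumptions, not the model of the paper: the 2D features are hand-made, and the dot-product similarity, the temperature of 0.1, and the 0.5 decision threshold are choices made for the example.

```python
import math

def propagate_mask(ref_feats, ref_mask, tgt_feats, temperature=0.1):
    """Propagate a binary mask from a reference frame to a target frame.

    ref_feats / tgt_feats: one feature vector per pixel.
    ref_mask: 0/1 label per reference pixel.
    """
    out = []
    for q in tgt_feats:
        # Similarity of this target pixel to every reference pixel.
        sims = [sum(a * b for a, b in zip(q, r)) / temperature
                for r in ref_feats]
        m = max(sims)
        weights = [math.exp(s - m) for s in sims]  # stable softmax
        total = sum(weights)
        foreground = sum(w * lab for w, lab in zip(weights, ref_mask)) / total
        out.append(1 if foreground > 0.5 else 0)
    return out

# Frame 0: two object pixels and one background pixel (made-up features).
ref_feats = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
ref_mask = [1, 1, 0]
# Frame 1: the object now sits at the second position.
tgt_feats = [[0.1, 0.9], [1.0, 0.0]]
new_mask = propagate_mask(ref_feats, ref_mask, tgt_feats)
```

No segmentation labels are needed to learn the features; the only annotation is the initial mask given at inference time.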
  • 18. How about using videos: Self-supervised Video Object Segmentation
      Original paper: https://arxiv.org/abs/2006.12480v1
      - Learns, without supervision, to track similar pixels across frames
      - Then, given an initial mask (in this illustration, on frame 0), the model can infer the same object in the following frames
      Results on YouTube-VOS val. Metrics: J (Mean), J (Recall), F (Mean), F (Recall). The models in the bottom part of the table are trained fully supervised.