How Useful is Self-Supervised Pretraining
for Visual Tasks
Hwang seung hyun
Yonsei University Severance Hospital CCIDS
Princeton University | CVPR 2020
2020.07.05
Contents
01 Introduction
02 Related Work
03 Methods and Experiments
04 Conclusion
Self Supervised Pretraining
Introduction – Proposal
• There has been a lot of progress in self-supervised pretraining for vision. This paper
offers insights into when and how to use self-supervised pretraining.
• Self-supervised models now produce features that are comparable to or
outperform ImageNet-pretrained features.
• Networks pretrained on ImageNet data are relatively sensitive to domain shift.
Self-supervised methods have an advantage here.
• The paper investigates how useful self-supervision is when a sufficient amount of labeled
data is available.
• A large amount of data does not guarantee good performance, since fitting a neural network
to large datasets is difficult. Self-supervised pretraining may produce better representations
that help optimization.
• Of the hypothesized outcomes illustrated on this slide (figure not reproduced),
experiments found (c) to be the most common.
Introduction – Contributions
• Found that leading self-supervised pretraining methods are useful with a
small labeling budget, but utility tends to decrease with ample labels.
• Found that self-supervision is more helpful when applied to larger models
and to more difficult versions of the data.
• Relative performance of methods is not consistent across downstream
settings.
Related Work
Pretraining Methods
1. Variational AutoEncoder (VAE)
– Standard baseline for mapping images to a low-dimensional latent space.
2. Rotation
– Network is tasked with predicting which rotation (0, 90, 180, or 270 degrees) has been applied to an image.
3. Contrastive Multiview Coding (CMC)
– Splits an image into multiple channels, such as the L and ab channels of an
image in Lab color space, and trains the embeddings of channels from the same image to agree.
4. Augmented Multiscale Deep InfoMax (AMDIM)
– Instead of comparing across image channels, AMDIM compares
representations from two augmented versions of the same image.
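The contrastive idea shared by CMC and AMDIM (two views of the same image should map to nearby embeddings, views of different images should not) can be sketched with a generic InfoNCE-style loss. This is a minimal NumPy illustration, not the exact objective of either method; the function name `info_nce_loss`, the temperature value, and the random embeddings are assumptions made for the example.

```python
import numpy as np

def info_nce_loss(view_a, view_b, temperature=0.1):
    """InfoNCE-style loss: row i of view_a should match row i of view_b.

    Hypothetical helper for illustration; temperature=0.1 is an assumed value.
    """
    # L2-normalise so the dot product is a cosine similarity.
    a = view_a / np.linalg.norm(view_a, axis=1, keepdims=True)
    b = view_b / np.linalg.norm(view_b, axis=1, keepdims=True)
    logits = a @ b.T / temperature  # (N, N) similarity matrix
    # Numerically stable log-softmax over each row.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Positives sit on the diagonal (same image, different view).
    return float(-np.mean(np.diag(log_probs)))

# Random stand-ins for encoder outputs of two views of 8 images.
rng = np.random.default_rng(0)
emb = rng.normal(size=(8, 16))
aligned = info_nce_loss(emb, emb + 0.01 * rng.normal(size=emb.shape))
shuffled = info_nce_loss(emb, emb[::-1].copy())  # views paired with the wrong image
print(aligned < shuffled)
```

The loss is low when matching views agree and high when they are paired with the wrong image, which is the pressure that pushes the encoder to ignore the differences between views.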
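The rotation pretext task (method 2 above) needs no human labels because the label is the rotation that was applied. A minimal sketch of building such a batch with NumPy follows; it assumes square H x W x C images and the common four-way setup (multiples of 90 degrees), and `make_rotation_batch` is a hypothetical helper name.

```python
import numpy as np

def make_rotation_batch(images, rng):
    """Rotate each square HxWxC image by a random multiple of 90 degrees.

    Returns the rotated images and integer labels k in {0, 1, 2, 3},
    where k counts the 90-degree turns applied.
    """
    labels = rng.integers(0, 4, size=len(images))
    rotated = np.stack([
        np.rot90(img, k=int(k), axes=(0, 1))  # rotate in the image plane only
        for img, k in zip(images, labels)
    ])
    return rotated, labels

rng = np.random.default_rng(0)
images = rng.random((5, 64, 64, 3))  # random stand-ins for rendered images
rotated, labels = make_rotation_batch(images, rng)
print(rotated.shape, labels)
```

A classifier trained to predict `labels` from `rotated` is the pretext network; its backbone is then reused for the downstream task.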
Methods and Experiments
Experimental Settings – Data
• To control the dataset, the authors synthesize images, which gives an effectively endless supply of data.
• Images are rendered with Blender using object models from ShapeNet.
• Generated images consist of objects floating in empty space.
• The number of objects, along with orientation, texture, lighting, and position, can be varied.
[Figure: example renders varying texture, color, viewpoint, and lighting]
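The point of synthetic data is that each factor of variation can be switched on or off independently while images are generated on demand. The sketch below shows that idea as a stream of scene descriptions; it does not call Blender, and every name and range in it (`SceneConfig`, `sample_scene`, the value ranges) is a hypothetical choice for illustration rather than the authors' pipeline.

```python
import random
from dataclasses import dataclass

@dataclass
class SceneConfig:
    """Hypothetical description of one scene to hand to a renderer."""
    num_objects: int
    yaw_degrees: list
    textured: bool
    light_intensity: float
    positions: list

def sample_scene(rng, max_objects=3, vary_texture=True, vary_lighting=True):
    """Draw one scene; disabled factors stay at a fixed default."""
    n = rng.randint(1, max_objects)
    return SceneConfig(
        num_objects=n,
        yaw_degrees=[rng.uniform(0.0, 360.0) for _ in range(n)],
        textured=vary_texture and rng.random() < 0.5,
        light_intensity=rng.uniform(0.5, 1.5) if vary_lighting else 1.0,
        positions=[
            (rng.uniform(-1, 1), rng.uniform(-1, 1), rng.uniform(-1, 1))
            for _ in range(n)
        ],
    )

def scene_stream(seed=0, **kwargs):
    """Endless, reproducible generator of scene configurations."""
    rng = random.Random(seed)
    while True:
        yield sample_scene(rng, **kwargs)

stream = scene_stream(seed=0, max_objects=3)
first_scenes = [next(stream) for _ in range(5)]
print(first_scenes[0])
```

Turning a flag such as `vary_lighting` off yields a simpler version of the dataset, which is how data complexity can be varied in a controlled way.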
Experimental Settings – Downstream tasks
1. Object Classification
- Distinguish between ten ShapeNet classes.
2. Object Pose Estimation
- Pose is discretized into five bins (upward, forward, backward, left, right)
3. Semantic Segmentation
- Images rendered with multiple objects
4. Depth Estimation
- Multiple images with coarser resolution
Experimental Settings
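• Image resolutions are 64x64 and 128x128.
• A total of 480,000 images were rendered.
• ResNet9 and ResNet50 are used for all experiments.
• Self-supervised algorithms are pretrained for 100 to 200 epochs.
• For finetuning, a pretrained model is loaded and trained for 75 to 200 additional
epochs.
• Evaluation: Accuracy & Utility
* Utility: U(x) = x'/x - 1, where x is the number of labeled samples used with pretraining and x' is the number of labeled samples a model trained from scratch needs to reach the same accuracy.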
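The utility metric can be computed from a baseline accuracy curve by finding how many labels the from-scratch model needs to match the pretrained model. The sketch below assumes the baseline curve is strictly increasing and interpolates in log-label space, which is an assumption of this example and not necessarily how the paper estimates x'; the accuracy numbers are made up.

```python
import numpy as np

def utility(n_labels, pretrained_acc, baseline_labels, baseline_acc):
    """U = n_hat / n - 1, where n_hat is the number of labels the baseline
    needs to reach pretrained_acc (found by interpolation)."""
    baseline_labels = np.asarray(baseline_labels, dtype=float)
    baseline_acc = np.asarray(baseline_acc, dtype=float)
    if np.any(np.diff(baseline_acc) <= 0):
        raise ValueError("baseline accuracy must increase with labels")
    if not baseline_acc[0] <= pretrained_acc <= baseline_acc[-1]:
        raise ValueError("pretrained accuracy is outside the baseline curve")
    # Interpolate log(labels) as a function of accuracy (assumed scheme).
    log_n_hat = np.interp(pretrained_acc, baseline_acc, np.log(baseline_labels))
    return float(np.exp(log_n_hat) / n_labels - 1.0)

# Made-up baseline curve: accuracy of a model trained from scratch.
labels = [1000, 2000, 4000, 8000]
accs = [0.50, 0.60, 0.70, 0.80]

# Pretrained model reaches 0.70 with 1000 labels; the baseline needs about 4000.
u_small = utility(1000, 0.70, labels, accs)
# With 8000 labels both reach 0.80, so pretraining saves nothing.
u_large = utility(8000, 0.80, labels, accs)
print(round(u_small, 6), round(u_large, 6))
```

A utility of about 3 means pretraining is worth roughly three times as many additional labels as were used; a utility near 0 means it saves no labels.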
Results – Utility vs. Number of Labeled Samples
• Self-supervision has significant utility when the number of labeled samples is
small, but utility approaches zero as labeled data grows.
• This suggests that self-supervised pretraining acts as regularization that reduces
overfitting, rather than as better optimization that reduces underfitting.
Results – Utility vs. Downstream Task
• CMC performs best in object classification
• Rotation and AMDIM perform better on segmentation and depth estimation,
respectively
Results – Utility vs. Data Complexity
• For CMC, utility goes up with texture changes and down with viewpoint changes.
• The opposite holds for AMDIM.
• For VAE, utility drops as data complexity increases, since the latent space must encode
all information necessary to reproduce the image.
• Contrastive approaches teach a network to map to the same embedding after applying
different image transformations, which helps the network ignore changes in pixel space.
Results – Utility vs. Model Size
• Self-supervised pretraining gives better downstream performance when applied to a
larger backbone network.
Conclusion
• Investigated a number of factors that affect the utility of
self-supervised pretraining.
• Greatest benefits of pretraining are currently in low data
regimes.
• Performance of a self-supervised algorithm in one setting
may not necessarily reflect its performance in others.
15

More Related Content

What's hot

J. Park, AAAI 2022, MLILAB, KAIST AI
J. Park, AAAI 2022, MLILAB, KAIST AIJ. Park, AAAI 2022, MLILAB, KAIST AI
J. Park, AAAI 2022, MLILAB, KAIST AIMLILAB
 
A naturalistic open source movie for optical flow evaluation
A naturalistic open source movie for optical flow evaluationA naturalistic open source movie for optical flow evaluation
A naturalistic open source movie for optical flow evaluationAbdulrahman Kerim
 
Sim-to-Real Transfer in Deep Reinforcement Learning
Sim-to-Real Transfer in Deep Reinforcement LearningSim-to-Real Transfer in Deep Reinforcement Learning
Sim-to-Real Transfer in Deep Reinforcement Learningatulshah16
 
Educating Researchers Using the CHiMaD Benchmark Problems
Educating Researchers Using the CHiMaD Benchmark ProblemsEducating Researchers Using the CHiMaD Benchmark Problems
Educating Researchers Using the CHiMaD Benchmark ProblemsPFHub PFHub
 
Content Based Image Retrieval
Content Based Image RetrievalContent Based Image Retrieval
Content Based Image RetrievalJane Dane
 
T. Yoon, et. al., ICLR 2021, MLILAB, KAIST AI
T. Yoon, et. al., ICLR 2021, MLILAB, KAIST AIT. Yoon, et. al., ICLR 2021, MLILAB, KAIST AI
T. Yoon, et. al., ICLR 2021, MLILAB, KAIST AIMLILAB
 
IEEE 2014 MATLAB IMAGE PROCESSING PROJECTS Multi illuminant estimation with c...
IEEE 2014 MATLAB IMAGE PROCESSING PROJECTS Multi illuminant estimation with c...IEEE 2014 MATLAB IMAGE PROCESSING PROJECTS Multi illuminant estimation with c...
IEEE 2014 MATLAB IMAGE PROCESSING PROJECTS Multi illuminant estimation with c...IEEEBEBTECHSTUDENTPROJECTS
 
Review : Multi-Domain Image Completion for Random Missing Input Data [cdm]
Review : Multi-Domain Image Completion for Random Missing Input Data [cdm]Review : Multi-Domain Image Completion for Random Missing Input Data [cdm]
Review : Multi-Domain Image Completion for Random Missing Input Data [cdm]Dongmin Choi
 
Image Processing IEEE 2015 Projects
Image Processing IEEE 2015 ProjectsImage Processing IEEE 2015 Projects
Image Processing IEEE 2015 ProjectsVijay Karan
 
IBMA: An SPM toolbox for Neuroimaging Image-Based Meta-Analysis
IBMA: An SPM toolbox for Neuroimaging Image-Based Meta-AnalysisIBMA: An SPM toolbox for Neuroimaging Image-Based Meta-Analysis
IBMA: An SPM toolbox for Neuroimaging Image-Based Meta-AnalysisCamille Maumet
 
Developing Document Image Retrieval System
Developing Document Image Retrieval SystemDeveloping Document Image Retrieval System
Developing Document Image Retrieval SystemKonstantinos Zagoris
 
Result biased distributed-arithmetic-based
Result biased distributed-arithmetic-basedResult biased distributed-arithmetic-based
Result biased distributed-arithmetic-basedLogicMindtech Nologies
 
Neural Network Presentation
Neural Network PresentationNeural Network Presentation
Neural Network PresentationOmoye
 
Boosting Approach to Solving Machine Learning Problems
Boosting Approach to Solving Machine Learning ProblemsBoosting Approach to Solving Machine Learning Problems
Boosting Approach to Solving Machine Learning ProblemsDr Sulaimon Afolabi
 
Face alignment by deep convolutional network with adaptive learning rate
Face alignment by deep convolutional network with adaptive learning rateFace alignment by deep convolutional network with adaptive learning rate
Face alignment by deep convolutional network with adaptive learning rateZhiwen Shao
 
Hyperscope UX analysis
Hyperscope UX analysisHyperscope UX analysis
Hyperscope UX analysisRick Boardman
 
CHiMaD BM#7 Update (Steve DeWitt)
CHiMaD BM#7 Update (Steve DeWitt)CHiMaD BM#7 Update (Steve DeWitt)
CHiMaD BM#7 Update (Steve DeWitt)Stephen DeWitt
 
Implementation of area optimized low power multiplication and accumulation
Implementation of area optimized low power multiplication and accumulationImplementation of area optimized low power multiplication and accumulation
Implementation of area optimized low power multiplication and accumulationkarthik annam
 
IEEE 2014 MATLAB IMAGE PROCESSING PROJECTS Scale adaptive dictionary learning
IEEE 2014 MATLAB IMAGE PROCESSING PROJECTS Scale adaptive dictionary learningIEEE 2014 MATLAB IMAGE PROCESSING PROJECTS Scale adaptive dictionary learning
IEEE 2014 MATLAB IMAGE PROCESSING PROJECTS Scale adaptive dictionary learningIEEEBEBTECHSTUDENTPROJECTS
 

What's hot (20)

J. Park, AAAI 2022, MLILAB, KAIST AI
J. Park, AAAI 2022, MLILAB, KAIST AIJ. Park, AAAI 2022, MLILAB, KAIST AI
J. Park, AAAI 2022, MLILAB, KAIST AI
 
Inverse problems in medical imaging
Inverse problems in medical imagingInverse problems in medical imaging
Inverse problems in medical imaging
 
A naturalistic open source movie for optical flow evaluation
A naturalistic open source movie for optical flow evaluationA naturalistic open source movie for optical flow evaluation
A naturalistic open source movie for optical flow evaluation
 
Sim-to-Real Transfer in Deep Reinforcement Learning
Sim-to-Real Transfer in Deep Reinforcement LearningSim-to-Real Transfer in Deep Reinforcement Learning
Sim-to-Real Transfer in Deep Reinforcement Learning
 
Educating Researchers Using the CHiMaD Benchmark Problems
Educating Researchers Using the CHiMaD Benchmark ProblemsEducating Researchers Using the CHiMaD Benchmark Problems
Educating Researchers Using the CHiMaD Benchmark Problems
 
Content Based Image Retrieval
Content Based Image RetrievalContent Based Image Retrieval
Content Based Image Retrieval
 
T. Yoon, et. al., ICLR 2021, MLILAB, KAIST AI
T. Yoon, et. al., ICLR 2021, MLILAB, KAIST AIT. Yoon, et. al., ICLR 2021, MLILAB, KAIST AI
T. Yoon, et. al., ICLR 2021, MLILAB, KAIST AI
 
IEEE 2014 MATLAB IMAGE PROCESSING PROJECTS Multi illuminant estimation with c...
IEEE 2014 MATLAB IMAGE PROCESSING PROJECTS Multi illuminant estimation with c...IEEE 2014 MATLAB IMAGE PROCESSING PROJECTS Multi illuminant estimation with c...
IEEE 2014 MATLAB IMAGE PROCESSING PROJECTS Multi illuminant estimation with c...
 
Review : Multi-Domain Image Completion for Random Missing Input Data [cdm]
Review : Multi-Domain Image Completion for Random Missing Input Data [cdm]Review : Multi-Domain Image Completion for Random Missing Input Data [cdm]
Review : Multi-Domain Image Completion for Random Missing Input Data [cdm]
 
Image Processing IEEE 2015 Projects
Image Processing IEEE 2015 ProjectsImage Processing IEEE 2015 Projects
Image Processing IEEE 2015 Projects
 
IBMA: An SPM toolbox for Neuroimaging Image-Based Meta-Analysis
IBMA: An SPM toolbox for Neuroimaging Image-Based Meta-AnalysisIBMA: An SPM toolbox for Neuroimaging Image-Based Meta-Analysis
IBMA: An SPM toolbox for Neuroimaging Image-Based Meta-Analysis
 
Developing Document Image Retrieval System
Developing Document Image Retrieval SystemDeveloping Document Image Retrieval System
Developing Document Image Retrieval System
 
Result biased distributed-arithmetic-based
Result biased distributed-arithmetic-basedResult biased distributed-arithmetic-based
Result biased distributed-arithmetic-based
 
Neural Network Presentation
Neural Network PresentationNeural Network Presentation
Neural Network Presentation
 
Boosting Approach to Solving Machine Learning Problems
Boosting Approach to Solving Machine Learning ProblemsBoosting Approach to Solving Machine Learning Problems
Boosting Approach to Solving Machine Learning Problems
 
Face alignment by deep convolutional network with adaptive learning rate
Face alignment by deep convolutional network with adaptive learning rateFace alignment by deep convolutional network with adaptive learning rate
Face alignment by deep convolutional network with adaptive learning rate
 
Hyperscope UX analysis
Hyperscope UX analysisHyperscope UX analysis
Hyperscope UX analysis
 
CHiMaD BM#7 Update (Steve DeWitt)
CHiMaD BM#7 Update (Steve DeWitt)CHiMaD BM#7 Update (Steve DeWitt)
CHiMaD BM#7 Update (Steve DeWitt)
 
Implementation of area optimized low power multiplication and accumulation
Implementation of area optimized low power multiplication and accumulationImplementation of area optimized low power multiplication and accumulation
Implementation of area optimized low power multiplication and accumulation
 
IEEE 2014 MATLAB IMAGE PROCESSING PROJECTS Scale adaptive dictionary learning
IEEE 2014 MATLAB IMAGE PROCESSING PROJECTS Scale adaptive dictionary learningIEEE 2014 MATLAB IMAGE PROCESSING PROJECTS Scale adaptive dictionary learning
IEEE 2014 MATLAB IMAGE PROCESSING PROJECTS Scale adaptive dictionary learning
 

Similar to How useful is self-supervised pretraining for Visual tasks?

ResNeSt: Split-Attention Networks
ResNeSt: Split-Attention NetworksResNeSt: Split-Attention Networks
ResNeSt: Split-Attention NetworksSeunghyun Hwang
 
PR-330: How To Train Your ViT? Data, Augmentation, and Regularization in Visi...
PR-330: How To Train Your ViT? Data, Augmentation, and Regularization in Visi...PR-330: How To Train Your ViT? Data, Augmentation, and Regularization in Visi...
PR-330: How To Train Your ViT? Data, Augmentation, and Regularization in Visi...Jinwon Lee
 
Large Scale GAN Training for High Fidelity Natural Image Synthesis
Large Scale GAN Training for High Fidelity Natural Image SynthesisLarge Scale GAN Training for High Fidelity Natural Image Synthesis
Large Scale GAN Training for High Fidelity Natural Image SynthesisSeunghyun Hwang
 
Learning Sparse Networks using Targeted Dropout
Learning Sparse Networks using Targeted DropoutLearning Sparse Networks using Targeted Dropout
Learning Sparse Networks using Targeted DropoutSeunghyun Hwang
 
How well do self-supervised models transfer.pptx
How well do self-supervised models transfer.pptxHow well do self-supervised models transfer.pptx
How well do self-supervised models transfer.pptxssuserbafbd0
 
TIP_TAViT_presentation.pdf
TIP_TAViT_presentation.pdfTIP_TAViT_presentation.pdf
TIP_TAViT_presentation.pdfBoahKim2
 
A Probabilistic U-Net for Segmentation of Ambiguous Images
A Probabilistic U-Net for Segmentation of Ambiguous ImagesA Probabilistic U-Net for Segmentation of Ambiguous Images
A Probabilistic U-Net for Segmentation of Ambiguous ImagesSeunghyun Hwang
 
Presentation of master thesis
Presentation of master thesisPresentation of master thesis
Presentation of master thesisSeoung-Ho Choi
 
IRJET- Analysis of Vehicle Number Plate Recognition
IRJET- Analysis of Vehicle Number Plate RecognitionIRJET- Analysis of Vehicle Number Plate Recognition
IRJET- Analysis of Vehicle Number Plate RecognitionIRJET Journal
 
Self-training with Noisy Student improves ImageNet classification
Self-training with Noisy Student improves ImageNet classificationSelf-training with Noisy Student improves ImageNet classification
Self-training with Noisy Student improves ImageNet classificationChaehyeon Lee
 
Deep Generative model-based quality control for cardiac MRI segmentation
Deep Generative model-based quality control for cardiac MRI segmentation Deep Generative model-based quality control for cardiac MRI segmentation
Deep Generative model-based quality control for cardiac MRI segmentation Seunghyun Hwang
 
IRJET- Review of Tencent ML-Images Large-Scale Multi-Label Image Database
IRJET-  	  Review of Tencent ML-Images Large-Scale Multi-Label Image DatabaseIRJET-  	  Review of Tencent ML-Images Large-Scale Multi-Label Image Database
IRJET- Review of Tencent ML-Images Large-Scale Multi-Label Image DatabaseIRJET Journal
 
DeepStrip: High Resolution Boundary Refinement
DeepStrip: High Resolution Boundary RefinementDeepStrip: High Resolution Boundary Refinement
DeepStrip: High Resolution Boundary RefinementSeunghyun Hwang
 
A Survey on Different Relevance Feedback Techniques in Content Based Image Re...
A Survey on Different Relevance Feedback Techniques in Content Based Image Re...A Survey on Different Relevance Feedback Techniques in Content Based Image Re...
A Survey on Different Relevance Feedback Techniques in Content Based Image Re...IRJET Journal
 
A Simple Framework for Contrastive Learning of Visual Representations
A Simple Framework for Contrastive Learning of Visual RepresentationsA Simple Framework for Contrastive Learning of Visual Representations
A Simple Framework for Contrastive Learning of Visual RepresentationsSeunghyun Hwang
 
A DEEP LEARNING APPROACH FOR SEMANTIC SEGMENTATION IN BRAIN TUMOR IMAGES
A DEEP LEARNING APPROACH FOR SEMANTIC SEGMENTATION IN BRAIN TUMOR IMAGESA DEEP LEARNING APPROACH FOR SEMANTIC SEGMENTATION IN BRAIN TUMOR IMAGES
A DEEP LEARNING APPROACH FOR SEMANTIC SEGMENTATION IN BRAIN TUMOR IMAGESPNandaSai
 
Recuriter Recommendation System
Recuriter Recommendation SystemRecuriter Recommendation System
Recuriter Recommendation SystemIRJET Journal
 
Pricing like a data scientist
Pricing like a data scientistPricing like a data scientist
Pricing like a data scientistMatthew Evans
 
do adversarially robust image net models transfer better
do adversarially robust image net models transfer betterdo adversarially robust image net models transfer better
do adversarially robust image net models transfer betterLEE HOSEONG
 
Paper Explained: RandAugment: Practical automated data augmentation with a re...
Paper Explained: RandAugment: Practical automated data augmentation with a re...Paper Explained: RandAugment: Practical automated data augmentation with a re...
Paper Explained: RandAugment: Practical automated data augmentation with a re...Devansh16
 

Similar to How useful is self-supervised pretraining for Visual tasks? (20)

ResNeSt: Split-Attention Networks
ResNeSt: Split-Attention NetworksResNeSt: Split-Attention Networks
ResNeSt: Split-Attention Networks
 
PR-330: How To Train Your ViT? Data, Augmentation, and Regularization in Visi...
PR-330: How To Train Your ViT? Data, Augmentation, and Regularization in Visi...PR-330: How To Train Your ViT? Data, Augmentation, and Regularization in Visi...
PR-330: How To Train Your ViT? Data, Augmentation, and Regularization in Visi...
 
Large Scale GAN Training for High Fidelity Natural Image Synthesis
Large Scale GAN Training for High Fidelity Natural Image SynthesisLarge Scale GAN Training for High Fidelity Natural Image Synthesis
Large Scale GAN Training for High Fidelity Natural Image Synthesis
 
Learning Sparse Networks using Targeted Dropout
Learning Sparse Networks using Targeted DropoutLearning Sparse Networks using Targeted Dropout
Learning Sparse Networks using Targeted Dropout
 
How well do self-supervised models transfer.pptx
How well do self-supervised models transfer.pptxHow well do self-supervised models transfer.pptx
How well do self-supervised models transfer.pptx
 
TIP_TAViT_presentation.pdf
TIP_TAViT_presentation.pdfTIP_TAViT_presentation.pdf
TIP_TAViT_presentation.pdf
 
A Probabilistic U-Net for Segmentation of Ambiguous Images
A Probabilistic U-Net for Segmentation of Ambiguous ImagesA Probabilistic U-Net for Segmentation of Ambiguous Images
A Probabilistic U-Net for Segmentation of Ambiguous Images
 
Presentation of master thesis
Presentation of master thesisPresentation of master thesis
Presentation of master thesis
 
IRJET- Analysis of Vehicle Number Plate Recognition
IRJET- Analysis of Vehicle Number Plate RecognitionIRJET- Analysis of Vehicle Number Plate Recognition
IRJET- Analysis of Vehicle Number Plate Recognition
 
Self-training with Noisy Student improves ImageNet classification
Self-training with Noisy Student improves ImageNet classificationSelf-training with Noisy Student improves ImageNet classification
Self-training with Noisy Student improves ImageNet classification
 
Deep Generative model-based quality control for cardiac MRI segmentation
Deep Generative model-based quality control for cardiac MRI segmentation Deep Generative model-based quality control for cardiac MRI segmentation
Deep Generative model-based quality control for cardiac MRI segmentation
 
IRJET- Review of Tencent ML-Images Large-Scale Multi-Label Image Database
IRJET-  	  Review of Tencent ML-Images Large-Scale Multi-Label Image DatabaseIRJET-  	  Review of Tencent ML-Images Large-Scale Multi-Label Image Database
IRJET- Review of Tencent ML-Images Large-Scale Multi-Label Image Database
 
DeepStrip: High Resolution Boundary Refinement
DeepStrip: High Resolution Boundary RefinementDeepStrip: High Resolution Boundary Refinement
DeepStrip: High Resolution Boundary Refinement
 
A Survey on Different Relevance Feedback Techniques in Content Based Image Re...
A Survey on Different Relevance Feedback Techniques in Content Based Image Re...A Survey on Different Relevance Feedback Techniques in Content Based Image Re...
A Survey on Different Relevance Feedback Techniques in Content Based Image Re...
 
A Simple Framework for Contrastive Learning of Visual Representations
A Simple Framework for Contrastive Learning of Visual RepresentationsA Simple Framework for Contrastive Learning of Visual Representations
A Simple Framework for Contrastive Learning of Visual Representations
 
A DEEP LEARNING APPROACH FOR SEMANTIC SEGMENTATION IN BRAIN TUMOR IMAGES
A DEEP LEARNING APPROACH FOR SEMANTIC SEGMENTATION IN BRAIN TUMOR IMAGESA DEEP LEARNING APPROACH FOR SEMANTIC SEGMENTATION IN BRAIN TUMOR IMAGES
A DEEP LEARNING APPROACH FOR SEMANTIC SEGMENTATION IN BRAIN TUMOR IMAGES
 
Recuriter Recommendation System
Recuriter Recommendation SystemRecuriter Recommendation System
Recuriter Recommendation System
 
Pricing like a data scientist
Pricing like a data scientistPricing like a data scientist
Pricing like a data scientist
 
do adversarially robust image net models transfer better
do adversarially robust image net models transfer betterdo adversarially robust image net models transfer better
do adversarially robust image net models transfer better
 
Paper Explained: RandAugment: Practical automated data augmentation with a re...
Paper Explained: RandAugment: Practical automated data augmentation with a re...Paper Explained: RandAugment: Practical automated data augmentation with a re...
Paper Explained: RandAugment: Practical automated data augmentation with a re...
 

More from Seunghyun Hwang

An annotation sparsification strategy for 3D medical image segmentation via r...
An annotation sparsification strategy for 3D medical image segmentation via r...An annotation sparsification strategy for 3D medical image segmentation via r...
An annotation sparsification strategy for 3D medical image segmentation via r...Seunghyun Hwang
 
Do wide and deep networks learn the same things? Uncovering how neural networ...
Do wide and deep networks learn the same things? Uncovering how neural networ...Do wide and deep networks learn the same things? Uncovering how neural networ...
Do wide and deep networks learn the same things? Uncovering how neural networ...Seunghyun Hwang
 
Deep Learning-based Fully Automated Detection and Quantification of Acute Inf...
Deep Learning-based Fully Automated Detection and Quantification of Acute Inf...Deep Learning-based Fully Automated Detection and Quantification of Acute Inf...
Deep Learning-based Fully Automated Detection and Quantification of Acute Inf...Seunghyun Hwang
 
Diagnosis of Maxillary Sinusitis in Water’s view based on Deep learning model
Diagnosis of Maxillary Sinusitis in Water’s view based on Deep learning model Diagnosis of Maxillary Sinusitis in Water’s view based on Deep learning model
Diagnosis of Maxillary Sinusitis in Water’s view based on Deep learning model Seunghyun Hwang
 
Energy-based Model for Out-of-Distribution Detection in Deep Medical Image Se...
Energy-based Model for Out-of-Distribution Detection in Deep Medical Image Se...Energy-based Model for Out-of-Distribution Detection in Deep Medical Image Se...
Energy-based Model for Out-of-Distribution Detection in Deep Medical Image Se...Seunghyun Hwang
 
End-to-End Object Detection with Transformers
End-to-End Object Detection with TransformersEnd-to-End Object Detection with Transformers
End-to-End Object Detection with TransformersSeunghyun Hwang
 
Segmenting Medical MRI via Recurrent Decoding Cell
Segmenting Medical MRI via Recurrent Decoding CellSegmenting Medical MRI via Recurrent Decoding Cell
Segmenting Medical MRI via Recurrent Decoding CellSeunghyun Hwang
 
Progressive learning and Disentanglement of hierarchical representations
Progressive learning and Disentanglement of hierarchical representationsProgressive learning and Disentanglement of hierarchical representations
Progressive learning and Disentanglement of hierarchical representationsSeunghyun Hwang
 
Your Classifier is Secretly an Energy based model and you should treat it lik...
Your Classifier is Secretly an Energy based model and you should treat it lik...Your Classifier is Secretly an Energy based model and you should treat it lik...
Your Classifier is Secretly an Energy based model and you should treat it lik...Seunghyun Hwang
 
FickleNet: Weakly and Semi-supervised Semantic Image Segmentation using Stoch...
FickleNet: Weakly and Semi-supervised Semantic Image Segmentation using Stoch...FickleNet: Weakly and Semi-supervised Semantic Image Segmentation using Stoch...
FickleNet: Weakly and Semi-supervised Semantic Image Segmentation using Stoch...Seunghyun Hwang
 
Mix Conv: Mixed Depthwise Convolutional Kernels
Mix Conv: Mixed Depthwise Convolutional KernelsMix Conv: Mixed Depthwise Convolutional Kernels
Mix Conv: Mixed Depthwise Convolutional KernelsSeunghyun Hwang
 

More from Seunghyun Hwang (11)

An annotation sparsification strategy for 3D medical image segmentation via r...
An annotation sparsification strategy for 3D medical image segmentation via r...An annotation sparsification strategy for 3D medical image segmentation via r...
An annotation sparsification strategy for 3D medical image segmentation via r...
 
Do wide and deep networks learn the same things? Uncovering how neural networ...
Do wide and deep networks learn the same things? Uncovering how neural networ...Do wide and deep networks learn the same things? Uncovering how neural networ...
Do wide and deep networks learn the same things? Uncovering how neural networ...
 
Deep Learning-based Fully Automated Detection and Quantification of Acute Inf...
Deep Learning-based Fully Automated Detection and Quantification of Acute Inf...Deep Learning-based Fully Automated Detection and Quantification of Acute Inf...
Deep Learning-based Fully Automated Detection and Quantification of Acute Inf...
 
Diagnosis of Maxillary Sinusitis in Water’s view based on Deep learning model
Diagnosis of Maxillary Sinusitis in Water’s view based on Deep learning model Diagnosis of Maxillary Sinusitis in Water’s view based on Deep learning model
Diagnosis of Maxillary Sinusitis in Water’s view based on Deep learning model
 
Energy-based Model for Out-of-Distribution Detection in Deep Medical Image Se...
Energy-based Model for Out-of-Distribution Detection in Deep Medical Image Se...Energy-based Model for Out-of-Distribution Detection in Deep Medical Image Se...
Energy-based Model for Out-of-Distribution Detection in Deep Medical Image Se...
 
End-to-End Object Detection with Transformers
End-to-End Object Detection with TransformersEnd-to-End Object Detection with Transformers
End-to-End Object Detection with Transformers
 
Segmenting Medical MRI via Recurrent Decoding Cell
Segmenting Medical MRI via Recurrent Decoding CellSegmenting Medical MRI via Recurrent Decoding Cell
Segmenting Medical MRI via Recurrent Decoding Cell
 
Progressive learning and Disentanglement of hierarchical representations
Progressive learning and Disentanglement of hierarchical representationsProgressive learning and Disentanglement of hierarchical representations
Progressive learning and Disentanglement of hierarchical representations
 
Your Classifier is Secretly an Energy based model and you should treat it lik...
Your Classifier is Secretly an Energy based model and you should treat it lik...Your Classifier is Secretly an Energy based model and you should treat it lik...
Your Classifier is Secretly an Energy based model and you should treat it lik...
 
FickleNet: Weakly and Semi-supervised Semantic Image Segmentation using Stoch...
FickleNet: Weakly and Semi-supervised Semantic Image Segmentation using Stoch...FickleNet: Weakly and Semi-supervised Semantic Image Segmentation using Stoch...
FickleNet: Weakly and Semi-supervised Semantic Image Segmentation using Stoch...
 
Mix Conv: Mixed Depthwise Convolutional Kernels
Mix Conv: Mixed Depthwise Convolutional KernelsMix Conv: Mixed Depthwise Convolutional Kernels
Mix Conv: Mixed Depthwise Convolutional Kernels
 


How useful is self-supervised pretraining for Visual tasks?

  • 1. How Useful is Self-Supervised Pretraining for Visual Tasks. Hwang seung hyun, Yonsei University Severance Hospital CCIDS. Princeton University | CVPR 2020. 2020.07.05
  • 2. Contents: 01 Introduction, 02 Related Work, 03 Methods and Experiments, 04 Conclusion. Yonsei University Severance Hospital CCIDS
  • 3. Self Supervised Pretraining: Introduction – Proposal
      • There has been a lot of progress in self-supervised pretraining for vision. This paper offers insights into when and how to use it.
      • Self-supervised models now produce features that are comparable to, or better than, ImageNet-pretrained features.
      • Networks pretrained on ImageNet are relatively weak under domain shift, which gives self-supervised methods an advantage.
      • The paper investigates how useful self-supervision is when a sufficient amount of labeled data is available.
      • A large amount of data does not guarantee good performance, since fitting a neural network to large data is difficult. Self-supervised pretraining may produce better representations that help optimization.
  • 4. Self Supervised Pretraining: Introduction – Proposal
      • Through experiments, (c) was found to be the most common outcome.
  • 5. Self Supervised Pretraining: Introduction – Contributions
      • Leading self-supervised pretraining methods are useful with a small labeling budget, but their utility tends to decrease with ample labels.
      • Self-supervision is more helpful when applied to larger models and to more difficult versions of the data.
      • The relative performance of methods is not consistent across downstream settings.
  • 6. Related Work: Pretraining Methods
      1. Variational AutoEncoder (VAE) – Standard baseline for mapping images to a low-dimensional latent space.
  • 7. Related Work: Pretraining Methods
      2. Rotation – The network is tasked with predicting which rotation (0, 90, 180, or 270 degrees) has been applied to an image.
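As a minimal sketch of the rotation pretext task on slide 7, the snippet below builds a four-way classification batch from one image by rotating it in 90-degree steps. It assumes the common four-rotation formulation (0, 90, 180, 270 degrees). The helper names `rotate90` and `make_rotation_batch` are illustrative and do not come from the slides or the paper's code.

```python
def rotate90(image):
    """Rotate a 2D image (a list of rows) 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def make_rotation_batch(image):
    """Return (rotated_image, label) pairs; label k means k * 90 degrees clockwise."""
    batch = []
    current = [list(row) for row in image]
    for label in range(4):
        batch.append((current, label))
        current = rotate90(current)
    return batch


# Toy 2x2 "image" standing in for real pixel data (an assumption for illustration).
image = [[1, 2],
         [3, 4]]
batch = make_rotation_batch(image)
for rotated, label in batch:
    print(label, rotated)
```

A network trained on such pairs sees only the rotated image and must output the label, so no human annotation is needed.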
  • 8. Related Work: Pretraining Methods
      3. Contrastive Multiview Coding (CMC) – Splits an image into multiple views by channel, such as the L and ab channels of the Lab color space.
  • 9. Related Work: Pretraining Methods
      4. Augmented Multiscale Deep InfoMax (AMDIM) – Instead of comparing across image channels, AMDIM compares representations from two augmented versions of the same image.
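The idea of comparing representations of two views of the same image (slides 8 and 9) can be sketched with a generic InfoNCE-style contrastive loss. This is not the exact CMC or AMDIM objective. It is a simplified illustration in which the function names, the temperature value, and the 2-D embeddings are all assumptions.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]


def info_nce(view_a, view_b, temperature=0.1):
    """Mean InfoNCE loss: row i of view_a should match row i of view_b."""
    a = [normalize(v) for v in view_a]
    b = [normalize(v) for v in view_b]
    total = 0.0
    for i, anchor in enumerate(a):
        logits = [dot(anchor, other) / temperature for other in b]
        max_logit = max(logits)  # subtract the max for numerical stability
        log_sum = max_logit + math.log(sum(math.exp(l - max_logit) for l in logits))
        total += log_sum - logits[i]  # negative log-softmax of the positive pair
    return total / len(a)


# Hypothetical 2-D embeddings of two augmented views of three images.
view_a = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
aligned = [[0.9, 0.1], [0.1, 0.9], [-0.9, 0.1]]
shuffled = [aligned[1], aligned[2], aligned[0]]

loss_aligned = info_nce(view_a, aligned)
loss_shuffled = info_nce(view_a, shuffled)
print(loss_aligned, loss_shuffled)
```

The loss is low when each embedding is closest to the other view of the same image and high when the pairing is scrambled, which is what pushes the encoder to ignore the augmentation.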
  • 10. Methods and Experiments: Experimental Settings – Data
      • To control the dataset, the authors synthesized images, giving an endless supply of data.
      • Images were rendered with Blender using object models from ShapeNet.
      • Generated images consist of objects floating in empty space.
      • The number of objects, orientation, texture, lighting, and position can all be changed.
      • Figure panels: Texture / Color / Viewpoint / Lighting
  • 11. Methods and Experiments: Experimental Settings – Downstream tasks
      1. Object Classification – Distinguish between ten ShapeNet classes.
      2. Object Pose Estimation – Pose is discretized into five bins (upward, forward, backward, left, right).
      3. Semantic Segmentation – Images rendered with multiple objects.
      4. Depth Estimation – Multiple images with coarser resolution.
  • 12. Methods and Experiments: Experimental Settings
      • Data resolutions are 64x64 and 128x128.
      • A total of 480,000 images were rendered.
      • ResNet9 and ResNet50 were used for all experiments.
      • Self-supervised algorithms were pretrained for 100 to 200 epochs.
      • For finetuning, a pretrained model is loaded and trained for 75 to 200 additional epochs.
      • Evaluation: Accuracy and Utility
      • Utility: U(x) = (x'/x) - 1, where x is the number of labels used with pretraining and x' is the number of labels a model trained from scratch needs to reach the same accuracy.
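A worked example may make the utility metric U(x) = (x'/x) - 1 from slide 12 more concrete. The sketch below estimates x' by interpolating a from-scratch accuracy curve in log label count. The interpolation scheme, the function names, and every number are assumptions for illustration, not results or code from the paper.

```python
import math


def labels_to_reach(target_acc, scratch_curve):
    """Estimate how many labels a from-scratch model needs to reach target_acc.

    scratch_curve is a list of (labels, accuracy) pairs sorted by labels,
    with strictly increasing accuracy. Interpolation is linear in log(labels).
    """
    for (n0, a0), (n1, a1) in zip(scratch_curve, scratch_curve[1:]):
        if a0 <= target_acc <= a1:
            frac = (target_acc - a0) / (a1 - a0)
            return math.exp(math.log(n0) + frac * (math.log(n1) - math.log(n0)))
    raise ValueError("target accuracy lies outside the measured curve")


def utility(x, pretrained_acc, scratch_curve):
    """U(x) = x'/x - 1, where x' labels from scratch match pretrained_acc."""
    x_prime = labels_to_reach(pretrained_acc, scratch_curve)
    return x_prime / x - 1


# Made-up numbers for illustration only, not results from the paper.
scratch_curve = [(1000, 0.50), (4000, 0.70), (16000, 0.80)]
u = utility(1000, 0.70, scratch_curve)
print(u)
```

In this toy setting a pretrained model with 1,000 labels matches a from-scratch model trained on about 4,000, so the utility is about 3: pretraining saved roughly three times the label budget. A utility near zero means pretraining saved no labels.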
  • 13. Methods and Experiments: Results – Utility vs. Number of Labeled Samples
      • Self-supervision has significant utility when the number of labeled samples is small, but utility approaches zero as labeled data grows.
      • Self-supervised pretraining gives regularization that reduces overfitting, not better optimization that reduces underfitting.
  • 14. Methods and Experiments: Results – Utility vs. Downstream Task
      • CMC performs best in object classification.
      • Rotation and AMDIM perform better on segmentation and depth estimation, respectively.
  • 15. Methods and Experiments: Results – Utility vs. Data Complexity
      • For CMC, utility goes up with texture changes and down with viewpoint changes.
      • The opposite holds for AMDIM.
      • For VAE, utility drops as data complexity increases, since the latent space must encode all information necessary to reproduce the image.
      • Contrastive approaches teach a network to map to the same embedding after different image transformations are applied, which helps the network ignore changes in pixel space.
  • 16. Methods and Experiments: Results – Utility vs. Model Size
      • For downstream performance, self-supervised pretraining helps more when applied to a large backbone network.
  • 17. Conclusion
      • Investigated a number of factors that affect the utility of self-supervised pretraining.
      • The greatest benefits of pretraining are currently in low-data regimes.
      • The performance of a self-supervised algorithm in one setting may not necessarily reflect its performance in others.