A Simple Framework for Contrastive Learning of
Visual Representations
Hwang seung hyun
Yonsei University Severance Hospital CCIDS
Google Research Team, Geoffrey Hinton | ICML 2020
2020.07.19
Contents
01 Introduction
02 Related Work
03 Methods and Experiments
04 Conclusion
SimCLR
Introduction – Proposal
• Most mainstream approaches for unsupervised visual representations fall into one
of two classes: Generative or Discriminative
[Figure panels: Predict rotation, Autoencoder, Jigsaw Puzzle]
• Discriminative approaches based on Contrastive Learning in the latent space have
recently shown state-of-the-art results.
[Figure: AMDIM]
• SimCLR outperforms previous work while being simpler.
• A linear classifier trained on SimCLR representations achieves 76.5% top-1
accuracy, a 7% relative improvement over the previous SOTA method.
• When fine-tuned with only 1% of the ImageNet labels, SimCLR achieves 85.8%
top-5 accuracy.
Introduction – Contributions
• Composition of multiple data augmentation operations is crucial in unsupervised
contrastive learning.
• A learnable nonlinear transformation between the representation and the
contrastive loss substantially improves the quality of the learned representations.
• Contrastive learning benefits from larger batch sizes and longer training.
• Like supervised learning, contrastive learning benefits from deeper and wider
networks.
• Representation learning with a contrastive cross entropy loss benefits from
normalized embeddings and an appropriately adjusted temperature parameter.
Related Work
Handcrafted pretext tasks
• Relative patch prediction
• Jigsaw puzzles
• Rotation Prediction
• Colorization Prediction
• ...
These ad-hoc heuristics limit the generality of the learned representations.
Contrastive Visual Representation learning
• CPC V2
• AMDIM
• Rotation Prediction
• MoCo (by Facebook)
• ...
SimCLR is a composition of components from these methods.
Methods and Experiments
Overall Architecture
https://www.youtube.com/watch?v=5lsmGWtxnKA
Architecture – Data Augmentation
Architecture – loss function
Architecture – loss function (final loss)
[Normalized temperature-scaled cross entropy loss (NT-Xent)]
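The loss named on this slide can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation. The function name `nt_xent_loss`, the default temperature of 0.5, and the convention that rows 2k and 2k+1 of the input form a positive pair are assumptions made for this example.

```python
import numpy as np

def nt_xent_loss(z, temperature=0.5):
    """NT-Xent over 2N embeddings; rows 2k and 2k+1 are assumed to be a positive pair."""
    # L2-normalise so the dot product below is cosine similarity.
    z = z / np.linalg.norm(z, axis=1, keepdims=True)
    sim = z @ z.T / temperature
    # A sample is never its own negative: mask the diagonal.
    np.fill_diagonal(sim, -np.inf)
    n = z.shape[0]
    partner = np.arange(n) ^ 1  # 0<->1, 2<->3, ...
    # Numerically stable log-sum-exp over all other samples in the batch.
    row_max = sim.max(axis=1, keepdims=True)
    log_denominator = row_max[:, 0] + np.log(np.exp(sim - row_max).sum(axis=1))
    log_prob = sim[np.arange(n), partner] - log_denominator
    return -log_prob.mean()

# Two pairs whose views agree, then the same vectors with the pairing broken.
aligned = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])
shuffled = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]])
loss_aligned = nt_xent_loss(aligned)
loss_shuffled = nt_xent_loss(shuffled)
```

The loss is lower when the two views of each image agree than when the pairing is broken. Because of the normalisation step, rescaling the embeddings leaves the loss unchanged.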
Algorithm
Other Methods
• Large Batch Size
- Train with batch size 4096.
- Use the LARS optimizer, since standard SGD/Momentum can be unstable
at large batch sizes.
• Global BN
- When training with data parallelism, BN mean and variance are
typically aggregated locally per device.
- SimCLR aggregates BN mean and variance over all devices during training,
so the model cannot exploit information leaking between the two views of a
positive pair that sit on the same device.
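The difference between per-device and global BN statistics can be illustrated with a small NumPy sketch. The "devices" here are simply shards of one array. The helper names `local_bn_stats` and `global_bn_stats` are hypothetical, and the summation stands in for the cross-device all-reduce a real framework would perform.

```python
import numpy as np

def local_bn_stats(shards):
    """Per-device statistics: each shard uses only its own mean and variance."""
    return [(s.mean(axis=0), s.var(axis=0)) for s in shards]

def global_bn_stats(shards):
    """Global statistics: combine per-device sums, as an all-reduce would."""
    count = sum(s.shape[0] for s in shards)
    total = sum(s.sum(axis=0) for s in shards)
    total_sq = sum((s ** 2).sum(axis=0) for s in shards)
    mean = total / count
    var = total_sq / count - mean ** 2
    return mean, var

rng = np.random.default_rng(0)
batch = rng.normal(size=(8, 4))   # 8 samples, 4 features
shards = np.split(batch, 2)       # two simulated devices
g_mean, g_var = global_bn_stats(shards)
l_stats = local_bn_stats(shards)
```

The global statistics match those of the full batch, while each entry of `l_stats` reflects only the samples on that device.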
Evaluation Protocol
• Dataset and Metrics
- ImageNet
- Transfer learning on a wide range of datasets (CIFAR-10, CIFAR-100, etc.)
• Default Setting
- Random crop and resize, color distortions, Gaussian blur
- ResNet-50 as the base encoder network
- 2-layer MLP projection head to project the representation to a
128-dimensional latent space
- Trained at batch size 4096 for 100 epochs
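The projection head in the default setting can be sketched as a 2-layer MLP with one ReLU. This is a rough illustration only: the 2048-dimensional input (the usual ResNet-50 pooled output), the 2048-unit hidden layer, the random weights, and the name `projection_head` are assumptions for the example. Only the 128-dimensional output comes from the slide.

```python
import numpy as np

def projection_head(h, w1, b1, w2, b2):
    """2-layer MLP: linear -> ReLU -> linear, mapping representation h to z."""
    hidden = np.maximum(h @ w1 + b1, 0.0)
    return hidden @ w2 + b2

rng = np.random.default_rng(0)
rep_dim, hidden_dim, out_dim = 2048, 2048, 128  # rep_dim and hidden_dim are assumed
w1 = rng.normal(scale=rep_dim ** -0.5, size=(rep_dim, hidden_dim))
b1 = np.zeros(hidden_dim)
w2 = rng.normal(scale=hidden_dim ** -0.5, size=(hidden_dim, out_dim))
b2 = np.zeros(out_dim)

h = rng.normal(size=(4, rep_dim))        # stand-in for encoder outputs of 4 images
z = projection_head(h, w1, b1, w2, b2)   # embeddings fed to the contrastive loss
```

In this setup the contrastive loss is computed on `z`, while `h` is the representation kept for downstream tasks.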
Ablation Studies – Data Augmentation
Composing random crop with color distortion is crucial.
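A simplified sketch of composing the two augmentations highlighted here, using only NumPy. It is not the augmentation pipeline from the paper: the nearest-neighbour resize, the brightness-only jitter, the grayscale probability, and all function names and parameter values are assumptions chosen to keep the example short.

```python
import numpy as np

def random_crop_resize(img, out_size, rng, min_scale=0.5):
    """Crop a random region, then resize to out_size x out_size (nearest neighbour)."""
    h, w, _ = img.shape
    scale = rng.uniform(min_scale, 1.0)
    ch, cw = max(1, int(h * scale)), max(1, int(w * scale))
    top = rng.integers(0, h - ch + 1)
    left = rng.integers(0, w - cw + 1)
    crop = img[top:top + ch, left:left + cw]
    rows = np.arange(out_size) * ch // out_size
    cols = np.arange(out_size) * cw // out_size
    return crop[rows][:, cols]

def color_distort(img, rng, strength=0.5, p_gray=0.2):
    """Random brightness scaling, then an occasional conversion to grayscale."""
    brightness = rng.uniform(1.0 - strength, 1.0 + strength)
    out = np.clip(img * brightness, 0.0, 1.0)
    if rng.random() < p_gray:
        gray = out.mean(axis=2, keepdims=True)
        out = np.repeat(gray, 3, axis=2)
    return out

def two_views(img, out_size, rng):
    """Produce two independently augmented views of the same image."""
    return tuple(
        color_distort(random_crop_resize(img, out_size, rng), rng)
        for _ in range(2)
    )

rng = np.random.default_rng(0)
image = rng.random((64, 48, 3))          # stand-in RGB image with values in [0, 1]
view_a, view_b = two_views(image, 32, rng)
```

The two views returned for one image form a positive pair for the contrastive loss.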
Ablation Studies – Nonlinear Projection head
• The hidden layer before the projection head is a better representation
than the layer after
Ablation Studies – Batch Size
Results – ImageNet
Results – semi-supervised learning
Results – Transfer Learning
Conclusion
• SimCLR improves considerably over previous methods for
self-supervised, semi-supervised, and transfer learning.
• SimCLR differs from standard supervised learning on
ImageNet only in the choice of data augmentation, the use
of a nonlinear head, and the loss function.
• Despite a recent surge in interest, self-supervised learning
remains undervalued.
 
Science&tech:THE INFORMATION AGE STS.pdf
Science&tech:THE INFORMATION AGE STS.pdfScience&tech:THE INFORMATION AGE STS.pdf
Science&tech:THE INFORMATION AGE STS.pdfjimielynbastida
 

Recently uploaded (20)

Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptx
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
Artificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraArtificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning era
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
Unlocking the Potential of the Cloud for IBM Power Systems
Unlocking the Potential of the Cloud for IBM Power SystemsUnlocking the Potential of the Cloud for IBM Power Systems
Unlocking the Potential of the Cloud for IBM Power Systems
 
Science&tech:THE INFORMATION AGE STS.pdf
Science&tech:THE INFORMATION AGE STS.pdfScience&tech:THE INFORMATION AGE STS.pdf
Science&tech:THE INFORMATION AGE STS.pdf
 

A Simple Framework for Contrastive Learning of Visual Representations

  • 1. A Simple Framework for Contrastive Learning of Visual Representations Hwang seung hyun Yonsei University Severance Hospital CCIDS Google Research Team, Geoffrey Hinton | ICML 2020 2020.07.19
  • 2. Contents: 01 Introduction, 02 Related Work, 03 Methods and Experiments, 04 Conclusion
  • 3. SimCLR Introduction – Proposal
      • Most mainstream approaches to unsupervised visual representation learning fall into one of two classes: generative or discriminative.
      • Figure labels: Predict rotation, Autoencoder, Jigsaw Puzzle
  • 4. SimCLR Introduction – Proposal
      • Discriminative approaches based on contrastive learning in the latent space have recently shown state-of-the-art results. [AMDIM]
  • 5. SimCLR Introduction – Proposal
      • SimCLR outperforms previous work while being simpler.
      • SimCLR achieves 76.5% top-1 accuracy, a 7% relative improvement over the previous SOTA method.
      • When fine-tuned with only 1% of the ImageNet labels, SimCLR achieves 85.8% top-5 accuracy.
  • 6. SimCLR Introduction – Contributions
      • Composing multiple data augmentation operations is crucial in unsupervised contrastive learning.
      • A learnable nonlinear transformation between the representation and the contrastive loss substantially improves the quality of the learned representations.
      • Contrastive learning benefits from larger batch sizes and longer training.
      • Like supervised learning, contrastive learning benefits from deeper and wider networks.
      • Representation learning with a contrastive cross-entropy loss benefits from normalized embeddings and a temperature parameter.
  • 7. Related Work: Handcrafted pretext tasks
      • Relative patch prediction
      • Jigsaw puzzles
      • Rotation Prediction
      • Colorization Prediction
      • . . .
      Limits the GENERALITY of learned representations!
  • 8. Related Work: Contrastive visual representation learning
      • CPC V2
      • AMDIM
      • Rotation Prediction
      • MoCo (by Facebook)
      • . . .
      “SimCLR” is their composition!
  • 9. Methods and Experiments: Overall Architecture https://www.youtube.com/watch?v=5lsmGWtxnKA
  • 10. Methods and Experiments: Architecture – Data Augmentation https://www.youtube.com/watch?v=5lsmGWtxnKA
  • 11. Methods and Experiments: Architecture – loss function https://www.youtube.com/watch?v=5lsmGWtxnKA
  • 12. Methods and Experiments: Architecture – loss function. Final loss: normalized temperature-scaled cross entropy loss. https://www.youtube.com/watch?v=5lsmGWtxnKA
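The slide names the final loss but its formula did not survive in the transcript. The following is a minimal pure-Python sketch of a normalized temperature-scaled cross entropy loss, not the presenter's code. It assumes that embeddings arrive as a list of 2N vectors in which rows 2k and 2k+1 are the two augmented views of the same image; the function names and the temperature value of 0.5 are illustrative choices, not taken from the slides.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def nt_xent_loss(z, temperature=0.5):
    """Average NT-Xent loss over all 2N views.

    Assumption: rows 2k and 2k+1 of z are the two views of image k.
    """
    z = [l2_normalize(v) for v in z]
    n = len(z)
    total = 0.0
    for i in range(n):
        j = i ^ 1  # index of the positive partner: 0<->1, 2<->3, ...
        sims = [sum(a * b for a, b in zip(z[i], z[k])) / temperature
                for k in range(n)]
        # The denominator runs over every other view in the batch (k != i).
        log_denom = math.log(sum(math.exp(sims[k]) for k in range(n) if k != i))
        total += log_denom - sims[j]
    return total / n


# Views of the same image point the same way: low loss.
aligned = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
# Each view is closest to a view of the other image: high loss.
mismatched = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
loss_aligned = nt_xent_loss(aligned)
loss_mismatched = nt_xent_loss(mismatched)
```

In this toy batch the aligned pairs give a smaller loss than the mismatched ones, which is the behaviour the loss is meant to reward.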
  • 13. Methods and Experiments: Algorithm
  • 14. Methods and Experiments: Other Methods
      • Large batch size
          - Train with a batch size of 4096.
          - Use the LARS optimizer, since a standard SGD/Momentum optimizer can be unstable at large batch sizes.
      • Global BN
          - When training with data parallelism, BN mean and variance are typically aggregated locally per device.
          - SimCLR aggregates BN mean and variance over all devices during training.
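To make the Global BN point concrete, here is a small sketch of how per-device statistics can be combined into one global mean and variance. It is an illustration under simplifying assumptions (one scalar feature, plain Python lists standing in for device shards, hypothetical function names), not the synchronization code used in the paper.

```python
def local_stats(xs):
    """Per-device count, mean and (biased) variance of one feature."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    return n, mean, var


def global_stats(per_device):
    """Combine per-device (count, mean, var) into a global mean and variance."""
    total = sum(n for n, _, _ in per_device)
    mean = sum(n * m for n, m, _ in per_device) / total
    # E[x^2] on each device is var + mean^2; average it across devices,
    # then subtract the square of the global mean.
    second_moment = sum(n * (v + m * m) for n, m, v in per_device) / total
    return mean, second_moment - mean * mean


# Two hypothetical device shards with very different local statistics.
shards = [[1.0, 2.0, 3.0], [10.0, 11.0, 12.0]]
g_mean, g_var = global_stats([local_stats(s) for s in shards])

# Reference: statistics computed on the full batch in one place.
_, ref_mean, ref_var = local_stats([x for s in shards for x in s])
```

The combined statistics match those of the full batch, while each shard's local mean (2.0 and 11.0 here) is far from the global one.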
  • 15. Methods and Experiments: Evaluation Protocol
      • Dataset and metrics
          - ImageNet
          - Transfer learning on a wide range of datasets (CIFAR-10, CIFAR-100, etc.)
      • Default setting
          - Random crop and resize, color distortions, Gaussian blur
          - ResNet-50 as the base encoder network
          - 2-layer MLP projection head that projects the representation to a 128-dimensional latent space
          - Trained at batch size 4096 for 100 epochs
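The 2-layer MLP projection head in the default setting can be sketched as two linear layers with a nonlinearity in between. The snippet below is a toy pure-Python version: it assumes a ReLU nonlinearity, uses small made-up input and hidden sizes (16) instead of the real ResNet-50 feature width, and initializes weights randomly. Only the 128-dimensional output comes from the slide.

```python
import random


def linear(x, weights, bias):
    """Dense layer: one output per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def relu(x):
    return [max(0.0, v) for v in x]


def init_layer(n_in, n_out, rng):
    """Random weights and zero biases (illustrative initialization)."""
    scale = (1.0 / n_in) ** 0.5
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def projection_head(h, layer1, layer2):
    """z = W2 * relu(W1 * h): maps the representation h to the latent z."""
    hidden = relu(linear(h, *layer1))
    return linear(hidden, *layer2)


rng = random.Random(0)
REPR_DIM, HIDDEN_DIM, OUT_DIM = 16, 16, 128  # REPR_DIM and HIDDEN_DIM are toy sizes
layer1 = init_layer(REPR_DIM, HIDDEN_DIM, rng)
layer2 = init_layer(HIDDEN_DIM, OUT_DIM, rng)

h = [rng.gauss(0.0, 1.0) for _ in range(REPR_DIM)]  # stand-in for an encoder output
z = projection_head(h, layer1, layer2)
```

Here `h` plays the role of the encoder representation used for downstream tasks, and `z` is the 128-dimensional vector fed to the contrastive loss.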
  • 16. Methods and Experiments: Ablation Studies – Data Augmentation. “Coloring” and “Crop” are crucial.
  • 17. Methods and Experiments: Ablation Studies – Data Augmentation
  • 18. Methods and Experiments: Ablation Studies – Nonlinear Projection head
      • The hidden layer before the projection head is a better representation than the layer after it.
  • 19. Methods and Experiments: Ablation Studies – Batch Size
  • 20. Methods and Experiments: Results – ImageNet
  • 21. Methods and Experiments: Results – semi-supervised learning
  • 22. Methods and Experiments: Results – Transfer Learning
  • 23. Conclusion
      • SimCLR improves considerably over previous methods for self-supervised, semi-supervised, and transfer learning.
      • SimCLR differs from standard supervised learning on ImageNet only in the choice of data augmentation, the use of a nonlinear head, and the loss function.
      • Despite a recent surge in interest, self-supervised learning remains undervalued.