Rethinking Pre-training and Self-training
Google Research, Brain Team
Yonsei University Severance Hospital CCIDS
Choi Dongmin
Introduction
He et al. Rethinking ImageNet Pre-training. ICCV 2019
• Pre-training
- A dominant paradigm in computer vision (e.g., ImageNet pre-training)
- However, ImageNet pre-training does not improve accuracy on COCO [He et al., ICCV 2019]
• Self-training
- Steps (e.g., using ImageNet to help COCO object detection)
1) Discard the labels on ImageNet
2) Train an object detection model on COCO, and use it to generate pseudo labels on ImageNet
3) Train a new model on the combined pseudo-labeled ImageNet and labeled COCO data
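The three steps above can be sketched in a few lines. This is only a minimal illustration, not the paper's pipeline: a 1-D nearest-centroid classifier stands in for the object detector, and the names `fit_centroids`, `predict`, and `self_train`, as well as the toy data, are hypothetical.

```python
import random

def fit_centroids(points, labels):
    """Fit a nearest-centroid classifier: one mean value per class."""
    sums, counts = {}, {}
    for x, y in zip(points, labels):
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}

def predict(centroids, x):
    """Return the class whose centroid is closest to x."""
    return min(centroids, key=lambda y: abs(centroids[y] - x))

def self_train(labeled_x, labeled_y, unlabeled_x):
    # Step 2: train a teacher on the labeled target-task data only.
    teacher = fit_centroids(labeled_x, labeled_y)
    # Step 2 (cont.): the teacher writes pseudo labels for the unlabeled pool.
    pseudo_y = [predict(teacher, x) for x in unlabeled_x]
    # Step 3: train a student on human labels plus pseudo labels.
    student = fit_centroids(labeled_x + unlabeled_x, labeled_y + pseudo_y)
    return teacher, student

random.seed(0)
# Toy stand-in for "labeled COCO": four points, two classes.
labeled_x = [0.0, 0.4, 2.6, 3.0]
labeled_y = [0, 0, 1, 1]
# Toy stand-in for "ImageNet with its labels discarded" (step 1).
unlabeled_x = ([random.gauss(0.0, 0.3) for _ in range(50)]
               + [random.gauss(3.0, 0.3) for _ in range(50)])

teacher, student = self_train(labeled_x, labeled_y, unlabeled_x)
```

The point of the sketch is the data flow: the unlabeled pool never contributes its own labels, only the labels the teacher assigns for the target task.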
Introduction
• Generality and flexibility of self-training: three insights
1) Stronger data augmentation and more labeled data diminish the value of pre-training
2) Unlike pre-training, self-training is always helpful
3) Self-training improves upon pre-training
Methodology
• Methods and Control Factors

1. Data Augmentation

2. Pre-training

3. Self-training
Methodology
• Methods and Control Factors

1. Data Augmentation
- AutoAugment: automatically searches for improved data augmentation policies
- RandAugment: removes the separate search phase on a proxy task
- Augmentation strength: AutoAugment → RandAugment (stronger)
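As a rough illustration of the RandAugment idea, the sketch below draws N operations uniformly at random and applies them all at one shared magnitude M, so no policy search is needed. The operation list, the magnitude scaling, and the nested-list "image" are hypothetical stand-ins chosen to keep the example dependency-free; they are not the operations or settings used in the paper.

```python
import random

def identity(img, m):
    return img

def flip_horizontal(img, m):
    # Magnitude is unused for flips.
    return [row[::-1] for row in img]

def brighten(img, m):
    # Add 10 intensity levels per unit of magnitude, clipped at 255.
    delta = int(10 * m)
    return [[min(255, p + delta) for p in row] for row in img]

def translate_x(img, m):
    # Shift right by m // 3 pixels, padding with zeros on the left.
    shift = min(int(m) // 3, len(img[0]))
    return [[0] * shift + row[:len(row) - shift] for row in img]

# Hypothetical operation pool for this sketch.
OPS = [identity, flip_horizontal, brighten, translate_x]

def rand_augment(img, n, m, rng):
    """Apply n ops drawn uniformly at random, all at shared magnitude m."""
    for op in rng.choices(OPS, k=n):
        img = op(img, m)
    return img

rng = random.Random(0)
image = [[0, 50, 100, 150], [10, 60, 110, 160]]
augmented = rand_augment(image, n=2, m=6, rng=rng)
```

Raising `n` or `m` is the only knob for making the augmentation stronger, which is what makes it easy to sweep augmentation strength as a control factor.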
Methodology
• Methods and Control Factors

2. Pre-training (EfficientNet-B7 baseline)
ImageNet++ Init : EfficientNet-B7 + Noisy Student Method
M Tan et al. EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks. ICML 2019

Qizhe Xie et al. Self-training with Noisy Student improves ImageNet classification. arXiv:1911.04252
- A semi-supervised learning method
- Self-training + distillation
Methodology
• Methods and Control Factors

3. Self-training (based on the Noisy Student method)
Qizhe Xie et al. Self-training with Noisy Student improves ImageNet classification. arXiv:1911.04252
Experiments
1. The effects of augmentation and labeled dataset size on pre-training
- Task : COCO object detection

- Network : RetinaNet with the EfficientNet-B7 backbone



- Left : results with different ImageNet pre-trained checkpoints and data augmentation strengths
TY Lin et al. Focal Loss for Dense Object Detection. ICCV 2017
Finding 1. Pre-training hurts performance when stronger data augmentation is used
- Right : results with different COCO dataset sizes and ImageNet pre-trained checkpoints
Finding 2. More labeled data diminishes the value of pre-training
Experiments
2. The effects of augmentation and labeled dataset size on self-training
- Task : COCO object detection (self-training only treats ImageNet as unlabeled data)

- Network : RetinaNet with the EfficientNet-B7 backbone



Finding 1. Self-training helps in high data/strong augmentation regimes, even when pre-training hurts
Finding 2. Self-training works across dataset sizes and is additive to pre-training.
Experiments
3. Self-supervised pre-training also hurts when self-training helps in high data/strong augmentation regimes
- Task : COCO object detection
- Network : RetinaNet with the ResNet-50 backbone
- All models use Augment-S4

T Chen et al. A Simple Framework for Contrastive Learning of Visual Representations. arXiv:2002.05709
https://amitness.com/2020/03/illustrated-simclr/
Experiments
4. Exploring the limits of self-training and pre-training
- Task : COCO object detection

- Network : SpineNet (closer to SOTA)

- Self-training dataset : Open Images Dataset
X Du et al. SpineNet: Learning Scale-Permuted Backbone for Recognition and Localization. CVPR 2020
SpineNet with self-training achieves the best performance
Experiments
4. Exploring the limits of self-training and pre-training
- Task : PASCAL VOC Semantic Segmentation

- Network : NAS-FPN (EfficientNet backbone)

- Pre-training + Self-training + Augment-S4

- Pre-training dataset : ImageNet

- Self-training dataset : aug set of PASCAL
G Ghiasi et al. NAS-FPN: Learning Scalable Feature Pyramid Architecture for Object Detection. CVPR 2019
Improves on the previous SOTA by 1.5% mIoU with far fewer human labels
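For readers unfamiliar with the metric, mean intersection-over-union (mIoU) averages the per-class overlap between predicted and ground-truth segmentation labels. The sketch below computes it on a single flattened label map; the function name `mean_iou` and the toy labels are hypothetical, and benchmark implementations typically accumulate the counts over the whole dataset rather than per image.

```python
def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union over classes present in pred or target."""
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # skip classes absent from both maps
            ious.append(inter / union)
    return sum(ious) / len(ious)

# Toy flattened label maps with three classes.
target = [0, 0, 1, 1, 2, 2]
pred = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, num_classes=3)  # per-class IoU: 1/3, 2/3, 1/2
```

Here the three per-class values average to 0.5, and a perfect prediction would score 1.0.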
Pre-training with a good checkpoint is crucial due to PASCAL’s small dataset size (Appendix C)
Discussion
1. Rethinking pre-training and universal feature representations
- Requirements of universal feature representations that can solve many tasks



- Weak performance of pre-training: pre-training is not aware of the task of interest and can fail to adapt
(e.g., good features for ImageNet may discard positional information that is needed for COCO)
- Self-training is more adaptive to the task of interest (generally more beneficial)
Discussion
2. The benefit of joint-training
- Joint-training : jointly train ImageNet classification with COCO object detection



- Random Initialization + Self-training + Joint Training : +4.4 improvement



- Joint Training (+2.9) and Pre-training (+2.6) give similar improvements, but Joint Training takes only 19 epochs, whereas Pre-training takes 350 epochs.
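The mechanics of joint training can be illustrated with a toy model in which two task heads share one weight and the two task losses are summed at every step. Everything here is an assumption made for illustration: the scalar "backbone" `w`, the heads `a` and `b`, the two regression targets, and the hyperparameters bear no relation to the ImageNet and COCO setup in the paper.

```python
def joint_loss(a, b, w, xs):
    """Sum of two task losses that share the 'backbone' weight w."""
    loss_a = sum((a * w * x - 2.0 * x) ** 2 for x in xs) / len(xs)  # task A
    loss_b = sum((b * w * x + 3.0 * x) ** 2 for x in xs) / len(xs)  # task B
    return loss_a + loss_b

def train_jointly(xs, steps=500, lr=0.05):
    a, b, w = 0.1, 0.1, 1.0  # two task heads and one shared weight
    m2 = sum(x * x for x in xs) / len(xs)  # mean of x^2 over the data
    for _ in range(steps):
        err_a = a * w - 2.0  # task A is solved when a * w == 2
        err_b = b * w + 3.0  # task B is solved when b * w == -3
        grad_a = 2.0 * m2 * err_a * w
        grad_b = 2.0 * m2 * err_b * w
        # The shared weight receives gradient from both tasks at once.
        grad_w = 2.0 * m2 * (err_a * a + err_b * b)
        a, b, w = a - lr * grad_a, b - lr * grad_b, w - lr * grad_w
    return a, b, w

xs = [-1.0, -0.5, 0.5, 1.0]
initial = joint_loss(0.1, 0.1, 1.0, xs)
a, b, w = train_jointly(xs)
final = joint_loss(a, b, w, xs)
```

In contrast to pre-training followed by fine-tuning, there is no separate first stage: both objectives shape the shared weight from the first update.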
Discussion
3. The importance of the task alignment
- aug : additional PASCAL VOC dataset with much noisier labels

- Training with the aug dataset hurts performance when strong augmentation is used
- Self-training (pseudo-labeling the aug dataset) improves accuracy
Noisy (PASCAL) or un-targeted (ImageNet) labeling is worse than targeted pseudo labeling
Shao et al.: Pre-training on Open Images hurts performance on COCO, despite both of them being annotated with bounding boxes
Shao et al. Objects365: A Large-scale, High-quality Dataset for Object Detection. ICCV 2019
Not only the task but also the annotations need to be the same for pre-training to be beneficial (self-training, by contrast, is very general)
Discussion
4. Limitations
- Self-training requires more compute than pre-training



- Good pre-trained models are also needed for low-data applications (e.g., PASCAL segmentation)
5. The scalability, generality and flexibility of self-training
- Scalability : works well as more labeled data becomes available
- Generality : works well both when pre-training fails and when it succeeds
- Flexibility : works well in every setup (low or high data / weak or strong aug) and with different architectures, data sources, and tasks
Most methods fail when we have more labeled data, more compute, or better supervised training recipes, but that does not seem to apply to self-training
Thank you

Review : Prototype Mixture Models for Few-shot Semantic Segmentation
 
Review : PolarMask: Single Shot Instance Segmentation with Polar Representati...
Review : PolarMask: Single Shot Instance Segmentation with Polar Representati...Review : PolarMask: Single Shot Instance Segmentation with Polar Representati...
Review : PolarMask: Single Shot Instance Segmentation with Polar Representati...
 
Review : Multi-Domain Image Completion for Random Missing Input Data [cdm]
Review : Multi-Domain Image Completion for Random Missing Input Data [cdm]Review : Multi-Domain Image Completion for Random Missing Input Data [cdm]
Review : Multi-Domain Image Completion for Random Missing Input Data [cdm]
 
Review : Structure Boundary Preserving Segmentation
for Medical Image with Am...
Review : Structure Boundary Preserving Segmentation
for Medical Image with Am...Review : Structure Boundary Preserving Segmentation
for Medical Image with Am...
Review : Structure Boundary Preserving Segmentation
for Medical Image with Am...
 
Pyradiomics Customization [CDM]
Pyradiomics Customization [CDM]Pyradiomics Customization [CDM]
Pyradiomics Customization [CDM]
 
Seeing What a GAN Cannot Generate [cdm]
Seeing What a GAN Cannot Generate [cdm]Seeing What a GAN Cannot Generate [cdm]
Seeing What a GAN Cannot Generate [cdm]
 
Neural network pruning with residual connections and limited-data review [cdm]
Neural network pruning with residual connections and limited-data review [cdm]Neural network pruning with residual connections and limited-data review [cdm]
Neural network pruning with residual connections and limited-data review [cdm]
 
Network Deconvolution review [cdm]
Network Deconvolution review [cdm]Network Deconvolution review [cdm]
Network Deconvolution review [cdm]
 
How much position information do convolutional neural networks encode? review...
How much position information do convolutional neural networks encode? review...How much position information do convolutional neural networks encode? review...
How much position information do convolutional neural networks encode? review...
 
Objects as points (CenterNet) review [CDM]
Objects as points (CenterNet) review [CDM]Objects as points (CenterNet) review [CDM]
Objects as points (CenterNet) review [CDM]
 
Augmix review [cdm]
Augmix review [cdm]Augmix review [cdm]
Augmix review [cdm]
 

Recently uploaded

IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Alan Dix
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhisoniya singh
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Paola De la Torre
 

Recently uploaded (20)

IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 

Review : Rethinking Pre-training and Self-training

  • 1. Rethinking Pre-training and Self-training
 Google Research, Brain Team
 Yonsei University Severance Hospital CCIDS
 Choi Dongmin
  • 3. Introduction
 • Pre-training
 - A dominant paradigm in computer vision (e.g. ImageNet pre-training)
 - However, ImageNet pre-training does not improve accuracy on COCO [He et al. Rethinking ImageNet Pre-training. ICCV 2019]
 • Self-training
 - Steps (e.g. using ImageNet to help COCO object detection)
 1) Discard the labels on ImageNet
 2) Train an object detection model on COCO, and use it to generate pseudo labels on ImageNet
 3) Train a new model on the combined pseudo-labeled ImageNet and labeled COCO data
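 The three self-training steps above can be sketched as a toy example. This is a minimal illustration, not the paper's implementation: a nearest-centroid classifier stands in for the object detector, and the names `fit_centroids`, `predict`, and `self_train` are hypothetical. The noise applied to the student in the Noisy Student method is omitted.

```python
import numpy as np

def fit_centroids(x, y, n_classes):
    # "Training" for the stand-in model: one mean vector per class.
    # Assumes every class has at least one example in y.
    return np.stack([x[y == c].mean(axis=0) for c in range(n_classes)])

def predict(centroids, x):
    # Assign each sample to the class with the nearest centroid.
    dists = ((x[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1)

def self_train(x_labeled, y_labeled, x_unlabeled, n_classes):
    # Step 1 is implicit: x_unlabeled carries no labels.
    # Step 2: train a teacher on the labeled data and pseudo-label the rest.
    teacher = fit_centroids(x_labeled, y_labeled, n_classes)
    pseudo_labels = predict(teacher, x_unlabeled)
    # Step 3: train a new (student) model on labeled + pseudo-labeled data.
    x_all = np.concatenate([x_labeled, x_unlabeled])
    y_all = np.concatenate([y_labeled, pseudo_labels])
    student = fit_centroids(x_all, y_all, n_classes)
    return teacher, student, pseudo_labels
```

 In the setting of the slides, `x_labeled` would correspond to COCO and `x_unlabeled` to ImageNet with its labels discarded.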
  • 4. Introduction
 • Generality and flexibility of self-training, with three insights
 1) Stronger data augmentation and more labeled data diminish the value of pre-training
 2) Unlike pre-training, self-training is always helpful, in both low-data and high-data regimes
 3) Self-training improves upon pre-training
  • 5. Methodology
 • Methods and Control Factors
 1. Data Augmentation
 2. Pre-training
 3. Self-training
  • 7. Methodology
 • Methods and Control Factors
 1. Data Augmentation (listed from weaker to stronger)
 - AutoAugment : automatically searches for improved data augmentation policies
 - RandAugment : removes the separate search phase on a proxy task
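 The idea behind RandAugment can be sketched in a few lines: instead of searching for a policy, sample a fixed number of operations uniformly at random and apply each with one shared magnitude. The sketch below is an assumption-laden toy, not the policy used in the paper: the operation set `OPS` and its functions are hypothetical stand-ins that act on a 2-D float image with values in [0, 1].

```python
import random
import numpy as np

# Hypothetical toy operations; each takes an image and a magnitude in [0, 1].
def identity(img, m):
    return img

def brightness(img, m):
    return np.clip(img + 0.5 * m, 0.0, 1.0)

def contrast(img, m):
    mean = img.mean()
    return np.clip((img - mean) * (1.0 + m) + mean, 0.0, 1.0)

def shift_x(img, m):
    # Circular horizontal shift by up to 30% of the width.
    return np.roll(img, int(round(0.3 * m * img.shape[1])), axis=1)

def flip_lr(img, m):
    return img[:, ::-1]

OPS = [identity, brightness, contrast, shift_x, flip_lr]

def rand_augment(img, n_ops=2, magnitude=0.5, rng=None):
    # Sample n_ops operations uniformly (with replacement) and apply them
    # in sequence, all with the same magnitude. No search phase is needed.
    rng = rng or random.Random()
    for op in rng.choices(OPS, k=n_ops):
        img = op(img, magnitude)
    return img
```

 Only two hyperparameters remain (`n_ops` and `magnitude`), which is what makes a separate policy search unnecessary.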
  • 9. Methodology
 • Methods and Control Factors
 2. Pre-training (EfficientNet-B7 baseline)
 - ImageNet++ Init : EfficientNet-B7 + Noisy Student method
 - Noisy Student : a semi-supervised learning method (self-training + distillation)
 M Tan et al. EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks. ICML 2019
 Qizhe Xie et al. Self-training with Noisy Student improves ImageNet classification. arXiv:1911.04252
  • 10. Methodology
 • Methods and Control Factors
 3. Self-training (based on the Noisy Student method)
 Qizhe Xie et al. Self-training with Noisy Student improves ImageNet classification. arXiv:1911.04252
  • 11. Experiments
 1. The effects of augmentation and labeled dataset size on pre-training
 - Task : COCO object detection
 - Network : RetinaNet with the EfficientNet-B7 backbone
 - Left : results under various ImageNet pre-trained checkpoints and data augmentation strengths
 Finding 1. Pre-training hurts performance when stronger data augmentation is used
 TY Lin et al. Focal Loss for Dense Object Detection. ICCV 2017
  • 12. Experiments
 1. The effects of augmentation and labeled dataset size on pre-training
 - Task : COCO object detection
 - Network : RetinaNet with the EfficientNet-B7 backbone
 - Right : results under various COCO dataset sizes and ImageNet pre-trained checkpoints
 Finding 2. More labeled data diminishes the value of pre-training
 TY Lin et al. Focal Loss for Dense Object Detection. ICCV 2017
  • 13. Experiments
 2. The effects of augmentation and labeled dataset size on self-training
 - Task : COCO object detection (self-training treats ImageNet only as unlabeled data)
 - Network : RetinaNet with the EfficientNet-B7 backbone
 Finding 1. Self-training helps in high data/strong augmentation regimes, even when pre-training hurts
  • 14. Experiments
 2. The effects of augmentation and labeled dataset size on self-training
 - Task : COCO object detection (self-training treats ImageNet only as unlabeled data)
 - Network : RetinaNet with the EfficientNet-B7 backbone
 Finding 2. Self-training works across dataset sizes and is additive to pre-training
  • 15. Experiments
 3. Self-supervised pre-training also hurts when self-training helps in high data/strong augmentation regimes
 - Task : COCO object detection
 - Network : RetinaNet with the ResNet-50 backbone
 - All models use Augment-S4
 T Chen et al. A Simple Framework for Contrastive Learning of Visual Representations. arXiv:2002.05709
 https://amitness.com/2020/03/illustrated-simclr/
  • 16. Experiments
 4. Exploring the limits of self-training and pre-training
 - Task : COCO object detection
 - Network : SpineNet (closer to SOTA)
 - Self-training dataset : Open Images Dataset
 SpineNet with self-training achieves the best performance
 X Du et al. SpineNet: Learning Scale-Permuted Backbone for Recognition and Localization. CVPR 2020
  • 17. Experiments
 4. Exploring the limits of self-training and pre-training
 - Task : PASCAL VOC semantic segmentation
 - Network : NAS-FPN (EfficientNet backbone)
 - Pre-training + Self-training + Augment-S4
 - Pre-training dataset : ImageNet
 - Self-training dataset : aug set of PASCAL
 Improves SOTA by +1.5% mIoU with far fewer human labels
 G Ghiasi et al. NAS-FPN: Learning Scalable Feature Pyramid Architecture for Object Detection. CVPR 2019
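 For readers unfamiliar with the mIoU metric quoted above, here is a minimal sketch of how mean intersection-over-union is commonly computed from predicted and ground-truth label maps. The function name `mean_iou` is hypothetical, and skipping classes absent from both maps is an assumption of this sketch. Benchmark evaluation code may differ in details, for example by accumulating counts over the whole dataset before dividing.

```python
import numpy as np

def mean_iou(pred, target, n_classes):
    # pred and target are integer class-label arrays of the same shape.
    ious = []
    for c in range(n_classes):
        pred_c = pred == c
        target_c = target == c
        union = np.logical_or(pred_c, target_c).sum()
        if union == 0:
            # Class c appears in neither map; skip it rather than count 0 or 1.
            continue
        intersection = np.logical_and(pred_c, target_c).sum()
        ious.append(intersection / union)
    return float(np.mean(ious))

# Tiny worked example: class 0 has IoU 1/2, class 1 has IoU 2/3.
pred = np.array([0, 0, 1, 1])
target = np.array([0, 1, 1, 1])
score = mean_iou(pred, target, n_classes=2)  # (1/2 + 2/3) / 2 = 7/12
```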
  • 18. Experiments
 4. Exploring the limits of self-training and pre-training
 - Task : PASCAL VOC semantic segmentation
 - Network : NAS-FPN (EfficientNet backbone)
 - Pre-training + Self-training + Augment-S4
 - Pre-training dataset : ImageNet
 - Self-training dataset : aug set of PASCAL
 Pre-training with a good checkpoint is crucial due to PASCAL’s small dataset size (Appendix C)
  • 19. Discussion
 1. Rethinking pre-training and universal feature representations
 - The goal : universal feature representations that can solve many tasks
 - Weak performance of pre-training : pre-training is not aware of the task of interest and can fail to adapt
 (e.g. good features for ImageNet may discard positional information that is needed for COCO)
 - Self-training is more adaptive to the task of interest (generally more beneficial)
  • 21. Discussion
 2. The benefit of joint-training
 - Joint-training : jointly train ImageNet classification with COCO object detection
 - Random Initialization + Self-training + Joint Training : +4.4 improvement
 - Joint Training (+2.9) and Pre-training (+2.6) give similar improvements, but Joint Training needs only 19 epochs of training, while Pre-training needed 350 epochs
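 The slides do not describe how joint-training is implemented, so the following is only a sketch under stated assumptions: one shared backbone feeds a classification head and a second head, and a single update minimizes the sum of both losses. A box-regression head stands in for the detector, the model is a tiny tanh network in NumPy with hand-written gradients, and the names `make_params` and `joint_step` are hypothetical.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def make_params(rng, d_in=8, d_hidden=16, n_classes=3, d_box=4):
    return {
        "backbone": rng.normal(scale=0.1, size=(d_in, d_hidden)),
        "cls_head": rng.normal(scale=0.1, size=(d_hidden, n_classes)),
        "det_head": rng.normal(scale=0.1, size=(d_hidden, d_box)),
    }

def joint_step(params, cls_batch, det_batch, lr=0.05):
    W, Wc, Wd = params["backbone"], params["cls_head"], params["det_head"]
    xc, yc = cls_batch
    xd, td = det_batch
    n = len(yc)

    # Classification branch: cross-entropy on shared features.
    hc = np.tanh(xc @ W)
    p = softmax(hc @ Wc)
    loss_cls = -np.log(p[np.arange(n), yc]).mean()
    d_logits = p.copy()
    d_logits[np.arange(n), yc] -= 1.0
    d_logits /= n
    g_cls_head = hc.T @ d_logits
    g_backbone_cls = xc.T @ ((d_logits @ Wc.T) * (1.0 - hc ** 2))

    # Stand-in detection branch: mean squared error on box targets.
    hd = np.tanh(xd @ W)
    diff = hd @ Wd - td
    loss_det = (diff ** 2).mean()
    d_pred = 2.0 * diff / diff.size
    g_det_head = hd.T @ d_pred
    g_backbone_det = xd.T @ ((d_pred @ Wd.T) * (1.0 - hd ** 2))

    # The shared backbone receives gradients from both tasks at once.
    params["backbone"] = W - lr * (g_backbone_cls + g_backbone_det)
    params["cls_head"] = Wc - lr * g_cls_head
    params["det_head"] = Wd - lr * g_det_head
    return loss_cls + loss_det
```

 The point of the sketch is the backbone update: unlike pre-training followed by fine-tuning, both objectives shape the shared features in the same step.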
  • 23. Discussion
 3. The importance of task alignment
 - aug : an additional PASCAL VOC dataset with much noisier labels
 - Training with the aug dataset hurts performance when strong augmentation is used
 - Self-training (pseudo-labeling the aug dataset) improves accuracy
 - Noisy (PASCAL aug) or un-targeted (ImageNet) labeling is worse than targeted pseudo labeling
 - Shao et al. : pre-training on Open Images hurts performance on COCO, despite both being annotated with bounding boxes
 - For pre-training to be beneficial, not only the task but also the annotations need to be the same (self-training, by contrast, is very general)
 Shao et al. Objects365: A Large-scale, High-quality Dataset for Object Detection. ICCV 2019
  • 25. Discussion
 4. Limitations
 - Self-training requires more compute than pre-training
 - Good pre-trained models are still needed for low-data applications (e.g. PASCAL segmentation)
 5. The scalability, generality and flexibility of self-training
 - Scalability : works well as more labeled data becomes available
 - Generality : works well both when pre-training fails and when it succeeds
 - Flexibility : works well in every setup (low or high data / weak or strong augmentation) and with different architectures, data sources, and tasks
 - Most methods fail when we have more labeled data, more compute, or better supervised training recipes, but that does not seem to apply to self-training