2020/11/29
Ho Seong Lee (hoya012)
Cognex Deep Learning Lab
Research Engineer
PR-290 | Do Adversarially Robust ImageNet Models Transfer Better? 1
Contents
• Introduction
• Related Work
• Experiments
• Analysis & Discussion
• Conclusion
Introduction
Transfer learning is a widely used paradigm in deep learning (maybe even the default?)
• Models pre-trained on standard datasets (e.g. ImageNet) can be efficiently adapted to downstream tasks.
• Better pre-trained models yield better transfer results, suggesting that initial accuracy is a key aspect of
transfer learning performance.
Reference: “Do Better ImageNet Models Transfer Better?”, 2019 CVPR
Related Works
Transfer learning in various domains
• Medical imaging
• “Comparison of deep transfer learning strategies for digital pathology”, 2018 CVPRW
• Language modeling
• “Senteval: An evaluation toolkit for universal sentence representations”, 2018 arXiv
• Object Detection, Segmentation
• “Faster r-cnn: Towards real-time object detection with region proposal networks”, 2015 NIPS
• “R-fcn: Object detection via region-based fully convolutional networks”, 2016 NIPS
• “Speed/accuracy trade-offs for modern convolutional object detectors”, 2017 CVPR
• “Deeplab: Semantic image segmentation with deep convolutional nets, atrous convolution, and
fully connected crfs”, 2017 TPAMI
Related Works
Transfer Learning with fine-tuning or frozen feature-based methods
• “Analyzing the performance of multilayer neural networks for object recognition”, 2014 ECCV
• “Return of the devil in the details: Delving deep into convolutional nets”, 2014 arXiv
• “Rich feature hierarchies for accurate object detection and semantic segmentation”, 2014 CVPR
• “How transferable are features in deep neural networks?”, 2014 NIPS
• “Factors of transferability for a generic convnet representation”, 2015 TPAMI
• “Bilinear cnn models for fine-grained visual recognition”, 2015 ICCV
• “What makes ImageNet good for transfer learning?”, 2016 arXiv
• “Best practices for fine-tuning visual classifiers to new domains”, 2016 ECCV
→ They show that fine-tuning outperforms frozen feature-based methods
Related Works
Adversarial robustness
• “Towards deep learning models resistant to adversarial attacks”, 2018 ICLR
• “Virtual adversarial training: a regularization method for supervised and semi-supervised learning”,
2018
• “Provably robust deep learning via adversarially trained smoothed classifiers”, 2019 NeurIPS
• Many papers have studied the features learned by these robust networks and suggested that they
improve upon those learned by standard networks.
• On the other hand, prior studies have also identified theoretical and empirical tradeoffs between
standard accuracy and adversarial robustness.
Related Works
Adversarial robustness and Transfer learning
• “Adversarially robust transfer learning”, 2019 arXiv
• Transfer learning can increase downstream-task adversarial robustness
• “Adversarially-Trained Deep Nets Transfer Better”, 2020 arXiv
• Investigates the transfer performance of adversarially robust networks. → Very similar work!
• In comparison, the authors of the paper reviewed here study a larger set of downstream datasets and
tasks and analyze the effects of model accuracy, model width, and data resolution.
Experiments
Motivation: Fixed-Feature Transfer Learning
• Basically, we use the source model as a feature extractor for the target dataset, then train a simple (often
linear) model on the resulting features.
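A minimal sketch of the fixed-feature recipe described above, in plain Python. The `frozen_features` function is a hypothetical stand-in for a pre-trained source model (a fixed random projection with ReLU, not an ImageNet network), and the toy target data are made up for illustration. Only the linear head is trained; the extractor's weights are never touched.

```python
import math
import random

random.seed(0)

IN_DIM, FEAT_DIM = 4, 8

# Hypothetical stand-in for a pre-trained source model: a fixed random
# projection followed by ReLU. Its weights are never updated.
FROZEN_W = [[random.gauss(0, 1) for _ in range(IN_DIM)] for _ in range(FEAT_DIM)]


def frozen_features(x):
    """Map an input vector to features using the frozen extractor."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in FROZEN_W]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_linear_head(features, labels, lr=0.1, epochs=50):
    """Fit a logistic-regression head on precomputed features with SGD."""
    w = [0.0] * FEAT_DIM
    b = 0.0
    for _ in range(epochs):
        for f, y in zip(features, labels):
            p = sigmoid(sum(wi * fi for wi, fi in zip(w, f)) + b)
            g = p - y  # gradient of the log loss with respect to the logit
            w = [wi - lr * g * fi for wi, fi in zip(w, f)]
            b -= lr * g
    return w, b


def predict(w, b, f):
    """Predict class 1 when the logit is positive, else class 0."""
    return 1 if sum(wi * fi for wi, fi in zip(w, f)) + b > 0 else 0


# Toy target dataset (made up): the label is 1 when the first coordinate is positive.
xs = [[random.uniform(-1, 1) for _ in range(IN_DIM)] for _ in range(200)]
ys = [1 if x[0] > 0 else 0 for x in xs]

frozen_before = [row[:] for row in FROZEN_W]
feats = [frozen_features(x) for x in xs]  # extract once; the extractor stays fixed
w, b = train_linear_head(feats, ys)
train_acc = sum(predict(w, b, f) == y for f, y in zip(feats, ys)) / len(ys)
```

Because the features are computed once and reused, this setting is cheap: the only learned parameters are the head's `w` and `b`.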
Reference: Stanford cs231n lecture note
Experiments
How can we improve transfer learning?
• Prior works suggest that accuracy on the source dataset is a strong indicator of performance on
downstream tasks.
• Still, it is unclear if improving ImageNet accuracy is the only way to improve performance.
• After all, the behavior of fixed-feature transfer is governed by models’ learned representations, which
are not fully described by source-dataset accuracy.
• These representations are, in turn, controlled by the priors that we put on them during training:
architectural components, loss functions, augmentations, etc.
Experiments
The adversarial robustness prior
• Adversarial robustness refers to a model’s invariance to small (often imperceptible) perturbations of its
inputs.
• Robustness is typically induced at training time by replacing the standard empirical risk minimization
objective with a robust optimization objective
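The two objectives are commonly written as follows. This is a sketch in the notation of the robust-optimization formulation from “Towards deep learning models resistant to adversarial attacks” (cited earlier); θ denotes the model parameters, D the data distribution, L the loss, and δ the perturbation. The ℓ2 norm is an assumption made here for concreteness; other norms are also used in the literature.

```latex
% Standard empirical risk minimization (ERM)
\min_{\theta} \; \mathbb{E}_{(x, y) \sim D} \big[ \mathcal{L}(x, y; \theta) \big]

% Robust optimization: minimize the worst-case loss inside an \ell_2 ball of radius \varepsilon
\min_{\theta} \; \mathbb{E}_{(x, y) \sim D} \Big[ \max_{\|\delta\|_2 \le \varepsilon} \mathcal{L}(x + \delta, y; \theta) \Big]
```

Setting ε = 0 removes the inner maximization and recovers standard ERM, so ε acts as a knob for the strength of the robustness prior.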
Experiments
Should adversarial robustness help fixed-feature transfer?
• In fact, adversarially robust models are known to be significantly less accurate than their standard
counterparts.
• This suggests that using adversarially robust feature representations should hurt transfer performance.
• On the other hand, recent work has found that the feature representations of robust models carry
several advantages over those of standard models.
• For example, adversarially robust representations typically have better-behaved gradients and thus
facilitate regularization-free feature visualization
Experiments
Experiments – Fixed Feature Transfer Learning
• Two hypotheses conflict: (1) adversarially robust feature representations should hurt transfer
performance, and (2) the feature representations of robust models carry several advantages over
those of standard models. To resolve this, the authors use a test bed of 12 standard transfer learning datasets.
• They use four ResNet-based architectures (ResNet-18, ResNet-50, WideResNet-50-x2, WideResNet-50-x4).
• The results indicate that robust networks consistently extract better features for transfer learning than
standard networks.
Experiments
Experiments – Full-Network Fine Tuning
• A more expensive but often better-performing transfer learning method uses the pre-trained model as a
weight initialization rather than as a feature extractor.
• In other words, update all of the weights of the pre-trained model (via gradient descent) to minimize loss
on the target task.
• Many previous works find that for standard models, performance on full-network transfer learning is
highly correlated with performance on fixed-feature transfer learning.
• The hope is that the findings of the previous section (fixed-feature) also carry over to this setting (full-network).
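The difference between the two transfer modes can be sketched with a deliberately tiny scalar model. Everything here is hypothetical (the two-weight "network", the pre-trained values, and the target task y = 3x); the point is only that fixed-feature transfer updates the head alone, while full-network fine-tuning starts from the same pre-trained weights and updates all of them.

```python
def forward(params, x):
    """Two-layer scalar model: a 'backbone' weight followed by a 'head' weight."""
    return params["head"] * (params["backbone"] * x)


def sgd_step(params, x, y, lr, trainable):
    """One gradient-descent step on 0.5 * squared error, updating only `trainable` keys."""
    feature = params["backbone"] * x   # backbone output (the "feature")
    err = forward(params, x) - y       # derivative of the loss w.r.t. the prediction
    grads = {
        "head": err * feature,
        "backbone": err * params["head"] * x,
    }
    return {k: v - lr * grads[k] if k in trainable else v for k, v in params.items()}


def transfer(pretrained, data, mode, lr=0.05, epochs=100):
    """Adapt `pretrained` to `data` by fixed-feature or full-network transfer."""
    trainable = {"head"} if mode == "fixed-feature" else {"head", "backbone"}
    params = dict(pretrained)  # pre-trained weights serve as the initialization
    for _ in range(epochs):
        for x, y in data:
            params = sgd_step(params, x, y, lr, trainable)
    return params


# Hypothetical pre-trained weights and a made-up target task y = 3x.
pretrained = {"backbone": 1.0, "head": 0.5}
target_data = [(1.0, 3.0), (2.0, 6.0), (-1.0, -3.0)]

fixed = transfer(pretrained, target_data, "fixed-feature")
full = transfer(pretrained, target_data, "full-network")
```

In the fixed-feature run `fixed["backbone"]` keeps its pre-trained value, whereas in the full-network run both weights move away from their initialization.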
Experiments
Experiments – Full-Network Fine Tuning
• Robust models match or improve on standard models in terms of transfer learning performance.
Experiments
Experiments – Full-Network Fine Tuning
• Also, adversarially robust networks consistently outperform standard networks in Object Detection &
Instance Segmentation
Analysis & Discussion
4.1 ImageNet accuracy and transfer performance
• Take a closer look at the similarities and differences in transfer learning between robust networks and
standard networks.
• Hypothesis: robustness and accuracy have counteracting yet separate effects!
• That is, higher accuracy improves transfer learning for a fixed level of robustness, and higher
robustness improves transfer learning for a fixed level of accuracy
• The results (cf. Figure 5; similar results for full-network transfer in Appendix F) support this hypothesis.
• The previously observed linear relationship between accuracy and transfer performance is often violated
once the robustness aspect comes into play.
Analysis & Discussion
4.1 ImageNet accuracy and transfer performance
• In even more direct support of this hypothesis, the authors find that when the robustness level is held fixed, the
accuracy-transfer correlation observed by prior works for standard models holds for robust models too.
• These findings also indicate that accuracy is not a sufficient measure of feature quality or versatility.
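One way to check a claim of this kind is to compute the accuracy-transfer correlation separately for each robustness level and compare it with the pooled correlation. The sketch below assumes a hypothetical record format `(eps, source_accuracy, transfer_accuracy)`; the numbers are placeholders chosen only to exercise the code and are not results from the paper.

```python
from collections import defaultdict
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def correlation_by_eps(records):
    """Group (eps, source_acc, transfer_acc) records by eps and correlate within each group."""
    groups = defaultdict(list)
    for eps, src, tgt in records:
        groups[eps].append((src, tgt))
    return {
        eps: pearson([s for s, _ in g], [t for _, t in g])
        for eps, g in groups.items()
    }


# Placeholder numbers ONLY to exercise the code; they are not results from the paper.
toy_records = [
    (0.0, 70.0, 60.0), (0.0, 75.0, 65.0), (0.0, 80.0, 70.0),
    (3.0, 50.0, 68.0), (3.0, 55.0, 73.0), (3.0, 60.0, 78.0),
]

per_eps = correlation_by_eps(toy_records)   # correlation with robustness held fixed
pooled = pearson([r[1] for r in toy_records], [r[2] for r in toy_records])
```

With these placeholder values the within-group correlation is perfect while the pooled correlation is negative, which illustrates how mixing robustness levels can hide an accuracy-transfer relationship that holds at each fixed ε.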
Analysis & Discussion
4.2 Robust models improve with width
• Previous works find that although increasing network depth improves transfer performance, increasing
width hurts it.
• The results corroborate this trend for standard networks but indicate that it does not hold for robust
networks, at least in the regime of widths tested.
• As width increases, transfer performance plateaus and decreases for standard models, but continues to
steadily grow for robust models.
Not always!!
Analysis & Discussion
4.3 Optimal robustness levels for downstream tasks
• Although the best robust models often outperform the best standard models, the optimal choice of
robustness parameter ε varies widely between datasets. For example, when transferring to CIFAR-10
and CIFAR-100, the optimal ε values were 3.0 and 1.0, respectively.
• In contrast, smaller values of ε (smaller by an order of magnitude) tend to work better for the rest of the
datasets.
• One possible explanation for this variability in the optimal choice of ε might relate to dataset granularity.
• Although we lack a quantitative notion of granularity (in reality, features are not simply singular pixels),
we consider image resolution as a crude proxy.
Analysis & Discussion
4.3 Optimal robustness levels for downstream tasks
• Since target datasets are scaled to match ImageNet dimensions, each pixel in a low-resolution dataset
(e.g., CIFAR-10) image translates into several pixels in transfer, thus inflating the dataset's separability.
• The authors attempt to calibrate the granularities of the 12 image classification datasets used in this work by first
downscaling all the images to the size of CIFAR-10 (32×32) and then upscaling them to ImageNet size
once more.
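The downscale-then-upscale calibration can be sketched on a small grayscale image stored as nested lists. The resampling methods are assumptions made for illustration (block averaging for downscaling, nearest-neighbor repetition for upscaling); the slides do not state which interpolation was used. The helper names are hypothetical.

```python
def downscale_mean(img, factor):
    """Downscale a 2-D image by averaging non-overlapping factor x factor blocks.

    Assumes both image dimensions are divisible by `factor`.
    """
    h, w = len(img), len(img[0])
    return [
        [
            sum(img[r + dr][c + dc] for dr in range(factor) for dc in range(factor))
            / factor ** 2
            for c in range(0, w, factor)
        ]
        for r in range(0, h, factor)
    ]


def upscale_nearest(img, factor):
    """Upscale by repeating each pixel into a factor x factor block."""
    return [
        [px for px in row for _ in range(factor)]
        for row in img
        for _ in range(factor)
    ]


def calibrate_granularity(img, factor):
    """Discard fine detail by downscaling, then return to the original size."""
    return upscale_nearest(downscale_mean(img, factor), factor)


# Toy 4x4 image (made up); a factor of 2 reduces it to 2x2 and back to 4x4.
image = [
    [0, 2, 4, 6],
    [2, 4, 6, 8],
    [8, 8, 0, 0],
    [8, 8, 0, 0],
]
calibrated = calibrate_granularity(image, 2)
```

The output has the original size, but every 2×2 block is constant, so it carries no more detail than the low-resolution version. Assuming the common 224×224 ImageNet input size, going to 32×32 and back corresponds to a factor of 7.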
Analysis & Discussion
4.3 Optimal robustness levels for downstream tasks
• After controlling for the original dataset dimensions, the datasets' ε vs. transfer accuracy curves all
behave almost identically to the CIFAR-10 and CIFAR-100 ones. (Similar results hold for full-network transfer.)
Analysis & Discussion
4.4 Comparing adversarial robustness to texture robustness
• Consider texture-invariant models, i.e., models trained on the texture-randomizing Stylized ImageNet
(SIN) dataset.
Analysis & Discussion
4.4 Comparing adversarial robustness to texture robustness
• Transfer learning from adversarially robust models outperforms transfer learning from texture-invariant
models on all considered datasets.
(Figure panels: Full-network, Fixed-feature)
Conclusion
• Propose using adversarially robust models for transfer learning.
• Compare transfer learning performance of robust and standard models on a suite of 12
classification tasks, object detection, and instance segmentation.
• Find that adversarially robust neural networks consistently match or improve upon the
performance of their standard counterparts, despite having lower ImageNet accuracy.
• Take a closer look at the behavior of adversarially robust networks, and study the interplay
between ImageNet accuracy, model width, robustness, and transfer performance.
Conclusion
• We can easily try these experiments ourselves! (https://github.com/Microsoft/robust-models-transfer)

More Related Content

What's hot

YOLOv4: optimal speed and accuracy of object detection review
YOLOv4: optimal speed and accuracy of object detection reviewYOLOv4: optimal speed and accuracy of object detection review
YOLOv4: optimal speed and accuracy of object detection review
LEE HOSEONG
 
Transfer Learning and Fine-tuning Deep Neural Networks
 Transfer Learning and Fine-tuning Deep Neural Networks Transfer Learning and Fine-tuning Deep Neural Networks
Transfer Learning and Fine-tuning Deep Neural Networks
PyData
 
Pelee: a real time object detection system on mobile devices Paper Review
Pelee: a real time object detection system on mobile devices Paper ReviewPelee: a real time object detection system on mobile devices Paper Review
Pelee: a real time object detection system on mobile devices Paper Review
LEE HOSEONG
 
Efficient de cvpr_2020_paper
Efficient de cvpr_2020_paperEfficient de cvpr_2020_paper
Efficient de cvpr_2020_paper
shanullah3
 
PR-330: How To Train Your ViT? Data, Augmentation, and Regularization in Visi...
PR-330: How To Train Your ViT? Data, Augmentation, and Regularization in Visi...PR-330: How To Train Your ViT? Data, Augmentation, and Regularization in Visi...
PR-330: How To Train Your ViT? Data, Augmentation, and Regularization in Visi...
Jinwon Lee
 
PR-231: A Simple Framework for Contrastive Learning of Visual Representations
PR-231: A Simple Framework for Contrastive Learning of Visual RepresentationsPR-231: A Simple Framework for Contrastive Learning of Visual Representations
PR-231: A Simple Framework for Contrastive Learning of Visual Representations
Jinwon Lee
 
PR-297: Training data-efficient image transformers & distillation through att...
PR-297: Training data-efficient image transformers & distillation through att...PR-297: Training data-efficient image transformers & distillation through att...
PR-297: Training data-efficient image transformers & distillation through att...
Jinwon Lee
 
Deep Learning in Computer Vision
Deep Learning in Computer VisionDeep Learning in Computer Vision
Deep Learning in Computer Vision
Sungjoon Choi
 
“DNN Training Data: How to Know What You Need and How to Get It,” a Presentat...
“DNN Training Data: How to Know What You Need and How to Get It,” a Presentat...“DNN Training Data: How to Know What You Need and How to Get It,” a Presentat...
“DNN Training Data: How to Know What You Need and How to Get It,” a Presentat...
Edge AI and Vision Alliance
 
PR-270: PP-YOLO: An Effective and Efficient Implementation of Object Detector
PR-270: PP-YOLO: An Effective and Efficient Implementation of Object DetectorPR-270: PP-YOLO: An Effective and Efficient Implementation of Object Detector
PR-270: PP-YOLO: An Effective and Efficient Implementation of Object Detector
Jinwon Lee
 
CNN Quantization
CNN QuantizationCNN Quantization
CNN Quantization
Emanuele Ghelfi
 
PR-217: EfficientDet: Scalable and Efficient Object Detection
PR-217: EfficientDet: Scalable and Efficient Object DetectionPR-217: EfficientDet: Scalable and Efficient Object Detection
PR-217: EfficientDet: Scalable and Efficient Object Detection
Jinwon Lee
 
[PR12] Inception and Xception - Jaejun Yoo
[PR12] Inception and Xception - Jaejun Yoo[PR12] Inception and Xception - Jaejun Yoo
[PR12] Inception and Xception - Jaejun Yoo
JaeJun Yoo
 
A beginner's guide to Style Transfer and recent trends
A beginner's guide to Style Transfer and recent trendsA beginner's guide to Style Transfer and recent trends
A beginner's guide to Style Transfer and recent trends
JaeJun Yoo
 
Learning deep features for discriminative localization
Learning deep features for discriminative localizationLearning deep features for discriminative localization
Learning deep features for discriminative localization
太一郎 遠藤
 
Face Recognition: From Scratch To Hatch
Face Recognition: From Scratch To HatchFace Recognition: From Scratch To Hatch
Face Recognition: From Scratch To Hatch
Eduard Tyantov
 
"Quantizing Deep Networks for Efficient Inference at the Edge," a Presentatio...
"Quantizing Deep Networks for Efficient Inference at the Edge," a Presentatio..."Quantizing Deep Networks for Efficient Inference at the Edge," a Presentatio...
"Quantizing Deep Networks for Efficient Inference at the Edge," a Presentatio...
Edge AI and Vision Alliance
 
Task Adaptive Neural Network Search with Meta-Contrastive Learning
Task Adaptive Neural Network Search with Meta-Contrastive LearningTask Adaptive Neural Network Search with Meta-Contrastive Learning
Task Adaptive Neural Network Search with Meta-Contrastive Learning
MLAI2
 
Emerging Properties in Self-Supervised Vision Transformers
Emerging Properties in Self-Supervised Vision TransformersEmerging Properties in Self-Supervised Vision Transformers
Emerging Properties in Self-Supervised Vision Transformers
Sungchul Kim
 
Reward constrained interactive recommendation with natural language feedback ...
Reward constrained interactive recommendation with natural language feedback ...Reward constrained interactive recommendation with natural language feedback ...
Reward constrained interactive recommendation with natural language feedback ...
Jeong-Gwan Lee
 

What's hot (20)

YOLOv4: optimal speed and accuracy of object detection review
YOLOv4: optimal speed and accuracy of object detection reviewYOLOv4: optimal speed and accuracy of object detection review
YOLOv4: optimal speed and accuracy of object detection review
 
Transfer Learning and Fine-tuning Deep Neural Networks
 Transfer Learning and Fine-tuning Deep Neural Networks Transfer Learning and Fine-tuning Deep Neural Networks
Transfer Learning and Fine-tuning Deep Neural Networks
 
Pelee: a real time object detection system on mobile devices Paper Review
Pelee: a real time object detection system on mobile devices Paper ReviewPelee: a real time object detection system on mobile devices Paper Review
Pelee: a real time object detection system on mobile devices Paper Review
 
Efficient de cvpr_2020_paper
Efficient de cvpr_2020_paperEfficient de cvpr_2020_paper
Efficient de cvpr_2020_paper
 
PR-330: How To Train Your ViT? Data, Augmentation, and Regularization in Visi...
PR-330: How To Train Your ViT? Data, Augmentation, and Regularization in Visi...PR-330: How To Train Your ViT? Data, Augmentation, and Regularization in Visi...
PR-330: How To Train Your ViT? Data, Augmentation, and Regularization in Visi...
 
PR-231: A Simple Framework for Contrastive Learning of Visual Representations
PR-231: A Simple Framework for Contrastive Learning of Visual RepresentationsPR-231: A Simple Framework for Contrastive Learning of Visual Representations
PR-231: A Simple Framework for Contrastive Learning of Visual Representations
 
PR-297: Training data-efficient image transformers & distillation through att...
PR-297: Training data-efficient image transformers & distillation through att...PR-297: Training data-efficient image transformers & distillation through att...
PR-297: Training data-efficient image transformers & distillation through att...
 
Deep Learning in Computer Vision
Deep Learning in Computer VisionDeep Learning in Computer Vision
Deep Learning in Computer Vision
 
“DNN Training Data: How to Know What You Need and How to Get It,” a Presentat...
“DNN Training Data: How to Know What You Need and How to Get It,” a Presentat...“DNN Training Data: How to Know What You Need and How to Get It,” a Presentat...
“DNN Training Data: How to Know What You Need and How to Get It,” a Presentat...
 
PR-270: PP-YOLO: An Effective and Efficient Implementation of Object Detector
PR-270: PP-YOLO: An Effective and Efficient Implementation of Object DetectorPR-270: PP-YOLO: An Effective and Efficient Implementation of Object Detector
PR-270: PP-YOLO: An Effective and Efficient Implementation of Object Detector
 
CNN Quantization
CNN QuantizationCNN Quantization
CNN Quantization
 
PR-217: EfficientDet: Scalable and Efficient Object Detection
PR-217: EfficientDet: Scalable and Efficient Object DetectionPR-217: EfficientDet: Scalable and Efficient Object Detection
PR-217: EfficientDet: Scalable and Efficient Object Detection
 
[PR12] Inception and Xception - Jaejun Yoo
[PR12] Inception and Xception - Jaejun Yoo[PR12] Inception and Xception - Jaejun Yoo
[PR12] Inception and Xception - Jaejun Yoo
 
A beginner's guide to Style Transfer and recent trends
A beginner's guide to Style Transfer and recent trendsA beginner's guide to Style Transfer and recent trends
A beginner's guide to Style Transfer and recent trends
 
Learning deep features for discriminative localization
Learning deep features for discriminative localizationLearning deep features for discriminative localization
Learning deep features for discriminative localization
 
Face Recognition: From Scratch To Hatch
Face Recognition: From Scratch To HatchFace Recognition: From Scratch To Hatch
Face Recognition: From Scratch To Hatch
 
"Quantizing Deep Networks for Efficient Inference at the Edge," a Presentatio...
"Quantizing Deep Networks for Efficient Inference at the Edge," a Presentatio..."Quantizing Deep Networks for Efficient Inference at the Edge," a Presentatio...
"Quantizing Deep Networks for Efficient Inference at the Edge," a Presentatio...
 
Task Adaptive Neural Network Search with Meta-Contrastive Learning
Task Adaptive Neural Network Search with Meta-Contrastive LearningTask Adaptive Neural Network Search with Meta-Contrastive Learning
Task Adaptive Neural Network Search with Meta-Contrastive Learning
 
Emerging Properties in Self-Supervised Vision Transformers
Emerging Properties in Self-Supervised Vision TransformersEmerging Properties in Self-Supervised Vision Transformers
Emerging Properties in Self-Supervised Vision Transformers
 
Reward constrained interactive recommendation with natural language feedback ...
Reward constrained interactive recommendation with natural language feedback ...Reward constrained interactive recommendation with natural language feedback ...
Reward constrained interactive recommendation with natural language feedback ...
 

Similar to do adversarially robust image net models transfer better

How well do self-supervised models transfer.pptx
How well do self-supervised models transfer.pptxHow well do self-supervised models transfer.pptx
How well do self-supervised models transfer.pptx
ssuserbafbd0
 
Augmix review [cdm]
Augmix review [cdm]Augmix review [cdm]
Augmix review [cdm]
Dongmin Choi
 
How useful is self-supervised pretraining for Visual tasks?
How useful is self-supervised pretraining for Visual tasks?How useful is self-supervised pretraining for Visual tasks?
How useful is self-supervised pretraining for Visual tasks?
Seunghyun Hwang
 
Multi-modal sources for predictive modeling using deep learning
Multi-modal sources for predictive modeling using deep learningMulti-modal sources for predictive modeling using deep learning
Multi-modal sources for predictive modeling using deep learning
Sanghamitra Deb
 
ResNeSt: Split-Attention Networks
ResNeSt: Split-Attention NetworksResNeSt: Split-Attention Networks
ResNeSt: Split-Attention Networks
Seunghyun Hwang
 
Presentation of master thesis
Presentation of master thesisPresentation of master thesis
Presentation of master thesis
Seoung-Ho Choi
 
A Fully Progressive approach to Single image super-resolution
A Fully Progressive approach to Single image super-resolution A Fully Progressive approach to Single image super-resolution
A Fully Progressive approach to Single image super-resolution
Mohammed Ashour
 
Deep Learning Opening Workshop - Improving Generative Models - Junier Oliva, ...
Deep Learning Opening Workshop - Improving Generative Models - Junier Oliva, ...Deep Learning Opening Workshop - Improving Generative Models - Junier Oliva, ...
Deep Learning Opening Workshop - Improving Generative Models - Junier Oliva, ...
The Statistical and Applied Mathematical Sciences Institute
 
IEEE 2014 JAVA DATA MINING PROJECTS Mining weakly labeled web facial images f...
IEEE 2014 JAVA DATA MINING PROJECTS Mining weakly labeled web facial images f...IEEE 2014 JAVA DATA MINING PROJECTS Mining weakly labeled web facial images f...
IEEE 2014 JAVA DATA MINING PROJECTS Mining weakly labeled web facial images f...
IEEEFINALYEARSTUDENTPROJECTS
 
2014 IEEE JAVA DATA MINING PROJECT Mining weakly labeled web facial images fo...
2014 IEEE JAVA DATA MINING PROJECT Mining weakly labeled web facial images fo...2014 IEEE JAVA DATA MINING PROJECT Mining weakly labeled web facial images fo...
2014 IEEE JAVA DATA MINING PROJECT Mining weakly labeled web facial images fo...
IEEEMEMTECHSTUDENTSPROJECTS
 
2014 IEEE JAVA DATA MINING PROJECT Mining weakly labeled web facial images fo...
2014 IEEE JAVA DATA MINING PROJECT Mining weakly labeled web facial images fo...2014 IEEE JAVA DATA MINING PROJECT Mining weakly labeled web facial images fo...
2014 IEEE JAVA DATA MINING PROJECT Mining weakly labeled web facial images fo...
IEEEFINALYEARSTUDENTPROJECT
 
Inception v4 vs Inception Resnet v2.pdf
Inception v4 vs Inception Resnet v2.pdfInception v4 vs Inception Resnet v2.pdf
Inception v4 vs Inception Resnet v2.pdf
ChauVVan
 
PR-169: EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks
PR-169: EfficientNet: Rethinking Model Scaling for Convolutional Neural NetworksPR-169: EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks
PR-169: EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks
Jinwon Lee
 
Parn pyramidal+affine+regression+networks+for+dense+semantic+correspondence
Parn pyramidal+affine+regression+networks+for+dense+semantic+correspondenceParn pyramidal+affine+regression+networks+for+dense+semantic+correspondence
Parn pyramidal+affine+regression+networks+for+dense+semantic+correspondence
NAVER Engineering
 
Presentation - Predicting Online Purchases Using Conversion Prediction Modeli...
Presentation - Predicting Online Purchases Using Conversion Prediction Modeli...Presentation - Predicting Online Purchases Using Conversion Prediction Modeli...
Presentation - Predicting Online Purchases Using Conversion Prediction Modeli...Christopher Sneed, MSDS, PMP, CSPO
 
Computer Vision for Beginners
Computer Vision for BeginnersComputer Vision for Beginners
Computer Vision for Beginners
Sanghamitra Deb
 
How much position information do convolutional neural networks encode? review...
How much position information do convolutional neural networks encode? review...How much position information do convolutional neural networks encode? review...
How much position information do convolutional neural networks encode? review...
Dongmin Choi
 
Graph convolutional neural networks for web-scale recommender systems.pptx
Graph convolutional neural networks for web-scale recommender systems.pptxGraph convolutional neural networks for web-scale recommender systems.pptx
Graph convolutional neural networks for web-scale recommender systems.pptx
ssuser2624f71
 
The Analytics Frontier of the Hadoop Eco-System
The Analytics Frontier of the Hadoop Eco-SystemThe Analytics Frontier of the Hadoop Eco-System
The Analytics Frontier of the Hadoop Eco-System
inside-BigData.com
 
Saptashwa_Mitra_Sitakanta_Mishra_Final_Project_Report
Saptashwa_Mitra_Sitakanta_Mishra_Final_Project_ReportSaptashwa_Mitra_Sitakanta_Mishra_Final_Project_Report
Saptashwa_Mitra_Sitakanta_Mishra_Final_Project_ReportSitakanta Mishra
 

Similar to do adversarially robust image net models transfer better (20)

How well do self-supervised models transfer.pptx
How well do self-supervised models transfer.pptxHow well do self-supervised models transfer.pptx
How well do self-supervised models transfer.pptx
 
Augmix review [cdm]
Augmix review [cdm]Augmix review [cdm]
Augmix review [cdm]
 
How useful is self-supervised pretraining for Visual tasks?
How useful is self-supervised pretraining for Visual tasks?How useful is self-supervised pretraining for Visual tasks?
How useful is self-supervised pretraining for Visual tasks?
 
Multi-modal sources for predictive modeling using deep learning
Multi-modal sources for predictive modeling using deep learningMulti-modal sources for predictive modeling using deep learning
Multi-modal sources for predictive modeling using deep learning
 
ResNeSt: Split-Attention Networks
ResNeSt: Split-Attention NetworksResNeSt: Split-Attention Networks
ResNeSt: Split-Attention Networks
 
Presentation of master thesis
Presentation of master thesisPresentation of master thesis
Presentation of master thesis
 
A Fully Progressive approach to Single image super-resolution
A Fully Progressive approach to Single image super-resolution A Fully Progressive approach to Single image super-resolution
A Fully Progressive approach to Single image super-resolution
 
Do Adversarially Robust ImageNet Models Transfer Better?

  • 1. 2020/11/29 Ho Seong Lee (hoya012) Cognex Deep Learning Lab Research Engineer PR-290 | Do Adversarially Robust ImageNet Models Transfer Better? 1
  • 2. Contents
      • Introduction
      • Related Work
      • Experiments
      • Analysis & Discussion
      • Conclusion
  • 3. Introduction
      Transfer learning is a widely used paradigm in deep learning (perhaps even the default).
      • Models pre-trained on standard datasets (e.g. ImageNet) can be efficiently adapted to downstream tasks.
      • Better pre-trained models yield better transfer results, suggesting that initial accuracy is a key aspect of transfer learning performance.
      Reference: “Do Better ImageNet Models Transfer Better?”, 2019 CVPR
  • 4. Related Works
      Transfer learning in various domains
      • Medical imaging
          • “Comparison of deep transfer learning strategies for digital pathology”, 2018 CVPRW
      • Language modeling
          • “Senteval: An evaluation toolkit for universal sentence representations”, 2018 arXiv
      • Object detection, segmentation
          • “Faster r-cnn: Towards real-time object detection with region proposal networks”, 2015 NIPS
          • “R-fcn: Object detection via region-based fully convolutional networks”, 2016 NIPS
          • “Speed/accuracy trade-offs for modern convolutional object detectors”, 2017 CVPR
          • “Deeplab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected crfs”, 2017 TPAMI
  • 5. Related Works
      Transfer learning with fine-tuning or frozen feature-based methods
      • “Analyzing the performance of multilayer neural networks for object recognition”, 2014 ECCV
      • “Return of the devil in the details: Delving deep into convolutional nets”, 2014 arXiv
      • “Rich feature hierarchies for accurate object detection and semantic segmentation”, 2014 CVPR
      • “How transferable are features in deep neural networks?”, 2014 NIPS
      • “Factors of transferability for a generic convnet representation”, 2015 TPAMI
      • “Bilinear cnn models for fine-grained visual recognition”, 2015 ICCV
      • “What makes ImageNet good for transfer learning?”, 2016 arXiv
      • “Best practices for fine-tuning visual classifiers to new domains”, 2016 ECCV
      → These works show that fine-tuning outperforms frozen feature-based methods.
  • 6. Related Works
      Adversarial robustness
      • “Towards deep learning models resistant to adversarial attacks”, 2018 ICLR
      • “Virtual adversarial training: a regularization method for supervised and semi-supervised learning”, 2018
      • “Provably robust deep learning via adversarially trained smoothed classifier”, 2019 NeurIPS
      • Many papers have studied the features learned by these robust networks and suggested that they improve upon those learned by standard networks.
      • On the other hand, prior studies have also identified theoretical and empirical tradeoffs between standard accuracy and adversarial robustness.
  • 7. Related Works
      Adversarial robustness and transfer learning
      • “Adversarially robust transfer learning”, 2019 arXiv
          • Transfer learning can increase downstream-task adversarial robustness.
      • “Adversarially-Trained Deep Nets Transfer Better”, 2020 arXiv
          • Investigates the transfer performance of adversarially robust networks. → Very similar work!
      • The authors of the present paper study a larger set of downstream datasets and tasks and analyze the effects of model accuracy, model width, and data resolution.
  • 8. Experiments
      Motivation: Fixed-Feature Transfer Learning
      • The source model is used as a feature extractor for the target dataset; a simple (often linear) model is then trained on the resulting features.
      Reference: Stanford cs231n lecture note
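The fixed-feature recipe on this slide can be sketched in a few lines of plain Python. This is a toy illustration only: `frozen_features` is a hypothetical stand-in for a pre-trained backbone, the six-point dataset is made up, and the logistic-regression head and its hyperparameters are assumptions rather than the paper's setup.

```python
import math

def frozen_features(x):
    """Hypothetical stand-in for a pre-trained backbone; never updated."""
    return [math.tanh(x[0] + x[1]), math.tanh(x[0] - x[1])]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_linear_head(features, labels, lr=0.5, epochs=500):
    """Train a logistic-regression head (weights + bias) on fixed features."""
    w, b = [0.0] * len(features[0]), 0.0
    for _ in range(epochs):
        for f, y in zip(features, labels):
            p = sigmoid(sum(wi * fi for wi, fi in zip(w, f)) + b)
            g = p - y  # gradient of the log loss with respect to the logit
            w = [wi - lr * g * fi for wi, fi in zip(w, f)]
            b -= lr * g
    return w, b

def predict(w, b, f):
    return 1 if sum(wi * fi for wi, fi in zip(w, f)) + b > 0 else 0

# Toy target dataset (made up): the label is 1 when x0 + x1 > 0.
inputs = [(1.0, 0.5), (0.8, 0.1), (-1.0, -0.2), (-0.3, -0.9), (0.4, 0.6), (-0.7, -0.4)]
labels = [1, 1, 0, 0, 1, 0]

# Features are extracted once with the frozen model, then only the head is trained.
features = [frozen_features(x) for x in inputs]
w, b = fit_linear_head(features, labels)
predictions = [predict(w, b, f) for f in features]
```

Because the backbone is frozen, the features can be computed once and cached, which is what makes this setting cheap compared with fine-tuning.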
  • 9. Experiments
      How can we improve transfer learning?
      • Prior works suggest that accuracy on the source dataset is a strong indicator of performance on downstream tasks.
      • Still, it is unclear whether improving ImageNet accuracy is the only way to improve performance.
      • After all, the behavior of fixed-feature transfer is governed by models’ learned representations, which are not fully described by source-dataset accuracy.
      • These representations are, in turn, controlled by the priors that we put on them during training (architectural components, loss functions, augmentations, etc.).
  • 10. Experiments
      The adversarial robustness prior
      • Adversarial robustness refers to a model’s invariance to small (often imperceptible) perturbations of its inputs.
      • Robustness is typically induced at training time by replacing the standard empirical risk minimization objective with a robust optimization objective.
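The change of objective described on this slide is commonly written as below. This is the standard formulation from the adversarial training literature (for example the 2018 ICLR paper listed under Related Works); the symbols θ (model parameters), D (data distribution), L (loss), δ (perturbation) and the choice of an ℓ2 ball of radius ε are notation assumed here, not copied from the slide.

```latex
% Standard empirical risk minimization:
\min_{\theta} \; \mathbb{E}_{(x,y)\sim D}\big[\mathcal{L}(x, y; \theta)\big]

% Robust optimization: minimize the worst-case loss inside an \varepsilon-ball
\min_{\theta} \; \mathbb{E}_{(x,y)\sim D}\Big[\max_{\|\delta\|_2 \le \varepsilon} \mathcal{L}(x + \delta, y; \theta)\Big]
```

Setting ε = 0 recovers standard training, so ε acts as a single knob for how strong the robustness prior is.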
  • 11. Experiments
      Should adversarial robustness help fixed-feature transfer?
      • Adversarially robust models are known to be significantly less accurate than their standard counterparts.
      • This suggests that using adversarially robust feature representations should hurt transfer performance.
      • On the other hand, recent work has found that the feature representations of robust models carry several advantages over those of standard models.
      • For example, adversarially robust representations typically have better-behaved gradients and thus facilitate regularization-free feature visualization.
  • 12. Experiments
      Experiments – Fixed-Feature Transfer Learning
      • To resolve these two conflicting hypotheses (adversarially robust feature representations should hurt transfer performance vs. feature representations of robust models carry several advantages over those of standard models), the authors use a test bed of 12 standard transfer learning datasets.
      • They use four ResNet-based architectures (ResNet-18, ResNet-50, WideResNet-50-x2, WideResNet-50-x4).
      • The results indicate that robust networks consistently extract better features for transfer learning than standard networks.
  • 13. Experiments
      Experiments – Fixed-Feature Transfer Learning
      • To resolve these two conflicting hypotheses (adversarially robust feature representations should hurt transfer performance vs. feature representations of robust models carry several advantages over those of standard models), the authors use a test bed of 12 standard transfer learning datasets.
      • They use four ResNet-based architectures (ResNet-18, ResNet-50, WideResNet-50-x2, WideResNet-50-x4).
  • 14. Experiments
      Experiments – Full-Network Fine-Tuning
      • A more expensive but often better-performing transfer learning method uses the pre-trained model as a weight initialization rather than as a feature extractor.
      • In other words, all of the weights of the pre-trained model are updated (via gradient descent) to minimize loss on the target task.
      • Many previous works find that for standard models, performance on full-network transfer learning is highly correlated with performance on fixed-feature transfer learning.
      • The hope is that the findings of the last section (fixed-feature) also carry over to this setting (full-network).
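The difference between the two transfer settings comes down to which weights receive gradient updates. The sketch below makes that concrete on a deliberately tiny model, `y_hat = head_w * tanh(backbone_w * x)`; the model, the synthetic target data, the "pre-trained" value 0.5 and all hyperparameters are assumptions chosen for illustration, not anything from the paper.

```python
import math

def train(data, backbone_w, head_w, update_backbone, lr=0.05, steps=200):
    """Gradient descent on mean squared error for head_w * tanh(backbone_w * x)."""
    for _ in range(steps):
        grad_head = 0.0
        grad_backbone = 0.0
        for x, y in data:
            feat = math.tanh(backbone_w * x)
            err = head_w * feat - y
            grad_head += 2 * err * feat
            grad_backbone += 2 * err * head_w * (1 - feat ** 2) * x
        head_w -= lr * grad_head / len(data)
        if update_backbone:  # full-network fine-tuning also moves the backbone
            backbone_w -= lr * grad_backbone / len(data)
    return backbone_w, head_w

def mse(data, backbone_w, head_w):
    return sum((head_w * math.tanh(backbone_w * x) - y) ** 2 for x, y in data) / len(data)

# Synthetic target task (made up for this sketch).
data = [(x, 2.0 * math.tanh(1.5 * x)) for x in (-1.0, -0.5, 0.0, 0.5, 1.0)]
pretrained_w = 0.5  # hypothetical "pre-trained" backbone weight

# Fixed-feature transfer: only the head is trained.
fixed_w, fixed_head = train(data, pretrained_w, 0.0, update_backbone=False)
# Full-network fine-tuning: the pre-trained weight is only an initialization.
full_w, full_head = train(data, pretrained_w, 0.0, update_backbone=True)
```

In the fixed-feature run `fixed_w` stays at its pre-trained value, while in the full-network run `full_w` drifts away from it as the whole model adapts to the target task.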
  • 15. Experiments
      Experiments – Full-Network Fine-Tuning
      • Robust models match or improve on standard models in terms of transfer learning performance.
  • 16. Experiments
      Experiments – Full-Network Fine-Tuning
      • Also, adversarially robust networks consistently outperform standard networks in object detection and instance segmentation.
  • 17. Analysis & Discussion
      4.1 ImageNet accuracy and transfer performance
      • Take a closer look at the similarities and differences in transfer learning between robust networks and standard networks.
      • Hypothesis: robustness and accuracy have counteracting yet separate effects!
      • That is, higher accuracy improves transfer learning for a fixed level of robustness, and higher robustness improves transfer learning for a fixed level of accuracy.
      • The results (cf. Figure 5; similar results for full-network transfer in Appendix F) support this hypothesis.
      • The previously observed linear relationship between accuracy and transfer performance is often violated once the robustness aspect comes into play.
  • 18. Analysis & Discussion
  • 19. Analysis & Discussion
      4.1 ImageNet accuracy and transfer performance
      • In even more direct support of the hypothesis, the authors find that when the robustness level is held fixed, the accuracy-transfer correlation observed by prior works for standard models holds for robust models too.
      • These findings also indicate that accuracy is not a sufficient measure of feature quality or versatility.
  • 20. Analysis & Discussion
      4.2 Robust models improve with width
      • Previous works find that although increasing network depth improves transfer performance, increasing width hurts it.
      • The results corroborate this trend for standard networks but indicate that it does not hold for robust networks, at least in the regime of widths tested.
      • As width increases, transfer performance plateaus and decreases for standard models, but continues to steadily grow for robust models.
      Not always!!
  • 21. Analysis & Discussion
      4.2 Robust models improve with width
      • Previous works find that although increasing network depth improves transfer performance, increasing width hurts it.
      • The results corroborate this trend for standard networks but indicate that it does not hold for robust networks, at least in the regime of widths tested.
  • 22. Analysis & Discussion
      4.3 Optimal robustness levels for downstream tasks
      • Although the best robust models often outperform the best standard models, the optimal choice of robustness parameter ε varies widely between datasets. For example, when transferring to CIFAR-10 and CIFAR-100, the optimal ε values were 3.0 and 1.0, respectively.
      • In contrast, smaller values of ε (smaller by an order of magnitude) tend to work better for the rest of the datasets.
      • One possible explanation for this variability in the optimal choice of ε might relate to dataset granularity.
      • Although we lack a quantitative notion of granularity (in reality, features are not simply singular pixels), we consider image resolution as a crude proxy.
  • 23. Analysis & Discussion
      4.3 Optimal robustness levels for downstream tasks
      • Since target datasets are scaled to match ImageNet dimensions, each pixel in a low-resolution dataset (e.g., CIFAR-10) image translates into several pixels in transfer, thus inflating the dataset’s separability.
      • The authors attempt to calibrate the granularities of the 12 image classification datasets used in this work by first downscaling all the images to the size of CIFAR-10 (32×32), and then upscaling them to ImageNet size once more.
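The downscale-then-upscale calibration can be illustrated with a small, dependency-free sketch. The nearest-neighbour interpolation, the function names, and the default target size of 224 pixels are assumptions made for this example; the slide does not say which resizing method or exact ImageNet resolution was used.

```python
def resize_nearest(image, new_h, new_w):
    """Nearest-neighbour resize of a 2-D list of pixel values."""
    old_h, old_w = len(image), len(image[0])
    return [
        [image[r * old_h // new_h][c * old_w // new_w] for c in range(new_w)]
        for r in range(new_h)
    ]

def calibrate_granularity(image, low_res=32, target_res=224):
    """Downscale to low_res x low_res, then upscale to target_res x target_res."""
    small = resize_nearest(image, low_res, low_res)
    return resize_nearest(small, target_res, target_res)

# Tiny demonstration: an 8x8 "image" squeezed through a 2x2 bottleneck.
demo = [[r * 8 + c for c in range(8)] for r in range(8)]
out = calibrate_granularity(demo, low_res=2, target_res=8)
distinct_values = {v for row in out for v in row}
```

After the round trip the image has its original size, but it can contain at most `low_res * low_res` distinct pixel values, which is the sense in which every dataset is brought to the same granularity as CIFAR-10.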
  • 24. Analysis & Discussion
      4.3 Optimal robustness levels for downstream tasks
      • After controlling for original dataset dimension, the datasets’ ε vs. transfer accuracy curves all behave almost identically to the CIFAR-10 and CIFAR-100 ones. (Similar results hold for full-network transfer.)
  • 25. Analysis & Discussion
      4.4 Comparing adversarial robustness to texture robustness
      • Consider texture-invariant models, i.e., models trained on the texture-randomizing Stylized ImageNet (SIN) dataset.
  • 26. Analysis & Discussion
      4.4 Comparing adversarial robustness to texture robustness
      • Transfer learning from adversarially robust models outperforms transfer learning from texture-invariant models on all considered datasets.
      (Figure panels: Full-network, Fixed-feature)
  • 27. Conclusion
      • Propose using adversarially robust models for transfer learning.
      • Compare transfer learning performance of robust and standard models on a suite of 12 classification tasks, object detection, and instance segmentation.
      • Find that adversarially robust neural networks consistently match or improve upon the performance of their standard counterparts, despite having lower ImageNet accuracy.
      • Take a closer look at the behavior of adversarially robust networks, and study the interplay between ImageNet accuracy, model width, robustness, and transfer performance.
  • 28. Conclusion
      • We can simply try these experiments! (https://github.com/Microsoft/robust-models-transfer)