SlideShare a Scribd company logo
1 of 26
Download to read offline
PR-331
Sungchul Kim
Contents
1. Introduction & Related Work
2. (Conclusion)
3. Empirical Evaluation
4. Pitfalls and Limitations
5. Conclusion
2
Introduction & Related Work
3
• What is ‘Model Calibration’?
• The accuracy with which the scores provided by the model reflect its predictive uncertainty
• It is important in safety-critical applications, such as autonomous driving, medical diagnosis, forecasting
• On Calibration of Modern Neural Networks, ICML 2017. (PR-075)
https://arxiv.org/abs/1706.04599
PR-075: On Calibration of Modern Neural Networks (2017)
Introduction & Related Work
4
• On Calibration of Modern Neural Networks, ICML 2017. (PR-075)
• Expected Calibration Error (ECE)
https://arxiv.org/abs/1706.04599
PR-075: On Calibration of Modern Neural Networks (2017)
Video by Geoff Pleiss
Neural
Network
Input 𝑥
Logits 𝑧 𝑥
𝑒𝑧 𝑥
σ𝑖 𝑒𝑧𝑖 𝑥
Probability Ƹ
𝑝 𝑥
0 ≤ Ƹ
𝑝 𝑥 < 0.2 0.2 ≤ Ƹ
𝑝 𝑥 < 0.4 0.4 ≤ Ƹ
𝑝 𝑥 < 0.6 0.6 ≤ Ƹ
𝑝 𝑥 < 0.8 0.8 ≤ Ƹ
𝑝 𝑥 < 1
Introduction & Related Work
5
• On Calibration of Modern Neural Networks, ICML 2017. (PR-075)
• Expected Calibration Error (ECE)
0 ≤ Ƹ
𝑝 𝑥 < 0.2 0.2 ≤ Ƹ
𝑝 𝑥 < 0.4 0.4 ≤ Ƹ
𝑝 𝑥 < 0.6 0.6 ≤ Ƹ
𝑝 𝑥 < 0.8 0.8 ≤ Ƹ
𝑝 𝑥 < 1
X O O O
O
O
O
https://arxiv.org/abs/1706.04599
PR-075: On Calibration of Modern Neural Networks (2017)
Video by Geoff Pleiss
Avg. conf : 0.86
Accuracy : 0.86
Introduction & Related Work
6
• On Calibration of Modern Neural Networks, ICML 2017. (PR-075)
• Expected Calibration Error (ECE)
0 ≤ Ƹ
𝑝 𝑥 < 0.2 0.2 ≤ Ƹ
𝑝 𝑥 < 0.4 0.4 ≤ Ƹ
𝑝 𝑥 < 0.6 0.6 ≤ Ƹ
𝑝 𝑥 < 0.8 0.8 ≤ Ƹ
𝑝 𝑥 < 1
X O O
O
O
O
Avg. conf : 0.86
Accuracy : 0.71
Gap : 0.15
X
https://arxiv.org/abs/1706.04599
PR-075: On Calibration of Modern Neural Networks (2017)
Video by Geoff Pleiss
Introduction & Related Work
7
• On Calibration of Modern Neural Networks, ICML 2017. (PR-075)
• Expected Calibration Error (ECE)
0 ≤ Ƹ
𝑝 𝑥 < 0.2 0.2 ≤ Ƹ
𝑝 𝑥 < 0.4 0.4 ≤ Ƹ
𝑝 𝑥 < 0.6 0.6 ≤ Ƹ
𝑝 𝑥 < 0.8 0.8 ≤ Ƹ
𝑝 𝑥 < 1
X O O
O
O
O
0.86
0.71
7 x 0.15
X
https://arxiv.org/abs/1706.04599
PR-075: On Calibration of Modern Neural Networks (2017)
Video by Geoff Pleiss
X O
X
O
O
O
O
X
Avg. conf :
Accuracy :
Gap :
0.74
0.67
6 x 0.07
0.55
0.50
2 x 0.05
15
Expected Calibration Error (ECE) : 0.11 (Naeini et al. 2015)
Introduction & Related Work
8
• On Calibration of Modern Neural Networks, ICML 2017. (PR-075)
• Reliability Diagrams
acc 𝐵𝑚 =
1
𝐵𝑚
σ𝑖∈𝐵𝑚
1 ො
𝑦𝑖 = 𝑦𝑖 , conf 𝐵𝑚 =
1
𝐵𝑚
σ𝑖∈𝐵𝑚
Ƹ
𝑝𝑖
• Expected Calibration Error (ECE)
ECE = ෍
𝑚=1
𝑀
𝐵𝑚
𝑛
acc 𝐵𝑚 − conf 𝐵𝑚
• Negative log likelihood (NLL)
ℒ = − ෍
𝑖=1
𝑛
log ො
𝜋 𝑦𝑖|𝑥𝑖
https://arxiv.org/abs/1706.04599
PR-075: On Calibration of Modern Neural Networks (2017)
Video by Geoff Pleiss
Introduction & Related Work
9
• On Calibration of Modern Neural Networks, ICML 2017. (PR-075)
• Temperature Scaling
ො
𝑞𝑖 = max
𝑘
𝜎SM 𝑧𝑖/𝑇 (𝑘)
https://arxiv.org/abs/1706.04599
PR-075: On Calibration of Modern Neural Networks (2017)
Video by Geoff Pleiss
Introduction & Related Work
10
• Why should ‘Model Calibration’ be considered in safety-critical applications?
Video by Geoff Pleiss
Model : plastic bag (55% conf.)
Other sensors : person (90% conf.)
Model : cyst (55% conf.)
Clinician : HCC (90% conf.)
Introduction & Related Work
11
• Some works suggest a trend for larger, more accurate models to be worse calibrated.
• Recent researches have focused on improving accuracy rather than calibration.
• architecture size, amount of training data, and computing power
• Architecture : MLP-Mixer, ViT, …
• Training approaches : SimCLR, ResNext-WSL, CLIP, …
Q. Do past results on calibration extend to current state-of-the-art models?
https://arxiv.org/abs/1706.04599
Empirical Evaluation
12
• Experimental Setup
• Model families
• MLP-Mixer (PR-317)
• ViT (PR-281)
• BiT
• ResNeXt-WSL
• SimCLR (PR-231)
• CLIP
• AlexNet
https://arxiv.org/abs/2105.01601
https://arxiv.org/abs/2010.11929
https://arxiv.org/abs/1912.11370
https://arxiv.org/abs/1805.00932
https://arxiv.org/abs/2002.05709
https://arxiv.org/abs/2103.00020
https://papers.nips.cc/paper/2012/file/c399862d3b9d6b76c8436e924a68c45b-Paper.pdf
Empirical Evaluation
13
• Experimental Setup
• Datasets
• ImageNet
• ImageNetV2
• ImageNet-C (PR-277)
• ImageNet-R
• ImageNet-A
• ImageNet-21K
• JFT-300
https://arxiv.org/abs/1902.10811
https://arxiv.org/abs/1903.12261
https://arxiv.org/abs/2006.16241
https://arxiv.org/abs/1907.07174
https://ieeexplore.ieee.org/document/5206848
https://arxiv.org/abs/1707.02968
(Conclusion)
14
• Modern image models are well calibrated across distribution shift despite being designed with a focus on
accuracy.
• Model size and pretraining amount do not fully account for the performance differences between families.
• Architecture is a major determinant of calibration!
• MLP-Mixer and ViT (non-conv) are among the best-calibrated models both in-distribution and OOD.
• Further work on the influence of architecture inductive biases on calibration and OOD robustness will be
necessary to tell whether these results generalize.
Empirical Evaluation
15
• In-Distribution Calibration
• Several recent model families (MLP-Mixer, ViT, and BiT) are both highly accurate and well-calibrated.
• A zero-shot model, CLIP, is well-calibrated given its accuracy.
Empirical Evaluation
16
• In-Distribution Calibration
• Temperature scaling reveals consistent properties of model families.
• The poor calibration of past models can often be remedied by post-hoc recalibration. (temperature scaling)
→ Does a difference between models remain after recalibration?
→ The most recent architectures are better calibrated than past models even after temperature scaling.
• SimCLR or ResNeXt-WSL tend to be worse calibrated than BiT.
• MLP-Mixer and ViT (non-conv) can perform as well.
Empirical Evaluation
17
• In-Distribution Calibration
• Differences between families are not explained by model size or pretraining amount
• Disentangle how the differences between model families affect their calibration properties
• Model size
• Prior work : Larger neural networks are worse calibrated and have lower classification error.
• This study : ViT models are better calibrated than BiT models.
• Changing the size of BiT model cannot move it into the Pareto set of ViT models.
Empirical Evaluation
18
• In-Distribution Calibration
• Differences between families are not explained by model size or pretraining amount
• Disentangle how the differences between model families affect their calibration properties
• Model pretraining
• BiT models pretrained on ImageNet (1.3M), ImageNet-21K (12.8M), and JFT-300 (300M)
• It has no consistent effect on calibration. (ViT, MLP-Mixer < BiT < SimCLR)
• The amount of pretraining data and the duration of pretraining
Empirical Evaluation
19
• Accuracy and Calibration Under Distribution Shift
• ImageNet-C
Empirical Evaluation
20
• Accuracy and Calibration Under Distribution Shift
• What degree the insights from in-distribution and ImageNet-C calibration transfer to natural OOD data
• Previous : Better in-distribution calibration and accuracy do not predict better calibration under distribution shift.
• This study : The performance on several natural OOD datasets is largely consistent with that on ImageNet.
Empirical Evaluation
21
• Relating Accuracy and Calibration Within Model Families
• How to compare individual models within a family, where one model is more accurate but worse
calibrated, and the other is less accurate but better calibrated.
→ Which model should a practitioner choose for a safety-critical application?
• The cost structure of the specific application
• The scenario of selective prediction
• One can choose to ignore the model prediction (“abstain”) at a fixed cost if the prediction confidence is low, rather
than risking a (more costly) prediction error.
Empirical Evaluation
22
• Relating Accuracy and Calibration Within Model Families
• Total cost : a linear combination of the misclassification cost and abstention cost at a given cost ratio (x-
axis) and abstention rate (y-axis).
• Blue regions : The higher-accuracy model (R152x4) achieves a lower cost that the better-calibrated model (R50x1).
Pitfalls and Limitations
23
• There are two sources of bias
• From estimating ECE by binning : always negative
• From the finite sample size used to estimate the per-bin statistics : always positive
• A larger number of bins → fewer points per bin → higher variance in each bin → positive bias of ECE
Pitfalls and Limitations
24
• More formally,
• To estimate accuracy 𝐵𝑖 precisely, we need a number of samples inversely proportional to the standard
deviation 𝒑𝒊 𝟏 − 𝒑𝒊 / 𝑩𝒊 , where 𝑝𝑖 is the expected accuracy in 𝐵𝑖.
• Bias : models with extreme average accuracies (i.e. close to 0 or 1) < models with an accuracy close to 0.5
• If we estimate 𝔼 accuracy 𝐵𝑖 − 𝔼 confidence 𝐵𝑖
2
for any bin 𝑖 with 𝑛𝑖 samples
𝑏𝑖𝑎𝑠 =
1
𝑛𝑖
𝕍 𝐴 + 𝕍 𝐶 − 2Cov 𝐶, 𝐴
• 𝐴 = 𝑌 ∈ arg max 𝑓 𝑋 , 𝐶 = max 𝑓 𝑋
• Higher accuracy models have a lower bias not only due to higher accuracy (lower 𝕍 𝑨 ), but also because their
outputs correlate more with the correct label (higher covariance).
Conclusion
25
• Modern image models are well calibrated across distribution shift despite being designed with a focus on
accuracy.
• Model size and pretraining amount do not fully account for the performance differences between families.
• Architecture is a major determinant of calibration!
• MLP-Mixer and ViT (non-conv) are among the best-calibrated models both in-distribution and OOD.
• Further work on the influence of architecture inductive biases on calibration and OOD robustness will be
necessary to tell whether these results generalize.
Thank you!
26

More Related Content

What's hot

(文献紹介)デブラー手法の紹介
(文献紹介)デブラー手法の紹介(文献紹介)デブラー手法の紹介
(文献紹介)デブラー手法の紹介Morpho, Inc.
 
[DL輪読会]Grasping Field: Learning Implicit Representations for Human Grasps
[DL輪読会]Grasping Field: Learning Implicit Representations for  Human Grasps[DL輪読会]Grasping Field: Learning Implicit Representations for  Human Grasps
[DL輪読会]Grasping Field: Learning Implicit Representations for Human GraspsDeep Learning JP
 
Federated learning in brief
Federated learning in briefFederated learning in brief
Federated learning in briefShashi Perera
 
【DL輪読会】Can Neural Network Memorization Be Localized?
【DL輪読会】Can Neural Network Memorization Be Localized?【DL輪読会】Can Neural Network Memorization Be Localized?
【DL輪読会】Can Neural Network Memorization Be Localized?Deep Learning JP
 
Tutorial on Polynomial Networks at CVPR'22
Tutorial on Polynomial Networks at CVPR'22Tutorial on Polynomial Networks at CVPR'22
Tutorial on Polynomial Networks at CVPR'22Grigoris C
 
Learning loss for active learning
Learning loss for active learningLearning loss for active learning
Learning loss for active learningNAVER Engineering
 
Semantic Segmentation Methods using Deep Learning
Semantic Segmentation Methods using Deep LearningSemantic Segmentation Methods using Deep Learning
Semantic Segmentation Methods using Deep LearningSungjoon Choi
 
Transformers in Vision: From Zero to Hero
Transformers in Vision: From Zero to HeroTransformers in Vision: From Zero to Hero
Transformers in Vision: From Zero to HeroBill Liu
 
Deep Learning - A Literature survey
Deep Learning - A Literature surveyDeep Learning - A Literature survey
Deep Learning - A Literature surveyAkshay Hegde
 
Graph Representation Learning
Graph Representation LearningGraph Representation Learning
Graph Representation LearningJure Leskovec
 
【論文読み会】BEiT_BERT Pre-Training of Image Transformers.pptx
【論文読み会】BEiT_BERT Pre-Training of Image Transformers.pptx【論文読み会】BEiT_BERT Pre-Training of Image Transformers.pptx
【論文読み会】BEiT_BERT Pre-Training of Image Transformers.pptxARISE analytics
 
文献紹介:Image Segmentation Using Deep Learning: A Survey
文献紹介:Image Segmentation Using Deep Learning: A Survey文献紹介:Image Segmentation Using Deep Learning: A Survey
文献紹介:Image Segmentation Using Deep Learning: A SurveyToru Tamaki
 
Tutorial on Object Detection (Faster R-CNN)
Tutorial on Object Detection (Faster R-CNN)Tutorial on Object Detection (Faster R-CNN)
Tutorial on Object Detection (Faster R-CNN)Hwa Pyung Kim
 
【メタサーベイ】Neural Fields
【メタサーベイ】Neural Fields【メタサーベイ】Neural Fields
【メタサーベイ】Neural Fieldscvpaper. challenge
 
InfoGAN: Interpretable Representation Learning by Information Maximizing Gene...
InfoGAN: Interpretable Representation Learning by Information Maximizing Gene...InfoGAN: Interpretable Representation Learning by Information Maximizing Gene...
InfoGAN: Interpretable Representation Learning by Information Maximizing Gene...홍배 김
 
Uncertainty Estimation in Deep Learning
Uncertainty Estimation in Deep LearningUncertainty Estimation in Deep Learning
Uncertainty Estimation in Deep LearningChristian Perone
 
[DL輪読会]The Neural Process Family−Neural Processes関連の実装を読んで動かしてみる−
[DL輪読会]The Neural Process Family−Neural Processes関連の実装を読んで動かしてみる−[DL輪読会]The Neural Process Family−Neural Processes関連の実装を読んで動かしてみる−
[DL輪読会]The Neural Process Family−Neural Processes関連の実装を読んで動かしてみる−Deep Learning JP
 
Probabilistic face embeddings
Probabilistic face embeddingsProbabilistic face embeddings
Probabilistic face embeddingsKazuki Maeno
 

What's hot (20)

(文献紹介)デブラー手法の紹介
(文献紹介)デブラー手法の紹介(文献紹介)デブラー手法の紹介
(文献紹介)デブラー手法の紹介
 
[DL輪読会]Grasping Field: Learning Implicit Representations for Human Grasps
[DL輪読会]Grasping Field: Learning Implicit Representations for  Human Grasps[DL輪読会]Grasping Field: Learning Implicit Representations for  Human Grasps
[DL輪読会]Grasping Field: Learning Implicit Representations for Human Grasps
 
Federated learning in brief
Federated learning in briefFederated learning in brief
Federated learning in brief
 
【DL輪読会】Can Neural Network Memorization Be Localized?
【DL輪読会】Can Neural Network Memorization Be Localized?【DL輪読会】Can Neural Network Memorization Be Localized?
【DL輪読会】Can Neural Network Memorization Be Localized?
 
Tutorial on Polynomial Networks at CVPR'22
Tutorial on Polynomial Networks at CVPR'22Tutorial on Polynomial Networks at CVPR'22
Tutorial on Polynomial Networks at CVPR'22
 
Introduction to Transformer Model
Introduction to Transformer ModelIntroduction to Transformer Model
Introduction to Transformer Model
 
MIRU MIRU わかる GAN
MIRU MIRU わかる GANMIRU MIRU わかる GAN
MIRU MIRU わかる GAN
 
Learning loss for active learning
Learning loss for active learningLearning loss for active learning
Learning loss for active learning
 
Semantic Segmentation Methods using Deep Learning
Semantic Segmentation Methods using Deep LearningSemantic Segmentation Methods using Deep Learning
Semantic Segmentation Methods using Deep Learning
 
Transformers in Vision: From Zero to Hero
Transformers in Vision: From Zero to HeroTransformers in Vision: From Zero to Hero
Transformers in Vision: From Zero to Hero
 
Deep Learning - A Literature survey
Deep Learning - A Literature surveyDeep Learning - A Literature survey
Deep Learning - A Literature survey
 
Graph Representation Learning
Graph Representation LearningGraph Representation Learning
Graph Representation Learning
 
【論文読み会】BEiT_BERT Pre-Training of Image Transformers.pptx
【論文読み会】BEiT_BERT Pre-Training of Image Transformers.pptx【論文読み会】BEiT_BERT Pre-Training of Image Transformers.pptx
【論文読み会】BEiT_BERT Pre-Training of Image Transformers.pptx
 
文献紹介:Image Segmentation Using Deep Learning: A Survey
文献紹介:Image Segmentation Using Deep Learning: A Survey文献紹介:Image Segmentation Using Deep Learning: A Survey
文献紹介:Image Segmentation Using Deep Learning: A Survey
 
Tutorial on Object Detection (Faster R-CNN)
Tutorial on Object Detection (Faster R-CNN)Tutorial on Object Detection (Faster R-CNN)
Tutorial on Object Detection (Faster R-CNN)
 
【メタサーベイ】Neural Fields
【メタサーベイ】Neural Fields【メタサーベイ】Neural Fields
【メタサーベイ】Neural Fields
 
InfoGAN: Interpretable Representation Learning by Information Maximizing Gene...
InfoGAN: Interpretable Representation Learning by Information Maximizing Gene...InfoGAN: Interpretable Representation Learning by Information Maximizing Gene...
InfoGAN: Interpretable Representation Learning by Information Maximizing Gene...
 
Uncertainty Estimation in Deep Learning
Uncertainty Estimation in Deep LearningUncertainty Estimation in Deep Learning
Uncertainty Estimation in Deep Learning
 
[DL輪読会]The Neural Process Family−Neural Processes関連の実装を読んで動かしてみる−
[DL輪読会]The Neural Process Family−Neural Processes関連の実装を読んで動かしてみる−[DL輪読会]The Neural Process Family−Neural Processes関連の実装を読んで動かしてみる−
[DL輪読会]The Neural Process Family−Neural Processes関連の実装を読んで動かしてみる−
 
Probabilistic face embeddings
Probabilistic face embeddingsProbabilistic face embeddings
Probabilistic face embeddings
 

Similar to Revisiting the Calibration of Modern Neural Networks

深度學習在AOI的應用
深度學習在AOI的應用深度學習在AOI的應用
深度學習在AOI的應用CHENHuiMei
 
addressing tim/quality trade-off in view maintenance
addressing tim/quality trade-off in view maintenanceaddressing tim/quality trade-off in view maintenance
addressing tim/quality trade-off in view maintenanceSoheila Dehghanzadeh
 
Optimizing SPARQL Query Processing On Dynamic and Static Data Based on Query ...
Optimizing SPARQL Query Processing On Dynamic and Static Data Based on Query ...Optimizing SPARQL Query Processing On Dynamic and Static Data Based on Query ...
Optimizing SPARQL Query Processing On Dynamic and Static Data Based on Query ...Soheila Dehghanzadeh
 
Augmix review [cdm]
Augmix review [cdm]Augmix review [cdm]
Augmix review [cdm]Dongmin Choi
 
Bridging the Gap: Machine Learning for Ubiquitous Computing -- Evaluation
Bridging the Gap: Machine Learning for Ubiquitous Computing -- EvaluationBridging the Gap: Machine Learning for Ubiquitous Computing -- Evaluation
Bridging the Gap: Machine Learning for Ubiquitous Computing -- EvaluationThomas Ploetz
 
MVTec AD: A Comprehensive Real-World Dataset for Unsupervised Anomaly Detection
MVTec AD: A Comprehensive Real-World Dataset for Unsupervised Anomaly DetectionMVTec AD: A Comprehensive Real-World Dataset for Unsupervised Anomaly Detection
MVTec AD: A Comprehensive Real-World Dataset for Unsupervised Anomaly DetectionLEE HOSEONG
 
Hybrid predictive modelling of geometry with limited data in cold spray addit...
Hybrid predictive modelling of geometry with limited data in cold spray addit...Hybrid predictive modelling of geometry with limited data in cold spray addit...
Hybrid predictive modelling of geometry with limited data in cold spray addit...Daiki Ikeuchi
 
PR-330: How To Train Your ViT? Data, Augmentation, and Regularization in Visi...
PR-330: How To Train Your ViT? Data, Augmentation, and Regularization in Visi...PR-330: How To Train Your ViT? Data, Augmentation, and Regularization in Visi...
PR-330: How To Train Your ViT? Data, Augmentation, and Regularization in Visi...Jinwon Lee
 
Application of machine learning and cognitive computing in intrusion detectio...
Application of machine learning and cognitive computing in intrusion detectio...Application of machine learning and cognitive computing in intrusion detectio...
Application of machine learning and cognitive computing in intrusion detectio...Mahdi Hosseini Moghaddam
 
quality control STUDY ON 3 POLE MCCB MBA SIP report
quality control STUDY ON 3 POLE MCCB MBA SIP report quality control STUDY ON 3 POLE MCCB MBA SIP report
quality control STUDY ON 3 POLE MCCB MBA SIP report Akshay Nair
 
Mx net image segmentation to predict and diagnose the cardiac diseases karp...
Mx net image segmentation to predict and diagnose the cardiac diseases   karp...Mx net image segmentation to predict and diagnose the cardiac diseases   karp...
Mx net image segmentation to predict and diagnose the cardiac diseases karp...KannanRamasamy25
 
|QAB> : Quantum Computing, AI and Blockchain
|QAB> : Quantum Computing, AI and Blockchain|QAB> : Quantum Computing, AI and Blockchain
|QAB> : Quantum Computing, AI and BlockchainKan Yuenyong
 
Cvpr 2018 papers review (efficient computing)
Cvpr 2018 papers review (efficient computing)Cvpr 2018 papers review (efficient computing)
Cvpr 2018 papers review (efficient computing)DonghyunKang12
 
Extend Your Journey: Introducing Signal Strength into Location-based Applicat...
Extend Your Journey: Introducing Signal Strength into Location-based Applicat...Extend Your Journey: Introducing Signal Strength into Location-based Applicat...
Extend Your Journey: Introducing Signal Strength into Location-based Applicat...Chih-Chuan Cheng
 
Comparative Study of Pre-Trained Neural Network Models in Detection of Glaucoma
Comparative Study of Pre-Trained Neural Network Models in Detection of GlaucomaComparative Study of Pre-Trained Neural Network Models in Detection of Glaucoma
Comparative Study of Pre-Trained Neural Network Models in Detection of GlaucomaIRJET Journal
 
Blood Cell Image Classification for Detecting Malaria using CNN
Blood Cell Image Classification for Detecting Malaria using CNNBlood Cell Image Classification for Detecting Malaria using CNN
Blood Cell Image Classification for Detecting Malaria using CNNIRJET Journal
 

Similar to Revisiting the Calibration of Modern Neural Networks (20)

OOD_PPT.pptx
OOD_PPT.pptxOOD_PPT.pptx
OOD_PPT.pptx
 
深度學習在AOI的應用
深度學習在AOI的應用深度學習在AOI的應用
深度學習在AOI的應用
 
addressing tim/quality trade-off in view maintenance
addressing tim/quality trade-off in view maintenanceaddressing tim/quality trade-off in view maintenance
addressing tim/quality trade-off in view maintenance
 
Optimizing SPARQL Query Processing On Dynamic and Static Data Based on Query ...
Optimizing SPARQL Query Processing On Dynamic and Static Data Based on Query ...Optimizing SPARQL Query Processing On Dynamic and Static Data Based on Query ...
Optimizing SPARQL Query Processing On Dynamic and Static Data Based on Query ...
 
Augmix review [cdm]
Augmix review [cdm]Augmix review [cdm]
Augmix review [cdm]
 
Bridging the Gap: Machine Learning for Ubiquitous Computing -- Evaluation
Bridging the Gap: Machine Learning for Ubiquitous Computing -- EvaluationBridging the Gap: Machine Learning for Ubiquitous Computing -- Evaluation
Bridging the Gap: Machine Learning for Ubiquitous Computing -- Evaluation
 
MVTec AD: A Comprehensive Real-World Dataset for Unsupervised Anomaly Detection
MVTec AD: A Comprehensive Real-World Dataset for Unsupervised Anomaly DetectionMVTec AD: A Comprehensive Real-World Dataset for Unsupervised Anomaly Detection
MVTec AD: A Comprehensive Real-World Dataset for Unsupervised Anomaly Detection
 
Hybrid predictive modelling of geometry with limited data in cold spray addit...
Hybrid predictive modelling of geometry with limited data in cold spray addit...Hybrid predictive modelling of geometry with limited data in cold spray addit...
Hybrid predictive modelling of geometry with limited data in cold spray addit...
 
PR-330: How To Train Your ViT? Data, Augmentation, and Regularization in Visi...
PR-330: How To Train Your ViT? Data, Augmentation, and Regularization in Visi...PR-330: How To Train Your ViT? Data, Augmentation, and Regularization in Visi...
PR-330: How To Train Your ViT? Data, Augmentation, and Regularization in Visi...
 
Application of machine learning and cognitive computing in intrusion detectio...
Application of machine learning and cognitive computing in intrusion detectio...Application of machine learning and cognitive computing in intrusion detectio...
Application of machine learning and cognitive computing in intrusion detectio...
 
quality control STUDY ON 3 POLE MCCB MBA SIP report
quality control STUDY ON 3 POLE MCCB MBA SIP report quality control STUDY ON 3 POLE MCCB MBA SIP report
quality control STUDY ON 3 POLE MCCB MBA SIP report
 
Mx net image segmentation to predict and diagnose the cardiac diseases karp...
Mx net image segmentation to predict and diagnose the cardiac diseases   karp...Mx net image segmentation to predict and diagnose the cardiac diseases   karp...
Mx net image segmentation to predict and diagnose the cardiac diseases karp...
 
Generalization Ability of MOS Prediction Networks
Generalization Ability of MOS Prediction NetworksGeneralization Ability of MOS Prediction Networks
Generalization Ability of MOS Prediction Networks
 
|QAB> : Quantum Computing, AI and Blockchain
|QAB> : Quantum Computing, AI and Blockchain|QAB> : Quantum Computing, AI and Blockchain
|QAB> : Quantum Computing, AI and Blockchain
 
Cvpr 2018 papers review (efficient computing)
Cvpr 2018 papers review (efficient computing)Cvpr 2018 papers review (efficient computing)
Cvpr 2018 papers review (efficient computing)
 
Extend Your Journey: Introducing Signal Strength into Location-based Applicat...
Extend Your Journey: Introducing Signal Strength into Location-based Applicat...Extend Your Journey: Introducing Signal Strength into Location-based Applicat...
Extend Your Journey: Introducing Signal Strength into Location-based Applicat...
 
Poster_group22
Poster_group22Poster_group22
Poster_group22
 
Comparative Study of Pre-Trained Neural Network Models in Detection of Glaucoma
Comparative Study of Pre-Trained Neural Network Models in Detection of GlaucomaComparative Study of Pre-Trained Neural Network Models in Detection of Glaucoma
Comparative Study of Pre-Trained Neural Network Models in Detection of Glaucoma
 
Blood Cell Image Classification for Detecting Malaria using CNN
Blood Cell Image Classification for Detecting Malaria using CNNBlood Cell Image Classification for Detecting Malaria using CNN
Blood Cell Image Classification for Detecting Malaria using CNN
 
Seminar nov2017
Seminar nov2017Seminar nov2017
Seminar nov2017
 

More from Sungchul Kim

PR-343: Semi-Supervised Semantic Segmentation with Cross Pseudo Supervision
PR-343: Semi-Supervised Semantic Segmentation with Cross Pseudo SupervisionPR-343: Semi-Supervised Semantic Segmentation with Cross Pseudo Supervision
PR-343: Semi-Supervised Semantic Segmentation with Cross Pseudo SupervisionSungchul Kim
 
Emerging Properties in Self-Supervised Vision Transformers
Emerging Properties in Self-Supervised Vision TransformersEmerging Properties in Self-Supervised Vision Transformers
Emerging Properties in Self-Supervised Vision TransformersSungchul Kim
 
Score based Generative Modeling through Stochastic Differential Equations
Score based Generative Modeling through Stochastic Differential EquationsScore based Generative Modeling through Stochastic Differential Equations
Score based Generative Modeling through Stochastic Differential EquationsSungchul Kim
 
Exploring Simple Siamese Representation Learning
Exploring Simple Siamese Representation LearningExploring Simple Siamese Representation Learning
Exploring Simple Siamese Representation LearningSungchul Kim
 
Revisiting the Sibling Head in Object Detector
Revisiting the Sibling Head in Object DetectorRevisiting the Sibling Head in Object Detector
Revisiting the Sibling Head in Object DetectorSungchul Kim
 
Do Wide and Deep Networks Learn the Same Things: Uncovering How Neural Networ...
Do Wide and Deep Networks Learn the Same Things: Uncovering How Neural Networ...Do Wide and Deep Networks Learn the Same Things: Uncovering How Neural Networ...
Do Wide and Deep Networks Learn the Same Things: Uncovering How Neural Networ...Sungchul Kim
 
Deeplabv1, v2, v3, v3+
Deeplabv1, v2, v3, v3+Deeplabv1, v2, v3, v3+
Deeplabv1, v2, v3, v3+Sungchul Kim
 
Going Deeper with Convolutions
Going Deeper with ConvolutionsGoing Deeper with Convolutions
Going Deeper with ConvolutionsSungchul Kim
 
Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization
Grad-CAM: Visual Explanations from Deep Networks via Gradient-based LocalizationGrad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization
Grad-CAM: Visual Explanations from Deep Networks via Gradient-based LocalizationSungchul Kim
 
Generalized Intersection over Union: A Metric and A Loss for Bounding Box Reg...
Generalized Intersection over Union: A Metric and A Loss for Bounding Box Reg...Generalized Intersection over Union: A Metric and A Loss for Bounding Box Reg...
Generalized Intersection over Union: A Metric and A Loss for Bounding Box Reg...Sungchul Kim
 
Panoptic Segmentation
Panoptic SegmentationPanoptic Segmentation
Panoptic SegmentationSungchul Kim
 
On the Variance of the Adaptive Learning Rate and Beyond
On the Variance of the Adaptive Learning Rate and BeyondOn the Variance of the Adaptive Learning Rate and Beyond
On the Variance of the Adaptive Learning Rate and BeyondSungchul Kim
 
A Benchmark for Interpretability Methods in Deep Neural Networks
A Benchmark for Interpretability Methods in Deep Neural NetworksA Benchmark for Interpretability Methods in Deep Neural Networks
A Benchmark for Interpretability Methods in Deep Neural NetworksSungchul Kim
 
KDGAN: Knowledge Distillation with Generative Adversarial Networks
KDGAN: Knowledge Distillation with Generative Adversarial NetworksKDGAN: Knowledge Distillation with Generative Adversarial Networks
KDGAN: Knowledge Distillation with Generative Adversarial NetworksSungchul Kim
 
Designing Network Design Spaces
Designing Network Design SpacesDesigning Network Design Spaces
Designing Network Design SpacesSungchul Kim
 
Search to Distill: Pearls are Everywhere but not the Eyes
Search to Distill: Pearls are Everywhere but not the EyesSearch to Distill: Pearls are Everywhere but not the Eyes
Search to Distill: Pearls are Everywhere but not the EyesSungchul Kim
 
Supervised Constrastive Learning
Supervised Constrastive LearningSupervised Constrastive Learning
Supervised Constrastive LearningSungchul Kim
 
Bootstrap Your Own Latent: A New Approach to Self-Supervised Learning
Bootstrap Your Own Latent: A New Approach to Self-Supervised LearningBootstrap Your Own Latent: A New Approach to Self-Supervised Learning
Bootstrap Your Own Latent: A New Approach to Self-Supervised LearningSungchul Kim
 
FickleNet: Weakly and Semi-supervised Semantic Image Segmentation using Stoch...
FickleNet: Weakly and Semi-supervised Semantic Image Segmentation using Stoch...FickleNet: Weakly and Semi-supervised Semantic Image Segmentation using Stoch...
FickleNet: Weakly and Semi-supervised Semantic Image Segmentation using Stoch...Sungchul Kim
 
Regularizing Class-wise Predictions via Self-knowledge Distillation
Regularizing Class-wise Predictions via Self-knowledge DistillationRegularizing Class-wise Predictions via Self-knowledge Distillation
Regularizing Class-wise Predictions via Self-knowledge DistillationSungchul Kim
 

More from Sungchul Kim (20)

PR-343: Semi-Supervised Semantic Segmentation with Cross Pseudo Supervision
PR-343: Semi-Supervised Semantic Segmentation with Cross Pseudo SupervisionPR-343: Semi-Supervised Semantic Segmentation with Cross Pseudo Supervision
PR-343: Semi-Supervised Semantic Segmentation with Cross Pseudo Supervision
 
Emerging Properties in Self-Supervised Vision Transformers
Emerging Properties in Self-Supervised Vision TransformersEmerging Properties in Self-Supervised Vision Transformers
Emerging Properties in Self-Supervised Vision Transformers
 
Score based Generative Modeling through Stochastic Differential Equations
Score based Generative Modeling through Stochastic Differential EquationsScore based Generative Modeling through Stochastic Differential Equations
Score based Generative Modeling through Stochastic Differential Equations
 
Exploring Simple Siamese Representation Learning
Exploring Simple Siamese Representation LearningExploring Simple Siamese Representation Learning
Exploring Simple Siamese Representation Learning
 
Revisiting the Sibling Head in Object Detector
Revisiting the Sibling Head in Object DetectorRevisiting the Sibling Head in Object Detector
Revisiting the Sibling Head in Object Detector
 
Do Wide and Deep Networks Learn the Same Things: Uncovering How Neural Networ...
Do Wide and Deep Networks Learn the Same Things: Uncovering How Neural Networ...Do Wide and Deep Networks Learn the Same Things: Uncovering How Neural Networ...
Do Wide and Deep Networks Learn the Same Things: Uncovering How Neural Networ...
 
Deeplabv1, v2, v3, v3+
Deeplabv1, v2, v3, v3+Deeplabv1, v2, v3, v3+
Deeplabv1, v2, v3, v3+
 
Going Deeper with Convolutions
Going Deeper with ConvolutionsGoing Deeper with Convolutions
Going Deeper with Convolutions
 
Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization
Grad-CAM: Visual Explanations from Deep Networks via Gradient-based LocalizationGrad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization
Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization
 
Generalized Intersection over Union: A Metric and A Loss for Bounding Box Reg...
Generalized Intersection over Union: A Metric and A Loss for Bounding Box Reg...Generalized Intersection over Union: A Metric and A Loss for Bounding Box Reg...
Generalized Intersection over Union: A Metric and A Loss for Bounding Box Reg...
 
Panoptic Segmentation
Panoptic SegmentationPanoptic Segmentation
Panoptic Segmentation
 
On the Variance of the Adaptive Learning Rate and Beyond
On the Variance of the Adaptive Learning Rate and BeyondOn the Variance of the Adaptive Learning Rate and Beyond
On the Variance of the Adaptive Learning Rate and Beyond
 
A Benchmark for Interpretability Methods in Deep Neural Networks
A Benchmark for Interpretability Methods in Deep Neural NetworksA Benchmark for Interpretability Methods in Deep Neural Networks
A Benchmark for Interpretability Methods in Deep Neural Networks
 
KDGAN: Knowledge Distillation with Generative Adversarial Networks
KDGAN: Knowledge Distillation with Generative Adversarial NetworksKDGAN: Knowledge Distillation with Generative Adversarial Networks
KDGAN: Knowledge Distillation with Generative Adversarial Networks
 
Designing Network Design Spaces
Designing Network Design SpacesDesigning Network Design Spaces
Designing Network Design Spaces
 
Search to Distill: Pearls are Everywhere but not the Eyes
Search to Distill: Pearls are Everywhere but not the EyesSearch to Distill: Pearls are Everywhere but not the Eyes
Search to Distill: Pearls are Everywhere but not the Eyes
 
Supervised Constrastive Learning
Supervised Constrastive LearningSupervised Constrastive Learning
Supervised Constrastive Learning
 
Bootstrap Your Own Latent: A New Approach to Self-Supervised Learning
Bootstrap Your Own Latent: A New Approach to Self-Supervised LearningBootstrap Your Own Latent: A New Approach to Self-Supervised Learning
Bootstrap Your Own Latent: A New Approach to Self-Supervised Learning
 
FickleNet: Weakly and Semi-supervised Semantic Image Segmentation using Stoch...
FickleNet: Weakly and Semi-supervised Semantic Image Segmentation using Stoch...FickleNet: Weakly and Semi-supervised Semantic Image Segmentation using Stoch...
FickleNet: Weakly and Semi-supervised Semantic Image Segmentation using Stoch...
 
Regularizing Class-wise Predictions via Self-knowledge Distillation
Regularizing Class-wise Predictions via Self-knowledge DistillationRegularizing Class-wise Predictions via Self-knowledge Distillation
Regularizing Class-wise Predictions via Self-knowledge Distillation
 

Recently uploaded

College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service NashikCollege Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service NashikCall Girls in Nagpur High Profile
 
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...Soham Mondal
 
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By  Team Orange (Dept. of Pharmacy)Software Development Life Cycle By  Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)Suman Mia
 
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Dr.Costas Sachpazis
 
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur High Profile
 
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...Call Girls in Nagpur High Profile
 
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur High Profile
 
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Serviceranjana rawat
 
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...roncy bisnoi
 
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service NashikCall Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service NashikCall Girls in Nagpur High Profile
 
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130Suhani Kapoor
 
KubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghlyKubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghlysanyuktamishra911
 
Extrusion Processes and Their Limitations
Extrusion Processes and Their LimitationsExtrusion Processes and Their Limitations
Extrusion Processes and Their Limitations120cr0395
 
AKTU Computer Networks notes --- Unit 3.pdf
AKTU Computer Networks notes ---  Unit 3.pdfAKTU Computer Networks notes ---  Unit 3.pdf
AKTU Computer Networks notes --- Unit 3.pdfankushspencer015
 
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...ranjana rawat
 
Introduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptxIntroduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptxupamatechverse
 
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...Christo Ananth
 
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...Call Girls in Nagpur High Profile
 
Java Programming :Event Handling(Types of Events)
Java Programming :Event Handling(Types of Events)Java Programming :Event Handling(Types of Events)
Java Programming :Event Handling(Types of Events)simmis5
 
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINEMANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINESIVASHANKAR N
 

Recently uploaded (20)

College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service NashikCollege Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
 
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
 
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By  Team Orange (Dept. of Pharmacy)Software Development Life Cycle By  Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)
 
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
 
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
 
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
 
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
 
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
 
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...
 
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service NashikCall Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
 
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
 
KubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghlyKubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghly
 
Extrusion Processes and Their Limitations
Extrusion Processes and Their LimitationsExtrusion Processes and Their Limitations
Extrusion Processes and Their Limitations
 
AKTU Computer Networks notes --- Unit 3.pdf
AKTU Computer Networks notes ---  Unit 3.pdfAKTU Computer Networks notes ---  Unit 3.pdf
AKTU Computer Networks notes --- Unit 3.pdf
 
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
 
Introduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptxIntroduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptx
 
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
 
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...
 
Java Programming :Event Handling(Types of Events)
Java Programming :Event Handling(Types of Events)Java Programming :Event Handling(Types of Events)
Java Programming :Event Handling(Types of Events)
 
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINEMANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
 

Revisiting the Calibration of Modern Neural Networks

  • 2. Contents 1. Introduction & Related Work 2. (Conclusion) 3. Empirical Evaluation 4. Pitfalls and Limitations 5. Conclusion 2
  • 3. Introduction & Related Work 3 • What is ‘Model Calibration’? • The accuracy with which the scores provided by the model reflect its predictive uncertainty • It is important in safety-critical applications, such as autonomous driving, medical diagnosis, forecasting • On Calibration of Modern Neural Networks, ICML 2017. (PR-075) https://arxiv.org/abs/1706.04599 PR-075: On Calibration of Modern Neural Networks (2017)
  • 4. Introduction & Related Work 4 • On Calibration of Modern Neural Networks, ICML 2017. (PR-075) • Expected Calibration Error (ECE) https://arxiv.org/abs/1706.04599 PR-075: On Calibration of Modern Neural Networks (2017) Video by Geoff Pleiss Neural Network Input 𝑥 Logits 𝑧 𝑥 𝑒𝑧 𝑥 σ𝑖 𝑒𝑧𝑖 𝑥 Probability Ƹ 𝑝 𝑥 0 ≤ Ƹ 𝑝 𝑥 < 0.2 0.2 ≤ Ƹ 𝑝 𝑥 < 0.4 0.4 ≤ Ƹ 𝑝 𝑥 < 0.6 0.6 ≤ Ƹ 𝑝 𝑥 < 0.8 0.8 ≤ Ƹ 𝑝 𝑥 < 1
  • 5. Introduction & Related Work 5 • On Calibration of Modern Neural Networks, ICML 2017. (PR-075) • Expected Calibration Error (ECE) 0 ≤ Ƹ 𝑝 𝑥 < 0.2 0.2 ≤ Ƹ 𝑝 𝑥 < 0.4 0.4 ≤ Ƹ 𝑝 𝑥 < 0.6 0.6 ≤ Ƹ 𝑝 𝑥 < 0.8 0.8 ≤ Ƹ 𝑝 𝑥 < 1 X O O O O O O https://arxiv.org/abs/1706.04599 PR-075: On Calibration of Modern Neural Networks (2017) Video by Geoff Pleiss Avg. conf : 0.86 Accuracy : 0.86
  • 6. Introduction & Related Work 6 • On Calibration of Modern Neural Networks, ICML 2017. (PR-075) • Expected Calibration Error (ECE) 0 ≤ Ƹ 𝑝 𝑥 < 0.2 0.2 ≤ Ƹ 𝑝 𝑥 < 0.4 0.4 ≤ Ƹ 𝑝 𝑥 < 0.6 0.6 ≤ Ƹ 𝑝 𝑥 < 0.8 0.8 ≤ Ƹ 𝑝 𝑥 < 1 X O O O O O Avg. conf : 0.86 Accuracy : 0.71 Gap : 0.15 X https://arxiv.org/abs/1706.04599 PR-075: On Calibration of Modern Neural Networks (2017) Video by Geoff Pleiss
  • 7. Introduction & Related Work 7 • On Calibration of Modern Neural Networks, ICML 2017. (PR-075) • Expected Calibration Error (ECE) 0 ≤ Ƹ 𝑝 𝑥 < 0.2 0.2 ≤ Ƹ 𝑝 𝑥 < 0.4 0.4 ≤ Ƹ 𝑝 𝑥 < 0.6 0.6 ≤ Ƹ 𝑝 𝑥 < 0.8 0.8 ≤ Ƹ 𝑝 𝑥 < 1 X O O O O O 0.86 0.71 7 x 0.15 X https://arxiv.org/abs/1706.04599 PR-075: On Calibration of Modern Neural Networks (2017) Video by Geoff Pleiss X O X O O O O X Avg. conf : Accuracy : Gap : 0.74 0.67 6 x 0.07 0.55 0.50 2 x 0.05 15 Expected Calibration Error (ECE) : 0.11 (Naeini et al. 2015)
  • 8. Introduction & Related Work 8 • On Calibration of Modern Neural Networks, ICML 2017. (PR-075) • Reliability Diagrams acc 𝐵𝑚 = 1 𝐵𝑚 σ𝑖∈𝐵𝑚 1 ො 𝑦𝑖 = 𝑦𝑖 , conf 𝐵𝑚 = 1 𝐵𝑚 σ𝑖∈𝐵𝑚 Ƹ 𝑝𝑖 • Expected Calibration Error (ECE) ECE = ෍ 𝑚=1 𝑀 𝐵𝑚 𝑛 acc 𝐵𝑚 − conf 𝐵𝑚 • Negative log likelihood (NLL) ℒ = − ෍ 𝑖=1 𝑛 log ො 𝜋 𝑦𝑖|𝑥𝑖 https://arxiv.org/abs/1706.04599 PR-075: On Calibration of Modern Neural Networks (2017) Video by Geoff Pleiss
  • 9. Introduction & Related Work 9 • On Calibration of Modern Neural Networks, ICML 2017. (PR-075) • Temperature Scaling ො 𝑞𝑖 = max 𝑘 𝜎SM 𝑧𝑖/𝑇 (𝑘) https://arxiv.org/abs/1706.04599 PR-075: On Calibration of Modern Neural Networks (2017) Video by Geoff Pleiss
  • 10. Introduction & Related Work 10 • Why should ‘Model Calibration’ be considered in safety-critical applications? Video by Geoff Pleiss Model : plastic bag (55% conf.) Other sensors : person (90% conf.) Model : cyst (55% conf.) Clinician : HCC (90% conf.)
  • 11. Introduction & Related Work 11 • Some works suggest a trend for larger, more accurate models to be worse calibrated. • Recent researches have focused on improving accuracy rather than calibration. • architecture size, amount of training data, and computing power • Architecture : MLP-Mixer, ViT, … • Training approaches : SimCLR, ResNext-WSL, CLIP, … Q. Do past results on calibration extend to current state-of-the-art models? https://arxiv.org/abs/1706.04599
  • 12. Empirical Evaluation 12 • Experimental Setup • Model families • MLP-Mixer (PR-317) • ViT (PR-281) • BiT • ResNeXt-WSL • SimCLR (PR-231) • CLIP • AlexNet https://arxiv.org/abs/2105.01601 https://arxiv.org/abs/2010.11929 https://arxiv.org/abs/1912.11370 https://arxiv.org/abs/1805.00932 https://arxiv.org/abs/2002.05709 https://arxiv.org/abs/2103.00020 https://papers.nips.cc/paper/2012/file/c399862d3b9d6b76c8436e924a68c45b-Paper.pdf
  • 13. Empirical Evaluation 13 • Experimental Setup • Datasets • ImageNet • ImageNetV2 • ImageNet-C (PR-277) • ImageNet-R • ImageNet-A • ImageNet-21K • JFT-300 https://arxiv.org/abs/1902.10811 https://arxiv.org/abs/1903.12261 https://arxiv.org/abs/2006.16241 https://arxiv.org/abs/1907.07174 https://ieeexplore.ieee.org/document/5206848 https://arxiv.org/abs/1707.02968
  • 14. (Conclusion) 14 • Modern image models are well calibrated across distribution shift despite being designed with a focus on accuracy. • Model size and pretraining amount do not fully account for the performance differences between families. • Architecture is a major determinant of calibration! • MLP-Mixer and ViT (non-conv) are among the best-calibrated models both in-distribution and OOD. • Further work on the influence of architecture inductive biases on calibration and OOD robustness will be necessary to tell whether these results generalize.
  • 15. Empirical Evaluation 15 • In-Distribution Calibration • Several recent model families (MLP-Mixer, ViT, and BiT) are both highly accurate and well-calibrated. • A zero-shot model, CLIP, is well-calibrated given its accuracy.
  • 16. Empirical Evaluation 16 • In-Distribution Calibration • Temperature scaling reveals consistent properties of model families. • The poor calibration of past models can often be remedied by post-hoc recalibration. (temperature scaling) → Does a difference between models remain after recalibration? → The most recent architectures are better calibrated than past models even after temperature scaling. • SimCLR or ResNeXt-WSL tend to be worse calibrated than BiT. • MLP-Mixer and ViT (non-conv) can perform as well.
  • 17. Empirical Evaluation 17 • In-Distribution Calibration • Differences between families are not explained by model size or pretraining amount • Disentangle how the differences between model families affect their calibration properties • Model size • Prior work : Larger neural networks are worse calibrated and have lower classification error. • This study : ViT models are better calibrated than BiT models. • Changing the size of BiT model cannot move it into the Pareto set of ViT models.
  • 18. Empirical Evaluation 18 • In-Distribution Calibration • Differences between families are not explained by model size or pretraining amount • Disentangle how the differences between model families affect their calibration properties • Model pretraining • BiT models pretrained on ImageNet (1.3M), ImageNet-21K (12.8M), and JFT-300 (300M) • It has no consistent effect on calibration. (ViT, MLP-Mixer < BiT < SimCLR) • The amount of pretraining data and the duration of pretraining
  • 19. Empirical Evaluation 19 • Accuracy and Calibration Under Distribution Shift • ImageNet-C
  • 20. Empirical Evaluation 20 • Accuracy and Calibration Under Distribution Shift • What degree the insights from in-distribution and ImageNet-C calibration transfer to natural OOD data • Previous : Better in-distribution calibration and accuracy do not predict better calibration under distribution shift. • This study : The performance on several natural OOD datasets is largely consistent with that on ImageNet.
  • 21. Empirical Evaluation 21 • Relating Accuracy and Calibration Within Model Families • How to compare individual models within a family, where one model is more accurate but worse calibrated, and the other is less accurate but better calibrated. → Which model should a practitioner choose for a safety-critical application? • The cost structure of the specific application • The scenario of selective prediction • One can choose to ignore the model prediction (“abstain”) at a fixed cost if the prediction confidence is low, rather than risking a (more costly) prediction error.
  • 22. Empirical Evaluation 22 • Relating Accuracy and Calibration Within Model Families • Total cost : a linear combination of the misclassification cost and abstention cost at a given cost ratio (x- axis) and abstention rate (y-axis). • Blue regions : The higher-accuracy model (R152x4) achieves a lower cost that the better-calibrated model (R50x1).
  • 23. Pitfalls and Limitations 23 • There are two sources of bias • From estimating ECE by binning : always negative • From the finite sample size used to estimate the per-bin statistics : always positive • A larger number of bins → fewer points per bin → higher variance in each bin → positive bias of ECE
  • 24. Pitfalls and Limitations 24 • More formally, • To estimate accuracy 𝐵𝑖 precisely, we need a number of samples inversely proportional to the standard deviation 𝒑𝒊 𝟏 − 𝒑𝒊 / 𝑩𝒊 , where 𝑝𝑖 is the expected accuracy in 𝐵𝑖. • Bias : models with extreme average accuracies (i.e. close to 0 or 1) < models with an accuracy close to 0.5 • If we estimate 𝔼 accuracy 𝐵𝑖 − 𝔼 confidence 𝐵𝑖 2 for any bin 𝑖 with 𝑛𝑖 samples 𝑏𝑖𝑎𝑠 = 1 𝑛𝑖 𝕍 𝐴 + 𝕍 𝐶 − 2Cov 𝐶, 𝐴 • 𝐴 = 𝑌 ∈ arg max 𝑓 𝑋 , 𝐶 = max 𝑓 𝑋 • Higher accuracy models have a lower bias not only due to higher accuracy (lower 𝕍 𝑨 ), but also because their outputs correlate more with the correct label (higher covariance).
  • 25. Conclusion 25 • Modern image models are well calibrated across distribution shift despite being designed with a focus on accuracy. • Model size and pretraining amount do not fully account for the performance differences between families. • Architecture is a major determinant of calibration! • MLP-Mixer and ViT (non-conv) are among the best-calibrated models both in-distribution and OOD. • Further work on the influence of architecture inductive biases on calibration and OOD robustness will be necessary to tell whether these results generalize.