SlideShare a Scribd company logo
1 of 55
Download to read offline
Are you sure about that!?
Uncertainty Quantification in AI
Florian Wilhelm Berlin, October 10th 2019
2
Dr. Florian Wilhelm
Principal Data Scientist @ inovex
@FlorianWilhelm
FlorianWilhelm
florianwilhelm.info
Mathematical Modelling
Data Science to Production
Recommender Systems
Uncertainty Quantification & Causality
Python Data Stack
Maintainer PyScaffold
3
Simon Bachstein
Data Scientist @ inovex
2018/07 – 2019/01 Master Thesis at inovex:
Uncertainty Quantification in Deep Learning
• Blogpost:
http://inovex.de/blog/uncertainty-quantification-deep-learning
• Master Thesis:
https://sbachstein.de/master_thesis.pdf
@simonbachstein
sbachstein
sbachstein.de
1. Motivation
2. Methods
a. Gaussian Processes
b. Monte-Carlo Dropout
c. Deep Ensembles
d. Dropout Ensembles
e. Quantile Regression
3. Experiments
4. Conclusion & Outlook
Agenda
4
Deep Networks cannot look beyond their horizon
Motivation
5
90% cat
10% dog
Deep Networks cannot look beyond their horizon
Motivation
6
40% cat
60% dog
Deep Networks cannot look beyond their horizon
Motivation
7
?
Boult, T. E., Cruz, S., Dhamija, A., Gunther, M., Henrydoss, J., & Scheirer, W. (2019). Learning and the Unknown: Surveying
Steps Toward Open World Recognition. Aaai, 1–8. Retrieved from www.aaai.org
Learning and the Unknown
8
Simple Regression Problem
Interpolation
9
Simple Regression Problem
Deep Networks don’t extrapolate
Neural Arithmetic Logic Units, NIPS'18, Andrew Trask et. al.10
Simple Regression Problem
Deep Networks don’t extrapolate
11
Simple Regression Problem
Uncertainty about interpolation and extrapolation
12
Types of Uncertainty
13
1. Motivation
2. Methods
a. Gaussian Processes
b. Monte-Carlo Dropout
c. Deep Ensembles
d. Dropout Ensembles
e. Quantile Regression
3. Experiments
4. Conclusion & Outlook
Agenda
14
Methods for Uncertainty Quantification
16
Relaxation of mathematical assumptions about data
Gaussian
Processes
Deep Ensembles / Dropout
Ensembles
Quantile
Regression
Monte Carlo
Dropout
1. Motivation
2. Methods
a. Gaussian Processes
b. Monte-Carlo Dropout
c. Deep Ensembles
d. Dropout Ensembles
e. Quantile Regression
3. Experiments
4. Conclusion & Outlook
Agenda
17
A Gaussian Process can be thought of as a random function
which is defined by its mean and covariance functions
Gaussian Processes
18
Definition
Gaussian Processes
19
Example
Gaussian Processes
20
Inference
Gaussian Processes
21
Inference
Gaussian Processes
22
Inference with perfect interpolation
Gaussian Processes
23
Inference with noisy observations
Gaussian Processes
24
Inference
Inference using given data points can be done analytically. For
example, when assuming the (prior) mean function to be zero
everywhere, we get:
Good introduction:
Bayesian Non-parametric Models for Data Science using PyMC by Christopher Fonnesbeck
• https://www.youtube.com/watch?v=-sIOMs4MSuA
• https://de.slideshare.net/mlreview/bayesian-nonparametric-models-for-data-science-using-pymc
computationally intense
1. Motivation
2. Methods
a. Gaussian Processes
b. Monte-Carlo Dropout
c. Deep Ensembles
d. Dropout Ensembles
e. Quantile Regression
3. Experiments
4. Conclusion & Outlook
Agenda
25
MC Dropout
Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning, ICML 2016, Yarin Gal et. al.
26
...
1. Motivation
2. Methods
a. Gaussian Processes
b. Monte-Carlo Dropout
c. Deep Ensembles
d. Dropout Ensembles
e. Quantile Regression
3. Experiments
4. Conclusion & Outlook
Agenda
27
Deep Ensembles
Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles, NIPS 2017, Balaji Lakshminarayanan et. al28
Custom loss function:
Capture uncertainty directly at training time
Deep Ensembles
Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles, NIPS 2017, Balaji Lakshminarayanan et. al29
Combine an ensemble of networks
1. Motivation
2. Methods
a. Gaussian Processes
b. Monte-Carlo Dropout
c. Deep Ensembles
d. Dropout Ensembles
e. Quantile Regression
3. Experiments
4. Conclusion & Outlook
Agenda
30
Dropout Ensembles
31
The best of both worlds?
...
1. Motivation
2. Methods
a. Gaussian Processes
b. Monte-Carlo Dropout
c. Deep Ensembles
d. Dropout Ensembles
e. Quantile Regression
3. Experiments
4. Conclusion & Outlook
Agenda
32
Using the cumulative distribution function (cdf) of a random variable Y, we
define the quantile:
Loss function to estimate quantile:
Quantile Regression
33
Intuition behind Quantile Regression
34
0.0 0.1 0.2 0.90.3 0.4 0.5 0.6 0.7 0.8 1.0
Assume median is here (𝜏 = 0.5)
𝑦 > )𝑞+(𝑥)
𝑦 ≤ )𝑞+(𝑥)
0.1
0.2
0.5
0.8
Error: 0.0 + 1.6 = 1.6
Intuition behind Quantile Regression
35
0.0 0.1 0.2 0.90.3 0.4 0.5 0.6 0.7 0.8 1.0
Assume median is here (𝜏 = 0.5)
𝑦 > )𝑞+(𝑥)
𝑦 ≤ )𝑞+(𝑥)
0.1
0.4
0.7
Error: 0.0 + 1.2 = 1.2
0.0
Intuition behind Quantile Regression
36
0.0 0.1 0.2 0.90.3 0.4 0.5 0.6 0.7 0.8 1.0
Assume median is here (𝜏 = 0.5)
𝑦 > )𝑞+(𝑥)
𝑦 ≤ )𝑞+(𝑥)
0.1
0.3
0.6
Error: 0.1 + 0.9 = 1.0
0.0
Intuition behind Quantile Regression
37
0.0 0.1 0.2 0.90.3 0.4 0.5 0.6 0.7 0.8 1.0
Assume median is here (𝜏 = 0.5)
𝑦 > )𝑞+(𝑥)
𝑦 ≤ )𝑞+(𝑥)
0.2
0.2
0.5
Error: 0.3 + 0.7 = 1.0
0.1
No change due to the linearity of the error!
+0.1
+0.1
-0.1
-0.1
Now the 0.75th Quantile
38
0.0 0.1 0.2 0.90.3 0.4 0.5 0.6 0.7 0.8 1.0
Assume 𝜏 = 0.75 is here
𝑦 > )𝑞+(𝑥)
𝑦 ≤ )𝑞+(𝑥)
0.2
0.2
0.5
0.1
Error: (1 − 0.75) ⋅ 0.3 + 0.75 ⋅ 0.7 = 0.6
Right-side error weights 3 times as much as
the left-side error
Now the 0.75th Quantile
39
0.0 0.1 0.2 0.90.3 0.4 0.5 0.6 0.7 0.8 1.0
𝜏 = 0.75
𝑦 > )𝑞+(𝑥)
𝑦 ≤ )𝑞+(𝑥)
0.5
0.1
0.2
Error: (1 − 0.75) ⋅ 1.0 + 0.75 ⋅ 0.2 = 0.4
0.4
Change in the right-side error also weights
3 times as much as the left-side error
1. Motivation
2. Methods
a. Gaussian Processes
b. Monte-Carlo Dropout
c. Deep Ensembles
d. Dropout Ensembles
e. Quantile Regression
3. Experiments
4. Conclusion & Outlook
Agenda
40
According to the function
Samples are generated as follows:
Experiments
Uncertainty in Deep Learning (Phd thesis), Yarin Gal, http://mlg.eng.cam.ac.uk/yarin/blog_2248.html41
Dataset
Experiments
42
Dataset
Neural networks
› 2 hidden layers with 20 ReLU neurons each
› 5 networks for Deep Ensembles
› 100 iterations for Dropout predictions
› Adam optimizer with batch size of 128
› LR, weight decay, dropout probability are optimized
Gaussian Processes
› squared exponential covariance and zero mean function prior
› covariance function parameters and aleatory noise are optimized
Experiments
43
Network setup and hyperparameters
Mean squared error (MSE)
Mean negative log likelihood (MNLL)
Mean Kullback-Leibler (KL) divergence
Experiments
44
Measures for generalization quality
Experiments
45
Interpolation
Experiments
46
They still don’t extrapolate and they don’t quite realize
Experiments
47
Gaussian Process
Experiments
Convergence
48
Experiments
49
Heteroscedastic noise
Experiments
50
Non-Gaussian noise
Experiments
51
Uncertainty split
aleatoric epistemic
Summary
52
GP MCD DeepE DropoutE QR
Homoscedastic
noise
++ o + o o
Heteroscedastic
noise
-- - ++ + +
Non-Gaussian
noise
+ o + + -
Convergence ++ - + - +
Speed (--) + - / (+) + ++
Uncertainty split yes no yes yes no
1. Motivation
2. Methods
a. Gaussian Processes
b. Monte-Carlo Dropout
c. Deep Ensembles
d. Dropout Ensembles
e. Quantile Regression
3. Experiments
4. Conclusion & Outlook
Agenda
53
› Neural network approaches discussed here are very aware of
aleatory uncertainty, however, not capable of correctly estimating
epistemic uncertainty
› Gaussian Processes give clear signals about ignorance but do not
scale
A combined solution needs to be developed because
uncertainty estimation is needed in critical applications
Conclusion
54
There is work to be done
› Bayesian Neural Networks (e.g. with PyMC)
› Sparse Gaussian Process approximations
› Gaussian Processes on top of neural networks
Outlook
55
Other approaches
Thank You!
Florian Wilhelm
Principal Data Scientist
inovex GmbH
Schanzenstraße 6-20
Kupferhütte 1.13
51063 Köln
florian.wilhelm@inovex.de

More Related Content

What's hot

Bayes Classification
Bayes ClassificationBayes Classification
Bayes Classificationsathish sak
 
Kernels and Support Vector Machines
Kernels and Support Vector  MachinesKernels and Support Vector  Machines
Kernels and Support Vector MachinesEdgar Marca
 
Hands-On Machine Learning with Scikit-Learn and TensorFlow - Chapter8
Hands-On Machine Learning with Scikit-Learn and TensorFlow - Chapter8Hands-On Machine Learning with Scikit-Learn and TensorFlow - Chapter8
Hands-On Machine Learning with Scikit-Learn and TensorFlow - Chapter8Hakky St
 
Data mining: Concepts and Techniques, Chapter12 outlier Analysis
Data mining: Concepts and Techniques, Chapter12 outlier Analysis Data mining: Concepts and Techniques, Chapter12 outlier Analysis
Data mining: Concepts and Techniques, Chapter12 outlier Analysis Salah Amean
 
Introduction to Principle Component Analysis
Introduction to Principle Component AnalysisIntroduction to Principle Component Analysis
Introduction to Principle Component AnalysisSunjeet Jena
 
Introduction to Machine Learning Classifiers
Introduction to Machine Learning ClassifiersIntroduction to Machine Learning Classifiers
Introduction to Machine Learning ClassifiersFunctional Imperative
 
Deep Dive into Hyperparameter Tuning
Deep Dive into Hyperparameter TuningDeep Dive into Hyperparameter Tuning
Deep Dive into Hyperparameter TuningShubhmay Potdar
 
Bayesian Neural Networks
Bayesian Neural NetworksBayesian Neural Networks
Bayesian Neural NetworksNatan Katz
 
Cross validation
Cross validationCross validation
Cross validationRidhaAfrawe
 
Support vector machines (svm)
Support vector machines (svm)Support vector machines (svm)
Support vector machines (svm)Sharayu Patil
 
Introduction to Clustering algorithm
Introduction to Clustering algorithmIntroduction to Clustering algorithm
Introduction to Clustering algorithmhadifar
 
Introduction to Bayesian Statistics
Introduction to Bayesian StatisticsIntroduction to Bayesian Statistics
Introduction to Bayesian StatisticsPhilipp Singer
 
PCA (Principal component analysis)
PCA (Principal component analysis)PCA (Principal component analysis)
PCA (Principal component analysis)Learnbay Datascience
 

What's hot (20)

Bayes Classification
Bayes ClassificationBayes Classification
Bayes Classification
 
Kernels and Support Vector Machines
Kernels and Support Vector  MachinesKernels and Support Vector  Machines
Kernels and Support Vector Machines
 
Hands-On Machine Learning with Scikit-Learn and TensorFlow - Chapter8
Hands-On Machine Learning with Scikit-Learn and TensorFlow - Chapter8Hands-On Machine Learning with Scikit-Learn and TensorFlow - Chapter8
Hands-On Machine Learning with Scikit-Learn and TensorFlow - Chapter8
 
Data mining: Concepts and Techniques, Chapter12 outlier Analysis
Data mining: Concepts and Techniques, Chapter12 outlier Analysis Data mining: Concepts and Techniques, Chapter12 outlier Analysis
Data mining: Concepts and Techniques, Chapter12 outlier Analysis
 
Fuzzy c means manual work
Fuzzy c means manual workFuzzy c means manual work
Fuzzy c means manual work
 
Introduction to Principle Component Analysis
Introduction to Principle Component AnalysisIntroduction to Principle Component Analysis
Introduction to Principle Component Analysis
 
Introduction to Machine Learning Classifiers
Introduction to Machine Learning ClassifiersIntroduction to Machine Learning Classifiers
Introduction to Machine Learning Classifiers
 
Deep Dive into Hyperparameter Tuning
Deep Dive into Hyperparameter TuningDeep Dive into Hyperparameter Tuning
Deep Dive into Hyperparameter Tuning
 
Bayes Belief Networks
Bayes Belief NetworksBayes Belief Networks
Bayes Belief Networks
 
Fuzzy Clustering(C-means, K-means)
Fuzzy Clustering(C-means, K-means)Fuzzy Clustering(C-means, K-means)
Fuzzy Clustering(C-means, K-means)
 
Bayesian Neural Networks
Bayesian Neural NetworksBayesian Neural Networks
Bayesian Neural Networks
 
Cross validation
Cross validationCross validation
Cross validation
 
Support vector machines (svm)
Support vector machines (svm)Support vector machines (svm)
Support vector machines (svm)
 
Hierachical clustering
Hierachical clusteringHierachical clustering
Hierachical clustering
 
Introduction to Clustering algorithm
Introduction to Clustering algorithmIntroduction to Clustering algorithm
Introduction to Clustering algorithm
 
Shap
ShapShap
Shap
 
Introduction to Bayesian Statistics
Introduction to Bayesian StatisticsIntroduction to Bayesian Statistics
Introduction to Bayesian Statistics
 
Bayesian networks
Bayesian networksBayesian networks
Bayesian networks
 
PCA (Principal component analysis)
PCA (Principal component analysis)PCA (Principal component analysis)
PCA (Principal component analysis)
 
Principal Component Analysis
Principal Component AnalysisPrincipal Component Analysis
Principal Component Analysis
 

Similar to Uncertainty Quantification in AI

PPT - Deep and Confident Prediction For Time Series at Uber
PPT - Deep and Confident Prediction For Time Series at UberPPT - Deep and Confident Prediction For Time Series at Uber
PPT - Deep and Confident Prediction For Time Series at UberJisang Yoon
 
Data Mining the City - A (practical) introduction to Machine Learning
Data Mining the City - A (practical) introduction to Machine LearningData Mining the City - A (practical) introduction to Machine Learning
Data Mining the City - A (practical) introduction to Machine LearningDanil Nagy
 
Robustness Metrics for ML Models based on Deep Learning Methods
Robustness Metrics for ML Models based on Deep Learning MethodsRobustness Metrics for ML Models based on Deep Learning Methods
Robustness Metrics for ML Models based on Deep Learning MethodsData Science Milan
 
Discussion of Persi Diaconis' lecture at ISBA 2016
Discussion of Persi Diaconis' lecture at ISBA 2016Discussion of Persi Diaconis' lecture at ISBA 2016
Discussion of Persi Diaconis' lecture at ISBA 2016Christian Robert
 
November, 2006 CCKM'06 1
November, 2006 CCKM'06 1 November, 2006 CCKM'06 1
November, 2006 CCKM'06 1 butest
 
A practical Introduction to Machine(s) Learning
A practical Introduction to Machine(s) LearningA practical Introduction to Machine(s) Learning
A practical Introduction to Machine(s) LearningBruno Gonçalves
 
Experiments in genetic programming
Experiments in genetic programmingExperiments in genetic programming
Experiments in genetic programmingLars Marius Garshol
 
Deep Learning: concepts and use cases (October 2018)
Deep Learning: concepts and use cases (October 2018)Deep Learning: concepts and use cases (October 2018)
Deep Learning: concepts and use cases (October 2018)Julien SIMON
 
Practical Ai Class 3
Practical Ai Class 3Practical Ai Class 3
Practical Ai Class 3Oliver Zhang
 
Is ignorance bliss
Is ignorance blissIs ignorance bliss
Is ignorance blissStephen Senn
 
Machine learning in the life sciences with knime
Machine learning in the life sciences with knimeMachine learning in the life sciences with knime
Machine learning in the life sciences with knimeGreg Landrum
 
An Introduction to Deep Learning (May 2018)
An Introduction to Deep Learning (May 2018)An Introduction to Deep Learning (May 2018)
An Introduction to Deep Learning (May 2018)Julien SIMON
 
Machine learning for_finance
Machine learning for_financeMachine learning for_finance
Machine learning for_financeStefan Duprey
 
Maxwell W Libbrecht - pomegranate: fast and flexible probabilistic modeling i...
Maxwell W Libbrecht - pomegranate: fast and flexible probabilistic modeling i...Maxwell W Libbrecht - pomegranate: fast and flexible probabilistic modeling i...
Maxwell W Libbrecht - pomegranate: fast and flexible probabilistic modeling i...PyData
 
캡슐 네트워크를 이용한 엔드투엔드 음성 단어 인식, 배재성(KAIST 석사과정)
캡슐 네트워크를 이용한 엔드투엔드 음성 단어 인식, 배재성(KAIST 석사과정)캡슐 네트워크를 이용한 엔드투엔드 음성 단어 인식, 배재성(KAIST 석사과정)
캡슐 네트워크를 이용한 엔드투엔드 음성 단어 인식, 배재성(KAIST 석사과정)NAVER Engineering
 
Meetup Python Madrid 2018: ¿Segmentación semántica? ¿Pero de qué me estás hab...
Meetup Python Madrid 2018: ¿Segmentación semántica? ¿Pero de qué me estás hab...Meetup Python Madrid 2018: ¿Segmentación semántica? ¿Pero de qué me estás hab...
Meetup Python Madrid 2018: ¿Segmentación semántica? ¿Pero de qué me estás hab...Ricardo Guerrero Gómez-Olmedo
 
. An introduction to machine learning and probabilistic ...
. An introduction to machine learning and probabilistic .... An introduction to machine learning and probabilistic ...
. An introduction to machine learning and probabilistic ...butest
 
Data-driven hypothesis generation using deep neural nets
Data-driven hypothesis generation using deep neural netsData-driven hypothesis generation using deep neural nets
Data-driven hypothesis generation using deep neural netsBalázs Kégl
 
Deep Learning for Computer Vision - ExecutiveML
Deep Learning for Computer Vision - ExecutiveMLDeep Learning for Computer Vision - ExecutiveML
Deep Learning for Computer Vision - ExecutiveMLAlex Conway
 

Similar to Uncertainty Quantification in AI (20)

PPT - Deep and Confident Prediction For Time Series at Uber
PPT - Deep and Confident Prediction For Time Series at UberPPT - Deep and Confident Prediction For Time Series at Uber
PPT - Deep and Confident Prediction For Time Series at Uber
 
Data Mining the City - A (practical) introduction to Machine Learning
Data Mining the City - A (practical) introduction to Machine LearningData Mining the City - A (practical) introduction to Machine Learning
Data Mining the City - A (practical) introduction to Machine Learning
 
Robustness Metrics for ML Models based on Deep Learning Methods
Robustness Metrics for ML Models based on Deep Learning MethodsRobustness Metrics for ML Models based on Deep Learning Methods
Robustness Metrics for ML Models based on Deep Learning Methods
 
Discussion of Persi Diaconis' lecture at ISBA 2016
Discussion of Persi Diaconis' lecture at ISBA 2016Discussion of Persi Diaconis' lecture at ISBA 2016
Discussion of Persi Diaconis' lecture at ISBA 2016
 
November, 2006 CCKM'06 1
November, 2006 CCKM'06 1 November, 2006 CCKM'06 1
November, 2006 CCKM'06 1
 
A practical Introduction to Machine(s) Learning
A practical Introduction to Machine(s) LearningA practical Introduction to Machine(s) Learning
A practical Introduction to Machine(s) Learning
 
Experiments in genetic programming
Experiments in genetic programmingExperiments in genetic programming
Experiments in genetic programming
 
Deep Learning: concepts and use cases (October 2018)
Deep Learning: concepts and use cases (October 2018)Deep Learning: concepts and use cases (October 2018)
Deep Learning: concepts and use cases (October 2018)
 
Practical Ai Class 3
Practical Ai Class 3Practical Ai Class 3
Practical Ai Class 3
 
Is ignorance bliss
Is ignorance blissIs ignorance bliss
Is ignorance bliss
 
Machine learning in the life sciences with knime
Machine learning in the life sciences with knimeMachine learning in the life sciences with knime
Machine learning in the life sciences with knime
 
An Introduction to Deep Learning (May 2018)
An Introduction to Deep Learning (May 2018)An Introduction to Deep Learning (May 2018)
An Introduction to Deep Learning (May 2018)
 
Machine learning for_finance
Machine learning for_financeMachine learning for_finance
Machine learning for_finance
 
Maxwell W Libbrecht - pomegranate: fast and flexible probabilistic modeling i...
Maxwell W Libbrecht - pomegranate: fast and flexible probabilistic modeling i...Maxwell W Libbrecht - pomegranate: fast and flexible probabilistic modeling i...
Maxwell W Libbrecht - pomegranate: fast and flexible probabilistic modeling i...
 
캡슐 네트워크를 이용한 엔드투엔드 음성 단어 인식, 배재성(KAIST 석사과정)
캡슐 네트워크를 이용한 엔드투엔드 음성 단어 인식, 배재성(KAIST 석사과정)캡슐 네트워크를 이용한 엔드투엔드 음성 단어 인식, 배재성(KAIST 석사과정)
캡슐 네트워크를 이용한 엔드투엔드 음성 단어 인식, 배재성(KAIST 석사과정)
 
Meetup Python Madrid 2018: ¿Segmentación semántica? ¿Pero de qué me estás hab...
Meetup Python Madrid 2018: ¿Segmentación semántica? ¿Pero de qué me estás hab...Meetup Python Madrid 2018: ¿Segmentación semántica? ¿Pero de qué me estás hab...
Meetup Python Madrid 2018: ¿Segmentación semántica? ¿Pero de qué me estás hab...
 
Phylogenetics2
Phylogenetics2Phylogenetics2
Phylogenetics2
 
. An introduction to machine learning and probabilistic ...
. An introduction to machine learning and probabilistic .... An introduction to machine learning and probabilistic ...
. An introduction to machine learning and probabilistic ...
 
Data-driven hypothesis generation using deep neural nets
Data-driven hypothesis generation using deep neural netsData-driven hypothesis generation using deep neural nets
Data-driven hypothesis generation using deep neural nets
 
Deep Learning for Computer Vision - ExecutiveML
Deep Learning for Computer Vision - ExecutiveMLDeep Learning for Computer Vision - ExecutiveML
Deep Learning for Computer Vision - ExecutiveML
 

More from Florian Wilhelm

Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
Unlocking the Power of Integer Programming
Unlocking the Power of Integer ProgrammingUnlocking the Power of Integer Programming
Unlocking the Power of Integer ProgrammingFlorian Wilhelm
 
WALD: A Modern & Sustainable Analytics Stack
WALD: A Modern & Sustainable Analytics StackWALD: A Modern & Sustainable Analytics Stack
WALD: A Modern & Sustainable Analytics StackFlorian Wilhelm
 
Forget about AI and do Mathematical Modelling instead!
Forget about AI and do Mathematical Modelling instead!Forget about AI and do Mathematical Modelling instead!
Forget about AI and do Mathematical Modelling instead!Florian Wilhelm
 
An Interpretable Model for Collaborative Filtering Using an Extended Latent D...
An Interpretable Model for Collaborative Filtering Using an Extended Latent D...An Interpretable Model for Collaborative Filtering Using an Extended Latent D...
An Interpretable Model for Collaborative Filtering Using an Extended Latent D...Florian Wilhelm
 
Honey I Shrunk the Target Variable! Common pitfalls when transforming the tar...
Honey I Shrunk the Target Variable! Common pitfalls when transforming the tar...Honey I Shrunk the Target Variable! Common pitfalls when transforming the tar...
Honey I Shrunk the Target Variable! Common pitfalls when transforming the tar...Florian Wilhelm
 
Matrix Factorization for Collaborative Filtering Is Just Solving an Adjoint L...
Matrix Factorization for Collaborative Filtering Is Just Solving an Adjoint L...Matrix Factorization for Collaborative Filtering Is Just Solving an Adjoint L...
Matrix Factorization for Collaborative Filtering Is Just Solving an Adjoint L...Florian Wilhelm
 
Performance evaluation of GANs in a semisupervised OCR use case
Performance evaluation of GANs in a semisupervised OCR use casePerformance evaluation of GANs in a semisupervised OCR use case
Performance evaluation of GANs in a semisupervised OCR use caseFlorian Wilhelm
 
Bridging the Gap: from Data Science to Production
Bridging the Gap: from Data Science to ProductionBridging the Gap: from Data Science to Production
Bridging the Gap: from Data Science to ProductionFlorian Wilhelm
 
How mobile.de brings Data Science to Production for a Personalized Web Experi...
How mobile.de brings Data Science to Production for a Personalized Web Experi...How mobile.de brings Data Science to Production for a Personalized Web Experi...
How mobile.de brings Data Science to Production for a Personalized Web Experi...Florian Wilhelm
 
Deep Learning-based Recommendations for Germany's Biggest Vehicle Marketplace
Deep Learning-based Recommendations for Germany's Biggest Vehicle MarketplaceDeep Learning-based Recommendations for Germany's Biggest Vehicle Marketplace
Deep Learning-based Recommendations for Germany's Biggest Vehicle MarketplaceFlorian Wilhelm
 
Deep Learning-based Recommendations for Germany's Biggest Online Vehicle Mark...
Deep Learning-based Recommendations for Germany's Biggest Online Vehicle Mark...Deep Learning-based Recommendations for Germany's Biggest Online Vehicle Mark...
Deep Learning-based Recommendations for Germany's Biggest Online Vehicle Mark...Florian Wilhelm
 
Declarative Thinking and Programming
Declarative Thinking and ProgrammingDeclarative Thinking and Programming
Declarative Thinking and ProgrammingFlorian Wilhelm
 
Which car fits my life? - PyData Berlin 2017
Which car fits my life? - PyData Berlin 2017Which car fits my life? - PyData Berlin 2017
Which car fits my life? - PyData Berlin 2017Florian Wilhelm
 
PyData Meetup Berlin 2017-04-19
PyData Meetup Berlin 2017-04-19PyData Meetup Berlin 2017-04-19
PyData Meetup Berlin 2017-04-19Florian Wilhelm
 
Explaining the idea behind automatic relevance determination and bayesian int...
Explaining the idea behind automatic relevance determination and bayesian int...Explaining the idea behind automatic relevance determination and bayesian int...
Explaining the idea behind automatic relevance determination and bayesian int...Florian Wilhelm
 

More from Florian Wilhelm (16)

Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
Unlocking the Power of Integer Programming
Unlocking the Power of Integer ProgrammingUnlocking the Power of Integer Programming
Unlocking the Power of Integer Programming
 
WALD: A Modern & Sustainable Analytics Stack
WALD: A Modern & Sustainable Analytics StackWALD: A Modern & Sustainable Analytics Stack
WALD: A Modern & Sustainable Analytics Stack
 
Forget about AI and do Mathematical Modelling instead!
Forget about AI and do Mathematical Modelling instead!Forget about AI and do Mathematical Modelling instead!
Forget about AI and do Mathematical Modelling instead!
 
An Interpretable Model for Collaborative Filtering Using an Extended Latent D...
An Interpretable Model for Collaborative Filtering Using an Extended Latent D...An Interpretable Model for Collaborative Filtering Using an Extended Latent D...
An Interpretable Model for Collaborative Filtering Using an Extended Latent D...
 
Honey I Shrunk the Target Variable! Common pitfalls when transforming the tar...
Honey I Shrunk the Target Variable! Common pitfalls when transforming the tar...Honey I Shrunk the Target Variable! Common pitfalls when transforming the tar...
Honey I Shrunk the Target Variable! Common pitfalls when transforming the tar...
 
Matrix Factorization for Collaborative Filtering Is Just Solving an Adjoint L...
Matrix Factorization for Collaborative Filtering Is Just Solving an Adjoint L...Matrix Factorization for Collaborative Filtering Is Just Solving an Adjoint L...
Matrix Factorization for Collaborative Filtering Is Just Solving an Adjoint L...
 
Performance evaluation of GANs in a semisupervised OCR use case
Performance evaluation of GANs in a semisupervised OCR use casePerformance evaluation of GANs in a semisupervised OCR use case
Performance evaluation of GANs in a semisupervised OCR use case
 
Bridging the Gap: from Data Science to Production
Bridging the Gap: from Data Science to ProductionBridging the Gap: from Data Science to Production
Bridging the Gap: from Data Science to Production
 
How mobile.de brings Data Science to Production for a Personalized Web Experi...
How mobile.de brings Data Science to Production for a Personalized Web Experi...How mobile.de brings Data Science to Production for a Personalized Web Experi...
How mobile.de brings Data Science to Production for a Personalized Web Experi...
 
Deep Learning-based Recommendations for Germany's Biggest Vehicle Marketplace
Deep Learning-based Recommendations for Germany's Biggest Vehicle MarketplaceDeep Learning-based Recommendations for Germany's Biggest Vehicle Marketplace
Deep Learning-based Recommendations for Germany's Biggest Vehicle Marketplace
 
Deep Learning-based Recommendations for Germany's Biggest Online Vehicle Mark...
Deep Learning-based Recommendations for Germany's Biggest Online Vehicle Mark...Deep Learning-based Recommendations for Germany's Biggest Online Vehicle Mark...
Deep Learning-based Recommendations for Germany's Biggest Online Vehicle Mark...
 
Declarative Thinking and Programming
Declarative Thinking and ProgrammingDeclarative Thinking and Programming
Declarative Thinking and Programming
 
Which car fits my life? - PyData Berlin 2017
Which car fits my life? - PyData Berlin 2017Which car fits my life? - PyData Berlin 2017
Which car fits my life? - PyData Berlin 2017
 
PyData Meetup Berlin 2017-04-19
PyData Meetup Berlin 2017-04-19PyData Meetup Berlin 2017-04-19
PyData Meetup Berlin 2017-04-19
 
Explaining the idea behind automatic relevance determination and bayesian int...
Explaining the idea behind automatic relevance determination and bayesian int...Explaining the idea behind automatic relevance determination and bayesian int...
Explaining the idea behind automatic relevance determination and bayesian int...
 

Recently uploaded

Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Callshivangimorya083
 
Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxolyaivanovalion
 
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptxBPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptxMohammedJunaid861692
 
BigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxBigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxolyaivanovalion
 
Accredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdfAccredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdfadriantubila
 
100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptxAnupama Kate
 
Week-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionWeek-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionfulawalesam
 
Ravak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxRavak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxolyaivanovalion
 
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfMarket Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfRachmat Ramadhan H
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfMarinCaroMartnezBerg
 
BabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxBabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxolyaivanovalion
 
VidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxVidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxolyaivanovalion
 
Mature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxMature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxolyaivanovalion
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz1
 
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...amitlee9823
 
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Delhi Call girls
 

Recently uploaded (20)

(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
 
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptx
 
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptxBPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
 
BigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxBigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptx
 
Accredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdfAccredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdf
 
100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx
 
Week-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionWeek-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interaction
 
Ravak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxRavak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptx
 
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfMarket Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
 
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdf
 
Sampling (random) method and Non random.ppt
Sampling (random) method and Non random.pptSampling (random) method and Non random.ppt
Sampling (random) method and Non random.ppt
 
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in  KishangarhDelhi 99530 vip 56974 Genuine Escort Service Call Girls in  Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
 
BabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxBabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptx
 
VidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxVidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptx
 
Mature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxMature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptx
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signals
 
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
 
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
 

Uncertainty Quantification in AI

  • 1. Are you sure about that!? Uncertainty Quantification in AI Florian Wilhelm Berlin, October 10th 2019
  • 2. 2 Dr. Florian Wilhelm Principal Data Scientist @ inovex @FlorianWilhelm FlorianWilhelm florianwilhelm.info Mathematical Modelling Data Science to Production Recommender Systems Uncertainty Quantification & Causality Python Data Stack Maintainer PyScaffold
  • 3. 3 Simon Bachstein Data Scientist @ inovex 2018/07 – 2019/01 Master Thesis at inovex: Uncertainty Quantification in Deep Learning • Blogpost: http://inovex.de/blog/uncertainty-quantification-deep-learning • Master Thesis: https://sbachstein.de/master_thesis.pdf @simonbachstein sbachstein sbachstein.de
  • 4. 1. Motivation 2. Methods a. Gaussian Processes b. Monte-Carlo Dropout c. Deep Ensembles d. Dropout Ensembles e. Quantile Regression 3. Experiments 4. Conclusion & Outlook Agenda 4
  • 5. Deep Networks cannot look beyond their horizon Motivation 5 90% cat 10% dog
  • 6. Deep Networks cannot look beyond their horizon Motivation 6 40% cat 60% dog
  • 7. Deep Networks cannot look beyond their horizon Motivation 7 ?
  • 8. Boult, T. E., Cruz, S., Dhamija, A., Gunther, M., Henrydoss, J., & Scheirer, W. (2019). Learning and the Unknown: Surveying Steps Toward Open World Recognition. Aaai, 1–8. Retrieved from www.aaai.org Learning and the Unknown 8
  • 10. Simple Regression Problem Deep Networks don’t extrapolate Neural Arithmetic Logic Units, NIPS'18, Andrew Trask et. al.10
  • 11. Simple Regression Problem Deep Networks don’t extrapolate 11
  • 12. Simple Regression Problem Uncertainty about interpolation and extrapolation 12
  • 14. 1. Motivation 2. Methods a. Gaussian Processes b. Monte-Carlo Dropout c. Deep Ensembles d. Dropout Ensembles e. Quantile Regression 3. Experiments 4. Conclusion & Outlook Agenda 14
  • 15. Methods for Uncertainty Quantification 16 Relaxation of mathematical assumptions about data Gaussian Processes Deep Ensembles / Dropout Ensembles Quantile Regression Monte Carlo Dropout
  • 16. 1. Motivation 2. Methods a. Gaussian Processes b. Monte-Carlo Dropout c. Deep Ensembles d. Dropout Ensembles e. Quantile Regression 3. Experiments 4. Conclusion & Outlook Agenda 17
  • 17. A Gaussian Process can be thought of as a random function which is defined by its mean and covariance functions Gaussian Processes 18 Definition
  • 21. Gaussian Processes 22 Inference with perfect interpolation
  • 23. Gaussian Processes 24 Inference Inference using given data points can be done analytically. For example, when assuming the (prior) mean function to be zero everywhere, we get: Good introduction: Bayesian Non-parametric Models for Data Science using PyMC by Christopher Fonnesbeck • https://www.youtube.com/watch?v=-sIOMs4MSuA • https://de.slideshare.net/mlreview/bayesian-nonparametric-models-for-data-science-using-pymc computationally intense
  • 24. 1. Motivation 2. Methods a. Gaussian Processes b. Monte-Carlo Dropout c. Deep Ensembles d. Dropout Ensembles e. Quantile Regression 3. Experiments 4. Conclusion & Outlook Agenda 25
  • 25. MC Dropout Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning, ICML 2016, Yarin Gal et. al. 26 ...
  • 26. 1. Motivation 2. Methods a. Gaussian Processes b. Monte-Carlo Dropout c. Deep Ensembles d. Dropout Ensembles e. Quantile Regression 3. Experiments 4. Conclusion & Outlook Agenda 27
  • 27. Deep Ensembles Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles, NIPS 2017, Balaji Lakshminarayanan et. al28 Custom loss function: Capture uncertainty directly at training time
  • 28. Deep Ensembles Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles, NIPS 2017, Balaji Lakshminarayanan et. al29 Combine an ensemble of networks
  • 29. 1. Motivation 2. Methods a. Gaussian Processes b. Monte-Carlo Dropout c. Deep Ensembles d. Dropout Ensembles e. Quantile Regression 3. Experiments 4. Conclusion & Outlook Agenda 30
  • 30. Dropout Ensembles 31 The best of both worlds? ...
  • 31. 1. Motivation 2. Methods a. Gaussian Processes b. Monte-Carlo Dropout c. Deep Ensembles d. Dropout Ensembles e. Quantile Regression 3. Experiments 4. Conclusion & Outlook Agenda 32
  • 32. Using the cumulative distribution function (cdf) of a random variable Y, we define the quantile: Loss function to estimate quantile: Quantile Regression 33
  • 33. Intuition behind Quantile Regression 34 0.0 0.1 0.2 0.90.3 0.4 0.5 0.6 0.7 0.8 1.0 Assume median is here (𝜏 = 0.5) 𝑦 > )𝑞+(𝑥) 𝑦 ≤ )𝑞+(𝑥) 0.1 0.2 0.5 0.8 Error: 0.0 + 1.6 = 1.6
  • 34. Intuition behind Quantile Regression 35 0.0 0.1 0.2 0.90.3 0.4 0.5 0.6 0.7 0.8 1.0 Assume median is here (𝜏 = 0.5) 𝑦 > )𝑞+(𝑥) 𝑦 ≤ )𝑞+(𝑥) 0.1 0.4 0.7 Error: 0.0 + 1.2 = 1.2 0.0
  • 35. Intuition behind Quantile Regression 36 0.0 0.1 0.2 0.90.3 0.4 0.5 0.6 0.7 0.8 1.0 Assume median is here (𝜏 = 0.5) 𝑦 > )𝑞+(𝑥) 𝑦 ≤ )𝑞+(𝑥) 0.1 0.3 0.6 Error: 0.1 + 0.9 = 1.0 0.0
  • 36. Intuition behind Quantile Regression 37 0.0 0.1 0.2 0.90.3 0.4 0.5 0.6 0.7 0.8 1.0 Assume median is here (𝜏 = 0.5) 𝑦 > )𝑞+(𝑥) 𝑦 ≤ )𝑞+(𝑥) 0.2 0.2 0.5 Error: 0.3 + 0.7 = 1.0 0.1 No change due to the linearity of the error! +0.1 +0.1 -0.1 -0.1
  • 37. Now the 0.75th Quantile 38 0.0 0.1 0.2 0.90.3 0.4 0.5 0.6 0.7 0.8 1.0 Assume 𝜏 = 0.75 is here 𝑦 > )𝑞+(𝑥) 𝑦 ≤ )𝑞+(𝑥) 0.2 0.2 0.5 0.1 Error: (1 − 0.75) ⋅ 0.3 + 0.75 ⋅ 0.7 = 0.6 Right-side error weights 3 times as much as the left-side error
  • 38. Now the 0.75th Quantile 39 0.0 0.1 0.2 0.90.3 0.4 0.5 0.6 0.7 0.8 1.0 𝜏 = 0.75 𝑦 > )𝑞+(𝑥) 𝑦 ≤ )𝑞+(𝑥) 0.5 0.1 0.2 Error: (1 − 0.75) ⋅ 1.0 + 0.75 ⋅ 0.2 = 0.4 0.4 Change in the right-side error also weights 3 times as much as the left-side error
  • 39. 1. Motivation 2. Methods a. Gaussian Processes b. Monte-Carlo Dropout c. Deep Ensembles d. Dropout Ensembles e. Quantile Regression 3. Experiments 4. Conclusion & Outlook Agenda 40
  • 40. According to the function Samples are generated as follows: Experiments Uncertainty in Deep Learning (Phd thesis), Yarin Gal, http://mlg.eng.cam.ac.uk/yarin/blog_2248.html41 Dataset
  • 42. Neural networks › 2 hidden layers with 20 ReLU neurons each › 5 networks for Deep Ensembles › 100 iterations for Dropout predictions › Adam optimizer with batch size of 128 › LR, weight decay, dropout probability are optimized Gaussian Processes › squared exponential covariance and zero mean function prior › covariance function parameters and aleatory noise are optimized Experiments 43 Network setup and hyperparameters
  • 43. Mean squared error (MSE) Mean negative log likelihood (MNLL) Mean Kullback-Leibler (KL) divergence Experiments 44 Measures for generalization quality
  • 45. Experiments 46 They still don’t extrapolate and they don’t quite realize
  • 51. Summary 52 GP MCD DeepE DropoutE QR Homoscedastic noise ++ o + o o Heteroscedastic noise -- - ++ + + Non-Gaussian noise + o + + - Convergence ++ - + - + Speed (--) + - / (+) + ++ Uncertainty split yes no yes yes no
  • 52. 1. Motivation 2. Methods a. Gaussian Processes b. Monte-Carlo Dropout c. Deep Ensembles d. Dropout Ensembles e. Quantile Regression 3. Experiments 4. Conclusion & Outlook Agenda 53
  • 53. › Neural network approaches discussed here are very aware of aleatory uncertainty, however, not capable of correctly estimating epistemic uncertainty › Gaussian Processes give clear signals about ignorance but do not scale A combined solution needs to be developed because uncertainty estimation is needed in critical applications Conclusion 54 There is work to be done
  • 54. › Bayesian Neural Networks (e.g. with PyMC) › Sparse Gaussian Process approximations › Gaussian Processes on top of neural networks Outlook 55 Other approaches
  • 55. Thank You! Florian Wilhelm Principal Data Scientist inovex GmbH Schanzenstraße 6-20 Kupferhütte 1.13 51063 Köln florian.wilhelm@inovex.de