Uncertainty in Deep Learning - Christian S. Perone (2019)
Uncertainties Bayesian Inference Deep Learning Variational Inference Ensembles Q&A
Uncertainty Estimation
in Deep Learning
A brief introduction
Christian S. Perone
christian.perone@gmail.com
http://blog.christianperone.com
Agenda
Uncertainties
  Knowing what you don’t know
  The problem
  Different Uncertainties
  Importance of Uncertainty
Bayesian Inference
  The frequentist way
  The Bayesian inference
  MCMC Sampling
Deep Learning
  Short intro
  Bayesian Neural Networks
Variational Inference
  Introduction
  Posterior Approximation
  Training a BNN
  Dropout
Ensembles
  Introduction
  Deep Ensembles
  Randomized Prior Functions
  Final Remarks
Q&A
Who Am I
Christian S. Perone
BSc in Computer Science in Brazil (UPF),
MSc in Biomedical Eng. in Montreal
(Polytechnique/UdeM)
Machine Learning / Data Science
Working at Jungle
Blog at
blog.christianperone.com
Open-source projects
https://github.com/perone
Twitter @tarantulae
Section I
Uncertainties
Knowing what you don’t know
It is correct, somebody might say, that
(...) Socrates did not know anything; and
it was indeed wisdom that they
recognized their own lack of knowledge,
(...).
—Karl R. Popper, The World of Parmenides
What does this have to do with statistical learning?
The problem
Let’s say you trained a model to classify an image as having a lesion or
not;
Different MRI contrasts (T2/T1). Source: http://www.msdiscovery.org. 2019.
Later you run predictions on volumes with a different parametrization,
anatomy, etc.;
The problem: the model can still output a prediction with high probability,
even if the sample is out-of-distribution.
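A minimal sketch of this effect, assuming a binary classifier with a sigmoid output. The weights, bias and inputs below are hypothetical and not taken from any trained lesion model; the point is only that the predicted probability saturates as the input moves away from the training range.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical parameters of an already-trained binary classifier
# (illustrative values only, not from a real model).
w = np.array([1.5, -2.0])
b = 0.1

def predict_proba(x):
    # Probability of the positive class for a single feature vector.
    return sigmoid(x @ w + b)

x_in = np.array([0.3, -0.2])     # input in the range seen during training
x_ood = np.array([40.0, -35.0])  # input far outside that range

p_in = predict_proba(x_in)
p_ood = predict_proba(x_ood)

print(f"in-distribution:     p = {p_in:.3f}")
print(f"out-of-distribution: p = {p_ood:.6f}")
```

The out-of-distribution input gets a probability very close to 1, although the model has never seen anything like it: the output probability alone says nothing about how familiar the input is.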
The problem
A simple regression problem.
Source: Yarin Gal. Uncertainty in Deep Learning. PhD Thesis. 2016.
The problem
A simple regression problem.
[Figure: regression plot.]
Source: Ian Osband et al. Using Randomized Prior Functions for Deep Reinforcement Learning. NIPS
2018. Image from: http://blog.christianperone.com
Different Uncertainties
Two main types of uncertainty, often confused by practitioners, but very
different quantities:
Aleatoric Uncertainty
Noise inherent in the data that the model cannot explain, also called data
uncertainty or irreducible uncertainty. Collecting more data does not reduce it;
Ex: only a higher measurement precision can reduce it.
Epistemic Uncertainty
Uncertainty in the model itself, also called model uncertainty or reducible
uncertainty;
Ex: it can be explained away by increasing the training set size.
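A small sketch of the difference, assuming the simplest possible "model": estimating the mean of noisy measurements. The true mean and the noise level are made-up values. The spread of the observations plays the role of the aleatoric part, and the standard error of the estimated mean plays the role of the epistemic part.

```python
import numpy as np

rng = np.random.default_rng(42)

# Assumed ground truth for the illustration.
true_mean, noise_std = 2.0, 0.5

def estimate(n):
    """Return (aleatoric, epistemic) estimates from n noisy observations."""
    y = rng.normal(true_mean, noise_std, size=n)
    aleatoric = y.std(ddof=1)           # spread of the data itself
    epistemic = aleatoric / np.sqrt(n)  # standard error of the estimated mean
    return aleatoric, epistemic

results = {n: estimate(n) for n in (10, 100, 10_000)}
for n, (aleatoric, epistemic) in results.items():
    print(f"n={n:>6}  aleatoric={aleatoric:.3f}  epistemic={epistemic:.4f}")
```

With more data the epistemic term shrinks towards zero, while the aleatoric term stays around the noise level of the measurements.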
Importance of Uncertainty
Medical imaging (classification, segmentation);
Autonomous vehicles (what’s the uncertainty that this object is a tree?);
Active Learning (which sample should be labeled?);
Explore/exploit dilemma in reinforcement learning;
Out-of-distribution detection;
Model understanding/dataset understanding;
Nearly all applications!
Example in Reinforcement Learning
The explore/exploit dilemma:
Example in Reinforcement Learning
Work by Maxime Wabartha et al.:
estimated by taking, for each approach, the pointwise average and standard deviation over 50 sampled
functions. We expect the empirical posterior predictive distribution to cover the ground truth function.
While we succeed to do so using a MSE loss and the proposed approach, we do not manage to obtain
diverse functions using solely anchoring neither using dropout; in our experiments, changing the
dropout rate did not improve the quality of the obtained uncertainty. Input bootstrapping does produce
functions that better span the width of outputs, but it also disregards by nature certain points of the
training set, where we expect the uncertainty to be low given our current knowledge. We also provide
in the appendix an example of the functions generated by our function approach when fixing X.
[Figure 1 panels: Dropout 0.2, Input bootstrapping, Anchoring, Repulsive. Legend: ground truth, sample function, standard deviations, training set, reference function.]
Figure 1: Comparison of the empirical (over 20 sample functions) posterior predictive distribution
for dropout, input bootstrapping, anchoring and repulsive constraint.
3.2 Diverse functions in high-dimensional input space
We apply the method to function approximation in the case of a reinforcement learning problem
requiring exploration. More precisely, we showcase how our method can help sample diverse reward
functions in a model-based setting. We create a dataset of 43 13x13 frames with the associated reward.
We use as function approximator a small CNN outputing a reward for a given frame (see appendix).
To illustrate our method, we sample the repulsive points from possible frames, thus directly from the
manifold, in or out of the training distribution (see appendix). Figure 2 (rightmost figure) shows how
Source: Maxime Wabartha et al. Sampling diverse neural networks for exploration in reinforcement
learning. NIPS 2018.
Section II
Bayesian Inference
A simple frequentist regression
In a frequentist linear regression, we have a point estimate for the
parameters of our model.
First, we define our model:
$f(x) = \theta_0 + \theta_1 x_1 + \theta_2 x_2 + \dots = x^\top \theta$ (vector notation, with $x_0 = 1$)
Next, we define a loss such as the MSE (mean squared error):
$L = \frac{1}{n} \sum_{i=1}^{n} (f(x_i) - y_i)^2$
Finally, we optimize it:
$\hat{\theta} = \arg\min_{\theta} L(f(x), y)$
For a maximum likelihood derivation, take a look at
http://blog.christianperone.com/2019/01/mle/.
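The three steps (model, loss, optimization) can be sketched with NumPy. The data here is synthetic and the true coefficients are assumptions for the example; for the MSE loss the minimizer is the ordinary least-squares solution, which `np.linalg.lstsq` computes directly.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (assumed for the example): y = 1 + 2x + Gaussian noise.
x = rng.uniform(0.0, 1.0, size=200)
y = 1.0 + 2.0 * x + rng.normal(0.0, 0.5, size=200)

# Model: f(x) = theta_0 + theta_1 * x, written as X @ theta with a column of ones.
X = np.column_stack([np.ones_like(x), x])

# Optimization: the least-squares solution minimizes the MSE.
theta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

# Loss at the optimum.
mse = np.mean((X @ theta_hat - y) ** 2)

print("theta_hat:", theta_hat)
print("MSE:", mse)
```

The result is a single point estimate `theta_hat`: nothing in it tells us how much other parameter values would also be compatible with the data.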
A simple frequentist regression
[Figure: Frequentist regression. Sample data and the fitted regression line, y against x.]
The Bayesian way
Bayesian approaches represent the uncertainty using a distribution over
parameters. Instead of a point estimate, we have an entire posterior.
To formulate our Bayesian regression, we first select a likelihood;
After that, we select priors over the parameters;
Then we compute the posterior given our model and data, or approximate it
(e.g. by sampling).
Prior, likelihood and posterior
[Figure: three panels (Prior, Data, Posterior), each plotting credibility over the values 1, 2, 3.]
Prior, likelihood and posterior
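$\underbrace{p(\theta \mid X)}_{\text{Posterior}} \propto \underbrace{p(X \mid \theta)}_{\text{Likelihood}} \, \underbrace{\pi(\theta)}_{\text{Prior}}$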
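For a single parameter, this proportionality can be evaluated directly on a grid. The sketch below assumes a coin-toss example (6 heads in 9 tosses, made-up data) with a flat prior; neither the data nor the grid size comes from the slides.

```python
import numpy as np

# Candidate values for theta (probability of heads).
theta_grid = np.linspace(0.0, 1.0, 101)

# Flat prior pi(theta) over the grid (an assumption for this example).
prior = np.ones_like(theta_grid)

# Assumed data: 6 heads in 9 tosses.
heads, tosses = 6, 9

# Binomial likelihood p(X | theta), up to a constant that cancels below.
likelihood = theta_grid ** heads * (1.0 - theta_grid) ** (tosses - heads)

# Posterior is proportional to likelihood times prior; normalize over the grid.
unnormalized = likelihood * prior
posterior = unnormalized / unnormalized.sum()

map_estimate = theta_grid[np.argmax(posterior)]
print("most credible theta on the grid:", map_estimate)
```

The grid approach only works for a handful of parameters, since the number of grid points grows exponentially with the dimension.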
[Figure: three rows, each showing prior × likelihood ∝ posterior as densities over the interval 0 to 1.]
Source: Statistical Rethinking/Winter 2019. Richard McElreath.
Bayesian regression
Let’s reformulate our regression:
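We will use a simple Gaussian distribution for our observations,
defined as:
$Y \sim \mathcal{N}(\mu, \sigma^2)$
We plug our linear model in as the mean $\mu$:
$Y \sim \mathcal{N}(\underbrace{\alpha + \beta x}_{\text{Linear model}}, \sigma^2)$
And define the priors:
$\alpha \sim \mathcal{N}(0, 20)$
$\beta \sim \mathcal{N}(0, 20)$
$\sigma \sim \mathcal{U}(0, 5)$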
Bayesian Regression in Plate Notation
You can represent the same model below with plate notation:
$Y \sim \mathcal{N}(\alpha + \beta x, \sigma^2)$
$\alpha \sim \mathcal{N}(0, 20)$
$\beta \sim \mathcal{N}(0, 20)$
$\sigma \sim \mathcal{U}(0, 5)$
MCMC Sampling
Let’s see a demo of a Markov chain Monte Carlo (MCMC) sampler:
Source: MCMC Demos, by Chi Feng
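To make the idea concrete, here is a minimal random-walk Metropolis sketch for the Bayesian regression with Gaussian priors on α and β and a uniform prior on σ. It is not the sampler used in the demo or in PyMC3. The data is synthetic, the value 20 in the priors is read as a standard deviation, and the step size and burn-in are arbitrary choices; all of these are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (assumed): y = 1 + 2x + noise with sigma = 0.5.
x = rng.uniform(0.0, 1.0, size=100)
y = 1.0 + 2.0 * x + rng.normal(0.0, 0.5, size=100)

def log_posterior(alpha, beta, sigma):
    """Unnormalized log posterior: log prior + log likelihood."""
    if not (0.0 < sigma < 5.0):  # sigma ~ U(0, 5)
        return -np.inf
    # alpha, beta ~ N(0, 20), reading 20 as a standard deviation (assumption).
    log_prior = -0.5 * (alpha / 20.0) ** 2 - 0.5 * (beta / 20.0) ** 2
    resid = y - (alpha + beta * x)
    log_lik = -len(y) * np.log(sigma) - 0.5 * np.sum(resid ** 2) / sigma ** 2
    return log_prior + log_lik

def metropolis(n_steps=20_000, step=0.05):
    current = np.array([0.0, 0.0, 1.0])  # initial (alpha, beta, sigma)
    current_lp = log_posterior(*current)
    samples = np.empty((n_steps, 3))
    for i in range(n_steps):
        # Symmetric Gaussian random-walk proposal.
        proposal = current + rng.normal(0.0, step, size=3)
        proposal_lp = log_posterior(*proposal)
        # Accept with probability min(1, posterior ratio).
        if np.log(rng.uniform()) < proposal_lp - current_lp:
            current, current_lp = proposal, proposal_lp
        samples[i] = current
    return samples

samples = metropolis()[5_000:]  # drop the burn-in draws
alpha_mean, beta_mean, sigma_mean = samples.mean(axis=0)
print("posterior means:", alpha_mean, beta_mean, sigma_mean)
print("posterior stds: ", samples.std(axis=0))
```

The sampler only needs the unnormalized posterior, which is why MCMC avoids computing the evidence p(X). The samples give a full distribution per parameter rather than a single point estimate.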
MCMC Sampling
[Figure: trace plot for the parameters Intercept, x and sigma. For each one, a histogram of the sampled values (Frequency) and the sample value over iterations 0 to 4000.]
Trace plot generated using PyMC3; you can also use ArviZ.
Bayesian regression
[Figure: Posterior predictive regression lines. Sample data and posterior predictive regression lines, y against x.]
Bayesian methods
Bayesian methods can give us a full posterior to reason about;
Explicit priors;
Uncertainty;
They’re on the side of algorithms, not models [1];
However,
Intractable posterior for many practical cases and large datasets;
$p(\theta \mid X) = \frac{p(X \mid \theta)\, \pi(\theta)}{p(X)}$
Tuning and using MCMC algorithms can be tricky.
[1] Zoubin Ghahramani, History of Bayesian Neural Networks, NIPS 2016
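The intractable part of the posterior is the denominator. Written out, the evidence is an integral over every possible parameter setting, which usually has no closed form and is too expensive to evaluate numerically when θ has many dimensions:

```latex
p(X) = \int p(X \mid \theta)\, \pi(\theta)\, d\theta
```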
Section III
Deep Learning
Deep Learning
It’s not a secret that Deep Learning reached an important milestone in
Machine Learning:
Non-linear function approximators;
They can scale to large datasets (thanks to stochastic approximation);
They are state-of-the-art for NLP, computer vision, speech, etc;
Very expressive and flexible;
Representation learning;
One-slide Intro to Deep Learning
[Figure: network with an input layer (x_0 ... x_D), L hidden layers (the l-th with units y^(l)_0 ... y^(l)_{m^(l)}) and an output layer (y^(L+1)_1 ... y^(L+1)_C).]
A multi-layer perceptron (MLP) network overview. Source: David Stutz, 2018, BSD 3-Clause License.
Parametrized models with composition of functions;
Trained using backpropagation and SGD;
Parameters are usually learned by maximizing the log-likelihood;
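A tiny sketch of the first and last points: an MLP as a composition of functions, and the negative log-likelihood that training would minimize. Layer sizes, the random weights and the labels are placeholders, and the backpropagation/SGD step itself is not shown.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(z):
    return np.maximum(z, 0.0)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

# Placeholder sizes: 4 inputs, one hidden layer with 8 units, 3 classes.
W1, b1 = rng.normal(0.0, 0.1, size=(4, 8)), np.zeros(8)
W2, b2 = rng.normal(0.0, 0.1, size=(8, 3)), np.zeros(3)

def mlp(x):
    # Composition of functions: softmax(affine(relu(affine(x)))).
    hidden = relu(x @ W1 + b1)
    return softmax(hidden @ W2 + b2)

x = rng.normal(size=(5, 4))          # a batch of 5 random inputs
labels = np.array([0, 2, 1, 0, 1])   # made-up class labels

probs = mlp(x)
# Maximizing the log-likelihood is the same as minimizing this quantity.
nll = -np.mean(np.log(probs[np.arange(len(labels)), labels]))
print("negative log-likelihood:", nll)
```

With untrained weights the loss sits close to log(3), the value for a uniform guess over three classes.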
Bayesian Neural Networks
A Bayesian Neural Network (BNN) is a Neural Network with
distributions over parameters [2].
Source: Weight Uncertainty in Neural Networks. Charles Blundell et al. 2015.
[2] Neal, Radford M. (2012). Bayesian learning for neural networks.
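As an illustration of what "distributions over parameters" means in practice, the sketch below gives every weight a Gaussian with its own mean and standard deviation, samples a fresh set of weights per forward pass, and summarizes the predictions. The network is untrained and all the numbers are placeholders; it shows the mechanics only, not a trained BNN.

```python
import numpy as np

rng = np.random.default_rng(0)

# Placeholder Gaussian parameters per weight: a mean and a standard deviation.
W1_mu, W1_sigma = rng.normal(0.0, 0.5, size=(1, 16)), np.full((1, 16), 0.2)
W2_mu, W2_sigma = rng.normal(0.0, 0.5, size=(16, 1)), np.full((16, 1), 0.2)

def sample_forward(x):
    # Draw one concrete network from the weight distributions, then predict.
    W1 = W1_mu + W1_sigma * rng.normal(size=W1_mu.shape)
    W2 = W2_mu + W2_sigma * rng.normal(size=W2_mu.shape)
    return np.tanh(x @ W1) @ W2

x = np.array([[-2.0], [-1.0], [1.0], [2.0]])

# Many stochastic forward passes give a distribution over predictions.
preds = np.stack([sample_forward(x) for _ in range(200)])
pred_mean = preds.mean(axis=0)
pred_std = preds.std(axis=0)

print("predictive mean:", pred_mean.ravel())
print("predictive std: ", pred_std.ravel())
```

The spread of the sampled predictions is what a point-estimate network cannot provide.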
Bayesian Neural Networks
In modern Deep Neural Networks, however, we have some challenges:
A lot of data;
High-dimensionality in data;
Millions of parameters;
Highly non-convex surfaces;
This makes exact Bayesian inference very difficult for these models, so an
approximation is required:
Variational Inference
(variational Bayes)
Section IV
Variational Inference
Variational Inference
Variational Inference (VI) is often used as an alternative to MCMC;
Can be used to approximate the posterior of Bayesian models;
Faster than MCMC for complex models and larger datasets;
Shift from sampling to optimization;
Fewer guarantees than MCMC: VI only finds a density close to the target;
For a modern in-depth review, please refer to: Variational Inference: A Review for
Statisticians. Blei, D. M. et al. (2018).
Variational Inference
We have a very complex posterior distribution p(w | D) that we
want to approximate (w are the parameters, and D is the data);
We do this approximation by using an "easier" distribution q(w | θ)
(also called the variational distribution, where θ are the variational
parameters);
Variational approximation (green). Source: Eric Jang, 2016. https://blog.evjang.com
Posterior approximation
If we want to approximate p(w | D) with q(w | θ), we need a
measure of "closeness";
We use Kullback-Leibler (KL) divergence:
Source: Flawnson Tong, https://towardsdatascience.com
Posterior approximation
We use Kullback-Leibler (KL) divergence:
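$$\theta^{*} = \arg\min_{\theta} \mathrm{KL}[\, q(w \mid \theta) \,\|\, p(w \mid D) \,]$$
$$\theta^{*} = \arg\min_{\theta} \mathbb{E}_{q(w \mid \theta)}\big[ \underbrace{\log q(w \mid \theta)}_{\text{variational posterior}} - \underbrace{\log p(w)}_{\text{prior}} - \underbrace{\log p(D \mid w)}_{\text{log likelihood}} \big]$$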
Why KL divergence?
Because it lets us derive a cost function that is tractable to optimize.
Not without paying a price, though.
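The step from the KL divergence to the three-term cost can be made explicit. The short derivation below assumes only Bayes' rule, p(w | D) = p(D | w) p(w) / p(D); the evidence term log p(D) is not shown on the slide and is introduced here for the derivation.

```latex
\begin{aligned}
\mathrm{KL}[\, q(w \mid \theta) \,\|\, p(w \mid D) \,]
  &= \mathbb{E}_{q(w \mid \theta)}\big[ \log q(w \mid \theta) - \log p(w \mid D) \big] \\
  &= \mathbb{E}_{q(w \mid \theta)}\big[ \log q(w \mid \theta) - \log p(w) - \log p(D \mid w) \big] + \log p(D)
\end{aligned}
```

Since log p(D) does not depend on θ, it can be dropped from the arg min, which leaves a cost that only needs the variational posterior, the prior and the likelihood.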
Forward and Reverse KL
Forms of the KL-divergence. Source: Pattern Recognition and Machine Learning. Christopher M.
Bishop. 2006. (a) forward KL-divergence, (b) and (c) reverse KL-divergence.
Forward KL
Source: Colin Raffel, https://colinraffel.com
Forward KL (misspecification)
Source: Colin Raffel, https://colinraffel.com
Reverse KL
Source: Colin Raffel, https://colinraffel.com
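The difference between the two directions can be checked numerically. The sketch below is a toy example and not the one in the figures: it assumes a bimodal target p (an equal mixture of N(-3, 1) and N(3, 1)), a single Gaussian q, and a brute-force grid search instead of gradient-based optimization. Here forward KL means KL[p || q] and reverse KL means KL[q || p], the direction used in the variational objective.

```python
import numpy as np

def log_normal(x, mean, std):
    """Log density of N(mean, std^2) evaluated at x."""
    return -0.5 * np.log(2 * np.pi) - np.log(std) - 0.5 * ((x - mean) / std) ** 2

# Integration grid and the (assumed) bimodal target density p.
x = np.linspace(-15.0, 15.0, 3001)
dx = x[1] - x[0]
log_p = np.logaddexp(np.log(0.5) + log_normal(x, -3.0, 1.0),
                     np.log(0.5) + log_normal(x, 3.0, 1.0))
p = np.exp(log_p)

def forward_kl(mean, std):
    """KL[p || q], approximated by a Riemann sum on the grid."""
    log_q = log_normal(x, mean, std)
    return np.sum(p * (log_p - log_q)) * dx

def reverse_kl(mean, std):
    """KL[q || p], approximated by a Riemann sum on the grid."""
    log_q = log_normal(x, mean, std)
    return np.sum(np.exp(log_q) * (log_q - log_p)) * dx

def grid_search(objective):
    """Return the (mean, std) pair on a coarse grid that minimizes the objective."""
    means = np.linspace(-5.0, 5.0, 41)
    stds = np.linspace(0.5, 4.0, 36)
    scores = np.array([[objective(m, s) for s in stds] for m in means])
    i, j = np.unravel_index(np.argmin(scores), scores.shape)
    return means[i], stds[j]

fwd_mean, fwd_std = grid_search(forward_kl)
rev_mean, rev_std = grid_search(reverse_kl)
print("forward KL fit:", fwd_mean, fwd_std)   # wide q covering both modes
print("reverse KL fit:", rev_mean, rev_std)   # narrow q sitting on one mode
```

In this toy setting the forward KL fit spreads its mass over both modes, while the reverse KL fit locks onto a single mode and ignores the other one.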
Quality of the uncertainty estimation
MFVB approximation. Source: Variational Bayes and beyond: Bayesian inference for big data.
Tamara Broderick. ICML 2018.
Mean-field variational Bayes (MFVB) can severely underestimate the
variance;
Compared to MCMC, the means are usually fine, but the variances can be
far off;
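A textbook special case shows where the underestimation comes from. The sketch below assumes the true posterior is exactly a correlated 2D Gaussian (an assumption made for illustration, not the example from the tutorial). For a Gaussian target, the factorized Gaussian that minimizes KL[q || p] has variances equal to the reciprocal of the diagonal of the precision matrix.

```python
import numpy as np

def mean_field_variances(cov):
    """Variances of the factorized Gaussian minimizing KL[q || p]
    when p is a Gaussian with covariance matrix `cov`."""
    precision = np.linalg.inv(cov)
    return 1.0 / np.diag(precision)

rho = 0.9                                   # correlation of the assumed target
cov = np.array([[1.0, rho], [rho, 1.0]])
true_var = np.diag(cov)                     # true marginal variances
mf_var = mean_field_variances(cov)          # equals 1 - rho**2 in this 2D case

print("true marginal variances:", true_var)
print("mean-field variances:   ", mf_var)
```

The stronger the correlation, the larger the gap, because the factorized q cannot represent the dependence between the two weights.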
Training a Bayesian Neural Network
The training loop for a Bayesian Neural Network (BNN) using
Variational Inference is shown below:
Sample from q(w | θ) the parameters of the network. Two
variational parameters for each weight in q: µ and σ;
Parametrize the network with the sampled parameters, often using
the reparametrization trick;
Forward pass with the data batch;
Calculate the combined loss: variational posterior, prior and log
likelihood;
Compute gradients by backpropagation and optimize with SGD;
Repeat;
Prediction: multiple forward passes.
This method is also called Bayes by Backprop.
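The steps above can be sketched on a problem small enough to verify. The example below is a minimal illustration under several assumptions that are not in the slides: a one-weight linear model instead of a neural network, synthetic data, the full data set instead of mini-batches, σ = exp(ρ) as the parametrization, and hand-written gradients in place of backpropagation. Because this model is conjugate, the exact posterior is available for comparison.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: y = 1.5 * x + Gaussian noise.
n, noise_std, prior_std = 20, 1.0, 1.0
x = rng.normal(size=n)
y = 1.5 * x + noise_std * rng.normal(size=n)

# q(w | theta) = N(mu, sigma^2) with sigma = exp(rho), so theta = (mu, rho).
mu, rho = 0.0, -1.0
lr, n_samples = 0.01, 32

for step in range(3000):
    sigma = np.exp(rho)
    eps = rng.normal(size=n_samples)
    w = mu + sigma * eps                              # reparametrization trick
    residual = y[None, :] - w[:, None] * x[None, :]   # forward pass, shape (S, n)

    # Combined loss averaged over the samples (additive constants dropped):
    # log q(w | theta) - log p(w) - log p(D | w)
    log_q = -rho - 0.5 * eps**2
    log_prior = -0.5 * (w / prior_std) ** 2
    log_lik = -0.5 * (residual**2).sum(axis=1) / noise_std**2
    loss = np.mean(log_q - log_prior - log_lik)

    # Analytic gradients of the loss (stand-in for backpropagation).
    grad_w = w / prior_std**2 - (residual * x[None, :]).sum(axis=1) / noise_std**2
    grad_mu = grad_w.mean()
    grad_rho = (grad_w * eps * sigma).mean() - 1.0    # the -1 comes from log q
    mu -= lr * grad_mu                                # plain gradient step
    rho -= lr * grad_rho

sigma = np.exp(rho)

# Exact posterior of this conjugate model, for comparison.
post_prec = 1.0 / prior_std**2 + np.sum(x**2) / noise_std**2
post_mean = np.sum(x * y) / noise_std**2 / post_prec
post_std = post_prec ** -0.5
print("variational:", mu, sigma)
print("exact:      ", post_mean, post_std)

# Prediction: multiple forward passes, each with freshly sampled weights.
x_new = np.array([-2.0, 0.0, 2.0])
w_samples = mu + sigma * rng.normal(size=200)
preds = w_samples[:, None] * x_new[None, :]
pred_mean, pred_std = preds.mean(axis=0), preds.std(axis=0)
```

Here the variational family contains the exact posterior, so µ and σ should land close to the exact values; with a real network this check is not available.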
Quality of the uncertainty estimation
HMC vs VI. Source: Bayesian Inference with Anchored Ensembles of Neural Networks, and Application
to Exploration in Reinforcement Learning. Tim Pearce. 2018.
For more information
For more information about the variational approach, please refer to: Weight
Uncertainty in Neural Networks. C. Blundell, et al. 2015.
Dropout as a Bayesian Approximation
Dropout. Source: Dropout: A Simple Way to Prevent Neural Networks from Overfitting. Nitish
Srivastava, et al. 2014.
Dropout as a Bayesian Approximation
In 2015, the work Dropout as a Bayesian Approximation: Insights and
Applications (Yarin Gal et al.) found a relationship between
dropout and approximate Bayesian inference;
It turns out that, to perform approximate variational inference with a
Bernoulli variational distribution in Bayesian NNs, you can just apply
dropout during training and keep it active at prediction time as well;
Quite appealing due to its simplicity; it also provides an
interesting interpretation of dropout;
This technique is called "MC Dropout" or "Monte Carlo Dropout".
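The prediction side of MC Dropout can be sketched in a few lines. The example below assumes a tiny MLP with random, untrained weights (W1, b1, W2, b2 are hypothetical placeholders) and a dropout probability of 0.5; it only illustrates keeping dropout active at prediction time and aggregating several stochastic forward passes.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical, untrained weights of a 1-16-1 MLP (placeholders for a trained model).
W1 = rng.normal(size=(1, 16))
b1 = np.zeros(16)
W2 = rng.normal(size=(16, 1)) / 4.0
b2 = np.zeros(1)

def forward(x, drop_prob=0.5):
    """One stochastic forward pass with dropout applied to the hidden layer."""
    h = np.maximum(x @ W1 + b1, 0.0)                # ReLU hidden layer
    keep = rng.random(h.shape) >= drop_prob         # Bernoulli mask, resampled per call
    h = h * keep / (1.0 - drop_prob)                # inverted dropout scaling
    return h @ W2 + b2

def mc_dropout_predict(x, n_passes=100):
    """Predictive mean and standard deviation over n_passes stochastic passes."""
    preds = np.stack([forward(x) for _ in range(n_passes)])
    return preds.mean(axis=0), preds.std(axis=0)

x = np.linspace(-2.0, 2.0, 5).reshape(-1, 1)
mean, std = mc_dropout_predict(x)
print(mean.ravel(), std.ravel())
```

The standard deviation across passes is what gets read as the model uncertainty for each input.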
MC Dropout on a Regression Setting
Some results from the MC Dropout on a regression setting:
MC Dropout. Source: Dropout as a Bayesian Approximation: Insights and Applications. Yarin Gal et al.
ICML 2015.
MC Dropout on a Classification Setting
Some results from the MC Dropout on a classification setting:
MC Dropout. Source: Dropout as a Bayesian Approximation: Insights and Applications. Yarin Gal et al.
ICML 2015.
Criticism of MC Dropout
Some results from the MC Dropout on a regression setting:
MC Dropout with varying number of data points. Gray regions are 1 std. dev. above and below the
mean. Source: Randomized Prior Functions for Deep Reinforcement Learning. Ian Osband et al. 2018.
It was shown that MC Dropout fails a simple sanity check in a linear
setting: its uncertainty does not concentrate as more data is observed.
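A toy calculation gives some intuition for this behaviour. The sketch below is not the experiment from the paper; it assumes a one-weight linear model y = w x, dropout applied to that single weight with keep probability `keep`, synthetic data, and the closed-form minimizer of the expected dropout loss. It compares the MC Dropout predictive spread at a test input `x0` with the exact Bayesian posterior spread as the number of data points grows.

```python
import numpy as np

rng = np.random.default_rng(0)
w_true, noise_std, keep, x0 = 2.0, 1.0, 0.5, 1.0   # all values are illustrative

def predictive_stds(n):
    """Return (MC Dropout std, exact Bayesian std) of the prediction at x0."""
    x = rng.normal(size=n)
    y = w_true * x + noise_std * rng.normal(size=n)
    sxx, sxy = np.sum(x * x), np.sum(x * y)

    # Weight minimizing the expected squared loss under inverted dropout:
    # E[(y - (z / keep) * w * x)^2] with z ~ Bernoulli(keep).
    w_dropout = keep * sxy / sxx
    # Std of (z / keep) * w_dropout * x0 over the dropout mask z.
    dropout_std = np.sqrt((1 - keep) / keep) * abs(w_dropout * x0)

    # Exact posterior std of w * x0 with prior w ~ N(0, 1) and known noise.
    bayes_std = abs(x0) / np.sqrt(1.0 + sxx / noise_std**2)
    return dropout_std, bayes_std

results = {n: predictive_stds(n) for n in (10, 100, 10_000)}
for n, (d_std, b_std) in results.items():
    print(n, d_std, b_std)
```

In this setting the Bayesian spread shrinks as the data set grows, while the dropout spread settles at a constant set by the dropout rate.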
Section V
Ensembles
Ensembles
Use multiple hypotheses to learn a better one;
We can see dropout as an ensemble, but with shared weights;
The ensemble variance can be interpreted as uncertainty;
Simple intuition for why it works.
[Diagram: Input Data → Model #1, Model #2, Model #3 → Combine predictions]
Deep Ensembles
The work Simple and Scalable Predictive Uncertainty Estimation using
Deep Ensembles (Lakshminarayanan B. et al., NIPS 2017) proposed a
very simple method to compute uncertainty with ensembles:
Setting
You have M models, with independent parameters θ1, θ2, ..., θM.
1) Initialize the parameters θ1, θ2, ..., θM randomly;
2) Train each network m ∈ {1, ..., M} with weights θm individually;
3) Optionally add adversarial training;
4) Combine the predictions with:
$$p(y \mid x) = \underbrace{M^{-1} \sum_{m=1}^{M}}_{\text{average}} \underbrace{p_{\theta_m}(y \mid x, \theta_m)}_{\text{prediction from each network}}$$
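The combination step is a plain average of the members' predictive distributions. A minimal sketch, assuming the M networks are already trained and their class probabilities are stacked in an array (the numbers below are made up for illustration):

```python
import numpy as np

def combine_predictions(member_probs):
    """Average class probabilities over ensemble members.

    member_probs has shape (M, n_examples, n_classes).
    """
    return np.asarray(member_probs).mean(axis=0)

# Made-up outputs of M = 3 classifiers on 2 examples with 2 classes.
member_probs = np.array([
    [[0.9, 0.1], [0.6, 0.4]],
    [[0.8, 0.2], [0.2, 0.8]],
    [[0.7, 0.3], [0.4, 0.6]],
])
ensemble_probs = combine_predictions(member_probs)
print(ensemble_probs)
```

The members agree on the first example and disagree on the second, so the averaged distribution is noticeably flatter for the second one.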
Evaluating Entropy on Classification
Plot of the binary entropy function H(p). A measure of the uncertainty.
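Both quantities are easy to compute. The sketch below assumes the binary entropy is measured in bits and the multi-class predictive entropy in nats; the helper names are illustrative and the probabilities are clipped to avoid log(0).

```python
import numpy as np

def binary_entropy(p):
    """Binary entropy H(p) in bits."""
    p = np.clip(np.asarray(p, dtype=float), 1e-12, 1 - 1e-12)
    return -(p * np.log2(p) + (1 - p) * np.log2(1 - p))

def predictive_entropy(probs):
    """Entropy (in nats) of each predictive distribution along the last axis."""
    probs = np.clip(np.asarray(probs, dtype=float), 1e-12, 1.0)
    return -np.sum(probs * np.log(probs), axis=-1)

print(binary_entropy([0.0, 0.5, 1.0]))             # largest at p = 0.5
print(predictive_entropy([[0.25, 0.25, 0.25, 0.25],
                          [0.97, 0.01, 0.01, 0.01]]))
```

A confident prediction has entropy near zero, and a uniform prediction over C classes has the maximum entropy log C.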
Evaluating Entropy on Classification
[Figure: two histograms of the predictive entropy (x-axis: entropy values); panels: Known classes and Unknown classes; legend entries: 1, 2, 3, 4, 5.]
ImageNet trained only on dogs. Histogram of the predictive entropy on test examples from known classes
(dogs) and unknown classes (non-dogs) with varying ensemble size. Source: Lakshminarayanan B., et al.
NIPS 2017.
Evaluating Entropy on Classification
[Figure: histograms of the predictive entropy (x-axis: entropy values), two rows of four panels: Ensemble, Ensemble + R, Ensemble + AT, MC dropout; legend entries: 1, 5, 10.]
Histogram of the predictive entropy on test examples from known classes from SVHN (top row) and
unknown classes from CIFAR-10 (bottom row). Source: Lakshminarayanan B., et al. NIPS 2017.
Randomized Priors
In Randomized Prior Functions for Deep Reinforcement Learning. Ian
Osband et al. 2018:
Very simple and elegant modification of the ensemble method for
uncertainty;
Developed in the Reinforcement Learning context;
Overcomes the difficulty of injecting a prior into ensemble-based
approaches to uncertainty;
In a simple linear setting (a linear Gaussian model), it is equivalent
to exact Bayesian inference.
Bootstrap
[Diagram: Population → Sample #1, Sample #2, Sample #3 → Statistic computed on each sample (q1, q2, q3) → Bootstrap Statistic Distribution]
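The procedure in the diagram can be sketched as follows. The example assumes synthetic data and uses the mean as the statistic; in practice the resamples are drawn with replacement from the observed data set, which stands in for the population.

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(loc=5.0, scale=2.0, size=200)     # synthetic observed data

def bootstrap_statistic(data, statistic, n_resamples=1000):
    """Evaluate `statistic` on n_resamples resamples drawn with replacement."""
    n = len(data)
    idx = rng.integers(0, n, size=(n_resamples, n))  # one row of indices per resample
    return np.array([statistic(data[row]) for row in idx])

boot_means = bootstrap_statistic(data, np.mean)
std_error = boot_means.std()                         # spread of the bootstrap distribution
print("mean:", data.mean(), "bootstrap standard error:", std_error)
```

The spread of `boot_means` estimates how much the statistic would vary across data sets, which is the same idea the ensemble methods below apply to whole models.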
Randomized Prior Functions
The key insight is to add a randomized (but fixed) prior and
bootstrapped data:
for k = 1, ..., K do:
    Initialize θ_k ∼ random;
    Form D_k with bootstrap;
    Sample prior function p_k ∼ P;
    Optimize L(f_θ + λ p_k; D_k);
return posterior ensemble {f_{θ_k} + p_k}_{k=1}^{K}
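The loop above can be sketched with a model that has a closed-form fit. The example below makes several assumptions that are not in the slides: f is linear in fixed RBF features, the prior function p_k uses the same features with frozen random weights, ridge regression stands in for the gradient-based optimization (so no random initialization of θ_k is needed), and each member is returned as f + λ p_k. All names and constants are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def features(x, centers, width=0.5):
    """RBF features of the inputs; shape (len(x), len(centers))."""
    return np.exp(-0.5 * ((x[:, None] - centers[None, :]) / width) ** 2)

centers = np.linspace(-4.0, 4.0, 15)
x_train = rng.uniform(-1.0, 1.0, size=30)            # data only covers [-1, 1]
y_train = np.sin(2.0 * x_train) + 0.1 * rng.normal(size=30)

def fit_member(lam=1.0, ridge=1e-3):
    """Train one ensemble member f + lam * p_k on a bootstrap resample."""
    idx = rng.integers(0, len(x_train), size=len(x_train))   # form D_k with bootstrap
    xk, yk = x_train[idx], y_train[idx]
    prior_w = rng.normal(size=len(centers))          # sample p_k, then keep it fixed
    phi = features(xk, centers)
    target = yk - lam * phi @ prior_w                # f must explain what the prior does not
    A = phi.T @ phi + ridge * np.eye(len(centers))
    theta = np.linalg.solve(A, phi.T @ target)       # closed-form ridge fit of f
    return lambda x: features(x, centers) @ (theta + lam * prior_w)

ensemble = [fit_member() for _ in range(20)]
x_test = np.array([0.0, 3.5])                        # inside and outside the data range
preds = np.stack([member(x_test) for member in ensemble])
spread = preds.std(axis=0)
print("ensemble std at x = 0.0 and x = 3.5:", spread)
```

Inside the data range the members agree because the data overrides the priors; far from the data each member falls back to its own prior function, so the ensemble spread grows.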
Qualitative Inspection
Some pathological cases:
Posterior predictive distributions for 1D regression with a (20, 20)-MLP and ReLUs. Source:
Randomized Prior Functions for Deep Reinforcement Learning. Ian Osband et al. 2018.
“(...) If an agent has only ever observed zero reward, then no amount of
bootstrapping or ensembling will cause it to simulate positive rewards.
(...)”
– Randomized Prior Functions for Deep Reinforcement Learning. Ian
Predictive Uncertainty
Source: Ian Osband et al. Using Randomized Prior Functions for Deep Reinforcement Learning. NIPS
2018. Image from: http://blog.christianperone.com
Posterior Samples
Source: Ian Osband et al. Using Randomized Prior Functions for Deep Reinforcement Learning. NIPS
2018. Image from: http://blog.christianperone.com
Prior Samples
Source: Ian Osband et al. Using Randomized Prior Functions for Deep Reinforcement Learning. NIPS
2018. Image from: http://blog.christianperone.com
Final Remarks
Many methods, no standardized evaluation, no ground truth for
model uncertainty;
Basically all methods carry a performance (CPU/GPU resources)
penalty;
No scalable solution for MCMC (yet);
Choice depends on application;
Always take into consideration the trade-offs in the guarantees each
method offers;
Significant evolution of methods, frameworks and hardware.
Learning More - I
Statistical Rethinking (excellent book and course), by Richard
McElreath.
https://xcelab.net/rm/statistical-rethinking/
Variational Inference: A Review, by David M. Blei, et al.
https://arxiv.org/abs/1601.00670
Scalable Bayesian Inference, by David Dunson. NIPS 2018 Talk.
https://www.youtube.com/watch?v=0HXpnG_WnlI
Variational Bayes and Beyond, by Tamara Broderick. ICML 2018
Tutorial.
https://www.youtube.com/watch?v=Moo4-KR5qNg
History of Bayesian Neural Networks, by Zoubin Ghahramani.
NIPS 2016 Keynote talk.
https://www.youtube.com/watch?v=FD8l2vPU5FY
Learning More - II
Uncertainty in Deep Learning, Slides, by Roberto Silveira.
http://tiny.cc/c77n9y
A Beginner’s Guide to Variational Methods, by Eric Jang.
https://blog.evjang.com/2016/08/variational-bayes.html
Uncertainty in Deep Learning, Thesis, by Yarin Gal.
http://mlg.eng.cam.ac.uk/yarin/thesis/thesis.pdf
PyMC3, Framework, by PyMC3 developers.
https://docs.pymc.io/
Pyro, Framework, by Pyro developers.
http://pyro.ai/
Tensorflow Probability, Framework, by TensorFlow developers.
https://www.tensorflow.org/probability
Section VI
Q&A
Q&A
Hope you liked it! Questions?

More Related Content

What's hot

Hyperparameter Optimization for Machine Learning
Hyperparameter Optimization for Machine LearningHyperparameter Optimization for Machine Learning
Hyperparameter Optimization for Machine Learning
Francesco Casalegno
 
Image segmentation with deep learning
Image segmentation with deep learningImage segmentation with deep learning
Image segmentation with deep learning
Antonio Rueda-Toicen
 
An introduction to Machine Learning
An introduction to Machine LearningAn introduction to Machine Learning
An introduction to Machine Learning
butest
 
AlexNet(ImageNet Classification with Deep Convolutional Neural Networks)
AlexNet(ImageNet Classification with Deep Convolutional Neural Networks)AlexNet(ImageNet Classification with Deep Convolutional Neural Networks)
AlexNet(ImageNet Classification with Deep Convolutional Neural Networks)
Fellowship at Vodafone FutureLab
 
機械学習の未解決課題
機械学習の未解決課題機械学習の未解決課題
機械学習の未解決課題
Hiroyuki Masuda
 
Introduction to Generative Adversarial Networks (GANs)
Introduction to Generative Adversarial Networks (GANs)Introduction to Generative Adversarial Networks (GANs)
Introduction to Generative Adversarial Networks (GANs)
Appsilon Data Science
 
Generative Models for General Audiences
Generative Models for General AudiencesGenerative Models for General Audiences
Generative Models for General Audiences
Sangwoo Mo
 
[DL輪読会]Estimating Predictive Uncertainty via Prior Networks
[DL輪読会]Estimating Predictive Uncertainty via Prior Networks[DL輪読会]Estimating Predictive Uncertainty via Prior Networks
[DL輪読会]Estimating Predictive Uncertainty via Prior Networks
Deep Learning JP
 
1시간만에 GAN(Generative Adversarial Network) 완전 정복하기
1시간만에 GAN(Generative Adversarial Network) 완전 정복하기1시간만에 GAN(Generative Adversarial Network) 완전 정복하기
1시간만에 GAN(Generative Adversarial Network) 완전 정복하기
NAVER Engineering
 
Transfer Learning
Transfer LearningTransfer Learning
Transfer Learning
Hichem Felouat
 
Naive Bayes Classifier in Python | Naive Bayes Algorithm | Machine Learning A...
Naive Bayes Classifier in Python | Naive Bayes Algorithm | Machine Learning A...Naive Bayes Classifier in Python | Naive Bayes Algorithm | Machine Learning A...
Naive Bayes Classifier in Python | Naive Bayes Algorithm | Machine Learning A...
Edureka!
 
Deep learning - A Visual Introduction
Deep learning - A Visual IntroductionDeep learning - A Visual Introduction
Deep learning - A Visual Introduction
Lukas Masuch
 
Deep Dive into Hyperparameter Tuning
Deep Dive into Hyperparameter TuningDeep Dive into Hyperparameter Tuning
Deep Dive into Hyperparameter Tuning
Shubhmay Potdar
 
Feature Selection in Machine Learning
Feature Selection in Machine LearningFeature Selection in Machine Learning
Feature Selection in Machine Learning
Upekha Vandebona
 
Feature Engineering
Feature EngineeringFeature Engineering
Feature Engineering
Sri Ambati
 
[컨퍼런스] 모두콘 2018 리뷰
[컨퍼런스] 모두콘 2018 리뷰[컨퍼런스] 모두콘 2018 리뷰
[컨퍼런스] 모두콘 2018 리뷰
Donghyeon Kim
 
Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks
Model-Agnostic Meta-Learning for Fast Adaptation of Deep NetworksModel-Agnostic Meta-Learning for Fast Adaptation of Deep Networks
Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks
Yoonho Lee
 
自然言語処理における深層学習を用いた予測の不確実性 - Predictive Uncertainty in NLP -
自然言語処理における深層学習を用いた予測の不確実性  - Predictive Uncertainty in NLP -自然言語処理における深層学習を用いた予測の不確実性  - Predictive Uncertainty in NLP -
自然言語処理における深層学習を用いた予測の不確実性 - Predictive Uncertainty in NLP -
tmtm otm
 
Generative Adversarial Networks (GANs) - Ian Goodfellow, OpenAI
Generative Adversarial Networks (GANs) - Ian Goodfellow, OpenAIGenerative Adversarial Networks (GANs) - Ian Goodfellow, OpenAI
Generative Adversarial Networks (GANs) - Ian Goodfellow, OpenAI
WithTheBest
 
An introduction to Deep Learning
An introduction to Deep LearningAn introduction to Deep Learning
An introduction to Deep Learning
Julien SIMON
 

What's hot (20)

Hyperparameter Optimization for Machine Learning
Hyperparameter Optimization for Machine LearningHyperparameter Optimization for Machine Learning
Hyperparameter Optimization for Machine Learning
 
Image segmentation with deep learning
Image segmentation with deep learningImage segmentation with deep learning
Image segmentation with deep learning
 
An introduction to Machine Learning
An introduction to Machine LearningAn introduction to Machine Learning
An introduction to Machine Learning
 
AlexNet(ImageNet Classification with Deep Convolutional Neural Networks)
Uncertainty Estimation in Deep Learning

  • 1. Uncertainty in Deep Learning - Christian S. Perone (2019) Uncertainties Bayesian Inference Deep Learning Variational Inference Ensembles Q&A Uncertainty Estimation in Deep Learning A brief introduction Christian S. Perone christian.perone@gmail.com http://blog.christianperone.com
  • 2. Agenda. Uncertainties: Knowing what you don’t know; The problem; Different Uncertainties; Importance of Uncertainty. Bayesian Inference: The frequentist way; The Bayesian inference; MCMC Sampling. Deep Learning: Short intro; Bayesian Neural Networks. Variational Inference: Introduction; Posterior Approximation; Training a BNN; Dropout. Ensembles: Introduction; Deep Ensembles; Randomized Prior Functions; Final Remarks. Q&A.
  • 3. Who Am I: Christian S. Perone. BSc in Computer Science in Brazil (UPF); MSc in Biomedical Eng. in Montreal (Polytechnique/UdeM). Machine Learning / Data Science. Working at Jungle. Blog at blog.christianperone.com. Open-source projects: https://github.com/perone. Twitter: @tarantulae.
  • 4. Section I: Uncertainties
  • 6. Knowing what you don’t know: “It is correct, somebody might say, that (...) Socrates did not know anything; and it was indeed wisdom that they recognized their own lack of knowledge, (...).” (Karl R. Popper, The World of Parmenides) What does this have to do with statistical learning?
  • 9. The problem: Let’s say you trained a model to classify an image as having a lesion or not. [Figure: different MRI contrasts (T2/T1). Source: http://www.msdiscovery.org. 2019.] Later you run prediction on volumes with a different parametrization, anatomy, etc. The problem: you can still get a prediction with high probability, even if your sample is out-of-distribution.
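The overconfidence described on this slide can be reproduced with a toy model. The sketch below is not from the talk: it assumes a 1-D logistic regression (the helper names `sigmoid` and `train_logistic`, the synthetic data, and the query point x = 50 are all hypothetical) and only illustrates that a point far from the training data can still receive a probability close to 1.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def train_logistic(xs, ys, lr=0.1, steps=2000):
    """Fit p(y=1|x) = sigmoid(w*x + b) with plain gradient descent (toy helper)."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        gw = sum((sigmoid(w * x + b) - y) * x for x, y in zip(xs, ys)) / n
        gb = sum((sigmoid(w * x + b) - y) for x, y in zip(xs, ys)) / n
        w -= lr * gw
        b -= lr * gb
    return w, b

random.seed(0)
# Assumed toy training set: class 0 around x = -1, class 1 around x = +1.
xs = [random.gauss(-1.0, 0.5) for _ in range(50)] + \
     [random.gauss(1.0, 0.5) for _ in range(50)]
ys = [0] * 50 + [1] * 50

w, b = train_logistic(xs, ys)

p_in = sigmoid(w * 1.0 + b)    # input inside the training range
p_ood = sigmoid(w * 50.0 + b)  # input far from anything seen in training

print(f"x = 1  (in-distribution):     p(y=1) = {p_in:.4f}")
print(f"x = 50 (out-of-distribution): p(y=1) = {p_ood:.4f}")
```

The predicted probability grows with the distance from the decision boundary, not with the amount of nearby training data, so by itself it says nothing about whether the input resembles what the model was trained on.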
  • 10. The problem: A simple regression problem. Source: Yarin Gal. Uncertainty in Deep Learning. PhD Thesis. 2016.
  • 11. The problem: A simple regression problem. [Figure: regression plot.] Source: Ian Osband et al. Using Randomized Prior Functions for Deep Reinforcement Learning. NIPS 2018. Image from: http://blog.christianperone.com
  • 14. Different Uncertainties: There are two main types of uncertainty, often confused by practitioners, but they are very different quantities. Aleatoric uncertainty: information the data cannot explain, also called data uncertainty or irreducible uncertainty. More data might not reduce it; e.g., increasing measurement precision can reduce it. Epistemic uncertainty: uncertainty in the model itself, also called model uncertainty or reducible uncertainty; e.g., it can be explained away by increasing the training set size.
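One common way to separate the two quantities numerically is the law of total variance applied to a set of models: the average of the per-model noise variances serves as the aleatoric part, and the variance of the per-model mean predictions as the epistemic part. The sketch below is an illustration under stated assumptions, not the method of the slides: it uses a bootstrap ensemble of least-squares lines on synthetic data, and the names `fit_line` and `decompose`, the true slope 2.0, and the noise level 0.5 are all hypothetical.

```python
import random
import statistics

def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b; also returns the residual variance."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    b = my - a * mx
    resid_var = statistics.fmean([(y - (a * x + b)) ** 2 for x, y in zip(xs, ys)])
    return a, b, resid_var

def decompose(n, x_query=0.5, members=100, noise_std=0.5, seed=0):
    """Return (aleatoric, epistemic) variance estimates at x_query for n training points."""
    rng = random.Random(seed)
    # Assumed data-generating process: y = 2x + Gaussian noise.
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    ys = [2.0 * x + rng.gauss(0.0, noise_std) for x in xs]
    means, noise_vars = [], []
    for _ in range(members):
        # Each ensemble member is fitted on a bootstrap resample of the data.
        idx = [rng.randrange(n) for _ in range(n)]
        a, b, rv = fit_line([xs[i] for i in idx], [ys[i] for i in idx])
        means.append(a * x_query + b)
        noise_vars.append(rv)
    aleatoric = statistics.fmean(noise_vars)  # mean of the noise variances
    epistemic = statistics.pvariance(means)   # variance of the mean predictions
    return aleatoric, epistemic

alea_small, epi_small = decompose(n=20)
alea_large, epi_large = decompose(n=2000)
print(f"n = 20:   aleatoric = {alea_small:.4f}, epistemic = {epi_small:.5f}")
print(f"n = 2000: aleatoric = {alea_large:.4f}, epistemic = {epi_large:.5f}")
```

With more training points the epistemic term shrinks, while the aleatoric term stays close to the variance of the noise (0.25 under the assumed noise level), which matches the reducible/irreducible distinction on the slide.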
  • 21. Importance of Uncertainty: Medical imaging (classification, segmentation); Autonomous vehicles (how uncertain is it that this object is a tree?); Active Learning (which sample should be labeled?); Explore/exploit dilemma in reinforcement learning; Out-of-distribution detection; Model understanding/dataset understanding; Nearly all applications!
  • 22. Example in Reinforcement Learning: The explore/exploit dilemma.
  • 23. Example in Reinforcement Learning. Work by Maxime Wabartha et al. (excerpt): "(...) estimated by taking, for each approach, the pointwise average and standard deviation over 50 sampled functions. We expect the empirical posterior predictive distribution to cover the ground truth function. While we succeed to do so using a MSE loss and the proposed approach, we do not manage to obtain diverse functions using solely anchoring neither using dropout; in our experiments, changing the dropout rate did not improve the quality of the obtained uncertainty. Input bootstrapping does produce functions that better span the width of outputs, but it also disregards by nature certain points of the training set, where we expect the uncertainty to be low given our current knowledge. We also provide in the appendix an example of the functions generated by our function approach when fixing X. [Figure panels: Dropout 0.2, Input bootstrapping, Anchoring, Repulsive; legend: ground truth, sample function, standard deviations, training set, reference function.] Figure 1: Comparison of the empirical (over 20 sample functions) posterior predictive distribution for dropout, input bootstrapping, anchoring and repulsive constraint. 3.2 Diverse functions in high-dimensional input space. We apply the method to function approximation in the case of a reinforcement learning problem requiring exploration. More precisely, we showcase how our method can help sample diverse reward functions in a model-based setting. We create a dataset of 43 13x13 frames with the associated reward. We use as function approximator a small CNN outputting a reward for a given frame (see appendix). To illustrate our method, we sample the repulsive points from possible frames, thus directly from the manifold, in or out of the training distribution (see appendix). Figure 2 (rightmost figure) shows how (...)" Source: Maxime Wabartha et al. Sampling diverse neural networks for exploration in reinforcement learning. NIPS 2018.
  • 24. Section II: Bayesian Inference
  • 28. A simple frequentist regression: In a frequentist linear regression, we have a point estimate for the parameters of our model. First, we define our model: $f(x) = \theta_0 + \theta_1 x_1 + \theta_2 x_2 + \ldots = x^{\top} \beta$ (vectorial notation). Later, we define a loss such as the MSE (mean squared error): $L = \frac{1}{n} \sum_{i=1}^{n} (f(x_i) - y_i)^2$. Finally, we optimize it: $\hat{\theta} = \arg\min_{\theta} L(f(x), y)$. For a maximum likelihood derivation, take a look at http://blog.christianperone.com/2019/01/mle/.
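The three steps on this slide (model, MSE loss, optimization) can be sketched in a few lines of NumPy. This is a minimal illustration on synthetic data, not the code behind the slides: the data-generating values (intercept 1, slope 2, noise 0.5) and all variable names are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (assumed for illustration): y = 1 + 2x + Gaussian noise.
n = 200
x = rng.uniform(0, 1, n)
y = 1.0 + 2.0 * x + rng.normal(0, 0.5, n)

# Model in vectorial notation: f(x) = X @ theta, with a column of ones for theta_0.
X = np.column_stack([np.ones(n), x])

# Optimize: the least-squares solution minimizes the MSE in closed form.
theta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

# Loss at the optimum: mean squared error.
mse = np.mean((X @ theta_hat - y) ** 2)
print("point estimate:", theta_hat, "MSE:", mse)
```

The result is a single vector `theta_hat`: a point estimate with no statement about how uncertain it is.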
  • 29. A simple frequentist regression. [Figure: "Frequentist regression", y versus x; legend: sample data, regression line.]
  • 33. The Bayesian way: Bayesian approaches represent the uncertainty using a distribution over parameters. Instead of a point estimate, we have an entire posterior. To formulate our Bayesian regression, we first select a likelihood; After that, we select priors over the parameters; Then we compute, or approximate by sampling, the posterior given our model and data.
  • 36. Prior, likelihood and posterior. [Figure: three panels, Prior, Data and Posterior, each showing credibility over the values 1, 2, 3.]
  • 39. Prior, likelihood and posterior: $\underbrace{p(\theta \mid X)}_{\text{posterior}} \propto \underbrace{p(X \mid \theta)}_{\text{likelihood}} \; \underbrace{\pi(\theta)}_{\text{prior}}$
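The proportionality between posterior, likelihood and prior can be made concrete with a grid approximation. The sketch below assumes a hypothetical experiment (6 successes in 9 Bernoulli trials) and a flat prior; neither comes from the slides, they are only there to have numbers to multiply.

```python
import numpy as np

# Grid of candidate parameter values theta in [0, 1].
theta = np.linspace(0.0, 1.0, 1001)

# Flat prior pi(theta) (assumption).
prior = np.ones_like(theta)

# Hypothetical data: k successes out of n trials; binomial likelihood up to a constant.
k, n = 6, 9
likelihood = theta**k * (1.0 - theta) ** (n - k)

# Posterior is proportional to likelihood times prior; normalizing over the
# grid plays the role of dividing by the evidence p(X).
unnormalized = likelihood * prior
posterior = unnormalized / unnormalized.sum()

post_mean = np.sum(theta * posterior)
post_mode = theta[np.argmax(posterior)]
print("posterior mean:", post_mean, "posterior mode:", post_mode)
```

With a flat prior the exact posterior is Beta(7, 4), so the grid values can be checked against its mean 7/11 and mode 2/3.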
  • 40. [Figure: three rows, each showing prior × likelihood ∝ posterior, plotted over the interval 0 to 1.] Source: Statistical Rethinking/Winter 2019. Richard McElreath.
  • 43. Bayesian regression: Let’s reformulate our regression. We will use a simple Gaussian distribution for our observations, defined as: $Y \sim \mathcal{N}(\mu, \sigma^2)$. We plug our linear model in for $\mu$: $Y \sim \mathcal{N}(\alpha + \beta x, \sigma^2)$. And define the priors: $\alpha \sim \mathcal{N}(0, 20)$, $\beta \sim \mathcal{N}(0, 20)$, $\sigma \sim \mathcal{U}(0, 5)$.
  • 44. Bayesian Regression in Plate Notation: You can represent the same model, shown below, with plate notation: $Y \sim \mathcal{N}(\alpha + \beta x, \sigma^2)$, $\alpha \sim \mathcal{N}(0, 20)$, $\beta \sim \mathcal{N}(0, 20)$, $\sigma \sim \mathcal{U}(0, 5)$.
  • 46. MCMC Sampling: Let’s see a demo of a Markov chain Monte Carlo (MCMC) sampler. Source: MCMC Demos, by Chi Feng
  • 47. MCMC Sampling. [Figure: trace plot with posterior histograms (frequency) and sampled values over 4,000 draws for Intercept, x and sigma.] Trace plot generated using PyMC3; you can also use ArviZ.
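To show what such a sampler does under the hood, here is a minimal random-walk Metropolis sketch for the Gaussian regression model with priors on α, β and σ. It is not the PyMC3 code used for the trace plot: the synthetic data, the proposal scales, the chain length and the reading of N(0, 20) as a standard deviation of 20 are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic data (assumed for illustration): y = 1 + 2x + Gaussian noise.
n = 100
x = rng.uniform(0, 1, n)
y = 1.0 + 2.0 * x + rng.normal(0, 0.5, n)

def log_posterior(params):
    """Unnormalized log posterior of (alpha, beta, sigma)."""
    alpha, beta, sigma = params
    if not (0.0 < sigma < 5.0):        # sigma ~ U(0, 5)
        return -np.inf
    # alpha, beta ~ N(0, 20); 20 is treated as a standard deviation (assumption).
    log_prior = -0.5 * (alpha / 20.0) ** 2 - 0.5 * (beta / 20.0) ** 2
    resid = y - (alpha + beta * x)
    log_lik = -n * np.log(sigma) - 0.5 * np.sum(resid**2) / sigma**2
    return log_prior + log_lik

n_steps, burn_in = 20_000, 5_000
step = np.array([0.05, 0.08, 0.02])    # hand-picked proposal scales
current = np.array([0.0, 0.0, 1.0])
current_lp = log_posterior(current)
samples = np.empty((n_steps, 3))

for t in range(n_steps):
    proposal = current + step * rng.normal(size=3)
    proposal_lp = log_posterior(proposal)
    # Accept with probability min(1, posterior(proposal) / posterior(current)).
    if np.log(rng.uniform()) < proposal_lp - current_lp:
        current, current_lp = proposal, proposal_lp
    samples[t] = current

posterior = samples[burn_in:]          # discard the warm-up part of the chain
post_mean = posterior.mean(axis=0)
post_std = posterior.std(axis=0)
print("posterior mean (alpha, beta, sigma):", post_mean)
print("posterior std  (alpha, beta, sigma):", post_std)
```

Only the unnormalized posterior is ever evaluated, which is why MCMC avoids computing the evidence p(X). Production samplers such as NUTS are far more efficient than this random walk.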
  • 48. Bayesian regression. [Figure: "Posterior predictive regression lines", y versus x; legend: sample data, posterior predictive regression lines.]
  • 54. Bayesian methods: Bayesian methods can give us a full posterior to reason about; Explicit priors; Uncertainty; They’re on the side of algorithms, not models¹; However, the posterior is intractable for many practical cases and large datasets: $p(\theta \mid X) = \frac{p(X \mid \theta)\,\pi(\theta)}{p(X)}$; Tuning and using MCMC algorithms can be tricky. ¹ Zoubin Ghahramani, History of Bayesian Neural Networks, NIPS 2016
  • 55. Section III: Deep Learning
  • 61. Deep Learning: It’s not a secret that Deep Learning reached an important milestone in Machine Learning: Non-linear function approximators; They can scale to large datasets (thanks to stochastic approximation); They are state-of-the-art for NLP, computer vision, speech, etc.; Very expressive and flexible; Representation learning;
  • 62. One-slide Intro to Deep Learning. [Figure: a multi-layer perceptron (MLP) network overview, with an input layer $x_0, \ldots, x_D$, hidden layers $1$ to $L$ with units $y^{(l)}_0, \ldots, y^{(l)}_{m^{(l)}}$, and an output layer $y^{(L+1)}_1, \ldots, y^{(L+1)}_C$.] Source: David Stutz, 2018, BSD 3-Clause License. Parametrized models with composition of functions; Trained using backpropagation and SGD; Learned usually by maximizing the log likelihood;
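As a rough sketch of "parametrized models with composition of functions", the snippet below builds a small MLP forward pass and evaluates the negative log likelihood that training would minimize. The layer sizes, the random data and the helper names (`init_layer`, `forward`, `negative_log_likelihood`) are assumptions for illustration; the backpropagation and SGD steps are left out.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_layer(n_in, n_out):
    """Weights and biases of one fully connected layer."""
    return rng.normal(0, 1 / np.sqrt(n_in), (n_in, n_out)), np.zeros(n_out)

def forward(params, x):
    """Composition of layers: affine map + ReLU, with a softmax on the output."""
    h = x
    for W, b in params[:-1]:
        h = np.maximum(0.0, h @ W + b)
    W, b = params[-1]
    logits = h @ W + b
    logits = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    exp_logits = np.exp(logits)
    return exp_logits / exp_logits.sum(axis=1, keepdims=True)

def negative_log_likelihood(probs, labels):
    """Average negative log probability assigned to the true classes."""
    return -np.mean(np.log(probs[np.arange(len(labels)), labels]))

sizes = [4, 16, 16, 3]                 # input, two hidden layers, C = 3 classes
params = [init_layer(a, b) for a, b in zip(sizes[:-1], sizes[1:])]

x_batch = rng.normal(size=(8, 4))      # random inputs (assumed)
labels = rng.integers(0, 3, size=8)    # random labels (assumed)

probs = forward(params, x_batch)
nll = negative_log_likelihood(probs, labels)
print("class probabilities shape:", probs.shape, "NLL:", nll)
```

Maximizing the log likelihood is the same as minimizing `nll`. The weights are point estimates, which is the gap Bayesian neural networks try to close.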
  • 64. Bayesian Neural Networks: A Bayesian Neural Network (BNN) is a Neural Network with distributions over parameters². Figure source: Weight Uncertainty in Neural Networks. Charles Blundell et al. 2015. ² Neal, Radford M. (2012). Bayesian learning for neural networks.
  • 70. Bayesian Neural Networks: In modern Deep Neural Networks, however, we have some challenges: A lot of data; High-dimensional data; Millions of parameters; Highly non-convex surfaces; This makes exact Bayesian inference very difficult for these models, so an approximation is required: Variational Inference (variational Bayes)
  • 71. Section IV: Variational Inference
  • 77. Variational Inference: Variational Inference (VI) is often used as an alternative to MCMC; Can be used to approximate the posterior of Bayesian models; Faster than MCMC for complex models and larger datasets; Shift from sampling to optimization; Fewer guarantees than MCMC: it can only find a density close to the target; For a modern in-depth review please refer to: Variational Inference: A Review for Statisticians. Blei, D. M. et al (2018).
  • 80. Variational Inference: We have a very complex posterior distribution $p(w \mid D)$ that we want to approximate ($w$ are the parameters, and $D$ is the data); We do this approximation by using an "easier" distribution $q(w \mid \theta)$ (also called the variational distribution, where $\theta$ are the variational parameters); [Figure: variational approximation (green).] Source: Eric Jang, 2016. https://blog.evjang.com
  • 82. Posterior approximation: If we want to approximate $p(w \mid D)$ with $q(w \mid \theta)$, we need a measure of "closeness"; We use Kullback-Leibler (KL) divergence. Figure source: Flawnson Tong, https://towardsdatascience.com
  • 86. Posterior approximation: We use Kullback-Leibler (KL) divergence: $\theta^{*} = \arg\min_{\theta} \mathrm{KL}[q(w \mid \theta) \,\|\, p(w \mid D)]$, that is, $\theta^{*} = \arg\min_{\theta} \mathbb{E}_{q(w \mid \theta)}\big[\underbrace{\log q(w \mid \theta)}_{\text{variational posterior}} - \underbrace{\log p(w)}_{\text{prior}} - \underbrace{\log p(D \mid w)}_{\text{log likelihood}}\big]$. Why KL-divergence? Because it allows us to derive a cost that is tractable to optimize. Not without paying a price though.
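The step from the KL divergence to the three-term cost follows from Bayes' rule, $p(w \mid D) = p(D \mid w)\,p(w) / p(D)$. The derivation below is the standard one, with the expectation over $q$ written out explicitly:

```latex
\begin{aligned}
\mathrm{KL}[q(w \mid \theta) \,\|\, p(w \mid D)]
  &= \mathbb{E}_{q(w \mid \theta)}\left[\log q(w \mid \theta) - \log p(w \mid D)\right] \\
  &= \mathbb{E}_{q(w \mid \theta)}\left[\log q(w \mid \theta) - \log p(w) - \log p(D \mid w)\right] + \log p(D)
\end{aligned}
```

The evidence term $\log p(D)$ does not depend on $\theta$, so it drops out of the $\arg\min$. This is what makes the cost tractable: the intractable normalizer never has to be computed.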
  • 87. Forward and Reverse KL: Forms of the KL-divergence. Source: Pattern Recognition and Machine Learning. Christopher M. Bishop. 2006. (a) forward KL-divergence, (b) and (c) reverse KL-divergence.
  • 88. Forward KL. Source: Colin Raffel, https://colinraffel.com
  • 89. Forward KL (misspecification). Source: Colin Raffel, https://colinraffel.com
  • 90. Reverse KL. Source: Colin Raffel, https://colinraffel.com
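The different behaviour of the two KL forms can be reproduced numerically. The sketch below fits a single Gaussian q to a bimodal target p by brute-force grid search, once under the forward KL[p || q] and once under the reverse KL[q || p]. The target (two modes at ±3 with standard deviation 0.7) and the search grids are assumptions chosen for illustration; they are not taken from the figures on the slides.

```python
import numpy as np

# Integration grid for the 1-D densities.
x = np.linspace(-12.0, 12.0, 4801)
dx = x[1] - x[0]

def log_normal(x, mean, std):
    """Log density of a Gaussian, kept in log space to avoid log(0)."""
    return -0.5 * ((x - mean) / std) ** 2 - np.log(std * np.sqrt(2.0 * np.pi))

# Target p: equal-weight mixture of two Gaussians (assumed for illustration).
log_p = np.logaddexp(log_normal(x, -3.0, 0.7), log_normal(x, 3.0, 0.7)) + np.log(0.5)

def kl(log_a, log_b):
    """KL[a || b] by numerical integration on the grid."""
    return np.sum(np.exp(log_a) * (log_a - log_b)) * dx

means = np.linspace(-4.0, 4.0, 81)
stds = np.linspace(0.3, 4.0, 75)

best_fwd = best_rev = None
for m in means:
    for s in stds:
        log_q = log_normal(x, m, s)
        fwd = kl(log_p, log_q)     # forward KL[p || q]
        rev = kl(log_q, log_p)     # reverse KL[q || p], the form used in VI
        if best_fwd is None or fwd < best_fwd[0]:
            best_fwd = (fwd, m, s)
        if best_rev is None or rev < best_rev[0]:
            best_rev = (rev, m, s)

print("forward KL fit: mean=%.2f std=%.2f" % best_fwd[1:])
print("reverse KL fit: mean=%.2f std=%.2f" % best_rev[1:])
```

The forward fit spreads over both modes (mass-covering), while the reverse fit locks onto a single mode with a narrow width (mode-seeking). Since the target is symmetric, which of the two modes the reverse fit picks is arbitrary.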
  • 92. Quality of the uncertainty estimation: Mean-field variational Bayes (MFVB) approximation. Source: Variational Bayes and beyond: Bayesian inference for big data. Tamara Broderick. ICML 2018. Can underestimate variance severely; When compared to MCMC, means are usually fine, but the variance can be far off;
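The variance underestimation has a closed form in the simplest case. For a Gaussian target with precision matrix Λ, the factorized (mean-field) Gaussian that minimizes the reverse KL keeps the correct means but has variances 1/Λᵢᵢ, a standard textbook result. The sketch below evaluates this for a hypothetical 2-D target with correlation 0.9; the covariance values are an assumption for illustration.

```python
import numpy as np

# Hypothetical correlated Gaussian target (correlation 0.9).
cov = np.array([[1.0, 0.9],
                [0.9, 1.0]])
precision = np.linalg.inv(cov)

# True marginal variances of the target.
true_marginal_var = np.diag(cov)

# Variances of the optimal mean-field Gaussian under reverse KL: 1 / Lambda_ii.
mean_field_var = 1.0 / np.diag(precision)

print("true marginal variances:", true_marginal_var)
print("mean-field variances:   ", mean_field_var)
```

Here the mean-field variances come out at 0.19 against a true marginal variance of 1.0, and the gap widens as the correlation grows.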
  • 100. Training a Bayesian Neural Network: The training loop for a Bayesian Neural Network (BNN) using Variational Inference is shown below: Sample the parameters of the network from $q(w \mid \theta)$, with two variational parameters for each weight in $q$: $\mu$ and $\sigma$; Parametrize the network with the sampled parameters, often using the reparametrization trick; Forward pass with the data batch; Calculate the combined loss: variational posterior, prior and log likelihood; Compute gradients by backpropagation and optimize with SGD; Repeat; Prediction: multiple forward passes. This method is also called Bayes by Backprop.
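The loop on this slide can be sketched end to end on the smallest possible "network": a linear model with two weights, where the gradients can be written by hand instead of using an autograd framework. This is an illustrative sketch rather than the reference implementation of Bayes by Backprop. The synthetic data, the known noise level, the prior scale, the learning rate and the use of the full dataset as the batch are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (assumed): y = 1 + 2x + noise, with a known noise std.
n, noise_std, prior_std = 50, 0.5, 10.0
x = rng.uniform(0, 1, n)
y = 1.0 + 2.0 * x + rng.normal(0, noise_std, n)
X = np.column_stack([np.ones(n), x])      # weights w = (intercept, slope)

def softplus(r):
    return np.log1p(np.exp(r))

# Variational parameters: q(w | theta) = N(mu, softplus(rho)^2), one pair per weight.
mu = np.zeros(2)
rho = np.full(2, -3.0)
lr = 5e-4

for step in range(8000):
    sigma = softplus(rho)
    eps = rng.normal(size=2)
    w = mu + sigma * eps                   # sample via the reparametrization trick
    # Gradient of  log q(w|theta) - log p(w) - log p(D|w)  with respect to w,
    # from the Gaussian prior and Gaussian likelihood (forward pass on all data).
    grad_w = w / prior_std**2 - X.T @ (y - X @ w) / noise_std**2
    # Chain rule through w = mu + sigma * eps; log q contributes -1/sigma.
    grad_mu = grad_w
    grad_sigma = grad_w * eps - 1.0 / sigma
    grad_rho = grad_sigma / (1.0 + np.exp(-rho))   # d softplus / d rho = sigmoid(rho)
    mu -= lr * grad_mu                     # plain SGD updates
    rho -= lr * grad_rho

sigma = softplus(rho)

# Prediction: multiple forward passes, each with freshly sampled weights.
x_new = np.array([1.0, 0.5])               # features (1, x) for x = 0.5
w_samples = mu + sigma * rng.normal(size=(1000, 2))
preds = w_samples @ x_new
pred_mean, pred_std = preds.mean(), preds.std()

# Exact posterior mean for this conjugate model, for comparison only.
precision = np.eye(2) / prior_std**2 + X.T @ X / noise_std**2
exact_mean = np.linalg.solve(precision, X.T @ y / noise_std**2)
print("variational mean:", mu, "exact posterior mean:", exact_mean)
print("prediction at x=0.5: %.3f +/- %.3f" % (pred_mean, pred_std))
```

For this linear-Gaussian case the variational mean should land close to the exact posterior mean. In a real BNN the same loop runs with an autograd library and mini-batches, and no exact posterior is available to compare against.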
  • 101. Quality of the uncertainty estimation
    [Figure] HMC vs VI. Source: Bayesian Inference with Anchored Ensembles of Neural Networks, and Application to Exploration in Reinforcement Learning. Tim Pearce. 2018.
    For more information about the variational approach, please refer to: Weight Uncertainty in Neural Networks. C. Blundell, et al. 2015.
  • 102. Dropout as a Bayesian Approximation
    [Figure] Dropout. Source: Dropout: A Simple Way to Prevent Neural Networks from Overfitting. Nitish Srivastava, et al. 2014.
  • 106. Dropout as a Bayesian Approximation
    In 2015, in the work Dropout as a Bayesian Approximation: Insights and Applications, Yarin Gal et al. found a relationship between Dropout and Bayesian approximation;
    It turns out that to do Bernoulli approximate variational inference in Bayesian NNs, you can just use dropout during training and keep it active at prediction time as well;
    Quite appealing due to its simplicity, and it also provides an interesting interpretation of dropout;
    This technique is called "MC Dropout" or "Monte Carlo Dropout".
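    The prediction side of MC Dropout can be sketched as follows. This is only an illustration under stated assumptions: the tiny one-hidden-layer ReLU network, its hand-picked weights in `params`, and the function names are hypothetical, and training is skipped entirely. The point is that dropout stays active at prediction time, and the spread of several stochastic forward passes is read as uncertainty.

```python
import math
import random


def forward_with_dropout(x, w1, b1, w2, b2, p_drop, rng):
    """One stochastic forward pass of a 1-hidden-layer ReLU network."""
    hidden = [max(0.0, w * x + b) for w, b in zip(w1, b1)]
    keep = 1.0 - p_drop
    # Inverted dropout: drop each unit with probability p_drop, rescale the rest
    dropped = [h / keep if rng.random() < keep else 0.0 for h in hidden]
    return sum(w * h for w, h in zip(w2, dropped)) + b2


def mc_dropout_predict(x, params, p_drop=0.5, n_passes=200, seed=0):
    """Mean and std. dev. over several forward passes with dropout kept on."""
    rng = random.Random(seed)
    outs = [forward_with_dropout(x, *params, p_drop, rng)
            for _ in range(n_passes)]
    mean = sum(outs) / len(outs)
    std = math.sqrt(sum((o - mean) ** 2 for o in outs) / len(outs))
    return mean, std


# Hypothetical "already trained" weights: (w1, b1, w2, b2)
params = ([1.0, -0.5, 0.8, 0.3], [0.0, 0.5, -0.2, 0.1],
          [0.5, -1.0, 0.7, 0.2], 0.1)

mean, std = mc_dropout_predict(1.0, params)
print(mean, std)
```

    With `p_drop=0.0` every pass returns the same value, so the reported standard deviation collapses to zero; the uncertainty estimate comes entirely from the dropout masks.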
  • 107. MC Dropout on a Regression Setting
    Some results from MC Dropout on a regression setting:
    [Figure] MC Dropout. Source: Dropout as a Bayesian Approximation: Insights and Applications. Yarin Gal et al. ICML 2015.
  • 108. MC Dropout on a Classification Setting
    Some results from MC Dropout on a classification setting:
    [Figure] MC Dropout. Source: Dropout as a Bayesian Approximation: Insights and Applications. Yarin Gal et al. ICML 2015.
  • 109. Criticism of MC Dropout
    Some results from MC Dropout on a regression setting:
    [Figure] MC Dropout with varying number of data points. Gray regions are 1 std. dev. above and below. Source: Randomized Prior Functions for Deep Reinforcement Learning. Ian Osband et al. 2018.
    It was shown that MC Dropout didn’t pass a simple sanity check in a linear setting, as its uncertainty didn’t concentrate with more data.
  • 110. Section V Ensembles
  • 111. Ensembles
    Use multiple hypotheses to learn a better one;
    We can see dropout as an ensemble, but with shared weights;
    The ensemble variance can be interpreted as uncertainty;
    Simple intuition why it works.
    [Diagram: Input Data → Model #1, Model #2, Model #3 → Combine predictions]
  • 117. Deep Ensembles
    In the work Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles (Lakshminarayanan B., et al. NIPS 2017), the authors proposed a very simple method to compute uncertainty with ensembles:
    Setting: you have M models, with independent parameters θ1, θ2, ..., θM.
    1) Initialize parameters θ1, θ2, ..., θM randomly;
    2) Train each network m ∈ {1, ..., M} with weights θm individually;
    3) Optionally add adversarial training;
    4) Combine the predictions by averaging the prediction from each network:
       $p(y \mid x) = M^{-1} \sum_{m=1}^{M} p_{\theta_m}(y \mid x, \theta_m)$
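    Step 4 (the combination rule) is just a uniform average of the per-model predictive distributions. The sketch below assumes a classification setting where each of the M networks has already produced class probabilities for one input; the numbers in `member_probs` and the function name `combine_predictions` are made up for illustration, and the training of the members (steps 1 to 3) is not shown.

```python
def combine_predictions(member_probs):
    """Average class probabilities over M ensemble members.

    member_probs: list of M lists, each a probability vector p_m(y | x).
    Returns (mean probabilities, per-class variance across members).
    """
    m = len(member_probs)
    n_classes = len(member_probs[0])
    mean = [sum(p[c] for p in member_probs) / m for c in range(n_classes)]
    var = [sum((p[c] - mean[c]) ** 2 for p in member_probs) / m
           for c in range(n_classes)]
    return mean, var


# Hypothetical outputs of M = 3 independently trained networks for one input
member_probs = [
    [0.7, 0.2, 0.1],
    [0.6, 0.3, 0.1],
    [0.8, 0.1, 0.1],
]

combined, disagreement = combine_predictions(member_probs)
print(combined)       # averaged predictive distribution
print(disagreement)   # members agree on class 3, disagree on classes 1 and 2
```

    The averaged vector is still a valid probability distribution, and the per-class variance gives a rough view of where the members disagree.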
  • 118. Evaluating Entropy on Classification
    [Figure] Plot of the binary entropy function H(p). A measure of the uncertainty.
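    For reference, the binary entropy function can be computed directly. The slide does not state the logarithm base; the sketch below assumes base 2 (entropy in bits), so that H(0.5) = 1, and uses the usual convention 0·log 0 = 0. The function name `binary_entropy` is illustrative.

```python
import math


def binary_entropy(p):
    """H(p) = -p*log2(p) - (1-p)*log2(1-p), with 0*log(0) taken as 0."""
    if p < 0.0 or p > 1.0:
        raise ValueError("p must be in [0, 1]")
    total = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:
            total -= q * math.log2(q)
    return total


for p in (0.0, 0.1, 0.5, 0.9, 1.0):
    print(p, binary_entropy(p))
```

    The function peaks at p = 0.5, where the prediction is least certain, and falls to zero at p = 0 and p = 1.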
  • 119. Evaluating Entropy on Classification
    [Figure: two histogram panels, "Known classes" and "Unknown classes"; x-axis: entropy values.]
    ImageNet trained only on dogs. Histogram of the predictive entropy on test examples from known classes (dogs) and unknown classes (non-dogs) with varying ensemble size. Source: Lakshminarayanan B., et al. NIPS 2017.
  • 120. Evaluating Entropy on Classification
    [Figure: two rows of histogram panels, "Ensemble", "Ensemble + R", "Ensemble + AT" and "MC dropout"; x-axis: entropy values.]
    Histogram of the predictive entropy on test examples from known classes from SVHN (top row) and unknown classes from CIFAR-10 (bottom row). Source: Lakshminarayanan B., et al. NIPS 2017.
  • 124. Randomized Priors
    In Randomized Prior Functions for Deep Reinforcement Learning. Ian Osband et al. 2018:
    Very simple and elegant modification of the ensemble method for uncertainty;
    Developed in the Reinforcement Learning context;
    Addresses the problem of how to inject a prior into ensemble-based approaches to uncertainty;
    For the case of a linear Gaussian model, it is equivalent to exact Bayesian inference.
  • 128. Bootstrap
    [Diagram: Population → Sample #1, Sample #2, Sample #3 → a statistic computed on each sample (q1, q2, q3) → Bootstrap Statistic Distribution]
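    The bootstrap procedure in the diagram can be sketched with the standard library. The data values, the choice of the mean as the statistic, and the helper name `bootstrap_distribution` are all assumptions for illustration; each resample draws as many points as the original sample, with replacement.

```python
import random
import statistics


def bootstrap_distribution(sample, statistic, n_resamples=1000, seed=0):
    """Compute `statistic` on n_resamples resamples drawn with replacement."""
    rng = random.Random(seed)
    n = len(sample)
    return [statistic(rng.choices(sample, k=n)) for _ in range(n_resamples)]


# Hypothetical observed sample
sample = [2.1, 3.4, 1.9, 5.0, 4.2, 3.3, 2.8, 4.9, 3.7, 3.0]

boot_means = bootstrap_distribution(sample, statistics.mean)

# Percentile interval (2.5% and 97.5%) of the bootstrap distribution
ordered = sorted(boot_means)
lo, hi = ordered[25], ordered[974]
print(statistics.mean(sample), lo, hi)
```

    The spread of `boot_means` is the "bootstrap statistic distribution" from the slide; ensemble methods reuse the same idea by training each member on a different resample of the data.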
  • 130. Randomized Prior Functions
    The key insight is to add a randomized (but fixed) prior and bootstrapped data:
    for k = 1, ..., K do:
        Initialize θk ∼ random;
        Form Dk with bootstrap;
        Sample prior function pk ∼ P;
        Optimize L(fθ + λpk; Dk);
    return posterior ensemble {fθk + pk}, k = 1, ..., K
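    The loop above can be sketched on a 1D regression toy problem. This is an illustration under several assumptions, not the setup from Osband et al.: the trainable part fθ is a small radial-basis-function model rather than a neural network, the prior functions are random sinusoids rather than randomly initialized networks, the prior scale λ is applied both in training and in the returned members, and all names (`features`, `make_prior`, `train_member`, `ensemble_stats`) are hypothetical.

```python
import math
import random

CENTERS = [-3.0 + 0.5 * j for j in range(13)]   # RBF centres on [-3, 3]
WIDTH = 0.5


def features(x):
    return [math.exp(-((x - c) ** 2) / (2 * WIDTH ** 2)) for c in CENTERS]


def make_prior(rng):
    """Sample a fixed, untrainable prior function p_k (a random sinusoid)."""
    omega = rng.uniform(0.5, 2.0)
    phase = rng.uniform(0.0, 2.0 * math.pi)
    return lambda x: math.sin(omega * x + phase)


def train_member(data, prior, lam, rng, lr=0.1, steps=600):
    """Fit f_theta so that f_theta + lam * prior matches the data (squared loss)."""
    theta = [rng.gauss(0.0, 0.1) for _ in CENTERS]          # random init
    # The prior is fixed, so it can be folded into the regression targets
    rows = [(features(x), y - lam * prior(x)) for x, y in data]
    n = len(rows)
    for _ in range(steps):
        grad = [0.0] * len(theta)
        for phi, target in rows:
            err = sum(t * p for t, p in zip(theta, phi)) - target
            for j, p in enumerate(phi):
                grad[j] += 2.0 * err * p / n
        theta = [t - lr * g for t, g in zip(theta, grad)]
    return lambda x: sum(t * p for t, p in zip(theta, features(x))) + lam * prior(x)


def ensemble_stats(members, x):
    preds = [m(x) for m in members]
    mean = sum(preds) / len(preds)
    std = math.sqrt(sum((p - mean) ** 2 for p in preds) / len(preds))
    return mean, std


rng = random.Random(0)
# Training data only covers x in [-1, 1]
data = [(x, math.sin(2 * x) + rng.gauss(0.0, 0.1))
        for x in (-1.0 + 0.1 * i for i in range(21))]

K, LAM = 10, 1.0
ensemble = []
for k in range(K):
    boot = rng.choices(data, k=len(data))    # D_k: bootstrap resample
    prior = make_prior(rng)                  # p_k ~ P
    ensemble.append(train_member(boot, prior, LAM, rng))

_, std_near = ensemble_stats(ensemble, 0.0)  # inside the data range
_, std_far = ensemble_stats(ensemble, 3.0)   # far from any data
print(std_near, std_far)
```

    Inside the data range the members are pulled towards the observations, while far from the data each member falls back to its own prior function, so the ensemble spread grows where nothing has been observed.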
  • 132. Qualitative Inspection
    Some pathological cases:
    [Figure] Posterior predictive distributions for 1D regression with a (20, 20)-MLP and ReLUs. Source: Randomized Prior Functions for Deep Reinforcement Learning. Ian Osband et al. 2018.
    “(...) If an agent has only ever observed zero reward, then no amount of bootstrapping or ensembling will cause it to simulate positive rewards. (...)” – Randomized Prior Functions for Deep Reinforcement Learning. Ian
  • 133. Predictive Uncertainty
    [Figure] Source: Ian Osband et al. Using Randomized Prior Functions for Deep Reinforcement Learning. NIPS 2018. Image from: http://blog.christianperone.com
  • 134. Posterior Samples
    [Figure] Source: Ian Osband et al. Using Randomized Prior Functions for Deep Reinforcement Learning. NIPS 2018. Image from: http://blog.christianperone.com
  • 135. Prior Samples
    [Figure] Source: Ian Osband et al. Using Randomized Prior Functions for Deep Reinforcement Learning. NIPS 2018. Image from: http://blog.christianperone.com
  • 141. Final Remarks
    Many methods, no standardized evaluation, no ground truth for model uncertainty;
    Performance (CPU/GPU resources) penalty for basically all methods;
    No scalable solution for MCMC (yet);
    Choice depends on application;
    Always take into consideration the trade-off of guarantees;
    Significant evolution of methods, frameworks and hardware.
  • 142. Learning More - I
    Statistical Rethinking (excellent book and course), by Richard McElreath. https://xcelab.net/rm/statistical-rethinking/
    Variational Inference: A Review, by David M. Blei, et al. https://arxiv.org/abs/1601.00670
    Scalable Bayesian Inference, by David Dunson. NIPS 2018 Talk. https://www.youtube.com/watch?v=0HXpnG_WnlI
    Variational Bayes and Beyond, by Tamara Broderick. ICML 2018 Tutorial. https://www.youtube.com/watch?v=Moo4-KR5qNg
    History of Bayesian Neural Networks, by Zoubin Ghahramani. NIPS 2016 Keynote talk. https://www.youtube.com/watch?v=FD8l2vPU5FY
  • 143. Learning More - II
    Uncertainty in Deep Learning, Slides, by Roberto Silveira. http://tiny.cc/c77n9y
    A Beginner’s Guide to Variational Methods, by Eric Jang. https://blog.evjang.com/2016/08/variational-bayes.html
    Uncertainty in Deep Learning, Thesis, by Yarin Gal. http://mlg.eng.cam.ac.uk/yarin/thesis/thesis.pdf
    PyMC3, Framework, by PyMC3 developers. https://docs.pymc.io/
    Pyro, Framework, by Pyro developers. http://pyro.ai/
    TensorFlow Probability, Framework, by TensorFlow developers. https://www.tensorflow.org/probability
  • 144. Section VI Q&A
  • 145. Q&A
    Hope you liked it! Questions?