All of Statistics Chapter 13.
Statistical Decision Theory
Sangwoo Mo
KAIST Algorithmic Intelligence Lab.
August 31, 2017
Table of Contents
1 Decision Theory: How to choose a ‘good’ estimator?
2 Computing Bayes Estimators
3 Computing Minimax Estimators
4 Maximum Likelihood Estimators: A good approximator
5 Admissibility: What are the ‘best’ estimators?
Statistical Decision Theory
We learned several point estimators (e.g. sample mean, MLE, MoM).
How do we choose among them? ⇒ Decision theory!
First, we will define loss and risk to evaluate estimators.
Definition
Loss: L(θ, θ̂) : Θ × Θ → ℝ measures the discrepancy between θ and θ̂.
Example
L(θ, θ̂) = (θ − θ̂)² squared error loss,
L(θ, θ̂) = |θ − θ̂| absolute error loss,
L(θ, θ̂) = 0 if θ = θ̂ and 1 if θ ≠ θ̂ zero-one loss,
L(θ, θ̂) = ∫ log( f(x; θ) / f(x; θ̂) ) f(x; θ) dx Kullback–Leibler loss.
Definition
Risk: R(θ, θ̂) = E_θ[L(θ, θ̂)] = ∫ L(θ, θ̂(x)) f(x; θ) dx.
We want to choose the estimator with the smallest risk.
But how can we compare risk functions?
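To make the risk definition concrete, here is a minimal Python sketch (an illustration added here, not from the slides; the N(θ, 1) model, sample size, and function names are arbitrary choices) that estimates R(θ, θ̂) by Monte Carlo:

import numpy as np

def monte_carlo_risk(theta, estimator, loss, n=20, trials=100_000, seed=0):
    # R(theta, theta_hat) = E_theta[ L(theta, theta_hat(X)) ], estimated by simulation
    rng = np.random.default_rng(seed)
    x = rng.normal(theta, 1.0, size=(trials, n))      # X_1, ..., X_n ~ N(theta, 1)
    return loss(theta, estimator(x)).mean()

squared_error = lambda t, t_hat: (t - t_hat) ** 2
sample_mean = lambda x: x.mean(axis=1)
print(monte_carlo_risk(0.5, sample_mean, squared_error))   # ~ 1/n = 0.05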
Measure for Risk Comparison
How can we compare risk functions?
⇒ We need a one-number summary of the risk function!
Definition
Bayes risk: r(π, θ̂) = ∫ R(θ, θ̂) π(θ) dθ, where π(θ) is a prior for θ.
Maximum risk: R̄(θ̂) = sup_θ R(θ, θ̂).
Definition
Bayes estimator: θ̂ s.t. r(π, θ̂) = inf_θ̃ r(π, θ̃).
Minimax estimator: θ̂ s.t. sup_θ R(θ, θ̂) = inf_θ̃ sup_θ R(θ, θ̃).
Note that the minimax estimator is the most conservative choice.
With a well-chosen prior, the Bayes estimator generally works better.
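As a numeric illustration (added here; the Bernoulli model and n = 25 are arbitrary), the sketch below reduces the exact risk curve of X̄ under squared error, R(p, X̄) = p(1 − p)/n, to the two one-number summaries:

import numpy as np

n = 25
grid = np.linspace(0.0, 1.0, 10_001)        # grid over Theta = [0, 1]
risk_vals = grid * (1 - grid) / n           # exact risk of X-bar: Var_p(X-bar)
bayes_risk = risk_vals.mean()               # ∫ R(p, p_hat) dp under the uniform prior
max_risk = risk_vals.max()                  # sup_p R(p, p_hat)
print(bayes_risk, 1 / (6 * n))              # both ~ 0.00667
print(max_risk, 1 / (4 * n))                # both = 0.01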
Computing Bayes Estimators
For given data x, we can compute the Bayes estimator via the posterior.
Definition
Posterior risk: r(θ̂|x) = ∫ L(θ, θ̂(x)) f(θ|x) dθ,
where f(θ|x) = f(x|θ)π(θ)/m(x) is the posterior density
and m(x) = ∫ f(x|θ)π(θ) dθ is the marginal distribution of X.
Theorem
The Bayes risk r(π, θ̂) satisfies
r(π, θ̂) = ∫ r(θ̂|x) m(x) dx.
Let θ̂(x) be the value that minimizes r(θ̂|x). Then θ̂ is the Bayes estimator.
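A small numerical check of the theorem (added here; the Beta–Binomial model and grid sizes are arbitrary choices): the Bayes risk computed by averaging the frequentist risk over the prior matches the marginal average of the posterior risks.

import numpy as np
from scipy.stats import binom, beta

n, a, b = 10, 1.0, 1.0                       # X ~ Binomial(n, p), Beta(a, b) prior
xs = np.arange(n + 1)
p_hat = (a + xs) / (a + b + n)               # posterior mean for each outcome x
ps = np.linspace(0.0005, 0.9995, 2000)       # integration grid over p
# (i) r(pi, p_hat) = ∫ R(p, p_hat) pi(p) dp
R = np.array([(binom.pmf(xs, n, p) * (p - p_hat) ** 2).sum() for p in ps])
bayes_risk_1 = np.mean(R * beta.pdf(ps, a, b))
# (ii) r(pi, p_hat) = Σ_x r(p_hat | x) m(x); posterior risk = posterior variance here
post_risk = beta.var(a + xs, b + n - xs)
m = np.array([np.mean(binom.pmf(x, n, ps) * beta.pdf(ps, a, b)) for x in xs])
bayes_risk_2 = (post_risk * m).sum()
print(bayes_risk_1, bayes_risk_2)            # agree up to grid error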
Now we can find explicit formulas for the Bayes estimator under some specific loss functions.
Theorem
If the loss is squared error / absolute error / zero-one loss,
then the Bayes estimator is the mean / median / mode of the posterior f(θ|x), respectively.
Proof.
We only prove the theorem for squared error loss.
The Bayes estimator θ̂(x) minimizes r(θ̂|x) = ∫ (θ − θ̂(x))² f(θ|x) dθ.
Setting the derivative of r(θ̂|x) w.r.t. θ̂(x) to zero gives θ̂(x) = ∫ θ f(θ|x) dθ, the posterior mean.
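A quick numerical illustration of the theorem (added here; the Beta(3, 7) posterior stands in for f(θ|x) and is an arbitrary choice): grid-minimizing the posterior risk recovers the posterior mean under squared error and the posterior median under absolute error.

import numpy as np
from scipy.stats import beta

post = beta(3, 7)                            # stand-in for the posterior f(theta | x)
thetas = np.linspace(0, 1, 2001)             # integration grid for theta
cands = np.linspace(0, 1, 2001)              # candidate values of theta_hat(x)
pdf = post.pdf(thetas)

def posterior_risk(loss):
    # r(theta_hat | x) = ∫ L(theta, theta_hat) f(theta|x) dtheta, per candidate
    return (loss(thetas[None, :], cands[:, None]) * pdf).mean(axis=1)

sq = posterior_risk(lambda t, c: (t - c) ** 2)
ab = posterior_risk(lambda t, c: np.abs(t - c))
print(cands[sq.argmin()], post.mean())       # both ~ 0.300 (posterior mean)
print(cands[ab.argmin()], post.median())     # both ~ 0.287 (posterior median)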
Computing Minimax Estimators
Computing the minimax estimator is complex in general.
Here, we will only mention a few key results.
Theorem
Let θ̂π be the Bayes estimator for some prior π.
Suppose that R(θ, θ̂π) ≤ r(π, θ̂π) for all θ.
Then θ̂π is a minimax estimator and π is the least favorable prior,
i.e., r(π, θ̂π) ≥ r(π′, θ̂π′) for any prior π′.
Proof.
Suppose θ̂π is not minimax. Then there is θ̂₀ s.t. sup_θ R(θ, θ̂₀) < sup_θ R(θ, θ̂π).
Hence, r(π, θ̂₀) ≤ sup_θ R(θ, θ̂₀) < sup_θ R(θ, θ̂π) ≤ r(π, θ̂π).
This contradicts the fact that θ̂π minimizes the Bayes risk r(π, ·).
Corollary
Let θ̂π be the Bayes estimator for some prior π.
Suppose that R(θ, θ̂π) = c for some constant c and all θ.
Then θ̂π is a minimax estimator.
Example
Let X1, ..., Xn ∼ Bernoulli(p). Assume squared error loss.
Assume a Beta(α, β) prior. Then the posterior mean is p̂ = (α + Σᵢ Xᵢ) / (α + β + n).
Now let α = β = √n/2. Then R(p, p̂) = n / (4(n + √n)²), which is constant in p.
Hence, by the previous theorem, p̂ is a minimax estimator.
Example
Consider again Bernoulli, but with loss L(p, p̂) = (p − p̂)² / (p(1 − p)).
Assume a uniform prior. Then the Bayes estimator is p̂ = Σᵢ Xᵢ / n.
Then R(p, p̂) = 1/n, a constant. Hence, p̂ is a minimax estimator.
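To check the first example numerically (a sketch added here; n = 30 is arbitrary), the exact risk of the posterior mean with α = β = √n/2 is constant in p, while the risk of X̄ is not:

import numpy as np

n = 30
a = np.sqrt(n) / 2                           # alpha = beta = sqrt(n)/2
ps = np.linspace(0.01, 0.99, 9)

def risk_posterior_mean(p):
    # exact risk of p_hat = (S + a)/(n + 2a), S ~ Binomial(n, p): variance + bias^2
    var = n * p * (1 - p) / (n + 2 * a) ** 2
    bias = (n * p + a) / (n + 2 * a) - p
    return var + bias ** 2

print(risk_posterior_mean(ps))               # constant in p
print(n / (4 * (n + np.sqrt(n)) ** 2))       # = n / (4 (n + sqrt(n))^2)
print(ps * (1 - ps) / n)                     # risk of X-bar varies with p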
For the normal distribution, there is a nice result.
Theorem
Let X1, ..., Xn ∼ N(θ, 1) and Θ = ℝ. Let θ̂ = X̄.
Then θ̂ is minimax w.r.t. any well-behaved loss function
(one whose level sets are convex and symmetric about the origin).
Moreover, it is the only estimator with this property.
If the parameter space is bounded, the theorem above does not apply.
Example
Suppose that X ∼ N(θ, 1) and Θ = [−m, m] where 0 < m < 1.
Assume squared error loss and the two-point prior π = ½(δ₋ₘ + δₘ), where δ denotes a point mass.
Then θ̂(X) = m tanh(mX) is the minimax estimator.
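A Monte Carlo sanity check for the bounded case (added here; m = 0.5 and the simulation sizes are arbitrary): over Θ = [−m, m], the maximum risk of m·tanh(mX) is well below that of X, which is constant at 1.

import numpy as np

m, trials = 0.5, 200_000
rng = np.random.default_rng(0)
thetas = np.linspace(-m, m, 21)

def max_risk(estimator):
    # sup over a grid of theta of the Monte Carlo squared-error risk
    risks = [np.mean((estimator(rng.normal(t, 1.0, trials)) - t) ** 2) for t in thetas]
    return max(risks)

print(max_risk(lambda x: x))                   # ~ 1.0 (risk of X itself)
print(max_risk(lambda x: m * np.tanh(m * x)))  # noticeably smaller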
MLE approximates Bayes/minimax estimator
Still, it is challenging to compute Bayes and minimax estimators in general.
Surprisingly, for large samples and parametric models, the MLE is approximately
Bayes and approximately minimax.
Idea Sketch (MLE is approximately Bayes)
For large n, the effect of the prior is negligible. Moreover,
√n (θ̂_Bayes − θ) ⇝ N(0, 1/I(θ)),
which is the same limiting distribution as that of the MLE.
Thus, the Bayes estimator is approximately the MLE.
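A quick illustration of this sketch (added here; the Beta(2, 5) prior and p = 0.3 are arbitrary): the gap between the posterior mean and the MLE shrinks like O(1/n).

import numpy as np

rng = np.random.default_rng(1)
p_true, a, b = 0.3, 2.0, 5.0                 # true p and a fixed Beta(a, b) prior
for n in [10, 100, 1000, 10_000]:
    s = rng.binomial(n, p_true)              # sufficient statistic S = sum of X_i
    mle = s / n
    bayes = (s + a) / (n + a + b)            # posterior mean
    print(n, round(abs(mle - bayes), 5))     # gap is O(1/n)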
Idea Sketch (MLE is approximately minimax)
Assume squared error loss, and let θ̂ be the MLE. Then
R(θ, θ̂) = V_θ(θ̂) + bias² ≈ V_θ(θ̂) ≈ 1/(nI(θ)),
since typically the squared bias is O(n⁻²) while the variance is O(n⁻¹).
For large n and any other estimator θ̂′, R(θ, θ̂′) ⪆ 1/(nI(θ)) ≈ R(θ, θ̂)
(essentially the Cramér–Rao bound). Thus, the MLE is approximately minimax.
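To illustrate this numerically (added here; the Bernoulli model and constants are arbitrary): for Bernoulli(p), I(p) = 1/(p(1 − p)), and the Monte Carlo MSE of the MLE X̄ matches 1/(nI(p)) = p(1 − p)/n.

import numpy as np

rng = np.random.default_rng(2)
p, n, trials = 0.3, 200, 100_000
x = rng.binomial(1, p, size=(trials, n))
mle = x.mean(axis=1)                         # MLE of p is X-bar
print(np.mean((mle - p) ** 2))               # Monte Carlo MSE of the MLE
print(p * (1 - p) / n)                       # 1 / (n I(p)), with I(p) = 1/(p(1-p))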
Admissibility
As we have seen, we cannot pick a single best estimator by comparing risk functions alone.
However, we can decide which estimators are not the best.
Definition
An estimator θ̂ is inadmissible if there exists an estimator θ̂′ s.t.
R(θ, θ̂′) ≤ R(θ, θ̂) for all θ and
R(θ, θ̂′) < R(θ, θ̂) for at least one θ.
An estimator θ̂ is admissible if it is not inadmissible.
Definition
An estimator θ̂ is strongly inadmissible if there exist an estimator θ̂′
and ε > 0 s.t. R(θ, θ̂′) < R(θ, θ̂) − ε for all θ.
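The definition suggests a direct numerical test (a sketch added here; the risk curves used are the exact ones for X̄ in the N(θ, 1) model and for the constant estimator from the example below): compare two risk curves on a grid of θ values.

import numpy as np

def dominates(r1, r2):
    # True iff r1 <= r2 everywhere and r1 < r2 somewhere (r1's estimator dominates)
    r1, r2 = np.asarray(r1), np.asarray(r2)
    return bool(np.all(r1 <= r2) and np.any(r1 < r2))

thetas = np.linspace(-5, 5, 1001)
n = 10
r_mean = np.full_like(thetas, 1.0 / n)       # risk of X-bar in N(theta, 1): 1/n
r_const = (thetas - 3.0) ** 2                # risk of the constant estimator 3
print(dominates(r_mean, r_const))            # False: the constant wins near theta = 3
print(dominates(r_const, r_mean))            # False: neither dominates the other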
Note that admissibility only rules out bad estimators.
Admissible estimators are not necessarily good, and can sometimes be bad.
Example
Let X ∼ N(θ, 1). Assume squared error loss. Let θ̂(X) = 3.
Then θ̂(X) is admissible, even though it is clearly bad.
Proof.
Suppose not. Then there exists θ̂′ s.t. R(3, θ̂′) ≤ R(3, θ̂) = 0.
Hence 0 = R(3, θ̂′) = ∫ (θ̂′(x) − 3)² f(x; 3) dx, so θ̂′(X) = 3 = θ̂(X),
and the two risks agree for every θ, contradicting strict improvement at some θ.
Admissibility of Bayes Estimators
Under regularity conditions, Bayes estimators are admissible.
Theorem (unique case)
If a Bayes estimator is unique, then it is admissible.
Theorem (discrete case)
If Θ is a discrete set, then all Bayes estimators are admissible.
Theorem (continuous case)
If Θ is a continuous set, and if R(θ, θ̂) is continuous in θ for every θ̂,
then all Bayes estimators are admissible.
Admissibility of Minimax Estimators
Neither minimaxity nor admissibility implies the other in general.
However, there are some facts linking them.
Theorem (admissibility ⇒ minimaxity)
Suppose that θ̂ has constant risk and is admissible.
Then it is a minimax estimator.
Proof.
Suppose not. Then there exists θ̂′ s.t.
R(θ, θ̂′) ≤ sup_θ R(θ, θ̂′) < sup_θ R(θ, θ̂) = R(θ, θ̂) for all θ,
where the last equality uses that θ̂ has constant risk.
This implies that θ̂ is inadmissible, a contradiction.
Theorem (minimaxity ⇒ admissibility)
If θ̂ is minimax, then it is not strongly inadmissible.
Questions?