Confidence Intervals
Exact Intervals, Jackknife, and Bootstrap
Francesco Casalegno
1/20
Why do we need Confidence Intervals?
• Very common use case
we have a few samples x1, ..., xn from an unknown distribution F
we need to estimate some parameter θ of the underlying distribution
• A single value or an interval of values?
use x1, ..., xn to compute our best guess for the parameter θ
but this single value does not take into account the intrinsic uncertainty
due to our limited information on F (finite number of samples n)
so use x1, ..., xn to also compute an interval that likely contains the true θ
• The frequentist solution
there are several ways to compute an interval estimate for θ
we follow the frequentist approach: computing a confidence interval
2/20
Point Estimates and Interval Estimates
• Given x1, ..., xn from a distribution F, estimate an unknown parameter θ of F.
Given x1, ..., xn drawn from N(µ, σ²), we want to estimate the variance σ².
Given x1, ..., xn and y1, ..., yn, we want to estimate the correlation ρ(X, Y).
• A point estimate is a statistic θ̂ = T(X1, ..., Xn) estimating the unknown θ.
A classical example is the maximum likelihood estimator (MLE).
• Important properties of an estimator are bias and variance.
Bias(θ̂) = E[θ̂] − θ        Var(θ̂) = E[(θ̂ − E[θ̂])²]
Given x1, ..., xn drawn from N(µ, σ²), the MLE for σ² is σ̂² = (1/n) Σᵢ (xᵢ − x̄)².
This estimator has bias −σ²/n and variance 2σ⁴/n.
• An interval estimate is an interval statistic
I(X1, ..., Xn) = [L(X1, ..., Xn), U(X1, ..., Xn)]
containing possible values for the unknown θ.
Two classical examples are the confidence interval and the credible interval.
3/20
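The bias −σ²/n stated above can be checked numerically. A minimal Monte Carlo sketch in NumPy (the values of n, σ², and the number of trials are arbitrary choices, not from the slides):

```python
import numpy as np

# Monte Carlo check of the bias of the MLE variance estimator
# sigma2_mle = (1/n) * sum((x_i - xbar)^2), whose bias is -sigma^2/n.
rng = np.random.default_rng(0)
n, sigma2, n_trials = 10, 4.0, 200_000

x = rng.normal(loc=0.0, scale=np.sqrt(sigma2), size=(n_trials, n))
sigma2_mle = ((x - x.mean(axis=1, keepdims=True)) ** 2).mean(axis=1)

empirical_bias = sigma2_mle.mean() - sigma2
print(empirical_bias)  # close to -sigma2/n = -0.4
```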
Confidence Intervals
• I(X1, ..., Xn) is a confidence interval for θ with confidence level α if, for
any fixed value of the unknown parameter θ,
P(θ ∈ I(X1, ..., Xn)) = α
• If α = 0.95, this means that for any fixed θ, if we repeat the sampling of n
values X1, ..., Xn ∼ Fθ 100 times and compute a confidence interval each time,
about 95 of these intervals contain the true value of θ.
• This does not mean that, given the samples x1, ..., xn, the probability that
θ ∈ I(x1, ..., xn) is α! This is a common misunderstanding: the probability in
our definition is about the samples X1, ..., Xn, not about θ.
Indeed, in the frequentist approach θ is a fixed (albeit unknown) value, not a
random variable with associated probability.
For a Bayesian approach, fix x1, ..., xn instead and assign a posterior distribution
to θ. This yields a credible interval which contains θ with probability α.
4/20
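The repeated-sampling reading of the definition can be simulated directly. A sketch assuming a normal model with known σ (all parameter values here are illustrative):

```python
import numpy as np

# Frequentist coverage check: with mu fixed, a 95% z-interval built from
# repeated samples should contain mu in about 95% of the repetitions.
rng = np.random.default_rng(42)
mu, sigma, n, reps = 3.0, 2.0, 25, 10_000
z = 1.959964  # 97.5% quantile of N(0, 1)

covered = 0
for _ in range(reps):
    x = rng.normal(mu, sigma, size=n)
    half = z * sigma / np.sqrt(n)        # sigma assumed known here
    lo, hi = x.mean() - half, x.mean() + half
    covered += lo <= mu <= hi

print(covered / reps)  # close to 0.95
```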
Confidence Intervals: Example
90% Confidence Interval (frequentist approach)
θ is fixed, but unknown
X1, ..., Xn are drawn from Fθ 10 times
Build an interval for each (X1, ..., Xn)
9/10 of the intervals contain the true θ
90% Credible Interval (Bayesian approach)
Associate to θ a probability measuring our belief
x1, ..., xn are fixed observations; update the posterior belief on θ
Build an interval containing θ with probability 90%
5/20
How do we compute Confidence Interval?
• Depending on the situation, we have to use a different approach
Exact method: based on a known distribution of θ̂
Asymptotic method: based on asymptotic normality of the MLE
Jackknife method: simple resampling technique
Bootstrap method: more elaborate resampling technique
6/20
Exact method
• The value of θ is fixed, but θ̂ = T(X1, ..., Xn) is a random variable. If we
know the exact distribution of θ̂, we can compute an exact confidence interval.
Example: Normal distribution N(µ, σ²)
The MLE for the mean µ is the sample mean µ̂ = x̄ and we have
    (µ̂ − µ) / (σ/√n) ∼ N(0, 1)
so if σ² is known we can compute an exact confidence interval for µ.
The MLE for the variance is the sample variance σ̂² = (1/n) Σᵢ (xᵢ − x̄)². This
estimator is biased, so we consider instead the Bessel correction yielding
s² = (1/(n−1)) Σᵢ (xᵢ − x̄)², and we have
    (n − 1) s² / σ² ∼ χ²_{n−1}.
If σ² is unknown, we can use s² to compute an exact confidence interval for µ
by using
    (µ̂ − µ) / (s/√n) ∼ t_{n−1}.
7/20
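The t-based interval above can be sketched as a short function, assuming SciPy is available for the t quantile (the data and parameter values are made up for illustration):

```python
import numpy as np
from scipy import stats

# Exact confidence interval for mu when sigma^2 is unknown, based on
# (mu_hat - mu) / (s / sqrt(n)) ~ t_{n-1}.
def t_interval(x, confidence=0.95):
    n = len(x)
    mean, s = np.mean(x), np.std(x, ddof=1)      # s uses Bessel's correction
    t = stats.t.ppf((1 + confidence) / 2, df=n - 1)
    half = t * s / np.sqrt(n)
    return mean - half, mean + half

rng = np.random.default_rng(0)
x = rng.normal(10.0, 3.0, size=30)
lo, hi = t_interval(x)
```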
Exact Method: Pros and Cons
Pros
+ confidence level is exactly α
+ closed-form expression allows fast computation
+ works for any sample size n
Cons
– if we do not know the distribution F or the family of distributions it
belongs to (non-parametric statistics), we cannot compute the exact
distribution of θ̂
– even if F is known, the exact distribution of θ̂ is often impossible to
compute: θ = ρ(X, Y), θ = Median(X), ...
8/20
Asymptotic Method
• In many cases we choose θ̂ as the MLE for θ. This estimator has (under
reasonable assumptions) the key property of asymptotic normality
    √n (θ̂ − θ) →d N(0, I(θ)⁻¹)
where I(θ) = −E_X[ℓ''(θ; x)] is the Fisher information and ℓ(θ; x) = log p(x; θ)
is the log-likelihood.
Example: Exponential distribution Exp(λ)
The p.d.f. is p(x; λ) = λ e^{−λx}, so we have
    ℓ(λ; x) = log(λ) − λx   and   ℓ(λ; x1, ..., xn) = n log(λ) − λ Σᵢ xᵢ
so that the MLE is λ̂ = 1/x̄. The Fisher information is
    −E_X[ℓ''(λ)] = −E_X[−1/λ²] = 1/λ²
so we can use the asymptotic approximation
    √n (λ̂ − λ) ≈ N(0, λ²)  ⇒  λ̂/λ ≈ N(1, 1/n)
Example: Bernoulli distribution Bernoulli(p)
The p.m.f. is p(x; p) = pˣ (1 − p)^{1−x}, so we have
    ℓ(p; x) = x log(p) + (1 − x) log(1 − p)
so that the MLE is p̂ = x̄. The Fisher information is
    −E_X[ℓ''(p)] = −E_X[−x/p² − (1 − x)/(1 − p)²] = p/p² + (1 − p)/(1 − p)² = 1/(p(1 − p))
so we can use the asymptotic approximation
    √n (p̂ − p) ≈ N(0, p(1 − p))  ⇒  p̂ − p ≈ N(0, p̂(1 − p̂)/n)
9/20
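The Bernoulli result above yields the usual Wald interval p̂ ± z √(p̂(1 − p̂)/n). A sketch using NumPy and the standard library (the sample size and true p are arbitrary choices):

```python
import numpy as np
from statistics import NormalDist

# Asymptotic (Wald) interval for a Bernoulli parameter p, based on
# p_hat - p ≈ N(0, p_hat * (1 - p_hat) / n).
def wald_interval(x, confidence=0.95):
    n = len(x)
    p_hat = np.mean(x)
    z = NormalDist().inv_cdf((1 + confidence) / 2)
    half = z * np.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half, p_hat + half

rng = np.random.default_rng(1)
x = rng.binomial(1, 0.3, size=500)   # 500 coin flips with p = 0.3
lo, hi = wald_interval(x)
```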
Asymptotic Method: Example
Consider Exp(λ) and the estimator ˆk = i Xi /n for k = 1/λ = 1/3.
It can be shown that the exact distribution is ˆk ∼ 2nkχ2
2n
We have seen that the asymptotic distribution is ˆk ≈ N(k, k2
/n)
10/20
Asymptotic Method: Pros and Cons
Pros
+ easier computation if the sampling distribution F is known
+ the expected information I(θ) may be replaced by the observed information I(θ̂)
Cons
– works well only for n sufficiently large (typically at least n > 50)
– neglects the skewness of the distribution of θ̂
– requires knowing F or the family of distributions it belongs to
– can be applied only if θ̂ is asymptotically normal (typically the MLE)
11/20
Jackknife Method
• Given any estimator θ̂, the jackknife is based on the n leave-1-out estimators
    θ̂_(i) = T(X1, ..., X_{i−1}, X_{i+1}, ..., Xn),  with  θ̂_(·) = (1/n) Σᵢ θ̂_(i).
We also consider the n pseudo-values
    θ̃ᵢ = n θ̂ − (n − 1) θ̂_(i)
• A first use of the jackknife is bias correction. Indeed,
    bias_jack = (n − 1)(θ̂_(·) − θ̂)
is a linear estimator of Bias(θ̂) (i.e. its error is O(1/n²)). Then, a
bias-corrected estimator is given by the mean of the pseudo-values
    θ̂_jack = n θ̂ − (n − 1) θ̂_(·) = (1/n) Σᵢ θ̃ᵢ
• Similarly, the jackknife is used for the estimation of other properties of θ̂.
E.g. the variance estimator for Var(θ̂) given by
    var_jack = (1/n) s̃² = (1/n) · (1/(n−1)) Σᵢ (θ̃ᵢ − θ̂_jack)² = ((n−1)/n) Σᵢ (θ̂_(i) − θ̂_(·))²
is a linear estimator, assuming that θ̂ = T(X1, ..., Xn) is smooth.
• We may use var_jack and θ̂_jack to compute a jackknife approximate
confidence interval using the asymptotic approximation
    (θ̂_jack − θ) / √(s̃²/n) ≈ t_{n−1}
but in practice this approximation is often too crude.
12/20
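The leave-one-out scheme above can be sketched as one generic function. As a check, for the plug-in variance the jackknife bias correction recovers the unbiased sample variance exactly (a classical identity; the sample itself is synthetic):

```python
import numpy as np

# Leave-one-out jackknife estimates of bias and variance for a generic
# estimator theta_hat = T(x_1, ..., x_n).
def jackknife(x, statistic):
    n = len(x)
    theta_hat = statistic(x)
    theta_i = np.array([statistic(np.delete(x, i)) for i in range(n)])
    theta_dot = theta_i.mean()
    bias = (n - 1) * (theta_dot - theta_hat)          # bias_jack
    var = (n - 1) / n * np.sum((theta_i - theta_dot) ** 2)  # var_jack
    return theta_hat - bias, var  # (bias-corrected estimate, variance estimate)

rng = np.random.default_rng(0)
x = rng.normal(0.0, 2.0, size=50)
theta_corrected, var_j = jackknife(x, np.var)  # np.var is the biased MLE
```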
Jackknife Method: Limitations
• If θ̂ is non-smooth, the jackknife variance estimator may be inconsistent.
If θ̂ is the sample median, it can be proved that
    var_jack / Var(θ̂) →d (χ²₂ / 2)²
• To fix that we introduce an extension of the jackknife. This time we consider
the C(n, d) leave-d-out estimators, obtained by computing the statistic T on
every possible subset of X1, ..., Xn obtained by removing d elements.
For θ̂ = sample median, choosing √n < d < n − 1 yields a consistent variance
estimator
    var_jack = ((n − d)/d) · (1/C(n, d)) Σ_s (θ̂_(s) − θ̂_(·))²
13/20
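The leave-d-out estimator can be sketched by enumerating all C(n, d) subsets, which is feasible only for small n (here n = 12 and d = 4, so that √n < d < n − 1 holds; the data are synthetic):

```python
import numpy as np
from itertools import combinations

# Delete-d jackknife variance for a non-smooth statistic (the median),
# averaging over all C(n, d) subsets of size n - d.
def jackknife_d(x, statistic, d):
    n = len(x)
    theta_s = np.array([statistic(x[list(keep)])
                        for keep in combinations(range(n), n - d)])
    theta_dot = theta_s.mean()
    # ((n - d) / d) * (1 / C(n, d)) * sum over subsets
    return (n - d) / d * np.mean((theta_s - theta_dot) ** 2)

rng = np.random.default_rng(0)
x = rng.normal(size=12)
var_d = jackknife_d(x, np.median, d=4)   # sqrt(12) ≈ 3.5 < 4 < 11
```

In practice, for larger n one samples subsets at random instead of enumerating them.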
Jackknife Method: Example
Consider (X, Y) ∼ F for some F and the Pearson correlation coefficient ρ.
It can be shown that the estimator
    ρ̂ = Σᵢ (xᵢ − x̄)(yᵢ − ȳ) / (√(Σᵢ (xᵢ − x̄)²) · √(Σᵢ (yᵢ − ȳ)²))
is biased.
F  ⟹  (x1, y1), (x2, y2), (x3, y3), ..., (x10, y10)  →  ρ̂
       (x2, y2), (x3, y3), (x4, y4), ..., (x10, y10)  →  ρ̂_(1)
       (x1, y1), (x3, y3), (x4, y4), ..., (x10, y10)  →  ρ̂_(2)
       ...
       (x1, y1), (x2, y2), (x3, y3), ..., (x9, y9)    →  ρ̂_(10)
The jackknife estimator ρ̂_jack = 10 ρ̂ − 9 ρ̂_(·) has its O(1/n) bias removed.
14/20
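The leave-one-out scheme for the correlation can be sketched as follows; the synthetic (x, y) pairs are an assumption, mirroring the 10 pairs in the diagram above:

```python
import numpy as np

# Jackknife bias correction for the sample Pearson correlation,
# leaving out one (x_i, y_i) pair at a time.
rng = np.random.default_rng(3)
x = rng.normal(size=10)
y = 0.8 * x + rng.normal(scale=0.5, size=10)  # correlated synthetic data

def corr(x, y):
    return np.corrcoef(x, y)[0, 1]

n = len(x)
rho_hat = corr(x, y)
rho_i = np.array([corr(np.delete(x, i), np.delete(y, i)) for i in range(n)])
rho_jack = n * rho_hat - (n - 1) * rho_i.mean()  # 10*rho_hat - 9*rho_(.)
```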
Jackknife Method: Pros and Cons
Pros
+ can be used for non-parametric statistics
+ fast computation
+ bias correction up to O(1/n2
)
+ leave-d-out provides consistent variance estimator
Cons
– leave-1-out may be non-consistent
– leave-d-out is more expensive
– confidence interval is based on crude approximations, bootstrap is better
15/20
Bootstrap Method
• The bootstrap consists in B resamplings with replacement from x1, ..., xn.
This is equivalent to sampling from the empirical CDF F̂.
• For each of the B resamples (x₁⁽ᵇ⁾, ..., xₙ⁽ᵇ⁾), compute the estimator θ̂*_b. We
use the values θ̂*_b to estimate the distribution of θ̂* = θ̂(F̂), which in turn is
an approximation of the distribution of interest θ̂ = θ̂(F).
• To compute point estimates for the properties of θ̂ we use the pair (θ̂*, θ̂)
to approximate the pair (θ̂, θ).
The bootstrap bias estimator for Bias(θ̂) = E[θ̂] − θ is given by
    bias_boot = E[θ̂*] − θ̂ = (1/B) Σ_b θ̂*_b − θ̂
so that the bias-corrected bootstrap estimator reads
    θ̂_boot = 2 θ̂ − (1/B) Σ_b θ̂*_b
Similarly, the bootstrap variance estimator for Var(θ̂) = E[(θ̂ − E[θ̂])²] is
    var_boot = (1/(B−1)) Σ_b (θ̂*_b − (1/B) Σ_b θ̂*_b)²
• Notice that the bootstrap is more generic than the jackknife, since it
estimates the whole distribution of θ̂ and not only its bias and variance.
Actually, one can prove that the jackknife is a first-order approximation of
the bootstrap.
16/20
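The bias and variance estimators above can be sketched as one function (the statistic, sample, and B are illustrative choices):

```python
import numpy as np

# Bootstrap estimates of bias and variance: B resamples with replacement,
# i.e. B i.i.d. samples of size n from the empirical CDF F_hat.
def bootstrap_bias_var(x, statistic, B=2000, seed=0):
    rng = np.random.default_rng(seed)
    n = len(x)
    theta_hat = statistic(x)
    theta_star = np.array([statistic(rng.choice(x, size=n, replace=True))
                           for _ in range(B)])
    bias = theta_star.mean() - theta_hat           # bias_boot
    var = theta_star.var(ddof=1)                   # var_boot
    return theta_hat - bias, var  # 2*theta_hat - mean(theta_star), var_boot

rng = np.random.default_rng(0)
x = rng.normal(0.0, 2.0, size=40)
theta_boot, var_boot = bootstrap_bias_var(x, np.var)  # biased MLE variance
```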
Bootstrap Method: Example
Real World        Bootstrap World
F                 F̂
⇓                 ⇓
x1, ..., xn       x*1, ..., x*n
↓                 ↓
θ̂                 θ̂*
17/20
Bootstrap Method: Confidence Intervals
• Different techniques are available to compute bootstrap interval estimates.
Here p[α] denotes the α-quantile of the distribution p, with z[α] for the
standard normal.
• The pivotal interval comes from P(l < θ̂ − θ̂* < u) ≈ P(l < θ − θ̂ < u)
    CI = (2 θ̂ − θ̂*[1 − α/2], 2 θ̂ − θ̂*[α/2])
• The studentized interval takes an approach similar to the jackknife's
    CI = (θ̂_jack − t_{n−1}[α/2] √var_jack, θ̂_jack + t_{n−1}[α/2] √var_jack)
• The BCa interval (bias-corrected and accelerated)
    CI = (θ̂*[g(α)], θ̂*[g(1 − α)]),  with  g(α) = Φ(z₀ + (z₀ + z[α]) / (1 − a(z₀ + z[α])))
where z₀ = Φ⁻¹(#{θ̂*_b < θ̂}/B) is the bias-correction and the acceleration
    a = (1/6) Skew(ℓ̇(θ̂)) ≈ (1/6) Σᵢ (θ̂_(·) − θ̂_(i))³ / [Σᵢ (θ̂_(·) − θ̂_(i))²]^{3/2}
is approximated using the jackknife. The BCa interval has an excellent O(1/n)
coverage error, so it is preferred to the other bootstrap methods.
18/20
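The pivotal interval can be sketched with NumPy alone (statistic and data are illustrative; the median is a case where exact intervals are hard):

```python
import numpy as np

# Pivotal (a.k.a. basic) bootstrap interval:
# CI = (2*theta_hat - q*[1 - alpha/2], 2*theta_hat - q*[alpha/2]),
# where q*[p] is the p-quantile of the bootstrap replicates theta*_b.
def pivotal_interval(x, statistic, confidence=0.95, B=4000, seed=0):
    rng = np.random.default_rng(seed)
    n = len(x)
    theta_hat = statistic(x)
    theta_star = np.array([statistic(rng.choice(x, size=n, replace=True))
                           for _ in range(B)])
    a = 1 - confidence
    lo_q, hi_q = np.quantile(theta_star, [a / 2, 1 - a / 2])
    return 2 * theta_hat - hi_q, 2 * theta_hat - lo_q

rng = np.random.default_rng(1)
x = rng.lognormal(size=60)               # skewed data, median of interest
lo, hi = pivotal_interval(x, np.median)
```

For the BCa construction, recent SciPy versions ship `scipy.stats.bootstrap`, which accepts `method='BCa'`.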
Bootstrap Method: Pros and Cons
Pros
+ can be used for non-parametric statistics
+ more powerful than the jackknife: it approximates the whole distribution of θ̂
+ more accurate than the jackknife for computing the variance of θ̂
+ the BCa interval has O(1/n) coverage error
Cons
– more expensive than the jackknife (B should be large enough)
– if n is very small the bootstrap may fail
– if the family of F is known, exact methods give much better results
19/20
Conclusions
• We want to estimate a parameter θ, using samples X1, ..., Xn ∼ Fθ.
• Confidence intervals are needed to express the uncertainty of the estimator θ̂.
• If the distribution F is known, preferably use exact or asymptotic methods.
• If F is unknown or the distribution of θ̂ is complex, use the jackknife or the bootstrap.
• Use the jackknife to estimate properties of θ̂. Not so good for confidence intervals.
• Use the bootstrap to estimate the distribution of θ̂. Good for confidence intervals (BCa).
20/20

More Related Content

What's hot

Descriptive statistics
Descriptive statisticsDescriptive statistics
Descriptive statisticsAnand Thokal
 
Regression analysis
Regression analysisRegression analysis
Regression analysisRavi shankar
 
Conditional probability
Conditional probabilityConditional probability
Conditional probabilitysuncil0071
 
Central limit theorem
Central limit theoremCentral limit theorem
Central limit theoremVijeesh Soman
 
Lecture: Joint, Conditional and Marginal Probabilities
Lecture: Joint, Conditional and Marginal Probabilities Lecture: Joint, Conditional and Marginal Probabilities
Lecture: Joint, Conditional and Marginal Probabilities Marina Santini
 
Categorical data analysis.pptx
Categorical data analysis.pptxCategorical data analysis.pptx
Categorical data analysis.pptxBegashaw3
 
Inferential statictis ready go
Inferential statictis ready goInferential statictis ready go
Inferential statictis ready goMmedsc Hahm
 
Bbs11 ppt ch12
Bbs11 ppt ch12Bbs11 ppt ch12
Bbs11 ppt ch12Tuul Tuul
 
Multinomial logisticregression basicrelationships
Multinomial logisticregression basicrelationshipsMultinomial logisticregression basicrelationships
Multinomial logisticregression basicrelationshipsAnirudha si
 
Probability distribution
Probability distributionProbability distribution
Probability distributionPunit Raut
 
7. logistics regression using spss
7. logistics regression using spss7. logistics regression using spss
7. logistics regression using spssDr Nisha Arora
 
Presentation On Regression
Presentation On RegressionPresentation On Regression
Presentation On Regressionalok tiwari
 
Basic Descriptive statistics
Basic Descriptive statisticsBasic Descriptive statistics
Basic Descriptive statisticsAjendra Sharma
 
Descriptive Statistics
Descriptive StatisticsDescriptive Statistics
Descriptive StatisticsBhagya Silva
 

What's hot (20)

Descriptive statistics
Descriptive statisticsDescriptive statistics
Descriptive statistics
 
Binomial probability distributions
Binomial probability distributions  Binomial probability distributions
Binomial probability distributions
 
Regression analysis
Regression analysisRegression analysis
Regression analysis
 
Logistic Regression Analysis
Logistic Regression AnalysisLogistic Regression Analysis
Logistic Regression Analysis
 
Conditional probability
Conditional probabilityConditional probability
Conditional probability
 
Central limit theorem
Central limit theoremCentral limit theorem
Central limit theorem
 
Lecture: Joint, Conditional and Marginal Probabilities
Lecture: Joint, Conditional and Marginal Probabilities Lecture: Joint, Conditional and Marginal Probabilities
Lecture: Joint, Conditional and Marginal Probabilities
 
Categorical data analysis.pptx
Categorical data analysis.pptxCategorical data analysis.pptx
Categorical data analysis.pptx
 
Sampling theory
Sampling theorySampling theory
Sampling theory
 
Inferential statictis ready go
Inferential statictis ready goInferential statictis ready go
Inferential statictis ready go
 
Panel data
Panel dataPanel data
Panel data
 
Correlation
CorrelationCorrelation
Correlation
 
Bbs11 ppt ch12
Bbs11 ppt ch12Bbs11 ppt ch12
Bbs11 ppt ch12
 
Multinomial logisticregression basicrelationships
Multinomial logisticregression basicrelationshipsMultinomial logisticregression basicrelationships
Multinomial logisticregression basicrelationships
 
Probability distribution
Probability distributionProbability distribution
Probability distribution
 
Regression analysis
Regression analysisRegression analysis
Regression analysis
 
7. logistics regression using spss
7. logistics regression using spss7. logistics regression using spss
7. logistics regression using spss
 
Presentation On Regression
Presentation On RegressionPresentation On Regression
Presentation On Regression
 
Basic Descriptive statistics
Basic Descriptive statisticsBasic Descriptive statistics
Basic Descriptive statistics
 
Descriptive Statistics
Descriptive StatisticsDescriptive Statistics
Descriptive Statistics
 

Similar to Confidence Intervals––Exact Intervals, Jackknife, and Bootstrap

Problem_Session_Notes
Problem_Session_NotesProblem_Session_Notes
Problem_Session_NotesLu Mao
 
Fisher_info_ppt and mathematical process to find time domain and frequency do...
Fisher_info_ppt and mathematical process to find time domain and frequency do...Fisher_info_ppt and mathematical process to find time domain and frequency do...
Fisher_info_ppt and mathematical process to find time domain and frequency do...praveenyadav2020
 
Statistics (1): estimation, Chapter 2: Empirical distribution and bootstrap
Statistics (1): estimation, Chapter 2: Empirical distribution and bootstrapStatistics (1): estimation, Chapter 2: Empirical distribution and bootstrap
Statistics (1): estimation, Chapter 2: Empirical distribution and bootstrapChristian Robert
 
DSP_FOEHU - MATLAB 02 - The Discrete-time Fourier Analysis
DSP_FOEHU - MATLAB 02 - The Discrete-time Fourier AnalysisDSP_FOEHU - MATLAB 02 - The Discrete-time Fourier Analysis
DSP_FOEHU - MATLAB 02 - The Discrete-time Fourier AnalysisAmr E. Mohamed
 
ISM_Session_5 _ 23rd and 24th December.pptx
ISM_Session_5 _ 23rd and 24th December.pptxISM_Session_5 _ 23rd and 24th December.pptx
ISM_Session_5 _ 23rd and 24th December.pptxssuser1eba67
 
Applications to Central Limit Theorem and Law of Large Numbers
Applications to Central Limit Theorem and Law of Large NumbersApplications to Central Limit Theorem and Law of Large Numbers
Applications to Central Limit Theorem and Law of Large NumbersUniversity of Salerno
 
Communication Theory - Random Process.pdf
Communication Theory - Random Process.pdfCommunication Theory - Random Process.pdf
Communication Theory - Random Process.pdfRajaSekaran923497
 
Point Estimate, Confidence Interval, Hypotesis tests
Point Estimate, Confidence Interval, Hypotesis testsPoint Estimate, Confidence Interval, Hypotesis tests
Point Estimate, Confidence Interval, Hypotesis testsUniversity of Salerno
 
Econometrics 2.pptx
Econometrics 2.pptxEconometrics 2.pptx
Econometrics 2.pptxfuad80
 
Chapter_09_ParameterEstimation.pptx
Chapter_09_ParameterEstimation.pptxChapter_09_ParameterEstimation.pptx
Chapter_09_ParameterEstimation.pptxVimalMehta19
 
Stratified sampling and resampling for approximate Bayesian computation
Stratified sampling and resampling for approximate Bayesian computationStratified sampling and resampling for approximate Bayesian computation
Stratified sampling and resampling for approximate Bayesian computationUmberto Picchini
 

Similar to Confidence Intervals––Exact Intervals, Jackknife, and Bootstrap (20)

Talk 3
Talk 3Talk 3
Talk 3
 
Problem_Session_Notes
Problem_Session_NotesProblem_Session_Notes
Problem_Session_Notes
 
Fisher_info_ppt and mathematical process to find time domain and frequency do...
Fisher_info_ppt and mathematical process to find time domain and frequency do...Fisher_info_ppt and mathematical process to find time domain and frequency do...
Fisher_info_ppt and mathematical process to find time domain and frequency do...
 
Statistics Homework Help
Statistics Homework HelpStatistics Homework Help
Statistics Homework Help
 
Statistics (1): estimation, Chapter 2: Empirical distribution and bootstrap
Statistics (1): estimation, Chapter 2: Empirical distribution and bootstrapStatistics (1): estimation, Chapter 2: Empirical distribution and bootstrap
Statistics (1): estimation, Chapter 2: Empirical distribution and bootstrap
 
Talk 2
Talk 2Talk 2
Talk 2
 
DSP_FOEHU - MATLAB 02 - The Discrete-time Fourier Analysis
DSP_FOEHU - MATLAB 02 - The Discrete-time Fourier AnalysisDSP_FOEHU - MATLAB 02 - The Discrete-time Fourier Analysis
DSP_FOEHU - MATLAB 02 - The Discrete-time Fourier Analysis
 
ISM_Session_5 _ 23rd and 24th December.pptx
ISM_Session_5 _ 23rd and 24th December.pptxISM_Session_5 _ 23rd and 24th December.pptx
ISM_Session_5 _ 23rd and 24th December.pptx
 
U unit7 ssb
U unit7 ssbU unit7 ssb
U unit7 ssb
 
Data Analysis Assignment Help
Data Analysis Assignment HelpData Analysis Assignment Help
Data Analysis Assignment Help
 
Applications to Central Limit Theorem and Law of Large Numbers
Applications to Central Limit Theorem and Law of Large NumbersApplications to Central Limit Theorem and Law of Large Numbers
Applications to Central Limit Theorem and Law of Large Numbers
 
Communication Theory - Random Process.pdf
Communication Theory - Random Process.pdfCommunication Theory - Random Process.pdf
Communication Theory - Random Process.pdf
 
Point Estimate, Confidence Interval, Hypotesis tests
Point Estimate, Confidence Interval, Hypotesis testsPoint Estimate, Confidence Interval, Hypotesis tests
Point Estimate, Confidence Interval, Hypotesis tests
 
Excel Homework Help
Excel Homework HelpExcel Homework Help
Excel Homework Help
 
Propensity albert
Propensity albertPropensity albert
Propensity albert
 
Econometrics 2.pptx
Econometrics 2.pptxEconometrics 2.pptx
Econometrics 2.pptx
 
Chapter_09_ParameterEstimation.pptx
Chapter_09_ParameterEstimation.pptxChapter_09_ParameterEstimation.pptx
Chapter_09_ParameterEstimation.pptx
 
Montecarlophd
MontecarlophdMontecarlophd
Montecarlophd
 
the ABC of ABC
the ABC of ABCthe ABC of ABC
the ABC of ABC
 
Stratified sampling and resampling for approximate Bayesian computation
Stratified sampling and resampling for approximate Bayesian computationStratified sampling and resampling for approximate Bayesian computation
Stratified sampling and resampling for approximate Bayesian computation
 

More from Francesco Casalegno

DVC - Git-like Data Version Control for Machine Learning projects
DVC - Git-like Data Version Control for Machine Learning projectsDVC - Git-like Data Version Control for Machine Learning projects
DVC - Git-like Data Version Control for Machine Learning projectsFrancesco Casalegno
 
Ordinal Regression and Machine Learning: Applications, Methods, Metrics
Ordinal Regression and Machine Learning: Applications, Methods, MetricsOrdinal Regression and Machine Learning: Applications, Methods, Metrics
Ordinal Regression and Machine Learning: Applications, Methods, MetricsFrancesco Casalegno
 
Markov Chain Monte Carlo Methods
Markov Chain Monte Carlo MethodsMarkov Chain Monte Carlo Methods
Markov Chain Monte Carlo MethodsFrancesco Casalegno
 
Hyperparameter Optimization for Machine Learning
Hyperparameter Optimization for Machine LearningHyperparameter Optimization for Machine Learning
Hyperparameter Optimization for Machine LearningFrancesco Casalegno
 
[C++] The Curiously Recurring Template Pattern: Static Polymorphsim and Expre...
[C++] The Curiously Recurring Template Pattern: Static Polymorphsim and Expre...[C++] The Curiously Recurring Template Pattern: Static Polymorphsim and Expre...
[C++] The Curiously Recurring Template Pattern: Static Polymorphsim and Expre...Francesco Casalegno
 
C++11: Rvalue References, Move Semantics, Perfect Forwarding
C++11: Rvalue References, Move Semantics, Perfect ForwardingC++11: Rvalue References, Move Semantics, Perfect Forwarding
C++11: Rvalue References, Move Semantics, Perfect ForwardingFrancesco Casalegno
 

More from Francesco Casalegno (8)

DVC - Git-like Data Version Control for Machine Learning projects
DVC - Git-like Data Version Control for Machine Learning projectsDVC - Git-like Data Version Control for Machine Learning projects
DVC - Git-like Data Version Control for Machine Learning projects
 
Ordinal Regression and Machine Learning: Applications, Methods, Metrics
Ordinal Regression and Machine Learning: Applications, Methods, MetricsOrdinal Regression and Machine Learning: Applications, Methods, Metrics
Ordinal Regression and Machine Learning: Applications, Methods, Metrics
 
Recommender Systems
Recommender SystemsRecommender Systems
Recommender Systems
 
Markov Chain Monte Carlo Methods
Markov Chain Monte Carlo MethodsMarkov Chain Monte Carlo Methods
Markov Chain Monte Carlo Methods
 
Hyperparameter Optimization for Machine Learning
Hyperparameter Optimization for Machine LearningHyperparameter Optimization for Machine Learning
Hyperparameter Optimization for Machine Learning
 
Smart Pointers in C++
Smart Pointers in C++Smart Pointers in C++
Smart Pointers in C++
 
[C++] The Curiously Recurring Template Pattern: Static Polymorphsim and Expre...
[C++] The Curiously Recurring Template Pattern: Static Polymorphsim and Expre...[C++] The Curiously Recurring Template Pattern: Static Polymorphsim and Expre...
[C++] The Curiously Recurring Template Pattern: Static Polymorphsim and Expre...
 
C++11: Rvalue References, Move Semantics, Perfect Forwarding
C++11: Rvalue References, Move Semantics, Perfect ForwardingC++11: Rvalue References, Move Semantics, Perfect Forwarding
C++11: Rvalue References, Move Semantics, Perfect Forwarding
 

Recently uploaded

module for grade 9 for distance learning
module for grade 9 for distance learningmodule for grade 9 for distance learning
module for grade 9 for distance learninglevieagacer
 
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune WaterworldsBiogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune WaterworldsSérgio Sacani
 
FAIRSpectra - Enabling the FAIRification of Analytical Science
FAIRSpectra - Enabling the FAIRification of Analytical ScienceFAIRSpectra - Enabling the FAIRification of Analytical Science
FAIRSpectra - Enabling the FAIRification of Analytical ScienceAlex Henderson
 
Grade 7 - Lesson 1 - Microscope and Its Functions
Grade 7 - Lesson 1 - Microscope and Its FunctionsGrade 7 - Lesson 1 - Microscope and Its Functions
Grade 7 - Lesson 1 - Microscope and Its FunctionsOrtegaSyrineMay
 
Pests of mustard_Identification_Management_Dr.UPR.pdf
Pests of mustard_Identification_Management_Dr.UPR.pdfPests of mustard_Identification_Management_Dr.UPR.pdf
Pests of mustard_Identification_Management_Dr.UPR.pdfPirithiRaju
 
Module for Grade 9 for Asynchronous/Distance learning
Module for Grade 9 for Asynchronous/Distance learningModule for Grade 9 for Asynchronous/Distance learning
Module for Grade 9 for Asynchronous/Distance learninglevieagacer
 
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 bAsymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 bSérgio Sacani
 
Kochi ❤CALL GIRL 84099*07087 ❤CALL GIRLS IN Kochi ESCORT SERVICE❤CALL GIRL
Kochi ❤CALL GIRL 84099*07087 ❤CALL GIRLS IN Kochi ESCORT SERVICE❤CALL GIRLKochi ❤CALL GIRL 84099*07087 ❤CALL GIRLS IN Kochi ESCORT SERVICE❤CALL GIRL
Kochi ❤CALL GIRL 84099*07087 ❤CALL GIRLS IN Kochi ESCORT SERVICE❤CALL GIRLkantirani197
 
9999266834 Call Girls In Noida Sector 22 (Delhi) Call Girl Service
9999266834 Call Girls In Noida Sector 22 (Delhi) Call Girl Service9999266834 Call Girls In Noida Sector 22 (Delhi) Call Girl Service
9999266834 Call Girls In Noida Sector 22 (Delhi) Call Girl Servicenishacall1
 
Forensic Biology & Its biological significance.pdf
Forensic Biology & Its biological significance.pdfForensic Biology & Its biological significance.pdf
Forensic Biology & Its biological significance.pdfrohankumarsinghrore1
 
GBSN - Microbiology (Unit 3)
GBSN - Microbiology (Unit 3)GBSN - Microbiology (Unit 3)
GBSN - Microbiology (Unit 3)Areesha Ahmad
 
Zoology 5th semester notes( Sumit_yadav).pdf
Zoology 5th semester notes( Sumit_yadav).pdfZoology 5th semester notes( Sumit_yadav).pdf
Zoology 5th semester notes( Sumit_yadav).pdfSumit Kumar yadav
 
Digital Dentistry.Digital Dentistryvv.pptx
Digital Dentistry.Digital Dentistryvv.pptxDigital Dentistry.Digital Dentistryvv.pptx
Digital Dentistry.Digital Dentistryvv.pptxMohamedFarag457087
 
COST ESTIMATION FOR A RESEARCH PROJECT.pptx
COST ESTIMATION FOR A RESEARCH PROJECT.pptxCOST ESTIMATION FOR A RESEARCH PROJECT.pptx
COST ESTIMATION FOR A RESEARCH PROJECT.pptxFarihaAbdulRasheed
 
Call Girls Ahmedabad +917728919243 call me Independent Escort Service
Call Girls Ahmedabad +917728919243 call me Independent Escort ServiceCall Girls Ahmedabad +917728919243 call me Independent Escort Service
Call Girls Ahmedabad +917728919243 call me Independent Escort Serviceshivanisharma5244
 
pumpkin fruit fly, water melon fruit fly, cucumber fruit fly
pumpkin fruit fly, water melon fruit fly, cucumber fruit flypumpkin fruit fly, water melon fruit fly, cucumber fruit fly
pumpkin fruit fly, water melon fruit fly, cucumber fruit flyPRADYUMMAURYA1
 
FAIRSpectra - Enabling the FAIRification of Spectroscopy and Spectrometry
FAIRSpectra - Enabling the FAIRification of Spectroscopy and SpectrometryFAIRSpectra - Enabling the FAIRification of Spectroscopy and Spectrometry
FAIRSpectra - Enabling the FAIRification of Spectroscopy and SpectrometryAlex Henderson
 
Molecular markers- RFLP, RAPD, AFLP, SNP etc.
Molecular markers- RFLP, RAPD, AFLP, SNP etc.Molecular markers- RFLP, RAPD, AFLP, SNP etc.
Molecular markers- RFLP, RAPD, AFLP, SNP etc.Silpa
 
Conjugation, transduction and transformation
Conjugation, transduction and transformationConjugation, transduction and transformation
Conjugation, transduction and transformationAreesha Ahmad
 

Recently uploaded (20)

module for grade 9 for distance learning
module for grade 9 for distance learningmodule for grade 9 for distance learning
module for grade 9 for distance learning
 
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune WaterworldsBiogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
 
FAIRSpectra - Enabling the FAIRification of Analytical Science
FAIRSpectra - Enabling the FAIRification of Analytical ScienceFAIRSpectra - Enabling the FAIRification of Analytical Science
FAIRSpectra - Enabling the FAIRification of Analytical Science
 
Grade 7 - Lesson 1 - Microscope and Its Functions
Grade 7 - Lesson 1 - Microscope and Its FunctionsGrade 7 - Lesson 1 - Microscope and Its Functions
Grade 7 - Lesson 1 - Microscope and Its Functions
 
Pests of mustard_Identification_Management_Dr.UPR.pdf
Pests of mustard_Identification_Management_Dr.UPR.pdfPests of mustard_Identification_Management_Dr.UPR.pdf
Pests of mustard_Identification_Management_Dr.UPR.pdf
 
Module for Grade 9 for Asynchronous/Distance learning
Module for Grade 9 for Asynchronous/Distance learningModule for Grade 9 for Asynchronous/Distance learning
Module for Grade 9 for Asynchronous/Distance learning
 
time, 95 of such intervals contain the true value of θ.
• This does not mean that, given the samples x1, ..., xn, the probability that
θ ∈ I(x1, ..., xn) is α! This is a common misunderstanding: the probability in
our definition is about the samples X1, ..., Xn, not about θ. Indeed, in the
frequentist approach θ is a fixed (albeit unknown) value, not a random variable
with an associated probability.
For a Bayesian approach, fix x1, ..., xn instead and assign a posterior
distribution to θ. This yields a credible interval, which contains θ with
probability α.
4/20
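The repeated-sampling interpretation above can be checked numerically. Here is a minimal sketch, assuming NumPy and SciPy are available and a normal model with known σ, so the interval is exact; the function name `coverage` and all parameter values are ours:

```python
import numpy as np
from scipy.stats import norm

def coverage(theta=5.0, sigma=1.0, n=30, alpha=0.95, reps=2000, seed=0):
    """Fraction of repeated samplings whose interval contains the true theta."""
    rng = np.random.default_rng(seed)
    z = norm.ppf(0.5 + alpha / 2)           # two-sided normal critical value
    hits = 0
    for _ in range(reps):
        x = rng.normal(theta, sigma, size=n)
        half = z * sigma / np.sqrt(n)       # sigma assumed known -> exact interval
        hits += x.mean() - half <= theta <= x.mean() + half
    return hits / reps

print(coverage())  # close to 0.95, up to simulation noise
```

Running this for many replications gives a hit rate near α = 0.95, which is exactly the frequentist guarantee stated in the definition.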
Confidence Intervals: Example
90% Confidence Interval (frequentist approach)
θ is fixed, but unknown
X1, ..., Xn are drawn from Fθ 10 times
Build an interval for each sample (X1, ..., Xn)
9/10 of the intervals contain the true θ
90% Credible Interval (Bayesian approach)
Associate to θ a probability measuring our belief
x1, ..., xn are fixed observations: update the posterior belief on θ
Build an interval containing θ with probability 90%
5/20
How do we compute Confidence Intervals?
• Depending on the situation, we have to use a different approach
Exact method: based on a known distribution of ˆθ
Asymptotic method: based on asymptotic normality of the MLE
Jackknife method: simple resampling technique
Bootstrap method: more elaborate resampling technique
6/20
Exact Method
• The value of θ is fixed, but ˆθ = T(X1, ..., Xn) is a random variable. If we
know the exact distribution of ˆθ we can compute an exact confidence interval.
Example: Normal distribution N(µ, σ2)
The MLE for the mean µ is the sample mean ˆµ = ¯x, and we have
(ˆµ − µ)/(σ/√n) ∼ N(0, 1)
so if σ2 is known we can compute an exact confidence interval for µ.
The MLE for the variance is the sample variance ˆσ2 = 1/n Σi (xi − ¯x)2. This
estimator is biased, so we consider instead the Bessel correction, yielding
s2 = 1/(n−1) Σi (xi − ¯x)2, and we have (n − 1)s2/σ2 ∼ χ2 n−1.
If σ2 is unknown, using s2 we can compute an exact confidence interval for µ
by using (ˆµ − µ)/(s/√n) ∼ tn−1.
7/20
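The t-based interval of this slide takes only a few lines. A sketch assuming NumPy and SciPy; the function name `exact_ci_mean` and the toy data are ours:

```python
import numpy as np
from scipy.stats import t

def exact_ci_mean(x, alpha=0.95):
    """Exact CI for the mean of a normal sample with unknown sigma (t interval)."""
    n = len(x)
    s = np.std(x, ddof=1)                   # Bessel-corrected standard deviation
    q = t.ppf(0.5 + alpha / 2, df=n - 1)    # quantile of t_{n-1}
    half = q * s / np.sqrt(n)
    return np.mean(x) - half, np.mean(x) + half

rng = np.random.default_rng(42)
x = rng.normal(10.0, 2.0, size=25)
lo, hi = exact_ci_mean(x)                   # interval centered on the sample mean
```

Note `ddof=1` implements the Bessel correction s2 = 1/(n−1) Σi (xi − ¯x)2 used in the pivot.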
Exact Method: Pros and Cons
Pros
+ confidence level is exactly α
+ closed-form expression allows fast computation
+ works for any sample size n
Cons
– if we do not know the distribution F or the family of distributions it
belongs to (non-parametric statistics), we cannot compute the exact
distribution of ˆθ
– even if F is known, the exact distribution of ˆθ is often impossible to
compute: θ = ρ(X, Y ), θ = Median(X), ...
8/20
Asymptotic Method
• In many cases we choose ˆθ as the MLE for θ. This estimator has (under
reasonable assumptions) the key property of asymptotic normality
√n(ˆθ − θ) d−→ N(0, I(θ)−1)
where I(θ) = −EX [ℓ''(θ; x)] is the Fisher information and ℓ(θ; x) = log p(x; θ).
Example: Exponential distribution Exp(λ)
The p.d.f. is p(x; λ) = λe−λx, so we have ℓ(λ; x) = log(λ) − λx and
ℓ(λ; x1, ..., xn) = n log(λ) − λ Σi xi, so that the MLE is ˆλ = 1/¯x.
The Fisher information is −EX [ℓ''(λ)] = −EX [−1/λ2] = 1/λ2, so we can use the
asymptotic approximation
√n(ˆλ − λ) ≈ N(0, λ2) ⇒ ˆλ/λ ≈ N(1, 1/n)
Example: Bernoulli distribution Bernoulli(p)
The p.m.f. is p(x; p) = px(1 − p)1−x, so we have
ℓ(p; x) = x log(p) + (1 − x) log(1 − p), so that the MLE is ˆp = ¯x.
The Fisher information is
−EX [ℓ''(p)] = −EX [−x/p2 − (1 − x)/(1 − p)2] = p/p2 + (1 − p)/(1 − p)2 = 1/(p(1 − p))
so we can use the asymptotic approximation
√n(ˆp − p) ≈ N(0, p(1 − p)) ⇒ ˆp − p ≈ N(0, ˆp(1 − ˆp)/n)
9/20
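The Bernoulli approximation ˆp − p ≈ N(0, ˆp(1 − ˆp)/n) yields the familiar Wald interval. A sketch assuming NumPy and SciPy; the function name `wald_ci` and the toy data are ours:

```python
import numpy as np
from scipy.stats import norm

def wald_ci(x, alpha=0.95):
    """Asymptotic (Wald) CI for Bernoulli p from the normal approximation."""
    n = len(x)
    p_hat = np.mean(x)                      # MLE of p is the sample mean
    z = norm.ppf(0.5 + alpha / 2)
    half = z * np.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half, p_hat + half

x = np.array([1] * 60 + [0] * 40)           # 60 successes out of n = 100
lo, hi = wald_ci(x)                         # approximately (0.504, 0.696)
```

Note that the unknown Fisher information I(p) is evaluated at ˆp, i.e. the expected information is replaced by a plug-in estimate, which is exactly the simplification listed among the pros on the next slide.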
Asymptotic Method: Example
Consider Exp(λ) and the estimator ˆk = Σi Xi /n for k = 1/λ = 1/3.
It can be shown that the exact distribution is 2nˆk/k ∼ χ2 2n.
We have seen that the asymptotic distribution is ˆk ≈ N(k, k2/n).
10/20
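Both distributions on this slide can be inverted into confidence intervals and compared. A sketch under the same setup, assuming NumPy and SciPy; the function name `exp_mean_cis` is ours:

```python
import numpy as np
from scipy.stats import chi2, norm

def exp_mean_cis(x, alpha=0.95):
    """Exact (chi-square) and asymptotic (normal) CIs for k = E[X], X ~ Exp."""
    n = len(x)
    k_hat = np.mean(x)
    lo_q, hi_q = (1 - alpha) / 2, (1 + alpha) / 2
    # exact: 2*n*k_hat/k follows a chi-square with 2n degrees of freedom
    exact = (2 * n * k_hat / chi2.ppf(hi_q, 2 * n),
             2 * n * k_hat / chi2.ppf(lo_q, 2 * n))
    # asymptotic: k_hat is approximately N(k, k^2/n)
    z = norm.ppf(hi_q)
    asym = (k_hat * (1 - z / np.sqrt(n)), k_hat * (1 + z / np.sqrt(n)))
    return exact, asym

rng = np.random.default_rng(0)
x = rng.exponential(scale=1 / 3, size=100)  # k = 1/lambda = 1/3
exact, asym = exp_mean_cis(x)               # the two intervals nearly agree here
```

With n = 100 the two intervals are close; for small n the chi-square interval captures the skewness that the normal approximation neglects.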
Asymptotic Method: Pros and Cons
Pros
+ easier computation if the sampling distribution F is known
+ the expected information I(θ) may be replaced by the observed information I(ˆθ)
Cons
– works well only for n sufficiently large (typically at least n > 50)
– neglects the skewness of the distribution of ˆθ
– requires knowing F or the family of distributions it belongs to
– can be applied only if ˆθ is asymptotically normal (typically the MLE)
11/20
Jackknife Method
• Given any estimator ˆθ, the jackknife is based on the n leave-1-out estimators
ˆθ(i) = T(X1, ..., Xi−1, Xi+1, ..., Xn), with ˆθ(·) = 1/n Σi ˆθ(i).
We also consider the n pseudo-values ˜θi = nˆθ − (n − 1)ˆθ(i).
• A first use of the jackknife is bias correction. Indeed,
biasjack = (n − 1)(ˆθ(·) − ˆθ)
is a linear estimator of Bias(ˆθ) (i.e. its error is O(1/n2)). A bias-corrected
estimator is then given by the mean of the pseudo-values
ˆθjack = nˆθ − (n − 1)ˆθ(·) = 1/n Σi ˜θi = ˜θ
• Similarly, the jackknife is used for the estimation of other properties of ˆθ.
E.g. the estimator for Var(ˆθ) given by
varjack = 1/n ˜s2 = 1/n · 1/(n−1) Σi (˜θi − ˜θ)2 = (n−1)/n Σi (ˆθ(i) − ˆθ(·))2
is a linear estimator, assuming that ˆθ = T(X1, ..., Xn) is smooth.
• We may use varjack and ˆθjack to compute an approximate jackknife confidence
interval using the asymptotic approximation
(˜θ − θ)/√(˜s2/n) ≈ tn−1
but in practice this approximation is often too crude.
12/20
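The leave-one-out recipe above fits in a few lines. A sketch assuming NumPy; the function name `jackknife` and the toy data are ours. Applied to the biased variance MLE, the jackknife correction recovers exactly the Bessel-corrected sample variance:

```python
import numpy as np

def jackknife(x, stat):
    """Leave-one-out jackknife: bias-corrected estimate and variance of stat."""
    n = len(x)
    theta_hat = stat(x)
    loo = np.array([stat(np.delete(x, i)) for i in range(n)])  # the theta_(i)
    theta_dot = loo.mean()
    bias = (n - 1) * (theta_dot - theta_hat)    # bias_jack
    var = (n - 1) / n * np.sum((loo - theta_dot) ** 2)
    return theta_hat - bias, var

rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, size=50)
corrected, var_jack = jackknife(x, np.var)  # np.var is the biased MLE
# 'corrected' equals the Bessel-corrected variance np.var(x, ddof=1)
```

This identity illustrates the bias-correction property: the MLE's bias −σ2/n is exactly linear in 1/n, so the jackknife removes it completely.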
Jackknife Method: Limitations
• If ˆθ is non-smooth, the jackknife variance estimator may be non-consistent.
If ˆθ is the sample median, it can be proved that
varjack / Var(ˆθ) d−→ (χ2 2 / 2)2
• To fix this we introduce an extension of the jackknife. This time we consider
the (n choose d) leave-d-out estimators, obtained by computing the statistic T
on every possible subset of X1, ..., Xn obtained by removing d elements.
For ˆθ = sample median, choosing √n < d < n − 1 yields a consistent variance
estimator
varjack = (n − d) / (d (n choose d)) Σi (ˆθ(i) − ˆθ(·))2
13/20
Jackknife Method: Example
Consider (X, Y ) ∼ F for some F and the Pearson correlation coefficient ρ.
It can be shown that the estimator
ˆρ2 = [ Σi (xi − ¯x)(yi − ¯y) / ( √(Σi (xi − ¯x)2) √(Σi (yi − ¯y)2) ) ]2
is biased.
F =⇒ (x1, y1), (x2, y2), (x3, y3), ..., (x10, y10) −→ ˆρ2
⇓
(x2, y2), (x3, y3), (x4, y4), ..., (x10, y10) −→ ˆρ2 (1)
(x1, y1), (x3, y3), (x4, y4), ..., (x10, y10) −→ ˆρ2 (2)
...
(x1, y1), (x2, y2), (x3, y3), ..., (x9, y9) −→ ˆρ2 (10)
The jackknife estimator ˆρ2 jack = 10ˆρ2 − 9ˆρ2 (·) is bias-corrected up to O(1/n2).
14/20
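The ten leave-one-out estimates of this example can be computed directly. A sketch assuming NumPy, with made-up correlated data; the function name `rho2` is ours:

```python
import numpy as np

def rho2(x, y):
    """Squared Pearson correlation coefficient."""
    return np.corrcoef(x, y)[0, 1] ** 2

rng = np.random.default_rng(1)
x = rng.normal(size=10)
y = 0.8 * x + 0.5 * rng.normal(size=10)     # correlated toy data, n = 10
n = len(x)
# the ten leave-one-out estimates rho2_(1), ..., rho2_(10)
loo = np.array([rho2(np.delete(x, i), np.delete(y, i)) for i in range(n)])
rho2_jack = n * rho2(x, y) - (n - 1) * loo.mean()   # 10*rho2 - 9*rho2_(.)
```

Note that here each pair (xi, yi) is removed together: for paired data the "observation" left out is the pair, not a single value.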
Jackknife Method: Pros and Cons
Pros
+ can be used for non-parametric statistics
+ fast computation
+ bias correction up to O(1/n2)
+ leave-d-out provides a consistent variance estimator
Cons
– leave-1-out may be non-consistent
– leave-d-out is more expensive
– confidence intervals are based on crude approximations; the bootstrap is better
15/20
Bootstrap Method
• The bootstrap consists in B resamplings with replacement from x1, ..., xn.
This is equivalent to sampling from the empirical CDF ˆF.
• For each of the B resamples (x(b)1, ..., x(b)n), compute the estimator ˆθ∗(b).
We use the values ˆθ∗(b) to estimate the distribution of ˆθ∗ = ˆθ∗( ˆF), which in
turn is an approximation of the distribution of interest ˆθ = ˆθ(F).
• To compute point estimates for the properties of ˆθ we use the pair (ˆθ∗, ˆθ)
to approximate the pair (ˆθ, θ). The bootstrap bias estimator for
Bias(ˆθ) = E[ˆθ] − θ is given by
biasboot = E[ˆθ∗] − ˆθ = 1/B Σb ˆθ∗ b − ˆθ
so that the bias-corrected bootstrap estimator reads
ˆθboot = ˆθ − biasboot = 2ˆθ − 1/B Σb ˆθ∗ b
Similarly, the bootstrap variance estimator for Var(ˆθ) = E[(ˆθ − E[ˆθ])2] is
varboot = 1/(B−1) Σb (ˆθ∗ b − 1/B Σb ˆθ∗ b )2
• Notice that the bootstrap is more generic than the jackknife, since it
estimates the whole distribution of ˆθ and not only its bias and variance.
Actually, one can prove that the jackknife is a first-order approximation of
the bootstrap.
16/20
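The bias and variance estimators above can be sketched as follows, assuming NumPy; the function name `bootstrap` and the toy data are ours:

```python
import numpy as np

def bootstrap(x, stat, B=2000, seed=0):
    """Bootstrap replicates of stat, with bias-corrected estimate and variance."""
    rng = np.random.default_rng(seed)
    n = len(x)
    theta_hat = stat(x)
    # B resamples with replacement, i.e. i.i.d. draws from the empirical CDF
    boot = np.array([stat(rng.choice(x, size=n, replace=True)) for _ in range(B)])
    corrected = 2 * theta_hat - boot.mean()  # theta_hat - bias_boot
    var = boot.var(ddof=1)                   # var_boot
    return corrected, var, boot

rng = np.random.default_rng(2)
x = rng.normal(0.0, 1.0, size=40)
corrected, var_boot, boot = bootstrap(x, np.median, B=500)
```

Using the median as the statistic shows the bootstrap's generality: no exact or asymptotic formula is needed, only the ability to recompute the statistic on each resample.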
Bootstrap Method: Example
Real World        Bootstrap World
F                 ˆF
⇓                 ⇓
x1, ..., xn       x∗1, ..., x∗n
↓                 ↓
ˆθ                ˆθ∗
17/20
Bootstrap Method: Confidence Intervals
• Different techniques are available to compute bootstrap interval estimates.
Here p[α] denotes the α-quantile of the distribution p, with z[α] for the
standard normal.
• The pivotal interval comes from P(l < ˆθ − ˆθ∗ < u) ≈ P(l < θ − ˆθ < u):
CI = (2ˆθ − ˆθ∗[1 − α/2], 2ˆθ − ˆθ∗[α/2])
• The studentized interval takes an approach similar to the jackknife's:
CI = (ˆθjack − tn−1[α/2] √varjack , ˆθjack + tn−1[α/2] √varjack)
• The BCa interval (bias-corrected and accelerated):
CI = (ˆθ∗[g(α)], ˆθ∗[g(1 − α)]), with g(α) = Φ( z0 + (z0 + z[α]) / (1 − a(z0 + z[α])) )
where z0 = Φ−1(#{ˆθ∗b < ˆθ}/B) is the bias correction and the acceleration
a = 1/6 Skew(I(ˆθ)) ≈ 1/6 · Σi (ˆθ(·) − ˆθ(i))3 / [Σi (ˆθ(·) − ˆθ(i))2]3/2
is approximated using the jackknife. The BCa interval has an excellent O(1/n)
coverage error, so it is preferred to the other bootstrap methods.
18/20
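Of these, the pivotal interval is the simplest to implement: flip the bootstrap quantiles around ˆθ. A sketch assuming NumPy; the function name `pivotal_ci` and the toy data are ours:

```python
import numpy as np

def pivotal_ci(x, stat, alpha=0.95, B=2000, seed=0):
    """Bootstrap pivotal CI: (2*theta_hat - q_hi, 2*theta_hat - q_lo)."""
    rng = np.random.default_rng(seed)
    n = len(x)
    theta_hat = stat(x)
    boot = np.array([stat(rng.choice(x, size=n, replace=True)) for _ in range(B)])
    # quantiles of the bootstrap distribution of theta*
    lo_q, hi_q = np.quantile(boot, [(1 - alpha) / 2, (1 + alpha) / 2])
    return 2 * theta_hat - hi_q, 2 * theta_hat - lo_q

rng = np.random.default_rng(3)
x = rng.normal(5.0, 1.0, size=100)
lo, hi = pivotal_ci(x, np.mean)             # pivotal interval for the mean
```

Note the quantile flip: the upper bootstrap quantile sets the lower endpoint and vice versa, because the pivot approximates the distribution of θ − ˆθ rather than of ˆθ itself.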
Bootstrap Method: Pros and Cons
Pros
+ can be used for non-parametric statistics
+ more powerful than the jackknife: it approximates the whole distribution of ˆθ
+ more accurate than the jackknife for computing the variance of ˆθ
+ the BCa interval has O(1/n) coverage error
Cons
– more expensive than the jackknife (B should be large enough)
– if n is very small the bootstrap may fail
– if the family of F is known, exact methods give much better results
19/20
Conclusions
• We want to estimate a parameter θ, using samples X1, ..., Xn ∼ Fθ.
• Confidence intervals are needed to express the uncertainty of the estimator ˆθ.
• If the distribution F is known, preferably use exact or asymptotic methods.
• If F is unknown or the distribution of ˆθ is complex, use the jackknife or the bootstrap.
• Use the jackknife to estimate properties of ˆθ. Not so good for confidence intervals.
• Use the bootstrap to estimate the distribution of ˆθ. Good for confidence intervals (BCa).
20/20