Introduction to Bag of Little Bootstrap

ML-IR Discussion: Bag of Little Bootstrap (BLB)

Outline:
- Recap
- Why bootstrap
- What is bootstrap
- Bag of Little Bootstrap (BLB)
- Guarantees
- Examples

Asymptotic Approach
Theory has it:

95% Confidence Interval
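For reference, the standard large-sample result for the sample median, which an asymptotic confidence interval relies on, can be written as below. This is a sketch under the usual assumptions (i.i.d. data, density f continuous and positive at the true median m). The symbols \hat m_n, m and z_{0.975} are notation chosen here, not taken from the slides.

```latex
\sqrt{n}\,\bigl(\hat m_n - m\bigr) \xrightarrow{\;d\;} \mathcal{N}\!\left(0,\ \frac{1}{4 f(m)^2}\right)
\quad\Longrightarrow\quad
\hat m_n \pm z_{0.975}\,\frac{1}{2 f(m)\sqrt{n}}, \qquad z_{0.975} \approx 1.96 .
```

Note that the interval depends on f evaluated at the unknown true median, so both have to be estimated before it can be used.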

Problems with the asymptotic approach:

- Density “f” is hard to estimate
- The median needs a much larger sample size than the mean for the Central Limit Theorem to kick in
- True median unknown

Solution:
When theory is too hard…
Let’s empirically estimate the theoretical truth!

Empirical Approach: Ideal
Population

Sample over and over again!

Median Est 1

Median Est 2

95% of sample medians
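The ideal procedure above can be sketched in a few lines. This is a toy simulation, not something available in practice: the "population" is a hypothetical exponential distribution, and the sizes (n, num_samples) are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical population (assumption): a skewed distribution we pretend to fully own.
population = rng.exponential(scale=1.0, size=20_000)

n = 500            # size of each fresh sample
num_samples = 500  # how many times we sample "over and over"

# Draw a fresh sample from the population each time and record its median.
medians = np.array([
    np.median(rng.choice(population, size=n, replace=False))
    for _ in range(num_samples)
])

# The central 95% of the sample medians.
lower, upper = np.percentile(medians, [2.5, 97.5])
true_median = np.median(population)
print(f"true median {true_median:.3f}; 95% of sample medians in [{lower:.3f}, {upper:.3f}]")
```

In real problems we only get one sample, which is what motivates the bootstrap.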

Similar enough?
Population

Our Sample

Empirical Approach: Bootstrap
Efron & Tibshirani (1993)
Our Sample

Draw n samples with replacement

Median Est* 1

Median Est* 2

95% of sample medians
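A minimal sketch of the bootstrap picture above, using the percentile interval for the median. The function name bootstrap_median_ci, the number of resamples, and the exponential toy data are all assumptions made for illustration.

```python
import numpy as np

def bootstrap_median_ci(sample, num_resamples=2_000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the median (illustrative helper)."""
    rng = np.random.default_rng(seed)
    sample = np.asarray(sample)
    n = sample.size
    # Each row of idx is one bootstrap resample: n indices drawn with replacement.
    idx = rng.integers(0, n, size=(num_resamples, n))
    boot_medians = np.median(sample[idx], axis=1)  # Median Est* 1, 2, ...
    lower, upper = np.percentile(boot_medians, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return lower, upper, boot_medians

rng = np.random.default_rng(1)
our_sample = rng.exponential(scale=1.0, size=500)  # toy data (assumption)
lower, upper, boot_medians = bootstrap_median_ci(our_sample)
print(f"sample median {np.median(our_sample):.3f}, 95% interval [{lower:.3f}, {upper:.3f}]")
```

The only change from the ideal approach is that resamples come from our sample, with replacement, instead of from the population.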

Used for:
- Bias estimation
- Variance
- Confidence intervals
Main benefits:
- Automatic
- Flexible
- Fast convergence (Hall, 1992)
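The first two uses listed above, bias and variance estimation, follow the same recipe with any statistic plugged in. The sketch below is one common way to do it; the helper name bootstrap_bias_and_variance and the choice of the plug-in variance (np.var, which is known to be biased downward) as the example statistic are assumptions for illustration.

```python
import numpy as np

def bootstrap_bias_and_variance(sample, statistic, num_resamples=4_000, seed=0):
    """Bootstrap estimates of the bias and variance of `statistic` (illustrative helper)."""
    rng = np.random.default_rng(seed)
    sample = np.asarray(sample)
    n = sample.size
    theta_hat = statistic(sample)
    # Recompute the statistic on resamples of size n drawn with replacement.
    replicates = np.array([
        statistic(sample[rng.integers(0, n, size=n)])
        for _ in range(num_resamples)
    ])
    bias = replicates.mean() - theta_hat  # average replicate minus the original estimate
    variance = replicates.var(ddof=1)     # spread of the replicates
    return bias, variance

rng = np.random.default_rng(2)
data = rng.normal(size=30)  # toy data (assumption)
bias, variance = bootstrap_bias_and_variance(data, np.var)
print(f"estimated bias {bias:.4f}, estimated variance {variance:.4f}")
```

This is the "automatic" benefit in action: no formula specific to the statistic is needed.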

Key: There are 3 distributions
Population

Actual Sample
- Approximates the population distribution

Bootstrap Samples
- Approximate the distribution of the actual sample

Approximate the approximation
- Is there bias?
- What’s the variance?
- etc.

No free meals:
- Bootstrapping requires resampling the entire sample B times
- Each resample is of size n
- Resampling only m < n points breaks the sample-size properties (the estimate's variability no longer matches a sample of size n)
- The original sample size cannot be too small
- “Pre-asymptopia” cases

Hope
- A bootstrap resample is expected to contain only about 0.632n unique points
- Sample less: the m out of n bootstrap is possible with analytical adjustments (Bickel 1997)
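The 0.632 figure comes from the chance that a given point appears at least once in n draws with replacement, 1 - (1 - 1/n)^n, which tends to 1 - 1/e. A quick check, with the function name and n chosen here purely for illustration:

```python
import numpy as np

def expected_unique_fraction(n):
    # P(a given point is drawn at least once in n draws with replacement)
    return 1.0 - (1.0 - 1.0 / n) ** n

rng = np.random.default_rng(0)
n = 10_000
resample = rng.integers(0, n, size=n)       # one bootstrap resample, as indices
simulated = np.unique(resample).size / n    # fraction of distinct points it contains

print(f"formula {expected_unique_fraction(n):.4f}, "
      f"limit {1 - np.exp(-1):.4f}, simulated {simulated:.4f}")
```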

Intuition: Each bootstrap resample needs fewer than all n values.

Problem:
- Analytical adjustment is not as automatic as desirable
- m out of n bootstrap is sensitive to choices of m
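To make the "analytical adjustment" concrete, here is a rough sketch of an m out of n interval for the median. It assumes the estimator converges at rate sqrt(n), so that sqrt(m) * (resample median - sample median) stands in for sqrt(n) * (sample median - true median). Knowing that rate is exactly the part that is not automatic. The helper name and the toy data are assumptions.

```python
import numpy as np

def m_out_of_n_median_ci(sample, m, num_resamples=2_000, alpha=0.05, seed=0):
    """m out of n bootstrap interval for the median, assuming a sqrt(n) rate (illustrative)."""
    rng = np.random.default_rng(seed)
    sample = np.asarray(sample)
    n = sample.size
    theta_hat = np.median(sample)
    # Resamples of size m (not n), drawn with replacement.
    idx = rng.integers(0, n, size=(num_resamples, m))
    theta_star = np.median(sample[idx], axis=1)
    # Analytical adjustment: rescale by sqrt(m) here and by sqrt(n) when inverting.
    root = np.sqrt(m) * (theta_star - theta_hat)
    q_lo, q_hi = np.percentile(root, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return theta_hat - q_hi / np.sqrt(n), theta_hat - q_lo / np.sqrt(n)

rng = np.random.default_rng(1)
our_sample = rng.exponential(scale=1.0, size=2_000)  # toy data (assumption)
for gamma in (0.5, 0.7, 0.9):
    m = int(our_sample.size ** gamma)
    lo, hi = m_out_of_n_median_ci(our_sample, m)
    print(f"m = {m:4d}: [{lo:.3f}, {hi:.3f}]")
```

Running it for several values of m gives a feel for how much the interval moves with that choice.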

Bag of Little Bootstrap
- Sample without replacement from the sample s times, into subsamples of size b
- Resample each subsample with replacement until the sample size is n, r times
- Compute the median for each resample (Med 1, ..., Med r)
- Compute the confidence interval for each subsample
- Take the average of the upper and lower endpoints of the confidence intervals
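The steps above can be sketched as follows. This is a minimal illustration for the median, not the reference implementation from the paper: the helper names (weighted_median, blb_median_ci), the default s and r, and the toy data are assumptions. Each size-n resample is stored as b multinomial counts rather than n values.

```python
import numpy as np

def weighted_median(values, counts):
    """Lower weighted median: first sorted value whose cumulative count reaches half the total."""
    order = np.argsort(values)
    cum = np.cumsum(counts[order])
    return values[order][np.searchsorted(cum, 0.5 * cum[-1])]

def blb_median_ci(sample, b, s=10, r=100, alpha=0.05, seed=0):
    """Bag of Little Bootstrap interval for the median (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    sample = np.asarray(sample)
    n = sample.size
    lowers, uppers = [], []
    for _ in range(s):
        # Step 1: a subsample of size b, drawn without replacement.
        subsample = rng.choice(sample, size=b, replace=False)
        # Step 2: r resamples of size n, each stored as counts over the b points.
        counts = rng.multinomial(n, np.full(b, 1.0 / b), size=r)
        # Step 3: the median of each resample (Med 1, ..., Med r).
        medians = np.array([weighted_median(subsample, c) for c in counts])
        # Step 4: a confidence interval for this subsample.
        lo, hi = np.percentile(medians, [100 * alpha / 2, 100 * (1 - alpha / 2)])
        lowers.append(lo)
        uppers.append(hi)
    # Step 5: average the lower and upper endpoints across subsamples.
    return float(np.mean(lowers)), float(np.mean(uppers))

rng = np.random.default_rng(1)
our_sample = rng.exponential(scale=1.0, size=20_000)  # toy data (assumption)
b = int(our_sample.size ** 0.6)
lo, hi = blb_median_ci(our_sample, b)
print(f"b = {b}, BLB 95% interval [{lo:.4f}, {hi:.4f}], width {hi - lo:.4f}")
```

The outer loop over subsamples has no shared state, so each iteration could run on a separate worker.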

Kleiner et al. 2012
Computational Gains:
- Each sample only has b unique values!
- Can sample a b-dimensional multinomial
with n trials.
- Scales in b instead of n
- Easily parallelizable

If b = n^0.6, for a dataset of size 1 TB:
- Bootstrap storage demands ~ 632 GB per resample
- BLB storage demands ~ 4 GB per subsample or resample
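These figures can be reproduced with a little arithmetic. The slide does not state n. The sketch below assumes n = 10^6 equally sized records, chosen only because it is consistent with the quoted numbers, and treats 1 TB as 1000 GB.

```python
def storage_gb(total_gb, n, gamma):
    """Approximate storage of one bootstrap resample vs. one BLB subsample (illustrative)."""
    # A bootstrap resample holds about (1 - (1 - 1/n)^n) ~ 0.632 of the distinct records.
    bootstrap_gb = total_gb * (1 - (1 - 1 / n) ** n)
    # A BLB subsample (and any resample of it) holds at most b = n^gamma distinct records.
    blb_gb = total_gb * (n ** gamma) / n
    return bootstrap_gb, blb_gb

# n = 10**6 is an assumption, not a value given on the slide.
bootstrap_gb, blb_gb = storage_gb(total_gb=1000, n=10**6, gamma=0.6)
print(f"bootstrap ~ {bootstrap_gb:.0f} GB, BLB ~ {blb_gb:.1f} GB")
```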

Theoretical guarantees:
- Consistency
- Higher order correctness
- Fast convergence rate (same as bootstrap)

Performance
b = n^gamma, 0.5 <= gamma <= 1
These choices of gamma ensure bootstrap convergence rates.

Performance
Relative error of confidence interval width of logistic regression coefficients
(Kleiner et al. 2012)

Figure panels: Gamma residuals, t-distributed residuals

Selecting Hyperparameters
• b, the number of unique samples in each little bootstrap
• s, the number of size-b subsamples drawn without replacement
• r, the number of multinomial resamples drawn per subsample

b: the larger the better
s, r: adaptively increase these until convergence is reached (the median estimate stops changing)
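One possible way to pick r adaptively for a single subsample is sketched below: keep adding resamples in batches and stop once the interval endpoints stop moving. The stopping rule (relative change below tol between batches), the helper names, and all default values are assumptions for illustration, not the rule used in the paper. The same idea can be applied to s.

```python
import numpy as np

def resample_median(subsample, n, rng):
    """Median of one size-n resample of the subsample, stored as multinomial counts."""
    b = subsample.size
    counts = rng.multinomial(n, np.full(b, 1.0 / b))
    order = np.argsort(subsample)
    cum = np.cumsum(counts[order])
    return subsample[order][np.searchsorted(cum, 0.5 * n)]

def adaptive_r(subsample, n, batch=25, tol=0.05, max_r=1_000, seed=0):
    """Grow r in batches until the 95% interval endpoints stabilise (illustrative rule)."""
    rng = np.random.default_rng(seed)
    medians, prev = [], None
    while len(medians) < max_r:
        medians.extend(resample_median(subsample, n, rng) for _ in range(batch))
        lo, hi = np.percentile(medians, [2.5, 97.5])
        current = np.array([lo, hi])
        if prev is not None:
            scale = max(hi - lo, 1e-12)  # guard against a zero-width interval
            if np.max(np.abs(current - prev)) / scale < tol:
                break
        prev = current
    return len(medians), (float(lo), float(hi))

rng = np.random.default_rng(1)
n = 20_000
subsample = rng.exponential(scale=1.0, size=int(n ** 0.6))  # toy subsample (assumption)
r_used, (lo, hi) = adaptive_r(subsample, n)
print(f"stopped at r = {r_used}, interval [{lo:.4f}, {hi:.4f}]")
```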

Main benefits:
- Computationally friendly
- Maintains most statistical properties of bootstrap
- Flexibility
- More robust to choice of b than older methods

References
• Efron and Tibshirani (1993), An Introduction to the Bootstrap
• Kleiner et al. (2012), A Scalable Bootstrap for Massive Data

Thanks!
