Introduction to Bag of Little Bootstrap
Reading group presentation on Bag of Little Bootstrap (BLB)

Transcript

  • 1. ML-IR Discussion: Bag of Little Bootstrap (BLB)
  • 2. Recap: - Recap - Why bootstrap - What is bootstrap - Bag of Little Bootstrap (BLB) - Guarantees - Examples
  • 3. Recap: Population Our Sample
  • 4. Estimate the median!
  • 5. Estimate the median!
  • 6. Asymptotic Approach Theory has it:
  • 7. Asymptotic Approach Theory has it: ?
  • 8. Asymptotic Approach 95% Confidence Interval
  • 9. Problems with the asymptotic approach: - The density “f” is hard to estimate - The median needs a much larger sample than the mean does before the central limit theorem kicks in - The true median is unknown
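For reference, the standard large-sample result for the sample median, which is presumably what the asymptotic-approach slides display, is sketched below. The notation is an assumption made here: m is the true median, \hat{m}_n is the sample median of n i.i.d. points, and f is the population density, taken to be positive and continuous at m.

```latex
\sqrt{n}\,\bigl(\hat{m}_n - m\bigr) \xrightarrow{d} \mathcal{N}\!\left(0,\ \frac{1}{4 f(m)^2}\right),
\qquad
\text{95\% CI: } \hat{m}_n \pm \frac{1.96}{2 f(m)\sqrt{n}}
```

The interval width depends on f(m), the density evaluated at the unknown median, which is why estimating f appears first in the list of problems.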
  • 10. Solution: When theory is too hard… Let’s empirically estimate theoretical truth!
  • 11. Empirical Approach: Ideal Population Sample Over and Over again!
  • 12. Empirical Approach: Ideal Population Sample Over and Over again! Median Est 1 Median Est 2
  • 13. Empirical Approach: Ideal
  • 14. Empirical Approach: Ideal 95% of sample medians
  • 15. Similar Enough? Population Our Sample
  • 16. Empirical Approach: Bootstrap (Efron and Tibshirani, 1993) Our Sample: draw n points with replacement, over and over. Median Est* 1, Median Est* 2
  • 17. Empirical Approach: Bootstrap
  • 18. Empirical Approach: Bootstrap 95% of sample medians
  • 19. Empirical Approach: Bootstrap Used for: - Bias estimation - Variance - Confidence intervals Main benefits: - Automatic - Flexible - Fast convergence (Hall, 1992)
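The bootstrap procedure on the slides above can be sketched in a few lines. This is a minimal illustration using the percentile interval, assuming NumPy is available; the function name `bootstrap_median_ci` and its parameters are hypothetical, not taken from the slides.

```python
import numpy as np

def bootstrap_median_ci(sample, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the median (illustrative)."""
    rng = np.random.default_rng(seed)
    sample = np.asarray(sample, dtype=float)
    n = sample.size
    # Each row holds the indices of one resample of size n, drawn with replacement.
    idx = rng.integers(0, n, size=(n_boot, n))
    # One median estimate per resample ("Median Est* 1", "Median Est* 2", ...).
    medians = np.median(sample[idx], axis=1)
    # The middle (1 - alpha) share of the resampled medians forms the interval.
    lo, hi = np.quantile(medians, [alpha / 2, 1 - alpha / 2])
    return float(lo), float(hi)

# Usage on synthetic data (the data-generating choice is arbitrary).
data = np.random.default_rng(1).exponential(scale=1.0, size=500)
lo, hi = bootstrap_median_ci(data)
print(f"95% CI for the median: [{lo:.3f}, {hi:.3f}]")
```

Note that every resample has size n, so memory and time grow with the full sample size; this is the cost the later slides set out to reduce.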
  • 20. Key: There are 3 distributions Population
  • 21. Key: There are 3 distributions Population Approximate distribution Actual Sample
  • 22. Key: There are 3 distributions Population Approximate distribution Actual Sample Approximate distribution Bootstrap Samples
  • 23. Key: There are 3 distributions Population Approximate distribution Actual Sample Approximate distribution Bootstrap Samples Approximate the approximation - Is there bias? - What’s the variance? - etc.
  • 24. No free meals: - Bootstrapping requires re-sampling the entire sample B times - Each resample is of size n - Sampling m < n will violate the sample size properties - Original sample size cannot be too small - “Pre-asymptopia” cases
  • 25. Hope - A resample is expected to contain about 0.632n unique points - Sample less: the m out of n bootstrap is possible with analytical adjustments. (Bickel 1997)
  • 26. Hope - A resample is expected to contain about 0.632n unique points - Sample less: the m out of n bootstrap is possible with analytical adjustments. (Bickel 1997) Intuition: Each bootstrap resample needs fewer than all n values.
  • 27. Hope - A resample is expected to contain about 0.632n unique points - Sample less: the m out of n bootstrap is possible with analytical adjustments. (Bickel 1997) Intuition: Each bootstrap resample needs fewer than all n values. Problem: - The analytical adjustment is not as automatic as desirable - The m out of n bootstrap is sensitive to the choice of m
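The 0.632n figure follows from the chance that a given point is picked at least once in n draws with replacement, 1 - (1 - 1/n)^n, which tends to 1 - 1/e ≈ 0.632. A quick check using only the standard library, with an arbitrarily chosen n:

```python
import math
import random

n = 100_000                      # sample size, chosen arbitrarily for the check
rng = random.Random(0)

# Draw n indices with replacement and count how many distinct ones appear.
unique = len({rng.randrange(n) for _ in range(n)})
simulated_fraction = unique / n

# Exact expectation for finite n, and its large-n limit.
exact_fraction = 1 - (1 - 1 / n) ** n
limit_fraction = 1 - 1 / math.e

print(f"simulated: {simulated_fraction:.4f}")
print(f"exact:     {exact_fraction:.4f}")
print(f"limit:     {limit_fraction:.4f}")
```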
  • 28. Bag of Little Bootstrap - Draw s subsamples of size b from the sample, without replacement
  • 29. Bag of Little Bootstrap - Draw s subsamples of size b from the sample, without replacement - Resample each subsample with replacement up to size n, r times
  • 30. Bag of Little Bootstrap - Draw s subsamples of size b from the sample, without replacement - Resample each subsample with replacement up to size n, r times - Compute the median of each resample (Med 1, ..., Med r)
  • 31. Bag of Little Bootstrap - Draw s subsamples of size b from the sample, without replacement - Resample each subsample with replacement up to size n, r times - Compute the median of each resample (Med 1, ..., Med r) - Compute the confidence interval for each subsample
  • 32. Bag of Little Bootstrap - Draw s subsamples of size b from the sample, without replacement - Resample each subsample with replacement up to size n, r times - Compute the median of each resample (Med 1, ..., Med r) - Compute the confidence interval for each subsample
  • 33. Bag of Little Bootstrap - Draw s subsamples of size b from the sample, without replacement - Resample each subsample with replacement up to size n, r times - Compute the median of each resample (Med 1, ..., Med r) - Compute the confidence interval for each subsample - Average the lower endpoints and the upper endpoints across the s subsamples to get the final confidence interval
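The steps above can be sketched as follows. This is a minimal illustration rather than the reference implementation: it assumes NumPy, draws each subsample independently, uses percentile intervals, and represents every size-n resample as multinomial counts over the b subsample points. The names `weighted_median` and `blb_median_ci` are hypothetical.

```python
import numpy as np

def weighted_median(values, counts):
    """Lower median of the multiset where values[i] appears counts[i] times."""
    order = np.argsort(values)
    v, c = values[order], counts[order]
    cum = np.cumsum(c)
    # First sorted value whose cumulative count reaches half of the total.
    return v[np.searchsorted(cum, cum[-1] / 2)]

def blb_median_ci(sample, gamma=0.6, s=10, r=100, alpha=0.05, seed=0):
    """Bag of Little Bootstraps confidence interval for the median (sketch)."""
    rng = np.random.default_rng(seed)
    sample = np.asarray(sample, dtype=float)
    n = sample.size
    b = int(n ** gamma)                       # subsample size b = n^gamma
    lowers, uppers = [], []
    for _ in range(s):
        # Step 1: one subsample of size b, drawn without replacement.
        sub = rng.choice(sample, size=b, replace=False)
        # Step 2: r resamples of size n, each stored as b multinomial counts.
        counts = rng.multinomial(n, np.full(b, 1.0 / b), size=r)
        # Step 3: the median of each resample.
        meds = np.array([weighted_median(sub, c) for c in counts])
        # Step 4: a confidence interval for this subsample.
        lo, hi = np.quantile(meds, [alpha / 2, 1 - alpha / 2])
        lowers.append(lo)
        uppers.append(hi)
    # Step 5: average the endpoints across the s subsamples.
    return float(np.mean(lowers)), float(np.mean(uppers))

# Usage on synthetic data (the data-generating choice is arbitrary).
data = np.random.default_rng(1).normal(size=10_000)
lo, hi = blb_median_ci(data)
print(f"BLB 95% CI for the median: [{lo:.3f}, {hi:.3f}]")
```

Because a resample is only a vector of b counts, the full size-n resample is never materialized, and the s outer iterations are independent, so they could run in parallel.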
  • 34. Bag of Little Bootstrap (Kleiner et al. 2012) Computational Gains: - Each resample has at most b unique values! - A resample can be drawn as a b-dimensional multinomial with n trials - Scales in b instead of n - Easily parallelizable
  • 35. Bag of Little Bootstrap (Kleiner et al. 2012) Computational Gains: - Each resample has at most b unique values! - A resample can be drawn as a b-dimensional multinomial with n trials - Scales in b instead of n - Easily parallelizable If b=n^(0.6), a dataset of size 1TB: - Bootstrap storage demands ~ 632GB - BLB storage demands ~ 4GB
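The storage figures can be reproduced with a short calculation. The slide does not state how the 1TB is divided into points, so the sketch below assumes n = 10^6 points of 1MB each and decimal units (1TB = 10^6 MB).

```python
n_points = 10**6        # assumption: 1 TB of data made of 1 MB points
mb_per_point = 1.0      # assumption: size of one data point in MB
gamma = 0.6             # exponent in b = n^gamma, as on the slide

# A bootstrap resample holds about (1 - 1/e) * n distinct points.
expected_unique = (1 - (1 - 1 / n_points) ** n_points) * n_points
bootstrap_gb = expected_unique * mb_per_point / 1000

# A BLB subsample (and hence every resample built from it) holds b points.
b = n_points ** gamma
blb_gb = b * mb_per_point / 1000

print(f"bootstrap resample: about {bootstrap_gb:.0f} GB")
print(f"BLB subsample:      about {blb_gb:.1f} GB")
```

Under these assumptions the two numbers come out near 632GB and 4GB, matching the slide.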
  • 36. Bag of Little Bootstrap Theoretical guarantees: - Consistency - Higher order correctness - Fast convergence rate (same as bootstrap)
  • 37. Performance b = n^(gamma), 0.5 <= gamma <= 1 These choices of gamma ensure bootstrap convergence rates.
  • 38. Performance b = n^(gamma), 0.5 <= gamma <= 1 These choices of gamma ensure bootstrap convergence rates. Relative error of confidence interval width of logistic regression coefficients (Kleiner et al. 2012)
  • 39. Performance b = n^(gamma), 0.5 <= gamma <= 1 These choices of gamma ensure bootstrap convergence rates. Relative error of confidence interval width of logistic regression coefficients (Kleiner et al. 2012) Panels: Gamma residuals; t-distr residuals
  • 40. Performance vs Time
  • 41. Selecting Hyperparameters • b, the number of unique points in each little bootstrap • s, the number of size-b subsamples drawn w/o replacement • r, the number of multinomials to draw per subsample
  • 42. Selecting Hyperparameters • b, the number of unique points in each little bootstrap • s, the number of size-b subsamples drawn w/o replacement • r, the number of multinomials to draw per subsample b: the larger the better s, r: increase these adaptively until convergence is reached (the median estimate stops changing)
  • 43. Bag of Little Bootstrap Main benefits: - Computationally friendly - Maintains most statistical properties of bootstrap - Flexibility - More robust to choice of b than older methods
  • 44. Reference • Efron, Tibshirani (1993) An Introduction to the Bootstrap • Kleiner et al. (2012) A Scalable Bootstrap for Massive Data Thanks!
