# Monte Carlo Simulations, Sampling and Markov Chain Monte Carlo

This is the invited talk given at the Basque Center for Applied Mathematics (BCAM) in Spain in 2010.


### Monte Carlo Simulations, Sampling and Markov Chain Monte Carlo

1. Monte Carlo Simulations, Sampling and Markov Chain Monte Carlo

    Xin-She Yang, © 2010

    Outline:

    - Monte Carlo: Estimating π; Buffon's problem; Probability; Monte Carlo; Monte Carlo integration; Quality of sampling; Quasi-Monte Carlo
    - Pseudorandom: Pseudorandom number generation; Other distributions; Limitations; Multivariate distributions
    - Markov Chains: Markov chains; Markov chains; A famous Markov chain
2. Estimating π

    How to estimate π using only a ruler and some matchsticks?
3. Buffon's Needle Problem

    Buffon's needle problem (1733). Probability of crossing a line:

    $$p = \frac{2}{\pi} \cdot \frac{L}{d},$$

    where $L$ = length of the needles, and $d$ = spacing between the lines.
4. Probability of Crossing a Line

    Since $p \approx n/N \approx 2L/(\pi d)$, where $n$ of the $N$ needles thrown cross a line, we have

    $$\pi \approx \frac{2N}{n} \cdot \frac{L}{d}.$$

    Lazzarini (1901): $L = 5d/6$, $N = 3408$, $n = 1808$, so

    $$\pi \approx \frac{2 \times 3408}{1808} \cdot \frac{5}{6} \approx 3.14159292.$$

    Too accurate?! Is this right? What happens when $n = 1809$?

    Errors $\sim 1/\sqrt{N} \sim 2\%$.
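The estimate on this slide can be tried numerically. The sketch below is an illustration rather than anything taken from the talk: the function names `buffon_pi` and `lazzarini` are hypothetical, the formula is used under the assumption $L \le d$, and the simulation draws the needle angle with `math.pi`, which is circular if the goal is genuinely to discover π.

```python
import math
import random


def buffon_pi(num_throws, needle_len=5.0, spacing=6.0, seed=0):
    """Estimate pi by simulating needle throws onto parallel lines.

    Assumes needle_len <= spacing, the regime where p = 2L / (pi d) holds.
    """
    rng = random.Random(seed)
    crossings = 0
    for _ in range(num_throws):
        # Distance from the needle centre to the nearest line,
        # and the acute angle between the needle and the lines.
        centre = rng.uniform(0.0, spacing / 2.0)
        angle = rng.uniform(0.0, math.pi / 2.0)
        if centre <= (needle_len / 2.0) * math.sin(angle):
            crossings += 1
    # pi ~ (2 N / n) * (L / d)
    return 2.0 * num_throws * needle_len / (crossings * spacing)


def lazzarini(n_crossings, n_throws=3408, ratio=5.0 / 6.0):
    """Evaluate (2 N / n) * (L / d) for the figures quoted on the slide."""
    return 2.0 * n_throws / n_crossings * ratio


print(lazzarini(1808))  # equals 355/113
print(lazzarini(1809))  # one extra crossing shifts the third decimal place
print(buffon_pi(200_000))
```

With $n = 1808$ the formula reduces exactly to $355/113$, which is why the result looks far more accurate than the $1/\sqrt{N}$ error estimate would suggest.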
5. Monte Carlo Methods

    Everyone has used Monte Carlo methods in some way ...

    - Measure temperatures, choose a product, ...
    - Taste soup, wine ...
6. Monte Carlo Integration

    $$I = \int_{\Omega} f \, dv = V \, \frac{1}{N} \sum_{i=1}^{N} f_i + O(\epsilon),$$

    $$\epsilon \sim \sqrt{\frac{\frac{1}{N} \sum_{i=1}^{N} f_i^2 - \mu^2}{N}} \sim O(1/\sqrt{N}),$$

    where $f_i$ is the integrand evaluated at the $i$-th sample point, $V$ is the volume of $\Omega$, and $\mu$ is the sample mean of the $f_i$.
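A minimal one-dimensional sketch of this estimator follows. The function name `mc_integrate` and the test integrand are illustrative assumptions; the reported standard error multiplies the slide's $\epsilon$ by the volume $V = b - a$ so that it is in the same units as the estimate.

```python
import math
import random


def mc_integrate(f, a, b, num_samples, seed=0):
    """Estimate the integral of f over [a, b] from uniform samples.

    Returns (estimate, standard_error).
    """
    rng = random.Random(seed)
    volume = b - a
    total = 0.0
    total_sq = 0.0
    for _ in range(num_samples):
        fx = f(rng.uniform(a, b))
        total += fx
        total_sq += fx * fx
    mean = total / num_samples
    # Sample variance of f_i: (1/N) sum f_i^2 - mu^2, clipped at zero.
    variance = max(total_sq / num_samples - mean * mean, 0.0)
    estimate = volume * mean
    std_error = volume * math.sqrt(variance / num_samples)
    return estimate, std_error


# The integral of sin(x) over [0, pi] is exactly 2.
estimate, std_error = mc_integrate(math.sin, 0.0, math.pi, 100_000)
print(estimate, std_error)
```

Quadrupling `num_samples` should roughly halve `std_error`, which is the $O(1/\sqrt{N})$ behaviour stated above.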
7. Importance and Quality of the Samples

    Higher dimensions: even more challenging!

    $$I = \int \!\! \int \cdots \int f(u, v, \ldots, w) \, du \, dv \cdots dw.$$

    Errors $\sim 1/\sqrt{N}$.

    Higher dimensional integrals: how to distribute these sampling points?

    - Regular grids: $E \sim O(N^{-2/d})$ in $d \ge 4$ dimensions (not enough!)
    - Strategies: importance sampling, Latin hypercube, ...

    Any other ways?
8. Quasi-Monte Carlo Methods

    In essence, the idea is to distribute (consecutive) sampling points as far apart as possible, using quasi-random or low-discrepancy numbers (not pseudo-random): Halton, Sobol, van der Corput, ...

    For example, the van der Corput sequence expresses an integer $n$ in a prime base $b$:

    $$n = \sum_{j=0}^{m} a_j(n) \, b^j, \qquad a_j \in \{0, 1, 2, \ldots, b-1\}.$$

    Then the digits are reversed or reflected about the radix point:

    $$\phi_b(n) = \sum_{j=0}^{m} a_j(n) \, \frac{1}{b^{j+1}}.$$

    For example, in base 2: $0, 1, 2, \ldots, 15 \Longrightarrow 0, \frac{1}{2}, \frac{1}{4}, \frac{3}{4}, \frac{1}{8}, \ldots, \frac{15}{16}$.

    Errors $\sim O(1/N)$.
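The digit-reversal construction can be written in a few lines. This is a sketch using exact fractions so that the output matches the example on the slide; the function name `van_der_corput` is an illustrative choice.

```python
from fractions import Fraction


def van_der_corput(n, base=2):
    """Return phi_b(n): the base-b digits of n reflected about the radix point."""
    value = Fraction(0)
    denom = base
    while n > 0:
        # Peel off the least significant digit a_j and weight it by 1 / b^(j+1).
        n, digit = divmod(n, base)
        value += Fraction(digit, denom)
        denom *= base
    return value


sequence = [van_der_corput(n) for n in range(16)]
print([str(x) for x in sequence])
```

Each new point falls in the largest gap left by the earlier ones, which is the "as far apart as possible" property described above.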
9. Pseudorandom numbers by deterministic sequences

    Uniform distributions:

    $$d_i = (a \, d_{i-1} + c) \bmod m.$$

    Classic IBM generator: $a = 65539$, $c = 0$, $m = 2^{31}$ (strong correlation!). In fact, the correlation coefficient is 1!

    Better choice (old Matlab): $a = 7^5 = 16807$, $c = 0$, $m = 2^{31} - 1 = 2{,}147{,}483{,}647$.

    If scaled by $m$, all numbers are in $[1/m, (m-1)/m]$.

    New Matlab: $[\epsilon, 1 - \epsilon]$, $\epsilon = 2^{-53} \approx 1.1 \times 10^{-16}$.

    IEEE: 64-bit system = 53 bits for a signed fraction in base 2 and 11 bits for a signed exponent.
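Both generators are instances of the same recurrence and can be compared directly. The sketch below uses hypothetical helper names (`lcg`, `take`); the check on the IBM constants relies on the identity $65539^2 \equiv 6 \cdot 65539 - 9 \pmod{2^{31}}$, which makes every output an exact linear combination of the previous two.

```python
def lcg(seed, a, c, m):
    """Yield the sequence d_i = (a * d_{i-1} + c) mod m."""
    state = seed
    while True:
        state = (a * state + c) % m
        yield state


def take(gen, count):
    """Collect the next `count` values from a generator."""
    return [next(gen) for _ in range(count)]


M31 = 2**31
randu = take(lcg(1, 65539, 0, M31), 1000)        # classic IBM constants
minstd = take(lcg(1, 16807, 0, M31 - 1), 1000)   # a = 7^5, m = 2^31 - 1

# With the IBM constants, x[k+2] = 6 x[k+1] - 9 x[k] (mod 2^31) for every k.
randu_is_linear = all(
    (randu[k + 2] - 6 * randu[k + 1] + 9 * randu[k]) % M31 == 0
    for k in range(len(randu) - 2)
)
print(randu_is_linear)

# Scaling by m maps the second generator into the open interval (0, 1).
uniforms = [x / (M31 - 1) for x in minstd]
print(min(uniforms), max(uniforms))
```

Because of that three-term relation, consecutive triples from the IBM generator lie on a small number of planes in the unit cube, which is one concrete form of the strong correlation noted on the slide.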
10. Other Distributions

    Inverse transform method, rejection method, Mersenne twister, ..., Markov chain Monte Carlo.

    Standard normal distribution:

    $$p(u) = \frac{1}{\sqrt{2\pi}} e^{-u^2/2},$$

    CDF:

    $$\Phi(v) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{v} e^{-u^2/2} \, du = \frac{1}{2} \left[ 1 + \mathrm{erf}\!\left( \frac{v}{\sqrt{2}} \right) \right],$$

    $$v = \Phi^{-1}(u) = \sqrt{2} \, \mathrm{erf}^{-1}(2u - 1).$$

    [Figure: two histograms, one of uniform samples on $[0, 1]$ and one of the transformed samples spanning roughly $-6$ to $6$.]
11. Transform method: Limitations

    $$v = \Phi^{-1}(u) = \sqrt{2} \, \mathrm{erf}^{-1}(2u - 1),$$

    $$\mathrm{erf}^{-1}(x) = \frac{\sqrt{\pi}}{2} \left[ x + \frac{\pi x^3}{12} + \frac{7 \pi^2 x^5}{480} + \frac{127 \pi^3 x^7}{40320} + \cdots \right].$$

    Not so easy to calculate!

    Sometimes, the inverse may not be possible.
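To make the limitation concrete, the sketch below compares the four-term series above with the inverse normal CDF from Python's standard library (`statistics.NormalDist().inv_cdf`, available from Python 3.8). The helper names are illustrative assumptions, and the library routine stands in for whatever numerical inverse one would use in practice.

```python
import math
import random
from statistics import NormalDist, mean, pstdev


def erfinv_series(x):
    """Four-term Maclaurin series of erf^-1(x) as written on the slide."""
    return (math.sqrt(math.pi) / 2.0) * (
        x
        + math.pi * x**3 / 12.0
        + 7.0 * math.pi**2 * x**5 / 480.0
        + 127.0 * math.pi**3 * x**7 / 40320.0
    )


def normal_from_uniform_series(u):
    """v = sqrt(2) * erf^-1(2u - 1), using the truncated series."""
    return math.sqrt(2.0) * erfinv_series(2.0 * u - 1.0)


def normal_from_uniform(u):
    """v = Phi^-1(u), using the standard library's inverse CDF."""
    return NormalDist().inv_cdf(u)


def open_uniform(rng):
    """Draw u from (0, 1); inv_cdf is undefined at exactly 0."""
    u = rng.random()
    while u == 0.0:
        u = rng.random()
    return u


# The series is fine near the centre and poor in the tails.
for u in (0.6, 0.999):
    print(u, normal_from_uniform_series(u), normal_from_uniform(u))

rng = random.Random(0)
samples = [normal_from_uniform(open_uniform(rng)) for _ in range(50_000)]
print(mean(samples), pstdev(samples))
```

The truncated series converges slowly as $|x| \to 1$, so the tails of the distribution are exactly where a naive implementation of the transform goes wrong.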
12. Multivariate Distributions

    Bivariate normal distributions:

    $$p(v_1, v_2) = \frac{1}{2\pi} e^{-(v_1^2 + v_2^2)/2}.$$

    Box-Muller method: from $u_1, u_2 \sim$ uniform distributions,

    $$v_1 = \sqrt{-2 \ln u_1} \, \cos(2\pi u_2), \qquad v_2 = \sqrt{-2 \ln u_1} \, \sin(2\pi u_2).$$

    Problems:

    - Difficult to calculate the inverse in most cases (sometimes, even impossible!).
    - Other methods (e.g., rejection method) are inefficient.

    So: the Markov chain Monte Carlo (MCMC) way!
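A short sketch of the Box-Muller transform follows, with a rough check that the two outputs behave like independent standard normals. The function name `box_muller` and the sample size are illustrative assumptions.

```python
import math
import random


def box_muller(rng):
    """Map two uniforms to a pair of independent standard normal variates."""
    u1 = 1.0 - rng.random()  # lies in (0, 1], so log(u1) is finite
    u2 = rng.random()
    radius = math.sqrt(-2.0 * math.log(u1))
    angle = 2.0 * math.pi * u2
    return radius * math.cos(angle), radius * math.sin(angle)


rng = random.Random(0)
pairs = [box_muller(rng) for _ in range(20_000)]

n = len(pairs)
mean1 = sum(v1 for v1, _ in pairs) / n
mean2 = sum(v2 for _, v2 in pairs) / n
var1 = sum((v1 - mean1) ** 2 for v1, _ in pairs) / n
var2 = sum((v2 - mean2) ** 2 for _, v2 in pairs) / n
cov = sum((v1 - mean1) * (v2 - mean2) for v1, v2 in pairs) / n
print(mean1, mean2, var1, var2, cov)
```

The means and covariance should be near 0 and the variances near 1, consistent with the bivariate density above.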
13. Random Walk down the Markov Chains

    Random walk (a drunkard's walk):

    $$u_{t+1} = \mu + u_t + w_t,$$

    where $w_t$ is a random variable, and $\mu$ is the drift.

    For example, $w_t \sim N(0, \sigma^2)$ (Gaussian).

    [Figure: a one-dimensional random walk over 500 steps, and a two-dimensional random walk.]
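The recurrence translates directly into code. In this sketch the function name `random_walk` and its default parameters are illustrative assumptions.

```python
import random


def random_walk(num_steps, drift=0.0, sigma=1.0, start=0.0, seed=0):
    """Simulate u_{t+1} = drift + u_t + w_t with w_t ~ N(0, sigma^2)."""
    rng = random.Random(seed)
    path = [start]
    for _ in range(num_steps):
        path.append(drift + path[-1] + rng.gauss(0.0, sigma))
    return path


path = random_walk(500)
print(len(path), min(path), max(path))

# With the noise switched off, only the drift remains: u_t = t * drift.
print(random_walk(500, drift=0.1, sigma=0.0)[-1])
```

Without drift, the typical distance from the start after $t$ steps grows like $\sigma \sqrt{t}$ rather than linearly in $t$.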
14. Markov Chains

    Markov chain: the next state depends only on the current state and the transition probability.

    $$P(i, j) \equiv P(V_{t+1} = S_j \mid V_0 = S_p, \ldots, V_t = S_i) = P(V_{t+1} = S_j \mid V_t = S_i),$$

    $$\Longrightarrow \; P_{ij} \, \pi_i^* = P_{ji} \, \pi_j^*, \qquad \pi^* = \text{stationary probability distribution}.$$

    This balance condition holds for reversible chains.

    Examples: Brownian motion

    $$u_{i+1} = \mu + u_i + \epsilon_i, \qquad \epsilon_i \sim N(0, \sigma^2).$$
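As a small illustration of a stationary distribution, the sketch below iterates a three-state chain until it settles and then checks the balance relation $P_{ij} \pi_i^* = P_{ji} \pi_j^*$. The transition matrix is a made-up reversible example, not one from the talk, and the function names are hypothetical.

```python
def step(dist, P):
    """One step of the chain: new_dist[j] = sum_i dist[i] * P[i][j]."""
    n = len(P)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]


def stationary(P, num_iters=200):
    """Approximate the stationary distribution by repeated stepping."""
    dist = [1.0] + [0.0] * (len(P) - 1)  # start with all mass in state 0
    for _ in range(num_iters):
        dist = step(dist, P)
    return dist


# Hypothetical three-state chain; each row sums to 1.
P = [
    [0.50, 0.50, 0.00],
    [0.25, 0.50, 0.25],
    [0.00, 0.50, 0.50],
]

pi_star = stationary(P)
print(pi_star)

# Detailed balance: probability flow i -> j equals flow j -> i.
balanced = all(
    abs(pi_star[i] * P[i][j] - pi_star[j] * P[j][i]) < 1e-12
    for i in range(3)
    for j in range(3)
)
print(balanced)
```

For this matrix the iteration settles on $(1/4, 1/2, 1/4)$ regardless of the starting state.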
15. Markov Chains

    Monopoly (board games)

    [Figure: Monopoly animation.]
16. A Famous \$Billion Markov Chain: PageRank

    Google PageRank Algorithm (by Page et al., 1997)

    Billions of web pages: pages = states, link probability $\sim 1/t$, where $t \approx$ the expectation of the number of clicks.
17. Googling as a Markov Chain

    $$\mathrm{Rank}_j^{(t+1)} = \frac{1 - \alpha}{N} + \alpha \sum_{p_i \in \Omega(p_j)} \frac{\mathrm{Rank}_i^{(t)}}{B(p_i)},$$

    where $N$ = number of pages, $\Omega(p_j)$ is the set of pages linking to page $p_j$, $B(p_i)$ is the link bounds (number of outbound links) of page $p_i$, and $\alpha$ = a ranking factor ($\approx 0.85$). Initially, $\mathrm{Rank}_i^{(t=0)} = 1/N$.

    Let $R = [\mathrm{Rank}_1, \ldots, \mathrm{Rank}_N]^T$, and $L(p_i, p_j) = 0$ if no links

    $$\Longrightarrow \; R = \frac{1}{N} \begin{pmatrix} 1 - \alpha \\ \vdots \\ 1 - \alpha \end{pmatrix} + \alpha \begin{pmatrix} L(p_1, p_1) & \cdots & L(p_1, p_j) & \cdots & L(p_1, p_N) \\ \vdots & & \vdots & & \vdots \\ L(p_i, p_1) & \cdots & L(p_i, p_j) & \cdots & L(p_i, p_N) \\ \vdots & & \vdots & \ddots & \vdots \\ L(p_N, p_1) & \cdots & \cdots & \cdots & L(p_N, p_N) \end{pmatrix} R,$$

    where $\sum_{i=1}^{N} L(p_i, p_j) = 1$. Google Matrix (stochastic, sparse).

    $\Longrightarrow$ a stationary probability distribution $R$ (updated monthly).
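The rank update can be iterated directly on a toy graph. The sketch below is an assumption-laden illustration: the four-page link structure is invented, the function name `pagerank` is hypothetical, and every page is assumed to have at least one outbound link so that no special handling of dangling pages is needed.

```python
def pagerank(links, alpha=0.85, num_iters=100):
    """Iterate Rank_j <- (1 - alpha)/N + alpha * sum_i Rank_i / B(p_i).

    `links[i]` lists the pages that page i links to; B(p_i) = len(links[i]).
    """
    n = len(links)
    rank = [1.0 / n] * n  # Rank_i at t = 0
    for _ in range(num_iters):
        new_rank = [(1.0 - alpha) / n] * n
        for page, outlinks in links.items():
            share = alpha * rank[page] / len(outlinks)
            for target in outlinks:
                new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical web of four pages; page 3 has no inbound links.
links = {0: [1, 2], 1: [2], 2: [0], 3: [0, 2]}
ranks = pagerank(links)
print(ranks, sum(ranks))
```

A page with no inbound links keeps only the baseline rank $(1 - \alpha)/N$, and the ranks continue to sum to 1 because each column of the link matrix sums to 1.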
18. Markov Chain Monte Carlo

    Landmarks: Monte Carlo method (1930s, 1945, from 1950s), e.g., Metropolis Algorithm (1953), Metropolis-Hastings (1970).

    Markov Chain Monte Carlo (MCMC) methods: a class of methods.

    Really took off in the 1990s, now applied to a wide range of areas: physics, Bayesian statistics, climate changes, machine learning, finance, economy, medicine, biology, materials and engineering ...
19. Metropolis-Hastings

    The Metropolis-Hastings algorithm:

    1. Begin with any initial $\theta_0$ at time $t \leftarrow 0$ such that $p(\theta_0) > 0$.
    2. Generate a candidate sample $\theta^* \sim q(\theta_t, \cdot)$ from a proposal distribution.
    3. Evaluate the acceptance probability $\alpha(\theta_t, \theta^*)$ given by

        $$\alpha = \min \left[ \frac{p(\theta^*) \, q(\theta^*, \theta_t)}{p(\theta_t) \, q(\theta_t, \theta^*)}, \; 1 \right].$$

    4. Generate a uniformly distributed random number $u \sim \mathrm{Unif}[0, 1]$, and accept $\theta^*$ if $\alpha \ge u$. That is, if $\alpha \ge u$ then $\theta_{t+1} \leftarrow \theta^*$, else $\theta_{t+1} \leftarrow \theta_t$.
    5. Increase the counter or time $t \leftarrow t + 1$, and go to step 2.
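The five steps can be sketched as below for a one-dimensional target. Several choices here are assumptions made for illustration: the proposal is a symmetric Gaussian random walk, so the two $q$ factors cancel in $\alpha$; the target is an unnormalised standard normal; and the names `metropolis_hastings`, `step_size` and `burn_in` are hypothetical.

```python
import math
import random


def metropolis_hastings(target, theta0, num_steps, step_size=2.5, seed=0):
    """Random-walk Metropolis-Hastings for an unnormalised density `target`."""
    if target(theta0) <= 0.0:
        raise ValueError("the initial state must satisfy p(theta0) > 0")
    rng = random.Random(seed)
    theta = theta0
    chain = []
    accepted = 0
    for _ in range(num_steps):
        # Step 2: candidate from a symmetric proposal q(theta, .).
        candidate = theta + rng.gauss(0.0, step_size)
        # Step 3: q(theta*, theta) = q(theta, theta*), so only p remains.
        alpha = min(target(candidate) / target(theta), 1.0)
        # Step 4: u in (0, 1], so a zero-density candidate is never accepted.
        u = 1.0 - rng.random()
        if alpha >= u:
            theta = candidate
            accepted += 1
        # Step 5: record the state (repeated if rejected) and move on.
        chain.append(theta)
    return chain, accepted / num_steps


def unnormalised_normal(x):
    """Standard normal density without its normalising constant."""
    return math.exp(-0.5 * x * x)


chain, acceptance_rate = metropolis_hastings(unnormalised_normal, 0.0, 50_000)
burn_in = 1_000
kept = chain[burn_in:]
sample_mean = sum(kept) / len(kept)
sample_var = sum((x - sample_mean) ** 2 for x in kept) / len(kept)
print(acceptance_rate, sample_mean, sample_var)
```

Only ratios of `target` are ever used, which is why the normalising constant of $p$ never needs to be known.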
20. Mixture distribution: a distribution with known mean and variance.

    $$f(x \mid \mu, \sigma^2) = \sum_{i=1}^{K} \alpha_i \, p_i(x \mid \mu_i, \sigma_i^2), \qquad \sum_{i=1}^{K} \alpha_i = 1.$$

    E.g., $\alpha_1 = \alpha_2 = 1/2$, $\mu_1 = 2$, $\mu_2 = -2$ and $\sigma_1 = \sigma_2 = 1$.

    [Figure: trace of the sampled values over 10,000 iterations, and a histogram of the samples on $-6$ to $6$.]
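For this example the mean and variance can be worked out from the component moments and checked by simulation. The sketch below samples the mixture directly (pick a component, then draw from it) rather than by MCMC, and assumes Gaussian components; the function names are illustrative.

```python
import random


def mixture_moments(weights, means, sigmas):
    """Mean and variance of a mixture from its component moments."""
    mean = sum(w * m for w, m in zip(weights, means))
    second = sum(w * (s * s + m * m) for w, m, s in zip(weights, means, sigmas))
    return mean, second - mean * mean


def sample_mixture(weights, means, sigmas, num_samples, seed=0):
    """Draw from a Gaussian mixture: choose a component, then sample it."""
    rng = random.Random(seed)
    indices = range(len(weights))
    samples = []
    for _ in range(num_samples):
        k = rng.choices(indices, weights=weights)[0]
        samples.append(rng.gauss(means[k], sigmas[k]))
    return samples


weights, means, sigmas = [0.5, 0.5], [2.0, -2.0], [1.0, 1.0]
print(mixture_moments(weights, means, sigmas))  # mean 0, variance 1 + 4 = 5

draws = sample_mixture(weights, means, sigmas, 100_000)
draw_mean = sum(draws) / len(draws)
draw_var = sum((x - draw_mean) ** 2 for x in draws) / len(draws)
print(draw_mean, draw_var)
```

Having known moments like these makes the mixture a convenient test target: the output of any sampler, MCMC included, can be compared against them.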
21. When to Stop the Chain

    As the MCMC runs, convergence may be reached.

    - When does a chain converge? When to stop the chain ...?
    - Are the samples correlated?
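One common way to look at the correlation question is the sample autocorrelation of the chain at a given lag. The sketch below is illustrative: it uses a synthetic first-order autoregressive chain as a stand-in for MCMC output, because its lag-$k$ autocorrelation is known to be $\rho^k$; the names `autocorrelation` and `ar1_chain` are hypothetical.

```python
import random


def autocorrelation(chain, lag):
    """Sample autocorrelation of `chain` at the given non-negative lag."""
    n = len(chain)
    mean = sum(chain) / n
    var = sum((x - mean) ** 2 for x in chain) / n
    cov = sum(
        (chain[t] - mean) * (chain[t + lag] - mean) for t in range(n - lag)
    ) / n
    return cov / var


def ar1_chain(rho, num_steps, seed=0):
    """Synthetic correlated chain: x_{t+1} = rho * x_t + N(0, 1) noise."""
    rng = random.Random(seed)
    x = 0.0
    chain = []
    for _ in range(num_steps):
        x = rho * x + rng.gauss(0.0, 1.0)
        chain.append(x)
    return chain


correlated = ar1_chain(0.9, 50_000)
independent = ar1_chain(0.0, 50_000)
print(autocorrelation(correlated, 1), autocorrelation(correlated, 10))
print(autocorrelation(independent, 1))
```

Slowly decaying autocorrelation means each new sample adds little independent information, so a longer run or thinning may be needed.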
22. A Long Single Chain or Multiple Short Chains?

    - When will a Markov chain converge in practice? If it has converged, what does it mean?
    - Is a very long chain really good enough (from a statistical point of view)?
    - How long is long enough?
    - Are multiple chains better?
    - How to improve the sampling efficiency and/or mixing properties?
23. Simulated Tempering

    Simulated annealing: temperature $T$ from high to low.

    Simulated tempering: raise $T$ to a higher value, then reduce to low.

    $$\pi_\tau = \pi(x)^{1/\tau}, \qquad \pi_\tau \to 1 \text{ as } \tau \to \infty.$$

    The basic idea is to reduce from a very high $\tau$ to $\tau_0 = 1$.

    [Figure: tempering flattens a distribution $\pi \ge 0$ into $\pi_\tau = \pi(x)^{1/\tau}$.]

    Use flattened (near uniform) distributions as proposals/candidates to produce high quality samplings.
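The flattening effect is easy to see on a small discrete distribution. This sketch is an illustration only: the probabilities are made up, the function name `temper` is hypothetical, and the result is renormalised so that the limit $\pi_\tau \to 1$ shows up as a uniform distribution.

```python
def temper(probs, tau):
    """Raise each probability to the power 1/tau and renormalise."""
    powered = [p ** (1.0 / tau) for p in probs]
    total = sum(powered)
    return [p / total for p in powered]


probs = [0.7, 0.2, 0.1]  # hypothetical peaked distribution
for tau in (1.0, 2.0, 10.0, 1e6):
    print(tau, temper(probs, tau))
```

At $\tau = 1$ the distribution is unchanged, and as $\tau$ grows the modes are pressed down towards a uniform distribution, which makes it easier for a sampler to move between them.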
24. Sampling: Forward or Backward? Which Way?

    Is this the only way? No! Coupling from the Past & Metaheuristics.

    If we go backward along the chain, are there any advantages? If so, how?

    Is there a universally efficient sampling tool for drawing samples in general? No! No-free-lunch theorem (Wolpert & Macready, 1997).

    The aim of the research is to find the best algorithm(s) for a given/specific problem/distribution.

    Also Metaheuristics (very promising).
25. Thank you

    References

    - Gamerman D., Markov Chain Monte Carlo, Chapman & Hall/CRC, (1997).
    - Corcoran J. and Tweedie R., Perfect sampling ..., Jour. Stat. Plan. Infer., 104, 297 (2002).
    - Cox M., Forbes A. B., Harris P. M., Smith I., Classification and solution of regression ..., NPL SSfM Report, (2004).
    - Propp J. & Wilson D., Exact sampling ..., Random Stru. Alg., 9, 223 (1996).
    - Yang X. S., Nature-Inspired Metaheuristic Algorithms, Luniver Press, (2008).
    - Yang X. S., Introduction to Computational Mathematics, World Scientific, (2008).
    - Yang X. S., Engineering Optimization: An Introduction with Metaheuristic Applications, Wiley, (2010).

    Acknowledgement: EPSRC, SSfM, NPL, CUED, and London Maths Society.

    Thank you!