Seminar at Stanford University, August 9, 2010

- 1. Vanilla Rao–Blackwellisation of Metropolis–Hastings algorithms. Christian P. Robert, Université Paris-Dauphine and CREST, Paris, France. Joint work with Randal Douc, Pierre Jacob and Murray Smith. xian@ceremade.dauphine.fr. August 9, 2010.
- 2. Main themes:
  1. Rao–Blackwellisation of MCMC.
  2. Can be performed in any Metropolis–Hastings algorithm.
  3. Asymptotically more efficient than usual MCMC, with a controlled additional computing cost.
  4. Can take advantage of parallel capacities at a very basic level.
- 5. Outline:
  1. Metropolis–Hastings revisited.
  2. Rao–Blackwellisation: formal importance sampling, variance reduction, asymptotic results, illustrations.
- 8. Metropolis–Hastings algorithm
  1. We wish to approximate
     I = ∫ h(x) π(x) dx / ∫ π(x) dx = ∫ h(x) π̄(x) dx.
  2. x → π(x) is known, but not ∫ π(x) dx.
  3. Approximate I with δ = (1/n) Σ_{t=1}^{n} h(x^(t)), where (x^(t)) is a Markov chain with limiting distribution π̄.
  4. Convergence obtained from the Law of Large Numbers or the CLT for Markov chains.
- 12. Metropolis–Hastings algorithm
  Suppose that x^(t) is drawn.
  1. Simulate y_t ∼ q(·|x^(t)).
  2. Set x^(t+1) = y_t with probability
     α(x^(t), y_t) = min{ 1, [π(y_t) / π(x^(t))] · [q(x^(t)|y_t) / q(y_t|x^(t))] };
     otherwise, set x^(t+1) = x^(t).
  3. α is such that the detailed balance equation is satisfied,
     π(x) q(y|x) α(x, y) = π(y) q(x|y) α(y, x),
     ⊲ so π̄ is the stationary distribution of (x^(t)).
  ◮ The accepted candidates are simulated with the rejection algorithm.
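The two steps above are the whole algorithm. As a concrete illustration (not taken from the slides), here is a minimal random-walk Metropolis–Hastings sketch in Python; the standard-normal target, the scale τ = 2 and the chain length are assumptions chosen for the example.

```python
import math
import random

def metropolis_hastings(log_pi, n_iter, x0=0.0, tau=1.0, seed=0):
    """Random-walk Metropolis-Hastings with a Gaussian proposal of scale tau.

    log_pi is the log of the unnormalised target pi.  With a symmetric
    proposal the ratio q(x|y)/q(y|x) cancels, so the acceptance
    probability reduces to min(1, pi(y)/pi(x)).
    """
    rng = random.Random(seed)
    x, chain = x0, []
    for _ in range(n_iter):
        y = x + tau * rng.gauss(0.0, 1.0)                 # y_t ~ q(.|x_t)
        a = math.exp(min(0.0, log_pi(y) - log_pi(x)))     # alpha(x_t, y_t)
        if rng.random() < a:
            x = y                                         # accept y_t
        chain.append(x)                                   # on rejection, repeat x_t
    return chain

# Example: standard-normal target known only up to its normalising constant.
chain = metropolis_hastings(lambda x: -0.5 * x * x, n_iter=20000, tau=2.0)
delta = sum(chain) / len(chain)   # delta = (1/n) sum_t h(x_t) with h(x) = x
```

Note that a rejection simply repeats the current state in `chain`: those repeats are exactly the holding times exploited in the next slides.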
- 16. Some properties of the MH algorithm
  1. Alternative representation of the estimator δ:
     δ = (1/N) Σ_{t=1}^{N} h(x^(t)) = (1/N) Σ_{i=1}^{M_N} n_i h(z_i),
     where the z_i are the accepted y_j's, M_N is the number of accepted y_j's up to time N, and n_i is the number of times z_i appears in the sequence (x^(t))_t.
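The identity between the time average over (x^(t)) and the weighted average over accepted values can be checked mechanically: grouping consecutive repeats of a realised chain (each repeat being a rejection) recovers the pairs (z_i, n_i). A small Python sketch on a made-up chain (the values are hypothetical; for a continuous target, distinct accepted values coincide with probability zero, so consecutive equality identifies rejections):

```python
from itertools import groupby

# A hypothetical chain realisation: repeats correspond to rejected proposals.
chain = [1.0, 1.0, 2.5, 2.5, 2.5, 0.3, 1.0, 1.0]

# (z_i, n_i): accepted value and number of consecutive appearances.
pairs = [(z, len(list(group))) for z, group in groupby(chain)]

h = lambda x: x * x                                      # any test function h
N = len(chain)
time_avg = sum(h(x) for x in chain) / N                  # (1/N) sum_t h(x_t)
weighted = sum(n * h(z) for z, n in pairs) / N           # (1/N) sum_i n_i h(z_i)
```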
- 17. q̃(·|z_i) = α(z_i, ·) q(·|z_i) / p(z_i) ≤ q(·|z_i) / p(z_i),
  where p(z_i) = ∫ α(z_i, y) q(y|z_i) dy.
  To simulate according to q̃(·|z_i):
  1. Propose a candidate y ∼ q(·|z_i).
  2. Accept it with probability
     q̃(y|z_i) / [q(y|z_i) / p(z_i)] = α(z_i, y);
     otherwise, reject it and start again.
  ◮ This is the transition of the MH algorithm.
  The transition kernel q̃ admits π̃ as a stationary distribution:
     π̃(x) q̃(y|x) = [π(x) p(x) / ∫ π(u) p(u) du] · [α(x, y) q(y|x) / p(x)]
                  = π(x) α(x, y) q(y|x) / ∫ π(u) p(u) du
                  = π(y) α(y, x) q(x|y) / ∫ π(u) p(u) du
                  = π̃(y) q̃(x|y).
- 24. Lemma
  The sequence (z_i, n_i) satisfies:
  1. (z_i, n_i)_i is a Markov chain;
  2. z_{i+1} and n_i are independent given z_i;
  3. n_i is distributed as a geometric random variable with probability parameter
     p(z_i) := ∫ α(z_i, y) q(y|z_i) dy;   (1)
  4. (z_i)_i is a Markov chain with transition kernel Q̃(z, dy) = q̃(y|z) dy and stationary distribution π̃ such that
     q̃(·|z) ∝ α(z, ·) q(·|z)  and  π̃(·) ∝ π(·) p(·).
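Point 3 of the lemma is easy to check by simulation: given z_i, each proposal is accepted with probability α(z_i, y), so the holding time is geometric with parameter p(z_i) = ∫ α(z_i, y) q(y|z_i) dy. A Python sketch, assuming a standard-normal target, a random-walk proposal of scale 2 and the state z_i = 1 (all choices are illustrative, not from the slides):

```python
import math
import random

# Illustrative setting: standard-normal target (unnormalised),
# random-walk Gaussian proposal of scale 2.
log_pi = lambda x: -0.5 * x * x
alpha = lambda z, y: min(1.0, math.exp(log_pi(y) - log_pi(z)))
propose = lambda z, r: z + 2.0 * r.gauss(0.0, 1.0)

def holding_time(z, rng):
    """Number of times z appears consecutively in the chain: each
    proposal y ~ q(.|z) is accepted with probability alpha(z, y), so the
    count is Geometric with parameter p(z) = E_y[alpha(z, y)]."""
    n = 1
    while rng.random() >= alpha(z, propose(z, rng)):
        n += 1
    return n

rng = random.Random(1)
z, m = 1.0, 20000
mean_n = sum(holding_time(z, rng) for _ in range(m)) / m
# Monte Carlo estimate of p(z) = int alpha(z, y) q(y|z) dy
p_hat = sum(alpha(z, propose(z, rng)) for _ in range(m)) / m
# Lemma, point 3: E[n_i | z_i] = 1/p(z_i), so mean_n should be close to 1/p_hat
```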
- 28. Old bottle, new wine [or vice-versa]
  [Diagram: the accepted states z_{i−1}, z_i, z_{i+1} form a Markov chain; each holding time n_i is independent of z_{i+1} given z_i.]
     δ = (1/N) Σ_{t=1}^{N} h(x^(t)) = (1/N) Σ_{i=1}^{M_N} n_i h(z_i).
- 34. Importance sampling perspective
  1. A natural idea:
     δ* = (1/N) Σ_{i=1}^{M_N} h(z_i) / p(z_i),
     or, in self-normalised form,
     δ* ≃ [ Σ_{i=1}^{M_N} h(z_i)/p(z_i) ] / [ Σ_{i=1}^{M_N} 1/p(z_i) ]
        = [ Σ_{i=1}^{M_N} h(z_i) π(z_i)/π̃(z_i) ] / [ Σ_{i=1}^{M_N} π(z_i)/π̃(z_i) ].
  2. But p is not available in closed form.
  3. The geometric n_i is the obvious replacement, used in the original Metropolis–Hastings estimate, since E[n_i | z_i] = 1/p(z_i).
- 39. The crude estimate of 1/p(z_i),
     n_i = 1 + Σ_{j=1}^{∞} ∏_{ℓ≤j} I{ u_ℓ ≥ α(z_i, y_ℓ) },
  can be improved:
  Lemma. If (y_j)_j is an iid sequence with distribution q(y|z_i), the quantity
     ξ̂_i = 1 + Σ_{j=1}^{∞} ∏_{ℓ≤j} { 1 − α(z_i, y_ℓ) }
  is an unbiased estimator of 1/p(z_i) whose variance, conditional on z_i, is lower than the conditional variance of n_i, {1 − p(z_i)}/p²(z_i).
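A sketch of ξ̂_i against the crude n_i, in the same illustrative setting as before (standard-normal target, random-walk proposal of scale 2, state z_i = 1; all assumptions for the example, not from the slides). The simulated sum terminates once some α(z_i, y_ℓ) equals 1, which happens with positive probability here because the proposal is a symmetric random walk:

```python
import math
import random

log_pi = lambda x: -0.5 * x * x
alpha = lambda z, y: min(1.0, math.exp(log_pi(y) - log_pi(z)))
propose = lambda z, r: z + 2.0 * r.gauss(0.0, 1.0)

def n_i(z, rng):
    """Crude geometric estimate of 1/p(z): the holding time of z."""
    n = 1
    while rng.random() >= alpha(z, propose(z, rng)):
        n += 1
    return n

def xi_hat(z, rng):
    """Rao-Blackwellised estimate of 1/p(z): each rejection indicator
    I{u >= alpha} is replaced by its expectation 1 - alpha(z, y).  The
    running product hits 0 as soon as some alpha(z, y) equals 1."""
    total, prod = 1.0, 1.0
    while prod > 0.0:
        prod *= 1.0 - alpha(z, propose(z, rng))
        total += prod
    return total

def var(xs):
    mu = sum(xs) / len(xs)
    return sum((x - mu) ** 2 for x in xs) / len(xs)

rng = random.Random(2)
z, m = 1.0, 20000
crude = [n_i(z, rng) for _ in range(m)]     # both are unbiased for 1/p(z) ...
rb = [xi_hat(z, rng) for _ in range(m)]     # ... but xi_hat has lower variance
```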
- 40. Rao–Blackwellised, for sure?
     ξ̂_i = 1 + Σ_{j=1}^{∞} ∏_{ℓ≤j} { 1 − α(z_i, y_ℓ) }
  1. The sum is infinite, but it terminates with positive probability: the product vanishes as soon as some α(z_i, y_ℓ) = 1, since
     α(x^(t), y_t) = min{ 1, [π(y_t)/π(x^(t))] · [q(x^(t)|y_t)/q(y_t|x^(t))] }.
     For example, take a symmetric random walk as a proposal.
  2. What if we wish to be sure that the sum is finite?
- 41. Variance improvement
  Proposition. If (y_j)_j is an iid sequence with distribution q(y|z_i) and (u_j)_j is an iid uniform sequence, then for any k ≥ 0 the quantity
     ξ̂_i^k = 1 + Σ_{j=1}^{∞} ∏_{1≤ℓ≤k∧j} { 1 − α(z_i, y_ℓ) } ∏_{k+1≤ℓ≤j} I{ u_ℓ ≥ α(z_i, y_ℓ) }
  is an unbiased estimator of 1/p(z_i) with an almost surely finite number of terms. Moreover, for k ≥ 1,
     V[ ξ̂_i^k | z_i ] = (1 − p(z_i))/p²(z_i) − [1 − (1 − 2p(z_i) + r(z_i))^k] / (2p(z_i) − r(z_i)) · [(2 − p(z_i))/p²(z_i)] · (p(z_i) − r(z_i)),
  where p(z_i) := ∫ α(z_i, y) q(y|z_i) dy and r(z_i) := ∫ α²(z_i, y) q(y|z_i) dy. Therefore, we have
     V[ ξ̂_i | z_i ] ≤ V[ ξ̂_i^k | z_i ] ≤ V[ ξ̂_i^0 | z_i ] = V[ n_i | z_i ].
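A sketch of ξ̂_i^k in the same illustrative setting: the first k rejection indicators are replaced by their expectations 1 − α, the remaining ones are simulated, so the sum terminates at the first simulated acceptance. Comparing k = 0 (which reproduces the geometric n_i in law) with k = 5 illustrates the variance ordering; the target, proposal scale and state z_i = 1 are assumptions for the example:

```python
import math
import random

log_pi = lambda x: -0.5 * x * x
alpha = lambda z, y: min(1.0, math.exp(log_pi(y) - log_pi(z)))
propose = lambda z, r: z + 2.0 * r.gauss(0.0, 1.0)

def xi_hat_k(z, k, rng):
    """Estimate of 1/p(z) with an a.s. finite number of terms: the first
    k rejection indicators are replaced by their expectations 1 - alpha,
    the remaining ones are kept as simulated indicators.  k = 0 recovers
    the geometric holding time n_i."""
    total, weight, j = 1.0, 1.0, 0
    while True:
        j += 1
        a = alpha(z, propose(z, rng))
        if j <= k:
            weight *= 1.0 - a              # expectation of I{u_j >= a}
            if weight == 0.0:
                return total
        elif rng.random() < a:             # I{u_j >= a} = 0: the sum stops
            return total
        total += weight

def var(xs):
    mu = sum(xs) / len(xs)
    return sum((x - mu) ** 2 for x in xs) / len(xs)

rng = random.Random(3)
z, m = 1.0, 20000
k0 = [xi_hat_k(z, 0, rng) for _ in range(m)]   # identical in law to n_i
k5 = [xi_hat_k(z, 5, rng) for _ in range(m)]   # same mean, lower variance
```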
- 44. [Diagram: the weights ξ̂^k_{i−1}, ξ̂_i^k are attached to z_{i−1}, z_i, z_{i+1}; here consecutive z_i and ξ̂^k_{i−1} are not independent.]
     ξ̂_i^k = 1 + Σ_{j=1}^{∞} ∏_{1≤ℓ≤k∧j} { 1 − α(z_i, y_ℓ) } ∏_{k+1≤ℓ≤j} I{ u_ℓ ≥ α(z_i, y_ℓ) },
     δ_M^k = Σ_{i=1}^{M} ξ̂_i^k h(z_i) / Σ_{i=1}^{M} ξ̂_i^k.
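Putting the pieces together, a sketch of δ_M^k. For simplicity the weights ξ̂_i^k are drawn afresh given each z_i (fresh proposals and uniforms) rather than recycling the chain's own variables as the construction above does; since E[ξ̂_i^k | z_i] = 1/p(z_i) either way, the weighted estimator targets the same quantity. The standard-normal target and tuning values are assumptions for the example:

```python
import math
import random

log_pi = lambda x: -0.5 * x * x                     # unnormalised N(0,1) target
alpha = lambda z, y: min(1.0, math.exp(log_pi(y) - log_pi(z)))
tau = 2.0

def accepted_states(n_iter, x0, rng):
    """Run MH and keep only the accepted states z_1, ..., z_M
    (the initial state is included for simplicity)."""
    x, zs = x0, [x0]
    for _ in range(n_iter):
        y = x + tau * rng.gauss(0.0, 1.0)
        if rng.random() < alpha(x, y):
            x = y
            zs.append(x)
    return zs

def xi_hat_k(z, k, rng):
    """Estimate of 1/p(z): first k rejection indicators replaced by
    their expectations 1 - alpha, the rest simulated afresh given z."""
    total, weight, j = 1.0, 1.0, 0
    while True:
        j += 1
        a = alpha(z, z + tau * rng.gauss(0.0, 1.0))
        if j <= k:
            weight *= 1.0 - a
            if weight == 0.0:
                return total
        elif rng.random() < a:
            return total
        total += weight

rng = random.Random(4)
zs = accepted_states(20000, 0.5, rng)
w = [xi_hat_k(z, 5, rng) for z in zs]
h = lambda x: x
# delta_M^k = sum_i xi_i^k h(z_i) / sum_i xi_i^k, here estimating E[X] = 0
delta_k = sum(wi * h(zi) for wi, zi in zip(w, zs)) / sum(w)
```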
- 49. Asymptotic results
  Let
     δ_M^k = Σ_{i=1}^{M} ξ̂_i^k h(z_i) / Σ_{i=1}^{M} ξ̂_i^k.
  For any positive function ϕ, denote C_ϕ = { h; |h/ϕ|_∞ < ∞ }. Assume that there exists a positive function ϕ ≥ 1 such that
     ∀h ∈ C_ϕ,  [ Σ_{i=1}^{M} h(z_i)/p(z_i) ] / [ Σ_{i=1}^{M} 1/p(z_i) ]  →^P  π(h),
  and a positive function ψ such that
     ∀h ∈ C_ψ,  √M { [ Σ_{i=1}^{M} h(z_i)/p(z_i) ] / [ Σ_{i=1}^{M} 1/p(z_i) ] − π(h) }  →^L  N(0, Γ(h)).
  Theorem. Under the assumption that π(p) > 0, the following convergence properties hold:
  (i) If h ∈ C_ϕ, then δ_M^k →^P π(h) as M → ∞ (◮ consistency).
  (ii) If, in addition, h²/p ∈ C_ϕ and h ∈ C_ψ, then
     √M ( δ_M^k − π(h) ) →^L N(0, V_k[h − π(h)]) as M → ∞ (◮ CLT),
  where V_k(h) := π(p) ∫ π(dz) V[ ξ̂_i^k | z ] h²(z) p(z) + Γ(h).
- 52. We will need some additional assumptions. Assume a maximal inequality for the Markov chain (z_i)_i: there exists a measurable function ζ such that, for any starting point x,
     ∀h ∈ C_ζ,  P_x( sup_{0≤i≤N} | Σ_{j=0}^{i} [ h(z_j) − π̃(h) ] | > ε ) ≤ N C_h(x)/ε².
  Moreover, assume that there exists φ ≥ 1 such that, for any starting point x,
     ∀h ∈ C_φ,  Q̃^n(x, h) →^P π̃(h) = π(ph)/π(p).
  Theorem. Assume that h is such that h/p ∈ C_ζ and {C_{h/p}, h²/p²} ⊂ C_φ. Assume moreover that
     √M ( δ_M^0 − π(h) ) →^L N(0, V_0[h − π(h)]).
  Then, for any starting point x,
     √(M_N) ( (1/N) Σ_{t=1}^{N} h(x^(t)) − π(h) ) →^L N(0, V_0[h − π(h)]) as N → ∞,
  where M_N is defined by
     Σ_{i=1}^{M_N} ξ̂_i^0 ≤ N < Σ_{i=1}^{M_N+1} ξ̂_i^0.
- 58. Variance gain (1)

  |        | h(x) = x | h(x) = x² | h(x) = I_{X>0} | h(x) = p(x) |
  |--------|----------|-----------|----------------|-------------|
  | τ = .1 | 0.971    | 0.953     | 0.957          | 0.207       |
  | τ = 2  | 0.965    | 0.942     | 0.875          | 0.861       |
  | τ = 5  | 0.913    | 0.982     | 0.785          | 0.826       |
  | τ = 7  | 0.899    | 0.982     | 0.768          | 0.820       |

  Ratios of the empirical variances of δ^∞ and δ estimating E[h(X)]: 100 MCMC iterations over 10³ replications of a random-walk Gaussian proposal with scale τ.
- 59. Illustration (1)
  Figure: Overlay of the variations of 250 iid realisations of the estimates δ (gold) and δ^∞ (grey) of E[X] = 0 for 1000 iterations, along with the 90% interquantile range for the estimates δ (brown) and δ^∞ (pink), in the setting of a random-walk Gaussian proposal with scale τ = 10.
- 60. Extra computational effort

  |         | median | mean | q.8 | q.9 | time |
  |---------|--------|------|-----|-----|------|
  | τ = .25 | 0.0    | 8.85 | 4.9 | 13  | 4.2  |
  | τ = .50 | 0.0    | 6.76 | 4   | 11  | 2.25 |
  | τ = 1.0 | 0.25   | 6.15 | 4   | 10  | 2.5  |
  | τ = 2.0 | 0.20   | 5.90 | 3.5 | 8.5 | 4.5  |

  Additional computing effort: median and mean numbers of additional iterations, 80% and 90% quantiles of the additional iterations, and ratio of the average R computing times, obtained over 10⁵ simulations.
- 61. Illustration (2)
  Figure: Overlay of the variations of 500 iid realisations of the estimates δ (deep grey), δ^∞ (medium grey) and of the importance sampling version (light grey) of E[X] = 10 when X ∼ Exp(.1) for 100 iterations, along with the 90% interquantile ranges (same colour code), in the setting of an independent exponential proposal with scale µ = 0.02.
