Transcript

  • 1. A vanilla Rao–Blackwellisation of Metropolis–Hastings algorithms Randal DOUC and Christian ROBERT Telecom SudParis, France randal.douc@it-sudparis.eu April 2009 1 / 24
  • 2. Main themes 1 Rao–Blackwellisation of MCMC. 2 Can be performed in any Hastings–Metropolis algorithm. 3 Asymptotically more efficient than the usual MCMC estimator, with a controlled amount of additional calculation. 2 / 24
  • 6. Outline 1 Introduction 2 Some properties of the HM algorithm 3 Rao–Blackwellisation (Variance reduction; Asymptotic results) 4 Illustrations 5 Conclusion 3 / 24
  • 12. Metropolis–Hastings algorithm
    1 We wish to approximate
      I = ∫ h(x) π(x) dx / ∫ π(x) dx = ∫ h(x) π̄(x) dx .
    2 x → π(x) is known, but not ∫ π(x) dx.
    3 Approximate I with δ = (1/N) Σ_{t=1}^{N} h(x^{(t)}), where (x^{(t)}) is a Markov chain with limiting distribution π̄.
    4 Convergence obtained from the Law of Large Numbers or the CLT for Markov chains. 5 / 24
  • 18. Metropolis–Hastings algorithm Suppose that x^{(t)} is drawn.
    1 Simulate y_t ∼ q(·|x^{(t)}).
    2 Set x^{(t+1)} = y_t with probability
      α(x^{(t)}, y_t) = min{ 1, [π(y_t) q(x^{(t)}|y_t)] / [π(x^{(t)}) q(y_t|x^{(t)})] } ;
      otherwise, set x^{(t+1)} = x^{(t)}.
    3 α is such that the detailed balance equation is satisfied:
      π(x) q(y|x) α(x, y) = π(y) q(x|y) α(y, x) ,
      ⊲ so π̄ is the stationary distribution of (x^{(t)}).
    ◮ The accepted candidates are simulated with the rejection algorithm. 6 / 24
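A minimal Python sketch of the transition described above, assuming (purely for illustration, not from the slides) a standard normal target known only up to a constant and a Gaussian random-walk proposal of scale tau, so that the proposal ratio cancels in α:

    import numpy as np

    def mh_chain(pi_unnorm, n_iter, x0=0.0, tau=1.0, rng=None):
        """Run n_iter Metropolis-Hastings steps and return the chain (x^(1), ..., x^(N))."""
        rng = np.random.default_rng() if rng is None else rng
        chain = np.empty(n_iter)
        x = x0
        for t in range(n_iter):
            y = x + tau * rng.normal()                     # y_t ~ q(.|x^(t)), symmetric proposal
            alpha = min(1.0, pi_unnorm(y) / pi_unnorm(x))  # the q-ratio cancels for a random walk
            if rng.uniform() < alpha:                      # accept with probability alpha
                x = y
            chain[t] = x                                   # otherwise x^(t+1) = x^(t)
        return chain

    pi_unnorm = lambda x: np.exp(-0.5 * x**2)              # unnormalised N(0, 1) target (assumption)
    chain = mh_chain(pi_unnorm, n_iter=10_000, tau=10.0)
    print(chain.mean())                                    # delta = (1/N) sum_t h(x^(t)) with h(x) = x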
  • 20. Outline 1 Introduction 2 Some properties of the HM algorithm 3 Rao–Blackwellisation (Variance reduction; Asymptotic results) 4 Illustrations 5 Conclusion 7 / 24
  • 21. 1 Alternative representation of the estimator δ is
      δ = (1/N) Σ_{t=1}^{N} h(x^{(t)}) = (1/N) Σ_{i=1}^{M_N} n_i h(z_i) ,
    where the z_i's are the accepted y_j's, M_N is the number of accepted y_j's till time N, and n_i is the number of times z_i appears in the sequence (x^{(t)})_t. 8 / 24
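A sketch of this alternative representation, grouping the output of the mh_chain sketch above into accepted values z_i and holding times n_i and checking that the two forms of δ coincide (same illustrative target and h(x) = x as before):

    import numpy as np

    def accepted_values(chain):
        """Split an MH chain into its accepted states z_i and their repetition counts n_i."""
        change = np.r_[True, chain[1:] != chain[:-1]]       # True at the start of each run
        z = chain[change]                                   # the z_i's (accepted y_j's)
        starts = np.flatnonzero(change)
        n = np.diff(np.r_[starts, chain.size])              # holding times n_i, summing to N
        return z, n

    pi_unnorm = lambda x: np.exp(-0.5 * x**2)
    chain = mh_chain(pi_unnorm, n_iter=10_000, tau=10.0)    # from the previous sketch
    z, n = accepted_values(chain)
    h = lambda x: x
    print(np.mean(h(chain)), np.sum(n * h(z)) / chain.size) # the two estimators agree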
  • 22. Define
      q̃(·|z_i) = α(z_i, ·) q(·|z_i) / p(z_i) ≤ q(·|z_i) / p(z_i) ,
    where p(z_i) = ∫ α(z_i, y) q(y|z_i) dy. To simulate according to q̃(·|z_i):
    1 Propose a candidate y ∼ q(·|z_i).
    2 Accept it with probability
      q̃(y|z_i) / [ q(y|z_i) / p(z_i) ] = α(z_i, y) ;
      otherwise, reject it and start again.
    3 ◮ This is the transition of the HM algorithm.
    The transition kernel q̃ admits π̃ as a stationary distribution:
      π̃(x) q̃(y|x) = [ π(x) p(x) / ∫ π(u) p(u) du ] · [ α(x, y) q(y|x) / p(x) ]
                   = π(x) α(x, y) q(y|x) / ∫ π(u) p(u) du
                   = π(y) α(y, x) q(x|y) / ∫ π(u) p(u) du
                   = π̃(y) q̃(x|y) . 9 / 24
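A sketch of the accept/reject step just described: drawing from q̃(·|z) by proposing from q(·|z) and accepting with probability α(z, y), which is one HM transition restricted to its accepted moves (same illustrative target and random-walk proposal as before):

    import numpy as np

    pi_unnorm = lambda x: np.exp(-0.5 * x**2)
    alpha = lambda z, y: min(1.0, pi_unnorm(y) / pi_unnorm(z))   # symmetric proposal

    def draw_from_q_tilde(z, tau=10.0, rng=None):
        """Propose y ~ q(.|z) and accept it with probability alpha(z, y), until acceptance."""
        rng = np.random.default_rng() if rng is None else rng
        while True:
            y = z + tau * rng.normal()
            if rng.uniform() < alpha(z, y):
                return y                                         # a draw from q~(.|z)

    print(draw_from_q_tilde(0.5))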
  • 29. Lemma The sequence (z_i, n_i) satisfies
    1 (z_i, n_i)_i is a Markov chain;
    2 z_{i+1} and n_i are independent given z_i;
    3 n_i is distributed as a geometric random variable with probability parameter
      p(z_i) := ∫ α(z_i, y) q(y|z_i) dy ;   (1)
    4 (z_i)_i is a Markov chain with transition kernel Q(z, dy) = q̃(y|z) dy and stationary distribution π̃ such that
      q̃(·|z) ∝ α(z, ·) q(·|z)   and   π̃(·) ∝ π(·) p(·) . 10 / 24
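A quick simulation check of point 3 of the lemma, with the same illustrative target and proposal as before: the holding time at a fixed state z should have mean 1/p(z), with p(z) = ∫ α(z, y) q(y|z) dy estimated here by Monte Carlo:

    import numpy as np

    rng = np.random.default_rng(0)
    pi_unnorm = lambda x: np.exp(-0.5 * x**2)
    alpha = lambda z, y: min(1.0, pi_unnorm(y) / pi_unnorm(z))
    z, tau = 0.5, 10.0

    def holding_time(z):
        """Number of MH iterations spent at z before some proposal is accepted."""
        n = 0
        while True:
            n += 1
            if rng.uniform() < alpha(z, z + tau * rng.normal()):
                return n

    times = [holding_time(z) for _ in range(20_000)]
    p_hat = np.mean([alpha(z, z + tau * rng.normal()) for _ in range(20_000)])
    print(np.mean(times), 1.0 / p_hat)          # both approximate E[n_i | z_i = z] = 1/p(z)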
  • 33. [Dependence diagram: the accepted states z_{i−1}, z_i, z_{i+1} form a Markov chain, with the holding time n_i attached to z_i; given z_i, the variables n_i and z_{i+1} are independent.]
      δ = (1/N) Σ_{t=1}^{N} h(x^{(t)}) = (1/N) Σ_{i=1}^{M_N} n_i h(z_i) . 11 / 24
  • 38. Outline 1 Introduction 2 Some properties of the HM algorithm 3 Rao–Blackwellisation (Variance reduction; Asymptotic results) 4 Illustrations 5 Conclusion 12 / 24
  • 39. 1 A natural idea:
      δ* = (1/N) Σ_{i=1}^{M_N} h(z_i)/p(z_i) ≃ [ Σ_{i=1}^{M_N} h(z_i)/p(z_i) ] / [ Σ_{i=1}^{M_N} 1/p(z_i) ]
         = [ Σ_{i=1}^{M_N} h(z_i) π(z_i)/π̃(z_i) ] / [ Σ_{i=1}^{M_N} π(z_i)/π̃(z_i) ] .
    2 But p is not available in closed form.
    3 The geometric n_i is the obvious solution that is used in the original Metropolis–Hastings estimate:
      n_i = 1 + Σ_{j=1}^{∞} Π_{ℓ≤j} I{ u_ℓ ≥ α(z_i, y_ℓ) } . 13 / 24
  • 44. Recall
      n_i = 1 + Σ_{j=1}^{∞} Π_{ℓ≤j} I{ u_ℓ ≥ α(z_i, y_ℓ) } .
    Lemma If (y_j)_j is an iid sequence with distribution q(y|z_i), the quantity
      ξ̂_i = 1 + Σ_{j=1}^{∞} Π_{ℓ≤j} { 1 − α(z_i, y_ℓ) }
    is an unbiased estimator of 1/p(z_i) whose variance, conditional on z_i, is lower than the conditional variance of n_i, {1 − p(z_i)}/p²(z_i). 13 / 24
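A short check of the unbiasedness claim in the lemma, using only the stated iid assumption: since E[1 − α(z_i, y) | z_i] = 1 − p(z_i) and the y_ℓ are iid,
      E[ ξ̂_i | z_i ] = 1 + Σ_{j=1}^{∞} E[ Π_{ℓ≤j} {1 − α(z_i, y_ℓ)} | z_i ] = Σ_{j=0}^{∞} (1 − p(z_i))^j = 1/p(z_i) .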
  • 45. ξ̂_i = 1 + Σ_{j=1}^{∞} Π_{ℓ≤j} { 1 − α(z_i, y_ℓ) }
    1 An infinite sum, but sometimes a finite one: since
      α(x^{(t)}, y_t) = min{ 1, [π(y_t) q(x^{(t)}|y_t)] / [π(x^{(t)}) q(y_t|x^{(t)})] } ,
      the sum terminates as soon as some α(z_i, y_ℓ) equals 1. For example: take a symmetric random walk as a proposal.
    2 What if we wish to be sure that the sum is finite? 14 / 24
  • 46. Variance reduction
    Proposition If (y_j)_j is an iid sequence with distribution q(y|z_i) and (u_j)_j is an iid uniform sequence, then for any k ≥ 0 the quantity
      ξ̂_i^k = 1 + Σ_{j=1}^{∞} Π_{1≤ℓ≤k∧j} { 1 − α(z_i, y_ℓ) } · Π_{k+1≤ℓ≤j} I{ u_ℓ ≥ α(z_i, y_ℓ) }   (2)
    is an unbiased estimator of 1/p(z_i) with an almost surely finite number of terms. Moreover, for k ≥ 1,
      V[ ξ̂_i^k | z_i ] = (1 − p(z_i)) / p²(z_i) − [ 1 − (1 − 2p(z_i) + r(z_i))^k ] / (2p(z_i) − r(z_i)) · (p(z_i) − r(z_i)) · (2 − p(z_i)) / p²(z_i) ,
    where p(z_i) := ∫ α(z_i, y) q(y|z_i) dy and r(z_i) := ∫ α²(z_i, y) q(y|z_i) dy. Therefore, we have
      V[ ξ̂_i | z_i ] ≤ V[ ξ̂_i^k | z_i ] ≤ V[ ξ̂_i^0 | z_i ] = V[ n_i | z_i ] . 15 / 24
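A sketch of the weight ξ̂_i^k from the proposition, with the same illustrative target and proposal as before: the first k factors use the exact weights 1 − α(z_i, y_ℓ), the remaining ones the original indicators, so the sum stops at the first acceptance after rank k and is finite almost surely:

    import numpy as np

    pi_unnorm = lambda x: np.exp(-0.5 * x**2)
    alpha = lambda z, y: min(1.0, pi_unnorm(y) / pi_unnorm(z))
    propose = lambda z, rng: z + 10.0 * rng.normal()            # y ~ q(.|z), scale tau = 10

    def xi_hat(z, k, rng):
        """Unbiased estimate of 1/p(z) with k Rao-Blackwellised terms (k = 0 recovers n_i)."""
        total, prod, j = 1.0, 1.0, 0
        while prod > 0.0:
            j += 1
            a = alpha(z, propose(z, rng))
            prod *= (1.0 - a) if j <= k else float(rng.uniform() >= a)
            total += prod
        return total

    rng = np.random.default_rng(1)
    print(np.mean([xi_hat(0.5, k=5, rng=rng) for _ in range(5_000)]))   # close to 1/p(0.5)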
  • 49. Variance reduction
    [Dependence diagram: in contrast with the pairs (z_i, n_i), the weight ξ̂_i^k and the next accepted value z_{i+1} are not independent given z_i, and likewise ξ̂_{i−1}^k and z_i are not independent.]
      ξ̂_i^k = 1 + Σ_{j=1}^{∞} Π_{1≤ℓ≤k∧j} { 1 − α(z_i, y_ℓ) } · Π_{k+1≤ℓ≤j} I{ u_ℓ ≥ α(z_i, y_ℓ) }
      δ_M^k = [ Σ_{i=1}^{M} ξ̂_i^k h(z_i) ] / [ Σ_{i=1}^{M} ξ̂_i^k ] . 16 / 24
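Putting the pieces together, a sketch of the estimator δ_M^k built from the accepted states of an MH run, reusing mh_chain, accepted_values and xi_hat from the sketches above. Note that, for simplicity, the weights here are computed from fresh proposals rather than by recycling the proposals of the run as in the scheme above; the target, proposal and h(x) = x remain illustrative assumptions.

    import numpy as np

    rng = np.random.default_rng(2)
    pi_unnorm = lambda x: np.exp(-0.5 * x**2)
    h = lambda x: x
    k = 10

    chain = mh_chain(pi_unnorm, n_iter=10_000, tau=10.0, rng=rng)   # earlier sketch
    z, n = accepted_values(chain)                                   # accepted states z_i
    xi = np.array([xi_hat(zi, k, rng) for zi in z])                 # weights xi^k_i

    delta = np.mean(h(chain))                   # usual MH average, (1/N) sum_i n_i h(z_i)
    delta_rb = np.sum(xi * h(z)) / np.sum(xi)   # Rao-Blackwellised delta_M^k
    print(delta, delta_rb)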
  • 54. Asymptotic results Let
      δ_M^k = [ Σ_{i=1}^{M} ξ̂_i^k h(z_i) ] / [ Σ_{i=1}^{M} ξ̂_i^k ] .
    For any positive function ϕ, we denote C_ϕ = { h ; |h/ϕ|_∞ < ∞ }. Assume that there exists a positive function ϕ ≥ 1 such that
      ∀ h ∈ C_ϕ ,   [ Σ_{i=1}^{M} h(z_i)/p(z_i) ] / [ Σ_{i=1}^{M} 1/p(z_i) ]  →^P  π(h) ,
    and that there exists a positive function ψ such that
      ∀ h ∈ C_ψ ,   √M ( [ Σ_{i=1}^{M} h(z_i)/p(z_i) ] / [ Σ_{i=1}^{M} 1/p(z_i) ] − π(h) )  →^L  N(0, Γ(h)) .
    Theorem Under the assumption that π(p) > 0, the following convergence properties hold:
    i) If h ∈ C_ϕ, then δ_M^k →^P π(h) as M → ∞ (◮ Consistency).
    ii) If, in addition, h²/p ∈ C_ϕ and h ∈ C_ψ, then √M ( δ_M^k − π(h) ) →^L N(0, V_k[h − π(h)]) as M → ∞ (◮ CLT), where
      V_k(h) := π(p) ∫ π(dz) V[ ξ̂_i^k | z ] h²(z) p(z) + Γ(h) . 17 / 24
  • 57. Asymptotic results We will need some additional assumptions. Assume a maximal inequality for the Markov chain (z_i)_i: there exists a measurable function ζ such that, for any starting point x,
      ∀ h ∈ C_ζ ,   P_x( sup_{0≤i≤N} | Σ_{j=0}^{i} [ h(z_j) − π̃(h) ] | > ε )  ≤  N C_h(x) / ε² .
    Moreover, assume that there exists φ ≥ 1 such that, for any starting point x,
      ∀ h ∈ C_φ ,   Q̃^n(x, h)  →^P  π̃(h) = π(p h)/π(p) .
    Theorem Assume that h is such that h/p ∈ C_ζ and { C_{h/p}, h²/p² } ⊂ C_φ. Assume moreover that
      √M ( δ_M^0 − π(h) )  →^L  N(0, V_0[h − π(h)]) .
    Then, for any starting point x,
      √M_N ( (1/N) Σ_{t=1}^{N} h(x^{(t)}) − π(h) )  →^L  N(0, V_0[h − π(h)]) as N → ∞ ,
    where M_N is defined by
      Σ_{i=1}^{M_N} ξ̂_i^0  ≤  N  <  Σ_{i=1}^{M_N+1} ξ̂_i^0 . 18 / 24
  • 63. Outline 1 Introduction 2 Some properties of the HM algorithm 3 Rao–Blackwellisation (Variance reduction; Asymptotic results) 4 Illustrations 5 Conclusion 19 / 24
  • 64. Figure: Overlay of the variations of 250 iid realisations of the estimates δ (gold) and δ^∞ (grey) of E[X] = 0 for 1000 iterations, along with the 90% interquantile range for the estimates δ (brown) and δ^∞ (pink), in the setting of a random walk Gaussian proposal with scale τ = 10. 20 / 24
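A small experiment in the spirit of this figure, reusing the sketches above: repeat the estimation of E[X] = 0 under the random-walk Gaussian proposal with scale τ = 10 and compare the spread of the ordinary average δ with that of δ_M^k. The standard normal target, k = 10, and the reduced numbers of replications and iterations are assumptions made to keep the run short.

    import numpy as np

    rng = np.random.default_rng(3)
    pi_unnorm = lambda x: np.exp(-0.5 * x**2)
    tau, k, n_iter, n_rep = 10.0, 10, 1000, 100

    plain, rb = [], []
    for _ in range(n_rep):
        chain = mh_chain(pi_unnorm, n_iter, tau=tau, rng=rng)
        z, _ = accepted_values(chain)
        xi = np.array([xi_hat(zi, k, rng) for zi in z])
        plain.append(chain.mean())                  # delta
        rb.append(np.sum(xi * z) / np.sum(xi))      # delta_M^k with h(x) = x
    print(np.std(plain), np.std(rb))                # the second spread should typically be smaller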
  • 65. Figure: Overlay of the variations of 500 iid realisations of the estimates δ (deep grey), δ^∞ (medium grey) and of the importance sampling version (light grey) of E[X] = 10 when X ∼ Exp(0.1) for 100 iterations, along with the 90% interquantile ranges (same colour code), in the setting of an independent exponential proposal with scale µ = 0.02. 21 / 24
  • 66. Take the target π(x) = β(1 − β)^x and the proposal
      2 q(y|x) = I_{|x−y|=1} if x > 0 ,   2 q(y|x) = I_{|y|≤1} if x = 0 .
    For this problem, p(x) = 1 − β/2 and r(x) = 1 − β + β²/2. We can therefore compute the gain in variance
      [ (p(x) − r(x)) / (2p(x) − r(x)) ] · (2 − p(x)) / p²(x) = 2 β(1 − β)(2 + β) / [ (2 − β²)(2 − β)² ] ,
    which is optimal for β = 0.174, leading to a gain of 0.578, while the relative gain in variance is
      [ (p(x) − r(x)) / (2p(x) − r(x)) ] · (2 − p(x)) / (1 − p(x)) = (1 − β)(2 + β) / (2 − β²) ,
    which is decreasing in β. 22 / 24
  • 67. Outline 1 Introduction 2 Some properties of the HM algorithm 3 Rao–Blackwellisation (Variance reduction; Asymptotic results) 4 Illustrations 5 Conclusion 23 / 24
  • 68. Conclusion a) Rao–Blackwellisation of any HM algorithm with a controlled amount of additional calculation. b) Link with the importance sampling of Markov chains. c) Analysis with asymptotic results on triangular arrays. 24 / 24
