Vanilla Rao–Blackwellisation of Metropolis–Hastings algorithms

Christian P. Robert
Université Paris-Dauphine and CREST, Paris, France

Joint work with Randal Douc, Pierre Jacob and Murray Smith

xian@ceremade.dauphine.fr

August 9, 2010
Main themes

    1   Rao–Blackwellisation of MCMC estimators.
    2   Can be performed in any Metropolis–Hastings algorithm.
    3   Asymptotically more efficient than the usual MCMC estimator, at a
        controlled additional computing cost.
    4   Can take advantage of parallel capacities at a very basic level.
Outline

    1   Metropolis–Hastings revisited

    2   Rao–Blackwellisation
          Formal importance sampling
          Variance reduction
          Asymptotic results
          Illustrations
Metropolis–Hastings algorithm

    1   We wish to approximate

            I = \frac{\int h(x)\,\pi(x)\,dx}{\int \pi(x)\,dx} = \int h(x)\,\bar\pi(x)\,dx .

    2   x → π(x) is known but not ∫ π(x) dx.
    3   Approximate I with δ = \frac{1}{N}\sum_{t=1}^{N} h(x^{(t)}), where (x^{(t)}) is a
        Markov chain with limiting distribution π̄.
    4   Convergence obtained from the Law of Large Numbers or the CLT for
        Markov chains.
Metropolis–Hastings algorithm

Suppose that x^{(t)} is drawn.
    1   Simulate y_t ∼ q(·|x^{(t)}).
    2   Set x^{(t+1)} = y_t with probability

            \alpha(x^{(t)}, y_t) = \min\left(1,\; \frac{\pi(y_t)}{\pi(x^{(t)})}\,\frac{q(x^{(t)}|y_t)}{q(y_t|x^{(t)})}\right) .

        Otherwise, set x^{(t+1)} = x^{(t)}.
    3   α is such that the detailed balance equation is satisfied:

            \pi(x)\,q(y|x)\,\alpha(x,y) = \pi(y)\,q(x|y)\,\alpha(y,x) ,

        ⊲ so π̄ is the stationary distribution of (x^{(t)}).
◮ The accepted candidates are simulated with a rejection algorithm.
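As a concrete illustration (not part of the original slides), here is a minimal Python sketch of this transition, assuming a generic unnormalised log-density `log_pi` and a Gaussian random-walk proposal with scale `tau`; all names are illustrative.

```python
import numpy as np

def mh_sampler(log_pi, x0, n_iter, tau=1.0, rng=None):
    """Random-walk Metropolis-Hastings: returns the chain (x^(t)), t = 1..n_iter.

    log_pi : unnormalised log-target  x -> log pi(x)
    tau    : scale of the Gaussian random-walk proposal q(.|x) = N(x, tau^2)
    """
    rng = np.random.default_rng() if rng is None else rng
    chain = np.empty(n_iter)
    x = x0
    for t in range(n_iter):
        y = x + tau * rng.standard_normal()          # y_t ~ q(.|x^(t))
        # symmetric proposal: q(x|y) = q(y|x), so alpha reduces to a pi-ratio
        log_alpha = min(0.0, log_pi(y) - log_pi(x))
        if np.log(rng.uniform()) < log_alpha:        # accept with probability alpha
            x = y                                    # x^(t+1) = y_t
        chain[t] = x                                 # record x^(t+1) (= x^(t) if rejected)
    return chain

# toy usage: standard normal target
chain = mh_sampler(lambda x: -0.5 * x**2, x0=0.0, n_iter=10_000, tau=2.0)
print(chain.mean())   # Monte Carlo estimate delta of E[X] = 0
```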
Some properties of the MH algorithm

    1   Alternative representation of the estimator δ:

            \delta = \frac{1}{N}\sum_{t=1}^{N} h(x^{(t)}) = \frac{1}{N}\sum_{i=1}^{M_N} n_i\, h(z_i) ,

        where
            the z_i's are the accepted y_j's,
            M_N is the number of accepted y_j's up to time N,
            n_i is the number of times z_i appears in the sequence (x^{(t)})_t.
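A small sketch of this bookkeeping (again illustrative, not from the slides): the array `chain` below is a synthetic stand-in built by repeating random values according to random counts; in practice it would be the output of a Metropolis–Hastings run such as the one sketched above, and `h` is an arbitrary test function.

```python
import numpy as np

rng = np.random.default_rng(0)

# stand-in for an MH output: distinct accepted values z_i repeated n_i times
z = rng.standard_normal(50)                  # accepted candidates z_i
n = rng.geometric(p=0.4, size=50)            # occupation counts n_i
chain = np.repeat(z, n)                      # the chain (x^(t)), with N = n.sum()
N, h = chain.size, (lambda x: x**2)

# usual ergodic average ...
delta = h(chain).sum() / N
# ... coincides with the weighted sum over accepted values
delta_z = (n * h(z)).sum() / N
assert np.isclose(delta, delta_z)
print(delta)
```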
The accepted candidates z_i are distributed from

        \tilde q(\cdot|z_i) = \frac{\alpha(z_i,\cdot)\, q(\cdot|z_i)}{p(z_i)} \le \frac{q(\cdot|z_i)}{p(z_i)} ,

where p(z_i) = \int \alpha(z_i, y)\, q(y|z_i)\, dy. To simulate according to q̃(·|z_i):
    1   Propose a candidate y ∼ q(·|z_i).
    2   Accept it with probability

            \tilde q(y|z_i) \Big/ \frac{q(y|z_i)}{p(z_i)} = \alpha(z_i, y) .

        Otherwise, reject it and start again.
    3   ◮ This is exactly the transition of the MH algorithm.
The transition kernel q̃ admits π̃ as stationary distribution, since

        \tilde\pi(x)\,\tilde q(y|x)
           = \frac{\pi(x)\,p(x)}{\int \pi(u)\,p(u)\,du}\;\frac{\alpha(x,y)\,q(y|x)}{p(x)}
           = \frac{\pi(x)\,\alpha(x,y)\,q(y|x)}{\int \pi(u)\,p(u)\,du}
           = \frac{\pi(y)\,\alpha(y,x)\,q(x|y)}{\int \pi(u)\,p(u)\,du}
           = \tilde\pi(y)\,\tilde q(x|y) .
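A minimal sketch of this rejection mechanism, under the same illustrative assumptions as above (toy standard-normal target, Gaussian random-walk proposal, helper names invented for the example); the returned number of trials is the occupation time n_i of the next slide.

```python
import numpy as np

def next_accepted(log_pi, z, tau=1.0, rng=None):
    """Draw one value from q~(.|z) by rejection: propose y ~ q(.|z) until
    u < alpha(z, y).  Returns (y, trials), with trials ~ Geometric(p(z))."""
    rng = np.random.default_rng() if rng is None else rng
    trials = 0
    while True:
        trials += 1
        y = z + tau * rng.standard_normal()          # y ~ q(.|z), symmetric here
        log_alpha = min(0.0, log_pi(y) - log_pi(z))  # alpha(z, y)
        if np.log(rng.uniform()) < log_alpha:
            return y, trials

y, n_i = next_accepted(lambda x: -0.5 * x**2, z=0.5, tau=2.0)
print(y, n_i)
```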
Lemma
The sequence (z_i, n_i) satisfies
    1   (z_i, n_i)_i is a Markov chain;
    2   z_{i+1} and n_i are independent given z_i;
    3   n_i is distributed as a geometric random variable with probability
        parameter

            p(z_i) := \int \alpha(z_i, y)\, q(y|z_i)\, dy ;                         (1)

    4   (z_i)_i is a Markov chain with transition kernel Q̃(z, dy) = q̃(y|z) dy
        and stationary distribution π̃ such that

            \tilde q(\cdot|z) \propto \alpha(z, \cdot)\, q(\cdot|z)
            \quad\text{and}\quad
            \tilde\pi(\cdot) \propto \pi(\cdot)\, p(\cdot) .
Old bottle, new wine [or vice-versa]

The chain splits into the accepted values and their occupation times: given z_i,
the next accepted value z_{i+1} and the count n_i are independent draws,

                 indep            indep
        z_{i-1} -------> z_i -------> z_{i+1}
            \                \
           indep            indep
              \                \
            n_{i-1}            n_i

and

        \delta = \frac{1}{N}\sum_{t=1}^{N} h(x^{(t)}) = \frac{1}{N}\sum_{i=1}^{M_N} n_i\, h(z_i) .
Formal importance sampling

Importance sampling perspective

    1   A natural idea:

            \delta^* = \frac{1}{N}\sum_{i=1}^{M_N} \frac{h(z_i)}{p(z_i)} ,

        or, in self-normalised form,

            \delta^* \simeq \frac{\sum_{i=1}^{M_N} h(z_i)/p(z_i)}{\sum_{i=1}^{M_N} 1/p(z_i)}
                     = \frac{\sum_{i=1}^{M_N} \frac{\pi(z_i)}{\tilde\pi(z_i)}\, h(z_i)}{\sum_{i=1}^{M_N} \frac{\pi(z_i)}{\tilde\pi(z_i)}} .

    2   But p is not available in closed form.
    3   The geometric n_i is the obvious replacement, used in the original
        Metropolis–Hastings estimate, since E[n_i | z_i] = 1/p(z_i).
The crude estimate of 1/p(z_i),

        n_i = 1 + \sum_{j=1}^{\infty} \prod_{\ell \le j} \mathbb{I}\{u_\ell \ge \alpha(z_i, y_\ell)\} ,

can be improved:

Lemma
If (y_j)_j is an iid sequence with distribution q(y|z_i), the quantity

        \hat\xi_i = 1 + \sum_{j=1}^{\infty} \prod_{\ell \le j} \{1 - \alpha(z_i, y_\ell)\}

is an unbiased estimator of 1/p(z_i) whose variance, conditional on z_i, is
lower than the conditional variance of n_i, {1 − p(z_i)}/p²(z_i).
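As a rough numerical illustration (not from the slides), the sketch below compares the two estimators of 1/p(z_i) by simulation under the same toy assumptions as before. Since the Rao–Blackwellised sum can have infinitely many non-zero terms, it is truncated here once the running product falls below a tolerance; this introduces a small, controllable bias in the illustration only.

```python
import numpy as np

rng = np.random.default_rng(1)
log_pi = lambda x: -0.5 * x**2            # assumed toy target (standard normal)
tau, z = 2.0, 0.5                         # proposal scale and current value z_i

def alpha(z, y):
    return np.exp(min(0.0, log_pi(y) - log_pi(z)))   # symmetric proposal

def n_i(z):
    """Crude estimate: number of proposals until the first acceptance."""
    count = 1
    while rng.uniform() >= alpha(z, z + tau * rng.standard_normal()):
        count += 1
    return count

def xi_hat(z, tol=1e-10):
    """Rao-Blackwellised estimate 1 + sum_j prod_{l<=j} (1 - alpha(z, y_l)),
    truncated once the running product is negligible (illustration only)."""
    total, prod = 1.0, 1.0
    while prod > tol:
        prod *= 1.0 - alpha(z, z + tau * rng.standard_normal())
        total += prod
    return total

reps = 20_000
print(np.var([n_i(z) for _ in range(reps)]),      # larger conditional variance
      np.var([xi_hat(z) for _ in range(reps)]))   # smaller conditional variance
```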
Rao–Blackwellised, for sure?

        \hat\xi_i = 1 + \sum_{j=1}^{\infty} \prod_{\ell \le j} \{1 - \alpha(z_i, y_\ell)\}

    1   The sum has infinitely many terms, but only finitely many are non-zero
        as soon as some α(z_i, y_ℓ) = 1, which happens with positive probability
        when, for example, the proposal is a symmetric random walk:

            \alpha(x^{(t)}, y_t) = \min\left(1,\; \frac{\pi(y_t)}{\pi(x^{(t)})}\,\frac{q(x^{(t)}|y_t)}{q(y_t|x^{(t)})}\right) .

    2   What if we wish to be sure that the sum is finite?
Variance reduction

Variance improvement

Proposition
If (y_j)_j is an iid sequence with distribution q(y|z_i) and (u_j)_j is an iid
uniform sequence, then for any k ≥ 0 the quantity

        \hat\xi_i^k = 1 + \sum_{j=1}^{\infty} \prod_{1 \le \ell \le k \wedge j} \{1 - \alpha(z_i, y_\ell)\} \prod_{k+1 \le \ell \le j} \mathbb{I}\{u_\ell \ge \alpha(z_i, y_\ell)\}

is an unbiased estimator of 1/p(z_i) with an almost surely finite number of
terms. Moreover, for k ≥ 1,

        V\big[\hat\xi_i^k \,\big|\, z_i\big]
          = \frac{1 - p(z_i)}{p^2(z_i)}
            - \frac{1 - (1 - 2p(z_i) + r(z_i))^k}{2p(z_i) - r(z_i)}\;\frac{2 - p(z_i)}{p^2(z_i)}\;\big(p(z_i) - r(z_i)\big) ,

where p(z_i) := \int \alpha(z_i, y)\, q(y|z_i)\, dy and r(z_i) := \int \alpha^2(z_i, y)\, q(y|z_i)\, dy.
Therefore,

        V\big[\hat\xi_i \,\big|\, z_i\big] \le V\big[\hat\xi_i^k \,\big|\, z_i\big] \le V\big[\hat\xi_i^0 \,\big|\, z_i\big] = V[n_i \,|\, z_i] .
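A minimal sketch of \hat\xi_i^k under the same toy assumptions (standard normal target, Gaussian random-walk proposal, both illustrative); here the sum terminates almost surely without any truncation, at the first ℓ > k such that u_ℓ < α(z_i, y_ℓ).

```python
import numpy as np

rng = np.random.default_rng(2)
log_pi = lambda x: -0.5 * x**2            # assumed toy target
tau = 2.0

def alpha(z, y):
    return np.exp(min(0.0, log_pi(y) - log_pi(z)))

def xi_hat_k(z, k):
    """xi_i^k = 1 + sum_j prod_{l<=k^j}(1-alpha) prod_{k<l<=j} I{u_l >= alpha}.
    The indicator factor hits zero at the first l > k with u_l < alpha,
    so only finitely many terms are non-zero (almost surely)."""
    total, weight, j = 1.0, 1.0, 0
    while True:
        j += 1
        a = alpha(z, z + tau * rng.standard_normal())   # alpha(z, y_j)
        if j <= k:
            weight *= 1.0 - a                            # Rao-Blackwellised factor
        elif rng.uniform() < a:                          # u_j < alpha: stop
            return total
        total += weight

# unbiasedness check: the mean over replications approximates 1/p(z)
z, k = 0.5, 3
print(np.mean([xi_hat_k(z, k) for _ in range(20_000)]))
```

For k = 0 this routine reduces to the crude count n_i, matching the equality V[ξ̂_i^0 | z_i] = V[n_i | z_i] above.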
Unlike (z_i, n_i), the pairs (z_i, ξ̂_i^k) lose the conditional independence structure:

                 not indep           not indep
        z_{i-1} ----------> z_i ----------> z_{i+1}
            \                  \
         not indep          not indep
              \                  \
           ξ̂^k_{i-1}           ξ̂^k_i

The resulting Rao–Blackwellised estimator is

        \delta_M^k = \frac{\sum_{i=1}^{M} \hat\xi_i^k\, h(z_i)}{\sum_{i=1}^{M} \hat\xi_i^k} .
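Putting the pieces together, here is a hedged end-to-end sketch (toy target and proposal as before, all names illustrative): run the z-chain for M accepted values, compute ξ̂_i^k at each z_i, and form δ_M^k. For simplicity the sketch draws fresh proposals for each ξ̂_i^k rather than recycling the proposals produced by the run itself; this costs extra simulations but stays within the Proposition's assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)
log_pi = lambda x: -0.5 * x**2                 # assumed toy target
tau, k, M = 2.0, 2, 5_000                      # proposal scale, k, number of z_i's
h = lambda x: x                                # test function: estimate E[X] = 0

def alpha(z, y):
    return np.exp(min(0.0, log_pi(y) - log_pi(z)))

def next_z(z):
    """One step of the z-chain: propose from q(.|z) until acceptance."""
    while True:
        y = z + tau * rng.standard_normal()
        if rng.uniform() < alpha(z, y):
            return y

def xi_hat_k(z):
    """Unbiased estimate of 1/p(z) with an a.s. finite number of terms."""
    total, weight, j = 1.0, 1.0, 0
    while True:
        j += 1
        a = alpha(z, z + tau * rng.standard_normal())
        if j <= k:
            weight *= 1.0 - a
        elif rng.uniform() < a:
            return total
        total += weight

z, zs, xis = 0.0, [], []
for _ in range(M):
    z = next_z(z)
    zs.append(z)
    xis.append(xi_hat_k(z))      # fresh proposals, simpler than recycling the run's

zs, xis = np.array(zs), np.array(xis)
delta_k = (xis * h(zs)).sum() / xis.sum()      # Rao-Blackwellised estimator
print(delta_k)
```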
Asymptotic results

Let

        \delta_M^k = \frac{\sum_{i=1}^{M} \hat\xi_i^k\, h(z_i)}{\sum_{i=1}^{M} \hat\xi_i^k} .

For any positive function ϕ, write C_ϕ = {h; |h/ϕ|_∞ < ∞}. Assume that there
exists a positive function ϕ ≥ 1 such that

        \forall h \in C_\varphi, \quad \frac{\sum_{i=1}^{M} h(z_i)/p(z_i)}{\sum_{i=1}^{M} 1/p(z_i)} \xrightarrow{P} \pi(h) ,

and a positive function ψ such that

        \forall h \in C_\psi, \quad \sqrt{M}\left(\frac{\sum_{i=1}^{M} h(z_i)/p(z_i)}{\sum_{i=1}^{M} 1/p(z_i)} - \pi(h)\right) \xrightarrow{L} \mathcal{N}(0, \Gamma(h)) .

Theorem
Under the assumption that π(p) > 0, the following convergence properties hold:
    i)  If h ∈ C_ϕ, then δ_M^k →_P π(h) as M → ∞  (◮ Consistency).
    ii) If, in addition, h²/p ∈ C_ϕ and h ∈ C_ψ, then

            \sqrt{M}\big(\delta_M^k - \pi(h)\big) \xrightarrow{L} \mathcal{N}\big(0, V_k[h - \pi(h)]\big)   (◮ CLT),

        where V_k(h) := \pi(p) \int \pi(dz)\, V\big[\hat\xi_i^k \,\big|\, z\big]\, h^2(z)\, p(z) + \Gamma(h).
We will need some additional assumptions. Assume a maximal inequality for the
Markov chain (z_i)_i: there exists a measurable function ζ such that, for any
starting point x,

        \forall h \in C_\zeta, \quad P_x\left(\sup_{0 \le i \le N} \sum_{j=0}^{i} \big[h(z_j) - \tilde\pi(h)\big] > \epsilon\right) \le \frac{N\, C_h(x)}{\epsilon^2} .

Moreover, assume that there exists φ ≥ 1 such that, for any starting point x,

        \forall h \in C_\phi, \quad \tilde Q^n(x, h) \xrightarrow{P} \tilde\pi(h) = \pi(ph)/\pi(p) .

Theorem
Assume that h is such that h/p ∈ C_ζ and {C_{h/p}, h²/p²} ⊂ C_φ. Assume
moreover that

        \sqrt{M}\big(\delta_M^0 - \pi(h)\big) \xrightarrow{L} \mathcal{N}\big(0, V_0[h - \pi(h)]\big) .

Then, for any starting point x,

        \sqrt{M_N}\left(\frac{\sum_{t=1}^{N} h(x^{(t)})}{N} - \pi(h)\right) \xrightarrow{L}_{N \to \infty} \mathcal{N}\big(0, V_0[h - \pi(h)]\big) ,

where M_N is defined by

        \sum_{i=1}^{M_N} \hat\xi_i^0 \le N < \sum_{i=1}^{M_N + 1} \hat\xi_i^0 .
Illustrations

Variance gain (1)

        h(x)      x       x²      I_{X>0}   p(x)
        τ = .1    0.971   0.953   0.957     0.207
        τ = 2     0.965   0.942   0.875     0.861
        τ = 5     0.913   0.982   0.785     0.826
        τ = 7     0.899   0.982   0.768     0.820

Ratios of the empirical variances of δ^∞ and δ estimating E[h(X)]: 100
MCMC iterations over 10³ replications of a random walk Gaussian
proposal with scale τ.
Illustration (1)

Figure: Overlay of the variations of 250 iid realisations of the estimates
δ (gold) and δ^∞ (grey) of E[X] = 0 for 1000 iterations, along with the
90% interquantile range for the estimates δ (brown) and δ^∞ (pink), in
the setting of a random walk Gaussian proposal with scale τ = 10.
Extra computational effort

                   median   mean   q.8   q.9   time
        τ = .25    0.0      8.85   4.9   13    4.2
        τ = .50    0.0      6.76   4     11    2.25
        τ = 1.0    0.25     6.15   4     10    2.5
        τ = 2.0    0.20     5.90   3.5   8.5   4.5

Additional computing effort: median and mean numbers of additional
iterations, 80% and 90% quantiles of the additional iterations, and ratio
of the average R computing times, obtained over 10⁵ simulations.
Illustration (2)

Figure: Overlay of the variations of 500 iid realisations of the estimates
δ (deep grey), δ^∞ (medium grey) and of the importance sampling version
(light grey) of E[X] = 10 when X ∼ Exp(.1) for 100 iterations, along
with the 90% interquantile ranges (same colour code), in the setting of
an independent exponential proposal with scale µ = 0.02.

Resolving the black-hole information paradoxResolving the black-hole information paradox
Resolving the black-hole information paradox
 
Jensen's inequality, EM 알고리즘
Jensen's inequality, EM 알고리즘 Jensen's inequality, EM 알고리즘
Jensen's inequality, EM 알고리즘
 
Bd32360363
Bd32360363Bd32360363
Bd32360363
 
On Approach of Estimation Time Scales of Relaxation of Concentration of Charg...
On Approach of Estimation Time Scales of Relaxation of Concentration of Charg...On Approach of Estimation Time Scales of Relaxation of Concentration of Charg...
On Approach of Estimation Time Scales of Relaxation of Concentration of Charg...
 
Talk given at the Workshop in Catania University
Talk given at the Workshop in Catania University Talk given at the Workshop in Catania University
Talk given at the Workshop in Catania University
 
ABC with Wasserstein distances
ABC with Wasserstein distancesABC with Wasserstein distances
ABC with Wasserstein distances
 
Geometric and viscosity solutions for the Cauchy problem of first order
Geometric and viscosity solutions for the Cauchy problem of first orderGeometric and viscosity solutions for the Cauchy problem of first order
Geometric and viscosity solutions for the Cauchy problem of first order
 
Metropolis-Hastings MCMC Short Tutorial
Metropolis-Hastings MCMC Short TutorialMetropolis-Hastings MCMC Short Tutorial
Metropolis-Hastings MCMC Short Tutorial
 
Large-scale structure non-Gaussianities with modal methods (Ascona)
Large-scale structure non-Gaussianities with modal methods (Ascona)Large-scale structure non-Gaussianities with modal methods (Ascona)
Large-scale structure non-Gaussianities with modal methods (Ascona)
 
Modeling biased tracers at the field level
Modeling biased tracers at the field levelModeling biased tracers at the field level
Modeling biased tracers at the field level
 
Markov chain Monte Carlo methods and some attempts at parallelizing them
Markov chain Monte Carlo methods and some attempts at parallelizing themMarkov chain Monte Carlo methods and some attempts at parallelizing them
Markov chain Monte Carlo methods and some attempts at parallelizing them
 
The computational limit_to_quantum_determinism_and_the_black_hole_information...
The computational limit_to_quantum_determinism_and_the_black_hole_information...The computational limit_to_quantum_determinism_and_the_black_hole_information...
The computational limit_to_quantum_determinism_and_the_black_hole_information...
 
TheAscoliArzelaTheoremWithApplications
TheAscoliArzelaTheoremWithApplicationsTheAscoliArzelaTheoremWithApplications
TheAscoliArzelaTheoremWithApplications
 
Db31463471
Db31463471Db31463471
Db31463471
 
Volume_7_avrami
Volume_7_avramiVolume_7_avrami
Volume_7_avrami
 
Research Inventy : International Journal of Engineering and Science
Research Inventy : International Journal of Engineering and ScienceResearch Inventy : International Journal of Engineering and Science
Research Inventy : International Journal of Engineering and Science
 
Resource theory of asymmetric distinguishability
Resource theory of asymmetric distinguishabilityResource theory of asymmetric distinguishability
Resource theory of asymmetric distinguishability
 
Talk given at the Twelfth Workshop on Non-Perurbative Quantum Chromodynamics ...
Talk given at the Twelfth Workshop on Non-Perurbative Quantum Chromodynamics ...Talk given at the Twelfth Workshop on Non-Perurbative Quantum Chromodynamics ...
Talk given at the Twelfth Workshop on Non-Perurbative Quantum Chromodynamics ...
 
ON OPTIMIZATION OF MANUFACTURING OF MULTICHANNEL HETEROTRANSISTORS TO INCREAS...
ON OPTIMIZATION OF MANUFACTURING OF MULTICHANNEL HETEROTRANSISTORS TO INCREAS...ON OPTIMIZATION OF MANUFACTURING OF MULTICHANNEL HETEROTRANSISTORS TO INCREAS...
ON OPTIMIZATION OF MANUFACTURING OF MULTICHANNEL HETEROTRANSISTORS TO INCREAS...
 
Mechanical Engineering Assignment Help
Mechanical Engineering Assignment HelpMechanical Engineering Assignment Help
Mechanical Engineering Assignment Help
 

Similar to A Vanilla Rao-Blackwellisation

Vanilla rao blackwellisation
Vanilla rao blackwellisationVanilla rao blackwellisation
Vanilla rao blackwellisationDeb Roy
 
short course at CIRM, Bayesian Masterclass, October 2018
short course at CIRM, Bayesian Masterclass, October 2018short course at CIRM, Bayesian Masterclass, October 2018
short course at CIRM, Bayesian Masterclass, October 2018Christian Robert
 
Monte Carlo Statistical Methods
Monte Carlo Statistical MethodsMonte Carlo Statistical Methods
Monte Carlo Statistical MethodsChristian Robert
 
Delayed acceptance for Metropolis-Hastings algorithms
Delayed acceptance for Metropolis-Hastings algorithmsDelayed acceptance for Metropolis-Hastings algorithms
Delayed acceptance for Metropolis-Hastings algorithmsChristian Robert
 
Introduction to MCMC methods
Introduction to MCMC methodsIntroduction to MCMC methods
Introduction to MCMC methodsChristian Robert
 
InternshipReport
InternshipReportInternshipReport
InternshipReportHamza Ameur
 
Random Matrix Theory and Machine Learning - Part 3
Random Matrix Theory and Machine Learning - Part 3Random Matrix Theory and Machine Learning - Part 3
Random Matrix Theory and Machine Learning - Part 3Fabian Pedregosa
 
Introduction to advanced Monte Carlo methods
Introduction to advanced Monte Carlo methodsIntroduction to advanced Monte Carlo methods
Introduction to advanced Monte Carlo methodsChristian Robert
 
Monte Carlo Statistical Methods
Monte Carlo Statistical MethodsMonte Carlo Statistical Methods
Monte Carlo Statistical MethodsChristian Robert
 
An investigation of inference of the generalized extreme value distribution b...
An investigation of inference of the generalized extreme value distribution b...An investigation of inference of the generalized extreme value distribution b...
An investigation of inference of the generalized extreme value distribution b...Alexander Decker
 
Some Thoughts on Sampling
Some Thoughts on SamplingSome Thoughts on Sampling
Some Thoughts on SamplingDon Sheehy
 

Similar to A Vanilla Rao-Blackwellisation (20)

Vanilla rao blackwellisation
Vanilla rao blackwellisationVanilla rao blackwellisation
Vanilla rao blackwellisation
 
Dr09 Slide
Dr09 SlideDr09 Slide
Dr09 Slide
 
short course at CIRM, Bayesian Masterclass, October 2018
short course at CIRM, Bayesian Masterclass, October 2018short course at CIRM, Bayesian Masterclass, October 2018
short course at CIRM, Bayesian Masterclass, October 2018
 
Monte Carlo Statistical Methods
Monte Carlo Statistical MethodsMonte Carlo Statistical Methods
Monte Carlo Statistical Methods
 
Delayed acceptance for Metropolis-Hastings algorithms
Delayed acceptance for Metropolis-Hastings algorithmsDelayed acceptance for Metropolis-Hastings algorithms
Delayed acceptance for Metropolis-Hastings algorithms
 
Rdnd2008
Rdnd2008Rdnd2008
Rdnd2008
 
Adc
AdcAdc
Adc
 
Introduction to MCMC methods
Introduction to MCMC methodsIntroduction to MCMC methods
Introduction to MCMC methods
 
InternshipReport
InternshipReportInternshipReport
InternshipReport
 
Pres metabief2020jmm
Pres metabief2020jmmPres metabief2020jmm
Pres metabief2020jmm
 
Random Matrix Theory and Machine Learning - Part 3
Random Matrix Theory and Machine Learning - Part 3Random Matrix Theory and Machine Learning - Part 3
Random Matrix Theory and Machine Learning - Part 3
 
Berans qm overview
Berans qm overviewBerans qm overview
Berans qm overview
 
Shanghai tutorial
Shanghai tutorialShanghai tutorial
Shanghai tutorial
 
Introduction to advanced Monte Carlo methods
Introduction to advanced Monte Carlo methodsIntroduction to advanced Monte Carlo methods
Introduction to advanced Monte Carlo methods
 
Monte Carlo Statistical Methods
Monte Carlo Statistical MethodsMonte Carlo Statistical Methods
Monte Carlo Statistical Methods
 
Demo
DemoDemo
Demo
 
Demo
DemoDemo
Demo
 
Demo
DemoDemo
Demo
 
An investigation of inference of the generalized extreme value distribution b...
An investigation of inference of the generalized extreme value distribution b...An investigation of inference of the generalized extreme value distribution b...
An investigation of inference of the generalized extreme value distribution b...
 
Some Thoughts on Sampling
Some Thoughts on SamplingSome Thoughts on Sampling
Some Thoughts on Sampling
 

More from Christian Robert

Asymptotics of ABC, lecture, Collège de France
Asymptotics of ABC, lecture, Collège de FranceAsymptotics of ABC, lecture, Collège de France
Asymptotics of ABC, lecture, Collège de FranceChristian Robert
 
Workshop in honour of Don Poskitt and Gael Martin
Workshop in honour of Don Poskitt and Gael MartinWorkshop in honour of Don Poskitt and Gael Martin
Workshop in honour of Don Poskitt and Gael MartinChristian Robert
 
How many components in a mixture?
How many components in a mixture?How many components in a mixture?
How many components in a mixture?Christian Robert
 
Testing for mixtures at BNP 13
Testing for mixtures at BNP 13Testing for mixtures at BNP 13
Testing for mixtures at BNP 13Christian Robert
 
Inferring the number of components: dream or reality?
Inferring the number of components: dream or reality?Inferring the number of components: dream or reality?
Inferring the number of components: dream or reality?Christian Robert
 
Testing for mixtures by seeking components
Testing for mixtures by seeking componentsTesting for mixtures by seeking components
Testing for mixtures by seeking componentsChristian Robert
 
discussion on Bayesian restricted likelihood
discussion on Bayesian restricted likelihooddiscussion on Bayesian restricted likelihood
discussion on Bayesian restricted likelihoodChristian Robert
 
NCE, GANs & VAEs (and maybe BAC)
NCE, GANs & VAEs (and maybe BAC)NCE, GANs & VAEs (and maybe BAC)
NCE, GANs & VAEs (and maybe BAC)Christian Robert
 
Coordinate sampler : A non-reversible Gibbs-like sampler
Coordinate sampler : A non-reversible Gibbs-like samplerCoordinate sampler : A non-reversible Gibbs-like sampler
Coordinate sampler : A non-reversible Gibbs-like samplerChristian Robert
 
Laplace's Demon: seminar #1
Laplace's Demon: seminar #1Laplace's Demon: seminar #1
Laplace's Demon: seminar #1Christian Robert
 
Likelihood-free Design: a discussion
Likelihood-free Design: a discussionLikelihood-free Design: a discussion
Likelihood-free Design: a discussionChristian Robert
 

More from Christian Robert (20)

Asymptotics of ABC, lecture, Collège de France
Asymptotics of ABC, lecture, Collège de FranceAsymptotics of ABC, lecture, Collège de France
Asymptotics of ABC, lecture, Collège de France
 
Workshop in honour of Don Poskitt and Gael Martin
Workshop in honour of Don Poskitt and Gael MartinWorkshop in honour of Don Poskitt and Gael Martin
Workshop in honour of Don Poskitt and Gael Martin
 
discussion of ICML23.pdf
discussion of ICML23.pdfdiscussion of ICML23.pdf
discussion of ICML23.pdf
 
How many components in a mixture?
How many components in a mixture?How many components in a mixture?
How many components in a mixture?
 
restore.pdf
restore.pdfrestore.pdf
restore.pdf
 
Testing for mixtures at BNP 13
Testing for mixtures at BNP 13Testing for mixtures at BNP 13
Testing for mixtures at BNP 13
 
Inferring the number of components: dream or reality?
Inferring the number of components: dream or reality?Inferring the number of components: dream or reality?
Inferring the number of components: dream or reality?
 
CDT 22 slides.pdf
CDT 22 slides.pdfCDT 22 slides.pdf
CDT 22 slides.pdf
 
Testing for mixtures by seeking components
Testing for mixtures by seeking componentsTesting for mixtures by seeking components
Testing for mixtures by seeking components
 
discussion on Bayesian restricted likelihood
discussion on Bayesian restricted likelihooddiscussion on Bayesian restricted likelihood
discussion on Bayesian restricted likelihood
 
NCE, GANs & VAEs (and maybe BAC)
NCE, GANs & VAEs (and maybe BAC)NCE, GANs & VAEs (and maybe BAC)
NCE, GANs & VAEs (and maybe BAC)
 
ABC-Gibbs
ABC-GibbsABC-Gibbs
ABC-Gibbs
 
Coordinate sampler : A non-reversible Gibbs-like sampler
Coordinate sampler : A non-reversible Gibbs-like samplerCoordinate sampler : A non-reversible Gibbs-like sampler
Coordinate sampler : A non-reversible Gibbs-like sampler
 
eugenics and statistics
eugenics and statisticseugenics and statistics
eugenics and statistics
 
Laplace's Demon: seminar #1
Laplace's Demon: seminar #1Laplace's Demon: seminar #1
Laplace's Demon: seminar #1
 
ABC-Gibbs
ABC-GibbsABC-Gibbs
ABC-Gibbs
 
asymptotics of ABC
asymptotics of ABCasymptotics of ABC
asymptotics of ABC
 
ABC-Gibbs
ABC-GibbsABC-Gibbs
ABC-Gibbs
 
Likelihood-free Design: a discussion
Likelihood-free Design: a discussionLikelihood-free Design: a discussion
Likelihood-free Design: a discussion
 
the ABC of ABC
the ABC of ABCthe ABC of ABC
the ABC of ABC
 

Recently uploaded

Measures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeMeasures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeThiyagu K
 
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions  for the students and aspirants of Chemistry12th.pptxOrganic Name Reactions  for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions for the students and aspirants of Chemistry12th.pptxVS Mahajan Coaching Centre
 
Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptx
Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptxContemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptx
Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptxRoyAbrique
 
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdfssuser54595a
 
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...EduSkills OECD
 
Science 7 - LAND and SEA BREEZE and its Characteristics
Science 7 - LAND and SEA BREEZE and its CharacteristicsScience 7 - LAND and SEA BREEZE and its Characteristics
Science 7 - LAND and SEA BREEZE and its CharacteristicsKarinaGenton
 
URLs and Routing in the Odoo 17 Website App
URLs and Routing in the Odoo 17 Website AppURLs and Routing in the Odoo 17 Website App
URLs and Routing in the Odoo 17 Website AppCeline George
 
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptxPOINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptxSayali Powar
 
A Critique of the Proposed National Education Policy Reform
A Critique of the Proposed National Education Policy ReformA Critique of the Proposed National Education Policy Reform
A Critique of the Proposed National Education Policy ReformChameera Dedduwage
 
Presiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha electionsPresiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha electionsanshu789521
 
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...Krashi Coaching
 
Alper Gobel In Media Res Media Component
Alper Gobel In Media Res Media ComponentAlper Gobel In Media Res Media Component
Alper Gobel In Media Res Media ComponentInMediaRes1
 
_Math 4-Q4 Week 5.pptx Steps in Collecting Data
_Math 4-Q4 Week 5.pptx Steps in Collecting Data_Math 4-Q4 Week 5.pptx Steps in Collecting Data
_Math 4-Q4 Week 5.pptx Steps in Collecting DataJhengPantaleon
 
CARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptxCARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptxGaneshChakor2
 
Micromeritics - Fundamental and Derived Properties of Powders
Micromeritics - Fundamental and Derived Properties of PowdersMicromeritics - Fundamental and Derived Properties of Powders
Micromeritics - Fundamental and Derived Properties of PowdersChitralekhaTherkar
 
Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991
Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991
Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991RKavithamani
 
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdfBASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdfSoniaTolstoy
 
Solving Puzzles Benefits Everyone (English).pptx
Solving Puzzles Benefits Everyone (English).pptxSolving Puzzles Benefits Everyone (English).pptx
Solving Puzzles Benefits Everyone (English).pptxOH TEIK BIN
 

Recently uploaded (20)

Measures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeMeasures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and Mode
 
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions  for the students and aspirants of Chemistry12th.pptxOrganic Name Reactions  for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
 
Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptx
Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptxContemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptx
Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptx
 
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
 
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
 
Código Creativo y Arte de Software | Unidad 1
Código Creativo y Arte de Software | Unidad 1Código Creativo y Arte de Software | Unidad 1
Código Creativo y Arte de Software | Unidad 1
 
Science 7 - LAND and SEA BREEZE and its Characteristics
Science 7 - LAND and SEA BREEZE and its CharacteristicsScience 7 - LAND and SEA BREEZE and its Characteristics
Science 7 - LAND and SEA BREEZE and its Characteristics
 
URLs and Routing in the Odoo 17 Website App
URLs and Routing in the Odoo 17 Website AppURLs and Routing in the Odoo 17 Website App
URLs and Routing in the Odoo 17 Website App
 
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptxPOINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
 
TataKelola dan KamSiber Kecerdasan Buatan v022.pdf
TataKelola dan KamSiber Kecerdasan Buatan v022.pdfTataKelola dan KamSiber Kecerdasan Buatan v022.pdf
TataKelola dan KamSiber Kecerdasan Buatan v022.pdf
 
A Critique of the Proposed National Education Policy Reform
A Critique of the Proposed National Education Policy ReformA Critique of the Proposed National Education Policy Reform
A Critique of the Proposed National Education Policy Reform
 
Presiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha electionsPresiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha elections
 
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
 
Alper Gobel In Media Res Media Component
Alper Gobel In Media Res Media ComponentAlper Gobel In Media Res Media Component
Alper Gobel In Media Res Media Component
 
_Math 4-Q4 Week 5.pptx Steps in Collecting Data
_Math 4-Q4 Week 5.pptx Steps in Collecting Data_Math 4-Q4 Week 5.pptx Steps in Collecting Data
_Math 4-Q4 Week 5.pptx Steps in Collecting Data
 
CARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptxCARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptx
 
Micromeritics - Fundamental and Derived Properties of Powders
Micromeritics - Fundamental and Derived Properties of PowdersMicromeritics - Fundamental and Derived Properties of Powders
Micromeritics - Fundamental and Derived Properties of Powders
 
Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991
Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991
Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991
 
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdfBASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
 
Solving Puzzles Benefits Everyone (English).pptx
Solving Puzzles Benefits Everyone (English).pptxSolving Puzzles Benefits Everyone (English).pptx
Solving Puzzles Benefits Everyone (English).pptx
 

A Vanilla Rao-Blackwellisation

  • 1. Vanilla Rao–Blackwellisation of Metropolis–Hastings algorithms Christian P. Robert Universit´ Paris-Dauphine and CREST, Paris, France e Joint works with Randal Douc, Pierre Jacob and Murray Smith xian@ceremade.dauphine.fr August 9, 2010 1 / 22
  • 2. Main themes 1 Rao–Blackwellisation on MCMC. 2 Can be performed in any Hastings Metropolis algorithm. 3 Asymptotically more efficient than usual MCMC with a controlled additional computing 4 Can take advantage of parallel capacities at a very basic level 2 / 22
  • 3. Main themes 1 Rao–Blackwellisation on MCMC. 2 Can be performed in any Hastings Metropolis algorithm. 3 Asymptotically more efficient than usual MCMC with a controlled additional computing 4 Can take advantage of parallel capacities at a very basic level 2 / 22
  • 4. Main themes 1 Rao–Blackwellisation on MCMC. 2 Can be performed in any Hastings Metropolis algorithm. 3 Asymptotically more efficient than usual MCMC with a controlled additional computing 4 Can take advantage of parallel capacities at a very basic level 2 / 22
  • 5. Metropolis Hastings revisited Rao–Blackwellisation Outline 1 Metropolis Hastings revisited 2 Rao–Blackwellisation Formal importance sampling Variance reduction Asymptotic results Illustrations 3 / 22
  • 6. Metropolis Hastings revisited Rao–Blackwellisation Outline 1 Metropolis Hastings revisited 2 Rao–Blackwellisation Formal importance sampling Variance reduction Asymptotic results Illustrations 3 / 22
  • 7. Metropolis Hastings revisited Rao–Blackwellisation Outline 1 Metropolis Hastings revisited 2 Rao–Blackwellisation Formal importance sampling Variance reduction Asymptotic results Illustrations 4 / 22
  • 8. Metropolis Hastings revisited Rao–Blackwellisation Metropolis Hastings algorithm 1 We wish to approximate h(x)π(x)dx I = = h(x)¯ (x)dx π π(x)dx 2 x → π(x) is known but not π(x)dx. 1 n 3 Approximate I with δ = n t=1 h(x (t) ) where (x (t) ) is a Markov chain with limiting distribution π . ¯ 4 Convergence obtained from Law of Large Numbers or CLT for Markov chains. 5 / 22
  • 9. Metropolis Hastings revisited Rao–Blackwellisation Metropolis Hastings algorithm 1 We wish to approximate h(x)π(x)dx I = = h(x)¯ (x)dx π π(x)dx 2 x → π(x) is known but not π(x)dx. 1 n 3 Approximate I with δ = n t=1 h(x (t) ) where (x (t) ) is a Markov chain with limiting distribution π . ¯ 4 Convergence obtained from Law of Large Numbers or CLT for Markov chains. 5 / 22
  • 10. Metropolis Hastings revisited Rao–Blackwellisation Metropolis Hastings algorithm 1 We wish to approximate h(x)π(x)dx I = = h(x)¯ (x)dx π π(x)dx 2 x → π(x) is known but not π(x)dx. 1 n 3 Approximate I with δ = n t=1 h(x (t) ) where (x (t) ) is a Markov chain with limiting distribution π . ¯ 4 Convergence obtained from Law of Large Numbers or CLT for Markov chains. 5 / 22
  • 11. Metropolis Hastings revisited Rao–Blackwellisation Metropolis Hastings algorithm 1 We wish to approximate h(x)π(x)dx I = = h(x)¯ (x)dx π π(x)dx 2 x → π(x) is known but not π(x)dx. 1 n 3 Approximate I with δ = n t=1 h(x (t) ) where (x (t) ) is a Markov chain with limiting distribution π . ¯ 4 Convergence obtained from Law of Large Numbers or CLT for Markov chains. 5 / 22
  • 12. Metropolis Hastings revisited Rao–Blackwellisation Metropolis Hasting Algorithm Suppose that x (t) is drawn. 1 Simulate yt ∼ q(·|x (t) ). 2 Set x (t+1) = yt with probability π(yt ) q(x (t) |yt ) α(x (t) , yt ) = min 1, π(x (t) ) q(yt |x (t) ) Otherwise, set x (t+1) = x (t) . 3 α is such that the detailed balance equation is satisfied: ⊲ π is the ¯ stationary distribution of (x (t) ). ◮ The accepted candidates are simulated with the rejection algorithm. 6 / 22
  • 13. Metropolis Hastings revisited Rao–Blackwellisation Metropolis Hasting Algorithm Suppose that x (t) is drawn. 1 Simulate yt ∼ q(·|x (t) ). 2 Set x (t+1) = yt with probability π(yt ) q(x (t) |yt ) α(x (t) , yt ) = min 1, π(x (t) ) q(yt |x (t) ) Otherwise, set x (t+1) = x (t) . 3 α is such that the detailed balance equation is satisfied: ⊲ π is the ¯ stationary distribution of (x (t) ). ◮ The accepted candidates are simulated with the rejection algorithm. 6 / 22
  • 14. Metropolis Hastings revisited Rao–Blackwellisation Metropolis Hasting Algorithm Suppose that x (t) is drawn. 1 Simulate yt ∼ q(·|x (t) ). 2 Set x (t+1) = yt with probability π(yt ) q(x (t) |yt ) α(x (t) , yt ) = min 1, π(x (t) ) q(yt |x (t) ) Otherwise, set x (t+1) = x (t) . 3 α is such that the detailed balance equation is satisfied: π(x)q(y |x)α(x, y ) = π(y )q(x|y )α(y , x). ⊲ π is the stationary distribution of (x (t) ). ¯ ◮ The accepted candidates are simulated with the rejection algorithm. 6 / 22
  • 15. Metropolis Hastings revisited Rao–Blackwellisation Metropolis Hasting Algorithm Suppose that x (t) is drawn. 1 Simulate yt ∼ q(·|x (t) ). 2 Set x (t+1) = yt with probability π(yt ) q(x (t) |yt ) α(x (t) , yt ) = min 1, π(x (t) ) q(yt |x (t) ) Otherwise, set x (t+1) = x (t) . 3 α is such that the detailed balance equation is satisfied: π(x)q(y |x)α(x, y ) = π(y )q(x|y )α(y , x). ⊲ π is the stationary distribution of (x (t) ). ¯ ◮ The accepted candidates are simulated with the rejection algorithm. 6 / 22
  • 16. Metropolis Hastings revisited Rao–Blackwellisation Some properties of the HM algorithm 1 Alternative representation of the estimator δ is n MN 1 (t) 1 δ= h(x )= ni h(zi ) , n t=1 N i=1 where zi ’s are the accepted yj ’s, MN is the number of accepted yj ’s till time N, ni is the number of times zi appears in the sequence (x (t) )t . 7 / 22
  • 17. Metropolis Hastings revisited Rao–Blackwellisation α(zi , ·) q(·|zi ) q(·|zi ) q (·|zi ) = ˜ ≤ , p(zi ) p(zi ) where p(zi ) = α(zi , y ) q(y |zi )dy . To simulate according to q (·|zi ): ˜ 1 Propose a candidate y ∼ q(·|zi ) 2 Accept with probability q(y |zi ) q (y |zi )/ ˜ = α(zi , y ) p(zi ) Otherwise, reject it and starts again. 3 ◮ this is the transition of the HM algorithm. The transition kernel q admits π as a stationary distribution: ˜ ˜ π (x)˜ (y |x) = ˜ q 8 / 22
  • 18. Metropolis Hastings revisited Rao–Blackwellisation α(zi , ·) q(·|zi ) q(·|zi ) q (·|zi ) = ˜ ≤ , p(zi ) p(zi ) where p(zi ) = α(zi , y ) q(y |zi )dy . To simulate according to q (·|zi ): ˜ 1 Propose a candidate y ∼ q(·|zi ) 2 Accept with probability q(y |zi ) q (y |zi )/ ˜ = α(zi , y ) p(zi ) Otherwise, reject it and starts again. 3 ◮ this is the transition of the HM algorithm. The transition kernel q admits π as a stationary distribution: ˜ ˜ π (x)˜ (y |x) = ˜ q 8 / 22
  • 19. Metropolis Hastings revisited Rao–Blackwellisation α(zi , ·) q(·|zi ) q(·|zi ) q (·|zi ) = ˜ ≤ , p(zi ) p(zi ) where p(zi ) = α(zi , y ) q(y |zi )dy . To simulate according to q (·|zi ): ˜ 1 Propose a candidate y ∼ q(·|zi ) 2 Accept with probability q(y |zi ) q (y |zi )/ ˜ = α(zi , y ) p(zi ) Otherwise, reject it and starts again. 3 ◮ this is the transition of the HM algorithm. The transition kernel q admits π as a stationary distribution: ˜ ˜ π (x)˜ (y |x) = ˜ q 8 / 22
  • 20. Metropolis Hastings revisited Rao–Blackwellisation α(zi , ·) q(·|zi ) q(·|zi ) q (·|zi ) = ˜ ≤ , p(zi ) p(zi ) where p(zi ) = α(zi , y ) q(y |zi )dy . To simulate according to q (·|zi ): ˜ 1 Propose a candidate y ∼ q(·|zi ) 2 Accept with probability q(y |zi ) q (y |zi )/ ˜ = α(zi , y ) p(zi ) Otherwise, reject it and starts again. 3 ◮ this is the transition of the HM algorithm. The transition kernel q admits π as a stationary distribution: ˜ ˜ π(x)p(x) α(x, y )q(y |x) π (x)˜ (y |x) = ˜ q π(u)p(u)du p(x) π (x) ˜ q (y |x) ˜ 8 / 22
  • 21. Metropolis Hastings revisited Rao–Blackwellisation α(zi , ·) q(·|zi ) q(·|zi ) q (·|zi ) = ˜ ≤ , p(zi ) p(zi ) where p(zi ) = α(zi , y ) q(y |zi )dy . To simulate according to q (·|zi ): ˜ 1 Propose a candidate y ∼ q(·|zi ) 2 Accept with probability q(y |zi ) q (y |zi )/ ˜ = α(zi , y ) p(zi ) Otherwise, reject it and starts again. 3 ◮ this is the transition of the HM algorithm. The transition kernel q admits π as a stationary distribution: ˜ ˜ π(x)α(x, y )q(y |x) π (x)˜ (y |x) = ˜ q π(u)p(u)du 8 / 22
  • 22. Metropolis Hastings revisited Rao–Blackwellisation α(zi , ·) q(·|zi ) q(·|zi ) q (·|zi ) = ˜ ≤ , p(zi ) p(zi ) where p(zi ) = α(zi , y ) q(y |zi )dy . To simulate according to q (·|zi ): ˜ 1 Propose a candidate y ∼ q(·|zi ) 2 Accept with probability q(y |zi ) q (y |zi )/ ˜ = α(zi , y ) p(zi ) Otherwise, reject it and starts again. 3 ◮ this is the transition of the HM algorithm. The transition kernel q admits π as a stationary distribution: ˜ ˜ π(y )α(y , x)q(x|y ) π (x)˜ (y |x) = ˜ q π(u)p(u)du 8 / 22
  • 23. Metropolis Hastings revisited Rao–Blackwellisation α(zi , ·) q(·|zi ) q(·|zi ) q (·|zi ) = ˜ ≤ , p(zi ) p(zi ) where p(zi ) = α(zi , y ) q(y |zi )dy . To simulate according to q (·|zi ): ˜ 1 Propose a candidate y ∼ q(·|zi ) 2 Accept with probability q(y |zi ) q (y |zi )/ ˜ = α(zi , y ) p(zi ) Otherwise, reject it and starts again. 3 ◮ this is the transition of the HM algorithm. The transition kernel q admits π as a stationary distribution: ˜ ˜ π (x)˜ (y |x) = π (y )˜ (x|y ) , ˜ q ˜ q 8 / 22
  • 24. Metropolis Hastings revisited Rao–Blackwellisation Lemme The sequence (zi , ni ) satisfies 1 (zi , ni )i is a Markov chain; 2 zi+1 and ni are independent given zi ; 3 ni is distributed as a geometric random variable with probability parameter p(zi ) := α(zi , y ) q(y |zi ) dy ; (1) 4 ˜ (zi )i is a Markov chain with transition kernel Q(z, dy ) = q (y |z)dy ˜ and stationary distribution π such that ˜ q (·|z) ∝ α(z, ·) q(·|z) ˜ and π (·) ∝ π(·)p(·) . ˜ 9 / 22
  • 25. Metropolis Hastings revisited Rao–Blackwellisation Lemme The sequence (zi , ni ) satisfies 1 (zi , ni )i is a Markov chain; 2 zi+1 and ni are independent given zi ; 3 ni is distributed as a geometric random variable with probability parameter p(zi ) := α(zi , y ) q(y |zi ) dy ; (1) 4 ˜ (zi )i is a Markov chain with transition kernel Q(z, dy ) = q (y |z)dy ˜ and stationary distribution π such that ˜ q (·|z) ∝ α(z, ·) q(·|z) ˜ and π (·) ∝ π(·)p(·) . ˜ 9 / 22
  • 26. Metropolis Hastings revisited Rao–Blackwellisation Lemme The sequence (zi , ni ) satisfies 1 (zi , ni )i is a Markov chain; 2 zi+1 and ni are independent given zi ; 3 ni is distributed as a geometric random variable with probability parameter p(zi ) := α(zi , y ) q(y |zi ) dy ; (1) 4 ˜ (zi )i is a Markov chain with transition kernel Q(z, dy ) = q (y |z)dy ˜ and stationary distribution π such that ˜ q (·|z) ∝ α(z, ·) q(·|z) ˜ and π (·) ∝ π(·)p(·) . ˜ 9 / 22
  • 27. Metropolis Hastings revisited Rao–Blackwellisation Lemme The sequence (zi , ni ) satisfies 1 (zi , ni )i is a Markov chain; 2 zi+1 and ni are independent given zi ; 3 ni is distributed as a geometric random variable with probability parameter p(zi ) := α(zi , y ) q(y |zi ) dy ; (1) 4 ˜ (zi )i is a Markov chain with transition kernel Q(z, dy ) = q (y |z)dy ˜ and stationary distribution π such that ˜ q (·|z) ∝ α(z, ·) q(·|z) ˜ and π (·) ∝ π(·)p(·) . ˜ 9 / 22
  • 28. Metropolis Hastings revisited Rao–Blackwellisation Old bottle, new wine [or vice-versa] zi−1 10 / 22
  • 29. Metropolis Hastings revisited Rao–Blackwellisation Old bottle, new wine [or vice-versa] indep zi−1 zi indep ni−1 10 / 22
  • 30. Metropolis Hastings revisited Rao–Blackwellisation Old bottle, new wine [or vice-versa] indep indep zi−1 zi zi+1 indep indep ni−1 ni 10 / 22
  • 31. Metropolis Hastings revisited Rao–Blackwellisation Old bottle, new wine [or vice-versa] indep indep zi−1 zi zi+1 indep indep ni−1 ni n MN 1 1 δ= h(x (t) ) = ni h(zi ) . n t=1 N i=1 10 / 22
  • 32. Metropolis Hastings revisited Rao–Blackwellisation Old bottle, new wine [or vice-versa] indep indep zi−1 zi zi+1 indep indep ni−1 ni n MN 1 1 δ= h(x (t) ) = ni h(zi ) . n t=1 N i=1 10 / 22
  • 33. Metropolis Hastings revisited Rao–Blackwellisation Outline 1 Metropolis Hastings revisited 2 Rao–Blackwellisation Formal importance sampling Variance reduction Asymptotic results Illustrations 11 / 22
  • 34. Metropolis Hastings revisited Rao–Blackwellisation Formal importance sampling Importance sampling perspective 1 A natural idea: MN 1 h(zi ) δ∗ = , N p(zi ) i=1 12 / 22
  • 35. Metropolis Hastings revisited Rao–Blackwellisation Formal importance sampling Importance sampling perspective 1 A natural idea: MN h(zi ) MN π(zi ) i=1 i=1 h(zi ) p(zi ) π (zi ) ˜ δ∗ ≃ = . MN 1 MN π(zi ) i=1 i=1 p(zi ) π (zi ) ˜ 12 / 22
  • 36. Metropolis Hastings revisited Rao–Blackwellisation Formal importance sampling Importance sampling perspective 1 A natural idea: MN h(zi ) MN π(zi ) i=1 i=1 h(zi ) p(zi ) π (zi ) ˜ δ∗ ≃ = . MN 1 MN π(zi ) i=1 i=1 p(zi ) π (zi ) ˜ 2 But p not available in closed form. 12 / 22
  • 37. Metropolis Hastings revisited Rao–Blackwellisation Formal importance sampling Importance sampling perspective 1 A natural idea: MN h(zi ) MN π(zi ) i=1 i=1 h(zi ) p(zi ) π (zi ) ˜ δ∗ ≃ = . MN 1 MN π(zi ) i=1 i=1 p(zi ) π (zi ) ˜ 2 But p not available in closed form. 3 The geometric ni is the replacement obvious solution that is used in the original Metropolis–Hastings estimate since E[ni ] = 1/p(zi ). 12 / 22
  • 38. Metropolis Hastings revisited Rao–Blackwellisation Formal importance sampling Importance sampling perspective 1 A natural idea: MN h(zi ) MN π(zi ) i=1 i=1 h(zi ) p(zi ) π (zi ) ˜ δ∗ ≃ = . MN 1 MN π(zi ) i=1 i=1 p(zi ) π (zi ) ˜ 2 But p not available in closed form. 3 The geometric ni is the replacement obvious solution that is used in the original Metropolis–Hastings estimate since E[ni ] = 1/p(zi ). 12 / 22
  • 39. Metropolis Hastings revisited Rao–Blackwellisation Formal importance sampling The crude estimate of 1/p(zi ), ∞ ni = 1 + I {uℓ ≥ α(zi , yℓ )} , j=1 ℓ≤j can be improved: Lemma If (yj )j is an iid sequence with distribution q(y |zi ), the quantity ∞ ˆ ξi = 1 + {1 − α(zi , yℓ )} j=1 ℓ≤j is an unbiased estimator of 1/p(zi ) which variance, conditional on zi , is lower than the conditional variance of ni , {1 − p(zi )}/p 2 (zi ). 13 / 22
  • 40. Metropolis Hastings revisited Rao–Blackwellisation Formal importance sampling Rao-Blackwellised, for sure? ∞ ˆ ξi = 1 + {1 − α(zi , yℓ )} j=1 ℓ≤j 1 Infinite sum but finite with at least positive probability: π(yt ) q(x (t) |yt ) α(x (t) , yt ) = min 1, π(x (t) ) q(yt |x (t) ) For example: take a symetric random walk as a proposal. 2 What if we wish to be sure that the sum is finite? 14 / 22
  • 41. Metropolis Hastings revisited Rao–Blackwellisation Variance reduction Variance improvement Proposition If (yj )j is an iid sequence with distribution q(y |zi ) and (uj )j is an iid uniform sequence, for any k ≥ 0, the quantity ∞ ˆ ξik = 1 + {1 − α(zi , yj )} I {uℓ ≥ α(zi , yℓ )} j=1 1≤ℓ≤k∧j k+1≤ℓ≤j is an unbiased estimator of 1/p(zi ) with an almost sure finite number of terms. 15 / 22
  • 42. Metropolis Hastings revisited Rao–Blackwellisation Variance reduction Variance improvement Proposition If (yj )j is an iid sequence with distribution q(y |zi ) and (uj )j is an iid uniform sequence, for any k ≥ 0, the quantity ∞ ˆ ξik = 1 + {1 − α(zi , yj )} I {uℓ ≥ α(zi , yℓ )} j=1 1≤ℓ≤k∧j k+1≤ℓ≤j is an unbiased estimator of 1/p(zi ) with an almost sure finite number of terms. Moreover, for k ≥ 1, 1 − (1 − 2p(zi ) + r (zi ))k 2 − p(zi ) „ « h ˛ i ˆ˛ 1 − p(zi ) V ξik ˛ zi = − (p(zi ) − r (zi )) , p 2 (zi ) 2p(zi ) − r (zi ) p 2 (zi ) α2 (zi , y ) q(y |zi ) dy . R R where p(zi ) := α(zi , y ) q(y |zi ) dy . and r (zi ) := 15 / 22
  • 43. Metropolis Hastings revisited Rao–Blackwellisation Variance reduction Variance improvement Proposition If (yj )j is an iid sequence with distribution q(y |zi ) and (uj )j is an iid uniform sequence, for any k ≥ 0, the quantity ∞ ˆ ξik = 1 + {1 − α(zi , yj )} I {uℓ ≥ α(zi , yℓ )} j=1 1≤ℓ≤k∧j k+1≤ℓ≤j is an unbiased estimator of 1/p(zi ) with an almost sure finite number of terms. Therefore, we have ˆ ˆ ˆ V ξi zi ≤ V ξik zi ≤ V ξi0 zi = V [ni | zi ] . 15 / 22
  • 44. Metropolis Hastings revisited Rao–Blackwellisation Variance reduction zi−1 ∞ ˆ ξik = 1 + {1 − α(zi , yj )} I {uℓ ≥ α(zi , yℓ )} j=1 1≤ℓ≤k∧j k+1≤ℓ≤j 16 / 22
  • 45. Metropolis Hastings revisited Rao–Blackwellisation Variance reduction not indep zi−1 zi not indep ˆk ξi−1 ∞ ˆ ξik = 1 + {1 − α(zi , yj )} I {uℓ ≥ α(zi , yℓ )} j=1 1≤ℓ≤k∧j k+1≤ℓ≤j 16 / 22
  • 46. Metropolis Hastings revisited Rao–Blackwellisation Variance reduction not indep not indep zi−1 zi zi+1 not indep not indep ˆk ξi−1 ˆ ξik ∞ ˆ ξik = 1 + {1 − α(zi , yj )} I {uℓ ≥ α(zi , yℓ )} j=1 1≤ℓ≤k∧j k+1≤ℓ≤j 16 / 22
  • 47. Metropolis Hastings revisited Rao–Blackwellisation Variance reduction not indep not indep zi−1 zi zi+1 not indep not indep ˆk ξi−1 ˆ ξik M ˆk k i=1 ξi h(zi ) δM = M ˆk . i=1 ξi 16 / 22
  • 48. Metropolis Hastings revisited Rao–Blackwellisation Variance reduction not indep not indep zi−1 zi zi+1 not indep not indep ˆk ξi−1 ˆ ξik M ˆk k i=1 ξi h(zi ) δM = M ˆk . i=1 ξi 16 / 22
  • 49. Metropolis Hastings revisited Rao–Blackwellisation Asymptotic results Let M ˆk k i=1 ξi h(zi ) δM = M ˆk . i=1 ξi For any positive function ϕ, we denote Cϕ = {h; |h/ϕ|∞ < ∞}. 17 / 22
  • 50. Metropolis Hastings revisited Rao–Blackwellisation Asymptotic results Let M ˆk k i=1 ξi h(zi ) δM = M ˆk . i=1 ξi For any positive function ϕ, we denote Cϕ = {h; |h/ϕ|∞ < ∞}. Assume that there exists a positive function ϕ ≥ 1 such that PM i=1 h(zi )/p(zi ) P ∀h ∈ Cϕ , PM −→ π(h) i=1 1/p(zi ) Theorem Under the assumption that π(p) > 0, the following convergence property holds: i) If h is in Cϕ , then k P δM −→M→∞ π(h) (◮Consistency) 17 / 22
  • 51. Metropolis Hastings revisited Rao–Blackwellisation Asymptotic results Let M ˆk k i=1 ξi h(zi ) δM = M ˆk . i=1 ξi For any positive function ϕ, we denote Cϕ = {h; |h/ϕ|∞ < ∞}. Assume that there exists a positive function ψ such that √ M i=1 h(zi )/p(zi ) L ∀h ∈ Cψ , M M − π(h) −→ N (0, Γ(h)) i=1 1/p(zi ) Theorem Under the assumption that π(p) > 0, the following convergence property holds: ii) If, in addition, h2 /p ∈ Cϕ and h ∈ Cψ , then √ k L M(δM − π(h)) −→M→∞ N (0, Vk [h − π(h)]) , (◮Clt) where Vk (h) := π(p) ˆ π(dz)V ξik z h2 (z)p(z) + Γ(h) . 17 / 22
  • 52. Metropolis Hastings revisited Rao–Blackwellisation Asymptotic results We will need some additional assumptions. Assume a maximal inequality for the Markov chain (zi )i : there exists a measurable function ζ such that for any starting point x,   i NCh (x) ∀h ∈ Cζ , Px  sup [h(zi ) − π (h)] > ǫ ≤ ˜ 0≤i≤N ǫ2 j=0 Theorem Assume that h is such that h/p ∈ Cζ and {Ch/p , h2 /p 2 } ⊂ Cφ . Assume moreover that √ 0 L M δM − π(h) −→ N (0, V0 [h − π(h)]) . Then, for any starting point x, N h(x (t) ) t=1 L MN − π(h) −→N→∞ N (0, V0 [h − π(h)]) , N where MN is defined by 18 / 22
  • 53. Metropolis Hastings revisited Rao–Blackwellisation Asymptotic results We will need some additional assumptions. Assume a maximal inequality for the Markov chain (zi )i : there exists a measurable function ζ such that for any starting point x,   i NCh (x) ∀h ∈ Cζ , Px  sup [h(zi ) − π (h)] > ǫ ≤ ˜ 0≤i≤N ǫ2 j=0 Moreover, assume that ∃φ ≥ 1 such that for any starting point x, ˜ P ∀h ∈ Cφ , Q n (x, h) −→ π (h) = π(ph)/π(p) , ˜ Theorem Assume that h is such that h/p ∈ Cζ and {Ch/p , h2 /p 2 } ⊂ Cφ . Assume moreover that √ 0 L M δM − π(h) −→ N (0, V0 [h − π(h)]) . Then, for any starting point x, N t=1 h(x (t) ) L 18 / 22
  • 54. Metropolis Hastings revisited Rao–Blackwellisation Asymptotic results We will need some additional assumptions. Assume a maximal inequality for the Markov chain (zi )i : there exists a measurable function ζ such that for any starting point x,   i NCh (x) ∀h ∈ Cζ , Px  sup [h(zi ) − π (h)] > ǫ ≤ ˜ 0≤i≤N ǫ2 j=0 Moreover, assume that ∃φ ≥ 1 such that for any starting point x, ˜ P ∀h ∈ Cφ , Q n (x, h) −→ π (h) = π(ph)/π(p) , ˜ Theorem Assume that h is such that h/p ∈ Cζ and {Ch/p , h2 /p 2 } ⊂ Cφ . Assume moreover that √ 0 L M δM − π(h) −→ N (0, V0 [h − π(h)]) . Then, for any starting point x, N t=1 h(x (t) ) L 18 / 22
  • 55. Metropolis Hastings revisited Rao–Blackwellisation Asymptotic results   i NCh (x) ∀h ∈ Cζ , Px  sup [h(zi ) − π (h)] > ǫ ≤ ˜ 0≤i≤N j=0 ǫ2 ˜ P ∀h ∈ Cφ , Q n (x, h) −→ π (h) = π(ph)/π(p) , ˜ Theorem Assume that h is such that h/p ∈ Cζ and {Ch/p , h2 /p 2 } ⊂ Cφ . Assume moreover that √ 0 L M δM − π(h) −→ N (0, V0 [h − π(h)]) . Then, for any starting point x, N t=1h(x (t) ) L MN − π(h) −→N→∞ N (0, V0 [h − π(h)]) , N where MN is defined by 18 / 22
  • 56. Metropolis Hastings revisited Rao–Blackwellisation Asymptotic results Theorem Assume that h is such that h/p ∈ Cζ and {Ch/p , h2 /p 2 } ⊂ Cφ . Assume moreover that √ 0 L M δM − π(h) −→ N (0, V0 [h − π(h)]) . Then, for any starting point x, N h(x (t) ) t=1 L MN − π(h) −→N→∞ N (0, V0 [h − π(h)]) , N where MN is defined by MN MN +1 ˆ ξi0 ≤ N < ˆ ξi0 . i=1 i=1 18 / 22
  • 57. Metropolis Hastings revisited Rao–Blackwellisation Asymptotic results Theorem Assume that h is such that h/p ∈ Cζ and {Ch/p , h2 /p 2 } ⊂ Cφ . Assume moreover that √ 0 L M δM − π(h) −→ N (0, V0 [h − π(h)]) . Then, for any starting point x, N h(x (t) ) t=1 L MN − π(h) −→N→∞ N (0, V0 [h − π(h)]) , N where MN is defined by MN MN +1 ˆ ξi0 ≤ N < ˆ ξi0 . i=1 i=1 18 / 22
  • 58. Metropolis Hastings revisited Rao–Blackwellisation Illustrations Variance gain (1) h(x) x x2 IX >0 p(x) τ = .1 0.971 0.953 0.957 0.207 τ =2 0.965 0.942 0.875 0.861 τ =5 0.913 0.982 0.785 0.826 τ =7 0.899 0.982 0.768 0.820 Ratios of the empirical variances of δ ∞ and δ estimating E[h(X )]: 100 MCMC iterations over 103 replications of a random walk Gaussian proposal with scale τ . 19 / 22
  • 59. Metropolis Hastings revisited Rao–Blackwellisation Illustrations Illustration (1) Figure: Overlay of the variations of 250 iid realisations of the estimates δ (gold) and δ ∞ (grey) of E[X ] = 0 for 1000 iterations, along with the 90% interquantile range for the estimates δ (brown) and δ ∞ (pink), in the setting of a random walk Gaussian proposal with scale τ = 10. 20 / 22
  • 60. Metropolis Hastings revisited Rao–Blackwellisation Illustrations Extra computational effort median mean q.8 q.9 time τ = .25 0.0 8.85 4.9 13 4.2 τ = .50 0.0 6.76 4 11 2.25 τ = 1.0 0.25 6.15 4 10 2.5 τ = 2.0 0.20 5.90 3.5 8.5 4.5 Additional computing effort due: median and mean numbers of additional iterations, 80% and 90% quantiles for the additional iterations, and ratio of the average R computing times obtained over 105 simulations 21 / 22
  • 61. Metropolis Hastings revisited Rao–Blackwellisation Illustrations Illustration (2) Figure: Overlay of the variations of 500 iid realisations of the estimates δ (deep grey), δ ∞ (medium grey) and of the importance sampling version (light grey) of E[X ] = 10 when X ∼ Exp(.1) for 100 iterations, along with the 90% interquantile ranges (same colour code), in the setting of an independent exponential proposal with scale µ = 0.02. 22 / 22