Scaling analysis of multiple-try MCMC methods

               Randal DOUC
       randal.douc@it-sudparis.eu


Joint work with Mylène Bédard and Eric Moulines.
Themes

   1   MCMC algorithms with multiple proposals: MCTM, MTM-C.
   2   Analysis through optimal scaling (introduced by Roberts,
       Gelman, Gilks, 1998)
   3   Hit and Run algorithm.




Introduction     MH algorithms with multiple proposals   Optimal scaling   Optimising the speed up process   Conclusion




Outline

       1       Introduction

       2       MH algorithms with multiple proposals
                Random Walk MH
                MCTM algorithm
                MTM-C algorithms

       3       Optimal scaling
                 Main results

       4       Optimising the speed up process
                 MCTM algorithm
                 MTM-C algorithms

       5       Conclusion
Metropolis Hastings (MH) algorithm

           1   We wish to approximate

                       I = ∫ h(x) [π(x) / ∫ π(u) du] dx = ∫ h(x) π̄(x) dx

           2   x ↦ π(x) is known, but ∫ π(u) du is not.

           3   Approximate I with Ĩ = (1/n) Σ_{t=1}^{n} h(X[t]), where (X[t]) is a
               Markov chain with limiting distribution π̄.

           4   In the MH algorithm, the last condition is obtained from a detailed
               balance condition:

                       ∀x, y,   π(x) p(x, y) = π(y) p(y, x)

           5   The quality of the approximation is assessed through the Law of
               Large Numbers or the CLT for Markov chains.
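As a sketch of point 4, detailed balance can be checked numerically on a small discrete state space; the Metropolis kernel below, built from an unnormalized π with a uniform symmetric proposal, is an illustrative construction, not from the slides.

```python
import numpy as np

# Unnormalized target on 4 states (the normalizing constant is never computed).
pi = np.array([1.0, 3.0, 2.0, 0.5])
K = len(pi)

# Symmetric proposal: pick a uniformly random other state.
q = (np.ones((K, K)) - np.eye(K)) / (K - 1)

# Metropolis kernel: p(x, y) = q(x, y) * min(1, pi[y] / pi[x]) for y != x,
# with the rejection mass placed on the diagonal.
p = q * np.minimum(1.0, pi[None, :] / pi[:, None])
np.fill_diagonal(p, 0.0)
np.fill_diagonal(p, 1.0 - p.sum(axis=1))

# Detailed balance: pi(x) p(x, y) = pi(y) p(y, x) for all x, y.
flux = pi[:, None] * p
assert np.allclose(flux, flux.T)

# Consequence: the normalized target pi_bar is stationary for p.
pi_bar = pi / pi.sum()
assert np.allclose(pi_bar @ p, pi_bar)
```

Detailed balance with the *unnormalized* π is enough, since the unknown constant cancels on both sides; stationarity of π̄ then follows by summing over x.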
Random Walk MH

               Notation: w.p. = with probability

       Algorithm (MCMC)
       If X[t] = x, how is X[t + 1] simulated?
       (a) Y ∼ q(x; ·).
       (b) Accept the proposal X[t + 1] = Y w.p. α(x, Y) where

                       α(x, y) = 1 ∧ [π(y) q(y; x)] / [π(x) q(x; y)]

       (c) Otherwise X[t + 1] = x

       The chain is π-reversible since:

                       π(x) α(x, y) q(x; y) = π(y) α(y, x) q(y; x)
Random Walk MH

               Assume that q(x; y) = q(y; x) ◮ the instrumental kernel is
               symmetric. Typically Y = X + U where U has a symmetric distribution.
               Notation: w.p. = with probability

       Algorithm (MCMC with symmetric proposal)
       If X[t] = x, how is X[t + 1] simulated?
       (a) Y ∼ q(x; ·).
       (b) Accept the proposal X[t + 1] = Y w.p. α(x, Y) where

                       α(x, y) = 1 ∧ π(y) / π(x)

       (c) Otherwise X[t + 1] = x

       The chain is π-reversible since:

                       π(x) α(x, y) q(x; y) = π(y) α(y, x) q(y; x)
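Steps (a)-(c) can be sketched as a random-walk MH sampler with a Gaussian increment; the target below (a standard normal known only up to a constant) and the step size are illustrative choices, not from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)

def pi_unnorm(x):
    # Unnormalized target density: standard normal up to a constant.
    return np.exp(-0.5 * x * x)

def rwmh(n, sigma=1.0, x0=0.0):
    """Random Walk MH with symmetric proposal Y = x + sigma * U, U ~ N(0, 1)."""
    chain = np.empty(n)
    x = x0
    for t in range(n):
        y = x + sigma * rng.standard_normal()   # (a) symmetric proposal
        # (b) accept w.p. 1 ∧ π(y)/π(x); the normalizing constant cancels
        if rng.random() < min(1.0, pi_unnorm(y) / pi_unnorm(x)):
            x = y
        # (c) otherwise keep the current state
        chain[t] = x
    return chain

chain = rwmh(50_000, sigma=2.4)
# Ergodic averages approximate expectations under the target,
# e.g. (1/n) Σ h(X[t]) with h(x) = x^2 approximates E[X^2] = 1.
print(chain.mean(), (chain ** 2).mean())
```

The choice sigma ≈ 2.4 echoes the one-dimensional optimal-scaling guideline of Roberts, Gelman and Gilks discussed later in the deck.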
MCTM algorithm

Multiple proposal MCMC

           1   Liu, Liang, Wong (2000) introduced the multiple proposal
               MCMC. Generalized to multiple correlated proposals by Craiu
               and Lemieux (2007).
           2   A pool of candidates is drawn: (Y^1, ..., Y^K) | X[t] ∼ q(X[t]; ·).
           3   We select one candidate a priori according to some "informative"
               criterion (with high values of π, for example).
           4   We accept the candidate with some well chosen probability.

       ◮ Diversity of the candidates: some candidates are close to, others far
       away from, the current state. Some additional notations:

               Y^j | X[t] ∼ q_j(X[t]; ·)                   (◮ MARGINAL DIST.)      (1)
               (Y^i)_{i≠j} | X[t], Y^j ∼ q̄_j(X[t], Y^j; ·)  (◮ SIM. OTHER CAND.)    (2)
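One hypothetical way to realize the correlated pool in (1)-(2): drive all K proposals with a shared random increment at different scales, so that each marginal q_j is a symmetric random walk while the candidates spread out at diverse distances from the current state. The scales below are illustrative, not the construction of Craiu and Lemieux.

```python
import numpy as np

rng = np.random.default_rng(1)

def correlated_pool(x, scales):
    """Draw K correlated candidates Y^j = x + scales[j] * U with a shared U.

    Marginally, Y^j | x ~ N(x, scales[j]^2), a symmetric q_j as in (1),
    while the candidates are perfectly correlated through the common
    increment U, giving a simple instance of the joint law in (2).
    """
    u = rng.standard_normal()           # one shared increment
    return x + np.asarray(scales) * u   # pool (Y^1, ..., Y^K)

pool = correlated_pool(0.0, scales=[0.5, 1.0, 2.0])
```

The mix of small and large scales is what produces the "diversity of the candidates" mentioned above: some proposals stay near x, others explore far from it.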
MCTM algorithm

       Assume that q_j(x; y) = q_j(y; x).

       Algorithm (MCTM: Multiple Correlated Try Metropolis alg.)
       If X[t] = x, how is X[t + 1] simulated?
       (a) (Y^1, ..., Y^K) ∼ q(x; ·).                             (◮ POOL OF CAND.)
       (b) Draw an index J ∈ {1, ..., K} with probability proportional to
           [π(Y^1), ..., π(Y^K)].                                 (◮ SELECTION A PRIORI)
       (c) {Ỹ^{J,i}}_{i≠J} ∼ q̄_J(Y^J, x; ·).                      (◮ AUXILIARY VARIABLES)
       (d) Accept the proposal X[t + 1] = Y^J w.p. α_J(x, (Y^i)_{i=1}^K, (Ỹ^{J,i})_{i≠J}),
           where

               α_j(x, (y^i)_{i=1}^K, (ỹ^{j,i})_{i≠j}) = 1 ∧ [Σ_{i≠j} π(y^i) + π(y^j)] / [Σ_{i≠j} π(ỹ^{j,i}) + π(x)]    (3)

                                                     (◮ MH ACCEPTANCE PROBABILITY)
       (e) Otherwise, X[t + 1] = X[t]
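A minimal sketch of steps (a)-(e), in the simplified case of independent identical symmetric proposals (so the auxiliary pool in (c) is drawn the same way as the candidate pool); the one-dimensional Gaussian target, K, and the step size are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(2)

def pi_unnorm(x):
    # Unnormalized target: standard normal up to a constant.
    return np.exp(-0.5 * x * x)

def mtm_step(x, K=5, sigma=1.0):
    """One multiple-try Metropolis step with independent symmetric proposals."""
    # (a) pool of candidates Y^1, ..., Y^K around the current state
    y = x + sigma * rng.standard_normal(K)
    w = pi_unnorm(y)
    # (b) selection a priori: index J with probability proportional to pi
    j = rng.choice(K, p=w / w.sum())
    # (c) auxiliary variables: K - 1 fresh candidates around the selected point
    y_aux = y[j] + sigma * rng.standard_normal(K - 1)
    # (d) accept with probability (3): ratio of summed pi over the two pools
    num = w.sum()                              # Σ_{i≠j} π(y^i) + π(y^j)
    den = pi_unnorm(y_aux).sum() + pi_unnorm(x)  # Σ_{i≠j} π(ỹ^{j,i}) + π(x)
    if rng.random() < min(1.0, num / den):
        return y[j]
    # (e) otherwise stay at the current state
    return x

x, samples = 0.0, []
for _ in range(20_000):
    x = mtm_step(x, K=5, sigma=2.0)
    samples.append(x)
chain = np.array(samples)
```

With K = 1 the auxiliary pool is empty and the acceptance ratio in (d) collapses to 1 ∧ π(y)/π(x), recovering the symmetric random walk MH step.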
Introduction     MH algorithms with multiple proposals   Optimal scaling     Optimising the speed up process   Conclusion


MCTM algorithm

       Assume that qj (x ; y ) = qj (y ; x ) .

       Algorithme (MCTM: Multiple Correlated try Metropolis alg.)
       If X [t] = x , how is X [t + 1] simulated?
       (a) (Y 1 , . . . , Y K ) ∼ q(x ; ·).                                           (◮POOL        OF CAND.)

       (b) Draw an index J ∈ {1, . . . , K }, with probability proportional to
           [π(Y 1 ), . . . , π(Y K )] .                 (◮S ELECTION A PRIORI)
            ˜
       (c) {Y J,i }i=J ∼ qJ (Y J , x ; ·).
                         ¯                                                   (◮AUXILIARY         VARIABLES)

                                                                       ˜
       (d) Accept the proposal X [t + 1] = Y J w.p. αJ (x , (Y i )K , (Y J,i )i=J )
                                                                  i=1
           where

                                                                            i=j π(y i ) + π(y j )
                           αj (x , (y i )K , (y j,i )i=j ) = 1 ∧
                                         i=1 ˜                                                    .            (3)
                                                                                  ˜ j,i
                                                                            i=j π(y ) + π(x )

                                                             (◮MH          ACCEPTANCE PROBABILITY)

       (e) Otherwise, X [t + 1] = X [t]
          See MTM-C                                                                                                  9 / 25
Introduction     MH algorithms with multiple proposals   Optimal scaling     Optimising the speed up process   Conclusion


MCTM algorithm

       Assume that qj (x ; y ) = qj (y ; x ) .

       Algorithme (MCTM: Multiple Correlated try Metropolis alg.)
       If X [t] = x , how is X [t + 1] simulated?
       (a) (Y 1 , . . . , Y K ) ∼ q(x ; ·).                                           (◮POOL        OF CAND.)

       (b) Draw an index J ∈ {1, . . . , K }, with probability proportional to
           [π(Y 1 ), . . . , π(Y K )] .                 (◮S ELECTION A PRIORI)
            ˜
       (c) {Y J,i }i=J ∼ qJ (Y J , x ; ·).
                         ¯                                                   (◮AUXILIARY         VARIABLES)

                                                                       ˜
       (d) Accept the proposal X [t + 1] = Y J w.p. αJ (x , (Y i )K , (Y J,i )i=J )
                                                                  i=1
           where

                                                                            i=j π(y i ) + π(y j )
                           αj (x , (y i )K , (y j,i )i=j ) = 1 ∧
                                         i=1 ˜                                                    .            (3)
                                                                                  ˜ j,i
                                                                            i=j π(y ) + π(x )

                                                             (◮MH          ACCEPTANCE PROBABILITY)

       (e) Otherwise, X [t + 1] = X [t]
          See MTM-C                                                                                                  9 / 25
MCTM algorithm

Assume that q_j(x; y) = q_j(y; x).

Algorithm (MCTM: Multiple Correlated Try Metropolis algorithm)
If X[t] = x, how is X[t + 1] simulated?
(a) (Y^1, ..., Y^K) ∼ q(x; ·).  (◮ POOL OF CANDIDATES)
(b) Draw an index J ∈ {1, ..., K} with probability proportional to [π(Y^1), ..., π(Y^K)].  (◮ SELECTION A PRIORI)
(c) Draw {Ỹ^{J,i}}_{i≠J} ∼ q̄_J(Y^J, x; ·).  (◮ AUXILIARY VARIABLES)
(d) Accept the proposal X[t + 1] = Y^J with probability α_J(x, (Y^i)_{i=1}^K, (Ỹ^{J,i})_{i≠J}), where
\[
\alpha_j\bigl(x, (y^i)_{i=1}^K, (\tilde{y}^{j,i})_{i \neq j}\bigr)
= 1 \wedge \frac{\sum_{i \neq j} \pi(y^i) + \pi(y^j)}{\sum_{i \neq j} \pi(\tilde{y}^{j,i}) + \pi(x)} .
\tag{3}
\]
(◮ MH ACCEPTANCE PROBABILITY)
(e) Otherwise, X[t + 1] = X[t].
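Steps (a)-(e) can be sketched in code. The snippet below is our minimal illustration, not the talk's implementation: it specialises to K independent Gaussian random-walk candidates, so the auxiliary variables of step (c) are fresh proposals drawn from Y^J (the standard multiple-try Metropolis of Liu, Liang and Wong); `pi` is the target density, known up to a normalising constant, and all names are ours.

```python
import math
import random

def mctm_step(x, pi, K=3, sigma=1.0, rng=random):
    """One MCTM transition, specialised to K independent Gaussian RW candidates."""
    pool = [x + sigma * rng.gauss(0.0, 1.0) for _ in range(K)]        # (a) pool of candidates
    w = [pi(y) for y in pool]
    j = rng.choices(range(K), weights=w)[0]                           # (b) J prop. to [pi(Y^1), ..., pi(Y^K)]
    yj = pool[j]
    aux = [yj + sigma * rng.gauss(0.0, 1.0) for _ in range(K - 1)]    # (c) K-1 auxiliary variables from Y^J
    alpha = min(1.0, sum(w) / (sum(pi(z) for z in aux) + pi(x)))      # (d) acceptance probability (3)
    return yj if rng.random() < alpha else x                          # (e) otherwise stay at x

# Example target: unnormalised standard normal density.
pi = lambda t: math.exp(-t * t / 2.0)
```

Iterating `mctm_step` on this target should produce a chain whose empirical mean and variance approach 0 and 1.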
MCTM algorithm

1. It generalises the classical Random Walk Metropolis-Hastings algorithm (which is the case K = 1).
2. It satisfies the detailed balance condition wrt π:
\[
\pi(x) \sum_{j=1}^{K} \int \cdots \int q_j(x; y)\,
\bar{Q}_j\Bigl(x, y; \prod_{i \neq j} d(y^i)\Bigr)\,
\bar{Q}_j\Bigl(y, x; \prod_{i \neq j} d(\tilde{y}^{j,i})\Bigr)\,
\frac{\pi(y)}{\pi(y) + \sum_{i \neq j} \pi(y^i)}
\left[ 1 \wedge \frac{\pi(y) + \sum_{i \neq j} \pi(y^i)}{\pi(x) + \sum_{i \neq j} \pi(\tilde{y}^{j,i})} \right]
\]
MCTM algorithm

1. It generalises the classical Random Walk Metropolis-Hastings algorithm (which is the case K = 1).
2. It satisfies the detailed balance condition wrt π:
\[
\pi(x) \pi(y) \sum_{j=1}^{K} q_j(x; y) \int \cdots \int
\bar{Q}_j\Bigl(x, y; \prod_{i \neq j} d(y^i)\Bigr)\,
\bar{Q}_j\Bigl(y, x; \prod_{i \neq j} d(\tilde{y}^{j,i})\Bigr)
\left[ \frac{1}{\pi(y) + \sum_{i \neq j} \pi(y^i)} \wedge \frac{1}{\pi(x) + \sum_{i \neq j} \pi(\tilde{y}^{j,i})} \right]
\]
◮ symmetric wrt (x, y)
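The rewriting above rests on the identity w/(w + s) · [1 ∧ (w + s)/(u + s')] = w · [1/(w + s) ∧ 1/(u + s')], with w = π(y), u = π(x) and s, s' the two partial sums. The snippet below (an illustrative check of ours, not part of the talk) verifies the identity, and the symmetry of the min form, at random values.

```python
import random

def ratio_form(px, py, s_pool, s_aux):
    # pi(x) * [pi(y) / (pi(y) + S_pool)] * [1 ∧ (pi(y) + S_pool) / (pi(x) + S_aux)]
    return px * py / (py + s_pool) * min(1.0, (py + s_pool) / (px + s_aux))

def min_form(px, py, s_pool, s_aux):
    # pi(x) * pi(y) * [1 / (pi(y) + S_pool) ∧ 1 / (pi(x) + S_aux)]: visibly symmetric
    return px * py * min(1.0 / (py + s_pool), 1.0 / (px + s_aux))
```

Swapping the roles of x and y (and of the two partial sums) leaves `min_form` unchanged, which is the detailed-balance symmetry.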
MCTM algorithm

1. The MCTM uses the simulation of K random variables for the pool of candidates and K − 1 auxiliary variables to compute the MH acceptance ratio.
2. Can we reduce the number of simulated variables while keeping the diversity of the pool?
3. Draw one random variable and use transformations to create the pool of candidates and the auxiliary variables.
MTM-C algorithms

Let Ψ^i : X × [0, 1)^r → X and Ψ^{j,i} : X × X → X.
Assume that
1. For all j ∈ {1, ..., K}, set Y^j = Ψ^j(x, V), where V ∼ U([0, 1)^r).  (◮ COMMON R.V.)
2. For any (i, j) ∈ {1, ..., K}^2,
\[
Y^i = \Psi^{j,i}(x, Y^j) .
\tag{4}
\]
(◮ RECONSTRUCTION OF THE OTHER CANDIDATES)

Example:
- ψ^i(x, v) = x + σ Φ^{-1}(⟨v^i + v⟩), where v^i = ⟨(i/K) a⟩, a ∈ R^r, and Φ is the cumulative distribution function of the standard normal. ◮ Korobov sequence + Cranley-Patterson rotation.
- ψ^i(x, v) = x + γ^i Φ^{-1}(v). ◮ Hit and Run algorithm.
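The first example can be sketched in one dimension (r = 1). This is our illustration, with our naming and defaults; `statistics.NormalDist` supplies Φ^{-1}.

```python
from statistics import NormalDist

PHI_INV = NormalDist().inv_cdf   # quantile function of the standard normal

def mtmc_pool(x, v, K=4, sigma=0.5, a=1.0):
    """Pool Y^i = psi^i(x, V) = x + sigma * Phi^{-1}(<v_i + v>), i = 0, ..., K-1,
    driven by ONE uniform V = v; v_i = <(i/K) * a> is a Korobov node, and the
    common shift by v is the Cranley-Patterson rotation."""
    return [x + sigma * PHI_INV(((i / K) * a + v) % 1.0) for i in range(K)]
```

Any single candidate determines the whole pool: inverting ψ^j recovers v from (x, Y^j), which is exactly the reconstruction property (4).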
MTM-C algorithms

Algorithm (MTM-C: Multiple Try Metropolis algorithm with common proposal)
(a) Draw V ∼ U([0, 1)^r) and set Y^i = Ψ^i(x, V) for i = 1, ..., K.
(b) Draw an index J ∈ {1, ..., K} with probability proportional to [π(Y^1), ..., π(Y^K)].
(c) Accept X[t + 1] = Y^J with probability ᾱ_J(x, Y^J), where, for j ∈ {1, ..., K},
\[
\bar{\alpha}_j(x, y^j) = \alpha_j\bigl(x, \{\Psi^{j,i}(x, y^j)\}_{i=1}^K, \{\Psi^{j,i}(y^j, x)\}_{i \neq j}\bigr) ,
\tag{5}
\]
with α_j given in (3), and reject otherwise.
(d) Otherwise, X[t + 1] = X[t].
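A one-dimensional sketch of one MTM-C transition, using the Korobov/Cranley-Patterson map of the previous slide (our illustration; all names and defaults are ours, not from the talk). The reconstruction maps Ψ^{j,i} are implemented implicitly: inverting Ψ^j recovers the uniform that places x in slot j of the pool started at Y^J, which yields the reversed pool needed in (5).

```python
import math
import random
from statistics import NormalDist

ND = NormalDist()

def pool_from(z, v, K, sigma, a):
    """{Psi^i(z, v)}: Psi^i(z, v) = z + sigma * Phi^{-1}(<(i/K) * a + v>)."""
    return [z + sigma * ND.inv_cdf(((i / K) * a + v) % 1.0) for i in range(K)]

def mtmc_step(x, pi, K=4, sigma=1.0, a=1.0, rng=random):
    v = rng.random()                                    # (a) one common uniform V
    pool = pool_from(x, v, K, sigma, a)
    w = [pi(y) for y in pool]
    j = rng.choices(range(K), weights=w)[0]             # (b) J prop. to [pi(Y^1), ..., pi(Y^K)]
    yj = pool[j]
    # (c) reversed pool {Psi^{j,i}(y^j, x)}: invert Psi^j to find the uniform v'
    #     for which the pool started at y^j has x in slot j, then rebuild it.
    v_rev = (ND.cdf((x - yj) / sigma) - (j / K) * a) % 1.0
    rev = pool_from(yj, v_rev, K, sigma, a)             # rev[j] == x up to rounding
    alpha = min(1.0, sum(w) / sum(pi(z) for z in rev))  # acceptance (5), via (3)
    return yj if rng.random() < alpha else x            # (d) otherwise stay at x

pi = lambda t: math.exp(-t * t / 2.0)                   # unnormalised standard normal
```

Since rev[j] equals x, the denominator sum over the reversed pool is exactly Σ_{i≠j} π(Ψ^{j,i}(y^j, x)) + π(x) from (3).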
Plan

1 Introduction
2 MH algorithms with multiple proposals
    Random Walk MH
    MCTM algorithm
    MTM-C algorithms
3 Optimal scaling
    Main results
4 Optimising the speed up process
    MCTM algorithm
    MTM-C algorithms
5 Conclusion
How to compare two MH algorithms

◮ PESKUN: If P1 and P2 are two π-reversible kernels and
    p1(x, y) ≤ p2(x, y)  for all x ≠ y,
then P2 is better than P1 in terms of the asymptotic variance of N^{-1} Σ_{i=1}^{N} h(X_i).

1. Off-diagonal order: not always easy to compare!
2. Moreover, one expression of the asymptotic variance is
\[
V = \operatorname{Var}_\pi(h) + 2 \sum_{t=1}^{\infty} \operatorname{Cov}_\pi\bigl(h(X_0), h(X_t)\bigr) .
\]
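A toy example of ours (not from the talk): for a two-state π-reversible chain the asymptotic variance above is available in closed form, and enlarging the off-diagonal entries pointwise decreases it, as the Peskun ordering predicts.

```python
PI0, PI1 = 0.3, 0.7      # stationary distribution on the two states {0, 1}
H = (0.0, 1.0)           # h = indicator of state 1; Var_pi(h) = PI0 * PI1

def asymptotic_variance(a):
    """Asymptotic variance of N^{-1} * sum h(X_i) for the pi-reversible two-state
    kernel with off-diagonal entries p(0, 1) = a and p(1, 0) = b = PI0 * a / PI1.
    Here Cov_pi(h(X_0), h(X_t)) = Var_pi(h) * lam**t with lam = 1 - a - b, so the
    series V = Var + 2 * sum_t Cov sums to Var * (1 + lam) / (1 - lam)."""
    b = PI0 * a / PI1
    lam = 1.0 - a - b
    var = PI0 * PI1 * (H[1] - H[0]) ** 2
    return var * (1.0 + lam) / (1.0 - lam)
```

Comparing a = 0.6 with a = 0.2: every off-diagonal entry is larger, and the asymptotic variance is smaller.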
Original idea of optimal scaling

For the RW-MH algorithm:
1. Increase the dimension T.
2. Target distribution π_T(x_{0:T}) = ∏_{t=0}^{T} f(x_t).
3. Assume that X_T[0] ∼ π_T.
4. Make the random walk increasingly conservative: draw the candidate Y_T = X_T[t] + (ℓ/√T) U_T[t], where U_T[t] is centered standard normal.
5. What is the "best" ℓ?
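The effect of ℓ shows up in a quick simulation (our illustration, not from the talk): RW-MH on a T-dimensional standard-normal product target with step ℓ/√T. As T grows the acceptance rate approaches 2Φ(−(ℓ/2)√I), and I = 1 for the standard normal f.

```python
import math
import random

def rwm_acceptance(ell, T=50, n_iter=4000, seed=1):
    """Empirical acceptance rate of RW-MH on pi_T = product of T standard normals,
    proposal Y = X + (ell / sqrt(T)) * N(0, I_T), started in stationarity."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(T)]
    logp = -0.5 * sum(t * t for t in x)
    step = ell / math.sqrt(T)
    accepted = 0
    for _ in range(n_iter):
        y = [t + step * rng.gauss(0.0, 1.0) for t in x]
        logq = -0.5 * sum(t * t for t in y)
        if rng.random() < math.exp(min(0.0, logq - logp)):   # symmetric proposal
            x, logp = y, logq
            accepted += 1
    return accepted / n_iter
```

Small ℓ accepts almost everything but barely moves; large ℓ is almost always rejected; ℓ ≈ 2.38 gives an acceptance rate near the familiar 0.234.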
Theorem
The first component of (X_T[⌊Ts⌋])_{0 ≤ s ≤ 1} weakly converges in the Skorokhod topology to the stationary solution (W[s], s ∈ R+) of the Langevin SDE
\[
dW[s] = \sqrt{\lambda_\ell}\, dB[s] + \frac{\lambda_\ell}{2} [\ln f]'(W[s])\, ds .
\]
In particular, the first component of (X_T[0], X_T[α_1 T], ..., X_T[α_p T]) converges weakly to the distribution of (W[0], W[α_1], ..., W[α_p]).

Then ℓ is chosen to maximize λ_ℓ = 2ℓ² Φ(−(ℓ/2)√I), where I = ∫ {[ln f]'(x)}² f(x) dx.
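The maximisation of λ_ℓ can be done numerically (a sketch of ours; I = 1 corresponds to f standard normal). The maximiser is the well-known ℓ ≈ 2.38, and the limiting acceptance rate 2Φ(−(ℓ/2)√I) at the optimum is ≈ 0.234.

```python
import math
from statistics import NormalDist

Phi = NormalDist().cdf

def speed(ell, I=1.0):
    """lambda_ell = 2 * ell**2 * Phi(-(ell / 2) * sqrt(I))."""
    return 2.0 * ell * ell * Phi(-(ell / 2.0) * math.sqrt(I))

# Crude grid search for the maximiser when I = 1 (f standard normal).
ell_opt = max((k / 1000.0 for k in range(1, 8000)), key=speed)
acc_opt = 2.0 * Phi(-ell_opt / 2.0)   # limiting acceptance rate at the optimum
```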
Main results

Optimal scaling for the MCTM algorithm

◮ The pool of candidates:
\[
Y^i_{T,t}[n+1] = X_{T,t}[n] + T^{-1/2} U^i_t[n+1] , \qquad 0 \le t \le T , \ 1 \le i \le K ,
\]
where, for any t ∈ {0, ..., T},
(U^i_t[n+1])_{i=1}^K ∼ N(0, Σ),  (◮ MCTM)
U^i_t[n+1] = ψ^i(V_t), with V_t ∼ U[0, 1].  (◮ MTM-C)

◮ The auxiliary variables:
\[
\tilde{Y}^{j,i}_{T,t}[n+1] = X_{T,t}[n] + T^{-1/2} \tilde{U}^{j,i}_t[n+1] , \qquad i \neq j .
\]
Main results

Theorem
Suppose that X_T[0] is distributed according to the target density π_T. Then the process (X_{T,0}[sT], s ∈ R+) weakly converges in the Skorokhod topology to the stationary solution (W[s], s ∈ R+) of the Langevin SDE
\[
dW[s] = \lambda^{1/2} dB[s] + \frac{1}{2} \lambda [\ln f]'(W[s])\, ds ,
\]
with λ ≜ λ(I, (Γ^j)_{j=1}^K), where Γ^j, 1 ≤ j ≤ K, denotes the covariance matrix of the random vector (U^j_0, (U^i_0)_{i≠j}, (Ũ^{j,i}_0)_{i≠j}).

For the MCTM, Γ^j = Γ^j(Σ). Here
\[
\alpha(\Gamma) = \mathbb{E}\Bigl[ A\bigl( \{ G_i - \operatorname{Var}[G_i]/2 \}_{i=1}^{2K-1} \bigr) \Bigr] ,
\tag{6}
\]
where A is bounded Lipschitz and (G_i)_{i=1}^{2K-1} ∼ N(0, Γ), and
\[
\lambda\bigl(I, (\Gamma^j)_{j=1}^K\bigr) \triangleq \sum_{j=1}^{K} \Gamma^j_{1,1} \, \alpha(I\, \Gamma^j) .
\tag{7}
\]




Plan

       1       Introduction

       2       MH algorithms with multiple proposals
                Random Walk MH
                MCTM algorithm
                MTM-C algorithms

       3       Optimal scaling
                 Main results

       4       Optimising the speed up process
                 MCTM algorithm
                 MTM-C algorithms
       5       Conclusion


                                                                                                                  20 / 25
MCTM algorithm


       We optimize the speed λ ≜ λ(I, (Γ_j(Σ))_{j=1}^K) over a subset G:

              G = {Σ = diag(ℓ_1², …, ℓ_K²), (ℓ_1, …, ℓ_K) ∈ R^K}: the proposals
              have different scales but are independent.
              G = {Σ = ℓ² Σ_a, ℓ² ∈ R_+}, where Σ_a is the extreme antithetic
              covariance matrix:

                              Σ_a ≜ (K/(K−1)) I_K − (1/(K−1)) 1_K 1_K^T,

              with 1_K = (1, …, 1)^T.




                                                                                                                  21 / 25
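A quick numerical check of Σ_a (a sketch; the function name is ours): its diagonal entries are 1, its off-diagonal entries are −1/(K−1), the most negative common correlation an exchangeable K-vector can have, and every row sums to zero, so the K increments are perfectly negatively dependent.

```python
def extreme_antithetic(K):
    """Build Sigma_a = K/(K-1) * I_K - 1/(K-1) * 1_K 1_K^T as nested lists."""
    c = 1.0 / (K - 1)
    return [[K * c * (i == j) - c for j in range(K)] for i in range(K)]

for K in (2, 3, 5):
    S = extreme_antithetic(K)
    # Unit variances, common correlation -1/(K-1), rows summing to zero.
    assert all(abs(S[i][i] - 1.0) < 1e-12 for i in range(K))
    assert all(abs(S[i][j] + 1.0 / (K - 1)) < 1e-12
               for i in range(K) for j in range(K) if i != j)
    assert all(abs(sum(row)) < 1e-12 for row in S)
```

For K = 2 this gives correlation −1, i.e. exactly antithetic proposal pairs.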
MCTM algorithms




       Table: Optimal scaling constants, value of the speed, and mean
       acceptance rate for independent proposals

                                K        1            2        3             4       5
                                ℓ⋆      2.38        2.64     2.82          2.99     3.12
                                λ⋆      1.32        2.24     2.94          3.51     4.00
                                a⋆      0.23        0.32     0.37          0.39     0.41




                                                                                                                    22 / 25
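The K = 1 column is the classical random-walk Metropolis result of Roberts, Gelman and Gilks; it can be recovered numerically from the standard one-dimensional speed function λ(ℓ) = 2ℓ²Φ(−ℓ/2) (a sketch, taking unit Fisher information, with Φ the standard normal cdf):

```python
from math import erf, sqrt

def Phi(x):
    """Standard normal cdf."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def speed(l):
    """K = 1 diffusion speed 2 l^2 Phi(-l/2), unit Fisher information."""
    return 2.0 * l * l * Phi(-l / 2.0)

# Grid search for the optimal scaling constant on (0, 6].
grid = [i / 10000.0 for i in range(1, 60001)]
l_star = max(grid, key=speed)
a_star = 2.0 * Phi(-l_star / 2.0)   # mean acceptance rate at the optimum
lam_star = speed(l_star)
```

The search returns ℓ⋆ ≈ 2.38, a⋆ ≈ 0.234 and λ⋆ ≈ 1.3, matching the first column of the table.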
MCTM algorithms

       Table: Optimal scaling constants, value of the speed, and mean
       acceptance rate for extreme antithetic proposals

                                K        1            2        3             4       5
                                ℓ⋆      2.38        2.37     2.64          2.83     2.99
                                λ⋆      1.32        2.64     3.66          4.37     4.91
                                a⋆      0.23        0.46     0.52          0.54     0.55


       Table: Optimal scaling constants, value of the speed, and mean
       acceptance rate for the optimal covariance

                                K        1            2        3             4       5
                                ℓ⋆      2.38        2.37     2.66          2.83     2.98
                                λ⋆      1.32        2.64     3.70          4.40     4.93
                                a⋆      0.23        0.46     0.52          0.55     0.56
                                                                                                                    22 / 25
MTM-C algorithms

       Table: Optimal scaling constants, optimal value of the speed and the
       mean acceptance rate for the RQMC MTM algorithm based on the
       Korobov sequence and Cranley-Patterson rotations

                              K        1            2        3             4       5
                              σ⋆      2.38        2.59     2.77          2.91     3.03
                              λ⋆      1.32        2.43     3.31          4.01     4.56
                              a⋆      0.23        0.36     0.42          0.47     0.50
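As a toy illustration of the Cranley–Patterson rotation (a simplified sketch only: the talk's construction uses a Korobov lattice in higher dimension, which we do not reproduce), the K points here are the one-dimensional lattice {i/K} shifted by a single uniform U modulo 1, then pushed through the normal inverse cdf to produce a whole pool of proposal increments from one uniform draw:

```python
import random
from statistics import NormalDist

def cranley_patterson_pool(K, rng):
    """K stratified uniforms: the lattice {0, 1/K, ..., (K-1)/K}
    shifted by a single uniform rotation U, modulo 1."""
    U = rng.random()
    return [(i / K + U) % 1.0 for i in range(K)]

def gaussian_increments(K, rng, scale=1.0):
    """Map the shifted lattice through the N(0,1) inverse cdf to get
    K negatively dependent proposal increments from ONE uniform draw."""
    nd = NormalDist()
    return [scale * nd.inv_cdf(u) for u in cranley_patterson_pool(K, rng)]

pool = cranley_patterson_pool(5, random.Random(1))
# The points are perfectly stratified: consecutive gaps are all 1/K.
```

This is why only one simulation is needed per pool, the property highlighted for MTM-C in the conclusion.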


       Table: Optimal scaling constants, value of the speed, and mean
       acceptance rate for the hit-and-run algorithm

                           K        1           2         4           6            8
                           ℓ⋆      2.38        2.37     7.11        11.85        16.75
                           λ⋆      1.32        2.64     2.65        2.65         2.65
                           a⋆      0.23        0.46     0.46        0.46         0.46
                                                                                                                         23 / 25
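For reference, a minimal hit-and-run Metropolis sketch (an illustration with a single proposal per step, not the MTM-HR of the talk; the target is standard normal in d = 2 and the step scale is chosen by hand):

```python
import random
from math import exp, sqrt

def hit_and_run_mh(d=2, scale=2.0, n_iter=50_000, seed=3):
    """Hit-and-run Metropolis for a standard normal target in R^d:
    pick a uniform direction e, propose y = x + scale * z * e with
    z ~ N(0,1), accept with probability min(1, pi(y)/pi(x))."""
    rng = random.Random(seed)
    x = [0.0] * d
    first_coord = []
    for _ in range(n_iter):
        # Uniform direction on the sphere via normalised Gaussians.
        g = [rng.gauss(0.0, 1.0) for _ in range(d)]
        norm = sqrt(sum(v * v for v in g))
        e = [v / norm for v in g]
        z = rng.gauss(0.0, 1.0)
        y = [xi + scale * z * ei for xi, ei in zip(x, e)]
        # log pi(x) = -||x||^2 / 2 up to an additive constant.
        log_ratio = 0.5 * (sum(v * v for v in x) - sum(v * v for v in y))
        if log_ratio >= 0.0 or rng.random() < exp(log_ratio):
            x = y
        first_coord.append(x[0])
    return first_coord

samples = hit_and_run_mh()
```

The move y = x + ℓze is symmetric in x and y, so the acceptance ratio reduces to π(y)/π(x).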




Plan

       1       Introduction

       2       MH algorithms with multiple proposals
                Random Walk MH
                MCTM algorithm
                MTM-C algorithms

       3       Optimal scaling
                 Main results

       4       Optimising the speed up process
                 MCTM algorithm
                 MTM-C algorithms
       5       Conclusion


                                                                                                                  24 / 25
Conclusion

       ◮ MCTM algorithm:
              1      Extreme antithetic proposals improve upon the MTM with
                     independent proposals.
              2      Still, the improvement is not overly impressive, since the
                     introduction of correlation makes the computation of the
                     acceptance ratio more complex.
       ◮ MTM-C algorithm:
          1 The advantage of the MTM-C algorithms: only one simulation
            is required to obtain the pool of proposals and auxiliary
            variables.
          2 The MTM-RQMC performs similarly to the extreme antithetic
            proposals.
          3 Our preferred choice: the MTM-HR algorithm. In particular,
            the case K = 2 yields a speed twice that of the Metropolis
            algorithm, whereas the computational cost is almost the same
            in many scenarios.
                                                                                                                    25 / 25

Optimal scaling in MTM algorithms

  • 1. Scaling analysis of multiple-try MCMC methods Randal DOUC randal.douc@it-sudparis.eu Travail joint avec Mylène Bédard et Eric Moulines. 1 / 25
  • 2. Themes 1 MCMC algorithms with multiple proposals: MCTM, MTM-C. 2 Analysis through optimal scaling (introduced by Roberts, Gelman, Gilks, 1998) 3 Hit and Run algorithm. 2 / 25
  • 3. Themes 1 MCMC algorithms with multiple proposals: MCTM, MTM-C. 2 Analysis through optimal scaling (introduced by Roberts, Gelman, Gilks, 1998) 3 Hit and Run algorithm. 2 / 25
  • 4. Themes 1 MCMC algorithms with multiple proposals: MCTM, MTM-C. 2 Analysis through optimal scaling (introduced by Roberts, Gelman, Gilks, 1998) 3 Hit and Run algorithm. 2 / 25
  • 5. Themes 1 MCMC algorithms with multiple proposals: MCTM, MTM-C. 2 Analysis through optimal scaling (introduced by Roberts, Gelman, Gilks, 1998) 3 Hit and Run algorithm. 2 / 25
  • 6. Introduction MH algorithms with multiple proposals Optimal scaling Optimising the speed up process Conclusion Plan de l’exposé 1 Introduction 2 MH algorithms with multiple proposals Random Walk MH MCTM algorithm MTM-C algorithms 3 Optimal scaling Main results 4 Optimising the speed up process MCTM algorithm MTM-C algorithms 5 Conclusion 3 / 25
  • 7. Introduction MH algorithms with multiple proposals Optimal scaling Optimising the speed up process Conclusion Plan de l’exposé 1 Introduction 2 MH algorithms with multiple proposals Random Walk MH MCTM algorithm MTM-C algorithms 3 Optimal scaling Main results 4 Optimising the speed up process MCTM algorithm MTM-C algorithms 5 Conclusion 3 / 25
  • 8. Introduction MH algorithms with multiple proposals Optimal scaling Optimising the speed up process Conclusion Plan de l’exposé 1 Introduction 2 MH algorithms with multiple proposals Random Walk MH MCTM algorithm MTM-C algorithms 3 Optimal scaling Main results 4 Optimising the speed up process MCTM algorithm MTM-C algorithms 5 Conclusion 3 / 25
  • 9. Introduction MH algorithms with multiple proposals Optimal scaling Optimising the speed up process Conclusion Plan de l’exposé 1 Introduction 2 MH algorithms with multiple proposals Random Walk MH MCTM algorithm MTM-C algorithms 3 Optimal scaling Main results 4 Optimising the speed up process MCTM algorithm MTM-C algorithms 5 Conclusion 3 / 25
  • 10. Introduction MH algorithms with multiple proposals Optimal scaling Optimising the speed up process Conclusion Plan de l’exposé 1 Introduction 2 MH algorithms with multiple proposals Random Walk MH MCTM algorithm MTM-C algorithms 3 Optimal scaling Main results 4 Optimising the speed up process MCTM algorithm MTM-C algorithms 5 Conclusion 3 / 25
  • 11. Introduction MH algorithms with multiple proposals Optimal scaling Optimising the speed up process Conclusion Plan 1 Introduction 2 MH algorithms with multiple proposals Random Walk MH MCTM algorithm MTM-C algorithms 3 Optimal scaling Main results 4 Optimising the speed up process MCTM algorithm MTM-C algorithms 5 Conclusion 4 / 25
• 12. Metropolis–Hastings (MH) algorithm

1. We wish to approximate
   $$I = \int h(x)\,\frac{\pi(x)}{\int \pi(u)\,du}\,dx = \int h(x)\,\bar\pi(x)\,dx .$$
2. $x \mapsto \pi(x)$ is known, but $\int \pi(u)\,du$ is not.
3. Approximate $I$ with $\tilde I = \frac{1}{n}\sum_{t=1}^{n} h(X[t])$, where $(X[t])$ is a Markov chain with limiting distribution $\bar\pi$.
4. In the MH algorithm, this last condition is obtained from a detailed balance condition: for all $x, y$, $\pi(x)p(x,y) = \pi(y)p(y,x)$.
5. The quality of the approximation is assessed through the Law of Large Numbers or the CLT for Markov chains. 5 / 25
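The algorithm sketched on this slide fits in a few lines of Python. This is a minimal illustrative sketch: the Gaussian random-walk proposal, its scale, and the standard normal target below are example choices, not taken from the slides.

```python
import numpy as np

def rw_metropolis(log_pi, x0, sigma, n_iter, seed=0):
    """Random-walk Metropolis: the proposal is symmetric, so the
    acceptance probability reduces to 1 ∧ pi(y)/pi(x), computed
    here on the log scale for numerical stability."""
    rng = np.random.default_rng(seed)
    x = np.atleast_1d(np.asarray(x0, dtype=float))
    lp_x = log_pi(x)
    chain = np.empty((n_iter, x.size))
    for t in range(n_iter):
        y = x + sigma * rng.standard_normal(x.size)   # propose Y ~ q(x; .)
        lp_y = log_pi(y)
        if np.log(rng.uniform()) < lp_y - lp_x:       # accept w.p. alpha(x, Y)
            x, lp_x = y, lp_y
        chain[t] = x                                  # otherwise keep x
    return chain

# Approximate I = E[h(X)] for pi ∝ exp(-x^2/2) and h(x) = x^2 (true value 1).
chain = rw_metropolis(lambda x: -0.5 * float(x @ x), 0.0, 2.38, 50_000)
print(np.mean(chain[:, 0] ** 2))
```

The estimate $\tilde I$ printed at the end converges to $1$ by the Law of Large Numbers for Markov chains, as stated on the slide.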
• 18. Random Walk MH

Notation: w.p. = with probability.

Algorithm (MH). If $X[t] = x$, how is $X[t+1]$ simulated?
(a) $Y \sim q(x;\cdot)$.
(b) Accept the proposal, $X[t+1] = Y$, w.p. $\alpha(x, Y)$ where
    $$\alpha(x,y) = 1 \wedge \frac{\pi(y)\,q(y;x)}{\pi(x)\,q(x;y)} .$$
(c) Otherwise $X[t+1] = x$.

The chain is $\pi$-reversible since
$$\pi(x)\,\alpha(x,y)\,q(x;y) = \pi(x)\,q(x;y) \wedge \pi(y)\,q(y;x) = \pi(y)\,\alpha(y,x)\,q(y;x) .$$

Now assume that $q(x;y) = q(y;x)$ — the instrumental kernel is symmetric. Typically $Y = X + U$ where $U$ has a symmetric distribution. The acceptance probability then simplifies to
$$\alpha(x,y) = 1 \wedge \frac{\pi(y)}{\pi(x)} .$$ 7 / 25
• 24. MCTM algorithm — Multiple proposal MCMC

1. Liu, Liang, Wong (2000) introduced the multiple proposal MCMC. It was generalized to multiple correlated proposals by Craiu and Lemieux (2007).
2. A pool of candidates is drawn: $(Y^1,\dots,Y^K) \mid X[t] \sim q(X[t];\cdot)$.
3. One candidate is selected a priori according to some "informative" criterion (e.g. favouring high values of $\pi$).
4. The candidate is accepted with a well-chosen probability.
   ▸ Diversity of the candidates: some candidates are close to the current state, others are far away.

Some additional notation:
$$Y^j \mid X[t] \sim q_j(X[t];\cdot) \quad (\text{▸ marginal dist.}) \tag{1}$$
$$(Y^i)_{i\neq j} \mid X[t], Y^j \sim \bar q_j(X[t], Y^j;\cdot) \quad (\text{▸ simulation of the other candidates}) \tag{2}$$ 8 / 25
• 30. MCTM algorithm

Assume that $q_j(x;y) = q_j(y;x)$.

Algorithm (MCTM: Multiple Correlated Try Metropolis). If $X[t] = x$, how is $X[t+1]$ simulated?
(a) $(Y^1,\dots,Y^K) \sim q(x;\cdot)$. (▸ pool of candidates)
(b) Draw an index $J \in \{1,\dots,K\}$ with probability proportional to $[\pi(Y^1),\dots,\pi(Y^K)]$. (▸ selection a priori)
(c) $\{\tilde Y^{J,i}\}_{i\neq J} \sim \bar q_J(Y^J, x;\cdot)$. (▸ auxiliary variables)
(d) Accept the proposal $X[t+1] = Y^J$ w.p. $\alpha_J\big(x, (Y^i)_{i=1}^K, (\tilde Y^{J,i})_{i\neq J}\big)$ where
$$\alpha_j\big(x, (y^i)_{i=1}^K, (\tilde y^{j,i})_{i\neq j}\big) = 1 \wedge \frac{\pi(y^j) + \sum_{i\neq j}\pi(y^i)}{\pi(x) + \sum_{i\neq j}\pi(\tilde y^{j,i})} . \tag{3}$$
(▸ MH acceptance probability)
(e) Otherwise $X[t+1] = X[t]$. 9 / 25
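One step of the algorithm above can be sketched as follows. This is a hedged illustration for the special case of independent symmetric Gaussian proposals (i.e. $\Sigma$ diagonal, one scale per try); the target and scales are example choices, not values from the slides.

```python
import numpy as np

def mctm_step(pi, x, scales, rng):
    """One multiple-try step with independent symmetric proposals,
    following steps (a)-(e) of the slide: pool, selection proportional
    to pi, auxiliary variables drawn around the selected candidate,
    and acceptance ratio (3)."""
    K, d = len(scales), x.size
    ys = x + scales[:, None] * rng.standard_normal((K, d))        # (a) pool
    py = np.array([pi(y) for y in ys])
    j = rng.choice(K, p=py / py.sum())                            # (b) J ∝ pi(Y^j)
    aux = ys[j] + np.delete(scales, j)[:, None] * rng.standard_normal((K - 1, d))
    p_aux = sum(pi(z) for z in aux)                               # (c) auxiliary vars
    alpha = min(1.0, py.sum() / (pi(x) + p_aux))                  # (d) ratio (3)
    return ys[j] if rng.uniform() < alpha else x                  # (e)

rng = np.random.default_rng(1)
pi = lambda x: float(np.exp(-0.5 * x @ x))        # unnormalised N(0,1) target
scales = np.array([1.5, 2.5, 3.5])
x = np.zeros(1)
draws = np.empty(30_000)
for n in range(draws.size):
    x = mctm_step(pi, x, scales, rng)
    draws[n] = x[0]
print(draws.mean(), draws.var())
```

For this sketch the empirical mean and variance of the chain should approach those of the $\mathcal N(0,1)$ target.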
• 34. MCTM algorithm

1. It generalises the classical Random Walk Metropolis–Hastings algorithm (which is the case $K = 1$).
2. It satisfies the detailed balance condition w.r.t. $\pi$: the transition density contains the term
$$\sum_{j=1}^{K} \pi(x)\,\pi(y)\, q_j(x;y) \int\!\!\cdots\!\!\int \bar Q_j\Big(x, y;\, \prod_{i\neq j} d(y^i)\Big)\, \bar Q_j\Big(y, x;\, \prod_{i\neq j} d(\tilde y^{j,i})\Big) \left[\frac{1}{\pi(y) + \sum_{i\neq j}\pi(y^i)} \wedge \frac{1}{\pi(x) + \sum_{i\neq j}\pi(\tilde y^{j,i})}\right],$$
which is symmetric w.r.t. $(x, y)$. 10 / 25
• 37. MCTM algorithm

1. The MCTM requires simulating $K$ random variables for the pool of candidates and $K-1$ auxiliary variables to compute the MH acceptance ratio.
2. Can we reduce the number of simulated variables while keeping the diversity of the pool?
3. Idea: draw one random variable and use transformations to create both the pool of candidates and the auxiliary variables. 11 / 25
• 40. MTM-C algorithms

Let $\Psi^i : \mathsf{X}\times[0,1)^r \to \mathsf{X}$ and $\Psi^{j,i} : \mathsf{X}\times\mathsf{X} \to \mathsf{X}$. Assume that
1. for all $j \in \{1,\dots,K\}$, $Y^j = \Psi^j(x, V)$ where $V \sim \mathcal U([0,1)^r)$ (▸ common random variable);
2. for any $(i,j) \in \{1,\dots,K\}^2$, $Y^i = \Psi^{j,i}(x, Y^j)$ (▸ reconstruction of the other candidates). (4)

Examples:
- $\psi^i(x, v) = x + \sigma\,\Phi^{-1}(\langle v^i + v\rangle)$ where $v^i = \langle (i/K)\,a\rangle$, $a \in \mathbb{R}^r$, and $\Phi$ is the cumulative distribution function of the standard normal distribution. ▸ Korobov sequence + Cranley–Patterson rotation.
- $\psi^i(x, v) = x + \gamma^i\,\Phi^{-1}(v)$. ▸ Hit and Run algorithm. 12 / 25
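Both example maps above turn a single uniform draw into a whole pool of candidates. The sketch below illustrates this with $r = 1$; the Korobov generator `a = 1` is an illustrative assumption (the slides leave $a$ generic).

```python
import numpy as np
from statistics import NormalDist

def korobov_pool(x, v, sigma, K, a=1):
    """Pool of K candidates from ONE uniform draw v:
    psi^i(x, v) = x + sigma * Phi^{-1}(<v^i + v>), v^i = <(i/K) a>.
    Here r = 1 and the generator a is an illustrative choice."""
    ppf = NormalDist().inv_cdf
    shifts = (np.arange(K) * a / K) % 1.0                  # Korobov shifts v^i
    return x + sigma * np.array([ppf((s + v) % 1.0) for s in shifts])

def hit_and_run_pool(x, v, gammas):
    """psi^i(x, v) = x + gamma^i * Phi^{-1}(v): all candidates lie on one
    line through x, so any candidate is an affine function of any other
    (this is what makes the reconstruction maps Psi^{j,i} possible)."""
    z = NormalDist().inv_cdf(v)
    return x + np.asarray(gammas, dtype=float) * z

pool = korobov_pool(0.0, 0.37, 1.0, K=4)
hr = hit_and_run_pool(0.0, 0.37, [2.37, -2.37])   # antithetic pair around x
print(pool, hr)
```

With opposite step sizes $\gamma^1 = -\gamma^2$, the hit-and-run pool is an antithetic pair around the current state.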
• 44. MTM-C algorithms

Algorithm (MTM-C: Multiple Try Metropolis with common proposal). If $X[t] = x$:
(a) Draw $V \sim \mathcal U([0,1)^r)$ and set $Y^i = \Psi^i(x, V)$ for $i = 1,\dots,K$.
(b) Draw an index $J \in \{1,\dots,K\}$ with probability proportional to $[\pi(Y^1),\dots,\pi(Y^K)]$.
(c) Accept $X[t+1] = Y^J$ with probability $\bar\alpha_J(x, Y^J)$, where, for $j \in \{1,\dots,K\}$,
$$\bar\alpha_j(x, y^j) = \alpha_j\big(x, \{\Psi^{j,i}(x, y^j)\}_{i=1}^K, \{\Psi^{j,i}(y^j, x)\}_{i\neq j}\big), \tag{5}$$
with $\alpha_j$ given in (3).
(d) Otherwise $X[t+1] = x$. 13 / 25
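For the hit-and-run maps with $K = 2$ and $\gamma^1 = -\gamma^2 = \ell$, the reconstruction maps are affine, and the auxiliary point in (5) works out to $\Psi^{j,i}(y^j, x) = 2y^j - x$. The sketch below assumes exactly this specialization (one scalar Gaussian draw per step); target and scale are example choices.

```python
import numpy as np

def mtm_hr_step(pi, x, ell, rng):
    """One MTM-C step for the hit-and-run maps with K = 2:
    psi^1(x, v) = x + ell * Phi^{-1}(v) and psi^2(x, v) = x - ell * Phi^{-1}(v).
    Both candidates come from ONE Gaussian draw; the auxiliary point in
    (5) reduces to 2*y^J - x (affine reconstruction along the line)."""
    u = ell * rng.standard_normal(x.size)
    ys = np.stack([x + u, x - u])                   # (a) antithetic pool
    py = np.array([pi(y) for y in ys])
    j = rng.choice(2, p=py / py.sum())              # (b) J proportional to pi
    alpha = min(1.0, py.sum() / (pi(x) + pi(2 * ys[j] - x)))   # (c) via (3)
    return ys[j] if rng.uniform() < alpha else x    # (d)

rng = np.random.default_rng(2)
pi = lambda x: float(np.exp(-0.5 * x @ x))          # unnormalised N(0,1) target
x = np.zeros(1)
draws = np.empty(40_000)
for n in range(draws.size):
    x = mtm_hr_step(pi, x, 2.37, rng)
    draws[n] = x[0]
print(draws.mean(), draws.var())
```

The scale 2.37 is the optimal value reported later in the slides for the $K = 2$ antithetic case; only one Gaussian draw per step is needed, which is the point of the MTM-C construction.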
• 49. How to compare two MH algorithms

▸ Peskun ordering: if $P_1$ and $P_2$ are two $\pi$-reversible kernels and $p_1(x,y) \le p_2(x,y)$ for all $x \neq y$, then $P_2$ is better than $P_1$ in terms of the asymptotic variance of $N^{-1}\sum_{i=1}^{N} h(X_i)$.

1. Off-diagonal order: not always easy to compare!
2. Moreover, one expression of the asymptotic variance is
$$V = \mathrm{Var}_\pi(h) + 2\sum_{t=1}^{\infty} \mathrm{Cov}_\pi\big(h(X_0), h(X_t)\big) .$$ 15 / 25
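The asymptotic variance $V$ above can be estimated from a single chain; a standard way (not from the slides, shown here as a sketch) is the batch-means estimator, which avoids summing empirical autocovariances directly.

```python
import numpy as np

def batch_means_avar(h_vals, n_batches=100):
    """Batch-means estimate of the asymptotic variance
    V = Var_pi(h) + 2 * sum_{t>=1} Cov_pi(h(X_0), h(X_t)):
    split h(X[t]) into batches of length b; then
    b * Var(batch means) -> V as the chain length grows."""
    b = len(h_vals) // n_batches
    means = np.asarray(h_vals[: b * n_batches]).reshape(n_batches, b).mean(axis=1)
    return b * means.var(ddof=1)

# Sanity check on i.i.d. draws, where all covariance terms vanish and V = Var = 1.
rng = np.random.default_rng(3)
avar = batch_means_avar(rng.standard_normal(200_000))
print(avar)
```

On MCMC output, the same function returns a value inflated by the chain's autocorrelation, which is exactly what the Peskun ordering compares.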
• 52. Original idea of optimal scaling

For the RW-MH algorithm:
1. Increase the dimension $T$.
2. Target distribution $\pi_T(x_{0:T}) = \prod_{t=0}^{T} f(x_t)$.
3. Assume that $X_T[0] \sim \pi_T$.
4. Make the random walk increasingly conservative: draw the candidate $Y_T = X_T[t] + \frac{\ell}{\sqrt{T}}\, U_T[t]$, where $U_T[t]$ is centered standard normal.
5. What is the "best" $\ell$? 16 / 25
• 57. Theorem

The first component of $(X_T[\lfloor Ts\rfloor])_{0\le s\le 1}$ converges weakly in the Skorokhod topology to the stationary solution $(W[\lambda_\ell s],\ s\in\mathbb{R}_+)$ of the Langevin SDE
$$dW[s] = dB[s] + \tfrac{1}{2}\,[\ln f]'(W[s])\,ds .$$
In particular, the first component of $(X_T[0], X_T[\alpha_1 T], \dots, X_T[\alpha_p T])$ converges weakly to the distribution of $(W[0], W[\lambda_\ell \alpha_1], \dots, W[\lambda_\ell \alpha_p])$.

Then $\ell$ is chosen to maximize the speed
$$\lambda_\ell = 2\ell^2\,\Phi\Big(-\frac{\ell}{2}\sqrt{I}\Big), \qquad I = \int \big\{[\ln f]'(x)\big\}^2 f(x)\,dx .$$ 17 / 25
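Maximizing $\lambda_\ell$ numerically recovers the classical Roberts–Gelman–Gilks constants. The grid search below is a sketch with $I = 1$ (e.g. $f$ the standard normal density); the mean acceptance rate is $a_\ell = 2\Phi(-\ell\sqrt{I}/2)$, so $\lambda_\ell = \ell^2 a_\ell$.

```python
import numpy as np
from statistics import NormalDist

Phi = NormalDist().cdf
I = 1.0                                            # e.g. f = N(0,1) density
ells = np.linspace(0.1, 5.0, 5000)
speed = np.array([2 * l**2 * Phi(-l * np.sqrt(I) / 2) for l in ells])
l_star = ells[speed.argmax()]                      # optimal scaling constant
a_star = 2 * Phi(-l_star * np.sqrt(I) / 2)         # mean acceptance rate
print(round(l_star, 2), round(speed.max(), 2), round(a_star, 3))
```

This reproduces the $K = 1$ column of the tables later in the deck: $\ell^\star \approx 2.38$, $\lambda^\star \approx 1.32$, and the well-known $\approx 0.234$ acceptance rate.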
• 60. Main results — Optimal scaling for the MCTM algorithm

▸ The pool of candidates:
$$Y^i_{T,t}[n+1] = X_{T,t}[n] + T^{-1/2}\, U^i_t[n+1], \qquad 0 \le t \le T,\ 1 \le i \le K,$$
where for any $t \in \{0,\dots,T\}$,
$$(U^i_t[n+1])_{i=1}^K \sim \mathcal N(0, \Sigma) \quad (\text{▸ MCTM}), \qquad U^i_t[n+1] = \psi^i(V_t),\ V_t \sim \mathcal U[0,1) \quad (\text{▸ MTM-C}).$$
▸ The auxiliary variables:
$$\tilde Y^{j,i}_{T,t}[n+1] = X_{T,t}[n] + T^{-1/2}\, \tilde U^{j,i}_t[n+1], \qquad i \neq j . $$ 18 / 25
• 62. Main results — Theorem

Suppose that $X_T[0]$ is distributed according to the target density $\pi_T$. Then the process $(X_{T,0}[sT],\ s\in\mathbb{R}_+)$ converges weakly in the Skorokhod topology to the stationary solution $(W[s],\ s\in\mathbb{R}_+)$ of the Langevin SDE
$$dW[s] = \lambda^{1/2}\, dB[s] + \frac{\lambda}{2}\,[\ln f]'(W[s])\,ds ,$$
with $\lambda = \lambda\big(I, (\Gamma^j)_{j=1}^K\big)$, where $\Gamma^j$, $1 \le j \le K$, denotes the covariance matrix of the random vector $(U^j_0, (U^i_0)_{i\neq j}, (\tilde U^{j,i}_0)_{i\neq j})$. For the MCTM, $\Gamma^j = \Gamma^j(\Sigma)$. Define
$$\alpha(\Gamma) = \mathbb{E}\Big[A\Big(\big(G_i - \mathrm{Var}[G_i]/2\big)_{i=1}^{2K-1}\Big)\Big], \tag{6}$$
where $A$ is bounded Lipschitz and $(G_i)_{i=1}^{2K-1} \sim \mathcal N(0, \Gamma)$; then
$$\lambda\big(I, (\Gamma^j)_{j=1}^K\big) = \sum_{j=1}^{K} \Gamma^j_{1,1} \times \alpha\big(I\,\Gamma^j\big). \tag{7}$$ 19 / 25
• 66. MCTM algorithm — Optimising the speed

We optimize the speed $\lambda = \lambda\big(I, (\Gamma^j(\Sigma))_{j=1}^K\big)$ over a subset $\mathcal G$:
- $\mathcal G = \big\{\Sigma = \mathrm{diag}(\ell_1^2, \dots, \ell_K^2),\ (\ell_1, \dots, \ell_K) \in \mathbb{R}^K\big\}$: the proposals have different scales but are independent.
- $\mathcal G = \big\{\Sigma = \ell^2\,\Sigma_a,\ \ell^2 \in \mathbb{R}\big\}$, where $\Sigma_a$ is the extreme antithetic covariance matrix
$$\Sigma_a \triangleq \frac{K}{K-1}\Big(I_K - \frac{1}{K}\,\mathbf 1_K \mathbf 1_K^T\Big), \qquad \mathbf 1_K = (1, \dots, 1)^T . $$ 21 / 25
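The extreme antithetic covariance can be constructed and checked numerically. The sketch below verifies that it has unit variances, pairwise correlations equal to $-1/(K-1)$ (the most negative value compatible with positive semi-definiteness), and that the matrix is singular, i.e. "extreme".

```python
import numpy as np

def sigma_antithetic(K):
    """Extreme antithetic covariance Sigma_a = K/(K-1) * (I_K - 1_K 1_K^T / K):
    unit variances, all pairwise correlations at -1/(K-1)."""
    return K / (K - 1) * (np.eye(K) - np.ones((K, K)) / K)

S = sigma_antithetic(4)
print(np.diag(S))                      # unit variances
print(S[0, 1])                         # correlation -1/(K-1) = -1/3
print(np.linalg.eigvalsh(S).min())     # smallest eigenvalue is 0: singular
```

For $K = 2$ this reduces to perfectly negatively correlated proposals, which is exactly the antithetic pair used by the MTM-HR sketch earlier.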
• 67. MCTM algorithms

Table: optimal scaling constant, value of the speed, and mean acceptance rate for independent proposals

    K     1     2     3     4     5
    ℓ⋆    2.38  2.64  2.82  2.99  3.12
    λ⋆    1.32  2.24  2.94  3.51  4.00
    a⋆    0.23  0.32  0.37  0.39  0.41
22 / 25
• 68. MCTM algorithms

Table: optimal scaling constant, value of the speed, and mean acceptance rate for extreme antithetic proposals

    K     1     2     3     4     5
    ℓ⋆    2.38  2.37  2.64  2.83  2.99
    λ⋆    1.32  2.64  3.66  4.37  4.91
    a⋆    0.23  0.46  0.52  0.54  0.55

Table: optimal scaling constant, value of the speed, and mean acceptance rate for the optimal covariance

    K     1     2     3     4     5
    ℓ⋆    2.38  2.37  2.66  2.83  2.98
    λ⋆    1.32  2.64  3.70  4.40  4.93
    a⋆    0.23  0.46  0.52  0.55  0.56
22 / 25
• 69. MTM-C algorithms

Table: optimal scaling constant, optimal value of the speed, and mean acceptance rate for the RQMC MTM algorithm based on the Korobov sequence and Cranley–Patterson rotations

    K     1     2     3     4     5
    σ⋆    2.38  2.59  2.77  2.91  3.03
    λ⋆    1.32  2.43  3.31  4.01  4.56
    a⋆    0.23  0.36  0.42  0.47  0.50

Table: optimal scaling constant, value of the speed, and mean acceptance rate for the hit-and-run algorithm

    K     1     2     4     6     8
    ℓ⋆    2.38  2.37  7.11  11.85 16.75
    λ⋆    1.32  2.64  2.65  2.65  2.65
    a⋆    0.23  0.46  0.46  0.46  0.46
23 / 25
  • 71. Introduction MH algorithms with multiple proposals Optimal scaling Optimising the speed up process Conclusion Conclusion ◮ MCTM algorithm: 1 Extreme antithetic proposals improves upon the MTM with independent proposals. 2 Still, the improvement is not overly impressive and since the introduction of correlation makes the computation of the acceptance ratio more complex. ◮ MTM-C algorithm: 1 The advantage of the MTM-C algorithms: only one simulation is required for obtaining the pool of proposals and auxiliary variables. 2 The MTM-RQMC ∼ the extreme antithetic proposals. 3 Our preferred choice: the MTM-HR algorithm. In particular, the case K = 2 induces a speed which is twice that of the Metropolis algorithm whereas the computational cost is almost the same in many scenarios. 25 / 25