The Metropolis adjusted Langevin Algorithm for log-concave probability measures in high dimensions

Andreas Eberle

June 9, 2011
1    INTRODUCTION

$$U(x) = \frac{1}{2}|x|^2 + V(x), \qquad x \in \mathbb{R}^d, \quad V \in C^4(\mathbb{R}^d),$$

$$\mu(dx) = \frac{1}{Z}\, e^{-U(x)}\, \lambda^d(dx) = \frac{(2\pi)^{d/2}}{Z}\, e^{-V(x)}\, \gamma^d(dx),$$

$\gamma^d = N(0, I_d)$: standard normal distribution in $\mathbb{R}^d$.

AIM :

    • Approximate Sampling from µ.

    • Rigorous error and complexity estimates, d → ∞.
RUNNING EXAMPLE: TRANSITION PATH SAMPLING

$$dY_t = dB_t - \nabla H(Y_t)\, dt, \qquad Y_0 = y_0 \in \mathbb{R}^n,$$

µ = conditional distribution on $C([0,T], \mathbb{R}^n)$ of $(Y_t)_{t\in[0,T]}$ given $Y_T = y_T$.

By Girsanov's Theorem:

$$\mu(dy) = Z^{-1} \exp(-V(y))\, \gamma(dy),$$

γ = distribution of the Brownian bridge from $y_0$ to $y_T$,

$$V(y) = \int_0^T \frac{1}{2}\left( \Delta H(y_t) + |\nabla H(y_t)|^2 \right) dt.$$

Finite dimensional approximation via Karhunen-Loève expansion (see the sketch below):

$$\gamma(dy) \to \gamma^d(dx), \qquad V(y) \to V_d(x) \qquad (\to \text{setup above}).$$
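As a concrete illustration of this finite dimensional approximation, here is a minimal sketch (not from the talk; scalar-valued case n = 1) of a truncated Karhunen-Loève expansion of the Brownian bridge from $y_0$ to $y_T$; the coefficient vector `x` then plays the role of a sample from $\gamma^d$:

```python
import numpy as np

def brownian_bridge_kl(x, y0, yT, T, t):
    """Truncated Karhunen-Loeve expansion of the Brownian bridge from y0 to yT.

    x : (d,) array of KL coefficients; x ~ N(0, I_d) corresponds to gamma^d
    t : array of time points in [0, T]
    Returns the bridge path evaluated at the time points t (scalar-valued case).
    """
    d = len(x)
    k = np.arange(1, d + 1)
    # eigenfunctions of the bridge covariance: sqrt(2T) sin(k*pi*t/T) / (k*pi)
    phi = np.sqrt(2.0 * T) * np.sin(np.outer(t, k) * np.pi / T) / (k * np.pi)
    mean = y0 + (t / T) * (yT - y0)        # linear interpolation = bridge mean
    return mean + phi @ x

# usage: one approximate sample of gamma with d = 100 modes
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 201)
path = brownian_bridge_kl(rng.standard_normal(100), y0=0.0, yT=1.0, T=1.0, t=t)
```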
MARKOV CHAIN MONTE CARLO APPROACH

  • Simulate an ergodic Markov process (Xn ) with stationary distribution µ.
  • n large: $P \circ X_n^{-1} \approx \mu$

  • Continuous time: (over-damped) Langevin diffusion

$$dX_t = -\frac{1}{2} X_t\, dt - \frac{1}{2}\, \nabla V(X_t)\, dt + dB_t$$


  • Discrete time: Metropolis-Hastings Algorithms, Gibbs Samplers
METROPOLIS-HASTINGS ALGORITHM
(Metropolis et al 1953, Hastings 1970)

$\mu(x) := Z^{-1} \exp(-U(x))$ : density of µ w.r.t. $\lambda^d$,

$p(x,y)$ : stochastic kernel on $\mathbb{R}^d$, strictly positive proposal density,


ALGORITHM

  1. Choose an initial state X0 .

  2. For n := 0, 1, 2, . . . do

        • Sample Yn ∼ p(Xn , y)dy, Un ∼ Unif(0, 1) independently.
        • If Un < α(Xn , Yn ) then accept the proposal and set Xn+1 := Yn ;
          else reject the proposal and set Xn+1 := Xn .
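For concreteness, here is a minimal generic implementation of this loop (a sketch, not code from the talk; `propose` and `G` are user-supplied callables: `propose` samples from the proposal density p(x, ·), and `G` is the log-ratio defined just below, so that the acceptance probability is exp(-G⁺)):

```python
import numpy as np

def metropolis_hastings(x0, n_steps, propose, G, rng):
    """Generic Metropolis-Hastings chain.

    propose(x, rng) -> y : sample from the proposal density p(x, .)
    G(x, y)              : log[ mu(x) p(x,y) / (mu(y) p(y,x)) ]; the acceptance
                           probability is alpha(x, y) = exp(-max(G(x, y), 0)).
    """
    x = np.asarray(x0, dtype=float)
    chain = np.empty((n_steps + 1,) + x.shape)
    chain[0] = x
    for n in range(n_steps):
        y = propose(x, rng)
        if rng.uniform() < np.exp(-max(G(x, y), 0.0)):
            x = y            # accept the proposal
        chain[n + 1] = x     # on rejection the chain stays at x
    return chain

# usage with Random Walk proposals (h = 0.5) for U(x) = |x|^2/2 (standard normal target)
rw = lambda x, rng: x + np.sqrt(0.5) * rng.standard_normal(x.shape)
G_rw = lambda x, y: 0.5 * (y @ y - x @ x)          # G_h(x, y) = U(y) - U(x)
chain = metropolis_hastings(np.zeros(10), 1000, rw, G_rw, np.random.default_rng(1))
```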
METROPOLIS-HASTINGS ACCEPTANCE PROBABILITY

$$\alpha(x,y) = \min\left( \frac{\mu(y)\, p(y,x)}{\mu(x)\, p(x,y)},\ 1 \right) = \exp\left(-G(x,y)^+\right), \qquad x, y \in \mathbb{R}^d,$$

$$G(x,y) = \log\frac{\mu(x)\, p(x,y)}{\mu(y)\, p(y,x)} = U(y) - U(x) + \log\frac{p(x,y)}{p(y,x)} = V(y) - V(x) + \log\frac{\gamma^d(x)\, p(x,y)}{\gamma^d(y)\, p(y,x)}$$


   • $(X_n)$ is a time-homogeneous Markov chain with transition kernel

$$q(x,dy) = \alpha(x,y)\, p(x,y)\, dy + q(x)\, \delta_x(dy), \qquad q(x) = 1 - q(x, \mathbb{R}^d \setminus \{x\}).$$


   • Detailed Balance:

$$\mu(dx)\, q(x,dy) = \mu(dy)\, q(y,dx).$$
PROPOSAL DISTRIBUTIONS FOR METROPOLIS-HASTINGS
$x \mapsto Y_h(x)$ : proposed move, $h > 0$ : step size,
$p_h(x,dy) = P[Y_h(x) \in dy]$ : proposal distribution,
$\alpha_h(x,y) = \exp(-G_h(x,y)^+)$ : acceptance probability.


  • Random Walk Proposals (→ Random Walk Metropolis)

$$Y_h(x) = x + \sqrt{h}\cdot Z, \qquad Z \sim \gamma^d,$$
$$p_h(x,dy) = N(x,\ h\cdot I_d),$$
$$G_h(x,y) = U(y) - U(x).$$

  • Ornstein-Uhlenbeck Proposals

$$Y_h(x) = \left(1 - \frac{h}{2}\right) x + \sqrt{h - \frac{h^2}{4}}\cdot Z, \qquad Z \sim \gamma^d,$$
$$p_h(x,dy) = N\!\left(\left(1 - \tfrac{h}{2}\right)x,\ \left(h - \tfrac{h^2}{4}\right)\cdot I_d\right) \qquad \text{(detailed balance w.r.t. } \gamma^d\text{)},$$
$$G_h(x,y) = V(y) - V(x).$$
  • Euler Proposals (→ Metropolis Adjusted Langevin Algorithm)

$$Y_h(x) = \left(1 - \frac{h}{2}\right) x - \frac{h}{2}\, \nabla V(x) + \sqrt{h}\cdot Z, \qquad Z \sim \gamma^d.$$

    (Euler step for the Langevin equation $dX_t = -\tfrac12 X_t\, dt - \tfrac12 \nabla V(X_t)\, dt + dB_t$)

$$p_h(x,dy) = N\!\left(\left(1 - \tfrac{h}{2}\right)x - \tfrac{h}{2}\, \nabla V(x),\ h\cdot I_d\right),$$
$$G_h(x,y) = V(y) - V(x) - (y-x)\cdot\left(\nabla V(y) + \nabla V(x)\right)/2 + h\left(|\nabla U(y)|^2 - |\nabla U(x)|^2\right)/8.$$


  REMARK. Even for V ≡ 0, $\gamma^d$ is not a stationary distribution for $p_h^{\mathrm{Euler}}$.
  Stationarity only holds asymptotically as h → 0. This causes substantial
  problems in high dimensions.
  • Semi-implicit Euler Proposals (→ Semi-implicit MALA; a code sketch follows below)

$$Y_h(x) = \left(1 - \frac{h}{2}\right) x - \frac{h}{2}\, \nabla V(x) + \sqrt{h - \frac{h^2}{4}}\cdot Z, \qquad Z \sim \gamma^d,$$
$$p_h(x,dy) = N\!\left(\left(1 - \tfrac{h}{2}\right)x - \tfrac{h}{2}\, \nabla V(x),\ \left(h - \tfrac{h^2}{4}\right)\cdot I_d\right) \qquad (\,= p_h^{\mathrm{OU}} \text{ if } V \equiv 0\,),$$
$$G_h(x,y) = V(y) - V(x) - (y-x)\cdot\left(\nabla V(y) + \nabla V(x)\right)/2 + \frac{h}{8-2h}\left[ (y+x)\cdot\left(\nabla V(y) - \nabla V(x)\right) + |\nabla V(y)|^2 - |\nabla V(x)|^2 \right].$$


  REMARK. Semi-implicit discretization of the Langevin equation

$$dX_t = -\frac{1}{2} X_t\, dt - \frac{1}{2}\, \nabla V(X_t)\, dt + dB_t:$$

$$X_{n+1} - X_n = -\frac{\varepsilon}{2}\cdot\frac{X_{n+1} + X_n}{2} - \frac{\varepsilon}{2}\, \nabla V(X_n) + \sqrt{\varepsilon}\, Z_{n+1}, \qquad Z_i \text{ i.i.d.} \sim \gamma^d.$$

Solving for $X_{n+1}$ and substituting $h = \varepsilon/(1 + \varepsilon/4)$:

$$X_{n+1} = \left(1 - \frac{h}{2}\right) X_n - \frac{h}{2}\, \nabla V(X_n) + \sqrt{h - \frac{h^2}{4}}\cdot Z_{n+1}.$$
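A minimal sketch of one semi-implicit MALA transition with the proposal and $G_h$ above (not code from the talk; `V` and `grad_V` are assumed user-supplied callables, and h should lie in (0, 2)):

```python
import numpy as np

def semi_implicit_mala_step(x, h, V, grad_V, rng):
    """One transition of semi-implicit MALA for mu ~ exp(-|x|^2/2 - V(x))."""
    assert 0.0 < h < 2.0
    z = rng.standard_normal(x.shape)
    y = (1 - h / 2) * x - (h / 2) * grad_V(x) + np.sqrt(h - h**2 / 4) * z

    # G_h(x, y) for the semi-implicit proposal (formula above)
    gx, gy = grad_V(x), grad_V(y)
    G = (V(y) - V(x) - 0.5 * np.dot(y - x, gy + gx)
         + h / (8 - 2 * h) * (np.dot(y + x, gy - gx)
                              + np.dot(gy, gy) - np.dot(gx, gx)))

    # accept with probability alpha_h(x, y) = exp(-G^+)
    return y if rng.uniform() < np.exp(-max(G, 0.0)) else x
```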
KNOWN RESULTS FOR METROPOLIS-HASTINGS IN HIGH DIMENSIONS

  • Scaling of acceptance probabilities and mean square jumps as d → ∞

  • Diffusion limits as d → ∞
  • Ergodicity, Geometric Ergodicity

  • Quantitative bounds for mixing times, rigorous complexity estimates
Optimal Scaling and diffusion limits

   • Roberts, Gelman, Gilks 1997: Diffusion limit for RWM with product target,
     $h = O(d^{-1})$

   • Roberts, Rosenthal 1998: Diffusion limit for MALA with product target,
     $h = O(d^{-1/3})$

   • Beskos, Roberts, Stuart, Voss 2008: Semi-implicit MALA applied to
     Transition Path Sampling, Scaling $h = O(1)$

   • Beskos, Roberts, Stuart 2009: Optimal Scaling for non-product targets

   • Mattingly, Pillai, Stuart 2010: Diffusion limit for RWM with non-product
     target, $h = O(d^{-1})$

   • Pillai, Stuart, Thiéry 2011: Diffusion limit for MALA with non-product
     target, $h = O(d^{-1/3})$
Geometric ergodicity for MALA

   • Roberts, Tweedie 1996: Geometric convergence holds if $\nabla U$ is globally
     Lipschitz, but fails in general

   • Bou-Rabee, Vanden-Eijnden 2009: Strong accuracy for truncated MALA

   • Bou-Rabee, Hairer, Vanden-Eijnden 2010: Convergence to equilibrium
     for MALA at an exponential rate, up to a term exponentially small in the
     time step size
BOUNDS FOR MIXING TIME, COMPLEXITY
Metropolis with ball walk proposals
   • Dyer, Frieze, Kannan 1991: µ = Unif(K), K ⊂ $\mathbb{R}^d$ convex
     ⇒ Total variation mixing time is polynomial in d and diam(K)
   • Applegate, Kannan 1991, ..., Lovász, Vempala 2006: U : K → $\mathbb{R}$
     concave, K ⊂ $\mathbb{R}^d$ convex
     ⇒ Total variation mixing time is polynomial in d and diam(K)
Metropolis adjusted Langevin:
   • No rigorous complexity estimates so far.
   • Classical results for Langevin diffusions. In particular: If µ is strictly
     log-concave, i.e.,

$$\exists\, \kappa > 0 :\ \partial^2 U(x) \ge \kappa \cdot I_d \qquad \forall\, x \in \mathbb{R}^d,$$

     then

$$d_K(\mathrm{law}(X_t), \mu) \le e^{-\kappa t}\, d_K(\mathrm{law}(X_0), \mu).$$
   • Bound is independent of the dimension, sharp!

   • Under additional conditions, a corresponding result holds for the Euler
     discretization.

   • This suggests that comparable bounds might hold for MALA, or even for
     Ornstein-Uhlenbeck proposals.
2   Main result and strategy of proof
Semi-implicit MALA:

$$Y_h(x) = \left(1 - \frac{h}{2}\right) x - \frac{h}{2}\, \nabla V(x) + \sqrt{h - \frac{h^2}{4}}\cdot Z, \qquad Z \sim \gamma^d,\ h > 0.$$

Coupling of the proposal distributions $p_h(x,dy)$, $x \in \mathbb{R}^d$:

$$W_h(x) = \begin{cases} Y_h(x) & \text{if } U \le \alpha_h(x, Y_h(x)), \\ x & \text{if } U > \alpha_h(x, Y_h(x)), \end{cases} \qquad U \sim \mathrm{Unif}(0,1) \text{ independent of } Z.$$

This induces a coupling of the MALA transition kernels $q_h(x,dy)$, $x \in \mathbb{R}^d$ (see the sketch below).
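A sketch of this basic coupling (not code from the talk; `V` and `grad_V` are assumed callables): the chains started at x and x̃ are driven by the same Gaussian increment Z and the same uniform variable U, so that both the proposed moves and the accept/reject decisions are coupled.

```python
import numpy as np

def coupled_mala_step(x, x_tilde, h, V, grad_V, rng):
    """Coupled semi-implicit MALA transitions W_h(x), W_h(x_tilde):
    both chains use the same Z and the same uniform U."""
    z = rng.standard_normal(x.shape)
    u = rng.uniform()

    def move(w):
        y = (1 - h / 2) * w - (h / 2) * grad_V(w) + np.sqrt(h - h**2 / 4) * z
        gw, gy = grad_V(w), grad_V(y)
        G = (V(y) - V(w) - 0.5 * np.dot(y - w, gy + gw)
             + h / (8 - 2 * h) * (np.dot(y + w, gy - gw)
                                  + np.dot(gy, gy) - np.dot(gw, gw)))
        return y if u < np.exp(-max(G, 0.0)) else w   # same U for both chains

    return move(x), move(x_tilde)
```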
We fix a radius R ∈ (0, ∞) and a norm $\|\cdot\|_-$ on $\mathbb{R}^d$ such that $\|x\|_- \le |x|$ for
any $x \in \mathbb{R}^d$, and set

$$d(x, \tilde x) := \min\left(\|x - \tilde x\|_-,\ R\right), \qquad B := \{x \in \mathbb{R}^d : \|x\|_- < R/2\}.$$


EXAMPLE: Transition Path Sampling

   • $|x|_{\mathbb{R}^d}$ is the finite dimensional projection of the Cameron-Martin norm

$$|x|_{CM} = \left( \int_0^T \left|\frac{dx}{dt}\right|^2 dt \right)^{1/2}.$$

   • $\|x\|_-$ is a finite dimensional approximation of the supremum or $L^2$ norm.
GOAL:

$$E\left[d(W_h(x), W_h(\tilde x))\right] \le \left(1 - \kappa h + C h^{3/2}\right) d(x, \tilde x) \qquad \forall\, x, \tilde x \in B,\ h \in (0,1),$$

with explicit constants $\kappa, C \in (0, \infty)$ that depend on the dimension d only
through the moments

$$m_k := \int_{\mathbb{R}^d} \|x\|_-^k\ \gamma^d(dx), \qquad k \in \mathbb{N}.$$


CONSEQUENCE:
   • Contractivity of the MALA transition kernel $q_h$ for small h w.r.t. the
     Kantorovich-Wasserstein distance

$$d_K(\nu, \eta) = \inf_{X \sim \nu,\, Y \sim \eta} E[d(X, Y)], \qquad \nu, \eta \in \mathrm{Prob}(\mathbb{R}^d):$$

$$d_K(\nu q_h, \mu) \le \left(1 - \kappa h + C h^{3/2}\right) d_K(\nu, \mu) + R \cdot \left(\nu(B^c) + \mu(B^c)\right).$$
   • Upper bound for the mixing time

$$T_{\mathrm{mix}}(\varepsilon) = \inf\left\{ n \ge 0 :\ d_K(\nu q_h^n, \mu) < \varepsilon \ \text{ for any } \nu \in \mathrm{Prob}(\mathbb{R}^d) \right\}.$$
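To see how the contraction translates into such a bound, here is a rough sketch that ignores the remainder term $R\cdot(\nu(B^c) + \mu(B^c))$: iterating the contraction and using $d \le R$ (so that $d_K(\nu, \mu) \le R$ for every ν) gives

$$d_K(\nu q_h^n, \mu) \le \left(1 - \kappa h + C h^{3/2}\right)^n R, \qquad\text{hence}\qquad T_{\mathrm{mix}}(\varepsilon) \lesssim \frac{\log(R/\varepsilon)}{\kappa h - C h^{3/2}}$$

for h small enough that $\kappa h > C h^{3/2}$.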




EXAMPLE: Transition Path Sampling
Dimension-independent bounds hold under appropriate assumptions.
STRATEGY OF PROOF: Let

$$A(x) := \{U \le \alpha_h(x, Y_h(x))\} \qquad \text{(proposed move from } x \text{ is accepted).}$$

Then

$$E\left[d(W_h(x), W_h(\tilde x))\right] \le E\left[d(Y_h(x), Y_h(\tilde x));\ A(x) \cap A(\tilde x)\right] + d(x, \tilde x)\cdot P\left[A(x)^C \cap A(\tilde x)^C\right] + R\cdot P\left[A(x)\,\triangle\, A(\tilde x)\right].$$

We prove under appropriate assumptions:

  1. $E\left[d(Y_h(x), Y_h(\tilde x))\right] \le (1 - \kappa h)\cdot d(x, \tilde x)$,

  2. $P\left[A(x)^C\right] = E\left[1 - \alpha_h(x, Y_h(x))\right] \le C_1 h^{3/2}$,

  3. $P\left[A(x)\,\triangle\, A(\tilde x)\right] \le E\left[\left|\alpha_h(x, Y_h(x)) - \alpha_h(\tilde x, Y_h(\tilde x))\right|\right] \le C_2 h^{3/2}\, \|x - \tilde x\|_-$,

with explicit constants $\kappa, C_1, C_2 \in (0, \infty)$.
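Combining these (a short check; for $x, \tilde x \in B$ one has $\|x - \tilde x\|_- < R$, hence $d(x, \tilde x) = \|x - \tilde x\|_-$):

$$\begin{aligned}
E\left[d(W_h(x), W_h(\tilde x))\right]
&\le E\left[d(Y_h(x), Y_h(\tilde x))\right] + d(x, \tilde x)\, P\left[A(x)^C\right] + R\, P\left[A(x)\,\triangle\, A(\tilde x)\right] \\
&\le (1 - \kappa h)\, d(x, \tilde x) + C_1 h^{3/2}\, d(x, \tilde x) + R\, C_2 h^{3/2}\, \|x - \tilde x\|_- \\
&= \left(1 - \kappa h + (C_1 + R\, C_2)\, h^{3/2}\right) d(x, \tilde x),
\end{aligned}$$

which is the goal above with $C = C_1 + R\, C_2$.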
3    Contractivity of proposal step
PROPOSITION. Suppose there exists a constant α ∈ (0, 1) such that

$$\left\|\partial^2 V(x)\cdot \eta\right\|_- \le \alpha\, \|\eta\|_- \qquad \forall\, x \in B,\ \eta \in \mathbb{R}^d. \tag{1}$$

Then

$$\left\|Y_h(x) - Y_h(\tilde x)\right\|_- \le \left(1 - \frac{1-\alpha}{2}\, h\right) \|x - \tilde x\|_- \qquad \forall\, x, \tilde x \in B,\ h > 0.$$

Proof.

$$\begin{aligned}
\left\|Y_h(x) - Y_h(\tilde x)\right\|_- &\le \int_0^1 \left\|\partial_{x - \tilde x}\, Y_h(t x + (1-t)\tilde x)\right\|_-\, dt \\
&= \int_0^1 \left\| \left(1 - \frac{h}{2}\right)(x - \tilde x) - \frac{h}{2}\, \partial^2 V(t x + (1-t)\tilde x)\cdot(x - \tilde x) \right\|_-\, dt \\
&\le \left(1 - \frac{h}{2}\right) \|x - \tilde x\|_- + \alpha\, \frac{h}{2}\, \|x - \tilde x\|_-.
\end{aligned}$$
REMARK.
The assumption (1) implies strict convexity of $U(x) = \frac{1}{2}|x|^2 + V(x)$.

EXAMPLE.
For Transition Path Sampling, assumption (1) holds for small T with α independent of
the dimension.
4    Bounds for MALA rejection probabilities

$\alpha_h(x,y) = \exp(-G_h(x,y)^+)$ : acceptance probability for semi-implicit MALA,

$$G_h(x,y) = V(y) - V(x) - (y-x)\cdot\left(\nabla V(y) + \nabla V(x)\right)/2 + \frac{h}{8-2h}\left[ (y+x)\cdot\left(\nabla V(y) - \nabla V(x)\right) + |\nabla V(y)|^2 - |\nabla V(x)|^2 \right].$$


ASSUMPTION. There exist finite constants $C_n, p_n \in [0, \infty)$ such that

$$\left|\left(\partial^n_{\xi_1,\dots,\xi_n} V\right)(x)\right| \le C_n\, \max(1, \|x\|_-)^{p_n}\, \|\xi_1\|_- \cdots \|\xi_n\|_-$$

for any $x \in \mathbb{R}^d$, $\xi_1, \dots, \xi_n \in \mathbb{R}^d$, and n = 2, 3, 4.
THEOREM. If the assumption above is satisfied, then there exists a polynomial
$P: \mathbb{R}^2 \to \mathbb{R}_+$ of degree $\max(p_3 + 3,\ 2p_2 + 2)$ such that

$$E\left[1 - \alpha_h(x, Y_h(x))\right] \le E\left[G_h(x, Y_h(x))^+\right] \le P\left(\|x\|_-,\ \|\nabla U(x)\|_-\right)\cdot h^{3/2}$$

for all $x \in \mathbb{R}^d$, $h \in (0, 2)$.

REMARK.

   • The polynomial P is explicit. It depends only on the values $C_2, C_3, p_2, p_3$
     and on the moments

$$m_k = E\left[\|Z\|_-^k\right], \qquad 0 \le k \le \max(p_3 + 3,\ 2p_2 + 2),$$

     but it does not depend on the dimension d.

   • For MALA with explicit Euler proposals, a corresponding estimate holds
     with $m_k$ replaced by $\tilde m_k = E[|Z|^k]$. Note, however, that $\tilde m_k \to \infty$ as
     $d \to \infty$.
Proof.

$$\left|V(y) - V(x) - (y-x)\cdot\left(\nabla V(y) + \nabla V(x)\right)/2\right| = \left|\frac{1}{2}\int_0^1 t(1-t)\, \partial^3_{y-x} V(x + t(y-x))\, dt\right| \le \frac{1}{12}\, \|y - x\|_-^3 \cdot \sup\left\{\partial^3_\eta V(z) :\ z \in [x,y],\ \|\eta\|_- \le 1\right\}, \tag{2}$$

$$E\left[\|Y_h(x) - x\|_-^3\right] \le \mathrm{const.} \cdot h^{3/2}, \tag{3}$$

$$\left|(y+x)\cdot\left(\nabla V(y) - \nabla V(x)\right)\right| \le \|y+x\|_- \cdot \sup_{\|\xi\|_- \le 1} \left|\partial_\xi V(y) - \partial_\xi V(x)\right| \le \|y+x\|_- \cdot \|y-x\|_- \cdot \sup\left\{\partial^2_{\xi\eta} V(z) :\ z \in [x,y],\ \|\xi\|_-, \|\eta\|_- \le 1\right\}. \tag{4}$$
5     Dependence of rejection on current state
THEOREM. If the assumption above is satisfied, then there exists a polynomial
$Q: \mathbb{R}^2 \to \mathbb{R}_+$ of degree $\max(p_4 + 2,\ p_3 + p_2 + 2,\ 3p_2 + 1)$ such that

$$E\left[\,\left\|\nabla_x\, G_h(x, Y_h(x))\right\|_+\,\right] \le Q\left(\|x\|_-,\ \|\nabla U(x)\|_-\right)\cdot h^{3/2}$$

for all $x \in \mathbb{R}^d$, $h \in (0, 2)$, where

$$\|\eta\|_+ := \sup\{\xi \cdot \eta :\ \|\xi\|_- \le 1\}.$$


CONSEQUENCE.

$$\begin{aligned}
P\left[\mathrm{Accept}(x)\,\triangle\,\mathrm{Accept}(\tilde x)\right]
&\le E\left[\left|\alpha_h(x, Y_h(x)) - \alpha_h(\tilde x, Y_h(\tilde x))\right|\right] \\
&\le E\left[\left|G_h(x, Y_h(x)) - G_h(\tilde x, Y_h(\tilde x))\right|\right] \\
&\le \|x - \tilde x\|_- \cdot \sup_{z \in [x, \tilde x]} \left\|\nabla_z\, G_h(z, Y_h(z))\right\|_+ \\
&\le h^{3/2}\, \|x - \tilde x\|_- \, \sup_{z \in [x, \tilde x]} Q\left(\|z\|_-,\ \|\nabla U(z)\|_-\right).
\end{aligned}$$