SlideShare a Scribd company logo
1 of 60
Download to read offline
Wang–Landau algorithm
               Flat Histogram in finite time
Parallel Wang–Landau: asymptotic behaviour




        On the convergence properties of the
             Wang-Landau Algorithm

                     Robin J. Ryder
             CEREMADE, Universit´ Paris Dauphine
                                e
                       and CREST


                        ENSAE – 31 January 2012


      joint work with Pierre Jacob (National University of Singapore)
          and Pierre Del Moral (INRIA & Universit´ de Bordeaux)
                                                 e



            R. Ryder (Dauphine) & CREST       Wang-Landau               1/ 40
Wang–Landau algorithm
                       Flat Histogram in finite time
        Parallel Wang–Landau: asymptotic behaviour


Outline



  1   Wang–Landau algorithm


  2   Flat Histogram in finite time


  3   Parallel Wang–Landau: asymptotic behaviour




                    R. Ryder (Dauphine) & CREST       Wang-Landau   2/ 40
Wang–Landau algorithm
                     Flat Histogram in finite time
      Parallel Wang–Landau: asymptotic behaviour


Motivation

                      0.5


                      0.4


                      0.3
            density




                      0.2


                      0.1


                      0.0
                            −5            0               5             10   15
                                                      X

         Figure: A mixture of well-separated normal distributions.

                        R. Ryder (Dauphine) & CREST       Wang-Landau             3/ 40
Wang–Landau algorithm
                     Flat Histogram in finite time
      Parallel Wang–Landau: asymptotic behaviour


Motivation

                      0.5


                      0.4


                      0.3
            density




                      0.2


                      0.1


                      0.0
                            −5            0               5             10   15
                                                      X

         Figure: A mixture of well-separated normal distributions.

                        R. Ryder (Dauphine) & CREST       Wang-Landau             4/ 40
Wang–Landau algorithm
                     Flat Histogram in finite time
      Parallel Wang–Landau: asymptotic behaviour


Motivation

                      0.5


                      0.4


                      0.3
            density




                      0.2


                      0.1


                      0.0
                            −5            0               5             10   15
                                                      X

         Figure: A mixture of well-separated normal distributions.

                        R. Ryder (Dauphine) & CREST       Wang-Landau             5/ 40
Wang–Landau algorithm
                      Flat Histogram in finite time
       Parallel Wang–Landau: asymptotic behaviour


Setting

  Partition the state space
                                                     d
                                            X =            Xi
                                                     i=1


  Desired frequencies

                                       φ = (φ1 , . . . , φd )

  Penalized distribution
                                                      π(x)
                                       πθ (x) ∝
                                                     θ(J(x))
  where J(x) such that x ∈ XJ(x) .

                   R. Ryder (Dauphine) & CREST           Wang-Landau   6/ 40
Wang–Landau algorithm
                        Flat Histogram in finite time
         Parallel Wang–Landau: asymptotic behaviour


First algorithm



  Algorithm 1 Wang-Landau with deterministic schedule (γt )
   1: Init θ0 > 0, X0 ∈ X .
   2: for t = 1 to T do
   3:    Sample Xt from Kθt−1 (Xt−1 , ·), MH kernel targeting πθt−1 .
   4:     Update the penalties:

                         log θt (i) ← log θt−1 (i) + γt (1 Xi (Xt ) − φi )
                                                          I

   5:   end for




                     R. Ryder (Dauphine) & CREST       Wang-Landau           7/ 40
Wang–Landau algorithm
                       Flat Histogram in finite time
        Parallel Wang–Landau: asymptotic behaviour


Video




                    R. Ryder (Dauphine) & CREST       Wang-Landau   8/ 40
Wang–Landau algorithm
                       Flat Histogram in finite time
        Parallel Wang–Landau: asymptotic behaviour


Video




                    R. Ryder (Dauphine) & CREST       Wang-Landau   8/ 40
Wang–Landau algorithm
                      Flat Histogram in finite time
       Parallel Wang–Landau: asymptotic behaviour


Flat Histogram

  Issues with the first version
  Choice of γ has a huge impact on the results.

  Flat Histogram
  Define the counters:
                                                      t
                                    νt (i) :=              1 Xi (Xn )
                                                            I
                                                     n=1

  Flat Histogram (FH) is reached when:

                                                νt (i)
                                   max                 − φi < c
                                i∈{1,...,d}       t

                   R. Ryder (Dauphine) & CREST             Wang-Landau   9/ 40
Wang–Landau algorithm
                      Flat Histogram in finite time
       Parallel Wang–Landau: asymptotic behaviour


Flat Histogram



  Idea
  Instead of decreasing γt at each time step t, decrease only when
  the Flat Histogram criterion is reached.

  In practice
  Denote by κt the number of FH criteria reached up to time t.
      Use γκt instead of γt at time t.
      If FH is reached at time t, reset νt (i) to 0 for all i.




                   R. Ryder (Dauphine) & CREST       Wang-Landau     10/ 40
Wang–Landau algorithm
                        Flat Histogram in finite time
         Parallel Wang–Landau: asymptotic behaviour


Wang–Landau with Flat Histogram


  Algorithm 2 Wang-Landau with stochastic schedule (γκt )
   1: Init θ0 > 0, X0 ∈ X .
   2: Init κ0 ← 0.
   3: for t = 1 to T do
   4:    Sample Xt from Kθt−1 (Xt−1 , ·), MH kernel targeting πθt−1 .
   5:    If (FH) then κt ← κt−1 + 1, otherwise κt ← κt−1 .
   6:    Update the penalties:

                         log θt (i) ← log θt−1 (i) + γκt (1 Xi (Xt ) − φi )
                                                           I

   7:   end for



                     R. Ryder (Dauphine) & CREST       Wang-Landau            11/ 40
Wang–Landau algorithm
                      Flat Histogram in finite time
       Parallel Wang–Landau: asymptotic behaviour


FH is met in finite time


  To be sure that eventually, for any c > 0:

                                                νt (i)
                                   max                 − φi < c
                                i∈{1,...,d}       t
  we want to prove:

                                                     νt (i) P
                          ∀i ∈ {1, . . . , d}              − − φi
                                                            −→
                                                       t t→∞
  for any fixed γ > 0.




                   R. Ryder (Dauphine) & CREST       Wang-Landau    12/ 40
Wang–Landau algorithm
                      Flat Histogram in finite time
       Parallel Wang–Landau: asymptotic behaviour


Various updates

  Right update

                   log θt (i) ← log θt−1 (i) + γ (1 Xi (Xt ) − φi )
                                                   I                    (1)
  Using this update, FH is met in finite time.

  Wrong update


                 θt (i) ← θt−1 (i) [1 + γ (1 Xi (Xt ) − φi )]
                                            I
                          ⇔
           log θt (i) ← log θt−1 (i) + log [1 + γ (1 Xi (Xt ) − φi )]
                                                    I                   (2)

  Here FH is not necessarily met in finite time.
                        1
  Special case: ∀i φi = d .

                   R. Ryder (Dauphine) & CREST       Wang-Landau              13/ 40
Wang–Landau algorithm
                      Flat Histogram in finite time
       Parallel Wang–Landau: asymptotic behaviour


Assumptions


  In this talk, there are only two bins: d = 2. Additionally:
  Assumption

  The bins are not empty with respect to µ and π:

                     ∀i ∈ {1, 2}           µ(Xi ) > 0 and π(Xi ) > 0

  Assumption

  The state space X is compact.




                   R. Ryder (Dauphine) & CREST       Wang-Landau       14/ 40
Wang–Landau algorithm
                      Flat Histogram in finite time
       Parallel Wang–Landau: asymptotic behaviour


Assumptions


  Assumption

  The proposal distribution Q(x, y ) is such that:

              ∃qmin > 0           ∀x ∈ X             ∀y ∈ X     Q(x, y ) > qmin

  Assumption

  The MH acceptance ratio is bounded from both sides:

                                                                     π(y ) Q(y , x)
   ∃m > 0 ∃M > 0 ∀x ∈ X                              ∀y ∈ X    m<                   <M
                                                                     π(x) Q(x, y )



                   R. Ryder (Dauphine) & CREST         Wang-Landau                       15/ 40
Wang–Landau algorithm
                      Flat Histogram in finite time
       Parallel Wang–Landau: asymptotic behaviour


Theorem

  Theorem
  Consider the sequence of penalties θt introduced in the WL
  algorithm. We define:

                                     θt (1)
                      Zt = log              = log θt (1) − log θt (2)
                                     θt (2)

  Then:
                               Zt L1
                                  −− 0
                                   −→
                                t t→∞
  and consequently, with update (1) (FH) is reached in finite time
  for any precision threshold c, whereas this is not guaranteed for
  update (2).


                   R. Ryder (Dauphine) & CREST       Wang-Landau        16/ 40
Wang–Landau algorithm
                      Flat Histogram in finite time
       Parallel Wang–Landau: asymptotic behaviour


Consequence



  Using update (1), and starting from Z0 = 0:

                                             X
              Zt = log θt (1) 1 XX
                               IX
                                                 X
                         − log θt (2) 1 XX
                                       IX




                   R. Ryder (Dauphine) & CREST       Wang-Landau   17/ 40
Wang–Landau algorithm
                      Flat Histogram in finite time
       Parallel Wang–Landau: asymptotic behaviour


Consequence



  Using update (1), and starting from Z0 = 0:


                                                     t                          X
              Zt = log θ0 (1) + γ                          I                 IX
                                                     n=1 (1 X1 (Xn ) − φ1 ) 1 XX
                                                       t                           X
                         − log θ0 (2) + γ                                       IX
                                                       n=1 (1 X2 (Xn ) − φ2 ) 1 XX
                                                             I




                   R. Ryder (Dauphine) & CREST          Wang-Landau                    17/ 40
Wang–Landau algorithm
                      Flat Histogram in finite time
       Parallel Wang–Landau: asymptotic behaviour


Consequence



  Using update (1), and starting from Z0 = 0:

                                                     X
              Zt = γ(νt (1) − tφ1 ) 1 XX
                                     IX
                                                         X
                         −γ(νt (2) − tφ2 ) 1 XX
                                            IX




                   R. Ryder (Dauphine) & CREST       Wang-Landau   17/ 40
Wang–Landau algorithm
                      Flat Histogram in finite time
       Parallel Wang–Landau: asymptotic behaviour


Consequence



  Using update (1), and starting from Z0 = 0:

                                                     X
              Zt = γ(νt (1) − tφ1 ) 1 XX
                                     IX
                                                                   X
                         −γ(t − νt (1) − t(1 − φ1 )) 1 XX
                                                      IX




                   R. Ryder (Dauphine) & CREST       Wang-Landau       17/ 40
Wang–Landau algorithm
                      Flat Histogram in finite time
       Parallel Wang–Landau: asymptotic behaviour


Consequence



  Using update (1), and starting from Z0 = 0:

                                                     X
              Zt = γ(νt (1) − tφ1 ) 1 XX
                                     IX
                                                         X
                         +γ(νt (1) − tφ1 )) 1 XX
                                             IX




                   R. Ryder (Dauphine) & CREST       Wang-Landau   17/ 40
Wang–Landau algorithm
                      Flat Histogram in finite time
       Parallel Wang–Landau: asymptotic behaviour


Consequence



  Using update (1), and starting from Z0 = 0:

                                                      X
              Zt = 2γ(νt (1) − tφ1 ) 1 XX
                                      IX
                                X
                           1 XX
                            IX




                   R. Ryder (Dauphine) & CREST       Wang-Landau   17/ 40
Wang–Landau algorithm
                      Flat Histogram in finite time
       Parallel Wang–Landau: asymptotic behaviour


Consequence



  Using update (1), and starting from Z0 = 0:


                                νt (1)                  X
               Zt
               t    = 2γ           t     − φ1        1 XX
                                                      IX
                                X
                           1 XX
                            IX




                    R. Ryder (Dauphine) & CREST        Wang-Landau   17/ 40
Wang–Landau algorithm
                      Flat Histogram in finite time
       Parallel Wang–Landau: asymptotic behaviour


Consequence



  Using update (1), and starting from Z0 = 0:


                                νt (1)                 L             X
               Zt
               t    = 2γ           t − φ1            − − 0 1 XX
                                                      −1→   IX
                                                       t→∞
                         νt (1)     L                    X
              ⇒             t      −1→
                                 − − φ1              1 XX
                                                      IX
                                 t→∞




                    R. Ryder (Dauphine) & CREST        Wang-Landau       17/ 40
Wang–Landau algorithm
                      Flat Histogram in finite time
       Parallel Wang–Landau: asymptotic behaviour


Consequence



  Using update (1), and starting from Z0 = 0:


                                νt (1)                  L             X
               Zt
               t    = 2γ           t − φ1            − − 0 1 XX
                                                      −1→   IX
                                                      t→∞
                         νt (1)     L                   X
              ⇒             t      −1→
                                 − − φ1              1 XX
                                                      IX
                                 t→∞

  Using update (2), these simplifications do not occur: νt (i)/t
  converges, but not to φi in general.




                    R. Ryder (Dauphine) & CREST         Wang-Landau       17/ 40
Wang–Landau algorithm
                       Flat Histogram in finite time
        Parallel Wang–Landau: asymptotic behaviour


Proof



  To prove that
                                            Zt L1
                                              − − 0,
                                               −→
                                            t t→∞
  first notice that

                      Zt+1 − Zt = 2γ (νt+1 (1) − νt (1) − φ1 )
  can only take two different values, which we note +a and −b.




                    R. Ryder (Dauphine) & CREST       Wang-Landau   18/ 40
Wang–Landau algorithm
                       Flat Histogram in finite time
        Parallel Wang–Landau: asymptotic behaviour


Proof

  Introduce Ut , the increment of Zt :

                                         Zt+1 = Zt + Ut

  Lemma
  With the introduced processes Zt and Ut , there exists > 0 such
                                   ¯                       ¯
  that for all η > 0, there exists Z hi such that, if Zt ≥ Z hi , we have
  the following two inequalities:

                        P[Ut+1 = −b|Ut = +a, Zt ] >
                        P[Ut+1 = −b|Ut = −b, Zt ] > 1 − η.

  (and the symmetric corollary)

                    R. Ryder (Dauphine) & CREST       Wang-Landau           19/ 40
Wang–Landau algorithm
                           Flat Histogram in finite time
            Parallel Wang–Landau: asymptotic behaviour


Proof

                                                                                                      q


                                                                                                  q

                                                                                                          q
                                                                                              q


                                                                                          q
                                                                                                               q

                                                                                      q


                                                                                  q                                q
                                                                                                      q
                                                                              q
                                                                                                  q                    q
                                                                          q
                                                                                              q           q                    q

                                                                      q
                                                                      q
                                                                                          q                                q

                                                                 q
                                                                 q                                             q                    q

                                                                          q           q
                                                             q
                                                             q


                                                 Zs      q
                                                         q
                                                                              q
                                                                                  q                                q                    q



                                                     q                                                                 q
                                                                                                                                                    ~ ~
                                                     q
                                                                                                                                            q

                                                 q                                                                             q
                                                                                                                                                    Zs+T
                                                 q

                        q
                        q
                                            q
                                            q                                                                              q                    q

                    q           q
                                q
                                                                                                                                    q
                                                                                                                           Zs+T
                    q

                            q
                            q
                                        q
                                        q

                q
                q
                                    q
                                    q
                                                                                                                                        q
            q
            q


        q
        q
                                                                                                                                            q
                                                                                                                                                    q
                                                                                                                                                q            q
                                                                                                                                                         q

                        5                   10                   15              20                       25                   30                   35
                                                                              time

  Figure: We prove that Zt returns below a given horizontal bar whenever
  it goes above it, and it does so in finite time. This implies Zt /t → 0.

                                R. Ryder (Dauphine) & CREST                           Wang-Landau                                                                20/ 40
Wang–Landau algorithm
                       Flat Histogram in finite time
        Parallel Wang–Landau: asymptotic behaviour


Proof

  Introduce a crossing time s: Zs−1 ≤ Z hi and Zs ≥ Z hi .
                           ˜
  Introduce a new process Zt :
                              ˜           ˜      ˜
                              Zs = Zs and Zt+1 = Zt + Vt
                             ˜
  We can choose Vt such that Zt ≥ Zt and
  Lemma
  (Vt ) is a Markov chain over the space {+a, −b} with transition
  matrix
                             1−
                                η    1−η
  where the first state corresponds to +a and the second state to
  −b.

                    R. Ryder (Dauphine) & CREST       Wang-Landau   21/ 40
Wang–Landau algorithm
                           Flat Histogram in finite time
            Parallel Wang–Landau: asymptotic behaviour


Proof

                                                                                                      q


                                                                                                  q

                                                                                                          q
                                                                                              q


                                                                                          q
                                                                                                               q

                                                                                      q


                                                                                  q                                q
                                                                                                      q
                                                                              q
                                                                                                  q                    q
                                                                          q
                                                                                              q           q                    q

                                                                      q
                                                                      q
                                                                                          q                                q

                                                                 q
                                                                 q                                             q                    q

                                                                          q           q
                                                             q
                                                             q


                                                 Zs      q
                                                         q
                                                                              q
                                                                                  q                                q                    q



                                                     q                                                                 q
                                                                                                                                                    ~ ~
                                                     q
                                                                                                                                            q

                                                 q                                                                             q
                                                                                                                                                    Zs+T
                                                 q

                        q
                        q
                                            q
                                            q                                                                              q                    q

                    q           q
                                q
                                                                                                                                    q
                                                                                                                           Zs+T
                    q

                            q
                            q
                                        q
                                        q

                q
                q
                                    q
                                    q
                                                                                                                                        q
            q
            q


        q
        q
                                                                                                                                            q
                                                                                                                                                    q
                                                                                                                                                q            q
                                                                                                                                                         q

                        5                   10                   15              20                       25                   30                   35
                                                                              time

  Figure: We prove that Zt returns below a given horizontal bar whenever
  it goes above it, and it does so in finite time. This implies Zt /t → 0.

                                R. Ryder (Dauphine) & CREST                           Wang-Landau                                                                22/ 40
Wang–Landau algorithm
                      Flat Histogram in finite time
       Parallel Wang–Landau: asymptotic behaviour


Case d = 2




  This shows that Zt /t → 0, hence (FH) is met in finite time.
  For the case d > 2, we shall use only update 1.




                   R. Ryder (Dauphine) & CREST       Wang-Landau   23/ 40
Wang–Landau algorithm
                      Flat Histogram in finite time
       Parallel Wang–Landau: asymptotic behaviour


Case d > 2



  To prove that (FH) is met in finite time in the case with d > 2, we
  need an extra assumption:
  Assumption
  The desired frequencies are all rational numbers:

                                            ∀i, φi ∈ Q




                   R. Ryder (Dauphine) & CREST       Wang-Landau       24/ 40
Wang–Landau algorithm
                      Flat Histogram in finite time
       Parallel Wang–Landau: asymptotic behaviour


Irreducible Markov chain

  Under the assumption of rationality, we have the following lemma:
  Lemma
  Let Θ be the following subset of Rd :
                                                                             d
                           d                              d
        Θ = {z ∈ R : ∃(n1 , . . . , nd ) ∈ N                  zi = ni − φi         nj }
                                                                             j=1

  Then denoting by λ the product of the Lebesgue measure µ on X
  and of the counting measure on Θ, (Xt , log θt ) is λ-irreducible.

  Proof: B´zout’s lemma.
          e


                   R. Ryder (Dauphine) & CREST       Wang-Landau                          25/ 40
Wang–Landau algorithm
                      Flat Histogram in finite time
       Parallel Wang–Landau: asymptotic behaviour


A limit exists




  Since (Xt , log θt ) is λ-irreducible, the proportion of visits to any
  λ-measurable set of X × Θ converges to a limit in [0, 1]. Hence
  the vector (ν(i)/t) converges to some limit pi . We now need to
  prove that this limit corresponds to (FH).




                   R. Ryder (Dauphine) & CREST       Wang-Landau           26/ 40
Wang–Landau algorithm
                      Flat Histogram in finite time
       Parallel Wang–Landau: asymptotic behaviour


Reductio ad absurdum



  Suppose (reductio ad absurdum) that p = φ. Then there exist i
  and j such that
                          pi − φi < pj − φj
  and hence

  Ztj,i = −νt (i) + νt (j) + t(φi − φj ) ∼ t(−pi + φi + pj − φj ) → ∞
                                      ˜
  As before, we can construct Ut and Ut and show that Ut decreases
  on average, which contradicts the assumption that Ztj,i → ∞.




                   R. Ryder (Dauphine) & CREST       Wang-Landau        27/ 40
Wang–Landau algorithm
                      Flat Histogram in finite time
       Parallel Wang–Landau: asymptotic behaviour


Conclusion for d > 2




  Conclusion: using update 1, (FH) is met in finite time for any
  number of bins d.




                   R. Ryder (Dauphine) & CREST       Wang-Landau   28/ 40
Wang–Landau algorithm
                               Flat Histogram in finite time
                Parallel Wang–Landau: asymptotic behaviour


Illustration


                                                                            1.0                                                                1.0




                                              Proportions of visits to each bin




                                                                                                                 Proportions of visits to each bin
          0.5
                                                                            0.8                                                                0.8
          0.4
                                                                            0.6                                                                0.6
    density




          0.3

                                                                            0.4                                                                0.4
          0.2


          0.1                                                               0.2                                                                0.2


          0.0                                                               0.0                                                                0.0
                −4    −2       0   2    4                                         0   1000 2000 3000 4000 5000                                       0   1000 2000 3000 4000 5000
                           X                                                             iterations                                                         iterations

   (a) Histogram of the gen- (b) Convergence of the                                                              (c) Convergence of the
   erated sample             proportions of visits to                                                            proportions of visits to
                             each bin, using the right                                                           each bin, using the wrong
                             update                                                                              update



                               R. Ryder (Dauphine) & CREST                                      Wang-Landau                                                                         29/ 40
Wang–Landau algorithm
                      Flat Histogram in finite time
       Parallel Wang–Landau: asymptotic behaviour


Parallel chains


                  (1)             (N)
  N chains (Xt , . . . , Xt             ) instead of one.
       targeting the same biased distribution πθt at iteration t,
       sharing the same estimated bias θt at iteration t.

  Old update of the penalties
                                                                   X
              log θt (i) ← log θt−1 (i) + γ (1 Xi (Xt ) − φi ) 1 XX
                                              I                 IX


  (L. Bornn, P. Jacob, P. Del Moral, A. Doucet)



                   R. Ryder (Dauphine) & CREST       Wang-Landau       30/ 40
Wang–Landau algorithm
                      Flat Histogram in finite time
       Parallel Wang–Landau: asymptotic behaviour


Parallel chains


                  (1)             (N)
  N chains (Xt , . . . , Xt             ) instead of one.
       targeting the same biased distribution πθt at iteration t,
       sharing the same estimated bias θt at iteration t.

  New update of the penalties
                                                                   N          (k)             X
              log θt (i) ← log θt−1 (i) + γ               1
                                                          N        k=1 1 Xi (Xt )
                                                                        I           − φi   1 XX
                                                                                            IX


  (L. Bornn, P. Jacob, P. Del Moral, A. Doucet)



                   R. Ryder (Dauphine) & CREST       Wang-Landau                            30/ 40
Wang–Landau algorithm
                       Flat Histogram in finite time
        Parallel Wang–Landau: asymptotic behaviour


Infinite number of chains

  Suppose φi = 1/d. We can use either update and choose the
  “wrong” update
                                                          N
                                                      1               (k)      1
              θt (i) ← θt−1 (i) 1 + γ                           1 Xi (Xt ) −
                                                                 I
                                                      N                        d
                                                          k=1

                        (k)          iid
  Note that if (Xt )N ∼ πθt then
                    k=1

        N                                                                                 −1
    1                (k)         a.s.                               ψi              ψk
              1 Xi (Xt )
               I              −−→
                              −−                 πθt (x)dx =
    N                          N→∞          Xi                   θt−1 (i)          θ(k)
        k=1                                                                    k




                    R. Ryder (Dauphine) & CREST       Wang-Landau                              31/ 40
Wang–Landau algorithm
                    Flat Histogram in finite time
     Parallel Wang–Landau: asymptotic behaviour




This corresponds to update

                                               1
     θt (i) ← θt−1 (i) 1 + γ       πθt (x)dx −
                                X              d
                              i                                            
                                                                   −1
                                  ψi           ψk                           1 
   ⇔ θt (i) ← θt−1 (i) 1 + γ                                          −
                                θt−1 (i)      θ(k)                          d
                                                               k

Normalize:
                                                   θt (i)
                                    θt (i) ←
                                                    j θt (j)

Then the penalties (θt )t≥0 should converge towards ψ.




                 R. Ryder (Dauphine) & CREST       Wang-Landau                     32/ 40
Wang–Landau algorithm
               Flat Histogram in finite time
Parallel Wang–Landau: asymptotic behaviour




                                                ψi             ψk     −1         XX
                     θt−1 (i) 1+γ             θt−1 (i)   (   k θ(k)   )     1
                                                                           −d    I
                                                                                1 XX
θt (i) ←                                                                               XX
                                                   ψj              ψk −1 1
                     d
                     j=1 θt−1 (j)     1+γ        θt−1 (j)    (   k θ(k) −d)         I
                                                                                   1 XX




            R. Ryder (Dauphine) & CREST           Wang-Landau                               33/ 40
Wang–Landau algorithm
               Flat Histogram in finite time
Parallel Wang–Landau: asymptotic behaviour




                                                ψi                         XX
                     θt−1 (i) 1+γ                       −1         1
                                              θt−1 (i) Mψ (θt−1 )− d      I
                                                                          1 XX
θt (i) ←                                           ψj                            XX
                     d
                     j=1 θt−1 (j)     1+γ                  −1         1
                                                 θt−1 (j) Mψ (θt−1 )− d       I
                                                                             1 XX




            R. Ryder (Dauphine) & CREST           Wang-Landau                         33/ 40
Wang–Landau algorithm
               Flat Histogram in finite time
Parallel Wang–Landau: asymptotic behaviour




                                                              XX
                                            −1
                                                       I
                     θt−1 (i)(1− γ ) + ψi γMψ (θt−1 ) 1 XX
                                 d
θt (i) ←                                                           XX
                     d               γ
                     j=1 θt−1 (j)(1− d )
                                                     −1
                                                               I
                                              + ψj γMψ (θt−1 ) 1 XX




            R. Ryder (Dauphine) & CREST       Wang-Landau               33/ 40
Wang–Landau algorithm
               Flat Histogram in finite time
Parallel Wang–Landau: asymptotic behaviour




                                                             XX
                                        −1
                                                   I
                 θt−1 (i)(1− γ ) + ψi γMψ (θt−1 ) 1 XX
                             d
θt (i) ←                                                XX
                                   −1
                                              I
                         (1− γ )+γMψ (θt−1 ) 1 XX
                             d




            R. Ryder (Dauphine) & CREST       Wang-Landau         33/ 40
Wang–Landau algorithm
               Flat Histogram in finite time
Parallel Wang–Landau: asymptotic behaviour




                                      1− γ                     γM −1 (θt−1 )
                                                                 ψ
θt (i) ← θt−1 (i) 1− γ +γM −1 (θ
                           d
                                                      + ψi 1− γ +γM −1 (θ
                                 d       ψ    t−1 )            d      ψ     t−1 )




            R. Ryder (Dauphine) & CREST       Wang-Landau                           33/ 40
Wang–Landau algorithm
               Flat Histogram in finite time
Parallel Wang–Landau: asymptotic behaviour




θt (i) ← θt−1 (i) × (1 − α(θt−1 )) + ψi × α(θt−1 )




            R. Ryder (Dauphine) & CREST       Wang-Landau   33/ 40
Wang–Landau algorithm
                    Flat Histogram in finite time
     Parallel Wang–Landau: asymptotic behaviour




Hence the update can be written as a convex combination

           θt (i) ← θt−1 (i) × (1 − α(θt−1 )) + ψi × α(θt−1 )

where
                                                     −1
                                                   γMψ (θt−1 )
                        α(θt−1 ) =                 γ       −1
                                          1−       d   + γMψ (θt−1 )
It can easily be shown that

                   ∀t ≥ 0        0 < αmin ≤ α(θt ) ≤ αmax < 1

provided that ∀i, θ0 (i) > 0, ψ(i) > 0.
                                                             iid
Of course the assumptions N → ∞ and ∼ sampling are strong.
What happens in practice?


                 R. Ryder (Dauphine) & CREST           Wang-Landau     34/ 40
Wang–Landau algorithm
                      Flat Histogram in finite time
       Parallel Wang–Landau: asymptotic behaviour


Animation



  Toy example

                      X = {1, 2, 3, 4, 5},           π = (π1 , . . . , π5 )
  We take the partition: Xi = {i} and launch the Wang–Landau
  algorithm for N = 1, 10, 100 and γ = 1.
  In this case we know the value of ψi : ψi = πi , so we can compute
  the deterministic sequence of penalties corresponding to N = ∞.

  Animation




                   R. Ryder (Dauphine) & CREST       Wang-Landau              35/ 40
Wang–Landau algorithm
                    Flat Histogram in finite time
     Parallel Wang–Landau: asymptotic behaviour


Animation




                                                                 N=∞
                 R. Ryder (Dauphine) & CREST       Wang-Landau         36/ 40
Wang–Landau algorithm
                    Flat Histogram in finite time
     Parallel Wang–Landau: asymptotic behaviour


Animation




                                                                 N = 100
                 R. Ryder (Dauphine) & CREST       Wang-Landau             36/ 40
Wang–Landau algorithm
                    Flat Histogram in finite time
     Parallel Wang–Landau: asymptotic behaviour


Animation




                                                                 N = 10
                 R. Ryder (Dauphine) & CREST       Wang-Landau            36/ 40
Wang–Landau algorithm
                    Flat Histogram in finite time
     Parallel Wang–Landau: asymptotic behaviour


Animation




                                                                 N=1
                 R. Ryder (Dauphine) & CREST       Wang-Landau         36/ 40
Wang–Landau algorithm
                    Flat Histogram in finite time
     Parallel Wang–Landau: asymptotic behaviour


Animation




                                                           “




                 R. Ryder (Dauphine) & CREST       Wang-Landau   37/ 40
Wang–Landau algorithm
                      Flat Histogram in finite time
       Parallel Wang–Landau: asymptotic behaviour


New take on the Parallel Wang–Landau


  Call fψ the function such that:

                                         θt ← fψ (θt−1 )

  As we saw:
                            fψ (θ) = θ(1 − α(θ)) + ψα(θ))
  where
                                                       −1
                                                     γMψ (θ)
                               α(θ) =                γ       −1
                                            1−       d   + γMψ (θ)




                   R. Ryder (Dauphine) & CREST           Wang-Landau   38/ 40
Wang–Landau algorithm
                      Flat Histogram in finite time
       Parallel Wang–Landau: asymptotic behaviour


New take on the Parallel Wang–Landau (in progress)

  In the ideal case we have
                                                          n
                            θn = fψ ◦ . . . ◦ fψ (θ0 ) = fψ (θ0 )
                                            ×n

  If we add perturbations to ψ we also add perturbations to θ, as
  follows:
                 ε
           θ1 = fψ (θ0 ) = fψ (θ0 ) + V0




                   R. Ryder (Dauphine) & CREST       Wang-Landau    39/ 40
Wang–Landau algorithm
                      Flat Histogram in finite time
       Parallel Wang–Landau: asymptotic behaviour


New take on the Parallel Wang–Landau (in progress)

  In the ideal case we have
                                                          n
                            θn = fψ ◦ . . . ◦ fψ (θ0 ) = fψ (θ0 )
                                            ×n

  If we add perturbations to ψ we also add perturbations to θ, as
  follows:
                 ε
           θ1 = fψ (θ0 ) = fψ (θ0 ) + V0
                 ε          ε                     2
           θ2 = fψ (θ1 ) = fψ (fψ (θ0 ) + V0 ) = fψ (θ0 ) + V1
           ...




                   R. Ryder (Dauphine) & CREST       Wang-Landau    39/ 40
Wang–Landau algorithm
                      Flat Histogram in finite time
       Parallel Wang–Landau: asymptotic behaviour


New take on the Parallel Wang–Landau (in progress)

  In the ideal case we have
                                                          n
                            θn = fψ ◦ . . . ◦ fψ (θ0 ) = fψ (θ0 )
                                            ×n

  If we add perturbations to ψ we also add perturbations to θ, as
  follows:
                 ε
           θ1 = fψ (θ0 ) = fψ (θ0 ) + V0
                 ε          ε                     2
           θ2 = fψ (θ1 ) = fψ (fψ (θ0 ) + V0 ) = fψ (θ0 ) + V1
           ...

  The properties of fψ should help control the noise terms
  (Vn )n≥0 . . .


                   R. Ryder (Dauphine) & CREST       Wang-Landau    39/ 40
Wang–Landau algorithm
                     Flat Histogram in finite time
      Parallel Wang–Landau: asymptotic behaviour


Bibliography

      Atchad´, Y. and Liu, J. (2010). The Wang-Landau algorithm
              e
      in general state spaces: applications and convergence analysis.
      Statistica Sinica, 20:209–233.
      Wang, F. and Landau, D. (2001). Determining the density of
      states for classical statistical models: A random walk
      algorithm to produce a flat histogram. Physical Review E,
      64(5):56101.
      An Adaptive Interacting Wang-Landau Algorithm for
      Automatic Density Exploration, L. Bornn, P. Jacob, P. Del
      Moral, A. Doucet, on arXiv.
      The Wang-Landau algorithm reaches the Flat Histogram
      criterion in finite time, P. Jacob & RR, on arXiv.


                  R. Ryder (Dauphine) & CREST       Wang-Landau         40/ 40

More Related Content

What's hot

Dissertation Slides
Dissertation SlidesDissertation Slides
Dissertation SlidesSanghui Ahn
 
2009 CSBB LAB 新生訓練
2009 CSBB LAB 新生訓練2009 CSBB LAB 新生訓練
2009 CSBB LAB 新生訓練Abner Huang
 
Seminar Talk: Multilevel Hybrid Split Step Implicit Tau-Leap for Stochastic R...
Seminar Talk: Multilevel Hybrid Split Step Implicit Tau-Leap for Stochastic R...Seminar Talk: Multilevel Hybrid Split Step Implicit Tau-Leap for Stochastic R...
Seminar Talk: Multilevel Hybrid Split Step Implicit Tau-Leap for Stochastic R...Chiheb Ben Hammouda
 
Aurelian Isar - Decoherence And Transition From Quantum To Classical In Open ...
Aurelian Isar - Decoherence And Transition From Quantum To Classical In Open ...Aurelian Isar - Decoherence And Transition From Quantum To Classical In Open ...
Aurelian Isar - Decoherence And Transition From Quantum To Classical In Open ...SEENET-MTP
 
Presentation of SMC^2 at BISP7
Presentation of SMC^2 at BISP7Presentation of SMC^2 at BISP7
Presentation of SMC^2 at BISP7Pierre Jacob
 
DFT and its properties
DFT and its propertiesDFT and its properties
DFT and its propertiesssuser2797e4
 
DPPs everywhere: repulsive point processes for Monte Carlo integration, signa...
DPPs everywhere: repulsive point processes for Monte Carlo integration, signa...DPPs everywhere: repulsive point processes for Monte Carlo integration, signa...
DPPs everywhere: repulsive point processes for Monte Carlo integration, signa...Advanced-Concepts-Team
 
弱値の半古典論
弱値の半古典論弱値の半古典論
弱値の半古典論tanaka-atushi
 
Model checking of time petri nets
Model checking of time petri netsModel checking of time petri nets
Model checking of time petri netsMarwa Al-Rikaby
 
Quantum Algorithms and Lower Bounds in Continuous Time
Quantum Algorithms and Lower Bounds in Continuous TimeQuantum Algorithms and Lower Bounds in Continuous Time
Quantum Algorithms and Lower Bounds in Continuous TimeDavid Yonge-Mallo
 
Dsp U Lec07 Realization Of Discrete Time Systems
Dsp U   Lec07 Realization Of Discrete Time SystemsDsp U   Lec07 Realization Of Discrete Time Systems
Dsp U Lec07 Realization Of Discrete Time Systemstaha25
 
A Vanilla Rao-Blackwellisation
A Vanilla Rao-BlackwellisationA Vanilla Rao-Blackwellisation
A Vanilla Rao-BlackwellisationChristian Robert
 
F Giordano Proton transversity distributions
F Giordano Proton transversity distributionsF Giordano Proton transversity distributions
F Giordano Proton transversity distributionsFrancesca Giordano
 

What's hot (19)

Dissertation Slides
Dissertation SlidesDissertation Slides
Dissertation Slides
 
NC time seminar
NC time seminarNC time seminar
NC time seminar
 
QMC: Operator Splitting Workshop, Thresholdings, Robustness, and Generalized ...
QMC: Operator Splitting Workshop, Thresholdings, Robustness, and Generalized ...QMC: Operator Splitting Workshop, Thresholdings, Robustness, and Generalized ...
QMC: Operator Splitting Workshop, Thresholdings, Robustness, and Generalized ...
 
2009 CSBB LAB 新生訓練
2009 CSBB LAB 新生訓練2009 CSBB LAB 新生訓練
2009 CSBB LAB 新生訓練
 
Seminar Talk: Multilevel Hybrid Split Step Implicit Tau-Leap for Stochastic R...
Seminar Talk: Multilevel Hybrid Split Step Implicit Tau-Leap for Stochastic R...Seminar Talk: Multilevel Hybrid Split Step Implicit Tau-Leap for Stochastic R...
Seminar Talk: Multilevel Hybrid Split Step Implicit Tau-Leap for Stochastic R...
 
Aurelian Isar - Decoherence And Transition From Quantum To Classical In Open ...
Aurelian Isar - Decoherence And Transition From Quantum To Classical In Open ...Aurelian Isar - Decoherence And Transition From Quantum To Classical In Open ...
Aurelian Isar - Decoherence And Transition From Quantum To Classical In Open ...
 
Presentation of SMC^2 at BISP7
Presentation of SMC^2 at BISP7Presentation of SMC^2 at BISP7
Presentation of SMC^2 at BISP7
 
Transforms
TransformsTransforms
Transforms
 
DFT and its properties
DFT and its propertiesDFT and its properties
DFT and its properties
 
DPPs everywhere: repulsive point processes for Monte Carlo integration, signa...
DPPs everywhere: repulsive point processes for Monte Carlo integration, signa...DPPs everywhere: repulsive point processes for Monte Carlo integration, signa...
DPPs everywhere: repulsive point processes for Monte Carlo integration, signa...
 
1416336962.pdf
1416336962.pdf1416336962.pdf
1416336962.pdf
 
弱値の半古典論
弱値の半古典論弱値の半古典論
弱値の半古典論
 
Introduction to chaos
Introduction to chaosIntroduction to chaos
Introduction to chaos
 
Model checking of time petri nets
Model checking of time petri netsModel checking of time petri nets
Model checking of time petri nets
 
0906.2042v2
0906.2042v20906.2042v2
0906.2042v2
 
Quantum Algorithms and Lower Bounds in Continuous Time
Quantum Algorithms and Lower Bounds in Continuous TimeQuantum Algorithms and Lower Bounds in Continuous Time
Quantum Algorithms and Lower Bounds in Continuous Time
 
Dsp U Lec07 Realization Of Discrete Time Systems
Dsp U   Lec07 Realization Of Discrete Time SystemsDsp U   Lec07 Realization Of Discrete Time Systems
Dsp U Lec07 Realization Of Discrete Time Systems
 
A Vanilla Rao-Blackwellisation
A Vanilla Rao-BlackwellisationA Vanilla Rao-Blackwellisation
A Vanilla Rao-Blackwellisation
 
F Giordano Proton transversity distributions
F Giordano Proton transversity distributionsF Giordano Proton transversity distributions
F Giordano Proton transversity distributions
 

Similar to On the convergence properties of the Wang-Landau algorithm

PAWL - GPU meeting @ Warwick
PAWL - GPU meeting @ WarwickPAWL - GPU meeting @ Warwick
PAWL - GPU meeting @ WarwickPierre Jacob
 
Further discriminatory signature of inflation
Further discriminatory signature of inflationFurther discriminatory signature of inflation
Further discriminatory signature of inflationLaila A
 
RSS discussion of Girolami and Calderhead, October 13, 2010
RSS discussion of Girolami and Calderhead, October 13, 2010RSS discussion of Girolami and Calderhead, October 13, 2010
RSS discussion of Girolami and Calderhead, October 13, 2010Christian Robert
 
Computing Local and Global Centrality
Computing Local and Global CentralityComputing Local and Global Centrality
Computing Local and Global CentralityDavid Gleich
 
Riemannian stochastic variance reduced gradient on Grassmann manifold (ICCOPT...
Riemannian stochastic variance reduced gradient on Grassmann manifold (ICCOPT...Riemannian stochastic variance reduced gradient on Grassmann manifold (ICCOPT...
Riemannian stochastic variance reduced gradient on Grassmann manifold (ICCOPT...Hiroyuki KASAI
 
Quadrature amplitude modulation qam transmitter
Quadrature amplitude modulation qam transmitterQuadrature amplitude modulation qam transmitter
Quadrature amplitude modulation qam transmitterAsaad Drake
 

Similar to On the convergence properties of the Wang-Landau algorithm (7)

PAWL - GPU meeting @ Warwick
PAWL - GPU meeting @ WarwickPAWL - GPU meeting @ Warwick
PAWL - GPU meeting @ Warwick
 
Further discriminatory signature of inflation
Further discriminatory signature of inflationFurther discriminatory signature of inflation
Further discriminatory signature of inflation
 
RSS discussion of Girolami and Calderhead, October 13, 2010
RSS discussion of Girolami and Calderhead, October 13, 2010RSS discussion of Girolami and Calderhead, October 13, 2010
RSS discussion of Girolami and Calderhead, October 13, 2010
 
Computing Local and Global Centrality
Computing Local and Global CentralityComputing Local and Global Centrality
Computing Local and Global Centrality
 
Real Time Spectroscopy
Real Time SpectroscopyReal Time Spectroscopy
Real Time Spectroscopy
 
Riemannian stochastic variance reduced gradient on Grassmann manifold (ICCOPT...
Riemannian stochastic variance reduced gradient on Grassmann manifold (ICCOPT...Riemannian stochastic variance reduced gradient on Grassmann manifold (ICCOPT...
Riemannian stochastic variance reduced gradient on Grassmann manifold (ICCOPT...
 
Quadrature amplitude modulation qam transmitter
Quadrature amplitude modulation qam transmitterQuadrature amplitude modulation qam transmitter
Quadrature amplitude modulation qam transmitter
 

More from Robin Ryder

Bayesian Methods for Historical Linguistics
Bayesian Methods for Historical LinguisticsBayesian Methods for Historical Linguistics
Bayesian Methods for Historical LinguisticsRobin Ryder
 
A phylogenetic model of language diversification
A phylogenetic model of language diversificationA phylogenetic model of language diversification
A phylogenetic model of language diversificationRobin Ryder
 
Statistical Methods in Historical Linguistics
Statistical Methods in Historical LinguisticsStatistical Methods in Historical Linguistics
Statistical Methods in Historical LinguisticsRobin Ryder
 
Introduction à ABC
Introduction à ABCIntroduction à ABC
Introduction à ABCRobin Ryder
 
Bayesian case studies, practical 2
Bayesian case studies, practical 2Bayesian case studies, practical 2
Bayesian case studies, practical 2Robin Ryder
 
Bayesian case studies, practical 1
Bayesian case studies, practical 1Bayesian case studies, practical 1
Bayesian case studies, practical 1Robin Ryder
 
Modèles phylogéniques de la diversification des langues
Modèles phylogéniques de la diversification des languesModèles phylogéniques de la diversification des langues
Modèles phylogéniques de la diversification des languesRobin Ryder
 
Talk at Institut Jean Nicod on 6 October 2010
Talk at Institut Jean Nicod on 6 October 2010Talk at Institut Jean Nicod on 6 October 2010
Talk at Institut Jean Nicod on 6 October 2010Robin Ryder
 
Phylogenetic models and MCMC methods for the reconstruction of language history
Phylogenetic models and MCMC methods for the reconstruction of language historyPhylogenetic models and MCMC methods for the reconstruction of language history
Phylogenetic models and MCMC methods for the reconstruction of language historyRobin Ryder
 
Modèles phylogénétiques de la diversification des langues
Modèles phylogénétiques de la diversification des languesModèles phylogénétiques de la diversification des langues
Modèles phylogénétiques de la diversification des languesRobin Ryder
 
Approximate Bayesian Computation (ABC)
Approximate Bayesian Computation (ABC)Approximate Bayesian Computation (ABC)
Approximate Bayesian Computation (ABC)Robin Ryder
 

More from Robin Ryder (11)

Bayesian Methods for Historical Linguistics
Bayesian Methods for Historical LinguisticsBayesian Methods for Historical Linguistics
Bayesian Methods for Historical Linguistics
 
A phylogenetic model of language diversification
A phylogenetic model of language diversificationA phylogenetic model of language diversification
A phylogenetic model of language diversification
 
Statistical Methods in Historical Linguistics
Statistical Methods in Historical LinguisticsStatistical Methods in Historical Linguistics
Statistical Methods in Historical Linguistics
 
Introduction à ABC
Introduction à ABCIntroduction à ABC
Introduction à ABC
 
Bayesian case studies, practical 2
Bayesian case studies, practical 2Bayesian case studies, practical 2
Bayesian case studies, practical 2
 
Bayesian case studies, practical 1
Bayesian case studies, practical 1Bayesian case studies, practical 1
Bayesian case studies, practical 1
 
Modèles phylogéniques de la diversification des langues
Modèles phylogéniques de la diversification des languesModèles phylogéniques de la diversification des langues
Modèles phylogéniques de la diversification des langues
 
Talk at Institut Jean Nicod on 6 October 2010
Talk at Institut Jean Nicod on 6 October 2010Talk at Institut Jean Nicod on 6 October 2010
Talk at Institut Jean Nicod on 6 October 2010
 
Phylogenetic models and MCMC methods for the reconstruction of language history
Phylogenetic models and MCMC methods for the reconstruction of language historyPhylogenetic models and MCMC methods for the reconstruction of language history
Phylogenetic models and MCMC methods for the reconstruction of language history
 
Modèles phylogénétiques de la diversification des langues
Modèles phylogénétiques de la diversification des languesModèles phylogénétiques de la diversification des langues
Modèles phylogénétiques de la diversification des langues
 
Approximate Bayesian Computation (ABC)
Approximate Bayesian Computation (ABC)Approximate Bayesian Computation (ABC)
Approximate Bayesian Computation (ABC)
 

On the convergence properties of the Wang-Landau algorithm

  • 1. Wang–Landau algorithm Flat Histogram in finite time Parallel Wang–Landau: asymptotic behaviour On the convergence properties of the Wang-Landau Algorithm Robin J. Ryder CEREMADE, Universit´ Paris Dauphine e and CREST ENSAE – 31 January 2012 joint work with Pierre Jacob (National University of Singapore) and Pierre Del Moral (INRIA & Universit´ de Bordeaux) e R. Ryder (Dauphine) & CREST Wang-Landau 1/ 40
  • 2. Wang–Landau algorithm Flat Histogram in finite time Parallel Wang–Landau: asymptotic behaviour Outline 1 Wang–Landau algorithm 2 Flat Histogram in finite time 3 Parallel Wang–Landau: asymptotic behaviour R. Ryder (Dauphine) & CREST Wang-Landau 2/ 40
  • 3. Wang–Landau algorithm Flat Histogram in finite time Parallel Wang–Landau: asymptotic behaviour Motivation 0.5 0.4 0.3 density 0.2 0.1 0.0 −5 0 5 10 15 X Figure: A mixture of well-separated normal distributions. R. Ryder (Dauphine) & CREST Wang-Landau 3/ 40
  • 4. Wang–Landau algorithm Flat Histogram in finite time Parallel Wang–Landau: asymptotic behaviour Motivation 0.5 0.4 0.3 density 0.2 0.1 0.0 −5 0 5 10 15 X Figure: A mixture of well-separated normal distributions. R. Ryder (Dauphine) & CREST Wang-Landau 4/ 40
  • 5. Wang–Landau algorithm Flat Histogram in finite time Parallel Wang–Landau: asymptotic behaviour Motivation 0.5 0.4 0.3 density 0.2 0.1 0.0 −5 0 5 10 15 X Figure: A mixture of well-separated normal distributions. R. Ryder (Dauphine) & CREST Wang-Landau 5/ 40
  • 6. Wang–Landau algorithm Flat Histogram in finite time Parallel Wang–Landau: asymptotic behaviour Setting Partition the state space d X = Xi i=1 Desired frequencies φ = (φ1 , . . . , φd ) Penalized distribution π(x) πθ (x) ∝ θ(J(x)) where J(x) such that x ∈ XJ(x) . R. Ryder (Dauphine) & CREST Wang-Landau 6/ 40
  • 7. Wang–Landau algorithm Flat Histogram in finite time Parallel Wang–Landau: asymptotic behaviour First algorithm Algorithm 1 Wang-Landau with deterministic schedule (γt ) 1: Init θ0 > 0, X0 ∈ X . 2: for t = 1 to T do 3: Sample Xt from Kθt−1 (Xt−1 , ·), MH kernel targeting πθt−1 . 4: Update the penalties: log θt (i) ← log θt−1 (i) + γt (1 Xi (Xt ) − φi ) I 5: end for R. Ryder (Dauphine) & CREST Wang-Landau 7/ 40
  • 8. Wang–Landau algorithm Flat Histogram in finite time Parallel Wang–Landau: asymptotic behaviour Video R. Ryder (Dauphine) & CREST Wang-Landau 8/ 40
  • 9. Wang–Landau algorithm Flat Histogram in finite time Parallel Wang–Landau: asymptotic behaviour Video R. Ryder (Dauphine) & CREST Wang-Landau 8/ 40
  • 10. Wang–Landau algorithm Flat Histogram in finite time Parallel Wang–Landau: asymptotic behaviour Flat Histogram Issues with the first version Choice of γ has a huge impact on the results. Flat Histogram Define the counters: t νt (i) := 1 Xi (Xn ) I n=1 Flat Histogram (FH) is reached when: νt (i) max − φi < c i∈{1,...,d} t R. Ryder (Dauphine) & CREST Wang-Landau 9/ 40
  • 11. Wang–Landau algorithm Flat Histogram in finite time Parallel Wang–Landau: asymptotic behaviour Flat Histogram Idea Instead of decreasing γt at each time step t, decrease only when the Flat Histogram criterion is reached. In practice Denote by κt the number of FH criteria reached up to time t. Use γκt instead of γt at time t. If FH is reached at time t, reset νt (i) to 0 for all i. R. Ryder (Dauphine) & CREST Wang-Landau 10/ 40
  • 12. Wang–Landau algorithm Flat Histogram in finite time Parallel Wang–Landau: asymptotic behaviour Wang–Landau with Flat Histogram Algorithm 2 Wang-Landau with stochastic schedule (γκt ) 1: Init θ0 > 0, X0 ∈ X . 2: Init κ0 ← 0. 3: for t = 1 to T do 4: Sample Xt from Kθt−1 (Xt−1 , ·), MH kernel targeting πθt−1 . 5: If (FH) then κt ← κt−1 + 1, otherwise κt ← κt−1 . 6: Update the penalties: log θt (i) ← log θt−1 (i) + γκt (1 Xi (Xt ) − φi ) I 7: end for R. Ryder (Dauphine) & CREST Wang-Landau 11/ 40
  • 13. Wang–Landau algorithm Flat Histogram in finite time Parallel Wang–Landau: asymptotic behaviour FH is met in finite time To be sure that eventually, for any c > 0: νt (i) max − φi < c i∈{1,...,d} t we want to prove: νt (i) P ∀i ∈ {1, . . . , d} − − φi −→ t t→∞ for any fixed γ > 0. R. Ryder (Dauphine) & CREST Wang-Landau 12/ 40
  • 14. Wang–Landau algorithm Flat Histogram in finite time Parallel Wang–Landau: asymptotic behaviour Various updates Right update log θt (i) ← log θt−1 (i) + γ (1 Xi (Xt ) − φi ) I (1) Using this update, FH is met in finite time. Wrong update θt (i) ← θt−1 (i) [1 + γ (1 Xi (Xt ) − φi )] I ⇔ log θt (i) ← log θt−1 (i) + log [1 + γ (1 Xi (Xt ) − φi )] I (2) Here FH is not necessarily met in finite time. 1 Special case: ∀i φi = d . R. Ryder (Dauphine) & CREST Wang-Landau 13/ 40
  • 15. Wang–Landau algorithm Flat Histogram in finite time Parallel Wang–Landau: asymptotic behaviour Assumptions In this talk, there are only two bins: d = 2. Additionally: Assumption The bins are not empty with respect to µ and π: ∀i ∈ {1, 2} µ(Xi ) > 0 and π(Xi ) > 0 Assumption The state space X is compact. R. Ryder (Dauphine) & CREST Wang-Landau 14/ 40
  • 16. Wang–Landau algorithm Flat Histogram in finite time Parallel Wang–Landau: asymptotic behaviour Assumptions Assumption The proposal distribution Q(x, y ) is such that: ∃qmin > 0 ∀x ∈ X ∀y ∈ X Q(x, y ) > qmin Assumption The MH acceptance ratio is bounded from both sides: π(y ) Q(y , x) ∃m > 0 ∃M > 0 ∀x ∈ X ∀y ∈ X m< <M π(x) Q(x, y ) R. Ryder (Dauphine) & CREST Wang-Landau 15/ 40
  • 17. Wang–Landau algorithm Flat Histogram in finite time Parallel Wang–Landau: asymptotic behaviour Theorem Theorem Consider the sequence of penalties θt introduced in the WL algorithm. We define: θt (1) Zt = log = log θt (1) − log θt (2) θt (2) Then: Zt L1 −− 0 −→ t t→∞ and consequently, with update (1) (FH) is reached in finite time for any precision threshold c, whereas this is not guaranteed for update (2). R. Ryder (Dauphine) & CREST Wang-Landau 16/ 40
  • 18. Wang–Landau algorithm Flat Histogram in finite time Parallel Wang–Landau: asymptotic behaviour Consequence Using update (1), and starting from Z0 = 0: X Zt = log θt (1) 1 XX IX X − log θt (2) 1 XX IX R. Ryder (Dauphine) & CREST Wang-Landau 17/ 40
  • 19. Wang–Landau algorithm Flat Histogram in finite time Parallel Wang–Landau: asymptotic behaviour Consequence Using update (1), and starting from Z0 = 0: t X Zt = log θ0 (1) + γ I IX n=1 (1 X1 (Xn ) − φ1 ) 1 XX t X − log θ0 (2) + γ IX n=1 (1 X2 (Xn ) − φ2 ) 1 XX I R. Ryder (Dauphine) & CREST Wang-Landau 17/ 40
  • 20. Wang–Landau algorithm Flat Histogram in finite time Parallel Wang–Landau: asymptotic behaviour Consequence Using update (1), and starting from Z0 = 0: X Zt = γ(νt (1) − tφ1 ) 1 XX IX X −γ(νt (2) − tφ2 ) 1 XX IX R. Ryder (Dauphine) & CREST Wang-Landau 17/ 40
  • 21. Wang–Landau algorithm Flat Histogram in finite time Parallel Wang–Landau: asymptotic behaviour Consequence Using update (1), and starting from Z0 = 0: X Zt = γ(νt (1) − tφ1 ) 1 XX IX X −γ(t − νt (1) − t(1 − φ1 )) 1 XX IX R. Ryder (Dauphine) & CREST Wang-Landau 17/ 40
  • 22. Wang–Landau algorithm Flat Histogram in finite time Parallel Wang–Landau: asymptotic behaviour Consequence Using update (1), and starting from Z0 = 0: X Zt = γ(νt (1) − tφ1 ) 1 XX IX X +γ(νt (1) − tφ1 )) 1 XX IX R. Ryder (Dauphine) & CREST Wang-Landau 17/ 40
  • 23. Wang–Landau algorithm Flat Histogram in finite time Parallel Wang–Landau: asymptotic behaviour Consequence Using update (1), and starting from Z0 = 0: X Zt = 2γ(νt (1) − tφ1 ) 1 XX IX X 1 XX IX R. Ryder (Dauphine) & CREST Wang-Landau 17/ 40
  • 24. Wang–Landau algorithm Flat Histogram in finite time Parallel Wang–Landau: asymptotic behaviour Consequence Using update (1), and starting from Z0 = 0: νt (1) X Zt t = 2γ t − φ1 1 XX IX X 1 XX IX R. Ryder (Dauphine) & CREST Wang-Landau 17/ 40
  • 25. Wang–Landau algorithm Flat Histogram in finite time Parallel Wang–Landau: asymptotic behaviour Consequence Using update (1), and starting from Z0 = 0: νt (1) L X Zt t = 2γ t − φ1 − − 0 1 XX −1→ IX t→∞ νt (1) L X ⇒ t −1→ − − φ1 1 XX IX t→∞ R. Ryder (Dauphine) & CREST Wang-Landau 17/ 40
  • 26. Wang–Landau algorithm Flat Histogram in finite time Parallel Wang–Landau: asymptotic behaviour Consequence Using update (1), and starting from Z0 = 0: νt (1) L X Zt t = 2γ t − φ1 − − 0 1 XX −1→ IX t→∞ νt (1) L X ⇒ t −1→ − − φ1 1 XX IX t→∞ Using update (2), these simplifications do not occur: νt (i)/t converges, but not to φi in general. R. Ryder (Dauphine) & CREST Wang-Landau 17/ 40
  • 27. Wang–Landau algorithm Flat Histogram in finite time Parallel Wang–Landau: asymptotic behaviour Proof To prove that Zt L1 − − 0, −→ t t→∞ first notice that Zt+1 − Zt = 2γ (νt+1 (1) − νt (1) − φ1 ) can only take two different values, which we note +a and −b. R. Ryder (Dauphine) & CREST Wang-Landau 18/ 40
  • 28. Wang–Landau algorithm Flat Histogram in finite time Parallel Wang–Landau: asymptotic behaviour Proof Introduce Ut , the increment of Zt : Zt+1 = Zt + Ut Lemma With the introduced processes Zt and Ut , there exists > 0 such ¯ ¯ that for all η > 0, there exists Z hi such that, if Zt ≥ Z hi , we have the following two inequalities: P[Ut+1 = −b|Ut = +a, Zt ] > P[Ut+1 = −b|Ut = −b, Zt ] > 1 − η. (and the symmetric corollary) R. Ryder (Dauphine) & CREST Wang-Landau 19/ 40
  • 29. Wang–Landau algorithm Flat Histogram in finite time Parallel Wang–Landau: asymptotic behaviour Proof q q q q q q q q q q q q q q q q q q q q q q q q q q q q q Zs q q q q q q q q ~ ~ q q q q Zs+T q q q q q q q q q q q Zs+T q q q q q q q q q q q q q q q q q q q 5 10 15 20 25 30 35 time Figure: We prove that Zt returns below a given horizontal bar whenever it goes above it, and it does so in finite time. This implies Zt /t → 0. R. Ryder (Dauphine) & CREST Wang-Landau 20/ 40
  • 30. Wang–Landau algorithm Flat Histogram in finite time Parallel Wang–Landau: asymptotic behaviour Proof Introduce a crossing time s: Zs−1 ≤ Z hi and Zs ≥ Z hi . ˜ Introduce a new process Zt : ˜ ˜ ˜ Zs = Zs and Zt+1 = Zt + Vt ˜ We can choose Vt such that Zt ≥ Zt and Lemma (Vt ) is a Markov chain over the space {+a, −b} with transition matrix 1− η 1−η where the first state corresponds to +a and the second state to −b. R. Ryder (Dauphine) & CREST Wang-Landau 21/ 40
  • 31. Wang–Landau algorithm Flat Histogram in finite time Parallel Wang–Landau: asymptotic behaviour Proof q q q q q q q q q q q q q q q q q q q q q q q q q q q q q Zs q q q q q q q q ~ ~ q q q q Zs+T q q q q q q q q q q q Zs+T q q q q q q q q q q q q q q q q q q q 5 10 15 20 25 30 35 time Figure: We prove that Zt returns below a given horizontal bar whenever it goes above it, and it does so in finite time. This implies Zt /t → 0. R. Ryder (Dauphine) & CREST Wang-Landau 22/ 40
  • 32. Wang–Landau algorithm Flat Histogram in finite time Parallel Wang–Landau: asymptotic behaviour Case d = 2 This shows that Zt /t → 0, hence (FH) is met in finite time. For the case d > 2, we shall use only update 1. R. Ryder (Dauphine) & CREST Wang-Landau 23/ 40
  • 33. Wang–Landau algorithm Flat Histogram in finite time Parallel Wang–Landau: asymptotic behaviour Case d > 2 To prove that (FH) is met in finite time in the case with d > 2, we need an extra assumption: Assumption The desired frequencies are all rational numbers: ∀i, φi ∈ Q R. Ryder (Dauphine) & CREST Wang-Landau 24/ 40
  • 34. Wang–Landau algorithm Flat Histogram in finite time Parallel Wang–Landau: asymptotic behaviour Irreducible Markov chain Under the assumption of rationality, we have the following lemma: Lemma Let Θ be the following subset of Rd : d d d Θ = {z ∈ R : ∃(n1 , . . . , nd ) ∈ N zi = ni − φi nj } j=1 Then denoting by λ the product of the Lebesgue measure µ on X and of the counting measure on Θ, (Xt , log θt ) is λ-irreducible. Proof: B´zout’s lemma. e R. Ryder (Dauphine) & CREST Wang-Landau 25/ 40
  • 35. Wang–Landau algorithm Flat Histogram in finite time Parallel Wang–Landau: asymptotic behaviour A limit exists Since (Xt , log θt ) is λ-irreducible, the proportion of visits to any λ-measurable set of X × Θ converges to a limit in [0, 1]. Hence the vector (ν(i)/t) converges to some limit pi . We now need to prove that this limit corresponds to (FH). R. Ryder (Dauphine) & CREST Wang-Landau 26/ 40
  • 36. Wang–Landau algorithm Flat Histogram in finite time Parallel Wang–Landau: asymptotic behaviour Reductio ad absurdum Suppose (reductio ad absurdum) that p = φ. Then there exist i and j such that pi − φi < pj − φj and hence Ztj,i = −νt (i) + νt (j) + t(φi − φj ) ∼ t(−pi + φi + pj − φj ) → ∞ ˜ As before, we can construct Ut and Ut and show that Ut decreases on average, which contradicts the assumption that Ztj,i → ∞. R. Ryder (Dauphine) & CREST Wang-Landau 27/ 40
  • 37. Wang–Landau algorithm Flat Histogram in finite time Parallel Wang–Landau: asymptotic behaviour Conclusion for d > 2 Conclusion: using update 1, (FH) is met in finite time for any number of bins d. R. Ryder (Dauphine) & CREST Wang-Landau 28/ 40
  • 38. Wang–Landau algorithm Flat Histogram in finite time Parallel Wang–Landau: asymptotic behaviour Illustration 1.0 1.0 Proportions of visits to each bin Proportions of visits to each bin 0.5 0.8 0.8 0.4 0.6 0.6 density 0.3 0.4 0.4 0.2 0.1 0.2 0.2 0.0 0.0 0.0 −4 −2 0 2 4 0 1000 2000 3000 4000 5000 0 1000 2000 3000 4000 5000 X iterations iterations (a) Histogram of the gen- (b) Convergence of the (c) Convergence of the erated sample proportions of visits to proportions of visits to each bin, using the right each bin, using the wrong update update R. Ryder (Dauphine) & CREST Wang-Landau 29/ 40
  • 39. Wang–Landau algorithm Flat Histogram in finite time Parallel Wang–Landau: asymptotic behaviour Parallel chains (1) (N) N chains (Xt , . . . , Xt ) instead of one. targeting the same biased distribution πθt at iteration t, sharing the same estimated bias θt at iteration t. Old update of the penalties X log θt (i) ← log θt−1 (i) + γ (1 Xi (Xt ) − φi ) 1 XX I IX (L. Bornn, P. Jacob, P. Del Moral, A. Doucet) R. Ryder (Dauphine) & CREST Wang-Landau 30/ 40
  • 40. Wang–Landau algorithm Flat Histogram in finite time Parallel Wang–Landau: asymptotic behaviour Parallel chains (1) (N) N chains (Xt , . . . , Xt ) instead of one. targeting the same biased distribution πθt at iteration t, sharing the same estimated bias θt at iteration t. New update of the penalties N (k) X log θt (i) ← log θt−1 (i) + γ 1 N k=1 1 Xi (Xt ) I − φi 1 XX IX (L. Bornn, P. Jacob, P. Del Moral, A. Doucet) R. Ryder (Dauphine) & CREST Wang-Landau 30/ 40
  • 41. Wang–Landau algorithm Flat Histogram in finite time Parallel Wang–Landau: asymptotic behaviour Infinite number of chains Suppose φi = 1/d. We can use either update and choose the “wrong” update N 1 (k) 1 θt (i) ← θt−1 (i) 1 + γ 1 Xi (Xt ) − I N d k=1 (k) iid Note that if (Xt )N ∼ πθt then k=1 N −1 1 (k) a.s. ψi ψk 1 Xi (Xt ) I −−→ −− πθt (x)dx = N N→∞ Xi θt−1 (i) θ(k) k=1 k R. Ryder (Dauphine) & CREST Wang-Landau 31/ 40
  • 42. Wang–Landau algorithm Flat Histogram in finite time Parallel Wang–Landau: asymptotic behaviour This corresponds to update 1 θt (i) ← θt−1 (i) 1 + γ πθt (x)dx − X d   i  −1 ψi ψk 1  ⇔ θt (i) ← θt−1 (i) 1 + γ  − θt−1 (i) θ(k) d k Normalize: θt (i) θt (i) ← j θt (j) Then the penalties (θt )t≥0 should converge towards ψ. R. Ryder (Dauphine) & CREST Wang-Landau 32/ 40
  • 43. Wang–Landau algorithm Flat Histogram in finite time Parallel Wang–Landau: asymptotic behaviour ψi ψk −1 XX θt−1 (i) 1+γ θt−1 (i) ( k θ(k) ) 1 −d I 1 XX θt (i) ← XX ψj ψk −1 1 d j=1 θt−1 (j) 1+γ θt−1 (j) ( k θ(k) −d) I 1 XX R. Ryder (Dauphine) & CREST Wang-Landau 33/ 40
  • 44. Wang–Landau algorithm Flat Histogram in finite time Parallel Wang–Landau: asymptotic behaviour ψi XX θt−1 (i) 1+γ −1 1 θt−1 (i) Mψ (θt−1 )− d I 1 XX θt (i) ← ψj XX d j=1 θt−1 (j) 1+γ −1 1 θt−1 (j) Mψ (θt−1 )− d I 1 XX R. Ryder (Dauphine) & CREST Wang-Landau 33/ 40
  • 45. Wang–Landau algorithm Flat Histogram in finite time Parallel Wang–Landau: asymptotic behaviour XX −1 I θt−1 (i)(1− γ ) + ψi γMψ (θt−1 ) 1 XX d θt (i) ← XX d γ j=1 θt−1 (j)(1− d ) −1 I + ψj γMψ (θt−1 ) 1 XX R. Ryder (Dauphine) & CREST Wang-Landau 33/ 40
  • 46. Wang–Landau algorithm Flat Histogram in finite time Parallel Wang–Landau: asymptotic behaviour XX −1 I θt−1 (i)(1− γ ) + ψi γMψ (θt−1 ) 1 XX d θt (i) ← XX −1 I (1− γ )+γMψ (θt−1 ) 1 XX d R. Ryder (Dauphine) & CREST Wang-Landau 33/ 40
  • 47. Wang–Landau algorithm Flat Histogram in finite time Parallel Wang–Landau: asymptotic behaviour 1− γ γM −1 (θt−1 ) ψ θt (i) ← θt−1 (i) 1− γ +γM −1 (θ d + ψi 1− γ +γM −1 (θ d ψ t−1 ) d ψ t−1 ) R. Ryder (Dauphine) & CREST Wang-Landau 33/ 40
  • 48. Wang–Landau algorithm Flat Histogram in finite time Parallel Wang–Landau: asymptotic behaviour θt (i) ← θt−1 (i) × (1 − α(θt−1 )) + ψi × α(θt−1 ) R. Ryder (Dauphine) & CREST Wang-Landau 33/ 40
  • 49. Wang–Landau algorithm Flat Histogram in finite time Parallel Wang–Landau: asymptotic behaviour Hence the update can be written as a convex combination θt (i) ← θt−1 (i) × (1 − α(θt−1 )) + ψi × α(θt−1 ) where −1 γMψ (θt−1 ) α(θt−1 ) = γ −1 1− d + γMψ (θt−1 ) It can easily be shown that ∀t ≥ 0 0 < αmin ≤ α(θt ) ≤ αmax < 1 provided that ∀i, θ0 (i) > 0, ψ(i) > 0. iid Of course the assumptions N → ∞ and ∼ sampling are strong. What happens in practice? R. Ryder (Dauphine) & CREST Wang-Landau 34/ 40
  • 50. Wang–Landau algorithm Flat Histogram in finite time Parallel Wang–Landau: asymptotic behaviour Animation Toy example X = {1, 2, 3, 4, 5}, π = (π1 , . . . , π5 ) We take the partition: Xi = {i} and launch the Wang–Landau algorithm for N = 1, 10, 100 and γ = 1. In this case we know the value of ψi : ψi = πi , so we can compute the deterministic sequence of penalties corresponding to N = ∞. Animation R. Ryder (Dauphine) & CREST Wang-Landau 35/ 40
  • 51. Wang–Landau algorithm Flat Histogram in finite time Parallel Wang–Landau: asymptotic behaviour Animation N=∞ R. Ryder (Dauphine) & CREST Wang-Landau 36/ 40
  • 52. Wang–Landau algorithm Flat Histogram in finite time Parallel Wang–Landau: asymptotic behaviour Animation N = 100 R. Ryder (Dauphine) & CREST Wang-Landau 36/ 40
  • 53. Wang–Landau algorithm Flat Histogram in finite time Parallel Wang–Landau: asymptotic behaviour Animation N = 10 R. Ryder (Dauphine) & CREST Wang-Landau 36/ 40
  • 54. Wang–Landau algorithm Flat Histogram in finite time Parallel Wang–Landau: asymptotic behaviour Animation N=1 R. Ryder (Dauphine) & CREST Wang-Landau 36/ 40
  • 55. Wang–Landau algorithm Flat Histogram in finite time Parallel Wang–Landau: asymptotic behaviour Animation “ R. Ryder (Dauphine) & CREST Wang-Landau 37/ 40
  • 56. Wang–Landau algorithm Flat Histogram in finite time Parallel Wang–Landau: asymptotic behaviour New take on the Parallel Wang–Landau Call fψ the function such that: θt ← fψ (θt−1 ) As we saw: fψ (θ) = θ(1 − α(θ)) + ψα(θ)) where −1 γMψ (θ) α(θ) = γ −1 1− d + γMψ (θ) R. Ryder (Dauphine) & CREST Wang-Landau 38/ 40
  • 57. Wang–Landau algorithm Flat Histogram in finite time Parallel Wang–Landau: asymptotic behaviour New take on the Parallel Wang–Landau (in progress) In the ideal case we have n θn = fψ ◦ . . . ◦ fψ (θ0 ) = fψ (θ0 ) ×n If we add perturbations to ψ we also add perturbations to θ, as follows: ε θ1 = fψ (θ0 ) = fψ (θ0 ) + V0 R. Ryder (Dauphine) & CREST Wang-Landau 39/ 40
  • 58. Wang–Landau algorithm Flat Histogram in finite time Parallel Wang–Landau: asymptotic behaviour New take on the Parallel Wang–Landau (in progress) In the ideal case we have n θn = fψ ◦ . . . ◦ fψ (θ0 ) = fψ (θ0 ) ×n If we add perturbations to ψ we also add perturbations to θ, as follows: ε θ1 = fψ (θ0 ) = fψ (θ0 ) + V0 ε ε 2 θ2 = fψ (θ1 ) = fψ (fψ (θ0 ) + V0 ) = fψ (θ0 ) + V1 ... R. Ryder (Dauphine) & CREST Wang-Landau 39/ 40
  • 59. Wang–Landau algorithm Flat Histogram in finite time Parallel Wang–Landau: asymptotic behaviour New take on the Parallel Wang–Landau (in progress) In the ideal case we have n θn = fψ ◦ . . . ◦ fψ (θ0 ) = fψ (θ0 ) ×n If we add perturbations to ψ we also add perturbations to θ, as follows: ε θ1 = fψ (θ0 ) = fψ (θ0 ) + V0 ε ε 2 θ2 = fψ (θ1 ) = fψ (fψ (θ0 ) + V0 ) = fψ (θ0 ) + V1 ... The properties of fψ should help control the noise terms (Vn )n≥0 . . . R. Ryder (Dauphine) & CREST Wang-Landau 39/ 40
  • 60. Wang–Landau algorithm Flat Histogram in finite time Parallel Wang–Landau: asymptotic behaviour Bibliography Atchad´, Y. and Liu, J. (2010). The Wang-Landau algorithm e in general state spaces: applications and convergence analysis. Statistica Sinica, 20:209–233. Wang, F. and Landau, D. (2001). Determining the density of states for classical statistical models: A random walk algorithm to produce a flat histogram. Physical Review E, 64(5):56101. An Adaptive Interacting Wang-Landau Algorithm for Automatic Density Exploration, L. Bornn, P. Jacob, P. Del Moral, A. Doucet, on arXiv. The Wang-Landau algorithm reaches the Flat Histogram criterion in finite time, P. Jacob & RR, on arXiv. R. Ryder (Dauphine) & CREST Wang-Landau 40/ 40