            A Short History of Markov Chain Monte Carlo:
            Subjective Recollections from Incomplete Data

                             Christian P. Robert and George Casella

                                   Université Paris-Dauphine, IUF, & CREST
                                            and University of Florida


                                                   April 2, 2011




       In memoriam, Julian Besag, 1945–2010
  Introduction


Introduction



       Markov Chain Monte Carlo (MCMC) methods have been around for
       almost as long as Monte Carlo techniques, even though their
       impact on Statistics was not truly felt until the late 1980s /
       early 1990s.
       Contents: distinction between Metropolis-Hastings based
       algorithms and those related to Gibbs sampling, and a brief entry
       into the "second-generation MCMC revolution".
A few landmarks


       The realization that Markov chains could be used in a wide variety
       of situations only came to "mainstream statisticians" with Gelfand
       and Smith (1990), despite earlier publications in the statistical
       literature like Hastings (1970) and growing awareness in spatial
       statistics (Besag, 1986).
       Several reasons:
                 lack of computing machinery
                 lack of background on Markov chains
                 lack of trust in the practicality of the method
  Before the revolution
     Los Alamos


Bombs before the revolution


    Monte Carlo methods were born in Los Alamos, New Mexico, during
    WWII, mostly from physicists working on atomic bombs, eventually
    producing the Metropolis algorithm in the early 1950s.

                  [Metropolis, Rosenbluth, Rosenbluth, Teller and Teller, 1953]
Monte Carlo genesis

    The Monte Carlo method is usually traced to Ulam and von Neumann:
           Stanislaw Ulam associated the idea with an intractable
           combinatorial computation he attempted in 1946 about
           solitaire
           The idea was enthusiastically adopted by John von Neumann
           for implementation on neutron diffusion
           The name "Monte Carlo" was suggested by Nicholas Metropolis
                                                                                          [Eckhardt, 1987]
Monte Carlo with computers


       Very close "coincidence" with the appearance of the very first
       computer, ENIAC, born Feb. 1946, on which von Neumann
       implemented Monte Carlo in 1947.
       The same year, Ulam and von Neumann (re)invented the inversion
       and accept-reject techniques.
       In 1949, the very first symposium on Monte Carlo and the very
       first paper appeared.
                                           [Metropolis and Ulam, 1949]
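
       As an aside, the accept-reject idea is easy to state: to sample from
       a target density f, draw from a proposal g satisfying f ≤ M g and
       accept with probability f/(M g). Below is a minimal Python sketch
       (my illustration, not from the talk) for a standard normal target
       under a Cauchy envelope, where the bound M = sqrt(2π/e) is attained
       at x = ±1.

# A minimal accept-reject sketch: sample a standard normal target f
# using a standard Cauchy envelope g, for which f/g is bounded by M.
import numpy as np

rng = np.random.default_rng(0)

def f(x):   # standard normal density (target)
    return np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)

def g(x):   # standard Cauchy density (proposal/envelope)
    return 1 / (np.pi * (1 + x**2))

M = np.sqrt(2 * np.pi / np.e)   # sup_x f(x)/g(x), attained at x = +/- 1

def accept_reject(n):
    out = []
    while len(out) < n:
        x = rng.standard_cauchy()
        if rng.uniform() <= f(x) / (M * g(x)):   # accept with prob f/(Mg)
            out.append(x)
    return np.array(out)

sample = accept_reject(10_000)
print(sample.mean(), sample.std())   # close to 0 and 1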
  Before the revolution
     Metropolis et al., 1953


The Metropolis et al. (1953) paper


    The very first MCMC algorithm is associated with the second
    computer, MANIAC, built in Los Alamos in early 1952.
    Besides Metropolis, Arianna W. Rosenbluth, Marshall N. Rosenbluth,
    Augusta H. Teller, and Edward Teller contributed to creating the
    Metropolis algorithm...
Motivating problem

       Computation of integrals of the form
       $$ I = \frac{\int F(p,q)\,\exp\{-E(p,q)/kT\}\,\mathrm{d}p\,\mathrm{d}q}
                   {\int \exp\{-E(p,q)/kT\}\,\mathrm{d}p\,\mathrm{d}q}\,, $$
       with the energy E defined as
       $$ E(p,q) = \frac{1}{2}\sum_{i=1}^{N}\sum_{\substack{j=1\\ j\neq i}}^{N} V(d_{ij})\,, $$
       where N is the number of particles, V a potential function, and
       d_{ij} the distance between particles i and j.
Boltzmann distribution



       The Boltzmann distribution exp{−E(p, q)/kT} is parameterised by
       the temperature T, k being the Boltzmann constant, with a
       normalisation factor
       $$ Z(T) = \int \exp\{-E(p,q)/kT\}\,\mathrm{d}p\,\mathrm{d}q $$
       not available in closed form.
Computational challenge



       Since p and q are 2N-dimensional vectors, numerical integration is
       impossible.
       Moreover, standard Monte Carlo techniques fail to correctly
       approximate I: exp{−E(p, q)/kT} is very small for most
       realizations of random configurations (p, q) of the particle system.
Metropolis algorithm

       Consider a random walk modification of the N particles: for each
       1 ≤ i ≤ N, values
       $$ x_i' = x_i + \alpha\,\xi_{1i} \quad\text{and}\quad y_i' = y_i + \alpha\,\xi_{2i} $$
       are proposed, where both ξ1i and ξ2i are uniform U(−1, 1). The
       energy difference between the new and the previous configuration is
       ∆E, and the new configuration is accepted with probability
       $$ 1 \wedge \exp\{-\Delta E/kT\}\,, $$
       and otherwise the previous configuration is replicated*

           *
            counting one more time in the average of the F(p_t, q_t)'s over the τ moves
       of the random walk.
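
       The 1953 scheme translates almost line for line into modern code.
       Here is a hedged Python sketch (my transcription, certainly not the
       MANIAC program), on a toy one-dimensional system with energy
       E(x) = x²/2 and kT = 1, so that the target exp{−E} is a standard
       normal density:

# Random-walk Metropolis in the spirit of Metropolis et al. (1953),
# on a toy one-dimensional "system" with energy E(x) = x^2/2, kT = 1.
import numpy as np

rng = np.random.default_rng(1)

def E(x):                    # energy function (toy choice, my assumption)
    return x**2 / 2

alpha, kT = 1.0, 1.0         # scale of the uniform random-walk move
x, chain = 0.0, []
for t in range(50_000):
    prop = x + alpha * rng.uniform(-1, 1)        # uniform U(-1,1) move
    dE = E(prop) - E(x)                          # energy difference
    if rng.uniform() <= min(1.0, np.exp(-dE / kT)):
        x = prop                                 # accept the move
    chain.append(x)                              # on rejection, replicate x

chain = np.array(chain)
print(chain.mean(), chain.var())                 # about 0 and 1 for N(0,1)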
Convergence

       Validity of the algorithm established by proving
         1. irreducibility
         2. ergodicity, that is, convergence to the stationary distribution.
       The second part is obtained via a discretization of the space:
       Metropolis et al. note that the proposal is reversible, then
       establish that exp{−E/kT} is invariant.
       Application to the specific problem of the rigid-sphere collision
       model. The number of iterations of the Metropolis algorithm seems
       limited: 16 steps for burn-in and 48 to 64 subsequent iterations
       (which still required four to five hours on the Los Alamos
       MANIAC).
Physics and chemistry

              The method of Markov chain Monte Carlo immediately
              had wide use in physics and chemistry.
                                                                        [Geyer & Thompson, 1992]

              Statistics has always been fuelled by energetic mining of
              the physics literature.
                                                                                             [Clifford, 1993]


              Hammersley and Handscomb, 1967
              Piekaar and Clarenburg, 1967
              Kennedy and Kuti, 1985
              Sokal, 1989
              &tc...
  Before the revolution
     Hastings, 1970


A fair generalisation


       In Biometrika 1970, Hastings defines the MCMC methodology for
       finite and reversible Markov chains, the continuous case being
       discretised.
       The generic acceptance probability for a move from state i to
       state j is
       $$ \alpha_{ij} = \frac{s_{ij}}{1 + \dfrac{\pi_i\, q_{ij}}{\pi_j\, q_{ji}}}\,, $$
       where s_{ij} is a symmetric function.
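
       To make the generic form concrete: writing r = πj qji /(πi qij ),
       the symmetric choice sij = 1 + min(r, 1/r) recovers Metropolis'
       acceptance 1 ∧ r, while sij = 1 gives Barker's r/(1 + r). A small
       Python check of this (my illustration, not Hastings' notation
       verbatim):

# Hastings' generic acceptance alpha_ij = s_ij / (1 + pi_i q_ij / (pi_j q_ji)),
# specialised to the Metropolis and Barker choices of the symmetric s_ij.

def hastings_alpha(pi_i, pi_j, q_ij, q_ji, s_ij):
    return s_ij / (1 + (pi_i * q_ij) / (pi_j * q_ji))

def alpha_metropolis(pi_i, pi_j, q_ij, q_ji):
    r = (pi_j * q_ji) / (pi_i * q_ij)
    return hastings_alpha(pi_i, pi_j, q_ij, q_ji, s_ij=1 + min(r, 1 / r))

def alpha_barker(pi_i, pi_j, q_ij, q_ji):
    return hastings_alpha(pi_i, pi_j, q_ij, q_ji, s_ij=1.0)

# sanity check against the familiar closed forms
pi_i, pi_j, q_ij, q_ji = 0.2, 0.5, 0.3, 0.1
r = (pi_j * q_ji) / (pi_i * q_ij)
assert abs(alpha_metropolis(pi_i, pi_j, q_ij, q_ji) - min(1, r)) < 1e-12
assert abs(alpha_barker(pi_i, pi_j, q_ij, q_ji) - r / (1 + r)) < 1e-12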
State of the art


       Note
       Generic form that encompasses both Metropolis et al. (1953) and
       Barker (1965).
       Peskun's ordering not yet discovered: Hastings mentions that little
       is known about the relative merits of those two choices, even
       though Metropolis's method may be preferable.
       He warns against high rejection rates as indicative of a poor
       choice of transition matrix, but does not mention the opposite
       pitfall of low rejection rates.
What else?!
       Items included in the paper are
              a Poisson target with a ±1 random walk proposal,
              a normal target with a uniform random walk proposal mixed with its
              reflection (i.e. centered at −X(t) rather than X(t)),
              a multivariate target where Hastings introduces Gibbs sampling,
              updating one component at a time and defining the composed
              transition as satisfying the stationarity condition because each
              component leaves the target invariant,
              a reference to Ehrman, Fosdick and Handscomb (1960) as a
              preliminary, if specific, instance of this Metropolis-within-Gibbs
              sampler,
              an importance sampling version of MCMC,
              some remarks about error assessment,
              a Gibbs sampler for random orthogonal matrices.
Three years later

       Peskun (1973) compares Metropolis' and Barker's acceptance
       probabilities and shows (again in a discrete setup) that Metropolis'
       is optimal (in terms of the asymptotic variance of any empirical
       average).
       The proof is a direct consequence of Kemeny and Snell (1960) on
       asymptotic variance. Peskun also establishes that this variance can
       improve upon the iid case if and only if the eigenvalues of P − A
       are all negative, where A is the transition matrix corresponding to
       iid simulation and P the transition matrix corresponding to the
       Metropolis algorithm, but he concludes that the trace of P − A is
       always positive.
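
       Peskun's comparison is easy to reproduce numerically. The sketch
       below (my toy construction, not Peskun's) builds Metropolis and
       Barker chains for the same discrete target and symmetric proposal,
       and evaluates the asymptotic variance of an ergodic average through
       the Kemeny and Snell fundamental matrix Z = (I − P + 1π⊤)⁻¹:

# Numerical check of Peskun's ordering on a small discrete state space:
# same target pi and symmetric uniform proposal, Metropolis vs Barker.
import numpy as np

pi = np.array([1.0, 2.0, 3.0, 4.0]); pi /= pi.sum()   # toy target
n = len(pi)
q = np.full((n, n), 1.0 / n)                          # symmetric proposal

def transition(accept):
    P = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if i != j:
                P[i, j] = q[i, j] * accept(pi[i], pi[j])
        P[i, i] = 1 - P[i].sum()                      # rejection mass
    return P

P_met = transition(lambda pi_i, pi_j: min(1.0, pi_j / pi_i))
P_bar = transition(lambda pi_i, pi_j: pi_j / (pi_i + pi_j))

def asymptotic_variance(P, f):
    f = f - pi @ f                                    # center f under pi
    Z = np.linalg.inv(np.eye(n) - P + np.outer(np.ones(n), pi))
    return f @ np.diag(pi) @ (2 * Z - np.eye(n)) @ f  # Kemeny-Snell formula

f = np.arange(n, dtype=float)                         # f(x) = x
print("Metropolis:", asymptotic_variance(P_met, f))
print("Barker:    ", asymptotic_variance(P_bar, f))   # larger, per Peskun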
  Before the revolution
     Julian's early works


Julian's early works (1)


       In the early 1970s, Hammersley, Clifford, and Besag were working
       on the specification of joint distributions from conditional
       distributions and on necessary and sufficient conditions for the
       conditional distributions to be compatible with a joint
       distribution.
                                                                [Hammersley and Clifford, 1971]
       What is the most general form of the conditional probability
       functions that define a coherent joint function? And what will the
       joint look like?
                                                           [Besag, 1972]
Hammersley-Clifford theorem



       Theorem (Hammersley-Clifford)
       The joint distribution of a vector associated with a dependence
       graph must be represented as a product of functions over the
       cliques of the graph, i.e., of functions depending only on the
       components indexed by the labels in the clique.

       Theorem (Hammersley-Clifford)
       A probability distribution P with positive and continuous density f
       satisfies the pairwise Markov property with respect to an undirected
       graph G if and only if it factorizes according to G, i.e., (F) ≡ (G).

       Theorem (Hammersley-Clifford)
       Under the positivity condition, the joint distribution g satisfies
       $$ g(y_1,\ldots,y_p) \propto \prod_{j=1}^{p}
          \frac{g_{\ell_j}(y_{\ell_j}\mid y_{\ell_1},\ldots,y_{\ell_{j-1}},
                y'_{\ell_{j+1}},\ldots,y'_{\ell_p})}
               {g_{\ell_j}(y'_{\ell_j}\mid y_{\ell_1},\ldots,y_{\ell_{j-1}},
                y'_{\ell_{j+1}},\ldots,y'_{\ell_p})} $$
       for every permutation ℓ on {1, 2, . . . , p} and every y′ ∈ Y.

                                                                 [Cressie, 1993; Lauritzen, 1996]
An apocryphal theorem


       The Hammersley-Clifford theorem was never published by its
       authors, but only through Grimmett (1973), Preston (1973),
       Sherman (1973), and Besag (1974). The authors were dissatisfied
       with the positivity constraint: the joint density could only be
       recovered from the full conditionals when the support of the joint
       was the product of the supports of the full conditionals (with
       obvious counter-examples). Moussouris' counter-example put a full
       stop to their endeavors.
                                                        [Hammersley, 1974]
To Gibbs or not to Gibbs?

       Julian Besag should certainly be credited to a large extent with
       the (re?-)discovery of the Gibbs sampler.
                   The simulation procedure is to consider the sites
               cyclically and, at each stage, to amend or leave unaltered
               the particular site value in question, according to a
               probability distribution whose elements depend upon the
               current value at neighboring sites (...) However, the
               technique is unlikely to be particularly helpful in many
               other than binary situations and the Markov chain itself
               has no practical interpretation.
                                                                                             [Besag, 1974]
Broader perspective

       In 1964, Hammersley and Handscomb wrote a (the first?)
       textbook on Monte Carlo methods. They cover such topics as
               "Crude Monte Carlo";
               importance sampling;
               control variates; and
               "Conditional Monte Carlo", which looks surprisingly like a
               missing-data Gibbs completion approach.
       They state in the Preface:
               We are convinced nevertheless that Monte Carlo methods
               will one day reach an impressive maturity.
Clicking in

       After Peskun (1973), MCMC lay mostly dormant in the mainstream
       statistical world for about 10 years; then several papers/books
       highlighted its usefulness in specific settings:
               Geman and Geman (1984)
               Besag (1986)
               Strauss (1986)
               Ripley (Stochastic Simulation, 1987)
               Tanner and Wong (1987)
               Younes (1988)
Enter the Gibbs sampler


       Geman and Geman (1984), building on Metropolis et al. (1953),
       Hastings (1970), and Peskun (1973), constructed a Gibbs sampler
       for optimisation in a discrete image processing problem without
       completion.
       They are responsible for the name Gibbs sampling, because the
       method was used for the Bayesian study of Gibbs random fields,
       linked to the physicist Josiah Willard Gibbs (1839-1903).
       Back to Metropolis et al. (1953): the Gibbs sampler is used as a
       simulated annealing algorithm and ergodicity is proven on the
       collection of global maxima.



Besag (1986) integrates GS for SA...


               ...easy to construct the transition matrix Q, of a discrete
               time Markov chain, with state space Ω and limit
               distribution (4). Simulated annealing proceeds by
               running an associated time inhomogeneous Markov chain
               with transition matrices QT , where T is progressively
               decreased according to a prescribed “schedule” to a value
               close to zero.
                                                                                             [Besag, 1986]



...and links with Metropolis-Hastings...

               There are various related methods of constructing a
               manageable QT (Hastings, 1970). Geman and Geman
               (1984) adopt the simplest, which they term the ”Gibbs
               sampler” (...) time reversibility, a common ingredient in
               this type of problem (see, for example, Besag, 1977a), is
               present at individual stages but not over complete cycles,
               though Peter Green has pointed out that it returns if QT
               is taken over a pair of cycles, the second of which visits
               pixels in reverse order
                                                                                             [Besag, 1986]

...seeing the larger picture,...
               As Geman and Geman (1984) point out, any property of
               the (posterior) distribution P (x|y) can be simulated by
               running the Gibbs sampler at "temperature" T = 1.
               Thus, if x̂i maximizes P (xi |y), then it is the most
               frequently occurring colour at pixel i in an infinite
               realization of the Markov chain with transition matrix Q
               of Section 2.3. The x̂i 's can therefore be simultaneously
               estimated from a single finite realization of the chain. It
               is not yet clear how long the realization needs to be,
               particularly for estimation near colour boundaries, but the
               amount of computation required is generally prohibitive
               for routine purposes
                                                                                             [Besag, 1986]



...seeing the larger picture,...
               P (x|y) can be simulated using the Gibbs sampler, as
               suggested by Grenander (1983) and by Geman and
               Geman (1984). My dismissal of such an approach for
               routine applications was somewhat cavalier:
               purpose-built array processors could become relatively
               inexpensive (...) suppose that, for 100 complete cycles
               say, images have been collected from the Gibbs sampler
               (or by Metropolis’ method), following a “settling-in”
               period of perhaps another 100 cycles, which should cater
               for fairly intricate priors (...) These 100 images should
               often be adequate for estimating properties of the
               posterior (...) and for making approximate associated
               confidence statements, as mentioned by Mr Haslett.
                                                                                             [Besag, 1986]
...if not going fully Bayes!

               ...a neater and more efficient procedure [for parameter
               estimation] is to adopt maximum "pseudo-likelihood"
               estimation (Besag, 1975)

               I have become increasingly enamoured with the Bayesian
               paradigm
                                                                                             [Besag, 1986]

               The pair (xi , βi ) is then a (bivariate) Markov field and
               can be reconstructed as a bivariate process by the
               methods described in Professor Besag's paper.
                                                                                            [Clifford, 1986]

               The simulation-based estimator Epost Ψ(X) will differ
               from the m.a.p. estimator Ψ̂(x).
                                                                                        [Silverman, 1986]
Discussants of Besag (1986)




       Impressive who's who: D.M. Titterington, P. Clifford, P. Green, P.
       Brown, B. Silverman, F. Critchley, F. Kelly, K. Mardia, C.
       Jennison, J. Kent, D. Spiegelhalter, H. Wynn, D. and S. Geman, J.
       Haslett, J. Kay, H. Künsch, P. Switzer, B. Torsney, &tc



A comment on Besag (1986)

               While special purpose algorithms will determine the
               utility of the Bayesian methods, the general purpose
               methods-stochastic relaxation and simulation of solutions
               of the Langevin equation (Grenander, 1983; Geman and
               Geman, 1984; Gidas, 1985a; Geman and Hwang, 1986)
               have proven enormously convenient and versatile. We are
               able to apply a single computer program to every new
               problem by merely changing the subroutine that
               computes the energy function in the Gibbs representation
               of the posterior distribution.
                                                                      [Geman and McClure, 1986]



Another one

               It is easy to compute exact marginal and joint posterior
               probabilities of currently unobserved features, conditional
               on those clinical findings currently available
               (Spiegelhalter, 1986a,b), the updating taking the form of
               ‘propagating evidence’ through the network (...) it would
               be interesting to see if the techniques described tonight,
               which are of intermediate complexity, may have any
               applications in this new and exciting area [causal
               networks].

                                                                                   [Spiegelhalter, 1986]
The candidate's formula
       Representation of the marginal likelihood as
       $$ m(x) = \frac{\pi(\theta)\, f(x\mid\theta)}{\pi(\theta\mid x)} $$
       or of the marginal predictive as
       $$ p_n(y'\mid y) = f(y'\mid\theta)\,\pi_n(\theta\mid y)\big/\pi_{n+1}(\theta\mid y, y') $$
                                                                                             [Besag, 1989]

       Why candidate?
       "Equation (2) appeared without explanation in a Durham
       University undergraduate final examination script of 1984.
       Regrettably, the student's name is no longer known to me."
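
       The formula is an immediate consequence of Bayes' theorem and can
       be checked numerically. A minimal sketch on a conjugate normal
       model (my example, not Besag's): x|θ ∼ N(θ, 1) and θ ∼ N(0, 1), so
       that π(θ|x) is N(x/2, 1/2) and the true marginal m(x) is the
       N(0, 2) density:

# Numerical check of the candidate's formula
#   m(x) = pi(theta) f(x | theta) / pi(theta | x)
# on x | theta ~ N(theta, 1), theta ~ N(0, 1): theta | x ~ N(x/2, 1/2)
# and the exact marginal is x ~ N(0, 2).
from scipy.stats import norm

x = 1.3
for theta in (-1.0, 0.0, 0.7, 2.5):                 # any theta works
    m = (norm.pdf(theta, 0, 1) * norm.pdf(x, theta, 1)
         / norm.pdf(theta, x / 2, 0.5 ** 0.5))
    print(theta, m, norm.pdf(x, 0, 2 ** 0.5))       # identical values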
Implications



               Newton and Raftery (1994) used this representation to derive
               the [infamous] harmonic mean approximation to the marginal
               likelihood
               Gelfand and Dey (1994) also relied on this formula for the
               same purpose, in a more general perspective
               Geyer and Thompson (1995) derived MLEs by a Monte Carlo
               approximation to the normalising constant
               Chib (1995) uses this representation to build an MCMC
               approximation to the marginal likelihood
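
       For illustration, here is a sketch of the Newton and Raftery
       estimator on the same conjugate normal toy model as above,
       m(x) ≈ [M⁻¹ Σm 1/f(x|θm)]⁻¹ with the θm drawn from the posterior;
       its replicates fluctuate markedly (its variance is in fact infinite
       in this model), which is why "infamous" stuck:

# Harmonic mean estimator of the marginal likelihood (Newton and
# Raftery, 1994) on x | theta ~ N(theta, 1), theta ~ N(0, 1).
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(2)
x = 1.3
true_m = norm.pdf(x, 0, np.sqrt(2))                  # exact marginal N(0, 2)

for rep in range(5):
    theta = rng.normal(x / 2, np.sqrt(0.5), size=100_000)  # posterior draws
    est = 1 / np.mean(1 / norm.pdf(x, theta, 1))
    print(f"harmonic mean: {est:.4f}   truth: {true_m:.4f}")
# replicates wobble: the estimator is consistent but heavy-tailed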
  The Revolution
     Final steps to


Impact


       "This is surely a revolution."
                                                                                             [Clifford, 1993]
       "[Gibbs sampler] use seems to have been isolated in the spatial
       statistics community until Gelfand and Smith (1990)"
                                                            [Geyer, 1990]
       Geman and Geman (1984) is one more spark that led to the
       explosion, as it had a clear influence on Gelfand, Green, Smith,
       Spiegelhalter and others.
       It sparked new interest in Bayesian methods, statistical computing,
       algorithms, and stochastic processes through the use of computing
       algorithms such as the Gibbs sampler and the Metropolis–Hastings
       algorithm.
Data augmentation
       Tanner and Wong (1987) has essentially the same ingredients as
       Gelfand and Smith (1990): simulating from the conditionals is
       simulating from the joint.
       Lower impact:
               emphasis on missing data problems (hence data augmentation)
               MCMC approximation to the target at every iteration,
               $$ \pi(\theta\mid x) \approx \frac{1}{K}\sum_{k=1}^{K}
                  \pi(\theta\mid x, z^{(t,k)})\,, \qquad
                  z^{(t,k)} \sim \hat\pi_{t-1}(z\mid x) $$
               too close to Rubin's (1978) multiple imputation
               theoretical backup based on functional analysis (the Markov
               kernel had to be uniformly bounded and equicontinuous)
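
       As a toy transcription of the two-stage idea (my example, not
       Tanner and Wong's): for a two-component normal mixture with known
       components and unknown weight p, data augmentation alternates
       imputation of the allocations z with a conjugate Beta update of p:

# Data augmentation for x_i ~ p N(0,1) + (1-p) N(3,1), p ~ Uniform(0,1):
# alternate z_i | x, p (Bernoulli allocations) and p | z (Beta).
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(3)
x = np.concatenate([rng.normal(0, 1, 70), rng.normal(3, 1, 30)])  # true p = 0.7
n, p, draws = len(x), 0.5, []
for t in range(5_000):
    w0 = p * norm.pdf(x, 0, 1)                     # component-0 weight
    w1 = (1 - p) * norm.pdf(x, 3, 1)               # component-1 weight
    z = rng.uniform(size=n) < w0 / (w0 + w1)       # impute allocations
    p = rng.beta(1 + z.sum(), 1 + n - z.sum())     # conjugate Beta update
    draws.append(p)
print(np.mean(draws[1_000:]))                      # posterior mean near 0.7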
  The Revolution
     Gelfand and Smith, 1990


Epiphany

                                In June 1989, at a Bayesian workshop in Sherbrooke,
                                Québec, Adrian Smith exposed for the first time (?)
                                the generic features of the Gibbs sampler, exhibiting a
                                ten-line Fortran program handling the random effect model

                     Yij = θi + εij ,    i = 1, . . . , K,    j = 1, . . . , J,
                     θi ∼ N(µ, σθ²),     εij ∼ N(0, σε²),

       by full conditionals on µ, σθ , σε ...
                                                   [Gelfand and Smith, 1990]
                     This was enough to convince the whole audience!
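
       A present-day transcription of that ten-line program could look as
       follows, in Python rather than Fortran; the priors (flat on µ,
       inverse-gamma IG(a, b) on both variances) are my assumptions,
       chosen so that every full conditional is standard:

# Gibbs sampler for Y_ij = theta_i + eps_ij, theta_i ~ N(mu, s2t),
# eps_ij ~ N(0, s2e), cycling through the full conditionals.
import numpy as np

rng = np.random.default_rng(4)
K, J, a, b = 10, 5, 2.0, 1.0
Y = rng.normal(rng.normal(0, 1, size=(K, 1)), 0.5, size=(K, J))  # fake data

mu, s2t, s2e = 0.0, 1.0, 1.0
for t in range(5_000):
    # theta_i | rest: normal with precision J/s2e + 1/s2t
    prec = J / s2e + 1 / s2t
    theta = rng.normal((J * Y.mean(axis=1) / s2e + mu / s2t) / prec,
                       np.sqrt(1 / prec))
    # mu | rest: normal (flat prior on mu)
    mu = rng.normal(theta.mean(), np.sqrt(s2t / K))
    # variances | rest: inverse-gamma, drawn as reciprocals of gammas
    s2t = 1 / rng.gamma(a + K / 2, 1 / (b + ((theta - mu) ** 2).sum() / 2))
    s2e = 1 / rng.gamma(a + K * J / 2,
                        1 / (b + ((Y - theta[:, None]) ** 2).sum() / 2))
print(mu, np.sqrt(s2t), np.sqrt(s2e))  # last draw; store and average in practice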
Garden of Eden

       In the early 1990s, researchers found that Gibbs and then
       Metropolis-Hastings algorithms would crack almost any problem!
       A flood of papers followed applying MCMC:
              linear mixed models (Gelfand et al., 1990; Zeger and Karim, 1991;
              Wang et al., 1993, 1994)
              generalized linear mixed models (Albert and Chib, 1993)
              mixture models (Tanner and Wong, 1987; Diebolt and X., 1990,
              1994; Escobar and West, 1993)
              changepoint analysis (Carlin et al., 1992)
              point processes (Grenander and Møller, 1994)
              genomics (Stephens and Smith, 1993; Lawrence et al., 1993;
              Churchill, 1995; Geyer and Thompson, 1995)
              ecology (George and X, 1992; Dupuis, 1995)
              variable selection in regression (George and McCulloch, 1993)
              spatial statistics (Raftery and Banfield, 1991)
              longitudinal studies (Lange et al., 1992)
              &tc
[some of the] early theoretical advances

       "It may well be remembered as the afternoon of the 11 Bayesians"
                                                         [Clifford, 1993]
              Geyer and Thompson, 1992, relied on MCMC methods for ML
              estimation
              Smith and Roberts, 1993, discussed convergence diagnoses
              and applications, incl. mixtures, for Gibbs and
              Metropolis-Hastings
              Besag and Green, 1993, stated the desiderata for
              convergence and connected MCMC with auxiliary and
              antithetic variables
              Tierney, 1994, laid out all of the assumptions needed to
              analyze the Markov chains and then developed their
              properties, in particular convergence of ergodic averages
              and central limit theorems
              Liu, Wong and Kong, 1994, 1995, analyzed the covariance
              structure of Gibbs sampling and formally established the
              validity of Rao-Blackwellization in Gibbs sampling
              Mengersen and Tweedie, 1996, set the tone for the study of
              the speed of convergence of MCMC algorithms to the target
              distribution
              Gilks, Clayton and Spiegelhalter, 1993
              &tc...
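
       The Rao-Blackwellization result is easy to illustrate (my toy
       example): in the Gibbs sampler for a bivariate normal with
       correlation ρ, the average of the conditional means E[X|Y] = ρY
       estimates E[X] = 0 with typically lower variance than the average
       of the raw X draws:

# Rao-Blackwellization in Gibbs sampling: bivariate normal, zero means,
# unit variances, correlation rho; compare raw vs conditional-mean averages.
import numpy as np

rng = np.random.default_rng(5)
rho, T = 0.9, 20_000
sd = np.sqrt(1 - rho**2)                # conditional standard deviation
x = y = 0.0
xs, cond = [], []
for t in range(T):
    x = rng.normal(rho * y, sd)         # X | Y = y ~ N(rho y, 1 - rho^2)
    y = rng.normal(rho * x, sd)         # Y | X = x ~ N(rho x, 1 - rho^2)
    xs.append(x)
    cond.append(rho * y)                # Rao-Blackwellized term E[X | Y]
print("raw average:          ", np.mean(xs))
print("Rao-Blackwell average:", np.mean(cond))   # both near 0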
  The Revolution
     Convergence diagnoses



Convergence diagnoses
              Can we really tell when a complicated Markov chain has
              reached equilibrium? Frankly, I doubt it.
                                                                                             [Clifford, 1993]

       Explosion of methods
              Gelman and Rubin (1991)
              Besag and Green (1992)
              Geyer (1992)
              Raftery and Lewis (1992)
              Cowles and Carlin (1996) coda
              Brooks and Roberts (1998)
              &tc
  After the Revolution
     Particle systems


Particles, again


       Iterating importance sampling is about as old as Monte Carlo
       methods themselves!
                 [Hammersley and Morton, 1954; Rosenbluth and Rosenbluth, 1955]
       Found in the molecular simulation literature of the 1950s with
       self-avoiding random walks, and in signal processing
                                                [Marshall, 1965; Handschin and Mayne, 1969]

       Use of the term "particle" dates back to Kitagawa (1996), and Carpenter
       et al. (1997) coined the term "particle filter".
Bootstrap filter and sequential Monte Carlo



       Gordon, Salmond and Smith (1993) introduced the bootstrap filter
       which, while formally connected with importance sampling,
       involves past simulations and possible MCMC steps (Gilks and
       Berzuini, 2001).
       Sequential imputation was developed in Kong, Liu and Wong
       (1994), while Liu and Chen (1995) first formally pointed out the
       importance of resampling in "sequential Monte Carlo", a term they
       coined.
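
       A minimal sketch of the bootstrap filter (my illustration, on a
       linear Gaussian state-space model so the code stays short; Gordon,
       Salmond and Smith targeted nonlinear models): propagate the
       particles through the state equation, weight by the observation
       density, resample:

# Bootstrap particle filter on a toy model
#   x_t = 0.9 x_{t-1} + w_t, w_t ~ N(0,1);  y_t = x_t + v_t, v_t ~ N(0,1).
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(6)

T, x = 100, 0.0
xs_true, ys = [], []
for t in range(T):                                 # simulate data
    x = 0.9 * x + rng.normal()
    xs_true.append(x); ys.append(x + rng.normal())

N = 1_000
particles = rng.normal(0, 1, N)                    # initial particle cloud
means = []
for y in ys:
    particles = 0.9 * particles + rng.normal(size=N)   # propagate (prior)
    w = norm.pdf(y, particles, 1)                      # weight by likelihood
    w /= w.sum()
    means.append(w @ particles)                        # filtering mean
    particles = particles[rng.choice(N, size=N, p=w)]  # multinomial resampling

print(np.mean((np.array(means) - np.array(xs_true)) ** 2))  # tracking error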
pMC versus pMCMC


              Recycling of past simulations is legitimate to build better
              importance sampling functions, as in population Monte Carlo
                                     [Iba, 2000; Cappé et al., 2004; Del Moral et al., 2007]

              Recent synthesis by Andrieu, Doucet, and Holenstein (2010),
              using particles to build an evolving MCMC kernel based on
              the estimated likelihood p̂θ (y1:T ) in state space models
              p(x1:T )p(y1:T |x1:T ), along with Andrieu's and Roberts'
              (2009) use of approximations in MCMC acceptance steps
                                                                             [Kennedy and Kuti, 1985]
  After the Revolution
     Reversible jump


Reversible jump

       Generally considered as the second Revolution.

   The formalisation of a Markov chain moving across
   models and parameter spaces allows for the
   Bayesian processing of a wide variety of models
   and led to the success of Bayesian model choice.

       The definition of a proper balance condition on cross-model
       Markov kernels gives a generic setup for exploring variable
       dimension spaces, even when the number of models under
       comparison is infinite.
                                                               [Green, 1995]
  After the Revolution
     Perfect sampling


Perfect sampling


       The seminal paper of Propp and Wilson (1996) showed how to use
       MCMC methods to produce an exact (or perfect) simulation from
       the target.
       An outburst of papers followed, particularly from Jesper Møller
       and coauthors, but the excitement somehow dried out [except in
       dedicated areas], as the construction of perfect samplers is hard
       and coalescence times are very high...
                                         [Møller and Waagepetersen, 2003]
  After the Revolution
     Envoi



To be continued...
       ...standing on the shoulders of giants

That's like, so random! Monte Carlo for Data Science
 
MCQMC 2016 Tutorial
MCQMC 2016 TutorialMCQMC 2016 Tutorial
MCQMC 2016 Tutorial
 
Machine Learning for dummies
Machine Learning for dummiesMachine Learning for dummies
Machine Learning for dummies
 
Introduction to Bayesian Methods
Introduction to Bayesian MethodsIntroduction to Bayesian Methods
Introduction to Bayesian Methods
 
Intro to Machine Learning
Intro to Machine LearningIntro to Machine Learning
Intro to Machine Learning
 
Conditional Image Generation with PixelCNN Decoders
Conditional Image Generation with PixelCNN DecodersConditional Image Generation with PixelCNN Decoders
Conditional Image Generation with PixelCNN Decoders
 
Improving Variational Inference with Inverse Autoregressive Flow
Improving Variational Inference with Inverse Autoregressive FlowImproving Variational Inference with Inverse Autoregressive Flow
Improving Variational Inference with Inverse Autoregressive Flow
 
Improving Forecasts with Monte Carlo Simulations
Improving Forecasts with Monte Carlo Simulations  Improving Forecasts with Monte Carlo Simulations
Improving Forecasts with Monte Carlo Simulations
 
Monte carlo simulation
Monte carlo simulationMonte carlo simulation
Monte carlo simulation
 
TensorFlowによるニューラルネットワーク入門
TensorFlowによるニューラルネットワーク入門TensorFlowによるニューラルネットワーク入門
TensorFlowによるニューラルネットワーク入門
 
IIBMP2016 深層生成モデルによる表現学習
IIBMP2016 深層生成モデルによる表現学習IIBMP2016 深層生成モデルによる表現学習
IIBMP2016 深層生成モデルによる表現学習
 

Similar to A short history of MCMC

Computational Universe
Computational UniverseComputational Universe
Computational Universe
Mehmet Zirek
 
Part III - Quantum Mechanics
Part III - Quantum MechanicsPart III - Quantum Mechanics
Part III - Quantum Mechanics
Maurice R. TREMBLAY
 
The birth of quantum mechanics canvas
The birth of quantum mechanics canvasThe birth of quantum mechanics canvas
The birth of quantum mechanics canvas
Gabriel O'Brien
 
A Short Introduction to Quantum Information and Quantum.pdf
A Short Introduction to Quantum Information and Quantum.pdfA Short Introduction to Quantum Information and Quantum.pdf
A Short Introduction to Quantum Information and Quantum.pdf
SolMar38
 
Eden by Wire Webcameras and the Telepresent LandscapeTHOMAS J. .docx
Eden by Wire Webcameras and the Telepresent LandscapeTHOMAS J. .docxEden by Wire Webcameras and the Telepresent LandscapeTHOMAS J. .docx
Eden by Wire Webcameras and the Telepresent LandscapeTHOMAS J. .docx
madlynplamondon
 
Eden by Wire Webcameras and the Telepresent LandscapeTHOMAS J. .docx
Eden by Wire Webcameras and the Telepresent LandscapeTHOMAS J. .docxEden by Wire Webcameras and the Telepresent LandscapeTHOMAS J. .docx
Eden by Wire Webcameras and the Telepresent LandscapeTHOMAS J. .docx
tidwellveronique
 
Quantum computers
Quantum computersQuantum computers
Quantum computers
Geet Patel
 
Presentation of SMC^2 at BISP7
Presentation of SMC^2 at BISP7Presentation of SMC^2 at BISP7
Presentation of SMC^2 at BISP7
Pierre Jacob
 
photon2022116p23.pdf
photon2022116p23.pdfphoton2022116p23.pdf
photon2022116p23.pdf
ssuser4ac3d8
 
Quantum Entanglement
Quantum EntanglementQuantum Entanglement
Quantum Entanglement
pixiejen
 

Similar to A short history of MCMC (10)

Computational Universe
Computational UniverseComputational Universe
Computational Universe
 
Part III - Quantum Mechanics
Part III - Quantum MechanicsPart III - Quantum Mechanics
Part III - Quantum Mechanics
 
The birth of quantum mechanics canvas
The birth of quantum mechanics canvasThe birth of quantum mechanics canvas
The birth of quantum mechanics canvas
 
A Short Introduction to Quantum Information and Quantum.pdf
A Short Introduction to Quantum Information and Quantum.pdfA Short Introduction to Quantum Information and Quantum.pdf
A Short Introduction to Quantum Information and Quantum.pdf
 
Eden by Wire Webcameras and the Telepresent LandscapeTHOMAS J. .docx
Eden by Wire Webcameras and the Telepresent LandscapeTHOMAS J. .docxEden by Wire Webcameras and the Telepresent LandscapeTHOMAS J. .docx
Eden by Wire Webcameras and the Telepresent LandscapeTHOMAS J. .docx
 
Eden by Wire Webcameras and the Telepresent LandscapeTHOMAS J. .docx
Eden by Wire Webcameras and the Telepresent LandscapeTHOMAS J. .docxEden by Wire Webcameras and the Telepresent LandscapeTHOMAS J. .docx
Eden by Wire Webcameras and the Telepresent LandscapeTHOMAS J. .docx
 
Quantum computers
Quantum computersQuantum computers
Quantum computers
 
Presentation of SMC^2 at BISP7
Presentation of SMC^2 at BISP7Presentation of SMC^2 at BISP7
Presentation of SMC^2 at BISP7
 
photon2022116p23.pdf
photon2022116p23.pdfphoton2022116p23.pdf
photon2022116p23.pdf
 
Quantum Entanglement
Quantum EntanglementQuantum Entanglement
Quantum Entanglement
 

More from Christian Robert

Adaptive Restore algorithm & importance Monte Carlo
Adaptive Restore algorithm & importance Monte CarloAdaptive Restore algorithm & importance Monte Carlo
Adaptive Restore algorithm & importance Monte Carlo
Christian Robert
 
Asymptotics of ABC, lecture, Collège de France
Asymptotics of ABC, lecture, Collège de FranceAsymptotics of ABC, lecture, Collège de France
Asymptotics of ABC, lecture, Collège de France
Christian Robert
 
Workshop in honour of Don Poskitt and Gael Martin
Workshop in honour of Don Poskitt and Gael MartinWorkshop in honour of Don Poskitt and Gael Martin
Workshop in honour of Don Poskitt and Gael Martin
Christian Robert
 
discussion of ICML23.pdf
discussion of ICML23.pdfdiscussion of ICML23.pdf
discussion of ICML23.pdf
Christian Robert
 
How many components in a mixture?
How many components in a mixture?How many components in a mixture?
How many components in a mixture?
Christian Robert
 
restore.pdf
restore.pdfrestore.pdf
restore.pdf
Christian Robert
 
Testing for mixtures at BNP 13
Testing for mixtures at BNP 13Testing for mixtures at BNP 13
Testing for mixtures at BNP 13
Christian Robert
 
Inferring the number of components: dream or reality?
Inferring the number of components: dream or reality?Inferring the number of components: dream or reality?
Inferring the number of components: dream or reality?
Christian Robert
 
CDT 22 slides.pdf
CDT 22 slides.pdfCDT 22 slides.pdf
CDT 22 slides.pdf
Christian Robert
 
Testing for mixtures by seeking components
Testing for mixtures by seeking componentsTesting for mixtures by seeking components
Testing for mixtures by seeking components
Christian Robert
 
discussion on Bayesian restricted likelihood
discussion on Bayesian restricted likelihooddiscussion on Bayesian restricted likelihood
discussion on Bayesian restricted likelihood
Christian Robert
 
NCE, GANs & VAEs (and maybe BAC)
NCE, GANs & VAEs (and maybe BAC)NCE, GANs & VAEs (and maybe BAC)
NCE, GANs & VAEs (and maybe BAC)
Christian Robert
 
ABC-Gibbs
ABC-GibbsABC-Gibbs
ABC-Gibbs
Christian Robert
 
Coordinate sampler : A non-reversible Gibbs-like sampler
Coordinate sampler : A non-reversible Gibbs-like samplerCoordinate sampler : A non-reversible Gibbs-like sampler
Coordinate sampler : A non-reversible Gibbs-like sampler
Christian Robert
 
eugenics and statistics
eugenics and statisticseugenics and statistics
eugenics and statistics
Christian Robert
 
Laplace's Demon: seminar #1
Laplace's Demon: seminar #1Laplace's Demon: seminar #1
Laplace's Demon: seminar #1
Christian Robert
 
ABC-Gibbs
ABC-GibbsABC-Gibbs
ABC-Gibbs
Christian Robert
 
asymptotics of ABC
asymptotics of ABCasymptotics of ABC
asymptotics of ABC
Christian Robert
 
ABC-Gibbs
ABC-GibbsABC-Gibbs
ABC-Gibbs
Christian Robert
 
Likelihood-free Design: a discussion
Likelihood-free Design: a discussionLikelihood-free Design: a discussion
Likelihood-free Design: a discussion
Christian Robert
 

More from Christian Robert (20)

Adaptive Restore algorithm & importance Monte Carlo
Adaptive Restore algorithm & importance Monte CarloAdaptive Restore algorithm & importance Monte Carlo
Adaptive Restore algorithm & importance Monte Carlo
 
Asymptotics of ABC, lecture, Collège de France
Asymptotics of ABC, lecture, Collège de FranceAsymptotics of ABC, lecture, Collège de France
Asymptotics of ABC, lecture, Collège de France
 
Workshop in honour of Don Poskitt and Gael Martin
Workshop in honour of Don Poskitt and Gael MartinWorkshop in honour of Don Poskitt and Gael Martin
Workshop in honour of Don Poskitt and Gael Martin
 
discussion of ICML23.pdf
discussion of ICML23.pdfdiscussion of ICML23.pdf
discussion of ICML23.pdf
 
How many components in a mixture?
How many components in a mixture?How many components in a mixture?
How many components in a mixture?
 
restore.pdf
restore.pdfrestore.pdf
restore.pdf
 
Testing for mixtures at BNP 13
Testing for mixtures at BNP 13Testing for mixtures at BNP 13
Testing for mixtures at BNP 13
 
Inferring the number of components: dream or reality?
Inferring the number of components: dream or reality?Inferring the number of components: dream or reality?
Inferring the number of components: dream or reality?
 
CDT 22 slides.pdf
CDT 22 slides.pdfCDT 22 slides.pdf
CDT 22 slides.pdf
 
Testing for mixtures by seeking components
Testing for mixtures by seeking componentsTesting for mixtures by seeking components
Testing for mixtures by seeking components
 
discussion on Bayesian restricted likelihood
discussion on Bayesian restricted likelihooddiscussion on Bayesian restricted likelihood
discussion on Bayesian restricted likelihood
 
NCE, GANs & VAEs (and maybe BAC)
NCE, GANs & VAEs (and maybe BAC)NCE, GANs & VAEs (and maybe BAC)
NCE, GANs & VAEs (and maybe BAC)
 
ABC-Gibbs
ABC-GibbsABC-Gibbs
ABC-Gibbs
 
Coordinate sampler : A non-reversible Gibbs-like sampler
Coordinate sampler : A non-reversible Gibbs-like samplerCoordinate sampler : A non-reversible Gibbs-like sampler
Coordinate sampler : A non-reversible Gibbs-like sampler
 
eugenics and statistics
eugenics and statisticseugenics and statistics
eugenics and statistics
 
Laplace's Demon: seminar #1
Laplace's Demon: seminar #1Laplace's Demon: seminar #1
Laplace's Demon: seminar #1
 
ABC-Gibbs
ABC-GibbsABC-Gibbs
ABC-Gibbs
 
asymptotics of ABC
asymptotics of ABCasymptotics of ABC
asymptotics of ABC
 
ABC-Gibbs
ABC-GibbsABC-Gibbs
ABC-Gibbs
 
Likelihood-free Design: a discussion
Likelihood-free Design: a discussionLikelihood-free Design: a discussion
Likelihood-free Design: a discussion
 

Recently uploaded

Smart-Money for SMC traders good time and ICT
Smart-Money for SMC traders good time and ICTSmart-Money for SMC traders good time and ICT
Smart-Money for SMC traders good time and ICT
simonomuemu
 
Digital Artefact 1 - Tiny Home Environmental Design
Digital Artefact 1 - Tiny Home Environmental DesignDigital Artefact 1 - Tiny Home Environmental Design
Digital Artefact 1 - Tiny Home Environmental Design
amberjdewit93
 
Main Java[All of the Base Concepts}.docx
Main Java[All of the Base Concepts}.docxMain Java[All of the Base Concepts}.docx
Main Java[All of the Base Concepts}.docx
adhitya5119
 
ISO/IEC 27001, ISO/IEC 42001, and GDPR: Best Practices for Implementation and...
ISO/IEC 27001, ISO/IEC 42001, and GDPR: Best Practices for Implementation and...ISO/IEC 27001, ISO/IEC 42001, and GDPR: Best Practices for Implementation and...
ISO/IEC 27001, ISO/IEC 42001, and GDPR: Best Practices for Implementation and...
PECB
 
Pollock and Snow "DEIA in the Scholarly Landscape, Session One: Setting Expec...
Pollock and Snow "DEIA in the Scholarly Landscape, Session One: Setting Expec...Pollock and Snow "DEIA in the Scholarly Landscape, Session One: Setting Expec...
Pollock and Snow "DEIA in the Scholarly Landscape, Session One: Setting Expec...
National Information Standards Organization (NISO)
 
How to Build a Module in Odoo 17 Using the Scaffold Method
How to Build a Module in Odoo 17 Using the Scaffold MethodHow to Build a Module in Odoo 17 Using the Scaffold Method
How to Build a Module in Odoo 17 Using the Scaffold Method
Celine George
 
Lapbook sobre os Regimes Totalitários.pdf
Lapbook sobre os Regimes Totalitários.pdfLapbook sobre os Regimes Totalitários.pdf
Lapbook sobre os Regimes Totalitários.pdf
Jean Carlos Nunes Paixão
 
South African Journal of Science: Writing with integrity workshop (2024)
South African Journal of Science: Writing with integrity workshop (2024)South African Journal of Science: Writing with integrity workshop (2024)
South African Journal of Science: Writing with integrity workshop (2024)
Academy of Science of South Africa
 
PIMS Job Advertisement 2024.pdf Islamabad
PIMS Job Advertisement 2024.pdf IslamabadPIMS Job Advertisement 2024.pdf Islamabad
PIMS Job Advertisement 2024.pdf Islamabad
AyyanKhan40
 
A Survey of Techniques for Maximizing LLM Performance.pptx
A Survey of Techniques for Maximizing LLM Performance.pptxA Survey of Techniques for Maximizing LLM Performance.pptx
A Survey of Techniques for Maximizing LLM Performance.pptx
thanhdowork
 
The Diamonds of 2023-2024 in the IGRA collection
The Diamonds of 2023-2024 in the IGRA collectionThe Diamonds of 2023-2024 in the IGRA collection
The Diamonds of 2023-2024 in the IGRA collection
Israel Genealogy Research Association
 
The simplified electron and muon model, Oscillating Spacetime: The Foundation...
The simplified electron and muon model, Oscillating Spacetime: The Foundation...The simplified electron and muon model, Oscillating Spacetime: The Foundation...
The simplified electron and muon model, Oscillating Spacetime: The Foundation...
RitikBhardwaj56
 
S1-Introduction-Biopesticides in ICM.pptx
S1-Introduction-Biopesticides in ICM.pptxS1-Introduction-Biopesticides in ICM.pptx
S1-Introduction-Biopesticides in ICM.pptx
tarandeep35
 
C1 Rubenstein AP HuG xxxxxxxxxxxxxx.pptx
C1 Rubenstein AP HuG xxxxxxxxxxxxxx.pptxC1 Rubenstein AP HuG xxxxxxxxxxxxxx.pptx
C1 Rubenstein AP HuG xxxxxxxxxxxxxx.pptx
mulvey2
 
The History of Stoke Newington Street Names
The History of Stoke Newington Street NamesThe History of Stoke Newington Street Names
The History of Stoke Newington Street Names
History of Stoke Newington
 
RPMS TEMPLATE FOR SCHOOL YEAR 2023-2024 FOR TEACHER 1 TO TEACHER 3
RPMS TEMPLATE FOR SCHOOL YEAR 2023-2024 FOR TEACHER 1 TO TEACHER 3RPMS TEMPLATE FOR SCHOOL YEAR 2023-2024 FOR TEACHER 1 TO TEACHER 3
RPMS TEMPLATE FOR SCHOOL YEAR 2023-2024 FOR TEACHER 1 TO TEACHER 3
IreneSebastianRueco1
 
Executive Directors Chat Leveraging AI for Diversity, Equity, and Inclusion
Executive Directors Chat  Leveraging AI for Diversity, Equity, and InclusionExecutive Directors Chat  Leveraging AI for Diversity, Equity, and Inclusion
Executive Directors Chat Leveraging AI for Diversity, Equity, and Inclusion
TechSoup
 
ANATOMY AND BIOMECHANICS OF HIP JOINT.pdf
ANATOMY AND BIOMECHANICS OF HIP JOINT.pdfANATOMY AND BIOMECHANICS OF HIP JOINT.pdf
ANATOMY AND BIOMECHANICS OF HIP JOINT.pdf
Priyankaranawat4
 
The basics of sentences session 6pptx.pptx
The basics of sentences session 6pptx.pptxThe basics of sentences session 6pptx.pptx
The basics of sentences session 6pptx.pptx
heathfieldcps1
 
A Strategic Approach: GenAI in Education
A Strategic Approach: GenAI in EducationA Strategic Approach: GenAI in Education
A Strategic Approach: GenAI in Education
Peter Windle
 

Recently uploaded (20)

Smart-Money for SMC traders good time and ICT
Smart-Money for SMC traders good time and ICTSmart-Money for SMC traders good time and ICT
Smart-Money for SMC traders good time and ICT
 
Digital Artefact 1 - Tiny Home Environmental Design
Digital Artefact 1 - Tiny Home Environmental DesignDigital Artefact 1 - Tiny Home Environmental Design
Digital Artefact 1 - Tiny Home Environmental Design
 
Main Java[All of the Base Concepts}.docx
Main Java[All of the Base Concepts}.docxMain Java[All of the Base Concepts}.docx
Main Java[All of the Base Concepts}.docx
 
ISO/IEC 27001, ISO/IEC 42001, and GDPR: Best Practices for Implementation and...
ISO/IEC 27001, ISO/IEC 42001, and GDPR: Best Practices for Implementation and...ISO/IEC 27001, ISO/IEC 42001, and GDPR: Best Practices for Implementation and...
ISO/IEC 27001, ISO/IEC 42001, and GDPR: Best Practices for Implementation and...
 
Pollock and Snow "DEIA in the Scholarly Landscape, Session One: Setting Expec...
Pollock and Snow "DEIA in the Scholarly Landscape, Session One: Setting Expec...Pollock and Snow "DEIA in the Scholarly Landscape, Session One: Setting Expec...
Pollock and Snow "DEIA in the Scholarly Landscape, Session One: Setting Expec...
 
How to Build a Module in Odoo 17 Using the Scaffold Method
How to Build a Module in Odoo 17 Using the Scaffold MethodHow to Build a Module in Odoo 17 Using the Scaffold Method
How to Build a Module in Odoo 17 Using the Scaffold Method
 
Lapbook sobre os Regimes Totalitários.pdf
Lapbook sobre os Regimes Totalitários.pdfLapbook sobre os Regimes Totalitários.pdf
Lapbook sobre os Regimes Totalitários.pdf
 
South African Journal of Science: Writing with integrity workshop (2024)
South African Journal of Science: Writing with integrity workshop (2024)South African Journal of Science: Writing with integrity workshop (2024)
South African Journal of Science: Writing with integrity workshop (2024)
 
PIMS Job Advertisement 2024.pdf Islamabad
PIMS Job Advertisement 2024.pdf IslamabadPIMS Job Advertisement 2024.pdf Islamabad
PIMS Job Advertisement 2024.pdf Islamabad
 
A Survey of Techniques for Maximizing LLM Performance.pptx
A Survey of Techniques for Maximizing LLM Performance.pptxA Survey of Techniques for Maximizing LLM Performance.pptx
A Survey of Techniques for Maximizing LLM Performance.pptx
 
The Diamonds of 2023-2024 in the IGRA collection
The Diamonds of 2023-2024 in the IGRA collectionThe Diamonds of 2023-2024 in the IGRA collection
The Diamonds of 2023-2024 in the IGRA collection
 
The simplified electron and muon model, Oscillating Spacetime: The Foundation...
The simplified electron and muon model, Oscillating Spacetime: The Foundation...The simplified electron and muon model, Oscillating Spacetime: The Foundation...
The simplified electron and muon model, Oscillating Spacetime: The Foundation...
 
S1-Introduction-Biopesticides in ICM.pptx
S1-Introduction-Biopesticides in ICM.pptxS1-Introduction-Biopesticides in ICM.pptx
S1-Introduction-Biopesticides in ICM.pptx
 
C1 Rubenstein AP HuG xxxxxxxxxxxxxx.pptx
C1 Rubenstein AP HuG xxxxxxxxxxxxxx.pptxC1 Rubenstein AP HuG xxxxxxxxxxxxxx.pptx
C1 Rubenstein AP HuG xxxxxxxxxxxxxx.pptx
 
The History of Stoke Newington Street Names
The History of Stoke Newington Street NamesThe History of Stoke Newington Street Names
The History of Stoke Newington Street Names
 
RPMS TEMPLATE FOR SCHOOL YEAR 2023-2024 FOR TEACHER 1 TO TEACHER 3
RPMS TEMPLATE FOR SCHOOL YEAR 2023-2024 FOR TEACHER 1 TO TEACHER 3RPMS TEMPLATE FOR SCHOOL YEAR 2023-2024 FOR TEACHER 1 TO TEACHER 3
RPMS TEMPLATE FOR SCHOOL YEAR 2023-2024 FOR TEACHER 1 TO TEACHER 3
 
Executive Directors Chat Leveraging AI for Diversity, Equity, and Inclusion
Executive Directors Chat  Leveraging AI for Diversity, Equity, and InclusionExecutive Directors Chat  Leveraging AI for Diversity, Equity, and Inclusion
Executive Directors Chat Leveraging AI for Diversity, Equity, and Inclusion
 
ANATOMY AND BIOMECHANICS OF HIP JOINT.pdf
ANATOMY AND BIOMECHANICS OF HIP JOINT.pdfANATOMY AND BIOMECHANICS OF HIP JOINT.pdf
ANATOMY AND BIOMECHANICS OF HIP JOINT.pdf
 
The basics of sentences session 6pptx.pptx
The basics of sentences session 6pptx.pptxThe basics of sentences session 6pptx.pptx
The basics of sentences session 6pptx.pptx
 
A Strategic Approach: GenAI in Education
A Strategic Approach: GenAI in EducationA Strategic Approach: GenAI in Education
A Strategic Approach: GenAI in Education
 

A short history of MCMC

  • 9–12. Before the revolution / Los Alamos / Monte Carlo genesis: The Monte Carlo method is usually traced to Ulam and von Neumann: Stanislaw Ulam associated the idea with an intractable combinatorial computation attempted in 1946 about "solitaire"; the idea was enthusiastically adopted by John von Neumann for implementation on neutron diffusion; the name "Monte Carlo" was suggested by Nicholas Metropolis. [Eckhardt, 1987]
  • 13–14. Monte Carlo with computers: Very close "coincidence" with the appearance of the very first computer, ENIAC, born Feb. 1946, on which von Neumann implemented Monte Carlo in 1947. The same year, Ulam and von Neumann (re)invented inversion and accept-reject techniques. In 1949 came the very first symposium on Monte Carlo and the very first paper. [Metropolis and Ulam, 1949]
  • 15. Before the revolution / Metropolis et al., 1953 / The Metropolis et al. (1953) paper: The very first MCMC algorithm is associated with the second computer, MANIAC, built in Los Alamos in early 1952. Besides Metropolis, Arianna W. Rosenbluth, Marshall N. Rosenbluth, Augusta H. Teller, and Edward Teller contributed to create the Metropolis algorithm...
  • 16. Motivating problem: computation of integrals of the form
$$I = \frac{\int F(p,q)\,\exp\{-E(p,q)/kT\}\,dp\,dq}{\int \exp\{-E(p,q)/kT\}\,dp\,dq},$$
with energy E defined as
$$E(p,q) = \frac{1}{2}\sum_{i=1}^{N}\sum_{\substack{j=1 \\ j\neq i}}^{N} V(d_{ij}),$$
where N is the number of particles, V a potential function, and d_{ij} the distance between particles i and j.
  • 17. Boltzmann distribution: the Boltzmann distribution $\exp\{-E(p,q)/kT\}$ is parameterised by the temperature T, k being the Boltzmann constant, with a normalisation factor
$$Z(T) = \int \exp\{-E(p,q)/kT\}\,dp\,dq$$
not available in closed form.
  • 18. Computational challenge: Since p and q are 2N-dimensional vectors, numerical integration is impossible. Plus, standard Monte Carlo techniques fail to correctly approximate I: $\exp\{-E(p,q)/kT\}$ is very small for most realizations of random configurations (p,q) of the particle system.
  • 19. Metropolis algorithm: Consider a random walk modification of the N particles: for each $1 \le i \le N$, values $x_i' = x_i + \alpha\xi_{1i}$ and $y_i' = y_i + \alpha\xi_{2i}$ are proposed, where both $\xi_{1i}$ and $\xi_{2i}$ are uniform $U(-1,1)$. The energy difference $\Delta E$ between the new and the previous configurations is computed, and the new configuration is accepted with probability $1 \wedge \exp\{-\Delta E/kT\}$; otherwise the previous configuration is replicated, counting one more time in the average of the $F(p_t,q_t)$'s over the $\tau$ moves of the random walk.
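A minimal sketch of this random-walk Metropolis step in Python; the quadratic `energy` and all numeric values are illustrative placeholders, not the 1953 hard-sphere potential:

```python
import numpy as np

rng = np.random.default_rng(0)

def energy(config):
    # Stand-in for E(p, q): a simple quadratic well. The 1953 paper
    # used a hard-sphere potential instead.
    return 0.5 * np.sum(config ** 2)

def metropolis_step(config, alpha=0.5, kT=1.0):
    # Propose a uniform U(-alpha, alpha) random-walk move for every coordinate.
    proposal = config + alpha * rng.uniform(-1.0, 1.0, size=config.shape)
    dE = energy(proposal) - energy(config)
    # Accept with probability 1 ∧ exp(-ΔE/kT); otherwise replicate the
    # previous configuration (it counts once more in the running average).
    if rng.uniform() < np.exp(-dE / kT):
        return proposal
    return config

config = rng.normal(size=(10, 2))            # N = 10 particles in the plane
chain = [config]
for _ in range(1000):
    chain.append(metropolis_step(chain[-1]))
```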
  • 20–22. Convergence: Validity of the algorithm is established by proving (1) irreducibility and (2) ergodicity, that is, convergence to the stationary distribution. The second part is obtained via a discretization of the space: Metropolis et al. note that the proposal is reversible, then establish that $\exp\{-E/kT\}$ is invariant. Application to the specific problem of the rigid-sphere collision model. The number of iterations of the Metropolis algorithm seems limited: 16 steps for burn-in and 48 to 64 subsequent iterations (which still required four to five hours on the Los Alamos MANIAC).
  • 23–24. Physics and chemistry: "The method of Markov chain Monte Carlo immediately had wide use in physics and chemistry." [Geyer & Thompson, 1992] "Statistics has always been fuelled by energetic mining of the physics literature." [Clifford, 1993] Hammersley and Handscomb, 1967; Piekaar and Clarenburg, 1967; Kennedy and Kuti, 1985; Sokal, 1989; &tc...
  • 25–26. Before the revolution / Hastings, 1970 / A fair generalisation: In Biometrika 1970, Hastings defines the MCMC methodology for finite and reversible Markov chains, the continuous case being discretised. The generic acceptance probability for a move from state i to state j is
$$\alpha_{ij} = \frac{s_{ij}}{1 + \dfrac{\pi_i\,q_{ij}}{\pi_j\,q_{ji}}},$$
where $s_{ij}$ is a symmetric function.
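Since the formula pivots on the symmetric function $s_{ij}$, here is a small Python sketch showing how the Metropolis (1953) and Barker (1965) rules both fit it; the probabilities used in the check are arbitrary placeholders:

```python
import numpy as np

def hastings_alpha(pi_i, pi_j, q_ij, q_ji, rule="metropolis"):
    # alpha_ij = s_ij / (1 + pi_i q_ij / (pi_j q_ji)), with the symmetric
    # function s_ij selecting the sampler:
    #   Metropolis (1953): s_ij = 1 + min(r, 1/r)  =>  alpha = min(1, r)
    #   Barker (1965):     s_ij = 1                =>  alpha = r / (1 + r)
    # where r = (pi_j * q_ji) / (pi_i * q_ij).
    r = (pi_j * q_ji) / (pi_i * q_ij)
    s = 1.0 + min(r, 1.0 / r) if rule == "metropolis" else 1.0
    return s / (1.0 + 1.0 / r)

# Peskun (1973): the Metropolis choice accepts at least as often as Barker's.
for pi_j in (0.5, 1.0, 2.0):
    assert hastings_alpha(1.0, pi_j, 0.3, 0.3, "metropolis") >= \
           hastings_alpha(1.0, pi_j, 0.3, 0.3, "barker")
```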
  • 27. State of the art: Note the generic form that encompasses both Metropolis et al. (1953) and Barker (1965). Peskun's ordering was not yet discovered: Hastings mentions that "little is known about the relative merits of those two choices (even though) Metropolis's method may be preferable". He warns against high rejection rates as indicative of a poor choice of transition matrix, but makes no mention of the opposite pitfall of low rejection rates.
  • 28. What else?! Items included in the paper are:
    - a Poisson target with a ±1 random walk proposal;
    - a normal target with a uniform random walk proposal mixed with its reflection (i.e., centered at −X(t) rather than X(t));
    - a multivariate target for which Hastings introduces Gibbs sampling, updating one component at a time and defining the composed transition as satisfying the stationarity condition because each component leaves the target invariant;
    - a reference to Erhman, Fosdick and Handscomb (1960) as a preliminary if specific instance of this Metropolis-within-Gibbs sampler;
    - an importance sampling version of MCMC;
    - some remarks about error assessment;
    - a Gibbs sampler for random orthogonal matrices.
  • 29–30. Three years later: Peskun (1973) compares Metropolis's and Barker's acceptance probabilities and shows (again in a discrete setup) that Metropolis's is optimal, in terms of the asymptotic variance of any empirical average. The proof is a direct consequence of Kemeny and Snell (1960) on asymptotic variance. Peskun also establishes that this variance can improve upon the iid case if and only if the eigenvalues of P − A are all negative, where A is the transition matrix corresponding to iid simulation and P the transition matrix corresponding to the Metropolis algorithm, but he concludes that the trace of P − A is always positive.
  • 31–33. Before the revolution / Julian's early works (1): In the early 1970's, Hammersley, Clifford, and Besag were working on the specification of joint distributions from conditional distributions and on necessary and sufficient conditions for the conditional distributions to be compatible with a joint distribution. [Hammersley and Clifford, 1971] "What is the most general form of the conditional probability functions that define a coherent joint function? And what will the joint look like?" [Besag, 1972]
  • 34. Hammersley-Clifford theorem. Theorem (Hammersley-Clifford): The joint distribution of a vector associated with a dependence graph must be represented as a product of functions over the cliques of the graph, i.e., of functions depending only on the components indexed by the labels in the clique. [Cressie, 1993; Lauritzen, 1996]
  • 35. Theorem (Hammersley-Clifford): A probability distribution P with positive and continuous density f satisfies the pairwise Markov property with respect to an undirected graph G if and only if it factorizes according to G, i.e., (F) ≡ (G). [Cressie, 1993; Lauritzen, 1996]
  • 36. Theorem (Hammersley-Clifford): Under the positivity condition, the joint distribution g satisfies
$$g(y_1,\ldots,y_p) \propto \prod_{j=1}^{p} \frac{g_j(y_j \mid y_1,\ldots,y_{j-1},\,y'_{j+1},\ldots,y'_p)}{g_j(y'_j \mid y_1,\ldots,y_{j-1},\,y'_{j+1},\ldots,y'_p)}$$
for every permutation of {1, 2, ..., p} and every y′ ∈ Y. [Cressie, 1993; Lauritzen, 1996]
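A quick numerical check of this factorization in Python, for an arbitrary positive joint on a 3 × 3 grid; the reference point y′ = (0, 0) and the table values are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)
joint = rng.uniform(0.1, 1.0, size=(3, 3))   # positivity condition holds
joint /= joint.sum()

def g1(a, b):
    # full conditional of Y1 = a given Y2 = b
    return joint[a, b] / joint[:, b].sum()

def g2(b, a):
    # full conditional of Y2 = b given Y1 = a
    return joint[a, b] / joint[a, :].sum()

y1p, y2p = 0, 0   # arbitrary reference point (y'_1, y'_2)
# For p = 2 the theorem reads
#   g(y1, y2) ∝ g1(y1|y'2)/g1(y'1|y'2) * g2(y2|y1)/g2(y'2|y1)
recovered = np.array([[g1(a, y2p) / g1(y1p, y2p) * g2(b, a) / g2(y2p, a)
                       for b in range(3)] for a in range(3)])
recovered /= recovered.sum()
assert np.allclose(recovered, joint)   # joint recovered from its conditionals
```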
  • 37. An apocryphal theorem: The Hammersley-Clifford theorem was never published by its authors, but only through Grimmett (1973), Preston (1973), Sherman (1973), and Besag (1974). The authors were dissatisfied with the positivity constraint: the joint density could only be recovered from the full conditionals when the support of the joint was the product of the supports of the full conditionals (with obvious counter-examples). Moussouris' counter-example put a full stop to their endeavors. [Hammersley, 1974]
  • 38–39. To Gibbs or not to Gibbs? Julian Besag should certainly be credited, to a large extent, with the (re?-)discovery of the Gibbs sampler: "The simulation procedure is to consider the sites cyclically and, at each stage, to amend or leave unaltered the particular site value in question, according to a probability distribution whose elements depend upon the current value at neighboring sites (...) However, the technique is unlikely to be particularly helpful in many other than binary situations and the Markov chain itself has no practical interpretation." [Besag, 1974]
  • 40. Broader perspective: In 1964, Hammersley and Handscomb wrote a (the first?) textbook on Monte Carlo methods. They cover such topics as "Crude Monte Carlo"; importance sampling; control variates; and "Conditional Monte Carlo", which looks surprisingly like a missing-data Gibbs completion approach. They state in the Preface: "We are convinced nevertheless that Monte Carlo methods will one day reach an impressive maturity."
  • 41. Clicking in: After Peskun (1973), MCMC lay mostly dormant in the mainstream statistical world for about 10 years; then several papers and books highlighted its usefulness in specific settings: Geman and Geman (1984); Besag (1986); Strauss (1986); Ripley (Stochastic Simulation, 1987); Tanner and Wong (1987); Younes (1988).
  • 42. Enters the Gibbs sampler: Geman and Geman (1984), building on Metropolis et al. (1953), Hastings (1970), and Peskun (1973), constructed a Gibbs sampler for optimisation in a discrete image processing problem without completion. They are responsible for the name Gibbs sampling, because the method was used for the Bayesian study of Gibbs random fields, linked to the physicist Josiah Willard Gibbs (1839–1903). Back to Metropolis et al. (1953): the Gibbs sampler is used as a simulated annealing algorithm, and ergodicity is proven on the collection of global maxima.
  • 43. Besag (1986) integrates GS for SA... "...easy to construct the transition matrix Q, of a discrete time Markov chain, with state space Ω and limit distribution (4). Simulated annealing proceeds by running an associated time inhomogeneous Markov chain with transition matrices QT, where T is progressively decreased according to a prescribed "schedule" to a value close to zero." [Besag, 1986]
  • 44. ...and links with Metropolis-Hastings... "There are various related methods of constructing a manageable QT (Hastings, 1970). Geman and Geman (1984) adopt the simplest, which they term the "Gibbs sampler" (...) time reversibility, a common ingredient in this type of problem (see, for example, Besag, 1977a), is present at individual stages but not over complete cycles, though Peter Green has pointed out that it returns if QT is taken over a pair of cycles, the second of which visits pixels in reverse order." [Besag, 1986]
  • 45. ...seeing the larger picture,... "As Geman and Geman (1984) point out, any property of the (posterior) distribution P(x|y) can be simulated by running the Gibbs sampler at "temperature" T = 1. Thus, if x̂_i maximizes P(x_i|y), then it is the most frequently occurring colour at pixel i in an infinite realization of the Markov chain with transition matrix Q of Section 2.3. The x̂_i's can therefore be simultaneously estimated from a single finite realization of the chain. It is not yet clear how long the realization needs to be, particularly for estimation near colour boundaries, but the amount of computation required is generally prohibitive for routine purposes." [Besag, 1986]
  • 46. ...seeing the larger picture,... "P(x|y) can be simulated using the Gibbs sampler, as suggested by Grenander (1983) and by Geman and Geman (1984). My dismissal of such an approach for routine applications was somewhat cavalier: purpose-built array processors could become relatively inexpensive (...) suppose that, for 100 complete cycles say, images have been collected from the Gibbs sampler (or by Metropolis' method), following a "settling-in" period of perhaps another 100 cycles, which should cater for fairly intricate priors (...) These 100 images should often be adequate for estimating properties of the posterior (...) and for making approximate associated confidence statements, as mentioned by Mr Haslett." [Besag, 1986]
  • 47–50. ...if not going fully Bayes! "...a neater and more efficient procedure [for parameter estimation] is to adopt maximum "pseudo-likelihood" estimation (Besag, 1975)." "I have become increasingly enamoured with the Bayesian paradigm." [Besag, 1986] "The pair (x_i, β_i) is then a (bivariate) Markov field and can be reconstructed as a bivariate process by the methods described in Professor Besag's paper." [Clifford, 1986] "The simulation-based estimator E_post Ψ(X) will differ from the m.a.p. estimator Ψ̂(x)." [Silverman, 1986]
  • 51. Discussants of Besag (1986): an impressive who's who: D.M. Titterington, P. Clifford, P. Green, P. Brown, B. Silverman, F. Critchley, F. Kelly, K. Mardia, C. Jennison, J. Kent, D. Spiegelhalter, H. Wynn, D. and S. Geman, J. Haslett, J. Kay, H. Künsch, P. Switzer, B. Torsney, &tc
  • 52. A comment on Besag (1986): "While special purpose algorithms will determine the utility of the Bayesian methods, the general purpose methods (stochastic relaxation and simulation of solutions of the Langevin equation; Grenander, 1983; Geman and Geman, 1984; Gidas, 1985a; Geman and Hwang, 1986) have proven enormously convenient and versatile. We are able to apply a single computer program to every new problem by merely changing the subroutine that computes the energy function in the Gibbs representation of the posterior distribution." [Geman and McClure, 1986]
  • 53. Another one: "It is easy to compute exact marginal and joint posterior probabilities of currently unobserved features, conditional on those clinical findings currently available (Spiegelhalter, 1986a,b), the updating taking the form of 'propagating evidence' through the network (...) it would be interesting to see if the techniques described tonight, which are of intermediate complexity, may have any applications in this new and exciting area [causal networks]." [Spiegelhalter, 1986]
  • 54–55. The candidate's formula: representation of the marginal likelihood as
$$m(x) = \frac{\pi(\theta)\,f(x\mid\theta)}{\pi(\theta\mid x)}$$
or of the marginal predictive as
$$p_n(y'\mid y) = \frac{f(y'\mid\theta)\,\pi_n(\theta\mid y)}{\pi_{n+1}(\theta\mid y, y')}.$$
[Besag, 1989] Why candidate? "Equation (2) appeared without explanation in a Durham University undergraduate final examination script of 1984. Regrettably, the student's name is no longer known to me."
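As a sanity check of the identity, here is a sketch in Python on a conjugate normal model (x|θ ∼ N(θ, 1), θ ∼ N(0, 1), so the marginal is N(0, 2)); the observation value and test points are arbitrary illustrations:

```python
import numpy as np
from scipy.stats import norm

x = 1.3                                   # single observation, x | θ ~ N(θ, 1)
prior = norm(0.0, 1.0)                    # θ ~ N(0, 1)
posterior = norm(x / 2.0, np.sqrt(0.5))   # conjugate posterior θ | x

def candidate_m(theta):
    # m(x) = π(θ) f(x|θ) / π(θ|x), valid at ANY value of θ
    return prior.pdf(theta) * norm(theta, 1.0).pdf(x) / posterior.pdf(theta)

# The identity holds pointwise and matches the closed-form marginal N(0, 2).
for theta in (-1.0, 0.0, 2.5):
    assert np.isclose(candidate_m(theta), norm(0.0, np.sqrt(2.0)).pdf(x))
```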
  • 56–59. Implications:
    - Newton and Raftery (1994) used this representation to derive the [infamous] harmonic mean approximation to the marginal likelihood;
    - Gelfand and Dey (1994) also relied on this formula, for the same purpose, in a more general perspective;
    - Geyer and Thompson (1995) derived MLEs by a Monte Carlo approximation to the normalising constant;
    - Chib (1995) used this representation to build an MCMC approximation to the marginal likelihood.
  • 60–61. The Revolution / Final steps to Impact: "This is surely a revolution." [Clifford, 1993] "[Gibbs sampler] use seems to have been isolated in the spatial statistics community until Gelfand and Smith (1990)." [Geyer, 1990] Geman and Geman (1984) was one more spark that led to the explosion, as it had a clear influence on Gelfand, Green, Smith, Spiegelhalter, and others. It sparked new interest in Bayesian methods, statistical computing, algorithms, and stochastic processes through the use of computing algorithms such as the Gibbs sampler and the Metropolis–Hastings algorithm.
  • 62–63. The Revolution / Final steps to Data augmentation: Tanner and Wong (1987) has essentially the same ingredients as Gelfand and Smith (1990): simulating from the conditionals is simulating from the joint. Its lower impact is explained by: an emphasis on missing-data problems (hence, data augmentation); an MCMC approximation to the target at every iteration,
$$\hat\pi(\theta\mid x) \approx \frac{1}{K}\sum_{k=1}^{K} \pi(\theta\mid x, z^{t,k}), \qquad z^{t,k} \sim \hat\pi_{t-1}(z\mid x),$$
too close to Rubin's (1978) multiple imputation; and a theoretical backup based on functional analysis (the Markov kernel had to be uniformly bounded and equicontinuous).
  • 64. The Revolution / Gelfand and Smith, 1990 / Epiphany: In June 1989, at a Bayesian workshop in Sherbrooke, Québec, Adrian Smith exposed for the first time (?) the generic features of the Gibbs sampler, exhibiting a ten-line Fortran program handling the random effect model
$$Y_{ij} = \theta_i + \varepsilon_{ij}, \quad i = 1, \ldots, K, \quad j = 1, \ldots, J, \qquad \theta_i \sim N(\mu, \sigma_\theta^2), \quad \varepsilon_{ij} \sim N(0, \sigma_\varepsilon^2),$$
by full conditionals on µ, σ_θ, σ_ε... [Gelfand and Smith, 1990] This was enough to convince the whole audience!
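The slide does not spell out the priors, so here is a minimal sketch of such a Gibbs sampler in Python rather than Fortran, assuming a flat prior on µ and hypothetical inverse-gamma(a, b) priors on both variances; all numeric values are illustrative:

```python
import numpy as np

rng = np.random.default_rng(42)
K, J = 10, 5
theta_true = rng.normal(2.0, 1.0, size=K)
y = theta_true[:, None] + rng.normal(0.0, 0.5, size=(K, J))

a, b = 2.0, 1.0                  # hypothetical inverse-gamma hyperparameters
mu, s2_t, s2_e = 0.0, 1.0, 1.0   # initial values for mu, sigma2_theta, sigma2_eps
theta = y.mean(axis=1)

for t in range(5000):
    # theta_i | rest: normal, combining J observations with the N(mu, s2_t) layer
    prec = J / s2_e + 1.0 / s2_t
    mean = (J * y.mean(axis=1) / s2_e + mu / s2_t) / prec
    theta = rng.normal(mean, np.sqrt(1.0 / prec))
    # mu | rest: normal (flat prior on mu)
    mu = rng.normal(theta.mean(), np.sqrt(s2_t / K))
    # variances | rest: inverse gamma, sampled as 1 / gamma
    s2_t = 1.0 / rng.gamma(a + K / 2, 1.0 / (b + 0.5 * np.sum((theta - mu) ** 2)))
    s2_e = 1.0 / rng.gamma(a + K * J / 2,
                           1.0 / (b + 0.5 * np.sum((y - theta[:, None]) ** 2)))
```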
  • 65–66. Garden of Eden: In the early 1990s, researchers found that Gibbs and then Metropolis–Hastings algorithms would crack almost any problem! A flood of papers followed, applying MCMC to:
    - linear mixed models (Gelfand et al., 1990; Zeger and Karim, 1991; Wang et al., 1993, 1994)
    - generalized linear mixed models (Albert and Chib, 1993)
    - mixture models (Tanner and Wong, 1987; Diebolt and X., 1990, 1994; Escobar and West, 1993)
    - changepoint analysis (Carlin et al., 1992)
    - point processes (Grenander and Møller, 1994)
    - genomics (Stephens and Smith, 1993; Lawrence et al., 1993; Churchill, 1995; Geyer and Thompson, 1995)
    - ecology (George and X, 1992; Dupuis, 1995)
    - variable selection in regression (George and McCulloch, 1993)
    - spatial statistics (Raftery and Banfield, 1991)
    - longitudinal studies (Lange et al., 1992)
    - &tc
[Some of the] early theoretical advances

    "It may well be remembered as the afternoon of the 11 Bayesians" [Clifford, 1993]

    Geyer and Thompson, 1992, relied on MCMC methods for maximum likelihood estimation
    Smith and Roberts, 1993, discussed convergence diagnoses and applications, including mixtures, for Gibbs and Metropolis–Hastings
    Besag and Green, 1993, stated the desiderata for convergence and connected MCMC with auxiliary and antithetic variables
    Tierney, 1994, laid out all of the assumptions needed to analyze the Markov chains and then developed their properties, in particular convergence of ergodic averages and central limit theorems
    Liu, Wong and Kong, 1994, 1995, analyzed the covariance structure of Gibbs sampling and formally established the validity of Rao–Blackwellization in Gibbs sampling (a toy sketch follows this list)
    Mengersen and Tweedie, 1996, set the tone for the study of the speed of convergence of MCMC algorithms to the target distribution
    Gilks, Clayton and Spiegelhalter, 1993, &tc...
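As a toy illustration of Rao–Blackwellization, not taken from Liu, Wong and Kong: in a two-stage Gibbs sampler for a bivariate normal with correlation ρ, the mean E[X] can be estimated either from the X draws or from the conditional expectations E[X | Y] = ρY; the second estimator typically has smaller variance (all settings below are illustrative).

    import numpy as np

    rng = np.random.default_rng(1)
    rho, n = 0.9, 10_000
    sd = np.sqrt(1 - rho ** 2)             # conditional standard deviation

    x, xs, ys = 0.0, [], []
    for _ in range(n):
        y = rng.normal(rho * x, sd)        # draw Y | X
        x = rng.normal(rho * y, sd)        # draw X | Y
        xs.append(x)
        ys.append(y)

    naive = np.mean(xs)                    # empirical average of the X draws
    rao_blackwell = rho * np.mean(ys)      # average of E[X | Y] = rho * Y
    print(naive, rao_blackwell)            # both near 0, the latter less variable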
Convergence diagnoses

    Can we really tell when a complicated Markov chain has reached equilibrium? Frankly, I doubt it. [Clifford, 1993]

Explosion of methods:
    Gelman and Rubin (1991)
    Besag and Green (1992)
    Geyer (1992)
    Raftery and Lewis (1992)
    Cowles and Carlin (1996), coda
    Brooks and Roberts (1998)
    &tc

A sketch of one of the best-known diagnostics, the Gelman and Rubin shrink factor, follows.
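The sketch below computes the basic potential scale reduction factor (R-hat) from m parallel chains of length n; the coda R package provides reference implementations of this and other diagnostics. The formula is the standard one; the simulated chains are illustrative.

    import numpy as np

    def gelman_rubin(chains):
        """chains: array of shape (m, n), draws of one scalar quantity."""
        m, n = chains.shape
        means = chains.mean(axis=1)
        W = chains.var(axis=1, ddof=1).mean()   # within-chain variance
        B = n * means.var(ddof=1)               # between-chain variance
        var_hat = (n - 1) / n * W + B / n       # pooled variance estimate
        return np.sqrt(var_hat / W)             # values near 1 suggest convergence

    rng = np.random.default_rng(2)
    chains = rng.normal(0.0, 1.0, (4, 1000))    # four well-mixed chains
    print(gelman_rubin(chains))                 # close to 1 here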
After the Revolution: Particle systems

Particles, again

Iterating importance sampling is about as old as Monte Carlo methods themselves! [Hammersley and Morton, 1954; Rosenbluth and Rosenbluth, 1955]

Found in the molecular simulation literature of the 1950s with self-avoiding random walks, and in signal processing [Marshall, 1965; Handschin and Mayne, 1969]

Use of the term "particle" dates back to Kitagawa (1996), and Carpenter et al. (1997) coined the term "particle filter".
Bootstrap filter and sequential Monte Carlo

Gordon, Salmond and Smith (1993) introduced the bootstrap filter which, while formally connected with importance sampling, involves past simulations and possible MCMC steps (Gilks and Berzuini, 2001); a minimal sketch follows.

Sequential imputation was developed in Kong, Liu and Wong (1994), while Liu and Chen (1995) first formally pointed out the importance of resampling in "sequential Monte Carlo", a term they coined.
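Here is a minimal bootstrap filter in the propagate/weight/resample pattern of Gordon, Salmond and Smith (1993), run on an assumed toy model x_t = 0.9 x_{t-1} + v_t, y_t = x_t + w_t with Gaussian noises; the model, noise scales and particle count are illustrative choices.

    import numpy as np

    rng = np.random.default_rng(3)
    T, N = 100, 1000

    # simulate synthetic data from the toy state-space model
    x_true = np.zeros(T)
    for t in range(1, T):
        x_true[t] = 0.9 * x_true[t - 1] + rng.normal(0, 0.5)
    y = x_true + rng.normal(0, 1.0, T)

    particles = rng.normal(0, 1, N)        # initial particles from an assumed prior
    log_lik, filt_means = 0.0, []
    for t in range(T):
        # propagate each particle through the transition (prior) kernel
        particles = 0.9 * particles + rng.normal(0, 0.5, N)
        # weight by the N(y_t; x_t, 1) observation likelihood
        w = np.exp(-0.5 * (y[t] - particles) ** 2) / np.sqrt(2 * np.pi)
        log_lik += np.log(w.mean())        # factor of the unbiased likelihood estimate
        w /= w.sum()
        filt_means.append(np.sum(w * particles))           # filtering mean E[x_t | y_1:t]
        particles = particles[rng.choice(N, size=N, p=w)]  # multinomial resampling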
pMC versus pMCMC

Recycling past simulations is legitimate for building better importance sampling functions, as in population Monte Carlo [Iba, 2000; Cappé et al., 2004; Del Moral et al., 2007]

Recent synthesis by Andrieu, Doucet and Holenstein (2010), using particles to build an evolving MCMC kernel around the estimated likelihood p̂_θ(y_1:T) in state-space models p(x_1:T) p(y_1:T | x_1:T), along with Andrieu and Roberts' (2009) use of approximations in MCMC acceptance steps; a sketch of the latter idea follows. [Kennedy and Kuti, 1985]
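A sketch of the pseudo-marginal idea behind Andrieu and Roberts (2009), on an assumed toy latent model y_i | z_i ∼ N(z_i, 1), z_i ∼ N(θ, 1): the exact likelihood in the Metropolis–Hastings ratio is replaced by an unbiased Monte Carlo estimate, and the estimate attached to the current state is carried along rather than refreshed. The data, prior and proposal scale are illustrative.

    import numpy as np

    rng = np.random.default_rng(4)
    y = rng.normal(1.5, np.sqrt(2.0), 50)        # synthetic data, marginally N(theta, 2)

    def log_lik_hat(theta, M=100):
        """Log of an unbiased estimate of p(y | theta), M latent draws per y_i."""
        z = rng.normal(theta, 1.0, (M, y.size))  # z ~ N(theta, 1)
        dens = np.exp(-0.5 * (y - z) ** 2) / np.sqrt(2 * np.pi)
        return np.sum(np.log(dens.mean(axis=0)))

    def log_prior(theta):
        return -0.5 * theta ** 2 / 100.0         # N(0, 10^2), up to a constant

    theta, ll, draws = 0.0, log_lik_hat(0.0), []
    for _ in range(5000):
        prop = theta + rng.normal(0, 0.3)        # random-walk proposal
        ll_prop = log_lik_hat(prop)              # fresh estimate for the proposal only
        if np.log(rng.uniform()) < ll_prop + log_prior(prop) - ll - log_prior(theta):
            theta, ll = prop, ll_prop            # the estimate travels with the state
        draws.append(theta)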
After the Revolution: Reversible jump

Reversible jump

Generally considered as the second revolution: the formalisation of a Markov chain moving across models and parameter spaces allows for the Bayesian processing of a wide variety of models and underlies the success of Bayesian model choice.

Definition of a proper balance condition on cross-model Markov kernels gives a generic setup for exploring variable-dimension spaces, even when the number of models under comparison is infinite. [Green, 1995] A toy sketch follows.
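A toy reversible jump sketch, not one of Green's (1995) examples: choose between M1: y ∼ N(µ, 1) and M2: y ∼ N(µ, σ²), with P(M1) = P(M2) = 1/2, a flat prior on µ, and log σ ∼ N(0, 1) under M2. Proposing the extra parameter from its own prior makes the dimension-matching Jacobian equal to 1 and collapses the acceptance ratio to a likelihood ratio; all tuning choices are illustrative.

    import numpy as np

    rng = np.random.default_rng(5)
    y = rng.normal(0.5, 1.4, 100)                # synthetic data, closer to M2

    def log_lik(mu, sigma):
        return np.sum(-np.log(sigma) - 0.5 * ((y - mu) / sigma) ** 2)

    mu, log_sigma, in_m2, visits_m2 = 0.0, None, False, 0
    n_iter = 20_000
    for _ in range(n_iter):
        sigma = np.exp(log_sigma) if in_m2 else 1.0
        # within-model random-walk update of mu (flat prior: likelihood ratio)
        prop = mu + rng.normal(0, 0.1)
        if np.log(rng.uniform()) < log_lik(prop, sigma) - log_lik(mu, sigma):
            mu = prop
        # between-model jump: birth of log sigma, or death back to sigma = 1
        if not in_m2:
            u = rng.normal(0, 1)                 # proposal = prior, so those terms cancel
            if np.log(rng.uniform()) < log_lik(mu, np.exp(u)) - log_lik(mu, 1.0):
                log_sigma, in_m2 = u, True
        else:
            if np.log(rng.uniform()) < log_lik(mu, 1.0) - log_lik(mu, np.exp(log_sigma)):
                log_sigma, in_m2 = None, False
        visits_m2 += in_m2
    print(visits_m2 / n_iter)                    # estimated posterior P(M2 | y)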
After the Revolution: Perfect sampling

Perfect sampling

The seminal paper of Propp and Wilson (1996) showed how to use MCMC methods to produce an exact (or perfect) simulation from the target; a sketch follows.

Outburst of papers, particularly from Jesper Møller and coauthors, but the excitement somehow died down [except in dedicated areas], as the construction of perfect samplers is hard and coalescence times can be very high... [Møller and Waagepetersen, 2003]
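A sketch of Propp and Wilson's coupling from the past on an assumed toy chain, a monotone random walk on {0, ..., n}: chains started from the minimal and maximal states at time -T share the same randomness, and once they have coalesced by time 0 the common value is an exact draw from the stationary distribution. Reusing the uniforms across restarts is essential for correctness.

    import numpy as np

    rng = np.random.default_rng(6)
    n, p = 10, 0.6
    U = {}                                   # uniforms, reused across restarts

    def step(x, u):
        """Monotone update: move up with probability p, down otherwise."""
        return min(n, x + 1) if u < p else max(0, x - 1)

    T = 1
    while True:
        for t in range(-T, 0):
            U.setdefault(t, rng.uniform())   # fix the randomness at each time
        lo, hi = 0, n                        # sandwiching chains started at -T
        for t in range(-T, 0):
            lo, hi = step(lo, U[t]), step(hi, U[t])
        if lo == hi:                         # coalescence: exact stationary draw
            print(lo)
            break
        T *= 2                               # otherwise restart from further back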
After the Revolution: Envoi

To be continued...

...standing on the shoulders of giants