Background
           ABC methods for generative models
                         MC2 type methods
               State-Space models, PMCMC
                                      SMC2




      Dealing with intractability: recent advances in
      Bayesian Monte Carlo methods for intractable
                        likelihoods

                                    N. CHOPIN¹

                                     CREST-ENSAE

  ¹ Joint work with S. BARTHELMÉ, P.E. JACOB, & O. PAPASPILIOPOULOS
                                 N. CHOPIN     Intractability   1/ 54


Outline


  1   Background

  2   ABC methods for generative models

  3   MC2 type methods

  4   State-Space models, PMCMC

  5   SMC2





Tractable models
  For a prototypical Bayesian model, defined by (a) a prior p(θ) and (b) a
  likelihood p(y|θ), a standard approach is to sample from the
  posterior
  \[ p(\theta \mid y) \propto p(\theta)\, p(y \mid \theta) \]
  using the Metropolis-Hastings algorithm:
  Metropolis-Hastings
  From current point θ_n:
    1   Sample θ_p ∼ T(θ_n, dθ_p).
    2   With probability 1 ∧ r, take θ_{n+1} = θ_p; otherwise θ_{n+1} = θ_n,
        where
        \[ r = \frac{p(\theta_p)\, p(y \mid \theta_p)\, T(\theta_p, \theta_n)}{p(\theta_n)\, p(y \mid \theta_n)\, T(\theta_n, \theta_p)}. \]

  This generates a Markov chain which leaves the posterior invariant.
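
  A minimal sketch of this Metropolis-Hastings step, under assumptions that are not in the slides: a scalar θ, a symmetric Gaussian random-walk proposal (so the T terms cancel in r), and a toy conjugate model (N(0,1) prior, one observation y = 1 with N(θ,1) likelihood) whose posterior is N(0.5, 0.5). The names `metropolis_hastings` and `log_post` are illustrative.

```python
import math
import random


def metropolis_hastings(log_post, theta0, n_iter, step=1.0, rng=None):
    """Random-walk Metropolis-Hastings for a scalar parameter.

    log_post returns log p(theta) + log p(y | theta) up to a constant.
    The Gaussian random walk is symmetric, so T cancels in the ratio r.
    """
    rng = rng or random.Random()
    theta, lp = theta0, log_post(theta0)
    chain = []
    for _ in range(n_iter):
        prop = theta + step * rng.gauss(0.0, 1.0)
        lp_prop = log_post(prop)
        # Accept with probability 1 ∧ r, evaluated on the log scale.
        if math.log(1.0 - rng.random()) < lp_prop - lp:
            theta, lp = prop, lp_prop
        chain.append(theta)
    return chain


def log_post(theta, y=1.0):
    # Toy model (assumption): theta ~ N(0, 1), y | theta ~ N(theta, 1).
    return -0.5 * theta ** 2 - 0.5 * (y - theta) ** 2


chain = metropolis_hastings(log_post, 0.0, 20000, step=1.5, rng=random.Random(0))
kept = chain[2000:]                      # discard burn-in
post_mean = sum(kept) / len(kept)        # should be close to 0.5
```

  The point of the sketch is that the sampler only needs point-wise evaluations of p(θ)p(y|θ); the rest of the talk is about what to do when these are not available.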


Intractable models
  This generic approach cannot be applied in the following
  situations:
    1   The likelihood reads p(y |θ) = C (θ)hθ (y ), where C (θ) is an
        intractable normalising constant; e.g. log-linear models, Ising
        models.
    2   The likelihood p(y|θ) is an intractable integral
        \[ p(y \mid \theta) = \int_{\mathcal{X}} p(y, x \mid \theta)\, dx \]
        of a tractable integrand; e.g. state-space models.
    3   The likelihood is even more complicated, because it
        corresponds to some generative process (scientific models).
  Solutions to these problems involve auxiliary variables.


Example of a generative model: reaction times
  A subject must choose between k alternatives. The evidence e_j(t) in
  favour of choice j follows a Brownian motion with drift:
  \[ \tau\, de_j(t) = m_j\, dt + dW_t^j . \]
  The decision is taken when one evidence process “wins the race”, i.e. is the first to reach its threshold; see plot.

  [Figure: evidence for A and evidence for B plotted against time (ms, 0 to 150), each rising towards its own threshold.]
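
  Such a model is easy to simulate even though its likelihood is not available. The sketch below is an Euler-Maruyama discretisation of the race, under assumptions not stated on the slide: all evidence processes start at 0, each alternative has a fixed threshold, and time is in arbitrary units. `simulate_race` and its arguments are illustrative names.

```python
import math
import random


def simulate_race(drifts, thresholds, tau=1.0, dt=1e-3, t_max=10.0, rng=None):
    """Simulate tau * de_j = m_j dt + dW^j until one process reaches its threshold.

    Returns (index of the winning alternative, decision time), or
    (None, t_max) if no threshold is reached before t_max.
    """
    rng = rng or random.Random()
    evidence = [0.0] * len(drifts)
    n_steps = int(t_max / dt)
    for step in range(1, n_steps + 1):
        for j, m in enumerate(drifts):
            # Euler-Maruyama increment: (m dt + sqrt(dt) * N(0, 1)) / tau
            evidence[j] += (m * dt + math.sqrt(dt) * rng.gauss(0.0, 1.0)) / tau
        crossed = [j for j in range(len(drifts)) if evidence[j] >= thresholds[j]]
        if crossed:
            # If several cross in the same step, pick the largest overshoot.
            winner = max(crossed, key=lambda j: evidence[j] - thresholds[j])
            return winner, step * dt
    return None, t_max


rng = random.Random(1)
trials = [simulate_race([3.0, 0.0], [1.0, 1.0], rng=rng) for _ in range(200)]
share_a = sum(1 for winner, _ in trials if winner == 0) / len(trials)
```

  Here alternative 0 has a positive drift and alternative 1 has none, so `share_a` should be well above one half. A simulator of this kind is all that ABC requires.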





ABC methods for generative models

  ABC stands for “Approximate Bayesian Computation”. In such
  algorithms, the auxiliary variable is an artificial dataset y ∼ p(y|θ).
  Denote the actual dataset by y⋆. Consider the simple rejection
  algorithm:
  Basic ABC
  Repeat
    1   Sample θ ∼ p(θ).
    2   Sample y ∼ p(y|θ).
    3   Accept θ with probability K_ε(‖s(y) − s(y⋆)‖),

  where K_ε(x) = K(x/ε), K is a kernel function, and s is a vector of
  “summary statistics”.
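
  A minimal sketch of the rejection algorithm, with choices that are assumptions rather than the talk's: a uniform kernel K(x) = 1{x ≤ 1}, so that step 3 accepts exactly when the distance is at most ε; a scalar summary (the sample mean); and a toy model with a N(0, 2²) prior and 20 observations from N(θ, 1). All function names are illustrative.

```python
import random


def abc_rejection(s_obs, prior_sample, simulate, summary, eps, n_accept, rng):
    """Basic ABC with a uniform kernel: keep theta when |s(y) - s_obs| <= eps."""
    accepted = []
    while len(accepted) < n_accept:
        theta = prior_sample(rng)           # step 1: theta ~ p(theta)
        y = simulate(theta, rng)            # step 2: y ~ p(y | theta)
        if abs(summary(y) - s_obs) <= eps:  # step 3: K_eps is 0 or 1 here
            accepted.append(theta)
    return accepted


def mean(values):
    return sum(values) / len(values)


N_OBS = 20
rng = random.Random(0)
# Stand-in for the actual dataset (simulated here with theta = 1).
y_star = [rng.gauss(1.0, 1.0) for _ in range(N_OBS)]

sample = abc_rejection(
    s_obs=mean(y_star),
    prior_sample=lambda r: r.gauss(0.0, 2.0),
    simulate=lambda th, r: [r.gauss(th, 1.0) for _ in range(N_OBS)],
    summary=mean,
    eps=0.1,
    n_accept=500,
    rng=rng,
)
abc_mean = mean(sample)
```

  In this toy case the sample mean is sufficient, so the accepted θ's approximate the true posterior and `abc_mean` lands close to the observed mean. Shrinking ε improves the approximation and lowers the acceptance rate.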



ABC target



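  This algorithm samples from
  \[ \pi_\varepsilon(\theta, y) \propto p(\theta)\, p(y \mid \theta)\, K_\varepsilon\big(\lVert s(y) - s(y^\star) \rVert\big), \]
  where y⋆ is the actual dataset, and the marginal π_ε(θ) → p(θ|s(y⋆)) as ε → 0.

  If s is sufficient, then the limit is the true posterior,
  p(θ|s(y⋆)) = p(θ|y⋆), but unfortunately a sufficient s is rarely available.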






MCMC-ABC

 One can instead derive an MCMC algorithm that samples from the
 same distribution.
 MCMC-ABC
 From current point (θ_n, y_n):
   1   Sample θ_p ∼ T(θ_n, dθ_p).
   2   Sample y_p ∼ p(y|θ_p).
   3   With probability 1 ∧ r, take (θ_{n+1}, y_{n+1}) = (θ_p, y_p); otherwise
       (θ_{n+1}, y_{n+1}) = (θ_n, y_n), where
       \[ r = \frac{p(\theta_p)\, K_\varepsilon\big(\lVert s(y_p) - s(y^\star) \rVert\big)\, T(\theta_p, \theta_n)}{p(\theta_n)\, K_\varepsilon\big(\lVert s(y_n) - s(y^\star) \rVert\big)\, T(\theta_n, \theta_p)} \]
       and y⋆ is the actual dataset.
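
 A minimal sketch of MCMC-ABC, under assumptions not in the slides: a Gaussian kernel K(x) = exp(−x²/2), a symmetric random-walk proposal (T cancels), a scalar summary, and the same kind of toy Gaussian model as a test case. Note that the kernel value of the current state is stored and reused, never re-simulated. Names are illustrative.

```python
import math
import random


def mcmc_abc(s_obs, log_prior, simulate_summary, eps, theta0, n_iter, step, rng):
    """MCMC-ABC on the pair (theta, y), keeping only s(y) for the current state."""

    def log_kernel(s):
        # Gaussian kernel (assumption): log K_eps(|s - s_obs|)
        return -0.5 * ((s - s_obs) / eps) ** 2

    theta = theta0
    log_k = log_kernel(simulate_summary(theta, rng))
    chain = []
    for _ in range(n_iter):
        prop = theta + step * rng.gauss(0.0, 1.0)             # step 1
        log_k_prop = log_kernel(simulate_summary(prop, rng))  # step 2
        log_r = log_prior(prop) + log_k_prop - log_prior(theta) - log_k
        if math.log(1.0 - rng.random()) < log_r:              # step 3
            theta, log_k = prop, log_k_prop
        chain.append(theta)
    return chain


N_OBS = 20
rng = random.Random(0)
# Summary of a stand-in "actual" dataset, simulated with theta = 1.
s_obs = sum(rng.gauss(1.0, 1.0) for _ in range(N_OBS)) / N_OBS

chain = mcmc_abc(
    s_obs=s_obs,
    log_prior=lambda th: -0.5 * (th / 2.0) ** 2,              # N(0, 2^2) prior
    simulate_summary=lambda th, r: sum(r.gauss(th, 1.0) for _ in range(N_OBS)) / N_OBS,
    eps=0.1,
    theta0=0.0,
    n_iter=20000,
    step=0.5,
    rng=rng,
)
kept = chain[2000:]
abc_mean = sum(kept) / len(kept)
```

 The likelihood p(y|θ) never appears in the ratio: it cancels because y_p is proposed from the model itself.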




Remarks on the KDE interpretation of ABC


  Having sampled N pairs (θi , y i ) from p(θ)p(y |θ), choosing ε
  essentially amounts to choosing the bandwidth of a KDE. There
  are some specific aspects that may deserve some investigation
  however:
    1   The objective is to approximate a conditional density, that is
        p(θ|s(y )). (But approximating p(s(y )) may be interesting
        too.)
    2   The marginal distribution of the simulated θ’s is known.
    3   Could we use a bandwidth matrix instead?





Parametric interpretation of ABC


  It would be great to take s(y ) = y . In that way, the ABC posterior
  could be interpreted as the posterior distribution of the same
  model, but corrupted with noise (of size ε). See the following
  paper for a fast (EP) approximation of such an ABC posterior:

  Barthelmé, S. and Chopin, N. (2011). ABC-EP: Expectation
  Propagation for Likelihood-free Bayesian Computation, ICML
  2011, L. Getoor and T. Scheffer (eds), 289-296. (see also
  arXiv:1107.5959).






ABC: summary


  We use ABC for very challenging models (generative/scientific
  models). We pay a heavy price for this:
    1   First level of approximation is p(θ|y ) ≈ p(θ|s(y ))
        (although not in ABC-EP).
    2   Second level of approximation is p(θ|s(y )) ≈ πε (θ).
    3   Huge CPU cost (but less in ABC-EP).
    4   ABC-EP cannot be used in all situations.
  In the rest of the talk, we will deal with milder problems, and we
  will be able to avoid approximations.





Basic framework

  Imagine a model such that
  \[ p(y \mid \theta) = \int p(x, y \mid \theta)\, dx \]
  is intractable, but can be approximated by the following unbiased
  MC estimate:
  \[ \hat{p}(y \mid \theta) = \frac{1}{N} \sum_{j=1}^{N} \frac{p(x^j, y \mid \theta)}{q_\theta(x^j)}, \]
  where the x^j's are N points sampled from the (user-chosen)
  proposal distribution q_θ.
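
  A small sketch of this importance-sampling estimate on a toy model where the answer is known (an assumption made here so the estimate can be checked): x|θ ∼ N(θ, 1) and y|x ∼ N(x, 1), hence p(y|θ) = N(y; θ, 2). The proposal q_θ = N((θ + y)/2, 1) is one possible user choice; the function names are illustrative.

```python
import math
import random


def norm_pdf(z, mean, var):
    return math.exp(-0.5 * (z - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def lik_hat(y, theta, n, rng):
    """Unbiased estimate of p(y | theta) = ∫ p(x, y | theta) dx by importance sampling."""
    centre = 0.5 * (theta + y)          # proposal q_theta = N(centre, 1)
    total = 0.0
    for _ in range(n):
        x = rng.gauss(centre, 1.0)
        joint = norm_pdf(x, theta, 1.0) * norm_pdf(y, x, 1.0)   # p(x, y | theta)
        total += joint / norm_pdf(x, centre, 1.0)
    return total / n


y, theta = 1.5, 0.3
estimate = lik_hat(y, theta, 20000, random.Random(0))
exact = norm_pdf(y, theta, 2.0)         # closed form, available only in this toy case
```

  With N = 20000 draws the estimate agrees with the exact value to within a few percent; what matters for what follows is that it is unbiased for any N.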



Naive question

  Can we simply replace p(y|θ) by p̂(y|θ)? I.e.

  MC2
  From current point θ_n (plus p̂(y|θ_n) from the previous iteration):
    1   Sample θ_p ∼ T(θ_n, dθ_p).
    2   Sample x^{1:N} ∼ q_{θ_p} so as to compute p̂(y|θ_p).
    3   With probability 1 ∧ r, set θ_{n+1} = θ_p; otherwise θ_{n+1} = θ_n,
        with
        \[ r = \frac{p(\theta_p)\, \hat{p}(y \mid \theta_p)\, T(\theta_p, \theta_n)}{p(\theta_n)\, \hat{p}(y \mid \theta_n)\, T(\theta_n, \theta_p)}. \]
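
  A minimal sketch of this scheme, on an assumed toy model (θ ∼ N(0, 1), x|θ ∼ N(θ, 1), y|x ∼ N(x, 1), one observation y = 1.5, exact posterior mean y/3 = 0.5). The proposal for x is taken to be q_θ = p(x|θ), so each importance weight reduces to p(y|x). The key implementation detail is that the estimate attached to the current θ is stored and reused, not recomputed. Names are illustrative.

```python
import math
import random


def lik_hat(y, theta, n, rng):
    """Unbiased estimate of p(y | theta), with q_theta = p(x | theta) = N(theta, 1)."""
    total = 0.0
    for _ in range(n):
        x = rng.gauss(theta, 1.0)
        total += math.exp(-0.5 * (y - x) ** 2) / math.sqrt(2.0 * math.pi)  # p(y | x)
    return total / n


def pseudo_marginal_mh(y, log_prior, theta0, n_iter, n_mc, step, rng):
    theta, lik = theta0, lik_hat(y, theta0, n_mc, rng)
    chain = []
    for _ in range(n_iter):
        prop = theta + step * rng.gauss(0.0, 1.0)      # symmetric proposal, T cancels
        lik_prop = lik_hat(y, prop, n_mc, rng)         # fresh x^{1:N} for the proposal
        ratio = math.exp(log_prior(prop) - log_prior(theta)) * lik_prop / lik
        if rng.random() < ratio:
            theta, lik = prop, lik_prop                # keep the estimate with theta
        chain.append(theta)
    return chain


chain = pseudo_marginal_mh(
    y=1.5,
    log_prior=lambda th: -0.5 * th ** 2,
    theta0=0.0,
    n_iter=20000,
    n_mc=10,
    step=1.5,
    rng=random.Random(0),
)
kept = chain[2000:]
post_mean = sum(kept) / len(kept)
```

  Even with only N = 10 draws per estimate the chain targets the exact posterior; a small N costs mixing speed, not correctness.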





Answer: yes, and the algorithm is exact!

  More precisely, this algorithm is a correct Metropolis step with
  respect to the following extended distribution:
  \[ \pi(\theta, x^{1:N}) \propto p(\theta) \left\{ \prod_{j=1}^{N} q_\theta(x^j) \right\} \left\{ \frac{1}{N} \sum_{j=1}^{N} \frac{p(x^j, y \mid \theta)}{q_\theta(x^j)} \right\}, \]
  which is such that the marginal distribution of θ is precisely the
  true posterior distribution:
  \[ \int \pi(\theta, x^{1:N})\, dx^{1:N} = p(\theta \mid y). \]
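
  The marginalisation can be checked directly. The derivation below is a sketch using only the symbols of this slide: in the j-th term the factor q_θ(x^j) cancels, the remaining proposal densities integrate to one, and p(x^j, y|θ) integrates to p(y|θ).

```latex
\begin{align*}
\int p(\theta) \Big\{ \prod_{i=1}^{N} q_\theta(x^i) \Big\}
     \frac{1}{N} \sum_{j=1}^{N} \frac{p(x^j, y \mid \theta)}{q_\theta(x^j)} \, dx^{1:N}
 &= p(\theta)\, \frac{1}{N} \sum_{j=1}^{N}
    \int p(x^j, y \mid \theta) \prod_{i \neq j} q_\theta(x^i) \, dx^{1:N} \\
 &= p(\theta)\, \frac{1}{N} \sum_{j=1}^{N} p(y \mid \theta) \\
 &= p(\theta)\, p(y \mid \theta) \;\propto\; p(\theta \mid y).
\end{align*}
```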






State Space Models
  A system of equations
      Hidden states (Markov): p(x1 |θ) = µθ (x1 ) and for t ≥ 1

                 p(xt+1 |x1:t , θ) = p(xt+1 |xt , θ) = fθ (xt+1 |xt )

      Observations:

               p(yt |y1:t−1 , x1:t−1 , θ) = p(yt |xt , θ) = gθ (yt |xt )

      Parameter: θ ∈ Θ, prior p(θ). We observe y_{1:T} = (y_1, . . . , y_T);
      T might be large (≈ 10^4). x and θ may also be
      multi-dimensional.
  There are several interesting models for which fθ cannot be written
  in closed form (but it can be simulated).


State Space Models

  Some interesting distributions
  Bayesian inference focuses on:

           static: p(θ|y1:T )           dynamic: p(θ|y1:t ) , t ∈ 1 : T

  Filtering (traditionally) focuses on:

                              ∀t ∈ [1, T ]       pθ (xt |y1:t )

  Smoothing (traditionally) focuses on:

                             ∀t ∈ [1, T ]        pθ (xt |y1:T )




Examples




  Population growth model
  \begin{align*}
    y_t &= x_t + \sigma_w \varepsilon_t \\
    \log x_{t+1} &= \log x_t + b_0 + b_1 (x_t)^{b_2} + \sigma_\epsilon \eta_t
  \end{align*}
  θ = (b_0, b_1, b_2, σ_ϵ, σ_w).






Examples


  Stochastic Volatility (Lévy-driven models)
      Observations (“log returns”):
      \[ y_t = \mu + \beta v_t + v_t^{1/2} \epsilon_t, \quad t \ge 1 \]
      Hidden states (“actual volatility”, an integrated process):
      \[ v_{t+1} = \frac{1}{\lambda} \Big( z_t - z_{t+1} + \sum_{j=1}^{k} e_j \Big) \]






Examples


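  . . . where the process z_t is the “spot volatility”:
  \[ z_{t+1} = e^{-\lambda} z_t + \sum_{j=1}^{k} e^{-\lambda (t+1-c_j)} e_j \]
  \[ k \sim \mathrm{Poi}\big(\lambda \xi^2 / \omega^2\big), \qquad c_{1:k} \overset{iid}{\sim} \mathcal{U}(t, t+1), \qquad e_{1:k} \overset{iid}{\sim} \mathrm{Exp}\big(\xi / \omega^2\big) \]
  The parameter is θ = (µ, β, ξ, ω², λ), and x_t = (v_t, z_t).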






Why are those models challenging?




  . . . It is effectively impossible to compute the likelihood
  \[ p(y_{1:T} \mid \theta) = \int_{\mathcal{X}^T} p(y_{1:T} \mid x_{1:T}, \theta)\, p(x_{1:T} \mid \theta)\, dx_{1:T}. \]
  Similarly, all other inferential quantities are impossible to compute.






Problems with MCMC approaches


     Metropolis-Hastings:
       1   p(θ|y1:T ) cannot be evaluated point-wise (marginal MH)
       2   p(x1:T , θ|y1:T ) is high-dimensional and it is hard to design
           reasonable proposals
     Gibbs sampler (updates states and parameters):
       1   The hidden states x1:T are typically very correlated and it is
           hard to update them efficiently in a block
       2   Parameters and latent variables highly correlated
     Common: they are not designed to recover the whole
     sequence π(x1:t , θ | y1:t )





Particle filters

  Consider the simplified problem of targeting

                                    pθ (xt+1 |y1:t+1 )



  This sequence of distributions is approximated by a sequence of
  weighted particles which are properly weighted using importance
  sampling, mutated/propagated according to the system dynamics,
  and resampled to control the variance.
  Below we give a pseudo-code version. Any operation involving the
  superscript n must be understood as performed for n = 1 : Nx ,
  where Nx is the total number of particles.



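  Step 1: At iteration t = 1,
           (a) Sample $x_1^n \sim q_{1,\theta}(\cdot)$.
           (b) Compute and normalise weights
               \[ w_{1,\theta}(x_1^n) = \frac{\mu_\theta(x_1^n)\, g_\theta(y_1 \mid x_1^n)}{q_{1,\theta}(x_1^n)}, \qquad W_{1,\theta}^n = \frac{w_{1,\theta}(x_1^n)}{\sum_{i=1}^{N_x} w_{1,\theta}(x_1^i)}. \]

  Step 2: At iteration t = 2 : T,
           (a) Sample the index $a_{t-1}^n \sim \mathcal{M}(W_{t-1,\theta}^{1:N_x})$ of the ancestor.
           (b) Sample $x_t^n \sim q_{t,\theta}(\cdot \mid x_{t-1}^{a_{t-1}^n})$.
           (c) Compute and normalise weights
               \[ w_{t,\theta}\big(x_{t-1}^{a_{t-1}^n}, x_t^n\big) = \frac{f_\theta\big(x_t^n \mid x_{t-1}^{a_{t-1}^n}\big)\, g_\theta(y_t \mid x_t^n)}{q_{t,\theta}\big(x_t^n \mid x_{t-1}^{a_{t-1}^n}\big)}, \qquad W_{t,\theta}^n = \frac{w_{t,\theta}\big(x_{t-1}^{a_{t-1}^n}, x_t^n\big)}{\sum_{i=1}^{N_x} w_{t,\theta}\big(x_{t-1}^{a_{t-1}^i}, x_t^i\big)}. \]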
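
  A minimal sketch of these two steps for the bootstrap choice q_{1,θ} = µ_θ and q_{t,θ} = f_θ, in which case the weights reduce to g_θ(y_t|x_t). The model is an assumption made for illustration: a linear Gaussian state-space model x_1 ∼ N(0, 1), x_t = ρ x_{t−1} + σ η_t, y_t = x_t + τ ε_t. Multinomial resampling is done at every step, and `bootstrap_pf` is an illustrative name.

```python
import math
import random


def norm_pdf(z, mean, sd):
    return math.exp(-0.5 * ((z - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def bootstrap_pf(y, rho, sigma, tau, n_x, rng):
    """Bootstrap particle filter.

    Returns (estimate of log p(y_{1:T} | theta), list of filtering means).
    """
    loglik, means, w = 0.0, [], []
    x = [rng.gauss(0.0, 1.0) for _ in range(n_x)]              # step 1(a)
    for t, y_t in enumerate(y):
        if t > 0:
            ancestors = rng.choices(x, weights=w, k=n_x)        # step 2(a)
            x = [rho * a + sigma * rng.gauss(0.0, 1.0)          # step 2(b)
                 for a in ancestors]
        w = [norm_pdf(y_t, xi, tau) for xi in x]                # weights = g(y_t | x_t)
        loglik += math.log(sum(w) / n_x)
        means.append(sum(wi * xi for wi, xi in zip(w, x)) / sum(w))
    return loglik, means


# Simulate a short series from the assumed model, then filter it.
rng = random.Random(0)
rho, sigma, tau = 0.9, 0.5, 0.5
x_true, y = rng.gauss(0.0, 1.0), []
for _ in range(30):
    y.append(x_true + tau * rng.gauss(0.0, 1.0))
    x_true = rho * x_true + sigma * rng.gauss(0.0, 1.0)

loglik, means = bootstrap_pf(y, rho, sigma, tau, n_x=1000, rng=rng)
```

  For this linear Gaussian model the Kalman filter gives the exact log-likelihood, which makes it a convenient test case. The sketch does not guard against all weights underflowing to zero.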



Particle filtering




           Figure: Three weighted trajectories x1:t at time t (horizontal axis: time).






Particle filtering




         Figure: Three proposed trajectories x1:t+1 at time t + 1 (horizontal axis: time).






Particle filtering




        Figure: Three reweighted trajectories x1:t+1 at time t + 1 (horizontal axis: time).






Observations


     At each t, $(w_t^{(i)}, x_{1:t}^{(i)})_{i=1}^{N_x}$ is a particle approximation of
     pθ (xt |y1:t ).
     Resampling to avoid degeneracy. If there were no interaction
     between particles there would be typically polynomial or worse
     increase in the variance of weights
     Taking qθ = fθ simplifies weights, but mainly yields a feasible
     algorithm when fθ can only be simulated.






Unbiased likelihood estimator


  A by-product of the PF output is that
  \[ \hat{Z}_t^N = \prod_{s=1}^{t} \Big( \frac{1}{N_x} \sum_{i=1}^{N_x} w_s^{(i)} \Big) \]
  is an unbiased estimator of the likelihood Z_t = p(y_{1:t}|θ) for all t.
  Whereas consistency of the estimator is immediate to check,
  unbiasedness is subtle; see e.g. Proposition 7.4.1 in Del Moral. The
  variance of this estimator typically grows linearly with T (and not
  exponentially) because of lack of independence.





PMCMC

 Breakthrough paper of Andrieu et al. (2011), based on the
 unbiasedness of the PF estimate of the likelihood.
 Marginal PMCMC
 From current point θ_n (and current PF estimate p̂(y|θ_n)):
   1   Sample θ_p ∼ T(θ_n, dθ_p).
   2   Run a PF so as to obtain p̂(y|θ_p), an unbiased estimate of
       p(y|θ_p).
   3   With probability 1 ∧ r, set θ_{n+1} = θ_p; otherwise θ_{n+1} = θ_n,
       with
       \[ r = \frac{p(\theta_p)\, \hat{p}(y \mid \theta_p)\, T(\theta_p, \theta_n)}{p(\theta_n)\, \hat{p}(y \mid \theta_n)\, T(\theta_n, \theta_p)}. \]
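
 A compact sketch of the marginal algorithm, under assumptions chosen for illustration only: the linear Gaussian model x_1 ∼ N(0, 1), x_t = ρ x_{t−1} + σ η_t, y_t = x_t + τ ε_t with σ and τ known, θ = ρ with a uniform prior on (−1, 1), a bootstrap particle filter, and a symmetric random-walk proposal. `pf_loglik` and `pmmh` are illustrative names.

```python
import math
import random


def pf_loglik(y, rho, sigma, tau, n_x, rng):
    """Bootstrap PF estimate of log p(y_{1:T} | rho)."""
    x = [rng.gauss(0.0, 1.0) for _ in range(n_x)]
    loglik, w = 0.0, []
    for t, y_t in enumerate(y):
        if t > 0:
            x = [rho * a + sigma * rng.gauss(0.0, 1.0)
                 for a in rng.choices(x, weights=w, k=n_x)]
        w = [math.exp(-0.5 * ((y_t - xi) / tau) ** 2) for xi in x]
        # Add back the Gaussian normalising constant of g(y_t | x_t).
        loglik += math.log(sum(w) / n_x) - math.log(tau * math.sqrt(2.0 * math.pi))
    return loglik


def pmmh(y, sigma, tau, rho0, n_iter, n_x, step, rng):
    rho, ll = rho0, pf_loglik(y, rho0, sigma, tau, n_x, rng)
    chain, n_acc = [], 0
    for _ in range(n_iter):
        prop = rho + step * rng.gauss(0.0, 1.0)                  # step 1
        if -1.0 < prop < 1.0:                                    # prior is zero outside
            ll_prop = pf_loglik(y, prop, sigma, tau, n_x, rng)   # step 2
            if math.log(1.0 - rng.random()) < ll_prop - ll:      # step 3
                rho, ll, n_acc = prop, ll_prop, n_acc + 1
        chain.append(rho)
    return chain, n_acc / n_iter


# Short simulated series from the assumed model (rho = 0.8).
rng = random.Random(0)
sigma, tau = 0.5, 0.5
x_true, y = rng.gauss(0.0, 1.0), []
for _ in range(30):
    y.append(x_true + tau * rng.gauss(0.0, 1.0))
    x_true = 0.8 * x_true + sigma * rng.gauss(0.0, 1.0)

chain, acc_rate = pmmh(y, sigma, tau, rho0=0.5, n_iter=200, n_x=100, step=0.15, rng=rng)
```

 As in the MC2 case, the estimate for the current θ is carried along with it. The run above is far too short for inference and is only meant to show the mechanics.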




Objectives




    1   to derive sequentially

             p(θ, x1:t |y1:t ),       p(y1:t ),    for all t ∈ {1, . . . , T }

    2   to obtain a black box algorithm (automatic calibration).






Main tools of our approach



      Particle filter algorithms for state-space models (this will be to
      estimate the likelihood, for a fixed θ).
      Iterated Batch Importance Sampling for sequential Bayesian
      inference for parameters (this will be the theoretical algorithm
      we will try to approximate).
  Both are sequential Monte Carlo (SMC) methods






IBIS


  SMC method for particle approximation of the sequence p(θ | y1:t )
  for t = 1 : T . PF is not going to work here by just pretending that
  θ is a dynamic process with zero (or small) variance. Recall the
  path degeneracy problem.
  In the next slide we give the pseudo-code of the IBIS algorithm.
  Operations with superscript m must be understood as operations
  performed for all m ∈ 1 : Nθ , where Nθ is the total number of
  θ-particles.






Sample θ^m from p(θ) and set ω^m ← 1. Then, at time t = 1, . . . , T
        (a) Compute the incremental weights and their weighted
            average
            \[ u_t(\theta^m) = p(y_t \mid y_{1:t-1}, \theta^m), \qquad L_t = \frac{1}{\sum_{m=1}^{N_\theta} \omega^m} \times \sum_{m=1}^{N_\theta} \omega^m u_t(\theta^m), \]
        (b) Update the importance weights,
            \[ \omega^m \leftarrow \omega^m u_t(\theta^m). \tag{1} \]
        (c) If some degeneracy criterion is fulfilled, sample $\tilde{\theta}^m$
            independently from the mixture distribution
            \[ \frac{1}{\sum_{m=1}^{N_\theta} \omega^m} \sum_{m=1}^{N_\theta} \omega^m K_t(\theta^m, \cdot). \]
            Finally, replace the current weighted particle system:
            \[ (\theta^m, \omega^m) \leftarrow (\tilde{\theta}^m, 1). \]
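
A minimal sketch of IBIS on a model where the incremental weights are tractable. Everything specific here is an assumption for illustration: i.i.d. data y_t ∼ N(θ, 1), so that u_t(θ) = p(y_t|θ); a N(0, prior_sd²) prior; the degeneracy criterion ESS < Nθ/2; and K_t taken as one random-walk Metropolis step whose scale is the weighted standard deviation of the current θ-particles. Sampling from the mixture is done by resampling and then applying K_t.

```python
import math
import random


def log_norm_pdf(z, mean, sd):
    return -0.5 * ((z - mean) / sd) ** 2 - math.log(sd) - 0.5 * math.log(2.0 * math.pi)


def ibis(y, n_theta, prior_sd, rng):
    """Returns (theta-particles, weights, estimate of log p(y_{1:T}))."""

    def log_post(th, t):
        # log p(th) + log p(y_{1:t} | th), the target of the kernel K_t
        return log_norm_pdf(th, 0.0, prior_sd) + sum(log_norm_pdf(v, th, 1.0) for v in y[:t])

    theta = [rng.gauss(0.0, prior_sd) for _ in range(n_theta)]
    w = [1.0] * n_theta
    log_evidence = 0.0
    for t, y_t in enumerate(y, start=1):
        u = [math.exp(log_norm_pdf(y_t, th, 1.0)) for th in theta]              # (a)
        log_evidence += math.log(sum(wi * ui for wi, ui in zip(w, u)) / sum(w))  # log L_t
        w = [wi * ui for wi, ui in zip(w, u)]                                    # (b)
        ess = sum(w) ** 2 / sum(wi * wi for wi in w)
        if ess < n_theta / 2:                                                    # (c)
            mean = sum(wi * th for wi, th in zip(w, theta)) / sum(w)
            sd = math.sqrt(sum(wi * (th - mean) ** 2 for wi, th in zip(w, theta)) / sum(w))
            moved = []
            for th in rng.choices(theta, weights=w, k=n_theta):                  # resample
                prop = th + sd * rng.gauss(0.0, 1.0)                             # move with K_t
                if math.log(1.0 - rng.random()) < log_post(prop, t) - log_post(th, t):
                    th = prop
                moved.append(th)
            theta, w = moved, [1.0] * n_theta
    return theta, w, log_evidence


rng = random.Random(0)
y = [rng.gauss(1.0, 1.0) for _ in range(30)]
theta, w, log_ev = ibis(y, n_theta=500, prior_sd=3.0, rng=rng)
post_mean = sum(wi * th for wi, th in zip(w, theta)) / sum(w)
```

In this conjugate example both the posterior mean and the evidence are available in closed form, so `post_mean` and `log_ev` can be checked against them. The cost of each move step grows with t because the kernel needs the full likelihood of y_{1:t}.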


Observations


     Cost of lack of ergodicity in θ: the occasional MCMC move
     Still, in regular problems resampling happens at diminishing
     frequency (logarithmically)
     Kt is an MCMC kernel invariant wrt π(θ | y1:t ). Its
     parameters can be chosen using information from current
     population of θ-particles
     Lt is a MC estimator of the model evidence
     Infeasible to implement for state-space models: intractable
     incremental weights, and MCMC kernel





Our algorithm: SMC2



  We provide a generic (black box) algorithm for recovering the
  sequence of parameter posterior distributions, as well as the filtering,
  smoothing and predictive distributions.
  We give next a pseudo-code; the code seems to only track the
  parameter posteriors, but actually it does all the other jobs as well.
  Superficially, it looks like an approximation of IBIS, but in fact it does
  not produce any systematic errors (unbiased MC).





Sample θ^m from p(θ) and set ω^m ← 1. Then, at time
t = 1, . . . , T ,
            (a) For each particle θ^m, perform iteration t of the PF: if
                t = 1, sample independently $x_1^{1:N_x,m}$ from $\psi_{1,\theta^m}$, and
                compute
                \[ \hat{p}(y_1 \mid \theta^m) = \frac{1}{N_x} \sum_{n=1}^{N_x} w_{1,\theta}(x_1^{n,m}); \]
                if t > 1, sample $(x_t^{1:N_x,m}, a_{t-1}^{1:N_x,m})$ from $\psi_{t,\theta^m}$
                conditional on $(x_{1:t-1}^{1:N_x,m}, a_{1:t-2}^{1:N_x,m})$, and compute
                \[ \hat{p}(y_t \mid y_{1:t-1}, \theta^m) = \frac{1}{N_x} \sum_{n=1}^{N_x} w_{t,\theta}\big(x_{t-1}^{a_{t-1}^{n,m},m}, x_t^{n,m}\big). \]


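            (b) Update the importance weights,
                \[
                \omega^m \leftarrow \omega^m \, \hat{p}(y_t \mid y_{1:t-1}, \theta^m).
                \]
            (c) If some degeneracy criterion is fulfilled, sample
                $(\tilde{\theta}^m, \tilde{x}_{1:t}^{1:N_x,m}, \tilde{a}_{1:t-1}^{1:N_x,m})$ independently from
                \[
                \frac{1}{\sum_{m=1}^{N_\theta} \omega^m} \sum_{m=1}^{N_\theta} \omega^m \, K_t\Big\{ \big(\theta^m, x_{1:t}^{1:N_x,m}, a_{1:t-1}^{1:N_x,m}\big), \cdot \Big\}.
                \]
                Finally, replace the current weighted particle system:
                \[
                \big(\theta^m, x_{1:t}^{1:N_x,m}, a_{1:t-1}^{1:N_x,m}, \omega^m\big) \leftarrow \big(\tilde{\theta}^m, \tilde{x}_{1:t}^{1:N_x,m}, \tilde{a}_{1:t-1}^{1:N_x,m}, 1\big).
                \]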
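
Steps (b) and (c) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' implementation: the function names, the use of log-weights, multinomial resampling, and the criterion "ESS below half of Nθ" are all choices made here, since the slides only speak of "some degeneracy criterion".

```python
import math
import random

def update_log_weights(log_omega, log_incr_lik):
    """Step (b) on the log scale: omega^m <- omega^m * p_hat(y_t | y_{1:t-1}, theta^m)."""
    return [lw + ll for lw, ll in zip(log_omega, log_incr_lik)]

def normalised_weights(log_omega):
    """Normalise exp(log_omega) stably by subtracting the maximum first."""
    top = max(log_omega)
    w = [math.exp(lw - top) for lw in log_omega]
    total = sum(w)
    return [wi / total for wi in w]

def effective_sample_size(log_omega):
    """ESS = 1 / (sum of squared normalised weights); lies between 1 and N_theta."""
    return 1.0 / sum(wi * wi for wi in normalised_weights(log_omega))

def needs_rejuvenation(log_omega, threshold=0.5):
    """Assumed degeneracy criterion: ESS below threshold * N_theta."""
    return effective_sample_size(log_omega) < threshold * len(log_omega)

def resample_indices(log_omega, rng):
    """Multinomial resampling: draw N_theta indices with probability proportional to omega^m."""
    n = len(log_omega)
    return rng.choices(range(n), weights=normalised_weights(log_omega), k=n)
```

In step (c), each resampled particle would then be moved by the kernel K_t, and all weights reset to 1 (log-weight 0).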

Observations

     At first sight the algorithm looks like an approximation to IBIS, and for
     Nx = ∞ it reduces to IBIS. However, no approximation is made: the
     algorithm genuinely samples from p(θ|y1:t ) and all other
     distributions of interest. One would nevertheless expect a larger MC
     variance than with IBIS.
     The validity of the algorithm rests essentially on two results: i)
     the particles are properly weighted, thanks to the unbiasedness of the PF
     estimator of the likelihood; ii) the MCMC kernel is constructed so as
     to leave invariant an extended distribution which admits the
     distributions of interest as marginals; it is a Particle MCMC kernel.
     Thanks to the MCMC updates, the algorithm does not suffer from the
     path degeneracy problem.

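The MCMC step
           (a) Sample $\tilde{\theta}$ from the proposal kernel, $\tilde{\theta} \sim T(\theta, d\tilde{\theta})$.
           (b) Run a new PF for $\tilde{\theta}$: sample independently
               $(\tilde{x}_{1:t}^{1:N_x}, \tilde{a}_{1:t-1}^{1:N_x})$ from $\psi_{t,\tilde{\theta}}$, and compute
               $\hat{Z}_t(\tilde{\theta}, \tilde{x}_{1:t}^{1:N_x}, \tilde{a}_{1:t-1}^{1:N_x})$.
           (c) Accept the move with probability
               \[
               1 \wedge \frac{p(\tilde{\theta}) \, \hat{Z}_t(\tilde{\theta}, \tilde{x}_{1:t}^{1:N_x}, \tilde{a}_{1:t-1}^{1:N_x}) \, T(\tilde{\theta}, \theta)}{p(\theta) \, \hat{Z}_t(\theta, x_{1:t}^{1:N_x}, a_{1:t-1}^{1:N_x}) \, T(\theta, \tilde{\theta})}.
               \]

 It can be shown that this is a standard Hastings-Metropolis kernel
 with proposal
 \[
 q_\theta(\tilde{\theta}, \tilde{x}_{1:t}^{1:N_x}, \tilde{a}_{1:t-1}^{1:N_x}) = T(\theta, \tilde{\theta}) \, \psi_{t,\tilde{\theta}}(\tilde{x}_{1:t}^{1:N_x}, \tilde{a}_{1:t-1}^{1:N_x}),
 \]
 invariant with respect to an extended distribution $\pi_t(\theta, x_{1:t}^{1:N_x}, a_{1:t-1}^{1:N_x})$.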
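
The accept/reject decision of step (c) can be sketched as follows. The function `pmmh_accept` and its arguments are hypothetical names introduced for illustration; working on the log scale is an implementation choice made here to avoid underflow of the likelihood estimates, not something stated on the slide.

```python
import math
import random

def pmmh_accept(log_prior, log_T, theta, log_Z, theta_new, log_Z_new, rng):
    """Return True if the proposed theta_new is accepted.

    log_prior(theta)  : log p(theta)
    log_T(a, b)       : log T(a, b), log density of proposing b from a
    log_Z, log_Z_new  : logs of the PF likelihood estimates for theta and theta_new
    """
    log_ratio = (log_prior(theta_new) + log_Z_new + log_T(theta_new, theta)
                 - log_prior(theta) - log_Z - log_T(theta, theta_new))
    # Accept with probability 1 ∧ exp(log_ratio); the min avoids overflow in exp.
    return rng.random() < math.exp(min(0.0, log_ratio))
```

Note that `log_Z` for the current θ is the value stored when that particle was last accepted; it is not recomputed with fresh particles.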

Some advantages of the algorithm



     Immediate estimates of filtering and predictive distributions
     Immediate and sequential estimator of model evidence
     Easy recovery of smoothing distributions
     Principled framework for automatic calibration of Nx
     Population Monte Carlo advantages
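
To make the second point concrete, here is a sketch in the notation of steps (a) and (b) of the algorithm. The incremental evidence $p(y_t \mid y_{1:t-1})$ can be estimated at each iteration by a weighted average of the particle-filter likelihood increments, where $\omega^m$ denotes the weights before the update of step (b). The symbol $\hat{L}_t$ is introduced here for illustration only.

```latex
\[
\hat{L}_t = \frac{1}{\sum_{m=1}^{N_\theta} \omega^m} \sum_{m=1}^{N_\theta} \omega^m \, \hat{p}(y_t \mid y_{1:t-1}, \theta^m),
\qquad
\hat{p}(y_{1:t}) = \prod_{s=1}^{t} \hat{L}_s .
\]
```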





Numerical illustrations: SV

  [Figure, three panels: (a) squared observations against time; (b) acceptance rates against iterations; (c) Nx against iterations.]

  Figure: Squared observations (synthetic data set), acceptance rates, and
  illustration of the automatic increase of Nx .


Numerical illustrations: SV


  [Figure, four panels (T = 250, T = 500, T = 750, T = 1000): posterior density of µ, plotted for µ between −1 and 1.]

  Figure: Concentration of the posterior distribution for parameter µ.





Numerical illustrations: SV


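  Multifactor model

  \[
  y_t = \mu + \beta v_t + v_t^{1/2} \epsilon_t + \rho_1 \sum_{j=1}^{k_1} e_{1,j} + \rho_2 \sum_{j=1}^{k_2} e_{2,j} - \xi \, \big(w \rho_1 \lambda_1 + (1-w) \rho_2 \lambda_2\big)
  \]

  where $v_t = v_{1,t} + v_{2,t}$, and $(v_i, z_i)_{i=1,2}$ follow the same
  dynamics with parameters $(w_i \xi, w_i \omega^2, \lambda_i)$, with $w_1 = w$ and
  $w_2 = 1 - w$.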





Numerical illustrations: SV




  [Figure, two panels: (a) squared observations against time; (b) "Evidence compared to the one factor model" against iterations, with one curve for "Multi factor without leverage" and one for "Multi factor with leverage".]

  Figure: S&P500 squared observations, and log-evidence comparison
  between models (relative to the one-factor model).

Final Remarks


  A powerful framework
      A generic algorithm for sequential estimation and state
      inference in state-space models: the only requirements are the
      ability (a) to simulate from the Markov transition fθ (xt |xt−1 ), and (b)
      to evaluate the likelihood term gθ (yt |xt ).
      The article is available on arXiv and our web pages
      A package is available at:
                       http://code.google.com/p/py-smc2/.
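
The two requirements above are enough to run a bootstrap particle filter, the building block applied to each θ-particle. The sketch below is an illustration only: `bootstrap_log_likelihood` and its callables are hypothetical names unrelated to the py-smc2 package, multinomial resampling at every step is assumed, and the toy model in the usage lines is made up.

```python
import math
import random

def bootstrap_log_likelihood(ys, sample_initial, sample_transition, log_g, n_x, rng):
    """Return the log of the PF estimate of p(y_{1:T} | theta).

    sample_initial(rng)        : draw x_1 from the initial distribution
    sample_transition(x, rng)  : draw x_t from f_theta(. | x_{t-1} = x)
    log_g(y, x)                : log g_theta(y | x)
    """
    particles = [sample_initial(rng) for _ in range(n_x)]
    log_lik = 0.0
    for t, y in enumerate(ys):
        if t > 0:
            # Resample ancestors in proportion to the previous weights,
            # then propagate each one through the Markov transition.
            ancestors = rng.choices(particles, weights=weights, k=n_x)
            particles = [sample_transition(x, rng) for x in ancestors]
        log_w = [log_g(y, x) for x in particles]
        top = max(log_w)
        weights = [math.exp(lw - top) for lw in log_w]
        # Incremental estimate: average of the unnormalised weights.
        log_lik += top + math.log(sum(weights) / n_x)
    return log_lik

def log_normal_pdf(y, mean, sd):
    return -0.5 * math.log(2.0 * math.pi * sd * sd) - 0.5 * ((y - mean) / sd) ** 2

# Usage on a made-up AR(1) model observed with unit-variance Gaussian noise.
estimate = bootstrap_log_likelihood(
    ys=[0.3, -0.1, 0.4],
    sample_initial=lambda r: r.gauss(0.0, 1.0),
    sample_transition=lambda x, r: 0.9 * x + r.gauss(0.0, 0.5),
    log_g=lambda y, x: log_normal_pdf(y, x, 1.0),
    n_x=200,
    rng=random.Random(0),
)
```

Only `sample_transition` and `log_g` depend on the model, which is what makes the approach generic; the transition density itself is never evaluated.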





Appendix





Why does it work? - Intuition for t = 1




  At time $t = 1$, the algorithm generates variables $\theta^m$ from the prior
  $p(\theta)$, and for each $\theta^m$, the algorithm generates vectors $x_1^{1:N_x,m}$ of
  particles, from $\psi_{1,\theta^m}(x_1^{1:N_x})$.







Thus, the sampling space is $\Theta \times \mathcal{X}^{N_x}$, and the actual "particles" of
the algorithm are $N_\theta$ independent and identically distributed copies
of the random variable $(\theta, x_1^{1:N_x})$, with density:
\[
p(\theta) \, \psi_{1,\theta}(x_1^{1:N_x}) = p(\theta) \prod_{n=1}^{N_x} q_{1,\theta}(x_1^n).
\]







Then, these particles are assigned importance weights
corresponding to the incremental weight function
$\hat{Z}_1(\theta, x_1^{1:N_x}) = N_x^{-1} \sum_{n=1}^{N_x} w_{1,\theta}(x_1^n)$.

This means that, at iteration 1, the target distribution of the
algorithm should be defined as:
\[
\pi_1(\theta, x_1^{1:N_x}) = p(\theta) \, \psi_{1,\theta}(x_1^{1:N_x}) \times \frac{\hat{Z}_1(\theta, x_1^{1:N_x})}{p(y_1)},
\]
where the normalising constant $p(y_1)$ is easily deduced from the
property that $\hat{Z}_1(\theta, x_1^{1:N_x})$ is an unbiased estimator of $p(y_1 \mid \theta)$.
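
The unbiasedness property can be checked numerically on a toy example. Everything specific in this sketch is an assumption made for illustration: the model (µθ = N(0, 1), gθ(y|x) = N(y; x, 1)), the proposal q1,θ = N(0, 4), the observed value, and the function names. For this model the exact value p(y1|θ) = N(y1; 0, 2) is available in closed form.

```python
import math
import random

def normal_pdf(x, mean, sd):
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))

def z1_hat(y1, n_x, rng, proposal_sd=2.0):
    """One draw of Z_hat_1 = N_x^{-1} sum_n w_1(x_1^n), with x_1^n ~ q_1.

    Toy model (an assumption, not from the slides): mu = N(0, 1),
    g(y | x) = N(y; x, 1), and proposal q_1 = N(0, proposal_sd^2).
    """
    total = 0.0
    for _ in range(n_x):
        x = rng.gauss(0.0, proposal_sd)
        # w_1(x) = mu(x) g(y_1 | x) / q_1(x)
        total += (normal_pdf(x, 0.0, 1.0) * normal_pdf(y1, x, 1.0)
                  / normal_pdf(x, 0.0, proposal_sd))
    return total / n_x

rng = random.Random(42)
y1 = 0.7
# Each estimate uses only N_x = 5 particles, so it is individually noisy...
replicates = [z1_hat(y1, 5, rng) for _ in range(20000)]
mean_estimate = sum(replicates) / len(replicates)
# ...but the average over replicates approaches the exact p(y_1 | theta) = N(y_1; 0, 2).
exact = normal_pdf(y1, 0.0, math.sqrt(2.0))
```

The point is that unbiasedness holds for any fixed Nx, however small; increasing Nx only reduces the variance of each individual estimate.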





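Direct substitutions yield
\begin{align*}
\pi_1(\theta, x_1^{1:N_x})
&= \frac{p(\theta)}{p(y_1)} \prod_{i=1}^{N_x} q_{1,\theta}(x_1^i) \left\{ \frac{1}{N_x} \sum_{n=1}^{N_x} \frac{\mu_\theta(x_1^n) \, g_\theta(y_1 \mid x_1^n)}{q_{1,\theta}(x_1^n)} \right\} \\
&= \frac{1}{N_x} \sum_{n=1}^{N_x} \frac{p(\theta)}{p(y_1)} \, \mu_\theta(x_1^n) \, g_\theta(y_1 \mid x_1^n) \left\{ \prod_{i=1, i \neq n}^{N_x} q_{1,\theta}(x_1^i) \right\}
\end{align*}
and noting that, for the triplet $(\theta, x_1, y_1)$ of random variables,
\[
p(\theta) \, \mu_\theta(x_1) \, g_\theta(y_1 \mid x_1) = p(\theta, x_1, y_1) = p(y_1) \, p(\theta \mid y_1) \, p(x_1 \mid y_1, \theta),
\]
one finally gets that:
\[
\pi_1(\theta, x_1^{1:N_x}) = \frac{p(\theta \mid y_1)}{N_x} \sum_{n=1}^{N_x} p(x_1^n \mid y_1, \theta) \left\{ \prod_{i=1, i \neq n}^{N_x} q_{1,\theta}(x_1^i) \right\}.
\]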
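
As a quick check of the last expression, added here as a worked step, one can integrate out all the particles. For each $n$, the factor $p(x_1^n \mid y_1, \theta)$ and each $q_{1,\theta}(x_1^i)$, $i \neq n$, is a probability density and integrates to one, so the $\theta$-marginal of $\pi_1$ is the posterior:

```latex
\[
\int \pi_1(\theta, x_1^{1:N_x}) \, dx_1^{1:N_x}
= \frac{p(\theta \mid y_1)}{N_x} \sum_{n=1}^{N_x}
\left\{ \int p(x_1^n \mid y_1, \theta) \, dx_1^n \right\}
\prod_{i=1, i \neq n}^{N_x} \left\{ \int q_{1,\theta}(x_1^i) \, dx_1^i \right\}
= p(\theta \mid y_1).
\]
```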




By a simple induction, one sees that the target density $\pi_t$ at
iteration $t \geq 2$ should be defined as:
\[
\pi_t(\theta, x_{1:t}^{1:N_x}, a_{1:t-1}^{1:N_x}) = p(\theta) \, \psi_{t,\theta}(x_{1:t}^{1:N_x}, a_{1:t-1}^{1:N_x}) \times \frac{\hat{Z}_t(\theta, x_{1:t}^{1:N_x}, a_{1:t-1}^{1:N_x})}{p(y_{1:t})}
\]
and the following Proposition holds.







Proposition

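The probability density $\pi_t$ may be written as:
\begin{align*}
\pi_t(\theta, x_{1:t}^{1:N_x}, a_{1:t-1}^{1:N_x}) = {} & p(\theta \mid y_{1:t}) \\
& \times \frac{1}{N_x} \sum_{n=1}^{N_x} \frac{p(x_{1:t}^n \mid \theta, y_{1:t})}{N_x^{t-1}}
\left\{ \prod_{i=1, i \neq h_t^n(1)}^{N_x} q_{1,\theta}(x_1^i) \right\} \\
& \times \left\{ \prod_{s=2}^{t} \prod_{i=1, i \neq h_t^n(s)}^{N_x} W_{s-1,\theta}^{a_{s-1}^i} \, q_{s,\theta}\big(x_s^i \mid x_{s-1}^{a_{s-1}^i}\big) \right\}
\end{align*}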




08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?Antenna Manufacturer Coco
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CVKhem
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 

Recently uploaded (20)

Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 

Dealing with intractability: Recent Bayesian Monte Carlo methods for dealing with intractable likelihoods

  • 1. Background | ABC methods for generative models | MC2 type methods | State-Space models, PMCMC | SMC2. Dealing with intractability: recent advances in Bayesian Monte Carlo methods for intractable likelihoods. N. CHOPIN, CREST-ENSAE (joint work with S. BARTHELME, P.E. JACOB, & O. PAPASPILIOPOULOS). N. CHOPIN, Intractability, 1/54
  • 2. Outline: 1 Background; 2 ABC methods for generative models; 3 MC2 type methods; 4 State-Space models, PMCMC; 5 SMC2
  • 3. Tractable models. For a prototypical Bayesian model, defined by (a) a prior $p(\theta)$ and (b) a likelihood $p(y|\theta)$, a standard approach is to sample from the posterior $p(\theta|y) \propto p(\theta)\,p(y|\theta)$ using the Metropolis-Hastings algorithm. Metropolis-Hastings: from current point $\theta_n$, (1) sample $\theta_p \sim T(\theta_n, d\theta_p)$; (2) with probability $1 \wedge r$, take $\theta_{n+1} = \theta_p$, otherwise $\theta_{n+1} = \theta_n$, where $r = \dfrac{p(\theta_p)\,p(y|\theta_p)\,T(\theta_p, \theta_n)}{p(\theta_n)\,p(y|\theta_n)\,T(\theta_n, \theta_p)}$. This generates a Markov chain which leaves the posterior invariant.
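The two Metropolis-Hastings steps can be sketched in a few lines of Python. This is only an illustration under assumptions that are not in the slides: a toy model $y_i \sim N(\theta, 1)$ with a $N(0, 10^2)$ prior, and a symmetric Gaussian random-walk proposal, so that the $T$ terms cancel in $r$. The function names are hypothetical.

```python
import math
import random


def log_prior(theta):
    # Assumed N(0, 10^2) prior on theta, up to an additive constant.
    return -0.5 * (theta / 10.0) ** 2


def log_lik(theta, data):
    # Assumed toy model: y_i ~ N(theta, 1), independent, up to a constant.
    return -0.5 * sum((y - theta) ** 2 for y in data)


def metropolis_hastings(data, n_iter=5000, step=0.5, theta0=0.0, seed=1):
    rng = random.Random(seed)
    theta = theta0
    log_post = log_prior(theta) + log_lik(theta, data)
    chain = []
    for _ in range(n_iter):
        # Step 1: symmetric random-walk proposal, so T cancels in r.
        prop = theta + rng.gauss(0.0, step)
        log_post_prop = log_prior(prop) + log_lik(prop, data)
        # Step 2: accept with probability min(1, r), computed on the log scale.
        log_r = log_post_prop - log_post
        if math.log(1.0 - rng.random()) < log_r:
            theta, log_post = prop, log_post_prop
        chain.append(theta)
    return chain
```

For instance, `metropolis_hastings([2.0] * 20)` returns a chain whose average, after discarding an initial stretch, should sit close to the exact posterior mean of this conjugate toy model.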
  • 4. Intractable models. This generic approach cannot be applied in the following situations: (1) The likelihood reads $p(y|\theta) = C(\theta)\,h_\theta(y)$, where $C(\theta)$ is an intractable normalising constant; e.g. log-linear models, Ising models. (2) The likelihood $p(y|\theta)$ is an intractable integral $p(y|\theta) = \int_{\mathcal{X}} p(y, x|\theta)\,dx$ of a tractable integrand; e.g. state-space models. (3) The likelihood is even more complicated, because it corresponds to some generative process (scientific models). Solutions to these problems involve auxiliary variables.
  • 5. Outline: 1 Background; 2 ABC methods for generative models; 3 MC2 type methods; 4 State-Space models, PMCMC; 5 SMC2
  • 6. Example of a generative model: reaction times. Subject must choose between $k$ alternatives. Evidence $e_j(t)$ in favour of choice $j$ follows a Brownian motion with drift: $\tau\,de_j(t) = m_j\,dt + dW_t^j$. Decision is taken when one evidence “wins the race”; see plot. [Figure: evidence for A and evidence for B plotted against time (ms, 0 to 150), with the thresholds for A and for B.]
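To make the "generative" aspect concrete, here is a minimal Euler-Maruyama sketch of such a race, assuming every accumulator starts at zero and has its own fixed threshold. The function name, the default step size and the tie-breaking rule (largest overshoot) are illustrative assumptions, not taken from the talk.

```python
import math
import random


def simulate_race(drifts, thresholds, tau=1.0, dt=1e-3, t_max=5.0, seed=0):
    """Simulate tau * de_j = m_j dt + dW_j until one accumulator crosses its threshold."""
    rng = random.Random(seed)
    evidence = [0.0] * len(drifts)
    sqrt_dt = math.sqrt(dt)
    n_steps = int(round(t_max / dt))
    for step in range(1, n_steps + 1):
        for j, m in enumerate(drifts):
            # Euler-Maruyama increment of e_j over a time step dt.
            evidence[j] += (m * dt + sqrt_dt * rng.gauss(0.0, 1.0)) / tau
        crossed = [j for j, e in enumerate(evidence) if e >= thresholds[j]]
        if crossed:
            # If several cross in the same step, pick the largest overshoot.
            winner = max(crossed, key=lambda j: evidence[j] - thresholds[j])
            return winner, step * dt
    return None, t_max  # no decision before t_max
```

Simulating a (choice, reaction time) pair is easy here, while the likelihood of such a pair has no convenient closed form; this is the situation ABC is designed for.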
  • 7. ABC methods for generative models. ABC stands for “Approximate Bayesian Computation”. In such algorithms, the auxiliary variable is an artificial dataset $y \sim p(y|\theta)$. Denote the actual dataset $y^\star$. Consider the simple rejection algorithm. Basic ABC: repeat (1) sample $\theta \sim p(\theta)$; (2) sample $y \sim p(y|\theta)$; (3) accept with probability $K_\varepsilon(\|s(y) - s(y^\star)\|)$, where $K_\varepsilon(x) = K(x/\varepsilon)$, $K$ is a kernel function, and $s$ is a vector of “summary statistics”.
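The rejection algorithm can be sketched as follows. Everything specific is an assumption made for illustration: a uniform prior on $(-5, 5)$, a simulator that draws Gaussian data with mean $\theta$, the sample mean as summary statistic, and a uniform kernel, for which "accept with probability $K_\varepsilon$" becomes "accept if the distance is below $\varepsilon$".

```python
import random


def summary(y):
    # Summary statistic s(y): the sample mean (an illustrative choice).
    return sum(y) / len(y)


def abc_rejection(y_obs, n_accept=200, eps=0.1, seed=0):
    rng = random.Random(seed)
    s_obs = summary(y_obs)
    accepted = []
    while len(accepted) < n_accept:
        theta = rng.uniform(-5.0, 5.0)                   # 1. sample from the prior
        y_sim = [rng.gauss(theta, 1.0) for _ in y_obs]   # 2. simulate an artificial dataset
        if abs(summary(y_sim) - s_obs) < eps:            # 3. uniform-kernel acceptance
            accepted.append(theta)
    return accepted
```

Note that the likelihood is never evaluated, only simulated from. Shrinking `eps` reduces the approximation error but lowers the acceptance rate, which is the CPU cost the talk refers to.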
  • 8. ABC target. This algorithm samples from $\pi_\varepsilon(\theta, y) \propto p(\theta)\,p(y|\theta)\,K_\varepsilon(\|s(y) - s(y^\star)\|)$, and the marginal $\pi_\varepsilon(\theta) \to p(\theta|s(y^\star))$ as $\varepsilon \to 0$. If $s$ is sufficient, then the limit is the true posterior, $p(\theta|s(y^\star)) = p(\theta|y^\star)$, but unfortunately this is rarely possible.
  • 9. MCMC-ABC. One can instead derive an MCMC algorithm that samples from the same distribution. MCMC-ABC: from current point $(\theta_n, y_n)$, (1) sample $\theta_p \sim T(\theta_n, d\theta_p)$; (2) sample $y_p \sim p(y|\theta_p)$; (3) with probability $1 \wedge r$, take $(\theta_{n+1}, y_{n+1}) = (\theta_p, y_p)$, otherwise $(\theta_{n+1}, y_{n+1}) = (\theta_n, y_n)$, where $r = \dfrac{p(\theta_p)\,K_\varepsilon(\|s(y_p) - s(y^\star)\|)\,T(\theta_p, \theta_n)}{p(\theta_n)\,K_\varepsilon(\|s(y_n) - s(y^\star)\|)\,T(\theta_n, \theta_p)}$.
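A hedged sketch of the MCMC-ABC step, under assumptions chosen only to keep the example short: Gaussian data with mean $\theta$, a $N(0, 5^2)$ prior, the sample mean as summary, a Gaussian kernel $K(x) = \exp(-x^2/2)$, and a symmetric random-walk proposal. The point to notice is that the kernel value of the current state is stored and reused, not re-simulated.

```python
import math
import random


def summary(y):
    # Summary statistic s(y): the sample mean (an illustrative choice).
    return sum(y) / len(y)


def mcmc_abc(y_obs, n_iter=10000, eps=0.2, step=0.5, seed=0):
    rng = random.Random(seed)
    n, s_obs = len(y_obs), summary(y_obs)

    def log_prior(theta):
        return -0.5 * (theta / 5.0) ** 2  # assumed N(0, 5^2) prior

    def simulate_log_kernel(theta):
        # Draw an artificial dataset and return log K_eps(|s(y) - s(y_obs)|).
        y_sim = [rng.gauss(theta, 1.0) for _ in range(n)]
        return -0.5 * ((summary(y_sim) - s_obs) / eps) ** 2

    theta = s_obs  # arbitrary starting point near the data
    log_k = simulate_log_kernel(theta)
    chain = []
    for _ in range(n_iter):
        prop = theta + rng.gauss(0.0, step)
        log_k_prop = simulate_log_kernel(prop)
        # The likelihood p(y | theta) cancels in r; only prior and kernel remain.
        log_r = log_prior(prop) + log_k_prop - log_prior(theta) - log_k
        if math.log(1.0 - rng.random()) < log_r:
            theta, log_k = prop, log_k_prop
        chain.append(theta)
    return chain
```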
  • 10. Remarks on the KDE interpretation of ABC. Having sampled $N$ pairs $(\theta^i, y^i)$ from $p(\theta)\,p(y|\theta)$, choosing $\varepsilon$ essentially amounts to choosing the bandwidth of a KDE. There are some specific aspects that may deserve some investigation however: (1) The objective is to approximate a conditional density, that is $p(\theta|s(y^\star))$. (But approximating $p(s(y^\star))$ may be interesting too.) (2) The marginal distribution of the simulated $\theta$’s is known. (3) Could we use a bandwidth matrix instead?
  • 11. Parametric interpretation of ABC. It would be great to take $s(y) = y$. In that way, the ABC posterior could be interpreted as the posterior distribution of the same model, but corrupted with noise (of size $\varepsilon$). See the following paper for a fast (EP) approximation of such an ABC posterior: Barthelmé, S. and Chopin, N. (2011). ABC-EP: Expectation Propagation for Likelihood-free Bayesian Computation, ICML 2011, L. Getoor and T. Scheffer (eds), 289-296 (see also arXiv:1107.5959).
  • 12. ABC: summary. We use ABC for very challenging models (generative/scientific models). We pay a heavy price for this: (1) The first level of approximation is $p(\theta|y) \approx p(\theta|s(y))$ (although not in ABC-EP). (2) The second level of approximation is $p(\theta|s(y)) \approx \pi_\varepsilon(\theta)$. (3) Huge CPU cost (but less in ABC-EP). (4) ABC-EP cannot be used in all situations. In the rest of the talk, we will deal with milder problems, and we will be able to avoid approximations.
  • 13. Outline: 1 Background; 2 ABC methods for generative models; 3 MC2 type methods; 4 State-Space models, PMCMC; 5 SMC2
  • 14. Basic framework. Imagine a model such that $p(y|\theta) = \int p(x, y|\theta)\,dx$ is intractable, but can be approximated by the following unbiased MC estimate: $\hat{p}(y|\theta) = \dfrac{1}{N}\sum_{j=1}^{N} \dfrac{p(x^j, y|\theta)}{q_\theta(x^j)}$, where the $x^j$’s are $N$ points sampled from the (user-chosen) proposal distribution $q_\theta$.
  • 15. Naive question. Can we simply replace $p(y|\theta)$ by $\hat{p}(y|\theta)$? I.e. MC2: from current point $\theta_n$ (plus $\hat{p}(y|\theta_n)$ from the previous iteration), (1) sample $\theta_p \sim T(\theta_n, d\theta_p)$; (2) sample $x^{1:N} \sim q_{\theta_p}$ so as to compute $\hat{p}(y|\theta_p)$; (3) with probability $1 \wedge r$, set $\theta_{n+1} = \theta_p$, otherwise $\theta_{n+1} = \theta_n$, with $r = \dfrac{p(\theta_p)\,\hat{p}(y|\theta_p)\,T(\theta_p, \theta_n)}{p(\theta_n)\,\hat{p}(y|\theta_n)\,T(\theta_n, \theta_p)}$.
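The MC2 recipe can be tried on a toy latent-variable model, assumed here purely for illustration: $x_i \sim N(\theta, 1)$, $y_i | x_i \sim N(x_i, 1)$, a $N(0, 10^2)$ prior, and proposal $q_\theta = N(\theta, 1)$, so that the importance ratio $p(x, y|\theta)/q_\theta(x)$ reduces to $p(y|x)$. The sketch keeps the estimate attached to the current point and never refreshes it, as the algorithm requires.

```python
import math
import random


def norm_pdf(x, mean, sd):
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))


def log_lik_hat(theta, ys, n_samples, rng):
    # Log of an unbiased estimate of prod_i p(y_i | theta): one importance
    # sampling average per observation, with x drawn from q_theta = N(theta, 1).
    total = 0.0
    for y in ys:
        mean_w = sum(norm_pdf(y, rng.gauss(theta, 1.0), 1.0)
                     for _ in range(n_samples)) / n_samples
        if mean_w == 0.0:
            return -math.inf
        total += math.log(mean_w)
    return total


def pseudo_marginal_mh(ys, n_iter=4000, n_samples=20, step=0.8, seed=0):
    rng = random.Random(seed)

    def log_prior(theta):
        return -0.5 * (theta / 10.0) ** 2  # assumed N(0, 10^2) prior

    theta = 0.0
    ll = log_lik_hat(theta, ys, n_samples, rng)
    chain = []
    for _ in range(n_iter):
        prop = theta + rng.gauss(0.0, step)  # symmetric proposal: T cancels
        ll_prop = log_lik_hat(prop, ys, n_samples, rng)
        log_r = log_prior(prop) + ll_prop - log_prior(theta) - ll
        if math.log(1.0 - rng.random()) < log_r:
            theta, ll = prop, ll_prop  # keep the estimate with the state
        chain.append(theta)
    return chain
```

In this toy model the exact marginal is $y_i \sim N(\theta, 2)$, so the chain can be checked against the conjugate posterior.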
  • 16. Answer: yes, and the algorithm is exact! More precisely, this algorithm is a correct Metropolis step with respect to the following extended distribution: $\pi(\theta, x^{1:N}) \propto p(\theta)\left\{\prod_{j=1}^{N} q_\theta(x^j)\right\}\left\{\dfrac{1}{N}\sum_{j=1}^{N}\dfrac{p(x^j, y|\theta)}{q_\theta(x^j)}\right\}$, which is such that the marginal distribution of $\theta$ is precisely the true posterior distribution: $\int \pi(\theta, x^{1:N})\,dx^{1:N} = p(\theta|y)$.
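The marginalisation claim takes two lines to check. The following is a sketch of the standard argument; it writes $p(y)$ for the normalising constant, a symbol not used on the slide, and assumes $q_\theta(x) > 0$ wherever $p(x, y|\theta) > 0$.

```latex
\begin{align*}
\int p(\theta)\Big\{\prod_{j=1}^{N} q_\theta(x^j)\Big\}
     \Big\{\frac{1}{N}\sum_{j=1}^{N}\frac{p(x^j, y\mid\theta)}{q_\theta(x^j)}\Big\}\,dx^{1:N}
 &= p(\theta)\,\frac{1}{N}\sum_{j=1}^{N}
    \int \frac{p(x^j, y\mid\theta)}{q_\theta(x^j)}\,q_\theta(x^j)\,dx^j
    \prod_{k\neq j}\int q_\theta(x^k)\,dx^k \\
 &= p(\theta)\,\frac{1}{N}\sum_{j=1}^{N} p(y\mid\theta)
  \;=\; p(\theta)\,p(y\mid\theta),
\end{align*}
% and dividing by p(y) = \int p(\theta) p(y | \theta) d\theta gives p(\theta | y).
```

In other words, the unbiasedness of $\hat{p}(y|\theta)$ under $q_\theta$ is exactly what makes the $\theta$-marginal of the extended target equal to the posterior.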
  • 17. Outline: 1 Background; 2 ABC methods for generative models; 3 MC2 type methods; 4 State-Space models, PMCMC; 5 SMC2
  • 18. State Space Models: a system of equations. Hidden states (Markov): $p(x_1|\theta) = \mu_\theta(x_1)$ and, for $t \geq 1$, $p(x_{t+1}|x_{1:t}, \theta) = p(x_{t+1}|x_t, \theta) = f_\theta(x_{t+1}|x_t)$. Observations: $p(y_t|y_{1:t-1}, x_{1:t-1}, \theta) = p(y_t|x_t, \theta) = g_\theta(y_t|x_t)$. Parameter: $\theta \in \Theta$, prior $p(\theta)$. We observe $y_{1:T} = (y_1, \ldots, y_T)$; $T$ might be large ($\approx 10^4$). $x$ and $\theta$ will typically also be multi-dimensional. There are several interesting models for which $f_\theta$ cannot be written in closed form (but it can be simulated).
  • 19. State Space Models: some interesting distributions. Bayesian inference focuses on: static, $p(\theta|y_{1:T})$; dynamic, $p(\theta|y_{1:t})$, $t \in 1:T$. Filtering (traditionally) focuses on: $\forall t \in [1, T]$, $p_\theta(x_t|y_{1:t})$. Smoothing (traditionally) focuses on: $\forall t \in [1, T]$, $p_\theta(x_t|y_{1:T})$.
  • 20. Examples: population growth model. $y_t = x_t + \sigma_w \varepsilon_t$, $\log x_{t+1} = \log x_t + b_0 + b_1 (x_t)^{b_2} + \sigma_\epsilon \eta_t$, $\theta = (b_0, b_1, b_2, \sigma_\epsilon, \sigma_w)$.
  • 21. Examples: Stochastic Volatility (Lévy-driven models). Observations (“log returns”): $y_t = \mu + \beta v_t + v_t^{1/2}\epsilon_t$, $t \geq 1$. Hidden states (“actual volatility”, integrated process): $v_{t+1} = \dfrac{1}{\lambda}\Big(z_t - z_{t+1} + \sum_{j=1}^{k} e_j\Big)$.
  • 22. Examples (continued). . . . where the process $z_t$ is the “spot volatility”: $z_{t+1} = e^{-\lambda} z_t + \sum_{j=1}^{k} e^{-\lambda(t+1-c_j)} e_j$, with $k \sim \mathrm{Poi}(\lambda\xi^2/\omega^2)$, $c_{1:k} \overset{iid}{\sim} U(t, t+1)$, $e_{1:k} \overset{iid}{\sim} \mathrm{Exp}(\xi/\omega^2)$. The parameter is $\theta = (\mu, \beta, \xi, \omega^2, \lambda)$, and $x_t = (v_t, z_t)'$.
  • 23. Why are those models challenging? It is effectively impossible to compute the likelihood $p(y_{1:T}|\theta) = \int_{\mathcal{X}^T} p(y_{1:T}|x_{1:T}, \theta)\,p(x_{1:T}|\theta)\,dx_{1:T}$. Similarly, all other inferential quantities are impossible to compute.
  • 24. Problems with MCMC approaches. Metropolis-Hastings: (1) $p(\theta|y_{1:T})$ cannot be evaluated point-wise (marginal MH); (2) $p(x_{1:T}, \theta|y_{1:T})$ is high-dimensional and it is hard to design reasonable proposals. Gibbs sampler (updates states and parameters): (1) the hidden states $x_{1:T}$ are typically highly correlated and it is hard to update them efficiently in a block; (2) parameters and latent variables are highly correlated. Common to both: they are not designed to recover the whole sequence $\pi(x_{1:t}, \theta \mid y_{1:t})$.
  • 25. Particle filters. Consider the simplified problem of targeting $p_\theta(x_{t+1}|y_{1:t+1})$. This sequence of distributions is approximated by a sequence of weighted particles which are properly weighted using importance sampling, mutated/propagated according to the system dynamics, and resampled to control the variance. Below we give a pseudo-code version. Any operation involving the superscript $n$ must be understood as performed for $n = 1 : N_x$, where $N_x$ is the total number of particles.
  • 26. Step 1: At iteration $t = 1$, (a) sample $x_1^n \sim q_{1,\theta}(\cdot)$; (b) compute and normalise weights $w_{1,\theta}(x_1^n) = \dfrac{\mu_\theta(x_1^n)\,g_\theta(y_1|x_1^n)}{q_{1,\theta}(x_1^n)}$, $W_{1,\theta}^n = \dfrac{w_{1,\theta}(x_1^n)}{\sum_{i=1}^{N_x} w_{1,\theta}(x_1^i)}$. Step 2: At iteration $t = 2 : T$, (a) sample the index $a_{t-1}^n \sim \mathcal{M}(W_{t-1,\theta}^{1:N_x})$ of the ancestor; (b) sample $x_t^n \sim q_{t,\theta}(\cdot\,|\,x_{t-1}^{a_{t-1}^n})$; (c) compute and normalise weights $w_{t,\theta}(x_{t-1}^{a_{t-1}^n}, x_t^n) = \dfrac{f_\theta(x_t^n|x_{t-1}^{a_{t-1}^n})\,g_\theta(y_t|x_t^n)}{q_{t,\theta}(x_t^n|x_{t-1}^{a_{t-1}^n})}$, $W_{t,\theta}^n = \dfrac{w_{t,\theta}(x_{t-1}^{a_{t-1}^n}, x_t^n)}{\sum_{i=1}^{N_x} w_{t,\theta}(x_{t-1}^{a_{t-1}^i}, x_t^i)}$.
  • 27. Particle filtering. [Figure: three weighted trajectories $x_{1:t}$ at time $t$.]
  • 28. Particle filtering. [Figure: three proposed trajectories $x_{1:t+1}$ at time $t + 1$.]
  • 29. Particle filtering. [Figure: three reweighted trajectories $x_{1:t+1}$ at time $t + 1$.]
  • 30. Observations. At each $t$, $(w_t^{(i)}, x_{1:t}^{(i)})_{i=1}^{N_x}$ is a particle approximation of $p_\theta(x_t|y_{1:t})$. Resampling is there to avoid degeneracy: if there were no interaction between particles, the variance of the weights would typically increase polynomially or worse. Taking $q_\theta = f_\theta$ simplifies the weights, but its main merit is to yield a feasible algorithm when $f_\theta$ can only be simulated.
  • 31. Unbiased likelihood estimator. A by-product of the PF output is that $\hat{Z}_t^N = \prod_{s=1}^{t}\Big\{\dfrac{1}{N_x}\sum_{i=1}^{N_x} w_s^{(i)}\Big\}$ is an unbiased estimator of the likelihood $Z_t = p(y_{1:t}|\theta)$ for all $t$. Whereas consistency of the estimator is immediate to check, unbiasedness is subtle; see e.g. Proposition 7.4.1 in Del Moral. The variance of this estimator typically grows linearly with $T$ (and not exponentially) because of the lack of independence.
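A bootstrap particle filter (proposal equal to the state dynamics, multinomial resampling at every step) that accumulates this likelihood estimate can be sketched as below. The linear Gaussian model is an assumption chosen for illustration, because its exact likelihood is available from the Kalman filter for comparison: $x_1 \sim N(0, 1)$, $x_{t+1} = \rho x_t + \sigma_x \eta_t$, $y_t = x_t + \sigma_y \varepsilon_t$.

```python
import math
import random


def norm_logpdf(x, mean, sd):
    return -0.5 * math.log(2 * math.pi * sd * sd) - 0.5 * ((x - mean) / sd) ** 2


def bootstrap_pf_loglik(ys, rho, sigma_x, sigma_y, n_particles=500, rng=None):
    """Return the log of the PF likelihood estimate for the assumed toy model."""
    rng = rng or random.Random(0)
    # Step 1(a): sample x_1^n from mu_theta = N(0, 1), used as the proposal.
    particles = [rng.gauss(0.0, 1.0) for _ in range(n_particles)]
    loglik = 0.0
    for t, y in enumerate(ys):
        if t > 0:
            # Step 2(a): multinomial draw of the ancestors.
            ancestors = rng.choices(particles, weights=weights, k=n_particles)
            # Step 2(b): propagate through f_theta (bootstrap proposal).
            particles = [rho * x + sigma_x * rng.gauss(0.0, 1.0) for x in ancestors]
        # With q = f, the weight reduces to g_theta(y_t | x_t).
        logw = [norm_logpdf(y, x, sigma_y) for x in particles]
        top = max(logw)  # subtract the max for numerical stability
        weights = [math.exp(lw - top) for lw in logw]
        # Log of the average unnormalised weight at time t.
        loglik += top + math.log(sum(weights) / n_particles)
    return loglik
```

The estimate is unbiased on the natural scale, not on the log scale; the log is used here only to avoid underflow.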
  • 32. PMCMC. Breakthrough paper of Andrieu et al. (2010), based on the unbiasedness of the PF estimate of the likelihood. Marginal PMCMC: from current point $\theta_n$ (and current PF estimate $\hat{p}(y|\theta_n)$), (1) sample $\theta_p \sim T(\theta_n, d\theta_p)$; (2) run a PF so as to obtain $\hat{p}(y|\theta_p)$, an unbiased estimate of $p(y|\theta_p)$; (3) with probability $1 \wedge r$, set $\theta_{n+1} = \theta_p$, otherwise $\theta_{n+1} = \theta_n$, with $r = \dfrac{p(\theta_p)\,\hat{p}(y|\theta_p)\,T(\theta_p, \theta_n)}{p(\theta_n)\,\hat{p}(y|\theta_n)\,T(\theta_n, \theta_p)}$.
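Putting the pieces together, a marginal PMCMC sampler is a Metropolis-Hastings loop whose likelihood is replaced by a particle filter estimate. The sketch below is self-contained and rests on illustrative assumptions: a linear Gaussian state-space model with known noise scales, a single unknown autoregression coefficient $\rho$ with a uniform prior on $(-1, 1)$, and a symmetric random-walk proposal. The function names are hypothetical.

```python
import math
import random


def pf_loglik(ys, rho, n_particles, rng, sigma_x=0.5, sigma_y=1.0):
    # Bootstrap PF estimate of log p(y_{1:T} | rho) for the assumed model
    # x_1 ~ N(0, 1), x_{t+1} = rho x_t + sigma_x eta_t, y_t = x_t + sigma_y eps_t.
    xs = [rng.gauss(0.0, 1.0) for _ in range(n_particles)]
    const = -0.5 * math.log(2 * math.pi * sigma_y ** 2)
    loglik = 0.0
    for t, y in enumerate(ys):
        if t > 0:
            xs = rng.choices(xs, weights=ws, k=n_particles)  # resample
            xs = [rho * x + sigma_x * rng.gauss(0.0, 1.0) for x in xs]  # propagate
        logw = [-0.5 * ((y - x) / sigma_y) ** 2 for x in xs]
        top = max(logw)
        ws = [math.exp(lw - top) for lw in logw]
        loglik += const + top + math.log(sum(ws) / n_particles)
    return loglik


def pmmh(ys, n_iter=300, n_particles=100, step=0.1, seed=0):
    rng = random.Random(seed)
    rho = 0.0
    ll = pf_loglik(ys, rho, n_particles, rng)
    chain = []
    for _ in range(n_iter):
        prop = rho + rng.gauss(0.0, step)
        # Uniform prior on (-1, 1): proposals outside have zero density.
        if -1.0 < prop < 1.0:
            ll_prop = pf_loglik(ys, prop, n_particles, rng)  # fresh PF run
            if math.log(1.0 - rng.random()) < ll_prop - ll:
                rho, ll = prop, ll_prop  # the estimate travels with the state
        chain.append(rho)
    return chain
```

The number of particles trades CPU time per iteration against the noise in the acceptance ratio; too few particles tends to make the chain stick.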
  • 33. Outline: 1 Background; 2 ABC methods for generative models; 3 MC2 type methods; 4 State-Space models, PMCMC; 5 SMC2
  • 34. Objectives. (1) To derive sequentially $p(\theta, x_{1:t}|y_{1:t})$ and $p(y_{1:t})$, for all $t \in \{1, \ldots, T\}$. (2) To obtain a black box algorithm (automatic calibration).
  • 35. Main tools of our approach. Particle filter algorithms for state-space models (these will be used to estimate the likelihood, for a fixed $\theta$). Iterated Batch Importance Sampling (IBIS) for sequential Bayesian inference for parameters (this will be the theoretical algorithm we will try to approximate). Both are sequential Monte Carlo (SMC) methods.
  • 36. IBIS. SMC method for particle approximation of the sequence $p(\theta \mid y_{1:t})$ for $t = 1 : T$. PF is not going to work here by just pretending that $\theta$ is a dynamic process with zero (or small) variance. Recall the path degeneracy problem. In the next slide we give the pseudo-code of the IBIS algorithm. Operations with superscript $m$ must be understood as operations performed for all $m \in 1 : N_\theta$, where $N_\theta$ is the total number of $\theta$-particles.
  • 37. Sample $\theta^m$ from $p(\theta)$ and set $\omega^m \leftarrow 1$. Then, at time $t = 1, \ldots, T$: (a) compute the incremental weights and their weighted average, $u_t(\theta^m) = p(y_t|y_{1:t-1}, \theta^m)$, $L_t = \dfrac{1}{\sum_{m=1}^{N_\theta}\omega^m} \times \sum_{m=1}^{N_\theta} \omega^m u_t(\theta^m)$; (b) update the importance weights, $\omega^m \leftarrow \omega^m u_t(\theta^m)$; (c) if some degeneracy criterion is fulfilled, sample $\tilde{\theta}^m$ independently from the mixture distribution $\dfrac{1}{\sum_{m=1}^{N_\theta}\omega^m}\sum_{m=1}^{N_\theta} \omega^m K_t(\theta^m, \cdot)$. Finally, replace the current weighted particle system: $(\theta^m, \omega^m) \leftarrow (\tilde{\theta}^m, 1)$.
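For a model whose incremental likelihood is tractable, IBIS can be sketched directly. The assumptions below are illustrative only: $y_t \sim N(\theta, 1)$ i.i.d. with a $N(0, 10^2)$ prior, an effective sample size below $N_\theta/2$ as degeneracy criterion, and a single random-walk Metropolis step as the kernel $K_t$, with its scale taken from the current particle population.

```python
import math
import random


def norm_logpdf(x, mean, sd):
    return -0.5 * math.log(2 * math.pi * sd * sd) - 0.5 * ((x - mean) / sd) ** 2


def log_post(theta, ys):
    # Unnormalised log posterior for the assumed toy model.
    return norm_logpdf(theta, 0.0, 10.0) + sum(norm_logpdf(y, theta, 1.0) for y in ys)


def ibis(ys, n_theta=1000, seed=0):
    rng = random.Random(seed)
    thetas = [rng.gauss(0.0, 10.0) for _ in range(n_theta)]  # sample from the prior
    omegas = [1.0] * n_theta
    log_evidence = 0.0
    for t, y in enumerate(ys, start=1):
        # (a) incremental weights u_t and their weighted average L_t.
        u = [math.exp(norm_logpdf(y, th, 1.0)) for th in thetas]
        L_t = sum(w * ui for w, ui in zip(omegas, u)) / sum(omegas)
        log_evidence += math.log(L_t)
        # (b) update the importance weights.
        omegas = [w * ui for w, ui in zip(omegas, u)]
        total = sum(omegas)
        ess = total ** 2 / sum(w * w for w in omegas)
        # (c) resample-move when the effective sample size drops too low.
        if ess < n_theta / 2:
            mean = sum(w * th for w, th in zip(omegas, thetas)) / total
            var = sum(w * (th - mean) ** 2 for w, th in zip(omegas, thetas)) / total
            scale = max(math.sqrt(var), 1e-8)
            resampled = rng.choices(thetas, weights=omegas, k=n_theta)
            thetas = []
            for th in resampled:
                prop = th + scale * rng.gauss(0.0, 1.0)
                # One MH step invariant with respect to p(theta | y_{1:t}).
                if math.log(1.0 - rng.random()) < log_post(prop, ys[:t]) - log_post(th, ys[:t]):
                    th = prop
                thetas.append(th)
            omegas = [1.0] * n_theta
    return thetas, omegas, log_evidence
```

The sum of the $\log L_t$ terms estimates the log evidence $\log p(y_{1:T})$. For a state-space model the two places where `norm_logpdf(y, th, 1.0)` and `log_post` are evaluated are exactly the quantities that become intractable.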
  • 38. Observations. Cost of lack of ergodicity in $\theta$: the occasional MCMC move. Still, in regular problems resampling happens at diminishing frequency (logarithmically). $K_t$ is an MCMC kernel invariant with respect to $\pi(\theta \mid y_{1:t})$. Its parameters can be chosen using information from the current population of $\theta$-particles. $L_t$ is a MC estimator of the model evidence. Infeasible to implement for state-space models: both the incremental weights and the MCMC kernel are intractable.
  • 39. Our algorithm: SMC2. We provide a generic (black box) algorithm for recovering the sequence of parameter posterior distributions, as well as the filtering, smoothing and predictive distributions. We give next a pseudo-code; the code seems to only track the parameter posteriors, but actually it does all the other jobs. Superficially, it looks like an approximation of IBIS, but in fact it does not produce any systematic errors (unbiased MC).
  • 40. Sample $\theta^m$ from $p(\theta)$ and set $\omega^m \leftarrow 1$. Then, at time $t = 1, \ldots, T$: (a) for each particle $\theta^m$, perform iteration $t$ of the PF. If $t = 1$, sample independently $x_1^{1:N_x,m}$ from $\psi_{1,\theta^m}$, and compute $\hat{p}(y_1|\theta^m) = \dfrac{1}{N_x}\sum_{n=1}^{N_x} w_{1,\theta}(x_1^{n,m})$. If $t > 1$, sample $(x_t^{1:N_x,m}, a_{t-1}^{1:N_x,m})$ from $\psi_{t,\theta^m}$ conditional on $(x_{1:t-1}^{1:N_x,m}, a_{1:t-2}^{1:N_x,m})$, and compute $\hat{p}(y_t|y_{1:t-1}, \theta^m) = \dfrac{1}{N_x}\sum_{n=1}^{N_x} w_{t,\theta}(x_{t-1}^{a_{t-1}^{n,m},m}, x_t^{n,m})$.
  • 41. (b) Update the importance weights, $\omega^m \leftarrow \omega^m\,\hat{p}(y_t|y_{1:t-1}, \theta^m)$. (c) If some degeneracy criterion is fulfilled, sample $(\tilde{\theta}^m, \tilde{x}_{1:t}^{1:N_x,m}, \tilde{a}_{1:t-1}^{1:N_x,m})$ independently from $\dfrac{1}{\sum_{m=1}^{N_\theta}\omega^m}\sum_{m=1}^{N_\theta} \omega^m K_t\big\{(\theta^m, x_{1:t}^{1:N_x,m}, a_{1:t-1}^{1:N_x,m}), \cdot\big\}$. Finally, replace the current weighted particle system: $(\theta^m, x_{1:t}^{1:N_x,m}, a_{1:t-1}^{1:N_x,m}, \omega^m) \leftarrow (\tilde{\theta}^m, \tilde{x}_{1:t}^{1:N_x,m}, \tilde{a}_{1:t-1}^{1:N_x,m}, 1)$.
  • 42. Observations. It appears to be an approximation to IBIS; for $N_x = \infty$ it is IBIS. However, no approximation is done whatsoever. This algorithm really samples from $p(\theta|y_{1:t})$ and all other distributions of interest. One would expect an increase of MC variance over IBIS. The validity of the algorithm is essentially based on two results: (i) the particles are properly weighted, owing to the unbiasedness of the PF estimator of the likelihood; (ii) the MCMC kernel is appropriately constructed to maintain invariance with respect to an expanded distribution which admits those of interest as marginals; it is a Particle MCMC kernel. The algorithm does not suffer from the path degeneracy problem, thanks to the MCMC updates.
  • 43. The MCMC step. (a) Sample $\tilde{\theta}$ from the proposal kernel, $\tilde{\theta} \sim T(\theta, d\tilde{\theta})$. (b) Run a new PF for $\tilde{\theta}$: sample independently $(\tilde{x}_{1:t}^{1:N_x}, \tilde{a}_{1:t-1}^{1:N_x})$ from $\psi_{t,\tilde{\theta}}$, and compute $\hat{Z}_t(\tilde{\theta}, \tilde{x}_{1:t}^{1:N_x}, \tilde{a}_{1:t-1}^{1:N_x})$. (c) Accept the move with probability $1 \wedge \dfrac{p(\tilde{\theta})\,\hat{Z}_t(\tilde{\theta}, \tilde{x}_{1:t}^{1:N_x}, \tilde{a}_{1:t-1}^{1:N_x})\,T(\tilde{\theta}, \theta)}{p(\theta)\,\hat{Z}_t(\theta, x_{1:t}^{1:N_x}, a_{1:t-1}^{1:N_x})\,T(\theta, \tilde{\theta})}$. It can be shown that this is a standard Hastings-Metropolis kernel with proposal $q_\theta(\tilde{\theta}, \tilde{x}_{1:t}^{1:N_x}, \tilde{a}_{1:t-1}^{1:N_x}) = T(\theta, \tilde{\theta})\,\psi_{t,\tilde{\theta}}(\tilde{x}_{1:t}^{1:N_x}, \tilde{a}_{1:t-1}^{1:N_x})$, invariant with respect to an extended distribution $\pi_t(\theta, x_{1:t}^{1:N_x}, a_{1:t-1}^{1:N_x})$.
  • 44. Some advantages of the algorithm. Immediate estimates of filtering and predictive distributions. Immediate and sequential estimator of model evidence. Easy recovery of smoothing distributions. Principled framework for automatic calibration of $N_x$. Population Monte Carlo advantages.
  • 45. Numerical illustrations: SV. [Figure: (a) squared observations (synthetic data set) against time; (b) acceptance rates against iterations; (c) $N_x$ against iterations, illustrating the automatic increase of $N_x$.]
  • 46. Numerical illustrations: SV. [Figure: concentration of the posterior distribution for parameter $\mu$; density panels for $T = 250$, $T = 500$, $T = 750$ and $T = 1000$.]
  • 47. Numerical illustrations: SV
    Multifactor model
    $$y_t = \mu + \beta v_t + v_t^{1/2}\epsilon_t + \rho_1 \sum_{j=1}^{k_1} e_{1,j} + \rho_2 \sum_{j=1}^{k_2} e_{2,j} - \xi\left(w\rho_1\lambda_1 + (1-w)\rho_2\lambda_2\right)$$
    where $v_t = v_{1,t} + v_{2,t}$, and $(v_i, z_i)_{i=1,2}$ follow the same dynamics with parameters $(w_i\xi, w_i\omega^2, \lambda_i)$ and $w_1 = w$, $w_2 = 1 - w$.
  • 48. Numerical illustrations: SV
    Figure: (a) S&P500 squared observations against time, and (b) log-evidence comparison between models (relative to the one-factor model) against iterations, for the multi-factor model without leverage and the multi-factor model with leverage.
  • 49. Final Remarks
    A powerful framework
    • A generic algorithm for sequential estimation and state inference in state-space models: the only requirements are to be able (a) to simulate the Markov transition $f_\theta(x_t|x_{t-1})$, and (b) to evaluate the likelihood term $g_\theta(y_t|x_t)$.
    • The article is available on arXiv and on our web pages.
    • A package is available at: http://code.google.com/p/py-smc2/.
  • 50. Appendix
  • 51. Why does it work? - Intuition for t = 1
    At time t = 1, the algorithm generates variables $\theta^m$ from the prior $p(\theta)$, and for each $\theta^m$, the algorithm generates vectors $x_1^{1:N_x,m}$ of particles, from $\psi_{1,\theta^m}(x_1^{1:N_x})$.
  • 52. Thus, the sampling space is $\Theta \times \mathcal{X}^{N_x}$, and the actual "particles" of the algorithm are $N_\theta$ independent and identically distributed copies of the random variable $(\theta, x_1^{1:N_x})$, with density:
    $$p(\theta)\,\psi_{1,\theta}(x_1^{1:N_x}) = p(\theta) \prod_{n=1}^{N_x} q_{1,\theta}(x_1^n).$$
  • 53. Then, these particles are assigned importance weights corresponding to the incremental weight function
    $$\hat Z_1(\theta, x_1^{1:N_x}) = N_x^{-1} \sum_{n=1}^{N_x} w_{1,\theta}(x_1^n).$$
    This means that, at iteration 1, the target distribution of the algorithm should be defined as:
    $$\pi_1(\theta, x_1^{1:N_x}) = p(\theta)\,\psi_{1,\theta}(x_1^{1:N_x}) \times \frac{\hat Z_1(\theta, x_1^{1:N_x})}{p(y_1)},$$
    where the normalising constant $p(y_1)$ is easily deduced from the property that $\hat Z_1(\theta, x_1^{1:N_x})$ is an unbiased estimator of $p(y_1|\theta)$.
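The unbiasedness property can be checked numerically in a case where $p(y_1|\theta)$ is known in closed form. The sketch below assumes a toy setting that is not on the slides: θ is held fixed and suppressed, $x_1 \sim N(0,1)$, $y_1|x_1 \sim N(x_1,1)$, and the proposal $q_{1,\theta}$ equals the initial law, so that the weight reduces to $g_\theta(y_1|x_1)$ and $p(y_1|\theta)$ is the $N(0,2)$ density at $y_1$. The name `z_hat_1` is hypothetical.

```python
import numpy as np

def z_hat_1(y1, n_x, rng):
    """Average weight over n_x particles at time 1 (bootstrap proposal assumed)."""
    x1 = rng.normal(size=n_x)                                # x_1^n ~ q_1 = N(0, 1)
    w = np.exp(-0.5 * (y1 - x1) ** 2) / np.sqrt(2 * np.pi)   # w_1(x_1^n) = g(y_1 | x_1^n)
    return w.mean()

rng = np.random.default_rng(1)
y1, n_x = 0.5, 10

# Many independent replicates of Z_hat_1, each using only n_x = 10 particles.
estimates = np.array([z_hat_1(y1, n_x, rng) for _ in range(20000)])

# Exact marginal likelihood: y_1 ~ N(0, 2) in this toy model.
exact = np.exp(-0.25 * y1 ** 2) / np.sqrt(4 * np.pi)

print("mean of Z_hat_1:", estimates.mean(), "exact p(y_1):", exact)
```

Each replicate is noisy because it uses few particles, but the average over replicates matches the exact value up to Monte Carlo error, which is what makes $p(y_1)$ the correct normalising constant once θ is integrated against the prior.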
  • 54. Direct substitutions yield
    $$\pi_1(\theta, x_1^{1:N_x}) = \frac{p(\theta)}{p(y_1)} \prod_{i=1}^{N_x} q_{1,\theta}(x_1^i) \left\{ \frac{1}{N_x} \sum_{n=1}^{N_x} \frac{\mu_\theta(x_1^n)\,g_\theta(y_1|x_1^n)}{q_{1,\theta}(x_1^n)} \right\}$$
    $$= \frac{1}{N_x} \sum_{n=1}^{N_x} \frac{p(\theta)}{p(y_1)}\,\mu_\theta(x_1^n)\,g_\theta(y_1|x_1^n) \left\{ \prod_{i=1, i\neq n}^{N_x} q_{1,\theta}(x_1^i) \right\}$$
    and noting that, for the triplet $(\theta, x_1, y_1)$ of random variables,
    $$p(\theta)\,\mu_\theta(x_1)\,g_\theta(y_1|x_1) = p(\theta, x_1, y_1) = p(y_1)\,p(\theta|y_1)\,p(x_1|y_1, \theta),$$
    one finally gets that:
    $$\pi_1(\theta, x_1^{1:N_x}) = \frac{p(\theta|y_1)}{N_x} \sum_{n=1}^{N_x} p(x_1^n|y_1, \theta) \left\{ \prod_{i=1, i\neq n}^{N_x} q_{1,\theta}(x_1^i) \right\}.$$
  • 55. By a simple induction, one sees that the target density $\pi_t$ at iteration $t \geq 2$ should be defined as:
    $$\pi_t(\theta, x_{1:t}^{1:N_x}, a_{1:t-1}^{1:N_x}) = p(\theta)\,\psi_{t,\theta}(x_{1:t}^{1:N_x}, a_{1:t-1}^{1:N_x}) \times \frac{\hat Z_t(\theta, x_{1:t}^{1:N_x}, a_{1:t-1}^{1:N_x})}{p(y_{1:t})}.$$
    This leads to the following proposition.
  • 56. Proposition
    The probability density $\pi_t$ may be written as:
    $$\pi_t(\theta, x_{1:t}^{1:N_x}, a_{1:t-1}^{1:N_x}) = p(\theta|y_{1:t}) \times \frac{1}{N_x} \sum_{n=1}^{N_x} \frac{p(x_{1:t}^n|\theta, y_{1:t})}{N_x^{t-1}} \left\{ \prod_{i=1, i\neq h_t^n(1)}^{N_x} q_{1,\theta}(x_1^i) \right\} \left\{ \prod_{s=2}^{t} \prod_{i=1, i\neq h_t^n(s)}^{N_x} W_{s-1,\theta}^{a_{s-1}^i}\, q_{s,\theta}(x_s^i|x_{s-1}^{a_{s-1}^i}) \right\}$$
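As a consistency check, setting t = 1 in the proposition should recover the expression derived earlier for $\pi_1$. The check below assumes that $h_t^n(s)$ denotes the index, at time s, of the ancestor of the n-th particle at time t (this definition does not appear on the slides), so that $h_1^n(1) = n$. The product over $s = 2, \dots, t$ is then empty and $N_x^{t-1} = 1$, which gives

$$\pi_1(\theta, x_1^{1:N_x}) = \frac{p(\theta|y_1)}{N_x} \sum_{n=1}^{N_x} p(x_1^n|\theta, y_1) \left\{ \prod_{i=1, i\neq n}^{N_x} q_{1,\theta}(x_1^i) \right\},$$

in agreement with the t = 1 result obtained by direct substitution.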