SlideShare a Scribd company logo
1 of 141
Download to read offline
ABC Methods for Bayesian Model Choice




                 ABC Methods for Bayesian Model Choice

                                        Christian P. Robert

                                Universit´ Paris-Dauphine, IuF, & CREST
                                         e
                               http://www.ceremade.dauphine.fr/~xian


                          Princeton University, April 03, 2012
                        Joint work with J.-M. Cornuet, A. Grelaud,
                           J.-M. Marin, N. Pillai, & J. Rousseau
ABC Methods for Bayesian Model Choice




Outline




                                        Approximate Bayesian computation

                                        ABC for model choice

                                        Model choice consistency
ABC Methods for Bayesian Model Choice
  Approximate Bayesian computation




Approximate Bayesian computation



 Approximate Bayesian computation
    ABC basics
    Alphabet soup
    simulation-based methods in
    Econometrics

 ABC for model choice

 Model choice consistency
ABC Methods for Bayesian Model Choice
  Approximate Bayesian computation
     ABC basics


Regular Bayesian computation issues


      When faced with a non-standard posterior distribution

                                        π(θ|y) ∝ π(θ)L(θ|y)

      the standard solution is to use simulation (Monte Carlo) to
      produce a sample
                                   θ1 , . . . , θ T
      from π(θ|y) (or approximately by Markov chain Monte Carlo
      methods)
                                              [Robert & Casella, 2004]
ABC Methods for Bayesian Model Choice
  Approximate Bayesian computation
     ABC basics


Untractable likelihoods



      Cases when the likelihood function f (y|θ) is unavailable and when
      the completion step

                                        f (y|θ) =       f (y, z|θ) dz
                                                    Z

      is impossible or too costly because of the dimension of z
       c MCMC cannot be implemented!
ABC Methods for Bayesian Model Choice
  Approximate Bayesian computation
     ABC basics


Illustration


      Example
   Coalescence tree: in population
   genetics, reconstitution of a common
   ancestor from a sample of genes via
   a phylogenetic tree that is close to
   impossible to integrate out
   [100 processor days with 4
   parameters]
                                    [Cornuet et al., 2009, Bioinformatics]
ABC Methods for Bayesian Model Choice
  Approximate Bayesian computation
     ABC basics


The ABC method

      Bayesian setting: target is π(θ)f (x|θ)
ABC Methods for Bayesian Model Choice
  Approximate Bayesian computation
     ABC basics


The ABC method

      Bayesian setting: target is π(θ)f (x|θ)
      When likelihood f (x|θ) not in closed form, likelihood-free rejection
      technique:
ABC Methods for Bayesian Model Choice
  Approximate Bayesian computation
     ABC basics


The ABC method

      Bayesian setting: target is π(θ)f (x|θ)
      When likelihood f (x|θ) not in closed form, likelihood-free rejection
      technique:
      ABC algorithm
      For an observation y ∼ f (y|θ), under the prior π(θ), keep jointly
      simulating
                           θ ∼ π(θ) , z ∼ f (z|θ ) ,
      until the auxiliary variable z is equal to the observed value, z = y.

                                                      [Tavar´ et al., 1997]
                                                            e
ABC Methods for Bayesian Model Choice
  Approximate Bayesian computation
     ABC basics


Why does it work?!



      The proof is trivial:

                                     f (θi ) ∝         π(θi )f (z|θi )Iy (z)
                                                 z∈D
                                           ∝ π(θi )f (y|θi )
                                           = π(θi |y) .

                                                                          [Accept–Reject 101]
ABC Methods for Bayesian Model Choice
  Approximate Bayesian computation
     ABC basics


Earlier occurrence


             ‘Bayesian statistics and Monte Carlo methods are ideally
             suited to the task of passing many models over one
             dataset’
                                        [Don Rubin, Annals of Statistics, 1984]

      Note Rubin (1984) does not promote this algorithm for
      likelihood-free simulation but frequentist intuition on posterior
      distributions: parameters from posteriors are more likely to be
      those that could have generated the data.
ABC Methods for Bayesian Model Choice
  Approximate Bayesian computation
     ABC basics


A as approximative


      When y is a continuous random variable, equality z = y is replaced
      with a tolerance condition,

                                        (y, z) ≤

      where        is a distance
ABC Methods for Bayesian Model Choice
  Approximate Bayesian computation
     ABC basics


A as approximative


      When y is a continuous random variable, equality z = y is replaced
      with a tolerance condition,

                                         (y, z) ≤

      where is a distance
      Output distributed from

                           π(θ) Pθ { (y, z) < } ∝ π(θ| (y, z) < )
ABC Methods for Bayesian Model Choice
  Approximate Bayesian computation
     ABC basics


ABC algorithm


      Algorithm 1 Likelihood-free rejection sampler
        for i = 1 to N do
          repeat
             generate θ from the prior distribution π(·)
             generate z from the likelihood f (·|θ )
          until ρ{η(z), η(y)} ≤
          set θi = θ
        end for

      where η(y) defines a (maybe in-sufficient) statistic
ABC Methods for Bayesian Model Choice
  Approximate Bayesian computation
     ABC basics


Output

      The likelihood-free algorithm samples from the marginal in z of:

                                            π(θ)f (z|θ)IA ,y (z)
                            π (θ, z|y) =                           ,
                                           A ,y ×Θ π(θ)f (z|θ)dzdθ

      where A       ,y   = {z ∈ D|ρ(η(z), η(y)) < }.
ABC Methods for Bayesian Model Choice
  Approximate Bayesian computation
     ABC basics


Output

      The likelihood-free algorithm samples from the marginal in z of:

                                            π(θ)f (z|θ)IA ,y (z)
                            π (θ, z|y) =                           ,
                                           A ,y ×Θ π(θ)f (z|θ)dzdθ

      where A       ,y   = {z ∈ D|ρ(η(z), η(y)) < }.
      The idea behind ABC is that the summary statistics coupled with a
      small tolerance should provide a good approximation of the
      posterior distribution:

                              π (θ|y) =    π (θ, z|y)dz ≈ π(θ|y) .
ABC Methods for Bayesian Model Choice
  Approximate Bayesian computation
     ABC basics


Probit modelling on Pima Indian women
      Example (R benchmark)
      200 Pima Indian women with observed variables
             plasma glucose concentration in oral glucose tolerance test
             diastolic blood pressure
             diabetes pedigree function
             presence/absence of diabetes
ABC Methods for Bayesian Model Choice
  Approximate Bayesian computation
     ABC basics


Probit modelling on Pima Indian women
      Example (R benchmark)
      200 Pima Indian women with observed variables
             plasma glucose concentration in oral glucose tolerance test
             diastolic blood pressure
             diabetes pedigree function
             presence/absence of diabetes


      Probability of diabetes function of above variables
                             P(y = 1|x) = Φ(x1 β1 + x2 β2 + x3 β3 ) ,
ABC Methods for Bayesian Model Choice
  Approximate Bayesian computation
     ABC basics


Probit modelling on Pima Indian women
      Example (R benchmark)
      200 Pima Indian women with observed variables
             plasma glucose concentration in oral glucose tolerance test
             diastolic blood pressure
             diabetes pedigree function
             presence/absence of diabetes


      Probability of diabetes function of above variables
                             P(y = 1|x) = Φ(x1 β1 + x2 β2 + x3 β3 ) ,
      Test of H0 : β3 = 0 for 200 observations of Pima.tr based on a
      g-prior modelling:
                                            β ∼ N3 (0, n xT x)−1
ABC Methods for Bayesian Model Choice
  Approximate Bayesian computation
     ABC basics


Probit modelling on Pima Indian women
      Example (R benchmark)
      200 Pima Indian women with observed variables
             plasma glucose concentration in oral glucose tolerance test
             diastolic blood pressure
             diabetes pedigree function
             presence/absence of diabetes


      Probability of diabetes function of above variables
                             P(y = 1|x) = Φ(x1 β1 + x2 β2 + x3 β3 ) ,
      Test of H0 : β3 = 0 for 200 observations of Pima.tr based on a
      g-prior modelling:
                                            β ∼ N3 (0, n xT x)−1
      Use of importance function inspired from the MLE estimate
      distribution
                                         ˆ ˆ
                                β ∼ N (β, Σ)
ABC Methods for Bayesian Model Choice
  Approximate Bayesian computation
     ABC basics


Pima Indian benchmark




                                                                         80
                            100




                                                                                                                1.0
                            80




                                                                         60




                                                                                                                0.8
                            60




                                                                                                                0.6
                  Density




                                                               Density




                                                                                                      Density
                                                                         40
                            40




                                                                                                                0.4
                                                                         20
                            20




                                                                                                                0.2
                                                                                                                0.0
                            0




                                                                         0




                              −0.005   0.010   0.020   0.030                  −0.05   −0.03   −0.01                   −1.0   0.0   1.0   2.0




      Figure: Comparison between density estimates of the marginals on β1
      (left), β2 (center) and β3 (right) from ABC rejection samples (red) and
      MCMC samples (black)

                                                                                      .
ABC Methods for Bayesian Model Choice
  Approximate Bayesian computation
     ABC basics


MA example



      Consider the MA(q) model
                                                     q
                                        xt =   t+         ϑi   t−i
                                                    i=1

      Simple prior: uniform prior over the identifiability zone, e.g.
      triangle for MA(2)
ABC Methods for Bayesian Model Choice
  Approximate Bayesian computation
     ABC basics


MA example (2)
      ABC algorithm thus made of
         1. picking a new value (ϑ1 , ϑ2 ) in the triangle
         2. generating an iid sequence ( t )−q<t≤T
         3. producing a simulated series (xt )1≤t≤T
ABC Methods for Bayesian Model Choice
  Approximate Bayesian computation
     ABC basics


MA example (2)
      ABC algorithm thus made of
         1. picking a new value (ϑ1 , ϑ2 ) in the triangle
         2. generating an iid sequence ( t )−q<t≤T
         3. producing a simulated series (xt )1≤t≤T
      Distance: basic distance between the series
                                                             T
                          ρ((xt )1≤t≤T , (xt )1≤t≤T ) =           (xt − xt )2
                                                            t=1

      or between summary statistics like the first q autocorrelations
                                                T
                                        τj =           xt xt−j
                                               t=j+1
ABC Methods for Bayesian Model Choice
  Approximate Bayesian computation
     ABC basics


Comparison of distance impact




      Evaluation of the tolerance on the ABC sample against both
      distances ( = 100%, 10%, 1%, 0.1%) for an MA(2) model
ABC Methods for Bayesian Model Choice
  Approximate Bayesian computation
     ABC basics


Comparison of distance impact

                      4




                                                          1.5
                      3




                                                          1.0
                      2




                                                          0.5
                      1




                                                          0.0
                      0




                            0.0   0.2   0.4   0.6   0.8         −2.0   −1.0    0.0   0.5   1.0   1.5

                                         θ1                                   θ2




      Evaluation of the tolerance on the ABC sample against both
      distances ( = 100%, 10%, 1%, 0.1%) for an MA(2) model
ABC Methods for Bayesian Model Choice
  Approximate Bayesian computation
     ABC basics


Comparison of distance impact

                      4




                                                          1.5
                      3




                                                          1.0
                      2




                                                          0.5
                      1




                                                          0.0
                      0




                            0.0   0.2   0.4   0.6   0.8         −2.0   −1.0    0.0   0.5   1.0   1.5

                                         θ1                                   θ2




      Evaluation of the tolerance on the ABC sample against both
      distances ( = 100%, 10%, 1%, 0.1%) for an MA(2) model
ABC Methods for Bayesian Model Choice
  Approximate Bayesian computation
     ABC basics


ABC advances

      Simulating from the prior is often poor in efficiency
ABC Methods for Bayesian Model Choice
  Approximate Bayesian computation
     ABC basics


ABC advances

      Simulating from the prior is often poor in efficiency
      Either modify the proposal distribution on θ to increase the density
      of x’s within the vicinity of y...
           [Marjoram et al, 2003; Bortot et al., 2007, Sisson et al., 2007]
ABC Methods for Bayesian Model Choice
  Approximate Bayesian computation
     ABC basics


ABC advances

      Simulating from the prior is often poor in efficiency
      Either modify the proposal distribution on θ to increase the density
      of x’s within the vicinity of y...
           [Marjoram et al, 2003; Bortot et al., 2007, Sisson et al., 2007]

      ...or by viewing the problem as a conditional density estimation
      and by developing techniques to allow for larger
                                                  [Beaumont et al., 2002]
ABC Methods for Bayesian Model Choice
  Approximate Bayesian computation
     ABC basics


ABC advances

      Simulating from the prior is often poor in efficiency
      Either modify the proposal distribution on θ to increase the density
      of x’s within the vicinity of y...
           [Marjoram et al, 2003; Bortot et al., 2007, Sisson et al., 2007]

      ...or by viewing the problem as a conditional density estimation
      and by developing techniques to allow for larger
                                                  [Beaumont et al., 2002]

      .....or even by including         in the inferential framework [ABCµ ]
                                                             [Ratmann et al., 2009]
ABC Methods for Bayesian Model Choice
  Approximate Bayesian computation
     Alphabet soup


ABC-MCMC


      Markov chain (θ(t) ) created via the transition function
                
                θ ∼ Kω (θ |θ(t) ) if x ∼ f (x|θ ) is such that x = y
                
                
                                                           π(θ )Kω (t) |θ )
      θ (t+1)
              =                       and u ∼ U(0, 1) ≤ π(θ(t) )K (θ |θ(t) ) ,
                                                                 ω (θ
                 (t)
                θ                   otherwise,
ABC Methods for Bayesian Model Choice
  Approximate Bayesian computation
     Alphabet soup


ABC-MCMC


      Markov chain (θ(t) ) created via the transition function
                
                θ ∼ Kω (θ |θ(t) ) if x ∼ f (x|θ ) is such that x = y
                
                
                                                           π(θ )Kω (t) |θ )
      θ (t+1)
              =                       and u ∼ U(0, 1) ≤ π(θ(t) )K (θ |θ(t) ) ,
                                                                 ω (θ
                 (t)
                θ                   otherwise,

      has the posterior π(θ|y) as stationary distribution
                                                    [Marjoram et al, 2003]
ABC Methods for Bayesian Model Choice
  Approximate Bayesian computation
     Alphabet soup


ABC-MCMC (2)

      Algorithm 2 Likelihood-free MCMC sampler
          Use Algorithm 1 to get (θ(0) , z(0) )
          for t = 1 to N do
            Generate θ from Kω ·|θ(t−1) ,
            Generate z from the likelihood f (·|θ ),
            Generate u from U[0,1] ,
                          π(θ )Kω (θ(t−1) |θ )
             if u ≤                             I
                        π(θ(t−1) Kω (θ |θ(t−1) ) A   ,y   (z ) then
               set     (θ(t) , z(t) ) = (θ , z )
            else
               (θ(t) , z(t) )) = (θ(t−1) , z(t−1) ),
            end if
          end for
ABC Methods for Bayesian Model Choice
  Approximate Bayesian computation
     Alphabet soup


Why does it work?

      Acceptance probability that does not involve the calculation of the
      likelihood and

               π (θ , z |y)         Kω (θ(t−1) |θ )f (z(t−1) |θ(t−1) )
                                  ×
           π (θ(t−1) , z(t−1) |y)       Kω (θ |θ(t−1) )f (z |θ )
                                            π(θ ) f (z |θ ) IA ,y (z )
                                  =     (t−1) ) f (z(t−1) |θ (t−1) )I        (t−1) )
                                    π(θ                              A ,y (z

                                                Kω (θ(t−1) |θ ) f (z(t−1) |θ(t−1) )
                                            ×
                                                    Kω (θ |θ(t−1) ) f (z |θ )
                                             π(θ )Kω (θ(t−1) |θ )
                                        =                           IA (z ) .
                                            π(θ(t−1) Kω (θ |θ(t−1) ) ,y
ABC Methods for Bayesian Model Choice
  Approximate Bayesian computation
     Alphabet soup


ABCµ


      Use of a joint density

                             f (θ, |y) ∝ ξ( |y, θ) × πθ (θ) × π ( )

      where y is the data, and ξ(·|y, θ) is the prior predictive density of
       = ρ(η(z), η(y)) given θ and x when z ∼ f (z|θ)
ABC Methods for Bayesian Model Choice
  Approximate Bayesian computation
     Alphabet soup


ABCµ


      Use of a joint density

                             f (θ, |y) ∝ ξ( |y, θ) × πθ (θ) × π ( )

      where y is the data, and ξ(·|y, θ) is the prior predictive density of
       = ρ(η(z), η(y)) given θ and x when z ∼ f (z|θ)
      Warning! Replacement of ξ( |y, θ) with a non-parametric kernel
      approximation.
                 [Ratmann, Andrieu, Wiuf and Richardson, 2009, PNAS]
ABC Methods for Bayesian Model Choice
  Approximate Bayesian computation
     Alphabet soup


ABCµ details

      Major drift: Multidimensional distances ρk (k = 1, . . . , K) and
      errors k = ρk (ηk (z), ηk (y)), with

                            ˆ                1
        k   ∼ ξk ( |y, θ) ≈ ξk ( |y, θ) =             K[{ k −ρk (ηk (zb ), ηk (y))}/hk ]
                                            Bhk
                                                  b

                                                 ˆ
      then used in replacing ξ( |y, θ) with mink ξk ( |y, θ)
ABC Methods for Bayesian Model Choice
  Approximate Bayesian computation
     Alphabet soup


ABCµ details

      Major drift: Multidimensional distances ρk (k = 1, . . . , K) and
      errors k = ρk (ηk (z), ηk (y)), with

                            ˆ                 1
        k   ∼ ξk ( |y, θ) ≈ ξk ( |y, θ) =              K[{ k −ρk (ηk (zb ), ηk (y))}/hk ]
                                             Bhk
                                                   b

                                                 ˆ
      then used in replacing ξ( |y, θ) with mink ξk ( |y, θ)
      ABCµ involves acceptance probability

                                                        ˆ
                            π(θ , ) q(θ , θ)q( , ) mink ξk ( |y, θ )
                                                         ˆ
                            π(θ, ) q(θ, θ )q( , ) mink ξk ( |y, θ)
ABC Methods for Bayesian Model Choice
  Approximate Bayesian computation
     Alphabet soup


ABCµ multiple errors




                                        [ c Ratmann et al., PNAS, 2009]
ABC Methods for Bayesian Model Choice
  Approximate Bayesian computation
     Alphabet soup


ABCµ for model choice




                                        [ c Ratmann et al., PNAS, 2009]
ABC Methods for Bayesian Model Choice
  Approximate Bayesian computation
     Alphabet soup


Which summary statistics?




      Fundamental difficulty of the choice of the summary statistic when
      there is no non-trivial sufficient statistic [except when done by the
      experimenters in the field]
ABC Methods for Bayesian Model Choice
  Approximate Bayesian computation
     Alphabet soup


Which summary statistics?



      Fundamental difficulty of the choice of the summary statistic when
      there is no non-trivial sufficient statistic [except when done by the
      experimenters in the field]

      Starting from a large collection of summary statistics is available,
      Joyce and Marjoram (2008) consider the sequential inclusion into
      the ABC target, with a stopping rule based on a likelihood ratio
      test.
ABC Methods for Bayesian Model Choice
  Approximate Bayesian computation
     Alphabet soup


Which summary statistics?




      Fundamental difficulty of the choice of the summary statistic when
      there is no non-trivial sufficient statistic [except when done by the
      experimenters in the field]

      Based on decision-theoretic principles, Fearnhead and Prangle
      (2012) end up with E[θ|y] as the optimal summary statistic
ABC Methods for Bayesian Model Choice
  Approximate Bayesian computation
     simulation-based methods in Econometrics


Econ’ections



      Similar exploration of simulation-based techniques in Econometrics
              Simulated method of moments
              Method of simulated moments
              Simulated pseudo-maximum-likelihood
              Indirect inference
                                                [Gouri´roux & Monfort, 1996]
                                                      e
ABC Methods for Bayesian Model Choice
  Approximate Bayesian computation
     simulation-based methods in Econometrics


Simulated method of moments

                          o
      Given observations y1:n from a model

                                 yt = r(y1:(t−1) , t , θ) ,         t   ∼ g(·)

      simulate        1:n ,   derive

                                         yt (θ) = r(y1:(t−1) ,     t , θ)

      and estimate θ by
                                                    n
                                        arg min           (yt − yt (θ))2
                                                            o
                                                θ
                                                    t=1
ABC Methods for Bayesian Model Choice
  Approximate Bayesian computation
     simulation-based methods in Econometrics


Simulated method of moments

                          o
      Given observations y1:n from a model

                                 yt = r(y1:(t−1) , t , θ) ,           t   ∼ g(·)

      simulate        1:n ,   derive

                                         yt (θ) = r(y1:(t−1) ,       t , θ)

      and estimate θ by
                                                 n             n              2
                                                       o
                                   arg min            yt   −         yt (θ)
                                           θ
                                                t=1            t=1
ABC Methods for Bayesian Model Choice
  Approximate Bayesian computation
     simulation-based methods in Econometrics


Method of simulated moments

      Given a statistic vector K(y) with

                                 Eθ [K(Yt )|y1:(t−1) ] = k(y1:(t−1) ; θ)

      find an unbiased estimator of k(y1:(t−1) ; θ),

                                                 ˜
                                                 k( t , y1:(t−1) ; θ)

      Estimate θ by
                                        n                  S
                    arg min                     K(yt ) −         ˜
                                                                 k( s , y1:(t−1) ; θ)/S
                                                                    t
                             θ
                                     t=1                   s=1

                                                                         [Pakes & Pollard, 1989]
ABC Methods for Bayesian Model Choice
  Approximate Bayesian computation
     simulation-based methods in Econometrics


Indirect inference



                                                      ˆ
      Minimise (in θ) the distance between estimators β based on
      pseudo-models for genuine observations and for observations
      simulated under the true model and the parameter θ.

                                                [Gouri´roux, Monfort, & Renault, 1993;
                                                      e
                                                Smith, 1993; Gallant & Tauchen, 1996]
ABC Methods for Bayesian Model Choice
  Approximate Bayesian computation
     simulation-based methods in Econometrics


Indirect inference (PML vs. PSE)


      Example of the pseudo-maximum-likelihood (PML)

                           ˆ
                           β(y) = arg max               log f (yt |β, y1:(t−1) )
                                                β
                                                    t


      leading to
                                     ˆ        ˆ
                           arg min ||β(yo ) − β(y1 (θ), . . . , yS (θ))||2
                                     θ

      when
                                  ys (θ) ∼ f (y|θ)          s = 1, . . . , S
ABC Methods for Bayesian Model Choice
  Approximate Bayesian computation
     simulation-based methods in Econometrics


Indirect inference (PML vs. PSE)


      Example of the pseudo-score-estimator (PSE)
                                                                                 2
                      ˆ                             ∂ log f
                      β(y) = arg min                        (yt |β, y1:(t−1) )
                                           β
                                                t
                                                       ∂β

      leading to
                                     ˆ        ˆ
                           arg min ||β(yo ) − β(y1 (θ), . . . , yS (θ))||2
                                     θ

      when
                                  ys (θ) ∼ f (y|θ)        s = 1, . . . , S
ABC Methods for Bayesian Model Choice
  Approximate Bayesian computation
     simulation-based methods in Econometrics


ABC using indirect inference (1)


                  We present a novel approach for developing summary statistics
              for use in approximate Bayesian computation (ABC) algorithms by
              using indirect inference(...) In the indirect inference approach to
              ABC the parameters of an auxiliary model fitted to the data become
              the summary statistics. Although applicable to any ABC technique,
              we embed this approach within a sequential Monte Carlo algorithm
              that is completely adaptive and requires very little tuning(...)

                                                [Drovandi, Pettitt & Faddy, 2011]

             c Indirect inference provides summary statistics for ABC...
ABC Methods for Bayesian Model Choice
  Approximate Bayesian computation
     simulation-based methods in Econometrics


ABC using indirect inference (2)



                  ...the above result shows that, in the limit as h → 0, ABC will
              be more accurate than an indirect inference method whose auxiliary
              statistics are the same as the summary statistic that is used for
              ABC(...) Initial analysis showed that which method is more
              accurate depends on the true value of θ.

                                                   [Fearnhead and Prangle, 2012]

     c Indirect inference provides estimates rather than global inference...
ABC Methods for Bayesian Model Choice
  ABC for model choice




ABC for model choice




 Approximate Bayesian computation

 ABC for model choice
   Principle
   Gibbs random fields
   Generic ABC model choice

 Model choice consistency
ABC Methods for Bayesian Model Choice
  ABC for model choice
     Principle


Bayesian model choice

      Principle
      Several models
                                         M1 , M2 , . . .
      are considered simultaneously for dataset y and model index M
      central to inference.
      Use of a prior π(M = m), plus a prior distribution on the
      parameter conditional on the value m of the model index, πm (θ m )
      Goal is to derive the posterior distribution of M,

                                        π(M = m|data)

      a challenging computational target when models are complex.
ABC Methods for Bayesian Model Choice
  ABC for model choice
     Principle


Generic ABC for model choice


      Algorithm 3 Likelihood-free model choice sampler (ABC-MC)
        for t = 1 to T do
          repeat
             Generate m from the prior π(M = m)
             Generate θ m from the prior πm (θ m )
             Generate z from the model fm (z|θ m )
          until ρ{η(z), η(y)} <
          Set m(t) = m and θ (t) = θ m
        end for

                                        [Grelaud & al., 2009; Toni & al., 2009]
ABC Methods for Bayesian Model Choice
  ABC for model choice
     Principle


ABC estimates




      Posterior probability π(M = m|y) approximated by the frequency
      of acceptances from model m
                                            T
                                        1
                                                  Im(t) =m .
                                        T
                                            t=1
ABC Methods for Bayesian Model Choice
  ABC for model choice
     Principle


ABC estimates



      Posterior probability π(M = m|y) approximated by the frequency
      of acceptances from model m
                                            T
                                        1
                                                  Im(t) =m .
                                        T
                                            t=1

       Extension to a weighted polychotomous logistic regression
      estimate of π(M = m|y), with non-parametric kernel weights
                                        [Cornuet et al., DIYABC, 2009]
ABC Methods for Bayesian Model Choice
  ABC for model choice
     Principle


The great ABC controversy
      On-going controvery in phylogeographic genetics about the validity
      of using ABC for testing




   Against: Templeton, 2008,
   2009, 2010a, 2010b, 2010c, &tc
   argues that nested hypotheses
   cannot have higher probabilities
   than nesting hypotheses (!)
ABC Methods for Bayesian Model Choice
  ABC for model choice
     Principle


The great ABC controversy



     On-going controvery in phylogeographic genetics about the validity
     of using ABC for testing
                                        Replies: Fagundes et al., 2008,
   Against: Templeton, 2008,           Beaumont et al., 2010, Berger et
   2009, 2010a, 2010b, 2010c, &tc      al., 2010, Csill`ry et al., 2010
                                                       e
   argues that nested hypotheses       point out that the criticisms are
   cannot have higher probabilities    addressed at [Bayesian]
   than nesting hypotheses (!)         model-based inference and have
                                       nothing to do with ABC...
ABC Methods for Bayesian Model Choice
  ABC for model choice
     Gibbs random fields


Potts model
         Skip MRFs


      Potts model
      Distribution with an energy function of the form

                                        θS(y) = θ         δyl =yi
                                                    l∼i

      where l∼i denotes a neighbourhood structure
ABC Methods for Bayesian Model Choice
  ABC for model choice
     Gibbs random fields


Potts model
         Skip MRFs


      Potts model
      Distribution with an energy function of the form

                                        θS(y) = θ           δyl =yi
                                                      l∼i

      where l∼i denotes a neighbourhood structure

      In most realistic settings, summation

                                        Zθ =         exp{θS(x)}
                                               x∈X

      involves too many terms to be manageable and numerical
      approximations cannot always be trusted
ABC Methods for Bayesian Model Choice
  ABC for model choice
     Gibbs random fields


Neighbourhood relations



      Setup
      Choice to be made between M neighbourhood relations
                                        m
                                    i∼i           (0 ≤ m ≤ M − 1)

      with
                                            Sm (x) =         I{xi =xi }
                                                       m
                                                       i∼i

      driven by the posterior probabilities of the models.
ABC Methods for Bayesian Model Choice
  ABC for model choice
     Gibbs random fields


Model index


      Computational target:

               P(M = m|x) ∝                  fm (x|θm )πm (θm ) dθm π(M = m)
                                        Θm
ABC Methods for Bayesian Model Choice
  ABC for model choice
     Gibbs random fields


Model index


      Computational target:

               P(M = m|x) ∝                  fm (x|θm )πm (θm ) dθm π(M = m)
                                        Θm

      If S(x) sufficient statistic for the joint parameters
      (M, θ0 , . . . , θM −1 ),

                               P(M = m|x) = P(M = m|S(x)) .
ABC Methods for Bayesian Model Choice
  ABC for model choice
     Gibbs random fields


Sufficient statistics in Gibbs random fields


      Each model m has its own sufficient statistic Sm (·) and
      S(·) = (S0 (·), . . . , SM −1 (·)) is also (model-)sufficient.
ABC Methods for Bayesian Model Choice
  ABC for model choice
     Gibbs random fields


Sufficient statistics in Gibbs random fields


      Each model m has its own sufficient statistic Sm (·) and
      S(·) = (S0 (·), . . . , SM −1 (·)) is also (model-)sufficient.
      Explanation: For Gibbs random fields,
                                        1           2
                x|M = m ∼ fm (x|θm ) = fm (x|S(x))fm (S(x)|θm )
                                           1
                                     =         f 2 (S(x)|θm )
                                       n(S(x)) m

      where
                              n(S(x)) = {˜ ∈ X : S(˜ ) = S(x)}
                                         x         x
                      c S(x) is sufficient for the joint parameters
ABC Methods for Bayesian Model Choice
  ABC for model choice
     Gibbs random fields


Toy example


      iid Bernoulli model versus two-state first-order Markov chain, i.e.
                                              n
                  f0 (x|θ0 ) = exp θ0             I{xi =1}     {1 + exp(θ0 )}n ,
                                           i=1

      versus
                                          n
                               1
             f1 (x|θ1 ) =        exp θ1         I{xi =xi−1 }    {1 + exp(θ1 )}n−1 ,
                               2
                                          i=2

      with priors θ0 ∼ U(−5, 5) and θ1 ∼ U(0, 6) (inspired by “phase
      transition” boundaries).
ABC Methods for Bayesian Model Choice
  ABC for model choice
     Gibbs random fields


Toy example (2)




                                                                            10
                                         5




                                                                            5
                                  BF01




                                                                     BF01

                                                                            0
                                   ^




                                                                      ^
                                         0




                                                                            −5
                                         −5




                                              −40   −20     0   10          −10   −40   −20     0   10

                                                     BF01                                BF01




      (left) Comparison of the true BF m0 /m1 (x0 ) with BF m0 /m1 (x0 )
      (in logs) over 2, 000 simulations and 4.106 proposals from the
      prior. (right) Same when using tolerance corresponding to the
      1% quantile on the distances.
ABC Methods for Bayesian Model Choice
  ABC for model choice
     Generic ABC model choice


More about sufficiency




      If η1 (x) sufficient statistic for model m = 1 and parameter θ1 and
      η2 (x) sufficient statistic for model m = 2 and parameter θ2 ,
      (η1 (x), η2 (x)) is not always sufficient for (m, θm )
ABC Methods for Bayesian Model Choice
  ABC for model choice
     Generic ABC model choice


More about sufficiency




      If η1 (x) sufficient statistic for model m = 1 and parameter θ1 and
      η2 (x) sufficient statistic for model m = 2 and parameter θ2 ,
      (η1 (x), η2 (x)) is not always sufficient for (m, θm )
                c Potential loss of information at the testing level
ABC Methods for Bayesian Model Choice
  ABC for model choice
     Generic ABC model choice


Limiting behaviour of B12 (T → ∞)

      ABC approximation
                                        T
                                        t=1 Imt =1 Iρ{η(zt ),η(y)}≤
                            B12 (y) =   T
                                                                      ,
                                        t=1 Imt =2 Iρ{η(zt ),η(y)}≤

      where the (mt , z t )’s are simulated from the (joint) prior
ABC Methods for Bayesian Model Choice
  ABC for model choice
     Generic ABC model choice


Limiting behaviour of B12 (T → ∞)

      ABC approximation
                                            T
                                            t=1 Imt =1 Iρ{η(zt ),η(y)}≤
                            B12 (y) =       T
                                                                          ,
                                            t=1 Imt =2 Iρ{η(zt ),η(y)}≤

      where the (mt , z t )’s are simulated from the (joint) prior
      As T go to infinity, limit

                                        Iρ{η(z),η(y)}≤ π1 (θ 1 )f1 (z|θ 1 ) dz dθ 1
                  B12 (y) =
                                        Iρ{η(z),η(y)}≤ π2 (θ 2 )f2 (z|θ 2 ) dz dθ 2
                                                              η
                                        Iρ{η,η(y)}≤ π1 (θ 1 )f1 (η|θ 1 ) dη dθ 1
                                =                             η                  ,
                                        Iρ{η,η(y)}≤ π2 (θ 2 )f2 (η|θ 2 ) dη dθ 2
             η               η
      where f1 (η|θ 1 ) and f2 (η|θ 2 ) distributions of η(z)
ABC Methods for Bayesian Model Choice
  ABC for model choice
     Generic ABC model choice


Limiting behaviour of B12 ( → 0)




      When         goes to zero,
                                                      η
                                 η          π1 (θ 1 )f1 (η(y)|θ 1 ) dθ 1
                                B12 (y) =             η
                                            π2 (θ 2 )f2 (η(y)|θ 2 ) dθ 2
ABC Methods for Bayesian Model Choice
  ABC for model choice
     Generic ABC model choice


Limiting behaviour of B12 ( → 0)




      When         goes to zero,
                                                      η
                                 η          π1 (θ 1 )f1 (η(y)|θ 1 ) dθ 1
                                B12 (y) =             η
                                            π2 (θ 2 )f2 (η(y)|θ 2 ) dθ 2

                 c Bayes factor based on the sole observation of η(y)
ABC Methods for Bayesian Model Choice
  ABC for model choice
     Generic ABC model choice


Limiting behaviour of B12 (under sufficiency)

      If η(y) sufficient statistic in both models,

                                      fi (y|θ i ) = gi (y)fiη (η(y)|θ i )

      Thus
                                                    η
                                 Θ1   π(θ 1 )g1 (y)f1 (η(y)|θ 1 ) dθ 1
         B12 (y) =                                  η
                                 Θ2   π(θ 2 )g2 (y)f2 (η(y)|θ 2 ) dθ 2
                                                     η
                                g1 (y)     π1 (θ 1 )f1 (η(y)|θ 1 ) dθ 1   g1 (y) η
                         =                           η                  =       B (y) .
                                g2 (y)     π2 (θ 2 )f2 (η(y)|θ 2 ) dθ 2   g2 (y) 12

                                          [Didelot, Everitt, Johansen & Lawson, 2011]
ABC Methods for Bayesian Model Choice
  ABC for model choice
     Generic ABC model choice


Limiting behaviour of B12 (under sufficiency)

      If η(y) sufficient statistic in both models,

                                      fi (y|θ i ) = gi (y)fiη (η(y)|θ i )

      Thus
                                                    η
                                 Θ1   π(θ 1 )g1 (y)f1 (η(y)|θ 1 ) dθ 1
         B12 (y) =                                  η
                                 Θ2   π(θ 2 )g2 (y)f2 (η(y)|θ 2 ) dθ 2
                                                     η
                                g1 (y)     π1 (θ 1 )f1 (η(y)|θ 1 ) dθ 1   g1 (y) η
                         =                           η                  =       B (y) .
                                g2 (y)     π2 (θ 2 )f2 (η(y)|θ 2 ) dθ 2   g2 (y) 12

                                          [Didelot, Everitt, Johansen & Lawson, 2011]
                  c No discrepancy only when cross-model sufficiency
ABC Methods for Bayesian Model Choice
  ABC for model choice
     Generic ABC model choice


Poisson/geometric example

      Sample
                                           x = (x1 , . . . , xn )
      from either a Poisson P(λ) or from a geometric G(p)
      Sum
                                                 n
                                          S=          xi = η(x)
                                                i=1

      sufficient statistic for either model but not simultaneously
      Discrepancy ratio

                                        g1 (x)   S!n−S / i xi !
                                               =
                                        g2 (x)    1 n+S−1
                                                        S
ABC Methods for Bayesian Model Choice
  ABC for model choice
     Generic ABC model choice


Poisson/geometric discrepancy

                               η
      Range of B12 (x) versus B12 (x): The values produced have
      nothing in common.
ABC Methods for Bayesian Model Choice
  ABC for model choice
     Generic ABC model choice


Formal recovery



      Creating an encompassing exponential family
                                      T           T
      f (x|θ1 , θ2 , α1 , α2 ) ∝ exp{θ1 η1 (x) + θ2 η2 (x) + α1 t1 (x) + α2 t2 (x)}

      leads to a sufficient statistic (η1 (x), η2 (x), t1 (x), t2 (x))
                             [Didelot, Everitt, Johansen & Lawson, 2011]
ABC Methods for Bayesian Model Choice
  ABC for model choice
     Generic ABC model choice


Formal recovery



      Creating an encompassing exponential family
                                      T           T
      f (x|θ1 , θ2 , α1 , α2 ) ∝ exp{θ1 η1 (x) + θ2 η2 (x) + α1 t1 (x) + α2 t2 (x)}

      leads to a sufficient statistic (η1 (x), η2 (x), t1 (x), t2 (x))
                             [Didelot, Everitt, Johansen & Lawson, 2011]

      In the Poisson/geometric case, if        i xi !   is added to S, no
      discrepancy
ABC Methods for Bayesian Model Choice
  ABC for model choice
     Generic ABC model choice


Formal recovery


      Creating an encompassing exponential family
                                      T           T
      f (x|θ1 , θ2 , α1 , α2 ) ∝ exp{θ1 η1 (x) + θ2 η2 (x) + α1 t1 (x) + α2 t2 (x)}

      leads to a sufficient statistic (η1 (x), η2 (x), t1 (x), t2 (x))
                             [Didelot, Everitt, Johansen & Lawson, 2011]

      Only applies in genuine sufficiency settings...
       c Inability to evaluate information loss due to summary
      statistics
ABC Methods for Bayesian Model Choice
  ABC for model choice
     Generic ABC model choice


Meaning of the ABC-Bayes factor



      The ABC approximation to the Bayes Factor is based solely on the
      summary statistics....
ABC Methods for Bayesian Model Choice
  ABC for model choice
     Generic ABC model choice


Meaning of the ABC-Bayes factor



      The ABC approximation to the Bayes Factor is based solely on the
      summary statistics....
      In the Poisson/geometric case, if E[yi ] = θ0 > 0,

                                         η          (θ0 + 1)2 −θ0
                                    lim B12 (y) =            e
                                   n→∞                  θ0
ABC Methods for Bayesian Model Choice
  ABC for model choice
     Generic ABC model choice


MA example




                                                                                  0.7
                                                                    0.6
                                        0.6




                                                      0.6




                                                                                  0.6
                                                                    0.5
                                        0.5




                                                      0.5




                                                                                  0.5
                                                                    0.4
                                        0.4




                                                      0.4




                                                                                  0.4
                                                                    0.3
                                        0.3




                                                      0.3




                                                                                  0.3
                                                                    0.2
                                        0.2




                                                      0.2




                                                                                  0.2
                                                                    0.1
                                        0.1




                                                      0.1




                                                                                  0.1
                                        0.0




                                                      0.0




                                                                    0.0




                                                                                  0.0
                                              1   2         1   2         1   2         1   2




      Evolution [against ] of ABC Bayes factor, in terms of frequencies of
      visits to models MA(1) (left) and MA(2) (right) when equal to
      10, 1, .1, .01% quantiles on insufficient autocovariance distances. Sample
      of 50 points from a MA(2) with θ1 = 0.6, θ2 = 0.2. True Bayes factor
      equal to 17.71.
ABC Methods for Bayesian Model Choice
  ABC for model choice
     Generic ABC model choice


MA example




                                                                                  0.8
                                        0.6




                                                      0.6




                                                                    0.6




                                                                                  0.6
                                        0.4




                                                      0.4




                                                                    0.4




                                                                                  0.4
                                        0.2




                                                      0.2




                                                                    0.2




                                                                                  0.2
                                        0.0




                                                      0.0




                                                                    0.0




                                                                                  0.0
                                              1   2         1   2         1   2         1   2




      Evolution [against ] of ABC Bayes factor, in terms of frequencies of
      visits to models MA(1) (left) and MA(2) (right) when equal to
      10, 1, .1, .01% quantiles on insufficient autocovariance distances. Sample
      of 50 points from a MA(1) model with θ1 = 0.6. True Bayes factor B21
      equal to .004.
ABC Methods for Bayesian Model Choice
  ABC for model choice
     Generic ABC model choice


A population genetics evaluation

         Gene free

      Population genetics example with
             3 populations
             2 scenari
             15 individuals
             5 loci
             single mutation parameter
ABC Methods for Bayesian Model Choice
  ABC for model choice
     Generic ABC model choice


A population genetics evaluation

         Gene free

      Population genetics example with
             3 populations
             2 scenari
             15 individuals
             5 loci
             single mutation parameter
             24 summary statistics
             2 million ABC proposal
             importance [tree] sampling alternative
ABC Methods for Bayesian Model Choice
  ABC for model choice
     Generic ABC model choice


Stability of importance sampling


                                1.0




                                          1.0




                                                1.0




                                                          1.0




                                                                    1.0
                                                                q




                                      q
                                0.8




                                          0.8




                                                0.8




                                                          0.8




                                                                    0.8
                                0.6




                                          0.6




                                                0.6




                                                          0.6




                                                                    0.6
                                                      q
                                0.4




                                          0.4




                                                0.4




                                                          0.4




                                                                    0.4
                                0.2




                                          0.2




                                                0.2




                                                          0.2




                                                                    0.2
                                0.0




                                          0.0




                                                0.0




                                                          0.0




                                                                    0.0
ABC Methods for Bayesian Model Choice
  ABC for model choice
     Generic ABC model choice


Comparison with ABC
      Use of 24 summary statistics and DIY-ABC logistic correction

                                                     1.0
                                                                                                                                                                    qq
                                                                                                                                                                    qq
                                                                                                                                                               qq qq
                                                                                                                                                                  qqq
                                                                                                                                                               qq q q
                                                                                                                                                      qq       q q qq
                                                                                                                                                               q q
                                                                                                                                                                   q
                                                                                                                             q            q          q         q q
                                                                                                                                                               q
                                                                                                                                                       q qq q q
                                                                                                                                                           qq
                                                                                                                                                     q q q          qq
                                                                                                                                              qq     qq
                                                                                                                                                     q      q q q qq
                                                                                                                                                                 q q
                                                                                                                                                         q           q
                                                                                                                                                   q               q
                                                                                                                                                                  q q
                                                                                                                                          q      q
                                                                                                                                             q
                                                                                                                                                    qq
                                                                                                                                                     q      q q qq  q
                                                                                                                                                                    q
                                                     0.8




                                                                                                                                           q                      q
                                                                                                                                               q             q     q
                                                                                                                                                 q            q
                                                                                                                                             q                 q     q
                                                                                                                                                                     q
                                                                                                                                q                    qq q
                                                                                                                                                            q q
                                                                                                                                 q    q            q q qq q
                                                                                                                                                          q
                                                                                                                                   q    q           q qqq q q
                                                                                                                                                        qq
                                                                                                                                                      q
                           ABC direct and logistic




                                                                                                                                                     q
                                                                                                                                          q qq
                                                                                                                           q     q q       q
                                                                                                                                                               q
                                                                                                                                                 q q
                                                     0.6




                                                                                                                                    q      q     q    qq
                                                                                                                                 q           q
                                                                                                                             qq
                                                                                                                                   q q              q q
                                                                                                                         q      q q            q
                                                                                           q                               q q
                                                                                                                            q                q
                                                                                                                 q          q      q                q q
                                                                                               q q      q                q q
                                                                                                                           q                   q
                                                                                                                                qq
                                                                                                             qq
                                                                                                 q      q
                                                                                   qq
                                                     0.4




                                                                                                qq               q
                                                                              q                                      q
                                                                                                    q        q
                                                                          q                         q                q
                                                                     q                                                            q
                                                                                           q                     q
                                                                          q                         q

                                                             q                     q            q
                                                     0.2




                                                                     q                         qq
                                                                                       q         q
                                                                 q
                                                                                                q

                                                                          q                         q
                                                                              q
                                                                     q
                                                                      q
                                                                          q
                                                     0.0




                                                             qq




                                                           0.0                    0.2                       0.4                  0.6               0.8             1.0

                                                                                                        importance sampling
ABC Methods for Bayesian Model Choice
  ABC for model choice
     Generic ABC model choice


Comparison with ABC
      Use of 15 summary statistics

                                        6
                                        4




                                                                                                                 q                 q
                                                                                                                               q         q
                                                                                                                         q
                                                                                                                 q
                                                                                                                 q                  qq
                                        2




                                                                                                                        q          q
                                                                                                                       q                         q
                                                                                                                         qq
                           ABC direct




                                                                                                  q      q q              qq       q
                                                                                                          qq         q
                                                                                                                         q
                                                                                                      q
                                                                                                            q                      qq
                                                                                                  qq q qq q
                                                                                                         q q
                                                                                                            q
                                                                                             q    qq q q             q
                                                                                                q  q qq
                                                                                         q q qq
                                                                                              q    qq        q
                                                                                       q qq q q q qqq
                                                                                       q
                                                                           q     q    q q
                                                                                      q        q
                                                                                                    q
                                        0




                                                                            qq qq     q
                                                                                     q q        q
                                                                      qq     qq
                                                                  q           q qq
                                                      q       q               q
                                                              q
                                              q                             q
                                                          q
                                                  q
                                        −2
                                        −4




                                             −4               −2                 0                    2                        4             6

                                                                                importance sampling
ABC Methods for Bayesian Model Choice
  ABC for model choice
     Generic ABC model choice


Comparison with ABC
      Use of 15 summary statistics and DIY-ABC logistic correction

                                                                                                                                   qq
                                                     6
                                                                                                                                                   q
                                                                                                                                                           q
                                                                                                                                        q
                                                                                                                                               q
                                                                                                                                           q
                                                                                                                             q
                                                                                                                                 q q
                                                                                                                         q        q       q
                                                     4




                                                                                                                            q
                                                                                                                qq       q qq    qq      q
                                                                                                                         q      q        q
                                                                                                                          q   q q
                                                                                                   q    q     q          qq
                                                                                                                    q                    q
                                                                                                                 q q q      q           q
                           ABC direct and logistic




                                                                                                          q   q q q   q                            q
                                                                                                         q    qq                 q    q
                                                                                                               q      q q q q             qq
                                                                                                                  q
                                                     2




                                                                                                             q                   q       q
                                                                                                                                q                          q
                                                                                                         qq q qq
                                                                                                              q       q q         qq
                                                                                                                                   qq    q
                                                                                                         qq            qq     q
                                                                                                                                  q
                                                                                                     q    qq q           q              qq
                                                                                                             qq qq q q
                                                                                                     qqqq         qq q q  q
                                                                                                              qqqqq
                                                                                                               qq q           q
                                                                                                       q qq q
                                                                                                  q q q qq                q
                                                                                                   qq     q q qqq
                                                                                                                q
                                                                                     q           qqqqq q q
                                                                                                  qq       q
                                                                                           q     qq q q    q qq
                                                     0




                                                                                      qq qqq      q q
                                                                                   qq  q qqq
                                                                                        q
                                                                                        q
                                                                               q        q q
                                                                           q            q    q      q
                                                                   q                 q q q
                                                                           q
                                                           q                       q   q
                                                                       q               q
                                                                                    q qq
                                                               q                       q
                                                     −2




                                                                           q            q
                                                                               q

                                                                   q

                                                                       q
                                                     −4




                                                                           q
                                                               q
                                                           q



                                                          −4               −2                0                    2                    4               6

                                                                                            importance sampling
ABC Methods for Bayesian Model Choice
  ABC for model choice
     Generic ABC model choice


The only safe cases???



      Besides specific models like Gibbs random fields,
      using distances over the data itself escapes the discrepancy...
                               [Toni & Stumpf, 2010; Sousa & al., 2009]
ABC Methods for Bayesian Model Choice
  ABC for model choice
     Generic ABC model choice


The only safe cases???



      Besides specific models like Gibbs random fields,
      using distances over the data itself escapes the discrepancy...
                               [Toni & Stumpf, 2010; Sousa & al., 2009]

      ...and so does the use of more informal model fitting measures
                                                 [Ratmann & al., 2009]
ABC Methods for Bayesian Model Choice
  Model choice consistency




ABC model choice consistency



 Approximate Bayesian computation

 ABC for model choice

 Model choice consistency
   Formalised framework
   Consistency results
   Summary statistics
   Conclusions
ABC Methods for Bayesian Model Choice
  Model choice consistency
     Formalised framework


The starting point



      Central question to the validation of ABC for model choice:

          When is a Bayes factor based on an insufficient statistic
                            T (y) consistent?
ABC Methods for Bayesian Model Choice
  Model choice consistency
     Formalised framework


The starting point



      Central question to the validation of ABC for model choice:

          When is a Bayes factor based on an insufficient statistic
                            T (y) consistent?

                                      T
      Note: c drawn on T (y) through B12 (y) necessarily differs from
      c drawn on y through B12 (y)
ABC Methods for Bayesian Model Choice
  Model choice consistency
     Formalised framework


A benchmark if toy example




      Comparison suggested by referee of PNAS paper [thanks]:
                                [X, Cornuet, Marin, & Pillai, Aug. 2011]
      Model M1 : y ∼ N (θ1 , 1) opposed
                                 √
      to model M2 : y ∼ L(θ√ 1/ 2), Laplace distribution with mean θ2
                             2,
      and scale parameter 1/ 2 (variance one).
ABC Methods for Bayesian Model Choice
  Model choice consistency
     Formalised framework


A benchmark if toy example


      Comparison suggested by referee of PNAS paper [thanks]:
                                  [X, Cornuet, Marin, & Pillai, Aug. 2011]
      Model M1 : y ∼ N (θ1 , 1) opposed
                                   √
      to model M2 : y ∼ L(θ√ 1/ 2), Laplace distribution with mean θ2
                               2,
      and scale parameter 1/ 2 (variance one).
      Four possible statistics
         1. sample mean y (sufficient for M1 if not M2 );
         2. sample median med(y) (insufficient);
         3. sample variance var(y) (ancillary);
         4. median absolute deviation mad(y) = med(y − med(y));
ABC Methods for Bayesian Model Choice
  Model choice consistency
     Formalised framework


A benchmark if toy example
      Comparison suggested by referee of PNAS paper [thanks]:
                                [X, Cornuet, Marin, & Pillai, Aug. 2011]
      Model M1 : y ∼ N (θ1 , 1) opposed
                                 √
      to model M2 : y ∼ L(θ√ 1/ 2), Laplace distribution with mean θ2
                             2,
      and scale parameter 1/ 2 (variance one).
                 6
                 5
                 4
       Density

                 3
                 2
                 1
                 0




                     0.1   0.2   0.3          0.4        0.5   0.6   0.7

                                       posterior probability
ABC Methods for Bayesian Model Choice
  Model choice consistency
     Formalised framework


A benchmark if toy example
      Comparison suggested by referee of PNAS paper [thanks]:
                                [X, Cornuet, Marin, & Pillai, Aug. 2011]
      Model M1 : y ∼ N (θ1 , 1) opposed
                                 √
      to model M2 : y ∼ L(θ√ 1/ 2), Laplace distribution with mean θ2
                             2,
      and scale parameter 1/ 2 (variance one).
                             n=100
         0.7
         0.6
         0.5
         0.4




                     q
         0.3




                     q
                     q

                     q
         0.2
         0.1




                                        q
                                        q
                                        q

                                        q
                                        q
         0.0




                                        q
                                        q



                   Gauss             Laplace
ABC Methods for Bayesian Model Choice
  Model choice consistency
     Formalised framework


A benchmark if toy example
      Comparison suggested by referee of PNAS paper [thanks]:
                                [X, Cornuet, Marin, & Pillai, Aug. 2011]
      Model M1 : y ∼ N (θ1 , 1) opposed
                                 √
      to model M2 : y ∼ L(θ√ 1/ 2), Laplace distribution with mean θ2
                             2,
      and scale parameter 1/ 2 (variance one).
                 6




                                                                                     3
                 5
                 4
       Density




                                                                           Density

                                                                                     2
                 3
                 2




                                                                                     1
                 1
                 0




                                                                                     0




                     0.1   0.2   0.3          0.4        0.5   0.6   0.7                 0.0   0.2   0.4                 0.6   0.8   1.0

                                       posterior probability                                               probability
ABC Methods for Bayesian Model Choice
  Model choice consistency
     Formalised framework


A benchmark if toy example
      Comparison suggested by referee of PNAS paper [thanks]:
                                [X, Cornuet, Marin, & Pillai, Aug. 2011]
      Model M1 : y ∼ N (θ1 , 1) opposed
                                 √
      to model M2 : y ∼ L(θ√ 1/ 2), Laplace distribution with mean θ2
                             2,
      and scale parameter 1/ 2 (variance one).
                             n=100                           n=100




                                               1.0
         0.7




                                                                        q

                                                                        q
         0.6




                                                                        q
                                               0.8
                                                                        q

                                                                        q
         0.5




                                                                        q
                                                                        q

                                                                        q
                                               0.6




                                                                        q
         0.4




                                                                        q

                                                       q
                     q                                 q
         0.3




                     q
                     q
                                               0.4




                                                       q
                     q
         0.2




                                               0.2




                                                       q
         0.1




                                        q
                                        q
                                        q              q
                                        q              q

                                        q
                                               0.0
         0.0




                                        q
                                        q



                   Gauss             Laplace         Gauss           Laplace
ABC Methods for Bayesian Model Choice
  Model choice consistency
     Formalised framework


Framework

      Starting from sample

                                        y = (y1 , . . . , yn )

      the observed sample, not necessarily iid with true distribution

                                              y ∼ Pn

      Summary statistics

                      T (y) = T n = (T1 (y), T2 (y), · · · , Td (y)) ∈ Rd

      with true distribution T n ∼ Gn .
ABC Methods for Bayesian Model Choice
  Model choice consistency
     Formalised framework


Framework



       c Comparison of
          – under M1 , y ∼ F1,n (·|θ1 ) where θ1 ∈ Θ1 ⊂ Rp1
          – under M2 , y ∼ F2,n (·|θ2 ) where θ2 ∈ Θ2 ⊂ Rp2
      turned into
          – under M1 , T (y) ∼ G1,n (·|θ1 ), and θ1 |T (y) ∼ π1 (·|T n )
          – under M2 , T (y) ∼ G2,n (·|θ2 ), and θ2 |T (y) ∼ π2 (·|T n )
ABC Methods for Bayesian Model Choice
  Model choice consistency
     Consistency results


Assumptions



      A collection of asymptotic “standard” assumptions:
      [A1] is a standard central limit theorem under the true model
      [A2] controls the large deviations of the estimator T n from the estimand
      µ(θ)
      [A3] is the standard prior mass condition found in Bayesian asymptotics
      (di effective dimension of the parameter)
      [A4] restricts the behaviour of the model density against the true density
ABC Methods for Bayesian Model Choice
  Model choice consistency
     Consistency results


Assumptions



      A collection of asymptotic “standard” assumptions:
      [A1] There exist a sequence {vn } converging to +∞,
      a distribution Q,
      a symmetric, d × d positive definite matrix V0
      and a vector µ0 ∈ Rd such that
                                    −1/2                 n→∞
                              vn V0        (T n − µ0 )         Q, under Gn
ABC Methods for Bayesian Model Choice
  Model choice consistency
     Consistency results


Assumptions


      A collection of asymptotic “standard” assumptions:
      [A2] For i = 1, 2, there exist sets Fn,i ⊂ Θi and constants                      i , τi , αi   >0
      such that for all τ > 0,

                                      Gi,n |T n − µ(θi )| > τ |µi (θi ) − µ0 | ∧   i   |θi
                             sup                                       −αi
                           θi ∈Fn,i              (|µi (θi ) − µ0 | ∧ i )
                                       −α
                                      vn i

       with
                                                  c          −τ
                                             πi (Fn,i ) = o(vn i ).
ABC Methods for Bayesian Model Choice
  Model choice consistency
     Consistency results


Assumptions



      A collection of asymptotic “standard” assumptions:
      [A3] For u > 0
                                                                        −1
                             Sn,i (u) = θi ∈ Fn,i ; |µ(θi ) − µ0 | ≤ u vn

      if inf{|µi (θi ) − µ0 |; θi ∈ Θi } = 0, there exist constants di < τi ∧ αi − 1
      such that
                                                   −d
                              πi (Sn,i (u)) ∼ udi vn i , ∀u vn
ABC Methods for Bayesian Model Choice
  Model choice consistency
     Consistency results


Assumptions




      A collection of asymptotic “standard” assumptions:
      [A4] If inf{|µi (θi ) − µ0 |; θi ∈ Θi } = 0, for any        > 0, there exist
      U, δ > 0 and (En ) such that, if θi ∈ Sn,i (U )
                                                                     c
                           En ⊂ {t; gi (t|θi ) < δgn (t)}   and Gn (En ) < .
ABC Methods for Bayesian Model Choice
  Model choice consistency
     Consistency results


Assumptions



      A collection of asymptotic “standard” assumptions:
      Again
      [A1] is a standard central limit theorem under the true model
      [A2] controls the large deviations of the estimator T n from the estimand
      µ(θ)
      [A3] is the standard prior mass condition found in Bayesian asymptotics
      (di effective dimension of the parameter)
      [A4] restricts the behaviour of the model density against the true density
ABC Methods for Bayesian Model Choice
  Model choice consistency
     Consistency results


Effective dimension




      Understanding di in [A3]: defined only when
      µ0 ∈ {µi (θi ), θi ∈ Θi },

                           πi (θi : |µi (θi ) − µ0 | < n−1/2 ) = O(n−di /2 )

      is the effective dimension of the model Θi around µ0
ABC Methods for Bayesian Model Choice
  Model choice consistency
     Consistency results


Asymptotic marginals

      Asymptotically, under [A1]–[A4]

                                   mi (t) =        gi (t|θi ) πi (θi ) dθi
                                              Θi

      is such that
      (i) if inf{|µi (θi ) − µ0 |; θi ∈ Θi } = 0,

                                  Cl vn i ≤ mi (T n ) ≤ Cu vn i
                                      d−d                   d−d


      and
      (ii) if inf{|µi (θi ) − µ0 |; θi ∈ Θi } > 0

                                  mi (T n ) = oPn [vn i + vn i ].
                                                    d−τ    d−α
ABC Methods for Bayesian Model Choice
  Model choice consistency
     Consistency results


Within-model consistency


      Under same assumptions,
                        if inf{|µi (θi ) − µ0 |; θi ∈ Θi } = 0,
      the posterior distribution of µi (θi ) given T n is consistent at rate
      1/vn provided αi ∧ τi > di .
ABC Methods for Bayesian Model Choice
  Model choice consistency
     Consistency results


Within-model consistency


      Under same assumptions,
                        if inf{|µi (θi ) − µ0 |; θi ∈ Θi } = 0,
      the posterior distribution of µi (θi ) given T n is consistent at rate
      1/vn provided αi ∧ τi > di .
      Note: di can truly be seen as an effective dimension of the model
      under the posterior πi (.|T n ), since if µ0 ∈ {µi (θi ); θi ∈ Θi },

                                        mi (T n ) ∼ vn i
                                                     d−d
ABC Methods for Bayesian Model Choice
  Model choice consistency
     Consistency results


Between-model consistency




      Consequence of above is that asymptotic behaviour of the Bayes
      factor is driven by the asymptotic mean value of T n under both
      models. And only by this mean value!
ABC Methods for Bayesian Model Choice
  Model choice consistency
     Consistency results


Between-model consistency

      Consequence of above is that asymptotic behaviour of the Bayes
      factor is driven by the asymptotic mean value of T n under both
      models. And only by this mean value!
      Indeed, if

          inf{|µ0 − µ2 (θ2 )|; θ2 ∈ Θ2 } = inf{|µ0 − µ1 (θ1 )|; θ1 ∈ Θ1 } = 0

       then

                      Cl vn 1 −d2 ) ≤ m1 (T n ) m2 (T n ) ≤ Cu vn 1 −d2 ) ,
                          −(d                                   −(d


      where Cl , Cu = OPn (1), irrespective of the true model.
      c Only depends on the difference d1 − d2 : no consistency
ABC Methods for Bayesian Model Choice
  Model choice consistency
     Consistency results


Between-model consistency


      Consequence of above is that asymptotic behaviour of the Bayes
      factor is driven by the asymptotic mean value of T n under both
      models. And only by this mean value!
      Else, if

          inf{|µ0 − µ2 (θ2 )|; θ2 ∈ Θ2 } > inf{|µ0 − µ1 (θ1 )|; θ1 ∈ Θ1 } = 0

       then
                             m1 (T n )
                                       ≥ Cu min vn 1 −α2 ) , vn 1 −τ2 )
                                                 −(d          −(d
                             m2 (T n )
ABC Methods for Bayesian Model Choice
  Model choice consistency
     Consistency results


Consistency theorem



      If

           inf{|µ0 − µ2 (θ2 )|; θ2 ∈ Θ2 } = inf{|µ0 − µ1 (θ1 )|; θ1 ∈ Θ1 } = 0,

      Bayes factor
                                        B12 = O(vn 1 −d2 ) )
                                         T       −(d


      irrespective of the true model. It is inconsistent since it always
      picks the model with the smallest dimension
ABC Methods for Bayesian Model Choice
  Model choice consistency
     Consistency results


Consistency theorem



      If Pn belongs to one of the two models and if µ0 cannot be
      attained by the other one :

                       0 = min (inf{|µ0 − µi (θi )|; θi ∈ Θi }, i = 1, 2)
                           < max (inf{|µ0 − µi (θi )|; θi ∈ Θi }, i = 1, 2) ,
                             T
      then the Bayes factor B12 is consistent
ABC Methods for Bayesian Model Choice
  Model choice consistency
     Summary statistics


Consequences on summary statistics



      Bayes factor driven by the means µi (θi ) and the relative position of
      µ0 wrt both sets {µi (θi ); θi ∈ Θi }, i = 1, 2.
      For ABC, this implies the most likely statistics T n are ancillary
      statistics with different mean values under both models
ABC Methods for Bayesian Model Choice
  Model choice consistency
     Summary statistics


Consequences on summary statistics



      Bayes factor driven by the means µi (θi ) and the relative position of
      µ0 wrt both sets {µi (θi ); θi ∈ Θi }, i = 1, 2.
      For ABC, this implies the most likely statistics T n are ancillary
      statistics with different mean values under both models
      Else, if T n asymptotically depends on some of the parameters of
      the models, it is quite likely that there exists θi ∈ Θi such that
      µi (θi ) = µ0 even though model M1 is misspecified
ABC Methods for Bayesian Model Choice
  Model choice consistency
     Summary statistics


Toy example: Laplace versus Gauss [1]


      If
                             n
           T n = n−1               Xi4 ,   µ1 (θ) = 3 + θ4 + 6θ2 ,   µ2 (θ) = 6 + · · ·
                             i=1

      and the true distribution is Laplace with mean θ0 = 1, under the
                                        √
      Gaussian model the value θ∗ = 2 3 − 3 leads to µ0 = µ(θ∗ )
                                                  [here d1 = d2 = d = 1]
seminar at Princeton University
seminar at Princeton University
seminar at Princeton University
seminar at Princeton University
seminar at Princeton University
seminar at Princeton University
seminar at Princeton University
seminar at Princeton University
seminar at Princeton University
seminar at Princeton University
seminar at Princeton University
seminar at Princeton University
seminar at Princeton University
seminar at Princeton University
seminar at Princeton University
seminar at Princeton University
seminar at Princeton University
seminar at Princeton University

More Related Content

What's hot

ABC short course: introduction chapters
ABC short course: introduction chaptersABC short course: introduction chapters
ABC short course: introduction chaptersChristian Robert
 
Monte Carlo in Montréal 2017
Monte Carlo in Montréal 2017Monte Carlo in Montréal 2017
Monte Carlo in Montréal 2017Christian Robert
 
ABC short course: final chapters
ABC short course: final chaptersABC short course: final chapters
ABC short course: final chaptersChristian Robert
 
Inference in generative models using the Wasserstein distance [[INI]
Inference in generative models using the Wasserstein distance [[INI]Inference in generative models using the Wasserstein distance [[INI]
Inference in generative models using the Wasserstein distance [[INI]Christian Robert
 
An overview of Bayesian testing
An overview of Bayesian testingAn overview of Bayesian testing
An overview of Bayesian testingChristian Robert
 
Bayesian inference on mixtures
Bayesian inference on mixturesBayesian inference on mixtures
Bayesian inference on mixturesChristian Robert
 
Multiple estimators for Monte Carlo approximations
Multiple estimators for Monte Carlo approximationsMultiple estimators for Monte Carlo approximations
Multiple estimators for Monte Carlo approximationsChristian Robert
 
CISEA 2019: ABC consistency and convergence
CISEA 2019: ABC consistency and convergenceCISEA 2019: ABC consistency and convergence
CISEA 2019: ABC consistency and convergenceChristian Robert
 
from model uncertainty to ABC
from model uncertainty to ABCfrom model uncertainty to ABC
from model uncertainty to ABCChristian Robert
 
Laplace's Demon: seminar #1
Laplace's Demon: seminar #1Laplace's Demon: seminar #1
Laplace's Demon: seminar #1Christian Robert
 
Reliable ABC model choice via random forests
Reliable ABC model choice via random forestsReliable ABC model choice via random forests
Reliable ABC model choice via random forestsChristian Robert
 
Discussion of ABC talk by Stefano Cabras, Padova, March 21, 2013
Discussion of ABC talk by Stefano Cabras, Padova, March 21, 2013Discussion of ABC talk by Stefano Cabras, Padova, March 21, 2013
Discussion of ABC talk by Stefano Cabras, Padova, March 21, 2013Christian Robert
 
ESSLLI2016 DTS Lecture Day 1: From natural deduction to dependent type theory
ESSLLI2016 DTS Lecture Day 1: From natural deduction to dependent type theoryESSLLI2016 DTS Lecture Day 1: From natural deduction to dependent type theory
ESSLLI2016 DTS Lecture Day 1: From natural deduction to dependent type theoryDaisuke BEKKI
 

What's hot (20)

ABC workshop: 17w5025
ABC workshop: 17w5025ABC workshop: 17w5025
ABC workshop: 17w5025
 
ABC short course: introduction chapters
ABC short course: introduction chaptersABC short course: introduction chapters
ABC short course: introduction chapters
 
Monte Carlo in Montréal 2017
Monte Carlo in Montréal 2017Monte Carlo in Montréal 2017
Monte Carlo in Montréal 2017
 
ABC short course: final chapters
ABC short course: final chaptersABC short course: final chapters
ABC short course: final chapters
 
Inference in generative models using the Wasserstein distance [[INI]
Inference in generative models using the Wasserstein distance [[INI]Inference in generative models using the Wasserstein distance [[INI]
Inference in generative models using the Wasserstein distance [[INI]
 
An overview of Bayesian testing
An overview of Bayesian testingAn overview of Bayesian testing
An overview of Bayesian testing
 
ABC-Gibbs
ABC-GibbsABC-Gibbs
ABC-Gibbs
 
Bayesian inference on mixtures
Bayesian inference on mixturesBayesian inference on mixtures
Bayesian inference on mixtures
 
Boston talk
Boston talkBoston talk
Boston talk
 
ABC-Gibbs
ABC-GibbsABC-Gibbs
ABC-Gibbs
 
Multiple estimators for Monte Carlo approximations
Multiple estimators for Monte Carlo approximationsMultiple estimators for Monte Carlo approximations
Multiple estimators for Monte Carlo approximations
 
CISEA 2019: ABC consistency and convergence
CISEA 2019: ABC consistency and convergenceCISEA 2019: ABC consistency and convergence
CISEA 2019: ABC consistency and convergence
 
asymptotics of ABC
asymptotics of ABCasymptotics of ABC
asymptotics of ABC
 
Savage-Dickey paradox
Savage-Dickey paradoxSavage-Dickey paradox
Savage-Dickey paradox
 
MaxEnt 2009 talk
MaxEnt 2009 talkMaxEnt 2009 talk
MaxEnt 2009 talk
 
from model uncertainty to ABC
from model uncertainty to ABCfrom model uncertainty to ABC
from model uncertainty to ABC
 
Laplace's Demon: seminar #1
Laplace's Demon: seminar #1Laplace's Demon: seminar #1
Laplace's Demon: seminar #1
 
Reliable ABC model choice via random forests
Reliable ABC model choice via random forestsReliable ABC model choice via random forests
Reliable ABC model choice via random forests
 
Discussion of ABC talk by Stefano Cabras, Padova, March 21, 2013
Discussion of ABC talk by Stefano Cabras, Padova, March 21, 2013Discussion of ABC talk by Stefano Cabras, Padova, March 21, 2013
Discussion of ABC talk by Stefano Cabras, Padova, March 21, 2013
 
ESSLLI2016 DTS Lecture Day 1: From natural deduction to dependent type theory
ESSLLI2016 DTS Lecture Day 1: From natural deduction to dependent type theoryESSLLI2016 DTS Lecture Day 1: From natural deduction to dependent type theory
ESSLLI2016 DTS Lecture Day 1: From natural deduction to dependent type theory
 

Similar to seminar at Princeton University

Columbia workshop [ABC model choice]
Columbia workshop [ABC model choice]Columbia workshop [ABC model choice]
Columbia workshop [ABC model choice]Christian Robert
 
Discussion of Fearnhead and Prangle, RSS&lt; Dec. 14, 2011
Discussion of Fearnhead and Prangle, RSS&lt; Dec. 14, 2011Discussion of Fearnhead and Prangle, RSS&lt; Dec. 14, 2011
Discussion of Fearnhead and Prangle, RSS&lt; Dec. 14, 2011Christian Robert
 
Semi-automatic ABC: a discussion
Semi-automatic ABC: a discussionSemi-automatic ABC: a discussion
Semi-automatic ABC: a discussionChristian Robert
 
random forests for ABC model choice and parameter estimation
random forests for ABC model choice and parameter estimationrandom forests for ABC model choice and parameter estimation
random forests for ABC model choice and parameter estimationChristian Robert
 
KNN, SVM, Naive bayes classifiers
KNN, SVM, Naive bayes classifiersKNN, SVM, Naive bayes classifiers
KNN, SVM, Naive bayes classifiersSreerajVA
 
Yes III: Computational methods for model choice
Yes III: Computational methods for model choiceYes III: Computational methods for model choice
Yes III: Computational methods for model choiceChristian Robert
 
4th joint Warwick Oxford Statistics Seminar
4th joint Warwick Oxford Statistics Seminar4th joint Warwick Oxford Statistics Seminar
4th joint Warwick Oxford Statistics SeminarChristian Robert
 
NBBC15, Reyjavik, June 08, 2015
NBBC15, Reyjavik, June 08, 2015NBBC15, Reyjavik, June 08, 2015
NBBC15, Reyjavik, June 08, 2015Christian Robert
 
Course on Bayesian computational methods
Course on Bayesian computational methodsCourse on Bayesian computational methods
Course on Bayesian computational methodsChristian Robert
 
San Antonio short course, March 2010
San Antonio short course, March 2010San Antonio short course, March 2010
San Antonio short course, March 2010Christian Robert
 
ABC convergence under well- and mis-specified models
ABC convergence under well- and mis-specified modelsABC convergence under well- and mis-specified models
ABC convergence under well- and mis-specified modelsChristian Robert
 
Approximate Bayesian computation for the Ising/Potts model
Approximate Bayesian computation for the Ising/Potts modelApproximate Bayesian computation for the Ising/Potts model
Approximate Bayesian computation for the Ising/Potts modelMatt Moores
 

Similar to seminar at Princeton University (20)

ABC model choice
ABC model choiceABC model choice
ABC model choice
 
Edinburgh, Bayes-250
Edinburgh, Bayes-250Edinburgh, Bayes-250
Edinburgh, Bayes-250
 
Columbia workshop [ABC model choice]
Columbia workshop [ABC model choice]Columbia workshop [ABC model choice]
Columbia workshop [ABC model choice]
 
Discussion of Fearnhead and Prangle, RSS&lt; Dec. 14, 2011
Discussion of Fearnhead and Prangle, RSS&lt; Dec. 14, 2011Discussion of Fearnhead and Prangle, RSS&lt; Dec. 14, 2011
Discussion of Fearnhead and Prangle, RSS&lt; Dec. 14, 2011
 
Intro to ABC
Intro to ABCIntro to ABC
Intro to ABC
 
Semi-automatic ABC: a discussion
Semi-automatic ABC: a discussionSemi-automatic ABC: a discussion
Semi-automatic ABC: a discussion
 
random forests for ABC model choice and parameter estimation
random forests for ABC model choice and parameter estimationrandom forests for ABC model choice and parameter estimation
random forests for ABC model choice and parameter estimation
 
KNN, SVM, Naive bayes classifiers
KNN, SVM, Naive bayes classifiersKNN, SVM, Naive bayes classifiers
KNN, SVM, Naive bayes classifiers
 
Naive Bayes Presentation
Naive Bayes PresentationNaive Bayes Presentation
Naive Bayes Presentation
 
Desy
DesyDesy
Desy
 
Yes III: Computational methods for model choice
Yes III: Computational methods for model choiceYes III: Computational methods for model choice
Yes III: Computational methods for model choice
 
Jsm09 talk
Jsm09 talkJsm09 talk
Jsm09 talk
 
4th joint Warwick Oxford Statistics Seminar
4th joint Warwick Oxford Statistics Seminar4th joint Warwick Oxford Statistics Seminar
4th joint Warwick Oxford Statistics Seminar
 
NBBC15, Reyjavik, June 08, 2015
NBBC15, Reyjavik, June 08, 2015NBBC15, Reyjavik, June 08, 2015
NBBC15, Reyjavik, June 08, 2015
 
Course on Bayesian computational methods
Course on Bayesian computational methodsCourse on Bayesian computational methods
Course on Bayesian computational methods
 
San Antonio short course, March 2010
San Antonio short course, March 2010San Antonio short course, March 2010
San Antonio short course, March 2010
 
ML.pptx
ML.pptxML.pptx
ML.pptx
 
ch8Bayes.ppt
ch8Bayes.pptch8Bayes.ppt
ch8Bayes.ppt
 
ABC convergence under well- and mis-specified models
ABC convergence under well- and mis-specified modelsABC convergence under well- and mis-specified models
ABC convergence under well- and mis-specified models
 
Approximate Bayesian computation for the Ising/Potts model
Approximate Bayesian computation for the Ising/Potts modelApproximate Bayesian computation for the Ising/Potts model
Approximate Bayesian computation for the Ising/Potts model
 

More from Christian Robert

Asymptotics of ABC, lecture, Collège de France
Asymptotics of ABC, lecture, Collège de FranceAsymptotics of ABC, lecture, Collège de France
Asymptotics of ABC, lecture, Collège de FranceChristian Robert
 
Workshop in honour of Don Poskitt and Gael Martin
Workshop in honour of Don Poskitt and Gael MartinWorkshop in honour of Don Poskitt and Gael Martin
Workshop in honour of Don Poskitt and Gael MartinChristian Robert
 
How many components in a mixture?
How many components in a mixture?How many components in a mixture?
How many components in a mixture?Christian Robert
 
Testing for mixtures at BNP 13
Testing for mixtures at BNP 13Testing for mixtures at BNP 13
Testing for mixtures at BNP 13Christian Robert
 
Inferring the number of components: dream or reality?
Inferring the number of components: dream or reality?Inferring the number of components: dream or reality?
Inferring the number of components: dream or reality?Christian Robert
 
Testing for mixtures by seeking components
Testing for mixtures by seeking componentsTesting for mixtures by seeking components
Testing for mixtures by seeking componentsChristian Robert
 
discussion on Bayesian restricted likelihood
discussion on Bayesian restricted likelihooddiscussion on Bayesian restricted likelihood
discussion on Bayesian restricted likelihoodChristian Robert
 
NCE, GANs & VAEs (and maybe BAC)
NCE, GANs & VAEs (and maybe BAC)NCE, GANs & VAEs (and maybe BAC)
NCE, GANs & VAEs (and maybe BAC)Christian Robert
 
Coordinate sampler : A non-reversible Gibbs-like sampler
Coordinate sampler : A non-reversible Gibbs-like samplerCoordinate sampler : A non-reversible Gibbs-like sampler
Coordinate sampler : A non-reversible Gibbs-like samplerChristian Robert
 
Likelihood-free Design: a discussion
Likelihood-free Design: a discussionLikelihood-free Design: a discussion
Likelihood-free Design: a discussionChristian Robert
 
a discussion of Chib, Shin, and Simoni (2017-8) Bayesian moment models
a discussion of Chib, Shin, and Simoni (2017-8) Bayesian moment modelsa discussion of Chib, Shin, and Simoni (2017-8) Bayesian moment models
a discussion of Chib, Shin, and Simoni (2017-8) Bayesian moment modelsChristian Robert
 
ABC based on Wasserstein distances
ABC based on Wasserstein distancesABC based on Wasserstein distances
ABC based on Wasserstein distancesChristian Robert
 
Poster for Bayesian Statistics in the Big Data Era conference
Poster for Bayesian Statistics in the Big Data Era conferencePoster for Bayesian Statistics in the Big Data Era conference
Poster for Bayesian Statistics in the Big Data Era conferenceChristian Robert
 
short course at CIRM, Bayesian Masterclass, October 2018
short course at CIRM, Bayesian Masterclass, October 2018short course at CIRM, Bayesian Masterclass, October 2018
short course at CIRM, Bayesian Masterclass, October 2018Christian Robert
 

More from Christian Robert (20)

Asymptotics of ABC, lecture, Collège de France
Asymptotics of ABC, lecture, Collège de FranceAsymptotics of ABC, lecture, Collège de France
Asymptotics of ABC, lecture, Collège de France
 
Workshop in honour of Don Poskitt and Gael Martin
Workshop in honour of Don Poskitt and Gael MartinWorkshop in honour of Don Poskitt and Gael Martin
Workshop in honour of Don Poskitt and Gael Martin
 
discussion of ICML23.pdf
discussion of ICML23.pdfdiscussion of ICML23.pdf
discussion of ICML23.pdf
 
How many components in a mixture?
How many components in a mixture?How many components in a mixture?
How many components in a mixture?
 
restore.pdf
restore.pdfrestore.pdf
restore.pdf
 
Testing for mixtures at BNP 13
Testing for mixtures at BNP 13Testing for mixtures at BNP 13
Testing for mixtures at BNP 13
 
Inferring the number of components: dream or reality?
Inferring the number of components: dream or reality?Inferring the number of components: dream or reality?
Inferring the number of components: dream or reality?
 
CDT 22 slides.pdf
CDT 22 slides.pdfCDT 22 slides.pdf
CDT 22 slides.pdf
 
Testing for mixtures by seeking components
Testing for mixtures by seeking componentsTesting for mixtures by seeking components
Testing for mixtures by seeking components
 
discussion on Bayesian restricted likelihood
discussion on Bayesian restricted likelihooddiscussion on Bayesian restricted likelihood
discussion on Bayesian restricted likelihood
 
NCE, GANs & VAEs (and maybe BAC)
NCE, GANs & VAEs (and maybe BAC)NCE, GANs & VAEs (and maybe BAC)
NCE, GANs & VAEs (and maybe BAC)
 
ABC-Gibbs
ABC-GibbsABC-Gibbs
ABC-Gibbs
 
Coordinate sampler : A non-reversible Gibbs-like sampler
Coordinate sampler : A non-reversible Gibbs-like samplerCoordinate sampler : A non-reversible Gibbs-like sampler
Coordinate sampler : A non-reversible Gibbs-like sampler
 
eugenics and statistics
eugenics and statisticseugenics and statistics
eugenics and statistics
 
Likelihood-free Design: a discussion
Likelihood-free Design: a discussionLikelihood-free Design: a discussion
Likelihood-free Design: a discussion
 
the ABC of ABC
the ABC of ABCthe ABC of ABC
the ABC of ABC
 
a discussion of Chib, Shin, and Simoni (2017-8) Bayesian moment models
a discussion of Chib, Shin, and Simoni (2017-8) Bayesian moment modelsa discussion of Chib, Shin, and Simoni (2017-8) Bayesian moment models
a discussion of Chib, Shin, and Simoni (2017-8) Bayesian moment models
 
ABC based on Wasserstein distances
ABC based on Wasserstein distancesABC based on Wasserstein distances
ABC based on Wasserstein distances
 
Poster for Bayesian Statistics in the Big Data Era conference
Poster for Bayesian Statistics in the Big Data Era conferencePoster for Bayesian Statistics in the Big Data Era conference
Poster for Bayesian Statistics in the Big Data Era conference
 
short course at CIRM, Bayesian Masterclass, October 2018
short course at CIRM, Bayesian Masterclass, October 2018short course at CIRM, Bayesian Masterclass, October 2018
short course at CIRM, Bayesian Masterclass, October 2018
 

Recently uploaded

Arihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdfArihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdfchloefrazer622
 
How to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptxHow to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptxmanuelaromero2013
 
Concept of Vouching. B.Com(Hons) /B.Compdf
Concept of Vouching. B.Com(Hons) /B.CompdfConcept of Vouching. B.Com(Hons) /B.Compdf
Concept of Vouching. B.Com(Hons) /B.CompdfUmakantAnnand
 
Solving Puzzles Benefits Everyone (English).pptx
Solving Puzzles Benefits Everyone (English).pptxSolving Puzzles Benefits Everyone (English).pptx
Solving Puzzles Benefits Everyone (English).pptxOH TEIK BIN
 
The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13Steve Thomason
 
Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3JemimahLaneBuaron
 
Paris 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityParis 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityGeoBlogs
 
PSYCHIATRIC History collection FORMAT.pptx
PSYCHIATRIC   History collection FORMAT.pptxPSYCHIATRIC   History collection FORMAT.pptx
PSYCHIATRIC History collection FORMAT.pptxPoojaSen20
 
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxSOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxiammrhaywood
 
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...Marc Dusseiller Dusjagr
 
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17Incoming and Outgoing Shipments in 1 STEP Using Odoo 17
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17Celine George
 
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdfssuser54595a
 
Presiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha electionsPresiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha electionsanshu789521
 
Introduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptxIntroduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptxpboyjonauth
 
Hybridoma Technology ( Production , Purification , and Application )
Hybridoma Technology  ( Production , Purification , and Application  ) Hybridoma Technology  ( Production , Purification , and Application  )
Hybridoma Technology ( Production , Purification , and Application ) Sakshi Ghasle
 
Science 7 - LAND and SEA BREEZE and its Characteristics
Science 7 - LAND and SEA BREEZE and its CharacteristicsScience 7 - LAND and SEA BREEZE and its Characteristics
Science 7 - LAND and SEA BREEZE and its CharacteristicsKarinaGenton
 
Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptx
Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptxContemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptx
Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptxRoyAbrique
 
URLs and Routing in the Odoo 17 Website App
URLs and Routing in the Odoo 17 Website AppURLs and Routing in the Odoo 17 Website App
URLs and Routing in the Odoo 17 Website AppCeline George
 

Recently uploaded (20)

Arihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdfArihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdf
 
How to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptxHow to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptx
 
Concept of Vouching. B.Com(Hons) /B.Compdf
Concept of Vouching. B.Com(Hons) /B.CompdfConcept of Vouching. B.Com(Hons) /B.Compdf
Concept of Vouching. B.Com(Hons) /B.Compdf
 
Solving Puzzles Benefits Everyone (English).pptx
Solving Puzzles Benefits Everyone (English).pptxSolving Puzzles Benefits Everyone (English).pptx
Solving Puzzles Benefits Everyone (English).pptx
 
The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13
 
Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3
 
Paris 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityParis 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activity
 
PSYCHIATRIC History collection FORMAT.pptx
PSYCHIATRIC   History collection FORMAT.pptxPSYCHIATRIC   History collection FORMAT.pptx
PSYCHIATRIC History collection FORMAT.pptx
 
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxSOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
 
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
 
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
 
TataKelola dan KamSiber Kecerdasan Buatan v022.pdf
TataKelola dan KamSiber Kecerdasan Buatan v022.pdfTataKelola dan KamSiber Kecerdasan Buatan v022.pdf
TataKelola dan KamSiber Kecerdasan Buatan v022.pdf
 
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17Incoming and Outgoing Shipments in 1 STEP Using Odoo 17
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17
 
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
 
Presiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha electionsPresiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha elections
 
Introduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptxIntroduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptx
 
Hybridoma Technology ( Production , Purification , and Application )
Hybridoma Technology  ( Production , Purification , and Application  ) Hybridoma Technology  ( Production , Purification , and Application  )
Hybridoma Technology ( Production , Purification , and Application )
 
Science 7 - LAND and SEA BREEZE and its Characteristics
Science 7 - LAND and SEA BREEZE and its CharacteristicsScience 7 - LAND and SEA BREEZE and its Characteristics
Science 7 - LAND and SEA BREEZE and its Characteristics
 
Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptx
Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptxContemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptx
Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptx
 
URLs and Routing in the Odoo 17 Website App
URLs and Routing in the Odoo 17 Website AppURLs and Routing in the Odoo 17 Website App
URLs and Routing in the Odoo 17 Website App
 

seminar at Princeton University

  • 1. ABC Methods for Bayesian Model Choice ABC Methods for Bayesian Model Choice Christian P. Robert Universit´ Paris-Dauphine, IuF, & CREST e http://www.ceremade.dauphine.fr/~xian Princeton University, April 03, 2012 Joint work with J.-M. Cornuet, A. Grelaud, J.-M. Marin, N. Pillai, & J. Rousseau
  • 2. ABC Methods for Bayesian Model Choice Outline Approximate Bayesian computation ABC for model choice Model choice consistency
  • 3. ABC Methods for Bayesian Model Choice Approximate Bayesian computation Approximate Bayesian computation Approximate Bayesian computation ABC basics Alphabet soup simulation-based methods in Econometrics ABC for model choice Model choice consistency
  • 4. ABC Methods for Bayesian Model Choice Approximate Bayesian computation ABC basics Regular Bayesian computation issues When faced with a non-standard posterior distribution π(θ|y) ∝ π(θ)L(θ|y) the standard solution is to use simulation (Monte Carlo) to produce a sample θ1 , . . . , θ T from π(θ|y) (or approximately by Markov chain Monte Carlo methods) [Robert & Casella, 2004]
  • 5. ABC Methods for Bayesian Model Choice Approximate Bayesian computation ABC basics Untractable likelihoods Cases when the likelihood function f (y|θ) is unavailable and when the completion step f (y|θ) = f (y, z|θ) dz Z is impossible or too costly because of the dimension of z c MCMC cannot be implemented!
  • 6. ABC Methods for Bayesian Model Choice Approximate Bayesian computation ABC basics Illustration Example Coalescence tree: in population genetics, reconstitution of a common ancestor from a sample of genes via a phylogenetic tree that is close to impossible to integrate out [100 processor days with 4 parameters] [Cornuet et al., 2009, Bioinformatics]
  • 7. ABC Methods for Bayesian Model Choice Approximate Bayesian computation ABC basics The ABC method Bayesian setting: target is π(θ)f (x|θ)
  • 8. ABC Methods for Bayesian Model Choice Approximate Bayesian computation ABC basics The ABC method Bayesian setting: target is π(θ)f (x|θ) When likelihood f (x|θ) not in closed form, likelihood-free rejection technique:
  • 9. ABC Methods for Bayesian Model Choice Approximate Bayesian computation ABC basics The ABC method Bayesian setting: target is π(θ)f (x|θ) When likelihood f (x|θ) not in closed form, likelihood-free rejection technique: ABC algorithm For an observation y ∼ f (y|θ), under the prior π(θ), keep jointly simulating θ ∼ π(θ) , z ∼ f (z|θ ) , until the auxiliary variable z is equal to the observed value, z = y. [Tavar´ et al., 1997] e
  • 10. ABC Methods for Bayesian Model Choice Approximate Bayesian computation ABC basics Why does it work?! The proof is trivial: f (θi ) ∝ π(θi )f (z|θi )Iy (z) z∈D ∝ π(θi )f (y|θi ) = π(θi |y) . [Accept–Reject 101]
  • 11. ABC Methods for Bayesian Model Choice Approximate Bayesian computation ABC basics Earlier occurrence ‘Bayesian statistics and Monte Carlo methods are ideally suited to the task of passing many models over one dataset’ [Don Rubin, Annals of Statistics, 1984] Note Rubin (1984) does not promote this algorithm for likelihood-free simulation but frequentist intuition on posterior distributions: parameters from posteriors are more likely to be those that could have generated the data.
  • 12. ABC Methods for Bayesian Model Choice Approximate Bayesian computation ABC basics A as approximative When y is a continuous random variable, equality z = y is replaced with a tolerance condition, (y, z) ≤ where is a distance
  • 13. ABC Methods for Bayesian Model Choice Approximate Bayesian computation ABC basics A as approximative When y is a continuous random variable, equality z = y is replaced with a tolerance condition, (y, z) ≤ where is a distance Output distributed from π(θ) Pθ { (y, z) < } ∝ π(θ| (y, z) < )
  • 14. ABC Methods for Bayesian Model Choice Approximate Bayesian computation ABC basics ABC algorithm Algorithm 1 Likelihood-free rejection sampler for i = 1 to N do repeat generate θ from the prior distribution π(·) generate z from the likelihood f (·|θ ) until ρ{η(z), η(y)} ≤ set θi = θ end for where η(y) defines a (maybe in-sufficient) statistic
  • 15. ABC Methods for Bayesian Model Choice Approximate Bayesian computation ABC basics Output The likelihood-free algorithm samples from the marginal in z of: π(θ)f (z|θ)IA ,y (z) π (θ, z|y) = , A ,y ×Θ π(θ)f (z|θ)dzdθ where A ,y = {z ∈ D|ρ(η(z), η(y)) < }.
  • 16. ABC Methods for Bayesian Model Choice Approximate Bayesian computation ABC basics Output The likelihood-free algorithm samples from the marginal in z of: π(θ)f (z|θ)IA ,y (z) π (θ, z|y) = , A ,y ×Θ π(θ)f (z|θ)dzdθ where A ,y = {z ∈ D|ρ(η(z), η(y)) < }. The idea behind ABC is that the summary statistics coupled with a small tolerance should provide a good approximation of the posterior distribution: π (θ|y) = π (θ, z|y)dz ≈ π(θ|y) .
  • 17. ABC Methods for Bayesian Model Choice Approximate Bayesian computation ABC basics Probit modelling on Pima Indian women Example (R benchmark) 200 Pima Indian women with observed variables plasma glucose concentration in oral glucose tolerance test diastolic blood pressure diabetes pedigree function presence/absence of diabetes
  • 18. ABC Methods for Bayesian Model Choice Approximate Bayesian computation ABC basics Probit modelling on Pima Indian women Example (R benchmark) 200 Pima Indian women with observed variables plasma glucose concentration in oral glucose tolerance test diastolic blood pressure diabetes pedigree function presence/absence of diabetes Probability of diabetes function of above variables P(y = 1|x) = Φ(x1 β1 + x2 β2 + x3 β3 ) ,
  • 19. ABC Methods for Bayesian Model Choice Approximate Bayesian computation ABC basics Probit modelling on Pima Indian women Example (R benchmark) 200 Pima Indian women with observed variables plasma glucose concentration in oral glucose tolerance test diastolic blood pressure diabetes pedigree function presence/absence of diabetes Probability of diabetes function of above variables P(y = 1|x) = Φ(x1 β1 + x2 β2 + x3 β3 ) , Test of H0 : β3 = 0 for 200 observations of Pima.tr based on a g-prior modelling: β ∼ N3 (0, n xT x)−1
  • 20. ABC Methods for Bayesian Model Choice Approximate Bayesian computation ABC basics Probit modelling on Pima Indian women Example (R benchmark) 200 Pima Indian women with observed variables plasma glucose concentration in oral glucose tolerance test diastolic blood pressure diabetes pedigree function presence/absence of diabetes Probability of diabetes function of above variables P(y = 1|x) = Φ(x1 β1 + x2 β2 + x3 β3 ) , Test of H0 : β3 = 0 for 200 observations of Pima.tr based on a g-prior modelling: β ∼ N3 (0, n xT x)−1 Use of importance function inspired from the MLE estimate distribution ˆ ˆ β ∼ N (β, Σ)
  • 21. ABC Methods for Bayesian Model Choice Approximate Bayesian computation ABC basics Pima Indian benchmark 80 100 1.0 80 60 0.8 60 0.6 Density Density Density 40 40 0.4 20 20 0.2 0.0 0 0 −0.005 0.010 0.020 0.030 −0.05 −0.03 −0.01 −1.0 0.0 1.0 2.0 Figure: Comparison between density estimates of the marginals on β1 (left), β2 (center) and β3 (right) from ABC rejection samples (red) and MCMC samples (black) .
  • 22. ABC Methods for Bayesian Model Choice Approximate Bayesian computation ABC basics MA example Consider the MA(q) model q xt = t+ ϑi t−i i=1 Simple prior: uniform prior over the identifiability zone, e.g. triangle for MA(2)
  • 23. ABC Methods for Bayesian Model Choice Approximate Bayesian computation ABC basics MA example (2) ABC algorithm thus made of 1. picking a new value (ϑ1 , ϑ2 ) in the triangle 2. generating an iid sequence ( t )−q<t≤T 3. producing a simulated series (xt )1≤t≤T
  • 24. ABC Methods for Bayesian Model Choice Approximate Bayesian computation ABC basics MA example (2) ABC algorithm thus made of 1. picking a new value (ϑ1 , ϑ2 ) in the triangle 2. generating an iid sequence ( t )−q<t≤T 3. producing a simulated series (xt )1≤t≤T Distance: basic distance between the series T ρ((xt )1≤t≤T , (xt )1≤t≤T ) = (xt − xt )2 t=1 or between summary statistics like the first q autocorrelations T τj = xt xt−j t=j+1
  • 25. ABC Methods for Bayesian Model Choice Approximate Bayesian computation ABC basics Comparison of distance impact Evaluation of the tolerance on the ABC sample against both distances ( = 100%, 10%, 1%, 0.1%) for an MA(2) model
  • 26. ABC Methods for Bayesian Model Choice Approximate Bayesian computation ABC basics Comparison of distance impact 4 1.5 3 1.0 2 0.5 1 0.0 0 0.0 0.2 0.4 0.6 0.8 −2.0 −1.0 0.0 0.5 1.0 1.5 θ1 θ2 Evaluation of the tolerance on the ABC sample against both distances ( = 100%, 10%, 1%, 0.1%) for an MA(2) model
  • 27. ABC Methods for Bayesian Model Choice Approximate Bayesian computation ABC basics Comparison of distance impact 4 1.5 3 1.0 2 0.5 1 0.0 0 0.0 0.2 0.4 0.6 0.8 −2.0 −1.0 0.0 0.5 1.0 1.5 θ1 θ2 Evaluation of the tolerance on the ABC sample against both distances ( = 100%, 10%, 1%, 0.1%) for an MA(2) model
  • 28. ABC Methods for Bayesian Model Choice Approximate Bayesian computation ABC basics ABC advances Simulating from the prior is often poor in efficiency
  • 29. ABC Methods for Bayesian Model Choice Approximate Bayesian computation ABC basics ABC advances Simulating from the prior is often poor in efficiency Either modify the proposal distribution on θ to increase the density of x’s within the vicinity of y... [Marjoram et al, 2003; Bortot et al., 2007, Sisson et al., 2007]
  • 30. ABC Methods for Bayesian Model Choice Approximate Bayesian computation ABC basics ABC advances Simulating from the prior is often poor in efficiency Either modify the proposal distribution on θ to increase the density of x’s within the vicinity of y... [Marjoram et al, 2003; Bortot et al., 2007, Sisson et al., 2007] ...or by viewing the problem as a conditional density estimation and by developing techniques to allow for larger [Beaumont et al., 2002]
  • 31. ABC Methods for Bayesian Model Choice Approximate Bayesian computation ABC basics ABC advances Simulating from the prior is often poor in efficiency Either modify the proposal distribution on θ to increase the density of x’s within the vicinity of y... [Marjoram et al, 2003; Bortot et al., 2007, Sisson et al., 2007] ...or by viewing the problem as a conditional density estimation and by developing techniques to allow for larger [Beaumont et al., 2002] .....or even by including in the inferential framework [ABCµ ] [Ratmann et al., 2009]
  • 32. ABC Methods for Bayesian Model Choice Approximate Bayesian computation Alphabet soup ABC-MCMC Markov chain (θ(t) ) created via the transition function  θ ∼ Kω (θ |θ(t) ) if x ∼ f (x|θ ) is such that x = y   π(θ )Kω (t) |θ ) θ (t+1) = and u ∼ U(0, 1) ≤ π(θ(t) )K (θ |θ(t) ) ,  ω (θ  (t) θ otherwise,
  • 33. ABC Methods for Bayesian Model Choice Approximate Bayesian computation Alphabet soup ABC-MCMC Markov chain (θ(t) ) created via the transition function  θ ∼ Kω (θ |θ(t) ) if x ∼ f (x|θ ) is such that x = y   π(θ )Kω (t) |θ ) θ (t+1) = and u ∼ U(0, 1) ≤ π(θ(t) )K (θ |θ(t) ) ,  ω (θ  (t) θ otherwise, has the posterior π(θ|y) as stationary distribution [Marjoram et al, 2003]
  • 34. ABC Methods for Bayesian Model Choice Approximate Bayesian computation Alphabet soup ABC-MCMC (2) Algorithm 2 Likelihood-free MCMC sampler Use Algorithm 1 to get (θ(0) , z(0) ) for t = 1 to N do Generate θ from Kω ·|θ(t−1) , Generate z from the likelihood f (·|θ ), Generate u from U[0,1] , π(θ )Kω (θ(t−1) |θ ) if u ≤ I π(θ(t−1) Kω (θ |θ(t−1) ) A ,y (z ) then set (θ(t) , z(t) ) = (θ , z ) else (θ(t) , z(t) )) = (θ(t−1) , z(t−1) ), end if end for
  • 35. ABC Methods for Bayesian Model Choice Approximate Bayesian computation Alphabet soup Why does it work? Acceptance probability that does not involve the calculation of the likelihood and π (θ , z |y) Kω (θ(t−1) |θ )f (z(t−1) |θ(t−1) ) × π (θ(t−1) , z(t−1) |y) Kω (θ |θ(t−1) )f (z |θ ) π(θ ) f (z |θ ) IA ,y (z ) = (t−1) ) f (z(t−1) |θ (t−1) )I (t−1) ) π(θ A ,y (z Kω (θ(t−1) |θ ) f (z(t−1) |θ(t−1) ) × Kω (θ |θ(t−1) ) f (z |θ ) π(θ )Kω (θ(t−1) |θ ) = IA (z ) . π(θ(t−1) Kω (θ |θ(t−1) ) ,y
  • 36. ABC Methods for Bayesian Model Choice Approximate Bayesian computation Alphabet soup ABCµ Use of a joint density f (θ, |y) ∝ ξ( |y, θ) × πθ (θ) × π ( ) where y is the data, and ξ(·|y, θ) is the prior predictive density of = ρ(η(z), η(y)) given θ and x when z ∼ f (z|θ)
  • 37. ABC Methods for Bayesian Model Choice Approximate Bayesian computation Alphabet soup ABCµ Use of a joint density f (θ, |y) ∝ ξ( |y, θ) × πθ (θ) × π ( ) where y is the data, and ξ(·|y, θ) is the prior predictive density of = ρ(η(z), η(y)) given θ and x when z ∼ f (z|θ) Warning! Replacement of ξ( |y, θ) with a non-parametric kernel approximation. [Ratmann, Andrieu, Wiuf and Richardson, 2009, PNAS]
  • 38. ABC Methods for Bayesian Model Choice Approximate Bayesian computation Alphabet soup ABCµ details Major drift: Multidimensional distances ρk (k = 1, . . . , K) and errors k = ρk (ηk (z), ηk (y)), with ˆ 1 k ∼ ξk ( |y, θ) ≈ ξk ( |y, θ) = K[{ k −ρk (ηk (zb ), ηk (y))}/hk ] Bhk b ˆ then used in replacing ξ( |y, θ) with mink ξk ( |y, θ)
  • 39. ABC Methods for Bayesian Model Choice Approximate Bayesian computation Alphabet soup ABCµ details Major drift: Multidimensional distances ρk (k = 1, . . . , K) and errors k = ρk (ηk (z), ηk (y)), with ˆ 1 k ∼ ξk ( |y, θ) ≈ ξk ( |y, θ) = K[{ k −ρk (ηk (zb ), ηk (y))}/hk ] Bhk b ˆ then used in replacing ξ( |y, θ) with mink ξk ( |y, θ) ABCµ involves acceptance probability ˆ π(θ , ) q(θ , θ)q( , ) mink ξk ( |y, θ ) ˆ π(θ, ) q(θ, θ )q( , ) mink ξk ( |y, θ)
  • 40. ABC Methods for Bayesian Model Choice Approximate Bayesian computation Alphabet soup ABCµ multiple errors [ c Ratmann et al., PNAS, 2009]
  • 41. ABC Methods for Bayesian Model Choice Approximate Bayesian computation Alphabet soup ABCµ for model choice [ c Ratmann et al., PNAS, 2009]
  • 42. ABC Methods for Bayesian Model Choice Approximate Bayesian computation Alphabet soup Which summary statistics? Fundamental difficulty of the choice of the summary statistic when there is no non-trivial sufficient statistic [except when done by the experimenters in the field]
  • 43. ABC Methods for Bayesian Model Choice Approximate Bayesian computation Alphabet soup Which summary statistics? Fundamental difficulty of the choice of the summary statistic when there is no non-trivial sufficient statistic [except when done by the experimenters in the field] Starting from a large collection of summary statistics is available, Joyce and Marjoram (2008) consider the sequential inclusion into the ABC target, with a stopping rule based on a likelihood ratio test.
  • 44. ABC Methods for Bayesian Model Choice Approximate Bayesian computation Alphabet soup Which summary statistics? Fundamental difficulty of the choice of the summary statistic when there is no non-trivial sufficient statistic [except when done by the experimenters in the field] Based on decision-theoretic principles, Fearnhead and Prangle (2012) end up with E[θ|y] as the optimal summary statistic
  • 45. ABC Methods for Bayesian Model Choice Approximate Bayesian computation simulation-based methods in Econometrics Econ’ections Similar exploration of simulation-based techniques in Econometrics Simulated method of moments Method of simulated moments Simulated pseudo-maximum-likelihood Indirect inference [Gouri´roux & Monfort, 1996] e
  • 46. ABC Methods for Bayesian Model Choice Approximate Bayesian computation simulation-based methods in Econometrics Simulated method of moments o Given observations y1:n from a model yt = r(y1:(t−1) , t , θ) , t ∼ g(·) simulate 1:n , derive yt (θ) = r(y1:(t−1) , t , θ) and estimate θ by n arg min (yt − yt (θ))2 o θ t=1
  • 47. ABC Methods for Bayesian Model Choice Approximate Bayesian computation simulation-based methods in Econometrics Simulated method of moments o Given observations y1:n from a model yt = r(y1:(t−1) , t , θ) , t ∼ g(·) simulate 1:n , derive yt (θ) = r(y1:(t−1) , t , θ) and estimate θ by n n 2 o arg min yt − yt (θ) θ t=1 t=1
  • 48. ABC Methods for Bayesian Model Choice Approximate Bayesian computation simulation-based methods in Econometrics Method of simulated moments Given a statistic vector K(y) with Eθ [K(Yt )|y1:(t−1) ] = k(y1:(t−1) ; θ) find an unbiased estimator of k(y1:(t−1) ; θ), ˜ k( t , y1:(t−1) ; θ) Estimate θ by n S arg min K(yt ) − ˜ k( s , y1:(t−1) ; θ)/S t θ t=1 s=1 [Pakes & Pollard, 1989]
  • 49. ABC Methods for Bayesian Model Choice Approximate Bayesian computation simulation-based methods in Econometrics Indirect inference ˆ Minimise (in θ) the distance between estimators β based on pseudo-models for genuine observations and for observations simulated under the true model and the parameter θ. [Gouri´roux, Monfort, & Renault, 1993; e Smith, 1993; Gallant & Tauchen, 1996]
  • 50. ABC Methods for Bayesian Model Choice Approximate Bayesian computation simulation-based methods in Econometrics Indirect inference (PML vs. PSE) Example of the pseudo-maximum-likelihood (PML) ˆ β(y) = arg max log f (yt |β, y1:(t−1) ) β t leading to ˆ ˆ arg min ||β(yo ) − β(y1 (θ), . . . , yS (θ))||2 θ when ys (θ) ∼ f (y|θ) s = 1, . . . , S
  • 51. ABC Methods for Bayesian Model Choice Approximate Bayesian computation simulation-based methods in Econometrics Indirect inference (PML vs. PSE) Example of the pseudo-score-estimator (PSE) 2 ˆ ∂ log f β(y) = arg min (yt |β, y1:(t−1) ) β t ∂β leading to ˆ ˆ arg min ||β(yo ) − β(y1 (θ), . . . , yS (θ))||2 θ when ys (θ) ∼ f (y|θ) s = 1, . . . , S
  • 52. ABC Methods for Bayesian Model Choice Approximate Bayesian computation simulation-based methods in Econometrics ABC using indirect inference (1) We present a novel approach for developing summary statistics for use in approximate Bayesian computation (ABC) algorithms by using indirect inference(...) In the indirect inference approach to ABC the parameters of an auxiliary model fitted to the data become the summary statistics. Although applicable to any ABC technique, we embed this approach within a sequential Monte Carlo algorithm that is completely adaptive and requires very little tuning(...) [Drovandi, Pettitt & Faddy, 2011] c Indirect inference provides summary statistics for ABC...
  • 53. ABC Methods for Bayesian Model Choice Approximate Bayesian computation simulation-based methods in Econometrics ABC using indirect inference (2) ...the above result shows that, in the limit as h → 0, ABC will be more accurate than an indirect inference method whose auxiliary statistics are the same as the summary statistic that is used for ABC(...) Initial analysis showed that which method is more accurate depends on the true value of θ. [Fearnhead and Prangle, 2012] c Indirect inference provides estimates rather than global inference...
  • 54. ABC Methods for Bayesian Model Choice ABC for model choice ABC for model choice Approximate Bayesian computation ABC for model choice Principle Gibbs random fields Generic ABC model choice Model choice consistency
  • 55. ABC Methods for Bayesian Model Choice ABC for model choice Principle Bayesian model choice Principle Several models M1 , M2 , . . . are considered simultaneously for dataset y and model index M central to inference. Use of a prior π(M = m), plus a prior distribution on the parameter conditional on the value m of the model index, πm (θ m ) Goal is to derive the posterior distribution of M, π(M = m|data) a challenging computational target when models are complex.
  • 56. ABC Methods for Bayesian Model Choice ABC for model choice Principle Generic ABC for model choice Algorithm 3 Likelihood-free model choice sampler (ABC-MC) for t = 1 to T do repeat Generate m from the prior π(M = m) Generate θ m from the prior πm (θ m ) Generate z from the model fm (z|θ m ) until ρ{η(z), η(y)} < Set m(t) = m and θ (t) = θ m end for [Grelaud & al., 2009; Toni & al., 2009]
  • 57. ABC Methods for Bayesian Model Choice ABC for model choice Principle ABC estimates Posterior probability π(M = m|y) approximated by the frequency of acceptances from model m T 1 Im(t) =m . T t=1
  • 58. ABC Methods for Bayesian Model Choice ABC for model choice Principle ABC estimates Posterior probability π(M = m|y) approximated by the frequency of acceptances from model m T 1 Im(t) =m . T t=1 Extension to a weighted polychotomous logistic regression estimate of π(M = m|y), with non-parametric kernel weights [Cornuet et al., DIYABC, 2009]
  • 59. ABC Methods for Bayesian Model Choice ABC for model choice Principle The great ABC controversy On-going controvery in phylogeographic genetics about the validity of using ABC for testing Against: Templeton, 2008, 2009, 2010a, 2010b, 2010c, &tc argues that nested hypotheses cannot have higher probabilities than nesting hypotheses (!)
  • 60. ABC Methods for Bayesian Model Choice ABC for model choice Principle The great ABC controversy On-going controvery in phylogeographic genetics about the validity of using ABC for testing Replies: Fagundes et al., 2008, Against: Templeton, 2008, Beaumont et al., 2010, Berger et 2009, 2010a, 2010b, 2010c, &tc al., 2010, Csill`ry et al., 2010 e argues that nested hypotheses point out that the criticisms are cannot have higher probabilities addressed at [Bayesian] than nesting hypotheses (!) model-based inference and have nothing to do with ABC...
  • 61. ABC Methods for Bayesian Model Choice ABC for model choice Gibbs random fields Potts model Skip MRFs Potts model Distribution with an energy function of the form θS(y) = θ δyl =yi l∼i where l∼i denotes a neighbourhood structure
  • 62. ABC Methods for Bayesian Model Choice ABC for model choice Gibbs random fields Potts model Skip MRFs Potts model Distribution with an energy function of the form θS(y) = θ δyl =yi l∼i where l∼i denotes a neighbourhood structure In most realistic settings, summation Zθ = exp{θS(x)} x∈X involves too many terms to be manageable and numerical approximations cannot always be trusted
  • 63. ABC Methods for Bayesian Model Choice ABC for model choice Gibbs random fields Neighbourhood relations Setup Choice to be made between M neighbourhood relations m i∼i (0 ≤ m ≤ M − 1) with Sm (x) = I{xi =xi } m i∼i driven by the posterior probabilities of the models.
  • 64. ABC Methods for Bayesian Model Choice ABC for model choice Gibbs random fields Model index Computational target: P(M = m|x) ∝ fm (x|θm )πm (θm ) dθm π(M = m) Θm
  • 65. ABC Methods for Bayesian Model Choice ABC for model choice Gibbs random fields Model index Computational target: P(M = m|x) ∝ fm (x|θm )πm (θm ) dθm π(M = m) Θm If S(x) sufficient statistic for the joint parameters (M, θ0 , . . . , θM −1 ), P(M = m|x) = P(M = m|S(x)) .
  • 66. ABC Methods for Bayesian Model Choice ABC for model choice Gibbs random fields Sufficient statistics in Gibbs random fields Each model m has its own sufficient statistic Sm (·) and S(·) = (S0 (·), . . . , SM −1 (·)) is also (model-)sufficient.
  • 67. ABC Methods for Bayesian Model Choice ABC for model choice Gibbs random fields Sufficient statistics in Gibbs random fields Each model m has its own sufficient statistic Sm (·) and S(·) = (S0 (·), . . . , SM −1 (·)) is also (model-)sufficient. Explanation: For Gibbs random fields, 1 2 x|M = m ∼ fm (x|θm ) = fm (x|S(x))fm (S(x)|θm ) 1 = f 2 (S(x)|θm ) n(S(x)) m where n(S(x)) = {˜ ∈ X : S(˜ ) = S(x)} x x c S(x) is sufficient for the joint parameters
  • 68. ABC Methods for Bayesian Model Choice ABC for model choice Gibbs random fields Toy example iid Bernoulli model versus two-state first-order Markov chain, i.e. n f0 (x|θ0 ) = exp θ0 I{xi =1} {1 + exp(θ0 )}n , i=1 versus n 1 f1 (x|θ1 ) = exp θ1 I{xi =xi−1 } {1 + exp(θ1 )}n−1 , 2 i=2 with priors θ0 ∼ U(−5, 5) and θ1 ∼ U(0, 6) (inspired by “phase transition” boundaries).
  • 69. ABC Methods for Bayesian Model Choice ABC for model choice Gibbs random fields Toy example (2) 10 5 5 BF01 BF01 0 ^ ^ 0 −5 −5 −40 −20 0 10 −10 −40 −20 0 10 BF01 BF01 (left) Comparison of the true BF m0 /m1 (x0 ) with BF m0 /m1 (x0 ) (in logs) over 2, 000 simulations and 4.106 proposals from the prior. (right) Same when using tolerance corresponding to the 1% quantile on the distances.
  • 70. ABC Methods for Bayesian Model Choice ABC for model choice Generic ABC model choice More about sufficiency If η1 (x) sufficient statistic for model m = 1 and parameter θ1 and η2 (x) sufficient statistic for model m = 2 and parameter θ2 , (η1 (x), η2 (x)) is not always sufficient for (m, θm )
  • 71. ABC Methods for Bayesian Model Choice ABC for model choice Generic ABC model choice More about sufficiency If η1 (x) sufficient statistic for model m = 1 and parameter θ1 and η2 (x) sufficient statistic for model m = 2 and parameter θ2 , (η1 (x), η2 (x)) is not always sufficient for (m, θm ) c Potential loss of information at the testing level
  • 72. ABC Methods for Bayesian Model Choice ABC for model choice Generic ABC model choice Limiting behaviour of B12 (T → ∞) ABC approximation T t=1 Imt =1 Iρ{η(zt ),η(y)}≤ B12 (y) = T , t=1 Imt =2 Iρ{η(zt ),η(y)}≤ where the (mt , z t )’s are simulated from the (joint) prior
  • 73. ABC Methods for Bayesian Model Choice ABC for model choice Generic ABC model choice Limiting behaviour of B12 (T → ∞) ABC approximation T t=1 Imt =1 Iρ{η(zt ),η(y)}≤ B12 (y) = T , t=1 Imt =2 Iρ{η(zt ),η(y)}≤ where the (mt , z t )’s are simulated from the (joint) prior As T go to infinity, limit Iρ{η(z),η(y)}≤ π1 (θ 1 )f1 (z|θ 1 ) dz dθ 1 B12 (y) = Iρ{η(z),η(y)}≤ π2 (θ 2 )f2 (z|θ 2 ) dz dθ 2 η Iρ{η,η(y)}≤ π1 (θ 1 )f1 (η|θ 1 ) dη dθ 1 = η , Iρ{η,η(y)}≤ π2 (θ 2 )f2 (η|θ 2 ) dη dθ 2 η η where f1 (η|θ 1 ) and f2 (η|θ 2 ) distributions of η(z)
  • 74. ABC Methods for Bayesian Model Choice ABC for model choice Generic ABC model choice Limiting behaviour of B12 ( → 0) When goes to zero, η η π1 (θ 1 )f1 (η(y)|θ 1 ) dθ 1 B12 (y) = η π2 (θ 2 )f2 (η(y)|θ 2 ) dθ 2
  • 75. ABC Methods for Bayesian Model Choice ABC for model choice Generic ABC model choice Limiting behaviour of B12 ( → 0) When goes to zero, η η π1 (θ 1 )f1 (η(y)|θ 1 ) dθ 1 B12 (y) = η π2 (θ 2 )f2 (η(y)|θ 2 ) dθ 2 c Bayes factor based on the sole observation of η(y)
  • 76. ABC Methods for Bayesian Model Choice ABC for model choice Generic ABC model choice Limiting behaviour of B12 (under sufficiency) If η(y) sufficient statistic in both models, fi (y|θ i ) = gi (y)fiη (η(y)|θ i ) Thus η Θ1 π(θ 1 )g1 (y)f1 (η(y)|θ 1 ) dθ 1 B12 (y) = η Θ2 π(θ 2 )g2 (y)f2 (η(y)|θ 2 ) dθ 2 η g1 (y) π1 (θ 1 )f1 (η(y)|θ 1 ) dθ 1 g1 (y) η = η = B (y) . g2 (y) π2 (θ 2 )f2 (η(y)|θ 2 ) dθ 2 g2 (y) 12 [Didelot, Everitt, Johansen & Lawson, 2011]
  • 77. ABC Methods for Bayesian Model Choice ABC for model choice Generic ABC model choice Limiting behaviour of B12 (under sufficiency) If η(y) sufficient statistic in both models, fi (y|θ i ) = gi (y)fiη (η(y)|θ i ) Thus η Θ1 π(θ 1 )g1 (y)f1 (η(y)|θ 1 ) dθ 1 B12 (y) = η Θ2 π(θ 2 )g2 (y)f2 (η(y)|θ 2 ) dθ 2 η g1 (y) π1 (θ 1 )f1 (η(y)|θ 1 ) dθ 1 g1 (y) η = η = B (y) . g2 (y) π2 (θ 2 )f2 (η(y)|θ 2 ) dθ 2 g2 (y) 12 [Didelot, Everitt, Johansen & Lawson, 2011] c No discrepancy only when cross-model sufficiency
  • 78. ABC Methods for Bayesian Model Choice ABC for model choice Generic ABC model choice Poisson/geometric example Sample x = (x1 , . . . , xn ) from either a Poisson P(λ) or from a geometric G(p) Sum n S= xi = η(x) i=1 sufficient statistic for either model but not simultaneously Discrepancy ratio g1 (x) S!n−S / i xi ! = g2 (x) 1 n+S−1 S
  • 79. ABC Methods for Bayesian Model Choice ABC for model choice Generic ABC model choice Poisson/geometric discrepancy η Range of B12 (x) versus B12 (x): The values produced have nothing in common.
  • 80. ABC Methods for Bayesian Model Choice ABC for model choice Generic ABC model choice Formal recovery Creating an encompassing exponential family T T f (x|θ1 , θ2 , α1 , α2 ) ∝ exp{θ1 η1 (x) + θ2 η2 (x) + α1 t1 (x) + α2 t2 (x)} leads to a sufficient statistic (η1 (x), η2 (x), t1 (x), t2 (x)) [Didelot, Everitt, Johansen & Lawson, 2011]
  • 81. ABC Methods for Bayesian Model Choice ABC for model choice Generic ABC model choice Formal recovery Creating an encompassing exponential family T T f (x|θ1 , θ2 , α1 , α2 ) ∝ exp{θ1 η1 (x) + θ2 η2 (x) + α1 t1 (x) + α2 t2 (x)} leads to a sufficient statistic (η1 (x), η2 (x), t1 (x), t2 (x)) [Didelot, Everitt, Johansen & Lawson, 2011] In the Poisson/geometric case, if i xi ! is added to S, no discrepancy
  • 82. ABC Methods for Bayesian Model Choice ABC for model choice Generic ABC model choice Formal recovery Creating an encompassing exponential family T T f (x|θ1 , θ2 , α1 , α2 ) ∝ exp{θ1 η1 (x) + θ2 η2 (x) + α1 t1 (x) + α2 t2 (x)} leads to a sufficient statistic (η1 (x), η2 (x), t1 (x), t2 (x)) [Didelot, Everitt, Johansen & Lawson, 2011] Only applies in genuine sufficiency settings... c Inability to evaluate information loss due to summary statistics
  • 83. ABC Methods for Bayesian Model Choice ABC for model choice Generic ABC model choice Meaning of the ABC-Bayes factor The ABC approximation to the Bayes Factor is based solely on the summary statistics....
  • 84. ABC Methods for Bayesian Model Choice ABC for model choice Generic ABC model choice Meaning of the ABC-Bayes factor The ABC approximation to the Bayes Factor is based solely on the summary statistics.... In the Poisson/geometric case, if E[yi ] = θ0 > 0, η (θ0 + 1)2 −θ0 lim B12 (y) = e n→∞ θ0
  • 85. ABC Methods for Bayesian Model Choice ABC for model choice Generic ABC model choice MA example 0.7 0.6 0.6 0.6 0.6 0.5 0.5 0.5 0.5 0.4 0.4 0.4 0.4 0.3 0.3 0.3 0.3 0.2 0.2 0.2 0.2 0.1 0.1 0.1 0.1 0.0 0.0 0.0 0.0 1 2 1 2 1 2 1 2 Evolution [against ] of ABC Bayes factor, in terms of frequencies of visits to models MA(1) (left) and MA(2) (right) when equal to 10, 1, .1, .01% quantiles on insufficient autocovariance distances. Sample of 50 points from a MA(2) with θ1 = 0.6, θ2 = 0.2. True Bayes factor equal to 17.71.
  • 86. ABC Methods for Bayesian Model Choice ABC for model choice Generic ABC model choice MA example 0.8 0.6 0.6 0.6 0.6 0.4 0.4 0.4 0.4 0.2 0.2 0.2 0.2 0.0 0.0 0.0 0.0 1 2 1 2 1 2 1 2 Evolution [against ] of ABC Bayes factor, in terms of frequencies of visits to models MA(1) (left) and MA(2) (right) when equal to 10, 1, .1, .01% quantiles on insufficient autocovariance distances. Sample of 50 points from a MA(1) model with θ1 = 0.6. True Bayes factor B21 equal to .004.
  • 87. ABC Methods for Bayesian Model Choice ABC for model choice Generic ABC model choice A population genetics evaluation Gene free Population genetics example with 3 populations 2 scenari 15 individuals 5 loci single mutation parameter
  • 88. ABC Methods for Bayesian Model Choice ABC for model choice Generic ABC model choice A population genetics evaluation Gene free Population genetics example with 3 populations 2 scenari 15 individuals 5 loci single mutation parameter 24 summary statistics 2 million ABC proposal importance [tree] sampling alternative
  • 89. ABC Methods for Bayesian Model Choice ABC for model choice Generic ABC model choice Stability of importance sampling 1.0 1.0 1.0 1.0 1.0 q q 0.8 0.8 0.8 0.8 0.8 0.6 0.6 0.6 0.6 0.6 q 0.4 0.4 0.4 0.4 0.4 0.2 0.2 0.2 0.2 0.2 0.0 0.0 0.0 0.0 0.0
  • 90. ABC Methods for Bayesian Model Choice ABC for model choice Generic ABC model choice Comparison with ABC Use of 24 summary statistics and DIY-ABC logistic correction 1.0 qq qq qq qq qqq qq q q qq q q qq q q q q q q q q q q qq q q qq q q q qq qq qq q q q q qq q q q q q q q q q q q qq q q q qq q q 0.8 q q q q q q q q q q q q qq q q q q q q q qq q q q q q qqq q q qq q ABC direct and logistic q q qq q q q q q q q 0.6 q q q qq q q qq q q q q q q q q q q q q q q q q q q q q q q q q q qq qq q q qq 0.4 qq q q q q q q q q q q q q q q q q q 0.2 q qq q q q q q q q q q q 0.0 qq 0.0 0.2 0.4 0.6 0.8 1.0 importance sampling
  • 91. ABC Methods for Bayesian Model Choice ABC for model choice Generic ABC model choice Comparison with ABC Use of 15 summary statistics 6 4 q q q q q q q qq 2 q q q q qq ABC direct q q q qq q qq q q q q qq qq q qq q q q q q qq q q q q q qq q q qq q qq q q qq q q q qqq q q q q q q q q 0 qq qq q q q q qq qq q q qq q q q q q q q q −2 −4 −4 −2 0 2 4 6 importance sampling
  • 92. ABC Methods for Bayesian Model Choice ABC for model choice Generic ABC model choice Comparison with ABC Use of 15 summary statistics and DIY-ABC logistic correction qq 6 q q q q q q q q q q q 4 q qq q qq qq q q q q q q q q q q qq q q q q q q q ABC direct and logistic q q q q q q q qq q q q q q q q qq q 2 q q q q q qq q qq q q q qq qq q qq qq q q q qq q q qq qq qq q q qqqq qq q q q qqqqq qq q q q qq q q q q qq q qq q q qqq q q qqqqq q q qq q q qq q q q qq 0 qq qqq q q qq q qqq q q q q q q q q q q q q q q q q q q q q qq q q −2 q q q q q −4 q q q −4 −2 0 2 4 6 importance sampling
  • 93. ABC Methods for Bayesian Model Choice ABC for model choice Generic ABC model choice The only safe cases??? Besides specific models like Gibbs random fields, using distances over the data itself escapes the discrepancy... [Toni & Stumpf, 2010; Sousa & al., 2009]
  • 94. ABC Methods for Bayesian Model Choice ABC for model choice Generic ABC model choice The only safe cases??? Besides specific models like Gibbs random fields, using distances over the data itself escapes the discrepancy... [Toni & Stumpf, 2010; Sousa & al., 2009] ...and so does the use of more informal model fitting measures [Ratmann & al., 2009]
  • 95. ABC Methods for Bayesian Model Choice Model choice consistency ABC model choice consistency Approximate Bayesian computation ABC for model choice Model choice consistency Formalised framework Consistency results Summary statistics Conclusions
  • 96. ABC Methods for Bayesian Model Choice Model choice consistency Formalised framework The starting point Central question to the validation of ABC for model choice: When is a Bayes factor based on an insufficient statistic T (y) consistent?
  • 97. ABC Methods for Bayesian Model Choice Model choice consistency Formalised framework The starting point Central question to the validation of ABC for model choice: When is a Bayes factor based on an insufficient statistic T (y) consistent? T Note: c drawn on T (y) through B12 (y) necessarily differs from c drawn on y through B12 (y)
  • 98. ABC Methods for Bayesian Model Choice Model choice consistency Formalised framework A benchmark if toy example Comparison suggested by referee of PNAS paper [thanks]: [X, Cornuet, Marin, & Pillai, Aug. 2011] Model M1 : y ∼ N (θ1 , 1) opposed √ to model M2 : y ∼ L(θ√ 1/ 2), Laplace distribution with mean θ2 2, and scale parameter 1/ 2 (variance one).
  • 99. ABC Methods for Bayesian Model Choice Model choice consistency Formalised framework A benchmark if toy example Comparison suggested by referee of PNAS paper [thanks]: [X, Cornuet, Marin, & Pillai, Aug. 2011] Model M1 : y ∼ N (θ1 , 1) opposed √ to model M2 : y ∼ L(θ√ 1/ 2), Laplace distribution with mean θ2 2, and scale parameter 1/ 2 (variance one). Four possible statistics 1. sample mean y (sufficient for M1 if not M2 ); 2. sample median med(y) (insufficient); 3. sample variance var(y) (ancillary); 4. median absolute deviation mad(y) = med(y − med(y));
  • 100. ABC Methods for Bayesian Model Choice Model choice consistency Formalised framework A benchmark if toy example Comparison suggested by referee of PNAS paper [thanks]: [X, Cornuet, Marin, & Pillai, Aug. 2011] Model M1 : y ∼ N (θ1 , 1) opposed √ to model M2 : y ∼ L(θ√ 1/ 2), Laplace distribution with mean θ2 2, and scale parameter 1/ 2 (variance one). 6 5 4 Density 3 2 1 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 posterior probability
  • 101. ABC Methods for Bayesian Model Choice Model choice consistency Formalised framework A benchmark if toy example Comparison suggested by referee of PNAS paper [thanks]: [X, Cornuet, Marin, & Pillai, Aug. 2011] Model M1 : y ∼ N (θ1 , 1) opposed √ to model M2 : y ∼ L(θ√ 1/ 2), Laplace distribution with mean θ2 2, and scale parameter 1/ 2 (variance one). n=100 0.7 0.6 0.5 0.4 q 0.3 q q q 0.2 0.1 q q q q q 0.0 q q Gauss Laplace
  • 102. ABC Methods for Bayesian Model Choice Model choice consistency Formalised framework A benchmark if toy example Comparison suggested by referee of PNAS paper [thanks]: [X, Cornuet, Marin, & Pillai, Aug. 2011] Model M1 : y ∼ N (θ1 , 1) opposed √ to model M2 : y ∼ L(θ√ 1/ 2), Laplace distribution with mean θ2 2, and scale parameter 1/ 2 (variance one). 6 3 5 4 Density Density 2 3 2 1 1 0 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.0 0.2 0.4 0.6 0.8 1.0 posterior probability probability
  • 103. ABC Methods for Bayesian Model Choice Model choice consistency Formalised framework A benchmark if toy example Comparison suggested by referee of PNAS paper [thanks]: [X, Cornuet, Marin, & Pillai, Aug. 2011] Model M1 : y ∼ N (θ1 , 1) opposed √ to model M2 : y ∼ L(θ√ 1/ 2), Laplace distribution with mean θ2 2, and scale parameter 1/ 2 (variance one). n=100 n=100 1.0 0.7 q q 0.6 q 0.8 q q 0.5 q q q 0.6 q 0.4 q q q q 0.3 q q 0.4 q q 0.2 0.2 q 0.1 q q q q q q q 0.0 0.0 q q Gauss Laplace Gauss Laplace
  • 104. ABC Methods for Bayesian Model Choice Model choice consistency Formalised framework Framework Starting from sample y = (y1 , . . . , yn ) the observed sample, not necessarily iid with true distribution y ∼ Pn Summary statistics T (y) = T n = (T1 (y), T2 (y), · · · , Td (y)) ∈ Rd with true distribution T n ∼ Gn .
  • 105. ABC Methods for Bayesian Model Choice Model choice consistency Formalised framework Framework c Comparison of – under M1 , y ∼ F1,n (·|θ1 ) where θ1 ∈ Θ1 ⊂ Rp1 – under M2 , y ∼ F2,n (·|θ2 ) where θ2 ∈ Θ2 ⊂ Rp2 turned into – under M1 , T (y) ∼ G1,n (·|θ1 ), and θ1 |T (y) ∼ π1 (·|T n ) – under M2 , T (y) ∼ G2,n (·|θ2 ), and θ2 |T (y) ∼ π2 (·|T n )
  • 106. ABC Methods for Bayesian Model Choice Model choice consistency Consistency results Assumptions A collection of asymptotic “standard” assumptions: [A1] is a standard central limit theorem under the true model [A2] controls the large deviations of the estimator T n from the estimand µ(θ) [A3] is the standard prior mass condition found in Bayesian asymptotics (di effective dimension of the parameter) [A4] restricts the behaviour of the model density against the true density
  • 107. ABC Methods for Bayesian Model Choice Model choice consistency Consistency results Assumptions A collection of asymptotic “standard” assumptions: [A1] There exist a sequence {vn } converging to +∞, a distribution Q, a symmetric, d × d positive definite matrix V0 and a vector µ0 ∈ Rd such that −1/2 n→∞ vn V0 (T n − µ0 ) Q, under Gn
  • 108. ABC Methods for Bayesian Model Choice Model choice consistency Consistency results Assumptions A collection of asymptotic “standard” assumptions: [A2] For i = 1, 2, there exist sets Fn,i ⊂ Θi and constants i , τi , αi >0 such that for all τ > 0, Gi,n |T n − µ(θi )| > τ |µi (θi ) − µ0 | ∧ i |θi sup −αi θi ∈Fn,i (|µi (θi ) − µ0 | ∧ i ) −α vn i with c −τ πi (Fn,i ) = o(vn i ).
  • 109. ABC Methods for Bayesian Model Choice Model choice consistency Consistency results Assumptions A collection of asymptotic “standard” assumptions: [A3] For u > 0 −1 Sn,i (u) = θi ∈ Fn,i ; |µ(θi ) − µ0 | ≤ u vn if inf{|µi (θi ) − µ0 |; θi ∈ Θi } = 0, there exist constants di < τi ∧ αi − 1 such that −d πi (Sn,i (u)) ∼ udi vn i , ∀u vn
  • 110. ABC Methods for Bayesian Model Choice Model choice consistency Consistency results Assumptions A collection of asymptotic “standard” assumptions: [A4] If inf{|µi (θi ) − µ0 |; θi ∈ Θi } = 0, for any > 0, there exist U, δ > 0 and (En ) such that, if θi ∈ Sn,i (U ) c En ⊂ {t; gi (t|θi ) < δgn (t)} and Gn (En ) < .
  • 111. ABC Methods for Bayesian Model Choice Model choice consistency Consistency results Assumptions A collection of asymptotic “standard” assumptions: Again [A1] is a standard central limit theorem under the true model [A2] controls the large deviations of the estimator T n from the estimand µ(θ) [A3] is the standard prior mass condition found in Bayesian asymptotics (di effective dimension of the parameter) [A4] restricts the behaviour of the model density against the true density
  • 112. ABC Methods for Bayesian Model Choice Model choice consistency Consistency results Effective dimension Understanding di in [A3]: defined only when µ0 ∈ {µi (θi ), θi ∈ Θi }, πi (θi : |µi (θi ) − µ0 | < n−1/2 ) = O(n−di /2 ) is the effective dimension of the model Θi around µ0
  • 113. ABC Methods for Bayesian Model Choice Model choice consistency Consistency results Asymptotic marginals Asymptotically, under [A1]–[A4] mi (t) = gi (t|θi ) πi (θi ) dθi Θi is such that (i) if inf{|µi (θi ) − µ0 |; θi ∈ Θi } = 0, Cl vn i ≤ mi (T n ) ≤ Cu vn i d−d d−d and (ii) if inf{|µi (θi ) − µ0 |; θi ∈ Θi } > 0 mi (T n ) = oPn [vn i + vn i ]. d−τ d−α
  • 114. ABC Methods for Bayesian Model Choice Model choice consistency Consistency results Within-model consistency Under same assumptions, if inf{|µi (θi ) − µ0 |; θi ∈ Θi } = 0, the posterior distribution of µi (θi ) given T n is consistent at rate 1/vn provided αi ∧ τi > di .
  • 115. ABC Methods for Bayesian Model Choice Model choice consistency Consistency results Within-model consistency Under same assumptions, if inf{|µi (θi ) − µ0 |; θi ∈ Θi } = 0, the posterior distribution of µi (θi ) given T n is consistent at rate 1/vn provided αi ∧ τi > di . Note: di can truly be seen as an effective dimension of the model under the posterior πi (.|T n ), since if µ0 ∈ {µi (θi ); θi ∈ Θi }, mi (T n ) ∼ vn i d−d
  • 116. ABC Methods for Bayesian Model Choice Model choice consistency Consistency results Between-model consistency Consequence of above is that asymptotic behaviour of the Bayes factor is driven by the asymptotic mean value of T n under both models. And only by this mean value!
  • 117. ABC Methods for Bayesian Model Choice Model choice consistency Consistency results Between-model consistency Consequence of above is that asymptotic behaviour of the Bayes factor is driven by the asymptotic mean value of T n under both models. And only by this mean value! Indeed, if inf{|µ0 − µ2 (θ2 )|; θ2 ∈ Θ2 } = inf{|µ0 − µ1 (θ1 )|; θ1 ∈ Θ1 } = 0 then Cl vn 1 −d2 ) ≤ m1 (T n ) m2 (T n ) ≤ Cu vn 1 −d2 ) , −(d −(d where Cl , Cu = OPn (1), irrespective of the true model. c Only depends on the difference d1 − d2 : no consistency
  • 118. ABC Methods for Bayesian Model Choice Model choice consistency Consistency results Between-model consistency Consequence of above is that asymptotic behaviour of the Bayes factor is driven by the asymptotic mean value of T n under both models. And only by this mean value! Else, if inf{|µ0 − µ2 (θ2 )|; θ2 ∈ Θ2 } > inf{|µ0 − µ1 (θ1 )|; θ1 ∈ Θ1 } = 0 then m1 (T n ) ≥ Cu min vn 1 −α2 ) , vn 1 −τ2 ) −(d −(d m2 (T n )
  • 119. ABC Methods for Bayesian Model Choice Model choice consistency Consistency results Consistency theorem If inf{|µ0 − µ2 (θ2 )|; θ2 ∈ Θ2 } = inf{|µ0 − µ1 (θ1 )|; θ1 ∈ Θ1 } = 0, Bayes factor B12 = O(vn 1 −d2 ) ) T −(d irrespective of the true model. It is inconsistent since it always picks the model with the smallest dimension
  • 120. ABC Methods for Bayesian Model Choice Model choice consistency Consistency results Consistency theorem If Pn belongs to one of the two models and if µ0 cannot be attained by the other one : 0 = min (inf{|µ0 − µi (θi )|; θi ∈ Θi }, i = 1, 2) < max (inf{|µ0 − µi (θi )|; θi ∈ Θi }, i = 1, 2) , T then the Bayes factor B12 is consistent
  • 121. ABC Methods for Bayesian Model Choice Model choice consistency Summary statistics Consequences on summary statistics Bayes factor driven by the means µi (θi ) and the relative position of µ0 wrt both sets {µi (θi ); θi ∈ Θi }, i = 1, 2. For ABC, this implies the most likely statistics T n are ancillary statistics with different mean values under both models
  • 122. ABC Methods for Bayesian Model Choice Model choice consistency Summary statistics Consequences on summary statistics Bayes factor driven by the means µi (θi ) and the relative position of µ0 wrt both sets {µi (θi ); θi ∈ Θi }, i = 1, 2. For ABC, this implies the most likely statistics T n are ancillary statistics with different mean values under both models Else, if T n asymptotically depends on some of the parameters of the models, it is quite likely that there exists θi ∈ Θi such that µi (θi ) = µ0 even though model M1 is misspecified
  • 123. ABC Methods for Bayesian Model Choice Model choice consistency Summary statistics Toy example: Laplace versus Gauss [1] If n T n = n−1 Xi4 , µ1 (θ) = 3 + θ4 + 6θ2 , µ2 (θ) = 6 + · · · i=1 and the true distribution is Laplace with mean θ0 = 1, under the √ Gaussian model the value θ∗ = 2 3 − 3 leads to µ0 = µ(θ∗ ) [here d1 = d2 = d = 1]