Uncertainties within some Bayesian concepts:
                Examples from classnotes

                                         Christian P. Robert

                            Université Paris-Dauphine, IUF, and CREST-INSEE
                               http://www.ceremade.dauphine.fr/~xian


                                              July 31, 2011




Christian P. Robert (Paris-Dauphine)   Uncertainties within Bayesian concepts   July 31, 2011   1 / 30
Outline



             Anyone not shocked by the Bayesian theory of inference has not understood it.
                                                     — S. Senn, Bayesian Analysis, 2008


1   Testing

2   Fully specified models?

3   Model choice




Add: Call for vignettes




Kerrie Mengersen and I are collecting proposals towards a collection
of vignettes on the theme
              When is Bayesian analysis really successful?
celebrating notable achievements of Bayesian analysis.
                                                 [deadline: September 30]




Bayes factors


     The Jeffreys-subjective synthesis betrays a much more dangerous confusion than the
      Neyman-Pearson-Fisher synthesis as regards hypothesis tests — S. Senn, BA, 2008


Definition (Bayes factors)
When testing H0 : θ ∈ Θ0 vs. Ha : θ ∉ Θ0, use

                 π(Θ0 |x) / π(Θ0 )     ∫_{Θ0} f(x|θ) π0(θ) dθ
        B01  =   ─────────  ───────  = ───────────────────────
                 π(Θ0ᶜ|x) / π(Θ0ᶜ)     ∫_{Θ0ᶜ} f(x|θ) π1(θ) dθ


                                                                      [Good, 1958 & Jeffreys, 1939]



Self-contained concept


        Derived from 0 − 1 loss and Bayes rule: acceptance if
        B01 > {(1 − π(Θ0 ))/a1 }/{π(Θ0 )/a0 }
        but used outside decision-theoretic environment
        eliminates choice of π(Θ0 )
        but still depends on the choice of (π0 , π1 )
        Jeffreys’ [arbitrary] scale of evidence:
               if log10(B^π_10) is between 0 and 0.5, evidence against H0 is weak,
               if between 0.5 and 1, evidence is substantial,
               if between 1 and 2, evidence is strong, and
               if above 2, evidence is decisive
        convergent if used with proper statistics
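The scale above is a simple lookup; as a minimal sketch (for nonnegative log10(B^π_10), i.e. evidence against H0), it can be coded as:

```python
def jeffreys_evidence(log10_bf):
    """Map log10 of the Bayes factor B10 (assumed >= 0, i.e. evidence
    against H0) to Jeffreys' [arbitrary] scale quoted on the slide."""
    if log10_bf < 0.5:
        return "weak"
    if log10_bf < 1:
        return "substantial"
    if log10_bf < 2:
        return "strong"
    return "decisive"

print(jeffreys_evidence(0.3), jeffreys_evidence(1.5))  # weak strong
```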



Difficulties with ABC-Bayes factors



        ‘This is also why focus on model discrimination typically (...) proceeds by
        (...) accepting that the Bayes Factor that one obtains is only derived from
        the summary statistics and may in no way correspond to that of the full
        model.’ — S. Sisson, Jan. 31, 2011, X.’Og

In the Poisson versus geometric case, if E[yi ] = θ0 > 0,

                                       (θ0 + 1)²
                lim_{n→∞} B^η_12(y) =  ───────── e^{−θ0}
                                           θ0




Difficulties with ABC-Bayes factors




Laplace vs. Normal models:
Comparing a sample x1 , . . . , xn from the Laplace (double-exponential)
L(µ, 1/√2) distribution

                                  f(x|µ) = (1/√2) exp{−√2 |x − µ|}

or from the Normal N(µ, 1)




Difficulties with ABC-Bayes factors
Empirical mean, median and variance have the same expectation under both
models: useless!
Median absolute deviation: priceless!
Point null hypotheses


       I have no patience for statistical methods that assign positive probability to point
   hypotheses of the θ = 0 type that can never actually be true — A. Gelman, BA, 2008

Particular case H0 : θ = θ0
Take ρ0 = Pr^π(θ = θ0) and g1 the prior density under Ha .
Posterior probability of H0

                         f(x|θ0) ρ0             f(x|θ0) ρ0
        π(Θ0 |x) = ──────────────────── = ─────────────────────────────
                    ∫ f(x|θ) π(θ) dθ       f(x|θ0) ρ0 + (1 − ρ0) m1(x)

and marginal under Ha

                       m1(x) = ∫_{Θ1} f(x|θ) g1(θ) dθ.




Point null hypotheses (cont’d)


Example (Normal mean)
Test of H0 : θ = 0 when x ∼ N(θ, σ²): we take π1 as N(0, τ²), then

                    ⎡     1 − ρ0  ⎛   σ²   ⎞½     ⎛     τ² x²      ⎞ ⎤⁻¹
     π(θ = 0|x) =   ⎢ 1 + ──────  ⎜────────⎟  exp ⎜────────────────⎟ ⎥
                    ⎣       ρ0    ⎝σ² + τ² ⎠      ⎝ 2σ² (σ² + τ²)  ⎠ ⎦

Influence of τ :
                               τ \ x     0        0.68     1.28     1.96
                                  1     0.586    0.557    0.484    0.351
                                 10     0.768    0.729    0.612    0.366
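As a quick numerical check, the τ = 1 row of the table can be reproduced from the formula (a sketch assuming σ = 1 and equal prior weights ρ0 = 1/2, which are not stated on the slide):

```python
import math

def post_prob_null(x, tau, sigma=1.0, rho0=0.5):
    """Posterior probability of H0: theta = 0 for x ~ N(theta, sigma^2),
    pi1 = N(0, tau^2), and prior mass rho0 on the null."""
    ratio = (1 - rho0) / rho0
    bf = math.sqrt(sigma**2 / (sigma**2 + tau**2)) * \
         math.exp(tau**2 * x**2 / (2 * sigma**2 * (sigma**2 + tau**2)))
    return 1.0 / (1.0 + ratio * bf)

print([round(post_prob_null(x, tau=1.0), 3) for x in (0.0, 0.68, 1.28, 1.96)])
# matches the tau = 1 row: [0.586, 0.557, 0.484, 0.351]
```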




A fundamental difficulty




Improper priors are not allowed in this setting
If
                    ∫_{Θ1} π1(dθ1) = ∞    or    ∫_{Θ2} π2(dθ2) = ∞

then either π1 or π2 cannot be coherently normalised, but the
normalisation matters in the Bayes factor




Jeffreys unaware of the problem??


Example of testing for a zero normal mean:
      If σ is the standard error and λ the
      true value, λ is 0 on q. We want a
      suitable form for its prior on q′. (...)
      Then we should take

              P(q dσ|H) ∝ dσ/σ
              P(q′ dσ dλ|H) ∝ f(λ/σ) dσ/σ · dλ/λ

      where f [is a true density] (ToP, V,
      §5.2).
Unavoidable fallacy of the “same” σ?!



Puzzling alternatives
When taking two normal samples x11 , . . . , x1n1 and x21 , . . . , x2n2 with
means λ1 and λ2 and the same variance σ, testing H0 : λ1 = λ2 gets
otherworldly:
        ...we are really considering four hypotheses, not two as in the test for
        agreement of a location parameter with zero; for neither may be disturbed,
        or either, or both may.

ToP then uses parameters (λ, σ) in all versions of the alternative
hypotheses, with

             π0(λ, σ) ∝ 1/σ
             π1(λ, σ, λ1) ∝ 1/[π{σ² + (λ1 − λ)²}]
             π2(λ, σ, λ2) ∝ 1/[π{σ² + (λ2 − λ)²}]
             π12(λ, σ, λ1, λ2) ∝ σ/[π²{σ² + (λ1 − λ)²}{σ² + (λ2 − λ)²}]



Puzzling alternatives

ToP misses the points that
   1    λ does not have the same meaning under q, under q1 (where λ = λ2) and
        under q2 (where λ = λ1)
   2    λ has no precise meaning under q12 [hyperparameter?]
            On q12 , since λ does not appear explicitly in the likelihood,
            we can integrate it out (V, §5.41).
   3    even σ has a varying meaning over hypotheses
   4    integrating over measures

                                              2    dσ dλ1 dλ2
              P(q12 dσ dλ1 dλ2 |H) ∝          ─ ──────────────────
                                              π 4σ² + (λ1 − λ2)²

        simply defines a new improper prior...


Addiction to models


One potential difficulty with Bayesian analysis is its ultimate dependence
on model specification

                                          π(θ|x) ∝ π(θ) f(x|θ)

While Bayesian analysis allows for model variability, pruning,
improvement, comparison, embedding, etc., there always is a basic
reliance [or at least conditioning] on the “truth” of an overall model. This
may sound paradoxical given the many tools offered by Bayesian analysis;
however, the method is blind once “out of the model”, in the sense that it
cannot assess the validity of a model without embedding this model inside
another model.



ABCµ multiple errors


                                                                [figure: © Ratmann et al., PNAS, 2009]
No proper goodness-of-fit test




        ‘There is not the slightest use in rejecting any hypothesis unless we can do it
        in favor of some definite alternative that better fits the facts.’ — E.T.
        Jaynes, Probability Theory

While the setting

                             H0 : M = M0           versus         Ha : M ≠ M0

is rather artificial, there is no satisfactory way of answering the question




An approximate goodness-of-fit test



Testing
                             H0 : M = Mθ           versus         Ha : M ≠ Mθ
rephrased as

        H0 : min_θ d(Fθ , U(0,1)) = 0    versus    Ha : min_θ d(Fθ , U(0,1)) > 0


                            [Verdinelli and Wasserman, 98; Rousseau and Robert, 01]




An approximate goodness-of-fit test (cont’d)

The same testing problem can also be rephrased as

             H0 : Fθ(x) ∼ U(0, 1)    versus
             Ha : Fθ(x) ∼ p0 U(0, 1) + (1 − p0) Σ_{i=1}^k ωi Be(αi εi , αi (1 − εi ))

with

             (αi , εi ) ∼ [1 − exp{−(αi − 2)² − (εi − 0.5)²}]
                          × exp[−1/(αi² εi (1 − εi )) − 0.2 αi²/2]

                            [Verdinelli and Wasserman, 98; Rousseau and Robert, 01]
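The uniformity rephrasing under H0 is easy to illustrate numerically: the probability integral transform Fθ(x) of data generated from the model is U(0,1). A minimal stdlib-only sketch (simulated N(0,1) data, Kolmogorov–Smirnov distance to the uniform cdf; names and thresholds are illustrative, not from the slides):

```python
import math, random

def phi(x):
    """Standard normal cdf, playing the role of F_theta."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

random.seed(42)
n = 2000
data = [random.gauss(0.0, 1.0) for _ in range(n)]

# Under H0 the probability integral transform F_theta(x) is U(0,1)
u = sorted(phi(x) for x in data)

# Kolmogorov-Smirnov distance between the empirical cdf of u and U(0,1)
D = max(max((i + 1) / n - ui, ui - i / n) for i, ui in enumerate(u))
print(D, D < 1.36 / math.sqrt(n))  # small D: no evidence against uniformity
```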



Robustness

Models only partly defined through moments

                               Eθ [hi (x)] = Hi (θ),     i = 1, . . .

i.e., no complete construction of the underlying model

Example (White noise in AR)
The relation
                                        xt = ρ xt−1 + σ εt

often makes no assumption on εt besides its first two moments...

How can we run Bayesian analysis in such settings? Should we?
                             [Lazar, 2005; Cornuet et al., 2011, in prep.]
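The partial specification can be illustrated with a toy simulation (a sketch, not from the slides): two AR(1) series whose innovations share only their first two moments are indistinguishable at the level of second-order properties, e.g. the stationary variance σ²/(1 − ρ²):

```python
import random, statistics

def ar1(n, rho, sigma, innovation):
    """Simulate x_t = rho * x_{t-1} + sigma * eps_t for a given innovation sampler."""
    x, xs = 0.0, []
    for _ in range(n):
        x = rho * x + sigma * innovation()
        xs.append(x)
    return xs

random.seed(1)
rho, sigma, n = 0.5, 1.0, 100_000

# Two innovation laws sharing only E[eps] = 0 and Var[eps] = 1
gauss = lambda: random.gauss(0.0, 1.0)
rademacher = lambda: random.choice((-1.0, 1.0))

variances = []
for eps in (gauss, rademacher):
    xs = ar1(n, rho, sigma, eps)
    variances.append(statistics.pvariance(xs))

# Both series share the stationary variance sigma^2 / (1 - rho^2) = 4/3
print([round(v, 2) for v in variances])
```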



[back to] Bayesian model choice



Having a high relative probability does not mean that a hypothesis is true or supported
by the data — A. Templeton, Mol. Ecol., 2009

The formal Bayesian approach puts probabilities over the entire
model/parameter space
This means:
        allocating probabilities pi to all models Mi
        defining priors πi (θi ) for each parameter space Θi
        picking the largest p(Mi |x) to determine the “best” model




Several types of problems




Concentrating on the selection perspective:
        how to integrate loss function/decision/consequences
        representation of parsimony/sparsity (Occam’s rule)
        how to fight overfitting for nested models




Several types of problems


Incoherent methods, such as ABC, Bayes factor, or any simulation approach that treats
all hypotheses as mutually exclusive, should never be used with logically overlapping
hypotheses. — A. Templeton, PNAS, 2010

Choice of prior structures
        adequate weights pi :
        if M1 = M2 ∪ M3 , should p(M1 ) = p(M2 ) + p(M3 ) ?
        prior distributions
               πi (·) defined for every i ∈ I
               πi (·) proper (Jeffreys)
               πi (·) coherent (?) for nested models
        prior modelling inflation



Compatibility principle




Difficulty of finding simultaneously priors on a collection of models Mi
(i ∈ I)
Easier to start from a single prior on a “big” model and to derive the
others from a coherence principle
                                                 [Dawid & Lauritzen, 2000]




Projection approach


For M2 a submodel of M1 , π2 can be derived as the distribution of θ2⊥(θ1)
when θ1 ∼ π1(θ1) and θ2⊥(θ1) is a projection of θ1 on M2 , e.g.

             d(f(·|θ1), f(·|θ2⊥(θ1))) = inf_{θ2 ∈ Θ2} d(f(·|θ1), f(·|θ2)),

where d is a divergence measure
                                                                [McCulloch & Rossi, 1992]
Or we can look instead at the posterior distribution of

                               d(f(·|θ1), f(·|θ2⊥(θ1)))

                                       [Goutis & Robert, 1998; Dupuis & Robert, 2001]
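A toy illustration of the projection idea (my example, not from the slides): projecting N(2, 3²) onto the unit-variance normal submodel {N(µ, 1)} under Kullback–Leibler divergence keeps the mean, since KL is minimized at µ = µ1:

```python
import math

def kl_normal(mu1, s1, mu2, s2):
    """Kullback-Leibler divergence KL( N(mu1, s1^2) || N(mu2, s2^2) )."""
    return math.log(s2 / s1) + (s1**2 + (mu1 - mu2)**2) / (2 * s2**2) - 0.5

# Project N(2.0, 3.0^2) onto the submodel {N(mu, 1)} by grid search on mu
mu1, s1 = 2.0, 3.0
grid = [i / 100 for i in range(-500, 501)]
mu_proj = min(grid, key=lambda mu: kl_normal(mu1, s1, mu, 1.0))
print(mu_proj)  # the KL projection keeps the mean: 2.0
```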



Kullback proximity



Alternative projection to the above

Definition (Compatible prior)
Given a prior π1 on a model M1 and a submodel M2 , a prior π2 on M2 is
compatible with π1 when it achieves the minimum Kullback divergence
between the corresponding marginals m1(x; π1) = ∫_{Θ1} f1(x|θ) π1(θ) dθ
and m2(x; π2) = ∫_{Θ2} f2(x|θ) π2(θ) dθ:

                                        ⌠      m1(x; π1)
                  π2 = arg min          ⎮ log  ─────────  m1(x; π1) dx
                            π2          ⌡      m2(x; π2)




Difficulties



     Further complicating dimensionality of test statistics is the fact that the models are
 often not nested, and one model may contain parameters that do not have analogues in
                    the other models and vice versa. — A. Templeton, Mol. Ecol., 2009


        Does not give a working principle when M2 is not a submodel of M1
                    [Pérez & Berger, 2000; Cano, Salmerón & Robert, 2006]

        Depends on the choice of π1
        Prohibits the use of improper priors
        Worse: useless in unconstrained settings...




A side remark: Zellner’s g




Use of Zellner’s g-prior in linear regression, i.e. a normal prior for β
conditional on σ²,

                        β|σ² ∼ N(β̃, g σ² (XᵀX)⁻¹)

and a Jeffreys prior for σ²,

                                               π(σ²) ∝ σ⁻²
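The shrinkage this prior induces on the conditional posterior mean, E[β|σ², y] = (g β̂ + β̃)/(g + 1), can be checked numerically on a small sketch (simulated data, β̃ = 0; the data-generating choices below are mine, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, g = 50, 3, 100.0
X = rng.normal(size=(n, p))
y = X @ np.array([5.0, 1.0, 3.0]) + rng.normal(size=n)

beta_tilde = np.zeros(p)                      # g-prior mean
XtX = X.T @ X
beta_hat = np.linalg.solve(XtX, X.T @ y)      # OLS estimate

# Posterior precision is XtX (1 + 1/g) / sigma^2; solving the normal
# equations gives the conditional posterior mean under the g-prior
A = XtX * (1 + 1 / g)
post_mean = np.linalg.solve(A, X.T @ y + XtX @ beta_tilde / g)

# ...which matches the closed-form shrinkage (g beta_hat + beta_tilde)/(g + 1)
print(np.allclose(post_mean, (g * beta_hat + beta_tilde) / (g + 1)))  # True
```

With β̃ = 0 the posterior mean is the OLS estimate shrunk by the factor g/(g + 1), which is why the choice of g matters so much below.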




Variable selection


For the hierarchical parameter γ, we use

                               π(γ) = Π_{i=1}^p τi^{γi} (1 − τi)^{1−γi},

where τi corresponds to the prior probability that variable i is present in
the model (with a priori independence between the presence/absence of
variables)
Typically (?), when no prior information is available, τ1 = . . . = τp = 1/2,
i.e. a uniform prior
                                π(γ) = 2⁻ᵖ
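The inclusion prior is easy to sketch in code (a toy illustration of the formula above):

```python
from math import prod

def gamma_prior(gamma, tau):
    """pi(gamma) = prod_i tau_i^gamma_i (1 - tau_i)^(1 - gamma_i)."""
    return prod(t if g else 1 - t for g, t in zip(gamma, tau))

p = 4
tau = [0.5] * p
# Under tau_i = 1/2 every configuration receives the same mass 2^-p
print(gamma_prior([1, 1, 0, 0], tau))  # 0.0625
```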




Influence of g


                     Taking β̃ = 0p+1 and g large does not work

Consider the 10-predictor full model

   y|β, σ² ∼ N( β0 + Σ_{i=1}^3 βi xi + Σ_{i=1}^3 βi+3 xi² + β7 x1 x2 + β8 x1 x3
                + β9 x2 x3 + β10 x1 x2 x3 ,  σ² In )

where the xi ’s are iid U(0, 10)
                                                 [Casella & Moreno, 2004]
True model: two predictors x1 and x2 , i.e. γ∗ = 110. . .0,
(β0 , β1 , β2 ) = (5, 1, 3), and σ² = 4.




Influence of g (cont’d)




            t1 (γ)           g = 10      g = 100           g = 103              g = 104   g = 106

            0,1,2           0.04062      0.35368           0.65858              0.85895   0.98222
            0,1,2,7         0.01326      0.06142           0.08395              0.04434   0.00524
            0,1,2,4         0.01299      0.05310           0.05805              0.02868   0.00336
            0,2,4           0.02927      0.03962           0.00409              0.00246   0.00254
            0,1,2,8         0.01240      0.03833           0.01100              0.00126   0.00126




Case for a noninformative hierarchical solution




Use the same compatible informative g-prior distribution with β̃ = 0p+1
and a hierarchical diffuse prior distribution on g, e.g.

                                           π(g) ∝ g⁻¹ I_{N∗}(g)

             [Liang et al., 2007; Marin & Robert, 2007; Celeux et al., ca. 2011]




Occam’s razor



Pluralitas non est ponenda sine necessitate

      Variation is random until the contrary
      is shown; and new parameters in laws,
      when they are suggested, must be
      tested one at a time, unless there is
      specific reason to the contrary.

                                       H. Jeffreys, ToP, 1939

No well-accepted implementation behind the principle...
besides the fact that the Bayes factor naturally penalises larger models




Nonlinear Manifolds in Computer Vision
 
Coordinate sampler: A non-reversible Gibbs-like sampler
Coordinate sampler: A non-reversible Gibbs-like samplerCoordinate sampler: A non-reversible Gibbs-like sampler
Coordinate sampler: A non-reversible Gibbs-like sampler
 
seminar at Princeton University
seminar at Princeton Universityseminar at Princeton University
seminar at Princeton University
 
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
 

Viewers also liked

Bayes250: Thomas Bayes Memorial Lecture at EMS 2013
Bayes250: Thomas Bayes Memorial Lecture at EMS 2013Bayes250: Thomas Bayes Memorial Lecture at EMS 2013
Bayes250: Thomas Bayes Memorial Lecture at EMS 2013Christian Robert
 
Statistics (1): estimation, Chapter 2: Empirical distribution and bootstrap
Statistics (1): estimation, Chapter 2: Empirical distribution and bootstrapStatistics (1): estimation, Chapter 2: Empirical distribution and bootstrap
Statistics (1): estimation, Chapter 2: Empirical distribution and bootstrapChristian Robert
 
Statistics (1): estimation Chapter 3: likelihood function and likelihood esti...
Statistics (1): estimation Chapter 3: likelihood function and likelihood esti...Statistics (1): estimation Chapter 3: likelihood function and likelihood esti...
Statistics (1): estimation Chapter 3: likelihood function and likelihood esti...Christian Robert
 
Chapter 4: Decision theory and Bayesian analysis
Chapter 4: Decision theory and Bayesian analysisChapter 4: Decision theory and Bayesian analysis
Chapter 4: Decision theory and Bayesian analysisChristian Robert
 
Reading Birnbaum's (1962) paper, by Li Chenlu
Reading Birnbaum's (1962) paper, by Li ChenluReading Birnbaum's (1962) paper, by Li Chenlu
Reading Birnbaum's (1962) paper, by Li ChenluChristian Robert
 

Viewers also liked (9)

Athens workshop on MCMC
Athens workshop on MCMCAthens workshop on MCMC
Athens workshop on MCMC
 
Bayes250: Thomas Bayes Memorial Lecture at EMS 2013
Bayes250: Thomas Bayes Memorial Lecture at EMS 2013Bayes250: Thomas Bayes Memorial Lecture at EMS 2013
Bayes250: Thomas Bayes Memorial Lecture at EMS 2013
 
Edinburgh, Bayes-250
Edinburgh, Bayes-250Edinburgh, Bayes-250
Edinburgh, Bayes-250
 
Statistics (1): estimation, Chapter 2: Empirical distribution and bootstrap
Statistics (1): estimation, Chapter 2: Empirical distribution and bootstrapStatistics (1): estimation, Chapter 2: Empirical distribution and bootstrap
Statistics (1): estimation, Chapter 2: Empirical distribution and bootstrap
 
Statistics (1): estimation Chapter 3: likelihood function and likelihood esti...
Statistics (1): estimation Chapter 3: likelihood function and likelihood esti...Statistics (1): estimation Chapter 3: likelihood function and likelihood esti...
Statistics (1): estimation Chapter 3: likelihood function and likelihood esti...
 
Big model, big data
Big model, big dataBig model, big data
Big model, big data
 
Chapter 4: Decision theory and Bayesian analysis
Chapter 4: Decision theory and Bayesian analysisChapter 4: Decision theory and Bayesian analysis
Chapter 4: Decision theory and Bayesian analysis
 
Reading Birnbaum's (1962) paper, by Li Chenlu
Reading Birnbaum's (1962) paper, by Li ChenluReading Birnbaum's (1962) paper, by Li Chenlu
Reading Birnbaum's (1962) paper, by Li Chenlu
 
Jsm09 talk
Jsm09 talkJsm09 talk
Jsm09 talk
 

Similar to JSM 2011 round table

An overview of Bayesian testing
An overview of Bayesian testingAn overview of Bayesian testing
An overview of Bayesian testingChristian Robert
 
Statistics symposium talk, Harvard University
Statistics symposium talk, Harvard UniversityStatistics symposium talk, Harvard University
Statistics symposium talk, Harvard UniversityChristian Robert
 
Approximating Bayes Factors
Approximating Bayes FactorsApproximating Bayes Factors
Approximating Bayes FactorsChristian Robert
 
Bayesian model choice (and some alternatives)
Bayesian model choice (and some alternatives)Bayesian model choice (and some alternatives)
Bayesian model choice (and some alternatives)Christian Robert
 
Can we estimate a constant?
Can we estimate a constant?Can we estimate a constant?
Can we estimate a constant?Christian Robert
 
A Geometric Note on a Type of Multiple Testing-07-24-2015
A Geometric Note on a Type of Multiple Testing-07-24-2015A Geometric Note on a Type of Multiple Testing-07-24-2015
A Geometric Note on a Type of Multiple Testing-07-24-2015Junfeng Liu
 
Tutorial on testing at O'Bayes 2015, Valencià, June 1, 2015
Tutorial on testing at O'Bayes 2015, Valencià, June 1, 2015Tutorial on testing at O'Bayes 2015, Valencià, June 1, 2015
Tutorial on testing at O'Bayes 2015, Valencià, June 1, 2015Christian Robert
 
Yes III: Computational methods for model choice
Yes III: Computational methods for model choiceYes III: Computational methods for model choice
Yes III: Computational methods for model choiceChristian Robert
 
Bayesian statistics using r intro
Bayesian statistics using r   introBayesian statistics using r   intro
Bayesian statistics using r introBayesLaplace1
 
Diaconis Ylvisaker 1985
Diaconis Ylvisaker 1985Diaconis Ylvisaker 1985
Diaconis Ylvisaker 1985Julyan Arbel
 
RSS discussion of Girolami and Calderhead, October 13, 2010
RSS discussion of Girolami and Calderhead, October 13, 2010RSS discussion of Girolami and Calderhead, October 13, 2010
RSS discussion of Girolami and Calderhead, October 13, 2010Christian Robert
 
Probability/Statistics Lecture Notes 4: Hypothesis Testing
Probability/Statistics Lecture Notes 4: Hypothesis TestingProbability/Statistics Lecture Notes 4: Hypothesis Testing
Probability/Statistics Lecture Notes 4: Hypothesis Testingjemille6
 
Chapter 3 projection
Chapter 3 projectionChapter 3 projection
Chapter 3 projectionNBER
 
Statistics (1): estimation, Chapter 1: Models
Statistics (1): estimation, Chapter 1: ModelsStatistics (1): estimation, Chapter 1: Models
Statistics (1): estimation, Chapter 1: ModelsChristian Robert
 
Reading Testing a point-null hypothesis, by Jiahuan Li, Feb. 25, 2013
Reading Testing a point-null hypothesis, by Jiahuan Li, Feb. 25, 2013Reading Testing a point-null hypothesis, by Jiahuan Li, Feb. 25, 2013
Reading Testing a point-null hypothesis, by Jiahuan Li, Feb. 25, 2013Christian Robert
 
On Generalized Classical Fréchet Derivatives in the Real Banach Space
On Generalized Classical Fréchet Derivatives in the Real Banach SpaceOn Generalized Classical Fréchet Derivatives in the Real Banach Space
On Generalized Classical Fréchet Derivatives in the Real Banach SpaceBRNSS Publication Hub
 

Similar to JSM 2011 round table (20)

An overview of Bayesian testing
An overview of Bayesian testingAn overview of Bayesian testing
An overview of Bayesian testing
 
ISBA 2016: Foundations
ISBA 2016: FoundationsISBA 2016: Foundations
ISBA 2016: Foundations
 
Statistics symposium talk, Harvard University
Statistics symposium talk, Harvard UniversityStatistics symposium talk, Harvard University
Statistics symposium talk, Harvard University
 
Approximating Bayes Factors
Approximating Bayes FactorsApproximating Bayes Factors
Approximating Bayes Factors
 
Bayesian model choice (and some alternatives)
Bayesian model choice (and some alternatives)Bayesian model choice (and some alternatives)
Bayesian model choice (and some alternatives)
 
Can we estimate a constant?
Can we estimate a constant?Can we estimate a constant?
Can we estimate a constant?
 
A Geometric Note on a Type of Multiple Testing-07-24-2015
A Geometric Note on a Type of Multiple Testing-07-24-2015A Geometric Note on a Type of Multiple Testing-07-24-2015
A Geometric Note on a Type of Multiple Testing-07-24-2015
 
Tutorial on testing at O'Bayes 2015, Valencià, June 1, 2015
Tutorial on testing at O'Bayes 2015, Valencià, June 1, 2015Tutorial on testing at O'Bayes 2015, Valencià, June 1, 2015
Tutorial on testing at O'Bayes 2015, Valencià, June 1, 2015
 
Yes III: Computational methods for model choice
Yes III: Computational methods for model choiceYes III: Computational methods for model choice
Yes III: Computational methods for model choice
 
Bayesian statistics using r intro
Bayesian statistics using r   introBayesian statistics using r   intro
Bayesian statistics using r intro
 
Diaconis Ylvisaker 1985
Diaconis Ylvisaker 1985Diaconis Ylvisaker 1985
Diaconis Ylvisaker 1985
 
RSS discussion of Girolami and Calderhead, October 13, 2010
RSS discussion of Girolami and Calderhead, October 13, 2010RSS discussion of Girolami and Calderhead, October 13, 2010
RSS discussion of Girolami and Calderhead, October 13, 2010
 
Probability/Statistics Lecture Notes 4: Hypothesis Testing
Probability/Statistics Lecture Notes 4: Hypothesis TestingProbability/Statistics Lecture Notes 4: Hypothesis Testing
Probability/Statistics Lecture Notes 4: Hypothesis Testing
 
Chapter 3 projection
Chapter 3 projectionChapter 3 projection
Chapter 3 projection
 
Statistics (1): estimation, Chapter 1: Models
Statistics (1): estimation, Chapter 1: ModelsStatistics (1): estimation, Chapter 1: Models
Statistics (1): estimation, Chapter 1: Models
 
Proba stats-r1-2017
Proba stats-r1-2017Proba stats-r1-2017
Proba stats-r1-2017
 
QMC: Operator Splitting Workshop, Compactness Estimates for Nonlinear PDEs - ...
QMC: Operator Splitting Workshop, Compactness Estimates for Nonlinear PDEs - ...QMC: Operator Splitting Workshop, Compactness Estimates for Nonlinear PDEs - ...
QMC: Operator Splitting Workshop, Compactness Estimates for Nonlinear PDEs - ...
 
01_AJMS_277_20_20210128_V1.pdf
01_AJMS_277_20_20210128_V1.pdf01_AJMS_277_20_20210128_V1.pdf
01_AJMS_277_20_20210128_V1.pdf
 
Reading Testing a point-null hypothesis, by Jiahuan Li, Feb. 25, 2013
Reading Testing a point-null hypothesis, by Jiahuan Li, Feb. 25, 2013Reading Testing a point-null hypothesis, by Jiahuan Li, Feb. 25, 2013
Reading Testing a point-null hypothesis, by Jiahuan Li, Feb. 25, 2013
 
On Generalized Classical Fréchet Derivatives in the Real Banach Space
On Generalized Classical Fréchet Derivatives in the Real Banach SpaceOn Generalized Classical Fréchet Derivatives in the Real Banach Space
On Generalized Classical Fréchet Derivatives in the Real Banach Space
 

More from Christian Robert

Asymptotics of ABC, lecture, Collège de France
Asymptotics of ABC, lecture, Collège de FranceAsymptotics of ABC, lecture, Collège de France
Asymptotics of ABC, lecture, Collège de FranceChristian Robert
 
Workshop in honour of Don Poskitt and Gael Martin
Workshop in honour of Don Poskitt and Gael MartinWorkshop in honour of Don Poskitt and Gael Martin
Workshop in honour of Don Poskitt and Gael MartinChristian Robert
 
How many components in a mixture?
How many components in a mixture?How many components in a mixture?
How many components in a mixture?Christian Robert
 
Testing for mixtures at BNP 13
Testing for mixtures at BNP 13Testing for mixtures at BNP 13
Testing for mixtures at BNP 13Christian Robert
 
Inferring the number of components: dream or reality?
Inferring the number of components: dream or reality?Inferring the number of components: dream or reality?
Inferring the number of components: dream or reality?Christian Robert
 
Testing for mixtures by seeking components
Testing for mixtures by seeking componentsTesting for mixtures by seeking components
Testing for mixtures by seeking componentsChristian Robert
 
discussion on Bayesian restricted likelihood
discussion on Bayesian restricted likelihooddiscussion on Bayesian restricted likelihood
discussion on Bayesian restricted likelihoodChristian Robert
 
NCE, GANs & VAEs (and maybe BAC)
NCE, GANs & VAEs (and maybe BAC)NCE, GANs & VAEs (and maybe BAC)
NCE, GANs & VAEs (and maybe BAC)Christian Robert
 
Coordinate sampler : A non-reversible Gibbs-like sampler
Coordinate sampler : A non-reversible Gibbs-like samplerCoordinate sampler : A non-reversible Gibbs-like sampler
Coordinate sampler : A non-reversible Gibbs-like samplerChristian Robert
 
Laplace's Demon: seminar #1
Laplace's Demon: seminar #1Laplace's Demon: seminar #1
Laplace's Demon: seminar #1Christian Robert
 
Likelihood-free Design: a discussion
Likelihood-free Design: a discussionLikelihood-free Design: a discussion
Likelihood-free Design: a discussionChristian Robert
 

More from Christian Robert (20)

Asymptotics of ABC, lecture, Collège de France
Asymptotics of ABC, lecture, Collège de FranceAsymptotics of ABC, lecture, Collège de France
Asymptotics of ABC, lecture, Collège de France
 
Workshop in honour of Don Poskitt and Gael Martin
Workshop in honour of Don Poskitt and Gael MartinWorkshop in honour of Don Poskitt and Gael Martin
Workshop in honour of Don Poskitt and Gael Martin
 
discussion of ICML23.pdf
discussion of ICML23.pdfdiscussion of ICML23.pdf
discussion of ICML23.pdf
 
How many components in a mixture?
How many components in a mixture?How many components in a mixture?
How many components in a mixture?
 
restore.pdf
restore.pdfrestore.pdf
restore.pdf
 
Testing for mixtures at BNP 13
Testing for mixtures at BNP 13Testing for mixtures at BNP 13
Testing for mixtures at BNP 13
 
Inferring the number of components: dream or reality?
Inferring the number of components: dream or reality?Inferring the number of components: dream or reality?
Inferring the number of components: dream or reality?
 
CDT 22 slides.pdf
CDT 22 slides.pdfCDT 22 slides.pdf
CDT 22 slides.pdf
 
Testing for mixtures by seeking components
Testing for mixtures by seeking componentsTesting for mixtures by seeking components
Testing for mixtures by seeking components
 
discussion on Bayesian restricted likelihood
discussion on Bayesian restricted likelihooddiscussion on Bayesian restricted likelihood
discussion on Bayesian restricted likelihood
 
NCE, GANs & VAEs (and maybe BAC)
NCE, GANs & VAEs (and maybe BAC)NCE, GANs & VAEs (and maybe BAC)
NCE, GANs & VAEs (and maybe BAC)
 
ABC-Gibbs
ABC-GibbsABC-Gibbs
ABC-Gibbs
 
Coordinate sampler : A non-reversible Gibbs-like sampler
Coordinate sampler : A non-reversible Gibbs-like samplerCoordinate sampler : A non-reversible Gibbs-like sampler
Coordinate sampler : A non-reversible Gibbs-like sampler
 
eugenics and statistics
eugenics and statisticseugenics and statistics
eugenics and statistics
 
Laplace's Demon: seminar #1
Laplace's Demon: seminar #1Laplace's Demon: seminar #1
Laplace's Demon: seminar #1
 
ABC-Gibbs
ABC-GibbsABC-Gibbs
ABC-Gibbs
 
asymptotics of ABC
asymptotics of ABCasymptotics of ABC
asymptotics of ABC
 
ABC-Gibbs
ABC-GibbsABC-Gibbs
ABC-Gibbs
 
Likelihood-free Design: a discussion
Likelihood-free Design: a discussionLikelihood-free Design: a discussion
Likelihood-free Design: a discussion
 
the ABC of ABC
the ABC of ABCthe ABC of ABC
the ABC of ABC
 

Recently uploaded

An Overview of Mutual Funds Bcom Project.pdf
An Overview of Mutual Funds Bcom Project.pdfAn Overview of Mutual Funds Bcom Project.pdf
An Overview of Mutual Funds Bcom Project.pdfSanaAli374401
 
Making and Justifying Mathematical Decisions.pdf
Making and Justifying Mathematical Decisions.pdfMaking and Justifying Mathematical Decisions.pdf
Making and Justifying Mathematical Decisions.pdfChris Hunter
 
Gardella_PRCampaignConclusion Pitch Letter
Gardella_PRCampaignConclusion Pitch LetterGardella_PRCampaignConclusion Pitch Letter
Gardella_PRCampaignConclusion Pitch LetterMateoGardella
 
Beyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global ImpactBeyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global ImpactPECB
 
Mixin Classes in Odoo 17 How to Extend Models Using Mixin Classes
Mixin Classes in Odoo 17  How to Extend Models Using Mixin ClassesMixin Classes in Odoo 17  How to Extend Models Using Mixin Classes
Mixin Classes in Odoo 17 How to Extend Models Using Mixin ClassesCeline George
 
Paris 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityParis 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityGeoBlogs
 
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in DelhiRussian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhikauryashika82
 
Z Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot GraphZ Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot GraphThiyagu K
 
Introduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsIntroduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsTechSoup
 
Class 11th Physics NEET formula sheet pdf
Class 11th Physics NEET formula sheet pdfClass 11th Physics NEET formula sheet pdf
Class 11th Physics NEET formula sheet pdfAyushMahapatra5
 
Key note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfKey note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfAdmir Softic
 
Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfciinovamais
 
Measures of Dispersion and Variability: Range, QD, AD and SD
Measures of Dispersion and Variability: Range, QD, AD and SDMeasures of Dispersion and Variability: Range, QD, AD and SD
Measures of Dispersion and Variability: Range, QD, AD and SDThiyagu K
 
Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17Celine George
 
PROCESS RECORDING FORMAT.docx
PROCESS      RECORDING        FORMAT.docxPROCESS      RECORDING        FORMAT.docx
PROCESS RECORDING FORMAT.docxPoojaSen20
 
Web & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdfWeb & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdfJayanti Pande
 
Unit-V; Pricing (Pharma Marketing Management).pptx
Unit-V; Pricing (Pharma Marketing Management).pptxUnit-V; Pricing (Pharma Marketing Management).pptx
Unit-V; Pricing (Pharma Marketing Management).pptxVishalSingh1417
 
Accessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactAccessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactdawncurless
 

Recently uploaded (20)

An Overview of Mutual Funds Bcom Project.pdf
An Overview of Mutual Funds Bcom Project.pdfAn Overview of Mutual Funds Bcom Project.pdf
An Overview of Mutual Funds Bcom Project.pdf
 
Advance Mobile Application Development class 07
Advance Mobile Application Development class 07Advance Mobile Application Development class 07
Advance Mobile Application Development class 07
 
Making and Justifying Mathematical Decisions.pdf
Making and Justifying Mathematical Decisions.pdfMaking and Justifying Mathematical Decisions.pdf
Making and Justifying Mathematical Decisions.pdf
 
Gardella_PRCampaignConclusion Pitch Letter
Gardella_PRCampaignConclusion Pitch LetterGardella_PRCampaignConclusion Pitch Letter
Gardella_PRCampaignConclusion Pitch Letter
 
Beyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global ImpactBeyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global Impact
 
Mixin Classes in Odoo 17 How to Extend Models Using Mixin Classes
Mixin Classes in Odoo 17  How to Extend Models Using Mixin ClassesMixin Classes in Odoo 17  How to Extend Models Using Mixin Classes
Mixin Classes in Odoo 17 How to Extend Models Using Mixin Classes
 
Paris 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityParis 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activity
 
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in DelhiRussian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
 
Z Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot GraphZ Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot Graph
 
Introduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsIntroduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The Basics
 
Mehran University Newsletter Vol-X, Issue-I, 2024
Mehran University Newsletter Vol-X, Issue-I, 2024Mehran University Newsletter Vol-X, Issue-I, 2024
Mehran University Newsletter Vol-X, Issue-I, 2024
 
Class 11th Physics NEET formula sheet pdf
Class 11th Physics NEET formula sheet pdfClass 11th Physics NEET formula sheet pdf
Class 11th Physics NEET formula sheet pdf
 
Key note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfKey note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdf
 
Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdf
 
Measures of Dispersion and Variability: Range, QD, AD and SD
Measures of Dispersion and Variability: Range, QD, AD and SDMeasures of Dispersion and Variability: Range, QD, AD and SD
Measures of Dispersion and Variability: Range, QD, AD and SD
 
Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17
 
PROCESS RECORDING FORMAT.docx
PROCESS      RECORDING        FORMAT.docxPROCESS      RECORDING        FORMAT.docx
PROCESS RECORDING FORMAT.docx
 
Web & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdfWeb & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdf
 
Unit-V; Pricing (Pharma Marketing Management).pptx
Unit-V; Pricing (Pharma Marketing Management).pptxUnit-V; Pricing (Pharma Marketing Management).pptx
Unit-V; Pricing (Pharma Marketing Management).pptx
 
Accessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactAccessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impact
 

JSM 2011 round table

  • 1. Uncertainties within some Bayesian concepts: Examples from classnotes. Christian P. Robert, Université Paris-Dauphine, IUF, and CREST-INSEE, http://www.ceremade.dauphine.fr/~xian. July 31, 2011.
  • 2. Outline. "Anyone not shocked by the Bayesian theory of inference has not understood it." (S. Senn, Bayesian Analysis, 2008) 1. Testing; 2. Fully specified models?; 3. Model choice.
  • 3. Add: Call for vignettes. Kerrie Mengersen and I are collecting proposals towards a collection of vignettes on the theme "When is Bayesian analysis really successful?", celebrating notable achievements of Bayesian analysis. [deadline: September 30]
  • 4. Bayes factors. "The Jeffreys-subjective synthesis betrays a much more dangerous confusion than the Neyman-Pearson-Fisher synthesis as regards hypothesis tests." (S. Senn, BA, 2008) Definition (Bayes factors): when testing H0: θ ∈ Θ0 vs. Ha: θ ∉ Θ0, use B01 = [π(Θ0|x)/π(Θ0c|x)] / [π(Θ0)/π(Θ0c)] = ∫_{Θ0} f(x|θ)π0(θ) dθ / ∫_{Θ0c} f(x|θ)π1(θ) dθ. [Good, 1958 & Jeffreys, 1939]
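The ratio-of-integrals structure of the Bayes factor can be evaluated numerically. A minimal sketch, not from the slides: a Binomial(n, p) likelihood with H0: p ∈ [0, 1/2] against Ha: p ∈ (1/2, 1], and uniform priors renormalised on each half standing in for (π0, π1).

```python
# Illustrative Bayes factor B01 for H0: p in [0, 1/2] vs Ha: p in (1/2, 1],
# with a Binomial(n, p) likelihood and uniform priors renormalised on each
# half (pi0 = 2 on [0, 1/2], pi1 = 2 on (1/2, 1]); the two normalising
# constants cancel in the ratio. Midpoint-rule integration, stdlib only.
def bayes_factor_01(k, n, grid=20_000):
    def integral(lo, hi):
        h = (hi - lo) / grid
        return sum((lo + (i + 0.5) * h) ** k
                   * (1 - (lo + (i + 0.5) * h)) ** (n - k)
                   for i in range(grid)) * h
    return integral(0.0, 0.5) / integral(0.5, 1.0)

print(bayes_factor_01(7, 10))  # below 1: the data lean towards Ha
print(bayes_factor_01(3, 10))  # above 1: the data lean towards H0
```

With k = n/2 the likelihood is symmetric about 1/2 and the factor is 1, as expected.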
  • 8. Self-contained concept. Derived from the 0-1 loss and Bayes rule: acceptance if B01 > {(1 − π(Θ0))/a1}/{π(Θ0)/a0}, but used outside the decision-theoretic environment; this eliminates the choice of π(Θ0), but the factor still depends on the choice of (π0, π1). Jeffreys' [arbitrary] scale of evidence: if log10(B10) is between 0 and 0.5, the evidence against H0 is weak; if between 0.5 and 1, substantial; if between 1 and 2, strong; and if above 2, decisive. Convergent if used with proper statistics.
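Jeffreys' scale translates directly into a lookup. A short sketch; the assignment of boundary values to the weaker category, and the label for negative values, are my own conventions since the slide leaves the half-open intervals implicit.

```python
# Jeffreys' scale of evidence from the slide, keyed on log10(B10).
# Boundary values go to the weaker category (a convention, not Jeffreys').
def jeffreys_scale(log10_b10):
    if log10_b10 <= 0:
        return "supports H0"          # B10 <= 1 (labeling is an assumption)
    if log10_b10 <= 0.5:
        return "weak evidence against H0"
    if log10_b10 <= 1:
        return "substantial evidence against H0"
    if log10_b10 <= 2:
        return "strong evidence against H0"
    return "decisive evidence against H0"

print(jeffreys_scale(0.3))
print(jeffreys_scale(2.5))
```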
  • 10. Difficulties with ABC-Bayes factors. "This is also why focus on model discrimination typically (...) proceeds by (...) accepting that the Bayes Factor that one obtains is only derived from the summary statistics and may in no way correspond to that of the full model." (S. Sisson, Jan. 31, 2011, X.'Og) In the Poisson versus geometric case, if E[yi] = θ0 > 0, lim_{n→∞} B12^η(y) = {(θ0 + 1)²/θ0} e^{−θ0}.
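The limiting expression above can be tabulated to see the difficulty: the summary-based Bayes factor converges to a fixed, finite constant rather than to 0 or ∞, so it cannot consistently select the true model. A quick numerical check of the slide's formula:

```python
import math

# Limit of the summary-statistic-based Bayes factor B12^eta(y) from the
# slide, for Poisson vs. geometric data with E[y_i] = theta0 > 0.
def limit_b12(theta0):
    return (theta0 + 1) ** 2 / theta0 * math.exp(-theta0)

for theta0 in (0.5, 1.0, 2.0, 5.0):
    print(theta0, limit_b12(theta0))  # bounded away from 0 and infinity
```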
  • 11. Difficulties with ABC-Bayes factors. Laplace vs. Normal models: comparing a sample x1, ..., xn from the Laplace (double-exponential) L(µ, 1/√2) distribution, with density f(x|µ) = (1/√2) exp{−√2 |x − µ|}, or from the Normal N(µ, 1) distribution.
  • 12. Difficulties with ABC-Bayes factors. Empirical mean, median and variance have the same mean under both models: useless!
  • 13. Difficulties with ABC-Bayes factors. Median absolute deviation: priceless!
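A quick Monte Carlo check of why the MAD works here (an illustration, not from the slides): both models share the same mean and unit variance, but the theoretical MAD is Φ⁻¹(3/4) ≈ 0.674 under N(0, 1) and (ln 2)/√2 ≈ 0.490 under L(0, 1/√2), so the statistic separates them.

```python
import math
import random

def mad(xs):
    # median absolute deviation (upper median for even n; fine here)
    xs = sorted(xs)
    med = xs[len(xs) // 2]
    dev = sorted(abs(x - med) for x in xs)
    return dev[len(dev) // 2]

random.seed(42)
n = 100_000
normal = [random.gauss(0.0, 1.0) for _ in range(n)]
b = 1 / math.sqrt(2)  # Laplace scale: variance 2*b**2 = 1, matching N(0, 1)
laplace = [random.choice((-1, 1)) * random.expovariate(1 / b) for _ in range(n)]

print(mad(normal))   # close to 0.674
print(mad(laplace))  # close to 0.490
```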
  • 15. Point null hypotheses. "I have no patience for statistical methods that assign positive probability to point hypotheses of the θ = 0 type that can never actually be true." (A. Gelman, BA, 2008) Particular case H0: θ = θ0. Take ρ0 = Pr^π(θ = θ0) and π1 the prior density under Ha. Posterior probability of H0: π(Θ0|x) = f(x|θ0)ρ0 / ∫ f(x|θ)π(θ) dθ = f(x|θ0)ρ0 / {f(x|θ0)ρ0 + (1 − ρ0)m1(x)}, with marginal under Ha: m1(x) = ∫_{Θ1} f(x|θ)π1(θ) dθ.
  • 16. Point null hypotheses (cont'd). Example (Normal mean): test of H0: θ = 0 when x ∼ N(θ, σ² = 1), taking π1 as N(0, τ²); then
        π(θ = 0|x) = [1 + {(1 − ρ0)/ρ0} √{σ²/(σ² + τ²)} exp{τ²x²/(2σ²(σ² + τ²))}]^{−1}.
    Influence of τ (with ρ0 = 1/2 and σ² = 1; the tabulated values correspond to τ² = 1 and τ² = 10):
        x         0      0.68   1.28   1.96
        τ² = 1    0.586  0.557  0.484  0.351
        τ² = 10   0.768  0.729  0.612  0.366
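The formula and the table can be checked directly (assuming, as the tabulated numbers indicate, ρ0 = 1/2, σ² = 1, and rows τ² ∈ {1, 10}):

```python
import math

# Posterior probability of H0: theta = 0 for x ~ N(theta, sigma2), with
# prior mass rho0 at 0 and a N(0, tau2) prior otherwise (slide formula).
def post_prob_null(x, tau2, sigma2=1.0, rho0=0.5):
    odds = (1 - rho0) / rho0
    scale = math.sqrt(sigma2 / (sigma2 + tau2))
    bump = math.exp(tau2 * x * x / (2 * sigma2 * (sigma2 + tau2)))
    return 1.0 / (1.0 + odds * scale * bump)

for tau2 in (1.0, 10.0):
    print([round(post_prob_null(x, tau2), 3) for x in (0.0, 0.68, 1.28, 1.96)])
```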
  • 18. A fundamental difficulty. Improper priors are not allowed in this setting: if ∫_{Θ1} π1(dθ1) = ∞ or ∫_{Θ2} π2(dθ2) = ∞, then either π1 or π2 cannot be coherently normalised, but the normalisation matters in the Bayes factor.
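Why the normalisation matters can be seen with a toy pseudo-normalisation (my illustration, not from the slides): replace the improper flat prior under Ha by a uniform on [−T, T]. The resulting "Bayes factor" then scales with the arbitrary truncation bound T, so the answer is determined by a constant the improper prior never fixed.

```python
import math

def phi(z):
    # standard normal density
    return math.exp(-z * z / 2) / math.sqrt(2 * math.pi)

def pseudo_b01(x, big_t, grid=100_000):
    # H0: theta = 0 vs Ha: theta ~ Uniform(-T, T), single x ~ N(theta, 1);
    # m1(x) = (1/2T) * integral_{-T}^{T} phi(x - theta) dtheta (midpoint rule)
    h = 2 * big_t / grid
    total = sum(phi(x - (-big_t + (i + 0.5) * h)) for i in range(grid)) * h
    return phi(x) / (total / (2 * big_t))

print(pseudo_b01(1.0, 10.0))   # about 2*T*phi(1) once [-T, T] covers the likelihood
print(pseudo_b01(1.0, 100.0))  # roughly ten times larger: the answer tracks T
```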
  • 20. Jeffreys unaware of the problem?? Example of testing for a zero normal mean: "If σ is the standard error and λ the true value, λ is 0 on q. We want a suitable form for its prior on q'. (...) Then we should take P(q dσ|H) ∝ dσ/σ and P(q' dσ dλ|H) ∝ f(λ/σ) dσ/σ dλ/σ, where f [is a true density]" (ToP, V, §5.2). Unavoidable fallacy of the "same" σ?!
Puzzling alternatives

When taking two normal samples x11, . . . , x1n1 and x21, . . . , x2n2 with means λ1 and λ2 and same variance σ, testing for H0 : λ1 = λ2 gets otherworldly:

...we are really considering four hypotheses, not two as in the test for agreement of a location parameter with zero; for neither may be disturbed, or either, or both may.

ToP then uses parameters (λ, σ) in all versions of the alternative hypotheses, with

π0(λ, σ) ∝ 1/σ
π1(λ, σ, λ1) ∝ 1/[π{σ² + (λ1 − λ)²}]
π2(λ, σ, λ2) ∝ 1/[π{σ² + (λ2 − λ)²}]
π12(λ, σ, λ1, λ2) ∝ σ/[π²{σ² + (λ1 − λ)²}{σ² + (λ2 − λ)²}]
Puzzling alternatives

ToP misses the points that
1. λ does not have the same meaning under q, under q1 (where λ = λ2) and under q2 (where λ = λ1)
2. λ has no precise meaning under q12 [hyperparameter?]: On q12, since λ does not appear explicitly in the likelihood we can integrate it (V, §5.41).
3. even σ has a varying meaning over hypotheses
4. integrating over measures,

P(q12 dσ dλ1 dλ2|H) ∝ (2/π) dσ dλ1 dλ2 / {4σ² + (λ1 − λ2)²}

simply defines a new improper prior...
Addiction to models

One potential difficulty with Bayesian analysis is its ultimate dependence on model specification:

π(θ|x) ∝ π(θ) f(x|θ)

While Bayesian analysis allows for model variability, pruning, improvement, comparison, embedding, &tc., there always is a basic reliance [or at least conditioning] on the "truth" of an overall model.

This may sound paradoxical because of the many tools offered by Bayesian analysis; however, the method is blind once "out of the model", in the sense that it cannot assess the validity of a model without embedding it inside another model.
ABCµ: multiple errors

[figure © Ratmann et al., PNAS, 2009]
No proper goodness-of-fit test

"There is not the slightest use in rejecting any hypothesis unless we can do it in favor of some definite alternative that better fits the facts." — E.T. Jaynes, Probability Theory

While the setting H0 : M = M0 versus Ha : M ≠ M0 is rather artificial, there is no satisfactory way of answering the question.
An approximate goodness-of-fit test

Testing H0 : M = Mθ versus Ha : M ≠ Mθ rephrased as

H0 : min_θ d(Fθ, U(0,1)) = 0 versus Ha : min_θ d(Fθ, U(0,1)) > 0

[Verdinelli and Wasserman, 1998; Rousseau and Robert, 2001]
An approximate goodness-of-fit test

Testing H0 : M = Mθ versus Ha : M ≠ Mθ rephrased as

H0 : Fθ(x) ∼ U(0,1) versus
Ha : Fθ(x) ∼ p0 U(0,1) + (1 − p0) Σ_{i=1}^k ωi Be(αi εi, αi(1 − εi))

with

(αi, εi) ∼ [1 − exp{−(αi − 2)² − (εi − 0.5)²}] × exp[−1/(αi² εi(1 − εi)) − 0.2 αi²/2]

[Verdinelli and Wasserman, 1998; Rousseau and Robert, 2001]
Robustness

Models only partly defined through moments

Eθ[hi(x)] = Hi(θ),  i = 1, . . .

i.e., no complete construction of the underlying model.

Example (White noise in AR)
The relation xt = ρ x_{t−1} + σ εt often makes no assumption on εt besides its first two moments... How can we run Bayesian analysis in such settings? Should we?
[Lazar, 2005; Cornuet et al., 2011, in prep.]
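To make the moment-only setting concrete, here is a minimal frequentist-style sketch (a plain method-of-moments estimate, not the empirical-likelihood approaches of Lazar or Cornuet et al.; the uniform innovation law and sample size are arbitrary choices): ρ̂ = γ̂(1)/γ̂(0) is consistent whatever the distribution of εt beyond its first two moments.

```python
import numpy as np

rng = np.random.default_rng(0)
n, rho, sigma = 20_000, 0.5, 1.0

# innovations deliberately non-Gaussian: only mean 0 and variance 1 are fixed
eps = rng.uniform(-np.sqrt(3), np.sqrt(3), n)

x = np.empty(n)
x[0] = 0.0
for t in range(1, n):
    x[t] = rho * x[t - 1] + sigma * eps[t]

# method-of-moments estimate: rho = gamma(1) / gamma(0)
x0 = x - x.mean()
rho_hat = (x0[1:] * x0[:-1]).mean() / (x0 * x0).mean()
print(rho_hat)   # close to 0.5
```

A full Bayesian treatment would instead have to place a prior on the unspecified parts of the model, which is exactly the difficulty raised on this slide.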
[back to] Bayesian model choice

Having a high relative probability does not mean that a hypothesis is true or supported by the data — A. Templeton, Mol. Ecol., 2009

The formal Bayesian approach puts probabilities over the entire model/parameter space. This means:
allocating probabilities pi to all models Mi
defining priors πi(θi) for each parameter space Θi
picking the largest p(Mi|x) to determine the "best" model
Several types of problems

Concentrate on the selection perspective:
how to integrate loss function/decision/consequences
representation of parsimony/sparsity (Occam's rule)
how to fight overfitting for nested models
Several types of problems

Incoherent methods, such as ABC, Bayes factor, or any simulation approach that treats all hypotheses as mutually exclusive, should never be used with logically overlapping hypotheses. — A. Templeton, PNAS, 2010

Choice of prior structures:
adequate weights pi: if M1 = M2 ∪ M3, p(M1) = p(M2) + p(M3)?
prior distributions πi(·) defined for every i ∈ I
πi(·) proper (Jeffreys)
πi(·) coherent (?) for nested models
prior modelling inflation
Compatibility principle

Difficulty of finding priors simultaneously on a collection of models Mi (i ∈ I). Easier to start from a single prior on a "big" model and to derive the others from a coherence principle.
[Dawid & Lauritzen, 2000]
Projection approach

For M2 a submodel of M1, π2 can be derived as the distribution of θ2⊥(θ1) when θ1 ∼ π1(θ1) and θ2⊥(θ1) is a projection of θ1 on M2, e.g.

d(f(·|θ1), f(·|θ1⊥)) = inf_{θ2 ∈ Θ2} d(f(·|θ1), f(·|θ2)),

where d is a divergence measure.
[McCulloch & Rossi, 1992]

Or we can look instead at the posterior distribution of d(f(·|θ1), f(·|θ1⊥)).
[Goutis & Robert, 1998; Dupuis & Robert, 2001]
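A toy numerical sketch of the projection idea, under an assumed Kullback–Leibler choice for d: project a point of M1 = {N(μ1, σ1²)} onto the submodel M2 = {N(θ2, 1)} with the scale fixed (all numerical values here are illustrative). The closed-form KL between normals makes the minimisation a one-line grid search.

```python
import numpy as np

def kl_normal(m1, s1, m2, s2):
    """KL divergence KL( N(m1, s1^2) || N(m2, s2^2) )."""
    return np.log(s2 / s1) + (s1**2 + (m1 - m2) ** 2) / (2 * s2**2) - 0.5

# M1 point: N(mu1, s1^2); submodel M2 fixes the scale at 1, keeps the mean free
mu1, s1 = 1.7, 2.0
grid = np.linspace(-5, 5, 10_001)
proj = grid[np.argmin(kl_normal(mu1, s1, grid, 1.0))]
print(proj)   # the KL projection keeps the mean: ~1.7
```

Here the projection simply retains the mean, illustrating how θ2⊥(θ1) inherits its distribution from θ1 ∼ π1.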
Kullback proximity

Alternative projection to the above:

Definition (Compatible prior)
Given a prior π1 on a model M1 and a submodel M2, a prior π2 on M2 is compatible with π1 when it achieves the minimum Kullback divergence between the corresponding marginals m1(x; π1) = ∫_{Θ1} f1(x|θ)π1(θ)dθ and m2(x; π2) = ∫_{Θ2} f2(x|θ)π2(θ)dθ:

π2 = arg min_{π2} ∫ log{m1(x; π1)/m2(x; π2)} m1(x; π1) dx
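A sketch of this definition on a hypothetical single-observation regression, chosen so that both marginals are normal and the Kullback divergence is closed form: under M1, y = β1 + β2 + ε with βj ∼ N(0, v), so m1 = N(0, 1 + 2v); under the submodel M2, y = β + ε with β ∼ N(0, w), so m2 = N(0, 1 + w). The compatible prior scale should then match the marginal variances, w = 2v.

```python
import numpy as np

def kl_zero_mean(s1, s2):
    # KL( N(0, s1^2) || N(0, s2^2) )
    return np.log(s2 / s1) + s1**2 / (2 * s2**2) - 0.5

v = 1.0
s1 = np.sqrt(1 + 2 * v)          # sd of m1(y; pi1)
w_grid = np.linspace(0.01, 6, 60_000)
w_star = w_grid[np.argmin(kl_zero_mean(s1, np.sqrt(1 + w_grid)))]
print(w_star)   # ~2v: the compatible prior matches the marginal variances
```

In richer models the marginals are rarely available in closed form and the minimisation must be done by simulation, which is part of the practical difficulty noted on the next slide.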
Difficulties

Further complicating dimensionality of test statistics is the fact that the models are often not nested, and one model may contain parameters that do not have analogues in the other models and vice versa. — A. Templeton, Mol. Ecol., 2009

Does not give a working principle when M2 is not a submodel of M1 [Pérez & Berger, 2000; Cano, Salmerón & Robert, 2006]
Depends on the choice of π1
Prohibits the use of improper priors
Worse: useless in unconstrained settings...
A side remark: Zellner's g

Use of Zellner's g-prior in linear regression, i.e. a normal prior for β conditional on σ²,

β|σ² ∼ N(β̃, g σ² (XᵀX)⁻¹),

and a Jeffreys prior for σ², π(σ²) ∝ σ⁻².
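The conditional prior is straightforward to simulate; a minimal sketch (the design matrix, g and σ² below are arbitrary illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(1)
n, p, g, sigma2 = 30, 3, 100.0, 2.0

# design matrix with an intercept column
X = np.column_stack([np.ones(n), rng.uniform(0, 10, (n, p))])
beta_tilde = np.zeros(p + 1)

# conditional g-prior covariance: g * sigma^2 * (X'X)^{-1}
cov = g * sigma2 * np.linalg.inv(X.T @ X)
draws = rng.multivariate_normal(beta_tilde, cov, size=5)
print(draws.shape)   # (5, 4)
```

The appeal of the g-prior is that the single scalar g calibrates the prior against the information carried by the design X, which is precisely what makes the choice of g so influential below.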
Variable selection

For the hierarchical parameter γ, we use

π(γ) = ∏_{i=1}^p τi^{γi} (1 − τi)^{1−γi},

where τi corresponds to the prior probability that variable i is present in the model (and a priori independence between the presence/absence of variables).

Typically (?), when no prior information is available, τ1 = . . . = τp = 1/2, i.e. a uniform prior

π(γ) = 2⁻ᵖ
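The prior above is a product of independent Bernoulli terms, so it is a one-liner to evaluate (the τ values in the second call are purely illustrative):

```python
import numpy as np

def prior_gamma(gamma, tau):
    """pi(gamma) = prod_i tau_i^gamma_i (1 - tau_i)^(1 - gamma_i)."""
    gamma, tau = np.asarray(gamma), np.asarray(tau)
    return float(np.prod(np.where(gamma == 1, tau, 1 - tau)))

p = 10
gamma = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]   # variables 1 and 2 included

# uniform case tau_i = 1/2: every gamma gets probability 2^{-p}
print(prior_gamma(gamma, [0.5] * p))               # 0.0009765625 = 2**-10
# informative case: variables 1 and 2 favoured a priori
print(prior_gamma(gamma, [0.9, 0.9] + [0.1] * 8))  # 0.9**10 ≈ 0.3487
```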
Influence of g

Taking β̃ = 0_{p+1} and g large does not work. Consider the 10-predictor full model

y|β, σ² ∼ N(β0 + Σ_{i=1}^3 βi xi + Σ_{i=1}^3 β_{i+3} xi² + β7 x1x2 + β8 x1x3 + β9 x2x3 + β10 x1x2x3, σ² In)

where the xi's are iid U(0, 10).
[Casella & Moreno, 2004]

True model: two predictors x1 and x2, i.e. γ* = 110. . .0, (β0, β1, β2) = (5, 1, 3), and σ² = 4.
Influence of g

t1(γ)      g = 10    g = 100   g = 10³   g = 10⁴   g = 10⁶
0,1,2      0.04062   0.35368   0.65858   0.85895   0.98222
0,1,2,7    0.01326   0.06142   0.08395   0.04434   0.00524
0,1,2,4    0.01299   0.05310   0.05805   0.02868   0.00336
0,2,4      0.02927   0.03962   0.00409   0.00246   0.00254
0,1,2,8    0.01240   0.03833   0.01100   0.00126   0.00126
Case for a noninformative hierarchical solution

Use the same compatible informative g-prior distribution with β̃ = 0_{p+1} and a hierarchical diffuse prior distribution on g, e.g.

π(g) ∝ g⁻¹ I_{N*}(g)

[Liang et al., 2007; Marin & Robert, 2007; Celeux et al., ca. 2011]
Occam's razor

Pluralitas non est ponenda sine necessitate

Variation is random until the contrary is shown; and new parameters in laws, when they are suggested, must be tested one at a time, unless there is specific reason to the contrary. — H. Jeffreys, ToP, 1939

No well-accepted implementation behind the principle... besides the fact that the Bayes factor naturally penalises larger models.