On resolving the Savage–Dickey paradox

Jean-Michel Marin (1,3) and Christian P. Robert (2,3)
(1) Université Montpellier 2, (2) Université Paris-Dauphine, and (3) CREST, Paris


Abstract

When testing a null hypothesis in a Bayesian framework, the Savage–Dickey ratio (Dickey, 1971) is known as a specific representation of the Bayes factor (O’Hagan and Forster, 2004) that only uses the posterior distribution under the alternative hypothesis at the null value, thus allowing for a plug-in version of this quantity. We demonstrate here that the Savage–Dickey representation is a generic representation of the Bayes factor and that it fundamentally relies on specific measure-theoretic versions of the densities involved in the ratio, instead of being a special identity imposing a mathematically void constraint on the prior. We completely clarify the measure-theoretic foundations of the Savage–Dickey representation as well as of the later generalisation of Verdinelli and Wasserman (1995). We furthermore provide a general framework that produces a converging approximation of the Bayes factor that is unrelated to the approach of Verdinelli and Wasserman (1995), and we propose a comparison of this new approximation with their version, as well as with bridge sampling and Chib’s approaches.


The Savage–Dickey ratio

Testing a null hypothesis versus the alternative

\[
H_0 : x \sim f_0(x|\omega_0) \quad\text{versus}\quad H_a : x \sim f_1(x|\omega_1)
\]

under the prior distributions $\pi_0(\omega_0)$ and $\pi_1(\omega_1)$ leads to the Bayes factor (Jeffreys, 1939)

\[
B_{01}(x) = \frac{\int \pi_0(\omega_0) f_0(x|\omega_0)\,d\omega_0}{\int \pi_1(\omega_1) f_1(x|\omega_1)\,d\omega_1} = \frac{m_0(x)}{m_1(x)} .
\]

The practical computation of the Bayes factor has generated a large literature on approximation methods (see, e.g., Chen et al., 2000, Marin and Robert, 2010), seeking improvements in numerical precision.

When considering the null $H_0 : \theta = \theta_0$, with a nuisance parameter $\psi$, i.e. when $\omega_1 = (\theta, \psi)$, for a sampling distribution $f(x|\theta,\psi)$, Dickey’s (1971) representation is

\[
B_{01}(x) = \frac{\pi_1(\theta_0|x)}{\pi_1(\theta_0)} , \qquad (1)
\]

with the obvious notations for the marginal distributions

\[
\pi_1(\theta) = \int \pi_1(\theta,\psi)\,d\psi \quad\text{and}\quad \pi_1(\theta|x) = \int \pi_1(\theta,\psi|x)\,d\psi .
\]

This representation holds under the assumption that

\[
\pi_1(\psi|\theta_0) = \pi_0(\psi) . \qquad (2)
\]

Equation (1) reduces the Bayes factor to the ratio of the posterior over the prior marginal densities of $\theta$ under the alternative model, taken at the tested value $\theta_0$.


A measure-theoretic impossibility

From a measure-theoretic perspective, since the value $\theta_0$ to be tested is given and the set $\{\theta = \theta_0\}$ has measure zero, the density function

\[
\pi_1(\psi|\theta_0)
\]

may be chosen in a completely arbitrary manner, and there always is a version of the conditional density $\pi_1(\psi|\theta_0)$ such that (2) is satisfied. This is the Savage–Dickey paradox: the representation (1) relies on a mathematically void constraint on the prior distribution!

Furthermore,

\[
\begin{aligned}
B_{01}(x) &= \frac{\int \pi_0(\psi) f(x|\theta_0,\psi)\,d\psi}{\int \pi_1(\theta,\psi) f(x|\theta,\psi)\,d\psi\,d\theta} && \text{[by definition]} \\
&= \frac{\int \pi_1(\psi|\theta_0) f(x|\theta_0,\psi)\,d\psi\;\pi_1(\theta_0)}{\int \pi_1(\theta,\psi) f(x|\theta,\psi)\,d\psi\,d\theta\;\pi_1(\theta_0)} && \text{[specific version of } \pi_1(\psi|\theta_0)\text{]} \\
&= \frac{\int \pi_1(\theta_0,\psi) f(x|\theta_0,\psi)\,d\psi}{m_1(x)\,\pi_1(\theta_0)} && \text{[specific version of } \pi_1(\theta_0,\psi)\text{]} \\
&= \frac{\pi_1(\theta_0|x)}{\pi_1(\theta_0)} . && \text{[specific version of } \pi_1(\theta_0|x)\text{]}
\end{aligned}
\]

In conclusion, the Savage–Dickey representation relies on the choice of a specific version of $\pi_1(\theta_0|x)$, namely

\[
\frac{\pi_1(\theta_0|x)}{\pi_1(\theta_0)} = \frac{\int \pi_0(\psi) f(x|\theta_0,\psi)\,d\psi}{m_1(x)} .
\]


Verdinelli & Wasserman’s extension

Verdinelli and Wasserman (1995) have proposed a generalisation of the Savage–Dickey density ratio that bypasses the [void] constraint (2). Their derivation is

\[
\begin{aligned}
B_{01}(x) &= \frac{\int \pi_0(\psi) f(x|\theta_0,\psi)\,d\psi}{m_1(x)} && \text{[by definition]} \\
&= \pi_1(\theta_0|x)\,\frac{\int \pi_0(\psi) f(x|\theta_0,\psi)\,d\psi}{m_1(x)\,\pi_1(\theta_0|x)} && \text{[for all } \pi_1(\theta_0|x)\text{]} \\
&= \pi_1(\theta_0|x) \int \frac{\pi_0(\psi) f(x|\theta_0,\psi)}{m_1(x)\,\pi_1(\theta_0|x)}\,\frac{\pi_1(\psi|\theta_0)}{\pi_1(\psi|\theta_0)}\,d\psi && \text{[for all } \pi_1(\psi|\theta_0)\text{]} \\
&= \pi_1(\theta_0|x) \int \frac{\pi_0(\psi)}{\pi_1(\psi|\theta_0)}\,\frac{f(x|\theta_0,\psi)\,\pi_1(\psi|\theta_0)}{m_1(x)\,\pi_1(\theta_0|x)}\,d\psi \times \frac{\pi_1(\theta_0)}{\pi_1(\theta_0)} && \text{[for all } \pi_1(\theta_0)\text{]} \\
&= \frac{\pi_1(\theta_0|x)}{\pi_1(\theta_0)} \int \frac{\pi_0(\psi)}{\pi_1(\psi|\theta_0)}\,\pi_1(\psi|\theta_0,x)\,d\psi && \text{[specific version of } \pi_1(\psi|\theta_0,x)\text{]} \\
&= \frac{\pi_1(\theta_0|x)}{\pi_1(\theta_0)}\;\mathbb{E}^{\pi_1(\psi|x,\theta_0)}\!\left[\frac{\pi_0(\psi)}{\pi_1(\psi|\theta_0)}\right] .
\end{aligned}
\]

This representation of Verdinelli and Wasserman (1995) therefore remains valid for any choice of versions for $\pi_1(\theta_0|x)$, $\pi_1(\theta_0)$, $\pi_1(\psi|\theta_0)$, provided the conditional density $\pi_1(\psi|\theta_0,x)$ is defined by

\[
\pi_1(\psi|\theta_0,x) = \frac{f(x|\theta_0,\psi)\,\pi_1(\psi|\theta_0)\,\pi_1(\theta_0)}{m_1(x)\,\pi_1(\theta_0|x)} .
\]


A generic Savage–Dickey representation

When considering the Bayes factor

\[
B_{01}(x) = \frac{\int \pi_0(\psi) f(x|\theta_0,\psi)\,d\psi\;\pi_1(\theta_0)}{\int \pi_1(\theta,\psi) f(x|\theta,\psi)\,d\psi\,d\theta\;\pi_1(\theta_0)} ,
\]

the numerator involves a specific version in $\theta = \theta_0$ of the (pseudo-)marginal posterior density

\[
\tilde\pi_1(\theta|x) \propto \int \pi_0(\psi) f(x|\theta,\psi)\,d\psi\;\pi_1(\theta) ,
\]

which is associated with the (pseudo-)prior $\tilde\pi_1(\theta,\psi) = \pi_1(\theta)\,\pi_0(\psi)$. It is the marginal posterior of

\[
\tilde\pi_1(\theta,\psi|x) = \frac{\pi_0(\psi)\,\pi_1(\theta)\,f(x|\theta,\psi)}{\tilde m_1(x)} ,
\]

where $\tilde m_1(x)$ is the proper normalising constant.

A Savage–Dickey-like representation of the Bayes factor is obtained by imposing

\[
\frac{\tilde\pi_1(\theta_0|x)}{\pi_1(\theta_0)} = \frac{\int \pi_0(\psi) f(x|\theta_0,\psi)\,d\psi}{\tilde m_1(x)} , \qquad (3)
\]

where the right-hand side of the equation is uniquely defined. Then, for any value of $\pi_1(\theta_0)$,

\[
B_{01}(x) = \frac{\tilde\pi_1(\theta_0|x)}{\pi_1(\theta_0)}\,\frac{\tilde m_1(x)}{m_1(x)} . \qquad (4)
\]


A computational derivation

Given a sample $(\bar\theta^{(1)}, \bar\psi^{(1)}, \bar z^{(1)}), \ldots, (\bar\theta^{(T)}, \bar\psi^{(T)}, \bar z^{(T)})$ from the augmented posterior $\tilde\pi_1(\theta,\psi,z|x)$, the sequence

\[
\frac{1}{T} \sum_{t=1}^{T} \tilde\pi_1(\theta_0|x, \bar z^{(t)}, \bar\psi^{(t)})
\]

converges to $\tilde\pi_1(\theta_0|x)$ under the constraint

\[
\frac{\tilde\pi_1(\theta_0|x,z,\psi)}{\pi_1(\theta_0)} = \frac{f(x,z|\theta_0,\psi)}{\int f(x,z|\theta,\psi)\,\pi_1(\theta)\,d\theta} .
\]

Therefore an approximation to $B_{01}(x)$ is

\[
\frac{1}{T} \sum_{t=1}^{T} \frac{\tilde\pi_1(\theta_0|x, \bar z^{(t)}, \bar\psi^{(t)})}{\pi_1(\theta_0)}\,\frac{\tilde m_1(x)}{m_1(x)} .
\]

Since both $m_1(x)$ and $\tilde m_1(x)$ are unknown, we use the bridge sampling identity of Torrie and Valleau (1977), which gives

\[
\mathbb{E}^{\pi_1(\theta,\psi|x)}\!\left[\frac{\pi_0(\psi)\,\pi_1(\theta)\,f(x|\theta,\psi)}{\pi_1(\theta,\psi)\,f(x|\theta,\psi)}\right] = \mathbb{E}^{\pi_1(\theta,\psi|x)}\!\left[\frac{\pi_0(\psi)}{\pi_1(\psi|\theta)}\right] = \frac{\tilde m_1(x)}{m_1(x)} ,
\]

implying that, if $(\theta^{(t)}, \psi^{(t)}) \sim \pi_1(\theta,\psi|x)$, then $\pi_0(\psi^{(t)}) / \pi_1(\psi^{(t)}|\theta^{(t)})$ is an unbiased estimator of $\tilde m_1(x)/m_1(x)$.

Therefore, (4) leads to the unbiased estimator

\[
B^{\mathrm{MR}}_{01}(x) = \frac{1}{T} \sum_{t=1}^{T} \frac{\tilde\pi_1(\theta_0|x, \bar z^{(t)}, \bar\psi^{(t)})}{\pi_1(\theta_0)} \;\; \frac{1}{T} \sum_{t=1}^{T} \frac{\pi_0(\psi^{(t)})}{\pi_1(\psi^{(t)}|\theta^{(t)})} . \qquad (5)
\]

Note that

\[
\mathbb{E}^{\tilde\pi_1(\theta,\psi|x)}\!\left[\frac{\pi_1(\theta,\psi)\,f(x|\theta,\psi)}{\pi_0(\psi)\,\pi_1(\theta)\,f(x|\theta,\psi)}\right] = \mathbb{E}^{\tilde\pi_1(\theta,\psi|x)}\!\left[\frac{\pi_1(\psi|\theta)}{\pi_0(\psi)}\right] = \frac{m_1(x)}{\tilde m_1(x)}
\]

implies that

\[
T \Bigg/ \sum_{t=1}^{T} \frac{\pi_1(\bar\psi^{(t)}|\bar\theta^{(t)})}{\pi_0(\bar\psi^{(t)})}
\]

is another convergent (if biased) estimator of $\tilde m_1(x)/m_1(x)$.

For Verdinelli and Wasserman (1995), given $(\theta^{(t)}, \psi^{(t)}, z^{(t)}) \sim \pi_1(\theta,\psi,z|x)$, $\pi_1(\theta_0|x, z^{(t)}, \psi^{(t)})$ estimates $\pi_1(\theta_0|x)$ under the constraint

\[
\frac{\pi_1(\theta_0|x,z,\psi)}{\pi_1(\theta_0)} = \frac{f(x,z|\theta_0,\psi)}{\int f(x,z|\theta,\psi)\,\pi_1(\theta)\,d\theta} .
\]

Moreover, if $(\tilde\psi^{(t)}, \tilde z^{(t)}) \sim \pi_1(\psi,z|x,\theta_0)$, $\pi_0(\tilde\psi^{(t)}) / \pi_1(\tilde\psi^{(t)}|\theta_0)$ estimates

\[
\mathbb{E}^{\pi_1(\psi|x,\theta_0)}\!\left[\frac{\pi_0(\psi)}{\pi_1(\psi|\theta_0)}\right]
\]

under the constraint

\[
\pi_1(\psi,z|\theta_0,x) \propto f(x,z|\theta_0,\psi)\,\pi_1(\psi|\theta_0) .
\]

Therefore, Verdinelli and Wasserman’s (1995) representation leads to

\[
B^{\mathrm{VW}}_{01}(x) = \frac{1}{T} \sum_{t=1}^{T} \frac{\pi_1(\theta_0|x, z^{(t)}, \psi^{(t)})}{\pi_1(\theta_0)} \;\; \frac{1}{T} \sum_{t=1}^{T} \frac{\pi_0(\tilde\psi^{(t)})}{\pi_1(\tilde\psi^{(t)}|\theta_0)} , \qquad (6)
\]

to be compared with (5).

Figure 1: Variations of five approximations of $B_{01}(x)$ for evaluating the impact of a covariate upon the occurrence of diabetes in the Pima Indian population, Pima.te(MASS), based on a probit modelling. The boxplots are based on 100 replicas and our Savage–Dickey representation (5) is denoted by MR, while Verdinelli and Wasserman’s (1995) version (6) is denoted by VW. The bridge sampling (Bridge), importance sampling (IS) and Chib’s solution are all described in Marin and Robert (2010). The importance sampling proposal is based on the MLE approximation.


References

Chen, M., Shao, Q. and Ibrahim, J. (2000). Monte Carlo Methods in Bayesian Computation. Springer-Verlag, New York.

Dickey, J. (1971). The weighted likelihood ratio, linear hypotheses on normal location parameters. Ann. Math. Statist., 42 204–223.

Jeffreys, H. (1939). Theory of Probability. 1st ed. The Clarendon Press, Oxford.

Marin, J. and Robert, C. (2010). Importance sampling methods for Bayesian discrimination between embedded models. In Frontiers of Statistical Decision Making and Bayesian Analysis (M.-H. Chen, D. Dey, P. Müller, D. Sun and K. Ye, eds.). Springer-Verlag, New York. ArXiv:0910.2325.

O’Hagan, A. and Forster, J. (2004). Kendall’s Advanced Theory of Statistics: Bayesian inference. Arnold, London.

Torrie, G. and Valleau, J. (1977). Nonphysical sampling distributions in Monte Carlo free-energy estimation: Umbrella sampling. J. Comp. Phys., 23 187–199.

Verdinelli, I. and Wasserman, L. (1995). Computing Bayes factors using a generalization of the Savage–Dickey density ratio. J. American Statist. Assoc., 90 614–618.

Tetons range, August 2007
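
The estimators (5) and (6) can be checked on a toy problem where B01(x) is available in closed form. The sketch below is not the probit example of Figure 1. It assumes a Gaussian model with a single observation and no latent variable z: x | θ, ψ ∼ N(θ + ψ, 1), π1(θ) = N(0, 1), π1(ψ|θ) = N(ρθ, 1) and π0(ψ) = N(0, 0.49), with θ0 = 0. All names (`normal_pdf`, `gaussian_posterior`, `exact_b01`, `mr_estimate`, `vw_estimate`, `RHO`, `S0SQ`, `THETA0`) and all numerical values are illustrative assumptions. The VW sketch uses the full conditional of θ under π1, proportional to π1(θ)π1(ψ|θ)f(x|θ,ψ).

```python
import numpy as np

# Toy model, assumed for illustration only (not the poster's probit example):
#   x | theta, psi ~ N(theta + psi, 1), tested value THETA0
#   pi_1(theta) = N(0, 1), pi_1(psi | theta) = N(RHO * theta, 1), pi_0(psi) = N(0, S0SQ)
RHO, S0SQ, THETA0 = 0.3, 0.49, 0.0
PRIOR_COV = np.array([[1.0, RHO], [RHO, RHO**2 + 1.0]])  # covariance of (theta, psi) under pi_1


def normal_pdf(v, mean, var):
    """Density of N(mean, var) evaluated at v."""
    return np.exp(-0.5 * (v - mean) ** 2 / var) / np.sqrt(2.0 * np.pi * var)


def gaussian_posterior(prior_cov, x):
    """Posterior mean and covariance of (theta, psi) for a zero-mean Gaussian prior."""
    a = np.ones(2)
    s = a @ prior_cov @ a + 1.0  # marginal variance of x
    gain = prior_cov @ a / s
    return gain * x, prior_cov - np.outer(gain, prior_cov @ a)


def exact_b01(x):
    """Closed-form B01(x) = m0(x) / m1(x) for the toy model."""
    m0 = normal_pdf(x, THETA0, 1.0 + S0SQ)
    m1 = normal_pdf(x, 0.0, 1.0 + np.ones(2) @ PRIOR_COV @ np.ones(2))
    return m0 / m1


def mr_estimate(x, T=200_000, seed=0):
    """Estimator (5): pseudo-posterior average times a bridge-type correction."""
    rng = np.random.default_rng(seed)
    # Draws of psi from the pseudo-posterior built on the pseudo-prior pi_1(theta) pi_0(psi).
    mean_t, cov_t = gaussian_posterior(np.diag([1.0, S0SQ]), x)
    psi_bar = rng.multivariate_normal(mean_t, cov_t, size=T)[:, 1]
    # Under the pseudo-posterior, theta | x, psi ~ N((x - psi) / 2, 1 / 2).
    first = np.mean(normal_pdf(THETA0, (x - psi_bar) / 2.0, 0.5) / normal_pdf(THETA0, 0.0, 1.0))
    # Draws from the genuine posterior pi_1(theta, psi | x) estimate m~_1(x) / m_1(x).
    mean, cov = gaussian_posterior(PRIOR_COV, x)
    draws = rng.multivariate_normal(mean, cov, size=T)
    theta, psi = draws[:, 0], draws[:, 1]
    second = np.mean(normal_pdf(psi, 0.0, S0SQ) / normal_pdf(psi, RHO * theta, 1.0))
    return first * second


def vw_estimate(x, T=200_000, seed=1):
    """Estimator (6), using the full conditional of theta under pi_1."""
    rng = np.random.default_rng(seed)
    mean, cov = gaussian_posterior(PRIOR_COV, x)
    psi = rng.multivariate_normal(mean, cov, size=T)[:, 1]
    # Under pi_1, theta | x, psi is Gaussian with precision 2 + RHO^2.
    prec = 2.0 + RHO**2
    cond_mean = (RHO * psi + x - psi) / prec
    first = np.mean(normal_pdf(THETA0, cond_mean, 1.0 / prec) / normal_pdf(THETA0, 0.0, 1.0))
    # Under pi_1, psi | x, theta0 ~ N((RHO * theta0 + x - theta0) / 2, 1 / 2).
    psi_tilde = rng.normal((RHO * THETA0 + x - THETA0) / 2.0, np.sqrt(0.5), size=T)
    second = np.mean(normal_pdf(psi_tilde, 0.0, S0SQ) / normal_pdf(psi_tilde, RHO * THETA0, 1.0))
    return first * second


if __name__ == "__main__":
    x_obs = 1.2  # arbitrary observation chosen for the illustration
    print(exact_b01(x_obs), mr_estimate(x_obs), vw_estimate(x_obs))
```

In this toy setting, the variance of π0(ψ) is chosen smaller than that of π1(ψ|θ) so that the importance ratios in both estimators have finite variance. With heavier-tailed π0, the second factor of either estimator could become unstable.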

Valencia 9 (poster)

In conclusion, the Savage–Dickey representation relies on the choice of J. Comp. Phys., 23 187–199. a specific version of π1(θ0|x), namely Eπ1(θ,ψ|x) π0(ψ)π1(θ)f (x|θ, ψ) = Eπ1(θ,ψ|x) π0(ψ) = m1(x) ˜ , π1(θ, ψ)f (x|θ, ψ) π1(ψ|θ) m1(x) Verdinelli, I. and Wasserman, L. (1995). Computing Bayes π1(θ0|x) π0(ψ)f (x|θ0, ψ) dψ factors using a generalization of the Savage–Dickey density ratio. J. = . implying that, if (θ(t), ψ (t)) ∼ π1(θ, ψ|x), then π0(ψ (t))π1(ψ (t)|θ(t)) is American Statist. Assoc., 90 614–618. π1(θ0) m1(x) an unbiased estimator of m1(x)/m1(x). ˜ Tetons range, August 2007
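Illustrative sketch of estimator (5)

As a hedged illustration of how the two averages in estimator (5) combine, the following Python sketch uses a toy Gaussian model that is not taken from the poster. Everything in it is an assumption made for illustration: a single observation x ~ N(θ + ψ, 1), the alternative prior θ ~ N(0, 1) with ψ | θ ~ N(ρθ, 1), the null prior π0(ψ) = N(0, 1), the tested value θ0 = 0, no latent variable z, and the hypothetical function names `mr_estimate` and `exact_bayes_factor`. In this toy model both posteriors can be sampled exactly and the Bayes factor is available in closed form, so the Monte Carlo value can be checked against it.

```python
import math
import random


def normal_pdf(y, mean, var):
    """Density of N(mean, var) evaluated at y."""
    return math.exp(-0.5 * (y - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def exact_bayes_factor(x, rho=0.5):
    """Closed-form B01 = m0(x) / m1(x) for the toy model (illustrative assumption)."""
    m0 = normal_pdf(x, 0.0, 2.0)                     # psi ~ N(0,1), x | psi ~ N(psi, 1)
    m1 = normal_pdf(x, 0.0, (1.0 + rho) ** 2 + 2.0)  # x = (1+rho)*theta + two unit noises
    return m0 / m1


def mr_estimate(x, rho=0.5, T=20000, seed=1):
    """Monte Carlo version of estimator (5) for the toy model, with theta0 = 0."""
    rng = random.Random(seed)
    theta0 = 0.0

    # First average: psi_bar drawn from the pseudo-posterior, whose prior is
    # pi1(theta) * pi0(psi). Marginally psi_bar | x ~ N(x/3, 2/3) here.
    first = 0.0
    for _ in range(T):
        psi_bar = rng.gauss(x / 3.0, math.sqrt(2.0 / 3.0))
        # pseudo-posterior density of theta0 given (x, psi), divided by pi1(theta0):
        # f(x | theta0, psi) / integral of f(x | theta, psi) pi1(theta) dtheta
        first += normal_pdf(x, theta0 + psi_bar, 1.0) / normal_pdf(x, psi_bar, 2.0)
    first /= T

    # Second average: (theta, psi) drawn from the genuine posterior under pi1,
    # giving the bridge sampling estimate of m1_tilde(x) / m1(x).
    prec = 1.0 + (1.0 + rho) ** 2 / 2.0
    second = 0.0
    for _ in range(T):
        theta = rng.gauss(((1.0 + rho) * x / 2.0) / prec, math.sqrt(1.0 / prec))
        psi = rng.gauss((rho * theta + x - theta) / 2.0, math.sqrt(0.5))
        second += normal_pdf(psi, 0.0, 1.0) / normal_pdf(psi, rho * theta, 1.0)
    second /= T

    return first * second


if __name__ == "__main__":
    x_obs = 1.0
    print("MR estimate:", mr_estimate(x_obs))
    print("exact B01  :", exact_bayes_factor(x_obs))
```

The two averages use independent samples, one from the pseudo-posterior and one from the posterior under the alternative, which mirrors the product structure of (5). Replacing the exact Gaussian draws with MCMC output would be the natural extension to models such as the probit example of Figure 1.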