
# Computation of the marginal likelihood


First talk at the BigMC seminar on 06/01/2011 (Institut Henri Poincaré, Paris), by Jean-Louis Foulley, INRA, on "Computation of the marginal likelihood".



1. Computation of the marginal likelihood: brief summary and method of power posteriors
   Jean-Louis Foulley (jean-louis.foulley@jouy.inra.fr), 06/01/2011, JLF/BigMC.
2. Outline
   Objectives. Brief summary of current methods: direct Monte Carlo; harmonic mean; generalized harmonic mean; Chib; bridge sampling; nested sampling. Power posteriors: relationship with the fractional BF; algorithm; examples. Conclusion.
3. Objectives
   Marginal likelihood ("prior predictive", "evidence"): m(y) = ∫_Θ f(y|θ) π(θ) dθ.
   - Normalizing constant of π*(θ|y): π(θ|y) = π*(θ|y)/m(y), where π*(θ|y) = f(y|θ) π(θ).
   - Component of the Bayes factor: BF₁₂ = [π(M₁|y)/π(M₂|y)] / [π(M₁)/π(M₂)] = m₁(y)/m₂(y).
   Marginal deviance: D_{m,j} = −2 ln m_j(y), so that ΔD_{m,12} = −2 ln BF₁₂ = D_{m,1} − D_{m,2}.
   Calibration: Jeffreys & Turing (deciban: 10 log₁₀ BF).
4. Methods: direct Monte Carlo, harmonic mean
   1) Direct Monte Carlo: m̂_MC(y) = (1/G) Σ_{g=1}^G f(y|θ^(g)), with θ^(1), …, θ^(G) drawn from the prior π(θ). Converges (a.s.) to m(y) but is very inefficient: many samples fall outside the regions of high likelihood.
   2) Harmonic mean (Newton & Raftery, 1994): m̂_NR(y) = [(1/G) Σ_{g=1}^G 1/f(y|θ^(g))]^{−1}, with θ^(1), …, θ^(G) drawn from the posterior π(θ|y). A special case of weighted importance sampling Σ_j f(y|θ^(j)) w(θ^(j)) / Σ_j w(θ^(j)), with w(θ) ∝ π(θ)/g(θ) for g(θ) ∝ f(y|θ) π(θ). Converges (a.s.) but is very unstable (infinite variance) and is to be absolutely avoided: the "Worst Monte Carlo Method Ever" (Radford Neal, 2010). The harmonic mean is hardly affected by a change of prior, whereas the true marginal is highly sensitive to it.
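As a minimal sketch of these two estimators (my own toy, not from the slides), take a single observation y with y|θ ~ N(θ, 1) and prior θ ~ N(0, 1), for which the exact marginal is y ~ N(0, 2):

```python
import math
import random

random.seed(1)

def norm_pdf(x, mu, var):
    """Normal density N(x; mu, var)."""
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

y = 1.0                    # single observation: y | θ ~ N(θ, 1), θ ~ N(0, 1)
G = 200_000

# 1) direct Monte Carlo: average the likelihood over prior draws
prior_draws = [random.gauss(0.0, 1.0) for _ in range(G)]
m_mc = sum(norm_pdf(y, th, 1.0) for th in prior_draws) / G

# 2) harmonic mean: average 1/likelihood over posterior draws θ | y ~ N(y/2, 1/2)
post_draws = [random.gauss(y / 2, math.sqrt(0.5)) for _ in range(G)]
m_nr = G / sum(1.0 / norm_pdf(y, th, 1.0) for th in post_draws)

m_true = norm_pdf(y, 0.0, 2.0)   # exact marginal: y ~ N(0, prior var + noise var)
```

Even in this trivial model the harmonic-mean weights 1/f(y|θ) have infinite variance under the posterior, so m_nr can drift far from m_true between runs while m_mc is reliable here (the prior is close to the posterior; direct MC degrades badly once they separate).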
5. Methods: generalized harmonic mean, Chib
   3) Generalized harmonic mean (Gelfand & Dey, 1994; Chen & Shao, 1997): m̂_GD(y) = [(1/G) Σ_{g=1}^G g(θ^(g)) / (f(y|θ^(g)) π(θ^(g)))]^{−1}, with θ^(1), …, θ^(G) drawn from π(θ|y) and g(·) an approximation of the posterior: problematic in large dimension.
   4) Chib's method (1995): from the identity ln m(y) = ln f(y|θ) + ln π(θ) − ln π(θ|y), valid for any θ, take ln m̂_C(y) = ln f(y|θ*) + ln π(θ*) − ln π̂(θ*|y), where the posterior ordinate π̂(θ*|y) has to be estimated and θ* is a selected high-density point (ML, MAP or E(θ|y)). Simple and often effective.
6. Chib (cont.)
   ln m̂_C(y) = ln f(y|θ*) + ln π(θ*) − ln π̂(θ*|y). The posterior ordinate π̂(θ*|y) can be estimated by:
   a) Gibbs sampling with Rao-Blackwellization (Chib, 1995);
   b) Metropolis-Hastings output (Chib & Jeliazkov, 2001);
   c) a kernel estimator (Chen, 1994).
7. Chib via Gibbs
   If θ = (θ₁, θ₂): π(θ₁, θ₂|y) = π(θ₁|y, θ₂) π(θ₂|y), where the first factor is known and the second has to be estimated, with π(θ₂|y) = ∫ π(θ₂|y, θ₁) π(θ₁|y) dθ₁ (known conditional, MCMC draws).
   Estimation by Rao-Blackwellization: π̂(θ₂*|y) = (1/G) Σ_{g=1}^G π(θ₂*|y, θ₁^(g)), with θ₁^(g) drawn from π(θ₁|y).
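Chib's identity can be checked exactly on a conjugate toy where every ordinate is available in closed form (my own illustration; in real applications π(θ*|y) would be the Rao-Blackwellized estimate above):

```python
import math

def norm_logpdf(x, mu, var):
    """Log of the normal density N(x; mu, var)."""
    return -0.5 * math.log(2 * math.pi * var) - (x - mu) ** 2 / (2 * var)

# conjugate model: y | θ ~ N(θ, 1), θ ~ N(0, 1)  ⇒  θ | y ~ N(y/2, 1/2)
y = 1.0
theta_star = y / 2        # a high-density point, as Chib recommends

log_m = (norm_logpdf(y, theta_star, 1.0)          # log f(y | θ*)
         + norm_logpdf(theta_star, 0.0, 1.0)      # log π(θ*)
         - norm_logpdf(theta_star, y / 2, 0.5))   # log π(θ* | y)

log_m_true = norm_logpdf(y, 0.0, 2.0)             # exact marginal: y ~ N(0, 2)
```

Here the identity holds to machine precision; in practice the only error comes from the estimate of the posterior ordinate.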
8. Bridge sampling
   5) Bridge sampling (Meng & Wong, 1996) starts from the identity
   ∫ α(θ) g(θ) [f(y|θ) π(θ)/m(y)] dθ / ∫ α(θ) g(θ) π(θ|y) dθ = 1,
   which holds because f(y|θ) π(θ)/m(y) = π(θ|y).
9. Bridge sampling (cont.)
   m(y) = ∫ α(θ) f(y|θ) π(θ) g(θ) dθ / ∫ α(θ) g(θ) π(θ|y) dθ = E_{g(θ)}[α(θ) f(y|θ) π(θ)] / E_{π(θ|y)}[α(θ) g(θ)], where α(θ) is the "bridge function" and g(θ) a density to be calibrated. With θ^(l) drawn from g(θ) and θ^(m) drawn from π(θ|y):
   - For α(θ) = 1/g(θ): m̂_BS1(y) = (1/L) Σ_{l=1}^L f(y|θ^(l)) π(θ^(l)) / g(θ^(l)) (importance sampling).
   - For α(θ) = 1/[f(y|θ) π(θ)]: m̂_BS2(y) = the Gelfand-Dey (1994) estimator.
   - For α(θ) = 1/[f(y|θ) π(θ) g(θ)]^{1/2}: m̂_BS3(y) = the Lopes-West (2004) estimator,
     m̂_BS3(y) = {(1/L) Σ_{l=1}^L [f(y|θ^(l)) π(θ^(l)) / g(θ^(l))]^{1/2}} / {(1/M) Σ_{m=1}^M [g(θ^(m)) / (f(y|θ^(m)) π(θ^(m)))]^{1/2}}.
10. Bridge sampling (cont.)
    With the same draws θ^(l) from g(θ) and θ^(m) from π(θ|y):
    - For α(θ) = 1/[f(y|θ) π(θ) g(θ)]: m̂_BS4(y) = {(1/L) Σ_{l=1}^L 1/g(θ^(l))} / {(1/M) Σ_{m=1}^M 1/[f(y|θ^(m)) π(θ^(m))]} (Lopes & West, 2004; Ando, 2010); odd, as the numerator involves the draws from g.
    - For α(θ) ∝ [s_M π(θ|y) + s_L g(θ)]^{−1}: optimum estimator with respect to the expected RMSE (Meng & Wong, 1996; Lopes & West, 2004; Frühwirth-Schnatter, 2004), computed iteratively as
      m̂_BS5^(t+1) = m̂_BS5^(t) × {(1/L) Σ_{l=1}^L π̂_t(θ^(l)|y) / [s_M π̂_t(θ^(l)|y) + s_L g(θ^(l))]} / {(1/M) Σ_{m=1}^M g(θ^(m)) / [s_M π̂_t(θ^(m)|y) + s_L g(θ^(m))]},
      where π̂_t(θ|y) = f(y|θ) π(θ) / m̂_BS5^(t), the starting value m̂_BS5^(0) is m̂_BS1 or m̂_BS2, and s_M = 1 − s_L = M/(M + L).
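The iterative optimal-bridge update can be sketched on the same conjugate toy (my own illustration: y|θ ~ N(θ, 1), θ ~ N(0, 1), with the prior as proposal g):

```python
import math
import random

random.seed(2)

def pdf(x, mu, var):
    """Normal density N(x; mu, var)."""
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

y = 1.0                          # datum: y | θ ~ N(θ, 1), θ ~ N(0, 1)
L = M = 20_000

g = lambda th: pdf(th, 0.0, 1.0)                      # proposal g(θ): here the prior
q = lambda th: pdf(y, th, 1.0) * pdf(th, 0.0, 1.0)    # unnormalized posterior f(y|θ)π(θ)

th_g = [random.gauss(0.0, 1.0) for _ in range(L)]               # draws from g(θ)
th_p = [random.gauss(y / 2, math.sqrt(0.5)) for _ in range(M)]  # draws from π(θ|y)

sM = M / (M + L)
sL = 1 - sM
m = sum(q(t) / g(t) for t in th_g) / L     # start from the IS estimate m_BS1
for _ in range(10):                        # iterate the optimal-bridge update
    num = sum((q(t) / m) / (sM * q(t) / m + sL * g(t)) for t in th_g) / L
    den = sum(g(t) / (sM * q(t) / m + sL * g(t)) for t in th_p) / M
    m *= num / den

m_true = pdf(y, 0.0, 2.0)                  # exact marginal: y ~ N(0, 2)
```

A handful of iterations is usually enough; the update is a fixed point of the identity above, so a rough starting value (here the plain importance-sampling estimate) converges quickly.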
11. Nested sampling
    6) Nested sampling (Skilling, 2006; Murray et al., 2006; Chopin & Robert, 2010). Write Z = m(y) = ∫ f(y|θ) π(θ) dθ = E_π[L(θ)], with L(θ) the likelihood. Let x = φ^{−1}(l) = Pr[L(θ) > l] be the survival function of the r.v. L(θ), with l = φ(x) the (upper-tail) quantile function of L(θ), so that x ~ U(0, 1). Then Z = ∫₀¹ φ(x) dx (the area under the curve l = φ(x)), estimated by Ẑ = Σ_{i=1}^m l_i Δx_i, with Δx_i = x_{i−1} − x_i, or Δx_i = ½(x_{i−1} − x_{i+1}) for trapezoidal integration.
12. Nested sampling (cont.)
    1) Draw N points θ_{1,i} from the prior; set θ₁ = argmin_{i=1,…,N} L(θ_{1,i}) and l₁ = L(θ₁).
    2) Obtain N points θ_{2,i} by keeping the θ_{1,i}, except that θ₁ is replaced by a draw from the prior constrained by L(θ) > l₁; record θ₂ = argmin_{i=1,…,N} L(θ_{2,i}) and set l₂ = L(θ₂).
    3) Repeat 1 & 2 until a stopping rule is met (change in the max of L ≤ ε).
    Since x_i = φ^{−1}(l_i) is unknown, set either a) deterministically x_i = exp(−i/N), so that ln x_i = E[ln φ^{−1}(l_i)], or b) randomly x_{i+1} = t_i x_i with x₀ = 1 and t_i ~ Be(N, 1).
    Main difficulty: sampling θ from the prior constrained by L(θ) > l. See Chopin & Robert (2010) for an extended importance sampling scheme, Ẑ = Σ_{i=1}^m Δx_i φ_i w_i, with an instrumental prior and likelihood chosen such that π̃(θ) L̃(θ) w(θ) = π(θ) L(θ).
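The three steps above can be sketched on a 1-D toy (my own illustration: y|θ ~ N(θ, 1), θ ~ N(0, 1)), where the constrained prior draw is cheap enough to do by plain rejection; that shortcut is exactly the "main difficulty" in realistic models:

```python
import math
import random

random.seed(3)

y = 1.0                                   # datum: y | θ ~ N(θ, 1), θ ~ N(0, 1)

def lik(th):
    """Likelihood L(θ) = N(y; θ, 1)."""
    return math.exp(-(y - th) ** 2 / 2) / math.sqrt(2 * math.pi)

N = 400                                    # number of live points
live = [random.gauss(0.0, 1.0) for _ in range(N)]
Z, x_prev = 0.0, 1.0
for i in range(1, 2001):
    worst = min(live, key=lik)
    l_min = lik(worst)
    x_i = math.exp(-i / N)                 # deterministic shrinkage x_i = e^(-i/N)
    Z += l_min * (x_prev - x_i)
    x_prev = x_i
    # replace the worst point by a prior draw constrained to L(θ) > l_min
    # (plain rejection from the prior: affordable only in this 1-D toy)
    while True:
        cand = random.gauss(0.0, 1.0)
        if lik(cand) > l_min:
            live[live.index(worst)] = cand
            break
Z += x_prev * sum(lik(t) for t in live) / N   # remaining live-point mass

m_true = math.exp(-y ** 2 / 4) / math.sqrt(4 * math.pi)   # exact: y ~ N(0, 2)
```

The stochastic error scales roughly like sqrt(H/N) on the log scale, H being the prior-to-posterior information, so more live points buy accuracy linearly in cost.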
13. Power posteriors: basic principle
    Method due to Friel & Pettitt (2008) and Lartillot & Philippe (2006) ("annealing-melting"). The power posterior is defined as π(θ|y, t) = f(y|θ)^t π(θ) / z_t(y), where z_t(y) = ∫ f(y|θ)^t π(θ) dθ and t ∈ [0, 1], with t^{−1} the equivalent of a "physical temperature". Moving t from 0 to 1: cooling down, or "annealing"; from 1 to 0: "melting". Note the path-sampling scheme (Gelman & Meng, 1998): π(θ|y, 0) = π(θ) with z₀(y) = 1; π(θ|y, 1) = π(θ|y) with z₁(y) = m(y).
14. PP: key result
    log m(y) = ∫₀¹ E_{θ|y,t}[log f(y|θ)] dt, where θ|y, t has density π(θ|y, t) = f(y|θ)^t π(θ) / z_t(y). This is thermodynamic integration (end of the 70's): Ripley (1988), Ogata (1989), Neal (1993); "path sampling" (Gelman & Meng, 1998).
15. PP formula: proof as a special case of path sampling
    If p(θ|t) = q(θ|t) / z(t), where z(t) = ∫ q(θ|t) dθ, label U(θ, t) = (d/dt) ln q(θ|t) as the potential. One has ln[z(1)/z(0)] = ∫₀¹ E_{θ|t}[U(θ, t)] dt. Here p(θ|t) = π(θ|y, t) and q(θ|t) = f(y|θ)^t π(θ), so that U(θ, t) = ln f(y|θ).
16. PP: example
    y_i|θ ~ iid N(θ, 1), i = 1, …, N; θ ~ N(μ, τ²). Then θ|y, t ~ N(μ_t, τ_t²), with μ_t = (Ntȳ + μτ^{−2}) / (Nt + τ^{−2}) and τ_t² = 1 / (Nt + τ^{−2}), where ȳ = N^{−1} Σ_{i=1}^N y_i and s² = N^{−1} Σ_{i=1}^N (y_i − ȳ)².
    E_{θ|y,t}[log f(y|θ)] = −(N/2) [log 2π + s² + (ȳ − μ_t)² + τ_t²].
    At t = 0, D₀(θ) = −2 E_{θ|y,0}[log f(y|θ)] = N [Cte + (μ − ȳ)²] + Nτ²: high sensitivity to τ² (τ² → ∞ ⇒ D₀(θ) → ∞).
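Since everything is closed-form in this conjugate example, the thermodynamic-integration identity can be verified numerically end to end (the data values below are my own illustrative choice):

```python
import math

# conjugate toy from the slide: y_i ~ N(θ, 1), θ ~ N(μ, τ²)
y = [1.2, -0.3, 0.8, 2.1, 0.5]            # illustrative data (my choice)
N = len(y)
mu, tau2 = 0.0, 1.0
ybar = sum(y) / N
s2 = sum((v - ybar) ** 2 for v in y) / N

def expected_loglik(t):
    """E_{θ|y,t}[log f(y|θ)] in closed form under the power posterior at t."""
    prec = N * t + 1.0 / tau2
    mu_t = (N * t * ybar + mu / tau2) / prec
    tau2_t = 1.0 / prec
    return -N / 2 * (math.log(2 * math.pi) + s2 + (ybar - mu_t) ** 2 + tau2_t)

# thermodynamic integration on the powered grid t_i = (i/n)^c
n, c = 1000, 3
ts = [(i / n) ** c for i in range(n + 1)]
Es = [expected_loglik(t) for t in ts]
log_m = sum(0.5 * (ts[i + 1] - ts[i]) * (Es[i] + Es[i + 1]) for i in range(n))

# exact marginal for this conjugate model, for comparison
v = 1 / N + tau2
log_m_true = (-N / 2 * math.log(2 * math.pi) - N * s2 / 2
              + 0.5 * math.log(2 * math.pi / N)
              - 0.5 * math.log(2 * math.pi * v) - (ybar - mu) ** 2 / (2 * v))
```

With a fine grid the only error left is the trapezoidal discretization, which is negligible here; the powered spacing concentrates points near t = 0, where the integrand varies fastest.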
17. PP: example (cont.)
18. KL distance prior-posterior
    KL(π(θ|y), π(θ)) = ∫ ln[π(θ|y)/π(θ)] π(θ|y) dθ = ∫ ln[f(y|θ) π(θ) / (m(y) π(θ))] π(θ|y) dθ = E_{θ|y}[ln f(y|θ)] − ln m(y).
    Hence −2 KL = D̄ − D_m (a by-product of PP), i.e. D_m = D̄ + 2 KL. Compare with DIC = D̄ + p_D, where p_D = D̄ − D(θ̄) measures model complexity.
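The identity D_m = D̄ + 2 KL can be checked exactly on a conjugate toy where both terms are in closed form (my own illustration: y_i ~ N(θ, 1), θ ~ N(0, 1), illustrative data):

```python
import math

# conjugate toy: y_i ~ N(θ, 1), θ ~ N(0, 1) (illustrative data, my choice)
y = [1.2, -0.3, 0.8, 2.1, 0.5]
N = len(y)
ybar = sum(y) / N
s2 = sum((v - ybar) ** 2 for v in y) / N

# posterior θ | y ~ N(m1, v1)
v1 = 1.0 / (N + 1.0)
m1 = N * ybar / (N + 1.0)

# KL(posterior || prior) between the two Gaussians
kl = 0.5 * (math.log(1.0 / v1) + v1 + m1 ** 2 - 1.0)

# posterior mean deviance D-bar = -2 E_{θ|y}[log f(y|θ)] (closed form)
Dbar = N * (math.log(2 * math.pi) + s2 + (ybar - m1) ** 2 + v1)

Dm = Dbar + 2 * kl          # marginal deviance: equals -2 log m(y)
```

Since KL ≥ 0, the marginal deviance always penalizes the posterior mean deviance by twice the prior-to-posterior information gain, which is what makes D_m, unlike DIC, so sensitive to the prior.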
19. PP & partial BF
    1) If π(θ) is improper, the marginal m(y) is also improper, resulting in problems for defining the BF. 2) The BF is highly sensitive to the priors (a sensitivity that does not vanish with increasing sample size).
    Idea behind the partial BF (Lempers, 1971): split y = (y_P, y_T) into a learning (pilot) sample y_P used to tune the prior and a testing sample y_T used for the data analysis. Intrinsic BF (Berger & Pericchi, 1996); fractional BF (O'Hagan, 1995).
20. Fractional BF
    A fraction b of the likelihood is used to tune the prior, f(y_P|θ) ≈ f(y|θ)^b with b = m/N < 1 (O'Hagan, 1995), resulting in π(θ, b) ∝ f(y|θ)^b π(θ).
21. PP & fractional BF
    With π(θ, b) ∝ f(y|θ)^b π(θ), the fractional marginal is m_F(y, b) = ∫ f(y|θ)^{1−b} π(θ, b) dθ = ∫ f(y|θ) π(θ) dθ / ∫ f(y|θ)^b π(θ) dθ = m(y, 1) / m(y, b).
    PP directly provides π(θ, b), via π(θ|y, t = b), and log m_F(y, b) = ∫_b^1 E_{θ|y,t}[log f(y|θ)] dt.
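This link between PP and the fractional marginal can be verified numerically on a conjugate toy where z_b is available in closed form (my own illustration; b = 0.125 matches the fraction used later on the slides):

```python
import math

# conjugate toy: y_i ~ N(θ, 1), θ ~ N(0, 1) (illustrative data, my choice)
y = [1.2, -0.3, 0.8, 2.1, 0.5]
N = len(y)
ybar = sum(y) / N
s2 = sum((v - ybar) ** 2 for v in y) / N
L2PI = math.log(2 * math.pi)

def E_loglik(t):
    """E_{θ|y,t}[log f(y|θ)] in closed form for the power posterior at t."""
    prec = N * t + 1.0
    mu_t = N * t * ybar / prec
    return -N / 2 * (L2PI + s2 + (ybar - mu_t) ** 2 + 1.0 / prec)

def log_z(b):
    """log z_b = log ∫ f(y|θ)^b π(θ) dθ, closed form for this model."""
    v = 1.0 / (b * N) + 1.0
    return (-b * N / 2 * L2PI - b * N * s2 / 2
            + 0.5 * math.log(2 * math.pi / (b * N))
            - 0.5 * math.log(2 * math.pi * v) - ybar ** 2 / (2 * v))

b = 0.125
# PP route: log m_F(y, b) = ∫_b^1 E_{θ|y,t}[log f(y|θ)] dt (trapezoidal rule)
n = 2000
ts = [b + (1 - b) * i / n for i in range(n + 1)]
pp = sum(0.5 * (ts[i + 1] - ts[i]) * (E_loglik(ts[i]) + E_loglik(ts[i + 1]))
         for i in range(n))
direct = log_z(1.0) - log_z(b)   # m_F(y, b) = z_1 / z_b
```

The two routes agree to numerical precision: truncating the thermodynamic integral at t = b is exactly the fractional-BF correction.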
22. PP: algorithm
    MCMC with a discretization of t on [0, 1]: t₀ = 0 < t₁ < … < t_i < … < t_{n−1} < t_n = 1, e.g. t_i = (i/n)^c with i = 1, …, n; n = 20-100; c = 2-5.
    1) Make MCMC draws θ^(g_i) from π(θ|y, t_i).
    2) Compute Ê_{θ|y,t=t_i}[log p(y|θ)] = (1/G) Σ_{g=1}^G log p(y|θ^(g_i)). Often conditional independence applies, log p(y|θ) = Σ_{i=1}^N log p(y_i|θ), e.g. if θ is the closest stochastic parent of y = (y_i) (as for DIC).
    3) Approximate the integral, e.g. by the trapezoidal rule: log m̂(y) = ½ Σ_{i=0}^{n−1} (t_{i+1} − t_i)(Ê_i + Ê_{i+1}).
    For the error due to this numerical approximation, see Calderhead & Girolami (2009); for a formula for the MC sampling error, see Friel & Pettitt.
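The three steps can be sketched end to end on a conjugate toy (my own illustration). Because π(θ|y, t) is Gaussian here, step 1 uses exact draws per temperature; in a real model that line would be an MCMC run:

```python
import math
import random

random.seed(4)

# conjugate toy: y_i ~ N(θ, 1), θ ~ N(0, 1) (illustrative data, my choice)
y = [1.2, -0.3, 0.8, 2.1, 0.5]
N = len(y)
ybar = sum(y) / N

def loglik(th):
    """Joint log-likelihood log f(y|θ)."""
    return -N / 2 * math.log(2 * math.pi) - 0.5 * sum((v - th) ** 2 for v in y)

n, c, G = 50, 3, 10_000
ts = [(i / n) ** c for i in range(n + 1)]       # powered temperature ladder
E_hat = []
for t in ts:
    prec = N * t + 1.0                          # power-posterior precision at t
    mu_t, sd_t = N * t * ybar / prec, math.sqrt(1.0 / prec)
    # step 1: draws from π(θ|y, t) (exact here; MCMC in general)
    draws = [random.gauss(mu_t, sd_t) for _ in range(G)]
    # step 2: per-temperature estimate of E[log f(y|θ)]
    E_hat.append(sum(loglik(th) for th in draws) / G)

# step 3: trapezoidal rule over the ladder
log_m = sum(0.5 * (ts[i + 1] - ts[i]) * (E_hat[i] + E_hat[i + 1]) for i in range(n))
```

The powered schedule (c > 1) puts most of the ladder near t = 0, where the integrand moves fastest; with a uniform grid the same n would give a visibly biased estimate.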
23. PP: little toy example
    0) y_i|λ_i ~ id P(λ_i x_i), i.e. f(y_i|λ_i) = (λ_i x_i)^{y_i} exp(−λ_i x_i) / y_i!
    1) λ_i ~ id G(α, β), i.e. π(λ_i) = β^α λ_i^{α−1} exp(−βλ_i) / Γ(α)
    0 + 1) y_i ~ id NegBin(α, p_i), where p_i = β / (β + x_i).
    Direct approach: f(y_i) = [Γ(y_i + α) / (Γ(α) y_i!)] p_i^α (1 − p_i)^{y_i}, so that ln f(y) = −n ln Γ(α) + Σ_{i=1}^n ln Γ(y_i + α) − Σ_{i=1}^n ln(y_i!) + α Σ_{i=1}^n ln p_i + Σ_{i=1}^n y_i ln(1 − p_i).
    Indirect approach: f(y) = ∏_{i=1}^n ∫ f(y_i|λ_i) π(λ_i) dλ_i.
24. PP: little toy example (cont.)
    Pump data: Ex#2 in WinBUGS, Carlin & Louis (p. 126). y = number of failures of pumps over operating time x (10³ hrs):
    y = (5, 1, 5, 14, 3, 19, 1, 1, 4, 22); x = (94.3, 15.7, 62.9, 126, 5.24, 31.4, 1.05, 1.05, 2.1, 10.5); n = 10; α = β = 1.
    Direct: D = −2 ln f(y) = 66.03; power posteriors (20 points): D̂_FP = 66.28 ± 0.03.
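The direct value quoted above can be reproduced in a few lines from the negative-binomial form of the previous slide (pump data as given on the slide):

```python
import math

# pump data from the slide (Ex#2 in WinBUGS): y = failures, x = time (10^3 h)
y = [5, 1, 5, 14, 3, 19, 1, 1, 4, 22]
x = [94.3, 15.7, 62.9, 126, 5.24, 31.4, 1.05, 1.05, 2.1, 10.5]
alpha = beta = 1.0

# marginalizing λ_i ~ G(α, β) gives y_i ~ NegBin(α, p_i) with p_i = β/(β + x_i)
log_f = 0.0
for yi, xi in zip(y, x):
    p = beta / (beta + xi)
    log_f += (math.lgamma(yi + alpha) - math.lgamma(alpha) - math.lgamma(yi + 1)
              + alpha * math.log(p) + yi * math.log(1 - p))

D = -2 * log_f            # marginal deviance; the slide reports 66.03
```

This exact benchmark is what makes the toy useful: the PP estimate (66.28 ± 0.03 with 20 points) can be judged against it directly.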
25. PP: toy example in OpenBUGS
26. PP: toy example in OpenBUGS (cont.)
27. PP: toy example in OpenBUGS (cont.)
28. PP: toy example in OpenBUGS (cont.)
29. Sampling both θ and t
    log m(y) = ∫₀¹ E_{θ|y,t}[log f(y|θ)] dt can be rewritten as log m(y) = ∫∫ [log f(y|θ) / p(t)] π(θ|y, t) p(t) dθ dt = E_{θ,t|y}[log f(y|θ) / p(t)], with joint density π(θ, t|y) = π(θ|y, t) p(t). Since π(θ|y, t) ∝ f(y|θ)^t π(θ), assuming p(t) ∝ z_t(y) gives π(t|θ, y) ∝ f(y|θ)^t. Sampling (θ, t) under such conditions gives poor estimation (too few draws of t close to 0).
30. Example 1: Potthoff & Roy's data
    Growth measurements in 11 girls and 16 boys (Potthoff & Roy, 1964; Little & Rubin, 1987): distance from the centre of the pituitary to the pteryomaxillary fissure (unit: 10⁻⁴ m), at ages 8, 10, 12 and 14 years. NA: age-10 value missing in the Little & Rubin (1987) incomplete-data version.

    | Girl | 8 | 10 | 12 | 14 |
    |---|---|---|---|---|
    | 1 | 210 | 200 | 215 | 230 |
    | 2 | 210 | 215 | 240 | 255 |
    | 3 | 205 | NA | 245 | 260 |
    | 4 | 235 | 245 | 250 | 265 |
    | 5 | 215 | 230 | 225 | 235 |
    | 6 | 200 | NA | 210 | 225 |
    | 7 | 215 | 225 | 230 | 250 |
    | 8 | 230 | 230 | 235 | 240 |
    | 9 | 200 | NA | 220 | 215 |
    | 10 | 165 | NA | 190 | 195 |
    | 11 | 245 | 250 | 280 | 280 |

    | Boy | 8 | 10 | 12 | 14 |
    |---|---|---|---|---|
    | 1 | 260 | 250 | 290 | 310 |
    | 2 | 215 | NA | 230 | 265 |
    | 3 | 230 | 225 | 240 | 275 |
    | 4 | 255 | 275 | 265 | 270 |
    | 5 | 200 | NA | 225 | 260 |
    | 6 | 245 | 255 | 270 | 285 |
    | 7 | 220 | 220 | 245 | 265 |
    | 8 | 240 | 215 | 245 | 255 |
    | 9 | 230 | 205 | 310 | 260 |
    | 10 | 275 | 280 | 310 | 315 |
    | 11 | 230 | 230 | 235 | 250 |
    | 12 | 215 | NA | 240 | 280 |
    | 13 | 170 | NA | 260 | 295 |
    | 14 | 225 | 255 | 255 | 260 |
    | 15 | 230 | 245 | 260 | 300 |
    | 16 | 220 | NA | 235 | 250 |
31. Model comparison on Potthoff's data
    i: subscript for the individual, i = 1, …, I = 25 (11 girls + 16 boys); j: subscript for the measurement at age t_j (8, 10, 12, 14 yrs); x_i: sex indicator.
    1) Purely fixed model: y_ij = (α₀ + α x_i) + (β₀ + β x_i)(t_j − 8) + e_ij (intercept and slope).
    2) Random intercept model: y_ij = (α₀ + α x_i + a_i) + (β₀ + β x_i)(t_j − 8) + e_ij.
    3) Random intercept & slope model, independent effects: y_ij = (α₀ + α x_i + a_i) + (β₀ + β x_i + b_i)(t_j − 8) + e_ij, i.e. y_ij = φ_{i1} + φ_{i2}(t_j − 8) + e_ij with y_ij ~ id N(η_ij, σ_e²) and φ_i = (φ_{i1}, φ_{i2})′ ~ N((α₀ + α x_i, β₀ + β x_i)′, diag(σ_a², σ_b²)).
    4) Random intercept & slope model, correlated effects: φ_i ~ N((α₀ + α x_i, β₀ + β x_i)′, Σ), with Σ = [σ_a², σ_ab; σ_ab, σ_b²].
32. Model presentation: hierarchical Bayes
    1st level: y_ij ~ id N(η_ij, σ_e²), with η_ij = φ_{i1} + φ_{i2}(t_j − 8).
    2nd level: 2a) φ_i = (φ_{i1}, φ_{i2})′ ~ N((α₀ + α x_i, β₀ + β x_i)′, Σ), with Σ = [σ_a², σ_ab; σ_ab, σ_b²]; 2b) σ_e ~ U(0, Δ_e) or σ_e² ~ InvG(1, σ̃_e²).
    3rd level: fixed effects α₀, α, β₀, β ~ U(inf, sup); variance (covariance) components:
    - if σ_ab = 0: either i) σ_a ~ U(0, Δ_a), and similarly σ_b ~ U(0, Δ_b); or ii) σ_a² ~ InvG(1, σ̃_a²), and similarly σ_b² ~ InvG(1, σ̃_b²);
    - if σ_ab ≠ 0: either i) σ_a ~ U(0, Δ_a), σ_b ~ U(0, Δ_b), ρ ~ U(−1, 1); or ii) Ω ~ W((νΣ̃)^{−1}, ν) for Ω = Σ^{−1}, with ν = dim(Ω) + 1 and Σ̃ a known location parameter (take care: WinBUGS uses another notation, i.e. W(νΣ̃, ν)).
33. Results
34. Results: fractional priors (b = 0 vs 0.125)
35. Example 2: models of genetic differentiation
    Two-level hierarchical model. i = locus; j = (sub)population; y_ij = number of genes carrying a given allele at locus i in population j (out of n_ij sampled); α_ij = frequency of that allele at locus i in population j.
    0) y_ij|α_ij ~ id B(n_ij, α_ij);
    1) α_ij|π_i, c_j ~ id Beta(τ_j π_i, τ_j(1 − π_i)), with τ_j = (1 − c_j)/c_j, where c_j is a differentiation index and π_i the frequency of that allele at locus i in the gene pool;
    2) π_i ~ id Beta(a_π, b_π), c_j ~ id Beta(a_c, b_c).
    Migration-drift model at equilibrium (Balding).
36. Ex2: Nicholson's model
    Nicholson et al. (2002): same as previously, but
    1) α_ij|π_i, c_j ~ id N(π_i, c_j π_i(1 − π_i)), a truncated normal with point masses at 0 and 1, so that y_ij|α_ij ~ id B(n_ij, α*_ij), with α*_ij = max(0, min(1, α_ij));
    2) π_i ~ id Beta(a_π, b_π), c_j ~ id Beta(a_c, b_c).
    A pure drift model.
37. Results
38. Conclusion
    Power posteriors are derived from thermodynamic integration and linked with "path sampling". The method is easy to understand, quite general and well suited to complex hierarchical models. The "thetas" can be defined as the closest stochastic parents of the data, making the latter conditionally independent. Draws are made only from posterior distributions. The method gives the fractional BF as a by-product. It is easy to implement (including in OpenBUGS) but time consuming. Caution is needed in the discretization of t (close to 0).
39. Some references
    - Chen M, Shao Q, Ibrahim J (2000) Monte Carlo methods in Bayesian computation. Springer.
    - Chib S (1995) Marginal likelihood from the Gibbs output. JASA, 90, 1313-1321.
    - Chopin N, Robert CP (2010) Properties of nested sampling. Biometrika, 97, 741-755.
    - Friel N, Pettitt AN (2008) Marginal likelihood estimation via power posteriors. JRSS B, 70, 589-607.
    - Frühwirth-Schnatter S (2004) Estimating marginal likelihoods from mixtures & Markov switching models using bridge sampling techniques. Econometrics Journal, 7, 143-167.
    - Gelman A, Meng X-L (1998) Simulating normalizing constants: from importance sampling to bridge sampling and path sampling. Statistical Science, 13, 163-185.
    - Lartillot N, Philippe H (2006) Computing Bayes factors using thermodynamic integration. Systematic Biology, 55, 195-207.
    - Marin JM, Robert CP (2009) Importance sampling methods for Bayesian discrimination between embedded models. arXiv:0910.2325v1.
    - Meng X-L, Wong WH (1996) Simulating ratios of normalizing constants via a simple identity: a theoretical exploration. Statistica Sinica, 6, 831-860.
    - O'Hagan A (1995) Fractional Bayes factors for model comparison. JRSS B, 57, 99-138.
40. Acknowledgements
    Nial Friel (University College Dublin) for his interest in these applications and his invaluable explanations and suggestions; Tony O'Hagan for further insight into the FBF; Gilles Celeux and Mathieu Gautier as co-advisors of the Master's dissertation of Yoan Soussan (Paris VI); Christian Robert for his blog and his relevant comments, standpoints and bibliographical references; the Applibugs & Babayes groups for stimulating discussions on DIC, BF, CPO and other information criteria (AIC, BIC).