Bayesian model choice in cosmology

Talk at JSM 2010, Vancouver, B.C.


  1. Bayesian Model Comparison in Cosmology with Population Monte Carlo
     Monthly Notices Royal Astronomical Soc. 405 (4), 2381-2390, 2010
     Christian P. Robert, Université Paris Dauphine & CREST
     http://www.ceremade.dauphine.fr/~xian
     Joint works with D., Benabed K., Cappé O., Cardoso J.F., Fort G., Kilbinger M., [Marin J.-M., Mira A.,] Prunet S., Wraith D.
  2. Outline
     1 Cosmology background
     2 Importance sampling
     3 Application to cosmological data
     4 Evidence approximation
     5 Cosmology models
     6 Lexicon
  3. Cosmology background / Cosmology. A large part of the data used to answer some of the major questions in cosmology comes from studying the Cosmic Microwave Background (CMB) radiation (fossil heat released circa 380,000 years after the Big Bang). The CMB is remarkably uniform: only very sensitive instruments such as WMAP (NASA, 2001) can detect fluctuations in the CMB temperature, e.g. minute variations where one part of the sky has a temperature of 2.7251 Kelvin (degrees above absolute zero) while another part has a temperature of 2.7249 Kelvin.
  4. Cosmology background / Cosmology (cont'd). [Figure: CMB dataset, two panels.] [Marin & CPR, Bayesian Core, 2007]
  5. Cosmology background / Planck. Temperature variations are related to fluctuations in the density of matter in the early universe and thus carry information about the initial conditions for the formation of cosmic structures such as galaxies, clusters, and voids.
     Planck: joint mission between the European Space Agency (ESA) and NASA, launched in May 2009. The Planck mission plans to provide datasets of nearly 5 × 10^10 observations to settle many open questions with CMB temperature data. Rather than scalar-valued observations, Planck will provide tensor-valued data and is thus likely to open up this area of statistical research as well.
  8. Cosmology background / Some questions in cosmology
     • Will the universe expand forever, or will it collapse?
     • Is the universe dominated by exotic dark matter and what is its concentration?
     • What is the shape of the universe?
     • Is the expansion of the universe accelerating rather than decelerating?
     • Is the “flat ΛCDM paradigm” appropriate or is the curvature different from zero?
     [Adams, The Guide [a.k.a. H2G2], 1979]
  9. Cosmology background / Statistical problems in cosmology
     • Potentially high-dimensional parameter space [Not considered here]
     • Immensely slow computation of likelihoods, e.g. WMAP, CMB, because of numerically costly spectral transforms [Data is a Fortran program]
     • Nonlinear dependence and degeneracies between parameters introduced by physical constraints or theoretical assumptions
     [Figure: two panels illustrating parameter degeneracies, w0 against Ωm (left) and α against −M (right).]
  10. Importance sampling / Importance sampling solutions
      1 Cosmology background
      2 Importance sampling
          Adaptive importance sampling
          Adaptive multiple importance sampling
      3 Application to cosmological data
      4 Evidence approximation
      5 Cosmology models
      6 Lexicon
  11. Importance sampling / Importance sampling 101. Importance sampling is based on the fundamental identity
      $$\pi(f) = \int f(x)\,\pi(x)\,dx = \int f(x)\,\frac{\pi(x)}{q(x)}\,q(x)\,dx .$$
      If $x_1, \ldots, x_N$ are drawn independently from $q$,
      $$\hat\pi(f) = \sum_{n=1}^{N} f(x_n)\,\bar w_n, \qquad \bar w_n = \frac{\pi(x_n)/q(x_n)}{\sum_{m=1}^{N} \pi(x_m)/q(x_m)},$$
      provides a converging approximation to $\pi(f)$ (independent of the normalisation of $\pi$).
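The self-normalised estimator above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the target (an unnormalised N(1, 1) density), the proposal (N(0, 2²)) and the function name `self_normalised_is` are hypothetical choices, not taken from the talk.

```python
import numpy as np

def self_normalised_is(f, log_target, sample_q, log_q, n, rng):
    """Estimate pi(f) from n draws of q using normalised importance weights."""
    x = sample_q(n, rng)
    log_w = log_target(x) - log_q(x)      # log of pi(x_n)/q(x_n), up to a constant
    w = np.exp(log_w - log_w.max())       # subtract the max for numerical stability
    w_bar = w / w.sum()                   # normalised weights; the constant cancels
    return np.sum(w_bar * f(x))

# Hypothetical example: unnormalised N(1, 1) target, N(0, 2^2) proposal.
rng = np.random.default_rng(0)
log_target = lambda x: -0.5 * (x - 1.0) ** 2
sample_q = lambda n, rng: rng.normal(0.0, 2.0, size=n)
log_q = lambda x: -0.5 * (x / 2.0) ** 2 - np.log(2.0) - 0.5 * np.log(2 * np.pi)

estimate = self_normalised_is(lambda x: x, log_target, sample_q, log_q, 200_000, rng)
```

Because the weights are normalised, the unknown normalising constant of the target never has to be computed.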
  12. Importance sampling / Adaptive importance sampling / Initialising importance sampling. PMC/AIS offers a solution to the difficulty of picking $q$ through adaptivity: given a target $\pi$, PMC produces a sequence $q^t$ of importance functions ($t = 1, \ldots, T$) aimed at approximating $\pi$. The first sample is produced by a regular importance sampling scheme, $x_1^1, \ldots, x_N^1 \sim q^1$, associated with importance weights
      $$w_n^1 = \frac{\pi(x_n^1)}{q^1(x_n^1)}$$
      and their normalised counterparts $\bar w_n^1$, providing a first approximation to a sample from $\pi$. Moments of $\pi$ can then be approximated to construct an updated importance function $q^2$, &c.
  13. Importance sampling / Adaptive importance sampling / Optimality criterion? The quality of approximation can be measured in terms of the Kullback divergence from the target,
      $$D(\pi \,\|\, q^t) = \int \log\!\left(\frac{\pi(x)}{q^t(x)}\right) \pi(x)\,dx,$$
      and the density $q^t$ can be adjusted incrementally to minimise this divergence.
  14. Importance sampling / Adaptive importance sampling / PMC: some papers
      • Cappé et al (2004) - J. Comput. Graph. Stat.: outline of Population Monte Carlo but missed main point
      • Celeux et al (2005) - Comput. Stat. & Data Analysis: Rao-Blackwellisation for importance sampling and missing data problems
      • Douc et al (2007) - ESAIM Prob. & Stat. and Annals of Statistics: convergence issues, proving adaptation is positive where q is a mixture density of random-walk proposals (mixture weights varied)
      • Cappé et al (2007) - Stat. & Computing: adaptation of q (mixture density of independent proposals), where weights and parameters vary
      • Wraith et al (2009) - Physical Review D: application of Cappé et al (2007) to cosmology and comparison with MCMC
      • Beaumont et al (2009) - Biometrika: application of Cappé et al (2007) to ABC settings
      • Kilbinger et al (2010) - Month. N. Royal Astro. Soc.: application of Cappé et al (2007) to model choice in cosmology
  15. Importance sampling / Adaptive importance sampling / Adaptive importance sampling (2). Use of mixture densities
      $$q^t(x) = q(x; \alpha^t, \theta^t) = \sum_{d=1}^{D} \alpha_d^t\,\varphi(x; \theta_d^t)$$
      [West, 1993], where
      • $\alpha^t = (\alpha_1^t, \ldots, \alpha_D^t)$ is a vector of adaptable weights for the $D$ mixture components,
      • $\theta^t = (\theta_1^t, \ldots, \theta_D^t)$ is a vector of parameters which specify the components,
      • $\varphi$ is a parameterised density (usually taken to be multivariate Gaussian or Student-t, the latter preferred).
  16. Importance sampling / Adaptive importance sampling / Cappé et al (2007) optimal scheme. Update $q^t$ using an integrated EM approach minimising the KL divergence at each iteration,
      $$D(\pi \,\|\, q^t) = \int \log\!\left(\frac{\pi(x)}{\sum_{d=1}^{D} \alpha_d^t\,\varphi(x; \theta_d^t)}\right) \pi(x)\,dx,$$
      equivalent to maximising
      $$\ell(\alpha, \theta) = \int \log\!\left(\sum_{d=1}^{D} \alpha_d\,\varphi(x; \theta_d)\right) \pi(x)\,dx$$
      in $\alpha, \theta$.
  17. Importance sampling / Adaptive importance sampling / PMC updates. Maximisation of $L^t(\alpha, \theta)$ leads to closed-form solutions in exponential families (and for the t distributions). For instance, for $N_p(\mu_d, \Sigma_d)$:
      $$\alpha_d^{t+1} = \int \rho_d(x; \alpha^t, \mu^t, \Sigma^t)\,\pi(x)\,dx,$$
      $$\mu_d^{t+1} = \frac{\int x\,\rho_d(x; \alpha^t, \mu^t, \Sigma^t)\,\pi(x)\,dx}{\alpha_d^{t+1}},$$
      $$\Sigma_d^{t+1} = \frac{\int (x - \mu_d^{t+1})(x - \mu_d^{t+1})^{T}\,\rho_d(x; \alpha^t, \mu^t, \Sigma^t)\,\pi(x)\,dx}{\alpha_d^{t+1}}.$$
  18. Importance sampling / Adaptive importance sampling / Empirical updates. And empirical versions,
      $$\alpha_d^{t+1} = \sum_{n=1}^{N} \bar w_n^t\,\rho_d(x_n^t; \alpha^t, \mu^t, \Sigma^t),$$
      $$\mu_d^{t+1} = \frac{\sum_{n=1}^{N} \bar w_n^t\,x_n^t\,\rho_d(x_n^t; \alpha^t, \mu^t, \Sigma^t)}{\alpha_d^{t+1}},$$
      $$\Sigma_d^{t+1} = \frac{\sum_{n=1}^{N} \bar w_n^t\,(x_n^t - \mu_d^{t+1})(x_n^t - \mu_d^{t+1})^{T}\,\rho_d(x_n^t; \alpha^t, \mu^t, \Sigma^t)}{\alpha_d^{t+1}}.$$
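The empirical updates can be sketched as one NumPy function for a Gaussian mixture proposal. The slides do not define ρ_d; the sketch assumes it is the usual posterior probability that a point belongs to component d, ρ_d(x) = α_d φ(x; μ_d, Σ_d) / Σ_j α_j φ(x; μ_j, Σ_j). The names `log_gauss` and `pmc_update` are illustrative.

```python
import numpy as np

def log_gauss(x, mu, Sigma):
    """Log density of N(mu, Sigma) at the rows of x (shape (N, p))."""
    p = x.shape[1]
    L = np.linalg.cholesky(Sigma)
    z = np.linalg.solve(L, (x - mu).T)                  # whitened residuals, (p, N)
    return -0.5 * (z ** 2).sum(axis=0) - np.log(np.diag(L)).sum() - 0.5 * p * np.log(2 * np.pi)

def pmc_update(x, w_bar, alpha, mu, Sigma):
    """One empirical PMC update of mixture weights, means and covariances.

    x: (N, p) sample, w_bar: (N,) normalised importance weights,
    alpha: (D,), mu: (D, p), Sigma: (D, p, p) current mixture parameters.
    """
    D = len(alpha)
    # Assumed definition of rho_d: component membership probabilities.
    log_comp = np.stack(
        [np.log(alpha[d]) + log_gauss(x, mu[d], Sigma[d]) for d in range(D)], axis=1
    )
    log_comp -= log_comp.max(axis=1, keepdims=True)
    rho = np.exp(log_comp)
    rho /= rho.sum(axis=1, keepdims=True)

    wr = w_bar[:, None] * rho                           # w_bar_n * rho_d(x_n), (N, D)
    new_alpha = wr.sum(axis=0)
    new_mu = (wr.T @ x) / new_alpha[:, None]
    new_Sigma = np.empty_like(Sigma)
    for d in range(D):
        c = x - new_mu[d]
        new_Sigma[d] = (wr[:, d, None] * c).T @ c / new_alpha[d]
    return new_alpha, new_mu, new_Sigma
```

A component whose updated weight reaches zero would cause a division by zero here; a practical implementation would need to drop or reinitialise such components.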
  19. Importance sampling / Adaptive importance sampling / Banana benchmark. Twisted $N_p(0, \Sigma)$ target with $\Sigma = \mathrm{diag}(\sigma_1^2, 1, \ldots, 1)$, changing the second co-ordinate $x_2$ to $x_2 + b(x_1^2 - \sigma_1^2)$.
      [Figure: scatter plot of $x_2$ against $x_1$ for the banana-shaped target.]
      $p = 10$, $\sigma_1^2 = 100$, $b = 0.03$ [Haario et al. 1999]
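A minimal sketch of the banana log density follows. It assumes the twist is applied inside the Gaussian density, i.e. the target at x is the N_p(0, Σ) density evaluated at (x_1, x_2 + b(x_1² − σ_1²), x_3, …, x_p); the Jacobian of this map is 1, so the density stays normalised. The function name `log_banana` is hypothetical.

```python
import numpy as np

def log_banana(x, b=0.03, sigma1_sq=100.0):
    """Log density of the twisted Gaussian at the rows of x (shape (N, p), p >= 2)."""
    x = np.atleast_2d(np.asarray(x, dtype=float))
    p = x.shape[1]
    y2 = x[:, 1] + b * (x[:, 0] ** 2 - sigma1_sq)       # twisted second coordinate
    quad = x[:, 0] ** 2 / sigma1_sq + y2 ** 2 + (x[:, 2:] ** 2).sum(axis=1)
    return -0.5 * quad - 0.5 * np.log(sigma1_sq) - 0.5 * p * np.log(2 * np.pi)
```

With b = 0.03 and σ_1² = 100, the point (0, 3, 0, …, 0) maps to the origin of the untwisted Gaussian, which is where the density peaks.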
  20. Importance sampling / Adaptive importance sampling / Simulation. [Figure: six scatter-plot panels on the banana benchmark.]
  21. Importance sampling / Adaptive importance sampling / Monitoring by perplexity. Stop iterations when further adaptations do not improve $D(\pi \,\|\, q^t)$. The transform $\exp[-D(\pi \,\|\, q^t)]$ may be estimated by the normalised perplexity $p = \exp(H_N^t)/N$, where
      $$H_N^t = -\sum_{n=1}^{N} \bar w_n^t \log \bar w_n^t$$
      is the Shannon entropy of the normalised weights. Thus, minimisation of the Kullback divergence can be approximately connected with the maximisation of the (normalised) perplexity, values closer to 1 indicating good agreement between $q$ and $\pi$.
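The normalised perplexity is a one-liner once the normalised weights are available. The sketch below follows the formula on the slide; the function name is illustrative, and zero weights are skipped on the convention that 0 log 0 = 0.

```python
import numpy as np

def normalised_perplexity(w_bar):
    """exp(Shannon entropy of the normalised weights) / N, a value in (0, 1]."""
    w_bar = np.asarray(w_bar, dtype=float)
    nz = w_bar[w_bar > 0]                 # 0 * log(0) is taken as 0
    entropy = -np.sum(nz * np.log(nz))
    return np.exp(entropy) / w_bar.size
```

Uniform weights give a value of 1, while a sample dominated by a single point gives 1/N.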
  22. Importance sampling / Adaptive importance sampling / Monitoring by ESS. A second criterion is the effective sample size (ESS),
      $$\mathrm{ESS}_N^t = \left(\sum_{n=1}^{N} \left(\bar w_n^t\right)^2\right)^{-1},$$
      which can be interpreted as the number of equivalent iid sample points.
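The ESS criterion can be computed directly from the normalised weights, as in this small sketch (function name illustrative):

```python
import numpy as np

def effective_sample_size(w_bar):
    """Inverse of the sum of squared normalised importance weights."""
    w_bar = np.asarray(w_bar, dtype=float)
    return 1.0 / np.sum(w_bar ** 2)
```

Dividing the result by N gives the normalised ESS (NESS) that the simulation figures report alongside the perplexity.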
  23. Importance sampling / Adaptive importance sampling / Simulation. [Figure: normalised perplexity (top panel) and normalised effective sample size (ESS/N) (bottom panel) estimates for the first 10 iterations of PMC.]
  24. Importance sampling / Adaptive importance sampling / Comparison to MCMC. Adaptive MCMC: the proposal is a multivariate Gaussian with Σ updated based on previous values in the chain. Scale and update times chosen for optimal results.
      [Figure: evolution of $\pi(f_a)$ (top panels) and $\pi(f_b)$ (bottom panels) from 10k points to 100k points for both PMC (left panels) and MCMC (right panels).]
  25. Importance sampling / Adaptive importance sampling / Simulation. [Figure: two rows of panels ($f_c$, $f_e$, $f_h$ on top; $f_d$, $f_g$, $f_i$ below), y-axis "Proportion of points inside", comparing PMC and MCMC for d10, d2 and d1.] Results showing the distributions of the PMC and the MCMC estimates. All estimates are based on 500 simulation runs.
  26. Importance sampling / Adaptive multiple importance sampling. Full recycling:
      • At iteration $t$, design a new proposal $q_t$ based on all previous samples $x_1^1, \ldots, x_N^1, \ldots, x_1^{t-1}, \ldots, x_N^{t-1}$.
      • At each stage, the whole past can be used: if un-normalised weights $\omega_{i,t}$ are preserved along iterations, then all $x_i^t$'s can be pooled together.
  28. Importance sampling / Adaptive multiple importance sampling / Caveat. When using several importance functions at once, $q_0, \ldots, q_T$, with samples $x_1^0, \ldots, x_{N_0}^0, \ldots, x_1^T, \ldots, x_{N_T}^T$ and importance weights $\omega_i^t = \pi(x_i^t)/q_t(x_i^t)$, merging through the empirical distribution
      $$\sum_{t,i} \omega_i^t\,\delta_{x_i^t}(x) \Big/ \sum_{t,i} \omega_i^t \approx \pi(x)$$
      fails to cull poor proposals: very large weights do remain large in the cumulated sample, and poorly performing samples overwhelmingly dominate other samples in the final outcome.
      Raw mixing of importance samples may be harmful, compared with a single sample, even when most proposals are efficient.
  30. Importance sampling / Adaptive multiple importance sampling / Deterministic mixtures. Owen and Zhou (2000) propose a stabilising recycling of the weights via deterministic mixtures, by modifying the importance density $q_t(x_i^t)$ under which $x_i^t$ was truly simulated to a mixture of all the densities that have been used so far,
      $$\frac{1}{\sum_{j=0}^{T} N_j} \sum_{l=0}^{T} N_l\,q_l(x_i^t),$$
      resulting in the deterministic mixture weight
      $$\omega_i^t = \pi(x_i^t) \Bigg/ \frac{1}{\sum_{j=0}^{T} N_j} \sum_{l=0}^{T} N_l\,q_l(x_i^t).$$
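The deterministic mixture weight can be sketched as follows for one-dimensional samples. The helper names (`log_norm`, `deterministic_mixture_weights`) and the two Gaussian proposals in the usage lines are hypothetical; only the weight formula comes from the slide.

```python
import numpy as np

def log_norm(x, mu, sigma):
    """Log density of the univariate N(mu, sigma^2)."""
    return -0.5 * ((x - mu) / sigma) ** 2 - np.log(sigma) - 0.5 * np.log(2 * np.pi)

def deterministic_mixture_weights(samples, log_target, log_qs):
    """Weights pi(x) / [sum_l N_l q_l(x) / sum_j N_j] for the pooled sample.

    samples: list of 1-D arrays, samples[t] drawn from the proposal log_qs[t].
    """
    n = np.array([len(s) for s in samples], dtype=float)
    x = np.concatenate(samples)
    log_q_all = np.stack([lq(x) for lq in log_qs], axis=1)      # (N_total, T + 1)
    # log of the N_l-weighted mixture of all proposal densities at each point
    log_mix = np.logaddexp.reduce(log_q_all + np.log(n / n.sum()), axis=1)
    return np.exp(log_target(x) - log_mix)

# Hypothetical usage: N(0, 1) target, one matched and one over-dispersed proposal.
rng = np.random.default_rng(2)
samples = [rng.normal(0.0, 1.0, 5000), rng.normal(0.0, 3.0, 5000)]
log_qs = [lambda x: log_norm(x, 0.0, 1.0), lambda x: log_norm(x, 0.0, 3.0)]
weights = deterministic_mixture_weights(samples, lambda x: log_norm(x, 0.0, 1.0), log_qs)
```

In this example every weight is bounded by 2, because the mixture density is at least half of the matched proposal; this bounding effect is what stabilises the pooled sample.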
  31. Importance sampling / Adaptive multiple importance sampling / Unbiasedness. Potential to exploit the most efficient proposals in the sequence $Q_0, \ldots, Q_T$ without rejecting any simulated value nor sample. Poorly performing importance functions are simply eliminated through the erosion of their weights
      $$\pi(x_i^t) \Bigg/ \frac{1}{\sum_{j=0}^{T} N_j} \sum_{l=0}^{T} N_l\,q_l(x_i^t)$$
      as $T$ increases. The paradoxical feature of competing acceptable importance weights for the same simulated value is well understood in the cases of Rao-Blackwellisation and of Population Monte Carlo. It is more intricate here in that only unbiasedness remains [fake mixture].
  33. Importance sampling / Adaptive multiple importance sampling / AMIS. AMIS (or Adaptive Multiple Importance Sampling) uses importance sampling functions ($q_t$) that are constructed sequentially and adaptively, using the past $t - 1$ weighted samples.
      i. Weights of all present and past variables $x_i^l$ ($1 \le l \le t$, $1 \le i \le N_l$) are modified, based on the current proposals.
      ii. The entire collection of importance samples is used to build the next importance function.
      [Parallel with IMIS: Raftery & Bo, 2010]
  34. Importance sampling / Adaptive multiple importance sampling / The AMIS algorithm. Adaptive Multiple Importance Sampling: at iteration $t = 1, \ldots, T$,
      1) Independently generate $N_t$ particles $x_i^t \sim q(x \mid \hat\theta^{t-1})$.
      2) For $1 \le i \le N_t$, compute the mixture at $x_i^t$,
         $$\delta_i^t = N_0\,q_0(x_i^t) + \sum_{l=1}^{t} N_l\,q(x_i^t; \hat\theta^{l-1}),$$
         and derive the weight of $x_i^t$,
         $$\omega_i^t = \pi(x_i^t) \Big/ \Big[\delta_i^t \Big/ \sum_{l=0}^{t} N_l\Big].$$
      3) For $0 \le l \le t - 1$ and $1 \le i \le N_l$, actualise past weights as
         $$\delta_i^l = \delta_i^l + N_t\,q(x_i^l; \hat\theta^{t-1}) \quad\text{and}\quad \omega_i^l = \pi(x_i^l) \Big/ \Big[\delta_i^l \Big/ \sum_{j=0}^{t} N_j\Big].$$
      4) Compute the parameter estimate $\hat\theta^t$ based on
         $$\big(\{x_1^0, \omega_1^0\}, \ldots, \{x_{N_0}^0, \omega_{N_0}^0\}, \ldots, \{x_1^t, \omega_1^t\}, \ldots, \{x_{N_t}^t, \omega_{N_t}^t\}\big).$$
      [Cornuet, Marin, Mira & CPR, 2009, arXiv:0907.1254]
  35. Importance sampling / Adaptive multiple importance sampling / Studentised AMIS. When the proposal distribution $q_t$ is a Student's t proposal, $T_3(\mu, \Sigma)$, the mean $\mu$ and covariance $\Sigma$ parameters can be updated by estimating the first two moments of the target distribution $\Pi$,
      $$\hat\mu^t = \frac{\sum_{l=0}^{t} \sum_{i=1}^{N_l} \omega_i^l\,x_i^l}{\sum_{l=0}^{t} \sum_{i=1}^{N_l} \omega_i^l} \quad\text{and}\quad \hat\Sigma^t = \frac{\sum_{l=0}^{t} \sum_{i=1}^{N_l} \omega_i^l\,(x_i^l - \hat\mu^t)(x_i^l - \hat\mu^t)^{T}}{\sum_{l=0}^{t} \sum_{i=1}^{N_l} \omega_i^l},$$
      i.e. using the optimal update of Cappé et al. (2007). Obvious extension to mixtures [and again optimal update of Cappé et al. (2007)].
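The AMIS loop with moment-matching updates can be sketched as below. This is a simplified illustration, not the authors' implementation: it swaps the Student's t proposal for a Gaussian one, uses the same number of particles at every iteration (so the deterministic mixture is a plain average of the proposals used so far), and recomputes all weights from scratch at each step rather than updating the δ's incrementally. All function names and the toy target are hypothetical.

```python
import numpy as np

def log_gauss(x, mu, Sigma):
    """Log density of N(mu, Sigma) at the rows of x (shape (N, p))."""
    p = x.shape[1]
    L = np.linalg.cholesky(Sigma)
    z = np.linalg.solve(L, (x - mu).T)
    return -0.5 * (z ** 2).sum(axis=0) - np.log(np.diag(L)).sum() - 0.5 * p * np.log(2 * np.pi)

def weighted_moments(x, w):
    """Weighted mean and covariance of the pooled sample."""
    w_bar = w / w.sum()
    mu = w_bar @ x
    c = x - mu
    return mu, (w_bar[:, None] * c).T @ c

def amis(log_target, mu0, Sigma0, n_per_iter, n_iter, rng):
    """Gaussian-proposal AMIS sketch with equal sample sizes per iteration."""
    params = [(np.asarray(mu0, dtype=float), np.asarray(Sigma0, dtype=float))]
    batches = []
    for _ in range(n_iter):
        mu, Sigma = params[-1]
        batches.append(rng.multivariate_normal(mu, Sigma, size=n_per_iter))
        x = np.concatenate(batches)
        # Reweight every past and present point under the mixture of all
        # proposals used so far (equal N_t, hence a plain average).
        log_q = np.stack([log_gauss(x, m, S) for m, S in params], axis=1)
        log_mix = np.logaddexp.reduce(log_q, axis=1) - np.log(len(params))
        w = np.exp(log_target(x) - log_mix)
        params.append(weighted_moments(x, w))   # parameters of the next proposal
    return x, w, params[-1]

# Hypothetical toy target: N((3, -2), I) in two dimensions.
rng = np.random.default_rng(3)
target_mean = np.array([3.0, -2.0])
log_target = lambda x: log_gauss(x, target_mean, np.eye(2))
x, w, (mu_hat, Sigma_hat) = amis(log_target, np.zeros(2), 25.0 * np.eye(2), 2000, 5, rng)
```

Recomputing the mixture from scratch costs more proposal evaluations than the incremental update of step 3, but it keeps the sketch short and makes the deterministic mixture explicit.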
  37. Importance sampling / Adaptive multiple importance sampling / Simulations. Same banana benchmark.

      Target function                  p    AMIS      Cappé'07
      E(x_1) = 0                       5    0.06558   0.06879
                                       10   0.06388   0.11051
                                       20   0.09167   0.17912
      E(x_2) = 0                       5    0.10215   0.11583
                                       10   0.21421   0.22557
                                       20   0.25316   0.29087
      sum_{i=3}^{5} E(x_i) = 0         5    0.00478   0.00927
      sum_{i=3}^{10} E(x_i) = 0        10   0.00902   0.02099
      sum_{i=3}^{20} E(x_i) = 0        20   0.01666   0.04208
      var(x_1) = 100                   5    2.60672   3.92650
                                       10   7.06686   7.48877
                                       20   8.20020   9.71725
      var(x_2) = 19                    5    2.10682   2.96132
                                       10   3.76660   5.08474
                                       20   4.85407   5.98031
      sum_{i=3}^{5} var(x_i) = 3       5    0.00645   0.01196
      sum_{i=3}^{10} var(x_i) = 8      10   0.01370   0.02636
      sum_{i=3}^{20} var(x_i) = 18     20   0.04609   0.06424

      Root mean square errors calculated over 10 replications for different target functions and dimensions p.
  38. Importance sampling / Adaptive multiple importance sampling / Simulation (cont'd). [Figure: 10 replicate ESSs for AMIS (left) and PMC (right) for p = 5, 10, 20.]
  39. Importance sampling / Adaptive multiple importance sampling / Simulation (cont'd). [Figure: 10 replicate absolute errors associated with the estimations of $E(x_1)$ (left column), $E(x_2)$ (center column) and $\sum_{i=3}^{p} E(x_i)$ (right column) using AMIS (left in each block) and PMC (right) for p = 5, 10, 20.]
  40. Importance sampling / Adaptive multiple importance sampling / Simulation (cont'd). [Figure: 10 replicate absolute errors associated with the estimations of $\mathrm{var}(x_1)$ (left column), $\mathrm{var}(x_2)$ (center column) and $\sum_{i=3}^{p} \mathrm{var}(x_i)$ (right column) using AMIS (left in each block) and PMC (right) for p = 5, 10, 20.]
  41. Application to cosmological data / Cosmological data. Posterior distribution of cosmological parameters for recent observational data of CMB anisotropies (differences in temperature between directions) [WMAP], SNIa, and cosmic shear. Combination of three likelihoods, some of which are available as public (Fortran) code, and of a uniform prior on a hypercube.
  42. Application to cosmological data / Cosmology parameters. Parameters for the cosmology likelihood (C = CMB, S = SNIa, L = lensing).

      Symbol    Description                    Minimum   Maximum   Experiment
      Ωb        Baryon density                 0.01      0.1       C   L
      Ωm        Total matter density           0.01      1.2       C S L
      w         Dark-energy eq. of state       −3.0      0.5       C S L
      ns        Primordial spectral index      0.7       1.4       C   L
      Δ²_R      Normalization (large scales)                       C
      σ8        Normalization (small scales)                       C   L
      h         Hubble constant                                    C   L
      τ         Optical depth                                      C
      M         Absolute SNIa magnitude                              S
      α         Colour response                                      S
      β         Stretch response                                     S
      a, b, c   Galaxy z-distribution fit                              L

      For WMAP5, σ8 is a deduced quantity that depends on the other parameters.
  43. Application to cosmological data / Adaptation of importance function
  44. Application to cosmological data / Estimates

      Parameter    PMC                        MCMC
      Ωb           0.0432 +0.0027/−0.0024     0.0432 +0.0026/−0.0023
      Ωm           0.254 +0.018/−0.017        0.253 +0.018/−0.016
      τ            0.088 +0.018/−0.016        0.088 +0.019/−0.015
      w            −1.011 ± 0.060             −1.010 +0.059/−0.060
      ns           0.963 +0.015/−0.014        0.963 +0.015/−0.014
      10^9 Δ²_R    2.413 +0.098/−0.093        2.414 +0.098/−0.092
      h            0.720 +0.022/−0.021        0.720 +0.023/−0.021
      a            0.648 +0.040/−0.041        0.649 +0.043/−0.042
      b            9.3 +1.4/−0.9              9.3 +1.7/−0.9
      c            0.639 +0.084/−0.070        0.639 +0.082/−0.070
      −M           19.331 ± 0.030             19.332 +0.029/−0.031
      α            1.61 +0.15/−0.14           1.62 +0.16/−0.14
      −β           −1.82 +0.17/−0.16          −1.82 ± 0.16
      σ8           0.795 +0.028/−0.030        0.795 +0.030/−0.027

      Means and 68% credible intervals using lensing, SNIa and CMB.
  45. Application to cosmological data / Advantage of AIS and PMC?
      • Parallelisation of the posterior calculations: for the cosmological examples, we used up to 100 CPUs on a computer cluster to explore the cosmology posteriors using AIS/PMC, reducing the computational time from several days for MCMC to a few hours using PMC.
      • Low variance of Monte Carlo estimates: for PMC with q closely matched to π, significant reductions in the variance of the Monte Carlo estimates are possible compared to estimates using MCMC. This also translates into a computational saving, with further savings possible by combining samples across iterations.
      • Simple diagnostics of ‘convergence’ (perplexity): for PMC, the perplexity provides a relatively simple measure of sampling adequacy to the target density of interest.
  46. Evidence approximation / Evidence/Marginal likelihood/Integrated likelihood ... Central quantity of interest in (Bayesian) model choice,
      $$E = \int \pi(x)\,dx = \int \frac{\pi(x)}{q(x)}\,q(x)\,dx,$$
      expressed as an expectation under any density $q$ with large enough support. Importance sampling provides a sample $x_1, \ldots, x_N \sim q$ and an approximation of the above integral,
      $$E \approx \frac{1}{N} \sum_{n=1}^{N} w_n,$$
      where the $w_n = \pi(x_n)/q(x_n)$ are the (unnormalised) importance weights.
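The evidence estimate is the average of the unnormalised weights, and is best computed in log space to avoid overflow. In the sketch below the target (5 times a standard normal density, so that E = 5) and the N(0, 2²) proposal are hypothetical choices made only to check the estimator.

```python
import numpy as np

def log_evidence(log_w):
    """log[(1/N) * sum_n w_n] computed stably from the log weights."""
    log_w = np.asarray(log_w, dtype=float)
    return np.logaddexp.reduce(log_w) - np.log(log_w.size)

# Hypothetical check: unnormalised target 5 * N(0, 1), proposal N(0, 2^2).
rng = np.random.default_rng(4)
x = rng.normal(0.0, 2.0, size=100_000)
log_pi = np.log(5.0) - 0.5 * x ** 2 - 0.5 * np.log(2 * np.pi)
log_q = -0.5 * (x / 2.0) ** 2 - np.log(2.0) - 0.5 * np.log(2 * np.pi)
ln_E = log_evidence(log_pi - log_q)
```

Unlike the self-normalised estimator of π(f), this estimate requires the proposal density q to be properly normalised.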
  48. Evidence approximation / Back to the banana ... Centred d-multivariate normal, $x \sim N_d(0, \Sigma)$ with covariance $\Sigma = \mathrm{diag}(\sigma_1^2, 1, \ldots, 1)$, which is slightly twisted in the first two dimensions by changing $x_2$ to $x_2 + \beta(x_1^2 - \sigma_1^2)$, where $\sigma_1^2 = 100$ and $\beta$ controls the degree of curvature. We integrate over the unnormalised target density,
      $$E = \int \pi(\beta)\,f(x \mid \beta, \Sigma)\,d\beta \quad\text{or}\quad E = \int \pi(x \mid \beta, \Sigma)\,dx.$$
  49. Evidence approximation / Simulation results (1). [Figure: two scatter plots of $x_2$ against $x_1$ (top); posterior mean of β and log-evidence after the 10th iteration (bottom).] β unknown.
  50. Evidence approximation / Simulation results (2). [Figure: perplexity and NESS against iteration (top); log-evidence against iteration and log-evidence of the final sample after the 10th iteration (bottom).] β = 0.015 known.
  51. Cosmology models / Back to cosmology questions
      • Standard cosmology is successful in explaining recent observations, such as CMB, SNIa, galaxy clustering, cosmic shear, galaxy cluster counts, and Lyα forest clustering.
      • Flat ΛCDM model with only six free parameters (Ωm, Ωb, h, ns, τ, σ8).
      • Extensions to ΛCDM may be based on independent evidence (massive neutrinos from oscillation experiments), predicted by compelling hypotheses (primordial gravitational waves from inflation) or reflect ignorance about fundamental physics (dynamical dark energy).
      • Testing for dark energy, curvature, and inflationary models.
  53. Cosmology models / Extended models. Focus on the dark-energy equation-of-state parameter, modeled as

      w = −1                  ΛCDM
      w = w0                  wCDM
      w = w0 + w1 (1 − a)     w(z)CDM

      In addition, the curvature parameter ΩK for each of the above is either ΩK = 0 (‘flat’) or ΩK ≠ 0 (‘curved’). This choice represents the simplest models beyond a “cosmological constant” model that are able to explain the observed, recent accelerated expansion of the Universe.
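The three equation-of-state models are nested, which a short sketch makes explicit: ΛCDM is the case w0 = −1, w1 = 0, and wCDM is the case w1 = 0. The function names are illustrative; the relation a = 1/(1 + z) between scale factor and redshift is the standard one and is not stated on the slide.

```python
def scale_factor(z):
    """Scale factor a for redshift z (standard relation a = 1 / (1 + z))."""
    return 1.0 / (1.0 + z)

def w_of_a(a, w0=-1.0, w1=0.0):
    """Dark-energy equation of state w(a) = w0 + w1 * (1 - a).

    Defaults give LambdaCDM (w = -1); w1 = 0 alone gives wCDM.
    """
    return w0 + w1 * (1.0 - a)
```

Today (a = 1) the linear term vanishes and w equals w0 in all three models.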
  54. Cosmology models / Cosmology priors. Prior ranges for dark energy and curvature models. In case of w(a) models, the prior on w1 depends on w0.

      Parameter   Description                  Min.        Max.
      Ωm          Total matter density         0.15        0.45
      Ωb          Baryon density               0.01        0.08
      h           Hubble parameter             0.5         0.9
      ΩK          Curvature                    −1          1
      w0          Constant dark-energy par.    −1          −1/3
      w1          Linear dark-energy par.      −1 − w0     (−1/3 − w0) / (1 − a_acc)
  55. Cosmology models / Cosmology priors (2)
      • Component to the matter-density tensor with w(a) < −1/3 for values of the scale factor a > a_acc = 2/3.
      • To limit the state equation from below, we impose the condition w(a) > −1 for all a, thereby excluding phantom energy.
      • Natural limit on the curvature is that of an empty Universe, i.e. upper boundary on the curvature ΩK = 1.
      • A lower boundary corresponds to an upper limit on the total matter-energy density: ΩK > −1, excluding high-density Universe(s) which are ruled out by the age of the oldest observed objects.
      • An alternative prior on ΩK could be derived from the paradigm of inflation, but most scenarios imply the curvature to be on the order of 10^−60. The likelihood over such a prior on ΩK is essentially flat for any current and future experiments, hence cannot be assessed.
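The dark-energy constraints can be written as a small membership test for the (w0, w1) prior region. The sketch reads the upper bound on w1 from the prior table as (−1/3 − w0)/(1 − a_acc), which is the condition w(a_acc) ≤ −1/3, and the lower bound −1 − w0 as the condition w(0) ≥ −1. The function name is hypothetical.

```python
A_ACC = 2.0 / 3.0   # scale factor above which acceleration is required

def in_dark_energy_prior(w0, w1, a_acc=A_ACC):
    """True if (w0, w1) lies inside the prior region for the w(z)CDM model."""
    if not (-1.0 <= w0 <= -1.0 / 3.0):
        return False
    lower = -1.0 - w0                              # keeps w(a) >= -1 at a = 0
    upper = (-1.0 / 3.0 - w0) / (1.0 - a_acc)      # keeps w(a_acc) <= -1/3
    return lower <= w1 <= upper
```

For w0 = −1 the allowed range of w1 runs from 0 to 2, and it shrinks to the single point w1 = 0 shifted to −2/3 ≤ w1 ≤ 0 as w0 approaches −1/3.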
  57. Cosmology models / PMC setup
      • $q^0$ is a Gaussian mixture model with D components randomly shifted away from the MLE and covariance equal to the information matrix.
      • For the dark-energy and curvature models, the number of iterations T is equal to 10, unless perplexity indicated the contrary.
      • The average number of points sampled under an individual mixture component, N/D, is controlled for stable updating of the components (N = 7 500 and D = 10).
      • For the primordial models, T = 5, N = 10 000 and D between 7 and 10, depending on the dimensionality.
      • Parameters controlling the initial mixture means and covariances are chosen as f_shift = 0.02, and f_var between 1 and 1.5.
      • The final iteration is run with a five-times larger sample.
  58. Cosmology models / Results
      • In most cases the evidence is in favour of the standard model, especially when more datasets/experiments are combined.
      • Largest evidence is ln B12 = 1.8, for the w(z)CDM model and CMB alone. This is a case where a large part of the prior range is still allowed by the data, and a region of comparable size is excluded. Hence weak evidence that both w0 and w1 are required, but excluded when adding SNIa and BAO datasets.
      • Results on the curvature are compatible with current findings: non-flat Universe(s) strongly disfavoured for the three dark-energy cases.
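To give a sense of scale for ln B12 = 1.8, the short worked calculation below converts it to a Bayes factor and a posterior model probability. It assumes that B12 is the ratio of the evidence of the extended model to that of the flat ΛCDM reference, which is what the sign convention of these slides suggests, and that both models have equal prior probability (an assumption made here only for illustration).

```latex
\ln B_{12} = \ln E_1 - \ln E_2 = 1.8
\quad\Longrightarrow\quad
B_{12} = e^{1.8} \approx 6.05,
\qquad
P(M_1 \mid \text{data}) = \frac{B_{12}}{1 + B_{12}} \approx \frac{6.05}{7.05} \approx 0.86 .
```

Odds of about 6 to 1 are consistent with the slide's description of this result as weak evidence.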
  59. Cosmology models / Evidence. [Figure: evidence relative to the reference model (flat ΛCDM). ln B12 against the number of parameters npar (4, 5, 6) in three panels, CMB, CMB+SN and CMB+SN+BAO, for the curved Λ model and the flat and curved w0 and w(z) models, with bands labelled inconclusive, weak, moderate and strong.]
  60. Cosmology models / Posterior outcome. Posterior on dark-energy parameters w0 and w1 as 68% and 95% credible regions for WMAP (solid blue lines), WMAP+SNIa (dashed green) and WMAP+SNIa+BAO (dotted red curves). Allowed prior range as red straight lines. [Figure: w1 against w0.]
  61. Cosmology models / PMC stability. [Figure: log-evidence ln E against iteration.] Distribution of 25 PMC samplings of two dark-energy models, flat wCDM (left panel) and curved wCDM (right panel): log-evidence.
  62. Cosmology models / PMC stability. [Figure: perplexity against iteration.] Distribution of 25 PMC samplings of two dark-energy models, flat wCDM (left panel) and curved wCDM (right panel): perplexity.
  63. Lexicon
      • BAO, baryon acoustic oscillations
      • CMB, cosmic microwave background radiation
      • COBE, cosmic background explorer
      • ΛCDM, lambda-cold dark matter
      • Lyα, Lyman-alpha
      • SNIa, type Ia supernovae
      • WMAP, Wilkinson microwave anisotropy probe
