likelihood_p1.pdf

Cosmological Microwave Background Radiation
Likelihood Analysis - Part 1
Jayanti Prasad
Inter-University Centre for Astronomy & Astrophysics (IUCAA)
Pune, India (411007)
September 12, 2011

Plan of the talk
I CMBR Likelihood
I Gaussian Likelihood
I Optimal Quadratic Estimator
I WMAP Likelihood

CMBR Likelihood
I CMBR temperature and polarization observations can
constrain cosmological parameters if the likelihood function
can be computed exactly.
I Computing the likelihood function exactly in a brute force way
is computationally challenging since it involves inversion of
the covariance matrix i.e., O(N3) computation.
L =
1
(2π)N/2
p
|C|
exp

−dt
C−1
d

, (1)
where d is the data vector and C is the covariance matrix.
I Temperature and polarizations fields are correlated and partial
sky coverage correlates power at different ls.
I Since the exact computation of likelihood function is
challenging, approximations are generally made (Gaussian
Likelihood, Gibbs sampling etc).

Gaussian Likelihood
I The likelihood function for the temperature fluctuations
observed by a noiseless experiment with full-sky coverage has
the form:
L(T|Cl ) ∝
1
p
|S|
exp[−(TS
−1
T)/2] (2)
or
L(T|Cl ) =
Y
lm
1
p
Cl
exp[−|alm|
2
/(2Cl )] (3)
where
T(n̂) =
X
lm
almYlm; and Sij =
X
l
((2l + 1)/(4π))Cl Pl (n̂i .n̂j ) (4)
I Since observe only one sky, we cannot measure the power
spectra directly, but instead form the rotationally invariant
estimators, CXY
l , for full-sky CMB maps given by
Ĉ
XY
l = (1/(2l + 1))
m=l
X
m=−l
a
X
lma
Y
lm (5)
where ĈXY
l = CXY
l (true power spectrum). Note that
the likelihood has a maximum when Cl = Ĉl so Ĉl is the MLE.
[Hamimeche Lewis (2008)]

I Since alm are assumed to be Gaussian and statistically
isotropic, they have independent distributions (for |m| ≥ 0).
The probability of a set of alm at a given l is given by:
−2lnP(alm|Cl ) =
m=l
X
m=−l
[a
0
lmC−1
l alm + ln|2πCl |]
= (2l + 1)[Tr(Ĉl C−1
l ) + ln|Cl |] + const. (6)
I The fact that this likelihood for Cl depends only on the ĈXY
l
shows that on the full sky the CMB data can losslessly be
compressed to a set of power spectrum estimators, that
contain all the relevant information about the posterior
distribution i.e., Ĉl is a sufficient statistic for the likelihood.

I If the likelihood of the theory power spectrum Cl as a function
of the measured estimators Ĉl were Gaussian, the likelihood
could be calculated straightforwardly from the measured Ĉl .
I However the distribution is non-Gaussian because for a given
temperature power spectrum Cl , the Ĉl is a sum of squares of
Gaussian harmonic coefficients, have a (reduced) χ2
distribution. This makes it difficult to use “χ2 per degree of
freedom” as “goodness of fit”.

Maximum Likelihood Estimation
L =
1
(2π)N/2
p
|C|
exp

−(x − µ)C−1
(x − µ)t

(7)
or
2 log L = 2L = ln|C|+(x −µ)C−1
(x −µ)t
= Tr[lnC +C−1
D] (8)
where the covariance matrix C and data matrices D are given by
C = (x − µ)(x − µ)t
and D = (x − µ)(x − µ)t
(9)
differentiating equation (8) with respect to to θi
2L,i = Tr

C−1
C,i − C−1
C,i C−1
D + C−1
D,i

(10)
note that
, =
∂
∂θi
;
∂
∂θi
(C
−1
) = −C
−1 ∂C
∂θi
C
−1
and
∂
∂θi
(lnC) = C
−1 ∂C
∂θi
(11)
[Bond et al. (1998); Tegmark et al. (1997)]

Fisher Information Matrix
I
Fij = hL,ij i =
1
2
Tr Ai Aj + C−1
Mij

, (12)
where
Ai = (lnC),i and Mij = D,ij (13)
for isotropic CMBR fluctuations µ = 0 and
Cij = δij

Cl +
4πσ2
N
eθ2
bl(l+1)

(14)
where σ is the RMS pixel noise and θb = 0.425FWHM.
I From the above equations
Fij =
l=lmax
X
l=2

2l + 1
2

Cl +
4πσ2
N
eθ2
bl(l+1)
−2
∂Cl
∂θi

∂Cl
∂θj

(15)

Quadratic Estimator
I The best fit model i.e., parameter set θ̄, corresponds to a
point in the parameter space at which
∂L
∂θ
|θ=θ̄ = 0 (16)
We can use a root finding method to obtain θ̄ in the
following way:
I We begin with some best guess value θ(0) and approximate
the derived to the first order
∂L(θ̄)
∂θi
=
∂L(θ(0)
)
∂θi
+

θ̄j − θ
(0)
j
∂2
L(θ(0)
)
∂θi ∂θj
(17)
setting the LHS zero

θ̄j − θ
(0)
j

≈ −

∂2
L(θ(0)
)
∂θi ∂θj
#−1
∂L(θ(0)
)
∂θi
δθ (18)
for the next step we can write
θ
(1)
= θ
(0)
+ δθ (19)
and keep iterating till we get convergence.
[Bond et al. (1998); Durrer (2008)]

I
∂L
∂θi
=
1
2

dt
C−1
C,i C−1
d − Trace C−1
C,i

(20)
and
Fij =

∂2L
∂θi ∂θj

=
1
2
Trace

C−1
C,i C−1
C,i

(21)
I
θ(1)
= θ(0)
+
1
2
Fij

dt
C−1
C,i C−1
d − Trace C−1
C,i

(22)
I The advantage of this method is that for a given starting
parameter the Fisher matrix is readily calculated from the
covariance matrix alone, without having to know the data.
I The above estimated parameters θ(1) depend quadratically on
the data vector d so they are called a quadratic estimators.
I This method is faster than the grid based methods, however,
there are problems. This method is Levenberg-Marquardt
method as is implemented in GSL.

Error Estimate
I A good first estimate for 1σ error bars are the diagonal elements of
the Fisher matrix at the maximum, i.e., the best fit point.
hδθi δθj i = F−1
ij (23)
I These are the true 1σ errors only if the distribution is Gaussian in
the parameters θ which usually it is not.
I The Fisher matrix defines an ellipse around θ̄ in the parameter space
via the equation δθt
F(θ̄)δθ = 1
I The principal directions of this error ellipse are parallel to the
eigenvectors of F and their half length is given by the square root of
the eigenvalues of F
I The total width of the error ellipse in a given direction j at θ̄ is
2/
√
Fjj and 1/
√
Fjj is the error of the parameter j if all other
parameters are known and are equal to θ̄.
I In practice we do not know the other parameters any better than j
so the true error in j is given by the size of the projection of the
error matrix onto the j axis.

WMAP Likelihood
I At large l likelihood is approximated as Gaussian
2lnLGauss ∝
X
ll0
(Cl − Ĉl )Qll0 (Cl − Ĉl ) (24)
where Qll0 , the curvature matrix, is the inverse of the power
spectrum covariance matrix.
I The power spectrum covariance encodes the uncertainties in
the power spectrum due to cosmic variance, detector noise,
point sources, the sky cut, and systematic errors.
I Since the likelihood function for the power spectrum is slightly
non-Gaussian, equation (24) is a systematically biased
estimator so a log normal function is suggested
2lnLLN ∝
X
ll0
(zl − ẑl )Dll0 (zl − ẑl ) (25)
where zl = (Cl + Nl ) and Dll0 = (Cl + Nl )Qll0 (Cl + Nl )
I It has been argued that the following estimator is more
accurate then the the Gaussian or log normal only [Verde et al.
(2003)]
L = (1/3)LGauss + (2/3)LLN (26)

WMAP seven years Likelihood
I For low-l TT ( l ≤ 32) the likelihood of a model is computed
directly from the 7-year Internal Linear Combination (ILC)
maps.
I For high-l (l 32) TT MASTER pseudo-Cl quadratic
estimator is used.
I For low-l polarization, (l 23) TE, EE, BB, pixel-space
estimator is used.
I For high-l TE (l 23) MASTER quadratic estimator is being
used.
I Given a best-fit model one asks how well the model fits the data.
Given that the likelihood function is non-Gaussian, answering the
question is not as straightforward as testing the χ2
per dof of the
best-fit model. Instead one resorts to Monte Carlo simulations and
compare the absolute likelihood obtained from fitting the flight data
to an ensemble of simulated values.
[Larson et al. (2011)]

Summary Discussion
I I have discussed the difficulties and approximations used to
compute the CMBR likelihood function.
I My aim was to understand why and how the likelihood
function is computed differently for different values of l and
different type of power spectra.
I I have also discussed various estimators related to angular
power spectrum and the likelihood function.
I I was not able to discuss the actual computation of the
various components of the likelihood function in the WMAP
likelihood code.
I It has been suggested (by Dodelson) that the Nelder Mead
method or downhill simplex method (commonly known
amoeba) can be be used for finding the best fit parameters.
I Another method called NEWUOA which is used for
unconstrained optimization without derivatives, is also
suggested (by Antony Lewis ) to find the best fit parameters,
optimization of the likelihood function.

References
Bond, J. R., Jaffe, A. H., Knox, L. 1998, Phys. Rev. D , 57, 2117
Dodelson, S. 2003, Modern cosmology (San Diego, U.S.A.: Academic
Press)
Durrer, R. 2008, The Cosmic Microwave Background (Cambridge
University Press)
Hamimeche, S., Lewis, A. 2008, Phys. Rev. D , 77, 103013
Larson, D., et al. 2011, Astrophys. J. Suppl. Ser. , 192, 16
Tegmark, M., Taylor, A. N., Heavens, A. F. 1997, Astrophys. J. , 480,
22
Verde, L., et al. 2003, Astrophys. J. Suppl. Ser. , 148, 195

likelihood_p1.pdf

Recommended

Recommended

More Related Content

Similar to likelihood_p1.pdf

Similar to likelihood_p1.pdf (20)

More from Jayanti Prasad Ph.D.

More from Jayanti Prasad Ph.D. (10)

Recently uploaded

Recently uploaded (20)

likelihood_p1.pdf