Computing Bayesian posterior with empirical likelihood in population genetics
1. Computing Bayesian posterior with empirical likelihood in population genetics
Pierre Pudlo
INRA & U. Montpellier 2
SSB, 18 sept. 2012
Pudlo (INRA & U. Montpellier 2) ABCel GenPop SSB 09/2012 1 / 29
2. Table of contents
1 Likelihood free methods
2 Empirical likelihood
3 Population genetics
4 Numerical experiments
6. When the likelihood is not completely known
Likelihood intractable:
$$\ell(y \mid \theta) = \int_U \ell(y, u \mid \theta)\, du$$
where u is a latent process
- Hidden Markov and other dynamic models: latent process which is not observed
  → Classical answer: Markov chain Monte Carlo, ...
- Population genetics: the whole gene genealogy is unobserved
  Likelihood is an integral over
  - all possible gene genealogies
  - all possible mutations along the genealogies
  → Classical answer: Approximate Bayesian computation (ABC)
7. ABC in a nutshell
Posterior distribution is the conditional distribution of
$$\pi(\theta)\,\ell(x \mid \theta) \quad (\star)$$
knowing that $x = x_{obs}$
Methodology
Draw a (large) set of particles $(\theta_i, x_i)$ from $(\star)$ and use a nonparametric estimate of the conditional density
$$\pi(\theta \mid x_{obs}) \propto \pi(\theta)\,\ell(x_{obs} \mid \theta)$$
Seminal papers
- Tavaré, Balding, Griffiths and Donnelly (1997, Genetics)
- Pritchard, Seielstad, Perez-Lezaun, Feldman (1999, Molecular Biology and Evolution)
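The methodology above can be illustrated with a minimal rejection sampler. This is only a sketch on a toy model, not the population genetics setting of the talk: the Gaussian model, the uniform prior on (-5, 5), the sample mean as summary statistic, the tolerance `eps` and the function name `abc_rejection` are all assumptions made for the example.

```python
import random
import statistics

def abc_rejection(x_obs, n_particles=20000, eps=0.05, seed=0):
    """Toy ABC rejection sampler (illustrative assumptions throughout).

    Model (assumed): x_1..x_n iid N(theta, 1), prior theta ~ U(-5, 5).
    Summary statistic (assumed): the sample mean.
    """
    rng = random.Random(seed)
    s_obs = statistics.fmean(x_obs)
    n = len(x_obs)
    accepted = []
    for _ in range(n_particles):
        theta = rng.uniform(-5.0, 5.0)                       # draw theta from the prior
        x_sim = [rng.gauss(theta, 1.0) for _ in range(n)]    # simulate a dataset
        # keep the particle when the simulated summary is close to the observed one
        if abs(statistics.fmean(x_sim) - s_obs) <= eps:
            accepted.append(theta)
    return accepted
```

The accepted values of `theta` form an approximate sample from the posterior given the summary statistic; the cost is dominated by the repeated simulation of datasets.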
9. ABC in a nutshell
Shortcomings.
- time consuming: if simulation of the latent process is not straightforward
- curse of dimensionality vs. loss of information:
  - If x lies in a high dimensional space X (often), we are unable to estimate the conditional density
  - Hence, we project the (observed and simulated) datasets on a space with smaller dimension (through summary statistics)
    $$\eta : X \to \mathbb{R}^d \quad \text{(summary statistics)}$$
10. You went to school to learn, girl (. . . )
Why 2 plus 2 makes four
Now, now, now, I’m gonna teach you (. . . )
All you gotta do is repeat after me!
A, B, C!
It’s easy as 1, 2, 3!
Or simple as Do, re, mi! (. . . )
Surveys
- Beaumont (2010, Annual Review of Ecology, Evolution, and Systematics)
- Lopes, Beaumont (2010, Genetics and Evolution)
- Csilléry, Blum, Gaggiotti, François (2010, Trends in Ecology and Evolution)
- Marin, Pudlo, Robert, Ryder (to appear, Statis. and Comput.)
12. Empirical likelihood (EL)
Owen (1988, Biometrika), Owen (2001, Chapman & Hall)
[Diagram: a black box "Empirical Likelihood" takes the data x = (x_1, ..., x_K) and the parameters' definition in terms of a moment constraint, and returns a profiled likelihood]
Input:
- Independent data $x = (x_1, \ldots, x_K)$
- Definition of the parameters: $E(h(x_i, \theta)) = 0$
Output:
- A profiled "likelihood function" $L_{el}(\theta \mid x)$
EL does not require a fully defined and often complex (hence debatable) parametric model
13. Empirical likelihood (EL)
Owen (1988, Biometrika), Owen (2001, Chapman & Hall)
Assume that the dataset x is composed of n independent replicates $x = (x_1, \ldots, x_n)$ of some $X \sim F$
Generalized moment condition model
The law F of X satisfies
$$E_F\big[h(X, \theta)\big] = 0,$$
where h is a known function and $\theta$ an unknown parameter
Empirical likelihood
$$L_{el}(\theta \mid x) = \max_p \prod_{i=1}^n p_i \quad (\star)$$
for all p such that $0 \le p_i \le 1$, $\sum_i p_i = 1$, $\sum_i p_i h(x_i, \theta) = 0$.
14. Empirical likelihood (EL)
Empirical likelihood
$$L_{el}(\theta \mid x) = \max_p \prod_{i=1}^n p_i \quad (\star)$$
for all p such that $0 \le p_i \le 1$, $\sum_i p_i = 1$, $\sum_i p_i h(x_i, \theta) = 0$.
Why $(\star)$?
(1) $L(p) := \prod_{i=1}^n p_i$ is the likelihood of the discrete distribution $p = \sum_i p_i \delta_{x_i}$
(2) $\sum_i p_i h(x_i, \theta) = 0$ implies that p fulfils the moment condition
How to compute?
- Newton-Lagrange scheme.
- The constraint is linear (in p)
- See Owen (2001), chap. 12
- package emplik in R
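For the simplest constraint, h(x_i, θ) = x_i − θ, the maximisation can be written in terms of a single Lagrange multiplier λ: the optimal weights are p_i = 1 / (n (1 + λ (x_i − θ))), where λ solves Σ_i (x_i − θ) / (1 + λ (x_i − θ)) = 0. The sketch below solves this equation by bisection rather than by the Newton-Lagrange scheme mentioned on the slide, and it is not the emplik package; the function name `el_log_ratio_mean` is hypothetical.

```python
import math

def el_log_ratio_mean(x, theta):
    """log( L_el(theta | x) / n^-n ) for the mean constraint h(x_i, theta) = x_i - theta.

    Sketch only: the Lagrange multiplier is found by bisection (an assumption
    made for simplicity, not the scheme cited on the slide).
    """
    z = [xi - theta for xi in x]
    zmin, zmax = min(z), max(z)
    if zmin >= 0.0 or zmax <= 0.0:
        return -math.inf          # theta outside the range of the data: no feasible p
    # lambda must keep every 1 + lambda * z_i strictly positive
    lo, hi = -1.0 / zmax, -1.0 / zmin
    eps = 1e-10 * (hi - lo)
    lo, hi = lo + eps, hi - eps

    def g(lam):
        # first-order condition; strictly decreasing in lam on (lo, hi)
        return sum(zi / (1.0 + lam * zi) for zi in z)

    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    lam = 0.5 * (lo + hi)
    # log prod_i (n p_i) with p_i = 1 / (n (1 + lam z_i))
    return -sum(math.log1p(lam * zi) for zi in z)
```

At the sample mean the multiplier is zero and the log ratio is zero, that is, L_el reaches its maximum n^{-n}.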
15. EL examples
Simple case: $\theta$ = mean
$$h(x_i, \theta) = x_i - \theta$$
- $L_{el}(\theta \mid x)$ is maximal at $\hat\theta = \frac{1}{K}(x_1 + \cdots + x_K)$
- if enough data (K large) and if the data come from a true parametric model, then $L_{el} \approx$ true likelihood
16. EL examples
$$h(x_i, \theta) = x_i - \theta$$
Example 1: Gaussian model
[Figure: two panels, K = 20 and K = 200, showing el.normal against theta; red = param. likelihood, black = $L_{el}$]
18. EL examples
Example 2: non Gaussian model
[Figure: K = 200, el.normal against theta; red = true likelihood, blue = Gaussian likelihood, black = $L_{el}$]
- $L_{el}$ is better than a false model.
20. Neutral model at a given microsatellite locus, in a
closed panmictic population at equilibrium
Sample of 8 genes
23. Neutral model at a given microsatellite locus, in a closed panmictic population at equilibrium
Observations: leaves of the tree
$\hat\theta = ?$
Kingman's genealogy
When the time axis is normalized, $T(k) \sim \mathrm{Exp}(k(k-1)/2)$
Mutations according to the Simple stepwise Mutation Model (SMM)
- dates of the mutations $\sim$ Poisson process with intensity $\theta/2$ over the branches
- MRCA = 100
- independent mutations: $\pm 1$ with pr. 1/2
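The generative model of this slide can be sketched as a short simulation: coalescence times T(k) ~ Exp(k(k−1)/2), mutations falling on each branch as a Poisson process of intensity θ/2, and each mutation shifting the allelic state by ±1. The function names and the way lineages are stored are choices made for this illustration, not the simulator used in the talk.

```python
import random

def _net_smm_steps(rate, length, rng):
    """Net displacement of +1/-1 steps from a Poisson process of given rate on a branch."""
    net = 0
    t = rng.expovariate(rate)
    while t < length:
        net += rng.choice((-1, 1))     # SMM: each mutation is +1 or -1 with pr. 1/2
        t += rng.expovariate(rate)
    return net

def simulate_smm_sample(n, theta, mrca_state=100, seed=None):
    """Allelic states of n genes at one microsatellite locus (sketch, theta > 0)."""
    rng = random.Random(seed)
    offsets = [0] * n                       # displacement of each leaf from the MRCA
    lineages = [[i] for i in range(n)]      # each lineage = list of leaves below it
    while len(lineages) > 1:
        k = len(lineages)
        t_k = rng.expovariate(k * (k - 1) / 2.0)   # time spent with k lineages
        for leaves in lineages:
            step = _net_smm_steps(theta / 2.0, t_k, rng)
            for leaf in leaves:
                offsets[leaf] += step
        i, j = rng.sample(range(k), 2)      # the two lineages that coalesce
        merged = lineages[i] + lineages[j]
        lineages = [l for idx, l in enumerate(lineages) if idx not in (i, j)]
        lineages.append(merged)
    return [mrca_state + o for o in offsets]
```

For example, `simulate_smm_sample(8, 2.0, seed=42)` returns eight integer allelic states scattered around the MRCA state 100. Because the ±1 steps are symmetric, accumulating them backward in time gives the same distribution as mutating forward from the MRCA.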
24. Much more interesting models...
- several independent loci: independent gene genealogies and mutations
- different populations: linked by an evolutionary scenario made of divergences, admixtures, migrations between populations, etc.
- larger sample size: usually between 50 and 100 genes
A typical evolutionary scenario:
[Diagram: POP 0, POP 1 and POP 2 descending from the MRCA, with divergence times $\tau_1$ and $\tau_2$]
26. Moment condition in population genetics?
Main difficulty
Derive a constraint
$$E_F\big[h(X, \phi)\big] = 0$$
on the parameters of interest $\phi$ when X is the genotypes of the sample of individuals at a given locus
E.g., in phylogeography, $\phi$ is composed of
- dates of divergence between populations,
- ratios of population sizes,
- mutation rates, etc.
None of them is a moment of the distribution of the allelic states of the sample
→ h = pairwise composite scores, whose zero is the pairwise maximum likelihood estimator
27. Pairwise composite likelihood?
The intra-locus pairwise likelihood
$$\ell_2(x_k \mid \phi) = \prod_{i<j} \ell_2(x_k^i, x_k^j \mid \phi)$$
with $x_k^1, \ldots, x_k^n$: allelic states of the gene sample at the k-th locus
Sample of size 2
⇒ the coalescent tree is a single line joining $x_k^i$ to $x_k^j$
⇒ closed form of $\ell_2(x_k^i, x_k^j \mid \phi)$
The pairwise score function
$$\nabla_\phi \log \ell_2(x_k \mid \phi) = \sum_{i<j} \nabla_\phi \log \ell_2(x_k^i, x_k^j \mid \phi)$$
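28. Composite likelihood, approximation of the likelihood
Composite likelihoods are often much narrower than the distribution of the model (see, e.g., Cooley et al., 2012)
Hence,
- produces too narrow posterior distributions
- confidence intervals or credibility intervals are not reliable
But,
$$
\begin{aligned}
E\big(\nabla \log \ell_2(x \mid \phi)\big)
&= \int \nabla \log \ell_2(x \mid \phi)\, \ell(x \mid \phi)\, dx \\
&= \sum_{i<j} \int \nabla \log \ell_2(x_i, x_j \mid \phi)\, \ell_2(x_i, x_j \mid \phi)\, dx_i\, dx_j \\
&= \sum_{i<j} \int \frac{\nabla \ell_2(x_i, x_j \mid \phi)}{\ell_2(x_i, x_j \mid \phi)}\, \ell_2(x_i, x_j \mid \phi)\, dx_i\, dx_j \\
&= \sum_{i<j} \nabla \int \ell_2(x_i, x_j \mid \phi)\, dx_i\, dx_j \\
&= 0,
\end{aligned}
$$
since each pairwise density integrates to 1.
Safe (with EL) because we only use the position of its mode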
31. Pairwise likelihood: a simple case
Assumptions
- sample from a closed, panmictic population at equilibrium
- marker: microsatellite
- mutation rate: $\theta/2$
If $x_k^i$ and $x_k^j$ are two genes of the sample, $\ell_2(x_k^i, x_k^j \mid \theta)$ depends only on $\delta = x_k^i - x_k^j$:
$$\ell_2(\delta \mid \theta) = \frac{1}{\sqrt{1 + 2\theta}}\, \rho(\theta)^{|\delta|}
\quad \text{with} \quad
\rho(\theta) = \frac{\theta}{1 + \theta + \sqrt{1 + 2\theta}}$$
Pairwise score function
$$\partial_\theta \log \ell_2(\delta \mid \theta) = -\frac{1}{1 + 2\theta} + \frac{|\delta|}{\theta \sqrt{1 + 2\theta}}$$
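The closed forms above are easy to check numerically. The following sketch (function names are hypothetical) evaluates the pairwise likelihood and its score, and can be used to verify that the likelihood sums to one over δ and that the score has zero expectation under the pairwise model, which is the moment condition needed for the empirical likelihood.

```python
import math

def rho(theta):
    """rho(theta) = theta / (1 + theta + sqrt(1 + 2 theta))."""
    return theta / (1.0 + theta + math.sqrt(1.0 + 2.0 * theta))

def pairwise_lik(delta, theta):
    """l2(delta | theta) for two genes from the same panmictic population."""
    return rho(theta) ** abs(delta) / math.sqrt(1.0 + 2.0 * theta)

def pairwise_score(delta, theta):
    """Derivative of log l2(delta | theta) with respect to theta."""
    return (-1.0 / (1.0 + 2.0 * theta)
            + abs(delta) / (theta * math.sqrt(1.0 + 2.0 * theta)))

def expected_score(theta, dmax=200):
    """Sum over delta of l2 * score, truncated at |delta| <= dmax."""
    return sum(pairwise_lik(d, theta) * pairwise_score(d, theta)
               for d in range(-dmax, dmax + 1))
```

With θ = 0.7, for instance, the likelihood values for |δ| ≤ 200 sum to 1 and `expected_score(0.7)` is zero, both up to floating-point error.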
32. Pairwise likelihood: 2 diverging populations
[Diagram: POP a and POP b diverging from the MRCA]
Assumptions
- $\tau$: divergence date of pop. a and b
- $\theta/2$: mutation rate
Let $x_k^i$ and $x_k^j$ be two genes coming resp. from pop. a and b
Set $\delta = x_k^i - x_k^j$.
Then
$$\ell_2(\delta \mid \theta, \tau) = \frac{e^{-\tau\theta}}{\sqrt{1 + 2\theta}} \sum_{k=-\infty}^{+\infty} \rho(\theta)^{|k|}\, I_{\delta - k}(\tau\theta),$$
where $I_n(z)$ is the nth-order modified Bessel function of the first kind
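33. Pairwise likelihood: 2 diverging populations
[Diagram: POP a and POP b diverging from the MRCA]
Assumptions
- $\tau$: divergence date of pop. a and b
- $\theta/2$: mutation rate
Let $x_k^i$ and $x_k^j$ be two genes coming resp. from pop. a and b
Set $\delta = x_k^i - x_k^j$.
A 2-dim score function
$$\partial_\tau \log \ell_2(\delta \mid \theta, \tau) = -\theta + \frac{\theta}{2}\, \frac{\ell_2(\delta - 1 \mid \theta, \tau) + \ell_2(\delta + 1 \mid \theta, \tau)}{\ell_2(\delta \mid \theta, \tau)}$$
$$\partial_\theta \log \ell_2(\delta \mid \theta, \tau) = -\tau - \frac{1}{1 + 2\theta} + \frac{\tau}{2}\, \frac{\ell_2(\delta - 1 \mid \theta, \tau) + \ell_2(\delta + 1 \mid \theta, \tau)}{\ell_2(\delta \mid \theta, \tau)} + \frac{q(\delta \mid \theta, \tau)}{\ell_2(\delta \mid \theta, \tau)}$$
where
$$q(\delta \mid \theta, \tau) := \frac{e^{-\tau\theta}}{\sqrt{1 + 2\theta}}\, \frac{\rho'(\theta)}{\rho(\theta)} \sum_{k=-\infty}^{+\infty} |k|\, \rho(\theta)^{|k|}\, I_{\delta - k}(\tau\theta)$$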
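A numerical sketch of the two-population pairwise likelihood and of its score in τ follows. To stay self-contained it evaluates the Bessel function by its power series and truncates the sum over k at |k| ≤ `kmax`; both are assumptions of this illustration (a library Bessel routine would normally be preferred), and the function names are hypothetical.

```python
import math

def bessel_i(n, z, terms=80):
    """I_n(z) for integer n and z >= 0, by power series (adequate for moderate z)."""
    n = abs(n)                       # I_{-n} = I_n for integer order
    if z == 0.0:
        return 1.0 if n == 0 else 0.0
    half = 0.5 * z
    term = math.exp(n * math.log(half) - math.lgamma(n + 1))   # (z/2)^n / n!
    total = 0.0
    for m in range(terms):
        total += term
        term *= half * half / ((m + 1) * (m + n + 1))
    return total

def rho(theta):
    return theta / (1.0 + theta + math.sqrt(1.0 + 2.0 * theta))

def l2_div(delta, theta, tau, kmax=60):
    """l2(delta | theta, tau), with the infinite sum truncated at |k| <= kmax."""
    pref = math.exp(-tau * theta) / math.sqrt(1.0 + 2.0 * theta)
    r = rho(theta)
    return pref * sum(r ** abs(k) * bessel_i(delta - k, tau * theta)
                      for k in range(-kmax, kmax + 1))

def score_tau(delta, theta, tau):
    """Derivative of log l2(delta | theta, tau) with respect to tau."""
    neighbours = l2_div(delta - 1, theta, tau) + l2_div(delta + 1, theta, tau)
    return -theta + 0.5 * theta * neighbours / l2_div(delta, theta, tau)
```

Setting τ = 0 recovers the single-population formula, since I_n(0) is 1 for n = 0 and 0 otherwise, which gives a convenient sanity check.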
36. A first experiment
Evolutionary scenario:
[Diagram: POP 0 and POP 1 diverging from the MRCA]
Dataset:
- 50 genes per population,
- 100 microsat. loci
Assumptions:
- Ne identical over all populations
- $\phi = (\log_{10} \theta, \log_{10} \tau)$
- uniform prior over $(-1, 1.5) \times (-1, 1)$
Comparison of the original ABC with ABCel
[Figure: posterior densities of log(theta) and log(tau1); histogram = ABCel, curve = original ABC, vertical line = "true" parameter; ESS = 7034]
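The slide reports ESS = 7034 without defining it. Assuming it is the usual effective sample size of an importance-weighted sample (an assumption, since the slides do not spell out the weighting scheme), it can be computed from the weights as follows; the function name is hypothetical.

```python
def effective_sample_size(weights):
    """Effective sample size (sum w)^2 / sum w^2 of non-negative importance weights."""
    total = sum(weights)
    return total * total / sum(w * w for w in weights)
```

Equal weights give an ESS equal to the number of particles, while a sample dominated by a single particle gives an ESS close to 1.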
37. ABC vs. ABCel on 100 replicates of the 1st experiment
Accuracy:
        log10 θ            log10 τ
        ABC     ABCel      ABC     ABCel
(1)     0.097   0.094      0.315   0.117
(2)     0.071   0.059      0.272   0.077
(3)     0.68    0.81       1.0     0.80
(1) Root Mean Square Error of the posterior mean
(2) Median Absolute Deviation of the posterior median
(3) Coverage of the credibility interval of probability 0.8
Computation time: on a recent 6-core computer (C++/OpenMP)
- ABC: 4 hours
- ABCel: 2 minutes
39. Second experiment
Evolutionary scenario:
[Diagram: POP 0, POP 1 and POP 2 descending from the MRCA, with divergence times $\tau_1$ and $\tau_2$]
Dataset:
- 50 genes per population,
- 100 microsat. loci
Assumptions:
- Ne identical over all populations
- $\phi = \log_{10}(\theta, \tau_1, \tau_2)$
- non-informative uniform prior
Comparison of the original ABC with ABCel
[Figures: histogram = ABCel, curve = original ABC, vertical line = "true" parameter]
42. ABC vs. ABCel on 100 replicates of the 2nd experiment
Accuracy:
        log10 θ            log10 τ1           log10 τ2
        ABC     ABCel      ABC     ABCel      ABC     ABCel
(1)     0.0059  0.0794     0.472   0.483      29.3    4.76
(2)     0.048   0.053      0.32    0.28       4.13    3.36
(3)     0.79    0.76       0.88    0.76       0.89    0.79
(1) Root Mean Square Error of the posterior mean
(2) Median Absolute Deviation of the posterior median
(3) Coverage of the credibility interval of probability 0.8
Computation time: on a recent 6-core computer (C++/OpenMP)
- ABC: 6 hours
- ABCel: 8 minutes
43. Why?
On large datasets, ABCel gives more accurate results than ABC
ABC simplifies the dataset through summary statistics
Due to the large dimension of x, the original ABC algorithm estimates
$$\pi\big(\theta \mid \eta(x_{obs})\big),$$
where $\eta(x_{obs})$ is some (non-linear) projection of the observed dataset on a space with smaller dimension
→ Some information is lost
ABCel simplifies the model through a generalized moment condition model.
→ Here, the moment condition model is based on the pairwise composite likelihood
47. Joint work with
- Christian P. Robert (U. Dauphine & IUF)
- Kerrie Mengersen (QUT, Australia)
- Raphaël Leblois (INRA CBGP, Montpellier)
Grant from ANR through Project "Emile"
First preprint on arXiv: Approximate Bayesian computation via empirical likelihood
Coming soon: population genetic models which are too slow to simulate
49. Empirical likelihood
Wilks' theorem for the EL likelihood ratio.
Theorem (Owen, 1991)
Let $X, X_1, \ldots, X_n$ be iid random vectors with common distribution $F_0$.
For $\theta \in \Theta$, $x \in \mathcal{X}$, let $h(x, \theta) \in \mathbb{R}^s$.
Let $\theta_0 \in \Theta$ be such that $\mathrm{Var}(h(X, \theta_0))$ is finite and has rank $q > 0$.
If $\theta_0$ satisfies $E(h(X, \theta_0)) = 0$, then
$$-2 \log\left(\frac{L_{el}(\theta_0 \mid X_{1:n})}{n^{-n}}\right) \to \chi^2_{(q)}$$
in distribution.
Note that
- $n^{-n}$ is the maximum value of $L_{el}$
- no condition to ensure a unique value for $\theta_0$
- no condition to make $\hat\theta_{el} = \arg\max L_{el}$ a good estimate of $\theta_0$
- provides tests and confidence intervals
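As an illustration of the last point, the theorem suggests the standard construction of an asymptotic confidence region by inverting the likelihood ratio test. The notation $C_{1-\alpha}$ and $\chi^2_{q,1-\alpha}$ (the $1-\alpha$ quantile of the chi-square distribution with q degrees of freedom) is introduced here and does not appear on the slide.

```latex
\[
C_{1-\alpha} = \left\{ \theta \in \Theta :
  -2 \log \frac{L_{el}(\theta \mid X_{1:n})}{n^{-n}} \le \chi^2_{q,\,1-\alpha} \right\}
\]
```

For a scalar constraint (q = 1) and α = 0.05, the threshold is approximately 3.84.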