Computing Bayesian posterior with empirical 
likelihood in population genetics 
Pierre Pudlo 
INRA & U. Montpellier 2 
SSB, 18 sept. 2012 
Pudlo (INRA & U. Montpellier 2) ABCel GenPop SSB 09/2012 1 / 29
Table of contents 
1 Likelihood free methods 
2 Empirical likelihood 
3 Population genetics 
4 Numerical experiments 
1 Likelihood free methods
When the likelihood is not completely known
Likelihood intractable:
$$\ell(y \mid \theta) = \int_{\mathcal{U}} \ell(y, u \mid \theta)\, du$$
where $u$ is a latent process
- Hidden Markov and other dynamic models: latent process which is not observed
  → Classical answer: Markov chain Monte Carlo, ...
- Population genetics: the whole gene genealogy is unobserved
  Likelihood is an integral over
  - all possible gene genealogies
  - all possible mutations along the genealogies
  → Classical answer: Approximate Bayesian computation (ABC)
ABC in a nutshell
Posterior distribution is the conditional distribution of $\theta$ under the joint density
$$\pi(\theta)\,\ell(x \mid \theta) \qquad (\star)$$
knowing that $x = x_{\text{obs}}$
Methodology
Draw a (large) set of particles $(\theta_i, x_i)$ from $(\star)$ and use a nonparametric estimate of the conditional density
$$\pi(\theta \mid x_{\text{obs}}) \propto \pi(\theta)\,\ell(x_{\text{obs}} \mid \theta)$$
Seminal papers
- Tavaré, Balding, Griffiths and Donnelly (1997, Genetics)
- Pritchard, Seielstad, Perez-Lezaun, Feldman (1999, Molecular Biology and Evolution)
Shortcomings.
- time consuming, if simulation of the latent process is not straightforward
- curse of dimensionality vs. loss of information:
  - If $x$ lies in a high dimensional space $\mathcal{X}$ (often), we are unable to estimate the conditional density
  - Hence, we project the (observed and simulated) datasets on a space with smaller dimension (through summary statistics)
    $$\eta : \mathcal{X} \to \mathbb{R}^d \quad \text{(summary statistics)}$$
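The methodology and the projection on summary statistics can be illustrated with a minimal rejection sampler. This is only a sketch under assumptions that are not in the slides: a toy Gaussian model with unknown mean, a uniform prior on (-5, 5), the sample mean as summary statistic, and a nearest-neighbour acceptance rule (keep the 2% of particles closest to the observed summary). The function name `abc_rejection` and its arguments are hypothetical.

```python
import random
import statistics

def abc_rejection(x_obs, prior_draw, simulate, summary, n_particles, keep_frac):
    """Keep the particles whose summary statistic is closest to the observed one."""
    s_obs = summary(x_obs)
    particles = []
    for _ in range(n_particles):
        theta = prior_draw()                 # theta_i drawn from the prior
        x = simulate(theta, len(x_obs))      # x_i drawn from the model given theta_i
        particles.append((abs(summary(x) - s_obs), theta))
    particles.sort(key=lambda pair: pair[0])
    n_keep = max(1, int(keep_frac * n_particles))
    return [theta for _, theta in particles[:n_keep]]

# Toy example (assumption): Gaussian data with unit variance and unknown mean.
rng = random.Random(0)
x_obs = [rng.gauss(2.0, 1.0) for _ in range(50)]
posterior_sample = abc_rejection(
    x_obs,
    prior_draw=lambda: rng.uniform(-5.0, 5.0),
    simulate=lambda theta, n: [rng.gauss(theta, 1.0) for _ in range(n)],
    summary=statistics.fmean,
    n_particles=5000,
    keep_frac=0.02,
)
abc_posterior_mean = statistics.fmean(posterior_sample)
```

The cost is dominated by the calls to `simulate`, which is the "time consuming" shortcoming when the latent process is a gene genealogy rather than a Gaussian draw.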
You went to school to learn, girl (. . . ) 
Why 2 plus 2 makes four 
Now, now, now, I’m gonna teach you (. . . ) 
All you gotta do is repeat after me! 
A, B, C! 
It’s easy as 1, 2, 3! 
Or simple as Do, re, mi! (. . . ) 
Surveys
- Beaumont (2010, Annual Review of Ecology, Evolution, and Systematics)
- Lopes, Beaumont (2010, Genetics and Evolution)
- Csilléry, Blum, Gaggiotti, François (2010, Trends in Ecology and Evolution)
- Marin, Pudlo, Robert, Ryder (to appear, Statis. and Comput.)
2 Empirical likelihood
Empirical likelihood (EL)
Owen (1988, Biometrika), Owen (2001, Chapman & Hall)
[Diagram: "Empirical Likelihood" black box; inputs are the data $x = (x_1, \ldots, x_K)$ and the parameters' definition in terms of a moment constraint; output is the profiled likelihood]
Input:
- Independent data $x = (x_1, \ldots, x_K)$
- Definition of the parameters: $\mathbb{E}\big(h(x_i, \theta)\big) = 0$
Output:
- A profiled "likelihood function" $L_{el}(\theta \mid x)$
EL does not require a fully defined and often complex (hence debatable) parametric model
Empirical likelihood (EL)
Assume that the dataset $x$ is composed of $n$ independent replicates $x = (x_1, \ldots, x_n)$ of some $X \sim F$
Generalized moment condition model
The law $F$ of $X$ satisfies
$$\mathbb{E}_F\big[h(X, \theta)\big] = 0,$$
where $h$ is a known function, and $\theta$ an unknown parameter
Empirical likelihood
$$L_{el}(\theta \mid x) = \max_{p} \prod_{i=1}^{n} p_i \qquad (\star)$$
for all $p$ such that $0 \le p_i \le 1$, $\sum_i p_i = 1$, $\sum_i p_i\, h(x_i, \theta) = 0$.
Why $(\star)$?
(1) $L(p) := \prod_{i=1}^{n} p_i$ is the likelihood of the discrete distribution $p = \sum_i p_i\, \delta_{x_i}$
(2) $\sum_i p_i\, h(x_i, \theta) = 0$ implies that $p$ fulfils the moment condition
How to compute?
Newton-Lagrange scheme.
The constraint is linear (in $p$)
See Owen (2001) chap. 12 and the package emplik in R
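As an illustration of the Newton-Lagrange idea, here is a minimal sketch for the simplest moment condition, $h(x_i, \theta) = x_i - \theta$ (the mean). It is not the emplik implementation; the function name `el_mean` and the step-halving safeguard are assumptions made for this example. At the optimum the weights are $p_i = 1 / \big(n(1 + \lambda z_i)\big)$ with $z_i = x_i - \theta$, and the multiplier $\lambda$ solves $\sum_i z_i / (1 + \lambda z_i) = 0$.

```python
import math

def el_mean(x, theta, max_iter=100, tol=1e-12):
    """Empirical likelihood for the mean, h(x_i, theta) = x_i - theta.

    Returns (log L_el(theta | x), weights p); (-inf, None) when theta lies
    outside the open interval (min(x), max(x)), where no feasible p exists.
    """
    z = [xi - theta for xi in x]
    n = len(z)
    if min(z) >= 0.0 or max(z) <= 0.0:
        return -math.inf, None

    def dual(lam):
        # Concave in lam; its maximiser is the Lagrange multiplier.
        return sum(math.log1p(lam * zi) for zi in z)

    lam = 0.0
    for _ in range(max_iter):
        grad = sum(zi / (1.0 + lam * zi) for zi in z)
        hess = -sum((zi / (1.0 + lam * zi)) ** 2 for zi in z)
        step = -grad / hess  # Newton ascent step on the dual
        # Halve the step until the weights stay positive and the dual
        # does not decrease.
        while (any(1.0 + (lam + step) * zi <= 0.0 for zi in z)
               or dual(lam + step) < dual(lam)):
            step /= 2.0
        lam += step
        if abs(step) < tol:
            break
    p = [1.0 / (n * (1.0 + lam * zi)) for zi in z]
    return sum(math.log(pi) for pi in p), p
```

Evaluating `el_mean(x, theta)[0]` on a grid of `theta` values gives the profiled log-likelihood curve; it peaks at the sample mean, where all weights equal $1/n$.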
EL examples 
Simple case: $\theta$ = mean
$h(x_i, \theta) = x_i - \theta$
- $L_{el}(\theta \mid x)$ is maximal at $\hat\theta = \frac{1}{K}(x_1 + \cdots + x_K)$
- if enough data ($K$ large), and if data from a true parametric model, then $L_{el} \approx$ true likelihood
Example 1: Gaussian model
$h(x_i, \theta) = x_i - \theta$
[Figure: two panels, K = 20 and K = 200, plotting el.normal (0 to 1) against theta (1.0 to 3.0); red = param. likelihood, black = $L_{el}$]
Example 2: non Gaussian model
[Figure: K = 200, plotting el.normal (0 to 1) against theta (0.5 to 4.0); red = true likelihood, blue = Gaussian likelihood, black = $L_{el}$]
- $L_{el}$ is better than a false model.
3 Population genetics
Neutral model at a given microsatellite locus, in a 
closed panmictic population at equilibrium 
Sample of 8 genes 
Observations: leaves of the tree
$\hat\theta = ?$
Kingman's genealogy
When the time axis is normalized, $T(k) \sim \mathrm{Exp}\big(k(k-1)/2\big)$
Mutations according to the Simple stepwise Mutation Model (SMM)
- dates of the mutations $\sim$ Poisson process with intensity $\theta/2$ over the branches
- MRCA = 100
- independent mutations: $\pm 1$ with pr. $1/2$
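The generative model above can be sketched as a short simulator for one locus. This is an illustration only: the function name `simulate_smm_sample` is hypothetical, and the code assumes the normalized time scale of the slide (coalescence rate $k(k-1)/2$ while $k$ lineages remain, mutation intensity $\theta/2$ per branch). It walks backward in time and records, for each sampled gene, the net number of $\pm 1$ steps separating it from the MRCA.

```python
import math
import random

def simulate_smm_sample(n_genes, theta, rng, mrca_state=100):
    """Allelic states of n_genes genes at one microsatellite locus."""
    # offsets[g]: net number of repeat-unit steps accumulated on the path
    # from gene g up to the ancestor currently carrying its lineage.
    offsets = [0] * n_genes
    lineages = [[g] for g in range(n_genes)]  # leaves below each lineage
    while len(lineages) > 1:
        k = len(lineages)
        t_k = rng.expovariate(k * (k - 1) / 2.0)  # T(k) ~ Exp(k(k-1)/2)
        for leaves in lineages:
            # Mutation dates on this branch: Poisson process, intensity theta/2.
            elapsed = rng.expovariate(theta / 2.0) if theta > 0 else math.inf
            while elapsed < t_k:
                step = rng.choice((-1, 1))        # +1 or -1 with pr. 1/2
                for g in leaves:
                    offsets[g] += step
                elapsed += rng.expovariate(theta / 2.0)
        # Two lineages chosen uniformly at random coalesce.
        i, j = rng.sample(range(k), 2)
        merged = lineages[i] + lineages[j]
        lineages = [l for idx, l in enumerate(lineages) if idx not in (i, j)]
        lineages.append(merged)
    return [mrca_state + o for o in offsets]

sample = simulate_smm_sample(8, theta=5.0, rng=random.Random(1))
```

Because the $\pm 1$ steps are symmetric, applying them backward in time gives the same distribution of leaf states as applying them forward from the MRCA.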
Much more interesting models...
- several independent loci
  Independent gene genealogies and mutations
- different populations
  linked by an evolutionary scenario made of divergences, admixtures, migrations between populations, etc.
- larger sample size
  usually between 50 and 100 genes
A typical evolutionary scenario:
[Figure: tree from the MRCA to POP 0, POP 1 and POP 2, with divergence times $\tau_1$ and $\tau_2$]
Moment condition in population genetics?
Main difficulty
Derive a constraint
$$\mathbb{E}_F\big[h(X, \phi)\big] = 0,$$
on the parameters of interest $\phi$ when $X$ is the genotypes of the sample of individuals at a given locus
E.g., in phylogeography, $\phi$ is composed of
- dates of divergence between populations,
- ratio of population sizes,
- mutation rates, etc.
None of them are moments of the distribution of the allelic states of the sample
→ $h$ = pairwise composite scores whose zero is the pairwise maximum likelihood estimator
Pairwise composite likelihood?
The intra-locus pairwise likelihood
$$\ell_2(x_k \mid \phi) = \prod_{i<j} \ell_2(x_k^i, x_k^j \mid \phi)$$
with $x_k^1, \ldots, x_k^n$: allelic states of the gene sample at the $k$-th locus
Sample of size 2
⇒ coalescent tree is a single line joining $x_k^i$ to $x_k^j$
⇒ closed form of $\ell_2(x_k^i, x_k^j \mid \phi)$
The pairwise score function
$$\nabla_\phi \log \ell_2(x_k \mid \phi) = \sum_{i<j} \nabla_\phi \log \ell_2(x_k^i, x_k^j \mid \phi)$$
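Composite likelihood vs. approximation of the likelihood
Composite likelihoods are often much narrower than the distribution of the model (see, e.g., Cooley et al., 2012)
Hence,
- produces too narrow posterior distributions
- confidence intervals or credibility intervals are not reliable
But,
$$\begin{aligned}
\mathbb{E}\big(\nabla_\phi \log \ell_2(x \mid \phi)\big)
&= \int \nabla_\phi \big[\log \ell_2(x \mid \phi)\big]\, \ell(x \mid \phi)\, dx \\
&= \sum_{i<j} \int \nabla_\phi \big[\log \ell_2(x_i, x_j \mid \phi)\big]\, \ell_2(x_i, x_j \mid \phi)\, dx_i\, dx_j \\
&= \sum_{i<j} \int \frac{\nabla_\phi\, \ell_2(x_i, x_j \mid \phi)}{\ell_2(x_i, x_j \mid \phi)}\, \ell_2(x_i, x_j \mid \phi)\, dx_i\, dx_j \\
&= \sum_{i<j} \nabla_\phi \Big[ \int \ell_2(x_i, x_j \mid \phi)\, dx_i\, dx_j \Big] = 0
\end{aligned}$$
Safe with EL because we only use the position of its mode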
Pairwise likelihood: a simple case
Assumptions
- sample from a closed, panmictic population at equilibrium
- marker: microsatellite
- mutation rate: $\theta/2$
If $x_k^i$ and $x_k^j$ are two genes of the sample, $\ell_2(x_k^i, x_k^j \mid \theta)$ depends only on $\delta = x_k^i - x_k^j$:
$$\ell_2(\delta \mid \theta) = \frac{1}{\sqrt{1 + 2\theta}}\, \rho(\theta)^{|\delta|} \quad \text{with} \quad \rho(\theta) = \frac{\theta}{1 + \theta + \sqrt{1 + 2\theta}}$$
Pairwise score function
$$\partial_\theta \log \ell_2(\delta \mid \theta) = -\frac{1}{1 + 2\theta} + \frac{|\delta|}{\theta \sqrt{1 + 2\theta}}$$
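The closed forms above are easy to check numerically. The sketch below (function names are hypothetical; the value $\theta = 3$ and the truncation of the support at $|\delta| \le 500$ are arbitrary choices for this example) verifies that $\ell_2(\cdot \mid \theta)$ sums to one over the integers and that the pairwise score has zero expectation under $\ell_2$, which is the property that makes it usable as a moment condition.

```python
import math

def rho(theta):
    return theta / (1.0 + theta + math.sqrt(1.0 + 2.0 * theta))

def pair_lik(delta, theta):
    """l2(delta | theta) for two genes from one population at equilibrium."""
    return rho(theta) ** abs(delta) / math.sqrt(1.0 + 2.0 * theta)

def pair_score(delta, theta):
    """Derivative of log l2(delta | theta) with respect to theta."""
    return (-1.0 / (1.0 + 2.0 * theta)
            + abs(delta) / (theta * math.sqrt(1.0 + 2.0 * theta)))

theta = 3.0
support = range(-500, 501)  # truncated support; the tails are negligible here
total_mass = sum(pair_lik(d, theta) for d in support)
mean_score = sum(pair_score(d, theta) * pair_lik(d, theta) for d in support)
```

Both `total_mass - 1` and `mean_score` are zero up to floating-point error.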
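Pairwise likelihood: 2 diverging populations
[Figure: MRCA splitting into POP a and POP b at divergence time $\tau$]
Assumptions
- $\tau$: divergence date of pop. a and b
- $\theta/2$: mutation rate
Let $x_k^i$ and $x_k^j$ be two genes coming resp. from pop. a and b
Set $\delta = x_k^i - x_k^j$.
Then
$$\ell_2(\delta \mid \theta, \tau) = \frac{e^{-\tau\theta}}{\sqrt{1 + 2\theta}} \sum_{k=-\infty}^{+\infty} \rho(\theta)^{|k|}\, I_{\delta - k}(\tau\theta),$$
where $I_n(z)$ is the $n$th-order modified Bessel function of the first kind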
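A 2-dim score function
$$\partial_\tau \log \ell_2(\delta \mid \theta, \tau) = -\theta + \frac{\theta}{2}\, \frac{\ell_2(\delta - 1 \mid \theta, \tau) + \ell_2(\delta + 1 \mid \theta, \tau)}{\ell_2(\delta \mid \theta, \tau)}$$
$$\partial_\theta \log \ell_2(\delta \mid \theta, \tau) = -\tau - \frac{1}{1 + 2\theta} + \frac{\tau}{2}\, \frac{\ell_2(\delta - 1 \mid \theta, \tau) + \ell_2(\delta + 1 \mid \theta, \tau)}{\ell_2(\delta \mid \theta, \tau)} + \frac{q(\delta \mid \theta, \tau)}{\ell_2(\delta \mid \theta, \tau)}$$
where
$$q(\delta \mid \theta, \tau) := \frac{e^{-\tau\theta}}{\sqrt{1 + 2\theta}}\, \frac{\rho'(\theta)}{\rho(\theta)} \sum_{k=-\infty}^{\infty} |k|\, \rho(\theta)^{|k|}\, I_{\delta - k}(\tau\theta)$$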
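A minimal numerical sketch of the two-population pairwise likelihood and of its $\tau$-score follows. Everything here is an assumption made for illustration: the function names, the truncation of the sum over $k$ at $|k| \le 40$, and the power-series evaluation of $I_n(z)$ (valid for integer $n$ and $z > 0$; a library routine would normally be used instead). Only the $\tau$ component of the score is implemented.

```python
import math

def bessel_i(n, z, terms=40):
    """I_n(z) for integer n and z > 0, via its power series."""
    n = abs(n)  # I_{-n} = I_n for integer order
    log_half_z = math.log(z / 2.0)
    return sum(
        math.exp((2 * m + n) * log_half_z
                 - math.lgamma(m + 1) - math.lgamma(m + n + 1))
        for m in range(terms)
    )

def rho(theta):
    return theta / (1.0 + theta + math.sqrt(1.0 + 2.0 * theta))

def pair_lik_div(delta, theta, tau, k_max=40):
    """l2(delta | theta, tau) for two genes from two diverged populations."""
    prefactor = math.exp(-tau * theta) / math.sqrt(1.0 + 2.0 * theta)
    return prefactor * sum(
        rho(theta) ** abs(k) * bessel_i(delta - k, tau * theta)
        for k in range(-k_max, k_max + 1)
    )

def score_tau(delta, theta, tau):
    """Derivative of log l2(delta | theta, tau) with respect to tau."""
    neighbours = (pair_lik_div(delta - 1, theta, tau)
                  + pair_lik_div(delta + 1, theta, tau))
    return -theta + 0.5 * theta * neighbours / pair_lik_div(delta, theta, tau)

theta, tau = 1.0, 0.5
total_mass = sum(pair_lik_div(d, theta, tau) for d in range(-30, 31))
```

For these values `total_mass` equals one up to truncation error, and `score_tau` agrees with a central finite difference of `log(pair_lik_div)` in `tau`.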
4 Numerical experiments
A first experiment
Evolutionary scenario:
[Figure: MRCA splitting into POP 0 and POP 1 at divergence time $\tau$]
Dataset:
- 50 genes per population,
- 100 microsat. loci
Assumptions:
- Ne identical over all populations
- $\phi = (\log_{10} \theta, \log_{10} \tau)$
- uniform prior over $(-1, 1.5) \times (-1, 1)$
Comparison of the original ABC with ABCel
[Figure: posterior densities of log(theta) and log(tau1); histogram = ABCel, curve = original ABC, vertical line = "true" parameter; ESS = 7034]
ABC vs. ABCel on 100 replicates of the 1st experiment
Accuracy:
         log10 θ            log10 τ
       ABC     ABCel      ABC     ABCel
(1)    0.097   0.094      0.315   0.117
(2)    0.071   0.059      0.272   0.077
(3)    0.68    0.81       1.0     0.80
(1) Root Mean Square Error of the posterior mean
(2) Median Absolute Deviation of the posterior median
(3) Coverage of the credibility interval of probability 0.8
Computation time: on a recent 6-core computer (C++/OpenMP)
- ABC: about 4 hours
- ABCel: 2 minutes
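For readers who want to reproduce this kind of table, the three criteria could be computed from replicate posterior samples roughly as follows. This is a sketch under assumptions: the function name `accuracy` is hypothetical, criterion (2) is read here as the median over replicates of the absolute error of the posterior median, and the credibility interval is taken as the equal-tailed interval between empirical quantiles.

```python
import math
import statistics

def accuracy(posterior_samples, true_value, level=0.8):
    """posterior_samples: one list of posterior draws per replicate dataset."""
    means = [statistics.fmean(s) for s in posterior_samples]
    medians = [statistics.median(s) for s in posterior_samples]
    # (1) root mean square error of the posterior mean
    rmse = math.sqrt(statistics.fmean((m - true_value) ** 2 for m in means))
    # (2) median absolute deviation of the posterior median (assumed reading)
    mad = statistics.median(abs(m - true_value) for m in medians)
    # (3) coverage of the equal-tailed credibility interval
    covered = 0
    for draws in posterior_samples:
        ordered = sorted(draws)
        lo = ordered[int((1.0 - level) / 2.0 * (len(ordered) - 1))]
        hi = ordered[int((1.0 + level) / 2.0 * (len(ordered) - 1))]
        covered += lo <= true_value <= hi
    return rmse, mad, covered / len(posterior_samples)
```

For weighted samples, such as importance-weighted draws, the means, medians and quantiles would need to be replaced by their weighted counterparts.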
Second experiment
Evolutionary scenario:
[Figure: tree from the MRCA to POP 0, POP 1 and POP 2, with divergence times $\tau_1$ and $\tau_2$]
Dataset:
- 50 genes per population,
- 100 microsat. loci
Assumptions:
- Ne identical over all populations
- $\phi = \log_{10}(\theta, \tau_1, \tau_2)$
- non-informative uniform prior
Comparison of the original ABC with ABCel
[Figure: posterior densities; histogram = ABCel, curve = original ABC, vertical line = "true" parameter]
ABC vs. ABCel on 100 replicates of the 2nd experiment
Accuracy:
         log10 θ             log10 τ1           log10 τ2
       ABC      ABCel      ABC     ABCel      ABC     ABCel
(1)    0.0059   0.0794     0.472   0.483      29.3    4.76
(2)    0.048    0.053      0.32    0.28       4.13    3.36
(3)    0.79     0.76       0.88    0.76       0.89    0.79
(1) Root Mean Square Error of the posterior mean
(2) Median Absolute Deviation of the posterior median
(3) Coverage of the credibility interval of probability 0.8
Computation time: on a recent 6-core computer (C++/OpenMP)
- ABC: about 6 hours
- ABCel: 8 minutes
Why?
On large datasets, ABCel gives more accurate results than ABC
ABC simplifies the dataset through summary statistics
Due to the large dimension of $x$, the original ABC algorithm estimates
$$\pi\big(\theta \mid \eta(x_{\text{obs}})\big),$$
where $\eta(x_{\text{obs}})$ is some (non-linear) projection of the observed dataset on a space with smaller dimension
→ Some information is lost
ABCel simplifies the model through a generalized moment condition model.
→ Here, the moment condition model is based on pairwise composite likelihood
Call Girls in Majnu Ka Tilla Delhi 🔝9711014705🔝 Genuine
 
Environmental Biotechnology Topic:- Microbial Biosensor
Environmental Biotechnology Topic:- Microbial BiosensorEnvironmental Biotechnology Topic:- Microbial Biosensor
Environmental Biotechnology Topic:- Microbial Biosensor
 
Pests of castor_Binomics_Identification_Dr.UPR.pdf
Pests of castor_Binomics_Identification_Dr.UPR.pdfPests of castor_Binomics_Identification_Dr.UPR.pdf
Pests of castor_Binomics_Identification_Dr.UPR.pdf
 
Let’s Say Someone Did Drop the Bomb. Then What?
Let’s Say Someone Did Drop the Bomb. Then What?Let’s Say Someone Did Drop the Bomb. Then What?
Let’s Say Someone Did Drop the Bomb. Then What?
 
GenAI talk for Young at Wageningen University & Research (WUR) March 2024
GenAI talk for Young at Wageningen University & Research (WUR) March 2024GenAI talk for Young at Wageningen University & Research (WUR) March 2024
GenAI talk for Young at Wageningen University & Research (WUR) March 2024
 
Citronella presentation SlideShare mani upadhyay
Citronella presentation SlideShare mani upadhyayCitronella presentation SlideShare mani upadhyay
Citronella presentation SlideShare mani upadhyay
 
Pests of soyabean_Binomics_IdentificationDr.UPR.pdf
Pests of soyabean_Binomics_IdentificationDr.UPR.pdfPests of soyabean_Binomics_IdentificationDr.UPR.pdf
Pests of soyabean_Binomics_IdentificationDr.UPR.pdf
 
Topic 9- General Principles of International Law.pptx
Topic 9- General Principles of International Law.pptxTopic 9- General Principles of International Law.pptx
Topic 9- General Principles of International Law.pptx
 
BIOETHICS IN RECOMBINANT DNA TECHNOLOGY.
BIOETHICS IN RECOMBINANT DNA TECHNOLOGY.BIOETHICS IN RECOMBINANT DNA TECHNOLOGY.
BIOETHICS IN RECOMBINANT DNA TECHNOLOGY.
 
REVISTA DE BIOLOGIA E CIÊNCIAS DA TERRA ISSN 1519-5228 - Artigo_Bioterra_V24_...
REVISTA DE BIOLOGIA E CIÊNCIAS DA TERRA ISSN 1519-5228 - Artigo_Bioterra_V24_...REVISTA DE BIOLOGIA E CIÊNCIAS DA TERRA ISSN 1519-5228 - Artigo_Bioterra_V24_...
REVISTA DE BIOLOGIA E CIÊNCIAS DA TERRA ISSN 1519-5228 - Artigo_Bioterra_V24_...
 
Pests of safflower_Binomics_Identification_Dr.UPR.pdf
Pests of safflower_Binomics_Identification_Dr.UPR.pdfPests of safflower_Binomics_Identification_Dr.UPR.pdf
Pests of safflower_Binomics_Identification_Dr.UPR.pdf
 
PROJECTILE MOTION-Horizontal and Vertical
PROJECTILE MOTION-Horizontal and VerticalPROJECTILE MOTION-Horizontal and Vertical
PROJECTILE MOTION-Horizontal and Vertical
 
Pests of Blackgram, greengram, cowpea_Dr.UPR.pdf
Pests of Blackgram, greengram, cowpea_Dr.UPR.pdfPests of Blackgram, greengram, cowpea_Dr.UPR.pdf
Pests of Blackgram, greengram, cowpea_Dr.UPR.pdf
 
User Guide: Orion™ Weather Station (Columbia Weather Systems)
User Guide: Orion™ Weather Station (Columbia Weather Systems)User Guide: Orion™ Weather Station (Columbia Weather Systems)
User Guide: Orion™ Weather Station (Columbia Weather Systems)
 
User Guide: Magellan MX™ Weather Station
User Guide: Magellan MX™ Weather StationUser Guide: Magellan MX™ Weather Station
User Guide: Magellan MX™ Weather Station
 
Harmful and Useful Microorganisms Presentation
Harmful and Useful Microorganisms PresentationHarmful and Useful Microorganisms Presentation
Harmful and Useful Microorganisms Presentation
 
Microteaching on terms used in filtration .Pharmaceutical Engineering
Microteaching on terms used in filtration .Pharmaceutical EngineeringMicroteaching on terms used in filtration .Pharmaceutical Engineering
Microteaching on terms used in filtration .Pharmaceutical Engineering
 
Fertilization: Sperm and the egg—collectively called the gametes—fuse togethe...
Fertilization: Sperm and the egg—collectively called the gametes—fuse togethe...Fertilization: Sperm and the egg—collectively called the gametes—fuse togethe...
Fertilization: Sperm and the egg—collectively called the gametes—fuse togethe...
 

Computing Bayesian posterior with empirical likelihood in population genetics

  • 1. Computing Bayesian posterior with empirical likelihood in population genetics Pierre Pudlo INRA & U. Montpellier 2 SSB, 18 sept. 2012 Pudlo (INRA & U. Montpellier 2) ABCel GenPop SSB 09/2012 1 / 29
  • 2. Table of contents: 1 Likelihood free methods; 2 Empirical likelihood; 3 Population genetics; 4 Numerical experiments
  • 3. Section 1: Likelihood free methods
  • 6. When the likelihood is not completely known
    Likelihood intractable: $\ell(y\mid\theta) = \int_U \ell(y,u\mid\theta)\,du$, where $u$ is a latent process.
    - Hidden Markov and other dynamic models: the latent process is not observed.
      ↪ Classical answer: Markov chain Monte Carlo, ...
    - Population genetics: the whole gene genealogy is unobserved. The likelihood is an integral over all possible gene genealogies and all possible mutations along the genealogies.
      ↪ Classical answer: approximate Bayesian computation (ABC)
  • 7. ABC in a nutshell
    The posterior distribution is the conditional distribution of $\pi(\theta)\,\ell(x\mid\theta)$ (*), knowing that $x = x_{\text{obs}}$.
    Methodology: draw a (large) set of particles $(\theta_i, x_i)$ from (*) and use a nonparametric estimate of the conditional density $\pi(\theta\mid x_{\text{obs}}) \propto \pi(\theta)\,\ell(x_{\text{obs}}\mid\theta)$.
    Seminal papers:
    - Tavaré, Balding, Griffiths and Donnelly (1997, Genetics)
    - Pritchard, Seielstad, Pérez-Lezaun, Feldman (1999, Molecular Biology and Evolution)
  • 9. ABC in a nutshell
    The posterior distribution is the conditional distribution of $\pi(\theta)\,\ell(x\mid\theta)$ (*), knowing that $x = x_{\text{obs}}$.
    Methodology: draw a (large) set of particles $(\theta_i, x_i)$ from (*) and use a nonparametric estimate of the conditional density $\pi(\theta\mid x_{\text{obs}}) \propto \pi(\theta)\,\ell(x_{\text{obs}}\mid\theta)$.
    Shortcomings:
    - Time consuming, if simulation of the latent process is not straightforward.
    - Curse of dimensionality vs. loss of information: if $x$ lies in a high-dimensional space $\mathcal{X}$ (often), we are unable to estimate the conditional density. Hence, we project the (observed and simulated) datasets on a space with smaller dimension through summary statistics $\eta : \mathcal{X} \to \mathbb{R}^d$.
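    The methodology above can be sketched on a toy problem. This is a minimal illustration and not the speaker's implementation: the Gaussian model, the uniform prior on (-5, 5), the sample mean as summary statistic, the tolerance `eps` and the name `abc_rejection` are all assumptions made for the example.

```python
import random
import statistics


def abc_rejection(x_obs, n_particles, eps, rng):
    """Toy ABC rejection sampler for the mean of a unit-variance Gaussian."""
    n = len(x_obs)
    s_obs = statistics.fmean(x_obs)  # summary statistic of the observed data
    accepted = []
    for _ in range(n_particles):
        theta = rng.uniform(-5.0, 5.0)  # theta_i drawn from the assumed uniform prior
        x_sim = [rng.gauss(theta, 1.0) for _ in range(n)]  # x_i simulated given theta_i
        # Keep the particle only if its summary is close to the observed one.
        if abs(statistics.fmean(x_sim) - s_obs) < eps:
            accepted.append(theta)
    return accepted


rng = random.Random(0)
x_obs = [rng.gauss(2.0, 1.0) for _ in range(30)]
posterior_sample = abc_rejection(x_obs, n_particles=20000, eps=0.1, rng=rng)
```

    The accepted values of theta form an approximate sample from the posterior given the summary statistic. Both shortcomings listed above are visible here: every particle costs a full simulation, and the data enter only through a one-dimensional summary.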
  • 10. "You went to school to learn, girl (...) Why 2 plus 2 makes four. Now, now, now, I'm gonna teach you (...) All you gotta do is repeat after me! A, B, C! It's easy as 1, 2, 3! Or simple as Do, re, mi! (...)"
    Surveys:
    - Beaumont (2010, Annual Review of Ecology, Evolution, and Systematics)
    - Lopes, Beaumont (2010, Genetics and Evolution)
    - Csilléry, Blum, Gaggiotti, François (2010, Trends in Ecology and Evolution)
    - Marin, Pudlo, Robert, Ryder (to appear, Statis. and Comput.)
  • 11. Section 2: Empirical likelihood
  • 12. Empirical likelihood (EL)
    Owen (1988, Biometrika), Owen (2001, Chapman & Hall)
    [Diagram: the data $x = (x_1, \dots, x_K)$ and the parameters' definition in terms of a moment constraint feed the empirical likelihood "black box", which outputs a profiled likelihood.]
    Input:
    - Independent data $x = (x_1, \dots, x_K)$
    - Definition of the parameters: $E(h(x_i, \theta)) = 0$
    Output:
    - A profiled "likelihood function" $L_{el}(\theta\mid x)$
    EL does not require a fully defined, and often complex (hence debatable), parametric model.
  • 13. Empirical likelihood (EL)
    Owen (1988, Biometrika), Owen (2001, Chapman & Hall)
    Assume that the dataset $x$ is composed of $n$ independent replicates $x = (x_1, \dots, x_n)$ of some $X \sim F$.
    Generalized moment condition model: the law $F$ of $X$ satisfies $E_F[h(X,\theta)] = 0$, where $h$ is a known function and $\theta$ an unknown parameter.
    Empirical likelihood: $L_{el}(\theta\mid x) = \max_p \prod_{i=1}^n p_i$ (*), over all $p$ such that $0 \le p_i \le 1$, $\sum_i p_i = 1$, $\sum_i p_i\, h(x_i,\theta) = 0$.
  • 14. Empirical likelihood (EL)
    Empirical likelihood: $L_{el}(\theta\mid x) = \max_p \prod_{i=1}^n p_i$ (*), over all $p$ such that $0 \le p_i \le 1$, $\sum_i p_i = 1$, $\sum_i p_i\, h(x_i,\theta) = 0$.
    Why (*)?
    (1) $L(p) := \prod_{i=1}^n p_i$ is the likelihood of the discrete distribution $p = \sum_i p_i\, \delta_{x_i}$.
    (2) $\sum_i p_i\, h(x_i,\theta) = 0$ implies that $p$ fulfils the moment condition.
    How to compute? Newton-Lagrange scheme. The constraint is linear (in $p$). See Owen (2001), chap. 12, and the package emplik in R.
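    For the mean constraint $h(x_i,\theta) = x_i - \theta$, the maximisation can be sketched through its Lagrangian dual: the optimal weights are $p_i = 1 / \big(n\,(1 + \lambda h_i)\big)$, where $\lambda$ solves $\sum_i h_i / (1 + \lambda h_i) = 0$. The sketch below is an illustration under stated assumptions, not the emplik implementation: it solves for $\lambda$ by bisection instead of the Newton-Lagrange scheme named on the slide, and the names `el_weights_mean` and `log_el` are hypothetical.

```python
import math


def el_weights_mean(x, theta):
    """Weights maximising prod(p_i) s.t. sum(p_i) = 1 and sum(p_i * (x_i - theta)) = 0."""
    n = len(x)
    h = [xi - theta for xi in x]
    if not (min(h) < 0.0 < max(h)):
        return None  # theta outside the range of the data: no feasible weights
    # Multipliers in (a, b) keep every 1 + lam * h_i strictly positive.
    a, b = -1.0 / max(h), -1.0 / min(h)
    for _ in range(200):
        lam = 0.5 * (a + b)
        if lam == a or lam == b:
            break  # interval exhausted at floating-point resolution
        # The dual equation is decreasing in lam: a positive value means the root is to the right.
        if sum(hi / (1.0 + lam * hi) for hi in h) > 0.0:
            a = lam
        else:
            b = lam
    lam = 0.5 * (a + b)
    return [1.0 / (n * (1.0 + lam * hi)) for hi in h]


def log_el(x, theta):
    """log L_el(theta | x), or -inf when the constraint set is empty."""
    p = el_weights_mean(x, theta)
    return -math.inf if p is None else sum(math.log(pi) for pi in p)


x = [1.2, 0.7, 2.9, 1.8, 2.4, 1.1, 3.3, 2.0]
p = el_weights_mean(x, 1.5)
```

    At the sample mean the multiplier vanishes, all weights equal $1/n$, and $L_{el}$ reaches its maximum $n^{-n}$.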
  • 15. EL examples
    Simple case: $\theta$ = mean, $h(x_i,\theta) = x_i - \theta$.
    - $L_{el}(\theta\mid x)$ is maximal at $\hat\theta = \frac{1}{K}(x_1 + \dots + x_K)$.
    - With enough data ($K$ large), and if the data come from a true parametric model, $L_{el}$ is close to the true likelihood.
  • 16. EL examples
    $h(x_i,\theta) = x_i - \theta$. Example 1: Gaussian model.
    [Figure: two panels, K = 20 and K = 200; x-axis: theta; red = parametric likelihood, black = $L_{el}$]
  • 18. EL examples
    Example 2: non-Gaussian model.
    [Figure: K = 200; x-axis: theta; red = true likelihood, blue = Gaussian likelihood, black = $L_{el}$]
    - $L_{el}$ is better than a false model.
  • 19. Section 3: Population genetics
  • 20. Neutral model at a given microsatellite locus, in a closed panmictic population at equilibrium. Sample of 8 genes.
  • 23. Neutral model at a given microsatellite locus, in a closed panmictic population at equilibrium
    Observations: leaves of the tree; $\hat\theta = ?$
    Kingman's genealogy: when the time axis is normalized, $T(k) \sim \mathrm{Exp}(k(k-1)/2)$.
    Mutations according to the simple stepwise mutation model (SMM):
    - dates of the mutations $\sim$ Poisson process with intensity $\theta/2$ over the branches
    - MRCA = 100
    - independent mutations: $\pm 1$ with probability $1/2$
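    The model on this slide can be simulated directly, as in the sketch below. It is a hedged illustration rather than the code used for the talk: the function name `simulate_smm_sample` and the tree bookkeeping are assumptions, while the coalescence rates, the mutation intensity $\theta/2$, the $\pm 1$ steps and the MRCA state 100 follow the slide.

```python
import random


def simulate_smm_sample(n, theta, rng, mrca_state=100):
    """Allelic states of n genes under Kingman's coalescent with stepwise mutations."""
    # Backward in time: merge pairs of lineages until the MRCA is reached.
    time = 0.0
    active = list(range(n))             # node ids of the current lineages
    birth = {i: 0.0 for i in range(n)}  # time at which each node appears (leaves at 0)
    parent, branch = {}, {}
    next_id = n
    while len(active) > 1:
        k = len(active)
        time += rng.expovariate(k * (k - 1) / 2.0)  # T(k) ~ Exp(k(k-1)/2)
        for child in rng.sample(active, 2):         # the pair that coalesces
            parent[child] = next_id
            branch[child] = time - birth[child]
            active.remove(child)
        birth[next_id] = time
        active.append(next_id)
        next_id += 1
    # Forward in time: put mutations on each branch, from the MRCA down to the leaves.
    state = {active[0]: mrca_state}
    for node in range(next_id - 2, -1, -1):         # parents always have larger ids
        s = state[parent[node]]
        t = rng.expovariate(theta / 2.0)            # Poisson process of intensity theta/2
        while t < branch[node]:
            s += rng.choice((-1, 1))                # each mutation is +1 or -1 w.p. 1/2
            t += rng.expovariate(theta / 2.0)
        state[node] = s
    return [state[i] for i in range(n)]


sample = simulate_smm_sample(8, 2.0, random.Random(1))
```

    For a sample of two genes, the squared difference of the two allelic states has expectation $\theta$ under this model, which gives a quick sanity check of the simulator.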
  • 24. Much more interesting models...
    - Several independent loci: independent gene genealogies and mutations.
    - Different populations linked by an evolutionary scenario made of divergences, admixtures, migrations between populations, etc.
    - Larger sample size: usually between 50 and 100 genes.
    A typical evolutionary scenario: [Diagram: tree from the MRCA to POP 0, POP 1 and POP 2, with divergence times $\tau_2$ and $\tau_1$]
  • 26. Moment condition in population genetics?
    Main difficulty: derive a constraint $E_F[h(X,\theta)] = 0$ on the parameters of interest $\theta$ when $X$ is the genotypes of the sample of individuals at a given locus.
    E.g., in phylogeography, $\theta$ is composed of
    - dates of divergence between populations,
    - ratios of population sizes,
    - mutation rates, etc.
    None of them are moments of the distribution of the allelic states of the sample.
    ↪ $h$ = pairwise composite scores, whose zero is the pairwise maximum likelihood estimator
  • 27. Pairwise composite likelihood?
    The intra-locus pairwise likelihood: $\ell_2(x_k\mid\theta) = \prod_{i<j} \ell_2(x_k^i, x_k^j\mid\theta)$, with $x_k^1, \dots, x_k^n$ the allelic states of the gene sample at the $k$-th locus.
    Sample of size 2 $\Rightarrow$ the coalescent tree is a single line joining $x_k^i$ to $x_k^j$ $\Rightarrow$ closed form of $\ell_2(x_k^i, x_k^j\mid\theta)$.
    The pairwise score function: $\nabla_\theta \log \ell_2(x_k\mid\theta) = \sum_{i<j} \nabla_\theta \log \ell_2(x_k^i, x_k^j\mid\theta)$.
  • 28. Composite lik , approximation of lik
    Composite likelihoods are often much narrower than the distribution of the model (see, e.g., Cooley et al., 2012). Hence, using them
    - produces too narrow posterior distributions,
    - gives confidence intervals or credibility intervals that are not reliable.
    But
    $E\big(\nabla_\theta \log \ell_2(x\mid\theta)\big) = \int \nabla_\theta \log \ell_2(x\mid\theta)\, \ell(x\mid\theta)\,dx$
    $= \sum_{i<j} \int \nabla_\theta \log \ell_2(x^i, x^j\mid\theta)\, \ell_2(x^i, x^j\mid\theta)\,dx^i\,dx^j$
    $= \sum_{i<j} \int \frac{\nabla_\theta \ell_2(x^i, x^j\mid\theta)}{\ell_2(x^i, x^j\mid\theta)}\, \ell_2(x^i, x^j\mid\theta)\,dx^i\,dx^j$
    $= \sum_{i<j} \nabla_\theta \int \ell_2(x^i, x^j\mid\theta)\,dx^i\,dx^j = 0$
    Safe with EL because we only use the position of its mode.
  • 31. Pairwise likelihood: a simple case
    Assumptions:
    - sample from a closed, panmictic population at equilibrium
    - marker: microsatellite
    - mutation rate: $\theta/2$
    If $x_k^i$ and $x_k^j$ are two genes of the sample, $\ell_2(x_k^i, x_k^j\mid\theta)$ depends only on $\delta = x_k^i - x_k^j$:
    $\ell_2(\delta\mid\theta) = \frac{1}{\sqrt{1+2\theta}}\, \rho(\theta)^{|\delta|}$, with $\rho(\theta) = \frac{\theta}{1 + \theta + \sqrt{1+2\theta}}$.
    Pairwise score function: $\partial_\theta \log \ell_2(\delta\mid\theta) = -\frac{1}{1+2\theta} + \frac{|\delta|}{\theta\sqrt{1+2\theta}}$.
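    The closed forms on this slide are easy to check numerically. The sketch below (function names `rho`, `pair_lik` and `pair_score` are hypothetical, and the value $\theta = 1.7$ is arbitrary) evaluates the pairwise likelihood and score, then verifies on a truncated range of $\delta$ that the likelihood sums to one and that the score has zero expectation, which is the property that makes it usable as a moment condition.

```python
import math


def rho(theta):
    """rho(theta) = theta / (1 + theta + sqrt(1 + 2 theta))."""
    return theta / (1.0 + theta + math.sqrt(1.0 + 2.0 * theta))


def pair_lik(delta, theta):
    """Pairwise likelihood l_2(delta | theta) of the allelic difference delta."""
    return rho(theta) ** abs(delta) / math.sqrt(1.0 + 2.0 * theta)


def pair_score(delta, theta):
    """Derivative of log l_2(delta | theta) with respect to theta."""
    return -1.0 / (1.0 + 2.0 * theta) + abs(delta) / (theta * math.sqrt(1.0 + 2.0 * theta))


theta = 1.7
deltas = range(-200, 201)  # truncation of the integers; the tail mass is negligible here
total_mass = sum(pair_lik(d, theta) for d in deltas)
mean_score = sum(pair_lik(d, theta) * pair_score(d, theta) for d in deltas)
```

    Here `total_mass` should be numerically 1 and `mean_score` numerically 0.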
  • 32. Pairwise likelihood: 2 diverging populations
    [Diagram: MRCA, POP a, POP b]
    Assumptions:
    - $\tau$: divergence date of pop. a and b
    - $\theta/2$: mutation rate
    Let $x_k^i$ and $x_k^j$ be two genes coming resp. from pop. a and b. Set $\delta = x_k^i - x_k^j$. Then
    $\ell_2(\delta\mid\theta,\tau) = \frac{e^{-\tau\theta}}{\sqrt{1+2\theta}} \sum_{k=-\infty}^{+\infty} \rho(\theta)^{|k|}\, I_{\delta-k}(\tau\theta)$,
    where $I_n(z)$ is the $n$th-order modified Bessel function of the first kind.
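  • 33. Pairwise likelihood: 2 diverging populations
    [Diagram: MRCA, POP a, POP b]
    Assumptions:
    - $\tau$: divergence date of pop. a and b
    - $\theta/2$: mutation rate
    Let $x_k^i$ and $x_k^j$ be two genes coming resp. from pop. a and b. Set $\delta = x_k^i - x_k^j$.
    A 2-dim score function:
    $\partial_\tau \log \ell_2(\delta\mid\theta,\tau) = -\theta + \frac{\theta}{2}\, \frac{\ell_2(\delta-1\mid\theta,\tau) + \ell_2(\delta+1\mid\theta,\tau)}{\ell_2(\delta\mid\theta,\tau)}$
    $\partial_\theta \log \ell_2(\delta\mid\theta,\tau) = -\tau - \frac{1}{1+2\theta} + \frac{\tau}{2}\, \frac{\ell_2(\delta-1\mid\theta,\tau) + \ell_2(\delta+1\mid\theta,\tau)}{\ell_2(\delta\mid\theta,\tau)} + \frac{q(\delta\mid\theta,\tau)}{\ell_2(\delta\mid\theta,\tau)}$
    where $q(\delta\mid\theta,\tau) := \frac{e^{-\tau\theta}}{\sqrt{1+2\theta}}\, \frac{\rho'(\theta)}{\rho(\theta)} \sum_{k=-\infty}^{+\infty} |k|\, \rho(\theta)^{|k|}\, I_{\delta-k}(\tau\theta)$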
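    The two-population pairwise likelihood and its $\tau$-score can be evaluated with the standard library alone, as sketched below. Assumptions to note: the Bessel function is computed from its power series (adequate for moderate $\tau\theta$ only), the infinite sum over $k$ is truncated at `kmax`, and the names `bessel_i`, `pair_lik_div` and `score_tau` as well as the values $\theta = 1.2$, $\tau = 0.8$ are hypothetical. The $\theta$-score involving $q$ is left out.

```python
import math


def bessel_i(n, z, terms=40):
    """Modified Bessel function I_n(z), integer n and z > 0, from its power series."""
    n = abs(n)  # I_{-n}(z) = I_n(z) for integer order
    log_half_z = math.log(z / 2.0)
    # Each term is (z/2)^(2m+n) / (m! (m+n)!), computed in logs to avoid overflow.
    return sum(
        math.exp((2 * m + n) * log_half_z - math.lgamma(m + 1) - math.lgamma(m + n + 1))
        for m in range(terms)
    )


def rho(theta):
    return theta / (1.0 + theta + math.sqrt(1.0 + 2.0 * theta))


def pair_lik_div(delta, theta, tau, kmax=40):
    """l_2(delta | theta, tau) for two genes from populations that diverged at time tau."""
    z = tau * theta
    series = sum(rho(theta) ** abs(k) * bessel_i(delta - k, z) for k in range(-kmax, kmax + 1))
    return math.exp(-z) * series / math.sqrt(1.0 + 2.0 * theta)


def score_tau(delta, theta, tau):
    """Derivative of log l_2(delta | theta, tau) with respect to tau."""
    neighbours = pair_lik_div(delta - 1, theta, tau) + pair_lik_div(delta + 1, theta, tau)
    return -theta + 0.5 * theta * neighbours / pair_lik_div(delta, theta, tau)


theta, tau = 1.2, 0.8
total_mass = sum(pair_lik_div(d, theta, tau) for d in range(-30, 31))
```

    A finite-difference derivative of `math.log(pair_lik_div(...))` in `tau` can be compared with `score_tau` to check the formula on the slide.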
  • 34. Section 4: Numerical experiments
  • 36. A first experiment
    Evolutionary scenario: [Diagram: MRCA, POP 0, POP 1]
    Dataset:
    - 50 genes per population,
    - 100 microsat. loci
    Assumptions:
    - $N_e$ identical over all populations
    - parameter: $(\log_{10}\theta, \log_{10}\tau)$
    - uniform prior over $(-1, 1.5) \times (-1, 1)$
    Comparison of the original ABC with ABCel:
    [Figure: posterior densities of log(theta) and log(tau1); ESS = 7034; histogram = ABCel, curve = original ABC, vertical line = "true" parameter]
  • 37. ABC vs. ABCel on 100 replicates of the 1st experiment
    Accuracy:
             $\log_{10}\theta$        $\log_{10}\tau$
             ABC      ABCel    ABC      ABCel
      (1)    0.097    0.094    0.315    0.117
      (2)    0.071    0.059    0.272    0.077
      (3)    0.68     0.81     1.0      0.80
    (1) Root mean square error of the posterior mean
    (2) Median absolute deviation of the posterior median
    (3) Coverage of the credibility interval of probability 0.8
    Computation time, on a recent 6-core computer (C++/OpenMP):
    - ABC: 4 hours
    - ABCel: 2 minutes
  • 39. Second experiment
    Evolutionary scenario: [Diagram: tree from the MRCA to POP 0, POP 1 and POP 2, with divergence times $\tau_2$ and $\tau_1$]
    Dataset:
    - 50 genes per population,
    - 100 microsat. loci
    Assumptions:
    - $N_e$ identical over all populations
    - parameter: $\log_{10}(\theta, \tau_1, \tau_2)$
    - non-informative uniform prior
    Comparison of the original ABC with ABCel:
    [Figure: posterior densities; histogram = ABCel, curve = original ABC, vertical line = "true" parameter]
  • 42. ABC vs. ABCel on 100 replicates of the 2nd experiment
    Accuracy:
             $\log_{10}\theta$        $\log_{10}\tau_1$       $\log_{10}\tau_2$
             ABC      ABCel    ABC      ABCel    ABC      ABCel
      (1)    0.0059   0.0794   0.472    0.483    29.3     4.76
      (2)    0.048    0.053    0.32     0.28     4.13     3.36
      (3)    0.79     0.76     0.88     0.76     0.89     0.79
    (1) Root mean square error of the posterior mean
    (2) Median absolute deviation of the posterior median
    (3) Coverage of the credibility interval of probability 0.8
    Computation time, on a recent 6-core computer (C++/OpenMP):
    - ABC: 6 hours
    - ABCel: 8 minutes
  • 43. Why?
    On large datasets, ABCel gives more accurate results than ABC.
    ABC simplifies the dataset through summary statistics. Due to the large dimension of $x$, the original ABC algorithm estimates $\pi(\theta\mid\eta(x_{\text{obs}}))$, where $\eta(x_{\text{obs}})$ is some (non-linear) projection of the observed dataset on a space with smaller dimension.
    ↪ Some information is lost.
    ABCel simplifies the model through a generalized moment condition model.
    ↪ Here, the moment condition model is based on the pairwise composite likelihood.
  • 47. Joint work with
    - Christian P. Robert (U. Dauphine & IUF)
    - Kerrie Mengersen (QUT, Australia)
    - Raphaël Leblois (INRA CBGP, Montpellier)
    Grant from ANR through Project "Emile".
    First preprint on arXiv: Approximate Bayesian computation via empirical likelihood.
    Coming soon: population genetic models which are too slow to simulate.
  • 49. Empirical likelihood
    Wilks' theorem for the EL likelihood ratio.
    Theorem (Owen, 1991). Let $X, X_1, \dots, X_n$ be iid random vectors with common distribution $F_0$. For $\theta \in \Theta$, $x \in \mathcal{X}$, let $h(x,\theta) \in \mathbb{R}^s$. Let $\theta_0 \in \Theta$ be such that $\mathrm{Var}(h(X,\theta_0))$ is finite and has rank $q > 0$. If $\theta_0$ satisfies $E(h(X,\theta_0)) = 0$, then
    $-2 \log \frac{L_{el}(\theta_0\mid X_{1:n})}{n^{-n}} \to \chi^2_{(q)}$ in distribution.
    Note that
    - $n^{-n}$ is the maximum value of $L_{el}$,
    - there is no condition to ensure a unique value for $\theta_0$,
    - there is no condition to make $\hat\theta_{el} = \arg\max L_{el}$ a good estimate of $\theta_0$,
    - the theorem provides tests and confidence intervals.
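    As an illustration of the last point, the theorem yields a confidence interval for a mean ($q = 1$): keep every $\theta$ whose statistic $-2\log\big(L_{el}(\theta\mid x)/n^{-n}\big)$ stays below the 0.95 quantile of $\chi^2_{(1)}$, about 3.841. The sketch below is not taken from the talk; the simulated data, the grid scan, the bisection on the Lagrange multiplier and the name `neg2_log_el_ratio` are assumptions.

```python
import math
import random


def neg2_log_el_ratio(x, theta):
    """-2 log( L_el(theta | x) / n^{-n} ) for the mean constraint h(x_i, theta) = x_i - theta."""
    h = [xi - theta for xi in x]
    if not (min(h) < 0.0 < max(h)):
        return math.inf  # theta outside the range of the data
    a, b = -1.0 / max(h), -1.0 / min(h)
    for _ in range(200):  # bisection on the Lagrange multiplier
        lam = 0.5 * (a + b)
        if lam == a or lam == b:
            break
        if sum(hi / (1.0 + lam * hi) for hi in h) > 0.0:
            a = lam
        else:
            b = lam
    lam = 0.5 * (a + b)
    # With p_i = 1 / (n (1 + lam h_i)), the ratio is prod 1 / (1 + lam h_i).
    return 2.0 * sum(math.log1p(lam * hi) for hi in h)


rng = random.Random(3)
x = [rng.gauss(2.0, 1.0) for _ in range(50)]
lo, hi = min(x), max(x)
grid = [lo + i * (hi - lo) / 1000 for i in range(1001)]
CHI2_1_095 = 3.841  # 0.95 quantile of the chi-square distribution with 1 degree of freedom
accepted = [t for t in grid if neg2_log_el_ratio(x, t) <= CHI2_1_095]
interval = (min(accepted), max(accepted))
```

    The interval always contains the sample mean, where the statistic is zero.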