Introduction to Bayesian Phylogenetics
Tracy Heath
(+ content by Paul Lewis)
Ecology, Evolution, & Organismal Biology
Iowa State University
@trayc7
http://phyloworks.org
2017 Phylogenomics Workshop
Český Krumlov, Czech Republic
Outline
Lecture & Demo: Bayesian fundamentals
Bayes theorem, priors
break
Markov chain Monte Carlo
Bayesian phylogenetics
Site-specific models applied in phylogenomic studies
lunch
Lecture: Introduction to RevBayes
break
RevBayes tutorial 1: Inferring unrooted phylogenies
dinner
Lecture: Bayesian divergence-time estimation
break
RevBayes tutorial 2: Divergence dating w/ fossil & molecular data
done!
Tracy A. Heath (2017 Workshop on Phylogenomics, Český Krumlov, CZ)
Bayesian or Maximum Likelihood?
What’s the difference?
Bayesian:
• estimates f(θ | D)
• estimates a distribution
• parameters are random variables
• average over nuisance parameters
Maximum Likelihood:
• maximizes f(D | θ)
• point estimate
• parameters are fixed/unknown
• optimize nuisance parameters
[Figure: posterior density plotted against parameter value (x-axis: parameter value; y-axis: density), with the maximum likelihood estimate marked.]
Joint Probability
Let’s start with joint probability and the
simple example that Paul Lewis gives in his
lecture on Bayesian phylogenetics
Slides source: https://molevol.mbl.edu/index.php/Paul_Lewis
(photo of Paul Lewis by David Hillis)
Paul O. Lewis (2016 Woods Hole Molecular Evolution Workshop)
Joint probabilities
B = Black, W = White, S = Solid, D = Dotted
Conditional probabilities
Bayes’ rule
Probability of "Dotted"
Pr(D) is the marginal probability of being dotted
To compute it, we marginalize over colors
\[
\Pr(B \mid D) = \frac{\Pr(B)\,\Pr(D \mid B)}{\Pr(D)} = \frac{\Pr(D, B)}{\Pr(D, B) + \Pr(D, W)}
\]
Bayes' rule (cont.)
It is easy to see that Pr(D) serves as a normalization
constant, ensuring that Pr(B|D) + Pr(W|D) = 1.0
Joint probabilities
       B          W
D   Pr(D,B)   Pr(D,W)
S   Pr(S,B)   Pr(S,W)
Marginalizing over colors
The marginal probability of being a dotted marble is the sum of all joint probabilities involving dotted marbles (the D row: Pr(D,B) and Pr(D,W)).
Marginal probabilities
Pr(D) = Pr(D,B) + Pr(D,W) = marginal probability of being dotted
Pr(S) = Pr(S,B) + Pr(S,W) = marginal probability of being solid
Marginalizing over "dottedness"
The marginal probability of being a white marble is the sum over the W column: Pr(D,W) + Pr(S,W).
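The marble calculations can be checked numerically. The sketch below is a minimal illustration that assumes a made-up joint probability table: the four values are hypothetical, chosen only so that they sum to 1, because the numbers on the original slides are not reproduced in this transcript. It marginalizes over colors to get Pr(D) and then applies Bayes' rule.

```python
# Hypothetical joint probabilities for (pattern, color); they must sum to 1.
joint = {
    ("D", "B"): 0.20,  # dotted and black
    ("D", "W"): 0.10,  # dotted and white
    ("S", "B"): 0.40,  # solid and black
    ("S", "W"): 0.30,  # solid and white
}


def marginal_pattern(joint, pattern):
    """Marginal probability of a pattern: sum the joint over colors."""
    return sum(p for (pat, _), p in joint.items() if pat == pattern)


def conditional_color(joint, color, pattern):
    """Pr(color | pattern) = Pr(pattern, color) / Pr(pattern)."""
    return joint[(pattern, color)] / marginal_pattern(joint, pattern)


pr_d = marginal_pattern(joint, "D")
pr_b_given_d = conditional_color(joint, "B", "D")
pr_w_given_d = conditional_color(joint, "W", "D")
print(pr_d, pr_b_given_d, pr_w_given_d)
```

Because Pr(D) is the normalizing constant, the two conditional probabilities add up to 1 whatever joint table is plugged in.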
Bayes' rule (cont.)
\[
\Pr(\theta \mid D) = \frac{\Pr(D \mid \theta)\,\Pr(\theta)}{\sum_{\theta} \Pr(D \mid \theta)\,\Pr(\theta)}
\]
Bayes' rule in Statistics
D refers to the "observables" (i.e. the Data)
θ refers to one or more "unobservables" (i.e. parameters of a model, or the model itself):
– tree model (i.e. tree topology)
– substitution model (e.g. JC, F84, GTR, etc.)
– parameter of a substitution model (e.g. a branch length, a base frequency, transition/transversion rate ratio, etc.)
– hypothesis (i.e. a special case of a model)
– a latent variable (e.g. ancestral state)
B I
Estimate the probability of a hypothesis (model) conditional
on observed data.
The probability represents the researcher’s degree of belief.
Bayes' Theorem (also called Bayes' rule) specifies the
conditional probability of the hypothesis given the data.
Bayes' Theorem
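The posterior probability of a discrete parameter δ conditional on the data D is
\[
\Pr(\delta \mid D) = \frac{\Pr(D \mid \delta)\,\Pr(\delta)}{\sum_{\delta} \Pr(D \mid \delta)\,\Pr(\delta)}
\]
The denominator, \(\sum_{\delta} \Pr(D \mid \delta)\,\Pr(\delta)\), is the likelihood marginalized over all possible values of δ.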
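The posterior probability density of a continuous parameter θ conditional on the data D is
\[
f(\theta \mid D) = \frac{f(D \mid \theta)\, f(\theta)}{\int_{\theta} f(D \mid \theta)\, f(\theta)\, d\theta}
\]
The denominator, \(\int_{\theta} f(D \mid \theta)\, f(\theta)\, d\theta\), is the likelihood marginalized over all possible values of θ.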
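For a single parameter, the integral in the denominator can be approximated on a grid. The sketch below is only an illustration under stated assumptions: the exponential likelihood, the Uniform(0, 5) prior, and the observed value 2.0 are hypothetical choices that do not come from the slides, and `grid_posterior` is a made-up helper name. The integral is replaced by a trapezoid-rule sum.

```python
import math


def grid_posterior(likelihood, prior, grid):
    """Approximate f(theta | D) on an evenly spaced grid.

    The integral in the denominator of Bayes' theorem is replaced by a
    trapezoid-rule sum over the grid points.
    """
    unnormalized = [likelihood(t) * prior(t) for t in grid]
    step = grid[1] - grid[0]
    marginal = step * (sum(unnormalized) - 0.5 * (unnormalized[0] + unnormalized[-1]))
    return [u / marginal for u in unnormalized], marginal


# Hypothetical example: one observation D = 2.0 with an exponential
# likelihood f(D | theta) = theta * exp(-theta * D) and a Uniform(0, 5) prior.
observed = 2.0
grid = [5.0 * i / 1000 for i in range(1001)]
posterior, marginal_likelihood = grid_posterior(
    lambda theta: theta * math.exp(-theta * observed),
    lambda theta: 1.0 / 5.0,
    grid,
)
best = grid[posterior.index(max(posterior))]  # grid point with highest density
print(best, marginal_likelihood)
```

Grid approximation becomes impractical as the number of parameters grows, which is one motivation for the sampling methods introduced later in the lecture.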
Priors
Prior distributions are an important part of Bayesian statistics
The distribution of θ before any data are collected is the prior
f(θ)
The prior describes your uncertainty in the parameters of your model
Paul Lewis gives a clear example of a prior
in action...
If you had to guess...
Not knowing anything about my archery abilities, draw a curve representing your view of the chances of my arrow landing a distance d from the center of the target (if it helps, I'm standing 50 meters away from the target)
[Figure: blank horizontal axis for d, labeled from 0.0 to ∞; the drawing is annotated "1 meter".]
Case 1: assume I have talent
An informative prior (low variance) that says most of my arrows will fall within 20 cm of the center (thanks for your confidence!)
[Figure: prior density; x-axis: distance in centimeters from target center, 0.0 to 80.0.]
Case 2: assume I have a talent for missing the target!
Also an informative prior, but one that says most of my arrows will fall within a narrow range just outside the entire target!
[Figure: prior density; x-axis: distance in cm from target center, 0.0 to 60.0.]
Case 3: assume I have no talent
This is a vague prior: its high variance reflects nearly total ignorance of my abilities, saying that my arrows could land nearly anywhere!
[Figure: prior density; x-axis: distance in cm from target center, 0.0 to 80.0.]
A matter of scale
Notice that I haven't provided a scale for the vertical axis.
What exactly does the height of this curve mean?
For example, does the height of the dotted line represent the probability that my arrow lands 60 cm from the center of the target?
No.
Probabilities are associated with intervals
Probabilities are attached to intervals (i.e. ranges of values), not individual values
The probability of any given point (e.g. d = 60.0) is zero!
However, we can ask about the probability that d falls in a particular range, e.g. 50.0 < d < 65.0
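In symbols, writing f(d) for the probability density of d, the two statements about intervals and points can be restated as follows. This is the standard identity for continuous random variables, added here as a worked restatement and not taken from the slides:

```latex
\[
\Pr(50.0 < d < 65.0) = \int_{50.0}^{65.0} f(d)\,\mathrm{d}d,
\qquad
\Pr(d = 60.0) = \int_{60.0}^{60.0} f(d)\,\mathrm{d}d = 0 .
\]
```

The height of the curve at d = 60.0 is therefore a density, and it only becomes a probability once it is integrated over a range.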
P: A E
Let’s continue with the archery example: we may assume a gamma prior distribution on my archery skill (distance from bullseye d) with a shape parameter α and a rate parameter β.
d ∼ Gamma(α,β)
\[
f(d \mid \alpha, \beta) = \frac{\beta^{\alpha}}{\Gamma(\alpha)}\, d^{\alpha - 1} e^{-\beta d}
\]
[Figure: gamma prior density; x-axis: distance in cm from target center (d), 0 to 70; y-axis: density.]
This requires us to assign values for α and β based on our
prior belief
If we have some prior knowledge of the mean (m) and variance (v) of the gamma distribution, we can compute α and β.
\[
m = \frac{\alpha}{\beta}, \qquad v = \frac{\alpha}{\beta^{2}}, \qquad \alpha = \frac{m^{2}}{v}, \qquad \beta = \frac{m}{v}
\]
d ∼ Gamma(α,β)
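The conversion from mean and variance to shape and rate is a two-line calculation. A minimal sketch follows; the function name `gamma_shape_rate` is hypothetical, and the inputs m = 60 and v = 3 are the values the lecture uses for the archer who consistently misses the target.

```python
def gamma_shape_rate(mean, variance):
    """Convert a gamma distribution's mean and variance to (shape, rate)."""
    if mean <= 0 or variance <= 0:
        raise ValueError("mean and variance must both be positive")
    shape = mean * mean / variance  # alpha = m^2 / v
    rate = mean / variance          # beta  = m / v
    return shape, rate


alpha, beta = gamma_shape_rate(60.0, 3.0)
print(alpha, beta)  # 1200.0 20.0
```

Plugging the result back in recovers the inputs: α/β = 60 and α/β² = 3.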
Let’s assume that I will consistently miss the target. This corresponds to a gamma distribution with a mean (m) of 60 and a variance (v) of 3.
α = 1200
β = 20
d ∼ Gamma(α,β)
[Figure: gamma prior density; x-axis: distance in cm from target center (d), 0 to 70; y-axis: density; the drawing is annotated "1 meter".]
Another way of expressing d ∼ Gamma(α,β) is with a
probabilistic graphical model
[Graphical model: α and β are the parameters of the gamma distribution on d.]
[Figure: gamma density plotted against distance in cm from target center (d).]
This shows that our observed datum (d, a single observed shot) is conditionally dependent on the shape (α) and rate (β) of the gamma distribution.
We can parameterize the model using the mean (m) and
variance (v), where α and β are computed using m and v.
[Graphical model: m and v determine α = m²/v and β = m/v, which are the parameters of the gamma distribution on d.]
Sometimes it’s better to use an alternative parameterization. We may have more intuition about mean and variance than we have about shape and rate.
If somehow we happened to know the true mean and variance of my accuracy at the archery range, we could easily calculate the likelihood of any observed shot:
\[
f(d \mid \alpha, \beta) = \frac{\beta^{\alpha}}{\Gamma(\alpha)}\, d^{\alpha - 1} e^{-\beta d}
\]
f(d = 39.76 | α = 1200, β = 20) = 7.89916e−40
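The quoted likelihood can be reproduced with the Python standard library alone. The sketch below assumes the shape/rate form of the gamma density, which is the form consistent with m = α/β and v = α/β²; the function name `gamma_log_density` is hypothetical. It works on the log scale because the density itself is very small.

```python
import math


def gamma_log_density(d, shape, rate):
    """Log of the gamma density with shape alpha and rate beta at d > 0."""
    return (
        shape * math.log(rate)
        - math.lgamma(shape)
        + (shape - 1.0) * math.log(d)
        - rate * d
    )


log_f = gamma_log_density(39.76, 1200.0, 20.0)
print(round(log_f, 4))  # -90.0366
print(math.exp(log_f))  # roughly 7.9e-40
```

A shot at 39.76 cm is very unlikely for an archer whose shots average 60 cm with a variance of 3, which is why the likelihood is so close to zero.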
RB D: A A
RevBayes
Fully integrative Bayesian inference of
phylogenetic parameters using
probabilistic graphical models and an
interpreted language
http://RevBayes.com
Höhna, Landis, Heath, Boussau, Lartillot, Moore, Huelsenbeck, Ronquist. 2016. RevBayes: Bayesian phylogenetic inference using graphical models and an interactive model-specification language. Systematic Biology. (doi: 10.1093/sysbio/syw021)
Graphical Models in RevBayes
Graphical models provide tools for
visually & computationally representing
complex, parameter-rich probabilistic
models
We can depict the conditional
dependence structure of various
parameters and other random variables
Höhna, Heath, Boussau, Landis, Ronquist, Huelsenbeck. 2014.
Probabilistic Graphical Model Representation in Phylogenetics.
Systematic Biology. (doi: 10.1093/sysbio/syu039)
RB D: M  A S
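Rev code calculating the log probability of 1 data observation, observed_shot, given a mean and variance:
mean <- 60                      # constant node: mean distance (cm)
var <- 3                        # constant node: variance
alpha := (mean * mean) / var    # deterministic node: shape
beta := mean / var              # deterministic node: rate
observed_shot = 39.76           # the observed datum
d ~ dnGamma(alpha, beta)        # stochastic node
d.clamp(observed_shot)          # fix d to the observed value
d.lnProbability()               # log-likelihood of the observation
   -90.0366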
E: M  A S
What if we do not know m and v?
We can use maximum likelihood or Bayesian methods for
estimating their values.
Maximum likelihood methods require us to find the values
of m and v that maximize f(d | m,v).
Bayesian methods use prior distributions to describe our
uncertainty in m and v and estimate f(m,v | d).
E: H A M
We must define prior distributions for m and v to account for uncertainty and estimate the posterior densities of those parameters
Now x and y are the parameters of the uniform prior distribution on m, and a and b are the shape and rate parameters of the gamma prior distribution on v.
[Graphical model: m ∼ Uniform(x, y) and v ∼ Gamma(a, b); α = m²/v and β = m/v; d ∼ Gamma(α, β).]
The values we choose for the parameters of these hyperprior distributions should reflect our prior knowledge.
The previously observed shot landed 39.76 cm from the target center, so we may use it to parameterize our hyperpriors for the analysis of future observations.
m ∼ Uniform(x,y), x = 10.0, y = 50.0, E(m) = 30.0
v ∼ Gamma(a,b), a = 20.0, b = 2.0, E(v) = 10.0
RB D: H A M
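Rev code specifying a hierarchical model of shot accuracy, based on 1 new observation:
mean ~ dnUnif(10, 50)           # uniform prior on the mean
var ~ dnGamma(20, 2)            # gamma prior (shape, rate) on the variance
alpha := (mean * mean) / var    # deterministic node: shape
beta := mean / var              # deterministic node: rate
observed_shot = 35.21           # the new observation
d ~ dnGamma(alpha, beta)
d.clamp(observed_shot)
d.lnProbability()               # result depends on the initial values of mean & var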
E: H A M
Now that we have a defined model, how do we estimate
the posterior probability density?
m ∼ Uniform(x,y)
v ∼ Gamma(a,b)
d ∼ Gamma(α,β)
\[
f(m, v \mid d, a, b, x, y) \propto f\!\left(d \,\middle|\, \alpha = \frac{m^{2}}{v},\ \beta = \frac{m}{v}\right) f(m \mid x, y)\, f(v \mid a, b)
\]
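The right-hand side of this proportionality is straightforward to evaluate for any proposed pair (m, v). The sketch below is one possible way to code it, not the RevBayes implementation: function names are hypothetical, and the default hyperparameters are the values x = 10, y = 50, a = 20, b = 2 given in the lecture.

```python
import math


def gamma_log_density(x, shape, rate):
    """Log gamma density in the shape/rate parameterization."""
    return (shape * math.log(rate) - math.lgamma(shape)
            + (shape - 1.0) * math.log(x) - rate * x)


def log_unnormalized_posterior(m, v, d, x=10.0, y=50.0, a=20.0, b=2.0):
    """log of f(d | alpha, beta) * f(m | x, y) * f(v | a, b)."""
    if not (x < m < y) or v <= 0.0:
        return -math.inf  # outside the support of the priors
    alpha = m * m / v
    beta = m / v
    log_likelihood = gamma_log_density(d, alpha, beta)
    log_prior_m = -math.log(y - x)            # Uniform(x, y) density
    log_prior_v = gamma_log_density(v, a, b)  # Gamma(a, b) density
    return log_likelihood + log_prior_m + log_prior_v


near = log_unnormalized_posterior(35.0, 10.0, 35.21)
far = log_unnormalized_posterior(15.0, 10.0, 35.21)
outside = log_unnormalized_posterior(60.0, 10.0, 35.21)
print(near, far, outside)
```

Only ratios of this quantity are needed by the sampler described next, so the missing normalizing constant f(d) never has to be computed.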
Markov Chain Monte Carlo (MCMC)
An algorithm for approximating the posterior distribution
Metropolis, Rosenbluth, Rosenbluth, Teller, Teller. 1953. Equation of state calculations by fast computing machines. J. Chem. Phys.
Hastings. 1970. Monte Carlo sampling methods using Markov chains and their applications. Biometrika.
More on MCMC from Paul Lewis and his
lecture on Bayesian phylogenetics
Markov chain Monte Carlo (MCMC)
For more complex problems, we might settle for a good approximation to the posterior distribution
MCMC robot’s rules
Uphill steps are always accepted
Slightly downhill steps are usually accepted
Drastic “off the cliff” downhill steps are almost never accepted
With these rules, it is easy to see why the robot tends to stay near the tops of hills
(Actual) MCMC robot rules
Uphill steps are always accepted because R > 1
  Currently at 1.0 m, proposed at 2.3 m: R = 2.3/1.0 = 2.3
Slightly downhill steps are usually accepted because R is near 1
  Currently at 6.2 m, proposed at 5.7 m: R = 5.7/6.2 = 0.92
Drastic “off the cliff” downhill steps are almost never accepted because R is near 0
  Currently at 6.2 m, proposed at 0.2 m: R = 0.2/6.2 = 0.03
The robot takes a step if it draws a Uniform(0,1) random deviate that is less than or equal to R
\[
R = \frac{f(D \mid \theta^{*})\, f(\theta^{*}) \,/\, f(D)}{f(D \mid \theta)\, f(\theta) \,/\, f(D)}
\]
Cancellation of marginal likelihood
When calculating the ratio R of posterior densities, the marginal probability of the data cancels.
\[
\underbrace{\frac{f(\theta^{*} \mid D)}{f(\theta \mid D)}}_{\text{posterior odds}}
= \underbrace{\frac{f(D \mid \theta^{*})}{f(D \mid \theta)}}_{\text{likelihood ratio}}
\times \underbrace{\frac{f(\theta^{*})}{f(\theta)}}_{\text{prior odds}}
\]
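The robot's rules translate almost line for line into code. The sketch below is a generic one-parameter Metropolis sampler under a few assumptions that are not in the slides: the proposal is a symmetric uniform window of half-width `step_size`, the ratio R is handled on the log scale, and the example target is a standard normal density known only up to a constant. All function and variable names are hypothetical.

```python
import math
import random


def metropolis(log_target, start, step_size, n_steps, rng):
    """Sample from a 1-D density known only up to a constant.

    log_target returns log[f(D | theta) f(theta)]; f(D) is never needed
    because it cancels in the acceptance ratio R.
    """
    theta = start
    log_p = log_target(theta)
    samples = []
    accepted = 0
    for _ in range(n_steps):
        proposal = theta + rng.uniform(-step_size, step_size)  # symmetric proposal
        log_p_new = log_target(proposal)
        log_r = log_p_new - log_p  # log of R
        # Uphill (R >= 1) is always accepted; otherwise accept when a
        # Uniform(0,1) draw is <= R, compared here on the log scale.
        if log_r >= 0.0 or math.log(1.0 - rng.random()) <= log_r:
            theta, log_p = proposal, log_p_new
            accepted += 1
        samples.append(theta)  # a rejected step repeats the current value
    return samples, accepted / n_steps


rng = random.Random(1)
samples, acceptance_rate = metropolis(lambda t: -0.5 * t * t, 0.0, 1.0, 20000, rng)
kept = samples[2000:]  # discard the first 2000 steps as burn-in
sample_mean = sum(kept) / len(kept)
print(sample_mean, acceptance_rate)
```

Changing `step_size` in this sketch is an easy way to experiment with how the proposal width affects the acceptance rate.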
Target vs. Proposal Distributions
Pretend this proposal distribution allows good mixing. What does good mixing mean?
[Figure: trace plot (default2.TXT); x-axis: State, 0 to 17500.]
Trace plots
[Figure: trace plot; y-axis: log(posterior).]
“White noise” appearance is a sign of good mixing
I used the program Tracer to create this plot: http://tree.bio.ed.ac.uk/software/tracer/
AWTY (Are We There Yet?) is useful for investigating convergence: http://king2.scs.fsu.edu/CEBProjects/awty/awty_start.php
Target vs. Proposal Distributions
Proposal distributions with smaller variance...
Disadvantage: robot takes smaller steps, more time required to explore the same area
Advantage: robot seldom refuses to take proposed steps
[Figure: trace plot (smallsteps.TXT); x-axis: State, 0 to 17500.]
If step size is too small, large-scale trends will be apparent
[Figure: trace plot; y-axis: log(posterior).]
Target vs. Proposal Distributions
Proposal distributions with larger variance...
Disadvantage: robot often proposes a step that would take it off a cliff, and refuses to move
Advantage: robot can potentially cover a lot of ground quickly
[Figure: trace plot (bigsteps2.TXT); x-axis: State, 0 to 17500.]
Chain is spending long periods of time “stuck” in one place
“Stuck” robot is indicative of step sizes that are too large (most proposed steps would take the robot “off the cliff”)
[Figure: trace plot; y-axis: log(posterior).]
Metropolis-coupled Markov chain Monte Carlo (MCMCMC)
• MCMCMC involves running several chains simultaneously
• The cold chain is the one that counts, the rest are heated chains
• Chain is heated by raising densities to a power less than 1.0 (values closer to 0.0 are warmer)
Geyer, C. J. 1991. Markov chain Monte Carlo maximum likelihood for dependent data. Pages 156-163 in Computing Science and Statistics (E. Keramidas, ed.).
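Written out, heating and chain swapping look as follows. The notation is an assumption introduced here and is not on the slides: p_i is the power applied to chain i (p = 1 for the cold chain), θ_i is the current state of chain i, and f_i is the density that chain i samples from. The swap ratio is the standard Metropolis-coupled form.

```latex
\[
f_i(\theta \mid D) \propto \bigl[ f(D \mid \theta)\, f(\theta) \bigr]^{p_i},
\qquad 0 < p_i \le 1,
\]
\[
R_{\text{swap}} =
\frac{f_i(\theta_j \mid D)\, f_j(\theta_i \mid D)}
     {f_i(\theta_i \mid D)\, f_j(\theta_j \mid D)} .
\]
```

Each f_i appears once in the numerator and once in the denominator, so the unknown normalizing constants cancel here just as f(D) cancels in the ordinary acceptance ratio.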
Heated chains act as scouts for the cold chain
Cold chain robot can easily make this jump because it is uphill
Hot chain robot can also make this jump with high probability because it is only slightly downhill
Hot chain and cold chain robots swapping places
Swapping places means both robots can cross the valley, but this is more important for the cold chain because its valley is much deeper
M C M C (MCMC)
Thanks, Paul!
Slides source: https://molevol.mbl.edu/index.php/Paul_Lewis
See MCMCRobot, a helpful
software program for learning
MCMC by Paul Lewis
http://www.mcmcrobot.org
Tracy A. Heath (2017 Workshop on Phylogenomics, Český Krumlov, CZ)
RB D: H A M
Rev code specifying the MCMC sampler for the hierarchical model of archery accuracy:
... # model specification from previous demo
mymodel = model(beta)                                        # any node in the graph identifies the model
moves[1] = mvSlide(mean, delta=1.0, tune=true, weight=3.0)   # sliding-window move on the mean
moves[2] = mvScale(var, lambda=1.0, tune=true, weight=3.0)   # scaling move on the variance
monitors[1] = mnModel(file="archery_mcmc_1.log", printgen=10, ...)
monitors[2] = mnScreen(printgen=1000, mean, var)
mymcmc = mcmc(mymodel, monitors, moves, nruns=1)
mymcmc.burnin(generations=10000, tuningInterval=1000)        # burn-in, tuning the moves every 1000 generations
mymcmc.run(generations=40000, underPrior=false)              # sample from the posterior
MCMC screen output
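For readers without RevBayes at hand, the sampler configured above can be imitated in a few lines of Python. This is a rough sketch and not the RevBayes implementation: the sliding and scaling proposals are written from their usual definitions, the two moves are picked with equal probability, there is no auto-tuning, and the data are assumed to be the single shot of 35.21 cm from the earlier demo. All function names are hypothetical.

```python
import math
import random


def gamma_log_density(x, shape, rate):
    """Log gamma density in the shape/rate parameterization."""
    return (shape * math.log(rate) - math.lgamma(shape)
            + (shape - 1.0) * math.log(x) - rate * x)


def log_posterior(m, v, shots):
    """Unnormalized log posterior for m ~ Uniform(10, 50), v ~ Gamma(20, 2)."""
    if not (10.0 < m < 50.0) or v <= 0.0:
        return -math.inf
    alpha, beta = m * m / v, m / v
    log_lik = sum(gamma_log_density(d, alpha, beta) for d in shots)
    return log_lik - math.log(40.0) + gamma_log_density(v, 20.0, 2.0)


def run_mcmc(shots, n_steps, rng, delta=1.0, lam=1.0):
    m, v = 30.0, 10.0  # start at the prior means
    log_p = log_posterior(m, v, shots)
    trace = []
    for _ in range(n_steps):
        if rng.random() < 0.5:
            # Sliding-window move on the mean (symmetric, no Hastings term).
            m_new, v_new, log_hastings = m + delta * (rng.random() - 0.5), v, 0.0
        else:
            # Scaling move on the variance: multiplying by a random factor
            # is asymmetric, so the Hastings ratio equals that factor.
            factor = math.exp(lam * (rng.random() - 0.5))
            m_new, v_new, log_hastings = m, v * factor, math.log(factor)
        log_p_new = log_posterior(m_new, v_new, shots)
        log_r = log_p_new - log_p + log_hastings
        if math.log(1.0 - rng.random()) <= log_r:
            m, v, log_p = m_new, v_new, log_p_new
        trace.append((m, v))
    return trace


trace = run_mcmc([35.21], 20000, random.Random(7))
kept = trace[5000:]  # discard burn-in
mean_m = sum(m for m, _ in kept) / len(kept)
mean_v = sum(v for _, v in kept) / len(kept)
print(mean_m, mean_v)
```

With a single observation the posterior mean of m should sit near the observed shot, while v is expected to stay close to its prior mean of 10.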
Summary of the MCMC sample for the mean distance from
target center.
The trace-plot of the MCMC samples for the mean distance
from target center
E: H A M
Under this model, we do a good job of estimating the mean, but when judging archery skill, precision (variance) is as important as accuracy, if not more so
Thus, it is also worth evaluating the estimated posterior
distribution for the variance component of our model
E: V
The posterior estimate of the variance (v) is quite different from the true value (4.0) and from the highest likelihood value found by our MCMC (MLE 3.51374).
[Figure: prior and posterior densities of the variance (v), with the MLE and true value marked; x-axis: variance (v), 0 to 15; y-axis: density.]
This indicates that the prior is having a strong influence on
the posterior. Why do you think that is?
When the prior closely matches the posterior, it can indicate that the data are not very informative for this parameter.
Remember that our data were only 6 observed shots. What
would happen if I had 600 arrows?
E: W   M D
With 100X more observations, we can estimate the mean and variance with greater precision.
[Figure: Marginal Posterior of the Variance (v): prior and posterior densities with the MLE and true value marked; x-axis: variance (v), 0 to 15; y-axis: density.]
[Figure: Marginal Posterior of the Mean (m): prior and posterior densities; x-axis: mean (m), 33.0 to 36.0; y-axis: density.]
Bayesian Phylogenetics
How is this applied to phylogenetic inference?
Jukes-Cantor (1969) on an unrooted tree
[Graphical model: seq ∼ PhyloCTMC with rate matrix Q (JC) and a tree assembled from a topology τ ∼ Uniform and branch lengths bl_i ∼ Exponential(10), for i in 1, ..., 2N − 3. Image source: RevBayes Substitution Models Tutorial.]
for (i in 1:n_branches) {
    bl[i] ~ dnExponential(10.0)
}
topology ~ dnUniformTopology(taxa)
psi := treeAssembly(topology, bl)
Q <- fnJC(4)
seq ~ dnPhyloCTMC( tree=psi, Q=Q, type="DNA" )
seq.clamp( data )
We can assemble a phylogenetic model in the same way,
using previously described models and probability
distributions as priors.
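To make the Jukes-Cantor part of this model concrete, the sketch below computes JC69 transition probabilities and the log-likelihood of the smallest possible "tree": two aligned sequences joined by a single branch. It is an illustration only. The function names and the two short sequences are hypothetical, and the topology and branch-length priors of the full model are left out.

```python
import math


def jc69_probability(same, branch_length):
    """Jukes-Cantor transition probability over a branch.

    branch_length is in expected substitutions per site; `same` says
    whether the start and end nucleotides are identical.
    """
    decay = math.exp(-4.0 * branch_length / 3.0)
    return 0.25 + 0.75 * decay if same else 0.25 - 0.25 * decay


def pairwise_log_likelihood(seq1, seq2, branch_length):
    """Log-likelihood of two aligned DNA sequences separated by one branch."""
    total = 0.0
    for x, y in zip(seq1, seq2):
        # Stationary frequency 1/4 for the first base, then one transition.
        total += math.log(0.25 * jc69_probability(x == y, branch_length))
    return total


short = pairwise_log_likelihood("ACGTACGT", "ACGTACGA", 0.1)
long_ = pairwise_log_likelihood("ACGTACGT", "ACGTACGA", 2.0)
print(short, long_)
```

Multiplying this likelihood by the Exponential(10) prior density of the branch length would give the unnormalized posterior for the branch length in this two-sequence case.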
With a defined model we simply then have to:
• draw starting values for every random variable in the model
• define moves on each random variable that propose new values
• then for each step in our MCMC, choose a parameter and update it according to the correct proposal:
  – propose a new tree topology and accept or reject
  – propose a new model parameter value and accept or reject
• save the current state of every random variable (tree, branch lengths, base frequencies, etc.) after every k steps
• after n MCMC steps, evaluate the run for signs of non-convergence
• summarize the tree and other parameters
M C P
Complex data require complex models and careful analysis
Many people feel that genome-scale data will overcome model misspecification
Genomic datasets with many sparsely sampled lineages are prone to systematic biases from overly simple models
[Figure: animal phylogeny (ANIMALIA) with tips Choanoflagellates, Porifera, Ctenophora, Placozoa, Cnidaria, Rotifera, Platyhelminthes, Annelida, Mollusca, Nematoda, Arthropoda, Echinodermata, and Chordata; a BILATERIA label and a “?” are marked on the tree.]
These issues are part of the Porifera-sister vs. Ctenophora-sister debate over the last 8 years
In the case of the animal phylogeny, it is argued that some resolutions may be due to long-branch attraction because simple models do not account for site-specific amino acid profiles
[Figure: True Tree and Inferred Tree for Choanoflagellates, Porifera, Ctenophora, Placozoa, Cnidaria, and Bilateria, with site patterns ranked from more frequent to less frequent. Figure adapted from Nicolas Lartillot (Phyloseminar.org, 2016).]
This problem may be exacerbated by including few species from distant outgroups
[Figure: tree of Choanoflagellates, Porifera, Ctenophora, Placozoa, Cnidaria, and Bilateria, with Fungi added.]
S-H S M
Phylogenomic studies often ignore site-specific amino acid profiles by applying uniform models or locus-partitioned models
Flexible, non-parametric Bayesian models can accommodate this complex process heterogeneity
These models include both uniform and locus-specific models (they are nested)
[Figure: the same three-locus amino acid alignment shown three times, labeled Uniform effect, Locus-specific effects, and Site-specific effects:]
YMIPTSELKPGELRLLE
YMVPTQDLAPGQFRLLE
YMTPTTDLPLGHFRLLE
YMLPPLFLEPGDLRLLD

MAYPMQLGFQDATSPIM
MAHPTQLGFKDAAMPVM
MANHSQLGFQDASSPIM
MAHAAQVGLQDATSPIM

VETIWTILPAIILILIA
IEIVWTILPAVILVLIA
VELIWTILPAIVLVLLA
METVWTILPAIILVLIA
(Lartillot & Philippe, MBE, 2004; Lartillot et al., Bioinformatics, 2009)
Models that treat amino acid profiles as random effects across sites can accommodate these patterns
Implemented as the CAT family of models in Phylobayes (Lartillot et al., Bioinformatics, 2009)
Note that this is NOT the same as the CAT model for rate heterogeneity in RAxML!!
[Figure: True Tree for Choanoflagellates, Porifera, Ctenophora, Placozoa, Cnidaria, and Bilateria, with site patterns ranked from more frequent to less frequent.]
Nicolas Lartillot has also written some blog posts on this topic (as well as several other Bayesian phylogenetics concepts); see also his Phyloseminar explaining these models:
https://www.youtube.com/watch?v=2zJuuXjX978
B P
Bayesian analysis of genomic data under complex models has many challenges
• Models with many parameters require long MCMC runs to sample all parameters effectively, and this problem gets worse with more complex models
• Even complex models like CAT-GTR (in Phylobayes) are likely inadequate for many genome-scale datasets
• There is no one-size-fits-all method or model for phylogenetic data
• Thorough analysis takes expensive computational resources
• Thorough analysis takes a LONG time even with access to HPC resources
But...amazing progress on models & algorithms has been made in our field and we will overcome these obstacles in the very near future. Then, we will be faced with new, more difficult challenges.
This is why phylogenetics is fun!
Tracy A. Heath (2017 Workshop on Phylogenomics, Český Krumlov, CZ)

More Related Content

What's hot

random forests for ABC model choice and parameter estimation
random forests for ABC model choice and parameter estimationrandom forests for ABC model choice and parameter estimation
random forests for ABC model choice and parameter estimationChristian Robert
 
Statistics symposium talk, Harvard University
Statistics symposium talk, Harvard UniversityStatistics symposium talk, Harvard University
Statistics symposium talk, Harvard UniversityChristian Robert
 
ABC short course: introduction chapters
ABC short course: introduction chaptersABC short course: introduction chapters
ABC short course: introduction chaptersChristian Robert
 
Laplace's Demon: seminar #1
Laplace's Demon: seminar #1Laplace's Demon: seminar #1
Laplace's Demon: seminar #1Christian Robert
 
ABC short course: final chapters
ABC short course: final chaptersABC short course: final chapters
ABC short course: final chaptersChristian Robert
 
ABC short course: model choice chapter
ABC short course: model choice chapterABC short course: model choice chapter
ABC short course: model choice chapterChristian Robert
 
CISEA 2019: ABC consistency and convergence
CISEA 2019: ABC consistency and convergenceCISEA 2019: ABC consistency and convergence
CISEA 2019: ABC consistency and convergenceChristian Robert
 
ABC short course: survey chapter
ABC short course: survey chapterABC short course: survey chapter
ABC short course: survey chapterChristian Robert
 
ABC based on Wasserstein distances
ABC based on Wasserstein distancesABC based on Wasserstein distances
ABC based on Wasserstein distancesChristian Robert
 
Monte Carlo in Montréal 2017
Monte Carlo in Montréal 2017Monte Carlo in Montréal 2017
Monte Carlo in Montréal 2017Christian Robert
 
Can we estimate a constant?
Can we estimate a constant?Can we estimate a constant?
Can we estimate a constant?Christian Robert
 
On the vexing dilemma of hypothesis testing and the predicted demise of the B...
On the vexing dilemma of hypothesis testing and the predicted demise of the B...On the vexing dilemma of hypothesis testing and the predicted demise of the B...
On the vexing dilemma of hypothesis testing and the predicted demise of the B...Christian Robert
 
Multiple estimators for Monte Carlo approximations
Multiple estimators for Monte Carlo approximationsMultiple estimators for Monte Carlo approximations
Multiple estimators for Monte Carlo approximationsChristian Robert
 
from model uncertainty to ABC
Introduction to Bayesian Phylogenetics

  • 1. Introduction to Bayesian Phylogenetics. Tracy Heath (+ content by Paul Lewis), Ecology, Evolution, & Organismal Biology, Iowa State University. @trayc7, http://phyloworks.org. 2017 Phylogenomics Workshop, Český Krumlov, Czech Republic
  • 2. Outline. Lecture & Demo: Bayesian fundamentals (Bayes theorem, priors); break; Markov chain Monte Carlo; Bayesian phylogenetics; Site-specific models applied in phylogenomic studies; lunch; Lecture: Introduction to RevBayes; break; RevBayes tutorial 1: Inferring unrooted phylogenies; dinner; Lecture: Bayesian divergence-time estimation; break; RevBayes tutorial 2: Divergence dating w/ fossil & molecular data; done! Tracy A. Heath (2017 Workshop on Phylogenomics, Český Krumlov, CZ)
  • 3. Bayesian or Maximum Likelihood? What’s the difference? Bayesian: estimates f(θ | D); estimates a distribution; parameters are random variables; averages over nuisance parameters. Maximum Likelihood: maximizes f(D | θ); gives a point estimate; parameters are fixed/unknown; optimizes nuisance parameters. [Figure: posterior density with the maximum likelihood estimate marked; x-axis: parameter value; y-axis: density]
  • 4. J P Let’s start with joint probability and the simple example that Paul Lewis gives in his lecture on Bayesian phylogenetics Slides source: https://molevol.mbl.edu/index.php/Paul_Lewis (photo of Paul Lewis by David Hillis)
  • 5. Joint probabilities. B = Black, W = White; S = Solid, D = Dotted. Paul O. Lewis (2016 Woods Hole Molecular Evolution Workshop)
  • 6. Conditional probabilities
  • 7. Bayes’ rule
  • 8. Probability of "Dotted"
  • 9. Bayes' rule (cont.). $\Pr(B \mid D) = \frac{\Pr(B)\,\Pr(D \mid B)}{\Pr(D)} = \frac{\Pr(D, B)}{\Pr(D, B) + \Pr(D, W)}$. Pr(D) is the marginal probability of being dotted. To compute it, we marginalize over colors.
  • 10. Bayes' rule (cont.). Pr(D) serves as a normalization constant, ensuring that Pr(B|D) + Pr(W|D) = 1.0
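
The marble version of Bayes' rule can be checked with a few lines of Python. This is only a sketch: the four joint probabilities below are made-up illustration values (the slides do not give numbers), and the function name is hypothetical.

```python
# Hypothetical joint probabilities Pr(pattern, color); they must sum to 1.
# These numbers are NOT from the slides, they only illustrate the arithmetic.
joint = {
    ("D", "B"): 0.1,  # dotted and black
    ("D", "W"): 0.3,  # dotted and white
    ("S", "B"): 0.4,  # solid and black
    ("S", "W"): 0.2,  # solid and white
}


def color_given_pattern(joint, pattern):
    """Return Pr(color | pattern) for every color, using Bayes' rule."""
    # Marginal probability of the pattern: sum the joint over colors.
    marginal = sum(p for (pat, _), p in joint.items() if pat == pattern)
    # Dividing each joint probability by the marginal normalizes the result.
    return {color: p / marginal
            for (pat, color), p in joint.items() if pat == pattern}


posterior = color_given_pattern(joint, "D")
print(posterior)                 # Pr(B|D) and Pr(W|D)
print(sum(posterior.values()))   # the normalization makes these sum to 1
```

With these illustration values Pr(D) = 0.4, so Pr(B|D) is about 0.25 and Pr(W|D) about 0.75.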
  • 11. Joint probabilities (rows D, S; columns B, W). Row D: Pr(D,B), Pr(D,W). Row S: Pr(S,B), Pr(S,W).
  • 12. Marginalizing over colors. Row D: Pr(D,B), Pr(D,W). Row S: Pr(S,B), Pr(S,W). The marginal probability of being a dotted marble is the sum of all joint probabilities involving dotted marbles.
  • 13. Marginal probabilities. Row D: Pr(D,B) + Pr(D,W) = Pr(D), the marginal probability of being dotted. Row S: Pr(S,B) + Pr(S,W) = Pr(S), the marginal probability of being solid.
  • 14. Joint probabilities (rows D, S; columns B, W). Row D: Pr(D,B), Pr(D,W). Row S: Pr(S,B), Pr(S,W).
  • 15. Marginalizing over "dottedness". Row D: Pr(D,B), Pr(D,W). Row S: Pr(S,B), Pr(S,W). The marginal probability of being a white marble is the sum of the joint probabilities in the W column: Pr(W) = Pr(D,W) + Pr(S,W).
  • 16. Bayes' rule (cont.)
  • 17. Bayes' rule in Statistics. $\Pr(\theta \mid D) = \frac{\Pr(D \mid \theta)\,\Pr(\theta)}{\sum_{\theta} \Pr(D \mid \theta)\,\Pr(\theta)}$. D refers to the "observables" (i.e. the Data). θ refers to one or more "unobservables" (i.e. parameters of a model, or the model itself): tree model (i.e. tree topology); substitution model (e.g. JC, F84, GTR, etc.); parameter of a substitution model (e.g. a branch length, a base frequency, transition/transversion rate ratio, etc.); hypothesis (i.e. a special case of a model); a latent variable (e.g. ancestral state).
  • 18. B I Estimate the probability of a hypothesis (model) conditional on observed data. The probability represents the researcher’s degree of belief. Bayes’ Theorem (also called Bayes Rule) specifies the conditional probability of the hypothesis given the data. Tracy A. Heath (2017 Workshop on Phylogenomics, Český Krumlov, CZ)
  • 19. Bayes’ Theorem
  • 24. Bayes’ Theorem. The posterior probability of a discrete parameter δ conditional on the data D is $\Pr(\delta \mid D) = \frac{\Pr(D \mid \delta)\,\Pr(\delta)}{\sum_{\delta} \Pr(D \mid \delta)\,\Pr(\delta)}$, where the denominator $\sum_{\delta} \Pr(D \mid \delta)\,\Pr(\delta)$ is the likelihood marginalized over all possible values of δ.
  • 25. Bayes’ Theorem. The posterior probability density of a continuous parameter θ conditional on the data D is $f(\theta \mid D) = \frac{f(D \mid \theta)\,f(\theta)}{\int_{\theta} f(D \mid \theta)\,f(\theta)\,d\theta}$, where the denominator $\int_{\theta} f(D \mid \theta)\,f(\theta)\,d\theta$ is the likelihood marginalized over all possible values of θ.
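
For a single continuous parameter, the integral in the denominator can be approximated on a grid. The sketch below assumes a made-up example that is not in the slides: one observation x = 35 from a Normal(θ, sd = 2) likelihood and a flat prior on [30, 40]. All names are hypothetical.

```python
import math


def grid_posterior(log_likelihood, log_prior, grid):
    """Approximate the posterior density f(theta | D) on an evenly spaced grid."""
    step = grid[1] - grid[0]
    log_unnorm = [log_likelihood(t) + log_prior(t) for t in grid]
    shift = max(log_unnorm)                       # guards against underflow
    unnorm = [math.exp(v - shift) for v in log_unnorm]
    # Riemann-sum stand-in for the integral of f(D | theta) f(theta) d(theta);
    # the common factor exp(shift) cancels in the division below.
    marginal = sum(unnorm) * step
    return [u / marginal for u in unnorm]


# Hypothetical data and model (assumptions for illustration only).
x, sd = 35.0, 2.0
grid = [30.0 + 0.01 * i for i in range(1001)]     # theta from 30 to 40


def log_lik(theta):
    return -0.5 * ((x - theta) / sd) ** 2         # constants dropped, they cancel


def log_prior(theta):
    return 0.0                                    # flat prior on the grid range


density = grid_posterior(log_lik, log_prior, grid)
area = sum(density) * (grid[1] - grid[0])         # should be very close to 1
mode = grid[density.index(max(density))]          # grid point with highest density
print(area, mode)
```

A grid works only for one or two parameters; for larger models the deck turns to MCMC instead.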
  • 26. Priors. Prior distributions are an important part of Bayesian statistics. The distribution of θ before any data are collected is the prior f(θ). The prior describes your uncertainty in the parameters of your model.
  • 27. Priors. Paul Lewis gives a clear example of a prior in action... Slides source: https://molevol.mbl.edu/index.php/Paul_Lewis (photo of Paul Lewis by David Hillis)
  • 28. If you had to guess... Not knowing anything about my archery abilities, draw a curve representing your view of the chances of my arrow landing a distance d from the center of the target (if it helps, I'm standing 50 meters away from the target). [Figure: target (1 meter) above an axis for distance d running from 0.0 to ∞]
  • 29. Case 1: assume I have talent. An informative prior (low variance) that says most of my arrows will fall within 20 cm of the center (thanks for your confidence!). [Figure: prior density; x-axis: distance in centimeters from target center]
  • 30. Case 2: assume I have a talent for missing the target! Also an informative prior, but one that says most of my arrows will fall within a narrow range just outside the entire target! [Figure: prior density; x-axis: distance in cm from target center]
  • 31. Case 3: assume I have no talent. This is a vague prior: its high variance reflects nearly total ignorance of my abilities, saying that my arrows could land nearly anywhere! [Figure: prior density; x-axis: distance in cm from target center]
  • 32. A matter of scale. Notice that I haven't provided a scale for the vertical axis. What exactly does the height of this curve mean? For example, does the height of the dotted line represent the probability that my arrow lands 60 cm from the center of the target? No. [Figure: prior density with a dotted line at 60 cm; x-axis: distance from 0.0 to ∞]
  • 34. Probabilities are associated with intervals. Probabilities are attached to intervals (i.e. ranges of values), not individual values. The probability of any given point (e.g. d = 60.0) is zero! However, we can ask about the probability that d falls in a particular range, e.g. 50.0 < d < 65.0. [Figure: prior density with the interval shaded]
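
The interval idea can be made concrete by integrating a density numerically. This sketch assumes, purely for illustration, an exponential density with a mean of 40 cm; the slides do not specify the curve, and the function names are hypothetical.

```python
import math

MEAN_CM = 40.0  # hypothetical mean miss distance, not from the slides


def density(d):
    """Exponential density with mean MEAN_CM (an assumed stand-in curve)."""
    return math.exp(-d / MEAN_CM) / MEAN_CM


def integrate(f, lo, hi, n=10_000):
    """Trapezoid-rule approximation of the area under f between lo and hi."""
    h = (hi - lo) / n
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, n):
        total += f(lo + i * h)
    return total * h


# Probability that the arrow lands between 50 and 65 cm from the center.
p_interval = integrate(density, 50.0, 65.0)

# The same interval shrunk toward the single point d = 60 has almost no area,
# even though the curve's height at 60 is clearly not zero.
p_near_point = integrate(density, 60.0, 60.0 + 1e-9)
height_at_60 = density(60.0)

print(p_interval, p_near_point, height_at_60)
```

The height of the curve is a density (probability per cm), so it only becomes a probability once it is multiplied by, or integrated over, a width.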
  • 35. P: A E Let’s continue with the archery example: we may assume a gamma-prior distribution on my archery skill (distance from bullseye d) with a shape parameter α and a scale parameter β. d ∼ Gamma(α,β) f(d | α,β) = 1 Γ(α)βα dα−1e− d β 0 10 20 30 40 50 60 70 0.000.050.100.150.200.25 distance in cm from target center (d) density q q q q q q q q q q q q q q q q q q q q q qqq q qq This requires us to assign values for α and β based on our prior belief Tracy A. Heath (2017 Workshop on Phylogenomics, Český Krumlov, CZ)
  • 36. P: A E If we have some prior knowledge of the mean (m) and variance (v) of the gamma distribution, we can compute α and β. m = α β, α = m2 v v = α β2 , β = m v d ∼ Gamma(α,β) 0 10 20 30 40 50 60 70 0.000.050.100.150.200.25 distance in cm from target center (d) density q q q q q q q q q q q q q q q q q q q q q qqq q qq Tracy A. Heath (2017 Workshop on Phylogenomics, Český Krumlov, CZ)
  • 37. P: A E Let’s assume that I will consistently miss the target, this corresponds to a gamma distribution with a mean (m) of 60 and a variance (v) of 3. α = 1200 β = 20 d ∼ Gamma(α,β) 0 10 20 30 40 50 60 70 0.000.050.100.150.200.25 distance in cm from target center (d) density 1 meter q q q q q Tracy A. Heath (2017 Workshop on Phylogenomics, Český Krumlov, CZ)
  • 38. P: A E Another way of expressing d ∼ Gamma(α,β) is with a probabilistic graphical model d α β gamma distribution 0 10 20 30 40 50 60 70 0.000.050.100.150.200.25 distance in cm from target center (d) density 1 meter q q q q q This shows that our observed datum (d a single observed shot) is conditionally dependent on the shape (α) and rate (β) of the gamma distribution. Tracy A. Heath (2017 Workshop on Phylogenomics, Český Krumlov, CZ)
  • 39. P: A E We can parameterize the model using the mean (m) and variance (v), where α and β are computed using m and v. d α β gamma distribution m v β = m vα = m2 v 0 10 20 30 40 50 60 70 0.000.050.100.150.200.25 distance in cm from target center (d) density 1 meter q q q q q Sometimes it’s better to alternative parameterization. We may have more intuition about mean and variance than we have about shape and rate. Tracy A. Heath (2017 Workshop on Phylogenomics, Český Krumlov, CZ)
  • 40. P: A E If somehow we happened to know the true mean and variance of my accuracy at the archery range, we can easily calculate the likelihood of any observed shot: f(d | α,β) = 1 Γ(α)βα dα−1e− d β d α β gamma distribution m v β = m vα = m2 v f(d = 39.76 | α = 1200,β = 20) = 7.89916e − 40 Tracy A. Heath (2017 Workshop on Phylogenomics, Český Krumlov, CZ)
  • 41. RB D: A A RevBayes Fully integrative Bayesian inference of phylogenetic parameters using probabilistic graphical models and an interpreted language http://RevBayes.com d α β gamma distribution m v β = m vα = m2 v Höhna, Landis, Heath, Boussau, Lartillot, Moore, Huelsenbeck, Ronquist. 2016. RevBayes: Bayesian phylogenetic inference using graphical models and an interactive model-specification language. Systematic Biology. (doi: 10.1093/sysbio/syu021) Tracy A. Heath (2017 Workshop on Phylogenomics, Český Krumlov, CZ)
  • 42. G M  RB Graphical models provide tools for visually & computationally representing complex, parameter-rich probabilistic models We can depict the conditional dependence structure of various parameters and other random variables d α β gamma distribution m v β = m vα = m2 v Höhna, Heath, Boussau, Landis, Ronquist, Huelsenbeck. 2014. Probabilistic Graphical Model Representation in Phylogenetics. Systematic Biology. (doi: 10.1093/sysbio/syu039) Tracy A. Heath (2017 Workshop on Phylogenomics, Český Krumlov, CZ)
  • 43. RB D: M  A S The Rev language calculating the probability of 1 data observation observed_shot given a mean and variance. mean <- 60 var <- 3 alpha := (mean * mean) / var beta := mean / var observed_shot = 39.76 d ∼ dnGamma(alpha,beta) d.clamp(observed_shot) d.lnProbability() -90.0366 Tracy A. Heath (2017 Workshop on Phylogenomics, Český Krumlov, CZ)
  • 44. E: M  A S What if we do not know m and v? We can use maximum likelihood or Bayesian methods for estimating their values. Maximum likelihood methods require us to find the values of m and v that maximize f(d | m,v). Bayesian methods use prior distributions to describe our uncertainty in m and v and estimate f(m,v | d). Tracy A. Heath (2017 Workshop on Phylogenomics, Český Krumlov, CZ)
  • 45. E: H A M We must define prior distributions for m and v to account for uncertainty and estimate the posterior densities of those parameters Now x and y are the parameters of the uniform prior distribution on m and a and b are the shape and rate parameters of the gamma prior distribution on v. d α β gamma distribution m v β = m vα = m2 v x y a b uniform distribution gamma distribution Tracy A. Heath (2017 Workshop on Phylogenomics, Český Krumlov, CZ)
  • 46. E: H A M The values we choose for the parameters of these hyperprior distributions should reflect our prior knowledge. The previous observed shot was 39.76 cm, thus we may use this to parameterize our hyperpriors for analysis of future observations. m ∼ Uniform(x,y) x = 10.0 y = 50.0 E(m) = 30.0 v ∼ Gamma(a,b) a = 20.0 b = 2.0 E(v) = 10.0 d α β gamma distribution m v β = m vα = m2 v x y a b uniform distribution gamma distribution Tracy A. Heath (2017 Workshop on Phylogenomics, Český Krumlov, CZ)
  • 47. RB D: H A M The Rev language specifying a hierarchical model on shot accuracy based on 1 new observation. mean ∼ dnUnif(10,50) var ∼ dnGamma(20,2) alpha := (mean * mean) / var beta := mean / var observed_shot = 35.21 d ∼ dnGamma(alpha,beta) d.clamp(observed_shot) d.lnProbability() depends on initial value of mean & var Tracy A. Heath (2017 Workshop on Phylogenomics, Český Krumlov, CZ)
  • 48. E: H A M Now that we have a defined model, how do we estimate the posterior probability density? m ∼ Uniform(x,y) v ∼ Gamma(a,b) d ∼ Gamma(α,β) d α β gamma distribution m v β = m vα = m2 v x y a b uniform distribution gamma distribution f(m,v | d,a,b,x,y) ∝ f(d | α = m2 v ,β = m v )f(m | x,y)f(v | a,b) Tracy A. Heath (2017 Workshop on Phylogenomics, Český Krumlov, CZ)
  • 49. M C M C (MCMC) An algorithm for approximating the posterior distribution Metropolis, Rosenblusth, Rosenbluth, Teller, Teller. 1953. Equations of state calculations by fast computing machines. J. Chem. Phys. Hastings. 1970. Monte Carlo sampling methods using Markov chains and their applications. Biometrika. Tracy A. Heath (2017 Workshop on Phylogenomics, Český Krumlov, CZ)
  • 50. M C M C (MCMC) More on MCMC from Paul Lewis and his lecture on Bayesian phylogenetics Slides source: https://molevol.mbl.edu/index.php/Paul_Lewis (photo of Paul Lewis by David Hillis)
  • 51. Markov chain Monte Carlo (MCMC): For more complex problems, we might settle for a good approximation to the posterior distribution.
  • 52. MCMC robot’s rules: Uphill steps are always accepted. Slightly downhill steps are usually accepted. Drastic “off the cliff” downhill steps are almost never accepted. With these rules, it is easy to see why the robot tends to stay near the tops of hills.
  • 53. (Actual) MCMC robot rules: Uphill steps are always accepted because R > 1. Slightly downhill steps are usually accepted because R is near 1. Drastic “off the cliff” downhill steps are almost never accepted because R is near 0. Examples: currently at 1.0 m, proposed at 2.3 m, R = 2.3/1.0 = 2.3; currently at 6.2 m, proposed at 5.7 m, R = 5.7/6.2 = 0.92; currently at 6.2 m, proposed at 0.2 m, R = 0.2/6.2 = 0.03. The robot takes a step if it draws a Uniform(0,1) random deviate that is less than or equal to R.
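The robot's acceptance rule on slide 53 can be written in a few lines. This is a minimal Python sketch with hypothetical function names; it treats the "height" as the posterior density at a position and assumes a symmetric proposal, so R is just the ratio of heights.

```python
import random

def acceptance_ratio(current_height, proposed_height):
    """R = height at the proposed position / height at the current position."""
    return proposed_height / current_height

def accept_step(current_height, proposed_height, rng=random):
    """Take the step if a Uniform(0,1) deviate is less than or equal to R."""
    return rng.random() <= acceptance_ratio(current_height, proposed_height)

# The three cases from the slide: uphill, slightly downhill, off the cliff.
for current, proposed in [(1.0, 2.3), (6.2, 5.7), (6.2, 0.2)]:
    r = acceptance_ratio(current, proposed)
    print(f"{current} m -> {proposed} m: R = {r:.2f}")
```

Because the uniform deviate never exceeds 1, any uphill proposal (R > 1) is accepted without a special case.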
  • 54. Cancellation of marginal likelihood: When calculating the ratio R of posterior densities, the marginal probability of the data cancels. R = f(θ* | D) / f(θ | D) = [f(D | θ*) f(θ*) / f(D)] / [f(D | θ) f(θ) / f(D)] = [f(D | θ*) / f(D | θ)] × [f(θ*) / f(θ)], that is, posterior odds = likelihood ratio × prior odds.
  • 55. Target vs. Proposal Distributions: Pretend this proposal distribution allows good mixing. What does good mixing mean?
  • 56. Trace plots: “White noise” appearance is a sign of good mixing. I used the program Tracer to create this plot: http://tree.bio.ed.ac.uk/software/tracer/ AWTY (Are We There Yet?) is useful for investigating convergence: http://king2.scs.fsu.edu/CEBProjects/awty/awty_start.php [Figure: trace plot of default2.TXT; x-axis: State, y-axis: log(posterior)]
  • 57. Target vs. Proposal Distributions: Proposal distributions with smaller variance... Disadvantage: robot takes smaller steps, so more time is required to explore the same area. Advantage: robot seldom refuses to take proposed steps.
  • 58. If step size is too small, large-scale trends will be apparent. [Figure: trace plot of smallsteps.TXT; x-axis: State, y-axis: log(posterior)]
  • 59. Target vs. Proposal Distributions: Proposal distributions with larger variance... Disadvantage: robot often proposes a step that would take it off a cliff, and refuses to move. Advantage: robot can potentially cover a lot of ground quickly.
  • 60. Chain is spending long periods of time “stuck” in one place. A “stuck” robot is indicative of step sizes that are too large (most proposed steps would take the robot “off the cliff”). [Figure: trace plot of bigsteps2.TXT; x-axis: State, y-axis: log(posterior)]
  • 61. Metropolis-coupled Markov chain Monte Carlo (MCMCMC)
      • MCMCMC involves running several chains simultaneously
      • The cold chain is the one that counts, the rest are heated chains
      • A chain is heated by raising densities to a power less than 1.0 (values closer to 0.0 are warmer)
      Geyer, C. J. 1991. Markov chain Monte Carlo maximum likelihood for dependent data. Pages 156-163 in Computing Science and Statistics (E. Keramidas, ed.).
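The effect of heating on the acceptance ratio can be illustrated with a short Python sketch. The function name is hypothetical; the only rule taken from the slide is that a heated chain raises densities to a power less than 1.0, which means its acceptance ratio is the cold-chain ratio raised to that same power.

```python
def heated_ratio(cold_ratio, power):
    """Acceptance ratio seen by a chain whose densities are raised to `power`.

    power = 1.0 is the cold chain; values closer to 0.0 are warmer.
    """
    if not 0.0 < power <= 1.0:
        raise ValueError("power must be in (0, 1]")
    return cold_ratio ** power

# The "off the cliff" step from the robot example (6.2 m down to 0.2 m).
cold = 0.2 / 6.2
for power in (1.0, 0.5, 0.2):
    print(f"power = {power}: R = {heated_ratio(cold, power):.3f}")
```

A step that the cold chain would almost never take becomes fairly likely for a warm chain, which is what lets heated chains cross valleys in the posterior surface.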
  • 62. Heated chains act as scouts for the cold chain. The cold chain robot can easily make this jump because it is uphill. The hot chain robot can also make this jump with high probability because it is only slightly downhill. [Figure: cold and heated posterior surfaces]
  • 63. Hot chain and cold chain robots swapping places: Swapping places means both robots can cross the valley, but this is more important for the cold chain because its valley is much deeper. [Figure: cold and heated posterior surfaces]
  • 64. M C M C (MCMC) Thanks, Paul! Slides source: https://molevol.mbl.edu/index.php/Paul_Lewis See MCMCRobot, a helpful software program for learning MCMC by Paul Lewis http://www.mcmcrobot.org Tracy A. Heath (2017 Workshop on Phylogenomics, Český Krumlov, CZ)
  • 65. RB D: H A M The Rev language specifying the MCMC sampler for the hierarchical model on archery accuracy. ... # model specification from previous demo mymodel = model(beta) moves[1] = mvSlide(mean,delta=1.0,tune=true,weight=3.0) moves[2] = mvScale(var,lambda=1.0,tune=true,weight=3.0) monitors[1] = mnModel(file="archery_mcmc_1.log",printgen=10, ...) monitors[2] = mnScreen(printgen=1000, mean, var) mymcmc = mcmc(mymodel, monitors, moves,nruns=1) mymcmc.burnin(generations=10000,tuningInterval=1000) mymcmc.run(generations=40000,underPrior=false) MCMC screen output Tracy A. Heath (2017 Workshop on Phylogenomics, Český Krumlov, CZ)
  • 66. RB D: H A M Summary of the MCMC sample for the mean distance from target center. Tracy A. Heath (2017 Workshop on Phylogenomics, Český Krumlov, CZ)
  • 67. RB D: H A M The trace-plot of the MCMC samples for the mean distance from target center Tracy A. Heath (2017 Workshop on Phylogenomics, Český Krumlov, CZ)
  • 68. E: H A M Under this model, we do a good job of estimating the mean, but when judging archery skill, precision (variance) is as (if not more) important than accuracy q q qq q q q q q qqq q d α β gamma distribution m v β = m vα = m2 v x y a b uniform distribution gamma distribution Thus, it is also worth evaluating the estimated posterior distribution for the variance component of our model Tracy A. Heath (2017 Workshop on Phylogenomics, Český Krumlov, CZ)
  • 69. E: V The posterior estimate of the variance (v) is quite different from the true value (4.0) and from the highest likelihood value found by our MCMC (MLE 3.51374). 0 5 10 15 0.000.050.100.150.200.25 variance (v) density MLE true posterior prior This indicates that the prior is having a strong influence on the posterior. Why do you think that is? Tracy A. Heath (2017 Workshop on Phylogenomics, Český Krumlov, CZ)
  • 70. E: V When the prior closely matches the posterior, it can indicate that the data are not very informative for this parameter. 0 5 10 15 0.000.050.100.150.200.25 variance (v) density MLE true posterior prior Remember that our data were only 6 observed shots. What would happen if I had 600 arrows? Tracy A. Heath (2017 Workshop on Phylogenomics, Český Krumlov, CZ)
  • 71. E: W   M D With 100X more observations, we can estimate the mean and variance with greater precision. 0 5 10 15 0.00.51.01.52.0 variance (v) density MLE true posterior prior Marginal Posterior of the Variance (v) 33.0 33.5 34.0 34.5 35.0 35.5 36.0 0123456 mean (m) density posterior prior Marginal Posterior of the Mean (m) Tracy A. Heath (2017 Workshop on Phylogenomics, Český Krumlov, CZ)
  • 72. B P How is this applied to phylogenetic inference? Jukes-Cantor (1969) on an unrooted tree seq PhyloCTMC Q JCTree bli ⌧ Exponential Uniform 10 N i 2 2N 3 for (i in 1:n_branches) { bl[i] ⇠ dnExponential(10.0) } topology ⇠ dnUniformTopology(taxa) psi := treeAssembly(topology, bl) Q <- fnJC(4) seq ⇠ dnPhyloCTMC( tree=psi, Q=Q, type="DNA" ) seq.clamp( data ) (image source RevBayes Substitution Models Tutorial) We can assemble a phylogenetic model in the same way, using previously described models and probability distributions as priors. Tracy A. Heath (2017 Workshop on Phylogenomics, Český Krumlov, CZ)
  • 73. B P With a defined model we simply then have to: • draw starting values for every random variable in the model • define moves on each random variable that propose new values • then for each step in our MCMC, choose a parameter and update it according to the correct proposal. • propose a new tree topology and accept or reject • propose a new model parameter value and accept or reject • save the current state of every random variable (tree, branch lengths, base frequencies, etc.) after every k number of states • after n MCMC steps, evaluate the run for signs of non-convergence • summarize the tree and other parameters Tracy A. Heath (2017 Workshop on Phylogenomics, Český Krumlov, CZ)
  • 74. M C P Complex data require complex models and careful analysis Many people feel that genome-scale data will overcome model misspecification Genomic datasets with many sparsely sampled lineages are prone to systematic biases from overly simple models Choanoflagellates Porifera Ctenophora Placazoa Cnidaria Rotifera Platyhelminthes Annelida Mollusca Nematoda Arthropoda Echinodermata Chordata ANIMALIA BILATERIA ? These issues are part of the Porifera-sister vs. Ctenophora- sister debate over the last 8 years Tracy A. Heath (2017 Workshop on Phylogenomics, Český Krumlov, CZ)
  • 75. M C P In the case of the animal phylogeny, it is argued that some resolutions may be due to long-branch attraction because simple models do not account for site-specific amino acid profiles Choanoflagellates Porifera Ctenophora Placazoa Cnidaria Bilateria Site Patterns more frequent less frequent True Tree Choanoflagellates Porifera Ctenophora Placazoa Cnidaria Bilateria Inferred Tree figure adapted from Nicolas Lartillot (Phyloseminar.org, 2016) Tracy A. Heath (2017 Workshop on Phylogenomics, Český Krumlov, CZ)
  • 76. M C P This problem may be exacerbated by including few species from distant outgroups Choanoflagellates Porifera Ctenophora Placazoa Cnidaria Bilateria Fungi Tracy A. Heath (2017 Workshop on Phylogenomics, Český Krumlov, CZ)
  • 77. S-H S M Phylogenomic studies often ignore site-specific amino acid profiles by applying uniform models or locus-partitioned models Flexible, non-parametric Bayesian models can accommodate this complex process heterogeneity These models include both uniform and locus-specific models (they are nested) YMIPTSELKPGELRLLE YMVPTQDLAPGQFRLLE YMTPTTDLPLGHFRLLE YMLPPLFLEPGDLRLLD MAYPMQLGFQDATSPIM MAHPTQLGFKDAAMPVM MANHSQLGFQDASSPIM MAHAAQVGLQDATSPIM VETIWTILPAIILILIA IEIVWTILPAVILVLIA VELIWTILPAIVLVLLA METVWTILPAIILVLIA Uniform effect YMIPTSELKPGELRLLE YMVPTQDLAPGQFRLLE YMTPTTDLPLGHFRLLE YMLPPLFLEPGDLRLLD MAYPMQLGFQDATSPIM MAHPTQLGFKDAAMPVM MANHSQLGFQDASSPIM MAHAAQVGLQDATSPIM VETIWTILPAIILILIA IEIVWTILPAVILVLIA VELIWTILPAIVLVLLA METVWTILPAIILVLIA Locus-specific effects YMIPTSELKPGELRLLE YMVPTQDLAPGQFRLLE YMTPTTDLPLGHFRLLE YMLPPLFLEPGDLRLLD MAYPMQLGFQDATSPIM MAHPTQLGFKDAAMPVM MANHSQLGFQDASSPIM MAHAAQVGLQDATSPIM VETIWTILPAIILILIA IEIVWTILPAVILVLIA VELIWTILPAIVLVLLA METVWTILPAIILVLIA Site-specific effects (Lartillot & Phillippe, MBE, 2004; Lartillot et al., Bioinformatics, 2009) Tracy A. Heath (2017 Workshop on Phylogenomics, Český Krumlov, CZ)
  • 78. S-H S M Models that treat amino acid profiles as random effects across sites can accommodate these patterns Implemented as the CAT family of models in Phylobayes (Lartillot et al., Bioinformatics, 2009) Note that this is NOT the same as the CAT model for rate heterogeneity in RAXML!! Choanoflagellates Porifera Ctenophora Placazoa Cnidaria Bilateria Site Patterns more frequent less frequent True Tree Nicolas Lartillot has also written some blog posts on this topic (as well as several other Bayesian phylogenetics concepts) and see his Phyloseminar explaining these models: https://www.youtube.com/watch?v 2zJuuXjX978 Tracy A. Heath (2017 Workshop on Phylogenomics, Český Krumlov, CZ)
  • 79. B P Bayesian analysis of genomic data under complex models has many challenges • Many parameters require long MCMC runs to effectively sample all parameters, this problem gets worse with more complex models • Even complex models like CAT-GTR (in Phylobayes) are likely inadequate for many genome-scale datasets • There is no one-size-fits-all method or model for phylogenetic data • Thorough analysis takes expensive computational resources • Thorough analysis takes a LONG time even with access to HPC resources But...amazing progress on models & algorithms has been made in our field and we will overcome these obstacles in the very near future. Then, we will be faced with new, more difficult challenges. This is why phylogenetics is fun! Tracy A. Heath (2017 Workshop on Phylogenomics, Český Krumlov, CZ)