Bayesian Divergence Time Estimation
Tracy Heath
Ecology, Evolution, & Organismal Biology
Iowa State University
@trayc7
http://phyloworks.org
2016 Workshop on Molecular Evolution
Woods Hole, MA USA
Outline
Overview of divergence time estimation
• Relaxed clock models – accounting for variation in
substitution rates among lineages
• Tree models – lineage diversification and sampling
break
BEAST v2 Tutorial — Divergence-time estimation under
birth-death processes
• Dating Bear Divergence Times with the Fossilized
Birth-Death Process
dinner
A Time-Scale for Evolution
Phylogenies with branch lengths proportional to time provide
more information about evolutionary history than unrooted
trees with branch lengths in units of substitutions/site.
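[Figure: an unrooted tree with a scale bar of 0.03 substitutions/site beside a time-calibrated tree; x-axis: Time (My), 20 to 0, spanning the Oligocene, Miocene, Pliocene, and Pleistocene]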
(silhouette images from http://phylopic.org)
A Time-Scale for Evolution
Phylogenetic divergence-time
estimation
• What was the spatial and
climatic environment of ancient
angiosperms?
• How has mammalian body size
changed over time?
• How has the infection rate of
HCV in Egypt changed over
time?
• Is diversification in Caribbean
anoles correlated with ecological
opportunity?
• How has the rate of molecular
evolution changed across the
Tree of Life?
Epidemiology
(Antonelli & Sanmartin. Syst. Biol. 2011)
(Lartillot & Delsuc. Evolution 2012)
(Stadler et al. PNAS 2013)
(Mahler, Revell, Glor, & Losos. Evolution 2010)
(Nabholz, Glemin, Galtier. MBE 2008)
Historical biogeography Molecular evolution
Trait evolution
Diversification
Anolis fowleri (image by L. Mahler)
Understanding Evolutionary Processes
Divergence Time Estimation
Goal: Estimate the ages of interior nodes to understand the
timing and rates of evolutionary processes
Model how rates are
distributed across the tree
Describe the distribution of
speciation events over time
External calibration
information for estimates of
absolute node times
calibrated node
[Figure: time-calibrated phylogeny of mammals, from Equus to Cheirogaleus, with the clades Simiiformes and Microcebus labeled; x-axis: Time (millions of years), 100 to 0, spanning the Cretaceous, Paleogene, Neogene, and Quaternary]
The Global Molecular Clock
Assume that the rate of evolutionary change is constant over time (branch lengths equal percent sequence divergence)
[Figure: tree of taxa A, B, and C with branch lengths labeled 20%, 10%, 10%, and 10%, and node ages labeled 200 My and 400 My]
(Based on slides by Jeff Thorne; http://statgen.ncsu.edu/thorne/compmolevo.html)
The Global Molecular Clock
We can date the tree if we know the rate of change is 1% divergence per 10 My
[Figure: tree of taxa A, B, and C with branch lengths labeled 20%, 10%, 10%, and 10%, and node ages labeled 200 My and 400 My]
The Global Molecular Clock
If we found a fossil of the MRCA of B and C, we can use it to calculate the rate of change & date the root of the tree
[Figure: tree of taxa A, B, and C with branch lengths labeled 20%, 10%, 10%, and 10%, and node ages labeled 200 My and 400 My]
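The arithmetic on these slides can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tip branches to B and C are read as 10% each, the branch to A as 20%, and the internal branch as 10%, and "1% divergence per 10 My" is read as pairwise divergence between two tips. The function names are hypothetical.

```python
def clock_rate(pairwise_divergence_pct, calibration_age_my):
    """Percent pairwise divergence accumulated per My under a strict clock."""
    return pairwise_divergence_pct / calibration_age_my


def node_age(pairwise_divergence_pct, rate_pct_per_my):
    """Age (My) of the MRCA of two tips, given their pairwise divergence."""
    return pairwise_divergence_pct / rate_pct_per_my


# Assumed branch lengths (percent divergence), read from the figure labels.
branch_a = 20.0
branch_b = 10.0
branch_c = 10.0
internal_bc = 10.0

# A fossil places the MRCA of B and C at 200 My.
bc_divergence = branch_b + branch_c            # path from B to C
rate = clock_rate(bc_divergence, 200.0)        # percent per My

# The path from A to B crosses the root.
root_divergence = branch_a + internal_bc + branch_b
root_age = node_age(root_divergence, rate)

print(f"rate = {rate:.2f}% per My, root age = {root_age:.0f} My")
```

Under these assumed branch lengths the calibrated rate matches the 1% per 10 My used on the previous slide, and the root age agrees with the 400 My label in the figure.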
Rejecting the Global Molecular Clock
Rates of evolution vary across lineages and over time
Mutation rate:
Variation in
• metabolic rate
• generation time
• DNA repair
Fixation rate:
Variation in
• strength and targets of
selection
• population sizes
[Figure: tree of taxa A, B, and C with branch lengths labeled 20%, 10%, 10%, and 10%, and node ages labeled 200 My and 400 My]
Unconstrained Analysis
Sequence data provide information about branch lengths, in units of the expected number of substitutions per site
branch length = rate × time
[Figure: sequence data and phylogenetic relationships, with a branch labeled 0.2 expected substitutions/site]
Rate and Time
The sequence data provide information about branch length: for any possible rate, there's a time that fits the branch length perfectly
[Figure: branch rate (y-axis, 0 to 5) plotted against branch time (x-axis, 0 to 5); the marked point has time = 0.8 and rate = 0.625, giving branch length = 0.5]
Methods for dating species divergences estimate the substitution rate and time separately
(based on Thorne & Kishino, 2005)
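A small Python sketch makes the confounding of rate and time concrete. It takes the branch length of 0.5 from the figure and lists several (rate, time) pairs that all reproduce it exactly; the list of candidate times is an arbitrary choice made for illustration.

```python
import math


def branch_length(rate, time):
    """Expected substitutions per site along a branch."""
    return rate * time


target_length = 0.5
candidate_times = [0.25, 0.5, 0.8, 1.0, 2.0, 4.0]  # arbitrary illustrative values

# For every candidate time there is a rate that fits the branch length.
pairs = [(target_length / t, t) for t in candidate_times]

for rate, time in pairs:
    fitted = branch_length(rate, time)
    print(f"rate = {rate:.3f}, time = {time:.2f}, branch length = {fitted:.3f}")
```

Because every pair fits equally well, the sequence data alone cannot separate rate from time; the priors on rates and node ages introduced next supply the additional structure.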
Bayesian Divergence Time Estimation
[Figure: two copies of the tree, one with branch length = rate and one with branch length = time]
$R = (r_1, r_2, r_3, \ldots, r_{2N-2})$
$A = (a_1, a_2, a_3, \ldots, a_{N-1})$
N = number of tips
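Bayesian Divergence Time Estimation
Posterior probability:
$$f(R, A, \theta_R, \theta_A, \theta_s \mid D, \Psi)$$
R: vector of rates on branches
A: vector of internal node ages
$\theta_R, \theta_A, \theta_s$: model parameters
D: sequence data
$\Psi$: tree topology
$$f(R, A, \theta_R, \theta_A, \theta_s \mid D) = \frac{f(D \mid R, A, \theta_s)\, f(R \mid \theta_R)\, f(A \mid \theta_A)\, f(\theta_s)}{f(D)}$$
$f(D \mid R, A, \theta_s)$: likelihood
$f(R \mid \theta_R)$: prior on rates
$f(A \mid \theta_A)$: prior on node ages
$f(\theta_s)$: prior on substitution parameters
$f(D)$: marginal probability of the data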
Bayesian Divergence Time Estimation
Estimating divergence times relies on 2 main elements:
• Branch-specific rates: $f(R \mid \theta_R)$
• Node ages & topology: $f(A \mid \theta_A)$
[Figure: model graph linking DNA Data to the Rate Matrix, Site Rates, Branch Rates, and Tree]
Modeling Rate Variation
Some models describing lineage-specific substitution rate
variation:
• Global molecular clock (Zuckerkandl & Pauling, 1962)
• Local molecular clocks (Hasegawa, Kishino & Yano 1989;
Kishino & Hasegawa 1990; Yoder & Yang 2000; Yang & Yoder
2003, Drummond and Suchard 2010)
• Punctuated rate change model (Huelsenbeck, Larget and
Swofford 2000)
• Log-normally distributed autocorrelated rates (Thorne,
Kishino & Painter 1998; Kishino, Thorne & Bruno 2001; Thorne &
Kishino 2002)
• Uncorrelated/independent rates models (Drummond et al.
2006; Rannala & Yang 2007; Lepage et al. 2007)
• Mixture models on branch rates (Heath, Holder, Huelsenbeck
2012)
Models of Lineage-specific Rate Variation
Global Molecular Clock
The substitution rate is
constant over time
All lineages share the same
rate
branch length = substitution rate
low high
Models of Lineage-specific Rate Variation (Zuckerkandl & Pauling, 1962)
Relaxed-Clock Models
To accommodate variation in substitution rates
‘relaxed-clock’ models estimate lineage-specific substitution
rates
• Local molecular clocks
• Punctuated rate change model
• Log-normally distributed autocorrelated rates
• Uncorrelated/independent rates models
• Mixture models on branch rates
Local Molecular Clocks
Rate shifts occur
infrequently over the tree
Closely related lineages
have equivalent rates
(clustered by sub-clades)
low high
branch length = substitution rate
Models of Lineage-specific Rate Variation (Yang & Yoder 2003, Drummond and Suchard 2010)
Local Molecular Clocks
Most methods for
estimating local clocks
required specifying the
number and locations of
rate changes a priori
Drummond and Suchard
(2010) introduced a
Bayesian method that
samples over a broad range
of possible random local
clocks
low high
branch length = substitution rate
Models of Lineage-specific Rate Variation (Yang & Yoder 2003, Drummond and Suchard 2010)
Autocorrelated Rates
Substitution rates evolve
gradually over time –
closely related lineages have
similar rates
The rate at a node is
drawn from a lognormal
distribution with a mean
equal to the parent rate
(geometric Brownian motion)
low high
branch length = substitution rate
Models of Lineage-specific Rate Variation (Thorne, Kishino & Painter 1998; Kishino, Thorne & Bruno 2001)
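The autocorrelated model can be sketched with Python's standard library. This is a minimal illustration, not the cited authors' implementation: it draws a child rate from a lognormal distribution whose mean equals the parent rate, and it treats the spread `sigma` as a fixed, hypothetical value (in the cited models the variance also depends on the time separating the nodes).

```python
import math
import random


def draw_child_rate(parent_rate, sigma, rng):
    """Draw a rate from a lognormal whose mean equals the parent rate.

    A lognormal with log-scale parameters (mu, sigma) has mean
    exp(mu + sigma**2 / 2), so mu is shifted to make that equal parent_rate.
    """
    mu = math.log(parent_rate) - 0.5 * sigma ** 2
    return rng.lognormvariate(mu, sigma)


rng = random.Random(1)
parent_rate = 0.01   # hypothetical rate, substitutions/site/My
sigma = 0.3          # hypothetical spread on the log scale

draws = [draw_child_rate(parent_rate, sigma, rng) for _ in range(20000)]
mean_child = sum(draws) / len(draws)
print(f"parent rate = {parent_rate}, mean of child draws = {mean_child:.5f}")
```

Applying the draw recursively from the root toward the tips gives descendant rates that stay close to their ancestors' rates, which is the sense in which closely related lineages have similar rates.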
Punctuated Rate Change
Rate changes occur along lineages according to a point process
At rate-change events, the new rate is a product of the parent's rate and a gamma-distributed multiplier
low high
branch length = substitution rate
Models of Lineage-specific Rate Variation (Huelsenbeck, Larget and Swofford 2000)
Independent/Uncorrelated Rates
Lineage-specific rates are
uncorrelated when the rate
assigned to each branch is
independently drawn from
an underlying distribution
low high
branch length = substitution rate
Models of Lineage-specific Rate Variation (Drummond et al. 2006; Rannala & Yang 2007; Lepage et al. 2007)
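As a sketch of the uncorrelated model, the Python below draws one rate per branch independently from a single underlying distribution. The lognormal is used here as one common choice, and the function name, mean rate, and spread are hypothetical values chosen for illustration.

```python
import math
import random


def draw_uncorrelated_rates(n_tips, mean_rate, sigma, rng):
    """Draw an independent lognormal rate for each of the 2N-2 branches."""
    n_branches = 2 * n_tips - 2
    # Shift mu so the lognormal has the requested mean.
    mu = math.log(mean_rate) - 0.5 * sigma ** 2
    return [rng.lognormvariate(mu, sigma) for _ in range(n_branches)]


rng = random.Random(7)
rates = draw_uncorrelated_rates(n_tips=10, mean_rate=0.01, sigma=0.5, rng=rng)
print(f"{len(rates)} branch rates, min = {min(rates):.4f}, max = {max(rates):.4f}")
```

Unlike the autocorrelated model, a branch's rate here carries no information about the rate of its parent or its sister branch.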
Infinite Mixture Model
Dirichlet process prior:
Branches are partitioned
into distinct rate categories
The number of rate
categories and assignment
of branches to categories
are random variables under
the model
branch length = substitution rate
[Figure: branches assigned to substitution rate classes c1 to c5]
Models of Lineage-specific Rate Variation (Heath, Holder, Huelsenbeck. 2012 MBE)
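One way to see how a Dirichlet process prior makes the number of rate categories random is the Chinese restaurant process, sketched below in Python. This is an illustration of the prior only, not the cited authors' sampler; the concentration parameter `alpha` and the function name are assumptions introduced here.

```python
import random


def crp_assign(n_branches, alpha, rng):
    """Assign branches to rate categories with a Chinese restaurant process.

    Each branch joins an existing category with probability proportional to
    that category's size, or opens a new one with probability proportional
    to alpha.
    """
    assignments = []
    counts = []  # counts[k] = number of branches already in category k
    for _ in range(n_branches):
        weights = counts + [alpha]      # last entry stands for a new category
        u = rng.random() * sum(weights)
        cumulative = 0.0
        chosen = len(counts)            # default: new category
        for k, w in enumerate(weights):
            cumulative += w
            if u < cumulative:
                chosen = k
                break
        if chosen == len(counts):
            counts.append(1)
        else:
            counts[chosen] += 1
        assignments.append(chosen)
    return assignments, counts


rng = random.Random(42)
assignments, counts = crp_assign(n_branches=18, alpha=1.0, rng=rng)
print(f"{len(counts)} rate categories with sizes {counts}")
```

Each category would then receive its own rate drawn from a base distribution; larger values of `alpha` favor more categories.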
Modeling Rate Variation
These are only a subset of the available models for
branch-rate variation
• Global molecular clock
• Local molecular clocks
• Punctuated rate change model
• Log-normally distributed autocorrelated rates
• Uncorrelated/independent rates models
• Dirichlet process prior
Models of Lineage-specific Rate Variation
Modeling Rate Variation
Are our models appropriate across all data sets?
[Figure: time-calibrated phylogeny of extant and extinct bears (cave bear, American black bear, sloth bear, Asian black bear, brown bear, polar bear, American giant short-faced bear, giant panda, sun bear, spectacled bear) and harbor seal; nodes t1 to t10 and tx are labeled with mean ages (Ma), 95% CIs, and support values (MP, MLu, MLp, Bayesian); time axis in Ma from the Eocene to the Holocene, annotated with the global expansion of C4 biomass, a major temperature drop and increasing seasonality, and faunal turnover]
Krause et al., 2008. Mitochondrial genomes reveal an
explosive radiation of extinct and extant bears near the
Miocene-Pliocene boundary. BMC Evol. Biol. 8.
[Figure: time-calibrated phylogeny of ray-finned fish lineages, from Polypteriformes to Percomorpha, with Teleostei, Ostariophysi, and Acanthomorpha labeled; Taxa on a log scale (1 to 20000) and time in MYA (300 to 0)]
Clade   r       ε        ΔAIC
1.      0.041   0.0017   25.3
2.      0.081   *        25.5
3.      0.067   0.37     45.1
4.      0       *        3.1
Bg.     0.011   0.0011
Santini et al., 2009. Did genome duplication drive the origin
of teleosts? A comparative study of diversification in
ray-finned fishes. BMC Evol. Biol. 9.
Modeling Rate Variation
These are only a subset of the available models for
branch-rate variation
• Global molecular clock
• Local molecular clocks
• Punctuated rate change model
• Log-normally distributed autocorrelated rates
• Uncorrelated/independent rates models
• Dirichlet process prior
Considering model selection, uncertainty, & plausibility is
very important for Bayesian divergence time analysis
Models of Lineage-specific Rate Variation
Bayesian Divergence Time Estimation
Estimating divergence times relies on 2 main elements:
• Branch-specific rates: $f(R \mid \theta_R)$
• Node ages: $f(A \mid \theta_A)$
[Figure: model graph linking DNA Data to the Rate Matrix, Site Rates, Branch Rates, and Tree]
Priors on the Tree and Node Ages
Relaxed clock Bayesian analyses require a prior distribution on time trees
Different node-age priors make different assumptions about the timing of divergence events
Tree Priors
Stochastic Branching Processes
Node-age priors based on stochastic models of lineage diversification
Yule process: assumes a constant rate of speciation across lineages
A pure birth process—every
node leaves extant
descendants (no extinction)
Tree Priors
Stochastic Branching Processes
Node-age priors based on stochastic models of lineage diversification
Constant-rate birth-death process: at any point in time a lineage can speciate at rate λ or go extinct with a rate of µ
Tree Priors
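The constant-rate birth-death process can be illustrated by simulating how many lineages survive after a fixed amount of time. The Python below is a sketch under stated assumptions: the function name, rate values, and duration are hypothetical, and it tracks only lineage counts rather than building a tree. Setting the extinction rate to zero gives the Yule (pure birth) process.

```python
import random


def simulate_lineage_count(birth, death, duration, rng, n_start=1):
    """Simulate the number of lineages alive after `duration` time units.

    Each lineage speciates at rate `birth` and goes extinct at rate `death`.
    Requires birth > 0.
    """
    t = 0.0
    n = n_start
    while n > 0:
        total_rate = n * (birth + death)
        t += rng.expovariate(total_rate)   # waiting time to the next event
        if t > duration:
            break
        if rng.random() < birth / (birth + death):
            n += 1                          # speciation
        else:
            n -= 1                          # extinction
    return n


rng = random.Random(3)
birth, death, duration = 1.0, 0.5, 2.0      # hypothetical values
counts = [simulate_lineage_count(birth, death, duration, rng) for _ in range(5000)]
mean_count = sum(counts) / len(counts)
extinct = sum(1 for c in counts if c == 0) / len(counts)
print(f"mean lineages = {mean_count:.2f}, fraction extinct = {extinct:.2f}")
```

The expected number of lineages grows as exp((birth - death) × duration), so the simulated mean should be near e for these values, while individual replicates vary widely and some go extinct.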
Stochastic Branching Processes
Different values of λ and µ lead to different trees
Bayesian inference under these models can be very sensitive to the values of these parameters
Using hyperpriors on λ and µ (or d and r) accounts for uncertainty in these hyperparameters
Tree Priors
Stochastic Branching Processes
Node-age priors based on stochastic models of lineage
diversification
Birth-death-sampling
process: an extension of
the constant-rate birth-death
model that accounts for
random sampling of tips
Conditions on a probability
of sampling a tip, ρ
Tree Priors
Priors on Node Times
Sequence data are only informative on relative rates & times
Node-time priors cannot give precise estimates of absolute
node ages
We need external information (like fossils) to provide
absolute time scale
Node Age Priors
Calibrating Divergence Times
Fossils (or other data) are necessary to estimate absolute
node ages
There is no information in
the sequence data for
absolute time
Uncertainty in the
placement of fossils
[Figure: tree of taxa A, B, and C with branch lengths labeled 20%, 10%, 10%, and 10%, and node ages labeled 200 My and 400 My]
Calibration Densities
Bayesian inference is well suited to accommodating uncertainty in the age of the calibration node
Divergence times are calibrated by placing parametric densities on internal nodes, offset by age estimates from the fossil record
[Figure: tree of taxa A, B, and C with a calibration density (Density against Age) placed on the node at 200 My]
Assigning Fossils to Clades
Misplaced fossils can affect node age estimates throughout
the tree – if the fossil is older than its presumed MRCA
Calibrating the Tree (figure from Benton & Donoghue Mol. Biol. Evol. 2007)
Fossil Calibration
Age estimates from fossils
can provide minimum time
constraints for internal
nodes
Reliable maximum bounds
are typically unavailable
Minimum age
Time (My)
Calibrating Divergence Times
Prior Densities on Calibrated Nodes
Common practice in Bayesian divergence-time estimation:
Parametric distributions are typically offset by the age of the oldest fossil assigned to a clade
These prior densities do not (necessarily) require specification of maximum bounds
Uniform (min, max)
Exponential (λ)
Gamma (α, β)
Log Normal (µ, σ²)
[Figure: the four densities offset by the minimum age; x-axis: Time (My)]
Prior Densities on Calibrated Nodes
Describe the waiting time between the divergence event and the age of the oldest fossil
Overly informative priors can bias node age estimates to be too young
Uncertainty in the age of the MRCA of the clade relative to the age of the fossil may be better captured by vague prior densities
[Figure: Exponential (λ) calibration density offset by the minimum age; x-axis: Time (My)]
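The difference between an informative and a vague calibration density can be quantified with a short Python sketch. It assumes an exponential density offset by the fossil age, and compares the upper 97.5% quantile of the node age for two hypothetical values of the exponential rate; the fossil age of 50 My is likewise made up for illustration. Note that this λ is the rate of the exponential density, not the speciation rate.

```python
import math


def offset_exponential_logpdf(node_age, fossil_age, rate):
    """Log density of a node age under an exponential offset by the fossil age."""
    if node_age < fossil_age:
        return float("-inf")   # the node cannot be younger than its fossil
    return math.log(rate) - rate * (node_age - fossil_age)


def offset_exponential_quantile(p, fossil_age, rate):
    """Node age below which a fraction p of the prior mass lies."""
    return fossil_age - math.log(1.0 - p) / rate


fossil_age = 50.0                       # hypothetical minimum age (My)
informative = offset_exponential_quantile(0.975, fossil_age, rate=1.0)
vague = offset_exponential_quantile(0.975, fossil_age, rate=0.05)

print(f"rate 1.0:  97.5% of prior mass is younger than {informative:.1f} My")
print(f"rate 0.05: 97.5% of prior mass is younger than {vague:.1f} My")
```

With the larger rate almost all prior mass sits within a few million years of the fossil, which is how an overly informative density can pull node age estimates toward ages that are too young.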
Prior Densities on Calibrated Nodes
Common practice in Bayesian divergence-time estimation:
Estimates of absolute node ages are driven primarily by the calibration density
Specifying appropriate densities is a challenge for most molecular biologists
Uniform (min, max)
Exponential (λ)
Gamma (α, β)
Log Normal (µ, σ²)
[Figure: the four densities offset by the minimum age; x-axis: Time (My)]
Calibration Density Approach
Improving Fossil Calibration
We would prefer to
eliminate the need for
ad hoc calibration
prior densities
Calibration densities
do not account for
diversification of fossils
Domestic dog
Spotted seal
Giant panda
Spectacled bear
Sun bear
Am. black bear
Asian black bear
Brown bear
Polar bear
Sloth bear
Zaragocyon daamsi
Ballusia elmensis
Ursavus brevirhinus
Ailurarctos lufengensis
Ursavus primaevus
Agriarctos spp.
Kretzoiarctos beatrix
Indarctos vireti
Indarctos arctoides
Indarctos punjabiensis
Giant short-faced bear
Cave bear
Fossil and Extant Bears (Krause et al. BMC Evol. Biol. 2008; Abella et al. PLoS ONE 2012)
Improving Fossil Calibration
We want to use all
of the available fossils
Example: Bears
12 fossils are reduced
to 4 calibration ages
with calibration density
methods
Improving Fossil Calibration
Because fossils are
part of the
diversification process,
we can combine fossil
calibration with
birth-death models
Improving Fossil Calibration
This relies on a
branching model that
accounts for
speciation, extinction,
and rates of
fossilization,
preservation, and
recovery
Paleontology & Neontology
“Except during the interlude of the [Modern] Synthesis,
there has been limited communication historically among the
disciplines of evolutionary biology, particularly between
students of evolutionary history (paleontologists and
systematists) and those of molecular, population, and
organismal biology. There has been increasing realization
that barriers between these subfields must be overcome if a
complete theory of evolution and systematics is to be
forged.”
Reaka-Kudla & Colwell: in Dudley (ed.), The Unity of Evolutionary Biology: Proceedings of the Fourth
International Congress of Systematic & Evolutionary Biology, Dioscorides Press, Portland, OR, p. 16. (1994)
Two Separate Fields, Same Goals
Combining Fossil & Extant Data
Statistical methods provide a way to integrate
paleontological & neontological data
Two Separate Fields, Same Goals
Combining Fossil & Extant Data
Combine models for sequence evolution, morphological
change, & fossil recovery to jointly estimate the tree
topology, divergence times, & lineage diversification rates
DNA Data
Substitution Model
Site Rate Model
Branch Rate Model
Morphological Data
Substitution Model
Site Rate Model
Branch Rate Model
Time Tree Model
Fossil Occurrence Time Data
Combining Fossil & Extant Data
Until recently, analyses combining fossil & extant taxa used
simple or inappropriate models to describe the tree and fossil
ages
Modeling the Tree & Occurrence Times
Stadler (2010) introduced a generating model for a serially
sampled time tree — this is the fossilized birth-death process.
(Stadler. Journal of Theoretical Biology 2010)
Parameters of the FBD
This graph shows the conditional dependence structure of the FBD model, which is a generating process for a sampled, dated time tree and fossil occurrences
[Figure: graphical model with nodes for the time tree (T) and the fossil occurrence times (F), depending on the origin time (x0), speciation rate (λ), extinction rate (µ), fossil recovery rate (ψ), and sampling probability (ρ)]
Parameters of the FBD
We re-parameterize the model so that we are directly estimating the diversification rate (d), turnover (r), and fossil sampling proportion (s)
$$\lambda = \frac{d}{1-r}, \qquad \mu = \frac{rd}{1-r}, \qquad \psi = \frac{s}{1-s}\,\frac{rd}{1-r}$$
[Figure: graphical model with nodes for the time tree (T) and the fossil occurrence times (F), depending on the origin time (x0) and sampling probability (ρ), with the speciation rate (λ), extinction rate (µ), and fossil recovery rate (ψ) determined by d, r, and s]
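The re-parameterization can be checked numerically with a short Python sketch. The forward map follows the three expressions on the slide; the inverse map (d = λ − µ, r = µ/λ, s = ψ/(µ + ψ)) is obtained here by solving those expressions, and the function names and example values are hypothetical.

```python
import math


def to_rates(d, r, s):
    """Map (diversification, turnover, sampling proportion) to (lambda, mu, psi)."""
    lam = d / (1.0 - r)
    mu = r * d / (1.0 - r)
    psi = s / (1.0 - s) * mu
    return lam, mu, psi


def to_drs(lam, mu, psi):
    """Inverse map, derived by solving the three expressions above."""
    d = lam - mu
    r = mu / lam
    s = psi / (mu + psi)
    return d, r, s


# Hypothetical values: requires d > 0, 0 < r < 1, 0 < s < 1.
d, r, s = 0.05, 0.5, 0.2
lam, mu, psi = to_rates(d, r, s)
print(f"lambda = {lam:.4f}, mu = {mu:.4f}, psi = {psi:.4f}")
print("round trip:", to_drs(lam, mu, psi))
```

Working with d, r, and s keeps turnover and the sampling proportion on the unit interval, which makes it straightforward to place priors on them.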
The Fossilized Birth-Death Process (FBD)
Improving statistical inference of absolute node ages
Eliminates the need to specify arbitrary
calibration densities
Useful for ‘total-evidence’ analyses
Better capture our statistical
uncertainty in species divergence dates
All reliable fossils associated with a
clade are used
150 100 50 0
Time
(Heath, Huelsenbeck, Stadler. PNAS 2014)
The Fossilized Birth-Death Process (FBD)
Recovered fossil specimens provide historical observations of the diversification process that generated the tree of extant species
The probability of the tree and fossil observations under a birth-death model with rate parameters:
λ = speciation
µ = extinction
ψ = fossilization/recovery
The occurrence time of the fossil indicates an observation of the birth-death process before the present
The fossil must attach to the tree at some time and to some branch: F
If it is the descendant of an unobserved lineage, then there is a speciation event at time F
MCMC is used to propose new topological placements for the fossil
Using rjMCMC, we can propose F = , which means that the fossil is a “sampled ancestor”
The probability of any realization of the diversification process is conditional on: λ, µ, and ψ
Using MCMC, we can sample the age of the MRCA and the placement and time of the fossil lineage
[Figure: tree of extant lineages with one fossil occurrence; x-axis: Time (My), 250 to 0]
Diversification of Fossil & Extant Lineages (Heath, Huelsenbeck, Stadler. PNAS 2014)
The Fossilized Birth-Death Process (FBD)
Under the FBD, multiple fossils are considered, even if they are descended from the same MRCA node in the extant tree
Given F and , the new fossil can attach to the tree via speciation along either branch in the extant tree at time F
Or the unobserved branch leading to the other fossil
If F = , then the new fossil lies directly on a branch in the extant tree
Or it is an ancestor of the other sampled fossil
The probability of this realization of the diversification process is conditional on: λ, µ, and ψ
Using MCMC, we can sample the age of the MRCA and the placement and time of all fossil lineages
[Figure: tree of extant lineages with multiple fossil occurrences; x-axis: Time (My), 250 to 0]
Sampled Ancestors
Sampled lineages with sampled descendants
There is a non-zero probability of sampling ancestor-descendant relationships from the fossil record
[Figure: tree with sampled ancestors; x-axis: Time, 150 to 0]
Because fossils & living taxa are assumed to come from a single diversification process, there is a non-zero probability of sampled ancestors
[Figure: Complete FBD Tree beside the Reconstructed FBD Tree]
If all fossils are forced to be on separate lineages, this induces additional speciation events and will, in turn, influence rate & node-age estimates.
[Figure: Complete FBD Tree beside the No Sampled Ancestor Tree]
Combining Fossil & Extant Data
DNA Data
Substitution Model
Site Rate Model
Branch Rate Model
Morphological Data
Substitution Model
Site Rate Model
Branch Rate Model
Time Tree Model
Fossil Occurrence Time Data
0001010111001
0101111110114
0011100110005
1??010000????
???110211????
CATTATTCGCATTA
CATCATTCGAATCA
AATTATCCGAGTAA
??????????????
??????????????
Geological Time
(turtle tree image by M. Landis)
Modeling Morphological Character Change
The Lewis Mk model
Assumes a character can take k states
Transition rates between states are equal
$$Q = \begin{bmatrix} 1-k & 1 & \cdots & 1 \\ 1 & 1-k & \cdots & 1 \\ \vdots & \vdots & \ddots & \vdots \\ 1 & 1 & \cdots & 1-k \end{bmatrix}$$
[Figure: example characters with states 0, 1, and 2 scored for taxa T1 to T7]
(Lewis. Systematic Biology 2001)
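A small Python sketch shows what equal transition rates imply for a k-state character. It builds the rate matrix Q with off-diagonal entries of 1 and diagonal entries of 1 − k, and uses the closed-form transition probabilities for that matrix; the function names are hypothetical and no overall rate scaling is applied.

```python
import math


def mk_rate_matrix(k):
    """Rate matrix with equal rates: 1 off the diagonal, 1 - k on it."""
    return [[(1 - k) if i == j else 1 for j in range(k)] for i in range(k)]


def mk_transition_probs(k, t):
    """Closed-form P(t) = exp(Qt) for the equal-rates matrix above."""
    decay = math.exp(-k * t)
    stay = 1.0 / k + (k - 1.0) / k * decay     # probability of the same state
    change = 1.0 / k - decay / k               # probability of a given other state
    return [[stay if i == j else change for j in range(k)] for i in range(k)]


k = 3
Q = mk_rate_matrix(k)
P_short = mk_transition_probs(k, 0.1)
P_long = mk_transition_probs(k, 50.0)

print("Q =", Q)
print("P(0.1) first row:", [round(p, 4) for p in P_short[0]])
print("P(50) first row: ", [round(p, 4) for p in P_long[0]])
```

Over long branches every state becomes equally likely (1/k), regardless of the starting state, which reflects the symmetry of the model.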
Modeling Morphological Character Change
The Lewis Mkv model
Accounts for the ascertainment bias in morphological datasets by conditioning the likelihood on the fact that invariant characters are not sampled
$$Q = \begin{bmatrix} 1-k & 1 & \cdots & 1 \\ 1 & 1-k & \cdots & 1 \\ \vdots & \vdots & \ddots & \vdots \\ 1 & 1 & \cdots & 1-k \end{bmatrix}$$
[Figure: example characters scored for taxa T1 to T7, including invariant characters (all 0, all 1, or all 2) that are not sampled]
(Lewis. Systematic Biology 2001)
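The conditioning can be written out explicitly. As a sketch of the general form of this ascertainment-bias correction (the symbols $D_i$ for a character, $T$ for the tree with branch lengths, and $c$ for a state are notation introduced here, not taken from the slide), the likelihood of an observed variable character is its unconditional Mk likelihood divided by the probability that a character is variable:

```latex
P(D_i \mid T, \text{variable})
  = \frac{P(D_i \mid T)}
         {1 - \sum_{c=0}^{k-1} P(\text{all tips in state } c \mid T)}
```

The denominator removes the probability mass of the k invariant patterns, which can never appear in a matrix that contains only variable characters.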
“Total-Evidence” Analysis
Integrating models of molecular and morphological evolution
with improved tree priors enables joint inference of the tree
topology (extant & extinct) and divergence times
DNA Data
Substitution Model
Site Rate Model
Branch Rate Model
Morphological Data
Substitution Model
Site Rate Model
Branch Rate Model
Time Tree Model
Fossil Occurrence Time Data

More Related Content

What's hot

Mapping and QTL
Mapping and QTLMapping and QTL
Mapping and QTLFAO
 
Molecular clock, Neutral hypothesis
Molecular clock, Neutral hypothesisMolecular clock, Neutral hypothesis
Molecular clock, Neutral hypothesisFreya Cardozo
 
Diversity Array technology
Diversity Array technologyDiversity Array technology
Diversity Array technologyManjesh Saakre
 
Biological databases
Biological databasesBiological databases
Biological databasesAshfaq Ahmad
 
Whole genome sequencing of arabidopsis thaliana
Whole genome sequencing of arabidopsis thalianaWhole genome sequencing of arabidopsis thaliana
Whole genome sequencing of arabidopsis thalianaBhavya Sree
 
An overview of agricultural applications of genome editing: Crop plants
An overview of agricultural applications of genome editing: Crop plantsAn overview of agricultural applications of genome editing: Crop plants
An overview of agricultural applications of genome editing: Crop plantsOECD Environment
 
Candidate Gene Approach in Crop Improvement
Candidate Gene Approach in Crop ImprovementCandidate Gene Approach in Crop Improvement
Candidate Gene Approach in Crop ImprovementBonipasAntony2
 
Distance based method
Distance based method Distance based method
Distance based method Adhena Lulli
 
Microarray Data Analysis
Microarray Data AnalysisMicroarray Data Analysis
Microarray Data Analysisyuvraj404
 
Transgenesis, Intragenesis and Cisgenesis: A Brief Review
Transgenesis, Intragenesis and Cisgenesis: A Brief ReviewTransgenesis, Intragenesis and Cisgenesis: A Brief Review
Transgenesis, Intragenesis and Cisgenesis: A Brief ReviewHuda Nazeer
 
Omics for crop improvement (new)
Omics for crop improvement (new)Omics for crop improvement (new)
Omics for crop improvement (new)Gokul Dhana
 
Epi519 Gwas Talk
Epi519 Gwas TalkEpi519 Gwas Talk
Epi519 Gwas Talkjoshbis
 
SNPs analysis methods
SNPs analysis methodsSNPs analysis methods
SNPs analysis methodshad89
 
Molecular marker and its application to genome mapping and molecular breeding
Molecular marker and its application to genome mapping and molecular breedingMolecular marker and its application to genome mapping and molecular breeding
Molecular marker and its application to genome mapping and molecular breedingFOODCROPS
 
Omics in plant breeding
Omics in plant breedingOmics in plant breeding
Omics in plant breedingpoornimakn04
 

What's hot (20)

Mapping and QTL
Mapping and QTLMapping and QTL
Mapping and QTL
 
Molecular marker
Molecular markerMolecular marker
Molecular marker
 
Phylogenetics
PhylogeneticsPhylogenetics
Phylogenetics
 
GWAS
GWASGWAS
GWAS
 
Molecular clock, Neutral hypothesis
Molecular clock, Neutral hypothesisMolecular clock, Neutral hypothesis
Molecular clock, Neutral hypothesis
 
Diversity Array technology
Diversity Array technologyDiversity Array technology
Diversity Array technology
 
Biological databases
Biological databasesBiological databases
Biological databases
 
Whole genome sequencing of arabidopsis thaliana
Whole genome sequencing of arabidopsis thalianaWhole genome sequencing of arabidopsis thaliana
Whole genome sequencing of arabidopsis thaliana
 
An overview of agricultural applications of genome editing: Crop plants
An overview of agricultural applications of genome editing: Crop plantsAn overview of agricultural applications of genome editing: Crop plants
An overview of agricultural applications of genome editing: Crop plants
 
Candidate Gene Approach in Crop Improvement
Candidate Gene Approach in Crop ImprovementCandidate Gene Approach in Crop Improvement
Candidate Gene Approach in Crop Improvement
 
Phylogenetic studies
Phylogenetic studiesPhylogenetic studies
Phylogenetic studies
 
Distance based method
Distance based method Distance based method
Distance based method
 
Microarray Data Analysis
Microarray Data AnalysisMicroarray Data Analysis
Microarray Data Analysis
 
Transgenesis, Intragenesis and Cisgenesis: A Brief Review
Transgenesis, Intragenesis and Cisgenesis: A Brief ReviewTransgenesis, Intragenesis and Cisgenesis: A Brief Review
Transgenesis, Intragenesis and Cisgenesis: A Brief Review
 
Omics for crop improvement (new)
Omics for crop improvement (new)Omics for crop improvement (new)
Omics for crop improvement (new)
 
Functional genomics
Functional genomicsFunctional genomics
Functional genomics
 
Epi519 Gwas Talk
Epi519 Gwas TalkEpi519 Gwas Talk
Epi519 Gwas Talk
 
SNPs analysis methods
SNPs analysis methodsSNPs analysis methods
SNPs analysis methods
 
Molecular marker and its application to genome mapping and molecular breeding
Molecular marker and its application to genome mapping and molecular breedingMolecular marker and its application to genome mapping and molecular breeding
Molecular marker and its application to genome mapping and molecular breeding
 
Omics in plant breeding
Omics in plant breedingOmics in plant breeding
Omics in plant breeding
 

Bayesian Divergence Time Estimation – Workshop Lecture

  • 1. Bayesian Divergence Time Estimation. Tracy Heath, Ecology, Evolution, & Organismal Biology, Iowa State University. @trayc7, http://phyloworks.org. 2016 Workshop on Molecular Evolution, Woods Hole, MA, USA
  • 2. O Overview of divergence time estimation • Relaxed clock models: accounting for variation in substitution rates among lineages • Tree models: lineage diversification and sampling • (break) • BEAST v2 Tutorial: Divergence-time estimation under birth-death processes • Dating Bear Divergence Times with the Fossilized Birth-Death Process • (dinner)
  • 3. A T -S E Phylogenies with branch lengths proportional to time provide more information about evolutionary history than unrooted trees with branch lengths in units of substitutions/site. [Figure: tree with scale bar 0.03 substitutions/site beside a time tree; axis: Time (My), 20 to 0, spanning Oligocene, Miocene, Pliocene, Pleistocene] (silhouette images from http://phylopic.org)
  • 4. A T -S E Phylogenetic divergence-time estimation: Understanding Evolutionary Processes • Historical biogeography: What was the spatial and climatic environment of ancient angiosperms? (Antonelli & Sanmartin. Syst. Biol. 2011) • Trait evolution: How has mammalian body size changed over time? (Lartillot & Delsuc. Evolution 2012) • Epidemiology: How has the infection rate of HCV in Egypt changed over time? (Stadler et al. PNAS 2013) • Diversification: Is diversification in Caribbean anoles correlated with ecological opportunity? (Mahler, Revell, Glor, & Losos. Evolution 2010) • Molecular evolution: How has the rate of molecular evolution changed across the Tree of Life? (Nabholz, Glemin, Galtier. MBE 2008) [Image: Anolis fowleri (image by L. Mahler)]
  • 5. Divergence Time Estimation. Goal: Estimate the ages of interior nodes to understand the timing and rates of evolutionary processes • Model how rates are distributed across the tree • Describe the distribution of speciation events over time • External calibration information for estimates of absolute node times [Figure: time-calibrated phylogeny from Equus to Microcebus with a calibrated node; clades Simiiformes and Microcebus labeled; axis: Time (Millions of years), 100 to 0, spanning Cretaceous, Paleogene, Neogene, Q]
  • 6. The Global Molecular Clock. Assume that the rate of evolutionary change is constant over time (branch lengths equal percent sequence divergence). [Figure: tree of taxa A, B, C; branch labels 20%, 10%, 10%, 10%; node labels 200 My and 400 My] (Based on slides by Jeff Thorne; http://statgen.ncsu.edu/thorne/compmolevo.html)
  • 7. The Global Molecular Clock. We can date the tree if we know the rate of change is 1% divergence per 10 My. [Figure: tree of taxa A, B, C; branch labels 20%, 10%, 10%, 10%; node labels 200 My and 400 My] (Based on slides by Jeff Thorne; http://statgen.ncsu.edu/thorne/compmolevo.html)
  • 8. The Global Molecular Clock. If we found a fossil of the MRCA of B and C, we can use it to calculate the rate of change and date the root of the tree. [Figure: tree of taxa A, B, C; branch labels 20%, 10%, 10%, 10%; node labels 200 My and 400 My] (Based on slides by Jeff Thorne; http://statgen.ncsu.edu/thorne/compmolevo.html)
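The arithmetic on slides 7 and 8 can be sketched in Python. This is a minimal illustration rather than anything from the lecture itself. It assumes the percentages are read as pairwise divergence (the sum of the branches separating two tips), which is the reading that reproduces the 200 My and 400 My node labels; the dictionary keys and variable names are hypothetical.

```python
# Branch lengths in percent sequence divergence, read off the three-taxon tree.
# "BC_stem" is a made-up name for the internal branch leading to B and C.
branch = {"A": 20.0, "B": 10.0, "C": 10.0, "BC_stem": 10.0}

# Pairwise divergence = sum of the branches on the path between two tips.
div_bc = branch["B"] + branch["C"]                      # B to C
div_ab = branch["A"] + branch["BC_stem"] + branch["B"]  # A to B, through the root

# Slide 7: the rate is known (1% divergence per 10 My, i.e. 10 My per 1%).
my_per_percent = 10.0
age_bc = div_bc * my_per_percent
age_root = div_ab * my_per_percent

# Slide 8: the rate is unknown, but a fossil dates the MRCA of B and C.
fossil_age_bc = 200.0
calibrated_my_per_percent = fossil_age_bc / div_bc
root_from_fossil = div_ab * calibrated_my_per_percent

print(age_bc, age_root, root_from_fossil)
```

Both routes give the same root age because a strict clock applies a single rate to every branch; the fossil only supplies the conversion factor between divergence and time.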
  • 9. R G M C Rates of evolution vary across lineages and over time. Mutation rate: variation in • metabolic rate • generation time • DNA repair. Fixation rate: variation in • strength and targets of selection • population sizes. [Figure: tree of taxa A, B, C; branch labels 20%, 10%, 10%, 10%; node labels 200 My and 400 My]
  • 10. U A Sequence data provide information about branch lengths, in units of the expected number of substitutions per site: branch length = rate × time. [Figure panels: Sequence Data; Phylogenetic Relationships; scale bar: 0.2 expected substitutions/site]
  • 11. R T The sequence data provide information about branch length; for any possible rate, there's a time that fits the branch length perfectly. [Plot: axes Branch Rate and Branch Time, each 0 to 5; example point: time = 0.8, rate = 0.625, branch length = 0.5] Methods for dating species divergences estimate the substitution rate and time separately (based on Thorne & Kishino, 2005)
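The confounding of rate and time on slide 11 can be illustrated with a short Python sketch. It is not from the lecture: it assumes a Jukes-Cantor (JC69) substitution model purely for illustration, and the function name is hypothetical. Under that model the probability that a site differs across a branch depends only on the product rate × time, so every (rate, time) pair with the same product explains the data equally well.

```python
import math

def jc69_p_diff(rate, time):
    """Probability that a site differs across a branch under JC69.

    Only the product rate * time (the branch length) enters the formula.
    """
    d = rate * time
    return 0.75 * (1.0 - math.exp(-4.0 * d / 3.0))

branch_length = 0.5                      # expected substitutions per site
times = [0.5, 0.8, 1.0, 2.0, 4.0]        # arbitrary candidate durations
pairs = [(branch_length / t, t) for t in times]
probs = [jc69_p_diff(r, t) for r, t in pairs]

for (r, t), p in zip(pairs, probs):
    print(f"time={t:.2f}  rate={r:.3f}  P(site differs)={p:.6f}")
```

The pair time = 0.8, rate = 0.625 from the slide is one point on this curve; separating the two quantities requires priors on rates and node ages, which is what the Bayesian formulation supplies.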
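  • 12. Bayesian Divergence Time Estimation. [Figure: two copies of the tree, one with branch length = rate and one with branch length = time] $R = (r_1, r_2, r_3, \ldots, r_{2N-2})$; $A = (a_1, a_2, a_3, \ldots, a_{N-1})$; $N$ = number of tips
  • 13. Bayesian Divergence Time Estimation. [Figure: two copies of the tree, one with branch length = rate and one with branch length = time] $R = (r_1, r_2, r_3, \ldots, r_{2N-2})$; $A = (a_1, a_2, a_3, \ldots, a_{N-1})$; $N$ = number of tips
  • 14. Bayesian Divergence Time Estimation. Posterior probability: $f(R, A, \theta_R, \theta_A, \theta_s \mid D, \Psi)$ • $R$: vector of rates on branches • $A$: vector of internal node ages • $\theta_R, \theta_A, \theta_s$: model parameters • $D$: sequence data • $\Psi$: tree topology
  • 15. Bayesian Divergence Time Estimation. $f(R, A, \theta_R, \theta_A, \theta_s \mid D) = \dfrac{f(D \mid R, A, \theta_s)\, f(R \mid \theta_R)\, f(A \mid \theta_A)\, f(\theta_s)}{f(D)}$ • $f(D \mid R, A, \theta_s)$: likelihood • $f(R \mid \theta_R)$: prior on rates • $f(A \mid \theta_A)$: prior on node ages • $f(\theta_s)$: prior on substitution parameters • $f(D)$: marginal probability of the data
  • 16. Bayesian Divergence Time Estimation. Estimating divergence times relies on 2 main elements: • Branch-specific rates: $f(R \mid \theta_R)$ • Node ages & topology: $f(A \mid \theta_A)$ [Diagram: DNA Data, Rate Matrix, Site Rates, Branch Rates, Tree]
  • 17. Bayesian Divergence Time Estimation. Estimating divergence times relies on 2 main elements: • Branch-specific rates: $f(R \mid \theta_R)$ • Node ages & topology: $f(A \mid \theta_A)$ [Diagram: DNA Data, Rate Matrix, Site Rates, Branch Rates, Tree]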
  • 18. M R V Some models describing lineage-specific substitution rate variation: • Global molecular clock (Zuckerkandl & Pauling, 1962) • Local molecular clocks (Hasegawa, Kishino & Yano 1989; Kishino & Hasegawa 1990; Yoder & Yang 2000; Yang & Yoder 2003; Drummond & Suchard 2010) • Punctuated rate change model (Huelsenbeck, Larget & Swofford 2000) • Log-normally distributed autocorrelated rates (Thorne, Kishino & Painter 1998; Kishino, Thorne & Bruno 2001; Thorne & Kishino 2002) • Uncorrelated/independent rates models (Drummond et al. 2006; Rannala & Yang 2007; Lepage et al. 2007) • Mixture models on branch rates (Heath, Holder, Huelsenbeck 2012) Models of Lineage-specific Rate Variation
  • 19. Global Molecular Clock. The substitution rate is constant over time. All lineages share the same rate. [Figure: tree with branch length = substitution rate; color scale low to high] (Zuckerkandl & Pauling, 1962)
  • 20. Relaxed-Clock Models. To accommodate variation in substitution rates, ‘relaxed-clock’ models estimate lineage-specific substitution rates: • Local molecular clocks • Punctuated rate change model • Log-normally distributed autocorrelated rates • Uncorrelated/independent rates models • Mixture models on branch rates
  • 21. Local Molecular Clocks. Rate shifts occur infrequently over the tree. Closely related lineages have equivalent rates (clustered by sub-clades). [Figure: tree with branch length = substitution rate; color scale low to high] (Yang & Yoder 2003; Drummond & Suchard 2010)
  • 22. Local Molecular Clocks. Most methods for estimating local clocks required specifying the number and locations of rate changes a priori. Drummond and Suchard (2010) introduced a Bayesian method that samples over a broad range of possible random local clocks. [Figure: tree with branch length = substitution rate; color scale low to high] (Yang & Yoder 2003; Drummond & Suchard 2010)
  • 23. Autocorrelated Rates. Substitution rates evolve gradually over time: closely related lineages have similar rates. The rate at a node is drawn from a lognormal distribution with a mean equal to the parent rate (geometric Brownian motion). [Figure: tree with branch length = substitution rate; color scale low to high] (Thorne, Kishino & Painter 1998; Kishino, Thorne & Bruno 2001)
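A minimal Python sketch of the autocorrelated-rates idea on slide 23 follows. It is an illustration under stated assumptions, not the published implementation: the four-branch tree, the parameter `nu`, and the choice to let the log-scale variance grow as `nu * duration` are all assumptions made here. The only property taken from the slide is that each node's rate is lognormal with mean equal to its parent's rate.

```python
import math
import random

# Hypothetical rooted tree: child -> (parent, branch duration in My).
TREE = {
    "n1": ("root", 10.0),
    "A": ("root", 30.0),
    "B": ("n1", 20.0),
    "C": ("n1", 20.0),
}

def simulate_autocorrelated_rates(tree, root_rate, nu, rng):
    """Draw one rate per node, each centered on its parent's rate."""
    rates = {"root": root_rate}
    pending = list(tree)
    while pending:
        for node in list(pending):
            parent, duration = tree[node]
            if parent not in rates:
                continue  # parent not drawn yet; revisit on the next pass
            sigma2 = nu * duration  # assumed: variance grows with elapsed time
            # Shift the log-scale mean by -sigma2/2 so that the lognormal
            # mean (not the median) equals the parent rate.
            mu = math.log(rates[parent]) - sigma2 / 2.0
            rates[node] = rng.lognormvariate(mu, math.sqrt(sigma2))
            pending.remove(node)
    return rates

rng = random.Random(7)
for node, rate in simulate_autocorrelated_rates(TREE, 0.01, 0.02, rng).items():
    print(f"{node}: {rate:.5f}")
```

Because each draw is centered on the parent's rate, sister lineages B and C start from the same value at `n1` and tend to stay similar, which is the behavior the slide describes.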
  • 24. Punctuated Rate Change. Rate changes occur along lineages according to a point process. At rate-change events, the new rate is a product of the parent’s rate and a gamma-distributed multiplier. [Figure: tree with branch length = substitution rate; color scale low to high] (Huelsenbeck, Larget & Swofford 2000)
  • 25. Independent/Uncorrelated Rates. Lineage-specific rates are uncorrelated when the rate assigned to each branch is independently drawn from an underlying distribution. [Figure: tree with branch length = substitution rate; color scale low to high] (Drummond et al. 2006; Rannala & Yang 2007; Lepage et al. 2007)
  • 26. Independent/Uncorrelated Rates. Lineage-specific rates are uncorrelated when the rate assigned to each branch is independently drawn from an underlying distribution. (Drummond et al. 2006; Rannala & Yang 2007; Lepage et al. 2007)
  • 27. I M M Dirichlet process prior: Branches are partitioned into distinct rate categories. The number of rate categories and the assignment of branches to categories are random variables under the model. [Figure: tree with branch length = substitution rate; substitution rate classes c1 to c5] (Heath, Holder, Huelsenbeck. 2012 MBE)
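One way to see how a Dirichlet process prior partitions branches is the Chinese restaurant process, sketched below in Python. This is a generic illustration, not the method of Heath, Holder & Huelsenbeck (2012): the concentration value `alpha`, the gamma base distribution for category rates, and all names are assumptions made here.

```python
import random

def crp_partition(n_items, alpha, rng):
    """Assign items to categories with the Chinese restaurant process.

    An item joins an existing category with probability proportional to
    that category's size, or opens a new one with probability
    proportional to alpha.
    """
    assignments, counts = [], []
    for _ in range(n_items):
        weights = counts + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(counts):
            counts.append(1)   # open a new rate category
        else:
            counts[k] += 1
        assignments.append(k)
    return assignments, counts

rng = random.Random(1)
n_tips = 6
n_branches = 2 * n_tips - 2    # branches in a rooted binary tree with N tips

assignments, counts = crp_partition(n_branches, alpha=1.0, rng=rng)
# Assumed base distribution: one gamma-distributed rate per category.
category_rates = [rng.gammavariate(2.0, 0.5) for _ in counts]
branch_rates = [category_rates[k] for k in assignments]

print("category of each branch:", assignments)
print("category sizes:", counts)
```

Neither the number of categories nor their membership is fixed in advance; both fall out of the draw, with larger `alpha` favoring more categories.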
  • 28. M R V These are only a subset of the available models for branch-rate variation: • Global molecular clock • Local molecular clocks • Punctuated rate change model • Log-normally distributed autocorrelated rates • Uncorrelated/independent rates models • Dirichlet process prior
  • 29. M R V Are our models appropriate across all data sets? [Figure: dated phylogeny of extant and extinct bears with mean node ages (Ma) and 95% CIs, Eocene to Holocene. Krause et al., 2008. Mitochondrial genomes reveal an explosive radiation of extinct and extant bears near the Miocene-Pliocene boundary. BMC Evol. Biol. 8.] [Figure: dated phylogeny of ray-finned fishes with clade-specific estimates of r, ε, and ΔAIC; axis in MYA. Santini et al., 2009. Did genome duplication drive the origin of teleosts? A comparative study of diversification in ray-finned fishes. BMC Evol. Biol. 9.]
  • 30. M R V These are only a subset of the available models for branch-rate variation: • Global molecular clock • Local molecular clocks • Punctuated rate change model • Log-normally distributed autocorrelated rates • Uncorrelated/independent rates models • Dirichlet process prior. Considering model selection, uncertainty, & plausibility is very important for Bayesian divergence time analysis
  • 31. Bayesian Divergence Time Estimation. Estimating divergence times relies on 2 main elements: • Branch-specific rates: $f(R \mid \theta_R)$ • Node ages: $f(A \mid \theta_A)$ [Diagram: DNA Data, Rate Matrix, Site Rates, Branch Rates, Tree]
  • 32. P T N A Relaxed clock Bayesian analyses require a prior distribution on time trees Different node-age priors make different assumptions about the timing of divergence events Tree Priors
  • 33. S B P Node-age priors based on stochastic models of lineage diversification Yule process: assumes a constant rate of speciation, across lineages A pure birth process—every node leaves extant descendants (no extinction) Tree Priors
  • 34. S B P Node-age priors based on stochastic models of lineage diversification Constant-rate birth-death process: at any point in time a lineage can speciate at rate λ or go extinct with a rate of µ Tree Priors
  • 36. S B P Different values of λ and µ lead to different trees Bayesian inference under these models can be very sensitive to the values of these parameters Using hyperpriors on λ and µ (or d and r) accounts for uncertainty in these hyperparameters Tree Priors
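A small sketch of why the speciation and extinction rates matter so much: under a constant-rate birth-death process started from a single lineage, both the expected number of lineages and the chance that the whole clade has died out depend strongly on the two rates. The closed-form expressions below are the standard results for the linear birth-death process with complete sampling; the function and argument names (`lam` for the speciation rate, `mu` for the extinction rate) are illustrative.

```python
import math


def expected_lineages(lam, mu, t):
    """Expected number of lineages after time t, starting from one lineage."""
    return math.exp((lam - mu) * t)


def prob_extinct(lam, mu, t):
    """Probability that a single lineage has no descendants after time t."""
    if lam == mu:
        # Critical case: the general formula is 0/0, so use its limit.
        return lam * t / (1.0 + lam * t)
    decay = math.exp(-(lam - mu) * t)
    return mu * (1.0 - decay) / (lam - mu * decay)
```

With `mu = 0` this reduces to the Yule process, where extinction never happens; with `lam > mu` the extinction probability approaches `mu / lam` as `t` grows.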
  • 37. S B P Node-age priors based on stochastic models of lineage diversification Birth-death-sampling process: an extension of the constant-rate birth-death model that accounts for random sampling of tips Conditions on a probability of sampling a tip, ρ Tree Priors
  • 38. P N T Sequence data are only informative on relative rates & times Node-time priors cannot give precise estimates of absolute node ages We need external information (like fossils) to provide absolute time scale Node Age Priors
  • 39. C D T Fossils (or other data) are necessary to estimate absolute node ages There is no information in the sequence data for absolute time Uncertainty in the placement of fossils A B C 20% 10% 10% 10% 200 My 400 My
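The point that sequence data carry no information about absolute time can be shown with a two-line calculation: a branch length in substitutions per site is the product of rate and time, so very different rate and time pairs give the same expected branch length. The numbers and function names below are made up for illustration.

```python
import math


def expected_substitutions(rate, time):
    """Expected branch length (substitutions/site) = rate x time."""
    return rate * time


def rate_given_calibration(branch_length, age):
    """Once an external age is supplied, the rate is pinned down."""
    return branch_length / age


# A fast, young branch and a slow, old branch are indistinguishable
# from sequence data alone.
fast_young = expected_substitutions(rate=0.01, time=10.0)
slow_old = expected_substitutions(rate=0.001, time=100.0)
print(math.isclose(fast_young, slow_old))
```

Only external information, such as a fossil age for the node, separates the two scenarios; `rate_given_calibration` shows the rate implied once that age is fixed.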
  • 40. C D Bayesian inference is well suited to accommodating uncertainty in the age of the calibration node Divergence times are calibrated by placing parametric densities on internal nodes offset by age estimates from the fossil record [Figure: tree with tips A, B, C and a calibration density (Density vs. Age) offset at 200 My]
  • 41. A F C Misplaced fossils can affect node age estimates throughout the tree – if the fossil is older than its presumed MRCA Calibrating the Tree (figure from Benton & Donoghue Mol. Biol. Evol. 2007)
  • 42. F C Age estimates from fossils can provide minimum time constraints for internal nodes Reliable maximum bounds are typically unavailable Minimum age Time (My) Calibrating Divergence Times
  • 43. P D C N Common practice in Bayesian divergence-time estimation: Parametric distributions are typically offset by the age of the oldest fossil assigned to a clade These prior densities do not (necessarily) require specification of maximum bounds Uniform (min, max) Exponential (λ) Gamma (α, β) Log Normal (µ, σ²) [Axis: Time (My), starting at the minimum age] Calibrating Divergence Times
  • 44. P D C N Describe the waiting time between the divergence event and the age of the oldest fossil Minimum age Time (My) Calibrating Divergence Times
  • 45. P D C N Overly informative priors can bias node age estimates to be too young Minimum age Exponential (λ) Time (My) Calibrating Divergence Times
  • 46. P D C N Uncertainty in the age of the MRCA of the clade relative to the age of the fossil may be better captured by vague prior densities Minimum age Exponential (λ) Time (My) Calibrating Divergence Times
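As a rough illustration of how informative an offset exponential calibration density can be, the sketch below computes its density and quantiles. Here `rate` is the exponential rate parameter of the calibration density (in 1/My), not a speciation rate, and the minimum age of 20 My is an invented example value.

```python
import math


def offset_exponential_pdf(age, min_age, rate):
    """Prior density of a node age under an exponential offset by min_age."""
    if age < min_age:
        return 0.0  # the node cannot be younger than its oldest fossil
    return rate * math.exp(-rate * (age - min_age))


def offset_exponential_quantile(p, min_age, rate):
    """Age below which a fraction p of the prior mass lies."""
    return min_age - math.log(1.0 - p) / rate


# Informative prior: 95% of the mass lies within about 3 My of the fossil.
print(offset_exponential_quantile(0.95, min_age=20.0, rate=1.0))
# Vague prior: 95% of the mass extends about 60 My beyond the fossil.
print(offset_exponential_quantile(0.95, min_age=20.0, rate=0.05))
```

A large `rate` concentrates the prior just above the fossil age, which is how an overly informative density can pull node ages toward values that are too young.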
  • 47. P D C N Common practice in Bayesian divergence-time estimation: Estimates of absolute node ages are driven primarily by the calibration density Specifying appropriate densities is a challenge for most molecular biologists Uniform (min, max) Exponential (λ) Gamma (α, β) Log Normal (µ, σ²) [Axis: Time (My), starting at the minimum age] Calibration Density Approach
  • 48. I F C We would prefer to eliminate the need for ad hoc calibration prior densities Calibration densities do not account for diversification of fossils Domestic dog Spotted seal Giant panda Spectacled bear Sun bear Am. black bear Asian black bear Brown bear Polar bear Sloth bear Zaragocyon daamsi Ballusia elmensis Ursavus brevirhinus Ailurarctos lufengensis Ursavus primaevus Agriarctos spp. Kretzoiarctos beatrix Indarctos vireti Indarctos arctoides Indarctos punjabiensis Giant short-faced bear Cave bear Fossil and Extant Bears (Krause et al. BMC Evol. Biol. 2008; Abella et al. PLoS ONE 2012)
  • 49. I F C We want to use all of the available fossils Example: Bears 12 fossils are reduced to 4 calibration ages with calibration density methods Fossil and Extant Bears (Krause et al. BMC Evol. Biol. 2008; Abella et al. PLoS ONE 2012)
  • 51. I F C Because fossils are part of the diversification process, we can combine fossil calibration with birth-death models Fossil and Extant Bears (Krause et al. BMC Evol. Biol. 2008; Abella et al. PLoS ONE 2012)
  • 52. I F C This relies on a branching model that accounts for speciation, extinction, and rates of fossilization, preservation, and recovery Fossil and Extant Bears (Krause et al. BMC Evol. Biol. 2008; Abella et al. PLoS ONE 2012)
  • 53. P N “Except during the interlude of the [Modern] Synthesis, there has been limited communication historically among the disciplines of evolutionary biology, particularly between students of evolutionary history (paleontologists and systematists) and those of molecular, population, and organismal biology. There has been increasing realization that barriers between these subfields must be overcome if a complete theory of evolution and systematics is to be forged.” Reaka-Kudla & Colwell: in Dudley (ed.), The Unity of Evolutionary Biology: Proceedings of the Fourth International Congress of Systematic & Evolutionary Biology, Dioscorides Press, Portland, OR, p. 16. (1994) Two Separate Fields, Same Goals
  • 54. P N Two Separate Fields, Same Goals
  • 55. C F E D Statistical methods provide a way to integrate paleontological & neontological data Two Separate Fields, Same Goals
  • 56. C F E D Combine models for sequence evolution, morphological change, & fossil recovery to jointly estimate the tree topology, divergence times, & lineage diversification rates DNA Data Substitution Model Site Rate Model Branch Rate Model Morphological Data Substitution Model Site Rate Model Branch Rate Model Time Tree Model Fossil Occurrence Time Data
  • 57. C F E D Until recently, analyses combining fossil & extant taxa used simple or inappropriate models to describe the tree and fossil ages DNA Data Substitution Model Site Rate Model Branch Rate Model Morphological Data Substitution Model Site Rate Model Branch Rate Model Time Tree Model Fossil Occurrence Time Data
  • 58. M T O T Stadler (2010) introduced a generating model for a serially sampled time tree — this is the fossilized birth-death process. (Stadler. Journal of Theoretical Biology 2010)
  • 59. P FBD This graph shows the conditional dependence structure of the FBD model, which is a generating process for a sampled, dated time tree and fossil occurrences. Parameters: λ (speciation rate), µ (extinction rate), ψ (fossil recovery rate), ρ (sampling probability), x0 (origin time). Generated quantities: T (time tree), F (fossil occurrence times)
  • 60. P FBD We re-parameterize the model so that we directly estimate the diversification rate (d), turnover (r), and fossil sampling proportion (s), in place of the speciation rate (λ), extinction rate (µ), and fossil recovery rate (ψ): $\lambda = \frac{d}{1-r}$, $\mu = \frac{rd}{1-r}$, $\psi = \frac{s}{1-s}\cdot\frac{rd}{1-r}$. The origin time (x0), sampling probability (ρ), time tree (T), and fossil occurrence times (F) are unchanged
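The re-parameterization can be written as a pair of conversion functions. This sketch assumes the usual FBD definitions d = λ − µ (diversification rate), r = µ/λ (turnover), and s = ψ/(µ + ψ) (fossil sampling proportion); the function names are illustrative and do not correspond to any software API.

```python
def drs_to_rates(d, r, s):
    """Convert (diversification, turnover, sampling proportion) to (lam, mu, psi)."""
    lam = d / (1.0 - r)
    mu = r * d / (1.0 - r)
    psi = (s / (1.0 - s)) * mu
    return lam, mu, psi


def rates_to_drs(lam, mu, psi):
    """Inverse map: (lam, mu, psi) to (d, r, s)."""
    d = lam - mu
    r = mu / lam
    s = psi / (mu + psi)
    return d, r, s
```

One practical reason for working with d, r, and s is that r and s live on the unit interval, which makes it easier to choose priors for them than for the raw rates.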
  • 61. T F B -D P (FBD) Improving statistical inference of absolute node ages Eliminates the need to specify arbitrary calibration densities Useful for ‘total-evidence’ analyses Better capture our statistical uncertainty in species divergence dates All reliable fossils associated with a clade are used 150 100 50 0 Time (Heath, Huelsenbeck, Stadler. PNAS 2014)
  • 62. T F B -D P (FBD) Recovered fossil specimens provide historical observations of the diversification process that generated the tree of extant species 150 100 50 0 Time Diversification of Fossil & Extant Lineages (Heath, Huelsenbeck, Stadler. PNAS 2014)
  • 63. T F B -D P (FBD) The probability of the tree and fossil observations under a birth-death model with rate parameters: λ = speciation, µ = extinction, ψ = fossilization/recovery [Time axis: 150–0] Diversification of Fossil & Extant Lineages (Heath, Huelsenbeck, Stadler. PNAS 2014)
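A minimal forward simulation can help build intuition for a birth-death process with fossil recovery. This sketch tracks only the number of living lineages and the ages of recovered fossils, not the tree itself, and it is not the likelihood calculation from the cited paper. The names `lam`, `mu`, `psi`, `origin_time`, and `max_lineages` are illustrative.

```python
import random


def simulate_fbd_counts(lam, mu, psi, origin_time, seed=None, max_lineages=10000):
    """Simulate lineage counts and fossil ages from the origin to the present.

    Time is measured as age before the present, so it counts down to 0.
    Returns (number of lineages alive at the present, list of fossil ages).
    """
    rng = random.Random(seed)
    age = origin_time
    n_lineages = 1
    fossil_ages = []
    per_lineage_rate = lam + mu + psi
    while 0 < n_lineages < max_lineages and per_lineage_rate > 0:
        # Waiting time to the next event anywhere in the process.
        age -= rng.expovariate(n_lineages * per_lineage_rate)
        if age <= 0:
            break  # reached the present before the next event
        u = rng.random() * per_lineage_rate
        if u < lam:
            n_lineages += 1          # speciation
        elif u < lam + mu:
            n_lineages -= 1          # extinction
        else:
            fossil_ages.append(age)  # fossil recovered; lineage continues
    return n_lineages, fossil_ages
```

For example, `simulate_fbd_counts(0.1, 0.05, 0.2, origin_time=30.0, seed=1)` returns one realization; repeating with different seeds shows how variable the number of fossils and survivors can be for fixed rates.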
  • 64. T F B -D P (FBD) The occurrence time of the fossil indicates an observation of the birth-death process before the present [Time axis: 250–0 My] Diversification of Fossil & Extant Lineages (Heath, Huelsenbeck, Stadler. PNAS 2014)
  • 65. T F B -D P (FBD) The fossil must attach to the tree at some time and to some branch [Time axis: 250–0 My] Diversification of Fossil & Extant Lineages (Heath, Huelsenbeck, Stadler. PNAS 2014)
  • 66. T F B -D P (FBD) If it is the descendant of an unobserved lineage, then there is a speciation event at the attachment time [Time axis: 250–0 My] Diversification of Fossil & Extant Lineages (Heath, Huelsenbeck, Stadler. PNAS 2014)
  • 67. T F B -D P (FBD) MCMC is used to propose new topological placements for the fossil [Time axis: 250–0 My] Diversification of Fossil & Extant Lineages (Heath, Huelsenbeck, Stadler. PNAS 2014)
  • 68. T F B -D P (FBD) Using rjMCMC, we can propose that the fossil's attachment time equals its occurrence time, which means that the fossil is a “sampled ancestor” [Time axis: 250–0 My] Diversification of Fossil & Extant Lineages (Heath, Huelsenbeck, Stadler. PNAS 2014)
  • 70. T F B -D P (FBD) The probability of any realization of the diversification process is conditional on: λ, µ, and ψ [Time axis: 250–0 My] Diversification of Fossil & Extant Lineages (Heath, Huelsenbeck, Stadler. PNAS 2014)
  • 71. T F B -D P (FBD) Using MCMC, we can sample the age of the MRCA (•) and the placement and time of the fossil lineage [Time axis: 250–0 My] Diversification of Fossil & Extant Lineages (Heath, Huelsenbeck, Stadler. PNAS 2014)
  • 72. T F B -D P (FBD) Under the FBD, multiple fossils are considered, even if they are descended from the same MRCA node in the extant tree [Time axis: 250–0 My] Diversification of Fossil & Extant Lineages (Heath, Huelsenbeck, Stadler. PNAS 2014)
  • 73. T F B -D P (FBD) Given its attachment time and occurrence time, the new fossil can attach to the tree via speciation along either branch in the extant tree at the attachment time [Time axis: 250–0 My] Diversification of Fossil & Extant Lineages (Heath, Huelsenbeck, Stadler. PNAS 2014)
  • 76. T F B -D P (FBD) Or the unobserved branch leading to the other fossil [Time axis: 250–0 My] Diversification of Fossil & Extant Lineages (Heath, Huelsenbeck, Stadler. PNAS 2014)
  • 77. T F B -D P (FBD) If the attachment time equals the occurrence time, then the new fossil lies directly on a branch in the extant tree [Time axis: 250–0 My] Diversification of Fossil & Extant Lineages (Heath, Huelsenbeck, Stadler. PNAS 2014)
  • 79. T F B -D P (FBD) Or it is an ancestor of the other sampled fossil [Time axis: 250–0 My] Diversification of Fossil & Extant Lineages (Heath, Huelsenbeck, Stadler. PNAS 2014)
  • 80. T F B -D P (FBD) The probability of this realization of the diversification process is conditional on: , , and N [Time axis: 250–0 My] Diversification of Fossil & Extant Lineages (Heath, Huelsenbeck, Stadler. PNAS 2014)
  • 81. T F B -D P (FBD) Using MCMC, we can sample the age of the MRCA (•) and the placement and time of all fossil lineages [Time axis: 250–0 My] Diversification of Fossil & Extant Lineages (Heath, Huelsenbeck, Stadler. PNAS 2014)
  • 82. S A Sampled lineages with sampled descendants 150 100 50 0 Time There is a non-zero probability of sampling ancestor-descendant relationships from the fossil record
  • 83. S A Complete FBD Tree Reconstructed FBD Tree Because fossils & living taxa are assumed to come from a single diversification process, there is a non-zero probability of sampled ancestors
  • 84. S A Complete FBD Tree No Sampled Ancestor Tree If all fossils are forced to be on separate lineages, this induces additional speciation events and will, in turn, influence rate & node-age estimates.
  • 85. C F E D [Diagram: DNA Data (Substitution Model, Site Rate Model, Branch Rate Model) and Morphological Data (Substitution Model, Site Rate Model, Branch Rate Model) combined with the Time Tree Model and Fossil Occurrence Time Data; example morphological and DNA character matrices with missing entries (?) shown against geological time.] (turtle tree image by M. Landis)
  • 86. M M C C [Figure: example morphological character matrix with missing entries (?) shown against geological time.] (turtle tree image by M. Landis)
  • 87. M M C C The Lewis Mk model Assumes a character can take k states Transition rates between states are equal $Q = \alpha \begin{bmatrix} 1-k & 1 & \cdots & 1 \\ 1 & 1-k & \cdots & 1 \\ \vdots & \vdots & \ddots & \vdots \\ 1 & 1 & \cdots & 1-k \end{bmatrix}$ [Figure: example characters with states 0, 1, 2 scored for taxa T1–T7] (Lewis. Systematic Biology 2001)
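Because every off-diagonal rate is equal under the Mk model, the transition probabilities have a simple closed form. The sketch below assumes each off-diagonal entry of Q equals a single rate `alpha`, so each diagonal entry is (1 − k)·alpha; the function names are illustrative.

```python
import math


def mk_transition_prob(i, j, k, alpha, t):
    """P(state j at time t | state i at time 0) under the Mk model.

    Assumes each off-diagonal rate equals alpha, so the chain relaxes toward
    the uniform distribution over k states at rate k * alpha.
    """
    decay = math.exp(-k * alpha * t)
    if i == j:
        return 1.0 / k + (k - 1.0) / k * decay
    return 1.0 / k - decay / k


def mk_transition_matrix(k, alpha, t):
    """Full k x k transition probability matrix as a list of rows."""
    return [[mk_transition_prob(i, j, k, alpha, t) for j in range(k)]
            for i in range(k)]
```

At `t = 0` the matrix is the identity, and for long branches every entry approaches `1 / k`, which is why long branches carry little information about the ancestral state.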
  • 88. M M C C The Lewis Mkv model Accounts for the ascertainment bias in morphological datasets by conditioning the likelihood on the fact that invariant characters are not sampled $Q = \alpha \begin{bmatrix} 1-k & 1 & \cdots & 1 \\ 1 & 1-k & \cdots & 1 \\ \vdots & \vdots & \ddots & \vdots \\ 1 & 1 & \cdots & 1-k \end{bmatrix}$ [Figure: example characters scored for taxa T1–T7, with the invariant characters excluded] (Lewis. Systematic Biology 2001)
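The conditioning step can be illustrated on the smallest possible case: two tips separated by a total path length t. This is a toy sketch under stated assumptions (equal state frequencies 1/k, each off-diagonal rate equal to `alpha`, a two-tip tree), not a general tree likelihood, and the function names are illustrative.

```python
import math


def mk_prob(same_state, k, alpha, t):
    """Mk transition probability over path length t (same or different state)."""
    decay = math.exp(-k * alpha * t)
    if same_state:
        return 1.0 / k + (k - 1.0) / k * decay
    return 1.0 / k - decay / k


def mkv_two_tip_likelihood(state_a, state_b, k, alpha, t):
    """Likelihood of a variable two-tip pattern, conditioned on being variable."""
    if state_a == state_b:
        raise ValueError("Mkv conditions on variable characters only")
    # Unconditional Mk likelihood of this pattern (uniform root frequencies).
    unconditional = (1.0 / k) * mk_prob(False, k, alpha, t)
    # Probability that a character is constant across both tips.
    p_constant = mk_prob(True, k, alpha, t)
    return unconditional / (1.0 - p_constant)
```

Summed over all variable patterns, the conditioned likelihoods equal 1, which is the sense in which the Mkv correction renormalizes over the characters that could actually have been sampled.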
  • 89. “T -E ” A Integrating models of molecular and morphological evolution with improved tree priors enables joint inference of the tree topology (extant & extinct) and divergence times DNA Data Substitution Model Site Rate Model Branch Rate Model Morphological Data Substitution Model Site Rate Model Branch Rate Model Time Tree Model Fossil Occurrence Time Data
  • 90. P D D T How does our understanding of penguin evolution improve when we consider both extant and fossil taxa? (by Peppermint Narwhal Creative, http://www.peppermintnarwhal.com) Photo of Daniel Ksepka and Kairuku specimen by R.E. Fordyce (http://blogs.scientificamerican.com/observations/giant-prehistoric-penguin-was-bigger-than-an-emperor/) Ksepka & Clarke. Bull. AMNH, 337(1):1-77. 2010. Artistic reconstructions by: Stephanie Abramowicz for Scientific American Fordyce, R.E. and D.T. Ksepka. The Strangest Bird Scientific American 307, 56 – 61 (2012)
  • 91. P D Miocene Pli. Ple. (By Peppermint Narwhal Creative, http://www.peppermintnarwhal.com)
  • 92. P D Late Cretaceous Paleocene Eocene Oligocene Miocene Pli.Ple. (silhouette images from http://phylopic.org)
  • 93. F P D Miocene Pli. Ple. (S. urbinai holotype fossil, 5-7 MYA, image by Martin Chávez)
  • 94. P O Kairuku • ~1.5 m tall • slender, with narrow bill • scapula & pygostyle are more similar to non-penguins • ~27 Mya © Geology Museum of Otago (Chris Gaskin) Image courtesy of Daniel Ksepka (Ksepka, Fordyce, Ando, & Jones, J. Vert. Paleo. 2012)
  • 95. P P Waimanu • oldest known penguin species • intermediate wing morphology • ~58–61.6 Mya (Slack et al., Mol. Biol. Evol. 2006)
  • 96. P D D T Paleocene Eocene Oligocene Miocene Pli. Ple. Artistic reconstructions by: Stephanie Abramowicz for Scientific American Fordyce, R.E. and D.T. Ksepka. The Strangest Bird Scientific American 307, 56 – 61 (2012)
  • 97. P D D T Penguin images used with permission from the artists (for Gavryushkina et al. 2016): Stephanie Abramowicz and Barbara Harmon. [Figure: dated penguin phylogeny on an Age (Ma) axis from 60 to 0, spanning the Paleocene to the Quaternary, with extant genera Megadyptes, Aptenodytes, Spheniscus, Eudyptes, Pygoscelis, and Eudyptula and fossil taxa Waimanu manneringi, Perudyptes devriesi, Icadyptes salasi, Kairuku waitaki, and Paraptenodytes antarcticus; nodes with > 0.75 posterior probability are marked.] Data: 19 extant species, 36 fossil species, 202 morphological characters, 8,145 DNA sites (Gavryushkina, Heath, Ksepka, Welch, Stadler, Drummond. Syst. Biol., in press. http://arxiv.org/abs/1506.04797)
  • 98. P D D T Penguin images used with permission from the artists (for Gavryushkina et al. 2016): Stephanie Abramowicz and Barbara Harmon. [Figure: dated penguin phylogeny on an Age (Ma) axis from 60 to 0 with the MRCA of extant penguins marked, and a histogram of the posterior density of the MRCA age under the FBD model (Gavryushkina et al. 2016) compared with MRCA age estimates by previous studies (Baker et al. 2006; Subramanian et al. 2013); x-axis: MRCA Age of Extant Penguins (Ma), y-axis: Frequency.] (Gavryushkina, Heath, Ksepka, Welch, Stadler, Drummond. Syst. Biol., in press. http://arxiv.org/abs/1506.04797)
  • 99. M + M + F [Figure: example morphological and DNA character matrices with missing entries (?) shown against geological time.] (based on slides by M. Landis)
  • 100. ...but I study amphibians... [Figure: character matrices in which the fossil rows are entirely missing (?).] (slides courtesy of M. Landis; http://bit.ly/2alHqB4)
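  • 101. Molecules + biogeography + paleogeography [Figure: tree with tips CCGT (area B), CCGA (area A), and CTCA (area A), drawn over geological time above the changing configuration of areas A and B.] + Paleogeography (Landis, 2016) (slides courtesy of M. Landis; http://bit.ly/2alHqB4)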
  • 102. Events should occur before areas split [Figure: an A→B dispersal event on a tree with tips CCGT (area B), CCGA (area A), and CTCA (area A); geological time is divided into intervals where the event is rare and where it is common.] Less probable observation given tree. + Paleogeography (Landis, 2016) (slides courtesy of M. Landis; http://bit.ly/2alHqB4)
  • 103. Events should occur before areas split [Figure: an A→B dispersal event on a tree with tips CCGT (area B), CCGA (area A), and CTCA (area A); geological time is divided into intervals where the event is rare and where it is common.] More probable observation given tree. + Paleogeography (Landis, 2016) (slides courtesy of M. Landis; http://bit.ly/2alHqB4)
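  • 104. Events should occur after areas merge [Figure: an A→B dispersal event on a tree with tips CCGT (area B), CCGA (area B), CTCA (area A), and TTCC (area A); geological time is divided into intervals where the event is common and where it is rare.] Less probable observation given tree. (slides courtesy of M. Landis; http://bit.ly/2alHqB4)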
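  • 105. Events should occur after areas merge [Figure: an A→B dispersal event on a tree with tips CCGT (area B), CCGA (area B), CTCA (area A), and TTCC (area A); geological time is divided into intervals where the event is common and where it is rare.] More probable observation given tree. (slides courtesy of M. Landis; http://bit.ly/2alHqB4)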
  • 106. B D Fossil-free calibration • data: molecular sequences • data: biogeographic ranges • empirical paleogeographic model that alters the rates of biogeographic change over time Landis. In Press. “Biogeographic Dating of Speciation Times Using Paleogeographically Informed Processes”. Systematic Biology, doi: 10.1093/sysbio/syw040. (image by M. Landis; http://bit.ly/2alHqB4)
  • 107. B D 25 areas, 26 time-slices, 540–0 Ma strong=share land, weak=nearby land, none=all pairs Connectivity model constructed using literature review and GPlates (http://www.gplates.org/) (model & animation courtesy of M. Landis; http://bit.ly/2alHqB4)
  • 108. D + A A R (image by M. Landis; http://bit.ly/2alHqB4)
  • 109. S B -D P A piecewise shifting model where parameters change over time Used to estimate epidemiological parameters of an outbreak [Time axis: 175–0 Days] (see Stadler et al. PNAS 2013 and Stadler et al. PLoS Currents Outbreaks 2014)
  • 110. S B -D P l is the number of parameter intervals; R_i is the effective reproductive number for interval i ∈ l; δ is the rate of becoming non-infectious; s is the probability of sampling an individual after becoming non-infectious
  • 111. S B -D P l is the number of parameter intervals; λ_i is the transmission rate for interval i ∈ l; µ is the viral lineage death rate; ψ is the rate each individual is sampled
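The two parameterizations of the skyline model are linked by simple algebra, and the piecewise structure amounts to a lookup by time interval. The sketch below assumes the standard relations R = λ/(µ + ψ), δ = µ + ψ, and s = ψ/(µ + ψ); the function names and the example interval boundaries are invented for illustration.

```python
import bisect


def epi_to_rates(R, delta, s):
    """Convert (reproductive number, becoming-non-infectious rate,
    sampling probability) to (transmission, death, sampling) rates."""
    lam = R * delta
    psi = s * delta
    mu = delta - psi
    return lam, mu, psi


def rates_to_epi(lam, mu, psi):
    """Inverse map from rates to the epidemiological parameters."""
    delta = mu + psi
    return lam / delta, delta, psi / delta


def piecewise_value(values, change_times, t):
    """Value of a piecewise-constant parameter at time t.

    change_times must be sorted ascending; values has one more entry than
    change_times (one value per interval).
    """
    return values[bisect.bisect_right(change_times, t)]


# Hypothetical example: R drops below 1 after a change point at t = 50.
print(piecewise_value([1.8, 0.9], [50.0], t=75.0))
```

Values of R above 1 correspond to a growing epidemic and values below 1 to a declining one, which is why R is the quantity of interest in each interval.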
  • 112. S B -D P
  • 113. S B -D P A decline in R over the history of HIV-1 in the UK is consistent with the introduction of effective drug therapies After 1998 R decreased below 1, indicating a declining epidemic (Stadler et al. PNAS 2013)
  • 114. E BEAST2 This tutorial uses sequence data and fossil occurrence times to date species divergences using a relaxed-clock model and the fossilized birth-death process. Taxa (X marks extinct taxa): Agriarctos spp. (X), Ailurarctos lufengensis (X), Ailuropoda melanoleuca, Arctodus simus (X), Ballusia elmensis (X), Helarctos malayanus, Indarctos arctoides (X), Indarctos punjabiensis (X), Indarctos vireti (X), Kretzoiarctos beatrix (X), Melursus ursinus, Parictis montanus (X), Tremarctos ornatus, Ursavus brevirhinus (X), Ursavus primaevus (X), Ursus abstrusus (X), Ursus americanus, Ursus arctos, Ursus maritimus, Ursus spelaeus (X), Ursus thibetanus, Zaragocyon daamsi (X). Clade labels: Stem Bears, Crown Bears, Pandas, Tremarctinae, Brown Bears, Ursinae. Node labels: Origin, Root (Total Group), Crown