Generalizing phylogenetics to infer patterns predicted by processes of diversification

Jamie Oaks
Jamie OaksAssistant Professor at Auburn University
Generalizing phylogenetics to
infer patterns predicted by
processes of diversification
Jamie Oaks
Auburn University
phyletica.org
@jamoaks
Scan for slides:
www.slideshare.net/jamieoaks7 © 2007 Boris Kulikov boris-kulikov.blogspot.com
© 2007 Boris Kulikov boris-kulikov.blogspot.com
© 2007 Boris Kulikov boris-kulikov.blogspot.com
© 2007 Boris Kulikov boris-kulikov.blogspot.com
Generalizing phylogenetics to
infer patterns of shared
evolutionary events
Phyletica Lab
Postdocs
▶ Perry Wood, Jr
▶ Brian Folt
▶ Jesse Grismer
Graduate students
▶ Tashitso Anamza
▶ Matt Buehler
▶ Kerry Cobb
▶ Kyle David
▶ Saman Jahangiri
▶ Randy Klabacka
▶ Morgan Muell
▶ Tanner Myers
▶ Claire Tracy
▶ Breanna Sipley
▶ Aundrea Westfall
The Phyleticians
Undergraduate students
▶ Laura Lewis
▶ Mary Wells
▶ Hailey Whitaker
▶ Noah Yawn
▶ Charlotte Benedict
▶ Eric Carbo
▶ Ryan Cook
▶ Andrew DeSana
▶ Miles Horne
▶ Jacob Landrum
▶ Nadia L’Bahy
▶ Jorge Lopez-Perez
▶ Holden Smith
▶ Virginia White
▶ Kayla Wilson
▶ Phylogenetics is rapidly becoming the
statistical foundation of biology
© 2007 Boris Kulikov boris-kulikov.blogspot.com
▶ Phylogenetics is rapidly becoming the
statistical foundation of biology
▶ “Big data” present exciting possibilities and
challenges
© 2007 Boris Kulikov boris-kulikov.blogspot.com
▶ Phylogenetics is rapidly becoming the
statistical foundation of biology
▶ “Big data” present exciting possibilities and
challenges
▶ Many opportunities to develop new ways to
study biology in light of phylogeny
© 2007 Boris Kulikov boris-kulikov.blogspot.com
phylopic.org CC0
▶ Assumption: All processes of
diversification affect each lineage
independently
phylopic.org CC0
Generalizing phylogenetics to infer patterns predicted by processes of diversification
Generalizing phylogenetics to infer patterns predicted by processes of diversification
Biogeography
▶ Environmental changes that affect whole
communities of species
Biogeography
▶ Environmental changes that affect whole
communities of species
Genome evolution
▶ Duplication of a chromosome segment
harboring gene families
Biogeography
▶ Environmental changes that affect whole
communities of species
Genome evolution
▶ Duplication of a chromosome segment
harboring gene families
Epidemiology
▶ Transmission at social gatherings
Biogeography
▶ Environmental changes that affect whole
communities of species
Genome evolution
▶ Duplication of a chromosome segment
harboring gene families
Epidemiology
▶ Transmission at social gatherings
Endosymbiont evolution (e.g., parasites,
microbiome)
▶ Speciation of the host
▶ Co-colonization of new host species
Why account for shared divergences?
J. R. Oaks et al. (2022). PNAS 119: e2121036119
Why account for shared divergences?
1. Improve inference
True history
τ1
τ2
τ3
Current tree model
τ1 τ2 τ3τ4 τ5 τ6 τ7 τ8
J. R. Oaks et al. (2022). PNAS 119: e2121036119
Why account for shared divergences?
1. Improve inference
2. Provide a framework for
studying processes of
co-diversification
True history
τ1
τ2
τ3
Current tree model
τ1 τ2 τ3τ4 τ5 τ6 τ7 τ8
J. R. Oaks et al. (2022). PNAS 119: e2121036119
Biogeography
▶ Environmental changes that affect whole
communities of species
Genome evolution
▶ Duplication of a chromosome segment
harboring gene families
Epidemiology
▶ Transmission at social gatherings
Endosymbiont evolution (e.g., parasites,
microbiome)
▶ Speciation of the host
▶ Co-colonization of new host species
Approaches to the problem
A pairwise approach (keep it “simple”)
A fully phylogenetic approach
Tashitso Anamza Tanner Myers
Randy Klabacka Perry Wood, Jr.
Claire Tracy Kerry Cobb
Matt Buehler Nadia L’bahy
Approaches to the problem
A pairwise approach (keep it “simple”)
A fully phylogenetic approach
Dr. Perry Wood, Jr.
Challenges to accounting for shared divergences
True history
τ1
τ2
τ3
Current tree model
τ1 τ2 τ3τ4 τ5 τ6 τ7 τ8
J. R. Oaks et al. (2022). PNAS 119: e2121036119
Challenges to accounting for shared divergences
1. Likelihood for genomic data is
tricky
True history
τ1
τ2
τ3
Current tree model
τ1 τ2 τ3τ4 τ5 τ6 τ7 τ8
J. R. Oaks et al. (2022). PNAS 119: e2121036119
Challenges to accounting for shared divergences
1. Likelihood for genomic data is
tricky
2. Lots of possible trees of
different dimensions
True history
τ1
τ2
τ3
Current tree model
τ1 τ2 τ3τ4 τ5 τ6 τ7 τ8
J. R. Oaks et al. (2022). PNAS 119: e2121036119
Generalizing phylogenetics to infer patterns predicted by processes of diversification
A
A
G
G
G
A
A
A
G
G
G
A
A
A
G
G
G
A
▶ Conditional on “population tree” (T), model “gene trees” (G) using coalescent
A
A
G
G
G
A
▶ Conditional on “population tree” (T), model “gene trees” (G) using coalescent
▶ Coalescent is a stochastic model of shared inheritance (continuous-time Markov chain =
CTMC)
A
A
G
G
G
A
▶ Conditional on “population tree” (T), model “gene trees” (G) using coalescent
▶ Coalescent is a stochastic model of shared inheritance (continuous-time Markov chain =
CTMC)
▶ Gene tree branching patterns are a function of population size
v
u
G
A
A
A
G
G
G
A
▶ Conditional on “population tree” (T), model “gene trees” (G) using coalescent
▶ Coalescent is a stochastic model of shared inheritance (continuous-time Markov chain =
CTMC)
▶ Gene tree branching patterns are a function of population size
▶ Conditional on G, model mutation as a CTMC
v
u
G
A
A
A
G
G
G
A
▶ Conditional on “population tree” (T), model “gene trees” (G) using coalescent
▶ Coalescent is a stochastic model of shared inheritance (continuous-time Markov chain =
CTMC)
▶ Gene tree branching patterns are a function of population size
▶ Conditional on G, model mutation as a CTMC
▶ Genetic characters provide information about G
v
u
G
A
A
A
G
G
G
A
▶ Conditional on “population tree” (T), model “gene trees” (G) using coalescent
▶ Coalescent is a stochastic model of shared inheritance (continuous-time Markov chain =
CTMC)
▶ Gene tree branching patterns are a function of population size
▶ Conditional on G, model mutation as a CTMC
▶ Genetic characters provide information about G
▶ G informs T (population sizes, divergence times, and relationships)
v
u
G
A
A
A
G
G
G
A
v
u
G
A
A
A
G
G
G
A
▶ “Standard” hierarchical approach
v
u
G
A
A
A
G
G
G
A
▶ “Standard” hierarchical approach
▶ Calculate p(genetic data | G) × p(G | T)
v
u
G
A
A
A
G
G
G
A
▶ “Standard” hierarchical approach
▶ Calculate p(genetic data | G) × p(G | T)
▶ Use numerical integration (MCMC) to co-estimate both
v
u
G
A
A
A
G
G
G
A
▶ “Standard” hierarchical approach
▶ Calculate p(genetic data | G) × p(G | T)
▶ Use numerical integration (MCMC) to co-estimate both
▶ But, G and T are highly correlated
v
u
G
A
A
A
G
G
G
A
▶ “Standard” hierarchical approach
▶ Calculate p(genetic data | G) × p(G | T)
▶ Use numerical integration (MCMC) to co-estimate both
▶ But, G and T are highly correlated
▶ As the number of loci (gene trees) increases, MCMC falls apart
v
u
G
A
A
A
G
G
G
A
▶ “Standard” hierarchical approach
▶ Calculate p(genetic data | G) × p(G | T)
▶ Use numerical integration (MCMC) to co-estimate both
▶ But, G and T are highly correlated
▶ As the number of loci (gene trees) increases, MCMC falls apart
▶ Can we integrate G analytically?
v
u
G
A
A
A
G
G
G
A
(n, r ) = (3, 1)
(n, r ) = (3, 2)
(n, r ) = (1, 0)
(n, r ) = (1, 0)
(n, r ) = (2, 0)
1
T. Schmelzer and L. N. Trefethen (2007). Electronic Transactions on Numerical Analysis 29: 1–18
2
D. Bryant et al. (2012). Molecular Biology and Evolution 29: 1917–1932
v
u
G
A
A
A
G
G
G
A
(n, r ) = (3, 1)
(n, r ) = (3, 2)
(n, r ) = (1, 0)
(n, r ) = (1, 0)
(n, r ) = (2, 0)
1
T. Schmelzer and L. N. Trefethen (2007). Electronic Transactions on Numerical Analysis 29: 1–18
2
D. Bryant et al. (2012). Molecular Biology and Evolution 29: 1917–1932
v
u
G
A
A
A
G
G
G
A
(n, r ) = (3, 1)
(n, r ) = (3, 2)
(n, r ) = (1, 0)
(n, r ) = (1, 0)
(n, r ) = (2, 0)
Q =
(1, 0) (1, 1) (2, 0) (2, 1) · · · (n, n)
(1, 0) · · · · ·
(1, 1) · · · · ·
(2, 0) · · · · ·
(2, 1) · · · · ·
.
.
.
(n, n) · · · · ·
1
T. Schmelzer and L. N. Trefethen (2007). Electronic Transactions on Numerical Analysis 29: 1–18
2
D. Bryant et al. (2012). Molecular Biology and Evolution 29: 1917–1932
v
u
G
A
A
A
G
G
G
A
(n, r ) = (3, 1)
(n, r ) = (3, 2)
(n, r ) = (1, 0)
(n, r ) = (1, 0)
(n, r ) = (2, 0)
Q =
(1, 0) (1, 1) (2, 0) (2, 1) · · · (n, n)
(1, 0) · · · · ·
(1, 1) · · · · ·
(2, 0) · · · · ·
(2, 1) · · · · ·
.
.
.
(n, n) · · · · ·
Q(n,r);(n,r−1) = (n − r + 1)v, mutation,
Q(n,r);(n,r+1) = (r + 1)u, mutation,
Q(n,r);(n−1,r) =
(n−1−r)n
2Ne (u+v)
, coalescence,
Q(n,r);(n−1,r−1) =
(r−1)n
2Ne (u+v)
, coalescence,
Q(n,r);(n,r) = −
(n−1)n
2Ne (u+v)
− (n − r)v − ru.
1
T. Schmelzer and L. N. Trefethen (2007). Electronic Transactions on Numerical Analysis 29: 1–18
2
D. Bryant et al. (2012). Molecular Biology and Evolution 29: 1917–1932
v
u
G
A
A
A
G
G
G
A
(n, r ) = (3, 1)
(n, r ) = (3, 2)
(n, r ) = (1, 0)
(n, r ) = (1, 0)
(n, r ) = (2, 0)
Q =
(1, 0) (1, 1) (2, 0) (2, 1) · · · (n, n)
(1, 0) · · · · ·
(1, 1) · · · · ·
(2, 0) · · · · ·
(2, 1) · · · · ·
.
.
.
(n, n) · · · · ·
Q(n,r);(n,r−1) = (n − r + 1)v, mutation,
Q(n,r);(n,r+1) = (r + 1)u, mutation,
Q(n,r);(n−1,r) =
(n−1−r)n
2Ne (u+v)
, coalescence,
Q(n,r);(n−1,r−1) =
(r−1)n
2Ne (u+v)
, coalescence,
Q(n,r);(n,r) = −
(n−1)n
2Ne (u+v)
− (n − r)v − ru.
▶ eQt
to keep track of all conditional probabilites along each branch (Carathéodory-Fejér method1
)
1
T. Schmelzer and L. N. Trefethen (2007). Electronic Transactions on Numerical Analysis 29: 1–18
2
D. Bryant et al. (2012). Molecular Biology and Evolution 29: 1917–1932
v
u
G
A
A
A
G
G
G
A
(n, r ) = (3, 1)
(n, r ) = (3, 2)
(n, r ) = (1, 0)
(n, r ) = (1, 0)
(n, r ) = (2, 0)
Q =
(1, 0) (1, 1) (2, 0) (2, 1) · · · (n, n)
(1, 0) · · · · ·
(1, 1) · · · · ·
(2, 0) · · · · ·
(2, 1) · · · · ·
.
.
.
(n, n) · · · · ·
Q(n,r);(n,r−1) = (n − r + 1)v, mutation,
Q(n,r);(n,r+1) = (r + 1)u, mutation,
Q(n,r);(n−1,r) =
(n−1−r)n
2Ne (u+v)
, coalescence,
Q(n,r);(n−1,r−1) =
(r−1)n
2Ne (u+v)
, coalescence,
Q(n,r);(n,r) = −
(n−1)n
2Ne (u+v)
− (n − r)v − ru.
▶ eQt
to keep track of all conditional probabilites along each branch (Carathéodory-Fejér method1
)
▶ At root, get likelihood of population tree integrated over all possible gene trees and mutational
histories2
1
T. Schmelzer and L. N. Trefethen (2007). Electronic Transactions on Numerical Analysis 29: 1–18
2
D. Bryant et al. (2012). Molecular Biology and Evolution 29: 1917–1932
Challenges to accounting for shared divergences
1. Likelihood for genomic data is
tricky
2. Lots of possible trees of
different dimensions
True history
τ1
τ2
τ3
Current tree model
τ1 τ2 τ3τ4 τ5 τ6 τ7 τ8
J. R. Oaks et al. (2022). PNAS 119: e2121036119
Generalizing tree space
Generalizing tree space
nτ = 3
A
B
C
D
A
C
B
D
A
D
B
C
A
B
C
D
A
B
D
C
A
C
B
D
A
C
D
B
A
D
B
C
A
D
C
B
B
C
A
D
B
C
D
A
B
D
A
C
B
D
C
A
C
D
A
B
C
D
B
A
Generalizing tree space
nτ = 3 nτ = 2 nτ = 1
A
B
C
D
A
C
B
D
A
D
B
C
A
B
C
D
A
C
B
D
A
D
B
C
A
B
C
D
A
B
D
C
A
C
B
D
A
B
C
D
A
C
B
D
A
D
B
C
A
C
D
B
A
D
B
C
A
D
C
B
B
C
A
D
B
D
A
C
C
D
A
B
A
B
C
D
B
C
A
D
B
C
D
A
B
D
A
C
A
B
C
D
A
B
D
C
A
C
D
B
B
D
C
A
C
D
A
B
C
D
B
A
B
C
D
A
Generalizing tree space
nτ = 3 nτ = 2 nτ = 1
A
B
C
D
A
C
B
D
A
D
B
C
A
B
C
D
A
C
B
D
A
D
B
C
A
B
C
D
A
B
D
C
A
C
B
D
A
B
C
D
A
C
B
D
A
D
B
C
A
C
D
B
A
D
B
C
A
D
C
B
B
C
A
D
B
D
A
C
C
D
A
B
A
B
C
D
B
C
A
D
B
C
D
A
B
D
A
C
A
B
C
D
A
B
D
C
A
C
D
B
B
D
C
A
C
D
A
B
C
D
B
A
B
C
D
A
Generalized tree distribution
▶ All topologies equally probable
▶ Parametric distribution on age
of root
▶ Beta distributions on other
divergence times
τ0
τ1 ∼ Beta(τ0, τ3, α, β)
*
τ2 ∼ Beta(τ0, τ3, α, β)
*
τ3 ∼ Beta(τ0, τ4, α, β)
τ4 ∼ Gamma(k, θ)
*
I
H
G
F
E
D
C
B
A
t6
t3
t1
t2
t4
t5
J. R. Oaks et al. (2022). PNAS 119: e2121036119
Inferring trees with shared divergences
Split
Merge
Reversible-jump MCMC
J. R. Oaks et al. (2022). PNAS 119: e2121036119
Validating rjMCMC with 7-leaf tree
350 400 450
0.000
0.005
0.010
0.015
0.020
Number of times each topology sampled
Density
Binom(5e+07, 1/127,569)
MCMC
χ2
= 128,046
p = 0.172
The rjMCMC algorithms sample the expected generalized tree distribution
J. R. Oaks et al. (2022). PNAS 119: e2121036119
Phylogenetic coevality
Phyco val
J. R. Oaks et al. (2022). PNAS 119: e2121036119
Ecoevolity
Estimating evolutionary coevality
J. R. Oaks (2019). Systematic Biology 68: 371–395
▶ Tree model
▶ rjMCMC sampling of generalized tree distribution
1
D. Bryant et al. (2012). Molecular Biology and Evolution 29: 1917–1932
Phylogenetic coevality
Phyco val
J. R. Oaks et al. (2022). PNAS 119: e2121036119
Ecoevolity
Estimating evolutionary coevality
J. R. Oaks (2019). Systematic Biology 68: 371–395
▶ Tree model
▶ rjMCMC sampling of generalized tree distribution
▶ Likelihood model
▶ CTMC model of characters evolving along genealogies
▶ Infer species trees by analytically integrate over genealogies1
1
D. Bryant et al. (2012). Molecular Biology and Evolution 29: 1917–1932
Phylogenetic coevality
Phyco val
J. R. Oaks et al. (2022). PNAS 119: e2121036119
Ecoevolity
Estimating evolutionary coevality
J. R. Oaks (2019). Systematic Biology 68: 371–395
▶ Tree model
▶ rjMCMC sampling of generalized tree distribution
▶ Likelihood model
▶ CTMC model of characters evolving along genealogies
▶ Infer species trees by analytically integrate over genealogies1
▶ Goal: Co-estimation of phylogeny and shared divergences from genomic data
1
D. Bryant et al. (2012). Molecular Biology and Evolution 29: 1917–1932
Does it work?
Methods: Simulations
▶ Simulated 100 data sets with 50,000 base
pairs
τ1
0.05
τ2
0.08
τ3
0.2
τ1
0.03
τ2
0.05
τ3
0.06
τ4
0.11
τ5
0.12
τ6
0.14
τ7
0.18
τ8
0.2
Methods: Simulations
▶ Simulated 100 data sets with 50,000 base
pairs
▶ Analyzed each data set with:
▶ MG = Generalized tree model
▶ MIB = Independent-bifurcating tree model
τ1
0.05
τ2
0.08
τ3
0.2
τ1
0.03
τ2
0.05
τ3
0.06
τ4
0.11
τ5
0.12
τ6
0.14
τ7
0.18
τ8
0.2
Methods: Simulations
▶ Simulated 100 data sets with 50,000 base
pairs
▶ Analyzed each data set with:
▶ MG = Generalized tree model
▶ MIB = Independent-bifurcating tree model
▶ Simulated 100 data sets where topology and
div times randomly drawn from MG and MIB
τ1
0.05
τ2
0.08
τ3
0.2
τ1
0.03
τ2
0.05
τ3
0.06
τ4
0.11
τ5
0.12
τ6
0.14
τ7
0.18
τ8
0.2
0.05
0.08
0.2
J. R. Oaks et al. (2022). PNAS 119: e2121036119
0.0 0.5 1.0
0.05
0.0 0.5 1.0
0.08
0.2
0.0 0.5 1.0
0.0 0.5 1.0
0.0 0.5 1.0
0.0 0.5 1.0
J. R. Oaks et al. (2022). PNAS 119: e2121036119
τ1
0.05
τ2
0.08
τ3
0.2
t5
t2
t1
t3
t4
J. R. Oaks et al. (2022). PNAS 119: e2121036119
MG = Generalized model MIB = Independent-bifurcating model
τ1
0.05
τ2
0.08
τ3
0.2
t5
t2
t1
t3
t4
Splitting t4
0.00
0.25
0.50
0.75
1.00
Posterior
probability
J. R. Oaks et al. (2022). PNAS 119: e2121036119
MG = Generalized model MIB = Independent-bifurcating model
τ1
0.05
τ2
0.08
τ3
0.2
t5
t2
t1
t3
t4
0.000
0.002
0.004
0.006
0.008
0.010
0.012
p = 4.1 × 10−18
MG MIB
Distance
from
true
tree
J. R. Oaks et al. (2022). PNAS 119: e2121036119
MG = Generalized model MIB = Independent-bifurcating model
τ1
0.05
τ2
0.08
τ3
0.2
t5
t2
t1
t3
t4
0.000
0.002
0.004
0.006
0.008
0.010
0.012
p = 4.1 × 10−18
MG MIB
Distance
from
true
tree
MG significantly better at inferring trees with shared divergences
J. R. Oaks et al. (2022). PNAS 119: e2121036119
MG = Generalized model MIB = Independent-bifurcating model
τ1
0.03
τ2
0.05
τ3
0.06
τ4
0.11
τ5
0.12
τ6
0.14
τ7
0.18
τ8
0.2
t8
t2
t1
t7
t4
t3
t6
t5
J. R. Oaks et al. (2022). PNAS 119: e2121036119
MG = Generalized model MIB = Independent-bifurcating model
τ1
0.03
τ2
0.05
τ3
0.06
τ4
0.11
τ5
0.12
τ6
0.14
τ7
0.18
τ8
0.2
t8
t2
t1
t7
t4
t3
t6
t5
True topology
0.00
0.25
0.50
0.75
1.00
Posterior
probability
J. R. Oaks et al. (2022). PNAS 119: e2121036119
MG = Generalized model MIB = Independent-bifurcating model
τ1
0.03
τ2
0.05
τ3
0.06
τ4
0.11
τ5
0.12
τ6
0.14
τ7
0.18
τ8
0.2
t8
t2
t1
t7
t4
t3
t6
t5
0.0000
0.0025
0.0050
0.0075
0.0100
0.0125
p = 0.36
MG MIB
Distance
from
true
tree
J. R. Oaks et al. (2022). PNAS 119: e2121036119
MG = Generalized model MIB = Independent-bifurcating model
τ1
0.03
τ2
0.05
τ3
0.06
τ4
0.11
τ5
0.12
τ6
0.14
τ7
0.18
τ8
0.2
t8
t2
t1
t7
t4
t3
t6
t5
0.0000
0.0025
0.0050
0.0075
0.0100
0.0125
p = 0.36
MG MIB
Distance
from
true
tree
MG performs as well as true model when divergences are independent
J. R. Oaks et al. (2022). PNAS 119: e2121036119
Results: random MG trees
0.5 1.0 1.5 2.0 2.5
0.5
1.0
1.5
2.0
2.5 p(TL ∈ CI) = 0.96
RMSE = 0.00486
True tree length
Estimated
tree
length
J. R. Oaks et al. (2022). PNAS 119: e2121036119
Results: random MG trees
0.5 1.0 1.5 2.0 2.5
0.5
1.0
1.5
2.0
2.5 p(TL ∈ CI) = 0.96
RMSE = 0.00486
True tree length
Estimated
tree
length
Shared divergence
0.00
0.25
0.50
0.75
1.00
Posterior
probability J. R. Oaks et al. (2022). PNAS 119: e2121036119
Results: random MG trees
0.5 1.0 1.5 2.0 2.5
0.5
1.0
1.5
2.0
2.5 p(TL ∈ CI) = 0.96
RMSE = 0.00486
True tree length
Estimated
tree
length
Shared divergence
0.00
0.25
0.50
0.75
1.00
Posterior
probability
MG performs well with data simulated on random trees with shared divergences
J. R. Oaks et al. (2022). PNAS 119: e2121036119
Results: random MIB trees
Probability of incorrectly merged divergence times (true model = MIB)
Merged times
0.00
0.25
0.50
0.75
1.00
FPR = 0.047
A
Posterior
probability
0.00 0.05 0.10 0.15 0.20
0.00
0.25
0.50
0.75
1.00
B
Time difference
Posterior
probability 0.0 0.1 0.2 0.3
0.00
0.25
0.50
0.75
1.00
C
Midpoint divergence time
Posterior
probability
J. R. Oaks et al. (2022). PNAS 119: e2121036119
Results: random MIB trees
Probability of incorrectly merged divergence times (true model = MIB)
Merged times
0.00
0.25
0.50
0.75
1.00
FPR = 0.047
A
Posterior
probability
0.00 0.05 0.10 0.15 0.20
0.00
0.25
0.50
0.75
1.00
B
Time difference
Posterior
probability 0.0 0.1 0.2 0.3
0.00
0.25
0.50
0.75
1.00
C
Midpoint divergence time
Posterior
probability
MG has low false positive rate
J. R. Oaks et al. (2022). PNAS 119: e2121036119
MG = Generalized model MIB = Independent-bifurcating model
0.000
0.002
0.004
0.006
0.008
0.010
0.012
Ave
std
dev
of
split
freq
(ASDSF)
Generalizing tree space improves MCMC convergence and mixing
J. R. Oaks et al. (2022). PNAS 119: e2121036119
Scan for sea-level animation
Scan for sea-level animation
Scan for sea-level animation
Did fragmentation of islands
promote diversification?
Cyrtodactylus Gekko
©Rafe M. Brown
10
15
20
118 120 122 124 126
Longitude
Latitude
10
15
20
118 120 122 124 126
Longitude
Latitude
©Rafe M. Brown
J. R. Oaks et al. (2022). PNAS 119: e2121036119
Cyrtodactylus Gekko
©Rafe M. Brown
10
15
20
118 120 122 124 126
Longitude
Latitude
10
15
20
118 120 122 124 126
Longitude
Latitude
©Rafe M. Brown
1702 loci
155,887 sites
1033 loci
94,813 sites
J. R. Oaks et al. (2022). PNAS 119: e2121036119
Gekko
5°N
10°N
15°N
20°N
116°E 118°E 120°E 122°E 124°E 126°E
Longitude
Latitude
D
116°E 118°E 120°E 122°E 124°E 126°E
Longitude
0.28
0.55
0.19
0.35
0.81
1.0
n
c
i
l
P s
l
P
30 20 10 0
1
2
14
15
16
17
19
18
20
22
21
23
24
26
25
4
3
5
7
6
13
12
11
10
9
8
0.87
0.86
n
c
i
l
P s
l
P
30 20 10 0
C
G. athymus
G. carusadensis
G. crombota
G. ernstkelleri
G. gigante
G. mindorensis
G. monarchus
G. porosus
G. romblon
G. rossi
G. sp. a
G. sp. b
1
5
11
2
7
6
22
20
16
19
18
15
23
17
21
26
25
24
14
13
12
4
3
10
9
8
Oligocene Miocene
Oligocene Miocene
J. R. Oaks et al. (2022). PNAS 119: e2121036119
Cyrtodactylus
D
5°N
10°N
15°N
20°N
116°E 118°E 120°E 122°E 124°E 126°E
Longitude
Latitude
B
1.0
1
10
24
23
25
27
26
18
19
22
21
20
14
13
15
17
16
12
11
7
9
8
2
3
4
6
5
0.71
0.75
n
c
i
l
P s
l
P
30 20 10 0
A
C
C. annulatus
C. gubaot
C. jambangan
C. mamanwa
C. philippinicus
C. redimiculus
C. sumuroi
C. tautbatorum
0.30
0.45
0.90
0.89
0.57
10
13
17
6
4
5
9
3
7
24
27
18
19
22
23
14
26
21
25
15
16
20
12
11
1
8
2
Oligocene Miocene
J. R. Oaks et al. (2022). PNAS 119: e2121036119
Take-home points
▶ We can accurately infer phylogenies with shared divergences
Take-home points
▶ We can accurately infer phylogenies with shared divergences
▶ Generalizing tree space avoids spurious support incorrect relationships and improves
MCMC mixing
Take-home points
▶ We can accurately infer phylogenies with shared divergences
▶ Generalizing tree space avoids spurious support incorrect relationships and improves
MCMC mixing
▶ Among Philippine gekkonids, we found support for shared divergences predicted by
sea-level changes
Open science: everything is available. . .
Software:
▶ Phycoeval: github.com/phyletica/ecoevolity
(release coming soon)
Open-Science Notebooks:
▶ Phycoeval analyses: github.com/phyletica/phycoeval-experiments
▶ Gecko RADseq: github.com/phyletica/gekgo
Moving forward: Theory/methods
Moving forward: Theory/methods
▶ Develop process-based and
trait-dependent distributions over the
space of generalized trees
▶ “Birth-death-burst” model
Moving forward: Theory/methods
▶ Develop process-based and
trait-dependent distributions over the
space of generalized trees
▶ “Birth-death-burst” model
▶ Extend generalized tree distribution to
trees that are not ultrametric
Moving forward: Theory/methods
▶ Develop process-based and
trait-dependent distributions over the
space of generalized trees
▶ “Birth-death-burst” model
▶ Extend generalized tree distribution to
trees that are not ultrametric
nextstrain.org J. Hadfield et al. (2018). Bioinformatics 34: 4121–4123
Moving forward: Theory/methods
▶ Develop process-based and
trait-dependent distributions over the
space of generalized trees
▶ “Birth-death-burst” model
▶ Extend generalized tree distribution to
trees that are not ultrametric
▶ Couple generalized tree distribution with
other phylogenetic likelihood models
nextstrain.org J. Hadfield et al. (2018). Bioinformatics 34: 4121–4123
Moving forward: Applications
nextstrain.org J. Hadfield et al. (2018). Bioinformatics 34: 4121–4123
Moving forward: Applications
Epidemiological dynamics of “super-spreading”
events during the COVID-19 pandemic
nextstrain.org J. Hadfield et al. (2018). Bioinformatics 34: 4121–4123
Moving forward: Applications
Epidemiological dynamics of “super-spreading”
events during the COVID-19 pandemic
▶ Spread at social gatherings creates shared
and multifurcating divergences in the viral
“transmission tree”
nextstrain.org J. Hadfield et al. (2018). Bioinformatics 34: 4121–4123
Moving forward: Applications
Epidemiological dynamics of “super-spreading”
events during the COVID-19 pandemic
▶ Spread at social gatherings creates shared
and multifurcating divergences in the viral
“transmission tree”
▶ Estimate rate of shared divergences as
proxy for spread via social gatherings
nextstrain.org J. Hadfield et al. (2018). Bioinformatics 34: 4121–4123
Moving forward: Applications
Epidemiological dynamics of “super-spreading”
events during the COVID-19 pandemic
▶ Spread at social gatherings creates shared
and multifurcating divergences in the viral
“transmission tree”
▶ Estimate rate of shared divergences as
proxy for spread via social gatherings
▶ Test if this varies over time, among
regions, and among variants of
SARS-CoV-2
nextstrain.org J. Hadfield et al. (2018). Bioinformatics 34: 4121–4123
© 2007 Boris Kulikov boris-kulikov.blogspot.com
Generalizing phylogenetics to infer patterns predicted by processes of diversification
Come work with us!
Adaptive Comp Bio Internships!
Scan to internship listing:
www.adaptivebiotech.com/career-listings
Thanks everyone!
▶ Thanks Devang and Duke GCB & CBB!
▶ Bryan Howie and my team at Adaptive
▶ Phyletica Lab (the Phyleticians)
▶ Mark Holder
▶ Rafe Brown
▶ Cam Siler
▶ Lee Grismer
Computation:
▶ Alabama Supercomputer Authority
▶ Auburn University Hopper Cluster
Funding:
Photo credits:
▶ Rafe Brown
▶ Perry Wood, Jr.
▶ PhyloPic
Questions?
joaks@auburn.edu
@jamoaks
phyletica.org
Scan for slides:
www.slideshare.net/jamieoaks7
© 2007 Boris Kulikov boris-kulikov.blogspot.com
1 of 104

Recommended

Combining co-expression and co-location for gene network inference in porcine... by
Combining co-expression and co-location for gene network inference in porcine...Combining co-expression and co-location for gene network inference in porcine...
Combining co-expression and co-location for gene network inference in porcine...tuxette
246 views60 slides
Thesis seminar by
Thesis seminarThesis seminar
Thesis seminargvesom
451 views55 slides
Splice site recognition among different organisms by
Splice site recognition among different organismsSplice site recognition among different organisms
Splice site recognition among different organismsDespoina Kalfakakou
189 views42 slides
Clinical significance of transcript alignment discrepancies gne - 20141016 by
Clinical significance of transcript alignment discrepancies   gne - 20141016Clinical significance of transcript alignment discrepancies   gne - 20141016
Clinical significance of transcript alignment discrepancies gne - 20141016Reece Hart
889 views28 slides
ppgardner-lecture03-genomesize-complexity.pdf by
ppgardner-lecture03-genomesize-complexity.pdfppgardner-lecture03-genomesize-complexity.pdf
ppgardner-lecture03-genomesize-complexity.pdfPaul Gardner
12 views32 slides
LPEI_ZCNI_Poster by
LPEI_ZCNI_PosterLPEI_ZCNI_Poster
LPEI_ZCNI_PosterLong Pei
120 views1 slide

More Related Content

Similar to Generalizing phylogenetics to infer patterns predicted by processes of diversification

Genomic inference of the evolution of life history traits by
Genomic inference of the evolution of life history traitsGenomic inference of the evolution of life history traits
Genomic inference of the evolution of life history traitsboussau
230 views80 slides
Participation costs dismiss the advantage of heterogeneous networks in evolut... by
Participation costs dismiss the advantage of heterogeneous networks in evolut...Participation costs dismiss the advantage of heterogeneous networks in evolut...
Participation costs dismiss the advantage of heterogeneous networks in evolut...Naoki Masuda
661 views26 slides
Early generation selection in an intra population recurrent selection breedin... by
Early generation selection in an intra population recurrent selection breedin...Early generation selection in an intra population recurrent selection breedin...
Early generation selection in an intra population recurrent selection breedin...CIAT
883 views22 slides
MUMS Undergraduate Workshop - Introduction to Bayesian Inference & Uncertaint... by
MUMS Undergraduate Workshop - Introduction to Bayesian Inference & Uncertaint...MUMS Undergraduate Workshop - Introduction to Bayesian Inference & Uncertaint...
MUMS Undergraduate Workshop - Introduction to Bayesian Inference & Uncertaint...The Statistical and Applied Mathematical Sciences Institute
154 views42 slides
Multidimensional Co-Evolutionary Stability by
Multidimensional Co-Evolutionary StabilityMultidimensional Co-Evolutionary Stability
Multidimensional Co-Evolutionary StabilityFlorence (Flo) Debarre
224 views78 slides
20150115_JQO_NYAPopulationGenomics by
20150115_JQO_NYAPopulationGenomics20150115_JQO_NYAPopulationGenomics
20150115_JQO_NYAPopulationGenomicsJavier Quílez Oliete
116 views46 slides

Similar to Generalizing phylogenetics to infer patterns predicted by processes of diversification(20)

Genomic inference of the evolution of life history traits by boussau
Genomic inference of the evolution of life history traitsGenomic inference of the evolution of life history traits
Genomic inference of the evolution of life history traits
boussau230 views
Participation costs dismiss the advantage of heterogeneous networks in evolut... by Naoki Masuda
Participation costs dismiss the advantage of heterogeneous networks in evolut...Participation costs dismiss the advantage of heterogeneous networks in evolut...
Participation costs dismiss the advantage of heterogeneous networks in evolut...
Naoki Masuda661 views
Early generation selection in an intra population recurrent selection breedin... by CIAT
Early generation selection in an intra population recurrent selection breedin...Early generation selection in an intra population recurrent selection breedin...
Early generation selection in an intra population recurrent selection breedin...
CIAT883 views
AlgoAlignementGenomicSequences.ppt by SkanderBena
AlgoAlignementGenomicSequences.pptAlgoAlignementGenomicSequences.ppt
AlgoAlignementGenomicSequences.ppt
SkanderBena6 views
Session 03 Probability & sampling Distribution NEW.pptx by Muneer Akhter
Session 03 Probability & sampling Distribution NEW.pptxSession 03 Probability & sampling Distribution NEW.pptx
Session 03 Probability & sampling Distribution NEW.pptx
Muneer Akhter3 views
Accommodating clustered divergences in phylogenetic inference by Jamie Oaks
Accommodating clustered divergences in phylogenetic inferenceAccommodating clustered divergences in phylogenetic inference
Accommodating clustered divergences in phylogenetic inference
Jamie Oaks325 views
Evolutionary Computing - Genetic Algorithms - An Introduction by Martin Pacheco
Evolutionary Computing - Genetic Algorithms - An IntroductionEvolutionary Computing - Genetic Algorithms - An Introduction
Evolutionary Computing - Genetic Algorithms - An Introduction
Martin Pacheco946 views
Human genetics evolutionary genetics by Dan Gaston
Human genetics   evolutionary geneticsHuman genetics   evolutionary genetics
Human genetics evolutionary genetics
Dan Gaston2.9K views
Data Driven Process Optimization Using Real-Coded Genetic Algorithms ~陳奇中教授演講投影片 by Chyi-Tsong Chen
Data Driven Process Optimization Using Real-Coded Genetic Algorithms ~陳奇中教授演講投影片Data Driven Process Optimization Using Real-Coded Genetic Algorithms ~陳奇中教授演講投影片
Data Driven Process Optimization Using Real-Coded Genetic Algorithms ~陳奇中教授演講投影片
Chyi-Tsong Chen1.6K views
Quelques résultats préliminaires de l'évaluation de méthodes d'inférence de r... by tuxette
Quelques résultats préliminaires de l'évaluation de méthodes d'inférence de r...Quelques résultats préliminaires de l'évaluation de méthodes d'inférence de r...
Quelques résultats préliminaires de l'évaluation de méthodes d'inférence de r...
tuxette122 views
Evolutionary Genetics of Complex Genome by jrossibarra
Evolutionary Genetics of Complex GenomeEvolutionary Genetics of Complex Genome
Evolutionary Genetics of Complex Genome
jrossibarra1.2K views
Generalizing phylogenetics to infer shared evolutionary events by Jamie Oaks
Generalizing phylogenetics to infer shared evolutionary eventsGeneralizing phylogenetics to infer shared evolutionary events
Generalizing phylogenetics to infer shared evolutionary events
Jamie Oaks628 views
MS thesis presentation_FINAL by Tom Hajek
MS thesis presentation_FINALMS thesis presentation_FINAL
MS thesis presentation_FINAL
Tom Hajek308 views
Cdac 2018 antoniotti cancer evolution trait by Marco Antoniotti
Cdac 2018 antoniotti cancer evolution traitCdac 2018 antoniotti cancer evolution trait
Cdac 2018 antoniotti cancer evolution trait
Marco Antoniotti203 views
Organizational Heterogeneity of Human Genome by Svetlana Frenkel
Organizational Heterogeneity of Human GenomeOrganizational Heterogeneity of Human Genome
Organizational Heterogeneity of Human Genome
Svetlana Frenkel669 views

More from Jamie Oaks

Full Bayesian comparative biogeography of Philippine geckos challenges predic... by
Full Bayesian comparative biogeography of Philippine geckos challenges predic...Full Bayesian comparative biogeography of Philippine geckos challenges predic...
Full Bayesian comparative biogeography of Philippine geckos challenges predic...Jamie Oaks
223 views35 slides
Generalizing phylogenetics to infer patterns of shared evolutionary events by
Generalizing phylogenetics to infer patterns of shared evolutionary eventsGeneralizing phylogenetics to infer patterns of shared evolutionary events
Generalizing phylogenetics to infer patterns of shared evolutionary eventsJamie Oaks
185 views95 slides
SSB 2018 Comparative Phylogeography Workshop by
SSB 2018 Comparative Phylogeography WorkshopSSB 2018 Comparative Phylogeography Workshop
SSB 2018 Comparative Phylogeography WorkshopJamie Oaks
500 views48 slides
Inferring shared demographic changes from genomic data by
Inferring shared demographic changes from genomic dataInferring shared demographic changes from genomic data
Inferring shared demographic changes from genomic dataJamie Oaks
206 views16 slides
Generalizing phylogenetics to infer shared evolutionary events by
Generalizing phylogenetics to infer shared evolutionary eventsGeneralizing phylogenetics to infer shared evolutionary events
Generalizing phylogenetics to infer shared evolutionary eventsJamie Oaks
314 views96 slides
Generalizing phylogenetics to infer shared evolutionary events by
Generalizing phylogenetics to infer shared evolutionary eventsGeneralizing phylogenetics to infer shared evolutionary events
Generalizing phylogenetics to infer shared evolutionary eventsJamie Oaks
143 views19 slides

More from Jamie Oaks(10)

Full Bayesian comparative biogeography of Philippine geckos challenges predic... by Jamie Oaks
Full Bayesian comparative biogeography of Philippine geckos challenges predic...Full Bayesian comparative biogeography of Philippine geckos challenges predic...
Full Bayesian comparative biogeography of Philippine geckos challenges predic...
Jamie Oaks223 views
Generalizing phylogenetics to infer patterns of shared evolutionary events by Jamie Oaks
Generalizing phylogenetics to infer patterns of shared evolutionary eventsGeneralizing phylogenetics to infer patterns of shared evolutionary events
Generalizing phylogenetics to infer patterns of shared evolutionary events
Jamie Oaks185 views
SSB 2018 Comparative Phylogeography Workshop by Jamie Oaks
SSB 2018 Comparative Phylogeography WorkshopSSB 2018 Comparative Phylogeography Workshop
SSB 2018 Comparative Phylogeography Workshop
Jamie Oaks500 views
Inferring shared demographic changes from genomic data by Jamie Oaks
Inferring shared demographic changes from genomic dataInferring shared demographic changes from genomic data
Inferring shared demographic changes from genomic data
Jamie Oaks206 views
Generalizing phylogenetics to infer shared evolutionary events by Jamie Oaks
Generalizing phylogenetics to infer shared evolutionary eventsGeneralizing phylogenetics to infer shared evolutionary events
Generalizing phylogenetics to infer shared evolutionary events
Jamie Oaks314 views
Generalizing phylogenetics to infer shared evolutionary events by Jamie Oaks
Generalizing phylogenetics to infer shared evolutionary eventsGeneralizing phylogenetics to infer shared evolutionary events
Generalizing phylogenetics to infer shared evolutionary events
Jamie Oaks143 views
Approximate-rejection sampler by Jamie Oaks
Approximate-rejection samplerApproximate-rejection sampler
Approximate-rejection sampler
Jamie Oaks151 views
Dirichlet-process probability tree by Jamie Oaks
Dirichlet-process probability treeDirichlet-process probability tree
Dirichlet-process probability tree
Jamie Oaks166 views
Islands and Integrals by Jamie Oaks
Islands and IntegralsIslands and Integrals
Islands and Integrals
Jamie Oaks419 views
joaks-evolution-2014 by Jamie Oaks
joaks-evolution-2014joaks-evolution-2014
joaks-evolution-2014
Jamie Oaks671 views

Recently uploaded

Workshop Chemical Robotics ChemAI 231116.pptx by
Workshop Chemical Robotics ChemAI 231116.pptxWorkshop Chemical Robotics ChemAI 231116.pptx
Workshop Chemical Robotics ChemAI 231116.pptxMarco Tibaldi
95 views41 slides
"How can I develop my learning path in bioinformatics? by
"How can I develop my learning path in bioinformatics?"How can I develop my learning path in bioinformatics?
"How can I develop my learning path in bioinformatics?Bioinformy
21 views13 slides
ENTOMOLOGY PPT ON BOMBYCIDAE AND SATURNIIDAE.pptx by
ENTOMOLOGY PPT ON BOMBYCIDAE AND SATURNIIDAE.pptxENTOMOLOGY PPT ON BOMBYCIDAE AND SATURNIIDAE.pptx
ENTOMOLOGY PPT ON BOMBYCIDAE AND SATURNIIDAE.pptxMN
6 views13 slides
MILK LIPIDS 2.pptx by
MILK LIPIDS 2.pptxMILK LIPIDS 2.pptx
MILK LIPIDS 2.pptxabhinambroze18
7 views15 slides
scopus cited journals.pdf by
scopus cited journals.pdfscopus cited journals.pdf
scopus cited journals.pdfKSAravindSrivastava
5 views15 slides
별헤는 사람들 2023년 12월호 전명원 교수 자료 by
별헤는 사람들 2023년 12월호 전명원 교수 자료별헤는 사람들 2023년 12월호 전명원 교수 자료
별헤는 사람들 2023년 12월호 전명원 교수 자료sciencepeople
31 views30 slides

Recently uploaded(20)

Workshop Chemical Robotics ChemAI 231116.pptx by Marco Tibaldi
Workshop Chemical Robotics ChemAI 231116.pptxWorkshop Chemical Robotics ChemAI 231116.pptx
Workshop Chemical Robotics ChemAI 231116.pptx
Marco Tibaldi95 views
"How can I develop my learning path in bioinformatics? by Bioinformy
"How can I develop my learning path in bioinformatics?"How can I develop my learning path in bioinformatics?
"How can I develop my learning path in bioinformatics?
Bioinformy21 views
ENTOMOLOGY PPT ON BOMBYCIDAE AND SATURNIIDAE.pptx by MN
ENTOMOLOGY PPT ON BOMBYCIDAE AND SATURNIIDAE.pptxENTOMOLOGY PPT ON BOMBYCIDAE AND SATURNIIDAE.pptx
ENTOMOLOGY PPT ON BOMBYCIDAE AND SATURNIIDAE.pptx
MN6 views
별헤는 사람들 2023년 12월호 전명원 교수 자료 by sciencepeople
별헤는 사람들 2023년 12월호 전명원 교수 자료별헤는 사람들 2023년 12월호 전명원 교수 자료
별헤는 사람들 2023년 12월호 전명원 교수 자료
sciencepeople31 views
Distinct distributions of elliptical and disk galaxies across the Local Super... by Sérgio Sacani
Distinct distributions of elliptical and disk galaxies across the Local Super...Distinct distributions of elliptical and disk galaxies across the Local Super...
Distinct distributions of elliptical and disk galaxies across the Local Super...
Sérgio Sacani30 views
Light Pollution for LVIS students by CWBarthlmew
Light Pollution for LVIS studentsLight Pollution for LVIS students
Light Pollution for LVIS students
CWBarthlmew5 views
application of genetic engineering 2.pptx by SankSurezz
application of genetic engineering 2.pptxapplication of genetic engineering 2.pptx
application of genetic engineering 2.pptx
SankSurezz7 views
Artificial Intelligence Helps in Drug Designing and Discovery.pptx by abhinashsahoo2001
Artificial Intelligence Helps in Drug Designing and Discovery.pptxArtificial Intelligence Helps in Drug Designing and Discovery.pptx
Artificial Intelligence Helps in Drug Designing and Discovery.pptx
abhinashsahoo2001118 views
Connecting communities to promote FAIR resources: perspectives from an RDA / ... by Allyson Lister
Connecting communities to promote FAIR resources: perspectives from an RDA / ...Connecting communities to promote FAIR resources: perspectives from an RDA / ...
Connecting communities to promote FAIR resources: perspectives from an RDA / ...
Allyson Lister34 views
Ethical issues associated with Genetically Modified Crops and Genetically Mod... by PunithKumars6
Ethical issues associated with Genetically Modified Crops and Genetically Mod...Ethical issues associated with Genetically Modified Crops and Genetically Mod...
Ethical issues associated with Genetically Modified Crops and Genetically Mod...
PunithKumars622 views
Pollination By Nagapradheesh.M.pptx by MNAGAPRADHEESH
Pollination By Nagapradheesh.M.pptxPollination By Nagapradheesh.M.pptx
Pollination By Nagapradheesh.M.pptx
MNAGAPRADHEESH15 views
Conventional and non-conventional methods for improvement of cucurbits.pptx by gandhi976
Conventional and non-conventional methods for improvement of cucurbits.pptxConventional and non-conventional methods for improvement of cucurbits.pptx
Conventional and non-conventional methods for improvement of cucurbits.pptx
gandhi97618 views
Metatheoretical Panda-Samaneh Borji.pdf by samanehborji
Metatheoretical Panda-Samaneh Borji.pdfMetatheoretical Panda-Samaneh Borji.pdf
Metatheoretical Panda-Samaneh Borji.pdf
samanehborji16 views
PRINCIPLES-OF ASSESSMENT by rbalmagro
PRINCIPLES-OF ASSESSMENTPRINCIPLES-OF ASSESSMENT
PRINCIPLES-OF ASSESSMENT
rbalmagro11 views

Generalizing phylogenetics to infer patterns predicted by processes of diversification

  • 1. Generalizing phylogenetics to infer patterns predicted by processes of diversification Jamie Oaks Auburn University phyletica.org @jamoaks Scan for slides: www.slideshare.net/jamieoaks7 © 2007 Boris Kulikov boris-kulikov.blogspot.com
  • 2. © 2007 Boris Kulikov boris-kulikov.blogspot.com
  • 3. © 2007 Boris Kulikov boris-kulikov.blogspot.com
  • 4. © 2007 Boris Kulikov boris-kulikov.blogspot.com Generalizing phylogenetics to infer patterns of shared evolutionary events
  • 5. Phyletica Lab Postdocs ▶ Perry Wood, Jr ▶ Brian Folt ▶ Jesse Grismer Graduate students ▶ Tashitso Anamza ▶ Matt Buehler ▶ Kerry Cobb ▶ Kyle David ▶ Saman Jahangiri ▶ Randy Klabacka ▶ Morgan Muell ▶ Tanner Myers ▶ Claire Tracy ▶ Breanna Sipley ▶ Aundrea Westfall The Phyleticians Undergraduate students ▶ Laura Lewis ▶ Mary Wells ▶ Hailey Whitaker ▶ Noah Yawn ▶ Charlotte Benedict ▶ Eric Carbo ▶ Ryan Cook ▶ Andrew DeSana ▶ Miles Horne ▶ Jacob Landrum ▶ Nadia L’Bahy ▶ Jorge Lopez-Perez ▶ Holden Smith ▶ Virginia White ▶ Kayla Wilson
  • 6. ▶ Phylogenetics is rapidly becoming the statistical foundation of biology © 2007 Boris Kulikov boris-kulikov.blogspot.com
  • 7. ▶ Phylogenetics is rapidly becoming the statistical foundation of biology ▶ “Big data” present exciting possibilities and challenges © 2007 Boris Kulikov boris-kulikov.blogspot.com
  • 8. ▶ Phylogenetics is rapidly becoming the statistical foundation of biology ▶ “Big data” present exciting possibilities and challenges ▶ Many opportunities to develop new ways to study biology in light of phylogeny © 2007 Boris Kulikov boris-kulikov.blogspot.com
  • 10. ▶ Assumption: All processes of diversification affect each lineage independently phylopic.org CC0
  • 13. Biogeography ▶ Environmental changes that affect whole communities of species
  • 14. Biogeography ▶ Environmental changes that affect whole communities of species Genome evolution ▶ Duplication of a chromosome segment harboring gene families
  • 15. Biogeography ▶ Environmental changes that affect whole communities of species Genome evolution ▶ Duplication of a chromosome segment harboring gene families Epidemiology ▶ Transmission at social gatherings
  • 16. Biogeography ▶ Environmental changes that affect whole communities of species Genome evolution ▶ Duplication of a chromosome segment harboring gene families Epidemiology ▶ Transmission at social gatherings Endosymbiont evolution (e.g., parasites, microbiome) ▶ Speciation of the host ▶ Co-colonization of new host species
  • 17. Why account for shared divergences? J. R. Oaks et al. (2022). PNAS 119: e2121036119
  • 18. Why account for shared divergences? 1. Improve inference True history τ1 τ2 τ3 Current tree model τ1 τ2 τ3τ4 τ5 τ6 τ7 τ8 J. R. Oaks et al. (2022). PNAS 119: e2121036119
  • 19. Why account for shared divergences? 1. Improve inference 2. Provide a framework for studying processes of co-diversification True history τ1 τ2 τ3 Current tree model τ1 τ2 τ3τ4 τ5 τ6 τ7 τ8 J. R. Oaks et al. (2022). PNAS 119: e2121036119
  • 20. Biogeography ▶ Environmental changes that affect whole communities of species Genome evolution ▶ Duplication of a chromosome segment harboring gene families Epidemiology ▶ Transmission at social gatherings Endosymbiont evolution (e.g., parasites, microbiome) ▶ Speciation of the host ▶ Co-colonization of new host species
  • 21. Approaches to the problem A pairwise approach (keep it “simple”) A fully phylogenetic approach Tashitso Anamza Tanner Myers Randy Klabacka Perry Wood, Jr. Claire Tracy Kerry Cobb Matt Buehler Nadia L’bahy
  • 22. Approaches to the problem A pairwise approach (keep it “simple”) A fully phylogenetic approach Dr. Perry Wood, Jr.
  • 23. Challenges to accounting for shared divergences True history τ1 τ2 τ3 Current tree model τ1 τ2 τ3τ4 τ5 τ6 τ7 τ8 J. R. Oaks et al. (2022). PNAS 119: e2121036119
  • 24. Challenges to accounting for shared divergences 1. Likelihood for genomic data is tricky True history τ1 τ2 τ3 Current tree model τ1 τ2 τ3τ4 τ5 τ6 τ7 τ8 J. R. Oaks et al. (2022). PNAS 119: e2121036119
  • 25. Challenges to accounting for shared divergences 1. Likelihood for genomic data is tricky 2. Lots of possible trees of different dimensions True history τ1 τ2 τ3 Current tree model τ1 τ2 τ3τ4 τ5 τ6 τ7 τ8 J. R. Oaks et al. (2022). PNAS 119: e2121036119
  • 29. A A G G G A ▶ Conditional on “population tree” (T), model “gene trees” (G) using coalescent
  • 30. A A G G G A ▶ Conditional on “population tree” (T), model “gene trees” (G) using coalescent ▶ Coalescent is a stochastic model of shared inheritance (continuous-time Markov chain = CTMC)
  • 31. A A G G G A ▶ Conditional on “population tree” (T), model “gene trees” (G) using coalescent ▶ Coalescent is a stochastic model of shared inheritance (continuous-time Markov chain = CTMC) ▶ Gene tree branching patterns are a function of population size
  • 32. v u G A A A G G G A ▶ Conditional on “population tree” (T), model “gene trees” (G) using coalescent ▶ Coalescent is a stochastic model of shared inheritance (continuous-time Markov chain = CTMC) ▶ Gene tree branching patterns are a function of population size ▶ Conditional on G, model mutation as a CTMC
  • 33. v u G A A A G G G A ▶ Conditional on “population tree” (T), model “gene trees” (G) using coalescent ▶ Coalescent is a stochastic model of shared inheritance (continuous-time Markov chain = CTMC) ▶ Gene tree branching patterns are a function of population size ▶ Conditional on G, model mutation as a CTMC ▶ Genetic characters provide information about G
  • 34. v u G A A A G G G A ▶ Conditional on “population tree” (T), model “gene trees” (G) using coalescent ▶ Coalescent is a stochastic model of shared inheritance (continuous-time Markov chain = CTMC) ▶ Gene tree branching patterns are a function of population size ▶ Conditional on G, model mutation as a CTMC ▶ Genetic characters provide information about G ▶ G informs T (population sizes, divergence times, and relationships)
  • 37. v u G A A A G G G A ▶ “Standard” hierarchical approach ▶ Calculate p(genetic data | G) × p(G | T)
  • 38. v u G A A A G G G A ▶ “Standard” hierarchical approach ▶ Calculate p(genetic data | G) × p(G | T) ▶ Use numerical integration (MCMC) to co-estimate both
  • 39. v u G A A A G G G A ▶ “Standard” hierarchical approach ▶ Calculate p(genetic data | G) × p(G | T) ▶ Use numerical integration (MCMC) to co-estimate both ▶ But, G and T are highly correlated
  • 40. v u G A A A G G G A ▶ “Standard” hierarchical approach ▶ Calculate p(genetic data | G) × p(G | T) ▶ Use numerical integration (MCMC) to co-estimate both ▶ But, G and T are highly correlated ▶ As the number of loci (gene trees) increases, MCMC falls apart
  • 41. v u G A A A G G G A ▶ “Standard” hierarchical approach ▶ Calculate p(genetic data | G) × p(G | T) ▶ Use numerical integration (MCMC) to co-estimate both ▶ But, G and T are highly correlated ▶ As the number of loci (gene trees) increases, MCMC falls apart ▶ Can we integrate G analytically?
  • 42. v u G A A A G G G A (n, r ) = (3, 1) (n, r ) = (3, 2) (n, r ) = (1, 0) (n, r ) = (1, 0) (n, r ) = (2, 0) 1 T. Schmelzer and L. N. Trefethen (2007). Electronic Transactions on Numerical Analysis 29: 1–18 2 D. Bryant et al. (2012). Molecular Biology and Evolution 29: 1917–1932
  • 43. v u G A A A G G G A (n, r ) = (3, 1) (n, r ) = (3, 2) (n, r ) = (1, 0) (n, r ) = (1, 0) (n, r ) = (2, 0) 1 T. Schmelzer and L. N. Trefethen (2007). Electronic Transactions on Numerical Analysis 29: 1–18 2 D. Bryant et al. (2012). Molecular Biology and Evolution 29: 1917–1932
  • 44. v u G A A A G G G A (n, r ) = (3, 1) (n, r ) = (3, 2) (n, r ) = (1, 0) (n, r ) = (1, 0) (n, r ) = (2, 0) Q = (1, 0) (1, 1) (2, 0) (2, 1) · · · (n, n) (1, 0) · · · · · (1, 1) · · · · · (2, 0) · · · · · (2, 1) · · · · · . . . (n, n) · · · · · 1 T. Schmelzer and L. N. Trefethen (2007). Electronic Transactions on Numerical Analysis 29: 1–18 2 D. Bryant et al. (2012). Molecular Biology and Evolution 29: 1917–1932
  • 45. v u G A A A G G G A (n, r ) = (3, 1) (n, r ) = (3, 2) (n, r ) = (1, 0) (n, r ) = (1, 0) (n, r ) = (2, 0) Q = (1, 0) (1, 1) (2, 0) (2, 1) · · · (n, n) (1, 0) · · · · · (1, 1) · · · · · (2, 0) · · · · · (2, 1) · · · · · . . . (n, n) · · · · · Q(n,r);(n,r−1) = (n − r + 1)v, mutation, Q(n,r);(n,r+1) = (r + 1)u, mutation, Q(n,r);(n−1,r) = (n−1−r)n 2Ne (u+v) , coalescence, Q(n,r);(n−1,r−1) = (r−1)n 2Ne (u+v) , coalescence, Q(n,r);(n,r) = − (n−1)n 2Ne (u+v) − (n − r)v − ru. 1 T. Schmelzer and L. N. Trefethen (2007). Electronic Transactions on Numerical Analysis 29: 1–18 2 D. Bryant et al. (2012). Molecular Biology and Evolution 29: 1917–1932
  • 46. v u G A A A G G G A (n, r ) = (3, 1) (n, r ) = (3, 2) (n, r ) = (1, 0) (n, r ) = (1, 0) (n, r ) = (2, 0) Q = (1, 0) (1, 1) (2, 0) (2, 1) · · · (n, n) (1, 0) · · · · · (1, 1) · · · · · (2, 0) · · · · · (2, 1) · · · · · . . . (n, n) · · · · · Q(n,r);(n,r−1) = (n − r + 1)v, mutation, Q(n,r);(n,r+1) = (r + 1)u, mutation, Q(n,r);(n−1,r) = (n−1−r)n 2Ne (u+v) , coalescence, Q(n,r);(n−1,r−1) = (r−1)n 2Ne (u+v) , coalescence, Q(n,r);(n,r) = − (n−1)n 2Ne (u+v) − (n − r)v − ru. ▶ eQt to keep track of all conditional probabilites along each branch (Carathéodory-Fejér method1 ) 1 T. Schmelzer and L. N. Trefethen (2007). Electronic Transactions on Numerical Analysis 29: 1–18 2 D. Bryant et al. (2012). Molecular Biology and Evolution 29: 1917–1932
  • 47. v u G A A A G G G A (n, r ) = (3, 1) (n, r ) = (3, 2) (n, r ) = (1, 0) (n, r ) = (1, 0) (n, r ) = (2, 0) Q = (1, 0) (1, 1) (2, 0) (2, 1) · · · (n, n) (1, 0) · · · · · (1, 1) · · · · · (2, 0) · · · · · (2, 1) · · · · · . . . (n, n) · · · · · Q(n,r);(n,r−1) = (n − r + 1)v, mutation, Q(n,r);(n,r+1) = (r + 1)u, mutation, Q(n,r);(n−1,r) = (n−1−r)n 2Ne (u+v) , coalescence, Q(n,r);(n−1,r−1) = (r−1)n 2Ne (u+v) , coalescence, Q(n,r);(n,r) = − (n−1)n 2Ne (u+v) − (n − r)v − ru. ▶ eQt to keep track of all conditional probabilites along each branch (Carathéodory-Fejér method1 ) ▶ At root, get likelihood of population tree integrated over all possible gene trees and mutational histories2 1 T. Schmelzer and L. N. Trefethen (2007). Electronic Transactions on Numerical Analysis 29: 1–18 2 D. Bryant et al. (2012). Molecular Biology and Evolution 29: 1917–1932
  • 48. Challenges to accounting for shared divergences 1. Likelihood for genomic data is tricky 2. Lots of possible trees of different dimensions True history τ1 τ2 τ3 Current tree model τ1 τ2 τ3τ4 τ5 τ6 τ7 τ8 J. R. Oaks et al. (2022). PNAS 119: e2121036119
  • 50. Generalizing tree space nτ = 3 A B C D A C B D A D B C A B C D A B D C A C B D A C D B A D B C A D C B B C A D B C D A B D A C B D C A C D A B C D B A
  • 51. Generalizing tree space nτ = 3 nτ = 2 nτ = 1 A B C D A C B D A D B C A B C D A C B D A D B C A B C D A B D C A C B D A B C D A C B D A D B C A C D B A D B C A D C B B C A D B D A C C D A B A B C D B C A D B C D A B D A C A B C D A B D C A C D B B D C A C D A B C D B A B C D A
  • 52. Generalizing tree space nτ = 3 nτ = 2 nτ = 1 A B C D A C B D A D B C A B C D A C B D A D B C A B C D A B D C A C B D A B C D A C B D A D B C A C D B A D B C A D C B B C A D B D A C C D A B A B C D B C A D B C D A B D A C A B C D A B D C A C D B B D C A C D A B C D B A B C D A
  • 53. Generalized tree distribution ▶ All topologies equally probable ▶ Parametric distribution on age of root ▶ Beta distributions on other divergence times τ0 τ1 ∼ Beta(τ0, τ3, α, β) * τ2 ∼ Beta(τ0, τ3, α, β) * τ3 ∼ Beta(τ0, τ4, α, β) τ4 ∼ Gamma(k, θ) * I H G F E D C B A t6 t3 t1 t2 t4 t5 J. R. Oaks et al. (2022). PNAS 119: e2121036119
  • 54. Inferring trees with shared divergences Split Merge Reversible-jump MCMC J. R. Oaks et al. (2022). PNAS 119: e2121036119
  • 55. Validating rjMCMC with 7-leaf tree 350 400 450 0.000 0.005 0.010 0.015 0.020 Number of times each topology sampled Density Binom(5e+07, 1/127,569) MCMC χ2 = 128,046 p = 0.172 The rjMCMC algorithms sample the expected generalized tree distribution J. R. Oaks et al. (2022). PNAS 119: e2121036119
  • 56. Phylogenetic coevality Phyco val J. R. Oaks et al. (2022). PNAS 119: e2121036119 Ecoevolity Estimating evolutionary coevality J. R. Oaks (2019). Systematic Biology 68: 371–395 ▶ Tree model ▶ rjMCMC sampling of generalized tree distribution 1 D. Bryant et al. (2012). Molecular Biology and Evolution 29: 1917–1932
  • 57. Phylogenetic coevality Phyco val J. R. Oaks et al. (2022). PNAS 119: e2121036119 Ecoevolity Estimating evolutionary coevality J. R. Oaks (2019). Systematic Biology 68: 371–395 ▶ Tree model ▶ rjMCMC sampling of generalized tree distribution ▶ Likelihood model ▶ CTMC model of characters evolving along genealogies ▶ Infer species trees by analytically integrate over genealogies1 1 D. Bryant et al. (2012). Molecular Biology and Evolution 29: 1917–1932
  • 58. Phylogenetic coevality Phyco val J. R. Oaks et al. (2022). PNAS 119: e2121036119 Ecoevolity Estimating evolutionary coevality J. R. Oaks (2019). Systematic Biology 68: 371–395 ▶ Tree model ▶ rjMCMC sampling of generalized tree distribution ▶ Likelihood model ▶ CTMC model of characters evolving along genealogies ▶ Infer species trees by analytically integrate over genealogies1 ▶ Goal: Co-estimation of phylogeny and shared divergences from genomic data 1 D. Bryant et al. (2012). Molecular Biology and Evolution 29: 1917–1932
  • 60. Methods: Simulations ▶ Simulated 100 data sets with 50,000 base pairs τ1 0.05 τ2 0.08 τ3 0.2 τ1 0.03 τ2 0.05 τ3 0.06 τ4 0.11 τ5 0.12 τ6 0.14 τ7 0.18 τ8 0.2
  • 61. Methods: Simulations ▶ Simulated 100 data sets with 50,000 base pairs ▶ Analyzed each data set with: ▶ MG = Generalized tree model ▶ MIB = Independent-bifurcating tree model τ1 0.05 τ2 0.08 τ3 0.2 τ1 0.03 τ2 0.05 τ3 0.06 τ4 0.11 τ5 0.12 τ6 0.14 τ7 0.18 τ8 0.2
  • 62. Methods: Simulations ▶ Simulated 100 data sets with 50,000 base pairs ▶ Analyzed each data set with: ▶ MG = Generalized tree model ▶ MIB = Independent-bifurcating tree model ▶ Simulated 100 data sets where topology and div times randomly drawn from MG and MIB τ1 0.05 τ2 0.08 τ3 0.2 τ1 0.03 τ2 0.05 τ3 0.06 τ4 0.11 τ5 0.12 τ6 0.14 τ7 0.18 τ8 0.2
  • 63. 0.05 0.08 0.2 J. R. Oaks et al. (2022). PNAS 119: e2121036119
  • 64. 0.0 0.5 1.0 0.05 0.0 0.5 1.0 0.08 0.2 0.0 0.5 1.0 0.0 0.5 1.0 0.0 0.5 1.0 0.0 0.5 1.0 J. R. Oaks et al. (2022). PNAS 119: e2121036119
  • 65. τ1 0.05 τ2 0.08 τ3 0.2 t5 t2 t1 t3 t4 J. R. Oaks et al. (2022). PNAS 119: e2121036119
  • 66. MG = Generalized model MIB = Independent-bifurcating model τ1 0.05 τ2 0.08 τ3 0.2 t5 t2 t1 t3 t4 Splitting t4 0.00 0.25 0.50 0.75 1.00 Posterior probability J. R. Oaks et al. (2022). PNAS 119: e2121036119
  • 67. MG = Generalized model MIB = Independent-bifurcating model τ1 0.05 τ2 0.08 τ3 0.2 t5 t2 t1 t3 t4 0.000 0.002 0.004 0.006 0.008 0.010 0.012 p = 4.1 × 10−18 MG MIB Distance from true tree J. R. Oaks et al. (2022). PNAS 119: e2121036119
  • 68. MG = Generalized model MIB = Independent-bifurcating model τ1 0.05 τ2 0.08 τ3 0.2 t5 t2 t1 t3 t4 0.000 0.002 0.004 0.006 0.008 0.010 0.012 p = 4.1 × 10−18 MG MIB Distance from true tree MG significantly better at inferring trees with shared divergences J. R. Oaks et al. (2022). PNAS 119: e2121036119
  • 69. MG = Generalized model MIB = Independent-bifurcating model τ1 0.03 τ2 0.05 τ3 0.06 τ4 0.11 τ5 0.12 τ6 0.14 τ7 0.18 τ8 0.2 t8 t2 t1 t7 t4 t3 t6 t5 J. R. Oaks et al. (2022). PNAS 119: e2121036119
  • 70. MG = Generalized model MIB = Independent-bifurcating model τ1 0.03 τ2 0.05 τ3 0.06 τ4 0.11 τ5 0.12 τ6 0.14 τ7 0.18 τ8 0.2 t8 t2 t1 t7 t4 t3 t6 t5 True topology 0.00 0.25 0.50 0.75 1.00 Posterior probability J. R. Oaks et al. (2022). PNAS 119: e2121036119
  • 71. MG = Generalized model MIB = Independent-bifurcating model τ1 0.03 τ2 0.05 τ3 0.06 τ4 0.11 τ5 0.12 τ6 0.14 τ7 0.18 τ8 0.2 t8 t2 t1 t7 t4 t3 t6 t5 0.0000 0.0025 0.0050 0.0075 0.0100 0.0125 p = 0.36 MG MIB Distance from true tree J. R. Oaks et al. (2022). PNAS 119: e2121036119
  • 72. MG = Generalized model MIB = Independent-bifurcating model τ1 0.03 τ2 0.05 τ3 0.06 τ4 0.11 τ5 0.12 τ6 0.14 τ7 0.18 τ8 0.2 t8 t2 t1 t7 t4 t3 t6 t5 0.0000 0.0025 0.0050 0.0075 0.0100 0.0125 p = 0.36 MG MIB Distance from true tree MG performs as well as true model when divergences are independent J. R. Oaks et al. (2022). PNAS 119: e2121036119
  • 73. Results: random MG trees 0.5 1.0 1.5 2.0 2.5 0.5 1.0 1.5 2.0 2.5 p(TL ∈ CI) = 0.96 RMSE = 0.00486 True tree length Estimated tree length J. R. Oaks et al. (2022). PNAS 119: e2121036119
  • 74. Results: random MG trees 0.5 1.0 1.5 2.0 2.5 0.5 1.0 1.5 2.0 2.5 p(TL ∈ CI) = 0.96 RMSE = 0.00486 True tree length Estimated tree length Shared divergence 0.00 0.25 0.50 0.75 1.00 Posterior probability J. R. Oaks et al. (2022). PNAS 119: e2121036119
  • 75. Results: random MG trees 0.5 1.0 1.5 2.0 2.5 0.5 1.0 1.5 2.0 2.5 p(TL ∈ CI) = 0.96 RMSE = 0.00486 True tree length Estimated tree length Shared divergence 0.00 0.25 0.50 0.75 1.00 Posterior probability MG performs well with data simulated on random trees with shared divergences J. R. Oaks et al. (2022). PNAS 119: e2121036119
  • 76. Results: random MIB trees Probability of incorrectly merged divergence times (true model = MIB) Merged times 0.00 0.25 0.50 0.75 1.00 FPR = 0.047 A Posterior probability 0.00 0.05 0.10 0.15 0.20 0.00 0.25 0.50 0.75 1.00 B Time difference Posterior probability 0.0 0.1 0.2 0.3 0.00 0.25 0.50 0.75 1.00 C Midpoint divergence time Posterior probability J. R. Oaks et al. (2022). PNAS 119: e2121036119
  • 77. Results: random MIB trees Probability of incorrectly merged divergence times (true model = MIB) Merged times 0.00 0.25 0.50 0.75 1.00 FPR = 0.047 A Posterior probability 0.00 0.05 0.10 0.15 0.20 0.00 0.25 0.50 0.75 1.00 B Time difference Posterior probability 0.0 0.1 0.2 0.3 0.00 0.25 0.50 0.75 1.00 C Midpoint divergence time Posterior probability MG has low false positive rate J. R. Oaks et al. (2022). PNAS 119: e2121036119
  • 78. MG = Generalized model MIB = Independent-bifurcating model 0.000 0.002 0.004 0.006 0.008 0.010 0.012 Ave std dev of split freq (ASDSF) Generalizing tree space improves MCMC convergence and mixing J. R. Oaks et al. (2022). PNAS 119: e2121036119
  • 79. Scan for sea-level animation
  • 80. Scan for sea-level animation
  • 81. Scan for sea-level animation Did fragmentation of islands promote diversification?
  • 82. Cyrtodactylus Gekko ©Rafe M. Brown 10 15 20 118 120 122 124 126 Longitude Latitude 10 15 20 118 120 122 124 126 Longitude Latitude ©Rafe M. Brown J. R. Oaks et al. (2022). PNAS 119: e2121036119
  • 83. Cyrtodactylus Gekko ©Rafe M. Brown 10 15 20 118 120 122 124 126 Longitude Latitude 10 15 20 118 120 122 124 126 Longitude Latitude ©Rafe M. Brown 1702 loci 155,887 sites 1033 loci 94,813 sites J. R. Oaks et al. (2022). PNAS 119: e2121036119
  • 84. Gekko 5°N 10°N 15°N 20°N 116°E 118°E 120°E 122°E 124°E 126°E Longitude Latitude D 116°E 118°E 120°E 122°E 124°E 126°E Longitude 0.28 0.55 0.19 0.35 0.81 1.0 n c i l P s l P 30 20 10 0 1 2 14 15 16 17 19 18 20 22 21 23 24 26 25 4 3 5 7 6 13 12 11 10 9 8 0.87 0.86 n c i l P s l P 30 20 10 0 C G. athymus G. carusadensis G. crombota G. ernstkelleri G. gigante G. mindorensis G. monarchus G. porosus G. romblon G. rossi G. sp. a G. sp. b 1 5 11 2 7 6 22 20 16 19 18 15 23 17 21 26 25 24 14 13 12 4 3 10 9 8 Oligocene Miocene Oligocene Miocene J. R. Oaks et al. (2022). PNAS 119: e2121036119
  • 85. Cyrtodactylus D 5°N 10°N 15°N 20°N 116°E 118°E 120°E 122°E 124°E 126°E Longitude Latitude B 1.0 1 10 24 23 25 27 26 18 19 22 21 20 14 13 15 17 16 12 11 7 9 8 2 3 4 6 5 0.71 0.75 n c i l P s l P 30 20 10 0 A C C. annulatus C. gubaot C. jambangan C. mamanwa C. philippinicus C. redimiculus C. sumuroi C. tautbatorum 0.30 0.45 0.90 0.89 0.57 10 13 17 6 4 5 9 3 7 24 27 18 19 22 23 14 26 21 25 15 16 20 12 11 1 8 2 Oligocene Miocene J. R. Oaks et al. (2022). PNAS 119: e2121036119
  • 86. Take-home points ▶ We can accurately infer phylogenies with shared divergences
  • 87. Take-home points ▶ We can accurately infer phylogenies with shared divergences ▶ Generalizing tree space avoids spurious support incorrect relationships and improves MCMC mixing
  • 88. Take-home points ▶ We can accurately infer phylogenies with shared divergences ▶ Generalizing tree space avoids spurious support incorrect relationships and improves MCMC mixing ▶ Among Philippine gekkonids, we found support for shared divergences predicted by sea-level changes
  • 89. Open science: everything is available. . . Software: ▶ Phycoeval: github.com/phyletica/ecoevolity (release coming soon) Open-Science Notebooks: ▶ Phycoeval analyses: github.com/phyletica/phycoeval-experiments ▶ Gecko RADseq: github.com/phyletica/gekgo
  • 91. Moving forward: Theory/methods ▶ Develop process-based and trait-dependent distributions over the space of generalized trees ▶ “Birth-death-burst” model
  • 92. Moving forward: Theory/methods ▶ Develop process-based and trait-dependent distributions over the space of generalized trees ▶ “Birth-death-burst” model ▶ Extend generalized tree distribution to trees that are not ultrametric
  • 93. Moving forward: Theory/methods ▶ Develop process-based and trait-dependent distributions over the space of generalized trees ▶ “Birth-death-burst” model ▶ Extend generalized tree distribution to trees that are not ultrametric nextstrain.org J. Hadfield et al. (2018). Bioinformatics 34: 4121–4123
  • 94. Moving forward: Theory/methods ▶ Develop process-based and trait-dependent distributions over the space of generalized trees ▶ “Birth-death-burst” model ▶ Extend generalized tree distribution to trees that are not ultrametric ▶ Couple generalized tree distribution with other phylogenetic likelihood models nextstrain.org J. Hadfield et al. (2018). Bioinformatics 34: 4121–4123
  • 95. Moving forward: Applications nextstrain.org J. Hadfield et al. (2018). Bioinformatics 34: 4121–4123
  • 96. Moving forward: Applications Epidemiological dynamics of “super-spreading” events during the COVID-19 pandemic nextstrain.org J. Hadfield et al. (2018). Bioinformatics 34: 4121–4123
  • 97. Moving forward: Applications Epidemiological dynamics of “super-spreading” events during the COVID-19 pandemic ▶ Spread at social gatherings creates shared and multifurcating divergences in the viral “transmission tree” nextstrain.org J. Hadfield et al. (2018). Bioinformatics 34: 4121–4123
  • 98. Moving forward: Applications Epidemiological dynamics of “super-spreading” events during the COVID-19 pandemic ▶ Spread at social gatherings creates shared and multifurcating divergences in the viral “transmission tree” ▶ Estimate rate of shared divergences as proxy for spread via social gatherings nextstrain.org J. Hadfield et al. (2018). Bioinformatics 34: 4121–4123
  • 99. Moving forward: Applications Epidemiological dynamics of “super-spreading” events during the COVID-19 pandemic ▶ Spread at social gatherings creates shared and multifurcating divergences in the viral “transmission tree” ▶ Estimate rate of shared divergences as proxy for spread via social gatherings ▶ Test if this varies over time, among regions, and among variants of SARS-CoV-2 nextstrain.org J. Hadfield et al. (2018). Bioinformatics 34: 4121–4123
  • 100. © 2007 Boris Kulikov boris-kulikov.blogspot.com
  • 102. Come work with us! Adaptive Comp Bio Internships! Scan to internship listing: www.adaptivebiotech.com/career-listings
  • 103. Thanks everyone! ▶ Thanks Devang and Duke GCB & CBB! ▶ Bryan Howie and my team at Adaptive ▶ Phyletica Lab (the Phyleticians) ▶ Mark Holder ▶ Rafe Brown ▶ Cam Siler ▶ Lee Grismer Computation: ▶ Alabama Supercomputer Authority ▶ Auburn University Hopper Cluster Funding: Photo credits: ▶ Rafe Brown ▶ Perry Wood, Jr. ▶ PhyloPic