Report

Share

Follow

•0 likes•74 views

•0 likes•74 views

Report

Share

Download to read offline

Slides for Duke CBB seminar

Follow

- 1. Generalizing phylogenetics to infer patterns predicted by processes of diversification Jamie Oaks Auburn University phyletica.org @jamoaks Scan for slides: www.slideshare.net/jamieoaks7 © 2007 Boris Kulikov boris-kulikov.blogspot.com
- 2. © 2007 Boris Kulikov boris-kulikov.blogspot.com
- 3. © 2007 Boris Kulikov boris-kulikov.blogspot.com
- 4. © 2007 Boris Kulikov boris-kulikov.blogspot.com Generalizing phylogenetics to infer patterns of shared evolutionary events
- 5. Phyletica Lab Postdocs ▶ Perry Wood, Jr ▶ Brian Folt ▶ Jesse Grismer Graduate students ▶ Tashitso Anamza ▶ Matt Buehler ▶ Kerry Cobb ▶ Kyle David ▶ Saman Jahangiri ▶ Randy Klabacka ▶ Morgan Muell ▶ Tanner Myers ▶ Claire Tracy ▶ Breanna Sipley ▶ Aundrea Westfall The Phyleticians Undergraduate students ▶ Laura Lewis ▶ Mary Wells ▶ Hailey Whitaker ▶ Noah Yawn ▶ Charlotte Benedict ▶ Eric Carbo ▶ Ryan Cook ▶ Andrew DeSana ▶ Miles Horne ▶ Jacob Landrum ▶ Nadia L’Bahy ▶ Jorge Lopez-Perez ▶ Holden Smith ▶ Virginia White ▶ Kayla Wilson
- 6. ▶ Phylogenetics is rapidly becoming the statistical foundation of biology © 2007 Boris Kulikov boris-kulikov.blogspot.com
- 7. ▶ Phylogenetics is rapidly becoming the statistical foundation of biology ▶ “Big data” present exciting possibilities and challenges © 2007 Boris Kulikov boris-kulikov.blogspot.com
- 8. ▶ Phylogenetics is rapidly becoming the statistical foundation of biology ▶ “Big data” present exciting possibilities and challenges ▶ Many opportunities to develop new ways to study biology in light of phylogeny © 2007 Boris Kulikov boris-kulikov.blogspot.com
- 10. ▶ Assumption: All processes of diversification affect each lineage independently phylopic.org CC0
- 13. Biogeography ▶ Environmental changes that affect whole communities of species
- 14. Biogeography ▶ Environmental changes that affect whole communities of species Genome evolution ▶ Duplication of a chromosome segment harboring gene families
- 15. Biogeography ▶ Environmental changes that affect whole communities of species Genome evolution ▶ Duplication of a chromosome segment harboring gene families Epidemiology ▶ Transmission at social gatherings
- 16. Biogeography ▶ Environmental changes that affect whole communities of species Genome evolution ▶ Duplication of a chromosome segment harboring gene families Epidemiology ▶ Transmission at social gatherings Endosymbiont evolution (e.g., parasites, microbiome) ▶ Speciation of the host ▶ Co-colonization of new host species
- 17. Why account for shared divergences? J. R. Oaks et al. (2022). PNAS 119: e2121036119
- 18. Why account for shared divergences? 1. Improve inference True history τ1 τ2 τ3 Current tree model τ1 τ2 τ3τ4 τ5 τ6 τ7 τ8 J. R. Oaks et al. (2022). PNAS 119: e2121036119
- 19. Why account for shared divergences? 1. Improve inference 2. Provide a framework for studying processes of co-diversification True history τ1 τ2 τ3 Current tree model τ1 τ2 τ3τ4 τ5 τ6 τ7 τ8 J. R. Oaks et al. (2022). PNAS 119: e2121036119
- 20. Biogeography ▶ Environmental changes that affect whole communities of species Genome evolution ▶ Duplication of a chromosome segment harboring gene families Epidemiology ▶ Transmission at social gatherings Endosymbiont evolution (e.g., parasites, microbiome) ▶ Speciation of the host ▶ Co-colonization of new host species
- 21. Approaches to the problem A pairwise approach (keep it “simple”) A fully phylogenetic approach Tashitso Anamza Tanner Myers Randy Klabacka Perry Wood, Jr. Claire Tracy Kerry Cobb Matt Buehler Nadia L’bahy
- 22. Approaches to the problem A pairwise approach (keep it “simple”) A fully phylogenetic approach Dr. Perry Wood, Jr.
- 23. Challenges to accounting for shared divergences True history τ1 τ2 τ3 Current tree model τ1 τ2 τ3τ4 τ5 τ6 τ7 τ8 J. R. Oaks et al. (2022). PNAS 119: e2121036119
- 24. Challenges to accounting for shared divergences 1. Likelihood for genomic data is tricky True history τ1 τ2 τ3 Current tree model τ1 τ2 τ3τ4 τ5 τ6 τ7 τ8 J. R. Oaks et al. (2022). PNAS 119: e2121036119
- 25. Challenges to accounting for shared divergences 1. Likelihood for genomic data is tricky 2. Lots of possible trees of different dimensions True history τ1 τ2 τ3 Current tree model τ1 τ2 τ3τ4 τ5 τ6 τ7 τ8 J. R. Oaks et al. (2022). PNAS 119: e2121036119
- 27. A A G G G A
- 28. A A G G G A
- 29. A A G G G A ▶ Conditional on “population tree” (T), model “gene trees” (G) using coalescent
- 30. A A G G G A ▶ Conditional on “population tree” (T), model “gene trees” (G) using coalescent ▶ Coalescent is a stochastic model of shared inheritance (continuous-time Markov chain = CTMC)
- 31. A A G G G A ▶ Conditional on “population tree” (T), model “gene trees” (G) using coalescent ▶ Coalescent is a stochastic model of shared inheritance (continuous-time Markov chain = CTMC) ▶ Gene tree branching patterns are a function of population size
- 32. v u G A A A G G G A ▶ Conditional on “population tree” (T), model “gene trees” (G) using coalescent ▶ Coalescent is a stochastic model of shared inheritance (continuous-time Markov chain = CTMC) ▶ Gene tree branching patterns are a function of population size ▶ Conditional on G, model mutation as a CTMC
- 33. v u G A A A G G G A ▶ Conditional on “population tree” (T), model “gene trees” (G) using coalescent ▶ Coalescent is a stochastic model of shared inheritance (continuous-time Markov chain = CTMC) ▶ Gene tree branching patterns are a function of population size ▶ Conditional on G, model mutation as a CTMC ▶ Genetic characters provide information about G
- 34. v u G A A A G G G A ▶ Conditional on “population tree” (T), model “gene trees” (G) using coalescent ▶ Coalescent is a stochastic model of shared inheritance (continuous-time Markov chain = CTMC) ▶ Gene tree branching patterns are a function of population size ▶ Conditional on G, model mutation as a CTMC ▶ Genetic characters provide information about G ▶ G informs T (population sizes, divergence times, and relationships)
- 36. v u G A A A G G G A ▶ “Standard” hierarchical approach
- 37. v u G A A A G G G A ▶ “Standard” hierarchical approach ▶ Calculate p(genetic data | G) × p(G | T)
- 38. v u G A A A G G G A ▶ “Standard” hierarchical approach ▶ Calculate p(genetic data | G) × p(G | T) ▶ Use numerical integration (MCMC) to co-estimate both
- 39. v u G A A A G G G A ▶ “Standard” hierarchical approach ▶ Calculate p(genetic data | G) × p(G | T) ▶ Use numerical integration (MCMC) to co-estimate both ▶ But, G and T are highly correlated
- 40. v u G A A A G G G A ▶ “Standard” hierarchical approach ▶ Calculate p(genetic data | G) × p(G | T) ▶ Use numerical integration (MCMC) to co-estimate both ▶ But, G and T are highly correlated ▶ As the number of loci (gene trees) increases, MCMC falls apart
- 41. v u G A A A G G G A ▶ “Standard” hierarchical approach ▶ Calculate p(genetic data | G) × p(G | T) ▶ Use numerical integration (MCMC) to co-estimate both ▶ But, G and T are highly correlated ▶ As the number of loci (gene trees) increases, MCMC falls apart ▶ Can we integrate G analytically?
- 42. v u G A A A G G G A (n, r ) = (3, 1) (n, r ) = (3, 2) (n, r ) = (1, 0) (n, r ) = (1, 0) (n, r ) = (2, 0) 1 T. Schmelzer and L. N. Trefethen (2007). Electronic Transactions on Numerical Analysis 29: 1–18 2 D. Bryant et al. (2012). Molecular Biology and Evolution 29: 1917–1932
- 43. v u G A A A G G G A (n, r ) = (3, 1) (n, r ) = (3, 2) (n, r ) = (1, 0) (n, r ) = (1, 0) (n, r ) = (2, 0) 1 T. Schmelzer and L. N. Trefethen (2007). Electronic Transactions on Numerical Analysis 29: 1–18 2 D. Bryant et al. (2012). Molecular Biology and Evolution 29: 1917–1932
- 44. v u G A A A G G G A (n, r ) = (3, 1) (n, r ) = (3, 2) (n, r ) = (1, 0) (n, r ) = (1, 0) (n, r ) = (2, 0) Q = (1, 0) (1, 1) (2, 0) (2, 1) · · · (n, n) (1, 0) · · · · · (1, 1) · · · · · (2, 0) · · · · · (2, 1) · · · · · . . . (n, n) · · · · · 1 T. Schmelzer and L. N. Trefethen (2007). Electronic Transactions on Numerical Analysis 29: 1–18 2 D. Bryant et al. (2012). Molecular Biology and Evolution 29: 1917–1932
- 45. v u G A A A G G G A (n, r ) = (3, 1) (n, r ) = (3, 2) (n, r ) = (1, 0) (n, r ) = (1, 0) (n, r ) = (2, 0) Q = (1, 0) (1, 1) (2, 0) (2, 1) · · · (n, n) (1, 0) · · · · · (1, 1) · · · · · (2, 0) · · · · · (2, 1) · · · · · . . . (n, n) · · · · · Q(n,r);(n,r−1) = (n − r + 1)v, mutation, Q(n,r);(n,r+1) = (r + 1)u, mutation, Q(n,r);(n−1,r) = (n−1−r)n 2Ne (u+v) , coalescence, Q(n,r);(n−1,r−1) = (r−1)n 2Ne (u+v) , coalescence, Q(n,r);(n,r) = − (n−1)n 2Ne (u+v) − (n − r)v − ru. 1 T. Schmelzer and L. N. Trefethen (2007). Electronic Transactions on Numerical Analysis 29: 1–18 2 D. Bryant et al. (2012). Molecular Biology and Evolution 29: 1917–1932
- 46. v u G A A A G G G A (n, r ) = (3, 1) (n, r ) = (3, 2) (n, r ) = (1, 0) (n, r ) = (1, 0) (n, r ) = (2, 0) Q = (1, 0) (1, 1) (2, 0) (2, 1) · · · (n, n) (1, 0) · · · · · (1, 1) · · · · · (2, 0) · · · · · (2, 1) · · · · · . . . (n, n) · · · · · Q(n,r);(n,r−1) = (n − r + 1)v, mutation, Q(n,r);(n,r+1) = (r + 1)u, mutation, Q(n,r);(n−1,r) = (n−1−r)n 2Ne (u+v) , coalescence, Q(n,r);(n−1,r−1) = (r−1)n 2Ne (u+v) , coalescence, Q(n,r);(n,r) = − (n−1)n 2Ne (u+v) − (n − r)v − ru. ▶ eQt to keep track of all conditional probabilites along each branch (Carathéodory-Fejér method1 ) 1 T. Schmelzer and L. N. Trefethen (2007). Electronic Transactions on Numerical Analysis 29: 1–18 2 D. Bryant et al. (2012). Molecular Biology and Evolution 29: 1917–1932
- 47. v u G A A A G G G A (n, r ) = (3, 1) (n, r ) = (3, 2) (n, r ) = (1, 0) (n, r ) = (1, 0) (n, r ) = (2, 0) Q = (1, 0) (1, 1) (2, 0) (2, 1) · · · (n, n) (1, 0) · · · · · (1, 1) · · · · · (2, 0) · · · · · (2, 1) · · · · · . . . (n, n) · · · · · Q(n,r);(n,r−1) = (n − r + 1)v, mutation, Q(n,r);(n,r+1) = (r + 1)u, mutation, Q(n,r);(n−1,r) = (n−1−r)n 2Ne (u+v) , coalescence, Q(n,r);(n−1,r−1) = (r−1)n 2Ne (u+v) , coalescence, Q(n,r);(n,r) = − (n−1)n 2Ne (u+v) − (n − r)v − ru. ▶ eQt to keep track of all conditional probabilites along each branch (Carathéodory-Fejér method1 ) ▶ At root, get likelihood of population tree integrated over all possible gene trees and mutational histories2 1 T. Schmelzer and L. N. Trefethen (2007). Electronic Transactions on Numerical Analysis 29: 1–18 2 D. Bryant et al. (2012). Molecular Biology and Evolution 29: 1917–1932
- 48. Challenges to accounting for shared divergences 1. Likelihood for genomic data is tricky 2. Lots of possible trees of different dimensions True history τ1 τ2 τ3 Current tree model τ1 τ2 τ3τ4 τ5 τ6 τ7 τ8 J. R. Oaks et al. (2022). PNAS 119: e2121036119
- 50. Generalizing tree space nτ = 3 A B C D A C B D A D B C A B C D A B D C A C B D A C D B A D B C A D C B B C A D B C D A B D A C B D C A C D A B C D B A
- 51. Generalizing tree space nτ = 3 nτ = 2 nτ = 1 A B C D A C B D A D B C A B C D A C B D A D B C A B C D A B D C A C B D A B C D A C B D A D B C A C D B A D B C A D C B B C A D B D A C C D A B A B C D B C A D B C D A B D A C A B C D A B D C A C D B B D C A C D A B C D B A B C D A
- 52. Generalizing tree space nτ = 3 nτ = 2 nτ = 1 A B C D A C B D A D B C A B C D A C B D A D B C A B C D A B D C A C B D A B C D A C B D A D B C A C D B A D B C A D C B B C A D B D A C C D A B A B C D B C A D B C D A B D A C A B C D A B D C A C D B B D C A C D A B C D B A B C D A
- 53. Generalized tree distribution ▶ All topologies equally probable ▶ Parametric distribution on age of root ▶ Beta distributions on other divergence times τ0 τ1 ∼ Beta(τ0, τ3, α, β) * τ2 ∼ Beta(τ0, τ3, α, β) * τ3 ∼ Beta(τ0, τ4, α, β) τ4 ∼ Gamma(k, θ) * I H G F E D C B A t6 t3 t1 t2 t4 t5 J. R. Oaks et al. (2022). PNAS 119: e2121036119
- 54. Inferring trees with shared divergences Split Merge Reversible-jump MCMC J. R. Oaks et al. (2022). PNAS 119: e2121036119
- 55. Validating rjMCMC with 7-leaf tree 350 400 450 0.000 0.005 0.010 0.015 0.020 Number of times each topology sampled Density Binom(5e+07, 1/127,569) MCMC χ2 = 128,046 p = 0.172 The rjMCMC algorithms sample the expected generalized tree distribution J. R. Oaks et al. (2022). PNAS 119: e2121036119
- 56. Phylogenetic coevality Phyco val J. R. Oaks et al. (2022). PNAS 119: e2121036119 Ecoevolity Estimating evolutionary coevality J. R. Oaks (2019). Systematic Biology 68: 371–395 ▶ Tree model ▶ rjMCMC sampling of generalized tree distribution 1 D. Bryant et al. (2012). Molecular Biology and Evolution 29: 1917–1932
- 57. Phylogenetic coevality Phyco val J. R. Oaks et al. (2022). PNAS 119: e2121036119 Ecoevolity Estimating evolutionary coevality J. R. Oaks (2019). Systematic Biology 68: 371–395 ▶ Tree model ▶ rjMCMC sampling of generalized tree distribution ▶ Likelihood model ▶ CTMC model of characters evolving along genealogies ▶ Infer species trees by analytically integrate over genealogies1 1 D. Bryant et al. (2012). Molecular Biology and Evolution 29: 1917–1932
- 58. Phylogenetic coevality Phyco val J. R. Oaks et al. (2022). PNAS 119: e2121036119 Ecoevolity Estimating evolutionary coevality J. R. Oaks (2019). Systematic Biology 68: 371–395 ▶ Tree model ▶ rjMCMC sampling of generalized tree distribution ▶ Likelihood model ▶ CTMC model of characters evolving along genealogies ▶ Infer species trees by analytically integrate over genealogies1 ▶ Goal: Co-estimation of phylogeny and shared divergences from genomic data 1 D. Bryant et al. (2012). Molecular Biology and Evolution 29: 1917–1932
- 59. Does it work?
- 60. Methods: Simulations ▶ Simulated 100 data sets with 50,000 base pairs τ1 0.05 τ2 0.08 τ3 0.2 τ1 0.03 τ2 0.05 τ3 0.06 τ4 0.11 τ5 0.12 τ6 0.14 τ7 0.18 τ8 0.2
- 61. Methods: Simulations ▶ Simulated 100 data sets with 50,000 base pairs ▶ Analyzed each data set with: ▶ MG = Generalized tree model ▶ MIB = Independent-bifurcating tree model τ1 0.05 τ2 0.08 τ3 0.2 τ1 0.03 τ2 0.05 τ3 0.06 τ4 0.11 τ5 0.12 τ6 0.14 τ7 0.18 τ8 0.2
- 62. Methods: Simulations ▶ Simulated 100 data sets with 50,000 base pairs ▶ Analyzed each data set with: ▶ MG = Generalized tree model ▶ MIB = Independent-bifurcating tree model ▶ Simulated 100 data sets where topology and div times randomly drawn from MG and MIB τ1 0.05 τ2 0.08 τ3 0.2 τ1 0.03 τ2 0.05 τ3 0.06 τ4 0.11 τ5 0.12 τ6 0.14 τ7 0.18 τ8 0.2
- 63. 0.05 0.08 0.2 J. R. Oaks et al. (2022). PNAS 119: e2121036119
- 64. 0.0 0.5 1.0 0.05 0.0 0.5 1.0 0.08 0.2 0.0 0.5 1.0 0.0 0.5 1.0 0.0 0.5 1.0 0.0 0.5 1.0 J. R. Oaks et al. (2022). PNAS 119: e2121036119
- 65. τ1 0.05 τ2 0.08 τ3 0.2 t5 t2 t1 t3 t4 J. R. Oaks et al. (2022). PNAS 119: e2121036119
- 66. MG = Generalized model MIB = Independent-bifurcating model τ1 0.05 τ2 0.08 τ3 0.2 t5 t2 t1 t3 t4 Splitting t4 0.00 0.25 0.50 0.75 1.00 Posterior probability J. R. Oaks et al. (2022). PNAS 119: e2121036119
- 67. MG = Generalized model MIB = Independent-bifurcating model τ1 0.05 τ2 0.08 τ3 0.2 t5 t2 t1 t3 t4 0.000 0.002 0.004 0.006 0.008 0.010 0.012 p = 4.1 × 10−18 MG MIB Distance from true tree J. R. Oaks et al. (2022). PNAS 119: e2121036119
- 68. MG = Generalized model MIB = Independent-bifurcating model τ1 0.05 τ2 0.08 τ3 0.2 t5 t2 t1 t3 t4 0.000 0.002 0.004 0.006 0.008 0.010 0.012 p = 4.1 × 10−18 MG MIB Distance from true tree MG significantly better at inferring trees with shared divergences J. R. Oaks et al. (2022). PNAS 119: e2121036119
- 69. MG = Generalized model MIB = Independent-bifurcating model τ1 0.03 τ2 0.05 τ3 0.06 τ4 0.11 τ5 0.12 τ6 0.14 τ7 0.18 τ8 0.2 t8 t2 t1 t7 t4 t3 t6 t5 J. R. Oaks et al. (2022). PNAS 119: e2121036119
- 70. MG = Generalized model MIB = Independent-bifurcating model τ1 0.03 τ2 0.05 τ3 0.06 τ4 0.11 τ5 0.12 τ6 0.14 τ7 0.18 τ8 0.2 t8 t2 t1 t7 t4 t3 t6 t5 True topology 0.00 0.25 0.50 0.75 1.00 Posterior probability J. R. Oaks et al. (2022). PNAS 119: e2121036119
- 71. MG = Generalized model MIB = Independent-bifurcating model τ1 0.03 τ2 0.05 τ3 0.06 τ4 0.11 τ5 0.12 τ6 0.14 τ7 0.18 τ8 0.2 t8 t2 t1 t7 t4 t3 t6 t5 0.0000 0.0025 0.0050 0.0075 0.0100 0.0125 p = 0.36 MG MIB Distance from true tree J. R. Oaks et al. (2022). PNAS 119: e2121036119
- 72. MG = Generalized model MIB = Independent-bifurcating model τ1 0.03 τ2 0.05 τ3 0.06 τ4 0.11 τ5 0.12 τ6 0.14 τ7 0.18 τ8 0.2 t8 t2 t1 t7 t4 t3 t6 t5 0.0000 0.0025 0.0050 0.0075 0.0100 0.0125 p = 0.36 MG MIB Distance from true tree MG performs as well as true model when divergences are independent J. R. Oaks et al. (2022). PNAS 119: e2121036119
- 73. Results: random MG trees 0.5 1.0 1.5 2.0 2.5 0.5 1.0 1.5 2.0 2.5 p(TL ∈ CI) = 0.96 RMSE = 0.00486 True tree length Estimated tree length J. R. Oaks et al. (2022). PNAS 119: e2121036119
- 74. Results: random MG trees 0.5 1.0 1.5 2.0 2.5 0.5 1.0 1.5 2.0 2.5 p(TL ∈ CI) = 0.96 RMSE = 0.00486 True tree length Estimated tree length Shared divergence 0.00 0.25 0.50 0.75 1.00 Posterior probability J. R. Oaks et al. (2022). PNAS 119: e2121036119
- 75. Results: random MG trees 0.5 1.0 1.5 2.0 2.5 0.5 1.0 1.5 2.0 2.5 p(TL ∈ CI) = 0.96 RMSE = 0.00486 True tree length Estimated tree length Shared divergence 0.00 0.25 0.50 0.75 1.00 Posterior probability MG performs well with data simulated on random trees with shared divergences J. R. Oaks et al. (2022). PNAS 119: e2121036119
- 76. Results: random MIB trees Probability of incorrectly merged divergence times (true model = MIB) Merged times 0.00 0.25 0.50 0.75 1.00 FPR = 0.047 A Posterior probability 0.00 0.05 0.10 0.15 0.20 0.00 0.25 0.50 0.75 1.00 B Time difference Posterior probability 0.0 0.1 0.2 0.3 0.00 0.25 0.50 0.75 1.00 C Midpoint divergence time Posterior probability J. R. Oaks et al. (2022). PNAS 119: e2121036119
- 77. Results: random MIB trees Probability of incorrectly merged divergence times (true model = MIB) Merged times 0.00 0.25 0.50 0.75 1.00 FPR = 0.047 A Posterior probability 0.00 0.05 0.10 0.15 0.20 0.00 0.25 0.50 0.75 1.00 B Time difference Posterior probability 0.0 0.1 0.2 0.3 0.00 0.25 0.50 0.75 1.00 C Midpoint divergence time Posterior probability MG has low false positive rate J. R. Oaks et al. (2022). PNAS 119: e2121036119
- 78. MG = Generalized model MIB = Independent-bifurcating model 0.000 0.002 0.004 0.006 0.008 0.010 0.012 Ave std dev of split freq (ASDSF) Generalizing tree space improves MCMC convergence and mixing J. R. Oaks et al. (2022). PNAS 119: e2121036119
- 79. Scan for sea-level animation
- 80. Scan for sea-level animation
- 81. Scan for sea-level animation Did fragmentation of islands promote diversification?
- 82. Cyrtodactylus Gekko ©Rafe M. Brown 10 15 20 118 120 122 124 126 Longitude Latitude 10 15 20 118 120 122 124 126 Longitude Latitude ©Rafe M. Brown J. R. Oaks et al. (2022). PNAS 119: e2121036119
- 83. Cyrtodactylus Gekko ©Rafe M. Brown 10 15 20 118 120 122 124 126 Longitude Latitude 10 15 20 118 120 122 124 126 Longitude Latitude ©Rafe M. Brown 1702 loci 155,887 sites 1033 loci 94,813 sites J. R. Oaks et al. (2022). PNAS 119: e2121036119
- 84. Gekko 5°N 10°N 15°N 20°N 116°E 118°E 120°E 122°E 124°E 126°E Longitude Latitude D 116°E 118°E 120°E 122°E 124°E 126°E Longitude 0.28 0.55 0.19 0.35 0.81 1.0 n c i l P s l P 30 20 10 0 1 2 14 15 16 17 19 18 20 22 21 23 24 26 25 4 3 5 7 6 13 12 11 10 9 8 0.87 0.86 n c i l P s l P 30 20 10 0 C G. athymus G. carusadensis G. crombota G. ernstkelleri G. gigante G. mindorensis G. monarchus G. porosus G. romblon G. rossi G. sp. a G. sp. b 1 5 11 2 7 6 22 20 16 19 18 15 23 17 21 26 25 24 14 13 12 4 3 10 9 8 Oligocene Miocene Oligocene Miocene J. R. Oaks et al. (2022). PNAS 119: e2121036119
- 85. Cyrtodactylus D 5°N 10°N 15°N 20°N 116°E 118°E 120°E 122°E 124°E 126°E Longitude Latitude B 1.0 1 10 24 23 25 27 26 18 19 22 21 20 14 13 15 17 16 12 11 7 9 8 2 3 4 6 5 0.71 0.75 n c i l P s l P 30 20 10 0 A C C. annulatus C. gubaot C. jambangan C. mamanwa C. philippinicus C. redimiculus C. sumuroi C. tautbatorum 0.30 0.45 0.90 0.89 0.57 10 13 17 6 4 5 9 3 7 24 27 18 19 22 23 14 26 21 25 15 16 20 12 11 1 8 2 Oligocene Miocene J. R. Oaks et al. (2022). PNAS 119: e2121036119
- 86. Take-home points ▶ We can accurately infer phylogenies with shared divergences
- 87. Take-home points ▶ We can accurately infer phylogenies with shared divergences ▶ Generalizing tree space avoids spurious support incorrect relationships and improves MCMC mixing
- 88. Take-home points ▶ We can accurately infer phylogenies with shared divergences ▶ Generalizing tree space avoids spurious support incorrect relationships and improves MCMC mixing ▶ Among Philippine gekkonids, we found support for shared divergences predicted by sea-level changes
- 89. Open science: everything is available. . . Software: ▶ Phycoeval: github.com/phyletica/ecoevolity (release coming soon) Open-Science Notebooks: ▶ Phycoeval analyses: github.com/phyletica/phycoeval-experiments ▶ Gecko RADseq: github.com/phyletica/gekgo
- 91. Moving forward: Theory/methods ▶ Develop process-based and trait-dependent distributions over the space of generalized trees ▶ “Birth-death-burst” model
- 92. Moving forward: Theory/methods ▶ Develop process-based and trait-dependent distributions over the space of generalized trees ▶ “Birth-death-burst” model ▶ Extend generalized tree distribution to trees that are not ultrametric
- 93. Moving forward: Theory/methods ▶ Develop process-based and trait-dependent distributions over the space of generalized trees ▶ “Birth-death-burst” model ▶ Extend generalized tree distribution to trees that are not ultrametric nextstrain.org J. Hadfield et al. (2018). Bioinformatics 34: 4121–4123
- 94. Moving forward: Theory/methods ▶ Develop process-based and trait-dependent distributions over the space of generalized trees ▶ “Birth-death-burst” model ▶ Extend generalized tree distribution to trees that are not ultrametric ▶ Couple generalized tree distribution with other phylogenetic likelihood models nextstrain.org J. Hadfield et al. (2018). Bioinformatics 34: 4121–4123
- 95. Moving forward: Applications nextstrain.org J. Hadfield et al. (2018). Bioinformatics 34: 4121–4123
- 96. Moving forward: Applications Epidemiological dynamics of “super-spreading” events during the COVID-19 pandemic nextstrain.org J. Hadfield et al. (2018). Bioinformatics 34: 4121–4123
- 97. Moving forward: Applications Epidemiological dynamics of “super-spreading” events during the COVID-19 pandemic ▶ Spread at social gatherings creates shared and multifurcating divergences in the viral “transmission tree” nextstrain.org J. Hadfield et al. (2018). Bioinformatics 34: 4121–4123
- 98. Moving forward: Applications Epidemiological dynamics of “super-spreading” events during the COVID-19 pandemic ▶ Spread at social gatherings creates shared and multifurcating divergences in the viral “transmission tree” ▶ Estimate rate of shared divergences as proxy for spread via social gatherings nextstrain.org J. Hadfield et al. (2018). Bioinformatics 34: 4121–4123
- 99. Moving forward: Applications Epidemiological dynamics of “super-spreading” events during the COVID-19 pandemic ▶ Spread at social gatherings creates shared and multifurcating divergences in the viral “transmission tree” ▶ Estimate rate of shared divergences as proxy for spread via social gatherings ▶ Test if this varies over time, among regions, and among variants of SARS-CoV-2 nextstrain.org J. Hadfield et al. (2018). Bioinformatics 34: 4121–4123
- 100. © 2007 Boris Kulikov boris-kulikov.blogspot.com
- 102. Come work with us! Adaptive Comp Bio Internships! Scan to internship listing: www.adaptivebiotech.com/career-listings
- 103. Thanks everyone! ▶ Thanks Devang and Duke GCB & CBB! ▶ Bryan Howie and my team at Adaptive ▶ Phyletica Lab (the Phyleticians) ▶ Mark Holder ▶ Rafe Brown ▶ Cam Siler ▶ Lee Grismer Computation: ▶ Alabama Supercomputer Authority ▶ Auburn University Hopper Cluster Funding: Photo credits: ▶ Rafe Brown ▶ Perry Wood, Jr. ▶ PhyloPic
- 104. Questions? joaks@auburn.edu @jamoaks phyletica.org Scan for slides: www.slideshare.net/jamieoaks7 © 2007 Boris Kulikov boris-kulikov.blogspot.com