
- 1. Introduction The Coalescent Bayesian Inference MCMC Visualisation Tree Shape Taxa Position
- 2. Pre-Darwin phylogenetic trees
- 3. The sole figure of The Origin
- 4. The Cytochrome C Gene Tree (Fitch, 1967)
- 5. • Processes of speciation • Evolution of traits • Biogeography • Epidemiology • Co-Evolution (host/parasite) • Domestication
- 6. Selecting a “Duck”
- 7. The Molecular Clock (early ’60s)
- 8. Models of Sequence Evolution: JC69 model (Jukes and Cantor, 1969)
- 9. The Kingman Coalescent (1982)
- 10. Wright-Fisher Population (1931) • The individuals are randomly sampled from a population of size N. • The parent of any individual is chosen uniformly at random from all potential parents.
- 11. The Coalescent: The larger the population, the longer (on average) you have to travel back in time to reach the common ancestor.
- 12. The Coalescent for multiple individuals: The waiting time for the first common ancestor of two individuals out of m (going backwards in time) is exponential with a rate of $\binom{m}{2} / N_e$, where $N_e$ is the Wright-Fisher effective population size.
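
The waiting-time rule on slide 12 can be sketched as a small simulation. This is a minimal illustration, not code from the talk: the function names, the choice of m = 10 and Ne = 1000, and the use of Python's standard `random` module are all assumptions made for the example. It draws one exponential waiting time per coalescent event and compares the simulated mean time to the most recent common ancestor with the sum of the expected waiting times.

```python
import random
from math import comb

def coalescent_waiting_times(m, ne, rng):
    """Sample waiting times between successive coalescent events.

    While k lineages remain, the time until the next pair coalesces is
    exponential with rate C(k, 2) / ne (hypothetical helper for this sketch).
    """
    times = []
    for k in range(m, 1, -1):
        rate = comb(k, 2) / ne
        times.append(rng.expovariate(rate))
    return times

def expected_tmrca(m, ne):
    """Expected time to the common ancestor of all m lineages: sum of 1/rate."""
    return sum(ne / comb(k, 2) for k in range(m, 1, -1))

# Illustrative values only; time is in the same units as ne.
rng = random.Random(1)
m, ne = 10, 1000.0
samples = [sum(coalescent_waiting_times(m, ne, rng)) for _ in range(20000)]
mean_tmrca = sum(samples) / len(samples)

print(f"expected: {expected_tmrca(m, ne):.1f}  simulated: {mean_tmrca:.1f}")
```

Doubling `ne` doubles every expected waiting time, which is the quantitative version of the statement on slide 11 that larger populations push the common ancestor further back.
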
- 13. From Models to Inference
- 14. Bayes’ Theorem (a Reminder): $P(A \wedge B) = P(A)\,P(B \mid A)$
- 15. Bayes’ Theorem (2): $P(B)\,P(A \mid B) = P(A \wedge B) = P(A)\,P(B \mid A)$, so $P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}$
- 16. Bayesian Inference
- 17. Models (so far) • Substitution model: A stochastic process for the evolution (change) of genetic data (sequences) over time. • Clock model: How substitution rates change over time. • Coalescent model: A stochastic process for the ancestral relationship between a group of homologous sequences from several individuals.
- 18. Models (Math Notation) • Coalescent model: $f(T \mid N_e)$ • Substitution model: $f(G \mid T)$, where G is the gene (sequence data) and T is the ancestral relationships (tree).
- 19. The Biological Species Concept: The conventional definition of “a species” amongst evolutionary biologists is “a group of organisms whose members interbreed among themselves, but are separated from other groups by genetically-based barriers to gene flow.” (Jerry Coyne, “Why Evolution is True” blog)
- 20. The Species “tree”
- 21. The Gene(s) tree
- 22. Species Tree Ancestral Reconstruction
- 23. Multiple Individuals from each Species
- 24. Multispecies coalescent: Kingman Coalescent per Species Tree Branch
- 25. Multiple Independent Loci
- 26. The Multispecies Posterior: $P(S \mid D) = \int_g P(S, g \mid D) \propto \int_g P(D \mid S, g)\,P(S, g) = \int_g P(D \mid g)\,P(S, g) = \int_g P(D \mid g)\,P(g \mid S)\,P(S)$
- 27. A Complex Posterior: $P(S \mid D) = \int_g f(S, g \mid D) = \frac{\int_g f(D \mid S, g)\,f(S, g)}{f(D)}$
- 28. Problem 1: $f(D)$, the prior probability of obtaining data D. $P(S \mid D) = \frac{\int_g f(D \mid S, g)\,f(S, g)}{f(D)}$. We don’t know the value of $f(D)$: $f(D) = \int_{g,S} f(D \mid S, g)\,f(S, g)$. However, it is a constant.
- 29. Problem 2: The Whole Damn Thing. The posterior is a distribution defined by a complex multidimensional integral.
- 30. Enter MCMC: Markov Chain Monte Carlo (MCMC) is a class of methods for stochastically sampling from probability distributions, based on constructing a Markov chain.
- 31. Very short history of MCMC • 1953: Metropolis algorithm published in Journal of Chemical Physics (Metropolis et al.) • 1970: Hastings algorithm in Biometrika (Hastings) • 1974: Gibbs sampler and Hammersley-Clifford theorem paper by Besag • 1980s: MCMC algorithms were taken up in image analysis and spatial statistics, but were not popular elsewhere due to the lack of computing power • 1995: Reversible jump algorithm in Biometrika (Green) (Source: groundtruth.info/AstroStat/slog/2008/mcmc-historyo)
- 32. MCMC in a nutshell
- 33. MCMC in a nutshell (2)
- 34. MCMC in a nutshell (3)
- 35. MCMC in a nutshell (4)
- 36. MCMC in a nutshell (5): [Figure: two states, A with weight 2 and B with weight 1.] If from either A or B we propose to go to A or B with equal probability, then the flow from A to B is 2/3 · 1/4 = 1/6, and the flow from B to B is 1/3 · 1/2 = 1/6, giving 1/3 in total into B, which equals B's share of the target distribution.
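
The two-state example on slide 36 can be checked empirically with a short chain. The sketch below is an illustration under stated assumptions: the weights 2 and 1 for A and B are read off the slide's arithmetic, the function names are invented for this example, and the acceptance rule is the standard Metropolis rule for a symmetric proposal. The chain should spend roughly 2/3 of its time in A and 1/3 in B.

```python
import random

# Unnormalised target weights, as implied by the flows on the slide.
weights = {"A": 2.0, "B": 1.0}

def metropolis_step(state, rng):
    """One step: propose A or B with equal probability, then accept or reject."""
    proposal = rng.choice(["A", "B"])
    accept = min(1.0, weights[proposal] / weights[state])
    return proposal if rng.random() < accept else state

def run_chain(n_steps, seed=0):
    """Return the fraction of steps spent in each state."""
    rng = random.Random(seed)
    state = "A"
    counts = {"A": 0, "B": 0}
    for _ in range(n_steps):
        state = metropolis_step(state, rng)
        counts[state] += 1
    return {s: c / n_steps for s, c in counts.items()}

freqs = run_chain(200_000)
print(freqs)
```

Note that the sampler only ever uses the ratio of weights, so the normalising constant (the troublesome f(D) of slide 28) is never needed.
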
- 37. MCMC in a nutshell (6): Hastings Ratio (x to y) $= \frac{p(y \to x)\,f(y)}{p(x \to y)\,f(x)}$. So far we had $p(y \to x) = p(x \to y)$, that is, the probability of going from x to y was equal to that of going back.
- 38. MCMC in a nutshell (7): If at A we always propose to go to B, but from B we propose A or B with equal probability, that is, p(A → A) = 0, p(A → B) = 1, p(B → A) = 1/2, p(B → B) = 1/2, then HR(A → B) = (1/2 · 1)/(1 · 2) = 1/4 and HR(B → A) = 4.
- 39. MCMC in a nutshell (8): [Figure: two states, A with weight 2 and B with weight 1.] Flow from B to A is 1/3 · 1/2 = 1/6 and from A to A is 2/3 · 3/4 = 1/2, giving 2/3 in total.
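
The asymmetric-proposal arithmetic on slides 37 to 39 can be reproduced exactly with rational numbers. This is a sketch for checking the numbers, assuming the weights f(A) = 2 and f(B) = 1 and the proposal table from slide 38; the names `hastings_ratio` and `transition` are made up for the example, and the acceptance probability min(1, HR) is the usual Metropolis-Hastings rule rather than something stated on the slides.

```python
from fractions import Fraction as F

f = {"A": F(2), "B": F(1)}                      # unnormalised target
q = {"A": {"A": F(0), "B": F(1)},               # proposal table p(x -> y)
     "B": {"A": F(1, 2), "B": F(1, 2)}}

def hastings_ratio(x, y):
    """HR(x -> y) = p(y -> x) f(y) / (p(x -> y) f(x))."""
    return (q[y][x] * f[y]) / (q[x][y] * f[x])

def transition(x, y):
    """Probability of moving from x to y in one Metropolis-Hastings step."""
    if x != y:
        if q[x][y] == 0:
            return F(0)
        return q[x][y] * min(F(1), hastings_ratio(x, y))
    # Staying put: whatever is not spent on accepted moves away from x.
    return 1 - sum(transition(x, z) for z in f if z != x)

total = sum(f.values())
pi = {s: w / total for s, w in f.items()}        # normalised target
flow_into_A = pi["A"] * transition("A", "A") + pi["B"] * transition("B", "A")

print(hastings_ratio("A", "B"), hastings_ratio("B", "A"), flow_into_A)
```

The flow into A comes out as 2/3, equal to A's target probability, so the target is left unchanged by one step even though the proposal is lopsided.
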
- 40. Tree(s) Visualisation
- 41. Traditional Tree Visualization [Figure: tree with node labels between 0.38 and 1; tip labels Cyan, con, oax, FLSJ, sum, int, coast, ins, ult, couc, pot, woll, aria, arib.]
- 42. Traditional Tree Visualization (2)
- 43. Species Tree with Population Sizes
- 44. Species Tree with Gene Trees
- 45. Species Tree (DensiTree)
- 46. The “Star Tree”
- 47. Species Tree (DensiTree)
- 48. Taxa Order Matters (When Drawing Multiple Trees) [Figure: three trees labelled 73%, 17% and 10%, with tip orders 0 1 2 3 4 5 6, 0 1 5 6 2 3 4 and 2 3 4 5 6 0 1.]
- 49. Some Orders are Better than Others [Figure: the same three trees (73%, 17%, 10%) with tip orders 0 1 2 3 4 5 6, 0 1 5 6 2 3 4 and 2 3 4 5 6 0 1, plus the order 2 3 4 0 1 5 6.]
- 50. Disadvantages: • Population size changes in one branch have a visual effect on other branches. • Fails when trying to show the whole posterior à la DensiTree. • No obvious way to extend to trees with constant population size per branch.
- 51. The Imperial AT-AT Tree: Provide some space between branches.
- 52. A Double Act: Target species tree (blue) and BEAST posterior summary (orange).
- 53. Species Tree with Constant Population Sizes: To extend to constant branches, we need a rule to place the bottom of the branch on top of the descendant branches. We use the proportion rule.
- 54. Position, Position, Position: A species tree specifies heights and widths. The challenge is to pick good X-axis positions. The star tree builds the tree from the root towards the tips. Building from the tips towards the root is simpler when drawing species trees. When building from the tips, the descendants' X-positions determine the parent position.
- 55. Position, Position, Position: However, there are many ways to place nodes. Here are four of them: • Descendants Mean: halfway between direct descendants. • Tips Mean: average of all tips in the sub-tree. • Middle: halfway between the rightmost tip of the left sub-tree and the leftmost tip of the right sub-tree. • Balanced by Population: at the point minimizing the difference between branch bottom and top centers.
- 56. Node Positioning: The methods are similar for balanced trees. The difference is in the handling of unbalanced trees. [Figure panels: D-Mean, T-Mean, Middle, Balanced.]
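
Three of the four placement rules from slide 55 can be sketched in a few lines. The following is an illustration only: the nested-tuple tree encoding, the evenly spaced tip positions and the function names are assumptions for this example, and the "Balanced by Population" rule is left out because it needs branch widths that this toy encoding does not carry.

```python
def tips(tree):
    """Return the tips of a nested-tuple binary tree in left-to-right order."""
    if not isinstance(tree, tuple):
        return [tree]
    left, right = tree
    return tips(left) + tips(right)

def node_x(tree, tip_x, rule):
    """X-position of the root of `tree` under one placement rule."""
    if not isinstance(tree, tuple):
        return tip_x[tree]
    left, right = tree
    if rule == "descendants_mean":
        # Halfway between the two direct descendants.
        return (node_x(left, tip_x, rule) + node_x(right, tip_x, rule)) / 2
    if rule == "tips_mean":
        # Average of all tips in the sub-tree.
        xs = [tip_x[t] for t in tips(tree)]
        return sum(xs) / len(xs)
    if rule == "middle":
        # Halfway between rightmost tip on the left and leftmost on the right.
        return (tip_x[tips(left)[-1]] + tip_x[tips(right)[0]]) / 2
    raise ValueError(f"unknown rule: {rule}")

rules = ("descendants_mean", "tips_mean", "middle")

# An unbalanced ("caterpillar") tree with evenly spaced tips.
unbalanced = ((("a", "b"), "c"), "d")
tip_x = {name: float(i) for i, name in enumerate(tips(unbalanced))}
positions = {rule: node_x(unbalanced, tip_x, rule) for rule in rules}

# A balanced tree on the same tips, for comparison.
balanced = (("a", "b"), ("c", "d"))
balanced_positions = {rule: node_x(balanced, tip_x, rule) for rule in rules}

print(positions)
print(balanced_positions)
```

On the balanced tree all three rules put the root at the same place, while on the caterpillar tree they disagree, which is the behaviour slide 56 describes.
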
- 57. Node Positioning [Figure panels: D-Mean, Balanced.]
- 58. Species Trees Posterior [Figure: posterior species trees; tip labels oax, con, int, ins, coast, sum, FLSJ, ult, couc, pot, woll, arib, aria, Cyan.]
- 59. Gene Trees Within Species Trees: Preliminary. Next we would like to draw the gene tree within the species tree. Hurdle 1: Obtain a suitable gene tree. The gene tree has to be compatible with the species tree. This is not a problem when drawing a specific MCMC state, but is a problem when using summary trees.
- 60. Tips Positioning. Hurdle 2: Branches with non-constant width complicate the positioning of tips. [Figure: example trees.]
- 61. Tips Positioning (automatic): The placement ensures that extrema points are at least a set distance apart (horizontally). [Figure: example tree.]
- 62. Tips Positioning (automatic): Even with Python, it is far from trivial to implement. Remember, we want a tight fit, and the placement should work for all modes of internal positioning. Basically, we build the tree from the bottom up by joining clades. The X position of the extrema points of a clade is a linear function of the spacing between its sub-trees (where the spacing inside the two sub-trees is fixed). So each extremum point sets a lower limit on the spacing, and the largest is taken as the final separation. The best way to pick the minimum distance between extrema points for a tree still needs to be worked out.
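
The "largest lower limit wins" idea from slide 62 can be illustrated with a deliberately simplified sketch. It is not the implementation from the talk: it assumes each sub-tree's facing edge is summarised as a hypothetical `{height: x}` profile sampled at shared heights, and that joining two clades only shifts the right sub-tree horizontally. Under those assumptions every shared height yields one lower limit on the shift, and the maximum is the tightest separation that respects the required gap.

```python
def min_separation(left_right_edge, right_left_edge, gap):
    """Smallest shift of the right sub-tree that keeps the clades apart.

    left_right_edge: {height: x} right-hand extrema of the left sub-tree
    right_left_edge: {height: x} left-hand extrema of the right sub-tree
    Both profiles are in each sub-tree's own coordinates; only heights
    present in both constrain the spacing (an assumption of this sketch).
    """
    shared = set(left_right_edge) & set(right_left_edge)
    if not shared:
        raise ValueError("profiles share no heights")
    # After shifting by s, a right-hand point sits at x + s, so each height
    # gives the lower limit  s >= left_x - right_x + gap.
    limits = [left_right_edge[h] - right_left_edge[h] + gap for h in shared]
    return max(limits)

# Made-up profiles for illustration.
left_profile = {0.0: 2.0, 1.0: 2.5, 2.0: 1.5}
right_profile = {0.0: 0.0, 1.0: -0.75, 2.0: 0.5}
shift = min_separation(left_profile, right_profile, gap=0.5)

print(shift)
```

Because the shifted positions are linear in the shift, the constraints never interact, which is why a single pass taking the maximum is enough here.
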
- 63. Drawing the Gene Tree. Hurdle 3: A suitable policy for drawing the gene tree. We reuse the ideas of the Star Tree. Given the position of an internal node, the left/right branches are drawn as straight lines towards the “middle” of the left/right sub-trees. But we still need to handle the species transitions. From the bottom up, we (linearly) map the lineages leaving the branch to the top of the branch. The top of the clade is then put in the middle of the mapped taxa.
- 64. Visual Clutter [Figure: “Contained Tree” with gene tips g0 to g15 inside speciesA to speciesE; axis ticks from 0.0 to 900.0.]
- 65. Gene Tree Inside Species Tree: As-Is [Figure: gene tree inside species tree; tip labels ID2, ID3, ID4, ID0, ID1, ID5, ID6.]
- 66. Reducing Visual Clutter (1) [Figure: tip orders a b x and b a x.] The size (number of tips) of the sub-tree (b,x) is 2, but the span (number of tips between leftmost and rightmost) is 3.
- 67. Reducing Visual Clutter (2): For every sequential arrangement of the gene tree taxa we can get a rough measure of the amount of crossings, $\sum_{n \in \text{Internal Nodes}} \left(\operatorname{span}(n) - \operatorname{size}(n)\right)$, where size(n) is the number of taxa in the sub-tree and span(n) is the number of taxa in the group bounded by the leftmost and rightmost tips of the sub-tree. The difference is the excess taxa, the number of potential lineages that may need to cross out of the clade. Note that valid arrangements depend on the orientation of the species tree, so optimization should be over both species ordering and gene tip arrangements compatible with that order. Since the number of arrangements is typically large, we resort to multiple tries of hill climbing. The number of tries might be fixed or bounded by time.
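
The crossing measure from slides 66 and 67 is easy to compute, and a toy hill climber shows how it can drive the search. This is a sketch under loud assumptions: the nested-tuple tree encoding and all function names are invented here, the climber runs a single try of random tip swaps, and it ignores the constraint that arrangements must be compatible with the species tree ordering, which a real implementation would have to enforce.

```python
import random

def clades(tree):
    """Return (tip sets of all internal nodes, tip set of this sub-tree)."""
    if not isinstance(tree, tuple):
        return [], {tree}
    found, members = [], set()
    for child in tree:
        sub_found, sub_members = clades(child)
        found += sub_found
        members |= sub_members
    found.append(members)
    return found, members

def excess(tree, order):
    """Sum over internal nodes of span(n) - size(n) for a tip ordering."""
    index = {tip: i for i, tip in enumerate(order)}
    total = 0
    for members in clades(tree)[0]:
        places = [index[t] for t in members]
        span = max(places) - min(places) + 1
        total += span - len(members)
    return total

def hill_climb(tree, order, steps, rng):
    """Swap two random tips per step, keeping swaps that do not raise the score."""
    best = list(order)
    best_score = excess(tree, best)
    for _ in range(steps):
        i, j = rng.sample(range(len(best)), 2)
        candidate = list(best)
        candidate[i], candidate[j] = candidate[j], candidate[i]
        score = excess(tree, candidate)
        if score <= best_score:
            best, best_score = candidate, score
    return best, best_score

# The example from slide 66: sub-tree (b, x) with tips drawn as b a x.
tree = ("a", ("b", "x"))
start = ["b", "a", "x"]
best, best_score = hill_climb(tree, start, steps=50, rng=random.Random(0))

print(excess(tree, start), best, best_score)
```

Running several such climbs from different starting orders and keeping the lowest score would correspond to the "multiple tries" the slide mentions.
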
- 68. Unresolved Conflicts: Optimized tree on the right.
