Phylogenomic Convergence Detection - Evolutionary Biology Meeting in Marseille 2014


Invited talk presented at the 18th EBM in Marseille, 16th September 2014.

I outline the state-of-the-art in methods of genomic convergence detection, including adaptive molecular convergence, and highlight some of the next challenges in developing these techniques, including recent results.

Published in: Science

  1. Convergence for everyone? Detecting genomic adaptive convergence: initial results, lessons & perspectives • 16th September 2014 • Joe Parker, Queen Mary University of London
  2. Adaptive molecular convergence • Background & definition • Site-based methods • Tree-based methods • Combined approaches • Sampling / phylogenies as parameters • Future
  3. Lab Interests • Ecology and evolution of traits • Echolocation, sociality • NGS data for population genetics and phylogenomics
  4. Defining molecular convergence • It isn’t: – Divergence (adaptive or neutral) – Conservation or purifying selection – Retention of ancestral states with secondary changes in outgroups – ‘Neutral’ homoplasy • It ought to be: – ‘Adaptive’ homoplasies – ‘Excess’ homoplasies
  5. Prestin • Gene phylogeny recovers (paraphyletic) mammalian echolocators as monophyletic (Liu et al. (2010) Curr. Biol. 20:R53) • Functional convergence of parallel changes N7T & I384T demonstrated in vitro (Liu et al. (2014) MBE 31(9):2415)
  6. Methods
  7. Methods • Species phylogeny & inputs • Selection detection • Site-based convergence detection • Tree-based convergence detection
  8. Site-based methods • Look at tips
  9. Site-based methods • Look at tips • Reconstruct ancestral changes
  10. Lysozyme • Convergent and parallel substitutions in stomach lysozymes of advanced ruminants • Parsimony (‘over-estimate’) and Bayesian (‘under-estimate’) methods • Zhang & Kumar (1997) MBE 14:527
  11. Site-based methods • Look at tips • Reconstruct ancestral changes • Pairwise (conv) ∝ (div) changes • BEB posterior probabilities P(conv|data), P(div|data) • Castoe et al. (2009) PNAS 106(22):8986
  12. Tree-based methods
  13. Tree-based methods • de novo tree search – Inference error – Signal : noise – Multiple phylogenies
  14. Tree-based methods • ΔSSLS (null − alternative): likelihood support comparison
  15. • 22 mammals, 2326 loci, ~600,000 sites • Convergence signals across genome • Loci linked to sensory perception • Parker et al. (2013) Nature 502:228
      Excerpt from Parker et al. (2013): “… for convergence; see Supplementary Fig. 2). We quantified the extent of sequence convergence at each locus by taking the mean of its ΔSSLS values, and found 824 loci with mean support for H1 and 392 for H2. Using simulations we confirmed that these convergent signals were not due to neutral processes and were robust to the substitution model used (see Supplementary Methods). We ranked the mean ΔSSLS for all 2,326 loci under both convergence hypotheses and, to assess the performance of our method, inspected the rank positions of seven hearing genes that have previously been shown to exhibit convergence and/or adaptation in echolocating mammals: prestin (Slc26a5), Tmc1, Kcnq4 (Kqt-4), Pjvk (Dfnb59), otoferlin, Pcdh15 and Cdh23 (see Methods). Prestin was ranked 43rd (H1) and 22nd (H2), whereas several other loci were also ranked highly in the distribution of convergence support values (see Fig. 1b). In addition to these, we also found several other hearing genes in the top 5% supporting …”
      “… bottlenose dolphin, is greater in hearing genes than in other genes (for locus selection, see Methods). For each phylogenetic hypothesis, we averaged the mean ΔSSLS values of all 21 genes in our data set that are listed as linked to either hearing and/or deafness in any taxon based on published functional annotations (see Supplementary Information). By comparing our observed values to null distributions of corresponding values obtained by randomization, we found that hearing genes had significantly more negative average values than expected by chance for bat–dolphin convergence (H2: z = −0.0194, P < 0.05). We repeated this method for 75 genes listed as involved in vision and/or blindness, and found support, although weaker, in both cases of phenotypic convergence (z = −0.0020, P ≤ 0.055 and z = −0.0097, P ≤ 0.09). Loci previously reported to have association with echolocation had strong support by randomization for both hypotheses (P ≤ 0.01 in both cases).”
      Figure 1 | Convergence hypotheses and genomic distribution of support. a, For each locus, the goodness-of-fit of three separate phylogenetic hypotheses was considered: (left) H0, the accepted species phylogeny based on recent findings (for example, refs 14, 23–25); (top-right panel) H1, or ‘bat–bat […] convergence signal across 2,326 loci in 14–22 representative mammalian taxa, as measured by locus-wise mean site-specific likelihood support for the species topology (H0) over (left) the ‘bat–bat’ hypothesis uniting echolocating bats (that is, ΔSSLS (H1)) and (right) bat–dolphin hypothesis (that is, ΔSSLS (H2)).
      [Figure panels: a, trees for Hypothesis H0 (species tree), Hypothesis H1 (‘bat–bat convergence’) and Hypothesis H2 (‘bat–dolphin convergence’); b, histograms of ΔSSLS (H1) and ΔSSLS (H2), n = 2,326 loci each, with labelled loci including Prestin, Dfnb59, Tmc1 and Pcdh15.]
  16. Combined methods
  17. Trees and sites methods • Correlate selection (dN/dS) and incongruence (ΔSSLS) signals
  18. Genomic approaches • Pool information across sites • Orthology, paralogy • [Figure: distribution of genomic convergence, various hypotheses; x-axis: mean locus site-specific likelihood support for H0, ΔSSLS (H0 − Ha), from −1.0 (support for Ha, the alternative phylogeny) to 1.0 (support for H0, the species phylogeny); y-axis ticks 0 to 500]
  19. Genomic approaches (same content as slide 18)
  20. Genomic approaches (same content as slide 18)
  21. Genomic approaches (same content as slide 18)
  22. Genomic approaches (same content as slide 18)
  23. Genomic approaches (same content as slide 18)
  24. Interpretation • Notional convergence detected across genome, or not at all • Relative measure • Strength-of-evidence
  25. Sampling
  26. Which Trees?
  27. Which Trees? • Choice of hypothesis, subtly different from usual practice • If we accept that tree space distance is important… • … hypotheses are parameters • Enumerate over trees?
  28. Future
  29. On the horizon • Models: – Null model – Alternative / convergent model • Phylogeny methods: – Enumerated / unrestricted phylogenies – Tree space ‘distance’
  30. Conclusion • Strong evidence that molecular convergence, or something like our best definition of it, is a pervasive force • Very early work; cf. early attempts to estimate ω versus current dN/dS tests
  31. Thanks • Georgia Tsagkogeorga (1), Kalina Davies (1), James Cotton (2), Elia Stupka (3) & Steve Rossiter (1) – (1) School of Biological and Chemical Sciences, Queen Mary, University of London – (2) Wellcome Trust Sanger Institute – (3) Center for Translational Genomics and Bioinformatics, San Raffaele Institute, Milan • Chris Walker & Dan Traynor, Queen Mary GridPP High-throughput Cluster • Chaz Mein & Anna Terry, Barts and The London Genome Centre • Mahesh Pancholi, Seb Bailey, Xiuguang Mao & Chris Faulkes, School of Biological and Chemical Sciences • European Research Council; BBSRC (UK); Queen Mary, University of London • (R-L): Joe Parker; Georgia Tsagkogeorga; Kalina Davies; Steve Rossiter; Xiuguang Mao; Seb Bailey
  32. Further information • References: 1. Zhang & Kumar (1997) MBE 14:527; 2. Li et al. (2008) PNAS 105(37):13959; 3. Castoe et al. (2009) PNAS 106(22):8986; 4. Liu et al. (2010) Curr. Biol. 20:R53; 5. Parker et al. (2013) Nature 502:228; 6. Liu et al. (2014) MBE 31(9):2415 • Resources: – Lab: evolve.sbcs.qmul.ac.uk/rossiter – SVN: bit.ly/1m96pXM – email: j.d.parker@qmul.ac.uk
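
The ΔSSLS measure on the tree-based slides can be sketched in a few lines. This is a minimal illustration, not the pipeline used in the talk. It assumes the per-site log-likelihoods for each locus under H0 and Ha have already been produced by some phylogenetics package, and it assumes the sign convention ΔSSLS = lnL(H0) − lnL(Ha), so that negative values favour the alternative (convergence) topology. The function names and the toy loci are hypothetical.

```python
from statistics import mean


def delta_ssls(site_lnl_h0, site_lnl_ha):
    """Per-site log-likelihood difference, H0 minus Ha (assumed sign convention)."""
    if len(site_lnl_h0) != len(site_lnl_ha):
        raise ValueError("both hypotheses must be scored on the same sites")
    return [l0 - la for l0, la in zip(site_lnl_h0, site_lnl_ha)]


def mean_delta_ssls(site_lnl_h0, site_lnl_ha):
    """Locus-wise mean of the per-site differences."""
    return mean(delta_ssls(site_lnl_h0, site_lnl_ha))


def rank_loci(loci):
    """Rank loci from strongest support for Ha (most negative) to strongest for H0.

    loci maps a locus name to a pair (site lnL under H0, site lnL under Ha).
    Returns the ordered names and the dict of mean scores.
    """
    scores = {name: mean_delta_ssls(h0, ha) for name, (h0, ha) in loci.items()}
    return sorted(scores, key=scores.get), scores


# Toy, made-up values purely to show the call pattern.
toy_loci = {
    "locus_a": ([-2.0, -3.0], [-1.5, -2.5]),  # Ha fits better at both sites
    "locus_b": ([-1.0, -1.0], [-2.0, -1.5]),  # H0 fits better at both sites
}
order, scores = rank_loci(toy_loci)
```

Ranking by the locus mean keeps the measure relative, which matches the "relative measure" and "strength-of-evidence" reading on the Interpretation slide: a score is only meaningful against the distribution over all loci.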

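
The excerpt on slide 15 compares the average ΔSSLS of a gene set (for example, hearing genes) with null distributions obtained by randomization. One simple way to do that is sketched below, under stated assumptions: the null is built by drawing random locus sets of the same size without replacement, the test is one-sided (more negative means more support for the convergence hypothesis), and an add-one correction keeps the p-value above zero. The exact randomization scheme in Parker et al. (2013) is not given on the slides, so the function name, parameters and defaults here are hypothetical.

```python
import random
from statistics import mean


def randomization_p(scores, gene_set, n_draws=10000, seed=0):
    """One-sided randomization test on the mean score of a gene set.

    scores   : dict mapping locus name -> mean ΔSSLS for that locus
    gene_set : list of locus names forming the set of interest
    Returns (observed mean, p-value), where the p-value estimates how often
    a random set of the same size has a mean at least as negative.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    names = list(scores)
    k = len(gene_set)
    observed = mean(scores[g] for g in gene_set)
    hits = 0
    for _ in range(n_draws):
        draw = rng.sample(names, k)  # same-sized set, without replacement
        if mean(scores[g] for g in draw) <= observed:
            hits += 1
    # Add-one correction: the observed set counts as one draw.
    return observed, (hits + 1) / (n_draws + 1)
```

A call would look like `randomization_p(scores, ["Prestin", "Tmc1"], n_draws=2000)`, where `scores` holds one mean ΔSSLS per locus; the gene names there are only placeholders for whatever annotation-derived set is being tested.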