Phylogeny

Transcript

  • 1. Phylogenetic analysis for molecular sequence data João C. Setubal Virginia Bioinformatics Institute Virginia Tech June 2011 07/11/11 J. C. Setubal
  • 2. Outline
    • What is the biological question?
    • What input sequences should be used?
    • Analysis pipeline: steps and components
    • Output visualization
    • Output interpretation
  • 3. Biological questions
    • How do oomycete species relate to one another and to other species?
    • What is the history of a particular gene?
      • Gene trees vs. species trees
      • Lateral Gene Transfer
    • Other questions
  • 4. Credit: www.apsnet.org
  • 5. Input sequences
    • Assuming we have reliable genome annotations, find related genes by similarity (e.g. by BLASTp)
  • 6. Pipeline
    • Multiple sequence alignment
    • Alignment editing
    • Phylogeny reconstruction
    • Visualization
  • 7. Multiple Sequence Alignment
  • 8. Multiple Sequence Alignment
    • Concepts
      • DNA or amino acids
      • Aligned sites (a column) should be homologous
      • Scoring matrices: BLOSUM, PAM
      • Algorithm
      • Formats: clustal, FASTA, MSF, NEXUS, PHYLIP
      • http://molecularevolution.org/resources/fileformats/converting
    • Programs
      • MUSCLE
      • ClustalW/X
      • COBALT (NCBI)
      • T-Coffee
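Moving an alignment between the formats listed above is mostly a matter of rewriting headers. The sketch below is illustrative only: it converts an aligned FASTA file to relaxed sequential PHYLIP (one taxon per line, names of any length), which some but not all programs accept. The function name `fasta_alignment_to_phylip` is an assumption made for this example, not part of any of the programs named on the slide.

```python
def fasta_alignment_to_phylip(fasta_text: str) -> str:
    """Convert aligned FASTA text to relaxed sequential PHYLIP text."""
    names, seqs = [], []
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            fields = line[1:].split()
            if not fields:
                raise ValueError("FASTA header without an identifier")
            names.append(fields[0])  # keep only the identifier, drop the description
            seqs.append("")
        elif not names:
            raise ValueError("sequence data before the first '>' header")
        else:
            seqs[-1] += line
    if not names:
        raise ValueError("no sequences found")

    # Every row of an alignment must have the same number of columns.
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences differ in length; input is not an alignment")

    width = max(len(n) for n in names)
    lines = [f"{len(names)} {length}"]  # PHYLIP header: taxon count, column count
    for name, seq in zip(names, seqs):
        lines.append(f"{name.ljust(width)}  {seq}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(fasta_alignment_to_phylip(">s1\nAC-T\n>seq2\nACGT\n"), end="")
```

Strict PHYLIP limits names to 10 characters, so long identifiers may need shortening before use with older programs.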
  • 9. Input sequences
    • Should be related to each other
    • Cannot be too long (less than ~10 kb)
    • Not too many (less than ~100)
    • (numbers vary depending on program and on computer)
    • FASTA format is best
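The rough limits above can be checked before submitting sequences to an alignment program. This is a minimal sketch: the function names and the default thresholds (10,000 residues, 100 sequences) are assumptions taken from the approximate figures on the slide, and real limits depend on the program and the computer.

```python
from typing import Dict, List


def read_fasta(text: str) -> Dict[str, str]:
    """Parse FASTA text into {identifier: sequence}."""
    records: Dict[str, str] = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            fields = line[1:].split()
            if not fields:
                raise ValueError("FASTA header without an identifier")
            name = fields[0]
            if name in records:
                raise ValueError(f"duplicate identifier: {name}")
            records[name] = ""
        elif name is None:
            raise ValueError("sequence data before the first '>' header")
        else:
            records[name] += line
    return records


def check_input(records: Dict[str, str],
                max_len: int = 10_000,
                max_count: int = 100) -> List[str]:
    """Return warnings for inputs at or beyond the (assumed) size limits."""
    warnings = []
    if len(records) >= max_count:
        warnings.append(f"{len(records)} sequences (limit ~{max_count})")
    for name, seq in records.items():
        if len(seq) >= max_len:
            warnings.append(f"{name}: {len(seq)} residues (limit ~{max_len})")
    return warnings


if __name__ == "__main__":
    recs = read_fasta(">geneA\nMKTAYIAK\n>geneB\nMKTAHIAK\n")
    print(check_input(recs) or "input looks fine")
```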
  • 10. Alignment editing Credit: R. Dixon
  • 11. Alignment editing
    • Concepts
      • Certain columns may be uninformative
      • Sometimes humans can see better alignments
    • Programs
      • Jalview or SeaView: for manual editing
      • Gblocks: for automatic editing
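One simple form of automatic editing is to drop columns that are mostly gaps. The sketch below shows that idea only; it is not the Gblocks algorithm, which uses more elaborate conservation criteria. The function name and the 0.5 threshold are assumptions chosen for illustration.

```python
from typing import Dict


def drop_gappy_columns(alignment: Dict[str, str],
                       max_gap_fraction: float = 0.5,
                       gap_chars: str = "-.") -> Dict[str, str]:
    """Keep only columns whose gap fraction is at most max_gap_fraction."""
    if not alignment:
        return {}
    seqs = list(alignment.values())
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences differ in length; input is not an alignment")

    keep = []
    for i in range(length):
        gaps = sum(1 for s in seqs if s[i] in gap_chars)
        if gaps / len(seqs) <= max_gap_fraction:
            keep.append(i)

    # Rebuild every row from the retained column indices.
    return {name: "".join(seq[i] for i in keep)
            for name, seq in alignment.items()}


if __name__ == "__main__":
    aln = {"a": "AC-T", "b": "A--T", "c": "AG-T"}
    print(drop_gappy_columns(aln))
```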
  • 12. Phylogeny reconstruction Credit: R. Dixon
  • 13. Phylogeny reconstruction
    • Concepts
      • Topology and branch lengths
      • Rooted vs. unrooted trees
        • Outgroup method
  • 14. A tree and a cladogram Credit: Wattam et al. 2011
  • 15. Phylogeny reconstruction
    • Methods
      • Distance
      • Parsimony
      • Maximum likelihood
      • Bayesian
    • The inevitable black box
  • 16. Phylogeny reconstruction
    • Programs
      • Distance
        • Neighbor-joining, UPGMA
      • Parsimony
        • PHYLIP, PAUP
      • Maximum likelihood
        • RAxML, PhyML
      • Bayesian
        • MrBayes
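To make the distance approach less of a black box, here is a small teaching sketch of UPGMA on uncorrected p-distances (the fraction of differing sites). It is not a substitute for the programs listed above: UPGMA assumes a constant rate of evolution across lineages, and p-distances ignore multiple substitutions at the same site. The function names are assumptions made for this example.

```python
from itertools import combinations
from typing import Dict


def p_distance(a: str, b: str, gap_chars: str = "-.") -> float:
    """Fraction of differing sites, ignoring columns with a gap in either row."""
    pairs = [(x, y) for x, y in zip(a, b)
             if x not in gap_chars and y not in gap_chars]
    if not pairs:
        raise ValueError("no ungapped columns in common")
    return sum(x != y for x, y in pairs) / len(pairs)


def upgma(alignment: Dict[str, str]) -> str:
    """Build a UPGMA tree from an alignment and return it as a Newick string."""
    names = list(alignment)
    if len(names) < 2:
        raise ValueError("need at least two sequences")
    seqs = [alignment[n] for n in names]

    # Each cluster is stored as (Newick label, number of leaves, node height).
    clusters = {i: (names[i], 1, 0.0) for i in range(len(names))}
    dist = {frozenset((i, j)): p_distance(seqs[i], seqs[j])
            for i, j in combinations(range(len(names)), 2)}
    next_id = len(names)

    while len(clusters) > 1:
        # Join the two closest clusters.
        i, j = min(combinations(sorted(clusters), 2),
                   key=lambda pair: dist[frozenset(pair)])
        height = dist[frozenset((i, j))] / 2
        label_i, size_i, height_i = clusters.pop(i)
        label_j, size_j, height_j = clusters.pop(j)

        # Distance to the new cluster is the size-weighted average.
        for k in clusters:
            dist[frozenset((next_id, k))] = (
                size_i * dist[frozenset((i, k))]
                + size_j * dist[frozenset((j, k))]
            ) / (size_i + size_j)

        label = (f"({label_i}:{height - height_i:.4f},"
                 f"{label_j}:{height - height_j:.4f})")
        clusters[next_id] = (label, size_i + size_j, height)
        next_id += 1

    return next(iter(clusters.values()))[0] + ";"


if __name__ == "__main__":
    print(upgma({"A": "AAAAAAAA", "B": "AAAAAAAT", "C": "AAAATTTT"}))
```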
  • 17. Tree visualization: formats
    • Newick, NEXUS
    • (((erHomoC:0.28006,erCaelC:0.22089):0.40998,(erHomoA:0.32304,(erpCaelC:0.58815,((erHomoB:0.5807,erCaelB:0.23569):0.03586,erCaelA:0.38272):0.06516):0.03492):0.14265):0.63594,(TRXHomo:0.65866,TRXSacch:0.38791):0.32147,TRXEcoli:0.57336);
    • http://molecularevolution.org/resources/treeformats
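A Newick string such as the one above nests clades in parentheses and attaches a branch length after each colon. The following is a minimal sketch of a reader for that basic form; it assumes unquoted labels and no bracketed comments, and the names `parse_newick` and `leaf_names` are made up for this example.

```python
from typing import List


def parse_newick(text: str) -> dict:
    """Parse a simple Newick string into nested dicts (name, length, children)."""
    s = "".join(text.split()).rstrip(";")  # whitespace is not significant here
    pos = 0

    def node() -> dict:
        nonlocal pos
        children = []
        if pos < len(s) and s[pos] == "(":
            pos += 1
            children.append(node())
            while pos < len(s) and s[pos] == ",":
                pos += 1
                children.append(node())
            if pos >= len(s) or s[pos] != ")":
                raise ValueError("unbalanced parentheses")
            pos += 1
        # Optional label, then optional ':length'.
        start = pos
        while pos < len(s) and s[pos] not in "(),:":
            pos += 1
        name = s[start:pos]
        length = None
        if pos < len(s) and s[pos] == ":":
            pos += 1
            start = pos
            while pos < len(s) and s[pos] not in "(),":
                pos += 1
            length = float(s[start:pos])
        return {"name": name, "length": length, "children": children}

    root = node()
    if pos != len(s):
        raise ValueError("unexpected trailing characters")
    return root


def leaf_names(tree: dict) -> List[str]:
    """List leaf labels in left-to-right order."""
    if not tree["children"]:
        return [tree["name"]]
    return [name for child in tree["children"] for name in leaf_names(child)]


if __name__ == "__main__":
    tree = parse_newick("((A:0.1,B:0.2):0.05,C:0.3);")
    print(leaf_names(tree))
```

A root with three children, as in the tree on this slide, is the usual way to write an unrooted tree in Newick.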
  • 18. Tree visualization: tools
    • Seaview
    • Interactive Tree of Life http://itol.embl.de
    • http://en.wikipedia.org/wiki/List_of_phylogenetic_tree_visualization_software
  • 19. All-in-one: phylogeny.fr
  • 20. Phylogeny.fr (2)
  • 21. Seaview
    • http://pbil.univ-lyon1.fr/software/seaview.html
    • Gouy, M., Guindon, S. & Gascuel, O. (2010) SeaView version 4: a multiplatform graphical user interface for sequence alignment and phylogenetic tree building. Molecular Biology and Evolution 27(2):221-224.
    • Galtier, N., Gouy, M. & Gautier, C. (1996) SEAVIEW and PHYLO_WIN: two graphic tools for sequence alignment and molecular phylogeny. Comput. Appl. Biosci., 12:543-548.
  • 22. JALVIEW http://www.jalview.org/
  • 23. Interpretation
    • Trees are just hypotheses
    • They could suffer from GIGO
    • Most likely tree may not be the true tree
    • Confidence in the topology
      • Bootstrap values
        • Should be above 0.7 (70%)
    • Branch lengths: # of substitutions per site
    • It’s always a good idea to try more than one reconstruction method
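Bootstrap values come from resampling alignment columns with replacement, rebuilding a tree from each resampled alignment, and counting how often each clade reappears. The sketch below covers the resampling and counting steps only; tree building is left to whichever program is used. The function names, and the representation of a clade as a frozenset of leaf names, are assumptions for illustration.

```python
import random
from typing import Dict, FrozenSet, Iterable, Set


def bootstrap_replicate(alignment: Dict[str, str],
                        rng: random.Random) -> Dict[str, str]:
    """Resample alignment columns with replacement, keeping rows in step."""
    if not alignment:
        raise ValueError("empty alignment")
    length = len(next(iter(alignment.values())))
    if any(len(s) != length for s in alignment.values()):
        raise ValueError("sequences differ in length; input is not an alignment")
    # The same column indices are applied to every sequence.
    cols = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[i] for i in cols)
            for name, seq in alignment.items()}


def clade_support(clade: FrozenSet[str],
                  replicate_clades: Iterable[Set[FrozenSet[str]]]) -> float:
    """Fraction of replicate trees that contain the given clade."""
    replicates = list(replicate_clades)
    if not replicates:
        raise ValueError("no replicates")
    return sum(clade in clades for clades in replicates) / len(replicates)


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so the example is repeatable
    print(bootstrap_replicate({"a": "ACGT", "b": "TGCA"}, rng))
    ab = frozenset({"a", "b"})
    print(clade_support(ab, [{ab}, set(), {ab}, {ab}]))
```

Under the rule of thumb on this slide, a clade with support of 0.75 would pass the 0.7 threshold.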
  • 24. Misc. topics
    • Taxonomy is not phylogeny
    • DNA vs protein
      • DNA is more sensitive
      • But organisms must be closely related
      • 3rd codon position is less informative
    • Supermatrix approach
      • Alignment concatenation
    • Lateral gene transfer and Network methods
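The supermatrix approach joins several per-gene alignments end to end, one row per taxon. A minimal sketch follows, assuming each alignment is a dict of taxon name to aligned sequence; a taxon missing from a gene is padded with gap characters. The function name and the choice of "-" as the padding character are assumptions for this example.

```python
from typing import Dict, List


def concatenate_alignments(alignments: List[Dict[str, str]],
                           missing: str = "-") -> Dict[str, str]:
    """Concatenate per-gene alignments into one supermatrix."""
    taxa = sorted({taxon for aln in alignments for taxon in aln})
    supermatrix = {taxon: "" for taxon in taxa}
    for aln in alignments:
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError("each block must be a non-empty alignment "
                             "with rows of equal length")
        width = lengths.pop()
        for taxon in taxa:
            # Pad taxa that lack this gene so all rows stay the same length.
            supermatrix[taxon] += aln.get(taxon, missing * width)
    return supermatrix


if __name__ == "__main__":
    gene1 = {"x": "AC", "y": "AG"}
    gene2 = {"x": "TTT", "z": "TTA"}
    for taxon, row in concatenate_alignments([gene1, gene2]).items():
        print(taxon, row)
```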
  • 25. Taxonomy
  • 26. Ciccarelli et al., Science, 2006
  • 27. Eisen & Wu, Genome Biology, 2008
  • 28. Lateral Gene Transfer Networks. Kloesges et al., Molecular Biology and Evolution, 2011
  • 29. Additional Resource
    • http://www.megasoftware.net/
  • 30. Books
    • Bioinformatics. Baxevanis and Ouellette (Eds.) Wiley-Interscience, 2005 (3rd edition), ch. 14
    • D. Mount. Bioinformatics. CSHL Press, 2004 (2nd edition), ch. 7
    • The phylogenetic handbook. Lemey, Salemi and Vandamme (Eds.) Cambridge University Press, 2009 (2nd edition)