Phylogenetic Workflows

Phylogenetic Workflows: Tree Building and Post-tree Analyses
Given at the Plant Biology 2011 conference.

Published in Technology
  • Our understanding of the phylogeny of the half million known species of green plants has expanded dramatically over the past two decades. Assembling a comprehensive "tree of life" for them is a Grand Challenge. Part of that challenge is developing the infrastructure needed to view and use the tree of life and to put it into the hands of plant biologists.
  • Left tree: maple tree phylogeny from D. Ackerly. Left picture: Joe Felsenstein, ca. 1980. Right picture: Ranger cluster at TACC.
  • Distance matrix calculation compared to FastTree.

Transcript

  • 1. Phylogenetic Workflows: Tree Building and Post-tree Analyses. Naim Matasci, The iPlant Collaborative. Plant Biology 2011, August 6-10, 2011.
  • 2. Why is the tree of life important? “Knowledge of evolutionary relationships is fundamental to biology, yielding new insights across the plant sciences, from comparative genomics and molecular evolution, to plant development, to the study of adaptation, speciation, community assembly, and ecosystem functioning.”
  • 3. Nothing in biology makes sense except in the light of evolution. T. G. Dobzhansky
  • 4. Scalability Ackerly, 2009; J. Felsenstein, ca. 1980; Ranger Cluster at TACC
  • 5. iPlant Tree of Life Grand Challenge
    • Large phylogenetic inference
      • Building a tree of life for up to 500,000 green plants
    • Tree Visualization
      • Scalable visualization for small to large trees
    • Data Assembly and Integration
      • Acquiring, organizing, and processing the data
    • Taxonomic Intelligence
      • Sorting out different names for the same species
    • Tree Reconciliation
      • Resolving discordant gene and species trees
    • Trait Evolution
      • Using trees to understand how traits evolved
  • 6. Ancestral state of Hawaiian lobelioids: Lobelia niihauensis (Image: David Eickhoff); Cyanea leptostegia (Image: Karl Magnacca)
  • 8. Continuous Ancestral Character Estimation (Schluter et al. 1997, Paradis 2004) ?
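The slide names the method but does not show the calculation. As a rough illustration, here is a minimal Python sketch of one standard way to estimate the root state of a continuous trait under a Brownian motion model, using the weighted-average recursion from Felsenstein's independent contrasts. The tree encoding, the function name `root_state`, and the trait values are assumptions made for this example; the slides do not say that the CACE tool computes its estimates this way.

```python
def root_state(node):
    """Return (estimate, extra_variance) for `node` under Brownian motion.

    Assumed encoding (illustrative only):
      - a leaf is a number, the observed trait value;
      - an internal node is a pair of (child, branch_length) tuples.
    Branch lengths are assumed to be strictly positive.
    """
    if isinstance(node, (int, float)):
        return float(node), 0.0
    (left, left_len), (right, right_len) = node
    x1, extra1 = root_state(left)
    x2, extra2 = root_state(right)
    # Each child's branch is lengthened by the uncertainty of its own estimate.
    v1 = left_len + extra1
    v2 = right_len + extra2
    # Children on shorter branches get more weight (inverse-variance weighting).
    estimate = (x1 / v1 + x2 / v2) / (1.0 / v1 + 1.0 / v2)
    return estimate, v1 * v2 / (v1 + v2)


# Hypothetical tree ((A:1, B:1):1, C:2) with trait values A=2, B=4, C=9.
tree = (((2.0, 1.0), (4.0, 1.0)), 1.0), (9.0, 2.0)
estimate, _ = root_state(tree)
print(round(estimate, 4))
```

Only the value returned for the root is the maximum likelihood estimate under this model; the intermediate values use information from descendants alone.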
  • 9.
    • Obtain sequences
      • GetSeq
    • Align sequences
      • Muscle
    • Build Tree
      • FastTree (aML)
      • Ninja (NJ)
      • PHYLIP (MP, NJ, ML)
      • RAxML (ML)
    • Visualize Tree
      • iPlant Tree Viewer
    • Integrate Data
      • Lopper
      • TNRS
    • Run Analysis
      • CACE
      • DACE
      • Contrast
      • OUch
      • Picante
      • Penalized likelihood
  • 11.
    >gi|1835233|emb|Z83147.1| S.nepaulensis rbcL gene
    TTATTATACTCCTGAATAYGAAACCAAAGATACTGATATCTTGGCAGCATTCCGAGTAACTGCTCAGCCT
    GGAGTTCCACCCGAAGAAGCGGGGGCCGCGGTAGCTGCGGAATCTTCTACTGGTACATGGACAACTGTGT
    GGACCGATGGACTTACTAACCTTGATCGTTACAAAGGGCGATGCTACAACATAGAGCCCGTTGCTGGAGA
    AGAAAATCAATTTATTGCTTATGTAGCTTATCCTTTAGACCTTTTTGAAGAAGGTTCTGTTACTAACATG
    TTTACTTCCATTGTGGGTAATGTATTTGGGTTCAAAGCCCTGCGCGCTCTACGTCTGGAAGATCTGCGAA
    TCCCTACTGCGTATTGTAAAACTTTCCAAGGACCGCCTCATGGGATCCAAGTTGAAAGAGATAAATTGAA
    CAAGTATGGTCGTCCCTTGCTGGGATGTACTATTAAACCTAAATTGGGGTTATCGGCTAAAAACTACGGT
    AGAGCAGTTTATGAATGTCTACGCGGTGGGCTTGATTTTACCAAAGATGATGAGAACGTGAACTCCCAAC
    CATTTATGCGTTGGAGAGACCGTTTCGTATTTTGTGCCGAAGCAATTTTTAAAGCACAGTCTGAAACAGG
    TGAAATCAAAGGGCATTACTTGAATGCTACTGCAGGTACATGTGAAGAAATGATGAAAAGGGCTATATTT
    >gi|1835227|emb|Z83136.1| S.foetidissimum rbcL gene
    AAGTGTTGGATTCAAAGCGGGTGTTAAAGATTACAAATTGACTTATTATACTCCTGACTATGAAACCAAA
    GATACTGATATCTTGGCAGCATTCCGAGTAACTCCTCAACCTGGAGTTCCACCTGAAGAAGCAGGGGCCG
    CGGTAGCTGCCGAATCTTCTACTGGTACATGGACAACTGTGTGGACCGATGGACTTACTAGCCTTGATCG
    TTACAAAGGGCGATGCTACCACATCGAGCCCGTNGCTGGAGAAGAAAATCAATATATTGCTTATGTAGCT
    TATCCTTTAGACCTYTTTGAAGAAGGTTCTGTTACTAATATGTKNACTTCCATTGTGGGGAATGTATTTG
    GGTTCAAAGCCCTGCGTGCTTTACGTCTGGAAGATCTGCGAATCCCTCCTGCGTATTCTAAAACTTTCCA
    AGGACCGCCTCATGGCATCCAAGTTGAAAGAGATAAATTGAACAAGTACGGTCGTCCCCTGTTGGGATGT
    ACTATTAAACCTAAATTGGGGTTATCTGCTAAAAACTACGGTAGAGCGGTTTATGAATGTCTCCGCGGTG
    GACTTGATTTTACCAAAGATGATGAGAACGTGAACTCCCAACCATTTATGCGTTGGAGAGATCGTTTCTT
    ATTTTGTGCCGAAGCACTTTATAAAGCACAGGCTGAAACAGGTGAAATCAAAGGGCATTACTTGAATGCT
    >gi|1834456|emb|Z83132.1| G.urceolata rbcL gene
    AACTAAAGCGGGTGTTGGATTCAAAGCGGGTGTTAAAGATTACAAATTAACTTATTATACTCCTGACTAT
    GAAACCAAAGATACTGATATCTTGGCAGCATTCCGAGTAACTCCTCAACCTGGAGTTCCACCTGAAGAAG
    CGGGGGCCGCCGTAGCTGCCGAATCCTCCACTGGTACATGGACAACTGTGTGGACCGACGGACTTACTAG
    CCTTGATCGTTACAAAGGGCGATGCTACCACATCGAGCCCGTGGCTGGAGAAGAAAATCAATTTATTGCT
    TATGTAGCTTACCCTTTAGACCTTTTTGAAGAAGGTTCTGTTACTAACATGTTTACTTCCATTGTGGGTA
    ATGTATTTGGGTTCAAAGCCCTGCGCGCTCTACGTCTGGAAGATCTGCGAATCCCTGTTGCGTATGCTAA
    AACTTTCCAAGGGCCGCCTCATGGCATCCAAGTTGAAAGAGATAAATTGAATAAGTATGGTCGTCCCCTG
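The records on this slide are in FASTA format with NCBI-style headers of the form `gi|number|database|accession| description`. A minimal Python sketch for splitting such text into records might look like the following. The function names `parse_fasta` and `_make_record` are made up for this example and are not part of any iPlant tool, and the header layout is assumed from the three records shown.

```python
from typing import Dict, Iterator, List


def _make_record(header: str, chunks: List[str]) -> Dict[str, str]:
    """Combine a header line and its sequence lines into one record."""
    record = {"header": header, "sequence": "".join(chunks)}
    # Assumed header layout: gi|<gi number>|<db>|<accession>| <description>
    fields = header.split("|", 4)
    if len(fields) == 5 and fields[0] == "gi":
        record["gi"] = fields[1]
        record["accession"] = fields[3]
        record["description"] = fields[4].strip()
    return record


def parse_fasta(text: str) -> Iterator[Dict[str, str]]:
    """Yield one dict per FASTA record found in `text`."""
    header = None
    chunks: List[str] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield _make_record(header, chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield _make_record(header, chunks)


# First 40 bases of the first record on the slide, split over two lines.
example = """>gi|1835233|emb|Z83147.1| S.nepaulensis rbcL gene
TTATTATACTCCTGAATAYGAAACC
AAAGATACTGATATC
"""
records = list(parse_fasta(example))
print(records[0]["accession"], len(records[0]["sequence"]))
```

Ambiguity codes such as `Y`, `K`, and `N` in the slide's sequences pass through unchanged, since the sketch does not validate the alphabet.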
  • 12. Get Sequences
    • Retrieves nucleotide and amino acid sequences from NCBI's GenBank
    • Automatically includes species name and taxon ID
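The slides do not describe how Get Sequences talks to GenBank. One public route to the same data is NCBI's E-utilities `efetch` endpoint, and the sketch below only builds a request URL for it. The helper name `efetch_fasta_url` is hypothetical, and the accession numbers are the ones from the FASTA example on slide 11.

```python
from urllib.parse import urlencode

EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def efetch_fasta_url(accessions, db="nuccore"):
    """Build an efetch URL that returns the given accessions as FASTA text.

    Use db="nuccore" for nucleotide records and db="protein" for amino
    acid records.
    """
    params = {
        "db": db,
        "id": ",".join(accessions),
        "rettype": "fasta",
        "retmode": "text",
    }
    return EFETCH_URL + "?" + urlencode(params)


url = efetch_fasta_url(["Z83147.1", "Z83136.1", "Z83132.1"])
print(url)

# Downloading needs network access, so it is left commented out here:
#   from urllib.request import urlopen
#   fasta_text = urlopen(url).read().decode()
```

Keeping URL construction separate from the download makes the request easy to inspect before anything is sent to NCBI.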
  • 13. Get sequences DEMO
  • 23. Muscle DEMO
  • 31. Improved Tree Building Tools
    • NINJA/WINDJAMMER (Travis Wheeler)
      • Neighbor-Joining implementation that can analyze > 200K species
      • Six-day run time reduced 32-fold to 4.5 hours for the 220K species data set
      • Two- to three-day run time reduced 1,800-fold to 2 minutes for the distance matrix calculation on the 220K set
    • RAxML-Light (Alexandros Stamatakis)
      • Large-scale Maximum Likelihood implementation
      • 55K tree published (Stephen A. Smith et al., “Understanding angiosperm diversification using small and large phylogenetic trees,” American Journal of Botany 98, no. 3 (2011): 404-414)
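The fold reductions quoted for NINJA/WINDJAMMER follow directly from the before and after run times. A small Python check, using only the figures on the slide (the helper name `fold_speedup` is made up for this example):

```python
def fold_speedup(before, after):
    """Return how many times shorter `after` is than `before` (same units)."""
    return before / after


# Tree building: six days down to 4.5 hours, both expressed in hours.
nj_speedup = fold_speedup(6 * 24, 4.5)

# Distance matrix: two to three days down to 2 minutes, in minutes.
matrix_low = fold_speedup(2 * 24 * 60, 2)
matrix_high = fold_speedup(3 * 24 * 60, 2)

print(nj_speedup, matrix_low, matrix_high)
```

The six-day figure gives exactly 32-fold. The quoted 1,800-fold sits between the two-day and three-day bounds and corresponds to a starting run time of 2.5 days.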
  • 32. RAxML DEMO
  • 40. Tree Visualization
    • > 500K Taxa
    • Fast
    • Web based, platform independent
    • Semantic zooming
    • Metadata driven display of information
  • 41. iPlant Tree Viewer http://portnoy.iplantcollaborative.org/
  • 42. Live tree view demo
  • 50. Obstacles
  • 51. Lopper DEMO
  • 64.
    • Lobelia kauaensis
    • Lobelia villosa
    • Galeatella gloria-montis
    • Trematolobelia kauaiensis
    • Trematolobelia macrostachys
    • Lobelia hypoleuca
    • Neowimmeria yuccoides
    • Lobelia niihauensis
    • Brighamia insignis
    • Brighamia rockii
    • Delissea rhytidosperma
    • Delissea subcordata
    • Cyanea acuminata
    • Cyanea hirtella
    • Cyanea coriacea
    • Delissea leptostegia
    • Clermontia kakeana
    • Clermontia parviflora
    • Clermontia arborescens
    • Clermontia fauriei
  • 65. The TNRS: A Taxonomic Name Resolution Service for Plants. Tonight from 5:30 to 7:30 in Exhibit Hall A. Poster number P21011.
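Name resolution matters here because tree tip labels and trait tables often spell or classify the same species differently. The toy Python sketch below shows the general idea with a hand-written lookup table. It does not use the TNRS or reproduce its matching logic, the function names are made up, and treating "Delissea leptostegia" as another name for "Cyanea leptostegia" is an assumption for illustration, suggested only by both names appearing in these slides.

```python
from typing import Dict

# Hypothetical synonym table (not taken from the TNRS): maps a name as it
# appears in a data set to the name treated as accepted in this example.
SYNONYMS: Dict[str, str] = {
    "Delissea leptostegia": "Cyanea leptostegia",
}


def normalize(name: str) -> str:
    """Collapse whitespace and underscores and fix binomial capitalization."""
    parts = name.replace("_", " ").split()
    if not parts:
        return ""
    genus = parts[0].capitalize()
    rest = [part.lower() for part in parts[1:]]
    return " ".join([genus] + rest)


def resolve(name: str, synonyms: Dict[str, str] = SYNONYMS) -> str:
    """Return the accepted name for `name`, or the cleaned name if unknown."""
    cleaned = normalize(name)
    return synonyms.get(cleaned, cleaned)


tips = ["delissea_leptostegia", "  Lobelia   niihauensis ", "Brighamia rockii"]
resolved = [resolve(tip) for tip in tips]
print(resolved)
```

A real service also has to handle misspellings and authority strings, which an exact-match table like this one cannot do.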
  • 67. CACE DEMO