Open Tree of Life Phyloseminar 2014

Phyloseminar about the Open Tree of Life project, given Feb 2014 by Karen Cranston http://phyloseminar.org

Transcript

  • 1. TECHNICAL AND SOCIAL CHALLENGES IN SYNTHESIZING THE TREE OF LIFE. Karen Cranston, National Evolutionary Synthesis Center, @kcranstn, http://slideshare.net/kcranstn
  • 2. IF WE “HAD” A TREE OF LIFE? complete = contains all of biodiversity; dynamic = continuously updated with new data; available digitally = browsing, querying, downloading
  • 3. Produce a digitally-available phylogeny that contains all of biodiversity. Provide tools for managing, analyzing and sharing phylogenetic data. http://avatol.org
  • 4. CHALLENGE: COMPLETENESS
  • 5. Even if there were phylogenies for all species in GenBank, we would only have a small fraction of biodiversity
  • 6. NCBI taxonomy data (578 taxa); Soltis et al. APG III phylogeny (30 taxa) (from Stephen Smith)
  • 7. Dipsacales graph. Synthesized tree: contains the structure of the phylogeny but all 578 taxa (from Stephen Smith)
  • 8. Inputs: published phylogenies; taxonomies. Filter / weight input trees; synthesize into single data structure; process feedback; input new data sets → complete tree of life
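The idea on slides 6 to 8 (a taxonomy supplies every taxon, a published phylogeny supplies resolved structure for the taxa it sampled) can be illustrated with a toy sketch. This is not the Open Tree synthesis method, which the talk describes only at a high level. The taxon names, the nested-tuple tree representation, and the functions `tips` and `synthesize` are all assumptions made for illustration.

```python
def tips(tree):
    """Return the tip labels of a tree written as nested tuples."""
    if isinstance(tree, str):
        return [tree]
    labels = []
    for child in tree:
        labels.extend(tips(child))
    return labels


def synthesize(taxonomy_members, phylogeny):
    """Keep the phylogeny's resolved structure and attach every taxon
    the phylogeny did not sample as an unresolved sibling (toy rule)."""
    sampled = set(tips(phylogeny))
    unknown = sampled - set(taxonomy_members)
    if unknown:
        raise ValueError(f"tips missing from taxonomy: {sorted(unknown)}")
    unsampled = [t for t in taxonomy_members if t not in sampled]
    if not unsampled:
        return phylogeny
    return (phylogeny, *unsampled)


# Hypothetical clade: the taxonomy lists five taxa, the study sampled three.
taxonomy_members = ["sp_a", "sp_b", "sp_c", "sp_d", "sp_e"]
phylogeny = (("sp_a", "sp_b"), "sp_c")

synthetic = synthesize(taxonomy_members, phylogeny)
print(synthetic)
print(len(tips(synthetic)), "tips in the synthetic tree")
```

The result keeps the resolved grouping of the three sampled taxa and still contains all five taxa from the taxonomy, which mirrors the 30-taxon phylogeny and 578-taxon taxonomy example at a much smaller scale.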
  • 9. CHALLENGE: ACCESS TO PUBLISHED PHYLOGENIES
  • 10. “Phylogeny provides a mechanism through which to interpret the patterns and processes of evolution and to predict the responses of life to rapid environmental change. Phylogenies and phylogenetic methods are now being used to enhance agriculture, identify and combat diseases, conserve biodiversity, and predict responses to global climate change and to biological invasions.”* (tl;dr: We need trees to do cool and important science) *OpenTree grant proposal
  • 11. Expertise in phylogenetic inference; expertise in methods that use phylogenies
  • 12. EVOLUTION TREE Fig._S1 = [&R] (2,1,((3,7),(4,(6,(33,(15,((20,(47,((51, (49,50)),(46,(48,(52,16)))))),(((44,45),((18,(12,(13,(43,42)))), ((41,((39,38),(40,17))),((35,9),(34,(36,37)))))),(32,(((21,19), ((30,14),(22,((11,31),((27,25),(23,((28,(24,8)),(10,(26, (5,29)))))))))),((((72,(63,57)),((65,64),((66,67),(68,(69,(70, (71,54))))))),(((82,59),(60,(61,(62,55)))),((80,(81,56)),((53, (77,78)),((75,73),(76,(58,74))))))),((88,((86,87),((85,84), (83,89)))),(79,((91,(93,(95,(92,(96,(94,90)))))),((100,(99,98)), (97,(((168,((172,185),((159,101),(109,157)))),(((181,(179,180)), ((102,(183,187)),(175,(176,(178,177))))),(212,((195,(210,211)), (199,((201,(196,202)),((194,197),((203,(192,205)),(204,(193, ((209,(208,206)),(198,(200,207))))))))))))),(113,(((154, ((169,170),(103,191))),((131,126),(128,((134,135),(129,(125, ((132,130),(104,133)))))))),((((190,166),((162,171),((116,120), (115,114)))),((122,(188,(186,108))),((118,(119,105)),(117,(158, (184,189)))))),((123,124),(((148,((165,161),(174,182))), ((106,121),(163,(167,127)))),((173,(156,(155,160))),(164, (((136,137),(139,(138,107))),((153,145),(112,(((146,143),(144, (140,141))),((142,152),(147,((110,111),(149, (150,151))))))))))))))))))))))))))))))))); Fig. 1. Combined molecular phylogenetic tree for Diptera. Partitioned ML analysis of combined taxon sets of tier 1 and tier 2 FLYTREE data samples (−lnL = 344155.6169) calculated in RAxML. Circles indicate bootstrap support >80% (black/bp = 95–100%, gray/bp = 88–94%, white/bp = 80–88%). Nodes with improved bootstrap values resulting from postanalysis pruning of unstable taxa are marked by stars (black/bp = 95–100%, gray/bp = 88–94%, white/bp = 80– 88%). Colored squares on terminal branches indicate the presence, in at least one species of a family, of ecological traits as shown to lower left. 
The number of origins of each trait was estimated with reference to the phylogeny, the distribution of each trait among genera within a family, and the known biology of the organisms. Excerpt: [Fur]thermore, a paraphyletic relationship of phorids and syrphids would support the hypothesis that their shared special mode of extraembryonic development (dorsal amnion closure) (26) evolved in the stem lineage of Cyclorrhapha and preceded the origin of the schizophoran amnioserosa. Excerpt: To test this hypothesis, we used a relatively recent phylogenomic marker: small, noncoding, regulatory micro-RNAs (miRNAs). miRNAs exhibit a striking phylogenetic pattern of conservation across the metazoan tree of life, suggesting the accumulation and maintenance of miRNA families throughout organismal evolution. R code: library(phytools); flyTree <- read.tree("flies.tre"); contMap(flyTree, flyData). Source: Wiegmann et al., PNAS, 2011
  • 13. Archiving sequence data is a community norm. Archiving phylogenetic data is quite rare: ~4% of all published phylogenetic trees (Stoltzfus et al. 2012)
  • 14. OPENTREE PHYLOGENY INPUTS: Surveyed >7000 phylogenetic studies in plants, fungi, animals, and unicellular organisms. Result: data for >2700 studies, >4800 trees
  • 15. CHALLENGE: SELECTING BACKBONE TAXONOMY
  • 16. Online taxonomic resources: Complete? Up to date with taxonomic literature? Phylogenetically informed? Systematics research: very slow…
  • 17. OPEN TREE TAXONOMY: source taxonomies + patch files for manual edits (requires source info!)
  • 18. 3,133,028 nodes and 2,559,835 ‘species’ https://github.com/OpenTreeOfLife/reference-taxonomy
  • 19. CHALLENGE: PHYLOGENY CURATION
  • 20. TREE Fig._S1 = [&R] (2,1,((3,7),(4,(6,(33,(15,((20,(47,((51, (49,50)),(46,(48,(52,16)))))),(((44,45),((18,(12,(13,(43,42)))), ((41,((39,38),(40,17))),((35,9),(34,(36,37)))))),(32,(((21,19), ((30,14),(22,((11,31),((27,25),(23,((28,(24,8)),(10,(26, (5,29)))))))))),((((72,(63,57)),((65,64),((66,67),(68,(69,(70, (71,54))))))),(((82,59),(60,(61,(62,55)))),((80,(81,56)),((53, (77,78)),((75,73),(76,(58,74))))))),((88,((86,87),((85,84), (83,89)))),(79,((91,(93,(95,(92,(96,(94,90)))))),((100,(99,98)), (97,(((168,((172,185),((159,101),(109,157)))),(((181,(179,180)), ((102,(183,187)),(175,(176,(178,177))))),(212,((195,(210,211)), (199,((201,(196,202)),((194,197),((203,(192,205)),(204,(193, ((209,(208,206)),(198,(200,207))))))))))))),(113,(((154, ((169,170),(103,191))),((131,126),(128,((134,135),(129,(125, ((132,130),(104,133)))))))),((((190,166),((162,171),((116,120), (115,114)))),((122,(188,(186,108))),((118,(119,105)),(117,(158, (184,189)))))),((123,124),(((148,((165,161),(174,182))), ((106,121),(163,(167,127)))),((173,(156,(155,160))),(164, (((136,137),(139,(138,107))),((153,145),(112,(((146,143),(144, (140,141))),((142,152),(147,((110,111),(149, (150,151))))))))))))))))))))))))))))))))); How was this tree inferred? What are the tip labels? Is it rooted correctly? What clade was the focus of the study?
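The curation questions on slide 20 are easier to appreciate once the tip labels are pulled out of a tree file. The sketch below is a minimal illustration, assuming a Newick string with no branch lengths, quoted labels, or internal node labels. The short tree is a hypothetical fragment in the style of the slide's `Fig._S1`, and `tip_labels` is an illustrative helper, not part of any Open Tree tool.

```python
import re


def tip_labels(newick):
    """Return tip labels from a simple Newick string.

    Assumes no branch lengths, quoted labels, or internal node labels,
    so every token between parentheses, commas, and the final semicolon
    is a tip.
    """
    return re.findall(r"[^(),;\s]+", newick)


# Shortened, hypothetical tree written like the one on the slide.
newick = "(2,1,((3,7),(4,(6,5))));"
labels = tip_labels(newick)

print(labels)
# Bare integers carry no taxon names: a curator still has to find
# the table that maps each number to a species.
print(all(label.isdigit() for label in labels))
```

Labels like these say nothing about which organisms were sampled, how the tree was inferred, or where the root belongs, which is why the file alone cannot answer the slide's questions.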
  • 21. CURATOR TOOLS
  • 22. Data curation → NexSON (NeXML as JSON) → Tree synthesis
  • 23. Input names Mapped to taxonomy
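Mapping the names used in a study onto a reference taxonomy, as on slide 23, can be sketched as a normalization step followed by exact and then approximate lookup. This is a minimal sketch under stated assumptions: the identifiers in `TAXONOMY` are made up, the `cutoff` value is arbitrary, and `normalize` and `map_name` are hypothetical helpers rather than the curator tool's actual logic.

```python
import difflib

# Hypothetical reference taxonomy: lower-cased names mapped to made-up IDs.
TAXONOMY = {
    "drosophila melanogaster": 101,
    "musca domestica": 102,
    "anopheles gambiae": 103,
}


def normalize(label):
    """Lower-case a tip label and turn underscores/extra spaces into single spaces."""
    return " ".join(label.replace("_", " ").lower().split())


def map_name(label, cutoff=0.9):
    """Return (matched name, ID, match type) for one tip label."""
    name = normalize(label)
    if name in TAXONOMY:
        return name, TAXONOMY[name], "exact"
    # Approximate matching catches small misspellings in the source tree.
    close = difflib.get_close_matches(name, list(TAXONOMY), n=1, cutoff=cutoff)
    if close:
        return close[0], TAXONOMY[close[0]], "fuzzy"
    return None, None, "unmapped"


for label in ["Drosophila_melanogaster", "Musca domestca", "T12"]:
    print(label, "->", map_name(label))
```

Recording the match type alongside the mapped name lets a curator review fuzzy and unmapped labels by hand instead of trusting them silently.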
  • 24. Tree synthesis; API layer; common data store of NexSON files (NeXML as JSON)
  • 25. Open source software tools for managing open data; publicly-accessible data store; full provenance data (who changed what & when?); allows access & download through standard protocols (git); where possible, using Creative Commons 0 waiver
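A study stored as JSON can carry the answers to the curation questions from slide 20 next to the tree itself, and plain-text JSON works well with git because changes show up as line diffs. The record below is a simplified, hypothetical illustration. Its field names are assumptions and do not follow the real NexSON schema, and the citation and labels are placeholders.

```python
import json

# Hypothetical, simplified study record (NOT the real NexSON schema).
study = {
    "study_id": "example_1",
    "citation": "Author et al. (year), journal",
    "focal_clade": "Diptera",
    "trees": [
        {
            "tree_id": "tree1",
            "newick": "((t1,t2),t3);",
            "inference_method": "maximum likelihood",
            "rooting_checked": True,
            "tip_labels": {
                "t1": {"original": "D. melanogaster",
                       "mapped": "Drosophila melanogaster"},
                "t2": {"original": "M. domestica",
                       "mapped": "Musca domestica"},
                "t3": {"original": "A. gambiae",
                       "mapped": "Anopheles gambiae"},
            },
        }
    ],
}

# Sorted keys and fixed indentation keep the file stable between saves,
# so a version-control diff shows only what a curator actually changed.
text = json.dumps(study, indent=2, sort_keys=True)
restored = json.loads(text)

print(text.splitlines()[0:5])
print(restored == study)
```

Serializing deterministically is a design choice that suits a git-backed store: two curators saving the same content produce byte-identical files.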
  • 26. CHALLENGE: SYNTHESIZING PHYLOGENY AND TAXONOMY
  • 27. Graph databases are key
  • 28. Open Tree of Life
  • 29. Thanks to Joseph Brown, Stephen Smith, Jonathan Rees, Jim Allman for getting the latest version up last night!
  • 30. Thanks to Joseph Brown, Stephen Smith, Jonathan Rees, Jim Allman for getting the latest version up last night!
  • 31. Synthesis details next week from Stephen Smith, University of Michigan: Thursday, February 13, 1 pm EST, phyloseminar.org
  • 32. WHAT CAN WE DO WITH THESE DATA AND TOOLS?
  • 33. Comparing phylogeny and taxonomy (Rick Ree & Lyndon Coghill)
  • 34. Conflict within sets of trees: Open Tree of Life (Stephen Smith)
  • 35. Highlight under-studied parts of the tree; label internal nodes on phylogenies; test various methods for synthesis; quantify and visualize phylogenetic conflict; extract phylogeny given list of taxa; infer branch lengths on synthetic trees; organize biodiversity data phylogenetically; … and many more, enabled by phylogenetic synthesis and digitally available phylogenetic data
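One item on slide 35, extracting a phylogeny for a given list of taxa, amounts to pruning a larger tree down to the requested tips. The following is a minimal sketch, assuming a tree written as nested tuples with string tip labels. The function `prune` and the example tree are illustrative and are not taken from the Open Tree software.

```python
def prune(tree, keep):
    """Return the subtree induced by the tips in `keep`.

    Returns None when no requested tip is found. Internal nodes that
    are left with a single child are collapsed into that child.
    """
    if isinstance(tree, str):
        return tree if tree in keep else None
    kept = []
    for child in tree:
        sub = prune(child, keep)
        if sub is not None:
            kept.append(sub)
    if not kept:
        return None
    if len(kept) == 1:
        return kept[0]
    return tuple(kept)


# Hypothetical six-tip tree and a three-taxon request.
tree = ((("a", "b"), "c"), ("d", ("e", "f")))
wanted = {"a", "c", "f"}

subtree = prune(tree, wanted)
print(subtree)
```

Collapsing single-child nodes keeps the extracted tree free of redundant internal nodes, so the result shows only the relationships among the requested taxa.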
  • 36. COMING IN 2014 Hackathon, jointly with Clade-based curation and analysis workshops
  • 37. QUESTIONS? PARTICIPATE? opentreeoflife@googlegroups.com opentreeoflife-software@googlegroups.com irc: #opentreeoflife on freenode http://github.com/OpenTreeOfLife
  • 38. Gordon Burleigh, Keith Crandall, Karl Gude, David Hibbett, Mark Holder, Laura Katz, Rick Ree, Stephen Smith, Doug Soltis, Tiffani Williams + many postdocs, grad students and undergrads. @NESCent: Karen Cranston, Jonathan Rees, Jim Allman