A community-assembled, continually updated
      evolutionary history of all life
               Karen A. Cranston
     National Evolutionary Synthesis Center
                Duke University
opentreeoflife.org


Karen Cranston (NESCent)         Laura Katz (Smith)
 Gordon Burleigh (Florida)        Rick Ree (FMNH)
  Keith Crandall (GWU)       Stephen Smith (Michigan)
    Karl Gude (MSU)            Doug Soltis (Florida)
  David Hibbett (Clark)       Tiffani Williams (TAMU)
  Mark Holder (Kansas)


      AVAToL: Assembling, Visualizing and Analyzing
      the Tree of Life
[Figure: Phylogeny papers, 1978-2008. Number of papers published per year
(y-axis, 0 to 12,000) plotted against year (x-axis). Annotation: rapid
increase in applications of phylogeny, beginning in early 1990s.
Source: ISI Web of Science; graph from David Hillis.]
Where can I browse,
search and download a
complete tree of life?



    You can’t. (Yet)
DATA AVAILABILITY

   Community norm to archive
   sequence data




                    ~4% of all published
                     phylogenetic trees
Year one goals


1. Synthesize a complete draft tree of life from existing
   phylogenetic trees
2. Publish this draft tree with:
  a. ability to add annotations and upload new data sets
  b. highlighted areas of conflict and links to source data
  c. utilities to download whole tree and subtrees
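
The download utilities in goal 2c could hand a subtree to analysis tools in Newick format. The following is a minimal sketch under stated assumptions, not the project's actual utility: the tree is held as a plain parent-to-children dictionary, and the names `to_newick`, `subtree_newick`, `cladeA` and `sp1`..`sp4` are hypothetical placeholders.

```python
def to_newick(children, node):
    """Serialize the subtree rooted at `node` as Newick, without the final ';'.

    `children` maps a node name to the list of its child names (hypothetical
    in-memory layout); a node with no entry is treated as a tip.
    """
    kids = children.get(node, [])
    if not kids:
        return node
    inner = ",".join(to_newick(children, kid) for kid in kids)
    return "(" + inner + ")" + node


def subtree_newick(children, root):
    """Return a complete Newick string for the subtree rooted at `root`."""
    return to_newick(children, root) + ";"


# Toy tree with placeholder names (not real taxa).
children = {
    "root": ["cladeA", "sp4"],
    "cladeA": ["sp1", "sp2", "sp3"],
}

whole_tree = subtree_newick(children, "root")   # "((sp1,sp2,sp3)cladeA,sp4)root;"
one_clade = subtree_newick(children, "cladeA")  # "(sp1,sp2,sp3)cladeA;"
```

Real taxon names containing spaces or punctuation would need Newick quoting, which this sketch leaves out.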
+ taxonomies of living and extinct species
+ digital phylogenetic data:
   Assembling the Tree of Life (and other) projects
   recent high-profile phylogenies
   ribosomal RNA trees for Bacteria and Archaea
   TreeBASE and Dryad trees




   Graph database holding thousands of input
         trees with millions of nodes
Dipsacales graph
taxonomy data (578 taxa) +
Soltis et al. APG III phylogeny (30 taxa)
Dipsacales graph
Synthesized tree (favouring phylogenetic branches); contains
all 578 taxa
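
The idea of loading taxonomy and phylogeny into one graph and then favouring phylogenetic branches can be illustrated with a toy example. This is only a sketch, not the project's synthesis algorithm: it assumes every source is a simple child-to-parent map whose node names have already been matched across sources, and all names (`build_graph`, `synthesize`, `family1`, `sp1`..`sp5`) are hypothetical.

```python
from collections import defaultdict


def build_graph(sources):
    """Merge several child->parent maps into one graph.

    Returns a dict: child -> list of (parent, source_name) edges, so that
    conflicting placements from different sources sit side by side.
    """
    graph = defaultdict(list)
    for source_name, parent_of in sources.items():
        for child, parent in parent_of.items():
            graph[child].append((parent, source_name))
    return dict(graph)


def synthesize(graph, preference):
    """For each node, keep the parent edge from the most-preferred source."""
    rank = {name: i for i, name in enumerate(preference)}
    return {
        child: min(edges, key=lambda edge: rank[edge[1]])[0]
        for child, edges in graph.items()
    }


# The taxonomy covers every taxon; the phylogeny covers only a few tips.
taxonomy = {
    "family1": "order", "family2": "order",
    "sp1": "family1", "sp2": "family1",
    "sp3": "family2", "sp4": "family2", "sp5": "family2",
}
# The phylogeny disagrees with the taxonomy about where sp3 belongs.
phylogeny = {"sp1": "family1", "sp2": "family1", "sp3": "family1"}

graph = build_graph({"taxonomy": taxonomy, "phylogeny": phylogeny})
synthetic = synthesize(graph, preference=["phylogeny", "taxonomy"])
```

In the result, `sp3` follows the phylogeny into `family1`, while `sp4` and `sp5`, which the phylogeny never sampled, keep their taxonomic placement, so no taxon is lost.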
Graph database holding thousands of input trees with millions of nodes
  • filter / weight input trees
  • build synthetic trees
  • compare to alternate trees
  • input new data sets
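
Comparing a tree to an alternate tree amounts to asking which clades the two share and which appear in only one of them. The following is a minimal sketch of that comparison for rooted trees written as nested tuples; the representation and the names `clades` and `compare` are assumptions made for illustration, not part of the project's software.

```python
def clades(tree):
    """Return the set of clades (as frozensets of tip names) in a rooted tree.

    The tree is a nested tuple; tips are strings.
    """
    found = set()

    def walk(node):
        if isinstance(node, str):
            return frozenset([node])
        tips = frozenset().union(*(walk(child) for child in node))
        found.add(tips)
        return tips

    walk(tree)
    return found


def compare(tree_a, tree_b):
    """Split the clades of two trees into shared and tree-specific sets."""
    a, b = clades(tree_a), clades(tree_b)
    return {"shared": a & b, "only_a": a - b, "only_b": b - a}


# Two toy trees on tips A-D that disagree about the innermost pair.
tree_a = ((("A", "B"), "C"), "D")
tree_b = ((("A", "C"), "B"), "D")

result = compare(tree_a, tree_b)
# result["only_a"] holds the clade {A, B}; result["only_b"] holds {A, C}.
```

The tree-specific clades are the candidate areas of conflict, and the size of the two `only_` sets together gives a Robinson-Foulds-style distance for rooted trees.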
AUTOMATIC UPDATING
  • update trees with new sequence data
  • detect and synthesize newly published trees
SMART GENERATION OF FIGURES FOR PUBLICATION
  • Semantic annotation layers
  • Collaborative editing
  • Integrated submission of data and annotations to archives
[Figure: Fig. 1 of Wiegmann et al., PNAS, 2011: combined molecular phylogenetic tree for Diptera.]

• Increasing availability of digital data associated with
  phylogeny publications
• Making the synthetic tree open to community annotation
  and data submission
• Simple access to search and download the tree of life

• Open Science
  • project wiki: opentree.wikispaces.com
  • mailing list (opentreeoflife@googlegroups.com)
  • development board (trello.org/opentree)

• provide complete phylogenetic framework
• link to biodiversity and systematics content
• API for downloading subtrees to analysis tools
• source / storage of underlying data

opentreeoflife.org
• We’ve only just started (June 1)
• Looking for input, feedback and participation:
  • join the mailing list
  • add publications to the Mendeley group
  • vote / comment on plans on the Trello boards
  • participate in virtual data curation sprint in August

Open Tree of Life @Evolution 2012
