A community-assembled, continually updated evolutionary history of all life

Karen A. Cranston
National Evolutionary Synthesis Center
Duke University
[Figure: Phylogeny papers, 1978–2008. Number of papers published per year (y-axis 0 to 12,000; x-axis: Year). Annotation: rapid increase in applications of phylogeny, beginning in early 1990s. Source: ISI Web of Science]
Where can I browse,
search and download the
       tree of life?



     You can’t. (Yet)
[Figure: Phylogeny papers, 1978–2008. Number of papers published per year (y-axis 0 to 12,000; x-axis: Year). Annotation: rapid increase in applications of phylogeny, beginning in early 1990s. Source: ISI Web of Science]
DATA AVAILABILITY

Sequence data: high archival rate
Phylogenetic trees: ~4% of all published trees archived
Most trees published as (beautiful) figures in PDF files: not reusable!

Wiegmann et al. PNAS, 2011
Fig. 1. Combined molecular phylogenetic tree for Diptera. Partitioned ML analysis of combined taxon sets of tier 1 and tier 2 FLYTREE data samples (−lnL = 344155.6169) calculated in RAxML. Circles indicate bootstrap support >80% (black/bp = 95–100%, gray/bp = 88–94%, white/bp = 80–88%). Nodes with improved bootstrap values resulting from postanalysis pruning of unstable taxa are marked by stars (black/bp = 95–100%, gray/bp = 88–94%, white/bp = 80–
Pictures of independent phylogenies
• Ideas Lab = 5-day workshop
• Self-assembly into groups
• Pitched pre-proposals at end of lab
• NSF invited full proposals
Karen Cranston, lead PI (Duke)
Gordon Burleigh (Florida)
Keith Crandall (BYU)
Karl Gude (MSU)
David Hibbett (Clark)
Mark Holder (Kansas)
Laura Katz (Smith)
Rick Ree (FMNH)
Stephen Smith (Michigan)
Doug Soltis (Florida)
Tiffani Williams (TAMU)

opentreeoflife.org

AVAToL: Assembling, Visualizing and Analyzing the Tree of Life
Tree of life

• 1.8 million named species
• Millions more unnamed / undiscovered
COMPARATIVE BIOLOGY

Conventional statistics assume: [figure not shown]
Evolutionary trees provide: [figure not shown]

Modified from Garland and Carter, 1994
PHYLOGENETIC PLACEMENT

Metagenomic reads + Reference phylogeny

Kembel et al. 2011
1. Build the first complete draft tree of life
2. Engage the community in refinement and annotation
3. Promote a culture of data sharing through software products
4. Develop novel methods for phylogenetic synthesis
+ taxonomies of living and extinct species
+ any digital phylogenetic data we can get:
   NSF Assembling the Tree of Life projects
   recent high-profile phylogenies
   ribosomal RNA trees for Bacteria and Archaea
   TreeBASE and Dryad trees

Graph database holding a ‘cloud’ of thousands of input trees with millions of nodes
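
One way to picture a 'cloud' of input trees held in a single graph is sketched below. This is only an illustration under stated assumptions: each input tree is given as a list of (parent, child) name pairs, nodes with the same name are shared across trees, and every edge remembers which trees assert it. The class `TreeGraph`, its methods, and the study ids are hypothetical names and do not describe the project's actual database schema.

```python
from collections import defaultdict


class TreeGraph:
    """Toy in-memory graph: shared nodes, edges labelled by source tree."""

    def __init__(self):
        # (parent, child) -> set of source-tree ids that assert this edge
        self.edge_sources = defaultdict(set)
        self.nodes = set()

    def add_tree(self, tree_id, edges):
        """Load one input tree given as (parent, child) name pairs."""
        for parent, child in edges:
            self.nodes.update((parent, child))
            self.edge_sources[(parent, child)].add(tree_id)

    def support(self, parent, child):
        """Number of input trees that contain the edge parent -> child."""
        return len(self.edge_sources.get((parent, child), ()))


# Two made-up input trees that overlap in one edge.
graph = TreeGraph()
graph.add_tree("study_A", [("Dipsacales", "Lonicera"), ("Dipsacales", "Viburnum")])
graph.add_tree("study_B", [("Dipsacales", "Lonicera"), ("Dipsacales", "Valeriana")])
print(len(graph.nodes), graph.support("Dipsacales", "Lonicera"))
```

Keeping the source-tree id on every edge is what would let a later step trace any relationship in a summary tree back to the studies that support it.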
Graph database holding thousands of input trees with millions of nodes

Filter / weight input data (number of taxa, size of alignment, year of publication, etc)

Synthesis (supertrees, grafting)
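
The filter / weight step can be sketched as a ranking of input trees by their metadata. The slide names the criteria (number of taxa, size of alignment, year of publication) but not how they are combined, so the scoring formula, the `min_taxa` threshold, and all names below are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class InputTree:
    tree_id: str
    n_taxa: int
    alignment_length: int  # number of aligned sites
    year: int              # year of publication


def weight(tree, newest_year):
    """Hypothetical score: more taxa, longer alignment, more recent = higher."""
    recency = 1.0 / (1 + newest_year - tree.year)
    return tree.n_taxa * tree.alignment_length * recency


def filter_and_rank(trees, min_taxa=4):
    """Drop small trees, then order the rest from highest to lowest weight."""
    kept = [t for t in trees if t.n_taxa >= min_taxa]
    if not kept:
        return []
    newest = max(t.year for t in kept)
    return sorted(kept, key=lambda t: weight(t, newest), reverse=True)


# Made-up metadata for three input trees.
trees = [
    InputTree("old_big", n_taxa=200, alignment_length=1000, year=2002),
    InputTree("new_small", n_taxa=50, alignment_length=5000, year=2012),
    InputTree("tiny", n_taxa=3, alignment_length=800, year=2011),
]
ranked = filter_and_rank(trees)
print([t.tree_id for t in ranked])
```

A ranked list like this could then decide which input tree wins when two of them conflict during synthesis; that use of the ranking is likewise an assumption, not something the slides state.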
Graph database holding a ‘cloud’ of thousands of input trees with millions of nodes

• filter input trees
• synthesize into summary trees
• compare to previous trees
• invite annotation
• input new data sets
Ability to annotate and improve

Clear links to source data and methods

Compare your results with synthetic tree

Flag
Get citations
Annotate
Upload alternate

Tree image modified from Tree of Life Web Project page http://tolweb.org/Nymphalidae/12172. Pictures by Katja Schulz (queen butterfly; CC Attribution-NonCommercial) and Charles Lam (via Flickr; CC Attribution-ShareAlike)
Lonicera ciliosa
Heptacodium miconioides
Diervilla rivularis
Valeriana celtica
Viburnum densiflorum

[Figure: tree with tips Lonicera ciliosa, Heptacodium miconioides, Valeriana celtica, Viburnum densiflorum, Diervilla rivularis]
NESCent hackathon to architect and implement a phylogenetic pruning service for megatrees

http://www.evoio.org/wiki/Phylotastic
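
Pruning a megatree down to a requested set of species can be sketched in a few lines. This is not the Phylotastic implementation: the nested-tuple tree representation, the function names, and the topology of the toy `megatree` are all assumptions made for illustration, and the toy topology is not a claim about the real relationships among these species.

```python
def prune(tree, keep):
    """Return the subtree induced by the tip names in `keep`, or None.

    A tree is either a tip name (str) or a tuple of child subtrees.
    Internal nodes left with a single child are collapsed.
    """
    if isinstance(tree, str):
        return tree if tree in keep else None
    children = [p for p in (prune(c, keep) for c in tree) if p is not None]
    if not children:
        return None
    if len(children) == 1:
        return children[0]
    return tuple(children)


def to_newick(tree):
    """Serialize the nested-tuple tree as a Newick string (no branch lengths)."""
    def fmt(node):
        if isinstance(node, str):
            return node.replace(" ", "_")
        return "(" + ",".join(fmt(c) for c in node) + ")"
    return fmt(tree) + ";"


# Toy "megatree" with an invented topology.
megatree = (
    ("Viburnum densiflorum", "Sambucus nigra"),
    (
        "Diervilla rivularis",
        (("Lonicera ciliosa", "Heptacodium miconioides"),
         ("Valeriana celtica", "Dipsacus fullonum")),
    ),
)
wanted = {"Lonicera ciliosa", "Valeriana celtica", "Viburnum densiflorum"}
print(to_newick(prune(megatree, wanted)))
```

Collapsing single-child nodes keeps the result a proper tree over just the requested tips, which is the output a user supplying a species list would expect from such a service.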
YEAR 2 & 3: SMART GENERATION OF FIGURES FOR PUBLICATION

• Semantic annotation layers
• Collaborative editing
• Integrated submission of topology, branch lengths and annotations to archives
YEAR 2 & 3: AUTOMATIC UPDATING

update trees with new sequence data

detect and incorporate newly published trees
Community assembly of the
tree of life (Open Tree of Life)

Next generation Phenomics
(PI O’Leary)

Arbor: Comparative Analysis
Workflows (PI Harmon)
POTENTIAL IMPACTS

• Phylogenies for any set of species easily available
• Benchmark for current state of phylogenetic knowledge
• Increasing rate of data archive
• Placing “dark taxa” in global informatics framework
BIGGEST CHALLENGES?

• Lack of digitally-available trees
• Visualization
• Engaging community to annotate and update
• Producing usable and visually appealing software
“OPEN” TREE OF LIFE?

     http://opentreeoflife.org

OpenTree at NESCent Academy 2012
