Synthesising disparate data resources to obtain composite estimates of geophylogeny
Rutger Vos

Invited talk to the 2nd BioVeL workshop, Gothenburg, Sweden, 10 May 2012

  • What is the tree of life? I present a simple idea, illustrated with a workflow. The workflow is based on PhyloWS, a URL API started at BH08, and uses semantic annotation with RDFa.
  • The workflow was developed at the comphy course here in Kyoto. Students learned how to operate on phylogenetic data using Bio* toolkits. One of the problem sets was on how to build a large phylogenetic tree and annotate it using web services.
  • First we used the ToL PhyloWS service. What is tolweb? What does the service return? Students learned recursion by grafting children on their parent. The service also returned metadata.
  • Then we used the uBio PhyloWS service. What is uBio? What does the service return? We expanded unresolved genera into their species. We also fetched linkout metadata.
  • One of the links out points to TreeBASE. What is TreeBASE? We fetched source trees to further resolve the skeleton using a supertree approach.
  • Another uBio annotation is NCBI taxon IDs. We used these to access the TimeTree PhyloWS service. What is TimeTree? We used node ages to anchor molecular branch lengths.
  • We also accessed GBIF, which is not PhyloWS but an XML REST API. What is GBIF? What does it return? We attached lat/lon coordinates to nodes in our tree.
  • Now we have a topology, taxonomic statements, age estimates, paleontological data and biogeographic data. We can view all these data in different ways, e.g. in Google Earth. We can see strepsirrhines, lemurs, lorises, Old World monkeys and New World monkeys.
  • The tree of life can be overgrown with metadata, like epiphytes on a tree in a rainforest. We can view these metadata in different ways. Unfortunately, services still need a lot of work: standards adoption, choosing the best predicates and values, identifiers!
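
The notes mention that students learned recursion by grafting children on their parent. A minimal sketch of that idea follows, in Python rather than the Perl used in the original workflow. The `fetch_children` function and the `TOY_CHILDREN` table are hypothetical stand-ins for a web service lookup; the sketch makes no network calls and does not reproduce the actual PhyloWS request or NeXML response format. The toy classification is simplified for illustration.

```python
from typing import Callable, Dict, List

# Hypothetical, simplified lookup table standing in for a tree service.
# In the workflow described in the notes, child taxa came from a web service.
TOY_CHILDREN: Dict[str, List[str]] = {
    "Primates": ["Strepsirrhini", "Haplorrhini"],
    "Strepsirrhini": ["Lemuriformes", "Lorisiformes"],
    "Haplorrhini": ["Tarsiiformes", "Simiiformes"],
    "Simiiformes": ["Platyrrhini", "Catarrhini"],
}


def fetch_children(name: str) -> List[str]:
    """Return the child taxa of `name` (empty list for a tip)."""
    return TOY_CHILDREN.get(name, [])


def build_tree(name: str, fetch: Callable[[str], List[str]] = fetch_children) -> dict:
    """Recursively graft each child subtree onto its parent node."""
    return {"name": name, "children": [build_tree(c, fetch) for c in fetch(name)]}


def to_newick(node: dict) -> str:
    """Serialise the nested dict as a Newick string with labelled internal nodes."""
    if not node["children"]:
        return node["name"]
    inner = ",".join(to_newick(child) for child in node["children"])
    return f"({inner}){node['name']}"


tree = build_tree("Primates")
newick = to_newick(tree) + ";"
print(newick)
# ((Lemuriformes,Lorisiformes)Strepsirrhini,(Tarsiiformes,(Platyrrhini,Catarrhini)Simiiformes)Haplorrhini)Primates;
```

Passing the lookup function as a parameter keeps the recursion independent of where the child lists come from, so a real service client could be swapped in without changing `build_tree`.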

    1. Synthesising disparate data resources to obtain composite estimates of geophylogeny. Rutger Vos
    2. A simple assignment? Refine a tree for the Primates with taxonomic and systematic data; add divergence dates; add occurrence data; visualize the result; use public web services.
    3. Actually not so easy…
    4. The Tree of Life Web Service: Using PhyloWS we traversed the Tree of Life and built a local, semantically annotated copy of the Primate clade.
    5. Adding taxonomic metadata: Using the uBio PhyloWS service we enhanced our tree with further taxonomic annotations and links, and expanded some genera.
    6. Fetching additional tree data: Using the TreeBASE PhyloWS service we fetched additional data to resolve the tree further using a “supertree” approach.
    7. Computing node ages: The TimeTree PhyloWS service allowed us to anchor molecular (i.e. relative) node ages on absolute dates.
    8. Adding occurrence data: Using the GBIF XML API, we then fetched occurrence records for the species in our tree.
    9. Visualizing the result
    10. Implementation: Except for GBIF, all services return NeXML and implement PhyloWS. Semantic annotations using RDFa. Glued together with Perl.
    11. Challenges: Although some services have the same API, no GUI exists to chain them together. No web services for computationally intensive steps. Data and metadata are messy and sparse.
    12. Conclusions: The tree of life can be covered with all sorts of metadata (taxonomic, molecular, biogeographic, paleontological), viewable in different ways. Standards still incompletely defined and adhered to, though.
    13. Shameless plug: PhyloTastic. A web service to extract subsets of taxa from megatrees and annotate them. Deliverable of the first HIP hackathon, at NESCent, in June 2012.
    14. Acknowledgements
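
Slide 7 describes anchoring relative (molecular) node ages on absolute dates. The slides do not say how this was computed, so the following is only one simple possibility: a linear rescaling from a single calibrated node, sketched in Python. The function name `anchor_ages`, the node labels, and all numbers are invented for illustration and are not values from the talk or from TimeTree.

```python
from typing import Dict


def anchor_ages(
    relative_ages: Dict[str, float],
    calibration_node: str,
    absolute_age_ma: float,
) -> Dict[str, float]:
    """Rescale relative node ages so the calibration node gets its absolute age.

    Assumes a single calibration point and a strictly proportional
    relationship between relative and absolute ages.
    """
    relative = relative_ages[calibration_node]
    if relative <= 0:
        raise ValueError("calibration node must have a positive relative age")
    scale = absolute_age_ma / relative
    return {node: age * scale for node, age in relative_ages.items()}


# Invented example values: relative ages in arbitrary units, calibration in Ma.
relative = {"root": 1.0, "node_a": 0.5, "node_b": 0.25}
absolute = anchor_ages(relative, "root", 80.0)
print(absolute)
# {'root': 80.0, 'node_a': 40.0, 'node_b': 20.0}
```

With several calibrated nodes, a single scale factor would generally not fit all of them, and a more careful dating method would be needed.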

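
The notes say that lat/lon coordinates were attached to nodes and that the result can be viewed in Google Earth. One way to get point data into Google Earth is a KML file; the slides do not state which format was used, so KML is an assumption here. The sketch below uses Python's standard `xml.etree.ElementTree`; the function name `occurrences_to_kml`, the taxon labels, and the coordinates are placeholders, not GBIF records.

```python
import xml.etree.ElementTree as ET
from typing import Iterable, Tuple

KML_NS = "http://www.opengis.net/kml/2.2"


def occurrences_to_kml(records: Iterable[Tuple[str, float, float]]) -> str:
    """Turn (label, latitude, longitude) tuples into a KML document string."""
    ET.register_namespace("", KML_NS)  # serialise KML as the default namespace
    kml = ET.Element(f"{{{KML_NS}}}kml")
    document = ET.SubElement(kml, f"{{{KML_NS}}}Document")
    for label, lat, lon in records:
        placemark = ET.SubElement(document, f"{{{KML_NS}}}Placemark")
        ET.SubElement(placemark, f"{{{KML_NS}}}name").text = label
        point = ET.SubElement(placemark, f"{{{KML_NS}}}Point")
        # KML writes coordinates as longitude,latitude (not lat,lon).
        ET.SubElement(point, f"{{{KML_NS}}}coordinates").text = f"{lon},{lat}"
    return ET.tostring(kml, encoding="unicode")


# Placeholder records: labels and coordinates are made up for illustration.
records = [("Taxon A", -23.0, 44.5), ("Taxon B", 0.5, 30.4)]
kml_text = occurrences_to_kml(records)
print(kml_text)
```

The main pitfall this illustrates is coordinate order: occurrence data are usually quoted as latitude then longitude, while KML expects longitude first.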