Phylogenetics & Data Provenance: Survey Results

  1. Data Provenance for Phyloinformatics: Introduction & Survey Results. Elliott Hauser, UNC Information Science; Karen Cranston, NESCent Informatics
  2. Overview: What is Phylogenetics?
  3. What is Phylogenetic Data? ...many things! Source: DRAFT: Current Best Practices for Publishing Trees Electronically, 2010. Stoltzfus et al. http://wiki.tdwg.org/twiki/bin/view/Phylogenetics/LinkingTrees2010
  4. What is Phylogenetic Data? <A sample NeXML file> Source: http://github.com/miapa/miapa-etl/tree/master/nexmlex
  5. What is a Minimum Information Standard? The answer to this question, for a domain: "What is the minimum information necessary for an independent scientist to carry out an independent analysis of the data?" (Quackenbush, 2005). For Phylogenetics, this is MIAPA: Minimum Information About a Phylogenetic Analysis
  6. What do we need to know to analyze this tree?
  7. Overview: What is MIAPA? Source: Leebens-Mack et al. 2006
  8. Overview: Producers' and Consumers' attitudes. Most important metadata type; Least important metadata type. Source: Cranston MIAPA survey, 2012 (unpublished)
  9. Half of all metadata types are critically important to two or more subfields. Source: Cranston MIAPA survey, 2012 (unpublished)
  10. The majority of metadata types are easy to produce for all subfields. Source: Cranston MIAPA survey, 2012 (unpublished)
  11. How to balance the needs of Producers and Consumers? Most important metadata type; Least important metadata type. Source: Cranston MIAPA survey, 2012 (unpublished)
  12. Metadata at work: The Open Tree of Life Project. Conflicting Data, Conflicting Needs: ● A Single, Best Tree of Life ● Access to Underlying, Conflicting Trees
  13. A new research area: Computational data provenance ...Huh?
  14. A new research area: Computational data provenance. Computational: The result of a computation. Data provenance: Where/how it came to be. As science becomes more and more computational, we need to know more about our data!
  15. Reprise: What is Phylogenetics? A perfect field for computational data provenance!
  16. Discussion: Will our survey results predict actual behavior? What tools, if any, will preserve and encourage submission of computational data provenance? Is computational data different from measurement data, classification data, or other types of metadata? If so, does that affect our work?
  17. Thanks! eah13@mac.com
  18. Reprise: balancing the needs of Producers and Consumers? Most important metadata type; Least important metadata type
