Publishing of Scientific Data - Science Foundation Ireland Summit 2010

Slides prepared for the Publishing of Scientific Data workshop at the Science Foundation Ireland Summit 2010. I was one of three panelists. We had a lively discussion!

Usage Rights

© All Rights Reserved

  • Nature: http://www.nature.com/authors/editorial_policies/availability.html
    NSF: http://www.nsf.gov/bfa/dias/policy/dmp.jsp (this new NSF policy takes effect 18 January 2011)
    Earth System Science Data: http://www.earth-system-science-data.net/
  • One reason given for using DOIs for data is to track uptake and reuse. This is challenging with current tools, as Heather Piwowar has pointed out: http://bit.ly/doi-fail (long URL: http://researchremix.wordpress.com/2010/11/09/tracking-dataset-citations-using-common-citation-tracking-tools-doesnt-work/)
    DataCite is a collaborative endeavor to explore and improve data citation: http://thedata.org/citation/standard
    The Australian National Data Service has a nice page on data citation awareness: http://ands.org.au/guides/data-citation-awareness.html
  • Supplemental materials
    Interactive data
  • T. K. Attwood, D. B. Kell, P. McDermott, J. Marsh, S. R. Pettifer, and D. Thorne (2009). Calling international rescue: knowledge lost in literature and data landslide! Biochemical Journal. doi:10.1042/BJ20091474
    “colleagues and I published a computational method for distinguishing between two types of acute leukemia, based on large-scale gene expression profiles obtained from DNA microarrays (3). This paper generated hundreds of requests from scientists interested in replicating and extending the results. The method involved a complex pipeline of steps, including (i) preprocessing of the data, to eliminate likely artifacts; (ii) selection of genes to be used in the model; (iii) building the actual model and setting the appropriate parameters for it from the training data; (iv) preprocessing independent test data; and finally (v) applying the model to test its efficacy. The result was robust and replicable, and the original data were available online, but there was no standardized form in which to make available the various software components and the precise details of their use.”
    Jill P. Mesirov (2010). Accessible Reproducible Research. Science 327:415. doi:10.1126/science.1179653
    Mesirov's paper describes the underlying philosophy: have a Reproducible Research System (RRS) made up of an environment for doing computational work (the Reproducible Research Environment, or RRE) and an authoring environment (the Reproducible Research Publisher, or RRP) which links back to the research system.
  • Based on eagle-IUPAR23331081710 https://www.eagle-i.org/
    From http://jobs.climber.com/jobs/Education-Higher-Education/Portland-OR-USA/Research-Associate-Scientific-Data-Curator/6029259/Careers?source=simplyjobs&bid=6029259&cid=Research-Associate-Scientific-Data-Curator
    Advertised 2010-08: http://sourceforge.net/mailarchive/forum.php?thread_name=8E1C6EA1-46C9-4D81-AF18-B50297D50A2C%40ohsu.edu&forum_name=obo-discuss
  • http://systbio.org/?q=node/195
    Advertised 2007

Publishing of Scientific Data - Science Foundation Ireland Summit 2010: Presentation Transcript

  • Publishing of Scientific Data
    Jodi Schneider
    jodi.schneider@deri.org
    Twitter @jschneider
    SFI Summit, 2010-11-16
    Athlone, Ireland
  • Data deposit may be required
    Community norms
    Crystallography, astronomy, genomics, …
    Peer-review and publication
    Nature: “Supporting data must be made available to editors and peer-reviewers at the time of submission…”
    Funders
    NSF proposals must include a 2-page Data Management Plan
  • Data citation
    “Cite this paper if you use my dataset”
    DOI, handle, Repository ID
    Tracking reuse is hard! http://bit.ly/doi-fail
    Universal Numerical Fingerprint (UNF)
    Changes when the data does
    Cryptographic hash of the data content
    Micah Altman, Gary King (2007). “A Proposed Standard for the Scholarly Citation of Quantitative Data”. D-Lib 13(3/4). http://www.dlib.org/dlib/march07/altman/03altman.html
    UNF:3:DaYlT6QSX9r0D50ye+tXpA==
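
The idea behind a fingerprint that "changes when the data does" can be sketched in a few lines of Python. This is only a simplified illustration, not the actual UNF algorithm: the function name `fingerprint`, the choice of SHA-256, the rounding to 7 significant digits, and the truncation to 128 bits are all assumptions made for the example.

```python
import base64
import hashlib


def fingerprint(values, digits=7):
    """Return a short base64 fingerprint of a numeric column.

    Illustrative only: this is NOT the real UNF computation.
    """
    hasher = hashlib.sha256()
    for value in values:
        # Write each value in a canonical form (fixed significant digits)
        # so the result depends on the data, not on how it was stored.
        canonical = format(float(value), f".{digits - 1}e")
        hasher.update(canonical.encode("utf-8") + b"\n")
    # Keep 128 bits and base64-encode them to get a compact token.
    return base64.b64encode(hasher.digest()[:16]).decode("ascii")


original = [1.5, 2.25, 3.0]
same_values = [1.50, 2.250, 3]  # same data, stored differently
edited = [1.5, 2.25, 3.1]       # one value changed

print(fingerprint(original) == fingerprint(same_values))
print(fingerprint(original) == fingerprint(edited))
```

Because the hash is computed over canonicalized content rather than over a file's bytes, the same numbers yield the same token regardless of file format, while any edit to a value yields a different one. A citation that carries such a token lets a reader check that the data in hand is the data that was cited.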
  • Data itself as publication?
    Data-only journals
    Earth System Science Data
    Databases as a research product
    Ph.D. curators extracting information from papers
    Machine recording of experiments
    Open Notebook Science
    Integration of data into publications
    Phil Bourne (2005). Will a Biological Database Be Different from a Biological Journal? PLoS Comput Biol 1(3): e34. doi:10.1371/journal.pcbi.0010034
  • Interactive Data inside the PDF
    Teresa K. Attwood et al. (2009) Calling international rescue: knowledge lost in literature and data landslide! Biochemical Journal. doi:10.1042/BJ20091474
  • New jobs and roles
    Ph.D. scientists: Extract facts, populate databases, …
    Computer scientists: Semantic tech, data mining, …
    Embedded librarians: Metadata, provenance, …
    Data scientists: Data capture, visualization, stats, …
    Engineers: Self-documenting apparatus, sensors, …
  • Research Assoc./Sci Data Curator
    Develop the biomedical ontology in OWL
    Annotate biomed resource metadata w/ the ontology
    Help with iterative design of annotation tools
    Participate in working groups to define requirements
    Determine database content
    Implement the data model
    Help with data load processes, data reconciliation, quality assurance, and OWL ontology software integration.
  • Scientific Data Curator
    Curate morphological data from the literature
    Populate a database
    Contribute new terms, definitions, and relationships to the ontologies where needed
    Work with the community to ensure consistency
    Review the data submitted by experts
    Work closely with software developers to develop the database, curatorial interface, web interface