Copyright 2009 Digital Enterprise Research Institute. All rights reserved.

Publishing of Scientific Data - Science Foundation Ireland Summit 2010

Slides prepared for the Publishing of Scientific Data workshop at the Science Foundation Ireland Summit 2010. I was one of three panelists. We had a lively discussion!

Published in: Technology, Education

  • Nature http://www.nature.com/authors/editorial_policies/availability.html
    NSF http://www.nsf.gov/bfa/dias/policy/dmp.jsp
    This new NSF policy takes effect 18 January 2011
    Earth System Science Data http://www.earth-system-science-data.net/
  • One reason given for using DOIs for data is to track uptake and reuse. This is challenging with current tools, as Heather Piwowar has pointed out: http://bit.ly/doi-fail (long URL: http://researchremix.wordpress.com/2010/11/09/tracking-dataset-citations-using-common-citation-tracking-tools-doesnt-work/)
    DataCite is a collaborative endeavor to explore and improve data citation: http://thedata.org/citation/standard

    The Australian National Data Service has a nice page on data citation awareness: http://ands.org.au/guides/data-citation-awareness.html
  • Supplemental materials
    Interactive data
  • T. K. Attwood, D. B. Kell, P. McDermott, J. Marsh, S. R. Pettifer, and D. Thorne (2009). Calling international rescue: knowledge lost in literature and data landslide! Biochemical Journal. doi:10.1042/BJ20091474.

    “colleagues and I published a computational method for distinguishing between two types of acute leukemia, based on large-scale gene expression profiles obtained from DNA microarrays (3). This paper generated hundreds of requests from scientists interested in replicating and extending the results. The method involved a complex pipeline of steps, including (i) preprocessing of the data, to eliminate likely artifacts; (ii) selection of genes to be used in the model; (iii) building the actual model and setting the appropriate parameters for it from the training data; (iv) preprocessing independent test data; and finally (v) applying the model to test its efficacy. The result was robust and replicable, and the original data were available online, but there was no standardized form in which to make available the various software components and the precise details of their use.”
    Jill P. Mesirov (2010). Accessible Reproducible Research. Science 327:415. doi:10.1126/science.1179653. The paper describes the underlying philosophy: have a Reproducible Research System (RRS) made up of an environment for doing computational work (the Reproducible Research Environment, or RRE) and an authoring environment (the Reproducible Research Publisher, or RRP) that links back to the research system.
  • Based on eagle-i UPAR23331081710 https://www.eagle-i.org/
    From http://jobs.climber.com/jobs/Education-Higher-Education/Portland-OR-USA/Research-Associate-Scientific-Data-Curator/6029259/Careers?source=simplyjobs&bid=6029259&cid=Research-Associate-Scientific-Data-Curator
    Advertised 2010-08: http://sourceforge.net/mailarchive/forum.php?thread_name=8E1C6EA1-46C9-4D81-AF18-B50297D50A2C%40ohsu.edu&forum_name=obo-discuss
  • http://systbio.org/?q=node/195
    Advertised 2007
  • Publishing of Scientific Data - Science Foundation Ireland Summit 2010

    1. Publishing of Scientific Data
       Jodi Schneider, jodi.schneider@deri.org, Twitter @jschneider
       SFI Summit, 2010-11-16, Athlone, Ireland
       Digital Enterprise Research Institute, www.deri.ie
       Copyright 2009 Digital Enterprise Research Institute. All rights reserved.
    2. Data deposit may be required
       • Community norms
         • Crystallography, astronomy, genomics, …
       • Peer-review and publication
         • Nature: “Supporting data must be made available to editors and peer-reviewers at the time of submission…”
       • Funders
         • NSF proposals must include a 2-page Data Management Plan
    3. Data citation
       • “Cite this paper if you use my dataset”
       • DOI, handle, Repository ID
       • Tracking reuse is hard! http://bit.ly/doi-fail
       • Universal Numerical Fingerprint (UNF)
         • Changes when the data does
         • Cryptographic hash of the data content
         • Example: UNF:3:DaYlT6QSX9r0D50ye+tXpA==
       • Micah Altman, Gary King (2007). “A Proposed Standard for the Scholarly Citation of Quantitative Data”. D-Lib 13(3/4). http://www.dlib.org/dlib/march07/altman/03altman.html
    4. Data itself as publication?
       • Data-only journals
         • Earth System Science Data
       • Databases as a research product
       • Ph.D. curators extracting information from papers
       • Machine recording of experiments
       • Open Notebook Science
       • Integration of data into publications
       • Phil Bourne (2005). Will a Biological Database Be Different from a Biological Journal? PLoS Comput Biol 1(3): e34. doi:10.1371/journal.pcbi.0010034
    5. Interactive Data inside the PDF
       • Teresa K. Attwood et al. (2009). Calling international rescue: knowledge lost in literature and data landslide! Biochemical Journal. doi:10.1042/BJ20091474
    6. New jobs and roles
       • Ph.D. scientists: Extract facts, populate databases, …
       • Computer scientists: Semantic tech, data mining, …
       • Embedded librarians: Metadata, provenance, …
       • Data scientists: Data capture, visualization, stats, …
       • Engineers: Self-documenting apparatus, sensors, …
    7. Research Assoc./Sci Data Curator
       • Develop the biomedical ontology in OWL
       • Annotate biomed resource metadata w/ the ontology
       • Help with iterative design of annotation tools
       • Participate in working groups to define requirements
       • Determine database content
       • Implement the data model
       • Help with data load processes, data reconciliation, quality assurance, and OWL ontology software integration
    8. Scientific Data Curator
       • Curate morphological data from the literature
       • Populate a database
       • Contribute new terms, definitions, and relationships to the ontologies where needed
       • Work with the community to ensure consistency
       • Review the data submitted by experts
       • Work closely with software developers to develop the database, curatorial interface, web interface
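
The data citation slide describes the Universal Numerical Fingerprint as a cryptographic hash of the data content that changes when the data does. The sketch below illustrates only that general idea. It is not the UNF algorithm: the function name `fingerprint`, the `sketch:` prefix, the rounding rule, and the choice of SHA-256 are all assumptions made for this example, and the real UNF defines its own normalization and encoding (see Altman and King, 2007).

```python
import base64
import hashlib


def fingerprint(rows, digits=7):
    """Hypothetical content fingerprint for tabular data (NOT the real UNF).

    Floats are written in a fixed notation before hashing, so the result
    depends on the data values rather than on how a file happens to store them.
    """
    h = hashlib.sha256()  # assumption: any cryptographic hash serves the illustration
    for row in rows:
        for value in row:
            if isinstance(value, float):
                text = format(value, f".{digits}e")  # fixed-precision normal form
            else:
                text = str(value)
            h.update(text.encode("utf-8") + b"\x00")  # cell separator
        h.update(b"\n")  # row separator
    # Truncate and base64-encode so the result is short enough to sit in a citation.
    return "sketch:" + base64.b64encode(h.digest()[:16]).decode("ascii")


original = [("a", 1.0), ("b", 2.5)]
same_content = [("a", 1.00000000001), ("b", 2.5)]  # differs only below the rounding precision
changed = [("a", 1.0), ("b", 2.6)]

print(fingerprint(original) == fingerprint(same_content))
print(fingerprint(original) == fingerprint(changed))
```

Because the fingerprint is computed from normalized values, a citation that carries it lets a reader check that the dataset in hand is the one that was cited, regardless of file format.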