Supporting Preservation of Research Data in the Chemical Sciences



A brief overview of a repository data project's experience with the challenge of digital preservation.

Published in: Technology


  1. Supporting Preservation of Research Data in the Chemical Sciences. Dr. Simon Coles, School of Chemistry, University of Southampton. 2nd June 2009
  2. Representation Information for Crystallography Data
     • Representation Information (RI), from the OAIS Model, is any information required to render, process, interpret, use and understand data.
     • Registry/repository for RI (RRoRI) by the DCC and the CASPAR Project
     • Crystallography domain and the workflow of the NCS are examined to identify significant RI
     • RI networks relating to the CIF file format are formulated and ingested into the RRoRI
     • Use case scenario describes how the RI stored in RRoRI may be used in order to gain access to the information content of a CIF instance by someone unfamiliar with that file format.
  3. Preservation Planning for Crystallography Data
     • Original plan was to apply a DRAMBORA assessment to each of the repositories in the federation as a means of raising awareness of curation and preservation issues.
     • Now covers the notion of trust and trustworthiness with a brief look at several preservation planning tools including: the DCC Curation Lifecycle Model; the OAIS Reference Model; audit and certification instruments (TRAC, NESTOR, DRAMBORA, Data Seal of Approval); PLATO and PLATTER (from the PLANETS Project); and cost models (PrestoSpace, LIFE2 projects).
     • Raises curation and preservation issues that are likely to be relevant in the context of a crystallography community and the eCrystals federation.
  4. Preservation Metadata for Crystallography Data
     • The original aim was to augment the eBank-UK application profile with preservation metadata specifically for crystallography data
     • Superseded by the development of the crystallography Data Commons initiative
     • Proposed the following…
  5. Resources
     • Data Set/Collection
     • Raw Data
     • Derived Data
     • Result Data
     • Transient Data
     • Workflow?
  6. Publication/Dissemination
     • Persistent Identifier
     • Preservation Policy/strategy
     • Rights management: binding intellectual property rights that may limit the ability to preserve and disseminate the digital object over time, e.g. use and reuse
     • Technical environment: describing the technical requirements needed to render and use the digital object, e.g. file format, software, instrumentation
     • Provenance: the custodial history of the object
     • Context: contextual information indicating how the object was created and under what circumstances
     • Authenticity: validating that the digital object is in fact what it purports to be, and has not been altered in an undocumented way, e.g. checksum
  7. Management
     • Embargo, e.g. policy
     • Representation Information: any information required to render, process, use, reuse, interpret and understand the object, e.g. specifications; file formats; software; hardware; semantics
     • Preservation activity: actions taken to preserve the digital object, and any consequences of these actions that impact its look, feel, or functionality
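
The metadata elements proposed in slides 6 and 7 could be captured in a simple record. The following is a minimal sketch only, not the project's actual schema or application profile: the class name, field names, the choice of SHA-256 as the checksum, and the sample identifier are all assumptions made for illustration. It shows how a stored checksum can support the authenticity check, and how an embargo date can support a management policy.

```python
import hashlib
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


def sha256_of(data: bytes) -> str:
    """Return the hex SHA-256 digest of a byte string."""
    return hashlib.sha256(data).hexdigest()


@dataclass
class PreservationRecord:
    """Hypothetical record; field names only mirror the slide headings."""
    persistent_identifier: str
    file_format: str                      # technical environment
    checksum: str                         # authenticity (fixity value)
    rights: str = ""                      # rights management statement
    provenance: List[str] = field(default_factory=list)   # custodial history
    embargo_until: Optional[date] = None  # management: embargo policy
    preservation_actions: List[str] = field(default_factory=list)

    def is_authentic(self, data: bytes) -> bool:
        """True if the object's current digest matches the recorded checksum."""
        return sha256_of(data) == self.checksum

    def is_embargoed(self, today: date) -> bool:
        """True if an embargo date is set and has not yet been reached."""
        return self.embargo_until is not None and today < self.embargo_until


# Illustrative data only: a tiny CIF-like byte string and a made-up identifier.
cif_bytes = b"data_example\n_cell_length_a 10.0\n"
record = PreservationRecord(
    persistent_identifier="hypothetical:0001",
    file_format="CIF",
    checksum=sha256_of(cif_bytes),
    embargo_until=date(2010, 1, 1),
)

print(record.is_authentic(cif_bytes))          # True
print(record.is_authentic(cif_bytes + b"x"))   # False: content was altered
print(record.is_embargoed(date(2009, 6, 2)))   # True
```

A checksum detects undocumented alteration only if the recorded value itself is kept safely, which is one reason to record each preservation action alongside it.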