In recent years there have been many efforts to preserve the data behind scientific research. Institutions like the Virtual Observatory and journals like PLOS ONE, Geoscience Data Journal and Ecological Archives accept datasets that support, or were produced in, scientific publications. Other efforts like Figshare make it possible to cite data from unpublished research and research in progress, so that authors are acknowledged and their work is easier to share. At the same time, many of the challenges associated with the preservation and sharing of data have been a topic of discussion in international initiatives like the Research Data Alliance, which through its working and interest groups aims to identify requirements and propose reference solutions for tasks such as data citation and the provision of appropriate e-infrastructure for repositories.
However, data per se is often of little use without proper descriptive metadata, its provenance and the software used to create it. In fact, scientists are becoming more concerned about preserving the software and methods used to deliver a particular scientific result. Reproducibility and inspectability are crucial for interpreting and reusing a given dataset. In "in vitro" and "in vivo" sciences, protocols exist to capture the methods necessary to reproduce an experiment. In computational sciences this is achieved with scientific workflows, which capture the method (i.e., the steps and data dependencies) used to obtain a specific result. In this short talk we will introduce the set of checklists we have developed for the proper conservation of scientific workflows, encapsulated as Research Objects, by adapting existing standards for data preservation.