Sound, reproducible scholarship rests upon a foundation of robust, accessible data. For this to be so in practice as well as theory, data must be accorded due importance in the practice of scholarship and in the enduring scholarly record. In other words, data should be considered legitimate, citable products of research.
A few days ago I was honored to officially announce the Data Citation Working Group's Joint Declaration of Data Citation Principles at IDCC 2014, from which the above quote is taken.
This Joint Data Citation Principles identifies guiding principles for the scholarly citation of data. This recommendation is a s collaborative work with CODATA, FORCE 11, DataCite and many other individuals and organizations. And in the week since it has been released, it has already garnered over twenty institutional endorsements.
Some slides introducing the principles are here:
To summarize, from 1977 through 2009 there were three phases of development in the area of data citation.
The first phase of development focused on the role of citation to facilitate description and information retrieval. This phase introduced the principles that data in archives should be described as works rather than media, using author, title, and version.
The second phase of development extended citations to support data access and persistence. This phase introduced the principles that research data used by publication should be cited, that those citations should include persistent identifiers, and that the citations should be directly actionable on the web.
The third phase of development focused on using citations for verification and reproducibility. Although verification and reproducibility had always been one of the motivations for data archiving – it had not been a focus of citation practice. This phase introduced the principles that citations should support verifiable linkage of data and published claims, and it started the trend towards wider integration with the publishing ecosystem
And over the last five years the importance and urgency of scientific data management and access has been recognized more broadly. The culmination of this trend toward increasing recognition, thus far, is an increasingly widespread consensus by researchers and funders of research that data is a fundamental product of research and therefore a citable product. The fourth and current phase of data development work focuses on integration with the scholarly research and publishing ecosystem. This includes integration of data citation in standardized ways within publication, catalogs, tool chains, and larger systems of attribution.
Read the full recommendation here, along with examples, references and endorsements:
Joint Declaration of Data Citation Principles