Data citationworkshop idcc_2014 Altman
 

Sound, reproducible scholarship rests upon a foundation of robust, accessible data. For this to be so in practice as well as theory, data must be accorded due importance in the practice of scholarship and in the enduring scholarly record. In other words, data should be considered legitimate, citable products of research.

A few days ago I was honored to officially announce the Data Citation Working Group's Joint Declaration of Data Citation Principles at IDCC 2014, from which the above quote is taken.

The Joint Declaration of Data Citation Principles identifies guiding principles for the scholarly citation of data. This recommendation is a collaborative work with CODATA, FORCE 11, DataCite, and many other individuals and organizations. In the week since its release, it has already garnered over twenty institutional endorsements.

Some slides introducing the principles are here:

[slideshare id=31957135&doc=datacitationworkshopidcc20142altmandraft-140305141032-phpapp01]

To summarize, from 1977 through 2009 there were three phases of development in the area of data citation.

The first phase of development focused on the role of citation in facilitating description and information retrieval. This phase introduced the principle that data in archives should be described as works rather than media, using author, title, and version.
The second phase of development extended citations to support data access and persistence. This phase introduced the principles that research data used by a publication should be cited, that those citations should include persistent identifiers, and that the citations should be directly actionable on the web.
The third phase of development focused on using citations for verification and reproducibility. Although verification and reproducibility had always been among the motivations for data archiving, they had not been a focus of citation practice. This phase introduced the principle that citations should support verifiable linkage of data and published claims, and it started the trend towards wider integration with the publishing ecosystem.
Over the last five years, the importance and urgency of scientific data management and access have been recognized more broadly. The culmination of this trend, thus far, is an increasingly widespread consensus among researchers and research funders that data is a fundamental product of research and therefore a citable product. The fourth and current phase of data citation development focuses on integration with the scholarly research and publishing ecosystem. This includes integrating data citation in standardized ways within publications, catalogs, tool chains, and larger systems of attribution.

Read the full recommendation here, along with examples, references and endorsements:

Joint Declaration of Data Citation Principles

  • This work, by Micah Altman (http://micahaltman.com), is licensed under the Creative Commons Attribution-Share Alike 3.0 United States License. To view a copy of this license, visit http://creativecommons.org/licenses/by-sa/3.0/us/ or send a letter to Creative Commons, 171 Second Street, Suite 300, San Francisco, California, 94105, USA.

Presentation Transcript

  • Introducing The Joint Principles for Data Citation
    Prepared for the International Digital Curation Conference, Feb 2014
    Dr. Micah Altman <escience@mit.edu>
    Director of Research, MIT Libraries
    Non-Resident Senior Fellow, The Brookings Institution
  • How did we get here?
  • Exemplar Systems, Core Principles, Key Work
    – 1977-1998
      Exemplar systems: ICPR Archive; MARC catalog systems.
      Core principles: Facilitate description & information retrieval. Describe data in archives. Describe as works, not media. Provide author, title, version.
      Key work: [Avram 1975] [Dodd 1979] [ISO 1997]
    – 1999-2003
      Exemplar systems: NESSTAR; Virtual Data Center.
      Core principles: Facilitate access & persistence. Cite research data in all publications that use it. Provide actionable URIs. Provide persistent identifiers. Use persistent institutions.
      Key work: [Altman, et al. 2001] [Ryssevik & Musgrave 2001]
    – 2004-2009
      Exemplar systems: TIB DOI Service; Dataverse Network.
      Core principles: Facilitate verification & reproducibility. Provide bit- or semantic-fixity. Provide granularity.
      Key work: [Brase 2004] [Buneman 2006] [Altman & King 2007]
    – 2009-
      Exemplar systems: Dataverse Network; DataCite; Data Dryad; FigShare; Data Citation Index.
      Core principles: Facilitate integration. Include data citations in standard locations in text. Index data citations in existing catalogs. Integrate data citation with
      Key work: [Uhlir (ed.) 2012] [CODATA 2013] [Data Synthesis Group 2014]
    Source: Altman & Crosas, 2014. The Evolution of Data Citation. IASSIST Quarterly. Forthcoming.
  • The Joint Principles for Data Citation
  • Significance & Scope
    • Sound, reproducible scholarship rests upon a foundation of robust, accessible data.
    • Data should be considered legitimate, citable products of research.
    • Data citation, like the citation of other evidence and sources, is good research practice.
    • The Joint Principles cover the purpose, function, and attributes of citations.
    • Specific practices vary across communities and technologies; we recommend that communities develop practices for machine and human citations consistent with these general principles.
  • The Noble Eight-Fold Path to Citing Data
    • Importance. Data should be considered legitimate, citable products of research. Data citations should be accorded the same importance in the scholarly record as citations of other research objects, such as publications [1].
    • Credit and attribution. Data citations should facilitate giving scholarly credit and normative and legal attribution to all contributors to the data, recognizing that a single style or mechanism of attribution may not be applicable to all data [2].
    • Evidence. In scholarly literature, whenever and wherever a claim relies upon data, the corresponding data should be cited [3].
    • Unique Identification. A data citation should include a persistent method for identification that is machine-actionable, globally unique, and widely used by a community [4].
    • Access. Data citations should facilitate access to the data themselves and to such associated metadata, documentation, code, and other materials, as are necessary for both humans and machines to make informed use of the referenced data [5].
    • Persistence. Unique identifiers, and metadata describing the data and its disposition, should persist -- even beyond the lifespan of the data they describe [6].
    • Specificity and verifiability. Data citations should facilitate identification of, access to, and verification of the specific data that support a claim. Citations or citation metadata should include information about provenance and fixity sufficient to facilitate verifying that the specific timeslice, version and/or granular portion of data retrieved subsequently is the same as was originally cited [7].
    • Interoperability and flexibility. Data citation methods should be sufficiently flexible to accommodate the variant practices among communities, but should not differ so much that they compromise interoperability of data citation practices across communities [8].
  • An Example
  • Placement of Citations
    Intra-work:
    ● Should provide sufficient information to identify the cited data reference within the included reference list.
    ● Citations to data should be in close proximity to the claims relying on the data. [Principle 3]
    ● May include additional information identifying the specific portion of the data supporting that claim. [Principle 7]
    Example: The plots shown in Figure X show the distribution of selected measures from the main data [Author(s), Year, portion or subset used].
    Full Citation:
    Citations may vary in style, but should be included in the full reference list along with citations to other types of works.
    Example: References Section
      Author(s), Year, Article Title, Journal, Publisher, DOI.
      Author(s), Year, Dataset Title, Data Repository or Archive, Version, Global Persistent Identifier.
      Author(s), Year, Book Title, Publisher, ISBN.
  • Generic Data Citation (as it appears in printed reference list)
    Author(s), Year, Dataset Title, Data Repository or Archive, Version, Global Persistent Identifier
    Principle 2: Credit and Attribution (e.g. authors, repositories or other distributors and contributors).
    Principle 4: Unique Identifier (e.g. DOI, Handle).
    Principles 5, 6: Access, Persistence. A persistent identifier that provides access and metadata.
    Principle 7: Specificity and verification (e.g. the specific version used). Versioning or timeslice information should be supplied with any updated or dynamic dataset.
    Note:
    ● Neither the format nor specific required elements are intended to be defined with this example. Formats, optional elements, and required elements will vary across publishers and communities. [Principle 8: Interoperability and flexibility]
    ● As illustrated in the previous examples, intra-work citations may be accompanied by information including the specific portion used. [Principles 7, 8]
    ● As illustrated in the next example, printed citations should be accompanied by metadata that support credit, attribution, specificity, and verification. [Principles 2, 5 and 7]
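The generic citation pattern on this slide can be sketched in code. What follows is a minimal illustration only: the `DataCitation` record and `format_citation` helper are hypothetical names that do not come from the principles, which deliberately leave formats and required elements to publishers and communities. The example values are placeholders, not a real dataset.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DataCitation:
    """Hypothetical container for the elements of the generic data citation."""
    authors: List[str]
    year: int
    title: str
    repository: str   # data repository or archive
    version: str      # version or timeslice (Principle 7)
    identifier: str   # global persistent identifier, e.g. a DOI or Handle (Principle 4)


def format_citation(c: DataCitation) -> str:
    """Render the elements in the order shown on the slide.

    The ordering and punctuation are one possible style, not a required format.
    """
    authors = "; ".join(c.authors)
    return (f"{authors}, {c.year}, {c.title}, {c.repository}, "
            f"{c.version}, {c.identifier}.")


example = DataCitation(
    authors=["Author, A.", "Author, B."],
    year=2014,
    title="Example Dataset Title",
    repository="Example Data Archive",
    version="V2",
    identifier="doi:10.0000/EXAMPLE",  # placeholder identifier, does not resolve
)
print(format_citation(example))
```

Keeping the elements in a structured record rather than a pre-formatted string lets the same citation be rendered in different community styles, in keeping with Principle 8.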
  • Citation Metadata
    Author(s), Year, Dataset Title, Data Repository or Archive, Version, Global Persistent Identifier.
    Metadata retrieval: EXAMPLE METADATA
      <!-- CONTRIBUTOR METADATA -->
      <contributor role="" ORCIDid="">Name</contributor>
      <!-- FIXITY and PROVENANCE -->
      <fixity type="MD5">XXXX</fixity>
      <fixity type="UNF">UNF:XXXX</fixity>
      <!-- MACHINE UNDERSTANDABILITY -->
      <content-type>data</content-type>
      <format>HDF5</format>
    Note:
    ● Metadata location, formats, and elements will vary across publishers and communities. [Principle 8]
    ● Citation metadata is needed in addition to the information in the printed citation.
    ● Metadata describing the data and its disposition should persist beyond the lifespan of the data. [Principle 6]
    ● Citation metadata should support attribution and credit [Principle 2]; machine use [Principle 5]; specificity and verification [Principle 7].
    ● For example, additional citation metadata may be embedded in the citing document; attached to the persistent identifier for the citation, through its resolution service; stored in a separate community indexing service (e.g. DataCite, CrossRef); or provided in a machine-readable way through the surrogate ("landing page") presented by the repository to which the identifier is resolved.
    For more detail, see the References section. http://www.force11.org/node/4772
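As a rough sketch of how the fixity element in the example metadata could support verification (Principle 7), the following code records an MD5 digest alongside contributor and format metadata, then checks retrieved bytes against it. The `citationMetadata` wrapper element and the function names `md5_fixity`, `build_metadata`, and `verify_fixity` are assumptions made for illustration; the slide does not define a schema. UNF fixity is omitted because it needs a dedicated implementation.

```python
import hashlib
import xml.etree.ElementTree as ET


def md5_fixity(data: bytes) -> str:
    """Return the hexadecimal MD5 digest of the data bytes."""
    return hashlib.md5(data).hexdigest()


def build_metadata(name: str, role: str, orcid: str,
                   data: bytes, fmt: str) -> ET.Element:
    """Build illustrative citation metadata modeled on the slide's example."""
    root = ET.Element("citationMetadata")  # hypothetical wrapper element
    contributor = ET.SubElement(
        root, "contributor", {"role": role, "ORCIDid": orcid})
    contributor.text = name
    fixity = ET.SubElement(root, "fixity", {"type": "MD5"})
    fixity.text = md5_fixity(data)
    ET.SubElement(root, "format").text = fmt
    return root


def verify_fixity(root: ET.Element, data: bytes) -> bool:
    """Check that retrieved data matches the MD5 recorded at citation time."""
    recorded = root.find("fixity[@type='MD5']")
    return recorded is not None and recorded.text == md5_fixity(data)


cited_bytes = b"hello"  # stand-in for the cited dataset's contents
metadata = build_metadata("Name", "author", "0000-0000-0000-0000",  # placeholder ORCID
                          cited_bytes, "HDF5")
print(ET.tostring(metadata, encoding="unicode"))
print(verify_fixity(metadata, cited_bytes))
```

A byte-level digest such as MD5 only confirms that the retrieved file is bit-identical to the cited one; it says nothing about data that are semantically the same but stored in a different format.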
  • What’s next?
  • Today
    • 9:20-10:20 Implementation Issues: Panel Discussion (Christine Borgman; Joan Starr; Anita de Waard; Ruth Duerr; Joe Hourclé; Sarah Callaghan; Puneet Kishor)
    • 10:20-10:45 Break
    • Implementation Issues: Open Discussion (Moderator: Mark Parsons)
  • After Today
    • Dissemination & Adoption
    • Implementation
      – Joint Implementation Group http://www.force11.org/node/4849
      – Research Data Alliance (Dynamic) Data Citation Working Group https://rd-alliance.org/working-groups/datacitation-wg.html
      – You!
  • Notes & References
    Notes
    [1] CODATA 2013, Sec 3.2.1; Uhlir (ed.) 2012, ch. 14; Altman & King 2007
    [2] CODATA 2013, Sec 3.2; 7.2.3; Uhlir (ed.) 2012, ch. 14
    [3] CODATA 2013, Sec 3.1; 7.2.3; Uhlir (ed.) 2012, ch. 14
    [4] Altman & King 2007; CODATA 2013, Sec 3.2.3, Ch. 5; Ball & Duke 2012
    [5] CODATA 2013, Sec 3.2.4, 3.2.5, 3.2.8
    [6] Altman & King 2007; Ball & Duke 2012; CODATA 2013, Sec 3.2.2
    [7] Altman & King 2007; CODATA 2013, Sec 3.2.7, 3.2.8
    [8] CODATA 2013, Sec 3.2.10
    References
    • Altman, M. & King, G., 2007. A Proposed Standard for the Scholarly Citation of Quantitative Data. D-Lib Magazine.
    • Ball, A. & Duke, M., 2012. Data Citation and Linking. DCC Briefing Papers. Edinburgh: Digital Curation Centre.
    • CODATA-ICSTI Task Group on Data Citation, 2013. Out of Cite, Out of Mind: The Current State of Practice, Policy, and Technology for the Citation of Data. Data Science Journal.
    • Uhlir, P. (ed.), 2012. For Attribution: Developing Data Attribution and Citation Practices and Standards. National Academies of Sciences.
  • Questions?
    Questions about the principles: http://www.force11.org/node/4769
    Questions about implementation: datacitationworkgroup@force11.org
    Questions for me: escience@mit.edu