  • This work, by Micah Altman (http://micahaltman.com), is licensed under the Creative Commons Attribution-Share Alike 3.0 United States License. To view a copy of this license, visit http://creativecommons.org/licenses/by-sa/3.0/us/ or send a letter to Creative Commons, 171 Second Street, Suite 300, San Francisco, California, 94105, USA.

Transcript

  • 1. Prepared for Data Citation Synthesis Group Open Workshops, Sept 2013
    Summary of data citation synthesis activity & Next steps for review <bit.ly/dsynthrev>
    Dr. Micah Altman <escience@mit.edu>, MIT Libraries
    Joan Starr <Joan.starr@ucop.edu>, California Digital Library
  • 2. Summary of synthesis activity & Next steps for review: What has been done?
  • 3. Refining Approaches to Data Citation
    • 2000-2004
      – Example systems: NESSTAR, Virtual Data Center
      – Core recommendations: cite research data in publications; use persistent identifiers; facilitate direct access to data through URIs
      – Key references: [Ryssevik & Musgrave 2001] [Altman, et al. 2001]
    • 2005-2009
      – Example systems: Dataverse Network System, TIB Data DOI Registration
      – Core recommendations: include versioning, fixity, and granularity for verification; use permanent institutions; facilitate attribution
      – Key references: [Buhneman 2006] [Altman & King 2007] [OECD 2009]
    • 2010-
      – Example systems: DataCite; Thomson Reuters Data Citation Index; FigShare; Data Dryad
      – Core recommendations: include data citations in standard locations; index data citations in catalogs; facilitate machine understanding
      – Key references: [NAS 2012] [DCC 2012] [Force11 2013] [CODATA 2013]
  • 4. Integrating Current Recommendations
    • Disciplinary Practices; Repository Practices
    • Summative Recommendations
    • Synthesis Principles
  • 5. Synthesis Group Activity
    • Hosted by Force 11
      – Charter here: http://www.force11.org/node/4432
    • Formed early summer
    • Meeting weekly
    • Reviewed current key recommendations & engaged lead authors:
      – Force 11/Amsterdam Manifesto [Force11 2013]
      – CODATA/“Out of Cite” Recommendations [CODATA 2013]
      – DCC Guide [DCC 2012]
      – DataCite/Metadata Core [Datacite 2012]
      – Research Data Alliance
    • Identified core principles that are consistent across recommendation groups
    • Formulated a draft synthesis of principles
    • Agreed to use key documents above for definitions of terms and detailed explanation of issues
    • Out of scope: specific detailed standards, protocols, infrastructure, tools
  • 6. Yesterday
    • Open Workshop
    • Line-by-line review of draft
    • Open editing of document
      – In shared document
      – Using revision control
    • Convergence on principles
      – 8 principles revised and approved by consensus
      – 1 recommendation struck
      – 1 recommendation tabled for discussion today
    • Summary
      – Substantial core of agreement: need for citation; use of persistent identifiers; support for human and machine access; facilitation of verification and attribution
      – Maintain conceptual boundaries among data citation, publication, and evaluation
      – Recognize that terminology cannot always be aligned with colloquial or disciplinary usage
  • 7. The principles
    1. Importance
    2. Credit and attribution
    3. Evidence
    4. Unique identification
    5. Access
    6. Persistence
    7. Versioning and granularity
    8. Interoperability and flexibility
  • 8. Open Question: Data Repository Recommendations
    6. Persistence: Metadata describing the data, and unique identifiers, should persist, even beyond the lifespan of the data they describe. Data citations should be resolvable to data stored in repositories with a commitment and demonstrated capability to maintain long-term access. Data stored in such repositories may not always be publicly accessible. Although such repositories should be committed to long-term maintenance and preservation of data, the nature of digital data is such that they may not persist indefinitely.
  • 9. Review Process
    • Synthesis group will supplement today’s consensus principles with background:
      – Illustrative examples for each recommendation
      – References with each principle to detailed discussion of embedded issues in prior reports
      – Glossary
    • Public release of draft for open online commentary
    • Integration of commentary and release of final draft
  • 10. Questions for Review & Decisions
    • Nomination of additional members to synthesis group for preparation of summary material (glossary, references, examples, preamble)?
      – Decision: anyone in attendance who can substantively (if not officially) represent a group
      – Decision: identify additional key organizations for commentary
    • Public release of draft: when, to whom?
      – Decision: available for open public commentary mid-November
      – Decision: will specifically request comments from key organizations, including:
        • Organizations listed earlier (Force11, DCC, CODATA, ESIP, RDA, Dataverse, Data-PASS, DataCite)
        • Additional suggested organizations: NLM, ARL
        • Additional organizations identified by synthesis group
    • Open commentary via mailing list & Force11 website. Period for commentary?
      – Decision: 6-8 weeks for public commentary
    • Integration of commentary by synthesis group and release of updated draft. Number of drafts necessary? When to declare “done”?
      – Decision: single round of revisions by synthesis group. Will then seek endorsements.
  • 11. Additional References
    • [Ryssevik & Musgrave 2001] J. Ryssevik, S. Musgrave. 2001. The Social Science Dream Machine. Social Science Computer Review.
    • [Altman, et al. 2001] M. Altman, et al. 2001. A Digital Library for the Dissemination and Replication of Quantitative Social Science Research: The Virtual Data Center. Social Science Computer Review.
    • [Buhneman 2006] P. Buhneman. 2006. How to Cite Curated Databases and Make them Citable. SSDBM ’06.
    • [Altman & King 2007] M. Altman, G. King. 2007. A Proposed Standard for the Scholarly Citation of Quantitative Data. D-Lib.
    • [OECD 2009] T. Green. 2009. We need publishing standards for datasets and data tables. OECD.
    • [NAS 2012] P. Uhlir (ed.). 2011. For Attribution: Developing Data Attribution and Citation Practices and Standards. National Academies of Sciences.
  • 12. Synthesis Group Contacts
    • About the synthesis group: http://www.force11.org/node/4432
    • Questions for the synthesis group: datacitationworkgroup@force11.org
    • Consensus document, with revision history: https://docs.google.com/document/d/1KosNqBPgE8ziWDuJgBIrk20KxcOXoZdAt_TdJV3xoz8/edit?usp=drive_web
  • 13. Key Recommendations
    • [Force11 2013] M. Crosas, T. Carpenter, C. Borgman, D. Shotton. 2013. The Amsterdam Manifesto on Data Citation Principles. Force11.
    • [CODATA 2013] CODATA-ICSTI Task Group on Data Citation. 2013. Out of Cite, Out of Mind: The Current State of Practice, Policy, and Technology for the Citation of Data. Data Science Journal.
    • [DCC 2012] A. Ball, M. Duke. 2012. Data Citation and Linking. DCC Briefing Papers. Edinburgh: Digital Curation Centre.
  • 14. Additional References
    • [Ryssevik & Musgrave 2001] J. Ryssevik, S. Musgrave. 2001. The Social Science Dream Machine. Social Science Computer Review.
    • [Altman, et al. 2001] M. Altman, et al. 2001. A Digital Library for the Dissemination and Replication of Quantitative Social Science Research: The Virtual Data Center. Social Science Computer Review.
    • [Buhneman 2006] P. Buhneman. 2006. How to Cite Curated Databases and Make them Citable. SSDBM ’06.
    • [Altman & King 2007] M. Altman, G. King. 2007. A Proposed Standard for the Scholarly Citation of Quantitative Data. D-Lib.
    • [OECD 2009] T. Green. 2009. We need publishing standards for datasets and data tables. OECD.
    • [NAS 2012] P. Uhlir (ed.). 2011. For Attribution: Developing Data Attribution and Citation Practices and Standards. National Academies of Sciences.
    • [Datacite 2012] DataCite metadata schema, v 3.0. http://schema.datacite.org/
  • 15. Appendix: Principles
  • 16. The principles
    1. Importance: Data should be considered legitimate, citable products of research. Data citations should be accorded the same importance in the scholarly record as citations of other research objects, such as publications.
  • 17. The principles
    2. Credit and attribution: Data citations should facilitate giving scholarly credit and normative and legal attribution to all contributors to the data, recognizing that a single style or mechanism of attribution may not be applicable to all data.
    3. Evidence: Where a specific claim rests upon data, the corresponding data citation should be provided.
  • 18. The principles
    4. Unique identification: A data citation should include a persistent method for identification that is machine actionable, globally unique, and widely used by a community.
    5. Access: Data citations should facilitate access to the data themselves and to such associated metadata, documentation, and other materials as are necessary for both humans and machines to make informed use of the referenced data.
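
As a rough illustration of what "machine actionable" in principle 4 can mean in practice, the Python sketch below turns a DOI into a resolver URL. DOIs are only one of the identifier schemes that could satisfy the principle. The function name `doi_to_url`, the validation pattern, and the DOI `10.1234/example.dataset.v1` are all assumptions made up for this example; none of them comes from the principles themselves.

```python
import re
from urllib.parse import quote

# Loose pattern for a DOI: "10.", a 4-9 digit registrant code, "/", then a suffix.
# This is an illustrative assumption, not a complete statement of DOI syntax.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")


def doi_to_url(identifier: str) -> str:
    """Return an https://doi.org/ resolver URL for a DOI string."""
    doi = identifier.strip()
    # Accept the common "doi:" prefix used in written citations.
    if doi.lower().startswith("doi:"):
        doi = doi[4:].strip()
    if not DOI_PATTERN.match(doi):
        raise ValueError(f"not a recognizable DOI: {identifier!r}")
    # Percent-encode characters that are unsafe in a URL path, keeping "/".
    return "https://doi.org/" + quote(doi, safe="/")


# Hypothetical identifier, for illustration only.
url = doi_to_url("doi:10.1234/example.dataset.v1")
```

A citation that carries an identifier of this kind can be followed by software as well as by a reader, which is the sense in which principle 5 speaks of access for both humans and machines.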
  • 19. The principles
    6. Persistence: Metadata describing the data, and unique identifiers, should persist, even beyond the lifespan of the data they describe. [more to be decided upon]
  • 20. The principles
    7. Versioning and granularity: Data citations should facilitate identification and access to different versions and/or subsets of data. Citations should include sufficient detail to verifiably link the citing work to the portion and version of data cited.
    8. Interoperability and flexibility: Data citation methods should be sufficiently flexible to accommodate the variant practices among communities but should not differ so much that they compromise interoperability of data citation practices across communities.
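
One way to picture principle 7 is a citation record that carries a version, an optional subset description, and a fixity value, so that a reader can check that the bytes in hand are the bytes that were cited. The Python sketch below is a minimal illustration under stated assumptions: the class `DataCitation`, its field names, the use of SHA-256 as the fixity value, and the sample creator, title, and identifier are all hypothetical and are not prescribed by the principles.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DataCitation:
    """Hypothetical citation record; field names are illustrative only."""
    creators: str
    year: int
    title: str
    identifier: str
    version: str
    subset: Optional[str] = None   # granularity: which portion was used
    sha256: Optional[str] = None   # fixity: digest of the cited bytes

    def render(self) -> str:
        """Format the record as a human-readable citation string."""
        parts = [f"{self.creators} ({self.year}). {self.title}, version {self.version}."]
        if self.subset:
            parts.append(f"Subset: {self.subset}.")
        parts.append(self.identifier)
        if self.sha256:
            parts.append(f"[sha256:{self.sha256}]")
        return " ".join(parts)

    def matches(self, data: bytes) -> bool:
        """Return True if the data has the digest recorded in the citation."""
        if self.sha256 is None:
            return False
        return hashlib.sha256(data).hexdigest() == self.sha256


# Made-up data and metadata, for illustration only.
data_v2 = b"id,value\n1,10\n2,20\n"
citation = DataCitation(
    creators="Doe, J.",
    year=2013,
    title="Example Survey Data",
    identifier="doi:10.1234/example",
    version="2",
    subset="rows 1-2",
    sha256=hashlib.sha256(data_v2).hexdigest(),
)
```

Calling `citation.matches(...)` on a different version of the file returns `False`, which is the kind of verifiable link between citing work and cited data that the principle describes.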