PIDs and DOI registration with DataCite - IATUL Workshop 2013



  1. Frauke Ziedorn. IATUL Workshop 2013, Research Data Management: Finding our Role, 6 December 2013. PIDs and DOI Registration with DataCite
  2. Why publish and cite research data? Background: • Easier re-usability and verification of data • Recognition for collection and documentation of data (citation indices) • Compliance with funders' requirements (e.g. German Research Foundation) • Avoiding duplication • Motivation for new research
  3. Persistent Identifier I • DOI • Citation of scientific publications • Established in the scientific community • Global resolving via any handle server or{doi} • Persistence and data quality are guaranteed • Handle • Global referencing of data before publication • Global resolving via any handle server • No persistence or quality management
  4. Persistent Identifier II • URN • Referencing of local documents in a closed system (e.g. dissertations, theses) • Resolving only on server of publisher • Persistence and data quality are guaranteed • ARK • Documents of all work stages • Persistence declaration available; may be deleted • Resolving free of charge only on server of publisher • Quality management (metadata), no standards for persistence
  5. The DOI® System • The International DOI Foundation was founded in 1998. • The DOI system offers long-term persistence and accessibility of data. • Based on the Handle system. • In May 2012 the DOI system was published as ISO standard 26324. • Part of the quality control is mandatory metadata for each object registered with a DOI. DOI®, DOI.ORG® and shortDOI® are trademarks of the International DOI Foundation
  6. A Little History • 2003: DFG-funded project of the TIB with World Data Centres regarding the publication of research data. • 2005: TIB becomes the first DOI registration agency for research data. From the beginning, grey literature is also registered. • 2009-03: Paris Memorandum regarding the cooperation of 6 European information providers. • 2009-12: DataCite is founded in London with 7 members.
  7. DataCite • Growing demand to make data citable. • DataCite is an international consortium whose aims are • to establish easier access to research data on the Internet • to increase acceptance of research data as legitimate, citable contributions to the scholarly record • to support data archiving that will permit results to be verified and re-purposed for future study. • Development of standards, workflows, and best practices. • 2013: • 18 members from 13 countries, • 9 associated members, • ~2.2 million DOIs
  8. DataCite Members
  9. DOI System Infrastructure (diagram): International DOI Foundation • 9 DOI Registration Agencies, including DataCite • Managing Agent: TIB • DataCite Members and Associate Members • Data Centres attached to each DataCite Member
  10. DataCite Services: Metadata Store (MDS) • Registration and updating of DOI names. • Storage of metadata. • Accessible via UI or API.
  11. DataCite and Metadata • Metadata make data discoverable. • Long-term maintenance of metadata is an important part of the persistence of an identifier. • Schema is inspired by Dublin Core. • Core value of the DataCite Metadata Schema: Linking between data and related objects. • Future vision: Links between all related publications and objects.
  12. DataCite Metadata Schema: Mandatory Properties • Identifier (with type attribute) • Creator (with type and nameIdentifier attributes) • Title (with optional type attribute) • Publisher • PublicationYear • Citation: Creator (PublicationYear): Title. Publisher. Identifier
  13. Citation: Creator (PublicationYear): Title. Publisher. Identifier. Dataset: Kuhlmann, H et al. (2009): Age models, iron intensity, magnetic susceptibility records and dry bulk density of sediment cores from around the Canary Islands. PANGAEA - Data Publisher for Earth & Environmental Science. doi:10.1594/PANGAEA.727522. Is supplement to this article: Kuhlmann, Holger; Freudenthal, Tim; Helmke, Peer; Meggers, Helge (2004): Reconstruction of paleoceanography off NW Africa during the last 40,000 years: influence of local and regional factors on sediment accumulation. Marine Geology, 207(1-4), 209-224, doi:10.1016/j.margeo.2004.03.017
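
The citation pattern on slides 12 and 13 (Creator (PublicationYear): Title. Publisher. Identifier) can be sketched in a few lines of Python. This is only an illustration: the `Record` class, its field names, and the rule for abbreviating long creator lists to "et al." are assumptions made for the example and are not taken from DataCite. The sample values come from the GigaScience record shown on the content negotiation slides.

```python
from dataclasses import dataclass


@dataclass
class Record:
    """The five mandatory properties (class and field names are illustrative)."""
    identifier: str          # the DOI name, without a "doi:" prefix
    creators: list[str]
    title: str
    publisher: str
    publication_year: int


def format_citation(rec: Record, max_creators: int = 4) -> str:
    """Render 'Creator (PublicationYear): Title. Publisher. Identifier'."""
    if not rec.creators:
        raise ValueError("at least one creator is required")
    if len(rec.creators) > max_creators:
        # Abbreviation rule is an assumption for this sketch.
        names = f"{rec.creators[0]} et al."
    else:
        names = "; ".join(rec.creators)
    return (f"{names} ({rec.publication_year}): {rec.title}. "
            f"{rec.publisher}. doi:{rec.identifier}")


penguin = Record(
    identifier="10.5524/100005",
    creators=["Li, J", "Zhang, G", "Lambert, D", "Wang, J"],
    title="Genomic data from the Emperor penguin (Aptenodytes forsteri)",
    publisher="GigaScience",
    publication_year=2011,
)
print(format_citation(penguin))
```

Keeping the five mandatory properties in one small record type makes it obvious when one of them is missing, which is the point of declaring them mandatory.
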
  14. DataCite Metadata Schema: Optional Properties • Subject (with scheme attribute) • Contributor (with type and nameIdentifier attributes) • Date (with type attribute) • Language • ResourceType (with description attribute) • AlternateIdentifier (with type attribute) • RelatedIdentifier (with type and relationType attributes) • Size • Format • Version • Rights • Description (with type attribute) • GeoLocation (with point, box, and place)
  15. DataCite Services: Metadata Search • Search engine for all metadata stored in the MDS. • Filter options to refine the search.
  16. DataCite Services: OAI-PMH Data Provider • Exposes metadata stored in the MDS using the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH). • Different metadata formats are available. • Service is open to everyone; harvesters include: • TIB (GetInfo) • Thomson Reuters (Data Citation Index) • Elsevier (Exlibris)
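
As a rough illustration of how a harvester talks to an OAI-PMH data provider like the one on slide 16, the sketch below builds request URLs from the six verbs defined by the protocol. The base URL is a placeholder, since the slides do not give the service address, and the helper name `oai_request_url` is made up for this example. `oai_dc` is the Dublin Core format that every OAI-PMH repository is required to offer; which further formats a given provider supports has to be checked with `ListMetadataFormats`.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: the real base URL is not given in the slides.
BASE_URL = "https://oai.example.org/oai"

# The six request verbs defined by OAI-PMH 2.0.
OAI_VERBS = {"Identify", "ListMetadataFormats", "ListSets",
             "ListIdentifiers", "ListRecords", "GetRecord"}


def oai_request_url(verb: str, base_url: str = BASE_URL, **args: str) -> str:
    """Build an OAI-PMH request URL for the given verb and arguments."""
    if verb not in OAI_VERBS:
        raise ValueError(f"unknown OAI-PMH verb: {verb}")
    # The verb goes first, followed by the caller's arguments in order.
    query = urlencode({"verb": verb, **args})
    return f"{base_url}?{query}"


# Ask which metadata formats are on offer, then harvest in Dublin Core.
print(oai_request_url("ListMetadataFormats"))
print(oai_request_url("ListRecords", metadataPrefix="oai_dc"))
```

The function only constructs URLs; fetching them and following resumption tokens for large result sets is left out to keep the sketch short.
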
  17. Services in Cooperation with CrossRef • A Citation Formatter which provides over 100 different formats for citations. • With Content Negotiation it is possible to access different media types of a registered object (machine-to-machine only).
  18. Content Negotiation. Resolving to a citation: datacite+text/10.5524/100005 Li, J; Zhang, G; Lambert, D; Wang, J (2011): Genomic data from Emperor penguin. GigaScience.
  19. Content Negotiation. Resolving to RDF metadata: <rdf:RDF xmlns:rdf="" xmlns:owl="" xmlns:j.0="" > <rdf:Description rdf:about=""> <j.0:identifier>10.5524/100005</j.0:identifier> <j.0:creator>Li, J</j.0:creator> <j.0:creator>Zhang, G</j.0:creator> <j.0:creator>Wang, J</j.0:creator> <owl:sameAs>doi:10.5524/100005</owl:sameAs> <owl:sameAs>info:doi/10.5524/100005</owl:sameAs> <j.0:publisher>GigaScience</j.0:publisher> <j.0:creator>Lambert, D</j.0:creator> <j.0:date>2011</j.0:date> <j.0:title>Genomic data from the Emperor penguin (Aptenodytes forsteri)</j.0:title> </rdf:Description></rdf:RDF>
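
Content negotiation as described on slides 17 to 19 means that a client asks for a particular representation of a registered object through the HTTP `Accept` header. The sketch below prepares such a request without sending it. The resolver address `https://doi.org/` and the media type `application/rdf+xml` are assumptions chosen for the example; the media types a service actually supports should be taken from its documentation. The helper name `negotiation_request` is also made up for this sketch.

```python
from urllib.request import Request

# Assumption: the public DOI resolver is used as the entry point.
RESOLVER = "https://doi.org/"


def negotiation_request(doi: str, media_type: str) -> Request:
    """Prepare (but do not send) a GET request for one representation."""
    # The Accept header tells the server which media type the client wants.
    return Request(RESOLVER + doi, headers={"Accept": media_type})


# Ask for RDF metadata of the dataset used on the slides.
req = negotiation_request("10.5524/100005", "application/rdf+xml")
print(req.get_method(), req.full_url)
print("Accept:", req.get_header("Accept"))
```

Passing `req` to `urllib.request.urlopen` would send the request and follow any redirects; that step is omitted here so the example runs without network access.
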
  20. DataCite: New Developments • ORCID and DataCite Interoperability Network ( ) • Link datasets to your ORCID profile • Inclusion of a DataCite interface in the next version of DSpace
  21. Links • Access to all versions of the DataCite metadata schema, with documentation, schema definition, and examples. • Search engine for all metadata stored by DataCite. • DataCite's OAI-PMH service, which allows access to the metadata. • DataCite Content Service, which exposes metadata in multiple formats. • DataCite's test system, which includes all services, such as MDS, Search, and Content Negotiation. • Display of registration and resolving statistics.
  22. Thank you for your attention!