Dataset Identification & Citation:
DataCite and EZID

Joan Starr
California Digital Library
October 2011
Dataset Identification & Citation
Introduction
The Researchers’ Challenge
  Identifiers are a tool for researchers
DataCite
  “Helping you find, access and reuse data.”
EZID
  Easy creation and management of DataCite DOIs and other identifiers.
Next steps
  For DataCite, EZID and you!
California Digital Library (CDL)
The Researchers’ Challenge
Early in the research life cycle
Data-intensive research + Writing up the results
“Where’s the data?” “What if I move it?”
PERSISTENT IDENTIFIERS make the difference
By Dave Rogers, http://www.flickr.com/photos/dave-rogers/2815036285/
Working on a federated team
Data-intensive research + Regional research center + Aging infrastructure
“Where’s the data?” “We have to move it!”
PERSISTENT IDENTIFIERS make the difference
© All rights reserved by University of California, http://www.flickr.com/photos/universityofcalifornia/5405812887
Making a career move
Data-intensive research + Researcher(s) on the move
“I know where my data is, and I’m taking it with me!”
PERSISTENT IDENTIFIERS make the difference
© All rights reserved by University of California, http://www.flickr.com/photos/universityofcalifornia/540630865
Meeting funder requirements
Data-intensive research + Grantor requirements for data management plan
“What do we put here?” “How do we track the data?”
PERSISTENT IDENTIFIERS make the difference
By David Mellis, http://www.flickr.com/photos/mellis/7675610/
DataCite
German National Library of Economics (ZBW)
German National Library of Science and Technology (TIB)
German National Library of Medicine (ZB MED)
GESIS - Leibniz Institute for the Social Sciences, Germany
Australian National Data Service (ANDS)
ETH Zurich, Switzerland
Canada Institute for Scientific and Technical Information (CISTI)
Technical Information Center of Denmark
Institute for Scientific & Technical Information (INIST-CNRS), France
TU Delft Library, The Netherlands
The Swedish National Data Service (SNDS)
The British Library, UK
California Digital Library (CDL), USA
Office of Scientific & Technical Information (OSTI), USA
Purdue University Library
DataCite Metadata V. 2.2
• Small required set = citation elements
• Optional descriptive set:
  – extendable lists
  – can refer to other standards, schemes
  – domain-neutral
  – rich ability to describe relationships to other
    digital objects
• Metadata Search (MDS) is full-text indexed
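The "rich ability to describe relationships" can be pictured as pairs of reciprocal relation types. The sketch below is illustrative only: the pair names are the ones listed in the speaker notes for the metadata properties slide, while the `PAIRS` and `INVERSE` tables, the `reciprocal` helper, and the placeholder identifiers are hypothetical names introduced here and are not part of any DataCite tooling.

```python
# Reciprocal relationType pairs as listed in the speaker notes (Metadata v. 2.2).
PAIRS = [
    ("IsCitedBy", "Cites"),
    ("IsSupplementTo", "IsSupplementedBy"),
    ("IsContinuedBy", "Continues"),
    ("IsNewVersionOf", "IsPreviousVersionOf"),
    ("IsPartOf", "HasPart"),
    ("IsDocumentedBy", "Documents"),
    ("IsCompiledBy", "Compiles"),
    ("IsVariantFormOf", "IsOriginalFormOf"),
]

# Hypothetical lookup table: each relation type maps to its reciprocal.
INVERSE = {}
for left, right in PAIRS:
    INVERSE[left] = right
    INVERSE[right] = left


def reciprocal(source_id, relation_type, target_id):
    """Restate 'source relation target' from the target's point of view."""
    return (target_id, INVERSE[relation_type], source_id)


# Placeholder identifiers, not real records.
print(reciprocal("dataset-A", "IsSupplementTo", "article-B"))
```

A dataset that declares itself a supplement to an article implies that the article is supplemented by the dataset, so a link recorded on one object can be read from either side.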
DataCite Metadata V. 2.2
Required properties
1.  Identifier (with type attribute)
2.  Creator (with name identifier attributes)
3.  Title (with optional type attribute)
4.  Publisher
5.  PublicationYear
Optional properties
6.  Subject (with schema attribute)
7.  Contributor (with type & name identifier attributes)
8.  Date (with type attribute)
9.  Language
10. ResourceType (with description attribute)
11. AlternateIdentifier (with type attribute)
12. RelatedIdentifier (with type & relation type attributes)
13. Size
14. Format
15. Version
16. Rights
17. Description (with type attribute)
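Because the five required properties double as citation elements, a record holding just those fields is enough to print a citation. The sketch below is an illustration under stated assumptions: the `DatasetRecord` class and its field names are hypothetical (they mirror the property names on the slide but are not the DataCite XML schema), the sample values are invented, and the layout "Creator (PublicationYear): Title. Publisher. Identifier" is our understanding of the citation form DataCite recommends.

```python
from dataclasses import dataclass, fields


@dataclass
class DatasetRecord:
    """Hypothetical container for the five required properties."""
    identifier: str
    creator: str
    title: str
    publisher: str
    publication_year: str


def missing_properties(record):
    """Return the names of required properties that are empty."""
    return [f.name for f in fields(record) if not getattr(record, f.name).strip()]


def format_citation(record):
    """Build a citation string from the required properties only."""
    missing = missing_properties(record)
    if missing:
        raise ValueError("missing required properties: " + ", ".join(missing))
    return (
        f"{record.creator} ({record.publication_year}): "
        f"{record.title}. {record.publisher}. {record.identifier}"
    )


# Invented sample values, for illustration only.
example = DatasetRecord(
    identifier="doi:10.5072/FK2EXAMPLE",
    creator="Doe, Jane",
    title="Example Survey Data",
    publisher="Example Repository",
    publication_year="2011",
)
print(format_citation(example))
```

Checking for empty fields before formatting reflects the point of the slide: a record without all five required properties cannot yield a complete citation.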
•   Get identifiers
•   Add location
•   Add metadata
•   Update location
•   Update metadata
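The five operations above map naturally onto a small set of HTTP requests. The sketch below only builds request descriptions and does not contact any server. It rests on assumptions not stated in the slides: that the API accepts metadata as "name: value" lines (ANVL), that a reserved `_target` element holds the location, that element names such as `datacite.title` are accepted, and that identifiers are minted with a POST to `/shoulder/...` and updated with a POST to `/id/...`. These details reflect our recollection of the EZID API documentation and should be checked against it; the helper names and example values are hypothetical.

```python
BASE = "http://n2t.net/ezid"  # application URL given in the slides


def _escape(text):
    # Percent-encode characters that would break a "name: value" line
    # (assumed ANVL escaping rule).
    return text.replace("%", "%25").replace("\n", "%0A").replace("\r", "%0D")


def to_anvl(metadata):
    """Serialize a dict as one 'name: value' line per element."""
    lines = []
    for name, value in metadata.items():
        safe_name = _escape(name).replace(":", "%3A")
        lines.append(f"{safe_name}: {_escape(value)}")
    return "\n".join(lines)


def mint_request(shoulder, target):
    """Get an identifier and add its location in one step (assumed endpoint)."""
    return ("POST", f"{BASE}/shoulder/{shoulder}", to_anvl({"_target": target}))


def update_request(identifier, changes):
    """Add or update location and metadata for an identifier (assumed endpoint)."""
    return ("POST", f"{BASE}/id/{identifier}", to_anvl(changes))


# Hypothetical identifier and values; nothing is sent over the network.
method, url, body = update_request(
    "ark:/99999/fk4example",
    {"_target": "http://example.org/new-home", "datacite.title": "Example Survey Data"},
)
print(method, url)
print(body)
```

Keeping the location in the same metadata record as the descriptive elements is what lets "update location" and "update metadata" share a single request shape in this sketch.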
http://n2t.net/ezid
What this means…
Next Steps
DataCite
• Dublin Core application profile
• Content Service
• Metadata v. 2.3
EZID
• UI redesign
• Automated link checking
• Exposure for citations
By Nicola Whitaker, http://www.flickr.com/photos/nicolawhitaker/111009156/
Next Steps for you
• Get more information, and
• Try EZID for yourself!
By Nicola Whitaker, http://www.flickr.com/photos/nicolawhitaker/111009156/
For more information
EZID
EZID application: http://n2t.net/ezid/
EZID website: http://www.cdlib.org/services/uc3/ezid/
UC3 website: http://www.cdlib.org/services/uc3/


DataCite
DataCite Home: http://datacite.org/
DataCite Metadata Schema: http://schema.datacite.org/meta/kernel-2.2/index.html
DataCite Metadata Search: http://search.datacite.org



Contact Joan Starr at uc3@ucop.edu
Questions?
By Horia Varlan, http://www.flickr.com/photos/horiavarlan/4273168957/in/photostream/


Editor's Notes

  • #2 Thank you for this opportunity to speak with you today about Dataset Identification & Citation.
  • #4 My library:
    Serving the 10 UC campuses
      – 226,000 students
      – 134,000 faculty and staff
    Working collaboratively
      – libraries
      – data centers
      – museums, archives
      – faculty and researchers
    CDL has historically provided strategic, integrated technical and program services in a broad portfolio, including:
      – Groundbreaking licensing agreements
      – Union bibliographic services
      – Data curation & preservation tools
      – Open access publishing services
    CDL: http://www.cdlib.org/
  • #5 My group: The UC Curation Center is a creative partnership between the CDL, the ten UC campuses, and peer institutions in the community. An evolving community of shared concern and practice; bringing together diverse experience, expertise, and resources; providing robust curation solutions.
  • #6 Let’s start out by taking a look at some common challenges in data-intensive research today.
  • #7 Researchers doing data-intensive research and writing want to refer to the dataset right now, even though they haven't yet found a permanent "home" for the data. If they get a persistent identifier for that dataset now, they will have a reference that can be used in the paper. When the papers are published and the data is moved, the researcher simply updates the target URL, and the reference will still work.
  • #8 RESEARCH TEAMS work ACROSS REGIONS OR COUNTRIES where data is hosted REMOTELY. Let’s assume the database is stored on someone’s departmental web server, but the server is getting old and will soon be replaced. The team can get an identifier now and circulate it to colleagues and the entire data federation. When the infrastructure is replaced, the team updates the location details so that references to the database continue to work perfectly.
  • #9 Researchers who have published extensively and who want to be able to move around in their career may also want to take their data with them. They can get identifiers for the work AND the datasets that go with it. With persistent identifiers, the references are never broken, because the researcher can keep the target URLs and other metadata up to date even as she moves.
  • #10 As the NSF and other funders issue requirements for data management plans, scientists have to be able to answer basic questions like, How will you name and organize the data files? Persistent identifiers provide a ready answer to this requirement.
  • #11 To address this challenge, DataCite was formed in 2009 by 10 Libraries and Research Centers.
  • #12 The number has now grown to 15. In addition there are 3 associate members, including the Korea Institute of Science and Technology Information, so there is a presence in Asia.
    Mission: “Helping you find, access, and reuse data”
    Advocacy, citation
    To support citation, access and finding, you need… Metadata
  • #13 MDS=Metadata Search
  • #14 The 5 Required properties = basic citation elements
    [click] Optional elements
    The Family Jewels = RelatedIdentifier, relationType
      – IsCitedBy & Cites
      – IsSupplementTo & IsSupplementedBy
      – IsContinuedBy & Continues
      – IsNewVersionOf & IsPreviousVersionOf
      – IsPartOf & HasPart
      – IsDocumentedBy & Documents
      – IsCompiledBy & Compiles
      – IsVariantFormOf & IsOriginalFormOf
    COMING IN 2.3: IsIdenticalTo
  • #15 Now that we’ve discussed identifiers, how do you get them? How do you keep them up to date?
    EZID
      – A service to make and manage actionable ids
      – Can manage identifiers under different schemes: ARKs, DOIs, and more to come
      – User and programming interfaces
      – Partnering for replication
  • #16 How to use the UI to test EZID
    ARKs and DOIs
    ARKs
      – Flexible
      – Case-sensitive
      – Special features support granularity
      – Can be deleted
      – Inexpensive
    DOIs
      – Established brand in publishing
      – Indexed by major A&I citation databases
      – DataCite policies apply
      – Cannot be deleted
      – More costly
    DOIs should be assigned to objects that are under good long-term management, and where there is an intention to make the object persistently available.
    DOIs must be registered exclusively with metadata that is available to public view.
    Can DOIs and ARKs work together?
    Yes. For example, researchers may choose to use ARKs for unpublished materials associated with an object that has been registered with a DOI. These two identifier schemes can work well together, and EZID offers them both, along with policy support consistent across both schemes.
  • #17 Let’s take a look at the UI now. I would give you a live demo, but I’m afraid that it might have some difficulties traveling over Skype. I’ve made some key captures here, and I think it will work fairly well for us. So, this is our User Interface. EZID also has a machine-to-machine interface, an API, and a link to the documentation is here. If you’d like to try EZID, simply click on the help tab [CLICK] here.
  • #19 On the Help screen, you have the choice of creating a test ARK or DOI. [CLICK] Click the Create button.
  • #21 EZID creates the identifier and sends you to the MANAGE tab where you have the opportunity to enter a target URL and other metadata as we’ve seen earlier.
  • #22 EZID creates the identifier and sends you to the MANAGE tab where you have the opportunity to enter a target URL and other metadata as we discussed earlier. The EZID UI allows the entry of DataCite’s required set, and you can submit a full record using the API.
  • #23 So here is what this means. Here is an example of a data set deposited with one of our clients, Dryad. Dryad is an international repository of data underlying peer-reviewed articles in the basic and applied biosciences.
  • #25 Dublin Core application profile available for the DataCite Metadata Schema; we’ll keep it up to date and in sync. From the DCMI: “A DCAP is designed to promote interoperability within the constraints of the Dublin Core model and to encourage harmonization of usage and convergence on "emerging semantics" around its edges.”
    Content Service exposes our metadata stored in the DataCite Metadata Store (MDS) using multiple formats. Alpha version: the service can be accessed at http://data.datacite.org
    EZID:
      – UI redesign
      – Activity reporting
      – Browse & search
      – Enhanced persistence support
      – Automated link checking
      – Tombstone pages (a web page returned for a resource no longer found at its target location of record. The tombstone may provide “last known” metadata, including the original owner.)
      – Exposure for citations: Thomson-Reuters (Web of Knowledge); Elsevier (Scopus); OAI? RSS?; Google Scholar