Dataset Identification and Citation
NISO Managing Data for Scholarly Communications Webinar, October 19, 2011

  • Thank you for this opportunity to speak with you today about Dataset Identification & Citation.
  • My library: The California Digital Library (CDL) serves the 10 UC campuses (226,000 students; 134,000 faculty and staff), working collaboratively with libraries, data centers, museums, archives, faculty, and researchers. CDL has historically provided strategic, integrated technical and program services in a broad portfolio, including groundbreaking licensing agreements, union bibliographic services, data curation and preservation tools, and open access publishing services. CDL: http://www.cdlib.org/
  • My group: The UC Curation Center (UC3) is a creative partnership between the CDL, the ten UC campuses, and peer institutions in the community. It is an evolving community of shared concern and practice, bringing together diverse experience, expertise, and resources to provide robust curation solutions.
  • Let’s start out by taking a look at some common challenges in data-intensive research today.
  • Researchers doing data-intensive research and writing want to refer to the dataset right now, even though they haven’t yet found a permanent “home” for the data. If they get a persistent identifier for that dataset now, they will have a reference that can be used in the paper. When the paper is published and the data is moved, the researcher simply updates the target URL, and the reference will still work.
  • Research teams work across regions or countries, where the data is hosted remotely. Let’s assume the database is stored on someone’s departmental web server, but the server is getting old and is soon to be replaced. The team can get an identifier now and circulate it to colleagues and the entire data federation. When the infrastructure is replaced, the team updates the location details so that references to the database continue to work perfectly.
  • Researchers who have published extensively and who want to be able to move around in their careers may also want to take their data with them. They can get identifiers for the work and for the datasets that go with it. With persistent identifiers, the references are never broken, because the researcher can keep the target URLs and other metadata up to date even as she moves.
  • As the NSF and other funders issue requirements for data management plans, scientists have to be able to answer basic questions like, How will you name and organize the data files? Persistent identifiers provide a ready answer to this requirement.
  • To address this challenge, DataCite was formed in 2009 by 10 libraries and research centers.
  • The number has now grown to 15. In addition there are 3 associate members, including the Korea Institute of Science and Technology Information, so there is a presence in Asia. Mission: “Helping you find, access, and reuse data.” Advocacy, citation. To support citation, access, and finding, you need metadata.
  • MDS=Metadata Search
  • The 5 required properties = basic citation elements. [CLICK] Optional elements. The Family Jewels = RelatedIdentifier, relationType: IsCitedBy & Cites; IsSupplementTo & IsSupplementedBy; IsContinuedBy & Continues; IsNewVersionOf & IsPreviousVersionOf; IsPartOf & HasPart; IsDocumentedBy & Documents; IsCompiledBy & Compiles; IsVariantFormOf & IsOriginalFormOf. Coming in 2.3: IsIdenticalTo.
  • Now that we’ve discussed identifiers, how do you get them? How do you keep them up to date? EZID: a service to make and manage actionable IDs. It can manage identifiers under different schemes: ARKs, DOIs, and more to come. User and programming interfaces. Partnering for replication.
  • How to use the UI to test EZID. ARKs and DOIs. ARKs: flexible; case-sensitive; special features support granularity; can be deleted; inexpensive. DOIs: established brand in publishing; indexed by major A&I citation databases; DataCite policies apply; cannot be deleted; more costly. DOIs should be assigned to objects that are under good long-term management, and where there is an intention to make the object persistently available. DOIs must be registered exclusively with metadata that is available to public view. Can DOIs and ARKs work together? Yes. For example, researchers may choose to use ARKs for unpublished materials associated with an object that has been registered with a DOI. These two identifier schemes can work well together, and EZID offers them both, along with policy support consistent across both schemes.
  • Let’s take a look at the UI now. I would give you a live demo, but I’m afraid that it might have some difficulties traveling over Skype. I’ve made some key captures here, and I think they will work fairly well for us. So, this is our user interface. EZID also has a machine-to-machine interface, an API, and a link to the documentation is here. If you’d like to try EZID, simply click on the Help tab [CLICK] here.
  • On the Help screen, you have the choice of creating a test ARK or DOI. [CLICK] Click the Create button.
  • EZID creates the identifier and sends you to the MANAGE tab where you have the opportunity to enter a target URL and other metadata as we discussed earlier. The EZID UI allows the entry of DataCite’s required set, and you can submit a full record using the API.
  • So here is what this means. Here is an example of a data set deposited with one of our clients, Dryad. Dryad is an international repository of data underlying peer-reviewed articles in the basic and applied biosciences.
  • DataCite: A Dublin Core application profile is available for the DataCite Metadata Schema; we’ll keep it up to date and in sync. From the DCMI: “A DCAP is designed to promote interoperability within the constraints of the Dublin Core model and to encourage harmonization of usage and convergence on "emerging semantics" around its edges.” The Content Service exposes our metadata stored in the DataCite Metadata Store (MDS) using multiple formats. Alpha version: the service can be accessed at http://data.datacite.org. EZID: UI redesign. Activity reporting. Browse & search. Enhanced persistence support. Automated link checking. Tombstone pages (a tombstone is a web page returned for a resource no longer found at its target location of record; it may provide “last known” metadata, including the original owner). Exposure for citations: Thomson-Reuters (Web of Knowledge), Elsevier (Scopus), OAI? RSS? Google Scholar.
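The researcher scenarios in these notes all rest on one mechanism: the identifier that gets cited stays fixed, while the target URL behind it can be updated. A minimal sketch of that indirection follows, assuming a toy in-memory registry. The class names, the identifier string, and both URLs are hypothetical and are not EZID's actual interface.

```python
from dataclasses import dataclass, field


@dataclass
class IdentifierRecord:
    """One persistent identifier's current location and descriptive metadata."""
    target: str
    metadata: dict = field(default_factory=dict)


class IdentifierRegistry:
    """Toy in-memory stand-in for an identifier service (hypothetical, not EZID)."""

    def __init__(self):
        self._records = {}

    def register(self, identifier, target, **metadata):
        # Refuse to silently overwrite an identifier that already exists.
        if identifier in self._records:
            raise ValueError(f"{identifier} is already registered")
        self._records[identifier] = IdentifierRecord(target, dict(metadata))

    def update_target(self, identifier, new_target):
        # Only the location changes; the identifier string stays the same.
        self._records[identifier].target = new_target

    def resolve(self, identifier):
        return self._records[identifier].target


registry = IdentifierRegistry()

# Made-up identifier and URLs, for illustration only.
citation = "ark:/99999/fk4example"
registry.register(citation, "http://dept.example.edu/data/survey.csv", title="Survey data")
before = registry.resolve(citation)

# The departmental server is retired; the data moves to a repository.
registry.update_target(citation, "http://repository.example.org/datasets/42")
after = registry.resolve(citation)
```

The string stored in `citation` is what would appear in a paper; it never changes, while `before` and `after` show the location it resolves to at two points in time.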

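The relationType values named in the notes come in reciprocal pairs (for example IsSupplementTo and IsSupplementedBy). As a rough sketch, assuming only the eight pairs listed in the notes, a lookup table can derive the statement as seen from the related object's side. The function name and the identifiers in the usage lines are hypothetical.

```python
# The eight reciprocal pairs named in the speaker notes for schema 2.2.
RELATION_PAIRS = [
    ("IsCitedBy", "Cites"),
    ("IsSupplementTo", "IsSupplementedBy"),
    ("IsContinuedBy", "Continues"),
    ("IsNewVersionOf", "IsPreviousVersionOf"),
    ("IsPartOf", "HasPart"),
    ("IsDocumentedBy", "Documents"),
    ("IsCompiledBy", "Compiles"),
    ("IsVariantFormOf", "IsOriginalFormOf"),
]

# Build a two-way lookup so either member of a pair maps to the other.
INVERSE = {}
for first, second in RELATION_PAIRS:
    INVERSE[first] = second
    INVERSE[second] = first


def reciprocal_link(source_id, relation_type, related_id):
    """Return the same relationship as stated from the related object's side."""
    return (related_id, INVERSE[relation_type], source_id)


# Made-up identifiers: a dataset that supplements a paper.
link = reciprocal_link("doi:10.5072/FK2DATA", "IsSupplementTo", "doi:10.5072/FK2PAPER")
```

Here `link` describes the paper as being supplemented by the dataset, which is the record the paper's side of the relationship would carry.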
Dataset Identification and Citation: Presentation Transcript

  • Dataset Identification & Citation: DataCite and EZID. Joan Starr, California Digital Library. October 2011.
  • Dataset Identification & Citation. Introduction. The Researchers’ Challenge: identifiers are a tool for researchers. DataCite: “Helping you find, access and reuse data.” EZID: easy creation and management of DataCite DOIs and other identifiers. Next steps: for DataCite, EZID and you!
  • California Digital Library (CDL)
  • The Researchers’ Challenge
  • Early in the research life cycle. Data-intensive research + writing up the results. Where’s the data? What if I move it? PERSISTENT IDENTIFIERS make the difference. By Dave Rogers, http://www.flickr.com/photos/dave-rogers/2815036285/
  • Working on a federated team. Data-intensive research + regional research center + aging infrastructure. Where’s the data? We have to move it! PERSISTENT IDENTIFIERS make the difference. © All rights reserved by University of California, http://www.flickr.com/photos/universityofcalifornia/5405812887
  • Making a career move. Data-intensive research + researcher(s) on the move. I know where my data is and I’m taking it with me! PERSISTENT IDENTIFIERS make the difference. © All rights reserved by University of California, http://www.flickr.com/photos/universityofcalifornia/540630865
  • Meeting funder requirements. Data-intensive research + grantor requirements for data management. What do we plan to put here? How do we track the data? PERSISTENT IDENTIFIERS make the difference. By David Mellis, http://www.flickr.com/photos/mellis/7675610/
  • DataCite members: German National Library of Economics (ZBW); German National Library of Science and Technology (TIB); German National Library of Medicine (ZB MED); GESIS - Leibniz Institute for the Social Sciences, Germany; Australian National Data Service (ANDS); ETH Zurich, Switzerland; Canada Institute for Scientific and Technical Information (CISTI); Technical Information Center of Denmark; Institute for Scientific & Technical Information (INIST-CNRS), France; TU Delft Library, The Netherlands; The Swedish National Data Service (SNDS); The British Library, UK; California Digital Library (CDL), USA; Office of Scientific & Technical Information (OSTI), USA; Purdue University Library.
  • DataCite Metadata V. 2.2. Small required set = citation elements. Optional descriptive set: extendable lists; can refer to other standards, schemes; domain-neutral; rich ability to describe relationships to other digital objects. Metadata Search (MDS) is full-text indexed.
  • DataCite Metadata V. 2.2. Required properties: 1. Identifier (with type attribute); 2. Creator (with name identifier attributes); 3. Title (with optional type attribute); 4. Publisher; 5. PublicationYear. Optional properties: 6. Subject (with schema attribute); 7. Contributor (with type & name identifier attributes); 8. Date (with type attribute); 9. Language; 10. ResourceType (with description attribute); 11. AlternateIdentifier (with type attribute); 12. RelatedIdentifier (with type & relation type attributes); 13. Size; 14. Format; 15. Version; 16. Rights; 17. Description (with type attribute).
  • Get identifiers. Add location. Add metadata. Update location. Update metadata.
  • http://n2t.net/ezid
  • What this means…
  • Next Steps. DataCite: Dublin Core application profile; Content Service; Metadata v. 2.3. EZID: UI redesign; automated link checking; exposure for citations. By Nicola Whitaker, http://www.flickr.com/photos/nicolawhitaker/111009156/
  • Next Steps for you: get more information, and try EZID for yourself! By Nicola Whitaker, http://www.flickr.com/photos/nicolawhitaker/111009156/
  • For more information. EZID: EZID application: http://n2t.net/ezid/; EZID website: http://www.cdlib.org/services/uc3/ezid/; UC3 website: http://www.cdlib.org/services/uc3/. DataCite: DataCite Home: http://datacite.org/; DataCite Metadata Schema: http://schema.datacite.org/meta/kernel-2.2/index.html; DataCite Metadata Search: http://search.datacite.org. Contact Joan Starr at uc3@ucop.edu
  • Questions? by Horia Varlan http://www.flickr.com/photos/horiavarlan/4273168957/in/photostream/
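The DataCite Metadata V. 2.2 slide describes its five required properties as the basic citation elements. A minimal sketch of what that implies, assuming a record held as a plain dictionary keyed by property name: check that all five are present, then assemble them into a citation string. The function names, the sample record, and the citation layout are illustrative assumptions, not taken from the slides.

```python
# The five required properties named on the DataCite Metadata V. 2.2 slide.
REQUIRED = ("Identifier", "Creator", "Title", "Publisher", "PublicationYear")


def missing_required(record):
    """Return the required property names that are absent or blank in record."""
    missing = []
    for name in REQUIRED:
        value = record.get(name)
        if value is None or not str(value).strip():
            missing.append(name)
    return missing


def format_citation(record):
    """Build a citation string from the required properties (layout is an assumption)."""
    missing = missing_required(record)
    if missing:
        raise ValueError("missing required properties: " + ", ".join(missing))
    return "{Creator} ({PublicationYear}): {Title}. {Publisher}. {Identifier}".format(**record)


# Made-up record, for illustration only.
record = {
    "Identifier": "doi:10.5072/FK2EXAMPLE",
    "Creator": "Doe, J.",
    "Title": "Example survey data",
    "Publisher": "Example Repository",
    "PublicationYear": 2011,
}
citation = format_citation(record)
incomplete = missing_required({"Title": "Untitled", "Creator": " "})
```

Keeping the required set this small is what lets a record double as a citation: anything that passes the check can be rendered without consulting the optional properties.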