Thoughts about Metadata Standards for Data


Presentation at ALA Midwinter 2012, STS session on The Role of Metadata Standards in Scientific Data Publishing

Published in: Education, Technology

  • Thank you for the opportunity to engage in this discussion about Metadata Standards for Data. Let me give special thanks to the STS Publisher/Vendor Relations Discussion Group and the STS Subject & Bibliographic Access Committee. Image credits: MDB 28, davecurlee, sabarishr, rkrichardson, awsheffield, Scutter, Amy the Nurse, Anita & Greg.
  • I want to add some information and observations from my experience working with DataCite and our DataCite implementation, EZID. I have this very simple roadmap as a starter for what I hope can be an exchange.
  • But first, a very brief bit of context setting. Where I come from: University of California; California Digital Library; UC3.
  • As librarians, we listen to our users, the researchers and scholars. And, while the scientific domains are coming up with various recommendations and standards, their requirements are similar: they want CITATIONS AND CITATION TRACKING. ESIP (Earth Science Information Partners) has asserted that the purposes of citations are: to provide fair credit to those responsible (exposure); to aid scientific reproducibility (re-use); to ensure scientific transparency and reasonable accountability (verification); and to aid in tracking the impact of the work (citation tracking). For end users (the re-use scenario), these are the foundational needs: to know that the data exist, to know where to get the data, and to be able to get the data in a form that is easily integrated into local workflows. Preservation: easy to maintain. The funders' requirements are for data management and ... And the library's charge is to preserve our institutions' scholarly assets.
  • You may have read about the founding of DataCite in the ICSTI 2009 report of Jan Brase's talk. DataCite was indeed formed in 2009 by 10 libraries and research centers with a mission: “Helping you find, access, and reuse data.” The number has now grown to 15. In addition, there are 3 associate members, including the Korea Institute of Science and Technology Information, so there is a presence in Asia. California Digital Library was one of the founding members. DataCite's primary methodology for achieving this mission: issuing DOIs (Digital Object Identifiers) for datasets. DOIs are associated with a target URL and metadata.
  • Each DataCite member is an allocating agent for DOIs. EZID is CDL's application for offering DataCite DOIs as well as other identifiers. It has this UI and an API. Try it by going to the Help tab. With EZID you can: 1. CREATE persistent identifiers; 2. MANAGE identifiers over time; 3. MANAGE associated metadata over time. EZID is available to individuals, groups and institutions within the University of California community, and beyond on a cost-recovery basis. Take control of the management and distribution of your research, share and get credit for it, and build your reputation through its collection and documentation.
  • If you choose to create a DOI (EZID allows you to create other types of identifiers in addition to DOIs), you will be asked to provide the minimum required set of DataCite metadata. MDS = Metadata Search.
  • The 5 required properties = basic citation elements. Identifier = DOI now; in future it may open up. Creator is repeatable; a name can have a nameIdentifier and scheme, as with an ORCID iD. Title is repeatable and has an optional type attribute for AlternativeTitle, Subtitle, and TranslatedTitle. Publisher: “In the case of datasets, "publish" is understood to mean making the data available to the community of researchers.” IDENTIFIER = VERIFICATION. Allow the submitter to give credit where credit is due: EXPOSURE & CITATION TRACKING. If the Year field isn't quite what you want, use the repeatable DATE field in the optional set.
  • As librarians, we ask the tough questions, like: what's missing, and how can this be better? We hope that DataCite's Metadata Schema is an agnostic happy medium, but it will have to be continuously adapted and improved over time. With diverse domain standards, the way to bring things together is with interoperability. DataCite's new Metadata Content Negotiation Service* is moving in this direction. We need to verify whether we have the right support available for 2 KEY THINGS. (1) GETTING TO THE DATA IN A FORMAT THAT WORKS. Format: free text. What metadata are needed to make the data meaningful? (“Be able to get the data in a form that is easily integrated into local workflows.”) (2) GRANULARITY. isPartOf, Description (text). Support highly granular description: the level of individual variable fields, units of measure, spatial and temporal coverage, normalization procedures, etc. *Formats: DATACITE+XML, DATACITE+TEXT, RDF+XML, TURTLE, BIBTEX, RESEARCH-INFO-SYSTEMS, HTML.
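To make the content negotiation idea concrete, here is a minimal sketch of how a client might ask for one of the formats listed in the note above. It only builds the HTTP request and does not send it. The media type strings and the `https://doi.org/` resolver address are assumptions that are not stated in the talk, and `build_request` is a hypothetical helper name; check the DataCite content negotiation documentation before relying on any of them.

```python
import urllib.parse
import urllib.request

# Assumed mapping from the format labels in the note to HTTP media types.
# These strings are NOT given in the talk; treat them as placeholders to verify.
MEDIA_TYPES = {
    "DATACITE+XML": "application/x-datacite+xml",
    "DATACITE+TEXT": "application/x-datacite+text",
    "RDF+XML": "application/rdf+xml",
    "TURTLE": "text/turtle",
    "BIBTEX": "application/x-bibtex",
    "RESEARCH-INFO-SYSTEMS": "application/x-research-info-systems",
    "HTML": "text/html",
}

# Assumed resolver base URL (not from the talk).
RESOLVER = "https://doi.org/"


def build_request(doi: str, fmt: str) -> urllib.request.Request:
    """Build (but do not send) a request for DOI metadata in the given format."""
    key = fmt.upper()
    if key not in MEDIA_TYPES:
        raise ValueError(f"unknown format: {fmt!r}")
    # Keep the slash between DOI prefix and suffix; escape anything unusual.
    url = RESOLVER + urllib.parse.quote(doi, safe="/")
    # Content negotiation: same URL, different Accept header per format.
    return urllib.request.Request(url, headers={"Accept": MEDIA_TYPES[key]})


if __name__ == "__main__":
    # "10.5072/EXAMPLE" is a made-up placeholder DOI, not a real dataset.
    req = build_request("10.5072/EXAMPLE", "bibtex")
    print(req.full_url, req.get_header("Accept"))
```

The point of the design is that the identifier stays the same for every consumer; only the `Accept` header changes, which is what lets one DOI serve citation managers, RDF tools, and browsers alike.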

    1. Thoughts about Metadata Standards for Data. Joan Starr, California Digital Library, January 2012, @joan_starr
    2. Our roadmap: Requirements; Some offerings; Some gaps; Where to keep looking. (Image by Roo Reynolds)
    3. Partnership between CDL | 10 UC campuses | Peer institutions. Provide solutions, services, resources for digital assets. Pool & distribute diverse experience, expertise, & resources.
    4. Requirements for dataset description: • Citation • Reuse. Photos courtesy of David Mellis, University of California; Dave Rogers; ThelmageGroup
    5. EZID: long-term identifiers made easy
    6. DataCite Metadata V. 2.2: • Small required set = citation elements • Optional descriptive set: extendable lists; can refer to other standards, schemes; domain-neutral; rich ability to describe relationships to other digital objects • Metadata Search (MDS) is full-text indexed
    7. DataCite Metadata V. 2.2, required properties: 1. Identifier (with type attribute) 2. Creator (with name identifier attributes) 3. Title (with optional type attribute) 4. Publisher 5. PublicationYear
    8. What’s missing? (Image by zen)
    9. Where to go for more information: EZID (application, website); Home; Metadata Schema; Metadata Search: http://search.datacite.org. And contact Joan Starr at ...
    10. Questions? (Image by Horia Varlan) Starr: @joan_starr
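
The five required properties on slide 7 can be illustrated with a small sketch. It checks that a record has all five, builds a simplified XML fragment, and formats a citation from them. This is an illustration under assumptions, not the official schema: the element layout is reduced (no namespace or schema location, no name identifiers), the citation pattern is one plausible arrangement of the five elements, and the sample record, including the DOI `10.5072/EXAMPLE`, is made up.

```python
import xml.etree.ElementTree as ET

# The five required properties named on slide 7 (keys chosen for this sketch).
REQUIRED = ("identifier", "creators", "titles", "publisher", "publicationYear")


def missing_required(record: dict) -> list:
    """Return the required property names that are absent or empty."""
    return [name for name in REQUIRED if not record.get(name)]


def to_xml(record: dict) -> str:
    """Build a simplified, non-namespaced XML fragment from a complete record."""
    missing = missing_required(record)
    if missing:
        raise ValueError("missing required properties: " + ", ".join(missing))
    root = ET.Element("resource")
    # Identifier carries a type attribute, as the slide notes.
    ident = ET.SubElement(root, "identifier", identifierType="DOI")
    ident.text = record["identifier"]
    # Creator and Title are repeatable, so each is wrapped in a container.
    creators = ET.SubElement(root, "creators")
    for name in record["creators"]:
        creator = ET.SubElement(creators, "creator")
        ET.SubElement(creator, "creatorName").text = name
    titles = ET.SubElement(root, "titles")
    for title in record["titles"]:
        ET.SubElement(titles, "title").text = title
    ET.SubElement(root, "publisher").text = record["publisher"]
    ET.SubElement(root, "publicationYear").text = str(record["publicationYear"])
    return ET.tostring(root, encoding="unicode")


def citation(record: dict) -> str:
    """Format a citation from the five elements (assumed pattern)."""
    creators = "; ".join(record["creators"])
    return (f"{creators} ({record['publicationYear']}): "
            f"{record['titles'][0]}. {record['publisher']}. "
            f"doi:{record['identifier']}")


# Hypothetical sample record; none of these values come from the talk.
EXAMPLE = {
    "identifier": "10.5072/EXAMPLE",
    "creators": ["Doe, Jane"],
    "titles": ["Sample Dataset"],
    "publisher": "Example Data Center",
    "publicationYear": 2012,
}

if __name__ == "__main__":
    print(citation(EXAMPLE))
    print(to_xml(EXAMPLE))
```

Because the required set is exactly the set of citation elements, a record that passes the check can always be cited, whatever optional description is added later.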