NISO Forum, Denver, Sept. 24, 2012: EZID: Easy dataset identification & management

EZID: Easy dataset identification & management
Joan Starr, Manager, Strategic and Project Planning and EZID Service Manager, California Digital Library

Data and data curation are assuming a growing role in today’s research library. New approaches are needed both to address the resulting challenges and to take advantage of the emerging opportunities. Long-term identifiers represent one such tool. In this presentation, Joan Starr will introduce identifiers and an application designed to make them easy to create and manage: EZID. She will provide a closer look at two identifier types, DOIs and ARKs, and discuss what bringing an identifier service to your institution might mean.



  • Image credit: http://www.flickr.com/photos/39877441@N05/5145507521/ by mclcbooks
  • Image credit: http://www.flickr.com/photos/mr_t_in_dc/6083561702/ by Mr. T in DC
  • Why all the fuss about data? Big DATA, big MONEY. But for us and our clients…
  • Two aspects are key: DATA MANAGEMENT
  • This is also a question of an almost perfect fit with our historic mission to preserve and protect our institution’s scholarly output. 472,000 in September; 455,000 in May!
  • Image credit: http://www.flickr.com/photos/60in3/2338247189/ by 60 in 3. So, with the emergence of these funder mandates, the NSF’s being the most prominent of course, you might call it the greatest thing since sliced bread for libraries. Scientists have been asked for data management plans, and they don’t have the first idea what to do about this. We’re going to hear more about the scientists’ view of things from Carly this morning. Lucky for us, these same scientists who report never stepping into libraries are now turning to librarians and asking for help!
  • What does DATA MANAGEMENT LOOK LIKE? The players here are domain-specific data repositories like Dryad, in the environmental sciences fields, or institutional repositories and data centers, including those run by libraries and their campus IT partners.
  • We’re going to look at what identifiers are—what makes them work.
  • DOIs are one kind of persistent identifier. But what is an identifier? An identifier is an alphanumeric string assigned to an object, and if that assignment is managed with some metadata and the object is made available over time, the identifier becomes a VERY reliable way of keeping track of that object.
  • Let’s take a look at one. So you can see that with just the identifier and a simple set of metadata, you get: a location for VERIFICATION & RE-USE, and EXPOSURE & CITATION TRACKING. (This is not an actual DOI, nor an actual study.)
  • And here’s that same DOI some time later. THE STRING NEVER CHANGES. This means it can be cited, tracked and associated with all kinds of metadata.
  • Is everyone with me? If so, I’m going to ask you to be brave for a few minutes while I introduce you to one more piece of information.
  • Let’s look at that same DOI so we can talk about its structure. Remember: this is a STRING associated with a TARGET URL. DOI structure is based on the Handle system of identifiers, because you can think of DOIs as a special implementation of the Handle system. So, here is the segment called the PREFIX. All DOI prefixes begin with “10”, and this is followed by a “dot” and more numbers. The prefix is a unique number assigned to the specific registrant of DOIs. CDL has its own prefix, for example. Most EZID clients have one too. The prefix is the common element in every DOI the registrant makes. The second part is the suffix, the part after the slash. This part has to be unique for every DOI created with the prefix.
  • How can EZID be in the business of issuing DataCite DOIs? California Digital Library was one of the founding members. DataCite was formed in 2009 by 10 libraries and research centers with a mission: “Helping you find, access, and reuse data.” The number has now grown to 16. In addition, there are 3 associate members, including the Korea Institute of Science and Technology Information and BGI, so there is a presence in Asia. DataCite’s primary methodology for achieving this mission: issuing DOIs (Digital Object Identifiers) for datasets.
  • If you click on this link, you’ll be able to try EZID without an account.
  • By default, we take you to a SIMPLE create screen.
  • There are other features available on the ADVANCED CREATE screen and MANAGE tabs that I invite you to explore on your own.
  • Image credit: http://www.flickr.com/photos/mr_t_in_dc/6083561702/ by Mr. T in DC
  • The tool we’ve been looking at, EZID, lets you create and manage both DOIs and ARKs. But how do you choose?
  • ARKs come from the library and museum world and have been adopted by some large cultural organizations around the world. Managed by the CDL. CASE SENSITIVE: MORE OPTIONS (CD, Cd, cD, cd are all distinct). FLEXIBLE: using the API, you can supply metadata pairs as desired; you can upload existing domain-specific metadata if desired. ARKs have a feature called suffix pass-through. It means you can register the root of a file structure and get pointers to the rest of the file structure for free. I’ll show you an example in a minute.
  • An identifier wild card. The ability for one identifier location URL to stand in for unlimited sub-identifier locations. A sub-identifier is just an identifier extended by a suffix. Image credit: http://www.flickr.com/photos/11356857@N08/5120543262/ by OnFoot4Now (Didi)
  • An identifier wild card. The ability for one identifier location URL to stand in for unlimited sub-identifier locations. A sub-identifier is just an identifier extended by a suffix. Let’s assume that the identifier and location (or target URL) that you registered were those shown in red above the table. I’ve also listed them in the table as #1. SUFFIX PASS-THROUGH means that you can submit requests to the ARK server for any sub-identifiers, and the suffixes will be passed through to the target server. So, in example #2, the suffix “king” is passed through as a request to the target server, even though it was never registered as an identifier. And so on. Why is suffix pass-through so important? Imagine you have a dataset with 10,000 nameable components, such as packages, files, or tables. You'd like to be able to reference these components for tracking purposes. With suffix pass-through, while you still have to manage the components, you only take on management for one overall dataset identifier.
  • The gold standard. DOIs are for keeps. DOIs are identifiers originating from the publishing world and are in widespread use for journal articles. Managed by the International DOI Foundation. DOIs should be assigned to objects that are under good long-term management, and where there is an intention to make the object persistently available. DOIs must be registered exclusively with metadata that is available to public view.
  • Image credit: http://www.flickr.com/photos/mzn37/562770075 by michael.newman. Can DOIs and ARKs work together? These two identifier schemes can work well together, and EZID offers them both, along with policy support consistent across both schemes. Use ARKs early in the life cycle for good data management and before it’s clear what will be cited. When you are ready to cite, get the DOI, and if desired, incorporate the ARK string into the suffix of the DOI for continuity.
  • Image credit: http://www.flickr.com/photos/andy_bernay-roman/380095041/ by allspice1
  • Image credits: http://www.flickr.com/photos/sekihan/6100774057/ by sekihan; http://www.flickr.com/photos/expressmonorail/7032291971/ by Express Monorail. Information depot: distribute information, pass along questions, etc. A full data management service center (library as data management service center). Data services that libraries may provide:
    + data management planning
    + institutional repository
    + metadata creation and linking
    + data archiving & curation
    + consultation on above topics
    + data management & data literacy training for grad students & faculty
    + persistent identifier services
    Anything in between. One size does not fit all! But EZID and long-term identifier services are a nice, discrete service that fits into any workflow.
  • You can be part of a growing network. 38 institutions and counting. Government data centers, university-hosted research institutes, research libraries offering data management services, publishers beginning to support the data behind scholarly work. There’s a longer list at the URL I’m showing here…
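
The DOI structure described in the notes above (a prefix that begins with “10.”, a slash, then a suffix chosen by the registrant) can be sketched in Python. This is only an illustration under stated assumptions: the function name `split_doi` is hypothetical, the checks cover just the rules mentioned in the talk, and the example string is the fictional DOI from the slides, not a real one.

```python
def split_doi(doi):
    """Split a DOI string into (prefix, suffix).

    Hypothetical helper for illustration; it checks only the rules
    mentioned in the talk, not the full DOI specification.
    """
    # Drop the optional "doi:" label used on the slides.
    name = doi[4:] if doi.lower().startswith("doi:") else doi

    # The prefix is everything before the first slash, the suffix everything after.
    prefix, slash, suffix = name.partition("/")
    if not slash or not suffix:
        raise ValueError("a DOI needs a prefix, a slash, and a suffix")

    # All DOI prefixes begin with "10" followed by a dot and more characters.
    if not prefix.startswith("10.") or len(prefix) <= 3:
        raise ValueError("a DOI prefix must look like '10.<registrant>'")

    return prefix, suffix


# The fictional DOI used in the presentation.
prefix, suffix = split_doi("doi:10.9999/FK40K2GTV")
print(prefix)  # 10.9999
print(suffix)  # FK40K2GTV
```

Everything a registrant mints shares the same prefix, so uniqueness only has to be managed within the suffix.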

Presentation Transcript

  • EZID: Easy dataset identification & management. Joan Starr, California Digital Library, September 2012, @joan_starr
  • EZID: Easy dataset identification & management
    • Why data?
    • Identifiers 101
    • EZID: identifiers made easy!
    • Choosing an identifier
    • What does this mean for you?
  • Data! By barryegan (Vitor Leite) http://www.flickr.com/photos/vixon/116447718/
  • Data! = scholarly communication. By barryegan (Vitor Leite) http://www.flickr.com/photos/vixon/116447718/
  • What can libraries do?
  • What this looks like, pt. 1
  • What this looks like, pt. 2
  • What this looks like, pt. 2
  • Identifiers 101
  • What is an identifier?
    What you see: alphanumeric string (never changes)
    Associated with: location of object (such as a URL)
    Optional: who, what, when, etc. (i.e. metadata)
    By Joelk75: http://www.flickr.com/photos/75001512@N00/2728233597/
  • Identifier example
    string: doi:10.9999/FK40K2GTV
    html version: http://dx.doi.org/10.9999/FK40K2GTV
    location: http://www.bologna.edu/biology/xfg/123.xls
    metadata
      Creator: Dr. Felix Kottor
      Title: Data for chromosomal study of catfish (Ictalurus punctatus)
      Publisher: University of Bologna
      Publication Year: 2011
  • Identifier example
    string: doi:10.9999/FK40K2GTV
    html version: http://dx.doi.org/10.9999/FK40K2GTV
    location: http://www.state.edu/ecology/783sdr/123.xls
    metadata
      Creator: Dr. Felix Kottor
      Title: Data for chromosomal study of catfish (Ictalurus punctatus)
      Publisher: Dryad Data Repository
      Publication Year: 2012
  • Identifiers 201 By Christi Nielsen http://www.flickr.com/photos/christinielsen/476326980/
  • Identifiers 201
    • string: doi:10.9999/FK40K2GTV (“prefix”: 10.9999; “suffix”: FK40K2GTV)
  • EZID: long-term identifiers made easy. Take control of the management and distribution of your research, share and get credit for it, and build your reputation through its collection and documentation.
    Primary Functions
    1. Create long-term identifiers
    2. Manage identifiers over time
    3. Manage associated metadata over time
  • http://n2t.net/ezid
  • EZID: Easy datasetidentification & management  Why data?  Identifiers 101  EZID: identifiers made easy! • Choosing an identifier • What does this mean for you?
  • DOIs and ARKs• both can work like regular hyperlinks.• both can refer to a subset or portion of a resource.• both become persistent when the target URL is maintained. http://content.cdlib.org/ark:/13030/tf0v19n605/, courtesy of UC Davis Special Collections
  • DOIs vs ARKs• Case sensitive• Flexible metadata• Special feature supports granularity
  • DOIs vs ARKs: suffix pass-through The identifier WILD CARD!
  • DOIs vs ARKs: suffix pass-through
    Registered: ark:/13030/xt54321 ---> http://example.org/hearts
    1. http://n2t.net/ark:/13030/xt54321 ---> http://example.org/hearts
    2. http://n2t.net/ark:/13030/xt54321/king ---> http://example.org/hearts/king
    3. http://n2t.net/ark:/13030/xt54321/queen ---> http://example.org/hearts/queen
  • DOIs vs ARKs
    • Gold standard for citation
    • Established brand in publishing
    • Indexed by major A&I citation databases
  • Playing well with others…
  • What does it all mean?
  • Well… what would you like to be? Contact: uc3@ucop.edu
  • http://www.cdlib.org/services/uc3/ezid/clients.html
  • For more information
    EZID
      EZID: http://n2t.net/ezid/
      EZID on Twitter: @ezidCDL
    DataCite
      DataCite Search: http://search.datacite.org
      DataCite & CrossRef Citation tool: http://crosscite.org/citeproc/
    UC3
      UC3: http://www.cdlib.org/services/uc3/
      Joan Starr: uc3@ucop.edu @joan_starr
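
The suffix pass-through example in the transcript (one registered ARK standing in for any number of sub-identifiers) can be sketched in Python. This is a toy model, not the actual resolver behind n2t.net: the `REGISTERED` table and the `resolve` function are hypothetical names, and the identifier and target URL are the example values from the slides.

```python
# Hypothetical registry holding the single identifier from the slide example.
REGISTERED = {
    "ark:/13030/xt54321": "http://example.org/hearts",
}


def resolve(identifier):
    """Return the target URL for an identifier, or None if nothing matches.

    If the identifier itself is not registered, trailing path segments are
    trimmed one at a time until a registered identifier is found; the trimmed
    part (the suffix) is then appended to that identifier's target URL.
    ARKs are case sensitive, so no case folding is done.
    """
    candidate = identifier
    while candidate:
        if candidate in REGISTERED:
            passed_through = identifier[len(candidate):]
            return REGISTERED[candidate] + passed_through
        if "/" not in candidate:
            break
        # Drop the last path segment and try again.
        candidate = candidate.rsplit("/", 1)[0]
    return None


print(resolve("ark:/13030/xt54321"))        # http://example.org/hearts
print(resolve("ark:/13030/xt54321/king"))   # http://example.org/hearts/king
print(resolve("ark:/13030/xt54321/queen"))  # http://example.org/hearts/queen
```

Only one entry is registered, yet “king” and “queen” still resolve. That is the point made in the talk: a dataset with thousands of nameable components needs just one managed identifier.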