Boot Camp: An Introduction to CrossRef (2011 CrossRef Workshops)


Published in: Technology
  • In this next section I’m going to go over the different stages of registering and managing CrossRef DOIs. These are presented step by step, but this isn’t necessarily the order you need to do things in, so don’t treat it as a recommended workflow. The obvious first step is to join CrossRef. Next you’ll need to publish your content, deposit DOIs, query for DOIs, link them in your references, and maintain them.
  • 1. The DOI consists of two parts: a prefix and a suffix. In this example, the prefix is 10.1006; everything after the forward slash is the suffix. 2. When you first join CrossRef as a member, you are given a DOI prefix. Prefixes always take the form 10.something; currently the part after the “10.” is always a four-digit number. Each member has a prefix, and some have several. 3. + 4. 5. You are assigned a prefix, but you need to come up with the suffix on your own. 6. A DOI suffix is a unique string of letters and numbers. You can pretty much do whatever you want (with a few limitations), but it’s usually best to come up with a consistent pattern that makes sense to you (it doesn’t need to be derivable by anyone else) and that is easy to create and document.
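To make the prefix/suffix split concrete, here is a minimal Python sketch. The function name and the sample suffix are made up for illustration; only the 10.1006 prefix comes from the slide.

```python
def split_doi(doi: str) -> tuple[str, str]:
    """Split a DOI into (prefix, suffix) at the first forward slash."""
    prefix, sep, suffix = doi.partition("/")
    # A DOI needs a slash, a prefix starting with "10.", and a non-empty suffix.
    if not sep or not prefix.startswith("10.") or not suffix:
        raise ValueError(f"not a DOI: {doi!r}")
    return prefix, suffix


# The suffix below is hypothetical; the prefix is the one used in the talk.
prefix, suffix = split_doi("10.1006/example.suffix.0001")
```

Splitting at the first slash only means a suffix that itself contains a slash stays in one piece.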
  • 1. Keep it simple. Suffixes can be opaque or meaningful; it’s up to you. Here are some examples. They’re all fairly simple and presumably follow a pattern, but the pattern doesn’t need to be decipherable. To the end user, a DOI is a DOI: the prefix/suffix combination has no meaning, it’s just a string of letters and numbers. 2. We do have a few restrictions: the DOI suffix is limited to the characters you see here. The actual DOI specification (created and maintained by the International DOI Foundation) is more flexible, but we’ve implemented a limited character set because of past problems. DOIs with spaces, for example, need to be URL-encoded, and not everyone knows to do that.
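The URL-encoding problem mentioned above can be sketched with Python’s standard library. The function name, the sample DOI, and the resolver address `http://dx.doi.org/` are assumptions for illustration, not something stated on the slide.

```python
from urllib.parse import quote

# Assumed resolver address; swap in whatever your display guidelines require.
RESOLVER = "http://dx.doi.org/"


def doi_to_encoded_url(doi: str, resolver: str = RESOLVER) -> str:
    """Build a link for a DOI, percent-encoding characters that are unsafe in URLs."""
    # Keep "/" unescaped so the prefix/suffix separator survives.
    return resolver + quote(doi, safe="/")


# A hypothetical DOI whose suffix contains a space.
encoded = doi_to_encoded_url("10.1006/my suffix")
```

A suffix restricted to plain letters, digits, and a few punctuation marks passes through this function unchanged, which is the point of the limited character set.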
  • The next step is to publish your item online. Your journal article, book, report, etc. will need what we call a ‘DOI response page’, which is the page the DOI resolves to. Usually this is the same as the article landing page. The response page must include the DOI, bibliographic information about the item, and a means to access the full text. In a perfect world, DOIs would be published and deposited with CrossRef simultaneously, but realistically we ask you to keep the gap between publication and deposit within 24 hours. As I mentioned before, we updated our DOI display guidelines in recent months (early August): DOIs must now be displayed as a full URL (with the resolver address in front of the DOI) instead of with doi:, as was previously recommended.
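If your templates still print the older doi: notation, the change to URL display is mechanical. This is a small sketch under two assumptions: the resolver address is `http://dx.doi.org/`, and the function name and sample DOI are invented for the example.

```python
def display_form(doi: str, resolver: str = "http://dx.doi.org/") -> str:
    """Turn 'doi:10.xxxx/suffix' (or a bare DOI) into the URL display form."""
    doi = doi.strip()
    # Drop the older "doi:" label if it is present.
    if doi.lower().startswith("doi:"):
        doi = doi[4:].strip()
    return resolver + doi


old_style = "doi:10.1006/example.0001"  # hypothetical DOI
new_style = display_form(old_style)
```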
  • DOIs are required on the DOI response page, but we also recommend you include them wherever a persistent link is needed: in your tables of contents, abstracts, citation downloads, metadata feeds, and so on. They’re not just for reference linking.
  • Here’s an example of a DOI on a response page. This example follows the new guidelines.
  • The next steps are the most complex and involve sending data to and receiving data from CrossRef. You deposit your DOIs and metadata with us, which enables inbound linking, meaning you (and others) are able to use DOIs to link to your content. You also look up DOIs for other publishers’ content and include them in your references as outbound links. I’m going to cover depositing first, but querying and outbound linking can happen at whatever point in your workflow makes sense to you. Regarding the deposit process: you create CrossRef-compliant XML containing DOIs and metadata, and you send the XML to the CrossRef system. Journal article example: it has the journal title, title abbreviation, article info, DOI, and URL.
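To give a feel for what a journal article record carries, here is a simplified Python sketch that builds XML with the fields named above (journal title, abbreviation, article title, DOI, URL). The element names are only loosely modelled on the deposit schema and the output is not a valid deposit: it leaves out the batch header, authors, dates, and everything else the real schema requires. The DOI and URL are hypothetical.

```python
import xml.etree.ElementTree as ET


def build_article_record(journal_title, abbrev_title, article_title, doi, url):
    """Return an illustrative (not schema-valid) journal article record as XML text."""
    journal = ET.Element("journal")
    meta = ET.SubElement(journal, "journal_metadata")
    ET.SubElement(meta, "full_title").text = journal_title
    ET.SubElement(meta, "abbrev_title").text = abbrev_title

    article = ET.SubElement(journal, "journal_article")
    titles = ET.SubElement(article, "titles")
    ET.SubElement(titles, "title").text = article_title

    # The DOI and the URL it should resolve to travel together.
    doi_data = ET.SubElement(article, "doi_data")
    ET.SubElement(doi_data, "doi").text = doi
    ET.SubElement(doi_data, "resource").text = url
    return ET.tostring(journal, encoding="unicode")


record_xml = build_article_record(
    "Molecular Cell",
    "Mol. Cell",
    "A viral mechanism for remodeling chromatin structure in G0 cells",
    "10.5555/example.0001",                  # hypothetical DOI
    "http://www.example.org/articles/0001",  # hypothetical response page
)
```

Check the current deposit schema documentation for the real element names and required fields before building anything on this.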
  • We call the process of sending metadata to the CrossRef system ‘depositing’. We sometimes use the terms ‘deposit’ and ‘register’ interchangeably, but they’re slightly different. When a publisher deposits a DOI, the metadata is added to the CrossRef database, making the DOI retrievable. The DOI is also registered with the Handle resolver; that registration covers the DOI and URL only, and no citation metadata is recorded by the Handle resolver. Immediately after the submission is processed, the system sends you a submission log. This is very important: data is often messy, and we try to keep the messy stuff out of our database, so there are many reasons your submission might fail.
  • Once the DOI has been registered, it is resolvable and queryable, meaning it can be used for linking and can be retrieved from our system by end users, CrossRef Metadata Services subscribers, library link resolvers, and of course other members.
  • Submission methods vary from very robust, complicated systems to one guy cutting and pasting stuff from Word into our web deposit form (which converts the data to XML). Most deposits are made via machine interfaces. Data is sent to us via HTTP POST. We do have a simple Java-based tool that can be used for uploads; it’s available in our help documentation. Many publishers prefer to create their own tools. We do not currently accept FTP deposits. Our system has a very simple public interface, and as I mentioned we have a web interface as well. Anna will talk about that in a few minutes.
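For anyone building their own upload tool, the HTTP POST step looks roughly like the Python sketch below. The endpoint is a placeholder, not a CrossRef address, and the sketch leaves out credentials and any form fields the real deposit interface expects; take those from the help documentation.

```python
import urllib.request

# Placeholder endpoint for illustration only.
DEPOSIT_URL = "https://deposit.example.org/submit"


def build_deposit_request(xml_text: str, url: str = DEPOSIT_URL) -> urllib.request.Request:
    """Wrap deposit XML in an HTTP POST request (authentication omitted)."""
    return urllib.request.Request(
        url,
        data=xml_text.encode("utf-8"),
        headers={"Content-Type": "application/xml; charset=utf-8"},
        method="POST",
    )


request = build_deposit_request("<journal/>")
# urllib.request.urlopen(request) would send it; that call is left out so the
# sketch runs without network access.
```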
  • Query CrossRef to retrieve DOI matches for outbound linking, meaning you need to search our database for DOIs that match the references in your journal articles. Outbound linking is required for recent journal articles, but we encourage publishers to link references for back issues and other content types as well, since it strengthens the linking network for everyone. XML / bulk querying: if you are querying in bulk, XML querying is the way to go. We have a number of methods you can use for querying: you can upload bulk XML queries, poll the system using XML queries, or do OpenURL queries.
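As a rough illustration of the OpenURL option, the Python sketch below turns the pieces of a reference into a query string. The base address is a placeholder, and the parameter names (`title`, `aulast`, `volume`, `spage`, `date`) follow common OpenURL conventions; treat them as assumptions and confirm the names, plus any account identifier you need to send, against the query documentation.

```python
from urllib.parse import parse_qs, urlencode, urlparse

# Placeholder address for illustration only.
QUERY_BASE = "https://query.example.org/openurl"


def build_openurl_query(journal_title, author_surname, volume, first_page, year):
    """Build an OpenURL-style lookup URL from the parts of a reference."""
    params = {
        "title": journal_title,
        "aulast": author_surname,
        "volume": volume,
        "spage": first_page,
        "date": year,
    }
    return QUERY_BASE + "?" + urlencode(params)


query_url = build_openurl_query("Mol. Cell", "Ghosh", "12", "255", "2003")
# Parse it back to show what the receiving side would see.
parsed = parse_qs(urlparse(query_url).query)
```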
  • After you’ve retrieved your reference DOIs, you need to include them in the reference lists on your website. The DOI links can be represented in a few different ways. Here’s one example: the ‘Article’ links are DOI links.
  • Here is another example of outbound linking. This publisher is including the full DOI URL in their references, making it very easy to cut and paste. This is recommended in our new DOI guidelines.
  • The complete guidelines are on our website (there’s a link); I’ve included the different reference linking options here.
    1. Use the complete DOI link in the reference. Example: Ghosh, M.K., M.L. Harter. 2003. A viral mechanism for remodeling chromatin structure in G0 cells. Mol. Cell. 12:255–260,
    2. Use a ShortDOI as the permanent link. A ShortDOI is a shortcut DOI name: if you find yourself in possession of a long DOI that adds extra lines to your references or is otherwise difficult to represent, you can create a shortcut using the shortDOI service. Example: Ghosh, M.K., M.L. Harter. 2003. A viral mechanism for remodeling chromatin structure in G0 cells. Mol. Cell. 12:255–260,
    3. Display the CrossRef linking graphic next to the permanent DOI link. Linking graphics can be found on the member logo page. Example: Ghosh, M.K., M.L. Harter. 2003. A viral mechanism for remodeling chromatin structure in G0 cells. Mol. Cell. 12:255–260
    4. Display the CrossRef linking graphic with the permanent URL behind it, so that clicking on the graphic resolves the DOI. Linking graphics can be found on the member logo page. Example: Ghosh, M.K., M.L. Harter. 2003. A viral mechanism for remodeling chromatin structure in G0 cells. Mol. Cell. 12:255–260
    5. Display the text “CrossRef” with a permanent DOI link behind the text. Example: Ghosh, M.K., M.L. Harter. 2003. A viral mechanism for remodeling chromatin structure in G0 cells. Mol. Cell. 12:255–260, CrossRef.
    6. Display the words “Full Text” or “Article” or something similar with the permanent DOI link behind the text. Example: Ghosh, M.K., M.L. Harter. 2003. A viral mechanism for remodeling chromatin structure in G0 cells. Mol. Cell. 12:255–260, Article.
    So, once your DOIs have been included in your references, you’re done! Until the next issue. Or unless there are problems.
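Two of these options, the complete DOI link (option 1) and the “Article” text link (option 6), can be sketched as small HTML helpers in Python. The function names are invented, and the DOI URL is hypothetical because the real links did not survive in these notes.

```python
from html import escape


def reference_with_full_link(citation: str, doi_url: str) -> str:
    """Option 1: show the complete DOI URL as the visible link text."""
    return f'{escape(citation)} <a href="{escape(doi_url)}">{escape(doi_url)}</a>'


def reference_with_text_link(citation: str, doi_url: str, label: str = "Article") -> str:
    """Option 6: put the DOI link behind a short label such as "Article"."""
    return f'{escape(citation)} <a href="{escape(doi_url)}">{escape(label)}</a>'


citation = (
    "Ghosh, M.K., M.L. Harter. 2003. A viral mechanism for remodeling "
    "chromatin structure in G0 cells. Mol. Cell. 12:255–260,"
)
doi_url = "http://dx.doi.org/10.5555/example.0001"  # hypothetical DOI link

full_link_html = reference_with_full_link(citation, doi_url)
text_link_html = reference_with_text_link(citation, doi_url)
```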
  • Making changes: for the most part, if the metadata you send us is thorough and accurate, DOIs don’t need to be touched. There are some circumstances that require attention. Common changes are:
    Updating URLs: this is the most frequent change, for obvious reasons. If you make changes to your site or acquire DOIs from another publisher, DOIs can be updated easily by either (a) redepositing your metadata with the new URLs, in which case the metadata in our system will be overwritten, or (b) sending a tab-separated list of DOIs and their new URLs. These updates can usually be processed within 24 hours (excluding weekends). If you have a large number of DOIs to update (say, over 100,000), contact us in advance and we can give you an estimate of the time involved.
    Changing metadata: if you’ve discovered errors in your metadata or otherwise need to make an update, redeposit your metadata with the changes. Any deposits already in the system will be overwritten, so make sure the redeposit is complete. For example, if you deposit your DOIs with an online publication date before the item has been published in print, you can update the metadata once the print info is available. There is no charge for updating your DOIs, and we encourage you to update them as often as you like.
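The tab-separated update list is easy to generate from whatever mapping of DOIs to new URLs you keep. This Python sketch writes one DOI and URL per line; the DOIs and URLs are hypothetical, and any header lines or other formatting the real update file needs are not covered here.

```python
import csv
import io


def url_update_list(updates: dict[str, str]) -> str:
    """Render DOI-to-new-URL pairs as tab-separated lines, one pair per line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    # Sort so the output is stable from run to run.
    for doi, new_url in sorted(updates.items()):
        writer.writerow([doi, new_url])
    return buffer.getvalue()


# Hypothetical DOIs moving to a new site.
tsv_text = url_update_list({
    "10.5555/b": "http://new.example.org/b",
    "10.5555/a": "http://new.example.org/a",
})
```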
  • List of reports – some reports apply to all members, others are created on an as-needed basis. I’m going to go through some of the reports that deal directly with maintaining your DOIs and metadata.
  • First is the resolution report. It is sent out monthly to the business contact we have on file. This report is compiled from statistics we extract from DOI resolution logs and contains data about how many times your DOIs have been clicked and your overall resolution failure rate (successes vs. failures). There’s a lot of interesting data here. a.) It shows how many resolutions you have per month, and lists the previous 12 months as well, so you can compare. b.) It also includes info on resolution attempts (that is, how many times someone tries to resolve one of your DOIs). c.) It has a list of your top ten DOIs. d.) This is important: the report lists your overall resolution failure rate, as well as the overall failure percentage for all members. The resolution failure rate is the percentage of DOI resolution attempts that have failed. This rate often gives members a heads-up about potential problems, whether it be someone creating bad links to your content or you failing to deposit DOIs that have been published. e.) There is a .csv file attached to your report that lists all failed DOI resolution attempts. Some of these are garbage, since users make errors, especially when they are cutting and pasting, but if you have a high failure rate you should look at these DOIs closely.
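The failure rate is simple arithmetic: failed attempts divided by all attempts, as a percentage. A short Python sketch, with made-up numbers:

```python
def resolution_failure_rate(successes: int, failures: int) -> float:
    """Percentage of DOI resolution attempts that failed."""
    attempts = successes + failures
    if attempts == 0:
        return 0.0  # no attempts, so nothing failed
    return 100.0 * failures / attempts


# Illustrative figures: 9,700 successful resolutions and 300 failures in a month.
rate = resolution_failure_rate(9700, 300)
```

With these figures, 300 failures out of 10,000 attempts gives a rate of 3 percent.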
  • The next few reports are linked from the Members area of our website. The next report is the depositor report. These are generated for journals, books, and conference proceedings, and they list all titles and DOIs for each publisher. The main page lists titles by publisher and the number of DOIs per title, and is updated weekly.
  • There’s a detailed report for each title as well, which is updated somewhat dynamically; usually new deposits appear in the list within an hour. It lists the DOIs deposited for the title and the current owner of each DOI. I should mention that in our system, DOIs are owned by a prefix. If a title changes hands, the DOIs for the title change hands as well. The actual DOI prefix stays the same, but the owning prefix in our system will be the new owner’s prefix. The depositor report also includes the timestamp for each DOI. The timestamp is a value used in deposits, and it is incremented with every deposit. The date of the last update is included as well. This is only really used in troubleshooting, so you usually won’t need to worry about it.
  • Conflict report: this report is linked through the Members area. Alerts are also sent out monthly to technical contacts when you have conflicts to resolve. If you don’t have any conflicts, you won’t get this report. Again, conflicts are what happens when two DOIs are created with the same metadata. Like the depositor report, the page contains a list of publishers, and each publisher name expands to reveal a list of titles with conflicts. The number of conflicts is listed next to each title. Clicking on the title pops up a detail report which lists data about each conflict, such as the DOIs involved, when they were deposited, and the metadata involved. Conflicts should be taken seriously: only one DOI should be assigned to a given item. Typically, when two DOIs are assigned to a single item, one DOI gets neglected and eventually fails to resolve. Conflicts can be easily resolved by aliasing the two DOIs together; instructions can be found in our help documentation.
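You can catch many conflicts before depositing by checking your own records for two DOIs that carry the same metadata. This Python sketch groups DOIs by a normalized metadata tuple; the choice of fields (journal, volume, first page, year) and the DOIs are assumptions for illustration, not how the CrossRef system decides what counts as a conflict.

```python
from collections import defaultdict


def find_conflicts(deposits):
    """Group DOIs by identical (case-insensitive) metadata; keep groups of two or more."""
    by_metadata = defaultdict(list)
    for doi, metadata in deposits:
        key = tuple(str(value).strip().lower() for value in metadata)
        by_metadata[key].append(doi)
    return {key: dois for key, dois in by_metadata.items() if len(dois) > 1}


# Hypothetical records: (DOI, (journal, volume, first page, year)).
deposits = [
    ("10.5555/a1", ("Mol. Cell", "12", "255", "2003")),
    ("10.5555/a2", ("mol. cell", "12", "255", "2003")),
    ("10.5555/b1", ("Mol. Cell", "12", "301", "2003")),
]
conflicts = find_conflicts(deposits)
```

Here the first two records describe the same article, so their DOIs come back as one conflict group.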
  • This report is emailed as needed to the technical contact. These alerts are compiled from complaints about unresolving DOIs submitted to us by end users. If an end user tries to resolve a DOI that has not been registered, they are taken to a form that they can submit to us with comments and their email address. When the DOI has been registered, we send them an alert. Any comments are passed on to publishers. There are a number of reasons a DOI could fail, the most common being: § A DOI has been published but not deposited. If you distribute your DOIs before they are deposited, people will use them, even if you tell them not to. § The published DOI does not match the deposited DOI. Maybe you left out a period, or your template truncated something; it happens. If it does, contact us and we’ll help you correct the situation. § The end user misinterpreted or mistyped a DOI (e.g. confusing 1 for l). Some errors are clearly user error.
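The first two causes can be checked on your side by comparing the DOIs you have published against the DOIs you have deposited. A minimal Python sketch with hypothetical DOIs; it compares case-insensitively because DOIs are not case-sensitive.

```python
def published_but_not_deposited(published, deposited):
    """Return published DOIs (lower-cased, sorted) that have no matching deposit."""
    deposited_set = {doi.lower() for doi in deposited}
    return sorted({doi.lower() for doi in published} - deposited_set)


# Hypothetical data: one DOI was never deposited, and one lost a period in the template.
published = ["10.5555/ABC.1", "10.5555/abc.2", "10.5555/abc3"]
deposited = ["10.5555/abc.1", "10.5555/abc.3"]
missing = published_but_not_deposited(published, deposited)
```

The dropped period shows up as a mismatch, since the published form never appears among the deposits.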
  • We also have something called the Schematron report. Schematron is a validation language. These reports are used to identify messy metadata. We need to be flexible and accommodate variances in data, so our deposit schema can’t keep all of the questionable data out without blocking good data as well. Instead, we do a post-deposit review of metadata and pick out items that we think might be incorrect. These reports are emailed out weekly on Saturday, and we send out an average of 45 reports a week.
    4. 4. Submission report
    5. 5. Submission report
    6. 6. Submission report
    7. 7. HTTP
    8. 8. Next – add outbound links to references
    9. 9. Ghosh, M.K., M.L. Harter. 2003. A viral mechanism for remodeling chromatin structure in G0 cells. Mol. Cell. 12:255–260 (shown once per linking option; the last two variants end in “CrossRef.” and “Article.”)
    10. 10. Maintaining DOIs and Metadata
    11. 11. Maintaining DOIs and Metadata: Reports         