Hardware: Put into service this past year (we’re pretty much a Dell shop): Dell 720xd: Oracle database server, new virtual host server (cr14), new storage array in Denver. We also replaced our front-end Internet handoff routers to provide better failover capability.
Network: We’ve switched to managed DNS services provided by Dyn. This gives us reduced name resolution latency, a more robust global DNS infrastructure, and the ability to load balance and direct traffic based on location. The servers behind the REST API (api.crossref.org) will probably be the first to take advantage of this capability.
Resiliency: While a lot of what we do every day falls under this topic, the specific actions taken involve bolstering our ability to operate through database interruptions. As some of you may know, we still use Oracle as the main datastore. Oracle does of course offer very sophisticated enterprise solutions, but they come at a steep price. We’ve therefore implemented our own solutions, which provide continuous real-time replication to our disaster center and automatic failover for read-only operations to a local backup database.
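To illustrate the read-only failover idea, here is a minimal sketch in Python. It is not our production code: the names `read_with_failover`, `DatabaseUnavailable`, `primary_lookup` and `backup_lookup` are hypothetical, and the data sources are stand-in functions rather than real Oracle connections.

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


class DatabaseUnavailable(Exception):
    """Raised by a (hypothetical) data source when it cannot be reached."""


def read_with_failover(sources: Sequence[Callable[[], T]]) -> T:
    """Try each read-only data source in order and return the first result."""
    last_error = None
    for source in sources:
        try:
            return source()
        except DatabaseUnavailable as exc:
            last_error = exc  # remember the failure and try the next source
    raise DatabaseUnavailable("no data source available") from last_error


# Stand-ins for the main database and the local backup database.
def primary_lookup() -> str:
    raise DatabaseUnavailable("primary is down")


def backup_lookup() -> str:
    return "metadata from local backup"


result = read_with_failover([primary_lookup, backup_lookup])
print(result)
```

Only read-only operations are routed this way in the sketch; writes would still have to wait for the main database.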
Production: Over the past few months we’ve started moving some of what are called Labs projects into a production environment. This means they operate out of our main datacenter and, perhaps more importantly, additional staff are becoming familiar with their configuration, deployment and operation.
Performance: Deposit performance is generally not an issue. The queue only gets backed up when large updates occur. Monthly average wait times are about 30-60 minutes, but by count the vast majority of deposits are processed within a few minutes.
Query performance has improved significantly with a number of implementation changes. Through configuration changes we’ve increased the throughput for metadata distribution. In October we had 112 million DOI queries, up from a monthly average of 34 million in 2013.
Callback notifications: We’ve finally implemented a new alternative to receiving deposit log files by email. There has always been one alternative, in which you poll a deposit job for a completion status and then retrieve the log via a specific API. With callbacks, you implement an endpoint that receives a completion notice when the deposit is done. The notice contains details on how to retrieve the deposit log results, namely a URL from which to retrieve the data. Callbacks also work for batch query jobs and for cited-by link alerts (which can be large). Of course, we’ve been aggregating cited-by link alerts for a few years now, which has dramatically reduced the email problems we’ve had.
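On the member side, a callback endpoint could look roughly like the sketch below, written with Python’s standard library. The notice format is an assumption made for illustration: the sketch pretends the notice arrives as a JSON body with a `retrieve_url` field, which is not taken from our documentation. The names `extract_log_url`, `CallbackHandler` and `RECEIVED_LOG_URLS` are likewise hypothetical.

```python
import json
from http.server import BaseHTTPRequestHandler

# URLs collected from completion notices; a real endpoint would queue them
# for a worker that downloads the deposit log.
RECEIVED_LOG_URLS = []


def extract_log_url(notice) -> str:
    """Return the log retrieval URL from a notice (assumed field name)."""
    if not isinstance(notice, dict) or not notice.get("retrieve_url"):
        raise ValueError("notice has no retrieval URL")
    return notice["retrieve_url"]


class CallbackHandler(BaseHTTPRequestHandler):
    """Accepts POSTed completion notices and records where the log lives."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            notice = json.loads(self.rfile.read(length) or b"{}")
            RECEIVED_LOG_URLS.append(extract_log_url(notice))
            self.send_response(204)  # accepted, nothing to return
        except ValueError:  # also covers malformed JSON
            self.send_response(400)
        self.end_headers()
```

To try it locally, the handler can be served with `HTTPServer(("", 8080), CallbackHandler).serve_forever()` from `http.server`; the port is arbitrary.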
Conflicts: This has been a challenging topic for staff and members to stay on top of. The original implementation (still in operation) could create many more conflicts than necessary. We’re currently completing a project to clean up many of the outstanding unresolved conflicts. 1.35 million DOIs have been in a conflict at some point; 473K remain in a conflict that has not been addressed.
Books: We’re nearing the end of implementing a change to books that will ease the process of assigning DOIs to book content that is hosted in several locations. Coding is done and testing is under way. We’ll be looking to pilot with a few publishers at the start of the year.
Standards: A working group consisting of members who deposit DOIs for standards has been focused on improving the overall treatment of standards DOIs at CrossRef. The major outcome to date has been a revision of the deposit schema, which now places the main emphasis on the designator. In December the group has an in-person meeting scheduled to finalize deposits and to address the query processes to maximize discoverability.
Metadata query: We changed the response when conflicts are present. A conflict is when two (or more) DOIs have exactly the same metadata, which used to always result in no DOI being returned to the caller. Now we pick a DOI based on either the most recent deposit or ownership, in which case we select the DOI owned by the same member who owns the title. This solves a common conflict problem where a new publisher acquires a journal and deposits new DOIs for articles to which the former owner had already assigned DOIs.
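The selection rule can be sketched as follows. This is an illustration rather than the query engine’s actual code: the `Candidate` record and `pick_doi` function are hypothetical, and the order of precedence (ownership first, most recent deposit as the fallback and tie-breaker) is an assumption.

```python
from dataclasses import dataclass
from datetime import date
from typing import Sequence


@dataclass(frozen=True)
class Candidate:
    doi: str
    owner: str       # member who owns this DOI
    deposited: date  # date of the most recent deposit


def pick_doi(candidates: Sequence[Candidate], title_owner: str) -> str:
    """Choose one DOI among candidates that share identical metadata."""
    if not candidates:
        raise ValueError("no candidate DOIs")
    # Assumed precedence: prefer DOIs owned by the member who owns the title.
    owned = [c for c in candidates if c.owner == title_owner]
    pool = owned or list(candidates)
    # Within the pool, fall back to the most recent deposit.
    return max(pool, key=lambda c: c.deposited).doi


# A journal changes hands and the new publisher deposits fresh DOIs.
conflict = [
    Candidate("10.5555/old-article", "former-publisher", date(2014, 6, 1)),
    Candidate("10.5555/new-article", "new-publisher", date(2014, 3, 1)),
]
chosen = pick_doi(conflict, title_owner="new-publisher")
print(chosen)
```

In this example the title owner’s DOI wins even though the other deposit is more recent; if no candidate belonged to the title owner, the most recently deposited DOI would be returned instead.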
Schema: The schema now allows for abstracts; as of Nov 5 we have 53,915 deposits. It also allows for licensing information in support of text & data mining. Most recently we added a beta version of a relations sub-schema, which allows for establishing relationships between things assigned a CrossRef DOI and other items, which may or may not have a DOI (including non-CrossRef DOIs) or may be identified using some other scheme such as PubMed IDs or URIs.
ORCID: We’ve recently reached a common understanding with ORCID on a workflow in which CrossRef will post to authors via ORCID so that they can accept articles into their profiles.
Article title cleanup: Over the years, with various processing bugs at times (on both sides, CrossRef and the depositing members), a number of article titles have gotten mangled with respect to characters outside 7-bit ASCII, sometimes called special characters. We’ll have a cleanup project to make corrections where possible.
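As an example of the kind of correction that is sometimes possible, the sketch below repairs one common form of mangling: UTF-8 text that was mistakenly decoded as Latin-1. It is only an illustration under that assumption. The titles in our database are mangled in several different ways, and `repair_title` is a hypothetical helper, not part of the planned cleanup.

```python
def repair_title(title: str) -> str:
    """Undo UTF-8-read-as-Latin-1 mangling; return the title unchanged otherwise."""
    try:
        return title.encode("latin-1").decode("utf-8")
    except UnicodeError:
        # Either the title contains characters outside Latin-1 or its bytes
        # are not valid UTF-8, so it was not mangled in this particular way.
        return title


mangled = "M\u00c3\u00bcller on DOIs"  # "Müller" after a wrong decode
print(repair_title(mangled))
```

Titles that are already correct, or that were damaged in some other way, pass through untouched, which is why any real cleanup can only make corrections where possible.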
Relations: Version 4.3.5 of the deposit schema now includes a beta version of a relations sub-schema (sub-schemas being the technique we now use to expand the schema). This provides for the creation of relations between CrossRef DOIs and items identified with a DOI or some other identification scheme.
Stored queries: We have 252,930,184 unresolved stored queries (29,512,735 resolved). Right now we run about 1 million a day. We recently discovered a few flaws in the cyclic processing logic which introduced unacceptable latency in processing all queries. We’re in the process of fixing these now and are looking for ways to improve the process.
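To put those numbers in perspective, here is a back-of-the-envelope calculation. It assumes, purely for illustration, a steady rate of exactly 1 million queries a day and a single pass over every unresolved query; the real cyclic processing is more involved.

```python
import math

UNRESOLVED = 252_930_184
RESOLVED = 29_512_735
PER_DAY = 1_000_000  # approximate daily throughput quoted above

# Days needed to revisit every unresolved stored query once at this rate.
days_per_pass = math.ceil(UNRESOLVED / PER_DAY)

# Share of all stored queries that have been resolved so far.
resolved_share = RESOLVED / (UNRESOLVED + RESOLVED)

print(f"one full pass: about {days_per_pass} days")
print(f"resolved so far: {resolved_share:.1%}")
```

At this rate a single pass takes about 253 days, and roughly one stored query in ten has been resolved, which is why the latency matters.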
New content types: For a while now, members have been depositing DOIs for content that does not exactly fit the genres defined in our current deposit schema. Most often this is done by using the database genre as a sort of catch-all. Having recently been approached with two more such situations, we’re now exploring the possibility of adding a new general-purpose content type and additional dedicated content types. The goal here is to more accurately represent the type of content.
2014 CrossRef Annual Meeting: CrossRef System Update
System update 2014
Director of Technology
core system improvements
things soon to start or in planning