Tech WG report 2011


The report of the technical working group, given to the DataCite General Assembly 2011 by the tech lead, Ed Zukowski

Published in: Technology, Education

  1. Technical working group report 2011, 1st Dec 2011, Ed Zukowski, The British Library
  2. Technical infrastructure: focus on metadata
  3. What happened since Dec 2010?
     • Major service releases in 2011
         • March: Metadata Store (MDS) public beta
         • June: MDS v2 (production release)
         • June: Metadata Search beta
         • July: OAI beta
         • September: Content Negotiation alpha
  4. What’s the Metadata Store (MDS)?
  5. Usage scenarios
     • Let your data centres use MDS directly (recommended), e.g. BL and TIB
     • Integrate your local solution with MDS (if you have one), e.g. CDL and CISTI
  6. MDS user interface: UI also available in German and French
  7. Statistics: our DOIs (November 2011)
     • Number of DOIs
         • Total (in Handle): 1,144,354
         • In MDS: 52,497
         • With metadata: 47,350
     • Number of active data centres in MDS: ~50
     • Top 5 data centres (by number of DOIs in MDS):
         • CDL.CDL: 17,387
         • BL.ADS: 11,846
         • TIB.PANGAEA: 6,429
         • BL.UKDA: 4,962
         • ZBMED.GMS: 2,986
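The figures on the statistics slide can be put in proportion with a little arithmetic. The sketch below is only an illustration: the numbers are copied from the slide, while the variable names and the `share` helper are made up for this example.

```python
# Figures copied from the statistics slide (November 2011).
total_in_handle = 1_144_354   # all DataCite DOIs registered in Handle
in_mds = 52_497               # DOIs managed through the Metadata Store
with_metadata = 47_350        # MDS DOIs that have metadata attached

# Top 5 data centres by number of DOIs in MDS.
top5 = {
    "CDL.CDL": 17_387,
    "BL.ADS": 11_846,
    "TIB.PANGAEA": 6_429,
    "BL.UKDA": 4_962,
    "ZBMED.GMS": 2_986,
}


def share(part, whole):
    """Return part/whole as a percentage (hypothetical helper)."""
    return 100.0 * part / whole


top5_total = sum(top5.values())
mds_share = share(in_mds, total_in_handle)      # MDS DOIs as a share of all DOIs
metadata_share = share(with_metadata, in_mds)   # MDS DOIs that carry metadata
top5_share = share(top5_total, in_mds)          # MDS DOIs held by the top 5 centres

print(f"DOIs in MDS:            {mds_share:.1f}% of all DOIs")
print(f"MDS DOIs with metadata: {metadata_share:.1f}%")
print(f"Top 5 data centres:     {top5_share:.1f}% of MDS DOIs")
```

Read this way, MDS held under 5% of all DOIs at the time, about 90% of those had metadata, and the top five data centres accounted for roughly 83% of the DOIs in MDS.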
  8. DOI resolutions in 2011
     • Total number of DOI resolutions in 2011: ~660,000
     • Resolutions per DOI among the top 200 DOIs: from 2,000 down to 100
  9. Metadata Store usage
  10. Metadata Schema Repository
     • Metadata: XSD and examples are validated by Jenkins; all are version-managed with the source code
  11. Other metadata services
     • Public, simple but flexible, drill down
     • UI and API
     • Best looking piece of our software – check it out!
     • Open Archives Initiative (OAI) provider
     • Content negotiation (Metadata Resolver)
     • Metadata in many formats (e.g. RDF)
  12.
  13. Usage: getting popular ... all over the world!
  14. Metadata resolver
     • Available at:
     • Purpose: make our metadata available in various formats
     • Two modes of usage:
         • HTML links
         • HTTP content negotiation
  15. Metadata resolver: Content negotiation
     • CrossRef launched their conneg in Apr 2011
     • “the beauty of the setup is that from now on, any DOI registration agency can enable content negotiation for their constituencies as well. DataCite we're looking at you ;-)” (Geoffrey Bilder on CrossTech blog, April 2011)
     • So how does it work? Make a DOI resolution request and specify the required format in the HTTP "Accept" header.
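The content negotiation step described on this slide can be sketched as follows. This is a minimal illustration, not DataCite's client code: the resolver base URL and the DOI are placeholders (the slide's own URL did not survive extraction), `build_request` and `fetch` are hypothetical helper names, and `application/rdf+xml` is used only because the slides mention RDF as one of the available formats.

```python
import urllib.parse
import urllib.request

# Assumptions for illustration only: neither value comes from the slides.
RESOLVER = "https://doi.org/"      # assumed DOI resolver base URL
EXAMPLE_DOI = "10.1234/example"    # hypothetical DOI


def build_request(doi, media_type):
    """Build a resolution request that asks for one specific metadata format.

    The requested format is carried in the HTTP "Accept" header.
    """
    url = RESOLVER + urllib.parse.quote(doi, safe="/")
    return urllib.request.Request(url, headers={"Accept": media_type})


def fetch(doi, media_type):
    """Send the request and return the raw response body (needs network access)."""
    with urllib.request.urlopen(build_request(doi, media_type)) as response:
        return response.read()


# Building the request does not touch the network, so it is safe to inspect.
req = build_request(EXAMPLE_DOI, "application/rdf+xml")
print(req.full_url)
print(req.get_header("Accept"))
```

Calling `fetch` with a real DOI and a media type the registration agency supports would return the metadata in that format; without an "Accept" header the same URL resolves as an ordinary DOI link.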
  16. Our implementation was well received
     • “In April CrossRef launched content negotiation support for its DOIs. At the time I cheekily called-out DataCite to start supporting content negotiation as well.
     • Edward Zukowski (DataCite's resident propellor-head) took up the challenge with gusto and, as of September 22nd DataCite has also been supporting content negotiation for its DOIs. This means that one million more DOIs are now linked-data friendly.
     • Congratulations to Ed and the rest of the team at DataCite.”
     • (Geoffrey Bilder on CrossTech blog, Sep 2011)
  17. Metadata resolver: links
  18. Infrastructure
     • Amazon Cloud
     • Cost: <$500/month
     • 6 VMs (4 web servers, 2 databases)
     • Backup on EC2 – 7 days
     • Backup off-site (BL/CISTI) – 30 days
     • No proprietary APIs used (no vendor lock-in)
     • Amazon is location neutral (i.e. no single DataCite member is on the critical path)
     • Handle server (now at TIB) should be moved to CNRI (Northern Virginia, USA)
  19. Tech team’s tasks
     • Gathering service requirements
     • Software design and implementation
     • Running the services on a daily basis
     • Support and maintenance (backups etc.)
     • Second-line end-user support (API, system queries etc.)
  20. 100% Open Source
     • All our code: (also public ticketing system)
     • All the components are open source:
         • Apache/Tomcat
         • Solr
         • Linux
         • Java (OpenJDK) / Ruby
         • MySQL
  21. Tomorrow
     • Discussion
     • Questions
     • Ideas / requests
     • Future projects
     • and more!
     • Starts 12.30