Tech WG report 2011

The report of the technical working group, given to the DataCite General Assembly 2011 by the tech lead, Ed Zukowski

Transcript

  • 1. Technical working group report 2011, 1st Dec 2011, Ed Zukowski, The British Library
  • 2. Technical infrastructure: focus on metadata
  • 3. What happened since Dec 2010?
    • Major service releases in 2011
    • March Metadata Store (MDS) public beta
    • June MDS v2 (production release)
    • June Metadata Search beta
    • July OAI beta
    • September Content Negotiation alpha
  • 4. What’s the Metadata Store (MDS)? mds.datacite.org
  • 5. Usage scenarios
    • Let your data centres use MDS directly (recommended) - e.g. BL and TIB
    • Integrate your local solution with MDS (if you have one). e.g. CDL and CISTI
  • 6. MDS user interface: UI also available in German and French
  • 7. Statistics: our DOIs
    • Number of DOIs
    • Total (in Handle) 1,144,354
    • In MDS 52,497
    • With metadata 47,350
    • Number of active data centres in MDS: ~50
    • Top 5 data centres (by number of DOIs in MDS):
    • CDL.CDL 17,387
    • BL.ADS 11,846
    • TIB.PANGAEA 6,429
    • BL.UKDA 4,962
    • ZBMED.GMS 2,986
    • Figures as of November 2011
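The proportions behind the figures on slide 7 can be worked out with a few lines of arithmetic. The sketch below uses only the numbers quoted on the slide; the variable names and the `percent` helper are illustrative and not part of any DataCite tool.

```python
# Figures copied from slide 7 (November 2011).
total_in_handle = 1_144_354   # all DataCite DOIs registered in Handle
in_mds = 52_497               # DOIs registered through the Metadata Store
with_metadata = 47_350        # MDS DOIs that also have metadata

top_five = {
    "CDL.CDL": 17_387,
    "BL.ADS": 11_846,
    "TIB.PANGAEA": 6_429,
    "BL.UKDA": 4_962,
    "ZBMED.GMS": 2_986,
}


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


top_five_total = sum(top_five.values())
mds_share = percent(in_mds, total_in_handle)        # MDS DOIs vs. all DOIs
metadata_share = percent(with_metadata, in_mds)     # MDS DOIs with metadata
top_five_share = percent(top_five_total, in_mds)    # concentration in top 5

print(f"DOIs in MDS: {mds_share}% of all DOIs in Handle")
print(f"MDS DOIs with metadata: {metadata_share}%")
print(f"Top 5 data centres: {top_five_total} DOIs, {top_five_share}% of MDS")
```

Read this way, the MDS held only a small fraction of all DataCite DOIs at the time, most of those DOIs already carried metadata, and the five largest data centres accounted for the bulk of MDS registrations.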
  • 8. DOI resolutions via dx.doi.org in 2011
    • Total number of DOI resolutions in 2011:
    • ~660,000
    • Number of resolutions per DOI among the top 200 DOIs:
    • from 2000 down to 100
  • 9. Metadata Store usage
  • 10. Metadata Schema Repository
    • http://schema.datacite.org
    • Metadata schema: XSD and examples are validated by Jenkins; all versions are managed alongside the source code
  • 11. Other metadata services
    • search.datacite.org
    • Public, simple but flexible, drill down
    • UI and API
    • Best looking piece of our software – check it out!
    • oai.datacite.org
    • Open Archives Initiative (OAI-PMH) data provider
    • data.datacite.org
    • Content negotiation (Metadata Resolver)
    • Metadata in many formats (e.g. RDF)
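As a rough illustration of how a harvester might address the OAI service, the sketch below builds request URLs, assuming oai.datacite.org speaks the standard OAI-PMH protocol. The `Identify` and `ListRecords` verbs and the `oai_dc` metadata prefix come from the OAI-PMH specification; the base URL path `/oai` and the helper name `oai_request_url` are assumptions made for this example and are not stated on the slides.

```python
from urllib.parse import urlencode

# Assumption: the host is from the slide, the "/oai" path is a guess.
OAI_BASE_URL = "http://oai.datacite.org/oai"


def oai_request_url(verb: str, **arguments: str) -> str:
    """Build an OAI-PMH GET URL for the given verb and its arguments."""
    params = {"verb": verb, **arguments}
    return OAI_BASE_URL + "?" + urlencode(params)


# Ask the repository to describe itself.
identify_url = oai_request_url("Identify")

# Harvest records in simple Dublin Core, which every OAI-PMH provider
# is required to support.
list_url = oai_request_url("ListRecords", metadataPrefix="oai_dc")

print(identify_url)
print(list_url)
```

The code only constructs URLs and sends nothing over the network, so it can be adapted to the real endpoint once its path is confirmed.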
  • 12. search.datacite.org
  • 13. search.datacite.org usage: getting popular... all over the world!
  • 14. Metadata resolver
    • Available at: data.datacite.org
    • Purpose: make our metadata available in various formats
    • Two modes of usage:
    • HTML links
    • HTTP content negotiation
  • 15. Metadata resolver: Content negotiation
    • CrossRef launched their conneg Apr 2011
    • “the beauty of the setup is that from now on, any DOI registration agency can enable content negotiation for their constituencies as well. DataCite we're looking at you ;-)” (Geoffrey Bilder on CrossTech blog, April 2011)
    • So how does it work? Make a resolution request to dx.doi.org and specify the required format in the HTTP "Accept" header.
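The mechanism described on slide 15 can be sketched with Python's standard library. The example builds a resolution request against dx.doi.org and sets the `Accept` header to the wanted format. The DOI `10.5072/EXAMPLE` is a made-up placeholder, the helper name `build_conneg_request` is illustrative, and `application/rdf+xml` stands in for the RDF format mentioned on slide 11; the exact set of media types the resolver supports is not listed in the slides.

```python
from urllib.parse import quote
from urllib.request import Request

DOI_RESOLVER = "http://dx.doi.org/"  # resolver named on the slide


def build_conneg_request(doi: str, media_type: str) -> Request:
    """Build (without sending) a DOI resolution request for one format."""
    # Percent-encode unusual characters but keep the prefix/suffix slash.
    url = DOI_RESOLVER + quote(doi, safe="/")
    return Request(url, headers={"Accept": media_type})


# Hypothetical DOI, used for illustration only.
req = build_conneg_request("10.5072/EXAMPLE", "application/rdf+xml")

print(req.get_method(), req.full_url)
print("Accept:", req.get_header("Accept"))
```

To perform the resolution, the request object can be passed to `urllib.request.urlopen(req)`. With content negotiation in place, the resolver would be expected to redirect to the metadata in the requested format rather than to the landing page.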
  • 16. Our implementation was well received
    • “In April CrossRef launched content negotiation support for its DOIs. At the time I cheekily called-out DataCite to start supporting content negotiation as well.
    • Edward Zukowski (DataCite's resident propellor-head) took up the challenge with gusto and, as of September 22nd DataCite has also been supporting content negotiation for its DOIs. This means that one million more DOIs are now linked-data friendly.
    • Congratulations to Ed and the rest of the team at DataCite.”
    • (Geoffrey Bilder on CrossTech blog, Sep 2011)
  • 17. Metadata resolver: links
  • 18. Infrastructure
    • Amazon Cloud
    • Cost <$500/month
    • 6 VMs (4 web servers, 2 databases)
    • Backup on EC2 – 7 days
    • Backup off-site (BL/CISTI) – 30 days
    • No proprietary APIs used (no vendor lock-in)
    • Amazon is location neutral (i.e. no single DataCite member is on critical path)
    • Handle server (now at TIB) should be moved to CNRI (North Virginia, USA)
  • 19. Tech team’s tasks
    • Gathering service requirements
    • Software design and implementation
    • Running the services on a daily basis
    • Support and maintenance (backups etc.)
    • End user 2nd line of support (API, system queries etc.)
  • 20. 100% Open Source
    • All our code: https://github.com/datacite (also public ticketing system)
    • All the components are open source:
      • Apache/Tomcat
      • Solr
      • Linux
      • Java (OpenJDK) / Ruby
      • MySQL
  • 21. Tomorrow
    • Discussion
    • Questions
    • Ideas / requests
    • Future projects
    • and more!
    • Starts 12.30