CATCHPlus on Europeana Connect: Persistent Identifier solution

On February 17-18, CATCHPlus participated in and contributed to the EuropeanaConnect ERDS (Europeana Resolution Discovery Service) meeting at the German National Library in Frankfurt. The aim of this meeting was to jointly formulate requirements for the Europeana "meta resolver" for the different kinds of persistent identifiers in use at participating institutions.

CATCHPlus had the opportunity to report on its experiences with formulating requirements from the Cultural Heritage and Audiovisual domains, on the solution that CATCHPlus has chosen and implemented, and on a number of application pilots.

Published in: Technology, Business
Notes
  • Scalable: hashing, caching, replication. Metadata: checksum, link to metadata, restoration information.
  • Includes batch operations (“move collection”)
  • Handle vs DNS:
      - DNS is a control hierarchy; Handles are not (necessarily).
      - The Handle admin can be somebody other than the sysadmin (distributed administration).
      - The Handle resolver can be behind an http proxy.
      - Handle supports Unicode, DNS does not.
      - Mirroring per record (Handle) versus mirroring per data set (DNS).
      - Support for access control and authentication (DNS has none).
      - PILIN doc: http://www.pilin.net.au/Project_Documents/Community_Guidelines/Using_URLS_PI.htm
      - “URI can be used as a persistent identifier, but at the cost of additional maintenance, which needs to be planned for through explicit strategies”
      - Clearer to uncouple locations and identifiers (although an http URI can be both).
      - Avoid unnecessary protocol dependence.
      - A URL can be used as an identifier or as a service call. It should be possible to call different services for the same identifier.
      - http URIs also need explicit identifier management (just as Handles do).
      - Cool URIs are necessary, otherwise http URIs tend to be interpreted as locators.
      - Take care that not everyone can publish http URIs; define curation boundaries.
      - Identifier management infrastructure is needed for http URIs as well.
  • Redundant service providing

    1. Persistent Identifiers for Audiovisual Archives and Cultural Heritage
       Hennie Brugman, Technical coordinator CATCHPlus
       Max-Planck-Institute for Psycholinguistics
       Netherlands Institute for Sound and Vision
    2. Summary
         • CATCH & CATCHPlus
         • Requirements from CATCHPlus, CH and AV
         • Our solution
             • Base technology
             • Identifier management
             • Organisational embedding
         • Application to collections
         • Concluding remarks
    3. CATCH & CATCHPlus
         • CATCH research program by NWO (14 projects)
         • CATCHPlus valorisation project
             • 8 subprojects at large CH institutions
             • Connected by common services
                 • Vocabulary services
                 • Annotation services
                 • Infrastructural: OAI-PMH, persistent identifiers
         • Project office hosted by Netherlands Institute for Sound and Vision
         • www.catchplus.nl
    4. Requirements from CATCHPlus, Cultural Heritage and Audiovisual Archives
    5. Requirements (1) Software support
         • Good resolving service available
         • Proven technology, stable and 100% reliable
         • Scalable, with respect to
             • Number of identifiers
             • Performance
         • Globally working solution
         • Redundant hosting and service provision
         • Identification of parts of objects (AV, CH)
         • Possibility to associate metadata with an identifier (AV, CH)
         • “Actionable”: identifiers can be resolved using http URIs
         • Support for identifier management tasks
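The requirement to associate metadata with an identifier can be pictured as a record of typed values attached to one identifier, which is the general shape of a Handle record. The following is only a minimal sketch under assumptions: the `PidRecord` class, the `describe_object` helper, and the value type names `CHECKSUM` and `METADATA_URL` are hypothetical and chosen for illustration; they are not taken from the CATCHPlus implementation.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class PidRecord:
    """One identifier plus a set of typed values (hypothetical layout)."""
    handle: str
    values: Dict[str, str] = field(default_factory=dict)


def describe_object(handle: str, location: str, content: bytes,
                    metadata_url: str) -> PidRecord:
    """Build a record holding the location, a checksum and a metadata link.

    The type names CHECKSUM and METADATA_URL are assumptions for this sketch.
    """
    return PidRecord(handle, {
        "URL": location,
        "CHECKSUM": hashlib.sha256(content).hexdigest(),
        "METADATA_URL": metadata_url,
    })


if __name__ == "__main__":
    record = describe_object(
        "12345/example-object",             # hypothetical handle
        "http://example.org/objects/1",     # hypothetical location
        b"object bytes",
        "http://example.org/objects/1.xml",
    )
    print(record)
```

Keeping the checksum next to the location lets a client verify that whatever it retrieves is still the object the identifier was created for.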
    6. Requirements (2) Identifier management
         • Identifier management should be independent of
             • System management
             • Web server management
             • Hosting of resolution services
         • Can be done from the context of a collection management system
             • typically by a responsible collection manager
         • Is efficient, powerful and simple
    7. Requirements (3) Organisation, policy
         • What choices are made by partner institutions? (the fewer ‘flavours’, the better)
         • Reliability and sustainability of the service providers
         • Limited and controllable costs
         • Freedom to switch between service providers
    8. CATCHPlus solution
    9. CATCHPlus solution: base technology
         • Based on Handle technology
             • Best meets our requirements
             • http://handle.net/
         • One Local Handle System and Handle prefix per participating Naming Authority
         • Hosted by SARA, (eventually) mirrored by other EPIC partners (redundant hosting)
         • Redundant resolving is inherent to Handle System
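A Handle consists of a prefix (one per Naming Authority) and a suffix, separated by the first slash, and it becomes "actionable" when appended to the address of an http proxy. The sketch below illustrates this, assuming the public proxy at hdl.handle.net; the handle `12345/polygoon-0001` is a made-up example, and the function names are hypothetical.

```python
from typing import Tuple
from urllib.parse import quote

# Public Handle proxy; a local or mirrored proxy could be passed in instead.
HANDLE_PROXY = "http://hdl.handle.net/"


def split_handle(handle: str) -> Tuple[str, str]:
    """Split a handle at its first slash into (prefix, suffix)."""
    prefix, separator, suffix = handle.partition("/")
    if not separator or not prefix or not suffix:
        raise ValueError(f"not a prefix/suffix handle: {handle!r}")
    return prefix, suffix


def to_http_uri(handle: str, proxy: str = HANDLE_PROXY) -> str:
    """Turn a handle into an http URI that a proxy can resolve."""
    prefix, suffix = split_handle(handle)
    # Percent-encode characters that are not safe in a URL path.
    return proxy + quote(prefix) + "/" + quote(suffix, safe="/")


if __name__ == "__main__":
    print(to_http_uri("12345/polygoon-0001"))
```

Because the proxy address is a parameter, the same identifier can be offered through a different resolver without changing the identifier itself.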
    10. CATCHPlus solution: identifier management
         • REST web service
         • For resolving, searching, creation and management of Handles (in one’s own Naming Authority) over the internet
         • Also support for batch operations (“move collection”)
         • SARA has built the first version for CATCHPlus
         • Available as open source
         • Ambition: uniform redundant service by EPIC
         • User interface will be developed (Q1-2, 2010)
         • Prototype for evaluation by collection managers
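The operations named on this slide (resolve, search, create, and the batch "move collection") can be illustrated with an in-memory stand-in. This is only a sketch of the idea: the class `HandleRegistry` and its method names are hypothetical and do not reproduce the interface of the REST service built by SARA.

```python
from typing import Dict, List


class HandleRegistry:
    """In-memory stand-in for one Naming Authority's identifier service."""

    def __init__(self, prefix: str) -> None:
        self.prefix = prefix
        self._url_by_handle: Dict[str, str] = {}

    def create(self, suffix: str, url: str) -> str:
        """Register a new handle under this prefix, pointing at url."""
        handle = f"{self.prefix}/{suffix}"
        if handle in self._url_by_handle:
            raise KeyError(f"handle already exists: {handle}")
        self._url_by_handle[handle] = url
        return handle

    def resolve(self, handle: str) -> str:
        """Return the current location for a handle."""
        return self._url_by_handle[handle]

    def search(self, url: str) -> List[str]:
        """Find all handles that currently point at url."""
        return [h for h, u in self._url_by_handle.items() if u == url]

    def move_collection(self, old_base: str, new_base: str) -> int:
        """Batch update: rewrite every location that starts with old_base."""
        moved = 0
        for handle, url in list(self._url_by_handle.items()):
            if url.startswith(old_base):
                self._url_by_handle[handle] = new_base + url[len(old_base):]
                moved += 1
        return moved


if __name__ == "__main__":
    registry = HandleRegistry("12345")  # hypothetical prefix
    pid = registry.create("item-1", "http://old.example.org/col/1")
    registry.move_collection("http://old.example.org/", "http://new.example.org/")
    print(pid, "->", registry.resolve(pid))
```

The point of the batch operation is that the handles stay unchanged while all their target locations are updated in one step.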
    11. CATCHPlus solution: organisational embedding
         • EPIC (European Persistent Identifier Consortium)
             • SARA (Netherlands), CSC (Finland), GWDG (Germany)
             • Redundant and reliable PID services for eScience and eCulture in Europe
             • Based on Handles
             • European mirror of Global Handle Repository
             • Governance structure with technical board and board of stakeholders
    12. Application to data sets
    13. Collections and data sets
         • Currently assigning identifiers to:
             • Concepts for the CATCHPlus Vocabulary Repository
             • A subcollection of the Sound and Vision archive
         • Several Dutch cultural heritage institutions and projects expressed interest
    14. Application to data sets: some questions to answer first...
         • What are the objects to assign persistent identifiers to? (versions, metadata records, formats, composite objects...)
         • Is there a relation with already existing identifiers?
         • What syntax to use? Include semantics in your PIDs?
         • Where do your PIDs resolve to, especially for objects that do not have a web representation of their own?
         • Who is responsible for identifier creation and management?
         • What guarantees can be made with regard to persistence?
         • Who does hosting? Who provides services?
    15. Steps
         • For existing objects
             • Determine your policies
             • Determine what URLs to resolve to
             • Create and publish PIDs for these URLs
             • Locally store association of URLs and proprietary identifiers
             • For all externally visible metadata: replace proprietary identifiers with PIDs
         • For new objects
             • Ultimately, integrate PID creation and management in your collection management tools and workflows
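The steps for existing objects can be sketched as a small pipeline: create a PID for each URL, keep a local table that links proprietary identifiers to PIDs, and substitute PIDs in any metadata that leaves the institution. This is a sketch under assumptions: the function names, the `identifier` field, and the choice of opaque random suffixes are all hypothetical; an institution's own policy decides the actual syntax.

```python
import uuid
from typing import Dict, List, Tuple


def mint_pids(prefix: str, url_by_local_id: Dict[str, str]
              ) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Create one PID per object.

    Returns (pid_by_local_id, url_by_pid): the first table is stored
    locally, the second is what would be registered with the PID service.
    """
    pid_by_local_id: Dict[str, str] = {}
    url_by_pid: Dict[str, str] = {}
    for local_id, url in url_by_local_id.items():
        # Opaque suffix: no semantics in the PID (an assumption of this sketch).
        pid = f"{prefix}/{uuid.uuid4().hex}"
        pid_by_local_id[local_id] = pid
        url_by_pid[pid] = url
    return pid_by_local_id, url_by_pid


def externalise(records: List[dict], pid_by_local_id: Dict[str, str]) -> List[dict]:
    """Replace proprietary identifiers with PIDs in outgoing metadata."""
    return [
        {**record, "identifier": pid_by_local_id[record["identifier"]]}
        for record in records
    ]


if __name__ == "__main__":
    local_urls = {"task-001": "http://example.org/broadcast/001"}  # hypothetical
    pids, registrations = mint_pids("12345", local_urls)
    print(externalise([{"identifier": "task-001", "title": "Example"}], pids))
```

The local table is what makes later reorganisation cheap: internal identifiers and URLs can change while published metadata keeps pointing at the same PIDs.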
    16. Sound and Vision pilot
         • Objects:
             • metadata descriptions at level of broadcasts
             • Open data set: ‘polygoon journaal’
         • Existing identifiers: “task identifiers”
         • Resolve to metadata record implies: resolve to dynamically created html page
         • Persistent identifiers are published using OAI-PMH
             • Published metadata refers back to same dynamic web page
             • OAI data provider uses PID service to find handles for internal identifiers/URLs
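How a data provider might publish a persistent identifier in an OAI-PMH record can be sketched as follows: look up the handle for an internal task identifier and write it, as an http URI, into the `dc:identifier` element of an `oai_dc` record. The namespaces are the standard OAI-PMH and Dublin Core ones; the lookup table, the function name, and the sample values are assumptions and do not describe the pilot's actual code.

```python
import xml.etree.ElementTree as ET
from typing import Dict

OAI_DC_NS = "http://www.openarchives.org/OAI/2.0/oai_dc/"
DC_NS = "http://purl.org/dc/elements/1.1/"
HANDLE_PROXY = "http://hdl.handle.net/"


def oai_dc_record(task_id: str, title: str,
                  handle_by_task_id: Dict[str, str]) -> str:
    """Serialise a minimal oai_dc record whose identifier is a handle URI.

    handle_by_task_id stands in for a call to the PID service.
    """
    handle = handle_by_task_id[task_id]
    root = ET.Element(f"{{{OAI_DC_NS}}}dc")
    ET.SubElement(root, f"{{{DC_NS}}}title").text = title
    ET.SubElement(root, f"{{{DC_NS}}}identifier").text = HANDLE_PROXY + handle
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    lookup = {"task-001": "12345/polygoon-0001"}  # hypothetical mapping
    print(oai_dc_record("task-001", "Example broadcast", lookup))
```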
    17. Concluding remarks
         • External accessibility of data and services depends on one resolver service: it should not be a single point of failure
         • Identifier management is an extra task that explicitly has to be dealt with
         • Explicit commitments with respect to persistence have to be made, and kept
         • Identifier management requires tool support (otherwise too labour intensive and error prone)
         • (Re)organizing your data internally becomes easier
         • Publishing parts of your collections on the internet becomes easier, more consistent and more sustainable
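The first remark, that the resolver should not be a single point of failure, can be illustrated from the client side: try a list of resolvers in order and use the first one that answers. This is a sketch under assumptions; the function name is hypothetical, and the resolvers are plain callables so the example runs without network access. It illustrates the principle only and is not how the Handle System implements its own redundancy.

```python
from typing import Callable, List, Sequence

Resolver = Callable[[str], str]


def resolve_with_failover(handle: str, resolvers: Sequence[Resolver]) -> str:
    """Return the first successful answer from a list of mirrored resolvers."""
    errors: List[Exception] = []
    for resolve in resolvers:
        try:
            return resolve(handle)
        except Exception as exc:  # any failure means: try the next mirror
            errors.append(exc)
    raise LookupError(f"no resolver could resolve {handle!r}: {errors}")


if __name__ == "__main__":
    def primary(handle: str) -> str:
        raise ConnectionError("primary resolver unavailable")

    def mirror(handle: str) -> str:
        return "http://example.org/objects/" + handle  # hypothetical location

    print(resolve_with_failover("12345/example", [primary, mirror]))
```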
    18. Questions?