Gbrds Workshop Sept09 Metadata Identifiers


Published in: Education, Technology

  1. 1. GLOBAL BIODIVERSITY INFORMATION FACILITY. Metadata in context of GBRDS. Éamonn Ó Tuama. GBRDS Workshop, Copenhagen, 17-18 Sept 2009
  2. 2. Outline: - Overview: where metadata fits in - Role of metadata - Task groups - Metadata task group recommendations - LSID-GUID task group recommendations
  3. 3. Where metadata fits in ...
  4. 4. Where metadata fits in ...
  5. 5. Why metadata? Information about datasets deteriorates over time! Source: William K. Michener, Meta-information concepts for ecological data management, Ecological Informatics, Volume 1, Issue 1, January 2006, Pages 3-7, ISSN 1574-9541, DOI: 10.1016/j.ecoinf.2005.08.004.
  6. 6. GBIF Nodes Survey 2009
  7. 7. GBIF Nodes Survey 2009
  8. 8. Current GBIF metadata handling: Data providers register their data and services in the GBIF UDDI Registry. UDDI lists “business” information and a binding template, i.e., the URL by which the provider installation can be accessed. All further metadata are derived via DiGIR/TAPIR requests. There is no separate metadata catalogue with a dedicated client for searching or browsing.
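The registry arrangement described on this slide can be pictured with a minimal sketch. It does not use the UDDI API; the class names, provider names, and URLs below are hypothetical and only illustrate the idea that an entry carries "business" information plus a binding template holding the access URL.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BindingTemplate:
    protocol: str    # e.g. "DiGIR" or "TAPIR"
    access_url: str  # URL at which the provider installation can be reached


@dataclass(frozen=True)
class RegistryEntry:
    provider_name: str  # the "business" information, reduced to a name here
    bindings: tuple     # tuple of BindingTemplate objects


def access_points(registry, protocol):
    """Return the access URLs of all bindings that use the given protocol."""
    wanted = protocol.lower()
    return [
        binding.access_url
        for entry in registry
        for binding in entry.bindings
        if binding.protocol.lower() == wanted
    ]


# Hypothetical entries; real registrations would come from the registry itself.
registry = [
    RegistryEntry(
        "Example Herbarium",
        (BindingTemplate("DiGIR", "http://herbarium.example.org/digir"),),
    ),
    RegistryEntry(
        "Example Museum",
        (BindingTemplate("TAPIR", "http://museum.example.org/tapir"),),
    ),
]

tapir_urls = access_points(registry, "TAPIR")
```

Everything beyond the access URL would still have to be fetched from the provider through a DiGIR or TAPIR request, which is the limitation the slide points out.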
  9. 9. Current GBIF dataset metadata (via the UDDI registry, DiGIR, TAPIR and via the indexing process): - Data provider details (Name; Website; GBIF participant; Description; Country; Added to portal; Information updated) - Provider (DiGIR, BioCASe, TAPIR) binding - Name - Website - Description - Citation - How to cite this dataset - Basis of record - Access point URL - Added to portal - Information updated - Contacts (Name, Role, Address, Email, Telephone) - Data networks - Number of occurrence records indexed - Number of records shared by provider - Number of occurrences with coordinates - Number of occurrences with no geospatial issues - Number of species - Number of taxa
  10. 10. Metadata Implementation Framework Task Group (MIFTG): To provide recommendations and guidelines on implementation of a metadata framework for the GBIF network. - Define a metadata model - Define optimal network design - Advise on use of controlled terminology - Focus on implementation issues - Review current GBIF metadata system
  11. 11. MIFTG: summary The Global Biodiversity Information Facility (GBIF) aspires to … become a major provider of discovery and access services for a wide variety of biodiversity data types. A distributed metadata catalog system that describes and makes accessible general information on datasets of primary biodiversity data is recognised as an essential component of GBIF to achieve this objective.
  12. 12. MIFTG: recommendations Controlled vocabularies R53. Providers should use controlled vocabularies in any metadata field for which an appropriate vocabulary exists, and should use a multi-lingual thesaurus when appropriate. R54. The GBIF vocabularies registry is a valuable service, but should be extended to include a canonical identifier for each vocabulary, and should work to be consistent with other vocabulary registries (e.g., oasis, info, srw).
  13. 13. MIFTG: recommendations Metadata specifications R6. Metadata should be able to describe multiple types of primary biodiversity data. R7. Metadata should support data discovery, interpretation, and analytical reuse. R8. Metadata should support search/browse by space, time, taxa, and theme. R9. Metadata should support search/browse by name of provider/name of organization. R10. Metadata should support search by related publications.
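As an illustration of R8 and R9, the sketch below filters an in-memory list of dataset descriptions by space, time, taxon, theme, and provider. The record layout, field names, and sample datasets are assumptions made for this example and are not a GBIF metadata model; bounding boxes that cross the antimeridian are not handled.

```python
from dataclasses import dataclass, field


@dataclass
class DatasetMetadata:
    title: str
    provider: str
    bbox: tuple   # (west, south, east, north) in decimal degrees
    years: tuple  # (first_year, last_year), inclusive
    taxa: set = field(default_factory=set)
    themes: set = field(default_factory=set)


def bbox_overlaps(a, b):
    """True if two (west, south, east, north) boxes share any area or edge."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def search(catalog, bbox=None, years=None, taxon=None, theme=None, provider=None):
    """Return datasets matching every criterion that was supplied."""
    hits = []
    for d in catalog:
        if bbox is not None and not bbox_overlaps(d.bbox, bbox):
            continue
        if years is not None and not (d.years[0] <= years[1] and years[0] <= d.years[1]):
            continue
        if taxon is not None and taxon not in d.taxa:
            continue
        if theme is not None and theme not in d.themes:
            continue
        if provider is not None and provider != d.provider:
            continue
        hits.append(d)
    return hits


# Hypothetical catalogue content.
catalog = [
    DatasetMetadata("Example Nordic bird observations", "Example Provider A",
                    (4.0, 54.0, 31.0, 71.0), (1990, 2008), {"Aves"}, {"occurrence"}),
    DatasetMetadata("Example tropical herbarium", "Example Provider B",
                    (-80.0, -20.0, -35.0, 10.0), (1950, 1999), {"Plantae"}, {"specimen"}),
]

birds_near_copenhagen = search(catalog, bbox=(10.0, 55.0, 13.0, 58.0), taxon="Aves")
```

A production catalogue would normally delegate the spatial and temporal tests to an index; the linear scan here only shows which fields the metadata needs to carry for such queries to be possible.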
  14. 14. MIFTG: recommendations Metadata catalog system recommendations R27. The metadata catalog system must support multiple metadata models natively. R28. The metadata catalog system must be able to return the original contributed metadata object. R29. The metadata catalog system must support unique versioning of metadata and data objects using globally unique identifiers to differentiate revisions. R42. The metadata catalog system should register with one or more node registries to advertise services available. R44. The metadata catalog system should provide attribution and branding for original metadata providers.
  15. 15. MIFTG: recommendations Network Architecture Recommendations R20. GBIF should build a distributed system of regional nodes, each containing a replica of all metadata. R21. Each regional node must replicate metadata to other regional nodes when record changes occur using a GBIF-prescribed replication protocol. R22. Each regional node should also provide a harvesting interface that exposes metadata via their unique identifiers. R25. GBIF needs a registry to maintain a list of regional nodes and their relevant service endpoints.
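R27 to R29 can be read together as a storage contract: keep what was contributed byte for byte, note which metadata model it follows, and give every revision its own globally unique identifier. The sketch below is one possible reading under those assumptions; the class, its method names, and the use of random UUIDs as the GUID scheme are hypothetical choices, not something the recommendations prescribe.

```python
import uuid


class MetadataCatalog:
    """Stores contributed metadata documents unchanged, one GUID per revision."""

    def __init__(self):
        self._revisions = {}  # revision GUID -> (model name, original bytes)
        self._history = {}    # dataset key -> list of revision GUIDs, oldest first

    def submit(self, dataset_key, model, original):
        """Store a new revision and return the GUID that identifies it."""
        guid = str(uuid.uuid4())
        self._revisions[guid] = (model, bytes(original))
        self._history.setdefault(dataset_key, []).append(guid)
        return guid

    def original(self, guid):
        """Return (model name, original bytes) exactly as contributed."""
        return self._revisions[guid]

    def latest(self, dataset_key):
        """Return the GUID of the most recent revision of a dataset."""
        return self._history[dataset_key][-1]

    def revisions(self, dataset_key):
        """Return all revision GUIDs of a dataset, oldest first."""
        return list(self._history[dataset_key])


catalog = MetadataCatalog()
first = catalog.submit("example-dataset", "EML", b"<eml>version one</eml>")
second = catalog.submit("example-dataset", "EML", b"<eml>version two</eml>")
```

Because the original bytes are never rewritten, the catalogue can hold several metadata models side by side and still hand back the contributed object on request.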
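The replication idea in R20 to R22 can be sketched with a naive push scheme. The slides do not describe the GBIF-prescribed replication protocol, so the version-number rule, the class, and the identifiers below are assumptions chosen only to show replicas converging and being harvested by identifier.

```python
class RegionalNode:
    """A node holding a full replica of metadata records, keyed by identifier."""

    def __init__(self, name):
        self.name = name
        self.peers = []
        self.records = {}  # identifier -> (version, metadata)

    def connect(self, other):
        """Link two nodes so that changes flow in both directions."""
        if other not in self.peers:
            self.peers.append(other)
        if self not in other.peers:
            other.peers.append(self)

    def put(self, identifier, version, metadata):
        """Store a record if it is newer than the local copy, then push to peers.

        Returns True if the local replica changed. Peers that already hold this
        version (or a newer one) return False, which stops the propagation.
        """
        current = self.records.get(identifier)
        if current is not None and current[0] >= version:
            return False
        self.records[identifier] = (version, metadata)
        for peer in self.peers:
            peer.put(identifier, version, metadata)
        return True

    def harvest(self, identifier):
        """Harvesting interface: expose a record by its unique identifier."""
        return self.records[identifier]


# Three hypothetical regional nodes connected in a chain.
europe, americas, asia = RegionalNode("europe"), RegionalNode("americas"), RegionalNode("asia")
europe.connect(americas)
americas.connect(asia)

europe.put("example:dataset:1", 1, {"title": "Example dataset"})
asia.put("example:dataset:1", 2, {"title": "Example dataset, revised"})
```

A real protocol would also need to handle concurrent edits, node failures, and authentication, none of which this sketch attempts.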
  16. 16. LSID-GUID Task Group (LGTG): To provide recommendations and guidelines on deployment of LSIDs and other GUIDs on the GBIF network with particular reference to the potential role of GBIF as a stable, long term provider of GUID resolution services. - Review the plans for a decentralised GBIF informatics architecture to ascertain requirements for GUID technologies - Evaluate a role for GBIF in provision of LSID hosted services - Identify the data models/vocabularies for use in metadata returned on GUID resolution - Propose a business model for adopting LSIDs - Review the main GUID technologies - Identify solutions for integrating GUIDs (e.g., LSIDs, Handles, DOIs) with the Semantic Web and the Linked Data model
  17. 17. LSID-GUID Task Group: summary Effective identification of data objects is essential for linking the world’s biodiversity data. If GBIF is to enable the exchange of biodiversity data it must promote identifier adoption through: - education, training, outreach - leadership - practical services
  18. 18. LGTG: recommendations Recommendation 4 GBIF should encourage, support and advise on the use of appropriate identifier technologies, in particular LSIDs and HTTP URIs, but not impose a requirement for one at the expense of the other. GBIF should provide specific advice for the issuing and use of LSIDs and for HTTP URIs.
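To make the two identifier styles in Recommendation 4 concrete, the sketch below parses an LSID, which has the general form `urn:lsid:authority:namespace:object[:revision]`, and wraps it in an HTTP URI by appending it to a proxy base. The proxy address and the sample LSID are hypothetical; appending the LSID to a resolver URL is one common convention, not a GBIF requirement.

```python
from typing import NamedTuple, Optional


class LSID(NamedTuple):
    authority: str
    namespace: str
    object_id: str
    revision: Optional[str]


def parse_lsid(text):
    """Split an LSID of the form urn:lsid:authority:namespace:object[:revision]."""
    parts = text.split(":")
    if len(parts) not in (5, 6):
        raise ValueError("not an LSID: %r" % text)
    if parts[0].lower() != "urn" or parts[1].lower() != "lsid":
        raise ValueError("not an LSID: %r" % text)
    if any(part == "" for part in parts[2:5]):
        raise ValueError("LSID has an empty component: %r" % text)
    revision = parts[5] if len(parts) == 6 and parts[5] else None
    return LSID(parts[2], parts[3], parts[4], revision)


def to_http_uri(lsid_text, proxy_base):
    """Build an HTTP URI for an LSID by appending it to a proxy base URL."""
    parse_lsid(lsid_text)  # validate before building the URI
    return proxy_base.rstrip("/") + "/" + lsid_text


# Hypothetical identifier and proxy, used only for illustration.
example = parse_lsid("urn:lsid:example.org:names:12345")
http_form = to_http_uri("urn:lsid:example.org:names:12345", "http://resolver.example.org/")
```

Offering both forms lets LSID-aware clients use the URN while ordinary web clients follow the HTTP URI, which fits the recommendation not to favour one technology at the expense of the other.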
  19. 19. LGTG: recommendations Recommendation 5 GBIF should support a promotional programme, including: - workshops for data providers on awareness of identifiers and choosing and implementing persistent identifiers; - technical and deployment training programmes; - maintaining a system of “quality marks” for compliant collaborators (data providers, aggregators, etc.).
  20. 20. LGTG: recommendations Recommendation 6 The GBIF data portal should demonstrate good practice by: - maintaining fields for identifiers, including those from data providers; - assigning GBIF identifiers to cached objects; - ensuring that property values in GBIF records are persistent resolvable identifiers if possible.
  21. 21. LGTG: recommendations Recommendation 8 GBIF should make data more inter-connected by: - adopting current best practice for interconnected data (Linked Data principles); - outputting RDF graphs; - using existing vocabularies and GUIDs wherever possible.
  22. 22. LGTG: recommendations Recommendation 8 GBIF should make data more inter-connected by: - adopting current best practice for interconnected data (Linked Data principles); - outputting RDF graphs; - using existing vocabularies and GUIDs wherever possible.
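As a small illustration of Recommendation 8, the sketch below serialises a record as RDF in the N-Triples syntax using only the standard library, with property names taken from existing vocabularies (Dublin Core terms and Darwin Core). The subject and taxon URIs are hypothetical, and the hand-written serialiser covers only plain string literals and URI references; a real deployment would more likely use an RDF library.

```python
class URI(str):
    """Marks a value as a URI reference rather than a plain literal."""


def escape_literal(value):
    """Escape the characters that N-Triples requires to be escaped in literals."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def to_ntriples(subject, properties):
    """Serialise (predicate, value) pairs about one subject as N-Triples lines."""
    lines = []
    for predicate, value in properties:
        if isinstance(value, URI):
            obj = "<%s>" % value
        else:
            obj = '"%s"' % escape_literal(value)
        lines.append("<%s> <%s> %s ." % (subject, predicate, obj))
    return "\n".join(lines)


DCTERMS = "http://purl.org/dc/terms/"
DWC = "http://rs.tdwg.org/dwc/terms/"

# Hypothetical record; only the vocabulary namespaces are real.
graph = to_ntriples(
    "http://data.example.org/occurrence/1",
    [
        (DCTERMS + "title", 'Specimen of "Puma concolor"'),
        (DWC + "scientificName", "Puma concolor"),
        (DCTERMS + "references", URI("http://data.example.org/taxon/42")),
    ],
)
```

Linking to another resource by URI rather than repeating its name as text is what lets consumers follow the connection, which is the point of the Linked Data principles the slide refers to.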
  23. 23. LGTG: recommendations Recommendation 10 GBIF should provide services to support identifier resolution, redirection, metadata hosting, and caching.
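The services named in Recommendation 10 (resolution, redirection, caching) can be sketched as a single in-memory resolver. The class, its method names, and the hop limit are hypothetical; the slides do not specify how GBIF would implement these services.

```python
class Resolver:
    """Resolves identifiers to metadata locations, following redirects."""

    def __init__(self, max_hops=5):
        self.locations = {}  # identifier -> URL of its current metadata
        self.redirects = {}  # superseded identifier -> replacement identifier
        self.cache = {}      # identifier -> previously resolved URL
        self.max_hops = max_hops

    def resolve(self, identifier):
        """Return the metadata URL for an identifier, using the cache if possible."""
        if identifier in self.cache:
            return self.cache[identifier]
        current = identifier
        for _ in range(self.max_hops + 1):
            if current in self.locations:
                self.cache[identifier] = self.locations[current]
                return self.cache[identifier]
            if current not in self.redirects:
                raise KeyError("unknown identifier: %r" % identifier)
            current = self.redirects[current]
        raise RuntimeError("too many redirects for %r" % identifier)


# Hypothetical identifiers and URLs.
resolver = Resolver()
resolver.locations["example:occurrence:2"] = "http://data.example.org/occurrence/2"
resolver.redirects["example:occurrence:1"] = "example:occurrence:2"

url = resolver.resolve("example:occurrence:1")
```

The cache in this sketch is never invalidated, so a real service would need an expiry policy to stay consistent when identifiers are redirected again.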
  24. 24. LGTG: recommendations Recommendation 11 GBIF should provide additional services, including persistent identifier monitoring services.
  25. 25. LGTG: recommendations Recommendation 12 GBIF should extend the role of its data portal by hosting resources related to the use of identifiers, such as the TDWG vocabularies.
  26. 26. How to contact GBIF: - Web site: - Data portal: - GBIF Secretariat, Universitetsparken 15, DK-2100 Copenhagen Ø, Denmark - E-mail: [email_address] - Phone: +45 3532 1470 - Fax: +45 3532 1480 - GBIF Secretariat building, supported by a grant from the Aage V. Jensens Fonde