
Gbrds Summary Final July2009 (2)


Published in: Technology

GBIF Summary Paper: The Global Biodiversity Resources Discovery System (GBRDS)

Background: One of the major challenges for existing biodiversity informatics infrastructure is to provide users with ways that substantially increase their ability to discover and access relevant biodiversity information and data resources. Our current ability to discover distributed, isolated and unknown data and information resources is limited. GBIF has already demonstrated that a worldwide distributed network of biodiversity data publishers can be linked together and made searchable from a single point of access. Currently, more than 181 million primary biodiversity data records are searchable and downloadable online from over 296 publishers, demonstrating the feasibility of linking data-holding institutions and individuals at national, regional and thematic levels. However, this represents only a small fraction of the biodiversity information and data resources that exist (and will be generated in the future) yet are not, and will not be, readily discoverable without a comprehensive discovery system.

Rationale: One of the major challenges for GBIF as a leading global biodiversity informatics infrastructure is to provide an innovative means of discovering and accessing all relevant information and data resources for the benefit of the biodiversity community at large. GBIF is addressing this challenge by developing a Global Biodiversity Resources Discovery System (GBRDS) for the registration and discovery of biodiversity information and data resources and services, as set out in our Work Programme 2009-2010.

Previously, users of resources such as a library or a biological collection had to rely on a card catalogue, which was the only means of finding much of the material available in libraries or other collections.
Today, of course, much of the information painstakingly captured on catalogue cards has become available digitally. It can be cross-searched between catalogues, and relevant resources can easily be located remotely using titles, authors, abstracts, dates or keywords. More than a global catalogue, what the Biodiversity Informatics community requires today is a comprehensive directory of resources and a road map to access them: a resource discovery system. Like a compass, such a global directory would offer a unique map of existing institutions, collections, datasets, services and other relevant information resources.

At present the GBIF infrastructure functions on a Universal Description, Discovery and Integration (UDDI) platform, which allows for the registration of services and their associated technical access points. It is used as a network compass to locate the data publishers' access points exposed on the Internet via Web services. However, UDDI alone falls short of mapping the full complexity of the networked GBIF community and does not address more complex issues
such as understanding the roles and relationships between institutions, collections and resources. For example, within the DarwinCore schema, the two fundamental concepts (InstitutionCode and CollectionCode) are intended to refer to a "standard" code that uniquely identifies the institution and the collection. However, no global registry exists for assigning unique, standardised institutional (and collection) codes that publishers should use across the various disciplines.

In 2008, acutely aware of this growing challenge, GBIF, together with Biodiversity Information Standards (TDWG) and the Royal Botanic Garden Edinburgh (RBGE), initiated a project called the 'Biodiversity Collections Index' (BCI). Its aim was to facilitate the understanding, conservation and utilisation of global biodiversity resources by creating a single global annotated index of biodiversity collections based on existing authoritative references. The BCI feasibility project intended to do this by collaborating with the organisations and individuals who curate these collections. The BCI project was not intended to duplicate these authoritative references but to reconcile them through a central reference point and associate them with a globally unique identifier. The BCI project succeeded in mobilizing interest from the community, but integrating it within the full GBIF infrastructure and community requires broadening its scope: content, coverage and reconciliation ability.

DESIGN: A GBRDS should ideally be a combination of 1) a Registry of resources and services and 2) a set of discovery services interacting with existing infrastructure such as GBIF to facilitate the discovery of biodiversity information. The most important component, the Registry, would facilitate the inventory of information resources by creating a single annotated index of publishers, institutions, networks, collections (datasets), schema repositories and services.
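The reconciliation idea behind the BCI and the Registry can be illustrated with a minimal Python sketch. It is not the GBRDS implementation: the class `CollectionIndex`, its methods, the use of UUIDs as identifiers, and the example codes and reference names are all assumptions made for illustration. It shows how the same InstitutionCode/CollectionCode pair can be ambiguous across authoritative references, and how a central index can tie each source-specific pair to one globally unique identifier.

```python
import uuid


class CollectionIndex:
    """Maps source-specific (InstitutionCode, CollectionCode) pairs to one identifier."""

    def __init__(self):
        self._names = {}    # identifier -> collection name
        self._by_code = {}  # (source, institution_code, collection_code) -> identifier

    def register(self, name):
        """Create a new collection entry and return its unique identifier."""
        identifier = str(uuid.uuid4())  # assumption: a UUID stands in for the real ID scheme
        self._names[identifier] = name
        return identifier

    def add_code(self, identifier, source, institution_code, collection_code):
        """Record that a reference source knows the collection under these codes."""
        if identifier not in self._names:
            raise KeyError(f"unknown identifier: {identifier}")
        key = (source, institution_code, collection_code)
        existing = self._by_code.setdefault(key, identifier)
        if existing != identifier:
            raise ValueError(f"{key} is already reconciled to {existing}")

    def resolve(self, source, institution_code, collection_code):
        """Return the identifier for a code pair as used by one source, or None."""
        return self._by_code.get((source, institution_code, collection_code))

    def candidates(self, institution_code, collection_code):
        """Return every identifier that uses this code pair in any source."""
        return {
            identifier
            for (_, inst, coll), identifier in self._by_code.items()
            if (inst, coll) == (institution_code, collection_code)
        }


# Hypothetical collections and codes, invented for this example.
index = CollectionIndex()
herbarium = index.register("Example Herbarium")
museum = index.register("Example Museum, Botany Department")

index.add_code(herbarium, "reference_a", "EXH", "BOT")
index.add_code(herbarium, "reference_b", "EX-HERB", "PLANTS")
index.add_code(museum, "reference_b", "EXH", "BOT")

print(index.resolve("reference_a", "EXH", "BOT") == herbarium)
print(len(index.candidates("EXH", "BOT")))
```

In this sketch the pair `EXH`/`BOT` alone is ambiguous, since `candidates` returns two identifiers, whereas the pair qualified by its reference source resolves to exactly one collection.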
The envisaged GBRDS is not conceived simply as a collection of centralized indexes but rather as an integrated 'Yellow Pages' reference of all biodiversity information resources, reconciling all distributed resources and providing a meaningful way to discover them in a distributed manner. Any interested organisation should be able to register its resources and services in the GBRDS and contribute to the discovery services. It is envisaged that the GBRDS will form the core of the next generation of biodiversity informatics infrastructure, built on the principles of distributed architecture and decentralized implementation.

Through the related activities planned in the GBIF 2009-2010 Work Programme, the GBRDS, in conjunction with existing infrastructures, is intended to become by December 2010 a unified global entry point for the discovery of all kinds of information (who, what, where, when and how) about biodiversity resources, both digitised and undigitised, including primary and non-primary biodiversity data, standards and services, and for integrating the GBIF network with other systems and networks. The functionalities and services offered by the GBRDS are intended to be scalable and to evolve over time based on the needs of the community, improvements to the global biodiversity information infrastructure, and interoperable linkages with similar networks such as GEO-BON. The GBRDS is conceived as a discovery system that will provide a platform for a coherent global map of resources. It is therefore foreseen as a catalyst for integrating all existing resources and enabling their discovery and use.
SPECIFICATIONS

Two main components constitute the GBRDS: (a) registry and (b) discovery services.

a) REGISTRY

The registry will facilitate unified registration, disambiguation and resolution of data resources and services. In other words, the GBRDS will incorporate an inventory of 'persistent identifiers' that glues together all the data components within and outside the GBIF infrastructure. Thus, the registry will provide a 'wiring diagram' of the topologies and available services of GBIF and other biodiversity-related networks.

  • Open source (Apache 2.0 license), Java-based, customisable, multilingual web application.
  • Web-based user interface and web services offering:
      o Creation, deletion and update of network entities (institutions, datasets, products, protocols, services, thematic networks etc.)
      o Management of multiple relationships between network entities, allowing resources to participate in many networks, for example.
      o Role-based user management, allowing multiple data curators to participate in the registration of network entities.
      o Browse and search capabilities for registered resources, enabling discovery of access points.
      o Ability to filter the network view so that only registered entities participating in a specific thematic network are displayed: "Show me the publishers contributing to Network X and their relationships"
      o Services to help uniquely identify entities by reconciling multiple identifiers to the same network entity.
      o Registration of IPT-specific extensions and controlled vocabularies
      o Notification to appropriate administrators when changes are made
      o Services for assignment and resolution of persistent identifiers for network entities
      o Services for resolution of 'Persistent Identifiers' assigned by other registries/discovery systems

b) DISCOVERY SERVICES

The GBIF infrastructure aims to provide search and discovery of the following data types:

  • Primary biodiversity data
      o Single location (e.g. point based)
      o Grid based, such as plot monitoring
      o Individual specimen monitoring
  • Taxonomic and checklist information
  • Species distributions (range of distribution, etc.)
  • Multimedia resources in biodiversity
  • Literature-based biodiversity data
  • Data resource (dataset) descriptions or metadata
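Several of the registry capabilities listed above (creating and deleting network entities, managing relationships between them, and filtering the view to one thematic network) can be sketched in a few lines of Python. This is an illustrative in-memory model only, not the Java-based application described in this paper: the names `Entity`, `Registry`, `network_view` and the relation label `contributes_to` are hypothetical.

```python
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class Entity:
    key: int    # registry-assigned identifier (a plain counter in this sketch)
    kind: str   # e.g. "publisher", "dataset", "network", "service"
    title: str


class Registry:
    def __init__(self):
        self._next_key = count(1)
        self.entities = {}      # key -> Entity
        self.relations = set()  # (subject_key, relation, object_key)

    def create(self, kind, title):
        entity = Entity(next(self._next_key), kind, title)
        self.entities[entity.key] = entity
        return entity

    def delete(self, entity):
        # Remove the entity together with every relationship that mentions it.
        del self.entities[entity.key]
        self.relations = {
            r for r in self.relations if entity.key not in (r[0], r[2])
        }

    def relate(self, subject, relation, obj):
        # An entity may hold many relationships, so it can join many networks.
        self.relations.add((subject.key, relation, obj.key))

    def network_view(self, network):
        """Return the entities contributing to a network and their relationships."""
        member_keys = {
            s for s, rel, o in self.relations
            if rel == "contributes_to" and o == network.key
        }
        members = [self.entities[k] for k in sorted(member_keys)]
        links = sorted(r for r in self.relations if r[0] in member_keys)
        return members, links


# Hypothetical publishers and networks, invented for this example.
registry = Registry()
pub_a = registry.create("publisher", "Publisher A")
pub_b = registry.create("publisher", "Publisher B")
net_x = registry.create("network", "Network X")
net_y = registry.create("network", "Network Y")

registry.relate(pub_a, "contributes_to", net_x)
registry.relate(pub_a, "contributes_to", net_y)
registry.relate(pub_b, "contributes_to", net_y)

members, links = registry.network_view(net_x)
print([m.title for m in members])
```

Storing relationships as subject, relation, object triples is one simple design choice here; it lets a publisher contribute to several networks without duplicating its entry.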
Other services are made available by many organisations and initiatives. The discovery services of the GBRDS aim to document these so that they can be easily discovered, even by machine-to-machine systems. Ultimately, such discovery services combined with the content of the Registry will provide a unique reference of all biodiversity information resources. They will enable scientists and other users to discover simply and quickly what information resources are available in the global distributed network, and which analytical tools and services are available to analyse them.

DEVELOPMENT SCHEDULE

  • GBRDS Summary Paper: August 2009
  • GBRDS Stakeholders workshop: September 2009
  • GBRDS version 1.0: October 2009
  • GBRDS version 2.0: December 2010

RESOURCES

  1. Source code, wiki, bug reporting
  2. GBIF communications portal

GBIF CONTACTS

  • Head of Informatics: SAMY GAIJI
  • Systems Architect: TIM ROBERTSON
  • Senior Programme Officer for DIGIT: VISHWAS CHAVAN
  • Developer: JOSE CUADRA