Remsen EOL Content Summit


Global Biodiversity Information Facility presentation to the Encyclopedia of Life Content Summit

Published in: Technology
  • Before I describe the challenges inherent to the index, I’d like to illustrate how primary biodiversity data has been used in various scientific and biodiversity policy-related contexts.
  • GBIF has a specific focus within biodiversity information in that our scope is restricted to the mobilisation, discovery, and use of primary biodiversity data. Primary biodiversity data are the digital text or multimedia data records that detail the instance of an organism – the ‘what, where, when, how and by whom’ of the organism’s occurrence and recording. One major class of primary biodiversity data is that derived from natural history collections.
  • A second class of primary biodiversity data originates with observations of species. Numerous observational data networks collect millions of species observations every year.
  • GBIF represents a federated network that is composed of thousands of different primary biodiversity databases located all over the world.
  • GBIF has invested heavily in the development of Darwin Core Archive data publishing tools and supporting documentation.
  • Two things make all of these different databases part of the GBIF network. First, the data are made available on the Internet using a common set of communications protocols and data formats. Second, a registry, which lists all members of the network and the location of the data itself (often a URL), serves as a master network directory.
  • Lists of these resources are available via RESTful machine interfaces. Here is an example that lists all Darwin Core Archive checklist data as a JSON object.
  • The registry and communications protocols are utilised to poll each database in the network and retrieve an index of the biodiversity data records they contain. The index includes the key taxonomic, geospatial, and provenance elements of the data record. This allows the data to be visually represented, for instance, on a map of the Earth.
  • The data in the index are made available through the GBIF data portal. A primary means by which data are accessed is via taxonomic organisation – either by searching for a taxon by keyword or by browsing through a taxonomic hierarchy.
  • Currently the GBIF index stands at over 310 million records from over 9000 different databases. Each of these records carries the name of the taxon, usually a species, that it is associated with. The total number of scientific names in this virtual dataset exceeds 6 million different text strings – far exceeding the number of known species. Correctly interpreting this list of names is a key requirement in enabling effective use of the index.
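
The gap between distinct name strings and actual species can be illustrated with a small sketch. The sample names and the `normalise` helper are hypothetical and not part of any GBIF tool. The point is only that trivial variants of one name count as separate text strings, and that naive cleanup removes some of them but not all.

```python
import re

# Hypothetical name strings as they might arrive from different databases.
raw_names = [
    "Puma concolor",
    "Puma concolor (Linnaeus, 1771)",
    "puma  concolor",
    "Puma concolor ",
    "Felis concolor Linnaeus, 1771",
]


def normalise(name: str) -> str:
    """Collapse whitespace and case only; authorship and synonymy are untouched."""
    return re.sub(r"\s+", " ", name).strip().lower()


distinct_strings = set(raw_names)
distinct_normalised = {normalise(n) for n in raw_names}

print(len(distinct_strings))     # distinct text strings before cleanup
print(len(distinct_normalised))  # fewer, but still more than one per species
```

Whitespace and case cleanup merges only the most superficial variants. Authorship strings and synonyms such as Felis concolor still need taxonomic interpretation, which is the harder part of the problem described in the note above.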
  • Transcript of "Remsen EOL Content Summit"

    1. EOL Content Summit, Barro Colorado Island, Panama. Global Biodiversity Information Facility. David Remsen, Senior Programme Officer, Global Biodiversity Information Facility (GBIF). January 2012
    2. GBIF and its parts
    3. GBIF is composed of countries
    4. GBIF Governing Board
    5. GBIF Organisation
    6. GBIF Participant Countries
    7. Why is this important?
       • Capacitate contributing countries
       • Support creation of national biodiversity information facilities
       • Serve as a means to mobilise and discover biodiversity data – not an end
         • Not just a single portal application
    8. GBIF Data Scope
    9. PRIMARY BIODIVERSITY DATA
    10. PRIMARY BIODIVERSITY DATA
    11. SPECIES INFORMATION
    12. SPECIES INFORMATION Distribution Species Descriptions!! Classification Synonymy Bibliography Specimens Common Names Images Annotated Species Checklists General Descriptions Morphology Behavior Conservation Diagnostic Reproduction
    13. GBIF Infrastructure Components
    14. GBIF IS A FEDERATED NETWORK A “network of networks”
    15. Heterogeneous biodiversity databases
    16. Standard Formats/Protocols set the scope of the network
    17. Standard Formats/Protocols set the scope of the network
    18. Darwin Core Archives: a text-based solution to publishing biodiversity data
    19. Core Data file: each row = 1 taxon
       • CSV or TAB
       • Easily exported from DB
       • Easy to import into Excel
       • Classification
       • Synonymy
       • Checklist parts
       [Diagram labels: Taxon]
    20. Extending Darwin Core: one-to-many
       • Extensions defined via simple schema
       • Darwin Core or other terms
       • Linked to controlled vocabularies
       • One taxon – many extension records
       • Simple to Export
       • Simple to Manage
       • Supports sharing of EOL content
       [Diagram labels: Taxon Taxon Descriptions Distribution]
    21. Archive is a stand-alone data file. No complicated protocols required. Data is shared with URLs.
    22. Standards-based Data Publishing: data publishing tools; user guides, references, best practices
    23. Integrated Publishing Toolkit
    24. Easy to customise/internationalise
    25. http://tools.gbif.org/resource-browser/ Knowledge Organisation System
    26. Common discovery system http://gbrds.gbif.org
    27. http://gbrds.gbif.org/registry/service.json?type=DWC-ARCHIVE-CHECKLIST Access to resources
    28. Common discovery system http://gbrds.gbif.org
    29. GLOBAL DATA INDEX
    30. DATA PORTAL DISCOVERY ACCESS
    31. 317,199,241 data records; 9,290 datasets; 6,112,683 “names”
    32. Nodes Portal Toolkit http://npt-demo.gbif.org/
    33. EOL discussion points
    34. How can EOL and GBIF simplify the process of mobilisation and discovery of biodiversity data/content?
    35. Leverage and contribute to a common biodiversity data mobilisation network
    36. Dataset Registry
    37. Adoption of DwC-A for some EOL resources, particularly those within the common scope of EOL and GBIF
    38. Develop shared vocabularies; internationalise them
    39. Darwin Core Archive-related documentation
    40. IPT and other publishing tools. One tool: multiple data types. GBIF support of Plinian Core, Audubon Core. See: Customizing the IPT
    41. Other ideas? [email_address]
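
The slides on Darwin Core Archives describe a core data file with one row per taxon and extension files that hold many rows per taxon. A minimal sketch of that one-to-many link follows, using only the Python standard library. The file contents, the choice of `taxonID` and `vernacularName` columns, and the `read_rows` and `group_extension` helpers are illustrative assumptions. A real archive also carries a descriptor file that maps columns to terms, which is omitted here.

```python
import csv
import io
from collections import defaultdict

# Hypothetical tab-delimited core file: each row is one taxon.
core_tsv = (
    "taxonID\tscientificName\n"
    "1\tPuma concolor\n"
    "2\tPanthera onca\n"
)

# Hypothetical extension file: many rows may point at one core taxonID.
vernacular_tsv = (
    "taxonID\tvernacularName\tlanguage\n"
    "1\tcougar\ten\n"
    "1\tpuma\tes\n"
    "2\tjaguar\ten\n"
)


def read_rows(text):
    """Parse tab-delimited text with a header row into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def group_extension(rows, key="taxonID"):
    """Group extension rows by the core identifier they refer to."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row[key]].append(row)
    return grouped


core = read_rows(core_tsv)
names_by_taxon = group_extension(read_rows(vernacular_tsv))

# Walk the core file and attach each taxon's extension records.
for taxon in core:
    common = [r["vernacularName"] for r in names_by_taxon[taxon["taxonID"]]]
    print(taxon["scientificName"], common)
```

Because both files are plain delimited text keyed on a shared identifier, the same grouping step would work for other extensions such as descriptions or distributions.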