Global Names Architecture - Remsen


Presentation to the GNA meeting hosted by Catalogue of Life in Paris, France on 8-10 November.

Published in: Technology, Education

  • I was digitising resources like this gem – beautiful plates and lots of nice metadata to index them with. The only problem was that the only really useful biological metadata was the scientific name that labeled each picture. Had I only known what I was getting into.
  • This is another early digitisation project. A seminal work. Only the name is spelled incorrectly, or at least incorrectly according to today's spelling. I can't change the spelling in the source, so what can I do about it? At the time I was short on ideas.
  • Pictures and specimens and gene sequences are labelled with names like this. Eventually I learned these are all different forms of the same name. For various reasons, however, different sources often prefer to retain their versions of the name.
  • This paper contains historic catch and distribution data. But the name represents a combination that is no longer in use. How can we retrieve historic information on species where the name has changed from what is in use today?
  • In 1999, the fungus responsible for one of the major causes of death in people with HIV was renamed, following a split of the species. Those who work on this disease must now be aware of the new name in order to access information related to the species.
  • Here is a map showing a distribution of data related to hummingbirds. It is assembled because we have a taxonomic authority source that specifies all the names of the species within the hummingbird family. As a result the map is an accurate representation (aside from some stray data with clear geospatial issues).
  • Without access to sufficient authoritative taxonomic data, we have been forced to rely on less accurate classification data originating in occurrence datasets. These datasets often contain errors such as the one illustrated here, where a European bird species was mistakenly placed in the hummingbird family.
  • This outreach extends to a new suite of data publishing guides and tools that provide details on data formats, checklist metadata, and checklist publishing tools.
  • In 2011 the number of taxonomic authority files published through the network doubled, thanks to promotional efforts within the GBIF network and partnerships that include other taxonomic initiatives.
  • in use by ALA/ . Consultation through the GNA. Soliciting feedback about the APIs. Require discussion with community about attribution? Something about it being the component in the evolved portal that will provide all taxonomic services, including means to organise content
  • Global Names Architecture - Remsen

    1. 1. GNA Meeting, Paris France Global Names Architecture Meeting David Remsen Senior Programme Officer Global Biodiversity Information Facility (GBIF) 2011
    2. 2. Somewhere around 2001
    3. 3. From T.E. Glover, The Fishes of Southwestern Japan, c.1870
    4. 4. Orthography The long-finned squid, Loligo pealeii (Laseur)
    5. 5. Orthography: Reconciling different forms of the same name. Agalinus paupercula borealis; Agalinus pauperculum borealis; Agalinis paupercula var. Borealis; Agalinus pauperculum var. borealis; Agalinus paupercula var. borealis; Agalinus paupercula var. borealis Pennell; Agalinus paupercula Britton var. borealis Pennell; Agalinus paupercula (Gray) Britt. var. borealis Pennell; Agalinis paupercula (A.Gray) Britton var. borealis Pennell; Agalinus paupercula (Gray) Britton var. borealis (Pennell) Zenkert 1934
    6. 6. Nomenclature The Bluefish, Temnodon saltator
    7. 7. Taxonomy P. carinii sec 1 P. carinii sec 2 P. jiroveci
    8. 8. With access to authority information Higher Taxonomy
    9. 9. Without authority information Higher Taxonomy
    10. 10. Issues that are not unique Particularly in federated systems
    11. 11. Addressed by… Taxonomic Data Sources: Classification, Taxonomic Status, Heterotypic Synonymy, Taxon Identifiers. Nomenclatural Data Sources: Orthography, Nomenclatural Status, Objective Synonymy, Nomenclatural Identifiers.
    12. 12. A lot of this… Catalog of Life, Index Fungorum, Species Fungorum, Tropicos, LepIndex, GRIN, DSMZ, Euzeby index, IPNI, ITIS, Euro + Med Plantbase, Index Nominum Diptorum, Orthoptera Species File, The Plant List, NCBI Taxonomy, World Register of Marine Species, Angiosperm Phylogeny Group list, Solanaceae Source, Amphibian Species World, World Spider Catalogue, AlgaeBase, Index Nominum Algarum, Index Nominum Genericorum, ZooBank, ERMS, IUCN RedList, Mammal Species of World, Catalog of Fishes, FishBase, Catalog of Life, Index Animalium, ION, Nomenclator Zoologicus, Fauna Europaea, IRMNG, NZOR, Coleorrhyncha Species File
    13. 13. Not a lot of this. Common: Discovery Network, Documentation (metadata) model, Data Sharing Format. Data Sharing tools. Consensus Web Service methods. Few resolvable identifiers: No common resolution output. Little Integration.
    14. 14. (nearly) All names matter. “All accumulated information of a species is tied to a scientific name, a name that serves as a link between what has been learned in the past and what we today add to the body of knowledge.”
    15. 16. We need: Global discovery of nomenclature and taxonomic resources. Common access to these resources. Reconcile names labeling data and information to nomenclature and taxa. Embedded services that add value to these resources.
    16. 16. uBio
    17. 17. An index of all names used with biodiversity information reconciled to authoritative nomenclators
    18. 18. An index of taxon resources and species checklists
    19. 20. Nice Idea No Architecture
    20. 21. Without an architecture: Ad-hoc. Requires personal networking. No clear fit to a larger picture.
    21. 22. Common approach to common tasks
    22. 23. Architecture: Global Registry for resource discovery. Common and documented data standards (Metadata, Data, Vocabularies). Data Sharing tools. Common web service methods. Resolvable identifiers (names/taxa).
    23. 24. Architecture: Enable global discovery of taxonomic and nomenclatural resources, including derivative products (regional and thematic species checklists). Enable resources to be shared in a consistent manner. Promote development of new derived products.
    24. 25. Enable global discovery
    25. 26. Integrated Publishing Toolkit: Supports Publication of Species Checklists (sensu lato). Supports EML as resource metadata format. Darwin Core Archive as output format. Possible to add: ISO metadata output; TCS data output (lossy relative to source data). Integrated Publishing Toolkit 2.0
    26. 27. Lowered the technical barriers to data publishing: Publishing with spreadsheets. Publishing via Email. Publishing with no installed tools. Publishing with no tools at all.
    27. 28. Darwin Core Archives
    28. 29. Lots of documentation > 2500 downloads English/French/Spanish
    29. 30. Many resources available
    30. 31. Promote Development of New Derived Products
    31. 32. GBIF Involved but not integrated: Global Names Index, Global Name Usage Bank. Supported by what has been presented
    32. 33. Checklist Bank for GBIF network
    33. 34. Checklist Bank Status: Dev version in place. Integration with GBIF data portal 2011
    34. 35. i4Life
    35. 36. Vision: Common platform for multiple initiatives to discover and exchange taxonomic and nomenclatural information. New derived products that improve efficiency and utility of taxonomic process. Embed taxonomy within larger biodiversity informatics challenges.
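
The reconciliation problem on the Orthography slide (ten forms of one Agalinis name) can be illustrated with a minimal Python sketch. It is not the method used by any GNA component: the helper names `canonical_form` and `group_name_forms`, the short list of rank markers, and the 0.85 similarity threshold are all assumptions chosen for illustration. The idea is to strip authorship and rank markers to get a bare genus-plus-epithets string, then group strings whose spelling is nearly identical.

```python
import re
from difflib import SequenceMatcher

# Rank markers that may sit between epithets (illustrative, not exhaustive).
RANK_MARKERS = {"var.", "subsp.", "ssp.", "f."}

# The ten name strings shown on the Orthography slide.
NAME_FORMS = [
    "Agalinus paupercula borealis",
    "Agalinus pauperculum borealis",
    "Agalinis paupercula var. Borealis",
    "Agalinus pauperculum var. borealis",
    "Agalinus paupercula var. borealis",
    "Agalinus paupercula var. borealis Pennell",
    "Agalinus paupercula Britton var. borealis Pennell",
    "Agalinus paupercula (Gray) Britt. var. borealis Pennell",
    "Agalinis paupercula (A.Gray) Britton var. borealis Pennell",
    "Agalinus paupercula (Gray) Britton var. borealis (Pennell) Zenkert 1934",
]


def canonical_form(name: str) -> str:
    """Reduce a name string to lowercase genus and epithets (hypothetical helper)."""
    # Drop parenthesised authorship such as "(A.Gray)" or "(Pennell)".
    tokens = re.sub(r"\([^)]*\)", " ", name).split()
    if not tokens:
        return ""
    parts = [tokens[0].lower()]  # first token is taken to be the genus
    after_rank = False
    for token in tokens[1:]:
        if token.lower() in RANK_MARKERS:
            after_rank = True
            continue
        # Keep the token right after a rank marker, or any all-lowercase word;
        # capitalised words, abbreviations and years are treated as authorship.
        if after_rank or (token.isalpha() and token.islower()):
            parts.append(token.lower())
        after_rank = False
    return " ".join(parts)


def group_name_forms(names, threshold=0.85):
    """Group strings whose canonical forms are near-identical in spelling."""
    clusters = []  # list of (representative canonical form, original strings)
    for name in names:
        canon = canonical_form(name)
        for representative, members in clusters:
            if SequenceMatcher(None, representative, canon).ratio() >= threshold:
                members.append(name)
                break
        else:
            clusters.append((canon, [name]))
    return clusters


if __name__ == "__main__":
    for representative, members in group_name_forms(NAME_FORMS):
        print(representative, "<-", len(members), "forms")
```

A sketch like this only groups spelling variants. It says nothing about which spelling is correct, and it would mishandle lowercase author particles such as "de" or "van"; deciding the accepted form still requires an authoritative nomenclator of the kind listed in the slides.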