Remsen Lect04

    Remsen Lect04 - Presentation Transcript

    1. GLOBAL BIODIVERSITY INFORMATION FACILITY David Remsen, Senior Programme Officer, GBIF 15 September 2009, Biodiversity Informatics WWW.GBIF.ORG Global Names Architecture A Rationale Brief History Components
    2. Biodiversity Information: A focus on taxa All accumulated information of a species is tied to a scientific name, a name that serves as a link between what has been learned in the past and what we today add to the body of knowledge. - Grimaldi & Engel, 2005, Evolution of the Insects Biodiversity Informatics: Creation, Curation, Discovery, Delivery of biodiversity information
    3. A name that serves as a link to what has been learned in the past… From T.E. Glover, The Fishes of Southwestern Japan, c.1870
    4. A name that serves as a link to what has been learned in the past… Unlike many other domains of science, historic publications have continued importance.
    5. … and that we today add to the body of knowledge. From T.E. Glover, The Fishes of Southwestern Japan, c.1870
    6. GBIF index: 177 million records (> 5%/month). Gigabytes of text (~100 now). All data mobilized through GBIF
    7. Biodiversity Information Species information “tied” to scientific names
    8. The “Names Problem”
      • Not Stable
        • 5-10% names invalidated/decade
      • Not unique
      • No complete list of names
      • No complete list of species
        • No agreement on how many
        • Even within a single group
      • Impacts discovery and access of information about species
    9. The “Names Problem”
      • Properties of Names
        • Orthographic (As labels of text that are “tied” to information about species)
        • Nomenclature (As the core “words” of taxonomy that tie a name to an original publication and type)
        • Taxonomy (As components of taxon definitions derived via authoritative taxonomic rigor)
    10. Orthography
      • Orthography and the Names Problem
      • Objectives for Remediation
    11. Variations in name spelling Loligo pealeii Loligo pealii Loligo pealei
    12. Some names are harder to spell than others Actinobacillus actimomycetemcomitans Actinobacillus actimycetemcomitans Actinobacillus actinmycetemcomitans Actinobacillus actinomicetemcomitans Actinobacillus actinomy Actinobacillus actinomyce Actinobacillus actinomycemcomitans Actinobacillus actinomyceremcomitans Actinobacillus actinomycetam Actinobacillus actinomycetamcomitans Actinobacillus actinomycetecomitans Actinobacillus actinomycetemcmitans Actinobacillus actinomycetemcomintans Actinobacillus actinomycetemcomitance Actinobacillus actinomycetemcomitans Actinobacillus actinomycetemcomitants Actinobacillus actinomycetemcommitans Actinobacillus actinomycetemocimitans Actinobacillus actinomycetencomitans Actinobacillus actinomycetum Actinobacillus actinomyctemcomitans Actinobacillus actinomyectomcomitans Actinobacillus actinomyetemcomitans Actinobacillus actinonmycetemcomitans Actinobacillus actionomycetemcomitans Actinobacillus actynomicetemcomitans Actinobacillus antinomycetemcomitans
      • Difficulties with Latinized Names
      • Transcription errors
      Which one is the correct one?
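
One way to approach the "which one is the correct one?" question is approximate string matching against a reference spelling. The sketch below uses Python's standard `difflib.SequenceMatcher`. The choice of reference spelling, the 0.9 threshold, and the function names are assumptions made for illustration, not part of the lecture.

```python
from difflib import SequenceMatcher

# Reference spelling: an assumption made for this sketch.
REFERENCE = "Actinobacillus actinomycetemcomitans"

# A few of the variant spellings listed on the slide.
VARIANTS = [
    "Actinobacillus actimomycetemcomitans",
    "Actinobacillus actinomycetemcomitance",
    "Actinobacillus actinomycetemcommitans",
    "Actinobacillus antinomycetemcomitans",
    "Actinobacillus actinomy",
]


def similarity(a: str, b: str) -> float:
    """Return a similarity ratio in [0, 1] for two name strings, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def match_variants(reference, candidates, threshold=0.9):
    """Split candidates into probable misspellings of `reference` and the rest."""
    matched, unmatched = [], []
    for name in candidates:
        if similarity(reference, name) >= threshold:
            matched.append(name)
        else:
            unmatched.append(name)
    return matched, unmatched


matched, unmatched = match_variants(REFERENCE, VARIANTS)
```

Truncated strings such as "Actinobacillus actinomy" fall below the threshold and would need a different rule (for example prefix matching) or manual review.
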
    13. Agalinus paupercula borealis Agalinus pauperculum borealis Agalinis paupercula var. Borealis Agalinus pauperculum var. borealis Agalinus paupercula var. borealis Agalinus paupercula var. borealis Pennell Agalinus paupercula Britton var. borealis Pennell Agalinus paupercula (Gray) Britt. var. borealis Pennell Agalinis paupercula (A.Gray) Britton var. borealis Pennell Agalinus paupercula (Gray) Britton var. borealis (Pennell) Zenkert 1934 Gerardia paupercula borealis Gerardia paupercula var. borealis Gerardia paupercula var. borealis (Pennell) Deam Gerardia paupercula (Gray) Britt. var. borealis (Pennell) Deam Gerardia paupercula (Gray) Britt. var. borealis (Pennell) Deam Gerardia paupercula (A. Gray) Britton var. borealis (Pennell) Deam Gerardia paupercula (A. Gray) Britton subsp. borealis (Pennell) Pennell Gerardia paupercula (Gray) Britt. ssp. borealis (Pennell) Pennell Gerardia paupercula Britton ssp. borealis Pennell Many ways to correctly spell a name Should GBIF/EoL/BHL display all/one/some?
    14. Objectives
      • Informatics can contribute
        • Index names occurring in content we wish to publicise and access
        • Develop tools to extract, catalog, and match names.
        • Reconcile names to authoritative names sources via a common resolution path
        • Reconcile name occurrence to taxonomic concepts via a common concept resolution path
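
As a rough illustration of the "extract, catalog, and match" objective, the sketch below reduces name strings to a canonical form (genus plus lowercase epithets) and indexes the original strings under it. The parsing rule is a deliberately naive assumption: it drops any token that is not purely lowercase letters, so rank markers, authorship, and years are removed, and it will mis-handle cases such as lowercase author particles or a capitalised epithet. The function names are hypothetical.

```python
from collections import defaultdict


def canonical_form(name_string: str) -> str:
    """Keep the first token (genus) plus all purely lowercase alphabetic tokens.

    Tokens such as 'var.', '(Gray)', 'Britt.', 'Pennell' or '1934' are dropped
    because they contain punctuation, digits, or an uppercase letter.
    """
    tokens = name_string.split()
    if not tokens:
        return ""
    genus, rest = tokens[0], tokens[1:]
    epithets = [t for t in rest if t.isalpha() and t.islower()]
    return " ".join([genus] + epithets)


def build_index(name_strings):
    """Map each canonical form to the list of verbatim strings that reduce to it."""
    index = defaultdict(list)
    for s in name_strings:
        index[canonical_form(s)].append(s)
    return dict(index)


# A few of the strings from the slide on correctly spelled variants.
NAMES = [
    "Gerardia paupercula borealis",
    "Gerardia paupercula var. borealis",
    "Gerardia paupercula (Gray) Britt. var. borealis (Pennell) Deam",
    "Gerardia paupercula (A. Gray) Britton subsp. borealis (Pennell) Pennell",
    "Agalinus paupercula var. borealis Pennell",
]

index = build_index(NAMES)
```

The index keeps every verbatim string, so a display layer can still decide whether to show all, one, or some of them.
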
    15. Nomenclature
      • Nomenclatural aspects of the names problem.
      • Approaches for remediating them
    16. Don’t pass on bad information. How can we determine the status of the names we discover in content that we serve?
    17. Nomenclatural changes impact search and retrieval Where can I find out that these names are related? The Zoological Code doesn’t track recombinations; the Botanical Code does.
    18. Nomenclatural changes impact search and retrieval
    19. Homonyms Peranema – the fern Peranema – the euglenid How many Peranema are there? How can I tell them apart?
    20. Homonyms Taxonomic context alone doesn’t tell me enough. Kingdom Phylum Class Order Family Genus Plantae Magnoliophyta Magnoliopsida Apiales Umbelliferae Oenanthe Plantae Oenanthe Oenanthe Plantae Magnoliophyta Magnoliopsida Apiales Apiaceae Oenanthe Plantae Orchidaceae Oenanthe Animalia Chordata Aves Passeriformes Muscicapidae Oenanthe Animalia Chordata Aves Passeriformes Turdidae Oenanthe Animalia Chordata Actinopterygii Perciformes Pomatomidae Pomatomus Animalia Chordata Pisces Perciformes Serranidae Pomatomus
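
The point that classification context alone is not enough can be sketched in a few lines. The records below are a subset of the rows on the slide, reduced to kingdom, family, and genus; the type and function names are hypothetical. A genus found under more than one kingdom is flagged as a homonym candidate, while a genus found under several families within one kingdom stays ambiguous: it could be a homonym or simply a difference between classifications.

```python
from collections import defaultdict
from typing import NamedTuple


class Classification(NamedTuple):
    kingdom: str
    family: str
    genus: str


# Subset of the rows shown on the slide.
RECORDS = [
    Classification("Plantae", "Umbelliferae", "Oenanthe"),
    Classification("Plantae", "Apiaceae", "Oenanthe"),
    Classification("Animalia", "Muscicapidae", "Oenanthe"),
    Classification("Animalia", "Turdidae", "Oenanthe"),
    Classification("Animalia", "Pomatomidae", "Pomatomus"),
    Classification("Animalia", "Serranidae", "Pomatomus"),
]


def group_by_genus(records, field):
    """Collect the distinct values of `field` seen for each genus."""
    grouped = defaultdict(set)
    for record in records:
        grouped[record.genus].add(getattr(record, field))
    return grouped


def cross_kingdom_candidates(records):
    """Genera that occur under more than one kingdom."""
    kingdoms = group_by_genus(records, "kingdom")
    return sorted(g for g, ks in kingdoms.items() if len(ks) > 1)


def multi_family_genera(records):
    """Genera placed in more than one family; ambiguous without further data."""
    families = group_by_genus(records, "family")
    return sorted(g for g, fs in families.items() if len(fs) > 1)
```

Here `cross_kingdom_candidates(RECORDS)` flags only Oenanthe, while `multi_family_genera(RECORDS)` lists both genera, which is why a nomenclatural source is still needed to settle the question.
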
    21. Approaches to remediation
      • Consolidate the major nomenclatural databases
        • A single nomenclatural dictionary
          • Populate with provisionally verified records and enable open annotation
        • Provides nomenclatural status of a name
        • Collectively identifies all homonyms. Identifiers used in taxonomic data provide disambiguation context
        • Ties all distinct nomenclatural combinations to the original published name.
      • Informatics
        • Promote global identifiers and simple resolution pathway for these data
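
A minimal sketch of how identifiers could disambiguate homonyms in a nomenclatural dictionary, using the two Peranema from the homonyms slide. The identifier strings, the record fields, and the function names are all invented for illustration; they do not correspond to any real GBIF or uBio identifier scheme.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NameRecord:
    identifier: str  # hypothetical identifier, invented for this sketch
    name: str
    group: str       # informal description taken from the slide


DICTIONARY = {
    "example:name:1": NameRecord("example:name:1", "Peranema", "fern"),
    "example:name:2": NameRecord("example:name:2", "Peranema", "euglenid"),
}


def resolve(identifier: str) -> NameRecord:
    """Return the single record for an identifier (raises KeyError if unknown)."""
    return DICTIONARY[identifier]


def homonyms(name: str):
    """Return every record sharing the same name string, ordered by identifier."""
    found = [r for r in DICTIONARY.values() if r.name == name]
    return sorted(found, key=lambda r: r.identifier)
```

A data set that stores `example:name:2` rather than the bare string "Peranema" carries its own disambiguation context, while `homonyms("Peranema")` answers the question of how many Peranema there are.
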
    22. Taxonomy
      • Taxonomic Examples of the Names problem
      • Approaches for remediating them
    23. Taxonomic synonyms Halichondria panicea (Pallas 1776) sec Van Soest 2002 (WoRMS)
    24. Consequences of Splitting Taxon Concept problem: What does someone mean when they refer to P. carinii?
    25. The Perils of Lumping Bear Lodge meadow jumping mouse (Zapus hudsonius campestris) and Preble’s meadow jumping mouse (Zapus hudsonius preblei). Dr. Rob Roy Ramey says campestris INCLUDES preblei; Dr. Tim King says it DOES NOT INCLUDE preblei. What should a search for “Zapus hudsonius campestris” return?
    26. Different taxonomic views, different # species, different names Taxonomic Backbones: Scope and completeness
    27. Organisational value of Non-Taxonomic Lists
    28. Approaches to remediation
      • An inventory of different taxonomic catalogues
        • Inform if there are concept issues for the species
      • Provide synonymised taxon concepts with unique and resolvable identifiers
      • Multiple classifications via checklists and catalogues accessible and utilised as organisational frameworks for species information
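
The idea of resolving a name usage to a taxon concept can be sketched with the jumping mouse example from the lumping slide. A concept is keyed by a name plus the authority whose view is followed, and a search expands to every name that concept includes. The mapping of views to authors follows one reading of the slide, and the data structure and function name are assumptions for illustration only.

```python
# Genus spelled Zapus in this sketch.
CAMPESTRIS = "Zapus hudsonius campestris"
PREBLEI = "Zapus hudsonius preblei"

# (name, according_to) -> set of names included in that taxon concept.
CONCEPTS = {
    (CAMPESTRIS, "Ramey"): {CAMPESTRIS, PREBLEI},  # lumped view
    (CAMPESTRIS, "King"): {CAMPESTRIS},            # preblei kept separate
    (PREBLEI, "King"): {PREBLEI},
}


def expand_search(name: str, according_to: str) -> set:
    """Return the names to search for under the chosen taxonomic view.

    Falls back to the bare name when no concept is registered for that view.
    """
    return set(CONCEPTS.get((name, according_to), {name}))


lumped = expand_search(CAMPESTRIS, "Ramey")
split = expand_search(CAMPESTRIS, "King")
```

The same query string returns records for two subspecies under one view and one under the other, which is why the concept, not just the name, needs a resolvable identifier.
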
    29. Summary
        • A data publication framework that enables
          • A complete index of all names that are tied to information about species
            • Tools and infrastructure to support this.
          • A complete index of verified nomenclature and an identification and resolution system to make it easy to tie a name to an authoritative record.
          • A global taxonomic resolution system that allows a particular usage of a name to be tied to a defined taxon.
        • A system that puts taxonomy as a global organisational framework for species information.
    30. Inventory and Index
    31. uBio Indexes
    32. Web Service outputs Taxon Object
    33. Web Service calls from client applications
    34. Taxonomic organisation of content
    35. Taxonomic organisation of content
    36. Indexes support processes that support discovery
    37. That enable new and better tools and services
    38. Formalise the Architecture
    39. Coordinate Communities of Interest
    40. Summary: GNA Objectives
        • A complete index of names tied to information about species reconciled to a common and verified nomenclatural dictionary.
        • This same dictionary forms the basis for multiple expressions of taxonomic catalogues, regional checklists, and thematic lists of species.
        • These lists are openly accessible and tied to services and processes that enable them to be effectively employed in data organisation and retrieval.
        • Collectively, these components serve the delivery and utilisation of biological knowledge.
    41. Thank you [email_address] Skype:dremsen
    David Remsen lecture on Tuesday, Sept 15, 2009, for more
