This document discusses the challenges of indexing and organizing biological names across different databases and classifications. It proposes the creation of a shared "NameBank" repository that would contain all nomenclatural name concepts, uncoupled from any particular taxonomic interpretation. This would insulate taxonomic systems from invalid names while facilitating data sharing across resources by providing a common foundation and layer for names. The NameBank aims to catalog all known names, map relationships between them, and distribute this consensus name index to various databases and classifications in a cooperative manner.
2. Universal Biological Indexer and Organizer
Research Funded by the Andrew W. Mellon Foundation
MBL / WHOI LIBRARY
Newt: as concept
• Triturus viridescens Rafinesque 1820
• String
• a single specimen
• Nomenclatural concept
viridescens - becoming green (from Latin viridis, green)
3. Universal Biological Indexer and Organizer
Concepts: Nomenclatural
• Triturus viridescens Rafinesque 1820
• Notopthalmus viridescens Baird 1850
• Notophthalmus viridescens Gray 1850
• Notophthalma viridescens Gray 1858 msp.
• Diemyctylus viridescens Hallowell 1856
• Triton viridescens Strauch, 1870
• Molge viridescens Boulenger, 1872
• Diemyctylus miniatus viridescens Yarrow
•…
Common origin in a single real specimen (homotypic)
Creation of the new nomen concept is subjective
Relationship among them is not
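The objective link among these names can be pictured as a shared pointer back to the original name. The following is only a minimal sketch: the `NomenConcept` class and its `original_name` field are hypothetical, and the four names are copied from the list above purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NomenConcept:
    name: str           # name string as published
    original_name: str  # hypothetical link to the name first given to the type specimen


ORIGINAL = "Triturus viridescens Rafinesque 1820"

# Illustrative subset of the names on the slide.
CONCEPTS = [
    NomenConcept(ORIGINAL, ORIGINAL),
    NomenConcept("Notophthalmus viridescens Gray 1850", ORIGINAL),
    NomenConcept("Diemyctylus viridescens Hallowell 1856", ORIGINAL),
    NomenConcept("Triton viridescens Strauch, 1870", ORIGINAL),
]


def homotypic_group(name, concepts=CONCEPTS):
    """Return every recorded name that shares an origin with `name`."""
    by_name = {c.name: c for c in concepts}
    if name not in by_name:
        return []
    origin = by_name[name].original_name
    return [c.name for c in concepts if c.original_name == origin]
```

Looking up any member of the group returns all of them, which is the sense in which the relationship is factual rather than a matter of taxonomic opinion.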
4. Universal Biological Indexer and Organizer
Concepts: Nomenclatural
• Triturus viridescens dorsalis - Bishop, 1943
• Diemyctylus viridescens dorsalis Schmidt 1953
• Notophthalmus viridescens dorsalis - Smith, 1953
• Triturus viridescens louisianae - Strecker 1928
• Triturus viridescens louisianensis - Bishop, 1943
• Diemyctylus viridescens louisianensis Schmidt 1953
• Notophthalmus viridescens louisianensis - Smith, 1953
6. Universal Biological Indexer and Organizer
Concepts: Taxonomic
• Amphibia
• Urodela
• Salamandridae
• Notophthalmus
• Notophthalmus viridescens
Frost 2005 AMNH
• Amphibia
• Batrachia
• Caudata
• Salamandroidea
• Salamandridae
• Notophthalmus
• Notophthalmus viridescens
NCBI 2005
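One way to see how the two classifications differ while sharing the same names is to compare their lineages directly. This is a sketch under assumptions: the lists are transcribed from the slide (with the species spelled to match its genus), and `lineage_differences` is a hypothetical helper, not part of any uBio service.

```python
FROST_2005 = [
    "Amphibia", "Urodela", "Salamandridae",
    "Notophthalmus", "Notophthalmus viridescens",
]
NCBI_2005 = [
    "Amphibia", "Batrachia", "Caudata", "Salamandroidea", "Salamandridae",
    "Notophthalmus", "Notophthalmus viridescens",
]


def lineage_differences(a, b):
    """Split two lineages into shared names and names used by only one."""
    shared = [t for t in a if t in b]
    only_a = [t for t in a if t not in b]
    only_b = [t for t in b if t not in a]
    return shared, only_a, only_b


shared, only_frost, only_ncbi = lineage_differences(FROST_2005, NCBI_2005)
```

The shared names are the common nomenclatural layer; the names that appear in only one lineage reflect the differing taxonomic opinions.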
7. Universal Biological Indexer and Organizer
Concepts: Summary
Nomenclatural Concepts
• Factual
• Interrelationships are objective
• No new science required
• (except to make new ones)
• Stable
• Expert scrutiny useful, not required
• Compilation potentially FAST
• uBio 1 million/year
• Share (no opinion attached)
Taxonomic Concepts
• Opinion
• Interrelationships are subjective
• Derived from nomenclatural concepts
• Expert scrutiny is required
• Unstable
• Compilation slow
• CoL 50K/year
• Diptera 200K/15 years
• Sharing concerns (opinions attached)
8. Universal Biological Indexer and Organizer
Why this is a problem
9. Universal Biological Indexer and Organizer
Don’t forget common names
10. Universal Biological Indexer and Organizer
And additionally…
5-10% of scientific names become invalid per decade
Scientific names aren’t unique
Acalyptus
11. Universal Biological Indexer and Organizer
Names Challenges within PubMed
476 unique
Name (Nomenclatural Synonyms)    PMID   Date   Unique
Notophthalmus viridescens         350   1965      349
Diemictylus viridescens            36   1959       36
Triturus viridescens               87   1949       86
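The practical consequence is that a search on one name misses records filed under its synonyms. The sketch below illustrates the idea with a union of result sets; the PMID values are made up for illustration and do not correspond to real PubMed records, and both helper functions are hypothetical.

```python
# Made-up PMIDs per name string, for illustration only.
HITS = {
    "Notophthalmus viridescens": {101, 102, 103},
    "Diemictylus viridescens": {103, 104},
    "Triturus viridescens": {105},
}


def unique_pmids(hits_by_name):
    """Union of the result sets for every synonym."""
    all_ids = set()
    for ids in hits_by_name.values():
        all_ids |= set(ids)
    return all_ids


def missed_by_single_name(hits_by_name, name):
    """Records that a search on `name` alone would not return."""
    return unique_pmids(hits_by_name) - set(hits_by_name[name])
```

With these toy sets, searching only the current name leaves two of the five distinct records unfound.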
12. Universal Biological Indexer and Organizer
Names Challenges within PubMed
4208 unique PMID
Name (Taxonomic Synonyms)               Total   Unique       %
Brucella melitensis                      1078      840   78.1%
Brucella abortus (Bacterium abortus)     3109     2852   91.7%
Brucella canis                            178      146   82.0%
Brucella neotomae                          12        4   33.3%
Brucella ovis                             233      168   84.9%
Brucella suis                             286      198   69.2%
Brucella melitensis DSMZ 2005
13. Universal Biological Indexer and Organizer
How big is the problem?
• Not sure
• No comprehensive listing of names
• 1.75M valid species names
• 2-?M+ invalid names
• 2-?M+ vernacular names
• + Misspellings, lexical forms
• 14,000 avian genera
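Lexical forms and misspellings are one reason the count is hard to pin down: the same name can arrive with different punctuation or with a letter dropped. The following sketch, which assumes nothing about how uBio actually handles this, normalizes punctuation and case and then uses Python's standard `difflib` to suggest near matches. The `KNOWN` list and both function names are hypothetical.

```python
import difflib
import re

KNOWN = [
    "Notophthalmus viridescens",
    "Triturus viridescens",
    "Diemyctylus viridescens",
]


def lexical_key(name):
    """Lowercase, replace punctuation with spaces, collapse whitespace."""
    cleaned = re.sub(r"[^\w\s]", " ", name.lower())
    return " ".join(cleaned.split())


def suggest(query, known=KNOWN, cutoff=0.9):
    """Return known names whose normalized form is close to the query."""
    keys = {lexical_key(k): k for k in known}
    close = difflib.get_close_matches(
        lexical_key(query), list(keys), n=3, cutoff=cutoff
    )
    return [keys[c] for c in close]
```

Under this normalization, "Triturus viridescens dorsalis - Bishop, 1943" and "Triturus viridescens dorsalis Bishop 1943" produce the same key, and `suggest("Notopthalmus viridescens")` proposes the spelling with the second "h". The cutoff of 0.9 is an arbitrary choice for the example.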
14. Universal Biological Indexer and Organizer
uBio
• Library service
• “System” must account for all names
• Any classifications
• Biological Name Server
• 2 million nomenclatural concepts
• 1.7 million taxon concepts
• (60 classifications)
• SOAP/WSDL web services
15. Universal Biological Indexer and Organizer
16. Universal Biological Indexer and Organizer
17. Universal Biological Indexer and Organizer
uBio
18. Universal Biological Indexer and Organizer
Major Impediment to progress
• Different taxon concepts/needs
• Same nomenclatural concepts
• No objective/subjective distinction
• Duplication
• Interconnectivity issues
19. NameBank
• Repository for all nomen concepts
• Insulates taxonomic systems from “bad” nomen concepts
• Consensus data only
• Common to any taxon concept
• Shareable
• Distributable
20. NameBank
• NameBank is not a nomenclator, nor are nomenclators NameBank
• NameBank is an index of factually derived name concepts that uses a much broader definition of a name
• It overlaps with, and is supported by, nomenclators, which should, I think, provide a service on top of NameBank
• NameBank provides an underlying unified index to systems like IF that contain authoritative nomenclatural metadata
• NameBank accommodates strings outside the scope of nomenclators
23. Universal Biological Indexer and Organizer
NameBank
• Repository for all nomen concepts
• Insulates taxonomic systems from “bad” nomen concepts
• Consensus data only
• Layered
• Shareable
• Distributable
• Independent compilation
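A rough way to picture the layering: a shared table of name strings with objective links between them, and separate classification layers that attach their own opinions to the shared identifiers. Everything in this sketch is an assumption made for illustration, including the integer ids, the three dictionaries, and the `placements` function; it does not describe NameBank's actual schema.

```python
# Shared layer: name strings with hypothetical ids.
NAMEBANK = {
    1: "Notophthalmus viridescens",
    2: "Triturus viridescens",
    3: "Diemictylus viridescens",
    4: "Salamandridae",
}

# Objective links: each id points at the id that stands for its group.
SAME_ORIGIN = {1: 1, 2: 1, 3: 1, 4: 4}

# Opinion layers: each classification records its own parent for a group.
CLASSIFICATIONS = {
    "Frost 2005 AMNH": {1: "Notophthalmus", 4: "Urodela"},
    "NCBI 2005": {1: "Notophthalmus", 4: "Salamandroidea"},
}


def placements(name_string):
    """Resolve a name string in the shared layer, then ask every
    classification layer where it places that name's group."""
    for name_id, text in NAMEBANK.items():
        if text == name_string:
            key = SAME_ORIGIN[name_id]
            return {
                label: layer[key]
                for label, layer in CLASSIFICATIONS.items()
                if key in layer
            }
    return {}
```

An older synonym such as "Triturus viridescens" resolves through the shared layer to the same placements as the current name, while "Salamandridae" shows the two classifications giving different parents for one shared name.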
24. Universal Biological Indexer and Organizer
Share
• NameBank is a big job
• Catalog all names
• Map all factually derived relationships
• Share them for increased data access
• Proactive
• NCBI
• CBOL
• new submissions
25. Universal Biological Indexer and Organizer
Federate
• Layered architecture
• Common Foundation
• Diverse expression
• Enhanced Interchange
• Cooperation
• Efficient