Introduction to the GBIF Backbone Taxonomy and GBIF’s cooperation with the Catalogue of Life.
A voice recording is available at https://vimeo.com/568318390
9. GBIF Backbone Explained
See blog post for more details:
https://data-blog.gbif.org/post/gbif-backbone-taxonomy/
10. Collaboration with Catalogue of Life
• COL ChecklistBank
https://data.catalogueoflife.org
• New API
https://api.catalogueoflife.org
• New COL portal
https://catalogueoflife.org
• ColDP
https://github.com/CatalogueOfLife/coldp
11.
12. Nodes Engagement
• COL feedback
  • support@catalogueoflife.org
  • https://github.com/CatalogueOfLife/data/issues/new/choose
  • https://lists.gbif.org/mailman/listinfo/col-users
• Publish checklists (ColDP / DwC-A)
• National lists
• Fill taxonomic gaps
• ColDP helpdesk
• Spread expertise
• Engage in documentation
Welcome everyone.
My name is Markus Döring and I am going to introduce you to the GBIF Backbone Taxonomy and GBIF’s cooperation with the Catalogue of Life.
The GBIF Backbone Taxonomy is created and used by GBIF to organize all occurrences into a single taxonomic view for searches, metrics and maps.
It is updated biannually using an algorithm that merges names from manually selected source checklists present in ChecklistBank.
To resolve conflicts, these sources are ordered by priority, with the Catalogue of Life being the seed that gets augmented.
COL also provides the entire higher classification above families, with the exception of Bacteria, Archaea, Fabaceae and Lepidoptera, for which we instead use the Genome Taxonomy Database, Kew’s World Checklist of Vascular Plants, Fauna Europaea and 2 superfamilies compiled by Donald Hobern.
A special source is Plazi, which digitizes entire scientific articles and monographs as individual checklists and currently contributes over 34,000 datasets to GBIF.
Although often small, these checklists frequently capture very recently published names not yet present in any of the larger sources.
The backbone build configuration also allows us to manually fine-tune results, for example by excluding specific names or avoiding wrong duplicates, especially for genera.
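As a rough illustration of the priority-ordered merge and the manual exclusions just described, here is a minimal Python sketch. It is not the actual backbone build code: the function `merge_by_priority`, the source titled "Example national checklist" and the comparison of names as plain strings are all assumptions made for the example. A real build would have to compare parsed names together with their authorship and classification.

```python
def merge_by_priority(sources, excluded=frozenset()):
    """Merge checklists given in priority order (highest priority first).

    Returns a dict mapping each name to the title of the source it came from.
    """
    backbone = {}
    for title, names in sources:
        for name in names:
            if name in excluded:
                # Name manually excluded in the (hypothetical) build configuration.
                continue
            # setdefault keeps the entry from the higher-priority source,
            # so later sources only add names that are still missing.
            backbone.setdefault(name, title)
    return backbone


# The seed comes first, lower-priority sources only augment it.
sources = [
    ("Catalogue of Life", ["Puma concolor", "Felis catus"]),
    ("Example national checklist", ["Felis catus", "Lynx lynx", "Felis domesticus"]),
]
backbone = merge_by_priority(sources, excluded={"Felis domesticus"})
```

In this toy run the seed wins the conflict over "Felis catus", the second source contributes only "Lynx lynx", and the excluded name never enters the result.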
After merging all sources into the backbone, various routines are applied to guarantee taxonomic consistency.
Most notably, we only allow a single accepted name within a homotypic group of names derived from basionym relations.
Basionyms are either given explicitly in sources
or are programmatically derived by comparing the authorships of all names within one family that share the same terminal epithet.
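The idea of deriving basionyms from authorships can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the backbone's actual routine: the helper names `original_author`, `homotypic_groups` and `basionym_of` are hypothetical, names are plain (genus, epithet, authorship) tuples from a single family, and authors are compared as exact strings. Real data would need normalisation of author spellings and of epithet endings.

```python
import re
from collections import defaultdict


def original_author(authorship):
    """Return the author of the original name.

    In a recombination such as "(L.) H.Karst." the original author is the
    one in parentheses; otherwise it is the authorship itself.
    """
    match = re.match(r"\s*\(([^)]+)\)", authorship)
    return (match.group(1) if match else authorship).strip()


def homotypic_groups(names):
    """Group names of ONE family by terminal epithet and original author."""
    groups = defaultdict(list)
    for genus, epithet, authorship in names:
        key = (epithet, original_author(authorship))
        groups[key].append((genus, epithet, authorship))
    return dict(groups)


def basionym_of(group):
    """Pick the name without a parenthesised author as the basionym, if any."""
    for name in group:
        if not name[2].lstrip().startswith("("):
            return name
    return None


# Example names from the family Pinaceae.
pinaceae = [
    ("Pinus", "abies", "L."),
    ("Picea", "abies", "(L.) H.Karst."),
    ("Abies", "alba", "Mill."),
]
groups = homotypic_groups(pinaceae)
basionym = basionym_of(groups[("abies", "L.")])
```

Here "Pinus abies L." and "Picea abies (L.) H.Karst." end up in the same group, with the former detected as the basionym. A consistency routine could then make sure that only one name of that group is kept as accepted.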
For those interested in more details, we have written a blog post on how the backbone is generated and occurrences are matched.
GBIF has teamed up with the Catalogue of Life to develop a new infrastructure to assemble aggregated checklists like the Catalogue of Life itself.
For this, a new version of ChecklistBank has been developed, which is based around ColDP,
a new format for publishing checklist data that overcomes some restrictions imposed by Darwin Core Archives.
ChecklistBank includes a workbench with tools to review and compare lists and to support the manual editorial process of stitching together parts (called sectors) from various sources.
In the second half of this year we will work on an extended COL Checklist, which will programmatically merge names similar to how it is done in the GBIF Backbone now.
The Catalogue of Life warmly welcomes feedback on content, either by email or on GitHub.
A bottleneck for COL has always been, and still is, the conversion of sources to a standardized format such as ColDP or Darwin Core Archives.
Publication of national and other checklists to COL ChecklistBank or GBIF is important for filling gaps and keeping up to date with scientific progress.
ColDP is a rather new format that is not yet supported by the IPT.
Sharing expertise and providing support for the community to publish data in ColDP would be a very useful contribution.