Backbone taxonomies, data aggregation,
and early career systematists:
Something's got to give
M. Andrew Johnston & Nico M. Franz
Hasbrouck Insect Collection, Arizona State University
@MAndrewJohnston
@taxonbytes
Slides: [Slideshare link]
Overview
• Introduction to backbone taxonomies
• Use cases with aggregated data
• Conflict with systematic research
• Available solutions and their limits
Backbone Taxonomies
What are they?
GBIF Backbone Taxonomy:
https://www.gbif.org/dataset/d7dddbf4-2cf0-4f39-9b2a-bb099caae36c
Backbone Taxonomies
Widely used
Aggregated data and darkling beetles
Eleodes Eschscholtz, 1829
208 Valid species
412 Available species-group names
Eleodes species in Arizona
34 verified
4 doubtful
38 total
8,010 Eleodes specimens
from Arizona
Darwin Core ‘scientificName’
208 unique strings (“names”) for 38 Arizona species
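A count like the one above can be reproduced from any Darwin Core occurrence export by tallying the distinct values in the `scientificName` column. The following is a minimal sketch, not the authors' script: the four sample rows are invented for illustration and are not the Arizona dataset, and the function name `count_name_strings` is hypothetical. Only the column names come from Darwin Core.

```python
import csv
import io
from collections import Counter

# Hypothetical Darwin Core occurrence rows (illustrative only, not the real dataset).
SAMPLE_CSV = """catalogNumber,scientificName,stateProvince
A1,Eleodes lineata,Arizona
A2,Eleodes lineatus,Arizona
A3,Eleodes lineata,Arizona
A4,Eleodes (Melaneleodes) amplus,Arizona
"""

def count_name_strings(csv_text):
    """Tally each distinct scientificName string, ignoring case and outer whitespace."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row["scientificName"].strip().lower() for row in reader)

counts = count_name_strings(SAMPLE_CSV)
print(len(counts), "unique strings across", sum(counts.values()), "records")
```

Note that the tally treats every spelling as its own "name": gender variants and subgenus insertions each add to the count even when they refer to the same species.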
Taxonomic Issues I – Synonymy
Eleodes carbonaria chihuahuaensis
Champion, 1884
Previously valid combinations
eleodes carbonarius chihuahuaensis
eleodes lineatus
eleodes ampla
eleodes nitidus
eleodes lineata
eleodes amplus
eleodes carbonarius amplus
eleodes amplus dolosus
eleodes ampla dolosa
eleodes nitida
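Many of the strings on this slide differ only in the grammatical gender of the epithet (lineatus/lineata, amplus/ampla). One rough way to see how far such variants inflate the name count is to group strings under a gender-neutral key. The sketch below is a heuristic under stated assumptions: the list of endings in `GENDER_ENDINGS` is deliberately incomplete, the function names are hypothetical, and the grouping says nothing about which spelling is currently valid or whether two groups are true synonyms.

```python
from collections import defaultdict

# Name strings copied from the slide above.
NAMES = [
    "eleodes carbonarius chihuahuaensis",
    "eleodes lineatus",
    "eleodes ampla",
    "eleodes nitidus",
    "eleodes lineata",
    "eleodes amplus",
    "eleodes carbonarius amplus",
    "eleodes amplus dolosus",
    "eleodes ampla dolosa",
    "eleodes nitida",
]

# Assumption: a small, incomplete set of Latin gender endings.
GENDER_ENDINGS = ("us", "um", "a")

def gender_neutral_key(name):
    """Strip one gender ending from each epithet so spelling variants share a key."""
    genus, *epithets = name.lower().split()
    stems = []
    for epithet in epithets:
        for ending in GENDER_ENDINGS:
            if epithet.endswith(ending):
                epithet = epithet[: -len(ending)]
                break
        stems.append(epithet)
    return " ".join([genus, *stems])

def group_gender_variants(names):
    """Map each gender-neutral key to the original strings that produced it."""
    groups = defaultdict(list)
    for name in names:
        groups[gender_neutral_key(name)].append(name)
    return dict(groups)

groups = group_gender_variants(NAMES)
for key, variants in groups.items():
    print(key, "<-", variants)
```

A suffix heuristic like this can also merge epithets that are genuinely different, so its output needs review against the literature before any records are relabeled.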
Taxonomic Issues II – Dirty Data
Subgeneric names included
eleodes (melaneleodes) amplus
eleodes melaneleodes lineata
eleodes melaneleodes amplus
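Subgenus insertions like the ones on this slide can be stripped before names are compared. The sketch below assumes the user supplies the set of subgenus names to remove (`KNOWN_SUBGENERA` is a hypothetical, one-entry list) and that the subgenus always appears as the second token; it is an illustration of the cleanup step, not the authors' procedure.

```python
# Assumption: the subgenus names to strip are supplied by the user.
KNOWN_SUBGENERA = {"melaneleodes"}

def strip_subgenus(name, subgenera=KNOWN_SUBGENERA):
    """Return the lowercased name without a parenthesized or bare subgenus token."""
    tokens = name.lower().split()
    if len(tokens) < 2:
        return " ".join(tokens)
    second = tokens[1]
    is_parenthesized = second.startswith("(") and second.endswith(")")
    if is_parenthesized or second in subgenera:
        tokens.pop(1)
    return " ".join(tokens)

# Strings copied from the slide above.
raw = [
    "eleodes (melaneleodes) amplus",
    "eleodes melaneleodes lineata",
    "eleodes melaneleodes amplus",
]
cleaned = sorted({strip_subgenus(n) for n in raw})
print(cleaned)
```

The bare (unparenthesized) form is the harder case: without a list of known subgenera, "eleodes melaneleodes amplus" is indistinguishable from a species plus subspecies combination.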
14 ‘Accepted’ species from
backbone taxonomy
8 verified
6 doubtful
Manually interpreted using
current literature
32 verified
14 doubtful
51
Good data are available
But we need a good taxonomy
to realize their potential
What about generating new data?
Digitization of my research collection
(M. Andrew Johnston Collection – MAJC)
2315 Eleodes records
74 species/subspecies
Default view of data
705 specimens
“ID’d” to Eleodes
29 taxon names
recognized
Verbatim view of record, showing issue flags
Failure across (at least) Coleoptera
iPhylo Blog Post
iPhylo.blogspot.com
~85,000 Australian Coleoptera records
1/3 of names are changed, affecting 1/5 of records
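Figures of this kind can be computed by comparing each record's verbatim name with the name assigned after backbone matching. The sketch below shows one way to do that; the five records are invented so that the toy result mirrors the proportions quoted above, and the function name `change_rates` is hypothetical. It is not the analysis from the blog post.

```python
def change_rates(records):
    """Return (share of verbatim names changed, share of records changed).

    records: iterable of (verbatim_name, interpreted_name) pairs, one per record.
    """
    records = list(records)
    changed_records = [(v, i) for v, i in records if v != i]
    verbatim_names = {v for v, _ in records}
    changed_names = {v for v, _ in changed_records}
    return (
        len(changed_names) / len(verbatim_names),
        len(changed_records) / len(records),
    )

# Hypothetical records: one name is demoted to genus level by the backbone.
toy_records = (
    [("Eleodes lineata", "Eleodes lineata")] * 3
    + [("Eleodes amplus", "Eleodes")]
    + [("Eleodes nitida", "Eleodes nitida")]
)
name_rate, record_rate = change_rates(toy_records)
print(f"{name_rate:.2f} of names changed, {record_rate:.2f} of records affected")
```

Reporting both rates matters because a change that touches many rarely used names can still leave most records intact, and the reverse is also possible.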
Taxon coverage is identified as a hurdle
Darwin Core Hour: The Aggregator’s Viewpoint – GBIF and iDigBio
Name ‘discovered’ through
digitized type specimen
Can we fix a
taxonomic issue?
Amphidora Eschscholtz
synonym (Doyen and Lawrence, 1979)
Eleodes Eschscholtz
Amphidora parallela Casey
synonym (Blaisdell, 1933)
Stenotrichus rufipes (LeConte)
transferred (Aalbu et al. 2002)
Helops rufipes (LeConte)
Names not
known to GBIF
GBIF accepts feedback on any item
Logs issue on GitHub, to be fixed in a later patch
What we can offer:
• A large, distributed workforce
• Knowledge and access to literature
• Time and passion to curate our focal branch
of the tree of life
What we need:
• Editorial access to on-line biodiversity data
• Ability to share/impose our views on data
  • Unfiltered and un-edited
  • NOT the same as “mine is the only view”
• Citations and metrics out of the system
  • Every specimen added, editorial action taken
  • To build reputation and CV
A solution for now
• Direct control over my specimen data
• Immediately interoperable with 170 other collections
Public, citable metrics
Direct access to edit backbone taxonomy
But … Need user account approval to make edits
Only one possible valid backbone
Publish checklists and interactive keys
Backbone taxonomy paradigm shift
Disambiguates ‘taxonomy’ and ‘classification’
Allows for multiple classification systems and concept taxonomy
Summary
• Graduate students should be empowered to use and curate biodiversity data
• Current backbone taxonomies alienate systematists
• Need to develop new, community-empowering designs
Special thanks to:
Bob Mesibov, Andrew Jansen, Ed Gilbert, Brian Reily, Sal Anzaldo, Sangmi Lee
Funding:
NSF ARTS DEB-1258154
Evolutionary Biology Summer Research Fellowship
Data files and scripts: github.com/mandrewj/ECN_backbone_taxonomies
Slides: [slideshare link]
Tenebrionidae taxonomy sec Bousquet et al. 2017. Catalogue of Tenebrionidae (Coleoptera) of North America, ZooKeys.
Thank You


Editor's Notes

  • #2 Critique + call for change; reference “something’s got to give”
  • #4 Single consensus taxonomy; multiple sources; programmatically derived; allows biodiversity data to be accessed/managed
  • #6 Perhaps the most majestic group of all animals. Grammatical gender – nomenclaturally complex group
  • #10 Header – taxonomic history/issues etc. Issues I, then II on next slide. Mention literal text copy
  • #17 Harvested and available via GBIF
  • #19 Verbatim records are available, but then you need taxonomic expertise to decode them. Not issues with data quality, but with backbone taxonomic filtering
  • #22 Nomenclatural issue only important to me and maybe 1 other person if I’m lucky
  • #23 Nomenclatural issue only important to myself and maybe 1 other person if I’m lucky
  • #24 What can I do so that I can go home and sleep at night?
  • #26 Submitted Oct. 24. Last patch – August 2016? Next apparently due December 2017
  • #35 Who gives approval, and why?
  • #36 New screen shot – map, vouchers, names