Tony Rees: Towards a Hierarchical Classification of All Life

  1. Towards a Hierarchical Classification of All Life – the IRMNG data assembly project. Tony Rees, CSIRO Marine and Atmospheric Research, Australia. October 2011
  2. Why a hierarchical classification? Tony Rees: Hierarchical Classification of All Life
  3. What should “the system” ideally hold? – something like… (etc.)
  4. Perseverance produces the following (subset of genus table, 453k names as at Oct 2011):
  5. A glimpse of the IRMNG “master genus” table (currently 452,827 records)
  6. A glimpse of the IRMNG “master genus” table (currently 452,827 records) (Mabberley plant names list)
  7. Detail showing example source/s used
  8. Other services / products, e.g. full hierarchical lists; however, with the caveat that some (perhaps many) genera may still be classified only at a higher level (e.g. “Mammalia – unallocated”) at this time (more work to do).
  9. Check batches of entered names (1,406 genus names…)
  10. Check batches of entered names (start of IRMNG search result)
  11. Check batches of entered names
  12. Check batches of entered names ?
  13. Query by taxon name (correctly spelled or misspelled)
  14. Linking names with literature
  15. Expanded citation info in IRMNG - example
  16. Expanded citation info in IRMNG - example
  17. Expanded citation info in IRMNG - example
  18. IRMNG content – recent missing genera…
  19. IRMNG content – genus names published by year, 1995-current (as at Oct 2011), excluding virus names (which are undated). NB: could disaggregate further as desired, e.g. by detailed taxonomic group, or extant vs. fossil… Would also expect a small number of residual names missed for ostensibly “complete” years. (Chart annotation: presumed missing names)
  20. IRMNG 2011 content cf. Cat. of Life 2011

      | Rank | Cat. of Life, 2011 edition | % with authorities | IRMNG, Oct 2011 (extant + fossil) | % with authorities | IRMNG, Oct 2011 (fossil only) |
      |---|---|---|---|---|---|
      | Kingdoms | 8 | | 7 | | 0 |
      | Phyla | 111 | | 153 | | 12 |
      | Classes | 288 | | 509 | | 64 |
      | Orders | 1,233 | | 2,645 | | 715 |
      | Families | 8,071 | 0% | 19,639 | 22.1% | 6,542 |
      | Subfamilies | | | | | |
      | Genera | 178,515 | 0% | 452,848 | 97.1% | 90,278 |
      | Subgenera | | | | | |
      | Species (valid) | 1,347,224 | ~100% | 1,020,519 | ~100% | 16,792 |
      | Species (synonyms) | 895,441 | ~100% | 440,738 | ~100% | 100 |

      Note: Chapman, 2009 estimates c. 1.9m described extant species (see earlier slide). On that basis, CoL has 70% of valid extant species names, and maybe 70% of valid extant genera (with a subset of genus-level synonyms). IRMNG is missing an estimated 10k genera from 2004-2011 (from the last slide), and maybe a further 2-3% overall (say 10k-15k); a “complete” list would thus be ~475k at this time (increasing at ~2k/year).
  21. Thank you
      Thanks to:
      - OBIS, GBIF and Atlas of Living Australia for financial support, numerous data providers for data
      - CSIRO for salary and in-kind support, 2006-present
      - D. Patterson / MBL / NSF (this trip funding + hosting)
      Contact details: Phone: +61 3 6232 5318; Email: Tony.Rees@csiro.au; Web: www.cmar.csiro.au/datacentre/
  22. Supplementary slides
  23. New names: potential discovery paths
      - Name categories: new virus names; new prokaryote names; new botanical names – algae & fungi (except fossils); new botanical names – bryophytes through angiosperms (except fossils); new zoological names
      - Stages: publication; discovery; official registers; taxon-specific DBs; integrated DBs; “all names”
      - Other labels: Botany; Zoology
      - Sources shown: Newly published names – primary literature (print, electronic); ICTV Viruses DB; LPSN (Prokaryote names); ICBN Decisions; ICZN Decisions; Journal TOCs, RSS feeds, text mining; Abstracting services; Subject bibliographies; Reviews, secondary literature; Zoological Record; ION (Index of Organism Names); ChecklistBank; GNI; GNUB; ZooBank?; Catalogue of Life annual editions; ITIS; NCBI Taxonomy; WoRMS etc.; CyanoDB; Index Fungorum; MycoBank; AlgaeBase; Plant GSDs; PaleoDB; Animal GSDs; other compilations e.g. regional lists, Wikispecies, Wikipedia, more…; IRMNG
  24. New names: potential discovery paths (same diagram as slide 23, annotated: Lots of manual effort)
  25. New names: potential discovery paths (same diagram as slide 23, annotated: Lots of automated feeds + expert curation)
  26. New names: potential discovery paths (same diagram as slide 23, annotated: Lots of automated feeds + expert curation; Lots of useful services)
  27. How many taxa? Valid extant + fossil taxa (est.):
      - Kingdoms: 5/6/7/8
      - Phyla: ~140
      - Classes: ~400
      - Orders: ~2k
      - Families: ~10k
      - Genera: ~250k
      - Species: 2+ million
      How many species? Estimates according to Chapman, 2009 (valid, extant taxa only); “others” comprise c. 54k protists, 10k prokaryotes, 2k viruses. NB: inverts. includes “~1,000,000” for Insects – probably +/- 60k. Fossil species: no published estimates; maybe 500k names, 300k valid.
  28. Relevant information domain: all life (figure: Fig. i-1 in Margulis & Schwartz, 1998, with “PROTISTS” label)
  29. How many kingdoms… (figure: Fig. i-1 in Margulis & Schwartz, 1998, with “PROTISTS” label)
      7 kingdoms (5 in Margulis & Schwartz, 8 in Cat. of Life…):
      - Animals, Fungi, Plants: 3 kingdoms
      - Protists: 1 (or 2 if Stramenopiles [Heterokonts] recognized, = Cavalier-Smith’s Kingdom “Chromista”)
      - Bacteria + Archaea: 2 (= 1 in Margulis & Schwartz)
      - Viruses: 1 (not in Margulis & Schwartz)
  30. Nomenclature governed by four separate Codes, i.e. Zoological, Botanical, Bacteriological, Viruses (figure: Fig. i-1 in Margulis & Schwartz, 1998, with “PROTISTS” label, overlaid with Zoo. Code, Bact. Code and Bot. Code; Vir. Code: viruses, not shown)
  31. Parker, 1982 content example
  32. Benton, 1993 content example
  33. Rees TAXAMATCH fuzzy matching poster (start)
  34. Schematic of TAXAMATCH operation
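
The completeness estimate on slide 20 is simple arithmetic, and can be checked with a short sketch. The Python below uses only the figures quoted on that slide; the constant and function names are illustrative assumptions and are not part of IRMNG or the Catalogue of Life. It also assumes that "10k-15k" is the intended range for the "further 2-3% overall", as the slide's own parenthesis suggests.

```python
from typing import Tuple

# Figures quoted on slide 20 (as at Oct 2011). Names are hypothetical.
CHAPMAN_EXTANT_SPECIES = 1_900_000  # Chapman, 2009: c. 1.9m described extant species
COL_VALID_SPECIES = 1_347_224       # Cat. of Life 2011, valid species
IRMNG_GENERA = 452_848              # IRMNG genera, extant + fossil
MISSING_RECENT = 10_000             # est. genera missing from 2004-2011
MISSING_OTHER = (10_000, 15_000)    # "further 2-3% overall (say 10k-15k)"


def col_species_coverage() -> float:
    """Fraction of Chapman's estimated extant species held as valid names in CoL."""
    return COL_VALID_SPECIES / CHAPMAN_EXTANT_SPECIES


def projected_complete_genera() -> Tuple[int, int]:
    """Low and high estimates for a 'complete' genus list."""
    low, high = MISSING_OTHER
    base = IRMNG_GENERA + MISSING_RECENT
    return base + low, base + high


if __name__ == "__main__":
    print(f"CoL species coverage: {col_species_coverage():.1%}")
    low, high = projected_complete_genera()
    print(f"Projected complete genus list: {low:,} to {high:,}")
```

Run as a script, this reports coverage of roughly 71% and a projected range of 472,848 to 477,848 genera, which brackets the ~475k quoted on the slide.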