Surfacing the deep data of taxonomy

Talk at Linnean Society London, 20 September 2012

Published in: Technology

  1. Surfacing the deep data of taxonomy @rdmpage http://iphylo.blogspot.com
  2. To a first approximation the taxonomy of life is already digital…
  3. doi:10.1126/science.276.5313.734
  4. Data – GenBank; Publications – PubMed; Names – Names4Life
  5. So, we’re done! (aren’t we?)
  6. doi:10.1126/science.276.5313.734
  7. Zoology as microbiology. Microbiology ➔ Zoology: GenBank ➔ DNA barcoding; PubMed ➔ Digital archives (BHL); Names ➔ ION, ZooBank, uBio, … Images from http://phylopic.org
  8. Why does having a single database of names matter?
  9. Bacterial names linked to literature http://dx.doi.org/10.1099/ijs.0.035154-0
  10. Paenibacillus polymyxa • http://dx.doi.org/10.1601/nm.5110 (name) • http://dx.doi.org/10.1601/tx.5110 (taxon) Image from http://dx.doi.org/10.1128/AEM.71.11.7292-7300.2005
  11. …still not convinced?
  12. Skull, mandible and tooth morphology of the holotype of L. melvillei MUSM 1676. O. Lambert et al. Nature 466, 105-108 (2010) doi:10.1038/nature09067
  13. Leviathan melvillei
  14. Bugger…
  15. Livyatan melvillei
  16. Two kinds of #fail
  17. We don’t have a list of all names
  18. Publications containing names often not accessible
  19. Leviathan melvillei
  20. Need more convincing?
  21. Dark taxa http://iphylo.blogspot.co.uk/2011/04/dark-taxa-genbank-in-post-taxonomic.html
  22. Mammals in GenBank: Aus sp.; Proper Linnaean names
  23. Mammals: Aus sp.; Proper Linnaean names
  24. “Invertebrates” BOLD
  25. Is this a problem?
  26. It’s the norm for Bacteria
  27. Dark taxa will only increase in number
  28. Roth v. Wikipedia http://www.newyorker.com/online/blogs/books/2012/09/an-open-letter-to-wikipedia.html
  29. Wikipedia says “no”
  30. “I understand your point that the author is the greatest authority on their own work,” writes the Wikipedia Administrator, “but we require secondary sources.”
  31. @quominus: One of Wikipedia’s core principles, along with things like neutrality, is verifiability: a reader must be able to look at a statement in a Wikipedia article and find out where it comes from. http://quominus.org/archives/981
  32. Taxonomic statements should be verifiable
  33. Literature is the evidence base for taxonomy
  34. Literature online: commercial publishers; digital archives; museums, universities, and scientific societies
  35. http://iphylo.org/~rpage/itaxon
  36. Animal names per decade. Data from http://www.organismnames.com
  37. Names with a DOI: 25%
  38. BioStor (BHL): 25% @biostor_org http://biostor.org
  39. Online (DOI, BioStor, JSTOR, DSpace, PDF, …): 50%
  40. Identifiers
  41. Vast majority of names are in the legacy literature
  42. Zootaxa and ZooKeys XML
  43. My wish list…
  44. Names linked to: literature, specimens, geography, sequences, phylogeny…
  45. BioNames (real soon now…) Computable Data Challenge
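
The slides cite DOIs in two forms, a `doi:` prefix and a `dx.doi.org` URL, and slide 10 labels one Names4Life DOI as a name and another as a taxon. As a rough illustration of what linking names to identifiers might involve, here is a minimal Python sketch that reduces both forms to a bare DOI and a resolver URL. The function names are hypothetical, the `nm.`/`tx.` rule is inferred only from the two examples on slide 10, and `https://doi.org/` is used as the resolver in place of the older `dx.doi.org` form; none of this is taken from the talk itself.

```python
import re

# Accepts "doi:10.x/y", "http(s)://dx.doi.org/10.x/y", "http(s)://doi.org/10.x/y",
# or a bare "10.x/y". This pattern is an illustrative assumption, not a full DOI grammar.
DOI_PATTERN = re.compile(
    r"^(?:doi:|https?://(?:dx\.)?doi\.org/)?(10\.\d{4,9}/\S+)$",
    re.IGNORECASE,
)


def normalize_doi(text):
    """Return the bare DOI (e.g. '10.1601/nm.5110') from any accepted form."""
    match = DOI_PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"not a recognizable DOI: {text!r}")
    return match.group(1)


def doi_url(text):
    """Build a resolver URL for the DOI."""
    return "https://doi.org/" + normalize_doi(text)


def names4life_kind(text):
    """Guess 'name' or 'taxon' for a 10.1601 DOI, following the labels on slide 10.

    Returns None for any other DOI. The nm./tx. convention is an assumption
    based on the two example DOIs in the slides.
    """
    prefix, _, suffix = normalize_doi(text).partition("/")
    if prefix != "10.1601":
        return None
    if suffix.startswith("nm."):
        return "name"
    if suffix.startswith("tx."):
        return "taxon"
    return None
```

For example, `doi_url("doi:10.1038/nature09067")` gives a resolver link for the Lambert et al. paper on slide 12, and `names4life_kind("http://dx.doi.org/10.1601/tx.5110")` classifies the second Paenibacillus polymyxa identifier as a taxon.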
