Biodiversity Informatics at the Natural History Museum


Published in: Technology, Education

  1. 1. Biodiversity Informatics at the Natural History Museum. Ed Baker, Terrestrial Invertebrates, Department of Life Sciences & NHM Informatics Initiative
  2. 2. Science as a Slow Cooker
     • Only the surface visible
     • Lid kept on for extended periods of time
     • Uses cheap cuts of raggy meat
     • Ingredients lose their nutritional value
     • Children at risk due to high temperatures
  3. 3. We like data
     • 70 million+ specimens collected over 400 years
     • 350,000+ books
     • ??? Unpublished datasets in archives, notebooks, computers
     • ??? In the minds of staff
  4. 4. How do we provide access?
     • Digitisation of specimens and associated data
     • Scanning and transcribing books, journals, archives
     • Providing tools for managing the data life cycle
     • Changing the way we publish: data publication
  5. 5. Flowing Data: Publication; Collection; Curation; Use
  6. 6. Flowing Data: Collection; Curation. Somebody retires; somebody dies; project is cancelled. Sits in desk drawer or on a hard drive until….
  7. 7. Flowing Data: Collection; Curation; Use; Data Publication; Re-use; Publication; Re-use; Re-use; Re-use
  8. 8. Flowing Data: from collection to reuse. Collection; Curation; Use; Data Publication; Re-use; Publication; Re-use; Re-use; Re-use
  9. 9. Collection
     • Citizen Science
     • Automated identification and monitoring
     • Traditional taxonomic sources
  10. 10. Flowing Data: from collection to reuse. Curation; Use; Data Publication; Re-use; Publication; Re-use; Re-use; Re-use
  11. 11. Curation. Websites for communities to publish and curate:
     • Taxonomy / nomenclature
     • Bibliographies
     • Specimen information
     • Character matrices
  12. 12. Flowing Data: from collection to reuse. Use; Data Publication; Re-use; Publication; Re-use; Re-use; Re-use
  13. 13. Use: Oboe
  14. 14. Use: Oboe
  15. 15. Flowing Data: from collection to reuse. Data Publication; Re-use; Publication; Re-use; Re-use; Re-use
  16. 16. Publication (Data)
     • Datasets
     • Single species descriptions
     • Checklists
     • Software
  17. 17. Flowing Data: from collection to reuse. Re-use; Publication; Re-use; Re-use; Re-use
  18. 18. Publication (Research)
     • Traditional research
     • Systematic zoology
     • Phylogeny
     • Biogeography
  19. 19. Flowing Data: from collection to reuse. Re-use; Re-use; Re-use; Re-use
  20. 20. The Problem of Scale. Data is being generated by tens of thousands of researchers, in thousands of institutions
     • Hard to find what you need
     • Hard to know if what you need actually exists
     • Impossible to go through researcher by researcher
  21. 21. NHM Data Portal
     • Aggregator for NHM science data
     • Visualisation tools for datasets
     • Allows export of NHM data for re-use
  22. 22. The Informatics Landscape: >18K specimen records (local small scale coverage); >276M specimen records (worldwide coverage)
  23. 23. The Informatics Landscape: A webpage for every species; Aggregate specimen and observation data globally
  24. 24. Wikimedian in Residence
     • Make NHM content available under open licenses for use on Wikimedia projects (and elsewhere)
     • Reach of Wikipedia: BBC, Encyclopedia of Life
     • Wikisource: Transcription and translation crowd-sourcing
  25. 25. Flowing Data: from collection to reuse?
  26. 26. "Everybody makes mistakes. And if you don't expose your raw data, nobody will find your mistakes." Jean-Claude Bradley