BHL, BioStor, and beyond


Keynote for Biodiversity Heritage Library #BHLat10 event at NHM in London, 12 April 2016

Published in: Science

  1. BHL, BioStor, and beyond #BHLat10 @rdmpage
  2. #iamataxonomist
  3. Pinnotheres atrinicola Page, 1983
  4. One species of pea crab had a parasite… …what is it?
  5. Sur un type nouveau d'Epicarides Rhopalione uromyzon n. g. n. sp., parasite sous-abdominal d'un Pinnothere [On a new type of epicaridean, Rhopalione uromyzon n. g. n. sp., a sub-abdominal parasite of a pinnotherid]
  6. Rhopalione in BHL
  7. Why BHL is cool #1: Accessibility
  8. First impressions, meh…
     • OMG it’s full of plants
     • It’s all old stuff
     • Where the $#@! are the articles?
  9. “More hack, less yack”. “[to] be able to move some subset of the world … from the leverage point of the command line.” Steven E. Jones, The Emergence of the Digital Humanities
  10. Why BHL is cool #2: It is hackable
  11. No articles? No problem!
     • Data is available for download
     • Also an API (and OAI-PMH, yuck!)
     • So, let’s go find the articles…
  12. Find articles - “simples”. Mapping between BHL and articles: BHL (Title, Volume, Page) to Article (Journal, Volume, Start page – end page). Reference: Extracting scientific articles from a large digital archive: BioStor and the Biodiversity Heritage Library, doi:10.1186/1471-2105-12-187
  13. (no text)
  14. BioStor and Pinterest
  15. BioStor and JournalMap
  16. BioStor and BHL articles
  17. First impressions, meh…
     • OMG it’s full of plants
     • It’s all old stuff
     • Where the $#@! are the articles?
  18. Not so cool…
  19. Scanning currently dominated by USDA
  20. BHL-Europe: unhackable zombie
  21. Where next?
  22. Findability: DOIs for articles, e.g. doi:10.5962/bhl.part.14773
  23. Mickey Mouse is evil
  24. 100,000 articles from (BHL) 1923 today
  25. Synthetic documents. S. Michael, Machines as readers: A solution to the copyright problem: “we proposed to scan works digitally to extract their intellectual content, and then generate by machine synthetic works that capture this content … and distribute them free of copyright”
  26. Cited, linkable specimens
     • NMNH Vertebrate Zoology Herpetology Collections: 11194
     • CAS Herpetology Collection Catalog: 9619
     • MCZ Herpetology Collection: 6720
     • Herpetology Collection (University of Kansas Biodiversity Research Center): 5818
  27. The case for a PubMed Central for Biodiversity
  28. Isn’t that, um, PubMed Central?…
  29. Europe PMC
  30. PubMed Central for biodiversity
     • Taxonomic names
     • Geographic localities
     • Specimen codes
     • Handle XML, PDF, OCR text
     • Store facts as well as documents
  31. “Google figured out how to manage abundance while every other media company in the world was trying to manufacture scarcity, and for that we should be grateful.” Siva Vaidhyanathan, The Googlization of Everything (And Why We Should Worry)
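
The "find articles" step on slide 12 maps an article citation (journal, volume, start page to end page) onto scanned pages described by title, volume, and page. A minimal Python sketch of that idea follows. It is not BioStor's actual code and it does not call the BHL API: the `ScannedPage` and `ArticleCitation` types, the `page_id` field, the `find_article_pages` function, and the sample journal are all hypothetical, and the exact-match rule is a simplifying assumption.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ScannedPage:
    """One scanned page, described as on slide 12: title, volume, page."""
    title: str
    volume: str
    page: int      # printed page number
    page_id: int   # hypothetical identifier for the scan


@dataclass(frozen=True)
class ArticleCitation:
    """An article, described as on slide 12: journal, volume, page range."""
    journal: str
    volume: str
    start_page: int
    end_page: int


def normalise(name: str) -> str:
    """Lower-case and collapse whitespace so trivial variants compare equal."""
    return " ".join(name.lower().split())


def find_article_pages(citation: ArticleCitation,
                       pages: List[ScannedPage]) -> List[ScannedPage]:
    """Return the scanned pages covered by the citation, in page order.

    Assumption: titles match after normalisation and volumes match exactly.
    """
    wanted = normalise(citation.journal)
    matches = [
        p for p in pages
        if normalise(p.title) == wanted
        and p.volume == citation.volume
        and citation.start_page <= p.page <= citation.end_page
    ]
    return sorted(matches, key=lambda p: p.page)


# Made-up sample data: two volumes of an invented journal.
pages = (
    [ScannedPage("Example Journal of Zoology", "12", n, 1000 + n) for n in range(1, 21)]
    + [ScannedPage("Example Journal of Zoology", "13", n, 2000 + n) for n in range(1, 11)]
)
citation = ArticleCitation("example  journal of zoology", "12", 5, 8)
article_page_ids = [p.page_id for p in find_article_pages(citation, pages)]
print(article_page_ids)
```

With this sample data the citation resolves to the four scans for pages 5 to 8 of volume 12. Real records would probably need fuzzier title matching and some handling of unnumbered or misnumbered pages, which this sketch leaves out.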