BHL DEVELOPMENTS BHL-EUROPE MEETING NÁRODNÍ MUZEUM, PRAGUE  16 NOV 2009 Chris Freeland   Technical Director, BHL
BHL Developments - Prague
Transcript of "BHL Developments - Prague"

  1. 1. BHL DEVELOPMENTS BHL-EUROPE MEETING NÁRODNÍ MUZEUM, PRAGUE 16 NOV 2009 Chris Freeland Technical Director, BHL
  2. 2. Kai in STL, describing a metadata format
  3. 3. We like to have fun while BHLing…
  4. 4. Blame the scotch
  5. 5. Stats: Now Online
      • Last week:
          • 15,000 titles
          • 40,000 volumes
          • 16.4 million pages
      • Today:
          • 34,636 titles
          • 66,544 volumes
          • 25.2 million pages
      BHL Partner Libraries
      BHL + >100 other libraries with open access content at archive.org
  6. 6. Stats: Usage
      • Jan – Sep 2009
          • 266,000 visitors
          • 436,000 visits
          • 2.1 million pageviews
      • Daily average
          • 970 visitors / day
          • 1,600 visits / day
          • 7,700 pageviews / day
      Jan – Sep 2009
      Launch to 30 Sep 2009
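The daily averages on slide 6 follow from the period totals. A minimal arithmetic check, assuming the reporting window runs from 1 January to 30 September 2009 inclusive (the slide only says "Jan – Sep 2009"); the variable and function names are illustrative:

```python
from datetime import date

# Assumed reporting window: 1 Jan to 30 Sep 2009, inclusive.
days = (date(2009, 9, 30) - date(2009, 1, 1)).days + 1

# Period totals as given on the slide.
totals = {"visitors": 266_000, "visits": 436_000, "pageviews": 2_100_000}


def daily_average(total, n_days):
    """Plain mean per day over the window."""
    return total / n_days


averages = {name: round(daily_average(value, days)) for name, value in totals.items()}
print(days, averages)
```

Under that assumption the window is 273 days, and the computed means sit close to the rounded figures quoted on the slide (970, 1,600 and 7,700 per day).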
  7. 7. New Color Scheme: To be released this week http://github.com/openlibrary/bookreader
  8. 8. Cloud storage & computing
  9. 9. Global, coordinated development
      • Building a community of developers
          • Funded & volunteer
          • RubyBHL: http://github.com/mjy/rubyBHL
          • PyBHL: http://linux.softpedia.com/get/Programming/Libraries/pybhl-51612.shtml
      • Programmers from China & Australia committed to project
      • New partners, new content, new possibilities
  10. 10. Open Software & Development
      • BHL Bits:
          • Portal code, utilities, services
          • http://code.google.com/p/bhl-bits/
      • Taxonomic Literature Group
          • Google Group for discussion of “taxonomic literature & the services required to make literature interoperable within biodiversity research and biodiversity informatics.”
          • http://groups.google.com/group/taxonlit
  11. 11. Open Data
      • Downloads
          • Simple tab-delimited exports of core data
          • http://www.biodiversitylibrary.org/data/BHLExportSchema.pdf
      • Data model
          • DB schema as ERD
          • http://bhl-bits.googlecode.com/files/20090930_BHLDataModel.pdf
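Slide 11 describes the downloads as simple tab-delimited exports. A minimal sketch of reading such a file with Python's standard `csv` module follows. The column names and sample rows are hypothetical placeholders, not the real export schema (that is documented in the BHLExportSchema.pdf linked on the slide), and `read_tab_delimited` is an illustrative helper name.

```python
import csv
import io

# Hypothetical stand-in for an export file; real column names are
# defined in the export schema document, not here.
SAMPLE = "TitleID\tShortTitle\n1\tExample title A\n2\tExample title B\n"


def read_tab_delimited(handle):
    """Return each row as a dict keyed by the header line."""
    # QUOTE_NONE treats quote characters as ordinary data, which suits
    # plain tab-separated dumps that do not use CSV-style quoting.
    reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    return list(reader)


rows = read_tab_delimited(io.StringIO(SAMPLE))
for row in rows:
    print(row["TitleID"], row["ShortTitle"])
```

For a downloaded file, the same function could be given `open(path, newline="", encoding="utf-8")` instead of the in-memory sample; the encoding is also an assumption.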
  12. 12. Services
      • Names Service
          • Return all occurrences of a name throughout BHL digitized corpus
              • Documentation: http://bit.ly/2e6sg9
          • Access to 51 million name strings using TaxonFinder
              • 1.4 million unique names
      • OpenURL
          • Facilitate links to citations: protologues, articles, references
              • Documentation: http://www.biodiversitylibrary.org/openurlhelp.aspx
          • Useful to Nomenclators, Reference Systems
              • IPNI
              • Tropicos
  13. 13. Services: OpenURL
      http://www.biodiversitylibrary.org/openurl?pid=title:3934&volume=14&issue=&spage=301&date=1879
      http://www.tropicos.org/Name/1200408
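The OpenURL link on slide 13 is an ordinary query string, so a referring system could assemble it from citation fields. A minimal sketch, assuming only the parameter names visible in that example (`pid`, `volume`, `issue`, `spage`, `date`); `build_openurl` is a hypothetical helper, and whether the service accepts other parameters is not covered here (see the OpenURL documentation linked on slide 12):

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Base address taken from the slide's example link.
BASE = "http://www.biodiversitylibrary.org/openurl"


def build_openurl(title_id, volume="", issue="", spage="", date=""):
    """Assemble a link in the same shape as the slide's example."""
    params = {
        "pid": f"title:{title_id}",
        "volume": volume,
        "issue": issue,   # left empty when unknown, as in the example
        "spage": spage,   # start page
        "date": date,
    }
    # safe=":" keeps the colon in "title:3934" unescaped, matching the slide.
    return BASE + "?" + urlencode(params, safe=":")


url = build_openurl(3934, volume=14, spage=301, date=1879)
print(url)

# Reading the fields back; keep_blank_values preserves the empty issue.
fields = parse_qs(urlsplit(url).query, keep_blank_values=True)
print(fields["pid"], fields["issue"])
```

Leaving unknown fields empty rather than omitting them mirrors the example link, which carries `issue=` with no value.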
  14. 14. Services: OpenURL Disambiguation
      • Looking for:
      • BHL returns:
  15. 15. Services: OpenURL Results
  16. 16. Encyclopedia of Life
      • 522,000 species pages linked to BHL
      • #1 referring site
  17. 17. Other Consumers
      • EarthCape Labs
          • Sort/Search capabilities with harvested names
          • YouTube demo: http://www.youtube.com/watch?v=qw7qw87JTOs
      • BioGUID / iPhylo
          • BHL Name Timeline & Comparison
              • http://bioguid.info/bhl/
              • http://bioguid.info/bhl/compare.php
          • New Viewer
          • Tagging
          • So much cool stuff we can’t keep up!
              • http://iphylo.blogspot.com/search/label/BHL
      @rdmpage
  18. 18. http://bioguid.info/bhl/compare.php?name1=Physeter+catodon&name2=Physeter+macrocephalus
  19. 19. Crowdsourced Articles
      • http://www.biodiversitylibrary.org/pdfgen/17298
      Demo: http://youtube.com/watch?v=oidf3b26jVs
  20. 20. Crowdsourced Articles
      • 12,000 PDFs generated through September 2009
          • 4,900 submitted with article metadata
          • Analysis: http://bit.ly/4Jqu9
  21. 21. Great, but how to…
      • display / manage?
      • meet community demands for bibliography / citation management?
      • build from more open source tools?
  22. 22. Development goals re: citations
      • Create a repository for community-vetted taxonomic bibliographies.
      • Ability to ingest, display, download, and index articles so that the BHL can operate as an article repository.
      • Identify article boundaries in BHL digitized content using contributed bibliographies & algorithms.
      • Build from existing community of work around Drupal / Biblio.
          • In use by collaborators
  23. 23. Need…
      • “something like GenBank or NameBank for citations…”
      • So, CitationBank…or CiteBank (savs chars)
  24. 24. http://citebank.biodiversitylibrary.org/
  25. 25. Crowdsourced Articles
      • PDFs from BHL pushed into Drupal/Biblio:
  26. 26. http://citebank.biodiversitylibrary.org/ search
  27. 27. http://citebank.biodiversitylibrary.org/node/47423
  28. 29. CiteBank boundaries Book Citation Pageturning UI PDF OCR eBook/Kindle Stored *somewhere* & retrievable via HTTP URI Citation Citation Citation Bibliography CiteBank
  29. 30. BHL Data Flow – Sep 2009 CiteBank
  30. 31. Points of discussion @ TDWG09… Linked Literature and the Biodiversity Heritage Library http://www.tdwg.org/proceedings/article/view/548
  31. 32. Who can upload & edit?
      • Trusted repositories?
      • Approved specialists?
      • BHL Librarians?
      • People in this session?
      • Citizen scientists?
      • 6th graders?
      • Rod Page?
      Discussion: Session participants thought it important that BHL get as many citations as possible, then find ways of implementing trust mechanisms for users, such as iSpot (Drupal module), rating systems, and ways of tagging inappropriate materials.
  32. 33. What about duplicates?
      • 3 Bibliographies had Syst. Nat.
          • All 3 in different reference manager formats
          • All 3 had variant forms of title:
              • Syst. Nat.
              • Systema Naturae
              • Systema naturae per regna tria naturae
          • Library catalogues:
              • Caroli Linnaei...Systema naturae per regna tria naturae :secundum classes, ordines, genera, species, cum characteribus, differentiis, synonymis, locis.
      Discussion: Important to have all the ways in which materials have been referred to over time, then have algorithms & people aggregate titles/articles (translations) into reconciliation groups, resulting in a master index.
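The reconciliation groups mentioned in the slide 33 discussion can be illustrated with a toy matcher. This is a sketch under stated assumptions, not the approach BHL or CiteBank uses: it treats two titles as variants when the words of the shorter one line up, by prefix, with a consecutive run of words in the longer one. The function names are hypothetical, and "Species Plantarum" is added here only as a contrasting title.

```python
import re


def tokens(title):
    """Lowercase the title and split on anything that is not a-z or 0-9."""
    return [t for t in re.split(r"[^a-z0-9]+", title.lower()) if t]


def is_variant(a, b):
    """True if the shorter title's words prefix-match a run of words in the longer."""
    short, long_ = sorted((tokens(a), tokens(b)), key=len)
    if not short:
        return False
    for start in range(len(long_) - len(short) + 1):
        window = long_[start:start + len(short)]
        # Accept a prefix in either direction so "syst" matches "systema".
        if all(w.startswith(s) or s.startswith(w) for s, w in zip(short, window)):
            return True
    return False


def reconciliation_groups(titles):
    """Greedy grouping: join the first group containing a matching member."""
    groups = []
    for title in titles:
        for group in groups:
            if any(is_variant(title, member) for member in group):
                group.append(title)
                break
        else:
            groups.append([title])
    return groups


titles = [
    "Syst. Nat.",
    "Systema Naturae",
    "Systema naturae per regna tria naturae",
    "Caroli Linnaei...Systema naturae per regna tria naturae",
    "Species Plantarum",  # contrasting title, not from the slide
]
groups = reconciliation_groups(titles)
for group in groups:
    print(group)
```

Prefix matching of this kind will produce false positives on short abbreviations, which is one reason the discussion pairs algorithms with people when building the master index.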
  33. 34. Accuracy
      • How clean is clean?
      • How dirty is dirty?
      • What’s good enough?
          • How to Rank
              • Gold/Platinum?
              • Dirty Bucket/Clean Bucket?
      Discussion: Let users decide which is the “right” form for use; may differ from project to project. BHL should take it all in, then refine using our libraries’ collected knowledge + involvement from domain specialists.
  34. 35. Right technologies?
      • “But Drupal’s awful…just ask ___ for their bad experience.”
      • “Drupal’s great!”
      • “MySQL won’t scale”
      • “MySQL’s great!”
      Discussion: Drupal has limitations, but a large community of developers & implementers. There may be a “Montpellier Declaration” to centralize efforts within biodiversity informatics around the framework. Drupal/Biblio is a good starting point for CiteBank, but needs further evaluation after more data are loaded & the site is used.
  35. 36. New projects
      • … BHL keeps growing & growing & growing…
  36. 37. Darwin’s Library
      • AMNH, NHM, CUL, BHL (MOBOT)
      • Funded by NEH/JISC
      • Digitization of Darwin’s personal library, with annotations
          • New interfaces for recording, indexing, displaying annotations
      • Review “Dannotate” technology from ALA: http://metadata.net/sfprojects/dannotate.html
  37. 38. BHL Take Away
      • Content now available in EPUB format
          • Used by Stanza, transferable to Kindle
      • Blog post by John Mignault (NYBG):
          • http://john.mignault.net/blog/2009/10/28/first-bhl-e-book-experiments/
  38. 39. Next steps
      • Bring hardware online at MBL
          • Have one point of redundancy
          • By Q1 2010
      • Bring BHL-Europe & other nodes online
          • In conjunction with DuraCloud & other solutions
      • Release CiteBank for beta & sandbox testing
          • Beta at http://citebank.biodiversitylibrary.org
          • Sandbox at http://sandcite.biodiversitylibrary.org
          • Production release by Q2 2010
      • Integration of BHL-Europe tools & content
  39. 40. Global BHL Coordination
  40. 41. Thanks!
      • Chris Freeland
      • Technical Director, BHL
      • Director, Center for Biodiversity Informatics, Missouri Botanical Garden
          • [email_address]
          • http://twitter.com/chrisfreeland