An Inordinate Fondness for Data
The Biodiversity Heritage Library


OCLC Digital Forum East 2009
5 November 2009
Arlington...
An Inordinate Fondness for Data: The Biodiversity Heritage Library. Martin R. Kalfatovic. OCLC Digital Forum East 2009. November 5, 2009. Arlington, VA.

Published in: Technology, Education

Transcript of "An Inordinate Fondness for Data: The Biodiversity Heritage Library"

  1. An Inordinate Fondness for Data: The Biodiversity Heritage Library
     OCLC Digital Forum East 2009, 5 November 2009, Arlington, VA
     Martin R. Kalfatovic, Smithsonian Institution Libraries
  2. American Museum of Natural History (New York)
     Academy of Natural Sciences (Philadelphia)
     California Academy of Sciences (San Francisco)
     Field Museum (Chicago)
     Natural History Museum (London)
     Smithsonian Institution Libraries (Washington)
     Missouri Botanical Garden (St. Louis)
     New York Botanical Garden (New York)
     Royal Botanic Gardens, Kew
     Botany Libraries, Harvard University
     Ernst Mayr Library of the Museum of Comparative Zoology, Harvard University
     Marine Biological Laboratory / Woods Hole Oceanographic Institution
  3. The Encyclopedia of Life
  4. Education and Outreach: Smithsonian & Harvard
     Synthesis Center: Field Museum
     Species Pages & Secretariat: Smithsonian
     Informatics: Marine Biological Laboratory
     Missouri Botanical Garden
  5. How much is there:
     Core literature pre-1923: 100 million pages (?)
     All pre-1923: 120-150 million pages
     All literature: 280-320 million pages
  6. • Northeast Regional Scanning Facility (Boston)
     • Jersey City Facility
     • University of Illinois
     • Natural History Museum, London
     • Missouri Botanical Garden (Non-Scribe operation)
     • Fedscan (Library of Congress)
     • Smithsonian Libraries
  7. BHL Members: BHL-Europe
     • Museum für Naturkunde - Leibniz-Institut für Evolutions- und Biodiversitätsforschung an der Humboldt-Universität zu Berlin
     • Natural History Museum, UK
     • Narodni muzeum NMP CZ
     • Angewandte Informationstechnik Forschungsgesellschaft mbH
     • Freie Universität Berlin FUB-BGBM
     • Georg-August-Universität Göttingen Stiftung Öffentlichen Rechts
     • Naturhistorisches Museum Wien
     • Hungarian Natural History Museum
     • Museum and Institute of Zoology, Polish Academy of Sciences
     • University of Copenhagen
     • Stichting Nationaal Natuurhistorisch Museum, Naturalis
     • National Botanic Garden of Belgium
     • Royal Museum for Central Africa
     • Royal Belgian Institute of Natural Sciences
     • Bibliothèque nationale de France
     • Muséum national d'histoire naturelle
     • Consejo Superior de Investigaciones Cientificas
     • Università degli Studi di Firenze
     • Royal Botanic Garden, Edinburgh
     • Species 2000
     • John Wiley & Sons Limited
     • Helsingin yliopisto UH-Viikki
  8. Now Online
     More than: 40,000 volumes; 16 million pages
     Only 290 million to go!
     Avg. monthly growth rate: 1,500 volumes; 600,000 pages
     See you in 2048!
  9. Ingest existing content
     12,000,000+ pages from other Internet Archive scanning partners
  10. Acquiring other content ...
      • Researchers scanning their own work or literature relevant to their work
      • Journals that have scanned their content, but do not have a robust platform to host it
  11. Biodiversity Heritage Library Permission Process
      Working with non-profit publishers for sharing with the BHL
      To digitize and mount works under copyright, BHL must obtain permission from the copyright holders. Many biodiversity journals and monographs are published by non-profit institutions or learned societies whose mission is to promote research and learning. Some of these institutions have not sold their rights to commercial publishers and are open to sharing with the BHL.
  12. So what? Does [fill in blank] do that? … and more and faster?
  13. So what? Does [fill in blank] do that? … and more and faster?
  14. BHL is all about OPEN & SHARING
  15. Remind me again why?
  16. An inordinate fondness for data
      Access
        Putting biodiversity literature in the hands of researchers
      Set the data free
        Suck it; mash it; broadcast it
      Increase
        Reuse, recycle, expand
  17. Stats: Usage
      • Jan – Sep 2009
          – 266,000 visitors
          – 436,000 visits
          – 2.1 million pageviews
      • Daily average
          – 970 visitors
          – 1,600 visits / day
          – 7,700 pageviews / day
      (Chart labels: Jan – Sep 2009; Launch to 30 Sep 2009)
  18. Global, coordinated development
      New functionality from BHL-Europe
        Improved deduplication tools
        Semantic interface
        OAIS-compliant preservation infrastructure
      Building a community of developers
        Funded & volunteer
        RubyBHL: http://github.com/mjy/rubyBHL
        PyBHL: http://linux.softpedia.com/get/Programming/Libraries/pybhl-51612.shtml
      New partners, new content
  19. Open Software & Development
      BHL Bits:
        Portal code, utilities, services
        http://code.google.com/p/bhl-bits/
      Taxonomic Literature Group
        Google Group for discussion of “taxonomic literature & the services required to make literature interoperable within biodiversity research and biodiversity informatics.”
        http://groups.google.com/group/taxonlit
  20. Open Data
      Downloads
        Simple tab-delimited exports of core data
        http://www.biodiversitylibrary.org/data/BHLExportSchema.pdf
      Data model
        DB schema as ERD
        http://bhl-bits.googlecode.com/files/20090930_BHLDataModel.pdf
  21. Open Data
  22. Open Source Pageturning UI
      http://github.com/openlibrary/bookreader
  23. Metadata: Feedback loop
      Assigned to library staff for review & resolution
  24. Services
      Names Service
        Return all occurrences of a name throughout the BHL digitized corpus
        Documentation: http://bit.ly/2e6sg9
        Access to 51 million name strings using TaxonFinder
        1.4 million unique names
        Working out a strategy for obscure species
        Algorithm improvements to detect nomenclatural & taxonomic acts
      OpenURL
        Facilitate links to citations: protologues, articles, references
        Documentation: http://www.biodiversitylibrary.org/openurlhelp.aspx
        Useful to nomenclators, reference systems
          IPNI
          Tropicos
  25. Services: OpenURL Request
      http://www.biodiversitylibrary.org/openurl?pid=title:3934&volume=3&issue=&spage=262&date=1856
      http://www.tropicos.org/Name/1200408
  26. Services: OpenURL Disambiguation
      Looking for:
      BHL returns:
  27. Services: OpenURL Results
  28. Encyclopedia of Life
      522,000 species pages linked to BHL
      #1 referring site
  29. Other Consumers
      EarthCape Labs
        Sort/Search capabilities with harvested names
        YouTube demo: http://www.youtube.com/watch?v=qw7qw87JTOs
      BioGUID
        BHL Name Timeline: http://bioguid.info/bhl/
        BHL Name Comparison: http://bioguid.info/bhl/compare.php
  30. Global BHL
      Based on open access
      Open content
      Collaboration
      Shared development
  31. Uh, so what's it mean to me?
      1.9 million known species … most described once in a hard-to-find article … wouldn't it be nice to know more about your neighbors ...
  32. And thanks to ...
  33. Thanks for sticking around!
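
A quick arithmetic check of the figures on slides 8 and 17, as a rough sketch rather than the presenter's own calculation. It assumes the remaining 290 million pages are scanned at a constant 600,000 pages per month starting in November 2009, and that the "Jan – Sep 2009" period runs from 1 January to 30 September inclusive. Under those assumptions the backlog clears around 2050, in the same ballpark as the slide's "See you in 2048!", and the daily averages match the rounded values on slide 17.

```python
from datetime import date

# Slide 8: pages still to scan and average monthly growth.
PAGES_REMAINING = 290_000_000
PAGES_PER_MONTH = 600_000

months_left = PAGES_REMAINING / PAGES_PER_MONTH
years_left = months_left / 12
# Assumption: the clock starts in November 2009 (10 months into the year).
finish_year = 2009 + 10 / 12 + years_left

# Slide 17: usage totals for January to September 2009.
# Assumption: the period is 1 Jan to 30 Sep 2009, both days included.
days = (date(2009, 9, 30) - date(2009, 1, 1)).days + 1
totals = {"visitors": 266_000, "visits": 436_000, "pageviews": 2_100_000}
daily = {name: total / days for name, total in totals.items()}

print(f"Months of scanning left: {months_left:.0f}")
print(f"Projected finish year:   {finish_year:.0f}")
for name, value in daily.items():
    print(f"Average {name} per day: {value:.0f}")
```

The slide's 2048 figure presumably used slightly different inputs; the point is only that the order of magnitude (about four decades) follows directly from the quoted numbers.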
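
Slide 20 mentions simple tab-delimited exports of core data. The sketch below shows one way such a file could be read with Python's standard `csv` module. The column names `TitleID` and `ShortTitle` and the sample row are hypothetical placeholders; the actual fields are defined in the export schema linked on the slide.

```python
import csv
import io

# Hypothetical sample standing in for a downloaded export file.
# Real column names are defined in BHLExportSchema.pdf, not here.
SAMPLE = "TitleID\tShortTitle\n3934\tExample title\n"


def read_export(handle):
    """Parse a tab-delimited export into a list of dicts keyed by header name."""
    return list(csv.DictReader(handle, delimiter="\t"))


rows = read_export(io.StringIO(SAMPLE))
for row in rows:
    print(row["TitleID"], row["ShortTitle"])
```

For a real download, the `io.StringIO` object would be replaced by a file opened with `open(path, newline="", encoding="utf-8")`; the encoding is also an assumption.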
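
The OpenURL request on slide 25 can be assembled programmatically. This is a minimal sketch that only reproduces the query string shown on the slide: the parameter names (`pid`, `volume`, `issue`, `spage`, `date`) and the base URL come from the slide, while the helper name `build_openurl` is made up for illustration. The full set of supported parameters is described in the OpenURL documentation linked on slide 24.

```python
from urllib.parse import urlencode

BASE_URL = "http://www.biodiversitylibrary.org/openurl"


def build_openurl(title_id, volume="", issue="", spage="", date=""):
    """Build an OpenURL query using the parameter names seen on slide 25.

    build_openurl is a hypothetical helper, not part of any BHL library.
    """
    params = {
        "pid": f"title:{title_id}",
        "volume": volume,
        "issue": issue,
        "spage": spage,
        "date": date,
    }
    # safe=":" leaves the colon in "title:3934" unescaped, as on the slide.
    return f"{BASE_URL}?{urlencode(params, safe=':')}"


url = build_openurl(3934, volume=3, spage=262, date=1856)
print(url)
```

Empty values are kept in the query (for example `issue=`), which mirrors the request on the slide.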