Digital Libraries for Science: Botanicus and the Biodiversity Heritage Library

Published in: Technology, Education
    1. Digital Libraries for Science: Botanicus & Biodiversity Heritage Library
       Chris Freeland, Director of Bioinformatics, Missouri Botanical Garden; Technical Director, BHL
    2. Why scan old books?
       The cited half-life of publications in taxonomy is longer than in any other scientific discipline.*
       * The decay rate is longer than in any scientific discipline - Macro-economic case for open access, Tom Moritz
    3. Botanicus.org
    4. Workflow: Selection, Preparation, Post Production, (Re)publication, Digitization, Conservation
    5. Selection process
       Title: Protologues
       • Das Pflanzenreich: 5,455
       • … …
       • Species Plantarum: 5,736
       • Bulletin of the Torrey Botanical Club: 5,853
       • Repertorium Specierum Novarum Regni Vegetabilis: 7,578
       • Journal of the Linnean Society, Botany: 7,599
       • Flora Brasiliensis: 9,833
       • Prodromus Systematis Naturalis Regni Vegetabilis: 11,757
       • Linnaea: 12,695
       • Revisio Generum Plantarum: 13,548
       • Botanische Jahrbücher für Systematik…: 15,052
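
The slide lists titles with their protologue counts but does not state the selection rule. Assuming titles were prioritized by how many protologues they contain, the ranking can be sketched as below. The counts are copied from the slide; the function name `rank_titles` is hypothetical, and the elided row is left out because its values are not given.

```python
# Protologue counts per title, copied from the slide.
# The elided "…" row is omitted because the slide does not give its values.
PROTOLOGUE_COUNTS = {
    "Das Pflanzenreich": 5455,
    "Species Plantarum": 5736,
    "Bulletin of the Torrey Botanical Club": 5853,
    "Repertorium Specierum Novarum Regni Vegetabilis": 7578,
    "Journal of the Linnean Society, Botany": 7599,
    "Flora Brasiliensis": 9833,
    "Prodromus Systematis Naturalis Regni Vegetabilis": 11757,
    "Linnaea": 12695,
    "Revisio Generum Plantarum": 13548,
    "Botanische Jahrbücher für Systematik…": 15052,
}


def rank_titles(counts):
    """Return (title, count) pairs, highest protologue count first.

    Hypothetical helper: assumes scanning priority follows protologue count.
    """
    return sorted(counts.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for title, count in rank_titles(PROTOLOGUE_COUNTS):
        print(f"{count:>6,}  {title}")
```

Under that assumption, the titles at the top of the ranking are the ones whose digitization would link the most names to their original descriptions.
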
    6. Digitization process
       • 6 full-time scanning technicians
       • 3 Indus 5002 book scanners
       • 1 Kodak i280 sheet-feed scanner
    7. Demonstration: Connecting a name with its protologue
    9. Citation resolver
       Vol., Title, Part, Page, Year
    11. How we make the connections
       • From Tropicos:
           • Store structured citation info, not free text
               • Volume: 2
               • Issue: 4
               • Start Page: 358
               • *NOT*: 2(4): 358
           • Maintain authority files for bibliographic materials, including Botanicus TitleIDs
       • From Botanicus:
           • Detailed info for every page
           • Knowledge of other identifiers for book
           • Flexibility to accommodate multiple cataloging
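
The contrast between structured fields and a free-text string such as `2(4): 358` can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `Citation` class, the regular expression, and the dictionary-based page index are hypothetical and are not the Tropicos or Botanicus schema.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Citation:
    """Structured citation fields, as opposed to free text like '2(4): 358'."""
    volume: str
    issue: Optional[str]
    start_page: str


# Hypothetical pattern for compact strings of the form "volume(issue): page".
_COMPACT = re.compile(r"^\s*(\d+)\s*(?:\(([^)]+)\))?\s*:\s*(\d+)")


def parse_compact(text):
    """Split a compact citation string into fields; return None if it does not fit."""
    match = _COMPACT.match(text)
    if match is None:
        return None
    return Citation(volume=match.group(1), issue=match.group(2), start_page=match.group(3))


def resolve(page_index, title_id, citation):
    """Look up a page identifier in a hypothetical in-memory index.

    page_index maps (title_id, volume, issue, start_page) to a page identifier.
    """
    key = (title_id, citation.volume, citation.issue, citation.start_page)
    return page_index.get(key)


if __name__ == "__main__":
    index = {(123, "2", "4", "358"): "page-0001"}  # made-up identifiers
    cite = parse_compact("2(4): 358")
    print(cite, resolve(index, 123, cite))
```

Once the fields are stored separately, the lookup is an exact key match. With free text, every resolver call would first have to guess at the string's format, and strings that do not fit the pattern cannot be resolved at all.
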
    12. Botanicus Progress To Date:
       • 2,400 volumes
       • 1 million pages
       • … growing daily…
       Freely available at www.botanicus.org
    13. Biodiversity Heritage Library (BHL)
       http://www.biodiversitylibrary.org
    14. BHL Institutions
       • Museums
           • American Museum of Natural History (New York)
           • Natural History Museum (London)
           • Smithsonian Institution (Washington)
           • The Field Museum (Chicago)
       • Botanical Gardens
           • Missouri Botanical Garden
           • New York Botanical Garden
           • Royal Botanic Garden, Kew
       • Bioinformatics Institutes
           • MBL/WHOI
           • uBio.org
       • University Libraries
           • Botany Libraries, Harvard University
           • Ernst Meyer Library of the Museum of Comparative Zoology, Harvard University
           • University of Illinois
    15. Scanning Operations
       • BHL uses scanning centers established by Internet Archive for mass scanning.
       • Some partner libraries also scan in-house.
       • Want to expand international footprint:
           • mirrored content
           • ingest from global data providers
       [Map: Locations of BHL/IA Scanning Centers]
    16. BHL Progress To Date:
       Nearing:
       • 24,000 volumes
       • 10 million pages
       • … growing daily…
       Freely available at www.biodiversitylibrary.org
    17. Open Access Literature
       Flora de la Provincia de Buenos Aires. Publisher: Buenos Aires: M. Biedma è Hijo, 1905.
       PDF, OCR, XML, JP2
    18. Name Finding via TaxonFinder
    19. Name Finding in action with Taxonomic Intelligence…
       • Raw Image
       • Converted to text via OCR
       • Name finding via TaxonFinder
       • Extract names
       • Submit to NameBank
       • SOAP response
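
The pipeline above (OCR text, name finding, extraction, verification) can be sketched in a few lines. This is a deliberately naive stand-in, not TaxonFinder's algorithm and not the NameBank SOAP interface: the regular expression only flags "Capitalized lowercase" word pairs, and a plain Python set plays the role of the verification service. All names in the code are hypothetical.

```python
import re
from collections import Counter

# Naive candidate pattern: a capitalized word followed by a lowercase word,
# e.g. "Quercus alba". This is NOT how TaxonFinder works; it is a toy stand-in.
CANDIDATE = re.compile(r"\b([A-Z][a-z]{2,})\s+([a-z]{3,})\b")


def find_candidates(ocr_text):
    """Return every candidate name string found in OCR text, in order."""
    return [f"{genus} {epithet}" for genus, epithet in CANDIDATE.findall(ocr_text)]


def verify(candidates, known_names):
    """Keep candidates present in known_names, with their occurrence counts.

    known_names is a local set standing in for a lookup against NameBank.
    """
    occurrences = Counter(candidates)
    return {name: count for name, count in occurrences.items() if name in known_names}


if __name__ == "__main__":
    page = "Quercus alba grows here. The tree near Quercus alba and Acer rubrum."
    found = find_candidates(page)
    print(found)
    print(verify(found, {"Quercus alba", "Acer rubrum"}))
```

The sample page yields the false candidate "The tree", which the verification step discards. That is the reason for keeping extraction and verification as separate stages: the finder can be permissive because an authority list filters its output.
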
    20. BHL Name Finding Stats to date *
       • Have mined more than 30 million name string occurrences
           • 4.4 million unique
       • More than 23.7 million name strings verified by NameBank
           • 1.2 million unique
       * 17 November 2008
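
As a quick check on what these figures imply, the verified share can be computed directly. The slide gives the counts as "more than" values, so the ratios below are approximate; the constant and function names are hypothetical.

```python
# Figures from the slide (17 November 2008); both totals are stated as "more than".
TOTAL_OCCURRENCES = 30_000_000
UNIQUE_STRINGS = 4_400_000
VERIFIED_OCCURRENCES = 23_700_000
VERIFIED_UNIQUE = 1_200_000


def share(part, whole):
    """Return part as a percentage of whole, rounded to one decimal place."""
    return round(100 * part / whole, 1)


if __name__ == "__main__":
    print("verified occurrences:", share(VERIFIED_OCCURRENCES, TOTAL_OCCURRENCES), "%")
    print("verified unique strings:", share(VERIFIED_UNIQUE, UNIQUE_STRINGS), "%")
```

On these numbers roughly 79% of mined occurrences were verified, but only about 27% of the unique strings.
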
    23. BHL & JSTOR
       • Complementary efforts
           • Preservation & distribution of scholarly content
       • Yet distinct
           • BHL has thousands of monographs
           • Rare materials
           • Content selected specifically for taxonomists & parataxonomists
           • All BHL content is open access
           • For now, BHL is focused on legacy content; JSTOR on contemporary
    24. How can BHL enrich LAPI?
       Links to:
       • Protologues
       • All occurrences of a name
       • Historic texts
       • Illustrations & maps
    25. Contact
       Chris Freeland
       4344 Shaw Blvd.
       St. Louis, MO 63110
       [email_address]
       http://www.botanicus.org
       http://www.biodiversitylibrary.org
