Digital Libraries for Science: Botanicus and the Biodiversity Heritage Library - Presentation Transcript
Digital Libraries for Science: Botanicus & Biodiversity Heritage Library
Chris Freeland
Director of Bioinformatics, Missouri Botanical Garden
Technical Director, BHL
Why scan old books?
"The cited half-life of publications in taxonomy is longer than in any other scientific discipline * * * The decay rate is longer than in any scientific discipline" - Macro-economic case for open access, Tom Moritz
Botanicus.org
Workflow: Selection, Preparation, Post Production, (Re)publication, Digitization, Conservation
Selection process
Title: Protologues
Das Pflanzenreich: 5,455
…: …
Species Plantarum: 5,736
Bulletin of the Torrey Botanical Club: 5,853
Repertorium Specierum Novarum Regni Vegetabilis: 7,578
Journal of the Linnean Society, Botany: 7,599
Flora Brasiliensis: 9,833
Prodromus Systematis Naturalis Regni Vegetabilis: 11,757
Linnaea: 12,695
Revisio Generum Plantarum: 13,548
Botanische Jahrbücher für Systematik…: 15,052
Digitization process
6 full-time scanning technicians
3 Indus 5002 book scanners
1 Kodak i280 sheet-feed scanner
Demonstration: Connecting a name with its protologue
Citation resolver: Vol., Title, Part, Page, Year
How we make the connections
From Tropicos:
Store structured citation info, not free text
Volume: 2
Issue: 4
Start Page: 358
*NOT*: 2(4): 358
Maintain authority files for bibliographic materials, including Botanicus TitleIDs
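The difference between structured citation fields and a free-text string can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the `Citation` class, the `title_id` value, the regular expression, and the `lookup_key` helper are hypothetical and are not the Tropicos or Botanicus schema.

```python
import re
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Citation:
    """Citation stored as separate fields rather than one free-text string."""
    title_id: int            # hypothetical TitleID taken from an authority file
    volume: str
    issue: Optional[str]
    start_page: str


# Hypothetical pattern covering only one free-text form, e.g. "2(4): 358".
FREE_TEXT = re.compile(
    r"^\s*(?P<volume>\d+)\s*(?:\((?P<issue>[^)]+)\))?\s*:\s*(?P<page>\d+)\s*$"
)


def parse_free_text(title_id: int, text: str) -> Optional[Citation]:
    """Try to recover fields from free text; return None if the form is unknown."""
    m = FREE_TEXT.match(text)
    if m is None:
        return None
    return Citation(title_id, m.group("volume"), m.group("issue"), m.group("page"))


def lookup_key(c: Citation) -> Tuple[int, str, str]:
    """Key that a resolver could use to find the scanned page for a citation."""
    return (c.title_id, c.volume, c.start_page)


structured = Citation(title_id=1, volume="2", issue="4", start_page="358")
parsed = parse_free_text(1, "2(4): 358")          # works for this one form
unparsed = parse_free_text(1, "vol. 2, p. 358")   # any other form fails
```

Storing the fields directly avoids the parsing step entirely; a free-text parser has to anticipate every citation style in advance, and anything it does not recognize cannot be linked.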
Ernst Mayr Library of the Museum of Comparative Zoology, Harvard University
University of Illinois
Scanning Operations
BHL uses scanning centers established by Internet Archive for mass scanning.
Some partner libraries also scan in-house.
Want to expand international footprint:
mirrored content
ingest from global data providers
Locations of BHL/IA Scanning Centers
BHL Progress To Date:
Nearing:
24,000 volumes
10 million pages
… growing daily…
Freely available at www.biodiversitylibrary.org
Open Access Literature
Flora de la Provincia de Buenos Aires.
Publisher: Buenos Aires: M. Biedma è Hijo, 1905.
Formats: PDF, OCR, XML, JP2
Name Finding via TaxonFinder
Raw image
Converted to text via OCR
Name finding via TaxonFinder
Extract names
Submit to NameBank
SOAP response
Name finding in action with Taxonomic Intelligence…
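The name-finding steps listed above (OCR text in, candidate names extracted, candidates verified against a name authority) can be sketched roughly as follows. This is not TaxonFinder's algorithm or the NameBank service: the regular expression is a deliberately naive stand-in heuristic, and `KNOWN_NAMES` is a hypothetical local set used in place of a remote lookup.

```python
import re
from typing import Iterable, List, Set

# Naive heuristic: a capitalized word followed by a lowercase word,
# e.g. "Quercus alba". Real name finders are far more sophisticated.
BINOMIAL = re.compile(r"\b([A-Z][a-z]+) ([a-z]{3,})\b")

# Hypothetical stand-in for a name authority such as NameBank.
KNOWN_NAMES: Set[str] = {"Quercus alba", "Acer rubrum"}


def find_candidates(ocr_text: str) -> List[str]:
    """Extract candidate name strings from OCR text."""
    return [f"{genus} {epithet}" for genus, epithet in BINOMIAL.findall(ocr_text)]


def verify(candidates: Iterable[str], known: Set[str] = KNOWN_NAMES) -> List[str]:
    """Keep only the candidates that the authority recognizes."""
    return [name for name in candidates if name in known]


page = "The white oak, Quercus alba, grows with Acer rubrum. These trees are common."
candidates = find_candidates(page)
# candidates: ["The white", "Quercus alba", "Acer rubrum", "These trees"]
verified = verify(candidates)
# verified: ["Quercus alba", "Acer rubrum"]
```

The gap between candidates and verified names in this toy example mirrors the two kinds of counts reported for BHL: name strings found in the text versus name strings confirmed by an authority.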
BHL Name Finding Stats to date *
Have mined more than 30 million name string occurrences
4.4 million unique
More than 23.7 million name strings verified by NameBank
1.2 million unique
* As of 17 November 2008
BHL & JSTOR
Complementary efforts
Preservation & distribution of scholarly content
Yet distinct
BHL has thousands of monographs
Rare materials
Content selected specifically for taxonomists & parataxonomists
All BHL content is open access
For now, BHL is focused on legacy content; JSTOR on contemporary