Harvesting Heritage: Seed & Nursery Catalog Digitization, Discovery & Access


Collaborative digitization of seed and nursery catalogs, delivered through the Biodiversity Heritage Library (BHL). As part of the Purposeful Gaming project, with funding support from IMLS, Cornell University, New York Botanical Garden, Missouri Botanical Garden, and the Harvard University Museum of Comparative Zoology collaborate to digitize content, create a game to crowdsource the correction of inaccurate OCRed text in BHL, and crowdsource the transcription of handwritten field notebooks, improving discovery and access via full-text searching.

Published in: Science

  1. 1. Harvesting Heritage: Seed & Nursery Catalog Digitization, Discovery & Access • Marty Schlabach, Food & Agriculture Librarian, Mann Library, Cornell University • Cornell University Reunion, Mann Library, June 5, 2015
  2. 2. Listings of fruit and vegetables with descriptions Amateur's Fruit and Ornamental Trees, York, PA 1867
  3. 3.
  4. 4. Ethel Zoe Bailey 1889-1983 • Curated the seed & nursery collection for 70+ years, 1911-1980s • Daughter of Liberty Hyde Bailey • Graduated from Smith College 1911 • Assisted her father on many of his plant collection trips around the world • Assisted her father with, and coauthored, many of his botany and horticultural publications • Awarded the George Robert White Medal in 1967 by the Massachusetts Horticultural Society • Awarded the Smith College Medal in 1970 • Continued curating the seed & nursery catalog collection until her death in 1983 at the age of 93 • Had a long and distinguished career in botany and horticulture
  5. 5. Why were seed & nursery catalogs collected? How was the collection used? LH Bailey used the collection in support of his publications, such as: • Hortus Third • Manual of Cultivated Plants • Annals of Horticulture
  6. 6. How was it used? 1889 Introduction of Vick’s Irondequoit Melon, as found in Annals of Horticulture, by LH Bailey
  7. 7. Why Digitize Seed & Nursery Catalogs? • Taxonomists • Discover dates of early introductions of new plants • Gardeners • Peruse old catalogs for historical availability and uses of traditional cultivars of heirloom annuals and perennials • Museums and botanical gardens • Recreate historical gardens • Plant breeders • Look for descriptions of plants with unique disease and pest resistance • Historians of art and illustration • Drawn to the striking representations of flowers, fruit & vegetables • Historians of printing • Catalogs documented changes in printing • Text-only broadsides & pamphlets • Multipage booklets with engraved illustrations • Colorful lithographs added • Photographic illustrations, b&w and later color
  8. 8. Collaborative Digitization Effort • New York Botanical Garden • LuEsther T. Mertz Library • 50,000+ catalogs • National Agricultural Library, USDA • Henry G. Gilbert Nursery & Seed Trade Catalog Collection • 200,000+ catalogs • Missouri Botanical Garden • Peter H. Raven Library • Seed exchange lists • Cornell University • Cornell University Library • Liberty Hyde Bailey Hortorium (Plant Specimen Collection) • Ethel Zoe Bailey Horticultural Catalog Collection • 130,000+ catalogs
  9. 9. Cornell’s Digitization • Public domain items only • i.e. no longer under copyright • Pre-1923 American: public domain • Post-1922 American: research needed to determine status • Non-US catalogs • different copyright laws apply • Began by scanning catalogs from firms that carried grapevines • Ethel Bailey indexed catalogs by species & cultivated variety (cultivar) • Then scanned catalogs from firms not held by project partners • Funding in part provided by • Institute of Museum and Library Services (IMLS)
  10. 10. Seed & Nursery Catalog Digital Collection • Biodiversity Heritage Library (BHL) • Collaboration among 20 botanical garden, natural history museum and academic libraries • 157,532 volumes • 45,568,082 pages • Seed & Nursery Catalog Collection • A subcollection in BHL • Currently 15,437 volumes and growing • 769,163 pages • Combining NAL, NYBG, MBG and Cornell digitized catalogs
  11. 11. Historical Development of American Seed & Nursery Catalogs • Broadside • Multi-page, text only • Engraved & woodcut images added to text • Hand-colored engravings • Lithography • Chromolithography • Black & White photography • Color photography • Today: websites
  12. 12. Broadside One sheet only Approx 12”x18” (Prince Nursery, 1793)
  13. 13. Text only, multi-page (J. B. Russel’s Catalog of Garden Seeds, 1827)
  14. 14. Engravings added to the text (James Vick, 1868)
  15. 15. Hand-colored engravings (D. M. Dewey, Rochester, NY, n.d.)
  16. 16. Black & White Lithography, followed by Chromolithography
  17. 17. Black & White Photography (A. Currie & Co, 1923) Color Photography (Miss Ella V. Baines, 1928)
  18. 18. Garden & Farm Tools & Equipment (Mann’s Superior Seeds, 1934)
  19. 19. Beekeeping and Dairy Supplies (Mann’s Superior Seeds, 1934)
  20. 20. Poultry, Supplies and Equipment (Mann’s Superior Seeds, 1934)
  21. 21. Purposeful Gaming and BHL: engaging the public in improving & enhancing discovery & access to digital texts • National Leadership Grant for Libraries awarded to Missouri Botanical Garden in St Louis • Partners include Harvard, New York Botanical Garden, Cornell • Runs Dec 2013-Nov 2015 • Funded in part by IMLS
  22. 22. BHL Problem Statement: • Major challenge for digital libraries: • full-text searching of scanned texts is significantly hampered by poor output from Optical Character Recognition (OCR) software. • Historic literature has proven to be particularly problematic because of its tendency to have • varying fonts • varying typesetting • varying layouts • ink bleed-through • foxing • other physical condition issues
  23. 23. How are we engaging the public in improving and enhancing discovery of and access to digital texts in BHL? • Building an online game to crowdsource the correction of inaccurate OCR • Crowdsourcing the transcription of inaccurate OCR and handwritten texts • Adding new content types upon which to test the approach • Seed & Nursery Catalogs & Seed Exchange Lists • Test OCR correction on this content type • Crowdsource the transcription when needed • Field Notebooks • Handwritten, OCR virtually impossible • Crowdsource the transcription
  24. 24. Beanstalk
  25. 25. What happens with the game output? • When multiple players enter the same character string for a word, the system considers it correct • The string of characters, or the correct word, is added to the index • It is made available for searching and improves discoverability • Games released June 9, 2015
  26. 26. Thanks to my colleagues at: • Mann Library, Cornell University (especially Carol Lowe) • Liberty Hyde Bailey Hortorium, Cornell University (especially Bob Dirig, retired, for details of Ethel Zoe Bailey’s life) • Biodiversity Heritage Library • Mertz Library, New York Botanical Garden • Peter H. Raven Library, Missouri Botanical Garden • National Agricultural Library, USDA • Institute of Museum and Library Services
  27. 27. Questions??
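
The handling of game output described on slide 25 (agreement among players, then adding the agreed word to the search index) could be sketched roughly as follows. This is only an illustration, not the project's actual code: the function names `consensus_word` and `update_index`, the `min_agreement` threshold of 2, the tie handling, and the dictionary-based index are all assumptions, since the slides do not say how many players must agree or how the BHL index is stored.

```python
from collections import Counter


def consensus_word(entries, min_agreement=2):
    """Return the transcription most players agree on, or None.

    `entries` holds the strings typed by different players for one word.
    `min_agreement` is an assumed threshold; the slides only say
    "multiple players".
    """
    # Drop empty submissions and surrounding whitespace; keep case as typed.
    cleaned = [e.strip() for e in entries if e.strip()]
    if not cleaned:
        return None
    ranked = Counter(cleaned).most_common(2)
    top, top_count = ranked[0]
    if top_count < min_agreement:
        return None  # not enough players agree yet
    if len(ranked) > 1 and ranked[1][1] == top_count:
        return None  # tie between two readings: leave unresolved (assumption)
    return top


def update_index(index, page_id, accepted_words):
    """Add accepted words to a hypothetical word -> set-of-pages index."""
    for word in accepted_words:
        # Lowercase the key so searches are case-insensitive (assumption).
        index.setdefault(word.lower(), set()).add(page_id)
    return index


if __name__ == "__main__":
    # Three hypothetical players transcribe the same word from a catalog page.
    word = consensus_word(["Irondequoit", "Irondequoit", "Irondeqnoit"])
    index = update_index({}, "page-001", [word] if word else [])
    print(word, index)
```

Requiring a strict majority winner and refusing ties is one conservative choice; a real system might instead route unresolved words back into the game for more players to see.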