Presentation ala2010


Published in: Education

  1. 1. A Mass Scanning Workflow Discussion: Sociological aspects of a global mass scanning project. Biodiversity Heritage Library. Suzanne C. Pilsk, Smithsonian Institution Libraries; Matthew Person, MBLWHOI Library, Woods Hole. June 28, 2010
  2. 2. Biodiversity Heritage Library. "In any well-appointed Natural History Library there should be found every book and every edition of every book dealing in the remotest way with the subjects concerned. One never knows wherein one edition differs from or supplements the other and unless these are on the same table at the same time it is not possible to collate them properly. Moreover for accurate work it is necessary for the student to verify every reference he may find; it is not enough to copy from a previous author; he must verify each reference itself from the original." Charles Davies Sherborn, Epilogue to Index Animalium, March 1922. [Portrait caption: Charles Davies Sherborn (1861-1942)]
  3. 3. ● Vision ● Application ● Interaction ● Results
  4. 4. E.O. Wilson: A single webpage for every living organism
  5. 5. How do you convert THIS into 0s and 1s?
  6. 6. You Meet… and discuss… and meet… and discuss… ● 2003. Telluride. Encyclopedia of Life meeting ● February 2005. London. Library and Laboratory: the Marriage of Research, Data and Taxonomic Literature ● May 2005. Washington. Ground work for the Biodiversity Heritage Library ● June 2006. Washington. Organizational and Technical meeting ● August 2006. New York Botanical Garden. BHL Director's Meeting ● October 2006. St. Louis/San Francisco. Technical meetings ● February 2007. Museum of Comparative Zoology. Organizational meeting ● May 2007. Washington DC. Encyclopedia of Life and BHL Portal Launch. …and you follow through..
  7. 7. Members ● American Museum of Natural History (New York) ● Botany Libraries, Harvard University ● Ernst Mayr Library of the Museum of Comparative Zoology, Harvard University ● Field Museum (Chicago) ● Marine Biological Laboratory / Woods Hole Oceanographic Institution Library ● Missouri Botanical Garden (St. Louis) ● Natural History Museum (London) ● New York Botanical Garden (New York) ● Royal Botanic Gardens, Kew ● Smithsonian Institution Libraries (Washington) ● Academy of Natural Sciences (Philadelphia) ● California Academy of Sciences (San Francisco)
  8. 8. Contributing Members and Partners ● Internet Archive ● California Digital Library ● University Library of the University of Illinois at Urbana-Champaign
  9. 9. Institutions formed agreements; the mass scanning workflow quickly began
  10. 10. People Do The Work
  11. 11. *Who has what? *What should we scan and when? *Monographs vs Serials *Series treated as separates *Can it be found and used once scanned?
  12. 12. Initial Metadata Analysis: We have 1.3 million catalogue records. 73% are monographs (the remainder are serials at title level). 63% is English-language material. The next most common language (9%) is German. About 30% of the material was published before 1923.
  13. 13. The Worker Bees ● Telephone conversations ● Email strings ● Working documents ● Face to face meetings ● Presentations ● Articles ● Going beyond self expectations was the norm
  14. 14. worker bees inside the beehive…
  15. 15. Mass Scanning Workflow ● Local data flow ● Vendor data flow ● WonderFetch™ ● Return of data ● Return of material ● Quality Assurance ● Billing
  16. 16. [Flowchart: local selection and pre-scanning workflow; the layout was lost in extraction. Recoverable step labels: EOL species data ● Bibliographic Curator ● Data from SIRIS ● Request ● Evaluate need (need is… "gap-fill" for other BHL library) ● Picklist database stores select/reject/ship state and item metadata ● Goin' down the rows ● Select title in picklist ● Serial or monograph? ● Upload to de-duper ● Duplicate? ● Other library "bid"? ● "Bid" on title, select in picklist ● Reject in picklist, return to stacks ● The Stacks ● Pull from stacks, circ in ILS ● Preliminary metadata check and physical check (pass/fail) ● Preservation review (pass/fail) ● Metadata check ● Circ to cataloging for MARC editing ● Update picklist if item record has been changed during cataloging touch-up ● Circ to scanner ● Circ in Horizon ● Put on shipping cart, generate "packing list" invoice ● Carts delivered to scanner]
  17. 17. [Flowchart: IA scanning process and return of material; the layout was lost in extraction. Recoverable step labels: Put on shipping cart, generate "packing list" invoice, alert scanning center ● Carts delivered to scanner ● Unique IA id is assigned ● Metadata is gathered from SIRIS and the picklist db and associated with the scan ● JP2000 generated ● QA is done by IA on 10% ● BHL Portal periodically harvests Marc.xml (bib) and item records, along with JP2000s, transformed to index and display, and served in the portal ● Books are returned, cart contents are verified against invoice ● SIL does 20% QA, checking for metadata matching with item, scan quality etc. ● Pass QA? If no: update picklist to indicate rescan. If yes: updated in picklist as scanned ● Download .csv from portal with SIL barcodes and Portal URLs ● Send URLs to SIRIS Office for batch updates ● Circ in Horizon ● Place BHL sticker near barcode ● Return to stacks]
  18. 18. The Workflow Process ● Select book; pull from shelf ● Review physically and check metadata ● Establish viability and create pick/pack list / WonderFetch™ ● Send to IA scanning center
  19. 19. Monographic DeDuper
  20. 20. Serials Deduping, merging, bidding…an ordering process.
  21. 21. OCLC matching holdings institutions
  22. 22. Potential merge-dedupe alert ahead...
  23. 23. Don’t press the wrong button
  24. 24. Indicate which records you intend to consider as a single record
  25. 25. Choose the more complete record
  26. 26. 2 records merged into a single record for this title. Holdings remain distinct for each institution. Potential second bid brewing…
  27. 27. Last step in the workflow?
  28. 28. another view of the beehive
  29. 29. calling worker bees to the beehive
  30. 30. the bee-skyve
  31. 31. 24 April 2009 The following came from a public librarian in Falmouth, Massachusetts: "We recently were asked the question: who discovered the zebra fish? In searching the Encyclopedia of Life I kept seeing the phrase “Hamilton, 1822” next to the “danio rerio”. Wondering who Hamilton was, I searched WorldCat and discovered that Hamilton was Francis Hamilton who had published in 1822 An account of the fishes found in the river Ganges and its branches. I looked at the EOL record and clicked on the Biodiversity Heritage Library link. One of the links was to a Hamilton book! In 1878 the book The Fishes of India was published which included a description and a image of the danio rerio. Links were provided to the exact place in the text where the fish was mentioned, as well as to the plate with the fish itself illustrated. Not only that, but I could send the patron the exact link to both pages which described her fish. How remarkable it was to find this Harvard University book available so easily through the Biodiversity Heritage Library. A great success for our patron, and we looked like magicians bringing the book to her."
  32. 32. ● Gary Anderson, Professor Emeritus at the University of Southern Mississippi. He used to make an annual trip to our stacks to xerox hundreds of articles at a time. ● “The Biodiversity Heritage Library is a valuable resource for acquiring crustacean literature. At present, a search there (http:// at=) will turn up 5 publications (one of which was not contributed by the Smithsonian). Also note that the BHL has scanned these and additional literature at the site for taxonomic terms, and provides links to those documents. There are 1592 "hits" for Pycnogonida. It is likely that you could turn up a lot of additional articles within larger works that way. Alternatively, you could perform searches for volumes of interest (if you know of specific references), to home in on the papers you want. There will be A LOT of additional material becoming available at that site.”
  33. 33. “Yesterday whilst reading the latest edition of The Entomologist's Record I was pleased to find that early editions of this invaluable publication, edited by the seminal entomologist James Tutt (no relation to Elvis's drummer as far as I am aware) are available digitised […] So I went there, and was amazed at what I found. They even have a blog. What a fantastic project!!!” From the blog: resource.html
  34. 34. […] Michael, a colleague researching wasps, was excited that he had discovered in the Biodiversity Heritage Library a copy of an obscure 1860s book: Saussure, H. de & Sichel, J. (1864). Catalogue des espèces de l'ancien genre Scolia, contenant les diagnoses, les descriptions et la synonymie des espèces, avec des remarques explicatives et critiques. Genève & Paris : Henri Georg & V. Masson et Fils pp. 1–350. This book was not in our library, probably not in Australia, and almost impossible to get hold of without travelling to the northern hemisphere. Thanks to the BHL for their work in providing access to works of importance. Michael is now able to use detailed content of this book in his work. John Tann, Australian Museum
  35. 35. The Biodiversity Heritage Library : Advancing Metadata Practices in a Collaborative Digital Library Suzanne C. Pilsk, Smithsonian Institution Libraries; Matthew Person, MBLWHOI Library; Joseph deVeer, Ernst Mayr Library, Museum of Comparative Zoology; John F. Furfey, MBLWHOI Library; Martin R. Kalfatovic, Smithsonian Institution Libraries Abstract: The Biodiversity Heritage Library is an open access digital library of taxonomic literature, forming a single point of access to this collection for use by a worldwide audience of professional taxonomists, as well as “citizen scientists.” A successful mass scanning digitization program, one that creates functional and findable digital objects, requires thoughtful metadata workflow that parallels the workflow of the physical items from shelf to scanner. This article examines the needs of users of taxonomic literature, specifically in relation to the transformation of traditional library material to digital form. It details the issues that arise in determining scanning priorities, avoiding duplication of scanning across the founding twelve natural history and botanical garden library collections, and the problems related to the complexity of serials, monographs, and series. Highlighted are the tools, procedures, and methodology for addressing the details of a mass scanning operation.
  36. 36. A Mass Scanning Workflow Discussion Thanks to All Staff of the BHL Members
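
Slides 24 to 26 describe the serials merge step: mark two records as the same title, choose the more complete record, and keep each institution's holdings distinct. A minimal Python sketch of that idea follows. It is not the BHL de-duper itself; the `BibRecord` class, the `completeness` score (a count of non-empty fields), and the sample values are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class BibRecord:
    """Hypothetical bibliographic record; not the real BHL data model."""
    record_id: str
    fields: dict[str, str]
    # Holdings are keyed by institution so they stay distinct after a merge.
    holdings: dict[str, list[str]] = field(default_factory=dict)


def completeness(record: BibRecord) -> int:
    """Assumed scoring rule: count the non-empty descriptive fields."""
    return sum(1 for value in record.fields.values() if value.strip())


def merge_records(a: BibRecord, b: BibRecord) -> BibRecord:
    """Keep the more complete record; carry over holdings from both."""
    keep, other = (a, b) if completeness(a) >= completeness(b) else (b, a)
    merged_holdings = {inst: list(vols) for inst, vols in keep.holdings.items()}
    for inst, vols in other.holdings.items():
        target = merged_holdings.setdefault(inst, [])
        for vol in vols:
            if vol not in target:  # avoid listing the same volume twice
                target.append(vol)
    return BibRecord(keep.record_id, dict(keep.fields), merged_holdings)


# Illustrative data only.
sparse = BibRecord("rec-1", {"title": "Example Serial", "publisher": ""},
                   {"Library A": ["v.1"]})
fuller = BibRecord("rec-2", {"title": "Example Serial", "publisher": "Example Press"},
                   {"Library B": ["v.1", "v.2"]})
merged = merge_records(sparse, fuller)
print(merged.record_id, merged.holdings)
```

Keying holdings by institution is what lets a single merged title record still show who holds which volumes, which matters when a second library later bids to scan the volumes the first one lacks.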