Siri sgrpmtg05092013


  • A free & open access digital library for biodiversity literature and primary source materials (field books). A consortium of 15 libraries working together to run a virtual library branch. A collection of content from the 15-member BHL consortium and other Internet Archive contributors. Anyone is free to access & download BHL materials.
  • SIL employees work to scan SIL content. SIL also hosts the BHL Secretariat: BHL Program Director, BHL Program Manager, BHL Collections Coordinator. Nancy Gwinn = BHL Executive Chair. Federal support received for the past 2 years and ongoing!
  • The 15 BHL member libraries work together to select unique content for scanning. Then we send the books from our shelves and the metadata from our library catalogs to the Internet Archive for scanning. The Internet Archive does the heavy lifting of digitization, derivative file creation, and packaging all image and metadata files together for storage. We harvest the files from the IA database to our BHL database, which is managed by the Missouri Botanical Garden in St. Louis.
  • Overview of types of metadata we need. Metadata flows from our library catalogs to the Internet Archive and then to BHL. We derive the metadata we display in the BHL website from the original MARC record of the contributing library.
  • Example of the original MARC record in SIRIS and in the backend BHL database vs. the metadata derived from the original MARC and displayed on the BHL website. Notice also the differences in the volume information. This is because AMNH contributed some volumes in addition to the SIL-contributed volumes. It is often the case that multiple BHL member libraries need to work together to complete a series. We don’t have the time to standardize volume metadata coming from different libraries at the time of scanning, but we can modify this information after it appears in our collection.
  • Curating the BHL collection = critical piece of post-digitization workflow. Requires login. Web-based administrative interface to access the backend BHL database so that we can make corrections to our collection as necessary. With over 60,000 titles and 114,000 volumes, how do we manage our curation activities?!
  • User feedback is key; we rely on the many eyes of the crowd to help us direct our curation activities to the content people are actually using. Users can let us know if they find a problem with something in our collection through our general feedback form, and can place a request for something to be scanned through our scanning request form.
  • BHL uses an issue tracking system, known as Gemini, to manage the feedback we receive from users. Nearly all consortium member libraries participate in responding to user feedback via this system. Essential to BHL day-to-day work. Key to communicating at the level of granularity we need. Excellent documentation tool.
  • The majority of the content in the BHL collection is public domain. However, we have agreements to provide access to over 270 in-copyright titles under a Creative Commons Attribution-NonCommercial-ShareAlike license. As part of the volume metadata, we include data about copyright status and licensing if applicable (3 different tiers). As an open access project it is critical that we manage our copyright metadata; focus on managing in-copyright as well as “due diligence” volumes. Open data available under a Creative Commons Zero license = public domain dedication.
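
The three copyright tiers mentioned in the notes (and spelled out on slide 11) can be pictured as a small lookup. The following is only a minimal sketch: the tier names, status language, and license come from the slide, while the `CopyrightTier` class, the `suggest_tier` function, and its decision order are hypothetical simplifications. In practice the status is determined by the scanning institution, not by a date rule alone.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CopyrightTier:
    status: str
    status_language: str
    license_name: Optional[str]  # None where the slide lists "NA"


PERMISSIONS = CopyrightTier(
    "Permissions",
    "In copyright. Digitized with the permission of the rights holder.",
    "CC-BY-NC-SA",
)
DUE_DILIGENCE = CopyrightTier(
    "Due Diligence",
    "No known copyright restrictions as determined by scanning institution.",
    None,
)
PUBLIC_DOMAIN = CopyrightTier(
    "Public Domain",
    "Public domain. The BHL considers that this work is no longer "
    "under copyright protection.",
    None,
)


def suggest_tier(year: int, us_publication: bool,
                 has_permission: bool) -> Optional[CopyrightTier]:
    """Hypothetical first-pass rule; a person still makes the real call."""
    if year < 1923:
        return PUBLIC_DOMAIN
    if has_permission:
        return PERMISSIONS
    if us_publication and year <= 1977:
        return DUE_DILIGENCE
    return None  # no tier applies; the volume needs manual review
```

For example, `suggest_tier(1893, False, False)` returns the public domain tier, while a 1990 publication without permission returns `None` and would need manual review.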

    1. Biodiversity Heritage Library: A Mass Scanning Mix of Metadata
       Bianca Crowley, Collections Coordinator
       Biodiversity Heritage Library
       Smithsonian Libraries
       Jun-13
    2. BHL Overview
       • New user interface launched in March
       • Search by title, author, article, subjects and scientific names
       • Various download options, even high resolution
       • Taxonomic name finding algorithm
       • Machine-to-machine services
    3. Core BHL Member Institutions
       Smithsonian Libraries: 6,800+ titles; nearly 18,000 volumes; 7 million+ pages
    4. Digitization Workflow
    5. Metadata
       1. Titles vs. Items vs. Segments
       2. Metadata we need:
          • MARC for book and journal titles
          • Volume information
          • Page data

       BHL Term | Library Term                | Meaning                     | Metadata
       Title    | Book or Journal Titles      | Conceptual Unit             | MARC record
       Item     | Volume, Piece               | Object                      | Derived from holdings + created @ digitization
       Segment  | Article, Book Chapter, Part | Section of consecutive pgs  | Harvested from BioStor.org or created post digitization
    6. Smithsonian Libraries Record
       BHL Record
    7. Metadata Challenges
       • BHL collection aggregates metadata from 15 member library catalogs
       • Also aggregating metadata from a couple hundred Internet Archive contributors
       • Default page metadata created at time of scanning lacks detail esp. for plates, figures, etc.
       • Taxonomic name finding algorithm only as good as optical character recognition (OCR)
    9. User Feedback is Critical
       General feedback form
       Scan request form
    10. Gemini Issue Tracking System
    11. ©opyright Metadata

        Copyright Status                     | Status language                                                                           | License
        Permissions (1923 in-copyright pubs) | In copyright. Digitized with the permission of the rights holder.                         | Creative Commons Attribution, Non-Commercial, Share-Alike (CC-BY-NC-SA)
        Due Diligence (1923-77 US pubs)      | No known copyright restrictions as determined by scanning institution.                    | NA
        Public Domain (pre-1923)             | Public domain. The BHL considers that this work is no longer under copyright protection.  | NA

        • We also have an open data policy – metadata 100% available for reuse under a CC0 license (public domain)
    12. Impact
        • “BHL came to the rescue when a planned trip to work in the Mertz Library at The New York Botanical Garden had to be cancelled due to Hurricane Sandy. Thanks to the online resources available through BHL I was able to source most of the key works I needed, with their supporting bibliographic information. Further use of BHL occurred when building work at the Linnean Society of London limited access to some of the book I had been able to use from that collection.”
        • “I would like thank you all very much for invaluable work and support you do. I just got a pdf-file from more than century old (1893) journal paper (regional naturalist society paper, published in Finland), to get copy I should take 500 mile drive to our university library. Now I am got it fastly in high-quality pdf-copy. Cordial thanks and all success in continuing your highly valuable mission.” [conservation biologist from Estonia]
        • “You are a wonderful resource. I maintain a Website that describes the plant genus Opuntia (prickly pear cacti). There is no way I could maintain such a site without access to literature from 100-200 years ago. Most of the cactus species were discovered long ago; I find it invaluable to put up PDF files to document each species in the literature as I document them photographically. I am a botanist, but I work in the pharmaceutical field (not so many botanical jobs out there). Your library makes it possible for me to continue working with plants in a meaningful and scientific manner.”
    13. • Repackaging content in new ways for new audiences via:
          – flickr, Facebook, Twitter, & Pinterest
          – iTunes U & iBooks
        • Open data & APIs
          – Put content where users are already working (Encyclopedia of, Int’l Plant Names,
          – Gets power users to work for us (for free!),
    14. Questions?
        Bianca Crowley
        crowleyb@si.edu
        Thank you
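
The Title / Item / Segment terms from slide 5 can be sketched as a nested data structure. This is a rough illustration under stated assumptions, not the actual BHL database schema: the class names follow the slide's BHL terms, but every field name, the `contributors` helper, and the sample values are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Segment:
    """Article, book chapter, or part: a section of consecutive pages."""
    name: str
    first_page: int
    last_page: int
    source: str  # e.g. harvested from BioStor.org or created post digitization


@dataclass
class Item:
    """Volume or piece: the physical object that was scanned."""
    volume: str
    contributor: str  # member library that supplied the volume
    segments: List[Segment] = field(default_factory=list)


@dataclass
class Title:
    """Book or journal title: the conceptual unit, described by a MARC record."""
    name: str
    marc_record_id: str  # placeholder identifier, not a real record number
    items: List[Item] = field(default_factory=list)

    def contributors(self) -> List[str]:
        """Libraries that contributed at least one volume of this title."""
        return sorted({item.contributor for item in self.items})


# Made-up sample: one series completed by two member libraries.
example = Title(
    name="Example journal",
    marc_record_id="example-0001",
    items=[
        Item(volume="v.1", contributor="SIL"),
        Item(volume="v.2", contributor="AMNH"),
        Item(volume="v.3", contributor="SIL"),
    ],
)
print(example.contributors())
```

Keeping the contributor on the Item rather than on the Title reflects the point in the notes that several member libraries often complete one series together.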