The Biodiversity Heritage Library: A Cornerstone of the Encyclopedia of Life

    The Biodiversity Heritage Library: A Cornerstone of the Encyclopedia of Life - Presentation Transcript

    1. The Biodiversity Heritage Library: A Cornerstone of the Encyclopedia of Life. Martin R. Kalfatovic, Smithsonian Institution Libraries
    2. Biodiversity Heritage Library
    3. Structure of the Encyclopedia of Life
    4. Serine Molecule (structural diagram)
    5. Education & Outreach (Smithsonian/Harvard); Informatics (Marine Biological Laboratory); Secretariat (Smithsonian); Synthesis Center (Field Museum); Biodiversity Heritage Library
    6. Biodiversity Heritage Library
      • 2003, Telluride. Encyclopedia of Life meeting
      • February 2005. London. Library and Laboratory: the Marriage of Research, Data and Taxonomic Literature
      • May 2005. Washington. Ground work for the Biodiversity Heritage Library
      • June 2006. Washington. Organizational and Technical meeting
      • August 2006. New York Botanical Garden. BHL Director’s Meeting.
      • October 2006. St. Louis/San Francisco. Technical meetings
      • February 2007. Museum of Comparative Zoology. Organizational meeting
    7. Biodiversity Heritage Library
      • American Museum of Natural History (New York)
      • Field Museum (Chicago)
      • Natural History Museum (London)
      • Smithsonian Institution (Washington)
      • Missouri Botanical Garden (St. Louis)
      • New York Botanical Garden (New York)
      • Royal Botanic Garden, Kew
      • Botany Libraries, Harvard University
      • Ernst Meyer Library of the Museum of Comparative Zoology, Harvard University
      • Marine Biological Laboratory / Woods Hole Oceanographic Institution
      James Dwight Dana, Zoophytes. Atlas, 1849
    8. Taxonomic Literature
      • Over 250 years of systematic description of life
      • The cited half-life of publications in taxonomy is longer than in any other scientific discipline
      • The decay rate is slower than in any other scientific discipline
        • Tom Moritz
    9. Literature Repatriation
      Biologia Centrali-Americana. Edited by Frederick Ducane Godman and Osbert Salvin. London: Pub. for the editors by R. H. Porter, 1879-1915
    10. Digital Divide?
    11. Digital Divide?
      Vishwas Chavan travels a lot. An informatician based at the National Chemical Laboratory in Pune, India, he collects data on what types of animal live where in India to enter into a biodiversity database … Much of the information Chavan seeks is in old, out-of-print tomes … To find them, Chavan has spent years trailing around libraries. He dreams of the day when books such as these are scanned and made available as digital files on the Internet.
      “Science in the Web Age: The Real Death of Print” by Andreas von Bubnoff, Nature 438, 550-552, 1 December 2005
      Henry Walter Bates, The Naturalist on the River Amazons, 1863
    12. Narrowing the Divide
      • Core literature pre-1923: 400,000 (80 million pages)
      • All pre-1923: 600-750,000 (120-150 million pages)
      • All literature: 1.4-1.6 million (280-320 million pages)
      Biodiversity Heritage Library
      Mass. Zoological and Botanical Survey, Reports on the fishes, reptiles and birds of Massachusetts, 1839
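
The page totals on this slide are consistent with a single conversion factor. The sketch below assumes the unbracketed figures count volumes and that each volume averages about 200 pages; neither assumption is stated on the slide, and both are inferred from the numbers themselves. The names `PAGES_PER_VOLUME`, `ESTIMATES`, and `page_range` are illustrative only.

```python
# Assumption: about 200 pages per volume, inferred from the slide's own figures.
PAGES_PER_VOLUME = 200

# Volume ranges (low, high) as given on the slide.
ESTIMATES = {
    "Core literature pre-1923": (400_000, 400_000),
    "All pre-1923": (600_000, 750_000),
    "All literature": (1_400_000, 1_600_000),
}


def page_range(volumes_low, volumes_high, pages_per_volume=PAGES_PER_VOLUME):
    """Return the (low, high) page estimate for a range of volume counts."""
    return volumes_low * pages_per_volume, volumes_high * pages_per_volume


for label, (low, high) in ESTIMATES.items():
    pages_low, pages_high = page_range(low, high)
    print(f"{label}: {pages_low / 1e6:.0f}-{pages_high / 1e6:.0f} million pages")
```

Under that assumption the computed ranges match the slide: 80 million, 120-150 million, and 280-320 million pages.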
    13. Changing Priorities
      • Open Access for scientific literature
      • Encourage re-use and re-purposing of the data in multiple and diverse systems
      • Work with non-commercial publishers to provide access
    14. Changing Priorities
      • BHL has had discussions with various society publishers as well as:
        • BioOne
        • JSTOR
      T.H. Huxley by Leslie Ward (“Spy”)
    15. Digital Book Creation
      • Automated structure detection – vital for serials
      • Taxonomic Intelligence
      • Digital Identifiers
      • Scalable mass scanning (outside of the Google environment)
      Richard Owen by Leslie Ward (“Spy”)
    16. BHL Structural Metadata: First Ingest (Internet Archive)
      Bib #   Barcode     Sub-element
      45632   390888343
      45632   390888344
      45632   390888345
      45632   390888346
      45632   390888347
    17. BHL Structural Metadata: Sub-Element Map (Internet Archive)
      Bib #   Barcode     Sub-element
      45632   390888343   1
      45632   390888343   2
      45632   390888343   3
      45632   390888343   4
      45632   390888343   5
    18. BHL Structural Metadata: Page Structure Map (Internet Archive)
      XML structure map that delineates the relationships of the images created automatically
      Bib #   Barcode     Sub-element
      45632   390888343   1
      Image Number: 0001, 0002, 0003, 0004, 0005
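
The three slides above describe a hierarchy: a bibliographic record, a barcoded physical item, a sub-element within that item, and the page images belonging to it. As a rough illustration of what an XML structure map for that hierarchy might look like, here is a minimal sketch using the Python standard library. The element and attribute names (`structMap`, `item`, `subElement`, `image`, `seq`, `number`) are hypothetical and are not taken from the BHL or Internet Archive schema.

```python
import xml.etree.ElementTree as ET


def build_structure_map(bib_id, barcode, sub_element, image_numbers):
    """Build a small XML tree: bib record -> item (barcode) -> sub-element -> images.

    All element and attribute names here are hypothetical placeholders.
    """
    root = ET.Element("structMap", {"bib": bib_id})
    item = ET.SubElement(root, "item", {"barcode": barcode})
    part = ET.SubElement(item, "subElement", {"n": str(sub_element)})
    # Sort the zero-padded image numbers so pages appear in reading order.
    for seq, number in enumerate(sorted(image_numbers), start=1):
        ET.SubElement(part, "image", {"seq": str(seq), "number": number})
    return root


# Values taken from the slide: one bib record, one barcode, sub-element 1, five images.
root = build_structure_map(
    "45632", "390888343", 1, ["0005", "0004", "0003", "0002", "0001"]
)
print(ET.tostring(root, encoding="unicode"))
```

Keeping the sequence number separate from the image number is a design choice in this sketch: it lets the map record reading order even if image files are renumbered later.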
    19. Taxonomic Intelligence
    20. Taxonomic Intelligence
      • 9.4 million name strings in NameBank
      • Uses sophisticated algorithm (TaxonGrab) to locate likely name strings in OCR text
      • Iterative processing of BHL texts will both increase the number of name strings in NameBank and increase the accuracy of name string recognition
      Georges Louis Leclerc, comte de Buffon, Histoire naturelle: générale et particulière (Oiseaux), 1799-1808
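
To make the idea of locating name strings in OCR text concrete, here is a deliberately crude sketch. It is not TaxonGrab and does not reproduce its algorithm: it simply matches a capitalized word followed by a lowercase word and checks each match against a tiny in-memory set standing in for a name index such as NameBank. The names `KNOWN_NAMES`, `BINOMIAL`, and `find_name_strings` are illustrative assumptions.

```python
import re

# Tiny stand-in for a name index; both names appear elsewhere in this presentation.
KNOWN_NAMES = {"Telespiza palmeri", "Moho bishopi"}

# Capitalized word followed by a lowercase word of 3+ letters. Deliberately crude.
BINOMIAL = re.compile(r"\b([A-Z][a-z]+) ([a-z]{3,})\b")


def find_name_strings(ocr_text, known_names=KNOWN_NAMES):
    """Return (confirmed, candidates).

    confirmed:  pattern matches that are present in the name index.
    candidates: pattern matches that are not, and would need review.
    """
    confirmed, candidates = [], []
    for match in BINOMIAL.finditer(ocr_text):
        name = f"{match.group(1)} {match.group(2)}"
        (confirmed if name in known_names else candidates).append(name)
    return confirmed, candidates


text = "The plate shows Telespiza palmeri beside Moho bishopi."
confirmed, candidates = find_name_strings(text)
print("confirmed:", confirmed)
print("candidates:", candidates)
```

The pattern alone also picks up ordinary phrases such as "The plate", which is why the lookup step matters. Reviewed candidates that turn out to be real names could be added to the index, which loosely mirrors the iterative processing described on the slide.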
    21. Digital Identifiers
      • Digital Object Identifiers (DOI)
      • Handles
      • Life Science Identifiers (LSID)
      • URIs
      • Etc.
      Telespiza palmeri, Avifauna of Laysan, 1893-1900
    22. Digital Identifiers
      • Factors:
        • Cost per identifier
        • Community acceptance
        • Scalability
      • BHL is working with TDWG and others to come up with the best scheme(s)
      Moho bishopi, Avifauna of Laysan, 1893-1900
    23. Scalable Mass Scanning
    24. The Internet Archive
      • 501(c)(3) organization
      • Dedicated to “Universal Access to Human Knowledge”
      • Founder of the Open Content Alliance
      • Provides:
        • Mass scanning
        • Archival storage of files
        • Image processing
        • Technology development
    25. Internet Archive Scribe Scanner
      • Single Scribe Machine
        • Human operated
        • 200 volumes per shift per week
        • ~ 70,000 pages from a single machine per week
        • Cost: $100,000 / year
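
The figures on this slide allow a rough per-page cost and scanning-time estimate. The sketch below assumes a machine runs 52 weeks a year at the quoted rate, which the slide does not state, and the function names are illustrative only.

```python
PAGES_PER_WEEK = 70_000   # approximate figure from the slide
COST_PER_YEAR = 100_000   # US dollars per machine, from the slide
WEEKS_PER_YEAR = 52       # assumption: year-round operation at the quoted rate


def cost_per_page(cost_per_year=COST_PER_YEAR,
                  pages_per_week=PAGES_PER_WEEK,
                  weeks_per_year=WEEKS_PER_YEAR):
    """Dollars spent per scanned page for a single machine."""
    return cost_per_year / (pages_per_week * weeks_per_year)


def machine_years(total_pages,
                  pages_per_week=PAGES_PER_WEEK,
                  weeks_per_year=WEEKS_PER_YEAR):
    """Years one machine would need to scan total_pages."""
    return total_pages / (pages_per_week * weeks_per_year)


print(f"Cost per page: ${cost_per_page():.4f}")
print(f"Machine-years for 80 million pages: {machine_years(80_000_000):.1f}")
```

Under these assumptions a single machine costs a little under three cents per page, and the 80 million pages of core pre-1923 literature would take about 22 machine-years, which suggests why multi-unit pods are planned.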
    26. Internet Archive Scribe: Boston
      • Cooperative facility with the Boston Library Consortium (19 New England Libraries)
      • BHL Members MBL/WHOI and Harvard Libraries will use the facility
      • Status: In production
    27. Internet Archive Scribe: Boston
    28. Internet Archive Scribe: London
      • Single Scribe in place
      • Projected 5 unit pod to be located at The Natural History Museum
      • Status: In production
    29. Internet Archive Scribe: London
    30. Internet Archive Scribe: Washington
      • Single unit arrived May 5
      • Funded by Smithsonian Libraries
      • Projected 5 unit BHL pod in National Museum of Natural History
      • Projected 10-15 unit pod shared by Smithsonian/BHL and regional Washington libraries
    31. Internet Archive Scribe: Washington
    32. Internet Archive Scribe: New York
      • Current BHL plans focus on sharing a 10 unit pod located at the New York Public Library
      • American Museum of Natural History and New York Botanical Garden will use this facility
      • Status: in planning
      Carl von Linné (1707 - 1778)
    33. Internet Archive Scribe: Illinois
      • Two machines funded by State of Illinois
      • UIUC scanning Fieldiana (all series)
      • Arrangement coordinated by Michael Godow, Bryan Heidorn (UIUC/GSLIS), Betsy Kruger (UIUC/Library)
      • Status: In production
    34. Internet Archive Scribe: Illinois
    35. BHL Portal
      • Library catalog-like interface to BHL literature
      • Enhanced structural analysis to provide volume/issue/article page access to the literature
      • Iterative development based on feedback from user community
      • Provide access to two key audiences:
        • Humans
        • Machines
    36. www.biodiversitylibrary.org
    37.  
    38.  
    39.  
    40.  
    41.  
    42.  
    43.  
    44.  
    45. BHL Literature Online
      • 1,291,485 pages
      • 657,310 pages via BHL Portal
      Yet another physical difficulty is the task of assembling the library and indexes which will enable the student to work under proper conditions…. the beginner must now be prepared to spend liberally, or else must establish himself in an institution where a large library exists; if he work by himself with only a few books, he will have to confine himself to a very narrow specialty indeed.
      'The Limitations of Taxonomy' by J.M. Aldrich, Science, April 22, 1927, vol. LXV, no. 1686, p. 381
    46. Biodiversity Heritage Library
    47. Biodiversity Heritage Library

    Martin Kalfatovic, 3 years ago

    Presentation at the Biodiversity Heritage Library @ more

    CC Attribution-NonCommercial-ShareAlike License
