Biodiversity Heritage Library: progress and process

An update on the Biodiversity Heritage Library's efforts and successes in 2010, with a focus on the future as part of the EU project ViBRANT.

  • BHL has gone Global
  • Sharing, distribution and delivery of content
  • Global data sharing requires a social infrastructure

Presentation Transcript

  • Biodiversity Heritage Library: Process & Progress. Phil Cryer, Biodiversity Heritage Library. Scripting Life: the science behind ViBRANT, January 20-21, 2011, Paris, France
  • Biodiversity Heritage Library (BHL)
    • a consortium of global partners
    • aims to share historic biodiversity literature
    • provides open access to all content
    • free for all
  • bhl data stats
    • Content
      • 45,000 journals & monographs (8,821 in 2010)
      • 87,000 volumes (15,552 in 2010)
      • 32 million pages (5.6 million in 2010)
    • Usage (2010)
      • 837,000 visits
      • 422,000 unique visitors
      • 4.2 million page views
      • 221 countries/territories
  • new features
  • scanning request form: click on ‘Feedback’ to access
  • new user interface for names index: sortable columns, exportable via CSV, BibTeX and EndNote
  • downloadable article PDFs: create articles from BHL books
    • 1- enter metadata about the article
    • 2- select the pages of the article
    • 3- PDF request received
    • 4- PDF article arrives via email
  • CiteBank ( http://citebank.org )
    • open access repository for biodiversity publications
    • Solr search with faceting
    • individual bibliography page
  • CiteBank features
    • access the ‘crowd-sourced’ articles generated from the BHL scans (harvested from BHL)
    • platform for journals/publishers/societies in need of tools to store and share content
    • harvests metadata from ZooKeys, SciELO, and Smithsonian collections nightly via OAI-PMH
    • new search index to BHL content using Solr
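A faceted Solr search of the kind mentioned above can be pictured with a short sketch. The Python below only builds a query URL; it does not contact any server. The host, the core name `citebank`, and the field names `author` and `year` are hypothetical and not taken from CiteBank. Only `q`, `rows`, `wt`, `facet`, and `facet.field` are standard Solr query parameters.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_facet_query(base_url, text, facet_fields, rows=10):
    """Return a Solr select URL that searches `text` and asks for facet counts."""
    params = [
        ("q", text),
        ("rows", str(rows)),
        ("wt", "json"),      # ask Solr for a JSON response
        ("facet", "true"),   # turn faceting on
    ]
    # One facet.field parameter per field we want counts for.
    params += [("facet.field", field) for field in facet_fields]
    return base_url.rstrip("/") + "/select?" + urlencode(params)


# Hypothetical endpoint and fields, for illustration only.
url = build_facet_query(
    "http://localhost:8983/solr/citebank", "Charles Darwin", ["author", "year"]
)
print(url)
```

Each `facet.field` asks Solr to return, alongside the matching documents, a count of hits per distinct value of that field, which is what drives "narrow by author / year" style navigation.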
  • CiteBank + BHL expands our core features
    • content and tools for scholarly crowd-sourcing
      • Users can get content they need, do minor work, share enhancements with community
    • look to add more content integration with other existing platforms
      • EOL, Atlas of Living Australia, JSTOR Plant Science, BioStor and others
      • Mendeley, Zotero, RefWorks, etc.
    • enhancements to the portal home page
      • More focus on search
    • special collections
      • Charles Darwin’s scientific library
    • scholarly annotations
      • annotations in Darwin’s hand and academic interpretation, crosslinking
    More BHL features coming soon...
  • bhl global
  • Benefits of Global BHL partnerships
    • redundancy and resilience
      • data and app mirroring
    • exposing unique content
    • new tools, services, people
    • opportunities for new collaborations
      • IMPACT, ViBRANT, OpenUp! in the EU
  • storage clusters
  • Storage issues solved using clusters
    • all BHL data stored at the Internet Archive in San Francisco
      • no redundancy
      • limited in how we could serve our data and images
      • difficult to analyze data
    • First global BHL cluster gives us
      • redundancy and failover
      • many new serving options
      • new ways to run analytics, data mining
  • Open source software / commodity hardware
    • open source software
      • Linux operating system
      • Gluster distributed storage system
    • commodity hardware
      • Supermicro servers
      • ‘off the shelf’ hard drives and other system components
  • BHL Cluster 01
    • six 4U sized cabinets
    • twenty-four 1.5TB hard drives in each cabinet
    • 97TB of replicated and distributed storage (over 200TB of raw disk)
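The cluster figures can be checked with a little arithmetic. The sketch below computes the raw capacity from the stated cabinet and drive counts. The replication factor of 2 is an assumption (the slide does not state one), so the second number is only an upper bound that ignores filesystem overhead and TB/TiB rounding.

```python
CABINETS = 6               # from the slide
DRIVES_PER_CABINET = 24    # from the slide
DRIVE_TB = 1.5             # from the slide
REPLICAS = 2               # assumption: every file is stored twice


def raw_capacity_tb(cabinets, drives_per_cabinet, drive_tb):
    """Total disk capacity before any replication."""
    return cabinets * drives_per_cabinet * drive_tb


def replicated_capacity_tb(raw_tb, replicas):
    """Upper bound on usable space when each file is kept `replicas` times."""
    return raw_tb / replicas


raw = raw_capacity_tb(CABINETS, DRIVES_PER_CABINET, DRIVE_TB)
usable = replicated_capacity_tb(raw, REPLICAS)
print(raw)     # 216.0, matching "over 200TB of raw disk"
print(usable)  # 108.0, an upper bound in the same range as the reported 97TB
```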
  • Statistical computing
    • find relationships
      • R, the GNU statistical language
      • Hadoop, Disco
    • make existing data more useful
      • image and OCR reprocessing, taxonfinder
  • data sharing
  • Data sharing and replication
    • replicating BHL data globally
      • Marine Biological Laboratory (Woods Hole, US)
      • Natural History Museum (London, UK)
      • Bibliotheca Alexandrina (Alexandria, EG)
      • Atlas of Living Australia (Canberra, AU)
      • China... Brazil...
    • advantages to replication
      • redundancy, failover
      • load balancing
      • geographical distribution
  • Software for data sync
    • grabby
      • handles initial download from Internet Archive (IA)
    • bhl-sync
      • open source Dropbox model
      • handles syncing remote nodes automatically
      • uses inotify, lsyncd, OpenSSH, rsync, unison
      • remote server only requires a secure login
    • Open source code available at http://bit.ly/bhl-bits
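The transfer step of such a sync can be sketched as an rsync-over-SSH command, which is why the remote server needs nothing beyond a secure login. This is a minimal illustration and not bhl-sync's actual invocation: the function name, host, and paths are hypothetical, and the file-watching side (inotify/lsyncd) is left out. The command is only built, never executed.

```python
import shlex


def build_rsync_command(local_dir, remote_host, remote_dir, delete=False):
    """Build (but do not run) an rsync command that mirrors local_dir to a remote node."""
    cmd = ["rsync", "-az", "-e", "ssh"]  # archive mode, compression, SSH transport
    if delete:
        cmd.append("--delete")           # also remove remote files deleted locally
    # A trailing slash on the source copies the directory contents, not the directory itself.
    cmd += [local_dir.rstrip("/") + "/", f"{remote_host}:{remote_dir}"]
    return cmd


# Hypothetical node and paths, for illustration only.
cmd = build_rsync_command("/data/bhl", "mirror.example.org", "/data/bhl")
print(shlex.join(cmd))
```

In a setup like the one described, a watcher such as lsyncd would trigger a transfer of this kind whenever inotify reports a change under the local directory.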
  • Fedora-commons integration
    • digital repository platform
    • enables storage and management of digital content
    • maintains a persistent digital archive
    • stores data in a neutral manner
    • provides backup, redundancy, disaster recovery
    • shares data to remote nodes via OAI-PMH
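Sharing via OAI-PMH means a remote node harvests records with plain HTTP requests and XML responses. The sketch below builds a standard `ListRecords` request and extracts record identifiers from a response. The endpoint URL and the sample response are made up for illustration; only the `verb`, `metadataPrefix`, and `from` parameters and the XML namespace come from the OAI-PMH 2.0 protocol.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def list_records_url(base_url, metadata_prefix="oai_dc", from_date=None):
    """Build an OAI-PMH ListRecords request URL, optionally limited to recent changes."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        params["from"] = from_date  # harvest only records changed since this date
    return base_url + "?" + urlencode(params)


def record_identifiers(response_xml):
    """Return the identifier of every record header in a ListRecords response."""
    root = ET.fromstring(response_xml)
    return [el.text for el in root.findall(".//oai:header/oai:identifier", OAI_NS)]


# Hypothetical response, trimmed to the parts this sketch reads.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><header><identifier>oai:example.org:item/1</identifier></header></record>
    <record><header><identifier>oai:example.org:item/2</identifier></header></record>
  </ListRecords>
</OAI-PMH>"""

request_url = list_records_url("http://repository.example.org/oai", from_date="2011-01-01")
identifiers = record_identifiers(SAMPLE)
print(request_url)
print(identifiers)
```

A nightly harvest would typically pass the date of the previous run as `from_date`, so that only new or changed records are transferred.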
  • future plans
  • Assigning DOIs (Digital Object Identifiers)
    • BHL is a member of CrossRef through the Smithsonian
    • will start assigning DOIs to BHL monographs
      • easy, non-controversial
      • provides open access to all content
    • then move on to articles and other publication types
      • CrossRef rules make full assignment challenging for crowd-sourced articles
  • Wish list for 2011 and beyond
    • OCR correction
      • a big problem, no easy solution
    • add more content
      • partnerships, CiteBank
    • sustainability planning and funding
      • committed to no fees for users
    • more outreach
      • conferences, marketing
      • Facebook, Twitter and other social media avenues...
  • BHL is social!
    • http://biodiversitylibrary.blogspot.com
    • http://twitter.com/BioDivLibrary #bhlib
    • http://facebook.com/pages/Biodiversity-Heritage-Library/63547246565
    • http://flickr.com/groups/bhl
    • http://youtube.com/user/BioHeritageLibrary
    • http://biodiversitylibrary.org/RecentRss.aspx
    • http://slidesha.re/bhl-slides
  • Thanks. slides: slidesha.re/bhl-slides; contact: [email_address]. Phil Cryer, Biodiversity Heritage Library. Scripting Life: the science behind ViBRANT, January 20-21, 2011, Paris, France