Biodiversity Heritage Library: progress and process

An update on the Biodiversity Heritage Library's efforts and successes in 2010, with a focus on the future as part of the EU project ViBRANT.

Published in: Technology

  • BHL has gone Global
  • Sharing, distribution and delivery of content
  • Global data sharing requires a social infrastructure
  • Transcript

    • 1. Biodiversity Heritage Library: Process & Progress. Phil Cryer, Biodiversity Heritage Library. Scripting Life: the science behind ViBRANT, January 20-21, 2011, Paris, France
    • 2. Biodiversity Heritage Library (BHL)
      • a consortium of global partners
      • aims to share historic biodiversity literature texts
      • provides open access to all content
      • free for all
    • 3. bhl data stats
    • 4. Content
      • 45,000 journals & monographs
        • 8,821 in 2010
      • 87,000 volumes
        • 15,552 in 2010
      • 32 million pages
        • 5.6 million in 2010
    • 5. Usage (2010)
      • 837,000 visits
      • 422,000 unique visitors
      • 4.2 million page views
      • 221 countries/territories
    • 6. new features
    • 7. scanning request form: click on ‘Feedback’ to access
    • 8. new user interface for names index: sortable columns, exportable via CSV, BibTeX and EndNote
    • 9. downloadable article PDFs: create articles from BHL books
    • 10. downloadable article PDFs: 1- enter metadata about the article
    • 11. downloadable article PDFs: 2- select the pages of the article
    • 12. downloadable article PDFs: 3- PDF request received
    • 13. downloadable article PDFs: 4- PDF article arrives via email
    • 14. CiteBank ( http://citebank.org ): open access repository for biodiversity publications
    • 15. CiteBank ( http://citebank.org ): Solr search with faceting
    • 16. CiteBank ( http://citebank.org ): individual bibliography page
    • 17. CiteBank features
      • access the ‘crowd-sourced’ articles generated from the BHL scans (harvested from BHL)
      • platform for journals/publishers/societies in need of tools to store and share content
      • harvests metadata from ZooKeys, SciELO and Smithsonian collections nightly via OAI-PMH
      • new search index to BHL content using Solr
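
The nightly OAI-PMH harvest mentioned above can be pictured as a request for all records changed since the previous run. The following is a minimal sketch, not CiteBank's actual harvester: `verb`, `metadataPrefix`, `from` and `set` are standard OAI-PMH request parameters, while the function name and the `https://example.org/oai` endpoint are assumptions made for illustration.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", from_date=None, set_spec=None):
    """Build an OAI-PMH ListRecords request URL.

    base_url is the repository's OAI-PMH endpoint (hypothetical here).
    from_date limits the harvest to records changed on or after that date,
    which is what makes an incremental nightly run possible.
    """
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        params["from"] = from_date
    if set_spec:
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


# Hypothetical endpoint and date, for illustration only.
url = list_records_url("https://example.org/oai", from_date="2011-01-19")
print(url)
```

A real harvester would also follow the `resumptionToken` that a repository returns when the result set spans several responses; that part is left out here.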
    • 18. CiteBank + BHL expands our core features
      • content and tools for scholarly crowd-sourcing
        • Users can get content they need, do minor work, share enhancements with community
      • look to add more content integration with other existing platforms
        • EOL, Atlas of Living Australia, JSTOR Plant Science, BioStor and others
        • Mendeley, Zotero, RefWorks, etc.
    • 19. More BHL features coming soon...
      • enhancements to the portal home page
        • more focus on search
      • special collections
        • Charles Darwin’s scientific library
      • scholarly annotations
        • annotations in Darwin’s hand and academic interpretation, crosslinking
    • 20. bhl global
    • 21.  
    • 22.  
    • 23.  
    • 24. Benefits of Global BHL partnerships
      • redundancy and resilience
        • data and app mirroring
      • exposing unique content
      • new tools, services, people
      • opportunities for new collaborations
        • IMPACT, ViBRANT, OpenUp! in EU
    • 25. storage clusters
    • 26. Storage issues solved using clusters
      • all BHL data stored at the Internet Archive in San Francisco
        • no redundancy
        • limited in how we could serve our data and images
        • difficult to analyze data
      • first global BHL cluster gives us
        • redundancy and failover
        • many new serving options
        • new ways to run analytics, data mining
    • 27.  
    • 28. Open source software / commodity hardware
      • open source software
        • Linux operating system
        • Gluster distributed storage system
      • commodity hardware
        • Supermicro servers
        • ‘off the shelf’ hard drives and other system components
    • 29. BHL Cluster 01
      • six 4U-sized cabinets
      • twenty-four 1.5TB hard drives in each cabinet
      • 97TB of replicated and distributed storage (over 200TB of raw disk)
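
The capacity figures for BHL Cluster 01 can be checked with a little arithmetic. This is a rough sketch: the cabinet and drive counts come from the slide, but the two-way replication factor is an assumption, since the slide only says the storage is "replicated and distributed".

```python
# Figures taken from the slide: six cabinets, twenty-four 1.5 TB drives each.
CABINETS = 6
DRIVES_PER_CABINET = 24
DRIVE_TB = 1.5

# Assumption (not stated on the slide): every file is stored twice.
REPLICAS = 2


def raw_capacity_tb(cabinets, drives_per_cabinet, drive_tb):
    """Total disk capacity before any replication."""
    return cabinets * drives_per_cabinet * drive_tb


def replicated_capacity_tb(raw_tb, replicas):
    """Nominal capacity left once each file is kept `replicas` times."""
    return raw_tb / replicas


raw = raw_capacity_tb(CABINETS, DRIVES_PER_CABINET, DRIVE_TB)
usable = replicated_capacity_tb(raw, REPLICAS)
print(f"raw: {raw} TB, nominal after replication: {usable} TB")
```

Under that assumption the raw total is 216 TB, which matches "over 200TB of raw disk", and the nominal replicated capacity is 108 TB. The slide's 97TB is lower; the difference is presumably filesystem and formatting overhead, though the slides do not say.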
    • 30.  
    • 31. Statistical computing
      • find relationships
        • R (GNU statistical language)
        • Hadoop, Disco
      • make existing data more useful
        • image and OCR reprocessing, taxonfinder
    • 32. data sharing
    • 33. Data sharing and replication
      • replicating BHL data globally
        • Marine Biological Laboratory (Woods Hole, US)
        • Natural History Museum (London, UK)
        • Bibliotheca Alexandrina (Alexandria, EG)
        • Atlas of Living Australia (Canberra, AU)
        • China... Brazil...
      • advantages to replication
        • redundancy, failover
        • load balancing
        • geographical distribution
    • 34. Software for data sync
      • grabby
        • handles initial download from Internet Archive (IA)
      • bhl-sync
        • open source Dropbox model
        • handles syncing remote nodes automatically
        • uses inotify, lsyncd, OpenSSH, rsync, unison
        • remote server only requires a secure login
      • Open source code available at http://bit.ly/bhl-bits
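
The slide lists rsync over OpenSSH among the pieces bhl-sync builds on. As an illustration of that one step only, and not the bhl-sync code itself, here is a sketch that assembles an rsync command for pushing a local directory to a remote node. The flags are standard rsync options; the function name, host and paths are hypothetical.

```python
import shlex


def build_rsync_command(source_dir, remote_host, remote_dir, dry_run=False):
    """Assemble an rsync-over-SSH command that mirrors source_dir to a remote node.

    Only a secure login on the remote side is needed, as the slide notes.
    """
    cmd = [
        "rsync",
        "--archive",    # preserve permissions, timestamps, symlinks
        "--compress",   # compress data in transit
        "--delete",     # remove remote files that no longer exist locally
        "--rsh=ssh",    # transport over SSH
    ]
    if dry_run:
        cmd.append("--dry-run")  # report what would change without copying
    # A trailing slash makes rsync copy the directory's contents, not the directory itself.
    cmd.append(source_dir.rstrip("/") + "/")
    cmd.append(f"{remote_host}:{remote_dir}")
    return cmd


# Hypothetical host and paths, for illustration only.
cmd = build_rsync_command("/data/bhl", "sync@mirror.example.org", "/data/bhl", dry_run=True)
print(shlex.join(cmd))
```

The list could be handed to `subprocess.run(cmd, check=True)` to execute it. In the setup the slide describes, a watcher such as lsyncd (built on inotify) would trigger this kind of transfer when files change, rather than a person running it by hand.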
    • 35. Fedora Commons integration
      • digital repository platform
      • enables storage and management of digital content
      • maintains a persistent digital archive
      • stores data in a neutral manner
      • provides backup, redundancy, disaster recovery
      • shares data to remote nodes via OAI-PMH
    • 36. future plans
    • 37. Assigning DOIs (Digital Object Identifiers)
      • BHL is a member of CrossRef through the Smithsonian
      • will start assigning DOIs to BHL monographs
        • easy, non-controversial
      • then move on to articles and other publication types
        • CrossRef rules make full assignment challenging for crowd-sourced articles
    • 38. Wish list for 2011 and beyond
      • OCR correction
        • a big problem, no easy solution
      • add more content
        • partnerships, CiteBank
      • sustainability planning and funding
        • committed to no fees for users
      • more outreach
        • conferences, marketing
        • Facebook, Twitter and other social media avenues...
    • 39. BHL is social!
      • http://biodiversitylibrary.blogspot.com
      • http://twitter.com/BioDivLibrary #bhlib
      • http://facebook.com/pages/Biodiversity-Heritage-Library/63547246565
      • http://flickr.com/groups/bhl
      • http://youtube.com/user/BioHeritageLibrary
      • http://biodiversitylibrary.org/RecentRss.aspx
      • http://slidesha.re/bhl-slides
    • 40. Thanks. Slides: slidesha.re/bhl-slides. Contact: [email_address]. Phil Cryer, Biodiversity Heritage Library. Scripting Life: the science behind ViBRANT, January 20-21, 2011, Paris, France
