Networked digital library through harvesting


Digital Archive

Published in: Education, Technology

  • 1. Networked Digital Library through Harvesting: The Future of Digital Archiving. Barnali Roy Choudhury and Dr. Parthasarathi Mukhopadhyay, Department of Library and Information Science, The University of Burdwan, Burdwan – 713 104
  • 2. DIGITAL LIBRARY A digital library is a library in which collections are stored in digital formats (as opposed to print, microform, or other media) and accessible by computers.[1] The digital content may be stored locally, or accessed remotely via computer networks. (Wikipedia) The DELOS Digital Library Reference Model[2] defines a digital library as: An organization, which might be virtual, that comprehensively collects, manages and preserves for the long term rich digital content, and offers to its user communities specialized functionality on that content, of measurable quality and according to codified policies.
  • 3. No traditional library is self-sufficient; no digital library is self-sufficient.
  • 4. Networked Digital Library An entity that collects metadata in a central place from selected DLs in order to provide centralized searching
  • 5. OBJECTIVES To harvest metadata into a single window (centralized search facility) from different OAI/PMH repositories related to LIS; to design a union catalogue of scholarly objects through harvesting (using the OAI/PMH protocol and PKP open source harvesting software on LAMP architecture); and to provide comprehensive search facilities to end users in the LIS domain for accessing scholarly objects (search metadata locally and access full text globally).
  • 6. CRITERIA for DL selection Selection of a particular domain; selection of the most efficient and effective dataset; whether the selected data are OAI/PMH compatible
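The "search metadata locally, access full text globally" objective can be illustrated with a minimal sketch. It is not the PKP harvester's implementation; the `Record` type, the repository names and the `example.org` URLs are hypothetical, chosen only to show a union catalogue that merges harvested batches and returns links back to the source repositories.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    repository: str  # which harvested archive the record came from (hypothetical names)
    identifier: str  # identifier of the record within that archive
    title: str
    url: str         # link back to the full text at the source repository


def build_union_catalogue(batches):
    """Merge harvested batches, keeping one record per (repository, identifier)."""
    catalogue = {}
    for batch in batches:
        for rec in batch:
            # A re-harvested record replaces the earlier copy instead of duplicating it.
            catalogue[(rec.repository, rec.identifier)] = rec
    return list(catalogue.values())


def search(catalogue, term):
    """Case-insensitive title search over the locally stored metadata."""
    term = term.lower()
    return [rec for rec in catalogue if term in rec.title.lower()]


# Hypothetical sample data standing in for two harvest runs.
first_run = [
    Record("repo-a", "oai:a:1", "Metadata harvesting in libraries", "http://a.example.org/1"),
    Record("repo-b", "oai:b:7", "Digital preservation policy", "http://b.example.org/7"),
]
second_run = [
    Record("repo-a", "oai:a:1", "Metadata harvesting in libraries", "http://a.example.org/1"),
    Record("repo-a", "oai:a:2", "Harvesting with open standards", "http://a.example.org/2"),
]

catalogue = build_union_catalogue([first_run, second_run])
hits = search(catalogue, "harvesting")
```

Only the metadata lives in the local catalogue; each hit carries the URL at which the full text is fetched from the original repository.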
  • 7. Open Access Institutional Digital Repository Institutional Digital Repositories (IDRs) are digital collections that organize, preserve, and make accessible the intellectual output of a single institution or a group of related institutions (Crow, 2002). A typical IDR has the following attributes: open-access repositories allow authors/right holders to deposit their articles; they may allow preprints (pre-published manuscripts); they normally allow post-prints (peer-reviewed and published articles); most reputed academic publishers allow authors to deposit some version of their articles in such repositories (
  • 8. OpenDOAR
  • 9. ROAR
  • 10. IDRs in LIS domain Directory for Open Access Repositories ( lists around 51 open access repositories; among them 43 are in English, 24 are only LIS & IT related, and 18 are OAI/PMH compatible. Among the English-language repositories, ELIS holds the highest number of records, i.e., 9565. Registry of Open Access Repositories (roar. lists around 6 institutional repositories, of which 5 are OAI/PMH compatible. Both allow us to search and list open access repositories by subject, country and content type.
  • 11. Cross Collection Interoperability These repositories allow submission of scholarly materials globally (i.e., cross-institutionally) through extensive use of two interoperability standards: Z39.50, a protocol for distributed search services, and OAI/PMH, which deals with metadata harvesting
  • 12. What is OAI/PMH 1. OAI/PMH is a lightweight standard protocol for harvesting metadata records from ‘data providers’ to ‘service providers’. 2. It provides rules for harvesting the metadata of a repository, not the full content. 3. The content itself should be retrieved from the source repository; the protocol allows a ‘service provider’ to say ‘give me some or all of your metadata records’. 4. Based on HTTP and XML. 5. Simply carries metadata. 6. Mandates simple DC as record format, but is extensible to any XML format: IEEE LOM, ONIX, MARC, METS, MPEG-21, etc.
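To make the "simple DC as record format" point concrete, here is a hedged sketch of reading one harvested record with Python's standard `xml.etree.ElementTree`. The three namespace URIs are the ones defined for OAI-PMH 2.0 and Dublin Core; the sample record, its identifier and its `example.org` link are invented for illustration and do not come from any repository named in the slides.

```python
import xml.etree.ElementTree as ET

NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Invented sample record in the simple Dublin Core (oai_dc) format.
SAMPLE_RECORD = """<record xmlns="http://www.openarchives.org/OAI/2.0/">
  <header>
    <identifier>oai:repository.example.org:1234</identifier>
    <datestamp>2009-03-13</datestamp>
  </header>
  <metadata>
    <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
               xmlns:dc="http://purl.org/dc/elements/1.1/">
      <dc:title>An example article</dc:title>
      <dc:creator>Author, First</dc:creator>
      <dc:creator>Author, Second</dc:creator>
      <dc:identifier>http://repository.example.org/handle/1234</dc:identifier>
    </oai_dc:dc>
  </metadata>
</record>"""


def parse_record(xml_text):
    """Return the header fields and the repeated DC elements of one record."""
    root = ET.fromstring(xml_text)
    header = root.find("oai:header", NS)
    dc = root.find("oai:metadata/oai_dc:dc", NS)
    fields = {}
    for element in dc:
        # Tags look like '{namespace}title'; keep only the local name.
        name = element.tag.split("}", 1)[1]
        fields.setdefault(name, []).append((element.text or "").strip())
    return {
        "identifier": header.findtext("oai:identifier", namespaces=NS),
        "datestamp": header.findtext("oai:datestamp", namespaces=NS),
        "dc": fields,
    }


parsed = parse_record(SAMPLE_RECORD)
```

The record carries only metadata; the `dc:identifier` link is what sends a reader to the source repository for the content itself.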
  • 13. HOW OAI WORKS? OAI “verbs”: Identify, ListMetadataFormats, ListSets, ListIdentifiers, ListRecords, GetRecord. [Diagram: the harvester sends an HTTP request (OAI verb) to the OAI repository; the repository returns an HTTP response (valid XML).]
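The request half of this exchange can be sketched as follows: each OAI verb is sent as an ordinary HTTP query string. The base URL below is a placeholder, not the endpoint of any repository in the slides, and `oai_request_url` is an illustrative helper rather than part of the PKP harvester.

```python
from urllib.parse import urlencode

# The six verbs listed on the slide.
OAI_VERBS = (
    "Identify",
    "ListMetadataFormats",
    "ListSets",
    "ListIdentifiers",
    "ListRecords",
    "GetRecord",
)


def oai_request_url(base_url, verb, arguments=None):
    """Build the HTTP GET URL that a harvester would send for one OAI verb."""
    if verb not in OAI_VERBS:
        raise ValueError(f"unknown OAI verb: {verb}")
    query = {"verb": verb}
    query.update(arguments or {})
    return f"{base_url}?{urlencode(query)}"


# Placeholder endpoint; a real harvester would use the repository's OAI/PMH base URL.
BASE_URL = "http://repository.example.org/oai/request"

identify_url = oai_request_url(BASE_URL, "Identify")
list_records_url = oai_request_url(BASE_URL, "ListRecords", {"metadataPrefix": "oai_dc"})
```

Here `list_records_url` is `http://repository.example.org/oai/request?verb=ListRecords&metadataPrefix=oai_dc`; the repository would answer such a request with an XML document containing the matching records.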
  • 14. METHODOLOGY OF DESIGNING LAMP related activities; Harvester related activities; Repository related activities; Development of repositories
  • 15. LAMP related activities The prototype harvesting framework developed at the Department of LIS, The University of Burdwan, named UniLIS, is based on open source software and open standards. It uses the LAMP architecture as its base: Linux (Ubuntu 9.10) as operating system, Apache (2.2.8) as Web server, MySQL (5.0.0) as RDBMS, and PHP version 5.X as harvesting tool. The setup also involves linking PHP with Apache and MySQL.
  • 16. Harvester related activities The requirements of the PKP harvester are as follows: PHP >= 4.2.x (including PHP 5.x), with Microsoft IIS requiring PHP 5.x; MySQL >= 3.23.23 (including MySQL 4.x/5.x); Apache >= 1.3.2x or >= 2.0.4x or 2.0.5x, or Microsoft IIS 5.x or 6.x; operating system: any OS that supports the above software, including Linux, BSD, Solaris, Mac OS X and Windows (preferably NT-based Windows flavors)
  • 17. Harvester related activities This group includes two major tasks: i) Installation of the PKP harvester, which requires a) the login name and password of the system administrator (root user); b) database details (name of the MySQL database, the database user and that user's password)
  • 18. Harvester related activities ii) Configuration of the PKP harvester: a) site management (configuration of site-specific details, language, crosswalk, plug-ins and reading tools); b) archives (creation of archives, managing created archives); and c) other administrative functions (layout, customization, etc.).
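A pre-installation check against these minimums might look like the sketch below. It covers only the PHP and MySQL minimums, because the Apache requirement on the slide is given as a version pattern rather than a single number. The installed versions are assumptions for illustration: MySQL 5.0.0 is the UniLIS value stated in the slides, while "5.2.0" merely stands in for "PHP version 5.X".

```python
def parse_version(text):
    """Turn '3.23.23' into (3, 23, 23), padding short versions with zeros."""
    parts = [int(part) for part in text.split(".")]
    parts += [0] * (3 - len(parts))
    return tuple(parts)


# Minimums taken from the slide; Apache is left out (see the note above).
MINIMUMS = {"PHP": "4.2", "MySQL": "3.23.23"}


def unmet_requirements(installed):
    """Return the names of components that are missing or older than the minimum."""
    problems = []
    for name, minimum in MINIMUMS.items():
        have = installed.get(name)
        if have is None or parse_version(have) < parse_version(minimum):
            problems.append(name)
    return problems


# Assumed installed versions; "5.2.0" is a stand-in for "PHP version 5.X".
unilis_stack = {"PHP": "5.2.0", "MySQL": "5.0.0"}
old_stack = {"PHP": "4.1.2", "MySQL": "3.23.23"}

unilis_problems = unmet_requirements(unilis_stack)
old_problems = unmet_requirements(old_stack)
```

Comparing tuples of integers avoids the trap of comparing version strings, where "3.9" would sort after "3.23".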
  • 19. UniLIS Department of LIS, The University of Burdwan
  • 20. UniLIS Department of LIS, The University of Burdwan
  • 21. Site Administration
  • 22. IDRs related requirements Name of open access repository: LDL (Librarians Digital Library); Sponsoring institute: Documentation Research and Training Centre (DRTC), Indian Statistical Institute (ISI), Bangalore centre, India; No. of records: 249 items (2009-03-13); Software in use: DSpace; URL of the repository; OAI/PMH base URL est; Document type: Articles, Conferences, Theses, Multimedia; Language: English, Hindi, Kannada
  • 24. ARCHIVES
  • 26. BROWSING
  • 27. BROWSING
  • 29. Search result
  • 30. View Record
  • 31. View Original
  • 32. UniLIS repository Presently it includes 5 large-scale open access repositories in the LIS domain. In future it will include LIS-specific open access journals, ETDs and other open access repositories, with the aim of developing a comprehensive local search service for open access resources in the domain of LIS.
  • 33. THANK YOU