The Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) provides a simple but effective mechanism for metadata harvesting. It allows service providers to aggregate content from data providers to build value-added services. The OAI-PMH uses HTTP and XML to share metadata in any agreed format, with Dublin Core as a baseline. It defines a set of verbs and standards for harvesting metadata from repositories in a consistent way. This interoperability has helped surface resources and build services across independently developed digital libraries.
Open Archives Initiative for Metadata Harvesting
A Framework for Building Open Digital Libraries
Term Paper-1
Submitted by
NIKESH.N
International School of Information Management
University of Mysore
2010
1.0 Introduction
A digital library may be defined as a system that supports the collection, organization, storage, retrieval and dissemination of digital documents. It may be viewed as the intersection of library science, computer science and networked information systems. Open movements are gaining acceptance in the scholarly information arena, and many universities and research centres have begun to provide public access to their repositories. With the growing number of digital repositories on the Web, it has become difficult for users to visit individual sites in search of information, and many institutional repositories are not indexed by search engines. A mechanism is therefore required by which repositories can share resources and work in coordination, giving users a broader purview. The ability of information systems to work in coordination is termed interoperability. The Open Archives Initiative is one of the landmark efforts to make the metadata of digital resources from many repositories available at the user's end.

The essence of the open archives approach is to enable access to Web-accessible material through interoperable repositories for metadata sharing, publishing and archiving.

Such interoperability requirements necessitated the development of standards such as the Dublin Core Metadata Element Set and the Open Archives Initiative's Protocol for Metadata Harvesting (OAI-PMH). These standards have achieved a degree of success in the digital library community largely because of their generality and simplicity.
2.0 Need for a Harvester protocol
There is a growing need to make resources, not only descriptive metadata, harvestable in an interoperable manner. Two major use cases motivate this need:

• Preservation: the need to periodically transfer digital content from a data repository to one or more trusted digital repositories charged with storing and preserving safety copies of the content. The trusted digital repositories need a mechanism to automatically synchronize with the originating data repository.

• Discovery: the need to use content itself in the creation of services. Examples include search engines that make full text from multiple data repositories searchable, and citation-indexing systems that extract references from the full-text content. Another scenario is the provision of thumbnail versions of high-quality images from cultural heritage collections to external services that build browsing interfaces incorporating those thumbnails.
3.0 OAI Protocol for Metadata Harvesting (OAI-PMH)
In October 1999 the Open Archives Initiative (OAI) was launched in an attempt to address interoperability issues among the many existing and independent digital libraries. The focus was on high-level communication among systems and simplicity of protocols. The OAI has since received much attention in the digital library community and, primarily because of the simplicity of its standards, has attracted many early adopters. It defines a mechanism for harvesting records containing metadata from repositories.
3.1 Definitions of Key terms
• Open Archives Initiative (OAI)
The OAI is an initiative to develop and promote interoperability standards that aim to facilitate the efficient dissemination of content.
• Archive
The term "archive" in the name Open Archives Initiative reflects the origins of the OAI in the e-prints community, where the term is generally accepted as a synonym for a repository of scholarly papers. Members of the archiving profession have justifiably noted the strict definition of an "archive" within their domain, with its connotations of preservation of long-term value, statutory authorization and institutional policy. The OAI uses the term "archive" in a broader sense: as a repository for stored information. Language and terms are never unambiguous and uncontroversial, and the OAI respectfully requests the indulgence of the professional archiving community with this broader use of "archive".
(Definition quoted from the FAQ on the OAI Web site.)
• OAI Protocol for Metadata Harvesting (OAI-PMH)
OAI-PMH is a lightweight harvesting protocol for sharing metadata between services.
• Protocol
A protocol is a set of rules defining communication between systems. FTP (File Transfer
Protocol) and HTTP (Hypertext Transfer Protocol) are examples of other protocols used for
communication between systems across the Internet.
• Harvesting
In the OAI context, harvesting refers specifically to the gathering together of metadata from a
number of distributed repositories into a combined data store.
3.2 Prerequisites to develop metadata harvesting protocol
To facilitate metadata harvesting there needs to be agreement on:
o Transport protocol - HTTP or FTP or other such protocol
o Metadata format - Dublin Core or MARC or other such format
o Metadata Quality Assurance - mandatory element set, naming and subject conventions, etc.
o Intellectual Property and Usage Rights - who can do what with what?
3.3 OAI: Key players
There are two groups of participants: data providers and service providers.

Data Providers
Data providers (open archives, repositories) provide free access to metadata, and may, but do not necessarily, offer free access to full texts or other resources. OAI-PMH provides an easy-to-implement, low-barrier solution for data providers.

Service Providers
Service providers use the OAI interfaces of the data providers to harvest and store metadata. Note that this means there are no live search requests to the data providers; rather, services are built on the data harvested via OAI-PMH. Service providers may select certain subsets from data providers (e.g., by set hierarchy or datestamp). They offer value-added services on the basis of the harvested metadata, and may enrich it in order to do so.
3.4 How it works
The OAI-PMH gives data providers a simple technical option for making their metadata available to services, based on the open standards HTTP (Hypertext Transfer Protocol) and XML (Extensible Markup Language). The harvested metadata may be in any format agreed by a community (or by any discrete set of data and service providers), although unqualified Dublin Core is specified to provide a basic level of interoperability. Thus, metadata from many sources can be gathered together in one database, and services can be provided based on this centrally harvested or "aggregated" data. The link between this metadata and the related content is not defined by the OAI protocol. It is important to realize that OAI-PMH does not provide a search across this data; it simply makes it possible to bring the data together in one place. In order to provide services, the harvesting approach must be combined with other mechanisms.
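As an illustrative sketch of this flow, the snippet below builds a ListRecords request URL and extracts Dublin Core titles from a canned, abridged response. The base URL is hypothetical, and a real harvester would fetch the URL over HTTP rather than parse a hard-coded string:

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

BASE_URL = "http://archive.example.org/oai"  # hypothetical data provider

def build_request(verb, **arguments):
    """Build an OAI-PMH GET request URL: the verb plus optional arguments."""
    return BASE_URL + "?" + urlencode({"verb": verb, **arguments})

url = build_request("ListRecords", metadataPrefix="oai_dc")
print(url)  # http://archive.example.org/oai?verb=ListRecords&metadataPrefix=oai_dc

# A service provider would issue an HTTP GET for `url`; here we parse a
# canned (abridged) response to show how harvested titles are extracted.
RESPONSE = """<?xml version="1.0" encoding="UTF-8"?>
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Building Open Digital Libraries</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

root = ET.fromstring(RESPONSE)
# Dublin Core elements live in the dc namespace; collect every title.
titles = [e.text for e in root.iter("{http://purl.org/dc/elements/1.1/}title")]
print(titles)  # ['Building Open Digital Libraries']
```

The harvested titles would then be stored in the service provider's own database, over which it can build search or browsing services.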
3.5 Protocol details
Records
A record is the metadata of a resource in a specific format. A record has three parts: a header and metadata, both of which are mandatory, and an optional about statement. Each of these is made up of the components set out below.

header (mandatory)
  identifier (mandatory: exactly 1)
  datestamp (mandatory: exactly 1)
  setSpec elements (optional: 0, 1 or more)
  status attribute for deleted items
metadata (mandatory)
  XML-encoded metadata with a root tag and namespace;
  repositories must support Dublin Core and may support other formats
about (optional)
  rights statements
  provenance statements
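These parts can be picked apart with standard XML tooling; the record below is hand-written for illustration (the identifier and set name are hypothetical):

```python
import xml.etree.ElementTree as ET

OAI = "{http://www.openarchives.org/OAI/2.0/}"  # OAI-PMH XML namespace

RECORD = """<record xmlns="http://www.openarchives.org/OAI/2.0/">
  <header>
    <identifier>oai:archive.example.org:123</identifier>
    <datestamp>2010-06-01</datestamp>
    <setSpec>thesis</setSpec>
  </header>
  <metadata><!-- Dublin Core payload would appear here --></metadata>
</record>"""

rec = ET.fromstring(RECORD)
header = rec.find(OAI + "header")
identifier = header.find(OAI + "identifier").text   # mandatory, exactly 1
datestamp = header.find(OAI + "datestamp").text     # mandatory, exactly 1
sets = [s.text for s in header.findall(OAI + "setSpec")]  # 0, 1 or more
about = rec.find(OAI + "about")  # optional: None when absent

print(identifier, datestamp, sets, about)
```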
Datestamps
A datestamp is the date of last modification of a metadata record, and is a mandatory characteristic of every item. It has two possible levels of granularity: YYYY-MM-DD or YYYY-MM-DDThh:mm:ssZ. The function of the datestamp is to enable selective harvesting using the from and until arguments; its main application is in incremental update mechanisms. It gives the date of creation, last modification, or deletion. Deletion is supported at three levels: no, persistent, and transient.
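A harvester comparing datestamps against its from/until window has to handle both granularities; a minimal sketch (the helper name is our own):

```python
from datetime import datetime, timezone

def parse_datestamp(value):
    """Parse an OAI-PMH datestamp at either supported granularity."""
    if "T" in value:
        # full granularity: YYYY-MM-DDThh:mm:ssZ (UTC)
        return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)
    # day granularity: YYYY-MM-DD
    return datetime.strptime(value, "%Y-%m-%d").replace(tzinfo=timezone.utc)

d1 = parse_datestamp("2010-06-01")
d2 = parse_datestamp("2010-06-01T12:30:00Z")

# Incremental harvesting keeps records whose datestamp falls inside the
# harvester's [from, until] window:
window_from = parse_datestamp("2010-01-01")
print(window_from <= d1 <= d2)  # True
```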
Metadata schema
OAI-PMH supports dissemination of multiple metadata formats from a repository. The properties of a metadata format are:
– an id string specifying the format (metadataPrefix)
– a metadata schema URL (an XML schema against which to test validity)
– an XML namespace URI (a global identifier for the metadata format)
Repositories must be able to disseminate unqualified Dublin Core. Further arbitrary metadata formats can be defined and transported via the OAI-PMH. Any returned metadata must comply with an XML namespace specification. The Dublin Core Metadata Element Set contains 15 elements; all elements are optional, and all may be repeated.
3.6 The Dublin Core Metadata Element Set:
Title Contributor Source
Creator Date Language
Subject Type Relation
Description Format Coverage
Publisher Identifier Rights
Sets
Sets enable a logical partitioning of repositories. They are optional archives do not have to
define Sets. There are no recommendations for the implementation of Sets. Sets are not
necessarily exhaustive of the content of a repository. They are not necessarily strictly
hierarchical. It is important and necessary to have negotiated agreements within communities
defining useful sets for the communities.
• function: selective harvesting (set parameter)
• applications: subject gateways, dissertation search engine, and others
• examples
o publication types (thesis, article, …)
o document types (text, audio, image, …)
o content sets, according to DNB (medicine, biology, …)
3.7 Request format
Requests must be submitted using the GET or POST methods of HTTP, and repositories must support both methods. At least one key=value pair, verb=RequestType (where RequestType is some type of request such as ListRecords), must be provided. Additional key=value pairs depend on the request type.
Example of a GET request:
http://archive.org/oai?verb=ListRecords&metadataPrefix=oai_dc
The encoding of special characters must be supported; for example, ":" (the host/port separator) becomes "%3A".
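Python's standard library performs this percent-encoding directly; the identifier below is invented, but the escaping behavior is standard.

```python
from urllib.parse import quote, urlencode

# Percent-encode an identifier that contains reserved characters.
identifier = "oai:archive.org:record-1"     # hypothetical identifier
encoded = quote(identifier, safe="")        # each ':' becomes '%3A'

# urlencode applies the same escaping when building a full query string.
query = urlencode({"verb": "GetRecord",
                   "metadataPrefix": "oai_dc",
                   "identifier": identifier})
```

Note that `safe=""` is needed because `quote` leaves `/` unescaped by default; `urlencode` handles the escaping of each value automatically.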
3.8 Response
Responses are formatted as HTTP responses with content type text/xml. HTTP status codes, as distinguished from OAI-PMH errors, such as 302 (redirect) and 503 (service not available), may be returned. Compression is optional in OAI-PMH; only identity encoding is mandatory. The response must be well-formed XML with markup as follows:
1. XML declaration
(<?xml version="1.0" encoding="UTF-8" ?>)
2. root element named OAI-PMH with three attributes
(xmlns, xmlns:xsi, xsi:schemaLocation)
3. three child elements
1. responseDate (UTC datetime)
2. request (the request that generated this response)
3. a) error (in case of an error or exception condition)
b) element with the name of the OAI-PMH request
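The three child elements above can be extracted with a standard XML parser; the sample response below is a trimmed, invented example showing the error case.

```python
import xml.etree.ElementTree as ET

NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

# Invented sample response illustrating responseDate, request, and error.
sample = """<?xml version="1.0" encoding="UTF-8"?>
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <responseDate>2004-10-01T12:00:00Z</responseDate>
  <request>http://archive.example.org/oai</request>
  <error code="badVerb">Illegal verb</error>
</OAI-PMH>"""

root = ET.fromstring(sample)
response_date = root.find("oai:responseDate", NS).text
error = root.find("oai:error", NS)
if error is not None:
    # This is an OAI-PMH protocol error, not an HTTP status code.
    print(error.get("code"))
```

On a successful request, the error element is absent and the third child instead carries the name of the OAI-PMH verb that was issued.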
3.9 OAI-PMH Verbs
Here 'verb' means the request type which the service provider/harvester sends to get responses from data providers. There is a standard set of six verbs:
o Identify
o ListMetadataFormats
o ListSets
o GetRecord
o ListIdentifiers
o ListRecords
Verb / Function
Identify: description of the repository
ListMetadataFormats: metadata formats supported by the repository
ListSets: sets defined by the repository
ListIdentifiers: retrieves the unique identifiers of items
ListRecords: used to harvest records from the repository
GetRecord: retrieves an individual metadata record from the repository
A harvester is not required to use all request types. However, a repository must implement all of them. There are required and optional arguments, depending on the request type.
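The verbs and their argument rules can be sketched as a small request builder; the required-argument mapping below is reconstructed from the protocol description (optional arguments such as from, until, and set are not validated here), and the endpoint URL is invented.

```python
from urllib.parse import urlencode

# Required arguments per verb; the list verbs also accept optional
# selective-harvesting arguments (from, until, set) not checked here.
REQUIRED = {
    "Identify": [],
    "ListMetadataFormats": [],
    "ListSets": [],
    "ListIdentifiers": ["metadataPrefix"],
    "ListRecords": ["metadataPrefix"],
    "GetRecord": ["metadataPrefix", "identifier"],
}

def build_request(base_url, verb, **args):
    """Build an OAI-PMH request URL, checking required arguments."""
    if verb not in REQUIRED:
        raise ValueError(f"unknown verb: {verb}")
    missing = [a for a in REQUIRED[verb] if a not in args]
    if missing:
        raise ValueError(f"{verb} requires {missing}")
    return base_url + "?" + urlencode({"verb": verb, **args})

# Hypothetical GetRecord request against an invented repository.
url = build_request("http://archive.example.org/oai", "GetRecord",
                    metadataPrefix="oai_dc",
                    identifier="oai:archive.example.org:1")
```

A harvester built on this sketch would issue Identify and ListMetadataFormats first to discover what the repository supports, then ListRecords for bulk harvesting.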
4.0 DSpace: OAI-compatible Digital Library Software
DSpace is open source software for building and managing digital repositories. Developed jointly by MIT Libraries and Hewlett-Packard (HP), it is freely available to research institutions as an open source system that can be customized and extended. DSpace is a digital institutional repository that captures, stores, indexes, preserves, and redistributes content in digital formats. An institutional repository is a set of services that a research institution, organization, or university offers to the members of its community for the management and dissemination of digital materials created by the institution and its community members. Typically, DSpace has been deployed for institutional repositories of publications, theses, and dissertations. Several groups are working on extending its capabilities, such as the implementation of ontologies in the search interface and the submission module, customization for the management of electronic theses and dissertations, and localization and internationalization of the package for the world's languages.
DSpace is compliant with OAI-PMH version 2.0, and metadata in DSpace digital libraries can be harvested.
4.1 DSpace Search System
The end user can browse, search, and access the collections using the hierarchies and the alphabetic bar menu. For searching the collection, DSpace uses the Lucene search engine, which is part of the Apache Jakarta Project (1). Additionally, research projects such as the …(Portugal)… provide ontologies that enable context-based querying; this works like a subject-based directory structure.
The Lucene search engine has very powerful search features that cover many search approaches of the end user. It provides basic 'exact term' or keyword search. In addition, it allows fielded search akin to the field-level search of library databases; in DSpace, Dublin Core elements are used for the field names. Lucene also supports Boolean search, range searches, term boosting, and proximity searches. An interesting search facility of Lucene is fuzzy matching based on the Levenshtein algorithm (5), which can match terms by similarity. This feature is especially useful when we hear a term and guess at its spelling, and more so in the case of personal names.
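The search approaches listed above map onto Lucene's classic query parser syntax; the query strings below are illustrative, and the field names assume the Dublin Core elements DSpace uses for indexing.

```python
# Illustrative Lucene query strings (classic query parser syntax).
queries = [
    "metadata",                           # plain keyword search
    "title:harvesting",                   # fielded search on a DC element
    "title:metadata AND creator:lagoze",  # Boolean search
    "date:[2000 TO 2004]",                # inclusive range search
    "protocol^4 harvesting",              # term boosting
    '"open archives"~5',                  # proximity search within 5 words
    "levenstien~",                        # fuzzy (Levenshtein-based) search
]
```

The fuzzy query in the last line would match the correctly spelled "levenshtein" despite the misspelling, which is the behavior the paragraph above describes for guessed spellings and personal names.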
4.2 Metadata in DSpace
DSpace users come across metadata in the following modules:
o Administration modules: Dublin Core registry, administrative metadata (default values), mail alerts to subscribers
o Submission modules: descriptive metadata
o Harvesting: OAI-PMH using the (unqualified) DC elements
o Search result display: brief and full metadata
4.3 Metadata harvesting in DSpace
DSpace is compliant with the OAI-PMH for exposing metadata. OAI-PMH allows repositories to expose a hierarchy of sets in which records may be placed. DSpace exposes collections as sets: each collection has a corresponding OAI set, and harvesters use the ListSets verb (OAI command) to discover the sets. Only the 15 basic Dublin Core elements are exposed at present.
5.0 OAI Harvester Software
o Arc (http://arc.cs.odu.edu/)
o Citebase (http://citebase.eprints.org/cgi-bin/search)
o CYCLADES (http://www.ercim.org/cyclades/)
o DP9 (http://arc.cs.odu.edu:8080/dp9/index.jsp)
o MeIND (http://www.meind.de/)
o METALIS (http://metalis.cilea.it/)
o my.OAI (http://www.myoai.com)
o NCSTRL (http://www.ncstrl.org/)
o Perseus (http://www.perseus.tufts.edu/cgi-bin/vor)
o Public Knowledge Project – Open Archives Harvester (http://pkp.ubc.ca/harvester/)
o OAICAT (http://www.oclc.org/research/software/oai/cat.htm)
o OAI Repository Explorer (http://re.cs.uct.ac.za/)
o OAIster (http://oaister.umdl.umich.edu/o/oaister/)
o OASIC (Open Archives en SIC) (http://oasic.ccsd.cnrs.fr/)
o OAIHarvester (http://www.oclc.org/research/software/oai/harvester.htm)
o DLESE OAI Software (http://dlese.org/oai/index.jsp)
6.0 Future Prospects
Some more work has to be done in order to make OAI-PMH a complete, globally accepted metadata harvesting protocol:
o Tools and software have to be developed to convert non-OAI-PMH-compliant repositories into OAI-PMH-compliant ones, so that such repositories can become data providers.
o Higher versions of the protocol should be made compatible with the lower ones.
At the metadata creation level, some standardization is required, as a particular resource may be described inconsistently at different repositories. Vocabulary control measures should also be taken. Some further improvements are still awaited in the OAI-PMH protocol; only then can we ensure a comprehensive view of the resources available on a particular subject for our end users.
7.0 Conclusion
Much promise is seen for the use of the protocol within an open archives approach. Support for a
new pattern for scholarly communication is the most publicized potential benefit. Perhaps most
readily achievable are the goals of surfacing 'hidden resources' and low cost interoperability.
Although the OAI-PMH is technically very simple, building coherent services that meet user
requirements remains complex. The OAI-PMH protocol could become part of the infrastructure
of the Web, as taken-for-granted as the HTTP protocol now is, if a combination of its relative
simplicity and proven success by early implementers in a service context leads to widespread
uptake by research organizations, publishers and archives.
REFERENCES
1. http://www.openarchives.org/
2. Breeding, M. (2002, April). The Emergence of the Open Archives Initiative: This Protocol
could become a key part of the digital library infrastructure. Information Today.
from http://www.findarticles.com/cf_0/m3336/4_19/85251474/p1/article.jhtml
3. Breeding, M. (2002). Understanding the Protocol for Metadata Harvesting of the Open
Archives Initiative. Computers in Libraries, 22(8).
4. Lagoze, C., & Sompel, H. V. d. (2001, January). The Open Archives Initiative Protocol for Metadata Harvesting. from http://www.openarchives.org/OAI/openarchivesprotocol.htm
5. Lynch, C. A. (2001, August). Metadata Harvesting and the Open Archives Initiative. ARL Bimonthly Report 217. from http://www.arl.org/newsltr/217/mhp.html
6. Shearer, K. (2002, March). The Open Archives Initiative: Developing an Interoperability
Framework for Scholarly Publishing. CARL/ABRC Background Series, No. 5. from
http://www.carl-abrc.ca/projects/scholarly/open_archives.PDF
7. Suleman, H., & Fox, E. A. (2001, December). A Framework for Building Open Digital
Libraries. D-Lib Magazine, 7(12). from
http://www.dlib.org/dlib/december01/suleman/12suleman.html
8. Sompel, H. V. d., & Lagoze, C. (2000, February). The Santa Fe Convention of the Open Archives Initiative. D-Lib Magazine, 6(2). from http://www.dlib.org/dlib/february00/vandesompel-oai/02vandesompel-oai.html
9. Warner, S. (2001, June). Exposing and Harvesting Metadata Using the OAI Metadata
Harvesting Protocol: A Tutorial. HEP Libraries Webzine Issue 4. from
http://library.cern.ch/HEPLW/4/papers/3/
11. http://www.ukoln.ac.uk/repositories/digirep/index/FAQs
12. Michael Shepherd. (2003). Interoperability for Digital Libraries. DRTC Workshop on Semantic Web, 8th–10th December 2003, DRTC, Bangalore.
13. http://www.openarchives.org/Register/BrowseSites
14. http://www.openarchives.org/service/listproviders.html