Set: a grouping of items made to allow selective harvesting
E.g. all theses
E.g. the Engineering section
E.g. all resources from a given source
Harvester can ask for records in a specific metadata format for
All available items
All items in a set
All records modified in a given date range
(A single item — GetRecord)
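The request options above map onto OAI-PMH query parameters. A minimal sketch of how a harvester might build the request URLs, assuming a hypothetical base URL (repository.example.org) and hypothetical helper names; the verbs and the parameters metadataPrefix, set, from, until and identifier are those of OAI-PMH 2.0:

```python
from urllib.parse import urlencode

# Hypothetical endpoint, used for illustration only.
BASE_URL = "https://repository.example.org/oai"


def list_records_url(base_url, metadata_prefix, set_spec=None,
                     from_date=None, until_date=None):
    """Build a ListRecords request, optionally limited by set and date range."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    if from_date:
        params["from"] = from_date
    if until_date:
        params["until"] = until_date
    return base_url + "?" + urlencode(params)


def get_record_url(base_url, identifier, metadata_prefix):
    """Build a GetRecord request for a single item."""
    params = {
        "verb": "GetRecord",
        "identifier": identifier,
        "metadataPrefix": metadata_prefix,
    }
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    # All available items in Dublin Core
    print(list_records_url(BASE_URL, "oai_dc"))
    # All items in a (hypothetical) "theses" set modified during 2003
    print(list_records_url(BASE_URL, "oai_dc", "theses",
                           "2003-01-01", "2003-12-31"))
    # A single item
    print(get_record_url(BASE_URL, "oai:example.org:123", "oai_dc"))
```

The set name "theses" and the identifier "oai:example.org:123" are placeholders; a real harvester would discover valid values through ListSets and ListIdentifiers.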
Data provider can return
All relevant records
Some relevant records + resumption token
An error code (no such set / metadata format)
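The three kinds of response can be told apart by inspecting the returned XML. A minimal sketch, assuming the response is already available as a string; the function name and the sample documents are illustrative, while the namespace, the resumptionToken element and the cannotDisseminateFormat error code come from OAI-PMH 2.0:

```python
import xml.etree.ElementTree as ET

OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"


def parse_list_records(xml_text):
    """Return record identifiers, the resumption token (if any) and errors."""
    root = ET.fromstring(xml_text)
    errors = [(e.get("code"), (e.text or "").strip())
              for e in root.findall(OAI_NS + "error")]
    if errors:
        return {"identifiers": [], "resumption_token": None, "errors": errors}
    identifiers = [h.findtext(OAI_NS + "identifier")
                   for h in root.iter(OAI_NS + "header")]
    token = root.findtext(OAI_NS + "ListRecords/" + OAI_NS + "resumptionToken")
    # An empty resumptionToken element marks the end of the list.
    token = token.strip() if token and token.strip() else None
    return {"identifiers": identifiers, "resumption_token": token, "errors": []}


# Illustrative responses, trimmed to the elements the parser looks at.
PARTIAL = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><header><identifier>oai:example.org:1</identifier></header></record>
    <resumptionToken>token-abc</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

ERROR = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <error code="cannotDisseminateFormat">Unsupported metadata format</error>
</OAI-PMH>"""

if __name__ == "__main__":
    print(parse_list_records(PARTIAL))
    print(parse_list_records(ERROR))
```

When a resumption token is present, the harvester repeats the ListRecords request with that token as its only argument besides the verb, and continues until no token is returned.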
Even lighter-weight specification for data providers with small and relatively static collections
E.g. the output from a conference
Essentially an XML file available at a URL
Accessed through a “static repository gateway” intermediary
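A harvester addresses a static repository through the gateway rather than fetching the XML file directly. A minimal sketch of how the harvesting base URL might be derived, assuming the gateway follows the convention in the static repository specification (gateway URL, a slash, then the file's URL with the leading "http://" removed); the function name and both URLs are hypothetical:

```python
def static_repository_base_url(gateway_url, static_repo_url):
    """Combine a gateway URL and a static repository file URL.

    Assumes the convention: gateway URL + "/" + file URL without "http://".
    """
    prefix = "http://"
    if not static_repo_url.startswith(prefix):
        raise ValueError("expected an http:// URL for the static repository")
    return gateway_url.rstrip("/") + "/" + static_repo_url[len(prefix):]


if __name__ == "__main__":
    # Hypothetical gateway and conference output file
    print(static_repository_base_url(
        "http://gateway.example.org/oai",
        "http://conference.example.org/papers.xml",
    ))
```

The resulting URL is then used like any other OAI-PMH base URL, with the gateway answering requests on behalf of the static file.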
Providing data is easy
Harvesting data is easy
Doing so may lead to complex workflow / policy issues
What do you do with the harvested metadata?
Do you modify the metadata you harvest?
If so, do you feed this back to the provider?
What if the provider changes a modified record?
Does a service provider disseminate via OAI?
Lots of implementers, who have produced lots of useful support
Relatively little commercial uptake
Relatively little support for harvesting rich metadata
Relatively little support/consensus on sets
Issues: Harvesting the resource (e.g. full text)
Nothing in OAI-PMH requires that the full text be available for harvesting.
Resource may be physical or access controlled
Nothing in OAI-PMH requires that the information needed to harvest the resource be available.
However in many cases OAI-PMH will provide the information required to harvest the resource.
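In practice the information usually arrives as a URL in the record's Dublin Core identifier element. A minimal sketch of pulling candidate resource URLs out of an oai_dc record, assuming the provider puts a resolvable URL in dc:identifier (common, but not required by the protocol); the function name and the sample record are illustrative:

```python
import xml.etree.ElementTree as ET

DC_NS = "{http://purl.org/dc/elements/1.1/}"


def resource_urls(record_xml):
    """Return dc:identifier values that look like HTTP(S) URLs."""
    root = ET.fromstring(record_xml)
    urls = []
    for element in root.iter(DC_NS + "identifier"):
        value = (element.text or "").strip()
        if value.startswith(("http://", "https://")):
            urls.append(value)
    return urls


# Illustrative record: one URL identifier and one non-URL identifier.
SAMPLE = """<oai_dc:dc
    xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>An example thesis</dc:title>
  <dc:identifier>https://repository.example.org/theses/42</dc:identifier>
  <dc:identifier>ISBN 0-000-00000-0</dc:identifier>
</oai_dc:dc>"""

if __name__ == "__main__":
    print(resource_urls(SAMPLE))
```

A URL found this way may point to a landing page rather than the full text itself, and the resource behind it may still be access controlled.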
Arc: A Cross Archive Search Service
From October 2000
Arc is an experimental research service that serves as a platform for demonstrating the scalability of the OAI-PMH and as a vehicle for providing access to OAI-compliant repositories through a unified search interface. Arc is the oldest federated search service based on the OAI-PMH.
OAIster is a union catalog of digital resources. We provide access to these digital resources by "harvesting" their descriptive metadata (records) using OAI-PMH (the Open Archives Initiative Protocol for Metadata Harvesting).
Citebase “allows researchers to search across free, full-text research literature eprint archives, with results ranked according to many criteria (e.g., citation impact), and then to navigate that literature using citation links and analysis.”
SAIL-eprints (Search, Alert, Impact and Link)
April 2003
SAIL-eprints is “an electronic open access service provider for finding scientific or technical documents, published or unpublished, in Chemistry, Physics, Engineering, Materials Sciences, Nanotechnologies, Microelectronics, Computer Sciences, Astronomy, Astrophysics, Earth Sciences, Meteorology, Oceanography, . . . [Agriculture], and related . . . [subjects].”
Open Archives Initiative
Spec, best practice guide and useful resources, mailing lists