Purdue University
Purdue e-Pubs
Libraries Research Publications
11-22-2006
Providing an OAI-PMH Interface to the Storage
Resource Broker With OAISRB
Michael Witt
Purdue University, mwitt@purdue.edu
Jigar Kadakia
Follow this and additional works at: http://docs.lib.purdue.edu/lib_research
This document has been made available through Purdue e-Pubs, a service of the Purdue University Libraries. Please contact epubs@purdue.edu for
additional information.
Witt, Michael and Kadakia, Jigar, "Providing an OAI-PMH Interface to the Storage Resource Broker With OAISRB" (2006). Libraries
Research Publications. Paper 78.
http://docs.lib.purdue.edu/lib_research/78
Providing an OAI-PMH Interface to the Storage Resource Broker With OAISRB
Michael Witt & Jigar Kadakia, Purdue University Libraries, West Lafayette, Indiana, USA.
ABSTRACT
The OAISRB software was developed by Michael Witt
and Jigar Kadakia of the Purdue University Libraries
to provide an Open Archives Initiative Protocol for
Metadata Harvesting (OAI-PMH) interface to expose
metadata from digital objects residing on the
Storage Resource Broker (SRB) to OAI service
providers. By harvesting metadata from the SRB, grid
resources such as large research datasets from
computational sciences can be represented
alongside more conventional digital library objects
such as e-prints and digitized archival collections.
OAI-PMH
The OAI-PMH defines a protocol that allows a service
provider to harvest metadata from a compliant
repository, also known as a data provider. OAI-PMH
requests are constructed using six different verbs:
• Identify
• ListSets
• ListMetadataFormats
• ListIdentifiers
• ListRecords
• GetRecord
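As an illustration of how a harvester might construct requests with these verbs, the following is a minimal sketch. The base URL and the set and identifier values are hypothetical and do not come from the poster; only the verb names and the standard OAI-PMH argument names (metadataPrefix, set, identifier, from, until) are taken from the protocol.

```python
from urllib.parse import urlencode

# Hypothetical base URL; the poster does not give a live endpoint.
BASE_URL = "http://example.org/oaisrb/OAIHandler"

# The six verbs defined by OAI-PMH.
OAI_VERBS = ("Identify", "ListSets", "ListMetadataFormats",
             "ListIdentifiers", "ListRecords", "GetRecord")


def build_request(verb, **params):
    """Return an OAI-PMH GET request URL for the given verb and arguments."""
    if verb not in OAI_VERBS:
        raise ValueError(f"not an OAI-PMH verb: {verb}")
    query = {"verb": verb}
    # "from" is a reserved word in Python, so callers pass from_ instead
    # and the trailing underscore is stripped here.
    for key, value in params.items():
        query[key.rstrip("_")] = value
    return BASE_URL + "?" + urlencode(query)


print(build_request("Identify"))
print(build_request("ListRecords", metadataPrefix="oai_dc", set="datasets"))
print(build_request("ListIdentifiers", metadataPrefix="oai_dc", from_="2006-01-01"))
```

Each request is an ordinary HTTP GET whose query string carries the verb and its arguments, which is why a harvester needs nothing more than a URL builder and an HTTP client.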
STORAGE RESOURCE BROKER
The SRB functions as a data grid, allowing multiple
physical storage units to be aggregated and
represented as a single logical storage unit. A
variety of client programs facilitate access to
files on the SRB and perform functions such as
filesystem virtualization. A metadata catalog (MCAT)
service keeps track of the relationships among users,
files, and storage units and supports different
levels of description that users can query in order
to access and use objects.
OAISRB
OAISRB essentially serves as middleware, converting
OAI-PMH requests into SRB queries and returning
the results from the MCAT in an OAI-compliant
manner. It is implemented as a Java servlet that runs
as a web service, for example, on an Apache Tomcat
server.
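[Figure 1. OAISRB Flow Diagram: a harvester sends OAI-PMH requests over HTTP to the OAISRB external interface (OAICat) running on an Apache Tomcat server; the OAISRB internal SRB client (Jargon) queries MCAT (SRB), and results are returned to the harvester as XML.]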
OAISRB's external interface listens on TCP port 80 for OAI-PMH requests, which arrive via HTTP. The response to
an Identify request provides basic information about the repository, such as its name, administrator email
address, and protocol version, taken from a configuration file, oaihandler.properties. An OAI set is analogous to a
collection in SRB, and OAISRB responds to the ListSets verb with the list of collections that are statically
defined in oaihandler.properties. Because the current version of OAISRB supports only the oai_dc (Dublin
Core) schema, the response to the ListMetadataFormats request is hard-coded. For all other requests,
OAISRB extracts the verb and other parameters from the URL, constructs a query, connects to SRB using
the internal interface, and passes the query to MCAT. The results are received from MCAT as an array that is
formatted into XML and validated before being returned through the external interface to the harvester. The
GetRecord verb retrieves an individual metadata record. The ListRecords verb harvests a group of metadata
records from a collection that meet the parameters of a query. The ListIdentifiers verb returns a list of
abbreviated metadata records (record headers only). Both OAISRB interfaces use open source software; the external OAI-PMH
interface is based on OAICat, developed by Jeff Young at the Online Computer Library Center (OCLC), and
the internal SRB interface uses the Jargon Java client API developed by the San Diego
Supercomputer Center (SDSC).
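The request routing described above can be sketched as follows. This is not the OAISRB source code (which is a Java servlet built on OAICat and Jargon); it is a simplified Python illustration in which the CONFIG dictionary stands in for values read from oaihandler.properties and query_mcat is a hypothetical placeholder for the internal SRB interface. All names and sample values are assumptions.

```python
from urllib.parse import parse_qs, urlsplit

# Hypothetical stand-ins for values OAISRB reads from oaihandler.properties.
CONFIG = {
    "repositoryName": "Example SRB Repository",
    "adminEmail": "admin@example.org",
    "protocolVersion": "2.0",
    "sets": ["datasets", "images"],
}


def query_mcat(verb, args):
    """Hypothetical placeholder for the internal interface; returns rows as a list."""
    return [{"identifier": "oai:example.org:datasets/run1.dat"}]


def handle(url):
    """Route one OAI-PMH request URL according to its verb."""
    params = {k: v[0] for k, v in parse_qs(urlsplit(url).query).items()}
    verb = params.pop("verb", None)
    if verb == "Identify":
        # Answered from configuration, no MCAT query needed.
        return {k: CONFIG[k] for k in ("repositoryName", "adminEmail", "protocolVersion")}
    if verb == "ListSets":
        # Sets are the statically configured collections.
        return list(CONFIG["sets"])
    if verb == "ListMetadataFormats":
        # Hard-coded because only oai_dc is supported.
        return ["oai_dc"]
    if verb in ("GetRecord", "ListRecords", "ListIdentifiers"):
        # All remaining verbs are turned into a query against MCAT.
        return query_mcat(verb, params)
    # badVerb is the error code OAI-PMH defines for an illegal or missing verb.
    return {"error": "badVerb"}


print(handle("http://example.org/oaisrb/OAIHandler?verb=ListSets"))
```

The split mirrors the text: three verbs are answered from static configuration, and only the record-oriented verbs reach the metadata catalog.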
The oaihandler.properties file also defines the base URL and port for the OAI-PMH interface, the SRB
connection parameters (e.g., username, password, zone), and information about the server and collections.
A crosswalk feature converts from one metadata format to oai_dc on the fly after the record set is returned
from MCAT. Custom field mappings can be defined for each collection. Lastly, a logging function records
information related to performance and errors for debugging purposes.
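A per-collection crosswalk of the kind described above might look like the following sketch. The collection name, the source field names, and the sample row are hypothetical; the poster does not document the actual mapping syntax used in oaihandler.properties. Only the two XML namespaces are the standard ones for oai_dc and Dublin Core.

```python
import xml.etree.ElementTree as ET

OAI_DC_NS = "http://www.openarchives.org/OAI/2.0/oai_dc/"
DC_NS = "http://purl.org/dc/elements/1.1/"

# Hypothetical per-collection mapping from source field names
# to Dublin Core element names.
FIELD_MAP = {
    "datasets": {"data_name": "title", "owner": "creator", "created": "date"},
}


def crosswalk(collection, row):
    """Convert one metadata row (a dict) into an oai_dc XML element."""
    ET.register_namespace("oai_dc", OAI_DC_NS)
    ET.register_namespace("dc", DC_NS)
    root = ET.Element(f"{{{OAI_DC_NS}}}dc")
    for source_field, dc_element in FIELD_MAP[collection].items():
        value = row.get(source_field)
        if value:  # skip fields the row does not carry
            ET.SubElement(root, f"{{{DC_NS}}}{dc_element}").text = str(value)
    return root


row = {"data_name": "run1.dat", "owner": "jdoe", "created": "2006-08-30"}
print(ET.tostring(crosswalk("datasets", row), encoding="unicode"))
```

Keeping the mapping in data rather than in code is what allows each collection to define its own field mappings without changes to the servlet.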
OAISRB (cont'd)
OAISRB was approved for open source
distribution by Purdue’s Office of Technology
Commercialization (request #64660) on August
30, 2006. It was tested in collaboration with the
Rosen Center for Advanced Computing, a
division of Information Technology at Purdue, on
the TeraGrid. Feedback from developers and
users will help identify bugs to fix and features to
add in new releases.
FUTURE WORK
Future work on OAISRB includes adding support
for metadata formats other than Dublin Core and
the ability to reference an XML schema on the
web to perform dynamic crosswalks. More
flexibility and granularity in defining objects in
collections could be provided by adding
pattern-matching capability. For example, a
service provider may be interested only in
graphic files in a collection and apply a regex
filter (\.tif$) in its collection definition. Once
standards are defined, OAISRB could be
extended to allow harvesting of objects as well as
metadata. In the case of large datasets, it is
impractical to ingest them into a single
repository. A new set of middleware tools can be
developed to use existing protocols to provide
interoperability to access data wherever it resides
(e.g., on the grid, on a researcher’s desktop, etc.).
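The proposed pattern-matching filter could be sketched as below. This is only an illustration of the idea, since the feature is future work; the function name and the file names are hypothetical, and case-insensitive matching is an added assumption. The pattern escapes the dot so that it matches a literal period; an unescaped dot would also accept a name such as "motif".

```python
import re


def filter_objects(names, pattern):
    """Return the object names in a collection that match a regex filter."""
    compiled = re.compile(pattern, re.IGNORECASE)  # case-insensitivity is an assumption
    return [name for name in names if compiled.search(name)]


names = ["scan001.tif", "scan002.TIF", "notes.txt", "motif"]
print(filter_objects(names, r"\.tif$"))  # prints ['scan001.tif', 'scan002.TIF']
```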
LINKS
Purdue University Libraries, OAISRB:
http://www.lib.purdue.edu/research/oaisrb
The OAI-PMH Version 2.0:
http://www.openarchives.org/OAI/openarchivesprotocol.htm
San Diego Supercomputer Center, SRB:
http://www.sdsc.edu/srb
Online Computer Library Center, OAICat:
http://oclc.org/research/software/oai/cat.htm
San Diego Supercomputer Center, Jargon:
http://www.sdsc.edu/srb/index.php/Jargon
