Sharing with the Open Archives Initiative
1. Sharing With the Open
Archives Initiative
Jenn Riley
Metadata Librarian
Indiana University
2. 3/20/07 Getty Technical Talk 2
Purpose of Open Archives Initiative
“develops and promotes interoperability
standards that aim to facilitate the efficient
dissemination of content”
“has its roots in the open access and
institutional repository movements”
“Archive” defined broadly: “as a repository for
stored information”
3. 3/20/07 Getty Technical Talk 3
Early history of the Open Archives
Initiative
Originally defined a “metadata harvesting
protocol” – OAI-PMH
Grew out of efforts to share e-prints
Original work supported by:
Digital Library Federation (DLF)
Coalition for Networked Information (CNI)
National Science Foundation (NSF)
4. 3/20/07 Getty Technical Talk 4
OAI-PMH
Protocol history
Version 1.0 released January 2001
Version 1.1 released July 2001
Version 2.0 released June 2002
No further major revisions planned
Protocol for harvesting metadata, not content
No inherent assumption that the metadata
describes digital content
5. 3/20/07 Getty Technical Talk 5
How OAI-PMH works
[Diagram from OAI for Beginners, the Open Archives Forum online tutorial, at http://www.oaforum.org/tutorial/english/intro.htm]
6. 3/20/07 Getty Technical Talk 6
Data providers
Set up a server that responds to harvesting
requests
Required to expose metadata in simple Dublin
Core (DC) format
Can supplement DC with metadata in any other
format expressible with an XML schema (e.g.,
CDWA Lite)
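As a rough illustration of the harvesting requests a data provider answers, the sketch below builds OAI-PMH request URLs. The six verbs and the `oai_dc` metadata prefix come from the OAI-PMH 2.0 specification; the base URL and the `build_request` helper are hypothetical names used only for this example.

```python
from urllib.parse import urlencode

# The six request types ("verbs") defined by OAI-PMH 2.0.
OAI_VERBS = {
    "Identify", "ListMetadataFormats", "ListSets",
    "ListIdentifiers", "ListRecords", "GetRecord",
}

def build_request(base_url, verb, **arguments):
    """Return the GET URL for an OAI-PMH request (hypothetical helper)."""
    if verb not in OAI_VERBS:
        raise ValueError(f"unknown OAI-PMH verb: {verb}")
    # The verb goes first, followed by any verb-specific arguments.
    query = urlencode({"verb": verb, **arguments})
    return f"{base_url}?{query}"

# Hypothetical base URL, for illustration only.
BASE_URL = "http://example.org/oai"

# Ask the repository to describe itself.
print(build_request(BASE_URL, "Identify"))
# Ask which metadata formats it offers beyond the required simple DC.
print(build_request(BASE_URL, "ListMetadataFormats"))
# Harvest all records in simple Dublin Core.
print(build_request(BASE_URL, "ListRecords", metadataPrefix="oai_dc"))
```

A supplemental format such as CDWA Lite would be requested the same way, using whatever metadata prefix the repository reports in its ListMetadataFormats response.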
7. 3/20/07 Getty Technical Talk 7
Service providers
Harvest and store metadata
Generally provide search/browse access to this
metadata
Can be general or domain-specific
Can choose to collect metadata in formats other
than DC
Can provide value-added services
Sometimes re-expose metadata to other
aggregations
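The harvesting side can be sketched in a similar way. The example below parses one page of a ListRecords response and pulls out the fields a service provider might store. The XML namespaces are those of OAI-PMH 2.0 and simple Dublin Core; the sample response, its identifiers, and the `parse_list_records` helper are made up for illustration, and a real harvester would also need to fetch responses over HTTP and handle errors and deleted records.

```python
import xml.etree.ElementTree as ET

NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Made-up, abbreviated ListRecords response used only for this sketch.
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:example.org:1</identifier>
        <datestamp>2007-03-20</datestamp>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Example title</dc:title>
          <dc:identifier>http://example.org/items/1</dc:identifier>
        </oai_dc:dc>
      </metadata>
    </record>
    <resumptionToken>token-123</resumptionToken>
  </ListRecords>
</OAI-PMH>
"""

def parse_list_records(xml_text):
    """Return (records, resumption_token) for one page of ListRecords."""
    root = ET.fromstring(xml_text.strip())
    records = []
    for record in root.findall("oai:ListRecords/oai:record", NS):
        records.append({
            "oai_identifier": record.findtext(
                "oai:header/oai:identifier", namespaces=NS),
            "titles": [el.text for el in record.findall(".//dc:title", NS)],
            # dc:identifier often carries a URL back to the home institution.
            "links": [el.text for el in record.findall(".//dc:identifier", NS)],
        })
    # A non-empty token means more pages remain to be requested.
    token = root.findtext("oai:ListRecords/oai:resumptionToken", namespaces=NS)
    return records, token or None

records, token = parse_list_records(SAMPLE_RESPONSE)
print(records, token)
```

When a resumption token is returned, the harvester repeats the ListRecords request with that token until no token comes back.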
8. 3/20/07 Getty Technical Talk 8
Typical service provider behavior
Today
Collect and normalize metadata
Provide basic discovery
Send user back to home institution for more
information and/or access to content
Future
Metadata enrichment
Resource licensing
…
9. 3/20/07 Getty Technical Talk 9
Why share metadata?
Benefits to users
One-stop searching
Aggregation of subject-specific resources
Benefits to institutions
Increased exposure for collections
Broader user base
Bringing together of distributed collections
Don’t expect users will know about your
collection and remember to visit it.
10. 3/20/07 Getty Technical Talk 10
Why share metadata with OAI-PMH?
“Low barrier” protocol
Shares metadata only, not content,
simplifying rights issues
Same effort on your part to share with one or
a hundred service providers (basically)
Wide adoption in the cultural heritage sector
Quickly eclipsed methods such as Z39.50
11. 3/20/07 Getty Technical Talk 11
Sharing can be hard
Some initiatives have fizzled out
CIMI
AMICO
Some are still going
ARTstor
RLG Cultural Materials
CAMIO and other AMICO derivatives
Art museums have been most active
Customizing for each individual aggregator isn’t
sustainable
12. 3/20/07 Getty Technical Talk 12
Sharing is easier with OAI-PMH
Framework for sharing with multiple
aggregators
Museum-centric OAI initiatives are emerging
CDWA Lite from the Getty
RLG Museum Collections Sharing Working Group
Museums are beginning to explore more
open sharing models
13. 3/20/07 Getty Technical Talk 13
Challenges to OAI-PMH adoption for
museums
Protocol implicitly assumes you want
metadata to be harvestable by anyone
DC a poor match for describing most
museum materials
Museums often want to share content as well
as metadata (with select partners)
One solution? Start a specialized service
provider in the community.
14. 3/20/07 Getty Technical Talk 14
Some service providers
OAIster
National Science Digital Library
Sheet Music Consortium
Open Language Archives Community
15. 3/20/07 Getty Technical Talk 15
“Shareable” metadata
Promotes search interoperability - “the ability
to perform a search over diverse sets of
metadata records and obtain meaningful
results” (Priscilla Caplan)
Is human understandable outside of its local
context
Is useful outside of its local context
Preferably is machine processable
16. 3/20/07 Getty Technical Talk 16
Models for sharing with OAI-PMH
[Diagram: three models for exposing metadata to an OAI harvester]
Digital asset management system: metadata creation module, transformation, built-in OAI data provider module
Metadata creation system: transformation, stand-alone OAI data provider
Metadata creation module: transformation, XML file, Static Repository Gateway
Formats shown: DC, QDC, MODS, CDWA Lite
17. 3/20/07 Getty Technical Talk 17
Basic metadata sharing workflow
Create metadata, thinking about shareability
Determine format(s) you wish to share your
metadata in
Transform records into versions appropriate for
sharing via OAI
Validate transformed metadata
Load transformed metadata into OAI data provider
Test with OAI Repository Explorer
Communicate with service providers
See what your metadata looks like once a service
provider harvests it
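The transform and validate steps of this workflow might look something like the sketch below, which maps a local record to a simple DC (`oai_dc`) XML fragment. The two namespaces are the standard ones; the local field names, the mapping, and both helper functions are hypothetical. The check shown only confirms that the output is well-formed XML; validating against the oai_dc schema would need a schema-aware tool, which Python's standard library does not include.

```python
import xml.etree.ElementTree as ET

OAI_DC = "http://www.openarchives.org/OAI/2.0/oai_dc/"
DC = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("oai_dc", OAI_DC)
ET.register_namespace("dc", DC)

# Hypothetical mapping from local field names to simple DC elements.
FIELD_MAP = {
    "object_title": "title",
    "maker": "creator",
    "date_made": "date",
    "object_url": "identifier",
}

def to_oai_dc(local_record):
    """Transform a local record (a dict) into an oai_dc XML string."""
    root = ET.Element(f"{{{OAI_DC}}}dc")
    for local_field, dc_element in FIELD_MAP.items():
        value = local_record.get(local_field)
        # Accept either a single value or a list of repeated values.
        values = value if isinstance(value, list) else [value]
        for v in values:
            if v:
                ET.SubElement(root, f"{{{DC}}}{dc_element}").text = str(v)
    return ET.tostring(root, encoding="unicode")

def is_well_formed(xml_text):
    """Return True if the text parses as XML (not a schema validation)."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

# Made-up local record, for illustration only.
record = {
    "object_title": "Example vase",
    "maker": ["Example Maker", "Example Workshop"],
    "object_url": "http://example.org/items/1",
}
xml_text = to_oai_dc(record)
print(xml_text)
print(is_well_formed(xml_text))
```

Keeping the mapping in one table makes it easier to review with shareability in mind, for example checking that the identifier really resolves to the item outside its local context.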
18. 3/20/07 Getty Technical Talk 18
Expanding the scope
“Over time, however, the work of OAI has
expanded to promote broad access to digital
resources for eScholarship, eLearning, and
eScience”
Some experiments with sharing content
CIC Metadata Portal
Fedora Asset Actions
MPEG-21 DIDL over OAI-PMH
OAI-ORE
19. 3/20/07 Getty Technical Talk 19
CIC Metadata Portal
Research project to build an OAI-based
aggregator for a consortium of academic
libraries in the Midwest
Created a version of qualified DC to indicate
the location of a thumbnail image
Integrated harvested thumbnails into search
interface
Procedure documented in January 2006
D-Lib Magazine article
20. 3/20/07 Getty Technical Talk 20
Asset Actions
Grew out of need for “actionable URLs”
XML schema designed for facilitating the sharing
and manipulation of digital objects
Define core functions for digital objects of all types,
e.g., “get preview”
Begun to define further functions for specific content
types
Proof-of-concept implementation created for DLF
Aquifer project
Can be shared via OAI-PMH as a supplemental
metadata format
Documented in October 2006 D-Lib Magazine
article
21. 3/20/07 Getty Technical Talk 21
MPEG-21 DIDL over OAI-PMH
Repository architecture at Los Alamos National
Laboratory uses MPEG-21 DIDL for complex digital
objects
OAI-PMH repositories integral parts of the internal
repository architecture
These internal OAI-PMH repositories do not support
DC metadata – “Because mapping a DID that
represents a complex digital object to simple DC is
quite an impossible task, support of DC by these
OAI-PMH repositories is rather meaningless.”
Described in 2004 JCDL paper
Laid the groundwork for OAI-ORE
22. 3/20/07 Getty Technical Talk 22
OAI-ORE
Open Archives Initiative Object Re-Use and
Exchange
Two-year Mellon-funded project beginning October
2006
Will develop specifications that allow distributed
repositories to exchange information about their
constituent digital objects
Goal is to facilitate “a new digitally-based scholarly
communication framework”
Imagine how research could be transformed if users
had seamless access to information in any
repository, anywhere, and the tools to use it
23. 3/20/07 Getty Technical Talk 23
Goals for initial OAI-ORE project
Formation of an international advisory committee,
consisting of leaders in e-Science, institutional
repositories, publishing, library, and educational
technology communities.
Formation of an international working group that will
meet over the two-year period and develop the set
of ORE specifications.
Establishment and management of an experimental
deployment community that will exercise the
developed standards in a variety of contexts.
Establishment of a sustainable community to
support the widespread deployment and
management of the standards fabric.
24. 3/20/07 Getty Technical Talk 24
OAI-ORE potential for cultural
materials
Original focus is on scholarship, and sharing
text and datasets
No inherent limitations to these uses
Facilitates, but doesn’t require exchange of
actual content
Focus on complex objects better suited to
cultural materials than OAI-PMH model
It’s still early, but inherent flexibility of model
looks promising for cultural materials
25. 3/20/07 Getty Technical Talk 25
For more information
jenlrile@indiana.edu
These presentation slides
<http://www.dlib.indiana.edu/~jenlrile/presentations/getty2007/oai.ppt>
OAI home page <http://www.openarchives.org>
OAI-PMH <http://www.openarchives.org/pmh/>
OAI-ORE <http://www.openarchives.org/ore/>
Editor's Notes
“Archives” is meant in the broadest sense, not in the way the archival community uses the term.