The Open Archives Initiative


          Michael L. Nelson

 Computer Science, Old Dominion University
          www.cs.odu.edu/~mln/




         www.openarchives.org




          The Open Archives Initiative
  DRIADE Workshop, Durham NC, May 16-17, 2007
               Michael L. Nelson
Open Archives Initiative
            Protocol for Metadata Harvesting

• data providers / repositories:
   o   “A repository is a network accessible server that can
       process the 6 OAI-PMH requests in the manner
       described in [the OAI-PMH document].   A repository is
       managed by a data provider to expose metadata to
       harvesters.” 
• service providers / harvesters:
   o   “A harvester is a client application that issues OAI-PMH
       requests.  A harvester is operated by a service provider
       as a means of collecting metadata from repositories.”




Data Providers / Service Providers




data providers                                                   service providers
(repositories)                                                   (harvesters)

Overview of OAI-PMH Verbs


  Group                 Verb                   Function
  repository metadata   Identify               description of repository
                        ListMetadataFormats    metadata formats supported by repo
                        ListSets               sets defined by repository
  harvesting verbs      ListIdentifiers        OAI unique ids contained in repo
                        ListRecords            listing of N records
                        GetRecord              listing of a single record

  most verbs take arguments: dates, sets, ids, metadata formats,
  and resumption token (for flow control)
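
A minimal sketch of how a harvester might form requests for the six verbs above. The baseURL `http://repo.example.org/oai`, the set name `physics`, and the helper `build_request` are assumptions made for illustration; the argument names (`metadataPrefix`, `from`, `set`) are the ones OAI-PMH defines.

```python
from urllib.parse import urlencode

# Hypothetical baseURL, used only for illustration.
BASE_URL = "http://repo.example.org/oai"

# The six OAI-PMH verbs, grouped as on the slide.
REPOSITORY_VERBS = ("Identify", "ListMetadataFormats", "ListSets")
HARVESTING_VERBS = ("ListIdentifiers", "ListRecords", "GetRecord")


def build_request(verb, **arguments):
    """Return the HTTP GET URL for one OAI-PMH request."""
    if verb not in REPOSITORY_VERBS + HARVESTING_VERBS:
        raise ValueError(f"unknown OAI-PMH verb: {verb}")
    params = {"verb": verb}
    for name, value in arguments.items():
        # 'from' is a Python keyword, so callers pass it as 'from_'.
        params[name.rstrip("_")] = value
    return f"{BASE_URL}?{urlencode(params)}"


identify_url = build_request("Identify")
records_url = build_request(
    "ListRecords", metadataPrefix="oai_dc", from_="2004-09-15", set="physics"
)
print(identify_url)
print(records_url)
```

A real harvester would also pass `resumptionToken` on follow-up requests when a list response is delivered in several chunks.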
OAI-PMH data model



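[Figure: OAI-PMH data model, three levels]

• resource
• item: identified by an OAI-PMH identifier, the entry point to all records pertaining to the resource; items can be grouped into OAI-PMH sets
• records: metadata pertaining to the resource (e.g., Dublin Core metadata, MARCXML metadata), each identified by OAI-PMH identifier, metadataPrefix, and datestamp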
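
The three levels of the data model can be sketched as plain data structures. This is an illustration only: the class names, the identifier `oai:example.org:14`, the set name, and the prefix `marcxml` are assumptions, not values taken from the slides.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Record:
    """Metadata about a resource, expressed in one metadata format."""
    identifier: str        # OAI-PMH identifier of the item
    metadata_prefix: str   # names the format, e.g. "oai_dc"
    datestamp: str         # when this record was created or last changed
    metadata: str          # serialized XML payload (elided here)


@dataclass
class Item:
    """Entry point to all records pertaining to one resource."""
    identifier: str
    set_specs: list = field(default_factory=list)
    records: dict = field(default_factory=dict)

    def add_record(self, metadata_prefix, datestamp, metadata):
        record = Record(self.identifier, metadata_prefix, datestamp, metadata)
        # One record per metadata format for a given item.
        self.records[metadata_prefix] = record
        return record


item = Item("oai:example.org:14", set_specs=["physics"])
item.add_record("oai_dc", "2004-09-15", "<oai_dc:dc>...</oai_dc:dc>")
item.add_record("marcxml", "2004-09-20", "<record>...</record>")
print(sorted(item.records))
```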
Complexity Comes to OAI-PMH…

•   First noticed in how people would populate their Dublin Core records
     o   people need the HTML splash page
     o   crawlers need the PDF file
•   Ad-hoc conventions and methods used to expose the repository’s
    knowledge about the structure of the object
•   Next three slides taken from “Resource Harvesting Within the
    OAI-PMH Framework”
     o   http://www.dlib.org/dlib/december04/vandesompel/12vandesompel.html




Dublin Core Encoding Type 1

<oai_dc:dc>
   <dc:title>A Simple Parallel-Plate Resonator Technique for Microwave.
       Characterization of Thin Resistive Films</dc:title>
   <dc:creator>Vorobiev, A.</dc:creator>
   <dc:subject>ING-INF/01 Elettronica</dc:subject>
   <dc:description>A parallel-plate resonator method is proposed for
       non-destructive characterisation of resistive films used in
       microwave integrated circuits. A slot made in one ... </dc:description>
   <dc:publisher>Microwave engineering Europe</dc:publisher>
   <dc:date>2002</dc:date>
   <dc:type>Documento relativo ad una Conferenza o altro Evento</dc:type>
   <dc:type>PeerReviewed</dc:type>
   <dc:identifier>http://amsacta.cib.unibo.it/archive/00000014/</dc:identifier>
   <dc:format>pdf
     http://amsacta.cib.unibo.it/archive/00000014/01/GaAs_1_Vorobiev.pdf
   </dc:format>
</oai_dc:dc>




Annotations: dc:identifier = splash page; dc:format = locator of resource

Dublin Core Encoding Type 2

…
    <dc:identifier>http://amsacta.cib.unibo.it/archive/00000014/</dc:identifier>
    <dc:relation>
      http://amsacta.cib.unibo.it/archive/00000014/01/GaAs_1_Vorobiev.pdf
    </dc:relation>
…



Annotations: dc:identifier = splash page; dc:relation = locator of resource




Dublin Core Encoding Type 3

…
    <dc:identifier> http://amsacta.cib.unibo.it/archive/00000014/</dc:identifier>
     <dc:relation>
        http://resolver.unibo.it/00000014/
        </dc:relation>
     <dc:relation>
        http://amsacta.cib.unibo.it/archive/00000014/01/GaAs_1_Vorobiev.pdf
        </dc:relation>
…

Annotations: the dc:identifier and the resolver dc:relation lead to splash pages; the PDF dc:relation is the locator of resource
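
The three encodings force a harvester to guess which URL is the resource itself. A minimal sketch of that guesswork follows, using the Type 3 record with namespace declarations added so it parses. The function names and the "last path segment contains a dot" heuristic are assumptions made for illustration; no convention in the slides defines them.

```python
import xml.etree.ElementTree as ET

DC = "{http://purl.org/dc/elements/1.1/}"

# The Type 3 record, with namespace declarations added.
SAMPLE = """\
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:identifier>http://amsacta.cib.unibo.it/archive/00000014/</dc:identifier>
  <dc:relation>
    http://resolver.unibo.it/00000014/
  </dc:relation>
  <dc:relation>
    http://amsacta.cib.unibo.it/archive/00000014/01/GaAs_1_Vorobiev.pdf
  </dc:relation>
</oai_dc:dc>
"""


def candidate_urls(record_xml):
    """Collect every http(s) URL found in identifier, format, or relation."""
    root = ET.fromstring(record_xml)
    found = []
    for tag in ("identifier", "format", "relation"):
        for element in root.iter(DC + tag):
            # Type 1 records put a format name and a URL in one element,
            # so look at each whitespace-separated token.
            for token in (element.text or "").split():
                if token.startswith(("http://", "https://")):
                    found.append((tag, token))
    return found


def guess_resource_locators(record_xml):
    """Heuristic (an assumption): a URL whose last path segment has a dot is a file."""
    return [url for _, url in candidate_urls(record_xml)
            if "." in url.rsplit("/", 1)[-1]]


print(candidate_urls(SAMPLE))
print(guess_resource_locators(SAMPLE))
```

The heuristic is fragile by design: it shows why ad-hoc conventions in Dublin Core are a poor substitute for an explicit description of the object's structure.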




OAI Object Re-Use and Exchange


•   Develop, identify, and profile extensible standards and protocols to
    allow repositories, agents, and services to interoperate in the
    context of use and reuse of compound digital objects beyond the
    boundaries of the holding repositories.


•   Aim for more effective and consistent ways:
     o   to facilitate discovery of these objects,
     o   to reference (link to) these objects (and parts thereof),
     o   to obtain a variety of disseminations of these objects,
     o   to aggregate and disaggregate these objects,
     o   to enable processing by automated agents.




The Structure of Compound Objects is Obfuscated
            When Mapped to the Web




Useful for humans and useful for applications is often different




                HTTP LINK HEADER




Through the Resource Map, the Web application sees the compound object




This approach reveals compound objects in the Web graph




OAI: It's Not Just for
      Metadata Harvesting Anymore…


   OAI-PMH                  OAI-ORE
   Repository structure     Object structure
   Metadata centric         Resource centric
   Metadata harvesting      Object re-use (obtain, harvest, register)

   OAI-PMH and OAI-ORE are complementary:
      o   you can do one without the other
      o   you can do them together




OAI-ORE : Current Status
•   Ongoing definition of the ORE framework
     o   Reach joint problem statement
     o   Issues regarding identification
     o   Model for ORE resource
     o   Publishing ORE resources to the Web
     o   Discovering ORE resources


•   Review of appropriate technologies for ORE Model and Resource
    Map
     o   ATOM
     o   DID/DIDL, IMS/CP, METS, Ramlet
     o   RDF, RDF/XML
     o   Dublin Core Abstract Model
     o   …
OAI-ORE : Current Status

•   Explore demonstrators using these concepts in preparation for the May
    2007 ORE Technical Committee meeting


•   Post May 2007 meeting:
     o   Hopefully work towards alpha specs for ORE resource, Resource Map,
         discovery of ORE resource
     o   Experimentation with alpha specs




My research group’s approach to
 OAI/Preservation integration…




Preservation: Fortress Model



     Five Easy Steps for Preservation:
1.     Get a lot of $
2.     Buy a lot of disks, machines, tapes,
       etc.
3.     Hire an army of staff
4.     Load a small amount of data
5.     “Look upon my archive ye Mighty, and
       despair!”


                                                                image from: http://www.itunisie.com/tourisme/excursion/tabarka/images/fort.jpg


Alternate Models
                          of Preservation

• Lazy Preservation
   o   Let Google, IA et al. preserve your website
• Just-In-Time Preservation
   o   Wait for it to disappear first, then locate a “good enough”
       version
• Shared Infrastructure Preservation
   o   Push your content to sites that might preserve it
• Web Server Enhanced Preservation
   o   Use Apache modules to create archival-ready resources

                                                                 image from: http://www.proex.ufes.br/arsm/knots_interlaced.htm


Web Site Preservation: 2 Problems


Guess the bean count,
     win the jar




The counting problem
   How many pages are on that site?
   To save it you have to find it

The representation problem
   What’s that page all about?
   Future use requires understanding




OAI-PMH Data Model


[Figure: OAI-PMH data model with several metadata formats]

• resource
• item: OAI-PMH identifier = entry point to all records pertaining to the resource
• records: metadata pertaining to the resource; each format is a modeled representation of the resource
   o   Dublin Core metadata: simple model
   o   MPEG-21 DIDL: complex model
   o   METS: complex model
   o   MARCXML metadata: more expressive model




mod_oai implementation
 Integrate OAI-PMH functionality into the web server itself…
         1.       Use mod_oai
                   -    an Apache 2.0 module
                   -    automatically answers OAI-PMH requests for an http server
                   -    written in C
                   -    respects values in .htaccess, httpd.conf
         2.       Install mod_oai on http://www.foo.edu/
         3.       Define baseURL: http://www.foo.edu/modoai

 Result: web harvesting with OAI-PMH semantics (e.g., from, until, sets)

http://www.foo.edu/modoai?verb=ListRecords&metadataPrefix=oai_didl&from=2004-09-15&set=mime:video:mpeg



Read as: using OAI-PMH, from site foo, give me all resources, and their preservation metadata, dating from 9/15/2004 through today, that are MIME type video-MPEG.
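
A minimal sketch that takes the example request apart, to show which part of the URL carries which piece of the plain-language reading. The `MEANING` table is an illustration written for this sketch, not part of mod_oai.

```python
from urllib.parse import urlsplit, parse_qs

url = ("http://www.foo.edu/modoai?verb=ListRecords&metadataPrefix=oai_didl"
       "&from=2004-09-15&set=mime:video:mpeg")

parts = urlsplit(url)
base_url = f"{parts.scheme}://{parts.netloc}{parts.path}"

# parse_qs returns a list per key; each argument appears once here,
# so keep the first value.
args = {key: values[0] for key, values in parse_qs(parts.query).items()}

# Plain-language reading of each argument (illustrative wording).
MEANING = {
    "verb": "give me all resources",
    "metadataPrefix": "and their preservation metadata",
    "from": "dating from this day through today",
    "set": "restricted to this MIME type",
}

print("baseURL:", base_url)
for name, value in args.items():
    print(f"{name}={value}: {MEANING[name]}")
```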




Addressing the Counting Problem: ListIdentifiers


                                                    CRAWLER:
                                                    •   issues a ListIdentifiers,
                                                    •   finds URLs of updated
                                                        resources
                                                    •   does HTTP GET on the updated
                                                        resources only
                                                    •   can get URLs of resources
                                                        with specified MIME
                                                        types
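
The crawler steps above can be sketched against a canned ListIdentifiers response, so the example runs without a network. The response content (URLs, datestamps, the token `token-0001`) is made up, and the use of resource URLs as identifiers and of `mime:...` set names follows the request example earlier in the talk; treat both as assumptions about mod_oai's output.

```python
import xml.etree.ElementTree as ET

OAI = "{http://www.openarchives.org/OAI/2.0/}"

# Canned response for illustration; all values are invented.
RESPONSE = """\
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <responseDate>2007-05-16T12:00:00Z</responseDate>
  <request verb="ListIdentifiers">http://www.foo.edu/modoai</request>
  <ListIdentifiers>
    <header>
      <identifier>http://www.foo.edu/index.html</identifier>
      <datestamp>2007-05-01T10:00:00Z</datestamp>
      <setSpec>mime:text:html</setSpec>
    </header>
    <header>
      <identifier>http://www.foo.edu/talk.mpg</identifier>
      <datestamp>2007-05-10T09:30:00Z</datestamp>
      <setSpec>mime:video:mpeg</setSpec>
    </header>
    <resumptionToken>token-0001</resumptionToken>
  </ListIdentifiers>
</OAI-PMH>
"""


def parse_list_identifiers(xml_text):
    """Return (headers, resumption_token) from a ListIdentifiers response."""
    root = ET.fromstring(xml_text)
    headers = []
    for header in root.iter(OAI + "header"):
        headers.append({
            "identifier": header.findtext(OAI + "identifier"),
            "datestamp": header.findtext(OAI + "datestamp"),
            "sets": [s.text for s in header.findall(OAI + "setSpec")],
        })
    token = root.findtext(f"{OAI}ListIdentifiers/{OAI}resumptionToken")
    # An empty or missing token means the list is complete.
    return headers, (token or None)


headers, token = parse_list_identifiers(RESPONSE)
# Keep only the resources in the MIME-type set of interest.
to_fetch = [h["identifier"] for h in headers if "mime:video:mpeg" in h["sets"]]
print(to_fetch, token)
```

A crawler would then issue an HTTP GET for each URL in `to_fetch` and, while `token` is not `None`, repeat the request with `resumptionToken` to get the next chunk.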




Addressing the Representation Problem: ListRecords in DIDL Format



CRAWLER:
• Makes a ListRecords
  query,
• Gets updates as MPEG-21
  DIDL records (HTTP
  headers, resource By
  Value or By Reference)
• can get resources with
  specified MIME types




CRATE: Preservation Metadata at Dissemination Time



•   Harnesses web server to support preservation
•   Moves preservation metadata from “strict validation at ingest”
    to “best-effort description at dissemination”

[Figure: plug-in configuration with columns Plug-in Name and Executable path]
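
A minimal sketch of "best-effort description at dissemination": each plug-in describes the resource as it is served, and a failing plug-in does not block the response. Python functions stand in for the executables named in the slide's configuration; the plug-in names and the `describe` helper are hypothetical.

```python
import hashlib


def md5_plugin(path, data):
    """Checksum of the bytes being disseminated."""
    return hashlib.md5(data).hexdigest()


def size_plugin(path, data):
    """Size of the resource in bytes, as a string."""
    return str(len(data))


def broken_plugin(path, data):
    """Stand-in for a tool that is missing or fails on this resource."""
    raise RuntimeError("tool not installed")


# Plug-in name -> callable (in place of an executable path).
PLUGINS = {"md5": md5_plugin, "size": size_plugin, "broken": broken_plugin}


def describe(path, data, plugins=PLUGINS):
    """Run every plug-in; record failures instead of rejecting the resource."""
    metadata = {}
    for name, plugin in plugins.items():
        try:
            metadata[name] = plugin(path, data)
        except Exception as error:
            # Best effort: note that this description is unavailable and go on.
            metadata[name] = f"unavailable ({error})"
    return metadata


crate = describe("talk.mpg", b"example bytes")
for name, value in crate.items():
    print(f"{name}: {value}")
```

Contrast this with strict validation at ingest, where the failure of one tool would keep the resource out of the archive.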




Validation is Subjective
                                   Preservation metadata is like a David Hockney photo collage:
                                              each image is both true and incomplete,
                                   and while the result is not faithful, it does capture the “essence”




images from: http://facweb.cs.depaul.edu/sgrais/collage.htm


More Related Content

What's hot (20)

Webometrics
WebometricsWebometrics
Webometrics
 
Open Archives Initiatives For Metadata Harvesting
Open Archives Initiatives For Metadata   HarvestingOpen Archives Initiatives For Metadata   Harvesting
Open Archives Initiatives For Metadata Harvesting
 
Sigmaplot 13 PPT
Sigmaplot 13 PPTSigmaplot 13 PPT
Sigmaplot 13 PPT
 
Electronic Resource Management
Electronic Resource ManagementElectronic Resource Management
Electronic Resource Management
 
Presentation On INSDOC
Presentation On INSDOCPresentation On INSDOC
Presentation On INSDOC
 
INSPEC
INSPECINSPEC
INSPEC
 
Digital library software
Digital library softwareDigital library software
Digital library software
 
BIBLIOMETRICS LAWS
BIBLIOMETRICS LAWSBIBLIOMETRICS LAWS
BIBLIOMETRICS LAWS
 
All you want to know about ISSN-ISBN
All you want to know about ISSN-ISBNAll you want to know about ISSN-ISBN
All you want to know about ISSN-ISBN
 
Altmetrics
Altmetrics Altmetrics
Altmetrics
 
Institutional repository
Institutional repositoryInstitutional repository
Institutional repository
 
Multimedia application in libraries gaurav boudh
Multimedia application in libraries gaurav boudhMultimedia application in libraries gaurav boudh
Multimedia application in libraries gaurav boudh
 
Bibliometrics and its application
Bibliometrics and its applicationBibliometrics and its application
Bibliometrics and its application
 
citation analysis
citation analysiscitation analysis
citation analysis
 
What do we know about the h index?
What do we know about the h index?What do we know about the h index?
What do we know about the h index?
 
Reference management tools
Reference management toolsReference management tools
Reference management tools
 
Scopus
ScopusScopus
Scopus
 
Abstract and i ndexing
Abstract and i ndexingAbstract and i ndexing
Abstract and i ndexing
 
ALTMETRICS
ALTMETRICSALTMETRICS
ALTMETRICS
 
Webometrics
WebometricsWebometrics
Webometrics
 

Viewers also liked

Can’t Find Your 404s?
Can’t Find Your 404s?Can’t Find Your 404s?
Can’t Find Your 404s?Michael Nelson
 
Memento: TimeGates, TimeBundles, and TimeMaps
Memento: TimeGates, TimeBundles, and TimeMapsMemento: TimeGates, TimeBundles, and TimeMaps
Memento: TimeGates, TimeBundles, and TimeMapsMichael Nelson
 
Memento: Time Travel for the Web
Memento: Time Travel for the WebMemento: Time Travel for the Web
Memento: Time Travel for the WebMichael Nelson
 
Memento: Time Travel for the Web
Memento: Time Travel for the WebMemento: Time Travel for the Web
Memento: Time Travel for the WebMichael Nelson
 
Memento: Time Travel for the Web
Memento: Time Travel for the WebMemento: Time Travel for the Web
Memento: Time Travel for the WebMichael Nelson
 
Using timed-release cryptography to mitigate the preservation risk of embargo...
Using timed-release cryptography to mitigate the preservation risk of embargo...Using timed-release cryptography to mitigate the preservation risk of embargo...
Using timed-release cryptography to mitigate the preservation risk of embargo...Michael Nelson
 
Synchronicity: Just-In-Time Discovery of Lost Web Pages
Synchronicity: Just-In-Time Discovery of Lost Web PagesSynchronicity: Just-In-Time Discovery of Lost Web Pages
Synchronicity: Just-In-Time Discovery of Lost Web PagesMichael Nelson
 
(Re-) Discovering Lost Web Pages
(Re-) Discovering Lost Web Pages(Re-) Discovering Lost Web Pages
(Re-) Discovering Lost Web PagesMichael Nelson
 
My Point of View: Michael L. Nelson Web Archiving Cooperative
My Point of View: Michael L. Nelson  Web Archiving CooperativeMy Point of View: Michael L. Nelson  Web Archiving Cooperative
My Point of View: Michael L. Nelson Web Archiving CooperativeMichael Nelson
 
Music Video Redundancy and Half-Life in YouTube
Music Video Redundancy and Half-Life in YouTubeMusic Video Redundancy and Half-Life in YouTube
Music Video Redundancy and Half-Life in YouTubeMichael Nelson
 
A Research Agenda for "Obsolete Data or Resources"
A Research Agenda for "Obsolete Data or Resources"A Research Agenda for "Obsolete Data or Resources"
A Research Agenda for "Obsolete Data or Resources"Michael Nelson
 
(Re-)Discovering Lost Web Pages
(Re-)Discovering Lost Web Pages(Re-)Discovering Lost Web Pages
(Re-)Discovering Lost Web PagesMichael Nelson
 
Tools for A Preservation Ready Web
Tools for A Preservation Ready WebTools for A Preservation Ready Web
Tools for A Preservation Ready WebMichael Nelson
 
Review of Web Archiving
Review of Web ArchivingReview of Web Archiving
Review of Web ArchivingMichael Nelson
 
OAI-ORE: The Open Archives Initiative Object Reuse and Exchange Project
OAI-ORE:  The Open Archives Initiative  Object Reuse and Exchange ProjectOAI-ORE:  The Open Archives Initiative  Object Reuse and Exchange Project
OAI-ORE: The Open Archives Initiative Object Reuse and Exchange ProjectMichael Nelson
 
Why Care About the Past?
Why Care About the Past?Why Care About the Past?
Why Care About the Past?Michael Nelson
 

Viewers also liked (16)

Can’t Find Your 404s?
Can’t Find Your 404s?Can’t Find Your 404s?
Can’t Find Your 404s?
 
Memento: TimeGates, TimeBundles, and TimeMaps
Memento: TimeGates, TimeBundles, and TimeMapsMemento: TimeGates, TimeBundles, and TimeMaps
Memento: TimeGates, TimeBundles, and TimeMaps
 
Memento: Time Travel for the Web
Memento: Time Travel for the WebMemento: Time Travel for the Web
Memento: Time Travel for the Web
 
Memento: Time Travel for the Web
Memento: Time Travel for the WebMemento: Time Travel for the Web
Memento: Time Travel for the Web
 
Memento: Time Travel for the Web
Memento: Time Travel for the WebMemento: Time Travel for the Web
Memento: Time Travel for the Web
 
Using timed-release cryptography to mitigate the preservation risk of embargo...
Using timed-release cryptography to mitigate the preservation risk of embargo...Using timed-release cryptography to mitigate the preservation risk of embargo...
Using timed-release cryptography to mitigate the preservation risk of embargo...
 
Synchronicity: Just-In-Time Discovery of Lost Web Pages
Synchronicity: Just-In-Time Discovery of Lost Web PagesSynchronicity: Just-In-Time Discovery of Lost Web Pages
Synchronicity: Just-In-Time Discovery of Lost Web Pages
 
(Re-) Discovering Lost Web Pages
(Re-) Discovering Lost Web Pages(Re-) Discovering Lost Web Pages
(Re-) Discovering Lost Web Pages
 
My Point of View: Michael L. Nelson Web Archiving Cooperative
My Point of View: Michael L. Nelson  Web Archiving CooperativeMy Point of View: Michael L. Nelson  Web Archiving Cooperative
My Point of View: Michael L. Nelson Web Archiving Cooperative
 
Music Video Redundancy and Half-Life in YouTube
Music Video Redundancy and Half-Life in YouTubeMusic Video Redundancy and Half-Life in YouTube
Music Video Redundancy and Half-Life in YouTube
 
A Research Agenda for "Obsolete Data or Resources"
A Research Agenda for "Obsolete Data or Resources"A Research Agenda for "Obsolete Data or Resources"
A Research Agenda for "Obsolete Data or Resources"
 
(Re-)Discovering Lost Web Pages
(Re-)Discovering Lost Web Pages(Re-)Discovering Lost Web Pages
(Re-)Discovering Lost Web Pages
 
Tools for A Preservation Ready Web
Tools for A Preservation Ready WebTools for A Preservation Ready Web
Tools for A Preservation Ready Web
 
Review of Web Archiving
Review of Web ArchivingReview of Web Archiving
Review of Web Archiving
 
OAI-ORE: The Open Archives Initiative Object Reuse and Exchange Project
OAI-ORE:  The Open Archives Initiative  Object Reuse and Exchange ProjectOAI-ORE:  The Open Archives Initiative  Object Reuse and Exchange Project
OAI-ORE: The Open Archives Initiative Object Reuse and Exchange Project
 
Why Care About the Past?
Why Care About the Past?Why Care About the Past?
Why Care About the Past?
 

Similar to The Open Archives Initiative

Using Architectures for Semantic Interoperability to Create Journal Clubs for...
Using Architectures for Semantic Interoperability to Create Journal Clubs for...Using Architectures for Semantic Interoperability to Create Journal Clubs for...
Using Architectures for Semantic Interoperability to Create Journal Clubs for...James Powell
 
Open Science Days 2014 - Becker - Repositories and Linked Data
Open Science Days 2014 - Becker - Repositories and Linked DataOpen Science Days 2014 - Becker - Repositories and Linked Data
Open Science Days 2014 - Becker - Repositories and Linked DataPascal-Nicolas Becker
 
Charleston 2012 - The Future of Serials in a Linked Data World
Charleston 2012 - The Future of Serials in a Linked Data WorldCharleston 2012 - The Future of Serials in a Linked Data World
Charleston 2012 - The Future of Serials in a Linked Data WorldProQuest
 
from local/regional OER Silos towards an OER Global Dataspace
from local/regional OER Silos towards an OER Global Dataspacefrom local/regional OER Silos towards an OER Global Dataspace
from local/regional OER Silos towards an OER Global DataspaceOpen Education Consortium
 
Emblematica overview dlf
Emblematica overview dlfEmblematica overview dlf
Emblematica overview dlfjjett2
 
RDFC2012 Open Access to Research Data
RDFC2012 Open Access to Research DataRDFC2012 Open Access to Research Data
RDFC2012 Open Access to Research DataGudmundur Thorisson
 
Evolving the Web into a Giant Global Database
Evolving the Web into a Giant Global DatabaseEvolving the Web into a Giant Global Database
Evolving the Web into a Giant Global DatabaseMarko Rodriguez
 
Harmony project - JISC Synthesis meeting 2001
Harmony project - JISC Synthesis meeting 2001Harmony project - JISC Synthesis meeting 2001
Harmony project - JISC Synthesis meeting 2001Dan Brickley
 
NANO266 - Lecture 12 - High-throughput computational materials design
NANO266 - Lecture 12 - High-throughput computational materials designNANO266 - Lecture 12 - High-throughput computational materials design
NANO266 - Lecture 12 - High-throughput computational materials designUniversity of California, San Diego
 
Illuminating DSpace's Linked Data Support
Illuminating DSpace's Linked Data SupportIlluminating DSpace's Linked Data Support
Illuminating DSpace's Linked Data SupportPascal-Nicolas Becker
 
Spring Data NHJUG April 2012
Spring Data NHJUG April 2012Spring Data NHJUG April 2012
Spring Data NHJUG April 2012trisberg
 
Metadata for researchers
Metadata for researchers Metadata for researchers
Metadata for researchers Getaneh Alemu
 
SWIB14 Weaving repository contents into the Semantic Web
SWIB14 Weaving repository contents into the Semantic WebSWIB14 Weaving repository contents into the Semantic Web
SWIB14 Weaving repository contents into the Semantic WebPascal-Nicolas Becker
 
Using Dublin Core for DISCOVER: a New Zealand visual art and music resource f...
Using Dublin Core for DISCOVER: a New Zealand visual art and music resource f...Using Dublin Core for DISCOVER: a New Zealand visual art and music resource f...
Using Dublin Core for DISCOVER: a New Zealand visual art and music resource f...Karen R
 

Similar to The Open Archives Initiative (20)

Using Architectures for Semantic Interoperability to Create Journal Clubs for...
Using Architectures for Semantic Interoperability to Create Journal Clubs for...Using Architectures for Semantic Interoperability to Create Journal Clubs for...
Using Architectures for Semantic Interoperability to Create Journal Clubs for...
 
Open Science Days 2014 - Becker - Repositories and Linked Data
Open Science Days 2014 - Becker - Repositories and Linked DataOpen Science Days 2014 - Becker - Repositories and Linked Data
Open Science Days 2014 - Becker - Repositories and Linked Data
 
NISO Forum, Denver, September 24, 2012: ResourceSync: Web-Based Resource Sync...
NISO Forum, Denver, September 24, 2012: ResourceSync: Web-Based Resource Sync...NISO Forum, Denver, September 24, 2012: ResourceSync: Web-Based Resource Sync...
NISO Forum, Denver, September 24, 2012: ResourceSync: Web-Based Resource Sync...
 
Charleston 2012 - The Future of Serials in a Linked Data World
Charleston 2012 - The Future of Serials in a Linked Data WorldCharleston 2012 - The Future of Serials in a Linked Data World
Charleston 2012 - The Future of Serials in a Linked Data World
 
from local/regional OER Silos towards an OER Global Dataspace
from local/regional OER Silos towards an OER Global Dataspacefrom local/regional OER Silos towards an OER Global Dataspace
from local/regional OER Silos towards an OER Global Dataspace
 
Ontology development
Ontology developmentOntology development
Ontology development
 
Emblematica overview dlf
Emblematica overview dlfEmblematica overview dlf
Emblematica overview dlf
 
RDFC2012 Open Access to Research Data
RDFC2012 Open Access to Research DataRDFC2012 Open Access to Research Data
RDFC2012 Open Access to Research Data
 
Evolving the Web into a Giant Global Database
Evolving the Web into a Giant Global DatabaseEvolving the Web into a Giant Global Database
Evolving the Web into a Giant Global Database
 
Harmony project - JISC Synthesis meeting 2001
Harmony project - JISC Synthesis meeting 2001Harmony project - JISC Synthesis meeting 2001
Harmony project - JISC Synthesis meeting 2001
 
OpenAIRE schirrwagen
OpenAIRE schirrwagenOpenAIRE schirrwagen
OpenAIRE schirrwagen
 
NANO266 - Lecture 12 - High-throughput computational materials design
NANO266 - Lecture 12 - High-throughput computational materials designNANO266 - Lecture 12 - High-throughput computational materials design
NANO266 - Lecture 12 - High-throughput computational materials design
 
Illuminating DSpace's Linked Data Support
Illuminating DSpace's Linked Data SupportIlluminating DSpace's Linked Data Support
Illuminating DSpace's Linked Data Support
 
Spring Data NHJUG April 2012
Spring Data NHJUG April 2012Spring Data NHJUG April 2012
Spring Data NHJUG April 2012
 
Semantic web
Semantic web Semantic web
Semantic web
 
Metadata for researchers
Metadata for researchers Metadata for researchers
Metadata for researchers
 
SWIB14 Weaving repository contents into the Semantic Web
SWIB14 Weaving repository contents into the Semantic WebSWIB14 Weaving repository contents into the Semantic Web
SWIB14 Weaving repository contents into the Semantic Web
 
20110728 datalift-rpi-troy
20110728 datalift-rpi-troy20110728 datalift-rpi-troy
20110728 datalift-rpi-troy
 
Using Dublin Core for DISCOVER: a New Zealand visual art and music resource f...
Using Dublin Core for DISCOVER: a New Zealand visual art and music resource f...Using Dublin Core for DISCOVER: a New Zealand visual art and music resource f...
Using Dublin Core for DISCOVER: a New Zealand visual art and music resource f...
 
Aggregation as tactic sm new
Aggregation as tactic sm newAggregation as tactic sm new
Aggregation as tactic sm new
 

The Open Archives Initiative

  • 1. The Open Archives Initiative Michael L. Nelson Computer Science, Old Dominion University www.cs.odu.edu/~mln/ www.openarchives.org The Open Archives Initiative DRIADE Workshop, Durham NC, May 16-17, 2007 Michael L. Nelson
  • 2. Open Archives Initiative Protocol for Metadata Harvesting
      • data providers / repositories:
         o “A repository is a network accessible server that can process the 6 OAI-PMH requests in the manner described in [the OAI-PMH document]. A repository is managed by a data provider to expose metadata to harvesters.”
      • service providers / harvesters:
         o “A harvester is a client application that issues OAI-PMH requests. A harvester is operated by a service provider as a means of collecting metadata from repositories.”
  • 3. Data Providers / Service Providers [figure: data providers (repositories) and service providers (harvesters)]
  • 4. Overview of OAI-PMH Verbs
      Group                Verb                  Function
      repository metadata  Identify              description of repository
      repository metadata  ListMetadataFormats   metadata formats supported by repo
      repository metadata  ListSets              sets defined by repository
      harvesting verbs     ListIdentifiers       OAI unique ids contained in repo
      harvesting verbs     ListRecords           listing of N records
      harvesting verbs     GetRecord             listing of a single record
      Most verbs take arguments: dates, sets, ids, metadata formats, and resumption token (for flow control).
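
The resumption token mentioned on this slide is what lets a harvester pull a long list in pieces. The following is a minimal sketch of that loop, not a full harvester: `harvest`, `fetch_page`, and `fake_repository` are hypothetical names, and the in-memory repository stands in for a real HTTP call so the example runs offline. It relies on the OAI-PMH rule that `resumptionToken` is an exclusive argument, sent only together with the verb.

```python
from typing import Callable, Dict, Iterator, List, Optional, Tuple

# One page of results: (identifiers, resumption token).
# A missing or empty token means the list is complete.
Page = Tuple[List[str], Optional[str]]


def harvest(fetch_page: Callable[[Dict[str, str]], Page],
            verb: str = "ListIdentifiers",
            **arguments: str) -> Iterator[str]:
    """Yield every identifier, following resumption tokens until none is left."""
    params = {"verb": verb, **arguments}
    while True:
        identifiers, token = fetch_page(params)
        yield from identifiers
        if not token:
            break
        # Follow-up requests carry only the verb and the token.
        params = {"verb": verb, "resumptionToken": token}


def fake_repository(params: Dict[str, str]) -> Page:
    """Hypothetical stand-in for a repository: serves two identifiers per page."""
    ids = [f"oai:example:{n}" for n in range(1, 6)]
    start = int(params.get("resumptionToken", "0"))
    chunk = ids[start:start + 2]
    next_start = start + 2
    return chunk, (str(next_start) if next_start < len(ids) else None)


if __name__ == "__main__":
    for identifier in harvest(fake_repository, metadataPrefix="oai_dc"):
        print(identifier)
```

In a real harvester, `fetch_page` would issue the HTTP request and parse the `resumptionToken` element out of the XML response; the token format here (a plain offset) is an assumption made for the sketch, since repositories are free to choose their own.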
  • 5. OAI-PMH data model
      • resource
      • item (OAI-PMH identifier, OAI-PMH sets): entry point to all records pertaining to the resource
      • records (OAI-PMH identifier, metadataPrefix, datestamp): metadata pertaining to the resource, e.g. Dublin Core, MARCXML
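
One way to read this data model is as two small types: an item that is the entry point for a resource, and one record per metadata format. The sketch below is illustrative only; the class names, the `marcxml` prefix, and the sample identifier are assumptions, while `oai_dc` is the Dublin Core prefix that OAI-PMH defines.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Record:
    """Metadata about a resource in one format, stamped with a date."""
    metadata_prefix: str   # e.g. "oai_dc"
    datestamp: str         # ISO 8601 date of creation or last change
    metadata: str          # the serialized metadata itself


@dataclass
class Item:
    """Entry point to all records pertaining to one resource."""
    identifier: str
    sets: List[str] = field(default_factory=list)
    records: Dict[str, Record] = field(default_factory=dict)

    def add_record(self, record: Record) -> None:
        # At most one record per metadata format for a given item.
        self.records[record.metadata_prefix] = record

    def get_record(self, metadata_prefix: str) -> Record:
        # A KeyError here corresponds to the protocol's
        # cannotDisseminateFormat error condition.
        return self.records[metadata_prefix]


if __name__ == "__main__":
    item = Item("oai:example.org:14", sets=["physics"])
    item.add_record(Record("oai_dc", "2002-05-01", "<oai_dc:dc>...</oai_dc:dc>"))
    item.add_record(Record("marcxml", "2002-05-01", "<record>...</record>"))
    print(sorted(item.records))
```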
  • 6. Complexity Comes to OAI-PMH…
      • First noticed in how people would populate their Dublin Core records
         o people need the HTML splash page
         o crawlers need the PDF file
      • Ad-hoc conventions and methods used to expose the repository’s knowledge about the structure of the object
      • Next three slides taken from “Resource Harvesting Within the OAI-PMH Framework”
         o http://www.dlib.org/dlib/december04/vandesompel/12vandesompel.html
  • 7. Dublin Core Encoding Type 1
      <oai_dc:dc>
        <dc:title>A Simple Parallel-Plate Resonator Technique for Microwave. Characterization of Thin Resistive Films</dc:title>
        <dc:creator>Vorobiev, A.</dc:creator>
        <dc:subject>ING-INF/01 Elettronica</dc:subject>
        <dc:description>A parallel-plate resonator method is proposed for non-destructive characterisation of resistive films used in microwave integrated circuits. A slot made in one ... </dc:description>
        <dc:publisher>Microwave engineering Europe</dc:publisher>
        <dc:date>2002</dc:date>
        <dc:type>Documento relativo ad una Conferenza o altro Evento</dc:type>
        <dc:type>PeerReviewed</dc:type>
        <dc:identifier>http://amsacta.cib.unibo.it/archive/00000014/</dc:identifier>
        <dc:format>pdf http://amsacta.cib.unibo.it/archive/00000014/01/GaAs_1_Vorobiev.pdf </dc:format>
      </oai_dc:dc>
      (slide annotations: splash page; locator of resource)
  • 8. Dublin Core Encoding Type 2
      …
      <dc:identifier>http://amsacta.cib.unibo.it/archive/00000014/</dc:identifier>
      <dc:relation> http://amsacta.cib.unibo.it/archive/00000014/01/GaAs_1_Vorobiev.pdf </dc:relation>
      …
      (slide annotations: splash page; locator of resource)
  • 9. Dublin Core Encoding Type 3
      …
      <dc:identifier> http://amsacta.cib.unibo.it/archive/00000014/</dc:identifier>
      <dc:relation> http://resolver.unibo.it/00000014/ </dc:relation>
      <dc:relation> http://amsacta.cib.unibo.it/archive/00000014/01/GaAs_1_Vorobiev.pdf </dc:relation>
      …
      (slide annotations: splash page; locator of resource; splash page)
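
The three encodings above leave a harvester guessing which URL is the resource itself. The sketch below shows the kind of heuristic a crawler might fall back on, assuming it scans `dc:identifier`, `dc:format`, and `dc:relation` for URLs and treats anything ending in a known file extension as the resource. The function names and the extension list are hypothetical; the record is the Type 3 fragment wrapped in the namespace declarations it needs to parse.

```python
import xml.etree.ElementTree as ET

DC = "{http://purl.org/dc/elements/1.1/}"

# The Type 3 fragment from the slide, with namespace declarations added.
RECORD = """<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                       xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:identifier> http://amsacta.cib.unibo.it/archive/00000014/</dc:identifier>
  <dc:relation> http://resolver.unibo.it/00000014/ </dc:relation>
  <dc:relation> http://amsacta.cib.unibo.it/archive/00000014/01/GaAs_1_Vorobiev.pdf </dc:relation>
</oai_dc:dc>"""


def candidate_urls(xml_text):
    """Collect (element name, URL) pairs from the fields repositories tend to use."""
    root = ET.fromstring(xml_text)
    urls = []
    for tag in ("identifier", "format", "relation"):
        for element in root.iter(DC + tag):
            # Type 1 packs "pdf <url>" into dc:format, so split on whitespace.
            for token in (element.text or "").split():
                if token.startswith(("http://", "https://")):
                    urls.append((tag, token))
    return urls


def guess_resource(urls, extensions=(".pdf", ".ps", ".doc")):
    """Heuristic only: a URL ending in a file extension is probably the resource."""
    return [url for _, url in urls if url.lower().endswith(extensions)]


if __name__ == "__main__":
    found = candidate_urls(RECORD)
    print(found)
    print(guess_resource(found))
```

Nothing in the record itself says which URL plays which role, so the guess rests entirely on the file extension; that fragility is the problem these slides describe.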
  • 10. OAI Object Re-Use and Exchange
      • Develop, identify, and profile extensible standards and protocols to allow repositories, agents, and services to interoperate in the context of use and reuse of compound digital objects beyond the boundaries of the holding repositories.
      • Aim for more effective and consistent ways:
         o to facilitate discovery of these objects,
         o to reference (link to) these objects (and parts thereof),
         o to obtain a variety of disseminations of these objects,
         o to aggregate and disaggregate these objects,
         o to enable processing by automated agents
  • 11. The Structure of Compound Objects is Obfuscated When Mapped to the Web
  • 12. What is useful for humans and what is useful for applications are often different [figure label: HTTP LINK HEADER]
  • 13. Through the Resource Map, the Web application sees the compound object
  • 14. This approach reveals compound objects in the Web graph
  • 15. OAI: It’s Not Just for Metadata Harvesting Anymore…
      OAI-PMH               OAI-ORE
      Repository structure  Object structure
      Metadata centric      Resource centric
      Metadata harvesting   Object re-use (obtain, harvest, register)
      OAI-PMH and OAI-ORE are complementary;
         o you can do one without the other
         o you can do them together
  • 16. OAI-ORE: Current Status
      • Ongoing definition of the ORE framework
         o Reach joint problem statement
         o Issues regarding identification
         o Model for ORE resource
         o Publishing ORE resources to the Web
         o Discovering ORE resources
      • Review of appropriate technologies for ORE Model and Resource Map
         o ATOM
         o DID/DIDL, IMS/CP, METS, Ramlet
         o RDF, RDF/XML
         o Dublin Core Abstract Model
         o …
  • 17. OAI-ORE: Current Status
      • Explore demonstrators using these concepts in preparation for the May 2007 ORE Technical Committee meeting
      • Post May 2007 meeting:
         o Hopefully work towards alpha specs for ORE resource, Resource Map, discovery of ORE resource
         o Experimentation with alpha specs
  • 18. My research group’s approach to OAI/Preservation integration…
  • 19. Preservation: Fortress Model
      Five Easy Steps for Preservation:
      1. Get a lot of $
      2. Buy a lot of disks, machines, tapes, etc.
      3. Hire an army of staff
      4. Load a small amount of data
      5. “Look upon my archive ye Mighty, and despair!”
      image from: http://www.itunisie.com/tourisme/excursion/tabarka/images/fort.jpg
  • 20. Alternate Models of Preservation
      • Lazy Preservation
         o Let Google, IA et al. preserve your website
      • Just-In-Time Preservation
         o Wait for it to disappear first, then a “good enough” version
      • Shared Infrastructure Preservation
         o Push your content to sites that might preserve it
      • Web Server Enhanced Preservation
         o Use Apache modules to create archival-ready resources
      image from: http://www.proex.ufes.br/arsm/knots_interlaced.htm
  • 21. Web Site Preservation: 2 Problems
      • The counting problem: How many pages are on that site? To save it you have to find it. [image caption: Guess the bean count, win the jar]
      • The representation problem: What’s that page all about? Future use requires understanding.
  • 22. OAI-PMH Data Model
      • resource
      • item (OAI-PMH identifier) = entry point to all records pertaining to the resource
      • records: metadata pertaining to the resource (Dublin Core, MARCXML) or a modeled representation of the resource (METS, MPEG-21 DIDL), ranging from simple to complex to more expressive models
  • 23. mod_oai implementation
      Integrate OAI-PMH functionality into the web server itself…
      1. Use mod_oai
         - an Apache 2.0 module
         - automatically answers OAI-PMH requests for an http server
         - written in C
         - respects values in .htaccess, httpd.conf
      2. Install mod_oai on http://www.foo.edu/
         Define baseURL: http://www.foo.edu/modoai
      3. Result: web harvesting with OAI-PMH semantics (e.g., from, until, sets)
         http://www.foo.edu/modoai?verb=ListRecords&metadataPrefix=oai_didl&from=2004-09-15&set=mime:video:mpeg
         Reading the request: using OAI-PMH, give me all resources from site foo, and their preservation metadata, dating from 9/15/2004 through today, that are MIME type video-MPEG.
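
The request on this slide is an ordinary HTTP GET whose query string carries the OAI-PMH verb and arguments. As a small sketch, it can be assembled with Python's standard library; `build_request` is a hypothetical helper, and the arguments are passed as a dict because `from` is a reserved word in Python.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_request(base_url, verb, arguments):
    """Return an OAI-PMH request URL for the given verb and arguments."""
    params = {"verb": verb, **arguments}
    # safe=":" keeps the set name readable (mime:video:mpeg); percent-encoding
    # the colons would be equally valid.
    return f"{base_url}?{urlencode(params, safe=':')}"


url = build_request(
    "http://www.foo.edu/modoai",
    "ListRecords",
    {"metadataPrefix": "oai_didl", "from": "2004-09-15", "set": "mime:video:mpeg"},
)

if __name__ == "__main__":
    print(url)
    # Decoding the query string recovers the verb and arguments.
    print(parse_qs(urlsplit(url).query))
```

Leaving out `until`, as the slide does, asks for everything from the `from` date onward.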
  • 24. Addressing the Counting Problem: ListIdentifiers
      CRAWLER:
      • issues a ListIdentifiers,
      • finds URLs of updated resources
      • does HTTP GET updates only
      • can get URLs of resources with specified MIME types
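
The crawler steps above can be sketched as: parse the headers in a ListIdentifiers response, compare their datestamps with what was fetched last time, and GET only what changed. The response below is a hand-written, simplified sample (it omits the `responseDate` and `request` elements), the URLs are made up, and `parse_headers` and `urls_to_fetch` are hypothetical helpers. The sketch assumes each identifier is the URL of the resource, as the slide implies.

```python
import xml.etree.ElementTree as ET

OAI = "{http://www.openarchives.org/OAI/2.0/}"

# Simplified, hand-written ListIdentifiers response with made-up URLs.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListIdentifiers>
    <header>
      <identifier>http://www.foo.edu/a.html</identifier>
      <datestamp>2004-09-20</datestamp>
    </header>
    <header>
      <identifier>http://www.foo.edu/b.mpg</identifier>
      <datestamp>2004-10-01</datestamp>
      <setSpec>mime:video:mpeg</setSpec>
    </header>
  </ListIdentifiers>
</OAI-PMH>"""


def parse_headers(xml_text):
    """Map each identifier in the response to its datestamp."""
    root = ET.fromstring(xml_text)
    return {
        header.findtext(OAI + "identifier"): header.findtext(OAI + "datestamp")
        for header in root.iter(OAI + "header")
    }


def urls_to_fetch(remote, local):
    """Return URLs that are new or have a later datestamp than the local copy."""
    # ISO 8601 dates of the same granularity sort correctly as plain strings.
    return sorted(url for url, stamp in remote.items() if local.get(url, "") < stamp)


if __name__ == "__main__":
    already_have = {"http://www.foo.edu/a.html": "2004-09-20"}
    print(urls_to_fetch(parse_headers(SAMPLE), already_have))
```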
  • 25. Addressing the Representation Problem: ListRecords in DIDL Format
      CRAWLER:
      • makes a ListRecords query,
      • gets updates as MPEG-21 DIDL records (HTTP headers, resource By Value or By Reference)
      • can get resources with specified MIME types
  • 26. CRATE: Preservation Metadata at Dissemination Time
      • Harnesses web server to support preservation
      • Moves preservation metadata from “strict validation at ingest” to “best-effort description at dissemination”
      [figure: table with columns Plug-in Name, Executable path]
  • 27. Validation is Subjective
      Preservation metadata is like a David Hockney photo collage: each image is both true and incomplete, and while the result is not faithful, it does capture the “essence”
      images from: http://facweb.cs.depaul.edu/sgrais/collage.htm