SlideShare a Scribd company logo
ResourceSync:
                    Web-Based
                      Resource
                 Synchronization
                     Herbert Van de Sompel
                           Los Alamos National Laboratory
                                              @hvdsomp


                            ResourceSync is funded by
                          The Sloan Foundation & JISC


         ResourceSync – Herbert Van de Sompel
TICER Summer School, August 22 2012, Tilburg, The Netherlands
ResourceSync Core Team – NISO & OAI


Cornell University & OAI:
   Berhard Haslhofer, Carl Lagoze, Simeon Warner

Old Dominion University & OAI:
   Michael L. Nelson

Los Alamos National Laboratory & OAI:
   Martin Klein, Robert Sanderson, Herbert Van de Sompel

NISO:
   Todd Carpenter, Nettie Lagace, Peter Murray


                             ResourceSync – Herbert Van de Sompel
                    TICER Summer School, August 22 2012, Tilburg, The Netherlands
ResourceSync Technical Group


•  Manuel Bernhardt, Delving B.V.
•  Kevin Ford, Library of Congress
•  Richard Jones, JISC
•  Graham Klyne, JISC
•  Stuart Lewis, JISC
•  David Rosenthal, LOCKSS
•  Christian Sadilek, Red Hat
•  Shlomo Sanders, Ex Libris, Inc.
•  Sjoerd Siebinga, Delving B.V.
•  Ed Summers, Library of Congress
•  Jeff Young, OCLC Online Computer Library Center


                             ResourceSync – Herbert Van de Sompel
                    TICER Summer School, August 22 2012, Tilburg, The Netherlands
ResourceSync


ResourceSync: What & Why?

Problem Perspective & Conceptual Approach

Technical Details

Q&A




                              ResourceSync – Herbert Van de Sompel
                     TICER Summer School, August 22 2012, Tilburg, The Netherlands
ResourceSync


ResourceSync: What & Why?

Problem Perspective & Conceptual Approach

Technical Details

Q&A




                              ResourceSync – Herbert Van de Sompel
                     TICER Summer School, August 22 2012, Tilburg, The Netherlands
Synchronize What?

•  Web resources – things with a URI that can be dereferenced and
   are cache-able (no dependency on underlying OS, technologies
   etc.)

•  Small websites/repositories (a few resources) to large
   repositories/datasets/linked data collections (many millions of
   resources)

•  That change slowly (weeks/months) or quickly (seconds), and
   where latency needs may vary

•  Focus on needs of research communication and cultural heritage
   organizations, but aim for generality


                                   ResourceSync – Herbert Van de Sompel
                          TICER Summer School, August 22 2012, Tilburg, The Netherlands
Why?

… because lots of projects and services are doing synchronization
but have to resort to ad-hoc, case by case, approaches!

•  Project team involved with projects that need this

•  Experience with OAI-PMH: widely used in repos but
    o  XML metadata only

    o  Attempts at synchronizing actual content via OAI-PMH

       (complex object formats, dc:identifier) not successful.
    o  Web technology has moved on since 1999




•  Devise a shared solution for data, metadata, linked data?


                                  ResourceSync – Herbert Van de Sompel
                         TICER Summer School, August 22 2012, Tilburg, The Netherlands
Use Cases – The Basics




              ResourceSync – Herbert Van de Sompel
     TICER Summer School, August 22 2012, Tilburg, The Netherlands
Use Cases - More




           ResourceSync – Herbert Van de Sompel
  TICER Summer School, August 22 2012, Tilburg, The Netherlands
Out Of Scope (For Now)

•  Bidirectional synchronization

•  Destination-defined selective synchronization (query)

•  Bulk URI migration




                                  ResourceSync – Herbert Van de Sompel
                         TICER Summer School, August 22 2012, Tilburg, The Netherlands
Use Case: arXiv Mirroring

•  1M article versions, ~800/day created or
   updated at 8 PM US Eastern Time

•  Metadata and full-text for each article

•  Accuracy important

•  Want low barrier for others to use

•  Look for more general solution than current
   homebrew mirroring (running with minor
   modifications since 1994!) and occasional rsync
   (filesystem layout specific, auth issues)

                                   ResourceSync – Herbert Van de Sompel
                          TICER Summer School, August 22 2012, Tilburg, The Netherlands
Use Case: DBpedia Live Duplication

•  Average of 2 updates per second
•  Want low latency => need a push technology




                                ResourceSync – Herbert Van de Sompel
                       TICER Summer School, August 22 2012, Tilburg, The Netherlands
ResourceSync


ResourceSync: What & Why?

Problem Perspective & Conceptual Approach

Technical Details

Q&A




                              ResourceSync – Herbert Van de Sompel
                     TICER Summer School, August 22 2012, Tilburg, The Netherlands
ResourceSync Problem


•  Consideration:
    •  Source (server) A has resources that change over time: they
       get created, modified, deleted
    •  Destination (servers) X, Y, and Z leverage (some) resources
       of Source A.
•  Problem:
    •  Destinations want to keep in step with the resource changes
       at Source A: resource synchronization.
•  Goal:
    •  Design an approach for resource synchronization aligned
       with the Web Architecture that has a fair chance of adoption
       by different communities.
        •  The approach must scale better than recurrent HTTP
           HEAD/GET on resources.


                                ResourceSync – Herbert Van de Sompel
                       TICER Summer School, August 22 2012, Tilburg, The Netherlands
Destination: 3 Basic Synchronization Needs

1.  Baseline synchronization – A destination must be able to
    perform an initial load or catch-up with a source
       -  avoid out-of-band setup

2.  Incremental synchronization – A destination must have some
    way to keep up-to-date with changes at a source
       -  subject to some latency; minimal: create/update/delete
       -  allow to catch-up after destination has been offline

3.  Audit – A destination should be able to determine whether it is
    synchronized with a source
       -  subject to some latency



                                  ResourceSync – Herbert Van de Sompel
                         TICER Summer School, August 22 2012, Tilburg, The Netherlands
Source Capability 1: Describing Content

In order to advertise the resources that a source wants destinations
to know about, it may describe them:

    o    Publish an inventory of resource URIs and possibly
         associated metadata
         -  Destination GETs the Content Description
         -  Destination GETs listed resources by their URI




                                    ResourceSync – Herbert Van de Sompel
                           TICER Summer School, August 22 2012, Tilburg, The Netherlands
Source Capability 2: Communicating Change Events

In order to achieve lower latency, a source may communicate about
changes to its resources:

   o     2.1. Change Set: Publish a list of recent change events
         (create, update, delete resource)
        -  Destination acts upon change events, e.g. GETs created/
           updated resources, removes deleted resources.




                                  ResourceSync – Herbert Van de Sompel
                         TICER Summer School, August 22 2012, Tilburg, The Netherlands
Source Capability 2: Communicating Change Events

In order to achieve lower latency, a source may communicate about
changes to its resources:

   o     2.1. Change Set: Publish a list of recent change events
         (create, update, delete resource)
        -  Destination acts upon change events, e.g. GETs created/
            updated resources, removes deleted resources.

   o     2.2. Push Change Set: Push a list of recent change events
         (create, update, delete resource) towards (a) destination(s)
        -  Destination acts upon change events, e.g. GETs created/
            updated resources, removes deleted resources.




                                   ResourceSync – Herbert Van de Sompel
                          TICER Summer School, August 22 2012, Tilburg, The Netherlands
Source Capability 3: Providing Access to Versions

In order to allow a destination to catch up with missed changes, a
source may support:

   o    3.1. Historical Change Sets: Provide access to change events that
        occurred prior to the ones listed in the current Change Set




                                    ResourceSync – Herbert Van de Sompel
                           TICER Summer School, August 22 2012, Tilburg, The Netherlands
Source Capability 3: Providing Access to Versions

In order to allow a destination to catch up with missed changes, a
source may support:

   o    3.1. Historical Change Sets: Provide access to change events that
        occurred prior to the ones listed in the current Change Set

   o    3.2. Historical Content: Provide access to prior resource versions




                                     ResourceSync – Herbert Van de Sompel
                            TICER Summer School, August 22 2012, Tilburg, The Netherlands
Source Capability 4: Transferring Content

By default, content is transferred in response to a GET issued by a
destination against a URI of a source’s resource. But a source may
support additional mechanisms:

   o     4.1. Dump: Publish a package of resource representations
         and necessary metadata
        -  Destination GETs the Dump
        -  Destination unpacks the Dump




                                  ResourceSync – Herbert Van de Sompel
                         TICER Summer School, August 22 2012, Tilburg, The Netherlands
Source Capability 4: Transferring Content

By default, content is transferred in response to a GET issued by a
destination against a URI of a source’s resource. But a source may
support additional mechanisms:

   o     4.1. Dump: Publish a package of resource representations
         and necessary metadata
        -  Destination GETs the Dump
        -  Destination unpacks the Dump

   o    4.2. Alternate Content Transfer: Support alternative
        mechanisms to optimize getting content, e.g. content via a
        mirror site, only changes not the entire changed resource.



                                  ResourceSync – Herbert Van de Sompel
                         TICER Summer School, August 22 2012, Tilburg, The Netherlands
Source: Advertise Capabilities

A source needs to advertise the capabilities it supports to allow a
destination to discover them

•     Some capabilities may be provided by a third party, not the
      source itself
     o   e.g. Historical Change Sets, Historical Content
     o   But the source should still make those third party capabilities
         discoverable - trust




                                    ResourceSync – Herbert Van de Sompel
                           TICER Summer School, August 22 2012, Tilburg, The Netherlands
ResourceSync


ResourceSync: What & Why?

Problem Perspective & Conceptual Approach

Technical Details

Q&A




                              ResourceSync – Herbert Van de Sompel
                     TICER Summer School, August 22 2012, Tilburg, The Netherlands
So Many Choices

                                                   Push
  DSNotify
                OAI-PMH                                                   Pull
     rsync
                   Crawl
                                         OAI-ORE
        RDFsync
                                                           WebDAV Col. Syn.
                                 XMPP
 Atom                                                SWORD                    AtomPub
             Sitemap             RSS

SPARQLpush                                                       PubSubHubbub
                  SDShare                  XMPP

                                  ResourceSync – Herbert Van de Sompel
                         TICER Summer School, August 22 2012, Tilburg, The Netherlands
ResourceSync – Herbert Van de Sompel
TICER Summer School, August 22 2012, Tilburg, The Netherlands
ResourceSync – Herbert Van de Sompel
TICER Summer School, August 22 2012, Tilburg, The Netherlands
A Framework Based on Sitemaps

•  Modular framework allowing selective deployment

•  Sitemap is the core component throughout the
   framework

   o    Introduce extension elements and attributes:
          -  In ResourceSync namespace (rs:) to
             accommodate synchronization needs
          -  In XHTML namespace (xhtml:) mainly to
             accommodate discovery needs
   o    Reuse Sitemap format for Change Sets (both
        current and historical) and for manifest in Dump

                                ResourceSync – Herbert Van de Sompel
                       TICER Summer School, August 22 2012, Tilburg, The Netherlands
Source Capabilities – Destination Needs




                     ResourceSync – Herbert Van de Sompel
            TICER Summer School, August 22 2012, Tilburg, The Netherlands
Source Capabilities – Destination Needs




                     ResourceSync – Herbert Van de Sompel
            TICER Summer School, August 22 2012, Tilburg, The Netherlands
Sitemap with Added Datetime




                ResourceSync – Herbert Van de Sompel
       TICER Summer School, August 22 2012, Tilburg, The Netherlands
Change Types: Extend lastmod, Use expires
                                        !




                       ResourceSync – Herbert Van de Sompel
              TICER Summer School, August 22 2012, Tilburg, The Netherlands
Sitemap with lastmod and expires
                               !




                   ResourceSync – Herbert Van de Sompel
          TICER Summer School, August 22 2012, Tilburg, The Netherlands
Sitemap Discovery via robots.txt
                               !




                  ResourceSync – Herbert Van de Sompel
         TICER Summer School, August 22 2012, Tilburg, The Netherlands
Source Capabilities – Destination Needs




                     ResourceSync – Herbert Van de Sompel
            TICER Summer School, August 22 2012, Tilburg, The Netherlands
Change Set: An rs Typed Sitemap




                  ResourceSync – Herbert Van de Sompel
         TICER Summer School, August 22 2012, Tilburg, The Netherlands
More rs Extension Elements




               ResourceSync – Herbert Van de Sompel
      TICER Summer School, August 22 2012, Tilburg, The Netherlands
Change Set with rs and xhtml Extensions




                      ResourceSync – Herbert Van de Sompel
             TICER Summer School, August 22 2012, Tilburg, The Netherlands
Change Set Discovery via Sitemap




                  ResourceSync – Herbert Van de Sompel
         TICER Summer School, August 22 2012, Tilburg, The Netherlands
Pushing Change Sets via XMPP PubSub

XMPP Publish-Subscribe: Client to Subscription Service,
  Subscription Service to Client(s) communication

•  One of the XMPP (Extensible Messaging and Presence Protocol)
   extensions http://xmpp.org/extensions/xep-0060.html
•  Apple Notifications based on XMPP PubSub
•  Available tools, see http://xmpp.org/about-xmpp/
   technology-overview/pubsub/#impl-client
    o  XMPP Servers with PubSub support:

        -  ejabberd , OpenFire , Tigase , SleekXMPP
    o  XMPP libraries with PubSub support:

        -  Strophe (C, JavaScript), XMPP4R (Ruby), SleekXMPP
           (Python), PubSub Client (Python)



                                 ResourceSync – Herbert Van de Sompel
                        TICER Summer School, August 22 2012, Tilburg, The Netherlands
Pushing Change Sets via XMPP PubSub




                    ResourceSync – Herbert Van de Sompel
           TICER Summer School, August 22 2012, Tilburg, The Netherlands
Change Set via XMPP




            ResourceSync – Herbert Van de Sompel
   TICER Summer School, August 22 2012, Tilburg, The Netherlands
Push Change Set Discovery via Sitemap




                     ResourceSync – Herbert Van de Sompel
            TICER Summer School, August 22 2012, Tilburg, The Netherlands
Source Capabilities – Destination Needs




                     ResourceSync – Herbert Van de Sompel
            TICER Summer School, August 22 2012, Tilburg, The Netherlands
Discovering a Historical Change Set via a Current Change Set




                                ResourceSync – Herbert Van de Sompel
                       TICER Summer School, August 22 2012, Tilburg, The Netherlands
Source Capabilities – Destination Needs




                     ResourceSync – Herbert Van de Sompel
            TICER Summer School, August 22 2012, Tilburg, The Netherlands
Discovering Historical Content – Link to Version Resource




                              ResourceSync – Herbert Van de Sompel
                     TICER Summer School, August 22 2012, Tilburg, The Netherlands
Memento Intermezzo




 http://www.mementoweb.org/

            ResourceSync – Herbert Van de Sompel
   TICER Summer School, August 22 2012, Tilburg, The Netherlands
Original Resources and Mementos




                  ResourceSync – Herbert Van de Sompel
         TICER Summer School, August 22 2012, Tilburg, The Netherlands
Bridge from Present to Past




               ResourceSync – Herbert Van de Sompel
      TICER Summer School, August 22 2012, Tilburg, The Netherlands
Bridge from Past to Present




               ResourceSync – Herbert Van de Sompel
      TICER Summer School, August 22 2012, Tilburg, The Netherlands
Memento Framework




           ResourceSync – Herbert Van de Sompel
  TICER Summer School, August 22 2012, Tilburg, The Netherlands
Discovering Historical Content – Link to Memento TimeGate




                              ResourceSync – Herbert Van de Sompel
                     TICER Summer School, August 22 2012, Tilburg, The Netherlands
Source Capabilities – Destination Needs




                     ResourceSync – Herbert Van de Sompel
            TICER Summer School, August 22 2012, Tilburg, The Netherlands
Dump

•  Two formats currently under discussion:

   o    Format based on ZIP:
         -  Package content
         -  Add manifest (manifest.xml) expressed in
            Sitemap format
         -  ZIP it up

   o    WARC files as used by the web archiving
        community



                               ResourceSync – Herbert Van de Sompel
                      TICER Summer School, August 22 2012, Tilburg, The Netherlands
Mapping URI to File Path with rs:path
                                    !




                     ResourceSync – Herbert Van de Sompel
            TICER Summer School, August 22 2012, Tilburg, The Netherlands
Manifest (manifest.xml) Expressed in Sitemap Format




                            ResourceSync – Herbert Van de Sompel
                   TICER Summer School, August 22 2012, Tilburg, The Netherlands
Dump Discovery via Sitemap




               ResourceSync – Herbert Van de Sompel
      TICER Summer School, August 22 2012, Tilburg, The Netherlands
Source Capabilities – Destination Needs




                     ResourceSync – Herbert Van de Sompel
            TICER Summer School, August 22 2012, Tilburg, The Netherlands
Alternate Location




           ResourceSync – Herbert Van de Sompel
  TICER Summer School, August 22 2012, Tilburg, The Netherlands
Alternate Protocol, e.g. Obtain Changes Only




                        ResourceSync – Herbert Van de Sompel
               TICER Summer School, August 22 2012, Tilburg, The Netherlands
Timeline
•  August 2012
    o  First draft spec shared for feedback with ResourceSync team




•  September 2012
    o  In-person meeting of ResourceSync Team

    o  Revise spec, conduct experiments

    o  Solicit broad feedback

    o  Paper in D-Lib Magazine




•  December 2012 – Finalize specification (?)




                                 ResourceSync – Herbert Van de Sompel
                        TICER Summer School, August 22 2012, Tilburg, The Netherlands
Pointers
•  First draft spec:
   http://www.openarchives.org/rs/0.1/resourcesync!

•  Simulator code on github
   http://github.org/resync/simulator!

•  NISO workspace
   http://www.niso.org/workrooms/resourcesync/!
   !
•  List for public comment coming soon




                           ResourceSync – Herbert Van de Sompel
                  TICER Summer School, August 22 2012, Tilburg, The Netherlands
ResourceSync:
                    Web-Based
                      Resource
                 Synchronization
                     Herbert Van de Sompel
                           Los Alamos National Laboratory
                                              @hvdsomp


                            ResourceSync is funded by
                          The Sloan Foundation & JISC


         ResourceSync – Herbert Van de Sompel
TICER Summer School, August 22 2012, Tilburg, The Netherlands

More Related Content

What's hot

ResourceSync: Conceptual and Technical Problem Perspective
ResourceSync: Conceptual and Technical Problem PerspectiveResourceSync: Conceptual and Technical Problem Perspective
ResourceSync: Conceptual and Technical Problem Perspective
Herbert Van de Sompel
 
NISO Forum, Denver, September 24, 2012: ResourceSync: Web-Based Resource Sync...
NISO Forum, Denver, September 24, 2012: ResourceSync: Web-Based Resource Sync...NISO Forum, Denver, September 24, 2012: ResourceSync: Web-Based Resource Sync...
NISO Forum, Denver, September 24, 2012: ResourceSync: Web-Based Resource Sync...
National Information Standards Organization (NISO)
 
Hadoop ecosystem for health/life sciences
Hadoop ecosystem for health/life sciencesHadoop ecosystem for health/life sciences
Hadoop ecosystem for health/life sciences
Uri Laserson
 
Maintaining scholarly standards in the digital age: Publishing historical gaz...
Maintaining scholarly standards in the digital age: Publishing historical gaz...Maintaining scholarly standards in the digital age: Publishing historical gaz...
Maintaining scholarly standards in the digital age: Publishing historical gaz...
Humphrey Southall
 
Druid Scaling Realtime Analytics
Druid Scaling Realtime AnalyticsDruid Scaling Realtime Analytics
Druid Scaling Realtime Analytics
Aaron Brooks
 
Tcloud Computing Hadoop Family and Ecosystem Service 2013.Q2
Tcloud Computing Hadoop Family and Ecosystem Service 2013.Q2Tcloud Computing Hadoop Family and Ecosystem Service 2013.Q2
Tcloud Computing Hadoop Family and Ecosystem Service 2013.Q2
tcloudcomputing-tw
 
Keynote: Global Media Monitoring - M. Grobelnik - ESWC SS 2014
Keynote: Global Media Monitoring - M. Grobelnik - ESWC SS 2014Keynote: Global Media Monitoring - M. Grobelnik - ESWC SS 2014
Keynote: Global Media Monitoring - M. Grobelnik - ESWC SS 2014
eswcsummerschool
 
Scaling up Linked Data
Scaling up Linked DataScaling up Linked Data
Scaling up Linked Data
EUCLID project
 
Tcloud Computing Hadoop Family and Ecosystem Service 2013.Q3
Tcloud Computing Hadoop Family and Ecosystem Service 2013.Q3Tcloud Computing Hadoop Family and Ecosystem Service 2013.Q3
Tcloud Computing Hadoop Family and Ecosystem Service 2013.Q3
tcloudcomputing-tw
 
20131205 hadoop-hdfs-map reduce-introduction
20131205 hadoop-hdfs-map reduce-introduction20131205 hadoop-hdfs-map reduce-introduction
20131205 hadoop-hdfs-map reduce-introduction
Xuan-Chao Huang
 
Large Scale Data With Hadoop
Large Scale Data With HadoopLarge Scale Data With Hadoop
Large Scale Data With Hadoopguest27e6764
 
Hadoop Family and Ecosystem
Hadoop Family and EcosystemHadoop Family and Ecosystem
Hadoop Family and Ecosystem
tcloudcomputing-tw
 
Lessons learnt from the Murchison Widefield Array Data Archive
Lessons learnt from the Murchison Widefield Array Data ArchiveLessons learnt from the Murchison Widefield Array Data Archive
Lessons learnt from the Murchison Widefield Array Data Archive
Chen Wu
 
List of Engineering Colleges in Uttarakhand
List of Engineering Colleges in UttarakhandList of Engineering Colleges in Uttarakhand
List of Engineering Colleges in Uttarakhand
Roorkee College of Engineering, Roorkee
 
Using the Data Cube vocabulary for Publishing Environmental Linked Data on la...
Using the Data Cube vocabulary for Publishing Environmental Linked Data on la...Using the Data Cube vocabulary for Publishing Environmental Linked Data on la...
Using the Data Cube vocabulary for Publishing Environmental Linked Data on la...
Laurent Lefort
 
Introduction to Hadoop and MapReduce
Introduction to Hadoop and MapReduceIntroduction to Hadoop and MapReduce
Introduction to Hadoop and MapReduce
eakasit_dpu
 
Hadoop tools with Examples
Hadoop tools with ExamplesHadoop tools with Examples
Hadoop tools with Examples
Joe McTee
 
Visualizing Digital Collections of Web Archives from Columbia Web Archiving C...
Visualizing Digital Collections of Web Archives from Columbia Web Archiving C...Visualizing Digital Collections of Web Archives from Columbia Web Archiving C...
Visualizing Digital Collections of Web Archives from Columbia Web Archiving C...Mat Kelly
 
End-to-End : Open Access Process Review and Improvements
End-to-End: Open Access Process Review and ImprovementsEnd-to-End: Open Access Process Review and Improvements
End-to-End : Open Access Process Review and Improvements
Repository Fringe
 

What's hot (20)

ResourceSync
ResourceSyncResourceSync
ResourceSync
 
ResourceSync: Conceptual and Technical Problem Perspective
ResourceSync: Conceptual and Technical Problem PerspectiveResourceSync: Conceptual and Technical Problem Perspective
ResourceSync: Conceptual and Technical Problem Perspective
 
NISO Forum, Denver, September 24, 2012: ResourceSync: Web-Based Resource Sync...
NISO Forum, Denver, September 24, 2012: ResourceSync: Web-Based Resource Sync...NISO Forum, Denver, September 24, 2012: ResourceSync: Web-Based Resource Sync...
NISO Forum, Denver, September 24, 2012: ResourceSync: Web-Based Resource Sync...
 
Hadoop ecosystem for health/life sciences
Hadoop ecosystem for health/life sciencesHadoop ecosystem for health/life sciences
Hadoop ecosystem for health/life sciences
 
Maintaining scholarly standards in the digital age: Publishing historical gaz...
Maintaining scholarly standards in the digital age: Publishing historical gaz...Maintaining scholarly standards in the digital age: Publishing historical gaz...
Maintaining scholarly standards in the digital age: Publishing historical gaz...
 
Druid Scaling Realtime Analytics
Druid Scaling Realtime AnalyticsDruid Scaling Realtime Analytics
Druid Scaling Realtime Analytics
 
Tcloud Computing Hadoop Family and Ecosystem Service 2013.Q2
Tcloud Computing Hadoop Family and Ecosystem Service 2013.Q2Tcloud Computing Hadoop Family and Ecosystem Service 2013.Q2
Tcloud Computing Hadoop Family and Ecosystem Service 2013.Q2
 
Keynote: Global Media Monitoring - M. Grobelnik - ESWC SS 2014
Keynote: Global Media Monitoring - M. Grobelnik - ESWC SS 2014Keynote: Global Media Monitoring - M. Grobelnik - ESWC SS 2014
Keynote: Global Media Monitoring - M. Grobelnik - ESWC SS 2014
 
Scaling up Linked Data
Scaling up Linked DataScaling up Linked Data
Scaling up Linked Data
 
Tcloud Computing Hadoop Family and Ecosystem Service 2013.Q3
Tcloud Computing Hadoop Family and Ecosystem Service 2013.Q3Tcloud Computing Hadoop Family and Ecosystem Service 2013.Q3
Tcloud Computing Hadoop Family and Ecosystem Service 2013.Q3
 
20131205 hadoop-hdfs-map reduce-introduction
20131205 hadoop-hdfs-map reduce-introduction20131205 hadoop-hdfs-map reduce-introduction
20131205 hadoop-hdfs-map reduce-introduction
 
Large Scale Data With Hadoop
Large Scale Data With HadoopLarge Scale Data With Hadoop
Large Scale Data With Hadoop
 
Hadoop Family and Ecosystem
Hadoop Family and EcosystemHadoop Family and Ecosystem
Hadoop Family and Ecosystem
 
Lessons learnt from the Murchison Widefield Array Data Archive
Lessons learnt from the Murchison Widefield Array Data ArchiveLessons learnt from the Murchison Widefield Array Data Archive
Lessons learnt from the Murchison Widefield Array Data Archive
 
List of Engineering Colleges in Uttarakhand
List of Engineering Colleges in UttarakhandList of Engineering Colleges in Uttarakhand
List of Engineering Colleges in Uttarakhand
 
Using the Data Cube vocabulary for Publishing Environmental Linked Data on la...
Using the Data Cube vocabulary for Publishing Environmental Linked Data on la...Using the Data Cube vocabulary for Publishing Environmental Linked Data on la...
Using the Data Cube vocabulary for Publishing Environmental Linked Data on la...
 
Introduction to Hadoop and MapReduce
Introduction to Hadoop and MapReduceIntroduction to Hadoop and MapReduce
Introduction to Hadoop and MapReduce
 
Hadoop tools with Examples
Hadoop tools with ExamplesHadoop tools with Examples
Hadoop tools with Examples
 
Visualizing Digital Collections of Web Archives from Columbia Web Archiving C...
Visualizing Digital Collections of Web Archives from Columbia Web Archiving C...Visualizing Digital Collections of Web Archives from Columbia Web Archiving C...
Visualizing Digital Collections of Web Archives from Columbia Web Archiving C...
 
End-to-End : Open Access Process Review and Improvements
End-to-End: Open Access Process Review and ImprovementsEnd-to-End: Open Access Process Review and Improvements
End-to-End : Open Access Process Review and Improvements
 

Similar to ResourceSync: Web-Based Resource Synchronization

A new approach to aggregation
A new approach to aggregation A new approach to aggregation
A new approach to aggregation
Enno Meijers
 
A distributed network of digital heritage information - Unesco/NDL India
A distributed network of digital heritage information - Unesco/NDL IndiaA distributed network of digital heritage information - Unesco/NDL India
A distributed network of digital heritage information - Unesco/NDL India
Enno Meijers
 
ResourceSync Tutorial from Open Repositories 2013
ResourceSync Tutorial from Open Repositories 2013ResourceSync Tutorial from Open Repositories 2013
ResourceSync Tutorial from Open Repositories 2013
Simeon Warner
 
Session 1.4 a distributed network of heritage information
Session 1.4   a distributed network of heritage informationSession 1.4   a distributed network of heritage information
Session 1.4 a distributed network of heritage information
semanticsconference
 
A distributed network of digital heritage information - Semantics Amsterdam
A distributed network of digital heritage information - Semantics AmsterdamA distributed network of digital heritage information - Semantics Amsterdam
A distributed network of digital heritage information - Semantics Amsterdam
Enno Meijers
 
lodlam summit session browsable linked data
lodlam summit session browsable linked datalodlam summit session browsable linked data
lodlam summit session browsable linked data
Enno Meijers
 
CLARIAH Toogdag 2018: A distributed network of digital heritage information
CLARIAH Toogdag 2018: A distributed network of digital heritage informationCLARIAH Toogdag 2018: A distributed network of digital heritage information
CLARIAH Toogdag 2018: A distributed network of digital heritage information
Enno Meijers
 
DBpedia Archive using Memento, Triple Pattern Fragments, and HDT
DBpedia Archive using Memento, Triple Pattern Fragments, and HDTDBpedia Archive using Memento, Triple Pattern Fragments, and HDT
DBpedia Archive using Memento, Triple Pattern Fragments, and HDT
Herbert Van de Sompel
 
Sharing Big Data - Bob Jones
Sharing Big Data - Bob JonesSharing Big Data - Bob Jones
Sharing Big Data - Bob Jones
Helix Nebula The Science Cloud
 
IPRES 2014 paper presentation: significant environment information for LTDP
IPRES 2014 paper presentation: significant environment information for LTDPIPRES 2014 paper presentation: significant environment information for LTDP
IPRES 2014 paper presentation: significant environment information for LTDP
Fabio Corubolo
 
NHM Data Portal: first steps toward the Graph-of-Life
NHM Data Portal: first steps toward the Graph-of-LifeNHM Data Portal: first steps toward the Graph-of-Life
NHM Data Portal: first steps toward the Graph-of-Life
Edward Baker
 
NHM Data Portal: first steps toward the Graph-of-Life
NHM Data Portal: first steps toward the Graph-of-LifeNHM Data Portal: first steps toward the Graph-of-Life
NHM Data Portal: first steps toward the Graph-of-Life
Vince Smith
 
Illuminating DSpace's Linked Data Support
Illuminating DSpace's Linked Data SupportIlluminating DSpace's Linked Data Support
Illuminating DSpace's Linked Data Support
Pascal-Nicolas Becker
 
ResourceSync - Overview and Real-World Use Cases for Discovery, Harvesting, a...
ResourceSync - Overview and Real-World Use Cases for Discovery, Harvesting, a...ResourceSync - Overview and Real-World Use Cases for Discovery, Harvesting, a...
ResourceSync - Overview and Real-World Use Cases for Discovery, Harvesting, a...
Martin Klein
 
Resource sync overview and real-world use cases for discovery, harvesting, an...
Resource sync overview and real-world use cases for discovery, harvesting, an...Resource sync overview and real-world use cases for discovery, harvesting, an...
Resource sync overview and real-world use cases for discovery, harvesting, an...
openminted_eu
 
Annotating Scholarly Resources
Annotating Scholarly ResourcesAnnotating Scholarly Resources
Annotating Scholarly ResourcesRobert Sanderson
 
What is New in W3C land?
What is New in W3C land?What is New in W3C land?
What is New in W3C land?
Ivan Herman
 
A distributed network of digital heritage information by Enno Meijers - Europ...
A distributed network of digital heritage information by Enno Meijers - Europ...A distributed network of digital heritage information by Enno Meijers - Europ...
A distributed network of digital heritage information by Enno Meijers - Europ...
Europeana
 

Similar to ResourceSync: Web-Based Resource Synchronization (20)

A new approach to aggregation
A new approach to aggregation A new approach to aggregation
A new approach to aggregation
 
A distributed network of digital heritage information - Unesco/NDL India
A distributed network of digital heritage information - Unesco/NDL IndiaA distributed network of digital heritage information - Unesco/NDL India
A distributed network of digital heritage information - Unesco/NDL India
 
ResourceSync Tutorial from Open Repositories 2013
ResourceSync Tutorial from Open Repositories 2013ResourceSync Tutorial from Open Repositories 2013
ResourceSync Tutorial from Open Repositories 2013
 
Session 1.4 a distributed network of heritage information
Session 1.4   a distributed network of heritage informationSession 1.4   a distributed network of heritage information
Session 1.4 a distributed network of heritage information
 
A distributed network of digital heritage information - Semantics Amsterdam
A distributed network of digital heritage information - Semantics AmsterdamA distributed network of digital heritage information - Semantics Amsterdam
A distributed network of digital heritage information - Semantics Amsterdam
 
lodlam summit session browsable linked data
lodlam summit session browsable linked datalodlam summit session browsable linked data
lodlam summit session browsable linked data
 
CLARIAH Toogdag 2018: A distributed network of digital heritage information
CLARIAH Toogdag 2018: A distributed network of digital heritage informationCLARIAH Toogdag 2018: A distributed network of digital heritage information
CLARIAH Toogdag 2018: A distributed network of digital heritage information
 
DBpedia Archive using Memento, Triple Pattern Fragments, and HDT
DBpedia Archive using Memento, Triple Pattern Fragments, and HDTDBpedia Archive using Memento, Triple Pattern Fragments, and HDT
DBpedia Archive using Memento, Triple Pattern Fragments, and HDT
 
Sharing Big Data - Bob Jones
Sharing Big Data - Bob JonesSharing Big Data - Bob Jones
Sharing Big Data - Bob Jones
 
IPRES 2014 paper presentation: significant environment information for LTDP
IPRES 2014 paper presentation: significant environment information for LTDPIPRES 2014 paper presentation: significant environment information for LTDP
IPRES 2014 paper presentation: significant environment information for LTDP
 
NHM Data Portal: first steps toward the Graph-of-Life
NHM Data Portal: first steps toward the Graph-of-LifeNHM Data Portal: first steps toward the Graph-of-Life
NHM Data Portal: first steps toward the Graph-of-Life
 
NHM Data Portal: first steps toward the Graph-of-Life
NHM Data Portal: first steps toward the Graph-of-LifeNHM Data Portal: first steps toward the Graph-of-Life
NHM Data Portal: first steps toward the Graph-of-Life
 
Illuminating DSpace's Linked Data Support
Illuminating DSpace's Linked Data SupportIlluminating DSpace's Linked Data Support
Illuminating DSpace's Linked Data Support
 
ResourceSync - Overview and Real-World Use Cases for Discovery, Harvesting, a...
ResourceSync - Overview and Real-World Use Cases for Discovery, Harvesting, a...ResourceSync - Overview and Real-World Use Cases for Discovery, Harvesting, a...
ResourceSync - Overview and Real-World Use Cases for Discovery, Harvesting, a...
 
Resource sync overview and real-world use cases for discovery, harvesting, an...
Resource sync overview and real-world use cases for discovery, harvesting, an...Resource sync overview and real-world use cases for discovery, harvesting, an...
Resource sync overview and real-world use cases for discovery, harvesting, an...
 
Annotating Scholarly Resources
Annotating Scholarly ResourcesAnnotating Scholarly Resources
Annotating Scholarly Resources
 
What is New in W3C land?
What is New in W3C land?What is New in W3C land?
What is New in W3C land?
 
Open Energy Data
Open Energy DataOpen Energy Data
Open Energy Data
 
A distributed network of digital heritage information by Enno Meijers - Europ...
A distributed network of digital heritage information by Enno Meijers - Europ...A distributed network of digital heritage information by Enno Meijers - Europ...
A distributed network of digital heritage information by Enno Meijers - Europ...
 
FFL & CNYH
FFL & CNYHFFL & CNYH
FFL & CNYH
 

More from Herbert Van de Sompel

The web is rotting and what to do about it
The web is rotting and what to do about itThe web is rotting and what to do about it
The web is rotting and what to do about it
Herbert Van de Sompel
 
Researcher Pod: Scholarly Communication Using the Decentralized Web
Researcher Pod: Scholarly Communication Using the Decentralized WebResearcher Pod: Scholarly Communication Using the Decentralized Web
Researcher Pod: Scholarly Communication Using the Decentralized Web
Herbert Van de Sompel
 
Persistent Identification: Easier Said than Done
Persistent Identification: Easier Said than DonePersistent Identification: Easier Said than Done
Persistent Identification: Easier Said than Done
Herbert Van de Sompel
 
FAIR Signposting: A KISS Approach to a Burning Issue
FAIR Signposting: A KISS Approach to a Burning IssueFAIR Signposting: A KISS Approach to a Burning Issue
FAIR Signposting: A KISS Approach to a Burning Issue
Herbert Van de Sompel
 
Registration / Certification Interoperability Architecture (overlay peer-review)
Registration / Certification Interoperability Architecture (overlay peer-review)Registration / Certification Interoperability Architecture (overlay peer-review)
Registration / Certification Interoperability Architecture (overlay peer-review)
Herbert Van de Sompel
 
Collecting the organizational scholarly record
Collecting the organizational scholarly recordCollecting the organizational scholarly record
Collecting the organizational scholarly record
Herbert Van de Sompel
 
To the Rescue of Scholarly Orphans
To the Rescue of Scholarly OrphansTo the Rescue of Scholarly Orphans
To the Rescue of Scholarly Orphans
Herbert Van de Sompel
 
Almost two decades at LANL
Almost two decades at LANLAlmost two decades at LANL
Almost two decades at LANL
Herbert Van de Sompel
 
Perseverance on Persistence
Perseverance on PersistencePerseverance on Persistence
Perseverance on Persistence
Herbert Van de Sompel
 
Paul Evan Peters Lecture
Paul Evan Peters LecturePaul Evan Peters Lecture
Paul Evan Peters Lecture
Herbert Van de Sompel
 
Achieving Link Integrity for Managed Collections
Achieving Link Integrity for Managed CollectionsAchieving Link Integrity for Managed Collections
Achieving Link Integrity for Managed Collections
Herbert Van de Sompel
 
Signposting Overview (Version November 2017)
Signposting Overview (Version November 2017)Signposting Overview (Version November 2017)
Signposting Overview (Version November 2017)
Herbert Van de Sompel
 
Signposting Overview
Signposting OverviewSignposting Overview
Signposting Overview
Herbert Van de Sompel
 
PID Signposting Pattern
PID Signposting PatternPID Signposting Pattern
PID Signposting Pattern
Herbert Van de Sompel
 
Interoperability for web based scholarship
Interoperability for web based scholarshipInteroperability for web based scholarship
Interoperability for web based scholarship
Herbert Van de Sompel
 
Reminiscing about interoperability
Reminiscing about interoperabilityReminiscing about interoperability
Reminiscing about interoperability
Herbert Van de Sompel
 
Creating Pockets of Persistence
Creating Pockets of PersistenceCreating Pockets of Persistence
Creating Pockets of Persistence
Herbert Van de Sompel
 
Memento 101
Memento 101Memento 101
A Perspective on Archiving the Scholarly Record
A Perspective on Archiving the Scholarly RecordA Perspective on Archiving the Scholarly Record
A Perspective on Archiving the Scholarly Record
Herbert Van de Sompel
 
Hiberlink: Investigating Reference Rot, December 2013
Hiberlink: Investigating Reference Rot, December 2013Hiberlink: Investigating Reference Rot, December 2013
Hiberlink: Investigating Reference Rot, December 2013
Herbert Van de Sompel
 

More from Herbert Van de Sompel (20)

The web is rotting and what to do about it
The web is rotting and what to do about itThe web is rotting and what to do about it
The web is rotting and what to do about it
 
Researcher Pod: Scholarly Communication Using the Decentralized Web
Researcher Pod: Scholarly Communication Using the Decentralized WebResearcher Pod: Scholarly Communication Using the Decentralized Web
Researcher Pod: Scholarly Communication Using the Decentralized Web
 
Persistent Identification: Easier Said than Done
Persistent Identification: Easier Said than DonePersistent Identification: Easier Said than Done
Persistent Identification: Easier Said than Done
 
FAIR Signposting: A KISS Approach to a Burning Issue
FAIR Signposting: A KISS Approach to a Burning IssueFAIR Signposting: A KISS Approach to a Burning Issue
FAIR Signposting: A KISS Approach to a Burning Issue
 
Registration / Certification Interoperability Architecture (overlay peer-review)
Registration / Certification Interoperability Architecture (overlay peer-review)Registration / Certification Interoperability Architecture (overlay peer-review)
Registration / Certification Interoperability Architecture (overlay peer-review)
 
Collecting the organizational scholarly record
Collecting the organizational scholarly recordCollecting the organizational scholarly record
Collecting the organizational scholarly record
 
To the Rescue of Scholarly Orphans
To the Rescue of Scholarly OrphansTo the Rescue of Scholarly Orphans
To the Rescue of Scholarly Orphans
 
Almost two decades at LANL
Almost two decades at LANLAlmost two decades at LANL
Almost two decades at LANL
 
Perseverance on Persistence
Perseverance on PersistencePerseverance on Persistence
Perseverance on Persistence
 
Paul Evan Peters Lecture
Paul Evan Peters LecturePaul Evan Peters Lecture
Paul Evan Peters Lecture
 
Achieving Link Integrity for Managed Collections
Achieving Link Integrity for Managed CollectionsAchieving Link Integrity for Managed Collections
Achieving Link Integrity for Managed Collections
 
Signposting Overview (Version November 2017)
Signposting Overview (Version November 2017)Signposting Overview (Version November 2017)
Signposting Overview (Version November 2017)
 
Signposting Overview
Signposting OverviewSignposting Overview
Signposting Overview
 
PID Signposting Pattern
PID Signposting PatternPID Signposting Pattern
PID Signposting Pattern
 
Interoperability for web based scholarship
Interoperability for web based scholarshipInteroperability for web based scholarship
Interoperability for web based scholarship
 
Reminiscing about interoperability
Reminiscing about interoperabilityReminiscing about interoperability
Reminiscing about interoperability
 
Creating Pockets of Persistence
Creating Pockets of PersistenceCreating Pockets of Persistence
Creating Pockets of Persistence
 
Memento 101
Memento 101Memento 101
Memento 101
 
A Perspective on Archiving the Scholarly Record
A Perspective on Archiving the Scholarly RecordA Perspective on Archiving the Scholarly Record
A Perspective on Archiving the Scholarly Record
 
Hiberlink: Investigating Reference Rot, December 2013
Hiberlink: Investigating Reference Rot, December 2013Hiberlink: Investigating Reference Rot, December 2013
Hiberlink: Investigating Reference Rot, December 2013
 

Recently uploaded

GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
Guy Korland
 
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Jeffrey Haguewood
 
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
Product School
 
"Impact of front-end architecture on development cost", Viktor Turskyi
"Impact of front-end architecture on development cost", Viktor Turskyi"Impact of front-end architecture on development cost", Viktor Turskyi
"Impact of front-end architecture on development cost", Viktor Turskyi
Fwdays
 
When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...
Elena Simperl
 
Leading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdfLeading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdf
OnBoard
 
PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)
Ralf Eggert
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance
 
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualitySoftware Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Inflectra
 
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
UiPathCommunity
 
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance
 
Essentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with ParametersEssentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with Parameters
Safe Software
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
Laura Byrne
 
UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3
DianaGray10
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
Kari Kakkonen
 
ODC, Data Fabric and Architecture User Group
ODC, Data Fabric and Architecture User GroupODC, Data Fabric and Architecture User Group
ODC, Data Fabric and Architecture User Group
CatarinaPereira64715
 
JMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and GrafanaJMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and Grafana
RTTS
 
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Tobias Schneck
 
Search and Society: Reimagining Information Access for Radical Futures
Search and Society: Reimagining Information Access for Radical FuturesSearch and Society: Reimagining Information Access for Radical Futures
Search and Society: Reimagining Information Access for Radical Futures
Bhaskar Mitra
 
Accelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish CachingAccelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish Caching
Thijs Feryn
 

Recently uploaded (20)

GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
 
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
 
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
 
"Impact of front-end architecture on development cost", Viktor Turskyi
"Impact of front-end architecture on development cost", Viktor Turskyi"Impact of front-end architecture on development cost", Viktor Turskyi
"Impact of front-end architecture on development cost", Viktor Turskyi
 
When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...
 
Leading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdfLeading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdf
 
PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
 
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualitySoftware Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
 
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
 
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
 
Essentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with ParametersEssentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with Parameters
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
 
UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
 
ODC, Data Fabric and Architecture User Group
ODC, Data Fabric and Architecture User GroupODC, Data Fabric and Architecture User Group
ODC, Data Fabric and Architecture User Group
 
JMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and GrafanaJMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and Grafana
 
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
 
Search and Society: Reimagining Information Access for Radical Futures
Search and Society: Reimagining Information Access for Radical FuturesSearch and Society: Reimagining Information Access for Radical Futures
Search and Society: Reimagining Information Access for Radical Futures
 
Accelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish CachingAccelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish Caching
 

ResourceSync: Web-Based Resource Synchronization

  • 1. ResourceSync: Web-Based Resource Synchronization Herbert Van de Sompel Los Alamos National Laboratory @hvdsomp ResourceSync is funded by The Sloan Foundation & JISC ResourceSync – Herbert Van de Sompel TICER Summer School, August 22 2012, Tilburg, The Netherlands
  • 2. ResourceSync Core Team – NISO & OAI Cornell University & OAI: Berhard Haslhofer, Carl Lagoze, Simeon Warner Old Dominion University & OAI: Michael L. Nelson Los Alamos National Laboratory & OAI: Martin Klein, Robert Sanderson, Herbert Van de Sompel NISO: Todd Carpenter, Nettie Lagace, Peter Murray ResourceSync – Herbert Van de Sompel TICER Summer School, August 22 2012, Tilburg, The Netherlands
  • 3. ResourceSync Technical Group •  Manuel Bernhardt, Delving B.V. •  Kevin Ford, Library of Congress •  Richard Jones, JISC •  Graham Klyne, JISC •  Stuart Lewis, JISC •  David Rosenthal, LOCKSS •  Christian Sadilek, Red Hat •  Shlomo Sanders, Ex Libris, Inc. •  Sjoerd Siebinga, Delving B.V. •  Ed Summers, Library of Congress •  Jeff Young, OCLC Online Computer Library Center ResourceSync – Herbert Van de Sompel TICER Summer School, August 22 2012, Tilburg, The Netherlands
  • 4. ResourceSync ResourceSync: What & Why? Problem Perspective & Conceptual Approach Technical Details Q&A ResourceSync – Herbert Van de Sompel TICER Summer School, August 22 2012, Tilburg, The Netherlands
  • 5. ResourceSync ResourceSync: What & Why? Problem Perspective & Conceptual Approach Technical Details Q&A ResourceSync – Herbert Van de Sompel TICER Summer School, August 22 2012, Tilburg, The Netherlands
  • 6. Synchronize What? •  Web resources – things with a URI that can be dereferenced and are cache-able (no dependency on underlying OS, technologies etc.) •  Small websites/repositories (a few resources) to large repositories/datasets/linked data collections (many millions of resources) •  That change slowly (weeks/months) or quickly (seconds), and where latency needs may vary •  Focus on needs of research communication and cultural heritage organizations, but aim for generality ResourceSync – Herbert Van de Sompel TICER Summer School, August 22 2012, Tilburg, The Netherlands
  • 7. Why? … because lots of projects and services are doing synchronization but have to resort to ad-hoc, case by case, approaches! •  Project team involved with projects that need this •  Experience with OAI-PMH: widely used in repos but o  XML metadata only o  Attempts at synchronizing actual content via OAI-PMH (complex object formats, dc:identifier) not successful. o  Web technology has moved on since 1999 •  Devise a shared solution for data, metadata, linked data? ResourceSync – Herbert Van de Sompel TICER Summer School, August 22 2012, Tilburg, The Netherlands
  • 8. Use Cases – The Basics ResourceSync – Herbert Van de Sompel TICER Summer School, August 22 2012, Tilburg, The Netherlands
  • 9. Use Cases - More ResourceSync – Herbert Van de Sompel TICER Summer School, August 22 2012, Tilburg, The Netherlands
  • 10. Out Of Scope (For Now) •  Bidirectional synchronization •  Destination-defined selective synchronization (query) •  Bulk URI migration ResourceSync – Herbert Van de Sompel TICER Summer School, August 22 2012, Tilburg, The Netherlands
  • 11. Use Case: arXiv Mirroring •  1M article versions, ~800/day created or updated at 8 PM US Eastern Time •  Metadata and full-text for each article •  Accuracy important •  Want low barrier for others to use •  Look for more general solution than current homebrew mirroring (running with minor modifications since 1994!) and occasional rsync (filesystem layout specific, auth issues) ResourceSync – Herbert Van de Sompel TICER Summer School, August 22 2012, Tilburg, The Netherlands
  • 12. Use Case: DBpedia Live Duplication •  Average of 2 updates per second •  Want low latency => need a push technology ResourceSync – Herbert Van de Sompel TICER Summer School, August 22 2012, Tilburg, The Netherlands
  • 13. ResourceSync ResourceSync: What & Why? Problem Perspective & Conceptual Approach Technical Details Q&A ResourceSync – Herbert Van de Sompel TICER Summer School, August 22 2012, Tilburg, The Netherlands
  • 14. ResourceSync Problem •  Consideration: •  Source (server) A has resources that change over time: they get created, modified, deleted •  Destination (servers) X, Y, and Z leverage (some) resources of Source A. •  Problem: •  Destinations want to keep in step with the resource changes at Source A: resource synchronization. •  Goal: •  Design an approach for resource synchronization aligned with the Web Architecture that has a fair chance of adoption by different communities. •  The approach must scale better than recurrent HTTP HEAD/GET on resources. ResourceSync – Herbert Van de Sompel TICER Summer School, August 22 2012, Tilburg, The Netherlands
  • 15. Destination: 3 Basic Synchronization Needs 1.  Baseline synchronization – A destination must be able to perform an initial load or catch-up with a source -  avoid out-of-band setup 2.  Incremental synchronization – A destination must have some way to keep up-to-date with changes at a source -  subject to some latency; minimal: create/update/delete -  allow to catch-up after destination has been offline 3.  Audit – A destination should be able to determine whether it is synchronized with a source -  subject to some latency ResourceSync – Herbert Van de Sompel TICER Summer School, August 22 2012, Tilburg, The Netherlands
  • 16. Source Capability 1: Describing Content In order to advertise the resources that a source wants destinations to know about, it may describe them: o  Publish an inventory of resource URIs and possibly associated metadata -  Destination GETs the Content Description -  Destination GETs listed resources by their URI ResourceSync – Herbert Van de Sompel TICER Summer School, August 22 2012, Tilburg, The Netherlands
  • 17.
  • 18.
  • 19. Source Capability 2: Communicating Change Events In order to achieve lower latency, a source may communicate about changes to its resources: o  2.1. Change Set: Publish a list of recent change events (create, update, delete resource) -  Destination acts upon change events, e.g. GETs created/ updated resources, removes deleted resources. ResourceSync – Herbert Van de Sompel TICER Summer School, August 22 2012, Tilburg, The Netherlands
  • 20.
  • 21.
  • 22.
  • 23.
  • 24. Source Capability 2: Communicating Change Events In order to achieve lower latency, a source may communicate about changes to its resources: o  2.1. Change Set: Publish a list of recent change events (create, update, delete resource) -  Destination acts upon change events, e.g. GETs created/ updated resources, removes deleted resources. o  2.2. Push Change Set: Push a list of recent change events (create, update, delete resource) towards (a) destination(s) -  Destination acts upon change events, e.g. GETs created/ updated resources, removes deleted resources. ResourceSync – Herbert Van de Sompel TICER Summer School, August 22 2012, Tilburg, The Netherlands
  • 25. Source Capability 3: Providing Access to Versions In order to allow a destination to catch up with missed changes, a source may support: o  3.1. Historical Change Sets: Provide access to change events that occurred prior to the ones listed in the current Change Set ResourceSync – Herbert Van de Sompel TICER Summer School, August 22 2012, Tilburg, The Netherlands
  • 26.
  • 27.
  • 28.
  • 29. Source Capability 3: Providing Access to Versions In order to allow a destination to catch up with missed changes, a source may support: o  3.1. Historical Change Sets: Provide access to change events that occurred prior to the ones listed in the current Change Set o  3.2. Historical Content: Provide access to prior resource versions ResourceSync – Herbert Van de Sompel TICER Summer School, August 22 2012, Tilburg, The Netherlands
  • 30. Source Capability 4: Transferring Content By default, content is transferred in response to a GET issued by a destination against a URI of a source’s resource. But a source may support additional mechanisms: o  4.1. Dump: Publish a package of resource representations and necessary metadata -  Destination GETs the Dump -  Destination unpacks the Dump ResourceSync – Herbert Van de Sompel TICER Summer School, August 22 2012, Tilburg, The Netherlands
  • 31.
  • 32. Source Capability 4: Transferring Content By default, content is transferred in response to a GET issued by a destination against a URI of a source’s resource. But a source may support additional mechanisms: o  4.1. Dump: Publish a package of resource representations and necessary metadata -  Destination GETs the Dump -  Destination unpacks the Dump o  4.2. Alternate Content Transfer: Support alternative mechanisms to optimize getting content, e.g. content via a mirror site, only changes not the entire changed resource. ResourceSync – Herbert Van de Sompel TICER Summer School, August 22 2012, Tilburg, The Netherlands
  • 33. Source: Advertise Capabilities A source needs to advertise the capabilities it supports to allow a destination to discover them •  Some capabilities may be provided by a third party, not the source itself o  e.g. Historical Change Sets, Historical Content o  But the source should still make those third party capabilities discoverable - trust ResourceSync – Herbert Van de Sompel TICER Summer School, August 22 2012, Tilburg, The Netherlands
  • 34. ResourceSync ResourceSync: What & Why? Problem Perspective & Conceptual Approach Technical Details Q&A ResourceSync – Herbert Van de Sompel TICER Summer School, August 22 2012, Tilburg, The Netherlands
  • 35. So Many Choices Push DSNotify OAI-PMH Pull rsync Crawl OAI-ORE RDFsync WebDAV Col. Syn. XMPP Atom SWORD AtomPub Sitemap RSS SPARQLpush PubSubHubbub SDShare XMPP ResourceSync – Herbert Van de Sompel TICER Summer School, August 22 2012, Tilburg, The Netherlands
  • 36. ResourceSync – Herbert Van de Sompel TICER Summer School, August 22 2012, Tilburg, The Netherlands
  • 37. ResourceSync – Herbert Van de Sompel TICER Summer School, August 22 2012, Tilburg, The Netherlands
  • 38. A Framework Based on Sitemaps •  Modular framework allowing selective deployment •  Sitemap is the core component throughout the framework o  Introduce extension elements and attributes: -  In ResourceSync namespace (rs:) to accommodate synchronization needs -  In XHTML namespace (xhtml:) mainly to accommodate discovery needs o  Reuse Sitemap format for Change Sets (both current and historical) and for manifest in Dump ResourceSync – Herbert Van de Sompel TICER Summer School, August 22 2012, Tilburg, The Netherlands
  • 39. Source Capabilities – Destination Needs ResourceSync – Herbert Van de Sompel TICER Summer School, August 22 2012, Tilburg, The Netherlands
  • 40. Source Capabilities – Destination Needs ResourceSync – Herbert Van de Sompel TICER Summer School, August 22 2012, Tilburg, The Netherlands
  • 41. Sitemap with Added Datetime ResourceSync – Herbert Van de Sompel TICER Summer School, August 22 2012, Tilburg, The Netherlands
  • 42. Change Types: Extend lastmod, Use expires ! ResourceSync – Herbert Van de Sompel TICER Summer School, August 22 2012, Tilburg, The Netherlands
  • 43. Sitemap with lastmod and expires ! ResourceSync – Herbert Van de Sompel TICER Summer School, August 22 2012, Tilburg, The Netherlands
  • 44. Sitemap Discovery via robots.txt ! ResourceSync – Herbert Van de Sompel TICER Summer School, August 22 2012, Tilburg, The Netherlands
  • 45. Source Capabilities – Destination Needs ResourceSync – Herbert Van de Sompel TICER Summer School, August 22 2012, Tilburg, The Netherlands
  • 46. Change Set: An rs Typed Sitemap ResourceSync – Herbert Van de Sompel TICER Summer School, August 22 2012, Tilburg, The Netherlands
  • 47. More rs Extension Elements ResourceSync – Herbert Van de Sompel TICER Summer School, August 22 2012, Tilburg, The Netherlands
  • 48. Change Set with rs and xhtml Extensions ResourceSync – Herbert Van de Sompel TICER Summer School, August 22 2012, Tilburg, The Netherlands
  • 49. Change Set Discovery via Sitemap ResourceSync – Herbert Van de Sompel TICER Summer School, August 22 2012, Tilburg, The Netherlands
  • 50. Pushing Change Sets via XMPP PubSub XMPP Publish-Subscribe: Client to Subscription Service, Subscription Service to Client(s) communication •  One of the XMPP (Extensible Messaging and Presence Protocol) extensions http://xmpp.org/extensions/xep-0060.html •  Apple Notifications based on XMPP PubSub •  Available tools, see http://xmpp.org/about-xmpp/ technology-overview/pubsub/#impl-client o  XMPP Servers with PubSub support: -  ejabberd , OpenFire , Tigase , SleekXMPP o  XMPP libraries with PubSub support: -  Strophe (C, JavaScript), XMPP4R (Ruby), SleekXMPP (Python), PubSub Client (Python) ResourceSync – Herbert Van de Sompel TICER Summer School, August 22 2012, Tilburg, The Netherlands
  • 51. Pushing Change Sets via XMPP PubSub ResourceSync – Herbert Van de Sompel TICER Summer School, August 22 2012, Tilburg, The Netherlands
  • 52. Change Set via XMPP ResourceSync – Herbert Van de Sompel TICER Summer School, August 22 2012, Tilburg, The Netherlands
  • 53. Push Change Set Discovery via Sitemap ResourceSync – Herbert Van de Sompel TICER Summer School, August 22 2012, Tilburg, The Netherlands
  • 54. Source Capabilities – Destination Needs ResourceSync – Herbert Van de Sompel TICER Summer School, August 22 2012, Tilburg, The Netherlands
  • 55. Discovering a Historical Change Set via a Current Change Set ResourceSync – Herbert Van de Sompel TICER Summer School, August 22 2012, Tilburg, The Netherlands
  • 56. Source Capabilities – Destination Needs ResourceSync – Herbert Van de Sompel TICER Summer School, August 22 2012, Tilburg, The Netherlands
  • 57. Discovering Historical Content – Link to Version Resource ResourceSync – Herbert Van de Sompel TICER Summer School, August 22 2012, Tilburg, The Netherlands
  • 58. Memento Intermezzo http://www.mementoweb.org/ ResourceSync – Herbert Van de Sompel TICER Summer School, August 22 2012, Tilburg, The Netherlands
  • 59. Original Resources and Mementos ResourceSync – Herbert Van de Sompel TICER Summer School, August 22 2012, Tilburg, The Netherlands
  • 60. Bridge from Present to Past ResourceSync – Herbert Van de Sompel TICER Summer School, August 22 2012, Tilburg, The Netherlands
  • 61. Bridge from Past to Present ResourceSync – Herbert Van de Sompel TICER Summer School, August 22 2012, Tilburg, The Netherlands
  • 62. Memento Framework ResourceSync – Herbert Van de Sompel TICER Summer School, August 22 2012, Tilburg, The Netherlands
  • 63. Discovering Historical Content – Link to Memento TimeGate ResourceSync – Herbert Van de Sompel TICER Summer School, August 22 2012, Tilburg, The Netherlands
  • 64. Source Capabilities – Destination Needs ResourceSync – Herbert Van de Sompel TICER Summer School, August 22 2012, Tilburg, The Netherlands
  • 65. Dump •  Two formats currently under discussion: o  Format based on ZIP: -  Package content -  Add manifest (manifest.xml) expressed in Sitemap format -  ZIP it up o  WARC files as used by the web archiving community ResourceSync – Herbert Van de Sompel TICER Summer School, August 22 2012, Tilburg, The Netherlands
  • 66. Mapping URI to File Path with rs:path ! ResourceSync – Herbert Van de Sompel TICER Summer School, August 22 2012, Tilburg, The Netherlands
  • 67. Manifest (manifest.xml) Expressed in Sitemap Format ResourceSync – Herbert Van de Sompel TICER Summer School, August 22 2012, Tilburg, The Netherlands
  • 68. Dump Discovery via Sitemap ResourceSync – Herbert Van de Sompel TICER Summer School, August 22 2012, Tilburg, The Netherlands
  • 69. Source Capabilities – Destination Needs ResourceSync – Herbert Van de Sompel TICER Summer School, August 22 2012, Tilburg, The Netherlands
  • 70. Alternate Location ResourceSync – Herbert Van de Sompel TICER Summer School, August 22 2012, Tilburg, The Netherlands
  • 71. Alternate Protocol, e.g. Obtain Changes Only ResourceSync – Herbert Van de Sompel TICER Summer School, August 22 2012, Tilburg, The Netherlands
  • 72. Timeline •  August 2012 o  First draft spec shared for feedback with ResourceSync team •  September 2012 o  In-person meeting of ResourceSync Team o  Revise spec, conduct experiments o  Solicit broad feedback o  Paper in D-Lib Magazine •  December 2012 – Finalize specification (?) ResourceSync – Herbert Van de Sompel TICER Summer School, August 22 2012, Tilburg, The Netherlands
  • 73. Pointers •  First draft spec: http://www.openarchives.org/rs/0.1/resourcesync! •  Simulator code on github http://github.org/resync/simulator! •  NISO workspace http://www.niso.org/workrooms/resourcesync/! ! •  List for public comment coming soon ResourceSync – Herbert Van de Sompel TICER Summer School, August 22 2012, Tilburg, The Netherlands
  • 74. ResourceSync: Web-Based Resource Synchronization Herbert Van de Sompel Los Alamos National Laboratory @hvdsomp ResourceSync is funded by The Sloan Foundation & JISC ResourceSync – Herbert Van de Sompel TICER Summer School, August 22 2012, Tilburg, The Netherlands