ResourceSync:
A Web-Based
Resource Synchronization
Framework

#resourcesync

ResourceSync is funded by
The Sloan Foundation & JISC
ResourceSync Tutorial
DANS, January 21 2014, Den Haag, Netherlands

These slides were presented at the LITA Forum,
Louisville, Kentucky, November 10 2013
The most recent version of the slides is available at
http://www.slideshare.net/OpenArchivesInitiative/resourcesync-tutorial

ResourceSync Tutorial History
• First outing: OAI8, June 2013
• Second run: Open Repositories, July 2013
• Third run: JCDL, July 2013
• Fourth run: TPDL 2013, September 2013
• Fifth run: LITA Forum, November 2013
• Sixth run: SWIB 2013, November 2013

Presenter

Herbert Van de Sompel
Los Alamos National Laboratory
<hvdsomp@gmail.com>
@hvdsomp
ResourceSync Tutorial Contributors

Martin Klein
Los Alamos National Laboratory
<martinklein0815@gmail.com>
@mart1nkle1n

Herbert Van de Sompel
Los Alamos National Laboratory
<hvdsomp@gmail.com>
@hvdsomp

Robert Sanderson
Los Alamos National Laboratory
<azaroth24@gmail.com>
@azaroth24

Simeon Warner
Cornell University
<simeon.warner@cornell.edu>
@zimeon

Michael L. Nelson
Old Dominion University
<mln@cs.odu.edu>
@phonedude_mln

Richard Jones
Cottage Labs
<richard@cottagelabs.com>
@cottagelabs

OAI
Herbert Van de Sompel
Martin Klein
Robert Sanderson
(Los Alamos National Laboratory)
Simeon Warner
(Cornell University)

NISO
Todd Carpenter
Nettie Lagace
University of Oxford
Graham Klyne

Bernhard Haslhofer
(University of Vienna)
Michael L. Nelson
(Old Dominion University)

Lyrasis
Peter Murray

Carl Lagoze
(University of Michigan)

ResourceSync Technical Group
LOCKSS
Ex Libris Inc.
Shlomo Sanders

David Rosenthal

JISC
Paul Walk
Richard Jones
Graham Klyne
Stuart Lewis

RedHat
OCLC
Christian Sadilek

Library of Congress

Jeff Young

Kevin Ford

Timeline, Status of Specification(s)
• August 2013
o Release of ResourceSync framework Core specification
- Version 0.9.1
o Public draft of ResourceSync Archives specification released
• September 2013
o Core specification on its way to become an ANSI standard
• November 2013
o Internal draft of ResourceSync Notification specification
• January 2014
o Public draft of ResourceSync Notification specification
• Mid 2014
o Core specification becomes ANSI/NISO standard

Pointers
• Specification
http://www.openarchives.org/rs/
http://www.openarchives.org/rs/resourcesync
http://www.openarchives.org/rs/notification
http://www.openarchives.org/rs/archives
• List for public comment
https://groups.google.com/d/forum/resourcesync
• Client and simulator code
http://github.com/resync/resync
http://github.com/resync/simulator

Papers
• Klein, M., and Van de Sompel, H. (2013) Extending Sitemaps for
ResourceSync. http://arxiv.org/abs/1305.4890 ACM/IEEE JCDL 2013
• Haslhofer, B., Warner, S., Lagoze, C., Klein, M., Sanderson, R.,
Nelson, M.L., and Van de Sompel, H. (2013) ResourceSync: Leveraging
Sitemaps for Resource Synchronization.
http://arxiv.org/abs/1305.1476 WWW 2013 Developer Track
• Klein, M., Sanderson, R., Van de Sompel, H., Warner, S.,
Haslhofer, B., Lagoze, C., and Nelson, M.L. (2013) A Technical
Framework for Resource Synchronization.
http://dx.doi.org/10.1045/january2013-klein D-Lib Magazine.
• Van de Sompel, H., Sanderson, R., Klein, M., Nelson, M.L.,
Haslhofer, B., Warner, S., and Lagoze, C. (2012) A Perspective on
Resource Synchronization.
http://dx.doi.org/10.1045/september2012-vandesompel D-Lib Magazine.

ResourceSync - Agenda
1. ResourceSync: Problem Perspective & Conceptual
Approach

2. Motivation & Use Cases
3. Framework Walkthrough

4. Framework (Technical) Details
5. Implementation
6. Q&A

ResourceSync - Agenda
1. ResourceSync: Problem Perspective & Conceptual
Approach

Synchronize What?

• Web resources
o things with a URI that can be dereferenced
• Focus on needs of research communication and cultural heritage
organizations
o but aim for generality

Synchronize What?
• Small websites/repositories (a few resources) to large
repositories/datasets/linked data collections (many millions of
resources)

Synchronize What?
• Low change frequency (weeks/months) to high change
frequency (seconds)

Synchronize What?
• Synchronization latency and accuracy needs may vary

Why?
… because lots of projects and services are doing synchronization
but have to resort to ad hoc, case-by-case approaches!
• Project team involved with projects that need this

• Experience with OAI-PMH: widely used in repos but
o XML metadata only
o Web technology has moved on since 1999
• Devise a shared solution for data, metadata, linked data?

ResourceSync Problem
• Consideration:
• Source (server) A has resources that change over time: they
get created, modified, deleted
• Destinations (servers) X, Y, and Z leverage (some)
resources of Source A.
• Problem:
• Destinations want to keep in step with the resource changes
at Source A: resource synchronization.
• Goal:
• Design an approach for resource synchronization aligned
with the Web Architecture that has a fair chance of adoption
by different communities.
• The approach must scale better than recurrent HTTP
HEAD/GET on resources.

Source: Core Synchronization Capabilities

PULL

1. Describing content – publish a list of resources available for
synchronization to enable Destinations to perform an initial load
or catch-up with a Source
2. Packaging content – bundle resources to enable bulk download
by destinations
3. Describing changes – publish a list of resource changes to
enable destinations to stay synchronized and decrease latency
4. Packaging changes – bundle resource changes for bulk
download by destinations

Source: Notifications Capabilities
To reduce synchronization latency and to optimize the synchronization
process the Source can support:

PUSH
1. Change Notification
• Notifies about changes to particular resources
• e.g., resource A has been updated | created | deleted
2. Framework Notification
• Notifies about changes to capabilities i.e., their documents
• e.g., a Change List has been updated | created | deleted

Source: Archival Capabilities

ARCHIVES

The Source may hold on to historical data, for example, to allow
Destinations to catch up with events they missed or revisit prior
resource states. To this end, the Source can publish archives, i.e.
documents that enumerate historical capability documents:
1. Resource List Archive
2. Resource Dump Archive
3. Change List Archive
4. Change Dump Archive

Source: Synchronization Features
1. Discovery of capabilities – support Destinations in discovering
all offered capabilities
o Applies to PULL, PUSH, ARCHIVES capabilities
2. Linking to related resources – provide links from resources
subject to synchronization to related resources
o Applies to PULL, PUSH capabilities

Destination: Synchronization Needs
1. Baseline synchronization – A destination must be able to
perform an initial load or catch-up with a source
- avoid out-of-band setup
2. Incremental synchronization – A destination must have some
way to keep up-to-date with changes at a source
- subject to some latency; minimal: create/update/delete
- allow to catch-up after destination has been offline
3. Audit – A destination should be able to determine whether it is
synchronized with a source
- regarding coverage and accuracy

ResourceSync - Agenda

2. Motivation & Use Cases

Use Cases – The Basics
Cases a) to d) (diagrams not reproduced)

Use Cases – The not-so-Basics
Cases e) and f) (diagrams not reproduced)

Use Case 1: arXiv Mirroring and Data Sharing
• Repository of scholarly articles in
physics, mathematics, computer
science, etc.
• > 850k articles
• approx. 1.5 revisions per article on
average
• approx. 75k new articles per year
• Each article has full-text and separate
metadata record
• approx. 3.8M resources

Use Case 1: arXiv Mirroring and Data Sharing
• 2,700 updates daily
o at 8pm EST
o Currently using homebrew mirroring
solution (running with minor
modifications since 1994!)
o occasional rsync (file-system-specific, auth issues)

Use Case 1: arXiv Mirroring
• GOAL: Keep mirror sites synchronized with daily changes
• WANT:
o high consistency
o moderate latency
o robustness to global network outages (low admin effort)
o ability to verify sync status in case of questions

Use Case 1: arXiv Data Sharing
• GOAL: Make resources and update information
publicly available so that any other service may
synchronize at the frequency it needs, e.g.
o Math Front at UC Davis
o EprintWeb from IOP in UK
o Data for bibliometric and scientometric analysis
• WANT:
o low admin effort (i.e. standard approach, standard tools)
o reasonable consistency, latency, efficiency

Use Case 2: DBpedia Live Duplication
• Average of 2 updates per second
• Low latency desirable => need for a push technology

Use Case 2: DBpedia Live Duplication
• Daily traffic:
o 99% updates
o 0.6% deletions
o 0.03% creations

Use Case 2: DBpedia Live Duplication
• Number of content transfer events in two 8-hour intervals
• Max. queue size of remote duplication process

ResourceSync - Agenda

3. Framework Walkthrough

Source Capability 1: Describing Content
In order to advertise the resources that a source wants destinations
to know about, it may describe them:
o Publish a Resource List, a list of resource URIs and possibly
associated metadata
- Destination GETs the Resource List
- Destination GETs listed resources by their URI
o A Resource List describes the state of a set of resources at
one point in time (snapshot)

Source Capability 2: Packaging Content
By default, content is transferred in response to a GET issued by a
destination against a URI of a source’s resource. But a source may
support additional mechanisms:
o Publish a Resource Dump, a document that points to
packages of resource representations and necessary
metadata
- Destination GETs the package
- Destination unpacks the package
- ZIP format supported
o A Resource Dump and the packages it points to reflect the
state of a set of resources at one point in time (snapshot)

Source:
Modular Capabilities

Source Capability 3: Describing Changes
In order to achieve lower latency and/or greater efficiency, a source
may communicate about changes to its resources:
o Publish a Change List, a list of recent change events
(created, updated, deleted resource)
- Destination acts upon change events, e.g. GETs
created/updated resources, removes deleted resources.
o A Change List pertains to resources that changed in a
temporal interval with a start- and an end-date
- If a resource changed more than once, it will be listed
more than once

Source Capability 4: Packaging Changes
In order to reduce the number of requests to obtain resource
changes, a source may provide packaged bitstreams for changed
resources:
o Publish a Change Dump, a document that points to
packages containing bitstreams of recently changed
resources and necessary metadata
- Destination GETs the package
- Destination unpacks the package
- ZIP format supported
o A Change Dump and its packages pertain to resources that
changed in a temporal interval with a start- and an end-date
- If a resource changed more than once, it will be included
more than once

Source:
Modular Capabilities

Destination: Key Processes

ResourceSync - Agenda

4. Framework (Technical) Details

ResourceSync - Agenda
4. Framework (Technical) Details
1. Sitemaps

2. Core synchronization capabilities (PULL)
3. Discovery
4. Linking to related resources

5. Notification Capabilities (PUSH)
6. Archival capabilities (ARCHIVES)

So Many Choices
Push, crawl, and pull approaches shown on the slide: DSNotify, OAI-PMH,
rsync, OAI-ORE, RDFsync, WebDAV Col. Syn., XMPP, Atom, SWORD, Sitemap,
SPARQLpush, SDShare, AtomPub, RSS, PubSubHubbub

A Framework Based on Sitemaps
• Modular framework allowing selective deployment

• Sitemap is the core format throughout the framework
o Introduce extension elements and attributes:
- In ResourceSync namespace (rs:) to
accommodate synchronization needs
o Reuse Sitemap format for all capability documents:
Resource List, Resource Dump, Change
List, Change Dump, as well as for manifest in
Dumps
o Utilize Sitemap index format where
needed/allowed

Sitemap Format

<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://example.com/res1</loc>
    <lastmod>2013-01-02T13:00:00Z</lastmod>
  </url>
  <url>
    <loc>http://example.com/res2</loc>
    <lastmod>2013-01-02T14:00:00Z</lastmod>
  </url>
  …
</urlset>

Sitemap Index Format

<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>http://example.com/sitemap1.xml</loc>
    <lastmod>2013-01-02T13:00:00Z</lastmod>
  </sitemap>
  <sitemap>
    <loc>http://example.com/sitemap2.xml</loc>
    <lastmod>2013-01-02T14:00:00Z</lastmod>
  </sitemap>
  …
</sitemapindex>

ResourceSync Sitemap Extensions
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:rs="http://www.openarchives.org/rs/terms/">
  <rs:ln …/>
  <rs:md …/>
  <url>
    <loc>http://example.com/res1</loc>
    <lastmod>2013-01-02T13:00:00Z</lastmod>
    <rs:ln …/>
    <rs:md …/>
  </url>
  <url>
    …
  </url>
</urlset>

ResourceSync Sitemap Extensions

<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
              xmlns:rs="http://www.openarchives.org/rs/terms/">
  <rs:ln …/>
  <rs:md …/>
  <sitemap>
    <loc>http://example.com/sitemap1.xml</loc>
    <lastmod>2013-01-02T13:00:00Z</lastmod>
    <rs:ln …/>
    <rs:md …/>
  </sitemap>
  …
</sitemapindex>

Resource Metadata Summary
Element/Attribute  Description                                            Defined by
<loc>              Resource URI (identity)                                sitemaps
<lastmod>          Timestamp of last change                               sitemaps
<changefreq>       Expected update frequency                              sitemaps
<rs:md>                                                                   ResourceSync
  change           Change type (Change List & Change Dump Manifest only)  ResourceSync
  encoding         HTTP Content-Encoding header value                     RFC2616
  hash             One or more content digests (md5, sha-1, sha-256)      Atom Link Ext.
  length           HTTP Content-Length header value                       RFC4287
  path             Path in ZIP package (Dump Manifests only)              ResourceSync
  type             HTTP Content-Type header value                         RFC4287

Related Resource Metadata Summary
• Attributes of the <rs:ln> element; c.f. resource metadata + pri
Element/Attribute  Description                                        Defined by
<rs:ln>                                                               ResourceSync
  encoding         HTTP Content-Encoding header value                 RFC2616
  hash             One or more content digests (md5, sha-1, sha-256)  Atom Link Ext.
  href             Related resource URI (identity)                    RFC4287
  length           HTTP Content-Length header value                   RFC4287
  modified         Timestamp of last change (c.f. <lastmod>)          Atom Link Ext.
  path             Path in ZIP package (Dump Manifests only)          ResourceSync
  pri              Priority of link                                   RFC6249
  rel              Relation - IANA registered or URI                  RFC4287
  type             HTTP Content-Type header value                     RFC4287

Link Relation Summary
Relation                  Use in ResourceSync                      Defined in
rel="alternate"           Link from generic to specific URI        HTML 5
rel="canonical"           Link from specific to generic URI        RFC6596
rel="collection"          Resource is member of collection         RFC6573
rel="contents"            Link from dump to manifest               HTML4
rel="describedby"         Has metadata                             Protocol for Web Description Resources (POWDER): Description Resources
rel="describes"           Is metadata for                          The 'describes' Link Relation Type
rel="duplicate"           Mirror or alternative copy               RFC6249
rel=".../rs/terms/patch"  A patch -- efficient change information  This specification
rel="memento"             Link to time-specific URI                Memento Internet Draft
rel="timegate"            Link to timegate                         Memento Internet Draft
rel="via"                 Provenance chain, came from              RFC4287

ResourceSync Sitemap Validation
• All ResourceSync capability documents are valid according to
the Sitemap XML Schema
o http://www.sitemaps.org/schemas/sitemap/0.9
• For a more thorough validation use the ResourceSync XML
Schema
o http://www.openarchives.org/rs/0.9.1/resourcesync.xsd

ResourceSync - Agenda
4. Framework (Technical) Details
1. Sitemaps

2. Core synchronization capabilities (PULL)
3. Discovery
4. Linking to related resources

5. Notification Capabilities (PUSH)
6. Archival capabilities (ARCHIVES)
http://www.openarchives.org/rs/resourcesync

Describing Content: Resource List

http://www.openarchives.org/rs/resourcesync#DescResources

Resource List
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:rs="http://www.openarchives.org/rs/terms/">
  <rs:md capability="resourcelist"
         at="2013-01-03T09:00:00Z"
         completed="2013-01-03T09:01:00Z"/>
  <url>
    <loc>http://example.com/res1</loc>
    <lastmod>2013-01-02T13:00:00Z</lastmod>
    <rs:md hash="md5:1584abdf8ebdc9802ac0c6a7402c03b6"
           length="8876"
           type="text/html"/>
  </url>
  <url>
    …
  </url>
</urlset>

Resource List
• Describe Source’s resources that are subject to synchronization
• At one point in time (snapshot)
• Creation can take some time – duration can be conveyed
• Typical Destination use: Baseline Synchronization, Audit

• Each URI typically listed only once
• Might be expensive to generate
• Destinations use @at to determine freshness
• [@at, @completed] – interval of uncertainty
• Destination issues GETs against URIs to obtain resources
• Very similar to current Sitemaps
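
The points above can be illustrated with a minimal Python sketch of how a Destination might read a Resource List. It uses only the standard library; the function names and the `SAMPLE` document are illustrative assumptions modelled on the slide example, not part of the ResourceSync specification or of any client library.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# Namespaces as used in the slide examples.
NS = {
    "sm": "http://www.sitemaps.org/schemas/sitemap/0.9",
    "rs": "http://www.openarchives.org/rs/terms/",
}

# Illustrative document modelled on the Resource List slide.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:rs="http://www.openarchives.org/rs/terms/">
  <rs:md capability="resourcelist"
         at="2013-01-03T09:00:00Z"
         completed="2013-01-03T09:01:00Z"/>
  <url>
    <loc>http://example.com/res1</loc>
    <lastmod>2013-01-02T13:00:00Z</lastmod>
    <rs:md hash="md5:1584abdf8ebdc9802ac0c6a7402c03b6"
           length="8876" type="text/html"/>
  </url>
</urlset>"""


def parse_time(value):
    # Assumption: only the UTC "Z" form shown on the slides is handled.
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")


def read_resource_list(xml_text):
    """Return (capability, uncertainty_seconds, resources)."""
    root = ET.fromstring(xml_text)
    md = root.find("rs:md", NS)
    at = parse_time(md.get("at"))
    completed = md.get("completed")
    # [@at, @completed] is the interval of uncertainty, if @completed is given.
    window = (parse_time(completed) - at).total_seconds() if completed else None
    resources = []
    for url in root.findall("sm:url", NS):
        item = url.find("rs:md", NS)
        length = item.get("length") if item is not None else None
        resources.append({
            "uri": url.findtext("sm:loc", namespaces=NS),
            "lastmod": url.findtext("sm:lastmod", namespaces=NS),
            "hash": item.get("hash") if item is not None else None,
            "length": int(length) if length else None,
        })
    return md.get("capability"), window, resources
```

A Destination would then issue a GET for each `uri` and could compare the downloaded bytes against `hash` and `length` when auditing.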

What if I have a million resources?
• Current sitemap limit is 50k resources (or maximum document
size of 50MB)
• Break complete list of resources into 50k-resource chunks, each
on a Resource List document
• Create a Resource List Index document to group them:
o Based on <sitemapindex>
o May have up to 50k component Resource Lists
o Extends capacity to 2,500,000,000 resources within current
community practices
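
As a rough illustration of the arithmetic, the following Python sketch splits a list of URIs into chunks of at most 50,000 and names one Resource List per chunk. The helper names and the `http://example.com/resourcelist` base are assumptions chosen to match the slide examples.

```python
MAX_ENTRIES = 50_000  # per-document limit quoted on the slide


def chunk_uris(uris, size=MAX_ENTRIES):
    """Yield consecutive slices of at most `size` URIs."""
    for start in range(0, len(uris), size):
        yield uris[start:start + size]


def plan_resource_lists(uris, base="http://example.com/resourcelist"):
    """Return (resource_list_uris, chunks) for a Resource List Index.

    The base URI is a hypothetical naming scheme, not mandated anywhere.
    """
    chunks = list(chunk_uris(uris))
    if len(chunks) > MAX_ENTRIES:
        raise ValueError("too many component Resource Lists for one index")
    names = [f"{base}{i}.xml" for i in range(1, len(chunks) + 1)]
    return names, chunks


# One million resources need 20 component Resource Lists.
uris = [f"http://example.com/res{i}" for i in range(1_000_000)]
names, chunks = plan_resource_lists(uris)

# Upper bound for one index: 50k lists of 50k resources each.
capacity = MAX_ENTRIES * MAX_ENTRIES
```

With these limits a single index covers 50,000 × 50,000 = 2,500,000,000 resources, which is the figure given on the slide.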

Resource List Index <resourcelist_index.xml>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
              xmlns:rs="http://www.openarchives.org/rs/terms/">
  <rs:md capability="resourcelist"
         at="2013-01-02T09:00:02Z"/>
  <sitemap>
    <loc>http://example.com/resourcelist1.xml</loc>
    <rs:md type="application/xml"/>
  </sitemap>
  <sitemap>
    <loc>http://example.com/resourcelist2.xml</loc>
    <rs:md type="application/xml"/>
  </sitemap>
</sitemapindex>

Resource List <resourcelist1.xml>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:rs="http://www.openarchives.org/rs/terms/">
  <rs:ln rel="index"
         href="http://example.com/resourcelist_index.xml"/>
  <rs:md capability="resourcelist"
         at="2013-01-02T09:00:00Z"/>
  <url>
    <loc>http://example.com/res1</loc>
    <lastmod>2013-01-02T08:07:06Z</lastmod>
    <rs:md hash="md5:1584abdf8ebdc9802ac0c6a7402c03b6"
           length="8876"
           type="text/html"/>
  </url>
  ...
</urlset>

Resource List Index

Packaging Content: Resource Dump

http://www.openarchives.org/rs/resourcesync#ResourceDump

Resource Dump
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:rs="http://www.openarchives.org/rs/terms/">
  <rs:md capability="resourcedump"
         at="2013-01-02T09:00:00Z"/>
  <url>
    <loc>http://example.com/resourcedump_part1.zip</loc>
    <lastmod>2013-01-02T13:00:00Z</lastmod>
    <rs:md length="97553"
           type="application/zip"/>
    <rs:ln rel="contents"
           href="http://example.com/resourcedump_manifest-part1.xml"
           type="application/xml"/>
  </url>
  <url>
    <loc>http://example.com/resourcedump_part2.zip</loc>
    <lastmod>2013-01-02T13:00:00Z</lastmod>
  </url>
</urlset>

Resource Dump Manifest
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:rs="http://www.openarchives.org/rs/terms/">
  <rs:md capability="resourcedump-manifest"
         at="2013-01-02T09:00:00Z"/>
  <url>
    <loc>http://example.com/res1</loc>
    <lastmod>2013-01-02T13:00:00Z</lastmod>
    <rs:md type="text/html"
           path="/resources/res1"/>
  </url>
  <url>
    <loc>http://example.com/res2</loc>
    <lastmod>2013-01-02T13:00:00Z</lastmod>
    <rs:md type="application/pdf"
           path="/resources/res2"/>
  </url>
</urlset>

Resource Dump
• A Resource Dump points to packages (ZIP files) that contain
representations of the Source’s resources
• At one point in time (snapshot)
• Resource Dump is mandatory, even if there is only one ZIP file
• ZIP package contains manifest, listing contained bitstreams
• Typical Destination use: Baseline Synchronization, bulk
download

• Each URI typically listed only once
• Might be expensive to generate
• Destinations use @at to determine freshness
• [@at, @completed] – interval of uncertainty
• GETs against individual URIs from Resource List achieve the
same result (ignoring varying freshness)
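
How a Destination might unpack such a package can be sketched in Python as follows. This is a hedged illustration: the member name `manifest.xml`, the stripping of the leading "/" from `path`, and the in-memory demo package are all assumptions made for the example, not statements about what the specification requires.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

NS = {
    "sm": "http://www.sitemaps.org/schemas/sitemap/0.9",
    "rs": "http://www.openarchives.org/rs/terms/",
}

# Illustrative manifest modelled on the Resource Dump Manifest slide.
MANIFEST = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:rs="http://www.openarchives.org/rs/terms/">
  <rs:md capability="resourcedump-manifest" at="2013-01-02T09:00:00Z"/>
  <url>
    <loc>http://example.com/res1</loc>
    <rs:md type="text/html" path="/resources/res1"/>
  </url>
</urlset>"""


def build_demo_package():
    """Create a tiny ZIP in memory so the sketch runs without network access."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("manifest.xml", MANIFEST)  # assumed member name
        package.writestr("resources/res1", b"<html>res1</html>")
    return buffer.getvalue()


def read_dump_package(zip_bytes, manifest_name="manifest.xml"):
    """Map each resource URI in the manifest to its bitstream in the ZIP."""
    contents = {}
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as package:
        manifest = ET.fromstring(package.read(manifest_name))
        for url in manifest.findall("sm:url", NS):
            uri = url.findtext("sm:loc", namespaces=NS)
            path = url.find("rs:md", NS).get("path")
            # Manifest paths on the slides start with "/"; ZIP member
            # names do not, so the leading slash is dropped here.
            contents[uri] = package.read(path.lstrip("/"))
    return contents
```

In a real deployment the bytes passed to `read_dump_package` would come from a GET on one of the ZIP URIs listed in the Resource Dump.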

Describing Changes: Change List

http://www.openarchives.org/rs/resourcesync#DesChanges

Change List
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:rs="http://www.openarchives.org/rs/terms/">
  <rs:md capability="changelist"
         from="2013-01-02T09:00:00Z"
         until="2013-01-03T09:00:00Z"/>
  <url>
    <loc>http://example.com/res1</loc>
    <lastmod>2013-01-02T13:00:00Z</lastmod>
    <rs:md change="updated"
           hash="md5:1584abdf8ebdc9802ac0c6a7402c03b6"
           length="8876"
           type="text/html"/>
  </url>
  <url>
    …
  </url>
</urlset>

Open Change List
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:rs="http://www.openarchives.org/rs/terms/">
  <rs:md capability="changelist"
         from="2013-01-02T09:00:00Z"/>
  <url>
    <loc>http://example.com/res1</loc>
    <lastmod>2013-01-02T13:00:00Z</lastmod>
    <rs:md change="updated"
           hash="md5:1584abdf8ebdc9802ac0c6a7402c03b6"
           length="8876"
           type="text/html"/>
  </url>
</urlset>

Change List
• A Change List pertains to a Source’s resources that changed
• Changes that occurred during a temporal interval with start- and end-date
• Typical Destination use: Incremental Synchronization, Audit
• Changes are listed in chronological order
• Multiple changes to one resource result in the resource being
listed multiple times, once per change
• Source determines duration of temporal interval
• Destinations use @from and @until to determine freshness
• Destinations issue GETs against URIs to obtain changed
resources
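
A minimal Python sketch of incremental synchronization from a Change List might look as follows. The in-memory `mirror` dictionary, the `fake_fetch` stand-in for an HTTP GET, and the sample document (including the `deleted` entry for `res2`) are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

NS = {
    "sm": "http://www.sitemaps.org/schemas/sitemap/0.9",
    "rs": "http://www.openarchives.org/rs/terms/",
}

# Illustrative Change List: one update and one deletion.
SAMPLE_CHANGE_LIST = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:rs="http://www.openarchives.org/rs/terms/">
  <rs:md capability="changelist"
         from="2013-01-02T09:00:00Z"
         until="2013-01-03T09:00:00Z"/>
  <url>
    <loc>http://example.com/res1</loc>
    <lastmod>2013-01-02T13:00:00Z</lastmod>
    <rs:md change="updated"/>
  </url>
  <url>
    <loc>http://example.com/res2</loc>
    <lastmod>2013-01-02T14:00:00Z</lastmod>
    <rs:md change="deleted"/>
  </url>
</urlset>"""


def fake_fetch(uri):
    # Stand-in for an HTTP GET; a real client would download the resource.
    return ("content of " + uri).encode("utf-8")


def apply_change_list(xml_text, mirror, fetch):
    """Apply change events to `mirror` (a dict of URI -> bytes), in document order."""
    root = ET.fromstring(xml_text)
    for url in root.findall("sm:url", NS):
        uri = url.findtext("sm:loc", namespaces=NS)
        change = url.find("rs:md", NS).get("change")
        if change in ("created", "updated"):
            mirror[uri] = fetch(uri)
        elif change == "deleted":
            mirror.pop(uri, None)
    return mirror
```

Because changes are listed chronologically and a resource may appear several times, processing entries in document order leaves the mirror in the state of the last listed change.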

Change List Index
<changelist_index.xml>

<changelist1.xml>

Change List Index <changelist_index.xml>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
              xmlns:rs="http://www.openarchives.org/rs/terms/">
  <rs:md capability="changelist"
         from="2013-01-02T09:00:00Z"
         until="2013-01-03T09:00:00Z"/>
  <sitemap>
    <loc>http://example.com/changelist1.xml</loc>
    <lastmod>2013-01-02T11:00:00Z</lastmod>
    <rs:md type="application/xml"/>
  </sitemap>
  <sitemap>
    <loc>http://example.com/changelist2.xml</loc>
    <lastmod>2013-01-02T23:00:00Z</lastmod>
    <rs:md type="application/xml"/>
  </sitemap>
</sitemapindex>

Change List <changelist1.xml>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:rs="http://www.openarchives.org/rs/terms/">
  <rs:ln rel="index"
         href="http://example.com/changelist_index.xml"/>
  <rs:md capability="changelist"
         from="2013-01-02T09:00:00Z"
         until="2013-01-02T21:00:00Z"/>
  <url>
    <loc>http://example.com/res1</loc>
    <lastmod>2013-01-02T13:00:00Z</lastmod>
    <rs:md change="updated"
           hash="md5:1584abdf8ebdc9802ac0c6a7402c03b6"
           length="8876"
           type="text/html"/>
  </url>
</urlset>

Open Change List Index
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
              xmlns:rs="http://www.openarchives.org/rs/terms/">
  <rs:md capability="changelist"
         from="2013-01-02T09:00:00Z"/>
  <sitemap>
    <loc>http://example.com/changelist1.xml</loc>
    <lastmod>2013-01-02T11:00:00Z</lastmod>
  </sitemap>
  <sitemap>
    <loc>http://example.com/changelist2.xml</loc>
    <lastmod>2013-01-02T23:00:00Z</lastmod>
  </sitemap>
  <sitemap>
    <loc>http://example.com/changelist_open.xml</loc>
  </sitemap>
</sitemapindex>

Change List Index

Packaging Changes: Change Dump

http://www.openarchives.org/rs/resourcesync#PackChanges

Capability 4: Change Dump
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:rs="http://www.openarchives.org/rs/terms/">
  <rs:md capability="changedump"
         from="2013-01-02T09:00:00Z"
         until="2013-01-03T09:00:00Z"/>
  <url>
    <loc>http://example.com/change_dump_part1.zip</loc>
    <lastmod>2013-01-02T13:00:00Z</lastmod>
    <rs:md length="887"
           type="application/zip"/>
  </url>
  <url>
    <loc>http://example.com/change_dump_part2.zip</loc>
    <lastmod>2013-01-02T13:00:00Z</lastmod>
    <rs:md length="9767"
           type="application/zip"/>
  </url>
</urlset>

Change Dump Manifest
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:rs="http://www.openarchives.org/rs/terms/">
  <rs:md capability="changedump-manifest"
         from="2013-01-02T09:00:00Z"
         until="2013-01-03T09:00:00Z"/>
  <url>
    <loc>http://example.com/res1</loc>
    <lastmod>2013-01-02T13:00:00Z</lastmod>
    <rs:md change="updated"
           length="2887"
           type="text/html"
           path="/changes/res1"/>
  </url>
  <url>
    …
  </url>
</urlset>

Change Dump
• A Change Dump points at packages (ZIP files) that contain
bitstreams of the Source’s resources that changed
• Changes that occurred during a temporal interval with start- and end-date
• Change Dump is mandatory, even if there is only one ZIP file
• ZIP package contains manifest, listing contained bitstreams
• Typical Destination use: Incremental Synchronization, bulk
download of changes
• Changes in Change Dump Manifest listed in chronological order
• Same URI can be listed multiple times
• Might be expensive to generate
• Destinations use @from and @until to determine freshness

ResourceSync - Agenda
4. Framework (Technical) Details
1. Sitemaps

2. Core synchronization capabilities (PULL)
3. Discovery
4. Linking to related resources

5. Notification Capabilities (PUSH)
6. Archival capabilities (ARCHIVES)
http://www.openarchives.org/rs/resourcesync#Discovery

Discovery of Capabilities
Requirements:
• Need to discover capabilities, i.e. Resource List, Resource
Dump, Change List, Change Dump, Archives, Notification
channels
• Need to know the type of capability each document
represents.
Approach:
• The Source publishes a Capability List that enumerates the
capabilities it supports.
• By pointing at Resource List, Change List, Resource Dump,
etc. using appropriate relation types, e.g. “resourcelist”,
“changelist”, “resourcedump” etc.
http://www.openarchives.org/rs/resourcesync#CapabilityList

Discovery of Capabilities

Capability List
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
xmlns:rs="http://www.openarchives.org/rs/terms/">
<rs:md capability="capabilitylist"/>
<url>
<loc>http://example.com/dataset1/resourcelist.xml</loc>
<rs:md capability="resourcelist"/>
</url>
<url>
<loc>http://example.com/dataset1/changelist.xml</loc>
<rs:md capability="changelist"/>
</url>
<url>
<loc>http://example.com/dataset1/resourcedump.xml</loc>
<rs:md capability="resourcedump"/>
</url>
</urlset>
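
A Destination could read such a Capability List with any XML parser. The following Python sketch uses the standard library's ElementTree and returns a mapping from capability name to document URI; the function name is an assumption, and real code would fetch the document over HTTP instead of using an inline string:

```python
import xml.etree.ElementTree as ET

# Namespace prefixes used in the XPath-like queries below.
NS = {
    "sm": "http://www.sitemaps.org/schemas/sitemap/0.9",
    "rs": "http://www.openarchives.org/rs/terms/",
}

CAPABILITY_LIST = """
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:rs="http://www.openarchives.org/rs/terms/">
  <rs:md capability="capabilitylist"/>
  <url>
    <loc>http://example.com/dataset1/resourcelist.xml</loc>
    <rs:md capability="resourcelist"/>
  </url>
  <url>
    <loc>http://example.com/dataset1/changelist.xml</loc>
    <rs:md capability="changelist"/>
  </url>
  <url>
    <loc>http://example.com/dataset1/resourcedump.xml</loc>
    <rs:md capability="resourcedump"/>
  </url>
</urlset>
"""


def capabilities(xml_text):
    """Return {capability: uri} for every <url> entry that declares a capability."""
    root = ET.fromstring(xml_text.strip())
    found = {}
    for url in root.findall("sm:url", NS):
        loc = url.findtext("sm:loc", namespaces=NS)
        md = url.find("rs:md", NS)
        if loc and md is not None and md.get("capability"):
            found[md.get("capability")] = loc.strip()
    return found


supported = capabilities(CAPABILITY_LIST)
```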

Discovery of Capability Lists
Requirements:
• Need to discover a Capability List
Approaches:
• Introduce a link in the HTTP Link header of a resource that is
subject to synchronization, pointing at the Capability List with the
relation type “resourcesync”
• Introduce a link from an HTML document that is subject to
synchronization (<head> section), pointing at the Capability List
with the relation type “resourcesync”
• Link from a Resource List, etc. to the Capability List with the
relation type “up”
Link header on http://example.com/res1.pdf:
Link: <http://example.com/dataset1/capabilitylist.xml>; rel="resourcesync"
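
On the Destination side, the Link header value has to be scanned for the "resourcesync" relation. A simplified Python sketch follows; it handles the common case of comma-separated links with quoted or unquoted rel values, but it is not a complete Link header parser (commas inside quoted parameters are not handled), and the function name is illustrative:

```python
import re

# One link-value: <URI> followed by zero or more ";param" parts.
LINK_VALUE = re.compile(r'<([^>]*)>((?:\s*;\s*[^,;]+)*)')
# The rel parameter, anchored to a ";" so that other parameter names do not match.
REL_PARAM = re.compile(r';\s*rel\s*=\s*"?([^";,]+)"?')


def find_capability_list(link_header):
    """Return the first target URI whose rel includes "resourcesync", or None."""
    for match in LINK_VALUE.finditer(link_header):
        uri, params = match.group(1), match.group(2)
        rel = REL_PARAM.search(params)
        # A rel value may hold several space-separated relation types.
        if rel and "resourcesync" in rel.group(1).split():
            return uri
    return None


header = (
    '<http://example.com/about>; rel="describedby", '
    '<http://example.com/dataset1/capabilitylist.xml>;rel="resourcesync"'
)
capability_list_uri = find_capability_list(header)
```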

Discovery of Capabilities

Discovery: Source Description
Requirements:
• Support for multiple Capability Lists, one per “set of
resources”
• Need to discover these Capability Lists
• Need descriptive information about each set of resources
that a Capability List pertains to
• Useful to have descriptive information about the Source itself

Approach:
• The Source Description document meets these requirements.
• It should be at a particular location to avoid having registries:
http://(hostname)/.well-known/resourcesync
• It can be linked to from the Capability Lists as well.
http://www.openarchives.org/rs/resourcesync#SourceDesc
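
Since the location is fixed per host, a Destination can derive the Source Description URI from any resource URI it already knows. A small Python sketch, assuming the Source follows the well-known location shown above; the function name is illustrative:

```python
from urllib.parse import urlsplit, urlunsplit


def source_description_uri(resource_uri):
    """Build the well-known Source Description URI for the host of resource_uri."""
    parts = urlsplit(resource_uri)
    # Keep scheme and host (including any port); drop path, query and fragment.
    return urlunsplit(
        (parts.scheme, parts.netloc, "/.well-known/resourcesync", "", "")
    )


uri = source_description_uri("http://example.com/dataset1/res1.pdf?download=1")
```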

Discovery of Capabilities

Discovery of Capabilities

Source Description
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
xmlns:rs="http://www.openarchives.org/rs/terms/">
<rs:md capability="description"/>
<rs:ln rel="describedby"
href="http://example.com/info_about_source.xml"/>
<url>
<loc>http://example.com/dataset1/capabilitylist.xml</loc>
<rs:md capability="capabilitylist"/>
<rs:ln rel="describedby"
href="http://example.com/dataset1/info_about_dataset1.xml"/>
</url>
<url>
<loc>http://example.com/dataset2/capabilitylist.xml</loc>
<rs:md capability="capabilitylist"/>
<rs:ln rel="describedby"
href="http://example.com/dataset2/info_about_dataset2.xml"/>
</url>
</urlset>

Discovery via robots.txt
• Resource Lists are (enhanced) Sitemaps
• Sitemaps can be discovered via robots.txt
• Ergo, Resource Lists should be discoverable via robots.txt
User-agent: *
Disallow: /cgi-bin/
Disallow: /tmp/
Sitemap: http://example.com/dataset1/resourcelist.xml
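
A crawler-style client could pick the Sitemap entries out of robots.txt with a few lines of code. The Python sketch below is an assumption about how a Destination might do this, not a prescribed procedure; it only looks at "Sitemap:" lines and ignores everything else:

```python
def sitemap_uris(robots_txt):
    """Return the URIs named on Sitemap: lines of a robots.txt body."""
    uris = []
    for line in robots_txt.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        field, sep, value = line.partition(":")
        # Field names in robots.txt are matched case-insensitively here.
        if sep and field.strip().lower() == "sitemap":
            uris.append(value.strip())
    return uris


ROBOTS_TXT = """User-agent: *
Disallow: /cgi-bin/
Disallow: /tmp/
Sitemap: http://example.com/dataset1/resourcelist.xml
"""

resource_lists = sitemap_uris(ROBOTS_TXT)
```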

Discovery of Capabilities

http://www.openarchives.org/rs/resourcesync#Discovery

Framework Navigation

http://www.openarchives.org/rs/resourcesync#Navigation

e.g., Capability List
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
xmlns:rs="http://www.openarchives.org/rs/terms/">
<rs:md capability="capabilitylist"/>
<rs:ln rel="up"
href="http://example.com/.well-known/resourcesync"/>
<url>
<loc>http://example.com/dataset1/resourcelist.xml</loc>
<rs:md capability="resourcelist"/>
</url>
<url>
<loc>http://example.com/dataset1/changelist.xml</loc>
<rs:md capability="changelist"/>
</url>
<url>
<loc>http://example.com/dataset1/resourcedump.xml</loc>
<rs:md capability="resourcedump"/>
</url>
</urlset>

Framework Structure

http://www.openarchives.org/rs/resourcesync#Structure

Framework Structure

ResourceSync - Agenda
4. Framework (Technical) Details

4. Linking to related resources

http://www.openarchives.org/rs/resourcesync#LinkRelRes

Supported Linking Use Cases
Provide links to related resources to address specific resource
synchronization needs.

1. Mirrored content with multiple download locations
2. Alternate representations of the same content
3. Patching content rather than replacing it
4. Resources and metadata about resources
5. Prior versions of resources
6. Collection membership of resources
7. Republishing synchronized resources

All cases are handled with a <rs:ln> element referring to the linked
resource

Notes about Linked Resources
Some important things to keep in mind about linked resources:
• They may also be subject to synchronization
• They may be updated in a very different schedule than the
resources that link to them
• Therefore, it is recommended to convey metadata about the
linked resource too
• Links can be bi-directional – the linked resource can link back to
the linking resource

Linking #1 - Mirror
1. Content with multiple download locations
This may be of interest for:
• Content distribution networks
• Mirror sites
• Backup locations
• Load balancing

http://www.openarchives.org/rs/0.9.1/resourcesync#MirCon

Linking #1 - Mirror
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
xmlns:rs="http://www.openarchives.org/rs/terms/">
<rs:md capability="changelist"
from="2013-01-02T09:00:00Z"
until="2013-01-03T09:00:00Z"/>
<url>
<loc>http://example.com/res1</loc>
<lastmod>2013-01-02T13:00:00Z</lastmod>
<rs:md change="updated"/>
<rs:ln rel="duplicate"
pri="1"
href="http://mirror1.example.com/res1"/>
<rs:ln rel="duplicate"
pri="2"
href="http://mirror2.example.com/res1"/>
</url>
</urlset>
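
A Destination might turn the duplicate links into an ordered list of download locations. The Python sketch below assumes that a lower pri value means a more preferred mirror and that the original location is tried last; both the ordering policy and the function name are assumptions for illustration:

```python
import xml.etree.ElementTree as ET

NS = {
    "sm": "http://www.sitemaps.org/schemas/sitemap/0.9",
    "rs": "http://www.openarchives.org/rs/terms/",
}

URL_ENTRY = """
<url xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
     xmlns:rs="http://www.openarchives.org/rs/terms/">
  <loc>http://example.com/res1</loc>
  <rs:ln rel="duplicate" pri="2" href="http://mirror2.example.com/res1"/>
  <rs:ln rel="duplicate" pri="1" href="http://mirror1.example.com/res1"/>
</url>
"""


def download_order(url_element):
    """List mirror URIs by ascending pri, followed by the <loc> URI as fallback."""
    mirrors = [
        ln for ln in url_element.findall("rs:ln", NS)
        if ln.get("rel") == "duplicate" and ln.get("href")
    ]
    # Links without a pri attribute sort after all links that have one.
    mirrors.sort(key=lambda ln: int(ln.get("pri", "999999")))
    return [ln.get("href") for ln in mirrors] + [
        url_element.findtext("sm:loc", namespaces=NS)
    ]


candidates = download_order(ET.fromstring(URL_ENTRY.strip()))
```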

Linking #2 – Alternate Representations
2. Alternate representations of the same content

This may be of interest for:
• Resources subject to HTTP content negotiation
• Format migration for preservation reasons
• Different clients wanting different formats
• Multiple languages of the content

http://www.openarchives.org/rs/resourcesync#AltRep

Linking #2 – Alternate Representations
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
xmlns:rs="http://www.openarchives.org/rs/terms/">
<rs:md capability="changelist"
from="2013-01-02T09:00:00Z"
until="2013-01-03T09:00:00Z"/>
<url>
<loc>http://example.com/res1</loc>
<lastmod>2013-01-02T13:00:00Z</lastmod>
<rs:md change="updated"/>
<rs:ln rel="alternate"
type="text/html"
href="http://example.com/res1.html"/>
<rs:ln rel="alternate"
type="application/pdf"
href="http://example.com/res1.pdf"/>
</url>
</urlset>

Linking #2 – Alternate Representations
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
xmlns:rs="http://www.openarchives.org/rs/terms/">
<rs:md capability="changelist"
from="2013-01-02T09:00:00Z"
until="2013-01-03T09:00:00Z"/>
<url>
<loc>http://example.com/res1.html</loc>
<lastmod>2013-01-02T13:00:00Z</lastmod>
<rs:md change="updated"/>
<rs:ln rel="canonical"
href="http://example.com/res1"/>
</url>
</urlset>

Linking #3 – Patching Content
3. Patching content rather than replacing it

This may be of interest when:
• Resources are very large and server wishes to conserve
bandwidth where possible
• Changes are frequent and small
• Changes are managed in a CMS that tracks differences
Need:
• Machine processable format to describe a change in a
manner that allows patching a representation
• Existing or newly defined by communities
http://www.openarchives.org/rs/resourcesync#PatchCon

Linking #3 – Patching Content
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
xmlns:rs="http://www.openarchives.org/rs/terms/">
<rs:md capability="changelist"
from="2013-01-02T09:00:00Z"
until="2013-01-03T09:00:00Z"/>
<url>
<loc>http://example.com/res1.json</loc>
<lastmod>2013-01-02T13:00:00Z</lastmod>
<rs:md change="updated"
length="398723"/>
<rs:ln rel="http://www.openarchives.org/rs/terms/patch"
type="application/json-patch"
modified="2013-01-02T17:00:00Z"
length="58"
href="http://example.com/res1-patch.json"/>
</url>
</urlset>
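
The two length attributes let a Destination compare the cost of fetching the patch with the cost of fetching the full representation. A rough Python sketch of that decision; the function, its parameters and the policy itself are assumptions for illustration and are not defined by ResourceSync:

```python
def choose_download(resource_length, patch_length, have_previous_copy):
    """Return "patch" when patching is possible and cheaper, else "full"."""
    if not have_previous_copy:
        return "full"  # nothing local to apply a patch to
    if patch_length is None or resource_length is None:
        return "full"  # sizes unknown: fall back to the full representation
    return "patch" if patch_length < resource_length else "full"


# Values taken from the example entry above: 398723 bytes vs. a 58-byte patch.
decision = choose_download(398723, 58, have_previous_copy=True)
```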

Linking #4 – Metadata about Resources
4. Resources and metadata about resources

This may be of interest when:
• Resources have associated descriptive metadata records,
which are useful for understanding the resource
• Such as cultural heritage images, audio, video
• Resources that have associated technical, administrative,
rights metadata

http://www.openarchives.org/rs/resourcesync#ResMDLinking

Linking #4 – Metadata about Resources
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
xmlns:rs="http://www.openarchives.org/rs/terms/">
<rs:md capability="changelist"
from="2013-01-02T09:00:00Z"
until="2013-01-03T09:00:00Z"/>
<url>
<loc>http://example.com/res1</loc>
<lastmod>2013-01-02T13:00:00Z</lastmod>
<rs:md change="updated"/>
<rs:ln rel="describedby"
type="application/xml"
href="http://example.com/metadata/res1.xml"/>
</url>
</urlset>

Linking #4 – Metadata about Resources
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
xmlns:rs="http://www.openarchives.org/rs/terms/">
<rs:md capability="changelist"
from="2013-01-02T09:00:00Z"
until="2013-01-03T09:00:00Z"/>
<url>
<loc>http://example.com/metadata/res1.xml</loc>
<lastmod>2013-01-02T13:00:00Z</lastmod>
<rs:md change="updated"/>
<rs:ln rel="describes"
type="text/html"
href="http://example.com/res1"/>
</url>
</urlset>

Linking #5 – Prior Versions of Resources
This may be of interest when:
• A Destination needs to have a copy of all versions of a
resource

http://www.openarchives.org/rs/resourcesync#ResVers

Memento Intermezzo

http://www.mementoweb.org/
URI for Original, URI for Version

Web Archive

URI-M - http://web.archive.org/web/20010911203610/http://www.cnn.com/

URI-R - http://www.cnn.com/
URI for Original, URI for Version

CMS

URI-M - http://en.wikipedia.org/w/index.php?title=September_11_attacks&oldid=282333

URI-R - http://en.wikipedia.org/wiki/September_11_attacks
Memento Time Travel extension for Chrome

Download extension at http://bit.ly/memento-for-chrome
Linking #5 – Prior Versions of Resources
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
xmlns:rs="http://www.openarchives.org/rs/terms/">
<rs:md capability="changelist"
from="2013-01-02T09:00:00Z"
until="2013-01-03T09:00:00Z"/>
<url>
<loc>http://example.com/res1</loc>
<lastmod>2013-01-02T13:00:00Z</lastmod>
<rs:md change="updated"/>
<rs:ln rel="memento"
href="http://example.com/past/20130102130000/res1"/>
<rs:ln rel="timegate"
href="http://example.com/timegate/res1"/>
<rs:ln rel="timemap"
href="http://example.com/timemap/res1"
type="application/link-format"/>
</url>
</urlset>

Linking #6 – Collection Membership
6. Collection membership of resources

This may be of interest when:
• Resources are part of OAI-ORE aggregations
• Resources are part of OAI-PMH sets
• To indicate any other type of collections of resources

Collections are named with URIs and can then be linked to with
rel=“collection”
• Nice if the collection URI resolves to a useful description

http://www.openarchives.org/rs/resourcesync#ColMem

Linking #6 – Collection Membership
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
xmlns:rs="http://www.openarchives.org/rs/terms/">
<rs:md capability="changelist"
from="2013-01-02T09:00:00Z"
until="2013-01-03T09:00:00Z"/>
<url>
<loc>http://example.com/res1</loc>
<lastmod>2013-01-02T13:00:00Z</lastmod>
<rs:md change="updated"/>
<rs:ln rel="collection"
href="http://example.com/aggregation/allres"/>
</url>
</urlset>

Linking #7 – Republishing Resources
7. Republishing synchronized resources

This may be of interest when:
• Aggregator systems harvest resources from Sources and
then republish them at new URIs
Examples include Blog republishing, content distribution networks,
mirrored or combined collections
Hypothetical scenario: Lots of little museums with small collections,
and a large European/American aggregating digital library system
that wants to provide fast, combined access to the content (with
permission)
http://www.openarchives.org/rs/resourcesync#RePub

Linking #7 – Republishing Resources #1
• Original Source publishes information about a changed resource
via a Change List

<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
xmlns:rs="http://www.openarchives.org/rs/terms/">
<rs:md capability="changelist"
from="2013-01-03T00:00:00Z"/>
<url>
<loc>http://original.example.com/res1</loc>
<lastmod>2013-01-03T07:00:00Z</lastmod>
<rs:md change="updated"/>
</url>
</urlset>

Linking #7 – Republishing Resources #2
• Aggregator 1 republishes information about the changed
resource with reference to the original Source
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
xmlns:rs="http://www.openarchives.org/rs/terms/">
<rs:md capability="changelist"
from="2013-01-03T11:00:00Z"/>
<url>
<loc>http://aggregator1.example.com/res1</loc>
<lastmod>2013-01-03T20:00:00Z</lastmod>
<rs:md change="updated"/>
<rs:ln rel="via"
modified="2013-01-03T07:00:00Z"
href="http://original.example.com/res1"/>
</url>
</urlset>

Linking #7 – Republishing Resources #3
• Aggregator 2 ditto
• Caution when republishing links, need to make sure they are still
appropriate from an aggregator’s perspective
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
xmlns:rs="http://www.openarchives.org/rs/terms/">
<rs:md capability="changelist"
from="2013-01-03T12:00:00Z"/>
<url>
<loc>http://aggregator2.example.com/res1</loc>
<lastmod>2013-01-04T09:00:00Z</lastmod>
<rs:md change="updated"/>
<rs:ln rel="via"
modified="2013-01-03T07:00:00Z"
href="http://original.example.com/res1"/>
</url>
</urlset>

ResourceSync - Agenda
4. Framework (Technical) Details
1. Sitemaps

2. Core synchronization capabilities (PULL)
3. Discovery
4. Linking to related resources

5. Notification Capabilities (PUSH)
6. Archival capabilities (ARCHIVES)
http://www.openarchives.org/rs/notification

Motivation for Notifications
• Reduce synchronization latency by having the Source push out
resource change information
o To avoid continuous pull of Change Lists by Destinations
• Share information about changes to the Source’s
ResourceSync implementation, e.g. announcement of new
Resource List, new Capability List, etc.
o To avoid continuous polling of e.g. Resource Lists,
ResourceSync Description

Source: Notifications Capabilities
1. Change Notification
• Notifies about changes to particular resources
• e.g., resource A has been updated | created | deleted
2. Framework Notification
• Notifies about changes to capabilities i.e., their documents
• e.g., a Change List has been updated | created | deleted
• Also for Capability Lists and Source Description

Notifications Channels
• Notification sent via channels
o Resource Notification: one channel per set of resources
o Framework Notification: one channel per set of resources
o Sent on level of capability document, not on index-level
o Notifications about changes to Source Description sent on all
Framework Notification channels
• Payload for notifications: <urlset> documents
• Transport protocol for notifications:
o PubSubHubbub - https://pubsubhubbub.googlecode.com/git/pubsubhubbub-core-0.4.html - current choice
o WebSockets - http://tools.ietf.org/html/rfc6455 - may be added later

Framework Notification Structure

Framework Notification Structure

Change Notification Payload
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
xmlns:rs="http://www.openarchives.org/rs/terms/">
<url>
<loc>http://example.com/res1</loc>
<lastmod>2013-01-02T09:07:00Z</lastmod>
<rs:md change="updated"
hash="md5:1584abdf8ebdc9802ac0c6a7402c03b6"
length="8876"
type="text/html"/>
</url>
<url>
…
</url>
</urlset>
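
After fetching a resource named in such a payload, a Destination can check the download against the advertised hash and length. The Python sketch below assumes the hash attribute holds one or more space-separated "algorithm:value" items and only verifies md5; the function name is illustrative:

```python
import hashlib


def matches_metadata(content, hash_attr, length_attr):
    """Check downloaded bytes against the rs:md length and md5 hash, if given."""
    if length_attr is not None and len(content) != int(length_attr):
        return False
    for item in (hash_attr or "").split():
        algorithm, _, expected = item.partition(":")
        if algorithm.lower() == "md5":
            return hashlib.md5(content).hexdigest() == expected.lower()
    # No md5 value was advertised, so only the length could be checked.
    return True


body = b"<html>example</html>"
advertised_hash = "md5:" + hashlib.md5(body).hexdigest()
is_intact = matches_metadata(body, advertised_hash, str(len(body)))
```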

Framework Notification Payload
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
xmlns:rs="http://www.openarchives.org/rs/terms/">
<url>
<loc>http://example.com/resourceset1/resourcelist.xml</loc>
<rs:md change="created"
capability="resourcelist"/>
</url>
<url>
<loc>http://example.com/resourceset1/resourcedump.xml</loc>
<rs:md change="created"
capability="resourcedump"/>
</url>
</urlset>

Framework Notification Payload (w/ index)
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
xmlns:rs="http://www.openarchives.org/rs/terms/">
<url>
<loc>http://example.com/resourceset1/resourcelist.xml</loc>
<rs:md change="created"
capability="resourcelist"/>
<rs:ln rel="index"
href="http://example.com/dataset1/resourcelist-index.xml"/>
</url>
</urlset>

Framework Notification Discovery

ResourceSync - Agenda
4. Framework (Technical) Details
1. Sitemaps

2. Core synchronization capabilities (PULL)
3. Discovery
4. Linking to related resources

5. Notification Capabilities (PUSH)
6. Archival capabilities (ARCHIVES)
http://www.openarchives.org/rs/archives

Source: Archival Capabilities
The Source may hold on to historical data, for example, to allow
Destinations to catch up with events they missed or revisit prior
resource states. To this end, the Source can publish archives, i.e.
documents that enumerate historical capability documents
1. Resource List Archive
2. Resource Dump Archive
3. Change List Archive
4. Change Dump Archive

Resource List Archive

http://www.openarchives.org/rs/archives#ResourceListArch

Resource List Archive
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
xmlns:rs="http://www.openarchives.org/rs/terms/">
<rs:md capability="resourcelist-archive"
at="2013-01-09T13:00:00Z"/>
<url>
<loc>http://example.com/resourcelist1.xml</loc>
<lastmod>2013-01-02T13:00:00Z</lastmod>
</url>
<url>
<loc>http://example.com/resourcelist2.xml</loc>
<lastmod>2013-01-09T13:00:00Z</lastmod>
</url>
<url>
…
</url>
</urlset>

Resource Dump Archive

http://www.openarchives.org/rs/archives#ResourceDumpArch

Resource Dump Archive
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
xmlns:rs="http://www.openarchives.org/rs/terms/">
<rs:md capability="resourcedump-archive"
at="2013-02-10T03:00:00Z"/>
<url>
<loc>http://example.com/resourcedump1.xml</loc>
<lastmod>2013-01-10T03:00:00Z</lastmod>
</url>
<url>
<loc>http://example.com/resourcedump2.xml</loc>
<lastmod>2013-02-10T03:00:00Z</lastmod>
</url>
<url>
…
</url>
</urlset>

Change List Archive

http://www.openarchives.org/rs/archives#ChangeListArch

Change List Archive
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
xmlns:rs="http://www.openarchives.org/rs/terms/">
<rs:md capability="changelist-archive"
from="2013-02-01T23:00:00Z"
until="2013-02-03T23:00:00Z"/>
<url>
<loc>http://example.com/changelist1.xml</loc>
<lastmod>2013-02-01T23:00:00Z</lastmod>
</url>
<url>
<loc>http://example.com/changelist2.xml</loc>
<lastmod>2013-02-02T23:00:00Z</lastmod>
</url>
<url>
<loc>http://example.com/changelist3.xml</loc>
<lastmod>2013-02-03T23:00:00Z</lastmod>
</url>
</urlset>
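
A Destination that has been offline could use such an archive to decide which Change Lists it still has to process. The Python sketch below assumes, purely for illustration, that each entry's lastmod marks the end of the period that Change List covers and that the Destination remembers the time of its last successful synchronization; neither assumption comes from the slides:

```python
from datetime import datetime, timezone


def parse_utc(timestamp):
    """Parse timestamps of the form used in the examples, e.g. 2013-02-01T23:00:00Z."""
    return datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%SZ").replace(
        tzinfo=timezone.utc
    )


def lists_to_process(archive_entries, last_sync):
    """Return Change List URIs modified after last_sync, oldest first."""
    cutoff = parse_utc(last_sync)
    pending = [
        (parse_utc(lastmod), loc)
        for loc, lastmod in archive_entries
        if parse_utc(lastmod) > cutoff
    ]
    return [loc for _, loc in sorted(pending)]


# (loc, lastmod) pairs taken from the archive example above.
archive = [
    ("http://example.com/changelist1.xml", "2013-02-01T23:00:00Z"),
    ("http://example.com/changelist2.xml", "2013-02-02T23:00:00Z"),
    ("http://example.com/changelist3.xml", "2013-02-03T23:00:00Z"),
]

todo = lists_to_process(archive, last_sync="2013-02-02T00:00:00Z")
```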

Change Dump Archive

http://www.openarchives.org/rs/archives#ChangeDumpArch

Change Dump Archive
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
xmlns:rs="http://www.openarchives.org/rs/terms/">
<rs:md capability="changedump-archive"
from="2013-02-10T03:00:00Z"
until="2013-02-17T03:00:00Z"/>
<url>
<loc>http://example.com/changedump1.xml</loc>
<lastmod>2013-02-10T03:00:00Z</lastmod>
</url>
<url>
<loc>http://example.com/changedump2.xml</loc>
<lastmod>2013-02-17T03:00:00Z</lastmod>
</url>
<url>
…
</url>
</urlset>

Capability List for Archives
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
xmlns:rs="http://www.openarchives.org/rs/terms/">
<rs:md capability="capabilitylist"/>
<url>
<loc>http://example.com/dataset1/resourcelist.xml</loc>
<rs:md capability="resourcelist"/>
</url>
…
<url>
<loc>http://example.com/dataset1/resourcelist-archive.xml</loc>
<rs:md capability="resourcelist-archive"/>
</url>
<url>
<loc>http://example.com/dataset1/changelist-archive.xml</loc>
<rs:md capability="changelist-archive"/>
</url>
</urlset>

ResourceSync Framework with Archives

ResourceSync - Agenda

5. Implementation

Implementation #1:
The Metadata Harvesting Use Case

The Metadata Harvesting Use Case
1. Identification of metadata records within a service
2. Use of standards in metadata formats
3. Incremental updates
4. Create, Update, Delete
5. Sets

The Metadata Harvesting Use Case
1. Identification of metadata records within a service
ResourceSync does not specifically care about metadata records, only
resources. It is up to the server to identify which of those resources
are metadata.

2. Use of standards in metadata formats
We are free to annotate a resource's entry with appropriate metadata
to indicate the format.

The Metadata Harvesting Use Case
3. Incremental updates
ResourceSync publishes changes as static documents. The client is
then free to walk up and down the change lists provided by the server.

4. Create, Update, Delete
All resources that can be obtained from a change list will be annotated
with the kind of change that happened to them.

5. Sets
ResourceSync allows the server to publish lists of resources and
changes, and indexes of those lists, all annotated with metadata.
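
To make items 3 and 4 concrete: a harvesting client can replay a Change List against its local copy, acting on the change annotation of each entry. A minimal Python sketch, assuming the Change List has been parsed into (URI, change) pairs and that `fetch` is a caller-supplied download function; both names are hypothetical:

```python
def apply_changes(local, changes, fetch):
    """Apply created/updated/deleted entries, in order, to a dict of local copies."""
    for uri, change in changes:
        if change in ("created", "updated"):
            local[uri] = fetch(uri)  # download the current representation
        elif change == "deleted":
            local.pop(uri, None)  # ignore resources we never had
        else:
            raise ValueError("unknown change type: %r" % (change,))
    return local


# Stand-in for an HTTP download, used only to keep the sketch self-contained.
def fake_fetch(uri):
    return b"content of " + uri.encode("ascii")


store = {"http://example.com/res1": b"old", "http://example.com/res3": b"old"}
change_list = [
    ("http://example.com/res1", "updated"),
    ("http://example.com/res2", "created"),
    ("http://example.com/res3", "deleted"),
]
apply_changes(store, change_list, fake_fetch)
```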

(Required) Documents for
metadata harvesting use case

Describing Metadata Resources
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
xmlns:rs="http://www.openarchives.org/rs/terms/">
<rs:md capability="resourcelist"
from="2013-05-05T13:00:00Z"/>
<url>
<loc>http://mydspace.edu/dspace-rs/resource/123456789/7/qdc</loc>
<lastmod>2013-05-01T19:09:35Z</lastmod>
<changefreq>never</changefreq>
<rs:md type="application/xml"/>
<rs:ln href="http://mydspace.edu/bitstream/123456789/7/1/bitstream.pdf"
rel="describes"/>
<rs:ln href="http://mydspace.edu/bitstream/123456789/7/2/image.jpg"
rel="describes"/>
<rs:ln href="http://mydspace.edu/123456789/3"
rel="collection"/>
</url>
</urlset>

Describing Bitstream Resources
<urlset
…
<url>
<loc>http://mydspace.edu/bitstream/123456789/7/1/bitstream.pdf</loc>
<lastmod>2013-05-01T19:09:35Z</lastmod>
<changefreq>never</changefreq>
<rs:md hash="md5:75d0ea94097a05fce9aca5b079e2f209"
length="419805"
type="application/pdf"/>
<rs:ln href="http://mydspace.edu/dspace-rs/resource/123456789/7/qdc"
rel="describedby"/>
<rs:ln href="http://mydspace.edu/dspace-rs/resource/123456789/7/mets"
rel="describedby"/>
<rs:ln href="http://mydspace.edu/dspace-rs/resource/123456789/12/qdc"
rel="describedby"/>
<rs:ln href="http://mydspace.edu/123456789/2"
rel="collection"/>
</url>
</urlset>

Serving Metadata Resources
http://mydspace.edu/dspace-rs/resource/123456789/7/qdc
ResourceSync webapp

metadata.formats = 
qdc = http://purl.org/dc/terms/, 
mets = http://www.loc.gov/METS/

metadata.types = 
qdc = application/xml, 
mets = application/xml

Item handle

Metadata Format

<loc>http://mydspace.edu/dspace-rs/resource/123456789/7/qdc</loc>
<rs:md type="application/xml"/>
<rs:ln href="http://purl.org/dc/terms/"
rel="describedby"/>

<loc>http://mydspace.edu/dspace-rs/resource/123456789/7/mets</loc>
<rs:md type="application/xml"/>
<rs:ln href="http://www.loc.gov/METS/"
rel="describedby"/>

Generating Documents
1. Initialise
Creates initial Capability List and Resource List documents
[dspace]/bin/dspace dsrun org.dspace.resourcesync.ResourceSyncGenerator -i

2. Update
Creates a new Change List which covers the period since the last Change List
was created
[dspace]/bin/dspace dsrun org.dspace.resourcesync.ResourceSyncGenerator -u

3. Rebase
A combination of both Initialise and Update.
[dspace]/bin/dspace dsrun org.dspace.resourcesync.ResourceSyncGenerator -r

Usage of Resources by clients

Impact on DSpace

URLs
• Stable identifiers for archived items
• Stable identifiers for unarchived items
• Stable identifiers for metadata resources (in their various formats)
• Stable identifiers for previous versions ?

Provenance
• History of changes to an item/bitstream
• Item/bitstream deletions (vs withdraw)
• Bitstream create/update dates
• Item create/update dates

Versioning
• Access to previous versions of both metadata and bitstreams ?
• Stable identifiers for previous versions of both metadata and
bitstreams ?

Metadata Resources
• Metadata in a variety of formats
• Metadata as file/bitstream

Admin Files
• ResourceSync documents (Resource Lists, Change Lists, etc.)
• ResourceSync exports (Resource Dumps, Change Dumps)
• Metadata exports in a number of formats

Scheduled Tasks
• Regular generation of RS documents

Complex Objects
• Item/bitstream relationships
• Collections of content

Get the software!
DSpace module:
https://github.com/CottageLabs/DSpaceResourceSync
depends on the common Java library:
https://github.com/CottageLabs/ResourceSyncJava
PHP client:
https://github.com/stuartlewis/resync-php
depends on the SWORDv2 client library:
https://github.com/swordapp/swordappv2-php-library/

Implementation #2:
ResourceSync at arXiv.org

ResourceSync @ arXiv
• Use ResourceSync for both mirroring and public data access
o efficient updates
o ability to do periodic audits
o public synchronization capability
o reduce admin burden
• Likely start with metadata + source for mirroring use case (doing
experiments now)
• Open access use cases also require the processed PDF
• Some concerns about likely use/load…

Alternate download location
• Likely want to separate machine accesses from human accesses to
preserve response time on main server
=> Use Mirrored Content part of spec
o <loc> specifies canonical URI
- e.g. http://arxiv.org/pdf/1306.1073v1.pdf
o <rs:ln rel="duplicate"> specifies preferred download location
- e.g. http://export.arxiv.org/pdf/1306.1073v1.pdf

Alternate download location

<url>
<loc>http://arxiv.org/pdf/1306.1073v1.pdf</loc>
<lastmod>2013-06-06T00:57:12Z</lastmod>
<rs:md hash="md5:e08e0c4e4d7b0895120014f0aa09e7c4"
length="287714" type="application/pdf"/>
<rs:ln rel="duplicate"
pri="1"
href="http://export.arxiv.org/pdf/1306.1073v1.pdf"
modified="2013-06-06T02:00:59Z"/>
</url>

Getting a copy of arXiv
It might be as easy as:

(of course, you probably have to wait a while but it is nice to know ResourceSync is
stateless so one can efficiently restart)

Python Library and Client
• Aim to provide library code implementing all ResourceSync
facilities for use in both source and destination implementations
o Designed for Python 2.6 (RHEL6) and 2.7
o Will not work with Python <= 2.5

• Client (resync) supports many destination operations, inspired
by the common Unix rsync program
• Client also supports some operations that might be useful in a
source, such as generation of static Resource Lists, or periodic
Change Lists (used in arXiv experiments)
• Explorer (resync-explorer) intended to allow easy inspection
of a source’s resource sets and capabilities
• Developed since ResourceSync v0.5, updated for v0.9
http://github.com/resync/resync

ResourceSync Source Simulator
• Python code using Tornado server
• Provides random set of resources of different sizes updated at a
particular rate
• Very useful for testing Destination code

http://github.com/resync/simulator

ResourceSync - Agenda

6. Q&A

ResourceSync Tutorial

  • 1. ResourceSync: A Web-Based Resource Synchronization Framework #resourcesync ResourceSync is funded by The Sloan Foundation & JISC ResourceSync Tutorial DANS, January 21 2014, Den Haag, Netherlands 1
  • 2. These slides were presented at the LITA Forum, Louisville, Kentucky, November 10 2013 The most recent version of the slides is available at http://www.slideshare.net/OpenArchivesInitiative/resourcesync-tutorial ResourceSync Tutorial DANS, January 21 2014, Den Haag, Netherlands 2
  • 3. ResourceSync Tutorial History • • • • • • First outing: OAI8, June 2013 Second run: Open Repositories, July 2013 Third run: JCDL, July 2013 Fourth run: TPDL 2013, September 2013 Fifth run: LITA Forum, November 2013 Sixth run: SWIB 2013, November 2013 Presenter Herbert Van de Sompel Los Alamos National Laboratory <hvdsomp@gmail.com> @hvdsomp ResourceSync Tutorial DANS, January 21 2014, Den Haag, Netherlands 3
  • 4. ResourceSync Tutorial Contributors Martin Klein Herbert Van de Sompel Robert Sanderson Los Alamos National Laboratory Los Alamos National Laboratory Los Alamos National Laboratory <martinklein0815@gmail.com> <hvdsomp@gmail.com> <azaroth24@gmail.com> @mart1nkle1n @hvdsomp @azaroth24 Simeon Warner Cornell University <simeon.warner@cornell.edu> @zimeon Michael L. Nelson Old Dominion University <mln@cs.odu.edu> @phonedude_mln Richard Jones Cottage Labs <richard@cottagelabs.com> @cottagelabs ResourceSync Tutorial DANS, January 21 2014, Den Haag, Netherlands 4
  • 5. OAI Herbert Van de Sompel Martin Klein Robert Sanderson (Los Alamos National Laboratory) Simeon Warner (Cornell University) NISO Todd Carpenter Nettie Lagace University of Oxford Graham Klyne Berhard Haslhofer (University of Vienna) Michael L. Nelson (Old Dominion University) Lyrasis Peter Murray Carl Lagoze (University of Michigan) ResourceSync Tutorial DANS, January 21 2014, Den Haag, Netherlands 5
  • 6. ResourceSync Technical Group LOCKSS Ex Libris Inc. Shlomo Sanders David Rosenthal JISC Paul Walk Richard Jones Graham Klyne Stuart Lewis RedHat OCLC Christian Sadilek Library of Congress Jeff Young Kevin Ford ResourceSync Tutorial DANS, January 21 2014, Den Haag, Netherlands 6
  • 7. Timeline, Status of Specification(s) • August 2013 o o Release of ResourceSync framework Core specification - Version 0.9.1 Public draft of ResourceSync Archives specification released • September 2013 o Core specification on its way to become an ANSI standard • November 2013 o Internal draft of ResourceSync Notification specification • January 2014 o Public draft of ResourceSync Notification specification • Mid 2014 o Core specification becomes ANSI/NISO standard ResourceSync Tutorial DANS, January 21 2014, Den Haag, Netherlands 7
  • 8. Pointers
      • Specification
          o http://www.openarchives.org/rs/
          o http://www.openarchives.org/rs/resourcesync
          o http://www.openarchives.org/rs/notification
          o http://www.openarchives.org/rs/archives
      • List for public comment
          o https://groups.google.com/d/forum/resourcesync
      • Client and simulator code
          o http://github.com/resync/resync
          o http://github.com/resync/simulator
  • 9. Papers
      • Klein, M., and Van de Sompel, H. (2013) Extending Sitemaps for ResourceSync. ACM/IEEE JCDL 2013. http://arxiv.org/abs/1305.4890
      • Haslhofer, B., Warner, S., Lagoze, C., Klein, M., Sanderson, R., Nelson, M.L., and Van de Sompel, H. (2013) ResourceSync: Leveraging Sitemaps for Resource Synchronization. WWW 2013 Developer Track. http://arxiv.org/abs/1305.1476
      • Klein, M., Sanderson, R., Van de Sompel, H., Warner, S., Haslhofer, B., Lagoze, C., and Nelson, M.L. (2013) A Technical Framework for Resource Synchronization. D-Lib Magazine. http://dx.doi.org/10.1045/january2013-klein
      • Van de Sompel, H., Sanderson, R., Klein, M., Nelson, M.L., Haslhofer, B., Warner, S., and Lagoze, C. (2012) A Perspective on Resource Synchronization. D-Lib Magazine. http://dx.doi.org/10.1045/september2012-vandesompel
  • 10. ResourceSync - Agenda
      1. ResourceSync: Problem Perspective & Conceptual Approach
      2. Motivation & Use Cases
      3. Framework Walkthrough
      4. Framework (Technical) Details
      5. Implementation
      6. Q&A
  • 11. ResourceSync - Agenda: 1. ResourceSync: Problem Perspective & Conceptual Approach
  • 12. Synchronize What?
      • Web resources
          o things with a URI that can be dereferenced
      • Focus on needs of research communication and cultural heritage organizations
          o but aim for generality
  • 13. Synchronize What?
      • Small websites/repositories (a few resources) to large repositories/datasets/linked data collections (many millions of resources)
  • 14. Synchronize What?
      • Low change frequency (weeks/months) to high change frequency (seconds)
  • 15. Synchronize What?
      • Synchronization latency and accuracy needs may vary
  • 16. Why?
      … because lots of projects and services are doing synchronization but have to resort to ad-hoc, case-by-case approaches!
      • Project team involved with projects that need this
      • Experience with OAI-PMH: widely used in repositories, but
          o XML metadata only
          o Web technology has moved on since 1999
      • Devise a shared solution for data, metadata, linked data?
  • 17. ResourceSync Problem
      • Consideration:
          o Source (server) A has resources that change over time: they get created, modified, deleted
          o Destinations (servers) X, Y, and Z leverage (some) resources of Source A
      • Problem:
          o Destinations want to keep in step with the resource changes at Source A: resource synchronization
      • Goal:
          o Design an approach for resource synchronization aligned with the Web Architecture that has a fair chance of adoption by different communities
          o The approach must scale better than recurrent HTTP HEAD/GET on resources
  • 18. Source: Core Synchronization Capabilities (PULL)
      1. Describing content: publish a list of resources available for synchronization to enable Destinations to perform an initial load or catch up with a Source
      2. Packaging content: bundle resources to enable bulk download by Destinations
      3. Describing changes: publish a list of resource changes to enable Destinations to stay synchronized and decrease latency
      4. Packaging changes: bundle resource changes for bulk download by Destinations
  • 19. Source: Notification Capabilities (PUSH)
      To reduce synchronization latency and to optimize the synchronization process, the Source can support:
      1. Change Notification
          • Notifies about changes to particular resources
          • e.g., resource A has been updated | created | deleted
      2. Framework Notification
          • Notifies about changes to capabilities, i.e., their documents
          • e.g., a Change List has been updated | created | deleted
  • 20. Source: Archival Capabilities (ARCHIVES)
      The Source may hold on to historical data, for example, to allow Destinations to catch up with events they missed or revisit prior resource states. To this end, the Source can publish archives, i.e. documents that enumerate historical capability documents:
      1. Resource List Archive
      2. Resource Dump Archive
      3. Change List Archive
      4. Change Dump Archive
  • 21. Source: Synchronization Features
      1. Discovery of capabilities: support Destinations in discovering all offered capabilities
          o Applies to PULL, PUSH, ARCHIVES capabilities
      2. Linking to related resources: provide links from resources subject to synchronization to related resources
          o Applies to PULL, PUSH capabilities
  • 22. Destination: Synchronization Needs
      1. Baseline synchronization: a Destination must be able to perform an initial load or catch up with a Source
          - avoid out-of-band setup
      2. Incremental synchronization: a Destination must have some way to keep up to date with changes at a Source
          - subject to some latency; minimal: create/update/delete
          - allow catching up after the Destination has been offline
      3. Audit: a Destination should be able to determine whether it is synchronized with a Source
          - regarding coverage and accuracy
  • 23. ResourceSync - Agenda: 2. Motivation & Use Cases
  • 24. Use Cases – The Basics (diagrams a and b)
  • 25. Use Cases – The Basics (diagrams c and d)
  • 26. Use Cases – The not-so-Basics (diagrams e and f)
  • 27. Use Case 1: arXiv Mirroring and Data Sharing
      • Repository of scholarly articles in physics, mathematics, computer science, etc.
      • > 850k articles
      • approx. 1.5 revisions per article on average
      • approx. 75k new articles per year
      • Each article has full-text and a separate metadata record
      • approx. 3.8M resources
  • 28. Use Case 1: arXiv Mirroring and Data Sharing
      • 2,700 updates daily
          o at 8pm EST
          o Currently using a homebrew mirroring solution (running with minor modifications since 1994!)
          o occasional rsync (file system-specific, auth issues)
  • 29. Use Case 1: arXiv Mirroring
      • GOAL: Keep mirror sites synchronized with daily changes
      • WANT:
          o high consistency
          o moderate latency
          o robustness to global network outages (low admin effort)
          o ability to verify sync status in case of questions
  • 30. Use Case 1: arXiv Data Sharing
      • GOAL: Make resources and update information publicly available so that any other service may synchronize at the frequency it needs, e.g.
          o Math Front at UC Davis
          o EprintWeb from IOP in UK
          o Data for bibliometric and scientometric analysis
      • WANT:
          o low admin effort (i.e. standard approach, standard tools)
          o reasonable consistency, latency, efficiency
  • 31. Use Case 2: DBpedia Live Duplication
      • Average of 2 updates per second
      • Low latency desirable => need for a push technology
  • 32. Use Case 2: DBpedia Live Duplication
      • Daily traffic:
          o 99% updates
          o 0.6% deletions
          o 0.03% creations
  • 33. Use Case 2: DBpedia Live Duplication
      • Number of content transfer events in two 8-hour intervals
      • Max. queue size of remote duplication process
  • 34. ResourceSync - Agenda: 3. Framework Walkthrough
  • 35. Source Capability 1: Describing Content
      In order to advertise the resources that a Source wants Destinations to know about, it may describe them:
          o Publish a Resource List, a list of resource URIs and possibly associated metadata
              - Destination GETs the Resource List
              - Destination GETs listed resources by their URI
          o A Resource List describes the state of a set of resources at one point in time (snapshot)
  • 38. Source Capability 2: Packaging Content
      By default, content is transferred in response to a GET issued by a Destination against a URI of a Source's resource. But a Source may support additional mechanisms:
          o Publish a Resource Dump, a document that points to packages of resource representations and necessary metadata
              - Destination GETs the package
              - Destination unpacks the package
              - ZIP format supported
          o A Resource Dump and the packages it points to reflect the state of a set of resources at one point in time (snapshot)
  • 41. Source: Modular Capabilities
  • 42. Source Capability 3: Describing Changes
      In order to achieve lower latency and/or greater efficiency, a Source may communicate about changes to its resources:
          o Publish a Change List, a list of recent change events (created, updated, deleted resource)
              - Destination acts upon change events, e.g. GETs created/updated resources, removes deleted resources
          o A Change List pertains to resources that changed in a temporal interval with a start- and an end-date
              - If a resource changed more than once, it will be listed more than once
  • 47. Source Capability 4: Packaging Changes
      In order to reduce the number of requests needed to obtain resource changes, a Source may provide packaged bitstreams for changed resources:
          o Publish a Change Dump, a document that points to packages containing bitstreams of recently changed resources and necessary metadata
              - Destination GETs the package
              - Destination unpacks the package
              - ZIP format supported
          o A Change Dump and its packages pertain to resources that changed in a temporal interval with a start- and an end-date
              - If a resource changed more than once, it will be included more than once
  • 50. Source: Modular Capabilities
  • 51. Destination: Key Processes
  • 52. ResourceSync - Agenda: 4. Framework (Technical) Details
  • 53. ResourceSync - Agenda: 4. Framework (Technical) Details
      1. Sitemaps
      2. Core synchronization capabilities (PULL)
      3. Discovery
      4. Linking to related resources
      5. Notification capabilities (PUSH)
      6. Archival capabilities (ARCHIVES)
  • 55. So Many Choices (diagram of existing push and pull approaches): DSNotify, OAI-PMH, rsync, Crawl, OAI-ORE, RDFsync, WebDAV Col. Syn., XMPP, Atom, SWORD, Sitemap, SPARQLpush, SDShare, AtomPub, RSS, PubSubHubbub
  • 58. A Framework Based on Sitemaps
      • Modular framework allowing selective deployment
      • Sitemap is the core format throughout the framework
          o Introduce extension elements and attributes in the ResourceSync namespace (rs:) to accommodate synchronization needs
          o Reuse Sitemap format for all capability documents: Resource List, Resource Dump, Change List, Change Dump, as well as for the manifest in Dumps
          o Utilize Sitemap index format where needed/allowed
  • 60. Sitemap Index Format
      <sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
        <sitemap>
          <loc>http://example.com/sitemap1.xml</loc>
          <lastmod>2013-01-02T13:00:00Z</lastmod>
        </sitemap>
        <sitemap>
          <loc>http://example.com/sitemap2.xml</loc>
          <lastmod>2013-01-02T14:00:00Z</lastmod>
        </sitemap>
        …
      </sitemapindex>
  • 61. ResourceSync Sitemap Extensions
      <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
              xmlns:rs="http://www.openarchives.org/rs/terms/">
        <rs:ln …/>
        <rs:md …/>
        <url>
          <loc>http://example.com/res1</loc>
          <lastmod>2013-01-02T13:00:00Z</lastmod>
          <rs:ln …/>
          <rs:md …/>
        </url>
        <url> … </url>
      </urlset>
  • 62. ResourceSync Sitemap Extensions
      <sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
                    xmlns:rs="http://www.openarchives.org/rs/terms/">
        <rs:ln …/>
        <rs:md …/>
        <sitemap>
          <loc>http://example.com/sitemap1.xml</loc>
          <lastmod>2013-01-02T13:00:00Z</lastmod>
          <rs:ln …/>
          <rs:md …/>
        </sitemap>
        …
      </sitemapindex>
  • 63. Resource Metadata Summary
      Element/Attribute | Description | Defined by
      <loc> | Resource URI (identity) | sitemaps
      <lastmod> | Timestamp of last change | sitemaps
      <changefreq> | Expected update frequency | sitemaps
      <rs:md> | | ResourceSync
        change | Change type (Change List & Change Dump Manifest only) | ResourceSync
        encoding | HTTP Content-Encoding header value | RFC2616
        hash | One or more content digests (md5, sha-1, sha-256) | Atom Link Ext.
        length | HTTP Content-Length header value | RFC4287
        path | Path in ZIP package (Dump Manifests only) | ResourceSync
        type | HTTP Content-Type header value | RFC4287
  • 64. Related Resource Metadata Summary
      • Attributes of the <rs:ln> element; cf. resource metadata + pri
      Element/Attribute | Description | Defined by
      <rs:ln> | | ResourceSync
        encoding | HTTP Content-Encoding header value | RFC2616
        hash | One or more content digests (md5, sha-1, sha-256) | Atom Link Ext.
        href | Related resource URI (identity) | RFC4287
        length | HTTP Content-Length header value | RFC4287
        modified | Timestamp of last change (cf. <lastmod>) | Atom Link Ext.
        path | Path in ZIP package (Dump Manifests only) | ResourceSync
        pri | Priority of link | RFC6249
        rel | Relation: IANA registered or URI | RFC4287
        type | HTTP Content-Type header value | RFC4287
  • 65. Link Relation Summary
      Relation | Use in ResourceSync | Defined in
      rel="alternate" | Link from generic to specific URI | HTML 5
      rel="canonical" | Link from specific to generic URI | RFC6596
      rel="collection" | Resource is member of collection | RFC6573
      rel="contents" | Link from dump to manifest | HTML4
      rel="describedby" | Has metadata | Protocol for Web Description Resources (POWDER): Description Resources
      rel="describes" | Is metadata for | The 'describes' Link Relation Type
      rel="duplicate" | Mirror or alternative copy | RFC6249
      rel=".../rs/terms/patch" | A patch: efficient change information | This specification
      rel="memento" | Link to time-specific URI | Memento Internet Draft
      rel="timegate" | Link to timegate | Memento Internet Draft
      rel="via" | Provenance chain, came from | RFC4287
  • 66. ResourceSync Sitemap Validation
      • All ResourceSync capability documents are valid according to the Sitemap XML Schema
          o http://www.sitemaps.org/schemas/sitemap/0.9
      • For a more thorough validation, use the ResourceSync XML Schema
          o http://www.openarchives.org/rs/0.9.1/resourcesync.xsd
  • 67. ResourceSync - Agenda: 4. Framework (Technical) Details
      1. Sitemaps
      2. Core synchronization capabilities (PULL)
      3. Discovery
      4. Linking to related resources
      5. Notification capabilities (PUSH)
      6. Archival capabilities (ARCHIVES)
      http://www.openarchives.org/rs/resourcesync
  • 68. Describing Content: Resource List
      http://www.openarchives.org/rs/resourcesync#DescResources
  • 69. Resource List
      <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
              xmlns:rs="http://www.openarchives.org/rs/terms/">
        <rs:md capability="resourcelist"
               at="2013-01-03T09:00:00Z"
               completed="2013-01-03T09:01:00Z"/>
        <url>
          <loc>http://example.com/res1</loc>
          <lastmod>2013-01-02T13:00:00Z</lastmod>
          <rs:md hash="md5:1584abdf8ebdc9802ac0c6a7402c03b6"
                 length="8876"
                 type="text/html"/>
        </url>
        <url> … </url>
      </urlset>
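A Destination could read a Resource List like the one on this slide with a standard XML parser. The following is a minimal sketch, not part of the tutorial: the function name `parse_resource_list` and the sample constant are hypothetical, and the document is assumed to be already downloaded as a string.

```python
import xml.etree.ElementTree as ET

# Namespaces used by Sitemaps and by the ResourceSync extensions.
NS = {
    "sm": "http://www.sitemaps.org/schemas/sitemap/0.9",
    "rs": "http://www.openarchives.org/rs/terms/",
}

# Sample document modeled on the slide (only one <url> entry).
SAMPLE_RESOURCE_LIST = """
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:rs="http://www.openarchives.org/rs/terms/">
  <rs:md capability="resourcelist"
         at="2013-01-03T09:00:00Z"
         completed="2013-01-03T09:01:00Z"/>
  <url>
    <loc>http://example.com/res1</loc>
    <lastmod>2013-01-02T13:00:00Z</lastmod>
    <rs:md hash="md5:1584abdf8ebdc9802ac0c6a7402c03b6"
           length="8876" type="text/html"/>
  </url>
</urlset>
""".strip()


def parse_resource_list(xml_text):
    """Return (document-level rs:md attributes, list of resource entries)."""
    root = ET.fromstring(xml_text)
    doc_md = root.find("rs:md", NS)
    resources = []
    for url in root.findall("sm:url", NS):
        md = url.find("rs:md", NS)
        resources.append({
            "uri": url.findtext("sm:loc", namespaces=NS),
            "lastmod": url.findtext("sm:lastmod", namespaces=NS),
            "metadata": dict(md.attrib) if md is not None else {},
        })
    return (dict(doc_md.attrib) if doc_md is not None else {}), resources
```

The returned entries carry the URI to GET plus the optional `hash`, `length`, and `type` hints from `<rs:md>`.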
  • 70. Resource List
      • Describes the Source's resources that are subject to synchronization
      • At one point in time (snapshot)
      • Creation can take some time; the duration can be conveyed
      • Typical Destination use: Baseline Synchronization, Audit
      • Each URI typically listed only once
      • Might be expensive to generate
      • Destinations use @at to determine freshness
      • [@at, @completed]: interval of uncertainty
      • Destination issues GETs against URIs to obtain resources
      • Very similar to current Sitemaps
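The Audit use mentioned above can be sketched as a comparison between a Destination's local copies and the `hash` and `length` values listed in a Resource List. This is an illustrative sketch under stated assumptions: the names `matches_metadata` and `audit` are hypothetical, local content is assumed to be held in memory as bytes, and digest algorithms other than md5, sha-1, and sha-256 are simply skipped.

```python
import hashlib

# Map digest labels used in the rs:md hash attribute to hashlib names.
_HASHLIB_NAMES = {"md5": "md5", "sha-1": "sha1", "sha-256": "sha256"}


def matches_metadata(content, md):
    """Check local bytes against the length and hash hints of one entry."""
    if "length" in md and int(md["length"]) != len(content):
        return False
    # The hash attribute may hold several space-separated digests.
    for digest in md.get("hash", "").split():
        algorithm, _, expected = digest.partition(":")
        name = _HASHLIB_NAMES.get(algorithm.lower())
        if name is None:
            continue  # unknown algorithm: skipped in this sketch
        if hashlib.new(name, content).hexdigest() != expected.lower():
            return False
    return True


def audit(local, listed):
    """Compare local {uri: bytes} with listed {uri: rs:md attributes}."""
    report = {"missing": [], "stale": [], "extra": []}
    for uri, md in listed.items():
        if uri not in local:
            report["missing"].append(uri)
        elif not matches_metadata(local[uri], md):
            report["stale"].append(uri)
    report["extra"] = sorted(set(local) - set(listed))
    return report
```

Resources reported as missing or stale would be fetched again; resources reported as extra are no longer listed by the Source.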
  • 71. What if I have a million resources?
      • Current sitemap limit is 50k resources (or a maximum document size of 50MB)
      • Break the complete list of resources into 50k-resource chunks, each in its own Resource List document
      • Create a Resource List Index document to group them:
          o Based on <sitemapindex>
          o May have up to 50k component Resource Lists
          o Extends capacity to 2,500,000,000 resources within current community practices
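The arithmetic behind these limits can be checked with a short sketch. The constant and function names below are hypothetical; only the 50k limits come from the slide.

```python
# Limits quoted on the slide: 50k entries per Sitemap document and
# up to 50k component documents per Sitemap index.
MAX_ENTRIES_PER_LIST = 50_000
MAX_LISTS_PER_INDEX = 50_000


def chunk_uris(uris, chunk_size=MAX_ENTRIES_PER_LIST):
    """Split a list of URIs into chunks, one per Resource List document."""
    return [uris[i:i + chunk_size] for i in range(0, len(uris), chunk_size)]


def lists_needed(resource_count, chunk_size=MAX_ENTRIES_PER_LIST):
    """Number of Resource List documents needed (ceiling division)."""
    return -(-resource_count // chunk_size)


def index_capacity():
    """Maximum number of resources reachable through one index."""
    return MAX_ENTRIES_PER_LIST * MAX_LISTS_PER_INDEX
```

For a million resources this gives 20 Resource Lists grouped by one Resource List Index, well within the 2,500,000,000 capacity.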
  • 72. Resource List Index (resourcelist_index.xml)
      <sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
                    xmlns:rs="http://www.openarchives.org/rs/terms/">
        <rs:md capability="resourcelist"
               at="2013-01-02T09:00:02Z"/>
        <sitemap>
          <loc>http://example.com/resourcelist1.xml</loc>
          <rs:md type="application/xml"/>
        </sitemap>
        <sitemap>
          <loc>http://example.com/resourcelist2.xml</loc>
          <rs:md type="application/xml"/>
        </sitemap>
      </sitemapindex>
  • 73. Resource List (resourcelist1.xml)
      <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
              xmlns:rs="http://www.openarchives.org/rs/terms/">
        <rs:ln rel="index"
               href="http://example.com/resourcelist_index.xml"/>
        <rs:md capability="resourcelist"
               at="2013-01-02T09:00:00Z"/>
        <url>
          <loc>http://example.com/res1</loc>
          <lastmod>2013-01-02T08:07:06Z</lastmod>
          <rs:md hash="md5:1584abdf8ebdc9802ac0c6a7402c03b6"
                 length="8876"
                 type="text/html"/>
        </url>
        ...
      </urlset>
  • 74. Resource List Index
  • 75. Packaging Content: Resource Dump
      http://www.openarchives.org/rs/resourcesync#ResourceDump
  • 76. Resource Dump
      <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
              xmlns:rs="http://www.openarchives.org/rs/terms/">
        <rs:md capability="resourcedump"
               at="2013-01-02T09:00:00Z"/>
        <url>
          <loc>http://example.com/resourcedump_part1.zip</loc>
          <lastmod>2013-01-02T13:00:00Z</lastmod>
          <rs:md length="97553" type="application/zip"/>
          <rs:ln rel="contents"
                 href="http://example.com/resourcedump_manifest-part1.xml"
                 type="application/xml"/>
        </url>
        <url>
          <loc>http://example.com/resourcedump_part2.zip</loc>
          <lastmod>2013-01-02T13:00:00Z</lastmod>
        </url>
      </urlset>
  • 77. Resource Dump Manifest
      <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
              xmlns:rs="http://www.openarchives.org/rs/terms/">
        <rs:md capability="resourcedump-manifest"
               at="2013-01-02T09:00:00Z"/>
        <url>
          <loc>http://example.com/res1</loc>
          <lastmod>2013-01-02T13:00:00Z</lastmod>
          <rs:md type="text/html" path="/resources/res1"/>
        </url>
        <url>
          <loc>http://example.com/res2</loc>
          <lastmod>2013-01-02T13:00:00Z</lastmod>
          <rs:md type="application/pdf" path="/resources/res2"/>
        </url>
      </urlset>
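The manifest's `path` attribute ties each resource URI to a member of the ZIP package. A minimal sketch of how a Destination might use it follows. The function names are hypothetical, the manifest is passed in as a string rather than located inside the package, and the leading "/" of `path` is assumed to be relative to the ZIP root.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

NS = {
    "sm": "http://www.sitemaps.org/schemas/sitemap/0.9",
    "rs": "http://www.openarchives.org/rs/terms/",
}


def manifest_paths(manifest_xml):
    """Map each resource URI in a dump manifest to its path in the ZIP."""
    root = ET.fromstring(manifest_xml)
    mapping = {}
    for url in root.findall("sm:url", NS):
        md = url.find("rs:md", NS)
        if md is not None and "path" in md.attrib:
            mapping[url.findtext("sm:loc", namespaces=NS)] = md.attrib["path"]
    return mapping


def extract_resources(zip_bytes, manifest_xml):
    """Return {uri: bytes} for every manifest entry found in the package."""
    resources = {}
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as package:
        for uri, path in manifest_paths(manifest_xml).items():
            # Assumption: "/resources/res1" names the member "resources/res1".
            resources[uri] = package.read(path.lstrip("/"))
    return resources
```

A single GET of the package plus this lookup replaces one GET per resource, which is the point of the packaging capability.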
  • 78. Resource Dump
      • A Resource Dump points to packages (ZIP files) that contain representations of the Source's resources
      • At one point in time (snapshot)
      • Resource Dump is mandatory, even if there is only one ZIP file
      • ZIP package contains a manifest, listing the contained bitstreams
      • Typical Destination use: Baseline Synchronization, bulk download
      • Each URI typically listed only once
      • Might be expensive to generate
      • Destinations use @at to determine freshness
      • [@at, @completed]: interval of uncertainty
      • GETs against individual URIs from a Resource List achieve the same result (ignoring varying freshness)
  • 79. Describing Changes: Change List
      http://www.openarchives.org/rs/resourcesync#DesChanges
  • 80. Change List
      <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
              xmlns:rs="http://www.openarchives.org/rs/terms/">
        <rs:md capability="changelist"
               from="2013-01-02T09:00:00Z"
               until="2013-01-03T09:00:00Z"/>
        <url>
          <loc>http://example.com/res1</loc>
          <lastmod>2013-01-02T13:00:00Z</lastmod>
          <rs:md change="updated"
                 hash="md5:1584abdf8ebdc9802ac0c6a7402c03b6"
                 length="8876"
                 type="text/html"/>
        </url>
        <url> … </url>
      </urlset>
  • 81. Open Change List
      <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
              xmlns:rs="http://www.openarchives.org/rs/terms/">
        <rs:md capability="changelist"
               from="2013-01-02T09:00:00Z"/>
        <url>
          <loc>http://example.com/res1</loc>
          <lastmod>2013-01-02T13:00:00Z</lastmod>
          <rs:md change="updated"
                 hash="md5:1584abdf8ebdc9802ac0c6a7402c03b6"
                 length="8876"
                 type="text/html"/>
        </url>
      </urlset>
  • 82. Change List
      • A Change List pertains to a Source's resources that changed
      • Changes that occurred during a temporal interval with start- and end-date
      • Typical Destination use: Incremental Synchronization, Audit
      • Changes are listed in chronological order
      • Multiple changes to one resource result in the resource being listed multiple times, once per change
      • Source determines duration of temporal interval
      • Destinations use @from and @until to determine freshness
      • Destinations issue GETs against URIs to obtain changed resources
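Because changes are listed in chronological order, a Destination can replay a Change List entry by entry. The sketch below illustrates that idea only: `parse_changes`, `apply_changes`, and the `fetch` callback are hypothetical names, and the local copy is modeled as a plain dictionary.

```python
import xml.etree.ElementTree as ET

NS = {
    "sm": "http://www.sitemaps.org/schemas/sitemap/0.9",
    "rs": "http://www.openarchives.org/rs/terms/",
}


def parse_changes(xml_text):
    """Return (uri, change type) pairs in document order."""
    root = ET.fromstring(xml_text)
    changes = []
    for url in root.findall("sm:url", NS):
        md = url.find("rs:md", NS)
        change = md.get("change") if md is not None else None
        changes.append((url.findtext("sm:loc", namespaces=NS), change))
    return changes


def apply_changes(local, changes, fetch):
    """Replay changes against local {uri: content}; fetch(uri) does the GET."""
    for uri, change in changes:
        if change in ("created", "updated"):
            local[uri] = fetch(uri)
        elif change == "deleted":
            local.pop(uri, None)
        else:
            raise ValueError("unexpected change type for %s: %r" % (uri, change))
    return local
```

A resource listed several times is simply processed several times; a smarter client could keep only the last event per URI before issuing GETs.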
  • 83. Change List Index (diagram: changelist_index.xml pointing to changelist1.xml)
  • 84. Change List Index (changelist_index.xml)
      <sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
                    xmlns:rs="http://www.openarchives.org/rs/terms/">
        <rs:md capability="changelist"
               from="2013-01-02T09:00:00Z"
               until="2013-01-03T09:00:00Z"/>
        <sitemap>
          <loc>http://example.com/changelist1.xml</loc>
          <lastmod>2013-01-02T11:00:00Z</lastmod>
          <rs:md type="application/xml"/>
        </sitemap>
        <sitemap>
          <loc>http://example.com/changelist2.xml</loc>
          <lastmod>2013-01-02T23:00:00Z</lastmod>
          <rs:md type="application/xml"/>
        </sitemap>
      </sitemapindex>
  • 85. Change List (changelist1.xml)
      <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
              xmlns:rs="http://www.openarchives.org/rs/terms/">
        <rs:ln rel="index"
               href="http://example.com/changelist_index.xml"/>
        <rs:md capability="changelist"
               from="2013-01-02T09:00:00Z"
               until="2013-01-02T21:00:00Z"/>
        <url>
          <loc>http://example.com/res1</loc>
          <lastmod>2013-01-02T13:00:00Z</lastmod>
          <rs:md change="updated"
                 hash="md5:1584abdf8ebdc9802ac0c6a7402c03b6"
                 length="8876"
                 type="text/html"/>
        </url>
      </urlset>
  • 86. Open Change List Index
      <sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
                    xmlns:rs="http://www.openarchives.org/rs/terms/">
        <rs:md capability="changelist"
               from="2013-01-02T09:00:00Z"/>
        <sitemap>
          <loc>http://example.com/changelist1.xml</loc>
          <lastmod>2013-01-02T11:00:00Z</lastmod>
        </sitemap>
        <sitemap>
          <loc>http://example.com/changelist2.xml</loc>
          <lastmod>2013-01-02T23:00:00Z</lastmod>
        </sitemap>
        <sitemap>
          <loc>http://example.com/changelist_open.xml</loc>
        </sitemap>
      </sitemapindex>
  • 87. Change List Index
  • 88. Packaging Changes: Change Dump
      http://www.openarchives.org/rs/resourcesync#PackChanges
  • 89. Capability 4: Change Dump
      <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
              xmlns:rs="http://www.openarchives.org/rs/terms/">
        <rs:md capability="changedump"
               from="2013-01-02T09:00:00Z"
               until="2013-01-03T09:00:00Z"/>
        <url>
          <loc>http://example.com/change_dump_part1.zip</loc>
          <lastmod>2013-01-02T13:00:00Z</lastmod>
          <rs:md length="887" type="application/zip"/>
        </url>
        <url>
          <loc>http://example.com/change_dump_part2.zip</loc>
          <lastmod>2013-01-02T13:00:00Z</lastmod>
          <rs:md length="9767" type="application/zip"/>
        </url>
      </urlset>
  • 90. Change Dump Manifest
      <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
              xmlns:rs="http://www.openarchives.org/rs/terms/">
        <rs:md capability="changedump-manifest"
               from="2013-01-02T09:00:00Z"
               until="2013-01-03T09:00:00Z"/>
        <url>
          <loc>http://example.com/res1</loc>
          <lastmod>2013-01-02T13:00:00Z</lastmod>
          <rs:md change="updated"
                 length="2887"
                 type="text/html"
                 path="/changes/res1"/>
        </url>
        <url> … </url>
      </urlset>
  • 91. Change Dump
      • A Change Dump points at packages (ZIP files) that contain bitstreams of the Source's resources that changed
      • Changes that occurred during a temporal interval with start- and end-date
      • Change Dump is mandatory, even if there is only one ZIP file
      • ZIP package contains a manifest, listing the contained bitstreams
      • Typical Destination use: Incremental Synchronization, bulk download of changes
      • Changes in Change Dump Manifest listed in chronological order
      • Same URI can be listed multiple times
      • Might be expensive to generate
      • Destinations use @from and @until to determine freshness
  • 92. ResourceSync - Agenda: 4. Framework (Technical) Details
      1. Sitemaps
      2. Core synchronization capabilities (PULL)
      3. Discovery
      4. Linking to related resources
      5. Notification capabilities (PUSH)
      6. Archival capabilities (ARCHIVES)
      http://www.openarchives.org/rs/resourcesync#Discovery
  • 93. Discovery of Capabilities
      Requirements:
      • Need to discover capabilities, i.e. Resource List, Resource Dump, Change List, Change Dump, Archives, Notification channels
      • Need to know the type of capability each document represents
      Approach:
      • The Source publishes a Capability List that enumerates the capabilities it supports
      • It does so by pointing at the Resource List, Change List, Resource Dump, etc. using appropriate relation types, e.g. "resourcelist", "changelist", "resourcedump"
      http://www.openarchives.org/rs/resourcesync#CapabilityList
  • 94. Discovery of Capabilities
  • 95. Capability List
      <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
              xmlns:rs="http://www.openarchives.org/rs/terms/">
        <rs:md capability="capabilitylist"/>
        <url>
          <loc>http://example.com/dataset1/resourcelist.xml</loc>
          <rs:md capability="resourcelist"/>
        </url>
        <url>
          <loc>http://example.com/dataset1/changelist.xml</loc>
          <rs:md capability="changelist"/>
        </url>
        <url>
          <loc>http://example.com/dataset1/resourcedump.xml</loc>
          <rs:md capability="resourcedump"/>
        </url>
      </urlset>
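A Destination can turn a Capability List like this one into a lookup from capability name to document URI. The sketch below is illustrative only; `parse_capability_list` is a hypothetical name, and it assumes at most one document per capability.

```python
import xml.etree.ElementTree as ET

NS = {
    "sm": "http://www.sitemaps.org/schemas/sitemap/0.9",
    "rs": "http://www.openarchives.org/rs/terms/",
}


def parse_capability_list(xml_text):
    """Return {capability name: URI of the capability document}."""
    root = ET.fromstring(xml_text)
    doc_md = root.find("rs:md", NS)
    if doc_md is None or doc_md.get("capability") != "capabilitylist":
        raise ValueError("not a Capability List")
    capabilities = {}
    for url in root.findall("sm:url", NS):
        md = url.find("rs:md", NS)
        if md is not None and md.get("capability"):
            capabilities[md.get("capability")] = url.findtext(
                "sm:loc", namespaces=NS)
    return capabilities
```

A client that only wants incremental updates would look up "changelist" and ignore the rest.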
  • 96. Discovery of Capability Lists
      Requirements:
      • Need to discover a Capability List
      Approaches:
      • Introduce a link in the HTTP Link header of a resource that is subject to synchronization, pointing at the Capability List with the relation type "resourcesync"
      • Introduce a link from an HTML document that is subject to synchronization (<head> section), pointing at the Capability List with the relation type "resourcesync"
      • Link from a Resource List, etc. to the Capability List with the relation type "up"
      Link header on http://example.com/res1.pdf:
        Link: <http://example.com/dataset1/capabilitylist.xml>; rel="resourcesync"
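The Link header approach can be sketched with a small parser that picks out targets whose relation type is "resourcesync". This is a simplified illustration, not a complete Link header parser: the function name is hypothetical, and the regular expressions assume parameters of the form name="value" or name=value.

```python
import re

# One link-value: <target> followed by zero or more ;name=value parameters.
_LINK_RE = re.compile(
    r'<([^>]*)>((?:\s*;\s*[\w-]+\s*=\s*(?:"[^"]*"|[^\s;,]+))*)')
_PARAM_RE = re.compile(r';\s*([\w-]+)\s*=\s*(?:"([^"]*)"|([^\s;,]+))')


def find_capability_lists(link_header):
    """Return link targets whose rel includes "resourcesync"."""
    targets = []
    for target, params in _LINK_RE.findall(link_header):
        for name, quoted, bare in _PARAM_RE.findall(params):
            # rel may hold several space-separated relation types.
            if name.lower() == "rel" and "resourcesync" in (quoted or bare).split():
                targets.append(target)
    return targets
```

A Destination would take the header value from the response to a GET or HEAD on the resource and then fetch the returned Capability List.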
  • 97. Discovery of Capabilities
  • 98. Discovery: Source Description
      Requirements:
      • Support for multiple Capability Lists, one per "set of resources"
      • Need to discover these Capability Lists
      • Need descriptive information about each set of resources that a Capability List pertains to
      • Useful to have descriptive information about the Source itself
      Approach:
      • The Source Description document meets these requirements
      • It should be at a particular location to avoid having registries: http://(hostname)/.well-known/resourcesync
      • It can be linked to from the Capability Lists as well
      http://www.openarchives.org/rs/resourcesync#SourceDesc
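Given any resource URI at a Source, the well-known location of the Source Description can be derived from the scheme and host alone. A minimal sketch, with a hypothetical function name:

```python
from urllib.parse import urlsplit, urlunsplit

WELL_KNOWN_PATH = "/.well-known/resourcesync"


def source_description_uri(resource_uri):
    """Build the well-known Source Description URI for a resource's host."""
    parts = urlsplit(resource_uri)
    # Keep scheme and host (with port, if any); drop path, query, fragment.
    return urlunsplit((parts.scheme, parts.netloc, WELL_KNOWN_PATH, "", ""))
```

For example, a resource at http://example.com/dataset1/res1.pdf leads to http://example.com/.well-known/resourcesync, with no registry lookup needed.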
  • 99. Discovery of Capabilities
  • 100. Discovery of Capabilities
  • 101. Source Description
      <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
              xmlns:rs="http://www.openarchives.org/rs/terms/">
        <rs:md capability="description"/>
        <rs:ln rel="describedby"
               href="http://example.com/info_about_source.xml"/>
        <url>
          <loc>http://example.com/dataset1/capabilitylist.xml</loc>
          <rs:md capability="capabilitylist"/>
          <rs:ln rel="describedby"
                 href="http://example.com/dataset1/info_about_dataset1.xml"/>
        </url>
        <url>
          <loc>http://example.com/dataset2/capabilitylist.xml</loc>
          <rs:md capability="capabilitylist"/>
          <rs:ln rel="describedby"
                 href="http://example.com/dataset2/info_about_dataset2.xml"/>
        </url>
      </urlset>
  • 102. Discovery via robots.txt • Resource Lists are (enhanced) Sitemaps • Sitemaps can be discovered via robots.txt • Ergo, Resource Lists should be discoverable via robots.txt User-agent: * Disallow: /cgi-bin/ Disallow: /tmp/ Sitemap: http://example.com/dataset1/resourcelist.xml
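Extracting the Sitemap entries from a robots.txt body takes only a few lines. The sketch below is an illustration under simple assumptions: the helper name `sitemap_uris` is hypothetical, and the field name is matched case-insensitively.

```python
def sitemap_uris(robots_txt):
    """Return the URIs given in 'Sitemap:' lines of a robots.txt body."""
    uris = []
    for line in robots_txt.splitlines():
        # Split only at the first colon so the URI's own colons survive.
        field, sep, value = line.partition(":")
        if sep and field.strip().lower() == "sitemap":
            uris.append(value.strip())
    return uris


robots = """User-agent: *
Disallow: /cgi-bin/
Disallow: /tmp/
Sitemap: http://example.com/dataset1/resourcelist.xml
"""
print(sitemap_uris(robots))
```

Each returned URI could then be fetched and checked for an `<rs:md capability="resourcelist"/>` element to tell a Resource List from a plain Sitemap.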
  • 103. Discovery of Capabilities http://www.openarchives.org/rs/resourcesync#Discovery
  • 105. e.g., Capability List <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:rs="http://www.openarchives.org/rs/terms/"> <rs:md capability="capabilitylist"/> <rs:ln rel="up" href="http://example.com/.well-known/resourcesync"/> <url> <loc>http://example.com/dataset1/resourcelist.xml</loc> <rs:md capability="resourcelist"/> </url> <url> <loc>http://example.com/dataset1/changelist.xml</loc> <rs:md capability="changelist"/> </url> <url> <loc>http://example.com/dataset1/resourcedump.xml</loc> <rs:md capability="resourcedump"/> </url> </urlset>
  • 107. Framework Structure
  • 108. ResourceSync - Agenda 4. Framework (Technical) Details 4. Linking to related resources http://www.openarchives.org/rs/resourcesync#LinkRelRes
  • 109. Supported Linking Use Cases Provide links to related resources to address specific resource synchronization needs. 1. Mirrored content with multiple download locations 2. Alternate representations of the same content 3. Patching content rather than replacing it 4. Resources and metadata about resources 5. Prior versions of resources 6. Collection membership of resources 7. Republishing synchronized resources All cases are handled with a <rs:ln> element referring to the linked resource
  • 110. Notes about Linked Resources Some important things to keep in mind about linked resources: • They may also be subject to synchronization • They may be updated on a very different schedule than the resources that link to them • Therefore, it is recommended to convey metadata about the linked resource too • Links can be bi-directional – the linked resource can link back to the linking resource
  • 111. Linking #1 - Mirror 1. Content with multiple download locations This may be of interest for: • Content distribution networks • Mirror sites • Backup locations • Load balancing http://www.openarchives.org/rs/0.9.1/resourcesync#MirCon
  • 112. Linking #1 - Mirror <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:rs="http://www.openarchives.org/rs/terms/"> <rs:md capability="changelist" from="2013-01-02T09:00:00Z" until="2013-01-03T09:00:00Z"/> <url> <loc>http://example.com/res1</loc> <lastmod>2013-01-02T13:00:00Z</lastmod> <rs:md change="updated"/> <rs:ln rel="duplicate" pri="1" href="http://mirror1.example.com/res1"/> <rs:ln rel="duplicate" pri="2" href="http://mirror2.example.com/res1"/> </url> </urlset>
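One way a Destination might act on the `duplicate` links is to try mirrors in priority order and fall back to the `<loc>` URI. The sketch below assumes that a lower `pri` value means a higher priority and that mirrors are preferred over the original location; both are assumptions to check against the specification, and the name `download_candidates` is hypothetical.

```python
import xml.etree.ElementTree as ET

NS = {
    "sm": "http://www.sitemaps.org/schemas/sitemap/0.9",
    "rs": "http://www.openarchives.org/rs/terms/",
}

ENTRY = """<url xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
     xmlns:rs="http://www.openarchives.org/rs/terms/">
  <loc>http://example.com/res1</loc>
  <rs:ln rel="duplicate" pri="2" href="http://mirror2.example.com/res1"/>
  <rs:ln rel="duplicate" pri="1" href="http://mirror1.example.com/res1"/>
</url>"""


def download_candidates(url_element):
    """Return duplicate hrefs ordered by ascending pri, then the <loc> URI."""
    duplicates = [ln for ln in url_element.findall("rs:ln", NS)
                  if ln.get("rel") == "duplicate"]
    # Assumption: lower pri means higher priority; links without pri go last.
    duplicates.sort(key=lambda ln: int(ln.get("pri", "999999")))
    candidates = [ln.get("href") for ln in duplicates]
    candidates.append(url_element.findtext("sm:loc", namespaces=NS))
    return candidates


print(download_candidates(ET.fromstring(ENTRY)))
```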
  • 113. Linking #2 – Alternate Representations 2. Alternate representations of the same content This may be of interest for: • Resources subject to HTTP content negotiation • Format migration for preservation reasons • Different clients wanting different formats • Multiple languages of the content http://www.openarchives.org/rs/resourcesync#AltRep
  • 114. Linking #2 – Alternate Representations <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:rs="http://www.openarchives.org/rs/terms/"> <rs:md capability="changelist" from="2013-01-02T09:00:00Z" until="2013-01-03T09:00:00Z"/> <url> <loc>http://example.com/res1</loc> <lastmod>2013-01-02T13:00:00Z</lastmod> <rs:md change="updated"/> <rs:ln rel="alternate" type="text/html" href="http://example.com/res1.html"/> <rs:ln rel="alternate" type="application/pdf" href="http://example.com/res1.pdf"/> </url> </urlset>
  • 115. Linking #2 – Alternate Representations <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:rs="http://www.openarchives.org/rs/terms/"> <rs:md capability="changelist" from="2013-01-02T09:00:00Z" until="2013-01-03T09:00:00Z"/> <url> <loc>http://example.com/res1.html</loc> <lastmod>2013-01-02T13:00:00Z</lastmod> <rs:md change="updated"/> <rs:ln rel="canonical" href="http://example.com/res1"/> </url> </urlset>
  • 116. Linking #3 – Patching Content 3. Patching content rather than replacing it This may be of interest when: • Resources are very large and server wishes to conserve bandwidth where possible • Changes are frequent and small • Changes are managed in a CMS that tracks differences Need: • Machine processable format to describe a change in a manner that allows patching a representation • Existing or newly defined by communities http://www.openarchives.org/rs/resourcesync#PatchCon
  • 117. Linking #3 – Patching Content <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:rs="http://www.openarchives.org/rs/terms/"> <rs:md capability="changelist" from="2013-01-02T09:00:00Z" until="2013-01-03T09:00:00Z"/> <url> <loc>http://example.com/res1.json</loc> <lastmod>2013-01-02T13:00:00Z</lastmod> <rs:md change="updated" length="398723"/> <rs:ln rel="http://www.openarchives.org/rs/terms/patch" type="application/json-patch" modified="2013-01-02T17:00:00Z" length="58" href="http://example.com/res1-patch.json"/> </url> </urlset>
  • 118. Linking #4 – Metadata about Resources 4. Resources and metadata about resources This may be of interest when: • Resources have associated descriptive metadata records, which are useful for understanding the resource • Such as cultural heritage images, audio, video • Resources that have associated technical, administrative, rights metadata http://www.openarchives.org/rs/resourcesync#ResMDLinking
  • 119. Linking #4 – Metadata about Resources <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:rs="http://www.openarchives.org/rs/terms/"> <rs:md capability="changelist" from="2013-01-02T09:00:00Z" until="2013-01-03T09:00:00Z"/> <url> <loc>http://example.com/res1</loc> <lastmod>2013-01-02T13:00:00Z</lastmod> <rs:md change="updated"/> <rs:ln rel="describedby" type="application/xml" href="http://example.com/metadata/res1.xml"/> </url> </urlset>
  • 120. Linking #4 – Metadata about Resources <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:rs="http://www.openarchives.org/rs/terms/"> <rs:md capability="changelist" from="2013-01-02T09:00:00Z" until="2013-01-03T09:00:00Z"/> <url> <loc>http://example.com/metadata/res1.xml</loc> <lastmod>2013-01-02T13:00:00Z</lastmod> <rs:md change="updated"/> <rs:ln rel="describes" type="text/html" href="http://example.com/res1"/> </url> </urlset>
  • 121. Linking #5 – Prior Versions of Resources This may be of interest when: • A Destination needs to have a copy of all versions of a resource http://www.openarchives.org/rs/resourcesync#ResVers
  • 123. URI for Original, URI for Version Web Archive URI-M - http://web.archive.org/web/20010911203610/http://www.cnn.com/ URI-R - http://www.cnn.com/
  • 124. URI for Original, URI for Version CMS URI-M - http://en.wikipedia.org/w/index.php?title=September_11_attacks&oldid=282333 URI-R - http://en.wikipedia.org/wiki/September_11_attacks
  • 131. Memento Time Travel extension for Chrome Download extension at http://bit.ly/memento-for-chrome
  • 132. Linking #5 – Prior Versions of Resources <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:rs="http://www.openarchives.org/rs/terms/"> <rs:md capability="changelist" from="2013-01-02T09:00:00Z" until="2013-01-03T09:00:00Z"/> <url> <loc>http://example.com/res1</loc> <lastmod>2013-01-02T13:00:00Z</lastmod> <rs:md change="updated"/> <rs:ln rel="memento" href="http://example.com/past/20130102130000/res1"/> <rs:ln rel="timegate" href="http://example.com/timegate/res1"/> <rs:ln rel="timemap" href="http://example.com/timemap/res1" type="application/link-format"/> </url> </urlset>
  • 133. Linking #6 – Collection Membership 6. Collection membership of resources This may be of interest when: • Resources are part of OAI-ORE aggregations • Resources are part of OAI-PMH sets • To indicate any other type of collections of resources Collections are named with URIs and can then be linked to with rel=“collection” • Nice if the collection URI resolves to a useful description http://www.openarchives.org/rs/resourcesync#ColMem
  • 134. Linking #6 – Collection Membership <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:rs="http://www.openarchives.org/rs/terms/"> <rs:md capability="changelist" from="2013-01-02T09:00:00Z" until="2013-01-03T09:00:00Z"/> <url> <loc>http://example.com/res1</loc> <lastmod>2013-01-02T13:00:00Z</lastmod> <rs:md change="updated"/> <rs:ln rel="collection" href="http://example.com/aggregation/allres"/> </url> </urlset>
  • 135. Linking #7 – Republishing Resources 7. Republishing synchronized resources This may be of interest when: • Aggregator systems harvest resources from Sources and then republish them at new URIs Examples include Blog republishing, content distribution networks, mirrored or combined collections Hypothetical scenario: Lots of little museums with small collections, and a large European/American aggregating digital library system that wants to provide fast, combined access to the content (with permission) http://www.openarchives.org/rs/resourcesync#RePub
  • 136. Linking #7 – Republishing Resources #1 • Original Source publishes information about a changed resource via a Change List <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:rs="http://www.openarchives.org/rs/terms/"> <rs:md capability="changelist" from="2013-01-03T00:00:00Z"/> <url> <loc>http://original.example.com/res1</loc> <lastmod>2013-01-03T07:00:00Z</lastmod> <rs:md change="updated"/> </url> </urlset>
  • 137. Linking #7 – Republishing Resources #2 • Aggregator 1 republishes information about the changed resource with reference to the original Source <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:rs="http://www.openarchives.org/rs/terms/"> <rs:md capability="changelist" from="2013-01-03T11:00:00Z"/> <url> <loc>http://aggregator1.example.com/res1</loc> <lastmod>2013-01-03T20:00:00Z</lastmod> <rs:md change="updated"/> <rs:ln rel="via" modified="2013-01-03T07:00:00Z" href="http://original.example.com/res1"/> </url> </urlset>
  • 138. Linking #7 – Republishing Resources #3 • Aggregator 2 ditto • Caution when republishing links: make sure they are still appropriate from an aggregator’s perspective <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:rs="http://www.openarchives.org/rs/terms/"> <rs:md capability="changelist" from="2013-01-03T12:00:00Z"/> <url> <loc>http://aggregator2.example.com/res1</loc> <lastmod>2013-01-04T09:00:00Z</lastmod> <rs:md change="updated"/> <rs:ln rel="via" modified="2013-01-03T07:00:00Z" href="http://original.example.com/res1"/> </url> </urlset>
  • 139. ResourceSync - Agenda 4. Framework (Technical) Details 1. Sitemaps 2. Core synchronization capabilities (PULL) 3. Discovery 4. Linking to related resources 5. Notification Capabilities (PUSH) 6. Archival capabilities (ARCHIVES) http://www.openarchives.org/rs/notification
  • 140. Motivation for Notifications • Reduce synchronization latency by having the Source push out resource change information • To avoid continuous pull of Change Lists by Destinations • Share information about changes to the Source’s ResourceSync implementation, e.g. announcement of new Resource List, new Capability List, etc. • To avoid continuous polling of e.g. Resource Lists, ResourceSync Description
  • 141. Source: Notifications Capabilities (PUSH) 1. Change Notification • Notifies about changes to particular resources • e.g., resource A has been updated | created | deleted 2. Framework Notification • Notifies about changes to capabilities i.e., their documents • e.g., a Change List has been updated | created | deleted • Also for Capability Lists and Source Description
  • 142. Notifications Channels • Notifications are sent via channels • Change Notification: one channel per set of resources • Framework Notification: one channel per set of resources • Sent on level of capability document, not on index-level • Notifications about changes to Source Description are sent on all Framework Notification channels • Payload for notifications: <urlset> documents • Transport protocol for notifications: • PubSubHubbub (https://pubsubhubbub.googlecode.com/git/pubsubhubbub-core0.4.html): current choice • WebSockets (http://tools.ietf.org/html/rfc6455): may be added later
  • 146. Change Notification Payload <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:rs="http://www.openarchives.org/rs/terms/"> <url> <loc>http://example.com/res1</loc> <lastmod>2013-01-02T09:07:00Z</lastmod> <rs:md change="updated" hash="md5:1584abdf8ebdc9802ac0c6a7402c03b6" length="8876" type="text/html"/> </url> <url> … </url> </urlset>
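The `hash` and `length` attributes let a Destination check a downloaded representation against what the Source announced. A minimal sketch follows; the function name `matches_metadata` is hypothetical, only the `md5:` prefix is handled, and the digest is assumed to be hex-encoded as in the payload above.

```python
import hashlib


def matches_metadata(content, hash_attr=None, length_attr=None):
    """Check downloaded bytes against rs:md 'hash' and 'length' values."""
    if length_attr is not None and len(content) != int(length_attr):
        return False
    if hash_attr:
        # The attribute may carry several space-separated 'algorithm:digest' items.
        for item in hash_attr.split():
            algorithm, _, expected = item.partition(":")
            if algorithm == "md5":
                if hashlib.md5(content).hexdigest() != expected.lower():
                    return False
            # Other algorithms are ignored in this sketch.
    return True


body = b"hello"
announced = "md5:" + hashlib.md5(body).hexdigest()
print(matches_metadata(body, announced, "5"))
```

A mismatch would typically mean the resource changed again after the notification was issued, so the Destination may want to retry later.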
  • 147. Framework Notification Payload <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:rs="http://www.openarchives.org/rs/terms/"> <url> <loc>http://example.com/resourceset1/resourcelist.xml</loc> <rs:md change="created" capability="resourcelist"/> </url> <url> <loc>http://example.com/resourceset1/resourcedump.xml</loc> <rs:md change="created" capability="resourcedump"/> </url> </urlset>
  • 148. Framework Notification Payload (w/ index) <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:rs="http://www.openarchives.org/rs/terms/"> <url> <loc>http://example.com/resourceset1/resourcelist.xml</loc> <rs:md change="created" capability="resourcelist"/> <rs:ln rel="index" href="http://example.com/dataset1/resourcelist-index.xml"/> </url> </urlset>
  • 150. ResourceSync - Agenda 4. Framework (Technical) Details 1. Sitemaps 2. Core synchronization capabilities (PULL) 3. Discovery 4. Linking to related resources 5. Notification Capabilities (PUSH) 6. Archival capabilities (ARCHIVES) http://www.openarchives.org/rs/archives
  • 151. Source: Archival Capabilities (ARCHIVES) The Source may hold on to historical data, for example, to allow Destinations to catch up with events they missed or revisit prior resource states. To this end, the Source can publish archives, i.e. documents that enumerate historical capability documents: 1. Resource List Archive 2. Resource Dump Archive 3. Change List Archive 4. Change Dump Archive
  • 152. Resource List Archive http://www.openarchives.org/rs/archives#ResourceListArch
  • 153. Resource List Archive <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:rs="http://www.openarchives.org/rs/terms/"> <rs:md capability="resourcelist-archive" at="2013-01-09T13:00:00Z"/> <url> <loc>http://example.com/resourcelist1.xml</loc> <lastmod>2013-01-02T13:00:00Z</lastmod> </url> <url> <loc>http://example.com/resourcelist2.xml</loc> <lastmod>2013-01-09T13:00:00Z</lastmod> </url> <url> … </url> </urlset>
  • 154. Resource Dump Archive http://www.openarchives.org/rs/archives#ResourceDumpArch
  • 155. Resource Dump Archive <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:rs="http://www.openarchives.org/rs/terms/"> <rs:md capability="resourcedump-archive" at="2013-02-10T03:00:00Z"/> <url> <loc>http://example.com/resourcedump1.xml</loc> <lastmod>2013-01-10T03:00:00Z</lastmod> </url> <url> <loc>http://example.com/resourcedump2.xml</loc> <lastmod>2013-02-10T03:00:00Z</lastmod> </url> <url> … </url> </urlset>
  • 156. Change List Archive http://www.openarchives.org/rs/archives#ChangeListArch
  • 157. Change List Archive <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:rs="http://www.openarchives.org/rs/terms/"> <rs:md capability="changelist-archive" from="2013-02-01T23:00:00Z" until="2013-02-03T23:00:00Z"/> <url> <loc>http://example.com/changelist1.xml</loc> <lastmod>2013-02-01T23:00:00Z</lastmod> </url> <url> <loc>http://example.com/changelist2.xml</loc> <lastmod>2013-02-02T23:00:00Z</lastmod> </url> <url> <loc>http://example.com/changelist3.xml</loc> <lastmod>2013-02-03T23:00:00Z</lastmod> </url> </urlset>
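A Destination that has been offline can use a Change List Archive to decide which Change Lists it still needs to process. The sketch below uses a conservative rule that is an assumption of this example, not something the slide states: any Change List whose `<lastmod>` is at or after the Destination's last synchronization time may contain unseen changes. The names `change_lists_to_process` and `parse_w3c` are hypothetical.

```python
from datetime import datetime
import xml.etree.ElementTree as ET

NS = {
    "sm": "http://www.sitemaps.org/schemas/sitemap/0.9",
    "rs": "http://www.openarchives.org/rs/terms/",
}

ARCHIVE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:rs="http://www.openarchives.org/rs/terms/">
  <rs:md capability="changelist-archive"
         from="2013-02-01T23:00:00Z" until="2013-02-03T23:00:00Z"/>
  <url><loc>http://example.com/changelist1.xml</loc>
       <lastmod>2013-02-01T23:00:00Z</lastmod></url>
  <url><loc>http://example.com/changelist2.xml</loc>
       <lastmod>2013-02-02T23:00:00Z</lastmod></url>
  <url><loc>http://example.com/changelist3.xml</loc>
       <lastmod>2013-02-03T23:00:00Z</lastmod></url>
</urlset>"""


def parse_w3c(timestamp):
    """Parse the UTC 'Z' timestamps used in the examples."""
    return datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%SZ")


def change_lists_to_process(archive_xml, last_sync):
    """Return Change List URIs modified at or after last_sync, oldest first."""
    root = ET.fromstring(archive_xml)
    pending = []
    for url in root.findall("sm:url", NS):
        lastmod = parse_w3c(url.findtext("sm:lastmod", namespaces=NS))
        if lastmod >= last_sync:
            pending.append((lastmod, url.findtext("sm:loc", namespaces=NS)))
    return [loc for _, loc in sorted(pending)]


print(change_lists_to_process(ARCHIVE, parse_w3c("2013-02-02T12:00:00Z")))
```

Processing the selected Change Lists in chronological order keeps the local copy consistent when the same resource changed more than once.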
  • 158. Change Dump Archive http://www.openarchives.org/rs/archives#ChangeDumpArch
  • 159. Change Dump Archive <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:rs="http://www.openarchives.org/rs/terms/"> <rs:md capability="changedump-archive" from="2013-02-10T03:00:00Z" until="2013-02-17T03:00:00Z"/> <url> <loc>http://example.com/changedump1.xml</loc> <lastmod>2013-02-10T03:00:00Z</lastmod> </url> <url> <loc>http://example.com/changedump2.xml</loc> <lastmod>2013-02-17T03:00:00Z</lastmod> </url> <url> … </url> </urlset>
  • 160. Capability List for Archives <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:rs="http://www.openarchives.org/rs/terms/"> <rs:md capability="capabilitylist"/> <url> <loc>http://example.com/dataset1/resourcelist.xml</loc> <rs:md capability="resourcelist"/> </url> … <url> <loc>http://example.com/dataset1/resourcelist-archive.xml</loc> <rs:md capability="resourcelist-archive"/> </url> <url> <loc>http://example.com/dataset1/changelist-archive.xml</loc> <rs:md capability="changelist-archive"/> </url> </urlset>
  • 161. ResourceSync Framework with Archives
  • 162. ResourceSync - Agenda 5. Implementation
  • 163. Implementation #1: The Metadata Harvesting Use Case
  • 164. The Metadata Harvesting Use Case 1. Identification of metadata records within a service 2. Use of standards in metadata formats 3. Incremental updates 4. Create, Update, Delete 5. Sets
  • 165. The Metadata Harvesting Use Case 1. Identification of metadata records within a service ResourceSync does not specifically care about metadata records, only resources. It is up to the server to identify which of those resources are metadata. 2. Use of standards in metadata formats We are free to annotate a resource's entry with appropriate metadata to indicate the format.
  • 166. The Metadata Harvesting Use Case 3. Incremental updates ResourceSync publishes changes as static documents. The client is then free to walk up and down the change lists provided by the server. 4. Create, Update, Delete All resources that can be obtained from a change list will be annotated with the kind of change that happened to them. 5. Sets ResourceSync allows the server to publish lists of resources and changes and indexes of those lists all annotated with metadata.
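The incremental-update and create/update/delete points can be illustrated with a small sketch of how a harvesting client might replay Change List entries against its local inventory. The data shapes (a dict of URI to lastmod, and tuples of URI, change type and lastmod) and the name `apply_changes` are assumptions made for this example; a real client would also fetch or remove the resources themselves.

```python
def apply_changes(local_state, changes):
    """Replay change entries, in document order, against a local inventory.

    local_state: dict mapping resource URI -> lastmod of the local copy
    changes: iterable of (uri, change, lastmod) tuples, where change is
             one of 'created', 'updated' or 'deleted'
    """
    for uri, change, lastmod in changes:
        if change == "deleted":
            local_state.pop(uri, None)   # already gone locally is fine
        elif change in ("created", "updated"):
            local_state[uri] = lastmod   # record the newer state
        else:
            raise ValueError("unknown change type: %r" % change)
    return local_state


state = {
    "http://example.com/res1": "2013-01-01T00:00:00Z",
    "http://example.com/res2": "2013-01-01T00:00:00Z",
}
changes = [
    ("http://example.com/res1", "updated", "2013-01-02T13:00:00Z"),
    ("http://example.com/res2", "deleted", "2013-01-02T14:00:00Z"),
    ("http://example.com/res3", "created", "2013-01-02T15:00:00Z"),
]
print(apply_changes(state, changes))
```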
  • 167. (Required) Documents for metadata harvesting use case
  • 168. Describing Metadata Resources <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:rs="http://www.openarchives.org/rs/terms/"> <rs:md capability="resourcelist" from="2013-05-05T13:00:00Z"/> <url> <loc>http://mydspace.edu/dspace-rs/resource/123456789/7/qdc</loc> <lastmod>2013-05-01T19:09:35Z</lastmod> <changefreq>never</changefreq> <rs:md type="application/xml"/> <rs:ln href="http://mydspace.edu/bitstream/123456789/7/1/bitstream.pdf" rel="describes"/> <rs:ln href="http://mydspace.edu/bitstream/123456789/7/2/image.jpg" rel="describes"/> <rs:ln href="http://mydspace.edu/123456789/3" rel="collection"/> </url> </urlset>
  • 169. Describing Bitstream Resources <urlset … <url> <loc>http://mydspace.edu/bitstream/123456789/7/1/bitstream.pdf</loc> <lastmod>2013-05-01T19:09:35Z</lastmod> <changefreq>never</changefreq> <rs:md hash="md5:75d0ea94097a05fce9aca5b079e2f209" length="419805" type="application/pdf"/> <rs:ln href="http://mydspace.edu/dspace-rs/resource/123456789/7/qdc" rel="describedby"/> <rs:ln href="http://mydspace.edu/dspace-rs/resource/123456789/7/mets" rel="describedby"/> <rs:ln href="http://mydspace.edu/dspace-rs/resource/123456789/12/qdc" rel="describedby"/> <rs:ln href="http://mydspace.edu/123456789/2" rel="collection"/> </url> </urlset>
  • 170. Serving Metadata Resources http://mydspace.edu/dspace-rs/resource/123456789/7/qdc ResourceSync webapp metadata.formats = qdc = http://purl.org/dc/terms/, mets = http://www.loc.gov/METS/ metadata.types = qdc = application/xml, mets = application/xml Item handle Metadata Format <loc>http://mydspace.edu/dspace-rs/resource/123456789/7/qdc</loc> <rs:md type="application/xml"/> <rs:ln href="http://purl.org/dc/terms/" rel="describedby"/> <loc>http://mydspace.edu/dspace-rs/resource/123456789/7/mets</loc> <rs:md type="application/xml"/> <rs:ln href="http://www.loc.gov/METS/" rel="describedby"/>
  • 171. Generating Documents 1. Initialise Creates initial Capability List and Resource List documents [dspace]/bin/dspace dsrun org.dspace.resourcesync.ResourceSyncGenerator -i 2. Update Creates a new Change List which covers the period since the last Change List was created [dspace]/bin/dspace dsrun org.dspace.resourcesync.ResourceSyncGenerator -u 3. Rebase A combination of both Initialise and Update. [dspace]/bin/dspace dsrun org.dspace.resourcesync.ResourceSyncGenerator -r
  • 172. Usage of Resources by clients
  • 173. Impact on DSpace
  • 174. URLs • Stable identifiers for archived items • Stable identifiers for unarchived items • Stable identifiers for metadata resources (in their various formats) • Stable identifiers for previous versions ? Provenance • History of changes to an item/bitstream • Item/bitstream deletions (vs withdraw) • Bitstream create/update dates • Item create/update dates
  • 175. Versioning • Access of previous versions of both metadata and bitstreams ? • Stable identifiers for previous versions of both metadata and bitstreams ? Metadata Resources • Metadata in a variety of formats • Metadata as file/bitstream
  • 176. Admin Files • ResourceSync documents (Resource Lists, Change Lists, etc) • ResourceSync exports - Resource Dumps, Change Dumps • Metadata exports in a number of formats Scheduled Tasks • Regular generation of RS documents Complex Objects • Item/bitstream relationships • Collections of content
  • 177. Get the software! DSpace module: https://github.com/CottageLabs/DSpaceResourceSync depends on the common Java library: https://github.com/CottageLabs/ResourceSyncJava PHP client: https://github.com/stuartlewis/resync-php depends on the SWORDv2 client library: https://github.com/swordapp/swordappv2-php-library/
  • 178. Implementation #2: ResourceSync at arXiv.org
  • 179. ResourceSync @ arXiv • Use ResourceSync for both mirroring and public data access o efficient updates o ability to do periodic audits o public synchronization capability o reduce admin burden • Likely start with metadata + source for mirroring use case (doing experiments now) • Open access use cases require processed PDF also • Some concerns about likely use/load…
  • 181. Alternate download location • Likely want to separate machine accesses from human accesses to preserve response time on main server => Use Mirrored Content part of spec o <loc> specifies canonical URI - e.g. http://arxiv.org/pdf/1306.1073v1.pdf o <rs:ln rel="duplicate"> specifies preferred download location - e.g. http://export.arxiv.org/pdf/1306.1073v1.pdf
  • 182. Alternate download location <url> <loc>http://arxiv.org/pdf/1306.1073v1.pdf</loc> <lastmod>2013-06-06T00:57:12Z</lastmod> <rs:md hash="md5:e08e0c4e4d7b0895120014f0aa09e7c4" length="287714" type="application/pdf"/> <rs:ln rel="duplicate" pri="1" href="http://export.arxiv.org/pdf/1306.1073v1.pdf" modified="2013-06-06T02:00:59Z"/> </url>
  • 183. Getting a copy of arXiv It might be as easy as: (of course, you probably have to wait a while but it is nice to know ResourceSync is stateless so one can efficiently restart)
  • 184. Python Library and Client • Aim to provide library code implementing all ResourceSync facilities for use in both source and destination implementations o Designed for python 2.6 (RHEL6) and 2.7 o Will not work with python <= 2.5 • Client (resync) supports many destination operations, inspired by the common Unix rsync program • Client also supports some operations that might be useful in a source, such as generation of static Resource Lists, or periodic Change Lists (used in arXiv experiments) • Explorer (resync-explorer) intended to allow easy inspection of a source’s resource sets and capabilities • Developed since ResourceSync v0.5, updated for v0.9 http://github.com/resync/resync
  • 185. ResourceSync Source Simulator • Python code using Tornado server • Provides random set of resources of different sizes updated at a particular rate • Very useful for testing Destination code http://github.com/resync/simulator
  • 186. ResourceSync - Agenda 6. Q&A
  • 190. ResourceSync: A Web-Based Resource Synchronization Framework #resourcesync ResourceSync is funded by The Sloan Foundation & JISC

Editor's Notes

  1. LANL Memento Aggregator of IIPC; Europeana does metadata via OAI-PMH but anticipate content also; arXiv – mirroring and data sharing; Linked data @ BBC; DBpedia, journal data at LANL. REST not about in 1999
  2. XML <-> OAI-PMH; large data begs diff question
  3. protected mostly about existing HTTP auth methods, stats -> just inventory
  4. Switching to a standardized resource-centric framework could
  5. Semantic web version of wikipedia; want mirror to provide reliable basis for local services
  6. Semantic web version of wikipedia; want mirror to provide reliable basis for local services
  7. Semantic web version of wikipedia; want mirror to provide reliable basis for local services
  8. Semantic web version of wikipedia; want mirror to provide reliable basis for local services
  9. Top line – just metadata about resources, destination uses GET to get them (duh). Bottom line – packaged content => fewer round trips
  10. Rsync etc. just reference; push vs pull -> both; many other parts
  11. Rsync etc. just reference; push vs pull -> both; many other parts
  12. Add: rel=“contents”, rel=“archives”
  13. They have in common: versions exist at different URIs. Because only the representation of a single state of a resource is available from a URI.
  14. They have in common: versions exist at different URIs. Because only the representation of a single state of a resource is available from a URI.
  15. Pattern exists in e.g.: Wikipedia, W3C specs, Dryad. Not sure whether DOI in general follows this paradigm.
  16. Now the question is “How do we access those versions?” - Can interlink them. There are RFCs that describe how to do that. - But that URI-R is special. It is what typically is being bookmarked, put in email. Want to leverage the fact that this URI-R is always there. Use it as the entry point.
  17. Memento addresses the problem in a resource-centric way: Resource, URI, state, representation, link, content negotiation
  18. Test site, has subsets of arXiv and even complete source plus metadata (at present not up to date with 0.9)
  19. No way around the difficulty of transferring 1TB initially but then a daily or weekly sync is efficient, and it still works even after some arbitrary time.