NISO Training
http://www.niso.org/workrooms/onixpl-encoding/

ResourceSync: A Web-Based
Resource Synchronization
Framework
December 3, 2013
Speakers:
Bernhard Haslhofer - Postdoc Research Associate
Department of Computer Science, University of Vienna
Simeon Warner - Information Science, Cornell University
ResourceSync:
A Web-Based
Resource Synchronization
Framework

#resourcesync

ResourceSync is funded by
The Sloan Foundation & JISC
ResourceSync Webinar
December 3 2013

2
This is a short version of the complete ResourceSync tutorial,
which is available at
http://www.slideshare.net/OpenArchivesInitiative/resourcesync-tutorial

ResourceSync Tutorial History
• OAI8, June 2013; Open Repositories, July 2013; JCDL, July 2013;
TPDL 2013, September 2013; LITA Forum, November 2013;
SWIB, November 2013; …

Presenters

Simeon Warner
Cornell University

Bernhard Haslhofer
University of Vienna
ResourceSync Tutorial Contributors

Martin Klein
Los Alamos National Laboratory
<martinklein0815@gmail.com>
@mart1nkle1n

Herbert Van de Sompel
Los Alamos National Laboratory
<hvdsomp@gmail.com>
@hvdsomp

Robert Sanderson
Los Alamos National Laboratory
<azaroth24@gmail.com>
@azaroth24

Simeon Warner
Cornell University
<simeon.warner@cornell.edu>
@zimeon

Michael L. Nelson
Old Dominion University
<mln@cs.odu.edu>
@phonedude_mln

Richard Jones
Cottage Labs
<richard@cottagelabs.com>
@cottagelabs
OAI
Herbert Van de Sompel
Martin Klein
Robert Sanderson
(Los Alamos National Laboratory)
Simeon Warner
(Cornell University)

NISO
Todd Carpenter
Nettie Lagace
University of Oxford
Graham Klyne

Bernhard Haslhofer
(University of Vienna)
Michael L. Nelson
(Old Dominion University)

Lyrasis
Peter Murray

Carl Lagoze
(University of Michigan)

ResourceSync Technical Group
LOCKSS: David Rosenthal
Ex Libris Inc.: Shlomo Sanders
JISC: Richard Jones, Paul Walk, Stuart Lewis
RedHat: Christian Sadilek
OCLC: Jeff Young
Library of Congress: Kevin Ford

Timeline, Status of Specification(s)
• August 2013
o Release of ResourceSync framework Core specification
- Version 0.9.1
o Public draft of ResourceSync Archives specification released
• September 2013
o Core specification on its way to become an ANSI standard
• November 2013
o Internal draft of ResourceSync Notification specification
• January 2014
o Public draft of ResourceSync Notification specification
• Mid 2014
o Core specification becomes ANSI/NISO standard

Pointers
• Specification
http://www.openarchives.org/rs/
http://www.openarchives.org/rs/resourcesync
http://www.openarchives.org/rs/archives

• List for public comment
https://groups.google.com/d/forum/resourcesync
• Client and simulator code
http://github.com/resync/resync
http://github.com/resync/simulator

ResourceSync - Agenda
1. ResourceSync: Problem Perspective & Conceptual
Approach

2. Motivation & Use Cases
3. Framework Walkthrough

4. Framework (Technical) Details
5. Implementation
6. Q&A
ResourceSync - Agenda
1. ResourceSync: Problem Perspective & Conceptual
Approach

Synchronize What?

• Web resources
o things with a URI that can be dereferenced
• Focus on needs of research communication and cultural heritage
organizations
o but aim for generality

Synchronize What?
• Small websites/repositories (a few resources) to large
repositories/datasets/linked data collections (many millions of
resources)

Synchronize What?
• Low change frequency (weeks/months) to high change
frequency (seconds)
• Synchronization latency and accuracy needs may vary
Why?
… because lots of projects and services are doing synchronization
but have to resort to ad-hoc, case by case, approaches!
• Project team involved with projects that need this

• Experience with OAI-PMH: widely used in repos but
o XML metadata only
o Attempts at synchronizing actual content via OAI-PMH
(complex object formats, dc:identifier) not successful.
o Web technology has moved on since 1999
• Devise a shared solution for data, metadata, linked data?

ResourceSync Problem
• Consideration:
• Source (server) A has resources that change over time: they
get created, modified, deleted
• Destination (servers) X, Y, and Z leverage (some)
resources of Source A.
• Problem:
• Destinations want to keep in step with the resource changes
at Source A: resource synchronization.
• Goal:
• Design an approach for resource synchronization aligned
with the Web Architecture that has a fair chance of adoption
by different communities.
• The approach must scale better than recurrent HTTP
HEAD/GET on resources.

Source: Core Synchronization Capabilities (PULL)

1. Describing content – publish a list of resources available for
synchronization to enable Destinations to perform an initial load
or catch-up with a Source
2. Packaging content – bundle resources to enable bulk download
by destinations
3. Describing changes – publish a list of resource changes to
enable destinations to stay synchronized and decrease latency
4. Packaging changes – bundle resource changes for bulk
download by destinations

Source: Notification Capabilities (PUSH)
To reduce synchronization latency and to optimize the synchronization
process, the Source can support:

1. Change Notification
• Notifies about changes to particular resources
• e.g., resource A has been updated | created | deleted
2. Framework Notification
• Notifies about changes to capabilities i.e., their documents
• e.g., a Change List has been updated | created | deleted

Source: Synchronization Features
1. Discovery of capabilities – support Destinations in discovering
all offered capabilities
o Applies to PULL and PUSH capabilities
2. Linking to related resources – provide links from resources
subject to synchronization to related resources
o Applies to PULL and PUSH capabilities

Destination: Synchronization Needs
1. Baseline synchronization – A destination must be able to
perform an initial load or catch-up with a source
- avoid out-of-band setup
2. Incremental synchronization – A destination must have some
way to keep up-to-date with changes at a source
- subject to some latency; minimal: create/update/delete
- allow catching up after the destination has been offline
3. Audit – A destination should be able to determine whether it is
synchronized with a source
- regarding coverage and accuracy
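
The three needs above can be illustrated with a small sketch. It assumes, purely for illustration, that each side can be summarized as a mapping from resource URI to content digest. The function names compare_states and in_sync are hypothetical and not part of ResourceSync.

```python
def compare_states(source, destination):
    """Return the changes a Destination must apply to match the Source.

    Both arguments are {uri: digest} dictionaries (an assumed summary format).
    """
    created = sorted(uri for uri in source if uri not in destination)
    updated = sorted(uri for uri in source
                     if uri in destination and source[uri] != destination[uri])
    deleted = sorted(uri for uri in destination if uri not in source)
    return {"created": created, "updated": updated, "deleted": deleted}


def in_sync(source, destination):
    """Audit check: coverage (same URIs) and accuracy (same digests)."""
    return not any(compare_states(source, destination).values())


# Hypothetical example data
source = {"http://example.com/res1": "md5:aaa",
          "http://example.com/res2": "md5:bbb"}
destination = {"http://example.com/res1": "md5:old",
               "http://example.com/res3": "md5:ccc"}
plan = compare_states(source, destination)
```

A baseline synchronization corresponds to calling compare_states with an empty destination, so that every resource is reported as created.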

ResourceSync - Agenda

2. Motivation & Use Cases

Use Case 1: arXiv Mirroring and Data Sharing
• Repository of scholarly articles in
physics, mathematics, computer
science, etc.
• > 850k articles
• approx. 1.5 revisions per article on
average
• approx. 75k new articles per year
• Each article has full-text and separate
metadata record
• approx. 3.8M resources

Use Case 1: arXiv Mirroring and Data Sharing
• 2,700 updates daily
o at 8pm EST
o Currently using homebrew mirroring
solution (running with minor
modifications since 1994!)
o occasional rsync (file-system specific, auth issues)

Use Case 1: arXiv
Mirroring / Data Sharing
• GOAL: Keep mirror sites synchronized with daily
changes
• WANT:
o high consistency
o moderate latency
o robustness to global network outages (low admin effort)
o ability to verify sync status in case of questions
o low admin effort (i.e. standard approach, standard tools)
o reasonable consistency, latency, efficiency

Use Case 2: DBpedia Live Duplication
• Average of 2 updates per second
• Low latency desirable => need for a push technology

ResourceSync - Agenda

3. Framework Walkthrough

Source Capability 1: Describing Content
In order to advertise the resources that a source wants destinations
to know about, it may describe them:
o Publish a Resource List, a list of resource URIs and possibly
associated metadata
- Destination GETs the Resource List
- Destination GETs listed resources by their URI
o A Resource List describes the state of a set of resources at
one point in time (snapshot)
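
On the Source side, publishing a Resource List could look roughly like the sketch below. It uses only the Python standard library. The function name build_resource_list and the input format (pairs of URI and last-modified timestamp) are assumptions made for illustration, not part of the specification or of the resync library.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
RS_NS = "http://www.openarchives.org/rs/terms/"


def build_resource_list(resources, at):
    """Serialize (uri, lastmod) pairs as a Resource List snapshot taken at `at`."""
    # Use the Sitemap namespace as default and "rs" for ResourceSync terms.
    ET.register_namespace("", SITEMAP_NS)
    ET.register_namespace("rs", RS_NS)
    urlset = ET.Element("{%s}urlset" % SITEMAP_NS)
    ET.SubElement(urlset, "{%s}md" % RS_NS,
                  {"capability": "resourcelist", "at": at})
    for uri, lastmod in resources:
        url = ET.SubElement(urlset, "{%s}url" % SITEMAP_NS)
        ET.SubElement(url, "{%s}loc" % SITEMAP_NS).text = uri
        ET.SubElement(url, "{%s}lastmod" % SITEMAP_NS).text = lastmod
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical example data
xml_text = build_resource_list(
    [("http://example.com/res1", "2013-01-02T13:00:00Z"),
     ("http://example.com/res2", "2013-01-02T14:00:00Z")],
    at="2013-01-03T09:00:00Z",
)
```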

Source Capability 2: Packaging Content
By default, content is transferred in response to a GET issued by a
destination against a URI of a source’s resource. But a source may
support additional mechanisms:
o Publish a Resource Dump, a document that points to
packages of resource representations and necessary
metadata
- Destination GETs the package
- Destination unpacks the package
- ZIP format supported
o A Resource Dump and the packages it points to reflect the
state of a set of resources at one point in time (snapshot)

Source Capability 3: Describing Changes
In order to achieve lower latency and/or greater efficiency, a source
may communicate about changes to its resources:
o Publish a Change List, a list of recent change events
(created, updated, deleted resource)
- Destination acts upon change events, e.g. GETs
created/updated resources, removes deleted resources.
o A Change List pertains to resources that changed in a
temporal interval with a start- and an end-date
- If a resource changed more than once, it will be listed
more than once

Source Capability 4: Packaging Changes
In order to reduce the number of requests to obtain resource
changes, a source may provide packaged bitstreams for changed
resources:
o Publish a Change Dump, a document that points to
packages containing bitstreams of recently changed
resources and necessary metadata
- Destination GETs the package
- Destination unpacks the package
- ZIP format supported
o A Change Dump and its packages pertain to resources that
changed in a temporal interval with a start- and an end-date
- If a resource changed more than once, it will be included
more than once
Destination: Key Processes

ResourceSync - Agenda

4. Framework (Technical) Details

So Many Choices
[Figure: existing synchronization approaches arranged by Push, Pull and
Crawl: DSNotify, OAI-PMH, rsync, OAI-ORE, RDFsync, WebDAV Col. Syn.,
XMPP, Atom, SWORD, Sitemap, SPARQLpush, SDShare, AtomPub, RSS,
PubSubHubbub]
Sitemap Format

<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://example.com/res1</loc>
    <lastmod>2013-01-02T13:00:00Z</lastmod>
  </url>
  <url>
    <loc>http://example.com/res2</loc>
    <lastmod>2013-01-02T14:00:00Z</lastmod>
  </url>
  …
</urlset>

ResourceSync Sitemap Extensions
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:rs="http://www.openarchives.org/rs/terms/">
  <rs:ln …/>
  <rs:md …/>
  <url>
    <loc>http://example.com/res1</loc>
    <lastmod>2013-01-02T13:00:00Z</lastmod>
    <rs:ln …/>
    <rs:md …/>
  </url>
  <url>
    …
  </url>
</urlset>

Related Resource Metadata Summary
• Attributes of the <rs:ln> element; cf. resource metadata + pri

Element/Attribute  Description                                         Defined by
<rs:ln>                                                                ResourceSync
encoding           HTTP Content-Encoding header value                  RFC2616
hash               One or more content digests (md5, sha-1, sha-256)   Atom Link Ext.
href               Related resource URI (identity)                     RFC4287
length             HTTP Content-Length header value                    RFC4287
modified           Timestamp of last change (cf. <lastmod>)            Atom Link Ext.
path               Path in ZIP package (Dump Manifests only)           ResourceSync
pri                Priority of link                                    RFC6249
rel                Relation - IANA registered or URI                   RFC4287
type               HTTP Content-Type header value                      RFC4287
Resource Metadata Summary

Element/Attribute  Description                                         Defined by
<loc>              Resource URI (identity)                             sitemaps
<lastmod>          Timestamp of last change                            sitemaps
<changefreq>       Expected update frequency                           sitemaps
<rs:md>                                                                ResourceSync
change             Change type (Change List & Change Dump              ResourceSync
                   Manifest only)
encoding           HTTP Content-Encoding header value                  RFC2616
hash               One or more content digests (md5, sha-1, sha-256)   Atom Link Ext.
length             HTTP Content-Length header value                    RFC4287
path               Path in ZIP package (Dump Manifests only)           ResourceSync
type               HTTP Content-Type header value                      RFC4287

Link Relation Summary

Relation                  Use in ResourceSync                      Defined in
rel="alternate"           Link from generic to specific URI        HTML 5
rel="canonical"           Link from specific to generic URI        RFC6596
rel="collection"          Resource is member of collection         RFC6573
rel="contents"            Link from dump to manifest               HTML4
rel="describedby"         Has metadata                             Protocol for Web Description Resources
                                                                   (POWDER): Description Resources
rel="describes"           Is metadata for                          The 'describes' Link Relation Type
rel="duplicate"           Mirror or alternative copy               RFC6249
rel=".../rs/terms/patch"  A patch -- efficient change information  This specification
rel="memento"             Link to time-specific URI                Memento Internet Draft
rel="timegate"            Link to timegate                         Memento Internet Draft
rel="via"                 Provenance chain, came from              RFC4287

Resource List
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:rs="http://www.openarchives.org/rs/terms/">
  <rs:md capability="resourcelist"
         at="2013-01-03T09:00:00Z"
         completed="2013-01-03T09:01:00Z"/>
  <url>
    <loc>http://example.com/res1</loc>
    <lastmod>2013-01-02T13:00:00Z</lastmod>
    <rs:md hash="md5:1584abdf8ebdc9802ac0c6a7402c03b6"
           length="8876"
           type="text/html"/>
  </url>
  <url>
    …
  </url>
</urlset>

Resource List
• Describe Source’s resources that are subject to synchronization
• At one point in time (snapshot)
• Creation can take some time – duration can be conveyed
• Typical Destination use: Baseline Synchronization, Audit

• Each URI typically listed only once
• Might be expensive to generate
• Destinations use @at to determine freshness
• [@at, @completed] – interval of uncertainty
• Destination issues GETs against URIs to obtain resources
• Very similar to current Sitemaps
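
A Destination could read such a Resource List with a standard XML parser. The following is a minimal sketch using Python's xml.etree.ElementTree. The function name parse_resource_list and the shape of the returned data are assumptions for illustration, and the sample document is a shortened version of the example on the previous slide.

```python
import xml.etree.ElementTree as ET

NS = {
    "sm": "http://www.sitemaps.org/schemas/sitemap/0.9",
    "rs": "http://www.openarchives.org/rs/terms/",
}

RESOURCE_LIST = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:rs="http://www.openarchives.org/rs/terms/">
  <rs:md capability="resourcelist" at="2013-01-03T09:00:00Z"
         completed="2013-01-03T09:01:00Z"/>
  <url>
    <loc>http://example.com/res1</loc>
    <lastmod>2013-01-02T13:00:00Z</lastmod>
    <rs:md hash="md5:1584abdf8ebdc9802ac0c6a7402c03b6" length="8876"
           type="text/html"/>
  </url>
</urlset>"""


def parse_resource_list(xml_text):
    """Return (snapshot time, list of per-resource dictionaries)."""
    root = ET.fromstring(xml_text)
    top_md = root.find("rs:md", NS)          # document-level metadata
    at = top_md.get("at") if top_md is not None else None
    entries = []
    for url in root.findall("sm:url", NS):
        md = url.find("rs:md", NS)           # per-resource metadata, optional
        attrs = md.attrib if md is not None else {}
        entries.append({
            "uri": url.findtext("sm:loc", namespaces=NS),
            "lastmod": url.findtext("sm:lastmod", namespaces=NS),
            "hash": attrs.get("hash"),
            "length": int(attrs["length"]) if "length" in attrs else None,
        })
    return at, entries


at, entries = parse_resource_list(RESOURCE_LIST)
```

The Destination would then issue a GET for each listed URI and could compare the retrieved content against the advertised hash and length.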

Resource Dump
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:rs="http://www.openarchives.org/rs/terms/">
  <rs:md capability="resourcedump"
         at="2013-01-02T09:00:00Z"/>
  <url>
    <loc>http://example.com/resourcedump_part1.zip</loc>
    <lastmod>2013-01-02T13:00:00Z</lastmod>
    <rs:md length="97553"
           type="application/zip"/>
    <rs:ln rel="contents"
           href="http://example.com/resourcedump_manifest-part1.xml"
           type="application/xml"/>
  </url>
  <url>
    <loc>http://example.com/resourcedump_part2.zip</loc>
    <lastmod>2013-01-02T13:00:00Z</lastmod>
  </url>
</urlset>
Resource Dump Manifest
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:rs="http://www.openarchives.org/rs/terms/">
  <rs:md capability="resourcedump-manifest"
         at="2013-01-02T09:00:00Z"/>
  <url>
    <loc>http://example.com/res1</loc>
    <lastmod>2013-01-02T13:00:00Z</lastmod>
    <rs:md type="text/html"
           path="/resources/res1"/>
  </url>
  <url>
    <loc>http://example.com/res2</loc>
    <lastmod>2013-01-02T13:00:00Z</lastmod>
    <rs:md type="application/pdf"
           path="/resources/res2"/>
  </url>
</urlset>

Resource Dump
• A Resource Dump points to packages (ZIP files) that contain
representations of the Source’s resources
• At one point in time (snapshot)
• Resource Dump is mandatory, even if there is only one ZIP file
• ZIP package contains manifest, listing contained bitstreams
• Typical Destination use: Baseline Synchronization, bulk
download

• Each URI typically listed only once
• Might be expensive to generate
• Destinations use @at to determine freshness
• [@at, @completed] – interval of uncertainty
• GETs against individual URIs from a Resource List achieve the
same result (ignoring varying freshness)
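
Unpacking a dump package could be sketched as follows. The example builds a small ZIP file in memory and then uses its manifest to map each bitstream back to its resource URI. The file name manifest.xml, the stripping of the leading "/" from the path attribute, and the function names are assumptions made for this sketch; the specification should be consulted for the exact packaging rules.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

NS = {
    "sm": "http://www.sitemaps.org/schemas/sitemap/0.9",
    "rs": "http://www.openarchives.org/rs/terms/",
}

MANIFEST = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:rs="http://www.openarchives.org/rs/terms/">
  <rs:md capability="resourcedump-manifest" at="2013-01-02T09:00:00Z"/>
  <url>
    <loc>http://example.com/res1</loc>
    <rs:md type="text/html" path="/resources/res1"/>
  </url>
  <url>
    <loc>http://example.com/res2</loc>
    <rs:md type="application/pdf" path="/resources/res2"/>
  </url>
</urlset>"""


def make_sample_package():
    """Build a tiny ZIP package in memory (stand-in for a downloaded dump)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("manifest.xml", MANIFEST)   # assumed manifest name
        package.writestr("resources/res1", b"<html>res1</html>")
        package.writestr("resources/res2", b"%PDF-1.4 res2")
    return buffer.getvalue()


def unpack_dump(zip_bytes, manifest_name="manifest.xml"):
    """Return {resource URI: bitstream} using the manifest inside the ZIP."""
    restored = {}
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as package:
        manifest = ET.fromstring(package.read(manifest_name))
        for url in manifest.findall("sm:url", NS):
            uri = url.findtext("sm:loc", namespaces=NS)
            path = url.find("rs:md", NS).get("path")
            # ZIP member names are relative, so drop the leading slash.
            restored[uri] = package.read(path.lstrip("/"))
    return restored


resources = unpack_dump(make_sample_package())
```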
Change List
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:rs="http://www.openarchives.org/rs/terms/">
  <rs:md capability="changelist"
         from="2013-01-02T09:00:00Z"
         until="2013-01-03T09:00:00Z"/>
  <url>
    <loc>http://example.com/res1</loc>
    <lastmod>2013-01-02T13:00:00Z</lastmod>
    <rs:md change="updated"
           hash="md5:1584abdf8ebdc9802ac0c6a7402c03b6"
           length="8876"
           type="text/html"/>
  </url>
  <url>
    …
  </url>
</urlset>

Change List
• A Change List pertains to a Source’s resources that changed
• Changes that occurred during a temporal interval with start- and end-date
• Typical Destination use: Incremental Synchronization, Audit
• Changes are listed in chronological order
• Multiple changes to one resource result in the resource being
listed multiple times, once per change
• Source determines duration of temporal interval
• Destinations use @from and @until to determine freshness
• Destinations issue GETs against URIs to obtain changed
resources
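
Incremental synchronization from a Change List can be sketched as replaying the listed events in order against a local copy. In the sketch below the local copy is a plain dictionary and the fetch callable stands in for an HTTP GET. Both, along with the function names, are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET

NS = {
    "sm": "http://www.sitemaps.org/schemas/sitemap/0.9",
    "rs": "http://www.openarchives.org/rs/terms/",
}

CHANGE_LIST = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:rs="http://www.openarchives.org/rs/terms/">
  <rs:md capability="changelist" from="2013-01-02T09:00:00Z"
         until="2013-01-03T09:00:00Z"/>
  <url>
    <loc>http://example.com/res1</loc>
    <lastmod>2013-01-02T13:00:00Z</lastmod>
    <rs:md change="updated"/>
  </url>
  <url>
    <loc>http://example.com/res2</loc>
    <lastmod>2013-01-02T14:00:00Z</lastmod>
    <rs:md change="created"/>
  </url>
  <url>
    <loc>http://example.com/res2</loc>
    <lastmod>2013-01-02T15:00:00Z</lastmod>
    <rs:md change="deleted"/>
  </url>
</urlset>"""


def read_changes(xml_text):
    """Return (uri, change) pairs in document (chronological) order."""
    root = ET.fromstring(xml_text)
    changes = []
    for url in root.findall("sm:url", NS):
        md = url.find("rs:md", NS)
        changes.append((url.findtext("sm:loc", namespaces=NS), md.get("change")))
    return changes


def apply_changes(local, changes, fetch):
    """Replay change events against `local`, a {uri: content} dictionary."""
    for uri, change in changes:
        if change in ("created", "updated"):
            local[uri] = fetch(uri)      # in practice: HTTP GET on the URI
        elif change == "deleted":
            local.pop(uri, None)
    return local


local_copy = {"http://example.com/res1": "old content"}
apply_changes(local_copy, read_changes(CHANGE_LIST),
              fetch=lambda uri: "content of " + uri)
```

Because res2 is created and later deleted within the same interval, it is listed twice and does not appear in the final local copy.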

Discovery of Capabilities
Requirements:
• Need to discover capabilities, i.e. Resource List, Resource
Dump, Change List, Change Dump, Archives, Notification
channels
• Need to know the type of capability each document
represents.
Approach:
• The Source publishes a Capability List that enumerates the
capabilities it supports.
• By pointing at Resource List, Change List, Resource Dump,
etc. using appropriate relation types, e.g. “resourcelist”,
“changelist”, “resourcedump” etc.
http://www.openarchives.org/rs/resourcesync#CapabilityList
Discovery of Capabilities

Capability List
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:rs="http://www.openarchives.org/rs/terms/">
  <rs:md capability="capabilitylist"/>
  <url>
    <loc>http://example.com/dataset1/resourcelist.xml</loc>
    <rs:md capability="resourcelist"/>
  </url>
  <url>
    <loc>http://example.com/dataset1/changelist.xml</loc>
    <rs:md capability="changelist"/>
  </url>
  <url>
    <loc>http://example.com/dataset1/resourcedump.xml</loc>
    <rs:md capability="resourcedump"/>
  </url>
</urlset>
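
A Destination can turn a Capability List into a lookup from capability name to document URI. The sketch below does this with the Python standard library. The function name read_capabilities and the returned dictionary are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

NS = {
    "sm": "http://www.sitemaps.org/schemas/sitemap/0.9",
    "rs": "http://www.openarchives.org/rs/terms/",
}

CAPABILITY_LIST = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:rs="http://www.openarchives.org/rs/terms/">
  <rs:md capability="capabilitylist"/>
  <url>
    <loc>http://example.com/dataset1/resourcelist.xml</loc>
    <rs:md capability="resourcelist"/>
  </url>
  <url>
    <loc>http://example.com/dataset1/changelist.xml</loc>
    <rs:md capability="changelist"/>
  </url>
  <url>
    <loc>http://example.com/dataset1/resourcedump.xml</loc>
    <rs:md capability="resourcedump"/>
  </url>
</urlset>"""


def read_capabilities(xml_text):
    """Return {capability name: [document URIs]} from a Capability List."""
    root = ET.fromstring(xml_text)
    top_md = root.find("rs:md", NS)
    if top_md is None or top_md.get("capability") != "capabilitylist":
        raise ValueError("not a Capability List")
    capabilities = {}
    for url in root.findall("sm:url", NS):
        md = url.find("rs:md", NS)
        if md is None:
            continue                      # entry without a declared capability
        uri = url.findtext("sm:loc", namespaces=NS)
        capabilities.setdefault(md.get("capability"), []).append(uri)
    return capabilities


capabilities = read_capabilities(CAPABILITY_LIST)
```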

Discovery of Capability Lists
Requirements:
• Need to discover a Capability List
Approaches:
• Introduce a link in the HTTP Link header of a resource that is
subject to synchronization, pointing at the Capability List with the
relation type “resourcesync”
• Introduce a link from an HTML document that is subject to
synchronization (<head> section), pointing at the Capability List
with the relation type “resourcesync”
• Link from a Resource List, etc. to the Capability List with the
relation type “up”
Link header on example.com/res1.pdf
Link: <http://example.com/dataset1/capabilitylist.xml>; rel="resourcesync"
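
Extracting the Capability List URI from such a Link header can be sketched with a small regular expression. This is a deliberately simplified parser that assumes well-formed headers without commas or semicolons inside quoted parameter values. The function name find_capability_list is hypothetical.

```python
import re

# One link-value: <target> followed by zero or more ";name=value" parameters.
LINK_RE = re.compile(r'<([^>]*)>((?:\s*;\s*[^;,]+)*)')


def find_capability_list(link_header):
    """Return the target of the first link with rel="resourcesync", or None."""
    for target, params in LINK_RE.findall(link_header):
        for param in params.split(";"):
            name, _, value = param.strip().partition("=")
            relations = value.strip().strip('"').split()
            if name.strip().lower() == "rel" and "resourcesync" in relations:
                return target
    return None


header = ('<http://example.com/other>; rel="describedby", '
          '<http://example.com/dataset1/capabilitylist.xml>; rel="resourcesync"')
capability_list_uri = find_capability_list(header)
```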
Discovery via robots.txt
• Resource Lists are (enhanced) Sitemaps
• Sitemaps can be discovered via robots.txt
• Ergo, Resource Lists should be discoverable via robots.txt
User-agent: *
Disallow: /cgi-bin/
Disallow: /tmp/
Sitemap: http://example.com/dataset1/resourcelist.xml
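
Discovery via robots.txt then amounts to collecting the Sitemap lines. A minimal sketch, assuming the robots.txt content has already been retrieved as text (the function name sitemaps_from_robots is hypothetical):

```python
def sitemaps_from_robots(robots_txt):
    """Return the URIs given on Sitemap: lines, in order of appearance."""
    uris = []
    for line in robots_txt.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue                          # skip blanks and comment lines
        name, sep, value = line.partition(":")  # split on the first colon only
        if sep and name.strip().lower() == "sitemap":
            uris.append(value.strip())
    return uris


ROBOTS_TXT = """User-agent: *
Disallow: /cgi-bin/
Disallow: /tmp/
Sitemap: http://example.com/dataset1/resourcelist.xml
"""
resource_lists = sitemaps_from_robots(ROBOTS_TXT)
```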

Framework Structure

Motivation for Notifications
• Reduce synchronization latency by having the Source push out
resource change information
o To avoid continuous pull of Change Lists by Destinations
• Share information about changes to the Source’s
ResourceSync implementation, e.g. announcement of new
Resource List, new Capability List, etc.
o To avoid continuous polling of e.g. Resource Lists,
ResourceSync Description

Source: Notification Capabilities (PUSH)

1. Change Notification
• Notifies about changes to particular resources
• e.g., resource A has been updated | created | deleted
2. Framework Notification
• Notifies about changes to capabilities i.e., their documents
• e.g., a Change List has been updated | created | deleted
• Also for Capability Lists and Source Description

Notification Channels
• Notifications are sent via channels
o Resource Notification: one channel per set of resources
o Framework Notification: one channel per set of resources
o Sent on level of capability document, not on index-level
o Notifications about changes to Source Description sent on all
Framework Notification channels
• Payload for notifications: <urlset> documents
• Transport protocol for notifications under discussion:
o PubSubHubbub (https://pubsubhubbub.googlecode.com/git/pubsubhubbub-core-0.4.html): current choice
o WebSockets (http://tools.ietf.org/html/rfc6455): may be added later
Change Notification Payload
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:rs="http://www.openarchives.org/rs/terms/">
  <rs:ln rel="up"
         href="http://example.com/dataset1/capabilitylist.xml"/>
  <url>
    <loc>http://example.com/res1</loc>
    <lastmod>2013-01-02T09:07:00Z</lastmod>
    <rs:md change="updated"
           hash="md5:1584abdf8ebdc9802ac0c6a7402c03b6"
           length="8876"
           type="text/html"/>
  </url>
  <url>
    …
  </url>
</urlset>

ResourceSync - Agenda

5. Implementation

DSpace support for
metadata harvesting use case

DSpace Module:
https://github.com/CottageLabs/DSpaceResourceSync
PHP client:
https://github.com/stuartlewis/resync-php
http://mydspace.edu/dspace-rs/resource/123456789/7/qdc
(dspace-rs: ResourceSync webapp; 123456789/7: item handle; qdc: metadata format)
ResourceSync @ arXiv
• Use ResourceSync for both
mirroring and public data access
o efficient updates
o ability to do periodic audits
o public synchronization capability
o reduce admin burden
• Start with metadata + source for
mirroring use case (doing
experiments now)
• Open Access use cases require
processed PDF also
Getting a copy of arXiv
It might be as easy as:

(of course, you probably have to wait a while but it is nice to know ResourceSync is
stateless so one can efficiently restart)

Python Library and Client
• Aim to provide library code implementing all ResourceSync
facilities for use in both source and destination implementations
o Designed for Python 2.6 (RHEL6) and 2.7

• Client (resync) supports many destination operations, inspired
by the common Unix rsync program
• Client also supports some operations that might be useful in a
source, such as generation of static Resource Lists, or periodic
Change Lists (used in arXiv experiments)
• Explorer (resync-explorer) intended to allow easy inspection
of a source’s resource sets and capabilities
• Developed since ResourceSync v0.5, updated for v0.9.1
http://github.com/resync/resync
On pypi: “easy_install resync”

ResourceSync Source Simulator
• Python code using Tornado server
• Provides random set of resources of different sizes updated at a
particular rate
• Very useful for testing Destination code

http://github.com/resync/simulator

ResourceSync - Agenda

6. Q&A
THANK YOU
We look forward to seeing you at a
future NISO training event.

NISO ResourceSync Training Session

  • 1.
    NISO Training http://www.niso.org/workrooms/onixpl-encoding/ NISO Training ResourceSync:A Web-Based Resource Synchronization Framework December 3, 2013 Speakers: Bernhard Haslhofer - Postdoc Research Associate Department of Computer Science, University of Vienna Simeon Warner - Information Science, Cornell University
  • 2.
    ResourceSync: A Web-Based Resource Synchronization Framework #resourcesync ResourceSyncis funded by The Sloan Foundation & JISC ResourceSync Webinar December 3 2013 2
  • 3.
    This is ashort version of the complete ResourceSync tutorial, which is available at http://www.slideshare.net/OpenArchivesInitiative/resourcesync-tutorial ResourceSync Webinar December 3 2013 3
  • 4.
    ResourceSync Tutorial History •OAI8, June 2013 – Open Repositories, July 2013 – JCDL, July 2013 – TPDL 2013, September 2013 –LITA Forum, November 2013, SWIB November 2013, … Presenters Simeon Warner Cornell University Bernhard Haslhofer University of Vienna ResourceSync Webinar December 3 2013 4
  • 5.
    ResourceSync Tutorial Contributors MartinKlein Herbert Van de Sompel Robert Sanderson Los Alamos National Laboratory Los Alamos National Laboratory Los Alamos National Laboratory <martinklein0815@gmail.com> <hvdsomp@gmail.com> <azaroth24@gmail.com> @mart1nkle1n @hvdsomp @azaroth24 Simeon Warner Cornell University <simeon.warner@cornell.edu> @zimeon Michael L. Nelson Old Dominion University <mln@cs.odu.edu> @phonedude_mln Richard Jones Cottage Labs <richard@cottagelabs.com> @cottagelabs ResourceSync Webinar December 3 2013 5
  • 6.
    OAI Herbert Van deSompel Martin Klein Robert Sanderson (Los Alamos National Laboratory) Simeon Warner (Cornell University) NISO Todd Carpenter Nettie Lagace University of Oxford Graham Klyne Bernhard Haslhofer (University of Vienna) Michael L. Nelson (Old Dominion University) Lyrasis Peter Murray Carl Lagoze (University of Michigan) ResourceSync Webinar December 3 2013 6
  • 7.
    ResourceSync Technical Group LOCKSS ExLibris Inc. Shlomo Sanders David Rosenthal JISC Richard Jones Paul Walk Stuart Lewis RedHat OCLC Christian Sadilek Library of Congress Jeff Young Kevin Ford ResourceSync Webinar December 3 2013 7
  • 8.
    Timeline, Status ofSpecification(s) • August 2013 o o Release of ResourceSync framework Core specification - Version 0.9.1 Public draft of ResourceSync Archives specification released • September 2013 o Core specification on its way to become an ANSI standard • November 2013 o Internal draft of ResourceSync Notification specification • January 2014 o Public draft of ResourceSync Notification specification • Mid 2014 o Core specification becomes ANSI/NISO standard ResourceSync Webinar December 3 2013 8
  • 9.
    Pointers • Specification http://www.openarchives.org/rs/ http://www.openarchives.org/rs/resourcesync http://www.openarchives.org/rs/archives • Listfor public comment https://groups.google.com/d/forum/resourcesync • Client and simulator code http://github.org/resync/resync http://github.org/resync/simulator ResourceSync Webinar December 3 2013 9
  • 10.
    ResourceSync - Agenda 1.ResourceSync: Problem Perspective & Conceptual Approach 2. Motivation & Use Cases 3. Framework Walkthrough 4. Framework (Technical) Details 5. Implementation 6. Q&A ResourceSync Webinar December 3 2013 10
  • 11.
    ResourceSync - Agenda 1.ResourceSync: Problem Perspective & Conceptual Approach ResourceSync Webinar December 3 2013 11
  • 12.
    Synchronize What? • Webresources o things with a URI that can be dereferenced • Focus on needs of research communication and cultural heritage organizations o but aim for generality ResourceSync Webinar December 3 2013 12
  • 13.
    Synchronize What? • Smallwebsites/repositories (a few resources) to large repositories/datasets/linked data collections (many millions of resources) sync sync ResourceSync Webinar December 3 2013 13
  • 14.
    Synchronize What? • Lowchange frequency (weeks/months) to high change frequency (seconds) • Synchronization latency and accuracy needs may vary sync sync sync ResourceSync Webinar December 3 2013 14
  • 15.
    Why? … because lotsof projects and services are doing synchronization but have to resort to ad-hoc, case by case, approaches! • Project team involved with projects that need this • Experience with OAI-PMH: widely used in repos but o XML metadata only o Attempts at synchronizing actual content via OAI-PMH (complex object formats, dc:identifier) not successful. o Web technology has moved on since 1999 • Devise a shared solution for data, metadata, linked data? ResourceSync Webinar December 3 2013 15
  • 16.
    ResourceSync Problem • Consideration: •Source (server) A has resources that change over time: they get created, modified, deleted • Destination (servers) X, Y, and Z leverage (some) resources of Source A. • Problem: • Destinations want to keep in step with the resource changes at Source A: resource synchronization. • Goal: • Design an approach for resource synchronization aligned with the Web Architecture that has a fair chance of adoption by different communities. • The approach must scale better than recurrent HTTP HEAD/GET on resources. ResourceSync Webinar December 3 2013 16
  • 17.
    Source: Core SynchronizationCapabilities P U L L 1. Describing content – publish a list of resources available for synchronization to enable Destinations to perform an initial load or catch-up with a Source 2. Packaging content – bundle resources to enable bulk download by destinations 3. Describing changes – publish a list of resource changes to enable destinations to stay synchronized and decrease latency 4. Packaging changes – bundle resource changes for bulk download by destinations ResourceSync Webinar December 3 2013 17
  • 18.
    Source: Notifications Capabilities Toreduce synchronization latency and to optimize the synchronization process the Source can support: P • U S • H 1. Change Notification • Notifies about changes to particular resources • e.g., resource A has been updated | created | deleted 2. Framework Notification • Notifies about changes to capabilities i.e., their documents • e.g., a Change List has been updated | created | deleted ResourceSync Webinar December 3 2013 18
  • 19.
    Source: Synchronization Features 1.Discovery of capabilities – support Destinations in discovering all offered capabilities o Applies to PULL, PUSH, capabilities 1. Linking to related resources – provide links from resources subject to synchronization to related resources o Applies to PULL, PUSH capabilities ResourceSync Webinar December 3 2013 19
  • 20.
    Destination: Synchronization Needs 1.Baseline synchronization – A destination must be able to perform an initial load or catch-up with a source - avoid out-of-band setup 2. Incremental synchronization – A destination must have some way to keep up-to-date with changes at a source - subject to some latency; minimal: create/update/delete - allow to catch-up after destination has been offline 3. Audit – A destination should be able to determine whether it is synchronized with a source - regarding coverage and accuracy ResourceSync Webinar December 3 2013 20
  • 21.
    ResourceSync - Agenda 2.Motivation & Use Cases ResourceSync Webinar December 3 2013 21
  • 22.
    Use Case 1:arXiv Mirroring and Data Sharing • Repository of scholarly articles in physics, mathematics, computer science, etc. • > 850k articles • approx. 1.5 revisions per article on average • approx. 75k new articles per year • Each article has full-text and separate metadata record • approx. 3.8M resources ResourceSync Webinar December 3 2013 22
  • 23.
    Use Case 1: arXiv Mirroring and Data Sharing
    • 2,700 updates daily
      o at 8pm EST
      o Currently using a homebrew mirroring solution (running with minor modifications since 1994!)
      o occasional rsync (file system specific, auth issues)
  • 24.
    Use Case 1: arXiv Mirroring / Data Sharing
    • GOAL: Keep mirror sites synchronized with daily changes
    • WANT:
      o high consistency
      o moderate latency
      o robustness to global network outages
      o ability to verify sync status in case of questions
      o low admin effort (i.e. standard approach, standard tools)
      o reasonable consistency, latency, efficiency
  • 25.
    Use Case 2: DBpedia Live Duplication
    • Average of 2 updates per second
    • Low latency desirable => need for a push technology
  • 26.
    ResourceSync - Agenda: 3. Framework Walkthrough
  • 27.
    Source Capability 1: Describing Content
    In order to advertise the resources that a source wants destinations to know about, it may describe them:
    o Publish a Resource List, a list of resource URIs and possibly associated metadata
      - Destination GETs the Resource List
      - Destination GETs listed resources by their URI
    o A Resource List describes the state of a set of resources at one point in time (snapshot)
  • 28.
  • 29.
  • 30.
    Source Capability 2: Packaging Content
    By default, content is transferred in response to a GET issued by a destination against a URI of a source’s resource. But a source may support additional mechanisms:
    o Publish a Resource Dump, a document that points to packages of resource representations and necessary metadata
      - Destination GETs the package
      - Destination unpacks the package
      - ZIP format supported
    o A Resource Dump and the packages it points to reflect the state of a set of resources at one point in time (snapshot)
  • 31.
  • 32.
  • 33.
    Source Capability 3: Describing Changes
    In order to achieve lower latency and/or greater efficiency, a source may communicate about changes to its resources:
    o Publish a Change List, a list of recent change events (created, updated, deleted resources)
      - Destination acts upon change events, e.g. GETs created/updated resources, removes deleted resources
    o A Change List pertains to resources that changed in a temporal interval with a start and an end date
      - If a resource changed more than once, it will be listed more than once
  • 34.
  • 35.
  • 36.
  • 37.
    Source Capability 4: Packaging Changes
    In order to reduce the number of requests needed to obtain resource changes, a source may provide packaged bitstreams for changed resources:
    o Publish a Change Dump, a document that points to packages containing bitstreams of recently changed resources and necessary metadata
      - Destination GETs the package
      - Destination unpacks the package
      - ZIP format supported
    o A Change Dump and its packages pertain to resources that changed in a temporal interval with a start and an end date
      - If a resource changed more than once, it will be included more than once
  • 38.
  • 39.
    Destination: Key Processes
  • 40.
    ResourceSync - Agenda: 4. Framework (Technical) Details
  • 41.
    So Many Choices
    [Diagram labels: Push, Pull. Technologies shown: DSNotify, OAI-PMH, rsync, Crawl, OAI-ORE, RDFsync, WebDAV Col. Syn., XMPP, Atom, SWORD, Sitemap, SPARQLpush, SDShare, AtomPub, RSS, PubSubHubbub]
  • 42.
    So Many Choices
    [Diagram labels: Push, Pull. Technologies shown: DSNotify, OAI-PMH, rsync, Crawl, OAI-ORE, RDFsync, WebDAV Col. Syn., XMPP, Atom, SWORD, Sitemap, SPARQLpush, SDShare, AtomPub, RSS, PubSubHubbub]
  • 43.
  • 44.
  • 45.
    ResourceSync Sitemap Extensions
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
            xmlns:rs="http://www.openarchives.org/rs/terms/">
      <rs:ln …/>
      <rs:md …/>
      <url>
        <loc>http://example.com/res1</loc>
        <lastmod>2013-01-02T13:00:00Z</lastmod>
        <rs:ln …/>
        <rs:md …/>
      </url>
      <url> … </url>
    </urlset>
  • 46.
    Related Resource Metadata Summary
    • Attributes of the <rs:ln> element; cf. resource metadata + pri
    Element/Attribute | Description | Defined by
    <rs:ln> | | ResourceSync
    encoding | HTTP Content-Encoding header value | RFC2616
    hash | One or more content digests (md5, sha-1, sha-256) | Atom Link Ext.
    href | Related resource URI (identity) | RFC4287
    length | HTTP Content-Length header value | RFC4287
    modified | Timestamp of last change (cf. <lastmod>) | Atom Link Ext.
    path | Path in ZIP package (Dump Manifests only) | ResourceSync
    pri | Priority of link | RFC6249
    rel | Relation - IANA registered or URI | RFC4287
    type | HTTP Content-Type header value | RFC4287
  • 47.
    Resource Metadata Summary
    Element/Attribute | Description | Defined by
    <loc> | Resource URI (identity) | sitemaps
    <lastmod> | Timestamp of last change | sitemaps
    <changefreq> | Expected update frequency | sitemaps
    <rs:md> | | ResourceSync
    change | Change type (Change List & Change Dump Manifest only) | ResourceSync
    encoding | HTTP Content-Encoding header value | RFC2616
    hash | One or more content digests (md5, sha-1, sha-256) | Atom Link Ext.
    length | HTTP Content-Length header value | RFC4287
    path | Path in ZIP package (Dump Manifests only) | ResourceSync
    type | HTTP Content-Type header value | RFC4287
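As a rough illustration of how a Destination might use the hash and length attributes summarized above, the following Python sketch checks downloaded bytes against them. It rests on assumptions not stated on the slide: that multiple digests in one hash value are whitespace-separated `label:hexdigest` tokens, and that digests are hex-encoded (as in the md5 examples in this deck). The function name `matches_metadata` and the `ALGORITHMS` table are hypothetical.

```python
import hashlib

# Map the digest labels named on the slide to hashlib algorithm names.
ALGORITHMS = {"md5": "md5", "sha-1": "sha1", "sha-256": "sha256"}


def matches_metadata(content, hash_attr=None, length_attr=None):
    """Return True if content agrees with the given hash/length attribute values."""
    if length_attr is not None and len(content) != int(length_attr):
        return False
    # Assumption: several digests may appear, separated by whitespace.
    for token in (hash_attr or "").split():
        label, _, expected = token.partition(":")
        algo = ALGORITHMS.get(label.lower())
        if algo is None:
            continue  # unknown digest type: skip it rather than fail
        if hashlib.new(algo, content).hexdigest() != expected.lower():
            return False
    return True


body = b"hello"
attr = ("md5:" + hashlib.md5(body).hexdigest()
        + " sha-256:" + hashlib.sha256(body).hexdigest())
ok = matches_metadata(body, attr, "5")
```

Checking the cheap length first avoids computing digests for content that is obviously wrong.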
  • 48.
    Link Relation Summary
    Relation | Use in ResourceSync | Defined in
    rel="alternate" | Link from generic to specific URI | HTML 5
    rel="canonical" | Link from specific to generic URI | RFC6596
    rel="collection" | Resource is member of collection | RFC6573
    rel="contents" | Link from dump to manifest | HTML4
    rel="describedby" | Has metadata | Protocol for Web Description Resources (POWDER): Description Resources
    rel="describes" | Is metadata for | The 'describes' Link Relation Type
    rel="duplicate" | Mirror or alternative copy | RFC6249
    rel=".../rs/terms/patch" | A patch: efficient change information | This specification
    rel="memento" | Link to time-specific URI | Memento Internet Draft
    rel="timegate" | Link to timegate | Memento Internet Draft
    rel="via" | Provenance chain, came from | RFC4287
  • 49.
    Resource List
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
            xmlns:rs="http://www.openarchives.org/rs/terms/">
      <rs:md capability="resourcelist"
             at="2013-01-03T09:00:00Z"
             completed="2013-01-03T09:01:00Z"/>
      <url>
        <loc>http://example.com/res1</loc>
        <lastmod>2013-01-02T13:00:00Z</lastmod>
        <rs:md hash="md5:1584abdf8ebdc9802ac0c6a7402c03b6"
               length="8876"
               type="text/html"/>
      </url>
      <url> … </url>
    </urlset>
  • 50.
    Resource List
    • Describes the Source’s resources that are subject to synchronization
    • At one point in time (snapshot)
    • Creation can take some time – duration can be conveyed
    • Typical Destination use: Baseline Synchronization, Audit
    • Each URI typically listed only once
    • Might be expensive to generate
    • Destinations use @at to determine freshness
    • [@at, @completed] – interval of uncertainty
    • Destination issues GETs against URIs to obtain resources
    • Very similar to current Sitemaps
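A Destination's first step in baseline synchronization is reading a Resource List. The sketch below parses the example document from the previous slide with Python's standard `xml.etree.ElementTree`. It is a minimal illustration, not the resync library's API; the function name `parse_resource_list` and the returned dictionary layout are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Namespaces as used in the Resource List example in this deck.
NS = {
    "sm": "http://www.sitemaps.org/schemas/sitemap/0.9",
    "rs": "http://www.openarchives.org/rs/terms/",
}

RESOURCE_LIST = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:rs="http://www.openarchives.org/rs/terms/">
  <rs:md capability="resourcelist" at="2013-01-03T09:00:00Z"
         completed="2013-01-03T09:01:00Z"/>
  <url>
    <loc>http://example.com/res1</loc>
    <lastmod>2013-01-02T13:00:00Z</lastmod>
    <rs:md hash="md5:1584abdf8ebdc9802ac0c6a7402c03b6" length="8876"
           type="text/html"/>
  </url>
</urlset>"""


def parse_resource_list(xml_text):
    """Return (document-level rs:md attributes, list of per-resource dicts)."""
    root = ET.fromstring(xml_text)
    top_md = root.find("rs:md", NS)
    list_md = dict(top_md.attrib) if top_md is not None else {}
    resources = []
    for url in root.findall("sm:url", NS):
        entry = {
            "uri": url.findtext("sm:loc", namespaces=NS),
            "lastmod": url.findtext("sm:lastmod", namespaces=NS),
        }
        md = url.find("rs:md", NS)
        if md is not None:
            entry.update(md.attrib)  # hash, length, type, ...
        resources.append(entry)
    return list_md, resources


list_md, resources = parse_resource_list(RESOURCE_LIST)
```

Here `list_md["at"]` and `list_md["completed"]` bound the interval of uncertainty mentioned above, and each entry's `uri` is what the Destination would GET.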
  • 51.
    Resource Dump
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
            xmlns:rs="http://www.openarchives.org/rs/terms/">
      <rs:md capability="resourcedump" at="2013-01-02T09:00:00Z"/>
      <url>
        <loc>http://example.com/resourcedump_part1.zip</loc>
        <lastmod>2013-01-02T13:00:00Z</lastmod>
        <rs:md length="97553" type="application/zip"/>
        <rs:ln rel="contents"
               href="http://example.com/resourcedump_manifest-part1.xml"
               type="application/xml"/>
      </url>
      <url>
        <loc>http://example.com/resourcedump_part2.zip</loc>
        <lastmod>2013-01-02T13:00:00Z</lastmod>
      </url>
    </urlset>
  • 52.
    Resource Dump Manifest
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
            xmlns:rs="http://www.openarchives.org/rs/terms/">
      <rs:md capability="resourcedump-manifest" at="2013-01-02T09:00:00Z"/>
      <url>
        <loc>http://example.com/res1</loc>
        <lastmod>2013-01-02T13:00:00Z</lastmod>
        <rs:md type="text/html" path="/resources/res1"/>
      </url>
      <url>
        <loc>http://example.com/res2</loc>
        <lastmod>2013-01-02T13:00:00Z</lastmod>
        <rs:md type="application/pdf" path="/resources/res2"/>
      </url>
    </urlset>
  • 53.
    Resource Dump
    • A Resource Dump points to packages (ZIP files) that contain representations of the Source’s resources
    • At one point in time (snapshot)
    • Resource Dump is mandatory, even if there is only one ZIP file
    • ZIP package contains manifest, listing contained bitstreams
    • Typical Destination use: Baseline Synchronization, bulk download
    • Each URI typically listed only once
    • Might be expensive to generate
    • Destinations use @at to determine freshness
    • [@at, @completed] – interval of uncertainty
    • GETs against individual URIs from the Resource List achieve the same result (ignoring varying freshness)
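The unpacking step can be sketched with Python's standard `zipfile` module. This is an illustration only: the function `unpack_dump` is hypothetical, the manifest is represented as a plain dictionary from resource URI to the path attribute (as in the Resource Dump Manifest example), and it is an assumption of this sketch that the leading "/" of a manifest path is dropped to obtain the ZIP member name. The package is built in memory so the example needs no network access.

```python
import io
import zipfile


def unpack_dump(zip_bytes, manifest):
    """Map each resource URI to its bytes, given manifest: uri -> path in ZIP."""
    out = {}
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        for uri, path in manifest.items():
            # Assumption: manifest paths are rooted at the package top level,
            # so the leading "/" is stripped to get the ZIP member name.
            out[uri] = zf.read(path.lstrip("/"))
    return out


# Build a tiny in-memory package standing in for a downloaded dump.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("resources/res1", b"<html>res1</html>")
    zf.writestr("resources/res2", b"%PDF-1.4 placeholder")

manifest = {
    "http://example.com/res1": "/resources/res1",
    "http://example.com/res2": "/resources/res2",
}
unpacked = unpack_dump(buf.getvalue(), manifest)
```

One GET for the package replaces one GET per resource, which is the round-trip saving the dump capability is meant to provide.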
  • 54.
    Change List
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
            xmlns:rs="http://www.openarchives.org/rs/terms/">
      <rs:md capability="changelist"
             from="2013-01-02T09:00:00Z"
             until="2013-01-03T09:00:00Z"/>
      <url>
        <loc>http://example.com/res1</loc>
        <lastmod>2013-01-02T13:00:00Z</lastmod>
        <rs:md change="updated"
               hash="md5:1584abdf8ebdc9802ac0c6a7402c03b6"
               length="8876"
               type="text/html"/>
      </url>
      <url> … </url>
    </urlset>
  • 55.
    Change List
    • A Change List pertains to a Source’s resources that changed
    • Changes that occurred during a temporal interval with a start and an end date
    • Typical Destination use: Incremental Synchronization, Audit
    • Changes are listed in chronological order
    • Multiple changes to one resource result in the resource being listed multiple times, once per change
    • Source determines duration of temporal interval
    • Destinations use @from and @until to determine freshness
    • Destinations issue GETs against URIs to obtain changed resources
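How a Destination might act on a Change List can be sketched as follows. This is a simplified illustration, not the resync client's behavior: the function `apply_change_list`, the tuple representation of change events, and the `fetch` callback (a stand-in for an HTTP GET) are all assumptions made for this example. Events are applied in the listed (chronological) order, so a resource that appears more than once ends up in its latest state.

```python
def apply_change_list(local, changes, fetch):
    """Apply change events, in listed order, to a local uri -> content map."""
    for uri, change in changes:
        if change in ("created", "updated"):
            local[uri] = fetch(uri)  # a real destination would issue an HTTP GET
        elif change == "deleted":
            local.pop(uri, None)     # tolerate resources we never held
        else:
            raise ValueError("unknown change type: %r" % change)
    return local


# Hypothetical current Source state; its lookup stands in for fetching.
source = {"http://example.com/res1": b"v2", "http://example.com/res3": b"new"}
changes = [
    ("http://example.com/res1", "updated"),
    ("http://example.com/res2", "deleted"),
    ("http://example.com/res3", "created"),
    ("http://example.com/res1", "updated"),  # listed again: changed twice
]
local = {"http://example.com/res1": b"v1", "http://example.com/res2": b"old"}
apply_change_list(local, changes, source.__getitem__)
```

After the call, `local` mirrors `source`. A more careful destination could skip the first GET for a URI that is listed again later in the same Change List.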
  • 56.
    Discovery of Capabilities
    Requirements:
    • Need to discover capabilities, i.e. Resource List, Resource Dump, Change List, Change Dump, Archives, Notification channels
    • Need to know the type of capability each document represents
    Approach:
    • The Source publishes a Capability List that enumerates the capabilities it supports
    • It does so by pointing at the Resource List, Change List, Resource Dump, etc. using appropriate relation types, e.g. "resourcelist", "changelist", "resourcedump", etc.
    http://www.openarchives.org/rs/resourcesync#CapabilityList
  • 57.
    Discovery of Capabilities
  • 58.
    Capability List
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
            xmlns:rs="http://www.openarchives.org/rs/terms/">
      <rs:md capability="capabilitylist"/>
      <url>
        <loc>http://example.com/dataset1/resourcelist.xml</loc>
        <rs:md capability="resourcelist"/>
      </url>
      <url>
        <loc>http://example.com/dataset1/changelist.xml</loc>
        <rs:md capability="changelist"/>
      </url>
      <url>
        <loc>http://example.com/dataset1/resourcedump.xml</loc>
        <rs:md capability="resourcedump"/>
      </url>
    </urlset>
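Reading the Capability List example above into a lookup table is straightforward with the Python standard library. The sketch below is illustrative only; the function name `capabilities` and the returned mapping of capability name to document URI are assumptions for this example, not part of the specification or of the resync library.

```python
import xml.etree.ElementTree as ET

NS = {
    "sm": "http://www.sitemaps.org/schemas/sitemap/0.9",
    "rs": "http://www.openarchives.org/rs/terms/",
}

CAPABILITY_LIST = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:rs="http://www.openarchives.org/rs/terms/">
  <rs:md capability="capabilitylist"/>
  <url>
    <loc>http://example.com/dataset1/resourcelist.xml</loc>
    <rs:md capability="resourcelist"/>
  </url>
  <url>
    <loc>http://example.com/dataset1/changelist.xml</loc>
    <rs:md capability="changelist"/>
  </url>
  <url>
    <loc>http://example.com/dataset1/resourcedump.xml</loc>
    <rs:md capability="resourcedump"/>
  </url>
</urlset>"""


def capabilities(xml_text):
    """Return a dict mapping capability name -> URI of its document."""
    root = ET.fromstring(xml_text)
    found = {}
    for url in root.findall("sm:url", NS):
        md = url.find("rs:md", NS)
        if md is not None and "capability" in md.attrib:
            found[md.attrib["capability"]] = url.findtext("sm:loc", namespaces=NS)
    return found


caps = capabilities(CAPABILITY_LIST)
```

A Destination could then choose, for example, `caps["resourcedump"]` for a bulk baseline load and `caps["changelist"]` for subsequent incremental updates.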
  • 59.
    Discovery of Capability Lists
    Requirements:
    • Need to discover a Capability List
    Approaches:
    • Introduce a link in the HTTP Link header of a resource that is subject to synchronization, pointing at the Capability List with the relation type "resourcesync"
    • Introduce a link from an HTML document that is subject to synchronization (<head> section), pointing at the Capability List with the relation type "resourcesync"
    • Link from a Resource List, etc. to the Capability List with the relation type "up"
    Link header on example.com/res1.pdf:
    Link: <example.com/dataset1/capabilitylist.xml>;rel="resourcesync"
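The Link header approach can be illustrated with a small Python sketch that extracts the target of a "resourcesync" link from a header value. This is a deliberately simplified parser written for this example: the function `find_capability_list` and the regular expression are hypothetical, and the pattern only handles links whose `rel` parameter directly follows the target, which is enough for the header shown on the slide but not for every valid Link header.

```python
import re

# Matches "<target>; rel=value" or '<target>; rel="value"' (rel first only).
LINK_RE = re.compile(r'<([^>]*)>\s*;\s*rel\s*=\s*"?([^";,]+)"?')


def find_capability_list(link_header):
    """Return the target of the first link with relation "resourcesync", or None."""
    for target, rel in LINK_RE.findall(link_header):
        # A rel value may hold several space-separated relation types.
        if "resourcesync" in rel.split():
            return target
    return None


# Header value as written on the slide, plus a two-link variant.
slide_header = '<example.com/dataset1/capabilitylist.xml>;rel="resourcesync"'
multi_header = (
    '<http://example.com/res1.html>; rel="alternate", '
    '<http://example.com/dataset1/capabilitylist.xml>; rel="resourcesync"'
)
found = find_capability_list(slide_header)
```

A production Destination would more likely rely on a full Link header parser from its HTTP library than on a regular expression.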
  • 60.
    Discovery via robots.txt
    • Resource Lists are (enhanced) Sitemaps
    • Sitemaps can be discovered via robots.txt
    • Ergo, Resource Lists should be discoverable via robots.txt
    User-agent: *
    Disallow: /cgi-bin/
    Disallow: /tmp/
    Sitemap: http://example.com/dataset1/resourcelist.xml
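Picking the Sitemap entries out of a robots.txt file, as described above, takes only a few lines of Python. This sketch works on the robots.txt text from the slide; the function name `sitemap_urls` is hypothetical, and treating the field name case-insensitively is an assumption of this example.

```python
ROBOTS_TXT = """User-agent: *
Disallow: /cgi-bin/
Disallow: /tmp/
Sitemap: http://example.com/dataset1/resourcelist.xml
"""


def sitemap_urls(robots_txt):
    """Return the URLs of all Sitemap lines in a robots.txt document."""
    urls = []
    for line in robots_txt.splitlines():
        # Split at the first colon only, so the "http://" in the URL survives.
        field, sep, value = line.partition(":")
        if sep and field.strip().lower() == "sitemap":
            urls.append(value.strip())
    return urls


candidates = sitemap_urls(ROBOTS_TXT)
```

Each returned URL is a candidate Resource List; a Destination could confirm this by checking the document's `<rs:md capability="resourcelist">` element after fetching it.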
  • 61.
  • 62.
    Motivation for Notifications
    • Reduce synchronization latency by having the Source push out resource change information
    • To avoid continuous pull of Change Lists by Destinations
    • Share information about changes to the Source’s ResourceSync implementation, e.g. announcement of a new Resource List, new Capability List, etc.
    • To avoid continuous polling of e.g. Resource Lists, ResourceSync Description
  • 63.
    Source: Notification Capabilities (PUSH)
    1. Change Notification
       • Notifies about changes to particular resources
       • e.g., resource A has been updated | created | deleted
    2. Framework Notification
       • Notifies about changes to capabilities, i.e., their documents
       • e.g., a Change List has been updated | created | deleted
       • Also for Capability Lists and Source Description
  • 64.
    Notification Channels
    • Notifications are sent via channels
    • Resource Notification: one channel per set of resources
    • Framework Notification: one channel per set of resources
    • Sent on level of capability document, not on index-level
    • Notifications about changes to Source Description sent on all Framework Notification channels
    • Payload for notifications: <urlset> documents
    • Transport protocol for notifications under discussion:
      o PubSubHubbub (current choice): https://pubsubhubbub.googlecode.com/git/pubsubhubbub-core0.4.html
      o WebSockets (may be added later): http://tools.ietf.org/html/rfc6455
  • 65.
    Change Notification Payload
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
            xmlns:rs="http://www.openarchives.org/rs/terms/">
      <rs:ln rel="up" href="http://example.com/dataset1/capabilitylist.xml"/>
      <url>
        <loc>http://example.com/res1</loc>
        <lastmod>2013-01-02T09:07:00Z</lastmod>
        <rs:md change="updated"
               hash="md5:1584abdf8ebdc9802ac0c6a7402c03b6"
               length="8876"
               type="text/html"/>
      </url>
      <url> … </url>
    </urlset>
  • 66.
    ResourceSync - Agenda: 5. Implementation
  • 67.
    DSpace support for metadata harvesting use case
    DSpace Module: https://github.com/CottageLabs/DSpaceResourceSync
    PHP client: https://github.com/stuartlewis/resync-php
    Example URL: http://mydspace.edu/dspace-rs/resource/123456789/7/qdc
    URL parts labeled on the slide: ResourceSync webapp, Item handle, Metadata Format
  • 68.
    ResourceSync @ arXiv
    • Use ResourceSync for both mirroring and public data access
      o efficient updates
      o ability to do periodic audits
      o public synchronization capability
      o reduce admin burden
    • Start with metadata + source for mirroring use case (doing experiments now)
    • Open Access use cases require processed PDF also
  • 69.
    Getting a copy of arXiv
    It might be as easy as: [command not captured in this transcript]
    (Of course, you will probably have to wait a while, but it is nice to know that ResourceSync is stateless, so one can efficiently restart.)
  • 70.
    Python Library and Client
    • Aim to provide library code implementing all ResourceSync facilities for use in both source and destination implementations
      o Designed for Python 2.6 (RHEL6) and 2.7
    • Client (resync) supports many destination operations, inspired by the common Unix rsync program
    • Client also supports some operations that might be useful in a source, such as generation of static Resource Lists, or periodic Change Lists (used in arXiv experiments)
    • Explorer (resync-explorer) intended to allow easy inspection of a source’s resource sets and capabilities
    • Developed since ResourceSync v0.5, updated for v0.9.1
    http://github.com/resync/resync
    On PyPI: "easy_install resync"
  • 71.
    ResourceSync Source Simulator
    • Python code using Tornado server
    • Provides a random set of resources of different sizes updated at a particular rate
    • Very useful for testing Destination code
    http://github.com/resync/simulator
  • 72.
    ResourceSync - Agenda: 6. Q&A
  • 73.
    ResourceSync: A Web-Based Resource Synchronization Framework
    #resourcesync
    ResourceSync is funded by The Sloan Foundation & JISC
  • 74.
    THANK YOU
    We look forward to seeing you at a future NISO training event.

Editor's Notes

  • #16 LANL Memento Aggregator of IIPC; Europeana does metadata via OAI-PMH but anticipate content also; arXiv – mirroring and data sharing; Linked data @ BBC; DBpedia, journal data at LANL. REST not about in 1999
  • #26 Semantic web version of wikipedia; want mirror to provide reliable basis for local services
  • #40 Top line – just metadata about resources, destination uses GET to get them (duh). Bottom line – packaged content => fewer round trips
  • #42 rsync etc. just reference; push vs pull -> both; many other parts
  • #43 rsync etc. just reference; push vs pull -> both; many other parts
  • #49 Add: rel="contents", rel="archives"
  • #69 Test site, has subsets of arXiv and even complete source plus metadata (at present not up to date with 0.9)
  • #70 No way around the difficulty of transferring 1TB initially but then a daily or weekly sync is efficient, and it still works even after some arbitrary time.