Memento & Access to Resource Versions
Herbert Van de Sompel
http://mementoweb.org/
Memento
Uniform and Robust Access to Resource Versions
Memento has received funding
from
The Library of Congress
Andrew W. Mellon Foundation
IIPC
Memento Makes Navigating the Web’s Past Easy
RFC 7089 (2013) Van de Sompel, H., Nelson, M.L., Sanderson, R.
HTTP Framework for Time-Based Access to Resource States - Memento
http://tools.ietf.org/html/rfc7089
Memento: Access Versions via the Original URI and a Datetime
[Figure: browser screenshots labelled Today, Select Date, June 20 1997 and June 5 1997; Mementos from archive.today]
Memento: Access Versions via the Original URI and a Datetime
[Figure: browser screenshots labelled Today, Select Date, June 27 2011 and May 29 2011; Mementos from Internet Archive]
The Memento protocol achieves this by introducing
a uniform, datetime-based, version access capability
that integrates the Present and Past Web.
Problem Statement …
Resources
Resources have Representations
Resources have Representations that Change over Time
Only the Current Representation is Available from a Resource
Old Representations are Lost Forever
But … Archived/Version Resources Exist
There are resource versions on
the Web, in:
• Web Archives;
• Content Management
Systems;
• Search engine caches;
• Transactional archives.
Web Archive
Archived Resource
URI-M - http://web.archive.org/web/20010911203610/http://www.cnn.com/
URI-R - http://www.cnn.com/
Web Archive
Archived Resource
URI-M - https://archive.today/UD0d6
URI-R - http://www.w3.org/
Version Resource
URI-M - http://en.wikipedia.org/w/index.php?title=September_11_attacks&oldid=282333
CMS
URI-R - http://en.wikipedia.org/wiki/September_11_attacks
Search Engine Cache
Cached Resource
URI-R – http://ghr.nlm.nih.gov/handbook/basics/dna
URI-M - http://webcache.googleusercontent.com/search?q=cache:kDmDc1PIA38J:ghr.nlm.nih.gov/handbook/basics/dna+&cd=2&hl=en&ct=clnk&gl=us
Archived Resource
Transactional Archive
URI-R - http://dans.knaw.nl/en
URI-M - http://www.theresourcedepot.com/000010/memento/20130418204153/http://dans.knaw.nl/en
But, without Memento, the Web
handles these version
resources poorly:
• Cannot talk, in URI terms,
about a resource as it used to
exist
• Cannot access a prior version
knowing the current one
• Cannot access the current
version knowing a prior one
Solutions are ad hoc and
localized
Without Memento, the Current and Past Web Lack Integration
• Going from Current to Past
Web is a matter of (manual)
discovery
• Navigating the Past Web is
only possible within the
boundary of a single web
archive or versioning system
• Memento integrates the
Current And Past Web by
means of an extension of
HTTP
• Memento turns archives and
versioning systems into
infrastructure rather than
destinations
Systems with Resource Versions
system type            stores                  URI-R and URI-M
web archive            observations over time  different baseURL
CMS                    history                 same baseURL
search engine cache    one recent observation  different baseURL
transactional archive  history                 different baseURL
These systems have different characteristics,
but the Memento protocol allows uniform version access to their resources
The Memento Framework:
Protocol to Integrate Present and Past Web
Overview
The Memento protocol:
• Regards the Web as a big
Content Management System
• Introduces an interoperable
approach to access resource
versions across the Web
• Does not build new archives
but leverages all systems that
host versions
Memento’s approach to access
resource versions:
• Is distributed: versions may
exist on several servers
• Uses time as a global version
indicator
• Is based on the primitives of
the Web: resource, state,
representation, content
negotiation, link
Memento’s approach to access
resource versions has two
components:
• Access to a single
archived/version resource –
via datetime negotiation with
a TimeGate
• Access to an overview of
existing versions – by
requesting a TimeMap
Memento Protocol Resource Types
Original Resource: Resource that exists or used to exist;
we are interested in accessing a past state of it
Memento: Resource that is a prior version of the Original
Resource; it encapsulates a past state of the Original Resource
TimeGate: Resource that “decides”, based on a given datetime,
which is the temporally best Memento for an Original Resource
TimeMap: Resource that provides a list of known Mementos for
an Original Resource as well as their datetime
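A minimal sketch of how a TimeGate might "decide" on the temporally best Memento. RFC 7089 leaves the selection algorithm to the server, so picking the Memento closest in time is an assumption made here; the function name select_memento and the archive.example.net URIs are hypothetical.

```python
from datetime import datetime, timezone


def select_memento(mementos, requested):
    """Return the (URI-M, datetime) pair closest in time to `requested`.

    `mementos` is a list of (URI-M, archival datetime) pairs.
    Closest-in-time is an assumed policy, not one the protocol mandates.
    """
    if not mementos:
        raise ValueError("no Mementos known for this Original Resource")
    return min(mementos, key=lambda m: abs((m[1] - requested).total_seconds()))


# Hypothetical holdings of one archive for http://www.cnn.com/
known = [
    ("http://archive.example.net/20010911203610/http://www.cnn.com/",
     datetime(2001, 9, 11, 20, 36, 10, tzinfo=timezone.utc)),
    ("http://archive.example.net/20050101000000/http://www.cnn.com/",
     datetime(2005, 1, 1, tzinfo=timezone.utc)),
]

best = select_memento(known, datetime(2001, 9, 12, tzinfo=timezone.utc))
print(best[0])
```

A real TimeGate would return the chosen URI-M in a Location header rather than print it.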
The Memento Framework:
Protocol to Integrate Present and Past Web
Datetime Negotiation
Original Resource and Mementos
Bridge from Present to Past
Bridge from Past to Present
Memento Datetime Negotiation Component
Memento Protocol Datetime Negotiation Patterns
The different Patterns are discussed in RFC 7089
Here, we deal with URI-R <> URI-G <> URI-M and 302 style negotiation
can coincide with
can coincide with
302 or 200 style negotiation can be used
Memento Datetime Negotiation - Client Server Interaction
Yes, G
It’s at M
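Memento Datetime Negotiation - HTTP Flow
Client → R: HEAD R, [Accept-Datetime]
R → Client: [Link → G]
Client → G: HEAD G, Accept-Datetime
G → Client: 302 → M, Vary, Link → R, [M, T]
Client → M: GET M, [Accept-Datetime]
M → Client: 200, Memento-Datetime, Link → R, [G, M, T]
[…] = optional
R = Original Resource, G = TimeGate, M = Memento, T = TimeMap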
Original Resource Provides No Link – Client Intelligence
Original Resource Gone – Client Intelligence
Original Resource Gone – Server Due Diligence
Original Resource’s Server Gone – Client Intelligence
Memento Aggregator
TimeGates
A list of TimeGates provided by major web archives as well as
by-proxy TimeGates provided for other systems is maintained at
http://mementoweb.org/depot/
http://timetravel.mementoweb.org/guide/api/#registry
The Memento Framework:
Protocol to Integrate Present and Past Web
TimeMaps
TimeMap
• multiple TimeMap serializations possible
• application/link-format mandatory
• When TimeMaps become too large, they can
be broken up and paged
TimeMaps
A list of TimeMaps provided by major web archives as well as
by-proxy TimeMaps provided for other systems is maintained at
http://mementoweb.org/depot/
http://timetravel.mementoweb.org/guide/api/#registry
The Memento Framework:
Protocol to Integrate Present and Past Web
HTTP Headers
The HTTP Headers used in the Memento Protocol
• Define two new headers:
– request: Accept-Datetime:
– response: Memento-Datetime:
• Introduce new content for two existing headers:
– response: Vary: ; Link:
• Use existing headers without modification:
– response: Location:, TCN:
HTTP Request Headers for Datetime Negotiation
• Accept-Datetime:
o Issued against TimeGate, [Original Resource, Memento]
o Header value: desired datetime of a Memento
Accept-Datetime: Mon, 12 Oct 2009 14:20:33 GMT
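A short sketch of producing an Accept-Datetime value in the HTTP date format shown above, using only the Python standard library. The helper name accept_datetime_header is hypothetical.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def accept_datetime_header(moment):
    """Format a timezone-aware datetime as an Accept-Datetime value in GMT."""
    return format_datetime(moment.astimezone(timezone.utc), usegmt=True)


wanted = datetime(2009, 10, 12, 14, 20, 33, tzinfo=timezone.utc)
header_value = accept_datetime_header(wanted)
print("Accept-Datetime:", header_value)

# The same format is used by Memento-Datetime, so one parser serves both.
parsed = parsedate_to_datetime(header_value)
```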
HTTP Response Headers for Datetime Negotiation
• Memento-Datetime:
o Returned by Mementos only
- Even when the response is not the result of datetime negotiation
o Header value: Archival datetime of the Memento
- Resource has not changed and will not change after that datetime
o This header is sticky:
- Once returned, a server must always return it with same
value
- Must also be preserved when Mementos are mirrored at
different URIs
o This header is crucial to allow a client to understand it has
arrived at a Memento
Memento-Datetime: Mon, 12 Oct 2009 14:20:33 GMT
HTTP Response Headers for Datetime Negotiation
• Vary:
o Returned by TimeGate
o Similar to regular content negotiation
o Header value: accept-datetime
• Regular content negotiation (e.g. media type) can be used too but
a TimeGate must first meet the datetime preference, and then – if
possible – the other content negotiation preferences
• Note: accept-datetime in Vary header is crucial to allow a
client to understand it has arrived at a TimeGate
Vary: accept-datetime
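A sketch of the client-side check these two headers make possible: Memento-Datetime signals a Memento, and accept-datetime in Vary signals a TimeGate. The function name classify_response and the plain-dict header representation are assumptions for illustration. Memento-Datetime is tested first because a response from a 200-style TimeGate can carry both headers.

```python
def classify_response(headers):
    """Guess which Memento resource type produced an HTTP response.

    `headers` maps header names to values; names are matched
    case-insensitively, as HTTP requires.
    """
    lowered = {name.lower(): value for name, value in headers.items()}
    if "memento-datetime" in lowered:
        return "memento"
    vary = [token.strip().lower() for token in lowered.get("vary", "").split(",")]
    if "accept-datetime" in vary:
        return "timegate"
    return "unknown"


print(classify_response({"Vary": "accept-datetime, accept-language"}))
print(classify_response({"Memento-Datetime": "Mon, 12 Oct 2009 14:20:33 GMT"}))
```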
HTTP Response Headers for Datetime Negotiation
• Location:
o Returned by TimeGate
o Similar to regular content negotiation
o Header value: URI of the Memento selected by the TimeGate
Location: http://web.archive.org/web/20010911223004/http://cnn.com
HTTP Response Headers for Datetime Negotiation
• Link:
o Returned by Original Resource, TimeGate and Mementos
o Various new Relation Types are introduced:
- “original” – points to Original Resource
- “timegate” – points to TimeGate
- “memento” – points to Memento
- “timemap” – points to TimeMap
o A TimeGate must provide the “original” link
o A Memento must provide the “original” link
o All other links are encouraged but optional
HTTP Link Header: RFC 5988
HTTP Response Headers for Datetime Negotiation
• Link:
o The following “memento” links that point at special Mementos,
known to the responding server, are optional but very useful:
- First and last Memento known to the server, e.g. “memento
first”
- Mementos before and after the selected Memento, e.g.
“memento predecessor-version”
- Selected Memento
- Temporal order of Mementos is expressed using existing
relation types from RFC 5829 and RFC 5988: first, last,
next, prev, successor-version, predecessor-version
HTTP Response Headers for Datetime Negotiation
• Link:
o Attributes for a “memento” Link:
- datetime (mandatory): datetime of the Memento pointed at
by the link
- license (optional): license associated with the Memento
o Attributes for a “timemap” Link:
- type (recommended): MIME type of TimeMap serialization
- from, until (optional): to convey the temporal interval of
Memento datetimes covered by the TimeMap
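A sketch of reading the relation types and attributes described above out of a Link header value. The regular expressions are a simplification that handles quoted attribute values containing commas, as datetime values do; they are not a complete RFC 5988 parser. The function name parse_link_header and the example.org and example.net URIs are hypothetical.

```python
import re

# One link: <target> followed by zero or more ;name=value parameters.
LINK_RE = re.compile(r'<([^>]*)>((?:\s*;\s*[\w-]+\s*=\s*(?:"[^"]*"|[^;,\s]+))*)')
PARAM_RE = re.compile(r'([\w-]+)\s*=\s*(?:"([^"]*)"|([^;,\s]+))')


def parse_link_header(value):
    """Return a list of (target URI, parameters) pairs.

    The "rel" parameter is split into a list, since a link such as
    rel="first memento" carries several relation types.
    """
    links = []
    for target, raw_params in LINK_RE.findall(value):
        params = {name.lower(): quoted or bare
                  for name, quoted, bare in PARAM_RE.findall(raw_params)}
        params["rel"] = params.get("rel", "").split()
        links.append((target, params))
    return links


sample = (
    '<http://a.example.org/>; rel="original", '
    '<http://arxiv.example.net/timemap/http://a.example.org/>; '
    'rel="timemap"; type="application/link-format", '
    '<http://arxiv.example.net/web/20010911203610/http://a.example.org/>; '
    'rel="first memento"; datetime="Tue, 11 Sep 2001 20:36:10 GMT"'
)

links = parse_link_header(sample)
for target, params in links:
    if "memento" in params["rel"]:
        print(target, params["datetime"])
```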
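Memento Datetime Negotiation - HTTP Flow (with Link relation types)
Client → R: HEAD R, [Accept-Datetime]
R → Client: [Link → G]; relation type: [timegate]
Client → G: HEAD G, Accept-Datetime
G → Client: 302 → M, Vary, Link → R [M T]; relation types: original [memento timemap]
Client → M: GET M, [Accept-Datetime]
M → Client: 200, Memento-Datetime, Link → R [G M T]; relation types: original [timegate memento timemap]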
The Memento Framework:
Protocol to Integrate Present and Past Web
HTTP Interactions
Datetime Negotiation Flow: Step 1
Datetime Negotiation Flow: Step 2
Datetime Negotiation Flow: Step 3
Datetime Negotiation Flow: Step 4
Datetime Negotiation Flow: Step 5
Datetime Negotiation Flow: Step 6
TimeMap Access Flow: Step 1
TimeMap Access Flow: Step 2
TimeMap Access Flow: Step 3
TimeMap Access Flow: Step 4
TimeMap Access Flow: Step 5
TimeMap Access Flow: Step 6
TimeMap Access Flow: Step 6 with Index TimeMap
TimeMap Access Flow: Step 6 with Paging TimeMap
The Memento Framework:
Protocol to Integrate Present and Past Web
Resource Versioning and Memento
Common Resource Versioning Approach
Version Resources
(*) Tim Berners-Lee (1996) http://www.w3.org/DesignIssues/Generic.html
Version Resources and Associated Generic Resource
(*) Tim Berners-Lee (1996) http://www.w3.org/DesignIssues/Generic.html
Memento Bridges Between Generic & Specific Resources
Stepwise Support for the Memento Protocol – Step 1
• Provide Memento protocol HTTP response headers to convey version
date and links
o Provide Memento-Datetime header to express version date
o Provide Link header with “original” link to point from version
resource to generic resource
o Provide Link header with appropriate “memento” links to allow
navigating between versions
- In combination with links with other relation types, e.g.
“first”, “last”, “prev”, “next”, “predecessor-version”,
“successor-version”
Stepwise Support for the Memento Protocol – Step 1
• Response to HTTP HEAD/GET against
http://www.w3.org/TR/2004/PR-webarch-20041105/
Stepwise Support for the Memento Protocol – Step 2
• Publish a TimeMap, at, say,
http://www.w3.org/TR/timemap/webarch/
• For the generic resource and for each version resource, provide a
Link header with “timemap” link that points at the TimeMap
Stepwise Support for the Memento Protocol – Step 2
• Response to HTTP HEAD/GET against
http://www.w3.org/TR/2004/PR-webarch-20041105/
Stepwise Support for the Memento Protocol – Step 2
• Response to HTTP GET against
http://www.w3.org/TR/timemap/webarch/
Stepwise Support for the Memento Protocol – Step 3
• Expose a TimeGate, at, say,
http://www.w3.org/TR/timegate/webarch/
• Responses for generic resource, version resources, TimeGate,
TimeMap as shown in the HTTP Interactions slides (Datetime
Negotiation Flow and TimeMap Access Flow)
• Note that Patterns for datetime negotiation other than the one
shown in those slides are described in RFC 7089
The Memento Framework:
Protocol to Integrate Present and Past Web
Memento and Linked Data
The Memento Framework:
Protocol to Integrate Present and Past Web
User & Developer Tools
Memento for Chrome
http://bit.ly/memento-for-chrome
Time Travel Find – Search Page
http://timetravel.mementoweb.org/
Time Travel Find – Result Page
http://timetravel.mementoweb.org/list/20100428103432/http://stanford.edu
Time Travel Find – Search Page
http://timetravel.mementoweb.org/
Time Travel Find – Result Page
http://timetravel.mementoweb.org/list/20140428052227/http://coptr.digipres.org/Main_Page
Time Travel Reconstruct – Search Page
http://timetravel.mementoweb.org/
Time Travel Reconstruct – Result Page
http://timetravel.mementoweb.org/reconstruct/20100428103432/http://stanford.edu
Memento for MediaWiki Extensions
http://bit.ly/memento-for-mediawiki
Generic TimeGate Server (1/2)
https://github.com/mementoweb/timegate
Generic TimeGate Server (2/2)
https://github.com/mementoweb/timegate
SiteStory Transactional Archive for Apache Servers
https://mementoweb.github.io/SiteStory/
Memento Aggregator
Coverage: See http://mementoweb.org/depot/ and
http://labs.mementoweb.org/aggregator_config/archivelist.xml
Various Memento Tools for Users & Developers
http://mementoweb.org/tools/
The Memento Framework:
Protocol to Integrate Present and Past Web
Time Travel APIs
Time Travel APIs
http://timetravel.mementoweb.org/guide/api/
URI that Redirects to a Memento
http://timetravel.mementoweb.org/memento/20100428103432/http://stanford.edu
URI that Redirects to a JSON Description of a Memento
http://timetravel.mementoweb.org/api/json/20100428103432/http://stanford.edu
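A sketch of composing such Time Travel URIs from a datetime and an Original URI. The pattern (service path, 14-digit datetime, Original URI) is inferred from the example URIs on these slides; the helper name time_travel_uri is hypothetical, and the code only builds the URI without issuing a request.

```python
from datetime import datetime

TIME_TRAVEL_BASE = "http://timetravel.mementoweb.org"


def time_travel_uri(service, moment, original_uri):
    """Compose a Time Travel URI of the form base/service/YYYYMMDDhhmmss/URI-R.

    `service` is a path segment seen on the slides, e.g. "memento" or "api/json".
    """
    stamp = moment.strftime("%Y%m%d%H%M%S")
    return f"{TIME_TRAVEL_BASE}/{service}/{stamp}/{original_uri}"


when = datetime(2010, 4, 28, 10, 34, 32)
redirect_uri = time_travel_uri("memento", when, "http://stanford.edu")
json_uri = time_travel_uri("api/json", when, "http://stanford.edu")
print(redirect_uri)
print(json_uri)
```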
JSON Format for TimeMaps
http://mementoweb.org/guide/timemap-json/
DIY TimeMap - Index TimeMap Lists Potential TimeMap URIs
http://timetravel.mementoweb.org/timemap/json/http://stanford.edu
SPEED
WDI TimeMap - Regular (Index) TimeMap
http://labs.mementoweb.org/timemap/link/http://stanford.edu
COVERAGE
Time Travel Archive Registry
http://labs.mementoweb.org/aggregator_config/archivelist.xml
The Memento Framework:
Protocol to Integrate Present and Past Web
Robust Links
How to Reference Resources
• Create a Capture in Internet Archive, archive.today, perma.cc,
webcitation
• Existing practice for linking to such captures:
o Link to URI of Capture
o Lose Original URI
o Lose Capture Datetime
• Problems with existing practice:
o Impossible to visit the original URI, if desired
o Requires the permanent existence/uptime of the archive that
holds the capture
- One link rot problem replaced by another
Van de Sompel, H. et al. (2013) Thoughts on referencing, linking, reference rot
http://mementoweb.org/missing-link/
Permanent Existence/Uptime of Archives?
Capture of http://webcitation.org dated July 17 2013
https://archive.today/eAETp
Permanent Existence/Uptime of Archives?
http://webcitation.org/ on August 6 2014
Permanent Existence/Uptime of Archives?
Remnant of discontinued web archive http://mummify.it captured on February 14 2014
https://web.archive.org/web/20140214233752/https://www.mummify.it/
Permanent Existence/Uptime of Archives?
http://www.themoscowtimes.com/news/article/russia-bans-wayback-machine-internet-archive-over-islamic-state-video/510074.html
Hacking Original URI, Capture Datetime from Capture URI?
URI of Capture                                                         Original URI  Datetime T
https://web.archive.org/web/20140214233752/https://www.mummify.it      yes           yes
https://archive.today/eAETp                                            no            no
http://perma.cc/4RH7-999Q?type=source                                  no            no
http://en.wikipedia.org/w/index.php?title=Coil_(band)&oldid=388321480  no            no
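A sketch of the "hack" that works for the first row only: a capture URI that embeds a 14-digit datetime and the Original URI can be split apart, while opaque capture URIs cannot. The /web/ path pattern is taken from the web.archive.org example above, the function name split_capture_uri is hypothetical, and treating the embedded datetime as UTC is an assumption.

```python
import re
from datetime import datetime, timezone

# Matches capture URIs shaped like the web.archive.org example:
# scheme://host/web/<14 digits>/<Original URI>
WAYBACK_STYLE = re.compile(r"^https?://[^/]+/web/(\d{14})/(.+)$")


def split_capture_uri(capture_uri):
    """Return (Original URI, capture datetime), or None if neither is embedded."""
    match = WAYBACK_STYLE.match(capture_uri)
    if match is None:
        return None
    stamp, original = match.groups()
    # Assumption: the embedded datetime is expressed in UTC.
    when = datetime.strptime(stamp, "%Y%m%d%H%M%S").replace(tzinfo=timezone.utc)
    return original, when


print(split_capture_uri(
    "https://web.archive.org/web/20140214233752/https://www.mummify.it"))
print(split_capture_uri("https://archive.today/eAETp"))
```

The None results are the reason a reference should carry the Original URI and datetime explicitly rather than rely on the shape of the capture URI.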
Using Capture URI to find Captures in Other Web Archives?
Reference Resources Robustly
• When referencing resources include:
o Original URI – Allows revisiting the URI as it is at the time of
reading, if the URI is still operational
o Snapshot URI – Allows revisiting the snapshot, if one was
created, and if the web archive in which it was created is still
operational
o Original URI & Date/Time allows revisiting a snapshot
created around the Date/Time in any web archive around the
world (using Memento infrastructure)
Van de Sompel, H. et al. (2013) Thoughts on referencing, linking, reference rot
http://mementoweb.org/missing-link/
Reference Resources Actionably
• When referencing resources, use Link Decorations to convey
Original URI, Snapshot URI, Date/Time
<a href="http://www.stanford.edu"
data-versionurl="http://archive.is/FAy6o"
data-versiondate="2014-08-15">
<a href="http://www.stanford.edu"
data-versiondate="2014-08-15">
<a href="http://archive.is/FAy6o"
data-originalurl="http://www.stanford.edu"
data-versiondate="2014-08-15">
Herbert Van de Sompel et al. (2015) Robust Links - Link Decorations
http://robustlinks.mementoweb.org/spec/
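A sketch of how a tool might collect these link decorations from a page, using the standard library HTML parser. The class name RobustLinkParser and the sample markup (including the link text) are hypothetical; the attribute names are those shown above.

```python
from html.parser import HTMLParser


class RobustLinkParser(HTMLParser):
    """Collect <a> elements that carry Robust Links data- attributes."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        found = dict(attrs)
        if "data-versiondate" in found or "data-versionurl" in found:
            self.links.append({
                "href": found.get("href"),
                "versionurl": found.get("data-versionurl"),
                "versiondate": found.get("data-versiondate"),
                "originalurl": found.get("data-originalurl"),
            })


page = (
    '<a href="http://www.stanford.edu" '
    'data-versionurl="http://archive.is/FAy6o" '
    'data-versiondate="2014-08-15">Stanford</a>'
)
parser = RobustLinkParser()
parser.feed(page)
print(parser.links)
```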
No Link Decorations? Insert Page Date!
• Include page date to allow retrieving Mementos of linked resources
from around page publication date
<html>
<head lang="en"
itemtype="http://schema.org/WebPage"
itemid="http://robustlinks.mementoweb.org/spec/">
<meta itemprop="datePublished" content="2015-01-23">
Herbert Van de Sompel et al. (2015) Robust Links - Link Decorations
http://robustlinks.mementoweb.org/spec/
Robust Links via Link Decoration, JavaScript, Time Travel API
• JavaScript makes link decorations actionable
http://robustlinks.mementoweb.org/demo/uri_references_js.html
The Memento Framework:
Protocol to Integrate Present and Past Web
Pointers
Pointers
• Memento site - http://mementoweb.org/about/
• Time Travel site – http://timetravel.mementoweb.org
• RFC 7089 - http://tools.ietf.org/html/rfc7089 (text version),
http://www.mementoweb.org/guide/rfc/ (HTML version)
• Memento Development List -
http://groups.google.com/group/memento-dev/
• Memento GitHub projects - https://github.com/mementoweb/
• Client and Server software and tools -
http://mementoweb.org/tools/
• Information on TimeGates and TimeMaps for major systems -
http://mementoweb.org/depot/
• IIPC list of software and tools related to web archiving -
http://netpreserve.org/web-archiving/tools-and-software
The Memento Framework:
Protocol to Integrate Present and Past Web
Additional Details
Fixed Resource
• The resource is its own Memento, i.e. it is a stable resource
o Resource that was born stable or became stable; it will not change
anymore, e.g. PermaLink resources on news sites
o Resource provides:
- Link header with “original” link pointing to itself
- Memento-Datetime header
o Note the difference with Last-Modified header: no promise
resource will not change anymore
- Details at http://ws-dl.blogspot.com/2010/11/2010-11-05-memento-datetime-is-not-last.html
Fixed Resource
• Response to HTTP HEAD/GET against
http://a.example.org
Memento Without TimeGate
• The resource is a Memento but there is no TimeGate available for it
o e.g. snapshot of resource when server is being retired
o Resource provides:
- Link header with “original” link revealing the URI of
Original Resource
- Memento-Datetime header
Memento Without TimeGate
• Response to HTTP HEAD/GET against
http://arxiv.example.net/web/20010321203610/http://a.example.org
Intermediate Resource
• The resource issues a redirect to a TimeGate, a Memento, another
intermediate resource
o Plays an active role in the Memento framework
o Resource provides:
- Link header with “original” link revealing the URI of
Original Resource
Intermediate Resource
• Response to HTTP HEAD/GET against a resource that redirects to a
TimeGate
Resource Excluded from Datetime Negotiation
• e.g. JavaScript, logos, banners added by web archives
o Resource always needs to be used in its current state
o In order to flag it is excluded from datetime negotiation, this
resource provides:
- Link header with “type” link that has as value
http://mementoweb.org/terms/donotnegotiate
Resource Excluded from Datetime Negotiation
• Response to HTTP HEAD/GET against a resource that is excluded
from datetime negotiation
Memento of a Redirect
• HTTP responses with 3XX codes are also archived
o e.g. web archives hold on to “301 Moved Permanently” and “302
Found” whereas Linked data archives preserve “303 See Other”
• The Memento’s response must have the same HTTP status code as
the original
• Memento headers are as usual
• Memento clients need to understand that the redirect (URI in Location
header) can be to an Original Resource or to a Memento
o If an Original Resource, the client must proceed to find an
appropriate Memento for it
Memento of a Redirect
• Response in April 2008 to HTTP HEAD/GET against
http://a.example.org
Memento of a Redirect
• Response to a HTTP HEAD/GET of a Memento of that 2008 redirect,
whereby the redirect is unchanged, i.e. it is to the resource to which
the redirect originally led
Memento of a Redirect
• Response to a HTTP HEAD/GET of a Memento of that 2008 redirect,
whereby the redirect is rewritten, i.e. it leads to a Memento of the
resource to which the redirect originally led

Memento 101

  • 1.
    Memento & Accessto Resource Versions Herbert Van de Sompel http://mementoweb.org/ Memento Uniform and Robust Access to Resource Versions Memento has received funding from The Library of Congress Andrew W. Mellon Foundation IIPC 1
  • 2.
    Memento & Accessto Resource Versions Herbert Van de Sompel Memento Makes Navigating the Web’s Past Easy 2 RFC 7089 (2013) Van de Sompel, H., Nelson, M.L., Sanderson, R. HTTP Framework for Time-Based Access to Resource States - Memento http://tools.ietf.org/html/rfc7089
  • 3.
    Memento & Accessto Resource Versions Herbert Van de Sompel Today Select Date June 20 1997 June 5 1997 From archive.today Memento: Access Versions via the Original URI and a Datetime 3
  • 4.
    Memento & Accessto Resource Versions Herbert Van de Sompel Today Select Date June 27 2011 May 29 2011 From Internet Archive Memento: Access Versions via the Original URI and a Datetime 4
  • 5.
    Memento & Accessto Resource Versions Herbert Van de Sompel The Memento protocol achieves this by introducing a uniform, datetime-based, version access capability that integrates the Present and Past Web. 5
  • 6.
    Memento & Accessto Resource Versions Herbert Van de Sompel Problem Statement … 6
  • 7.
    Memento & Accessto Resource Versions Herbert Van de Sompel Resources 7
  • 8.
    Memento & Accessto Resource Versions Herbert Van de Sompel Resources have Representations 8
  • 9.
    Memento & Accessto Resource Versions Herbert Van de Sompel Resources have Representations that Change over Time 9
  • 10.
    Memento & Accessto Resource Versions Herbert Van de Sompel Only the Current Representation is Available from a Resource 10
  • 11.
    Memento & Accessto Resource Versions Herbert Van de Sompel Old Representations are Lost Forever 11
  • 12.
    Memento & Accessto Resource Versions Herbert Van de Sompel But … Archived/Version Resources Exist 12
  • 13.
    Memento & Accessto Resource Versions Herbert Van de Sompel There are resource versions on the Web, in: • Web Archives; • Content Management Systems; • Search engine caches; • Transactional archives. 13
  • 14.
    Web Archive: Archived Resource
    URI-M - http://web.archive.org/web/20010911203610/http://www.cnn.com/
    URI-R - http://www.cnn.com/
  • 15.
    Web Archive: Archived Resource
    URI-M - https://archive.today/UD0d6
    URI-R - http://www.w3.org/
  • 16.
    CMS: Version Resource
    URI-M - http://en.wikipedia.org/w/index.php?title=September_11_attacks&oldid=282333
    URI-R - http://en.wikipedia.org/wiki/September_11_attacks
  • 17.
    Search Engine Cache: Cached Resource
    URI-R - http://ghr.nlm.nih.gov/handbook/basics/dna
    URI-M - http://webcache.googleusercontent.com/search?q=cache:kDmDc1PIA38J:ghr.nlm.nih.gov/handbook/basics/dna+&cd=2&hl=en&ct=clnk&gl=us
  • 18.
    Transactional Archive: Archived Resource
    URI-R - http://dans.knaw.nl/en
    URI-M - http://www.theresourcedepot.com/000010/memento/20130418204153/http://dans.knaw.nl/en
  • 19.
    But, without Memento, the Web handles these version resources poorly:
    • Cannot talk, in URI terms, about a resource as it used to exist
    • Cannot access a prior version knowing the current one
    • Cannot access the current version knowing a prior one
    Solutions are ad hoc and localized
  • 20.
    Without Memento, the Current and Past Web Lack Integration
    • Going from Current to Past Web is a matter of (manual) discovery
    • Navigating the Past Web is only possible within the boundary of a single web archive or versioning system
    • Memento integrates the Current and Past Web by means of an extension of HTTP
    • Memento turns archives and versioning systems into infrastructure rather than destinations
  • 21.
    Systems with Resource Versions
    system type            | stores                  | URI-R and URI-M
    web archive            | observations over time  | different baseURL
    CMS                    | history                 | same baseURL
    search engine cache    | one recent observation  | different baseURL
    transactional archive  | history                 | different baseURL
    These systems have different characteristics, but the Memento protocol allows uniform version access to their resources.
  • 22.
    The Memento Framework: Protocol to Integrate Present and Past Web
    Overview
  • 23.
    The Memento protocol:
    • Regards the Web as a big Content Management System
    • Introduces an interoperable approach to access resource versions across the Web
    • Does not build new archives but leverages all systems that host versions
  • 24.
    Memento’s approach to access resource versions:
    • Is distributed: versions may exist on several servers
    • Uses time as a global version indicator
    • Is based on the primitives of the Web: resource, state, representation, content negotiation, link
  • 25.
    Memento’s approach to access resource versions has two components:
    • Access to a single archived/version resource: via datetime negotiation with a TimeGate
    • Access to an overview of existing versions: by requesting a TimeMap
  • 26.
    Memento Protocol Resource Types
    • Original Resource: Resource that exists or used to exist; we are interested in accessing a past state of it
    • Memento: Resource that is a prior version of the Original Resource; it encapsulates a past state of the Original Resource
    • TimeGate: Resource that “decides”, based on a given datetime, which is the temporally best Memento for an Original Resource
    • TimeMap: Resource that provides a list of known Mementos for an Original Resource as well as their datetimes
  • 27.
    The Memento Framework: Protocol to Integrate Present and Past Web
    Datetime Negotiation
  • 28.
    Original Resource and Mementos
  • 29.
    Bridge from Present to Past
  • 30.
    Bridge from Present to Past
  • 31.
    Bridge from Past to Present
  • 32.
    Bridge from Past to Present
  • 33.
    Memento Datetime Negotiation Component
  • 34.
    Memento Protocol Datetime Negotiation Patterns
    The different Patterns are discussed in RFC 7089. Here, we deal with URI-R <> URI-G <> URI-M and 302 style negotiation.
    [Diagram labels: “can coincide with” (twice); “302 or 200 style negotiation can be used”]
  • 35.
    Memento Datetime Negotiation - Client Server Interaction
    [Diagram labels: “Yes, G”; “It’s at M”]
  • 36.
    Memento Datetime Negotiation - HTTP Flow
    (R = Original Resource, G = TimeGate, M = Memento, T = TimeMap; […] = optional)
    • Request: HEAD R, [Accept-Datetime]. Response: [Link -> G]
    • Request: HEAD G, Accept-Datetime. Response: 302 -> M, Vary, Link -> R, [M, T]
    • Request: GET M, [Accept-Datetime]. Response: 200, Memento-Datetime, Link -> R, [G, M, T]
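    The three request/response pairs in this flow can be sketched as a small client. This is a minimal illustration, not a reference implementation: the "web" is an in-memory dictionary, and all URIs, the `fake_request` helper, and the simplified Link parsing (which assumes `rel` is the first attribute of each link) are assumptions made for the example.

```python
import re

# Hypothetical in-memory "web": URI -> (status, response headers).
# Every URI below is made up for illustration.
FAKE_WEB = {
    "http://a.example.org/": (200, {
        "Link": '<http://tg.example.net/http://a.example.org/>; rel="timegate"',
    }),
    "http://tg.example.net/http://a.example.org/": (302, {
        "Vary": "accept-datetime",
        "Location": "http://arc.example.net/20010911/http://a.example.org/",
        "Link": '<http://a.example.org/>; rel="original"',
    }),
    "http://arc.example.net/20010911/http://a.example.org/": (200, {
        "Memento-Datetime": "Tue, 11 Sep 2001 20:36:10 GMT",
        "Link": '<http://a.example.org/>; rel="original"',
    }),
}


def fake_request(uri, headers=None):
    """Stand-in for an HTTP HEAD/GET; the request headers are ignored here."""
    return FAKE_WEB[uri]


def find_link(link_header, rel):
    """Return the first link target whose rel list contains `rel`, else None."""
    for target, rels in re.findall(r'<([^>]*)>\s*;\s*rel="([^"]*)"', link_header):
        if rel in rels.split():
            return target
    return None


def negotiate(original_uri, accept_datetime):
    """Follow R -> G -> M and return (memento URI, Memento-Datetime value)."""
    request_headers = {"Accept-Datetime": accept_datetime}

    # Step 1: ask the Original Resource; it may advertise a TimeGate.
    _, r_headers = fake_request(original_uri, request_headers)
    timegate = find_link(r_headers.get("Link", ""), "timegate")
    if timegate is None:
        raise LookupError("no timegate link; a client could try a known TimeGate")

    # Step 2: negotiate with the TimeGate (302 style).
    status, g_headers = fake_request(timegate, request_headers)
    if status != 302 or "accept-datetime" not in g_headers.get("Vary", "").lower():
        raise LookupError("response does not look like a 302-style TimeGate")

    # Step 3: fetch the selected Memento.
    memento_uri = g_headers["Location"]
    _, m_headers = fake_request(memento_uri, request_headers)
    return memento_uri, m_headers["Memento-Datetime"]
```

    A real client would replace `fake_request` with actual HTTP calls and would need the fallbacks shown on the following slides for the case where the Original Resource provides no link or no longer exists.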
  • 37.
    Original Resource Provides No Link – Client Intelligence
  • 38.
    Original Resource Gone – Client Intelligence
  • 39.
    Original Resource Gone – Server Due Diligence
  • 40.
    Original Resource’s Server Gone – Client Intelligence
  • 41.
    Memento Aggregator
  • 42.
    TimeGates
    A list of TimeGates provided by major web archives, as well as by-proxy TimeGates provided for other systems, is maintained at:
    http://mementoweb.org/depot/
    http://timetravel.mementoweb.org/guide/api/#registry
  • 43.
    The Memento Framework: Protocol to Integrate Present and Past Web
    TimeMaps
  • 44.
    TimeMap
    • Multiple TimeMap serializations are possible
    • application/link-format is mandatory
    • When TimeMaps become too large, they can be broken up and paged
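    As an illustration of the mandatory link-format serialization, the sketch below extracts the Mementos and their datetimes from a TimeMap document. The sample TimeMap and its example.org/example.net URIs are made up, and the regular expressions assume that every attribute value is double-quoted; a production parser would need to handle the full link-format grammar.

```python
import re
from email.utils import parsedate_to_datetime

# One link: <target> followed by zero or more ;name="value" attributes.
LINK_RE = re.compile(r'<([^>]+)>((?:\s*;\s*[\w-]+="[^"]*")*)')
ATTR_RE = re.compile(r'([\w-]+)="([^"]*)"')

# Hypothetical TimeMap body, for illustration only.
SAMPLE_TIMEMAP = (
    '<http://a.example.org>;rel="original",\n'
    '<http://arxiv.example.net/web/20091027204954/http://a.example.org>'
    ';rel="last memento";datetime="Tue, 27 Oct 2009 20:49:54 GMT",\n'
    '<http://arxiv.example.net/web/20000620180259/http://a.example.org>'
    ';rel="first memento";datetime="Tue, 20 Jun 2000 18:02:59 GMT"'
)


def parse_timemap(text):
    """Return a list of (datetime, memento URI) tuples, oldest first."""
    mementos = []
    for uri, attr_text in LINK_RE.findall(text):
        attrs = dict(ATTR_RE.findall(attr_text))
        # Keep only links that carry the "memento" relation type.
        if "memento" in attrs.get("rel", "").split() and "datetime" in attrs:
            mementos.append((parsedate_to_datetime(attrs["datetime"]), uri))
    return sorted(mementos)
```

    Splitting the document on commas would not work, because the datetime values themselves contain commas; matching whole links avoids that problem.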
  • 45.
    TimeMaps
    A list of TimeMaps provided by major web archives, as well as by-proxy TimeMaps provided for other systems, is maintained at:
    http://mementoweb.org/depot/
    http://timetravel.mementoweb.org/guide/api/#registry
  • 46.
    The Memento Framework: Protocol to Integrate Present and Past Web
    HTTP Headers
  • 47.
    The HTTP Headers used in the Memento Protocol
    • Define two new headers:
      – request: Accept-Datetime:
      – response: Memento-Datetime:
    • Introduce new content for two existing headers:
      – response: Vary: ; Link:
    • Use one existing header without modification:
      – response: Location:, TCN:
  • 48.
    HTTP Request Headers for Datetime Negotiation
    • Accept-Datetime:
      o Issued against TimeGate, [Original Resource, Memento]
      o Header value: desired datetime of a Memento
    Accept-Datetime: Mon, 12 Oct 2009 14:20:33 GMT
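    The example value above is an HTTP date expressed in GMT. A minimal sketch of producing such a value with the Python standard library follows; the function name `accept_datetime_value` is an assumption made for this example.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def accept_datetime_value(moment):
    """Format a timezone-aware datetime as an Accept-Datetime header value."""
    if moment.tzinfo is None:
        # A naive datetime has no defined relation to GMT, so refuse it.
        raise ValueError("a timezone-aware datetime is required")
    return format_datetime(moment.astimezone(timezone.utc), usegmt=True)


desired = datetime(2009, 10, 12, 14, 20, 33, tzinfo=timezone.utc)
headers = {"Accept-Datetime": accept_datetime_value(desired)}
```

    The same formatting applies to the Memento-Datetime response header, which uses the same date syntax.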
  • 49.
    HTTP Response Headers for Datetime Negotiation
    • Memento-Datetime:
      o Returned by Mementos only
        - Even when not as a result of datetime negotiation
      o Header value: archival datetime of the Memento
        - Resource has not changed and will not change beyond that date
      o This header is sticky:
        - Once returned, a server must always return it with the same value
        - Must also be preserved when Mementos are mirrored at different URIs
      o This header is crucial to allow a client to understand it has arrived at a Memento
    Memento-Datetime: Mon, 12 Oct 2009 14:20:33 GMT
  • 50.
    HTTP Response Headers for Datetime Negotiation
    • Vary:
      o Returned by TimeGate
      o Similar to regular content negotiation
      o Header value: accept-datetime
    • Regular content negotiation (e.g. media type) can be used too, but a TimeGate must first meet the datetime preference, and then, if possible, the other content negotiation preferences
    • Note: accept-datetime in the Vary header is crucial to allow a client to understand it has arrived at a TimeGate
    Vary: accept-datetime
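    Since Memento-Datetime signals a Memento and accept-datetime in Vary signals a TimeGate, a client can classify a response from its headers alone. The sketch below is one possible way to do so; the function name and the returned labels are assumptions, and checking Memento-Datetime first is a simplification for responses that carry both headers.

```python
def classify_response(headers):
    """Return "memento", "timegate", or "other" based on response headers."""
    # HTTP header names are case-insensitive, so normalize them first.
    lowered = {name.lower(): value for name, value in headers.items()}
    if "memento-datetime" in lowered:
        return "memento"
    vary_tokens = [token.strip().lower() for token in lowered.get("vary", "").split(",")]
    if "accept-datetime" in vary_tokens:
        return "timegate"
    return "other"
```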
  • 51.
    HTTP Response Headers for Datetime Negotiation
    • Location:
      o Returned by TimeGate
      o Similar to regular content negotiation
      o Header value: URI of the Memento selected by the TimeGate
    Location: http://web.archive.org/web/20010911223004/http://cnn.com
  • 52.
    HTTP Response Headers for Datetime Negotiation
    • Link:
      o Returned by Original Resource, TimeGate and Mementos
      o Various new Relation Types are introduced:
        - “original”: points to Original Resource
        - “timegate”: points to TimeGate
        - “memento”: points to Memento
        - “timemap”: points to TimeMap
      o A TimeGate must provide the “original” link
      o A Memento must provide the “original” link
      o All other links are encouraged but optional
    HTTP Link Header: RFC 5988
  • 53.
    HTTP Response Headers for Datetime Negotiation
    • Link:
      o The following “memento” links that point at special Mementos, known to the responding server, are optional but very useful:
        - First and last Memento known to the server, e.g. “memento first”
        - Memento prior to and after the selected Memento, e.g. “memento predecessor-version”
        - Selected Memento
        - Temporal order of Mementos is expressed using existing relation types from RFC 5829 and RFC 5988: first, last, next, prev, successor-version, predecessor-version
  • 54.
    HTTP Response Headers for Datetime Negotiation
    • Link:
      o Attributes for a “memento” Link:
        - datetime (mandatory): datetime of the Memento pointed at by the link
        - license (optional): license associated with the Memento
      o Attributes for a “timemap” Link:
        - type (recommended): MIME type of TimeMap serialization
        - from, until (optional): to convey the temporal interval of Memento datetimes covered by the TimeMap
  • 55.
    Memento Datetime Negotiation - HTTP Flow
    (link relation types shown after each Link; […] = optional)
    • Request: HEAD R, [Accept-Datetime]. Response: [Link -> G], relation: [timegate]
    • Request: HEAD G, Accept-Datetime. Response: 302 -> M, Vary, Link -> R [M T], relations: original [memento timemap]
    • Request: GET M, [Accept-Datetime]. Response: 200, Memento-Datetime, Link -> R [G M T], relations: original [timegate memento timemap]
  • 56.
    The Memento Framework: Protocol to Integrate Present and Past Web
    HTTP Interactions
  • 57.
    Datetime Negotiation Flow: Step 1
  • 58.
    Datetime Negotiation Flow: Step 2
  • 59.
    Datetime Negotiation Flow: Step 3
  • 60.
    Datetime Negotiation Flow: Step 4
  • 61.
    Datetime Negotiation Flow: Step 5
  • 62.
    Datetime Negotiation Flow: Step 6
  • 63.
    TimeMap Access Flow: Step 1
  • 64.
    TimeMap Access Flow: Step 2
  • 65.
    TimeMap Access Flow: Step 3
  • 66.
    TimeMap Access Flow: Step 4
  • 67.
    TimeMap Access Flow: Step 5
  • 68.
    TimeMap Access Flow: Step 6
  • 69.
    TimeMap Access Flow: Step 6 with Index TimeMap
  • 70.
    TimeMap Access Flow: Step 6 with Paging TimeMap
  • 71.
    The Memento Framework: Protocol to Integrate Present and Past Web
    Resource Versioning and Memento
  • 72.
    Common Resource Versioning Approach
  • 73.
    Version Resources (*)
    (*) Tim Berners-Lee (1996) http://www.w3.org/DesignIssues/Generic.html
  • 74.
    Version Resources and Associated Generic Resource (*)
    (*) Tim Berners-Lee (1996) http://www.w3.org/DesignIssues/Generic.html
  • 75.
    Memento Bridges Between Generic & Specific Resources
  • 76.
    Memento Bridges Between Generic & Specific Resources
  • 77.
    Memento Bridges Between Generic & Specific Resources
  • 78.
    Stepwise Support for the Memento Protocol – Step 1
    • Provide Memento protocol HTTP response headers to convey version date and links
      o Provide Memento-Datetime header to express version date
      o Provide Link header with “original” link to point from version resource to generic resource
      o Provide Link header with appropriate “memento” links to allow navigating between versions
        - In combination with links with other relation types, e.g. “first”, “last”, “prev”, “next”, “predecessor-version”, “successor-version”
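    A server-side sketch of Step 1 follows: it assembles the Memento-Datetime header and a Link header for one version resource. The function names, the shape of the `neighbours` argument, and the example.org URIs used with it are assumptions made for illustration; how a given CMS or versioning system would attach these headers to its responses is outside the scope of the sketch.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def http_date(moment):
    """Format a timezone-aware datetime as an HTTP date in GMT."""
    return format_datetime(moment.astimezone(timezone.utc), usegmt=True)


def memento_headers(original_uri, version_datetime, neighbours=()):
    """Build Step 1 response headers for a version resource.

    `neighbours` is an iterable of (relation, uri, datetime) tuples, for
    example ("prev", "http://example.org/doc/v1", datetime(...)). Each one
    becomes a "memento" link combined with the given relation type.
    """
    links = [f'<{original_uri}>; rel="original"']
    for relation, uri, moment in neighbours:
        links.append(
            f'<{uri}>; rel="{relation} memento"; datetime="{http_date(moment)}"'
        )
    return {
        "Memento-Datetime": http_date(version_datetime),
        "Link": ", ".join(links),
    }
```

    The mandatory `datetime` attribute is emitted for every “memento” link, so that a client can navigate between versions without fetching them.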
  • 79.
    Stepwise Support for the Memento Protocol – Step 1
    • Response to HTTP HEAD/GET against http://www.w3.org/TR/2004/PR-webarch-20041105/
  • 80.
    Stepwise Support for the Memento Protocol – Step 2
    • Publish a TimeMap, at, say, http://www.w3.org/TR/timemap/webarch/
    • For the generic resource and for each version resource, provide a Link header with a “timemap” link that points at the TimeMap
  • 81.
    Stepwise Support for the Memento Protocol – Step 2
    • Response to HTTP HEAD/GET against http://www.w3.org/TR/2004/PR-webarch-20041105/
  • 82.
    Stepwise Support for the Memento Protocol – Step 2
    • Response to HTTP GET against http://www.w3.org/TR/timemap/webarch/
  • 83.
    Stepwise Support for the Memento Protocol – Step 3
    • Expose a TimeGate, at, say, http://www.w3.org/TR/timegate/webarch/
    • Responses for generic resource, version resources, TimeGate, TimeMap as shown in slides 56-70
    • Note that Patterns for datetime negotiation other than the one shown in those slides are described in RFC 7089
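    At the core of a TimeGate is the choice of the temporally best Memento for a requested datetime. The sketch below picks the version closest in time, which is only one possible selection rule and is an assumption of this example; the function name and the (datetime, URI) tuple layout are likewise hypothetical.

```python
from bisect import bisect_left
from datetime import datetime, timezone


def best_memento(mementos, requested):
    """Pick the (datetime, uri) entry closest in time to `requested`.

    `mementos` must be sorted by datetime, oldest first. Ties go to the
    earlier version.
    """
    if not mementos:
        raise ValueError("no Mementos known for this resource")
    datetimes = [moment for moment, _ in mementos]
    index = bisect_left(datetimes, requested)
    if index == 0:
        return mementos[0]       # requested datetime precedes all versions
    if index == len(mementos):
        return mementos[-1]      # requested datetime follows all versions
    before, after = mementos[index - 1], mementos[index]
    if requested - before[0] <= after[0] - requested:
        return before
    return after
```

    A TimeGate built on this would return the selected URI in the Location header of its 302 response.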
  • 84.
    The Memento Framework: Protocol to Integrate Present and Past Web
    Memento and Linked Data
  • 87.
    The Memento Framework: Protocol to Integrate Present and Past Web
    User & Developer Tools
  • 88.
    Memento for Chrome
    http://bit.ly/memento-for-chrome
  • 89.
    Time Travel Find – Search Page
    http://timetravel.mementoweb.org/
  • 90.
    Time Travel Find – Result Page
    http://timetravel.mementoweb.org/list/20100428103432/http://stanford.edu
  • 91.
    Time Travel Find – Result Page
    http://timetravel.mementoweb.org/list/20100428103432/http://stanford.edu
  • 92.
    Time Travel Find – Search Page
    http://timetravel.mementoweb.org/
  • 93.
    Time Travel Find – Result Page
    http://timetravel.mementoweb.org/list/20140428052227/http://coptr.digipres.org/Main_Page
  • 94.
    Time Travel Reconstruct – Search Page
    http://timetravel.mementoweb.org/
  • 95.
    Time Travel Reconstruct – Result Page
    http://timetravel.mementoweb.org/reconstruct/20100428103432/http://stanford.edu
  • 96.
    Memento for MediaWiki Extensions
    http://bit.ly/memento-for-mediawiki
  • 97.
    Generic TimeGate Server (1/2)
    https://github.com/mementoweb/timegate
  • 98.
    Generic TimeGate Server (2/2)
    https://github.com/mementoweb/timegate
  • 99.
    SiteStory Transactional Archive for Apache Servers
    https://mementoweb.github.io/SiteStory/
  • 100.
    Memento Aggregator
    Coverage: See http://mementoweb.org/depot/ and http://labs.mementoweb.org/aggregator_config/archivelist.xml
  • 101.
    Various Memento Tools for Users & Developers
    http://mementoweb.org/tools/
  • 102.
    The Memento Framework: Protocol to Integrate Present and Past Web
    Time Travel APIs
  • 103.
    Time Travel APIs
    http://timetravel.mementoweb.org/guide/api/
  • 104.
    URI that Redirects to a Memento
    http://timetravel.mementoweb.org/memento/20100428103432/http://stanford.edu
  • 105.
    URI that Redirects to a JSON Description of a Memento
    http://timetravel.mementoweb.org/api/json/20100428103432/http://stanford.edu
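    The Time Travel URIs shown on these slides share one shape: a service path, a 14-digit datetime, and the original URI. A small sketch of composing them follows; the function name and the `kind` keys are assumptions, while the four path segments are the ones that appear in the example URIs of this deck.

```python
from datetime import datetime

BASE = "http://timetravel.mementoweb.org"

# Path segments as they appear in the example URIs of this deck.
PATHS = {
    "memento": "memento",
    "json": "api/json",
    "list": "list",
    "reconstruct": "reconstruct",
}


def timetravel_uri(kind, moment, original_uri):
    """Compose a Time Travel URI for the given service, datetime and URI."""
    return f"{BASE}/{PATHS[kind]}/{moment:%Y%m%d%H%M%S}/{original_uri}"


uri = timetravel_uri("memento", datetime(2010, 4, 28, 10, 34, 32), "http://stanford.edu")
```

    Consult http://timetravel.mementoweb.org/guide/api/ for the authoritative description of these services before relying on the URI patterns.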
  • 106.
    JSON Format for TimeMaps
    http://mementoweb.org/guide/timemap-json/
  • 107.
    DIY TimeMap - Index TimeMap Lists Potential TimeMap URIs
    http://timetravel.mementoweb.org/timemap/json/http://stanford.edu
    SPEED
  • 108.
    WDI TimeMap - Regular (Index) TimeMap
    http://labs.mementoweb.org/timemap/link/http://stanford.edu
    COVERAGE
  • 109.
    Time Travel Archive Registry
    http://labs.mementoweb.org/aggregator_config/archivelist.xml
  • 110.
    The Memento Framework: Protocol to Integrate Present and Past Web
    Robust Links
  • 111.
    How to Reference Resources
    • Create a Capture in Internet Archive, archive.today, perma.cc, webcitation
    • Existing practice for linking to such captures:
      o Link to URI of Capture
      o Lose Original URI
      o Lose Capture Datetime
    • Problems with existing practice:
      o Impossible to visit the original URI, if desired
      o Requires the permanent existence/uptime of the archive that holds the capture
        - One link rot problem replaced by another
    Van de Sompel, H. et al. (2013) Thoughts on referencing, linking, reference rot. http://mementoweb.org/missing-link/
  • 112.
    Permanent Existence/Uptime of Archives?
    Capture of http://webcitation.org dated July 17 2013: https://archive.today/eAETp
  • 113.
    Permanent Existence/Uptime of Archives?
    http://webcitation.org/ on August 6 2014
  • 114.
    Permanent Existence/Uptime of Archives?
    Remnant of discontinued web archive http://mummify.it captured on February 14 2014: https://web.archive.org/web/20140214233752/https://www.mummify.it/
  • 115.
    Permanent Existence/Uptime of Archives?
    http://www.themoscowtimes.com/news/article/russia-bans-wayback-machine-internet-archive-over-islamic-state-video/510074.html
  • 116.
    Hacking Original URI, Capture Datetime from Capture URI?
    URI of Capture | Original URI | Datetime T
    https://web.archive.org/web/20140214233752/https://www.mummify.it | yes | yes
    https://archive.today/eAETp | no | no
    http://perma.cc/4RH7-999Q?type=source | no | no
    http://en.wikipedia.org/w/index.php?title=Coil_(band)&oldid=388321480 | no | no
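    The table can be reproduced with a short sketch: only capture URIs that embed a 14-digit datetime followed by the original URI can be taken apart. The regular expression below targets that URI layout and is an assumption of this example, as is the treatment of the embedded datetime as UTC. This kind of URI hacking is exactly the fragile, archive-specific practice that the slide questions.

```python
import re
from datetime import datetime, timezone

# Matches capture URIs of the form <host>/web/<14 digits>/<original URI>.
CAPTURE_RE = re.compile(r"^https?://[^/]+/web/(\d{14})/(.+)$")


def split_capture_uri(capture_uri):
    """Return (original URI, capture datetime), or None if not recoverable."""
    match = CAPTURE_RE.match(capture_uri)
    if match is None:
        return None
    stamp, original_uri = match.groups()
    # Assumption: the embedded datetime is expressed in UTC.
    moment = datetime.strptime(stamp, "%Y%m%d%H%M%S").replace(tzinfo=timezone.utc)
    return original_uri, moment
```

    For the opaque identifiers used by archive.today and perma.cc the function returns None, which is why a protocol-level “original” link and Memento-Datetime header are needed instead.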
  • 117.
    Using Capture URI to find Captures in Other Web Archives?
  • 118.
    Using Capture URI to find Captures in Other Web Archives?
  • 119.
    Reference Resources Robustly
    • When referencing resources include:
      o Original URI: allows revisiting the URI as it is at the time of reading, if the URI is still operational
      o Snapshot URI: allows revisiting the snapshot, if one was created, and if the web archive in which it was created is still operational
      o Original URI & Date/Time: allows revisiting a snapshot created around the Date/Time in any web archive around the world (using Memento infrastructure)
    Van de Sompel, H. et al. (2013) Thoughts on referencing, linking, reference rot. http://mementoweb.org/missing-link/
  • 120.
    Reference Resources Actionably
    • When referencing resources, use Link Decorations to convey Original URI, Snapshot URI, Date/Time
    <a href="http://www.stanford.edu" data-versionurl="http://archive.is/FAy6o" data-versiondate="2014-08-15">
    <a href="http://www.stanford.edu" data-versiondate="2014-08-15">
    <a href="http://archive.is/FAy6o" data-originalurl="http://www.stanford.edu" data-versiondate="2014-08-15">
    Herbert Van de Sompel et al. (2015) Robust Links - Link Decorations. http://robustlinks.mementoweb.org/spec/
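    Link decorations of the first two kinds can be generated mechanically when a page is authored. The sketch below is one way to do it; the function name and parameters are assumptions, and only the `href`, `data-versionurl` and `data-versiondate` attributes shown on the slide are emitted.

```python
from html import escape


def robust_link(text, original_uri, version_date, snapshot_uri=None):
    """Render an HTML anchor decorated with Robust Links attributes.

    The link points at the original URI; the snapshot URI, if any, and the
    version date are carried in data- attributes.
    """
    attributes = [("href", original_uri)]
    if snapshot_uri is not None:
        attributes.append(("data-versionurl", snapshot_uri))
    attributes.append(("data-versiondate", version_date))
    rendered = " ".join(
        f'{name}="{escape(value, quote=True)}"' for name, value in attributes
    )
    return f"<a {rendered}>{escape(text)}</a>"
```

    With a snapshot URI the output corresponds to the first decoration shown above; without one it corresponds to the second, which relies on Memento infrastructure to find a snapshot near the version date.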
  • 121.
    No Link Decorations? Insert Page Date!
    • Include page date to allow retrieving Mementos of linked resources from around page publication date
    <html>
    <head lang="en" itemtype="http://schema.org/WebPage" itemid="http://robustlinks.mementoweb.org/spec/">
    <meta itemprop="datePublished" content="2015-01-23">
    Herbert Van de Sompel et al. (2015) Robust Links - Link Decorations. http://robustlinks.mementoweb.org/spec/
  • 122.
    Robust Links via Link Decoration, JavaScript, Time Travel API
    • JavaScript makes link decorations actionable
    http://robustlinks.mementoweb.org/demo/uri_references_js.html
  • 123.
    The Memento Framework: Protocol to Integrate Present and Past Web
    Pointers
  • 124.
    Pointers
    • Memento site - http://mementoweb.org/about/
    • Time Travel site - http://timetravel.mementoweb.org
    • RFC 7089 - http://tools.ietf.org/html/rfc7089 (text version), http://www.mementoweb.org/guide/rfc/ (HTML version)
    • Memento Development List - http://groups.google.com/group/memento-dev/
    • Memento GitHub projects - https://github.com/mementoweb/
    • Client and Server software and tools - http://mementoweb.org/tools/
    • Information on TimeGates and TimeMaps for major systems - http://mementoweb.org/depot/
    • IIPC list of software and tools related to web archiving - http://netpreserve.org/web-archiving/tools-and-software
  • 125.
    The Memento Framework: Protocol to Integrate Present and Past Web
    Additional Details
  • 126.
    Fixed Resource
    • The resource is its own Memento, i.e. it is a stable resource
      o Resource that was born stable or became stable; it will not change anymore, e.g. PermaLink resources on news sites
      o Resource provides:
        - Link header with “original” link pointing to itself
        - Memento-Datetime header
      o Note the difference with the Last-Modified header: no promise the resource will not change anymore
        - Details at http://ws-dl.blogspot.com/2010/11/2010-11-05-memento-datetime-is-not-last.html
  • 127.
    Fixed Resource
    • Response to HTTP HEAD/GET against http://a.example.org
  • 128.
    Memento Without TimeGate
    • The resource is a Memento but there is no TimeGate available for it
      o e.g. snapshot of resource when server is being retired
      o Resource provides:
        - Link header with “original” link revealing the URI of Original Resource
        - Memento-Datetime header
  • 129.
    Memento Without TimeGate
    • Response to HTTP HEAD/GET against http://arxiv.example.net/web/20010321203610/http://a.example.org
  • 130.
    Intermediate Resource
    • The resource issues a redirect to a TimeGate, a Memento, or another intermediate resource
      o Plays an active role in the Memento framework
      o Resource provides:
        - Link header with “original” link revealing the URI of Original Resource
  • 131.
    Intermediate Resource
    • Response to HTTP HEAD/GET against a resource that redirects to a TimeGate
  • 132.
    Resource Excluded from Datetime Negotiation
    • e.g. JavaScript, logos, banners added by web archives
      o Resource always needs to be used in its current state
      o In order to flag that it is excluded from datetime negotiation, this resource provides:
        - Link header with “type” link that has as value http://mementoweb.org/terms/donotnegotiate
  • 133.
    Resource Excluded from Datetime Negotiation
    • Response to HTTP HEAD/GET against a resource that is excluded from datetime negotiation
  • 134.
    Memento of a Redirect
    • HTTP responses with 3XX codes are also archived
      o e.g. web archives hold on to “301 Moved Permanently” and “302 Found” whereas Linked Data archives preserve “303 See Other”
    • The Memento’s response must have the same HTTP status code as the original
    • Memento headers are as usual
    • Memento clients need to understand that the redirect (URI in Location header) can be to an Original Resource or to a Memento
      o If an Original Resource, the client must proceed to find an appropriate Memento for it
  • 135.
    Memento of a Redirect
    • Response in April 2008 to HTTP HEAD/GET against http://a.example.org
  • 136.
    Memento of a Redirect
    • Response to an HTTP HEAD/GET of a Memento of that 2008 redirect, whereby the redirect is unchanged, i.e. it is to the resource to which the redirect originally led
  • 137.
    Memento of a Redirect
    • Response to an HTTP HEAD/GET of a Memento of that 2008 redirect, whereby the redirect is rewritten, i.e. it leads to a Memento of the resource to which the redirect originally led
  • 138.
    Memento: Uniform and Robust Access to Resource Versions
    http://mementoweb.org/
    Memento has received funding from the Library of Congress, the Andrew W. Mellon Foundation, and IIPC.