Memento 101

This presentation provides an overview of the Memento "Time Travel for the Web" framework that is aligned with the stable version of the Memento protocol, specified in RFC 7089.

Published in: Internet

  1. Memento & Access to Resource Versions. Herbert Van de Sompel. http://mementoweb.org/ Memento: Uniform and Robust Access to Resource Versions. Memento has received funding from The Library of Congress, the Andrew W. Mellon Foundation, and IIPC.
  2. Memento Makes Navigating the Web’s Past Easy. RFC 7089 (2013): Van de Sompel, H., Nelson, M.L., Sanderson, R. HTTP Framework for Time-Based Access to Resource States - Memento. http://tools.ietf.org/html/rfc7089
  3. Memento: Access Versions via the Original URI and a Datetime (screenshot labels: Today; Select Date: June 20 1997; June 5 1997, from archive.today)
  4. Memento: Access Versions via the Original URI and a Datetime (screenshot labels: Today; Select Date: June 27 2011; May 29 2011, from Internet Archive)
  5. The Memento protocol achieves this by introducing a uniform, datetime-based, version access capability that integrates the Present and Past Web.
  6. Problem Statement …
  7. Resources
  8. Resources have Representations
  9. Resources have Representations that Change over Time
  10. Only the Current Representation is Available from a Resource
  11. Old Representations are Lost Forever
  12. But … Archived/Version Resources Exist
  13. There are resource versions on the Web, in:
      • Web Archives;
      • Content Management Systems;
      • Search engine caches;
      • Transactional archives.
  14. Web Archive: Archived Resource
      URI-M - http://web.archive.org/web/20010911203610/http://www.cnn.com/
      URI-R - http://www.cnn.com/
  15. Web Archive: Archived Resource
      URI-M - https://archive.today/UD0d6
      URI-R - http://www.w3.org/
  16. CMS: Version Resource
      URI-M - http://en.wikipedia.org/w/index.php?title=September_11_attacks&oldid=282333
      URI-R - http://en.wikipedia.org/wiki/September_11_attacks
  17. Search Engine Cache: Cached Resource
      URI-R - http://ghr.nlm.nih.gov/handbook/basics/dna
      URI-M - http://webcache.googleusercontent.com/search?q=cache:kDmDc1PIA38J:ghr.nlm.nih.gov/handbook/basics/dna+&cd=2&hl=en&ct=clnk&gl=us
  18. Transactional Archive: Archived Resource
      URI-R - http://dans.knaw.nl/en
      URI-M - http://www.theresourcedepot.com/000010/memento/20130418204153/http://dans.knaw.nl/en
  19. But, without Memento, the Web handles these version resources poorly:
      • Cannot talk, in URI terms, about a resource as it used to exist
      • Cannot access a prior version knowing the current one
      • Cannot access the current version knowing a prior one
      Solutions are ad hoc and localized
  20. Without Memento, the Current and Past Web Lack Integration
      • Going from Current to Past Web is a matter of (manual) discovery
      • Navigating the Past Web is only possible within the boundary of a single web archive or versioning system
      • Memento integrates the Current and Past Web by means of an extension of HTTP
      • Memento turns archives and versioning systems into infrastructure rather than destinations
  21. Systems with Resource Versions
      system type            stores                  URI-R and URI-M
      web archive            observations over time  different baseURL
      CMS                    history                 same baseURL
      search engine cache    one recent observation  different baseURL
      transactional archive  history                 different baseURL
      These systems have different characteristics, but the Memento protocol allows uniform version access to their resources.
  22. The Memento Framework: Protocol to Integrate Present and Past Web - Overview
  23. The Memento protocol:
      • Regards the Web as a big Content Management System
      • Introduces an interoperable approach to access resource versions across the Web
      • Does not build new archives but leverages all systems that host versions
  24. Memento’s approach to access resource versions:
      • Is distributed: versions may exist on several servers
      • Uses time as a global version indicator
      • Is based on the primitives of the Web: resource, state, representation, content negotiation, link
  25. Memento’s approach to access resource versions has two components:
      • Access to a single archived/version resource – via datetime negotiation with a TimeGate
      • Access to an overview of existing versions – by requesting a TimeMap
  26. Memento Protocol Resource Types
      • Original Resource: Resource that exists or used to exist; we are interested in accessing a past state of it
      • Memento: Resource that is a prior version of the Original Resource; it encapsulates a past state of the Original Resource
      • TimeGate: Resource that “decides”, based on a given datetime, which is the temporally best Memento for an Original Resource
      • TimeMap: Resource that provides a list of known Mementos for an Original Resource as well as their datetimes
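The TimeGate's "decision" can be illustrated with a minimal sketch. It assumes that "temporally best" means "closest in time to the requested datetime"; a real TimeGate is free to apply a different policy. The function name `closest_memento` and the example URIs under `arxiv.example.net` are illustrative, not part of the protocol.

```python
from datetime import datetime, timezone


def closest_memento(mementos, target):
    """Return the (uri, datetime) pair whose datetime is nearest to target.

    Assumption: "temporally best" is read as "smallest absolute distance".
    """
    if not mementos:
        raise ValueError("no Mementos known for this Original Resource")
    return min(mementos, key=lambda m: abs(m[1] - target))


# Hypothetical list of Mementos for http://a.example.org
MEMENTOS = [
    ("http://arxiv.example.net/web/20000620180259/http://a.example.org",
     datetime(2000, 6, 20, 18, 2, 59, tzinfo=timezone.utc)),
    ("http://arxiv.example.net/web/20010321203610/http://a.example.org",
     datetime(2001, 3, 21, 20, 36, 10, tzinfo=timezone.utc)),
    ("http://arxiv.example.net/web/20080409203051/http://a.example.org",
     datetime(2008, 4, 9, 20, 30, 51, tzinfo=timezone.utc)),
]

best_uri, best_dt = closest_memento(
    MEMENTOS, datetime(2001, 1, 1, tzinfo=timezone.utc))
print(best_uri)
```

A TimeMap, by contrast, would simply return the whole list rather than a single choice.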
  27. The Memento Framework: Protocol to Integrate Present and Past Web - Datetime Negotiation
  28. Original Resource and Mementos
  29. Bridge from Present to Past
  30. Bridge from Present to Past
  31. Bridge from Past to Present
  32. Bridge from Past to Present
  33. Memento Datetime Negotiation Component
  34. Memento Protocol Datetime Negotiation Patterns. The different Patterns are discussed in RFC 7089. Here, we deal with URI-R <> URI-G <> URI-M and 302 style negotiation. (Diagram labels: “can coincide with”; “can coincide with”; “302 or 200 style negotiation can be used”.)
  35. Memento Datetime Negotiation - Client Server Interaction (diagram labels: “Yes, G”; “It’s at M”)
  36. Memento Datetime Negotiation - HTTP Flow ([…] = optional)
      1. Request:  HEAD R, [Accept-Datetime]
         Response: [Link -> G]
      2. Request:  HEAD G, Accept-Datetime
         Response: 302 -> M, Vary, Link -> R, [M, T]
      3. Request:  GET M, [Accept-Datetime]
         Response: 200, Memento-Datetime, Link -> R, [G, M, T]
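The three request/response pairs of this flow can be sketched as a small client routine. This is an illustration under stated assumptions, not a reference client: the transport is injected as a hypothetical `send(method, uri, headers)` callable so the sketch runs without network access, header names are looked up case-sensitively for brevity, and the `fake_send` responses and `example` URIs are invented for the demonstration.

```python
import re


def find_link(link_header, rel):
    """Return the first target URI whose rel value contains `rel`, else None."""
    for uri, params in re.findall(r'<([^>]*)>([^<]*)', link_header or ""):
        match = re.search(r'rel="([^"]*)"', params)
        if match and rel in match.group(1).split():
            return uri
    return None


def negotiate(uri_r, accept_datetime, send):
    """Follow R -> G -> M using 302-style datetime negotiation."""
    # Step 1: ask the Original Resource where its TimeGate is.
    _, headers = send("HEAD", uri_r, {"Accept-Datetime": accept_datetime})
    uri_g = find_link(headers.get("Link"), "timegate")
    if uri_g is None:
        raise LookupError("Original Resource advertises no TimeGate")
    # Step 2: negotiate with the TimeGate; expect a redirect to a Memento.
    status, headers = send("HEAD", uri_g, {"Accept-Datetime": accept_datetime})
    if status != 302 or "accept-datetime" not in headers.get("Vary", "").lower():
        raise LookupError("response does not look like a 302-style TimeGate")
    uri_m = headers["Location"]
    # Step 3: fetch the Memento; it must identify itself via Memento-Datetime.
    status, headers = send("GET", uri_m, {})
    if status != 200 or "Memento-Datetime" not in headers:
        raise LookupError("selected resource is not a Memento")
    return uri_m, headers["Memento-Datetime"]


# Invented stand-in for three servers, keyed by URI.
R = "http://a.example.org"
G = "http://arxiv.example.net/timegate/http://a.example.org"
M = "http://arxiv.example.net/web/20010321203610/http://a.example.org"


def fake_send(method, uri, headers):
    responses = {
        R: (200, {"Link": f'<{G}>; rel="timegate"'}),
        G: (302, {"Location": M, "Vary": "accept-datetime",
                  "Link": f'<{R}>; rel="original"'}),
        M: (200, {"Memento-Datetime": "Wed, 21 Mar 2001 20:36:10 GMT",
                  "Link": f'<{R}>; rel="original", <{G}>; rel="timegate"'}),
    }
    return responses[uri]


result = negotiate(R, "Thu, 01 Mar 2001 00:00:00 GMT", fake_send)
print(result)
```

Swapping `fake_send` for a function backed by a real HTTP library would be the next step; the "client intelligence" cases in the following slides correspond to what a client does when step 1 yields no TimeGate link.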
  37. Original Resource Provides No Link – Client Intelligence
  38. Original Resource Gone – Client Intelligence
  39. Original Resource Gone – Server Due Diligence
  40. Original Resource’s Server Gone – Client Intelligence
  41. Memento Aggregator
  42. TimeGates: A list of TimeGates provided by major web archives, as well as by-proxy TimeGates provided for other systems, is maintained at http://mementoweb.org/depot/ and http://timetravel.mementoweb.org/guide/api/#registry
  43. The Memento Framework: Protocol to Integrate Present and Past Web - TimeMaps
  44. TimeMap
      • Multiple TimeMap serializations are possible
      • application/link-format is mandatory
      • When TimeMaps become too large, they can be broken up and paged
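As an illustration of the mandatory serialization, the sketch below renders a small TimeMap as application/link-format text. It is a simplified reading of the format: optional attributes such as `from`/`until` and the `first`/`last` relation types are left out, and the function name `serialize_timemap` and all `example` URIs are assumptions made for the example.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def http_date(dt):
    """Format an aware datetime the way Memento datetimes are written."""
    return format_datetime(dt.astimezone(timezone.utc), usegmt=True)


def serialize_timemap(original, timegate, timemap, mementos):
    """Render a TimeMap as link-format text.

    `mementos` is a list of (uri, aware datetime) pairs; entries are
    emitted in chronological order.
    """
    lines = [
        f'<{original}>; rel="original"',
        f'<{timemap}>; rel="self"; type="application/link-format"',
        f'<{timegate}>; rel="timegate"',
    ]
    for uri, dt in sorted(mementos, key=lambda m: m[1]):
        lines.append(f'<{uri}>; rel="memento"; datetime="{http_date(dt)}"')
    return ",\n".join(lines)


text = serialize_timemap(
    "http://a.example.org",
    "http://arxiv.example.net/timegate/http://a.example.org",
    "http://arxiv.example.net/timemap/http://a.example.org",
    [
        ("http://arxiv.example.net/web/20010321203610/http://a.example.org",
         datetime(2001, 3, 21, 20, 36, 10, tzinfo=timezone.utc)),
        ("http://arxiv.example.net/web/20000620180259/http://a.example.org",
         datetime(2000, 6, 20, 18, 2, 59, tzinfo=timezone.utc)),
    ],
)
print(text)
```

Paging a large TimeMap would amount to splitting the Memento list and linking the resulting documents to each other.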
  45. TimeMaps: A list of TimeMaps provided by major web archives, as well as by-proxy TimeMaps provided for other systems, is maintained at http://mementoweb.org/depot/ and http://timetravel.mementoweb.org/guide/api/#registry
  46. The Memento Framework: Protocol to Integrate Present and Past Web - HTTP Headers
  47. The HTTP Headers used in the Memento Protocol
      • Define two new headers:
        – request: Accept-Datetime:
        – response: Memento-Datetime:
      • Introduce new content for two existing headers:
        – response: Vary:, Link:
      • Use one existing header without modification:
        – response: Location:, TCN:
  48. HTTP Request Headers for Datetime Negotiation
      • Accept-Datetime:
        o Issued against TimeGate, [Original Resource, Memento]
        o Header value: desired datetime of a Memento
      Accept-Datetime: Mon, 12 Oct 2009 14:20:33 GMT
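The header value follows the familiar HTTP date layout in GMT. A minimal sketch of producing it with the Python standard library follows; the helper name `accept_datetime_header` is an assumption made for this example.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def accept_datetime_header(dt):
    """Return an Accept-Datetime header line for a timezone-aware datetime."""
    if dt.tzinfo is None:
        raise ValueError("datetime must be timezone-aware")
    # Convert to UTC first so the value is always expressed in GMT.
    return "Accept-Datetime: " + format_datetime(
        dt.astimezone(timezone.utc), usegmt=True)


header = accept_datetime_header(
    datetime(2009, 10, 12, 14, 20, 33, tzinfo=timezone.utc))
print(header)

# A local time with a +02:00 offset is normalized to the same instant in GMT.
local = datetime(2009, 10, 12, 16, 20, 33, tzinfo=timezone(timedelta(hours=2)))
print(accept_datetime_header(local))
```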
  49. HTTP Response Headers for Datetime Negotiation
      • Memento-Datetime:
        o Returned by Mementos only
          - Even when not as a result of datetime negotiation
        o Header value: Archival datetime of the Memento
          - Resource has not changed and will not change beyond that date
        o This header is sticky:
          - Once returned, a server must always return it with the same value
          - Must also be preserved when Mementos are mirrored at different URIs
        o This header is crucial to allow a client to understand it has arrived at a Memento
      Memento-Datetime: Mon, 12 Oct 2009 14:20:33 GMT
  50. HTTP Response Headers for Datetime Negotiation
      • Vary:
        o Returned by TimeGate
        o Similar to regular content negotiation
        o Header value: accept-datetime
      • Regular content negotiation (e.g. media type) can be used too, but a TimeGate must first meet the datetime preference, and then – if possible – the other content negotiation preferences
      • Note: accept-datetime in Vary header is crucial to allow a client to understand it has arrived at a TimeGate
      Vary: accept-datetime
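The two "crucial" signals described on these slides suggest a simple client-side check. The sketch below is one possible reading: a response carrying Memento-Datetime is treated as a Memento, and otherwise a Vary header listing accept-datetime marks a TimeGate. The function name `classify` and the returned labels are assumptions for illustration.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def classify(headers):
    """Classify a response as 'memento', 'timegate' or 'other'.

    Returns (label, memento_datetime); the datetime is None unless the
    response is a Memento.
    """
    names = {key.lower(): value for key, value in headers.items()}
    if "memento-datetime" in names:
        return "memento", parsedate_to_datetime(names["memento-datetime"])
    vary = [token.strip().lower() for token in names.get("vary", "").split(",")]
    if "accept-datetime" in vary:
        return "timegate", None
    return "other", None


memento = classify({"Memento-Datetime": "Mon, 12 Oct 2009 14:20:33 GMT"})
timegate = classify({"Vary": "accept-language, accept-datetime"})
plain = classify({"Content-Type": "text/html"})
print(memento, timegate, plain)
```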
  51. HTTP Response Headers for Datetime Negotiation
      • Location:
        o Returned by TimeGate
        o Similar to regular content negotiation
        o Header value: URI of the Memento selected by the TimeGate
      Location: http://web.archive.org/web/20010911223004/http://cnn.com
  52. HTTP Response Headers for Datetime Negotiation
      • Link:
        o Returned by Original Resource, TimeGate and Mementos
        o Various new Relation Types are introduced:
          - “original” – points to Original Resource
          - “timegate” – points to TimeGate
          - “memento” – points to Memento
          - “timemap” – points to TimeMap
        o A TimeGate must provide the “original” link
        o A Memento must provide the “original” link
        o All other links are encouraged but optional
      HTTP Link Header: RFC 5988
  53. HTTP Response Headers for Datetime Negotiation
      • Link:
        o The following “memento” links that point at special Mementos, known to the responding server, are optional but very useful:
          - First and last Memento known to the server, e.g. “memento first”
          - Mementos before and after the selected Memento, e.g. “memento predecessor-version”
          - Selected Memento
          - Temporal order of Mementos is expressed using existing relation types from RFC 5829 and RFC 5988: first, last, next, prev, successor-version, predecessor-version
  54. HTTP Response Headers for Datetime Negotiation
      • Link:
        o Attributes for a “memento” Link:
          - datetime (mandatory): datetime of the Memento pointed at by the link
          - license (optional): license associated with the Memento
        o Attributes for a “timemap” Link:
          - type (recommended): MIME type of TimeMap serialization
          - from, until (optional): to convey the temporal interval of Memento datetimes covered by the TimeMap
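To make the relation types and attributes concrete, here is a minimal sketch of a Link header parser. It is a simplified regular-expression reading of the header syntax and does not cover every case the Web Linking grammar allows (for example, escaped quotes inside values). The names `parse_link_header`, `LINK_RE` and `PARAM_RE`, and the sample header, are illustrative.

```python
import re

# A link is <URI> followed by zero or more ;name=value parameters.
LINK_RE = re.compile(
    r'<([^>]*)>((?:\s*;\s*[\w-]+\s*=\s*(?:"[^"]*"|[^\s";,]+))*)')
PARAM_RE = re.compile(r'([\w-]+)\s*=\s*(?:"([^"]*)"|([^\s";,]+))')


def parse_link_header(value):
    """Return a list of dicts with 'uri', 'rel' (a list) and other attributes."""
    links = []
    for uri, raw_params in LINK_RE.findall(value):
        params = {}
        for name, quoted, bare in PARAM_RE.findall(raw_params):
            params[name.lower()] = quoted or bare
        rels = params.pop("rel", "").split()
        links.append({"uri": uri, "rel": rels, **params})
    return links


LINK = (
    '<http://a.example.org>; rel="original", '
    '<http://arxiv.example.net/timemap/http://a.example.org>'
    '; rel="timemap"; type="application/link-format", '
    '<http://arxiv.example.net/web/20000620180259/http://a.example.org>'
    '; rel="memento first"; datetime="Tue, 20 Jun 2000 18:02:59 GMT"'
)

links = parse_link_header(LINK)
for link in links:
    print(link)
```

Quoted values are matched as a unit, which is what keeps the commas inside the `datetime` attribute from being mistaken for link separators.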
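  55. Memento Datetime Negotiation - HTTP Flow, with Link relation types ([…] = optional)
      1. Request:  HEAD R, [Accept-Datetime]
         Response: [Link -> G]; relation type: [timegate]
      2. Request:  HEAD G, Accept-Datetime
         Response: 302 -> M, Vary, Link -> R [M T]; relation types: original [memento timemap]
      3. Request:  GET M, [Accept-Datetime]
         Response: 200, Memento-Datetime, Link -> R [G M T]; relation types: original [timegate memento timemap]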
  56. The Memento Framework: Protocol to Integrate Present and Past Web - HTTP Interactions
  57. Datetime Negotiation Flow: Step 1
  58. Datetime Negotiation Flow: Step 2
  59. Datetime Negotiation Flow: Step 3
  60. Datetime Negotiation Flow: Step 4
  61. Datetime Negotiation Flow: Step 5
  62. Datetime Negotiation Flow: Step 6
  63. TimeMap Access Flow: Step 1
  64. TimeMap Access Flow: Step 2
  65. TimeMap Access Flow: Step 3
  66. TimeMap Access Flow: Step 4
  67. TimeMap Access Flow: Step 5
  68. TimeMap Access Flow: Step 6
  69. TimeMap Access Flow: Step 6 with Index TimeMap
  70. TimeMap Access Flow: Step 6 with Paging TimeMap
  71. The Memento Framework: Protocol to Integrate Present and Past Web - Resource Versioning and Memento
  72. Common Resource Versioning Approach
  73. Version Resources (*)
      (*) Tim Berners-Lee (1996) http://www.w3.org/DesignIssues/Generic.html
  74. Version Resources and Associated Generic Resource (*)
      (*) Tim Berners-Lee (1996) http://www.w3.org/DesignIssues/Generic.html
  75. Memento Bridges Between Generic & Specific Resources
  76. Memento Bridges Between Generic & Specific Resources
  77. Memento Bridges Between Generic & Specific Resources
  78. Stepwise Support for the Memento Protocol – Step 1
      • Provide Memento protocol HTTP response headers to convey version date and links
        o Provide Memento-Datetime header to express version date
        o Provide Link header with “original” link to point from version resource to generic resource
        o Provide Link header with appropriate “memento” links to allow navigating between versions
          - In combination with links with other relation types, e.g. “first”, “last”, “prev”, “next”, “predecessor-version”, “successor-version”
  79. Stepwise Support for the Memento Protocol – Step 1
      • Response to HTTP HEAD/GET against http://www.w3.org/TR/2004/PR-webarch-20041105/
  80. Stepwise Support for the Memento Protocol – Step 2
      • Publish a TimeMap, at, say, http://www.w3.org/TR/timemap/webarch/
      • For the generic resource and for each version resource, provide a Link header with “timemap” link that points at the TimeMap
  81. Stepwise Support for the Memento Protocol – Step 2
      • Response to HTTP HEAD/GET against http://www.w3.org/TR/2004/PR-webarch-20041105/
  82. Stepwise Support for the Memento Protocol – Step 2
      • Response to HTTP GET against http://www.w3.org/TR/timemap/webarch/
  83. Stepwise Support for the Memento Protocol – Step 3
      • Expose a TimeGate, at, say, http://www.w3.org/TR/timegate/webarch/
      • Responses for generic resource, version resources, TimeGate, TimeMap as shown in slides 56-70
      • Note that Patterns for datetime negotiation other than the one shown in those slides are described in RFC 7089
  84. The Memento Framework: Protocol to Integrate Present and Past Web - Memento and Linked Data
  85. [slide without text]
  86. [slide without text]
  87. The Memento Framework: Protocol to Integrate Present and Past Web - User & Developer Tools
  88. Memento for Chrome http://bit.ly/memento-for-chrome
  89. Time Travel Find – Search Page http://timetravel.mementoweb.org/
  90. Time Travel Find – Result Page http://timetravel.mementoweb.org/list/20100428103432/http://stanford.edu
  91. Time Travel Find – Result Page http://timetravel.mementoweb.org/list/20100428103432/http://stanford.edu
  92. Time Travel Find – Search Page http://timetravel.mementoweb.org/
  93. Time Travel Find – Result Page http://timetravel.mementoweb.org/list/20140428052227/http://coptr.digipres.org/Main_Page
  94. Time Travel Reconstruct – Search Page http://timetravel.mementoweb.org/
  95. Time Travel Reconstruct – Result Page http://timetravel.mementoweb.org/reconstruct/20100428103432/http://stanford.edu
  96. Memento for MediaWiki Extensions http://bit.ly/memento-for-mediawiki
  97. Generic TimeGate Server (1/2) https://github.com/mementoweb/timegate
  98. Generic TimeGate Server (2/2) https://github.com/mementoweb/timegate
  99. SiteStory Transactional Archive for Apache Servers https://mementoweb.github.io/SiteStory/
  100. Memento Aggregator Coverage: See http://mementoweb.org/depot/ and http://labs.mementoweb.org/aggregator_config/archivelist.xml
  101. Various Memento Tools for Users & Developers http://mementoweb.org/tools/
  102. The Memento Framework: Protocol to Integrate Present and Past Web - Time Travel APIs
  103. Time Travel APIs http://timetravel.mementoweb.org/guide/api/
  104. URI that Redirects to a Memento http://timetravel.mementoweb.org/memento/20100428103432/http://stanford.edu
  105. URI that Redirects to a JSON Description of a Memento http://timetravel.mementoweb.org/api/json/20100428103432/http://stanford.edu
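The Time Travel URIs shown on these slides share one pattern: a service path, a 14-digit datetime, and the original URI. A minimal sketch of building them follows; it only constructs strings based on the URIs quoted in the slides and makes no claim about the current availability or behavior of the service. The helper name `timetravel_uri` and its `kind` labels are assumptions for the example.

```python
from datetime import datetime

BASE = "http://timetravel.mementoweb.org"

# Service paths as they appear in the URIs quoted on the slides.
PATHS = {
    "list": "list",
    "reconstruct": "reconstruct",
    "memento": "memento",
    "json": "api/json",
}


def timetravel_uri(kind, when, uri_r):
    """Build a Time Travel URI of the given kind for an original URI."""
    return f"{BASE}/{PATHS[kind]}/{when:%Y%m%d%H%M%S}/{uri_r}"


when = datetime(2010, 4, 28, 10, 34, 32)
for kind in PATHS:
    print(timetravel_uri(kind, when, "http://stanford.edu"))
```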
  106. JSON Format for TimeMaps http://mementoweb.org/guide/timemap-json/
  107. DIY TimeMap - Index TimeMap Lists Potential TimeMap URIs http://timetravel.mementoweb.org/timemap/json/http://stanford.edu (slide label: SPEED)
  108. WDI TimeMap - Regular (Index) TimeMap http://labs.mementoweb.org/timemap/link/http://stanford.edu (slide label: COVERAGE)
  109. Time Travel Archive Registry http://labs.mementoweb.org/aggregator_config/archivelist.xml
  110. The Memento Framework: Protocol to Integrate Present and Past Web - Robust Links
  111. How to Reference Resources
       • Create a Capture in Internet Archive, archive.today, perma.cc, webcitation
       • Existing practice for linking to such captures:
         o Link to URI of Capture
         o Lose Original URI
         o Lose Capture Datetime
       • Problems with existing practice:
         o Impossible to visit the original URI, if desired
         o Requires the permanent existence/uptime of the archive that holds the capture
           - One link rot problem replaced by another
       Van de Sompel, H. et al. (2013) Thoughts on referencing, linking, reference rot http://mementoweb.org/missing-link/
  112. Permanent Existence/Uptime of Archives? Capture of http://webcitation.org dated July 17 2013 https://archive.today/eAETp
  113. Permanent Existence/Uptime of Archives? http://webcitation.org/ on August 6 2014
  114. Permanent Existence/Uptime of Archives? Remnant of discontinued web archive http://mummify.it captured on February 14 2014 https://web.archive.org/web/20140214233752/https://www.mummify.it/
  115. Permanent Existence/Uptime of Archives? http://www.themoscowtimes.com/news/article/russia-bans-wayback-machine-internet-archive-over-islamic-state-video/510074.html
  116. Hacking Original URI, Capture Datetime from Capture URI?
       URI of Capture                                                          Original URI  Datetime T
       https://web.archive.org/web/20140214233752/https://www.mummify.it       yes           yes
       https://archive.today/eAETp                                             no            no
       http://perma.cc/4RH7-999Q?type=source                                   no            no
       http://en.wikipedia.org/w/index.php?title=Coil_(band)&oldid=388321480   no            no
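The "yes/yes" row of this table can be illustrated with a short sketch that pulls the original URI and capture datetime out of a Wayback-style capture URI, and returns nothing for the opaque URIs in the other rows. It assumes the 14-digit timestamp is in UTC and matches only the `web.archive.org/web/<timestamp>/<URI>` shape shown on the slide; the function name `split_capture_uri` is illustrative.

```python
import re
from datetime import datetime, timezone

# Matches only the capture URI shape shown in the first row of the table.
WAYBACK_RE = re.compile(r'^https?://web\.archive\.org/web/(\d{14})/(.+)$')


def split_capture_uri(uri_m):
    """Return (original_uri, capture_datetime), or None if not derivable."""
    match = WAYBACK_RE.match(uri_m)
    if match is None:
        return None
    stamp, uri_r = match.groups()
    # Assumption: the embedded timestamp is expressed in UTC.
    when = datetime.strptime(stamp, "%Y%m%d%H%M%S").replace(tzinfo=timezone.utc)
    return uri_r, when


derived = split_capture_uri(
    "https://web.archive.org/web/20140214233752/https://www.mummify.it")
opaque = split_capture_uri("https://archive.today/eAETp")
print(derived, opaque)
```

This kind of URI "hacking" is archive-specific, which is the slide's point: the Memento headers convey the same information uniformly.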
  117. Using Capture URI to find Captures in Other Web Archives?
  118. Using Capture URI to find Captures in Other Web Archives?
  119. Reference Resources Robustly
       • When referencing resources include:
         o Original URI – Allows revisiting the URI as it is at the time of reading, if the URI is still operational
         o Snapshot URI – Allows revisiting the snapshot, if one was created, and if the web archive in which it was created is still operational
         o Original URI & Date/Time – Allows revisiting a snapshot created around the Date/Time in any web archive around the world (using Memento infrastructure)
       Van de Sompel, H. et al. (2013) Thoughts on referencing, linking, reference rot http://mementoweb.org/missing-link/
  120. Reference Resources Actionably
       • When referencing resources, use Link Decorations to convey Original URI, Snapshot URI, Date/Time
       <a href="http://www.stanford.edu" data-versionurl="http://archive.is/FAy6o" data-versiondate="2014-08-15">
       <a href="http://www.stanford.edu" data-versiondate="2014-08-15">
       <a href="http://archive.is/FAy6o" data-originalurl="http://www.stanford.edu" data-versiondate="2014-08-15">
       Herbert Van de Sompel et al. (2015) Robust Links - Link Decorations http://robustlinks.mementoweb.org/spec/
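A page generator could emit such decorated links programmatically. The sketch below covers the first two variants on the slide (original URI in `href`, with or without a snapshot URI); the function name `robust_link` and the link text are assumptions, and only the attribute names that appear on the slide are used.

```python
from html import escape


def robust_link(original_url, version_date, text, version_url=None):
    """Render an <a> element decorated with Robust Links data attributes."""
    attrs = [("href", original_url)]
    if version_url is not None:
        attrs.append(("data-versionurl", version_url))
    attrs.append(("data-versiondate", version_date))
    rendered = " ".join(
        f'{name}="{escape(value, quote=True)}"' for name, value in attrs)
    return f"<a {rendered}>{escape(text)}</a>"


with_snapshot = robust_link(
    "http://www.stanford.edu", "2014-08-15", "Stanford",
    version_url="http://archive.is/FAy6o")
date_only = robust_link("http://www.stanford.edu", "2014-08-15", "Stanford")
print(with_snapshot)
print(date_only)
```

Escaping the attribute values keeps URIs that contain `&` or quotes from breaking the markup.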
  121. No Link Decorations? Insert Page Date!
       • Include page date to allow retrieving Mementos of linked resources from around page publication date
       <html>
       <head lang="en" itemtype="http://schema.org/WebPage" itemid="http://robustlinks.mementoweb.org/spec/">
       <meta itemprop="datePublished" content="2015-01-23">
       Herbert Van de Sompel et al. (2015) Robust Links - Link Decorations http://robustlinks.mementoweb.org/spec/
  122. Robust Links via Link Decoration, JavaScript, Time Travel API
       • JavaScript makes link decorations actionable http://robustlinks.mementoweb.org/demo/uri_references_js.html
  123. The Memento Framework: Protocol to Integrate Present and Past Web - Pointers
  124. Pointers
       • Memento site - http://mementoweb.org/about/
       • Time Travel site - http://timetravel.mementoweb.org
       • RFC 7089 - http://tools.ietf.org/html/rfc7089 (text version), http://www.mementoweb.org/guide/rfc/ (HTML version)
       • Memento Development List - http://groups.google.com/group/memento-dev/
       • Memento GitHub projects - https://github.com/mementoweb/
       • Client and Server software and tools - http://mementoweb.org/tools/
       • Information on TimeGates and TimeMaps for major systems - http://mementoweb.org/depot/
       • IIPC list of software and tools related to web archiving - http://netpreserve.org/web-archiving/tools-and-software
  125. The Memento Framework: Protocol to Integrate Present and Past Web - Additional Details
  126. Fixed Resource
       • The resource is its own Memento, i.e. it is a stable resource
         o Resource that was born stable or became stable; it will not change anymore, e.g. PermaLink resources on news sites
         o Resource provides:
           - Link header with “original” link pointing to itself
           - Memento-Datetime header
         o Note the difference with Last-Modified header: no promise resource will not change anymore
           - Details at http://ws-dl.blogspot.com/2010/11/2010-11-05-memento-datetime-is-not-last.html
  127. Fixed Resource
       • Response to HTTP HEAD/GET against http://a.example.org
  128. Memento Without TimeGate
       • The resource is a Memento but there is no TimeGate available for it
         o e.g. snapshot of resource when server is being retired
         o Resource provides:
           - Link header with “original” link revealing the URI of Original Resource
           - Memento-Datetime header
  129. Memento Without TimeGate
       • Response to HTTP HEAD/GET against http://arxiv.example.net/web/20010321203610/http://a.example.org
  130. Intermediate Resource
       • The resource issues a redirect to a TimeGate, a Memento, or another intermediate resource
         o Plays an active role in the Memento framework
         o Resource provides:
           - Link header with “original” link revealing the URI of Original Resource
  131. Intermediate Resource
       • Response to HTTP HEAD/GET against a resource that redirects to a TimeGate
  132. Resource Excluded from Datetime Negotiation
       • e.g. JavaScript, logos, banners added by web archives
         o Resource always needs to be used in its current state
         o In order to flag it is excluded from datetime negotiation, this resource provides:
           - Link header with “type” link that has as value http://mementoweb.org/terms/donotnegotiate
  133. Resource Excluded from Datetime Negotiation
       • Response to HTTP HEAD/GET against a resource that is excluded from datetime negotiation
  134. Memento of a Redirect
       • HTTP responses with 3XX codes are also archived
         o e.g. web archives hold on to “301 Moved Permanently” and “302 Found” whereas Linked Data archives preserve “303 See Other”
       • The Memento’s response must have the same HTTP status code as the original
       • Memento headers are as usual
       • Memento clients need to understand that the redirect (URI in Location header) can be to an Original Resource or to a Memento
         o If an Original Resource, the client must proceed to find an appropriate Memento for it
  135. Memento of a Redirect
       • Response in April 2008 to HTTP HEAD/GET against http://a.example.org
  136. Memento of a Redirect
       • Response to an HTTP HEAD/GET of a Memento of that 2008 redirect, whereby the redirect is unchanged, i.e. it is to the resource to which the redirect originally led
  137. Memento of a Redirect
       • Response to an HTTP HEAD/GET of a Memento of that 2008 redirect, whereby the redirect is rewritten, i.e. it leads to a Memento of the resource to which the redirect originally led
  138. http://mementoweb.org/ Memento: Uniform and Robust Access to Resource Versions. Memento has received funding from The Library of Congress, the Andrew W. Mellon Foundation, and IIPC.