SlideShare a Scribd company logo
1 of 139
Download to read offline
Memento: Time Travel for the Web	
  


                                 Herbert Van de Sompel
                                   Robert Sanderson
                                   Michael L. Nelson

                                 http://mementoweb.org/



                                       Memento is funded by
                                      The Library of Congress

                                   	
  
Updated Technical Details (May 2011)
         Memento: Time Travel for the Web
         Updated Technical Details (05/2011)
Memento wants to make it Easy

          to navigate the Web of the Past




                    Technical Specification
https://datatracker.ietf.org/doc/draft-vandesompel-memento/
                Memento: Time Travel for the Web
                                                      2
                Updated Technical Details (05/2011)
Tate Online         Select Date                          Tate Online
   Today           March 16 2008                        March 16 2008




                                                          From
                                                    National Archives


              Memento: Time Travel for the Web
                                                    3
              Updated Technical Details (05/2011)
Memento achieves this by introducing

a uniform version access capability to

 integrate the past and current Web




        Memento: Time Travel for the Web
                                              4
        Updated Technical Details (05/2011)
Problem Statement …




Memento: Time Travel for the Web
                                      5
Updated Technical Details (05/2011)
Resources




Memento: Time Travel for the Web
                                      6
Updated Technical Details (05/2011)
Resources have Representations




     Memento: Time Travel for the Web
                                           7
     Updated Technical Details (05/2011)
Resources have Representations that Change over Time




                Memento: Time Travel for the Web
                                                      8
                Updated Technical Details (05/2011)
Only the Current Representation is Available from a Resource




                   Memento: Time Travel for the Web
                                                         9
                   Updated Technical Details (05/2011)
Old Representations are Lost Forever




       Memento: Time Travel for the Web
                                             10
       Updated Technical Details (05/2011)
Archived/Version Resources Exist




     Memento: Time Travel for the Web
                                           11
     Updated Technical Details (05/2011)
Resource Versions on the Web



                   •  Content Management Systems

                   •  Web Archives

                   •  Transactional archives

                   •  Search engine caches

                   •    …




    Memento: Time Travel for the Web
                                          12
    Updated Technical Details (05/2011)
Sep 11 2001, 20:36:10 UTC                                                 Dec 20 2001, 4:51:00 UTC

                                    Archived Resources




                                                                http://en.wikipedia.org/w/index.php?
http://web.archive.org/web/20010911203610/http://      title=September_11_attacks&oldid=282333 archived
www.cnn.com/ archived resource for http://cnn.com             resource for http://en.wikipedia.org/wiki/
                                                                        September_11_attacks

                                   Memento: Time Travel for the Web
                                   Updated Technical Details (05/2011)     13
Versions are Not Integrated with the Web



                        •  Cannot talk about a resource as it
                           used to exist

                        •  Cannot access a prior version
                           knowing the current one

                        •  Cannot access the current version
                           knowing a prior one




         Memento: Time Travel for the Web
                                               14
         Updated Technical Details (05/2011)
Memento Wants to Integrate the Past and Current Web




               Memento: Time Travel for the Web
                                                     15
               Updated Technical Details (05/2011)
The Memento Framework:


Protocol to Integrate Past and Current Web


                    Overview




          Memento: Time Travel for the Web
                                                16
          Updated Technical Details (05/2011)
Memento Framework



                   •  Regards the Web as a big Content
                      Management System

                   •  Introduces a uniform capability to
                      access versions on the Web

                   •  Does not build new archives but
                      leverages existing systems that
                      host versions




Memento: Time Travel for the Web
                                      17
Updated Technical Details (05/2011)
Memento Framework



                   •  Is distributed: versions may exist on
                      several servers

                   •  Uses time as a global version
                      indicator

                   •  Is based on the primitives of the
                      Web: resource, resource state,
                      representation, content negotiation,
                      link




Memento: Time Travel for the Web
                                      18
Updated Technical Details (05/2011)
Original Resources and Mementos




     Memento: Time Travel for the Web
                                           19
     Updated Technical Details (05/2011)
Bridge from Present to Past




  Memento: Time Travel for the Web
                                        20
  Updated Technical Details (05/2011)
Bridge from Past to Present




  Memento: Time Travel for the Web
                                        21
  Updated Technical Details (05/2011)
Memento Framework




Memento: Time Travel for the Web
                                      22
Updated Technical Details (05/2011)
Memento Client Server Interaction




     Memento: Time Travel for the Web
                                           23
     Updated Technical Details (05/2011)
Memento HTTP Flow

       HEAD R, Accept-Datetime


              Link    G


       GET G, Accept-Datetime


   302    M, Vary, Link    M,R,T


       GET M, Accept-Datetime


200, Memento-Datetime, Link     M,R,T,G
                                 24
The Memento Framework:


Protocol to Integrate Past and Current Web


            Interesting Cases




          Memento: Time Travel for the Web
                                                25
          Updated Technical Details (05/2011)
Multiple Archives




Memento: Time Travel for the Web
                                      26
Updated Technical Details (05/2011)
Original Resource Gone




Memento: Time Travel for the Web
                                      27
Updated Technical Details (05/2011)
Original Resource’s Server Gone




     Memento: Time Travel for the Web
                                           28
     Updated Technical Details (05/2011)
Original Resource Provides no Link




      Memento: Time Travel for the Web
                                            29
      Updated Technical Details (05/2011)
The Memento Framework:


Protocol to Integrate Past and Current Web


               HTTP Headers




          Memento: Time Travel for the Web
                                                30
          Updated Technical Details (05/2011)
HTTP Headers used in Memento


•  Defines two new headers:
    –  request: Accept-Datetime!
    –  response: Memento-Datetime!

•  Introduce new content for two existing headers:
    –  response: Vary ; Link!

•  Use one existing headers without modification:
    –  response: Location, TCN:!




                    Memento: Time Travel for the Web
                                                          31
                    Updated Technical Details (05/2011)
HTTP Request Headers: Accept-Datetime

•  Accept-Datetime


   o    Issued against TimeGate, (Original Resource), (Memento)

   o    Header value:
         -  Desired datetime of Memento (MANDATORY)
            Must be in RFC 1123 format and in GMT

         -  Interval indicator to express the client is only interested in
            Mementos within the interval (OPTIONAL)
              –  Expressed as two ISO8601 durations:
                 "-P3DT5H;+P2DT6H"!

  Accept-Datetime: Mon, 12 Oct 2009 14:20:33 GMT!

                      Memento: Time Travel for the Web
                                                            32
                      Updated Technical Details (05/2011)
HTTP Response Headers: Memento-Datetime
•  Memento-Datetime!
    o  Returned by Mementos

        -  Always. Even when not via a TimeGate
    o  Header value: Archival datetime of the Memento

        -  Resource has not and will not change beyond that date
    o  This header is sticky:

        -  Once returned, must always return it with same value
        -  Must also be preserved when Mementos are mirrored at
           different URIs
•  This header is crucial to allow a client to understand it has arrived
   at a Memento
       See: http://www.mementoweb.org/guide/resourcetype/

  Memento-Datetime: Mon, 12 Oct 2009 14:20:33 GMT!

                      Memento: Time Travel for the Web
                                                            33
                      Updated Technical Details (05/2011)
HTTP Response Headers: Vary

•  Vary!
    o  Returned by TimeGate

    o  Similar to regular content negotiation

    o  Header value:

        -  negotiate, accept-datetime!

•  TimeGate must first meet the datetime preference, and then – if
   possible – other content negotiation preferences

•  Note: accept-datetime value in Vary header is crucial to
   allow a client to understand it has arrived at a TimeGate.
      See: http://www.mementoweb.org/guide/resourcetype/

           Vary: negotiate, accept-datetime!

                    Memento: Time Travel for the Web
                                                          34
                    Updated Technical Details (05/2011)
HTTP Response Headers: Location

 •  Location


    o    Returned by TimeGate
    o    Similar to regular content negotiation
    o    Header value: URI of selected Memento




Location: http://web.archive.org/web/20010911223004/
                   http://cnn.com!

                      Memento: Time Travel for the Web
                                                            35
                      Updated Technical Details (05/2011)
HTTP Response Headers: Link

    •  Link!
        o  Returned by Original Resource, TimeGate and Mementos

        o  Various new Relation Types are introduced:

            -  “original”!
            -  “timegate”!
            -  “memento”!
            -  “timemap”!
        o  HTTP Link Header: RFC 5988

           See:     https://datatracker.ietf.org/doc/rfc5988/



Link: <http://web.archive.org/web/20010911223004/http://!
 cnn.com>;rel="memento";datetime="Mon, 11 Sep 2001 22:30:04 GMT"!

                       Memento: Time Travel for the Web
                                                             36
                       Updated Technical Details (05/2011)
Memento HTTP Flow

       HEAD R, Accept-Datetime


              Link  G
               timegate

       GET G, Accept-Datetime


   302    M, Vary, Link    M,R,T
              memento,original,timemap
       GET M, Accept-Datetime


200, Memento-Datetime, Link     M,R,T,G
        memento,original,timemap,timegate
HTTP Response Headers: Link

•  Link!
    o  Returned by Original Resource, TimeGate and Mementos

    o  Various new Relation Types are introduced




                   Memento: Time Travel for the Web
                                                         38
                   Updated Technical Details (05/2011)
The Memento Framework:


Protocol to Integrate Past and Current Web


            HTTP Interactions




          Memento: Time Travel for the Web
                                                39
          Updated Technical Details (05/2011)
Memento HTTP Flow: Step 1
Memento HTTP Flow: Step 2
Memento HTTP Flow: Step 3
Memento HTTP Flow: Step 4
Memento HTTP Flow: Step 5
Memento HTTP Flow: Step 6
The Memento Framework:


Protocol to Integrate Past and Current Web


       HTTP Link Header Details




          Memento: Time Travel for the Web
                                                46
          Updated Technical Details (05/2011)
HTTP Response Headers: Link




    Memento: Time Travel for the Web
                                          47
    Updated Technical Details (05/2011)
Memento HTTP Flow

       HEAD R, Accept-Datetime


              Link    G
              timegate

       GET G, Accept-Datetime


   302    M, Vary, Link    M,R,T


       GET M, Accept-Datetime


200, Memento-Datetime, Link     M,R,T,G
HTTP Response Headers: Link


•  RECOMMENDED "timegate” Link from Original Resource to
   TimeGate

•  If this Link is not available, the client must try and find a TimeGate
   itself, via:
          -  Memento Discovery approaches (see later)
          -  User interaction (e.g. Preferences in an application)




                      Memento: Time Travel for the Web
                                                            49
                      Updated Technical Details (05/2011)
HTTP Response Headers: Link




    Memento: Time Travel for the Web
                                          50
    Updated Technical Details (05/2011)
Memento HTTP Flow

    HEAD R, Accept-Datetime


           Link    G


    GET G, Accept-Datetime


  302    M, Vary, Link  M
                     memento

    GET M, Accept-Datetime


200, Memento-Datetime, Link  M
                         memento
HTTP Response Headers: Link

•  MANDATORY ”memento” Links from TimeGate and Memento to
   Mementos

•  ”memento” Links point at the following Mementos know to the
   responding server:
    o  Selected Memento (MANDATORY if a Memento is selected)

    o  First Memento, Last Memento (MANDATORY)

    o  Memento prev to selected one, Memento next to selected one

       (RECOMMENDED)
    o  Other Mementos (OPTIONAL, and only if prev and next are

       provided)
    o  Temporal order of Mementos is expressed using existing Relation

       Types (RFC 5829, RFC 5988): first, last, next, prev,
       successor-version, predecessor-version

                    Memento: Time Travel for the Web
                                                          52
                    Updated Technical Details (05/2011)
HTTP Response Headers: Link

•  MANDATORY ”memento” Links from TimeGate and Memento to
   Mementos

•  Attributes for a ”memento” Link:
    o  datetime (MANDATORY)

                  datetime of the Memento pointed at by the link
    o  license (OPTIONAL)

                  license associated with the Memento
    o  embargo (OPTIONAL)

                  datetime until which the Memento will remain
       inaccessible
    o  type (RECOMMENDED)

                  mime type of the Memento.


                     Memento: Time Travel for the Web
                                                           53
                     Updated Technical Details (05/2011)
HTTP Response Headers: Link




    Memento: Time Travel for the Web
                                          54
    Updated Technical Details (05/2011)
Memento HTTP Flow

HEAD R, Accept-Datetime


     Link    M=R
  memento,original
HTTP Response Headers: Link

•  Both ”memento” and ”original” Link on a resource.

•  The resource is its own Memento, i.e. it is a stable resource.
    o  Resource that was born stable or became stable; it will not change

       anymore.
    o  For example resources with PermaLink on news sites

    o  Note the difference with Last-Modified header




•  Can still also provide a ”timegate” Link!
    o  For example pointing at TimeGates for Mementos of the resource

       before it became stable




                    Memento: Time Travel for the Web
                                                          56
                    Updated Technical Details (05/2011)
HTTP Response Headers: Link




    Memento: Time Travel for the Web
                                          57
    Updated Technical Details (05/2011)
Memento HTTP Flow




     GET M, (Accept-Datetime)


200, Memento-Datetime, Link  M,R,T
                memento,original,timemap
HTTP Response Headers: Link


•  Mementos without a TimeGate, for example:
    o  Resources in snapshot archives

    o  Version resources in systems that have not yet implemented

       TimeGates

•  Should still use Memento-Datetime header
•  Should still use “original” Link
•  Can have “timemap” Link




                    Memento: Time Travel for the Web
                                                          59
                    Updated Technical Details (05/2011)
HTTP Response Headers: Link




    Memento: Time Travel for the Web
                                          60
    Updated Technical Details (05/2011)
Memento HTTP Flow

    HEAD R, Accept-Datetime


           Link    G


    GET G, Accept-Datetime


  302    M, Vary, Link  T
                     timemap

    GET M, Accept-Datetime


200, Memento-Datetime, Link  T
                         timemap
HTTP Response Headers: Link
•  RECOMMENDED ”timemap” Links from TimeGate and
   Mementos.

•  A TimeMap is introduced to allow retrieving an inventory of Mementos
   for an Original Resource that the responding server is aware of. It lists:
    o  URI of Original Resource (MANDATORY)

    o  URI and datetime of all known Mementos (MANDATORY)

    o  URI of TimeGate for Original Resource (RECOMMENDED)

    o  URI of TimeMap itself (RECOMMENDED)




•  Multiple TimeMap serializations possible; link-value format
   MANDATORY
    o  application/link-format:

       see https://datatracker.ietf.org/doc/draft-ietf-core-link-format/

                      Memento: Time Travel for the Web
                                                            62
                      Updated Technical Details (05/2011)
Memento HTTP Flow: GET TimeMap




     Memento: Time Travel for the Web
                                           63
     Updated Technical Details (05/2011)
Memento HTTP Flow: TimeMap Response




       Memento: Time Travel for the Web
       Updated Technical Details (05/2011)
The Memento Framework:


Discovery to Support Integration of Past and Current Web


                 TimeGate Discovery




                 Memento: Time Travel for the Web
                                                       65
                 Updated Technical Details (05/2011)
Batch Discovery of TimeGates: robots.txt

•  robots.txt file is used by Web servers to convey crawling
policies

•  Web crawlers (such as for archives) retrieve and parse it
•  De-facto standard, no official endorsement
•  Extended with new directives, including by Google



  User-agent: *               # Reject all crawlers!
  Disallow: /!
                       # Google Sitemap extension!
  Sitemap: http://some.example.com/me/sitemap.xml!

  User-agent: NiceBot         # Select only NiceBot!
  Crawl-delay: 10             # 10 seconds between requests!


                     Memento: Time Travel for the Web
                                                           66
                     Updated Technical Details (05/2011)
Batch Discovery of TimeGates: robots.txt

•  Add TimeGate and Archived directives to support discovery
of TimeGates known to the server
•  User agent should concatenate desired URL with TimeGate link
•  Archived value is truncated host/path or * to describe a general
web archive




              http://mementoweb.org/guide/robotstxt/
                     Memento: Time Travel for the Web
                                                           67
                     Updated Technical Details (05/2011)
The Memento Framework:


Discovery to Support Integration of Present and Past Web


 Discovery via TimeMaps: All Mementos for a given
      Original Resource known by an archive




                 Memento: Time Travel for the Web
                                                       68
                 Updated Technical Details (05/2011)
TimeMap Overview

•  A TimeMap is an inventory of Mementos for an Original Resource
that the responding server is aware of. It lists at least:
      •  URI of Original Resource
      •  URI and datetime of all known Mementos
      •  URI of TimeGate for Original Resource
      •  URI of TimeMap itself

•  Multiple TimeMap serializations possible:
     •  application/link-format mandatory

    see https://datatracker.ietf.org/doc/draft-ietf-core-link-format/

    •  RDF TimeMaps proposed




                     Memento: Time Travel for the Web
                                                           69
                     Updated Technical Details (05/2011)
TimeMaps: Link Format Syntax

    •  Document in the format of the value of the Link HTTP Header

    •  Format:
            <URI>;rel="type";attr="val", !
           !<URI2>…


    •  rel is the relationship between context URI and the URI in <>s
    •  The Context URI for TimeMaps is the URI with rel="original"
    •  Other rel types link to Mementos, TimeGates etc.!

<http://cnn.com/>;rel="original",

<http://web.archive.org/web/20010911223004/http://!
 cnn.com>;rel="memento";datetime="Mon, 11 Sep 2001 22:30:04 GMT"!




                        Memento: Time Travel for the Web
                                                              70
                        Updated Technical Details (05/2011)
TimeMaps: Link Attributes

•  Existing Attributes for Links
     •  "rel"               The type of relationship
     •  "type"              The (mime) formatt of the linked resource
     •  "title"             Title of the linked resource
     •  "hreflang"         !Language of linked resource
     •  "media"             Intended media (eg screen)
     •  "anchor"            URI to override context URI for link

•  New rel types introduced:
     •  "original"      The Original Resource
     •  "memento"        A Memento of the Original
     •  "timegate"       A TimeGate for the Original
     •  "timemap"        A TimeMap of Mementos of the Original!




                     Memento: Time Travel for the Web
                                                           71
                     Updated Technical Details (05/2011)
TimeMaps: Link Attributes

      •  New Attributes for Mementos:
           •  datetime The Memento-Datetime!
           •  license!License associated with Memento
           •  embargo!Time after which Memento is available
           •  Are others necessary?!

<http://cnn.com/>;rel="original",!
<http://web.archive.org/timegate/cnn.com/>;rel="timegate",!
<http://web.archive.org/timemap/link/cnn.com/>;rel="timemap",!
<http://web.archive.org/web/20010911223004/http://

 cnn.com>;rel="memento";datetime="Mon, 11 Sep 2001 22:30:04 GMT"!
<http://web.archive.org/web/…/cnn.com/>;rel="memento";

    license="http://archive.org/license/1";

    datetime="…";embargo="Mon, 20 Jul 2011 00:00:00 GMT",!
… !

                          Memento: Time Travel for the Web
                                                                72
                          Updated Technical Details (05/2011)
The Memento Framework:


Discovery to Support Integration of Past and Current Web


  Memento Discovery: All Mementos known by an
                    archive




                 Memento: Time Travel for the Web
                                                       73
                 Updated Technical Details (05/2011)
Batch discovery of Mementos: robots.txt

•  robots.txt file is used by Web servers to convey crawling
policies

•  Support discovery of Mementos through robots.txt via existing
User-agent and Allow directives
•  Use value memento for User-agent to convey that the value
for the Memento-Datetime header must remain sticky when
crawling/mirroring Mementos




                    Memento: Time Travel for the Web
                                                          74
                    Updated Technical Details (05/2011)
Batch discovery of Mementos: Memento Feeds

•  Concept:
    •  Archives publish feeds in which each entry provides details
    about a specific Memento, e.g. Memento-Datetime, Original
    Resource, etc.
    •  As new Mementos become available, new feeds with new
    entries are published
    •  Once published, feeds remain static

•  Technology:
     •  To be decided in collaboration with IIPC
     •  Inspired by the approach and functionality of CDX files (see
     http://www.archive.org/web/researcher/cdx_legend.php) but:
           •  With Memento-specific extensions;
           •  Possibly using different serialization;
           •  Including mechanisms to discover these feeds.


                     Memento: Time Travel for the Web
                                                           75
                     Updated Technical Details (05/2011)
The Memento Framework:




              Tools




  Memento: Time Travel for the Web
                                        76
  Updated Technical Details (05/2011)
Memento Client Support

                    •  Several client tools developed by
                       us and others

                    •  Add-ons for FireFox (operational)
                       and Internet Explorer
                       (experimental)

                    •  Applications for Android
                       (operational) and iPhone/iPad (in
                       development)

                    •  Paper in Code4Lib Journal
                        http://journal.code4lib.org/articles/4979




 Memento: Time Travel for the Web
                                       77
 Updated Technical Details (05/2011)
Memento Server Support




                              •  Plug-in for MediaWiki (operational)

                              •  Used on W3C’s main wiki

                              •  Please install it for your MediaWiki!




http://www.mediawiki.org/wiki/Extension:Memento
           Memento: Time Travel for the Web
                                                 78
           Updated Technical Details (05/2011)
Memento Server Support

                        •  Memento-compliant Wayback
                           software:

                             •  In production at the Internet
                                Archive

                             •  Available to Web archives,
                                worldwide

                             •  Please have your favorite Web
                                Archive experiment with the
                                new 1.6 version!




http://mementoweb.org/tools/wayback/
     Memento: Time Travel for the Web
                                           79
     Updated Technical Details (05/2011)
Memento Server Validator


                         •  Server side client:

                              •  Attempts to perform all
                                 Memento actions against a
                                 given URI

                              •  Reports success/failure of the
                                 interactions and warnings for
                                 optional aspects

                              •  Kept up to date with IETF
                                 Internet Draft


http://mementoweb.org/tools/validator/
      Memento: Time Travel for the Web
                                            80
      Updated Technical Details (05/2011)
Memento Proxy Support


                     •  Several systems that host
                        Mementos made Memento-
                        compliant “by proxy”

                          •  Many major Web Archives that
                             do not yet run Memento-
                             compliant software
                          •  3,000+ MediaWiki systems,
                             including Wikipedia, Wikia

                     •  We would love all of these to
                        become natively Memento
                        compliant!



  Memento: Time Travel for the Web
                                        81
  Updated Technical Details (05/2011)
Memento Aggregator TimeGate



                      •  Aggregates all known TimeGates
                          •  Proxies
                          •  Native Implementations

                      •  Redirects to authoritative
                         TimeGates (Wikipedia,
                         Transactional Archives)

                      •  Currently implemented with
                         BerkeleyDB
                      •  Future version to use
                         FaceBook's Cassandra platform



  Memento: Time Travel for the Web
                                        82
  Updated Technical Details (05/2011)
Aggregators Find More Mementos!




•  1000 URIs sampled from delicious.com
•  1 dot = 1 Memento (x=Memento-Datetime, y=URI of Original Resource)
•  Sorted by URI longevity
                        Memento: Time Travel for the Web
                                                              83
                        Updated Technical Details (05/2011)
But Still Too Few Mementos To Be Found…




•  1000 URIs sampled from search engine result pages;
•  See: “How Much of the Web is Archived?” JCDL 2011
                        Memento: Time Travel for the Web
                                                              84
                        Updated Technical Details (05/2011)
Crawl-Based Web Archives




                Observations
For example: Heritrix crawler for Internet Archive


           Memento: Time Travel for the Web
                                                 85
           Updated Technical Details (05/2011)
Crawl-Based Web Archives

•  Collect discreet observations of resources, not their entire
evolution.

•  Can be rejected (robots.txt, by user-agent, by host IP)

•  Can be deceived (cloaking, by geo-location, by user-agent).

•  Coverage of particular Web server dependent on crawl-strategy.




                      Memento: Time Travel for the Web
                                                            86
                      Updated Technical Details (05/2011)
Server-Side Transactional Web Archives




                   Change History
For example: TTApache, PageVault, Vignette Web Capture


              Memento: Time Travel for the Web
                                                    87
              Updated Technical Details (05/2011)
Server-Side Transactional Web Archives

•  Collect all representations served by to-be-archived server.

•  To-be-archived server needs to cooperate.
     •  Incentives e.g. institutional memory, official record of Web
     presence.

•  Archival coverage restricted by to-be-archived server, does not
include external servers (e.g. embedded resources).

•  To be archived server can submit falsified information.

•  Archival collection management: what to keep, what not (e.g.
significant changes, deduplication, …).




                      Memento: Time Travel for the Web
                                                            88
                      Updated Technical Details (05/2011)
Development of Transactional Web Archive Software
Capture:
   •  Apache connection filter module captures URI, headers, body
   •  POSTs in real-time to transactional archive




Access:
   •  Online, real time access via Memento TimeGates
   •  Batch Export via WARC files for long term preservation


                        Memento: Time Travel for the Web
                                                              89
                        Updated Technical Details (05/2011)
Development of Transactional Web Archive Software
Capture:
   •  Apache connection filter module captures URI, headers, body
   •  POSTs in real-time to transactional archive




Submit:
   •  Java-Grizzly-Jersey submission interface application
   •  Berkeley DB metadata store
   •  FS store for body and headers

                        Memento: Time Travel for the Web
                                                              90
                        Updated Technical Details (05/2011)
Development of Transactional Web Archive Software
Access:
   •  Transactional archive natively supports Memento
   •  Immediate availability of archived content
   •  Export of WARC, e.g. for long-term archiving in other environment




Development Timeline:
   •  Ongoing development (LANL) and testing (ODU)
   •  Submit/Access finalized; coding focus on collection management
   •  Expected release as open source, 3rd Quarter 2011

                        Memento: Time Travel for the Web
                                                              91
                        Updated Technical Details (05/2011)
The Memento Framework:




 Resource Versioning




  Memento: Time Travel for the Web
                                        92
  Updated Technical Details (05/2011)
Memento: Time Travel for the Web
                                      93
Updated Technical Details (05/2011)
Memento: Time Travel for the Web
                                      94
Updated Technical Details (05/2011)
Memento: Time Travel for the Web
                                      95
Updated Technical Details (05/2011)
Memento: Time Travel for the Web
                                      96
Updated Technical Details (05/2011)
Memento: Time Travel for the Web
                                      97
Updated Technical Details (05/2011)
Memento: Time Travel for the Web
                                      98
Updated Technical Details (05/2011)
Memento: Time Travel for the Web
                                      99
Updated Technical Details (05/2011)
Memento Framework

Original Resource: http://lanlsource.lanl.gov/pics/picoftheday.png




                   Memento: Time Travel for the Web
                                                         100
                   Updated Technical Details (05/2011)
Time Travel across Versions of a Picture of the Day




Movie at: http://www.mementoweb.org/demo/picoftheday.mov

               Memento: Time Travel for the Web
               Updated Technical Details (05/2011)
Memento: Time Travel for the Web
Updated Technical Details (05/2011)
Memento: Time Travel for the Web
Updated Technical Details (05/2011)
Memento: Time Travel for the Web
Updated Technical Details (05/2011)
Memento: Time Travel for the Web
Updated Technical Details (05/2011)
Memento: Time Travel for the Web
Updated Technical Details (05/2011)
Memento Framework

Original Resource: http://dbpedia.org/resource/France




             Memento: Time Travel for the Web
                                                   107
             Updated Technical Details (05/2011)
Time-Series Analysis across DBpedia Versions




      Data collected through HTTP Navigation

      paper at http://arxiv.org/abs/1003.3661
            Memento: Time Travel for the Web
                                                  108
            Updated Technical Details (05/2011)
The Memento Framework:




 About Memento-Datetime:

Archive Navigation Coherence




    Memento: Time Travel for the Web
                                          109
    Updated Technical Details (05/2011)
Resource History Recorded by CMS and Transactional Archives




                   Memento: Time Travel for the Web
                                                         110
                   Updated Technical Details (05/2011)
Navigate foo.html @ t4




Memento: Time Travel for the Web
                                      111
Updated Technical Details (05/2011)
Navigation Coherence for foo.html @ t4




        Memento: Time Travel for the Web
                                              112
        Updated Technical Details (05/2011)
Resource Observations Recorded by Crawler-Based Archives




                  Memento: Time Travel for the Web
                                                        113
                  Updated Technical Details (05/2011)
Missed Observations




Memento: Time Travel for the Web
                                      114
Updated Technical Details (05/2011)
Navigate foo.html @ t4




Memento: Time Travel for the Web
                                      115
Updated Technical Details (05/2011)
Navigation Incoherence foo.html @ t4




       Memento: Time Travel for the Web
                                             116
       Updated Technical Details (05/2011)
Increase Coherence with Observations from Multiple Archives?




                   Memento: Time Travel for the Web
                                                         117
                   Updated Technical Details (05/2011)
The Memento Framework:




          About Memento-Datetime:

Relation to Creation Datetime and Last-Modified




             Memento: Time Travel for the Web
                                                   118
             Updated Technical Details (05/2011)
Three Notions of Time

•  Creation: Datetime when the resource first came into being
•  Last-Modified: Datetime when the resource was last changed
•  Memento-Datetime: Datetime that the resource was “frozen”, e.g.
   as a result of:
    o  Archiving it at a different URI (e.g. in a CMS, Web Archive,
       Transactional Archive, Snapshot Archive);
    o  Deciding never to change it anymore and keeping it at its
       original URI.




                    Memento: Time Travel for the Web
                                                          119
                    Updated Technical Details (05/2011)
Creation = Memento-Datetime = Last-Modified


Cr
MD
LM



At a particular point in time,
the Original Resource is
observed, and the associated
Memento is created. All time
values for the Memento are
the same.



                         Memento: Time Travel for the Web
                                                               120
                         Updated Technical Details (05/2011)
Creation = Memento-Datetime < Last-Modified


Cr
MD
LM


The HTML archive banner
added to a Memento
necessitates a change in
Last-Modified of the
Memento, but Creation date
and Memento-Datetime
remain unchanged.



                       Memento: Time Travel for the Web
                                                             121
                       Updated Technical Details (05/2011)
Creation < Memento-Datetime <= Last-Modified


Cr
MD
LM



It is possible that the Original
Resource was a placeholder
resource and returned a 200
response before it started to
identify a Memento (URI-
R=URI-M).



                            Memento: Time Travel for the Web
                                                                  122
                            Updated Technical Details (05/2011)
Memento-Datetime < Creation <= Last-Modified


Cr
MD
LM

If a Memento is copied to a
new archive, the copied
Memento has a Creation and
Last-Modified equal to the
time of copying. The
Memento-Datetime is “sticky”
and is the same for the
Memento and its copy.



                        Memento: Time Travel for the Web
                                                              123
                        Updated Technical Details (05/2011)
The Memento Framework:




Persistent Web Annotations




   Memento: Time Travel for the Web
                                         124
   Updated Technical Details (05/2011)
Web-Centric Annotation: No Persistence




Google Sidewiki Annotation on http://news.bbc.co.uk/ as of 2010-06-14
                     Memento: Time Travel for the Web
                                                           125
                     Updated Technical Details (05/2011)
Web-Centric Annotation: No Persistence




                       Archived page from:
http://www.dracos.co.uk/work/bbc-news-archive/2010/03/08/07.05.html
                    Memento: Time Travel for the Web
                                                          126
                    Updated Technical Details (05/2011)
Web-Centric Annotation: Desired Persistence




          Memento: Time Travel for the Web
                                                127
          Updated Technical Details (05/2011)
Open Annotation: Dealing with Web Time

•  As regular Web resources, Body and Target of an Annotation have
representations that can change over time.

•  Body and Target can change independently of each other.

•  If an Annotation involves resources as they existed at a particular point
in time, this needs to be recorded.

•  The OAC model provides hooks for doing so:
     •  Timeless Annotations;
     •  Uniform Time Annotations;
     •  Varied Time Annotations.




                        Memento: Time Travel for the Web
                        Updated Technical Details (05/2011)   128
Open Annotation: Uniform Time Annotations
•  The Annotation is not always applicable, but pertains to the state of the
Body and Target at a specific moment in time.


•  Add oac:when property to the Annotation.




                        Memento: Time Travel for the Web
                        Updated Technical Details (05/2011)   129
Memento + Open Annotation: Persistent Annotations
•  In order to reconstruct the Annotation as intended: Use Memento to
obtain an archived representation of B and T as they existed at the
oac:when datetime.




                      Memento: Time Travel for the Web
                      Updated Technical Details (05/2011)   130
Create an Annotation




Memento: Time Travel for the Web
                                      131
Updated Technical Details (05/2011)
Reconstruct the Annotation without Memento




          Memento: Time Travel for the Web
                                                132
          Updated Technical Details (05/2011)
Reconstruct the Annotation with Memento




         Memento: Time Travel for the Web
                                               133
         Updated Technical Details (05/2011)
The Memento Framework:




The Increasing Value of a URI




    Memento: Time Travel for the Web
                                          134
    Updated Technical Details (05/2011)
URI as Access Point to a Page




    Memento: Time Travel for the Web
    Updated Technical Details (05/2011)
URI as Access Point to Page and Data




       Memento: Time Travel for the Web
       Updated Technical Details (05/2011)
URI as Access Point to Current and Past Pages and Data




                Memento: Time Travel for the Web
                Updated Technical Details (05/2011)
References

•  Van de Sompel, H., Nelson, M.L., Sanderson, R.,
   Balakireva, L., Ainsworth, S., Shankar, H. (2009)
   Memento: Time Travel for the Web.
    http://arxiv.org/abs/0911.1112
•  Van de Sompel, H., Sanderson, R., Nelson, M.L.,
   Balakireva, L., Ainsworth, S., Shankar, H. (2010) An
   HTTP-Based Versioning Mechanism for Linked Data.
   Proceedings of the 3rd Workshop on Linked Data on the
   Web. http://arxiv.org/abs/1003.3661
•  Sanderson, R., and Van de Sompel, H. (2010) Making Web
   Annotations Persistent over Time. Proceedings of the
   10th ACM/IEEE-CS Joint Conference on Digital libraries.
    http://arxiv.org/abs/1003.2643
•  Sanderson, R., Van de Sompel, H. (2011) Open Annotation
   Alpha3 Data Model Guide.
   http://www.openannotation.org/spec/alpha3/


                  Memento: Time Travel for the Web
                                                        138
                  Updated Technical Details (05/2011)
Memento wants to make navigating the Web’s Past Easy




                    http://mementoweb.org/
        http://groups.google.com/group/memento-dev
                 Memento: Time Travel for the Web
                                                       139
                 Updated Technical Details (05/2011)

More Related Content

Similar to Memento: Updated technical details (May 2011)

Transactional Archiving (Web Archive Globalization Workshop)
Transactional Archiving (Web Archive Globalization Workshop)Transactional Archiving (Web Archive Globalization Workshop)
Transactional Archiving (Web Archive Globalization Workshop)Robert Sanderson
 
An HTTP-Based Versioning Mechanism for Linked Data
An HTTP-Based Versioning Mechanism for Linked DataAn HTTP-Based Versioning Mechanism for Linked Data
An HTTP-Based Versioning Mechanism for Linked DataHerbert Van de Sompel
 
facebook architecture for 600M users
facebook architecture for 600M usersfacebook architecture for 600M users
facebook architecture for 600M usersJongyoon Choi
 
Semantic Annotation and Search for Resources in the Next Generation Web
Semantic Annotation and Search for Resources in the Next Generation WebSemantic Annotation and Search for Resources in the Next Generation Web
Semantic Annotation and Search for Resources in the Next Generation Webajithranabahu
 
You Can't Search Without Data
You Can't Search Without DataYou Can't Search Without Data
You Can't Search Without DataBryan Bende
 
Devnexus 2018 - Let Your Data Flow with Apache NiFi
Devnexus 2018 - Let Your Data Flow with Apache NiFiDevnexus 2018 - Let Your Data Flow with Apache NiFi
Devnexus 2018 - Let Your Data Flow with Apache NiFiBryan Bende
 
Forget Duplicating Local Changes: Apache NiFi and the Flow Development Lifecy...
Forget Duplicating Local Changes: Apache NiFi and the Flow Development Lifecy...Forget Duplicating Local Changes: Apache NiFi and the Flow Development Lifecy...
Forget Duplicating Local Changes: Apache NiFi and the Flow Development Lifecy...DataWorks Summit
 
Digital Preservation - ODU
Digital Preservation - ODUDigital Preservation - ODU
Digital Preservation - ODUJustin Brunelle
 
Digital Preservation at ODU
Digital Preservation at ODUDigital Preservation at ODU
Digital Preservation at ODUJustin Brunelle
 
introduction and basic of web development
introduction and basic of web developmentintroduction and basic of web development
introduction and basic of web developmentamithvp002
 
WCM-8 A Tale of Two Web Quick Start Implementations
WCM-8 A Tale of Two Web Quick Start ImplementationsWCM-8 A Tale of Two Web Quick Start Implementations
WCM-8 A Tale of Two Web Quick Start ImplementationsAlfresco Software
 
An introduction to honeyclient technology
An introduction to honeyclient technologyAn introduction to honeyclient technology
An introduction to honeyclient technologyAngelo Dell'Aera
 

Similar to Memento: Updated technical details (May 2011) (20)

Transactional Archiving (Web Archive Globalization Workshop)
Transactional Archiving (Web Archive Globalization Workshop)Transactional Archiving (Web Archive Globalization Workshop)
Transactional Archiving (Web Archive Globalization Workshop)
 
An HTTP-Based Versioning Mechanism for Linked Data
An HTTP-Based Versioning Mechanism for Linked DataAn HTTP-Based Versioning Mechanism for Linked Data
An HTTP-Based Versioning Mechanism for Linked Data
 
facebook architecture for 600M users
facebook architecture for 600M usersfacebook architecture for 600M users
facebook architecture for 600M users
 
Semantic Annotation and Search for Resources in the Next Generation Web
Semantic Annotation and Search for Resources in the Next Generation WebSemantic Annotation and Search for Resources in the Next Generation Web
Semantic Annotation and Search for Resources in the Next Generation Web
 
Websockets
WebsocketsWebsockets
Websockets
 
You Can't Search Without Data
You Can't Search Without DataYou Can't Search Without Data
You Can't Search Without Data
 
Web Standards
Web StandardsWeb Standards
Web Standards
 
Modern Web Applications
Modern Web ApplicationsModern Web Applications
Modern Web Applications
 
J web socket
J web socketJ web socket
J web socket
 
WebSockets - Boosting Web Communication - SDC 2011
WebSockets - Boosting Web Communication - SDC 2011WebSockets - Boosting Web Communication - SDC 2011
WebSockets - Boosting Web Communication - SDC 2011
 
Nifi workshop
Nifi workshopNifi workshop
Nifi workshop
 
Devnexus 2018 - Let Your Data Flow with Apache NiFi
Devnexus 2018 - Let Your Data Flow with Apache NiFiDevnexus 2018 - Let Your Data Flow with Apache NiFi
Devnexus 2018 - Let Your Data Flow with Apache NiFi
 
Forget Duplicating Local Changes: Apache NiFi and the Flow Development Lifecy...
Forget Duplicating Local Changes: Apache NiFi and the Flow Development Lifecy...Forget Duplicating Local Changes: Apache NiFi and the Flow Development Lifecy...
Forget Duplicating Local Changes: Apache NiFi and the Flow Development Lifecy...
 
Digital Preservation - ODU
Digital Preservation - ODUDigital Preservation - ODU
Digital Preservation - ODU
 
Digital Preservation at ODU
Digital Preservation at ODUDigital Preservation at ODU
Digital Preservation at ODU
 
Ad upresentation
Ad upresentationAd upresentation
Ad upresentation
 
introduction and basic of web development
introduction and basic of web developmentintroduction and basic of web development
introduction and basic of web development
 
Future of web_apps
Future of web_appsFuture of web_apps
Future of web_apps
 
WCM-8 A Tale of Two Web Quick Start Implementations
WCM-8 A Tale of Two Web Quick Start ImplementationsWCM-8 A Tale of Two Web Quick Start Implementations
WCM-8 A Tale of Two Web Quick Start Implementations
 
An introduction to honeyclient technology
An introduction to honeyclient technologyAn introduction to honeyclient technology
An introduction to honeyclient technology
 

More from Herbert Van de Sompel

The web is rotting and what to do about it
The web is rotting and what to do about itThe web is rotting and what to do about it
The web is rotting and what to do about itHerbert Van de Sompel
 
Researcher Pod: Scholarly Communication Using the Decentralized Web
Researcher Pod: Scholarly Communication Using the Decentralized WebResearcher Pod: Scholarly Communication Using the Decentralized Web
Researcher Pod: Scholarly Communication Using the Decentralized WebHerbert Van de Sompel
 
Persistent Identification: Easier Said than Done
Persistent Identification: Easier Said than DonePersistent Identification: Easier Said than Done
Persistent Identification: Easier Said than DoneHerbert Van de Sompel
 
FAIR Signposting: A KISS Approach to a Burning Issue
FAIR Signposting: A KISS Approach to a Burning IssueFAIR Signposting: A KISS Approach to a Burning Issue
FAIR Signposting: A KISS Approach to a Burning IssueHerbert Van de Sompel
 
Registration / Certification Interoperability Architecture (overlay peer-review)
Registration / Certification Interoperability Architecture (overlay peer-review)Registration / Certification Interoperability Architecture (overlay peer-review)
Registration / Certification Interoperability Architecture (overlay peer-review)Herbert Van de Sompel
 
Collecting the organizational scholarly record
Collecting the organizational scholarly recordCollecting the organizational scholarly record
Collecting the organizational scholarly recordHerbert Van de Sompel
 
Achieving Link Integrity for Managed Collections
Achieving Link Integrity for Managed CollectionsAchieving Link Integrity for Managed Collections
Achieving Link Integrity for Managed CollectionsHerbert Van de Sompel
 
Signposting Overview (Version November 2017)
Signposting Overview (Version November 2017)Signposting Overview (Version November 2017)
Signposting Overview (Version November 2017)Herbert Van de Sompel
 
DBpedia Archive using Memento, Triple Pattern Fragments, and HDT
DBpedia Archive using Memento, Triple Pattern Fragments, and HDTDBpedia Archive using Memento, Triple Pattern Fragments, and HDT
DBpedia Archive using Memento, Triple Pattern Fragments, and HDTHerbert Van de Sompel
 
Interoperability for web based scholarship
Interoperability for web based scholarshipInteroperability for web based scholarship
Interoperability for web based scholarshipHerbert Van de Sompel
 
A Perspective on Archiving the Scholarly Record
A Perspective on Archiving the Scholarly RecordA Perspective on Archiving the Scholarly Record
A Perspective on Archiving the Scholarly RecordHerbert Van de Sompel
 

More from Herbert Van de Sompel (20)

The web is rotting and what to do about it
The web is rotting and what to do about itThe web is rotting and what to do about it
The web is rotting and what to do about it
 
Researcher Pod: Scholarly Communication Using the Decentralized Web
Researcher Pod: Scholarly Communication Using the Decentralized WebResearcher Pod: Scholarly Communication Using the Decentralized Web
Researcher Pod: Scholarly Communication Using the Decentralized Web
 
Persistent Identification: Easier Said than Done
Persistent Identification: Easier Said than DonePersistent Identification: Easier Said than Done
Persistent Identification: Easier Said than Done
 
FAIR Signposting: A KISS Approach to a Burning Issue
FAIR Signposting: A KISS Approach to a Burning IssueFAIR Signposting: A KISS Approach to a Burning Issue
FAIR Signposting: A KISS Approach to a Burning Issue
 
Registration / Certification Interoperability Architecture (overlay peer-review)
Registration / Certification Interoperability Architecture (overlay peer-review)Registration / Certification Interoperability Architecture (overlay peer-review)
Registration / Certification Interoperability Architecture (overlay peer-review)
 
Collecting the organizational scholarly record
Collecting the organizational scholarly recordCollecting the organizational scholarly record
Collecting the organizational scholarly record
 
To the Rescue of Scholarly Orphans
To the Rescue of Scholarly OrphansTo the Rescue of Scholarly Orphans
To the Rescue of Scholarly Orphans
 
Almost two decades at LANL
Almost two decades at LANLAlmost two decades at LANL
Almost two decades at LANL
 
Perseverance on Persistence
Perseverance on PersistencePerseverance on Persistence
Perseverance on Persistence
 
Paul Evan Peters Lecture
Paul Evan Peters LecturePaul Evan Peters Lecture
Paul Evan Peters Lecture
 
Achieving Link Integrity for Managed Collections
Achieving Link Integrity for Managed CollectionsAchieving Link Integrity for Managed Collections
Achieving Link Integrity for Managed Collections
 
Signposting Overview (Version November 2017)
Signposting Overview (Version November 2017)Signposting Overview (Version November 2017)
Signposting Overview (Version November 2017)
 
Signposting Overview
Signposting OverviewSignposting Overview
Signposting Overview
 
PID Signposting Pattern
PID Signposting PatternPID Signposting Pattern
PID Signposting Pattern
 
DBpedia Archive using Memento, Triple Pattern Fragments, and HDT
DBpedia Archive using Memento, Triple Pattern Fragments, and HDTDBpedia Archive using Memento, Triple Pattern Fragments, and HDT
DBpedia Archive using Memento, Triple Pattern Fragments, and HDT
 
Interoperability for web based scholarship
Interoperability for web based scholarshipInteroperability for web based scholarship
Interoperability for web based scholarship
 
Reminiscing about interoperability
Reminiscing about interoperabilityReminiscing about interoperability
Reminiscing about interoperability
 
Creating Pockets of Persistence
Creating Pockets of PersistenceCreating Pockets of Persistence
Creating Pockets of Persistence
 
ResourceSync Quick Overview
ResourceSync Quick OverviewResourceSync Quick Overview
ResourceSync Quick Overview
 
A Perspective on Archiving the Scholarly Record
A Perspective on Archiving the Scholarly RecordA Perspective on Archiving the Scholarly Record
A Perspective on Archiving the Scholarly Record
 

Recently uploaded

Mysore Call Girls 7001305949 WhatsApp Number 24x7 Best Services
Mysore Call Girls 7001305949 WhatsApp Number 24x7 Best ServicesMysore Call Girls 7001305949 WhatsApp Number 24x7 Best Services
Mysore Call Girls 7001305949 WhatsApp Number 24x7 Best Servicesnajka9823
 
ppt on Myself, Occupation and my Interest
ppt on Myself, Occupation and my Interestppt on Myself, Occupation and my Interest
ppt on Myself, Occupation and my InterestNagaissenValaydum
 
Dubai Call Girls Bikni O528786472 Call Girls Dubai Ebony
Dubai Call Girls Bikni O528786472 Call Girls Dubai EbonyDubai Call Girls Bikni O528786472 Call Girls Dubai Ebony
Dubai Call Girls Bikni O528786472 Call Girls Dubai Ebonyhf8803863
 
Expert Pool Table Refelting in Lee & Collier County, FL
Expert Pool Table Refelting in Lee & Collier County, FLExpert Pool Table Refelting in Lee & Collier County, FL
Expert Pool Table Refelting in Lee & Collier County, FLAll American Billiards
 
Resultados del Campeonato mundial de Marcha por equipos Antalya 2024
Resultados del Campeonato mundial de Marcha por equipos Antalya 2024Resultados del Campeonato mundial de Marcha por equipos Antalya 2024
Resultados del Campeonato mundial de Marcha por equipos Antalya 2024Judith Chuquipul
 
Technical Data | ThermTec Wild 650 | Optics Trade
Technical Data | ThermTec Wild 650 | Optics TradeTechnical Data | ThermTec Wild 650 | Optics Trade
Technical Data | ThermTec Wild 650 | Optics TradeOptics-Trade
 
Presentation: The symbols of the Olympic Games
Presentation: The symbols of the Olympic  GamesPresentation: The symbols of the Olympic  Games
Presentation: The symbols of the Olympic Gamesluciavilafernandez
 
Chennai Call Girls Anna Nagar Phone 🍆 8250192130 👅 celebrity escorts service
Chennai Call Girls Anna Nagar Phone 🍆 8250192130 👅 celebrity escorts serviceChennai Call Girls Anna Nagar Phone 🍆 8250192130 👅 celebrity escorts service
Chennai Call Girls Anna Nagar Phone 🍆 8250192130 👅 celebrity escorts servicevipmodelshub1
 
JORNADA 3 LIGA MURO 2024GHGHGHGHGHGH.pdf
JORNADA 3 LIGA MURO 2024GHGHGHGHGHGH.pdfJORNADA 3 LIGA MURO 2024GHGHGHGHGHGH.pdf
JORNADA 3 LIGA MURO 2024GHGHGHGHGHGH.pdfArturo Pacheco Alvarez
 
8377087607 ☎, Cash On Delivery Call Girls Service In Hauz Khas Delhi Enjoy 24/7
8377087607 ☎, Cash On Delivery Call Girls Service In Hauz Khas Delhi Enjoy 24/78377087607 ☎, Cash On Delivery Call Girls Service In Hauz Khas Delhi Enjoy 24/7
8377087607 ☎, Cash On Delivery Call Girls Service In Hauz Khas Delhi Enjoy 24/7dollysharma2066
 
办理学位证(KCL文凭证书)伦敦国王学院毕业证成绩单原版一模一样
办理学位证(KCL文凭证书)伦敦国王学院毕业证成绩单原版一模一样办理学位证(KCL文凭证书)伦敦国王学院毕业证成绩单原版一模一样
办理学位证(KCL文凭证书)伦敦国王学院毕业证成绩单原版一模一样7pn7zv3i
 
Croatia vs Italy UEFA Euro 2024 Croatia's Checkered Legacy on Display in New ...
Croatia vs Italy UEFA Euro 2024 Croatia's Checkered Legacy on Display in New ...Croatia vs Italy UEFA Euro 2024 Croatia's Checkered Legacy on Display in New ...
Croatia vs Italy UEFA Euro 2024 Croatia's Checkered Legacy on Display in New ...Eticketing.co
 
Instruction Manual | ThermTec Wild Thermal Monoculars | Optics Trade
Instruction Manual | ThermTec Wild Thermal Monoculars | Optics TradeInstruction Manual | ThermTec Wild Thermal Monoculars | Optics Trade
Instruction Manual | ThermTec Wild Thermal Monoculars | Optics TradeOptics-Trade
 
VIP Kolkata Call Girl Liluah 👉 8250192130 Available With Room
VIP Kolkata Call Girl Liluah 👉 8250192130  Available With RoomVIP Kolkata Call Girl Liluah 👉 8250192130  Available With Room
VIP Kolkata Call Girl Liluah 👉 8250192130 Available With Roomdivyansh0kumar0
 
Call Girls in Dhaula Kuan 💯Call Us 🔝8264348440🔝
Call Girls in Dhaula Kuan 💯Call Us 🔝8264348440🔝Call Girls in Dhaula Kuan 💯Call Us 🔝8264348440🔝
Call Girls in Dhaula Kuan 💯Call Us 🔝8264348440🔝soniya singh
 
Real Moto 2 MOD APK v1.1.721 All Bikes, Unlimited Money
Real Moto 2 MOD APK v1.1.721 All Bikes, Unlimited MoneyReal Moto 2 MOD APK v1.1.721 All Bikes, Unlimited Money
Real Moto 2 MOD APK v1.1.721 All Bikes, Unlimited MoneyApk Toly
 
Interpreting the Secrets of Milan Night Chart
Interpreting the Secrets of Milan Night ChartInterpreting the Secrets of Milan Night Chart
Interpreting the Secrets of Milan Night ChartChart Kalyan
 
Technical Data | ThermTec Wild 325 | Optics Trade
Technical Data | ThermTec Wild 325 | Optics TradeTechnical Data | ThermTec Wild 325 | Optics Trade
Technical Data | ThermTec Wild 325 | Optics TradeOptics-Trade
 

Recently uploaded (20)

FULL ENJOY Call Girls In Savitri Nagar (Delhi) Call Us 9953056974
FULL ENJOY Call Girls In  Savitri Nagar (Delhi) Call Us 9953056974FULL ENJOY Call Girls In  Savitri Nagar (Delhi) Call Us 9953056974
FULL ENJOY Call Girls In Savitri Nagar (Delhi) Call Us 9953056974
 
Mysore Call Girls 7001305949 WhatsApp Number 24x7 Best Services
Mysore Call Girls 7001305949 WhatsApp Number 24x7 Best ServicesMysore Call Girls 7001305949 WhatsApp Number 24x7 Best Services
Mysore Call Girls 7001305949 WhatsApp Number 24x7 Best Services
 
ppt on Myself, Occupation and my Interest
ppt on Myself, Occupation and my Interestppt on Myself, Occupation and my Interest
ppt on Myself, Occupation and my Interest
 
young Call girls in Moolchand 🔝 9953056974 🔝 Delhi escort Service
young Call girls in Moolchand 🔝 9953056974 🔝 Delhi escort Serviceyoung Call girls in Moolchand 🔝 9953056974 🔝 Delhi escort Service
young Call girls in Moolchand 🔝 9953056974 🔝 Delhi escort Service
 
Dubai Call Girls Bikni O528786472 Call Girls Dubai Ebony
Dubai Call Girls Bikni O528786472 Call Girls Dubai EbonyDubai Call Girls Bikni O528786472 Call Girls Dubai Ebony
Dubai Call Girls Bikni O528786472 Call Girls Dubai Ebony
 
Expert Pool Table Refelting in Lee & Collier County, FL
Expert Pool Table Refelting in Lee & Collier County, FLExpert Pool Table Refelting in Lee & Collier County, FL
Expert Pool Table Refelting in Lee & Collier County, FL
 
Resultados del Campeonato mundial de Marcha por equipos Antalya 2024
Resultados del Campeonato mundial de Marcha por equipos Antalya 2024Resultados del Campeonato mundial de Marcha por equipos Antalya 2024
Resultados del Campeonato mundial de Marcha por equipos Antalya 2024
 
Technical Data | ThermTec Wild 650 | Optics Trade
Technical Data | ThermTec Wild 650 | Optics TradeTechnical Data | ThermTec Wild 650 | Optics Trade
Technical Data | ThermTec Wild 650 | Optics Trade
 
Presentation: The symbols of the Olympic Games
Presentation: The symbols of the Olympic  GamesPresentation: The symbols of the Olympic  Games
Presentation: The symbols of the Olympic Games
 
Chennai Call Girls Anna Nagar Phone 🍆 8250192130 👅 celebrity escorts service
Chennai Call Girls Anna Nagar Phone 🍆 8250192130 👅 celebrity escorts serviceChennai Call Girls Anna Nagar Phone 🍆 8250192130 👅 celebrity escorts service
Chennai Call Girls Anna Nagar Phone 🍆 8250192130 👅 celebrity escorts service
 
JORNADA 3 LIGA MURO 2024GHGHGHGHGHGH.pdf
JORNADA 3 LIGA MURO 2024GHGHGHGHGHGH.pdfJORNADA 3 LIGA MURO 2024GHGHGHGHGHGH.pdf
JORNADA 3 LIGA MURO 2024GHGHGHGHGHGH.pdf
 
8377087607 ☎, Cash On Delivery Call Girls Service In Hauz Khas Delhi Enjoy 24/7
8377087607 ☎, Cash On Delivery Call Girls Service In Hauz Khas Delhi Enjoy 24/78377087607 ☎, Cash On Delivery Call Girls Service In Hauz Khas Delhi Enjoy 24/7
8377087607 ☎, Cash On Delivery Call Girls Service In Hauz Khas Delhi Enjoy 24/7
 
办理学位证(KCL文凭证书)伦敦国王学院毕业证成绩单原版一模一样
办理学位证(KCL文凭证书)伦敦国王学院毕业证成绩单原版一模一样办理学位证(KCL文凭证书)伦敦国王学院毕业证成绩单原版一模一样
办理学位证(KCL文凭证书)伦敦国王学院毕业证成绩单原版一模一样
 
Croatia vs Italy UEFA Euro 2024 Croatia's Checkered Legacy on Display in New ...
Croatia vs Italy UEFA Euro 2024 Croatia's Checkered Legacy on Display in New ...Croatia vs Italy UEFA Euro 2024 Croatia's Checkered Legacy on Display in New ...
Croatia vs Italy UEFA Euro 2024 Croatia's Checkered Legacy on Display in New ...
 
Instruction Manual | ThermTec Wild Thermal Monoculars | Optics Trade
Instruction Manual | ThermTec Wild Thermal Monoculars | Optics TradeInstruction Manual | ThermTec Wild Thermal Monoculars | Optics Trade
Instruction Manual | ThermTec Wild Thermal Monoculars | Optics Trade
 
VIP Kolkata Call Girl Liluah 👉 8250192130 Available With Room
VIP Kolkata Call Girl Liluah 👉 8250192130  Available With RoomVIP Kolkata Call Girl Liluah 👉 8250192130  Available With Room
VIP Kolkata Call Girl Liluah 👉 8250192130 Available With Room
 
Call Girls in Dhaula Kuan 💯Call Us 🔝8264348440🔝
Call Girls in Dhaula Kuan 💯Call Us 🔝8264348440🔝Call Girls in Dhaula Kuan 💯Call Us 🔝8264348440🔝
Call Girls in Dhaula Kuan 💯Call Us 🔝8264348440🔝
 
Real Moto 2 MOD APK v1.1.721 All Bikes, Unlimited Money
Real Moto 2 MOD APK v1.1.721 All Bikes, Unlimited MoneyReal Moto 2 MOD APK v1.1.721 All Bikes, Unlimited Money
Real Moto 2 MOD APK v1.1.721 All Bikes, Unlimited Money
 
Interpreting the Secrets of Milan Night Chart
Interpreting the Secrets of Milan Night ChartInterpreting the Secrets of Milan Night Chart
Interpreting the Secrets of Milan Night Chart
 
Technical Data | ThermTec Wild 325 | Optics Trade
Technical Data | ThermTec Wild 325 | Optics TradeTechnical Data | ThermTec Wild 325 | Optics Trade
Technical Data | ThermTec Wild 325 | Optics Trade
 

Memento: Updated technical details (May 2011)

  • 1. Memento: Time Travel for the Web   Herbert Van de Sompel Robert Sanderson Michael L. Nelson http://mementoweb.org/ Memento is funded by The Library of Congress   Updated Technical Details (May 2011) Memento: Time Travel for the Web Updated Technical Details (05/2011)
  • 2. Memento wants to make it Easy to navigate the Web of the Past Technical Specification https://datatracker.ietf.org/doc/draft-vandesompel-memento/ Memento: Time Travel for the Web 2 Updated Technical Details (05/2011)
  • 3. Tate Online Select Date Tate Online Today March 16 2008 March 16 2008 From National Archives Memento: Time Travel for the Web 3 Updated Technical Details (05/2011)
  • 4. Memento achieves this by introducing a uniform version access capability to integrate the past and current Web Memento: Time Travel for the Web 4 Updated Technical Details (05/2011)
  • 5. Problem Statement … Memento: Time Travel for the Web 5 Updated Technical Details (05/2011)
  • 6. Resources Memento: Time Travel for the Web 6 Updated Technical Details (05/2011)
  • 7. Resources have Representations Memento: Time Travel for the Web 7 Updated Technical Details (05/2011)
  • 8. Resources have Representations that Change over Time Memento: Time Travel for the Web 8 Updated Technical Details (05/2011)
  • 9. Only the Current Representation is Available from a Resource Memento: Time Travel for the Web 9 Updated Technical Details (05/2011)
  • 10. Old Representations are Lost Forever Memento: Time Travel for the Web 10 Updated Technical Details (05/2011)
  • 11. Archived/Version Resources Exist Memento: Time Travel for the Web 11 Updated Technical Details (05/2011)
  • 12. Resource Versions on the Web •  Content Management Systems •  Web Archives •  Transactional archives •  Search engine caches •  … Memento: Time Travel for the Web 12 Updated Technical Details (05/2011)
  • 13. Sep 11 2001, 20:36:10 UTC Dec 20 2001, 4:51:00 UTC Archived Resources http://en.wikipedia.org/w/index.php? http://web.archive.org/web/20010911203610/http:// title=September_11_attacks&oldid=282333 archived www.cnn.com/ archived resource for http://cnn.com resource for http://en.wikipedia.org/wiki/ September_11_attacks Memento: Time Travel for the Web Updated Technical Details (05/2011) 13
  • 14. Versions are Not Integrated with the Web •  Cannot talk about a resource as it used to exist •  Cannot access a prior version knowing the current one •  Cannot access the current version knowing a prior one Memento: Time Travel for the Web 14 Updated Technical Details (05/2011)
  • 15. Memento Wants to Integrate the Past and Current Web Memento: Time Travel for the Web 15 Updated Technical Details (05/2011)
  • 16. The Memento Framework: Protocol to Integrate Past and Current Web Overview Memento: Time Travel for the Web 16 Updated Technical Details (05/2011)
  • 17. Memento Framework •  Regards the Web as a big Content Management System •  Introduces a uniform capability to access versions on the Web •  Does not build new archives but leverages existing systems that host versions Memento: Time Travel for the Web 17 Updated Technical Details (05/2011)
  • 18. Memento Framework •  Is distributed: versions may exist on several servers •  Uses time as a global version indicator •  Is based on the primitives of the Web: resource, resource state, representation, content negotiation, link Memento: Time Travel for the Web 18 Updated Technical Details (05/2011)
  • 19. Original Resources and Mementos Memento: Time Travel for the Web 19 Updated Technical Details (05/2011)
  • 20. Bridge from Present to Past Memento: Time Travel for the Web 20 Updated Technical Details (05/2011)
  • 21. Bridge from Past to Present Memento: Time Travel for the Web 21 Updated Technical Details (05/2011)
  • 22. Memento Framework Memento: Time Travel for the Web 22 Updated Technical Details (05/2011)
  • 23. Memento Client Server Interaction Memento: Time Travel for the Web 23 Updated Technical Details (05/2011)
  • 24. Memento HTTP Flow HEAD R, Accept-Datetime Link  G GET G, Accept-Datetime 302  M, Vary, Link  M,R,T GET M, Accept-Datetime 200, Memento-Datetime, Link  M,R,T,G 24
  • 25. The Memento Framework: Protocol to Integrate Past and Current Web Interesting Cases Memento: Time Travel for the Web 25 Updated Technical Details (05/2011)
  • 26. Multiple Archives Memento: Time Travel for the Web 26 Updated Technical Details (05/2011)
  • 27. Original Resource Gone Memento: Time Travel for the Web 27 Updated Technical Details (05/2011)
  • 28. Original Resource’s Server Gone Memento: Time Travel for the Web 28 Updated Technical Details (05/2011)
  • 29. Original Resource Provides no Link Memento: Time Travel for the Web 29 Updated Technical Details (05/2011)
  • 30. The Memento Framework: Protocol to Integrate Past and Current Web HTTP Headers Memento: Time Travel for the Web 30 Updated Technical Details (05/2011)
  • 31. HTTP Headers used in Memento •  Defines two new headers: –  request: Accept-Datetime! –  response: Memento-Datetime! •  Introduce new content for two existing headers: –  response: Vary ; Link! •  Use one existing headers without modification: –  response: Location, TCN:! Memento: Time Travel for the Web 31 Updated Technical Details (05/2011)
  • 32. HTTP Request Headers: Accept-Datetime •  Accept-Datetime
 o  Issued against TimeGate, (Original Resource), (Memento) o  Header value: -  Desired datetime of Memento (MANDATORY) Must be in RFC 1123 format and in GMT -  Interval indicator to express the client is only interested in Mementos within the interval (OPTIONAL) –  Expressed as two ISO8601 durations: "-P3DT5H;+P2DT6H"! Accept-Datetime: Mon, 12 Oct 2009 14:20:33 GMT! Memento: Time Travel for the Web 32 Updated Technical Details (05/2011)
  • 33. HTTP Response Headers: Memento-Datetime •  Memento-Datetime! o  Returned by Mementos -  Always. Even when not via a TimeGate o  Header value: Archival datetime of the Memento -  Resource has not and will not change beyond that date o  This header is sticky: -  Once returned, must always return it with same value -  Must also be preserved when Mementos are mirrored at different URIs •  This header is crucial to allow a client to understand it has arrived at a Memento See: http://www.mementoweb.org/guide/resourcetype/ Memento-Datetime: Mon, 12 Oct 2009 14:20:33 GMT! Memento: Time Travel for the Web 33 Updated Technical Details (05/2011)
  • 34. HTTP Response Headers: Vary •  Vary! o  Returned by TimeGate o  Similar to regular content negotiation o  Header value: -  negotiate, accept-datetime! •  TimeGate must first meet the datetime preference, and then – if possible – other content negotiation preferences •  Note: accept-datetime value in Vary header is crucial to allow a client to understand it has arrived at a TimeGate. See: http://www.mementoweb.org/guide/resourcetype/ Vary: negotiate, accept-datetime! Memento: Time Travel for the Web 34 Updated Technical Details (05/2011)
  • 35. HTTP Response Headers: Location •  Location
 o  Returned by TimeGate o  Similar to regular content negotiation o  Header value: URI of selected Memento Location: http://web.archive.org/web/20010911223004/ http://cnn.com! Memento: Time Travel for the Web 35 Updated Technical Details (05/2011)
  • 36. HTTP Response Headers: Link •  Link! o  Returned by Original Resource, TimeGate and Mementos o  Various new Relation Types are introduced: -  “original”! -  “timegate”! -  “memento”! -  “timemap”! o  HTTP Link Header: RFC 5988 See: https://datatracker.ietf.org/doc/rfc5988/ Link: <http://web.archive.org/web/20010911223004/http://! cnn.com>;rel="memento";datetime="Mon, 11 Sep 2001 22:30:04 GMT"! Memento: Time Travel for the Web 36 Updated Technical Details (05/2011)
  • 37. Memento HTTP Flow HEAD R, Accept-Datetime Link  G timegate GET G, Accept-Datetime 302  M, Vary, Link  M,R,T memento,original,timemap GET M, Accept-Datetime 200, Memento-Datetime, Link  M,R,T,G memento,original,timemap,timegate
  • 38. HTTP Response Headers: Link •  Link! o  Returned by Original Resource, TimeGate and Mementos o  Various new Relation Types are introduced Memento: Time Travel for the Web 38 Updated Technical Details (05/2011)
  • 39. The Memento Framework: Protocol to Integrate Past and Current Web HTTP Interactions Memento: Time Travel for the Web 39 Updated Technical Details (05/2011)
  • 46. The Memento Framework: Protocol to Integrate Past and Current Web HTTP Link Header Details Memento: Time Travel for the Web 46 Updated Technical Details (05/2011)
  • 47. HTTP Response Headers: Link Memento: Time Travel for the Web 47 Updated Technical Details (05/2011)
  • 48. Memento HTTP Flow HEAD R, Accept-Datetime Link  G timegate GET G, Accept-Datetime 302  M, Vary, Link  M,R,T GET M, Accept-Datetime 200, Memento-Datetime, Link  M,R,T,G
  • 49. HTTP Response Headers: Link •  RECOMMENDED "timegate” Link from Original Resource to TimeGate •  If this Link is not available, the client must try and find a TimeGate itself, via: -  Memento Discovery approaches (see later) -  User interaction (e.g. Preferences in an application) Memento: Time Travel for the Web 49 Updated Technical Details (05/2011)
  • 50. HTTP Response Headers: Link Memento: Time Travel for the Web 50 Updated Technical Details (05/2011)
  • 51. Memento HTTP Flow HEAD R, Accept-Datetime Link  G GET G, Accept-Datetime 302  M, Vary, Link  M memento GET M, Accept-Datetime 200, Memento-Datetime, Link  M memento
  • 52. HTTP Response Headers: Link •  MANDATORY ”memento” Links from TimeGate and Memento to Mementos •  ”memento” Links point at the following Mementos know to the responding server: o  Selected Memento (MANDATORY if a Memento is selected) o  First Memento, Last Memento (MANDATORY) o  Memento prev to selected one, Memento next to selected one (RECOMMENDED) o  Other Mementos (OPTIONAL, and only if prev and next are provided) o  Temporal order of Mementos is expressed using existing Relation Types (RFC 5829, RFC 5988): first, last, next, prev, successor-version, predecessor-version Memento: Time Travel for the Web 52 Updated Technical Details (05/2011)
  • 53. HTTP Response Headers: Link •  MANDATORY ”memento” Links from TimeGate and Memento to Mementos •  Attributes for a ”memento” Link: o  datetime (MANDATORY) datetime of the Memento pointed at by the link o  license (OPTIONAL) license associated with the Memento o  embargo (OPTIONAL) datetime until which the Memento will remain inaccessible o  type (RECOMMENDED) mime type of the Memento. Memento: Time Travel for the Web 53 Updated Technical Details (05/2011)
  • 54. HTTP Response Headers: Link Memento: Time Travel for the Web 54 Updated Technical Details (05/2011)
  • 55. Memento HTTP Flow HEAD R, Accept-Datetime Link  M=R memento,original
  • 56. HTTP Response Headers: Link •  Both ”memento” and ”original” Link on a resource. •  The resource is its own Memento, i.e. it is a stable resource. o  Resource that was born stable or became stable; it will not change anymore. o  For example resources with PermaLink on news sites o  Note the difference with Last-Modified header •  Can still also provide a ”timegate” Link! o  For example pointing at TimeGates for Mementos of the resource before it became stable Memento: Time Travel for the Web 56 Updated Technical Details (05/2011)
  • 57. HTTP Response Headers: Link Memento: Time Travel for the Web 57 Updated Technical Details (05/2011)
  • 58. Memento HTTP Flow GET M, (Accept-Datetime) 200, Memento-Datetime, Link  M,R,T memento,original,timemap
  • 59. HTTP Response Headers: Link •  Mementos without a TimeGate, for example: o  Resources in snapshot archives o  Version resources in systems that have not yet implemented TimeGates •  Should still use Memento-Datetime header •  Should still use “original” Link •  Can have “timemap” Link Memento: Time Travel for the Web 59 Updated Technical Details (05/2011)
  • 60. HTTP Response Headers: Link Memento: Time Travel for the Web 60 Updated Technical Details (05/2011)
  • 61. Memento HTTP Flow HEAD R, Accept-Datetime Link  G GET G, Accept-Datetime 302  M, Vary, Link  T timemap GET M, Accept-Datetime 200, Memento-Datetime, Link  T timemap
  • 62. HTTP Response Headers: Link •  RECOMMENDED ”timemap” Links from TimeGate and Mementos. •  A TimeMap is introduced to allow retrieving an inventory of Mementos for an Original Resource that the responding server is aware of. It lists: o  URI of Original Resource (MANDATORY) o  URI and datetime of all known Mementos (MANDATORY) o  URI of TimeGate for Original Resource (RECOMMENDED) o  URI of TimeMap itself (RECOMMENDED) •  Multiple TimeMap serializations possible; link-value format MANDATORY o  application/link-format: see https://datatracker.ietf.org/doc/draft-ietf-core-link-format/ Memento: Time Travel for the Web 62 Updated Technical Details (05/2011)
  • 63. Memento HTTP Flow: GET TimeMap Memento: Time Travel for the Web 63 Updated Technical Details (05/2011)
  • 64. Memento HTTP Flow: TimeMap Response Memento: Time Travel for the Web Updated Technical Details (05/2011)
  • 65. The Memento Framework: Discovery to Support Integration of Past and Current Web TimeGate Discovery Memento: Time Travel for the Web 65 Updated Technical Details (05/2011)
  • 66. Batch Discovery of TimeGates: robots.txt •  robots.txt file is used by Web servers to convey crawling policies •  Web crawlers (such as for archives) retrieve and parse it •  De-facto standard, no official endorsement •  Extended with new directives, including by Google User-agent: * # Reject all crawlers! Disallow: /! # Google Sitemap extension! Sitemap: http://some.example.com/me/sitemap.xml! User-agent: NiceBot # Select only NiceBot! Crawl-delay: 10 # 10 seconds between requests! Memento: Time Travel for the Web 66 Updated Technical Details (05/2011)
  • 67. Batch Discovery of TimeGates: robots.txt •  Add TimeGate and Archived directives to support discovery of TimeGates known to the server •  User agent should concatenate desired URL with TimeGate link •  Archived value is truncated host/path or * to describe a general web archive http://mementoweb.org/guide/robotstxt/ Memento: Time Travel for the Web 67 Updated Technical Details (05/2011)
  • 68. The Memento Framework: Discovery to Support Integration of Present and Past Web Discovery via TimeMaps: All Mementos for a given Original Resource known by an archive Memento: Time Travel for the Web 68 Updated Technical Details (05/2011)
  • 69. TimeMap Overview •  A TimeMap is an inventory of Mementos for an Original Resource that the responding server is aware of. It lists at least: •  URI of Original Resource •  URI and datetime of all known Mementos •  URI of TimeGate for Original Resource •  URI of TimeMap itself •  Multiple TimeMap serializations possible: •  application/link-format mandatory see https://datatracker.ietf.org/doc/draft-ietf-core-link-format/ •  RDF TimeMaps proposed Memento: Time Travel for the Web 69 Updated Technical Details (05/2011)
  • 70. TimeMaps: Link Format Syntax •  Document in the format of the value of the Link HTTP Header •  Format: <URI>;rel="type";attr="val", ! !<URI2>…
 •  rel is the relationship between context URI and the URI in <>s •  The Context URI for TimeMaps is the URI with rel="original" •  Other rel types link to Mementos, TimeGates etc.! <http://cnn.com/>;rel="original",
 <http://web.archive.org/web/20010911223004/http://! cnn.com>;rel="memento";datetime="Mon, 11 Sep 2001 22:30:04 GMT"! Memento: Time Travel for the Web 70 Updated Technical Details (05/2011)
  • 71. TimeMaps: Link Attributes •  Existing Attributes for Links •  "rel" The type of relationship •  "type" The (mime) formatt of the linked resource •  "title" Title of the linked resource •  "hreflang" !Language of linked resource •  "media" Intended media (eg screen) •  "anchor" URI to override context URI for link •  New rel types introduced: •  "original" The Original Resource •  "memento" A Memento of the Original •  "timegate" A TimeGate for the Original •  "timemap" A TimeMap of Mementos of the Original! Memento: Time Travel for the Web 71 Updated Technical Details (05/2011)
  • 72. TimeMaps: Link Attributes •  New Attributes for Mementos: •  datetime The Memento-Datetime! •  license!License associated with Memento •  embargo!Time after which Memento is available •  Are others necessary?! <http://cnn.com/>;rel="original",! <http://web.archive.org/timegate/cnn.com/>;rel="timegate",! <http://web.archive.org/timemap/link/cnn.com/>;rel="timemap",! <http://web.archive.org/web/20010911223004/http://
 cnn.com>;rel="memento";datetime="Mon, 11 Sep 2001 22:30:04 GMT"! <http://web.archive.org/web/…/cnn.com/>;rel="memento";
 license="http://archive.org/license/1";
 datetime="…";embargo="Mon, 20 Jul 2011 00:00:00 GMT",! … ! Memento: Time Travel for the Web 72 Updated Technical Details (05/2011)
  • 73. The Memento Framework: Discovery to Support Integration of Past and Current Web Memento Discovery: All Mementos known by an archive Memento: Time Travel for the Web 73 Updated Technical Details (05/2011)
  • 74. Batch discovery of Mementos: robots.txt •  robots.txt file is used by Web servers to convey crawling policies •  Support discovery of Mementos through robots.txt via existing User-agent and Allow directives •  Use value memento for User-agent to convey that the value for the Memento-Datetime header must remain sticky when crawling/mirroring Mementos Memento: Time Travel for the Web 74 Updated Technical Details (05/2011)
  • 75. Batch discovery of Mementos: Memento Feeds •  Concept: •  Archives publish feeds in which each entry provides details about a specific Memento, e.g. Memento-Datetime, Original Resource, etc. •  As new Mementos become available, new feeds with new entries are published •  Once published, feeds remain static •  Technology: •  To be decided in collaboration with IIPC •  Inspired by the approach and functionality of CDX files (see http://www.archive.org/web/researcher/cdx_legend.php) but: •  With Memento-specific extensions; •  Possibly using different serialization; •  Including mechanisms to discover these feeds. Memento: Time Travel for the Web 75 Updated Technical Details (05/2011)
  • 76. The Memento Framework: Tools Memento: Time Travel for the Web 76 Updated Technical Details (05/2011)
  • 77. Memento Client Support •  Several client tools developed by us and others •  Add-ons for FireFox (operational) and Internet Explorer (experimental) •  Applications for Android (operational) and iPhone/iPad (in development) •  Paper in Code4Lib Journal http://journal.code4lib.org/articles/4979 Memento: Time Travel for the Web 77 Updated Technical Details (05/2011)
  • 78. Memento Server Support •  Plug-in for MediaWiki (operational) •  Used on W3C’s main wiki •  Please install it for your MediaWiki! http://www.mediawiki.org/wiki/Extension:Memento Memento: Time Travel for the Web 78 Updated Technical Details (05/2011)
  • 79. Memento Server Support •  Memento-compliant Wayback software: •  In production at the Internet Archive •  Available to Web archives, worldwide •  Please have your favorite Web Archive experiment with the new 1.6 version! http://mementoweb.org/tools/wayback/ Memento: Time Travel for the Web 79 Updated Technical Details (05/2011)
  • 80. Memento Server Validator •  Server side client: •  Attempts to perform all Memento actions against a given URI •  Reports success/failure of the interactions and warnings for optional aspects •  Kept up to date with IETF Internet Draft http://mementoweb.org/tools/validator/ Memento: Time Travel for the Web 80 Updated Technical Details (05/2011)
  • 81. Memento Proxy Support •  Several systems that host Mementos made Memento- compliant “by proxy” •  Many major Web Archives that do not yet run Memento- compliant software •  3,000+ MediaWiki systems, including Wikipedia, Wikia •  We would love all of these to become natively Memento compliant! Memento: Time Travel for the Web 81 Updated Technical Details (05/2011)
  • 82. Memento Aggregator TimeGate •  Aggregates all known TimeGates •  Proxies •  Native Implementations •  Redirects to authoritative TimeGates (Wikipedia, Transactional Archives) •  Currently implemented with BerkeleyDB •  Future version to use FaceBook's Cassandra platform Memento: Time Travel for the Web 82 Updated Technical Details (05/2011)
  • 83. Aggregators Find More Mementos! •  1000 URIs sampled from delicious.com •  1 dot = 1 Memento (x=Memento-Datetime, y=URI of Original Resource) •  Sorted by URI longevity Memento: Time Travel for the Web 83 Updated Technical Details (05/2011)
  • 84. But Still Too Few Mementos To Be Found… •  1000 URIs sampled from search engine result pages; •  See: “How Much of the Web is Archived?” JCDL 2011 Memento: Time Travel for the Web 84 Updated Technical Details (05/2011)
  • 85. Crawl-Based Web Archives Observations For example: Heritrix crawler for Internet Archive Memento: Time Travel for the Web 85 Updated Technical Details (05/2011)
  • 86. Crawl-Based Web Archives •  Collect discreet observations of resources, not their entire evolution. •  Can be rejected (robots.txt, by user-agent, by host IP) •  Can be deceived (cloaking, by geo-location, by user-agent). •  Coverage of particular Web server dependent on crawl-strategy. Memento: Time Travel for the Web 86 Updated Technical Details (05/2011)
  • 87. Server-Side Transactional Web Archives Change History For example: TTApache, PageVault, Vignette Web Capture Memento: Time Travel for the Web 87 Updated Technical Details (05/2011)
  • 88. Server-Side Transactional Web Archives •  Collect all representations served by to-be-archived server. •  To-be-archived server needs to cooperate. •  Incentives e.g. institutional memory, official record of Web presence. •  Archival coverage restricted by to-be-archived server, does not include external servers (e.g. embedded resources). •  To be archived server can submit falsified information. •  Archival collection management: what to keep, what not (e.g. significant changes, deduplication, …). Memento: Time Travel for the Web 88 Updated Technical Details (05/2011)
  • 89. Development of Transactional Web Archive Software Capture: •  Apache connection filter module captures URI, headers, body •  POSTs in real-time to transactional archive Access: •  Online, real time access via Memento TimeGates •  Batch Export via WARC files for long term preservation Memento: Time Travel for the Web 89 Updated Technical Details (05/2011)
  • 90. Development of Transactional Web Archive Software Capture: •  Apache connection filter module captures URI, headers, body •  POSTs in real-time to transactional archive Submit: •  Java-Grizzly-Jersey submission interface application •  Berkeley DB metadata store •  FS store for body and headers Memento: Time Travel for the Web 90 Updated Technical Details (05/2011)
  • 91. Development of Transactional Web Archive Software Access: •  Transactional archive natively supports Memento •  Immediate availability of archived content •  Export of WARC, e.g. for long-term archiving in other environment Development Timeline: •  Ongoing development (LANL) and testing (ODU) •  Submit/Access finalized; coding focus on collection management •  Expected release as open source, 3rd Quarter 2011 Memento: Time Travel for the Web 91 Updated Technical Details (05/2011)
  • 92. The Memento Framework: Resource Versioning Memento: Time Travel for the Web 92 Updated Technical Details (05/2011)
  • 93. Memento: Time Travel for the Web 93 Updated Technical Details (05/2011)
  • 94. Memento: Time Travel for the Web 94 Updated Technical Details (05/2011)
  • 95. Memento: Time Travel for the Web 95 Updated Technical Details (05/2011)
  • 96. Memento: Time Travel for the Web 96 Updated Technical Details (05/2011)
  • 97. Memento: Time Travel for the Web 97 Updated Technical Details (05/2011)
  • 98. Memento: Time Travel for the Web 98 Updated Technical Details (05/2011)
  • 99. Memento: Time Travel for the Web 99 Updated Technical Details (05/2011)
  • 100. Memento Framework Original Resource: http://lanlsource.lanl.gov/pics/picoftheday.png Memento: Time Travel for the Web 100 Updated Technical Details (05/2011)
  • 101. Time Travel across Versions of a Picture of the Day Movie at: http://www.mementoweb.org/demo/picoftheday.mov Memento: Time Travel for the Web Updated Technical Details (05/2011)
  • 102. Memento: Time Travel for the Web Updated Technical Details (05/2011)
  • 103. Memento: Time Travel for the Web Updated Technical Details (05/2011)
  • 104. Memento: Time Travel for the Web Updated Technical Details (05/2011)
  • 105. Memento: Time Travel for the Web Updated Technical Details (05/2011)
  • 106. Memento: Time Travel for the Web Updated Technical Details (05/2011)
  • 107. Memento Framework Original Resource: http://dbpedia.org/resource/France Memento: Time Travel for the Web 107 Updated Technical Details (05/2011)
  • 108. Time-Series Analysis across DBpedia Versions Data collected through HTTP Navigation paper at http://arxiv.org/abs/1003.3661 Memento: Time Travel for the Web 108 Updated Technical Details (05/2011)
  • 109. The Memento Framework: About Memento-Datetime: Archive Navigation Coherence Memento: Time Travel for the Web 109 Updated Technical Details (05/2011)
  • 110. Resource History Recorded by CMS and Transactional Archives Memento: Time Travel for the Web 110 Updated Technical Details (05/2011)
  • 111. Navigate foo.html @ t4 Memento: Time Travel for the Web 111 Updated Technical Details (05/2011)
  • 112. Navigation Coherence for foo.html @ t4 Memento: Time Travel for the Web 112 Updated Technical Details (05/2011)
  • 113. Resource Observations Recorded by Crawler-Based Archives Memento: Time Travel for the Web 113 Updated Technical Details (05/2011)
  • 114. Missed Observations Memento: Time Travel for the Web 114 Updated Technical Details (05/2011)
  • 115. Navigate foo.html @ t4 Memento: Time Travel for the Web 115 Updated Technical Details (05/2011)
  • 116. Navigation Incoherence foo.html @ t4 Memento: Time Travel for the Web 116 Updated Technical Details (05/2011)
  • 117. Increase Coherence with Observations from Multiple Archives? Memento: Time Travel for the Web 117 Updated Technical Details (05/2011)
  • 118. The Memento Framework: About Memento-Datetime: Relation to Creation Datetime and Last-Modified Memento: Time Travel for the Web 118 Updated Technical Details (05/2011)
  • 119. Three Notions of Time •  Creation: Datetime when the resource first came into being •  Last-Modified: Datetime when the resource was last changed •  Memento-Datetime: Datetime that the resource was “frozen”, e.g. as a result of: o  Archiving it at a different URI (e.g. in a CMS, Web Archive, Transactional Archive, Snapshot Archive); o  Deciding never to change it anymore and keeping it at its original URI. Memento: Time Travel for the Web 119 Updated Technical Details (05/2011)
  • 120. Creation = Memento-Datetime = Last-Modified Cr MD LM At a particular point in time, the Original Resource is observed, and the associated Memento is created. All time values for the Memento are the same. Memento: Time Travel for the Web 120 Updated Technical Details (05/2011)
  • 121. Creation = Memento-Datetime < Last-Modified Cr MD LM The HTML archive banner added to a Memento necessitates a change in Last-Modified of the Memento, but Creation date and Memento-Datetime remain unchanged. Memento: Time Travel for the Web 121 Updated Technical Details (05/2011)
  • 122. Creation < Memento-Datetime <= Last-Modified Cr MD LM It is possible that the Original Resource was a placeholder resource and returned a 200 response before it started to identify a Memento (URI- R=URI-M). Memento: Time Travel for the Web 122 Updated Technical Details (05/2011)
  • 123. Memento-Datetime < Creation <= Last-Modified Cr MD LM If a Memento is copied to a new archive, the copied Memento has a Creation and Last-Modified equal to the time of copying. The Memento-Datetime is “sticky” and is the same for the Memento and its copy. Memento: Time Travel for the Web 123 Updated Technical Details (05/2011)
  • 124. The Memento Framework: Persistent Web Annotations Memento: Time Travel for the Web 124 Updated Technical Details (05/2011)
  • 125. Web-Centric Annotation: No Persistence Google Sidewiki Annotation on http://news.bbc.co.uk/ as of 2010-06-14 Memento: Time Travel for the Web 125 Updated Technical Details (05/2011)
  • 126. Web-Centric Annotation: No Persistence Archived page from: http://www.dracos.co.uk/work/bbc-news-archive/2010/03/08/07.05.html Memento: Time Travel for the Web 126 Updated Technical Details (05/2011)
  • 127. Web-Centric Annotation: Desired Persistence Memento: Time Travel for the Web 127 Updated Technical Details (05/2011)
  • 128. Open Annotation: Dealing with Web Time •  As regular Web resources, Body and Target of an Annotation have representations that can change over time. •  Body and Target can change independently of each other. •  If an Annotation involves resources as they existed at a particular point in time, this needs to be recorded. •  The OAC model provides hooks for doing so: •  Timeless Annotations; •  Uniform Time Annotations; •  Varied Time Annotations. Memento: Time Travel for the Web Updated Technical Details (05/2011) 128
  • 129. Open Annotation: Uniform Time Annotations •  The Annotation is not always applicable, but pertains to the state of the Body and Target at a specific moment in time. •  Add oac:when property to the Annotation. Memento: Time Travel for the Web Updated Technical Details (05/2011) 129
  • 130. Memento + Open Annotation: Persistent Annotations •  In order to reconstruct the Annotation as intended: Use Memento to obtain an archived representation of B and T as they existed at the oac:when datetime. Memento: Time Travel for the Web Updated Technical Details (05/2011) 130
  • 131. Create an Annotation Memento: Time Travel for the Web 131 Updated Technical Details (05/2011)
  • 132. Reconstruct the Annotation without Memento Memento: Time Travel for the Web 132 Updated Technical Details (05/2011)
  • 133. Reconstruct the Annotation with Memento Memento: Time Travel for the Web 133 Updated Technical Details (05/2011)
  • 134. The Memento Framework: The Increasing Value of a URI Memento: Time Travel for the Web 134 Updated Technical Details (05/2011)
  • 135. URI as Access Point to a Page Memento: Time Travel for the Web Updated Technical Details (05/2011)
  • 136. URI as Access Point to Page and Data Memento: Time Travel for the Web Updated Technical Details (05/2011)
  • 137. URI as Access Point to Current and Past Pages and Data Memento: Time Travel for the Web Updated Technical Details (05/2011)
  • 138. References •  Van de Sompel, H., Nelson, M.L., Sanderson, R., Balakireva, L., Ainsworth, S., Shankar, H. (2009) Memento: Time Travel for the Web. http://arxiv.org/abs/0911.1112 •  Van de Sompel, H., Sanderson, R., Nelson, M.L., Balakireva, L., Ainsworth, S., Shankar, H. (2010) An HTTP-Based Versioning Mechanism for Linked Data. Proceedings of the 3rd Workshop on Linked Data on the Web. http://arxiv.org/abs/1003.3661 •  Sanderson, R., and Van de Sompel, H. (2010) Making Web Annotations Persistent over Time. Proceedings of the 10th ACM/IEEE-CS Joint Conference on Digital libraries. http://arxiv.org/abs/1003.2643 •  Sanderson, R., Van de Sompel, H. (2011) Open Annotation Alpha3 Data Model Guide. http://www.openannotation.org/spec/alpha3/ Memento: Time Travel for the Web 138 Updated Technical Details (05/2011)
  • 139. Memento wants to make navigating the Web’s Past Easy http://mementoweb.org/ http://groups.google.com/group/memento-dev Memento: Time Travel for the Web 139 Updated Technical Details (05/2011)