Update on Memento (IIPC 2011 Plenary)


  1. Update on Memento: Towards Seamless Navigation of the Web of the Past
     http://www.mementoweb.org/
     Herbert Van de Sompel, Robert Sanderson, Michael L. Nelson
     This research funded by the Library of Congress
     Memento Update, 2011 IIPC General Assembly, The Hague
  2. Overview of Memento Framework / Deployment Progress / Memento and Discovery / Memento and Branding / Alternative Web Archiving Strategies
  3. Overview of Memento Framework / Deployment Progress / Memento and Discovery / Memento and Branding / Alternative Web Archiving Strategies
  4. Memento wants to make it easy to navigate the Web of the Past.
  5. [Figure: Tate Online with a "Select Date" control, showing the page today next to its March 16 2008 version from the National Archives]
  6. Versions: Web vs CMS
     World Wide Web:
       •  Designed to forget about prior versions of a resource
       •  Highly distributed
       •  No standard version mechanisms
       •  Standardized interlinking mechanisms
     Content Management Systems:
       •  Designed to be aware of all versions of a resource
       •  Self-contained
       •  Variety of proprietary version mechanisms
       •  Versions interlinked using proprietary mechanisms
  7. Versions are not Integrated
     The Web Architecture has a hard time dealing with the versions that do exist:
       •  Cannot talk about a resource as it used to exist
       •  Cannot access a prior version given the current one
       •  Cannot access the current version given a prior one
  8. Memento Framework
     •  Regards the Web as a big Content Management System
     •  Introduces a uniform capability to access versions on the Web
     •  Does not build new archives but leverages all systems that host versions
  9. Memento Framework
     •  Is distributed: versions may exist on several servers
     •  Uses time as a global version indicator
     •  Is based on the primitives of the Web: resource, resource state, representation, content negotiation, link
  10. Memento Interaction Overview
  11. Original Resource and Versions
  12. Bridge from Present to Past
  13. Bridge from Past to Present
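The two bridges named on slides 12 and 13 can be sketched in Python. This is a minimal sketch, not the authors' implementation: it assumes the `Accept-Datetime` request header and the `rel="original"` and `rel="timegate"` link relations described in the Memento Internet-Draft, none of which are spelled out on the slides. The helper names and example URIs are hypothetical.

```python
from datetime import datetime, timezone
from email.utils import format_datetime
import re


def accept_datetime_header(when: datetime) -> dict:
    """Present to past: headers asking a TimeGate for the version near `when`.

    `when` must be timezone-aware; it is converted to UTC and rendered as an
    HTTP date.
    """
    utc = when.astimezone(timezone.utc)
    return {"Accept-Datetime": format_datetime(utc, usegmt=True)}


# Simplified: expects rel to be the first parameter after the URI.
LINK_RE = re.compile(r'<([^>]*)>\s*;\s*rel="([^"]*)"')


def parse_link_header(value: str) -> dict:
    """Past to present: map each link relation in a Link header to its URI."""
    links = {}
    for uri, rels in LINK_RE.findall(value):
        for rel in rels.split():
            links.setdefault(rel, uri)
    return links
```

A client would send the first helper's header to a TimeGate, and read the second helper's result from an archived copy's response to find its way back to the Original Resource.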
  14. Memento Framework
  15. Framework with Multiple Archives
  16. Overview of Memento Framework / Deployment Progress / Memento and Discovery / Memento and Branding / Alternative Web Archiving Strategies
  17. Significant progress has been made towards seamless navigation of the Web of the Past.
  18. Standardization
      •  Standardization process started via the IETF
      •  Interest from IETF and W3C
      •  Encouraged by major Web architects, including: Tim Berners-Lee, Mark Nottingham, Michael Hausenblas
      https://datatracker.ietf.org/doc/draft-vandesompel-memento/
  19. Memento Clients
      •  Several client tools developed by us and others
      •  Add-ons for Firefox (operational) and Internet Explorer (experimental)
      •  Applications for Android (operational) and iPhone/iPad (in development)
      •  Paper in current issue of Code4Lib Journal
      http://www.mementoweb.org/tools/
  20. Memento Server Support
      •  Memento-compliant Wayback software:
           •  In use by Internet Archive
           •  Available to Web archives, worldwide
           •  Please experiment with this new 1.6 version!
      http://www.mementoweb.org/tools/
  21. Memento Server Support (2)
      •  Plug-in for MediaWiki (operational)
      •  Used on W3C's main wiki
      •  Please install it for your MediaWiki!
      http://www.mementoweb.org/tools/
  22. Memento Server Validator
      •  Server-side client:
           •  Attempts to perform all Memento actions against a given URI
           •  Reports success/failure of the interactions and warnings for optional aspects
           •  Kept up to date with IETF Internet Draft
      http://www.mementoweb.org/tools/validator/
  23. Memento Proxy Support
      •  Several systems that host Mementos made Memento-compliant "by proxy":
           •  Many Web Archives that do not yet run Memento-compliant software
           •  3,000+ MediaWiki systems, including Wikipedia, Wikia
      •  We would love all of these to become natively Memento-compliant!
  24. Memento Web Site
      •  Ongoing effort to add materials that support understanding and adoption:
           •  Introduction to Memento
           •  How to recognize Mementos, TimeGates, Original Resources?
           •  Guidelines for servers that host Mementos (Web Archives, CMS, snapshot archives, etc.)
      http://www.mementoweb.org/guide/
  25. Funding
      •  2007-2010: US $250K grant from Library of Congress
           •  Approx. $50K on Memento
      •  2010-2011: US $1 Million follow-up grant from Library of Congress
           •  For: specification, outreach, tool development, further research
  26. Overview of Memento Framework / Deployment Progress / Memento and Discovery / Memento and Branding / Alternative Web Archiving Strategies
  27. Very few Web sites provide a "timegate" link. Need additional mechanisms to support Discovery.
  28. Batch Discovery: TimeMaps
      A TimeMap minimally lists:
        •  URI and datetime of Mementos known to an archive
        •  URI of Original Resource
      TimeMaps can be aggregated across systems that host Mementos.
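The minimal TimeMap contents and the aggregation across archives described on slide 28 can be illustrated as follows. This is a hedged sketch under stated assumptions: the `TimeMap` class, the `aggregate` and `closest` helpers, and the in-memory representation are all hypothetical and say nothing about the actual TimeMap serialization.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class TimeMap:
    original: str                                  # URI of the Original Resource
    mementos: list = field(default_factory=list)   # (datetime, Memento URI) pairs


def aggregate(timemaps):
    """Merge TimeMaps for one Original Resource held by several archives."""
    if not timemaps:
        raise ValueError("need at least one TimeMap")
    original = timemaps[0].original
    if any(tm.original != original for tm in timemaps):
        raise ValueError("TimeMaps describe different Original Resources")
    # A set drops entries reported twice; sorting orders them by datetime.
    merged = sorted({m for tm in timemaps for m in tm.mementos})
    return TimeMap(original, merged)


def closest(timemap, when):
    """Return the (datetime, URI) pair nearest in time to `when`."""
    return min(timemap.mementos, key=lambda m: abs(m[0] - when))
```

Keeping the datetime first in each pair lets plain tuple sorting produce a chronological list without a custom key.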
  29. Batch Discovery: Feed of TimeMaps
      A system that hosts Mementos exposes a Feed of TimeMaps to allow applications to remain in sync with its collection:
        •  One Atom entry per Original Resource
        •  The entry links to or includes a TimeMap
        •  The entry's updated value changes when additional Mementos become available
        •  The ID of the entry is a tag URI based on the URI of the Original Resource
        •  Can be protected, and include license information
        •  Could be anonymized by an aggregating service
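One Atom entry of such a feed might be built like this. This is only a sketch: the slide does not specify the tag URI layout, the link relation, or the entry title, so the `tag:authority,date:original-URI` form, `rel="related"`, the title text, and the `timemap_entry` helper with its default arguments are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"


def timemap_entry(original_uri, timemap_uri, updated,
                  authority="archive.example", minted="2011"):
    """Build one Atom entry describing the TimeMap of one Original Resource."""
    entry = ET.Element(f"{{{ATOM}}}entry")
    # Assumed ID layout: a tag URI whose specific part is the Original's URI.
    ET.SubElement(entry, f"{{{ATOM}}}id").text = (
        f"tag:{authority},{minted}:{original_uri}"
    )
    ET.SubElement(entry, f"{{{ATOM}}}title").text = f"TimeMap for {original_uri}"
    # Changes whenever additional Mementos become available.
    ET.SubElement(entry, f"{{{ATOM}}}updated").text = updated
    # Assumed relation type; the entry could also embed the TimeMap instead.
    ET.SubElement(entry, f"{{{ATOM}}}link", {"rel": "related", "href": timemap_uri})
    return entry
```

Calling `ET.tostring(entry, encoding="unicode")` on the result gives the XML to place inside the feed.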
  30. Batch Discovery: robots.txt
      •  The robots.txt file is used by Web servers to convey crawling policies
      •  Add directives to support discovery of TimeGates and Feeds of TimeMaps

      TimeGate: http://dutch.archive.org/timegate/ Archived: .nl
      TimeGate: http://all.archive.org/timegate/ Archived: *
      TimeMapFeed: http://dutch.archive.org/feed/feed1.xml
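A client could read these proposed directives with a few lines of Python. This is a minimal sketch, assuming the directives appear exactly as on the slide, with `Archived:` on the same line as the `TimeGate:` it qualifies. The directives are a proposal from this talk, not part of any robots.txt standard, and `parse_memento_directives` is a hypothetical helper.

```python
import re

# Matches each "Name: value" pair for the three proposed directive names.
DIRECTIVE_RE = re.compile(r"(TimeGate|TimeMapFeed|Archived):\s*(\S+)")


def parse_memento_directives(robots_txt):
    """Return ([(timegate_uri, archived_pattern)], [feed_uri]) from robots.txt text."""
    timegates, feeds = [], []
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0]          # drop robots.txt comments
        fields = dict(DIRECTIVE_RE.findall(line))
        if "TimeGate" in fields:
            # The pattern is None when no Archived: qualifier is given.
            timegates.append((fields["TimeGate"], fields.get("Archived")))
        if "TimeMapFeed" in fields:
            feeds.append(fields["TimeMapFeed"])
    return timegates, feeds
```

Ordinary lines such as `User-agent: *` or `Disallow: /private/` do not match any of the three names and are ignored.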
  31. Overview of Memento Framework / Deployment Progress / Memento and Discovery / Memento and Branding / Alternative Web Archiving Strategies
  32. Memento can recreate pages using resources from different archives. This poses a branding challenge.
  33. Current Branding Practice for Web Archives
      Page and embedded resources come from the same Web Archive.
      [Figure: one branding for the page and embedded resources from a single archive]
  34. Branding for Web Archives in Memento Mode
      Page and embedded resources come from various Web Archives.
      [Figure: the HTML's branding is shown; the embedded resources carry no branding]
      Will be researched.
  35. Overview of Memento Framework / Deployment Progress / Memento and Discovery / Memento and Branding / Alternative Web Archiving Strategies
  36. Crawl-based Archives host distinct observations. Transactional Archives never miss an update.
  37. Crawl-Based Web Archives
      Distinct observations are archived for many servers.
  38. Server-Side Transactional Web Archives
      The entire change history is archived for a single server.
  39. Development of Transactional Web Archive Software
      Capture:
        •  Apache connection filter module captures URI, headers, body
        •  POSTs in real time to the transactional archive
      Access:
        •  Online, real-time access via Memento TimeGates
        •  Batch export via WARC files for long-term preservation
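The capture and access halves of a transactional archive can be modelled in memory to show why it never misses an update. This is an illustrative sketch only, not the Apache module or the archive software from the slide: the `TransactionalArchive` class is hypothetical, it assumes responses are recorded in chronological order, and skipping a response whose body is unchanged is an assumed optimization.

```python
import bisect
from datetime import datetime


class TransactionalArchive:
    """Toy store of every distinct response a single server has sent."""

    def __init__(self):
        self._times = {}    # uri -> sorted list of capture datetimes
        self._states = {}   # uri -> list of (headers, body), parallel to _times

    def record(self, uri, when, headers, body):
        """Capture side: store a served response; return True if it was new."""
        times = self._times.setdefault(uri, [])
        states = self._states.setdefault(uri, [])
        if states and states[-1][1] == body:
            return False                      # assumed: unchanged body is skipped
        times.append(when)
        states.append((headers, body))
        return True

    def at(self, uri, when):
        """Access side: the state that was being served at `when`, or None."""
        times = self._times.get(uri, [])
        i = bisect.bisect_right(times, when)  # captures made at or before `when`
        return self._states[uri][i - 1] if i else None
```

A TimeGate in front of such a store would answer a datetime request with the result of `at`, whereas a crawl-based archive can only return the nearest observation it happened to make.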
  40. Update on Memento: Towards Seamless Navigation of the Web of the Past
      http://mementoweb.org/
      Herbert Van de Sompel, Robert Sanderson, Michael L. Nelson
