Who Will Archive the Archives?
Thoughts About the Future of Web Archiving
Michael L. Nelson
Old Dominion University
with:
Old Dominion University: Scott G. Ainsworth, Ahmed AlSum, Justin F. Brunelle,
Mat Kelly, Hany SalahEldeen, Michele C. Weigle
Los Alamos National Laboratory: Robert Sanderson, Herbert Van de Sompel
Web Archiving: Big Data?
Two Common Misconceptions
About Web Archiving
• Prior = old = obsolete = stale = bad
– who cares, not an interesting problem
• The Internet Archive has every copy of everything that has ever
existed
– who cares, problem solved
Why Care About The Past?
From an anonymous WWW 2010 reviewer about our
Memento paper (emphasis mine):
"Is there any statistics to show that many or a good number of Web
users would like to get obsolete data or resources? "
one answer: replay of contemporary pages >> summary pages
http://www.slideshare.net/phonedude/why-careaboutthepast
http://www.nytimes.com/2013/06/19/books/seven-american-deaths-and-disasters-transcribes-the-news.html
vs.
Archiving Moves At Hurricane Speed,
Most News Stories Move Faster
Most of the Story,
at Least as Conveyed by cnn.com,
is Missing…
in this case, you can reconstruct the events with
http://en.wikipedia.org/wiki/Virginia_Tech_massacre_timeline
How Much of The Web Is Archived?
Public Archives, ca. Late 2010 / Early 2011
Three categories of archives
• Internet Archive
• Search engine
• Other archives
UK US
See also: http://arxiv.org/abs/1212.6177
1000 URIs Ordered by First Observation Date
See also: http://ws-dl.blogspot.com/2011/06/2011-06-23-how-much-of-web-is-archived.html
see also: http://ws-dl.blogspot.com/2013/04/2013-04-19-carbon-dating-web.html
How Much of the Web is Archived?
It Depends on Which Web…
Including SE cache | Excluding SE cache
90% | 79%
97% | 68%
35% | 16%
88% | 19%
Changes since 2011: no more free SE APIs;
greatly reduced IA quarantine period; 15 public web archives
2013
95%
92%
23%
26%
Long Tail of Archives
Archive.is
see also: http://www.cs.odu.edu/~mln/pubs/tpdl-2013/paper_134.pdf
Memento: A Multi-Archive Method
for Linking the Current & Past Web
see: http://mementoweb.org/
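As a minimal sketch of how a client might ask for a past version of a page: the Memento protocol negotiates on time with an Accept-Datetime request header, expressed in GMT in the usual HTTP date format. The helper name below is hypothetical, and no particular archive or endpoint is assumed; the example datetime is the August 27, 2005 11:16 a.m. EDT target used in the temporal drift slides.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def accept_datetime_header(when: datetime) -> dict:
    """Build request headers asking for the copy closest to `when`.

    `when` must be timezone-aware; it is converted to GMT because
    Accept-Datetime values are expressed in GMT.
    """
    when_gmt = when.astimezone(timezone.utc)
    return {"Accept-Datetime": format_datetime(when_gmt, usegmt=True)}


# Illustrative fixed offset for EDT (UTC-4); an assumption for this sketch.
EDT = timezone(timedelta(hours=-4))

headers = accept_datetime_header(datetime(2005, 8, 27, 11, 16, tzinfo=EDT))
print(headers["Accept-Datetime"])
```

These headers would be sent along with an ordinary HTTP GET for the original URI to a Memento-aware server.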
So It's Been Archived,
What Can Go Wrong?
Temporal Drift
August 27, 2005
11:16 a.m. EDT
link
Temporal Drift: Now 3 Hours in the Past
August 27, 2005
11:16 a.m. EDT
link
August 27, 2005
8:00 a.m. EDT
link
Temporal Drift: Now 17 Days in the Future
August 27, 2005
11:16 a.m. EDT
link
August 27, 2005
8:00 a.m. EDT
link
September 13, 2005
8:12 a.m. EDT
link
Temporal Drift: Now 23 (or 6) Days in the Future
August 27, 2005
11:16 a.m. EDT
link
August 27, 2005
8:00 a.m. EDT
link
September 13, 2005
8:12 a.m. EDT
link
September 19, 2005
8:25 a.m. EDT
link
10+ clicks in the archive result in a median drift of ~45 days (standard UI)
or ~15 days with Memento. ~2% of sessions have a drift of > 1 year.
see: http://www.cs.odu.edu/~mln/pubs/jcdl-2013/jcdl93-ainsworth.pdf
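The drift in the walk-through above can be checked with a few lines of arithmetic. This is only a sketch: it takes drift to be the signed difference between the datetime the user asked for and the datetime of the page actually served, which is an assumption about the definition rather than the paper's exact measure. All timestamps come from the slides and are treated as naive EDT times.

```python
from datetime import datetime

# Datetime the user originally asked for (from the slides, EDT).
target = datetime(2005, 8, 27, 11, 16)

# Datetimes of the pages actually served on successive clicks.
served = [
    datetime(2005, 8, 27, 8, 0),
    datetime(2005, 9, 13, 8, 12),
    datetime(2005, 9, 19, 8, 25),
]


def drift_days(target: datetime, served_at: datetime) -> float:
    """Signed drift in days; negative means the page predates the target."""
    return (served_at - target).total_seconds() / 86400


drifts = [drift_days(target, s) for s in served]
for s, d in zip(served, drifts):
    print(f"{s:%Y-%m-%d %H:%M}  drift = {d:+.1f} days")

# Drift of the last click relative to the previous page (the "or 6" on the slide).
step_days = (served[2] - served[1]).total_seconds() / 86400
```

Rounded to whole days, the last two clicks land about 17 and 23 days after the target, and the final click is about 6 days after the page before it.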
We Call the Drift in a Single Page
"Temporal Spread"
2005-05-14
01:36:08
2005-05-14
01:36:08
+9 days
+18 days +18 days
+7 months
+2.1 years
Using current policies, only ~76% of pages are
complete, with a mean temporal spread of ~1 year,
and with ~5% of pages having a temporal violation.
(submitted for publication)
Sometimes the Live Web
"Leaks" Into the Archive…
see: http://ws-dl.blogspot.com/2012/10/2012-10-10-zombies-in-archives.html
Sept 3, 2008
2012
Quis Archiviet Ipsos Archives?
(thanks to webmaster@archive.is for this example)
% curl -I http://lenta.ru/articles/2013/04/02/mat/
HTTP/1.1 302 Found
Server: nginx
Date: Tue, 03 Sep 2013 00:15:14 GMT
Content-Type: text/html; charset=utf-8
Connection: keep-alive
Status: 302 Found
Location: http://lenta.ru/f_words/
X-UA-Compatible: IE=Edge,chrome=1
Cache-Control: no-cache
X-Request-Id: bd7caae039d6312c0542cb4ad62f3847
X-Runtime: 0.005474
X-Rack-Cache: miss
current page for: http://lenta.ru/articles/2013/04/02/mat/
archive.org version of: http://lenta.ru/articles/2013/04/02/mat/
peep.us archived version of archive.org version
archive.is archived version of peep.us version of archive.org version
Why Make Lots of Copies?
Archives Are Subject to the Same
Vagaries of Other Web Sites…
In a perfect world, this graph should be monotonically increasing.
Memento allows simultaneous access to more archives, but this
also means that at any given time, some archive(s) will be down.
ODU OS upgrade
IA API changes
ODU power outage
see: http://arxiv.org/abs/1307.5685
reminder: $0.99^{100} \approx 0.37$ and $0.999^{100} \approx 0.90$
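The reminder can be reproduced directly. The sketch below assumes each archive is reachable independently with the same probability, which is a simplification for illustration; the function name is hypothetical.

```python
def all_up_probability(per_archive_uptime: float, n_archives: int) -> float:
    """Probability that n independent archives are all reachable at once."""
    return per_archive_uptime ** n_archives


for uptime in (0.99, 0.999):
    p = all_up_probability(uptime, 100)
    print(f"uptime {uptime}: all 100 archives up with probability {p:.2f}")
```

Even with 99% uptime per archive, a client that needs all 100 archives at once will usually find at least one of them down.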
Query Routing: Using Only Top-k Archives
for URI Lookup Yields Good Results
Even when there are 100s of archives, we only need to talk to a few.
see: http://www.cs.odu.edu/~mln/pubs/tpdl-2013/paper_134.pdf
What is the Economic Model for Archives?
1TB endowment = ~$4700: http://blog.dshr.org/2011/02/paying-for-long-term-storage.html
see also: http://blog.dshr.org/2011/01/memento-marketplace-for-archiving.html
Houston, Tranquility Base Here. The Eagle has landed.
see also: http://ws-dl.blogspot.com/2013/03/2013-03-22-ntrs-web-archives-and-why-we.html
Summary
• We have a cultural mandate to preserve "obsolete data or
resources"
– however, we currently have limited discovery and replay tools
• We need lots of people making several copies of many things
– Memento is the mechanism for accessing the long tail of archives

 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
Laura Byrne
 
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance
 
Microsoft - Power Platform_G.Aspiotis.pdf
Microsoft - Power Platform_G.Aspiotis.pdfMicrosoft - Power Platform_G.Aspiotis.pdf
Microsoft - Power Platform_G.Aspiotis.pdf
Uni Systems S.M.S.A.
 
Elizabeth Buie - Older adults: Are we really designing for our future selves?
Elizabeth Buie - Older adults: Are we really designing for our future selves?Elizabeth Buie - Older adults: Are we really designing for our future selves?
Elizabeth Buie - Older adults: Are we really designing for our future selves?
Nexer Digital
 
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfObservability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Paige Cruz
 
A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...
sonjaschweigert1
 

Recently uploaded (20)

Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
 
Climate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing DaysClimate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing Days
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
 
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
 
20240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 202420240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 2024
 
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdfFIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdf
 
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdfSmart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
 
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
 
The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
 
Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
 
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
 
Video Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the FutureVideo Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the Future
 
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
 
Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
 
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
 
Microsoft - Power Platform_G.Aspiotis.pdf
Microsoft - Power Platform_G.Aspiotis.pdfMicrosoft - Power Platform_G.Aspiotis.pdf
Microsoft - Power Platform_G.Aspiotis.pdf
 
Elizabeth Buie - Older adults: Are we really designing for our future selves?
Elizabeth Buie - Older adults: Are we really designing for our future selves?Elizabeth Buie - Older adults: Are we really designing for our future selves?
Elizabeth Buie - Older adults: Are we really designing for our future selves?
 
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfObservability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
 
A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...
 

Who Will Archive the Archives? Thoughts About the Future of Web Archiving

  • 1. Who Will Archive the Archives? Thoughts About the Future of Web Archiving Michael L. Nelson Old Dominion University with: Old Dominion University: Scott G. Ainsworth, Ahmed AlSum, Justin F. Brunelle, Mat Kelly, Hany SalahEldeen, Michele C. Weigle Los Alamos National Laboratory: Robert Sanderson, Herbert Van de Sompel
  • 3. Two Common Misconceptions About Web Archiving
      • Prior = old = obsolete = stale = bad
          – who cares, not an interesting problem
      • The Internet Archive has every copy of everything that has ever existed
          – who cares, problem solved
  • 4. Why Care About The Past? From an anonymous WWW 2010 reviewer about our Memento paper (emphasis mine): "Is there any statistics to show that many or a good number of Web users would like to get obsolete data or resources? " one answer: replay of contemporary pages >> summary pages http://www.slideshare.net/phonedude/why-careaboutthepast http://www.nytimes.com/2013/06/19/books/seven-american-deaths-and-disasters-transcribes-the-news.html
  • 6. vs.
  • 19. Archiving Moves At Hurricane Speed, Most News Stories Move Faster
  • 23. Most of the Story, at Least as Conveyed by cnn.com, is Missing… in this case, you can reconstruct the events with http://en.wikipedia.org/wiki/Virginia_Tech_massacre_timeline
  • 24. How Much of The Web Is Archived?
  • 25. Public Archives, ca. Late 2010 / Early 2011 Three categories of archives • Internet Archive • Search engine • Other archives UK US See also: http://arxiv.org/abs/1212.6177
  • 26. 1000 URIs Ordered by First Observation Date See also: http://ws-dl.blogspot.com/2011/06/2011-06-23-how-much-of-web-is-archived.html
  • 28. How Much of the Web is Archived? It Depends on Which Web… Including SE cache / excluding SE cache: 90% / 79%; 97% / 68%; 35% / 16%; 88% / 19%. 2013: 95%; 92%; 23%; 26%. Changes since 2011: no more free SE APIs; greatly reduced IA quarantine period; 15 public web archives
  • 29. Long Tail of Archives Archive.is see also: http://www.cs.odu.edu/~mln/pubs/tpdl-2013/paper_134.pdf
  • 30. Memento: A Multi-Archive Method for Linking the Current & Past Web see: http://mementoweb.org/
  • 31. So It's Been Archived, What Can Go Wrong?
  • 32. Temporal Drift August 27, 2005 11:16 a.m. EDT link
  • 33. Temporal Drift: Now 3 Hours in the Past August 27, 2005 11:16 a.m. EDT link August 27, 2005 8:00 a.m. EDT link
  • 34. Temporal Drift: Now 17 Days in the Future August 27, 2005 11:16 a.m. EDT link August 27, 2005 8:00 a.m. EDT link September 13, 2005 8:12 a.m. EDT link
  • 35. Temporal Drift: Now 23 (or 6) Days in the Future August 27, 2005 11:16 a.m. EDT link August 27, 2005 8:00 a.m. EDT link September 13, 2005 8:12 a.m. EDT link September 19, 2005 8:25 a.m. EDT link 10+ clicks in the archive results in median drift of ~45 days (standard UI) or ~15 days with Memento. ~2% of the sessions have drift of > 1 year. see: http://www.cs.odu.edu/~mln/pubs/jcdl-2013/jcdl93-ainsworth.pdf
  • 36. We Call the Drift in a Single Page "Temporal Spread"
  • 38. Page dated 2005-05-14 01:36:08, with embedded resources captured +9 days, +18 days, +18 days, +7 months, and +2.1 years later. Using current policies, only ~76% of pages are complete, with a mean temporal spread of ~1 year, and ~5% of pages have a temporal violation. (submitted for publication)
  • 39. Sometimes the Live Web "Leaks" Into the Archive…
  • 41. Quis Archiviet Ipsos Archives? (thanks to webmaster@archive.is for this example)
  • 42. % curl -I http://lenta.ru/articles/2013/04/02/mat/
        HTTP/1.1 302 Found
        Server: nginx
        Date: Tue, 03 Sep 2013 00:15:14 GMT
        Content-Type: text/html; charset=utf-8
        Connection: keep-alive
        Status: 302 Found
        Location: http://lenta.ru/f_words/
        X-UA-Compatible: IE=Edge,chrome=1
        Cache-Control: no-cache
        X-Request-Id: bd7caae039d6312c0542cb4ad62f3847
        X-Runtime: 0.005474
        X-Rack-Cache: miss
      current page for: http://lenta.ru/articles/2013/04/02/mat/
  • 43. archive.org version of: http://lenta.ru/articles/2013/04/02/mat/
  • 44. peep.us archived version of archive.org version
  • 45. archive.is archived version of peep.us version of archive.org version
  • 46. Why Make Lots of Copies?
  • 47. Archives Are Subject to the Same Vagaries of Other Web Sites… In a perfect world, this graph should be monotonically increasing. Memento allows simultaneous access to more archives, but this also means that at any given time, some archive(s) will be down. Graph annotations: ODU OS upgrade; IA API changes; ODU power outage. see: http://arxiv.org/abs/1307.5685 reminder: 0.99^100 = 0.37, 0.999^100 = 0.90
  • 48. Query Routing: Using Only Top-k Archives for URI Lookup Yields Good Results Even when there are 100s of archives, we only need to talk to a few. see: http://www.cs.odu.edu/~mln/pubs/tpdl-2013/paper_134.pdf
  • 49. What is the Economic Model for Archives? 1TB endowment = ~$4700: http://blog.dshr.org/2011/02/paying-for-long-term-storage.html see also: http://blog.dshr.org/2011/01/memento-marketplace-for-archiving.html
  • 50. Houston, Tranquility Base Here. The Eagle has landed. see also: http://ws-dl.blogspot.com/2013/03/2013-03-22-ntrs-web-archives-and-why-we.html
  • 51. Summary
      • We have a cultural mandate to preserve "obsolete data or resources"
          – however, we currently have limited discovery and replay tools
      • We need lots of people making several copies of many things
          – Memento is the mechanism for accessing the long tail of archives
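
The reminder on slide 47 (0.99^100 = 0.37, 0.999^100 = 0.90) can be checked with a few lines of Python. This is a minimal sketch, assuming every archive has the same uptime and that outages are independent; the function name `all_up_probability` is hypothetical and not taken from the slides.

```python
def all_up_probability(uptime: float, n_archives: int) -> float:
    """Probability that all n archives are reachable at the same moment.

    Assumes (for illustration only) that each archive has the same uptime
    and that archives fail independently of one another.
    """
    if not 0.0 <= uptime <= 1.0:
        raise ValueError("uptime must be between 0 and 1")
    if n_archives < 0:
        raise ValueError("n_archives must be non-negative")
    return uptime ** n_archives


if __name__ == "__main__":
    # The two cases quoted on slide 47: 100 archives at 99% and 99.9% uptime.
    for uptime in (0.99, 0.999):
        print(f"{uptime}: {all_up_probability(uptime, 100):.2f}")
```

Under these assumptions, querying 100 archives that are each up 99% of the time finds all of them up only about a third of the time, which is one way to read the argument for routing queries to only a few archives (slide 48).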

Editor's Notes

  1. Let's return to temporal spread. Most web pages are composed from multiple resources, some of which are circled here. (WAIT FOR ANIMATION)
  2. Let's return to temporal spread. Even though the displayed date is May 14, 2005, (CLICK) the resources were captured at very different times: (CLICK) some days, (CLICK) some months, (CLICK) even years apart (in this case an image in the footer).
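
The spread described in these notes can be illustrated with a short Python sketch. It assumes temporal spread means the interval between the earliest and latest capture times among the root page and its embedded resources; that reading, the function name `temporal_spread`, and the day counts used for "+7 months" and "+2.1 years" are assumptions made for illustration, not values from the slides.

```python
from datetime import datetime, timedelta


def temporal_spread(capture_times):
    """Return the latest capture time minus the earliest one.

    capture_times holds the capture datetime of the root page and of each
    embedded resource (images, stylesheets, and so on).
    """
    return max(capture_times) - min(capture_times)


# Root page datetime as shown on slide 38.
root = datetime(2005, 5, 14, 1, 36, 8)

# Offsets from slide 38. The slide gives "+7 months" and "+2.1 years";
# the day counts 210 and 767 are rough conversions (assumed).
offsets_days = [9, 18, 18, 210, 767]
captures = [root] + [root + timedelta(days=d) for d in offsets_days]

if __name__ == "__main__":
    print(temporal_spread(captures).days)
```

A page whose resources were all captured at the same instant would have a spread of zero; here the footer image alone stretches the spread to roughly two years.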