The document discusses how links on the web are prone to reference rot over time as content moves or disappears. It proposes making links more "robust" by taking snapshots of referenced resources in public web archives and decorating links with metadata about the archived version. This allows links to still work even if the original resource changes, and enables tools to access archived versions. The talk outlines how robust links can be implemented through standard HTML, APIs, and browser extensions to benefit both humans and machines in navigating the archived web.
1. Robust Linking to Web Resources
@mart1nkle1n
DtMH 2017, 11/15/2017, San Francisco, CA
Robust Linking to Web Resources
http://robustlinks.mementoweb.org/
Martin Klein
@mart1nkle1n
Research Library
Los Alamos National Laboratory
Acknowledgements:
Herbert Van de Sompel, LANL
Harihar Shankar, LANL
Michael L. Nelson, ODU
Mark Graham, Internet Archive
2. Robust Linking to Web Resources
Slide by Herbert Van de Sompel, 2017
A Managed Collection Desires Reliable Outlinks
3. Robust Linking to Web Resources
Slide by Herbert Van de Sompel, 2017
Links to another Managed Collection
4. Robust Linking to Web Resources
Slide by Herbert Van de Sompel, 2017
Links to Web at Large Resources
5. Robust Linking to Web Resources
Link Rot
6. Robust Linking to Web Resources
https://web.archive.org/web/20140101072007/http://netpreserve.org/general-assembly/2013/overview
IIPC
2013
7. Robust Linking to Web Resources
http://netpreserve.org/general-assembly/2013/overview
IIPC
today
8. Robust Linking to Web Resources
Content Drift
9. Robust Linking to Web Resources
https://web.archive.org/web/20161228184110/https://www.epa.gov/climatechange
EPA
12/2016
10. Robust Linking to Web Resources
https://www.epa.gov/sites/production/files/signpost/cc.html
EPA
today
11. Robust Linking to Web Resources
Problem
• On the web, all links are subject to reference rot
• Reference rot hinders our ability to follow links as they were intended when they were put in place
• Link rot: a link stops working altogether
• Content drift: the linked content changes over time and may eventually no longer be representative of the content that was originally linked
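The two forms of reference rot can be told apart programmatically. The following is a minimal sketch, assuming the page text from the time of linking (for example, from an archived snapshot) and the page text today are already available as strings. The function name `classify_reference` and the 0.8 similarity threshold are illustrative assumptions, not something the talk prescribes.

```python
from difflib import SequenceMatcher
from typing import Optional


def classify_reference(status_code: int,
                       text_then: str,
                       text_now: Optional[str],
                       threshold: float = 0.8) -> str:
    """Label a link as 'link rot', 'content drift', or 'intact'.

    status_code: HTTP status returned when dereferencing the link today.
    text_then:   page text at the time the link was created.
    text_now:    page text today, or None if nothing could be retrieved.
    threshold:   hypothetical similarity cut-off below which the content
                 is treated as drifted.
    """
    # Link rot: the link no longer resolves to any content.
    if status_code >= 400 or text_now is None:
        return "link rot"
    # Content drift: the link resolves, but the content changed substantially.
    similarity = SequenceMatcher(None, text_then, text_now).ratio()
    if similarity < threshold:
        return "content drift"
    return "intact"
```

A real study of content drift would likely use a more careful similarity measure; the ratio from `difflib` is only a stand-in.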
12. Robust Linking to Web Resources
Reference Rot in Scholarly Communication
http://dx.doi.org/10.1371/journal.pone.0115253
http://dx.doi.org/10.1371/journal.pone.0167475
13. Robust Linking to Web Resources
Link Rot in Scholarly Articles
14. Robust Linking to Web Resources
Link Rot in Scholarly Articles
15. Robust Linking to Web Resources
Reference Rot Over Time - arXiv
16. Robust Linking to Web Resources
Problem
• On the web, all links are subject to reference rot
• Reference rot hinders our ability to follow links as they were intended when they were put in place
• Link rot: a link stops working altogether
• Content drift: the linked content changes over time and may eventually no longer be representative of the content that was originally linked
How can we:
1. Make links more robust?
2. Make them actionable for humans and machines?
17. Robust Linking to Web Resources
Robust Links
18. Robust Linking to Web Resources
Robust Links
1. Create a snapshot of referenced resources in a public web archive
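One way to request such a snapshot is the Internet Archive's Save Page Now endpoint, which takes the target URI appended to `https://web.archive.org/save/`. The sketch below only builds that request URI and does not send it. The function name is an illustrative assumption, and other public web archives have their own submission interfaces that are not shown here.

```python
SAVE_ENDPOINT = "https://web.archive.org/save/"


def save_page_now_uri(original_uri: str) -> str:
    """Return the URI that asks the Internet Archive to capture original_uri."""
    # Only absolute http(s) URIs make sense as capture targets.
    if not original_uri.startswith(("http://", "https://")):
        raise ValueError("expected an absolute http(s) URI")
    return SAVE_ENDPOINT + original_uri
```

Requesting the resulting URI with any HTTP client would trigger the capture; the exact response behaviour should be checked against the archive's own documentation.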
19. Robust Linking to Web Resources
Why multiple archives? They aren’t magic web sites!
They’re just web sites.
If you used Mummify, you’re now left with a bunch of defunct, shortened links like:
https://mummify.it/XbmcMfE3
Slide by Michael L. Nelson, 2016
20. Robust Linking to Web Resources
Robust Links
1. Create a snapshot of referenced resources in a publicly available web archive
2. Decorate links with:
• URI of archived snapshot
• datetime of archiving
• resource’s original URI
21. Robust Linking to Web Resources
Link Decoration with Standard HTML
<a href="http://web.archive.org/web/20171108053054/http://sfgov.org/"
data-originalurl="http://sfgov.org/"
data-versiondate="2017-11-08">
City and County of San Francisco</a>
http://robustlinks.mementoweb.org/spec
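The decoration above can also be generated rather than typed by hand. This is a minimal sketch, assuming the same form as on the slide, where `href` carries the snapshot URI and the `data-` attributes carry the original URI and the archiving date. The helper name `robust_link` is hypothetical.

```python
from html import escape


def robust_link(snapshot_uri: str, original_uri: str,
                version_date: str, link_text: str) -> str:
    """Return an <a> element decorated as in the slide's example.

    href             -> URI of the archived snapshot
    data-originalurl -> the resource's original URI
    data-versiondate -> datetime of archiving (here an ISO 8601 date)
    """
    # escape(..., quote=True) keeps attribute values safe inside double quotes.
    return (
        f'<a href="{escape(snapshot_uri, quote=True)}"\n'
        f'   data-originalurl="{escape(original_uri, quote=True)}"\n'
        f'   data-versiondate="{escape(version_date, quote=True)}">\n'
        f'{escape(link_text)}</a>'
    )
```

Calling `robust_link("http://web.archive.org/web/20171108053054/http://sfgov.org/", "http://sfgov.org/", "2017-11-08", "City and County of San Francisco")` yields markup equivalent to the example on the slide.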
22. Robust Linking to Web Resources
Link Decoration via API
http://robustlinks.mementoweb.org/api/json/http://web.archive.org/web/20171108053054/http://sfgov.org/
• Submit URI of an archived snapshot
• Retrieve Robust Links HTML snippet
• Copy and paste into your application
http://robustlinks.mementoweb.org/api/json/{URI-of-archived-snapshot}
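The URI template above can be filled in with a few lines of code. The sketch below assumes the service still answers at the address shown on the slide. The slide does not describe the response format, so the fetch helper returns the body as plain text without parsing it. Both function names are illustrative.

```python
from urllib.request import urlopen

# Base of the URI template shown on the slide.
API_BASE = "http://robustlinks.mementoweb.org/api/json/"


def robust_links_api_uri(snapshot_uri: str) -> str:
    """Fill the template {API_BASE}{URI-of-archived-snapshot}."""
    if not snapshot_uri.startswith(("http://", "https://")):
        raise ValueError("expected an absolute http(s) snapshot URI")
    return API_BASE + snapshot_uri


def fetch_robust_link_response(snapshot_uri: str, timeout: float = 10.0) -> str:
    """Request the API and return the raw response body as text.

    The structure of the body is an unknown here, so it is not interpreted.
    """
    with urlopen(robust_links_api_uri(snapshot_uri), timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")
```

For the snapshot on the slide, `robust_links_api_uri` reproduces the example request URI shown above the bullet list.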
23. Robust Linking to Web Resources
Robust Links
1. Create a snapshot of referenced resources in a publicly available web archive
2. Decorate links with:
• URI of archived snapshot
• datetime of archiving
• resource’s original URI
Benefits:
• Can visit archived, immutable version of referenced resource
• Original URI & capture datetime allow finding versions in other web archives
• Uniform, machine-actionable
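To illustrate the second benefit: the original URI and capture datetime are the two inputs needed for datetime negotiation under the Memento protocol (RFC 7089), which uses an `Accept-Datetime` request header. The sketch below recovers both values from a Wayback-style snapshot URI like the one used in this talk and formats the header. The function names are illustrative, and the talk does not specify which TimeGate a client would send the header to.

```python
import re
from datetime import datetime, timezone
from email.utils import format_datetime

# Wayback-style snapshot URI: .../web/<14-digit UTC timestamp>/<original URI>
_SNAPSHOT = re.compile(r"^https?://web\.archive\.org/web/(\d{14})/(.+)$")


def split_snapshot_uri(snapshot_uri: str):
    """Return (original_uri, capture_datetime) for a Wayback-style URI."""
    match = _SNAPSHOT.match(snapshot_uri)
    if match is None:
        raise ValueError("not a Wayback-style snapshot URI")
    captured = datetime.strptime(match.group(1), "%Y%m%d%H%M%S")
    return match.group(2), captured.replace(tzinfo=timezone.utc)


def accept_datetime_header(captured: datetime) -> dict:
    """Build the RFC 7089 Accept-Datetime header for the capture time."""
    # usegmt=True renders the HTTP-date form ending in 'GMT'.
    return {"Accept-Datetime": format_datetime(captured, usegmt=True)}
```

A client would send this header, together with a request for the original URI, to a Memento TimeGate in order to be redirected to the capture closest to that datetime in whichever archive holds one.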
24. Robust Linking to Web Resources
Robust Links for Machines
1. JavaScript
2. Browser extensions
a. Memento for Chrome
b. IA Chrome Extension
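Because the decoration uses plain `data-` attributes, any HTML parser can recover it, which is what makes the links machine-actionable. The following minimal sketch uses only the Python standard library and assumes the form shown earlier in the talk, with `href` pointing at the snapshot. The class and function names are illustrative and are not taken from the Robust Links JavaScript or from either browser extension.

```python
from html.parser import HTMLParser


class RobustLinkParser(HTMLParser):
    """Collect <a> elements that carry Robust Links decoration."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        # Only anchors with both data attributes count as robust links here.
        if "data-originalurl" in attributes and "data-versiondate" in attributes:
            self.links.append({
                "snapshot": attributes.get("href"),
                "original": attributes["data-originalurl"],
                "versiondate": attributes["data-versiondate"],
            })


def extract_robust_links(html_text: str) -> list:
    """Return one dict per decorated link found in html_text."""
    parser = RobustLinkParser()
    parser.feed(html_text)
    parser.close()
    return parser.links
```

A tool built on this could offer the reader a choice between the live resource and its archived snapshot for each decorated link.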
25. Robust Linking to Web Resources
Robust Links in Action - JavaScript
http://dx.doi.org/10.1045/november2015-vandesompel
26. Robust Linking to Web Resources
Robust Links in Action - JavaScript
http://dx.doi.org/10.1045/november2015-vandesompel
27. Robust Linking to Web Resources
Robust Links in Action – Memento for Chrome
https://chrome.google.com/webstore/detail/memento-time-travel/jgbfpjledahoajcppakbgilmojkaghgm
28. Robust Linking to Web Resources
Robust Links in Action – Memento for Chrome
http://robustlinks.mementoweb.org/demo/uri_references.html
29. Robust Linking to Web Resources
Robust Links in Action – IA Chrome Extension
https://chrome.google.com/webstore/detail/wayback-machine/fpnmgdkabkmnadcjpehmlllkndpkmiak
30. Robust Linking to Web Resources
Robust Links in Action – IA Chrome Extension
31. Robust Linking to Web Resources
Robust Links in Action – IA Chrome Extension
32. Robust Linking to Web Resources
Robust Links in Action – IA Chrome Extension
33. Robust Linking to Web Resources
Take-Aways
• Links on the web are subject to reference rot
• “Robustifying” them (manually or via API calls) can help alleviate the problem
• Link decorations as proposed by Robust Links are:
  • based on HTML standards
  • machine-actionable
• Organizations such as the Internet Archive, Wikipedia, and news publishers can help with adoption