SlideShare a Scribd company logo
1 of 16
Download to read offline
Client-side Reconstruction
of Composite Mementos
Using ServiceWorker
Sawood Alam, Mat Kelly, Michele C. Weigle, and Michael L. Nelson
Web Science and Digital Libraries Research Group
Old Dominion University, Norfolk, VA, 23529
@ibnesayeed
@WebSciDL
Supported in part by NSF III 1526700
1
JCDL 2017, June 19-23, 2017, Toronto, Ontario, Canada
Sawood Alam <@ibnesayeed>
2008 Memento Seen in 2017
2
● https://ws-dl.blogspot.com/2015/12/2015-12-08-evaluating-temporal.html
?
Sawood Alam <@ibnesayeed>
2008 Memento Seen in 2012
3
● http://ws-dl.blogspot.com/2012/10/2012-10-10-zombies-in-archives.html
Sawood Alam <@ibnesayeed>
XenLand @ Alpha Centauri
4
Sawood Alam <@ibnesayeed>
Zombies in Archive
5
?
Sawood Alam <@ibnesayeed>
Zombies in Archive
6
<img src="http://xenland.alpha/images/map.png">
// Is rewritten on replay to become:
<img src="http://archive.example.org/1998/http://xenland.alpha/images/map.png">
// URLs constructed by JavaScript are harder to rewrite on replay, e.g.:
var base = 'http://xenland.alpha';
var imgdir = '/images/';
var img = document.createElement('img');
img.src = base + imgdir + 'ruler.png';
document.getElementById('ruler').appendChild(img);
//=>> http://xenland.alpha/images/ruler.png
Sawood Alam <@ibnesayeed>
Replay URL Resolution & Rewriting
7
Reference type Example Resolution after relocation
Relative path images/logo.png Potentially correct
Absolute path /public/images/logo.png Potentially incorrect
Absolute URL http://example.com/public/images/logo.png Potentially live leakage
http://example.com/public/index.html
...
<img src="/public/images/logo.png">
...
http://archive.example.org/<datetime>/http://example.com/public/index.html
...
<img src="/<datetime>/http://example.com/public/images/logo.png">
...
Sawood Alam <@ibnesayeed>
Avoiding Zombies
● Ahead-of-time rendering and JS execution
○ http://archive.is/
● Archival replay proxy
○ https://github.com/ikreymer/pywb/wiki/Pywb-Proxy-Mode-Usage
● Browser extension
○ MementoFox (deprecated)
● JS override
○ wombat.js in PyWB
● ServiceWorker
8
Sawood Alam <@ibnesayeed>
● New web API (still a working draft)
● A standalone JavaScript file
● Persists in the browser independent of the window
● Acts as a proxy
● Installed by a web page under its domain at a specific path (called scope)
● Intercepts all requests in scope
○ Resources under the scope path (at any depth)
○ Secondary resource requests originated from any resource under scope
● Allows modification in request and response
● Primarily used in web applications for offline access and notification support
● Requires HTTPS
● Growing browser support (73.61% as of June 8, 2017)
ServiceWorker
9
● http://caniuse.com/#feat=serviceworkers
Sawood Alam <@ibnesayeed>
reconstructive.js
10
● https://github.com/oduwsdl/reconstructive
● A ServiceWorker script written for archival replay
● Plug-in for web archives or Memento aggregators
● Intercepts all network requests originated from a memento
● Reroutes requests to an archive (prevents live leakage & incorrect references)
● Optionally rewrites the content to add banner & to fix hyperlinks
Sawood Alam <@ibnesayeed>
Zombies, No More!
11
● https://github.com/oduwsdl/ipwb
Sawood Alam <@ibnesayeed>
Rewriting Mementos is Expensive
12
Original capture (without any rewriting)
In our experiment over 500 home pages we observed:
● One-fifth mean data overhead
● One-third mean time overhead
15% more data in twice the time
Sawood Alam <@ibnesayeed>
Archival Capture Replay Test Suite (ACRTS)
13
reconstructive.js
● https://ibnesayeed.github.io/acrts/
Sawood Alam <@ibnesayeed>
Reconstruction Winners: PyWB & reconstructive.js
A. OpenWayback
B. PyWB
C. Memento
Reconstruct
D. Memento for
Chrome
E. reconstructive.js
14
Sawood Alam <@ibnesayeed>
Future Work
● Use “Prefer” header for original content (when archives support it)
● Add a customizable archival banner
● Add click handler for lazy rewriting of hyperlinks
● Handle archived ServiceWorkers
● Write a 404-combat ServiceWorker script for webmasters
15
● http://ws-dl.blogspot.co.uk/2016/08/2016-08-15-mementos-in-raw-take-two.html
Sawood Alam <@ibnesayeed>
● reconstructive.js => no zombies!
● Rerouting instead of rewriting (lazy rewriting)
● Mean overhead reduction
○ one-fifth data
○ one-third time
● 73.61% (and growing) browser support for ServiceWorker
○ http://caniuse.com/#feat=serviceworkers
● reconstructive.js
○ https://github.com/oduwsdl/reconstructive
● Archival Capture Replay Test Suite
○ https://ibnesayeed.github.io/acrts/
Conclusions
16
● In-depth recap: WADL 2017 Thursday, June 22, 3:45pm (https://fox.cs.vt.edu/wadl2017.html)

More Related Content

Similar to Client-side reconstruction of mementos using ServiceWorker

What is Nginx and Why You Should to Use it with Wordpress Hosting
What is Nginx and Why You Should to Use it with Wordpress HostingWhat is Nginx and Why You Should to Use it with Wordpress Hosting
What is Nginx and Why You Should to Use it with Wordpress HostingWPSFO Meetup Group
 
Storytelling for Summarizing Collections in Web Archives
Storytelling for Summarizing Collections in Web ArchivesStorytelling for Summarizing Collections in Web Archives
Storytelling for Summarizing Collections in Web ArchivesMichael Nelson
 
Everything You Know is Not Quite Right Anymore: Rethinking Best Web Practices...
Everything You Know is Not Quite Right Anymore: Rethinking Best Web Practices...Everything You Know is Not Quite Right Anymore: Rethinking Best Web Practices...
Everything You Know is Not Quite Right Anymore: Rethinking Best Web Practices...Doug Gapinski
 
Everything You Know is Not Quite Right Anymore: Rethinking Best Practices to ...
Everything You Know is Not Quite Right Anymore: Rethinking Best Practices to ...Everything You Know is Not Quite Right Anymore: Rethinking Best Practices to ...
Everything You Know is Not Quite Right Anymore: Rethinking Best Practices to ...Dave Olsen
 
Performance and UX
Performance and UXPerformance and UX
Performance and UXPeter Rozek
 
Notes on SF W3Conf
Notes on SF W3ConfNotes on SF W3Conf
Notes on SF W3ConfEdy Dawson
 
WordPress performance tuning
WordPress performance tuningWordPress performance tuning
WordPress performance tuningVladimír Smitka
 
Blockchain Can Not Be Used To Verify Replayed Archived Web Pages
Blockchain Can Not Be Used To Verify Replayed Archived Web PagesBlockchain Can Not Be Used To Verify Replayed Archived Web Pages
Blockchain Can Not Be Used To Verify Replayed Archived Web PagesMichael Nelson
 
Cache Sketches: Using Bloom Filters and Web Caching Against Slow Load Times
Cache Sketches: Using Bloom Filters and Web Caching Against Slow Load TimesCache Sketches: Using Bloom Filters and Web Caching Against Slow Load Times
Cache Sketches: Using Bloom Filters and Web Caching Against Slow Load TimesFelix Gessert
 
Decoupled (Headless) Drupal
Decoupled (Headless) DrupalDecoupled (Headless) Drupal
Decoupled (Headless) DrupalDaniel Stout
 
10 things you can do to speed up your web app today stir trek edition
10 things you can do to speed up your web app today   stir trek edition10 things you can do to speed up your web app today   stir trek edition
10 things you can do to speed up your web app today stir trek editionChris Love
 
10 Things You Can Do to Speed Up Your Web App Today
10 Things You Can Do to Speed Up Your Web App Today10 Things You Can Do to Speed Up Your Web App Today
10 Things You Can Do to Speed Up Your Web App TodayChris Love
 
Total Browser Pwnag3 V1.0 Public
Total Browser Pwnag3   V1.0 PublicTotal Browser Pwnag3   V1.0 Public
Total Browser Pwnag3 V1.0 PublicRafal Los
 
Producing a mobile presence. Timeline: Yesterday...
Producing a mobile presence. Timeline: Yesterday...Producing a mobile presence. Timeline: Yesterday...
Producing a mobile presence. Timeline: Yesterday...Nick DeNardis
 
Delivering High Performance Websites with NGINX
Delivering High Performance Websites with NGINXDelivering High Performance Websites with NGINX
Delivering High Performance Websites with NGINXNGINX, Inc.
 
Browser-Based Digital Preservation
Browser-Based Digital PreservationBrowser-Based Digital Preservation
Browser-Based Digital PreservationMat Kelly
 
Drupal performance and scalability
Drupal performance and scalabilityDrupal performance and scalability
Drupal performance and scalabilityTwinbit
 
Loading Performance for Designers – WUDKRK 2017
Loading Performance for Designers – WUDKRK 2017Loading Performance for Designers – WUDKRK 2017
Loading Performance for Designers – WUDKRK 2017Łukasz Przywarty
 

Similar to Client-side reconstruction of mementos using ServiceWorker (20)

What is Nginx and Why You Should to Use it with Wordpress Hosting
What is Nginx and Why You Should to Use it with Wordpress HostingWhat is Nginx and Why You Should to Use it with Wordpress Hosting
What is Nginx and Why You Should to Use it with Wordpress Hosting
 
Storytelling for Summarizing Collections in Web Archives
Storytelling for Summarizing Collections in Web ArchivesStorytelling for Summarizing Collections in Web Archives
Storytelling for Summarizing Collections in Web Archives
 
Everything You Know is Not Quite Right Anymore: Rethinking Best Web Practices...
Everything You Know is Not Quite Right Anymore: Rethinking Best Web Practices...Everything You Know is Not Quite Right Anymore: Rethinking Best Web Practices...
Everything You Know is Not Quite Right Anymore: Rethinking Best Web Practices...
 
Everything You Know is Not Quite Right Anymore: Rethinking Best Practices to ...
Everything You Know is Not Quite Right Anymore: Rethinking Best Practices to ...Everything You Know is Not Quite Right Anymore: Rethinking Best Practices to ...
Everything You Know is Not Quite Right Anymore: Rethinking Best Practices to ...
 
Performance and UX
Performance and UXPerformance and UX
Performance and UX
 
Notes on SF W3Conf
Notes on SF W3ConfNotes on SF W3Conf
Notes on SF W3Conf
 
WordPress performance tuning
WordPress performance tuningWordPress performance tuning
WordPress performance tuning
 
Blockchain Can Not Be Used To Verify Replayed Archived Web Pages
Blockchain Can Not Be Used To Verify Replayed Archived Web PagesBlockchain Can Not Be Used To Verify Replayed Archived Web Pages
Blockchain Can Not Be Used To Verify Replayed Archived Web Pages
 
Web Performance Optimization
Web Performance OptimizationWeb Performance Optimization
Web Performance Optimization
 
Cache Sketches: Using Bloom Filters and Web Caching Against Slow Load Times
Cache Sketches: Using Bloom Filters and Web Caching Against Slow Load TimesCache Sketches: Using Bloom Filters and Web Caching Against Slow Load Times
Cache Sketches: Using Bloom Filters and Web Caching Against Slow Load Times
 
Decoupled (Headless) Drupal
Decoupled (Headless) DrupalDecoupled (Headless) Drupal
Decoupled (Headless) Drupal
 
10 things you can do to speed up your web app today stir trek edition
10 things you can do to speed up your web app today   stir trek edition10 things you can do to speed up your web app today   stir trek edition
10 things you can do to speed up your web app today stir trek edition
 
10 Things You Can Do to Speed Up Your Web App Today
10 Things You Can Do to Speed Up Your Web App Today10 Things You Can Do to Speed Up Your Web App Today
10 Things You Can Do to Speed Up Your Web App Today
 
Total Browser Pwnag3 V1.0 Public
Total Browser Pwnag3   V1.0 PublicTotal Browser Pwnag3   V1.0 Public
Total Browser Pwnag3 V1.0 Public
 
Producing a mobile presence. Timeline: Yesterday...
Producing a mobile presence. Timeline: Yesterday...Producing a mobile presence. Timeline: Yesterday...
Producing a mobile presence. Timeline: Yesterday...
 
Mobile web performance dwx13
Mobile web performance dwx13Mobile web performance dwx13
Mobile web performance dwx13
 
Delivering High Performance Websites with NGINX
Delivering High Performance Websites with NGINXDelivering High Performance Websites with NGINX
Delivering High Performance Websites with NGINX
 
Browser-Based Digital Preservation
Browser-Based Digital PreservationBrowser-Based Digital Preservation
Browser-Based Digital Preservation
 
Drupal performance and scalability
Drupal performance and scalabilityDrupal performance and scalability
Drupal performance and scalability
 
Loading Performance for Designers – WUDKRK 2017
Loading Performance for Designers – WUDKRK 2017Loading Performance for Designers – WUDKRK 2017
Loading Performance for Designers – WUDKRK 2017
 

More from Sawood Alam

TrendMachine: Temporal Resilience of Web Pages
TrendMachine: Temporal Resilience of Web PagesTrendMachine: Temporal Resilience of Web Pages
TrendMachine: Temporal Resilience of Web PagesSawood Alam
 
CDX Summary: Web Archival Collection Insights
CDX Summary: Web Archival Collection InsightsCDX Summary: Web Archival Collection Insights
CDX Summary: Web Archival Collection InsightsSawood Alam
 
Video Archiving and Playback in the Wayback Machine
Video Archiving and Playback in the Wayback MachineVideo Archiving and Playback in the Wayback Machine
Video Archiving and Playback in the Wayback MachineSawood Alam
 
Profiling Web Archival Voids for Memento Routing
Profiling Web Archival Voids for Memento RoutingProfiling Web Archival Voids for Memento Routing
Profiling Web Archival Voids for Memento RoutingSawood Alam
 
Summarize Your Archival Holdings With MementoMap
Summarize Your Archival Holdings With MementoMapSummarize Your Archival Holdings With MementoMap
Summarize Your Archival Holdings With MementoMapSawood Alam
 
MementoMap: A Web Archive Profiling Framework for Efficient Memento Routing
MementoMap: A Web Archive Profiling Framework for Efficient Memento RoutingMementoMap: A Web Archive Profiling Framework for Efficient Memento Routing
MementoMap: A Web Archive Profiling Framework for Efficient Memento RoutingSawood Alam
 
Supporting Web Archiving via Web Packaging
Supporting Web Archiving via Web PackagingSupporting Web Archiving via Web Packaging
Supporting Web Archiving via Web PackagingSawood Alam
 
MementoMap: An Archive Profile Dissemination Framework
MementoMap: An Archive Profile Dissemination FrameworkMementoMap: An Archive Profile Dissemination Framework
MementoMap: An Archive Profile Dissemination FrameworkSawood Alam
 
Impact of HTTP Cookie Violations in Web Archives
Impact of HTTP Cookie Violations in Web ArchivesImpact of HTTP Cookie Violations in Web Archives
Impact of HTTP Cookie Violations in Web ArchivesSawood Alam
 
Archive Assisted Archival Fixity Verification Framework
Archive Assisted Archival Fixity Verification FrameworkArchive Assisted Archival Fixity Verification Framework
Archive Assisted Archival Fixity Verification FrameworkSawood Alam
 
MementoMap Framework for Flexible and Adaptive Web Archive Profiling
MementoMap Framework for Flexible and Adaptive Web Archive ProfilingMementoMap Framework for Flexible and Adaptive Web Archive Profiling
MementoMap Framework for Flexible and Adaptive Web Archive ProfilingSawood Alam
 
Web ARChive (WARC) File Format
Web ARChive (WARC) File FormatWeb ARChive (WARC) File Format
Web ARChive (WARC) File FormatSawood Alam
 
InterPlanetary Wayback: The Next Step Towards Decentralized Web Archiving
InterPlanetary Wayback: The Next Step Towards Decentralized Web ArchivingInterPlanetary Wayback: The Next Step Towards Decentralized Web Archiving
InterPlanetary Wayback: The Next Step Towards Decentralized Web ArchivingSawood Alam
 
MemGator - A Memento Aggregator CLI and Server in Go
MemGator - A Memento Aggregator CLI and Server in GoMemGator - A Memento Aggregator CLI and Server in Go
MemGator - A Memento Aggregator CLI and Server in GoSawood Alam
 
Dockerize Your Projects - A Brief Introduction to Containerization
Dockerize Your Projects - A Brief Introduction to ContainerizationDockerize Your Projects - A Brief Introduction to Containerization
Dockerize Your Projects - A Brief Introduction to ContainerizationSawood Alam
 
TPDL 2016 Doctoral Consortium - Web Archive Profiling
TPDL 2016 Doctoral Consortium - Web Archive ProfilingTPDL 2016 Doctoral Consortium - Web Archive Profiling
TPDL 2016 Doctoral Consortium - Web Archive ProfilingSawood Alam
 
Introducing Web Archiving and WSDL Research Group
Introducing Web Archiving and WSDL Research GroupIntroducing Web Archiving and WSDL Research Group
Introducing Web Archiving and WSDL Research GroupSawood Alam
 
InterPlanetary Wayback: Peer-To-Peer Permanence of Web Archives
InterPlanetary Wayback: Peer-To-Peer Permanence of Web ArchivesInterPlanetary Wayback: Peer-To-Peer Permanence of Web Archives
InterPlanetary Wayback: Peer-To-Peer Permanence of Web ArchivesSawood Alam
 
Web Archive Profiling Through Fulltext Search
Web Archive Profiling Through Fulltext SearchWeb Archive Profiling Through Fulltext Search
Web Archive Profiling Through Fulltext SearchSawood Alam
 
JCDL 2016 Doctoral Consortium - Web Archive Profiling
JCDL 2016 Doctoral Consortium - Web Archive ProfilingJCDL 2016 Doctoral Consortium - Web Archive Profiling
JCDL 2016 Doctoral Consortium - Web Archive ProfilingSawood Alam
 

More from Sawood Alam (20)

TrendMachine: Temporal Resilience of Web Pages
TrendMachine: Temporal Resilience of Web PagesTrendMachine: Temporal Resilience of Web Pages
TrendMachine: Temporal Resilience of Web Pages
 
CDX Summary: Web Archival Collection Insights
CDX Summary: Web Archival Collection InsightsCDX Summary: Web Archival Collection Insights
CDX Summary: Web Archival Collection Insights
 
Video Archiving and Playback in the Wayback Machine
Video Archiving and Playback in the Wayback MachineVideo Archiving and Playback in the Wayback Machine
Video Archiving and Playback in the Wayback Machine
 
Profiling Web Archival Voids for Memento Routing
Profiling Web Archival Voids for Memento RoutingProfiling Web Archival Voids for Memento Routing
Profiling Web Archival Voids for Memento Routing
 
Summarize Your Archival Holdings With MementoMap
Summarize Your Archival Holdings With MementoMapSummarize Your Archival Holdings With MementoMap
Summarize Your Archival Holdings With MementoMap
 
MementoMap: A Web Archive Profiling Framework for Efficient Memento Routing
MementoMap: A Web Archive Profiling Framework for Efficient Memento RoutingMementoMap: A Web Archive Profiling Framework for Efficient Memento Routing
MementoMap: A Web Archive Profiling Framework for Efficient Memento Routing
 
Supporting Web Archiving via Web Packaging
Supporting Web Archiving via Web PackagingSupporting Web Archiving via Web Packaging
Supporting Web Archiving via Web Packaging
 
MementoMap: An Archive Profile Dissemination Framework
MementoMap: An Archive Profile Dissemination FrameworkMementoMap: An Archive Profile Dissemination Framework
MementoMap: An Archive Profile Dissemination Framework
 
Impact of HTTP Cookie Violations in Web Archives
Impact of HTTP Cookie Violations in Web ArchivesImpact of HTTP Cookie Violations in Web Archives
Impact of HTTP Cookie Violations in Web Archives
 
Archive Assisted Archival Fixity Verification Framework
Archive Assisted Archival Fixity Verification FrameworkArchive Assisted Archival Fixity Verification Framework
Archive Assisted Archival Fixity Verification Framework
 
MementoMap Framework for Flexible and Adaptive Web Archive Profiling
MementoMap Framework for Flexible and Adaptive Web Archive ProfilingMementoMap Framework for Flexible and Adaptive Web Archive Profiling
MementoMap Framework for Flexible and Adaptive Web Archive Profiling
 
Web ARChive (WARC) File Format
Web ARChive (WARC) File FormatWeb ARChive (WARC) File Format
Web ARChive (WARC) File Format
 
InterPlanetary Wayback: The Next Step Towards Decentralized Web Archiving
InterPlanetary Wayback: The Next Step Towards Decentralized Web ArchivingInterPlanetary Wayback: The Next Step Towards Decentralized Web Archiving
InterPlanetary Wayback: The Next Step Towards Decentralized Web Archiving
 
MemGator - A Memento Aggregator CLI and Server in Go
MemGator - A Memento Aggregator CLI and Server in GoMemGator - A Memento Aggregator CLI and Server in Go
MemGator - A Memento Aggregator CLI and Server in Go
 
Dockerize Your Projects - A Brief Introduction to Containerization
Dockerize Your Projects - A Brief Introduction to ContainerizationDockerize Your Projects - A Brief Introduction to Containerization
Dockerize Your Projects - A Brief Introduction to Containerization
 
TPDL 2016 Doctoral Consortium - Web Archive Profiling
TPDL 2016 Doctoral Consortium - Web Archive ProfilingTPDL 2016 Doctoral Consortium - Web Archive Profiling
TPDL 2016 Doctoral Consortium - Web Archive Profiling
 
Introducing Web Archiving and WSDL Research Group
Introducing Web Archiving and WSDL Research GroupIntroducing Web Archiving and WSDL Research Group
Introducing Web Archiving and WSDL Research Group
 
InterPlanetary Wayback: Peer-To-Peer Permanence of Web Archives
InterPlanetary Wayback: Peer-To-Peer Permanence of Web ArchivesInterPlanetary Wayback: Peer-To-Peer Permanence of Web Archives
InterPlanetary Wayback: Peer-To-Peer Permanence of Web Archives
 
Web Archive Profiling Through Fulltext Search
Web Archive Profiling Through Fulltext SearchWeb Archive Profiling Through Fulltext Search
Web Archive Profiling Through Fulltext Search
 
JCDL 2016 Doctoral Consortium - Web Archive Profiling
JCDL 2016 Doctoral Consortium - Web Archive ProfilingJCDL 2016 Doctoral Consortium - Web Archive Profiling
JCDL 2016 Doctoral Consortium - Web Archive Profiling
 

Recently uploaded

Grafana in space: Monitoring Japan's SLIM moon lander in real time
Grafana in space: Monitoring Japan's SLIM moon lander  in real timeGrafana in space: Monitoring Japan's SLIM moon lander  in real time
Grafana in space: Monitoring Japan's SLIM moon lander in real timeSatoshi NAKAHIRA
 
RESPIRATORY ADAPTATIONS TO HYPOXIA IN HUMNAS.pptx
RESPIRATORY ADAPTATIONS TO HYPOXIA IN HUMNAS.pptxRESPIRATORY ADAPTATIONS TO HYPOXIA IN HUMNAS.pptx
RESPIRATORY ADAPTATIONS TO HYPOXIA IN HUMNAS.pptxFarihaAbdulRasheed
 
User Guide: Capricorn FLX™ Weather Station
User Guide: Capricorn FLX™ Weather StationUser Guide: Capricorn FLX™ Weather Station
User Guide: Capricorn FLX™ Weather StationColumbia Weather Systems
 
Solution chemistry, Moral and Normal solutions
Solution chemistry, Moral and Normal solutionsSolution chemistry, Moral and Normal solutions
Solution chemistry, Moral and Normal solutionsHajira Mahmood
 
Transposable elements in prokaryotes.ppt
Transposable elements in prokaryotes.pptTransposable elements in prokaryotes.ppt
Transposable elements in prokaryotes.pptArshadWarsi13
 
preservation, maintanence and improvement of industrial organism.pptx
preservation, maintanence and improvement of industrial organism.pptxpreservation, maintanence and improvement of industrial organism.pptx
preservation, maintanence and improvement of industrial organism.pptxnoordubaliya2003
 
The dark energy paradox leads to a new structure of spacetime.pptx
The dark energy paradox leads to a new structure of spacetime.pptxThe dark energy paradox leads to a new structure of spacetime.pptx
The dark energy paradox leads to a new structure of spacetime.pptxEran Akiva Sinbar
 
Microphone- characteristics,carbon microphone, dynamic microphone.pptx
Microphone- characteristics,carbon microphone, dynamic microphone.pptxMicrophone- characteristics,carbon microphone, dynamic microphone.pptx
Microphone- characteristics,carbon microphone, dynamic microphone.pptxpriyankatabhane
 
Vision and reflection on Mining Software Repositories research in 2024
Vision and reflection on Mining Software Repositories research in 2024Vision and reflection on Mining Software Repositories research in 2024
Vision and reflection on Mining Software Repositories research in 2024AyushiRastogi48
 
Call Girls in Munirka Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Munirka Delhi 💯Call Us 🔝8264348440🔝Call Girls in Munirka Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Munirka Delhi 💯Call Us 🔝8264348440🔝soniya singh
 
FREE NURSING BUNDLE FOR NURSES.PDF by na
FREE NURSING BUNDLE FOR NURSES.PDF by naFREE NURSING BUNDLE FOR NURSES.PDF by na
FREE NURSING BUNDLE FOR NURSES.PDF by naJASISJULIANOELYNV
 
Environmental Biotechnology Topic:- Microbial Biosensor
Environmental Biotechnology Topic:- Microbial BiosensorEnvironmental Biotechnology Topic:- Microbial Biosensor
Environmental Biotechnology Topic:- Microbial Biosensorsonawaneprad
 
Bentham & Hooker's Classification. along with the merits and demerits of the ...
Bentham & Hooker's Classification. along with the merits and demerits of the ...Bentham & Hooker's Classification. along with the merits and demerits of the ...
Bentham & Hooker's Classification. along with the merits and demerits of the ...Nistarini College, Purulia (W.B) India
 
Pests of soyabean_Binomics_IdentificationDr.UPR.pdf
Pests of soyabean_Binomics_IdentificationDr.UPR.pdfPests of soyabean_Binomics_IdentificationDr.UPR.pdf
Pests of soyabean_Binomics_IdentificationDr.UPR.pdfPirithiRaju
 
Is RISC-V ready for HPC workload? Maybe?
Is RISC-V ready for HPC workload? Maybe?Is RISC-V ready for HPC workload? Maybe?
Is RISC-V ready for HPC workload? Maybe?Patrick Diehl
 
Pests of Bengal gram_Identification_Dr.UPR.pdf
Pests of Bengal gram_Identification_Dr.UPR.pdfPests of Bengal gram_Identification_Dr.UPR.pdf
Pests of Bengal gram_Identification_Dr.UPR.pdfPirithiRaju
 
User Guide: Pulsar™ Weather Station (Columbia Weather Systems)
User Guide: Pulsar™ Weather Station (Columbia Weather Systems)User Guide: Pulsar™ Weather Station (Columbia Weather Systems)
User Guide: Pulsar™ Weather Station (Columbia Weather Systems)Columbia Weather Systems
 
GenBio2 - Lesson 1 - Introduction to Genetics.pptx
GenBio2 - Lesson 1 - Introduction to Genetics.pptxGenBio2 - Lesson 1 - Introduction to Genetics.pptx
GenBio2 - Lesson 1 - Introduction to Genetics.pptxBerniceCayabyab1
 
Speech, hearing, noise, intelligibility.pptx
Speech, hearing, noise, intelligibility.pptxSpeech, hearing, noise, intelligibility.pptx
Speech, hearing, noise, intelligibility.pptxpriyankatabhane
 

Recently uploaded (20)

Grafana in space: Monitoring Japan's SLIM moon lander in real time
Grafana in space: Monitoring Japan's SLIM moon lander  in real timeGrafana in space: Monitoring Japan's SLIM moon lander  in real time
Grafana in space: Monitoring Japan's SLIM moon lander in real time
 
RESPIRATORY ADAPTATIONS TO HYPOXIA IN HUMNAS.pptx
RESPIRATORY ADAPTATIONS TO HYPOXIA IN HUMNAS.pptxRESPIRATORY ADAPTATIONS TO HYPOXIA IN HUMNAS.pptx
RESPIRATORY ADAPTATIONS TO HYPOXIA IN HUMNAS.pptx
 
User Guide: Capricorn FLX™ Weather Station
User Guide: Capricorn FLX™ Weather StationUser Guide: Capricorn FLX™ Weather Station
User Guide: Capricorn FLX™ Weather Station
 
Solution chemistry, Moral and Normal solutions
Solution chemistry, Moral and Normal solutionsSolution chemistry, Moral and Normal solutions
Solution chemistry, Moral and Normal solutions
 
Transposable elements in prokaryotes.ppt
Transposable elements in prokaryotes.pptTransposable elements in prokaryotes.ppt
Transposable elements in prokaryotes.ppt
 
preservation, maintanence and improvement of industrial organism.pptx
preservation, maintanence and improvement of industrial organism.pptxpreservation, maintanence and improvement of industrial organism.pptx
preservation, maintanence and improvement of industrial organism.pptx
 
The dark energy paradox leads to a new structure of spacetime.pptx
The dark energy paradox leads to a new structure of spacetime.pptxThe dark energy paradox leads to a new structure of spacetime.pptx
The dark energy paradox leads to a new structure of spacetime.pptx
 
Microphone- characteristics,carbon microphone, dynamic microphone.pptx
Microphone- characteristics,carbon microphone, dynamic microphone.pptxMicrophone- characteristics,carbon microphone, dynamic microphone.pptx
Microphone- characteristics,carbon microphone, dynamic microphone.pptx
 
Volatile Oils Pharmacognosy And Phytochemistry -I
Volatile Oils Pharmacognosy And Phytochemistry -IVolatile Oils Pharmacognosy And Phytochemistry -I
Volatile Oils Pharmacognosy And Phytochemistry -I
 
Vision and reflection on Mining Software Repositories research in 2024
Vision and reflection on Mining Software Repositories research in 2024Vision and reflection on Mining Software Repositories research in 2024
Vision and reflection on Mining Software Repositories research in 2024
 
Call Girls in Munirka Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Munirka Delhi 💯Call Us 🔝8264348440🔝Call Girls in Munirka Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Munirka Delhi 💯Call Us 🔝8264348440🔝
 
FREE NURSING BUNDLE FOR NURSES.PDF by na
FREE NURSING BUNDLE FOR NURSES.PDF by naFREE NURSING BUNDLE FOR NURSES.PDF by na
FREE NURSING BUNDLE FOR NURSES.PDF by na
 
Environmental Biotechnology Topic:- Microbial Biosensor
Environmental Biotechnology Topic:- Microbial BiosensorEnvironmental Biotechnology Topic:- Microbial Biosensor
Environmental Biotechnology Topic:- Microbial Biosensor
 
Bentham & Hooker's Classification. along with the merits and demerits of the ...
Bentham & Hooker's Classification. along with the merits and demerits of the ...Bentham & Hooker's Classification. along with the merits and demerits of the ...
Bentham & Hooker's Classification. along with the merits and demerits of the ...
 
Pests of soyabean_Binomics_IdentificationDr.UPR.pdf
Pests of soyabean_Binomics_IdentificationDr.UPR.pdfPests of soyabean_Binomics_IdentificationDr.UPR.pdf
Pests of soyabean_Binomics_IdentificationDr.UPR.pdf
 
Is RISC-V ready for HPC workload? Maybe?
Is RISC-V ready for HPC workload? Maybe?Is RISC-V ready for HPC workload? Maybe?
Is RISC-V ready for HPC workload? Maybe?
 
Pests of Bengal gram_Identification_Dr.UPR.pdf
Pests of Bengal gram_Identification_Dr.UPR.pdfPests of Bengal gram_Identification_Dr.UPR.pdf
Pests of Bengal gram_Identification_Dr.UPR.pdf
 
User Guide: Pulsar™ Weather Station (Columbia Weather Systems)
User Guide: Pulsar™ Weather Station (Columbia Weather Systems)User Guide: Pulsar™ Weather Station (Columbia Weather Systems)
User Guide: Pulsar™ Weather Station (Columbia Weather Systems)
 
GenBio2 - Lesson 1 - Introduction to Genetics.pptx
GenBio2 - Lesson 1 - Introduction to Genetics.pptxGenBio2 - Lesson 1 - Introduction to Genetics.pptx
GenBio2 - Lesson 1 - Introduction to Genetics.pptx
 
Speech, hearing, noise, intelligibility.pptx
Speech, hearing, noise, intelligibility.pptxSpeech, hearing, noise, intelligibility.pptx
Speech, hearing, noise, intelligibility.pptx
 

Client-side reconstruction of mementos using ServiceWorker

  • 1. Client-side Reconstruction of Composite Mementos Using ServiceWorker Sawood Alam, Mat Kelly, Michele C. Weigle, and Michael L. Nelson Web Science and Digital Libraries Research Group Old Dominion University, Norfolk, VA, 23529 @ibnesayeed @WebSciDL Supported in part by NSF III 1526700 1 JCDL 2017, June 19-23, 2017, Toronto, Ontario, Canada
  • 2. Sawood Alam <@ibnesayeed> 2008 Memento Seen in 2017 2 ● https://ws-dl.blogspot.com/2015/12/2015-12-08-evaluating-temporal.html ?
  • 3. Sawood Alam <@ibnesayeed> 2008 Memento Seen in 2012 3 ● http://ws-dl.blogspot.com/2012/10/2012-10-10-zombies-in-archives.html
  • 6. Sawood Alam <@ibnesayeed> Zombies in Archive 6 <img src="http://xenland.alpha/images/map.png"> // Is rewritten on replay to become: <img src="http://archive.example.org/1998/http://xenland.alpha/images/map.png"> // URLs constructed by JavaScript are harder to rewrite on replay, e.g.: var base = 'http://xenland.alpha'; var imgdir = '/images/'; var img = document.createElement('img'); img.src = base + imgdir + 'ruler.png'; document.getElementById('ruler').appendChild(img); //=>> http://xenland.alpha/images/ruler.png
  • 7. Sawood Alam <@ibnesayeed> Replay URL Resolution & Rewriting 7 Reference type Example Resolution after relocation Relative path images/logo.png Potentially correct Absolute path /public/images/logo.png Potentially incorrect Absolute URL http://example.com/public/images/logo.png Potentially live leakage http://example.com/public/index.html ... <img src="/public/images/logo.png"> ... http://archive.example.org/<datetime>/http://example.com/public/index.html ... <img src="/<datetime>/http://example.com/public/images/logo.png"> ...
  • 8. Sawood Alam <@ibnesayeed> Avoiding Zombies ● Ahead-of-time rendering and JS execution ○ http://archive.is/ ● Archival replay proxy ○ https://github.com/ikreymer/pywb/wiki/Pywb-Proxy-Mode-Usage ● Browser extension ○ MementoFox (deprecated) ● JS override ○ wombat.js in PyWB ● ServiceWorker 8
  • 9. Sawood Alam <@ibnesayeed> ● New web API (still a working draft) ● A standalone JavaScript file ● Persists in the browser independent of the window ● Acts as a proxy ● Installed by a web page under its domain at a specific path (called scope) ● Intercepts all requests in scope ○ Resources under the scope path (at any depth) ○ Secondary resource requests originated from any resource under scope ● Allows modification in request and response ● Primarily used in web applications for offline access and notification support ● Requires HTTPS ● Growing browser support (73.61% as of June 8, 2017) ServiceWorker 9 ● http://caniuse.com/#feat=serviceworkers
  • 10. Sawood Alam <@ibnesayeed> reconstructive.js 10 ● https://github.com/oduwsdl/reconstructive ● A ServiceWorker script written for archival replay ● Plug-in for web archives or Memento aggregators ● Intercepts all network requests originated from a memento ● Reroutes requests to an archive (prevents live leakage & incorrect references) ● Optionally rewrites the content to add banner & to fix hyperlinks
  • 11. Sawood Alam <@ibnesayeed> Zombies, No More! 11 ● https://github.com/oduwsdl/ipwb
  • 12. Sawood Alam <@ibnesayeed> Rewriting Mementos is Expensive 12 Original capture (without any rewriting) In our experiment over 500 home pages we observed: ● One-fifth mean data overhead ● One-third mean time overhead 15% more data in twice the time
  • 13. Sawood Alam <@ibnesayeed> Archival Capture Replay Test Suite (ACRTS) 13 reconstructive.js ● https://ibnesayeed.github.io/acrts/
  • 14. Sawood Alam <@ibnesayeed> Reconstruction Winners: PyWB & reconstructive.js A. OpenWayback B. PyWB C. Memento Reconstruct D. Memento for Chrome E. reconstructive.js 14
  • 15. Sawood Alam <@ibnesayeed> Future Work ● Use “Prefer” header for original content (when archives support it) ● Add a customizable archival banner ● Add click handler for lazy rewriting of hyperlinks ● Handle archived ServiceWorkers ● Write a 404-combat ServiceWorker script for webmasters 15 ● http://ws-dl.blogspot.co.uk/2016/08/2016-08-15-mementos-in-raw-take-two.html
  • 16. Sawood Alam <@ibnesayeed> ● reconstructive.js => no zombies! ● Rerouting instead of rewriting (lazy rewriting) ● Mean overhead reduction ○ one-fifth data ○ one-third time ● 73.61% (and growing) browser support for ServiceWorker ○ http://caniuse.com/#feat=serviceworkers ● reconstructive.js ○ https://github.com/oduwsdl/reconstructive ● Archival Capture Replay Test Suite ○ https://ibnesayeed.github.io/acrts/ Conclusions 16 ● In-depth recap: WADL 2017 Thursday, June 22, 3:45pm (https://fox.cs.vt.edu/wadl2017.html)