SlideShare a Scribd company logo
1 of 28
Reconstructing the past with
MediaWiki:
Programmatic Issues and Solutions
Shawn M. Jones
sjone@cs.odu.edu
Old Dominion University
Reconstructing the Past with the
Internet Archive
HTML
Images
JavaScript
CSS
Our goal: Temporal Coherence
Make the page look as it looked at the time it was archived.
Some Results from the Internet
Archive Are Lacking
Images change between the time
the Archive crawls the main page
and the time it gets to the images
Sometimes embedded images
are missing when the Archive
gets to them
Sometimes the page is designed
for a specific browser in mind
Image from “A Framework for Evaluation of Composite Memento Temporal Coherence”
by S. Ainsworth, M. L. Nelson, H. Van de Sompel. http://arxiv.org/abs/1402.0928
MediaWiki Shouldn’t Have This
Problem
HTML Images
JavaScript
CSS
What we’re not doing
Interest in Reconstructing the Past
With MediaWiki
Simplified Memento Overview
Rules for Reconstructing the Past With
MediaWiki
Do not modify any existing MediaWiki
code!
Conform to
MediaWiki
coding standards
And…
Reconstructing the Past
Articles
Templates
Embedded Images
Embedded JavaScript
Embedded CSS
Accessing Old Article Text
The oldid argument references a revision of a page
within MediaWiki's database
Merely visiting the URI with the oldid will give you the
text content of the page as it existed at that revision
Reconstructing the Past
Articles
Handled by
Memento MediaWiki Extension
Templates
Embedded Images
Embedded JavaScript
Embedded CSS
Including the Right Template
This gives us:
$title - the Title object for the given page
$parser - the Parser object for the given page
$id - the revision ID (oldid) for the Template page
Using $parser, and $title, we can change the $id and
fetch an old revision of the Template
Reconstructing the Past
Articles
Handled by
Memento MediaWiki Extension
Templates
Handled by
Memento MediaWiki Extension
Embedded Images
Embedded JavaScript
Embedded CSS
But What About Images?
This Map is important to
understanding the
content of this article
This image is changed
as the article is
changed, to reflect its
content
It’s the same map if we look at the
June 6, 2013 revision now
Users can't view this
embedded resource as
it looked on June 2013
while reading the article
from that time period
What should have happened
This is the the map from
June, 2013 that should
have been displayed
This is the current map
The content of the article won't match the data in this visual aide, possibly
confusing a user who wanted historical information on this topic
We Tried To Solve This
Upon further inspection of the code in MediaWiki, the $time argument
from this function is never used as detailed here
We Just Solved This
Upon further inspection of the code in MediaWiki, the $file argument’s
getHistory() function can be used to acquire previous revisions of images
Reconstructing the Past
Articles
Handled by
Memento MediaWiki Extension
Templates
Handled by
Memento MediaWiki Extension
Embedded Images
Prototyped for future version of
Memento MediaWiki Extension
Embedded JavaScript
Embedded CSS
What about CSS/JavaScript?
The present CSS of
this page conflicts
with the past
Template.
We Couldn’t Solve This
The data is present, but we could not find any way for an
extension to access or render it.
Recap on Reconstructing the Past
Articles
Handled by
Memento MediaWiki Extension
Templates
Handled by
Memento MediaWiki Extension
Embedded Images
Prototyped for future version of
Memento MediaWiki Extension
Embedded JavaScript
Requires changes to MediaWiki
Embedded CSS
Requires changes to MediaWiki
Uniform solution
• RFC 7089, Memento, was designed to provide
uniform access to past versions of all resources
on the Web
• Memento provides a web standard to access
these resources
Resources
• Memento Protocol: http://tools.ietf.org/html/rfc7089
• Memento Website: http://www.mementoweb.org/
• Memento MediaWiki Extension:
http://www.mediawiki.org/wiki/Extension:Memento
• Memento Chrome Extension:
http://bit.ly/memento-for-chrome
• More details:
http://ws-dl.blogspot.com/2014/04/2014-04-01-
yesterdays-wiki-page-todays.html
• Contact me: sjone@cs.odu.edu
Backup Slides
Sample URI-R (Step 1) HTTP Response
HTTP/1.1 200 OK
Date: Sun, 25 May 2014 21:39:02 GMT
Server: Apache
X-Content-Type-Options: nosniff
Link: http://ws-dl-05.cs.odu.edu/demo/index.php/Daenerys_Targaryen;
rel="original latest-version",
http://ws-dl-
05.cs.odu.edu/demo/index.php/Special:TimeGate/Daenerys_Targaryen;
rel="timegate",
http://ws-dl-
05.cs.odu.edu/demo/index.php/Special:TimeMap/Daenerys_Targaryen;
rel="timemap”; type="application/link-format”
Content-language: en
Vary: Accept-Encoding,Cookie
Cache-Control: s-maxage=18000, must-revalidate, max-age=0
Last-Modified: Sat, 17 May 2014 16:48:28 GMT
Connection: close
Content-Type: text/html; charset=UTF-8
Sample URI-G (Step 2) HTTP Response
HTTP/1.1 302 Found
Date: Sun, 25 May 2014 21:43:08 GMT
Server: Apache
X-Content-Type-Options: nosniff
Vary: Accept-Encoding, Accept-Datetime
Location: http://ws-dl-
05.cs.odu.edu/demo/index.php?title=Daenerys_Targaryen&oldid=1499
Link: <http://ws-dl-
05.cs.odu.edu/demo/index.php/Special:TimeMap/Daenerys_Targaryen>;
rel="timemap”; type="application/link-format",
<http://ws-dl-05.cs.odu.edu/demo/index.php/Daenerys_Targaryen>;
rel="original latest-version”
Connection: close
Content-Type: text/html; charset=UTF-8
Sample URI-M (Step 3) HTTP Response
HTTP/1.1 200 OK
Date: Sun, 25 May 2014 21:46:12 GMT
Server: Apache
X-Content-Type-Options: nosniff
Memento-Datetime: Sun, 22 Apr 2007 15:01:20 GMT
Link: <http://ws-dl-05.cs.odu.edu/demo/index.php/Daenerys_Targaryen>;
rel="original latest-version”,
<http://ws-dl-
05.cs.odu.edu/demo/index.php/Special:TimeGate/Daenerys_Targaryen>;
rel="timegate”,
<http://ws-dl-
05.cs.odu.edu/demo/index.php/Special:TimeMap/Daenerys_Targaryen>;
rel="timemap”; type="application/link-format”
Content-language: en
Vary: Accept-Encoding,Cookie
Expires: Thu, 01 Jan 1970 00:00:00 GMT
Cache-Control: private, must-revalidate, max-age=0
Connection: close
Content-Type: text/html; charset=UTF-8

More Related Content

Similar to Reconstructing Past Wiki Pages Programmatically

Designing CSS Layouts for the Flexible Web
Designing CSS Layouts for the Flexible WebDesigning CSS Layouts for the Flexible Web
Designing CSS Layouts for the Flexible WebZoe Gillenwater
 
SharePoint as a Web CMS
SharePoint as a Web CMSSharePoint as a Web CMS
SharePoint as a Web CMSCraig Bailey
 
Adapting to a Responsive Design at Untangle the Web on 29th July 2013
Adapting to a Responsive Design at Untangle the Web on 29th July 2013Adapting to a Responsive Design at Untangle the Web on 29th July 2013
Adapting to a Responsive Design at Untangle the Web on 29th July 2013Matt Gibson
 
Bootstrap for Beginners
Bootstrap for BeginnersBootstrap for Beginners
Bootstrap for BeginnersD'arce Hess
 
WordPress Theme Structure
WordPress Theme StructureWordPress Theme Structure
WordPress Theme Structurekeithdevon
 
"Responsive Web Design: Clever Tips and Techniques". Vitaly Friedman, Smashin...
"Responsive Web Design: Clever Tips and Techniques". Vitaly Friedman, Smashin..."Responsive Web Design: Clever Tips and Techniques". Vitaly Friedman, Smashin...
"Responsive Web Design: Clever Tips and Techniques". Vitaly Friedman, Smashin...Yandex
 
Pablo Villalba -
Pablo Villalba - Pablo Villalba -
Pablo Villalba - .toster
 
EECI2009 - From Design to Dynamic - Rapid ExpressionEngine Development
EECI2009 - From Design to Dynamic - Rapid ExpressionEngine DevelopmentEECI2009 - From Design to Dynamic - Rapid ExpressionEngine Development
EECI2009 - From Design to Dynamic - Rapid ExpressionEngine DevelopmentFortySeven Media
 
Introduction to Web Components
Introduction to Web ComponentsIntroduction to Web Components
Introduction to Web ComponentsRich Bradshaw
 
3) web development
3) web development3) web development
3) web developmenttechbed
 
2012.10 Liferay Europe Symposium, Alistair Oldfield
2012.10 Liferay Europe Symposium, Alistair Oldfield2012.10 Liferay Europe Symposium, Alistair Oldfield
2012.10 Liferay Europe Symposium, Alistair OldfieldEmeldi Group
 
WordCamp Greenville 2018 - Beware the Dark Side, or an Intro to Development
WordCamp Greenville 2018 - Beware the Dark Side, or an Intro to DevelopmentWordCamp Greenville 2018 - Beware the Dark Side, or an Intro to Development
WordCamp Greenville 2018 - Beware the Dark Side, or an Intro to DevelopmentEvan Mullins
 
WordCamp Asheville 2017 - So You Wanna Dev? Join the Team!
WordCamp Asheville 2017 - So You Wanna Dev? Join the Team!WordCamp Asheville 2017 - So You Wanna Dev? Join the Team!
WordCamp Asheville 2017 - So You Wanna Dev? Join the Team!Evan Mullins
 
CreateJS hackathon in Zurich
CreateJS hackathon in ZurichCreateJS hackathon in Zurich
CreateJS hackathon in ZurichHenri Bergius
 
Semantic Media Wiki & Semantic Forms
Semantic Media Wiki & Semantic FormsSemantic Media Wiki & Semantic Forms
Semantic Media Wiki & Semantic FormsSergeyChernyshev
 
A Comprehensive Guide on Building Lightning-Fast Websites with React Static S...
A Comprehensive Guide on Building Lightning-Fast Websites with React Static S...A Comprehensive Guide on Building Lightning-Fast Websites with React Static S...
A Comprehensive Guide on Building Lightning-Fast Websites with React Static S...Inexture Solutions
 

Similar to Reconstructing Past Wiki Pages Programmatically (20)

Designing CSS Layouts for the Flexible Web
Designing CSS Layouts for the Flexible WebDesigning CSS Layouts for the Flexible Web
Designing CSS Layouts for the Flexible Web
 
SharePoint as a Web CMS
SharePoint as a Web CMSSharePoint as a Web CMS
SharePoint as a Web CMS
 
Aaug sample slides
Aaug sample slidesAaug sample slides
Aaug sample slides
 
Adapting to a Responsive Design at Untangle the Web on 29th July 2013
Adapting to a Responsive Design at Untangle the Web on 29th July 2013Adapting to a Responsive Design at Untangle the Web on 29th July 2013
Adapting to a Responsive Design at Untangle the Web on 29th July 2013
 
Bootstrap for Beginners
Bootstrap for BeginnersBootstrap for Beginners
Bootstrap for Beginners
 
WordPress Theme Structure
WordPress Theme StructureWordPress Theme Structure
WordPress Theme Structure
 
"Responsive Web Design: Clever Tips and Techniques". Vitaly Friedman, Smashin...
"Responsive Web Design: Clever Tips and Techniques". Vitaly Friedman, Smashin..."Responsive Web Design: Clever Tips and Techniques". Vitaly Friedman, Smashin...
"Responsive Web Design: Clever Tips and Techniques". Vitaly Friedman, Smashin...
 
MANAGE STATIC RESOURCES IN SITECORE IN HELIX WAY
MANAGE STATIC RESOURCES IN SITECORE IN HELIX WAYMANAGE STATIC RESOURCES IN SITECORE IN HELIX WAY
MANAGE STATIC RESOURCES IN SITECORE IN HELIX WAY
 
Pablo Villalba -
Pablo Villalba - Pablo Villalba -
Pablo Villalba -
 
2012.10 Oldfield
2012.10 Oldfield2012.10 Oldfield
2012.10 Oldfield
 
EECI2009 - From Design to Dynamic - Rapid ExpressionEngine Development
EECI2009 - From Design to Dynamic - Rapid ExpressionEngine DevelopmentEECI2009 - From Design to Dynamic - Rapid ExpressionEngine Development
EECI2009 - From Design to Dynamic - Rapid ExpressionEngine Development
 
Introduction to Web Components
Introduction to Web ComponentsIntroduction to Web Components
Introduction to Web Components
 
3) web development
3) web development3) web development
3) web development
 
React in 2018
React in 2018React in 2018
React in 2018
 
2012.10 Liferay Europe Symposium, Alistair Oldfield
2012.10 Liferay Europe Symposium, Alistair Oldfield2012.10 Liferay Europe Symposium, Alistair Oldfield
2012.10 Liferay Europe Symposium, Alistair Oldfield
 
WordCamp Greenville 2018 - Beware the Dark Side, or an Intro to Development
WordCamp Greenville 2018 - Beware the Dark Side, or an Intro to DevelopmentWordCamp Greenville 2018 - Beware the Dark Side, or an Intro to Development
WordCamp Greenville 2018 - Beware the Dark Side, or an Intro to Development
 
WordCamp Asheville 2017 - So You Wanna Dev? Join the Team!
WordCamp Asheville 2017 - So You Wanna Dev? Join the Team!WordCamp Asheville 2017 - So You Wanna Dev? Join the Team!
WordCamp Asheville 2017 - So You Wanna Dev? Join the Team!
 
CreateJS hackathon in Zurich
CreateJS hackathon in ZurichCreateJS hackathon in Zurich
CreateJS hackathon in Zurich
 
Semantic Media Wiki & Semantic Forms
Semantic Media Wiki & Semantic FormsSemantic Media Wiki & Semantic Forms
Semantic Media Wiki & Semantic Forms
 
A Comprehensive Guide on Building Lightning-Fast Websites with React Static S...
A Comprehensive Guide on Building Lightning-Fast Websites with React Static S...A Comprehensive Guide on Building Lightning-Fast Websites with React Static S...
A Comprehensive Guide on Building Lightning-Fast Websites with React Static S...
 

More from Shawn Jones

Abstract Images Have Different Levels of Retrievability Per Reverse Image Sea...
Abstract Images Have Different Levels of Retrievability Per Reverse Image Sea...Abstract Images Have Different Levels of Retrievability Per Reverse Image Sea...
Abstract Images Have Different Levels of Retrievability Per Reverse Image Sea...Shawn Jones
 
DIRA 2022 Poster -- Abstract Images Have Different Levels of Retrievability P...
DIRA 2022 Poster -- Abstract Images Have Different Levels of Retrievability P...DIRA 2022 Poster -- Abstract Images Have Different Levels of Retrievability P...
DIRA 2022 Poster -- Abstract Images Have Different Levels of Retrievability P...Shawn Jones
 
Abstract Images Have Different Levels of Retrievability Per Reverse Image Sea...
Abstract Images Have Different Levels of Retrievability Per Reverse Image Sea...Abstract Images Have Different Levels of Retrievability Per Reverse Image Sea...
Abstract Images Have Different Levels of Retrievability Per Reverse Image Sea...Shawn Jones
 
It’s All About The Cards: Sharing on Social Media Encouraged HTML Metadata G...
It’s All About The Cards: Sharing on Social Media Encouraged HTML Metadata G...It’s All About The Cards: Sharing on Social Media Encouraged HTML Metadata G...
It’s All About The Cards: Sharing on Social Media Encouraged HTML Metadata G...Shawn Jones
 
Improving Collection Understanding For Web Archives With Storytelling: Shinin...
Improving Collection Understanding For Web Archives With Storytelling: Shinin...Improving Collection Understanding For Web Archives With Storytelling: Shinin...
Improving Collection Understanding For Web Archives With Storytelling: Shinin...Shawn Jones
 
Automatically Selecting Striking Images for Social Cards
Automatically Selecting Striking Images for Social CardsAutomatically Selecting Striking Images for Social Cards
Automatically Selecting Striking Images for Social CardsShawn Jones
 
SHARI (StoryGraph Hypercane ArchiveNow Raintale Integration)
SHARI(StoryGraph Hypercane ArchiveNow Raintale Integration)SHARI(StoryGraph Hypercane ArchiveNow Raintale Integration)
SHARI (StoryGraph Hypercane ArchiveNow Raintale Integration)Shawn Jones
 
Social Cards Probably Provide For Better Understanding Of Web Archive Collect...
Social Cards Probably Provide For Better Understanding Of Web Archive Collect...Social Cards Probably Provide For Better Understanding Of Web Archive Collect...
Social Cards Probably Provide For Better Understanding Of Web Archive Collect...Shawn Jones
 
Storytelling With Web Archives
Storytelling With Web ArchivesStorytelling With Web Archives
Storytelling With Web ArchivesShawn Jones
 
Combining Social Media Storytelling With Web Archives
Combining Social Media Storytelling With Web ArchivesCombining Social Media Storytelling With Web Archives
Combining Social Media Storytelling With Web ArchivesShawn Jones
 
Improving Understanding of Web Archive Collections Through Storytelling - PhD...
Improving Understanding of Web Archive Collections Through Storytelling - PhD...Improving Understanding of Web Archive Collections Through Storytelling - PhD...
Improving Understanding of Web Archive Collections Through Storytelling - PhD...Shawn Jones
 
The Off-Topic Memento Toolkit
The Off-Topic Memento ToolkitThe Off-Topic Memento Toolkit
The Off-Topic Memento ToolkitShawn Jones
 
The Many Shapes of Archive-It
The Many Shapes of Archive-ItThe Many Shapes of Archive-It
The Many Shapes of Archive-ItShawn Jones
 
Improving Collection Understanding in Web Archives
Improving Collection Understanding in Web ArchivesImproving Collection Understanding in Web Archives
Improving Collection Understanding in Web ArchivesShawn Jones
 
Where Can We Post Stories Summarizing Web Archive Collections
Where Can We Post Stories Summarizing Web Archive CollectionsWhere Can We Post Stories Summarizing Web Archive Collections
Where Can We Post Stories Summarizing Web Archive CollectionsShawn Jones
 

More from Shawn Jones (16)

Abstract Images Have Different Levels of Retrievability Per Reverse Image Sea...
Abstract Images Have Different Levels of Retrievability Per Reverse Image Sea...Abstract Images Have Different Levels of Retrievability Per Reverse Image Sea...
Abstract Images Have Different Levels of Retrievability Per Reverse Image Sea...
 
DIRA 2022 Poster -- Abstract Images Have Different Levels of Retrievability P...
DIRA 2022 Poster -- Abstract Images Have Different Levels of Retrievability P...DIRA 2022 Poster -- Abstract Images Have Different Levels of Retrievability P...
DIRA 2022 Poster -- Abstract Images Have Different Levels of Retrievability P...
 
Abstract Images Have Different Levels of Retrievability Per Reverse Image Sea...
Abstract Images Have Different Levels of Retrievability Per Reverse Image Sea...Abstract Images Have Different Levels of Retrievability Per Reverse Image Sea...
Abstract Images Have Different Levels of Retrievability Per Reverse Image Sea...
 
It’s All About The Cards: Sharing on Social Media Encouraged HTML Metadata G...
It’s All About The Cards: Sharing on Social Media Encouraged HTML Metadata G...It’s All About The Cards: Sharing on Social Media Encouraged HTML Metadata G...
It’s All About The Cards: Sharing on Social Media Encouraged HTML Metadata G...
 
Improving Collection Understanding For Web Archives With Storytelling: Shinin...
Improving Collection Understanding For Web Archives With Storytelling: Shinin...Improving Collection Understanding For Web Archives With Storytelling: Shinin...
Improving Collection Understanding For Web Archives With Storytelling: Shinin...
 
Automatically Selecting Striking Images for Social Cards
Automatically Selecting Striking Images for Social CardsAutomatically Selecting Striking Images for Social Cards
Automatically Selecting Striking Images for Social Cards
 
SHARI (StoryGraph Hypercane ArchiveNow Raintale Integration)
SHARI(StoryGraph Hypercane ArchiveNow Raintale Integration)SHARI(StoryGraph Hypercane ArchiveNow Raintale Integration)
SHARI (StoryGraph Hypercane ArchiveNow Raintale Integration)
 
Social Cards Probably Provide For Better Understanding Of Web Archive Collect...
Social Cards Probably Provide For Better Understanding Of Web Archive Collect...Social Cards Probably Provide For Better Understanding Of Web Archive Collect...
Social Cards Probably Provide For Better Understanding Of Web Archive Collect...
 
Storytelling With Web Archives
Storytelling With Web ArchivesStorytelling With Web Archives
Storytelling With Web Archives
 
Combining Social Media Storytelling With Web Archives
Combining Social Media Storytelling With Web ArchivesCombining Social Media Storytelling With Web Archives
Combining Social Media Storytelling With Web Archives
 
Improving Understanding of Web Archive Collections Through Storytelling - PhD...
Improving Understanding of Web Archive Collections Through Storytelling - PhD...Improving Understanding of Web Archive Collections Through Storytelling - PhD...
Improving Understanding of Web Archive Collections Through Storytelling - PhD...
 
The Off-Topic Memento Toolkit
The Off-Topic Memento ToolkitThe Off-Topic Memento Toolkit
The Off-Topic Memento Toolkit
 
The Many Shapes of Archive-It
The Many Shapes of Archive-ItThe Many Shapes of Archive-It
The Many Shapes of Archive-It
 
Improving Collection Understanding in Web Archives
Improving Collection Understanding in Web ArchivesImproving Collection Understanding in Web Archives
Improving Collection Understanding in Web Archives
 
Reference Rot
Reference RotReference Rot
Reference Rot
 
Where Can We Post Stories Summarizing Web Archive Collections
Where Can We Post Stories Summarizing Web Archive CollectionsWhere Can We Post Stories Summarizing Web Archive Collections
Where Can We Post Stories Summarizing Web Archive Collections
 

Recently uploaded

How to Track Employee Performance A Comprehensive Guide.pdf
How to Track Employee Performance A Comprehensive Guide.pdfHow to Track Employee Performance A Comprehensive Guide.pdf
How to Track Employee Performance A Comprehensive Guide.pdfLivetecs LLC
 
Unveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML DiagramsUnveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML DiagramsAhmed Mohamed
 
Buds n Tech IT Solutions: Top-Notch Web Services in Noida
Buds n Tech IT Solutions: Top-Notch Web Services in NoidaBuds n Tech IT Solutions: Top-Notch Web Services in Noida
Buds n Tech IT Solutions: Top-Notch Web Services in Noidabntitsolutionsrishis
 
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...Christina Lin
 
Balasore Best It Company|| Top 10 IT Company || Balasore Software company Odisha
Balasore Best It Company|| Top 10 IT Company || Balasore Software company OdishaBalasore Best It Company|| Top 10 IT Company || Balasore Software company Odisha
Balasore Best It Company|| Top 10 IT Company || Balasore Software company Odishasmiwainfosol
 
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)jennyeacort
 
Unveiling the Future: Sylius 2.0 New Features
Unveiling the Future: Sylius 2.0 New FeaturesUnveiling the Future: Sylius 2.0 New Features
Unveiling the Future: Sylius 2.0 New FeaturesŁukasz Chruściel
 
EY_Graph Database Powered Sustainability
EY_Graph Database Powered SustainabilityEY_Graph Database Powered Sustainability
EY_Graph Database Powered SustainabilityNeo4j
 
Implementing Zero Trust strategy with Azure
Implementing Zero Trust strategy with AzureImplementing Zero Trust strategy with Azure
Implementing Zero Trust strategy with AzureDinusha Kumarasiri
 
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed DataAlluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed DataAlluxio, Inc.
 
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...Matt Ray
 
SpotFlow: Tracking Method Calls and States at Runtime
SpotFlow: Tracking Method Calls and States at RuntimeSpotFlow: Tracking Method Calls and States at Runtime
SpotFlow: Tracking Method Calls and States at Runtimeandrehoraa
 
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样umasea
 
Automate your Kamailio Test Calls - Kamailio World 2024
Automate your Kamailio Test Calls - Kamailio World 2024Automate your Kamailio Test Calls - Kamailio World 2024
Automate your Kamailio Test Calls - Kamailio World 2024Andreas Granig
 
Software Project Health Check: Best Practices and Techniques for Your Product...
Software Project Health Check: Best Practices and Techniques for Your Product...Software Project Health Check: Best Practices and Techniques for Your Product...
Software Project Health Check: Best Practices and Techniques for Your Product...Velvetech LLC
 
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...stazi3110
 
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...soniya singh
 
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024StefanoLambiase
 
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte GermanySuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte GermanyChristoph Pohl
 

Recently uploaded (20)

How to Track Employee Performance A Comprehensive Guide.pdf
How to Track Employee Performance A Comprehensive Guide.pdfHow to Track Employee Performance A Comprehensive Guide.pdf
How to Track Employee Performance A Comprehensive Guide.pdf
 
Unveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML DiagramsUnveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML Diagrams
 
2.pdf Ejercicios de programación competitiva
2.pdf Ejercicios de programación competitiva2.pdf Ejercicios de programación competitiva
2.pdf Ejercicios de programación competitiva
 
Buds n Tech IT Solutions: Top-Notch Web Services in Noida
Buds n Tech IT Solutions: Top-Notch Web Services in NoidaBuds n Tech IT Solutions: Top-Notch Web Services in Noida
Buds n Tech IT Solutions: Top-Notch Web Services in Noida
 
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
 
Balasore Best It Company|| Top 10 IT Company || Balasore Software company Odisha
Balasore Best It Company|| Top 10 IT Company || Balasore Software company OdishaBalasore Best It Company|| Top 10 IT Company || Balasore Software company Odisha
Balasore Best It Company|| Top 10 IT Company || Balasore Software company Odisha
 
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)
 
Unveiling the Future: Sylius 2.0 New Features
Unveiling the Future: Sylius 2.0 New FeaturesUnveiling the Future: Sylius 2.0 New Features
Unveiling the Future: Sylius 2.0 New Features
 
EY_Graph Database Powered Sustainability
EY_Graph Database Powered SustainabilityEY_Graph Database Powered Sustainability
EY_Graph Database Powered Sustainability
 
Implementing Zero Trust strategy with Azure
Implementing Zero Trust strategy with AzureImplementing Zero Trust strategy with Azure
Implementing Zero Trust strategy with Azure
 
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed DataAlluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
 
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
 
SpotFlow: Tracking Method Calls and States at Runtime
SpotFlow: Tracking Method Calls and States at RuntimeSpotFlow: Tracking Method Calls and States at Runtime
SpotFlow: Tracking Method Calls and States at Runtime
 
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
 
Automate your Kamailio Test Calls - Kamailio World 2024
Automate your Kamailio Test Calls - Kamailio World 2024Automate your Kamailio Test Calls - Kamailio World 2024
Automate your Kamailio Test Calls - Kamailio World 2024
 
Software Project Health Check: Best Practices and Techniques for Your Product...
Software Project Health Check: Best Practices and Techniques for Your Product...Software Project Health Check: Best Practices and Techniques for Your Product...
Software Project Health Check: Best Practices and Techniques for Your Product...
 
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
 
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
 
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024
 
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte GermanySuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
 

Reconstructing Past Wiki Pages Programmatically

  • 1. Reconstructing the past with MediaWiki: Programmatic Issues and Solutions Shawn M. Jones sjone@cs.odu.edu Old Dominion University
  • 2. Reconstructing the Past with the Internet Archive HTML Images JavaScript CSS Our goal: Temporal Coherence Make the page look as it looked at the time it was archived.
  • 3. Some Results from the Internet Archive Are Lacking Images change between the time the Archive crawls the main page and the time it gets to the images Sometimes embedded images are missing when the Archive gets to them Sometimes the page is designed for a specific browser in mind Image from “A Framework for Evaluation of Composite Memento Temporal Coherence” by S. Ainsworth, M. L. Nelson, H. Van de Sompel. http://arxiv.org/abs/1402.0928
  • 4. MediaWiki Shouldn’t Have This Problem HTML Images JavaScript CSS
  • 6. Interest in Reconstructing the Past With MediaWiki
  • 8. Rules for Reconstructing the Past With MediaWiki Do not modify any existing MediaWiki code! Conform to MediaWiki coding standards And…
  • 9. Reconstructing the Past Articles Templates Embedded Images Embedded JavaScript Embedded CSS
  • 10. Accessing Old Article Text The oldid argument references a revision of a page within MediaWiki's database Merely visiting the URI with the oldid will give you the text content of the page as it existed at that revision
  • 11. Reconstructing the Past Articles Handled by Memento MediaWiki Extension Templates Embedded Images Embedded JavaScript Embedded CSS
  • 12. Including the Right Template This gives us: $title - the Title object for the given page $parser - the Parser object for the given page $id - the revision ID (oldid) for the Template page Using $parser, and $title, we can change the $id and fetch an old revision of the Template
  • 13. Reconstructing the Past Articles Handled by Memento MediaWiki Extension Templates Handled by Memento MediaWiki Extension Embedded Images Embedded JavaScript Embedded CSS
  • 14. But What About Images? This Map is important to understanding the content of this article This image is changed as the article is changed, to reflect its content
  • 15. It’s the same map if we look at the June 6, 2013 revision now Users can't view this embedded resource as it looked on June 2013 while reading the article from that time period
  • 16. What should have happened This is the the map from June, 2013 that should have been displayed This is the current map The content of the article won't match the data in this visual aide, possibly confusing a user who wanted historical information on this topic
  • 17. We Tried To Solve This Upon further inspection of the code in MediaWiki, the $time argument from this function is never used as detailed here
  • 18. We Just Solved This Upon further inspection of the code in MediaWiki, the $file argument’s getHistory() function can be used to acquire previous revisions of images
  • 19. Reconstructing the Past Articles Handled by Memento MediaWiki Extension Templates Handled by Memento MediaWiki Extension Embedded Images Prototyped for future version of Memento MediaWiki Extension Embedded JavaScript Embedded CSS
  • 20. What about CSS/JavaScript? The present CSS of this page conflicts with the past Template.
  • 21. We Couldn’t Solve This The data is present, but we could not find any way for an extension to access or render it.
  • 22. Recap on Reconstructing the Past Articles Handled by Memento MediaWiki Extension Templates Handled by Memento MediaWiki Extension Embedded Images Prototyped for future version of Memento MediaWiki Extension Embedded JavaScript Requires changes to MediaWiki Embedded CSS Requires changes to MediaWiki
  • 23. Uniform solution • RFC 7089, Memento, was designed to provide uniform access to past versions of all resources on the Web • Memento provides a web standard to access these resources
  • 24. Resources • Memento Protocol: http://tools.ietf.org/html/rfc7089 • Memento Website: http://www.mementoweb.org/ • Memento MediaWiki Extension: http://www.mediawiki.org/wiki/Extension:Memento • Memento Chrome Extension: http://bit.ly/memento-for-chrome • More details: http://ws-dl.blogspot.com/2014/04/2014-04-01- yesterdays-wiki-page-todays.html • Contact me: sjone@cs.odu.edu
  • 26. Sample URI-R (Step 1) HTTP Response HTTP/1.1 200 OK Date: Sun, 25 May 2014 21:39:02 GMT Server: Apache X-Content-Type-Options: nosniff Link: http://ws-dl-05.cs.odu.edu/demo/index.php/Daenerys_Targaryen; rel="original latest-version", http://ws-dl- 05.cs.odu.edu/demo/index.php/Special:TimeGate/Daenerys_Targaryen; rel="timegate", http://ws-dl- 05.cs.odu.edu/demo/index.php/Special:TimeMap/Daenerys_Targaryen; rel="timemap”; type="application/link-format” Content-language: en Vary: Accept-Encoding,Cookie Cache-Control: s-maxage=18000, must-revalidate, max-age=0 Last-Modified: Sat, 17 May 2014 16:48:28 GMT Connection: close Content-Type: text/html; charset=UTF-8
  • 27. Sample URI-G (Step 2) HTTP Response HTTP/1.1 302 Found Date: Sun, 25 May 2014 21:43:08 GMT Server: Apache X-Content-Type-Options: nosniff Vary: Accept-Encoding, Accept-Datetime Location: http://ws-dl- 05.cs.odu.edu/demo/index.php?title=Daenerys_Targaryen&oldid=1499 Link: <http://ws-dl- 05.cs.odu.edu/demo/index.php/Special:TimeMap/Daenerys_Targaryen>; rel="timemap”; type="application/link-format", <http://ws-dl-05.cs.odu.edu/demo/index.php/Daenerys_Targaryen>; rel="original latest-version” Connection: close Content-Type: text/html; charset=UTF-8
  • 28. Sample URI-M (Step 3) HTTP Response HTTP/1.1 200 OK Date: Sun, 25 May 2014 21:46:12 GMT Server: Apache X-Content-Type-Options: nosniff Memento-Datetime: Sun, 22 Apr 2007 15:01:20 GMT Link: <http://ws-dl-05.cs.odu.edu/demo/index.php/Daenerys_Targaryen>; rel="original latest-version”, <http://ws-dl- 05.cs.odu.edu/demo/index.php/Special:TimeGate/Daenerys_Targaryen>; rel="timegate”, <http://ws-dl- 05.cs.odu.edu/demo/index.php/Special:TimeMap/Daenerys_Targaryen>; rel="timemap”; type="application/link-format” Content-language: en Vary: Accept-Encoding,Cookie Expires: Thu, 01 Jan 1970 00:00:00 GMT Cache-Control: private, must-revalidate, max-age=0 Connection: close Content-Type: text/html; charset=UTF-8