Challenges in Replaying
Archived Twitter Pages
Published in Joint Conference on Digital Libraries (JCDL) 2021
Kritika Garg
Web Science & Digital Libraries Research Group
Department of Computer Science, Old Dominion University
@Kritika_garg @WebSciDL @oducs
Committee Members:
Michael L. Nelson (Advisor), Michele C. Weigle,
Sampath Jayarathna, Jian Wu, Vikas Ganjigunte Ashok
Challenges in Replaying Archived Twitter Pages | Kritika Garg <@kritika_garg> 2
https://doi.org/10.1109/JCDL52503.2021.00028
In 2020, Twitter changed its user interface.
We examined the challenges web archives faced in
preserving Twitter after the change.
The observations and results provided in this work are
accurate for the time of this study in 2021. Things may
have changed since Twitter's ownership shifted in 2022.
https://www.bbc.com/news/technology-63402338
Tweets and accounts on the live web may become unavailable
https://twitter.com/AOC/status/1364623055658635268 https://twitter.com/realDonaldTrump/
Archives allow us to access pages that no longer exist on live web
URI-R: https://twitter.com/AOC/status/1364623055658635268
URI-M: https://web.archive.org/web/20210224170823/https://twitter.com/AOC/status/1364623055658635268
Memento-Datetime: 20210224170823 (datetime of when memento was captured)
Archive banner providing details of the capture. For example, this capture is from February 24, 2021.
Web archives rehost the captured page (memento).
All the embeds and outlinked pages are also served from the web archive.
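The relationship between URI-M, Memento-Datetime, and URI-R shown above can be sketched in code. This is a minimal illustration, assuming the Wayback Machine URI-M layout seen in the example (archive prefix, 14-digit UTC timestamp, then the URI-R); the function name `split_uri_m` is hypothetical, and URI-Ms with replay modifiers after the timestamp are not handled.

```python
import re
from datetime import datetime, timezone

# Assumed layout: https://web.archive.org/web/<14-digit timestamp>/<URI-R>
URI_M_PATTERN = re.compile(r"^https?://web\.archive\.org/web/(\d{14})/(.+)$")


def split_uri_m(uri_m):
    """Return (memento_datetime, uri_r) for a Wayback Machine style URI-M."""
    match = URI_M_PATTERN.match(uri_m)
    if match is None:
        raise ValueError(f"not a recognized URI-M: {uri_m}")
    timestamp, uri_r = match.groups()
    # Wayback timestamps are expressed in UTC as YYYYMMDDhhmmss.
    captured = datetime.strptime(timestamp, "%Y%m%d%H%M%S")
    return captured.replace(tzinfo=timezone.utc), uri_r


uri_m = (
    "https://web.archive.org/web/20210224170823/"
    "https://twitter.com/AOC/status/1364623055658635268"
)
memento_datetime, uri_r = split_uri_m(uri_m)
print(memento_datetime.isoformat())
print(uri_r)
```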
Archives allow us to replay the past web of suspended accounts
2009
https://web.archive.org/web/20090702030955/https://twitter.com/realDonaldTrump
2013
https://web.archive.org/web/20130608234757/https://twitter.com/realDonaldTrump
https://web.archive.org/web/20170702084625/https://twitter.com/realDonaldTrump
https://web.archive.org/web/20230407025620/https://twitter.com/realDonaldTrump
2017
2020
Mementos (archived pages) allow us to replay the earlier pages of suspended or deleted Twitter accounts from when they were present on the live web.
Live web keeps changing, web archives must adjust to keep up
2009
https://web.archive.org/web/20090702030955/https://twitter.com/realDonaldTrump
2013
https://web.archive.org/web/20130608234757/https://twitter.com/realDonaldTrump
https://web.archive.org/web/20170702084625/https://twitter.com/realDonaldTrump
https://web.archive.org/web/20230407025620/https://twitter.com/realDonaldTrump
2017 (Old UI)
2020 (New UI)
Twitter user interface (UI) has undergone various changes.
The web archives were affected by the change in 2020 due to the vast structural differences between the old and new UI.
Tweets in old UI are embedded in HTML while
new UI requires separate JSON requests to populate content
Old UI: 20 tweets and Twitter bio are embedded in the root HTML.
New UI: the root HTML contains only a skeleton, and all page sections are served dynamically through API JSON responses. Content populated with follow-up XHR requests (https://api.twitter.com/2/timeline/profile/25073877.json?..)
Archiving new UI resulted in error or incomplete pages due to
Twitter’s API rate limiting
To archive the new UI, multiple calls
for JSON responses must be issued to
Twitter’s API.
Result: Error or incomplete pages
because of exceeding API rate limit
https://ws-dl.blogspot.com/2020/07/2020-07-15-twitter-was-already.html
Many web archives continued to archive the old UI
by pretending to be a “GoogleBot”
This technique no longer returns old Twitter UI (last observed on April 10, 2023)
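As a rough illustration of what "pretending to be a Googlebot" means at the HTTP level, the sketch below builds a request whose User-Agent header is the publicly documented desktop Googlebot string. It only constructs the request and does not send it; the helper name `build_googlebot_request` is hypothetical, and this is not a description of how any particular archive's crawler was configured. As noted above, the technique no longer returned the old UI as of April 10, 2023.

```python
from urllib.request import Request

# Publicly documented desktop Googlebot User-Agent string.
GOOGLEBOT_UA = (
    "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
)


def build_googlebot_request(uri_r):
    """Build (but do not send) a request for uri_r with a Googlebot User-Agent."""
    return Request(uri_r, headers={"User-Agent": GOOGLEBOT_UA})


request = build_googlebot_request("https://twitter.com/realDonaldTrump")
# urllib normalizes header names to the form "User-agent".
print(request.get_header("User-agent"))
```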
Mismatch in what we saw on live web and
how it replayed in the web archive
Archived (old user interface): missing Twitter's Fact-check warning
Live web (2020)
https://twitter.com/peterktodd/status/1325549199350435841
Many web archives had difficulty archiving the new UI, so they pretended to be “Googlebot” in order to archive the old UI.
Result: view a page on the live web, archive it, and replay it, and the two do not match.
https://web.archive.org/web/20200529145339/https://twitter.com/realDonaldTrump/status/1266231100780744704
Crucial data, like Twitter Labels, in new UI were not in old UI
Violated Twitter Rules (VTR) labels: placing a Tweet in violation (controversial content or behavior) behind a tombstone. No engagements!
https://twitter.com/realDonaldTrump/status/1313449844413992961
Fact-check labels: labeling a Tweet that may contain disputed or misleading information.
https://twitter.com/realDonaldTrump/status/1265255835124539392
https://help.twitter.com/en/rules-and-policies/notices-on-twitter
New UI mementos may replay pages that never existed on live web
71 Missing
Tweets
Aug 18, 2020, 05:52:23 UTC
https://ws-dl.blogspot.com/2020/11/2020-11-04-new-twitter-ui-replaying.html
Archives had difficulty in accurately preserving Twitter in 2020
Historians using web archives for a study of historically significant tweets
made in late 2020 might witness:
1. Mementos displaying the “Something went wrong” error
2. Mementos with different UI for the same URI-R
3. Mementos not displaying labels on disputed or controversial tweets
4. Mementos of Twitter account pages missing tweets
Using @realDonaldTrump to study the impact of Twitter UI
change on web archives
Timeline:
2020-05-01: Twitter stopped supporting its old UI
2021-01-08: Suspension of @realDonaldTrump (https://blog.twitter.com/en_us/topics/company/2020/suspension)
2022-11-19: Elon Musk brings Donald Trump back on Twitter (https://en.wikipedia.org/wiki/Acquisition_of_Twitter_by_Elon_Musk)
2020-05-01 to 2021-01-08: collected ~8 months of archived data of @realDonaldTrump to quantify the impact of the change
2021-01-08 to 2022-11-19: no content on live web for ~2 years as account was suspended
@realDonaldTrump is well archived
Internet Archive: http://web.archive.org/web/20200701000000*/https://twitter.com/realDonaldTrump
The Trump Archive: https://www.thetrumparchive.com/
Factbase: https://factba.se/trump/
Dedicated third-party archives were available for ground truth
Twitter’s account page vs. tweet page
Profile/Account Page (https://twitter.com/realDonaldTrump): The account page provides details specific to the account's owner, such as their brief description, following, followers, and the recent tweets they published or retweeted.
Tweet Page (https://twitter.com/realDonaldTrump/status/1347569870578266115): The tweet page displays a single tweet and its engagement, such as the number of likes, retweets, and replies to the tweet.
~1.3M mementos for 8.7K @realDonaldTrump’s tweets
from 7 web archives
We collected 8.7K of @realDonaldTrump’s tweets from the ~8 months of archived data across 7 web archives. We found 64K
mementos of the account page and 1.29M mementos of the 8.7K tweets.
Start: 2020-05-01
(Twitter stopped supporting its old UI)
End: 2021-01-08
(Trump’s account suspended)
Old Twitter UI is more prominent in web archives,
93% of 1.3M mementos were old UI
We separated the mementos into old UI and new UI. The graph shows the distribution of old UI and new UI for account page
mementos and tweet page mementos across each month from May 2020 until Jan 2021.
a) Account page mementos b) Tweet page mementos
Collected 476 labeled tweets of @realDonaldTrump:
450 Fact-check and 26 VTR
1. thetrumparchive.com: https://www.thetrumparchive.com/
2. Factba.se: https://factba.se/topic/flagged-tweets
3. TwitterLabels: https://github.com/oduwsdl/TwitterLabels
Chart axis: Number of Tweets (Fact-check, VTR)
Twitter added VTR label to old UI at least by August 26, 2020
https://ws-dl.blogspot.com/2020/12/2020-12-08-twitter-added-labels-on-its.html
1. The red dot shows when each tweet was created.
2. Before August 26, 2020 (dotted line 1), the mementos do not have labels (blue dot).
3. After September 9, 2020 (dotted line 2), we could see the labels in the mementos (green dot).
“Fact-check” label never existed in old Twitter UI
Fact-check label no longer exists on live web (new UI)
Live web: no Twitter Fact-check label.
Archived new UI: the new UI mementos can be used to see the labelled tweet.
https://web.archive.org/web/20221122044113/https://twitter.com/realDonaldTrump/status/1265255835124539392
At least 18% of 6.5K new UI mementos replayed the labels
Fact-check: at least 967 out of 5,994 (16%) new UI mementos were working and displayed the Fact-check label.
VTR: at least 213 out of 559 (38%) new UI mementos were working and displayed the VTR label.
Type of labels | Tweets | New UI mementos | Working mementos | Mementos with label
Fact-check     | 450    | 5,994           | 1,615            | 967
VTR            | 26     | 559             | 272              | 213
Total          | 476    | 6,553           | 1,887            | 1,180
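The percentages quoted above follow directly from the table. The short sketch below recomputes them as mementos with a label divided by all new UI mementos; the dictionary layout and the `label_rate` helper are illustrative names, not part of the study's code.

```python
# Counts copied from the table above.
rows = {
    "Fact-check": {"tweets": 450, "new_ui": 5994, "working": 1615, "with_label": 967},
    "VTR": {"tweets": 26, "new_ui": 559, "working": 272, "with_label": 213},
}


def label_rate(row):
    """Percentage of new UI mementos that replayed the label."""
    return 100 * row["with_label"] / row["new_ui"]


# The Total row is the column-wise sum of the two label types.
total = {
    key: sum(row[key] for row in rows.values())
    for key in ("tweets", "new_ui", "working", "with_label")
}

for name, row in {**rows, "Total": total}.items():
    print(f"{name}: {label_rate(row):.1f}% of {row['new_ui']} new UI mementos")
```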
Analyzed missing tweets in new UI mementos
Time delta (Δ) = Memento-Datetime of the archived JSON - Memento-Datetime of the root HTML
Root HTML (Aug 18, 2020, 05:52:23 UTC): http://web.archive.org/web/20200818055223/https://twitter.com/realdonaldtrump
Archived JSON (Tweets): http://web.archive.org/web/20200817004843/https://api.twitter.com/2/timeline/profile/25073877.json?..
Δ = -1 day 5 hrs 4 mins
71 Missing Tweets
Within this time delta (~1 day 5 hours), Trump tweeted 71 times, so this memento is temporally violative. This
phenomenon is referred to as a temporal violation.
http://web.archive.org/web/20200818055223/https://twitter.com/realdonaldtrump
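The time delta for this example can be recomputed from the two 14-digit timestamps embedded in the URI-Ms. The sketch below is a minimal illustration, assuming Δ is the Memento-Datetime of the archived JSON minus the Memento-Datetime of the root HTML (so a negative value means the JSON is older); the helper names are hypothetical.

```python
from datetime import datetime


def parse_timestamp(timestamp):
    """Parse a 14-digit Wayback timestamp (YYYYMMDDhhmmss)."""
    return datetime.strptime(timestamp, "%Y%m%d%H%M%S")


def time_delta(root_html_ts, json_ts):
    """Delta = Memento-Datetime of JSON minus Memento-Datetime of root HTML."""
    return parse_timestamp(json_ts) - parse_timestamp(root_html_ts)


def describe(delta):
    """Format a timedelta with an explicit sign, e.g. '-1d 5h 3m 40s'."""
    total = int(delta.total_seconds())
    sign = "-" if total < 0 else ""
    days, rem = divmod(abs(total), 86400)
    hours, rem = divmod(rem, 3600)
    minutes, seconds = divmod(rem, 60)
    return f"{sign}{days}d {hours}h {minutes}m {seconds}s"


delta = time_delta("20200818055223", "20200817004843")
print(describe(delta))  # -1d 5h 3m 40s
```

The slide rounds this value to -1 day 5 hrs 4 mins.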
Calculated time deltas for 1.8K new UI account page mementos
Time delta (Δ) = Memento-Datetime of the archived JSON - Memento-Datetime of the root HTML
Memento-Datetime of the root HTML: Aug 18, 2020, 05:52:23 UTC
Page components served by archived JSON: Bio, Tweets, You might like, What’s happening, Media timeline
Time deltas shown for the components: -1 day 5 hrs 4 mins (twice, including Tweets with 71 missing tweets), -1 day 5 hrs 19 mins (twice), -24 days 21 hrs 29 mins
Temporal spread for new UI account page mementos
We analyzed the maximum and minimum value of the time delta for 1.8K new UI mementos to obtain temporal spread
Components plotted: Tweets, Bio, Media timeline, You might like, What’s happening
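A small worked example may help here. The sketch below assumes the temporal spread of a memento is the difference between the maximum and minimum time delta across its JSON components, and applies that to the five component deltas listed for the Aug 18, 2020 example memento; the function name `temporal_spread` is illustrative.

```python
from datetime import timedelta

# Component time deltas from the Aug 18, 2020 example memento (all negative,
# i.e. every archived JSON response is older than the root HTML).
component_deltas = [
    -timedelta(days=1, hours=5, minutes=4),
    -timedelta(days=1, hours=5, minutes=19),
    -timedelta(days=1, hours=5, minutes=19),
    -timedelta(days=1, hours=5, minutes=4),
    -timedelta(days=24, hours=21, minutes=29),
]


def temporal_spread(deltas):
    """Assumed definition: largest delta minus smallest delta."""
    return max(deltas) - min(deltas)


spread = temporal_spread(component_deltas)
print(spread)  # 23 days, 16:25:00
```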
49% of 1.8K new UI mementos of @realDonaldTrump were
temporally violative
We looked at the number of missing (negative delta) or future (positive delta) tweets in each memento.
The linear relationship shows that as the time delta increases, the number of tweets the memento is off by also increases.
JSON from 6 days in the future -> the memento is off by more than 250 tweets
JSON from 4 days in the past -> the memento is missing ~130 tweets
Outliers: very high activity by @realDonaldTrump in a small time delta, e.g., 115 tweets in under 7.7 hours
This relationship only holds for highly active accounts. For accounts with less activity, the time delta would
have to be higher for a temporal violation to be apparent.
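The count of missing or future tweets can be sketched as follows: given the Memento-Datetimes of the root HTML and the archived JSON plus the creation times of the account's tweets (for example from a ground-truth archive), count the tweets posted between the two captures. The function name and the tweet times below are made up for illustration and are not data from the study.

```python
from datetime import datetime


def tweets_off_by(root_html_dt, json_dt, tweet_times):
    """Signed number of tweets posted between the two captures.

    Negative: JSON is older than the root HTML, so tweets are missing.
    Positive: JSON is newer than the root HTML, so future tweets appear.
    """
    start, end = sorted((root_html_dt, json_dt))
    count = sum(1 for created in tweet_times if start < created <= end)
    return -count if json_dt < root_html_dt else count


root_html_dt = datetime(2020, 8, 18, 5, 52, 23)
json_dt = datetime(2020, 8, 17, 0, 48, 43)

# Hypothetical tweet creation times, not real data.
tweet_times = [
    datetime(2020, 8, 16, 23, 0, 0),   # before both captures
    datetime(2020, 8, 17, 9, 15, 0),   # between the captures
    datetime(2020, 8, 17, 21, 30, 0),  # between the captures
    datetime(2020, 8, 18, 6, 10, 0),   # after both captures
]

off_by = tweets_off_by(root_html_dt, json_dt, tweet_times)
print(off_by, "temporally violative" if off_by != 0 else "coherent")
```

With a less active account, the same time delta could contain no tweets at all, in which case the count is zero and no violation is apparent.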
Conclusions
● Change in Twitter’s UI in 2020 brought new challenges for web
archives
● Old UI was more prominent (93.3% of 1.3M mementos) than new
UI mementos
● Missing labels in web archives:
○ No “Fact-check” label in old UI
○ VTR was added to old UI at least by August 26, 2020
○ 18% of 6.5K new UI mementos of 476 labeled tweets
replayed the label
● Missing tweets:
○ Temporal violation can occur with components (JSON
response) from either the past or future
○ 49% of 1.8K mementos were temporally violative
Github Repo: https://github.com/oduwsdl/TwitterLabels
What’s happening now?
● Twitter no longer provides its old UI for
“Googlebot”
● Web archives are archiving new UI
● Mementos from late 2020 & 2021 contain both old
and new Twitter UI
● Fact-check label no longer exists on live web
● VTR label still exists on live web.

  • 2. Challenges in Replaying Archived Twitter Pages | Kritika Garg <@kritika_garg> 2 https://doi.org/10.1109/JCDL52503.2021.00028 In 2020, Twitter changed its user interface. We examined the challenges web archives faced in preserving Twitter after the change. The observations and results provided in this work are accurate for the time of this study in 2021. Things may have changed since Twitter's ownership shifted in 2022. https://www.bbc.com/news/technology-63402338
  • 3. Tweets and accounts on the live web may become unavailable https://twitter.com/AOC/status/1364623055658635268 https://twitter.com/realDonaldTrump/
  • 4. Archives allow us to access pages that no longer exist on the live web. URI-R: https://twitter.com/AOC/status/1364623055658635268 URI-M: https://web.archive.org/web/20210224170823/https://twitter.com/AOC/status/1364623055658635268 Memento-Datetime: 20210224170823 (the datetime when the memento was captured). The archive banner provides details of the capture; for example, this capture is from February 24, 2021. Web archives rehost the captured page (memento). All the embeds and outlinked pages are also served from the web archive.
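The relationship between URI-M, URI-R, and Memento-Datetime on this slide can be sketched in a few lines of Python. This is a minimal illustration, assuming the Wayback-style layout shown above (archive prefix, 14-digit UTC timestamp, then the URI-R); the function name `split_uri_m` and the regular expression are hypothetical helpers, not part of any archive's API.

```python
import re
from datetime import datetime, timezone

# Assumed Wayback-style layout: <archive prefix>/web/<14-digit timestamp>/<URI-R>
URI_M_PATTERN = re.compile(r"^(?P<prefix>https?://[^/]+/web)/(?P<ts>\d{14})/(?P<uri_r>.+)$")

def split_uri_m(uri_m):
    """Return (memento_datetime, uri_r) for a Wayback-style URI-M."""
    match = URI_M_PATTERN.match(uri_m)
    if match is None:
        raise ValueError(f"not a Wayback-style URI-M: {uri_m}")
    # The 14 digits encode YYYYMMDDhhmmss in UTC.
    captured = datetime.strptime(match.group("ts"), "%Y%m%d%H%M%S")
    return captured.replace(tzinfo=timezone.utc), match.group("uri_r")

uri_m = "https://web.archive.org/web/20210224170823/https://twitter.com/AOC/status/1364623055658635268"
memento_datetime, uri_r = split_uri_m(uri_m)
print(memento_datetime.isoformat())  # 2021-02-24T17:08:23+00:00
print(uri_r)
```

Other archives may lay out their URI-Ms differently, so the pattern would need adjusting for them.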
  • 5. Archives allow us to replay the past web of suspended accounts. 2009: https://web.archive.org/web/20090702030955/https://twitter.com/realDonaldTrump 2013: https://web.archive.org/web/20130608234757/https://twitter.com/realDonaldTrump 2017: https://web.archive.org/web/20170702084625/https://twitter.com/realDonaldTrump 2020: https://web.archive.org/web/20230407025620/https://twitter.com/realDonaldTrump Mementos (archived pages) allow us to replay the earlier pages of suspended or deleted Twitter accounts from when they were present on the live web.
  • 6. The live web keeps changing, and web archives must adjust to keep up. 2009: https://web.archive.org/web/20090702030955/https://twitter.com/realDonaldTrump 2013: https://web.archive.org/web/20130608234757/https://twitter.com/realDonaldTrump 2017 (Old UI): https://web.archive.org/web/20170702084625/https://twitter.com/realDonaldTrump 2020 (New UI): https://web.archive.org/web/20230407025620/https://twitter.com/realDonaldTrump The Twitter user interface (UI) has undergone various changes. The web archives were affected by the change in 2020 because of the vast structural differences between the old and new UI.
  • 7. Tweets in the old UI are embedded in the HTML, while the new UI requires separate JSON requests to populate content. Old UI: 20 tweets and the Twitter bio are embedded in the root HTML. New UI: the root HTML contains only a skeleton, and all page sections are served dynamically through API JSON responses; content is populated with follow-up XHR requests (https://api.twitter.com/2/timeline/profile/25073877.json?..)
  • 8. Archiving the new UI resulted in error or incomplete pages due to Twitter’s API rate limiting. To archive the new UI, multiple calls for JSON responses must be issued to Twitter’s API. Result: error or incomplete pages because the API rate limit was exceeded. https://ws-dl.blogspot.com/2020/07/2020-07-15-twitter-was-already.html
  • 9. Many web archives continued to archive the old UI by pretending to be “Googlebot”. This technique no longer returns the old Twitter UI (last observed on April 10, 2023).
  • 10. Mismatch between what we saw on the live web and how it replayed in the web archive. Live web (2020) vs. archived (old user interface): the archived page is missing Twitter's Fact-check warning. https://twitter.com/peterktodd/status/1325549199350435841 Many web archives had difficulty archiving the new UI, so they pretended to be “Googlebot” in order to archive the old UI. Result: a page viewed on the live web, then archived and replayed, does not match. https://web.archive.org/web/20200529145339/https://twitter.com/realDonaldTrump/status/1266231100780744704
  • 11. Crucial data in the new UI, like Twitter labels, were not in the old UI. Violated Twitter Rules (VTR) labels: placing a Tweet in violation (controversial content or behavior) behind a tombstone; no engagements. https://twitter.com/realDonaldTrump/status/1313449844413992961 Fact-check labels: labeling a Tweet that may contain disputed or misleading information. https://twitter.com/realDonaldTrump/status/1265255835124539392 https://help.twitter.com/en/rules-and-policies/notices-on-twitter
  • 12. New UI mementos may replay pages that never existed on the live web. Aug 18, 2020, 05:52:23 UTC https://ws-dl.blogspot.com/2020/11/2020-11-04-new-twitter-ui-replaying.html
  • 13. New UI mementos may replay pages that never existed on the live web. 71 missing tweets. Aug 18, 2020, 05:52:23 UTC https://ws-dl.blogspot.com/2020/11/2020-11-04-new-twitter-ui-replaying.html
  • 14. Archives had difficulty accurately preserving Twitter in 2020. Historians using web archives to study historically significant tweets made in late 2020 might witness: 1. Mementos displaying the “Something went wrong” error 2. Mementos with different UIs for the same URI-R 3. Mementos not displaying labels on disputed or controversial tweets 4. Mementos of Twitter account pages missing tweets
  • 15. Using @realDonaldTrump to study the impact of the Twitter UI change on web archives. Timeline: 2020-05-01, Twitter stopped supporting its old UI; 2021-01-08, suspension of @realDonaldTrump (https://blog.twitter.com/en_us/topics/company/2020/suspension); 2022-11-19, Elon Musk brings Donald Trump back on Twitter (https://en.wikipedia.org/wiki/Acquisition_of_Twitter_by_Elon_Musk). No content on the live web for ~2 years while the account was suspended. Collected ~8 months of archived data of @realDonaldTrump to quantify the impact of the change.
  • 16. @realDonaldTrump is well archived. Dedicated third-party archives were available for ground truth. Internet Archive: http://web.archive.org/web/20200701000000*/https://twitter.com/realDonaldTrump The Trump Archive: https://www.thetrumparchive.com/ Factbase: https://factba.se/trump/
  • 17. Twitter’s account page vs. tweet page. Profile/Account page (https://twitter.com/realDonaldTrump): the account page provides details specific to the account's owner, such as their brief description, following, followers, and the recent tweets they published or retweeted. Tweet page (https://twitter.com/realDonaldTrump/status/1347569870578266115): the tweet page displays a single tweet and its engagement, such as the number of likes, retweets, and replies to the tweet.
  • 18. ~1.3M mementos for 8.7K of @realDonaldTrump’s tweets from 7 web archives. We collected 8.7K of @realDonaldTrump’s tweets from the ~8 months of archived data from 7 web archives. We found 64K mementos of the account page and 1.29M mementos for the 8.7K tweets. Start: 2020-05-01 (Twitter stopped supporting its old UI) End: 2021-01-08 (Trump’s account suspended)
  • 19. ~1.3M mementos for 8.7K of @realDonaldTrump’s tweets from 7 web archives. We collected 8.7K of @realDonaldTrump’s tweets from the ~8 months of archived data from 7 web archives. We found 64K mementos of the account page and 1.29M mementos for the 8.7K tweets. Start: 2020-05-01 (Twitter stopped supporting its old UI) End: 2021-01-08 (Trump’s account suspended)
  • 20. The old Twitter UI is more prominent in web archives: 93% of 1.3M mementos were old UI. We separated the mementos into old UI and new UI. The graph shows the distribution of old UI and new UI for account page mementos and tweet page mementos for each month from May 2020 until Jan 2021. a) Account page mementos b) Tweet page mementos
  • 21. Collected 476 labeled tweets of @realDonaldTrump: 450 Fact-check and 26 VTR. Sources: 1. thetrumparchive.com: https://www.thetrumparchive.com/ 2. Factba.se: https://factba.se/topic/flagged-tweets 3. TwitterLabels: https://github.com/oduwsdl/TwitterLabels [Table: number of tweets (Fact-check, VTR) per source]
  • 22. Twitter added the VTR label to the old UI at least by August 26, 2020. https://ws-dl.blogspot.com/2020/12/2020-12-08-twitter-added-labels-on-its.html 1. The red dot shows when each tweet was created. 2. Before August 26, 2020 (dotted line 1), the mementos do not have labels (blue dot). 3. After September 9, 2020 (dotted line 2), we could see the labels in the mementos (green dot).
  • 23. The “Fact-check” label never existed in the old Twitter UI
  • 24. The new UI mementos can be used to see the labeled tweet. Archived new UI: https://web.archive.org/web/20221122044113/https://twitter.com/realDonaldTrump/status/1265255835124539392 Live web: no Twitter Fact-check label. The Fact-check label no longer exists on the live web (new UI).
  • 25. At least 18% of 6.5K new UI mementos replayed the labels. Fact-check: at least 967 out of 5,994 (16%) new UI mementos were working and displayed the Fact-check label. VTR: at least 213 out of 559 (38%) new UI mementos were working and displayed the VTR label.
        Type of label | Tweets | New UI mementos | Working mementos | Mementos with label
        Fact-check    | 450    | 5,994           | 1,615            | 967
        VTR           | 26     | 559             | 272              | 213
        Total         | 476    | 6,553           | 1,887            | 1,180
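The percentages on this slide follow directly from the table. As a quick arithmetic check, the sketch below recomputes them from the counts copied from the slide; the names `counts` and `label_rate` are illustrative only.

```python
# Counts copied from the slide's table:
# (new UI mementos, working mementos, mementos with label)
counts = {
    "Fact-check": (5994, 1615, 967),
    "VTR": (559, 272, 213),
}

def label_rate(with_label, new_ui_mementos):
    """Share of new UI mementos that replayed the label, as a whole percent."""
    return round(100 * with_label / new_ui_mementos)

total_new_ui = sum(row[0] for row in counts.values())
total_with_label = sum(row[2] for row in counts.values())

rates = {name: label_rate(row[2], row[0]) for name, row in counts.items()}
rates["Total"] = label_rate(total_with_label, total_new_ui)
print(rates)  # {'Fact-check': 16, 'VTR': 38, 'Total': 18}
```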
  • 26. Analyzed missing tweets in new UI mementos. Time delta (Δ) = Memento-Datetime of the archived JSON − Memento-Datetime of the root HTML. Example: root HTML captured Aug 18, 2020, 05:52:23 UTC (http://web.archive.org/web/20200818055223/https://twitter.com/realdonaldtrump); Tweets JSON captured at http://web.archive.org/web/20200817004843/https://api.twitter.com/2/timeline/profile/25073877.json?.. Δ = -1 day 5 hrs 4 mins, with 71 missing tweets.
  • 27. Analyzed missing tweets in new UI mementos. Time delta (Δ) = Memento-Datetime of the archived JSON − Memento-Datetime of the root HTML. Example: root HTML captured Aug 18, 2020, 05:52:23 UTC (http://web.archive.org/web/20200818055223/https://twitter.com/realdonaldtrump); Tweets JSON captured at http://web.archive.org/web/20200817004843/https://api.twitter.com/2/timeline/profile/25073877.json?.. Δ = -1 day 5 hrs 4 mins, with 71 missing tweets. Because Trump tweeted 71 times within this time delta, this memento is temporally violative. This phenomenon is referred to as a temporal violation.
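The time delta for the example memento can be reproduced from the two 14-digit timestamps in the URI-Ms above. The following is a minimal sketch, not the study's own tooling; `parse_timestamp` and `describe` are hypothetical helper names.

```python
from datetime import datetime

def parse_timestamp(timestamp):
    """Parse a 14-digit Wayback-style timestamp (YYYYMMDDhhmmss)."""
    return datetime.strptime(timestamp, "%Y%m%d%H%M%S")

def describe(delta):
    """Format a timedelta as signed days, hours, and minutes (rounded to the minute)."""
    sign = "-" if delta.total_seconds() < 0 else "+"
    total_minutes = round(abs(delta.total_seconds()) / 60)
    days, remainder = divmod(total_minutes, 24 * 60)
    hours, minutes = divmod(remainder, 60)
    return f"{sign}{days} day(s) {hours} hrs {minutes} mins"

root_html = parse_timestamp("20200818055223")      # root HTML memento
archived_json = parse_timestamp("20200817004843")  # archived timeline JSON

# Delta = Memento-Datetime of the archived JSON minus that of the root HTML.
delta = archived_json - root_html
print(describe(delta))  # -1 day(s) 5 hrs 4 mins
```

A negative delta means the JSON was captured before the root HTML, so tweets posted in between are missing from the replayed page.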
  • 28. Calculated time deltas for 1.8K new UI account page mementos. Time delta (Δ) = Memento-Datetime of the archived JSON − Memento-Datetime of the root HTML. Example memento from Aug 18, 2020, 05:52:23 UTC, with page sections Bio, Tweets, You might like, What’s happening, and Media timeline. Time deltas shown on the slide: -1 day 5 hrs 4 mins; -1 day 5 hrs 19 mins; -1 day 5 hrs 19 mins; -1 day 5 hrs 4 mins (Tweets, 71 missing tweets); -24 days 21 hrs 29 mins.
  • 29. Temporal spread for new UI account page mementos. We analyzed the maximum and minimum values of the time delta for 1.8K new UI mementos to obtain the temporal spread. Page sections: Tweets, Bio, Media timeline, You might like, What’s happening.
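One way to read "temporal spread" is as the gap between the largest and smallest section time deltas of a single memento. The sketch below makes that assumption explicit (the slides do not give a formula) and uses the five deltas listed for the Aug 18, 2020 example memento on the previous slide; it treats them as an unordered list, since the slide text does not say which delta belongs to which section.

```python
from datetime import timedelta

# Section time deltas from the example memento (order and section mapping unknown).
section_deltas = [
    -timedelta(days=1, hours=5, minutes=4),
    -timedelta(days=1, hours=5, minutes=19),
    -timedelta(days=1, hours=5, minutes=19),
    -timedelta(days=1, hours=5, minutes=4),
    -timedelta(days=24, hours=21, minutes=29),
]

# Assumption: temporal spread = maximum delta minus minimum delta.
temporal_spread = max(section_deltas) - min(section_deltas)
print(temporal_spread)  # 23 days, 16:25:00
```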
  • 30. 49% of 1.8K new UI mementos of @realDonaldTrump were temporally violative. We looked at the number of missing (negative delta) or future (positive delta) tweets in each memento. The linear relationship shows that as the time delta increases, the number of tweets the memento is off by also increases. JSON from 6 days in the future -> the memento is off by more than 250 tweets. JSON from 4 days in the past -> the memento is missing ~130 tweets.
  • 31. 49% of 1.8K new UI mementos of @realDonaldTrump were temporally violative. We looked at the number of missing (negative delta) or future (positive delta) tweets in each memento. The linear relationship shows that as the time delta increases, the number of tweets the memento is off by also increases. Outliers: very high activity by @realDonaldTrump within a small time delta, e.g., 115 tweets in under 7.7 hours. This relationship only holds for highly active accounts. For accounts with less activity, the time delta would have to be larger for a temporal violation to be apparent.
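Counting how many tweets a memento is off by amounts to counting the tweets posted between the two capture times. A minimal sketch under stated assumptions: the function `tweets_off_by` is hypothetical, the tweet times below are made up for illustration (a real analysis would use a ground-truth timeline), and a memento is treated as temporally violative when the count is nonzero.

```python
from datetime import datetime

def tweets_off_by(root_html_dt, json_dt, tweet_times):
    """Signed count of tweets posted between the two capture times.

    Negative: the JSON is older than the root HTML, so these tweets are missing.
    Positive: the JSON is newer, so these tweets come from the memento's future.
    """
    start, end = sorted((root_html_dt, json_dt))
    count = sum(1 for posted in tweet_times if start < posted <= end)
    return -count if json_dt < root_html_dt else count

root_html_dt = datetime(2020, 8, 18, 5, 52, 23)
json_dt = datetime(2020, 8, 17, 0, 48, 43)

# Hypothetical tweet times for illustration only, not the real timeline.
tweet_times = [
    datetime(2020, 8, 16, 23, 0, 0),   # before both captures
    datetime(2020, 8, 17, 9, 30, 0),   # between the captures
    datetime(2020, 8, 17, 21, 15, 0),  # between the captures
    datetime(2020, 8, 18, 2, 0, 0),    # between the captures
    datetime(2020, 8, 18, 8, 0, 0),    # after both captures
]

off_by = tweets_off_by(root_html_dt, json_dt, tweet_times)
is_temporally_violative = off_by != 0  # assumption: any nonzero count is a violation
print(off_by, is_temporally_violative)  # -3 True
```

With a sparse timeline the same time delta can yield a count of zero, which matches the slide's point that less active accounts need a larger delta before a violation becomes apparent.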
  • 32. Conclusions ● Change in Twitter’s UI in 2020 brought new challenges for web archives ● Old UI was more prominent (93.3% of 1.3M mementos) than new UI mementos ● Missing labels in web archives: ○ No “Fact-check” label in old UI ○ VTR was added to old UI at least by August 26, 2020 ○ 18% of 6.5K new UI mementos of 476 labeled tweets replayed the label ● Missing tweets: ○ Temporal violation can occur with components (JSON response) from either the past or future ○ 49% of 1.8K mementos were temporally violative GitHub repo: https://github.com/oduwsdl/TwitterLabels
  • 33. What’s happening now? ● Twitter no longer provides its old UI for “Googlebot” ● Web archives are archiving the new UI ● Mementos from late 2020 and 2021 contain both old and new Twitter UI ● The Fact-check label no longer exists on the live web ● The VTR label still exists on the live web