This document describes a project that aims to develop a tool to automatically determine the authenticity of Twitter screenshots. The tool would extract information from screenshots using optical character recognition and date extraction techniques. It would then search the live web and web archives to try to find the original source of the screenshot's content in order to label it as either real or fake. The project seeks to address the problem of misinformation from fabricated screenshots by developing a technique to evaluate screenshot authenticity.
1. Extracting Information from Twitter Screenshots
Modeling and Simulation Student Capstone Conference 2023
Track: Data Science
Authors: Tarannum Zaki, Michael L. Nelson, and Michele C. Weigle
Presented by Tarannum Zaki
Department of Computer Science
Old Dominion University, Norfolk, Virginia
April 20, 2023
2. Screenshots are commonly used for information sharing
Tarannum Zaki MSVSCC 2023 Extracting Information from Twitter Screenshots @tarannum_zaki @WebSciDL 2
https://twitter.com/BetteMidler/status/1541472225341198338
https://twitter.com/MahyarTousi/status/1534307163073658881 https://twitter.com/urbanachievr/status/1505944201208516612
3. Why screenshots?
To increase cross-platform engagement
https://www.facebook.com/watchclassinsession/posts/pfbid0344Hu2bxJtAiiL5VHfM2YQyPTU9jTm3tfdJMj4TZMDunomMarXMQfTxPGvsVwfBmwl
https://twitter.com/RBReich/status/1560027191404072961
Interoperability between social media platforms is quite difficult.
4. Why screenshots?
To use as evidence of deleted posts
https://web.archive.org/web/20220525125749/https://twitter.com/DanielDefense/status/1526237750277681154
Controversial posts may be deleted.
https://twitter.com/ashtonpittman/status/1530243294868930560
5. Did they really post that?
https://twitter.com/Shayan86/status/1515753937139388418
https://twitter.com/paulthacker11/status/1495436489492090881
https://twitter.com/elonmusk/status/1544051155562598401
6. Creating fake tweets using Tweet Gen
https://www.tweetgen.com/
https://www.tweetgen.com/create/tweet.html
7. Using live web and web archives to validate screenshots
https://www.google.com/search
https://archive.org/web/
https://www.reuters.com/
https://www.snopes.com/
8. Verifying screenshots using Google Search
https://twitter.com/hannahgais/status/1526674114995527680
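A manual search like this could be scripted by turning the handle and tweet text read from a screenshot into a search query. The following is a minimal sketch that only builds the query URL. The function name `build_search_url`, the `max_words` cut-off, and the choice to combine a `site:` restriction with an exact-phrase match are assumptions made for illustration, not something described on the slide.

```python
from urllib.parse import urlencode

GOOGLE_SEARCH = "https://www.google.com/search"


def build_search_url(handle, tweet_text, max_words=12):
    """Build a Google Search URL for an exact-phrase lookup of a tweet.

    Only the first `max_words` words are quoted (an assumed heuristic):
    OCR errors later in a long tweet would make the full phrase unmatchable.
    """
    phrase = " ".join(tweet_text.split()[:max_words])
    user = handle.lstrip("@")
    query = f'site:twitter.com/{user} "{phrase}"'
    query_string = urlencode({"q": query})
    return f"{GOOGLE_SEARCH}?{query_string}"


# Hypothetical handle and text, standing in for values extracted from a screenshot.
url = build_search_url("@example_user", "hello   world")
print(url)
```

The function does not send any request. Fetching and interpreting the results is a separate step that this sketch leaves out.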
9. Verifying screenshots using FactCheck.org
https://www.factcheck.org/2022/07/fabricated-fourth-of-july-tweet-was-not-from-rep-marjorie-taylor-greene/
July 5, 2022
July 4, 2022
https://twitter.com/Imposter_Edits/status/1543960895965085696
10. Verifying screenshots using the Wayback Machine
https://web.archive.org/web/20220525125749/https://twitter.com/DanielDefense/status/1526237750277681154
May 16
May 27, 2022
https://twitter.com/ashtonpittman/status/1530243294868930560
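A Wayback Machine capture URL such as the one on this slide carries both the capture time and the archived tweet URL, so both can be recovered without any network access. The sketch below shows one way to split them. The function name `parse_wayback_url` and the regular expression are illustrative assumptions.

```python
import re
from datetime import datetime

# Wayback Machine capture URLs embed a 14-digit timestamp (YYYYMMDDhhmmss),
# optionally followed by a two-letter modifier such as "id_", and then the
# URL that was archived.
WAYBACK_CAPTURE = re.compile(
    r"^https?://web\.archive\.org/web/(?P<ts>\d{14})(?:[a-z]{2}_)?/(?P<original>.+)$"
)


def parse_wayback_url(capture_url):
    """Split a capture URL into (capture datetime, original URL)."""
    match = WAYBACK_CAPTURE.match(capture_url)
    if match is None:
        raise ValueError(f"not a Wayback Machine capture URL: {capture_url}")
    captured_at = datetime.strptime(match.group("ts"), "%Y%m%d%H%M%S")
    return captured_at, match.group("original")


captured_at, original = parse_wayback_url(
    "https://web.archive.org/web/20220525125749/"
    "https://twitter.com/DanielDefense/status/1526237750277681154"
)
print(captured_at.isoformat(), original)
```

For this capture, the timestamp in the URL corresponds to May 25, 2022, which falls between the two dates shown on the slide.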
11. Motivation
➢ Fake tweets can spread misinformation and disinformation.
➢ Fake tweets are easy to create using online tools.
➢ There are no tools currently available to evaluate the authenticity of screenshots.
12. Aim
To develop a tool that automatically estimates the probability that a
screenshot is fake, using the services of the live web and web archives.
14. Data Collection
Fields:
- Shared post’s URL
- Original post’s URL
- Category
- Reason
- Content category
- Structural features
- Post type
- Social media
- Search strategy
- Annotated images
- Screenshot
- Remarks
- Screenshot images shared on Twitter.
- 200 examples
- Examples include both real and fake screenshots
https://ws-dl.blogspot.com/2022/12/2022-12-12-disinformation-spread-on.html
https://twitter.com/rvawonk/status/1503227687917305863
https://twitter.com/RealCandaceO/status/1501576352587292673
Category: Real
Reason: Found in the live web
Content category: Politics
Post type: Tweet
Structural features: Single author, single post
Search strategy: Searched on Twitter interface
Social media: Twitter
Annotated image labels: Original post’s URL, Shared post’s URL, Screenshot
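The fields above could be held in a small record type so that each of the 200 examples has the same shape. The following is a sketch only: the class name `ScreenshotRecord`, the attribute names, and the types are assumptions. Which of the two tweet URLs on the slide is the shared post and which is the original is also inferred here rather than stated in the source.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ScreenshotRecord:
    """One labeled example in the dataset (hypothetical layout)."""
    shared_post_url: str
    original_post_url: Optional[str]  # None when no original post was found
    category: str                     # "Real" or "Fake"
    reason: str
    content_category: str
    structural_features: str
    post_type: str
    social_media: str
    search_strategy: str
    screenshot: str                   # assumed to be a path to the image file
    annotated_images: List[str] = field(default_factory=list)
    remarks: str = ""


# Values copied from the example on the slide; the URL-to-field mapping
# and the screenshot file name are assumptions.
record = ScreenshotRecord(
    shared_post_url="https://twitter.com/rvawonk/status/1503227687917305863",
    original_post_url="https://twitter.com/RealCandaceO/status/1501576352587292673",
    category="Real",
    reason="Found in the live web",
    content_category="Politics",
    structural_features="Single author, single post",
    post_type="Tweet",
    social_media="Twitter",
    search_strategy="Searched on Twitter interface",
    screenshot="screenshot_example.png",
)
print(record.category, "-", record.reason)
```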
15. Extracting information from screenshot examples
Twitter handle
Timestamp
Tweet text
https://twitter.com/RealCandaceO/status/1501576352587292673
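Of the three items listed, the Twitter handle is the most regular and can be picked out of text with a pattern match. The sketch below finds handles in OCR text and, for later comparison with a candidate original post, splits a tweet URL into its handle and status ID. The function names and regular expressions are illustrative assumptions, and the sample OCR text is made up.

```python
import re

# Twitter handles use letters, digits, and underscores, up to 15 characters.
# The lookarounds keep e-mail addresses and over-long strings from matching.
HANDLE = re.compile(r"(?<![A-Za-z0-9_])@([A-Za-z0-9_]{1,15})(?![A-Za-z0-9_])")
TWEET_URL = re.compile(
    r"^https?://(?:www\.)?twitter\.com/([A-Za-z0-9_]{1,15})/status/(\d+)"
)


def extract_handles(ocr_text):
    """Return every @handle found in the text, without the leading '@'."""
    return [m.group(1) for m in HANDLE.finditer(ocr_text)]


def parse_tweet_url(url):
    """Return (handle, status_id) for a tweet URL, or None if it is not one."""
    m = TWEET_URL.match(url)
    return (m.group(1), m.group(2)) if m else None


sample_text = "Jane Doe\n@jane_doe\nHello @friend, mail me at x@y.com"
print(extract_handles(sample_text))

handle, status_id = parse_tweet_url(
    "https://twitter.com/RealCandaceO/status/1501576352587292673"
)
print(handle, status_id)
```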
16. Applying OCR on screenshot examples: Single tweet images
Optical character recognition (OCR) extracts the text in a digital image.
Figure: example screenshot image and its OCR extracted output
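The slides do not name the OCR engine used. As one possible setup, the sketch below assumes the third-party `pytesseract` and `Pillow` packages, plus an installed Tesseract binary. The helper names `clean_ocr_text` and `ocr_screenshot` are hypothetical.

```python
def clean_ocr_text(raw_text):
    """Collapse runs of whitespace and drop empty lines from raw OCR output."""
    lines = (" ".join(line.split()) for line in raw_text.splitlines())
    return "\n".join(line for line in lines if line)


def ocr_screenshot(image_path):
    """Run Tesseract on a screenshot file and return the cleaned text."""
    # Imported lazily: both packages are third-party, and pytesseract also
    # needs the Tesseract binary to be installed on the system.
    from PIL import Image
    import pytesseract

    with Image.open(image_path) as image:
        return clean_ocr_text(pytesseract.image_to_string(image))


# The cleaning step can be tried without any OCR engine installed.
print(clean_ocr_text("  Jane   Doe \n\n@jane_doe\n\n  Hello   world  \n"))
```

Calling `ocr_screenshot("screenshot.png")` on a real image would return the same kind of cleaned, line-separated text for the later extraction steps.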
17. Method 1: Using the Python DateFinder module to extract timestamps
https://twitter.com/gaywonk/status/1540398670658654208
OCR extracted output
Example screenshot image
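A minimal sketch of this first approach is to pass the OCR output straight to `datefinder.find_dates`, which yields a `datetime` for each date-like string it recognizes. The wrapper name `find_candidate_dates` and its optional `finder` parameter are assumptions, added so the function can be exercised without the third-party `datefinder` package installed.

```python
from datetime import datetime
from typing import Callable, Iterable, List, Optional


def find_candidate_dates(
    ocr_text: str,
    finder: Optional[Callable[[str], Iterable[datetime]]] = None,
) -> List[datetime]:
    """Return every datetime the finder detects in the OCR text, in order."""
    if finder is None:
        # Imported lazily so the sketch loads without the package installed.
        import datefinder
        finder = datefinder.find_dates
    return list(finder(ocr_text))


# Stand-in finder used only to demonstrate the call shape; with datefinder
# installed, call find_candidate_dates(ocr_text) with no second argument.
def fake_finder(text):
    return iter([datetime(2022, 6, 24, 16, 49)])


print(find_candidate_dates("4:49 PM Jun 24, 2022", finder=fake_finder))
```

Because the text is scanned as a whole, other numbers in a screenshot, such as like or retweet counts, may also be returned as candidate dates.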
18. Method 2: Using the Python DateFinder module with an additional date format logic to extract timestamps
https://twitter.com/gaywonk/status/1540398670658654208
OCR extracted output
Example screenshot image
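The slide does not spell out what the additional date format logic is. One plausible form, shown here purely as an assumption, is a pattern for the timestamp line that Twitter prints under a tweet (for example `4:49 PM · Jun 24, 2022`), tried before or alongside the general-purpose date finder. The names `TWEET_TS` and `extract_tweet_timestamp` are hypothetical, and the sample strings are made up.

```python
import re
from datetime import datetime

# Assumed timestamp layout: "<h>:<mm> AM|PM <separator> <Mon> <d>, <yyyy>".
# OCR often misreads the middle dot, so any short run of characters that are
# not letters or digits is accepted as the separator.
TWEET_TS = re.compile(
    r"(?P<hour>\d{1,2}):(?P<minute>\d{2})\s*(?P<ampm>[AaPp][Mm])"
    r"[^A-Za-z0-9]{1,5}"
    r"(?P<month>[A-Z][a-z]{2})\s+(?P<day>\d{1,2}),\s*(?P<year>\d{4})"
)


def extract_tweet_timestamp(ocr_text):
    """Return the first tweet-style timestamp in the text, or None."""
    match = TWEET_TS.search(ocr_text)
    if match is None:
        return None
    # Rebuild a canonical string so strptime sees one fixed layout.
    canonical = "{hour}:{minute} {ampm} {month} {day} {year}".format(
        **match.groupdict()
    )
    try:
        return datetime.strptime(canonical, "%I:%M %p %b %d %Y")
    except ValueError:
        # Matched the shape but not a real date or time (e.g. "Xyz 45").
        return None


print(extract_tweet_timestamp("4:49 PM · Jun 24, 2022 · Twitter for iPhone"))
```

A format-specific rule like this ignores stray numbers elsewhere in the screenshot, which is one way the extra logic could improve on a general date finder.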
19. Method 2 performs better than Method 1
Method     Accuracy   Precision   Recall   F1-score
Method 1   41%        60%         39%      47%
Method 2   80%        74%         97%      89%
Evaluated on 125 single-tweet images from the collected dataset.
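For reference, the four measures reported above are conventionally computed from true/false positive and negative counts. The sketch below uses the standard definitions. The function name and the example counts are hypothetical and are not the study's data.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall, and F1 from confusion counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
metrics = classification_metrics(tp=8, fp=2, fn=2, tn=8)
for name, value in metrics.items():
    print(f"{name}: {value:.0%}")
```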
20. Summary
❏ Screenshots are an easy way to share content on social media.
❏ Because screenshots can be easily faked, detecting fabricated posts is a
critical task.
❏ Live web and web archive services could help validate the content of a
screenshot.
❏ Our research aims to help mitigate the spread of misinformation and
disinformation on social media.