On the Origins of Memes by Means of Fringe
Web Communities
Savvas Zannettou, Tristan Caulfield, Jeremy Blackburn, Emiliano De Cristofaro,
Michael Sirivianos, Gianluca Stringhini, Guillermo Suarez-Tangil
WARNING
IMAGERY IN THIS TALK IS
UNCENSORED AND MIGHT BE
OFFENSIVE
Memes are fun!
Not always though…
Hateful/Racist Memes
Memes in politics
Memes have become a popular, and seemingly effective, method to transmit ideology.
Memes have been weaponized
But what do we really know about memes?
• How can we track meme propagation across Web communities?
• How can we characterize variants of the same meme?
• Can we characterize Web communities through their memes?
• Can we measure the influence of Web communities with respect to memes they share?
Memes processing pipeline
[Figure: pipeline diagram]
Inputs:
• Meme Annotation Sites (annotated images): Know Your Meme, Generic Annotation Sites
• Web Communities posting Memes (images): 4chan, Twitter, Reddit, Gab, Generic Web Communities
Steps:
1. pHash Extraction
2. pHash-based Pairwise Distance Calculation
3. Clustering
4. Screenshot Classifier
5. Cluster Annotation
6. Association of Images to Clusters
7. Analysis and Influence Estimation
Intermediate data shown in the diagram:
• pHashes of some or all Web communities' images
• Pairwise Comparisons of pHashes
• Clusters of images
• pHashes of non-screenshot annotated images
• pHashes of annotated images
• Annotated Clusters
• pHashes (all Web Communities)
• Occurrences of Memes in all Web Communities
Let’s see our data sources…
Know Your Meme (KYM)
• Crowdsourced encyclopedia for Memes
• Provides useful metadata (e.g., origin, descriptive tags, description, examples, image galleries)
• Built custom crawler
• Obtained data for 15K KYM entries
• Downloaded every image for each entry (706K images)
Datasets
                         Twitter   Reddit   /pol/   Gab     KYM
# of posts               1.4B      1.0B     48M     12M     15K
# of posts with images   242M      62M      13M     955K    15K
# of images              114M      40M      4M      235K    706K
Perceptual hashing extraction
Perceptual hashing (pHash)
• Generates a hash for each image
• Visually similar images have minor differences in their hashes
• Reduces dimensionality of the images
• Ran the pHash algorithm on:
• All images from KYM (706K)
• All images from Twitter, Reddit, /pol/, and Gab (159.5M)
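As an illustration of the idea, here is a minimal sketch of one common DCT-based perceptual hash variant. It is not necessarily the implementation used in the talk. It assumes the image has already been converted to grayscale and resized to a small square grid (32x32 here) by an image library; the function names `dct_phash` and `hamming` and the random pixel grids are hypothetical stand-ins.

```python
import math
import random

def dct_phash(gray, hash_size=8):
    """Return a (hash_size * hash_size)-bit integer hash of a square grayscale grid."""
    n = len(gray)
    # DCT-II basis vectors for the lowest `hash_size` frequencies.
    basis = [[math.cos(math.pi * (2 * x + 1) * u / (2 * n)) for x in range(n)]
             for u in range(hash_size)]
    # Separable 2D DCT, keeping only the low-frequency block: rows first...
    rows = [[sum(row[x] * basis[v][x] for x in range(n)) for v in range(hash_size)]
            for row in gray]
    # ...then columns. coeffs[0] is the DC term (overall brightness).
    coeffs = [sum(rows[y][v] * basis[u][y] for y in range(n))
              for u in range(hash_size) for v in range(hash_size)]
    # Threshold every coefficient against the median of the non-DC terms.
    ac = sorted(coeffs[1:])
    median = ac[len(ac) // 2]
    bits = 0
    for c in coeffs:
        bits = (bits << 1) | int(c > median)
    return bits

def hamming(a, b):
    """Number of differing bits between two integer hashes."""
    return bin(a ^ b).count("1")

# Synthetic stand-ins for images (assumption: real input comes from an image library).
rng = random.Random(0)
image = [[rng.randrange(256) for _ in range(32)] for _ in range(32)]
brighter = [[p + 10 for p in row] for row in image]   # same picture, brighter
other = [[rng.randrange(256) for _ in range(32)] for _ in range(32)]

d_same = hamming(dct_phash(image), dct_phash(brighter))
d_other = hamming(dct_phash(image), dct_phash(other))
print("distance to brightened copy:", d_same)
print("distance to unrelated image:", d_other)
```

A uniform brightness change only moves the DC coefficient, so the brightened copy stays close to the original in Hamming distance, while an unrelated grid differs in roughly half of the bits.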
Creating clusters of images/memes
Pairwise comparisons and clustering
• Calculated the pairwise distances between all pHashes from /pol/, The_Donald, and Gab
• Distance metric: Hamming distance
• Used TensorFlow and GPUs to speed up the process
• Performed clustering using the DBSCAN algorithm
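The two steps above can be sketched on a toy scale as follows. This is a brute-force, pure-Python illustration of DBSCAN over Hamming distances, not the TensorFlow/GPU implementation mentioned in the talk. The `eps` and `min_samples` values and the sample hashes are hypothetical and chosen only for the example.

```python
def hamming(a, b):
    """Number of differing bits between two integer hashes."""
    return bin(a ^ b).count("1")

def dbscan(hashes, eps, min_samples):
    """Return one label per hash: a cluster id (0, 1, ...) or -1 for noise."""
    n = len(hashes)
    # Brute-force neighbourhoods; each point counts as its own neighbour.
    neighbors = [[j for j in range(n) if hamming(hashes[i], hashes[j]) <= eps]
                 for i in range(n)]
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbors[i]) < min_samples:
            labels[i] = -1            # noise, may become a border point later
            continue
        cluster += 1                  # i is a core point: start a new cluster
        labels[i] = cluster
        queue = list(neighbors[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster   # former noise becomes a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbors[j]) >= min_samples:
                queue.extend(neighbors[j])   # expand through core points only
    return labels

# Hypothetical hashes: two tight groups and one outlier.
hashes = [0x00, 0x01, 0x03, 0xF0, 0xF1, 0xFFFF00]
labels = dbscan(hashes, eps=2, min_samples=2)
print(labels)
```

Images that fall in no dense neighbourhood are labelled as noise (-1) rather than being forced into a cluster, which suits data where many images are one-off posts rather than meme variants.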
Example clusters
[Figure: example clusters for the Nut Button meme and the Goofy’s Time meme]
Annotating clusters
• Calculated the medoid of each cluster (the “representative” image in the cluster)
• Compared all medoids with all KYM images
• We have a hit if the Hamming distance is <= a predefined threshold
• Assigned the representative label according to the number of hits and the average distance between all hits
• Performed a small-scale evaluation of the annotations
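A minimal sketch of the annotation step, under stated assumptions: the slides say the label depends on the number of hits and the average distance of the hits, but not how the two are combined. The rule below (most hits wins, ties broken by lower average distance) is an assumption, and the entry names, hashes, and threshold are hypothetical.

```python
def hamming(a, b):
    """Number of differing bits between two integer hashes."""
    return bin(a ^ b).count("1")

def medoid(hashes):
    """The hash with the smallest total distance to all hashes in the cluster."""
    return min(hashes, key=lambda h: sum(hamming(h, other) for other in hashes))

def annotate(cluster_hashes, kym_gallery, threshold):
    """Return the KYM entry name that best matches the cluster medoid, or None.

    kym_gallery maps an entry name to the hashes of its gallery images.
    """
    m = medoid(cluster_hashes)
    candidates = []
    for entry, image_hashes in kym_gallery.items():
        hits = [d for d in (hamming(m, h) for h in image_hashes) if d <= threshold]
        if hits:
            # Assumed ranking: more hits first, then lower average distance.
            candidates.append((-len(hits), sum(hits) / len(hits), entry))
    return min(candidates)[2] if candidates else None

# Hypothetical data for illustration only.
cluster = [0b0000, 0b0001, 0b0011]
gallery = {
    "entry_a": [0b0001, 0b0101, 0b11111111],
    "entry_b": [0b0000, 0b11110000],
}
print(medoid(cluster))
print(annotate(cluster, gallery, threshold=1))
```

Comparing only the medoid against the annotation site keeps the number of comparisons proportional to the number of clusters rather than the number of images.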
Finding all memes and analyzing final dataset
Top memes per Web community
Studying specific groups of memes
• Focus on racist and political memes
• Use KYM tags to find relevant memes
• Political: “politics,” “2016 us presidential election,” “trump,” and “clinton” tags
• Racist: “racism,” “racist,” or “antisemitism” tags
• Obtain 117 racist memes and 556 political memes from KYM dataset
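The tag-based selection can be sketched as a simple filter. This assumes an entry qualifies if it carries at least one of the listed tags and that tags are compared case-insensitively; both are assumptions, and the entry names and the `select_entries` helper are hypothetical.

```python
POLITICAL_TAGS = {"politics", "2016 us presidential election", "trump", "clinton"}
RACIST_TAGS = {"racism", "racist", "antisemitism"}

def select_entries(entries, wanted_tags):
    """Return the sorted names of entries that carry at least one wanted tag.

    entries maps an entry name to its list of tags.
    """
    return sorted(name for name, tags in entries.items()
                  if wanted_tags & {tag.lower() for tag in tags})

# Hypothetical entries for illustration only.
entries = {
    "entry_1": ["Politics", "debate"],
    "entry_2": ["cats"],
    "entry_3": ["racism", "trump"],
}
political = select_entries(entries, POLITICAL_TAGS)
racist = select_entries(entries, RACIST_TAGS)
print(political, racist)
```

An entry can land in both groups when its tags overlap both lists, as `entry_3` does here.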
How are memes shared over time?
[Figure: two time-series panels, "Political Memes" and "Racist Memes". Annotations highlighted on the plots: 2nd US presidential debate; 2016 US elections; Gab activity increase in 2017; /pol/ constant share.]
How to quantify the influence?
• Hawkes processes
• Assume K processes
• Each with a rate of events (i.e., posting of a meme), called the background rate
• An event can cause impulse responses in other processes
• Increases the rates of other processes for a period of time
• Enables us to assess root cause of events
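To make the idea concrete, here is a minimal sketch of the rate of one process in a multivariate Hawkes model. The slides do not specify the shape of the impulse response; the exponential decay used here is an assumption, as are the parameter names (`background`, `weights`, `decay`) and all numeric values.

```python
import math

def intensity(k, t, events, background, weights, decay):
    """Rate of process k at time t.

    events     -- list of (time, process) pairs observed so far
    background -- background[k] is the background rate of process k
    weights    -- weights[j][k] is the expected number of extra events on k
                  caused by one event on j
    decay      -- rate of the (assumed) exponential impulse response
    """
    rate = background[k]
    for s, j in events:
        if s < t:
            # Each past event on j temporarily raises the rate of k.
            rate += weights[j][k] * decay * math.exp(-decay * (t - s))
    return rate

# Three processes A, B, C (indices 0, 1, 2); A excites B, and B excites C.
background = [0.1, 0.2, 0.05]
weights = [[0.0, 0.5, 0.0],
           [0.0, 0.0, 0.3],
           [0.0, 0.0, 0.0]]
events = [(1.0, 0)]                      # a single event on A at t = 1.0

before = intensity(1, 0.5, events, background, weights, decay=1.0)
after = intensity(1, 1.5, events, background, weights, decay=1.0)
unaffected = intensity(2, 1.5, events, background, weights, decay=1.0)
print(before, after, unaffected)
```

Before the event, B runs at its background rate; shortly after, its rate is raised and then decays back. C is untouched because A has no direct weight on it. Attributing each event to either the background rate or an earlier event is what allows the root cause to be estimated.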
Hawkes processes example
[Figure: timelines of three processes A, B, and C, each with its own background rate (Background Rate A, B, C), showing four events numbered 1 to 4.]
For our purposes…
• Hawkes model with 5 processes, one for each platform/community (/pol/, The_Donald, Reddit, Twitter, Gab)
• Distinct model for each cluster; fit each model with Gibbs sampling
• Calculate the influence and efficiency of each community
Communities’ influence (racist memes)
/pol/ is the most influential community in terms of spreading racist memes
Communities’ efficiency (racist memes)
If we look at the influence normalized by the number of memes posted, The_Donald is the most efficient in terms of disseminating memes
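The difference between influence and efficiency can be sketched as follows. This assumes a fitted model yields, for each pair of communities, an estimated count of events on the destination caused by the source. The `caused` table, the community names, and all numbers are hypothetical and are not results from the talk.

```python
def influence_and_efficiency(caused, posted):
    """Return {source: (external_influence, efficiency)}.

    caused[src][dst] -- estimated number of events on dst caused by src
    posted[src]      -- number of memes posted on src
    """
    result = {}
    for src in posted:
        # Influence: events this community caused on the other communities.
        external = sum(count for dst, count in caused[src].items() if dst != src)
        # Efficiency: the same influence, normalized by how much was posted.
        result[src] = (external, external / posted[src])
    return result

# Hypothetical numbers for illustration only.
posted = {"community_x": 1000, "community_y": 100}
caused = {
    "community_x": {"community_x": 0, "community_y": 50},
    "community_y": {"community_x": 30, "community_y": 0},
}
result = influence_and_efficiency(caused, posted)
print(result)
```

In this toy example `community_x` causes more events overall, but `community_y` causes more events per meme posted, so the two rankings differ.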
Summary
• Proposed meme processing pipeline
• Code and datasets available on GitHub (https://github.com/memespaper/memes_pipeline)
• Important differences between the memes posted on Web communities
• Quantified influence among Web communities
Acknowledgments
Project Legacy

On the Origins of Memes by Means of Fringe Web Communities

  • 1. On the Origins of Memes by Means of Fringe Web Communities Savvas Zannettou, Tristan Caulfield, Jeremy Blackburn, Emiliano De Cristofaro, Michael Sirivianos, Gianluca Stringhini, Guillermo Suarez-Tangil
  • 2. WARNING IMAGERY IN THIS TALK IS UNCENSORED AND MIGHT BE OFFENSIVE
  • 3.
  • 4.
  • 10. Memes in politics Memes have become a popular, and seemingly effective, method to transmit ideology. Memes have been weaponized
  • 11. But what do we really know about memes? • How can we track meme propagation across Web communities? • How can we characterize variants of the same meme? • Can we characterize Web communities through their memes? • Can we measure the influence of Web communities with respect to memes they share?
  • 12. Memes processing pipeline 3. Clustering 1. pHash Extraction 2. pHash-based Pairwise Distance Calculation pHashes of some or all Web communities' images Clusters of images 5. Cluster Annotation Pairwise Comparisons of pHashes annotated images 6. Association of Images to Clusters Annotated Clusters pHashes of  annotated images pHashes (all Web Communities) 7. Analysis and Influence Estimation Occurrences of Memes in all Web Communities 4. Screenshot Classifier annotated images pHashes of non-screenshot annotated images Know Your Meme Generic Annotation Sites Meme Annotation Sites Generic Web Communities 4chan Twitter Reddit Gab Web Communities posting Memes images
  • 13. Let’s see our data sources… 3. Clustering 1. pHash Extraction 2. pHash-based Pairwise Distance Calculation pHashes of some or all Web communities' images Clusters of images 5. Cluster Annotation Pairwise Comparisons of pHashes annotated images 6. Association of Images to Clusters Annotated Clusters pHashes of  annotated images pHashes (all Web Communities) 7. Analysis and Influence Estimation Occurrences of Memes in all Web Communities 4. Screenshot Classifier annotated images pHashes of non-screenshot annotated images Know Your Meme Generic Annotation Sites Meme Annotation Sites Generic Web Communities 4chan Twitter Reddit Gab Web Communities posting Memes images
  • 14. Know Your Meme (KYM) • Crowdsourced encyclopedia for Memes • Provides useful metadata • E.g., origin, descriptive tags, description, examples, image galleries • Built custom crawler • Obtained data for 15K KYM entries • Download every image per entry (706K)
  • 15. Datasets # of posts 1.4B 1.0B 48M 12M 15K # of posts with images 242M 62M 13M 955K 15K # of Images 114M 40M 4M 235K 706K
  • 16. Perceptual hashing extraction 3. Clustering 1. pHash Extraction 2. pHash-based Pairwise Distance Calculation pHashes of some or all Web communities' images Clusters of images 5. Cluster Annotation Pairwise Comparisons of pHashes annotated images 6. Association of Images to Clusters Annotated Clusters pHashes of  annotated images pHashes (all Web Communities) 7. Analysis and Influence Estimation Occurrences of Memes in all Web Communities 4. Screenshot Classifier annotated images pHashes of non-screenshot annotated images Know Your Meme Generic Annotation Sites Meme Annotation Sites Generic Web Communities 4chan Twitter Reddit Gab Web Communities posting Memes images
  • 17. Perceptual hashing (pHash) • Generates a hash for each image • Visually similar images have minor differences in their hashes • Reduces dimensionality of the images • Run the pHash algorithm for • All images from KYM (706K) • All images from Twitter, Reddit, /pol/, and Gab (159.5M)
  • 18. Creating clusters of images/memes 3. Clustering 1. pHash Extraction 2. pHash-based Pairwise Distance Calculation pHashes of some or all Web communities' images Clusters of images 5. Cluster Annotation Pairwise Comparisons of pHashes annotated images 6. Association of Images to Clusters Annotated Clusters pHashes of  annotated images pHashes (all Web Communities) 7. Analysis and Influence Estimation Occurrences of Memes in all Web Communities 4. Screenshot Classifier annotated images pHashes of non-screenshot annotated images Know Your Meme Generic Annotation Sites Meme Annotation Sites Generic Web Communities 4chan Twitter Reddit Gab Web Communities posting Memes images
  • 19. Pairwise comparisons and clustering • Calculated all pairwise comparisons between all pHashes from /pol/, The_Donald, and Gab • Used TensorFlow and GPUs to speed up the process • Distance metric: Hamming distance • Performed clustering using the DBSCAN algorithm
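The clustering step can be sketched as follows: a small, pure-Python DBSCAN over integer hashes with Hamming distance as the metric. This is an illustration only. The talk used TensorFlow on GPUs for the pairwise distances, and the `eps` and `min_samples` values below are hypothetical, not the parameters used by the authors.

```python
def hamming(a, b):
    """Number of differing bits between two integer hashes."""
    return bin(a ^ b).count("1")

def dbscan(hashes, eps, min_samples):
    """Label each hash with a cluster id (0, 1, ...) or -1 for noise."""
    n = len(hashes)
    # Neighbourhoods from all pairwise distances (a point is its own neighbour).
    neighbours = [[j for j in range(n) if hamming(hashes[i], hashes[j]) <= eps]
                  for i in range(n)]
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbours[i]) < min_samples:
            labels[i] = -1  # provisionally noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbours[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbours[j]) >= min_samples:
                queue.extend(neighbours[j])  # j is a core point: keep expanding
    return labels

# Two tight groups of 16-bit toy hashes plus one outlier.
toy_hashes = [0x0000, 0x0001, 0x0003, 0xFF00, 0xFF01, 0xFE00, 0x0FF0]
print(dbscan(toy_hashes, eps=2, min_samples=3))
```

The all-pairs neighbourhood computation is quadratic in the number of images, which is why the comparisons were offloaded to GPUs at the scale described in the talk.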
  • 20. Example clusters • Nut Button meme • Goofy’s Time meme
  • 21. Annotating clusters [Pipeline diagram repeated; see the "Memes processing pipeline" slide]
  • 22. Annotating clusters • Calculated the medoid of each cluster • “Representative” image in the cluster • Compared all medoids with all KYM images • We have a hit if the Hamming distance is <= a predefined threshold • Assign the representative label according to: • Number of hits • Average distance between all hits • Performed a small-scale evaluation of the annotations
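A minimal sketch of the annotation step, under stated assumptions: the medoid is taken as the cluster member with the smallest total Hamming distance to the other members, and the label is the KYM entry with the most hits, with ties broken by lower average distance. The exact ranking rule, the threshold value, and the entry names are assumptions made for illustration.

```python
def hamming(a, b):
    """Number of differing bits between two integer hashes."""
    return bin(a ^ b).count("1")

def medoid(cluster_hashes):
    """Member with the smallest total distance to all members of the cluster."""
    return min(cluster_hashes,
               key=lambda h: sum(hamming(h, other) for other in cluster_hashes))

def annotate(medoid_hash, kym_images, threshold):
    """kym_images is a list of (entry_name, phash) pairs.

    Returns the best-matching entry name, or None if nothing is within threshold.
    """
    hits = {}
    for entry, h in kym_images:
        d = hamming(medoid_hash, h)
        if d <= threshold:
            hits.setdefault(entry, []).append(d)
    if not hits:
        return None
    # Assumed ranking: more hits first, then lower average distance.
    return min(hits, key=lambda e: (-len(hits[e]), sum(hits[e]) / len(hits[e])))

# Toy data: one cluster and a handful of hypothetical KYM gallery hashes.
cluster = [0b0000, 0b0001, 0b0011]
kym = [("entry_a", 0b0001), ("entry_a", 0b0101),
       ("entry_b", 0b0000), ("entry_c", 0b111111)]

rep = medoid(cluster)
print(bin(rep), annotate(rep, kym, threshold=1))
```

Comparing only medoids against KYM keeps the number of comparisons proportional to the number of clusters rather than the number of images.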
  • 23. Finding all memes and analyzing final dataset [Pipeline diagram repeated; see the "Memes processing pipeline" slide]
  • 24. Top memes per Web community
  • 25. Studying specific groups of memes • Focus on racist and political memes • Use KYM tags to find relevant memes • “politics,” “2016 us presidential election,” “trump,” and “clinton” tags • “racism,” “racist,” or “antisemitism” tags • Obtain 117 racist memes and 556 political memes from KYM dataset
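The tag-based selection can be sketched as a simple set intersection. This assumes an entry qualifies when it carries at least one of the listed tags, and that KYM entries are available as a mapping from entry name to tags; both the data layout and the entry names below are hypothetical.

```python
POLITICAL_TAGS = {"politics", "2016 us presidential election", "trump", "clinton"}
RACIST_TAGS = {"racism", "racist", "antisemitism"}

def select_entries(entries, wanted_tags):
    """Return the names of entries that carry at least one wanted tag.

    entries maps an entry name to an iterable of its tags.
    """
    return sorted(name for name, tags in entries.items()
                  if wanted_tags & {t.lower() for t in tags})

# Hypothetical entries for illustration only.
toy_entries = {
    "entry_a": ["Politics", "debate"],
    "entry_b": ["cats", "reaction image"],
    "entry_c": ["racism", "trump"],
}

print(select_entries(toy_entries, POLITICAL_TAGS))
print(select_entries(toy_entries, RACIST_TAGS))
```

Under this rule an entry can land in both groups, as `entry_c` does here.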
  • 26. How are memes shared over time? [Plots: Political Memes, Racist Memes]
  • 27. How are memes shared over time? [Plots: Political Memes, Racist Memes; annotation: 2nd US presidential debate]
  • 28. How are memes shared over time? [Plots: Political Memes, Racist Memes; annotations: 2nd US presidential debate, 2016 US elections]
  • 29. How are memes shared over time? [Plots: Political Memes, Racist Memes; annotations: 2nd US presidential debate, 2016 US elections, Gab activity increase in 2017]
  • 30. How are memes shared over time? [Plots: Political Memes, Racist Memes; annotations: 2nd US presidential debate, 2016 US elections, Gab activity increase in 2017, /pol/ constant share]
  • 31. How are memes shared over time? [Plots: Political Memes, Racist Memes; annotations: 2nd US presidential debate, 2016 US elections, Gab activity increase in 2017 (marked twice), /pol/ constant share]
  • 32. How to quantify the influence? • Hawkes processes • Assume K processes • Each with a rate of events (i.e., posting of a meme), called the background rate • An event can cause impulse responses in other processes • Increases the rates of other processes for a period of time • Enables us to assess the root cause of events
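The rate of a multivariate Hawkes process can be sketched in a few lines. This is an illustration under assumptions not stated in the talk: a continuous-time model with an exponentially decaying impulse response, where `weights[src][dst]` is the expected number of extra events on `dst` triggered by one event on `src`, and `decay` controls how quickly the excitation fades. All parameter values are made up.

```python
import math

def intensity(t, k, events, background, weights, decay):
    """Event rate of process k at time t.

    events is a list of (time, source_process) pairs. Each past event adds an
    exponentially decaying impulse on top of the background rate of k.
    """
    rate = background[k]
    for t_i, src in events:
        if t_i < t:
            rate += weights[src][k] * decay * math.exp(-decay * (t - t_i))
    return rate

# Toy setup: an event on A temporarily raises the rate of B, but not of A.
background = {"A": 0.1, "B": 0.2}
weights = {"A": {"A": 0.0, "B": 0.5}, "B": {"A": 0.0, "B": 0.0}}
events = [(1.0, "A")]

for t in (0.5, 2.0, 5.0):
    print(t, intensity(t, "B", events, background, weights, decay=1.0))
```

Before the event, B runs at its background rate; afterwards its rate jumps and then decays back. Attributing an event either to the background rate or to an earlier event's impulse is what allows root causes to be assessed.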
  • 33. Hawkes processes example [Diagram: processes A, B, and C, each with its own background rate; events 1 to 4]
  • 34. Hawkes processes example [Diagram: processes A, B, and C, each with its own background rate; event 1]
  • 35. Hawkes processes example [Diagram: processes A, B, and C, each with its own background rate; event 1]
  • 36. Hawkes processes example [Diagram: processes A, B, and C, each with its own background rate; events 1 and 2]
  • 37. Hawkes processes example [Diagram: processes A, B, and C, each with its own background rate; events 1 and 2]
  • 38. Hawkes processes example [Diagram: processes A, B, and C, each with its own background rate; events 1 to 3]
  • 39. Hawkes processes example [Diagram: processes A, B, and C, each with its own background rate; events 1 to 3]
  • 40. Hawkes processes example [Diagram: processes A, B, and C, each with its own background rate; events 1 to 4]
  • 41. For our purposes… • Hawkes model with 5 processes • One for each platform/community (/pol/, The_Donald, Reddit, Twitter, Gab) • Distinct model for each cluster; fit each model with Gibbs sampling • Calculate the influence and efficiency of each community
  • 42. Communities’ influence (racist memes) /pol/ is most influential in terms of spreading racist memes
  • 43. Communities’ efficiency (racist memes) If we look at the influence normalized by the number of memes posted, The_Donald is the most efficient in terms of disseminating memes
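The normalization behind efficiency can be illustrated with a short sketch. It assumes the fitted models yield, for each source and destination community, an estimated number of events on the destination attributed to the source; dividing by the number of memes the source posted gives a per-post measure. The community names and counts below are made-up toy values, not results from the talk.

```python
def efficiency(caused, posted):
    """Influence per posted meme.

    caused[src][dst] is the estimated number of events on dst attributed to src;
    posted[src] is the number of meme events posted on src.
    """
    return {src: {dst: n / posted[src] for dst, n in targets.items()}
            for src, targets in caused.items()}

# Toy numbers: the large community causes more events in absolute terms,
# but the small community causes more events per meme it posts.
toy_caused = {"large": {"small": 50}, "small": {"large": 30}}
toy_posted = {"large": 1000, "small": 100}

print(efficiency(toy_caused, toy_posted))
```

This is how a community can rank first on raw influence while a smaller one ranks first on efficiency.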
  • 44. Summary • Proposed a meme processing pipeline • Code and datasets available on GitHub (https://github.com/memespaper/memes_pipeline) • Important differences between the memes posted on Web communities • Quantified influence among Web communities