Meme Generation
for Social Media
Audience Engagement
Author: Andrew Kurochkin (andrewkurochkin.com)
Supervisor: Kostiantyn Bokhan, PhD
Agenda
1. Introduction
2. Research Objectives
3. Approach
4. Dataset
5. Solution
6. Evaluation
7. Summary
Introduction
Definition #1
Meme - a unit of cultural information spread by imitation.
1976, The Selfish Gene, Richard Dawkins.
Image macro - an image superimposed with text.
The image macro is one of the most common forms of internet meme.
* In this talk, "image macro" and "meme" are used interchangeably.
An image macro consists of:
1. Top and bottom captions (font: Impact).
2. Background image.
Definition #2
Meme template - the background image that is common across many
meme instances.
Each template has its own cultural context.
Definition #3
Engage [verb] - occupy, attract, or involve someone's
interest or attention.
Four levels of engagement [2019, Kholoud Khalil Aldous et al.]:
1. View
2. Like
3. Comment, share
4. External post
Motivation
3.5 billion people used social media in 2019
People spend 2h 20m on social media every day
~80% of marketers use visual assets in their social media marketing
~30% of marketers say visual images are the most important form of content
for their business
~70% of marketers state that creating more engaging content is the most
important task
Memes can be used to manipulate public opinion, for example in political or
ideological propaganda
Research Objectives
1. To investigate whether a current state-of-the-art
neural network can produce memes that engage an
audience.
2. To create a dataset of image macros,
meta-information, and related comments.
3. To create a pipeline for evaluating the degree of
engagement that image macros induce in the
audience.
Approach
A. Get a current newsworthy occasion (event)
B. Embed the event as a vector
C. Choose a meme template based on the event vector
D. Generate image macro captions based on the meme
template context and the small event vector
E. Combine the results from C and D
(A) Get a current newsworthy occasion
(B) Embed event as a vector
(C) Meme template selection
(D) Generate meme captions and post title
GPT-2 model [2019, OpenAI, Alec Radford et al.]
1. Trained on a dataset of 8 million web pages (40 GB of Internet text) collected from
links shared on Reddit.
2. GPT-2 is based on a multi-layer Transformer decoder, which provides structured
memory for handling long-term dependencies in text.
3. GPT-2 beat state-of-the-art solutions on various language modeling tasks without fine-tuning.
We used the smallest GPT-2 model (117M), which has 117 million parameters.
(E) Combine the results from C and D
Engaging content creation pipeline
Dataset
Dataset requirements
1. Post title
2. Post comments
3. Submitted meme
4. Meme template
5. Meme captions (top and bottom)
6. Collected engagement (score / number of likes)
Dataset collection
1. Download Reddit submissions and comments for the
subreddit AdviceAnimals.
2. Download images from the submissions where possible.
3. Recognize the meme template [2018, Savvas Zannettou et al.]:
a. Embed each image as a 64-element vector based on
its perceptual hash (pHash)
b. Calculate Hamming distances between pHashes
c. Cluster images using DBSCAN
4. Apply optical character recognition (OCR) to extract the top and bottom
captions from the meme image (Azure)
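Steps 3a-3c can be illustrated with a short sketch. It is not the pipeline used in this work: it assumes each pHash is already available as a 64-bit integer, the function names `hamming` and `dbscan` and the parameter values `eps=2` and `min_samples=2` are hypothetical choices for the toy data, and the small pure-Python DBSCAN stands in for a library implementation.

```python
from typing import List, Optional


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two 64-bit perceptual hashes."""
    return bin(a ^ b).count("1")


def dbscan(hashes: List[int], eps: int, min_samples: int) -> List[int]:
    """Cluster hashes with DBSCAN under Hamming distance.

    Returns one label per hash: 0, 1, ... for clusters and -1 for noise.
    A point counts itself among its neighbours.
    """
    n = len(hashes)
    labels: List[Optional[int]] = [None] * n  # None means "not visited yet"

    def neighbours(i: int) -> List[int]:
        return [j for j in range(n) if hamming(hashes[i], hashes[j]) <= eps]

    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_samples:
            labels[i] = -1  # provisional noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point: border point
                continue
            if labels[j] is not None:
                continue  # already assigned to a cluster
            labels[j] = cluster
            reachable = neighbours(j)
            if len(reachable) >= min_samples:
                queue.extend(reachable)  # j is a core point, keep expanding
    return [label if label is not None else -1 for label in labels]


# Toy hashes (assumed values): two near-duplicate groups and one outlier.
hashes = [
    0x0000000000000000,
    0x0000000000000001,
    0x0000000000000003,
    0xFFFFFFFF00000000,
    0xFFFFFFFF00000001,
    0x00FF00FF00FF0000,
]
labels = dbscan(hashes, eps=2, min_samples=2)
print(labels)
```

Each resulting cluster would correspond to one candidate meme template, and images labelled -1 would be left without a template.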
Volumes of engagement in dataset
Final dataset size
650K memes
350K with comments (keywords)
Solution
1. Meme template selection
Data preparation
Class balancing: meme templates before resampling (imbalanced) and after resampling (balanced)
Model selection
2. Meme generation
Data preparation
Input data (fields in order): start token, template id, 5 keywords, submission title, top caption (lowercased), bottom caption (lowercased), end token.
<|startoftext|>~`
001~^
photographer catdog mediocre said maybe~@
As a photography student...~}
just because you own a camera~{
it does not make you a. photographer
<|endoftext|>
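A minimal sketch of how one training example in this layout could be assembled and how the captions could be read back from generated text. The delimiters are copied from the example above; the function names `format_example` and `parse_captions` are hypothetical, and placing each field on its own line is an assumption taken from the slide layout.

```python
from typing import List, Tuple

START, END = "<|startoftext|>", "<|endoftext|>"
# Delimiters that follow the start token, template id, keywords, title and top caption.
SEPARATORS = ["~`", "~^", "~@", "~}", "~{"]


def format_example(template_id: str, keywords: List[str], title: str,
                   top: str, bottom: str) -> str:
    """Serialize one meme into the delimiter-separated training format."""
    fields = [START, template_id, " ".join(keywords), title,
              top.lower(), bottom.lower()]
    lines = [field + sep for field, sep in zip(fields, SEPARATORS)]
    lines.append(fields[-1])  # the bottom caption has no trailing delimiter
    lines.append(END)
    return "\n".join(lines)


def parse_captions(text: str) -> Tuple[str, str]:
    """Extract (top, bottom) captions from text in the same format."""
    body = text.split("~}", 1)[1].rsplit(END, 1)[0]
    top, bottom = body.split("~{", 1)
    return top.strip(), bottom.strip()


example = format_example(
    template_id="001",
    keywords=["photographer", "catdog", "mediocre", "said", "maybe"],
    title="As a photography student...",
    top="Just because you own a camera",
    bottom="it does not make you a. photographer",
)
print(example)
print(parse_captions(example))
```

Distinct delimiters per field make it possible to tell where the generated title ends and each caption begins, which is what the parsing step relies on.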
Generative model training, ~20h
Training loss: 1.63. Validation loss: 2.35.
Evaluation
1. Post to a social network (Reddit).
Challenges:
a. “bad” content must be filtered out
b. time-consuming
c. confounding factors (posting time, etc.).
2. Use crowdsourcing to evaluate engagement.
Compare engagement between the machine-generated and
human-created meme groups.
Evaluation pipeline
[Figure: evaluation pipeline comparing machine-generated memes and human-created memes]
Task to collect engagement (MTurk)
1. Crowdsourcing (MTurk).
2. Contingency table.
3. Chi-square test: a statistical hypothesis test of whether the two groups differ.
4. Cohen's effect size w.
MTurk justification
                 Like   Not like
good memes (A)   41     59
bad memes (B)    21     79

observations   (unlabeled)   χ²     p-value   w      α      power
200            0.2           8.44   0.003     0.21   0.05   0.82
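The statistics for this contingency table can be recomputed with the standard library alone. The sketch below is an illustration, not the author's code: the function names are hypothetical, and the use of Yates' continuity correction is an inference from the fact that it reproduces the reported statistic of about 8.44. The slides do not state the exact procedure or rounding, so the other quantities land close to, but not necessarily exactly on, the reported values.

```python
import math
from statistics import NormalDist


def chi_square_2x2(a: int, b: int, c: int, d: int, yates: bool = True) -> float:
    """Pearson chi-square statistic for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    rows, cols = (a + b, c + d), (a + c, b + d)
    observed = ((a, b), (c, d))
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = rows[i] * cols[j] / n
            diff = abs(observed[i][j] - expected)
            if yates:
                diff = max(diff - 0.5, 0.0)  # continuity correction (assumed)
            stat += diff ** 2 / expected
    return stat


def p_value_df1(stat: float) -> float:
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2))


def cohen_w(stat: float, n: int) -> float:
    """Cohen's effect size w = sqrt(chi-square / N)."""
    return math.sqrt(stat / n)


def power_df1(w: float, n: int, alpha: float = 0.05) -> float:
    """Power of the 1-df chi-square test for effect size w and n observations."""
    z = NormalDist()
    crit = z.inv_cdf(1 - alpha / 2)
    shift = w * math.sqrt(n)  # square root of the noncentrality parameter
    return z.cdf(shift - crit) + z.cdf(-shift - crit)


n = 200
chi2 = chi_square_2x2(41, 59, 21, 79)  # good memes (A) vs bad memes (B)
p = p_value_df1(chi2)
w = cohen_w(chi2, n)
power = power_df1(w, n)
print(round(chi2, 2), round(p, 4), round(w, 2), round(power, 2))
```

With one degree of freedom the chi-square test is equivalent to a two-sided z-test, which is why the p-value and the power can be expressed through the normal distribution.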
Chi-square test prior calculations
w 0.1 (small)
α 0.05
df (degrees of freedom) 1
test power 0.9
total observations ~1000
observations per sample ~500
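The order of magnitude of these numbers can be checked with a short calculation. The sketch below uses the normal approximation for a chi-square test with one degree of freedom; the function name `required_observations` is hypothetical, and the tool actually used for the prior calculation is not stated in the slides.

```python
import math
from statistics import NormalDist


def required_observations(w: float, alpha: float = 0.05, power: float = 0.9) -> int:
    """Approximate total N for a 1-df chi-square test to detect effect size w.

    Uses N = ((z_{1-alpha/2} + z_{power}) / w) ** 2, which ignores the
    negligible probability of rejecting in the wrong tail.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    return math.ceil(((z_alpha + z_power) / w) ** 2)


total = required_observations(w=0.1, alpha=0.05, power=0.9)
per_sample = math.ceil(total / 2)  # two equally sized groups
print(total, per_sample)
```

Under these assumptions the result is a little over 1000 observations in total, in line with the rounded figures above.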
Evaluation setting
We targeted workers with the following characteristics:
● HIT approval rate >= 95%
● Number of HITs approved >= 5000
● Location is the United States
● Worker is an MTurk Master
We collected 900 observations per sample.
We removed 5% of HITs that were completed extremely quickly or slowly.
Meme samples
Description      Memes   Observations   Liked (share)
human, random    90      846            0.35
human, bad       90      881            0.27
machine          90      876            0.24
* “bad” human memes are not always bad.
#1 Random (A) vs bad human (B) memes
observations 1727
(unlabeled) 0.08
χ² 11.06
p-value <0.01
w 0.08
α 0.05
power 0.91
#2 Random (A) vs machine generated (B)
observations 1722
(unlabeled) 0.09
χ² 21.29
p-value <0.01
w 0.11
α 0.05
power 0.99
#3 Bad human (A) vs machine generated (B)
observations 1757
(unlabeled) 0.03
χ² 1.40
p-value 0.27
w 0.02
α 0.05
power 0.22
Engagement on Twitter
Summary
Contributions
1. Created a unique dataset.
2. Justified that MTurk can be used to approximately measure engagement
with memes.
3. Found that the current SOTA generates memes that collect a similar
amount of engagement to the least popular human memes.
4. Proposed an approach that can be adapted to generate more complex
content.
Future work
1. Detect and filter out offensive machine-generated content.
2. Create a model to detect good memes.
3. Publish generated submissions in large communities to
evaluate engagement in the wild.
4. Improve training data selection.
5. Improve template selection.
Thank You!
Highlights from the review
1. “Data collection section should definitely outline all the data specific counts”
a. users - 8.1M
b. posts - 3.9M
2. “some of the data cleaning details were missing from the report”.
a. remove special characters and urls
b. word2num
c. tokenize
d. lemmatize
3. “it would be interested to see if there are any features that distinguish
unpopular memes from the popular ones, e.g., time when the meme was posted”.
“What’s in a name? Understanding the Interplay between Titles, Content, and
Communities in Social Media”
2013, Himabindu Lakkaraju et al, cseweb.ucsd.edu/~jmcauley/pdfs/icwsm13.pdf
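The cleaning steps listed under item 2 above can be sketched roughly as follows. This is an illustration under stated assumptions rather than the cleaning code used for the dataset: the regular expressions, the tiny `NUMBER_WORDS` table (a toy stand-in for a word-to-number step) and the function name `clean_and_tokenize` are all hypothetical, the number mapping is applied per token after splitting for simplicity, and lemmatization is omitted because it needs an external NLP library.

```python
import re
from typing import List

URL_RE = re.compile(r"https?://\S+|www\.\S+")
SPECIAL_RE = re.compile(r"[^A-Za-z0-9\s]")
# Toy stand-in for word2num; a real pipeline would cover far more cases.
NUMBER_WORDS = {
    "zero": "0", "one": "1", "two": "2", "three": "3", "four": "4", "five": "5",
    "six": "6", "seven": "7", "eight": "8", "nine": "9", "ten": "10",
}


def clean_and_tokenize(text: str) -> List[str]:
    """Remove URLs and special characters, tokenize, and map number words to digits."""
    text = URL_RE.sub(" ", text)      # drop URLs first, while they are still intact
    text = SPECIAL_RE.sub(" ", text)  # drop special characters
    tokens = text.split()             # whitespace tokenization
    return [NUMBER_WORDS.get(token.lower(), token) for token in tokens]


tokens = clean_and_tokenize("Two cats!!! See https://example.com/cat.png #funny")
print(tokens)
```

URLs are removed before special characters because stripping punctuation first would break a URL into fragments that no longer match the URL pattern.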
Memes #1
Memes #2

Master defence 2020 - Andrew Kurochkin - Meme Generation for Social Media Audience Engagement (Lviv Data Science Summer School)

 
Master defence 2020 -Volodymyr Lut-Neural Architecture Search: a Probabilisti...
Master defence 2020 -Volodymyr Lut-Neural Architecture Search: a Probabilisti...Master defence 2020 -Volodymyr Lut-Neural Architecture Search: a Probabilisti...
Master defence 2020 -Volodymyr Lut-Neural Architecture Search: a Probabilisti...
 

Recently uploaded

bordetella pertussis.................................ppt
bordetella pertussis.................................pptbordetella pertussis.................................ppt
bordetella pertussis.................................ppt
kejapriya1
 
Equivariant neural networks and representation theory
Equivariant neural networks and representation theoryEquivariant neural networks and representation theory
Equivariant neural networks and representation theory
Daniel Tubbenhauer
 
Oedema_types_causes_pathophysiology.pptx
Oedema_types_causes_pathophysiology.pptxOedema_types_causes_pathophysiology.pptx
Oedema_types_causes_pathophysiology.pptx
muralinath2
 
NuGOweek 2024 Ghent programme overview flyer
NuGOweek 2024 Ghent programme overview flyerNuGOweek 2024 Ghent programme overview flyer
NuGOweek 2024 Ghent programme overview flyer
pablovgd
 
aziz sancar nobel prize winner: from mardin to nobel
aziz sancar nobel prize winner: from mardin to nobelaziz sancar nobel prize winner: from mardin to nobel
aziz sancar nobel prize winner: from mardin to nobel
İsa Badur
 
The use of Nauplii and metanauplii artemia in aquaculture (brine shrimp).pptx
The use of Nauplii and metanauplii artemia in aquaculture (brine shrimp).pptxThe use of Nauplii and metanauplii artemia in aquaculture (brine shrimp).pptx
The use of Nauplii and metanauplii artemia in aquaculture (brine shrimp).pptx
MAGOTI ERNEST
 
ESR spectroscopy in liquid food and beverages.pptx
ESR spectroscopy in liquid food and beverages.pptxESR spectroscopy in liquid food and beverages.pptx
ESR spectroscopy in liquid food and beverages.pptx
PRIYANKA PATEL
 
mô tả các thí nghiệm về đánh giá tác động dòng khí hóa sau đốt
mô tả các thí nghiệm về đánh giá tác động dòng khí hóa sau đốtmô tả các thí nghiệm về đánh giá tác động dòng khí hóa sau đốt
mô tả các thí nghiệm về đánh giá tác động dòng khí hóa sau đốt
HongcNguyn6
 
Authoring a personal GPT for your research and practice: How we created the Q...
Authoring a personal GPT for your research and practice: How we created the Q...Authoring a personal GPT for your research and practice: How we created the Q...
Authoring a personal GPT for your research and practice: How we created the Q...
Leonel Morgado
 
Travis Hills' Endeavors in Minnesota: Fostering Environmental and Economic Pr...
Travis Hills' Endeavors in Minnesota: Fostering Environmental and Economic Pr...Travis Hills' Endeavors in Minnesota: Fostering Environmental and Economic Pr...
Travis Hills' Endeavors in Minnesota: Fostering Environmental and Economic Pr...
Travis Hills MN
 
Basics of crystallography, crystal systems, classes and different forms
Basics of crystallography, crystal systems, classes and different formsBasics of crystallography, crystal systems, classes and different forms
Basics of crystallography, crystal systems, classes and different forms
MaheshaNanjegowda
 
Phenomics assisted breeding in crop improvement
Phenomics assisted breeding in crop improvementPhenomics assisted breeding in crop improvement
Phenomics assisted breeding in crop improvement
IshaGoswami9
 
Remote Sensing and Computational, Evolutionary, Supercomputing, and Intellige...
Remote Sensing and Computational, Evolutionary, Supercomputing, and Intellige...Remote Sensing and Computational, Evolutionary, Supercomputing, and Intellige...
Remote Sensing and Computational, Evolutionary, Supercomputing, and Intellige...
University of Maribor
 
Randomised Optimisation Algorithms in DAPHNE
Randomised Optimisation Algorithms in DAPHNERandomised Optimisation Algorithms in DAPHNE
Randomised Optimisation Algorithms in DAPHNE
University of Maribor
 
The binding of cosmological structures by massless topological defects
The binding of cosmological structures by massless topological defectsThe binding of cosmological structures by massless topological defects
The binding of cosmological structures by massless topological defects
Sérgio Sacani
 
Topic: SICKLE CELL DISEASE IN CHILDREN-3.pdf
Topic: SICKLE CELL DISEASE IN CHILDREN-3.pdfTopic: SICKLE CELL DISEASE IN CHILDREN-3.pdf
Topic: SICKLE CELL DISEASE IN CHILDREN-3.pdf
TinyAnderson
 
Eukaryotic Transcription Presentation.pptx
Eukaryotic Transcription Presentation.pptxEukaryotic Transcription Presentation.pptx
Eukaryotic Transcription Presentation.pptx
RitabrataSarkar3
 
Bob Reedy - Nitrate in Texas Groundwater.pdf
Bob Reedy - Nitrate in Texas Groundwater.pdfBob Reedy - Nitrate in Texas Groundwater.pdf
Bob Reedy - Nitrate in Texas Groundwater.pdf
Texas Alliance of Groundwater Districts
 
Unlocking the mysteries of reproduction: Exploring fecundity and gonadosomati...
Unlocking the mysteries of reproduction: Exploring fecundity and gonadosomati...Unlocking the mysteries of reproduction: Exploring fecundity and gonadosomati...
Unlocking the mysteries of reproduction: Exploring fecundity and gonadosomati...
AbdullaAlAsif1
 
EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...
EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...
EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...
Sérgio Sacani
 

Recently uploaded (20)

bordetella pertussis.................................ppt
bordetella pertussis.................................pptbordetella pertussis.................................ppt
bordetella pertussis.................................ppt
 
Equivariant neural networks and representation theory
Equivariant neural networks and representation theoryEquivariant neural networks and representation theory
Equivariant neural networks and representation theory
 
Oedema_types_causes_pathophysiology.pptx
Oedema_types_causes_pathophysiology.pptxOedema_types_causes_pathophysiology.pptx
Oedema_types_causes_pathophysiology.pptx
 
NuGOweek 2024 Ghent programme overview flyer
NuGOweek 2024 Ghent programme overview flyerNuGOweek 2024 Ghent programme overview flyer
NuGOweek 2024 Ghent programme overview flyer
 
aziz sancar nobel prize winner: from mardin to nobel
aziz sancar nobel prize winner: from mardin to nobelaziz sancar nobel prize winner: from mardin to nobel
aziz sancar nobel prize winner: from mardin to nobel
 
The use of Nauplii and metanauplii artemia in aquaculture (brine shrimp).pptx
The use of Nauplii and metanauplii artemia in aquaculture (brine shrimp).pptxThe use of Nauplii and metanauplii artemia in aquaculture (brine shrimp).pptx
The use of Nauplii and metanauplii artemia in aquaculture (brine shrimp).pptx
 
ESR spectroscopy in liquid food and beverages.pptx
ESR spectroscopy in liquid food and beverages.pptxESR spectroscopy in liquid food and beverages.pptx
ESR spectroscopy in liquid food and beverages.pptx
 
mô tả các thí nghiệm về đánh giá tác động dòng khí hóa sau đốt
mô tả các thí nghiệm về đánh giá tác động dòng khí hóa sau đốtmô tả các thí nghiệm về đánh giá tác động dòng khí hóa sau đốt
mô tả các thí nghiệm về đánh giá tác động dòng khí hóa sau đốt
 
Authoring a personal GPT for your research and practice: How we created the Q...
Authoring a personal GPT for your research and practice: How we created the Q...Authoring a personal GPT for your research and practice: How we created the Q...
Authoring a personal GPT for your research and practice: How we created the Q...
 
Travis Hills' Endeavors in Minnesota: Fostering Environmental and Economic Pr...
Travis Hills' Endeavors in Minnesota: Fostering Environmental and Economic Pr...Travis Hills' Endeavors in Minnesota: Fostering Environmental and Economic Pr...
Travis Hills' Endeavors in Minnesota: Fostering Environmental and Economic Pr...
 
Basics of crystallography, crystal systems, classes and different forms
Basics of crystallography, crystal systems, classes and different formsBasics of crystallography, crystal systems, classes and different forms
Basics of crystallography, crystal systems, classes and different forms
 
Phenomics assisted breeding in crop improvement
Phenomics assisted breeding in crop improvementPhenomics assisted breeding in crop improvement
Phenomics assisted breeding in crop improvement
 
Remote Sensing and Computational, Evolutionary, Supercomputing, and Intellige...
Remote Sensing and Computational, Evolutionary, Supercomputing, and Intellige...Remote Sensing and Computational, Evolutionary, Supercomputing, and Intellige...
Remote Sensing and Computational, Evolutionary, Supercomputing, and Intellige...
 
Randomised Optimisation Algorithms in DAPHNE
Randomised Optimisation Algorithms in DAPHNERandomised Optimisation Algorithms in DAPHNE
Randomised Optimisation Algorithms in DAPHNE
 
The binding of cosmological structures by massless topological defects
The binding of cosmological structures by massless topological defectsThe binding of cosmological structures by massless topological defects
The binding of cosmological structures by massless topological defects
 
Topic: SICKLE CELL DISEASE IN CHILDREN-3.pdf
Topic: SICKLE CELL DISEASE IN CHILDREN-3.pdfTopic: SICKLE CELL DISEASE IN CHILDREN-3.pdf
Topic: SICKLE CELL DISEASE IN CHILDREN-3.pdf
 
Eukaryotic Transcription Presentation.pptx
Eukaryotic Transcription Presentation.pptxEukaryotic Transcription Presentation.pptx
Eukaryotic Transcription Presentation.pptx
 
Bob Reedy - Nitrate in Texas Groundwater.pdf
Bob Reedy - Nitrate in Texas Groundwater.pdfBob Reedy - Nitrate in Texas Groundwater.pdf
Bob Reedy - Nitrate in Texas Groundwater.pdf
 
Unlocking the mysteries of reproduction: Exploring fecundity and gonadosomati...
Unlocking the mysteries of reproduction: Exploring fecundity and gonadosomati...Unlocking the mysteries of reproduction: Exploring fecundity and gonadosomati...
Unlocking the mysteries of reproduction: Exploring fecundity and gonadosomati...
 
EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...
EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...
EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...
 

Master defence 2020 - Andrew Kurochkin - Meme Generation for Social Media Audience Engagement

  • 1. Meme Generation for Social Media Audience Engagement. Author: Andrew Kurochkin (andrewkurochkin.com). Supervisor: Kostiantyn Bokhan, PhD.
  • 2. Agenda 1. Introduction 2. Research Objectives 3. Approach 4. Dataset 5. Solution 6. Evaluation 7. Summary
  • 6. Definition #1 Meme: a unit of cultural information spread by imitation (1976, The Selfish Gene, Richard Dawkins). Image macro: an image superimposed with text. The image macro is one of the most common forms of the internet meme. *Here, image macro = meme.
  • 7. Definition #1 An image macro consists of: 1. Top and bottom captions (font: Impact). 2. A background image.
  • 8. Definition #2 Meme template: the background image that is common across many meme instances. Each template has its own cultural context.
  • 10. Definition #3 Engage [verb]: occupy, attract, or involve someone's interest or attention. 4 levels of engagement [2019, Kholoud Khalil Aldous et al.]: 1. View 2. Like 3. Comment, share 4. External post
  • 13. Motivation 3.5 billion people used social media in 2019. People spend 2h 20m on social media every day. ~80% of marketers use visual assets in their social media marketing. ~30% of marketers say visual images are the most important form of content for their business. ~70% of marketers state that creating more engaging content is their most important task. Memes can be used to manipulate public opinion, for political or ideological propaganda.
  • 15. Research Objectives 1. To investigate whether a current state-of-the-art neural network can produce memes that engage an audience. 2. To create a dataset of image macros, meta-information, and related comments. 3. To create a pipeline to evaluate the degree of engagement that image macros induce in the audience.
  • 21. Approach A. Get a topical news occasion (event). B. Embed the event as a vector. C. Choose a meme template based on the event vector. D. Generate image macro captions based on the meme template context and the small event vector. E. Combine the results from C and D.
  • 22. (A) Get a topical news occasion (event)
  • 23. (B) Embed the event as a vector
  • 24. (C) Meme template selection
  • 25. (D) Generate meme captions and post title
  • 26. (D) Generate meme captions and post title. GPT-2 model [2019, OpenAI, Alec Radford et al.]: 1. Trained on WebText, a dataset of 8 million web pages (40 GB of Internet text) collected from links shared on Reddit. 2. GPT-2 is based on a multi-layer Transformer decoder. This model gives structured memory for handling long-term dependencies in text. 3. GPT-2 beat state-of-the-art solutions on various tasks without fine-tuning. We used the smallest model, GPT-2 117M, which has 117 million parameters.
  • 27. (E) Combine the results from C and D
  • 30. Dataset requirements 1. Post title 2. Post comments 3. Submitted meme 4. Meme template 5. Meme captions (top and bottom) 6. Collected engagement (score or number of likes)
  • 38. Dataset collection 1. Download Reddit submissions and comments for the subreddit AdviceAnimals. 2. Download the images from the submissions where possible. 3. Recognize the meme template [2018, Savvas Zannettou et al.]: a. Embed each image as a 64-element vector based on its perceptual hash (pHash). b. Calculate Hamming distances between pHashes. c. Cluster the images using DBSCAN. 4. Run optical character recognition (OCR, Azure) to extract the top and bottom pieces of text from the meme image.
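
Steps 3b and 3c can be sketched in plain Python. This is a minimal illustration, not the thesis code: it assumes the 64-bit pHashes are already computed (for example with an image-hashing package) and stored as integers, and the `eps` and `min_samples` values are hypothetical because the slides do not state them.

```python
from collections import deque


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two 64-bit hashes."""
    return bin(a ^ b).count("1")


def dbscan(hashes, eps, min_samples):
    """Return one label per hash: a cluster id, or -1 for noise."""
    n = len(hashes)
    # Brute-force neighbourhoods; every point is its own neighbour.
    neighbours = [
        [j for j in range(n) if hamming(hashes[i], hashes[j]) <= eps]
        for i in range(n)
    ]
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbours[i]) < min_samples:
            labels[i] = -1  # provisional noise, may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = deque(neighbours[i])
        while queue:
            j = queue.popleft()
            if labels[j] == -1:
                labels[j] = cluster  # border point: joins but is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbours[j]) >= min_samples:
                queue.extend(neighbours[j])  # core point: keep growing
    return labels


# Toy hashes: two groups of near-duplicates and one unrelated image.
toy_hashes = [
    0x0, 0x1, 0x3,
    0xFFFF_FFFF_FFFF_FFFF, 0xFFFF_FFFF_FFFF_FFFE,
    0x0F0F_0F0F_0000_0000,
]
toy_labels = dbscan(toy_hashes, eps=2, min_samples=2)
```

Each resulting cluster would then stand for one meme template, and images labelled -1 would be treated as not matching any template.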
  • 39. Volumes of engagement in the dataset
  • 40. Final dataset size: 650K memes, 350K of them with comments (keywords)
  • 42. 1. Meme template selection
  • 43. Data preparation: class balancing. Figure panels: meme templates before resampling (imbalanced); meme templates after resampling (balanced).
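
The slide does not say which resampling scheme was used. As one possible reading, the sketch below balances template classes by random oversampling, duplicating examples of rare templates until every class matches the largest one. The function name and the seed are hypothetical.

```python
import random
from collections import Counter, defaultdict


def oversample(samples, labels, seed=0):
    """Randomly duplicate minority-class samples until all classes are equal."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_label[label].append(sample)
    target = max(len(items) for items in by_label.values())
    out_samples, out_labels = [], []
    for label, items in by_label.items():
        # Draw extra copies with replacement from the same class.
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for sample in items + extra:
            out_samples.append(sample)
            out_labels.append(label)
    return out_samples, out_labels


# Toy data: template "a" is three times as frequent as template "b".
balanced_samples, balanced_labels = oversample(
    ["m1", "m2", "m3", "m4"], ["a", "a", "a", "b"]
)
balanced_counts = Counter(balanced_labels)
```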
  • 47. Data preparation. Input data fields, in order: start token, template id, 5 keywords, submission title, top caption (lowercased), bottom caption (lowercased), end token. Example: <|startoftext|>~` 001~^ photographer catdog mediocre said maybe~@ As a photography student...~} just because you own a camera~{ it does not make you a. photographer <|endoftext|>
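
A small helper shows how one training line could be assembled. The mapping of delimiters to fields (`~`` before the template id, `~^` before the keywords, `~@` before the title, `~}` before the top caption, `~{` before the bottom caption) is inferred from the single example on the slide, and the function name is hypothetical.

```python
def format_training_example(template_id, keywords, title, top, bottom):
    """Join the fields of one meme into a single training string.

    The delimiter-to-field mapping is an assumption inferred from the
    slide's example, not a documented specification.
    """
    return (
        "<|startoftext|>"
        f"~` {template_id}"
        f"~^ {' '.join(keywords)}"
        f"~@ {title}"
        f"~}} {top.lower()}"      # captions are lowercased, the title is not
        f"~{{ {bottom.lower()}"
        " <|endoftext|>"
    )


example_line = format_training_example(
    template_id="001",
    keywords=["photographer", "catdog", "mediocre", "said", "maybe"],
    title="As a photography student...",
    top="Just because you own a camera",
    bottom="it does not make you a. photographer",
)
```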
  • 49. Generative model training, ~20h. Training loss: 1.63. Validation loss: 2.35.
  • 51. Evaluation 1. Post on a social network (Reddit). Challenges: a. “bad” content must be filtered out; b. time-consuming; c. other factors (posting time, etc.). 2. Use crowdsourcing to evaluate engagement: compare engagement between the machine and human meme groups.
  • 54. Evaluation pipeline: task to collect engagement (MTurk)
  • 55. Evaluation pipeline 1. Crowdsourcing (MTurk). 2. Contingency table. 3. Chi-square test: a statistical hypothesis test of whether the two groups differ. 4. Cohen's effect size w.
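
Steps 2 to 4 can be sketched with the standard library alone. The example uses the like / not-like counts from the MTurk justification slide (41/59 for good memes, 21/79 for bad memes). Applying Yates' continuity correction is an assumption; with it the statistic comes out at about 8.44 and w at about 0.21, which agrees with the values on that slide, while the p-value evaluates to roughly 0.0037.

```python
import math


def chi_square_2x2(table, yates=True):
    """Chi-square test and Cohen's w for a 2x2 contingency table.

    Returns (chi2, p_value, w). The p-value formula is valid for one
    degree of freedom only.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    chi2 = 0.0
    for i, observed_row in enumerate(table):
        for j, observed in enumerate(observed_row):
            expected = row_totals[i] * col_totals[j] / n
            diff = abs(observed - expected)
            if yates:
                diff = max(diff - 0.5, 0.0)  # continuity correction
            chi2 += diff ** 2 / expected
    # Survival function of chi-square with df = 1.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    w = math.sqrt(chi2 / n)  # Cohen's effect size
    return chi2, p_value, w


# Rows: good memes (A), bad memes (B); columns: like, not like.
chi2, p_value, w = chi_square_2x2([[41, 59], [21, 79]])
```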
  • 57. MTurk justification. Contingency table (Like / Not like): good memes (A) 41 / 59; bad memes (B) 21 / 79. Test results: observations 200; 0.2; χ² 8.44; p-value 0.003; w 0.21; α 0.05; power 0.82
  • 60. Chi-square test prior calculations: w 0.1 (small); α 0.05; df (degrees of freedom) 1; test power 0.9; total observations ~1000; observations per sample ~500
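
The prior calculation can be approximated without a statistics package. With one degree of freedom the chi-square statistic is the square of a standard normal variable, so the required sample size follows from normal quantiles. This is a sketch under that approximation, not necessarily the tool used in the thesis, and the function names are hypothetical. For w = 0.1, α = 0.05 and power 0.9 it gives 1051 total observations, in line with the ~1000 on the slide.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def required_observations(w, alpha=0.05, power=0.9):
    """Approximate total N for a df = 1 chi-square test (far tail ignored)."""
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_power = _STD_NORMAL.inv_cdf(power)
    return math.ceil(((z_alpha + z_power) / w) ** 2)


def achieved_power(n, w, alpha=0.05):
    """Power of a df = 1 chi-square test with n observations and effect size w."""
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    shift = w * math.sqrt(n)  # square root of the noncentrality parameter
    return _STD_NORMAL.cdf(shift - z_alpha) + _STD_NORMAL.cdf(-shift - z_alpha)


total_needed = required_observations(0.1)      # planning values from the slide
power_test_1 = achieved_power(1727, 0.08)      # observations and w of comparison #1
```

The second call reproduces, to two decimals, the power of 0.91 reported for the random versus bad human memes comparison.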
  • 61. Evaluation setting. We targeted workers with the following characteristics: ● HIT approval rate >= 95% ● Number of HITs approved >= 5000 ● Location is the United States ● Worker is an MTurk Master. We collected 900 observations per sample and removed the 5% of HITs that were completed extremely fast or extremely slowly.
  • 62. Meme samples (description: number of memes, number of observations, share liked): human, random: 90, 846, 0.35; human, bad: 90, 881, 0.27; machine: 90, 876, 0.24. * “Bad” human memes are not always bad.
  • 63. #1 Random (A) vs bad human (B) memes: observations 1727; 0.08; χ² 11.06; p-value <0.01; w 0.08; α 0.05; power 0.91
  • 65. #2 Random (A) vs machine generated (B): observations 1722; 0.09; χ² 21.29; p-value <0.01; w 0.11; α 0.05; power 0.99
  • 67. #3 Bad human (A) vs machine generated (B): observations 1757; 0.03; χ² 1.40; p-value 0.27; w 0.02; α 0.05; power 0.22
  • 71. Contributions 1. Created a unique dataset. 2. Showed that MTurk can be used to approximately measure engagement with memes. 3. Found that the current SOTA generates memes that collect a similar amount of engagement to the least popular human memes. 4. Proposed an approach that can be adapted to generate more complex content.
  • 72. Future work 1. Detect and filter out offensive machine-generated content. 2. Create a model to detect good memes. 3. Publish generated submissions in large communities to evaluate engagement in the wild. 4. Improve training data selection. 5. Improve template selection.
  • 74. Highlights from the review 1. “Data collection section should definitely outline all the data specific counts”: a. users: 8.1M; b. posts: 3.9M. 2. “some of the data cleaning details were missing from the report”: a. remove special characters and URLs; b. word2num; c. tokenize; d. lemmatize. 3. “it would be interested to see if there are any features that distinguish unpopular memes from the popular ones, e.g., time when the meme was posted”: “What’s in a name? Understanding the Interplay between Titles, Content, and Communities in Social Media”, 2013, Himabindu Lakkaraju et al., cseweb.ucsd.edu/~jmcauley/pdfs/icwsm13.pdf
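
The cleaning steps listed under item 2 can be illustrated with a rough standard-library sketch. It is not the thesis pipeline: the regular expressions and the tiny `NUMBER_WORDS` table are hypothetical stand-ins, "word2num" is read here as turning number words into digits, and lemmatization is left out because it needs an external NLP library.

```python
import re

# Hypothetical, deliberately tiny stand-in for a word-to-number step.
NUMBER_WORDS = {
    "one": "1", "two": "2", "three": "3", "four": "4", "five": "5",
    "six": "6", "seven": "7", "eight": "8", "nine": "9", "ten": "10",
}

URL_PATTERN = re.compile(r"https?://\S+|www\.\S+")
SPECIAL_PATTERN = re.compile(r"[^a-z0-9\s]")


def clean_comment(text):
    """Remove URLs and special characters, map number words, tokenize."""
    text = URL_PATTERN.sub(" ", text.lower())
    text = SPECIAL_PATTERN.sub(" ", text)
    tokens = text.split()  # whitespace tokenization
    return [NUMBER_WORDS.get(token, token) for token in tokens]


tokens = clean_comment("Check https://example.com NOW!!! two cats")
```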