Extensive research has been conducted to identify, analyze and measure popular topics and public sentiment on Online Social Networks (OSNs) through text, especially during crisis events. However, little work has been done to understand
such events through pictures posted on these networks. Given the potential of visual content for influencing users’ thoughts and emotions, we perform a large-scale analysis to study and compare popular themes and sentiment across images and textual content posted on Facebook during the terror attacks that took place in Paris in 2015. We propose a generalizable and highly automated 3-tier pipeline which utilizes state-of-the-art computer vision techniques to extract high-level human understandable image descriptors.
Understanding Crisis Events Through Social Media Images
Towards Understanding Crisis Events On Online Social Networks Through Pictures
IEEE/ACM Conference on Advances in Social Networks Analysis and Mining (ASONAM), 2017
Prateek Dewan, Anshuman Suri, Varun Bharadhwaj, Aditi Mithal, Ponnurangam Kumaraguru
Precog@IIITD
Indraprastha Institute of Information Technology – Delhi (IIITD)
2. http://precog.iiitd.edu.in
Who am I?
• PhD student at IIIT-Delhi, India
• 2012 – present
• Masters (Information Security), IIIT-Delhi (2010 – 2012)
• Funded by the Government of India, IIIT-Delhi, IBM, National Internet
eXchange of India (NIXI)…
• Part of Precog@IIITD
• Privacy, eCrime, Online Social Networks, Data Science for Security and Privacy
• Research interests
• Privacy and Security in Online Social Media, Web Security, Machine Learning
• Data Scientist at Apple
“A Picture Is Worth A Thousand Words”
• Images are the latest way of communicating on OSNs
• 1.8 billion+ pictures shared on Online Social Networks every day
• Images attract much more attention and engagement as
compared to text
• Tweets with images get 18% more clicks, 150% more retweets
• 93% of most engaging content on Facebook has an image
Are we doing enough to "understand" images?
• Most research to analyze social media content focuses on text
• Topics are understood using topic modelling on text
• Sentiment is understood by subjecting textual content to linguistic
techniques
• Is that enough? Does it capture everything?
• Studies related to images are limited to small scale
• Few hundred images manually annotated and analyzed
• What can be done?
• Automated image summarization using Deep Learning and Convolutional
Neural Networks (CNNs), to scale across large numbers of images
• Domain transfer learning: Using existing knowledge in one domain to
understand another domain
• Optical Character Recognition
What do we study?
• Crisis event
• Terrorist attacks in Paris, France in November 2015
• Images on Social Networks
• Facebook
• Data collection – Facebook Graph API Search
• #ParisAttacks
• #PrayForParis
Dataset statistics:
• Unique posts: 131,548
• Unique users: 106,275
• Posts with images: 75,277
• Total images extracted: 57,748
• Total unique images: 15,123
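The collection step above can be sketched as follows. This is a minimal sketch assuming the Graph API response shape of that era (a `data` list of posts with an optional `full_picture` field); `extract_image_urls` is our own illustrative helper, not part of any SDK:

```python
# Sketch: extracting and de-duplicating image URLs from a Facebook
# Graph API search response. The "full_picture" field reflects the
# Graph API as it existed around 2015; extract_image_urls is a
# hypothetical helper written for illustration.

def extract_image_urls(api_response):
    """Collect image URLs from posts, then de-duplicate while keeping
    order (re-shares of the same image yield duplicate URLs)."""
    urls = []
    for post in api_response.get("data", []):
        picture = post.get("full_picture")
        if picture:                     # skip text-only posts
            urls.append(picture)
    seen = set()
    return [u for u in urls if not (u in seen or seen.add(u))]


if __name__ == "__main__":
    sample = {
        "data": [
            {"id": "1", "full_picture": "https://example.com/a.jpg"},
            {"id": "2", "full_picture": "https://example.com/a.jpg"},
            {"id": "3", "message": "#PrayForParis"},  # text-only post
            {"id": "4", "full_picture": "https://example.com/b.jpg"},
        ]
    }
    print(extract_image_urls(sample))
```

The same de-duplication idea explains the drop from 57,748 extracted images to 15,123 unique ones.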
Methodology
• A 3-tier pipeline for extracting high-level descriptors from images
• Input: images
• Tier 1: Visual Themes – Inception v3
• Tier 2: Image Sentiment – DeCAF trained on SentiBank
• Tier 3: Text embedded in images – Optical Character Recognition, then text sentiment (LIWC) and topics (term frequency)
• Output: human-understandable descriptors, refined via manual calibration
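The tier structure can be sketched as a small orchestration skeleton. The three tier functions below are stubs standing in for Inception-v3, the SentiBank-retrained classifier, and Tesseract OCR; all names are illustrative, not the authors' code:

```python
# Sketch of the 3-tier pipeline structure with stub tiers.

def tier1_visual_theme(image):
    return "candle"          # stub: top Inception-v3 label

def tier2_image_sentiment(image):
    return "positive"        # stub: SentiBank-based sentiment

def tier3_embedded_text(image):
    return "Pray for Paris"  # stub: OCR output

def describe(image, calibration=None):
    """Combine the three tiers into one human-understandable descriptor,
    applying an optional manual-calibration label map to Tier 1 output."""
    theme = tier1_visual_theme(image)
    if calibration:
        theme = calibration.get(theme, theme)
    return {
        "theme": theme,
        "image_sentiment": tier2_image_sentiment(image),
        "embedded_text": tier3_embedded_text(image),
    }
```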
Tier I: Visual Themes contd.
• All images labeled using Inception-v3
• Validation:
• Random sample of 2,545 images annotated by 3 human annotators
• 38.87% accuracy (majority voting)
• Manual calibration
• Renamed 7 out of the top 30 (most frequently occurring) labels
• New accuracy: 51.3%
• Why rename?
Example (image): labeled “Bolo tie” by Inception-v3, renamed “PeaceForParis” for our dataset
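The validation and calibration steps above can be sketched in plain Python. The function names and the calibration mapping are illustrative; the slide's concrete example is renaming “Bolo tie” to “PeaceForParis” (7 of the top 30 labels were renamed):

```python
from collections import Counter

def majority_vote(annotations):
    """Ground-truth label for one image from 3 human annotators:
    the label at least two agree on, else None (no majority)."""
    label, count = Counter(annotations).most_common(1)[0]
    return label if count >= 2 else None

# Manual calibration: remap frequent but misleading Inception-v3 labels
# to what they actually depict in this dataset (illustrative mapping).
CALIBRATION = {"bolo tie": "PeaceForParis"}

def calibrate(label):
    """Apply the rename map; labels outside it pass through unchanged."""
    return CALIBRATION.get(label, label)
```

Accuracy is then the fraction of images whose (calibrated) Inception-v3 label matches the majority-vote label.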
Tier II: Image Sentiment
• Domain Transfer Learning
• Inception-v3’s last layer retrained using SentiBank
• SentiBank
• Images collected from Flickr using Adjective Noun Pairs (ANPs) as search
query
• ANPs: happy dog, adorable baby, abandoned house
• Weakly labeled dataset of images carrying emotion
• Final training set – 133,108 negative + 305,100 positive sentiment images
• 10-fold random subsampling
• 69.8% accuracy
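The 10-fold random subsampling protocol can be sketched with a pluggable classifier. The stub trainer below stands in for retraining Inception-v3's last layer on SentiBank; all names are our own illustration:

```python
import random
from collections import Counter

def random_subsampling_accuracy(samples, labels, train_and_predict,
                                folds=10, test_frac=0.1, seed=0):
    """Random subsampling validation: repeatedly draw a random
    train/test split, train, and average test accuracy over folds."""
    rng = random.Random(seed)
    n = len(samples)
    n_test = max(1, int(n * test_frac))
    accs = []
    for _ in range(folds):
        idx = list(range(n))
        rng.shuffle(idx)
        test, train = idx[:n_test], idx[n_test:]
        predict = train_and_predict([(samples[i], labels[i]) for i in train])
        correct = sum(predict(samples[i]) == labels[i] for i in test)
        accs.append(correct / n_test)
    return sum(accs) / folds

def majority_class_trainer(train_pairs):
    """Stub 'training' that memorises the majority class, standing in
    for the retrained sentiment classifier."""
    major = Counter(lbl for _, lbl in train_pairs).most_common(1)[0][0]
    return lambda x: major
```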
Tier III: Text embedded in images
• Optical Character Recognition (OCR): Tesseract (Python)
• 31,689 images had text
• Validation: manually extracted text from a random sample of 1,000 images and compared it with OCR output using string similarity metrics
• ~62% accuracy
Tesseract output:
No-one thinks that
these people are
representative of
Christians. So why
do so many think
that these people
are representative
of Muslims?
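The OCR validation step (string similarity between Tesseract output and a manual transcription) can be sketched as follows. `ocr_similarity` is our illustrative helper built on Python's `difflib`; the guarded demo assumes the real `pytesseract.image_to_string` call plus a local image file:

```python
from difflib import SequenceMatcher

def ocr_similarity(ocr_text, manual_text):
    """String similarity in [0, 1] between OCR output and a manual
    transcription, after normalising whitespace and case."""
    norm = lambda s: " ".join(s.lower().split())
    return SequenceMatcher(None, norm(ocr_text), norm(manual_text)).ratio()

if __name__ == "__main__":
    # Guarded demo: needs the pytesseract package, a tesseract binary,
    # and a local image file, so failures are swallowed here.
    try:
        from PIL import Image
        import pytesseract
        text = pytesseract.image_to_string(Image.open("meme.jpg"))
        print(ocr_similarity(text, "No-one thinks that these people"))
    except Exception:
        pass
```

Averaging such similarity scores over the 1,000 manually transcribed images is one way to arrive at an accuracy figure like the ~62% reported.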
Findings: Top visual themes
Label (count): Description
Website (12,416): Images of posts, tweets, banners, etc.
Book jacket * (5,383): Posters, banners, etc.
Comic book (3,803): Cartoons, animated posters, and memes
Fountain (1,264): The fountain in front of the Louvre museum, and other fountains
Envelope * (1,248): Posters, banners, etc.
Suit (clothing) (1,246): People wearing suits, at gatherings, etc.
Stage (1,135): Stages during public speeches, mass gathering events, etc.
Candle waxlight (1,021): Lit candles and lamps offering support to victims
Malinois # (995): The police dog that died during the attacks
Scoreboard # (971): Images of sports stadiums
Image and post text had different topics
• Text embedded in images depicted more negative sentiment than user-generated textual content
(Word clouds: text embedded in images vs. user-generated text)
Findings
• Image sentiment was more positive than text sentiment
(Plot: sentiment value / volume fraction vs. number of hours after the attacks, comparing post text sentiment, image text sentiment, and image volume fraction)
Limitations
• Object detection technique has limited accuracy
• Retraining is costly; we prefer manual intervention over retraining
• Sentiment portrayed by an image can be subjective
• OCR does not always produce good results
• Missing out on part of the content
Whenever you come across such a post on Facebook, what do you notice first? What do you see? Anger in the image?
Inference: image > text!
Recently, people have started to look at images on social networks
GPUs are cheaper, deep learning is “cool”
I thought we should also do something about it
* = renamed
# = misinformative
I’d like to go over each one of these to help understand what kind of an impact these can have…