Faking Sandy: Characterizing and Identifying Fake Images on Twitter during Hurricane Sandy

Aditi Gupta
Hemank Lamba
Ponnurangam Kumaraguru
Anupam Joshi

Agenda
• Background
• Motivation
• Main contributions
• Data description
• Methodology
• Results
• Future work

precog.iiitd.edu.in

Background: Hurricane Sandy
• Dates: Oct 22-31, 2012
• Category 3 storm
• Damages worth $75 billion
• Coast of NE America

Motivation
FAKE IMAGES

Our Contributions
• Performed in-depth characterization of tweets sharing fake images on Twitter during Hurricane Sandy
  - Tweets containing the fake image URLs were mostly retweets (86%)
  - Just 11% overlap between the retweet and follower graphs of tweets containing fake images
• Applied classification algorithms to distinguish between tweets containing fake and real images
  - Best accuracy of 97% was achieved using a decision tree classifier with tweet-based features

Methodology
• Data Collection and Filtering
• Data Characterization
• Obtaining Ground Truth
• Feature Generation
• Classification Module
• Evaluating Results

Data Description
Total tweets: 1,782,526
Total unique users: 1,174,266
Tweets with URLs: 622,860

Data Filtering
• Reputable online resource used to filter fake and real images
  - The Guardian collected and publicly distributed a list of fake and true images shared during Hurricane Sandy

Tweets with fake images: 10,350
Users with fake images: 10,215
Tweets with real images: 5,767
Users with real images: 5,678

• One of the biggest fake content propagation datasets that have been studied by researchers

Characterization – Fake Image Propagation
• 86% of tweets spreading the fake images were retweets
• The top 30 users out of 10,215 (0.3%) accounted for 90% of the retweets of fake images
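
The concentration figure above can be illustrated with a minimal sketch. It assumes each retweet is recorded by the author of the tweet being retweeted; the function name and the toy data are hypothetical and are not taken from the study.

```python
from collections import Counter


def top_k_retweet_share(retweeted_authors, k):
    """Fraction of all retweets whose original author is among the k most-retweeted users."""
    counts = Counter(retweeted_authors)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    top = sum(n for _, n in counts.most_common(k))
    return top / total


# Toy data (made up): one entry per retweet, naming the retweeted author.
sample = ["a"] * 6 + ["b"] * 3 + ["c"]
share = top_k_retweet_share(sample, k=1)

# Share of users represented by the top 30 accounts reported on the slide.
top_user_pct = 100 * 30 / 10215
```

On the slide's own numbers, 30 of 10,215 users is roughly 0.3% of the user base.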

Network Analysis
[Figure: Tweet-retweet graph for the spread of fake images at the nth and (n+1)th hour]

Role of Explicit Twitter Network
• Analyzing the role of the follower network in fake image propagation
• We crawled the Twitter network for all users who tweeted the fake image URLs

Algorithm

Results
Total edges in the retweet network: 10,508
Total edges in the follower-followee network: 10,799,122
Total edges that exist in both the retweet network and the follower-followee network: 1,215
Percentage overlap: 11%
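
One way to compute such an overlap is sketched below. The slide's algorithm is not recoverable from the transcript, so this is an assumption: edges are treated as directed (source, target) pairs with the same orientation in both graphs, and the overlap is taken relative to the retweet edges. That denominator is consistent with the reported totals, since 1,215 of 10,508 is about 11.6%. The function name and toy data are hypothetical.

```python
def edge_overlap(retweet_edges, follower_edges):
    """Return (shared edge count, percentage of retweet edges also present as follower edges)."""
    retweet_set = set(retweet_edges)
    if not retweet_set:
        return 0, 0.0
    shared = retweet_set & set(follower_edges)
    return len(shared), 100.0 * len(shared) / len(retweet_set)


# Toy graphs (made up): each edge is a (source, target) pair of user names.
retweets = [("alice", "bob"), ("carol", "bob"), ("dave", "erin")]
follows = [("alice", "bob"), ("erin", "dave"), ("carol", "frank")]
shared, pct = edge_overlap(retweets, follows)

# The same ratio applied to the totals reported on the slide.
reported_pct = 100.0 * 1215 / 10508
```

Because the edges are directed in this sketch, ("erin", "dave") does not count as a match for ("dave", "erin").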

Classification
• 5-fold cross-validation

User Features [F1]
- Number of Friends
- Number of Followers
- Follower-Friend Ratio
- Number of times listed
- User has a URL
- User is a verified user
- Age of user account

Tweet Features [F2]
- Length of Tweet
- Number of Words
- Contains Question Mark?
- Contains Exclamation Mark?
- Number of Question Marks
- Number of Exclamation Marks
- Contains Happy Emoticon
- Contains Sad Emoticon
- Contains First-Person Pronoun
- Contains Second-Person Pronoun
- Contains Third-Person Pronoun
- Number of uppercase characters
- Number of negative sentiment words
- Number of positive sentiment words
- Number of mentions
- Number of hashtags
- Number of URLs
- Retweet count
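
A few of the tweet features [F2] can be computed from the raw text alone, as in the rough sketch below. The function name, the regular expressions, and the dictionary keys are illustrative assumptions rather than the authors' implementation. Features that need external resources (sentiment word lists, emoticon lists, retweet counts) are left out.

```python
import re


def tweet_features(text):
    """Compute a subset of text-only tweet features; patterns are simple approximations."""
    return {
        "length": len(text),
        "num_words": len(text.split()),
        "has_question_mark": "?" in text,
        "has_exclamation_mark": "!" in text,
        "num_question_marks": text.count("?"),
        "num_exclamation_marks": text.count("!"),
        "num_uppercase": sum(1 for ch in text if ch.isupper()),
        "num_mentions": len(re.findall(r"@\w+", text)),
        "num_hashtags": len(re.findall(r"#\w+", text)),
        "num_urls": len(re.findall(r"https?://\S+", text)),
    }


# Made-up example tweet, for illustration only.
example = tweet_features("RT @user: SHARK in the street?! #Sandy http://t.co/abc")
```

Each tweet then becomes one feature dictionary (or vector) that a classifier can consume.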

Classification Results
                 F1 (user)   F2 (tweet)   F1+F2
Naïve Bayes      56.32%      91.97%       91.52%
Decision Tree    53.24%      97.65%       96.65%

• The best results came from the decision tree classifier, which achieved 97% accuracy in distinguishing fake images from real ones.
• Tweet-based features are very effective in distinguishing tweets with fake images from those with real images, while user-based features performed very poorly.
• Our results provide a proof of concept that automated techniques can be used to identify real images from fake images posted on Twitter.
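
The accuracies above were obtained with 5-fold cross-validation. The sketch below shows how such an evaluation loop might be structured. It is a generic illustration, not the authors' setup: the helper names are hypothetical, the data is made up, and a one-threshold "stump" stands in for the Naïve Bayes and decision tree classifiers.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_indices, test_indices) for each of k shuffled folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, folds[i]


def cross_validated_accuracy(features, labels, fit, predict, k=5):
    """Mean accuracy over k folds for a classifier given as fit/predict callables."""
    scores = []
    for train, test in k_fold_indices(len(labels), k):
        model = fit([features[i] for i in train], [labels[i] for i in train])
        correct = sum(predict(model, features[i]) == labels[i] for i in test)
        scores.append(correct / len(test))
    return sum(scores) / len(scores)


def fit_stump(xs, ys):
    """Stand-in classifier: threshold halfway between the two class means."""
    fake = [x for x, y in zip(xs, ys) if y == 1]
    real = [x for x, y in zip(xs, ys) if y == 0]
    return (sum(fake) / len(fake) + sum(real) / len(real)) / 2


def predict_stump(threshold, x):
    return 1 if x > threshold else 0


# Made-up, well-separated data: label 1 = fake, 0 = real.
xs = [1, 2, 3, 4, 5, 11, 12, 13, 14, 15]
ys = [0] * 5 + [1] * 5
accuracy = cross_validated_accuracy(xs, ys, fit_stump, predict_stump, k=5)
```

Passing the classifier as `fit` and `predict` callables keeps the loop independent of the model, so the same loop could score each feature set (F1, F2, F1+F2) in turn.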

Future Work
• Fake charity
• Rumor

Some Attractions

Q&A

Thank You!!!

pk@iiitd.ac.in

aditig@iiitd.ac.in

For any further information, please write to
pk@iiitd.ac.in
