Coordinating Human and Machine Intelligence to Classify Microblog Communications in Crises

Muhammad Imran, Carlos Castillo, Ji Lucas, Patrick Meier, Jakob Rogstadius

Qatar Computing Research Institute (QCRI)
Doha, Qatar


USEFUL INFORMATION ON TWITTER

[Figure: share of informative tweets by category and subcategory (% of informative tweets). Percentage labels as extracted: 42%, 50%, 30%, 12%, 18%; 44%, 20%, 16%; 38%, 8%, 54%; 44%, 44%, 2%, 16%, 10%.]

•  Caution and advice: a siren heard; tornado warning issued/lifted; tornado sighting/touchdown
•  Information source: photos as info. source; webpages as info. source; videos as info. source
•  Donations: money; equipment, shelter, volunteers, blood; other donations
•  Casualties & damage: people injured; people dead; damage

Ref: “Extracting Information Nuggets from Disaster-Related Messages in Social Media”. Imran et al. ISCRAM-2013, Baden-Baden, Germany.

SOCIAL MEDIA INFORMATION PROCESSING: OFFLINE APPROACH

1.  Data collection
2.  Human annotations on sample data
3.  Machine training
4.  Classification

IMPACT AND RESPONSE TIMELINE

[Figure: disaster timeline with labels DATA COLLECTION, Disaster response (today), and Disaster response (target).]
Source: Department of Community Safety, Queensland Govt. 2011 & UNOCHA

Target disaster response requires real-time processing.

REAL-TIME SOCIAL MEDIA ANALYSIS

Key requirements:

•  Real-time data collection
•  Ability to incorporate new data collection strategies
•  Obtain human labels in real time
•  Perform de-duplication
•  Perform almost-online machine learning
•  Continuous learning
•  Learn as new labels arrive
•  Perform real-time classification
•  Scale with big disasters (Sandy: 15k posts/min)
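
The "learn as new labels arrive" requirement can be illustrated with a minimal sketch. It assumes a toy bag-of-words perceptron; the class name `OnlinePerceptron`, its methods, and the sample tweets are hypothetical and are not taken from AIDR, whose actual learner is not described on these slides.

```python
class OnlinePerceptron:
    """Toy binary perceptron over bag-of-words features (illustrative only)."""

    def __init__(self):
        self.weights = {}  # token -> weight
        self.bias = 0.0

    @staticmethod
    def _tokens(text):
        return set(text.lower().split())

    def score(self, text):
        return self.bias + sum(self.weights.get(tok, 0.0) for tok in self._tokens(text))

    def predict(self, text):
        return 1 if self.score(text) > 0 else 0

    def learn(self, text, label):
        """Update the model with one new label; no retraining from scratch."""
        error = label - self.predict(text)  # -1, 0, or +1
        if error:
            for tok in self._tokens(text):
                self.weights[tok] = self.weights.get(tok, 0.0) + error
            self.bias += error


model = OnlinePerceptron()

# Hypothetical stream of human labels (1 = informative, 0 = not informative).
labeled_stream = [
    ("tornado warning issued for joplin", 1),
    ("watching a movie tonight", 0),
    ("tornado touchdown near the hospital", 1),
    ("great movie night with friends", 0),
]

for text, label in labeled_stream:
    model.learn(text, label)  # the model is usable after every single label

print(model.predict("tornado sirens heard downtown"))
print(model.predict("a movie tonight"))
```

Because each label triggers only a small in-place update, classification can continue on the incoming stream while labeling is still in progress.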

ONLINE APPROACH

1.  Data collection
2.  Human annotations
3.  Machine training
4.  Classification

[Diagram: DATA COLLECTION and CLASSIFICATION run alongside repeated rounds of Human annotation-1, Human annotation-2, Human annotation-3, …, Human annotation-n and Learning-1, Learning-2, Learning-3, …, Learning-n; the initial period is labeled "First few hours".]

SOCIAL MEDIA INFORMATION PROCESSING: ONLINE APPROACH (REAL-TIME)

http://aidr.qcri.org/

AIDR (Artificial Intelligence for Disaster Response) is a free, open-source, and easy-to-use platform to automatically filter and classify relevant tweets posted during humanitarian crises.

1. Collect
2. Curate
3. Classify

AIDR: FROM THE END USER'S PERSPECTIVE

2-step approach:

1. Collection
A collection is a set of filters
•  Keywords, hashtags
•  Geographical bounding box
•  Languages
•  Follow a specific set of users

2. Classifier(s)
A classifier is a set of tags
•  Donation requests & offers
•  Damage & casualties
•  Eyewitness accounts

REAL-TIME CLASSIFICATION IN AIDR

[Diagram: a Collection feeds the Classifier(s) (Classifier-1 and Classifier-2, each with its own set of Tags) at 30k/min; Labeling tasks go to a Learner, which produces the Model.]
HUMAN ANNOTATION: CHALLENGES

•  Crisis-specific labels are necessary
•  Contrasting vocabulary
•  Differences in public concerns, affected infrastructure
•  New labels should be collected for each new crisis

Crowdsourcing is a big research topic. We address two challenges here:

1- Labeling task selection
•  Which tasks to pick?
•  No duplicate tasks should be labeled
•  Prioritize tasks that are likely to increase accuracy

2- Labeling task scheduling
•  All-at-once labeling
•  Gradual labeling
•  Independent labeling

[Imran et al. 2013b]

DATASETS

1.  Joplin-2011
•  Consists of 206,764 tweets collected using (#joplin)

2.  Sandy-2012
•  Consists of 4,906,521 tweets collected using (#sandy, hurricane sandy, …)

3.  Oklahoma-2013
•  Consists of 2,742,588 tweets collected using (Oklahoma, tornado, …)

DISASTER PHASES & # OF TWEETS

Pre: corresponds to the preparedness phase
Impact: corresponds to the period in which the main effects are felt
Post: corresponds to the response and recovery phase

[Figure: number of tweets per day in all datasets: Joplin (left), Sandy (center), and Oklahoma (right).]

LABELING TASK SELECTION

Experiment: Are crisis-specific labels necessary?

Manual labeling (using CrowdFlower)

Dataset    Phase-S1  Phase-S2  Phase-S3  Phase-S4
Joplin     2,000     1,000     1,000     1,000
Sandy      2,000     1,000     1,000     1,000
Oklahoma   2,000     1,000     1,000     N/A

Classification accuracy in various transfer scenarios

Train    Test      AUC
Joplin   Sandy     0.52
Joplin   Oklahoma  0.56
Sandy    Oklahoma  0.53

* AUC 0.5 represents a random classifier
LABELING TASK SELECTION

Experiment: Is de-duplication necessary?

Train phase   Train size  Test phase    Test size  AUC (without de-duplication)  AUC (with de-duplication)
S1 (pre)      1,500       S1 (pre)      500        0.78                          0.74
S1 (pre)      500         S1 (pre)      500        0.73                          0.72
S2 (impact)   500         S2 (impact)   500        0.80                          0.72
S3 (post)     500         S3 (post)     500        0.79                          0.73
S4 (post')    500         S4 (post')    500        0.70                          0.64

•  29-74% of tweets are re-tweets & 60-75% are near duplicates
•  Duplication causes an artificial increase in accuracy
•  De-duplication is necessary to reduce classifier bias; otherwise the classifier learns only a few concepts
•  De-duplication is necessary to improve the workers' experience

[Rogstadius et al. 2011]
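
One simple way to drop re-tweets and near duplicates before labeling is sketched below. It assumes token-set Jaccard similarity after stripping retweet markers, mentions, and URLs; the functions `normalize`, `jaccard`, and `deduplicate` and the 0.7 threshold are hypothetical choices for illustration, not the method used in the experiment.

```python
import re


def normalize(text):
    """Reduce a tweet to a set of lowercase tokens, ignoring RT markers, mentions, URLs."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)  # drop URLs
    text = re.sub(r"\brt\b|@\w+", " ", text)   # drop retweet markers and mentions
    return frozenset(re.findall(r"[a-z0-9#]+", text))


def jaccard(a, b):
    """Jaccard similarity of two token sets (1.0 when both are empty)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def deduplicate(tweets, threshold=0.7):
    """Keep a tweet only if it is not too similar to any tweet already kept."""
    kept, kept_tokens = [], []
    for tweet in tweets:
        tokens = normalize(tweet)
        if all(jaccard(tokens, seen) < threshold for seen in kept_tokens):
            kept.append(tweet)
            kept_tokens.append(tokens)
    return kept


sample = [
    "Tornado hits Joplin, hospital damaged http://t.co/abc",
    "RT @news: Tornado hits Joplin, hospital damaged http://t.co/xyz",
    "Red Cross accepting donations for Joplin victims",
]
for tweet in deduplicate(sample):
    print(tweet)
```

This pairwise comparison is quadratic in the number of kept tweets, so a stream at the volumes quoted earlier would need an indexed or hashed variant.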

LABELING TASK SELECTION

Experiment: Which approach: passive vs. active learning?

[Plots for JOPLIN, SANDY, and OKLAHOMA; series S1, S2, S3, S4.]

LABELING TASK SELECTION

•  Are crisis-specific labels necessary? [YES]
•  Is de-duplication necessary? [YES]
•  Which approach to follow: passive vs. active learning? [Active learning]

Now we know WHICH tasks to select.
But we still don't know WHEN to label them.
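
The difference between the two selection approaches can be sketched as follows. Passive learning sends a random sample to the annotators, while active learning picks the items the current model is least sure about. The slides do not say which active-learning criterion was used, so uncertainty sampling (probability closest to 0.5) is an assumption here, and the function names and probabilities are made up for the example.

```python
import random


def select_passive(pool, k, seed=0):
    """Passive learning: pick k unlabeled items uniformly at random."""
    rng = random.Random(seed)
    return rng.sample(pool, k)


def select_active(pool, predict_proba, k):
    """Active learning (uncertainty sampling): pick the k items whose
    predicted probability of the positive class is closest to 0.5."""
    return sorted(pool, key=lambda item: abs(predict_proba(item) - 0.5))[:k]


# Hypothetical model confidence for five unlabeled tweets.
probabilities = {"a": 0.95, "b": 0.52, "c": 0.10, "d": 0.45, "e": 0.70}
pool = list(probabilities)

print(select_active(pool, probabilities.get, 2))
print(select_passive(pool, 2))
```

Items "b" and "d" are chosen by the active strategy because the model is nearly undecided about them, so a human label there is expected to change the model the most.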

LABELING TASK SCHEDULING

•  All-at-once labeling
•  Obtain 1,500 labels on S1 and use all for training

•  Cumulative labeling
•  Obtain 500 labels in each of S1, S2, and S3 and train on the labels available up to each phase

•  Independent labeling
•  Obtain 500 labels in each of S1, S2, and S3 and use only the most recent labels for training, discarding the old ones
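
The three scheduling strategies differ only in which labels end up in the training set at a given phase. A minimal sketch, assuming labels are stored per phase in a dictionary; the function `training_set`, the strategy names as strings, and the placeholder label IDs are hypothetical.

```python
PHASES = ["S1", "S2", "S3"]


def training_set(strategy, labels_by_phase, current_phase):
    """Return the labels used for training at current_phase under a strategy."""
    up_to_now = PHASES[: PHASES.index(current_phase) + 1]
    if strategy == "all-at-once":
        # Everything was labeled during S1; later phases add nothing.
        return list(labels_by_phase["S1"])
    if strategy == "cumulative":
        # Train on all labels collected up to and including this phase.
        return [x for phase in up_to_now for x in labels_by_phase[phase]]
    if strategy == "independent":
        # Train only on the most recent labels, discarding older ones.
        return list(labels_by_phase[current_phase])
    raise ValueError(f"unknown strategy: {strategy}")


# Placeholder label IDs matching the budgets on the slide.
all_at_once = {"S1": [f"s1-{i}" for i in range(1500)], "S2": [], "S3": []}
gradual = {p: [f"{p.lower()}-{i}" for i in range(500)] for p in PHASES}

for phase in PHASES:
    print(
        phase,
        len(training_set("all-at-once", all_at_once, phase)),
        len(training_set("cumulative", gradual, phase)),
        len(training_set("independent", gradual, phase)),
    )
```

All three strategies spend the same total budget of 1,500 labels; they differ in when the labels are collected and how long old labels stay in use.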

LABELING TASK SCHEDULING

Experiment: Which labeling strategy to follow?

[Plots for JOPLIN, SANDY, and OKLAHOMA; panel labels: Informative, Informative (50%), Donations.]

CONCLUSION & FUTURE WORK

•  Task selection
   •  De-duplication is necessary
   •  Active learning approach must be employed

•  Task scheduling
   •  All-at-once for small-scale crises
   •  Incremental for medium-scale crises (needs tests)

Future work:

•  Adaptive collection
•  Post-processing/filtering
•  More features and learning schemes


Thank you!
