Presentation by Sven Schaust, Max Walther and Michael Kaisser on the topic "Prepare, Manage, and Understand Crisis Situations using Social Media Analytics" at ISCRAM 2013
1. Prepare, Manage, and Understand Crisis
Situations using Social Media Analytics
Sven Schaust, Max Walther and Michael Kaisser
AGT International, Germany
ISCRAM 2013 in Baden-Baden – May 12-15, 2013
2. 2
Outline
1. Introduction & Context
• Social Media Analysis in a C2 Center
2. The “Avalanche” event detection approach
• Identify posting “hot spots”
• Evaluate post clusters with Machine Learning approach
3. Evaluation
4. Outlook
3. 3
Urban Management & Public Safety
• Cities today are complex and need to be organized
• Administration is responsible for keeping population safe
• emergency services
• health services
• fire fighters
• police
Command & Control Center
4. 4
Urban Management & Public Safety
Why is Social Media relevant in this context?
?
5. 5
Urban Management & Public Safety
Why is Social Media relevant in this context?
“There's a plane in
the Hudson. I'm on
the ferry going to
pick up the people.
Crazy”
6. 6
Urban Management & Public Safety
Why is Social Media relevant in this context?
“Damn, what a hell!!! 1.4 million
people on that site! #loveparade” (translated from Dutch)
7. 7
Urban Management & Public Safety
Why is Social Media relevant in this context?
“#Hoboken is on fire.
Building above Hoboken
Farm Corporation at 300
Washington is all smoked
out”
Social Media can help create a situational awareness
picture
8. 8
Context: Social Media in a C2 Center
1. Automatic detection of breaking events
• detect, classify and display events to operator
• accidents, fires, violence, demonstrations
2. Monitoring of ongoing situations
• improve USAP by focused Social Media Analytics
• possibly contact owner of posts for more information
3. Post Incident reporting
• automatic report generation
• interactive investigation support
9. 9
What do people tweet during disasters?
Hurricane Sandy (NYC Region, October 2012)
• Evaluated Tweets for period 10/25 – 10/31
• Total number of Tweets per day: approx. 3 million
• Checked for Tweets about “sandy”, “hurricane”, “storm”,
“evacuation”, “flood”, “building”, “collapsed”, “power”, “outage”,
“fire”.
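The keyword check above could be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: only the keyword list comes from the slide, while the function name `matches_keywords` and the tokenization rule are assumptions.

```python
import re

# Keyword list taken from the slide; everything else is an illustrative assumption.
KEYWORDS = {"sandy", "hurricane", "storm", "evacuation", "flood",
            "building", "collapsed", "power", "outage", "fire"}

def matches_keywords(text, keywords=KEYWORDS):
    """Return the keywords that occur as whole lowercase tokens in text."""
    tokens = set(re.findall(r"[a-z0-9]+", text.lower()))
    return tokens & set(keywords)

tweets = [
    "Be careful on West 57th St as there is a crane dangling from the rooftop! "
    "#HurricaneSandy #Sandy #NYC",
    "i'm happy and i hope u feel the same too.",
]

# Keep only tweets that mention at least one keyword.
hits = [t for t in tweets if matches_keywords(t)]
```

Exact token matching is deliberately simple: it misses inflected forms such as "flooding" and fused hashtags such as "#HurricaneSandy" (matched here only through the separate "#Sandy"), so a real filter would likely need stemming or substring matching.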
Examples of Events (semi-automatic evaluation)
• A crane collapsing on a construction site near 57th street
• A part of an apartment house collapsing in Borough Park,
Brooklyn
• A fire in Breezy Point, Queens
• Flooded tunnels, streets, apartments in various areas
• Power outages in various areas
10. 10
Crane Event
Overall, 950 tweets were found for Oct. 29th
• 29.10.2012 18:41:56; Wow. Right down the street from me.
#Sandy-damaged crane on new 57th St. hi-rise dangling in
wind.
• 29.10.2012 18:46:20; Be careful on West 57th St as there is
a crane dangling from the rooftop! #HurricaneSandy #Sandy
#NYC
• 29.10.2012 18:50:31; From my window I can see the top of a
crane hanging off, 60 stories up...not good news if that comes
off #Sandy
• 29.10.2012 18:57:17; Curious to see what happens with the
dangling crane on 57th between 6th and 7th Staying clear of
that area for a while #HurricaneSandy
11. 11
Breezy Point Fire
Overall, 1406 tweets were found for Oct. 30th
• 30 Oct 2012 01:51:11; A TV news crew covering the storm is
trapped by rising water and nearby fire @ 147 Oceanside in
Breezy Point - pls RT #sandy #fdny #nypd
• 30 Oct 2012 03:19:35; There are several fires burning in
Breezy Point and Broad Channel, but the FDNY cannot reach
them because of the flooding. #sandy
• 30 Oct 2012 06:00:58; Fire moving 130st street north and
west toward Cronstant Ave in Rockaway. Fire at 209 street in
Breezy. FDNY cannot get to Breezy. #sandy
• 30 Oct 2012 22:16:16; Never seen anything like this in my
life. #sandy @ Breezy Point, NY http://t.co/
18. 18
How is it done?
Two-step approach:
1. Identify locations with high tweet activity
• Collect geo-spatial tweet clusters
2. Evaluate clusters with a Machine Learning approach
• Do these clusters constitute a real-world event
that the tweeters are witnessing first-hand?
Work in Progress:
3. Classify events according to type
20. 20
Machine Learning – What is the task?
• Suspicious package in #GrandCentral #NYC #bomb threat possibility
not sure?? http://t.co/VwU7SP3X
• Suspicious package found in Grand Central Station... the 456
train..the trains are closed !! [pic]: http://t.co/9YPki4k2
• Something happened in the #456 #trainstation in #GrandCentral
#NYC http://t.co/GGKvQura
• Accident on the #456train in #midtown #NYC http://t.co/fj2mJJmf
vs.
• RT @refinery29: This image of Madeleine Albright playing the drums
will be the best thing you'll see today: http://t.co/rGwQ5RdG
• «@_PrettyPoison Guess ill fill out more job apps today» make punna
fill out some 2!
• The Glamour & Glitz at the 2012 Emmy' s that we loved!
http://t.co/CiTFszfL
• @IszwanieSyahira: i'm happy and i hope u feel the same too.
weeeee ~.~
• How to prepare yourself for Friday's apocalypse http://cnet.co/lPU
We need to automatically determine which of the tweet clusters
(tweets issued close to each other in a short time frame)
represent real-world events and which are just random chatter.
21. 21
Architecture
• We look for geo-spatial clusters of tweets (e.g. 3 or more tweets in a 200m radius, posted within 30 mins)
• These become “event candidates”
• Event candidates are evaluated with a Machine Learning scheme.
• We currently use C4.5 decision trees.
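The candidate search described on this slide can be sketched as below. This is a minimal illustration under stated assumptions, not the Avalanche implementation: the thresholds (3 tweets, 200 m, 30 minutes) come from the slide, while the function names, the seed-based grouping strategy, and the sample coordinates and timestamps are made up for the example.

```python
import math
from datetime import datetime, timedelta

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    r = 6371000.0  # mean Earth radius in metres
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def event_candidates(tweets, radius_m=200.0, window=timedelta(minutes=30), min_tweets=3):
    """Group tweets (dicts with 'lat', 'lon', 'time') around a seed tweet.

    A group becomes an event candidate if it holds at least min_tweets tweets
    within radius_m of the seed, posted no later than window after the seed.
    """
    tweets = sorted(tweets, key=lambda t: t["time"])
    candidates, used = [], set()
    for i, seed in enumerate(tweets):
        if i in used:
            continue
        members = [i]
        for j in range(i + 1, len(tweets)):
            other = tweets[j]
            if other["time"] - seed["time"] > window:
                break  # sorted by time, so all later tweets are too late as well
            if j in used:
                continue
            if haversine_m(seed["lat"], seed["lon"], other["lat"], other["lon"]) <= radius_m:
                members.append(j)
        if len(members) >= min_tweets:
            used.update(members)
            candidates.append([tweets[k] for k in members])
    return candidates

# Made-up sample data: three nearby tweets, one far away, one two hours later.
t0 = datetime(2012, 10, 29, 18, 40)
tweets = [
    {"lat": 40.7527, "lon": -73.9772, "time": t0},
    {"lat": 40.7530, "lon": -73.9770, "time": t0 + timedelta(minutes=5)},
    {"lat": 40.7000, "lon": -74.0100, "time": t0 + timedelta(minutes=10)},
    {"lat": 40.7525, "lon": -73.9775, "time": t0 + timedelta(minutes=12)},
    {"lat": 40.7527, "lon": -73.9772, "time": t0 + timedelta(hours=2)},
]
candidates = event_candidates(tweets)
```

Measuring the radius from the first tweet rather than from a cluster centre keeps the sketch short; a production system would more likely use a spatial index or a density-based clustering method, since the pairwise scan here is quadratic in the worst case.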
22. 22
Machine Learning - Features
Tweet cluster:
• Suspicious package in #GrandCentral #NYC #bomb threat possibility not sure?? http://t.co/VwU7SP3X
• Suspicious package found in Grand Central Station... the 456 train..the trains are closed !! [pic]: http://t.co/9YPki4k2
• Something happened in the #456 #trainstation in #GrandCentral #NYC http://t.co/GGKvQura
• Accident on the #456train in #midtown #NYC http://t.co/fj2mJJmf
25. 25
(Somewhat simplified) Summary
If there are several tweets …
• from roughly the same location
• at roughly the same time
• from different users
• that nevertheless use the same words
… chances are good that we have detected an
event.
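Two of the signals in this summary, "different users" and "the same words", can be turned into numeric features along the following lines. This is a hedged sketch only: the feature names, the stopword list, the `cluster_features` helper and the user ids are hypothetical, and the actual feature set used with the decision trees is not given in these slides.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not from the slides).
STOPWORDS = {"the", "in", "on", "a", "an", "of", "to", "is", "are", "and"}

def tokenize(text):
    """Return the set of lowercase word tokens in text, ignoring URLs and stopwords."""
    text = re.sub(r"https?://\S+", " ", text.lower())
    return {w for w in re.findall(r"[a-z0-9]+", text) if w not in STOPWORDS}

def cluster_features(cluster):
    """cluster: list of (user, text) pairs. Returns a dict of hypothetical features."""
    users = {user for user, _ in cluster}
    token_sets = [tokenize(text) for _, text in cluster]
    # Count in how many tweets of the cluster each word appears.
    doc_freq = Counter(w for tokens in token_sets for w in tokens)
    shared = {w for w, n in doc_freq.items() if n >= 2}
    return {
        "num_tweets": len(cluster),
        "num_users": len(users),
        "shared_word_ratio": len(shared) / len(doc_freq) if doc_freq else 0.0,
    }

# Tweet texts are from the slides; the user ids are made up.
event = cluster_features([
    ("u1", "Suspicious package in #GrandCentral #NYC #bomb threat possibility not sure?? http://t.co/VwU7SP3X"),
    ("u2", "Suspicious package found in Grand Central Station... the 456 train..the trains are closed !!"),
    ("u3", "Something happened in the #456 #trainstation in #GrandCentral #NYC"),
    ("u4", "Accident on the #456train in #midtown #NYC"),
])
chatter = cluster_features([
    ("u5", "Guess ill fill out more job apps today"),
    ("u6", "The Glamour & Glitz at the 2012 Emmy' s that we loved!"),
    ("u7", "i'm happy and i hope u feel the same too."),
])
```

In this example the Grand Central cluster shares words such as "suspicious", "package" and "nyc" across tweets from four users, while the chatter cluster shares none, which is the kind of contrast a classifier could learn from.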
26. 26
Outlook – what’s left to do?
Derive more coordinates
• from shared pictures
• from toponyms in posts
• use image sharing sites directly
Make use of posts without coordinates
• and add them to already existing clusters
Explore real-time TF-IDF
• to get rid of the Kardashians & Beliebers
Evaluate system with real-world data
• because recall numbers are currently somewhat misleading
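The real-time TF-IDF idea from this outlook might look roughly like the sketch below: document frequencies are updated as tweets stream in, so terms that are popular everywhere (celebrity chatter) receive a low weight when a cluster is scored. The class `StreamingTfIdf`, the smoothing formula and the sample texts are assumptions for illustration; the slides do not describe how the authors intend to implement this.

```python
import math
import re
from collections import Counter

class StreamingTfIdf:
    """Keeps running document frequencies over a tweet stream (hypothetical helper)."""

    def __init__(self):
        self.num_docs = 0
        self.doc_freq = Counter()

    @staticmethod
    def tokens(text):
        return re.findall(r"[a-z0-9]+", text.lower())

    def update(self, text):
        """Add one background tweet to the running statistics."""
        self.num_docs += 1
        self.doc_freq.update(set(self.tokens(text)))

    def idf(self, term):
        # Smoothed inverse document frequency; unseen terms get the highest weight.
        return math.log((1 + self.num_docs) / (1 + self.doc_freq[term])) + 1.0

    def score(self, cluster_texts):
        """Return term -> TF-IDF weight for the tweets of one cluster."""
        tf = Counter(t for text in cluster_texts for t in self.tokens(text))
        return {term: count * self.idf(term) for term, count in tf.items()}

model = StreamingTfIdf()
for background in ["kardashian news today", "kardashian again", "love kardashian show"]:
    model.update(background)

scores = model.score(["crane dangling kardashian"])
```

Here "crane" outweighs "kardashian" although both occur once in the cluster, because "kardashian" is common in the background stream. A long-running system would also need to age out old counts, for example with a sliding window or decay, which this sketch omits.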