Extracting Important
Information from Social
Network Stream During Crisis
Avijit Paul
ARC Centre of Excellence for Creativ...
• Brief overview (2 minutes)
• Concept of the project (3 minutes)

• Experiment and result (10 minutes)
The first 24 hours are the most crucial after a
natural disaster

IMPACT AND RESPONSE TIMELINE
During first 24 hours getting th...
TARGET AUDIENCE
Currently used tool
Source: Yin, J., Lampert, A., Cameron, M., Robinson, B., & Power, R. (2012). Using social media to enh...
WHAT AM I SLICING FOR?

1. Filter noise
2. Find actionable information
3. Prioritise information

Hendrickson, S. (2012a)....
SEPARATING TWEETS
No  Category  Reference                             Weight *
1   RT        Retweet of a previously stored tweet  -10
2   @reply    Communic...
WHICH COMBINATION?
@REPLY + HASHTAG = 78
RT + LOCATION = 75
NAMED ENTITY + TEMPORAL INFORMATION + KEYWORD = 180
METHOD & DATA
Dataset: QLDflood 2011 (Queensland Flood)
Method:
Step 1: Read 4000+ tweets and identify tweets which I thou...
METHOD & DATA (cont)
Step 2: I reread those 1000 tweets to understand why I had found them important.
I found having Name of pl...
METHOD & DATA (cont)
Step 3: Do Bag of words feature extraction in Excel to identify if the tweets
have any of the above m...
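The bag-of-words check in Step 3 was done in Excel. As a rough sketch only, the same idea can be written in Python: each tweet is reduced to three 0/1 indicators (image, named entity, keyword). The lookup lists below are hypothetical and are taken only from the example tweets on the slides, and treating twitpic.com links as images is an assumption; the function name `extract_features` is also illustrative.

```python
import re

# Hypothetical lookup lists, drawn only from the example tweets in the slides.
PLACE_NAMES = ["brisbane", "ipswich", "south bank", "myer centre", "goodna"]
KEYWORDS = {"sandbagged", "evacuations", "underwater", "floating"}
IMAGE_HOSTS = ("twitpic.com",)  # assumption: links to this host are images


def extract_features(tweet: str) -> dict:
    """Return 0/1 indicators for image, named entity and keyword."""
    text = tweet.lower()
    urls = re.findall(r"https?://\S+", text)
    # Remove URLs so that words inside a link are not counted as keywords.
    body = re.sub(r"https?://\S+", " ", text)
    tokens = set(re.findall(r"[a-z0-9']+", body))  # bag of unigrams
    return {
        "image": int(any(host in url for url in urls for host in IMAGE_HOSTS)),
        "named_entity": int(any(place in body for place in PLACE_NAMES)),
        "keyword": int(bool(tokens & KEYWORDS)),
    }


features = extract_features(
    "The Myer Centre entrance now sandbagged in Brisbane CBD. "
    "#QLDfloods http://twitpic.com/3p91v7"
)
print(features)
```

A simple place-name list like this will miss places it has not seen, which is the named-place problem raised later under CHALLENGES.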
[Figure panels: Formula 1, Formula 3, Formula 2, Formula 4]
Formula 1 (Image * 5 + Named Entity * 10 + keyword * 2)
Formula 2 (Image * 10 + Named Entity * 5 + keyword * 2)
South Bank F...
Formula 3 (Image * 2 + Named Entity * 5 + keyword * 10)
Does anyone else find the name of the cafe floating down Brisbane ...
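The four weightings on the formula slides differ only in how much each indicator counts. The sketch below, under stated assumptions, shows how a different weighting changes the ranking of the same tweets. The weights are copied from the slides (Formula 1 and Formula 2 are assigned in the order they are listed); the sample tweets, their indicator values, and the names `score` and `rank` are hypothetical.

```python
# Weights as (image, named entity, keyword), copied from the formula slides.
FORMULAS = {
    "Formula 1": (5, 10, 2),
    "Formula 2": (10, 5, 2),
    "Formula 3": (2, 5, 10),
    "Formula 4": (10, 2, 5),
}


def score(features, weights):
    """Multiply each 0/1 indicator by its weight and sum the results."""
    image_w, entity_w, keyword_w = weights
    return (features["image"] * image_w
            + features["named_entity"] * entity_w
            + features["keyword"] * keyword_w)


def rank(tweets, formula):
    """tweets is a list of (text, features) pairs; highest score first."""
    weights = FORMULAS[formula]
    return sorted(tweets, key=lambda t: score(t[1], weights), reverse=True)


# Hypothetical hand-labelled indicators, for illustration only.
sample = [
    ("photo of a named place", {"image": 1, "named_entity": 1, "keyword": 0}),
    ("keyword-only warning", {"image": 0, "named_entity": 0, "keyword": 1}),
    ("place plus keyword", {"image": 0, "named_entity": 1, "keyword": 1}),
]
for name in FORMULAS:
    print(name, [text for text, _ in rank(sample, name)])
```

With these made-up indicators, Formula 3 puts the keyword-bearing tweets first, while the other three put the photo of a named place first.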
FILTERING & PRIORITIZING
1. Combining other elements (names of places, images) with keyword (unigram) is better than ident...
CHALLENGES
1. Finding named places is a hard and well-known problem

2. Images have challenges too
#QLDFLOOD TOP CATEGORIES URLS
[Chart: 2011 and 2013]
SOU...
8 pm uploaded
10 pm uploaded
9 pm uploaded
10 pm uploaded
Avijit Paul
ARC Centre of Excellence for Creative Industries and
Innovation
Queensland University of Technology
a1.paul @ ...
Paper Presented in IR14 (Internet research conference).

Published in: Education, Technology, Business
  • Target is to address the issue with Social Networks

    1. Extracting Important Information from Social Network Stream During Crisis. Avijit Paul, ARC Centre of Excellence for Creative Industries and Innovation, Queensland University of Technology. a1.paul @ qut.edu.au @cdtavijit http://mappingonlinepublics.net/
    2. • Brief overview (2 minutes) • Concept of the project (3 minutes) • Experiment and result (10 minutes)
    3. The first 24 hours are the most crucial after a natural disaster. IMPACT AND RESPONSE TIMELINE: During the first 24 hours, getting information is hardest, which makes it difficult to act. TARGET DISASTER MANAGEMENT: With increased and verified information it is possible to reduce community harm and save lives. Source: Department of Community Safety, Queensland Govt. 2011
    4. TARGET AUDIENCE
    5. Currently used tool. Source: Yin, J., Lampert, A., Cameron, M., Robinson, B., & Power, R. (2012). Using social media to enhance emergency situation awareness.
    6. WHAT AM I SLICING FOR? 1. Filter noise 2. Find actionable information 3. Prioritise information. Hendrickson, S. (2012a). Gnip The Social Cocktail, Part 2 Expected vs. Unexpected Events. Retrieved from http://blog.gnip.com/expected-vs-unexpected-events-in-social-media/
    7. SEPARATING TWEETS
       No  Category       Reference                                        Weight *
       1   RT             Retweet of a previously stored tweet             -10
       2   @reply         Communication received from a high profile user  5
       4   Named entity   Name of a place not identified before            10
       5   URL            Uses a link to a news site                       2
       6   Instagram URL  Is that a cat photo?                             +15
       9   Appeal word    Included word “help”                             10
       * Weights are not final
    8. WHICH COMBINATION? @REPLY + HASHTAG = 78; RT + LOCATION = 75; NAMED ENTITY + TEMPORAL INFORMATION + KEYWORD = 180
    9. METHOD & DATA. Dataset: QLDflood 2011 (Queensland Flood). Method: Step 1: Read 4000+ tweets and identify which tweets I thought were important or not important for emergency services. I found around 1000 tweets that had some importance.
       0: Our hearts go out to everyone affect by the #qldfloods
       1: The Myer Centre entrance now sandbagged in Brisbane CBD. #QLDfloods http://twitpic.com/3p91v7
    10. METHOD & DATA (cont). Step 2: I reread those 1000 tweets to understand why I had found them important. I found that names of places (Named Entity), Image and Keyword (unigram) were more important to me.
       Amazing photo: South Bank car basement full to the brim http://twitpic.com/3p8hg3 #QLDfloods #fb
       3000 homes now underwater in Ipswich. Evacuations now include Bundamba Goodna Redbank Bellbird Park. #qldfloods
       Another pontoon floating down the Brisbane River #qldfloods http://twitpic.com/3p8g9t
    11. METHOD & DATA (cont). Step 3: Do bag-of-words feature extraction in Excel to identify whether the tweets have any of the above-mentioned words. Step 4: Multiply the presence of each by a score.
    12. [Figure panels: Formula 1, Formula 3, Formula 2, Formula 4]
    13. Formula 1 (Image * 5 + Named Entity * 10 + keyword * 2) and Formula 2 (Image * 10 + Named Entity * 5 + keyword * 2)
       South Bank Ferris Wheel #bnefloods #qldfloods #qldfloodsmap http://twitpic.com/3p8olx
       Whoa @bazmeister: Eagle Street Pier Brisbane.. #qldfloods http://twitpic.com/3p79rx #BrisVenice
    14. Formula 3 (Image * 2 + Named Entity * 5 + keyword * 10)
       Does anyone else find the name of the cafe floating down Brisbane river is "Drift"? http://goo.gl/N8PLT #qldfloods
       Concerns that a power generator in QUT Gardens Point has NOT been turned off - fears of electrocution in water. STAY AWAY #qldfloods
       Nine News: The iconic Drift Restaurant located on the Brisbane River has broken off and has now sunk so sad. #qldfloods
       Formula 4 (Image * 10 + Named Entity * 2 + keyword * 5)
       i'm going to twitpic some photos of the devastion the floods are having on people and families in queensland #prayforaustralia #qldfloods
       HOLY WOW Myer Centre flood preo. This is 20m from my work! #qldfloods #bnefloods http://twitpic.com/3p8wjt
       Riverside pathway at the cnr of the CBD botanic gardens #qldfloods http://twitpic.com/3p904f
    15. FILTERING & PRIORITIZING 1. Combining other elements (names of places, images) with keyword (unigram) is better than identifying based on keyword alone 2. Emphasizing named entity (places) > images > keywords (unigram)
    16. CHALLENGES 1. Finding named places is a hard and well-known problem 2. Images have challenges too
    17. #QLDFLOOD TOP CATEGORIES URLS [Chart: 2011 and 2013] SOURCE: Social Media in Crisis Communication (Burgess, Bruns & Paul) AOIR 2013
    18. 8 pm uploaded; 10 pm uploaded; 9 pm uploaded; 10 pm uploaded
    19. Avijit Paul, ARC Centre of Excellence for Creative Industries and Innovation, Queensland University of Technology. a1.paul @ qut.edu.au @cdtavijit http://mappingonlinepublics.net/
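The category weights on slide 7 can be read as a per-tweet score. The slides do not say how weights combine when a tweet falls into several categories, so the sketch below simply adds them; that additive rule is an assumption, as are the dictionary keys and the function name `tweet_weight`. The numbers are copied from the slide, which notes that they are not final.

```python
# Weights copied from the "Separating tweets" slide (marked there as not final).
# The key names are hypothetical labels for the slide's categories.
CATEGORY_WEIGHTS = {
    "rt": -10,                # retweet of a previously stored tweet
    "reply_high_profile": 5,  # communication received from a high profile user
    "named_entity": 10,       # name of a place not identified before
    "news_url": 2,            # link to a news site
    "instagram_url": 15,      # Instagram link
    "appeal_word": 10,        # includes the word "help"
}


def tweet_weight(categories):
    """Sum the weights of the categories a tweet matches (additive rule assumed)."""
    return sum(CATEGORY_WEIGHTS[c] for c in categories)


print(tweet_weight({"named_entity", "appeal_word"}))
print(tweet_weight({"rt", "news_url"}))
```

Under this reading, a retweet that only links to a news site ends up with a negative score, which matches the stated goal of filtering noise before prioritising.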
