ISCRAM 2013: Tweet4act: Using Incident-Specific Profiles for Classifying Crisis-Related Messages

Authors: Soudip Roy Chowdhury, Muhammad Imran, Muhammad Rizwan Asghar,
Sihem Amer-Yahia, Carlos Castillo

Published in: Technology, Business
  • Social media empowers individuals, providing them a platform from which to share opinions, experiences and information from anywhere at any time.

    1. Tweet4act: Using Incident-Specific Profiles for Classifying Crisis-Related Messages
        Soudip Roy Chowdhury, Muhammad Imran, Muhammad Rizwan Asghar, Sihem Amer-Yahia, Carlos Castillo
    2. Disaster & Social Media
        Disaster Strikes, Social Media Responds
    3. Virtual Collaboration, Information Sharing
        • Valuable information
        • Contribute to situational awareness
        • Highly useful if analyzed in a timely and effective manner (Starbird et al., 2010; Latonero and Shklovski, 2010)
    4. Social Media Response to Disaster Phases
        Before, During, After
    5. Disaster Management, Crisis Informatics
        - Caution, warnings
        - Alerts etc.
        - Damage
        - Casualties etc.
        - Request for help
        - Donations etc.
        • The main goals of our research:
            1. Identify messages related to an incident.
            2. Classify incident messages with the corresponding period (PRE, DURING, POST).
    6. Datasets & Examples
            1. Joplin Tornado on May 22, 2011
            2. Nesat Typhoon in the Philippines on Sep 27, 2011
            3. Haiti Earthquake on Jan 12, 2010
        • [PRE] New #tropical storm forms in the West #Pacific. #Nesat may hit the #Philippines & #China as a #typhoon next week
        • [DURING] @Yahoo News: Powerful #typhoon with winds up to 106 mph makes landfall in #Philippines as 100,000 odered to fless homes
        • [POST] News5 Action center is now accepting donations for the victims of Typhoon “pedring. Drop boxes are located @ TV5 Office :)
    7. Tweet4Act System
        • Collection -> Filtering -> Period Classification
    8. 1. Filtering Process
        • Normalization: remove “RT @username” and “@username” prefixes and remove duplicate messages
        • Apply the k-medoid method with the Manhattan distance between medoids and messages in each cluster
        • Discard all clusters whose silhouette coefficient is zero or negative
        • From each cluster, select the fraction m of messages closest to the medoid
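The filtering steps above can be sketched in Python. This is only an illustration under stated assumptions, not the authors' implementation: the clusters are assumed to come from a k-medoid run that is not shown, messages are compared as bag-of-words count vectors, at least two clusters exist, and all function names (`normalize`, `manhattan`, `silhouette`, `closest_to_medoid`, `filter_clusters`) are hypothetical.

```python
import re
from collections import Counter

# Matches a leading "RT @username" or "@username" (optionally followed by ":").
PREFIX = re.compile(r"^(?:RT\s+)?@\w+:?\s*", re.IGNORECASE)

def normalize(messages):
    """Strip retweet/mention prefixes, then drop duplicate messages."""
    seen, kept = set(), []
    for message in messages:
        text = message.strip()
        while PREFIX.match(text):  # handles chains like "RT @a: @b ..."
            text = PREFIX.sub("", text, count=1)
        if text and text.lower() not in seen:
            seen.add(text.lower())
            kept.append(text)
    return kept

def manhattan(a, b):
    """Manhattan (L1) distance between bag-of-words count vectors (an assumed representation)."""
    ca, cb = Counter(a.lower().split()), Counter(b.lower().split())
    return sum(abs(ca[w] - cb[w]) for w in ca.keys() | cb.keys())

def mean_distance(message, group):
    return sum(manhattan(message, other) for other in group) / len(group)

def silhouette(cluster, other_clusters):
    """Mean silhouette coefficient of one cluster against the remaining clusters."""
    scores = []
    for i, message in enumerate(cluster):
        rest = cluster[:i] + cluster[i + 1:]
        if not rest:  # singleton clusters score 0 by convention
            scores.append(0.0)
            continue
        a = mean_distance(message, rest)
        b = min(mean_distance(message, other) for other in other_clusters)
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)

def closest_to_medoid(cluster, m):
    """Keep the fraction m (0 < m <= 1) of messages nearest to the cluster medoid."""
    medoid = min(cluster, key=lambda c: sum(manhattan(c, o) for o in cluster))
    keep = max(1, int(m * len(cluster)))
    return sorted(cluster, key=lambda msg: manhattan(msg, medoid))[:keep]

def filter_clusters(clusters, m):
    """Discard clusters with silhouette <= 0, then sample around each medoid."""
    selected = []
    for i, cluster in enumerate(clusters):
        others = clusters[:i] + clusters[i + 1:]
        if silhouette(cluster, others) > 0:
            selected.extend(closest_to_medoid(cluster, m))
    return selected
```

For example, `filter_clusters([normalize(tweets_a), normalize(tweets_b)], 0.5)` would keep roughly half of each well-separated cluster; `tweets_a` and `tweets_b` are placeholder inputs.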
    9. Filtering Process Validation
        • Using CrowdFlower crowdsourcing platform
    10. 2. Dictionary-Based Period Classification
        • Most frequent words across datasets
        • “warning” & “alert” typically found in PRE
        • “now”, “sweeps” etc. typically found in DURING
        • “aftermath”, “donate” etc. typically found in POST
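A dictionary lookup of this kind might look like the following sketch. The word lists contain only the examples quoted on the slide; the real dictionaries are larger and are not given here. The names `PERIOD_WORDS` and `dictionary_period`, and the rule of returning `None` when no dictionary word occurs, are assumptions made for illustration.

```python
import re

# Example words taken from the slide; the full dictionaries are not shown there.
PERIOD_WORDS = {
    "PRE": {"warning", "alert"},
    "DURING": {"now", "sweeps"},
    "POST": {"aftermath", "donate"},
}

def dictionary_period(message):
    """Return the period whose dictionary words occur most often, or None if none occur."""
    tokens = re.findall(r"[a-z]+", message.lower())
    counts = {
        period: sum(token in words for token in tokens)
        for period, words in PERIOD_WORDS.items()
    }
    # max() keeps the first maximum, so ties resolve in PRE, DURING, POST order.
    best = max(counts, key=counts.get)
    return best if counts[best] > 0 else None
```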
    11. 3. NLP-Based Period Classification
        • Tense of verbs can help identify period (A. Iyengar et al., 2011)
        • POS tagging:
            1. Dictionary-based verbs get +1 (ignore below)
            2. Auxiliary verbs get +1 (e.g., could-PRE, are-DURING, did-POST)
            3. If a main verb is in future/present/past tense, add +0.5 to the PRE/DURING/POST period, respectively
        • Ties: PRE > DURING > POST
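The scoring rules above can be sketched as follows. Several things here are assumptions rather than details from the slides: the POS tagger is not shown, so each verb arrives already paired with a tense label ("future", "present" or "past"); "ignore below" is read as "a dictionary verb skips rules 2 and 3"; and the small `DICTIONARY_VERBS` and `AUX_VERBS` tables hold only the examples quoted on the slides. All names are hypothetical.

```python
PERIODS = ("PRE", "DURING", "POST")  # order doubles as the tie-break priority

# Illustrative entries only, taken from the examples on the slides.
DICTIONARY_VERBS = {"alert": "PRE", "sweeps": "DURING", "donate": "POST"}
AUX_VERBS = {"could": "PRE", "are": "DURING", "did": "POST"}
TENSE_TO_PERIOD = {"future": "PRE", "present": "DURING", "past": "POST"}

def classify_period(verbs):
    """Score (verb, tense) pairs and return the best period.

    `verbs` is a list of (word, tense) tuples produced by some upstream
    POS tagger, with tense in {"future", "present", "past"}.
    """
    scores = {period: 0.0 for period in PERIODS}
    for word, tense in verbs:
        word = word.lower()
        if word in DICTIONARY_VERBS:      # rule 1: dictionary verb, +1
            scores[DICTIONARY_VERBS[word]] += 1.0
        elif word in AUX_VERBS:           # rule 2: auxiliary verb, +1
            scores[AUX_VERBS[word]] += 1.0
        else:                             # rule 3: main verb tense, +0.5
            scores[TENSE_TO_PERIOD[tense]] += 0.5
    # max() returns the first maximum, which gives PRE > DURING > POST on ties.
    return max(PERIODS, key=lambda period: scores[period])
```

Note that under this sketch a message with no verbs at all falls back to PRE through the tie-break; whether the original system handles that case differently is not stated.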
    12. Manual Period Classification
        • CrowdFlower crowdsourcing period labeling
    13. Performance of Tweet4Act (P = precision, R = recall, F1 = F1 score)

        Period  | Tweet4act        | SVM              | MaxEnt           | Tree             | RF
                | P    R    F1     | P    R    F1     | P    R    F1     | P    R    F1     | P    R    F1
        Joplin Tornado
        PRE     | 0.33 0.85 0.48   | 0.00 0.00 0.00   | 0.43 0.21 0.28   | 0    0    0      | 0    0    0
        DURING  | 0.88 0.89 0.89   | 0.32 0.91 0.47   | 0.35 0.55 0.43   | 0.3  0.73 0.43   | 0.32 0.1  0.48
        POST    | 0.61 0.84 0.71   | 0.67 0.20 0.31   | 0.55 0.6  0.57   | 0.57 0.4  0.47   | 1.00 0.1  0.18
        AVG.    | 0.61 0.86 0.69   | 0.33 0.37 0.39   | 0.44 0.45 0.42   | 0.29 0.38 0.45   | 0.66 0.37 0.33
        Haiti Earthquake
        PRE     | 0.63 1.00 0.77   | 1.00 0.67 0.80   | 1.00 1.00 1.00   | 1    0.67 0.8    | 1.00 0.33 0.5
        DURING  | 0.72 0.97 0.83   | 0.75 0.6  0.67   | 0.67 0.80 0.73   | 0.6  0.6  0.6    | 1.00 0.4  0.57
        POST    | 0.46 0.82 0.59   | 0.92 0.97 0.94   | 0.97 0.95 0.96   | 0.92 0.95 0.93   | 0.88 1.00 0.94
        AVG.    | 0.60 0.87 0.71   | 0.89 0.74 0.80   | 0.88 0.91 0.89   | 0.84 0.74 0.78   | 0.96 0.58 0.67
        Nesat Typhoon
        PRE     | 0.36 1.00 0.53   | 1.00 0.50 0.67   | 1.00 0.50 0.67   | 0.33 0.25 0.28   | 1.00 0.5  0.67
        DURING  | 0.94 0.94 0.94   | 0.79 1.00 0.88   | 0.81 1.00 0.90   | 0.71 0.77 0.74   | 0.79 1    0.88
        POST    | 0.52 0.85 0.65   | 1.00 0.2  0.33   | 1.00 0.40 0.57   | 0    0    0      | 1.00 0.2  0.33
        AVG.    | 0.61 0.93 0.71   | 0.93 0.57 0.62   | 0.94 0.63 0.71   | 0.35 0.34 0.51   | 0.93 0.57 0.63
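For readers checking the table, F1 is the harmonic mean of precision and recall. The short sketch below (the function name `f1_score` is ours, not from the slides) reproduces two of the Tweet4act cells from their rounded P and R values. Other cells may differ in the last digit, presumably because the slide values were computed from unrounded inputs.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; taken as 0 when both are 0."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Joplin Tornado, PRE, Tweet4act: P = 0.33, R = 0.85
print(round(f1_score(0.33, 0.85), 2))  # 0.48, matching the table
# Haiti Earthquake, PRE, Tweet4act: P = 0.63, R = 1.00
print(round(f1_score(0.63, 1.00), 2))  # 0.77, matching the table
```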
    14. References
        • A. Iyengar, T. Finin, and A. Joshi (2011) Content-based prediction of temporal boundaries for events in Twitter. In Proceedings of the Third IEEE International Conference on Social Computing.
        • K. Starbird, L. Palen, A. Hughes, and S. Vieweg (2010) Chatter on the red: what hazards threat reveals about the social life of microblogged information. In Proceedings of the 2010 ACM conference on Computer supported cooperative work, pages 241–250. ACM.
        • M. Latonero and I. Shklovski (2010) “Respectfully Yours in Safety and Service”: Emergency Management & Social Media Evangelism. In Proceedings of the 7th International ISCRAM Conference, Seattle, Vol. 1.
    15. Thank you!
        Muhammad Imran, mimran@qf.org.qa
