Extracting Information Nuggets from Disaster-Related Messages in Social Media

This presentation describes our work presented at the 10th International Conference on Information Systems for Crisis Response and Management (ISCRAM) in Baden-Baden, Germany. The work highlights the importance of microblogging platforms such as Twitter and the large number of informative messages that can contribute to situational awareness during disasters. Specifically, it covers the classification of, and information extraction from, the valuable, actionable messages that people post during emergencies.

Published in: Technology, Business

  • Social media empowers individuals, giving them a platform from which to share opinions, experiences, and information from anywhere at any time. The shared information can be highly useful, provided it is analyzed in a timely and effective manner. That is what I am going to present in this session.
  • Finding tactical and actionable information among the millions of messages that people post on social media is a complex and challenging task. For this purpose, specifically for disasters, we came up with a sensible ontology that has three main stages. Each stage refines a piece of information so that it can contribute substantially to disaster management. To get to the actionable information, we first need to categorize each incoming message into a predefined, disaster-specific category.
  • This step identifies named entities, caution/advice content, temporal information, and other details in a message.
  • The inter-annotator agreement value shows the level of agreement among workers on an assessable unit (in our case, a tweet). High agreement indicates that different workers frequently gave the same response for the same tweet.
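To make the agreement note concrete, here is a minimal sketch of one simple measure: the average fraction of worker pairs that assign the same label to a tweet. The slides do not say which agreement statistic was reported, so the measure itself, the function name `pairwise_agreement`, and the sample labels are illustrative assumptions.

```python
from itertools import combinations


def pairwise_agreement(labels_per_tweet):
    """Average fraction of worker pairs that gave the same label to a tweet.

    labels_per_tweet: list of lists; each inner list holds the labels that
    different workers assigned to one tweet (hypothetical data layout).
    """
    scores = []
    for labels in labels_per_tweet:
        pairs = list(combinations(labels, 2))
        if not pairs:  # fewer than two workers: agreement is undefined, skip
            continue
        same = sum(1 for a, b in pairs if a == b)
        scores.append(same / len(pairs))
    return sum(scores) / len(scores) if scores else 0.0


# Made-up annotations for three tweets, three workers each.
sample = [
    ["informative", "informative", "informative"],  # all three pairs agree
    ["informative", "personal", "informative"],     # one of three pairs agrees
    ["other", "other", "personal"],                 # one of three pairs agrees
]
print(round(pairwise_agreement(sample), 3))
```

A chance-corrected statistic would be stricter than this raw percentage; the sketch only shows what "workers frequently gave the same response" means numerically.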

    1. 1. Extracting Information Nuggets from Disaster-Related Messages in Social Media
        Muhammad Imran, Shady Elbassuoni, Carlos Castillo, Fernando Diaz, Patrick Meier
    2. 2. Outline
        • Social media response to disaster
        • Finding tactical and actionable information
        • Disaster ontologies
        • Filtering, classification and extraction
        • Ongoing work
        • Discussion
    3. 3. Disaster and Social Media
        2.3 million tweets reflecting the words “Haiti” or “Red Cross” from Jan 12 to Jan 14, 2010
        http://www.sysomos.com
    4. 4. Disaster and Social Media
    5. 5. Why Social Media?
        • Virtual collaboration, information sharing
        • Highly valuable information
        • Contribute to situational awareness
        • Highly useful, if analyzed timely and effectively
    6. 6. Sandy Tweets
        • @NYGovCuomo orders closing of NYC bridges. Only Staten Island bridges unaffected at this time. Bridges must close by 7pm. #Sandy #NYC.
        • rt @911buff: public help needed: 2 boys 2 & 4 missing nearly 24 hours after they got separated from their mom when car submerged in si. #sandy #911buff
        • freaking out. home alone. will just watch tv #Sandy #NYC.
        • 400 Volunteers are needed for areas that #Sandy destroyed.
    7. 7. Sandy Tweets (same four tweets as slide 6)
        Labels shown: Personal, Informative
    8. 8. Sandy Tweets (same four tweets as slide 6)
        Labels shown: Personal, Informative, Caution and Advice, Casualties and Damage, Donations
    9. 9. Sandy Tweets (same four tweets as slide 6)
        Labels shown: Personal, Informative, Caution and Advice, Casualties and Damage, Donations
    10. 10. Finding Tactical & Actionable Information
        Message types: Personal, Informative (Direct & Indirect), Other
        Informative categories:
        • Caution and advice: siren heard, warning issued/lifted, etc.
        • Casualties and damage: people dead, injured, damage, etc.
        • Donations: money, shelter, blood, goods, or services
        • People missing, found, or seen
        • Information source: webpages, photos, videos, information sources
        • …
    11. 11. Our Approach
        1. Filtering → 2. Classification → 3. Extraction
    12. 12. Our Datasets: Joplin Dataset
        • 206,764 tweets collected during the tornado that hit Joplin, Missouri, on May 22, 2011
        • Collected by researchers at the University of Colorado at Boulder
        • Collected through the Twitter API by monitoring tweets with the hashtags #joplin or #tornado
    13. 13. Our Datasets: Sandy Dataset
        • 140,000 tweets collected during Hurricane Sandy, which hit the northeastern USA on Oct 29, 2012
        • Collected through the Twitter API by monitoring tweets with the hashtags #sandy or #nyc
    14. 14. 1. Filtering
        Is disaster-related? (Yes / No) → Contributes to situational awareness? (Yes / No)
    15. 15. 1. Filtering: Training Data
        • 4406 tweets sampled uniformly from the Joplin dataset
        • Annotated using CrowdFlower
        • Chart values (in slide order): 32%, 60%, 8%
        • Chart legend (in slide order): Personal, Informative, Other
    16. 16. 2. Classification
        Filtered tweets → Caution & Advice, Information Sources, Damage & Casualties, Donations, Health, Shelter, Food, Water, Logistics, ...
    17. 17. Distribution of Tweet Types: Joplin Tornado (2011)
        • Chart values (in slide order): 50%, 18%, 16%, 10%, 6%
        • Chart legend (in slide order): Caution/Advice, Info Source, Donations, Casualties/Damage, Unknown
    18. 18. Automatic Classification
        Class                 Prec   Rec    F-Measure   AUC
        Caution and advice    0.85   0.76   0.80        0.91
        Information source    0.54   0.58   0.56        0.76
        Donations             0.72   0.71   0.72        0.89
        Casualties/damage     0.52   0.65   0.58        0.87
        Features:
        • Binary (hashtags, URL, emotion etc.)
        • Scalar (tweet length)
        • Text features (unigram, bigram, POS tags, VerbNet etc.)
    19. 19. 3. Extraction
        Classified tweets → ...
        Example: @JimFreund: Apparently we have no choice. There is a tornado watch in effect tonight.
    20. 20. Labels for Extraction: Training Data
        • Type-dependent instruction
        • Ask evaluators to copy-paste a word/phrase from each tweet
    21. 21. Tool
        • CMU ARK Twitter NLP
          – Tokenization
          – Feature extraction
          – CRF learning
        • Very easy to use: simply change the training set (part-of-speech tags) into anything, and re-train
    22. 22. Extraction Evaluation
        Setting                                     Rec   Prec
        Train 2/3 Joplin, Test 1/3 Joplin           78%   90%
        Train 2/3 Sandy, Test 1/3 Sandy             41%   79%
        Train Joplin, Test Sandy                    11%   78%
        Train Joplin + 10% Sandy, Test 90% Sandy    21%   81%
        • Precision is: one word or more in common with what humans extracted (Imran et al., 2013)
    23. 23. Ongoing work
    24. 24. Self-service for crisis-related classification
        • Machine learning software can be provided as a service
          – e.g. Google Prediction API
        • Can we provide crisis-related tweet classification as a service?
          – Automatic collection of tweets
          – Re-usable ontologies / default training sets
          – Active learning
    25. 25. Request Labeled / Unlabeled Datasets
        Contact us at: mimran@qf.org.qa
    26. 26. References
        • K. Starbird, L. Palen, A. Hughes, and S. Vieweg (2010). Chatter on the red: what hazards threat reveals about the social life of microblogged information. In Proceedings of the 2010 ACM Conference on Computer Supported Cooperative Work, pages 241–250. ACM.
        • M. Latonero and I. Shklovski (2010). “Respectfully Yours in Safety and Service”: Emergency Management & Social Media Evangelism. In Proceedings of the 7th International ISCRAM Conference, Seattle, Vol. 1.
        • M. Imran, S. Elbassuoni, C. Castillo, F. Diaz, and P. Meier (2013). Practical Extraction of Disaster-Relevant Information from Social Media. WWW 2013 SWDM, May 2013.
    27. 27. Thank you!
        Muhammad Imran, mimran@qf.org.qa
        With thanks to Carlos Castillo for several slides
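The lenient matching rule from the extraction evaluation slide (an extraction counts as correct if it shares one word or more with what humans extracted) can be sketched as follows. This is only an illustration under stated assumptions: the whitespace tokenizer, the function names, the way recall is computed, and the "human" phrases in the sample are all hypothetical and not taken from the slides.

```python
def tokens(text):
    """Lowercased whitespace tokens of a phrase (assumed tokenizer)."""
    return set(text.lower().split())


def lenient_scores(system, human):
    """Lenient precision and recall over per-tweet extractions.

    system, human: equal-length lists, one entry per tweet; each entry is
    the extracted phrase, or None when nothing was extracted.
    A system extraction is a hit if it shares at least one token with the
    human extraction for the same tweet.
    """
    hits = system_count = human_count = 0
    for sys_phrase, human_phrase in zip(system, human):
        if sys_phrase:
            system_count += 1
        if human_phrase:
            human_count += 1
        if sys_phrase and human_phrase and tokens(sys_phrase) & tokens(human_phrase):
            hits += 1
    precision = hits / system_count if system_count else 0.0
    # Assumption: recall is hits over tweets that have a human extraction.
    recall = hits / human_count if human_count else 0.0
    return precision, recall


# Hypothetical human and system extractions for three tweets.
human = ["tornado watch in effect tonight", "bridges must close by 7pm", "400 volunteers are needed"]
system = ["tornado watch", None, "areas destroyed"]
precision, recall = lenient_scores(system, human)
print(precision, round(recall, 3))
```

Because a single shared word is enough for a hit, this criterion is generous to the system on precision; recall still drops whenever the system extracts nothing for a tweet.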
