
Processing Social Media Messages in Mass Emergency: A Survey


Millions of people use social media to share information during disasters and mass emergencies. Information available on social media, particularly in the early hours of an event when few other sources are available, can be extremely valuable for emergency responders and decision makers, helping them gain situational awareness and plan relief efforts. Processing social media content to obtain such information involves solving multiple challenges, including parsing brief and informal messages, handling information overload, and prioritizing different types of information. These challenges can be mapped to information processing operations such as filtering, classifying, ranking, aggregating, extracting, and summarizing. This work highlights these challenges and presents state of the art computational techniques to deal with social media messages, focusing on their application to crisis scenarios.

Published in: Technology


  1. 1. Processing Social Media Messages in Mass Emergency: Survey Summary
     Authors:
     • Muhammad Imran (mimran@hbku.edu.qa)
     • Carlos Castillo (chato@acm.org)
     • Fernando Diaz (diazf@acm.org)
     • Sarah Vieweg (sarahvieweg@gmail.com)
     Date: 25th April 2018
  2. 2. Overarching Goal
     “To extract time-critical information from social media that is useful for emergency responders, affected communities, and other concerned population in disaster situations.”
     Examples: urgent help need; urgent aid need
  3. 3. Survey Study Selection
     Keyword filters applied to >700 articles:
     • Domain filters: humanitarian, disaster response, mass emergencies
     • Topic filters: computing, artificial intelligence, machine learning
     • Data filters: Twitter, Facebook, micro-blogging
     • Duplicate filters
     Final selection = 180 published research papers
  4. 4. Topics Covered (Humanitarian + Social Media + AI)
     • Volume & Velocity (~18): data acquisition, storage, and retrieval
     • Event Detection (~36): topic detection and tracking
     • Classification & Clustering (~40): classification and clustering
     • Information Summarization (~15): abstractive and extractive summarization
     • Semantics and Crisis Ontologies (~10): semantic enrichment and crisis ontologies
     • Information Veracity (~18): credibility and misinformation
     • Information Visualization (~12): crisis maps, dashboards
     Total: ~180 papers surveyed
  5. 5. Volume & Velocity
  6. 6. Twitter Storms during Emergencies
     • Volume: 27 million in 3 days
     • Velocity: 72k tweets/min
     Source: https://www.wsj.com/articles/twitter-storms-can-help-gauge-damage-of-real-storms-and-disasters-study-says-1457722801 (Castillo C, Big Crisis Data, 2016, Cambridge University Press)
  7. 7. Twitter Activity Across Locations during Disasters (Yury Kryvasheyeu et al. Sci Adv 2016;2:e1500779)
     • Figure panels: Activity; Retweeting
     • Blue: represents a location farther from the disaster
     • Red: represents a location closer to the disaster
     • Strong relationship between proximity to Sandy’s path and social media activity
  8. 8. Event Detection
  9. 9. Event Description
     • Why detect events from social media?
       – Human sensors report incidents very quickly
       – Tweet waves travel faster than earthquake waves
     • What is an event?
       – Events can be defined as situations, actions, or occurrences that happen in a certain location at a specific time (Dou et al. 2012)
     • An event is generally characterized by 5W1H:
       – Who? When? Where? What? Why? How?
  10. 10. Event Detection using Bursty Behavior (Liang et al. Quantifying Information Flow During Emergencies, 2014, Nature.)
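
Burst-based event detection can be illustrated with a minimal sketch. The slide does not specify the method used in the cited work or in the systems listed next, so the trailing-window z-score rule below, the function name `detect_bursts`, its parameters, and the per-minute counts are all assumptions chosen for illustration.

```python
from collections import deque
from statistics import mean, pstdev


def detect_bursts(counts, window=5, threshold=3.0):
    """Return indices of time bins whose message count exceeds the
    trailing-window mean by more than `threshold` standard deviations."""
    history = deque(maxlen=window)  # most recent non-burst counts
    bursts = []
    for i, count in enumerate(counts):
        if len(history) == window:
            mu = mean(history)
            # Floor sigma at 1.0 so a perfectly flat baseline does not
            # turn tiny fluctuations into bursts (an illustrative choice).
            sigma = max(pstdev(history), 1.0)
            if count > mu + threshold * sigma:
                bursts.append(i)
                continue  # keep burst bins out of the baseline
        history.append(count)
    return bursts


# Hypothetical per-minute counts of messages matching a keyword.
per_minute = [10, 12, 11, 9, 10, 11, 95, 120, 12, 10]
print(detect_bursts(per_minute))  # [6, 7]
```

Keeping burst bins out of the baseline is a design choice: it stops a long-running event from raising the threshold so far that the event itself is no longer flagged.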
  11. 11. Event Detection Systems

      | System | Approach | Event types | Real-time | Query type | Spatio-temporal | Sub-events | Reference |
      |---|---|---|---|---|---|---|---|
      | Twitter Monitor | Burst detection | Open domain | Yes | Open | No | No | [Mathioudakis et al. 2010] |
      | TwitInfo | Burst detection | Earthquakes | Yes | Keyword | Spatial | Yes | [Marcus et al. 2011] |
      | Twevent | Burst detection | Open domain | Yes | Open | No | No | [Li et al. 2012b] |
      | TEDAS | Supervised classification | Crime/disasters | No | Keyword | Yes | No | [Li et al. 2012a] |
      | LeadLine | Burst detection | Open domain | No | Keyword | Yes | No | [Dou et al. 2012] |
      | TwiCal | Supervised classification | Conflicts/politics | Yes | Open | Temporal | No | [Ritter et al. 2012] |
      | Tweet4Act | Dictionaries | Disasters | Yes | Keyword | No | No | [Chowdhury et al. 2013] |
      | ESA | Burst detection | Open domain | Yes | Keyword | Spatial | No | [Robinson et al. 2013a] |
  12. 12. Challenges and Future Directions
      • Inadequate spatial information
        – Spatial and temporal information are two integral components of an event
        – Automatic text-based geo-tagging may help
      • Mundane events
        – Hashtags such as #MusicMonday and #FollowFriday are misleading
      • Describing the events
        – Named entities, tracking, semantic enhancements
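
One simple reading of the text-based geo-tagging mentioned above is a gazetteer lookup. The sketch below is an assumption, not a method from the survey: the `GAZETTEER` table, its approximate coordinates, and the `geotag` function are hypothetical, and it only matches single-word place names.

```python
import re

# Hypothetical toy gazetteer: place name -> approximate (latitude, longitude).
GAZETTEER = {
    "kathmandu": (27.72, 85.32),
    "doha": (25.29, 51.53),
}


def geotag(text, gazetteer=GAZETTEER):
    """Return (place, (lat, lon)) pairs for gazetteer names found in the
    text, in order of first appearance and without repeats."""
    found = []
    seen = set()
    for token in re.findall(r"[a-z]+", text.lower()):
        if token in gazetteer and token not in seen:
            seen.add(token)
            found.append((token, gazetteer[token]))
    return found


print(geotag("Airport closed after 7.9 Earthquake in Kathmandu"))
```

A lookup like this cannot tell apart places that share a name, or place names that are also ordinary words, so it should be treated as a starting point only.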
  13. 13. Information Classification and Clustering
  14. 14. By Information Provided
      • Caution and advice [Imran et al. 2013b]; warnings [Acar and Muraki 2011]; hazard preparation [Olteanu et al. 2014]; tips [Leavitt and Clark 2014]; advice [Bruns 2014]; status, protocol [Hughes et al. 2014b]
      • Affected or trapped people [Caragea et al. 2011]; casualties, people missing, found, or seen [Imran et al. 2013b]; self-reports [Acar and Muraki 2011]; injured, missing, killed [Vieweg et al. 2010]; looking for missing people [Qu et al. 2011]
      • Infrastructure/utilities damage [Imran et al. 2013b]; collapsed structure [Caragea et al. 2011]; built environment [Vieweg et al. 2010]; closure and services [Hughes et al. 2014b]
      • Needs and donations of money, goods, services [Imran et al. 2013b]; food/water shortage [Caragea et al. 2011]; donations or volunteering [Olteanu et al. 2014]; help requests, relief coordination [Qu et al. 2011]; relief, donations, resources [Hughes et al. 2014b]; help and fundraising [Bruns 2014]
      • Other useful information: hospital/clinic service, water sanitation [Caragea et al. 2011]; consequences [Olteanu et al. 2014]
  15. 15. By Information Provided (same categories as the previous slide)
      • Supervised classification techniques
        – Learning algorithms include SVMs, random forests, ensemble methods, and, more recently, deep learning (e.g., RNNs)
      • Unsupervised techniques: clustering, and LDA for topic modeling
      • Formal response organizations prefer supervised classification because the categories are usually defined in advance.
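
A minimal sketch of supervised classification into information categories follows. The slide names SVMs, random forests, ensembles, and RNNs; to keep the example dependency-free it swaps in a multinomial naive Bayes classifier with add-one smoothing. The class `NaiveBayes`, the label names, and the six training messages are hypothetical and exist only for illustration.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayes:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, text):
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n in self.class_counts.items():
            score = math.log(n / total)  # log prior
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for word in tokenize(text):
                if word in self.vocab:  # ignore words never seen in training
                    score += math.log((self.word_counts[label][word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical labeled messages; real systems need far more training data.
train = [
    ("bridge collapsed near the river", "infrastructure_damage"),
    ("power lines down and roads damaged", "infrastructure_damage"),
    ("please donate food and water", "needs_donations"),
    ("volunteers needed to donate blankets", "needs_donations"),
    ("stay indoors and avoid the coast", "caution_advice"),
    ("warning issued avoid flooded roads", "caution_advice"),
]
model = NaiveBayes().fit([t for t, _ in train], [y for _, y in train])
print(model.predict("roads damaged and bridge collapsed"))  # infrastructure_damage
```

With only a handful of examples per category the predictions are fragile, which is the labeled-data scarcity problem the slides return to later.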
  16. 16. Systems for Crisis Data Processing
      • Twitris [Purohit and Sheth 2013]: Twitter; semantic enrichment, classify automatically, geotag
      • SensePlace2 [MacEachren et al. 2011]: Twitter; geotag, visualize heat-maps based on geotags
      • EAIMS (Emergency Analysis Identification and Management System) [McCreadie et al. 2016]: Twitter; sentiment, alerts, credibility
      • ESA (Emergency Situation Awareness) [Yin et al. 2012; Power et al. 2014]: Twitter; detect bursts, classify, cluster, geotag
  17. 17. Systems for Crisis Data Processing
      • Twitcident [Abel et al. 2012]: Twitter and TwitPic; semantic enrichment, classify
      • CrisisTracker [Rogstadius et al. 2013]: Twitter; cluster, annotate manually
      • Tweedr [Ashktorab et al. 2014]: Twitter; classify automatically, extract information, geotag
      • AIDR (Artificial Intelligence for Disaster Response) [Imran et al. 2014a]: Twitter & Facebook; annotate manually, classify automatically (text + image)
  18. 18. Challenges and Future Directions
      • Missing actionable insights
        – Who needs help, and where
        – Automatic extraction of actionable/serviceable messages
      • Labeled data scarcity
        – Most systems are hungry for labeled data
        – More robust domain adaptation and transfer learning techniques are required
      • Focus on other content types (images)
        – Images contain critical information (e.g., damage)
        – More focus on multimodal research is required
  19. 19. Information Summarization
  20. 20. Information Summarization
      • Message: “Tribhuvan international airport closed after the quake”
      • Message: “Airport closed after 7.9 Earthquake in Kathmandu”
      • Summary: “Tribhuvan international airport closed after 7.9 earthquake in Kathmandu.”
      Summaries reduce information overload.
  21. 21. Key Objectives and Challenges
      • Information coverage
        – Capture most situational updates from the data; the summary should be rich in information coverage
      • Less redundant information
        – Messages on Twitter contain duplicate information; produce summaries with little redundancy that still keep the important updates
      • Readability
        – Twitter messages are often noisy, informal, and full of grammatical mistakes; the aim is to produce more readable summaries
      • Real-time (online/updated summaries)
        – The system should not be so computationally heavy that, by the time the summary is produced, the information is of only marginal use
      (McCreadie et al. 2013; Aslam et al. 2013; Nenkova and McKeown 2011; Guo et al. 2013; Rudra et al. 2016)
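
The coverage and redundancy objectives above can be sketched with a greedy extractive summarizer. This is an illustration under stated assumptions, not the method of any cited paper: `summarize`, the document-frequency coverage score, and the Jaccard overlap cut-off are hypothetical choices, and the last two example messages are invented. It selects whole messages and does not merge them.

```python
import re
from collections import Counter


def tokens(text):
    """Set of lowercase words and numbers (e.g. '7.9') in a message."""
    return set(re.findall(r"\d+(?:\.\d+)?|[a-z]+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity of two token sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def summarize(messages, max_items=3, max_overlap=0.5):
    """Greedily pick messages with high term coverage, skipping any that
    overlaps too much with an already selected message."""
    token_sets = [tokens(m) for m in messages]
    doc_freq = Counter(t for ts in token_sets for t in ts)
    # Coverage score: sum of document frequencies of a message's terms.
    ranked = sorted(
        range(len(messages)),
        key=lambda i: -sum(doc_freq[t] for t in token_sets[i]),
    )
    chosen = []
    for i in ranked:
        if all(jaccard(token_sets[i], token_sets[j]) < max_overlap for j in chosen):
            chosen.append(i)
        if len(chosen) == max_items:
            break
    return [messages[i] for i in sorted(chosen)]  # keep original order


msgs = [
    "Tribhuvan international airport closed after the quake",
    "Airport closed after 7.9 Earthquake in Kathmandu",
    "Roads blocked near Kathmandu, avoid travel",        # hypothetical
    "airport closed after 7.9 earthquake in Kathmandu!!",  # near-duplicate
]
for line in summarize(msgs):
    print(line)
```

Because each step only compares a candidate against the few messages already chosen, the same loop could be rerun on a sliding window of recent messages to approximate the online setting.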
  22. 22. Crisis Datasets (Labeled + Unlabeled)
      • CrisisMMD: Multimodal Twitter Datasets from Natural Disasters
      • http://CrisisNLP.qcri.org/
      • http://CrisisLex.org/
  23. 23. Conclusion and Future Directions
      • Applied research at its best
        – Real-world problems and challenges
        – Social media for social good
        – Decent work on information filtering and classification (last 6-8 years)
      • Social media imagery content is another potential source of information
      • Labeled data scarcity problem
        – No or few labeled data instances (in early hours)
        – High diversity among organizations' needs
        – Information needs change over time
        – Domain adaptation and transfer learning techniques required
      • From situational to actionable insights
        – Identify requests and needs in real time
        – Triangulate missing information
        – Rank them based on their urgency to help responders
  24. 24. Thank you!
      • For queries, questions, and datasets, contact me at: mimran@hbku.edu.qa OR @mimran15
      • Full survey paper: Processing Social Media Messages in Mass Emergency: A Survey. ACM Computing Surveys, 2015.
      • Recommended books:
