Disaster Data Informatics for Situation Awareness
Expedite the decision-making process in a disaster situation by identifying useful/actionable information from social media
1. Informativeness Analysis
   a. Identify information-rich tweet messages (filtering noisy tweets) based on a variety of analyses
2. Classifying information-rich messages
   a. People at the disaster site, suffering people asking for help
   b. Global response about the disaster (opinions, comments, news, etc.)
3. Expedite the decision-making process and situational awareness
   a. Considering (2.a), understand needs at the disaster site
   b. Make the connection resource --> needs
Motivation: Information Overload
● 5,500 tweets per second during the Japanese earthquake and tsunami ***
*** "Within a minute of the quake, there were more than 40,000 earthquake-related Tweets. The micro-blogging site said it hit about 5,500 Tweets per second on the quake......" - The New York Times
How can useful and actionable information be found quickly in such a huge stream of incoming event data?
Multidimensional data
● Who generates the data? (People)
  ○ At the disaster location: affected people, NGO volunteers
  ○ Around the world: people not directly involved in the disaster
● What data is generated? (Content)
  ○ At the disaster location: reports about the current situation, needs for resources, medical & other emergencies, complaints, etc.
  ○ Around the world: opinions, concerns, sympathy, desire to help; sharing of related news, blogs and other multimedia
● How is the data generated? (Network)
  ○ At the disaster location: social media (Twitter, FB); SMS and Web reports to involved NGOs and government organizations
  ○ Around the world: mainly through social media (Twitter, Facebook, blogs, etc.)
● Why is the data generated? (Intention)
  ○ At the disaster location: seeking help; informing about the current situation, needs, etc.
  ○ Around the world: sharing personal viewpoints on the disaster-related incidents
● When is the data generated? (Time)
  ○ At the disaster location: after the disaster, in the recovery and rebuild phase
  ○ Around the world: mostly after the disaster
Research problem
How can we identify useful/informative (actionable) information that can be used to expedite decision making & situational awareness in a disaster situation?
Informativeness Analysis - Definition
● Informative: useful/actionable information in a disaster situation that can help achieve better and faster situation awareness
Example messages
● We need tent, cover, rice. Uneted Nation never Help us since the earthquake, we live in Carre-four, Lapot street,
● if women and children are victim of rape or other agressions in provisionnal shelter, what number can we call to have fast assistance.
● We are still under the sheets. We do not have: Tents, prelates, sanitary articles and household etc. Bastien the city Alix fontamara 27
● we dont have some water in the delmas camp 40b
● We need tent in delmas 18 because we dont find nothing in the area.
● How can we find help and food in fontamara 43 rue menos
● A father, whose wife passed away, and has two children who need medical attention. One child has a broken arm, and he is afraid of infection
Data set
● Social networking messages
  ○ Twitter, Facebook
● News articles
  ○ News websites, external links from tweets, FB status
● NGO messages
  ○ Ushahidi messages/reports
● Mobile messages
  ○ SMS
Informativeness Analysis
● Content Analysis
  ○ Structure and syntactic analysis
  ○ Linguistic analysis
  ○ Text analysis
  ○ Metadata analysis
● People Analysis
  ○ Author profile description
  ○ Social connectivity
  ○ Activity level
  ○ Author credibility/influence
● News Analysis
  ○ Content analysis
  ○ Social share analysis
  ○ URL credibility
  ○ Alexa analysis
● Semantic Analysis
  ○ Content annotation using disaster domain model considering: entities mentioned, needs, resources, location, organizations, people, disaster type, etc.
Content Analysis
● Structure and syntactic analysis
  ○ Message length
  ○ Number of words, special characters, slang terms, dictionary words
● Linguistic analysis
  ○ Number of nouns, verbs, adverbs, adjectives
  ○ POS patterns
● Text analysis
  ○ N-gram analysis
  ○ TF-IDF statistics
  ○ Entities (DBpedia/ontology)
● Metadata analysis
  ○ Publish time
  ○ Location (explicit and implicit)
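A few of the content features listed above can be sketched in Python. This is a minimal illustration only, not the method used in this work: the tokenizer, the toy dictionary, the function names and the plain tf * log(N/df) weighting are all assumptions made for the example.

```python
import math
import re
import string
from collections import Counter


def tokenize(message):
    # Assumed tokenizer: lowercase alphabetic tokens only.
    return re.findall(r"[a-z']+", message.lower())


def structural_features(message, dictionary):
    # Structure and syntactic features: length, word and character counts.
    tokens = tokenize(message)
    return {
        "length": len(message),
        "num_words": len(tokens),
        "num_special_chars": sum(ch in string.punctuation for ch in message),
        "num_dictionary_words": sum(t in dictionary for t in tokens),
    }


def tf_idf(messages):
    # One dict of term -> score per message, with tf = count / message length
    # and idf = log(N / document frequency). Other weightings are possible.
    tokenized = [tokenize(m) for m in messages]
    n = len(tokenized)
    df = Counter(term for toks in tokenized for term in set(toks))
    scores = []
    for toks in tokenized:
        counts = Counter(toks)
        total = len(toks) or 1
        scores.append(
            {t: (c / total) * math.log(n / df[t]) for t, c in counts.items()}
        )
    return scores


# Hypothetical toy dictionary; a real system would load a full word list.
TOY_DICTIONARY = {"we", "need", "tent", "in", "because", "find",
                  "nothing", "the", "area"}
MESSAGE = "We need tent in delmas 18 because we dont find nothing in the area."
features = structural_features(MESSAGE, TOY_DICTIONARY)
scores = tf_idf(["we need water", "we need tent"])
```

Terms shared by every message ("we", "need") score zero under this weighting, while terms specific to one message ("water", "tent") score above zero, which is the property that makes TF-IDF useful for spotting the distinctive content of a message.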
People Analysis
● Author profile description
  ○ Profession
  ○ Demographic information (age, gender, location)
● Social connectivity
  ○ Number of accounts followed and number of followers
● Activity level
  ○ Number of tweets
  ○ Number of tweets "on topic"
● Author credibility/influence
  ○ Klout
  ○ SocialMatica
  ○ PeerIndex
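The connectivity and activity-level features above could be computed along these lines. This is a rough sketch: the `Author` record, its field names, the keyword-based "on topic" test and the derived ratio are assumptions for illustration, and no credibility service (Klout etc.) is queried.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Author:
    # Hypothetical author record; field names are illustrative only.
    followers: int
    following: int
    tweets: list = field(default_factory=list)


def people_features(author, topic_terms):
    # A tweet counts as "on topic" if it contains at least one topic term.
    # This keyword test is an assumed stand-in for a real topic classifier.
    on_topic = sum(
        1 for text in author.tweets
        if set(re.findall(r"[a-z]+", text.lower())) & topic_terms
    )
    total = len(author.tweets)
    return {
        "num_followers": author.followers,
        "num_following": author.following,
        "num_tweets": total,
        "num_on_topic": on_topic,
        "on_topic_ratio": on_topic / total if total else 0.0,
    }


author = Author(
    followers=200,
    following=100,
    tweets=["Earthquake hit the city",
            "Nice lunch today",
            "Need water after the quake"],
)
result = people_features(author, {"earthquake", "quake"})
```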
News Analysis
● News and other event-related stories are linked in many of the event-related messages (tweets, etc.), primarily because of:
  ○ Message size limitation (140 characters for Twitter)
  ○ Bringing in external authoritative context
● Analyzing news and other event-related stories plays a crucial role in event analysis
  ○ Many news stories about the event
    ■ Which news stories to focus on?
    ■ How to extract useful and actionable information nuggets from these news stories?
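One simple way to approach the "which news stories to focus on?" question is to pull the links out of the messages and rank them by how often they are shared. The sketch below assumes a basic URL regular expression and uses share count as the only ranking signal; the function names and the example.org/example.com links are placeholders, not data from this work.

```python
import re
from collections import Counter
from urllib.parse import urlparse

URL_PATTERN = re.compile(r"https?://\S+")


def extract_urls(message):
    # Strip trailing punctuation that often sticks to a link in free text.
    return [u.rstrip(".,;:!?)") for u in URL_PATTERN.findall(message)]


def rank_linked_stories(messages):
    # Count how many times each URL is shared across all messages.
    counts = Counter(url for m in messages for url in extract_urls(m))
    return counts.most_common()


messages = [
    "Quake update http://example.org/a",
    "RT quake update http://example.org/a.",
    "Shelter list http://example.com/b",
]
ranking = rank_linked_stories(messages)
top_url, top_count = ranking[0]
top_domain = urlparse(top_url).netloc
```

Shortened links would need to be expanded before counting, otherwise the same story can appear under several different URLs.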
News Analysis
● Content analysis
  ○ Structure and syntactic analysis
  ○ Linguistic analysis
  ○ Text analysis
  ○ Metadata analysis
● Social share analysis
  ○ Number of tweets, retweets
  ○ Facebook shares, likes, comments, recommendations
  ○ Google Plus, LinkedIn shares
● URL credibility
  ○ Google PageRank
  ○ Local credibility (?)
● Alexa analysis (Alexa is a web information company)
  ○ Alexa global and country rank
  ○ Alexa url authority
  ○ Alexa url & subdomain mozRank
  ○ Alexa page & domain authority
Semantic Analysis
● Content annotation using a disaster domain model, considering the variety of entities mentioned (DBpedia)
  ○ needs, resources, location, organizations, people, disaster type, etc.
Semantic Disaster Model ***
*** Reuse/(formalise and build) a disaster domain model considering:
● Disaster type
  ○ Earthquake, floods, terror attack (the disaster type helps in understanding the needs better)
● Needs
  ○ Model of basic human needs in disasters, like food, water, medicines, shelter, etc.
● Resources
  ○ Model of resources which can satisfy some need, e.g. need: thirsty -> resource: water, fruit juice; need: hungry -> resource: food, etc.
● Location
  ○ Location of incidents, geo-location data
● Organization
  ○ Involved government and non-government organizations
● People & social role
  ○ Model of people based on gender, age group, role (mother, father, son, etc.)
  ○ This can help in understanding/reasoning about needs, e.g. if a mother and baby are mentioned, then the need may be milk
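The need -> resource mapping and the social-role reasoning can be illustrated with a small lookup-based sketch. It encodes only the examples given on this slide (thirsty, hungry, mother and baby); the dictionary layout, the function name and the keyword matching are assumptions for illustration, whereas a real domain model would more likely be an ontology.

```python
import re

# Need -> resources that can satisfy it (examples taken from the slide).
NEED_TO_RESOURCES = {
    "thirsty": ["water", "fruit juice"],
    "hungry": ["food"],
}

# Social-role rules: if all roles are mentioned, the need is implied.
ROLE_RULES = [
    ({"mother", "baby"}, "milk"),
]


def annotate(message):
    # Assumed matching strategy: exact keyword lookup on lowercase tokens.
    tokens = set(re.findall(r"[a-z]+", message.lower()))
    needs = [need for need in NEED_TO_RESOURCES if need in tokens]
    resources = [r for need in needs for r in NEED_TO_RESOURCES[need]]
    implied = [need for roles, need in ROLE_RULES if roles <= tokens]
    return {"needs": needs, "resources": resources, "implied_needs": implied}


annotation = annotate("A mother and her baby are hungry")
```

Keeping needs and resources as separate entries is what allows the "resource --> needs" connection to be made: a reported need can be looked up and matched against whatever resources are on offer.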