SlideShare a Scribd company logo
1 of 25
Summarizing Situational Tweets in Crisis
Scenario
Imran Muhammad
Qatar Computing Research Institute, HBKU
Doha, Qatar
Co-authors:
Koustav Rudra, Niloy Ganguly, Pawan Goyal (CSE, IIT Kharagpur, India)
Siddhartha Banerjee (Pennsylvania State University, USA)
Prasenjit Mitra (Qatar Computing Research Institute)
Aid Needs and Information Needs
Info. Info. Info.
Disaster event Urgent needs of affected people
Information gathering
Humanitarian organizations and local administration
Information gathering,
especially in real-time, is
the most challenging part
Relief operations
- Food, water
- Shelter
- Medical emergences
- Donations
- …
Information: A Lifeline During Disasters
The opaqueness induced by
disasters is overwhelming
People need information as
much as water, food,
medicine or shelter
Lack of information can make
people victims of disaster and
targets of aid
Twitter: A Useful Information Source
• Provide active communication channels during
crises
• Useful information: reports of casualties,
damages, donation offers and requests
• Quicker than traditional channels (e.g. first
tweet about Westgate Mall attack reported
within a minute)
Opportunities
• Rapid crisis response
• Time-critical situational awareness
• Access to actionable information
• …
• But, it requires real-time data processing
• Categorizations of each incoming item should be
done as soon as it arrives
• Rapid automatic summaries generation
Enhanced Situational Awareness
Time-critical situational awareness by generating
automatic summaries
• We use AIDR (Artificial Intelligence for Disaster
Response) system for:
– real-time data processing
– categorizations of tweets
• We proposed a novel framework for
summarization of informative tweets
Summarization of Tweets Example
Dharara Tower built in 1832 collapses in Kathmandu
during earthquake
Historic Dharara Tower Collapses in Kathmandu After
7.9 Earthquake
Dharara tower built in 1832 collapses in Kathmandu after
7.9 earthquake.
Key Characteristics and Objectives
• Information coverage
– Capture most situational updates from data. The summary should be
rich in terms of information coverage
• Less redundant information
– Messages on Twitter contain duplicate information. We aim for
summaries with less redundancy while keeping important updates
• Readability
– Twitter messages are often noisy, informal, and full of grammatical
mistakes. We aim to produce more readable summaries
• Real-time
– The system should not be heavily overloaded with computations
such that by the time the summary is produced, the utility of that
information is marginal
High-level Approach
Automatic Classification and Summarization
Datasets
• Nepal earthquake tweets from 25th to 27th April 2015
• AIDR classified tweets to the following categories:
– Missing trapped or found people (10,751 tweets)
– Infrastructure and utilities damage (16,842 tweets)
– Shelter and supplies (19,006 tweets)
The Role of Content Words in
Extractive Summarization
• Studies show the significance of content words to
capture important events
– Nouns (e.g. hospitals, buildings, bridges names)
– Numerals (e.g. number of casualties)
– Main verbs (e.g. collapsed, destroyed, killed)
Abstractive Summarization
• We generate a word graph where nodes are bigrams
• We generate sentences from the word graph
Challenge: Maintaining informativeness and readability
– Covering important content words
– Favoring more informative paths
– Maintaining linguistic quality
ILP model combining the above factors
Bi-gram Based Word Graph
• Word graph: nodes represent bi-grams (along with
their POS-tags)
• An edge represents consecutive words
• Nodes of two tweets with same bi-gram and POS-
tags are merged
ILP Based Formulation
Parameters
• Score of sentences/generated paths (CW(s))
– Centroid score
• Linguistic quality(LQ(s))
– Trigram language model
– LQ(s) = 1/(1-ll(w1,w2,…,wq))
– ll(w1,w2,…,wq) = 1/Llog2∏q
t =3 P(wt|wt-2wt-1)
ILP Based Solution
max(∑i=1…n CW(i)*LQ(i)*xi + ∑j=1…m yj)
xi, yj binary variable
xi tweet indicator, yj content word indicator
CW(i) = tweet i centroid score
LQ(i) = Linguistic score of tweet i
Constraints
∑i=1…n xi * Length(i) ≤ L
Length(i) = number of words in tweet i
L = required summary word length
∑i ∈ Tj xi ≥ yj j = [1 … m]
Tj = set of tweets where content word j is present
If yj is selected then at least one tweet covering
that word is also selected
∑j ∈ Ci yj ≥|Ci| * xi i = [1 … n]
Ci = set of content words present in tweet i
If tweet i is selected then all the content words of
that tweet are also selected
Baselines and Gold Standard
• COWTS: content word based extractive tweet stream
summarization algorithm [CIKM 2015]
• APSAL: affinity clustering based supervised disaster related
news articles summarizer [ACL 2015]
• TOWGS: runtime bigram based abstractive summarization
[EACL 2014]
Gold Standard:
We used summaries generated by UN OCHA experts during Nepal
earthquake
Summarization Results
.
 If number of clusters increases, determining importance of different
clusters becomes difficult [APSAL]
 COWTS tries to maximize the coverage of content words but can’t
combine information from related tweets
 TOWGS didn’t consider content words into account
 COWABS tries to combine similar information from related tweets as well
as maximizing coverage of content words
Performance Variation with Summary Length
If summary length increases COWABS still performs better than the baselines
Performance Comparison (User Studies)
Performance Comparison (User Studies)
Summarization Quality (Location)
COWABS captures information at more granular level with location specific
information
Summarization Quality (Event)
We extract event phrases using the method proposed by Ritter et al [EMNLP 2011]
COWABS captures more event specific information
Summarization Quality (Numeral)
COWABS captures more numerical information which includes information
about victims, helpline numbers etc.
Conclusions
• Rapid situational awareness is necessary for
effective relief operations
• Twitter as useful information source during
emergencies
• Automatic classification and summarization
approach
• Propose approach outperforms all baselines
and deemed effective –learned from user
studies
Thank you!
Imran Muhammad
Twitter: @mimran15

More Related Content

What's hot

OSINT: Open Source Intelligence - Rohan Braganza
OSINT: Open Source Intelligence - Rohan BraganzaOSINT: Open Source Intelligence - Rohan Braganza
OSINT: Open Source Intelligence - Rohan BraganzaNSConclave
 
Disaster recovery solution
Disaster recovery solutionDisaster recovery solution
Disaster recovery solutionAnton An
 
Tweet sentiment analysis (Data mining)
Tweet sentiment analysis (Data mining)Tweet sentiment analysis (Data mining)
Tweet sentiment analysis (Data mining)Anil Shrestha
 
2 Modern Security - Microsoft Information Protection
2   Modern Security - Microsoft Information Protection2   Modern Security - Microsoft Information Protection
2 Modern Security - Microsoft Information ProtectionAndrew Bettany
 
ESWC 2017 Tutorial Knowledge Graphs
ESWC 2017 Tutorial Knowledge GraphsESWC 2017 Tutorial Knowledge Graphs
ESWC 2017 Tutorial Knowledge GraphsPeter Haase
 
Azure Fundamentals Part 1
Azure Fundamentals Part 1Azure Fundamentals Part 1
Azure Fundamentals Part 1CCG
 
Trends in recent technology
Trends in recent technologyTrends in recent technology
Trends in recent technologysai krishna
 
Sentiment Analysis of Twitter Data
Sentiment Analysis of Twitter DataSentiment Analysis of Twitter Data
Sentiment Analysis of Twitter DataSumit Raj
 
Running Mission Critical Workloads on AWS
Running Mission Critical Workloads on AWSRunning Mission Critical Workloads on AWS
Running Mission Critical Workloads on AWSAmazon Web Services
 
Machine Data 101: Turning Data Into Insight
Machine Data 101: Turning Data Into InsightMachine Data 101: Turning Data Into Insight
Machine Data 101: Turning Data Into InsightSplunk
 
Secure your Access to Cloud Apps using Microsoft Defender for Cloud Apps
Secure your Access to Cloud Apps using Microsoft Defender for Cloud AppsSecure your Access to Cloud Apps using Microsoft Defender for Cloud Apps
Secure your Access to Cloud Apps using Microsoft Defender for Cloud AppsVignesh Ganesan I Microsoft MVP
 
Resume: The Complete Guide to Cybersecurity Risks and Controls
Resume: The Complete Guide to Cybersecurity Risks and ControlsResume: The Complete Guide to Cybersecurity Risks and Controls
Resume: The Complete Guide to Cybersecurity Risks and ControlsRd. R. Agung Trimanda
 
Introducing Cloudflare Workers
Introducing Cloudflare WorkersIntroducing Cloudflare Workers
Introducing Cloudflare WorkersMeghan Weinreich
 
Splunk for Enterprise Security and User Behavior Analytics
 Splunk for Enterprise Security and User Behavior Analytics Splunk for Enterprise Security and User Behavior Analytics
Splunk for Enterprise Security and User Behavior AnalyticsSplunk
 
Open Source and the Internet of Things
Open Source and the Internet of ThingsOpen Source and the Internet of Things
Open Source and the Internet of ThingsBlack Duck by Synopsys
 

What's hot (20)

Big Data (security Issue)
Big Data (security Issue)Big Data (security Issue)
Big Data (security Issue)
 
OSINT: Open Source Intelligence - Rohan Braganza
OSINT: Open Source Intelligence - Rohan BraganzaOSINT: Open Source Intelligence - Rohan Braganza
OSINT: Open Source Intelligence - Rohan Braganza
 
Disaster recovery solution
Disaster recovery solutionDisaster recovery solution
Disaster recovery solution
 
Tweet sentiment analysis (Data mining)
Tweet sentiment analysis (Data mining)Tweet sentiment analysis (Data mining)
Tweet sentiment analysis (Data mining)
 
2 Modern Security - Microsoft Information Protection
2   Modern Security - Microsoft Information Protection2   Modern Security - Microsoft Information Protection
2 Modern Security - Microsoft Information Protection
 
Disastermannager
DisastermannagerDisastermannager
Disastermannager
 
ESWC 2017 Tutorial Knowledge Graphs
ESWC 2017 Tutorial Knowledge GraphsESWC 2017 Tutorial Knowledge Graphs
ESWC 2017 Tutorial Knowledge Graphs
 
Azure Fundamentals Part 1
Azure Fundamentals Part 1Azure Fundamentals Part 1
Azure Fundamentals Part 1
 
Trends in recent technology
Trends in recent technologyTrends in recent technology
Trends in recent technology
 
Sentiment Analysis of Twitter Data
Sentiment Analysis of Twitter DataSentiment Analysis of Twitter Data
Sentiment Analysis of Twitter Data
 
Running Mission Critical Workloads on AWS
Running Mission Critical Workloads on AWSRunning Mission Critical Workloads on AWS
Running Mission Critical Workloads on AWS
 
Machine Data 101: Turning Data Into Insight
Machine Data 101: Turning Data Into InsightMachine Data 101: Turning Data Into Insight
Machine Data 101: Turning Data Into Insight
 
Secure your Access to Cloud Apps using Microsoft Defender for Cloud Apps
Secure your Access to Cloud Apps using Microsoft Defender for Cloud AppsSecure your Access to Cloud Apps using Microsoft Defender for Cloud Apps
Secure your Access to Cloud Apps using Microsoft Defender for Cloud Apps
 
Resume: The Complete Guide to Cybersecurity Risks and Controls
Resume: The Complete Guide to Cybersecurity Risks and ControlsResume: The Complete Guide to Cybersecurity Risks and Controls
Resume: The Complete Guide to Cybersecurity Risks and Controls
 
information security awareness course
information security awareness courseinformation security awareness course
information security awareness course
 
Introducing Cloudflare Workers
Introducing Cloudflare WorkersIntroducing Cloudflare Workers
Introducing Cloudflare Workers
 
Splunk for Enterprise Security and User Behavior Analytics
 Splunk for Enterprise Security and User Behavior Analytics Splunk for Enterprise Security and User Behavior Analytics
Splunk for Enterprise Security and User Behavior Analytics
 
UEBA
UEBAUEBA
UEBA
 
Open Source and the Internet of Things
Open Source and the Internet of ThingsOpen Source and the Internet of Things
Open Source and the Internet of Things
 
Disaster database slide
Disaster database slideDisaster database slide
Disaster database slide
 

Viewers also liked

Online Learning And Teaching Across Schoolboard Boundaries
Online Learning And Teaching Across Schoolboard BoundariesOnline Learning And Teaching Across Schoolboard Boundaries
Online Learning And Teaching Across Schoolboard Boundariesaorakinet
 
Membership application new logo
Membership application new logoMembership application new logo
Membership application new logopatcalhoun
 
Data Mining and Machine Learning (DMML)
Data Mining and Machine Learning (DMML)Data Mining and Machine Learning (DMML)
Data Mining and Machine Learning (DMML)butest
 
High-level
High-levelHigh-level
High-levelbutest
 
Report
ReportReport
Reportbutest
 
#Membership application
#Membership application#Membership application
#Membership applicationpatcalhoun
 
#Membership renewal new logo
#Membership renewal new logo#Membership renewal new logo
#Membership renewal new logopatcalhoun
 
ISCRAM 2013: Extracting Information Nuggets from Disaster-Related Messages i...
ISCRAM 2013: Extracting Information Nuggets from  Disaster-Related Messages i...ISCRAM 2013: Extracting Information Nuggets from  Disaster-Related Messages i...
ISCRAM 2013: Extracting Information Nuggets from Disaster-Related Messages i...ISCRAM Events
 
01.사람의 이해와 UX/UI(시스템과 유저 인터페이스)
01.사람의 이해와 UX/UI(시스템과 유저 인터페이스)01.사람의 이해와 UX/UI(시스템과 유저 인터페이스)
01.사람의 이해와 UX/UI(시스템과 유저 인터페이스)Billy Choi
 
Laundry Management System
Laundry Management SystemLaundry Management System
Laundry Management Systemcetrosoft
 
영화 미디어에 투영된 미래 분석 및 미래가전 시사점 퍼셉션
영화 미디어에 투영된 미래 분석 및 미래가전 시사점 퍼셉션영화 미디어에 투영된 미래 분석 및 미래가전 시사점 퍼셉션
영화 미디어에 투영된 미래 분석 및 미래가전 시사점 퍼셉션PERCEPTION
 
Concepto e importancia de la mercadotecnia
Concepto e importancia de la mercadotecniaConcepto e importancia de la mercadotecnia
Concepto e importancia de la mercadotecniaNombre Apellidos
 

Viewers also liked (14)

Online Learning And Teaching Across Schoolboard Boundaries
Online Learning And Teaching Across Schoolboard BoundariesOnline Learning And Teaching Across Schoolboard Boundaries
Online Learning And Teaching Across Schoolboard Boundaries
 
Membership application new logo
Membership application new logoMembership application new logo
Membership application new logo
 
Data Mining and Machine Learning (DMML)
Data Mining and Machine Learning (DMML)Data Mining and Machine Learning (DMML)
Data Mining and Machine Learning (DMML)
 
High-level
High-levelHigh-level
High-level
 
Report
ReportReport
Report
 
#Membership application
#Membership application#Membership application
#Membership application
 
doc
docdoc
doc
 
Los dioses
Los diosesLos dioses
Los dioses
 
#Membership renewal new logo
#Membership renewal new logo#Membership renewal new logo
#Membership renewal new logo
 
ISCRAM 2013: Extracting Information Nuggets from Disaster-Related Messages i...
ISCRAM 2013: Extracting Information Nuggets from  Disaster-Related Messages i...ISCRAM 2013: Extracting Information Nuggets from  Disaster-Related Messages i...
ISCRAM 2013: Extracting Information Nuggets from Disaster-Related Messages i...
 
01.사람의 이해와 UX/UI(시스템과 유저 인터페이스)
01.사람의 이해와 UX/UI(시스템과 유저 인터페이스)01.사람의 이해와 UX/UI(시스템과 유저 인터페이스)
01.사람의 이해와 UX/UI(시스템과 유저 인터페이스)
 
Laundry Management System
Laundry Management SystemLaundry Management System
Laundry Management System
 
영화 미디어에 투영된 미래 분석 및 미래가전 시사점 퍼셉션
영화 미디어에 투영된 미래 분석 및 미래가전 시사점 퍼셉션영화 미디어에 투영된 미래 분석 및 미래가전 시사점 퍼셉션
영화 미디어에 투영된 미래 분석 및 미래가전 시사점 퍼셉션
 
Concepto e importancia de la mercadotecnia
Concepto e importancia de la mercadotecniaConcepto e importancia de la mercadotecnia
Concepto e importancia de la mercadotecnia
 

Similar to Summarizing Situational Tweets in Crisis Scenario

Processing Social Media Messages in Mass Emergency: A Survey
Processing Social Media Messages in Mass Emergency: A SurveyProcessing Social Media Messages in Mass Emergency: A Survey
Processing Social Media Messages in Mass Emergency: A SurveyMuhammad Imran
 
OCHA Think Brief - Hashtag Standards for emergencies
OCHA Think Brief - Hashtag Standards for emergenciesOCHA Think Brief - Hashtag Standards for emergencies
OCHA Think Brief - Hashtag Standards for emergenciesJan Husar
 
Data Analytics and Industry-Academic Partnerships: An Irish Perspective
Data Analytics and Industry-Academic Partnerships: An Irish PerspectiveData Analytics and Industry-Academic Partnerships: An Irish Perspective
Data Analytics and Industry-Academic Partnerships: An Irish PerspectiveJohn Breslin
 
Open analytics social media framework
Open analytics   social media frameworkOpen analytics   social media framework
Open analytics social media frameworkOpen Analytics
 
Twitter as a personalizable information service ii
Twitter as a personalizable information service iiTwitter as a personalizable information service ii
Twitter as a personalizable information service iiKan-Han (John) Lu
 
Using Twitter as a Postgraduate Researcher
Using Twitter as a Postgraduate ResearcherUsing Twitter as a Postgraduate Researcher
Using Twitter as a Postgraduate ResearcherSimon Bishop
 
Rob Procter
Rob ProcterRob Procter
Rob ProcterNSMNSS
 
IRJET- Identification of Prevalent News from Twitter and Traditional Media us...
IRJET- Identification of Prevalent News from Twitter and Traditional Media us...IRJET- Identification of Prevalent News from Twitter and Traditional Media us...
IRJET- Identification of Prevalent News from Twitter and Traditional Media us...IRJET Journal
 
LIS 653 Knowledge Organization | Pratt Institute School of Information | Fall...
LIS 653 Knowledge Organization | Pratt Institute School of Information | Fall...LIS 653 Knowledge Organization | Pratt Institute School of Information | Fall...
LIS 653 Knowledge Organization | Pratt Institute School of Information | Fall...PrattSILS
 
Adressing Volume and Velocity Challenge on the Social Web using Crowd Sourced...
Adressing Volume and Velocity Challenge on the Social Web using Crowd Sourced...Adressing Volume and Velocity Challenge on the Social Web using Crowd Sourced...
Adressing Volume and Velocity Challenge on the Social Web using Crowd Sourced...Pavan Kapanipathi
 
Twitter data analysis using R
Twitter data analysis using RTwitter data analysis using R
Twitter data analysis using Rsantoshi mangalgi
 
Content Curation – New L&D Mindset & Skill Set
Content Curation – New L&D Mindset & Skill SetContent Curation – New L&D Mindset & Skill Set
Content Curation – New L&D Mindset & Skill SetLearningCafe
 
Crisis Mapping, Citizen Sensing and Social Media Analytics: Leveraging Citize...
Crisis Mapping, Citizen Sensing and Social Media Analytics: Leveraging Citize...Crisis Mapping, Citizen Sensing and Social Media Analytics: Leveraging Citize...
Crisis Mapping, Citizen Sensing and Social Media Analytics: Leveraging Citize...Artificial Intelligence Institute at UofSC
 
New Approaches to Large-Scale Social Media Analytics: Investigating Twitter i...
New Approaches to Large-Scale Social Media Analytics: Investigating Twitter i...New Approaches to Large-Scale Social Media Analytics: Investigating Twitter i...
New Approaches to Large-Scale Social Media Analytics: Investigating Twitter i...Axel Bruns
 
IRJET- An Improved Machine Learning for Twitter Breaking News Extraction ...
IRJET-  	  An Improved Machine Learning for Twitter Breaking News Extraction ...IRJET-  	  An Improved Machine Learning for Twitter Breaking News Extraction ...
IRJET- An Improved Machine Learning for Twitter Breaking News Extraction ...IRJET Journal
 
Twitter analytics: some thoughts on sampling, tools, data, ethics and user re...
Twitter analytics: some thoughts on sampling, tools, data, ethics and user re...Twitter analytics: some thoughts on sampling, tools, data, ethics and user re...
Twitter analytics: some thoughts on sampling, tools, data, ethics and user re...Farida Vis
 
What's up at Kno.e.sis?
What's up at Kno.e.sis? What's up at Kno.e.sis?
What's up at Kno.e.sis? Amit Sheth
 
Twitter, Public Communication and the Media Ecology: The Case of the Queensla...
Twitter, Public Communication and the Media Ecology: The Case of the Queensla...Twitter, Public Communication and the Media Ecology: The Case of the Queensla...
Twitter, Public Communication and the Media Ecology: The Case of the Queensla...Axel Bruns
 
Data analytics introduction
Data analytics introductionData analytics introduction
Data analytics introductionamiyadash
 
Social Media in Australia: The Case of Twitter
Social Media in Australia: The Case of TwitterSocial Media in Australia: The Case of Twitter
Social Media in Australia: The Case of TwitterAxel Bruns
 

Similar to Summarizing Situational Tweets in Crisis Scenario (20)

Processing Social Media Messages in Mass Emergency: A Survey
Processing Social Media Messages in Mass Emergency: A SurveyProcessing Social Media Messages in Mass Emergency: A Survey
Processing Social Media Messages in Mass Emergency: A Survey
 
OCHA Think Brief - Hashtag Standards for emergencies
OCHA Think Brief - Hashtag Standards for emergenciesOCHA Think Brief - Hashtag Standards for emergencies
OCHA Think Brief - Hashtag Standards for emergencies
 
Data Analytics and Industry-Academic Partnerships: An Irish Perspective
Data Analytics and Industry-Academic Partnerships: An Irish PerspectiveData Analytics and Industry-Academic Partnerships: An Irish Perspective
Data Analytics and Industry-Academic Partnerships: An Irish Perspective
 
Open analytics social media framework
Open analytics   social media frameworkOpen analytics   social media framework
Open analytics social media framework
 
Twitter as a personalizable information service ii
Twitter as a personalizable information service iiTwitter as a personalizable information service ii
Twitter as a personalizable information service ii
 
Using Twitter as a Postgraduate Researcher
Using Twitter as a Postgraduate ResearcherUsing Twitter as a Postgraduate Researcher
Using Twitter as a Postgraduate Researcher
 
Rob Procter
Rob ProcterRob Procter
Rob Procter
 
IRJET- Identification of Prevalent News from Twitter and Traditional Media us...
IRJET- Identification of Prevalent News from Twitter and Traditional Media us...IRJET- Identification of Prevalent News from Twitter and Traditional Media us...
IRJET- Identification of Prevalent News from Twitter and Traditional Media us...
 
LIS 653 Knowledge Organization | Pratt Institute School of Information | Fall...
LIS 653 Knowledge Organization | Pratt Institute School of Information | Fall...LIS 653 Knowledge Organization | Pratt Institute School of Information | Fall...
LIS 653 Knowledge Organization | Pratt Institute School of Information | Fall...
 
Adressing Volume and Velocity Challenge on the Social Web using Crowd Sourced...
Adressing Volume and Velocity Challenge on the Social Web using Crowd Sourced...Adressing Volume and Velocity Challenge on the Social Web using Crowd Sourced...
Adressing Volume and Velocity Challenge on the Social Web using Crowd Sourced...
 
Twitter data analysis using R
Twitter data analysis using RTwitter data analysis using R
Twitter data analysis using R
 
Content Curation – New L&D Mindset & Skill Set
Content Curation – New L&D Mindset & Skill SetContent Curation – New L&D Mindset & Skill Set
Content Curation – New L&D Mindset & Skill Set
 
Crisis Mapping, Citizen Sensing and Social Media Analytics: Leveraging Citize...
Crisis Mapping, Citizen Sensing and Social Media Analytics: Leveraging Citize...Crisis Mapping, Citizen Sensing and Social Media Analytics: Leveraging Citize...
Crisis Mapping, Citizen Sensing and Social Media Analytics: Leveraging Citize...
 
New Approaches to Large-Scale Social Media Analytics: Investigating Twitter i...
New Approaches to Large-Scale Social Media Analytics: Investigating Twitter i...New Approaches to Large-Scale Social Media Analytics: Investigating Twitter i...
New Approaches to Large-Scale Social Media Analytics: Investigating Twitter i...
 
IRJET- An Improved Machine Learning for Twitter Breaking News Extraction ...
IRJET-  	  An Improved Machine Learning for Twitter Breaking News Extraction ...IRJET-  	  An Improved Machine Learning for Twitter Breaking News Extraction ...
IRJET- An Improved Machine Learning for Twitter Breaking News Extraction ...
 
Twitter analytics: some thoughts on sampling, tools, data, ethics and user re...
Twitter analytics: some thoughts on sampling, tools, data, ethics and user re...Twitter analytics: some thoughts on sampling, tools, data, ethics and user re...
Twitter analytics: some thoughts on sampling, tools, data, ethics and user re...
 
What's up at Kno.e.sis?
What's up at Kno.e.sis? What's up at Kno.e.sis?
What's up at Kno.e.sis?
 
Twitter, Public Communication and the Media Ecology: The Case of the Queensla...
Twitter, Public Communication and the Media Ecology: The Case of the Queensla...Twitter, Public Communication and the Media Ecology: The Case of the Queensla...
Twitter, Public Communication and the Media Ecology: The Case of the Queensla...
 
Data analytics introduction
Data analytics introductionData analytics introduction
Data analytics introduction
 
Social Media in Australia: The Case of Twitter
Social Media in Australia: The Case of TwitterSocial Media in Australia: The Case of Twitter
Social Media in Australia: The Case of Twitter
 

More from Muhammad Imran

Damage Assessment from Social Media Imagery Data During Disasters
Damage Assessment from Social Media Imagery Data During DisastersDamage Assessment from Social Media Imagery Data During Disasters
Damage Assessment from Social Media Imagery Data During DisastersMuhammad Imran
 
Image4Act: Online Social Media Image Processing for Disaster Response
Image4Act: Online Social Media Image Processing for Disaster ResponseImage4Act: Online Social Media Image Processing for Disaster Response
Image4Act: Online Social Media Image Processing for Disaster ResponseMuhammad Imran
 
Real-Time Processing of Social Media Content for Social Good
Real-Time Processing of Social Media Content for Social GoodReal-Time Processing of Social Media Content for Social Good
Real-Time Processing of Social Media Content for Social GoodMuhammad Imran
 
AIDR Tutorial (Artificial Intelligence for Disaster Response)
AIDR Tutorial (Artificial Intelligence for Disaster Response)AIDR Tutorial (Artificial Intelligence for Disaster Response)
AIDR Tutorial (Artificial Intelligence for Disaster Response)Muhammad Imran
 
A Robust Framework for Classifying Evolving Document Streams in an Expert-Mac...
A Robust Framework for Classifying Evolving Document Streams in an Expert-Mac...A Robust Framework for Classifying Evolving Document Streams in an Expert-Mac...
A Robust Framework for Classifying Evolving Document Streams in an Expert-Mac...Muhammad Imran
 
Introduction to Machine Learning: An Application to Disaster Response
Introduction to Machine Learning: An Application to Disaster ResponseIntroduction to Machine Learning: An Application to Disaster Response
Introduction to Machine Learning: An Application to Disaster ResponseMuhammad Imran
 
A Real-time Heuristic-based Unsupervised Method for Name Disambiguation in Di...
A Real-time Heuristic-based Unsupervised Method for Name Disambiguation in Di...A Real-time Heuristic-based Unsupervised Method for Name Disambiguation in Di...
A Real-time Heuristic-based Unsupervised Method for Name Disambiguation in Di...Muhammad Imran
 
Coordinating Human and Machine Intelligence to Classify Microblog Communica0o...
Coordinating Human and Machine Intelligence to Classify Microblog Communica0o...Coordinating Human and Machine Intelligence to Classify Microblog Communica0o...
Coordinating Human and Machine Intelligence to Classify Microblog Communica0o...Muhammad Imran
 
Tweet4act: Using Incident-Specific Profiles for Classifying Crisis-Related Me...
Tweet4act: Using Incident-Specific Profiles for Classifying Crisis-Related Me...Tweet4act: Using Incident-Specific Profiles for Classifying Crisis-Related Me...
Tweet4act: Using Incident-Specific Profiles for Classifying Crisis-Related Me...Muhammad Imran
 
Extracting Information Nuggets from Disaster-Related Messages in Social Media
Extracting Information Nuggets from Disaster-Related Messages in Social MediaExtracting Information Nuggets from Disaster-Related Messages in Social Media
Extracting Information Nuggets from Disaster-Related Messages in Social MediaMuhammad Imran
 
Domain Specific Mashups
Domain Specific MashupsDomain Specific Mashups
Domain Specific MashupsMuhammad Imran
 
Reseval Mashup Platform Talk at SECO
Reseval Mashup Platform Talk at SECOReseval Mashup Platform Talk at SECO
Reseval Mashup Platform Talk at SECOMuhammad Imran
 
ResEval: Resource-oriented Research Impact Evaluation platform
ResEval: Resource-oriented Research Impact Evaluation platformResEval: Resource-oriented Research Impact Evaluation platform
ResEval: Resource-oriented Research Impact Evaluation platformMuhammad Imran
 

More from Muhammad Imran (13)

Damage Assessment from Social Media Imagery Data During Disasters
Damage Assessment from Social Media Imagery Data During DisastersDamage Assessment from Social Media Imagery Data During Disasters
Damage Assessment from Social Media Imagery Data During Disasters
 
Image4Act: Online Social Media Image Processing for Disaster Response
Image4Act: Online Social Media Image Processing for Disaster ResponseImage4Act: Online Social Media Image Processing for Disaster Response
Image4Act: Online Social Media Image Processing for Disaster Response
 
Real-Time Processing of Social Media Content for Social Good
Real-Time Processing of Social Media Content for Social GoodReal-Time Processing of Social Media Content for Social Good
Real-Time Processing of Social Media Content for Social Good
 
AIDR Tutorial (Artificial Intelligence for Disaster Response)
AIDR Tutorial (Artificial Intelligence for Disaster Response)AIDR Tutorial (Artificial Intelligence for Disaster Response)
AIDR Tutorial (Artificial Intelligence for Disaster Response)
 
A Robust Framework for Classifying Evolving Document Streams in an Expert-Mac...
A Robust Framework for Classifying Evolving Document Streams in an Expert-Mac...A Robust Framework for Classifying Evolving Document Streams in an Expert-Mac...
A Robust Framework for Classifying Evolving Document Streams in an Expert-Mac...
 
Introduction to Machine Learning: An Application to Disaster Response
Introduction to Machine Learning: An Application to Disaster ResponseIntroduction to Machine Learning: An Application to Disaster Response
Introduction to Machine Learning: An Application to Disaster Response
 
A Real-time Heuristic-based Unsupervised Method for Name Disambiguation in Di...
A Real-time Heuristic-based Unsupervised Method for Name Disambiguation in Di...A Real-time Heuristic-based Unsupervised Method for Name Disambiguation in Di...
A Real-time Heuristic-based Unsupervised Method for Name Disambiguation in Di...
 
Coordinating Human and Machine Intelligence to Classify Microblog Communica0o...
Coordinating Human and Machine Intelligence to Classify Microblog Communica0o...Coordinating Human and Machine Intelligence to Classify Microblog Communica0o...
Coordinating Human and Machine Intelligence to Classify Microblog Communica0o...
 
Tweet4act: Using Incident-Specific Profiles for Classifying Crisis-Related Me...
Tweet4act: Using Incident-Specific Profiles for Classifying Crisis-Related Me...Tweet4act: Using Incident-Specific Profiles for Classifying Crisis-Related Me...
Tweet4act: Using Incident-Specific Profiles for Classifying Crisis-Related Me...
 
Extracting Information Nuggets from Disaster-Related Messages in Social Media
Extracting Information Nuggets from Disaster-Related Messages in Social MediaExtracting Information Nuggets from Disaster-Related Messages in Social Media
Extracting Information Nuggets from Disaster-Related Messages in Social Media
 
Domain Specific Mashups
Domain Specific MashupsDomain Specific Mashups
Domain Specific Mashups
 
Reseval Mashup Platform Talk at SECO
Reseval Mashup Platform Talk at SECOReseval Mashup Platform Talk at SECO
Reseval Mashup Platform Talk at SECO
 
ResEval: Resource-oriented Research Impact Evaluation platform
ResEval: Resource-oriented Research Impact Evaluation platformResEval: Resource-oriented Research Impact Evaluation platform
ResEval: Resource-oriented Research Impact Evaluation platform
 

Recently uploaded

Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024thyngster
 
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPramod Kumar Srivastava
 
Industrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdfIndustrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdfLars Albertsson
 
From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...Florian Roscheck
 
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...Sapana Sha
 
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptx
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptxAmazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptx
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptxAbdelrhman abooda
 
Call Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts ServiceCall Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts ServiceSapana Sha
 
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)jennyeacort
 
RadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdfRadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdfgstagge
 
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfPredicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfBoston Institute of Analytics
 
GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]📊 Markus Baersch
 
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDINTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDRafezzaman
 
B2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxB2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxStephen266013
 
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /WhatsappsBeautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsappssapnasaifi408
 
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样vhwb25kk
 
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degreeyuu sss
 
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝DelhiRS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhijennyeacort
 
Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024Colleen Farrelly
 
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改yuu sss
 

Recently uploaded (20)

Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
 
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
 
Industrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdfIndustrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdf
 
From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...
 
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
 
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptx
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptxAmazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptx
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptx
 
Call Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts ServiceCall Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts Service
 
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
 
E-Commerce Order PredictionShraddha Kamble.pptx
E-Commerce Order PredictionShraddha Kamble.pptxE-Commerce Order PredictionShraddha Kamble.pptx
E-Commerce Order PredictionShraddha Kamble.pptx
 
RadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdfRadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdf
 
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfPredicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
 
GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]
 
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDINTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
 
B2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxB2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docx
 
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /WhatsappsBeautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsapps
 
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
 
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
 
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝DelhiRS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
 
Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024
 
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
 

Summarizing Situational Tweets in Crisis Scenario

  • 1. Summarizing Situational Tweets in Crisis Scenario Imran Muhammad Qatar Computing Research Institute, HBKU Doha, Qatar Co-authors: Koustav Rudra, Niloy Ganguly, Pawan Goyal (CSE, IIT Kharagpur, India) Siddhartha Banerjee (Pennsylvania State University, USA) Prasenjit Mitra (Qatar Computing Research Institute)
  • 2. Aid Needs and Information Needs Info. Info. Info. Disaster event Urgent needs of affected people Information gathering Humanitarian organizations and local administration Information gathering, especially in real-time, is the most challenging part Relief operations - Food, water - Shelter - Medical emergences - Donations - …
  • 3. Information: A Lifeline During Disasters The opaqueness induced by disasters is overwhelming People need information as much as water, food, medicine or shelter Lack of information can make people victims of disaster and targets of aid
  • 4. Twitter: A Useful Information Source • Provide active communication channels during crises • Useful information: reports of casualties, damages, donation offers and requests • Quicker than traditional channels (e.g. first tweet about Westgate Mall attack reported within a minute)
  • 5. Opportunities • Rapid crisis response • Time-critical situational awareness • Access to actionable information • … • But, it requires real-time data processing • Categorizations of each incoming item should be done as soon as it arrives • Rapid automatic summaries generation
  • 6. Enhanced Situational Awareness Time-critical situational awareness by generating automatic summaries • We use AIDR (Artificial Intelligence for Disaster Response) system for: – real-time data processing – categorizations of tweets • We proposed a novel framework for summarization of informative tweets
  • 7. Summarization of Tweets Example Dharara Tower built in 1832 collapses in Kathmandu during earthquake Historic Dharara Tower Collapses in Kathmandu After 7.9 Earthquake Dharara tower built in 1832 collapses in Kathmandu after 7.9 earthquake.
  • 8. Key Characteristics and Objectives • Information coverage – Capture most situational updates from data. The summary should be rich in terms of information coverage • Less redundant information – Messages on Twitter contain duplicate information. We aim for summaries with less redundancy while keeping important updates • Readability – Twitter messages are often noisy, informal, and full of grammatical mistakes. We aim to produce more readable summaries • Real-time – The system should not be heavily overloaded with computations such that by the time the summary is produced, the utility of that information is marginal
  • 10. Datasets • Nepal earthquake tweets from 25th to 27th April 2015 • AIDR classified tweets to the following categories: – Missing trapped or found people (10,751 tweets) – Infrastructure and utilities damage (16,842 tweets) – Shelter and supplies (19,006 tweets)
  • 11. The Role of Content Words in Extractive Summarization • Studies show the significance of content words to capture important events – Nouns (e.g. hospitals, buildings, bridges names) – Numerals (e.g. number of casualties) – Main verbs (e.g. collapsed, destroyed, killed)
  • 12. Abstractive Summarization • We generate a word graph where nodes are bigrams • We generate sentences from the word graph Challenge: Maintaining informativeness and readability – Covering important content words – Favoring more informative paths – Maintaining linguistic quality ILP model combining the above factors
  • 13. Bi-gram Based Word Graph • Word graph: nodes represent bi-grams (along with their POS-tags) • An edge represents consecutive words • Nodes of two tweets with same bi-gram and POS- tags are merged
  • 14. ILP Based Formulation Parameters • Score of sentences/generated paths (CW(s)) – Centroid score • Linguistic quality(LQ(s)) – Trigram language model – LQ(s) = 1/(1-ll(w1,w2,…,wq)) – ll(w1,w2,…,wq) = 1/Llog2∏q t =3 P(wt|wt-2wt-1)
  • 15. ILP Based Solution max(∑i=1…n CW(i)*LQ(i)*xi + ∑j=1…m yj) xi, yj binary variable xi tweet indicator, yj content word indicator CW(i) = tweet i centroid score LQ(i) = Linguistic score of tweet i Constraints ∑i=1…n xi * Length(i) ≤ L Length(i) = number of words in tweet i L = required summary word length ∑i ∈ Tj xi ≥ yj j = [1 … m] Tj = set of tweets where content word j is present If yj is selected then at least one tweet covering that word is also selected ∑j ∈ Ci yj ≥|Ci| * xi i = [1 … n] Ci = set of content words present in tweet i If tweet i is selected then all the content words of that tweet are also selected
  • 16. Baselines and Gold Standard • COWTS: content word based extractive tweet stream summarization algorithm [CIKM 2015] • APSAL: affinity clustering based supervised disaster related news articles summarizer [ACL 2015] • TOWGS: runtime bigram based abstractive summarization [EACL 2014] Gold Standard: We used summaries generated by UN OCHA experts during Nepal earthquake
  • 17. Summarization Results .  If number of clusters increases, determining importance of different clusters becomes difficult [APSAL]  COWTS tries to maximize the coverage of content words but can’t combine information from related tweets  TOWGS didn’t consider content words into account  COWABS tries to combine similar information from related tweets as well as maximizing coverage of content words
  • 18. Performance Variation with Summary Length If summary length increases COWABS still performs better than the baselines
  • 21. Summarization Quality (Location) COWABS captures information at more granular level with location specific information
  • 22. Summarization Quality (Event) We extract event phrases using the method proposed by Ritter et al [EMNLP 2011] COWABS captures more event specific information
  • 23. Summarization Quality (Numeral) COWABS captures more numerical information which includes information about victims, helpline numbers etc.
  • 24. Conclusions • Rapid situational awareness is necessary for effective relief operations • Twitter as useful information source during emergencies • Automatic classification and summarization approach • Propose approach outperforms all baselines and deemed effective –learned from user studies

Editor's Notes

  1. For an effective planning, we need as complete information as possible. Looking back over the events of 2004 and before, it is striking how many of the year’s disasters, deaths, medical emergencies etc. could have been avoided with better information and communication with the affected people. The opaqueness induced by disasters is overwhelming. Information could be a lifeline for everybody involved in a disasters situation. People need information as much as water, food, medicine or shelter. Lack of information make people victims of disasters and targets of aid. http://www.ifrc.org/Global/Publications/disasters/WDR/69001-WDR2005-english-LR.pdf
  2. A number of research studies reveal that social media platforms such as twitter can be very useful sources of information during disasters. For example, Twitter provide active communciation channels during crises. Affected people post sitatuational awareness and actionable information. Studies have seen reports of casualities, damages, donations offers and requests… We have also seen that Twitter for example, can be quicker than traditional channels in terms of reporting first hand information
  3. Social media data can be useful for rapid crisis response But, it requires real-time data analysis So the categorizations of each incoming item into different user-defined classes should be as soon as they For example, infrastructure damage, urgent needs
  4. Social media data can be useful for rapid crisis response But, it requires real-time data analysis So the categorizations of each incoming item into different user-defined classes should be as soon as they For example, infrastructure damage, urgent needs
  5. We use a trigram-based log-likelihood score using a language model as a dummy representative of the Readability of the generated content.
  6. For extractive summarization, we mainly focus on content words. For example, nouns, numerals, main verbs Studies show that content words play significant role to capture important events. Now the question is, how much of the content words we can capture after processing N number of words. For this purpose, we randomly sample same number of tweets for each of the three classes for Nepal EQ and other two events black-Friday and Chelsea match. It was observed that number of content words are very small in case of disaster events. BLF = blackfriday CHL = chelsea match Distinct content words posted during disaster are very less in number compared to generic events Within 1000 words limit, we can cover more than 80% of the content words in a disaster event.
  7. Challenges in abstractive summarization
  8. Given this graph, in a real-scenario, the number of generated tweet-paths can be several thousands, because there can be multiple points of merging across several tweets. Our goal is to generate sentences by selecting the best paths from these generated paths with the objective of generating a readable and informative summary. We set a minimum (10 words) and maximum (16 words) length for a sentence to be generated. We apply such constraints to avoid very long sentences that might be grammatically ill-formed and very short sentences that are often incomplete.
  9. The ILP-based technique optimizes based upon three factors - (i) Presence of content words (ii) Informativeness of a path i.e. importance of a path, and (iii) Linguistic Quality Score that captures the readability of a path using a trigram confidence score. Each sentence is represented as a TF-IDF vector. The centroid is basically the mean of the TF-IDF vectors of all the sentences.
  10. Optimization problem and constraints An integer programming problem is a mathematical optimization program in which some or all of the variables are restricted to be integers and the objective function and the constraints (other than the integer constraints) are linear. Integer programming is NP-hard.
  11. Comparison of different methods and their drawbacks, our improvements
  12. Comparison with baselines when summary length increases