Crisis Information Processing - with the power of A.I.

Harith Alani
Professor, Knowledge Media Institute
The Open University, UK
@halani
AGE OF DISASTERS
Kerala floods, August 2018, ~400 fatalities, ~500K people displaced (voanews.com)
Attica, Greece, wildfires, July 2018, 98 deaths (bloomberg.com)
Japan, Typhoon Jebi, Sept 2018, 17 deaths (japantimes.co.jp)
Hurricane Maria, Oct 2017, 3K fatalities (thefrontpageonline.com)
Lombok, Indonesia, earthquake, August 2018, over 500 deaths (uk.businessinsider.com)
weforum.org
floodlist.com
COST OF DISASTERS
DISASTER MANAGEMENT CYCLE
POWER OF INFORMATION
“Disaster-affected people need information as much as water, food, medicine or shelter. Information can save lives, livelihoods and resources.”
TWITTER DURING DISASTERS
DISASTER RESPONSE THROUGH SOCIAL MEDIA
“The models that are emerging indicate that affected people are becoming extremely adept at using social media platforms in particular to engage in networked systems of response. This means they are able to post about specific needs and solicit individual responses to those needs, and that people offering specific help can also do so”
https://www.dhs.gov/sites/default/files/publications/privacy-pia-FEMA-OUSM-April2016.pdf
https://federalnewsradio.com/digital-government/2016/03/fema-hhs-turn-social-listening-better-disaster-response/
FEMA MOBILE APP
“Unfortunately, we’ve been underwhelmed with the use of that app, because I think everyone is at the Weather Channel doing their Instagrams there,” … “Instead of trying to do everything ourselves, we need to find smarter ways to integrate the social media world more effectively into how we perform our business functions.”
Scott Shoup, chief data officer at FEMA
FEMA MISSES HURRICANE DAMAGE REPORTED ON TWITTER
“Immediate damage estimates based on FEMA models can miss areas of heavy impact. Augmenting initial models with real-time analysis of social media and crowdsourced information can help identify overlooked areas. Twitter-sourced estimates were virtually available as people tweeted distress signals, of these parcel-level damage estimates, 46 percent were not captured by FEMA estimates.”
CROWDSOURCING EMERGENCY RESPONSE
WORKFLOW OF USHAHIDI & SIMILAR PLATFORMS
Actors: citizen reporters, digital responders, administrators, analysts/public/research teams
Manual steps: annotations, verification, publishing
SOCIAL MEDIA INFOSMOG DURING DISASTERS
In the US, 1.1 million tweets were sent in the first day of Hurricane Sandy, and over 20 million in total
~800K photos with #Sandy hashtag on Instagram
More than 23 million tweets were posted about the haze in Singapore
In Nepal, more than half a million posts were shared about the devastating earthquake in 2015
>2.3M tweets were sent with the words “Haiti” or “Red Cross” in 2010
~177 million tweets sent about the Japan 2011 earthquake disaster
ENGAGING STAKEHOLDERS
RESPONDERS
POLICY MAKERS
REPORTERS
DEPLOYERS
REQUIREMENTS & CHALLENGES
VOLUME, VELOCITY, VALUE, VARIETY, VALIDITY
Too much content to handle manually
More content is coming in all the time
Rumours and hoaxes spread wildly during disasters
Content is often repetitive and uninformative
Much of the content is irrelevant
RELEVANCY OF SOCIAL MEDIA POSTS
Filtering out irrelevant information helps to tackle information overload
How do we identify relevant and irrelevant information across diverse crisis situations?
Can we learn from one type of crisis situation, and apply it to another?
Can we train our models on one language and apply them to another?
CRISES DATA RELEVANCY
FILTERING METHODS
Query Filtering: #hashtags, keywords, disaster name, disaster-specific phrases, locations (output: filtered data)
Post Collection: text search, topic modelling, semantic search, automatic categorisation (output: filtered data)
Supervised / unsupervised
Event Label
Machine Learning Classifiers (e.g., Naïve Bayes, SVM, J48, CNN)
Features (e.g., n-grams, linguistic features, semantics)
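Query filtering by hashtags, keywords, and locations can be sketched as a simple term match. This is a minimal illustration, assuming posts are plain strings; the `QUERY_TERMS` set and the `matches_query` helper are hypothetical names and are not taken from any tool mentioned in the talk.

```python
import re

# Hypothetical query terms for one event: hashtags, keywords, and locations.
QUERY_TERMS = {"#keralafloods", "flood", "kerala", "rescue"}

def tokenize(post):
    """Lower-case the post and split it into words and #hashtags."""
    return set(re.findall(r"#?\w+", post.lower()))

def matches_query(post, terms=QUERY_TERMS):
    """Return True when the post shares at least one term with the query."""
    return bool(tokenize(post) & terms)

posts = [
    "Roads closed in Kerala after heavy rain #KeralaFloods",
    "Great coffee this morning!",
]
# Keep only the posts that match the query.
filtered = [p for p in posts if matches_query(p)]
```

Exact term matching like this is cheap but brittle (for example, "floods" does not match "flood"), which is one reason the slides move on to learned classifiers.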
AUTOMATIC CLASSIFICATION
Analysis Features:
N-grams
Text length
Count of nouns/verbs/pronouns
Hashtags
Mentions
Readability score
…
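A few of these surface features can be computed with the standard library alone. The sketch below is an assumption about how such features might be extracted, not the authors' pipeline; part-of-speech counts and readability scores are left out because they need extra tooling, and `statistical_features` is a hypothetical helper name.

```python
import re

def statistical_features(post):
    """Compute a subset of the surface features listed on the slide."""
    tokens = post.split()
    # Alphabetic words only, lower-cased, used for the n-grams.
    words = [w.lower() for w in re.findall(r"[A-Za-z']+", post)]
    return {
        "text_length": len(post),
        "num_tokens": len(tokens),
        "num_hashtags": sum(t.startswith("#") for t in tokens),
        "num_mentions": sum(t.startswith("@") for t in tokens),
        "unigrams": words,
        "bigrams": list(zip(words, words[1:])),
    }

features = statistical_features("@redcross Bridge collapsed near #Lombok")
```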
CLASSIFICATION MODELS
HURRICANE HARVEY, HURRICANE IRMA, KERALA FLOODS, LOMBOK EARTHQUAKE
Classification Model
TRAIN & TEST ON SAME CRISES EVENTS
Typical approach: Train and test on data from the same disasters
SVM (20 iterations of 5-fold cross-validation)
Features                PRECISION  RECALL  F-MEASURE
Statistical Features    0.81       0.81    0.81
What if we add some domain knowledge?
SEMANTIC INFORMATION
[Figure: knowledge-graph fragment. Entities: <dbp:Barack_Obama>, <dbp:Hosni_Mubarak>, <dbp:CNN>, <dbp:Egypt>, <dbp:Egyptian_Arabic>, <dbp:Country>. Categories and types: <skos:Nobel_Peace_Price_laureates>, <dbo:PresidentOfUnitedStateofAmerica>, <skos:PresidentsOfEgypt>, <skos:Arab_republics>, <skos:English-language_television_stations>. Properties: dbprop:nationality (value: American), dcterms:subject, rdf:type, dbprop:languages.]
ADDING SEMANTICS TO CLASSIFICATION MODEL
Semantic Annotation
Semantic Expansion
Semantic Filtering: filtering out abstract concepts
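The three semantic steps can be sketched with a toy lookup table. Everything below is illustrative: `KNOWLEDGE_BASE` and `ABSTRACT_CONCEPTS` are hypothetical stand-ins for a real semantic resource and for whatever filtering rule the authors used, which the slides do not specify.

```python
# Toy knowledge base standing in for a real semantic resource (hypothetical data).
KNOWLEDGE_BASE = {
    "obama": {"label": "Barack_Obama", "types": ["President", "Agent", "Thing"]},
    "egypt": {"label": "Egypt", "types": ["Country", "Place", "Thing"]},
}
# Concepts assumed to be too abstract to help the classifier.
ABSTRACT_CONCEPTS = {"Thing", "Agent"}

def annotate(tokens):
    """Semantic annotation: map tokens to known entities."""
    return [KNOWLEDGE_BASE[t] for t in tokens if t in KNOWLEDGE_BASE]

def expand(entities):
    """Semantic expansion: add each entity's label and types as features."""
    return [c for e in entities for c in [e["label"], *e["types"]]]

def filter_abstract(concepts):
    """Semantic filtering: drop overly abstract concepts."""
    return [c for c in concepts if c not in ABSTRACT_CONCEPTS]

tokens = "obama visits egypt".split()
semantic_features = filter_abstract(expand(annotate(tokens)))
```

The resulting concept list would be appended to the statistical features before training, so that two posts naming different presidents or countries still share features.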
TRAIN & TEST ON SAME CRISES EVENTS
SVM (20 iterations of 5-fold cross-validation)
Features                PRECISION  RECALL  F-MEASURE  ∆F/F (%)
Statistical Features    0.81       0.81    0.81       -
Semantic Features       0.82       0.82    0.82       1.39
Semantic Features       0.81       0.81    0.81       0.33
Semantic Features       0.82       0.82    0.82       0.6
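For readers less familiar with the metrics, the sketch below shows how precision, recall, and F-measure follow from classification counts, and how a relative change such as ∆F/F (%) can be computed. It assumes ∆F/F is the change in F relative to the statistical-features baseline, which the slide does not state explicitly, and the counts are invented for illustration.

```python
def precision_recall_f1(tp, fp, fn):
    """Standard definitions from true/false positive and false negative counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

def relative_gain(f_new, f_base):
    """Relative change of f_new over f_base, in percent (assumed meaning of ∆F/F)."""
    return 100.0 * (f_new - f_base) / f_base

# Illustrative counts only; they are not taken from the experiments.
p, r, f = precision_recall_f1(tp=81, fp=19, fn=19)

# Using the rounded scores from the slide gives about 1.23%.
gain = relative_gain(0.82, 0.81)
```

The slide reports 1.39% for the same pair of rounded scores, so the published figures were presumably computed from unrounded values.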
CLASSIFICATION MODELS
Training events: HURRICANE HARVEY, HURRICANE IRMA, KERALA FLOODS, LOMBOK EARTHQUAKE
Classification Model
New events (?): TYPHOON, TRAIN CRASH, BOMBING, MASS SHOOTING, TSUNAMI
How can we train models to become less biased towards specific disaster events, or types of events?
DISTRIBUTION OF CRISES EVENT TYPES
[Figure: pie chart. Categories: Wildfire/Bushfire, E’quakes, Flood/Typhoons, Terror, Shooting/Bombing, Train Crash, Meteor, Haze, Helicopter Crash. Slice values: 8%, 16%, 32%, 8%, 8%, 4%, 4%, 4%, 4%, 8%, 4%.]
CLASSIFYING FAMILIAR EVENTS
Train model on all data, then test on a new crisis event of a type that was in the training set
E.g., train model on data that include flood events, then test on a new flood crisis event
Adding semantic features offers modest improvements over statistical features alone
[Figure: F-measure (0 to 1) per test event, Statistical Features vs Semantic Features. Flood/Typhoon: TyphoonYolanda, TyphoonPablo, AlbertaFlood, QueenslandFlood, ColoradoFloods, PhilippinesFlood, SardiniaFlood. Earthquake: GuatemalaEarthquake, ItalyEarthquake, BoholEarthquake, CostaRicaEarthquake. Average shown last. ∆ 1.7%]
CLASSIFYING UNFAMILIAR EVENTS
Train model on certain types of events, and test it on other types
E.g., train model on data that include flood and earthquake events, then test on a train crash incident
Adding semantic features offers a good improvement over statistical features alone
[Figure: F-measure (0 to 0.9) per test event, Statistical Features vs Semantic Features. Terror/Bomb/Train: LAAirportShoot, LacMeganticTrainCrash, BostonBombing, SpainTrainCrash. Flood/Typhoon: TyphoonYolanda, TyphoonPablo, AlbertaFlood, QueenslandFlood, ColoradoFloods, PhilippinesFlood, SardiniaFlood. Earthquake: GuatemalaEarthquake, ItalyEarthquake, BoholEarthquake, CostaRicaEarthquake. Average shown last. ∆ 7.2%]
Khare, P.; Burel, G. and Alani, H. Classifying Crises-Information Relevancy with Semantics. Extended Semantic Web Conference (ESWC), Heraklion, Crete, 2018.
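The difference between the familiar and unfamiliar settings comes down to how events are held out. A minimal sketch, assuming each event carries a type label; the `EVENTS` mapping and both helper functions are hypothetical and only reuse event names that appear in the charts.

```python
# Hypothetical event inventory: event name -> event type.
EVENTS = {
    "AlbertaFlood": "flood",
    "QueenslandFlood": "flood",
    "ItalyEarthquake": "earthquake",
    "BoholEarthquake": "earthquake",
    "SpainTrainCrash": "train_crash",
}

def familiar_split(test_event):
    """Hold out one event; its type must still occur in the training events."""
    train = [e for e in EVENTS if e != test_event]
    assert any(EVENTS[e] == EVENTS[test_event] for e in train), "type not in training data"
    return train, [test_event]

def unfamiliar_split(test_type):
    """Hold out every event of one type, so the type is unseen in training."""
    train = [e for e, t in EVENTS.items() if t != test_type]
    test = [e for e, t in EVENTS.items() if t == test_type]
    return train, test

familiar_train, familiar_test = familiar_split("AlbertaFlood")
unfamiliar_train, unfamiliar_test = unfamiliar_split("train_crash")
```

In the unfamiliar setting the classifier never sees vocabulary specific to the held-out type, which is where more general semantic features have room to help.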
MULTILINGUALITY IN CRISES DATA
[Figure: share of posts per language (axis 0 to 100; ENGLISH, ITALIAN, SPANISH, OTHER) for each event: ColoradoWildfire, CostaRicaQuake, GuatemalaQuake, ItalyQuake, PhilippinesFlood, TyphoonPablo, VenezuelaRefinery, AlbertaFlood, AustraliaBushfire, BoholE’quake, BostonBombing, BrazilClubFire, ColoradoFloods, GlasgowHelicopter, LAAirportShoot, LacMeganticTrain, ManilaFlood, NYTrainCrash, QueenslandFlood, RussiaMeteor, SardiniaFlood, SavarBuilding, SingaporeHaze, SpainTrainCrash, TyphoonYolanda, TexasExplosion, L’AquilaQuake, GenovaFlood, EmiliaQuake, ChileQuake.]
CLASSIFYING MULTILINGUAL CRISES DATA
Monolingual Classification with Monolingual Models
Train the model on one language and test it on data in the same language. For example, train and test on data written in English. This is the default approach, and can be used as a baseline.
Cross-lingual Classification with Monolingual Models
Run the classifiers on crisis data in languages that were not observed in the training data. For example, we test the classifier on Italian when the classifier was trained on English or Spanish.
Cross-lingual Classification with Machine Translation
Train the classification model on data in a certain language (e.g. Spanish), and use it to classify data that has been automatically translated from other languages (e.g., Italian and English) into the language of the training data.
Khare, P., Burel, G., Maynard, D., and Alani, H. Cross-Lingual Classification of Crisis Data. Int. Semantic Web Conference, Monterey, CA, USA, 2018
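The three setups differ only in which train/test language pairs are used and in whether the test text is translated first. The sketch below enumerates the pairs for the three languages in the study; `translate` is a hypothetical placeholder, not a real machine-translation API, and `prepare_test_post` is an illustrative helper.

```python
from itertools import permutations

LANGUAGES = ["English", "Italian", "Spanish"]

def translate(text, source, target):
    """Placeholder for a machine-translation call (hypothetical, tags the text only)."""
    return f"[{source}->{target}] {text}"

# Monolingual setup: train and test on the same language.
monolingual = [(lang, lang) for lang in LANGUAGES]

# Cross-lingual setups: train on one language, test on a different one.
cross_lingual = list(permutations(LANGUAGES, 2))

def prepare_test_post(text, train_lang, test_lang, use_translation):
    """Return the text the classifier sees for a given train/test language pair."""
    if use_translation and train_lang != test_lang:
        # Translate the test post into the language of the training data.
        return translate(text, source=test_lang, target=train_lang)
    return text

raw = prepare_test_post("terremoto", "English", "Italian", use_translation=False)
translated = prepare_test_post("terremoto", "English", "Italian", use_translation=True)
```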
Cross-lingual Classification with Monolingual Models
Around 9% improvement in detecting crisis-data relevancy when training on one language and applying it to another
Train language [Test language]   Statistical Features   Semantic Features
English [Italian]                0.429                  0.572
English [Spanish]                0.688                  0.659
Italian [English]                0.521                  0.538
Italian [Spanish]                0.64                   0.631
Spanish [English]                0.578                  0.65
Spanish [Italian]                0.489                  0.543
average                          0.557                  0.599
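The "around 9%" figure can be checked against the chart values for this setup. The sketch assumes the first seven numbers belong to the statistical features and the next seven to the semantic features, in the order of the pair labels; that pairing is an inference from the extracted text, supported by the fact that each group's last value matches the mean of the other six.

```python
# Scores read from the chart; the series-to-feature pairing is an assumption.
PAIRS = [
    "English [Italian]", "English [Spanish]", "Italian [English]",
    "Italian [Spanish]", "Spanish [English]", "Spanish [Italian]",
]
STATISTICAL = [0.429, 0.688, 0.521, 0.64, 0.578, 0.489]
SEMANTIC = [0.572, 0.659, 0.538, 0.631, 0.65, 0.543]

def mean(values):
    return sum(values) / len(values)

# Relative gain of semantic over statistical features for each language pair.
per_pair_gain = [100 * (s - b) / b for b, s in zip(STATISTICAL, SEMANTIC)]

# Averaging the per-pair gains gives roughly 9%.
mean_gain = mean(per_pair_gain)
```

Note that the gain is uneven: under this pairing, two of the six language pairs score slightly lower with semantic features, while English [Italian] gains about a third.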
Cross-lingual Classification with Machine Translation
Machine translation offers good classification improvements without any semantics
Train language [Test language]   Statistical Features   Semantic Features
English [Italian->English]       0.546                  0.581
English [Spanish->English]       0.669                  0.664
Italian [English->Italian]       0.572                  0.551
Italian [Spanish->Italian]       0.609                  0.582
Spanish [English->Spanish]       0.675                  0.683
Spanish [Italian->Spanish]       0.593                  0.571
average                          0.633                  0.605
Monolingual Classification with Monolingual Models
Semantics add little/no benefit when building, and applying, classification models on the same language
Train language [Test language]   Statistical Features   Semantic Features
English [English]                0.831                  0.818
Italian [Italian]                0.709                  0.712
Spanish [Spanish]                0.781                  0.776
average                          0.774                  0.769
CRISIS-DATA PROCESSING TASKS (in order of increasing granularity)
Task 1: Crisis vs. non-Crisis Related Messages
Differentiate posts that are related to a crisis situation from those that are not
Task 2: Type of Crisis
Identify the type of crisis the message is related to: Shooting, Explosion, Building Collapse, Fires, Floods, Meteorite Fall, etc.
Task 3: Type of Information
Identify the type of information the message carries: Affected Individuals, Infrastructures and Utilities, Donations and Volunteer, Caution and Advice, etc.
Olteanu, A., Vieweg, S., Castillo, C. What to Expect When the Unexpected Happens: Social Media Communications Across Crises. ACM Comp. Supported Cooperative Work and Social Computing (CSCW), 2015
CRISIS-DATA PROCESSING TASKS
Incorporating semantics into Machine Learning classification methods:
Approach 1: Traditional ML Classifiers
Approach 2: Deep Learning
DEEP LEARNING VS CLASSIC ML
http://hurricane.dsig.net
DEEP LEARNING FOR CRISIS EVENT DETECTION
A semantically-enriched deep learning model for event detection on Twitter
Pipeline: Tweets, Preprocessing, Concept Extraction, Word Vectors Initialisation (Pre-trained Embeddings), Semantic Vectors Initialisation, Sem-CNN Training
T = “Obama attends vigil for Boston Marathon bombing victims”
Bag of Words (fed to Embeddings): W = [obama, attends, vigil, for, boston, marathon, bombing, victims]
Bag of Concepts: C = [obama, politician, none, none, none, boston, location, none, none, none]
Concepts Vector, a Term-Document Vector (Term Presence):
obama 1, politician 1, boston 1, location 1, ... 0, ... 0, ... 0, ... 0, none 1
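The bag-of-words and bag-of-concepts inputs for the example tweet can be reproduced with a toy lookup. This is a sketch only: `CONCEPTS` is a hypothetical stand-in for the concept extraction step, and the vocabulary is invented to show how a term-presence vector is formed.

```python
# Toy concept lookup standing in for a real entity extractor (hypothetical data).
CONCEPTS = {
    "obama": ["obama", "politician"],
    "boston": ["boston", "location"],
}

def bag_of_words(tweet):
    """Lower-case the tweet and split it on whitespace."""
    return tweet.lower().split()

def bag_of_concepts(words):
    """Replace each word by its concepts, or by the placeholder 'none'."""
    concepts = []
    for word in words:
        concepts.extend(CONCEPTS.get(word, ["none"]))
    return concepts

def term_presence(concepts, vocabulary):
    """Binary vector: 1 if the vocabulary entry occurs among the concepts."""
    present = set(concepts)
    return [1 if term in present else 0 for term in vocabulary]

T = "Obama attends vigil for Boston Marathon bombing victims"
W = bag_of_words(T)
C = bag_of_concepts(W)

# Illustrative concept vocabulary; "flood" is absent from this tweet.
vocabulary = ["obama", "politician", "boston", "location", "flood", "none"]
vector = term_presence(C, vocabulary)
```

With this lookup, `W` and `C` match the lists shown on the slide, and words without a concept contribute the shared `none` entry.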
DEEP LEARNING MODEL
CLASSIFYING TWEETS WITH DEEP LEARNING
Information categories: Affected Individuals, Infrastructures and Utilities, Donations and Volunteering, Caution and Advice, Sympathy and Support, Other Useful Information (Olteanu et al 2015)
Data is from CrisisLexT26: 26 crisis events, with around 1,000 annotated tweets each, for a total of around 28,000 tweets. The data is too small for Deep Learning, hence this is only a proof of concept
SVM (TF-IDF): A linear kernel SVM classifier trained on the words’ TF-IDF vectors extracted from our dataset
SVM (Word2Vec): A linear kernel SVM classifier trained on the Google pre-trained 300-dimensional word embeddings
SEM-DL: Semantic Deep Learning approach
[Figure: Precision, Recall, and F1 (axis 0.48 to 0.64) for SEM-DL, SVM (Word2Vec), and SVM (TF-IDF).]
Burel, G.; Saif, H. and Alani, H. Semantic Wide and Deep Learning for Detecting Crisis-Information Categories on Social Media. Int. Semantic Web Conf. (ISWC), Vienna, Austria, 2017.
CRISIS EVENT EXTRACTION SERVICE
CREES automatically processes short texts in a Google sheet, and identifies whether a text is about a crisis, the crisis type, and the information type
Uses Deep Learning methods
Google Sheet Add-on
=CREES_RELATED(A1:A2)   =CREES_TYPE(A1:A2)   =CREES_INFO(A1:A2)
Burel, G. & Alani, H. Crisis Event Extraction Service (CREES) - Automatic Detection and Classification of Crisis-related Content on Social Media. 15th Int. Conf. on Info. Sys. for Crisis Response and Management, Rochester, NY, USA, 2018
DISTRIBUTION OF INFORMATION TYPES
[Figure: chart (axis 0% to 45%) of information types (Affected Individuals, Caution & Advice, Donation & Volunteering, Infrastructure, Sympathy, Other useful information) for Hurricane Harvey, Hurricane Irma, Kerala Floods, and Hurricane Florence.]
[Figure: pie chart. Slice values: 36%, 9%, 15%, 7%, 2%, 11%, 20%. Legend: Relevant, Caution & Advice, Donation & Volunteering, Affected Individuals, Infrastructure, Sympathy, Other useful information.]
RUMOURS
RECURRING RUMOURS
DEEP LEARNING RUMOUR VERACITY CLASSIFIER
Makes use only of the source tweet
Can work without waiting for responses (e.g., comments, retweets)
Does not require the reactions (stances) given by the responses; stance detection may introduce noise
https://cloud.gate.ac.uk/shopfront/displayItem/rumour-veracity
CHATBOTS FOR CRISES REPORTING
Potential vs Reality
On FB Messenger alone, there are currently over 300K active bots, exchanging over 8 billion messages between people and businesses each month.
REPORT INCIDENTS VIA FACEBOOK MESSENGER CHATBOT
CHATBOTS: A LONG WAY TO GO
Chatbot: What kind of issue would you like to report?
User: Good afternoon first of all
Chatbot: Oh my, I'm not programmed to understand what you're saying. Sorry!
CHATBOT STATS
Visits to the Facebook chatbot
Visitors who clicked around in chatbot
Users not following user flow
Users tried to follow user flow
Technical fault when submitting
Reports successfully sent to Uchaguzi
PLATFORM STATS
Total reports submitted through Twitter, SMS, onsite reporters
Reports structured, geolocated, verified, and published
Values shown: 6875, 687, 3034, 1501, 1150, 222, 106, 55
CHATBOT USER DEMOGRAPHICS
65% / 35%
WHAT’S NEXT
Inclusiveness of social media
Biases: gender, technology, social media platform, language
Usage of social media can differ across countries, cultures, genders, platforms, economies …
How can we encourage, and direct, better and more sustained crowdsourcing during disasters?
Many tools and services: when and how do they need to be orchestrated and used?
Relevancy and value of social media crisis data are subjective and person/time dependent
Free, A.I.-powered tools are now available to:
• Separate relevant from rubbish tweets, in “multiple languages”, and for “any” type of crisis
• Identify the category of crisis information they hold
• Measure their veracity
“.. I would suggest, then, that the formula for the next 10,000 start-ups is very, very simple, which is to take x and add AI. That is the formula, that's what we're going to be doing. And that is the way in which we're going to make this second Industrial Revolution”
Kevin Kelly, IBM
Acknowledgements
Gregoire Burel
Lara Piccolo
Prashant Khare
Crisis Information Processing - with the power of A.I.

  • 1. Crisis Information Processing with the power of A.I. Harith Alani Knowledge Media institute The Open University, UK @halani
  • 2. AGEOF DISASTERS Kerala flooda, August 2018, ~400 fatalities, ~500K people displaced voanews.com Attica, Greece, Wild fires, July 2018, 98 deaths bloomberg.com Japan, Typhoon Jebi, Sept 2018, 17 deaths japantimes.co.jp.com Hurricane Maria, Oct 2017, 3K fatalities thefrontpageonline.com/ Lombok, Indonesia, Earthquake, August 2018, over 500 deaths uk.businessinsider.com/
  • 5. POWER OF INFORMATION “Disaster-affected people need information as much as water, food, medicine or shelter. Information can save lives, livelihoods and resources.”
  • 10. 13
  • 11. DISASTER RESPONSE THROUGH SOCIAL MEDIA “The models that are emerging indicate that affected people are becoming extremely adept at using social media platforms in particular to engage in networked systems of response. This means they are able to post about specific needs and solicit individual responses to those needs, and that people offering specific help can also do so”
  • 13. https://federalnewsradio.com/digital-government/2016/03/fema-hhs-turn-social-listening-better-disaster-response/ FEMA MOBILE APP “Unfortunately, we’ve been underwhelmed with the use of that app, because I think everyone is at the Weather Channel doing their Instagrams there,” …. “Instead of trying to do everything ourselves, we need to find smarter ways to integrate the social media world more effectively into how we perform our business functions.” Scott Shoup chief data officer at FEMA
  • 15. ”Immediate damage estimates based on FEMA models can miss areas of heavy impact. Augmenting initial models with real-time analysis of social media and crowdsourced information can help identify overlooked areas. Twitter-sourced estimates were virtually available as people tweeted distress signals, of these parcel-level damage estimates, 46 percent were not captured by FEMA estimates.” FEMA MISSES HURRICANE DAMAGE REPORTED ON TWITTER
  • 18. WORKFLOW OF USHAHIDI & SIMILAR PLATFORMS citizen reporters digital responders Manual Annotations administrators Manual Verification Manual Publishing analysts/public/ research teams
  • 19. SOCIAL MEDIA INFOSMOG DURING DISASTERS. In the US, 1.1 million tweets were sent in the first day of Hurricane Sandy, and over 20 million in total. ~800K photos with the #Sandy hashtag on Instagram. More than 23 million tweets were posted about the haze in Singapore. In Nepal, more than half a million posts were shared about the devastating earthquake in 2015. >2.3M tweets were sent with the words “Haiti” or “Red Cross” in 2010. ~177 million tweets were sent about the Japan 2011 earthquake disaster.
  • 21. REQUIREMENTS & CHALLENGES: VOLUME, VELOCITY, VARIETY, VALIDITY, VALUE. Too much content to handle manually. More content is coming in all the time. Rumours and hoaxes spread wild during disasters. Content is often repetitive and uninformative. Much of the content is irrelevant.
  • 22. RELEVANCY OF SOCIAL MEDIA POSTS. Filtering out irrelevant information helps to tackle information overload. How do we identify relevant and irrelevant information across diverse crisis situations? Can we learn from one type of crisis situation, and apply it to another? Can we train our models on one language and apply them to another?
  • 25. FILTERING METHODS. Query filtering (#hashtags, keywords, disaster name, disaster-specific phrases, locations) produces filtered data. Post collection (text search, topic modelling, semantic search, automatic categorisation) produces filtered data. Methods are supervised or unsupervised. Event label: Machine Learning classifiers (e.g., Naïve Bayes, SVM, J48, CNN) with features (e.g., n-grams, linguistic features, semantics).
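As a rough illustration of the query-filtering step on this slide, the sketch below keeps only posts that contain a hashtag, keyword, or disaster-specific phrase. The term list and the function names (`matches_query`, `filter_posts`) are assumptions made up for this example, not part of any tool mentioned in the talk.

```python
import re

# Hypothetical query terms for one event; a real deployment would curate these.
QUERY_TERMS = {"#sandy", "hurricane sandy", "flood", "evacuate"}


def matches_query(post, terms=QUERY_TERMS):
    """Return True if the post contains any query term (case-insensitive)."""
    text = re.sub(r"\s+", " ", post.lower())  # normalise whitespace
    return any(term in text for term in terms)


def filter_posts(posts, terms=QUERY_TERMS):
    """Keep only the posts that match at least one query term."""
    return [p for p in posts if matches_query(p, terms)]


posts = [
    "Hurricane Sandy has knocked out power on our street",
    "Great coffee this morning",
    "Roads closed, please EVACUATE low-lying areas #Sandy",
]
filtered = filter_posts(posts)
```

Plain substring matching is cheap but coarse: it misses posts that describe the event without using the chosen terms, which is one motivation for the classifier-based filtering the slide also lists.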
  • 26. AUTOMATIC CLASSIFICATION. Analysis features: n-grams, text length, count of nouns/verbs/pronouns, hashtags, mentions, readability score, …
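A minimal sketch of how some of the statistical features listed on this slide could be computed for a single post. The feature names and the helper functions are assumptions for illustration; part-of-speech counts and readability scores need an NLP library and are left out here.

```python
import re


def statistical_features(post):
    """Compute a few surface-level features of a post."""
    tokens = post.split()
    words = re.findall(r"[A-Za-z']+", post)
    return {
        "char_length": len(post),
        "token_count": len(tokens),
        "hashtag_count": sum(t.startswith("#") for t in tokens),
        "mention_count": sum(t.startswith("@") for t in tokens),
        "avg_word_length": sum(map(len, words)) / len(words) if words else 0.0,
    }


def word_ngrams(post, n=2):
    """Return the word n-grams of a post (lower-cased, whitespace-tokenised)."""
    tokens = post.lower().split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


features = statistical_features("Bridge collapsed near #Kerala @rescue_team please help")
bigrams = word_ngrams("bridge collapsed near kerala")
```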
  • 28. TRAIN & TEST ON SAME CRISIS EVENTS. SVM (20 iterations of 5-fold cross-validation). Statistical features: precision 0.81, recall 0.81, F-measure 0.81. What if we add some domain knowledge?
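The evaluation protocol named on this slide (repeated 5-fold cross-validation, reported as precision, recall and F-measure) can be sketched without any particular classifier. The code below is an assumption-laden illustration: it computes the three scores for one positive class and generates the train/test index splits; whether the slide's figures are per-class or averaged across classes is not stated in the source.

```python
import random


def precision_recall_f1(y_true, y_pred, positive=1):
    """Precision, recall and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


def repeated_kfold_indices(n_items, k=5, iterations=20, seed=0):
    """Yield (train_idx, test_idx) pairs for `iterations` shuffled k-fold runs."""
    rng = random.Random(seed)
    for _ in range(iterations):
        order = list(range(n_items))
        rng.shuffle(order)
        for fold in range(k):
            test_idx = order[fold::k]  # every k-th item forms one fold
            held_out = set(test_idx)
            train_idx = [i for i in order if i not in held_out]
            yield train_idx, test_idx


p, r, f = precision_recall_f1([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])
splits = list(repeated_kfold_indices(10))
```

In a full experiment each split would train a classifier on `train_idx` and score it on `test_idx`, and the scores would be averaged over all 100 splits.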
  • 30. ADDING SEMANTICS TO CLASSIFICATION MODEL. Semantic Annotation; Semantic Expansion; Semantic Filtering (filtering out abstract concepts).
  • 31. TRAIN & TEST ON SAME CRISIS EVENTS. SVM (20 iterations of 5-fold cross-validation). Values are precision / recall / F-measure, with ∆F/F (%) against the statistical baseline. Statistical features: 0.81 / 0.81 / 0.81 (-). Semantic features: 0.82 / 0.82 / 0.82 (1.39). Semantic features: 0.81 / 0.81 / 0.81 (0.33). Semantic features: 0.82 / 0.82 / 0.82 (0.6).
  • 32. CLASSIFICATION MODELS. Training events: Hurricane Harvey, Hurricane Irma, Kerala floods, Lombok earthquake. Unseen events (?): typhoon, train crash, bombing, mass shooting, tsunami. How can we train models to become less biased towards specific disaster events, or types of events?
  • 34. CLASSIFYING FAMILIAR EVENTS. Train the model on all data, then test on a new crisis event of a type that was in the training set. E.g., train the model on data that include flood events, then test on a new flood crisis event. Adding semantic features offers modest improvements over statistical features alone (∆ 1.7%). [Chart: F-measure, statistical features vs. semantic features, for TyphoonYolanda, TyphoonPablo, AlbertaFlood, QueenslandFlood, ColoradoFloods, PhilippinesFlood, SardiniaFlood, GuatemalaEarthquake, ItalyEarthquake, BoholEarthquake, CostaRicaEarthquake, and the average; events grouped as Flood/Typhoon and Earthquake.]
  • 35. CLASSIFYING UNFAMILIAR EVENTS. Train the model on certain types of events, and test it on other types. E.g., train the model on data that include flood and earthquake events, then test on a train crash incident. Adding semantic features offers a good improvement over statistical features alone (∆ 7.2%). [Chart: F-measure, statistical features vs. semantic features, for LAAirportShoot, LacMeganticTrainCrash, BostonBombing, SpainTrainCrash, TyphoonYolanda, TyphoonPablo, AlbertaFlood, QueenslandFlood, ColoradoFloods, PhilippinesFlood, SardiniaFlood, GuatemalaEarthquake, ItalyEarthquake, BoholEarthquake, CostaRicaEarthquake, and the average; events grouped as Terror/Bomb/Train, Flood/Typhoon and Earthquake.] Khare, P.; Burel, G. and Alani, H. Classifying Crises-Information Relevancy with Semantics. Extended Semantic Web Conference (ESWC), Heraklion, Crete, 2018.
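The "unfamiliar events" setup amounts to holding out every event of one type when splitting the data. The sketch below shows one way to build such a split. The event names and type groupings follow the chart on this slide, while the sample posts, labels, and the function name `split_by_event_type` are invented for illustration.

```python
# Event names and groupings as shown on the slide's chart (subset).
EVENT_TYPE = {
    "AlbertaFlood": "flood/typhoon",
    "TyphoonYolanda": "flood/typhoon",
    "ItalyEarthquake": "earthquake",
    "BoholEarthquake": "earthquake",
    "BostonBombing": "terror/bomb/train",
    "SpainTrainCrash": "terror/bomb/train",
}


def split_by_event_type(posts, held_out_type):
    """Split (event, text, label) tuples so the held-out type is only in the test set."""
    train, test = [], []
    for event, text, label in posts:
        target = test if EVENT_TYPE[event] == held_out_type else train
        target.append((event, text, label))
    return train, test


# Invented sample posts; label 1 = crisis-related, 0 = not related.
posts = [
    ("AlbertaFlood", "river levels rising fast downtown", 1),
    ("ItalyEarthquake", "strong tremor felt, buildings damaged", 1),
    ("BoholEarthquake", "lovely sunset tonight", 0),
    ("SpainTrainCrash", "train derailed near the station", 1),
    ("BostonBombing", "watching the game with friends", 0),
]
train, test = split_by_event_type(posts, held_out_type="terror/bomb/train")
```

A model trained on `train` has then never seen a post from the held-out event type, which mirrors the condition under which the slide compares statistical and semantic features.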
  • 37. CLASSIFYING MULTILINGUAL CRISIS DATA. Monolingual Classification with Monolingual Models: train the model on one language and test it on data in the same language. For example, train and test on data written in English. This is the default approach, and can be used as a baseline. Cross-lingual Classification with Monolingual Models: run the classifiers on crisis data in languages that were not observed in the training data. For example, we test the classifier on Italian when the classifier was trained on English or Spanish. Cross-lingual Classification with Machine Translation: train the classification model on data in a certain language (e.g., Spanish), and use it to classify data that has been automatically translated from other languages (e.g., Italian and English) into the language of the training data.
  • 38. Results, labelled Train language [Test language] and given as statistical features / semantic features. Monolingual Classification with Monolingual Models: English [English] 0.831 / 0.818; Italian [Italian] 0.709 / 0.712; Spanish [Spanish] 0.781 / 0.776; average 0.774 / 0.769. Semantics add little/no benefit when building, and applying, classification models on the same language. Cross-lingual Classification with Monolingual Models: English [Italian] 0.429 / 0.572; English [Spanish] 0.688 / 0.659; Italian [English] 0.521 / 0.538; Italian [Spanish] 0.64 / 0.631; Spanish [English] 0.578 / 0.65; Spanish [Italian] 0.489 / 0.543; average 0.557 / 0.599. Around 9% improvement in detecting crisis-data relevancy when training on one language and applying it on another. Cross-lingual Classification with Machine Translation: English [Italian->English] 0.546 / 0.581; English [Spanish->English] 0.669 / 0.664; Italian [English->Italian] 0.572 / 0.551; Italian [Spanish->Italian] 0.609 / 0.582; Spanish [English->Spanish] 0.675 / 0.683; Spanish [Italian->Spanish] 0.593 / 0.571; average 0.633 / 0.605. Machine translation offers good classification improvements without any semantics. Khare, P., Burel, G., Maynard, D., and Alani, H. Cross-Lingual Classification of Crisis Data. Int. Semantic Web Conference, Monterey, CA, USA, 2018.
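The slide does not say how the "around 9%" figure was derived. One reading, offered here only as an assumption, is the mean of the per-pair relative changes between the two feature sets in the cross-lingual (monolingual models) setting. The sketch below reproduces that calculation from the scores on the slide, taking the first series as statistical features and the second as semantic features, following the legend order.

```python
# Cross-lingual scores from the slide, keyed by "Train [Test]".
statistical = {
    "English [Italian]": 0.429, "English [Spanish]": 0.688,
    "Italian [English]": 0.521, "Italian [Spanish]": 0.64,
    "Spanish [English]": 0.578, "Spanish [Italian]": 0.489,
}
semantic = {
    "English [Italian]": 0.572, "English [Spanish]": 0.659,
    "Italian [English]": 0.538, "Italian [Spanish]": 0.631,
    "Spanish [English]": 0.65, "Spanish [Italian]": 0.543,
}


def relative_gain(base, new):
    """Relative change from base to new, in percent."""
    return (new - base) / base * 100


per_pair = {pair: relative_gain(statistical[pair], semantic[pair]) for pair in statistical}
mean_gain = sum(per_pair.values()) / len(per_pair)

for pair, gain in per_pair.items():
    print(f"{pair}: {gain:+.1f}%")
print(f"mean of per-pair gains: {mean_gain:.1f}%")
```

Under this reading the mean comes out at roughly 9%, driven largely by the English [Italian] pair; two of the six pairs score slightly lower with semantic features.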
  • 39. CRISIS-DATA PROCESSING TASKS, by granularity. Task 1, Crisis vs. non-Crisis Related Messages: differentiate those posts that are related to a crisis situation vs. those posts that are not. Task 2, Type of Crisis: identify the different types of crises the message is related to (Shooting, Explosion, Building Collapse, Fires, Floods, Meteorite Fall, etc.). Task 3, Type of Information: identify the type of information the message contains (Affected Individuals, Infrastructures and Utilities, Donations and Volunteer, Caution and Advice, etc.). Olteanu, A., Vieweg, S., Castillo, C. What to Expect When the Unexpected Happens: Social Media Communications Across Crises. ACM Comp. Supported Cooperative Work and Social Computing (CSCW), 2015.
  • 40. CRISIS-DATA PROCESSING TASKS. Incorporating semantics into Machine Learning classification methods. Approach 1: Traditional ML Classifiers. Approach 2: Deep Learning.
  • 42. DEEP LEARNING VS CLASSIC ML
  • 44. DEEP LEARNING FOR CRISIS EVENT DETECTION. A semantically-enriched deep learning model for event detection on Twitter. Components: Tweets, Preprocessing, Concept Extraction, Word Vectors Initialisation, Semantic Vectors Initialisation, Pre-trained Embeddings, Sem-CNN Training; inputs are a Bag of Words and a Bag of Concepts. Example: T = “Obama attends vigil for Boston Marathon bombing victims”; W = [obama, attends, vigil, for, boston, marathon, bombing, victims]; C = [obama, politician, none, none, none, boston, location, none, none, none]. Words (obama, attends, vigil, for, boston, marathon, bombing, victims) are represented by embeddings; concepts (obama, politician, boston, location, ..., none) by a term-document vector of term presence, e.g. 1 1 1 1 0 0 0 0 1. DEEP LEARNING MODEL categories: Affected Individuals, Infrastructures and Utilities, Donations and Volunteering, Caution and Advice, Sympathy and Support, Other Useful Information (Olteanu et al 2015).
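To make the bag-of-words and bag-of-concepts inputs on this slide concrete, the sketch below rebuilds W and C for the slide's example tweet and turns the concepts into a term-presence vector. The concept lookup table is hard-coded from the slide; in a real pipeline it would come from a semantic annotation step, which is not shown. The function names are assumptions for this example.

```python
# Concept annotations copied from the slide's example (assumed lookup table).
CONCEPTS = {
    "obama": ["obama", "politician"],
    "boston": ["boston", "location"],
}


def tokenize(text):
    """Lower-case, whitespace-based tokenisation (a simplification)."""
    return text.lower().split()


def concept_sequence(tokens):
    """Map each token to its concepts, or to 'none' when it has no annotation."""
    sequence = []
    for token in tokens:
        sequence.extend(CONCEPTS.get(token, ["none"]))
    return sequence


def presence_vector(items, vocabulary):
    """1 if the vocabulary term occurs among the items, else 0."""
    present = set(items)
    return [1 if term in present else 0 for term in vocabulary]


tweet = "Obama attends vigil for Boston Marathon bombing victims"
W = tokenize(tweet)
C = concept_sequence(W)

concept_vocabulary = ["obama", "politician", "boston", "location", "none"]
concept_vector = presence_vector(C, concept_vocabulary)
```

Here `W` and `C` match the lists printed on the slide; the vocabulary used for the presence vector is a reduced, assumed one, so its length differs from the slide's illustration.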
  • 45. CLASSIFYING TWEETS WITH DEEP LEARNING. SVM (TF-IDF): a linear kernel SVM classifier trained from the words’ TF-IDF vectors extracted from our dataset. SVM (Word2Vec): a linear kernel SVM classifier trained from the Google pre-trained 300-dimensional word embeddings. SEM-DL: Semantic Deep Learning approach. Data is from CrisisLexT26: 26 crisis events, with around 1,000 annotated tweets per event, for a total of around 28,000 tweets. The data is too small for Deep Learning, hence only a proof of concept. [Chart: Precision, Recall and F1 for SEM-DL, SVM (Word2Vec) and SVM (TF-IDF).] Burel, G.; Saif, H. and Alani, H. Semantic Wide and Deep Learning for Detecting Crisis-Information Categories on Social Media. Int. Semantic Web Conf. (ISWC), Vienna, Austria, 2017.
  • 46. CRISIS EVENT EXTRACTION SERVICE (CREES). Google Sheet Add-on. CREES automatically processes short texts in a Google sheet, and identifies whether a text is about a crisis, the crisis type, and the information type. Uses Deep Learning methods. Burel, G. & Alani, H. Crisis Event Extraction Service (CREES) - Automatic Detection and Classification of Crisis-related Content on Social Media. 15th Int. Conf. on Info. Sys. for Crisis Response and Management, Rochester, NY, USA, 2018.
  • 47. = CREES_RELATED(A1:A2) = CREES_TYPE(A1:A2) = CREES_INFO(A1:A2)
  • 48. DISTRIBUTION OF INFORMATION TYPES. [Bar chart: share of each information type (Affected Individuals, Caution & Advice, Donation & Volunteering, Infrastructure, Sympathy, Other useful information) for Hurricane Harvey, Hurricane Irma, Kerala Floods and Hurricane Florence. Pie chart with categories Relevant, Caution & Advice, Donation & Volunteering, Affected Individuals, Infrastructure, Sympathy, Other useful information; segment values 36%, 9%, 15%, 7%, 2%, 11%, 20%.]
  • 53. DEEP LEARNING RUMOUR VERACITY CLASSIFIER. Makes use only of the source tweet. Can work without waiting for responses (e.g., comments, retweets). Does not require the reactions (stances) given by the responses; stance detection may introduce noise. https://cloud.gate.ac.uk/shopfront/displayItem/rumour-veracity
  • 54. CHATBOTS FOR CRISIS REPORTING: Potential vs Reality. On FB Messenger alone, there are currently over 300K active bots, exchanging over 8 billion messages between people and businesses each month.
  • 57. CHATBOTS – A LONG WAY TO GO. Example exchange: “What kind of issue would you like to report?” / “Good afternoon first of all” / “Oh my, I'm not programmed to understand what you're saying. Sorry!” CHATBOT STATS and PLATFORM STATS, labels: Visits to the Facebook chatbot; Visitors who clicked around in chatbot; Users not following user flow; Users tried to follow user flow; Technical fault when submitting; Reports successfully sent to Uchaguzi; Total reports submitted through Twitter, SMS, onsite reporters; Reports structured, geolocated, verified, and published. Values: 6875, 687, 3034, 1501, 1150, 222, 106, 55. CHATBOT USER DEMOGRAPHICS: 65% / 35%.
  • 59. WHAT’S NEXT. Inclusiveness of social media. Biases: gender, technology, social media platform, language. Usage of social media can differ across countries, cultures, genders, platforms, economies … How can we encourage, and direct, better and more sustained crowdsourcing during disasters? Many tools and services: when and how do they need to be orchestrated and used? Relevancy and value of social media crisis data is subjective and person/time dependent.
  • 60. Free, A.I.-powered tools are now available to: • Separate relevant from rubbish tweets, in “multiple languages”, and for “any” type of crisis • Identify the category of crisis information they hold • Measure their veracity. “.. I would suggest, then, that the formula for the next 10,000 start-ups is very, very simple, which is to take x and add AI. That is the formula, that's what we're going to be doing. And that is the way in which we're going to make this second Industrial Revolution” Kevin Kelly, IBM
  • 61. Acknowledgements: Gregoire Burel, Lara Piccolo, Prashant Khare