A Real-time System for Detecting
Landslide Reports on Social Media
using Artificial Intelligence
Ferda Ofli¹, Umair Qazi¹, Muhammad Imran¹, Julien Roch²,
Catherine Pennington³, Vanessa Banks³, Remy Bossu²
¹ Qatar Computing Research Institute
² European-Mediterranean Seismological Centre
³ British Geological Survey
ICWE 2022
Bari, Italy
Agenda
• Motivation
• System Design
• Model Development
• System Benchmark
• Real-world Deployment
• Conclusion
2
Motivation
Landslides cause thousands of deaths and millions of dollars in infrastructural
damage worldwide each year.
3
Motivation
Landslide events are often under-reported and insufficiently documented.
Credit: Petley, D. Geology (2012)
4
Lack of such important data not only hinders humanitarian aid
but also impedes scientific research.
Existing Approaches
On-the-ground surveys Satellite imagery analysis
5
© BGS
- Expensive
- Time-consuming
- Impractical/not-applicable
Existing Approaches – Citizen Science (I)
6
Juang et al., “Using citizen science to expand the global map of landslides: Introducing the Cooperative Open
Online Landslide Repository”, PLOS ONE 2019.
NASA Landslide Reporter
Existing Approaches – Citizen Science (II)
7
Mobile Applications
Kocaman & Gokceoglu, “A CitSci app for
landslide data collection”, Landslides 2019.
Sellers et al., “MARLI: a mobile application for regional
landslide inventories in Ecuador”, Landslides 2021.
Not easily scalable, as they require the active participation of
volunteers who opt in to use a particular application.
What about Social Media?
8
Goal
Identify landslide reports on social media seamlessly and at a much larger scale
9
Detecting Landslides in Tweets
10
[Figure: example tweet images labeled motorcycle accident, heavy rainfall, earthquake, wildfire, tropical cyclone, on fire, flooded, car accident]
System Architecture
13
System Architecture – Image Pipeline
14
System Architecture – Text Pipeline
20
Global Landslide Detector
26
Duplicate Filter
• Image features extracted from the penultimate layer of a ResNet-50
model pre-trained on the Places dataset
• Threshold based on Euclidean distance
• 600 image pairs (460 duplicate / 140 non-duplicate)
27
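The thresholding step on this slide can be sketched as follows. This is a minimal illustration, not the authors' implementation: feature vectors are plain Python lists standing in for ResNet-50 penultimate-layer features, the function names are hypothetical, and choosing the threshold by maximizing accuracy over labeled pairs is an assumption about how a set of labeled duplicate / non-duplicate pairs might be used.

```python
import math
from typing import List, Sequence, Tuple


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def is_duplicate(a: Sequence[float], b: Sequence[float], threshold: float) -> bool:
    """Two images count as duplicates when their features are close enough."""
    return euclidean(a, b) <= threshold


def pick_threshold(distances: List[float], labels: List[bool]) -> Tuple[float, float]:
    """Return (threshold, accuracy) for the best candidate threshold.

    distances[i] is the feature distance of labeled pair i;
    labels[i] is True when the pair was annotated as a duplicate.
    Selecting by accuracy is an assumption made for this sketch.
    """
    best_t, best_acc = 0.0, -1.0
    for t in sorted(set(distances)):
        correct = sum((d <= t) == lab for d, lab in zip(distances, labels))
        acc = correct / len(distances)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t, best_acc


# Toy data: two close (duplicate) pairs and two distant (non-duplicate) pairs.
toy_distances = [0.1, 0.2, 0.9, 1.1]
toy_labels = [True, True, False, False]
threshold, accuracy = pick_threshold(toy_distances, toy_labels)
```

With a threshold chosen this way, each incoming image only needs one distance computation per previously seen image to decide whether it should be dropped.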
Junk Filter
• Fine-tune a ResNet-50 model, pre-trained on the ImageNet dataset,
using a custom dataset introduced by Nguyen et al. [ISCRAM 2017]
28
Nguyen et al., “Automatic Image Filtering on Social Networks Using Deep Learning
and Perceptual Hashing During Crises”, ISCRAM 2017.
Landslide Detector
29
Landslides
Landslide Rockslide Mudslide
Keywords: landslide, landslip, earth slip, mudslide, mudflow, rockslide, rock fall, cliff fall
30
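As a rough illustration of how the English keywords on this slide could be matched against tweet text, here is a minimal sketch. The function name and the regular-expression approach are assumptions made for illustration; the deployed system's matching logic is not described in the slides.

```python
import re

# English keywords listed on the slide.
KEYWORDS = [
    "landslide", "landslip", "earth slip", "mudslide",
    "mudflow", "rockslide", "rock fall", "cliff fall",
]

# Word-boundary match at the start of each keyword; the trailing \w*
# also accepts simple suffixes such as the plural "landslides".
_PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(k) for k in KEYWORDS) + r")\w*",
    re.IGNORECASE,
)


def mentions_landslide(text: str) -> bool:
    """Return True when the text contains any of the landslide keywords."""
    return _PATTERN.search(text) is not None
```

A keyword match only selects candidate tweets; deciding whether an attached image actually shows a landslide is left to the image classifier.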
Collection of Landslide Images
• Downloaded from Google and Twitter using keywords
• Donated by BGS
31
Labeling Methodology
• Manual annotation by three landslide specialists
• Several rounds of discussion to agree on a labeling methodology
• Computer vision (CV)-based interpretation differs from desk- or field-based landslide identification
32
Pennington et al., “A near-real-time global landslide incident reporting tool
demonstrator using social media and artificial intelligence”, IJDRR 2022.
Final Dataset
• Inter-annotator agreement
  • Fleiss’ Kappa = 0.58 (moderate, just below the 0.61 threshold for substantial agreement)
  • Percent Agreement = 76%
• Imbalanced class distribution
  • 23% landslide vs. 77% not-landslide
                Google   Twitter     BGS    Total
Landslide        1,240       598     852    2,690
Not-landslide    5,044       555   3,448    9,047
Total            6,284     1,153   4,300   11,737
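For readers unfamiliar with the agreement statistic reported above, the sketch below computes Fleiss' kappa from per-item category counts. It uses toy data, not the annotations behind the reported 0.58, and the function name is only illustrative.

```python
from typing import List


def fleiss_kappa(counts: List[List[int]]) -> float:
    """Fleiss' kappa for a fixed number of raters per item.

    counts[i][j] is the number of raters who assigned item i to category j;
    every row must sum to the same number of raters.
    """
    n_items = len(counts)
    n_raters = sum(counts[0])
    n_cats = len(counts[0])

    # Overall proportion of assignments falling into each category.
    p_j = [
        sum(row[j] for row in counts) / (n_items * n_raters)
        for j in range(n_cats)
    ]
    # Observed agreement per item, then averaged over items.
    p_i = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_bar = sum(p_i) / n_items
    # Agreement expected by chance.
    p_e = sum(p * p for p in p_j)
    if p_e == 1.0:
        return 1.0  # every rating is in one category; kappa is degenerate
    return (p_bar - p_e) / (1.0 - p_e)


# Toy example: 3 raters, 2 categories (landslide, not-landslide), 4 images.
toy_counts = [[3, 0], [0, 3], [2, 1], [1, 2]]
toy_kappa = fleiss_kappa(toy_counts)
```

Kappa corrects the raw percent agreement for agreement expected by chance, which is why it can be noticeably lower than the percent agreement on an imbalanced dataset.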
34
Landslide Model Training
• Fine-tune a ResNet-50 model, pre-trained on the ImageNet dataset,
using the home-grown dataset.
36
Ofli et al., “Landslide Detection in Real-Time Social Media Image
Streams”, arXiv preprint arXiv:2110.04080, 2021.
Qualitative Analysis with t-SNE
37
Class Activation Maps – True Positives
38
Class Activation Maps – True Negatives
39
Class Activation Maps – False Positives
40
Class Activation Maps – False Negatives
41
Geolocation Tagger
42
Qazi et al., “GeoCoV19: A Dataset of Hundreds of Millions of Multilingual COVID-19 Tweets with
Location Information”, ACM SIGSPATIAL Special, vol. 12, pp. 6-15, 2020.
Performance Evaluation & Benchmarking
• Stress-test the system and understand its scalability
• Latency
  • time taken by a module to process a given input load
• Throughput
  • number of items processed per unit time (one second) for a given input load
• Critical system components
  • Duplicate filter
  • Junk filter
  • Landslide detector
  • Geolocation tagger
43
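The two metrics defined above can be measured with a small harness like the one below. This is a sketch under stated assumptions: `benchmark_module` is a hypothetical helper, the module under test is any Python callable, and the load is processed as one sequential batch rather than being injected at a controlled rate as a full stress test would do.

```python
import time
from typing import Callable, List, Tuple


def benchmark_module(
    process: Callable[[object], object], items: List[object]
) -> Tuple[float, float]:
    """Return (latency_seconds, throughput_items_per_second) for one load.

    Latency is the wall-clock time to process the whole input load;
    throughput is the number of items processed per second.
    """
    start = time.perf_counter()
    for item in items:
        process(item)
    latency = time.perf_counter() - start
    throughput = len(items) / latency if latency > 0 else float("inf")
    return latency, throughput


# Sweep input loads in powers of two with a stand-in module.
results = {
    load: benchmark_module(lambda x: x, list(range(load)))
    for load in (1, 2, 4, 8, 16)
}
```

Repeating the measurement for increasing loads gives the latency and throughput curves for each component.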
Performance Evaluation & Benchmarking
44
[Figure, two panels, Duplicate Filter: latency (seconds, axis 0 to 150) and throughput (items/second, axis 0 to 40) versus input load (per second, 0 to 4096 in powers of two)]
Performance Evaluation & Benchmarking
45
[Figure, two panels, Junk Filter: latency (seconds, axis 0 to 15) and throughput (items/second, axis 0 to 500) versus input load (per second, 0 to 4096 in powers of two)]
Performance Evaluation & Benchmarking
46
[Figure, two panels, Landslide Detector: latency (seconds, axis 0 to 20) and throughput (items/second, axis 0 to 500) versus input load (per second, 0 to 4096 in powers of two)]
Performance Evaluation & Benchmarking
47
[Figure, two panels, Geolocation Tagger, with cache vs. without cache: latency (seconds, axis 0 to 300) and throughput (items/second, axis 0 to 60) versus input load (per second, 0 to 4096 in powers of two)]
Real-world Deployment
• Online since February 2020, monitoring the live Twitter stream globally
  • 339 multilingual keywords in 32 languages
• February 2020 – December 2021
  • Collected more than 54 million tweets and 15 million image URLs
  • ~2.5 million image URLs deemed unique and downloaded for further analysis
  • ~17,000 images classified as relevant, unique, and landslides
    • Corresponds to <1% of the collected images
    • Highlights the challenging nature of the problem
  • ~6,500 landslide reports were shared by personal accounts and ~4,500 by organizational accounts
48
Real-world Deployment – Data Statistics
49
[Figure: data volume (log scale, 1 to 1,000,000) over time from 2020-02-01 to 2021-12-31, with series for Raw Tweets, Raw Images, Relevant Images, Non-duplicate Images, and Landslide Images]
Real-world Deployment – Verification
• Randomly sampled 3,600 images processed by the system
• Asked experts to label the sampled images
• System-predicted labels compared to expert annotations
50
                            True   False
Landslide (positive)         123      39
Not-landslide (negative)   3,395      43
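The verification counts above can be turned into standard classification metrics as sketched below. The mapping of the table cells to true/false positives and negatives (TP = 123, FP = 39, TN = 3,395, FN = 43) is an assumption based on the row and column labels, and the function name is illustrative.

```python
from typing import Dict


def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Precision, recall, F1, and accuracy from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "accuracy": accuracy,
    }


# Assumed reading of the verification table (3,600 sampled images in total).
metrics = classification_metrics(tp=123, fp=39, tn=3395, fn=43)
```

Because landslide images are rare in the sample, accuracy is dominated by the not-landslide class, so precision and recall on the landslide class are the more informative figures.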
Real-world Deployment – Worldwide Reports
52
NASA landslide susceptibility map
Real-world Deployment – Country Maps
53
Real-world Deployment – Quarterly Maps
54
Real-world Deployment – Quarterly Maps
55
The US, Ecuador, Colombia, and India experience significant numbers of landslides all year round.
Real-world Deployment – Quarterly Maps
56
For India, landslides become even more prevalent in Q3.
Real-world Deployment – Quarterly Maps
57
Mexico experiences a significant increase in Q3.
Real-world Deployment – Quarterly Maps
58
Indonesia and Malaysia show prominent landslide numbers in Q1 and Q4.
Real-world Deployment – Quarterly Maps
59
The UK shows prominent landslide numbers in Q1 and Q2.
Real-world Deployment – Quarterly Maps
60
Turkey experiences most landslides in Q1 through Q3.
Landslide Reports in Italy
61
Conclusion
• An interdisciplinary collaboration between computer scientists
(QCRI), seismologists (EMSC), and landslide specialists (BGS).
• The system leverages online social media data in real time to identify
landslide reports automatically using state-of-the-art AI techniques.
  • Reduces information overload by eliminating duplicate and irrelevant
  content
  • Identifies landslide images
  • Infers their geolocation
  • Categorizes the user type (organization or person)
• The real-world deployment shows the success of the system.
69
Conclusion
• We believe that our system can contribute to the harvesting of global
landslide data and facilitate further landslide research.
• It can support global landslide susceptibility maps to provide
situational awareness and improve emergency response and decision
making.
• Next steps:
  • Historical data analysis with ground truth from other sources, e.g., BGS, NASA,
  EM-DAT, etc.
  • Spatiotemporal detection of events
70
Thank you!
https://landslide-aidr.qcri.org/service.php
Please give us feedback!
71


A Real-time System for Detecting Landslide Reports on Social Media using Artificial Intelligence

  • 1. A Real-time System for Detecting Landslide Reports on Social Media using Artificial Intelligence Ferda Ofli1, Umair Qazi1, Muhammad Imran1, Julien Roch2, Catherine Pennington3, Vanessa Banks3, Remy Bossu2 1Qatar Computing Research Institute 2European-Mediterranean Seismological Centre 3British Geological Survey ICWE 2022 Bari, Italy
  • 2. Agenda • Motivation • System Design • Model Development • System Benchmark • Real-world Deployment • Conclusion 2
  • 3. Motivation Landslides cause thousands of deaths and millions of dollars in infrastructural damage worldwide each year. 3
  • 4. Motivation Landslide events are often under-reported and insufficiently documented. Credit: Petley, D. Geology (2012) 4 Lack of such important data not only hinders humanitarian aid but also impedes scientific research.
  • 5. Existing Approaches On-the-ground surveys Satellite imagery analysis 5 © BGS - Expensive - Time-consuming - Impractical/not-applicable
  • 6. Existing Approaches – Citizen Science (I) 6 Juang et al., “Using citizen science to expand the global map of landslides: Introducing the Cooperative Open Online Landslide Repository”, Plos One 2019. NASA Landslide Reporter
  • 7. Existing Approaches – Citizen Science (II) 7 Mobile Applications Kocaman & Gokceoglu, “A CitSci app for landslide data collection”, Landslides 2019. Sellers et al., “MARLI: a mobile application for regional landslide inventories in Ecuador”, Landslides 2021. Not easily scalable as they require active participation of volunteers that opt-in to use a particular application.
  • 8. What about Social Media? 8
  • 9. Goal Identify landslide reports on social media seamlessly and at a much larger scale 9
  • 10. Detecting Landslides in Tweets 10 motorcycle accident heavy rainfall earthquake wildfire tropical cyclone on fire flooded car accident
  • 11. Detecting Landslides in Tweets 11 motorcycle accident heavy rainfall earthquake wildfire tropical cyclone on fire flooded car accident
  • 12. Detecting Landslides in Tweets 12 motorcycle accident heavy rainfall earthquake wildfire tropical cyclone on fire flooded car accident
  • 14-19. System Architecture – Image Pipeline
  • 20-24. System Architecture – Text Pipeline
  • 27. Duplicate Filter • Image features extracted from the penultimate layer of a ResNet-50 model pre-trained on the Places dataset • Threshold based on Euclidean distance • 600 image pairs (460 duplicate / 140 non-duplicate)
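The thresholding step of the duplicate filter can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the feature vectors have already been extracted, it uses short toy vectors in place of ResNet-50 features, and the function names (`filter_duplicates`, `best_threshold`) and the accuracy-based threshold selection are hypothetical. The slide only states that a Euclidean-distance threshold was used and that 600 labeled pairs were available.

```python
import math

def filter_duplicates(features, threshold):
    """Return indices of images kept after dropping near-duplicates.

    An image is treated as a duplicate if its Euclidean distance to any
    previously kept image is at or below the threshold.
    """
    kept = []
    for i, feat in enumerate(features):
        if all(math.dist(feat, features[j]) > threshold for j in kept):
            kept.append(i)
    return kept

def best_threshold(labeled_pairs):
    """Pick the distance threshold that best separates labeled pairs.

    labeled_pairs: list of (distance, is_duplicate) tuples.
    Assumption: accuracy is the selection criterion; the slides do not say.
    """
    candidates = sorted(d for d, _ in labeled_pairs)

    def accuracy(t):
        hits = sum((d <= t) == dup for d, dup in labeled_pairs)
        return hits / len(labeled_pairs)

    return max(candidates, key=accuracy)

# Toy 2-D "features" standing in for real embeddings (hypothetical data).
toy_features = [(0.0, 0.0), (0.1, 0.0), (3.0, 4.0), (3.0, 4.05)]
kept_indices = filter_duplicates(toy_features, threshold=0.5)
```

A pairwise scan like this is quadratic in the number of kept images; a deployed system would more likely use an index structure, but the slides do not describe one.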
  • 28. Junk Filter • Fine-tune a ResNet-50 model, pre-trained on the ImageNet dataset, using a custom dataset introduced by Nguyen et al. [ISCRAM 2017]. Nguyen et al., “Automatic Image Filtering on Social Networks Using Deep Learning and Perceptual Hashing During Crises”, ISCRAM 2017.
  • 30. Landslides: Landslide, Rockslide, Mudslide. Keywords: landslide, landslip, earth slip, mudslide, mudflow, rockslide, rock fall, cliff fall
  • 31. Collection of Landslide Images • Downloaded from Google and Twitter using keywords • Donated by BGS
  • 32. Labeling Methodology • Manual annotation by three landslide specialists • Several rounds of discussion to agree on a labeling methodology • CV-based interpretation is different from desk- or field-based landslide identification. Pennington et al., “A near-real-time global landslide incident reporting tool demonstrator using social media and artificial intelligence”, IJDRR 2022.
  • 33. Final Dataset • Inter-annotator agreement • Fleiss’ Kappa = 0.58 (almost substantial) • Percent Agreement = 76% • Imbalanced class distribution • 23% landslide vs. 77% not-landslide
                       Google   Twitter     BGS    Total
      Landslide         1,240       598     852    2,690
      Not-landslide     5,044       555   3,448    9,047
      Total             6,284     1,153   4,300   11,737
    Pennington et al., “A near-real-time global landslide incident reporting tool demonstrator using social media and artificial intelligence”, IJDRR 2022.
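For readers unfamiliar with the agreement statistic reported above, here is a minimal sketch of how Fleiss' kappa is computed from per-image rating counts. The function name and the four-image toy input are hypothetical; the actual annotations are not part of the slides, so this does not reproduce the reported 0.58.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for a table of rating counts.

    counts[i][j] is the number of annotators who assigned item i to
    category j. Every item must be rated by the same number of annotators.
    """
    n_items = len(counts)
    n_raters = sum(counts[0])
    n_cats = len(counts[0])

    # Observed agreement: average over items of the fraction of
    # annotator pairs that agree on that item.
    per_item = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_observed = sum(per_item) / n_items

    # Chance agreement from the overall category proportions.
    proportions = [
        sum(row[j] for row in counts) / (n_items * n_raters)
        for j in range(n_cats)
    ]
    p_chance = sum(p * p for p in proportions)

    return (p_observed - p_chance) / (1 - p_chance)

# Hypothetical example: 3 annotators, 2 classes (landslide, not-landslide).
toy_counts = [[3, 0], [0, 3], [2, 1], [1, 2]]
toy_kappa = fleiss_kappa(toy_counts)
```

The class balance quoted on the slide follows directly from the table: 2,690 / 11,737 is about 23% landslide images.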
  • 35. Landslide Model Training • Fine-tune a ResNet-50 model, pre-trained on the ImageNet dataset, using the home-grown dataset. Ofli et al., “Landslide Detection in Real-Time Social Media Image Streams”, arXiv preprint arXiv:2110.04080, 2021.
  • 37. Class Activation Maps – True Positives
  • 38. Class Activation Maps – True Negatives
  • 39. Class Activation Maps – False Positives
  • 40. Class Activation Maps – False Negatives
  • 41. Geolocation Tagger. Qazi et al., “GeoCoV19: A Dataset of Hundreds of Millions of Multilingual COVID-19 Tweets with Location Information”, Computer Science, ACM SIGSPATIAL Special, v12, pp 6-15, 2020.
  • 42. Performance Evaluation & Benchmarking • Stress-test the system and understand its scalability • Latency • time taken by a module to process a given input load • Throughput • number of items processed per unit time (one second) for a given input load • Critical system components • Duplicate filter • Junk filter • Landslide detector • Geolocation tagger
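The two metrics defined above can be measured with a small harness like the one below. It is a sketch under stated assumptions: `benchmark` and `sweep` are hypothetical names, the module under test is any Python callable, and items are processed sequentially, which may differ from how the real system batches its inputs.

```python
import time

def benchmark(module, batch, clock=time.perf_counter):
    """Measure latency and throughput of one module on one input load.

    latency: seconds taken to process the whole batch.
    throughput: items processed per second for that batch.
    """
    start = clock()
    for item in batch:
        module(item)
    latency = clock() - start
    throughput = len(batch) / latency if latency > 0 else float("inf")
    return latency, throughput

def sweep(module, make_item, loads=(1, 2, 4, 8, 16, 32)):
    """Run the benchmark over increasing input loads (powers of two)."""
    results = {}
    for load in loads:
        batch = [make_item(i) for i in range(load)]
        results[load] = benchmark(module, batch)
    return results
```

Calling `sweep(my_filter, make_item)` with a real module and an item factory yields one `(latency, throughput)` pair per load, which is the shape of data plotted in the benchmark slides that follow.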
  • 43. Performance Evaluation & Benchmarking – Duplicate Filter [Figure: latency (seconds) and throughput (items/second) vs. input load (items per second, 0 to 4096 in powers of two)]
  • 44. Performance Evaluation & Benchmarking – Junk Filter [Figure: latency (seconds) and throughput (items/second) vs. input load (items per second, 0 to 4096 in powers of two)]
  • 45. Performance Evaluation & Benchmarking – Landslide Detector [Figure: latency (seconds) and throughput (items/second) vs. input load (items per second, 0 to 4096 in powers of two)]
  • 46. Performance Evaluation & Benchmarking – Geolocation Tagger [Figure: latency (seconds) and throughput (items/second) vs. input load (items per second, 0 to 4096 in powers of two); two series: with cache, without cache]
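The geolocation tagger is benchmarked with and without a cache. The slides do not say what is cached, so the following is only an assumed illustration of the general idea: memoizing lookups so that repeated place names do not trigger repeated slow queries. The gazetteer, its approximate coordinates, and all function names are hypothetical.

```python
from functools import lru_cache

# Counts how often the slow lookup actually runs (for illustration only).
CALLS = {"count": 0}

def slow_geocode(place):
    """Stand-in for an expensive geocoding query (hypothetical gazetteer)."""
    CALLS["count"] += 1
    gazetteer = {"bari": (41.12, 16.87), "doha": (25.29, 51.53)}
    return gazetteer.get(place)

@lru_cache(maxsize=10_000)
def cached_geocode(place):
    """Memoized wrapper: each distinct place string is resolved once."""
    return slow_geocode(place)

def tag_locations(places):
    """Normalize place strings and resolve them through the cache."""
    return [cached_geocode(p.strip().lower()) for p in places]
```

Because social media streams repeat the same location strings often, a cache of this kind would be expected to reduce latency under load, which is consistent with the two series shown for this module.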
  • 47. Real-world Deployment • Online since February 2020 to monitor live Twitter stream globally • 339 multilingual keywords in 32 languages • February 2020 – December 2021 • Collected more than 54 million tweets and 15 million image URLs • ~2.5 million image URLs deemed unique and downloaded for further analysis • ~17,000 images classified as relevant, unique and landslides • Corresponds to <1% of the collected images • Highlights the challenging nature of the problem • ~6,500 landslide reports shared by personal accounts whereas ~4,500 by organizational accounts
  • 48. Real-world Deployment – Data Statistics [Figure: data volume (log scale, 1 to 1,000,000) over time, 2020-02-01 to 2021-12-31; series: Raw Tweets, Raw Images, Relevant Images, Non-duplicate Images, Landslide Images]
  • 49-50. Real-world Deployment – Verification • Randomly sampled 3,600 images processed by the system • Asked experts to label the sampled images • System-predicted labels compared to expert annotations
                                  True   False
      Landslide (positive)         123      39
      Not-landslide (negative)    3395      43
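The verification counts can be turned into standard classification metrics. The sketch below assumes the table reads as 123 true positives, 39 false positives, 3,395 true negatives, and 43 false negatives, which sums to the 3,600 sampled images; the slides themselves do not report these derived metrics, and the function name is hypothetical.

```python
def classification_metrics(tp, fp, tn, fn):
    """Derive precision, recall, F1 and accuracy from confusion counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "accuracy": accuracy,
    }

# Counts as read from the verification table (interpretation assumed).
metrics = classification_metrics(tp=123, fp=39, tn=3395, fn=43)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Under that reading, precision is about 0.76, recall about 0.74, F1 about 0.75, and accuracy about 0.98. The gap between accuracy and F1 reflects how rare landslide images are in the sample.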
  • 51. Real-world Deployment – Worldwide Reports (shown against the NASA landslide susceptibility map)
  • 52. Real-world Deployment – Country Maps
  • 53. Real-world Deployment – Quarterly Maps
  • 54. Real-world Deployment – Quarterly Maps: The US, Ecuador, Colombia, and India experience significant landslide numbers all year round.
  • 55. Real-world Deployment – Quarterly Maps: For India, landslides become even more prevalent in Q3.
  • 56. Real-world Deployment – Quarterly Maps: Mexico experiences a significant increase in Q3.
  • 57. Real-world Deployment – Quarterly Maps: Prominent landslide numbers in Indonesia and Malaysia occur in Q1 and Q4.
  • 58. Real-world Deployment – Quarterly Maps: Prominent landslide numbers in the UK occur in Q1 and Q2.
  • 59. Real-world Deployment – Quarterly Maps: Turkey experiences most landslides in Q1 through Q3.
  • 68. Conclusion • An interdisciplinary collaboration between computer scientists (QCRI), seismologists (EMSC), and landslide specialists (BGS). • The system leverages online social media data in real time to identify landslide reports automatically using state-of-the-art AI techniques. • Reduces the information overload by eliminating duplicate and irrelevant content • Identifies landslide images • Infers their geolocation • Categorizes the user type (organization or person) • The real-world deployment shows the success of the system.
  • 69. Conclusion • We believe that our system can contribute to the harvesting of global landslide data and facilitate further landslide research. • It can support global landslide susceptibility maps to provide situational awareness and improve emergency response and decision making. • Next steps: • Historical data analysis with ground truth from other sources, e.g., BGS, NASA, and EM-DAT • Spatiotemporal detection of events