A Real-time System for Detecting
Landslide Reports on Social Media
using Artificial Intelligence
Ferda Ofli¹, Umair Qazi¹, Muhammad Imran¹, Julien Roch²,
Catherine Pennington³, Vanessa Banks³, Remy Bossu²
¹ Qatar Computing Research Institute
² European-Mediterranean Seismological Centre
³ British Geological Survey
ICWE 2022
Bari, Italy
Agenda
• Motivation
• System Design
• Model Development
• System Benchmark
• Real-world Deployment
• Conclusion
Motivation
Landslides cause thousands of deaths and millions of dollars in infrastructural
damage worldwide each year.
Landslide events are often under-reported and insufficiently documented.
Credit: Petley, D. Geology (2012)
Lack of such important data not only hinders humanitarian aid
but also impedes scientific research.
Existing Approaches
On-the-ground surveys
Satellite imagery analysis
© BGS
- Expensive
- Time-consuming
- Impractical/not-applicable
Existing Approaches – Citizen Science (I)
Juang et al., “Using citizen science to expand the global map of landslides: Introducing the Cooperative Open
Online Landslide Repository”, Plos One 2019.
NASA Landslide Reporter
Existing Approaches – Citizen Science (II)
Mobile Applications
Kocaman & Gokceoglu, “A CitSci app for
landslide data collection”, Landslides 2019.
Sellers et al., “MARLI: a mobile application for regional
landslide inventories in Ecuador”, Landslides 2021.
Not easily scalable, as they require active participation of
volunteers who opt in to use a particular application.
What about Social Media?
Goal
Identify landslide reports on social media seamlessly and at a much larger scale
Detecting Landslides in Tweets
motorcycle accident, heavy rainfall, earthquake, wildfire, tropical cyclone, on fire, flooded, car accident
System Architecture
System Architecture – Image Pipeline
System Architecture – Text Pipeline
Global Landslide Detector
Duplicate Filter
• Image features extracted from the penultimate layer of a ResNet-50
model pre-trained on the Places dataset
• Threshold based on Euclidean distance
• 600 image pairs (460 duplicate / 140 non-duplicate)
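A minimal sketch of distance-based duplicate detection, assuming each image has already been reduced to a feature vector. The function names, the toy 3-dimensional vectors, and the accuracy-based threshold search are illustrative assumptions; the slides state only that the threshold is based on Euclidean distance and that 600 labeled image pairs were used.

```python
import math


def is_duplicate(features_a, features_b, threshold):
    """Treat two images as duplicates when their feature vectors are close."""
    return math.dist(features_a, features_b) <= threshold


def pick_threshold(labeled_pairs, candidates):
    """Return the candidate threshold with the highest accuracy.

    labeled_pairs: list of (features_a, features_b, is_dup) tuples.
    This selection rule is an assumption, not the authors' procedure.
    """
    def accuracy(threshold):
        correct = sum(
            is_duplicate(a, b, threshold) == is_dup
            for a, b, is_dup in labeled_pairs
        )
        return correct / len(labeled_pairs)

    return max(candidates, key=accuracy)


# Toy vectors standing in for real ResNet-50 penultimate-layer features.
pairs = [
    ([0.0, 0.0, 0.0], [0.1, 0.0, 0.0], True),
    ([1.0, 1.0, 1.0], [1.0, 1.2, 1.0], True),
    ([0.0, 0.0, 0.0], [2.0, 2.0, 2.0], False),
    ([1.0, 0.0, 0.0], [0.0, 3.0, 0.0], False),
]
threshold = pick_threshold(pairs, candidates=[0.05, 0.5, 5.0])
```

In practice the candidate thresholds would be swept over the range of distances observed in the labeled pairs rather than a fixed short list.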
Junk Filter
• Fine-tune a ResNet-50 model, pre-trained on the ImageNet dataset,
using a custom dataset introduced by Nguyen et al. [ISCRAM 2017]
Nguyen et al., “Automatic Image Filtering on Social Networks Using Deep Learning
and Perceptual Hashing During Crises”, ISCRAM 2017.
Landslide Detector
Landslides
Landslide, Rockslide, Mudslide
Keywords: landslide, landslip, earth slip, mudslide, mudflow, rockslide, rock fall, cliff fall
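As a rough illustration of keyword-based collection, the sketch below matches the eight English keywords listed above against a piece of text. The function name and the optional plural "s" are assumptions made for this example; the slides do not describe how matching was implemented.

```python
import re

# The eight English keywords listed on the slide.
KEYWORDS = [
    "landslide", "landslip", "earth slip", "mudslide",
    "mudflow", "rockslide", "rock fall", "cliff fall",
]

# Case-insensitive whole-word match; the trailing "s?" also accepts plurals.
PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(k) for k in KEYWORDS) + r")s?\b",
    re.IGNORECASE,
)


def matched_keywords(text):
    """Return the sorted set of keywords found in the text."""
    return sorted({m.group(1).lower() for m in PATTERN.finditer(text)})


print(matched_keywords("Huge MUDSLIDE and rock fall block the road"))
print(matched_keywords("A landslide victory in the election"))
```

The second call matches even though the text is not about a physical landslide, which is one reason keyword matching alone is not enough and the images are classified as well.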
Collection of Landslide Images
• Downloaded from Google and Twitter using keywords
• Donated by BGS
Labeling Methodology
• Manual annotation by three landslide specialists
• Several rounds of discussion to agree on a labeling methodology
• CV-based interpretation is different from desk- or field-based landslide identification
Pennington et al., “A near-real-time global landslide incident reporting tool
demonstrator using social media and artificial intelligence”, IJDRR 2022.
Final Dataset
• Inter-annotator agreement
• Fleiss’ Kappa = 0.58 (moderate, approaching substantial)
• Percent Agreement = 76%
• Imbalanced class distribution
• 23% landslide vs. 77% not-landslide
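For reference, Fleiss' kappa for a fixed number of raters per item can be computed as in the sketch below. The toy ratings are made up for illustration and are not the annotation data behind the reported value of 0.58.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for a table where counts[i][j] is the number of
    raters who assigned item i to category j (same rater count per item).

    Undefined (division by zero) when every rating falls in one category.
    """
    n_items = len(counts)
    n_raters = sum(counts[0])
    n_categories = len(counts[0])

    # Observed agreement: fraction of agreeing rater pairs, averaged over items.
    per_item = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_observed = sum(per_item) / n_items

    # Chance agreement from the overall category proportions.
    proportions = [
        sum(row[j] for row in counts) / (n_items * n_raters)
        for j in range(n_categories)
    ]
    p_expected = sum(p * p for p in proportions)

    return (p_observed - p_expected) / (1 - p_expected)


# Toy example: 4 images, 3 raters, columns = (landslide, not-landslide).
toy_ratings = [[3, 0], [0, 3], [2, 1], [1, 2]]
kappa = fleiss_kappa(toy_ratings)
```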
                Google   Twitter   BGS     Total
Landslide       1,240    598       852     2,690
Not-landslide   5,044    555       3,448   9,047
Total           6,284    1,153     4,300   11,737
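The class shares follow directly from the totals in the table. The sketch below also derives inverse-frequency class weights, which are one common way to compensate for imbalance during training; the slides do not say whether or how the imbalance was handled, so the weighting is purely an illustrative assumption.

```python
# Class totals taken from the table above.
CLASS_COUNTS = {"landslide": 2690, "not-landslide": 9047}


def class_shares(counts):
    """Fraction of the dataset belonging to each class."""
    total = sum(counts.values())
    return {name: count / total for name, count in counts.items()}


def inverse_frequency_weights(counts):
    """Weight each class by total / (num_classes * class_count).

    Assumed technique for illustration; not stated in the slides.
    """
    total = sum(counts.values())
    return {
        name: total / (len(counts) * count)
        for name, count in counts.items()
    }


shares = class_shares(CLASS_COUNTS)
weights = inverse_frequency_weights(CLASS_COUNTS)
for name in CLASS_COUNTS:
    print(f"{name}: share={shares[name]:.1%}, weight={weights[name]:.3f}")
```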
Landslide Model Training
• Fine-tune a ResNet-50 model, pre-trained on the ImageNet dataset,
using the in-house landslide image dataset.
Ofli et al., “Landslide Detection in Real-Time Social Media Image
Streams”, arXiv preprint arXiv:2110.04080, 2021.
Qualitative Analysis with t-SNE
Class Activation Maps – True Positives
Class Activation Maps – True Negatives
Class Activation Maps – False Positives
Class Activation Maps – False Negatives
Geolocation Tagger
Qazi et al., “GeoCoV19: A Dataset of Hundreds of Millions of Multilingual COVID-19 Tweets with
Location Information”, Computer Science, ACM SIGSPATIAL Special, v12, pp 6-15, 2020.
Performance Evaluation & Benchmarking
• Stress-test the system and understand its scalability
• Latency
  • Time taken by a module to process a given input load
• Throughput
  • Number of items processed per unit time (one second) given an input load
• Critical system components
  • Duplicate filter
  • Junk filter
  • Landslide detector
  • Geolocation tagger
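The two metrics defined above can be measured with a small timing harness like the one sketched here. The `benchmark` function and the stand-in module are hypothetical; the actual benchmarking setup is not described in the slides.

```python
import time


def benchmark(module, items):
    """Run `module` on every item and report latency and throughput.

    latency: seconds taken to process the whole input load.
    throughput: items processed per second.
    """
    start = time.perf_counter()
    results = [module(item) for item in items]
    latency = time.perf_counter() - start
    throughput = len(items) / latency if latency > 0 else float("inf")
    return results, latency, throughput


def stand_in_module(item):
    """Placeholder for a real component such as the junk filter."""
    return item * 2


# Sweep input loads that double at each step, as on the benchmark plots.
for load in [1, 2, 4, 8, 16]:
    _, latency, throughput = benchmark(stand_in_module, list(range(load)))
    print(f"load={load}: latency={latency:.6f}s, throughput={throughput:.0f} items/s")
```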
[Figure: Duplicate Filter. Latency (seconds) and throughput (items/second) versus input load (per second, 0 to 4096).]
[Figure: Junk Filter. Latency (seconds) and throughput (items/second) versus input load (per second, 0 to 4096).]
[Figure: Landslide Detector. Latency (seconds) and throughput (items/second) versus input load (per second, 0 to 4096).]
[Figure: Geolocation Tagger, with cache vs. without cache. Latency (seconds) and throughput (items/second) versus input load (per second, 0 to 4096).]
Real-world Deployment
• Online since February 2020 to monitor the live Twitter stream globally
• 339 multilingual keywords in 32 languages
• February 2020 – December 2021
  • Collected more than 54 million tweets and 15 million image URLs
  • ~2.5 million image URLs deemed unique and downloaded for further analysis
  • ~17,000 images classified as relevant, unique, and landslides
    • Corresponds to <1% of the collected images
    • Highlights the challenging nature of the problem
  • ~6,500 landslide reports shared by personal accounts whereas ~4,500 by organizational accounts
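The scale of the filtering can be checked with simple arithmetic on the approximate counts reported above. The stage names are paraphrased, and the ratios are only as precise as the rounded figures on the slide.

```python
# Approximate counts from the deployment summary (February 2020 to December 2021).
STAGES = [
    ("tweets collected", 54_000_000),
    ("image URLs collected", 15_000_000),
    ("unique images downloaded", 2_500_000),
    ("images classified as landslides", 17_000),
]


def stage_ratios(stages):
    """Return (name, count, ratio to the previous stage) for each later stage."""
    return [
        (name, count, count / prev_count)
        for (_, prev_count), (name, count) in zip(stages, stages[1:])
    ]


ratios = stage_ratios(STAGES)
for name, count, ratio in ratios:
    print(f"{name}: {count:,} ({ratio:.2%} of previous stage)")
```

The last ratio, about 0.68% of the downloaded images, is consistent with the "<1%" figure on the slide.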
Real-world Deployment – Data Statistics
[Figure: data volume over time (logarithmic scale, 1 to 1,000,000), 2020-02-01 to 2021-12-31. Series: Raw Tweets, Raw Images, Relevant Images, Non-duplicate Images, Landslide Images.]
Real-world Deployment – Verification
• Randomly sampled 3,600 images processed by the system
• Asked experts to label the sampled images
• System-predicted labels compared to expert annotations
                           True    False
Landslide (positive)       123     39
Not-landslide (negative)   3,395   43
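Reading the table as 123 true positives, 39 false positives, 3,395 true negatives, and 43 false negatives (an interpretation of the slide layout, whose four cells sum to the 3,600 sampled images), standard metrics can be derived as in this sketch. The metric values are computed here and are not quoted from the slides.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


# Counts as interpreted from the verification table.
metrics = classification_metrics(tp=123, fp=39, tn=3395, fn=43)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because negatives dominate the sample, accuracy is much higher than precision or recall on the landslide class, so the latter two are the more informative figures here.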
Real-world Deployment – Worldwide Reports
NASA landslide susceptibility map
Real-world Deployment – Country Maps
Real-world Deployment – Quarterly Maps
• The US, Ecuador, Colombia, and India experience significant landslide numbers all year round.
• For India, landslides become even more prevalent in Q3.
• Mexico experiences a significant increase in Q3.
• Prominent landslide numbers in Indonesia and Malaysia occur in Q1 and Q4.
• Prominent landslide numbers in the UK occur in Q1 and Q2.
• Turkey experiences most landslides in Q1 through Q3.
Landslide Reports in Italy
Conclusion
• An interdisciplinary collaboration between computer scientists
(QCRI), seismologists (EMSC), and landslide specialists (BGS).
• The system leverages online social media data in real time to identify
landslide reports automatically using state-of-the-art AI techniques:
  • Reduces information overload by eliminating duplicate and irrelevant content
  • Identifies landslide images
  • Infers their geolocation
  • Categorizes the user type (organization or person)
• The real-world deployment shows the success of the system.
• We believe that our system can contribute to the harvesting of global
landslide data and facilitate further landslide research.
• It can support global landslide susceptibility maps to provide
situational awareness and improve emergency response and decision
making.
• Next steps:
  • Historical data analysis with ground truth from other sources, e.g., BGS, NASA, EM-DAT
  • Spatiotemporal detection of events
Thank you!
https://landslide-aidr.qcri.org/service.php
Please give us feedback!