Using Spark Streaming and ML for real-time insights into humanitarian crises
Project Fortis
Erik Schlegel @erikschlegel1
Microsoft - Senior Engineer
Erik Schlegel
• Senior Engineer – Microsoft / PCT
• Focus on emerging technology projects with innovative partners
Project Fortis: A collaboration with the UN
Challenges
• Humanitarian aid plans are composed manually
• The resulting aid relief plans are imprecise
• Aid planners monitor 400+ data sources daily
• Impacted areas often have limited accessibility
• Turnaround is slow
Project Fortis: Goals
• Accelerate the construction of aid plans
• Improve the accuracy of the data behind them
• Provide deeper insights and trends
• Deliver real-time analytics
• Supply more intelligence and insight to enable better forecasting
With those goals in mind, we built Fortis, which at its core is a
data ingestion, analysis, and visualization pipeline.
Project Fortis - Showcase
Spark Streaming: Realtime Data Pipeline
• Built on Spark Streaming connector extensions
• Social and public DStreams are collected in real time
• Events are processed and merged through Spark
• Cassandra acts as the state engine for aggregation
• Results are published to Event Hub topics
• Consumers are notified through GraphQL subscriptions
Kubernetes ACS Cluster
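The flow on this slide can be sketched in a few lines of plain Python. This is only an illustration of the collect, process, aggregate, publish order, not the Fortis implementation: a dict stands in for the Cassandra aggregation state, a list of callbacks stands in for Event Hub topics and GraphQL subscribers, and the function names and the watched-topic list are hypothetical.

```python
from collections import defaultdict

# Hypothetical stand-ins: a dict plays the role of the Cassandra aggregation
# state, and a list of callbacks plays the role of the subscribers.
state = defaultdict(int)
subscribers = []

def analyze(event):
    """Toy feature extraction: tag the event with the watched topics it mentions."""
    watched = {"flood", "attack", "outbreak"}  # illustrative topic list
    words = {w.strip(".,;").lower() for w in event["message"].split()}
    return {**event, "topics": sorted(words & watched)}

def process_batch(batch):
    """Process one micro-batch: analyze, merge into state, publish the changes."""
    changed = set()
    for event in map(analyze, batch):
        for topic in event["topics"]:
            key = (event["source"], topic)
            state[key] += 1
            changed.add(key)
    for key in sorted(changed):
        for notify in subscribers:
            notify(key, state[key])

received = []
subscribers.append(lambda key, count: received.append((key, count)))
process_batch([
    {"source": "twitter", "message": "Flood reported near the river."},
    {"source": "radio", "message": "Attack reported; flood warnings continue."},
])
```

After the call, `received` holds one notification per (source, topic) pair whose count changed in the batch.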
Data Pipeline: Functional Architecture
[Architecture diagram. Incoming data streams: Twitter, Facebook, Instagram, Bing, radio stream, Event Hub, Kafka. ACS containers: Spark Streaming connector receivers. Spark job components (stream batches RDD1 to RDD5): transform, filter, feature extraction (sentiment, topics, location, computer vision / speech), tile aggregation. ML components/services: AI Cognitive Services (Computer Vision, Text Analytics API, Speech API), image / video object detection, OpenStreetMap planet file. Outputs: data persistence, Event Hub topic, message pusher, API subscriptions publishing real-time insights to consumers. Client data flow: Fortis admin interface, Fortis site definition, data mutations, manage data dependencies.]
Geospatial Requirements
• Geotag mentioned places from conversations
• Query activities within a bounding box
• Filter activities based on topic(s)
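The last two requirements amount to a spatial predicate combined with a topic predicate. A minimal sketch, assuming each activity has already been geotagged with a latitude, a longitude, and a set of topics; the `Activity` and `BoundingBox` types and the `query` function are hypothetical names, not part of Fortis.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Activity:
    lat: float
    lon: float
    topics: frozenset

@dataclass(frozen=True)
class BoundingBox:
    south: float
    west: float
    north: float
    east: float

    def contains(self, lat, lon):
        # Assumes the box does not cross the antimeridian (west <= east).
        return self.south <= lat <= self.north and self.west <= lon <= self.east

def query(activities, bbox, topics):
    """Return activities inside bbox that mention at least one requested topic."""
    wanted = frozenset(topics)
    return [a for a in activities
            if bbox.contains(a.lat, a.lon) and a.topics & wanted]

activities = [
    Activity(36.9741, -122.0308, frozenset({"flood"})),
    Activity(36.9741, -122.0308, frozenset({"sports"})),
    Activity(51.5074, -0.1278, frozenset({"flood"})),
]
area = BoundingBox(south=36.5, west=-122.5, north=37.5, east=-121.5)
matches = query(activities, area, ["flood"])
```

Only the first activity matches: the second is inside the box but off topic, and the third is on topic but outside the box.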
Heatmap Generation
Aggregate detected places across geographic regions
XYZ Tiles for summarization
• Divides the world up into tiles.
• Each tile has four children at the next higher zoom level.
• Maps multiple layers to a single space dimension.
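The parent/child relationship is plain integer arithmetic on tile coordinates. The sketch below assumes tiles are addressed as (zoom, row, col), which is how the tile IDs later in the deck appear to be laid out; the function names are illustrative.

```python
def children(zoom, row, col):
    """Four child tiles at zoom + 1 that exactly cover tile (zoom, row, col)."""
    return [(zoom + 1, 2 * row + dr, 2 * col + dc)
            for dr in (0, 1) for dc in (0, 1)]

def parent(zoom, row, col):
    """The single tile at zoom - 1 that contains tile (zoom, row, col)."""
    if zoom == 0:
        raise ValueError("the zoom-0 tile has no parent")
    return (zoom - 1, row // 2, col // 2)

def tiles_at(zoom):
    """Number of tiles covering the world at a zoom level: 4 ** zoom."""
    return 4 ** zoom
```

Because each tile's parent is found by halving its row and column, a count stored for a tile can be rolled up to any coarser zoom level without going back to the raw points.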
Heatmap Spark Mapper
For each location, map to tiles at every zoom level:
(36.9741, -122.0308) → [
(10_398_164, 1), (11_797_329, 1),
(12_1594_659, 1), (13_3189_1319, 1),
(14_6378_2638, 1), (15_12757_5276, 1),
(16_25514_10552, 1), (17_51028_21105, 1),
(18_102057_42211, 1)
]
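The mapper step can be sketched in plain Python using the standard Web Mercator (slippy-map) tile formula. Reading the slide's IDs as `zoom_row_col` and the zoom range as 10 to 18 are assumptions inferred from the example; `tile_id` and `map_location` are illustrative names rather than Fortis functions.

```python
import math

MIN_ZOOM, MAX_ZOOM = 10, 18  # zoom range taken from the slide's example

def tile_id(lat, lon, zoom):
    """Web Mercator (XYZ) tile containing a point, formatted as 'zoom_row_col'."""
    n = 2 ** zoom
    col = int((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    row = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so points on the east/south edge stay inside the grid.
    col = min(max(col, 0), n - 1)
    row = min(max(row, 0), n - 1)
    return f"{zoom}_{row}_{col}"

def map_location(lat, lon):
    """Emit one (tile_id, 1) pair per zoom level for a single location."""
    return [(tile_id(lat, lon, z), 1) for z in range(MIN_ZOOM, MAX_ZOOM + 1)]

pairs = map_location(36.9741, -122.0308)
```

For the slide's example point this yields nine pairs, beginning with `("10_398_164", 1)` and `("11_797_329", 1)`.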
Heatmap Spark Reducer
Reduce all mappings with the same key into an aggregate value:
(10_398_164, 151) ← [
(10_398_164, 1), (10_398_164, 1), …
(10_398_164, 1), (10_398_164, 1), …
(10_398_164, 1), (10_398_164, 1), …
(10_398_164, 1), (10_398_164, 1), …
(10_398_164, 1)
]
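In Spark this step corresponds to `reduceByKey` with an addition function. The same reduction, sketched in plain Python with made-up input counts so it runs without a cluster:

```python
from collections import Counter

def reduce_by_key(pairs):
    """Sum the values of all (key, value) pairs that share a key."""
    totals = Counter()
    for key, value in pairs:
        totals[key] += value
    return dict(totals)

# Illustrative input: 151 mentions in one tile, 3 in another.
mapped = [("10_398_164", 1)] * 151 + [("11_797_329", 1)] * 3
heatmap = reduce_by_key(mapped)
```

Addition is associative and commutative, so partial sums can be computed per partition and merged in any order, which is what makes this reduction safe to distribute.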
Heatmap Spark Algorithm
For each location, compute a tile id key/value pair for all zoom levels
Heatmap Spark
Build the heatmap then save the reduced dataset to Cassandra
Aggregated Tile Schema
Aggregated Tile Data - Cassandra: fortis_tiles_tbl
tile_x | tile_y | tile_z | period | source | publisher | topic | lang | feature_collection
15 | 13346 | 15 | month-2017-06 | Twitter | Al Jazeera | isis | en | {"mention_count": 1213, "sentiments_avg": {"neg": 0.78, "pos": 0.02}, "entities": [{"name": "bashar assad", "mention_count": 627, "ref_id": "3256"}]}
231 | 3345 | 14 | month-2017-06 | Facebook | Times of Libya | isis | en | {"mention_count": 453, "sentiments_avg": {"neg": 0.78, "pos": 0.02}, "entities": [{"name": "bashar assad", "mention_count": 124, "ref_id": "3256"}]}
76 | 98242 | 13 | month-2017-06 | Facebook | Times of Libya | isis | en | {"mention_count": 453, "sentiments_avg": {"neg": 0.78, "pos": 0.02}, "entities": [{"name": "bashar assad", "mention_count": 124, "ref_id": "3256"}]}
All inbound events are aggregated by
tile_x, tile_y, tile_z, period, source, publisher, topic, lang
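Grouping on that composite key can be sketched as follows. The key fields come from the slide; the per-event `neg` score, the `aggregate` function, and the sample values are hypothetical, and a running mean of negative sentiment is just one plausible way to fill a field like `sentiments_avg`.

```python
from collections import defaultdict

KEY_FIELDS = ("tile_x", "tile_y", "tile_z", "period",
              "source", "publisher", "topic", "lang")

def aggregate(events):
    """Group events by the composite key, keeping a mention count and the
    mean negative-sentiment score for each group."""
    groups = defaultdict(lambda: {"mention_count": 0, "neg_sum": 0.0})
    for event in events:
        key = tuple(event[f] for f in KEY_FIELDS)
        groups[key]["mention_count"] += 1
        groups[key]["neg_sum"] += event["neg"]
    return {
        key: {"mention_count": g["mention_count"],
              "neg_avg": g["neg_sum"] / g["mention_count"]}
        for key, g in groups.items()
    }

base = {"tile_x": 15, "tile_y": 13346, "tile_z": 15, "period": "month-2017-06",
        "source": "Twitter", "publisher": "Al Jazeera", "topic": "isis", "lang": "en"}
rows = aggregate([
    {**base, "neg": 0.5},
    {**base, "neg": 1.0},
    {**base, "source": "Facebook", "neg": 0.25},
])
```

The two Twitter events collapse into one row with a count of 2, while the Facebook event lands in its own row because `source` is part of the key.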
NLP Feature Extraction
Event Details – Cassandra: fortis_events_tbl
Event 1
event_id: 851144530439090180
source: twitter
title: Syria’s Assad is ‘inextricably connected’ to Islamic State
src_url: https://twitter.com/AJEnglish/status/851144530439090180
detected_features: ["wof85678363"]
publisher: Al Jazeera
topics: ["isis", "bombings"]
lang: en
message_body: Far from fighting ISIS, Assad looked the other way when it set up many suicide bombings. A former Islamist from Hamah province who fought
feature_collection: {"sentiments_avg": {"neg": 0.78, "pos": 0.02}, "entities": [{"name": "bashar assad", "ref_id": "3256"}]}
Event 2
event_id: 861931221512847360
source: twitter
title: Suicide Attack at Tank Brigade
src_url: https://twitter.com/AJEnglish/status/85114453043909018
detected_features: ["wof85678323"]
publisher: Libya Herald
topics: ["suicide", "attack", "clashes"]
lang: en
message_body: A suicide attack was reported at the 204 Tank Battalion headquarters which is in close proximity to 17 February Brigade camp. Clashes then broke out in Fuwayhat
feature_collection: {"sentiments_avg": {"neg": 0.78, "pos": 0.02}, … "ref_id": "3256"}]}
Event Stream
{"source":"twitter","created_at":"2017-05-09T13:09:51.000Z","message":{"created_at":"2017-05-09T13:09:51.000Z","id":"861931221512847360","user_id":"114544915","geo":null,"originalSources":["Al Jazeera"],"lang":"en","message":"A suicide attack was reported at the 204 Tank Battalion headquarters which is in close proximity to 17 February Brigade camp. Clashes then broke out in Fuwayhat district, where 204 Tank Brigade started fighting against 17 February Brigade. 3 reported dead.","Title":"Suicide Attack at Tank Brigade"}}
{"source":"twitter","created_at":"2017-05-09T13:09:51.000Z","message":{"created_at":"2017-05-09T13:09:51.000Z","id":"861931221512847311","user_id":"114544915","geo":null,"originalSources":["Libya Herald"],"lang":"en","message":"Far from fighting ISIS, Assad looked the other way when it set up many suicide bombings. A former Islamist from Hamah province who fought.","Title":"Syria’s Assad is ‘inextricably connected’ to Islamic State"}}
Detected Places
Detected Topics
Detected Entities
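Turning a stream event into an events-table row is mostly a matter of flattening the nested `message` object. A minimal sketch using an abbreviated copy of one stream event; taking `publisher` from the first entry of `originalSources` is an assumption, `to_event_row` is a hypothetical name, and the detected places, topics, and entities would come from the NLP stage, which is not modeled here.

```python
import json

# Abbreviated copy of one stream event; the message text is shortened here.
RAW_EVENT = json.dumps({
    "source": "twitter",
    "created_at": "2017-05-09T13:09:51.000Z",
    "message": {
        "id": "861931221512847360",
        "lang": "en",
        "originalSources": ["Al Jazeera"],
        "message": "A suicide attack was reported at the 204 Tank Battalion headquarters.",
        "Title": "Suicide Attack at Tank Brigade",
    },
})

def to_event_row(line):
    """Flatten one stream event into column names used by the events table."""
    event = json.loads(line)
    body = event["message"]
    sources = body.get("originalSources") or [None]
    return {
        "event_id": body["id"],
        "source": event["source"],
        "title": body.get("Title"),
        "publisher": sources[0],  # assumption: first original source
        "lang": body.get("lang"),
        "message_body": body.get("message"),
    }

row = to_event_row(RAW_EVENT)
```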
Realtime Audio Streaming with Spark
• Speech recognition APIs transcribe radio broadcasts
"message": "This just coming in now. A bombing was reported in the town of Kamir outside of Aleppo. 10 people appear to be injured."
Kubernetes Helm Contributions
• Spark 2.1 / Zeppelin
• High availability chart for Kafka
• High availability chart for Cassandra
• ELK chart for logging
Available on github.com/CatalystCode/charts
Streaming Connector Contributions
• Instagram / Computer Vision Integration
com.github.catalystcode:streaming-instagram_2.11
• Bing streaming with customized search ranking
com.github.catalystcode:streaming-bing_2.11
• Facebook
com.github.catalystcode:streaming-facebook_2.11
Spark Streaming
Accelerating our ability to use data for good
Thank You.
Erik Schlegel @erikschlegel1
Do more open source!
github.com/catalystcode/project-fortis
Spark Summit - Microsoft Project Fortis
