“Scalable, Pluggable, and Fault-Tolerant Multi-Modal Situational Awareness Data Stream Management Systems”
By Michael Partin
Monday, July 8, 2019
Committee: Drs. Amit Sheth, Advisor, TK Prasad, and Valerie Shalin (Department of Psychology)
ABSTRACT:
Features and attributes that describe an event (disasters, social movements, etc.) are heterogeneous in nature. For virtually all events that impact humans, technology enables us to capture a large amount and variety of data from many sources, including humans (e.g., social media) and sensors/Internet of Things (IoT) devices. The corresponding modalities of data include text, imagery, voice, and video, along with structured data such as gazetteers (i.e., location-based data) and government and statistical data. However, even though an abundance of information is often produced, it is fragmented across the various modalities and sources. The DisasterRecord system aims to combine (interlink and integrate) data streams of different modalities in a meaningful way, with flood events as the in-depth use case. DisasterRecord was originally developed as a demo for the IBM CallForCode challenge in 2018, to showcase the efforts of the team at Kno.e.sis in combining and analyzing multimodal data. This thesis represents extensive follow-on work in the areas of deployability, flexibility, and reliability. Specific topics addressed are: a method that uses current technologies to deploy easily into cloud infrastructure; modifications that make it easier to extend and modify the multimodal analysis pipeline; and reliability improvements that make the system stable.
1. Scalable, Pluggable, and Fault Tolerant Multi-Modal
Situational Awareness Data Stream Management Systems
-Updates and Improvements to DisasterRecord
Michael Partin — Summer 2019
COMMITTEE MEMBERS:
Dr. Krishnaprasad Thirunarayan
Dr. Amit Sheth
Dr. Valerie L. Shalin
3. About Me
Michael Partin
Education:
MS Computer Science (Wright State University 2020)
BS Computer Engineering (Wright State University 2017)
AAS Electronic Engineering Technology (Sinclair College 2010)
Related Work:
Twitris [Magpie Data Collector / Monitoring / Plugins]
System Administration [OpenStack / Ceph / AWS]
Work:
CognoviLabs [Multilingual NLP]
5. Overview: Previous Work
● What is DisasterRecord?
DisasterRecord is a tool used to provide event-centric situational analysis in real-time. Events
(disasters, social movements, etc.) are heterogeneous in nature with many modalities and
sources. DisasterRecord is designed to combine information from various modalities in a
meaningful way to help users gather a more complete analysis of a given situation.
● What work had been done on DisasterRecord previously?
DisasterRecord comes from the efforts of many great minds here at Kno.e.sis at Wright State
University along with collaboration at Ohio State University. A demo of DisasterRecord was
developed by many members of Kno.e.sis and submitted for the IBM CallForCode challenge. At
that time it analyzed only one specific event, the 2015 Chennai flood.
6. Overview: Current Work
● What work was done here?
The IBM CallForCode demo showcased the capabilities of the DisasterRecord system but lacked
in a few areas such as flexibility, scalability, and portability. This presentation describes the
efforts in each of these areas. Deployment methods and architecture are discussed to address
portability; microservice architecture is discussed to address scalability; and an improved image
processing pipeline is discussed showcasing a higher level of flexibility.
8. Deployment Architecture - Technologies
● Ansible makes cloud automation easy and reliable
● Powerful centralized control over large clusters
● Jinja templates help to make deployment flexible
[Diagram labels: Django, SQL, ES, Redis, Storm, Kafka]
10. Deployment Architecture - Technologies
● Docker containers solve dependency problems
● Vast repository for standard services
● Containers are portable across many platforms
Develop → Push to Repo → Deploy on Server
11. Deployment Architecture - Technologies
● Kubernetes is a container management (orchestration) system
● Perfect fit for microservice architecture
● Scalable, self-healing, highly extensible
Source: http://www.joseluisgomez.com/containers/hands-on-kubernetes-pods/
Phusion Base Image: https://github.com/phusion/baseimage-docker
16. Microservice Architecture
microservices.io describes microservices as:
Microservices - also known as the microservice architecture - is an architectural style
that structures an application as a collection of services that are
● Highly maintainable and testable
● Loosely coupled
● Independently deployable
● Organized around business capabilities
● Owned by a small team
https://microservices.io/
17. Microservice Architecture - Overview
[Diagram: SQL, Redis, Web Host, ES, DR Worker, Core (x3), LNEx, Object Det, Text Class, Image, Plugin A, Plugin B, Photon]
18. Microservice Architecture - Overview
Example Use Case: Setup Campaign
[Diagram: SQL, Redis, Web Host, ES, DR Worker, Core (x2), LNEx, Object Det, Text Class, Image, Plugin A, Twitter Stream, Photon]
19. Microservice Architecture - Overview
Step: Update DB
[Diagram: SQL, Redis, Web Host, ES, DR Worker, Core (x2), LNEx, Object Det, Text Class, Image, Plugin A, Twitter Stream, Photon]
20. Microservice Architecture - Overview
Step: Detect Change
[Diagram: SQL, Redis, Web Host, ES, DR Worker, Core (x2), LNEx, Object Det, Text Class, Image, Plugin A, Twitter Stream, Photon]
21. Microservice Architecture - Overview
Step: Spawn Core
[Diagram: SQL, Redis, Web Host, ES, DR Worker, Core (x3), LNEx, Object Det, Text Class, Image, Plugin A, Twitter Stream, Photon]
22. Microservice Architecture - Overview
Step: Core Processing
[Diagram: SQL, Redis, Web Host, ES, DR Worker, Core (x3), LNEx, Object Det, Text Class, Image, Plugin A, Twitter Stream, Photon]
23. Microservice Architecture - Overview
Step: Update DB
[Diagram: SQL, Redis, Web Host, ES, DR Worker, Core (x3), LNEx, Object Det, Text Class, Image, Plugin A, Twitter Stream, Photon]
24. Microservice Architecture - Technologies
● Flask is a Python based web application microframework
● Wrap functionality and create endpoints
● Makes processing portable and accessible
[Diagram: Object Detection Microservice (Flask wrapping the Object Detection TF Model)]
GET /classify?url=https://www.somewebpage.com/someimage.jpg
RESPONSE: {"objects":[{"class":"person","position":[24.54,11.3...
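A client calling the endpoint above would normally percent-encode the image URL before placing it in the query string. The following is a minimal sketch using only the Python standard library; the host `http://localhost:5000` and the helper names `build_classify_url` and `extract_image_url` are assumptions for illustration and are not part of DisasterRecord.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_classify_url(service_base, image_url):
    """Return the /classify request URL with the image URL percent-encoded."""
    query = urlencode({"url": image_url})
    return service_base.rstrip("/") + "/classify?" + query


def extract_image_url(request_url):
    """Recover the original image URL from a /classify request URL."""
    params = parse_qs(urlsplit(request_url).query)
    return params["url"][0]


request_url = build_classify_url(
    "http://localhost:5000",  # hypothetical host and port of the microservice
    "https://www.somewebpage.com/someimage.jpg",
)
print(request_url)
```

Encoding the parameter keeps any `?` or `&` characters inside the image URL from being mistaken for part of the service's own query string.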
25. Microservice Architecture - Technologies
● TensorFlow is a software library for creating stateful dataflow
graphs for easy deployment of computation topologies
● Used extensively in machine learning applications
● Many higher-level frameworks utilize TensorFlow as a backend
Source: Cui, B., Li, Y., Zhang, Y., & Zhang, Z. (2017).
Text Coherence Analysis Based on Deep Neural
Network. ArXiv, abs/1710.07770.
Source: https://colah.github.io/posts/2015-08-Understanding-LSTMs/
Source: Saxena, Pratha (2019) Multi Layer Perceptron with Tensorflow
26. Microservice Architecture - Technologies
● SQLite is a light-weight relational database
● Used to keep metadata related to campaigns
● Keeps the state of data processing
CampaignID  CampaignName  Centroid    BoundingBox                        ES-Index  ...
0           Harvey        -95.5,30.0  28.82,-98.15,31.33,-93.61          harvey2   ...
1           Michael       -85.3,30.0  29.5521,-87.3273,31.0231,-84.5319  michael   ...
2           Florence      -77.9,34.2  33.6971,-80.5936,34.9256,-77.3588  florence  ...
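As a rough illustration of how such campaign metadata could be kept in SQLite, the sketch below builds an in-memory table from the rows shown above. The table name, column names, and column types are assumptions (the slide elides further columns with "..."), and `bounding_box_for` is a hypothetical helper.

```python
import sqlite3

# In-memory database for illustration; a deployment would use a file path.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE campaigns (
        campaign_id   INTEGER PRIMARY KEY,
        campaign_name TEXT NOT NULL,
        centroid      TEXT,
        bounding_box  TEXT,
        es_index      TEXT
    )
    """
)

# Rows copied from the slide's example table.
rows = [
    (0, "Harvey", "-95.5,30.0", "28.82,-98.15,31.33,-93.61", "harvey2"),
    (1, "Michael", "-85.3,30.0", "29.5521,-87.3273,31.0231,-84.5319", "michael"),
    (2, "Florence", "-77.9,34.2", "33.6971,-80.5936,34.9256,-77.3588", "florence"),
]
conn.executemany("INSERT INTO campaigns VALUES (?, ?, ?, ?, ?)", rows)
conn.commit()


def bounding_box_for(connection, campaign_name):
    """Return the campaign's bounding box as a list of floats, or None if unknown."""
    row = connection.execute(
        "SELECT bounding_box FROM campaigns WHERE campaign_name = ?",
        (campaign_name,),
    ).fetchone()
    return [float(v) for v in row[0].split(",")] if row else None


print(bounding_box_for(conn, "Harvey"))
```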
27. Microservice Architecture - Technologies
● Elasticsearch is a search and analytics engine built on
Apache Lucene
● RESTful communication that utilizes JSON
● Distributed, scalable, reliable
{
  "objects": {
    "person": 10,
    "vehicles": 2,
    "animals": 0
  },
  "isFlood": true,
  "locationMentions": [1432,2492,543],
  "estWaterHeight": 10,
  "needs": {
...
Indices
yellow open irma2-file 0hdqu21PTqOpitzRlf4wyQ 5 1 4555 0 1.2mb 1.2mb
yellow open harvey-file hEoGbE3fQVip2HENd_nCjA 5 1 4839 0 2.4mb 2.4mb
yellow open random-tweetneeds Dpplj6UcSXyAJ0TUTssrbg 5 1 55 0 147.5kb 147.5kb
yellow open michael-osm 74DVfmPiSLO0-GjZ7CmM2A 5 1 795 0 170.5kb 170.5kb
yellow open irma-tweetneeds DfKZEk67TVSU2vvjwY7KUw 5 1 170 0 214.7kb 214.7kb
yellow open florence-osm VSVbVcS6Sn6wfqWfMtlPVg 5 1 1126 0 470.8kb 470.8kb
yellow open harveytest-tweetneeds i3Bpisn7QWiMssLO9Ue1gw 5 1 4687 0 1.9mb 1.9mb
yellow open irma-osm JJxRFXhPTimCsyupo-bpKw 5 1 0 0 955b 955b
28. Microservice Architecture - Technologies
● Location Name Extraction Tool (LNEx)
● Developed at Kno.e.sis by Hussein Al-Olimat
● When provided with a geo-bounding box and texts LNEx will
extract the location mentions from the given texts
“@PubliusTX I thought Houston First
maintained the parking facilities and caused
the flooding during last flood at City Hall?”
"Hoosier Rd students talk about Houston
project for flood-ravaged school
https://t.co/pNyGkxzMdB"
"Drop off school items at Children's Museum
for Houston flood victims - Museums leading
society #jhumda https://t.co/xeL76dgOAw via
@indystar"
Source: https://boundingbox.klokantech.com/
38. Image Models: Overview
Why these TensorFlow models?
These models come from a source that has:
● Community support
● Active repositories
● Thorough documentation
● Methods for transfer learning
Tensorflow Models: https://github.com/tensorflow/models
DeCAF Paper: http://arxiv.org/abs/1310.1531
41. Image Models: Object Detection - Retrain [Train]
Prep Annotated Data:
dev@knoesis:~$ cd ~;unzip code.zip;cd code
dev@knoesis:~$ cp ~/workspace/research/object_detection/export_inference_graph.py .
dev@knoesis:~$ #find the xml_to_csv.py script and use it to convert annotations to csv
dev@knoesis:~$ #note: the following scripts may require packages and/or extra steps and should be considered an example / guide
dev@knoesis:~$ #python xml_to_csv.py -p annotations/train/ -o trainingdata.csv
dev@knoesis:~$ #python class_gen.py trainingdata.csv > data/object-det.pbtxt
dev@knoesis:~$ #python gen_tf_record_mod.py --class_file data/object-det.pbtxt --csv_input trainingdata.csv --output_path data/aerial_train.record
Guide: https://becominghuman.ai/tensorflow-object-detection-api-tutorial-training-and-evaluating-custom-object-detector-ed2594afcf73
Download: http://download.tensorflow.org/models/object_detection/ssd_mobilenet_v1_coco_11_06_2017.tar.gz
Notes and Research Google Drive: https://drive.google.com/open?id=1rJ9BcFjG5BHDcRerO9PwqgpAjg8SyDUr
Train:
dev@knoesis:~$ #obtain mobilenet_v1_coco model from link below
dev@knoesis:~$ vim ssd_mobilenet_v1_coco.config #edit hyperparameters
dev@knoesis:~$ python train.py --logtostderr --train_dir=training/ --pipeline_config_path=training/ssd_mobilenet_v1_coco.config #find in legacy?
42. Image Models: Object Detection - Retrain [Export]
Export Graph:
dev@knoesis:~$ python -u export_inference_graph.py --input_type=image_tensor --pipeline_config_path=training/ssd_mobilenet_v1_coco.config --trained_checkpoint_prefix=training/<model_checkpoint> --output_directory=training
Guide: https://becominghuman.ai/tensorflow-object-detection-api-tutorial-training-and-evaluating-custom-object-detector-ed2594afcf73
Download: http://download.tensorflow.org/models/object_detection/ssd_mobilenet_v1_coco_11_06_2017.tar.gz
Notes and Research Google Drive: https://drive.google.com/open?id=1rJ9BcFjG5BHDcRerO9PwqgpAjg8SyDUr
The exported (frozen) inference graph can then be used in a production or
standalone environment.
43. Image Models: Object Detection - Standalone
Test retrained Object Detection “ssd_mobilenet_v1_coco”:
dev@knoesis:~$ #standalone_OD.py: https://drive.google.com/open?id=1rJ9BcFjG5BHDcRerO9PwqgpAjg8SyDUr
dev@knoesis:~$ #contact mikep@knoesis.org or mike.partin@gmail.com for standalone_OD.zip
dev@knoesis:~$ python apply_to_image_batch.py #script to test on batch of images
dev@knoesis:~$ python standalone_OD.py #script to test on image url
LabelImg Windows Binaries: https://tzutalin.github.io/labelImg/
LabelImg: https://github.com/tzutalin/labelImg
standalone_OD.py contains the paths to the model and labels. Several models have
been retrained for various purposes, such as detecting parts of objects
with known heights. Below is sample output from standalone_OD.py:
dev@knoesis:~$ python standalone_OD.py "http://myimagesurl.com/img1.jpg"
head,0.430,590,284,640,329
torso,0.280,590,327,655,386
head,0.182,590,299,632,330
window,0.124,470,236,515,268
window,0.121,703,3,743,30
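Output in this shape is easy to post-process. The sketch below parses the lines into records and drops low-confidence detections. It assumes each line is `label,score,` followed by four integer box coordinates (the coordinate order is not stated on the slide), and the `min_score` cutoff of 0.25 is purely illustrative.

```python
from typing import NamedTuple, Tuple


class Detection(NamedTuple):
    label: str
    score: float
    box: Tuple[int, ...]  # four box coordinates, in the order printed by the script


def parse_detections(output, min_score=0.0):
    """Parse 'label,score,c1,c2,c3,c4' lines and keep those at or above min_score."""
    detections = []
    for line in output.strip().splitlines():
        label, score, *coords = line.strip().split(",")
        detection = Detection(label, float(score), tuple(int(c) for c in coords))
        if detection.score >= min_score:
            detections.append(detection)
    return detections


sample = """head,0.430,590,284,640,329
torso,0.280,590,327,655,386
head,0.182,590,299,632,330
window,0.124,470,236,515,268
window,0.121,703,3,743,30"""

confident = parse_detections(sample, min_score=0.25)  # hypothetical cutoff
for det in confident:
    print(det.label, det.score, det.box)
```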
44. Image Models: Object Detection - Flask
Wrap the Object Detection in Flask:
DisasterRecord Deployment: https://github.com/shrutikar/DisasterRecord/tree/deployment
ObjectDetector Class
__init__:
#load frozen_inference_graph.pb
#load class
load_image_into_numpy_array:
-->image data
#transform image data to np array
<--np array
run_inference_for_single_image:
-->np array
-->graph
#feed np array into graph
#return inference
<--inference dict
extract:
-->image_url
#download image
#load_image_into_numpy_array(image_data)
#returnDict=run_inference_for_single_image
<--returnDict
45. Image Models: Inception V3 - Retrain [Prevision]
Use TF-Slim for retraining:
● <workspace>/models/research/slim/scripts/finetune_inception_v3_on_flowers.sh
● copy the file to <workspace>/models/research/slim and rename
● run the script to ensure it works with the flower example
● find and modify the following:
○ download_and_convert_data.py (add entry for your new dataset)
○ datasets/download_and_convert_<custom>.py (copy flowers and rename)
○ datasets/<custom>.py (again use flowers.py as template)
○ dataset_factory (include your dataset)
Slim: https://github.com/tensorflow/models/tree/master/research/slim
46. Image Models: Inception V3 - Retrain [Prevision]
Use TF-Slim for retraining:
python train_image_classifier.py \
--train_dir=${TRAIN_DIR} \
--dataset_dir=${DATASET_DIR} \
--dataset_name=flowers \
--dataset_split_name=train \
--model_name=inception_v3 \
--checkpoint_path=${CHECKPOINT_PATH} \
--checkpoint_exclude_scopes=InceptionV3/Logits,InceptionV3/AuxLogits \
--trainable_scopes=InceptionV3/Logits,InceptionV3/AuxLogits
*after training export to pb (see TensorFlow Transfer Learning Boil Down on Google Drive)
Slim: https://github.com/tensorflow/models/tree/master/research/slim
47. Image Models: Inception V3 - Flood / Nonflood Model
● Created using 6,000 flood example images along with 4,000 nonflood counter
examples
● The Flood / Nonflood binary classifier was tested on a randomly selected group of
200 images from the 2015 Chennai flood dataset representing the kind of data we
expected to see in the live stream during a disaster event
● The model achieved 87.5% precision and 75.4% recall, giving an F-score of 0.81
● Inception V3 has a reported accuracy of 78.1% with the 1000 classes it was originally
trained on
● In production the confidence had to be 80% or higher for flood in order to classify the
input image as a flood, otherwise the input image was classified as nonflood
54.97% => nonflood 99.99% => flood 99.99% => flood 79.48% => nonflood 95.07% => nonflood
Flood Nonflood
True 49 128
False 7 16
correctly labeled flood: 49
incorrectly labeled flood: 7
incorrectly labeled nonflood: 16
correctly labeled nonflood: 128
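The reported precision, recall, and F-score follow directly from these four counts. The sketch below recomputes them, treating "flood" as the positive class, and also illustrates the 80% production threshold described above. The function names are hypothetical.

```python
def classification_metrics(tp, fp, fn):
    """Compute precision, recall, and F1 for the positive (flood) class."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f1": f1}


def label_from_confidence(flood_confidence, threshold=0.80):
    """Label an image as flood only when the flood confidence meets the threshold."""
    return "flood" if flood_confidence >= threshold else "nonflood"


# Counts from the slide: 49 correct flood, 7 incorrect flood, 16 incorrect nonflood.
metrics = classification_metrics(tp=49, fp=7, fn=16)
print({name: round(value, 3) for name, value in metrics.items()})
print(label_from_confidence(0.7948), label_from_confidence(0.9999))
```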
48. Image Models: Inception V3 - Standalone
Notes and Research Google Drive:
https://drive.google.com/open?id=1rJ9BcFjG5BHDcRerO9PwqgpAjg8SyDUr
FloodDetector Class
getImage:
-->imgURL
#download and resize the image
<--resized image for Inception model
load:
-->imgURL
-->modelName
#id_np=getImage(imgURL)
#load graph from <modelName>.pb
#load classes from <modelName>.txt
#run_inference_for_single_image
<--tuple of (class,confidence)
run_inference_for_single_image:
-->np array
-->graph
#feed np array into graph
#return top inference
<--tuple of (class,confidence)
#Example:
fd=FloodDetector()
fd.load("https://bit.ly/2YA2Izd","flood")
#NOTE THE NAMES OF THE TENSORS FOR INPUT & OUTPUT OF GRAPH
...
y=sess.graph.get_tensor_by_name('InceptionV3/Predictions/Reshape_1:0')
x=sess.graph.get_tensor_by_name('Placeholder:0')
...
49. Image Models: Inception V3 - Flask
DisasterRecord Deployment: https://github.com/shrutikar/DisasterRecord/tree/deployment
[Diagram: Flask wrapping the FloodDetector TF Model]
GET /classify?url=https://www.somewebpage.com/someimage.jpg
RESPONSE: nonflood
50. Image Models: DeepLab - Overview
[1] DeepLab: https://github.com/tensorflow/models/tree/master/research/deeplab
[2] ADE20K Dataset: https://groups.csail.mit.edu/vision/datasets/ADE20K/
According to the DeepLab GitHub Page:
“DeepLab is a state-of-art deep learning model for semantic image segmentation,
where the goal is to assign semantic labels (e.g., person, dog, cat and so on) to every
pixel in the input image.” [1]
[Figure: Original Image, Object Segmentation, Parts Segmentation (ADE20K Dataset [2])]
51. Image Models: DeepLab - Overview
[1] DeepLab: https://github.com/tensorflow/models/tree/master/research/deeplab
ADE20K Dataset: https://groups.csail.mit.edu/vision/datasets/ADE20K/
The DeepLab model is used as-is to detect the boundary of water in an image.
● The results of the DeepLab analysis, combined with object detection, Inception inference, and external knowledge, can support powerful analysis
● What is it used for? Explained in more detail in later...
52. Image Models: DeepLab - Standalone
[1] DeepLab: https://github.com/tensorflow/models/tree/master/research/deeplab
ADE20K Dataset: https://groups.csail.mit.edu/vision/datasets/ADE20K/
DeepLabModel Script:
DeepLabModel Class:
__init__:
-->modelPath
#load TF graph tar from path
<--resized image for Inception model
run:
-->image data
#model produces segmentation map
<--segmentation map
create_pascal_label_colormap:
#generate colormap for various segments
<--colormap
label_to_color_image:
-->label
#obtain the color corresponding to the label
<--color
vis_segmentation:
-->image data
-->segmentation map
#save segmentation map as bmp
#end of DeepLabModel Class
#main of script
import sys
from io import BytesIO
import requests
from PIL import Image
args=sys.argv #assumed: args holds the command-line arguments
#label 22 maps to bodies of water in DeepLab
water_color=FULL_COLOR_MAP[22][0]
d=DeepLabModel("models/SG/deeplabv3_xception_ade20k_train_2018_05_29.tar.gz")
image_url = str(args[1]) #get url from arg
response = requests.get(image_url) #download image
image = Image.open(BytesIO(response.content))
o=d.run(image) #get segmentation map
vis_segmentation(o[0],o[1]) #save as bmp
53. Image Models: DeepLab - Flask
The DeepLab model is not yet wrapped in a Flask app.
● Output from the DeepLab model is a bitmap unlike the other two models
which are JSON responses
● Elasticsearch might not be the best place to store this kind of information
● Work still needs to be done to research where best to store the results of
this analysis
59. LNExAPI: Overview
Redis
SQLite
Photon
Extract “New Orleans” ab4bd48395bdab12 [@PubliusTX I thought
Houston First maintained the parking facilities and caused the flooding
during last flood at City Hall?",...
LNEx Deployment: https://github.com/halolimat/LNEx/tree/LNExAPI-Deployment
LNEx Core https://github.com/halolimat/LNEx
61. LNExAPI: Overview
Redis
SQLite
Photon
Query Zone and set ready bit when
completed
LNEx Deployment: https://github.com/halolimat/LNEx/tree/LNExAPI-Deployment
LNEx Core https://github.com/halolimat/LNEx
Poll...
63. LNExAPI: Overview
Redis
SQLite
Photon
[{"text":"@PubliusTX I thought Houston First maintained the parking facilities
and caused the flooding during last flood at City Hall?"], "results":[3458,12349…
JSON RESPONSE
LNEx Deployment: https://github.com/halolimat/LNEx/tree/LNExAPI-Deployment
LNEx Core https://github.com/halolimat/LNEx
64. LNExAPI: Functionality - RESTful API Endpoints
Method URL
GET /apiv1/LNEx/initZone?key=xx&bb=[lon1,lat1,lon2,lat2]&zone=ZoneName
GET /apiv1/LNEx/destroyZone?key=xx&zone=ZoneName
POST /apiv1/LNEx/bulkExtract?key=xx&zone=ZoneName
POST /apiv1/LNEx/fullBulkExtract?key=xx&zone=ZoneName
GET /apiv1/LNEx/results?key=xx&token=yy
GET /apiv1/LNEx/geoInfo?key=xx&zone=ZoneName&geoIDs=[1,2,3]
GET /apiv1/LNEx/photonID?key=xx&osm_id=1
GET /apiv1/LNEx/zoneReady?key=xx&zone=ZoneName
65. LNExAPI: Functionality - CLI Client
LNExAPI CLI: https://github.com/halolimat/LNEx/blob/LNExAPI-Deployment/LNExAPIClient.md
LNEx Core https://github.com/halolimat/LNEx
Source code, wheel, and help can be found on GitHub repository
from LNExAPI import LNExAPI

def displayResults(results):
    for result in results:
        print("Matches for:",result['text'])
        for entity in result['entities']:
            print("[ ]-->",entity['match'])
            for location in entity['locations']:
                print(" [ ]-->",str(location['coordinate']['lat'])+","+str(location['coordinate']['lon']))

lnex = LNExAPI(key="168ba4d297a8c64a03",host="http://127.0.0.1/") #REPLACE WITH YOUR USER KEY AND HOST
lnex.initZone([-84.6447033333,39.1912856591,-83.2384533333,40.0880515857],"dayton")
print("Zone Dayton is being initialized...")
lnex.pollZoneReady("dayton") #WAITS UNTIL ZONE IS INIT/READY
text=[
    "Your text goes here:",
    "A list of text in which locations will be searched for...",
    "A list of the same size will be returned once you execute a doBulkExtract on the text list",
    "Each item in the returned list will be a list of the entities that matched",]
print("Extracting locations from Dayton Zone...")
result_token,results=lnex.pollFullBulkExtract("dayton",text)
displayResults(results)
66. LNExAPI: Functionality - Server Side
LNExAPI Server: https://github.com/halolimat/LNEx/blob/LNExAPI-Deployment/LNExAPIServer.md
LNEx Core https://github.com/halolimat/LNEx
Tools for controlling and maintaining LNExAPI
● LNExCLI - create and list Users, list Zones, perform “House Cleaning”
● startAPI - start the Django application
● killAPI - shut off the Django application
● startSafetyCheck - start a supplemental service that ensures LNExAPI is active
● killSafetyCheck - disables the supplemental service
Example
● ./LNExCLI create user <name> <email>
● ./LNExCLI activate user <name> <access_level>
67. LNExAPI: Functionality - Performance
Zone       Bounding Box                                 Entities  KM         Init (s)  LT (s)
Dayton     39.70185, -84.311377, 39.920823, -84.092938  8655      378.66     22.58     7.63
NYC        40.477399, -74.25909, 40.916178, -73.700181  1036003   1977.15    256.15    16.07
London     51.28676, -0.510375, 51.691874, 0.334015     486345    3309.55    262.50    31.91
Japan      133.19,32.88,141.98,40.68                    1214426   507176.61  N/A       N/A
Tokyo      33.98, 135.85, 35.9, 139.36                  257428    47719.84   143.24    15.82
Jakarta    -6.5087, 106.5271, -5.8643, 107.2412         98549     613.14     78.89     15.69
Sri Lanka  5.5809, 79.3815, 10.1383, 82.194             72865     21666.06   78.91     15.67
Jordan     29.18, 34.88, 33.38, 39.3                    145913    119150.64  111.17    15.76
70. Multimodal Analysis - Service Types
[Diagram: SQL, Redis, Web Host, ES, DR Worker, Core (x3), LNEx, Object Det, Text Class, Image, Plugin A, Plugin B, Photon]
● Shared Services
○ Provide information to all services
● Core Analysis Services
○ Do not rely upon other core analysis output
○ Analyze the input data
● Custom Analysis Services
○ May require core analysis results for further computations and thus may take much more time
73. Multimodal Analysis - Example: Water Height Estimation
Algorithm for water depth analysis
KnoBase = {
..“head”: {
....“avg_obj_height”: 22,
....“height_above_ground”: 165
..},
..“Torso”: {...
#KnoBase contains average expected heights of
#objects detected in the image
WaterColor ← water class color
74. Multimodal Analysis - Example: Water Height Estimation
Algorithm for water depth analysis
def getColor(x,y):
..return color value at given pixels
def getScale(Obj):
..h ← KnoBase[Obj][‘avg_obj_height’]
..ratio_h ← h / Obj.H #real-world height per pixel
..return ratio_h
water_estimations=[]
75. Multimodal Analysis - Example: Water Height Estimation
Algorithm for water depth analysis
def estimate(Image,DetectedObjects):
..foreach Obj in DetectedObjects:
....Obj_Ratio ← getScale(Obj)
....Pixel_X ← Obj.X + (Obj.W / 2)
....Pixel_Y ← Obj.Y
....PixelColor ← getColor(Pixel_X,Pixel_Y)
....Total_Height ← 0
76. Multimodal Analysis - Example: Water Height Estimation
Algorithm for water depth analysis
....foreach y in range(Pixel_Y, Image.Height):
......#from top to bottom scan segmentation map
......#for change
......if getColor(Pixel_X,y) == PixelColor:
........Total_Height++
......elif getColor(Pixel_X,y) == WaterColor:
........break
......else:
........Total_Height ← -1
........break
77. Multimodal Analysis - Example: Water Height Estimation
Algorithm for water depth analysis
....if Total_Height < 0: continue #no water boundary found below this object
....calc_height ← Obj_Ratio * Total_Height
....water_level ← KnoBase[Obj][‘height_above_ground’] - calc_height
....water_estimations.append(water_level)
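The pseudocode on these slides can be sketched as runnable Python. This is an illustration under stated assumptions, not the project's implementation: the segmentation map is a plain list of rows of color values, `Detection` is a hypothetical record type, the scale is taken as real-world height per pixel so that the result has the same unit as the knowledge base, and objects with no water found below them are skipped.

```python
from dataclasses import dataclass

# Knowledge base entry for "head" as shown on the slide; other entries are elided there.
KNO_BASE = {"head": {"avg_obj_height": 22, "height_above_ground": 165}}


@dataclass
class Detection:
    label: str
    x: int  # left edge of the bounding box, in pixels
    y: int  # top edge of the bounding box, in pixels
    w: int  # box width, in pixels
    h: int  # box height, in pixels


def estimate_water_levels(seg_map, detections, water_color, kno_base=KNO_BASE):
    """Estimate a water level per detected object by scanning down the segmentation map."""
    estimates = []
    for det in detections:
        entry = kno_base.get(det.label)
        if entry is None or det.h <= 0:
            continue  # assumption: skip objects without a known real-world height
        units_per_pixel = entry["avg_obj_height"] / det.h
        column = det.x + det.w // 2
        start_color = seg_map[det.y][column]
        visible_pixels = 0
        reached_water = False
        # Scan from the top of the object downward until the segment color changes.
        for row in range(det.y, len(seg_map)):
            color = seg_map[row][column]
            if color == start_color:
                visible_pixels += 1
            elif color == water_color:
                reached_water = True
                break
            else:
                break  # something other than water lies below the object
        if not reached_water:
            continue  # assumption: no estimate when no water boundary is found
        visible_height = units_per_pixel * visible_pixels
        estimates.append(entry["height_above_ground"] - visible_height)
    return estimates


# Toy 10x3 map: rows 0-5 are the person segment "P", rows 6-9 are water "W".
toy_map = [["P"] * 3 for _ in range(6)] + [["W"] * 3 for _ in range(4)]
head = Detection(label="head", x=0, y=0, w=3, h=2)
print(estimate_water_levels(toy_map, [head], water_color="W"))
```

In the toy example the head box is 2 pixels tall, so each pixel stands for 11 units; 6 visible pixels give 66 units above the water line, and the estimate is 165 - 66 = 99.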
78. Multimodal Analysis - Merging Modalities
Bringing it all together:
"Griggs Rd is extremely flooded!"
{
  "objects": {
    "person": 10,
    "vehicles": 2,
    "animals": 0
  },
  "isFlood": true,
  "locationMentions": [1432,2492,543],
  "estWaterHeight": 10,
  "needs": {
...
Ingest Data → Perform Analysis → Combine Analysis and Build Knowledge Base → Display Analysis
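The "Combine Analysis" stage can be pictured as assembling one record from the separate analysis outputs. The sketch below is a hypothetical helper, not the project's code; it uses only the field names visible in the example document, and the contents of `needs` are left empty because the slide elides them.

```python
import json


def merge_analyses(object_counts, is_flood, location_ids, est_water_height, needs=None):
    """Combine per-modality results into one record shaped like the slide's JSON."""
    return {
        "objects": dict(object_counts),
        "isFlood": bool(is_flood),
        "locationMentions": list(location_ids),
        "estWaterHeight": est_water_height,
        "needs": dict(needs or {}),  # structure not shown on the slide
    }


record = merge_analyses(
    object_counts={"person": 10, "vehicles": 2, "animals": 0},  # object detection
    is_flood=True,                                              # flood classifier
    location_ids=[1432, 2492, 543],                             # location extraction
    est_water_height=10,                                        # water height estimate
)
print(json.dumps(record, indent=2))
```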
79. Multimodal Analysis - Querying Knowledge Base
Why Elasticsearch is used:
I want to see all the vehicles found in the Houston area within the last hour
{
  "query":{
    "bool":{
      "must":[
        {"range":{"timestamp":{"gte":"now-1h"}}},
        {"range":{"objects.vehicles":{"gte":1}}}
      ],
      "filter":{"geo_bounding_box":{
        "coordinate":{
          "bottom_left":{"lat":bb[0],"lon":bb[1]},
          "top_right":{"lat":bb[2],"lon":bb[3]}
        }
      }}
    }
  }
}
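A query of this shape can also be assembled programmatically, which makes the `bb[...]` placeholders concrete. The sketch below only builds the request body and does not contact a cluster. The function name and parameters are hypothetical, `bb` is assumed to be ordered `[lat_min, lon_min, lat_max, lon_max]` as the slide's indexing implies, and the Harvey campaign bounding box from the SQLite slide stands in for the Houston area.

```python
import json


def vehicles_query(bb, since="now-1h", min_vehicles=1):
    """Build a bool query: recent documents with vehicles inside a bounding box."""
    return {
        "query": {
            "bool": {
                "must": [
                    {"range": {"timestamp": {"gte": since}}},
                    {"range": {"objects.vehicles": {"gte": min_vehicles}}},
                ],
                "filter": {
                    "geo_bounding_box": {
                        "coordinate": {
                            "bottom_left": {"lat": bb[0], "lon": bb[1]},
                            "top_right": {"lat": bb[2], "lon": bb[3]},
                        }
                    }
                },
            }
        }
    }


harvey_bb = [28.82, -98.15, 31.33, -93.61]  # from the campaign table example
query = vehicles_query(harvey_bb)
body = json.dumps(query)  # JSON body that would be sent to the search endpoint
print(body)
```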
83. Multimodal Analysis - Merging Modalities
Current Issues:
● Each stage of analysis introduces uncertainty and must have a quantifiable measure
associated with it
● Uncertainties must be combined in a way to generate an overall confidence measure of
the combined analysis
● Some analysis doesn’t fit well into Elasticsearch schema (Semantic Segmentation)
● Disambiguation issue with location extraction for reliance on
○ Example: what part of Griggs Rd?
85. Future Work
Areas of work
● Frontend Development
○ Campaign Management
● Image Analysis
○ Aerial / Ground-Level Classifier
○ Aerial Object Detection
● Backend Services
○ Clean up logging
○ Some bugs still to work out
● Deployment
○ Improve segregation to align with microservice paradigm
86. Future Work
Resource Locations
Kubernetes Cluster 130.108.87.249, 130.108.87.250, 130.108.87.251
Kubernetes DR http://130.108.87.249:30138
Kubernetes LNExAPI http://130.108.87.249:30139
OpenStack DR (Staging) http://130.108.86.153/DR/listCampaigns
OpenStack DR (Production) http://130.108.86.152/DR/listCampaigns
Photon DB http://130.108.86.153:9201/
Notes: https://drive.google.com/open?id=1rJ9BcFjG5BHDcRerO9PwqgpAjg8SyDUr
87. Summary
DisasterRecord is a tool used to provide event-centric situational analysis in real-time. It was
originally developed as a demo for the IBM CallForCode challenge but was limited in that it only
analyzed one event. The efforts described in this presentation showcase the work done to improve
the DisasterRecord project in the areas of flexibility, scalability, and portability. Deployment
methods utilizing Ansible, Docker, and Kubernetes were discussed. These technologies fall in line
with the microservice architecture that was also discussed and help solve the scalability and
portability concerns of the original DisasterRecord demo. Improvements to the image processing
pipeline that make analysis more flexible and expandable were also discussed.
DisasterRecord is ready for the next set of developers to dive in and continue expansion.
90. Thanks
Special Thanks
I also want to extend a special thanks to all the members of Kno.e.sis for the helpful discussions,
encouraging words, and tireless efforts. We all bring our own set of talents, interests, and skills to work
on some of the most challenging problems not only in computer science but across a wide range of
disciplines and areas including psychology, healthcare, emergency response, national defense, and
many more. Kno.e.sis provides a rich soil that cultivates an environment of cooperation, teamwork, and
collaboration. I’ve never felt personal growth like what I’ve experienced here at Kno.e.sis.
Thank you all for the wonderful journey! I hope the very best for all your future endeavors!
91. Bibliography
Purohit, Hemant et al. “Understanding User-Community Engagement by Multi-faceted Features: A Case
Study on Twitter.” (2011).
Hussein, et al. “Location Name Extraction from Targeted Text Streams Using Gazetteer-Based
Statistical Language Models.” ArXiv.org, 7 June 2018, arxiv.org/abs/1708.03105.
Halolimat. “Halolimat/LNEx.” GitHub, 11 June 2019, github.com/halolimat/LNEx.
“DisasterRecord.” DisasterRecord - Knoesis Wiki, wiki.knoesis.org/index.php/DisasterRecord.
“A Cloud-Enabled Automatic Disaster Analysis System of Multi-Sourced Data Streams: An Example
Synthesizing Social Media, Remote Sensing and Wikipedia Data.” Computers, Environment and Urban
Systems, Pergamon, 4 Aug. 2017, www.sciencedirect.com/science/article/pii/S0198971517303216.
“Twitris.” Twitris - Knoesis Wiki, wiki.knoesis.org/index.php/Twitris.
Ansible, Red Hat. “Ansible Is Simple IT Automation.” Ansible Is Simple IT Automation,
www.ansible.com/.
“Enterprise Container Platform.” Docker, www.docker.com/.
“Production-Grade Container Orchestration.” Kubernetes, kubernetes.io/.
“ADE20K.” ADE20K Dataset, groups.csail.mit.edu/vision/datasets/ADE20K/.
92. Bibliography
Tensorflow. “Tensorflow/Models.” GitHub, 12 June 2019,
github.com/tensorflow/models/tree/master/research/deeplab.
“Django.” Django, www.djangoproject.com/.
“Flask” | Flask (A Python Microframework), flask.pocoo.org/.
Redis, redis.io/.
Treml, Michael, et al. “Speeding up Semantic Segmentation for Autonomous Driving.” Venues, 15 Oct.
2016, openreview.net/forum?id=S1uHiFyyg.
Al-Olimat, Hussein S., et al. “Location Name Extraction from Targeted Text Streams Using Gazetteer-
Based Statistical Language Models.” Proceedings of the 27th International Conference on
Computational Linguistics, 1 Aug. 2018, knoesis.wright.edu/node/2906.
Komoot. “Komoot/Photon.” GitHub, 16 June 2019, github.com/komoot/photon.
Halolimat. “Halolimat/LNEx.” GitHub, 22 Feb. 2019, github.com/halolimat/LNEx/tree/LNExAPI-
Deployment.
Gamauf, Thomas. “Tensorflow Records? What They Are and How to Use Them.” Medium, Mostly AI, 2
Oct. 2018, medium.com/mostly-ai/tensorflow-records-what-they-are-and-how-to-use-them-
c46bc4bbb564.
93. Bibliography
Kar, Shruti. “Multi-Scale and Multi-Modal Streaming Data Aggregation and Processing for Decision
Support during Natural Disasters.” OhioLINK ETD: Kar, Shruti, 2018,
etd.ohiolink.edu/pg_10?::NO:10:P10_ETD_SUBID:175444.
Kar, Shruti. “DRecord: Disaster Response and Relief Coordination Pipeline.” DRecord, ACM, 6 Nov.
2018, dl.acm.org/citation.cfm?id=3284572.
Partin, Michael, et al. “Knowledge-Empowered Real-Time Event-Centric Situational Analysis”, NSF
I/UCRC, Center for Surveillance Research Advisory Board Meeting, 7 Aug. 2018,
knoesis.org/node/2912.
Donahue, et al. “DeCAF: A Deep Convolutional Activation Feature for Generic Visual Recognition.”
ArXiv.org, 6 Oct. 2013, arxiv.org/abs/1310.1531.
“Advanced Guide to Inception v3 on Cloud TPU | Cloud TPU | Google Cloud.” Google, Google,
cloud.google.com/tpu/docs/inception-v3-advanced.
Howard, et al. “MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications.”
ArXiv.org, 17 Apr. 2017, arxiv.org/abs/1704.04861.
Tensorflow. “Tensorflow/Models.” GitHub, 29 June 2019, github.com/tensorflow/models.
Tzutalin. “Tzutalin/LabelImg.” GitHub, 4 June 2019, github.com/tzutalin/labelImg.