Scalable, Pluggable, and Fault Tolerant Multi-Modal Situational Awareness Data Stream Management Systems

-Updates and Improvements to DisasterRecord
Michael Partin — Summer 2019
COMMITTEE MEMBERS:
Dr. Krishnaprasad Thirunarayan
Dr. Amit Sheth
Dr. Valerie L. Shalin

About Me
Michael Partin
Education:
MS Computer Science (Wright State University 2020)
BS Computer Engineering (Wright State University 2017)
AAS Electronic Engineering Technology (Sinclair College 2010)
Related Work:
Twitris [Magpie Data Collector / Monitoring / Plugins]
System Administration [OpenStack / Ceph / AWS]
Work:
CognoviLabs [Multilingual NLP]

Overview
● Deployment Architecture
● Microservice Architecture
● Campaign Management
● Image Processing Pipeline
● Image Models
● LNExAPI
● Multimodal Analysis
● Future Work
Technologies Utilized:

Overview: Previous Work
● What is DisasterRecord?
DisasterRecord is a tool used to provide event-centric situational analysis in real-time. Events
(disasters, social movements, etc.) are heterogeneous in nature with many modalities and
sources. DisasterRecord is designed to combine information from various modalities in a
meaningful way to help users gather a more complete analysis of a given situation.
● What work had been done on DisasterRecord previously?
DisasterRecord comes from the efforts of many great minds here at Kno.e.sis at Wright State
University along with collaboration at Ohio State University. A demo of DisasterRecord was
developed by many members of Kno.e.sis and submitted for the IBM CallForCode challenge. At
that time it provided analysis upon one specific event, the 2015 Chennai flood event.

Overview: Current Work
● What work was done here?
The IBM CallForCode demo showcased the capabilities of the DisasterRecord system but lacked
in a few areas such as flexibility, scalability, and portability. This presentation describes the
efforts in each of these areas. Deployment methods and architecture are discussed to address
portability; microservice architecture is discussed to address scalability; and an improved image
processing pipeline is discussed showcasing a higher level of flexibility.

DisasterRecord
Deployment
Architecture

Deployment Architecture - Technologies
● Ansible makes cloud automation easy and reliable
● Powerful centralized control over large clusters
● Jinja templates help to make deployment flexible
[Diagram: services deployed with Ansible: Django, SQL, ES, Redis, Storm, Kafka]

Key Features
● Playbooks
● Templates
● Groups

● Docker containers solve dependency problems
● Vast repository for standard services
● Containers are portable across many platforms
Develop → Push to Repo → Deploy on Server

● Kubernetes is a container orchestration system
● Perfect fit for microservice architecture
● Scalable, self-healing, highly extensible
Source: http://www.joseluisgomez.com/containers/hands-on-kubernetes-pods/
https://github.com/phusion/baseimage-docker
Phusion
Base Image

---
apiVersion: v1
kind: Service
metadata:
  name: dr-base-ansible-ssh
spec:
  selector:
    statefulset.kubernetes.io/pod-name: dr-base-ansible-0
  ports:
  - name: ssh
    protocol: TCP
    port: 22
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: dr-base-ansible
  labels:
    service: dr-base-ansible
spec:
  serviceName: dr-base-ansible
  replicas: 1
  selector:
    matchLabels:
  template:
    metadata:
      labels:
    spec:
      terminationGracePeriodSeconds: 300
      containers:
      - name: dr-base-ansible
        image: knoesis/std_base:v0.0.1
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 22
          name: ssh
        resources:
          requests:
            memory: 8Gi
          limits:
            memory: 16Gi
Kubernetes Deployment YAML
Kubernetes: http://130.108.87.249:30138/listCampaigns
OpenStack Production: http://130.108.86.152/DR/listCampaigns
OpenStack Staging: http://130.108.86.153/DR/listCampaigns

dev@knoesis:~$ ssh boss@130.108.87.249 -i key.pem
boss@kub-01:~$ kubectl apply -f dr-base-photon.yaml
boss@kub-01:~$ kubectl get pod -o wide --namespace dr
NAME               READY  STATUS   RESTARTS  AGE   IP          NODE    NOMINATED NODE  READINESS
dr-base-ansible-0  1/1    Running  0         114d  10.40.0.1   kub-03  <none>          <none>
dr-base-api-0      1/1    Running  0         114d  10.40.0.2   kub-03  <none>          <none>
dr-base-photon-0   1/1    Running  0         34d   10.46.0.28  kub-02  <none>          <none>
dr-base-redis-0    1/1    Running  0         114d  10.46.0.6   kub-02  <none>          <none>
boss@kub-01:~$ kubectl get services -o wide --namespace dr
NAME                  TYPE       CLUSTER-IP     EXTERNAL-IP  PORT(S)                      AGE
dr-base-ansible-ssh   ClusterIP  10.97.164.199  <none>       22/TCP                       114d
dr-base-api-services  NodePort   10.105.254.5   <none>       8000:30138/TCP,80:30139/TCP  114d
dr-base-api-ssh       ClusterIP  10.107.232.90  <none>       22/TCP                       114d
dr-base-redis-redis   ClusterIP  10.99.235.131  <none>       6379/TCP                     114d
dr-base-redis-ssh     ClusterIP  10.108.98.44   <none>       22/TCP                       114d
photon-web-access     NodePort   10.107.18.139  <none>       80:30140/TCP                 34d
boss@kub-01:~$ kubectl --namespace dr exec -it dr-api-0 -- /bin/bash
boss@kub-01:~$ kubectl --namespace dr delete sts dr-api
Kubernetes CLI

DisasterRecord
Microservice
Architecture

Microservice Architecture
microservices.io describes microservices as:
Microservices - also known as the microservice architecture - is an architectural style
that structures an application as a collection of services that are
● Highly maintainable and testable
● Loosely coupled
● Independently deployable
● Organized around business capabilities
● Owned by a small team
https://microservices.io/

Microservice Architecture - Overview
[Diagram components: SQL, Redis, Web Host, ES, DR Worker, Core (x3), LNEx, Object Det, Text Class, Image, Plugin A, Plugin B, Photon]

Example Use Case
[Diagram sequence: the architecture components with a Twitter Stream added; one step is highlighted per slide]
Setup Campaign → Update DB → Detect Change → Spawn Core → Core Processing → Update DB

Microservice Architecture - Technologies
● Flask is a Python based web application microframework
● Wrap functionality and create endpoints
● Makes processing portable and accessible
Object Detection Microservice
[Diagram: Flask → Object Detection → TF Model]
GET /classify?url=https://www.somewebpage.com/someimage.jpg
RESPONSE: {"objects":[{"class":"person","position":[24.54,11.3...
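As a rough illustration of wrapping a detector behind a Flask endpoint, the sketch below exposes a /classify route that takes a url query parameter, as on the slide. StubDetector and its canned result are hypothetical stand-ins for the TensorFlow-backed detector, and the error response is an assumption rather than the deployed behavior.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)


class StubDetector:
    """Hypothetical stand-in for the TensorFlow-backed object detector."""

    def extract(self, image_url):
        # A real detector would download the image and run inference here.
        return {"objects": [{"class": "person", "position": [0.0, 0.0, 1.0, 1.0]}]}


detector = StubDetector()


@app.route("/classify", methods=["GET"])
def classify():
    # The image location arrives as the 'url' query parameter.
    image_url = request.args.get("url")
    if not image_url:
        return jsonify({"error": "missing 'url' query parameter"}), 400
    return jsonify(detector.extract(image_url))
```

Saved as a module, this can be served with the flask command-line runner; swapping StubDetector for a class that loads a frozen inference graph would leave the route unchanged.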

● TensorFlow is a software library for creating stateful dataflow
graphs for easy deployment of computation topologies
● Used extensively in machine learning applications
● Many higher-level frameworks utilize TensorFlow as a backend
Source: Cui, B., Li, Y., Zhang, Y., & Zhang, Z. (2017).
Text Coherence Analysis Based on Deep Neural
Network. ArXiv, abs/1710.07770.
Source: https://colah.github.io/posts/2015-08-Understanding-LSTMs/
Source: Saxena, Pratha (2019) Multi Layer Perceptron with Tensorflow

● SQLite is a lightweight relational database
● Used to keep metadata related to campaigns
● Keeps the state of data processing
CampaignID  CampaignName  Centroid    BoundingBox                        ES-Index  ...
0           Harvey        -95.5,30.0  28.82,-98.15,31.33,-93.61          harvey2   ...
1           Michael       -85.3,30.0  29.5521,-87.3273,31.0231,-84.5319  michael   ...
2           Florence      -77.9,34.2  33.6971,-80.5936,34.9256,-77.3588  florence  ...
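A minimal sketch of how such a campaign table could be kept in SQLite, using Python's built-in sqlite3 module. The column names mirror the slide; the snake_case spelling, the column types, and the es_index_for helper are assumptions, and the columns elided by "..." are left out.

```python
import sqlite3

# In-memory database for illustration; a deployment would use a file path.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE campaigns (
        campaign_id   INTEGER PRIMARY KEY,
        campaign_name TEXT NOT NULL,
        centroid      TEXT,  -- two comma-separated coordinates, as on the slide
        bounding_box  TEXT,  -- four comma-separated coordinates, as on the slide
        es_index      TEXT   -- Elasticsearch index holding the campaign data
    )
    """
)

rows = [
    (0, "Harvey", "-95.5,30.0", "28.82,-98.15,31.33,-93.61", "harvey2"),
    (1, "Michael", "-85.3,30.0", "29.5521,-87.3273,31.0231,-84.5319", "michael"),
    (2, "Florence", "-77.9,34.2", "33.6971,-80.5936,34.9256,-77.3588", "florence"),
]
conn.executemany("INSERT INTO campaigns VALUES (?, ?, ?, ?, ?)", rows)
conn.commit()


def es_index_for(campaign_name):
    """Return the Elasticsearch index for a campaign, or None if unknown."""
    row = conn.execute(
        "SELECT es_index FROM campaigns WHERE campaign_name = ?",
        (campaign_name,),
    ).fetchone()
    return row[0] if row else None


print(es_index_for("Harvey"))  # harvey2
```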

● Elasticsearch is a search and analytics engine built on
Apache Lucene
● RESTful communication that utilizes JSON
● Distributed, scalable, reliable
{
  "objects": {
    "person": 10,
    "vehicles": 2,
    "animals": 0
  },
  "isFlood": true,
  "locationMentions": [1432,2492,543],
  "estWaterHeight": 10,
  "needs": {
    ...
Indices
yellow open irma2-file            0hdqu21PTqOpitzRlf4wyQ 5 1 4555 0   1.2mb   1.2mb
yellow open harvey-file           hEoGbE3fQVip2HENd_nCjA 5 1 4839 0   2.4mb   2.4mb
yellow open random-tweetneeds     Dpplj6UcSXyAJ0TUTssrbg 5 1   55 0 147.5kb 147.5kb
yellow open michael-osm           74DVfmPiSLO0-GjZ7CmM2A 5 1  795 0 170.5kb 170.5kb
yellow open irma-tweetneeds       DfKZEk67TVSU2vvjwY7KUw 5 1  170 0 214.7kb 214.7kb
yellow open florence-osm          VSVbVcS6Sn6wfqWfMtlPVg 5 1 1126 0 470.8kb 470.8kb
yellow open harveytest-tweetneeds i3Bpisn7QWiMssLO9Ue1gw 5 1 4687 0   1.9mb   1.9mb
yellow open irma-osm              JJxRFXhPTimCsyupo-bpKw 5 1    0 0    955b    955b

● Location Name Extraction Tool (LNEx)
● Developed at Kno.e.sis by Hussein Al-Olimat
● Given a geo-bounding box and a set of texts, LNEx extracts the
location mentions from those texts
“@PubliusTX I thought Houston First
maintained the parking facilities and caused
the flooding during last flood at City Hall?”
"Hoosier Rd students talk about Houston
project for flood-ravaged school
https://t.co/pNyGkxzMdB"
"Drop off school items at Children's Museum
for Houston flood victims - Museums leading
society #jhumda https://t.co/xeL76dgOAw via
@indystar"
Source: https://boundingbox.klokantech.com/

DisasterRecord
Campaign
Management

Campaign Management - Define Campaign
Campaign Definition
● Name
● Spatial (Bounding Box)
● Temporal (Date Range)
● Data Sources
○ Data Set
○ Twitter
○ Flood Maps
○ Drone Images

Campaign Management - Multiple Users
● Creating / modifying the campaign requires credentials
● Viewing the campaign is public

DisasterRecord
Image Processing
Pipeline

Image Processing Pipeline
Retrained “SSD Mobilenet COCO”
https://github.com/tensorflow/models/blob/master/research/object_detection

DeepLab Semantic Segmentation
https://github.com/tensorflow/models/tree/master/research/deeplab

Retrained Inception V3 Classifier
99.99% => flood
58.08% => flood
99.85% => non-flood
100.00% => flood

Image Models: Overview
● 1: Object Detection
○ Retrain
○ Standalone
○ Flask
● 2: Inception V3
○ Retrain
○ Standalone
○ Flask
● 3: Deeplab
○ Standalone
○ Flask
[1] https://machinethink.net/blog/mobilenet-v2/
[2] https://cloud.google.com/tpu/docs/inception-v3-advanced
[3] https://handong1587.github.io/deep_learning/2015/10/09/segmentation.html

Image Models: Overview
Why these TensorFlow models?
These models come from a source that has:
● Community support
● Active repositories
● Thorough documentation
● Methods for transfer learning
Tensorflow Models: https://github.com/tensorflow/models
DeCAF Paper: http://arxiv.org/abs/1310.1531

Image Models: Object Detection - Retrain [Provision]
Provision Object Detection to retrain “ssd_mobilenet_v1_coco”:
dev@knoesis:~$ cd workspace #or create workspace; cd workspace
dev@knoesis:~$ git clone https://github.com/tensorflow/models.git
dev@knoesis:~$ mdlPath="/home/<user>/workspace/models"
dev@knoesis:~$ slimPath="/home/<user>/workspace/models/research/slim/"
dev@knoesis:~$ export PYTHONPATH="${PYTHONPATH}:$mdlPath:$slimPath"
dev@knoesis:~$ sudo apt update; sudo apt install virtualenv
dev@knoesis:~$ cd /home/<user>/workspace/models/research/object_detection
dev@knoesis:~$ mkdir -p annotations/test annotations/train
dev@knoesis:~$ #annotate your images!
dev@knoesis:~$ cp <your_test_image>*.jpg $mdlPath/research/object_detection/annotations/test
dev@knoesis:~$ cp <your_train_image>*.jpg $mdlPath/research/object_detection/annotations/train
dev@knoesis:~$ cd ~/workspace/models/research/object_detection
dev@knoesis:~$ virtualenv -p python3 .;. ./bin/activate #create virtualenv
dev@knoesis:~$ pip install tensorflow matplotlib Pillow pandas #or tensorflow-gpu
dev@knoesis:~$ #find code.zip here: https://drive.google.com/open?id=1rJ9BcFjG5BHDcRerO9PwqgpAjg8SyDUr
dev@knoesis:~$ #or contact mikep@knoesis.org or mike.partin@gmail.com for code.zip
LabelImg Windows Binaries: https://tzutalin.github.io/labelImg/
LabelImg: https://github.com/tzutalin/labelImg

Image Models: Object Detection - Retrain [Data]
Annotate some data:
dev@knoesis:~$ pip3 install labelImg
dev@knoesis:~$ labelImg

Image Models: Object Detection - Retrain [Train]
Prep Annotated Data:
dev@knoesis:~$ cd ~;unzip code.zip;cd code
dev@knoesis:~$ cp ~/workspace/models/research/object_detection/export_inference_graph.py .
dev@knoesis:~$ #find the xml_to_csv.py script and use it to convert annotations to csv
dev@knoesis:~$ #note: the following scripts may require packages and/or extra steps and should be considered an example / guide
dev@knoesis:~$ #python xml_to_csv.py -p annotations/train/ -o trainingdata.csv
dev@knoesis:~$ #python class_gen.py trainingdata.csv > data/object-det.pbtxt
dev@knoesis:~$ #python gen_tf_record_mod.py --class_file data/object-det.pbtxt --csv_input trainingdata.csv --output_path data/aerial_train.record
Train:
dev@knoesis:~$ #obtain mobilenet_v1_coco model from the Download link below
dev@knoesis:~$ vim ssd_mobilenet_v1_coco.config #edit hyperparameters
dev@knoesis:~$ python train.py --logtostderr --train_dir=training/ --pipeline_config_path=training/ssd_mobilenet_v1_coco.config #find in legacy?
Guide: https://becominghuman.ai/tensorflow-object-detection-api-tutorial-training-and-evaluating-custom-object-detector-ed2594afcf73
Download: http://download.tensorflow.org/models/object_detection/ssd_mobilenet_v1_coco_11_06_2017.tar.gz
Notes and Research Google Drive: https://drive.google.com/open?id=1rJ9BcFjG5BHDcRerO9PwqgpAjg8SyDUr

Image Models: Object Detection - Retrain [Export]
Export Graph:
dev@knoesis:~$ python -u export_inference_graph.py --input_type=image_tensor \
    --pipeline_config_path=training/ssd_mobilenet_v1_coco.config \
    --trained_checkpoint_prefix=training/<model_checkpoint> \
    --output_directory=training
The exported (frozen) inference graph can then be used in a production or
standalone environment.

Image Models: Object Detection - Standalone
Test retrained Object Detection “ssd_mobilenet_v1_coco”:
dev@knoesis:~$ #standalone_OD.py: https://drive.google.com/open?id=1rJ9BcFjG5BHDcRerO9PwqgpAjg8SyDUr
dev@knoesis:~$ #contact mikep@knoesis.org or mike.partin@gmail.com for standalone_OD.zip
dev@knoesis:~$ python apply_to_image_batch.py #script to test on batch of images
dev@knoesis:~$ python standalone_OD.py #script to test on image url
standalone_OD.py contains the paths to the model and labels. A few models
have been retrained for various purposes, such as detecting parts of objects
with known heights. Below is sample output from standalone_OD.py:
dev@knoesis:~$ python standalone_OD.py "http://myimagesurl.com/img1.jpg"
head,0.430,590,284,640,329
torso,0.280,590,327,655,386
head,0.182,590,299,632,330
window,0.124,470,236,515,268
window,0.121,703,3,743,30

Image Models: Object Detection - Flask
Wrap the Object Detection in Flask:
DisasterRecord Deployment: https://github.com/shrutikar/DisasterRecord/tree/deployment
ObjectDetector Class
__init__:
#load frozen_inference_graph.pb
#load class
load_image_into_numpy_array:
-->image data
#transform image data to np array
<--np array
run_inference_for_single_image:
-->np array
-->graph
#feed np array into graph
#return inference
<--inference dict
extract:
-->image_url
#download image
#load_image_into_numpy_array(image_data)
#returnDict=run_inference_for_single_image
<--returnDict

Image Models: Inception V3 - Retrain [Provision]
Use TF-Slim for retraining:
● <workspace>/models/research/slim/scripts/finetune_inception_v3_on_flowers.sh
● copy the file to <workspace>/models/research/slim and rename
● run the script to ensure it works with the flower example
● find and modify the following:
○ download_and_convert_data.py (add entry for your new dataset)
○ datasets/download_and_convert_<custom>.py (copy flowers and rename)
○ datasets/<custom>.py (again use flowers.py as template)
○ dataset_factory (include your dataset)
Slim: https://github.com/tensorflow/models/tree/master/research/slim

Image Models: Inception V3 - Retrain [Provision]
Use TF-Slim for retraining:
python train_image_classifier.py \
    --train_dir=${TRAIN_DIR} \
    --dataset_dir=${DATASET_DIR} \
    --dataset_name=flowers \
    --dataset_split_name=train \
    --model_name=inception_v3 \
    --checkpoint_path=${CHECKPOINT_PATH} \
    --checkpoint_exclude_scopes=InceptionV3/Logits,InceptionV3/AuxLogits \
    --trainable_scopes=InceptionV3/Logits,InceptionV3/AuxLogits
*after training export to pb (see TensorFlow Transfer Learning Boil Down on Google Drive)
Slim: https://github.com/tensorflow/models/tree/master/research/slim

Image Models: Inception V3 - Flood / Nonflood Model
● Created using 6,000 flood example images along with 4,000 nonflood counterexamples
● The Flood / Nonflood binary classifier was tested on a randomly selected group of 200 images from the 2015 Chennai flood dataset, representing the kind of data we expected to see in the live stream during a disaster event
● The model achieved 87.5% precision and 75.4% recall, giving an F-score of 0.81
● Inception V3 has a reported accuracy of 78.1% on the 1000 classes it was originally trained on
● In production, the flood confidence had to be 80% or higher to classify the input image as flood; otherwise the input image was classified as nonflood
[Example images: 54.97% => nonflood, 99.99% => flood, 99.99% => flood, 79.48% => nonflood, 95.07% => nonflood]
         Flood   Nonflood
True     49      128
False    7       16
correctly labeled flood: 49
incorrectly labeled flood: 7
incorrectly labeled nonflood: 16
correctly labeled nonflood: 128
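The reported precision, recall, and F-score can be reproduced from the counts above. The sketch below treats "incorrectly labeled flood" as false positives and "incorrectly labeled nonflood" as false negatives; the function names and the production_label helper, which illustrates the 80% production threshold, are assumptions and not code from the project.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall, and F1 for the positive (flood) class."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


def production_label(predicted_class, confidence, threshold=0.80):
    """Keep a 'flood' prediction only when its confidence meets the threshold."""
    if predicted_class == "flood" and confidence >= threshold:
        return "flood"
    return "nonflood"


# Counts from the 200-image Chennai test set.
precision, recall, f1 = precision_recall_f1(tp=49, fp=7, fn=16)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.2f}")
# precision=0.875 recall=0.754 f1=0.81

print(production_label("flood", 0.5808))  # nonflood (below the 80% threshold)
print(production_label("flood", 0.9999))  # flood
```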

Image Models: Inception V3 - Standalone
Notes and Research Google Drive:
https://drive.google.com/open?id=1rJ9BcFjG5BHDcRerO9PwqgpAjg8SyDUr
FloodDetector Class
getImage:
-->imgURL
#download and resize the image
<--resized image for Inception model
load:
-->imgURL
-->modelName
#id_np=getImage(imgURL)
#load graph from <modelName>.pb
#load classes from <modelName>.txt
#run_inference_for_single_image
<--tuple of (class,confidence)
run_inference_for_single_image:
-->np array
-->graph
#feed np array into graph
#return top inference
<--tuple of (class,confidence)
#Example:
fd=FloodDetector()
fd.load("https://bit.ly/2YA2Izd","flood")
#NOTE THE NAMES OF THE TENSORS FOR INPUT & OUTPUT OF GRAPH
...
y=sess.graph.get_tensor_by_name('InceptionV3/Predictions/Reshape_1:0')
x=sess.graph.get_tensor_by_name('Placeholder:0')
...

Image Models: Inception V3 - Flask
DisasterRecord Deployment: https://github.com/shrutikar/DisasterRecord/tree/deployment
[Diagram: Flask → FloodDetector → TF Model]
GET /classify?url=https://www.somewebpage.com/someimage.jpg
RESPONSE: nonflood

Image Models: DeepLab - Overview
According to the DeepLab GitHub page [1]:
“DeepLab is a state-of-art deep learning model for semantic image segmentation,
where the goal is to assign semantic labels (e.g., person, dog, cat and so on) to every
pixel in the input image.”
[Figure: ADE20K dataset [2] example showing the original image, object segmentation, and parts segmentation]
[1] DeepLab: https://github.com/tensorflow/models/tree/master/research/deeplab
[2] ADE20K Dataset: https://groups.csail.mit.edu/vision/datasets/ADE20K/

Image Models: DeepLab - Overview
ADE20K Dataset: https://groups.csail.mit.edu/vision/datasets/ADE20K/
The DeepLab model is used as-is to detect the boundary of water in an image.
● The results of the DeepLab analysis, combined with object detection, Inception inference, and external knowledge, can support much richer analysis than any single model
● What is it used for? Explained in more detail later...

Image Models: DeepLab - Standalone
ADE20K Dataset: https://groups.csail.mit.edu/vision/datasets/ADE20K/
DeepLabModel Script:
DeepLabModel Class:
__init__:
-->modelPath
#load TF graph tar from path
<--resized image for Inception model
run:
-->image data
#model produces segmentation map
<--segmentation map
create_pascal_label_colormap:
#generate colormap for various segments
<--colormap
label_to_color_image:
-->label
#obtain the color corresponding to the label
<--color
vis_segmentation:
-->image data
-->segmentation map
#save segmentation map as bmp
#end of DeepLabModel Class
#main of script
#label 22 maps to bodies of water in DeepLab
water_color=FULL_COLOR_MAP[22][0]
d=DeepLabModel("models/SG/deeplabv3_xception_ade20k_train_2018_05_29.tar.gz")
image_url = str(args[1]) #get url from arg
response = requests.get(image_url) #download image
image = Image.open(BytesIO(response.content))
o=d.run(image) #get segmentation map
vis_segmentation(o[0],o[1]) #save as bmp

Image Models: DeepLab - Flask
The DeepLab model is not yet wrapped in a Flask app.
● Output from the DeepLab model is a bitmap, unlike the other two models, which return JSON responses
● Elasticsearch might not be the best place to store this kind of information
● Work still needs to be done to research where best to store the results of this analysis

LNExAPI: Overview
[Diagram sequence: LNExAPI backed by Redis, SQLite, and Photon; one step is shown per slide]
● InitializeZone [-92.13,28.74,-88.41,31.54] “New Orleans” ab4bd48395bdab12
● Authenticate / Check Limits
● Query Redis
● Query Photon and Fill Redis Cache
● Extract “New Orleans” ab4bd48395bdab12 [@PubliusTX I thought Houston First maintained the parking facilities and caused the flooding during last flood at City Hall?",...
● Token: 36b2a12ba34cc127
● Query Zone and set ready bit when completed; Poll...
● Results 36b2a12ba34cc127
● JSON RESPONSE: [{"text":"@PubliusTX I thought Houston First maintained the parking facilities and caused the flooding during last flood at City Hall?"], "results":[3458,12349…
LNEx Deployment: https://github.com/halolimat/LNEx/tree/LNExAPI-Deployment
LNEx Core https://github.com/halolimat/LNEx
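The "Query Redis" and "Query Photon and Fill Redis Cache" steps resemble a cache-aside lookup. The sketch below illustrates only that pattern: a plain dict stands in for Redis, fake_photon_lookup stands in for a Photon query, and the class name and returned values are hypothetical. How LNExAPI actually keys and fills its cache is not shown on the slides.

```python
class ZoneGazetteerCache:
    """Cache-aside lookup: try the cache first, fall back to the geocoder."""

    def __init__(self, fetch_from_geocoder):
        self._cache = {}                   # plain dict standing in for Redis
        self._fetch = fetch_from_geocoder  # callable standing in for a Photon query
        self.misses = 0

    def get(self, zone_name, bbox):
        if zone_name not in self._cache:
            # Cache miss: query the geocoder once and remember the result.
            self.misses += 1
            self._cache[zone_name] = self._fetch(bbox)
        return self._cache[zone_name]


def fake_photon_lookup(bbox):
    # Hypothetical result; a real deployment would query Photon for the bbox.
    return ["City Hall", "Canal St"]


cache = ZoneGazetteerCache(fake_photon_lookup)
bbox = [-92.13, 28.74, -88.41, 31.54]
cache.get("New Orleans", bbox)  # miss: fills the cache
cache.get("New Orleans", bbox)  # hit: served from the cache
print(cache.misses)  # 1
```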

LNExAPI: Functionality - RESTful API Endpoints
Method URL
GET /apiv1/LNEx/initZone?key=xx&bb=[lon1,lat1,lon2,lat2]&zone=ZoneName
GET /apiv1/LNEx/destroyZone?key=xx&zone=ZoneName
POST /apiv1/LNEx/bulkExtract?key=xx&zone=ZoneName
POST /apiv1/LNEx/fullBulkExtract?key=xx&zone=ZoneName
GET /apiv1/LNEx/results?key=xx&token=yy
GET /apiv1/LNEx/geoInfo?key=xx&zone=ZoneName&geoIDs=[1,2,3]
GET /apiv1/LNEx/photonID?key=xx&osm_id=1
GET /apiv1/LNEx/zoneReady?key=xx&zone=ZoneName
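To make the endpoint table concrete, here is a small sketch that builds request URLs for it using only the standard library. The helper names, the host, and the placeholder key are assumptions, and whether the server expects the brackets and commas in bb to be left unencoded is also an assumption. No request is sent.

```python
from urllib.parse import urlencode

BASE_PATH = "/apiv1/LNEx"


def format_bbox(bbox):
    """Render a bounding box as [lon1,lat1,lon2,lat2] without spaces."""
    return "[" + ",".join(str(c) for c in bbox) + "]"


def lnex_url(host, endpoint, **params):
    """Build a URL for one of the LNExAPI endpoints listed above."""
    # Leave brackets and commas readable, matching the table's notation.
    query = urlencode(params, safe="[],")
    return f"{host.rstrip('/')}{BASE_PATH}/{endpoint}?{query}"


host = "http://127.0.0.1"  # placeholder host
key = "xx"                 # placeholder user key

init_url = lnex_url(
    host, "initZone",
    key=key, bb=format_bbox([-92.13, 28.74, -88.41, 31.54]), zone="NewOrleans",
)
ready_url = lnex_url(host, "zoneReady", key=key, zone="NewOrleans")
print(init_url)
print(ready_url)
```

A client would typically call initZone, poll zoneReady, submit texts to bulkExtract, and then fetch the output from results with the returned token.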

LNExAPI: Functionality - CLI Client
LNExAPI CLI: https://github.com/halolimat/LNEx/blob/LNExAPI-Deployment/LNExAPIClient.md
Source code, wheel, and help can be found on GitHub repository
from LNExAPI import LNExAPI

def displayResults(results):
    for result in results:
        print("Matches for:",result['text'])
        for entity in result['entities']:
            print("[ ]-->",entity['match'])
            for location in entity['locations']:
                print(" [ ]-->",str(location['coordinate']['lat'])+","+str(location['coordinate']['lon']))

lnex = LNExAPI(key="168ba4d297a8c64a03",host="http://127.0.0.1/") #REPLACE WITH YOUR USER KEY AND HOST
lnex.initZone([-84.6447033333,39.1912856591,-83.2384533333,40.0880515857],"dayton")
print("Zone Dayton is being initialized...")
lnex.pollZoneReady("dayton") #WAITS UNTIL ZONE IS INIT/READY
text=[
    "Your text goes here:",
    "A list of text in which locations will be searched for...",
    "A list of the same size will be returned once you execute a doBulkExtract on the text list",
    "Each item in the returned list will be a list of the entities that matched",]
print("Extracting locations from Dayton Zone...")
result_token,results=lnex.pollFullBulkExtract("dayton",text)
displayResults(results)

LNExAPI: Functionality - Server Side
LNExAPI Server: https://github.com/halolimat/LNEx/blob/LNExAPI-Deployment/LNExAPIServer.md
Tools for controlling and maintaining LNExAPI
● LNExCLI - create and list users, list zones, perform “House Cleaning”
● startAPI - start the Django application
● killAPI - shut off the Django application
● startSafetyCheck - start a supplemental service that ensures LNExAPI is active
● killSafetyCheck - disables the supplemental service
Example
● ./LNExCLI create user <name> <email>
● ./LNExCLI activate user <name> <access_level>

LNExAPI: Functionality - Performance
Zone       Bounding Box                                 Entities  KM         Init (s)  LT (s)
Dayton     39.70185, -84.311377, 39.920823, -84.092938  8655      378.66     22.58     7.63
NYC        40.477399, -74.25909, 40.916178, -73.700181  1036003   1977.15    256.15    16.07
London     51.28676, -0.510375, 51.691874, 0.334015     486345    3309.55    262.50    31.91
Japan      133.19, 32.88, 141.98, 40.68                 1214426   507176.61  N/A       N/A
Tokyo      33.98, 135.85, 35.9, 139.36                  257428    47719.84   143.24    15.82
Jakarta    -6.5087, 106.5271, -5.8643, 107.2412         98549     613.14     78.89     15.69
Sri Lanka  5.5809, 79.3815, 10.1383, 82.194             72865     21666.06   78.91     15.67
Jordan     29.18, 34.88, 33.38, 39.3                    145913    119150.64  111.17    15.76

DisasterRecord
Multimodal
Analysis

Multimodal Analysis - Service Types
[Diagram components: SQL, Redis, Web Host, ES, DR Worker, Core (x3), LNEx, Object Det, Text Class, Image, Plugin A, Plugin B, Photon]
● Shared Services
○ Provide information to all services
● Core Analysis Services
○ Do not rely upon other core analysis output
○ Analyze the input data
● Custom Analysis Services
○ May require core analysis results for further computations and thus may take much more time


Multimodal Analysis - Example: Water Height Estimation
Algorithm for water depth analysis
# KNO_BASE contains the average expected heights of objects detected in
# the image. Only "head" is fully shown on the slide; units are assumed cm.
KNO_BASE = {
    "head": {"avg_obj_height": 22, "height_above_ground": 165},
    # "torso": {...}
}

def get_color(seg_map, x, y):
    # Return the color value of the segmentation map at pixel (x, y).
    return seg_map[y][x]

def get_scale(obj):
    # Real-world units per pixel for this object. The slide computes
    # Obj.H / h and multiplies; the inverse is used here so that
    # pixels * scale yields real-world units.
    h = KNO_BASE[obj.label]["avg_obj_height"]
    return h / obj.h

def estimate(seg_map, detected_objects, water_color):
    # seg_map: 2D list of segment colors (rows of pixels).
    # detected_objects: objects with x, y, w, h (pixels) and a label that is
    # a KNO_BASE key; the attribute names are assumptions.
    water_estimations = []
    for obj in detected_objects:
        scale = get_scale(obj)
        pixel_x = int(obj.x + obj.w / 2)
        pixel_y = int(obj.y)
        pixel_color = get_color(seg_map, pixel_x, pixel_y)
        total_height = 0

        # Scan the segmentation map from top to bottom for a change.
        for y in range(pixel_y, len(seg_map)):
            color = get_color(seg_map, pixel_x, y)
            if color == pixel_color:
                total_height += 1
            elif color == water_color:
                break
            else:
                total_height = -1  # the object is not bordered by water
                break

        if total_height < 0:
            continue  # no estimate for this object
        calc_height = scale * total_height
        water_level = KNO_BASE[obj.label]["height_above_ground"] - calc_height
        water_estimations.append(water_level)
    return water_estimations

Multimodal Analysis - Merging Modalities
Bringing it all together:
"Griggs Rd is extremely flooded!"
{
  "objects": {
    "person": 10,
    "vehicles": 2,
    "animals": 0
  },
  "isFlood": true,
  "locationMentions": [1432,2492,543],
  "estWaterHeight": 10,
  "needs": {
    ...
Ingest Data → Perform Analysis → Combine Analysis and Build Knowledge Base → Display Analysis

Multimodal Analysis - Querying Knowledge Base
Why Elasticsearch is used:
"I want to see all the vehicles found in the Houston area within the last hour"
{
  "query":{
    "bool":{
      "must":[
        {"range":{"timestamp":{"gte":"now-1h"}}},
        {"range":{"objects.vehicles":{"gte":1}}}
      ],
      "filter":{"geo_bounding_box":{
        "coordinate":{
          "bottom_left":{"lat":bb[0],"lon":bb[1]},
          "top_right":{"lat":bb[2],"lon":bb[3]}
        }
      }}
    }
  }
}
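A sketch of building such a query body in Python so the bounding box can be filled in per campaign. The function name is hypothetical, and the ordering of bb as [lat_min, lon_min, lat_max, lon_max] is inferred from how bb[0] through bb[3] are used in the query; the example box is the Harvey campaign bounding box from the campaign table.

```python
import json


def vehicles_last_hour_query(bb):
    """Build a query for documents with vehicles, from the last hour, inside bb.

    bb is assumed to be [lat_min, lon_min, lat_max, lon_max].
    """
    return {
        "query": {
            "bool": {
                "must": [
                    {"range": {"timestamp": {"gte": "now-1h"}}},
                    {"range": {"objects.vehicles": {"gte": 1}}},
                ],
                "filter": {
                    "geo_bounding_box": {
                        "coordinate": {
                            "bottom_left": {"lat": bb[0], "lon": bb[1]},
                            "top_right": {"lat": bb[2], "lon": bb[3]},
                        }
                    }
                },
            }
        }
    }


harvey_bb = [28.82, -98.15, 31.33, -93.61]
query = vehicles_last_hour_query(harvey_bb)
print(json.dumps(query, indent=2))
```

The resulting dictionary serializes to valid JSON and can be sent as the body of a search request against the campaign's index.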

Multimodal Analysis - Merging Modalities
Current Issues:
● Each stage of analysis introduces uncertainty and must have a quantifiable measure
associated with it
● Uncertainties must be combined in a way to generate an overall confidence measure of
the combined analysis
● Some analyses don’t fit well into an Elasticsearch schema (e.g., Semantic Segmentation)
● Disambiguation issue with location extraction for reliance on
○ Example: what part of Griggs Rd?

Future Work
Areas of work
● Frontend Development
○ Campaign Management
● Image Analysis
○ Aerial / Ground-Level Classifier
○ Aerial Object Detection
● Backend Services
○ Clean up logging
○ Some bugs still to work out
● Deployment
○ Improve segregation to align with microservice paradigm

Future Work
Resource Locations
Kubernetes Cluster 130.108.87.249, 130.108.87.250, 130.108.87.251
Kubernetes DR http://130.108.87.249:30138
Kubernetes LNExAPI http://130.108.87.249:30139
OpenStack DR (Staging) http://130.108.86.153/DR/listCampaigns
OpenStack DR (Production) http://130.108.86.152/DR/listCampaigns
Photon DB http://130.108.86.153:9201/
Notes: https://drive.google.com/open?id=1rJ9BcFjG5BHDcRerO9PwqgpAjg8SyDUr

Summary
DisasterRecord is a tool used to provide event-centric situational analysis in real-time. It was
originally developed as a demo for the IBM CallForCode challenge but was limited in that it only
analyzed one event. The efforts described in this presentation showcase the work done to improve
the DisasterRecord project in the areas of flexibility, scalability, and portability. Deployment
methods utilizing Ansible, Docker, and Kubernetes were discussed. These technologies fall in line
with the microservice architecture that was also discussed and help solve the scalability and
portability concerns of the original DisasterRecord demo. Improvements to the image processing
pipeline that make analysis more flexible and expandable were also discussed.
DisasterRecord is ready for the next set of developers to dive in and continue expansion.

Thanks
Dr. Amit Sheth, Dr. Krishnaprasad Thirunarayan, Dr. Valerie L. Shalin

Thanks
Special Thanks
Hussein Al-Olimat, Shruti Kar, Kirill Kultinov, Joy Prakash Sain, Austin Kempton, Alan Fleming

Thanks
Special Thanks
I also want to extend a special thanks to all the members of Kno.e.sis for the helpful discussions,
encouraging words, and tireless efforts. We all bring our own set of talents, interests, and skills to work
on some of the most challenging problems not only in computer science but across a wide range of
disciplines and areas including psychology, healthcare, emergency response, national defense, and
many more. Kno.e.sis provides a rich soil that cultivates an environment of cooperation, teamwork, and
collaboration. I’ve never felt personal growth like what I’ve experienced here at Kno.e.sis.
Thank you all for the wonderful journey! I hope the very best for all your future endeavors!

Bibliography
Purohit, Hemant et al. “Understanding User-Community Engagement by Multi-faceted Features: A Case
Study on Twitter.” (2011).
Hussein, et al. “Location Name Extraction from Targeted Text Streams Using Gazetteer-Based
Statistical Language Models.” ArXiv.org, 7 June 2018, arxiv.org/abs/1708.03105.
Halolimat. “Halolimat/LNEx.” GitHub, 11 June 2019, github.com/halolimat/LNEx.
“DisasterRecord.” DisasterRecord - Knoesis Wiki, wiki.knoesis.org/index.php/DisasterRecord.
“A Cloud-Enabled Automatic Disaster Analysis System of Multi-Sourced Data Streams: An Example
Synthesizing Social Media, Remote Sensing and Wikipedia Data.” Computers, Environment and Urban
Systems, Pergamon, 4 Aug. 2017, www.sciencedirect.com/science/article/pii/S0198971517303216.
“Twitris.” Twitris - Knoesis Wiki, wiki.knoesis.org/index.php/Twitris.
Ansible, Red Hat. “Ansible Is Simple IT Automation.” Ansible Is Simple IT Automation,
www.ansible.com/.
“Enterprise Container Platform.” Docker, www.docker.com/.
“Production-Grade Container Orchestration.” Kubernetes, kubernetes.io/.
“ADE20K.” ADE20K Dataset, groups.csail.mit.edu/vision/datasets/ADE20K/.

Bibliography
Tensorflow. “Tensorflow/Models.” GitHub, 12 June 2019,
github.com/tensorflow/models/tree/master/research/deeplab.
“Django.” Django, www.djangoproject.com/.
“Flask” | Flask (A Python Microframework), flask.pocoo.org/.
Redis, redis.io/.
Treml, Michael, et al. “Speeding up Semantic Segmentation for Autonomous Driving.” Venues, 15 Oct.
2016, openreview.net/forum?id=S1uHiFyyg.
Al-Olimat, Hussein S., et al. “Location Name Extraction from Targeted Text Streams Using Gazetteer-
Based Statistical Language Models.” Proceedings of the 27th International Conference on
Computational Linguistics, 1 Aug. 2018, knoesis.wright.edu/node/2906.
Komoot. “Komoot/Photon.” GitHub, 16 June 2019, github.com/komoot/photon.
Halolimat. “Halolimat/LNEx.” GitHub, 22 Feb. 2019, github.com/halolimat/LNEx/tree/LNExAPI-
Deployment.
Gamauf, Thomas. “Tensorflow Records? What They Are and How to Use Them.” Medium, Mostly AI, 2
Oct. 2018, medium.com/mostly-ai/tensorflow-records-what-they-are-and-how-to-use-them-
c46bc4bbb564.

Bibliography
Kar, Shruti. “Multi-Scale and Multi-Modal Streaming Data Aggregation and Processing for Decision
Support during Natural Disasters.” OhioLINK ETD: Kar, Shruti, 2018,
etd.ohiolink.edu/pg_10?::NO:10:P10_ETD_SUBID:175444.
Kar, Shruti. “DRecord: Disaster Response and Relief Coordination Pipeline.” DRecord, ACM, 6 Nov.
2018, dl.acm.org/citation.cfm?id=3284572.
Partin, Michael, et al. “Knowledge-Empowered Real-Time Event-Centric Situational Analysis”, NSF
I/UCRC, Center for Surveillance Research Advisory Board Meeting, 7 Aug. 2018,
knoesis.org/node/2912.
Donahue, et al. “DeCAF: A Deep Convolutional Activation Feature for Generic Visual Recognition.”
ArXiv.org, 6 Oct. 2013, arxiv.org/abs/1310.1531.
“Advanced Guide to Inception v3 on Cloud TPU | Cloud TPU | Google Cloud.” Google, Google,
cloud.google.com/tpu/docs/inception-v3-advanced.
Howard, et al. “MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications.”
ArXiv.org, 17 Apr. 2017, arxiv.org/abs/1704.04861.
Tensorflow. “Tensorflow/Models.” GitHub, 29 June 2019, github.com/tensorflow/models.
Tzutalin. “Tzutalin/LabelImg.” GitHub, 4 June 2019, github.com/tzutalin/labelImg.

Bibliography - Slide Resources (Icons)
https://www.flaticon.com/authors/flat-icons
https://www.flaticon.com/authors/ultimatearm
https://www.flaticon.com/authors/phatplus
https://www.flaticon.com/authors/smashicons
https://www.freepik.com/
https://www.flaticon.com/authors/pixel-perfect
https://www.flaticon.com/authors/smalllikeart
http://www.joseluisgomez.com/containers/hands-on-kubernetes-pods/
