Scalable, Pluggable, and Fault Tolerant Multi-Modal Situational Awareness Data Stream Management Systems

“Scalable, Pluggable, and Fault-Tolerant Multi-Modal Situational Awareness Data Stream Management Systems”

By Michael Partin
Monday, July 8, 2019
Committee: Drs. Amit Sheth, Advisor, TK Prasad, and Valerie Shalin (Department of Psychology)

ABSTRACT:

The features and attributes that describe an event (a disaster, a social movement, etc.) are heterogeneous in nature. For virtually all events that impact humans, technology enables us to capture a large amount and variety of data from many sources, including humans (e.g., social media) and sensors/Internet of Things (IoT) devices. The corresponding modalities of data include text, imagery, voice, and video, along with structured data such as gazetteers (i.e., location-based data) and government and statistical data. However, even though an abundance of information is often produced, it is fragmented across the various modalities and sources. The DisasterRecord system aims to combine (interlink and integrate) data streams in different modalities in a meaningful way, with flood events as the in-depth use case. DisasterRecord was originally developed as a demo for the IBM CallForCode challenge in 2018 to showcase the efforts of the team at Kno.e.sis in combining and analyzing multimodal data. This thesis represents extensive follow-on work in the areas of deployability, flexibility, and reliability. Specific topics addressed are: a method that uses current technologies to deploy easily into cloud infrastructure; modifications that make it easier to add to and modify the multimodal analysis pipeline; and reliability improvements that make the system stable.



  1. 1. Scalable, Pluggable, and Fault Tolerant Multi-Modal Situational Awareness Data Stream Management Systems -Updates and Improvements to DisasterRecord Michael Partin — Summer 2019 COMMITTEE MEMBERS: Dr. Krishnaprasad Thirunarayan Dr. Amit Sheth Dr. Valerie L. Shalin
  2. 2. Scalable, Pluggable, and Fault Tolerant Multi-Modal Situational Awareness Data Stream Management Systems -Updates and Improvements to DisasterRecord Michael Partin — Summer 2019 COMMITTEE MEMBERS: Dr. Krishnaprasad Thirunarayan Dr. Amit Sheth Dr. Valerie L. Shalin
  3. 3. About Me Michael Partin Education: MS Computer Science (Wright State University 2020) BS Computer Engineering (Wright State University 2017) AAS Electronic Engineering Technology (Sinclair College 2010) Related Work: Twitris [Magpie Data Collector / Monitoring / Plugins] System Administration [OpenStack / Ceph / AWS] Work: CognoviLabs [Multilingual NLP]
  4. 4. Overview ● Deployment Architecture ● Microservice Architecture ● Campaign Management ● Image Processing Pipeline ● Image Models ● LNExAPI ● Multimodal Analysis ● Future Work Technologies Utilized:
  5. 5. Overview: Previous Work ● What is DisasterRecord? DisasterRecord is a tool used to provide event-centric situational analysis in real-time. Events (disasters, social movements, etc.) are heterogeneous in nature with many modalities and sources. DisasterRecord is designed to combine information from various modalities in a meaningful way to help users gather a more complete analysis of a given situation. ● What work had been done on DisasterRecord previously? DisasterRecord comes from the efforts of many great minds here at Kno.e.sis at Wright State University along with collaboration at Ohio State University. A demo of DisasterRecord was developed by many members of Kno.e.sis and submitted for the IBM CallForCode challenge. At that time it provided analysis upon one specific event, the 2015 Chennai flood event.
  6. 6. Overview: Current Work ● What work was done here? The IBM CallForCode demo showcased the capabilities of the DisasterRecord system but lacked in a few areas such as flexibility, scalability, and portability. This presentation describes the efforts in each of these areas. Deployment methods and architecture are discussed to address portability; microservice architecture is discussed to address scalability; and an improved image processing pipeline is discussed showcasing a higher level of flexibility.
  7. 7. DisasterRecord Deployment Architecture
  8. 8. Deployment Architecture - Technologies ● Ansible makes cloud automation easy and reliable ● Powerful centralized control over large clusters ● Jinja templates help to make deployment flexible Django SQL ES Redis Storm Kafka
  9. 9. Deployment Architecture - Technologies Key Features ● Playbooks ● Templates ● Groups
  10. 10. Deployment Architecture - Technologies ● Docker containers solve dependency problems ● Vast repository for standard services ● Containers are portable across many platforms Develop Push to Repo Deploy on Server
  11. 11. Deployment Architecture - Technologies ● Kubernetes is a container management system ● Perfect fit for microservice architecture ● Scalable, self-healing, highly extensible Source: http://www.joseluisgomez.com/containers/hands-on-kubernetes-pods/ https://github.com/phusion/baseimage-docker Phusion Base Image
  12. 12. Deployment Architecture - Technologies
  13. 13. Deployment Architecture - Technologies
      Kubernetes Deployment YAML:

          ---
          apiVersion: v1
          kind: Service
          metadata:
            name: dr-base-ansible-ssh
          spec:
            selector:
              statefulset.kubernetes.io/pod-name: dr-base-ansible-0
            ports:
            - name: ssh
              protocol: TCP
              port: 22
          ---
          apiVersion: apps/v1
          kind: StatefulSet
          metadata:
            name: dr-base-ansible
            labels:
              service: dr-base-ansible
          spec:
            serviceName: dr-base-ansible
            replicas: 1
            selector:
              matchLabels:
                service: dr-base-ansible
            template:
              metadata:
                labels:
                  service: dr-base-ansible
              spec:
                terminationGracePeriodSeconds: 300
                containers:
                - name: dr-base-ansible
                  image: knoesis/std_base:v0.0.1
                  imagePullPolicy: IfNotPresent
                  ports:
                  - containerPort: 22
                    name: ssh
                  resources:
                    requests:
                      memory: 8Gi
                    limits:
                      memory: 16Gi

      Kubernetes: http://130.108.87.249:30138/listCampaigns
      OpenStack Production: http://130.108.86.152/DR/listCampaigns
      OpenStack Staging: http://130.108.86.153/DR/listCampaigns
  14. 14. Deployment Architecture - Technologies
      Kubernetes CLI:

          dev@knoesis:~$ ssh boss@130.108.87.249 -i key.pem
          boss@kub-01:~$ kubectl apply -f dr-base-photon.yaml
          boss@kub-01:~$ kubectl get pod -o wide --namespace dr
          NAME               READY  STATUS   RESTARTS  AGE   IP          NODE    NOMINATED NODE  READINESS
          dr-base-ansible-0  1/1    Running  0         114d  10.40.0.1   kub-03  <none>          <none>
          dr-base-api-0      1/1    Running  0         114d  10.40.0.2   kub-03  <none>          <none>
          dr-base-photon-0   1/1    Running  0         34d   10.46.0.28  kub-02  <none>          <none>
          dr-base-redis-0    1/1    Running  0         114d  10.46.0.6   kub-02  <none>          <none>
          boss@kub-01:~$ kubectl get services -o wide --namespace dr
          NAME                  TYPE       CLUSTER-IP     EXTERNAL-IP  PORT(S)                      AGE
          dr-base-ansible-ssh   ClusterIP  10.97.164.199  <none>       22/TCP                       114d
          dr-base-api-services  NodePort   10.105.254.5   <none>       8000:30138/TCP,80:30139/TCP  114d
          dr-base-api-ssh       ClusterIP  10.107.232.90  <none>       22/TCP                       114d
          dr-base-redis-redis   ClusterIP  10.99.235.131  <none>       6379/TCP                     114d
          dr-base-redis-ssh     ClusterIP  10.108.98.44   <none>       22/TCP                       114d
          photon-web-access     NodePort   10.107.18.139  <none>       80:30140/TCP                 34d
          boss@kub-01:~$ kubectl --namespace dr exec -it dr-api-0 -- /bin/bash
          boss@kub-01:~$ kubectl --namespace dr delete sts dr-api

      Kubernetes: http://130.108.87.249:30138/listCampaigns
      OpenStack Production: http://130.108.86.152/DR/listCampaigns
      OpenStack Staging: http://130.108.86.153/DR/listCampaigns
  15. 15. DisasterRecord Microservice Architecture
  16. 16. Microservice Architecture microservices.io describes microservices as: “Microservices - also known as the microservice architecture - is an architectural style that structures an application as a collection of services that are ● Highly maintainable and testable ● Loosely coupled ● Independently deployable ● Organized around business capabilities ● Owned by a small team” https://microservices.io/
  17. 17. Microservice Architecture - Overview SQL Redis Web Host ES Core DR Worker Core Core LNEx Object Det Text Class Image Plugin A Plugin B Photon
  18. 18. Microservice Architecture - Overview SQL Redis Web Host ES Core DR Worker Core LNEx Object Det Text Class Image Plugin A Twitter Stream Photon Setup Campaign Example Use Case
  19. 19. Microservice Architecture - Overview SQL Redis Web Host ES Core DR Worker Core LNEx Object Det Text Class Image Plugin A Twitter Stream Photon Update DB
  20. 20. Microservice Architecture - Overview SQL Redis Web Host ES Core DR Worker Core LNEx Object Det Text Class Image Plugin A Twitter Stream Photon Detect Change
  21. 21. Microservice Architecture - Overview SQL Redis Web Host ES Core DR Worker Core Core LNEx Object Det Text Class Image Plugin A Twitter Stream Photon Spawn Core
  22. 22. Microservice Architecture - Overview SQL Redis Web Host ES Core DR Worker Core Core LNEx Object Det Text Class Image Plugin A Twitter Stream Photon Core Processing
  23. 23. Microservice Architecture - Overview SQL Redis Web Host ES Core DR Worker Core Core LNEx Object Det Text Class Image Plugin A Twitter Stream Photon Update DB
  24. 24. Microservice Architecture - Technologies
      ● Flask is a Python-based web application microframework
      ● Wrap functionality and create endpoints
      ● Makes processing portable and accessible
      Object Detection Microservice (Flask wrapping the Object Detection TF model):
          GET /classify?url=https://www.somewebpage.com/someimage.jpg
          RESPONSE: {"objects":[{"class":"person","position":[24.54,11.3...
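The request/response pattern on this slide can be sketched without any third-party packages. The example below is a minimal illustration, not the DisasterRecord code: it uses a plain WSGI callable from the standard library in place of Flask (the same handler body would sit inside a Flask route), and `fake_detect`, `make_app`, and `call` are hypothetical names introduced here. The detector is a stub that returns a fixed result instead of running a TensorFlow model.

```python
import json
from urllib.parse import parse_qs


def fake_detect(image_url):
    """Hypothetical stand-in for the TensorFlow object detector."""
    return [{"class": "person", "position": [0.0, 0.0, 1.0, 1.0]}]


def make_app(detect):
    """Build a WSGI app exposing GET /classify?url=<image url>."""
    def app(environ, start_response):
        params = parse_qs(environ.get("QUERY_STRING", ""))
        urls = params.get("url")
        if environ.get("PATH_INFO") != "/classify":
            status, body = "404 Not Found", {"error": "unknown endpoint"}
        elif not urls:
            status, body = "400 Bad Request", {"error": "missing url parameter"}
        else:
            # Delegate the actual analysis to the wrapped model function.
            status, body = "200 OK", {"objects": detect(urls[0])}
        payload = json.dumps(body).encode("utf-8")
        start_response(status, [("Content-Type", "application/json"),
                                ("Content-Length", str(len(payload)))])
        return [payload]
    return app


def call(app, path, query=""):
    """Invoke the WSGI app in-process and return (status, decoded JSON)."""
    captured = {}

    def start_response(status, headers):
        captured["status"] = status

    environ = {"REQUEST_METHOD": "GET", "PATH_INFO": path, "QUERY_STRING": query}
    body = b"".join(app(environ, start_response))
    return captured["status"], json.loads(body)


app = make_app(fake_detect)
status, data = call(app, "/classify", "url=https://www.somewebpage.com/someimage.jpg")
```

Passing the detector in as a function keeps the web wrapper independent of the model, which is the property the slide attributes to the microservice approach: the same wrapper could front any of the analysis models.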
  25. 25. Microservice Architecture - Technologies ● TensorFlow is a software library for creating stateful dataflow graphs for easy deployment of computation topologies ● Used extensively in machine learning applications ● Many higher-level frameworks utilize TensorFlow as a backend Source: Cui, B., Li, Y., Zhang, Y., & Zhang, Z. (2017). Text Coherence Analysis Based on Deep Neural Network. ArXiv, abs/1710.07770. Source: https://colah.github.io/posts/2015-08-Understanding-LSTMs/ Source: Saxena, Pratha (2019) Multi Layer Perceptron with Tensorflow
  26. 26. Microservice Architecture - Technologies
      ● SQLite is a lightweight relational database
      ● Used to keep metadata related to campaigns
      ● Keeps the state of data processing

      CampaignID  CampaignName  Centroid    BoundingBox                        ES-Index  ...
      0           Harvey        -95.5,30.0  28.82,-98.15,31.33,-93.61          harvey2   ...
      1           Michael       -85.3,30.0  29.5521,-87.3273,31.0231,-84.5319  michael   ...
      2           Florence      -77.9,34.2  33.6971,-80.5936,34.9256,-77.3588  florence  ...
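A campaign metadata table like the one above could be held in SQLite roughly as follows. This is a sketch under assumptions: the slide shows only column headings and three rows, so the table name `campaign`, the snake_case column names (SQLite identifiers cannot contain an unquoted hyphen, hence `es_index`), the column types, and the helper `es_index_for` are all hypothetical, and the columns elided by "..." on the slide are omitted.

```python
import sqlite3

# In-memory database so the sketch leaves no files behind.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE campaign (
        campaign_id   INTEGER PRIMARY KEY,
        campaign_name TEXT NOT NULL,
        centroid      TEXT,
        bounding_box  TEXT,
        es_index      TEXT
    )
""")

# Rows copied from the slide; values are stored as text exactly as shown there.
rows = [
    (0, "Harvey", "-95.5,30.0", "28.82,-98.15,31.33,-93.61", "harvey2"),
    (1, "Michael", "-85.3,30.0", "29.5521,-87.3273,31.0231,-84.5319", "michael"),
    (2, "Florence", "-77.9,34.2", "33.6971,-80.5936,34.9256,-77.3588", "florence"),
]
conn.executemany("INSERT INTO campaign VALUES (?, ?, ?, ?, ?)", rows)


def es_index_for(name):
    """Return the Elasticsearch index recorded for a campaign, or None."""
    row = conn.execute(
        "SELECT es_index FROM campaign WHERE campaign_name = ?", (name,)
    ).fetchone()
    return row[0] if row else None
```

A lookup such as `es_index_for("Harvey")` is the kind of query a worker would need in order to route analysis results for a campaign to the matching Elasticsearch index.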
  27. 27. Microservice Architecture - Technologies
      ● Elasticsearch is a search and analytics engine built on Apache Lucene
      ● RESTful communication that utilizes JSON
      ● Distributed, scalable, reliable

      Example document:
          {
            "objects": {
              "person": 10,
              "vehicles": 2,
              "animals": 0
            },
            "isFlood": true,
            "locationMentions": [1432,2492,543],
            "estWaterHeight": 10,
            "needs": { ...

      Indices:
          yellow open irma2-file            0hdqu21PTqOpitzRlf4wyQ 5 1 4555 0   1.2mb   1.2mb
          yellow open harvey-file           hEoGbE3fQVip2HENd_nCjA 5 1 4839 0   2.4mb   2.4mb
          yellow open random-tweetneeds     Dpplj6UcSXyAJ0TUTssrbg 5 1   55 0 147.5kb 147.5kb
          yellow open michael-osm           74DVfmPiSLO0-GjZ7CmM2A 5 1  795 0 170.5kb 170.5kb
          yellow open irma-tweetneeds       DfKZEk67TVSU2vvjwY7KUw 5 1  170 0 214.7kb 214.7kb
          yellow open florence-osm          VSVbVcS6Sn6wfqWfMtlPVg 5 1 1126 0 470.8kb 470.8kb
          yellow open harveytest-tweetneeds i3Bpisn7QWiMssLO9Ue1gw 5 1 4687 0   1.9mb   1.9mb
          yellow open irma-osm              JJxRFXhPTimCsyupo-bpKw 5 1    0 0    955b    955b
  28. 28. Microservice Architecture - Technologies
      ● Location Name Extraction Tool (LNEx)
      ● Developed at Kno.e.sis by Hussein Al-Olimat
      ● When provided with a geo-bounding box and texts, LNEx will extract the location mentions from the given texts

      Example texts:
          "@PubliusTX I thought Houston First maintained the parking facilities and caused the flooding during last flood at City Hall?"
          "Hoosier Rd students talk about Houston project for flood-ravaged school https://t.co/pNyGkxzMdB"
          "Drop off school items at Children's Museum for Houston flood victims - Museums leading society #jhumda https://t.co/xeL76dgOAw via @indystar"

      Source: https://boundingbox.klokantech.com/
  29. 29. DisasterRecord Campaign Management
  30. 30. Campaign Management - Define Campaign Campaign Definition ● Name ● Spatial (Bounding Box) ● Temporal (Date Range) ● Data Sources ○ Data Set ○ Twitter ○ Flood Maps ○ Drone Images
  31. 31. Campaign Management - Multiple Users Creating / Modifying the campaign requires credentials Viewing the campaign is public
  32. 32. DisasterRecord Image Processing Pipeline
  33. 33. Image Processing Pipeline Retrained “SSD Mobilenet COCO” https://github.com/tensorflow/models/blob/master/research/object_detection
  34. 34. Image Processing Pipeline DeepLab Semantic Segmentation https://github.com/tensorflow/models/tree/master/research/deeplab
  35. 35. Image Processing Pipeline Retrained Inception V3 Classifier 99.99% => flood 58.08% => flood 99.85% => non-flood 100.00% => flood
  36. 36. DisasterRecord Image Models
  37. 37. Image Models: Overview ● 1: Object Detection ○ Retrain ○ Standalone ○ Flask ● 2: Inception V3 ○ Retrain ○ Standalone ○ Flask ● 3: Deeplab ○ Standalone ○ Flask 2 [1] https://machinethink.net/blog/mobilenet-v2/ [2] https://cloud.google.com/tpu/docs/inception-v3-advanced [3] https://handong1587.github.io/deep_learning/2015/10/09/segmentation.html 1 3
  38. 38. Image Models: Overview Why these TensorFlow models? These models come from a source that has: ● Community support ● Active repositories ● Thorough documentation ● Methods for transfer learning Tensorflow Models: https://github.com/tensorflow/models DeCAF Paper: http://arxiv.org/abs/1310.1531
  39. 39. Image Models: Object Detection - Retrain [Provision]
      Provision Object Detection to retrain “ssd_mobilenet_v1_coco”:

          dev@knoesis:~$ cd workspace #or create workspace; cd workspace
          dev@knoesis:~$ git clone https://github.com/tensorflow/models.git
          dev@knoesis:~$ mdlPath="/home/<user>/workspace/models"
          dev@knoesis:~$ slimPath="/home/<user>/workspace/models/research/slim/"
          dev@knoesis:~$ export PYTHONPATH="${PYTHONPATH}:$mdlPath:$slimPath"
          dev@knoesis:~$ sudo apt update; sudo apt install virtualenv
          dev@knoesis:~$ cd /home/<user>/workspace/models/research/object_detection
          dev@knoesis:~$ mkdir annotations;mkdir annotations/test;mkdir annotations/train
          dev@knoesis:~$ #annotate your images!
          dev@knoesis:~$ cp <your_test_image>*.jpg $mdlPath/research/object_detection/annotations/test
          dev@knoesis:~$ cp <your_train_image>*.jpg $mdlPath/research/object_detection/annotations/train
          dev@knoesis:~$ cd ~/workspace/models/research/object_detection
          dev@knoesis:~$ virtualenv -p python3 .;. ./bin/activate #create virtualenv
          dev@knoesis:~$ pip install tensorflow matplotlib Pillow pandas #or tensorflow-gpu
          dev@knoesis:~$ #find code.zip here: https://drive.google.com/open?id=1rJ9BcFjG5BHDcRerO9PwqgpAjg8SyDUr
          dev@knoesis:~$ #or contact mikep@knoesis.org or mike.partin@gmail.com for code.zip

      LabelImg Windows Binaries: https://tzutalin.github.io/labelImg/
      LabelImg: https://github.com/tzutalin/labelImg
  40. 40. Image Models: Object Detection - Retrain [Data] Annotate some data: dev@knoesis:~$ pip3 install labelImg dev@knoesis:~$ labelImg LabelImg Windows Binaries: https://tzutalin.github.io/labelImg/ LabelImg: https://github.com/tzutalin/labelImg
  41. 41. Image Models: Object Detection - Retrain [Train]
      Prep Annotated Data:

          dev@knoesis:~$ cd ~;unzip code.zip;cd code
          dev@knoesis:~$ cp ~/workspace/models/research/object_detection/export_inference_graph.py .
          dev@knoesis:~$ #find the xml_to_csv.py script and utilize to convert annotations to csv
          dev@knoesis:~$ #note: the following scripts may require packages and/or extra steps and should be considered an example / guide
          dev@knoesis:~$ #python xml_to_csv.py -p annotations/train/ -o trainingdata.csv
          dev@knoesis:~$ #python class_gen.py trainingdata.csv > data/object-det.pbtxt
          dev@knoesis:~$ #python gen_tf_record_mod.py --class_file data/object-det.pbtxt --csv_input trainingdata.csv --output_path data/aerial_train.record

      Train:

          dev@knoesis:~$ #obtain mobilenet_v1_coco model from link below
          dev@knoesis:~$ vim ssd_mobilenet_v1_coco.config #edit hyperparameters
          dev@knoesis:~$ python train.py --logtostderr --train_dir=training/ --pipeline_config_path=training/ssd_mobilenet_v1_coco.config #find in legacy?

      Guide: https://becominghuman.ai/tensorflow-object-detection-api-tutorial-training-and-evaluating-custom-object-detector-ed2594afcf73
      Download: http://download.tensorflow.org/models/object_detection/ssd_mobilenet_v1_coco_11_06_2017.tar.gz
      Notes and Research Google Drive: https://drive.google.com/open?id=1rJ9BcFjG5BHDcRerO9PwqgpAjg8SyDUr
  42. 42. Image Models: Object Detection - Retrain [Export]
      Export Graph:

          dev@knoesis:~$ python -u export_inference_graph.py --input_type=image_tensor --pipeline_config_path=training/ssd_mobilenet_v1_coco.config --trained_checkpoint_prefix=training/<model_checkpoint> --output_directory=training

      The exported (frozen) inference graph can then be used in a production or standalone environment.

      Guide: https://becominghuman.ai/tensorflow-object-detection-api-tutorial-training-and-evaluating-custom-object-detector-ed2594afcf73
      Download: http://download.tensorflow.org/models/object_detection/ssd_mobilenet_v1_coco_11_06_2017.tar.gz
      Notes and Research Google Drive: https://drive.google.com/open?id=1rJ9BcFjG5BHDcRerO9PwqgpAjg8SyDUr
  43. 43. Image Models: Object Detection - Standalone
      Test retrained Object Detection “ssd_mobilenet_v1_coco”:

          dev@knoesis:~$ #standalone_OD.py: https://drive.google.com/open?id=1rJ9BcFjG5BHDcRerO9PwqgpAjg8SyDUr
          dev@knoesis:~$ #contact mikep@knoesis.org or mike.partin@gmail.com for standalone_OD.zip
          dev@knoesis:~$ python apply_to_image_batch.py #script to test on batch of images
          dev@knoesis:~$ python standalone_OD.py #script to test on image url

      standalone_OD.py contains the paths to the model and labels. A few models have been retrained for various purposes, such as detecting parts of objects with known heights. Below is a sample output from standalone_OD.py:

          dev@knoesis:~$ python standalone_OD.py "http://myimagesurl.com/img1.jpg"
          head,0.430,590,284,640,329
          torso,0.280,590,327,655,386
          head,0.182,590,299,632,330
          window,0.124,470,236,515,268
          window,0.121,703,3,743,30

      LabelImg Windows Binaries: https://tzutalin.github.io/labelImg/
      LabelImg: https://github.com/tzutalin/labelImg
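The comma-separated sample output above is easy to consume from another script. The sketch below parses it into records and filters by score. It is an illustration only: the slide does not name the fields, so reading them as class label, confidence score, and four integer box coordinates is an assumption, and `Detection`, `parse_detections`, and `filter_by_score` are hypothetical helpers rather than part of standalone_OD.py. The meaning and order of the four coordinates are deliberately left uninterpreted.

```python
from typing import List, NamedTuple, Tuple


class Detection(NamedTuple):
    label: str                       # class name, e.g. "head"
    score: float                     # confidence (assumed to lie in 0..1)
    box: Tuple[int, int, int, int]   # four integers, kept in the order printed


def parse_detections(text: str) -> List[Detection]:
    """Parse lines of the form 'label,score,a,b,c,d' into Detection records."""
    detections = []
    for line in text.strip().splitlines():
        parts = line.strip().split(",")
        if len(parts) != 6:
            continue  # skip anything that is not a six-field record
        label, score, *coords = parts
        detections.append(
            Detection(label, float(score), tuple(int(c) for c in coords))
        )
    return detections


def filter_by_score(detections: List[Detection], min_score: float) -> List[Detection]:
    """Keep only detections whose score is at least min_score."""
    return [d for d in detections if d.score >= min_score]


# Sample output copied from the slide.
sample = """
head,0.430,590,284,640,329
torso,0.280,590,327,655,386
head,0.182,590,299,632,330
window,0.124,470,236,515,268
window,0.121,703,3,743,30
"""
detections = parse_detections(sample)
confident = filter_by_score(detections, 0.2)
```

With a cutoff of 0.2, only the first head and the torso remain. The cutoff is an arbitrary value chosen for the example, not one taken from the slides.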
  44. 44. Image Models: Object Detection - Flask
      Wrap the Object Detection in Flask:

          ObjectDetector Class
              __init__:
                  #load frozen_inference_graph.pb
                  #load class
              load_image_into_numpy_array:
                  -->image data
                  #transform image data to np array
                  <--np array
              run_inference_for_single_image:
                  -->np array
                  -->graph
                  #feed np array into graph
                  #return inference
                  <--inference dict
              extract:
                  -->image_url
                  #download image
                  #load_image_into_numpy_array(image_data)
                  #returnDict=run_inference_for_single_image
                  <--returnDict

      DisasterRecord Deployment: https://github.com/shrutikar/DisasterRecord/tree/deployment
  45. 45. Image Models: Inception V3 - Retrain [Provision] Use TF-Slim for retraining: ● <workspace>/models/research/slim/scripts/finetune_inception_v3_on_flowers.sh ● copy the file to <workspace>/models/research/slim and rename ● run the script to ensure it works with the flower example ● find and modify the following: ○ download_and_convert_data.py (add entry for your new dataset) ○ datasets/download_and_convert_<custom>.py (copy flowers and rename) ○ datasets/<custom>.py (again use flowers.py as template) ○ dataset_factory (include your dataset) Slim: https://github.com/tensorflow/models/tree/master/research/slim
  46. 46. Image Models: Inception V3 - Retrain [Train]
      Use TF-Slim for retraining:

          python train_image_classifier.py \
              --train_dir=${TRAIN_DIR} \
              --dataset_dir=${DATASET_DIR} \
              --dataset_name=flowers \
              --dataset_split_name=train \
              --model_name=inception_v3 \
              --checkpoint_path=${CHECKPOINT_PATH} \
              --checkpoint_exclude_scopes=InceptionV3/Logits,InceptionV3/AuxLogits \
              --trainable_scopes=InceptionV3/Logits,InceptionV3/AuxLogits

      *after training, export to pb (see TensorFlow Transfer Learning Boil Down on Google Drive)
      Slim: https://github.com/tensorflow/models/tree/master/research/slim
  47. 47. Image Models: Inception V3 - Flood / Nonflood Model
      ● Created using 6,000 flood example images along with 4,000 nonflood counterexamples
      ● The Flood / Nonflood binary classifier was tested on a randomly selected group of 200 images from the 2015 Chennai flood dataset, representing the kind of data we expected to see in the live stream during a disaster event
      ● The model achieved 87.5% precision and 75.4% recall, giving an F-score of 0.81
      ● Inception V3 has a reported accuracy of 78.1% on the 1000 classes it was originally trained on
      ● In production, the flood confidence had to be 80% or higher for an input image to be classified as flood; otherwise the image was classified as nonflood

      Example classifications: 54.97% => nonflood, 99.99% => flood, 99.99% => flood, 79.48% => nonflood, 95.07% => nonflood

      Test results:
                 Flood  Nonflood
          True      49       128
          False      7        16

      correctly labeled flood: 49
      incorrectly labeled flood: 7
      incorrectly labeled nonflood: 16
      correctly labeled nonflood: 128
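The reported precision, recall, and F-score follow directly from the four counts on this slide, and the short check below reproduces them. It is a worked example rather than the evaluation code used in the project. `precision_recall_f1` and `production_label` are names introduced here, and the second function is only one reading of the 80% rule, which assumes the classifier exposes a confidence for the flood class.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall and F1 from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


def production_label(flood_confidence, threshold=0.80):
    """Label an image flood only when the flood confidence meets the threshold."""
    return "flood" if flood_confidence >= threshold else "nonflood"


# Counts from the slide: 49 correctly labeled flood (true positives),
# 7 incorrectly labeled flood (false positives),
# 16 incorrectly labeled nonflood (false negatives).
precision, recall, f1 = precision_recall_f1(tp=49, fp=7, fn=16)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.2f}")
```

Precision is 49/56 = 0.875, recall is 49/65 ≈ 0.754, and F1 is 98/121 ≈ 0.81, matching the figures quoted on the slide.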
  48. 48. Image Models: Inception V3 - Standalone

          FloodDetector Class
              getImage:
                  -->imgURL
                  #download and resize the image
                  <--resized image for Inception model
              load:
                  -->imgURL
                  -->modelName
                  #id_np=getImage(imgURL)
                  #load graph from <modelName>.pb
                  #load classes from <modelName>.txt
                  #run_inference_for_single_image
                  <--tuple of (class,confidence)
              run_inference_for_single_image:
                  -->np array
                  -->graph
                  #feed np array into graph
                  #return top inference
                  <--tuple of (class,confidence)

          #Example:
          fd=FloodDetector()
          fd.load("https://bit.ly/2YA2Izd","flood")

          #NOTE THE NAMES OF THE TENSORS FOR INPUT & OUTPUT OF GRAPH
          ...
          y=sess.graph.get_tensor_by_name('InceptionV3/Predictions/Reshape_1:0')
          x=sess.graph.get_tensor_by_name('Placeholder:0')
          ...

      Notes and Research Google Drive: https://drive.google.com/open?id=1rJ9BcFjG5BHDcRerO9PwqgpAjg8SyDUr
  49. 49. Image Models: Inception V3 - Flask DisasterRecord Deployment: https://github.com/shrutikar/DisasterRecord/tree/deployment Flask FloodDetector TF Model GET /classify?url=https://www.somewebpage.com/someimage.jpg RESPONSE: nonflood
  50. 50. Image Models: DeepLab - Overview [1] DeepLab: https://github.com/tensorflow/models/tree/master/research/deeplab [2] ADE20K Dataset: https://groups.csail.mit.edu/vision/datasets/ADE20K/ According to the DeepLab GitHub Page: “DeepLab is a state-of-art deep learning model for semantic image segmentation, where the goal is to assign semantic labels (e.g., person, dog, cat and so on) to every pixel in the input image.” [1] ADE20K Dataset [2] Original Image Object Segmentation Parts Segmentation
  51. 51. Image Models: DeepLab - Overview [1] DeepLab: https://github.com/tensorflow/models/tree/master/research/deeplab ADE20K Dataset: https://groups.csail.mit.edu/vision/datasets/ADE20K/ The DeepLab model is used as-is to detect the boundary of water in an image. ● Combining the results of the DeepLab analysis with object detection, Inception inference, and external knowledge enables more powerful analysis than any one model alone ● What is it used for? This is explained in more detail later.
  52. 52. Image Models: DeepLab - Standalone
      DeepLabModel Script:

          DeepLabModel Class:
              __init__:
                  -->modelPath
                  #load TF graph tar from path
                  <--resized image for Inception model
              run:
                  -->image data
                  #model produces segmentation map
                  <--segmentation map
              create_pascal_label_colormap:
                  #generate colormap for various segments
                  <--colormap
              label_to_color_image:
                  -->label
                  #obtain the color corresponding to the label
                  <--color
              vis_segmentation:
                  -->image data
                  -->segmentation map
                  #save segmentation map as bmp
          #end of DeepLabModel Class

          #main of script
          #label 22 maps to bodies of water in DeepLab
          water_color=FULL_COLOR_MAP[22][0]
          d=DeepLabModel("models/SG/deeplabv3_xception_ade20k_train_2018_05_29.tar.gz")
          image_url = str(args[1]) #get url from arg
          response = requests.get(image_url) #download image
          image = Image.open(BytesIO(response.content))
          o=d.run(image) #get segmentation map
          vis_segmentation(o[0],o[1]) #save as bmp

      [1] DeepLab: https://github.com/tensorflow/models/tree/master/research/deeplab
      ADE20K Dataset: https://groups.csail.mit.edu/vision/datasets/ADE20K/
  53. 53. Image Models: DeepLab - Flask The DeepLab model is not yet wrapped in a Flask app. ● Output from the DeepLab model is a bitmap, unlike the other two models, which return JSON responses ● Elasticsearch might not be the best place to store this kind of information ● More research is needed to determine where best to store the results of this analysis
  54. 54. DisasterRecord LNExAPI
  55. 55. LNExAPI: Overview Redis SQLite Photon InitializeZone [-92.13,28.74,-88.41,31.54] “New Orleans” ab4bd48395bdab12 LNEx Deployment: https://github.com/halolimat/LNEx/tree/LNExAPI-Deployment LNEx Core https://github.com/halolimat/LNEx
  56. 56. LNExAPI: Overview Redis SQLite Photon Authenticate / Check Limits LNEx Deployment: https://github.com/halolimat/LNEx/tree/LNExAPI-Deployment LNEx Core https://github.com/halolimat/LNEx
  57. 57. LNExAPI: Overview Redis SQLite Photon Query Redis LNEx Deployment: https://github.com/halolimat/LNEx/tree/LNExAPI-Deployment LNEx Core https://github.com/halolimat/LNEx
  58. 58. LNExAPI: Overview Redis SQLite Photon Query Photon and Fill Redis Cache LNEx Deployment: https://github.com/halolimat/LNEx/tree/LNExAPI-Deployment LNEx Core https://github.com/halolimat/LNEx
  59. 59. LNExAPI: Overview Redis SQLite Photon Extract “New Orleans” ab4bd48395bdab12 [@PubliusTX I thought Houston First maintained the parking facilities and caused the flooding during last flood at City Hall?",... LNEx Deployment: https://github.com/halolimat/LNEx/tree/LNExAPI-Deployment LNEx Core https://github.com/halolimat/LNEx
  60. 60. LNExAPI: Overview Redis SQLite Photon Token: 36b2a12ba34cc127 LNEx Deployment: https://github.com/halolimat/LNEx/tree/LNExAPI-Deployment LNEx Core https://github.com/halolimat/LNEx
  61. 61. LNExAPI: Overview Redis SQLite Photon Query Zone and set ready bit when completed LNEx Deployment: https://github.com/halolimat/LNEx/tree/LNExAPI-Deployment LNEx Core https://github.com/halolimat/LNEx Poll...
  62. 62. LNExAPI: Overview Redis SQLite Photon LNEx Deployment: https://github.com/halolimat/LNEx/tree/LNExAPI-Deployment LNEx Core https://github.com/halolimat/LNEx Results 36b2a12ba34cc127
  63. 63. LNExAPI: Overview Redis SQLite Photon [{"text":"@PubliusTX I thought Houston First maintained the parking facilities and caused the flooding during last flood at City Hall?"], "results":[3458,12349… JSON RESPONSE LNEx Deployment: https://github.com/halolimat/LNEx/tree/LNExAPI-Deployment LNEx Core https://github.com/halolimat/LNEx
  64. 64. LNExAPI: Functionality - RESTful API Endpoints

      Method  URL
      GET     /apiv1/LNEx/initZone?key=xx&bb=[lon1,lat1,lon2,lat2]&zone=ZoneName
      GET     /apiv1/LNEx/destroyZone?key=xx&zone=ZoneName
      POST    /apiv1/LNEx/bulkExtract?key=xx&zone=ZoneName
      POST    /apiv1/LNEx/fullBulkExtract?key=xx&zone=ZoneName
      GET     /apiv1/LNEx/results?key=xx&token=yy
      GET     /apiv1/LNEx/geoInfo?key=xx&zone=ZoneName&geoIDs=[1,2,3]
      GET     /apiv1/LNEx/photonID?key=xx&osm_id=1
      GET     /apiv1/LNEx/zoneReady?key=xx&zone=ZoneName
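As an illustration of how a client might assemble one of these requests, the sketch below builds the `initZone` URL from its three query parameters using only the standard library. The path and parameter names come from the table above. The host, the key value, the helper name `init_zone_url`, and the choice to percent-encode the bracketed bounding box are assumptions, and whether the server requires or accepts that encoding is not stated in the slides.

```python
import json
from urllib.parse import parse_qs, urlencode, urlsplit


def init_zone_url(host, key, bounding_box, zone):
    """Build the GET URL for /apiv1/LNEx/initZone.

    bounding_box is [lon1, lat1, lon2, lat2], following the endpoint table.
    """
    query = urlencode({
        "key": key,
        # Compact JSON gives the bracketed, comma-separated form shown in the table.
        "bb": json.dumps(bounding_box, separators=(",", ":")),
        "zone": zone,
    })
    return f"{host.rstrip('/')}/apiv1/LNEx/initZone?{query}"


# Placeholder host and key; the bounding box values are made up for the example.
url = init_zone_url("http://127.0.0.1/", "xx", [-84.3, 39.7, -84.0, 39.9], "dayton")

# Decode the query string again to confirm the parameters survive the round trip.
decoded = parse_qs(urlsplit(url).query)
```

The same pattern would extend to the other GET endpoints by swapping the path and parameter names. The two POST endpoints also carry a request body whose format is not shown on this slide.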
  65. 65. LNExAPI: Functionality - CLI Client
      Source code, wheel, and help can be found in the GitHub repository.

          from LNExAPI import LNExAPI

          def displayResults(results):
              for result in results:
                  print("Matches for:",result['text'])
                  for entity in result['entities']:
                      print("[ ]-->",entity['match'])
                      for location in entity['locations']:
                          print(" [ ]-->",str(location['coordinate']['lat'])+","+str(location['coordinate']['lon']))

          lnex = LNExAPI(key="168ba4d297a8c64a03",host="http://127.0.0.1/") #REPLACE WITH YOUR USER KEY AND HOST
          lnex.initZone([-84.6447033333,39.1912856591,-83.2384533333,40.0880515857],"dayton")
          print("Zone Dayton is being initialized...")
          lnex.pollZoneReady("dayton") #WAITS UNTIL ZONE IS INIT/READY
          text=[
              "Your text goes here:",
              "A list of text in which locations will be searched for...",
              "A list of the same size will be returned once you execute a doBulkExtract on the text list",
              "Each item in the returned list will be a list of the entities that matched",]
          print("Extracting locations from Dayton Zone...")
          result_token,results=lnex.pollFullBulkExtract("dayton",text)
          displayResults(results)

      LNExAPI CLI: https://github.com/halolimat/LNEx/blob/LNExAPI-Deployment/LNExAPIClient.md
      LNEx Core https://github.com/halolimat/LNEx
  66. 66. LNExAPI: Functionality - Server Side LNExAPI Server: https://github.com/halolimat/LNEx/blob/LNExAPI-Deployment/LNExAPIServer.md LNEx Core https://github.com/halolimat/LNEx Tools for controlling and maintaining LNExAPI ● LNExCLI - create and list users, list zones, perform “House Cleaning” ● startAPI - start the Django application ● killAPI - shut off the Django application ● startSafetyCheck - start a supplemental service that ensures LNExAPI is active ● killSafetyCheck - disable the supplemental service Example ● ./LNExCLI create user <name> <email> ● ./LNExCLI activate user <name> <access_level>
  67. 67. LNExAPI: Functionality - Performance

      Zone       Bounding Box                                  Entities  KM         Init (s)  LT (s)
      Dayton     39.70185, -84.311377, 39.920823, -84.092938   8655      378.66     22.58     7.63
      NYC        40.477399, -74.25909, 40.916178, -73.700181   1036003   1977.15    256.15    16.07
      London     51.28676, -0.510375, 51.691874, 0.334015      486345    3309.55    262.50    31.91
      Japan      133.19,32.88,141.98,40.68                     1214426   507176.61  N/A       N/A
      Tokyo      33.98, 135.85, 35.9, 139.36                   257428    47719.84   143.24    15.82
      Jakarta    -6.5087, 106.5271, -5.8643, 107.2412          98549     613.14     78.89     15.69
      Sri Lanka  5.5809, 79.3815, 10.1383, 82.194              72865     21666.06   78.91     15.67
      Jordan     29.18, 34.88, 33.38, 39.3                     145913    119150.64  111.17    15.76
  68. 68. LNExAPI: Functionality - Performance
  69. 69. DisasterRecord Multimodal Analysis
  70. 70. Multimodal Analysis - Service Types
      [Diagram: SQL, Redis, Web Host, ES, DR Worker, LNEx, Object Det, Text Class, Image, Plugin A, Plugin B, Photon; the label "Core" appears three times]
      ● Shared Services
        ○ Provide information to all services
      ● Core Analysis Services
        ○ Do not rely on the output of other core analysis services
        ○ Analyze the input data
      ● Custom Analysis Services
        ○ May require core analysis results for further computation, and so may take much longer
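The split between core and custom analysis services can be sketched as a two-stage run. This is only an illustration of the dependency rule stated on the slide, not the DisasterRecord implementation: the function `run_pipeline`, the registry dictionaries, and the toy lambdas are all hypothetical stand-ins.

```python
from typing import Any, Callable, Dict

Record = Dict[str, Any]
CoreService = Callable[[Record], Any]
CustomService = Callable[[Record, Dict[str, Any]], Any]


def run_pipeline(
    item: Record,
    core: Dict[str, CoreService],
    custom: Dict[str, CustomService],
) -> Dict[str, Dict[str, Any]]:
    """Run core services first, then custom services that may read core output."""
    # Core services only see the raw input, so they never depend on each other.
    core_results = {name: service(item) for name, service in core.items()}
    # Custom services run afterwards and may use any core result.
    custom_results = {
        name: service(item, core_results) for name, service in custom.items()
    }
    return {"core": core_results, "custom": custom_results}


# Toy stand-ins for real analysis services (hypothetical logic).
core_services: Dict[str, CoreService] = {
    "object_detection": lambda item: {"person": item["text"].count("person")},
    "text_classification": lambda item: "flood" in item["text"].lower(),
}
custom_services: Dict[str, CustomService] = {
    "plugin_a": lambda item, core: (
        core["text_classification"] and core["object_detection"]["person"] > 0
    ),
}

output = run_pipeline(
    {"text": "Flood on Griggs Rd, one person stranded"},
    core_services,
    custom_services,
)
print(output)
```

Because custom services wait for every core result, they are the natural place for slower, derived computations such as the water height estimation shown later.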
  73. 73. Multimodal Analysis - Example: Water Height Estimation
      Algorithm for water depth analysis
          KnoBase = {
            "head": {
              "avg_obj_height": 22,
              "height_above_ground": 165
            },
            "Torso": {...
          # KnoBase contains the average expected heights of
          # objects detected in the image
          WaterColor ← water class color
  74. 74. Multimodal Analysis - Example: Water Height Estimation
      Algorithm for water depth analysis
          def getColor(x,y):
            return color value at the given pixel
          def getScale(Obj):
            h ← KnoBase[Obj]['avg_obj_height']
            ratio_h ← Obj.H / h
            return ratio_h
          water_estimations=[]
  75. 75. Multimodal Analysis - Example: Water Height Estimation
      Algorithm for water depth analysis
          def estimate(Image,DetectedObjects):
            foreach Obj in DetectedObjects:
              Obj_Ratio ← getScale(Obj)
              Pixel_X ← Obj.X + (Obj.W / 2)
              Pixel_Y ← Obj.Y
              PixelColor ← getColor(Pixel_X,Pixel_Y)
              Total_Height ← 0
  76. 76. Multimodal Analysis - Example: Water Height Estimation
      Algorithm for water depth analysis
              foreach y in range(Pixel_Y, Image.Height):
                # scan the segmentation map from top to bottom
                # for a change of class
                if getColor(Pixel_X,y) == PixelColor:
                  Total_Height++
                elif getColor(Pixel_X,y) == WaterColor:
                  break
                else:
                  Total_Height ← -1
                  break
  77. 77. Multimodal Analysis - Example: Water Height Estimation
      Algorithm for water depth analysis
              # still inside the loop over DetectedObjects: one estimate per object
              calc_height ← Obj_Ratio * Total_Height
              water_level ← KnoBase[Obj]['height_above_ground'] - calc_height
              water_estimations.append(water_level)
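The pseudocode on the preceding slides can be sketched as runnable Python. This is a minimal sketch under several assumptions, not the project's code: the segmentation map is modeled as a grid of class labels rather than colors, `DetectedObject` and `estimate_water_level` are hypothetical names, and the KnoBase values are assumed to be in centimetres. Two choices differ from the slides and should be read as assumptions: the visible pixel count is converted to real-world units by dividing by the pixels-per-unit scale (the slides multiply), and the function returns `None` instead of setting the height to -1 when the scan meets an unexpected class or never reaches water.

```python
from dataclasses import dataclass
from typing import List, Optional

# "head" values are taken from the slide; units assumed to be centimetres.
KNO_BASE = {"head": {"avg_obj_height": 22, "height_above_ground": 165}}
WATER = "water"  # assumed label of the water class in the segmentation map


@dataclass
class DetectedObject:
    label: str
    x: int  # left edge of the bounding box, in pixels
    y: int  # top edge of the bounding box, in pixels
    w: int  # bounding-box width, in pixels
    h: int  # bounding-box height, in pixels


def estimate_water_level(
    seg_map: List[List[str]], obj: DetectedObject
) -> Optional[float]:
    """Estimate water height from how much of the object is visible above water."""
    entry = KNO_BASE[obj.label]
    col = obj.x + obj.w // 2          # scan down the centre column of the box
    obj_class = seg_map[obj.y][col]   # class found at the top of the object
    visible_px = 0
    for row in range(obj.y, len(seg_map)):
        cls = seg_map[row][col]
        if cls == obj_class:
            visible_px += 1
        elif cls == WATER:
            break                     # reached the water line
        else:
            return None               # object ends on something other than water
    else:
        return None                   # reached the image bottom without water
    # Convert visible pixels to real-world height using the object's known size.
    visible_height = visible_px * entry["avg_obj_height"] / obj.h
    return entry["height_above_ground"] - visible_height


# Toy 10x4 map: 2 rows of sky, 6 rows of person, 2 rows of water.
seg_map = [["sky"] * 4] * 2 + [["person"] * 4] * 6 + [[WATER] * 4] * 2
head = DetectedObject(label="head", x=0, y=2, w=4, h=4)
level = estimate_water_level(seg_map, head)
print(level)
```

In the toy map the head box is 4 pixels tall for an expected 22 cm, and 6 pixels of the person are visible above the water, so the visible height is 33 cm and the estimate is 165 - 33 = 132 cm.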
  78. 78. Multimodal Analysis - Merging Modalities
      Bringing it all together: "Griggs Rd is extremely flooded!"
          {
            "objects": {
              "person": 10,
              "vehicles": 2,
              "animals": 0
            },
            "isFlood": true,
            "locationMentions": [1432,2492,543],
            "estWaterHeight": 10,
            "needs": { ...
      Pipeline stages: Ingest Data; Perform Analysis; Combine Analysis and Build Knowledge Base; Display Analysis
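The "combine analysis" stage can be pictured as folding each service's output into a single record. The sketch below is illustrative only: the field names (`objects`, `isFlood`, `locationMentions`, `estWaterHeight`) come from the slide, while `merge_analyses`, the service names, and the `text` key are hypothetical.

```python
from typing import Any, Dict


def merge_analyses(text: str, analyses: Dict[str, Dict[str, Any]]) -> Dict[str, Any]:
    """Fold per-service outputs into one flat record keyed by field name."""
    record: Dict[str, Any] = {"text": text}
    for service, fields in analyses.items():
        for key, value in fields.items():
            # Refuse silent overwrites so two services cannot clobber a field.
            if key in record:
                raise ValueError(f"{service} would overwrite field {key!r}")
            record[key] = value
    return record


record = merge_analyses(
    "Griggs Rd is extremely flooded!",
    {
        "object_detection": {"objects": {"person": 10, "vehicles": 2, "animals": 0}},
        "text_classification": {"isFlood": True},
        "location_extraction": {"locationMentions": [1432, 2492, 543]},
        "water_height": {"estWaterHeight": 10},
    },
)
print(record)
```

Rejecting duplicate field names is one possible design choice; it keeps each field traceable to exactly one service when the merged record is later indexed.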
  79. 79. Multimodal Analysis - Querying Knowledge Base
      "I want to see all the vehicles found in the Houston area within the last hour"
          {
            "query": {
              "bool": {
                "must": [
                  {"range": {"timestamp": {"gte": "now-1h"}}},
                  {"range": {"objects.vehicles": {"gte": 1}}}
                ],
                "filter": {"geo_bounding_box": {
                  "coordinate": {
                    "bottom_left": {"lat": bb[0], "lon": bb[1]},
                    "top_right": {"lat": bb[2], "lon": bb[3]}
                  }
                }}
              }
            }
          }
      Why Elasticsearch is used:
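Since `bb[0]` through `bb[3]` are placeholders rather than valid JSON, the query is most naturally assembled in code. The following sketch builds the same request body as a Python dictionary and serializes it. The field names (`timestamp`, `objects.vehicles`, `coordinate`) are taken from the slide; `build_vehicle_query` is a hypothetical helper, and the Houston bounding box is an approximate, illustrative value rather than one used by the project.

```python
import json
from typing import Any, Dict, List


def build_vehicle_query(bb: List[float]) -> Dict[str, Any]:
    """Build the search body; bb is [bottom lat, left lon, top lat, right lon]."""
    return {
        "query": {
            "bool": {
                # Both conditions must hold: recent, and at least one vehicle.
                "must": [
                    {"range": {"timestamp": {"gte": "now-1h"}}},
                    {"range": {"objects.vehicles": {"gte": 1}}},
                ],
                # Restrict results to documents located inside the box.
                "filter": {
                    "geo_bounding_box": {
                        "coordinate": {
                            "bottom_left": {"lat": bb[0], "lon": bb[1]},
                            "top_right": {"lat": bb[2], "lon": bb[3]},
                        }
                    }
                },
            }
        }
    }


houston_bb = [29.5, -95.8, 30.1, -95.0]  # rough box around Houston, for illustration
query = build_vehicle_query(houston_bb)
print(json.dumps(query, indent=2))
```

The resulting JSON string can be sent as the body of a search request against whichever index holds the merged records.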
  83. 83. Multimodal Analysis - Merging Modalities
      Current Issues:
      ● Each stage of analysis introduces uncertainty, which must be quantified
      ● The per-stage uncertainties must be combined into an overall confidence measure for the combined analysis
      ● Some analysis output (e.g., semantic segmentation) does not fit well into an Elasticsearch schema
      ● Disambiguation issue with location extraction for reliance on
        ○ Example: what part of Griggs Rd?
  84. 84. DisasterRecord Future Work
  85. 85. Future Work Areas of work ● Frontend Development ○ Campaign Management ● Image Analysis ○ Aerial / Ground-Level Classifier ○ Aerial Object Detection ● Backend Services ○ Clean up logging ○ Some bugs still to work out ● Deployment ○ Improve segregation to align with microservice paradigm
  86. 86. Future Work - Resource Locations
      Kubernetes Cluster: 130.108.87.249, 130.108.87.250, 130.108.87.251
      Kubernetes DR: http://130.108.87.249:30138
      Kubernetes LNExAPI: http://130.108.87.249:30139
      OpenStack DR (Staging): http://130.108.86.153/DR/listCampaigns
      OpenStack DR (Production): http://130.108.86.152/DR/listCampaigns
      Photon DB: http://130.108.86.153:9201/
      Notes: https://drive.google.com/open?id=1rJ9BcFjG5BHDcRerO9PwqgpAjg8SyDUr
  87. 87. Summary DisasterRecord is a tool that provides event-centric situational analysis in real time. It was originally developed as a demo for the IBM CallForCode challenge but was limited in that it analyzed only one event. The efforts described in this presentation showcase the work done to improve the DisasterRecord project in the areas of flexibility, scalability, and portability. Deployment methods utilizing Ansible, Docker, and Kubernetes were discussed. These technologies fall in line with the microservice architecture that was also discussed, and they help address the scalability and portability concerns of the original DisasterRecord demo. Improvements to the image processing pipeline that make analysis more flexible and expandable were also discussed. DisasterRecord is ready for the next set of developers to dive in and continue its expansion.
  88. 88. Thanks Dr. Amit Sheth Dr. Krishnaprasad Thirunarayan Dr. Valerie L. Shalin
  89. 89. Thanks Special Thanks Hussein Al-Olimat Shruti Kar Kirill Kultinov Joy Prakash Sain Austin Kempton Alan Fleming
  90. 90. Thanks Special Thanks I also want to extend a special thanks to all the members of Kno.e.sis for the helpful discussions, encouraging words, and tireless efforts. We all bring our own set of talents, interests, and skills to work on some of the most challenging problems not only in computer science but across a wide range of disciplines and areas including psychology, healthcare, emergency response, national defense, and many more. Kno.e.sis provides a rich soil that cultivates an environment of cooperation, teamwork, and collaboration. I have never felt personal growth like what I have experienced here at Kno.e.sis. Thank you all for the wonderful journey! I wish you the very best in all your future endeavors!
  95. 95. Questions? Any questions?
