Data Science Powered Apps for Internet of Things

SpringOne Platform 2016
Speaker: Chris Rawles; Data Scientist, Pivotal

The Internet of Things (IoT) continues to provide value and hold promise for consumers and enterprises alike. To succeed, any IoT project must address how to (1) ingest data, (2) build actionable models, and (3) react in real time.

In this talk, Chris describes approaches to addressing these concerns through a deep dive into an interactive demo centered on classifying human activities. See the guts of such applications and learn about the tools that will enable you to build an application like this yourself!

The demo covers (1) collecting streaming smartphone data, (2) training and building machine learning models in real time, and (3) an application that scores in real time. For each of these, he covers the necessary components of the IoT stack for ingesting, storing, and processing big data, all in real time, using the open-source Pivotal Cloud Foundry and Big Data Suite.

  1. Data Science-Powered Apps for the Internet of Things. Chris Rawles (1) and Jarrod Vawdrey (2). (1) Sr. Data Scientist in New York, New York; (2) Sr. Data Scientist in Atlanta, Georgia. © Copyright 2016 Pivotal. All rights reserved.
  2. Today's talk
       1. A real-time data science app
            A. The app: a live demonstration
            B. How can a data scientist build a data science application?
            C. Revisiting the app
       2. Generalizing the framework: solving new data science challenges
            A. Internet of Things: creating a smart app to prevent oil spill disasters
            B. Financial data: how can retail banks influence their cardholders' behavior?
  4. App (live demonstration)
  6. Data science workflow: Movement classification
       • Apps in the diagram: Sensor app, Dashboard app, Training app (model training as a service), Scoring app (model scoring as a service)
       • Steps: 1. Sensor + Dashboard, 2. Redis, 3. Training app, 4. Scoring app
  7. "here is my source code / run it on the cloud for me / i do not care how" - Onsi Fakhouri, @onsijoe
  8. cf push
       • CF determines app type (Java, Python, Ruby, …)
       • Installs necessary environment
       • Provisions and binds data services
       • Creates domain, routing, and load balancing
       • Continual app health checks and restarts
  9. Data ingestion: Accelerometer data
       • Accelerometer data streamed from a mobile phone at 15 Hz (15 samples per second)
       • Other sensor data: gyroscope data, magnetometer data, longitude/latitude, etc.
       [Figure: accelerometer axes]
  10. Data ingestion
       • For real-time applications, low-latency data ingestion into the data store is essential
       • WebSocket protocol (socket.io)
           - Mobile phone → Webserver
           - Webserver → Dashboard
       • socket.io → Redis
       [Diagram: Sensor app, Training app]
  11. Data storage
       • We are using a Redis store for:
           - Storing training data
           - Model persistence
           - Storing a micro-batch of scoring data
       • Other storage systems include GemFire, HAWQ/Hadoop, Greenplum Database, PostgreSQL, …
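
The three uses of the store listed on the data storage slide can be sketched against a minimal key-value interface. This is only an illustration, not the code from the talk: it assumes a client object exposing `set`, `get`, `rpush`, `ltrim`, and `lrange` with the meaning of the Redis commands of the same names. The `InMemoryStore` stand-in, the key names, and the micro-batch size of 30 are hypothetical, chosen so the sketch runs without a server.

```python
import json
import pickle


class InMemoryStore:
    """Hypothetical stand-in that mimics a few Redis commands in memory."""

    def __init__(self):
        self._data = {}

    def set(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)

    def rpush(self, key, *values):
        self._data.setdefault(key, []).extend(values)
        return len(self._data[key])

    def ltrim(self, key, start, end):
        # Redis list ranges are inclusive; -1 means "up to the last element".
        stop = None if end == -1 else end + 1
        self._data[key] = self._data.get(key, [])[start:stop]

    def lrange(self, key, start, end):
        stop = None if end == -1 else end + 1
        return self._data.get(key, [])[start:stop]


def save_model(store, model, key="model:1234"):
    """Model persistence: serialize the trained model under one key."""
    store.set(key, pickle.dumps(model))


def load_model(store, key="model:1234"):
    """Return the stored model, or None if nothing has been saved yet."""
    blob = store.get(key)
    return None if blob is None else pickle.loads(blob)


def push_sample(store, sample, key="scoring:1234", batch_size=30):
    """Append one sensor sample and keep only the newest batch_size samples."""
    store.rpush(key, json.dumps(sample))
    store.ltrim(key, -batch_size, -1)


def read_batch(store, key="scoring:1234"):
    """Read the current micro-batch of scoring data, oldest sample first."""
    return [json.loads(item) for item in store.lrange(key, 0, -1)]
```

Trimming the list on every push keeps the micro-batch bounded, so the scoring side always reads a fixed-size recent window. Unpickling is only appropriate for data the application itself wrote.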
  12. Modeling: Scalable machine learning applications in Pivotal Cloud Foundry
       1. Training app
       2. Scoring app
  13. Modeling: Training app
       • Goal: build a data-driven model that learns the accelerometer motion patterns associated with each activity
       • Pipeline: Training data → Feature engineering → Machine learning classification model → Trained model
       • Feature engineering: time-domain transformations, Fast Fourier Transform analysis
       • Machine learning classification model: Random Forest model using 2-second time windows (30 samples at 15 Hz)
  14. Model building
       • 20 seconds per training activity
       • Two-second moving window on training data
       • Features: time-domain summary statistics and Fourier transform coefficients
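
The windowing and feature steps on the model building slide can be sketched for a single accelerometer axis as follows. Several details here are assumptions rather than facts from the talk: the window step of one sample, the particular summary statistics, and the number of Fourier coefficients kept. A naive discrete Fourier transform stands in for an FFT routine; it yields the same coefficients, only more slowly.

```python
import cmath
import math
import statistics

SAMPLE_RATE_HZ = 15   # from the slides
WINDOW_SECONDS = 2    # from the slides
WINDOW_SIZE = SAMPLE_RATE_HZ * WINDOW_SECONDS  # 30 samples


def sliding_windows(samples, size=WINDOW_SIZE, step=1):
    """Yield overlapping windows; the step of 1 sample is an assumption."""
    for start in range(0, len(samples) - size + 1, step):
        yield samples[start:start + size]


def dft_magnitudes(window, n_coeffs=5):
    """Magnitudes of DFT bins 1..n_coeffs, normalized by window length."""
    n = len(window)
    magnitudes = []
    for k in range(1, n_coeffs + 1):
        total = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(window)
        )
        magnitudes.append(abs(total) / n)
    return magnitudes


def extract_features(window):
    """Time-domain summary statistics followed by Fourier magnitudes."""
    time_features = [
        statistics.mean(window),
        statistics.pstdev(window),
        min(window),
        max(window),
    ]
    return time_features + dft_magnitudes(window)
```

With 20 seconds of training data at 15 Hz (300 samples) and a step of one sample, this produces 300 - 30 + 1 = 271 windows per activity. Each window's feature vector would then be one training row for the classifier.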
  15. Model training approaches
       1. Near-real-time model training
           - Use small batches to train the model
       2. Real-time model training
           - Online machine learning algorithm: continually update the model using each new data point
       3. Offline model training
           - Build a model offline using batches
           - Useful for models requiring finer model tuning and calibration
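
To make the second approach concrete, here is a minimal sketch of an online learner that updates its state with each new data point. The slides do not name an online algorithm; a nearest-centroid classifier with running means is used here purely because it is short. The class name and the "standing" label are hypothetical.

```python
class OnlineCentroidClassifier:
    """Nearest-centroid classifier updated one labeled point at a time."""

    def __init__(self):
        self.centroids = {}  # label -> running mean of feature vectors
        self.counts = {}     # label -> number of points seen so far

    def update(self, features, label):
        """Fold one new labeled point into the running mean for its label."""
        count = self.counts.get(label, 0) + 1
        self.counts[label] = count
        centroid = self.centroids.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            # Incremental mean: new_mean = old_mean + (x - old_mean) / n
            centroid[i] += (value - centroid[i]) / count

    def predict(self, features):
        """Return the label whose centroid is closest in squared distance."""
        def squared_distance(label):
            centroid = self.centroids[label]
            return sum((a - b) ** 2 for a, b in zip(features, centroid))

        return min(self.centroids, key=squared_distance)
```

Because each update touches only one centroid and needs no stored history, the model can be refreshed as every sample arrives, which is the property that separates this approach from the batch-based ones.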
  16. PCF App: Scoring app
       • Real-time model scoring
       • The dashboard initiates a request via an API call and receives a model prediction
       • Pipeline: Streaming input window → Feature engineering → Machine learning classification model (trained model) → Model prediction
       • Feature engineering: time-domain transformations, Fast Fourier Transform analysis
       • Machine learning classification model: Random Forest model using 2-second time windows (30 samples)
       • Example model prediction: { "channel": "1234", "label": "walking", "label_value": 0.746 }
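
The prediction payload on the scoring slide could be assembled roughly as below. This is a sketch under assumptions: it treats `label_value` as the model's estimated probability for the winning label, which the slides do not state, and the function name and input dictionary are hypothetical.

```python
import json


def build_prediction(channel, class_probabilities):
    """Pick the most probable label and package it in the slide's JSON shape."""
    label = max(class_probabilities, key=class_probabilities.get)
    return {
        "channel": channel,
        "label": label,
        # Assumption: label_value is the probability of the chosen label.
        "label_value": round(class_probabilities[label], 3),
    }


def to_json(prediction):
    """Serialize the prediction for the API response body."""
    return json.dumps(prediction)
```

For example, `build_prediction("1234", {"walking": 0.746, "standing": 0.254})` yields the same fields as the example on the slide.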
  17. Operationalizing scalable data science applications: why model scoring as a service?
       1. Application auto-scaling
           - As the data grows, the model scales
       2. Application autonomy
           - The model application is independent of other applications, so development iterations are faster
           - Faster development means a rapid feedback loop
       3. Multiple applications can access the model scoring app
  19. App (revisited)
  21. In all industries, billions of data points represent opportunities for the Internet of Things
       • Gene sequencing: the cost to sequence one genome has fallen from $100M in 2001 to $10K in 2011 to $1K in 2014
       • Smart grids: reading smart meters every 15 minutes is 3000x more data intensive
       • Stock market
       • Social media: Facebook uploads 250 million photos each day
       • Oil exploration: oil rigs generate 25,000 data points per second
       • Video surveillance
       • Medical imaging
       • Mobile sensors
  22. How can we use data to help prevent accidents like the Macondo disaster?
  23. …by creating a smart application
  24. Data science workflow: Movement classification
       • Apps in the diagram: Sensor app, Dashboard app, Training app (model training as a service), Scoring app (model scoring as a service)
  25. Data science workflow: Creating a smart app to prevent oil spill disasters
       • Apps in the diagram: Sensor app, Dashboard app, Training app (model training as a service), Scoring app (model scoring as a service)
       • Alert operator
       • Send signal to control system to change operating parameters
       • Replace old machinery
       • Shut down plant
  26. Data science workflow: How can retail banks influence their cardholders' behavior?
       • Apps in the diagram: Sensor app, Dashboard app, Training app (model training as a service), Scoring app (model scoring as a service)
       • Provide customized services and promotions
       • Next best offer
       • Characterize and improve customer satisfaction
  27. Thank you. Questions and comments: crawles@pivotal.io