Predictive analytics with streaming sensor data



Presented at Big Data Analytics Tokyo 2017
Real-time IoT sensor data collected from a working, moving robot is used to predict failure through anomaly detection with H2O autoencoders, all running on the MapR Converged Data Platform and MapR Streams (a Kafka superset implementation). The predictions drive an AR headset that visualizes the results (green: OK, red: FAILURE) on a marker attached to the robot.

See the demo video:

Published in: Technology

  • It’s simply not true that ML solves all problems. ML makes predictions, which is very useful, but most business processes don’t need a prediction at every step. They are more like a series of conditional steps arranged in a DAG.
  • What is Industry 4.0 in relation to IoT?
    What is predictive maintenance, and how does it have value?
    Reduce downtime: get replacement parts before the machine breaks
    Planned downtime is MUCH cheaper than unplanned downtime
    Increased quality: preventive maintenance of robots
    Lower cost: use highly qualified technicians more efficiently
  • One industrial customer requires the response time to be 2s or less, because that’s the time it takes an operator to turn around and check on a machine.
  • Scope: detect electric motor failure using an external motion sensor
  • Show the slide and show the setup together
  • We can show the little sensor generating data in real time?
  • Prediction => 予測 (yosoku)
  • Mention why we are doing it with machine learning at all!
    No hand-written rules: the model automatically learns the best parameters for each application, without new coding and without relying on supervised techniques.
    Especially good when we don’t know what we are looking for: machines can break in a variety of ways.
  • Note that the zeros are our choice of what to record. The sensor can record all of them.
  • 教師なし学習 => unsupervised learning
    異常認識 => anomaly detection
    The real data is very noisy
    Why use ML at all? We don’t want to use rules for every type of robot and every situation
    Don’t mention threshold, just say we did some parameter tuning of the ML algorithms or something

  • 閾値 (shiki-ichi) => threshold
    標準偏差 (hyoujun-hensa) => SD
  • Global publish-subscribe event streaming system for big data
    Decouple Hardware input and output from data analysis
    The stream is a reliable data store
    Scales to millions of events per second
    REST Proxy: Easy and fast multi-language API

    Streaming Architecture is state of the art in data analytics architecture at scale
  • H2O exports the models as Java POJOs
    Custom coding in Scala
    Kafka native API (0.9 – MapR Implementation)
    Consume input data
    Format data for input into model
    Get model prediction
    Publish result to output stream
  • Comment: The reason ML is not used everywhere in IoT is that IT’S REALLY HARD TO DO.
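The processing steps listed in the notes above (consume input data, format it for the model, get a prediction, publish the result) can be sketched as a small loop. This is only an illustration, not the project's code: the original was written in Scala against the Kafka API with an H2O model exported as a POJO, whereas this sketch uses in-memory queues as stand-ins for the input and output streams. The message fields (`x`, `y`, `z`), the `stand_in_model` scorer, and the threshold value are all hypothetical.

```python
import json
from queue import Queue, Empty


def parse_reading(raw):
    """Format one raw JSON message into the feature row the model expects."""
    msg = json.loads(raw)
    return [float(msg["x"]), float(msg["y"]), float(msg["z"])]


def stand_in_model(features):
    """Hypothetical scorer returning an anomaly score.

    A real deployment would call the exported model here instead.
    """
    return max(abs(v) for v in features)


def run_pipeline(input_stream, output_stream, model, threshold):
    """Consume, format, score, and publish until the input is drained."""
    while True:
        try:
            raw = input_stream.get_nowait()      # consume input data
        except Empty:
            break
        features = parse_reading(raw)            # format data for the model
        score = model(features)                  # get model prediction
        status = "FAILURE" if score > threshold else "OK"
        output_stream.put(json.dumps({"score": score, "status": status}))  # publish


# Queues stand in for the input and output streams (assumption).
input_stream, output_stream = Queue(), Queue()
for reading in ({"x": 0.1, "y": -0.2, "z": 0.05}, {"x": 2.5, "y": 0.3, "z": -0.1}):
    input_stream.put(json.dumps(reading))

run_pipeline(input_stream, output_stream, stand_in_model, threshold=1.0)
results = [json.loads(output_stream.get_nowait()) for _ in range(output_stream.qsize())]
print(results)
```

Keeping the model behind a plain function argument mirrors the decoupling described in the notes: the stream handling does not need to know how the prediction is produced.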
  • Predictive analytics with streaming sensor data

    1. Practical Machine Learning Pipeline Using Streaming IoT Sensor Data (© 2017 MapR Technologies, MapR Confidential)
    2. Mathieu Dumoulin and Mateusz Dymczyk • Data Engineer @ MapR Technologies • Previously data scientist and DS team manager, search, NLP and ML engineer • Software Engineer @ • Previously ML/NLP @ Fujitsu Laboratories and en-japan inc • Sommelier in training
    3. The Time for IoT is NOW
    4. IoT and Industry 4.0: Predictive Maintenance
    5. Problem Statement and Business Value. Situation: the goal is to more than double production capacity (生産能力を2倍以上にすることがゴール). We want a real-time view that lets factory staff see which robots will probably fail before they actually fail. More requirements: • Deal with a variety of robots (age, maker, function) • Scale to 100-10,000 robots in real time, across multiple factories • Ensure data reliability • Factory staff have a low level of IT knowledge
    6. Demo Pipeline
    7. Demo Pipeline – Normal State
    8. Demo Pipeline – Predict Failure
    9. LP-RESEARCH Motion Sensor • Tokyo-based startup • Hardware R&D for Industry 4.0 applications • Founded by Waseda University Ph.D. grads
    10. Our Demo – Real Time Robot Failure Prediction… with AR Visualization
    11. How we Made our Demo
        1. Machine learning modeling
        2. Data input
           1. Sensor to backend analysis
        3. Backend data analysis
           1. MapR Converged Data Platform
           2. Streaming Architecture, MapR Streams (Apache Kafka)
        4. Data output: visualizing predictions
           1. Augmented Reality Headset
    12. Machine Learning Modeling (figure labels: Normal State (OK!), PREDICT FAILURE)
        1. Set the Machine Learning goal
           1. Detect abnormal events with > 90% accuracy
           2. Avoid false positives
           3. Decide output
        2. How to reach the goal
           1. Supervised vs. unsupervised
           2. Choose algorithm
           3. Initial dataset exploration
           4. Data cleaning and feature extraction
           5. Deal with real-time and large scale
        3. Deploy to production
           1. Use MapR CDP and custom software
           2. H2O’s export to POJO function
    13. ML – Looking at the data
    14. ML – Anomaly Detection • Unsupervised: 教師なし学習 • Anomaly detection: 異常認識 • H2O uses autoencoder algorithm (deep learning) • H2O’s R API for modeling • Very productive API • Good graphs • Parameter tuning of models • See H2O’s training-book on GitHub
    15. ML – Results. Note: time window: 200 ms, threshold: 1 SD (標準偏差, standard deviation)
    16. ML – Deploy to Production – Real-time Data
    17. MapR Converged Data Platform (architecture diagram). Layers: Open Source Engines & Tools; Commercial Engines & Applications; Cloud and Managed Services; Custom Apps. Data processing: Web-Scale Storage (MapR-FS), Database (MapR-DB), Event Streaming (MapR Streams), Search and Others. APIs: HDFS API, POSIX, NFS, HBase API, JSON API, Kafka API. Enterprise-Grade Platform Services: Real Time, Unified Security, Multi-tenancy, Disaster Recovery, Global Namespace, High Availability. Unified Management and Monitoring.
    18. Data Output – Making Predictions
    19. Data Output – Making Predictions
    20. Conclusion • Getting a good enough model on some data was less than 10% of the total work. • Team members need to have ALL expertise for this kind of project: hardware, software, big data, ML. • MapR, H2O and LP-RESEARCH’s sensor were all essential parts of the project success. – The MapR platform worked perfectly, H2O model is high quality and fast. • The hardware expertise of LP-RESEARCH was critical
    21. Q&A: Engage with us! PROJECT GITHUB: Mathieu Dumoulin, @Lordxar • Blog: Mateusz Dymczyk, @mdymczyk Klaus Petersen, • LP-RESEARCH:
    22. Thank you to LP-RESEARCH! Hardware design and production • Expertise in motion sensors: gyroscope, accelerometer, magnetometer • Sensor fusion algorithm development • Multi-platform application development • See all our products: LPMS-B2, LPMS-CU2, LPMS-CANAL2, LPMS-USBAL2 • OEM also available!
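Slide 15 mentions a threshold of 1 SD applied to results computed over 200 ms windows. A minimal sketch of what such a threshold could look like follows. It assumes the anomaly score is the autoencoder's reconstruction error per window and that the cut-off is the mean plus one standard deviation of errors observed during normal operation; the slides do not spell out either detail, and all function names and numbers here are hypothetical.

```python
from statistics import mean, stdev


def fit_threshold(normal_errors, n_sd=1.0):
    """Return mean + n_sd * SD of reconstruction errors from normal operation."""
    return mean(normal_errors) + n_sd * stdev(normal_errors)


def classify(window_errors, threshold):
    """Label each window: FAILURE if its error exceeds the threshold, else OK."""
    return ["FAILURE" if e > threshold else "OK" for e in window_errors]


# Hypothetical per-window reconstruction errors recorded while the robot is healthy.
normal_errors = [0.10, 0.12, 0.11, 0.09, 0.13, 0.10, 0.12, 0.11]
threshold = fit_threshold(normal_errors)

# Hypothetical errors for four new windows; the third is far from normal.
labels = classify([0.10, 0.11, 0.45, 0.12], threshold)
print(labels)
```

A 1 SD cut-off is tight, so a real deployment would likely tune `n_sd` against the false-positive goal stated on slide 12.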