The Edge to AI Deep Dive Barcelona Meetup March 2019

A deep-dive demo of using MiNiFi, NiFi, and CDSW for real-time AI at the edge, in a local cluster, in the cloud, and in a data science platform at scale, with real-time streaming and data storage.

Technologies covered: Apache NiFi, MiNiFi, NiFi Registry, Cloudera Data Science Workbench (CDSW), Python, PySpark, Spark SQL, Apache Calcite, Apache Parquet, Apache MXNet, GluonCV.

Published in: Data & Analytics

  1. Edge to AI: Deep Dive. Future of Data: Barcelona Meetup. TIMOTHY SPANN, Senior Solutions Engineer, Cloudera. https://www.datainmotion.dev/
  2. © Cloudera, Inc. All rights reserved. DISCLAIMER: The information in this document is proprietary to Cloudera. No part of this document may be reproduced, copied or transmitted in any form for any purpose without the express prior written permission of Cloudera. This document is a preliminary version and not subject to your license agreement or any other agreement with Cloudera. This document contains only intended strategies, developments and functionalities of Cloudera products and is not intended to be binding upon Cloudera to any particular course of business, product strategy and/or development. Please note that this document is subject to change and may be changed by Cloudera at any time without notice. Cloudera assumes no responsibility for errors or omissions in this document. Cloudera does not warrant the accuracy or completeness of the information, text, graphics, links or other items contained within this material. This document is provided without a warranty of any kind, either express or implied, including but not limited to the implied warranties of merchantability, fitness for a particular purpose or non-infringement. Cloudera shall have no liability for damages of any kind including without limitation direct, special, indirect or consequential damages that may result from the use of these materials. The limitation shall not apply in cases of gross negligence.
  3. There are some who call him... DZone Zone Leader and Big Data MVB; Princeton Future of Data Meetup. https://github.com/tspannhw https://community.hortonworks.com/users/9304/tspann.html https://dzone.com/users/297029/bunkertor.html https://www.meetup.com/futureofdata-princeton/
  4. Hadoop {Submarine} Project: Running deep learning workloads on YARN, Tim Spann (Cloudera)
  5. IoT Edge Processing with Apache MiNiFi and Multiple Deep Learning Libraries
  6. AI, MACHINE LEARNING, DATA SCIENCE, ANALYTICS, "BIG DATA"
  8. MACHINE LEARNING PHASES: Where to Connect to Apache NiFi
  9. INDUSTRIALIZED AI REQUIRES LARGER DATA PLATFORM. Streaming Ingest; Batch Ingest; Machine Learning Tools; BI Tools and SQL Editors; Data Products; DATA, METADATA, SECURITY, GOVERNANCE, WORKLOAD MANAGEMENT; MACHINE LEARNING; DATA ENGINEERING; DATA WAREHOUSE; OPERATIONAL DATABASE
  10. Speed of Data Model Training Model Scoring Use Case Batch Batch Batch Batch Reporting, Analytics, Applications Online DS Applications/ Interactive Dashboards Streaming In-stream Streaming Applications Incremental/Online In-stream Streaming Applications Training, Scoring and Monitoring
  16. TENSORFLOW IN GATEWAY
  17. Using TensorFlow Lite on The Edge with Sensors and Google Coral (MiNiFi)
      https://github.com/tspannhw/nifi-minifi-coral
      https://www.datainmotion.dev/2019/03/using-raspberry-pi-3b-with-apache-nifi.html
      {
        "endtime": "1552164369.27",
        "memory": "19.1",
        "cputemp": "32",
        "ipaddress": "192.168.1.183",
        "diskusage": "50336.5",
        "score_2": "0.14",
        "score_1": "0.68",
        "runtime": "4.74",
        "host": "mv2",
        "starttime": "03/09/2019 15:46:04",
        "label_1": "hard disc, hard disk, fixed disk",
        "uuid": "20190309204609_05c9a240-d801-4bac-b029-e5bf38c02d40",
        "label_2": "buckle",
        "systemtime": "03/09/2019 15:46:09"
      }
  18. Using TensorFlow Lite on The Edge with Sensors and Google Coral (MiNiFi)
  19. CLOUDERA DATA SCIENCE WORKBENCH: Accelerate machine learning from research to production
      For data scientists
      • Experiment faster: Use R, Python, or Scala with on-demand compute and secure CDH data access
      • Work together: Share reproducible research with your whole team
      • Deploy with confidence: Get to production repeatably and without recoding
      For IT professionals
      • Bring data science to the data: Give your data science team more freedom while reducing the risk and cost of silos
      • Secure by default: Leverage common security and governance across workloads
      • Run anywhere: On-premises or in the cloud
  20. ACCELERATED DEEP LEARNING WITH GPUS: Multi-tenant GPU support on-premises or cloud
      • Extend CDSW to deep learning
      • Schedule & share GPU resources
      • Train on GPUs, deploy on CPUs
      • Works on-premises or cloud
      Diagram labels: CDSW (GPU, CPU); CDH (CPU); CDH (CPU); single-node training; distributed training, scoring
      “Our data scientists want GPUs, but we need multi-tenancy. If they go to the cloud on their own, it’s expensive and we lose governance.”
      GPU on CDH coming in C6
  21. INTRODUCING MODELS: Machine learning models as one-click microservices (REST APIs). Model APIs made easy!
      1. Choose a Python/R file, e.g. score.py
      2. Choose a function, e.g. forecast

             import pickle

             # Load the trained model once, when the file is first run.
             with open('model.pk', 'rb') as f:
                 model = pickle.load(f)

             def forecast(data):
                 return model.predict(data)

      3. Choose resources
  22. CLOUDERA DATA SCIENCE WORKBENCH: Select a Project, Create a Session, Load Libraries and Data
  23. CLOUDERA DATA SCIENCE WORKBENCH: Load a File and Run It
  24. CLOUDERA DATA SCIENCE WORKBENCH: Install Python Libraries for Python 2 or Python 3
  25. CLOUDERA DATA SCIENCE WORKBENCH: Test Your Function with an Argument
  26. CLOUDERA DATA SCIENCE WORKBENCH: Create a Model from That File and Function
  27. CLOUDERA DATA SCIENCE WORKBENCH: List All the Models
  28. CLOUDERA DATA SCIENCE WORKBENCH: Check Out the Build
  29. CLOUDERA DATA SCIENCE WORKBENCH: Deploy the Model
  30. CLOUDERA DATA SCIENCE WORKBENCH: Test the Model
  31. CLOUDERA DATA SCIENCE WORKBENCH: Validate the Model Results
  32. WHAT’S NEW: CLOUDERA DATA SCIENCE WORKBENCH. Accelerate and simplify machine learning from research to production
      • ANALYZE DATA: Explore data securely and share insights with the team
      • TRAIN MODELS: Run, track, and compare reproducible experiments
      • DEPLOY APIs: Deploy and monitor models as APIs to serve predictions
      NEW! NEW!
      • MANAGE SHARED RESOURCES: Provide a secure, collaborative, self-service platform for your data science teams
  33. CLOUDERA DATA SCIENCE WORKBENCH: Monitor the Running Models
  34. MODEL MANAGEMENT: View, test, monitor, and update models by team or project
  35. CLOUDERA DATA SCIENCE WORKBENCH: Invoke the Model from Apache NiFi in Flow
  36. CLOUDERA DATA SCIENCE WORKBENCH: Query Results of Classification in Flow. CLOUDERA DATA-IN-MOTION (APACHE NIFI)
      {
        "class1": "cat",
        "cpu": 38.3,
        "end": "1549672761.1262221",
        "host": "gluoncv-apache-mxnet-29-50-7fb5cfc5b9-sx6dg",
        "memory": 14.9,
        "pct1": "98.15670800000001",
        "shape": "(1, 3, 566, 512)",
        "systemtime": "02/09/2019 00:39:21",
        "te": "3.380652666091919"
      }
  37. CLOUDERA DATA SCIENCE WORKBENCH: REFERENCES
      ● https://blog.cloudera.com/blog/2019/02/integrating-machine-learning-models-into-your-big-data-pipelines-in-real-time-with-no-coding/
      ● https://community.hortonworks.com/articles/239961/using-cloudera-data-science-workbench-with-apache.html
      ● https://community.hortonworks.com/content/kbentry/239858/integrating-machine-learning-models-into-your-big.html
      ● https://github.com/tspannhw/nifi-cdsw-gluoncv
      ● https://community.hortonworks.com/articles/227560/real-time-stock-processing-with-apache-nifi-and-ap.html
      ● https://www.datainmotion.dev/2019/03/iot-series-sensors-utilizing-breakout_1.html
      ● https://www.datainmotion.dev/2019/03/edge-to-ai-apache-spark-apache-nifi.html
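
The sample record on slide 17 carries every value as a string, including the scores and sensor readings. A minimal sketch of how a downstream consumer might type those fields and pick the best label follows. The choice of numeric fields and the helper names `parse_edge_record` and `top_prediction` are assumptions for illustration, not part of the deck, and the `uuid` is written with its line-wrap space removed.

```python
import json

# Sample record copied from slide 17 (TensorFlow Lite on the edge via MiNiFi).
EDGE_RECORD = """{
  "endtime": "1552164369.27", "memory": "19.1", "cputemp": "32",
  "ipaddress": "192.168.1.183", "diskusage": "50336.5",
  "score_2": "0.14", "score_1": "0.68", "runtime": "4.74", "host": "mv2",
  "starttime": "03/09/2019 15:46:04",
  "label_1": "hard disc, hard disk, fixed disk",
  "uuid": "20190309204609_05c9a240-d801-4bac-b029-e5bf38c02d40",
  "label_2": "buckle", "systemtime": "03/09/2019 15:46:09"
}"""

# Assumption: these are the fields worth treating as numbers.
NUMERIC_FIELDS = (
    "endtime", "memory", "cputemp", "diskusage",
    "score_1", "score_2", "runtime",
)


def parse_edge_record(raw):
    """Parse one JSON record and convert the numeric string fields to float."""
    record = json.loads(raw)
    for name in NUMERIC_FIELDS:
        if name in record:
            record[name] = float(record[name])
    return record


def top_prediction(record):
    """Return (label, score) for the highest score_N/label_N pair, or None."""
    pairs = []
    for key, value in record.items():
        if key.startswith("score_"):
            label = record.get("label_" + key[len("score_"):])
            if label is not None:
                pairs.append((label, float(value)))
    return max(pairs, key=lambda pair: pair[1]) if pairs else None


if __name__ == "__main__":
    print(top_prediction(parse_edge_record(EDGE_RECORD)))
```

Typing the fields once, close to ingest, keeps later comparisons (for example, a score threshold) from silently comparing strings.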
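
Slide 21 outlines the file-plus-function pattern behind a CDSW model, and slide 25 suggests testing the function with an argument before deploying it. The sketch below shows one way such a `score.py` could look so that it runs without a real `model.pk`. `LinearForecaster`, its parameters, and the `{"data": [...]}` input shape are hypothetical stand-ins, and it assumes the deployed function receives the request payload as a single dictionary.

```python
import pickle


class LinearForecaster:
    """Hypothetical stand-in model: y = slope * x + intercept."""

    def __init__(self, slope, intercept):
        self.slope = slope
        self.intercept = intercept

    def predict(self, xs):
        return [self.slope * x + self.intercept for x in xs]


# The slide loads the model from 'model.pk'. To stay self-contained, this
# sketch round-trips plain parameters through pickle in memory instead.
_blob = pickle.dumps({"slope": 2.0, "intercept": 1.0})
model = LinearForecaster(**pickle.loads(_blob))


def forecast(args):
    """Entry point; assumes a payload shaped like {"data": [1, 2, 3]}."""
    data = args["data"]
    if not isinstance(data, list):
        raise ValueError("'data' must be a list of numbers")
    # Return a JSON-serializable value so it can travel back over REST.
    return {"forecast": model.predict(data)}


if __name__ == "__main__":
    # Test the function with an argument before creating a model from it.
    print(forecast({"data": [0, 1, 2.5]}))
```

Loading the model at module level rather than inside `forecast` means the cost is paid once per process, not once per request.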
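
Slides 35 and 36 show a deployed model being called from an Apache NiFi flow and the classification result coming back as JSON. As a rough sketch of the same round trip outside NiFi, the code below builds (but does not send) an HTTP POST and then checks a result. The endpoint URL, access key, and request payload are hypothetical placeholders, and the `{"accessKey": ..., "request": ...}` envelope is an assumption about the model REST convention that should be checked against your CDSW version's documentation. The result JSON is copied from slide 36.

```python
import json
import urllib.request

# Hypothetical values; replace them with those of your own deployment.
MODEL_URL = "https://modelservice.cdsw.example.com/model"
ACCESS_KEY = "example-access-key"


def build_model_request(payload, url=MODEL_URL, access_key=ACCESS_KEY):
    """Build a POST request for a model endpoint without sending it."""
    # Assumed envelope: the access key plus the function's argument.
    body = json.dumps({"accessKey": access_key, "request": payload})
    return urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Classification result copied from slide 36.
CLASSIFICATION = """{
  "class1": "cat", "cpu": 38.3, "end": "1549672761.1262221",
  "host": "gluoncv-apache-mxnet-29-50-7fb5cfc5b9-sx6dg", "memory": 14.9,
  "pct1": "98.15670800000001", "shape": "(1, 3, 566, 512)",
  "systemtime": "02/09/2019 00:39:21", "te": "3.380652666091919"
}"""


def is_confident(raw_result, label, min_pct):
    """True when the top class matches label with at least min_pct percent."""
    result = json.loads(raw_result)
    # pct1 arrives as a string, so convert it before comparing.
    return result["class1"] == label and float(result["pct1"]) >= min_pct


if __name__ == "__main__":
    request = build_model_request({"url": "https://example.com/cat.jpg"})
    print(request.get_method(), request.full_url)
    print(is_confident(CLASSIFICATION, "cat", 90.0))
```

Sending the request would be a call to `urllib.request.urlopen(request)`; it is left out here so the sketch runs without a live endpoint.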
