Apache Deep Learning 201 - Barcelona DWS March 2019

The art of using Apache NiFi with Apache Tika, Apache OpenNLP, Apache Spark, Apache MXNet, Apache NiFi MiNiFi, Apache NiFi Registry, Apache Livy, Apache HBase, Apache Phoenix, Apache Hive and Apache YARN for deep learning workloads, including Submarine.

Published in: Data & Analytics

  1. 1. @PaaSDev Apache Deep Learning 201 v1.00 (For Data Engineers) Timothy Spann https://github.com/tspannhw/ApacheDeepLearning201/
  2. 2. @PaaSDev Disclaimer
     • This is my personal integration and use of Apache software, not any company's vision.
     • This document may contain product features and technology directions that are under development, may be under development in the future, or may ultimately not be developed. These are Tim’s ideas only.
     • Technical feasibility, market demand, user feedback, and the Apache Software Foundation community development process can all affect timing and final delivery.
     • This document’s description of these features and technology directions does not represent a contractual commitment, promise or obligation from Hortonworks to deliver these features in any generally available product.
     • Product features and technology directions are subject to change, and must not be included in contracts, purchase orders, or sales agreements of any kind.
     • Since this document contains an outline of general product development plans, customers should not rely upon it when making a purchase decision.
  3. 3. @PaaSDev There are some who call him... DZone Zone Leader and Big Data MVB; Princeton Future of Data Meetup https://github.com/tspannhw https://community.hortonworks.com/users/9304/tspann.html https://dzone.com/users/297029/bunkertor.html https://www.meetup.com/futureofdata-princeton/
  4. 4. @PaaSDev Hadoop {Submarine} Project: Running deep learning workloads on YARN, Tim Spann (Cloudera)
  5. 5. @PaaSDev
  6. 6. @PaaSDev
  7. 7. @PaaSDev IoT Edge Processing with Apache MiNiFi and Multiple Deep Learning Libraries
  8. 8. @PaaSDev Deep Learning for Big Data Engineers
     Multiple users, frameworks, languages, devices, data sources & clusters
     BIG DATA ENGINEER
     • Experience in ETL
     • Coding skills in Scala, Python, Java
     • Experience with Apache Hadoop
     • Knowledge of database query languages such as SQL
     • Knowledge of Hadoop tools such as Hive or Pig
     • Expert in ETL (Eating, Ties and Laziness)
     • Social Media Maven
     • Deep SME in Buzzwords
     • No Coding Skills
     • Interest in Pig and Falcon
     CAT AI
     • Will Drive Your Car
     • Will Fix Your Code
     • Will Beat You At Q-Bert
     • Will Not Be Discussed Today
     • Will Not Finish This Talk For Me, This Time
     http://gluon.mxnet.io/chapter01_crashcourse/preface.html
  9. 9. @PaaSDev
  10. 10. @PaaSDev
  11. 11. @PaaSDev Why Apache NiFi?
      • Guaranteed delivery
      • Data buffering
        - Backpressure
        - Pressure release
      • Prioritized queuing
      • Flow specific QoS
        - Latency vs. throughput
        - Loss tolerance
      • Data provenance
      • Supports push and pull models
      • Hundreds of processors
      • Visual command and control
      • Over 200 sources
      • Flow templates
      • Pluggable/multi-role security
      • Designed for extension
      • Clustering
      • Version control
  12. 12. @PaaSDev Collect: Bring Together
      Aggregate all the data! Sensors, drones, logs, geo-location devices, photos, images, and results from running predictions on pre-trained models.
  13. 13. @PaaSDev Conduct: Mediate the Data Flow
      Mediate point-to-point and bidirectional data flows, delivering data reliably to and from Apache HBase, Druid, Apache Phoenix, Apache Hive, Impala, Kudu, HDFS, Slack and email.
  14. 14. @PaaSDev Curate: Gain Insights
      Orchestrate, parse, merge, aggregate, filter, join, transform, fork, query, sort, dissect, store, and enrich with weather, location, sentiment analysis, image analysis, object detection, image recognition, …
  15. 15. @PaaSDev
      • Cloud ready
      • Python, C++, Scala, R, Julia, Matlab, MXNet.js and Perl support
      • Experienced team (XGBoost)
      • AWS, Microsoft, NVIDIA, Baidu, Intel
      • Apache Incubator project
      • Runs distributed on YARN and Spark
      • In my early tests, faster than TensorFlow. (Try this yourself)
      • Runs on Raspberry Pi, NVIDIA Jetson TX1 and other constrained devices
      https://mxnet.incubator.apache.org/how_to/cloud.html
      https://github.com/apache/incubator-mxnet/tree/1.3.1/example
      https://gluon-cv.mxnet.io/api/model_zoo.html
  16. 16. @PaaSDev
      • Great documentation
      • Crash Course
      • Gluon (Open API), GluonCV, GluonNLP
      • Keras (One API, Many Runtime Options)
      • Great Python interaction. Java and Scala APIs!
      • Open source Model Server available
      • ONNX (Open Neural Network Exchange Format) support for AI models
      • Now in version 1.4.0!
      • Rich Model Zoo!
      • Math Kernel Library and NVIDIA CUDA optimizations
      • TensorBoard compatible
      http://mxnet.incubator.apache.org/
      http://gluon.mxnet.io/
      https://onnx.ai/
      https://gluon-nlp.mxnet.io/
      pip3.6 install -U keras-mxnet
      pip3.6 install --upgrade mxnet
      pip3.6 install gluonnlp
      pip3.6 install gluoncv
      pip3.6 install "mxnet-mkl>=1.3.0" --upgrade
  17. 17. @PaaSDev Apache MXNet GluonCV Zoo
      https://gluon-cv.mxnet.io/model_zoo/classification.html
      • ResNet152_v2
      • MobileNetV2_0.25
      • VGG19_bn
      • SqueezeNet1.1
      • DenseNet201
      • Darknet53
      • InceptionV3
      • CIFAR_ResNeXt29_16x64
      • yolo3_darknet53_voc
      • ssd_512_mobilenet1.0_coco
      • faster_rcnn_resnet101_v1d_coco
      • yolo3_darknet53_coco
      • FCN model on PASCAL VOC
  18. 18. @PaaSDev Apache NiFi Integration with Apache Hadoop Options
      • Apache MXNet running in Apache Zeppelin notebooks
      • Apache MXNet running on YARN 3.1 in Hadoop 3.1 in Dockerized containers
      • Apache MXNet running on YARN
      https://community.hortonworks.com/articles/176789/apache-deep-learning-101-using-apache-mxnet-in-apa.html
      https://community.hortonworks.com/articles/174399/apache-deep-learning-101-using-apache-mxnet-on-apa.html
      https://www.slideshare.net/Hadoop_Summit/deep-learning-on-yarn-running-distributed-tensorflow-etc-on-hadoop-cluster-v3
  19. 19. @PaaSDev Object Detection: GluonCV YOLO v3 and Apache NiFi https://community.hortonworks.com/articles/222367/using-apache-nifi-with-apache-mxnet-gluoncv-for-yo.html
  20. 20. @PaaSDev Object Detection: Faster RCNN with GluonCV
      net = gcv.model_zoo.get_model('faster_rcnn_resnet50_v1b_voc', pretrained=True)
      Faster RCNN model trained on the Pascal VOC dataset with a ResNet-50 backbone.
      https://gluon-cv.mxnet.io/api/model_zoo.html
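The one-line model load on this slide can be extended into a small detection script. The sketch below is not from the deck. It assumes `gluoncv` and `mxnet` are installed and follows the GluonCV Faster RCNN demo as I recall it (`data.transforms.presets.rcnn.load_test`, `net.classes`); the function names, the score threshold, and the treatment of negative class ids as padding are assumptions made for illustration. `filter_detections` is plain Python, so it can be tried without MXNet.

```python
def filter_detections(class_ids, scores, boxes, class_names, threshold=0.5):
    """Keep detections whose score meets the threshold.

    class_ids and scores are flat sequences of numbers; boxes is a sequence
    of [xmin, ymin, xmax, ymax]. Entries with a negative class id are assumed
    to be padding and are skipped.
    """
    kept = []
    for cid, score, box in zip(class_ids, scores, boxes):
        cid = int(cid)
        if cid < 0 or score < threshold:
            continue
        kept.append((class_names[cid], float(score), [float(v) for v in box]))
    return kept


def detect(image_path, threshold=0.5):
    """Run the pretrained Faster RCNN model on one image (needs gluoncv + mxnet)."""
    from gluoncv import data, model_zoo  # imported lazily: heavy dependency

    net = model_zoo.get_model('faster_rcnn_resnet50_v1b_voc', pretrained=True)
    # load_test resizes and normalizes the image for the RCNN presets.
    x, _orig_img = data.transforms.presets.rcnn.load_test(image_path)
    box_ids, scores, bboxes = net(x)
    return filter_detections(
        box_ids[0].asnumpy().ravel(),
        scores[0].asnumpy().ravel(),
        bboxes[0].asnumpy(),
        net.classes,
        threshold,
    )
```

Calling `detect()` with a path to a local image would return a list of `(label, score, box)` tuples, which is a convenient shape to serialize as JSON for a downstream flow.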
  21. 21. @PaaSDev Instance Segmentation: Mask RCNN with GluonCV net = model_zoo.get_model('mask_rcnn_resnet50_v1b_coco', pretrained=True) Mask RCNN model trained on COCO dataset with ResNet-50 backbone https://gluon-cv.mxnet.io/build/examples_instance/demo_mask_rcnn.html https://arxiv.org/abs/1703.06870 https://github.com/matterport/Mask_RCNN
  22. 22. @PaaSDev Semantic Segmentation: DeepLabV3 with GluonCV model = gluoncv.model_zoo.get_model('deeplab_resnet101_ade', pretrained=True) GluonCV DeepLabV3 model on ADE20K dataset https://gluon-cv.mxnet.io/build/examples_segmentation/demo_deeplab.html run1.sh demo_deeplab_webcam.py http://groups.csail.mit.edu/vision/datasets/ADE20K/ https://arxiv.org/abs/1706.05587 https://www.cityscapes-dataset.com/ This one is a bit slower.
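A segmentation network of this kind produces one score map per class, and the predicted mask is the per-pixel argmax over the class axis. A minimal illustration in plain Python, using a made-up 2-class, 2x2 score grid rather than real model output (the function name is illustrative only):

```python
def argmax_mask(scores):
    """Collapse per-class score maps into a class-index mask.

    scores is indexed as scores[class][row][col]; the result is indexed as
    mask[row][col] and holds the index of the highest-scoring class.
    Ties go to the lowest class index.
    """
    num_classes = len(scores)
    height = len(scores[0])
    width = len(scores[0][0])
    return [
        [max(range(num_classes), key=lambda c: scores[c][r][col])
         for col in range(width)]
        for r in range(height)
    ]


# Toy input: class 0 wins the left column, class 1 wins the right column.
toy_scores = [
    [[0.9, 0.2], [0.8, 0.1]],  # class 0
    [[0.1, 0.8], [0.2, 0.9]],  # class 1
]
mask = argmax_mask(toy_scores)
print(mask)
```

In the GluonCV demos I am aware of, the same step is done with an argmax along the class axis of the network output; that detail is an assumption here, not something the slide states.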
  23. 23. @PaaSDev Semantic Segmentation: Fully Convolutional Networks
      model = gluoncv.model_zoo.get_model('fcn_resnet101_voc', pretrained=True)
      GluonCV FCN model on the PASCAL VOC dataset
      https://gluon-cv.mxnet.io/build/examples_segmentation/demo_fcn.html
      run1.sh demo_fcn_webcam.py
      https://people.eecs.berkeley.edu/~jonlong/long_shelhamer_fcn.pdf
  24. 24. @PaaSDev Simple Pose Estimation https://gluon-cv.mxnet.io/build/examples_pose/cam_demo.html pip3.6 install gluoncv --pre --upgrade https://github.com/dmlc/gluon-cv/tree/master/scripts/pose/simple_pose yolo3_mobilenet1.0_coco + simple_pose_resnet18_v1b
  25. 25. @PaaSDev Apache MXNet Model Server from Apache NiFi https://community.hortonworks.com/articles/223916/posting-images-with-apache-nifi-17-and-a-custom-pr.html
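Posting an image to a model server comes down to an HTTP POST with the image bytes as the body. The sketch below uses only the Python standard library. The host, port, `/predictions/<model_name>` path, content type, and JSON response are assumptions based on the model server defaults as I remember them, and the function names are illustrative; check all of them against your deployment. In a NiFi flow, an HTTP-posting processor would play this role.

```python
import json
import urllib.request


def build_prediction_request(image_bytes, model_name,
                             base_url="http://127.0.0.1:8080"):
    """Build (but do not send) a POST request carrying raw image bytes.

    The /predictions/<model_name> path is an assumed endpoint layout.
    """
    url = "{}/predictions/{}".format(base_url.rstrip("/"), model_name)
    return urllib.request.Request(
        url,
        data=image_bytes,
        headers={"Content-Type": "application/octet-stream"},
        method="POST",
    )


def predict(image_path, model_name, base_url="http://127.0.0.1:8080"):
    """Send one image file to the server and decode an (assumed) JSON reply."""
    with open(image_path, "rb") as f:
        request = build_prediction_request(f.read(), model_name, base_url)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))
```

Keeping request construction separate from sending makes the URL and headers easy to inspect before any traffic reaches the server.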
  26. 26. @PaaSDev Apache MXNet Native Processor for Apache NiFi
      This is a beta community release by me using the new beta Java API for Apache MXNet.
      https://github.com/tspannhw/nifi-mxnetinference-processor
      https://community.hortonworks.com/articles/229215/apache-nifi-processor-for-apache-mxnet-ssd-single.html
      https://www.youtube.com/watch?v=Q4dSGPvq
  27. 27. @PaaSDev Edge Intelligence with Apache NiFi Subproject - MiNiFi
      Key Features
      ⬢ Guaranteed delivery
      ⬢ Data buffering
        ‒ Backpressure
        ‒ Pressure release
      ⬢ Prioritized queuing
      ⬢ Flow specific QoS
        ‒ Latency vs. throughput
        ‒ Loss tolerance
      ⬢ Data provenance
      ⬢ Recovery / recording a rolling log of fine-grained history
      ⬢ Designed for extension
      ⬢ Java or C++ Agent
      Different from Apache NiFi
      ⬢ Design and Deploy
      ⬢ Warm re-deploys
  28. 28. @PaaSDev Apache MXNet Running on Edge Nodes (MiNiFi)
      https://community.hortonworks.com/articles/83100/deep-learning-iot-workflows-with-raspberry-pi-mqtt.html
      https://github.com/tspannhw/OpenSourceComputerVision
      https://github.com/tspannhw/ApacheDeepLearning101
      https://github.com/tspannhw/mxnet-for-iot
  29. 29. @PaaSDev Multiple IoT Devices with Apache NiFi and Apache MXNet https://community.hortonworks.com/articles/203638/ingesting-multiple-iot-devices-with-apache-nifi-17.html
  30. 30. @PaaSDev Using Apache MXNet on The Edge with Sensors and Intel Movidius (MiNiFi) https://community.hortonworks.com/articles/176932/apache-deep-learning-101-using-apache-mxnet-on-the.html https://community.hortonworks.com/articles/146704/edge-analytics-with-nvidia-jetson-tx1-running-apac.html
  31. 31. @PaaSDev Using Apache MXNet on The Edge with Sensors and Google Coral (MiNiFi) https://www.datainmotion.dev/2019/03/using-raspberry-pi-3b-with-apache-nifi.html
  32. 32. @PaaSDev Open Source Hadoop 3.1
      • Storage Platform: HDFS in Apache Hadoop 3.1
      • Compute & GPU Platform: YARN in Apache Hadoop 3.1
      • HBase 2.0, Hive 3.0, Spark 2.3, Phoenix 0.8
      • Security & Governance: Atlas 1.0, Ranger 1.0, Knox 1.0
      • Operations: Ambari 2.7
  33. 33. @PaaSDev Apache MXNet on Apache YARN 3.1 Native No Spark
      yarn jar /usr/hdp/current/hadoop-yarn-client/hadoop-yarn-applications-distributedshell.jar \
        -jar /usr/hdp/current/hadoop-yarn-client/hadoop-yarn-applications-distributedshell.jar \
        -shell_command python3.6 \
        -shell_args "/opt/demo/analyzex.py /opt/images/cat.jpg" \
        -container_resources memory-mb=512,vcores=1
      Uses: Python Any
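When this job is launched from a script rather than typed by hand, building the command as an argument list avoids quoting mistakes around `-shell_args`. The sketch below reuses the paths and resource values from the slide; the function name and its parameters are illustrative only, and nothing is submitted to YARN unless the last line is uncommented.

```python
import shlex

DSHELL_JAR = ("/usr/hdp/current/hadoop-yarn-client/"
              "hadoop-yarn-applications-distributedshell.jar")


def build_dshell_command(script, script_args, memory_mb=512, vcores=1,
                         interpreter="python3.6", jar=DSHELL_JAR):
    """Return the argv list for a YARN distributed-shell job."""
    # The script and its arguments travel as ONE argument to -shell_args.
    shell_args = " ".join([script] + list(script_args))
    return [
        "yarn", "jar", jar,
        "-jar", jar,
        "-shell_command", interpreter,
        "-shell_args", shell_args,
        "-container_resources",
        "memory-mb={},vcores={}".format(memory_mb, vcores),
    ]


argv = build_dshell_command("/opt/demo/analyzex.py", ["/opt/images/cat.jpg"])
# Print a copy-pasteable form; shlex.quote adds quotes only where needed.
print(" ".join(shlex.quote(a) for a in argv))
# To submit for real (requires a YARN client on this host):
# import subprocess; subprocess.run(argv, check=True)
```

Passing the list straight to `subprocess.run` skips the shell entirely, so the space inside the `-shell_args` value needs no extra escaping.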
  34. 34. @PaaSDev Apache MXNet on Apache YARN 3.1 Native No Spark
      https://community.hortonworks.com/content/kbentry/222242/running-apache-mxnet-deep-learning-on-yarn-31-hdp.html
      https://github.com/tspannhw/ApacheDeepLearning101/blob/master/analyzehdfs.py
  35. 35. @PaaSDev Apache MXNet on YARN 3.2 in Docker Using “Submarine”
      https://github.com/apache/hadoop/tree/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-submarine
      yarn jar hadoop-yarn-applications-submarine-<version>.jar job run \
        --name xyz-job-001 \
        --docker_image <your docker image> \
        --input_path hdfs://default/dataset/cifar-10-data \
        --checkpoint_path hdfs://default/tmp/cifar-10-jobdir \
        --num_workers 1 \
        --worker_resources memory=8G,vcores=2,gpu=2 \
        --worker_launch_cmd "shell for Apache MXNet"
      Wangda Tan (wangda@apache.org), Hadoop {Submarine} Project: Running deep learning workloads on YARN
      https://issues.apache.org/jira/browse/YARN-8135