
Artificial Intelligence, Open Source, and IBM Call for Code

In this talk we cover some of the trends in artificial intelligence and the difficulties of using it. We therefore also present some open source tools that can help simplify AI adoption. We close with a brief introduction to Call for Code, an IBM initiative to build solutions for preventing and responding to natural disasters.


  1. Open Source @ IBM: Artificial Intelligence, Open Source, and IBM Call for Code. Luciano Resende, Data Science Platform Architect. 2018 / © 2018 IBM Corporation
  2. About me: Luciano Resende, Data Science Platform Architect, IBM CODAIT. Has been contributing to open source at the ASF for over 10 years. Currently contributing to the Jupyter Notebook ecosystem, Apache Bahir, Apache Toree, and Apache Spark, among other projects related to AI/ML platforms. Contact: lresende@apache.org, https://www.linkedin.com/in/lresende, @lresende1975, https://github.com/lresende
  3. Open Source @ IBM. Learn: the program touches 78,000 IBMers annually. Consume: virtually all IBM products contain some open source (40,363 packages per year). Contribute: more than 62K OS certs per year and about 10K IBM commits per month. Connect: more than 1,000 active IBM contributors working in key OS projects. Open source participation and usage is simpler than ever.
  4. Open Source is essential to Developer Advocacy. IBM-generated open source innovation: 137 Code Open (dWO) projects with 1000+ GitHub projects; 4 graduates (Node-RED, OpenWhisk, SystemML, Blockchain Fabric) moved to full open governance in the last year; developer.ibm.com/code/open/code/. Community: IBM focuses on 18 strategic communities, drives open governance in "Centers of Gravity", and IBM leaders drive key technologies and assure freedom of action. The IBM OS Way is now open sourced: training, recognition, tooling; organization, consuming, contributing.
  5. IBM's history of strong AI leadership. 1997, Deep Blue: Deep Blue became the first machine to beat a world chess champion in tournament play. 2011, Jeopardy!: Watson beat two top Jeopardy! champions. 1968, "2001: A Space Odyssey": IBM was a technical advisor; HAL is "the latest in machine intelligence". 2018, Open Tech, AI & emerging standards: new IBM centers of gravity for AI; OS projects increasing exponentially; emerging global standards in AI.
  6. AI Scenarios Today
  7. Home Automation & Security: multiple connected or standalone devices, controlled by voice. Examples: Amazon Echo (Alexa), Google Home, Apple HomePod (Siri).
  8. Tesla connected cars. Connected vehicles: it's not just about Google Maps in cars. When Tesla finds a software fault in its vehicles, rather than issuing an expensive and damaging recall, it simply updates the car's operating system over the air. [http://www.wired.com/2014/02/teslas-air-fix-best-example-yet-internet-things/]
  9. Amazon Go: no lines, no checkout, just grab and go.
  10. Model Asset eXchange: enabling domain experts to use deep learning in the enterprise.
  11. Q: What is deep learning? A: Machine learning using deep neural networks. [Figure: InceptionV3 convolutional neural net, a "medium-sized" deep learning model. Image source: https://github.com/tensorflow/models/blob/master/research/inception/g3doc/inception_v3_architecture.png]
  12. Characteristics of Deep Learning (1): state-of-the-art prediction quality in many domains, including image classification, machine translation, facial recognition, time series prediction, and many more.
  13. Characteristics of Deep Learning (2): large, complex models. Model size is generally determined by "how big a model can you fit on your device?" [Figure annotation: each box ≈ between 32 and 768 linear regression models.]
  14. Characteristics of Deep Learning (3): poorly understood today, even by experts. Why do the models converge? Why do the models converge with low loss? Why do the models generalize?
  15. Focus of this Talk: incorporating well-understood deep learning models into enterprise applications.
  16. Sounds easy!
  17. The Components of a Deep Learning Model. [Diagram: a neural network graph with Input (3), Dense (3×8), Dense (8×6), Dense (6×4), Dense (4×2), and Output (2); the weights (not to scale); and a driver program. The example output is the label "cat".]
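The three components named on slide 17 can be illustrated with a minimal pure-Python sketch. The layer shapes come from the slide; everything else is an assumption made for illustration: the weights are random placeholders rather than trained values, the ReLU and softmax activations are a common choice that the slide does not specify, and the second label, "not cat", is hypothetical.

```python
import math
import random

# Graph: the dense layer shapes listed on the slide (3 -> 8 -> 6 -> 4 -> 2).
LAYER_SHAPES = [(3, 8), (8, 6), (6, 4), (4, 2)]
LABELS = ["cat", "not cat"]  # "not cat" is a hypothetical second class


def init_weights(shapes, seed=0):
    """Weights: one matrix and one bias vector per layer (random, untrained)."""
    rng = random.Random(seed)
    weights = []
    for n_in, n_out in shapes:
        matrix = [[rng.uniform(-1.0, 1.0) for _ in range(n_out)]
                  for _ in range(n_in)]
        weights.append((matrix, [0.0] * n_out))
    return weights


def dense(x, matrix, bias):
    """Compute x @ matrix + bias for a single input vector."""
    return [sum(x[i] * matrix[i][j] for i in range(len(x))) + bias[j]
            for j in range(len(bias))]


def softmax(values):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def predict(x, weights):
    """Driver program: push one input through the graph and pick a label."""
    hidden = list(x)
    for index, (matrix, bias) in enumerate(weights):
        hidden = dense(hidden, matrix, bias)
        if index < len(weights) - 1:  # assumed ReLU on hidden layers only
            hidden = [max(0.0, v) for v in hidden]
    probs = softmax(hidden)
    return LABELS[probs.index(max(probs))], probs


weights = init_weights(LAYER_SHAPES)
label, probs = predict([0.2, 0.5, 0.1], weights)
print(label, probs)
```

With untrained weights the predicted label is arbitrary; the point is only that graph, weights, and driver are separate artifacts that must all be present and compatible.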
  18. Example: Get an Image Classifier. Step 1: Find a suitable neural network graph. You will need to read some papers.
  19. Example: Get an Image Classifier. Step 2: Find code to generate the neural network graph. [Screenshot: TensorFlow code to build the ResNet50 neural network graph.]
  20. Example: Get an Image Classifier. Step 3: Find some pre-trained weights for your graph. [Screenshot: Caffe2 ResNet50 model weights.]
  21. Example: Get an Image Classifier. Step 4: Find example code that performs model inference. [Screenshot: TensorFlow code for training and batch inference on ResNet50.]
  22. Example: Get an Image Classifier. Step 5: Write your own code to perform model inference on one image at a time. Step 6: Package your inference code, graph creation code, and pre-trained weights together. Step 7: Deploy your package.
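Steps 5 and 6 above can be sketched as follows. This is a toy illustration under stated assumptions, not the deck's code: `PackagedClassifier` and the three `toy_*` functions are hypothetical stand-ins for the graph-building code, the pre-trained weights, and the batch inference example that steps 2 to 4 would turn up.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

Image = Sequence[float]  # stand-in for decoded pixel data


@dataclass
class PackagedClassifier:
    """Step 6: bundle graph creation, weights, and inference code together."""
    build_graph: Callable[[], dict]
    load_weights: Callable[[dict], None]
    batch_infer: Callable[[dict, List[Image]], List[str]]

    def __post_init__(self):
        # Build the graph once and load the weights into it.
        self.graph = self.build_graph()
        self.load_weights(self.graph)

    def predict_one(self, image: Image) -> str:
        """Step 5: adapt batch inference code to one image at a time."""
        return self.batch_infer(self.graph, [image])[0]


# Hypothetical stand-ins for the artifacts found in steps 2 to 4.
def toy_build_graph() -> dict:
    return {"threshold": None}


def toy_load_weights(graph: dict) -> None:
    graph["threshold"] = 0.5


def toy_batch_infer(graph: dict, images: List[Image]) -> List[str]:
    return ["cat" if sum(img) / len(img) > graph["threshold"] else "not cat"
            for img in images]


classifier = PackagedClassifier(toy_build_graph, toy_load_weights, toy_batch_infer)
print(classifier.predict_one([0.9, 0.8, 0.7]))
```

Deployment (step 7) would then wrap `predict_one` in whatever serving layer the application uses; that part is outside this sketch.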
  23. Model Marketplaces: collections of well-understood deep learning models. They provide a central place to find known-good implementations of these models.
  24. IBM Model Asset eXchange (MAX): MAX is a one-stop-shop open source ecosystem for data scientists and AI developers to share and consume models that use machine learning engines such as TensorFlow, PyTorch, and Caffe2. It also provides a standard approach to classify, annotate, and deploy these models for prediction and inferencing. https://developer.ibm.com/code/exchanges/models/
  25. Demo! https://developer.ibm.com/code/exchanges/models/ https://developer.ibm.com/code/patterns/create-web-app-interact-machine-learning-generated-image-captions/
  26. Summary: Free, open-source models. Wide variety of domains. Multiple deep learning frameworks. Vetted and tested code and IP. Build and deploy a web service in 30 seconds. Start training on Watson Studio in minutes.
  27. MAX: Future Plans. Many more models (train with Watson Studio/DLaaS; run inference on IBM infrastructure). Revamped website. Integration with Watson Catalog. IBMer-uploaded models. More IBM Code code patterns showing usage. https://developer.ibm.com/code/exchanges/models/
  28. FfDL: Fabric for Deep Learning. FfDL provides a scalable, resilient, and fault-tolerant deep learning framework.
  29. Fabric for Deep Learning (https://github.com/IBM/FfDL). Fabric for Deep Learning, or FfDL (pronounced "fiddle"), is an open source project that aims to make deep learning easily accessible to the people to whom it matters most, i.e., data scientists and AI developers. FfDL provides a consistent way to deploy, train, and visualize deep learning jobs across multiple frameworks such as TensorFlow, Caffe, PyTorch, and Keras. FfDL is being developed in close collaboration with IBM Research and IBM Watson; it forms the core of Watson's Deep Learning service in open source. Links: FfDL GitHub page: https://github.com/IBM/FfDL; FfDL dwOpen page: https://developer.ibm.com/code/open/projects/fabric-for-deep-learning-ffdl/; FfDL announcement blog: http://developer.ibm.com/code/2018/03/20/fabric-for-deep-learning; FfDL technical architecture blog: http://developer.ibm.com/code/2018/03/20/democratize-ai-with-fabric-for-deep-learning; Deep Learning as a Service within Watson Studio: https://www.ibm.com/cloud/deep-learning; research paper "Scalable Multi-Framework Management of Deep Learning Training Jobs": http://learningsys.org/nips17/assets/papers/paper_29.pdf
  30. Fabric for Deep Learning (https://github.com/IBM/FfDL): FfDL is built using a microservices architecture on Kubernetes. The FfDL platform uses a microservices architecture to offer resilience, scalability, multi-tenancy, and security without modifying the deep learning frameworks, and with no or minimal changes to model code. FfDL control plane microservices are deployed as pods on Kubernetes to manage the cluster of GPU- and CPU-enabled machines effectively. Tested platforms: Minikube, IBM Cloud Public, IBM Cloud Private, and GPUs using both the Kubernetes Accelerators feature gate and NVIDIA device plugins.
  31. Access to elastic compute leveraging Kubernetes. Auto-allocation means infrastructure is used only when needed. Training assets are managed and tracked. [Diagram labels: source code, training definition, Kubernetes container, training artifacts, compute cluster (NVIDIA Tesla K80, P100, V100), Cloud Object Storage.]
  32. Model training distributed across containers. [Diagram labels: NVIDIA GPUs, Kubernetes container orchestration, training runs, containers, server cluster, dataset, Cloud Object Storage.]
  33. FfDL: Architecture
  34. FfDL: Research Papers. https://arxiv.org/abs/1709.05871
  35. Jupyter Enterprise Gateway: provides multi-tenant, scalable, and secure remote Jupyter Notebook kernels.
  36. Jupyter Notebooks Overview
  37. Jupyter Notebooks: notebooks are interactive computational environments in which you can combine code execution, rich text, mathematics, plots, and rich media.
  38. Jupyter Notebooks: the Notebook UI runs in the browser. The Notebook Server serves the notebooks. Kernels interpret and execute cell contents: they are responsible for code execution and abstract over different languages.
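The division of labor between notebook server and kernels can be mimicked with a toy model. This is a conceptual sketch only: `ToyPythonKernel` and `ToyNotebookServer` are hypothetical classes, and real Jupyter kernels are separate processes that talk to the server over a messaging protocol rather than in-process objects.

```python
import contextlib
import io


class ToyPythonKernel:
    """Executes cell contents and keeps state between cells."""
    language = "python"

    def __init__(self):
        self.namespace = {}  # variables survive from one cell to the next

    def execute(self, code):
        buffer = io.StringIO()
        with contextlib.redirect_stdout(buffer):
            exec(code, self.namespace)
        return buffer.getvalue()


class ToyNotebookServer:
    """Routes each cell to the kernel for its notebook and language."""

    def __init__(self, kernel_classes):
        self.kernel_classes = {cls.language: cls for cls in kernel_classes}
        self.kernels = {}

    def run_cell(self, notebook, language, code):
        key = (notebook, language)
        if key not in self.kernels:  # start a kernel on first use
            self.kernels[key] = self.kernel_classes[language]()
        return self.kernels[key].execute(code)


server = ToyNotebookServer([ToyPythonKernel])
server.run_cell("analysis.ipynb", "python", "x = 2")
print(server.run_cell("analysis.ipynb", "python", "print(x * 21)"), end="")
```

Supporting another language in this toy model means registering another kernel class with a different `language` value; the server code does not change, which is the sense in which kernels abstract over languages.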
  39. Building a Data Science Analytical Platform
  40. Building a Data Science Platform. Large pool of shared computing resources: enterprise cloud, public cloud, or hybrid; data in the cloud (data lakes/object storage). Distributed consumers: notebooks running locally (on the user's laptop) or as a service (e.g. JupyterHub). Different resource utilization patterns: a high number of idle resources.
  41. Vanilla Jupyter Notebooks. Limitations of the Jupyter Notebook stack. Security limitations: a single user, so all users share the same privileges; users can see and control each other's processes using Jupyter administrative utilities. Scalability limitations: Jupyter kernels run as local processes; resources are limited by what is available on the single node that runs all kernels and associated Spark drivers. [Chart: maximum number of simultaneous kernels (4 GB heap) by cluster size (32 GB nodes): 8 kernels at 4, 8, 12, and 16 nodes.] [Diagram: pipeline stages Gather Data, Analyze Data, Machine Learning, Deep Learning, Deploy Model, Maintain Model; tools shown: Python Data Science Stack, Pandas, Scikit-Learn, Apache Spark, Jupyter, Keras + TensorFlow, Fabric for Deep Learning (FfDL), Model Asset eXchange, MLeap + PFA.]
  42. Jupyter Enterprise Gateway: a lightweight, multi-tenant, scalable, and secure gateway that enables Jupyter Notebooks to share resources across an Apache Spark or Kubernetes cluster for enterprise/cloud use cases. Jupyter Enterprise Gateway at IBM Code: https://developer.ibm.com/code/openprojects/jupyter-enterprise-gateway/; source code at GitHub: https://github.com/jupyter-incubator/enterprise_gateway; documentation: http://jupyter-enterprise-gateway.readthedocs.io/en/latest/. [Logos: supported kernels and supported platforms, including Spectrum Conductor.]
  43. Jupyter Enterprise Gateway. Optimized resource allocation: utilize resources on all cluster nodes by running kernels as Spark applications in YARN cluster mode; pluggable architecture to enable support for additional resource managers. Enhanced security: end-to-end secure communications (secure socket communications; encrypted HTTP communication using SSL). Multiuser support with user impersonation: enhance security and sandboxing by enabling user impersonation when running kernels (using Kerberos); individual HDFS home folder for each notebook user; use the same user ID for notebook and batch jobs. [Chart: maximum number of simultaneous kernels (4 GB heap) by cluster size (32 GB nodes): 16 kernels at 4 nodes, 32 at 8 nodes, 48 at 12 nodes, 64 at 16 nodes.] [Diagram: same pipeline stages and tools as slide 41.]
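The two kernel-capacity charts (slides 41 and 43) can be reproduced with a few lines of arithmetic. The node memory, heap size, cluster sizes, and resulting kernel counts come from the charts. The figure of 4 kernels per node for Enterprise Gateway is inferred from the chart (16 kernels on 4 nodes); the slides do not explain it, so treat it as an assumption.

```python
NODE_MEMORY_GB = 32     # from the chart axis: 32 GB nodes
KERNEL_HEAP_GB = 4      # from the chart axis: 4 GB heap per kernel
CLUSTER_SIZES = [4, 8, 12, 16]

# Inferred from the chart (16 kernels / 4 nodes); not stated in the slides.
GATEWAY_KERNELS_PER_NODE = 4


def vanilla_max_kernels(nodes):
    """All kernels run on the single notebook node, so cluster size is ignored."""
    return NODE_MEMORY_GB // KERNEL_HEAP_GB


def gateway_max_kernels(nodes):
    """Kernels are spread across the cluster, so capacity grows with node count."""
    return nodes * GATEWAY_KERNELS_PER_NODE


for nodes in CLUSTER_SIZES:
    print(nodes, vanilla_max_kernels(nodes), gateway_max_kernels(nodes))
```

The vanilla stack stays flat at 8 kernels however many nodes are added, while the gateway figure scales linearly with cluster size.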
  44. Jupyter Enterprise Gateway on YARN. The gateway node runs Jupyter Enterprise Gateway, which provides multitenancy and remote kernel lifecycle management via process proxies. Impersonation: Alice's kernel runs under Alice's user ID. [Diagram: Bob's and Alice's notebooks connect through nb2kg and a security layer to the gateway node; in the YARN cluster, each YARN container on the YARN workers holds a Jupyter kernel and its Spark driver, alongside Spark executors.]
  45. Enterprise Gateway & Kubernetes. Before Jupyter Enterprise Gateway, scalability was limited: resources are limited, and the amount required for all kernels must be allocated when the Notebook Server pod is created; resources are limited by what is available on the single node that runs all kernels and associated Spark drivers. [Diagram: supported platforms, with all kernels on one node.]
  46. Jupyter Enterprise Gateway on Kubernetes. Container images are defined in the kernelspec. [Diagram: notebooks connect through nb2kg to the Gateway, which launches vanilla kernels and Spark-based kernels (Spark on K8) from a community image, alongside a distributed file system.]
  47. Summary
  48. Summary. Model Asset eXchange: a curated set of models ready to use or embed in your application or solution. Fabric for Deep Learning: provides a consistent way for AI developers and data scientists to train their models. Jupyter Enterprise Gateway: enables your Jupyter Notebook stack to scale so that you can build machine learning and AI models more resource-efficiently. MAX: https://developer.ibm.com/code/exchanges/models/
  50. May 17, 2018 / © 2018 IBM Corporation