Sivakumar Krishnasamy & Josiah Samuel
IBM Cognitive Systems
February 2019
PowerAI:
Open-Source Based Enterprise AI Platform
Artificial Intelligence: mimic humans
Machine Learning: learn with experience
Deep Learning (neural networks): self-learn with more data
AI Infrastructure Stack
- Applications, segment specific: finance, retail, healthcare
- Cognitive APIs (e.g. Watson) and in-house APIs: speech, vision, NLP, sentiment
- Transform & prep data (ETL)
- Machine & deep learning libraries & frameworks: TensorFlow, Caffe, SparkML
- Distributed computing: Spark, MPI
- Data lake & data stores: Hadoop HDFS, NoSQL DBs
- Accelerated infrastructure: accelerated servers and storage
PowerAI + Watson Studio
PowerAI: Open-Source Based Enterprise AI Platform
- Open-source frameworks in a supported distribution (including Caffe and SnapML)
- Developer ease-of-use tools for data scientists
- Faster training times via HW & SW performance optimizations: 3-4x speedup for AI training
- Integrated & supported AI platform
- GPU-accelerated Power servers and storage
5
AI Changes the Compute Architecture
5x Faster Data Communication with Unique
CPU-GPU NVLink High-Speed Connection
1 TB
Memory
Power 9
CPU
V100
GPU
V100
GPU
170GB/s
NVLink
150 GB/s
1 TB
Memory
Power 9
CPU
V100
GPU
V100
GPU
170GB/s
NVLink
150 GB/s
IBM AC922 Power System
Deep Learning Server (4-GPU Config)
Store Large Models
in System Memory
Operate on One
Layer at a Time
Fast Transfer
via NVLink
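The "5x" figure can be sanity-checked from peak link bandwidths; a minimal sketch, assuming the commonly quoted PCIe Gen3 x16 peak of roughly 32 GB/s (a figure not stated on the slide):

```python
# Back-of-the-envelope check of the "5x faster data communication" claim.
# These are peak bandwidth figures; sustained throughput will be lower.
pcie_gen3_x16_gbs = 32      # assumed PCIe Gen3 x16 bidirectional peak (GB/s)
nvlink_cpu_gpu_gbs = 150    # POWER9 <-> V100 NVLink bandwidth from the slide (GB/s)

ratio = nvlink_cpu_gpu_gbs / pcie_gen3_x16_gbs
print(f"NVLink vs PCIe Gen3 x16: {ratio:.1f}x")   # ~4.7x, i.e. roughly 5x
```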
Large AI Models Train ~4 Times Faster
POWER9 servers with NVLink to GPUs vs x86 servers with PCIe to GPUs.
Benchmark: Caffe with LMS (Large Model Support), GoogleNet model on an enlarged ImageNet dataset (2240x2240), runtime of 1000 iterations:
- Xeon x86 2640v4 with 4x V100 GPUs: 3.1 hours
- Power AC922 with 4x V100 GPUs: 49 minutes (3.8x faster)
Detailed benchmark information in back.
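A quick consistency check on the chart's "3.8x faster" figure from the two quoted runtimes (the exact second counts behind the bars are not given on the slide):

```python
# Recompute the speedup from the runtimes quoted on the slide.
x86_runtime_s = 3.1 * 3600     # 3.1 hours on the Xeon x86 system, in seconds
ac922_runtime_s = 49 * 60      # 49 minutes on the Power AC922, in seconds

speedup = x86_runtime_s / ac922_runtime_s
print(f"Speedup: {speedup:.1f}x")   # ~3.8x
```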
Snap ML - Distributed GPU-Accelerated Machine Learning Library
The Snap Machine Learning (ML) library provides:
- libGLM (C++ / CUDA optimized primitive library)
- Distributed training for logistic regression, linear regression, and support vector machines (SVM)
- Distributed hyper-parameter optimization
- More coming soon
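For orientation, a minimal sketch of how a GPU-accelerated Snap ML model is typically trained through its scikit-learn-style Python interface; the `snapml` package name and the `use_gpu` flag are assumptions here and may differ between PowerAI releases:

```python
# Hedged sketch: GPU-accelerated logistic regression with Snap ML's
# scikit-learn-style estimator (package and flag names are assumptions).
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from snapml import LogisticRegression   # assumed import path for the Snap ML estimator

X, y = make_classification(n_samples=100_000, n_features=50, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = LogisticRegression(use_gpu=True)   # train on the GPU via the libGLM primitives
clf.fit(X_train, y_train)
pred = clf.predict(X_test)
print("test accuracy:", accuracy_score(y_test, pred))
```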
PowerAI Distributed Deep Learning (DDL)
Distributed Deep Learning (DDL) reduces training time from days to hours. Deep learning frameworks have limited scaling across multiple servers; IBM DDL solves this limitation.
Scaling chart (speedup vs. number of GPUs, ideal scaling vs. DDL actual scaling): 95% scaling efficiency with 256 GPUs. ResNet-50 on ImageNet-1K, Caffe with PowerAI DDL, running on Minsky (S822LC) POWER8 systems; training time reduced from 16 days to 7 hours.
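A short worked example of what the scaling claim implies; the slide gives only the 95% figure and the 16-days-to-7-hours comparison, and the arithmetic below simply restates those numbers:

```python
# What "95% scaling with 256 GPUs" means in terms of achieved speedup.
num_gpus = 256
scaling_efficiency = 0.95
print(f"Speedup vs. one GPU: ~{scaling_efficiency * num_gpus:.0f}x (ideal would be {num_gpus}x)")

# The separate end-to-end figure quoted on the slide (its baseline configuration is not specified there).
print(f"16 days -> 7 hours is a {16 * 24 / 7:.0f}x reduction in wall-clock time")
```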
PowerAI Enterprise
- Deep Learning Impact (DLI) module: data & model management, ETL, visualization, advising
- IBM Spectrum Conductor: cluster virtualization, automatic hyper-parameter optimization, Spark for data transformation and preparation
- PowerAI: open-source ML frameworks, Large Model Support (LMS), Distributed Deep Learning (DDL), Auto ML
- PowerAI Vision: auto-ML for images & video (label, train, deploy)
- Accelerated infrastructure: accelerated servers and storage
PowerAI Vision: Auto-Deep Learning for Images & Video
Label image or video data → auto-train AI model → package & deploy AI model.
Semi-Automatic Labeling using PowerAI Vision
1. Define labels.
2. Manually label some images / video frames.
3. Train a DL model on the manually labeled data.
4. Run the trained DL model on the entire input data to generate labels.
5. Manually correct the generated labels on some of the data.
6. Repeat until the labels achieve the desired accuracy.
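The workflow above is a human-in-the-loop labeling cycle; below is a minimal sketch of that loop in Python. The callables passed in (manually_label, train_model, predict_labels, manually_correct, labeling_accuracy) are hypothetical placeholders standing in for the slide's steps, not PowerAI Vision APIs:

```python
import random

def semi_automatic_labeling(all_frames, label_names, *,
                            manually_label, train_model, predict_labels,
                            manually_correct, labeling_accuracy,
                            seed_size=100, review_size=100, target_accuracy=0.95):
    """Hypothetical sketch of the semi-automatic labeling loop from the slide."""
    def sample(frames, k):
        return random.sample(frames, min(k, len(frames)))

    # 1-2. Define labels and manually label a small seed set of frames.
    labeled = dict(manually_label(sample(all_frames, seed_size), label_names))

    while True:
        # 3. Train a DL model on whatever labeled data exists so far.
        model = train_model(labeled)

        # 4. Run the trained model over the entire input data to generate labels.
        auto_labels = {frame: predict_labels(model, frame) for frame in all_frames}

        # 5. A human reviews and corrects the generated labels on some of the data.
        corrections = manually_correct(sample(all_frames, review_size), auto_labels)
        labeled.update(corrections)

        # 6. Stop once the generated labels reach the desired accuracy.
        if labeling_accuracy(auto_labels, corrections) >= target_accuracy:
            return auto_labels
```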
Top Reasons to Choose PowerAI
- Simplicity: an integrated platform that just works. IBM curates, tests, and supports fast-moving open source and provides an enterprise distribution on Red Hat, giving an easy-to-deploy enterprise AI platform.
- Ease of use and unique capabilities: faster model training time, large data & model support thanks to NVLink, acceleration of analytics & ML, AutoML via PowerAI Vision, and elastic training to scale GPUs as required.
- Faster training times in a single server plus scalability to hundreds of servers (cluster-level integration), leading to faster insights and better economics.
- A platform that partners can build on: software partners (H2O, Anaconda), SIs, solution vendors & accelerator partners; an open AI platform with ecosystem partners, stacked as POWER9 CPU + GPU → PowerAI → IBM SW / ISV SW → solutions from SIs.
Get Started Today with Machine & Deep Learning
- Build a data science team; your developers can learn at http://cognitiveclass.ai
- Identify a low-hanging use case
- Figure out a data strategy
- Consider pre-built AI APIs
- Hire consulting services
- Get started today at www.ibm.biz/poweraideveloper
- Docker Hub: https://hub.docker.com/r/ibmcom/powerai/
