Enabling a hardware accelerated deep learning data science experience for Apache Spark and Hadoop

Indrajit (I.P) Poddar
Senior Technical Staff Member
IBM Cloud and Cognitive Systems
June 2018

Safe Harbor Statement and Disclaimer
•Copyright IBM Corporation 2018. All rights reserved. U.S. Government Users Restricted Rights - use, duplication, or disclosure restricted by GSA
ADP Schedule Contract with IBM Corporation.
•IBM, the IBM logo, and ibm.com are trademarks or registered trademarks of International Business Machines Corporation in the United States, other
countries, or both. If these and other IBM trademarked terms are marked on their first occurrence in this information with a trademark symbol (® or
TM), these symbols indicate U.S. registered or common law trademarks owned by IBM at the time this information was published. Such trademarks
may also be registered or common law trademarks in other countries. A current list of IBM trademarks is available on the Web at “Copyright and
trademark information” at: ibm.com/legal/copytrade.shtml.
•The information contained in this presentation is provided for informational purpose only. While efforts were made to verify the completeness and
accuracy of the information contained in this presentation, it is provided “as is” without warranty of any kind, expressed or implied. IBM shall not be
responsible for any damages arising out of the use of, or otherwise related to, this presentation or any other documentation.
•The information mentioned regarding potential future products is not a commitment, promise, or legal obligation to deliver any material, code or
functionality. Information about potential future products may not be incorporated into any contract. Nothing contained in this presentation is intended
to, nor shall have the effect of, creating any warranties or representations from IBM (or its suppliers or licensors), or altering the terms and conditions
of any agreement or license governing the use of IBM products and/or software.
•Any statements of performance are based on measurements and projections using standard IBM benchmarks in a controlled environment. The actual
throughput or performance that any user will experience will vary depending upon many factors, including considerations such as the amount of multi-
programming in the user’s job stream, the I/O configuration, the storage configuration, and the workload processed. Therefore, no assurance can be
given that an individual user will achieve results similar to those stated.
•IBM’s statements regarding its plans, directions, and intent are subject to change or withdrawal without notice at IBM’s sole discretion. The
development, release, and timing of any future features or functionality described for our products remains at our sole discretion. Information
regarding potential future products is intended to outline our general product direction and it should not be relied on in making a purchasing decision.

Agenda
01 AI, Deep Learning, Machine Learning
02 Data Science Experience
03 Hardware Acceleration
04 Demo

Artificial Intelligence: Mimic Humans
Machine Learning: Learn with Experience
Deep Learning (Neural Networks): Self-Learn with More Data

Deep Learning Has Revolutionized Machine Learning
[Chart: Accuracy versus Data, comparing Deep Learning with Traditional Machine Learning]
Deep Learning Popularity Growing Exponentially
[Chart: search interest on a 0 to 100 scale, 2011 to 2017. Source: Google Trends. Search term “Deep Learning”]

Machine Learning: Input → Feature Extraction → Features → Classification (Machine Learning Algorithms) → Output
Deep Learning: Input → Deep Neural Network (Feature Extraction & Classification) → Output

AI Infrastructure Stack
• Applications (Segment Specific: Finance, Retail, Healthcare)
• Cognitive APIs (e.g., Watson) and In-House APIs (Speech, Vision, NLP, Sentiment)
• Machine & Deep Learning Libraries & Frameworks (TensorFlow, Caffe, SparkML)
• Distributed Computing (Kubernetes, Spark, MPI)
• Transform & Prep Data (ETL)
• Data Lake & Data Stores (Hadoop HDFS, NoSQL DBs)
• Accelerated Infrastructure (Accelerated Servers, Storage)

Integrated software and hardware for AI
• Open Source Frameworks: Supported Distribution
• Developer Ease-of-Use Tools
• Faster Training Times via HW & SW Performance Optimizations
• Integrated & Supported AI Platform
• Higher Productivity for Data Scientists
• Enable non-Data Scientists to use AI

Teams in modeling experimentation phase

Teams in applications building phase

Teams in model deployment, monitoring and support phase

Software Architecture Best Practices
Run as a collection of “dockerized” services managed by Kubernetes
Kubernetes handles service orchestration by providing:
• Service monitoring and administration
• High availability: service failure detection and automatic restart
• Dynamic addition and removal of nodes
• Online upgrades
Services running in Kubernetes include:
• UI services built with Node.js frameworks for browsers to connect to
• User authentication services
• Project services for user collaboration and data sharing
• Notebook services with enhanced access to Jupyter notebooks
• A Spark service with access to sophisticated analytics libraries
• Pipeline and model building services
• Data connection building service for access to external data
• Various internal management services

Specialized Runtime environments for containers with GPUs
• Create microservices using nvidia-docker images
• Add AI frameworks that transparently exploit GPUs, such as TensorFlow, to the Docker image
• Deploy the image and allocate GPUs in a cluster using Kubernetes
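
The deployment step above can be sketched as a Kubernetes pod specification built in Python. This is a minimal sketch, not the configuration used in the talk: the pod name, image name, and GPU count are hypothetical, and it assumes the cluster runs the NVIDIA device plugin, which exposes GPUs as the `nvidia.com/gpu` resource.

```python
import json


def gpu_pod_manifest(name, image, gpus):
    """Build a pod spec that asks the scheduler for `gpus` whole GPUs.

    Assumes the NVIDIA device plugin is installed, so GPUs appear as the
    extended resource "nvidia.com/gpu". Name and image are hypothetical.
    """
    if gpus < 1:
        raise ValueError("request at least one GPU")
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "restartPolicy": "Never",
            "containers": [
                {
                    "name": name,
                    "image": image,
                    # GPUs are requested under limits; a GPU is allocated
                    # whole and is not shared between containers.
                    "resources": {"limits": {"nvidia.com/gpu": gpus}},
                }
            ],
        },
    }


manifest = gpu_pod_manifest(
    "tf-train", "example.registry/tensorflow-gpu:latest", 2
)
print(json.dumps(manifest, indent=2))
```

The printed JSON could be saved to a file and submitted with `kubectl apply -f`, since Kubernetes accepts JSON as well as YAML manifests.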

Connect to a Spark and Hadoop cluster for larger data sets and access to shared resources
or YARN

POC Development and Deployment Configuration
[Diagram: Kubernetes cluster with three Control / Storage / Compute groups on IBM POWER Systems LC 922, with two Deploy targets]

Mixed Development and Deployment Configuration
[Diagram: three Control / Storage / Compute groups plus additional Compute nodes, on IBM POWER Systems LC 922 and IBM POWER Systems AC 922, with two Deploy targets]

Separate Development and Deployment Configuration
[Diagram: three Control / Storage / Compute groups plus an additional Compute node, on IBM POWER Systems LC 922 and IBM POWER Systems AC 922, with two Deploy targets]

Faster Data Communication with Unique CPU-GPU NVLink High-Speed Connection
[Diagram: Deep Learning Server (4-GPU Config). Two CPUs, each with 1 TB of memory (170 GB/s) and two GPUs attached over NVLink (150 GB/s)]
• Store Large Models in System Memory
• Operate on One Layer at a Time
• Fast Transfer via NVLink
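
To make the "one layer at a time" idea concrete, the sketch below estimates how long it takes to move one layer's data from system memory to a GPU. Only the 150 GB/s NVLink figure comes from the slide. The 2 GB layer size and the roughly 16 GB/s figure for a PCIe 3.0 x16 link are illustrative assumptions, and real transfers carry overheads that this idealized calculation ignores.

```python
def transfer_ms(size_gb, bandwidth_gb_per_s):
    """Idealized time in milliseconds to move size_gb over a link."""
    if bandwidth_gb_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    return size_gb / bandwidth_gb_per_s * 1000.0


NVLINK_GB_PER_S = 150.0     # figure shown on the slide
PCIE3_X16_GB_PER_S = 16.0   # assumption: approximate PCIe 3.0 x16, one direction
LAYER_GB = 2.0              # hypothetical size of one layer's data

nvlink_ms = transfer_ms(LAYER_GB, NVLINK_GB_PER_S)
pcie_ms = transfer_ms(LAYER_GB, PCIE3_X16_GB_PER_S)
print(f"NVLink: {nvlink_ms:.1f} ms, PCIe 3.0 x16: {pcie_ms:.1f} ms")
```

Under these assumptions the per-layer transfer is short enough that keeping the full model in system memory and streaming layers to the GPU becomes plausible, which is the point the slide makes.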

Train Large AI Models Faster
Servers with NVLink to GPUs: 3.8x Faster
GoogLeNet model on Enlarged ImageNet Dataset (2240x2240)
More details: https://developer.ibm.com/linuxonpower/perfcol/perfcol-mldl/

Distributed Deep Learning (DDL)
• Deep learning training takes days to weeks
• Distributed learning enables scaling to 100s of servers connected with Mellanox IB
16 Days Down to 7 Hours: 58x Faster (ResNet-101, ImageNet-22K)
[Chart: training time, 16 Days versus 7 Hours]
Near Ideal Scaling to 256 GPUs: 95% Scaling with 256 GPUs (ResNet-50, ImageNet-1K)
Caffe with PowerAI DDL, Running on Minsky (S822LC) Power System
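
The headline numbers can be related with a little arithmetic. The sketch below uses only the rounded figures on the slide, and the function names are illustrative. The rounded times (16 days, 7 hours) give a ratio of about 55x, so the quoted 58x presumably reflects unrounded measurements.

```python
def speedup(baseline_hours, accelerated_hours):
    """How many times faster the accelerated run is."""
    return baseline_hours / accelerated_hours


def effective_gpus(num_gpus, scaling_efficiency):
    """GPU count an ideally scaling system would need for the same throughput."""
    return num_gpus * scaling_efficiency


ratio = speedup(16 * 24, 7)          # 16 days versus 7 hours
equiv = effective_gpus(256, 0.95)    # 95% scaling on 256 GPUs
print(f"Speedup from rounded times: {ratio:.1f}x")
print(f"256 GPUs at 95% efficiency behave like {equiv:.1f} ideal GPUs")
```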

COMMUNICATION PATHS
[Diagram: two nodes, each with two POWER CPUs with DDR4 memory and four GPUs with GPU memory, connected inside the node by NVLink and PCIe; the nodes reach storage and each other through a Mellanox IB network switch (IB, Eth)]
DDL: Fully utilize bandwidth for links within each node and across all nodes
Learners communicate as efficiently as possible
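
The slides do not describe the algorithm DDL uses. As a generic illustration of how learners can combine gradients while keeping every link busy, the sketch below simulates ring all-reduce in a single process. It is a commonly used technique and is not PowerAI DDL's implementation; the function name and data are hypothetical.

```python
def ring_allreduce(grads):
    """Sum equal-length gradient vectors across workers using a ring schedule.

    grads[r] is worker r's vector. Each worker only ever sends to its right
    neighbour, and every worker sends one chunk per step.
    """
    n = len(grads)
    length = len(grads[0])
    if any(len(g) != length for g in grads) or length % n != 0:
        raise ValueError("vectors must share a length divisible by worker count")
    size = length // n
    data = [list(g) for g in grads]

    def chunk(c):
        return range(c * size, (c + 1) * size)

    # Reduce-scatter: after n-1 steps worker r holds the full sum of
    # chunk (r + 1) % n.
    for step in range(n - 1):
        sends = [((r - step) % n, [data[r][i] for i in chunk((r - step) % n)])
                 for r in range(n)]
        for r, (c, payload) in enumerate(sends):
            dst = (r + 1) % n
            for i, value in zip(chunk(c), payload):
                data[dst][i] += value

    # All-gather: circulate the finished chunks until everyone has all of them.
    for step in range(n - 1):
        sends = [((r + 1 - step) % n,
                  [data[r][i] for i in chunk((r + 1 - step) % n)])
                 for r in range(n)]
        for r, (c, payload) in enumerate(sends):
            dst = (r + 1) % n
            for i, value in zip(chunk(c), payload):
                data[dst][i] = value
    return data


workers = [[1, 2, 3, 4], [10, 20, 30, 40]]
print(ring_allreduce(workers))
```

Each worker transmits the same amount of data in every step, so no single link becomes a hot spot. That is the general property the slide's "fully utilize bandwidth" statement points at.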

Auto Hyper-Parameter Tuning
Hyper-parameters
– Learning rate
– Decay rate
– Batch size
– Optimizer
• Gradient Descent, Adadelta, Momentum, RMSProp …
– Momentum (for some optimizers)
– LSTM hidden unit size
Search strategies
– Random
– Tree-structured Parzen Estimator (TPE)
– Bayesian
Multi-tenant Spark Cluster: IBM Spectrum Conductor with Spark
Spark search jobs are generated dynamically and executed in parallel
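
Of the search strategies listed, random search is the simplest to sketch. The example below is a minimal illustration under stated assumptions: a thread pool stands in for the parallel Spark jobs, `validation_loss` is a synthetic stand-in for a real training run, and all names and value ranges are hypothetical. TPE and Bayesian search are not shown.

```python
import math
import random
from concurrent.futures import ThreadPoolExecutor


def sample_config(rng):
    """Draw one hyper-parameter configuration at random."""
    return {
        "learning_rate": 10 ** rng.uniform(-4, -1),   # log-uniform draw
        "batch_size": rng.choice([32, 64, 128, 256]),
        "optimizer": rng.choice(["sgd", "adadelta", "momentum", "rmsprop"]),
    }


def validation_loss(config):
    """Synthetic objective standing in for a training job.

    Hypothetical: lowest near learning_rate=1e-2 and batch_size=128.
    """
    lr_term = (math.log10(config["learning_rate"]) + 2) ** 2
    bs_term = abs(math.log2(config["batch_size"]) - 7) * 0.1
    return lr_term + bs_term


def random_search(num_trials, max_workers=4, seed=0):
    """Evaluate num_trials random configurations and return the best one."""
    rng = random.Random(seed)
    configs = [sample_config(rng) for _ in range(num_trials)]
    # Trials are independent of each other, so they can run in parallel.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        losses = list(pool.map(validation_loss, configs))
    best = min(range(num_trials), key=losses.__getitem__)
    return configs[best], losses[best]


best_config, best_loss = random_search(num_trials=50)
print(best_config, round(best_loss, 4))
```

Because every trial is independent, the same structure maps naturally onto a cluster scheduler: each configuration becomes one job, and only the final comparison needs all results.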

Snap Machine Learning (ML) Library
Snap ML: Distributed GPU-Accelerated Machine Learning Library
• libGLM (C++ / CUDA Optimized Primitive Lib)
• Distributed Training
• Logistic Regression, Linear Regression, Support Vector Machines (SVM), More Coming Soon
• Distributed Hyper-Parameter Optimization
• APIs for Popular ML Frameworks
(coming soon)

Snap ML: Training Time Goes From An Hour to Minutes
46x faster than previous record set by Google
Workload: Click-through rate prediction for advertising
Logistic Regression Classifier in Snap ML using GPUs vs TensorFlow using CPU-only
Dataset: Criteo Terabyte Click Logs (http://labs.criteo.com/2013/12/download-terabyte-click-logs/)
4 billion training examples, 1 million features
Model: Logistic Regression: TensorFlow vs Snap ML
Test LogLoss: 0.1293 (Google using TensorFlow), 0.1292 (Snap ML)
Platform: 89 CPU-only machines in Google using TensorFlow versus 4 AC922 servers (each 2 Power9 CPUs + 4 V100 GPUs) for Snap ML
Google data from this Google blog
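
The comparison above is reported as test LogLoss. The slide does not define the metric; assuming it is the standard mean binary cross-entropy used for click-through prediction, it can be sketched as follows. The function name and the sample values are illustrative.

```python
import math


def log_loss(labels, probabilities, eps=1e-15):
    """Mean binary cross-entropy; labels are 0 or 1, lower is better."""
    if not labels or len(labels) != len(probabilities):
        raise ValueError("need two non-empty sequences of equal length")
    total = 0.0
    for y, p in zip(labels, probabilities):
        p = min(max(p, eps), 1.0 - eps)   # clip to avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(labels)


# Hypothetical predictions for four ad impressions, one of which was clicked.
clicks = [1, 0, 0, 0]
predicted = [0.8, 0.1, 0.2, 0.05]
print(f"log loss: {log_loss(clicks, predicted):.4f}")
```

On this reading, the two reported values (0.1293 and 0.1292) indicate that the models reach essentially the same quality, so the comparison is about training time.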

PowerAI Vision: “Point-and-Click” AI for Images & Video
Label Image or Video Data → Auto-Train AI Model → Package & Deploy AI Model

Semi-Automatic Labeling using PowerAI Vision
• Define Labels
• Manually Label Some Images / Video Frames
• Train DL Model
• Use Trained DL Model: Run Trained DL Model on Entire Input Data to Generate Labels
• Manually Correct Labels on Some Data
• Repeat Till Labels Achieve Desired Accuracy
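
The labeling loop above can be sketched as a small simulation. Everything here is hypothetical and is not PowerAI Vision's API: `true_label` stands in for the human annotator, a one-dimensional nearest-neighbour rule stands in for the deep learning model, and the data points are synthetic.

```python
import itertools
import random


def true_label(x):
    """Stand-in for a human annotator (hypothetical ground truth)."""
    return "defect" if x >= 0.6 else "ok"


def train(labeled):
    """Toy 'model': predict the label of the nearest labeled point."""
    points = list(labeled.items())

    def predict(x):
        return min(points, key=lambda item: abs(item[0] - x))[1]

    return predict


def semi_automatic_labeling(data, initial=5, review=20, target=0.95, seed=0):
    """Label all of `data`, asking the 'human' only for a few items per round."""
    if not 0.0 < target <= 1.0:
        raise ValueError("target must be in (0, 1]")
    rng = random.Random(seed)
    # Manually label some items.
    labeled = {x: true_label(x) for x in rng.sample(data, initial)}
    for round_no in itertools.count(1):
        model = train(labeled)                       # train the model
        auto = {x: model(x) for x in data}           # label the entire input
        sample = rng.sample(data, review)            # human reviews a subset
        wrong = [x for x in sample if auto[x] != true_label(x)]
        accuracy = 1 - len(wrong) / review
        if accuracy >= target:                       # desired accuracy reached
            return auto, round_no, accuracy
        for x in wrong:                              # manually correct labels
            labeled[x] = true_label(x)


data = [i / 200 for i in range(200)]                 # unique synthetic inputs
labels, rounds, accuracy = semi_automatic_labeling(data)
print(f"rounds: {rounds}, reviewed accuracy: {accuracy:.2f}")
```

Each failed round adds at least one newly corrected item to the labeled set, so the loop always terminates. The human effort per round stays limited to the reviewed sample rather than the whole data set.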

Notice and disclaimers
Copyright © 2018 by International Business Machines Corporation (IBM). No part of this document may be reproduced or transmitted in any form without written permission
from IBM.
U.S. Government Users Restricted Rights — use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM.
Information in these presentations (including information relating to products that have not yet been announced by IBM) has been reviewed for accuracy as of the date of
initial publication and could include unintentional technical or typographical errors. IBM shall have no responsibility to update this information. This document is distributed
“as is” without any warranty, either express or implied. In no event shall IBM be liable for any damage arising from the use of this information, including but not limited
to, loss of data, business interruption, loss of profit or loss of opportunity. IBM products and services are warranted according to the terms and conditions of the
agreements under which they are provided.
IBM products are manufactured from new parts or new and used parts. In some cases, a product may not be new and may have been previously installed. Regardless, our
warranty terms apply.
Any statements regarding IBM's future direction, intent or product plans are subject to change or withdrawal without notice.
Performance data contained herein was generally obtained in controlled, isolated environments. Customer examples are presented as illustrations of how those customers
have used IBM products and the results they may have achieved. Actual performance, cost, savings or other results in other operating environments may vary.
References in this document to IBM products, programs, or services do not imply that IBM intends to make such products, programs or services available in all countries
in which IBM operates or does business.
Workshops, sessions and associated materials may have been prepared by independent session speakers, and do not necessarily reflect the views of IBM. All materials and
discussions are provided for informational purposes only, and are neither intended to, nor shall constitute legal or other guidance or advice to any individual participant or
their specific situation.
It is the customer’s responsibility to ensure its own compliance with legal requirements and to obtain advice of competent legal counsel as to the identification and
interpretation of any relevant laws and regulatory requirements that may affect the customer’s business and any actions the customer may need to take to comply with such
laws. IBM does not provide legal advice or represent or warrant that its services or products will ensure that the customer is in compliance with any law.

Notice and disclaimers cont.
Information concerning non-IBM products was obtained from the suppliers of those products, their published announcements or other publicly available
sources. IBM has not tested those products in connection with this publication and cannot confirm the accuracy of performance, compatibility or any other
claims related to non-IBM products. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products. IBM does
not warrant the quality of any third-party products, or the ability of any such third-party products to interoperate with IBM’s products. IBM expressly disclaims
all warranties, expressed or implied, including but not limited to, the implied warranties of merchantability and fitness for a particular purpose.
The provision of the information contained herein is not intended to, and does not, grant any right or license under any IBM patents, copyrights, trademarks
or other intellectual property right.
IBM, the IBM logo, ibm.com, AIX, BigInsights, Bluemix, CICS, Easy Tier, FlashCopy, FlashSystem, GDPS, GPFS, Guardium, HyperSwap, IBM Cloud
Managed Services, IBM Elastic Storage, IBM FlashCore, IBM FlashSystem, IBM MobileFirst, IBM Power Systems, IBM PureSystems, IBM Spectrum, IBM
Spectrum Accelerate, IBM Spectrum Archive, IBM Spectrum Control, IBM Spectrum Protect, IBM Spectrum Scale, IBM Spectrum Storage, IBM Spectrum
Virtualize, IBM Watson, IBM Z, IBM z Systems, IBM z13, IMS, InfoSphere, Linear Tape File System, OMEGAMON, OpenPower, Parallel Sysplex, Power,
POWER, POWER4, POWER7, POWER8, Power Series, Power Systems, Power Systems Software, PowerHA, PowerLinux, PowerVM, PureApplication,
RACF, Real-time Compression, Redbooks, RMF, SPSS, Storwize, Symphony, SystemMirror, System Storage, Tivoli, WebSphere, XIV, z Systems, z/OS,
z/VM, z/VSE, zEnterprise and zSecure are trademarks of International Business Machines Corporation, registered in many jurisdictions worldwide. Other
product and service names might be trademarks of IBM or other companies. A current list of IBM trademarks is available on the Web at "Copyright and
trademark information" at: www.ibm.com/legal/copytrade.shtml.
Linux is a registered trademark of Linus Torvalds in the United States, other countries, or both. Java and all Java-based trademarks and logos are
trademarks or registered trademarks of Oracle and/or its affiliates.
