Webinar: Cutting Time, Complexity and Cost from Data Science to Production

February 2019
Cutting Time, Complexity and Costs
from Data Science to Production

Agenda
● Data science challenges
● Iguazio data science PaaS over Kubernetes
● NVIDIA solutions to accelerate data science with Kubernetes
  o GPU integration, TensorRT, RAPIDS
● Hands-on tutorial
  o End-to-end application: real-time predictive infrastructure monitoring (ingest, explore, hyperparameter training, deploy to production)
  o Serverless and scale-out data science
  o NVIDIA RAPIDS
● Summary
● Q&A

Today: ML Lifecycle is Complex and Siloed
● Data Prep & Analytics (Data Engineers): ETL, Data Lakes/Warehouses, CSVs
● Model Building (Data Scientists): Active Data (CSV/in-mem), GPU, Model; feedback loops: "Need more fresh data", "Tune model"
● Model Deployment (Data Engineers and App Developers): ML Model Serving, App Deployment, Interactive App, Stream Processing, Triggers and Interactions, Database

ML Challenges in Real Life
● Re-coding & instrumenting
● AI Model “Depth” & Accuracy vs Performance & Costs
● Observability & Reproducibility
● Infrastructure and Software Complexity
● Can we gather (and prep) model features in production?

Solution: Fast & Continuous Data Science Pipeline
● Collect: Constantly Ingest, Clean & Tag Data via “Collectors”
● Develop: “Serverless” Functions & Notebooks
● Build & Test: CI/CD for Code & Models
● ML Model Training (CPU, GPU)
● Deploy to Production: Intelligent Serverless Run-Time, In Cloud, On-prem or Edge
● Triggers and Interactions
● Monitor & Reiterate
● Develop and Iterate Faster
● Deploy in Any Cloud or Edge
● Deliver Accurate Results in Real-time

Iguazio: Open & High-Performance Data-Science PaaS
● Managed & hardened open-source plus 3rd party services and apps
● Secure real-time data sharing enabling collaboration & parallelism
● Self-service experience from A to Z
● Built on a cloud-native architecture
[Diagram labels: External Data; Real-time Structured & Unstructured Data Fabric; Compute (CPU, GPU)]

Develop Faster, Run Faster, Use Less Resources
Managed Jupyter: Data science notebooks and online IDE
● Serverless notebooks: self-service, scale to zero on idle
● Simplify, secure and accelerate data access and processing
● Accelerate applications and training using shared GPUs and ML services
● One-click deployment to production (as jobs, real-time functions and dashboards)
[Diagram labels: Time Series, Stream, Table, Object; GPU; Historical and real-time data from a variety of sources; Integrated, 3rd party or cloud ML services on-demand]

Deploy Faster to Production with Serverless
Nuclio: the leading open-source serverless for real-time intelligence
● Minimize software development and maintenance overhead
● Extreme performance (up to 370K events/sec per process, 0.1 ms latency, fast data access)
● Open, supports many event/data sources: HTTP, streaming, messaging, jobs
● One-click deployment from many sources (code, containers, notebooks, git, templates)
● One-click deployment to cloud, on-prem or edge
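
To make the serverless model concrete, here is a minimal sketch of a Nuclio-style Python function. The `handler(context, event)` entry point, `event.body` and `context.logger` follow Nuclio's Python runtime; the `cpu_util` field, the 90% threshold and the `invoke_locally` stub are hypothetical and only stand in for a real model and a real trigger.

```python
import json
from types import SimpleNamespace


def handler(context, event):
    """Nuclio-style entry point: score one metrics event and reply as JSON."""
    body = event.body
    # Depending on the trigger, the body may arrive already parsed or as raw bytes/text.
    metrics = body if isinstance(body, dict) else json.loads(body)
    cpu = float(metrics.get("cpu_util", 0.0))
    # Hypothetical rule standing in for a trained model's prediction.
    at_risk = cpu > 90.0
    context.logger.info("scored event cpu_util=%s at_risk=%s" % (cpu, at_risk))
    return json.dumps({"cpu_util": cpu, "at_risk": at_risk})


class _StubLogger:
    """Local stand-in for the logger Nuclio injects through `context`."""

    def info(self, message):
        print(message)


def invoke_locally(payload):
    """Call the handler outside Nuclio with stubbed context and event objects."""
    context = SimpleNamespace(logger=_StubLogger())
    event = SimpleNamespace(body=json.dumps(payload).encode("utf-8"))
    return json.loads(handler(context, event))


if __name__ == "__main__":
    print(invoke_locally({"cpu_util": 97.5}))
```

The same `handler` body can be exercised from a notebook through `invoke_locally` before it is deployed, which is the develop-then-deploy flow the slide describes.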

Kubernetes
Kubernetes Helps Simplify the Use of Clusters and GPUs
Think of Kubernetes as an operating system for a cluster.
Kubernetes manages nodes, administers access, launches containers, jobs and more.
[Diagram labels: Master Server (API Server, Replication Controller, Scheduler); Worker x4; Daemon x4; Containers]
Infrastructure as code: e.g. PyTorch Training Job
pytorch-job.yml
---
apiVersion: batch/v1
kind: Job
metadata:
  name: pytorch-example
spec:
  backoffLimit: 5              # retry the pod up to 5 times before marking the Job failed
  template:
    spec:
      imagePullSecrets:
        - name: nvcr.dgxkey    # credentials for pulling from nvcr.io
      containers:
        - name: pytorch-container
          image: nvcr.io/nvidia/pytorch:18.06-py3
          command: ["/bin/sh"]
          args: ["-c", "python /examples/mnist/main.py"]
          resources:
            limits:
              nvidia.com/gpu: 1    # request one GPU from the scheduler
      restartPolicy: Never     # Jobs require Never or OnFailure
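
The "infrastructure as code" idea can be taken one step further by generating the Job manifest from a script. The sketch below is an illustration, not part of the webinar material: the `gpu_job_manifest` helper and its parameters are hypothetical, and `restartPolicy: Never` is included because the Job API requires `Never` or `OnFailure`. It uses only the standard library and prints JSON, which `kubectl apply -f` accepts as well as YAML.

```python
import json


def gpu_job_manifest(name, image, script, gpus=1, backoff_limit=5):
    """Return a batch/v1 Job manifest (as a dict) that requests `gpus` NVIDIA GPUs."""
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": name},
        "spec": {
            "backoffLimit": backoff_limit,
            "template": {
                "spec": {
                    # Pull secret name taken from the slide's example.
                    "imagePullSecrets": [{"name": "nvcr.dgxkey"}],
                    "containers": [{
                        "name": name + "-container",
                        "image": image,
                        "command": ["/bin/sh"],
                        "args": ["-c", "python " + script],
                        # GPUs are requested as an extended resource limit.
                        "resources": {"limits": {"nvidia.com/gpu": gpus}},
                    }],
                    "restartPolicy": "Never",
                }
            },
        },
    }


manifest = gpu_job_manifest(
    name="pytorch-example",
    image="nvcr.io/nvidia/pytorch:18.06-py3",
    script="/examples/mnist/main.py",
    gpus=2,
)

if __name__ == "__main__":
    print(json.dumps(manifest, indent=2))
```

Keeping the GPU count and image as parameters means the same template can be reused for different training runs without hand-editing the manifest.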

Re-Imagining Data Science Workflow
Open Source, End-to-end GPU-accelerated Workflow Built On CUDA
● cuDF: Data preparation / wrangling
● cuML: Optimized ML model training
● Visualization: Data visualization libraries, data insights

RAPIDS – GPU Accelerated Data Science
Software Stack: Python
● Workflow stages: Data Preparation, Model Training, Visualization
● RAPIDS libraries: cuDF, cuML, cuGraph
● Alongside: Dask, deep learning frameworks, cuDNN
● Foundation: CUDA, Apache Arrow on GPU memory
● Read/write RAPIDS dataframes directly into the Iguazio database & file system

Faster Speeds, Real World Benefits
[Bar charts: time in seconds, shorter is better. Panels: cuIO/cuDF (Load and Data Preparation), cuML (XGBoost), End-to-End. Legend: cuIO / cuDF (Load and Data Preparation), Data Conversion, XGBoost.]
Configurations, in bar order: 20 CPU Nodes, 30 CPU Nodes, 50 CPU Nodes, 100 CPU Nodes, DGX-2, 5x DGX-1
Bar values as extracted, in the same order: 2,290 / 1,956 / 1,999 / 1,948 / 169 / 157 and 2,741 / 1,675 / 715 / 379 / 42 / 19
Benchmark: 200GB CSV dataset; data preparation includes joins, variable transformations.
CPU Cluster Configuration: CPU nodes (61 GiB of memory, 8 vCPUs, 64-bit platform), Apache Spark
DGX Cluster Configuration: 5x DGX-1 on InfiniBand network
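
To put the bar values in perspective, the short sketch below turns the two six-value series listed on this slide into speedup factors relative to the 100-CPU-node cluster. The series are labelled neutrally ("first series", "second series") in the order they appear; the `speedups` helper and the choice of baseline are illustrative assumptions, not part of the benchmark.

```python
# Bar values (seconds) as listed on the benchmark slide, in this order:
# 20, 30, 50, 100 CPU nodes, DGX-2, 5x DGX-1.
CONFIGS = ["20 CPU nodes", "30 CPU nodes", "50 CPU nodes",
           "100 CPU nodes", "DGX-2", "5x DGX-1"]
SERIES = {
    "first series": [2290, 1956, 1999, 1948, 169, 157],
    "second series": [2741, 1675, 715, 379, 42, 19],
}


def speedups(times, baseline="100 CPU nodes"):
    """Return how many times faster each configuration is than the baseline."""
    base = times[CONFIGS.index(baseline)]
    return {cfg: round(base / t, 1) for cfg, t in zip(CONFIGS, times)}


if __name__ == "__main__":
    for name, times in SERIES.items():
        print(name, speedups(times))
```

Against the 100-node CPU cluster, the 5x DGX-1 configuration comes out roughly 12x faster on the first series and roughly 20x faster on the second.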

TensorRT – GPU Powered Inference Server
Available with Monthly Updates
Models supported
● TensorFlow GraphDef/SavedModel
● TensorFlow and TensorRT GraphDef
● TensorRT Plans
● Caffe2 NetDef (ONNX import)
Multi-GPU support
Concurrent model execution
Server HTTP REST API/gRPC
Python/C++ client libraries

Details: https://developer.nvidia.com/tensorrt
NVIDIA TensorRT Over Kubernetes & Iguazio
[Diagram labels: Nuclio Function (Serverless); Time Series DB]

Summary
Production Deployment of Intelligent Applications
● Eliminate complexity through pre-integrated managed services
● Leverage parallelism and hardware acceleration to improve ROI
● Consolidate data engineering, science and app dev platforms
● Focus on the end goal: Build and Deploy Intelligent Apps Faster

info@iguazio.com | www.iguazio.com
Thank You

Iguazio Smart Unified Real-time DB & File-System
● Many APIs and models on the same data
  o SQL, NoSQL, time series, stream, files
  o Custom APIs, streaming, sync and ETLs
● Minimize CPU, mem, and ops overhead
● In-memory performance, at 1/30 of the cost and 30x the density (on Flash)
● Real-time time series & data analytics
● Fine-grained security
[Diagram labels: Apps & Users; Backup; S3, ETL, Streams; Real-time Firewall (granular security); Smart Real-time DB (many standard & open APIs on a unified DB engine); High-Speed Fabric; 100TB NVMe Flash, direct attached (use NVMe Flash as an extension of memory)]

Real-time Intelligent Infrastructure Management
Auto-Healing Network Operations
● Replaced a complex Hadoop based data pipeline that was never productized
● Cross correlating real-time data from multiple sources with historical data
● AI-based predictions trigger pre-programmed actions that fix evolving problems in the network
● Implemented within weeks of initial deployment
Singtel uses Iguazio to predict network outages and avoid them in real-time
Singtel’s self-healing network is the perfect example of a client shifting from reactive to proactive with Iguazio

Real-time Intelligent Infrastructure Management
Maintaining Continuous Fast Response for 2nd Tier Cloud Services
Analyzing and predicting cloud service response time for optimal results
● Real-time Data Ingestion: from multiple monitoring tools including Jennifer and Zabbix
● Anomaly Detection: accurate anomaly detection with an order of magnitude fewer false positives than the previous Elasticsearch-based platform
● Root Cause Analysis: real-time root cause analysis from multiple factors, for example correlating changes in server CPU usage and application response times that occur simultaneously
● Predictive Analytics: predicting response times and sending real-time alerts indicating which factors need to be adjusted to avoid malfunctions
From deployment to completion in less than two weeks!

Evolve Into an Agile Cloud-Native Architecture
From a Legacy & Resource Intensive Architecture To Simpler & Modern Approach
[Diagram labels: YARN, HBase, HDFS, MapReduce, Pig, Hive, ..; DBaaS, S3 (object); Data, Orchestration, Middleware, Your Business Logic; Consume, Innovate; Serverless, Data-Science, BigData]
