TOP 5 LESSONS LEARNED IN DEPLOYING AI IN THE REAL WORLD
Joshua Robinson, Founding Engineer
© 2018 PURE STORAGE INC.2
AI REVOLUTIONIZING EVERY INDUSTRY
Kitchen - LG
LG InstaView refrigerator with AI-powered Alexa helps owners shop for groceries with their voice
Healthcare - Mayo Clinic
Finds genetic markers in images to avoid surgery for tumor samples & recommend treatments
Farming - Blue River
10% of lettuce in the US is harvested by LettuceBot, using AI to maximize crop yield & minimize chemicals
Marketing - SAP
SAP’s Brand Impact tool measures ROI of brand sponsorships using video analytics and AI
Consumer Goods - P&G
Procter & Gamble’s Olay is using AI to inspect skin & improve trouble spots, all with a mobile phone
Automotive - Volvo
Volvo and its subsidiary, Zenuity, are racing to build fully autonomous cars by 2021
SOFTWARE 2.0
AI REPRESENTS FUNDAMENTAL SHIFT IN HOW SOFTWARE IS WRITTEN
Software 2.0: https://medium.com/@karpathy/software-2-0-a64152b37c35
Comic: https://xkcd.com/303/
QUESTION ON EVERYONE’S MIND:
WHY IS A STORAGE COMPANY HERE?
THE BIG BANG OF INTELLIGENCE
FUELED BY PARALLEL COMPUTE, NEW ALGORITHMS, AND BIG DATA
NEW ALGORITHMS
Massively Parallel, Delivering Superhuman Accuracy
MODERN COMPUTE
Massively Parallel Architecture Driving Performance
GPU - THOUSANDS OF CORES
BIG DATA
Data is the New Oil
50 Zettabytes Created in 2020
TECHNOLOGIES OF THE BIG BANG
WHAT CUSTOMERS DEPLOY
FRAMEWORKS | GPU SERVER | STORAGE
DATA IS VITAL TO MACHINE LEARNING
OBSERVATION BY PROF. ANDREW NG, AI LUMINARY
TOP 5 LESSONS LEARNED
1. AI is a Data Pipeline
WHAT MOST THINK IS AI
NEW POSSIBILITIES
For Nearly Every Industry
FRAMEWORKS
To Get Started
GPU
The Engine
AI IS SO MUCH MORE
“Hidden Technical Debt in Machine Learning Systems”, Google, NIPS 2015
COMPLEXITIES OF AI IN PRODUCTION
INGEST | From sensors, machines, & user generated
CLEAN & TRANSFORM | CPU Servers | Label, anomaly detection, ETL, prep, stage
EXPLORE | GPU Server | Quickly iterate to converge on models
TRAIN | GPU Production Cluster | Run for hours to days in production cluster
COPY & TRANSFORM between each pair of stages
WIDE RANGE OF NEEDS IN AI PIPELINE
SIGNIFICANT CHALLENGE TO LEGACY STORAGE
STAGE | INGEST | CLEAN & TRANSFORM | EXPLORE | TRAIN
Source / Compute | From sensors & machines | CPU Servers | GPU Server | GPU Production Cluster
Access Pattern | sequential | sequential or random | random | random
Access Type | write | read & write | read | read
File Size | mostly large | small to large | small to large | mostly small
Concurrency | high | high | low | high
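To make the sequential versus random distinction concrete, here is a minimal Python sketch that reads the same file in both orders. The helper names block_offsets and read_blocks are hypothetical and not from the talk; the sketch only illustrates the two access patterns and is not a storage benchmark.

```python
import random


def block_offsets(file_size, block_size, pattern, seed=0):
    """Return block start offsets in sequential or shuffled (random) order."""
    offsets = list(range(0, file_size, block_size))
    if pattern == "random":
        # Same blocks as the sequential case, visited in a shuffled order.
        random.Random(seed).shuffle(offsets)
    elif pattern != "sequential":
        raise ValueError(f"unknown pattern: {pattern}")
    return offsets


def read_blocks(path, offsets, block_size):
    """Read one block at each offset and return the total bytes read."""
    total = 0
    with open(path, "rb") as f:
        for offset in offsets:
            f.seek(offset)
            total += len(f.read(block_size))
    return total
```

Timing read_blocks with time.perf_counter for each pattern, on a file much larger than the page cache, gives a rough feel for how a given storage system handles each column of the table.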
TOP 5 LESSONS LEARNED
1. AI is a Data Pipeline
2. Don’t Throw Your Data into a Data Lake
DATA LAKE OR DATA GRAVEYARD?
“We see customers creating big data graveyards, dumping everything into HDFS [Hadoop Distributed File System] and hoping to do something with it down the road. But then they just lose track of what’s there. The main challenge is not creating a data lake, but taking advantage of the opportunities it presents.”
PricewaterhouseCoopers, Technology Forecast, Issue 1, 2014
THE OLD WORLD OF DATA ANALYTICS
15 YEARS AGO
Google File System was introduced, inspiring creation of Hadoop & HDFS
ASSUMPTIONS ABOUT DATA IN GFS & HDFS
Typical File is Large
Access is Sequential
Hardware Failure is the Norm
Data is Batched
Network is Slow
DATA PLATFORM IS DISTRIBUTED DISKS
Lots of Disk in Nodes
3x Data Replication
Batched Workflow
Fixed Compute to Storage
...
MODERN DATA CHANGES EVERYTHING
DATA IS NOW DIFFERENT: DECADE AGO VS. TODAY
Small to Large Files
Random to Sequential Access
Real-time or Batched
Apps & Data Evolve Quickly
Elastic Infrastructure
MODERN ANALYTICS WITH OLD DATA LAKE
SPRAWLING, COMPLEX SILOS & SLOW PERFORMANCE
STATIC DATA LAKE NO LONGER VIABLE
Each App Locked into Physical Silos
Redundant Data Copies in Silos
Fixed Compute to Storage in Silo
Built for Large, Sequential Data
Optimized for Batch, Not Real-Time
[Diagram: HDFS data lake alongside five separate silos]
TOP 5 LESSONS LEARNED
1. AI is a Data Pipeline
2. Don’t Throw Your Data into a Data Lake
3. Cloud or Not to Cloud?
IT DEPENDS
WHERE YOU ARE ON YOUR AI JOURNEY
STAGE | EXPLORATION | PRODUCTION
NEED | Start Immediately | Get New Products & Features to Market Faster than Competition
DON’T NEED | Bogged Down with Infrastructure | Bogged Down by Performance & Cost Inefficiencies
RECOMMENDATION | Cloud | On-Premises
COST INEFFICIENCIES OF CLOUD
Comparing NVIDIA DGX-1 (Volta) & Pure Storage FlashBlade vs AWS EC2 & S3
[Chart: cumulative cost ($) over time at 1 Year, 2 Years, 3 Years]
DGX-1 + FlashBlade | ~$300K
Cloud | ~$800K in 3 Years
> 60% Savings
Comparing DGX-1 Volta with FlashBlade system to AWS p3.16xlarge instance
AWS p3.16xlarge instance = $24.48/hour
AWS S3 cost: GET op = $0.004 per 10K requests; other storage costs ignored
8 Volta GPUs deliver 4,100 images/sec with Caffe2, https://caffe2.ai/blog/2017/05/10/caffe2-adds-FP16-training-support.html
Assume 100% utilization for 3 years
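The cloud figure can be approximately reproduced from the stated inputs. The sketch below is a back-of-the-envelope check in Python, not the presenter's actual model. It assumes one S3 GET request per training image, which the slide does not state, and the function names are hypothetical.

```python
HOURS_PER_YEAR = 24 * 365  # ignores leap days


def cloud_cost_usd(years, instance_usd_per_hour=24.48, images_per_sec=4100,
                   get_usd_per_10k=0.004):
    """Estimate cloud cost at 100% utilization: instance time plus S3 GETs."""
    hours = years * HOURS_PER_YEAR
    compute = hours * instance_usd_per_hour
    # Assumption (not stated on the slide): one S3 GET per image read.
    get_requests = images_per_sec * 3600 * hours
    storage_requests = get_requests / 10_000 * get_usd_per_10k
    return compute + storage_requests


def savings_fraction(on_prem_usd, cloud_usd):
    """Fraction saved by the on-premises system relative to the cloud cost."""
    return 1 - on_prem_usd / cloud_usd


if __name__ == "__main__":
    cloud = cloud_cost_usd(3)
    print(f"cloud, 3 years: ${cloud:,.0f}")
    print(f"savings vs ~$300K: {savings_fraction(300_000, cloud):.0%}")
```

Under these assumptions the three-year cloud cost comes to roughly $798K, of which about $643K is instance time, and the savings against ~$300K are a little over 60%, in line with the slide. Lower utilization would shrink the gap, since the cloud cost scales with hours used.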
TOP 5 LESSONS LEARNED
1. AI is a Data Pipeline
2. Don’t Throw Your Data into a Data Lake
3. Cloud or Not to Cloud?
4. Lies, Damn Lies, and Benchmarks
BENCHMARKS DO NOT REFLECT REALITY
DATASET | IMAGENET | REAL-WORLD AUTONOMOUS CAR COMPANY
IMAGE SIZE | 100-200KB | 2-5MB
FILE SIZE | 150MB (Packed TFRecords) | 2-5MB
MODE OF TESTING | Synthetic (No I/O) | Read from Storage
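A rough way to see why image size matters is to convert an image rate into the read throughput storage must sustain. The Python sketch below uses the 4,100 images/sec figure quoted in the cost comparison purely as an illustrative rate. Pairing it with 2-5MB images is an assumption, since larger images would likely lower the achievable rate, and the function name is hypothetical.

```python
def required_read_gb_per_sec(images_per_sec, image_size_mb):
    """Read throughput needed to feed training, in decimal GB/s."""
    return images_per_sec * image_size_mb / 1000.0


if __name__ == "__main__":
    rate = 4100  # images/sec, illustrative only
    for label, low_mb, high_mb in [("ImageNet", 0.1, 0.2),
                                   ("Real-world", 2.0, 5.0)]:
        low = required_read_gb_per_sec(rate, low_mb)
        high = required_read_gb_per_sec(rate, high_mb)
        print(f"{label}: {low:.2f} to {high:.2f} GB/s")
```

At the same image rate, the real-world image sizes call for more than ten times the read throughput of ImageNet-sized images, which a synthetic, no-I/O benchmark never exercises.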
AI TRAINING SYSTEM
GOAL IS TO KEEP THE GPUs 100% BUSY
FULL TRAINING WORKFLOW
Resources: I/O, CPU, GPU
Steps: decode, scale, forward-propagation, evaluate, back-propagation, update
BENCHMARK SETUP
Setup #1 | GPU ONLY | Synthetic Data from System RAM into GPUs
Setup #2 | CPU + GPU | Real Image Data from System RAM Through CPU + GPU
Setup #3 | I/O + CPU + GPU | Real Image Data from FlashBlade into DGX-1
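One common way to keep the compute stage busy is to prepare inputs in the background while the previous batch is being processed. The Python sketch below shows that idea with a bounded queue and a producer thread. It is a generic illustration, not the pipeline used in the benchmark. load_and_decode and train_step are hypothetical stand-ins, and treating decode and scale as CPU work and the propagation steps as GPU work is an assumption about the diagram.

```python
import queue
import threading
import time


def load_and_decode(index):
    """Stand-in for the I/O and CPU steps (read, decode, scale)."""
    time.sleep(0.001)  # pretend to read and decode one image
    return index * 2   # hypothetical "decoded" sample


def train_step(sample):
    """Stand-in for the GPU steps (forward, evaluate, backward, update)."""
    time.sleep(0.001)  # pretend to compute on the accelerator
    return sample


def run_pipeline(num_samples, prefetch_depth=8):
    """Overlap input preparation with training through a bounded queue."""
    samples = queue.Queue(maxsize=prefetch_depth)
    done = object()  # sentinel marking the end of the input stream

    def producer():
        for i in range(num_samples):
            samples.put(load_and_decode(i))  # blocks when the queue is full
        samples.put(done)

    threading.Thread(target=producer, daemon=True).start()

    results = []
    while True:
        item = samples.get()  # blocks only if input preparation falls behind
        if item is done:
            break
        results.append(train_step(item))
    return results
```

If the consumer regularly finds the queue empty, input preparation is the bottleneck and the GPUs would sit idle. That is the gap between Setup #1 and Setup #3 that a real-data benchmark exposes.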
NEAR-LINEAR SCALE DELIVERED
AIRI ENGINEERED FOR MAXIMUM PRODUCTIVITY AND OUT-OF-THE-BOX SCALE
DEEP LEARNING TRAINING - MULTI-NODE USING GPUDIRECT RDMA OVER ETHERNET
Comparing Synthetic Mode, Entire Data in DRAM, Entire Data in FlashBlade
TOP 5 LESSONS LEARNED
1. AI is a Data Pipeline
2. Don’t Throw Your Data into a Data Lake
3. Cloud or Not to Cloud?
4. Lies, Damn Lies, and Benchmarks
5. Ideal Data Platform is a Data Hub
IDEAL PLATFORM FOR MODERN ERA
DYNAMIC DATA HUB ARCHITECTED FOR REAL-TIME & ELASTIC DATA
DATA PIPELINE
DATA HUB
“TUNED FOR EVERYTHING” | Small, Random to Large, Sequential | Architected for the Unknown
REAL-TIME | Low Latency Performance for Instant Response
ALL-FLASH | Modern, Ultra-Fast Technology
PARALLEL | No Serial Bottlenecks for Max Throughput
ELASTIC | Grow Non-Disruptively with More App Clusters
SIMPLE | Focus More on Insights, Not Infrastructure
THE INDUSTRY’S FIRST COMPLETE AI-READY INFRASTRUCTURE
HARDWARE
NVIDIA® DGX-1™ | 4x DGX-1 Systems | 4 PFLOPS of DL Performance
PURE FLASHBLADE™ | 15x 17TB Blades | 1.5M IOPS
ARISTA | 2x 100Gb Ethernet Switches with RDMA
SOFTWARE
NVIDIA GPU CLOUD DEEP LEARNING STACK | NVIDIA Optimized Frameworks
AIRI SCALING TOOLKIT | Multi-node Training Made Simple
AI & MODERN ANALYTICS
POWERING ANALYTICS FOR WORLD’S LARGEST PUBLIC HEDGE FUND
AI CLEAN & LABEL | CPU Servers
AI EXPLORE | GPU Server
AI TRAIN | GPU Servers
SPARK | CPU Servers
MONGO | CPU Servers
“Our quants want to test a model, get the results, and then test another one - all day long. So a 10-20X improvement in performance is a game-changer when it comes to creating a time-to-market advantage for us.”
Gary Collier, co-CTO, Man AHL
TOP 5 LESSONS LEARNED
1. AI is a Data Pipeline
2. Don’t Throw Your Data into a Data Lake
3. Cloud or Not to Cloud?
4. Lies, Damn Lies, and Benchmarks
5. Ideal Data Platform is a Data Hub