Running Tensorflow on Apache YARN –
A sneak peek into GPU Scheduling
Sunil Govindan
Apache Hadoop PMC member
YARN Team @ Hortonworks
2 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Agenda
 Overview of Machine Learning on Big Data Platform
 GPU support in Apache Hadoop YARN
 Tensorflow on YARN – example and demo
Overview:
Machine Learning on Big Data Platform
Machine learning workflow
[Figure: workflow diagram. Phases: Data Preprocessing, Feature Engineering, Model Training, Online Service. Boxes: Data, Feature Selection, Feature Transform, Feature Encoding, Feature Evaluation, Model Training, Model Evaluation, Model Validation, Model Staging, Experiment, Online Feature, Model Database, Model as Service, Real-time Feature Calibration.]
Machine learning (BigData) – Data Preprocessing
 Import data
– HDFS
– AWS
– RDBMS
 Join data
 Data exploration
 Data sample
 Training/Test random split
Machine learning (BigData) – Feature Engineering
 Feature transform/selection
 Feature embedding
Machine learning (BigData) – Model Training
 Traditional machine learning models
– Logistic Regression
– Gradient boosting tree
– Recommendation/ALS
– LDA
 Libraries
– Apache Spark MLlib
– XGBoost
 Deep learning models
– DNN
– CNN
– RNN
– LSTM
 Libraries
– TensorFlow
– Apache MXNet
– BigDL
Machine learning (BigData) – Model Serving
 Model deploy
 Model serving
– Batch
– Streaming
 Experiment
– offline
– online (A/B test)
GPU support in Apache Hadoop YARN
Machine learning platform on YARN
CPU GPU SSD
YARN: Data Operating System
(Cluster Resource Management)
Spark MLlib XGBoost Hive/LLAP Spark SQL TensorFlow
Zeppelin
HDFS AWS S3 RDBMS
Why GPU?
 GPUs can speed up the following computation-intensive applications by 10x to 300x:
– Gene analysis
– Deep learning
– Self-driving cars
– Scientific computation
 Without GPU acceleration, these computations are almost impossible in practice (a job would run for weeks).
Why GPU?
 GPU: many cores to handle massive (but simple) computation tasks simultaneously. The GPU takes the computation-intensive work; the CPU takes the rest.
Device                     Cores             Price      Price / core
Nvidia Tesla K40 (GPU)     2880 CUDA cores   $2200.00   $0.76
Intel Xeon E5-2697 (CPU)   14 cores          $2295.00   $164
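The per-core prices above are plain division, and can be checked with a short sketch. The prices and core counts are the ones quoted on the slide; the function name `cost_per_core` is only illustrative. This is a rough comparison, since a CUDA core and a CPU core are not equivalent units of compute.

```python
# Prices and core counts as quoted on the slide.
devices = {
    "Nvidia Tesla K40": {"price_usd": 2200.00, "cores": 2880},
    "Intel Xeon E5-2697": {"price_usd": 2295.00, "cores": 14},
}


def cost_per_core(price_usd, cores):
    """Return the price of a single core in USD."""
    return price_usd / cores


for name, spec in devices.items():
    per_core = cost_per_core(spec["price_usd"], spec["cores"])
    print(f"{name}: ${per_core:.2f} / core")
```

The Xeon figure comes out at about $163.93 per core, which rounds to $164.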
Why all under YARN?
A normal YARN user wants: SLA! Quotas! Monitoring! Isolation!
 Capacity Planning, Preemption, Reservation System.
 Timeline services, Grafana, etc.
 CPU / Memory, (WIP) GPU, FPGA, Network
All running on the same YARN platform
[Figure: a row of cluster nodes with 128 G of memory each; some are labeled LLAP and others GPUs.]
Current status of GPU support on YARN
 Using node labels (YARN-796), since Apache Hadoop 2.6.0
– Use node labels to partition one big cluster into smaller disjoint clusters, and assign shares/ACLs to queues.
– Issues: 1) GPU is not a countable resource in scheduling. 2) No proper isolation for GPU.
 The rest of GPU support is WIP; umbrella JIRA: YARN-6223
GPU support: Challenges
 GPU isolation
– Unlike memory / CPU, computation has affinity to a specific GPU device.
– Multiple processes using a single GPU are serialized (MPS is an exception).
– Multiple processes sharing the same GPU easily cause OOM.
• TF provides options to use less GPU memory than the whole device offers, but we cannot enforce this externally.
GPU support: Challenges
 Hierarchy of GPUs matters:
– GPU topology really matters: it strongly affects communication latency (Von Neumann bottleneck).
Picture credit: https://opus.nci.org.au
GPU support: Challenges
 GPU on Docker: "build once, run anywhere" is not simple:
 For a regular app:
 It can run on CentOS 6/7 or any other host, as long as the CPU architecture is the same.
 However, a GPU application needs a driver to talk to the hardware:
[Figure: container stacks on a host OS. A regular Nginx App container on Ubuntu 14.04 runs anywhere. A Tensorflow 1.2 container on Ubuntu 14.04 with CUDA Library 5.0 and GPU Base Lib v2 fails (X) on a host OS that has GPU Base Lib v1.]
GPU Support : Solutions
 GPU isolation:
– With the general resource types feature:
• detect and report the number of GPUs to the YARN scheduler, and the scheduler makes the central decision.
– For normal processes: use the cgroups devices subsystem (the same mechanism as CPU/memory isolation).
– For docker processes: pass --device on the command line when launching the docker container.
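The docker case above can be sketched as a helper that builds the `docker run` argument list for the GPUs the scheduler assigned to a container. This is a minimal illustration, not YARN's actual launcher code. The function name, the image name in the usage note, the `/dev/nvidia<N>` device paths and the shared control devices (`/dev/nvidiactl`, `/dev/nvidia-uvm`) are assumptions about a typical NVIDIA host; only the `--device` flag itself comes from the slide.

```python
def docker_gpu_command(image, gpu_indices, command):
    """Build a `docker run` argv that exposes only the assigned GPUs."""
    argv = ["docker", "run", "--rm"]
    # Assumption: GPU i is visible on the host as /dev/nvidia<i>.
    for index in sorted(gpu_indices):
        argv += ["--device", f"/dev/nvidia{index}"]
    # Assumption: control devices that every GPU container also needs.
    for shared in ("/dev/nvidiactl", "/dev/nvidia-uvm"):
        argv += ["--device", shared]
    argv.append(image)
    argv += list(command)
    return argv


if __name__ == "__main__":
    # "tf-image" is a hypothetical image name.
    print(" ".join(docker_gpu_command("tf-image", [1, 0], ["python", "train.py"])))
```

A container launched this way cannot open GPUs it was not given, because their device nodes are never mapped into it.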
GPU Support : Solutions
 GPU on Docker support
– By using nvidia-docker-plugin.
[Figure: a Tensorflow 1.2 container on Ubuntu 14.04 with CUDA Library 5.0; GPU Base Lib v1 is volume-mounted from the host OS.]
How the rest of YARN helps GPU support
 Node partition
– Without node partitions, YARN cannot guarantee the best GPU utilization. An example:
– There are two hosts in the cluster, and only Host1 has GPUs. At the beginning, the cluster is empty.
– At time T1, a user submits a Spark job that needs 10G of memory and 4 CPUs. Without node partitions, it could be placed on Host1.
– If another job then needs 15G of memory, 6 CPUs and 3 GPUs, it cannot be allocated: Host1 has only 10G of memory left and Host2 has no GPUs.
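Free resources before Task1:
Host          Mem   CPU   GPU
Host1 (GPU)   20G   8     4
Host2         20G   8     -
Free resources after Task1 (10G, 4 CPUs) lands on Host1:
Host          Mem   CPU   GPU
Host1 (GPU)   10G   4     4
Host2         20G   8     -
The second job (15G, 6 CPUs, 3 GPUs) cannot be placed (?).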
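The two-host example can be replayed with a small simulation. This is a sketch under simplifying assumptions: placement is first-fit in host order, a partition is just a string label on the host, and a request only matches hosts in the partition it names. The real YARN scheduler is far more involved; the class and function names here are illustrative only.

```python
from dataclasses import dataclass


@dataclass
class Host:
    name: str
    mem_gb: int
    cpus: int
    gpus: int = 0
    label: str = ""  # node partition; empty string is the default partition


def fits(host, mem_gb, cpus, gpus):
    return host.mem_gb >= mem_gb and host.cpus >= cpus and host.gpus >= gpus


def allocate(hosts, mem_gb, cpus, gpus=0, label=""):
    """First-fit placement within the requested partition; None if nothing fits."""
    for host in hosts:
        if host.label == label and fits(host, mem_gb, cpus, gpus):
            host.mem_gb -= mem_gb
            host.cpus -= cpus
            host.gpus -= gpus
            return host.name
    return None


def run_scenario(partitioned):
    """Return (host of the Spark job, host of the GPU job)."""
    gpu_label = "gpu" if partitioned else ""
    hosts = [Host("Host1", 20, 8, 4, gpu_label), Host("Host2", 20, 8)]
    spark_job = allocate(hosts, 10, 4)                    # no GPU needed
    gpu_job = allocate(hosts, 15, 6, 3, label=gpu_label)  # needs 3 GPUs
    return spark_job, gpu_job


if __name__ == "__main__":
    print("without partition:", run_scenario(partitioned=False))
    print("with partition:   ", run_scenario(partitioned=True))
```

Without the partition the Spark job takes Host1 and the GPU job gets nothing; with Host1 in its own partition the Spark job goes to Host2 and the GPU job fits on Host1.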
How the rest of YARN helps GPU support
 Resource Profiles
– A generalized resource vector
– Admins can create custom Resource Types!
– Simpler resource requests using profiles
[Figure: each NodeManager reports Memory, CPU and GPU to the ResourceManager, which holds the Small, Medium and Large profiles; the Application Master requests the Small profile.]
Profile   Memory   CPU        GPU
Small     2 GB     4 Cores    1 Core
Medium    4 GB     8 Cores    1 Core
Large     16 GB    16 Cores   4 Cores
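The idea of requesting by profile name can be sketched as a lookup that expands a name into a full resource vector. The three profiles use the values from the table above; the dictionary layout, the key names and the `overrides` parameter are hypothetical and are not YARN's own configuration format.

```python
# Profile values taken from the table on the slide; key names are illustrative.
PROFILES = {
    "Small": {"memory_gb": 2, "cpu": 4, "gpu": 1},
    "Medium": {"memory_gb": 4, "cpu": 8, "gpu": 1},
    "Large": {"memory_gb": 16, "cpu": 16, "gpu": 4},
}


def resolve(profile, overrides=None):
    """Expand a profile name into a resource vector, with optional overrides."""
    if profile not in PROFILES:
        raise KeyError(f"unknown resource profile: {profile}")
    vector = dict(PROFILES[profile])  # copy so the shared table is not mutated
    vector.update(overrides or {})
    return vector


if __name__ == "__main__":
    print(resolve("Small"))
    print(resolve("Large", {"gpu": 2}))
```

An application then only names a profile such as "Small" instead of spelling out every resource type, and newly added resource types can be given defaults in one place.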
Current development status (YARN-6223)
 Apache Hadoop 3.1.0 release (Jan 15, 2018)
– GPU auto detection (Merged)
– GPU scheduling in RM (Merged)
– GPU isolation using Cgroups. (Merged)
– GPU on docker isolation & volume. (Merged)
– UI / Metrics (Merged).
– Documentation (Open)
– Ambari changes (Open)
TensorFlow on Apache Hadoop YARN
YARN assembly: Makes everything easier!
 Forget about writing an application master. This is how you can run an app on YARN:
 Write an assembly spec in JSON (we call it a Yarnfile).
 Post the JSON as a REST request to the YARN server.
 YARN figures out the rest.
 An example:
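The example shown on the original slide is not preserved in this transcript. As a stand-in, the sketch below builds a JSON spec and prepares the POST request without sending it. The field names and the `/app/v1/services` path are modeled on the YARN Services REST API shipped with Hadoop 3.1 and should be checked against the release in use; the host name, image name, component name and resource sizes are hypothetical.

```python
import json
import urllib.request


def build_yarnfile(name, image, launch_command, containers=1, gpus=1):
    """Return a service spec (Yarnfile) as a plain dict."""
    return {
        "name": name,
        "version": "1.0.0",
        "components": [
            {
                "name": "worker",
                "number_of_containers": containers,
                "artifact": {"id": image, "type": "DOCKER"},
                "launch_command": launch_command,
                "resource": {
                    "cpus": 2,
                    "memory": "4096",
                    # Assumption: GPUs are requested as an additional resource type.
                    "additional": {"yarn.io/gpu": {"value": gpus}},
                },
            }
        ],
    }


def build_request(rm_url, spec):
    """Prepare (but do not send) the POST request carrying the spec."""
    return urllib.request.Request(
        url=f"{rm_url}/app/v1/services",
        data=json.dumps(spec).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    spec = build_yarnfile("tf-job", "tf-image", "python train.py", gpus=2)
    print(json.dumps(spec, indent=2))
    request = build_request("http://rm.example.com:8088", spec)
    print(request.get_method(), request.full_url)
    # urllib.request.urlopen(request) would submit it to a real cluster.
```

Keeping the spec as data means the same file can be versioned, reviewed and resubmitted without any application-master code.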
Demo….
Questions?

 

[Hadoop Meetup] Tensorflow on Apache Hadoop YARN - Sunil Govindan

  • 1. Running TensorFlow on Apache YARN: A Sneak Peek into GPU Scheduling. Sunil Govindan, Apache Hadoop PMC member, YARN Team @ Hortonworks
  • 2. Agenda: overview of machine learning on a big data platform; GPU support in Apache Hadoop YARN; TensorFlow on YARN (example and demo)
  • 3. Overview: Machine Learning on Big Data Platform
  • 4. Machine learning workflow (diagram). Stages: Data Preprocessing, Feature Engineering, Model Training, Online Service. Boxes: Data, Feature Selection, Feature Transform, Feature Encoding, Feature Evaluation; Feature, Model Training, Model Evaluation, Model Validation, Model Staging; Experiment, Online Feature, Model Database, Experiment, Model as Service, Real-time Feature, Calibration.
  • 5. Machine learning (Big Data): Data Preprocessing. Import data (HDFS, AWS, RDBMS); join data; data exploration; data sampling; random training/test split. (Diagram boxes: Data, Feature Selection, Feature Transform, Feature Encoding, Feature Evaluation, Feature Engineering.)
  • 6. Machine learning (Big Data): Feature Engineering. Feature transform/selection; feature embedding. (Diagram boxes: Data, Feature Selection, Feature Transform, Feature Encoding, Feature Evaluation, Feature Engineering.)
  • 7. Machine learning (Big Data): Model Training. Traditional machine learning models: logistic regression, gradient boosting tree, recommendation/ALS, LDA; libraries: Apache Spark MLlib, XGBoost. Deep learning models: DNN, CNN, RNN, LSTM; libraries: TensorFlow, Apache MXNet, BigDL. (Diagram boxes: Feature, Model Training, Model Evaluation, Model Validation, Model Staging.)
  • 8. Machine learning (Big Data): Model Serving. Model deploy; model serving (batch, streaming); experiment (offline, online A/B test). (Diagram boxes: Experiment, Online Feature, Model Database, Experiment, Model as Service, Real-time Feature, Calibration, Online Service.)
  • 9. GPU support in Apache Hadoop YARN
  • 10. Machine learning platform on YARN (diagram). Resources: CPU, GPU, SSD. YARN: Data Operating System (Cluster Resource Management). Frameworks and tools: Spark MLlib, XGBoost, Hive/LLAP, Spark SQL, TensorFlow, Zeppelin. Data sources: HDFS, AWS S3, RDBMS.
  • 11. Why GPU? GPUs can speed up the following computation-intensive applications by 10x to 300x: gene analysis, deep learning, self-driving cars, scientific computation. Without GPU speed-up, these computations are almost impossible in practice (a job may run for weeks).
  • 12. Why GPU? A GPU has many cores that handle massive (but simple) computation tasks simultaneously. (Diagram labels: GPU, CPU, Computation Intensive, Other.) Nvidia Tesla K40: 2880 CUDA cores, $2200.00 => $0.76 / core. Intel Xeon E5-2697: 14 cores, $2295.00 => $163 / core.
  • 13. Why all under YARN? A normal YARN user asks for: SLA! Monitoring! Quotas! Isolation! What YARN offers: capacity planning, preemption, reservation system; timeline services, Grafana, etc.; CPU / memory, (WIP) GPU, FPGA, network.
  • 14. All running on the same YARN platform. (Diagram: nodes with 128 G each, some running LLAP, some with GPUs.)
  • 15. Current status of GPU support on YARN. Node labels (YARN-796), available since Apache Hadoop 2.6.0: use node labels to partition one big cluster into smaller disjoint clusters, and assign shares/ACLs to queues. Issues: 1) GPU is not a countable resource in scheduling; 2) no proper isolation for GPUs. The rest of GPU support is WIP; umbrella JIRA: YARN-6223.
  • 16. GPU support: Challenges. GPU isolation: unlike memory / CPU, computation is tied to a specific GPU device. Multiple processes using a single GPU are serialized (MPS is an exception). Multiple processes sharing the same GPU easily cause OOM: TensorFlow provides options to use less GPU memory than the whole device offers, but this cannot be enforced from outside the process.
  • 17. GPU support: Challenges. The hierarchy of GPUs matters: GPU topology strongly affects communication latency (Von Neumann bottleneck). Picture credit: https://opus.nci.org.au
  • 18. GPU support: Challenges. GPU on Docker: "build once, run anywhere" is not simple. A regular app can run on CentOS 6/7 or any other host, as long as the CPU architecture is the same. A GPU application, however, needs a driver to talk to the hardware. (Diagram labels: Nginx App, Ubuntu 14.04, TensorFlow 1.2, CUDA Library 5.0, GPU Base Lib v2, Host OS, GPU Base Lib v1, "X Fails".)
  • 19. GPU support: Solutions. GPU isolation: with the general resource types feature, detect and report the number of GPUs to the YARN scheduler, and the scheduler makes a central decision. For normal processes: use the cgroups devices subsystem (the same mechanism as CPU/memory isolation). For Docker processes: pass the --device option on the command line when launching the Docker container.
  • 20. GPU support: Solutions. GPU on Docker support: by using nvidia-docker-plugin. (Diagram labels: TensorFlow 1.2, Ubuntu 14.04, Host OS, GPU Base Lib v1, Volume Mount, CUDA Library 5.0.)
  • 21. How the rest of YARN helps GPU support. Node partition: without node partitions, YARN cannot guarantee the best GPU utilization. Example: there are two hosts in the cluster, and only Host1 has GPUs. At the beginning, the cluster is empty (Host1: 20G memory, 8 CPUs, 4 GPUs; Host2: 20G memory, 8 CPUs). At time T1, a user submits a Spark job that needs 10G memory and 4 CPUs; without node partitions, it could be placed on Host1, leaving Host1 with 10G memory, 4 CPUs and 4 GPUs. If another job then needs 15G memory, 6 CPUs and 3 GPUs, it cannot be allocated.
  • 22. How the rest of YARN helps GPU support. Resource profiles: a generalized resource vector; admins can create custom resource types; profiles make requesting resources easier. Profiles (Memory / CPU / GPU): Small: 2 GB / 4 cores / 1 core; Medium: 4 GB / 8 cores / 1 core; Large: 16 GB / 16 cores / 4 cores. (Diagram: two NodeManagers, each with Memory, CPU and GPU; a ResourceManager with the Small, Medium and Large profiles; an Application Master requesting Small.)
  • 23. Current development status (YARN-6223). Apache Hadoop 3.1.0 release (Jan 15, 2018): GPU auto detection (Merged); GPU scheduling in RM (Merged); GPU isolation using cgroups (Merged); GPU on Docker isolation and volume (Merged); UI / metrics (Merged); documentation (Open); Ambari changes (Open).
  • 24. TensorFlow on Apache Hadoop YARN
  • 25. YARN assembly: makes everything easier! Forget about writing an application master; this is how you can run an app on YARN: write the assembly spec in JSON (we call it a Yarnfile); post the JSON as a REST request to the YARN server; YARN figures out the rest. An example:
  • 26. Demo
  • 27. Questions?
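
The node-partition example from slide 21 can be replayed as a small sketch. It assumes a plain first-fit placement over the hosts in list order, which is an illustration only and not YARN's actual scheduling algorithm; the names `Host`, `fits`, `place` and the partition labels `gpu` and `default` are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

Resources = Dict[str, int]  # e.g. {"mem_gb": 10, "cpu": 4, "gpu": 0}


@dataclass
class Host:
    name: str
    free: Resources
    partition: str = "default"  # hypothetical partition label


def fits(free: Resources, ask: Resources) -> bool:
    """True if every requested resource is available in the free vector."""
    return all(free.get(key, 0) >= amount for key, amount in ask.items())


def place(hosts: List[Host], ask: Resources,
          partition: Optional[str] = None) -> Optional[str]:
    """First-fit placement (an assumption, not YARN's real scheduler).

    Returns the name of the first host that can hold the request,
    or None if no host in the allowed partition has enough resources.
    """
    for host in hosts:
        if partition is not None and host.partition != partition:
            continue
        if fits(host.free, ask):
            for key, amount in ask.items():
                host.free[key] = host.free.get(key, 0) - amount
            return host.name
    return None


def new_cluster() -> List[Host]:
    # Capacities taken from the slide: only Host1 has GPUs.
    return [
        Host("Host1", {"mem_gb": 20, "cpu": 8, "gpu": 4}, partition="gpu"),
        Host("Host2", {"mem_gb": 20, "cpu": 8, "gpu": 0}, partition="default"),
    ]


spark_job = {"mem_gb": 10, "cpu": 4}
gpu_job = {"mem_gb": 15, "cpu": 6, "gpu": 3}

# Without partitions: the Spark job may land on the GPU host first.
cluster = new_cluster()
no_partition = (place(cluster, spark_job), place(cluster, gpu_job))

# With partitions: the Spark job is kept off the GPU host.
cluster = new_cluster()
with_partition = (place(cluster, spark_job, "default"),
                  place(cluster, gpu_job, "gpu"))

print(no_partition)    # ('Host1', None)
print(with_partition)  # ('Host2', 'Host1')
```

In the first run the GPU job cannot be placed even though all four GPUs are idle, because Host1 no longer has enough memory and CPU; restricting the Spark job to the non-GPU partition avoids this.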

Editor's Notes

  1. This is a typical machine learning workflow, which involves three steps: feature engineering, model training and online service. Not surprisingly, the most important thing is to have the right features: those capturing historical information dominate other types of features. Once we have the right features and the right model, other factors play only small roles. We first derive a feature representation from raw data, then feed these features into a machine learning model, and then evaluate the models and choose the best one to push into online service. The workflow is complicated and usually involves several steps, supported by several infrastructure components.
  2. The machine learning workflow starts with loading data from different data sources, such as HDFS, AWS S3 or a database system. After that, we usually join data from the different sources to generate a wide table. Apache Hive and Apache Spark are the most appropriate tools for this workload. Data scientists then start data exploration via Zeppelin. The most common issue is unbalanced labels in the dataset, for example when positive labels far outnumber negative labels. To get a more accurate model, we subsample the group that has more instances to balance the dataset. After that, we randomly split the dataset into training and test sets with the help of Spark. Once we have the training data, we can start feature engineering.
  3. Feature engineering has made great progress over the past decade, from hand-designed features to automated feature discovery by deep learning. In many cases, hand-designed features can leverage domain knowledge, which leads to optimal results, and Spark MLlib provides many feature transform/selection operators to make this simple. But it involves heavy manual work and requires hiring experienced engineers. In recent years, more and more scientists and engineers have successfully applied deep neural networks (DNNs) in computer vision, speech recognition and natural language processing. A DNN can learn features automatically via embedding; the most famous embedding technique is word2vec, which produces a vector space in which each unique word in the corpus is assigned a corresponding vector.
  4. Model training is the most important step of the whole pipeline.
  5. Deploy the model in a distributed fashion for parallel model serving in batch mode or streaming mode. Evaluate the model offline or online using different metrics.
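
The balancing and random-split steps described in note 2 can be sketched in a few lines. The notes do this with Spark; the version below is a plain-Python stand-in for illustration, and the helper names `balance_by_subsampling` and `random_split`, the toy dataset and the 25% test fraction are all assumptions.

```python
import random
from collections import defaultdict


def balance_by_subsampling(rows, label_of, rng):
    """Subsample every label group down to the size of the smallest group."""
    groups = defaultdict(list)
    for row in rows:
        groups[label_of(row)].append(row)
    smallest = min(len(group) for group in groups.values())
    balanced = []
    for group in groups.values():
        # rng.sample draws without replacement.
        balanced.extend(rng.sample(group, smallest))
    return balanced


def random_split(rows, test_fraction, rng):
    """Shuffle a copy of the rows and split it into (train, test)."""
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


rng = random.Random(42)  # fixed seed so the run is repeatable

# Toy unbalanced dataset: 90 positive rows, 10 negative rows.
data = [(i, 1) for i in range(90)] + [(i, 0) for i in range(90, 100)]

balanced = balance_by_subsampling(data, lambda row: row[1], rng)
train, test = random_split(balanced, 0.25, rng)

print(len(balanced), len(train), len(test))  # 20 15 5
```

Subsampling the majority class discards data, so on a real dataset the trade-off against alternatives such as class weighting is worth checking before adopting it.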