End-to-End Big Data AI with Analytics Zoo

End-to-End Big Data AI with Analytics Zoo
Jason Dai – Sr. Principal Engineer
Fadi Zuhayri – Sr. Director
Intel Architecture, Graphics & Software

• Intel’s Transformation for Intelligence Era
• Analytics Zoo: Software Platform for Big Data AI
• Building Big Data AI Applications on Analytics Zoo
Outline

COMPUTE
1980 1990 2000 2010 2020 2030 2040
PC ERA
DIGITIZE
EVERYTHING
NETWORK
EVERYTHING
1 BILLION INTERNET
CONNECTED DEVICES
1018
109
104
1015
102
T E C H N O L O G Y L E D D I S R U P T I O N S
COMPUTE
DEMOCRATIZATION

COMPUTE
1980 1990 2000 2010 2020 2030 2040
PC ERA
DIGITIZE
EVERYTHING
NETWORK
EVERYTHING
1 BILLION INTERNET
CONNECTED DEVICES
1018
109
104
1015
102
CLOUD
EVERYTHING
MOBILE
EVERYTHING
MOBILE + CLOUD ERA
10 BILLION CLOUD CONNECTED DEVICES
COMPUTE
DEMOCRATIZATION

COMPUTE
1980 1990 2000 2010 2020 2030 2040
PC ERA
DIGITIZE
EVERYTHING
NETWORK
EVERYTHING
1 BILLION INTERNET
CONNECTED DEVICES
1018
109
104
1015
102
100 BILLION
INTELLIGENT
CONNECTED
DEVICES
INTELLIGENCE ERA
CLOUD
EVERYTHING
MOBILE
EVERYTHING
MOBILE + CLOUD ERA
COMPUTE
DEMOCRATIZATION

COMPUTE
1980 1990 2000 2010 2020 2030 2040
PC ERA
DIGITIZE
EVERYTHING
NETWORK
EVERYTHING
1 BILLION INTERNET
CONNECTED DEVICES
1018
109
104
1015
102
COMPUTE
DEMOCRATIZATION F O R E V E R Y O N E
EXASCALE
CLOUD
EVERYTHING
MOBILE
EVERYTHING
MOBILE + CLOUD ERA

Industry inflections are fueling the growth of data
Intelligent
Edge
Artificial
Intelligence
5G Network
Transformation
Cloudification

Move Faster Store More Process Everything
Software & System Level Optimized
Intel® Silicon Photonics
Intel® Ethernet
Intel® Tofino
Unleashing the Potential of Data

Analytics & AI Strategy
21
4 3 Hardware
Software
Ecosystem
OPTIMIZED
SOFTWARE
E2E DATA
SCIENCE
UNIFIED
APIs
CPU INFUSED
WITH AI
FLEXIBLE
ACCELERATION
OPTIMIZED
PLATFORM
A THRIVING
COMMUNITY
INTELLIGENT
SOLUTIONS
INNOVATION &
INVESTMENT

XPU: DIVERSE INTEL HW PORTFOLIO – from edge to cloud
CPU
General
compute,
AI Inference
& training
Xe
GPU
HPC & AI
AI training &
inference
Habana
Low power
vision
Low power
NLPGNA
Movidius

24 OPTIMIZED
TOPOLOGIES
44 OPTIMIZED
TOPOLOGIES
100+ OPTIMIZED
TOPOLOGIES
…
Foundation for AI
13
More built-in
AI acceleration &
optimized
topologies with
each new gen
OPTIMIZED LIBRARIES AND FRAMEWORKS
2017 1ST GEN
Intel® Advanced
Vector Extensions
512 (Intel AVX-512)
2019 2ND GEN
Intel Deep
Learning Boost
(with VNNI)
2020 3RD GEN
Intel Deep
Learning Boost
(VNNI, BF16)
Intel Deep
Learning Boost
(AMX)
2021 NEXT GEN
AIPERFORMANCE
CPU INFUSED
WITH AI

Languages and Libraries
Middleware & Frameworks
Application Workloads
CPU GPU FPGAOther
Accelerators
Scalar Vector Matrix Spatial

XPUs
Languages and Libraries
Middleware & Frameworks
Application Workloads
CPU GPU FPGAOther
Accelerators
Scalar Vector Matrix Spatial

2020 2021
Q1 Q2 Q3 Q4Q4Q3
0.6
Spec.
0.7
Spec.
0.8
Spec.
0.9
Spec.
1.0
Spec.
Industry
Initiative
Announced
More Soon…
Learn more at oneapi.com

Middleware, Frameworks & Runtimes
Applications & Services
XPUs
CPU GPU FPGA
Intel® oneAPI Product
Gold
Available December
2020
...
Compatibility
Tool
Languages Libraries
Analysis &
Debug Tools
Hardware Abstraction Layer

Run Locally Run in the Cloud
Get started quickly: code samples, quick-start guides, webinars, training
software.intel.com/oneapi
Downloads
Repositories
Containers
DevCloud

One Minute
to Code
No Hardware
Acquisition
No Download, Install
or Configuration
Support for Jupyter
Notebooks, VS Code
Easy Access to Samples
and Tutorials

ECOSYSTEM OF AI SOFTWARE STACK
HW
LIBRARIES &
COMPILERS
AI/ANALYTICS
SOLUTIONS
DL/ML/BIGDATA
FRAMEWORKS
XPU
oneDNN
CPU
oneDAL oneCCL
DATA SCIENTISTS &
DATA ANALYSTS
GPU ACCELERATERS
M O D E L Z O O A N A L Y T I C S Z O O
O P E N -
V I N O ™
T E N S O R -
F L O W
P Y T H O N
/
N U M B A
T V M
P Y -
T O R C H
M X N E T
S P A R K
S Q L + M L / DL S c a l e O ut
M O D I N
N U M P Y
X G -
BO O S T
S C I K I T -
L E A R N
P A N D A S

21
Achieving higher
yields and efficiency
Increasing production
and uptime
Transforming
learning
Enhancing
safety
Revolutionizing
patient outcomes
Turning data
into value
Pervasive Analytics & AI
Agriculture Energy Education Government Finance Healthcare
Empowering
industry 4.0
Creating thrilling
experiences
Modernizing
shopping
Enabling homes to
see, hear & respond
Fueling automated
driving
Driving network
efficiency
Industrial Media Retail Smart Home Telecom Transport
intel.com/customerspotlight

INFERENCETRAINING
FEATURE
ENGINEERING
AI TREND: END TO END
GROWING DEMAND FOR END-TO-END
AI PIPELINE
DATA
INGESTION

Big Data AI Open-Source Software Platform
Distributed TensorFlow, PyTorch, Keras, BigDL, RAY, and Apache Spark
Reference Use Cases, AI Models, High-level APIs, Feature Engineering, etc.
https://github.com/intel-analytics/analytics-zoo
AI on Big Data
Simplifying End-to-End Big Data AI Solutions Development

Jason Dai
Senior Principal Engineer

• Big Data AI
• Analytics Zoo overview
• Summary
Agenda

Seamless Scaling from Laptop to Distributed Big Data
Distributed, High-Performance
Deep Learning Framework
for Apache Spark
https://github.com/intel-analytics/bigdl
Unified Big Data AI Platform
for TensorFlow, PyTorch, Keras, BigDL,
OpenVINO, Ray and Apache Spark
AI on Big Data

Transformation of Big Data
• Storing and processing more data
• Analyzing (querying) more data
• Real-time analysis
• Modelling and prediction (ML/DL)
AI is everywhere
• Moving from experimentation to production
• Applying to large-scale, distributed Big Data
Big Data AI

Case Study: Image Feature Extraction at JD.com
Query
Search Result
Source: “Bringing deep learning into big data analytics using BigDL”, Xianyan Jia and Zhenhua Wang, Strata Data Conference Singapore 2017
Similar Image Search Image Deduplication
Image Feature
Extraction:
Applications:

* https://software.intel.com/en-us/articles/building-large-scale-image-feature-extraction-with-bigdl-at-jdcom
* “BigDL: A Distributed Deep Learning Framework for Big Data”, ACM SoCC 2019, https://arxiv.org/abs/1804.05839
For more complete information about performance and benchmark results, visit www.intel.com/benchmarks.
• End-to-end Big Data AI
pipeline (using BigDL
on Apache Spark)
• Efficiently scale out
(3.83x speed-up vs.
Nvidia GPU severs)*
Case Study: Image Feature Extraction at JD.com

Analytics Zoo: Software Platform for Big Data AI
End-to-End Pipelines
(Seamlessly scale AI models to distributed Big Data)
ML Workflow
(Automate tasks for building end-to-end pipelines)
Models
(Built-in models and algorithms)
Compute
Environment
K8s Cluster Cloud
Python Libraries
(Numpy/Pandas/sklearn/…)
DL Frameworks
(TF/PyTorch/BigDL/OpenVINO/…)
Distributed Analytics
(Spark/Flink/Ray/…)
Laptop Hadoop Cluster
Powered by oneAPI

End-to-End Big Data Analytics and AI
Seamless Scaling from Laptop to Distributed Big Data
Big Data
Pipeline
Prototype on laptop
using sample data
Experiment on clusters
with history data
Production deployment w/
distributed data pipeline
• Easily prototype end-to-end pipelines that apply AI models to big data
• “Zero” code change from laptop to distributed cluster
• Seamlessly deployed on production Hadoop/K8s clusters
• Automate the process of applying machine learning to big data

Recommendation
Distributed TensorFlow & PyTorch on Spark
Spark Dataframes & ML Pipelines for DL
RayOnSpark
InferenceModel
Models &
Algorithms
End-to-end
Pipelines
Time Series Computer Vision NLP
ML Workflow AutoML Automatic Cluster Serving
Compute
Environment
K8s Cluster Cloud
Python Libraries
DL Frameworks
Powered by oneAPI

Recommendation
Spark Dataframes & ML Pipelines for DL
Distributed TensorFlow & PyTorch on Spark
InferenceModel
Models &
Algorithms
End-to-end
Pipelines
Time Series Computer Vision NLP
ML Workflow AutoML Automatic Cluster Serving
Compute
Environment
K8s Cluster Cloud
Python Libraries
DL Frameworks
Powered by oneAPI
RayOnSpark

Distributed TensorFlow/PyTorch on Spark
Write TensorFlow/PyTorch inline with Spark code
#pyspark code
train_rdd = spark.hadoopFile(…).map(…)
dataset = TFDataset.from_rdd(train_rdd,…)
#tensorflow code
import tensorflow as tf
slim = tf.contrib.slim
images, labels = dataset.tensors
with slim.arg_scope(lenet.lenet_arg_scope()):
logits, end_points = lenet.lenet(images, …)
loss = tf.reduce_mean(tf.losses.sparse_softmax_cross_entropy(
logits=logits, labels=labels))
#distributed training on Spark
optimizer = TFOptimizer.from_loss(loss, Adam(…))
optimizer.optimize(end_trigger=MaxEpoch(5))
Analytics Zoo API in blue

Network Quality Prediction in SK Telecom
Distributed TensorFlow/PyTorch on Spark
https://networkbuilders.intel.com/solutionslibrary/sk-telecom-intel-build-ai-pipeline-to-improve-network-quality

• : distributed framework for
emerging AI applications
• RayOnSpark
• Directly run Ray programs on Big
Data cluster
• Integrate Ray programs into Spark
data pipeline
https://medium.com/riselab/rayonspark-running-emerging-ai-applications-on-big-data-clusters-with-ray-and-analytics-zoo-923e0136ed6a
RayOnSpark
Run Ray Programs Directly on Big Data Platform

RayOnSpark
Run Ray Programs Directly on Big Data Platform
sc = init_spark_on_yarn(...)
ray_ctx = RayContext(sc=sc, ...)
ray_ctx.init()
#Ray code
@ray.remote
class TestRay():
def hostname(self):
import socket
return socket.gethostname()
actors = [TestRay.remote() for i in range(0, 100)]
print([ray.get(actor.hostname.remote()) for actor in actors])
ray_ctx.stop()
https://medium.com/riselab/rayonspark-running-emerging-ai-applications-on-big-data-clusters-with-ray-and-analytics-zoo-923e0136ed6a

Fast Food Recommendation in Burger King
End-to-End Training Pipeline w/ RayOnSpark
* https://medium.com/riselab/context-aware-fast-food-recommendation-at-burger-king-with-rayonspark-2e7a6009dd2d
* “Context-Aware Drive-thru Recommendation Service at Fast Food Restaurants”, https://arxiv.org/abs/2010.06197
DATA
INGESTIO N
FEATURE
ENGINEERING
TRAINING INFERENCE
on

Scalable AutoML for Time Series Prediction
Automated feature generation, model selection and hyper parameter tuning
tsp = TimeSequencePredictor(
dt_col="datetime",
target_col="value")
pipeline = tsp.fit(train_df,
val_df, metric="mse",
recipe=RandomRecipe())
pipeline.predict(test_df)
https://medium.com/riselab/scalable-automl-for-time-series-prediction-using-ray-and-analytics-zoo-
b79a6fd08139

Automated feature generation, model selection and hyper parameter tuning
FeatureTransformer
Model
SearchEngine
Search presets
trial
trial
trial
trial
…best model
/parameters
trail jobs
Pipeline
with tunable parameters
with tunable parameters
configured with best parameters/model
Each trial runs a different combination
of hyper parameters
Ray Tune
rolling, scaling, feature generation, etc.
Spark + Ray
“Scalable AutoML for Time Series Forecasting using Ray”, USENIX OpML’20

TI-One ML Platform in Tencent Cloud
Using Analytics Zoo in Tencent Cloud TI-
One ML Platform
Predicting NYC Taxi Passengers Using AutoML
https://software.intel.com/content/www/us/en/develop/articles/tencent-cloud-leverages-analytics-zoo-to-improve-performance-of-ti-one-ml-platform.html

“Zouwu”
Open Source Framework for Time Series on Analytics Zoo
Application framework for building end-to-end
time series analysis
• Use case - reference time series use cases
• Models - built-in models for time series analysis
• AutoTS - AutoML support for building E2E time
series analysis pipelines
Project
Zouwu
Built-in Models
ML
Workflow
AutoML Workflow
use-case
models autots
https://github.com/intel-analytics/analytics-
zoo/tree/master/pyzoo/zoo/zouwu
“Project Zouwu: Scalable AutoML for Telco Time Series Analysis using Ray and Analytics”, Ray Summit 2020

• E2E Big Data & AI pipeline (distributed TF/PyTorch/OpenVINO/Ray on Spark)
• Advanced AI workflow (AutoML, Time-Series, Cluster Serving, etc.)
Github
• Project repo: https://github.com/intel-analytics/analytics-zoo
• Use cases: https://analytics-zoo.github.io/master/#powered-by/
Technical paper/tutorials
• CVPR 2020 tutorial: https://jason-dai.github.io/cvpr2018/
• ACM SoCC 2019 paper: https://arxiv.org/abs/1804.05839
• AAAI 2019 tutorial: https://jason-dai.github.io/aaai2019/
Conclusion

(Seamlessly scale AI models to distributed Big Data)
ML Workflow
(Automate tasks for building end-to-end pipelines)
Compute
Environment
K8s Cluster Cloud
Python Libraries
DL Frameworks
Powered by oneAPI
Recommendation Time Series Computer Vision NLP

• Recommendation
• Time series analysis
• Computer vision
• Natural language processing (NLP)
Big Data AI Applications on Analytics Zoo

Food Recommendation Using Analytics Zoo in Burger King
Guest arrives
ODMB
Checks Menu
Board
Cashier enters
order
Checks Menu
Board
Guest
completes
order
* “Context-aware Fast Food Recommendation with Ray on Apache Spark at Burger King”, Data + AI Summit Europe 2020

Food Recommendation Challenges
Challenges
• Lack of user identifiers
• Same session food compatibilities
• Other variables in our use case:
locations, weathers, time, etc.
• Deployment challenges

Transformer Cross Transformer (TxT) Model
Model Components
• Sequence Transformer
• Taking item order sequence as input
• Context Transformer
• Taking multiple context features as input
• Latent Cross Joint Training
• Element-wise product for both transformer
outputs

Unified Big Data Processing and Model Training on
Analytics Zoo
CurrentPrevious

Food Recommendation Using Analytics Zoo in
Burger King
Offline Training Result
Model Top1
Accuracy
Top3
Accuracy
RNN 29.98% 46.24%
Contextual ItemCF 32.18% 48.37%
RNN Latent Cross 33.10% 49.98%
TxT 34.52% 52.37%
A/B Testing Result
Model Conversation
Rate Gain
Add-on
Sales Gain
RNN Latent
Cross (control)
- -
TxT +7.5% +4.7%

Recommendation Using Analytic Zoo in Mastercard
https://software.intel.com/en-us/articles/deep-learning-with-analytic-zoo-optimizes-mastercard-recommender-ai-service
Train NCF Model
Features Models
Model
Candidates
Models
sampled
partition
Training Data
…
Load Parquet
Train Multiple Models
Train Wide & Deep Model
sampled
partition
sampled
partition
Spark ML Pipeline Stages
Test Data
Predictions
Test
Spark DataFramesParquet Files
Feature
Selections
SparkMLPipeline
Neural Recommender using Analytics Zoo
Estimator Transformer Model
Evaluation
& Fine
Tune
Train ALS Model

Time Series Based Network Quality Prediction in
SK Telecom
• Predict Network Quality Indicators (CQI, RSRP, RSRQ, SINR, …)*
for anomaly detection and real-time management
* CQI : Channel Quality Indicator
* RSRP : Reference Signal Received Power
* RSRQ : Reference Signal Received Quality
* SINR :Signal to Interference Noise Ratio
* “Vectorized Deep Learning Acceleration from Preprocessing to Inference and Training on Apache Spark in SK Telecom”, Spark + AI Summit 2020
* https://networkbuilders.intel.com/solutionslibrary/sk-telecom-intel-build-ai-pipeline-to-improve-network-quality

Memory Augmented Network

Memory Augmented Network – Test Result
Improved predictions for sudden change!
seq2seq
Mem-network

Migrating to Analytics Zoo on Intel® Xeon
Data Loader
DRAM
Store tiering forked.
Flash
Store customized.
Data Source APIs
Spark-SQL
SQL Queries
(Web, Jupyter)
LegacyDesignwithGPU
Export Preprocessing AITraining/Inference
GPU
Servers
NewArchitecture: Unified DataAnalytic+AIPlatform
Preprocessing RDDofTensor
2nd Generation Intel®Xeon®
Scalable Processors
csv files pandas, dask
spark spark
spark
Manually manage separate clusters, and segregated workflow
E2E architecture that atomically scales deep learning on Spark
AIModelCodeofTF

Inference Pipeline Speed-up with Analytics Zoo
Up-to 6x speedup for
end-to-end inference
on Analytics Zoo in
SK Telecom*

Training Pipeline Speed-up with Analytics Zoo
Up-to 4x speedup for
end-to-end training
on Analytics Zoo in
SK Telecom*

Wind Power Prediction using Analytics Zoo in
GoldWind
LSTNet
ETL Training Prediction
Deployment
Update
DB
Historical
Power
Wind Power Prediction
• Accuracy improved to 79%
(from previous 59%)*
• 4x training speedup*
* https://www.intel.cn/content/www/cn/zh/analytics/artificial-intelligence/create-power-forecasting-solutions.html

Industrial Vision Inspection Using Analytics Zoo in
Midea and KUKA
https://software.intel.com/en-us/articles/industrial-inspection-platform-in-midea-and-kuka-using-distributed-tensorflow-on-analytics

Industrial Vision Inspection Using Analytics Zoo in
Midea and KUKA
Edge to Cloud architecture using
Analytics Zoo
• 99.8% accuracy*
• <50ms image processing latency*
• >8x inference speedup*
* https://software.intel.com/en-us/articles/industrial-inspection-platform-in-midea-and-kuka-using-distributed-tensorflow-on-analytics
* https://www.intel.cn/content/www/cn/zh/analytics/artificial-intelligence/midea-case-study.html

AI-Assisted Radiology Using Analytics Zoo in
Dell EMC
Condition E
Condition D
Condition C
Condition B
Condition A
Patient A
Transfer Learning using ResNet-50 trained with ImageNet
https://www.delltaechnologies.com/resources/en-us/asset/white-papers/solutions/h17686_hornet_wp.pdf
chest X-rays

Customer Service Chatbot Using Analytics Zoo in
Microsoft Azure
*https://software.intel.com/en-us/articles/use-analytics-zoo-to-inject-ai-into-customer-service-platforms-on-microsoft-azure-part-1
*https://www.infoq.com/articles/analytics-zoo-qa-module/

Job Recommendation Using Analytics Zoo in Talroo
documents
(resume, job
description)
https://software.intel.com/content/www/us/en/develop/articles/talroo-uses-analytics-zoo-and-aws-to-leverage-deep-learning-for-job-recommendations.html

• E2E Big Data & AI pipeline (distributed TF/PyTorch/OpenVINO/Ray on Spark)
• Advanced AI workflow (AutoML, Time-Series, Cluster Serving, etc.)
Github
• Project repo: https://github.com/intel-analytics/analytics-zoo
• Use cases: https://analytics-zoo.github.io/master/#powered-by/
Technical paper/tutorials
• ACM SoCC 2019 paper: https://arxiv.org/abs/1804.05839
• AAAI 2019 tutorial: https://jason-dai.github.io/aaai2019/
Big Data AI

Summary
INDUSTRY INFLECTIONS ARE FUELING THE GROWTH OF DATA
5G Network Transformation, Artificial Intelligence, Intelligent Edge, Cloudification
AI & ANALYTICS ARE THE DEFINING WORKLOADS OF THE NEXT DECADE
with growing demand for end-to-end AI pipeline
UNMATCHED PORTFOLIO BREADTH AND ECOSYSTEM SUPPORT
Intel delivers a silicon & software foundation designed for the
diverse range of use cases from the cloud to the edge
ANALYTICS ZOO OPEN-SOURCE SOFTWARE PLATFORM FOR BIG DATA AI
Simplifies End-to-End Big Data AI pipeline solutions development

Notices & Disclaimers
• Performance varies by use, configuration and other factors. Learn more at www.intel.com/performanceIndex
• Performance may vary based on the specific game title and server configuration. To reference the full list of Intel Server GPU platform measurements, please refer to
http://www.intel.com/content/www/us/en/benchmarks/server/graphics/IntelServerGPU
• All product plans and roadmaps are subject to change without notice.
• Intel technologies may require enabled hardware, software or service activation.
• No product or component can be absolutely secure.
• Your costs and results may vary.
• Intel does not control or audit third-party data. You should consult other sources to evaluate accuracy.
• All product plans and roadmaps are subject to change without notice.
• © Intel Corporation. Intel, the Intel logo, and other Intel marks are trademarks of Intel Corporation or its subsidiaries. Other names and brands may be claimed as the property of others.
Intel Server GPU TCO analysis is based on internal Intel research. Pricing as of 10/01/2020. Analysis assumes standard serving pricing, GPU list pricing, and software pricing based on
estimated Nvidia software license costs of $1 per year for 5 years.
• Intel Server GPU Performance may vary based on the specific game title and server configuration. To reference the full list of Intel Server GPU platform measurements, please refer to
http://www.intel.com/content/www/us/en/benchmarks/server/graphics/IntelServerGPU
• Video game footage courtesy of Tencent Games and Gamestream.
• LEGO STAR WARS TITLES : © Lucasfilm Entertainment Company Ltd. or Lucasfilm Ltd. & ® or TM as indicated. All rights reserved.
• LEGO, the LEGO logo and the Minifigure are trademarks of The LEGO Group. © The LEGO Group. All rights reserved.
• “DiRT4”™ : © 2017 The Codemasters Software Company Limited ("Codemasters"). All rights reserved. "Codemasters"®, “EGO”®, the Codemasters logo, and “DiRT”® are registered
trademarks owned by Codemasters. “DiRT4”™ and “RaceNet”™ are trademarks of Codemasters. All rights reserved. Under licence from International Management Group (UK) Limited. All
other copyrights or trademarks are the property of their respective owners and are being used under license. Developed by Codemasters.

End-to-End Big Data AI with Analytics Zoo

More Related Content

What's hot

Similar to End-to-End Big Data AI with Analytics Zoo

Recently uploaded

End-to-End Big Data AI with Analytics Zoo