Demonstrating the Benefits of Hyper-Acceleration

Roop Ganguly, Solution Architect

The End of Moore’s Law
[Chart: clock frequency (GHz, 1.0 to 3.0) by year (1970 to 2000) across process nodes from 350 nm to 65 nm, with the "Power Wall" marked; photo of Gordon Moore]

Implications for Big Data
Security Analytics
Risk Management
Behavioral Analytics
Natural Language Processing
AI/Deep Learning
Machine Learning

CPU-Bound Applications – A New Bottleneck
40Gb-100Gb Network
Now that faster networking and disk technologies have emerged, CPUs act like “stop signs” for computation.
Node 1
Node 2
Node 3

Accelerators
Microprocessor and Cloud Vendors Respond
ASIC
GPU
FPGA

Inhibitor: Programming Model Gap for Hardware Accelerators
Data Scientists & Developers: Data Science Programming Model (BIG DATA PLATFORMS)
Performance Team: Acceleration Programming Model (CPU, GPU, FPGA)
Two wildly different skill sets
Programming Model Gap

Introducing Bigstream
Cross Platform
Cross Hardware
Intelligent, automatic computation routing
Zero code change
3X to 30X acceleration
[Diagram: BIG DATA PLATFORMS / HYPER-ACCELERATION LAYER (Dataflow Adaptation Layer, Bigstream Dataflow, Bigstream Hypervisor) / CPU, GPU, FPGA]

Accelerated Spark Architecture with Bigstream

Business Intelligence Use Case

Business Intelligence Query
• Based on the Transaction Processing Performance Council Decision Support (TPC-DS) benchmark
• Spark/SQL Query:
SELECT i_item_id, avg(ss_quantity) agg1, avg(ss_list_price) agg2, avg(ss_coupon_amt) agg3
FROM store_sales, customer_demographics, date_dim, item, promotion
WHERE ss_sold_date_sk = d_date_sk AND ss_item_sk = i_item_sk
AND…….
• Input: approximately 2 GB of Avro table data
• Software-accelerated and unaccelerated runs execute simultaneously on identical Amazon EMR clusters
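A side-by-side comparison like this one is usually summarized as a speedup ratio. The following is a minimal sketch of that arithmetic, not the measurement harness used in the demo; the function name `speedup` and the choice of the median over repeated runs are assumptions made for illustration.

```python
from statistics import median


def speedup(baseline_seconds, accelerated_seconds):
    """Ratio of median unaccelerated runtime to median accelerated runtime.

    Both arguments are lists of wall-clock times (seconds) for the same
    query on otherwise identical clusters. Hypothetical helper, not part
    of any Bigstream or Spark API.
    """
    if not baseline_seconds or not accelerated_seconds:
        raise ValueError("both run lists must be non-empty")
    accelerated = median(accelerated_seconds)
    if accelerated <= 0:
        raise ValueError("accelerated runtime must be positive")
    # A result above 1.0 means the accelerated cluster finished faster.
    return median(baseline_seconds) / accelerated
```

Using the median rather than a single run keeps one slow or fast outlier from dominating the reported ratio.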

Business Intelligence Use Case Demo

Adtech ETL/ML Data Pipeline
[Diagram components: USERS (millions of users); APPLICATION/WEB SERVERS; KAFKA, a distributed messaging system (tens of servers); Spark Streaming and Spark ML, a distributed computation system (hundreds of servers); RTB Systems. Event labels: clicks, likes, impressions]

Announcement: Bigstream on AWS EMR

Setting the bootstrap script
Bigstream ON EMR
Add the Bigstream bootstrap URL and your cluster has hyper-acceleration
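On EMR, a bootstrap script is attached when the cluster is created. As a rough sketch, the helper below assembles an `aws emr create-cluster` command line with a bootstrap action. The `--bootstrap-actions`, `--release-label`, `--applications`, `--instance-type`, `--instance-count`, and `--use-default-roles` flags are standard AWS CLI options; the function name, the default cluster name and instance settings, and the action name `bigstream-bootstrap` are assumptions, and the actual Bigstream bootstrap URL is not given in these slides, so it must be supplied by the caller.

```python
def emr_create_cluster_argv(bootstrap_url, release_label,
                            name="accelerated-spark",
                            instance_type="m4.xlarge",
                            instance_count=3):
    """Build (but do not run) an `aws emr create-cluster` argument list.

    bootstrap_url: S3 location of the bootstrap script (caller-supplied;
                   the real Bigstream URL is not stated in the slides).
    release_label: EMR release to launch, chosen by the caller.
    The defaults for name, instance_type and instance_count are
    illustrative assumptions only.
    """
    return [
        "aws", "emr", "create-cluster",
        "--name", name,
        "--release-label", release_label,
        "--applications", "Name=Spark",
        "--instance-type", instance_type,
        "--instance-count", str(instance_count),
        "--use-default-roles",
        # The bootstrap action runs on every node before Spark starts.
        "--bootstrap-actions",
        f"Path={bootstrap_url},Name=bigstream-bootstrap",
    ]
```

The returned list can be passed to `subprocess.run` once a real bootstrap URL and release label are filled in.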
