Apache Spark Acceleration Using Hardware Resources in the Cloud, Seamlessly with Chris Kachris

Chris Kachris www.inaccel.com
CEO, co-founder
chris@inaccel.com
Apache Spark
Acceleration using
FPGAs in the Cloud,
Seamlessly
#HWCSAIS16

…or
How to speed up your Spark
applications
with the same cost
with the same code

Why acceleration?
Source: APACHE SPARK SURVEY 2016 REPORT

The new era of heterogeneous cloud
• Data centers (DCs) have started deploying heterogeneous systems (GPGPUs, FPGAs) to face the increased traffic requirements.
• A new era has started: specialized systems for big data applications and data analytics

FPGAs in the news

Available Platforms
CPUs
+ Flexible & cheap
- Low performance
GPUs
+ Flexible
- Expensive & hard to program
Specialized chips/FPGA
+ High performance
- Low flexibility
[Diagram axes: Flexibility, Performance]

Available Platforms
Flexibility + Performance: best of both worlds

InAccel
helps companies speed up their Spark applications
by providing ready-to-use accelerators-as-a-service in the cloud

Acceleration for machine learning
InAccel offers Accelerators-as-a-Service for Apache Spark in the cloud (e.g. Amazon AWS F1) using FPGAs

Hardware acceleration
module filter1 (clock, rst, strm_in, strm_out)
for (i=0; i<NUMUNITS; i=i+1)
always@(posedge clock)
integer i,j; //index for loops
tmp_kernel[j] = k[i*OFFSETX];
FPGA handles compute-intensive, deeply pipelined, hardware-accelerated operations
CPU handles the rest
[Figure: application timing diagram. Labels: InAccel, 800 sec, 80 sec, 200 sec]
Source: Amazon, Inc.
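The split between FPGA and CPU work can be illustrated with a small Amdahl-style calculation. This is only a sketch: it assumes the slide's timing labels mean 800 sec for the compute-intensive part on CPU, 80 sec for the same part on the FPGA, and 200 sec for the rest of the application that stays on the CPU. The slide does not state this mapping, and the function name is hypothetical.

```python
def overall_speedup(offloaded_cpu_s, offloaded_fpga_s, rest_cpu_s):
    """Return (time_before, time_after, speedup) when only part of a job is offloaded."""
    time_before = offloaded_cpu_s + rest_cpu_s   # everything runs on the CPU
    time_after = offloaded_fpga_s + rest_cpu_s   # hot part on the FPGA, rest on the CPU
    return time_before, time_after, time_before / time_after


# ASSUMPTION: this reading of the 800 / 80 / 200 sec labels is a guess, not stated on the slide.
before, after, speedup = overall_speedup(800.0, 80.0, 200.0)
print(f"{before:.0f} s -> {after:.0f} s, overall speedup {speedup:.2f}x")
```

Under that reading, a 10x faster kernel gives well under 10x overall, because the part left on the CPU bounds the gain.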

AWS Marketplace
Amazon EC2 FPGA
Deployment via Marketplace
Amazon Machine Image (AMI)
Amazon FPGA Image (AFI)
AFI is secured, encrypted, dynamically loaded into the FPGA; it can’t be copied or downloaded
Customers
AWS Marketplace

Accelerators for Spark
user
c4, m4
s4, …
F1 (FPGA)
Amazon Marketplace
Download InAccel Accelerator from Marketplace
Run your code on CPU
Offload hard work to F1

IP cores available in Amazon AWS
Logistic Regression
Gradient Descent IP block for faster training of machine learning algorithms.
K-means clustering
K-means is one of the simplest unsupervised learning algorithms that solve the well-known clustering problem.
Recommendation Engines (ALS)
Alternating Least Squares IP core for the acceleration of recommendation engines.
Available in Amazon AWS marketplace for free trial: www.inaccel.com
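For readers unfamiliar with the kernel that a gradient descent block targets, here is a minimal pure-Python sketch of batch gradient descent for binary logistic regression. It is a generic textbook version, not InAccel's implementation. The toy data, learning rate, and function names are assumptions made for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic_regression(rows, labels, learning_rate=0.5, iterations=1000):
    """Batch gradient descent on the mean logistic loss; returns (weights, bias)."""
    n_features = len(rows[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(iterations):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        # The loop over all rows is the compute-intensive part of each iteration.
        for x, y in zip(rows, labels):
            error = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) - y
            for j, xi in enumerate(x):
                grad_w[j] += error * xi
            grad_b += error
        weights = [w - learning_rate * g / len(rows) for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / len(rows)
    return weights, bias


def predict(weights, bias, x):
    return 1 if sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) >= 0.5 else 0


# Hypothetical, linearly separable toy data (2 features instead of 784).
rows = [[0.0, 0.1], [0.2, 0.0], [0.9, 1.0], [1.0, 0.8]]
labels = [0, 0, 1, 1]
weights, bias = train_logistic_regression(rows, labels)
predictions = [predict(weights, bias, x) for x in rows]
print(predictions)
```

The inner loop touches every row on every iteration, which is the kind of regular, repeated arithmetic that suits a deeply pipelined accelerator.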

IP Cores
• Develop hardware
components as IP cores for
widely used applications
• Logistic regression
• Recommendation
• K-means
• Linear regression
• Decision Trees
• Naive Bayes
• …

Comparison with AWS c4.large
• c4 (36 cores)
• m4 (16 cores)
• f1 with our Accelerator
[Chart: Logistic regression comparison. Bars: c4 (36), m4 (16), f1 (Accel). Axis ticks 0 to 60; units not recoverable]

Speedup
• Logistic regression: 2.7x
– 784 features
– 30 classes
• K-means clustering: 2.8x
– 784 features
– 30 clusters
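As a reminder of what the K-means workload computes, here is a minimal pure-Python sketch of Lloyd's algorithm (assign each point to its nearest center, then recompute the centers). It is a generic illustration, not the accelerated implementation. The tiny 2-feature dataset, the naive "first k points" initialization, and the function names are assumptions.

```python
def squared_distance(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def kmeans(points, k, iterations=10):
    """Lloyd's algorithm with a naive, deterministic initialization."""
    centers = [list(p) for p in points[:k]]  # ASSUMPTION: first k points as initial centers
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest center for every point (the distance-heavy part).
        assignment = [
            min(range(k), key=lambda c: squared_distance(p, centers[c]))
            for p in points
        ]
        # Update step: each center moves to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return centers, assignment


# Hypothetical toy data: two well-separated groups (the slide's case has 784 features, 30 clusters).
points = [[0.0, 0.0], [10.0, 10.0], [0.0, 1.0], [10.0, 11.0], [1.0, 0.0], [11.0, 10.0]]
centers, assignment = kmeans(points, k=2)
print(assignment)
```

The assignment step costs points × clusters × features distance terms per iteration, so it dominates at the sizes quoted on the slide.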

Acceleration of Logistic Regression

Communication with Host - RDD
Accelerators for logistic regression/kmeans

Available in AWS Marketplace
• IP cores in Amazon AWS marketplace

Zero code changes
• Only replacement of the library is required

No modification to your code
• Only addition of the InAccel library is required

APIs support:

Demo on Amazon AWS
Intel Xeon, 36 cores, on Amazon AWS: c4.8xlarge, $1.592/hour
8 cores + FPGA on Amazon AWS: f1.2xlarge, $1.65/hour + InAccel
Note: both recordings are played at 4x speed
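The "same cost" claim can be checked with simple arithmetic on the hourly prices above. The sketch below assumes the 2.7x logistic regression speedup from the Speedup slide applies relative to c4.8xlarge and uses a hypothetical 1-hour baseline job. The InAccel fee is not given in the slides, so it is left out of the FPGA cost and only a break-even value is computed.

```python
def cost_per_job(price_per_hour, runtime_hours):
    return price_per_hour * runtime_hours


C4_8XLARGE_PRICE = 1.592   # $/hour, from the slide
F1_2XLARGE_PRICE = 1.65    # $/hour, from the slide (InAccel fee not included)
ASSUMED_SPEEDUP = 2.7      # ASSUMPTION: the 2.7x figure is measured against c4.8xlarge
baseline_hours = 1.0       # hypothetical job length on c4.8xlarge

cpu_cost = cost_per_job(C4_8XLARGE_PRICE, baseline_hours)
fpga_cost = cost_per_job(F1_2XLARGE_PRICE, baseline_hours / ASSUMED_SPEEDUP)

# Hourly add-on fee at which the accelerated job would cost the same as the CPU job.
break_even_fee = C4_8XLARGE_PRICE * ASSUMED_SPEEDUP - F1_2XLARGE_PRICE

print(f"CPU job:  ${cpu_cost:.3f}")
print(f"FPGA job: ${fpga_cost:.3f} (before any InAccel fee)")
print(f"Break-even add-on fee: ${break_even_fee:.4f}/hour")
```

Because the two hourly prices are close, the cost per job under these assumptions scales almost directly with the speedup.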

Seamless setup on AWS
1. Executing a single script sets up Hadoop and Spark, formats the Hadoop NameNode, and uploads all the necessary datasets to HDFS.
2. Start Spark and Hadoop; your applications are then ready to be accelerated on F1.

Documentation
Documentation for all
the APIs
https://inaccel.github.io
• C/C++
• Java
• Python
• Scala

MLlib library
MLlib contains many algorithms and utilities.
• Classification: logistic regression, naive Bayes,...
• Regression: generalized linear regression, survival
regression,...
• Recommendation: alternating least squares (ALS)
• Clustering: K-means, Gaussian mixtures (GMMs),...
• Decision trees: …

Plans for future frameworks

Speed up your application
Contact us if you want to embrace the new
opportunity to speed up your application:
With the same cost
With the same code
info@inaccel.com
