SlideShare a Scribd company logo
Dr. Chris Kachris
CEO, co-founder
www.inaccel.com
FPGA Acceleration of Apache Spark
ML on the Cloud, Instantly
…or
How to speedup your Spark ML
applications,
reduce the cost,
without changing your code
www.inaccel.com
Why acceleration
˃ 91% of Spark users for Big Data analytics care about Performance
Source: Databricks, Apache Spark Survey 2016, Report
www.inaccel.com
Heterogeneous requirements in Data Centers
Source: intel, inc.
www.inaccel.com
Available Platforms
CPUs
+ Flexible & Cheap
- low performance
GPUs
+ Flexible
- Expensive &
hard to program
Specialized chips/FPGA
+ High Performance
- low flexibility
Flexibility
Performance
www.inaccel.com
Available Platforms
Flexibility
+
Best of 2 worlds
Performance
www.inaccel.com
FPGAs in the news
www.inaccel.com
Accelerators versus Processors
• Accelerators are customized to the application requirements
[Source: National Instruments: Smart Grid Ready Instrumentation”
www.inaccel.com
Flexibility versus High performance
CPU: High flexibility, low throughput
Chip Accelerators: High throughput, high NRE
Initial one-off costs
Programmable accelerators (FPGAs) can provide massive parallelism
and higher efficiency than CPUs for certain categories of applications
InAccel offers novel accelerators as IP cores to boost the performance
of their applications (similar to specialized containers)
www.inaccel.com
Hardware Accelerators
A GPU is effective at processing the same set of operations in parallel – single
instruction, multiple data (SIMD). A GPU has a well-defined instruction-set, and
fixed word sizes – for example single, double, or half-precision integer and floating
point values.
An FPGA is effective at processing the same or different operations in parallel –
multiple instructions, multiple data (MIMD). An FPGA does not have a predefined
instruction-set, or a fixed data width.
Control
ALU
ALU
Cache
DRAM
ALU
ALU
CPU
(one core)
FPGA
DRAM DRAM
GPU
Each FPGA
has more than
2M of these
cells
Each GPU
has 2880
of these
cores
DRAM
BlockRAM
BlockRAM
DRAM DRAM
www.inaccel.com
Programming FPGAs
˃ Using Hardware description
Languages (VHDL, Verilog)
˃ Using specialized OpenCL, HLS
(High level synthesis)
˃ As IP core (plug-n-play)
Performance Flexibility Deployment
++ -- --
+ + -
+ ++ ++
www.inaccel.com
helps companies speedup
their applications
by providing ready-to-use
accelerators-as-a-service in
the cloud
3x-10x Speedup
2x Lower Cost
Zero code changes
www.inaccel.com
InAccel, Inc.
˃Founded in 2018
˃Offices in US and Greece
˃Member of:
www.inaccel.com
Market size
˃ The data center accelerator market is expected to
reach USD 21.19 billion by 2023 from USD 2.84 billion
by 2018, at a CAGR of 49.47% from 2018 to 2023.
˃ The market for FPGA is expected to grow at the
highest CAGR during the forecast period owing to
the increasing adoption of FPGAs for the acceleration
of enterprise workloads.
[Source: Data Center Accelerator Market by Processor Type (CPU, GPU, FPGA, ASIC)- Global Forecast
to 2023, Research and Markets]
www.inaccel.com
InAccel integrated framework
Supported APIs
Zero-code changes
˃ C/C++
˃ Scala
˃ Python
˃ Java
˃ Apache Spark
Scalable
Resource Manager
cores
Alveo U200 F1 instances
www.inaccel.com
Apache Spark
˃ Spark is the most widely used
framework for Data Analytics
˃ Develop hardware components as
IP cores for widely used
applications
Spark
‒ Logistic regression
‒ Recommendation
‒ K-means
‒ Linear regression
‒ PageRank
‒ Graph computing
www.inaccel.com
Acceleration for machine learning
inaccel offers
Accelerators-as-a-
Service for Apache
Spark in the cloud
(e.g. Amazon AWS f1)
using FPGAs
www.inaccel.com
Hardware acceleration
module filter1 (clock, rst, strm_in, strm_out)
for (i=0; i<NUMUNITS; i=i+1)
always@(posedge clock)
integer i,j; //index for loops
tmp_kernel[j] = k[i*OFFSETX];
FPGA handles compute-
intensive, deeply pipelined,
hardware-accelerated
operations
CPU handles the rest
application
InAccel 800 sec80 sec
200 sec
Source: amazon, Inc.
www.inaccel.com
Accelerators for Spark ML in Amazon AWS in 3 steps
f1 (8
cores+FPGA)
1.Create an f1
instance using
InAccel’s Image
(AMI)
2.Import InAccel API
3.Run your applications
on AWS f1 to get 3x –
20x speedup
www.inaccel.com
FPGA in the cloud – the AWS model
˃ Amazon EC F1’s Xilinx FPGA
www.inaccel.com
Cloud Marketplace: available now
Amazon EC2 FPGA
Deployment via Marketplace
InAccel
Products
Customers
AWS Marketplace
Scalable to worldwide
market
First to provide
accelerators for Spark
www.inaccel.com
IP cores available in Amazon AWS
Logistic RegressionK-mean clustering
K-means is one of the
simplest unsupervised
learning algorithms that
solve the well known
clustering problem.
Gradient Descent​
IP block for faster
training of
machine learning
algorithms.
Recommendation
Engines (ALS)
Alternative-Least-
Square IP core for the
acceleration of
recommendation
engines based on
collaborative filtering.Available in Amazon AWS marketplace for free trial: www.inaccel.com
www.inaccel.com
Acceleration of Logistic regression
www.inaccel.com
Communication with Host in Amazon AWS
Accelerators for logistic regression/kmeans
www.inaccel.com
Docker-based implementation for easy integration
˃ Inaccel’s FPGA manager docker container
comprises both an FPGA manager to schedule,
orchestrate, and monitor the execution of the
accelerated applications but also the required
FPGA runtime system.
˃ The dockerized runtime system detects the
FPGA platform (aws F1) and manages the
interaction/communication with the FPGA (i.e.,
loading the accelerator, transferring input data
and results), making it transparent to the
application.
˃ Docker swarm, Kubernetes, naïve execution
www.inaccel.com
Cluster mode
˃ Cluster
mode Worker
Driver (sparkContext)
Docker Runtime
& FPGA Manager
Executor
Worker
Executor
Worker
Executor
FPGA
f1.x2
FPGA
f1.x2
FPGA
f1.x2
FPGA accelerated
Infrastructure
IP Library
www.inaccel.com
Speedup comparison
˃ Up to 10x speedup compared to 32 cores based on f1.x2
Cluster of 4 f1 (SW) Cluster of 4 f1 (SW + InAccel)
f1.x2largef1.x2large
ML
Accel
ML
Accel
ML
Accel
ML
Accel
f1.x2large f1.x2large
1
10.2x
4x f1.x2large (32 cores) 4x f1.x2large
(32cores+InAccel)
Speedup on cluster of f1.x2 using
InAccel
www.inaccel.com
Speed up
˃ Up to 12x speedup compared to 64 cores on f1.x16
1.00
12.14
f1.16xlarge (sw) f1.16xlarge (hw)
Speedup of f1.x16 with 8 InAccel
FPGA kernels
f1.x16large (SW)
64 cores
f1.x16large (SW + 8 InAccel cores)
64 cores + 8 FPGAs with InAccel
MLAccel
MLAccel
MLAccel
MLAccel
MLAccel
MLAccel
MLAccel
MLAccel
www.inaccel.com
Speedup comparison
˃ 3x Speedup compared to r4
˃ 2x lower OpEx
1.00
3.18
cluster of 4 r4 cluster of 4 f1.x2
Speedup comparison normalized on cost
for a cluster of 4 nodes ($2/hour/node)
Cluster of 4 r4 (SW) Cluster of 4 f1 (SW + InAccel)
r4 (32 cores each –
128 cores total)
ML
Accel
ML
Accel
ML
Accel
ML
Accel
f1.x2large f1.x2large
www.inaccel.com
Performance evaluation
www.inaccel.com
Demo on Amazon AWS
Intel 36 cores Xeon on Amazon AWS
c4.8xlarge $1.592/hour
8 cores +
in Amazon AWS FPGA
f1.2xlarge $1.65/hour + inaccel
Note: 4x fast forward for both cases
www.inaccel.com
Cost reduction
˃ Up to 3x lower cost to train your ML model
www.inaccel.com
Try for free on Amazon AWS
Single node version
˃ Single-node Machine learning
accelerators for Amazon
f1.x2large instances providing
APIs for C/C++, Java, Python and
Scala for easy integration
Distributed version for Apache Spark
˃ Machine learning accelerators
for Apache Spark providing all the
required APIs and libraries for the
seamless integration in distributed
systems
Single node ML suite Distributed node ML suite
www.inaccel.com
Simple integration
˃ InAccel OFF: $ spark-submit [ arguments ]
˃ InAccel ON: $ spark-submit --inaccel [ arguments ]
www.inaccel.com
InAccel unique Advantages
www.inaccel.com
Takeaways
˃ Accelerated ML Suite
C/C+
Python
Java
Scala
Spark
3x-10x Speedup
2x Lower Cost
No code changes
www.inaccel.com
chris@inaccel.com
#inaccel

More Related Content

Recently uploaded

一比一原版(UMN文凭证书)明尼苏达大学毕业证如何办理
一比一原版(UMN文凭证书)明尼苏达大学毕业证如何办理一比一原版(UMN文凭证书)明尼苏达大学毕业证如何办理
一比一原版(UMN文凭证书)明尼苏达大学毕业证如何办理
nyfuhyz
 
Open Source Contributions to Postgres: The Basics POSETTE 2024
Open Source Contributions to Postgres: The Basics POSETTE 2024Open Source Contributions to Postgres: The Basics POSETTE 2024
Open Source Contributions to Postgres: The Basics POSETTE 2024
ElizabethGarrettChri
 
Intelligence supported media monitoring in veterinary medicine
Intelligence supported media monitoring in veterinary medicineIntelligence supported media monitoring in veterinary medicine
Intelligence supported media monitoring in veterinary medicine
AndrzejJarynowski
 
一比一原版(UCSF文凭证书)旧金山分校毕业证如何办理
一比一原版(UCSF文凭证书)旧金山分校毕业证如何办理一比一原版(UCSF文凭证书)旧金山分校毕业证如何办理
一比一原版(UCSF文凭证书)旧金山分校毕业证如何办理
nuttdpt
 
End-to-end pipeline agility - Berlin Buzzwords 2024
End-to-end pipeline agility - Berlin Buzzwords 2024End-to-end pipeline agility - Berlin Buzzwords 2024
End-to-end pipeline agility - Berlin Buzzwords 2024
Lars Albertsson
 
06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM
06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM
06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM
Timothy Spann
 
一比一原版(UCSB文凭证书)圣芭芭拉分校毕业证如何办理
一比一原版(UCSB文凭证书)圣芭芭拉分校毕业证如何办理一比一原版(UCSB文凭证书)圣芭芭拉分校毕业证如何办理
一比一原版(UCSB文凭证书)圣芭芭拉分校毕业证如何办理
nuttdpt
 
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data Lake
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data LakeViewShift: Hassle-free Dynamic Policy Enforcement for Every Data Lake
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data Lake
Walaa Eldin Moustafa
 
Palo Alto Cortex XDR presentation .......
Palo Alto Cortex XDR presentation .......Palo Alto Cortex XDR presentation .......
Palo Alto Cortex XDR presentation .......
Sachin Paul
 
办(uts毕业证书)悉尼科技大学毕业证学历证书原版一模一样
办(uts毕业证书)悉尼科技大学毕业证学历证书原版一模一样办(uts毕业证书)悉尼科技大学毕业证学历证书原版一模一样
办(uts毕业证书)悉尼科技大学毕业证学历证书原版一模一样
apvysm8
 
The Ipsos - AI - Monitor 2024 Report.pdf
The  Ipsos - AI - Monitor 2024 Report.pdfThe  Ipsos - AI - Monitor 2024 Report.pdf
The Ipsos - AI - Monitor 2024 Report.pdf
Social Samosa
 
DSSML24_tspann_CodelessGenerativeAIPipelines
DSSML24_tspann_CodelessGenerativeAIPipelinesDSSML24_tspann_CodelessGenerativeAIPipelines
DSSML24_tspann_CodelessGenerativeAIPipelines
Timothy Spann
 
Predictably Improve Your B2B Tech Company's Performance by Leveraging Data
Predictably Improve Your B2B Tech Company's Performance by Leveraging DataPredictably Improve Your B2B Tech Company's Performance by Leveraging Data
Predictably Improve Your B2B Tech Company's Performance by Leveraging Data
Kiwi Creative
 
Udemy_2024_Global_Learning_Skills_Trends_Report (1).pdf
Udemy_2024_Global_Learning_Skills_Trends_Report (1).pdfUdemy_2024_Global_Learning_Skills_Trends_Report (1).pdf
Udemy_2024_Global_Learning_Skills_Trends_Report (1).pdf
Fernanda Palhano
 
Learn SQL from basic queries to Advance queries
Learn SQL from basic queries to Advance queriesLearn SQL from basic queries to Advance queries
Learn SQL from basic queries to Advance queries
manishkhaire30
 
4th Modern Marketing Reckoner by MMA Global India & Group M: 60+ experts on W...
4th Modern Marketing Reckoner by MMA Global India & Group M: 60+ experts on W...4th Modern Marketing Reckoner by MMA Global India & Group M: 60+ experts on W...
4th Modern Marketing Reckoner by MMA Global India & Group M: 60+ experts on W...
Social Samosa
 
一比一原版(GWU,GW文凭证书)乔治·华盛顿大学毕业证如何办理
一比一原版(GWU,GW文凭证书)乔治·华盛顿大学毕业证如何办理一比一原版(GWU,GW文凭证书)乔治·华盛顿大学毕业证如何办理
一比一原版(GWU,GW文凭证书)乔治·华盛顿大学毕业证如何办理
bopyb
 
一比一原版(Unimelb毕业证书)墨尔本大学毕业证如何办理
一比一原版(Unimelb毕业证书)墨尔本大学毕业证如何办理一比一原版(Unimelb毕业证书)墨尔本大学毕业证如何办理
一比一原版(Unimelb毕业证书)墨尔本大学毕业证如何办理
xclpvhuk
 
A presentation that explain the Power BI Licensing
A presentation that explain the Power BI LicensingA presentation that explain the Power BI Licensing
A presentation that explain the Power BI Licensing
AlessioFois2
 
原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样
原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样
原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样
ihavuls
 

Recently uploaded (20)

一比一原版(UMN文凭证书)明尼苏达大学毕业证如何办理
一比一原版(UMN文凭证书)明尼苏达大学毕业证如何办理一比一原版(UMN文凭证书)明尼苏达大学毕业证如何办理
一比一原版(UMN文凭证书)明尼苏达大学毕业证如何办理
 
Open Source Contributions to Postgres: The Basics POSETTE 2024
Open Source Contributions to Postgres: The Basics POSETTE 2024Open Source Contributions to Postgres: The Basics POSETTE 2024
Open Source Contributions to Postgres: The Basics POSETTE 2024
 
Intelligence supported media monitoring in veterinary medicine
Intelligence supported media monitoring in veterinary medicineIntelligence supported media monitoring in veterinary medicine
Intelligence supported media monitoring in veterinary medicine
 
一比一原版(UCSF文凭证书)旧金山分校毕业证如何办理
一比一原版(UCSF文凭证书)旧金山分校毕业证如何办理一比一原版(UCSF文凭证书)旧金山分校毕业证如何办理
一比一原版(UCSF文凭证书)旧金山分校毕业证如何办理
 
End-to-end pipeline agility - Berlin Buzzwords 2024
End-to-end pipeline agility - Berlin Buzzwords 2024End-to-end pipeline agility - Berlin Buzzwords 2024
End-to-end pipeline agility - Berlin Buzzwords 2024
 
06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM
06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM
06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM
 
一比一原版(UCSB文凭证书)圣芭芭拉分校毕业证如何办理
一比一原版(UCSB文凭证书)圣芭芭拉分校毕业证如何办理一比一原版(UCSB文凭证书)圣芭芭拉分校毕业证如何办理
一比一原版(UCSB文凭证书)圣芭芭拉分校毕业证如何办理
 
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data Lake
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data LakeViewShift: Hassle-free Dynamic Policy Enforcement for Every Data Lake
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data Lake
 
Palo Alto Cortex XDR presentation .......
Palo Alto Cortex XDR presentation .......Palo Alto Cortex XDR presentation .......
Palo Alto Cortex XDR presentation .......
 
办(uts毕业证书)悉尼科技大学毕业证学历证书原版一模一样
办(uts毕业证书)悉尼科技大学毕业证学历证书原版一模一样办(uts毕业证书)悉尼科技大学毕业证学历证书原版一模一样
办(uts毕业证书)悉尼科技大学毕业证学历证书原版一模一样
 
The Ipsos - AI - Monitor 2024 Report.pdf
The  Ipsos - AI - Monitor 2024 Report.pdfThe  Ipsos - AI - Monitor 2024 Report.pdf
The Ipsos - AI - Monitor 2024 Report.pdf
 
DSSML24_tspann_CodelessGenerativeAIPipelines
DSSML24_tspann_CodelessGenerativeAIPipelinesDSSML24_tspann_CodelessGenerativeAIPipelines
DSSML24_tspann_CodelessGenerativeAIPipelines
 
Predictably Improve Your B2B Tech Company's Performance by Leveraging Data
Predictably Improve Your B2B Tech Company's Performance by Leveraging DataPredictably Improve Your B2B Tech Company's Performance by Leveraging Data
Predictably Improve Your B2B Tech Company's Performance by Leveraging Data
 
Udemy_2024_Global_Learning_Skills_Trends_Report (1).pdf
Udemy_2024_Global_Learning_Skills_Trends_Report (1).pdfUdemy_2024_Global_Learning_Skills_Trends_Report (1).pdf
Udemy_2024_Global_Learning_Skills_Trends_Report (1).pdf
 
Learn SQL from basic queries to Advance queries
Learn SQL from basic queries to Advance queriesLearn SQL from basic queries to Advance queries
Learn SQL from basic queries to Advance queries
 
4th Modern Marketing Reckoner by MMA Global India & Group M: 60+ experts on W...
4th Modern Marketing Reckoner by MMA Global India & Group M: 60+ experts on W...4th Modern Marketing Reckoner by MMA Global India & Group M: 60+ experts on W...
4th Modern Marketing Reckoner by MMA Global India & Group M: 60+ experts on W...
 
一比一原版(GWU,GW文凭证书)乔治·华盛顿大学毕业证如何办理
一比一原版(GWU,GW文凭证书)乔治·华盛顿大学毕业证如何办理一比一原版(GWU,GW文凭证书)乔治·华盛顿大学毕业证如何办理
一比一原版(GWU,GW文凭证书)乔治·华盛顿大学毕业证如何办理
 
一比一原版(Unimelb毕业证书)墨尔本大学毕业证如何办理
一比一原版(Unimelb毕业证书)墨尔本大学毕业证如何办理一比一原版(Unimelb毕业证书)墨尔本大学毕业证如何办理
一比一原版(Unimelb毕业证书)墨尔本大学毕业证如何办理
 
A presentation that explain the Power BI Licensing
A presentation that explain the Power BI LicensingA presentation that explain the Power BI Licensing
A presentation that explain the Power BI Licensing
 
原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样
原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样
原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样
 

Featured

How to Prepare For a Successful Job Search for 2024
How to Prepare For a Successful Job Search for 2024How to Prepare For a Successful Job Search for 2024
How to Prepare For a Successful Job Search for 2024
Albert Qian
 
Social Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie InsightsSocial Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie Insights
Kurio // The Social Media Age(ncy)
 
Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024
Search Engine Journal
 
5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summary5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summary
SpeakerHub
 
ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd
Clark Boyd
 
Getting into the tech field. what next
Getting into the tech field. what next Getting into the tech field. what next
Getting into the tech field. what next
Tessa Mero
 
Google's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search IntentGoogle's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search Intent
Lily Ray
 
How to have difficult conversations
How to have difficult conversations How to have difficult conversations
How to have difficult conversations
Rajiv Jayarajah, MAppComm, ACC
 
Introduction to Data Science
Introduction to Data ScienceIntroduction to Data Science
Introduction to Data Science
Christy Abraham Joy
 
Time Management & Productivity - Best Practices
Time Management & Productivity -  Best PracticesTime Management & Productivity -  Best Practices
Time Management & Productivity - Best Practices
Vit Horky
 
The six step guide to practical project management
The six step guide to practical project managementThe six step guide to practical project management
The six step guide to practical project management
MindGenius
 
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
RachelPearson36
 
Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...
Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...
Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...
Applitools
 
12 Ways to Increase Your Influence at Work
12 Ways to Increase Your Influence at Work12 Ways to Increase Your Influence at Work
12 Ways to Increase Your Influence at Work
GetSmarter
 
ChatGPT webinar slides
ChatGPT webinar slidesChatGPT webinar slides
ChatGPT webinar slides
Alireza Esmikhani
 
More than Just Lines on a Map: Best Practices for U.S Bike Routes
More than Just Lines on a Map: Best Practices for U.S Bike RoutesMore than Just Lines on a Map: Best Practices for U.S Bike Routes
More than Just Lines on a Map: Best Practices for U.S Bike Routes
Project for Public Spaces & National Center for Biking and Walking
 
Ride the Storm: Navigating Through Unstable Periods / Katerina Rudko (Belka G...
Ride the Storm: Navigating Through Unstable Periods / Katerina Rudko (Belka G...Ride the Storm: Navigating Through Unstable Periods / Katerina Rudko (Belka G...
Ride the Storm: Navigating Through Unstable Periods / Katerina Rudko (Belka G...
DevGAMM Conference
 
Barbie - Brand Strategy Presentation
Barbie - Brand Strategy PresentationBarbie - Brand Strategy Presentation
Barbie - Brand Strategy Presentation
Erica Santiago
 
Good Stuff Happens in 1:1 Meetings: Why you need them and how to do them well
Good Stuff Happens in 1:1 Meetings: Why you need them and how to do them wellGood Stuff Happens in 1:1 Meetings: Why you need them and how to do them well
Good Stuff Happens in 1:1 Meetings: Why you need them and how to do them well
Saba Software
 
Introduction to C Programming Language
Introduction to C Programming LanguageIntroduction to C Programming Language
Introduction to C Programming Language
Simplilearn
 

Featured (20)

How to Prepare For a Successful Job Search for 2024
How to Prepare For a Successful Job Search for 2024How to Prepare For a Successful Job Search for 2024
How to Prepare For a Successful Job Search for 2024
 
Social Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie InsightsSocial Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie Insights
 
Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024
 
5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summary5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summary
 
ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd
 
Getting into the tech field. what next
Getting into the tech field. what next Getting into the tech field. what next
Getting into the tech field. what next
 
Google's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search IntentGoogle's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search Intent
 
How to have difficult conversations
How to have difficult conversations How to have difficult conversations
How to have difficult conversations
 
Introduction to Data Science
Introduction to Data ScienceIntroduction to Data Science
Introduction to Data Science
 
Time Management & Productivity - Best Practices
Time Management & Productivity -  Best PracticesTime Management & Productivity -  Best Practices
Time Management & Productivity - Best Practices
 
The six step guide to practical project management
The six step guide to practical project managementThe six step guide to practical project management
The six step guide to practical project management
 
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
 
Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...
Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...
Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...
 
12 Ways to Increase Your Influence at Work
12 Ways to Increase Your Influence at Work12 Ways to Increase Your Influence at Work
12 Ways to Increase Your Influence at Work
 
ChatGPT webinar slides
ChatGPT webinar slidesChatGPT webinar slides
ChatGPT webinar slides
 
More than Just Lines on a Map: Best Practices for U.S Bike Routes
More than Just Lines on a Map: Best Practices for U.S Bike RoutesMore than Just Lines on a Map: Best Practices for U.S Bike Routes
More than Just Lines on a Map: Best Practices for U.S Bike Routes
 
Ride the Storm: Navigating Through Unstable Periods / Katerina Rudko (Belka G...
Ride the Storm: Navigating Through Unstable Periods / Katerina Rudko (Belka G...Ride the Storm: Navigating Through Unstable Periods / Katerina Rudko (Belka G...
Ride the Storm: Navigating Through Unstable Periods / Katerina Rudko (Belka G...
 
Barbie - Brand Strategy Presentation
Barbie - Brand Strategy PresentationBarbie - Brand Strategy Presentation
Barbie - Brand Strategy Presentation
 
Good Stuff Happens in 1:1 Meetings: Why you need them and how to do them well
Good Stuff Happens in 1:1 Meetings: Why you need them and how to do them wellGood Stuff Happens in 1:1 Meetings: Why you need them and how to do them well
Good Stuff Happens in 1:1 Meetings: Why you need them and how to do them well
 
Introduction to C Programming Language
Introduction to C Programming LanguageIntroduction to C Programming Language
Introduction to C Programming Language
 

Acceleration of Spark ML using FPGAs on the cloud, instantly

  • 1. Dr. Chris Kachris CEO, co-founder www.inaccel.com FPGA Acceleration of Apache Spark ML on the Cloud, Instantly
  • 2. …or How to speedup your Spark ML applications, reduce the cost, without changing your code
  • 3. www.inaccel.com Why acceleration ˃ 91% of Spark users for Big Data analytics care about Performance Source: Databricks, Apache Spark Survey 2016, Report
  • 4. www.inaccel.com Heterogeneous requirements in Data Centers Source: intel, inc.
  • 5. www.inaccel.com Available Platforms CPUs + Flexible & Cheap - low performance GPUs + Flexible - Expensive & hard to program Specialized chips/FPGA + High Performance - low flexibility Flexibility Performance
  • 8. www.inaccel.com Accelerators versus Processors • Accelerators are customized to the application requirements [Source: National Instruments: Smart Grid Ready Instrumentation”
  • 9. www.inaccel.com Flexibility versus High performance CPU: High flexibility, low throughput Chip Accelerators: High throughput, high NRE Initial one-off costs Programmable accelerators (FPGAs) can provide massive parallelism and higher efficiency than CPUs for certain categories of applications InAccel offers novel accelerators as IP cores to boost the performance of their applications (similar to specialized containers)
  • 10. www.inaccel.com Hardware Accelerators A GPU is effective at processing the same set of operations in parallel – single instruction, multiple data (SIMD). A GPU has a well-defined instruction-set, and fixed word sizes – for example single, double, or half-precision integer and floating point values. An FPGA is effective at processing the same or different operations in parallel – multiple instructions, multiple data (MIMD). An FPGA does not have a predefined instruction-set, or a fixed data width. Control ALU ALU Cache DRAM ALU ALU CPU (one core) FPGA DRAM DRAM GPU Each FPGA has more than 2M of these cells Each GPU has 2880 of these cores DRAM BlockRAM BlockRAM DRAM DRAM
  • 11. www.inaccel.com Programming FPGAs ˃ Using Hardware description Languages (VHDL, Verilog) ˃ Using specialized OpenCL, HLS (High level synthesis) ˃ As IP core (plug-n-play) Performance Flexibility Deployment ++ -- -- + + - + ++ ++
  • 12. www.inaccel.com helps companies speedup their applications by providing ready-to-use accelerators-as-a-service in the cloud 3x-10x Speedup 2x Lower Cost Zero code changes
  • 13. www.inaccel.com InAccel, Inc. ˃Founded in 2018 ˃Offices in US and Greece ˃Member of:
  • 14. www.inaccel.com Market size ˃ The data center accelerator market is expected to reach USD 21.19 billion by 2023 from USD 2.84 billion by 2018, at a CAGR of 49.47% from 2018 to 2023. ˃ The market for FPGA is expected to grow at the highest CAGR during the forecast period owing to the increasing adoption of FPGAs for the acceleration of enterprise workloads. [Source: Data Center Accelerator Market by Processor Type (CPU, GPU, FPGA, ASIC)- Global Forecast to 2023, Research and Markets]
  • 15. www.inaccel.com InAccel integrated framework Supported APIs Zero-code changes ˃ C/C++ ˃ Scala ˃ Python ˃ Java ˃ Apache Spark Scalable Resource Manager cores Alveo U200 F1 instances
  • 16. www.inaccel.com Apache Spark ˃ Spark is the most widely used framework for Data Analytics ˃ Develop hardware components as IP cores for widely used applications Spark ‒ Logistic regression ‒ Recommendation ‒ K-means ‒ Linear regression ‒ PageRank ‒ Graph computing
  • 17. www.inaccel.com Acceleration for machine learning inaccel offers Accelerators-as-a- Service for Apache Spark in the cloud (e.g. Amazon AWS f1) using FPGAs
  • 18. www.inaccel.com Hardware acceleration module filter1 (clock, rst, strm_in, strm_out) for (i=0; i<NUMUNITS; i=i+1) always@(posedge clock) integer i,j; //index for loops tmp_kernel[j] = k[i*OFFSETX]; FPGA handles compute- intensive, deeply pipelined, hardware-accelerated operations CPU handles the rest application InAccel 800 sec80 sec 200 sec Source: amazon, Inc.
  • 19. www.inaccel.com Accelerators for Spark ML in Amazon AWS in 3 steps f1 (8 cores+FPGA) 1.Create an f1 instance using InAccel’s Image (AMI) 2.Import InAccel API 3.Run your applications on AWS f1 to get 3x – 20x speedup
  • 20. www.inaccel.com FPGA in the cloud – the AWS model ˃ Amazon EC F1’s Xilinx FPGA
  • 21. www.inaccel.com Cloud Marketplace: available now Amazon EC2 FPGA Deployment via Marketplace InAccel Products Customers AWS Marketplace Scalable to worldwide market First to provide accelerators for Spark
  • 22. www.inaccel.com IP cores available in Amazon AWS Logistic RegressionK-mean clustering K-means is one of the simplest unsupervised learning algorithms that solve the well known clustering problem. Gradient Descent​ IP block for faster training of machine learning algorithms. Recommendation Engines (ALS) Alternative-Least- Square IP core for the acceleration of recommendation engines based on collaborative filtering.Available in Amazon AWS marketplace for free trial: www.inaccel.com
  • 24. www.inaccel.com Communication with Host in Amazon AWS Accelerators for logistic regression/kmeans
  • 25. www.inaccel.com Docker-based implementation for easy integration ˃ Inaccel’s FPGA manager docker container comprises both an FPGA manager to schedule, orchestrate, and monitor the execution of the accelerated applications but also the required FPGA runtime system. ˃ The dockerized runtime system detects the FPGA platform (aws F1) and manages the interaction/communication with the FPGA (i.e., loading the accelerator, transferring input data and results), making it transparent to the application. ˃ Docker swarm, Kubernetes, naïve execution
  • 26. www.inaccel.com Cluster mode ˃ Cluster mode Worker Driver (sparkContext) Docker Runtime & FPGA Manager Executor Worker Executor Worker Executor FPGA f1.x2 FPGA f1.x2 FPGA f1.x2 FPGA accelerated Infrastructure IP Library
  • 27. www.inaccel.com Speedup comparison ˃ Up to 10x speedup compared to 32 cores based on f1.x2 Cluster of 4 f1 (SW) Cluster of 4 f1 (SW + InAccel) f1.x2largef1.x2large ML Accel ML Accel ML Accel ML Accel f1.x2large f1.x2large 1 10.2x 4x f1.x2large (32 cores) 4x f1.x2large (32cores+InAccel) Speedup on cluster of f1.x2 using InAccel
  • 28. www.inaccel.com Speed up ˃ Up to 12x speedup compared to 64 cores on f1.x16 1.00 12.14 f1.16xlarge (sw) f1.16xlarge (hw) Speedup of f1.x16 with 8 InAccel FPGA kernels f1.x16large (SW) 64 cores f1.x16large (SW + 8 InAccel cores) 64 cores + 8 FPGAs with InAccel MLAccel MLAccel MLAccel MLAccel MLAccel MLAccel MLAccel MLAccel
  • 29. www.inaccel.com Speedup comparison ˃ 3x Speedup compared to r4 ˃ 2x lower OpEx 1.00 3.18 cluster of 4 r4 cluster of 4 f1.x2 Speedup comparison normalized on cost for a cluster of 4 nodes ($2/hour/node) Cluster of 4 r4 (SW) Cluster of 4 f1 (SW + InAccel) r4 (32 cores each – 128 cores total) ML Accel ML Accel ML Accel ML Accel f1.x2large f1.x2large
  • 31. www.inaccel.com Demo on Amazon AWS Intel 36 cores Xeon on Amazon AWS c4.8xlarge $1.592/hour 8 cores + in Amazon AWS FPGA f1.2xlarge $1.65/hour + inaccel Note: 4x fast forward for both cases
  • 32. www.inaccel.com Cost reduction ˃ Up to 3x lower cost to train your ML model
  • 33. www.inaccel.com Try for free on Amazon AWS Single node version ˃ Single-node Machine learning accelerators for Amazon f1.x2large instances providing APIs for C/C++, Java, Python and Scala for easy integration Distributed version for Apache Spark ˃ Machine learning accelerators for Apache Spark providing all the required APIs and libraries for the seamless integration in distributed systems Single node ML suite Distributed node ML suite
  • 34. www.inaccel.com Simple integration ˃ InAccel OFF: $ spark-submit [ arguments ] ˃ InAccel ON: $ spark-submit --inaccel [ arguments ]
  • 36. www.inaccel.com Takeaways ˃ Accelerated ML Suite C/C+ Python Java Scala Spark 3x-10x Speedup 2x Lower Cost No code changes