FPGAs on The Cloud
Ioannis Tsagatakis
Ioannis Stefanis
MSc in Informatics & Multimedia
Department of Informatics Engineering, TEI of Crete
Embedded Systems
2
Accelerated Computing: GPUs and FPGAs
3
Massive Parallelism
● GPU
– SIMD
– Instruction Set
– Fixed Word Sizes
– Simple control logic
● FPGA
– MIMD
– No instruction set
– Any data width
– Complex control logic (FSMs)
4
AWS F1 FPGA Instances
● Cloud-based FPGA
– No need to buy hardware
● Cloud-based IDE
– Ready-to-use AMI
– HDL: Verilog, VHDL
– SDAccel: C/C++, OpenCL
– AFI tools
● Marketplace
– A new market for IP cores
– Secure encrypted AFIs
● f1.2xlarge
– 1 VU9P UltraScale+
● 2.5M logic elements
● 6,800 DSP
– 8 vCPU Cores
– 122GB RAM
– PCIe x16
– $1.6 per hour
● f1.16xlarge
– 8 FPGAs / 64 vCPUs
● Run design simulation on C4 instances to save money
5
The SDAccel Development Environment
● Cloud IDE
or
● Local Install
● Virtual JTAG interface
6
AWS F1 Platform Model
7
The AWS F1 Shell
● Predefined interface
● Amazon FPGA Image (AFI)
– Secured, encrypted
– Dynamically (re)loaded
– User can't see the bits
8
AFI Creation Flow
9
Kernel Creation: The 2 workflows
● Custom IP must be packaged as an SDAccel kernel
● Strict interface requirements
● Design for performance
● SDAccel provides a Kernel Wizard
● Kernel container file (XO file)
– XML metadata, Vivado project
– RTL files
● Or generate the kernel from OpenCL
● Advanced optimizations
– Memory partitioning
– Loop unrolling
– DSP block inference
10
An OpenCL Kernel
● Language support
– Embedded profile (1.0)
– Pipes (2.0)
– Image Objects (2.0)
● N-dimensional ranges (NDRange)
● SIMD vector types
● Math library functions
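The NDRange model listed above can be sketched in plain Python, as an illustration only: the kernel body runs once per work-item, and each work-item finds its data through its global ID (what `get_global_id(0)` returns in OpenCL C). The names `run_ndrange` and `vector_add` are hypothetical helpers for this sketch, not part of SDAccel or OpenCL, and the loop here is sequential, whereas a device is free to run work-items in parallel or as a pipeline.

```python
from itertools import product


def run_ndrange(kernel, global_size, *args):
    """Call `kernel` once per work-item, passing its global ID as a tuple."""
    for gid in product(*(range(n) for n in global_size)):
        kernel(gid, *args)


def vector_add(gid, a, b, out):
    # OpenCL C equivalent: int i = get_global_id(0); out[i] = a[i] + b[i];
    i = gid[0]
    out[i] = a[i] + b[i]


a = [1, 2, 3, 4]
b = [10, 20, 30, 40]
out = [0] * len(a)

# One-dimensional range with one work-item per vector element.
run_ndrange(vector_add, (len(a),), a, b, out)
print(out)
```

A two- or three-dimensional range only changes `global_size`, for example `(height, width)` for an image, and the kernel then reads `gid[0]` and `gid[1]`.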
11
Compiling the Platform
12
Creating the Amazon FPGA Image
● Created by an Amazon service
● Securely stored and encrypted
● Developers have no access to RTL IP
● The distributable awsxclbin contains only the AFI ID
13
SDAccel Testing and Execution Modes
14
OpenCL Memory Model
15
OpenCL vs CUDA
● CUDA
– SIMD
– Easier programming model
– Restricted memory access patterns
– Faster development
– Vendor lock-in
– Easy deployment
● F1 FPGA
– MIMD
– More complexity
– Harder programming
– Deep pipelining
– Slow development
– Vendor lock-in
– Cloud deployment
16
Smith–Waterman algorithm (sw_emu)
------FPGA Accelerator Summary --------
Number of SmithWaterman instances on FPGA:16
Total processing elements:512
Length of reference string:256
Length of read(query) string:128
Read-Ref pair block size(HOST to FPGA):1024
Verify Mode is:0
---------------------------------------
Generating read-ref samples
Processing 16384 Samples
HW Block Size: 16384
Total Number of blocks: 1
INFO: [smithwaterman.cpp:654] TIME: [Wed Feb 21 22:37:07 2018] nruns = 1
INFO: [smithwaterman.cpp:655] TIME: [Wed Feb 21 22:37:07 2018] total [ms] = 43326.373
INFO: [smithwaterman.cpp:656] TIME: [Wed Feb 21 22:37:07 2018] Host write [ms] = 0.768
INFO: [smithwaterman.cpp:657] TIME: [Wed Feb 21 22:37:07 2018] Krnl exec [ms] = 43317.977
INFO: [smithwaterman.cpp:658] TIME: [Wed Feb 21 22:37:07 2018] Host read [ms] = 1.029
GCups(based on kernel execution time):0.0115426
GCups(based on total execution time):0.0115403
INFO: [smithwaterman.cpp:679] TIME: [Wed Feb 21 22:37:07 2018] Host2Device rate [mbps] = 15616.602
INFO: [smithwaterman.cpp:691] TIME: [Wed Feb 21 22:37:07 2018] Device2Host rate [mbps] = 1457.154
INFO: [main.cpp:172] TIME: [Wed Feb 21 22:37:07 2018] finished
~/aws-fpga/SDAccel/examples/xilinx/acceleration/smithwaterman
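The GCups figures in the log above can be reproduced from the sample count and the string lengths. The sketch below assumes that one cell update is one entry of the reference-by-read scoring matrix and that "G" means 2^30. The second assumption is inferred from the numbers in the log, not taken from the example's source code. The function name `gcups` is hypothetical.

```python
def gcups(samples, ref_len, read_len, elapsed_ms, giga=2**30):
    """Giga cell updates per second for a batch of read/reference pairs."""
    cells = samples * ref_len * read_len      # one DP cell per (ref, read) position
    return cells / giga / (elapsed_ms / 1000.0)


# Values taken from the sw_emu log: 16384 samples, 256-long reference, 128-long read.
kernel_gcups = gcups(16384, 256, 128, 43317.977)   # "Krnl exec [ms]"
total_gcups = gcups(16384, 256, 128, 43326.373)    # "total [ms]"
print(f"{kernel_gcups:.7f} {total_gcups:.7f}")
```

Under these assumptions both values agree with the log to the digits it prints. The rate is this low because sw_emu runs the kernel as software on the host CPU, so it says nothing about the speed of the real FPGA.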
17
Smith–Waterman algorithm (hw_emu)
~/aws-fpga/SDAccel/examples/xilinx/acceleration/smithwaterman
xsimk
Generating read-ref samples
Processing 16384 Samples
HW Block Size: 16384
Total Number of blocks: 1
INFO: [SDx-EM 22] [Wall clock time: 23:05, Emulation time: 0.275298 ms] Data transfer between kernel(s) and global memory(s)
BANK0 RD = 64.316 KB WR = 7.875 KB
BANK1 RD = 0.000 KB WR = 0.000 KB
BANK2 RD = 0.000 KB WR = 0.000 KB
BANK3 RD = 0.000 KB WR = 0.000 KB
... after many hours ...
INFO: [SDx-EM 22] [Wall clock time: 00:27, Emulation time: 4.77014 ms] Data transfer between kernel(s) and global memory(s)
BANK0 RD = 1110.004 KB WR = 138.562 KB
BANK1 RD = 0.000 KB WR = 0.000 KB
BANK2 RD = 0.000 KB WR = 0.000 KB
BANK3 RD = 0.000 KB WR = 0.000 KB
….
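For reference, the computation that both emulation runs exercise is Smith-Waterman local alignment scoring. The following is a minimal software sketch, not the Xilinx kernel: it assumes a linear gap penalty, and the scoring values (`match`, `mismatch`, `gap`) are illustrative defaults rather than those used by the example. It keeps only one previous row of the scoring matrix, because each cell depends on its left, upper, and upper-left neighbours.

```python
def smith_waterman_score(ref, read, match=2, mismatch=-1, gap=-1):
    """Best local alignment score between `ref` and `read` (linear gap penalty)."""
    prev = [0] * (len(read) + 1)   # previous matrix row; row 0 is all zeros
    best = 0
    for r in ref:
        curr = [0]                 # column 0 is always zero
        for j, q in enumerate(read, start=1):
            diag = prev[j - 1] + (match if r == q else mismatch)
            up = prev[j] + gap     # gap in the read
            left = curr[j - 1] + gap   # gap in the reference
            score = max(0, diag, up, left)   # a local alignment never drops below 0
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


print(smith_waterman_score("ACGT", "ACGT"))   # four matches at +2 each
```

Cells on the same anti-diagonal of the matrix do not depend on each other, which is what allows a hardware design to update many cells per clock cycle with an array of processing elements.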
18
Building Times
For the helloworld example
INFO: [XOCC 60-629] Linking for hardware target
INFO: [XOCC 60-895] Target platform: /home/centos/src/project_data/aws-fpga/SDAccel/aws_platform/xilinx_aws-vu9p-f1_4ddr-xpr-2pr_4_0/xilinx_aws-vu9p-f1_4ddr-xpr-2pr_4_0.xpfm
INFO: [XOCC 60-423] Target device: xilinx:aws-vu9p-f1:4ddr-xpr-2pr:4.0
INFO: [XOCC 60-251] Hardware accelerator integration...
Creating Vivado project and starting FPGA synthesis.
................................................................................................................................
Finished 1st of 5 tasks (FPGA synthesis). Elapsed time: 00h 34m 54s.
.....
Finished 2nd of 5 tasks (FPGA logic optimization). Elapsed time: 00h 05m 37s.
...............................
Finished 3rd of 5 tasks (FPGA logic placement). Elapsed time: 00h 43m 50s.
................................
Finished 4th of 5 tasks (FPGA routing). Elapsed time: 00h 56m 33s.
INFO: [XOCC 60-586] Created xclbin/vector_addition.hw.xilinx_aws-vu9p-f1_4ddr-xpr-2pr_4_0.xclbin
INFO: [XOCC 60-791] Total elapsed time: 2h 31m 50s
And then you have to build the AFI ...
Good luck building the Smith-Waterman example
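The stage timings in the build log can be added up to see where the 2h 31m 50s went. The sketch below assumes each "Elapsed time" is the duration of that stage alone (the values are not monotonic, so they cannot be cumulative) and copies the four stage lines from the log. The helper name `to_seconds` is hypothetical.

```python
import re

# The four "Finished ... tasks" lines as they appear in the XOCC log.
LOG = """\
Finished 1st of 5 tasks (FPGA synthesis). Elapsed time: 00h 34m 54s.
Finished 2nd of 5 tasks (FPGA logic optimization). Elapsed time: 00h 05m 37s.
Finished 3rd of 5 tasks (FPGA logic placement). Elapsed time: 00h 43m 50s.
Finished 4th of 5 tasks (FPGA routing). Elapsed time: 00h 56m 33s.
"""


def to_seconds(h, m, s):
    """Convert hours, minutes, seconds (strings or ints) to seconds."""
    return int(h) * 3600 + int(m) * 60 + int(s)


stage_seconds = [to_seconds(*t) for t in re.findall(r"(\d+)h (\d+)m (\d+)s", LOG)]
staged = sum(stage_seconds)
total = to_seconds(2, 31, 50)      # "Total elapsed time: 2h 31m 50s"
unlisted = total - staged          # fifth task plus any overhead not itemised in the log

print(f"listed stages: {staged // 3600}h {staged % 3600 // 60}m {staged % 60}s")
print(f"not itemised:  {unlisted // 60}m {unlisted % 60}s")
```

Placement and routing together account for well over half of the listed time, and this is for a vector-addition kernel. AFI creation is a separate step on top of this total.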
20
Conclusions
● Moderate* costs
● Easy setup with minor issues
● Cloud-based IDE (RDP) or SSH
● Slow development
● Harder to learn than CUDA
● Good documentation and examples
● Marketplace is still small but promising
● No 3rd-party examples
* Moderate cost, compared with buying hardware:
$3,500 Xilinx Virtex-7 FPGA VC707 Evaluation Kit
$13,000 Xilinx Virtex-7 FPGA VC7222 Characterization Kit
$1,500 Intel Xeon Phi 7120P Coprocessor
$1,400 Nvidia GeForce Titan X Pascal
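One way to read the "moderate cost" comparison is as a break-even point: how many f1.2xlarge hours the price of each board would buy. The sketch below is rough arithmetic under stated assumptions. It uses the 1.6 dollars per hour quoted earlier in the deck, reads the last two list prices as 1,500 and 1,400 dollars, and ignores storage, data transfer, the separate build instance, and the host machine a purchased board would need.

```python
F1_HOURLY_RATE = 1.6   # f1.2xlarge price per hour, as quoted in the slides

# List prices from the slide, in dollars.
hardware_prices = {
    "Xilinx Virtex-7 VC707 Evaluation Kit": 3500,
    "Xilinx Virtex-7 VC7222 Characterization Kit": 13000,
    "Intel Xeon Phi 7120P Coprocessor": 1500,
    "Nvidia GeForce Titan X Pascal": 1400,
}

# Hours of F1 time that cost the same as buying each device outright.
break_even_hours = {
    name: price / F1_HOURLY_RATE for name, price in hardware_prices.items()
}

for name, hours in break_even_hours.items():
    print(f"{name}: {hours:.1f} h")
```

With multi-hour hardware builds, those hours are consumed faster than the hourly rate alone suggests, although builds can run on cheaper non-FPGA instances.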
21
Future work
22
FPGA vs GPU
"Accelerating Compute-Intensive Applications with GPUs and FPGAs"
S. Che, J. Li, J. W. Sheaffer, K. Skadron and J. Lach,
2008 Symposium on Application Specific Processors
● CUDA and the GeForce 8800 GTX GPU
● VHDL and the Xilinx Virtex-II Pro FPGA
23
FPGAs vs GPU
24
FPGA vs GPU
25
Are FPGAs and reconfigurable computing the future?
Video on the cloud? Deep learning?
26
Questions?

Securing your Kubernetes cluster_ a step-by-step guide to success !
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
 
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
 

FPGA on the Cloud

  • 1. FPGAs on the Cloud. Ioannis Tsagatakis, Ioannis Stefanis. MSc in Informatics & Multimedia, Department of Informatics Engineering, TEI of Crete. Embedded Systems
  • 3. Massive Parallelism ● GPU – SIMD – Instruction set – Fixed word sizes – Simple control logic ● FPGA – MIMD – No instruction set – Any data width – Complex control logic (FSMs)
  • 4. AWS F1 FPGA Instances ● Cloud-based FPGA – No need to buy hardware ● Cloud-based IDE – Ready-to-use AMI – HDL: Verilog, VHDL – SDAccel: C/C++, OpenCL – AFI tools ● Marketplace – A new market for IPs – Secure, encrypted AFIs ● f1.2xlarge – 1 VU9P UltraScale+ ● 2.5M logic elements ● 6,800 DSP – 8 vCPU cores – 122 GB RAM – PCIe x16 – $1.6 per hour ● f1.16xlarge – 8 FPGAs / 64 CPUs ● Run design simulation on C4 instances to save money
  • 5. The SDAccel Development Environment ● Cloud IDE or ● Local install ● Virtual JTAG interface
  • 7. The AWS F1 Shell ● Amazon AFI image ● Predefined interface ● Secured, encrypted (user can’t see the bits) ● Dynamically (re)loaded
  • 9. Kernel Creation: The 2 workflows ● Custom IP must be packaged as an SDAccel kernel ● Strict interface requirements ● Design for performance ● SDAccel provides a Kernel Wizard ● Kernel container file (XO file) - XML metadata, Vivado project - RTL files ● Or generate the kernel from OpenCL ● Advanced optimizations - Memory partitioning - Loop unrolling - DSP block inference
  • 10. An OpenCL Kernel ● Language support – Embedded profile (1.0) – Pipes (2.0) – Image objects (2.0) ● N-dimensional ranges ● SIMD vector types ● Math library functions
  • 12. Creating the Amazon FPGA Image ● Created by an Amazon service ● Securely stored and encrypted ● Developers have no access to RTL IP ● The distributable awsxclbin contains only the AFI ID
  • 13. SDAccel Testing and Execution Modes
  • 15. OpenCL vs CUDA ● CUDA – SIMD – Easier programming model – Restricted memory access patterns – Faster development – Vendor lock-in – Easy deployment ● F1 FPGA – MIMD – More complexity – Harder programming – Deep pipelining – Slow development – Vendor lock-in – Cloud deployment
  • 16. Smith–Waterman algorithm (sw_emu)
      ------FPGA Accelerator Summary --------
      Number of SmithWaterman instances on FPGA:16
      Total processing elements:512
      Length of reference string:256
      Length of read(query) string:128
      Read-Ref pair block size(HOST to FPGA):1024
      Verify Mode is:0
      ---------------------------------------
      Generating read-ref samples
      Processing 16384 Samples
      HW Block Size: 16384
      Total Number of blocks: 1
      INFO: [smithwaterman.cpp:654] TIME: [Wed Feb 21 22:37:07 2018] nruns = 1
      INFO: [smithwaterman.cpp:655] TIME: [Wed Feb 21 22:37:07 2018] total [ms] = 43326.373
      INFO: [smithwaterman.cpp:656] TIME: [Wed Feb 21 22:37:07 2018] Host write [ms] = 0.768
      INFO: [smithwaterman.cpp:657] TIME: [Wed Feb 21 22:37:07 2018] Krnl exec [ms] = 43317.977
      INFO: [smithwaterman.cpp:658] TIME: [Wed Feb 21 22:37:07 2018] Host read [ms] = 1.029
      GCups(based on kernel execution time):0.0115426
      GCups(based on total execution time):0.0115403
      INFO: [smithwaterman.cpp:679] TIME: [Wed Feb 21 22:37:07 2018] Host2Device rate [mbps] = 15616.602
      INFO: [smithwaterman.cpp:691] TIME: [Wed Feb 21 22:37:07 2018] Device2Host rate [mbps] = 1457.154
      INFO: [main.cpp:172] TIME: [Wed Feb 21 22:37:07 2018] finished
      ~/aws-fpga/SDAccel/examples/xilinx/acceleration/smithwaterman
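The GCups figures in the sw_emu log can be cross-checked from the other logged values. The sketch below is not taken from the example's source code. It assumes that one cell update is one (reference, read) character pair, and that "giga" means 2**30. The second assumption is inferred only because it reproduces the logged numbers.

```python
# Values copied from the sw_emu log on the slide.
SAMPLES = 16384        # "Processing 16384 Samples"
REF_LEN = 256          # "Length of reference string"
READ_LEN = 128         # "Length of read(query) string"
KERNEL_MS = 43317.977  # "Krnl exec [ms]"
TOTAL_MS = 43326.373   # "total [ms]"

# Assumption: giga is taken as 2**30, inferred because it matches the log.
GIGA = 2 ** 30


def gcups(cells, elapsed_ms):
    """Giga cell updates per second for a given elapsed time in milliseconds."""
    return cells / GIGA / (elapsed_ms / 1000.0)


# Assumption: every sample fills a full REF_LEN x READ_LEN scoring matrix.
cells = SAMPLES * REF_LEN * READ_LEN
kernel_gcups = gcups(cells, KERNEL_MS)
total_gcups = gcups(cells, TOTAL_MS)

print(f"cells updated: {cells}")
print(f"GCups (kernel time): {kernel_gcups:.7f}")
print(f"GCups (total time):  {total_gcups:.7f}")
```

Under these assumptions the results agree with the logged 0.0115426 and 0.0115403 GCups. These are software-emulation timings, so they say nothing about the throughput of the kernel on real hardware.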
  • 17. Smith–Waterman algorithm (hw_emu)
      ~/aws-fpga/SDAccel/examples/xilinx/acceleration/smithwaterman
      xsimk
      Generating read-ref samples
      Processing 16384 Samples
      HW Block Size: 16384
      Total Number of blocks: 1
      INFO: [SDx-EM 22] [Wall clock time: 23:05, Emulation time: 0.275298 ms] Data transfer between kernel(s) and global memory(s)
      BANK0 RD = 64.316 KB WR = 7.875 KB
      BANK1 RD = 0.000 KB WR = 0.000 KB
      BANK2 RD = 0.000 KB WR = 0.000 KB
      BANK3 RD = 0.000 KB WR = 0.000 KB
      …. after many hours …
      INFO: [SDx-EM 22] [Wall clock time: 00:27, Emulation time: 4.77014 ms] Data transfer between kernel(s) and global memory(s)
      BANK0 RD = 1110.004 KB WR = 138.562 KB
      BANK1 RD = 0.000 KB WR = 0.000 KB
      BANK2 RD = 0.000 KB WR = 0.000 KB
      BANK3 RD = 0.000 KB WR = 0.000 KB
      ….
  • 18. Building Times for the helloworld example
      INFO: [XOCC 60-629] Linking for hardware target
      INFO: [XOCC 60-895] Target platform: /home/centos/src/project_data/aws-fpga/SDAccel/aws_platform/xilinx_aws-vu9p-f1_4ddr-xpr-2pr_4_0/xilinx_aws-vu9p-f1_4ddr-xpr-2pr_4_0.xpfm
      INFO: [XOCC 60-423] Target device: xilinx:aws-vu9p-f1:4ddr-xpr-2pr:4.0
      INFO: [XOCC 60-251] Hardware accelerator integration...
      Creating Vivado project and starting FPGA synthesis.
      .... Finished 1st of 5 tasks (FPGA synthesis). Elapsed time: 00h 34m 54s.
      .... Finished 2nd of 5 tasks (FPGA logic optimization). Elapsed time: 00h 05m 37s.
      .... Finished 3rd of 5 tasks (FPGA logic placement). Elapsed time: 00h 43m 50s.
      .... Finished 4th of 5 tasks (FPGA routing). Elapsed time: 00h 56m 33s.
      INFO: [XOCC 60-586] Created xclbin/vector_addition.hw.xilinx_aws-vu9p-f1_4ddr-xpr-2pr_4_0.xclbin
      INFO: [XOCC 60-791] Total elapsed time: 2h 31m 50s
      And then you have to build the AFI ... Give up building the
  • 19. Building Times for the helloworld example
      INFO: [XOCC 60-629] Linking for hardware target
      INFO: [XOCC 60-895] Target platform: /home/centos/src/project_data/aws-fpga/SDAccel/aws_platform/xilinx_aws-vu9p-f1_4ddr-xpr-2pr_4_0/xilinx_aws-vu9p-f1_4ddr-xpr-2pr_4_0.xpfm
      INFO: [XOCC 60-423] Target device: xilinx:aws-vu9p-f1:4ddr-xpr-2pr:4.0
      INFO: [XOCC 60-251] Hardware accelerator integration...
      Creating Vivado project and starting FPGA synthesis.
      .... Finished 1st of 5 tasks (FPGA synthesis). Elapsed time: 00h 34m 54s.
      .... Finished 2nd of 5 tasks (FPGA logic optimization). Elapsed time: 00h 05m 37s.
      .... Finished 3rd of 5 tasks (FPGA logic placement). Elapsed time: 00h 43m 50s.
      .... Finished 4th of 5 tasks (FPGA routing). Elapsed time: 00h 56m 33s.
      INFO: [XOCC 60-586] Created xclbin/vector_addition.hw.xilinx_aws-vu9p-f1_4ddr-xpr-2pr_4_0.xclbin
      INFO: [XOCC 60-791] Total elapsed time: 2h 31m 50s
      And then you have to build the AFI ... Good luck building the Smith-Waterman example
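To see where the helloworld build time goes, the four logged stage times can be summed and compared with the reported total. This is a minimal sketch that assumes each "Elapsed time" is per task rather than cumulative (the 2nd task's 5m 37s following the 1st task's 34m 54s suggests so). The fifth task does not appear in the log, so the remainder is simply labelled as unaccounted for.

```python
import re

# Stage times copied from the XOCC log on the slide.
STAGES = {
    "FPGA synthesis": "00h 34m 54s",
    "FPGA logic optimization": "00h 05m 37s",
    "FPGA logic placement": "00h 43m 50s",
    "FPGA routing": "00h 56m 33s",
}
REPORTED_TOTAL = "2h 31m 50s"


def to_seconds(text):
    """Convert a string such as '00h 34m 54s' to seconds."""
    hours, minutes, seconds = (int(n) for n in re.findall(r"\d+", text))
    return hours * 3600 + minutes * 60 + seconds


def fmt(seconds):
    """Format seconds as 'Hh MMm SSs'."""
    hours, rest = divmod(seconds, 3600)
    minutes, secs = divmod(rest, 60)
    return f"{hours}h {minutes:02d}m {secs:02d}s"


stage_total = sum(to_seconds(t) for t in STAGES.values())
reported_total = to_seconds(REPORTED_TOTAL)
# Assumption: the remainder covers the unlogged 5th task plus setup overhead.
unaccounted = reported_total - stage_total

print("Sum of logged stages:", fmt(stage_total))
print("Reported total:      ", fmt(reported_total))
print("Unaccounted for:     ", fmt(unaccounted))
```

With the logged values, placement and routing together take about 1h 40m, which is roughly two thirds of the reported total.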
  • 20. Conclusions ● Moderate* costs ● Easy setup with minor issues ● Cloud-based IDE (RDP), or SSH ● Slow development ● Harder to learn than CUDA ● Good documentation and examples ● Marketplace is still small but promising ● No 3rd-party examples * Moderate cost compared with: $3,500 Xilinx Virtex-7 FPGA VC707 Evaluation Kit; $13,000 Xilinx Virtex-7 FPGA VC7222 Characterization Kit; $1,500 Intel Xeon Phi 7120P Coprocessor; $1,400 Nvidia GeForce Titan X Pascal
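The "moderate cost" claim can be made concrete with a rough break-even estimate: how many f1.2xlarge hours cost as much as buying one of the listed FPGA kits. This sketch uses only the prices quoted on the slides ($1.6 per hour, $3,500 and $13,000). It ignores storage, data transfer, the instances used for builds, and the host machine and power a purchased board would need, so treat it as an order-of-magnitude estimate only.

```python
# Prices as quoted on the slides; not current AWS or Xilinx list prices.
F1_HOURLY_USD = 1.6
KIT_PRICES_USD = {
    "Xilinx Virtex-7 FPGA VC707 Evaluation Kit": 3500,
    "Xilinx Virtex-7 FPGA VC7222 Characterization Kit": 13000,
}


def break_even_hours(price_usd, hourly_usd=F1_HOURLY_USD):
    """Hours of cloud use whose cost equals the one-off purchase price."""
    return price_usd / hourly_usd


hours = {name: break_even_hours(price) for name, price in KIT_PRICES_USD.items()}

for name, h in hours.items():
    # Express the figure in 8-hour working days as well, for a sense of scale.
    print(f"{name}: {h:.1f} hours (about {h / 8:.0f} working days)")
```

At these prices, the cheaper kit corresponds to about 2,200 instance hours, or roughly a year of 8-hour working days.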
  • 22. FPGA vs GPU ● Accelerating Compute-Intensive Applications with GPUs and FPGAs. S. Che, J. Li, J. W. Sheaffer, K. Skadron and J. Lach, 2008 Symposium on Application Specific Processors ● CUDA and the GeForce 8800 GTX GPU ● VHDL and the Xilinx Virtex-II Pro FPGA
  • 25. Are FPGAs and reconfigurable computing the future? Video on the cloud? Deep learning?