ProFAX - Implementation (Short version)

HARDWARE ACCELERATION
Lorenzo Di Tucci
lorenzo.ditucci@mail.polimi.it
Giulia Guidi
giulia.guidi@mail.polimi.it
Xilinx Open Hardware Contest 2016

Profiling
Profiling identifies the bottleneck procedures in the program,
i.e. the most compute-intensive functions.

Static Code Analysis
• most computationally intensive function: computePairEnergy
• operational intensity: 52 Operations/Byte
• applying the Roofline model gives the expected performance
in GOps/s
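
The Roofline estimate mentioned above can be sketched as follows. This is a minimal illustration, not the authors' calculation: the function name `rooflineGops` and its parameters are hypothetical, and the slides give neither the peak compute rate nor the memory bandwidth of the target board, so both must be supplied by the reader.

```cpp
#include <algorithm>

// Roofline model: attainable performance is bounded either by the compute
// peak or by memory bandwidth multiplied by operational intensity.
// Units: peakGops in GOps/s, bandwidthGBs in GB/s, opsPerByte in Ops/Byte.
// All three inputs are assumptions to be filled in for a concrete platform.
double rooflineGops(double peakGops, double bandwidthGBs, double opsPerByte) {
    return std::min(peakGops, bandwidthGBs * opsPerByte);
}
```

With the operational intensity of 52 Operations/Byte reported for computePairEnergy, a call such as `rooflineGops(peak, bandwidth, 52.0)` returns the memory-bound value `bandwidth * 52` whenever that product is below the compute peak, and the peak otherwise.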

Hardware Implementation
1. implementation of the bottleneck function into Vivado HLS
2. integration of the code into SDAccel
• automation of the hardware design flow
• reduced risk of introducing errors

Optimization (1)
[Figure: DRAM and BRAM]

Optimization (2)
[Figure: loop without pipeline vs. loop with pipeline]
Issue: can’t pipeline all loops
• the FPGA area is not large enough
• RAW hazards
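
As a rough illustration of the two cases, the sketch below contrasts a loop whose iterations are independent with one that carries a read-after-write (RAW) dependency. The function names, the array size `N`, and the loop bodies are hypothetical and are not taken from ProFAX; `#pragma HLS PIPELINE` is the Vivado HLS pipelining directive and is ignored by an ordinary C++ compiler.

```cpp
constexpr int N = 64;  // hypothetical trip count

// Independent iterations: out[i] depends only on a[i] and b[i],
// so the tool can start a new iteration while earlier ones are in flight.
void scaleAdd(const float a[N], const float b[N], float out[N]) {
    for (int i = 0; i < N; ++i) {
#pragma HLS PIPELINE II=1
        out[i] = 2.0f * a[i] + b[i];
    }
}

// RAW hazard: every iteration reads the value of acc written by the
// previous one. With a multi-cycle floating-point adder, the tool may
// not be able to reach the requested initiation interval of 1.
float accumulate(const float a[N]) {
    float acc = 0.0f;
    for (int i = 0; i < N; ++i) {
#pragma HLS PIPELINE II=1
        acc += a[i];
    }
    return acc;
}
```

Both functions compute correct results in software; the difference only shows up in the schedule produced by the HLS tool.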

Optimization (3)
Reduction
• removes the dependency between
iterations of a loop when updating
a variable
• works by writing the result of each
iteration of the loop into a cell of a
temporary array
• additional loops and arrays then
sum up the values stored in
the first temporary structure
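
The reduction described above can be sketched like this. It is a minimal example under stated assumptions: the per-iteration computation (squaring the input), the size `N`, the group size of 4, and all names are hypothetical and stand in for the real loop body, which the slides do not show.

```cpp
constexpr int N = 64;  // hypothetical trip count, a multiple of 16

float reduceSum(const float in[N]) {
    // Stage 1: each iteration writes its result into its own cell,
    // so there is no dependency between iterations.
    float tmp[N];
    for (int i = 0; i < N; ++i) {
        tmp[i] = in[i] * in[i];  // placeholder for the real computation
    }

    // Stage 2: further loops and arrays sum up the temporary values
    // in groups, shortening the chain of dependent additions.
    float level1[N / 4];
    for (int j = 0; j < N / 4; ++j) {
        level1[j] = (tmp[4 * j] + tmp[4 * j + 1]) +
                    (tmp[4 * j + 2] + tmp[4 * j + 3]);
    }

    float level2[N / 16];
    for (int k = 0; k < N / 16; ++k) {
        level2[k] = (level1[4 * k] + level1[4 * k + 1]) +
                    (level1[4 * k + 2] + level1[4 * k + 3]);
    }

    // Final short loop over the last partial sums.
    float total = 0.0f;
    for (int k = 0; k < N / 16; ++k) {
        total += level2[k];
    }
    return total;
}
```

Note that regrouping floating-point additions can change rounding slightly compared with a single sequential accumulator, which is the usual trade-off of this transformation.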

Thanks for your attention
Lorenzo Di Tucci
Giulia Guidi
lorenzo.ditucci@mail.polimi.it
giulia.guidi@mail.polimi.it
Follow us on : https://www.facebook.com/profaxnecstlab/
Follow us on : https://twitter.com/ProFAX_NECST
Follow us on : http://www.slideshare.net/ProFAX
