A Parallel, Energy Efficient Hardware Architecture for the merAligner on FPGA using Chisel HCL

Lorenzo Di Tucci, Marco Santambrogio {lorenzo.ditucci, marco.santambrogio}@polimi.it
Alessandro Comodi, Davide Conficconi {alessandro.comodi,davide.conficconi}@mail.polimi.it
Steven Hofmeyr, David Donofrio {shofmeyr, ddonofrio}@lbl.gov
RAW @ JW Marriott, Vancouver
May 22 2018
speaker: Alessandro Comodi

Context
Large amounts of genomic data
Algorithm complexity
In such a scenario, there is a need for solutions that are efficient in terms of both performance and power consumption

Sequence Alignment
Sequence alignment algorithms are among the most compute-intensive
Pure software solution:
• Poor performance
• High power consumption

merAligner
To overcome performance issues, Lawrence Berkeley National Laboratory and UC Berkeley have proposed the merAligner
High performance
Low power efficiency (90 kW per cabinet)
More than 15,000 cores

Contributions
• The design and development of a hardware
architecture for the Smith-Waterman algorithm
on FPGA, using Chisel HCL
• The development of a wrapper written in Chisel,
used to integrate RTL cores into the Xilinx
SDAccel Framework

Smith-Waterman
The main bottleneck of the merAligner tool is the Smith-Waterman algorithm implementation
Highly parallel computation
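
To illustrate why the computation is highly parallel, here is a minimal software sketch of Smith-Waterman scoring that fills the matrix one anti-diagonal at a time. It is not the merAligner code: the function name, the linear gap penalty, and the scoring values (match=2, mismatch=-1, gap=-1) are assumptions chosen only for illustration.

```python
def smith_waterman_score(read, reference, match=2, mismatch=-1, gap=-1):
    """Return the best local-alignment score (linear gap penalty, assumed scores)."""
    rows, cols = len(read) + 1, len(reference) + 1
    h = [[0] * cols for _ in range(rows)]  # row 0 and column 0 stay at 0
    best = 0
    # Visit cells by anti-diagonal d = i + j. A cell depends only on its
    # upper, left and upper-left neighbours, which lie on diagonals d-1 and
    # d-2, so all cells of one anti-diagonal are independent of each other.
    for d in range(2, rows + cols - 1):
        i_min = max(1, d - (cols - 1))
        i_max = min(rows - 1, d - 1)
        for i in range(i_min, i_max + 1):  # this inner loop could run in parallel
            j = d - i
            s = match if read[i - 1] == reference[j - 1] else mismatch
            h[i][j] = max(0,
                          h[i - 1][j - 1] + s,   # match or mismatch
                          h[i - 1][j] + gap,     # gap in the reference
                          h[i][j - 1] + gap)     # gap in the read
            best = max(best, h[i][j])
    return best


if __name__ == "__main__":
    print(smith_waterman_score("GATTACA", "TTGATTACATT"))
```

The inner loop carries no dependency between iterations, which is the property a hardware design can exploit by computing one anti-diagonal per clock cycle.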

Architecture
Systolic-array-based design
Each processing element is fed with the result of the previous one
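
The data flow of a linear systolic array can be modelled in software as a rough sketch. The model below assumes one processing element (PE) per read character, reference characters streaming through the chain one PE per cycle, and the same illustrative scoring as a plain Smith-Waterman with linear gaps (match=2, mismatch=-1, gap=-1). It is a generic cycle-level model, not the authors' Chisel design, and all names are hypothetical.

```python
def systolic_sw_score(read, reference, match=2, mismatch=-1, gap=-1):
    """Cycle-level model of a linear systolic array computing the best local score."""
    m, n = len(read), len(reference)
    if m == 0 or n == 0:
        return 0
    diag = [0] * m         # per-PE register: score of the upper-left neighbour
    left = [0] * m         # per-PE register: the PE's own previous score
    out_char = [None] * m  # reference character each PE forwarded last cycle
    out_h = [0] * m        # score each PE forwarded last cycle
    best = 0
    for cycle in range(n + m - 1):        # pipeline fill plus streaming
        new_char = [None] * m
        new_h = [0] * m
        for k in range(m):                # all PEs work in the same cycle
            if k == 0:
                # The first PE reads the reference stream; its upper row is 0.
                char_in = reference[cycle] if cycle < n else None
                up_in = 0
            else:
                # Every other PE is fed with the previous PE's registered output.
                char_in, up_in = out_char[k - 1], out_h[k - 1]
            if char_in is None:           # no valid data at this PE yet (or anymore)
                continue
            s = match if char_in == read[k] else mismatch
            h = max(0, diag[k] + s, up_in + gap, left[k] + gap)
            diag[k], left[k] = up_in, h   # update local state for the next cycle
            new_char[k], new_h[k] = char_in, h
            best = max(best, h)
        out_char, out_h = new_char, new_h # clock edge: outputs become visible
    return best


if __name__ == "__main__":
    print(systolic_sw_score("GATTACA", "TTGATTACATT"))
```

In this model a read of length m against a reference of length n finishes in n + m - 1 cycles, instead of the m × n cell updates of a sequential implementation.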

Results
Read     Reference  Frequency  Performance  Speed-up       Efficiency  Speed-up
                    [MHz]      [GCUPS]      (performance)  [GCUPS/W]   (power efficiency)
128 [*]  1024       -          3.87         -              0.0165      -
128      1024       150        3.542        0.91X          0.141       8.54X
128      2048       140        5.616        1.45X          0.224       14.35X
128      4096       180        6.529        1.68X          0.261       15.81X
128      16384      110        11.443       2.84X          0.457       27.69X
256      1024       160        6.123        1.58X          0.244       14.78X
256      2048       160        8.393        2.16X          0.335       20.30X
256      4096       130        15.225       3.93X          0.609       36.90X
256      16384      140        27.312       7.05X          1.092       66.18X
[*] State-of-the-art Smith-Waterman software implementation
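
As a reading aid for the table, the sketch below shows how such metrics are commonly computed. It assumes the usual definition of GCUPS (billions of matrix cell updates per second) and that each speed-up is the ratio of an FPGA figure to the software baseline row; the slides do not spell out either definition, and the helper names are hypothetical.

```python
def gcups(read_len, ref_len, alignments, seconds):
    """Giga cell updates per second for `alignments` read-vs-reference matrices."""
    cells = read_len * ref_len * alignments
    return cells / (seconds * 1e9)


def speedup(accelerated, baseline):
    """Ratio of an accelerated figure to the baseline figure."""
    return accelerated / baseline


if __name__ == "__main__":
    # Baseline figures taken from the [*] row of the table.
    base_perf, base_eff = 3.87, 0.0165
    # FPGA figures taken from the 256 x 16384 row.
    print(f"performance speed-up: {speedup(27.312, base_perf):.2f}x")
    print(f"efficiency speed-up:  {speedup(1.092, base_eff):.2f}x")
```

Ratios recomputed from the rounded table values may differ slightly in the last digit from the speed-ups reported on the slide.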

Concluding Remarks
Thank you for your attention!
Hardware architecture for the acceleration of the
Smith-Waterman step of the merAligner on FPGA
using Chisel HCL
