SlideShare a Scribd company logo
A Parallel, Energy Efficient Hardware Architecture
for the merAligner on FPGA using Chisel HCL
Lorenzo Di Tucci, Marco Santambrogio {lorenzo.ditucci, marco.santambrogio}@polimi.it
Alessandro Comodi, Davide Conficconi {alessandro.comodi,davide.conficconi}@mail.polimi.it
Steven Hofmeyr, David Donofrio {shofmeyr, ddonofrio}@lbl.gov
RAW @ JW Marriott, Vancouver
May 22 2018
speaker: Alessandro Comodi
Context 1
Large amounts of
genomic data Algorithm complexity
In such scenario there is a need to have efficient solutions both
from a performance and a power consumption point of view
Sequence Alignment 2
Sequence alignment algorithms are some of the most compute
intensive ones
Pure software solution
Poor performance
High power consumption
merAligner 3
To overcome performance issues Lawrence Berkeley National
Labs and UC Berkeley have proposed the merAligner
High
Performance
Low power efficiency
(90 kW per cabinet)
More than 15,000 cores
Contributions 4
• The design and development of a hardware
architecture for the Smith-Waterman algorithm
on FPGA, using Chisel HCL
• The development of a wrapper written in Chisel,
used to integrate RTL cores into the Xilinx
SDAccel Framework
Smith-Waterman 5
The main bottleneck of the merAligner tool is the Smith-
Waterman algorithm implementation
Highly parallel computation
Architecture 6
Systolic array
based design
Each processing element is fed
with the result of the previous one
Results 7
Read Reference
Frequency
[MHz]
Performance
[GCUPS]
Speed up
Performance
Efficiency
[GCUPS/W]
Speed up
Power efficiency
128[*] 1024 - 3.87 - 0.0165 -
128 1024 150 3.542 0.91X 0.141 8.54X
128 2048 140 5.616 1.45X 0.224 14.35X
128 4096 180 6.529 1.68X 0.261 15.81X
128 16384 110 11.443 2.84X 0.457 27.69X
256 1024 160 6.123 1.58X 0.244 14.78X
256 2048 160 8.393 2.16X 0.335 20.30X
256 4096 130 15.225 3.93X 0.609 36.90X
256 16384 140 27.312 7.05X 1.092 66.18X
[*] State of the Art Smith-Waterman software implementation
Concluding Remarks 8
Read Reference
Frequency
[MHz]
Performance
[GCUPS]
Speed up
Performance
Efficiency
[GCUPS/W]
Speed up
Power efficiency
128[*] 1024 - 3.87 - 0.0165 -
128 1024 150 3.542 0.91X 0.141 8.54X
128 2048 140 5.616 1.45X 0.224 14.35X
128 4096 180 6.529 1.68X 0.261 15.81X
128 16384 110 11.443 2.84X 0.457 27.69X
256 1024 160 6.123 1.58X 0.244 14.78X
256 2048 160 8.393 2.16X 0.335 20.30X
256 4096 130 15.225 3.93X 0.609 36.90X
256 16384 140 27.312 7.05X 1.092 66.18X
[*] State of the Art Smith-Waterman software implementation
Thank you for your attention!
Lorenzo Di Tucci, Marco Santambrogio {lorenzo.ditucci, marco.santambrogio}@polimi.it
Alessandro Comodi, Davide Conficconi {alessandro.comodi,davide.conficconi}@mail.polimi.it
Steven Hofmeyr, David Donofrio {shofmeyr, ddonofrio}@lbl.gov
speaker: Alessandro Comodi
Hardware architecture for the acceleration of the
Smith-Waterman step of the merAligner on FPGA
using Chisel HCL

More Related Content

Similar to A Parallel, Energy Efficient Hardware Architecture for the merAligner on FPGA using Chisel HCL

Trends in Systems and How to Get Efficient Performance
Trends in Systems and How to Get Efficient PerformanceTrends in Systems and How to Get Efficient Performance
Trends in Systems and How to Get Efficient Performance
inside-BigData.com
 
Performance Analysis and Optimizations of CAE Applications (Case Study: STAR_...
Performance Analysis and Optimizations of CAE Applications (Case Study: STAR_...Performance Analysis and Optimizations of CAE Applications (Case Study: STAR_...
Performance Analysis and Optimizations of CAE Applications (Case Study: STAR_...Fisnik Kraja
 
Technology overview
Technology overviewTechnology overview
Technology overviewvirtuehm
 
LEGaTO: Low-Energy Heterogeneous Computing Use of AI in the project
LEGaTO: Low-Energy Heterogeneous Computing Use of AI in the projectLEGaTO: Low-Energy Heterogeneous Computing Use of AI in the project
LEGaTO: Low-Energy Heterogeneous Computing Use of AI in the project
LEGATO project
 
unit 1vlsi notes.pdf
unit 1vlsi notes.pdfunit 1vlsi notes.pdf
unit 1vlsi notes.pdf
AcademicICECE
 
IRJET - Predicting the Maximum Computational Power of Microprocessors using M...
IRJET - Predicting the Maximum Computational Power of Microprocessors using M...IRJET - Predicting the Maximum Computational Power of Microprocessors using M...
IRJET - Predicting the Maximum Computational Power of Microprocessors using M...
IRJET Journal
 
Quad Core Processors - Technology Presentation
Quad Core Processors - Technology PresentationQuad Core Processors - Technology Presentation
Quad Core Processors - Technology Presentation
vinaya.hs
 
"Approaches for Energy Efficient Implementation of Deep Neural Networks," a P...
"Approaches for Energy Efficient Implementation of Deep Neural Networks," a P..."Approaches for Energy Efficient Implementation of Deep Neural Networks," a P...
"Approaches for Energy Efficient Implementation of Deep Neural Networks," a P...
Edge AI and Vision Alliance
 
Klessydra t - designing vector coprocessors for multi-threaded edge-computing...
Klessydra t - designing vector coprocessors for multi-threaded edge-computing...Klessydra t - designing vector coprocessors for multi-threaded edge-computing...
Klessydra t - designing vector coprocessors for multi-threaded edge-computing...
RISC-V International
 
Supermicro X12 Performance Update
Supermicro X12 Performance UpdateSupermicro X12 Performance Update
Supermicro X12 Performance Update
Rebekah Rodriguez
 
Architecting for Hyper-Scale Datacenter Efficiency
Architecting for Hyper-Scale Datacenter EfficiencyArchitecting for Hyper-Scale Datacenter Efficiency
Architecting for Hyper-Scale Datacenter Efficiency
Intel IT Center
 
TRACK D: Advanced design regardless of process technology/ Marco Casale-Rossi
TRACK D: Advanced design regardless of process technology/ Marco Casale-RossiTRACK D: Advanced design regardless of process technology/ Marco Casale-Rossi
TRACK D: Advanced design regardless of process technology/ Marco Casale-Rossichiportal
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
Dilum Bandara
 
Semi Autonomous Hand Launched Rotary Wing Unmanned Air Vehicles
Semi Autonomous Hand Launched Rotary Wing Unmanned Air VehiclesSemi Autonomous Hand Launched Rotary Wing Unmanned Air Vehicles
Semi Autonomous Hand Launched Rotary Wing Unmanned Air Vehicles
ahmad bassiouny
 
X13 Products + Intel® Xeon® CPU Max Series–An Applications & Performance View
 X13 Products + Intel® Xeon® CPU Max Series–An Applications & Performance View X13 Products + Intel® Xeon® CPU Max Series–An Applications & Performance View
X13 Products + Intel® Xeon® CPU Max Series–An Applications & Performance View
Rebekah Rodriguez
 
X13 Products + Intel® Xeon® CPU Max Series–An Applications & Performance View
X13 Products + Intel® Xeon® CPU Max Series–An Applications & Performance ViewX13 Products + Intel® Xeon® CPU Max Series–An Applications & Performance View
X13 Products + Intel® Xeon® CPU Max Series–An Applications & Performance View
Rebekah Rodriguez
 
Cellular LPWA and LTE-M
Cellular LPWA and LTE-MCellular LPWA and LTE-M
Cellular LPWA and LTE-M
Nicolas Damour
 
Lte-m Sierra Wireless V1
Lte-m Sierra Wireless V1Lte-m Sierra Wireless V1
Lte-m Sierra Wireless V1
IoT Academy
 

Similar to A Parallel, Energy Efficient Hardware Architecture for the merAligner on FPGA using Chisel HCL (20)

Trends in Systems and How to Get Efficient Performance
Trends in Systems and How to Get Efficient PerformanceTrends in Systems and How to Get Efficient Performance
Trends in Systems and How to Get Efficient Performance
 
Performance Analysis and Optimizations of CAE Applications (Case Study: STAR_...
Performance Analysis and Optimizations of CAE Applications (Case Study: STAR_...Performance Analysis and Optimizations of CAE Applications (Case Study: STAR_...
Performance Analysis and Optimizations of CAE Applications (Case Study: STAR_...
 
Technology overview
Technology overviewTechnology overview
Technology overview
 
LEGaTO: Low-Energy Heterogeneous Computing Use of AI in the project
LEGaTO: Low-Energy Heterogeneous Computing Use of AI in the projectLEGaTO: Low-Energy Heterogeneous Computing Use of AI in the project
LEGaTO: Low-Energy Heterogeneous Computing Use of AI in the project
 
unit 1vlsi notes.pdf
unit 1vlsi notes.pdfunit 1vlsi notes.pdf
unit 1vlsi notes.pdf
 
Fmcad08
Fmcad08Fmcad08
Fmcad08
 
IRJET - Predicting the Maximum Computational Power of Microprocessors using M...
IRJET - Predicting the Maximum Computational Power of Microprocessors using M...IRJET - Predicting the Maximum Computational Power of Microprocessors using M...
IRJET - Predicting the Maximum Computational Power of Microprocessors using M...
 
Quad Core Processors - Technology Presentation
Quad Core Processors - Technology PresentationQuad Core Processors - Technology Presentation
Quad Core Processors - Technology Presentation
 
"Approaches for Energy Efficient Implementation of Deep Neural Networks," a P...
"Approaches for Energy Efficient Implementation of Deep Neural Networks," a P..."Approaches for Energy Efficient Implementation of Deep Neural Networks," a P...
"Approaches for Energy Efficient Implementation of Deep Neural Networks," a P...
 
Klessydra t - designing vector coprocessors for multi-threaded edge-computing...
Klessydra t - designing vector coprocessors for multi-threaded edge-computing...Klessydra t - designing vector coprocessors for multi-threaded edge-computing...
Klessydra t - designing vector coprocessors for multi-threaded edge-computing...
 
Supermicro X12 Performance Update
Supermicro X12 Performance UpdateSupermicro X12 Performance Update
Supermicro X12 Performance Update
 
Architecting for Hyper-Scale Datacenter Efficiency
Architecting for Hyper-Scale Datacenter EfficiencyArchitecting for Hyper-Scale Datacenter Efficiency
Architecting for Hyper-Scale Datacenter Efficiency
 
Content
ContentContent
Content
 
TRACK D: Advanced design regardless of process technology/ Marco Casale-Rossi
TRACK D: Advanced design regardless of process technology/ Marco Casale-RossiTRACK D: Advanced design regardless of process technology/ Marco Casale-Rossi
TRACK D: Advanced design regardless of process technology/ Marco Casale-Rossi
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
 
Semi Autonomous Hand Launched Rotary Wing Unmanned Air Vehicles
Semi Autonomous Hand Launched Rotary Wing Unmanned Air VehiclesSemi Autonomous Hand Launched Rotary Wing Unmanned Air Vehicles
Semi Autonomous Hand Launched Rotary Wing Unmanned Air Vehicles
 
X13 Products + Intel® Xeon® CPU Max Series–An Applications & Performance View
 X13 Products + Intel® Xeon® CPU Max Series–An Applications & Performance View X13 Products + Intel® Xeon® CPU Max Series–An Applications & Performance View
X13 Products + Intel® Xeon® CPU Max Series–An Applications & Performance View
 
X13 Products + Intel® Xeon® CPU Max Series–An Applications & Performance View
X13 Products + Intel® Xeon® CPU Max Series–An Applications & Performance ViewX13 Products + Intel® Xeon® CPU Max Series–An Applications & Performance View
X13 Products + Intel® Xeon® CPU Max Series–An Applications & Performance View
 
Cellular LPWA and LTE-M
Cellular LPWA and LTE-MCellular LPWA and LTE-M
Cellular LPWA and LTE-M
 
Lte-m Sierra Wireless V1
Lte-m Sierra Wireless V1Lte-m Sierra Wireless V1
Lte-m Sierra Wireless V1
 

More from NECST Lab @ Politecnico di Milano

Mesticheria Team - WiiReflex
Mesticheria Team - WiiReflexMesticheria Team - WiiReflex
Mesticheria Team - WiiReflex
NECST Lab @ Politecnico di Milano
 
Punto e virgola Team - Stressometro
Punto e virgola Team - StressometroPunto e virgola Team - Stressometro
Punto e virgola Team - Stressometro
NECST Lab @ Politecnico di Milano
 
BitIt Team - Stay.straight
BitIt Team - Stay.straight BitIt Team - Stay.straight
BitIt Team - Stay.straight
NECST Lab @ Politecnico di Milano
 
BabYodini Team - Talking Gloves
BabYodini Team - Talking GlovesBabYodini Team - Talking Gloves
BabYodini Team - Talking Gloves
NECST Lab @ Politecnico di Milano
 
printf("Nome Squadra"); Team - NeoTon
printf("Nome Squadra"); Team - NeoTonprintf("Nome Squadra"); Team - NeoTon
printf("Nome Squadra"); Team - NeoTon
NECST Lab @ Politecnico di Milano
 
BlackBoard Team - Motion Tracking Platform
BlackBoard Team - Motion Tracking PlatformBlackBoard Team - Motion Tracking Platform
BlackBoard Team - Motion Tracking Platform
NECST Lab @ Politecnico di Milano
 
#include<brain.h> Team - HomeBeatHome
#include<brain.h> Team - HomeBeatHome#include<brain.h> Team - HomeBeatHome
#include<brain.h> Team - HomeBeatHome
NECST Lab @ Politecnico di Milano
 
Flipflops Team - Wave U
Flipflops Team - Wave UFlipflops Team - Wave U
Flipflops Team - Wave U
NECST Lab @ Politecnico di Milano
 
Bug(atta) Team - Little Brother
Bug(atta) Team - Little BrotherBug(atta) Team - Little Brother
Bug(atta) Team - Little Brother
NECST Lab @ Politecnico di Milano
 
#NECSTCamp: come partecipare
#NECSTCamp: come partecipare#NECSTCamp: come partecipare
#NECSTCamp: come partecipare
NECST Lab @ Politecnico di Milano
 
NECSTCamp101@2020.10.1
NECSTCamp101@2020.10.1NECSTCamp101@2020.10.1
NECSTCamp101@2020.10.1
NECST Lab @ Politecnico di Milano
 
NECSTLab101 2020.2021
NECSTLab101 2020.2021NECSTLab101 2020.2021
NECSTLab101 2020.2021
NECST Lab @ Politecnico di Milano
 
TreeHouse, nourish your community
TreeHouse, nourish your communityTreeHouse, nourish your community
TreeHouse, nourish your community
NECST Lab @ Politecnico di Milano
 
TiReX: Tiled Regular eXpressionsmatching architecture
TiReX: Tiled Regular eXpressionsmatching architectureTiReX: Tiled Regular eXpressionsmatching architecture
TiReX: Tiled Regular eXpressionsmatching architecture
NECST Lab @ Politecnico di Milano
 
Embedding based knowledge graph link prediction for drug repurposing
Embedding based knowledge graph link prediction for drug repurposingEmbedding based knowledge graph link prediction for drug repurposing
Embedding based knowledge graph link prediction for drug repurposing
NECST Lab @ Politecnico di Milano
 
PLASTER - PYNQ-based abandoned object detection using a map-reduce approach o...
PLASTER - PYNQ-based abandoned object detection using a map-reduce approach o...PLASTER - PYNQ-based abandoned object detection using a map-reduce approach o...
PLASTER - PYNQ-based abandoned object detection using a map-reduce approach o...
NECST Lab @ Politecnico di Milano
 
EMPhASIS - An EMbedded Public Attention Stress Identification System
 EMPhASIS - An EMbedded Public Attention Stress Identification System EMPhASIS - An EMbedded Public Attention Stress Identification System
EMPhASIS - An EMbedded Public Attention Stress Identification System
NECST Lab @ Politecnico di Milano
 
Luns - Automatic lungs segmentation through neural network
Luns - Automatic lungs segmentation through neural networkLuns - Automatic lungs segmentation through neural network
Luns - Automatic lungs segmentation through neural network
NECST Lab @ Politecnico di Milano
 
BlastFunction: How to combine Serverless and FPGAs
BlastFunction: How to combine Serverless and FPGAsBlastFunction: How to combine Serverless and FPGAs
BlastFunction: How to combine Serverless and FPGAs
NECST Lab @ Politecnico di Milano
 
Maeve - Fast genome analysis leveraging exact string matching
Maeve - Fast genome analysis leveraging exact string matchingMaeve - Fast genome analysis leveraging exact string matching
Maeve - Fast genome analysis leveraging exact string matching
NECST Lab @ Politecnico di Milano
 

More from NECST Lab @ Politecnico di Milano (20)

Mesticheria Team - WiiReflex
Mesticheria Team - WiiReflexMesticheria Team - WiiReflex
Mesticheria Team - WiiReflex
 
Punto e virgola Team - Stressometro
Punto e virgola Team - StressometroPunto e virgola Team - Stressometro
Punto e virgola Team - Stressometro
 
BitIt Team - Stay.straight
BitIt Team - Stay.straight BitIt Team - Stay.straight
BitIt Team - Stay.straight
 
BabYodini Team - Talking Gloves
BabYodini Team - Talking GlovesBabYodini Team - Talking Gloves
BabYodini Team - Talking Gloves
 
printf("Nome Squadra"); Team - NeoTon
printf("Nome Squadra"); Team - NeoTonprintf("Nome Squadra"); Team - NeoTon
printf("Nome Squadra"); Team - NeoTon
 
BlackBoard Team - Motion Tracking Platform
BlackBoard Team - Motion Tracking PlatformBlackBoard Team - Motion Tracking Platform
BlackBoard Team - Motion Tracking Platform
 
#include<brain.h> Team - HomeBeatHome
#include<brain.h> Team - HomeBeatHome#include<brain.h> Team - HomeBeatHome
#include<brain.h> Team - HomeBeatHome
 
Flipflops Team - Wave U
Flipflops Team - Wave UFlipflops Team - Wave U
Flipflops Team - Wave U
 
Bug(atta) Team - Little Brother
Bug(atta) Team - Little BrotherBug(atta) Team - Little Brother
Bug(atta) Team - Little Brother
 
#NECSTCamp: come partecipare
#NECSTCamp: come partecipare#NECSTCamp: come partecipare
#NECSTCamp: come partecipare
 
NECSTCamp101@2020.10.1
NECSTCamp101@2020.10.1NECSTCamp101@2020.10.1
NECSTCamp101@2020.10.1
 
NECSTLab101 2020.2021
NECSTLab101 2020.2021NECSTLab101 2020.2021
NECSTLab101 2020.2021
 
TreeHouse, nourish your community
TreeHouse, nourish your communityTreeHouse, nourish your community
TreeHouse, nourish your community
 
TiReX: Tiled Regular eXpressionsmatching architecture
TiReX: Tiled Regular eXpressionsmatching architectureTiReX: Tiled Regular eXpressionsmatching architecture
TiReX: Tiled Regular eXpressionsmatching architecture
 
Embedding based knowledge graph link prediction for drug repurposing
Embedding based knowledge graph link prediction for drug repurposingEmbedding based knowledge graph link prediction for drug repurposing
Embedding based knowledge graph link prediction for drug repurposing
 
PLASTER - PYNQ-based abandoned object detection using a map-reduce approach o...
PLASTER - PYNQ-based abandoned object detection using a map-reduce approach o...PLASTER - PYNQ-based abandoned object detection using a map-reduce approach o...
PLASTER - PYNQ-based abandoned object detection using a map-reduce approach o...
 
EMPhASIS - An EMbedded Public Attention Stress Identification System
 EMPhASIS - An EMbedded Public Attention Stress Identification System EMPhASIS - An EMbedded Public Attention Stress Identification System
EMPhASIS - An EMbedded Public Attention Stress Identification System
 
Luns - Automatic lungs segmentation through neural network
Luns - Automatic lungs segmentation through neural networkLuns - Automatic lungs segmentation through neural network
Luns - Automatic lungs segmentation through neural network
 
BlastFunction: How to combine Serverless and FPGAs
BlastFunction: How to combine Serverless and FPGAsBlastFunction: How to combine Serverless and FPGAs
BlastFunction: How to combine Serverless and FPGAs
 
Maeve - Fast genome analysis leveraging exact string matching
Maeve - Fast genome analysis leveraging exact string matchingMaeve - Fast genome analysis leveraging exact string matching
Maeve - Fast genome analysis leveraging exact string matching
 

Recently uploaded

Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&BDesign and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
Sreedhar Chowdam
 
Basic Industrial Engineering terms for apparel
Basic Industrial Engineering terms for apparelBasic Industrial Engineering terms for apparel
Basic Industrial Engineering terms for apparel
top1002
 
Nuclear Power Economics and Structuring 2024
Nuclear Power Economics and Structuring 2024Nuclear Power Economics and Structuring 2024
Nuclear Power Economics and Structuring 2024
Massimo Talia
 
Immunizing Image Classifiers Against Localized Adversary Attacks
Immunizing Image Classifiers Against Localized Adversary AttacksImmunizing Image Classifiers Against Localized Adversary Attacks
Immunizing Image Classifiers Against Localized Adversary Attacks
gerogepatton
 
Final project report on grocery store management system..pdf
Final project report on grocery store management system..pdfFinal project report on grocery store management system..pdf
Final project report on grocery store management system..pdf
Kamal Acharya
 
Sachpazis:Terzaghi Bearing Capacity Estimation in simple terms with Calculati...
Sachpazis:Terzaghi Bearing Capacity Estimation in simple terms with Calculati...Sachpazis:Terzaghi Bearing Capacity Estimation in simple terms with Calculati...
Sachpazis:Terzaghi Bearing Capacity Estimation in simple terms with Calculati...
Dr.Costas Sachpazis
 
Hierarchical Digital Twin of a Naval Power System
Hierarchical Digital Twin of a Naval Power SystemHierarchical Digital Twin of a Naval Power System
Hierarchical Digital Twin of a Naval Power System
Kerry Sado
 
一比一原版(SFU毕业证)西蒙菲莎大学毕业证成绩单如何办理
一比一原版(SFU毕业证)西蒙菲莎大学毕业证成绩单如何办理一比一原版(SFU毕业证)西蒙菲莎大学毕业证成绩单如何办理
一比一原版(SFU毕业证)西蒙菲莎大学毕业证成绩单如何办理
bakpo1
 
Fundamentals of Electric Drives and its applications.pptx
Fundamentals of Electric Drives and its applications.pptxFundamentals of Electric Drives and its applications.pptx
Fundamentals of Electric Drives and its applications.pptx
manasideore6
 
Industrial Training at Shahjalal Fertilizer Company Limited (SFCL)
Industrial Training at Shahjalal Fertilizer Company Limited (SFCL)Industrial Training at Shahjalal Fertilizer Company Limited (SFCL)
Industrial Training at Shahjalal Fertilizer Company Limited (SFCL)
MdTanvirMahtab2
 
Gen AI Study Jams _ For the GDSC Leads in India.pdf
Gen AI Study Jams _ For the GDSC Leads in India.pdfGen AI Study Jams _ For the GDSC Leads in India.pdf
Gen AI Study Jams _ For the GDSC Leads in India.pdf
gdsczhcet
 
Hybrid optimization of pumped hydro system and solar- Engr. Abdul-Azeez.pdf
Hybrid optimization of pumped hydro system and solar- Engr. Abdul-Azeez.pdfHybrid optimization of pumped hydro system and solar- Engr. Abdul-Azeez.pdf
Hybrid optimization of pumped hydro system and solar- Engr. Abdul-Azeez.pdf
fxintegritypublishin
 
Water Industry Process Automation and Control Monthly - May 2024.pdf
Water Industry Process Automation and Control Monthly - May 2024.pdfWater Industry Process Automation and Control Monthly - May 2024.pdf
Water Industry Process Automation and Control Monthly - May 2024.pdf
Water Industry Process Automation & Control
 
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单专业办理
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单专业办理一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单专业办理
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单专业办理
zwunae
 
Standard Reomte Control Interface - Neometrix
Standard Reomte Control Interface - NeometrixStandard Reomte Control Interface - Neometrix
Standard Reomte Control Interface - Neometrix
Neometrix_Engineering_Pvt_Ltd
 
Governing Equations for Fundamental Aerodynamics_Anderson2010.pdf
Governing Equations for Fundamental Aerodynamics_Anderson2010.pdfGoverning Equations for Fundamental Aerodynamics_Anderson2010.pdf
Governing Equations for Fundamental Aerodynamics_Anderson2010.pdf
WENKENLI1
 
NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...
NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...
NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...
Amil Baba Dawood bangali
 
Forklift Classes Overview by Intella Parts
Forklift Classes Overview by Intella PartsForklift Classes Overview by Intella Parts
Forklift Classes Overview by Intella Parts
Intella Parts
 
DESIGN A COTTON SEED SEPARATION MACHINE.docx
DESIGN A COTTON SEED SEPARATION MACHINE.docxDESIGN A COTTON SEED SEPARATION MACHINE.docx
DESIGN A COTTON SEED SEPARATION MACHINE.docx
FluxPrime1
 
Pile Foundation by Venkatesh Taduvai (Sub Geotechnical Engineering II)-conver...
Pile Foundation by Venkatesh Taduvai (Sub Geotechnical Engineering II)-conver...Pile Foundation by Venkatesh Taduvai (Sub Geotechnical Engineering II)-conver...
Pile Foundation by Venkatesh Taduvai (Sub Geotechnical Engineering II)-conver...
AJAYKUMARPUND1
 

Recently uploaded (20)

Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&BDesign and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
 
Basic Industrial Engineering terms for apparel
Basic Industrial Engineering terms for apparelBasic Industrial Engineering terms for apparel
Basic Industrial Engineering terms for apparel
 
Nuclear Power Economics and Structuring 2024
Nuclear Power Economics and Structuring 2024Nuclear Power Economics and Structuring 2024
Nuclear Power Economics and Structuring 2024
 
Immunizing Image Classifiers Against Localized Adversary Attacks
Immunizing Image Classifiers Against Localized Adversary AttacksImmunizing Image Classifiers Against Localized Adversary Attacks
Immunizing Image Classifiers Against Localized Adversary Attacks
 
Final project report on grocery store management system..pdf
Final project report on grocery store management system..pdfFinal project report on grocery store management system..pdf
Final project report on grocery store management system..pdf
 
Sachpazis:Terzaghi Bearing Capacity Estimation in simple terms with Calculati...
Sachpazis:Terzaghi Bearing Capacity Estimation in simple terms with Calculati...Sachpazis:Terzaghi Bearing Capacity Estimation in simple terms with Calculati...
Sachpazis:Terzaghi Bearing Capacity Estimation in simple terms with Calculati...
 
Hierarchical Digital Twin of a Naval Power System
Hierarchical Digital Twin of a Naval Power SystemHierarchical Digital Twin of a Naval Power System
Hierarchical Digital Twin of a Naval Power System
 
一比一原版(SFU毕业证)西蒙菲莎大学毕业证成绩单如何办理
一比一原版(SFU毕业证)西蒙菲莎大学毕业证成绩单如何办理一比一原版(SFU毕业证)西蒙菲莎大学毕业证成绩单如何办理
一比一原版(SFU毕业证)西蒙菲莎大学毕业证成绩单如何办理
 
Fundamentals of Electric Drives and its applications.pptx
Fundamentals of Electric Drives and its applications.pptxFundamentals of Electric Drives and its applications.pptx
Fundamentals of Electric Drives and its applications.pptx
 
Industrial Training at Shahjalal Fertilizer Company Limited (SFCL)
Industrial Training at Shahjalal Fertilizer Company Limited (SFCL)Industrial Training at Shahjalal Fertilizer Company Limited (SFCL)
Industrial Training at Shahjalal Fertilizer Company Limited (SFCL)
 
Gen AI Study Jams _ For the GDSC Leads in India.pdf
Gen AI Study Jams _ For the GDSC Leads in India.pdfGen AI Study Jams _ For the GDSC Leads in India.pdf
Gen AI Study Jams _ For the GDSC Leads in India.pdf
 
Hybrid optimization of pumped hydro system and solar- Engr. Abdul-Azeez.pdf
Hybrid optimization of pumped hydro system and solar- Engr. Abdul-Azeez.pdfHybrid optimization of pumped hydro system and solar- Engr. Abdul-Azeez.pdf
Hybrid optimization of pumped hydro system and solar- Engr. Abdul-Azeez.pdf
 
Water Industry Process Automation and Control Monthly - May 2024.pdf
Water Industry Process Automation and Control Monthly - May 2024.pdfWater Industry Process Automation and Control Monthly - May 2024.pdf
Water Industry Process Automation and Control Monthly - May 2024.pdf
 
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单专业办理
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单专业办理一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单专业办理
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单专业办理
 
Standard Reomte Control Interface - Neometrix
Standard Reomte Control Interface - NeometrixStandard Reomte Control Interface - Neometrix
Standard Reomte Control Interface - Neometrix
 
Governing Equations for Fundamental Aerodynamics_Anderson2010.pdf
Governing Equations for Fundamental Aerodynamics_Anderson2010.pdfGoverning Equations for Fundamental Aerodynamics_Anderson2010.pdf
Governing Equations for Fundamental Aerodynamics_Anderson2010.pdf
 
NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...
NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...
NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...
 
Forklift Classes Overview by Intella Parts
Forklift Classes Overview by Intella PartsForklift Classes Overview by Intella Parts
Forklift Classes Overview by Intella Parts
 
DESIGN A COTTON SEED SEPARATION MACHINE.docx
DESIGN A COTTON SEED SEPARATION MACHINE.docxDESIGN A COTTON SEED SEPARATION MACHINE.docx
DESIGN A COTTON SEED SEPARATION MACHINE.docx
 
Pile Foundation by Venkatesh Taduvai (Sub Geotechnical Engineering II)-conver...
Pile Foundation by Venkatesh Taduvai (Sub Geotechnical Engineering II)-conver...Pile Foundation by Venkatesh Taduvai (Sub Geotechnical Engineering II)-conver...
Pile Foundation by Venkatesh Taduvai (Sub Geotechnical Engineering II)-conver...
 

A Parallel, Energy Efficient Hardware Architecture for the merAligner on FPGA using Chisel HCL

  • 1. A Parallel, Energy Efficient Hardware Architecture for the merAligner on FPGA using Chisel HCL Lorenzo Di Tucci, Marco Santambrogio {lorenzo.ditucci, marco.santambrogio}@polimi.it Alessandro Comodi, Davide Conficconi {alessandro.comodi,davide.conficconi}@mail.polimi.it Steven Hofmeyr, David Donofrio {shofmeyr, ddonofrio}@lbl.gov RAW @ JW Marriott, Vancouver May 22 2018 speaker: Alessandro Comodi
  • 2. Context 1 Large amounts of genomic data Algorithm complexity In such scenario there is a need to have efficient solutions both from a performance and a power consumption point of view
  • 3. Sequence Alignment 2 Sequence alignment algorithms are some of the most compute intensive ones Pure software solution Poor performance High power consumption
  • 4. merAligner 3 To overcome performance issues Lawrence Berkeley National Labs and UC Berkeley have proposed the merAligner High Performance Low power efficiency (90 kW per cabinet) More than 15,000 cores
  • 5. Contributions 4 • The design and development of a hardware architecture for the Smith-Waterman algorithm on FPGA, using Chisel HCL • The development of a wrapper written in Chisel, used to integrate RTL cores into the Xilinx SDAccel Framework
  • 6. Smith-Waterman 5 The main bottleneck of the merAligner tool is the Smith- Waterman algorithm implementation Highly parallel computation
  • 7. Architecture 6 Systolic array based design Each processing element is fed with the result of the previous one
  • 8. Results 7 Read Reference Frequency [MHz] Performance [GCUPS] Speed up Performance Efficiency [GCUPS/W] Speed up Power efficiency 128[*] 1024 - 3.87 - 0.0165 - 128 1024 150 3.542 0.91X 0.141 8.54X 128 2048 140 5.616 1.45X 0.224 14.35X 128 4096 180 6.529 1.68X 0.261 15.81X 128 16384 110 11.443 2.84X 0.457 27.69X 256 1024 160 6.123 1.58X 0.244 14.78X 256 2048 160 8.393 2.16X 0.335 20.30X 256 4096 130 15.225 3.93X 0.609 36.90X 256 16384 140 27.312 7.05X 1.092 66.18X [*] State of the Art Smith-Waterman software implementation
  • 9. Concluding Remarks 8 Read Reference Frequency [MHz] Performance [GCUPS] Speed up Performance Efficiency [GCUPS/W] Speed up Power efficiency 128[*] 1024 - 3.87 - 0.0165 - 128 1024 150 3.542 0.91X 0.141 8.54X 128 2048 140 5.616 1.45X 0.224 14.35X 128 4096 180 6.529 1.68X 0.261 15.81X 128 16384 110 11.443 2.84X 0.457 27.69X 256 1024 160 6.123 1.58X 0.244 14.78X 256 2048 160 8.393 2.16X 0.335 20.30X 256 4096 130 15.225 3.93X 0.609 36.90X 256 16384 140 27.312 7.05X 1.092 66.18X [*] State of the Art Smith-Waterman software implementation Thank you for your attention! Lorenzo Di Tucci, Marco Santambrogio {lorenzo.ditucci, marco.santambrogio}@polimi.it Alessandro Comodi, Davide Conficconi {alessandro.comodi,davide.conficconi}@mail.polimi.it Steven Hofmeyr, David Donofrio {shofmeyr, ddonofrio}@lbl.gov speaker: Alessandro Comodi Hardware architecture for the acceleration of the Smith-Waterman step of the merAligner on FPGA using Chisel HCL