Maeve - Fast genome analysis leveraging exact string matching
Beatrice Branchini <beatrice.branchini@mail.polimi.it>
Sofia Breschi <sofia.breschi@mail.polimi.it>
Alberto Zeni <alberto.zeni@mail.polimi.it>
Marco Santambrogio <marco.santambrogio@polimi.it>
July 22, vNGC
Genomics investigation
Applications
• Diagnosis of genetic diseases
• Personalized medicine
• Prevention of diseases
Genome analysis cost
DNA sequencing cost per genome (source: https://www.genome.gov/sequencingcostsdata)
Genomic pipeline
Sequencing → Assembly → DNA analysis
Context
Exact pattern matching algorithms
Technical challenges
• Computationally intensive
• Time consuming
• Inefficient architectures
Solution
Implementation of the Knuth-Morris-Pratt (KMP) algorithm on FPGA
KMP algorithm
KMP is an exact pattern matching algorithm
• linear time complexity
• reduces unnecessary comparisons of characters
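To make the two properties above concrete, here is a minimal software sketch of KMP in Python. It is not the authors' FPGA code; the function names failure_table and kmp_search are chosen here for illustration. The failure table records, for each prefix of the pattern, the length of its longest proper prefix that is also a suffix. On a mismatch the search falls back through this table instead of re-reading text characters, which is what keeps the scan linear.

```python
def failure_table(pattern: str) -> list[int]:
    """fail[i] = length of the longest proper prefix of pattern[:i+1]
    that is also a suffix of it."""
    fail = [0] * len(pattern)
    k = 0  # length of the currently matched prefix
    for i in range(1, len(pattern)):
        # Fall back until the next character can extend the prefix.
        while k > 0 and pattern[i] != pattern[k]:
            k = fail[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        fail[i] = k
    return fail


def kmp_search(text: str, pattern: str) -> list[int]:
    """Return the start index of every exact occurrence of pattern in text."""
    if not pattern:
        return []
    fail = failure_table(pattern)
    hits = []
    k = 0  # number of pattern characters matched so far
    for i, ch in enumerate(text):
        # On a mismatch, reuse the matched prefix; the text index never moves back.
        while k > 0 and ch != pattern[k]:
            k = fail[k - 1]
        if ch == pattern[k]:
            k += 1
        if k == len(pattern):
            hits.append(i - k + 1)
            k = fail[k - 1]  # keep going so overlapping matches are found
    return hits


if __name__ == "__main__":
    print(failure_table("ACACAGT"))
    print(kmp_search("ACACACAGTACACAGT", "ACACAGT"))
```

Each text character is read once, and the fallback loop can only undo progress made earlier, so the total work is proportional to the text length plus the pattern length.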
KMP: fast and efficient
FPGA: acceleration
Our implementation
Four lanes, one per query sequence (SEQ1 to SEQ4), each following the same steps:
• Input: Genome + SEQn
• Failure table for SEQn
• Matching
• Output: match/mismatch
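The four-lane structure in the diagram can be mimicked in software as a rough analogy. The sketch below is not the FPGA design, whose internals the slides do not detail. The class Matcher and the function match_all are hypothetical names, and the inner loop over matchers stands in for units that would work concurrently in hardware. Each lane builds the failure table for its own sequence, then all lanes consume the genome one base at a time.

```python
def failure_table(seq: str) -> list[int]:
    """Standard KMP failure table (longest proper prefix that is also a suffix)."""
    fail = [0] * len(seq)
    k = 0
    for i in range(1, len(seq)):
        while k > 0 and seq[i] != seq[k]:
            k = fail[k - 1]
        if seq[i] == seq[k]:
            k += 1
        fail[i] = k
    return fail


class Matcher:
    """One lane: a non-empty query sequence, its failure table, and match state."""

    def __init__(self, seq: str):
        if not seq:
            raise ValueError("query sequence must be non-empty")
        self.seq = seq
        self.fail = failure_table(seq)  # step 1: failure table for this sequence
        self.k = 0                      # pattern characters matched so far
        self.hits: list[int] = []

    def step(self, pos: int, base: str) -> None:
        """Step 2: consume one genome base and update the match state."""
        while self.k > 0 and base != self.seq[self.k]:
            self.k = self.fail[self.k - 1]
        if base == self.seq[self.k]:
            self.k += 1
        if self.k == len(self.seq):
            self.hits.append(pos - self.k + 1)
            self.k = self.fail[self.k - 1]


def match_all(genome: str, sequences: list[str]) -> list[list[int]]:
    """Stream the genome once; every lane sees each base in lockstep."""
    matchers = [Matcher(s) for s in sequences]
    for pos, base in enumerate(genome):
        for m in matchers:  # sequential here; independent lanes in the analogy
            m.step(pos, base)
    return [m.hits for m in matchers]  # an empty list means mismatch


if __name__ == "__main__":
    genome = "GATTACAGATTACA"
    print(match_all(genome, ["GATTACA", "TTAC", "ACAG", "CCCC"]))
```

Because the lanes share no state, nothing in this structure prevents them from running at the same time, which is the property a parallel architecture can exploit.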
Results
Overall speedup of the alignment process: ~6x improvement over multicore
Board: Xilinx Alveo U200
Software: Vitis
Thanks for your attention
