1. Motivation
Method
Results
Summary
GPU-Euler
Sequence Assembly using GPGPU
S. Mahmood, H. Rangwala
Department of Computer Science
George Mason University
International Conference on High Performance Computing & Communications, 2011
Banff, Canada
Mahmood, Rangwala GPU-Euler
Outline
1 Motivation
Genome Assembly
Previous Work
GPGPU
2 Method
Parallel Eulerian Assembly
Time Complexity Analysis
Evaluation
3 Results
Data sets
Results
Genome
A genome is a biological blueprint.
Very long chains of four types of nucleobases:
Adenine
Guanine
Cytosine
Thymine
Important for understanding the function of the organism.
Figure: Double Helix DNA representation (image courtesy of Image Library of Biological Macromolecules, Jena, Germany. http://www.imb-jena.de/IMAGE.html)
Sequence Assembly
Challenges
The total number of nucleobases in a genome is very large, e.g. the human genome has 3.2 billion base pairs.
Existing technologies can only read a fraction of this long strand.
Smaller fragments (reads) must be stitched together.
Figure: Sequence Assembly (image courtesy of Center for Bioinformatics and Computational Biology, UMD. www.cbcb.umd.edu/research/assembly_primer.html)
Problem Statement
Given an alphabet Σ = {A, G, C, T} and a set of strings R = {r1, r2, r3, ..., rn} over Σ,
construct a superstring S containing all the strings from R.
Similar to the Shortest Common Superstring problem.
Need to consider repeats.
Massive volume of data.
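The problem statement above can be made concrete with a small sketch. The code below is only an illustration of the superstring formulation, using a greedy overlap-merge heuristic; it is not the method GPU-Euler uses, and the function names (`is_superstring`, `overlap`, `greedy_superstring`) are hypothetical.

```python
def is_superstring(s, reads):
    """True if every read occurs as a contiguous substring of s."""
    return all(r in s for r in reads)


def overlap(a, b):
    """Length of the longest suffix of a that is also a prefix of b."""
    for n in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_superstring(reads):
    """Greedy heuristic: repeatedly merge the pair with the largest overlap."""
    reads = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    # Drop reads that are fully contained in another read.
    reads = [r for r in reads if not any(r != o and r in o for o in reads)]
    while len(reads) > 1:
        best = (-1, 0, 1)  # (overlap length, index of left read, index of right read)
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    n = overlap(a, b)
                    if n > best[0]:
                        best = (n, i, j)
        n, i, j = best
        merged = reads[i] + reads[j][n:]
        reads = [r for idx, r in enumerate(reads) if idx not in (i, j)] + [merged]
    return reads[0] if reads else ""
```

For the reads ACGT, GTCA and CATT this yields ACGTCATT. A shortest superstring tends to collapse repeated regions of the genome into one copy, which is why repeats need separate consideration, and the all-pairs overlap search is quadratic in the number of reads, which is impractical for massive data.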
General Purpose GPU Computing
GPUs for General Purpose Computing
Massive parallelism for applications.
nVidia CUDA, a framework for development on nVidia GPUs.
Similar models for parallel computation:
Parallel Random Access Machine (PRAM)
Single Instruction Multiple Data (SIMD)
Figure: CUDA Application Stack (image courtesy of nVidia: nVidia CUDA Toolkit Reference Manual)
CUDA
Compute Unified Device Architecture
A CUDA-enabled device has:
Streaming Multiprocessors (SM)
Each SM has a set of Streaming Processors (SP).
Global Memory.
Concurrent execution of the same code on all SMs.
Computations use GPU memory.
Figure: Hardware Architecture (image courtesy of nVidia: nVidia CUDA Toolkit Reference Manual)
Concepts
de-Bruijn Graph: a directed graph in which
each vertex is a word of length k;
each edge represents an overlap of length k − 1 between two vertices.
Contigs: assembled sequences from the input data.
Euler Tour: a graph traversal visiting each edge exactly once.
Figure: de-Bruijn Graph
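The three concepts can be tied together in a short sequential sketch: build a de-Bruijn graph from reads, walk an Euler path with Hierholzer's algorithm, and spell the visited vertices back into a sequence. This is a textbook single-threaded illustration under the definitions above, not the parallel algorithm of GPU-Euler; the names `de_bruijn`, `euler_path` and `spell` are hypothetical.

```python
from collections import defaultdict


def de_bruijn(reads, k):
    """Vertices are k-mers; each distinct (k+1)-mer adds one edge from its
    k-length prefix to its k-length suffix (the two overlap in k - 1 bases)."""
    edges = set()
    for read in reads:
        for i in range(len(read) - k):
            edges.add(read[i:i + k + 1])
    graph = defaultdict(list)
    for e in sorted(edges):
        graph[e[:-1]].append(e[1:])
    return graph


def euler_path(graph):
    """Hierholzer's algorithm; assumes the graph has an Euler path or circuit."""
    out = {v: list(ws) for v, ws in graph.items()}
    if not out:
        return []
    indeg = defaultdict(int)
    for ws in graph.values():
        for w in ws:
            indeg[w] += 1
    # Start at a vertex with more outgoing than incoming edges, if one exists.
    start = next((v for v in out if len(out[v]) > indeg[v]), next(iter(out)))
    stack, path = [start], []
    while stack:
        v = stack[-1]
        if out.get(v):
            stack.append(out[v].pop())
        else:
            path.append(stack.pop())
    return path[::-1]


def spell(path):
    """Each step along the path contributes exactly one new base."""
    return path[0] + "".join(v[-1] for v in path[1:]) if path else ""
```

With reads ACGTC and GTCAT and k = 3, the path ACG, CGT, GTC, TCA, CAT spells the contig ACGTCAT.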
Objective
Represent the input reads as a de-Bruijn graph.
Each edge would correspond to a single base.
An Euler tour will visit each base only once.
Parallel Eulerian Assembly
Construct de-Bruijn Graph
Find Euler Tour
Identify Contigs
Output Contigs
Figure: GPU-Euler Workflow. FASTA file (input) -> Reads -> de-Bruijn Graph Construction -> Graph -> Euler Tour Construction -> Annotated Graph -> Identify Contigs -> Contigs -> FASTA file (output)
Parallel de-Bruijn Graph Construction
Assign each CUDA thread to one read.
Generate k-mers and (k + 1)-mers.
Store them in a hash table.
Create vertices from k-mers and edges from (k + 1)-mers.
Figure: Graph Construction. FASTA file -> Count Edges (CUDA) -> Setup Vertices (CUDA) -> Setup Edges (CUDA)
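The count, setup-vertices, setup-edges sequence in the figure resembles the usual way of building a compact adjacency array on a GPU: count edges per vertex, take a prefix sum to get offsets, then scatter the edges. The sketch below emulates that pattern sequentially in Python. The array layout, the use of a Python dict as the hash table, and the names `kmers` and `build_csr` are assumptions for illustration; the slides do not specify the actual data structures.

```python
from itertools import accumulate


def kmers(read, k):
    """All substrings of length k, in order of position."""
    return [read[i:i + k] for i in range(len(read) - k + 1)]


def build_csr(reads, k):
    # "Hash table": map each distinct k-mer to a dense vertex id.
    # Each outer loop iteration stands in for one thread handling one read.
    vertex_id = {}
    for read in reads:
        for km in kmers(read, k):
            vertex_id.setdefault(km, len(vertex_id))

    # Distinct (k+1)-mers are the edges.
    edges = sorted({e for read in reads for e in kmers(read, k + 1)})

    # Pass 1 (count edges): outgoing edges per vertex.
    out_count = [0] * len(vertex_id)
    for e in edges:
        out_count[vertex_id[e[:k]]] += 1

    # Pass 2 (setup vertices): prefix sum gives each vertex its slice.
    offsets = [0] + list(accumulate(out_count))

    # Pass 3 (setup edges): scatter each edge target into its vertex's slice.
    cursor = offsets[:-1].copy()
    targets = [None] * len(edges)
    for e in edges:
        u, v = vertex_id[e[:k]], vertex_id[e[1:]]
        targets[cursor[u]] = v
        cursor[u] += 1
    return vertex_id, offsets, targets
```

The outgoing neighbours of vertex u are `targets[offsets[u]:offsets[u + 1]]`. Splitting the work into a counting pass and a filling pass means every write position is known in advance, which suits one-thread-per-item execution.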
Parallel Euler Tour
Create an Edge Successor Graph from the de-Bruijn Graph.
Identify Circuits in the Edge Successor Graph.
Create a Circuit Graph by identifying adjacent circuits.
Calculate a spanning tree for the Circuit Graph.
Traverse the Circuit Graph and switch successor edges of adjacent Circuits.
Figure: Parallel Euler Tour. DeBruijn Graph -> Assign Successor (CUDA) -> Annotated Graph -> Find Component (CUDA) -> Comp. Label Graph -> Create Circuit (CUDA) -> Circuit Graph -> Find Spanning Tree -> Spanning Tree -> Execute Swipe (CUDA) -> Euler Tour
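The steps above can be followed in a sequential sketch. Every edge first receives a successor edge, which splits the graph into edge-disjoint circuits; circuits that meet at a vertex are then joined by swapping the successors of two edges entering that vertex. The code assumes a connected directed graph in which every vertex has equal in-degree and out-degree. It uses a union-find structure in place of an explicit circuit graph and spanning tree, which is a simplification chosen here and not necessarily how GPU-Euler does it; `euler_circuit` is a hypothetical name.

```python
from collections import defaultdict


def euler_circuit(edges):
    """edges: list of (u, v) pairs. Returns edge indices in tour order."""
    incoming, outgoing = defaultdict(list), defaultdict(list)
    for idx, (u, v) in enumerate(edges):
        outgoing[u].append(idx)
        incoming[v].append(idx)

    # Step 1: edge successor graph. Pair the i-th incoming edge of each
    # vertex with its i-th outgoing edge.
    succ = [None] * len(edges)
    for v, ins in incoming.items():
        for e_in, e_out in zip(ins, outgoing[v]):
            succ[e_in] = e_out

    # Step 2: identify circuits by following successors.
    label = [None] * len(edges)
    for start in range(len(edges)):
        e = start
        while label[e] is None:
            label[e] = start
            e = succ[e]

    # Steps 3-4: union-find over circuit labels. The merges that succeed
    # form a spanning tree of the circuit graph.
    parent = {c: c for c in set(label)}

    def find(c):
        while parent[c] != c:
            parent[c] = parent[parent[c]]
            c = parent[c]
        return c

    # Step 5: two edges entering the same vertex but lying on different
    # circuits swap successors, which joins the two circuits into one.
    for ins in incoming.values():
        for e1, e2 in zip(ins, ins[1:]):
            a, b = find(label[e1]), find(label[e2])
            if a != b:
                succ[e1], succ[e2] = succ[e2], succ[e1]
                parent[a] = b

    tour, e = [], 0
    for _ in edges:
        tour.append(e)
        e = succ[e]
    return tour
```

For the figure-eight graph with edges (0,1), (1,0), (1,2), (2,1), step 1 produces two separate circuits that share vertex 1, and a single swap at that vertex joins them into one tour. Steps 1, 2 and 5 touch each edge independently, which is what makes them candidates for one-thread-per-edge execution.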
Decomposition of Different Phases
Phase                                     Computation
I/O and k-mer Extraction                  CPU + GPU
Hash Table Construction                   GPU
de-Bruijn Graph Construction              GPU
Euler Tour Construction                   GPU + CPU
Sub-steps for Euler Tour Construction:
  Finding Connected Component             GPU
  Circuit Graph Creation                  GPU
  Spanning Tree                           CPU
  Swipe Execution                         GPU
  Traversal (Other)                       GPU
Contig Generation (O/P)                   CPU
Time Complexity Analysis
Step                            Complexity    Processors
de-Bruijn Graph Construction    O(1)          O(n)
Euler Tour Construction         O(log n)      O(n)
Spanning Tree                   O(log |V|)    O(|V|)
GPU-Euler                       O(log n)      O(n)
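To see where a bound of O(log n) time on O(n) processors can come from in the PRAM model, pointer jumping is a standard example: each element repeatedly replaces its successor with its successor's successor, so the distance covered doubles every round. The sketch below ranks a linked list this way and counts the rounds. It is a generic PRAM illustration, not a routine taken from GPU-Euler, and `list_rank` is a hypothetical name.

```python
def list_rank(succ):
    """succ[i] is the element after i; the tail points to itself.
    Returns (rank, rounds), where rank[i] is the distance from i to the tail."""
    n = len(succ)
    nxt = list(succ)
    rank = [0 if nxt[i] == i else 1 for i in range(n)]
    rounds = 0
    # Stop once every element points directly at the tail.
    while any(nxt[i] != nxt[nxt[i]] for i in range(n)):
        # One parallel round: every "processor" i reads the old arrays
        # and writes its own slot of the new ones.
        rank = [rank[i] + rank[nxt[i]] for i in range(n)]
        nxt = [nxt[nxt[i]] for i in range(n)]
        rounds += 1
    return rank, rounds
```

For a list of 8 elements the loop finishes after 3 rounds, with n independent updates per round.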
Experimental Protocol
Compared timing, N50 score, and mean length with EulerSR using various parameters.
Why EulerSR?
Based on the same concept
Shared memory approach
Supports short reads
Contigs with length 100 were included in the comparison.
Calculated contig coverage using MUMmer.
Individual GPU computations were timed as well.
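For reference, the N50 score and mean length used in the comparison can be computed from a list of contig lengths as follows. This is a minimal sketch using the common definition of N50 (the length of the contig at which the running total, taken from longest to shortest, first reaches half of all assembled bases); the slides do not spell out the exact definition used, and the function names are hypothetical.

```python
def n50(lengths):
    """Length L such that contigs of length >= L hold at least half the bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0


def mean_length(lengths):
    """Arithmetic mean of the contig lengths (0.0 for an empty list)."""
    return sum(lengths) / len(lengths) if lengths else 0.0
```

For contig lengths 80, 70, 50, 40, 30 and 20 the total is 290, and the two longest contigs already cover 150 bases, so N50 is 70.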
Data Sets
Genome size and number of simulated reads for different read lengths
Genome                    Length       36 bp        50 bp      256 bp
Campylobacter jejuni      1,641,481    911,934      656,593    128,241
Neisseria meningitidis    2,184,406    1,213,559    873,763    170,657
Lactococcus lactis        2,635,589    1,314,216    946,236    184,812
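The read counts can be related to sequencing depth with the usual coverage formula, coverage = (number of reads × read length) / genome length. The sketch below applies it to the table values as listed; the `coverage` function and the `DATASETS` layout are illustrative names, and the slides do not state a target coverage.

```python
def coverage(num_reads, read_length, genome_length):
    """Average number of times each base of the genome is covered by a read."""
    return num_reads * read_length / genome_length


# (genome, genome length, {read length: number of reads}) as listed in the table.
DATASETS = [
    ("Campylobacter jejuni", 1_641_481, {36: 911_934, 50: 656_593, 256: 128_241}),
    ("Neisseria meningitidis", 2_184_406, {36: 1_213_559, 50: 873_763, 256: 170_657}),
    ("Lactococcus lactis", 2_635_589, {36: 1_314_216, 50: 946_236, 256: 184_812}),
]

for name, genome_length, reads in DATASETS:
    depths = {rl: round(coverage(n, rl, genome_length), 2) for rl, n in reads.items()}
    print(name, depths)
```

With these values the first two genomes come out at about 20x for every read length, and the third at about 18x.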
GPU Euler Phase Distribution
Phase                                     Computation    % Time
I/O and k-mer Extraction                  CPU + GPU      77.29 + 1.44
Hash Table Construction                   GPU            0.31
de-Bruijn Graph Construction              GPU            1.15
Euler Tour Construction                   GPU + CPU
Sub-steps for Euler Tour Construction:
  Finding Connected Component             GPU            10.06
  Spanning Tree                           CPU            0.06
  Swipe Execution                         GPU            0.01
  Circuit Graph Traversal (Other)         GPU            0.72
Contig Generation (O/P)                   CPU            4.39
Summary
Exploiting GPUs for Sequence Assembly.
Implementation of PRAM algorithm on CUDA devices.
Outlook
No Error Correction
Graph Simplification.