Recent advances in large-scale experimental facilities have ushered in an era of data-driven science. These large-scale data increase the opportunity to answer many fundamental questions in basic science. However, they also pose new challenges to the scientific community in terms of optimal processing. Consequently, scientists are in dire need of robust high-performance computing (HPC) solutions that can scale to terabytes of data.
In this talk, I will address the challenges of two major aspects of scientific big data processing: 1) developing scalable software and algorithms for data- and compute-intensive scientific applications, and 2) proposing new cluster architectures that these applications and software tools need for good performance. I will mainly address the challenges involved in large-scale genome analysis applications, such as genomic error correction and genome assembly, which have recently moved to the forefront of big data challenges as sequencing machines outpace Moore's law by several orders of magnitude.
In the first part, I will address the challenges involved in developing scalable algorithms to process huge amounts of genomic big data using the power of recent analytic tools such as Hadoop, Giraph, and distributed NoSQL. The algorithms are carefully tailored to scale to terabytes of data across hundreds of computing nodes. At a broader level, these algorithms take advantage of locality-based computing for their scalability. In this context, I will briefly talk about my general-purpose analytic framework for easily and rapidly designing embarrassingly parallel algorithms for massive-scale scientific data.
In the second part, I will address the challenges in designing the hardware environment that these data- and compute-intensive applications require for good performance. I will pinpoint the limitations of a traditional HPC cluster (supercomputer) in processing these huge amounts of genomic data with respect to these applications, and propose a solution to those limitations by balancing the storage (both I/O and memory) bandwidth with the computational speed of high-performance CPUs. I will briefly discuss my theoretical model, which can help HPC system designers who are striving for system balance.
Many of these observations and developments have been used by hardware vendors such as Samsung and IBM to develop or improve the configuration of their next-gen HPC clusters (e.g., Samsung's hyperscale computing cluster, IBM's Power8-based supercomputer) with high-speed storage and processing power.
3. Introduction
Big data genome analysis
Big data analysis framework
Big data applications and genome sequences
De novo genome assembly
De novo genomic error correction
Big data cyberinfrastructure
Evaluation of different clusters
Model for optimally balanced cluster
7. Hardware evolution: cost decreases, bandwidth increases
[Charts: increase in FLOPS of the fastest supercomputer, 1993-2012 (roughly 10^10 to 10^17); increase in bandwidth (MB/s) for storage and network, 1995-2011, showing I/O bandwidth per device and network bandwidth per cable. Panels cover processor, storage, and network.]
8. Introduction
Big data genome analysis
Big data analysis framework
Big data applications and genome sequences
De novo genome assembly
De novo genomic error correction
Big data cyberinfrastructure
Evaluation of different clusters
Model for optimally balanced cluster
11. Like restoring a damaged book from multiple copies torn at random places
The problem can be mapped to a graph-analytic problem: the de Bruijn graph
14. Modified version of the parallel list-ranking algorithm
Mark the head (h) and tail (t) and merge the h-t link
Number of rounds: O(log n), where n is the number of vertices in the longest path (illustrated in the sketch below)
[Figures (slides 14-17): successive list-ranking rounds, #1 through #6]
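The deck shows the rounds only as figures; as a rough illustration, here is a minimal synchronous pointer-jumping (list-ranking) sketch in Python. All names are mine, and the real Giraph implementation, which marks head/tail vertices and merges h-t links in parallel, is more involved.

```python
# Minimal pointer-jumping (list-ranking) sketch. Hypothetical illustration
# only: it shows why the number of rounds is O(log n) for a linked list
# given as successor pointers, as on the slide.

def list_rank(succ):
    """succ[v] = next vertex in the list, or None at the tail.
    Returns rank[v] = number of hops from v to the tail."""
    rank = {v: (0 if succ[v] is None else 1) for v in succ}
    jump = dict(succ)
    changed = True
    while changed:                        # O(log n) synchronous rounds
        changed = False
        new_jump, new_rank = {}, dict(rank)
        for v in jump:                    # runs in parallel in the Giraph version
            nxt = jump[v]
            if nxt is not None and jump[nxt] is not None:
                new_rank[v] = rank[v] + rank[nxt]   # rank to the doubled target
                new_jump[v] = jump[nxt]             # pointer doubling
                changed = True
            else:
                new_jump[v] = nxt
        jump, rank = new_jump, new_rank
    return rank

# Example: a 5-vertex path a->b->c->d->e resolves in ~log2(5) rounds
print(list_rank({'a': 'b', 'b': 'c', 'c': 'd', 'd': 'e', 'e': None}))
# {'a': 4, 'b': 3, 'c': 2, 'd': 1, 'e': 0}
```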
Bubbles: vertices with the same predecessor and the same successor
A Levenshtein-like edit-distance algorithm is used
If the distance is below a threshold, the vertex with the minimum frequency is removed
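As a rough sketch of the bubble-popping test (the talk only says a "Levenshtein-like" distance is used; the exact variant and threshold are not specified), a standard dynamic-programming edit distance suffices to illustrate the idea:

```python
# Standard Levenshtein edit distance via dynamic programming, plus a
# hypothetical bubble-popping helper. Names and the threshold value are
# mine, not from the talk.

def levenshtein(a: str, b: str) -> int:
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]

def pop_bubble(branch1, branch2, freq1, freq2, threshold=3):
    """If the two branches are near-identical, keep the higher-frequency one."""
    if levenshtein(branch1, branch2) < threshold:
        return branch1 if freq1 >= freq2 else branch2    # survivor
    return None                                          # keep both branches

print(levenshtein("ACGTACGT", "ACGTTCGT"))  # 1 (single substitution)
```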
22. XSEDE resource
LSU SuperMIC HPC cluster is used
Maximum #nodes: 128
Cores/node: 20 (two 10-core Intel Ivy Bridge)
DRAM/node: 64 GB
Disk/node: 250 GB (hard disk drive)
Network: 56 Gbps InfiniBand
24. ABySS processes failed many times due to network issues
Contrail, being disk-based, took more than the maximum allocated time for a single job

                 GiGA      ABySS    Contrail
#Contigs         3032297   -        -
NG50             827       -        -
Max contig size  35465     -        -
#Cores           512       -        -
Time (hours)     8.5       Failed   Failed
27. De Bruijn graph-based method
More scalable than overlap-based methods
The widest-path algorithm provides accuracy
28. Theory behind using the widest-path algorithm
Assume k-mer coverage is an i.i.d. random variable with CDF F(x)
Distribution of the minimum of n samples: 1 - (1 - F(x))^n
Proof sketch: the probability of containing the minimum-coverage k-mer is highest for the erroneous read, given that many reads sequenced the same region of the genome
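For completeness, the minimum-of-n distribution quoted on the slide follows from a one-line order-statistics argument (notation assumed from the slide):

```latex
% X_1,\dots,X_n: i.i.d. k-mer coverages with CDF F(x).
% The minimum exceeds x only if every sample does:
\Pr\!\left[\min_i X_i > x\right] = \prod_{i=1}^{n} \Pr[X_i > x] = \bigl(1-F(x)\bigr)^{n}
\quad\Longrightarrow\quad
F_{\min}(x) = 1 - \bigl(1-F(x)\bigr)^{n}
```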
29. There may be many erroneous k-mers with high coverage
But the probability of finding the minimum coverage is significantly higher in the error path
Hence, a widest-path algorithm is used to select the correct path
30. Hadoop (MapReduce): for computation
Hazelcast (In-memory NoSQL): for de Bruijn graph storage
31. Map: emits three k-mers
First k-mer: incoming edge
Middle k-mer: vertex
Third k-mer: outgoing edge
Coverage of middle k-mer: 1
Reduce: group by vertex
Aggregate incoming edges
Aggregate outgoing edges
Sum coverages
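A toy, single-process rendition of this map/reduce pattern may make the emission scheme concrete (function names and the in-memory "reduce" are mine; the real job runs on Hadoop with Hazelcast as the graph store):

```python
# Toy rendition of the de Bruijn graph construction job.
# Map: slide each read into (k+2)-length windows and emit the middle k-mer
# (vertex) with its incoming/outgoing k-mers and coverage 1.
# Reduce: group by vertex, aggregate edge sets, and sum coverages.

def map_read(read, k):
    """Emit (vertex k-mer, incoming k-mer, outgoing k-mer, coverage=1)."""
    for i in range(len(read) - (k + 2) + 1):
        window = read[i:i + k + 2]
        yield window[1:k + 1], window[0:k], window[2:k + 2], 1

def reduce_kmers(records):
    graph = {}
    for vertex, in_edge, out_edge, cov in records:
        node = graph.setdefault(vertex, {"in": set(), "out": set(), "cov": 0})
        node["in"].add(in_edge)
        node["out"].add(out_edge)
        node["cov"] += cov
    return graph

reads = ["ACGTAC", "CGTACG"]
records = (rec for read in reads for rec in map_read(read, k=3))
for vertex, node in reduce_kmers(records).items():
    print(vertex, node)
```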
32. Hadoop with Hazelcast
Error detection: a Hadoop map-only job
k-mer coverage < threshold → error k-mer
Millions of searches over the entire dataset
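A sketch of the map-only detection step, with a plain dict standing in for the Hazelcast k-mer store (an assumption for illustration; all names are mine):

```python
# Map-only error-detection sketch: each mapper looks up the coverage of
# every k-mer of a long read in a shared key-value store (a dict stands in
# for Hazelcast here) and flags k-mers below a coverage threshold.

def find_error_kmers(read, k, coverage, threshold):
    """Return positions of suspect (low-coverage) k-mers in the read."""
    errors = []
    for i in range(len(read) - k + 1):
        kmer = read[i:i + k]
        if coverage.get(kmer, 0) < threshold:
            errors.append((i, kmer))
    return errors

coverage = {"ACG": 40, "CGT": 38, "GTA": 2}   # hypothetical counts
print(find_error_kmers("ACGTA", k=3, coverage=coverage, threshold=5))
# [(2, 'GTA')]
```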
33. Widest-path algorithm
Maximize the minimum k-mer coverage along the path in the de Bruijn graph
Modified version of Dijkstra's algorithm
Similar time complexity
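A minimal widest-path (maximum-bottleneck) variant of Dijkstra's algorithm, assuming edge "width" is the coverage of the destination k-mer; this is an illustrative sketch, not ParLECH's actual code:

```python
# Widest (maximum-bottleneck) path via a modified Dijkstra's algorithm.
# Goal: the path whose minimum k-mer coverage is as large as possible.
import heapq

def widest_path(graph, src, dst):
    """graph[u] = [(v, coverage_v), ...]; returns (bottleneck, path)."""
    best = {src: float("inf")}
    heap = [(-float("inf"), src, [src])]       # max-heap via negated widths
    while heap:
        neg_w, u, path = heapq.heappop(heap)
        width = -neg_w
        if u == dst:
            return width, path                  # first pop of dst is optimal
        if width < best.get(u, -1):
            continue                            # stale heap entry
        for v, cov in graph.get(u, []):
            w = min(width, cov)                 # bottleneck so far
            if w > best.get(v, -1):
                best[v] = w
                heapq.heappush(heap, (-w, v, path + [v]))
    return 0, []

# Hypothetical de Bruijn fragment with two branches of a bubble:
g = {"s": [("a", 30), ("b", 4)], "a": [("t", 25)], "b": [("t", 40)]}
print(widest_path(g, "s", "t"))   # (25, ['s', 'a', 't'])
```

Keying the heap on the negated bottleneck width preserves Dijkstra's structure, which is presumably the "similar time complexity" noted on the slide.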
34. PacBio data
Dataset    #Reads      Data size (GB)  Read length  %Reads aligned
E. coli    1129576     1.032           1120         78.97
Yeast      2315594     0.53            5874         82.12
Fruit fly  6701498     55              4328         51.14
Human      23897260    312             6587         72.3

Illumina data
Dataset    #Reads      Data size (GB)  Read length  %Reads aligned
E. coli    45440200    13.50           101          99.44
Yeast      4503422     1.20            101          93.75
Fruit fly  179363706   59              101          95.56
Human      1420689270  452             101          79.60
35. %Reads aligned: percentage of the corrected long reads aligned to the reference genome
%ReadsAligned = AlignedReads / TotalReads * 100
%Base pairs aligned: percentage of the base pairs (of total base pairs) of the corrected long reads aligned to the reference genome
%BasePairsAligned = AlignedBases / TotalBases * 100
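These metrics are plain ratios; a two-line helper (names mine) for clarity:

```python
# The two alignment metrics from the slide, as plain ratios.
def pct_reads_aligned(aligned_reads, total_reads):
    return aligned_reads / total_reads * 100

def pct_bases_aligned(aligned_bases, total_bases):
    return aligned_bases / total_bases * 100

# Hypothetical counts: 936,900 of 1,000,000 reads aligned -> 93.69%
print(pct_reads_aligned(936_900, 1_000_000))
```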
36. Widest path (WP): select the path in the de Bruijn graph that maximizes the minimum k-mer coverage
Leverages the coverage information while correcting the error
Dijkstra's shortest path (SP): select the shortest path without using any coverage information
Coverage information is used only when the de Bruijn graph is constructed; k-mers below a threshold are removed from the graph
1-step greedy (Gr): select the successor k-mer with the highest coverage
High chance of selecting the wrong path
Stopped after a predefined number of hops
37. Widest path shows the best performance
Greedy shows the worst performance
k is set to 15

Data       Algorithm    %Reads aligned  %Base pairs aligned
E. coli    ParLECH-WP   93.69           92.15
           ParLECH-SP   87.55           86.49
           ParLECH-Gr   76.68           70.92
Yeast      ParLECH-WP   86.07           89.31
           ParLECH-SP   84.92           86.44
           ParLECH-Gr   75.77           74.68
Fruit fly  ParLECH-WP   65.92           62.42
           ParLECH-SP   54.53           49.41
           ParLECH-Gr   43.97           37.44
38. ParLECH aligned more reads and base pairs to the reference genome compared to LoRDEC
k is set to 15

Data       Algorithm  %Reads aligned  %Base pairs aligned
E. coli    ParLECH    93.69           92.15
           LoRDEC     87.55           86.49
           Original   78.97           75.07
Yeast      ParLECH    86.07           89.31
           LoRDEC     84.92           87.08
           Original   82.12           88.69
Fruit fly  ParLECH    65.92           62.42
           LoRDEC     54.53           49.69
           Original   51.14           46.04
39. XSEDE resource
LSU SuperMIC HPC cluster is used
Maximum #nodes: 128
Cores/node: 20 (two 10-core Intel Ivy Bridge)
DRAM/node: 64 GB
Disk/node: 250 GB (hard disk drive)
Network: 56 Gbps InfiniBand
40. LoRDEC performs better on a single node
ParLECH outperforms it when more nodes are added
[Chart: execution time (min) vs. #nodes (1-32) for ParLECH and LoRDEC]
41. Almost linear scalability
[Chart, log-log: execution time (min) vs. number of nodes (16-128) for the KmerCount, LocateError, and CorrectError phases and the total]
42. A total of 764GB of data is processed
Appreciable accuracy
LoRDEC could not process it (could not produce the de Bruijn graph)

PacBio data size: 312GB
Illumina data size: 452GB
#Nodes used: 128
k: 17
Time: 28.6 hours
%Reads aligned: 78.3
%Base pairs aligned: 75.43
43. Desired software characteristics for big data genome analysis:
Distributed
Scalable
Low cost
Considers data locality
Capable of working on commodity hardware
Develop algorithms using the big data analytics model
Better performance than other MPI-based software on a traditional HPC environment
Can we get better performance by changing the hardware infrastructure?
44. Introduction
Big data genome analysis
Big data analysis framework
Big data applications and genome sequences
De novo genome assembly
De novo genomic error correction
Big data cyberinfrastructure
Evaluation of different clusters
Model for optimally balanced cluster
47. Network issues
Fat-tree architecture with 2:1 blocking
Low effective bandwidth → current programming models need high bandwidth
Storage issues
Few directly attached devices (normally hard disk drives)
Low I/O bandwidth → big data jobs become I/O-bound
Memory issues
Low RAM per core → significant tradeoff between the degree of data parallelism and the memory requirement
Low buffer size increases data spilling to disk → causes a significant performance drop with HDDs
50. Bumble bee
Stage                 Job type               Input                 Final output          #Jobs  Shuffled data  HDFS data
Graph construction    Hadoop                 90GB (500M reads)     95GB                  2      2TB            136GB
Graph simplification  Series of Giraph jobs  95GB (715M vertices)  640MB (62K vertices)  15     -              966GB

Human
Stage                 Job type               Input                 Final output          #Jobs  Shuffled data  HDFS data
Graph construction    Hadoop                 452GB (2B reads)      3TB                   2      9.9TB          3.2TB
Graph simplification  Series of Giraph jobs  3TB (1.5B vertices)   3.8GB (3M vertices)   15     -              4.1TB

#Nodes used: SuperMikeII: 15; SwatIII-Basic-HDD/SSD: 15; SwatIII-Memory: 15
51. 40Gbps InfiniBand + 2:1 blocking vs. 10Gbps Ethernet + no blocking → similar performance
SSD vs. HDD → Hadoop shows 50% improvement
256GB vs. 32GB DRAM → Hadoop shows 70% and Giraph shows 35% improvement
[Chart: effect of network (InfiniBand vs. Ethernet) on assembling the 90GB bumble bee genome; execution time normalized to SuperMikeII for graph construction, graph simplification, and the entire pipeline; SwatIII-Basic-HDD values: 1.012, 1.033, 1.025]
[Chart: effect of storage type (HDD vs. SSD) and RAM size while assembling the bumble bee genome; execution time normalized to SuperMikeII for SwatIII-Basic-SSD and SwatIII-Memory; labeled values: 0.5, 0.3, 0.96, 0.65, 0.79, 0.67]
52. Scaled-up cluster: more execution time, but more performance/$
HDD and SSD show almost the same execution time
HDD shows better performance/$ than SSD
[Charts: performance/$ and execution time for the 90GB bumble bee genome assembly (graph construction, graph simplification, entire pipeline), normalized to SuperMikeII]
53. Fewer scaled-up servers
Better than a traditional HPC cluster (3-4x benefit in performance/$)
HDD performs similarly to SSD
HDD shows better performance/$ than SSD
[Charts: for the human genome (452GB), execution time normalized to SuperMikeII (labeled values: 1.006, 1.128, 0.898, 1.023, 0.999, 1.077) and performance/$ (labeled values: 3.17, 4.36, 3.88, 4.79, 3.65, 4.21) across graph construction, graph simplification, and the entire pipeline]
54. 1 SSD performs similarly to 4 HDDs
The disk controller saturates at ~500MB/s
Adding more disks (HDD or SSD) does not improve performance any further
[Chart: execution time (s) by number and type of directly attached disks per DataNode: 1 HDD 5740, 2 HDD 4429, 4 HDD 3333, 1 SSD 2939, 2 SSD 2732]
55. Hyperscale system prototype
32 low-power nodes: 2 cores, 1 SSD, and 16GB RAM per node
10% better performance than SuperMikeII (16 cores, 1 HDD, and 32GB RAM per node)
More than twice the improvement in performance/$
[Charts: for the 90GB bumble bee genome assembly, execution time normalized to SuperMikeII (0.93, 0.89, 0.90 for graph construction, graph simplification, and the entire pipeline) and performance/$ (2.16, 2.24, 2.215)]
56. Increase compute bandwidth
Power8 processor has 8-way SMT
16 memory controllers
Increase I/O bandwidth
Many HDDs per node
I/O and compute distributed across SMT threads
Increase network bandwidth
Clos topology with no blocking
57. Intel Knights Landing (KNL) cluster
Low energy consumption
Knights Landing processor with a lower clock speed
Increased compute and I/O parallelism
4-way SMT (instead of 2-way hyperthreading)
Non-volatile RAM: high-bandwidth flash memory
Nvidia GPU cluster
General-purpose GPUs (GPGPU)
Work in conjunction with Intel or IBM Power8 processors
NVLink (high-speed connection between the IBM Power processor and the GPU)
58. Limitations in traditional HPC clusters and data centers:
Network
Storage
Memory
Huge tradeoff between performance and cost
How do we model these observations to develop an optimal cluster architecture?
59. A Theoretical Model to Build Cost-Effective Balanced HPC Infrastructure for Data-Driven Science
60. Amdahl's I/O number for a balanced system:
1 bit (0.125 bytes) of I/O per second per instruction per second (IPS)
Amdahl's memory number for a balanced system:
1 byte of memory per IPS
Limitation
One-size-fits-all: does not consider the impact of the application's characteristics
61. Modified Amdahl's I/O number
8 MIPS per MBps of I/O
Measured on the relevant application
Modified Amdahl's memory number
The MB/MIPS ratio is rising from 1 to 4
Limitations
Does not consider the cost component
Observations only: no theoretical background
63. Ignores overlap of work done by I/O and memory
Ignores the CPU microarchitecture
Considers the number of instructions executed per cycle (IPC) as proportional to the CPU core frequency
69.
Cluster         SuperMikeII        SwatIII            CeresII
Cluster type    Traditional HPC    Datacenter         MicroBrick
β_io            0.003              0.015              0.166
β_mem           0.77               6.15               5.33
γ_io            0.0005             0.01               1.03
γ_mem           0.06               1.47               1.25
Optimized for   Compute-intensive  Compute- and       I/O-, compute-, and
                applications only  memory-intensive   memory-intensive
                                   applications       applications
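The optima quoted later in the talk are β_io_opt = 0.17 and β_mem_opt = 2.7; a small Python sketch comparing each cluster's balance numbers (from this table) against them (the percentage framing is mine, for illustration):

```python
# Compare each cluster's balance ratios (from the slide) against the
# model's optima; the percentage framing is illustrative only.
clusters = {
    "SuperMikeII": {"beta_io": 0.003, "beta_mem": 0.77},
    "SwatIII":     {"beta_io": 0.015, "beta_mem": 6.15},
    "CeresII":     {"beta_io": 0.166, "beta_mem": 5.33},
}
optimal = {"beta_io": 0.17, "beta_mem": 2.7}   # from the talk's model

for name, b in clusters.items():
    io_ratio = b["beta_io"] / optimal["beta_io"]
    mem_ratio = b["beta_mem"] / optimal["beta_mem"]
    print(f"{name}: beta_io at {io_ratio:.0%} of optimal, "
          f"beta_mem at {mem_ratio:.0%} of optimal")
```

This reproduces the talk's classification: CeresII's β_io sits near the optimum, SuperMikeII's is far below it, and SwatIII overshoots heavily on memory.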
70. Benchmark applications
Terasort: Hadoop; input 1TB; output 1TB; shuffled data 1TB; map is CPU-intensive, reduce is I/O-intensive
Wordcount: Hadoop; input 1TB; output 1TB; shuffled data 1TB; map and reduce are I/O- and CPU-intensive
Genome assembly phase 1: Hadoop; input 452GB (2bn short reads); output 3TB; shuffled data 9.9TB; map and reduce are CPU- and I/O-intensive
Genome assembly phase 2: Giraph; input 3.2TB (1.5bn vertices); output 3.8GB; memory-intensive
71. Lower is better (price-to-performance of SuperMikeII is normalized to 1)
CeresII vs. SuperMikeII: >65% improvement for both applications
CeresII vs. SwatIII: >50% improvement for both applications
[Chart: price-to-performance normalized to SuperMikeII for TeraSort and WordCount on SuperMikeII, SwatIII, and CeresII; labeled values: 0.76, 0.37, 0.79, 0.35]
72. Lower is better (price-to-performance of SuperMikeII is normalized to 1)
CeresII vs. SuperMikeII: 88% and 85% improvement for phase-1 and phase-2, respectively
CeresII vs. SwatIII: 50% and 20% improvement for phase-1 and phase-2, respectively
[Chart: price-to-performance normalized to SuperMikeII for graph construction and graph simplification; labeled values: 0.24, 0.12, 0.22, 0.15]
73. For data-driven applications at current hardware prices:
Amdahl's I/O number (β_io_opt) should be increased compared to Gray's law (from 0.125 to 0.17)
Amdahl's memory number (β_mem_opt) should be decreased compared to Gray's law (from 4 to 2.7)
For HPC clusters:
β_io and β_mem provide an easy-to-use alternative to FLOPS for I/O- and memory-bound applications
Informed choice among hardware components when investing in an HPC cluster, when application characteristics are not known
74. Application of deep learning and AI methodologies to genomics
Key-value memory networks
Metagenomic assembly and error correction
Transferring big genomic data on a blockchain
Security of the sensitive data
High throughput
Current collaborations
San Diego Supercomputer Center
IBM OpenPower
75. Thanks to the faculty and staff of LSU and UW-Platteville
Dr. Seung-Jong Park, Dr. Kisung Lee, Dr. Seungwon Yang, Dr. Jianhua Chen, Dr. Praveen Koppa, Dr. Sayan Goswami, Dr. Richard Platania, Dr. Chui-hui Chiu, Dipak Singh, Dr. Lisa Landgraf, etc.
Samsung SSD team
▪ Jaeki Hong, Jay Seo, Jinki Kim, Wooseok Chang, etc.
IBM Power8 and OpenPower team
▪ Terry Leatherland, Ravi Arimilli, Ganesan Narayanswami, etc.
Other collaborators
▪ Dr. Ling Liu (GATECH)
Bioscience experts
▪ Dr. Joohyun Kim, Dr. Nayong Kim, Dr. Maheshi Dassanayake, Dr. Dong-Ha Oh, etc.
76. This work was supported in part by
NIH-P20GM103424
NSF-MRI-1338051
NSF-CC-NIE-1341008
NSF-IBSS-L-1620451
LA BoR LEQSF(2016-19)-RD-A-08
The HPC services are provided by
LSU HPC
LONI
Samsung Research S. Korea
IBM Research Austin
77. "Developing a Meta Framework for Key-Value Memory Networks on HPC Clusters." Choonhan Youn, Arghya Kusum Das, Seungwon Yang, Joohyun Kim. PEARC 2019. (Collaborative work: UW-Platteville, LSU, and San Diego Supercomputer Center)
"ParLECH: Parallel Long-read Error Correction with Hadoop." Arghya Kusum Das, Seung-Jong Park, Kisung Lee. IEEE BIBM 2018.
"A High-Throughput Interoperability Architecture over Ethereum and Swarm for Big Biomedical Data." Arghya Kusum Das, Seung-Jong Park, Kisung Lee. IEEE CHASE 2018 (Blockchain Workshop).
"Large-scale parallel genome assembler over cloud computing environment." Arghya Kusum Das, Praveen Kumar Koppa, Sayan Goswami, Richard Platania, Seung-Jong Park. JBCB, May 23, 2017 issue.
"ParSECH: Parallel Sequencing Error Correction with Hadoop for Large-Scale Genome Sequences." Arghya Kusum Das, Shayan Shams, Sayan Goswami, Richard Platania, Kisung Lee, Seung-Jong Park. BiCOB 2017.
"Lazer: A Memory-Efficient Framework for Large-Scale Genome Assembly." Sayan Goswami, Arghya Kusum Das, Richard Platania, Kisung Lee, Seung-Jong Park. IEEE Big Data 2016.
78. "Evaluating Different Distributed-Cyber-Infrastructure for Data and Compute Intensive Scientific Application." Arghya Kusum Das, Jaeki Hong, Sayan Goswami, Richard Platania, Wooseok Chang, Seung-Jong Park. IEEE Big Data 2015. [In collaboration with Samsung Electronics Ltd., S. Korea]
"Augmenting Amdahl's Second Law: A Theoretical Model for Cost-Effective Balanced HPC Infrastructure for Data-Driven Science." Arghya Kusum Das, Jaeki Hong, Sayan Goswami, Richard Platania, Kisung Lee, Wooseok Chang, Seung-Jong Park. IEEE Cloud 2017. [In collaboration with Samsung Electronics Ltd., S. Korea]
"IBM POWER8® HPC System Accelerates Genomics Analysis with SMT8 Multithreading." Arghya Kusum Das, Sayan Goswami, Richard Platania, Seung-Jong Park, Ram Ramanujam, Gus Kousoulas, Frank Lee, Ravi Arimilli, Terry Leatherland, Joana Wong, John Simpson, Grace Liu, Jinchun Wang. White paper, Louisiana State University collaboration with IBM.
"BIC-LSU: Big Data Research Integration with Cyberinfrastructure for LSU." Chui-hui Chiu, Nathan Lewis, Dipak Kumar Singh, Arghya Kusum Das, Mohammad M. Jalazai, Richard Platania, Sayan Goswami, Kisung Lee, Seung-Jong Park. XSEDE 2016.
Good morning, everybody. I am Arghya Kusum Das from CCT, LSU. Today I am going to present our paper titled "Augmenting Amdahl's Second Law: A Theoretical Model to Build Cost-Effective Balanced HPC Infrastructure for Data-Driven Science."
The most popular and effective law was proposed by computer scientist Gene Amdahl in the 1960s, where he stated that a balanced system needs one bit of I/O per second per CPU instruction per second. This is known as Amdahl's I/O number. Regarding memory, he stated that a balanced system needs one byte of memory per CPU instruction per second. This is known as Amdahl's memory number.
The major limitation of the law is that it proposes a one-size-fits-all type of design, which does not consider the impact of application characteristics, which are changing frequently nowadays.
To address this limitation, computer scientist Jim Gray modified the original law: he kept the I/O number the same as in the original law, but specified that it should be measured on the relevant application. Regarding the memory number, he observed that it is rising from 1 to 4.
Although Jim Gray considered the application characteristics for the I/O number, he did not consider the cost component. Furthermore, the memory number is simply an observation, which does not have any theoretical background.
This slide shows the concrete problem definition. We need to modify Amdahl's I/O and memory numbers, that is, beta-io-opt and beta-mem-opt, as a function of the application balance and the cost balance. The application balance is a measurement of whether the application is I/O-intensive, CPU-intensive, or memory-intensive; basically, it is the ratio between the required I/O bandwidth and the required CPU speed, or the ratio between the required memory and the required CPU speed. On the other hand, the cost balance is the ratio between the I/O cost per Gbps (or the memory cost per GB) and the CPU cost per gigahertz.
This slide shows the model's assumptions. First, the model is additive in nature, that is, it ignores any overlap between I/O and memory operations. This is a valid assumption because when a CPU is busy doing I/O, it does not do any memory operation.
Second, it ignores any CPU microarchitecture, which means it considers the number of instructions executed per cycle as proportional to the CPU core frequency.
[Microarchitecture is ignored in Alex Szalay's paper on Amdahl-balanced blades. Ref: Szalay, Alexander S., et al. "Low-power Amdahl-balanced blades for data intensive computing." ACM SIGOPS Operating Systems Review 44.1 (2010): 71-75.]
While deriving the model, at first we take the product of the time spent at each hardware component and the cost of the corresponding hardware component. Then we simply replace the resulting product with the balance terminology, that is, beta-io and beta-mem. After that, we take a partial derivative of the price-to-performance ratio with respect to beta-io and beta-mem. Since our motivation is to minimize the price-to-performance, we solve that equation for zero.
As the outcome, we get two modified Amdahl's numbers. According to our model, Amdahl's I/O number is the square root of the ratio of the application balance between I/O and CPU to the cost balance between the disk and CPU. On the other hand, Amdahl's memory number is the square root of the ratio of the application balance between memory and CPU to the cost balance between the memory and CPU.
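In symbols, a hedged reconstruction of this result (γ for application balance, κ for cost balance; the paper's exact notation may differ):

```latex
% Modified Amdahl numbers as described above, with gamma = application
% balance and kappa = cost balance (notation assumed, not the paper's):
\beta_{io}^{opt} = \sqrt{\frac{\gamma_{io}}{\kappa_{io}}},
\qquad
\beta_{mem}^{opt} = \sqrt{\frac{\gamma_{mem}}{\kappa_{mem}}}
% Gray's law is recovered as the special case gamma = 1/kappa,
% where beta_opt = gamma (the application balance exactly
% compensates the hardware cost balance).
```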
So the actual implication of the model lies in its consideration of the cost component. As can be seen in the figure, considering the lower cost of disk, the model produces a higher value for Amdahl's I/O number compared to Gray's amendment. On the other hand, considering the higher cost of memory, it produces a lower value for Amdahl's memory number compared to Gray's law.
In our model, Gray's law is a special case where the application's resource requirement exactly compensates the corresponding hardware cost, that is, the application balance is the inverse of the cost balance.
Also, using this figure, you can easily see which system architecture is optimized for what type of applications, or, the other way around, what type of applications should perform best on what type of architecture. We will use these characteristics for cluster classification later in the presentation.
This slide shows an example in the current scenario, where we analyzed amazon.com, newegg.com, etc. to get the average prices of the different modules of the system, that is, I/O and CPU, and then directly fed those into our model for an I/O- and compute-intensive application, which means gamma-io equals one. This way, our modified Amdahl's I/O number comes out as 0.17. Similarly, considering a memory- and compute-intensive application, that is, gamma-mem equals one, we calculated Amdahl's memory number as 2.7.
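A small calculator in this spirit (the prices below are placeholders, not the figures from the paper's Table II, so the outputs are illustrative only and will not reproduce 0.17 and 2.7):

```python
# Modified Amdahl numbers from the model: beta_opt = sqrt(gamma / kappa),
# where gamma is the application balance and kappa the cost balance.
# All prices are hypothetical placeholders; the paper's Table II has the
# real amazon.com/newegg.com averages.
from math import sqrt

def beta_opt(gamma, component_cost_per_unit, cpu_cost_per_ghz):
    kappa = component_cost_per_unit / cpu_cost_per_ghz  # cost balance
    return sqrt(gamma / kappa)

cpu_cost_per_ghz = 30.0    # hypothetical $/GHz
io_cost_per_gbps = 50.0    # hypothetical $/Gbps of I/O bandwidth
mem_cost_per_gb = 8.0      # hypothetical $/GB

print("beta_io_opt  =", beta_opt(1.0, io_cost_per_gbps, cpu_cost_per_ghz))  # gamma_io = 1
print("beta_mem_opt =", beta_opt(1.0, mem_cost_per_gb, cpu_cost_per_ghz))   # gamma_mem = 1
```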
The price table for some hardware is given in the paper (Table II).
Practical implication of the application balance: gamma (io or mem) = (data read from I/O or memory) / (instructions per second).
Flow of the derivation
We used three different types of clusters for this work. The first is SuperMikeII, which has 16 Xeon processing cores, one hard disk drive producing only 0.15 GBps of bandwidth, and 32 GB of RAM per node. The second cluster is SwatIII, which has the same processor configuration as SuperMikeII but four times more I/O bandwidth and eight times more memory. The third one is CeresII, which is a Samsung MicroBrick-based cluster powered by NVMe SSDs. Each node of this cluster has only 6 Xeon cores and 64 GB of memory; however, because of the NVMe SSD, each node has 2 GBps of I/O bandwidth, which is much higher than the other two clusters.
Based on these configurations, that is, the I/O bandwidth, memory, and CPU speed, we calculated the beta-io and beta-mem of all of these clusters. As can be seen, CeresII's beta-io is almost the same as the optimum produced by our model, whereas SuperMikeII shows an extremely low value. SwatIII lies in between the two. In terms of memory, again, SuperMikeII shows a very low value and SwatIII shows a very high value.
Now, using the curve shown earlier (that is, the beta-versus-gamma plot), we can easily determine which kind of application each of these architectures is optimized for. This way, SuperMikeII can be classified as a traditional HPC cluster, which is optimized only for compute-intensive applications, with very low values of gamma-io and gamma-mem. SwatIII can be classified as a regular data center, which is optimized for both compute- and memory-intensive applications. CeresII, on the other hand, even with the lowest processing speed per node, turned out to be the best for all I/O-, compute-, and memory-intensive applications.
So, for today's Hadoop-based scientific applications, CeresII is expected to show much better performance.
To prove that, we used three different types of benchmarks. The first two are the very common Terasort and Wordcount. Terasort has a CPU-intensive map phase and an I/O-intensive reduce phase. Wordcount has both CPU- and I/O-intensive map and reduce phases, depending on the data size.
The third benchmark is a genome assembly application developed by us using Hadoop and Giraph. The first phase of the assembler is a shuffle-intensive Hadoop job producing almost 10 TB of shuffled data, and the second phase is a memory-intensive Giraph job, which processes a 3.2 TB graph.
This slide compares the different clusters. As can be seen, CeresII shows more than 65% improvement over SuperMikeII for both Terasort and Wordcount. Compared to SwatIII, CeresII shows more than a 50% benefit for both applications.
This slide compares the price-to-performance of the different clusters for the human genome assembly. CeresII shows more than an 85% benefit for both phases of the assembly compared to SuperMikeII, which is optimized for compute-intensive applications only. Compared to SwatIII, it shows a 50% benefit in the Hadoop phase and a 20% benefit in the Giraph phase.
Now we are at the last part of the presentation. In this work, we provided a theoretical background for Amdahl's I/O number and memory number and modified them based on the application characteristics and the hardware price trend. According to our observations, Amdahl's I/O number should be increased compared to Gray's law because of the low price of disks. On the other hand, Amdahl's memory number should be decreased compared to Gray's law, as the memory price is high.
The model also provides an easy-to-use alternative to FLOPS for expressing the capability of HPC clusters, one with better expressive power for I/O- and memory-bound applications.
In this work, our focus was on simplicity, so that the model can be used by system designers; however, many subtle parameters, such as CPU multithreading and I/O latency, can be added to improve its accuracy.