Optimizing High Performance Cancer Workflows on Bridges Supercomputer
1. Optimizing High Performance Big Data Cancer Workflows
Iván L. Jiménez Ruiz
University of Puerto Rico, Río Piedras Campus
Ricardo González Méndez
University of Puerto Rico, School of Medicine
Alexander Ropelewski
Pittsburgh Supercomputing Center
2. Outline
• Aims
• Bridges architecture
• Workflow and software
• Timings and performance
• Recommendations for similar workflows
3. Aims
• Implement workflows on the Bridges supercomputer.
• Measure the performance of the file systems using NGS data.
• Determine where to run NGS programs to improve overall workflow efficiency.
• Generate recommendations based on the benchmark data.
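As an illustration of the file system measurement aim, the following is a minimal sketch, not the benchmark used in this work. It times a sequential write and read of a scratch file in each candidate directory. The function names `time_write_read` and `candidate_dirs`, the default file size, and the fallback to the system temp directory are all assumptions made for the example; only the `LOCAL` and `RAMDISK` environment variable names come from the slides.

```python
import os
import tempfile
import time


def time_write_read(directory, size_mb=16, chunk_mb=1):
    """Write, then read back, a scratch file in `directory` (hypothetical helper).

    Returns the bytes read and the elapsed seconds for each phase.
    Caveat: the read may be served from the OS page cache, so this is a
    rough comparison between directories, not a rigorous benchmark.
    """
    chunk = os.urandom(chunk_mb * 1024 * 1024)
    n_chunks = size_mb // chunk_mb
    fd, path = tempfile.mkstemp(dir=directory, suffix=".bench")
    try:
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as handle:
            for _ in range(n_chunks):
                handle.write(chunk)
            handle.flush()
            os.fsync(handle.fileno())  # push the data to the device before stopping the clock
        write_s = time.perf_counter() - start

        start = time.perf_counter()
        total = 0
        with open(path, "rb") as handle:
            while True:
                block = handle.read(len(chunk))
                if not block:
                    break
                total += len(block)
        read_s = time.perf_counter() - start
    finally:
        os.remove(path)
    return {"bytes": total, "write_s": write_s, "read_s": read_s}


def candidate_dirs(env=os.environ):
    """Collect scratch directories named by $LOCAL and $RAMDISK, if they exist.

    Falling back to the system temp directory is an assumption so the
    sketch also runs outside a batch job.
    """
    names = ("LOCAL", "RAMDISK")
    found = {n: env[n] for n in names if env.get(n) and os.path.isdir(env[n])}
    return found or {"tmp": tempfile.gettempdir()}


if __name__ == "__main__":
    for label, directory in candidate_dirs().items():
        print(label, directory, time_write_read(directory))
```

Repeating each measurement several times and reporting the spread, not just the mean, would expose the kind of run-to-run variability discussed in the conclusions.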
14. Conclusions
• Bioinformatics workflows need to be reengineered regularly to perform optimally on HPC systems.
• $LOCAL and $RAMDISK performed comparably
- $RAMDISK had associated service usage charges
- Recommendation: prefer $LOCAL over $RAMDISK
• /pylon1 performed similarly to both $LOCAL and $RAMDISK
- Suitable for staging results and intermediate storage of output files
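The recommendations above can be sketched as a small staging wrapper: copy the input to node-local scratch, run the tool there, then copy the outputs to a results directory. This is a sketch under stated assumptions, not the workflow used in this work. The names `pick_scratch` and `run_staged`, the `{input}` placeholder, and the temp directory fallback are hypothetical; the preference for `$LOCAL` over `$RAMDISK` follows the recommendation on this slide.

```python
import os
import shutil
import subprocess
import tempfile


def pick_scratch(env=os.environ):
    """Prefer $LOCAL, then $RAMDISK, else the system temp directory (assumed fallback)."""
    for name in ("LOCAL", "RAMDISK"):
        path = env.get(name)
        if path and os.path.isdir(path):
            return path
    return tempfile.gettempdir()


def run_staged(input_path, results_dir, command, scratch=None):
    """Stage the input into scratch, run `command` there, copy new files out.

    `command` is a list of arguments; the hypothetical placeholder
    "{input}" is replaced with the path of the staged input file.
    Returns the paths of the files copied into `results_dir`.
    """
    scratch = scratch or pick_scratch()
    workdir = tempfile.mkdtemp(dir=scratch, prefix="stage_")
    try:
        staged = shutil.copy2(input_path, workdir)
        args = [arg.replace("{input}", staged) for arg in command]
        subprocess.run(args, cwd=workdir, check=True)

        os.makedirs(results_dir, exist_ok=True)
        copied = []
        for name in sorted(os.listdir(workdir)):
            source = os.path.join(workdir, name)
            # Copy every regular file the tool produced, skipping the staged input.
            if source != staged and os.path.isfile(source):
                copied.append(shutil.copy2(source, results_dir))
        return copied
    finally:
        # Scratch space is temporary; always clean up the working directory.
        shutil.rmtree(workdir, ignore_errors=True)
```

In this sketch, `results_dir` would point at the staging file system, with a later, separate copy to long-term storage once the workflow has finished.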
15. Conclusions
• /pylon2 had:
- the most variability
- the worst performance
For a similar workflow, we recommend using /pylon2 for long-term storage and archiving.
16. Acknowledgements
University of Puerto Rico, Río Piedras Campus
Dr. Humberto Ortiz-Zuazaga
University of Pittsburgh
Department of Biomedical Informatics
Dr. David Boone
Dr. Uma Chandran
Funding:
• The NIH Big Data to Knowledge (BD2K) Enhancing Diversity in Biomedical Data Science Grant [9]
5R25MD010399-002 to the UPRRP
• The National Institutes of Health Minority Access to Research Careers (MARC) grant T36-GM-095335 and
National Institutes of Health Biomedical Technology Resource grant P41-GM-103712 to the Pittsburgh
Supercomputing Center (PSC)
• The computing resources used were provided through the Extreme Science and Engineering Discovery
Environment (XSEDE), which is supported by the National Science Foundation grant OCI-1053575.
• The Bridges supercomputer system at the PSC was acquired through NSF Award ACI-1445606.