
Building a GPU-enabled OpenStack Cloud for HPC - Lance Wilson, Monash University


Audience Level
Intermediate

Synopsis
M3 is the latest-generation system of the MASSIVE project, an HPC facility specializing in characterization science (imaging and visualization). Using OpenStack as the compute provisioning layer, M3 is a hybrid HPC/cloud system, custom-integrated by Monash’s R@CMon Research Cloud team. Built to support Monash University’s next-gen high-throughput instrument processing requirements, M3 is roughly half GPU-accelerated and half CPU-only.

We’ll discuss the design and technology used to build this innovative platform, as well as the approaches and challenges involved in building GPU-enabled and HPC clouds. We’ll also discuss some of the software and processing pipelines that this system supports and highlight the importance of tuning for these workloads.
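
As a rough illustration of how GPU capacity is typically exposed through OpenStack's compute layer (the exact M3 configuration is not covered here), the minimal Python sketch below publishes a PCI-passthrough-backed GPU flavor with python-novaclient. The flavor name, sizing, "k80" alias, and credentials are illustrative assumptions, and the hypervisors are assumed to already whitelist the GPUs and define the matching PCI alias in nova.conf.

# Minimal sketch, assumptions throughout: publish a GPU flavor backed by PCI
# passthrough. Assumes nova.conf on the compute nodes already whitelists the
# GPUs and defines a PCI alias named "k80"; names, sizes and credentials below
# are placeholders, not M3's actual settings.
from keystoneauth1 import loading, session
from novaclient import client

loader = loading.get_plugin_loader("password")
auth = loader.load_from_options(
    auth_url="https://keystone.example.org:5000/v3",  # placeholder endpoint
    username="admin", password="secret", project_name="admin",
    user_domain_name="Default", project_domain_name="Default",
)
nova = client.Client("2", session=session.Session(auth=auth))

# A CPU/memory shape large enough for the cryo-EM tools, plus one passthrough GPU.
flavor = nova.flavors.create(name="mon.gpu-k80", ram=65536, vcpus=12, disk=40)
flavor.set_keys({"pci_passthrough:alias": "k80:1"})  # request one GPU per instance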

Speaker Bio
Blair Bethwaite: Blair has worked in distributed computing at Monash University for 10 years, with OpenStack for half of that. He has served as team lead, architect, administrator, user, researcher, and occasional hacker, and his unique perspective as a science power-user, developer, and system architect has helped guide the evolution of the research computing engine central to Monash’s 21st Century Microscope.

Lance Wilson: Lance is a mechanical engineer who has been making tools to break things for the last 20 years. His career has moved through a number of engineering subdisciplines, from manufacturing to bioengineering. Now he supports the national characterisation research community in Melbourne, Australia, using OpenStack to create HPC systems that solve problems too large for your laptop.


Building a GPU-enabled OpenStack Cloud for HPC - Lance Wilson, Monash University

  1. Angstrom-Scale Microscopy on OpenStack. Dr Lance Wilson, Senior HPC Consultant @ MASSIVE
  2. Source: https://www.monash.edu/research/infrastructure/delivering-impact/research-outcomes/cryo-em/half-a-million-dollar-tick
  3. Figure 5. 3D structure of the Ebola spike. Beniac DR, Melito PL, deVarennes SL, Hiebert SL, Rabb MJ, et al. (2012) The Organisation of Ebola Virus Reveals a Capacity for Extensive, Modular Polyploidy. PLOS ONE 7(1): e29608. https://doi.org/10.1371/journal.pone.0029608 http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0029608
  4. openstack.org
  5. openstack.org
  6. “Seeing Molecular Interactions of Large Complexes by Cryo Electron Microscopy at Atomic Resolution” Laurie J. Wang Z.H. Zhou 2014
  7. 3D reconstruction of the electron density of the aE11 Fab’ polyC9 complex (openstack.org)
  8. What is the scope? Or: how big is the computer?
  9. Compute and Storage Requirements: ~1-4 TB raw data per sample (~2000-5000 files); pipeline analysis with internal and external tools; GPUs with large memory (> 8 GB); large system memory (> 64 GB); 200-400 CPU cores; parallel file reads and writes
  10. Why use OpenStack for this workflow?
  11. Typical workflow for processing using Relion2
  12. https://sites.google.com/site/emcloudprocessing/home/relion2#TOC-Benchmark-Tests (a sketch of launching a GPU-enabled Relion2 Class2D run appears after the slide list)
  13. Motion Correction: How to achieve maximum processing speed? # GPUs = # raw movies; storage I/O > processing I/O. The electron beam drifts during collection and the results need to be shifted to account for it (http://www.ncbi.nlm.nih.gov/pubmed/23644547). Software: motioncorr 2.1. Benchmark: 3.4 GB of raw movie data (15 movies) reduced to 218 MB of corrected micrographs (15 images); processing time 109 s on a single Nvidia K80
  14. Case study - Motion Correction: using a local desktop, ~3 hrs (limited by local storage and network access, single GPU); using a remote desktop on MASSIVE, ~45 mins (limited by GPU, 2 per desktop); using an in-house parallel scripted version, ~4.5 mins (limited by file system bandwidth, 1-2 GB/s). A 10x speedup! (The throughput arithmetic and one-GPU-per-movie pattern are sketched after the slide list.)
  15. How many GPUs do you need? Relion2 Class 2D step
  16. Chart: Relion Class 2D, effect of number of K80 GPUs; processing time (minutes) versus number of GPUs (4, 8, 12, 16)
  17. How does this hardware compare against workstations?
  18. Chart: processing time (hours) versus hardware configuration for the 3D classification step
  19. Insert picture of tweet
  20. Cloud HPC vs Commodity Cloud: what is the difference?
  21. Acknowledgments: Jon Mansour and Jafar Lie for help gathering benchmarking results, and Mike Wang (NVIDIA) for access to the NVIDIA DGX-1
  22. Questions?
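
Working from the figures quoted on slides 13 and 14 (3.4 GB of raw movies corrected in 109 s on one K80, with one GPU per raw movie as the ideal), the minimal Python sketch below estimates the aggregate storage read bandwidth needed to keep N GPUs busy and shows a generic one-worker-per-GPU scheduler. The motioncorr 2.1 invocation (binary name and -gpu flag) is an assumption for illustration, not taken from the talk.

import subprocess
from multiprocessing.pool import ThreadPool

# Slide 13: 3.4 GB of raw movies corrected in 109 s on a single K80.
BYTES_PER_GPU_PER_SEC = 3.4e9 / 109  # roughly 31 MB/s of reads per busy GPU

def required_read_bandwidth_mb(n_gpus):
    """Aggregate read bandwidth (MB/s) needed to keep n_gpus busy."""
    return n_gpus * BYTES_PER_GPU_PER_SEC / 1e6

def correct_on_gpu(args):
    """Motion-correct a list of movies on one GPU, one after another.
    The binary name and -gpu flag are illustrative assumptions."""
    gpu_id, movies = args
    for movie in movies:
        subprocess.run(["dosefgpu_driftcorr", movie, "-gpu", str(gpu_id)], check=True)

def correct_all(movies, n_gpus):
    # Slide 13's ideal is one GPU per raw movie (n_gpus == len(movies));
    # with fewer GPUs, stripe the movies across the available devices.
    work = [(gpu, movies[gpu::n_gpus]) for gpu in range(n_gpus)]
    with ThreadPool(n_gpus) as pool:  # one worker thread per GPU
        pool.map(correct_on_gpu, work)

if __name__ == "__main__":
    for n in (1, 4, 8, 16):
        print(f"{n:2d} GPUs need roughly {required_read_bandwidth_mb(n):.0f} MB/s of reads")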
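
Slides 11 to 16 centre on Relion2's GPU-accelerated 2D classification. As a rough sketch of launching such a run from a script, the example below assembles a relion_refine_mpi command line with GPU support enabled; the parameter values (class count, iterations, particle diameter, MPI/thread/GPU layout) are illustrative assumptions rather than the settings benchmarked on M3.

import subprocess

def run_class2d(particles_star, out_dir, n_mpi=5, n_threads=4, gpu_ids=""):
    """Launch a GPU-accelerated Relion2 Class2D job. Parameter values are
    illustrative assumptions, not the M3 benchmark settings."""
    cmd = [
        "mpirun", "-n", str(n_mpi), "relion_refine_mpi",
        "--i", particles_star,
        "--o", out_dir,
        "--K", "100",                  # number of 2D classes (assumption)
        "--iter", "25",
        "--particle_diameter", "200",  # in Angstroms (assumption)
        "--ctf", "--zero_mask", "--flatten_solvent",
        "--dont_combine_weights_via_disc",
        "--pool", "100",
        "--j", str(n_threads),
        "--gpu", gpu_ids,              # "" lets Relion use all visible GPUs
    ]
    subprocess.run(cmd, check=True)

if __name__ == "__main__":
    run_class2d("Particles/particles.star", "Class2D/run1/")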
