1. Abstract
High Performance Computing (HPC) in the cloud is viable in numerous use cases. Common to all successful
use cases for cloud-based HPC is the ability to embrace latency. Not surprisingly, then, early successes were
achieved with embarrassingly parallel HPC applications involving minimal amounts of data; in other
words, there was little or no latency to be hidden. Over time, however, the HPC-cloud
community has become increasingly adept at ‘hiding’ latency and, in the process, at supporting
increasingly sophisticated HPC use cases in public and private clouds. Real-world use cases
relevant to remote sensing will illustrate these techniques for hiding latency when
handling large volumes of data, passing messages between simultaneously executing
components of distributed-memory parallel applications, and running (processing) workflows/pipelines.
Finally, the impact of containerizing HPC for the cloud will be considered in light of the relatively recent
creation of the Cloud Native Computing Foundation.
2. High Performance Computing in the Cloud?
Ian Lumb
ilumb@univa.com
Ontario Association of Remote Sensing (O.A.R.S)
Ryerson University, Toronto - November 10, 2015
15. Latency is physically a consequence of
the limited velocity with which any
physical interaction can propagate.
https://en.wikipedia.org/wiki/Latency_(engineering)
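As a rough illustration of this physical limit, the sketch below estimates the smallest possible one-way and round-trip latency between two sites from distance alone. The propagation speed in optical fibre (taken here as about two-thirds of the speed of light in vacuum) and the 550 km example distance are assumptions chosen for illustration, not figures from the talk. Real links add switching, routing and protocol delays on top of this floor.

```python
# Lower bound on network latency from propagation delay alone.
SPEED_OF_LIGHT_M_PER_S = 299_792_458  # speed of light in vacuum
FIBRE_VELOCITY_FACTOR = 2 / 3         # assumed: typical for glass fibre


def min_one_way_latency_ms(distance_km: float) -> float:
    """Propagation delay, in milliseconds, over `distance_km` of fibre."""
    speed_m_per_s = SPEED_OF_LIGHT_M_PER_S * FIBRE_VELOCITY_FACTOR
    seconds = (distance_km * 1_000) / speed_m_per_s
    return seconds * 1_000


def min_round_trip_ms(distance_km: float) -> float:
    """Round trip is simply twice the one-way propagation delay."""
    return 2 * min_one_way_latency_ms(distance_km)


if __name__ == "__main__":
    distance_km = 550.0  # hypothetical site-to-data-centre distance
    print(f"one-way floor:    {min_one_way_latency_ms(distance_km):.2f} ms")
    print(f"round-trip floor: {min_round_trip_ms(distance_km):.2f} ms")
```

No amount of hardware lowers this floor; only moving the computation closer to the data, or overlapping the wait with useful work, helps.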
22. … if you have a network link with low bandwidth
then it's an easy matter of putting several in
parallel to make a combined link with higher
bandwidth, but if you have a network link with bad
latency then no amount of money can turn any
number of them into a link with good latency.
It's the Latency, Stupid
https://rescomp.stanford.edu/~cheshire/rants/Latency.html
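The point of the quotation can be sketched with a simple transfer-time model: a fixed latency plus payload size divided by aggregate bandwidth. The function name, the single-latency simplification, and all numbers below are assumptions for illustration. Adding parallel links divides only the second term, so small messages see almost no benefit.

```python
def transfer_time_s(size_bytes: float,
                    latency_s: float,
                    bandwidth_bytes_per_s: float,
                    links: int = 1) -> float:
    """Simplified model: one latency cost plus serialization time.

    Parallel links multiply the available bandwidth but leave the
    latency term untouched.
    """
    if links < 1:
        raise ValueError("links must be at least 1")
    return latency_s + size_bytes / (bandwidth_bytes_per_s * links)


if __name__ == "__main__":
    latency_s = 0.050      # assumed 50 ms link latency
    bandwidth = 1_000_000  # assumed 1 MB/s per link
    for label, size in (("1 kB message", 1_000), ("1 GB dataset", 1_000_000_000)):
        one = transfer_time_s(size, latency_s, bandwidth, links=1)
        ten = transfer_time_s(size, latency_s, bandwidth, links=10)
        print(f"{label}: 1 link = {one:.4f} s, 10 links = {ten:.4f} s")
```

In this model the large dataset transfers roughly ten times faster over ten links, while the small message still pays essentially the full latency.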
26. Definitions
● Public cloud
○ Off-premise IT capabilities or applications, provided by others
● Private cloud
○ On-premise enablement of cloud capabilities with existing IT
● Hybrid cloud
○ Some combination of public and private clouds
29. GPUs in the Cloud? The Top Four Reasons
1. You can realize possibilities using the cloud
a. You can scale up and scale out
2. You still realize the promise of GPU programmability
a. … via HPC in the cloud
3. Your use of the cloud is transparent
a. You’ve found ways to ‘hide’ latency
i. Constraints apply for MPI apps
4. Your go-to apps still work in the cloud
http://info.brightcomputing.com/Blog/bid/196290/The-Top-4-Reasons-You-Should-Try-Cloud-Based-GPUs-for-HPC
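One common way to ‘hide’ latency is to overlap data movement with computation, so that the next chunk of data is already in flight while the current one is being processed. The sketch below is a minimal illustration of that idea, not a method described in the talk: `fetch_chunk` and `process_chunk` are hypothetical stand-ins, and the sleep durations are assumed values that simulate remote-storage latency and compute time.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import List

FETCH_DELAY_S = 0.02    # assumed stand-in for remote-read latency
COMPUTE_DELAY_S = 0.02  # assumed stand-in for per-chunk compute time


def fetch_chunk(index: int) -> List[int]:
    """Hypothetical remote read; sleeps to simulate latency."""
    time.sleep(FETCH_DELAY_S)
    return [index] * 4


def process_chunk(chunk: List[int]) -> int:
    """Hypothetical computation on one chunk."""
    time.sleep(COMPUTE_DELAY_S)
    return sum(chunk)


def run_sequential(n_chunks: int) -> List[int]:
    """Fetch, then compute, one chunk at a time (latency fully exposed)."""
    return [process_chunk(fetch_chunk(i)) for i in range(n_chunks)]


def run_overlapped(n_chunks: int) -> List[int]:
    """Prefetch chunk i+1 in a background thread while computing chunk i."""
    results: List[int] = []
    if n_chunks == 0:
        return results
    with ThreadPoolExecutor(max_workers=1) as pool:
        pending = pool.submit(fetch_chunk, 0)
        for i in range(n_chunks):
            chunk = pending.result()  # wait for the chunk in flight
            if i + 1 < n_chunks:
                pending = pool.submit(fetch_chunk, i + 1)  # start next fetch
            results.append(process_chunk(chunk))  # compute while it loads
    return results


if __name__ == "__main__":
    for runner in (run_sequential, run_overlapped):
        start = time.perf_counter()
        runner(8)
        print(f"{runner.__name__}: {time.perf_counter() - start:.2f} s")
```

Both functions return the same results; the overlapped version should simply finish sooner whenever fetch and compute times are comparable. When compute time is much shorter than fetch time, there is little work to hide the latency behind.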
34. HPC as a Containerized Cloud Based Service
http://insidehpc.com/2015/11/ubercloud-delivers-cae-as-a-service-with-univa-grid-engine-container-edition/
41. Cloud Native Computing Foundation (CNCF)
● For current applications and services
○ Uptake of cloud computing remains an afterthought from a systems-architecture perspective
● CNCF aims to introduce a cloud-native paradigm shift that emphasizes:
○ Containerization
○ Dynamic scheduling
○ Orientation around microservices
● Making use of Kubernetes as a ‘seed technology’
○ #1 priority: Integrate the orchestration layer of the container ecosystem
● Univa is a Founding Member
○ Along with Google, IBM, Intel, Red Hat and numerous others ...
https://cncf.io/
43. MPI Apps Remain a Challenge
● … for
○ cloud use
○ containerization
● Constrain MPI apps to mitigate concerns with latency
○ Run HPC on-premise OR in a cloud, but not split across the two
○ Containers?
■ Just say no???
● Seek alternatives
○ Apache Spark ???
○ Message buses ???
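A back-of-the-envelope model suggests why tightly coupled MPI applications are kept on one side of the on-premise/cloud boundary. The sketch assumes each time step performs some computation and then blocks on a number of message exchanges, each costing one network latency with no overlap. The function name and every number below (compute time, message count, and the three latencies) are hypothetical values for illustration, not measurements.

```python
def parallel_efficiency(compute_s_per_step: float,
                        messages_per_step: int,
                        latency_s: float) -> float:
    """Fraction of each step spent computing rather than waiting.

    Assumes every message blocks for one full network latency and
    that communication is not overlapped with computation.
    """
    comm_s = messages_per_step * latency_s
    return compute_s_per_step / (compute_s_per_step + comm_s)


if __name__ == "__main__":
    compute_s = 1e-3  # assumed 1 ms of computation per step
    messages = 10     # assumed blocking exchanges per step
    scenarios = {
        "low-latency cluster interconnect": 2e-6,  # assumed 2 microseconds
        "within a single cloud": 5e-4,             # assumed 0.5 ms
        "between on-premise and cloud": 2e-2,      # assumed 20 ms
    }
    for name, latency_s in scenarios.items():
        eff = parallel_efficiency(compute_s, messages, latency_s)
        print(f"{name}: efficiency = {eff:.3f}")
```

Under these assumptions efficiency collapses once latency approaches the per-step compute time, which is consistent with running MPI jobs entirely on-premise or entirely in one cloud.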