Abstract
High Performance Computing (HPC) in the cloud is viable in numerous use cases. Common to all successful
use cases for cloud-based HPC is the ability to embrace latency. Not surprisingly, then, early successes were
achieved with embarrassingly parallel HPC applications involving minimal amounts of data - in other
words, there was little or no latency to be hidden. Over time, however, the HPC-cloud
community has become increasingly adept at ‘hiding’ latency and, in the process, at supporting
increasingly sophisticated HPC use cases in public and private clouds. Real-world use cases
relevant to remote sensing will illustrate these techniques for hiding latency when
handling large volumes of data, passing messages between simultaneously executing
components of distributed-memory parallel applications, and running processing workflows/pipelines.
Finally, the impact of containerizing HPC for the cloud will be considered in light of the relatively recent
creation of the Cloud Native Computing Foundation.
High Performance Computing in the Cloud?
Ian Lumb
ilumb@univa.com
Ontario Association of Remote Sensing (O.A.R.S)
Ryerson University, Toronto - November 10, 2015
http://www0.cloudbootcamp.com/node/660946
HPC???
Latency
Latency is physically a consequence of
the limited velocity with which any
physical interaction can propagate.
https://en.wikipedia.org/wiki/Latency_(engineering)
http://www.cartoonsidrew.com/2011/05/einsteins-speed-limit.html
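The physical lower bound in the definition above can be made concrete with a little arithmetic. The following is a minimal sketch, assuming that signals in optical fibre travel at roughly two-thirds of the speed of light in vacuum (a common rule of thumb, not a figure from this talk) and using a hypothetical 4,000 km one-way distance. The function name `min_round_trip_ms` is illustrative only.

```python
# Lower bound on round-trip time imposed by propagation speed alone.
# Assumptions (not from the talk): fibre carries signals at ~2/3 c,
# and the 4,000 km distance below is a hypothetical example.

C_VACUUM_M_PER_S = 299_792_458   # speed of light in vacuum (SI definition)
FIBRE_FRACTION = 2 / 3           # assumed velocity factor for optical fibre


def min_round_trip_ms(distance_km, velocity_fraction=FIBRE_FRACTION):
    """Return the minimum round-trip time in milliseconds over distance_km."""
    speed_m_per_s = C_VACUUM_M_PER_S * velocity_fraction
    one_way_s = (distance_km * 1_000) / speed_m_per_s
    return 2 * one_way_s * 1_000  # there and back, converted to milliseconds


if __name__ == "__main__":
    # Hypothetical one-way distance between a user and a cloud region.
    print(f"{min_round_trip_ms(4_000):.1f} ms minimum round trip")
```

Real round-trip times will be higher than this bound, since routing, queuing and protocol overheads all add delay; no amount of engineering can go below it.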
Current consumer devices have appallingly bad latency …
It's the Latency, Stupid
https://rescomp.stanford.edu/~cheshire/rants/Latency.html
Latency
is INEVITABLE!!!
Latency
… if you have a network link with low bandwidth
then it's an easy matter of putting several in
parallel to make a combined link with higher
bandwidth, but if you have a network link with bad
latency then no amount of money can turn any
number of them into a link with good latency.
It's the Latency, Stupid
https://rescomp.stanford.edu/~cheshire/rants/Latency.html
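The point of the quotation can be illustrated with a deliberately simplified cost model: transfer time is taken as latency plus payload size divided by aggregate bandwidth. This is a sketch under stated assumptions, not a measurement. It ignores protocol overhead, assumes the payload can be striped evenly across links, and the link figures used are hypothetical.

```python
# Simplified model: time = latency + size / (bandwidth * number_of_links).
# Putting links in parallel multiplies bandwidth but leaves latency untouched.
# All numeric values below are hypothetical.


def transfer_time_s(size_bytes, latency_s, bandwidth_bytes_per_s, links=1):
    """Estimate the time to move size_bytes over `links` identical parallel links."""
    aggregate_bandwidth = bandwidth_bytes_per_s * links
    return latency_s + size_bytes / aggregate_bandwidth


if __name__ == "__main__":
    latency = 0.05       # 50 ms per link (hypothetical)
    bandwidth = 1e6      # 1 MB/s per link (hypothetical)
    for links in (1, 10, 100):
        small = transfer_time_s(1_000, latency, bandwidth, links)
        large = transfer_time_s(1_000_000_000, latency, bandwidth, links)
        print(f"{links:>3} links: 1 KB in {small:.4f} s, 1 GB in {large:.2f} s")
```

In this model the large transfer speeds up almost linearly with the number of links, while the small message never completes faster than the 50 ms latency floor.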
Latency
is INEVITABLE!!!
… and the best we can do? Try to ‘HIDE’ it!
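One common way to ‘hide’ latency is to overlap communication with computation: request the next piece of data while still working on the current one. The sketch below illustrates the idea with Python's standard `concurrent.futures` module. The `fetch` and `process` functions are hypothetical stand-ins that simply sleep to simulate a remote read and a computation; they are not taken from the talk.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fetch(chunk_id, latency_s):
    """Hypothetical stand-in for a remote read; sleeps to simulate latency."""
    time.sleep(latency_s)
    return [chunk_id] * 4


def process(chunk, compute_s):
    """Hypothetical stand-in for computation on one chunk."""
    time.sleep(compute_s)
    return sum(chunk)


def run_sequential(n_chunks, latency_s, compute_s):
    """Fetch, then process, one chunk at a time: latency is fully exposed."""
    return [process(fetch(i, latency_s), compute_s) for i in range(n_chunks)]


def run_overlapped(n_chunks, latency_s, compute_s):
    """Prefetch chunk i+1 in the background while chunk i is being processed."""
    results = []
    if n_chunks == 0:
        return results
    with ThreadPoolExecutor(max_workers=1) as pool:
        pending = pool.submit(fetch, 0, latency_s)
        for i in range(n_chunks):
            chunk = pending.result()          # wait for the current chunk
            if i + 1 < n_chunks:
                pending = pool.submit(fetch, i + 1, latency_s)  # start next fetch
            results.append(process(chunk, compute_s))
    return results


if __name__ == "__main__":
    for runner in (run_sequential, run_overlapped):
        start = time.perf_counter()
        runner(5, 0.1, 0.1)
        print(f"{runner.__name__}: {time.perf_counter() - start:.2f} s")
```

Both functions return the same results; the overlapped version should take noticeably less wall-clock time because only the first fetch is fully exposed. The latency has not gone away, it is simply paid while other useful work is under way.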
Definitions
● Public cloud
○ Off-premise IT capabilities or applications, provided by others
● Private cloud
○ On-premise enablement of cloud capabilities with existing IT
● Hybrid cloud
○ Some combination of public and private clouds
http://runge.math.smu.edu/SMUHPC_workshop_Summer14/_images/flynn.png
Flynn’s Taxonomy
GPUs in the Cloud? The Top Four Reasons
1. You can realize possibilities using the cloud
a. You can scale up and scale out
2. You still realize the promise of GPU programmability
a. … via HPC in the cloud
3. Your use of the cloud is transparent
a. You’ve found ways to ‘hide’ latency
i. Constraints apply for MPI apps
4. Your go-to apps still work in the cloud
http://info.brightcomputing.com/Blog/bid/196290/The-Top-4-Reasons-You-Should-Try-Cloud-Based-GPUs-for-HPC
https://aws.amazon.com/ec2/instance-types/
http://c59951.r51.cf2.rackcdn.com/4994-1182-lumb.pdf
http://www.acceleware.com/technical-papers
HPC as a Containerized Cloud Based Service
http://insidehpc.com/2015/11/ubercloud-delivers-cae-as-a-service-with-univa-grid-engine-container-edition/
https://insights.sei.cmu.edu/assets/content/VM-Diagram.png
http://dockone.io/uploads/article/20150329/aa61c8ee04d815507d575c9d0a3c162f.png
Cloud Native Computing Foundation (CNCF)
● For current applications and services
○ Uptake of cloud computing remains an afterthought from a systems-architecture perspective
● CNCF aims to introduce a cloud-native paradigm shift that emphasizes:
○ Containerization
○ Dynamic scheduling
○ Orientation around microservices
● Making use of Kubernetes as a ‘seed technology’
○ #1 priority: Integrate the orchestration layer of the container ecosystem
● Univa is a Founding Member
○ Along with Google, IBM, Intel, Red Hat and numerous others ...
https://cncf.io/
http://c59951.r51.cf2.rackcdn.com/4994-1182-lumb.pdf
MPI Apps Remain a Challenge
● … for
○ cloud use
○ containerization
● Constrain MPI apps to mitigate concerns with latency
○ Run HPC on-premise OR in a cloud, but not between
○ Containers?
■ Just say no???
● Seek alternatives
○ Apache Spark ???
○ Message buses ???
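The advice to run on-premise OR in a cloud, but not between, can be motivated with a rough cost model for a tightly coupled, iterative MPI-style application. The sketch assumes each iteration performs some computation followed by blocking message exchanges that each cost about one network latency. The model and all numbers are hypothetical illustrations, not measurements from the talk.

```python
# Rough model of a tightly coupled iterative application:
#   iteration time = compute time + (exchanges per iteration * latency)
#   efficiency     = compute time / iteration time
# All latency and compute figures below are hypothetical.


def parallel_efficiency(compute_s, latency_s, exchanges_per_iter=1):
    """Fraction of each iteration spent computing rather than waiting."""
    comm_s = exchanges_per_iter * latency_s
    return compute_s / (compute_s + comm_s)


if __name__ == "__main__":
    compute = 1e-3  # 1 ms of computation per iteration (hypothetical)
    scenarios = {
        "within one cluster (assumed 2 microsecond latency)": 2e-6,
        "between sites (assumed 20 millisecond latency)": 20e-3,
    }
    for name, latency in scenarios.items():
        eff = parallel_efficiency(compute, latency)
        print(f"{name}: {eff:.1%} efficient")
```

Under these assumptions, the same application that is almost fully efficient within one cluster spends most of its time waiting when its ranks span two sites, which is why constraining MPI jobs to one side of the wide-area link is a reasonable mitigation.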
“The wonderful thing about standards is that there are so many of them to choose from.”
https://en.wikiquote.org/wiki/Grace_Hopper
Cloud Computing
is bereft of standards!!!
...but, FLUSH with implementations!!!
Latency
is INEVITABLE!!!
… and the best we can do? Try to ‘HIDE’ it!
High Performance Computing in the Cloud?
