
Supercomputing by API: Connecting Modern Web Apps to HPC


Audience Level
Intermediate

Synopsis
The traditional user experience for High Performance Computing (HPC) centers on the command line and the intricacies of the underlying hardware. At the same time, scientific software is moving towards the cloud, leveraging modern web-based frameworks that allow rapid iteration and a renewed focus on portability and reproducibility. This software still needs the huge scale and specialist capabilities of HPC, but leveraging these resources is hampered by variation in implementation between facilities. Differences in software stack, scheduling systems and authentication all get in the way of developers who would rather focus on the research problem at hand. This presentation reviews efforts to overcome these barriers. We will cover container technologies, frameworks for programmatic HPC access, and RESTful APIs that can deliver this functionality as a hosted solution.

Speaker Bio
Dr. David Perry is Compute Integration Specialist at The University of Melbourne, working to increase research productivity using cloud and HPC. David chairs Australia’s first community-owned wind farm, Hepburn Wind, and is co-founder/CTO of BoomPower, delivering simpler solar and battery purchasing decisions for consumers and NGOs.

Supercomputing by API: Connecting Modern Web Apps to HPC

  1. SUPERCOMPUTING BY API: CONNECTING WEB APPS TO HPC. Dr. David Perry, Compute Integration Specialist, University of Melbourne
  2. [Diagram] Virtual Laboratory: Web Server, Database, Worker Nodes
  3. [Diagram] Virtual Laboratory (Web Server, Database, Worker Nodes) connected to a Supercomputer (Login Node, Compute Nodes)
  4. YAY SUPERCOMPUTERS!
  5. THE PROBLEM. Each HPC cluster has its own: scheduler, software/OS, hardware.
  6. THE DREAM. Write once, run anywhere. No platform dependencies. Consistent RESTful API for ... everything.
  7. SOLUTIONS! (sort of)
  8. TODAY: 1. HPC APIs 2. Containers
  9. THE IDEAL HPC API: Consistent interface across schedulers. Manages files. Works across system boundaries. Doesn't require changes to the HPC cluster (no new software, network ports, or security risks). Multiple language bindings/wrappers.
  10. DRMAA
  11. import drmaa

      # Create session
      s = drmaa.Session()
      s.initialize()

      # Create job
      jt = s.createJobTemplate()
      jt.remoteCommand = "echo 'hello'"
      jt.nativeSpecification = "--mincpus=2"
      jt.hardWallclockTimeLimit = '1:00:00'

      # Run it
      jobid = s.runJob(jt)
      print('Your job has been submitted with ID %s' % jobid)

      # Wait for it to complete
      retval = s.wait(jobid, drmaa.Session.TIMEOUT_WAIT_FOREVER)
      print('Job: {0} finished with status {1}'.format(retval.jobId, retval.hasExited))
      s.exit()
  12. [Diagram: Scheduler A, Scheduler B, Scheduler C, and the features supported by DRMAA]
  13. Good: Supported by almost all schedulers. Bad: Unfriendly, local access only, limited scheduler feature support, no longer under active development.
  14. SAGA
  15. import saga
      import os

      # Run job using SAGA over SSH
      ctx = saga.Context("ssh")
      ctx.user_id = 'perryd'
      os.environ['SAGA_PTY_SSH_TIMEOUT'] = '60'
      session = saga.Session()
      session.add_context(ctx)

      js = saga.job.Service("slurm+ssh://spartan.hpc.unimelb.edu.au/", session=session)
      jd = saga.job.Description()
      jd.executable = "echo 'hello' > hello.out"
      jd.wall_time_limit = 5  # minutes

      # Create and submit job, wait for it to finish.
      myjob = js.create_job(jd)
      myjob.run()
      print('Job Running')
      myjob.wait()
      print('Job %s finished with status %s' % (myjob.id, myjob.exit_code))
  16. # Fetch output files
      output = 'file://localhost/tmp/'
      source = 'sftp://spartan.hpc.unimelb.edu.au/home/perryd/hello.out'
      saga.filesystem.File(source, session=session).copy(output)
      print('Remote file contents:')
      print(open('/tmp/hello.out').read())
  17. Good: Supports popular schedulers, works over SSH, nothing to install on the cluster, handles file transfers. Bad: Still not a web API.
  18. AGAVE
  19. Via RESTful API: Execution & Storage Systems, Monitoring, Metadata, Permissions, History, Events
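(As an illustration of driving a job through Agave's RESTful interface, here is a rough sketch using Python requests. It is not from the talk: the tenant base URL, app ID, storage URI and job fields are illustrative assumptions; the exact job-request schema is defined by the Agave jobs service and the apps registered with it.)

      import requests

      # Assumptions: an Agave v2 tenant base URL and a previously obtained
      # OAuth2 bearer token; the app ID and input URI are placeholders.
      BASE_URL = "https://public.agaveapi.co"
      HEADERS = {"Authorization": "Bearer REPLACE_WITH_ACCESS_TOKEN"}

      job_request = {
          "name": "hello-hpc",
          "appId": "my-echo-app-1.0",
          "maxRunTime": "01:00:00",
          "archive": True,
          "inputs": {"inputFile": "agave://my-storage-system/data/input.txt"},
          "parameters": {},
      }

      # Submit the job to the jobs service
      resp = requests.post(BASE_URL + "/jobs/v2", json=job_request, headers=HEADERS)
      resp.raise_for_status()
      job = resp.json()["result"]
      print("Submitted job %s (status: %s)" % (job["id"], job["status"]))

      # Poll the same RESTful endpoint for the job's status
      status = requests.get(BASE_URL + "/jobs/v2/" + job["id"], headers=HEADERS)
      status.raise_for_status()
      print("Current status:", status.json()["result"]["status"])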
  20. Demo
  21. Agave ToGo https://togo.agaveapi.co
  22. Good: Hosted. RESTful, OpenAPI-compliant. Does everything. Bad: Hosted. RESTful, OpenAPI-compliant. Does everything.
  23. On to containers...
  24. Why?
  25. What versions of Bowtie are available? At Melbourne: At Monash: At NCI: Bowtie2/2.2.5-GCC-4.9.2 Bowtie2/2.2.5-intel-2016.u3 Bowtie2/2.2.9-GCC-4.9.2 Bowtie2/2.2.9-intel-2016.u3 bowtie/1.1.2 bowtie2/2.2.8 bowtie/1.2.0 bowtie2/2.1.0 bowtie2/2.2.5 bowtie2/2.2.9 bowtie2/2.3.1
  26. SINGULARITY: Image-based (just a big file with everything in it). Flat network/hardware access. Volume mounts similar to Docker.
  27. DEMO
  28. 1. Get or create a container.
      $ sudo singularity create -s 6000 my_container.img
      $ sudo singularity bootstrap my_container.img ubuntu.def
      $ sudo singularity shell -w my_container.img
      my_container.img> # Do stuff in a container
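(Slide 28 bootstraps the image from ubuntu.def without showing its contents. A minimal sketch of what such a definition file might contain, in the Singularity 2.x bootstrap format; the Ubuntu release and packages are illustrative assumptions, not taken from the talk.)

      BootStrap: debootstrap
      OSVersion: xenial
      MirrorURL: http://archive.ubuntu.com/ubuntu/

      %post
          # Commands run inside the image at bootstrap time
          apt-get update
          apt-get -y install python wget

      %runscript
          # Default command when the container is executed
          echo "Hello from inside the container"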
  29. $ sudo singularity create -s 6000 digits_docker.img
      $ sudo singularity --verbose import digits_docker.img docker://nvidia/digits:latest
  30. 2. Run your container.
      $ singularity exec -B /tmp:/jobs digits_docker.img bash -c "export DIGITS_JOBS_DIR=/jobs && python -m digits"
  31. As an HPC job:
      #!/bin/bash
      #SBATCH --nodes 1
      #SBATCH --cpus-per-task=12
      #SBATCH --partition gpu
      #SBATCH --gres=gpu:4
      #SBATCH --time 02:00:00

      LOGIN_PORT=$(shuf -i 2000-65000 -n 1)
      DIGITS_PORT=5000
      module load Singularity

      # Forward a random port on the login node back to DIGITS on this compute node
      ssh -N -f -R $LOGIN_PORT:localhost:$DIGITS_PORT $SLURM_SUBMIT_HOST
      echo "Forwarding to port:"
      echo $LOGIN_PORT

      singularity exec -B /tmp:/jobs -B /tmp:/scratch digits_docker.img bash -c "export DIGITS_JOBS_DIR=/jobs && python -m digits"
  32. CAVEATS: Hardware/architecture dependencies are still there. Beware the golden image.
  33. CONCLUSION: Supercomputer-enable your web app! But you can't ignore the details of each supercomputer. There are tools out there to make life a bit easier.
  34. MORE EXPLORATION. Project looking at: APIs (inc. local Agave deployment), Virtual Laboratory to HPC, Single Sign-on, Knowledge Sharing.
  35. ACKNOWLEDGEMENTS: Nectar VL managers & developers. Authors of SAGA, DRMAA and Agave. Lev Lafayette & Daniel Tosello.
