SlideShare a Scribd company logo
At HTCondor Week 2019
Presented by Igor Sfiligoi, UCSD
for the PRP team
An opportunistic HTCondor pool
inside an interactive-friendly
Kubernetes cluster
HTCondor Week, May 2019 1
Outline
• Where do I come from?
• What we did?
• How is it working?
• Looking head
HTCondor Week, May 2019 2
The Pacific Research Platform
• The PRP originally created as a
regional networking project
• Establishing end-to-end links
between 10Gbps and 100Gbps
(GDC)
HTCondor Week, May 2019 3
PRP
PRPv2 Nautilus
Transoceanic
Nodes
Guam
Asian Pacific RP
Transoceanic
Nodes
Australia
Korea
Singapore
Netherlands
10G 35TB
UvA
FIONA6
10G 35TB
KISTI
10G 35TB
U of Guam
10G 35TB
U of Queensland
The Pacific Research Platform
• The PRP originally created as a
regional networking project
• Establishing end-to-end links
between 10Gbps and 100Gbps
• Expanded nationally since
• And beyond, too
CENIC/PW Link
40G FIONA
UIUC
40G 160TB
U Hawaii
40G 160TB
NCAR-WY
40G 192TB
UWashington
40G FIONA
I2 Chicago
40G FIONA
I2 NYC
40G FIONA
I2 Kansas City
40G FIONA1
UIC
HTCondor Week, May 2019 4
The Pacific Research Platform
• Recently the PRP evolved in a major resource provider, too
• Because scientists really need more than bandwidth tests
• They need to share their data at high speed and compute on it, too
• The PRP now also provides
• Extensive compute power – About 330 GPUs and 3.5k CPU cores
• A large distributed storage area - About 2 PBytes
• Select user communities now directly use
all the resources PRP has to offer
• Still doing all the network R&D in the same setup, too
• We call it the Nautilus cluster
HTCondor Week, May 2019 5
Kubernetes as a resource manager
• Large and active development and support communityIndustry standard
• More freedom for usersContainer based
• Allows for easy mixing of service and user workloadsFlexible scheduling
HTCondor Week, May 2019 6
Designed for interactive use
• Makes for very happy users
Users expect to get what
they need when they need it
• And is typically short in duration
Congestion happens only
very rarely
HTCondor Week, May 2019 7
Opportunistic use
No congestion
Idle compute
resources
Time for
opportunistic
use
HTCondor Week, May 2019 8
Kubernetes priorities
• Low priority pods only start if no demand
from higher priority ones
Priorities natively
supported in Kubernetes
• Low priority pods killed the moment
a high priority pod needs the resources
Preemption out of the
box
• Just keep enough low-priority pods in the system
Perfect for opportunistic
use
https://kubernetes.io/docs/concepts/configuration/pod-priority-preemption/
HTCondor Week, May 2019 9
HTCondor as the OSG helper
PRP wanted to give opportunistic resources to Open Science Grid (OSG) users
• Since they can tolerate preemption
But OSG does not have native support for Kubernetes
• Supports only resources provided by batch systems
We thus instantiated an HTCondor pool
• As a fully Kubernetes/Containerized deployment
HTCondor Week, May 2019 10
HTCondor in a (set of) container(s)
• Just create an image with HTCondor binaries in it!
• Configuration injected through Kubernetes pod config
Putting HTCondor in a set
of containers is not hard
• The Collector must be discoverable – Kubernetes service
• Everything else just works from there
HTCondor deals nicely
with ephemeral IPs
• And potentially for the Negotiator, if long term accounting desired
• Everything else can live off ephemeral storage
Persistency needed for
the Schedd(s)
HTCondor Week, May 2019 11
Service vs Opportunistic
Collector and Schedd(s) deployed as high priority service pods
• Should be running at all times
• Few pods, not high CPU or GPU users, so OK
• Using Kubernetes Deployment to re-start the pods in case of HW problems and/or maintenance
• Kubernetes Service used to get a persistent routing IP to the collector pod
Startds deployed as low priority pods
• Hundreds of pods in the Kubernetes queue at all times, many in Pending state
• HTCondor Startd configured to accept jobs as soon as it starts and forever after
• If pod preempted, HTCondor gets a SIGTERM and has a few seconds to go away
Pure opportunistic
HTCondor Week, May 2019 12
Then came the users
• Well, until we had more than a single userEverything was working nicely,
until we let in real users
• So they can use any weird software they likeOSG users got used
to rely on Containers
• Cannot launch a user-provided containerBut HTCondor Startd already
running inside a container!
• How many of each kind?So I need to provide
user-specific execute pods
HTCondor Week, May 2019 13
Not without
elevated privileges
Then came the users
• Well, until we had more than a single userEverything was working nicely,
until we let in real users
• So they can use any weird software they likeOSG users got used
to rely on Containers
• Cannot launch a user-provided containerBut HTCondor Startd already
running inside a container!
• How many of each kind?So I need to provide
user-specific execute pods
HTCondor Week, May 2019 14
Dealing with many opportunistic pod types
• A different kind of pod could use that resource
• A glidein-like setup would solve that
Having idle Startd pods
not OK anymore
• They will just terminate without ever running a job
• Who should regulate the “glidein pressure”?
Keeping pods without users
not OK anymore
• Kubernetes scheduling is basically just priority-FIFO
How do I manage fair share
between different pod types?
• Ideally, HTCondor should have native Kubernetes support
How am I to know what
Container images users want?
HTCondor Week, May 2019 15
Dealing with many opportunistic pod types
• A different kind of pod could use that resource
• A glidein-like setup would solve that
Having idle Startd pods
not OK anymore
• They will just terminate without ever running a job
• Who should regulate the “glidein pressure”?
Keeping pods without users
not OK anymore
• Kubernetes scheduling is basically just priority-FIFO
How do I manage fair share
between different pod types?
• Ideally, HTCondor should have native Kubernetes support
How am I to know what
Container images users want?
I know how to
implement this.
HTCondor Week, May 2019 16
Dealing with many opportunistic pod types
• A different kind of pod could use that resource
• A glidein-like setup would solve that
Having idle Startd pods
not OK anymore
• They will just terminate without ever running a job
• Who should regulate the “glidein pressure”?
Keeping pods without users
not OK anymore
• Kubernetes scheduling is basically just priority-FIFO
How do I manage fair share
between different pod types?
• Ideally, HTCondor should have native Kubernetes support
How am I to know what
Container images users want?
I was told
this is coming.
HTCondor Week, May 2019 17
Dealing with many opportunistic pod types
• A different kind of pod could use that resource
• A glidein-like setup would solve that
Having idle Startd pods
not OK anymore
• They will just terminate without ever running a job
• Who should regulate the “glidein pressure”?
Keeping pods without users
not OK anymore
• Kubernetes scheduling is basically just priority-FIFO
How do I manage fair share
between different pod types?
• Ideally, HTCondor should have native Kubernetes support
How am I to know what
Container images users want?
In OSG-land, glideinWMS
solves this for me.
HTCondor Week, May 2019 18
Dealing with many opportunistic pod types
• A different kind of pod could use that resource
• A glidein-like setup would solve that
Having idle Startd pods
not OK anymore
• They will just terminate without ever running a job
• Who should regulate the “glidein pressure”?
Keeping pods without users
not OK anymore
• Kubernetes scheduling is basically just priority-FIFO
How do I manage fair share
between different pod types?
• Ideally, HTCondor should have native Kubernetes support
How am I to know what
Container images users want?
No concrete plans on how
to address these yet.
HTCondor Week, May 2019 19
Dealing with many opportunistic pod types
For now, I just periodically adjust the balance
• A completely manual process
Currently supporting only a few, well behaved users
• Maybe not optimal, but good enough
But looking forward to a more automated future
HTCondor Week, May 2019 20
Are side-containers an option?
Ideally, I do want to use user-provided, per-job Containers
• Running HTCondor and user jobs in separate pods
not an option due to opportunistic nature
But Kubernetes pods are made of several Containers
• Could I run HTCondor in a dedicated Container?
• Then start the user pod in a side-container?
Pretty sure currently not supported
• But, at least in principle, fits the architecture
• Would also need HTCondor native support
Pod
HTCondor container
User job container
HTCondor Week, May 2019 21
Will nested
Containers be
a reality soon?
It has been pointed out to me
that latest CentOS supports
unprivileged Singularity
Have not tied it out
• Probably I should
Cannot currently assume all of my
nodes have a recent-enough kernel
• But eventually will get there
HTCondor Week, May 2019 22
Looking ahead
Looking forward to a more automated future
Will do what I have to myself
Would be happier if I could use off-the-shelf solutions
HTCondor Week, May 2019 23
A final picture
• Opportunistic GPU usage over the past few months
HTCondor Week, May 2019 24
Summary
• We created an opportunistic HTCondor
pool in the PRP Kubernetes cluster
• OSG users can now use
any otherwise-unused cycles
• The lack of nested containerization
forces us to have
multiple execute pod types
• Some micromanagement
currently needed, hoping for
more automation in the future
HTCondor Week, May 2019 25
Acknowledgents
This work was partially funded by
US National Science Foundation (NSF) awards
CNS-1456638, CNS-1730158,
ACI-1540112, ACI-1541349,
OAC-1826967, OAC 1450871,
OAC-1659169 and OAC-1841530.
HTCondor Week, May 2019 26

More Related Content

Similar to An opportunistic HTCondor pool inside an interactive-friendly Kubernetes cluster

The Salmon Algorithm Spawning with Kubernetes
The Salmon Algorithm Spawning with KubernetesThe Salmon Algorithm Spawning with Kubernetes
The Salmon Algorithm Spawning with Kubernetes
CloudOps2005
 
Modern big data and machine learning in the era of cloud, docker and kubernetes
Modern big data and machine learning in the era of cloud, docker and kubernetesModern big data and machine learning in the era of cloud, docker and kubernetes
Modern big data and machine learning in the era of cloud, docker and kubernetes
Slim Baltagi
 
Introducing Container Technology to TSUBAME3.0 Supercomputer
Introducing Container Technology to TSUBAME3.0 SupercomputerIntroducing Container Technology to TSUBAME3.0 Supercomputer
Introducing Container Technology to TSUBAME3.0 Supercomputer
Akihiro Nomura
 
Shifting Left with Cloud Native CI/CD
Shifting Left with Cloud Native CI/CDShifting Left with Cloud Native CI/CD
Shifting Left with Cloud Native CI/CD
C4Media
 
Maximizing the Value of Containers and Microservices with the Right Platform
Maximizing the Value of Containers and Microservices with the Right PlatformMaximizing the Value of Containers and Microservices with the Right Platform
Maximizing the Value of Containers and Microservices with the Right Platform
Miska Kaipiainen
 
Kube 101
Kube 101Kube 101
Kube 101
Syed Imam
 
Android Developer Skills, Techniques, and Patterns
Android Developer Skills, Techniques, and PatternsAndroid Developer Skills, Techniques, and Patterns
Android Developer Skills, Techniques, and Patterns
gdgut
 
Welcome to CloudLand - DevOps Seattle Feb 2020
Welcome to CloudLand - DevOps Seattle Feb 2020Welcome to CloudLand - DevOps Seattle Feb 2020
Welcome to CloudLand - DevOps Seattle Feb 2020
Kaslin Fields
 
JUC Europe 2015: Continuous Integration and Distribution in the Cloud with DE...
JUC Europe 2015: Continuous Integration and Distribution in the Cloud with DE...JUC Europe 2015: Continuous Integration and Distribution in the Cloud with DE...
JUC Europe 2015: Continuous Integration and Distribution in the Cloud with DE...
CloudBees
 
Untangling spring week2
Untangling spring week2Untangling spring week2
Untangling spring week2
Derek Jacoby
 
Cloud Native CI/CD with Jenkins X and Knative Pipelines
Cloud Native CI/CD with Jenkins X and Knative PipelinesCloud Native CI/CD with Jenkins X and Knative Pipelines
Cloud Native CI/CD with Jenkins X and Knative Pipelines
C4Media
 
February 13th, 2014 - Unicon IAM Webinar Update
February 13th, 2014 - Unicon IAM Webinar UpdateFebruary 13th, 2014 - Unicon IAM Webinar Update
February 13th, 2014 - Unicon IAM Webinar Update
Misagh Moayyed
 
The Five Stages of Enterprise Jupyter Deployment
The Five Stages of Enterprise Jupyter DeploymentThe Five Stages of Enterprise Jupyter Deployment
The Five Stages of Enterprise Jupyter Deployment
Frederick Reiss
 
2014 Q4 IAM Open Source Support Program Update
2014 Q4 IAM Open Source Support Program Update2014 Q4 IAM Open Source Support Program Update
2014 Q4 IAM Open Source Support Program Update
John Gasper
 
Technical standards & the RDTF Vision: some considerations
Technical standards & the RDTF Vision: some considerationsTechnical standards & the RDTF Vision: some considerations
Technical standards & the RDTF Vision: some considerations
Paul Walk
 
A Tale of Three Tools: Kubernetes, Jsonnet, and Bazel
A Tale of Three Tools: Kubernetes, Jsonnet, and BazelA Tale of Three Tools: Kubernetes, Jsonnet, and Bazel
A Tale of Three Tools: Kubernetes, Jsonnet, and Bazel
Databricks
 
JHipster Code 2020 keynote
JHipster Code 2020 keynoteJHipster Code 2020 keynote
JHipster Code 2020 keynote
Julien Dubois
 
Cortex: Horizontally Scalable, Highly Available Prometheus
Cortex: Horizontally Scalable, Highly Available PrometheusCortex: Horizontally Scalable, Highly Available Prometheus
Cortex: Horizontally Scalable, Highly Available Prometheus
Grafana Labs
 
Persistent Data Storage for Docker Containers by Andre Moruga
Persistent Data Storage for Docker Containers by Andre MorugaPersistent Data Storage for Docker Containers by Andre Moruga
Persistent Data Storage for Docker Containers by Andre Moruga
Docker, Inc.
 
blockchain and iot: Opportunities and Challanges
blockchain and iot: Opportunities and Challangesblockchain and iot: Opportunities and Challanges
blockchain and iot: Opportunities and Challanges
Chetan Kumar S
 

Similar to An opportunistic HTCondor pool inside an interactive-friendly Kubernetes cluster (20)

The Salmon Algorithm Spawning with Kubernetes
The Salmon Algorithm Spawning with KubernetesThe Salmon Algorithm Spawning with Kubernetes
The Salmon Algorithm Spawning with Kubernetes
 
Modern big data and machine learning in the era of cloud, docker and kubernetes
Modern big data and machine learning in the era of cloud, docker and kubernetesModern big data and machine learning in the era of cloud, docker and kubernetes
Modern big data and machine learning in the era of cloud, docker and kubernetes
 
Introducing Container Technology to TSUBAME3.0 Supercomputer
Introducing Container Technology to TSUBAME3.0 SupercomputerIntroducing Container Technology to TSUBAME3.0 Supercomputer
Introducing Container Technology to TSUBAME3.0 Supercomputer
 
Shifting Left with Cloud Native CI/CD
Shifting Left with Cloud Native CI/CDShifting Left with Cloud Native CI/CD
Shifting Left with Cloud Native CI/CD
 
Maximizing the Value of Containers and Microservices with the Right Platform
Maximizing the Value of Containers and Microservices with the Right PlatformMaximizing the Value of Containers and Microservices with the Right Platform
Maximizing the Value of Containers and Microservices with the Right Platform
 
Kube 101
Kube 101Kube 101
Kube 101
 
Android Developer Skills, Techniques, and Patterns
Android Developer Skills, Techniques, and PatternsAndroid Developer Skills, Techniques, and Patterns
Android Developer Skills, Techniques, and Patterns
 
Welcome to CloudLand - DevOps Seattle Feb 2020
Welcome to CloudLand - DevOps Seattle Feb 2020Welcome to CloudLand - DevOps Seattle Feb 2020
Welcome to CloudLand - DevOps Seattle Feb 2020
 
JUC Europe 2015: Continuous Integration and Distribution in the Cloud with DE...
JUC Europe 2015: Continuous Integration and Distribution in the Cloud with DE...JUC Europe 2015: Continuous Integration and Distribution in the Cloud with DE...
JUC Europe 2015: Continuous Integration and Distribution in the Cloud with DE...
 
Untangling spring week2
Untangling spring week2Untangling spring week2
Untangling spring week2
 
Cloud Native CI/CD with Jenkins X and Knative Pipelines
Cloud Native CI/CD with Jenkins X and Knative PipelinesCloud Native CI/CD with Jenkins X and Knative Pipelines
Cloud Native CI/CD with Jenkins X and Knative Pipelines
 
February 13th, 2014 - Unicon IAM Webinar Update
February 13th, 2014 - Unicon IAM Webinar UpdateFebruary 13th, 2014 - Unicon IAM Webinar Update
February 13th, 2014 - Unicon IAM Webinar Update
 
The Five Stages of Enterprise Jupyter Deployment
The Five Stages of Enterprise Jupyter DeploymentThe Five Stages of Enterprise Jupyter Deployment
The Five Stages of Enterprise Jupyter Deployment
 
2014 Q4 IAM Open Source Support Program Update
2014 Q4 IAM Open Source Support Program Update2014 Q4 IAM Open Source Support Program Update
2014 Q4 IAM Open Source Support Program Update
 
Technical standards & the RDTF Vision: some considerations
Technical standards & the RDTF Vision: some considerationsTechnical standards & the RDTF Vision: some considerations
Technical standards & the RDTF Vision: some considerations
 
A Tale of Three Tools: Kubernetes, Jsonnet, and Bazel
A Tale of Three Tools: Kubernetes, Jsonnet, and BazelA Tale of Three Tools: Kubernetes, Jsonnet, and Bazel
A Tale of Three Tools: Kubernetes, Jsonnet, and Bazel
 
JHipster Code 2020 keynote
JHipster Code 2020 keynoteJHipster Code 2020 keynote
JHipster Code 2020 keynote
 
Cortex: Horizontally Scalable, Highly Available Prometheus
Cortex: Horizontally Scalable, Highly Available PrometheusCortex: Horizontally Scalable, Highly Available Prometheus
Cortex: Horizontally Scalable, Highly Available Prometheus
 
Persistent Data Storage for Docker Containers by Andre Moruga
Persistent Data Storage for Docker Containers by Andre MorugaPersistent Data Storage for Docker Containers by Andre Moruga
Persistent Data Storage for Docker Containers by Andre Moruga
 
blockchain and iot: Opportunities and Challanges
blockchain and iot: Opportunities and Challangesblockchain and iot: Opportunities and Challanges
blockchain and iot: Opportunities and Challanges
 

More from Igor Sfiligoi

Preparing Fusion codes for Perlmutter - CGYRO
Preparing Fusion codes for Perlmutter - CGYROPreparing Fusion codes for Perlmutter - CGYRO
Preparing Fusion codes for Perlmutter - CGYRO
Igor Sfiligoi
 
O&C Meeting - Evaluation of ARM CPUs for IceCube available through Google Kub...
O&C Meeting - Evaluation of ARM CPUs for IceCube available through Google Kub...O&C Meeting - Evaluation of ARM CPUs for IceCube available through Google Kub...
O&C Meeting - Evaluation of ARM CPUs for IceCube available through Google Kub...
Igor Sfiligoi
 
Comparing single-node and multi-node performance of an important fusion HPC c...
Comparing single-node and multi-node performance of an important fusion HPC c...Comparing single-node and multi-node performance of an important fusion HPC c...
Comparing single-node and multi-node performance of an important fusion HPC c...
Igor Sfiligoi
 
The anachronism of whole-GPU accounting
The anachronism of whole-GPU accountingThe anachronism of whole-GPU accounting
The anachronism of whole-GPU accounting
Igor Sfiligoi
 
Speeding up bowtie2 by improving cache-hit rate
Speeding up bowtie2 by improving cache-hit rateSpeeding up bowtie2 by improving cache-hit rate
Speeding up bowtie2 by improving cache-hit rate
Igor Sfiligoi
 
Performance Optimization of CGYRO for Multiscale Turbulence Simulations
Performance Optimization of CGYRO for Multiscale Turbulence SimulationsPerformance Optimization of CGYRO for Multiscale Turbulence Simulations
Performance Optimization of CGYRO for Multiscale Turbulence Simulations
Igor Sfiligoi
 
Comparing GPU effectiveness for Unifrac distance compute
Comparing GPU effectiveness for Unifrac distance computeComparing GPU effectiveness for Unifrac distance compute
Comparing GPU effectiveness for Unifrac distance compute
Igor Sfiligoi
 
Managing Cloud networking costs for data-intensive applications by provisioni...
Managing Cloud networking costs for data-intensive applications by provisioni...Managing Cloud networking costs for data-intensive applications by provisioni...
Managing Cloud networking costs for data-intensive applications by provisioni...
Igor Sfiligoi
 
Accelerating Key Bioinformatics Tasks 100-fold by Improving Memory Access
Accelerating Key Bioinformatics Tasks 100-fold by Improving Memory AccessAccelerating Key Bioinformatics Tasks 100-fold by Improving Memory Access
Accelerating Key Bioinformatics Tasks 100-fold by Improving Memory Access
Igor Sfiligoi
 
Using A100 MIG to Scale Astronomy Scientific Output
Using A100 MIG to Scale Astronomy Scientific OutputUsing A100 MIG to Scale Astronomy Scientific Output
Using A100 MIG to Scale Astronomy Scientific Output
Igor Sfiligoi
 
Using commercial Clouds to process IceCube jobs
Using commercial Clouds to process IceCube jobsUsing commercial Clouds to process IceCube jobs
Using commercial Clouds to process IceCube jobs
Igor Sfiligoi
 
Modest scale HPC on Azure using CGYRO
Modest scale HPC on Azure using CGYROModest scale HPC on Azure using CGYRO
Modest scale HPC on Azure using CGYRO
Igor Sfiligoi
 
Data-intensive IceCube Cloud Burst
Data-intensive IceCube Cloud BurstData-intensive IceCube Cloud Burst
Data-intensive IceCube Cloud Burst
Igor Sfiligoi
 
Scheduling a Kubernetes Federation with Admiralty
Scheduling a Kubernetes Federation with AdmiraltyScheduling a Kubernetes Federation with Admiralty
Scheduling a Kubernetes Federation with Admiralty
Igor Sfiligoi
 
Accelerating microbiome research with OpenACC
Accelerating microbiome research with OpenACCAccelerating microbiome research with OpenACC
Accelerating microbiome research with OpenACC
Igor Sfiligoi
 
Demonstrating a Pre-Exascale, Cost-Effective Multi-Cloud Environment for Scie...
Demonstrating a Pre-Exascale, Cost-Effective Multi-Cloud Environment for Scie...Demonstrating a Pre-Exascale, Cost-Effective Multi-Cloud Environment for Scie...
Demonstrating a Pre-Exascale, Cost-Effective Multi-Cloud Environment for Scie...
Igor Sfiligoi
 
Porting and optimizing UniFrac for GPUs
Porting and optimizing UniFrac for GPUsPorting and optimizing UniFrac for GPUs
Porting and optimizing UniFrac for GPUs
Igor Sfiligoi
 
Demonstrating 100 Gbps in and out of the public Clouds
Demonstrating 100 Gbps in and out of the public CloudsDemonstrating 100 Gbps in and out of the public Clouds
Demonstrating 100 Gbps in and out of the public Clouds
Igor Sfiligoi
 
TransAtlantic Networking using Cloud links
TransAtlantic Networking using Cloud linksTransAtlantic Networking using Cloud links
TransAtlantic Networking using Cloud links
Igor Sfiligoi
 
Bursting into the public Cloud - Sharing my experience doing it at large scal...
Bursting into the public Cloud - Sharing my experience doing it at large scal...Bursting into the public Cloud - Sharing my experience doing it at large scal...
Bursting into the public Cloud - Sharing my experience doing it at large scal...
Igor Sfiligoi
 

More from Igor Sfiligoi (20)

Preparing Fusion codes for Perlmutter - CGYRO
Preparing Fusion codes for Perlmutter - CGYROPreparing Fusion codes for Perlmutter - CGYRO
Preparing Fusion codes for Perlmutter - CGYRO
 
O&C Meeting - Evaluation of ARM CPUs for IceCube available through Google Kub...
O&C Meeting - Evaluation of ARM CPUs for IceCube available through Google Kub...O&C Meeting - Evaluation of ARM CPUs for IceCube available through Google Kub...
O&C Meeting - Evaluation of ARM CPUs for IceCube available through Google Kub...
 
Comparing single-node and multi-node performance of an important fusion HPC c...
Comparing single-node and multi-node performance of an important fusion HPC c...Comparing single-node and multi-node performance of an important fusion HPC c...
Comparing single-node and multi-node performance of an important fusion HPC c...
 
The anachronism of whole-GPU accounting
The anachronism of whole-GPU accountingThe anachronism of whole-GPU accounting
The anachronism of whole-GPU accounting
 
Speeding up bowtie2 by improving cache-hit rate
Speeding up bowtie2 by improving cache-hit rateSpeeding up bowtie2 by improving cache-hit rate
Speeding up bowtie2 by improving cache-hit rate
 
Performance Optimization of CGYRO for Multiscale Turbulence Simulations
Performance Optimization of CGYRO for Multiscale Turbulence SimulationsPerformance Optimization of CGYRO for Multiscale Turbulence Simulations
Performance Optimization of CGYRO for Multiscale Turbulence Simulations
 
Comparing GPU effectiveness for Unifrac distance compute
Comparing GPU effectiveness for Unifrac distance computeComparing GPU effectiveness for Unifrac distance compute
Comparing GPU effectiveness for Unifrac distance compute
 
Managing Cloud networking costs for data-intensive applications by provisioni...
Managing Cloud networking costs for data-intensive applications by provisioni...Managing Cloud networking costs for data-intensive applications by provisioni...
Managing Cloud networking costs for data-intensive applications by provisioni...
 
Accelerating Key Bioinformatics Tasks 100-fold by Improving Memory Access
Accelerating Key Bioinformatics Tasks 100-fold by Improving Memory AccessAccelerating Key Bioinformatics Tasks 100-fold by Improving Memory Access
Accelerating Key Bioinformatics Tasks 100-fold by Improving Memory Access
 
Using A100 MIG to Scale Astronomy Scientific Output
Using A100 MIG to Scale Astronomy Scientific OutputUsing A100 MIG to Scale Astronomy Scientific Output
Using A100 MIG to Scale Astronomy Scientific Output
 
Using commercial Clouds to process IceCube jobs
Using commercial Clouds to process IceCube jobsUsing commercial Clouds to process IceCube jobs
Using commercial Clouds to process IceCube jobs
 
Modest scale HPC on Azure using CGYRO
Modest scale HPC on Azure using CGYROModest scale HPC on Azure using CGYRO
Modest scale HPC on Azure using CGYRO
 
Data-intensive IceCube Cloud Burst
Data-intensive IceCube Cloud BurstData-intensive IceCube Cloud Burst
Data-intensive IceCube Cloud Burst
 
Scheduling a Kubernetes Federation with Admiralty
Scheduling a Kubernetes Federation with AdmiraltyScheduling a Kubernetes Federation with Admiralty
Scheduling a Kubernetes Federation with Admiralty
 
Accelerating microbiome research with OpenACC
Accelerating microbiome research with OpenACCAccelerating microbiome research with OpenACC
Accelerating microbiome research with OpenACC
 
Demonstrating a Pre-Exascale, Cost-Effective Multi-Cloud Environment for Scie...
Demonstrating a Pre-Exascale, Cost-Effective Multi-Cloud Environment for Scie...Demonstrating a Pre-Exascale, Cost-Effective Multi-Cloud Environment for Scie...
Demonstrating a Pre-Exascale, Cost-Effective Multi-Cloud Environment for Scie...
 
Porting and optimizing UniFrac for GPUs
Porting and optimizing UniFrac for GPUsPorting and optimizing UniFrac for GPUs
Porting and optimizing UniFrac for GPUs
 
Demonstrating 100 Gbps in and out of the public Clouds
Demonstrating 100 Gbps in and out of the public CloudsDemonstrating 100 Gbps in and out of the public Clouds
Demonstrating 100 Gbps in and out of the public Clouds
 
TransAtlantic Networking using Cloud links
TransAtlantic Networking using Cloud linksTransAtlantic Networking using Cloud links
TransAtlantic Networking using Cloud links
 
Bursting into the public Cloud - Sharing my experience doing it at large scal...
Bursting into the public Cloud - Sharing my experience doing it at large scal...Bursting into the public Cloud - Sharing my experience doing it at large scal...
Bursting into the public Cloud - Sharing my experience doing it at large scal...
 

Recently uploaded

Secstrike : Reverse Engineering & Pwnable tools for CTF.pptx
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptxSecstrike : Reverse Engineering & Pwnable tools for CTF.pptx
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptx
nkrafacyberclub
 
Free Complete Python - A step towards Data Science
Free Complete Python - A step towards Data ScienceFree Complete Python - A step towards Data Science
Free Complete Python - A step towards Data Science
RinaMondal9
 
Introduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - CybersecurityIntroduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - Cybersecurity
mikeeftimakis1
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
Kari Kakkonen
 
Elizabeth Buie - Older adults: Are we really designing for our future selves?
Elizabeth Buie - Older adults: Are we really designing for our future selves?Elizabeth Buie - Older adults: Are we really designing for our future selves?
Elizabeth Buie - Older adults: Are we really designing for our future selves?
Nexer Digital
 
Assure Contact Center Experiences for Your Customers With ThousandEyes
Assure Contact Center Experiences for Your Customers With ThousandEyesAssure Contact Center Experiences for Your Customers With ThousandEyes
Assure Contact Center Experiences for Your Customers With ThousandEyes
ThousandEyes
 
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdf
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdfSAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdf
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdf
Peter Spielvogel
 
Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™
Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™
Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™
UiPathCommunity
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
Sri Ambati
 
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Thierry Lestable
 
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Albert Hoitingh
 
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance
 
Generative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to ProductionGenerative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to Production
Aggregage
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance
 
By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024
Pierluigi Pugliese
 
UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3
DianaGray10
 
PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)
Ralf Eggert
 
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
UiPathCommunity
 
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdfFIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
DianaGray10
 

Recently uploaded (20)

Secstrike : Reverse Engineering & Pwnable tools for CTF.pptx
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptxSecstrike : Reverse Engineering & Pwnable tools for CTF.pptx
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptx
 
Free Complete Python - A step towards Data Science
Free Complete Python - A step towards Data ScienceFree Complete Python - A step towards Data Science
Free Complete Python - A step towards Data Science
 
Introduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - CybersecurityIntroduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - Cybersecurity
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
 
Elizabeth Buie - Older adults: Are we really designing for our future selves?
Elizabeth Buie - Older adults: Are we really designing for our future selves?Elizabeth Buie - Older adults: Are we really designing for our future selves?
Elizabeth Buie - Older adults: Are we really designing for our future selves?
 
Assure Contact Center Experiences for Your Customers With ThousandEyes
Assure Contact Center Experiences for Your Customers With ThousandEyesAssure Contact Center Experiences for Your Customers With ThousandEyes
Assure Contact Center Experiences for Your Customers With ThousandEyes
 
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdf
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdfSAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdf
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdf
 
Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™
Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™
Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
 
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
 
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
 
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
 
Generative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to ProductionGenerative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to Production
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
 
By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024
 
UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3
 
PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)
 
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
 
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdfFIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdf
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
 

An opportunistic HTCondor pool inside an interactive-friendly Kubernetes cluster

  • 1. At HTCondor Week 2019 Presented by Igor Sfiligoi, UCSD for the PRP team An opportunistic HTCondor pool inside an interactive-friendly Kubernetes cluster HTCondor Week, May 2019 1
  • 2. Outline • Where do I come from? • What we did? • How is it working? • Looking head HTCondor Week, May 2019 2
  • 3. The Pacific Research Platform • The PRP originally created as a regional networking project • Establishing end-to-end links between 10Gbps and 100Gbps (GDC) HTCondor Week, May 2019 3
  • 4. PRP PRPv2 Nautilus Transoceanic Nodes Guam Asian Pacific RP Transoceanic Nodes Australia Korea Singapore Netherlands 10G 35TB UvA FIONA6 10G 35TB KISTI 10G 35TB U of Guam 10G 35TB U of Queensland The Pacific Research Platform • The PRP originally created as a regional networking project • Establishing end-to-end links between 10Gbps and 100Gbps • Expanded nationally since • And beyond, too CENIC/PW Link 40G FIONA UIUC 40G 160TB U Hawaii 40G 160TB NCAR-WY 40G 192TB UWashington 40G FIONA I2 Chicago 40G FIONA I2 NYC 40G FIONA I2 Kansas City 40G FIONA1 UIC HTCondor Week, May 2019 4
  • 5. The Pacific Research Platform • Recently the PRP evolved in a major resource provider, too • Because scientists really need more than bandwidth tests • They need to share their data at high speed and compute on it, too • The PRP now also provides • Extensive compute power – About 330 GPUs and 3.5k CPU cores • A large distributed storage area - About 2 PBytes • Select user communities now directly use all the resources PRP has to offer • Still doing all the network R&D in the same setup, too • We call it the Nautilus cluster HTCondor Week, May 2019 5
  • 6. Kubernetes as a resource manager • Large and active development and support communityIndustry standard • More freedom for usersContainer based • Allows for easy mixing of service and user workloadsFlexible scheduling HTCondor Week, May 2019 6
  • 7. Designed for interactive use • Makes for very happy users Users expect to get what they need when they need it • And is typically short in duration Congestion happens only very rarely HTCondor Week, May 2019 7
  • 8. Opportunistic use No congestion Idle compute resources Time for opportunistic use HTCondor Week, May 2019 8
  • 9. Kubernetes priorities • Low priority pods only start if no demand from higher priority ones Priorities natively supported in Kubernetes • Low priority pods killed the moment a high priority pod needs the resources Preemption out of the box • Just keep enough low-priority pods in the system Perfect for opportunistic use https://kubernetes.io/docs/concepts/configuration/pod-priority-preemption/ HTCondor Week, May 2019 9
  • 10. HTCondor as the OSG helper PRP wanted to give opportunistic resources to Open Science Grid (OSG) users • Since they can tolerate preemption But OSG does not have native support for Kubernetes • Supports only resources provided by batch systems We thus instantiated an HTCondor pool • As a fully Kubernetes/Containerized deployment HTCondor Week, May 2019 10
  • 11. HTCondor in a (set of) container(s) • Just create an image with HTCondor binaries in it! • Configuration injected through Kubernetes pod config Putting HTCondor in a set of containers is not hard • The Collector must be discoverable – Kubernetes service • Everything else just works from there HTCondor deals nicely with ephemeral IPs • And potentially for the Negotiator, if long term accounting desired • Everything else can live off ephemeral storage Persistency needed for the Schedd(s) HTCondor Week, May 2019 11
  • 12. Service vs Opportunistic Collector and Schedd(s) deployed as high priority service pods • Should be running at all times • Few pods, not high CPU or GPU users, so OK • Using Kubernetes Deployment to re-start the pods in case of HW problems and/or maintenance • Kubernetes Service used to get a persistent routing IP to the collector pod Startds deployed as low priority pods • Hundreds of pods in the Kubernetes queue at all times, many in Pending state • HTCondor Startd configured to accept jobs as soon as it starts and forever after • If pod preempted, HTCondor gets a SIGTERM and has a few seconds to go away Pure opportunistic HTCondor Week, May 2019 12
  • 13. Then came the users • Well, until we had more than a single userEverything was working nicely, until we let in real users • So they can use any weird software they likeOSG users got used to rely on Containers • Cannot launch a user-provided containerBut HTCondor Startd already running inside a container! • How many of each kind?So I need to provide user-specific execute pods HTCondor Week, May 2019 13 Not without elevated privileges
  • 14. Then came the users • Well, until we had more than a single userEverything was working nicely, until we let in real users • So they can use any weird software they likeOSG users got used to rely on Containers • Cannot launch a user-provided containerBut HTCondor Startd already running inside a container! • How many of each kind?So I need to provide user-specific execute pods HTCondor Week, May 2019 14
  • 15. Dealing with many opportunistic pod types • A different kind of pod could use that resource • A glidein-like setup would solve that Having idle Startd pods not OK anymore • They will just terminate without ever running a job • Who should regulate the “glidein pressure”? Keeping pods without users not OK anymore • Kubernetes scheduling is basically just priority-FIFO How do I manage fair share between different pod types? • Ideally, HTCondor should have native Kubernetes support How am I to know what Container images users want? HTCondor Week, May 2019 15
  • 16. Dealing with many opportunistic pod types • A different kind of pod could use that resource • A glidein-like setup would solve that Having idle Startd pods not OK anymore • They will just terminate without ever running a job • Who should regulate the “glidein pressure”? Keeping pods without users not OK anymore • Kubernetes scheduling is basically just priority-FIFO How do I manage fair share between different pod types? • Ideally, HTCondor should have native Kubernetes support How am I to know what Container images users want? I know how to implement this. HTCondor Week, May 2019 16
  • 17. Dealing with many opportunistic pod types • A different kind of pod could use that resource • A glidein-like setup would solve that Having idle Startd pods not OK anymore • They will just terminate without ever running a job • Who should regulate the “glidein pressure”? Keeping pods without users not OK anymore • Kubernetes scheduling is basically just priority-FIFO How do I manage fair share between different pod types? • Ideally, HTCondor should have native Kubernetes support How am I to know what Container images users want? I was told this is coming. HTCondor Week, May 2019 17
  • 18. Dealing with many opportunistic pod types • A different kind of pod could use that resource • A glidein-like setup would solve that Having idle Startd pods not OK anymore • They will just terminate without ever running a job • Who should regulate the “glidein pressure”? Keeping pods without users not OK anymore • Kubernetes scheduling is basically just priority-FIFO How do I manage fair share between different pod types? • Ideally, HTCondor should have native Kubernetes support How am I to know what Container images users want? In OSG-land, glideinWMS solves this for me. HTCondor Week, May 2019 18
  • 19. Dealing with many opportunistic pod types • A different kind of pod could use that resource • A glidein-like setup would solve that Having idle Startd pods not OK anymore • They will just terminate without ever running a job • Who should regulate the “glidein pressure”? Keeping pods without users not OK anymore • Kubernetes scheduling is basically just priority-FIFO How do I manage fair share between different pod types? • Ideally, HTCondor should have native Kubernetes support How am I to know what Container images users want? No concrete plans on how to address these yet. HTCondor Week, May 2019 19
  • 20. Dealing with many opportunistic pod types For now, I just periodically adjust the balance • A completely manual process Currently supporting only a few, well behaved users • Maybe not optimal, but good enough But looking forward to a more automated future HTCondor Week, May 2019 20
  • 21. Are side-containers an option? Ideally, I do want to use user-provided, per-job Containers • Running HTCondor and user jobs in separate pods not an option due to opportunistic nature But Kubernetes pods are made of several Containers • Could I run HTCondor in a dedicated Container? • Then start the user pod in a side-container? Pretty sure currently not supported • But, at least in principle, fits the architecture • Would also need HTCondor native support Pod HTCondor container User job container HTCondor Week, May 2019 21
  • 22. Will nested Containers be a reality soon? It has been pointed out to me that latest CentOS supports unprivileged Singularity Have not tied it out • Probably I should Cannot currently assume all of my nodes have a recent-enough kernel • But eventually will get there HTCondor Week, May 2019 22
  • 23. Looking ahead Looking forward to a more automated future Will do what I have to myself Would be happier if I could use off-the-shelf solutions HTCondor Week, May 2019 23
  • 24. A final picture • Opportunistic GPU usage over the past few months HTCondor Week, May 2019 24
  • 25. Summary • We created an opportunistic HTCondor pool in the PRP Kubernetes cluster • OSG users can now use any otherwise-unused cycles • The lack of nested containerization forces us to have multiple execute pod types • Some micromanagement currently needed, hoping for more automation in the future HTCondor Week, May 2019 25
  • 26. Acknowledgents This work was partially funded by US National Science Foundation (NSF) awards CNS-1456638, CNS-1730158, ACI-1540112, ACI-1541349, OAC-1826967, OAC 1450871, OAC-1659169 and OAC-1841530. HTCondor Week, May 2019 26