SlideShare a Scribd company logo
1 of 54
Download to read offline
Kubernetes the New
Research Platform
Bob Killen Lindsey Tulloch
University of Michigan Brock University
$ whoami - Lindsey
Lindsey Tulloch
Undergraduate Student at Brock University
Github: @onyiny-ang
Twitter: @9jaLindsey
$ whoami - Bob
Bob Killen
rkillen@umich.edu
Senior Research Cloud Administrator
CNCF Ambassador
Github: @mrbobbytables
Twitter: @mrbobbytables
Kubernetes the New
Research Platform
Bob Killen Lindsey Tulloch
University of Michigan Brock University
...or a tale of two
Research Institutions.
Why?
● Increased use of containers...everywhere.
● Moving away from strict “job” style workflows.
● Adoption of data-streaming and in-flight
processing.
● Greater use of interactive Science Gateways.
● Dependence on other more persistent services.
● Increasing demand for reproducibility.
R. Banerjee et. all - A graph theoretic framework for
representation, exploration and analysis on computed
states of physical systems
Why Kubernetes?
● Kubernetes has become the standard for container orchestration.
● Extremely easy to extend, augment, and integrate with other
systems.
● If it works on Kubernetes, it’ll work “anywhere”.
● No vendor lock-in.
● Very large, active development community.
● Declarative nature aids in improving reproducibility.
● Final Research
Project in CS(1 credit)
● Final Research
Project in CS(1 credit)
● Bioinformatics
● Final Research
Project in CS(1 credit)
● Bioinformatics
● Kubernetes
● Final Research
Project in CS(1 credit)
● Bioinformatics
● Kubernetes
● Bioinformatics on
Kubernetes!
● Final Research
Project in CS(1 credit)
● Bioinformatics
● Kubernetes
● Bioinformatics on
Kubernetes!
● Final Research
Project in CS(1 credit)
● Bioinformatics
● Kubernetes
● Bioinformatics on
Kubernetes!
● on Compute Canada?
Compute Canada
Regional and Government Partners
Compute Canada
● Not-for-profit corporation
● Membership includes most of Canada’s major
research universities
● All Canadian faculty members have access to
Compute Canada systems and can sponsor others:
- students
- postdocs
- external collaborators
● No fee for Canadian university faculty
● Reduced fee for federal laboratories and
not-for-profit orgs
Compute Canada
● Compute and storage resources, data centres
● Team of ~200 experts in utilization of advanced
research computing
● 100s of research software packages
● Cloud compute and storage (openstack, owncloud)
● 5-10 Data Centres
● 300,000 cores
● 12 Pflops, 50+ PB
Compute Canada
Researchers drive innovation
● The CC user base is broadening,
bringing a broader set of needs.
● Tremendous interest in services
enabling Research Data Management
(RDM)
● No restrictions on researchers ≠ admin privileges
● ~200 experts ≠ ~200 Kubernetes experts
● ≠ 1 Kubernetes expert. . .
● How is this going to work?????
Researchers drive innovation
Back to Salmon
ATLAS Collaboration
ATLAS Collaboration
What is ATLAS?
- located on the Large Hadron Collider ring
- detects and records the products of proton collisions in the LHC
- The LHC and the ATLAS detector together form the most powerful microscope
ever built
- allow scientists to explore:
- space and time
- fundamental laws of nature
ATLAS Collaboration
NBD
ATLAS Collaboration
● ATLAS produces several peta-bytes of data/year
● Tier 2 computing centers perform final analyses (Canadian Universities like UVic)
UVic-ATLAS group:
- 25 scientists (students, research associates, technicians, computer experts,
engineers and physics professors)
ATLAS + Kubernetes
Where does
Kubernetes fit
in?
Compute Canada and CERN
● Use Kubernetes as a batch system
● Based on SLC6 containers and
CVMFS-csi driver
● Proxy passed through K8s secret
● Still room for evolution, eg. allow
arbitrary container/options
execution, maybe split I/O in 1-core
container, improve usage of
infrastructure
● Tested at scale for some weeks
thanks to CERN IT & Ricardo Rocha
FaHiu Lin, Mandy Yang
Compute Canada and CERN
● Create your own cluster with certain
number of nodes (=VMs)
● Kubernetes orchestrates pods
(=containers) on top
● Need custom scheduling
● Need to improve/automate node
management with infrastructure
people
− Lost half the nodes during the exercise
FaHiu Lin Thanks to Danika MacDonell
With default K8s
Scheduler (round
robin load balance)
With policy
tuning to pack
nodes
Salmon on Kubernetes
● Arbutus Cloud Project Access
○ Openstack
○ Maximum Resource Allocation
■ 5 Instances, 16 VCPUs, 36GB RAM, 5 Volumes, 70GB Volume
Storage
■ 5 Floating IPs, 6 Security Groups
● Deploy Kubernetes with Kubespray, Terraform and Ansible
● Containerize the Salmon Algorithm
● Create Argo workflow
Salmon runs
Salmon Results
Future of Kubernetes at CC
● Interest from some staff
● CERN seems to be driving Kubernetes innovation
● Other researchers?
○ Learning curve is steep and time is precious (installing Kubernetes on bare
metal just to run your workflow is probably not worth it)
○ Lack of expertise with essential tools (yaml, docker, github)
University of Michigan
● 19 school and colleges
● 45,000 students
● 8,000 faculty
● Largest Public Research Institution
within the U.S.
● 1.48 billion in annual research
expenditures.
ARC-TS
● Advanced Research Computing and Technology Services.
● Streamline the Research Experience.
● Manage all computational
Research Needs.
● Provide infrastructure and
architecture consultation
services.
ARC-TS
● Primary Shared HPC Cluster - 27,000
cores.
● Secondary restricted data HPC
Cluster.
● Additional clusters with ARM,
POWER architectures.
● Data Science (HADOOP + Spark)
● On-prem virtualization services
● Cloud Services.
ARC-TS Needs
● Original adoption of Kubernetes spurred by
internal needs to easily host and manage
internal services.
○ High availability
■ Hosting artifacts and patch mirrors
■ Source repositories
■ Build Systems
○ Minimal overhead
○ Logging & Metrics
A few services..
#1 Requested Service.
Demand shifting from JupyterHub to Kubeflow.
Why Kubeflow?
● Chainer Training
● Hyperparameter Tuning (Katib)
● Istio Integration (for TF Serving)
● Jupyter Notebooks
● ModelDB
● ksonnet
● MPI Training
● MXNet Training
● Pipelines
● PyTorch Training
● Seldon Serving
● NVIDIA TensorRT Inference Server
● TensorFlow Serving
● TensorFlow Batch Predict
● TensorFlow Training (TFJob)
● PyTorch Serving
The New Research Workflow
Sculley et al. - Hidden Technical Debt in Machine Learning Systems
Challenges
● Difficult to integrate with classic multi-user posix
infrastructure.
○ Translating API level identity to posix identity.
● Installation on-prem/bare-metal is still challenging.
● No “native” concept of job queue or wall time.
○ Up to higher level components to extend and add that functionality.
● Scheduler generally not as expressive as common HPC workload
managers such as Slurm or Torque/MOAB.
Current User Distribution
● General Users - 70% - Want a
consumable endpoint.
● Intermediate users - 20% - Want
to be able to update their own
deployment (Git) and consume
results.
● Advanced users - 10% - Want
direct Kubernetes Access.
Future @ UofM
● Move to Bare Metal.
● Improve integration with institutional infrastructure.
● Investigate Hybrid HPC & Kubernetes.
○ Sylabs SLURM Operator
○ IBM LSF Operator
● Improved Kubernetes Native HPC
○ Kube-batch
○ Volcano
Future @ UofM
Outreach and training for
both Faculty and Students.
Expected User Distribution
General Users - 70% 30%
Intermediate - 20% 40%
Advanced - 10% 30%
Demand for direct access
growing with continued
education.
Expected User Distribution
General Users - 70% 30%
Intermediate - 20% 40%
Advanced - 10% 30%
Demand for direct access
growing with continued
education.
Recap:
Kubernetes is great.
Lots of applications to facilitate research workflows.
Growing demand for research that would benefit from Kubernetes.
Suggestions for increasing
Kubernetes Adoption
Providers
● Offer Kubernetes for people to consume
● Get involved with the Kube community
● Learn as much as you can
● Provide outreach to researchers and
anyone that might need to be ramped up
Researchers
● Engage with research institutions
● Get involved with the Kube community
● Learn as much as you can
● Provide outreach to researchers and
anyone that might need to be ramped up
Useful Links
● CNCF Academic Mailing List
● CNCF Academic Slack (#academia)
● Batch Jobs Channel (#kubernetes-batch-jobs)
● Kubernetes Big Data User Group
● Kubernetes Machine Learning Working Group
Credits and Thanks
● ATLAS images were sourced from the CERN document server:
https://cds.cern.ch/
● VISPA website:
https://www.uvic.ca/science/physics/vispa/research/projects/atlas/
● Compute Canada usage information:
https://www.computecanada.ca

More Related Content

What's hot

Brief Introduction To Kubernetes
Brief Introduction To KubernetesBrief Introduction To Kubernetes
Brief Introduction To KubernetesAvinash Ketkar
 
K8s in 3h - Kubernetes Fundamentals Training
K8s in 3h - Kubernetes Fundamentals TrainingK8s in 3h - Kubernetes Fundamentals Training
K8s in 3h - Kubernetes Fundamentals TrainingPiotr Perzyna
 
Cloud Native Applications on Kubernetes: a DevOps Approach
Cloud Native Applications on Kubernetes: a DevOps ApproachCloud Native Applications on Kubernetes: a DevOps Approach
Cloud Native Applications on Kubernetes: a DevOps ApproachNicola Ferraro
 
(Draft) Kubernetes - A Comprehensive Overview
(Draft) Kubernetes - A Comprehensive Overview(Draft) Kubernetes - A Comprehensive Overview
(Draft) Kubernetes - A Comprehensive OverviewBob Killen
 
KubeCon USA 2017 brief Overview - from Kubernetes meetup Bangalore
KubeCon USA 2017 brief Overview - from Kubernetes meetup BangaloreKubeCon USA 2017 brief Overview - from Kubernetes meetup Bangalore
KubeCon USA 2017 brief Overview - from Kubernetes meetup BangaloreKrishna-Kumar
 
Kubernetes Monitoring & Best Practices
Kubernetes Monitoring & Best PracticesKubernetes Monitoring & Best Practices
Kubernetes Monitoring & Best PracticesAjeet Singh Raina
 
Cloud Native Apps with GitOps
Cloud Native Apps with GitOps Cloud Native Apps with GitOps
Cloud Native Apps with GitOps Weaveworks
 
Deploying your first application with Kubernetes
Deploying your first application with KubernetesDeploying your first application with Kubernetes
Deploying your first application with KubernetesOVHcloud
 
How to contribute to cloud native computing foundation (CNCF)
How to contribute to cloud native computing foundation (CNCF)How to contribute to cloud native computing foundation (CNCF)
How to contribute to cloud native computing foundation (CNCF)Krishna-Kumar
 
The Operator Pattern - Managing Stateful Services in Kubernetes
The Operator Pattern - Managing Stateful Services in KubernetesThe Operator Pattern - Managing Stateful Services in Kubernetes
The Operator Pattern - Managing Stateful Services in KubernetesQAware GmbH
 
Kubernetes Architecture
 Kubernetes Architecture Kubernetes Architecture
Kubernetes ArchitectureKnoldus Inc.
 
Top 3 reasons why you should run your Enterprise workloads on GKE
Top 3 reasons why you should run your Enterprise workloads on GKETop 3 reasons why you should run your Enterprise workloads on GKE
Top 3 reasons why you should run your Enterprise workloads on GKESreenivas Makam
 
Kubernetes Architecture | Understanding Kubernetes Components | Kubernetes Tu...
Kubernetes Architecture | Understanding Kubernetes Components | Kubernetes Tu...Kubernetes Architecture | Understanding Kubernetes Components | Kubernetes Tu...
Kubernetes Architecture | Understanding Kubernetes Components | Kubernetes Tu...Edureka!
 
Kubernetes Workshop
Kubernetes WorkshopKubernetes Workshop
Kubernetes Workshoploodse
 
Kubernetes a comprehensive overview
Kubernetes   a comprehensive overviewKubernetes   a comprehensive overview
Kubernetes a comprehensive overviewGabriel Carro
 
Why kubernetes for Serverless (FaaS)
Why kubernetes for Serverless (FaaS)Why kubernetes for Serverless (FaaS)
Why kubernetes for Serverless (FaaS)Krishna-Kumar
 
Evolution of containers to kubernetes
Evolution of containers to kubernetesEvolution of containers to kubernetes
Evolution of containers to kubernetesKrishna-Kumar
 

What's hot (20)

From Code to Kubernetes
From Code to KubernetesFrom Code to Kubernetes
From Code to Kubernetes
 
Brief Introduction To Kubernetes
Brief Introduction To KubernetesBrief Introduction To Kubernetes
Brief Introduction To Kubernetes
 
K8s in 3h - Kubernetes Fundamentals Training
K8s in 3h - Kubernetes Fundamentals TrainingK8s in 3h - Kubernetes Fundamentals Training
K8s in 3h - Kubernetes Fundamentals Training
 
Cloud Native Applications on Kubernetes: a DevOps Approach
Cloud Native Applications on Kubernetes: a DevOps ApproachCloud Native Applications on Kubernetes: a DevOps Approach
Cloud Native Applications on Kubernetes: a DevOps Approach
 
Kubernetes 101 Workshop
Kubernetes 101 WorkshopKubernetes 101 Workshop
Kubernetes 101 Workshop
 
(Draft) Kubernetes - A Comprehensive Overview
(Draft) Kubernetes - A Comprehensive Overview(Draft) Kubernetes - A Comprehensive Overview
(Draft) Kubernetes - A Comprehensive Overview
 
KubeCon USA 2017 brief Overview - from Kubernetes meetup Bangalore
KubeCon USA 2017 brief Overview - from Kubernetes meetup BangaloreKubeCon USA 2017 brief Overview - from Kubernetes meetup Bangalore
KubeCon USA 2017 brief Overview - from Kubernetes meetup Bangalore
 
Kubernetes Monitoring & Best Practices
Kubernetes Monitoring & Best PracticesKubernetes Monitoring & Best Practices
Kubernetes Monitoring & Best Practices
 
Cloud Native Apps with GitOps
Cloud Native Apps with GitOps Cloud Native Apps with GitOps
Cloud Native Apps with GitOps
 
Deploying your first application with Kubernetes
Deploying your first application with KubernetesDeploying your first application with Kubernetes
Deploying your first application with Kubernetes
 
How to contribute to cloud native computing foundation (CNCF)
How to contribute to cloud native computing foundation (CNCF)How to contribute to cloud native computing foundation (CNCF)
How to contribute to cloud native computing foundation (CNCF)
 
The Operator Pattern - Managing Stateful Services in Kubernetes
The Operator Pattern - Managing Stateful Services in KubernetesThe Operator Pattern - Managing Stateful Services in Kubernetes
The Operator Pattern - Managing Stateful Services in Kubernetes
 
Kubernetes @ meetic
Kubernetes @ meeticKubernetes @ meetic
Kubernetes @ meetic
 
Kubernetes Architecture
 Kubernetes Architecture Kubernetes Architecture
Kubernetes Architecture
 
Top 3 reasons why you should run your Enterprise workloads on GKE
Top 3 reasons why you should run your Enterprise workloads on GKETop 3 reasons why you should run your Enterprise workloads on GKE
Top 3 reasons why you should run your Enterprise workloads on GKE
 
Kubernetes Architecture | Understanding Kubernetes Components | Kubernetes Tu...
Kubernetes Architecture | Understanding Kubernetes Components | Kubernetes Tu...Kubernetes Architecture | Understanding Kubernetes Components | Kubernetes Tu...
Kubernetes Architecture | Understanding Kubernetes Components | Kubernetes Tu...
 
Kubernetes Workshop
Kubernetes WorkshopKubernetes Workshop
Kubernetes Workshop
 
Kubernetes a comprehensive overview
Kubernetes   a comprehensive overviewKubernetes   a comprehensive overview
Kubernetes a comprehensive overview
 
Why kubernetes for Serverless (FaaS)
Why kubernetes for Serverless (FaaS)Why kubernetes for Serverless (FaaS)
Why kubernetes for Serverless (FaaS)
 
Evolution of containers to kubernetes
Evolution of containers to kubernetesEvolution of containers to kubernetes
Evolution of containers to kubernetes
 

Similar to Kubernetes The New Research Platform

The Salmon Algorithm Spawning with Kubernetes
The Salmon Algorithm Spawning with KubernetesThe Salmon Algorithm Spawning with Kubernetes
The Salmon Algorithm Spawning with KubernetesCloudOps2005
 
Security Challenges and the Pacific Research Platform
Security Challenges and the Pacific Research PlatformSecurity Challenges and the Pacific Research Platform
Security Challenges and the Pacific Research PlatformLarry Smarr
 
HPC Cluster Computing from 64 to 156,000 Cores 
HPC Cluster Computing from 64 to 156,000 Cores HPC Cluster Computing from 64 to 156,000 Cores 
HPC Cluster Computing from 64 to 156,000 Cores inside-BigData.com
 
Pioneering and Democratizing Scalable HPC+AI at PSC
Pioneering and Democratizing Scalable HPC+AI at PSCPioneering and Democratizing Scalable HPC+AI at PSC
Pioneering and Democratizing Scalable HPC+AI at PSCinside-BigData.com
 
CloudLightning and the OPM-based Use Case
CloudLightning and the OPM-based Use CaseCloudLightning and the OPM-based Use Case
CloudLightning and the OPM-based Use CaseCloudLightning
 
OpenStack Toronto Q3 MeetUp - September 28th 2017
OpenStack Toronto Q3 MeetUp - September 28th 2017OpenStack Toronto Q3 MeetUp - September 28th 2017
OpenStack Toronto Q3 MeetUp - September 28th 2017Stacy Véronneau
 
Panel: Building the NRP Ecosystem with the Regional Networks on their Campuses;
Panel: Building the NRP Ecosystem with the Regional Networks on their Campuses;Panel: Building the NRP Ecosystem with the Regional Networks on their Campuses;
Panel: Building the NRP Ecosystem with the Regional Networks on their Campuses;Larry Smarr
 
20181219 ucc open stack 5 years v3
20181219 ucc open stack 5 years v320181219 ucc open stack 5 years v3
20181219 ucc open stack 5 years v3Tim Bell
 
20181219 ucc open stack 5 years v3
20181219 ucc open stack 5 years v320181219 ucc open stack 5 years v3
20181219 ucc open stack 5 years v3Tim Bell
 
40 Powers of 10 - Simulating the Universe with the DiRAC HPC Facility
40 Powers of 10 - Simulating the Universe with the DiRAC HPC Facility40 Powers of 10 - Simulating the Universe with the DiRAC HPC Facility
40 Powers of 10 - Simulating the Universe with the DiRAC HPC Facilityinside-BigData.com
 
Looking Back, Looking Forward NSF CI Funding 1985-2025
Looking Back, Looking Forward NSF CI Funding 1985-2025Looking Back, Looking Forward NSF CI Funding 1985-2025
Looking Back, Looking Forward NSF CI Funding 1985-2025Larry Smarr
 
Design phase kick-off event and Ceremony
Design phase kick-off event and CeremonyDesign phase kick-off event and Ceremony
Design phase kick-off event and CeremonyArchiver
 
Armada: Batch Workloads Across 10s of Kubernetes Clusters
Armada: Batch Workloads Across 10s of Kubernetes ClustersArmada: Batch Workloads Across 10s of Kubernetes Clusters
Armada: Batch Workloads Across 10s of Kubernetes ClustersKubernetesCommunityD
 
Open Source Visualization of Scientific Data
Open Source Visualization of Scientific DataOpen Source Visualization of Scientific Data
Open Source Visualization of Scientific DataMarcus Hanwell
 
Monitoring Exascale Supercomputers With Tim Osborne | Current 2022
Monitoring Exascale Supercomputers With Tim Osborne | Current 2022Monitoring Exascale Supercomputers With Tim Osborne | Current 2022
Monitoring Exascale Supercomputers With Tim Osborne | Current 2022HostedbyConfluent
 
Toward a National Research Platform
Toward a National Research PlatformToward a National Research Platform
Toward a National Research PlatformLarry Smarr
 
Chemical Databases and Open Chemistry on the Desktop
Chemical Databases and Open Chemistry on the DesktopChemical Databases and Open Chemistry on the Desktop
Chemical Databases and Open Chemistry on the DesktopMarcus Hanwell
 
What Are Science Clouds?
What Are Science Clouds?What Are Science Clouds?
What Are Science Clouds?Robert Grossman
 

Similar to Kubernetes The New Research Platform (20)

The Salmon Algorithm Spawning with Kubernetes
The Salmon Algorithm Spawning with KubernetesThe Salmon Algorithm Spawning with Kubernetes
The Salmon Algorithm Spawning with Kubernetes
 
Security Challenges and the Pacific Research Platform
Security Challenges and the Pacific Research PlatformSecurity Challenges and the Pacific Research Platform
Security Challenges and the Pacific Research Platform
 
HPC Cluster Computing from 64 to 156,000 Cores 
HPC Cluster Computing from 64 to 156,000 Cores HPC Cluster Computing from 64 to 156,000 Cores 
HPC Cluster Computing from 64 to 156,000 Cores 
 
Pioneering and Democratizing Scalable HPC+AI at PSC
Pioneering and Democratizing Scalable HPC+AI at PSCPioneering and Democratizing Scalable HPC+AI at PSC
Pioneering and Democratizing Scalable HPC+AI at PSC
 
CloudLightning and the OPM-based Use Case
CloudLightning and the OPM-based Use CaseCloudLightning and the OPM-based Use Case
CloudLightning and the OPM-based Use Case
 
OpenStack Toronto Q3 MeetUp - September 28th 2017
OpenStack Toronto Q3 MeetUp - September 28th 2017OpenStack Toronto Q3 MeetUp - September 28th 2017
OpenStack Toronto Q3 MeetUp - September 28th 2017
 
Panel: Building the NRP Ecosystem with the Regional Networks on their Campuses;
Panel: Building the NRP Ecosystem with the Regional Networks on their Campuses;Panel: Building the NRP Ecosystem with the Regional Networks on their Campuses;
Panel: Building the NRP Ecosystem with the Regional Networks on their Campuses;
 
20181219 ucc open stack 5 years v3
20181219 ucc open stack 5 years v320181219 ucc open stack 5 years v3
20181219 ucc open stack 5 years v3
 
20181219 ucc open stack 5 years v3
20181219 ucc open stack 5 years v320181219 ucc open stack 5 years v3
20181219 ucc open stack 5 years v3
 
Available HPC resources at CSUC
Available HPC resources at CSUCAvailable HPC resources at CSUC
Available HPC resources at CSUC
 
40 Powers of 10 - Simulating the Universe with the DiRAC HPC Facility
40 Powers of 10 - Simulating the Universe with the DiRAC HPC Facility40 Powers of 10 - Simulating the Universe with the DiRAC HPC Facility
40 Powers of 10 - Simulating the Universe with the DiRAC HPC Facility
 
Available HPC Resources at CSUC
Available HPC Resources at CSUCAvailable HPC Resources at CSUC
Available HPC Resources at CSUC
 
Looking Back, Looking Forward NSF CI Funding 1985-2025
Looking Back, Looking Forward NSF CI Funding 1985-2025Looking Back, Looking Forward NSF CI Funding 1985-2025
Looking Back, Looking Forward NSF CI Funding 1985-2025
 
Design phase kick-off event and Ceremony
Design phase kick-off event and CeremonyDesign phase kick-off event and Ceremony
Design phase kick-off event and Ceremony
 
Armada: Batch Workloads Across 10s of Kubernetes Clusters
Armada: Batch Workloads Across 10s of Kubernetes ClustersArmada: Batch Workloads Across 10s of Kubernetes Clusters
Armada: Batch Workloads Across 10s of Kubernetes Clusters
 
Open Source Visualization of Scientific Data
Open Source Visualization of Scientific DataOpen Source Visualization of Scientific Data
Open Source Visualization of Scientific Data
 
Monitoring Exascale Supercomputers With Tim Osborne | Current 2022
Monitoring Exascale Supercomputers With Tim Osborne | Current 2022Monitoring Exascale Supercomputers With Tim Osborne | Current 2022
Monitoring Exascale Supercomputers With Tim Osborne | Current 2022
 
Toward a National Research Platform
Toward a National Research PlatformToward a National Research Platform
Toward a National Research Platform
 
Chemical Databases and Open Chemistry on the Desktop
Chemical Databases and Open Chemistry on the DesktopChemical Databases and Open Chemistry on the Desktop
Chemical Databases and Open Chemistry on the Desktop
 
What Are Science Clouds?
What Are Science Clouds?What Are Science Clouds?
What Are Science Clouds?
 

More from Bob Killen

Tackling New Challenges in a Virtual Focused Community
Tackling New Challenges in a Virtual Focused CommunityTackling New Challenges in a Virtual Focused Community
Tackling New Challenges in a Virtual Focused CommunityBob Killen
 
KubeCon EU 2021 Keynote: Shaping Kubernetes Community Culture
KubeCon EU 2021 Keynote: Shaping Kubernetes Community CultureKubeCon EU 2021 Keynote: Shaping Kubernetes Community Culture
KubeCon EU 2021 Keynote: Shaping Kubernetes Community CultureBob Killen
 
Introduction to Kubernetes Workshop
Introduction to Kubernetes WorkshopIntroduction to Kubernetes Workshop
Introduction to Kubernetes WorkshopBob Killen
 
The (mutable) config management showdown
The (mutable) config management showdownThe (mutable) config management showdown
The (mutable) config management showdownBob Killen
 
Ansible, integration testing, and you.
Ansible, integration testing, and you.Ansible, integration testing, and you.
Ansible, integration testing, and you.Bob Killen
 
Pluggable Infrastructure with CI/CD and Docker
Pluggable Infrastructure with CI/CD and DockerPluggable Infrastructure with CI/CD and Docker
Pluggable Infrastructure with CI/CD and DockerBob Killen
 

More from Bob Killen (6)

Tackling New Challenges in a Virtual Focused Community
Tackling New Challenges in a Virtual Focused CommunityTackling New Challenges in a Virtual Focused Community
Tackling New Challenges in a Virtual Focused Community
 
KubeCon EU 2021 Keynote: Shaping Kubernetes Community Culture
KubeCon EU 2021 Keynote: Shaping Kubernetes Community CultureKubeCon EU 2021 Keynote: Shaping Kubernetes Community Culture
KubeCon EU 2021 Keynote: Shaping Kubernetes Community Culture
 
Introduction to Kubernetes Workshop
Introduction to Kubernetes WorkshopIntroduction to Kubernetes Workshop
Introduction to Kubernetes Workshop
 
The (mutable) config management showdown
The (mutable) config management showdownThe (mutable) config management showdown
The (mutable) config management showdown
 
Ansible, integration testing, and you.
Ansible, integration testing, and you.Ansible, integration testing, and you.
Ansible, integration testing, and you.
 
Pluggable Infrastructure with CI/CD and Docker
Pluggable Infrastructure with CI/CD and DockerPluggable Infrastructure with CI/CD and Docker
Pluggable Infrastructure with CI/CD and Docker
 

Recently uploaded

Call Girls Dubai Prolapsed O525547819 Call Girls In Dubai Princes$
Call Girls Dubai Prolapsed O525547819 Call Girls In Dubai Princes$Call Girls Dubai Prolapsed O525547819 Call Girls In Dubai Princes$
Call Girls Dubai Prolapsed O525547819 Call Girls In Dubai Princes$kojalkojal131
 
Call Girls In Model Towh Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Model Towh Delhi 💯Call Us 🔝8264348440🔝Call Girls In Model Towh Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Model Towh Delhi 💯Call Us 🔝8264348440🔝soniya singh
 
AWS Community DAY Albertini-Ellan Cloud Security (1).pptx
AWS Community DAY Albertini-Ellan Cloud Security (1).pptxAWS Community DAY Albertini-Ellan Cloud Security (1).pptx
AWS Community DAY Albertini-Ellan Cloud Security (1).pptxellan12
 
Top Rated Pune Call Girls Daund ⟟ 6297143586 ⟟ Call Me For Genuine Sex Servi...
Top Rated  Pune Call Girls Daund ⟟ 6297143586 ⟟ Call Me For Genuine Sex Servi...Top Rated  Pune Call Girls Daund ⟟ 6297143586 ⟟ Call Me For Genuine Sex Servi...
Top Rated Pune Call Girls Daund ⟟ 6297143586 ⟟ Call Me For Genuine Sex Servi...Call Girls in Nagpur High Profile
 
INDIVIDUAL ASSIGNMENT #3 CBG, PRESENTATION.
INDIVIDUAL ASSIGNMENT #3 CBG, PRESENTATION.INDIVIDUAL ASSIGNMENT #3 CBG, PRESENTATION.
INDIVIDUAL ASSIGNMENT #3 CBG, PRESENTATION.CarlotaBedoya1
 
Call Now ☎ 8264348440 !! Call Girls in Sarai Rohilla Escort Service Delhi N.C.R.
Call Now ☎ 8264348440 !! Call Girls in Sarai Rohilla Escort Service Delhi N.C.R.Call Now ☎ 8264348440 !! Call Girls in Sarai Rohilla Escort Service Delhi N.C.R.
Call Now ☎ 8264348440 !! Call Girls in Sarai Rohilla Escort Service Delhi N.C.R.soniya singh
 
Call Now ☎ 8264348440 !! Call Girls in Green Park Escort Service Delhi N.C.R.
Call Now ☎ 8264348440 !! Call Girls in Green Park Escort Service Delhi N.C.R.Call Now ☎ 8264348440 !! Call Girls in Green Park Escort Service Delhi N.C.R.
Call Now ☎ 8264348440 !! Call Girls in Green Park Escort Service Delhi N.C.R.soniya singh
 
All Time Service Available Call Girls Mg Road 👌 ⏭️ 6378878445
All Time Service Available Call Girls Mg Road 👌 ⏭️ 6378878445All Time Service Available Call Girls Mg Road 👌 ⏭️ 6378878445
All Time Service Available Call Girls Mg Road 👌 ⏭️ 6378878445ruhi
 
Call Girls In Ashram Chowk Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Ashram Chowk Delhi 💯Call Us 🔝8264348440🔝Call Girls In Ashram Chowk Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Ashram Chowk Delhi 💯Call Us 🔝8264348440🔝soniya singh
 
Nanded City ( Call Girls ) Pune 6297143586 Hot Model With Sexy Bhabi Ready ...
Nanded City ( Call Girls ) Pune  6297143586  Hot Model With Sexy Bhabi Ready ...Nanded City ( Call Girls ) Pune  6297143586  Hot Model With Sexy Bhabi Ready ...
Nanded City ( Call Girls ) Pune 6297143586 Hot Model With Sexy Bhabi Ready ...tanu pandey
 
Call Girls In Pratap Nagar Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Pratap Nagar Delhi 💯Call Us 🔝8264348440🔝Call Girls In Pratap Nagar Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Pratap Nagar Delhi 💯Call Us 🔝8264348440🔝soniya singh
 
'Future Evolution of the Internet' delivered by Geoff Huston at Everything Op...
'Future Evolution of the Internet' delivered by Geoff Huston at Everything Op...'Future Evolution of the Internet' delivered by Geoff Huston at Everything Op...
'Future Evolution of the Internet' delivered by Geoff Huston at Everything Op...APNIC
 
GDG Cloud Southlake 32: Kyle Hettinger: Demystifying the Dark Web
GDG Cloud Southlake 32: Kyle Hettinger: Demystifying the Dark WebGDG Cloud Southlake 32: Kyle Hettinger: Demystifying the Dark Web
GDG Cloud Southlake 32: Kyle Hettinger: Demystifying the Dark WebJames Anderson
 
VIP Model Call Girls Hadapsar ( Pune ) Call ON 9905417584 Starting High Prof...
VIP Model Call Girls Hadapsar ( Pune ) Call ON 9905417584 Starting  High Prof...VIP Model Call Girls Hadapsar ( Pune ) Call ON 9905417584 Starting  High Prof...
VIP Model Call Girls Hadapsar ( Pune ) Call ON 9905417584 Starting High Prof...singhpriety023
 
Russian Call girl in Ajman +971563133746 Ajman Call girl Service
Russian Call girl in Ajman +971563133746 Ajman Call girl ServiceRussian Call girl in Ajman +971563133746 Ajman Call girl Service
Russian Call girl in Ajman +971563133746 Ajman Call girl Servicegwenoracqe6
 
On Starlink, presented by Geoff Huston at NZNOG 2024
On Starlink, presented by Geoff Huston at NZNOG 2024On Starlink, presented by Geoff Huston at NZNOG 2024
On Starlink, presented by Geoff Huston at NZNOG 2024APNIC
 

Recently uploaded (20)

Call Girls Dubai Prolapsed O525547819 Call Girls In Dubai Princes$
Call Girls Dubai Prolapsed O525547819 Call Girls In Dubai Princes$Call Girls Dubai Prolapsed O525547819 Call Girls In Dubai Princes$
Call Girls Dubai Prolapsed O525547819 Call Girls In Dubai Princes$
 
Rohini Sector 26 Call Girls Delhi 9999965857 @Sabina Saikh No Advance
Rohini Sector 26 Call Girls Delhi 9999965857 @Sabina Saikh No AdvanceRohini Sector 26 Call Girls Delhi 9999965857 @Sabina Saikh No Advance
Rohini Sector 26 Call Girls Delhi 9999965857 @Sabina Saikh No Advance
 
Call Girls In Model Towh Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Model Towh Delhi 💯Call Us 🔝8264348440🔝Call Girls In Model Towh Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Model Towh Delhi 💯Call Us 🔝8264348440🔝
 
AWS Community DAY Albertini-Ellan Cloud Security (1).pptx
AWS Community DAY Albertini-Ellan Cloud Security (1).pptxAWS Community DAY Albertini-Ellan Cloud Security (1).pptx
AWS Community DAY Albertini-Ellan Cloud Security (1).pptx
 
Top Rated Pune Call Girls Daund ⟟ 6297143586 ⟟ Call Me For Genuine Sex Servi...
Top Rated  Pune Call Girls Daund ⟟ 6297143586 ⟟ Call Me For Genuine Sex Servi...Top Rated  Pune Call Girls Daund ⟟ 6297143586 ⟟ Call Me For Genuine Sex Servi...
Top Rated Pune Call Girls Daund ⟟ 6297143586 ⟟ Call Me For Genuine Sex Servi...
 
Rohini Sector 6 Call Girls Delhi 9999965857 @Sabina Saikh No Advance
Rohini Sector 6 Call Girls Delhi 9999965857 @Sabina Saikh No AdvanceRohini Sector 6 Call Girls Delhi 9999965857 @Sabina Saikh No Advance
Rohini Sector 6 Call Girls Delhi 9999965857 @Sabina Saikh No Advance
 
INDIVIDUAL ASSIGNMENT #3 CBG, PRESENTATION.
INDIVIDUAL ASSIGNMENT #3 CBG, PRESENTATION.INDIVIDUAL ASSIGNMENT #3 CBG, PRESENTATION.
INDIVIDUAL ASSIGNMENT #3 CBG, PRESENTATION.
 
Call Now ☎ 8264348440 !! Call Girls in Sarai Rohilla Escort Service Delhi N.C.R.
Call Now ☎ 8264348440 !! Call Girls in Sarai Rohilla Escort Service Delhi N.C.R.Call Now ☎ 8264348440 !! Call Girls in Sarai Rohilla Escort Service Delhi N.C.R.
Call Now ☎ 8264348440 !! Call Girls in Sarai Rohilla Escort Service Delhi N.C.R.
 
Call Now ☎ 8264348440 !! Call Girls in Green Park Escort Service Delhi N.C.R.
Call Now ☎ 8264348440 !! Call Girls in Green Park Escort Service Delhi N.C.R.Call Now ☎ 8264348440 !! Call Girls in Green Park Escort Service Delhi N.C.R.
Call Now ☎ 8264348440 !! Call Girls in Green Park Escort Service Delhi N.C.R.
 
All Time Service Available Call Girls Mg Road 👌 ⏭️ 6378878445
All Time Service Available Call Girls Mg Road 👌 ⏭️ 6378878445All Time Service Available Call Girls Mg Road 👌 ⏭️ 6378878445
All Time Service Available Call Girls Mg Road 👌 ⏭️ 6378878445
 
Call Girls In Ashram Chowk Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Ashram Chowk Delhi 💯Call Us 🔝8264348440🔝Call Girls In Ashram Chowk Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Ashram Chowk Delhi 💯Call Us 🔝8264348440🔝
 
Nanded City ( Call Girls ) Pune 6297143586 Hot Model With Sexy Bhabi Ready ...
Nanded City ( Call Girls ) Pune  6297143586  Hot Model With Sexy Bhabi Ready ...Nanded City ( Call Girls ) Pune  6297143586  Hot Model With Sexy Bhabi Ready ...
Nanded City ( Call Girls ) Pune 6297143586 Hot Model With Sexy Bhabi Ready ...
 
Call Girls In Pratap Nagar Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Pratap Nagar Delhi 💯Call Us 🔝8264348440🔝Call Girls In Pratap Nagar Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Pratap Nagar Delhi 💯Call Us 🔝8264348440🔝
 
'Future Evolution of the Internet' delivered by Geoff Huston at Everything Op...
'Future Evolution of the Internet' delivered by Geoff Huston at Everything Op...'Future Evolution of the Internet' delivered by Geoff Huston at Everything Op...
'Future Evolution of the Internet' delivered by Geoff Huston at Everything Op...
 
GDG Cloud Southlake 32: Kyle Hettinger: Demystifying the Dark Web
GDG Cloud Southlake 32: Kyle Hettinger: Demystifying the Dark WebGDG Cloud Southlake 32: Kyle Hettinger: Demystifying the Dark Web
GDG Cloud Southlake 32: Kyle Hettinger: Demystifying the Dark Web
 
Dwarka Sector 26 Call Girls | Delhi | 9999965857 🫦 Vanshika Verma More Our Se...
Dwarka Sector 26 Call Girls | Delhi | 9999965857 🫦 Vanshika Verma More Our Se...Dwarka Sector 26 Call Girls | Delhi | 9999965857 🫦 Vanshika Verma More Our Se...
Dwarka Sector 26 Call Girls | Delhi | 9999965857 🫦 Vanshika Verma More Our Se...
 
VIP Model Call Girls Hadapsar ( Pune ) Call ON 9905417584 Starting High Prof...
VIP Model Call Girls Hadapsar ( Pune ) Call ON 9905417584 Starting  High Prof...VIP Model Call Girls Hadapsar ( Pune ) Call ON 9905417584 Starting  High Prof...
VIP Model Call Girls Hadapsar ( Pune ) Call ON 9905417584 Starting High Prof...
 
(INDIRA) Call Girl Pune Call Now 8250077686 Pune Escorts 24x7
(INDIRA) Call Girl Pune Call Now 8250077686 Pune Escorts 24x7(INDIRA) Call Girl Pune Call Now 8250077686 Pune Escorts 24x7
(INDIRA) Call Girl Pune Call Now 8250077686 Pune Escorts 24x7
 
Russian Call girl in Ajman +971563133746 Ajman Call girl Service
Russian Call girl in Ajman +971563133746 Ajman Call girl ServiceRussian Call girl in Ajman +971563133746 Ajman Call girl Service
Russian Call girl in Ajman +971563133746 Ajman Call girl Service
 
On Starlink, presented by Geoff Huston at NZNOG 2024
On Starlink, presented by Geoff Huston at NZNOG 2024On Starlink, presented by Geoff Huston at NZNOG 2024
On Starlink, presented by Geoff Huston at NZNOG 2024
 

Kubernetes The New Research Platform

  • 1.
  • 2. Kubernetes the New Research Platform Bob Killen Lindsey Tulloch University of Michigan Brock University
  • 3. $ whoami - Lindsey Lindsey Tulloch Undergraduate Student at Brock University Github: @onyiny-ang Twitter: @9jaLindsey
  • 4. $ whoami - Bob Bob Killen rkillen@umich.edu Senior Research Cloud Administrator CNCF Ambassador Github: @mrbobbytables Twitter: @mrbobbytables
  • 5. Kubernetes the New Research Platform Bob Killen Lindsey Tulloch University of Michigan Brock University
  • 6. ...or a tale of two Research Institutions.
  • 7. Why? ● Increased use of containers...everywhere. ● Moving away from strict “job” style workflows. ● Adoption of data-streaming and in-flight processing. ● Greater use of interactive Science Gateways. ● Dependence on other more persistent services. ● Increasing demand for reproducibility. R. Banerjee et. all - A graph theoretic framework for representation, exploration and analysis on computed states of physical systems
  • 8. Why Kubernetes? ● Kubernetes has become the standard for container orchestration. ● Extremely easy to extend, augment, and integrate with other systems. ● If it works on Kubernetes, it’ll work “anywhere”. ● No vendor lock-in. ● Very large, active development community. ● Declarative nature aids in improving reproducibility.
  • 9. ● Final Research Project in CS(1 credit)
  • 10. ● Final Research Project in CS(1 credit) ● Bioinformatics
  • 11. ● Final Research Project in CS(1 credit) ● Bioinformatics ● Kubernetes
  • 12. ● Final Research Project in CS(1 credit) ● Bioinformatics ● Kubernetes ● Bioinformatics on Kubernetes!
  • 13. ● Final Research Project in CS(1 credit) ● Bioinformatics ● Kubernetes ● Bioinformatics on Kubernetes!
  • 14. ● Final Research Project in CS(1 credit) ● Bioinformatics ● Kubernetes ● Bioinformatics on Kubernetes! ● on Compute Canada?
  • 15. Compute Canada Regional and Government Partners
  • 16.
  • 17. Compute Canada ● Not-for-profit corporation ● Membership includes most of Canada’s major research universities ● All Canadian faculty members have access to Compute Canada systems and can sponsor others: - students - postdocs - external collaborators ● No fee for Canadian university faculty ● Reduced fee for federal laboratories and not-for-profit orgs
  • 18. Compute Canada ● Compute and storage resources, data centres ● Team of ~200 experts in utilization of advanced research computing ● 100s of research software packages ● Cloud compute and storage (openstack, owncloud) ● 5-10 Data Centres ● 300,000 cores ● 12 Pflops, 50+ PB
  • 19. Compute Canada Researchers drive innovation ● The CC user base is broadening, bringing a broader set of needs. ● Tremendous interest in services enabling Research Data Management (RDM)
  • 20. ● No restrictions on researchers ≠ admin privileges ● ~200 experts ≠ ~200 Kubernetes experts ● ≠ 1 Kubernetes expert. . . ● How is this going to work????? Researchers drive innovation Back to Salmon
  • 22. ATLAS Collaboration What is ATLAS? - located on the Large Hadron Collider ring - detects and records the products of proton collisions in the LHC - The LHC and the ATLAS detector together form the most powerful microscope ever built - allow scientists to explore: - space and time - fundamental laws of nature
  • 24. ATLAS Collaboration ● ATLAS produces several peta-bytes of data/year ● Tier 2 computing centers perform final analyses (Canadian Universities like UVic) UVic-ATLAS group: - 25 scientists (students, research associates, technicians, computer experts, engineers and physics professors)
  • 25. ATLAS + Kubernetes Where does Kubernetes fit in?
  • 26. Compute Canada and CERN ● Use Kubernetes as a batch system ● Based on SLC6 containers and CVMFS-csi driver ● Proxy passed through K8s secret ● Still room for evolution, eg. allow arbitrary container/options execution, maybe split I/O in 1-core container, improve usage of infrastructure ● Tested at scale for some weeks thanks to CERN IT & Ricardo Rocha FaHiu Lin, Mandy Yang
  • 27. Compute Canada and CERN ● Create your own cluster with certain number of nodes (=VMs) ● Kubernetes orchestrates pods (=containers) on top ● Need custom scheduling ● Need to improve/automate node management with infrastructure people − Lost half the nodes during the exercise FaHiu Lin Thanks to Danika MacDonell With default K8s Scheduler (round robin load balance) With policy tuning to pack nodes
  • 28. Salmon on Kubernetes ● Arbutus Cloud Project Access ○ Openstack ○ Maximum Resource Allocation ■ 5 Instances, 16 VCPUs, 36GB RAM, 5 Volumes, 70GB Volume Storage ■ 5 Floating IPs, 6 Security Groups ● Deploy Kubernetes with Kubespray, Terraform and Ansible ● Containerize the Salmon Algorithm ● Create Argo workflow
  • 31. Future of Kubernetes at CC ● Interest from some staff ● CERN seems to be driving Kubernetes innovation ● Other researchers? ○ Learning curve is steep and time is precious (installing Kubernetes on bare metal just to run your workflow is probably not worth it) ○ Lack of expertise with essential tools (yaml, docker, github)
  • 32. University of Michigan ● 19 school and colleges ● 45,000 students ● 8,000 faculty ● Largest Public Research Institution within the U.S. ● 1.48 billion in annual research expenditures.
  • 33. ARC-TS ● Advanced Research Computing and Technology Services. ● Streamline the Research Experience. ● Manage all computational Research Needs. ● Provide infrastructure and architecture consultation services.
  • 34. ARC-TS ● Primary Shared HPC Cluster - 27,000 cores. ● Secondary restricted data HPC Cluster. ● Additional clusters with ARM, POWER architectures. ● Data Science (HADOOP + Spark) ● On-prem virtualization services ● Cloud Services.
  • 35. ARC-TS Needs ● Original adoption of Kubernetes spurred by internal needs to easily host and manage internal services. ○ High availability ■ Hosting artifacts and patch mirrors ■ Source repositories ■ Build Systems ○ Minimal overhead ○ Logging & Metrics
  • 36.
  • 37.
  • 40. Demand shifting from JupyterHub to Kubeflow.
  • 41. Why Kubeflow? ● Chainer Training ● Hyperparameter Tuning (Katib) ● Istio Integration (for TF Serving) ● Jupyter Notebooks ● ModelDB ● ksonnet ● MPI Training ● MXNet Training ● Pipelines ● PyTorch Training ● Seldon Serving ● NVIDIA TensorRT Inference Server ● TensorFlow Serving ● TensorFlow Batch Predict ● TensorFlow Training (TFJob) ● PyTorch Serving
  • 42. The New Research Workflow Sculley et al. - Hidden Technical Debt in Machine Learning Systems
  • 43. Challenges ● Difficult to integrate with classic multi-user posix infrastructure. ○ Translating API level identity to posix identity. ● Installation on-prem/bare-metal is still challenging. ● No “native” concept of job queue or wall time. ○ Up to higher level components to extend and add that functionality. ● Scheduler generally not as expressive as common HPC workload managers such as Slurm or Torque/MOAB.
  • 44. Current User Distribution ● General Users - 70% - Want a consumable endpoint. ● Intermediate users - 20% - Want to be able to update their own deployment (Git) and consume results. ● Advanced users - 10% - Want direct Kubernetes Access.
  • 45. Future @ UofM ● Move to Bare Metal. ● Improve integration with institutional infrastructure. ● Investigate Hybrid HPC & Kubernetes. ○ Sylabs SLURM Operator ○ IBM LSF Operator ● Improved Kubernetes Native HPC ○ Kube-batch ○ Volcano
  • 46. Future @ UofM Outreach and training for both Faculty and Students.
  • 47. Expected User Distribution General Users - 70% 30% Intermediate - 20% 40% Advanced - 10% 30% Demand for direct access growing with continued education.
  • 48. Expected User Distribution General Users - 70% 30% Intermediate - 20% 40% Advanced - 10% 30% Demand for direct access growing with continued education.
  • 49. Recap: Kubernetes is great. Lots of applications to facilitate research workflows. Growing demand for research that would benefit from Kubernetes.
  • 51. Providers ● Offer Kubernetes for people to consume ● Get involved with the Kube community ● Learn as much as you can ● Provide outreach to researchers and anyone that might need to be ramped up
  • 52. Researchers ● Engage with research institutions ● Get involved with the Kube community ● Learn as much as you can ● Provide outreach to researchers and anyone that might need to be ramped up
  • 53. Useful Links ● CNCF Academic Mailing List ● CNCF Academic Slack (#academia) ● Batch Jobs Channel (#kubernetes-batch-jobs) ● Kubernetes Big Data User Group ● Kubernetes Machine Learning Working Group
  • 54. Credits and Thanks ● ATLAS images were sourced from the CERN document server: https://cds.cern.ch/ ● VISPA website: https://www.uvic.ca/science/physics/vispa/research/projects/atlas/ ● Compute Canada usage information: https://www.computecanada.ca