SlideShare a Scribd company logo
Highly Available (HA) Kubernetes
Kubernetes — master and worker (node) components
Kubernetes is a master-slave type of architecture with certain components (master
components) calling the shots in the cluster, and other components (node
components) executing application workloads (containers) as decided by the
master components.
Kubernetes — master and worker (node) components The node components run on every cluster node.
The master components manage the state of the cluster. This includes accepting
client requests (describing the desired state), scheduling containers and running
control loops to drive the actual cluster state towards the desired state.
These components are:
apiserver: a REST API supporting basic CRUD operations on API objects (such as
pods, deployments, and services).
This is the endpoint that a cluster administrator communicates with, for example,
using kubectl.
The apiserver, in itself, is stateless. Instead it uses a distributed key-value storage
system (etcd) as its backend for storing all cluster state.
Controller managers: runs the control/reconciliation loops that watch the desired
state in the apiserver and attempt to move the actual state towards the desired
state. Internally it consists of many different controllers, of which the replication
controller is a prominent one ensuring that the right numbers of replica pods are
running for each deployment.
scheduler: takes care of pod placement across the set of available nodes, striving
to balance resource consumption
to not place excessive load on any cluster node. It also takes user scheduling
restrictions into account, such as (anti-)affinity rules.
These components are:
Container runtime: such as Docker, to execute containers on the node.
kubelet: executes containers (pods) on the node as dictated by the control plane’s
scheduling, and ensures the health of those pods (for example, by restarting failed
pods).
kube-proxy: a network proxy/LoadBalancer that implements the Service
abstraction. It programs the iptables rules on the node to redirect service IP
requests to one of its registered backend pods.
There is no formal requirement for master components to run on any particular
host in the cluster, however, typically these control-plane components are grouped
together to run one or more master nodes. With the exception of etcd,
which may be set up to run on its own machine(s), these master nodes typically
include all components — both the control plane master components and the
node components — but are dedicated to running the control plane and to not
process any application workloads (typically by being tainted).
Other nodes are typically designated as worker nodes (or simply nodes) and only
include the node components.
Achieving scalability and availability
A common requirement is for a Kubernetes cluster to both scales to accommodate increasing workloads and to be fault-tolerant, remaining available even in the
presence of failures (datacenter outages, machine failures, network partitions).
Scalability and availability for the node plane Scalability and availability for the control plane
A cluster can be scaled by adding worker nodes, which increases the workload
capacity of the cluster, giving Kubernetes more room to schedule containers.
Kubernetes is self-healing in that it keeps track of its nodes and, when a node is
deemed missing (no longer passing heartbeat status messages to the master), the
control plane is clever enough to re-schedule the pods from the missing node onto
other (still reachable) nodes. Adding more nodes to the cluster therefore also
makes the cluster more fault tolerant, as it gives Kubernetes more freedom in
rescheduling pods from failed nodes onto new nodes.
Adding nodes to a cluster is commonly carried out manually when a cluster
administrator detects that the cluster is heavily loaded or cannot fit additional
pods.
Monitoring and managing the cluster size manually is tedious. Through the use of
AutoScaling, this task can be automated. The Kubernetes cluster-autoScaler is one
such solution. At the time of writing, it only supports Kubernetes clusters running
on GCE and AWS and only provides simplistic scaling that grows the cluster
whenever a pod fails to be scheduled.
With the AutoScaling engine, more sophisticated and diverse AutoScaling schemes
can be implemented, utilizing predictive algorithms to stay ahead of demand and
scaling not only to fit the current set of pods, but also taking into account other
cluster metrics such as actual CPU/memory usage, network traffic, etc.
Furthermore, the autoScaler supports a wide range of cloud providers and can
easily be extended to include more via our cloud pool abstraction. To handle node
scale-downs gracefully, a Kubernetes-aware cloud pool proxy is placed between
the autoScaler and the chosen cloud pool.
Adding more workers does not make the cluster resilient to all sorts of failures.
For example, if the master API server goes down (for example, due to its machine
failing or a network partition cutting it off from the rest of the cluster) it will no
longer be possible to control the cluster (via kubectl).
For a true highly available cluster, we also need to replicate the control plane
components.
Such a control plane can remain reachable and functional even in the face of
failures of one or a few nodes, depending on the replication factor.
A HA control plane setup requires at least three masters to withstand the loss of
one master, since etcd needs to be able to form a quorum (a majority) to continue
operating.
The anatomy of a HA cluster setup
Given the loosely coupled nature of Kubernetes’ components, a HA cluster can be realized in many ways, But generally there are a few common guidelines:
o We need a replicated, distributed etcd storage layer.
o We need to replicate the apiserver across several machines and front them with a load-balancer.
o We need to replicate the controllers and schedulers and set them up for leader-election.
o We need to configure the (worker) nodes’ kubelet and kube-proxy to access the apiserver through the LoadBalancer.
The replication factor to use depends on the level of availability one wishes to achieve. With three sets of master components the cluster can tolerate a failure of one
master node since, in that case, etcd requires two live members to be able to form a quorum (a node majority) and continue working. This table provides the fault-
tolerance of different etcd cluster sizes.
A HA cluster with a replication factor of three could be realized as illustrated in the following schematical image:
Some further considerations:
etcd replicates the cluster state to all master nodes. Therefore, to lose all data, all
three nodes must experience simultaneous disk failures. Although unlikely, one
may want to make the storage layer even more reliable by using a separate disk
that is decoupled from the lifecycle of the machine/VM (for example, an AWS EBS
volume).
You can also use a RAID setup to mirror disks, and finally set up etcd to take
periodical backups.
The etcd replicas can be placed on separate, dedicated machines to isolate and
give them dedicated machine resources for improved performance.
The load-balancer must monitor the health of its apiservers and only forward
traffic to live servers.
Spread masters across data centers (availability zones in AWS lingo) to increase
the overall uptime of the cluster. If the masters are placed in different zones, the
cluster can tolerate the outage of an entire availability zone.
(Worker) node kubelets and kube-proxys must access the apiserver via the load-
balancer to not tie them to a particular master instance.
The LoadBalancer must not become a single point of failure. Most cloud providers
can offer fault-tolerant LoadBalancer services. For on-premise setups, one can
make use of an active/passive nginx/HAProxy setup with a virtual/ floating IP that
is re-assigned by keepalived when failover is required.
Do I need a HA control plane?
So now that we know how to build a HA Kubernetes cluster, let’s get started. But wait, not so fast. Before heading on to build a HA cluster you may need to ask yourself:
do I really need a HA cluster?
A HA solution may not be strictly necessary to run a successful Kubernetes cluster. Before deciding on a HA setup, one needs to consider the requirements and needs of
the cluster and the workloads one intends to run on it.
One needs to take into account that a HA solution involves more moving parts and, therefore, higher costs.
A single master can work quite well most of the time. Machine failures in most clouds are quite rare, as are availability zone outages.
Can you tolerate the master being unreachable occasionally? If you have a rather static workload that rarely changes, a few hours of downtime a year may be
acceptable. Furthermore, a nice thing about the loosely coupled architecture of Kubernetes is that worker nodes remain functional, running whatever they were
instructed to run, even if masters go down. Hence worker nodes, in particular if spread over different zones can probably continue delivering the application service(s)
even when the master is down.
If you do decide to run a single master, you need to ensure that the etcd data is stored reliably, preferably backed up.
When we have deployed single-master Kubernetes clusters, we typically mount two disks to the master, one to hold the etcd data directory and the second disk to hold
snapshots. In this way, a failed master node can be replaced (make sure you use a static IP which you can assign to a replacement node!) and the etcd disk can be
attached to the replacement node. Also, on disk corruption, a backup can be restored from the backup disk.
In the end it’s a balancing act where you need to trade off the need to always have a master available against the cost of keeping additional servers running.
What tools do I use?
Say that you decide that your business-level objectives requires you to run a HA setup. Then it’s time to decide how to set up your cluster.
There are a bewildering number of options available, reflecting both that Kubernetes can run “anywhere” (clouds, hardware architectures, OSes, etc) and that there are
a lot of stakeholders that want to monetize.
One may opt for a hosted solution (“Kubernetes-as-a-Service”) such as Google Container Engine or Azure Container Service. These typically reduce the administrative
burden at the expense of higher costs, reduced flexibility, and vendor lock-in.
If you want to build a HA cluster yourself, there are many different options ranging from building your own from scratch (Kubernetes the hard way may lead you in the
right direction), to cloud-specific installers (like kops for AWS), to installers suitable for clouds and/or bare-metal scenarios (like kubespray and kubeadm).
A lot of factors affect which solution is right for you. Using an available installer allows you to benefit from the experience of others, with the risk of ending up with a
“black boxed” solution that is difficult to adapt to your needs. At the other extreme, building from scratch will be a serious investment in time which will give you full
insight and control over your solution, with the risk of repeating the same mistakes as others have already made.
References:
Kubernetes Design and Architecture
Building High-Availability Clusters
Eng. / Tarek Ali - MicroServices CI/CD & AWS Consultant.

More Related Content

What's hot

Google Kubernetes Engine (GKE) deep dive
Google Kubernetes Engine (GKE) deep diveGoogle Kubernetes Engine (GKE) deep dive
Google Kubernetes Engine (GKE) deep dive
Akash Agrawal
 
Kubernetes Concepts And Architecture Powerpoint Presentation Slides
Kubernetes Concepts And Architecture Powerpoint Presentation SlidesKubernetes Concepts And Architecture Powerpoint Presentation Slides
Kubernetes Concepts And Architecture Powerpoint Presentation Slides
SlideTeam
 
Kubernetes for Beginners: An Introductory Guide
Kubernetes for Beginners: An Introductory GuideKubernetes for Beginners: An Introductory Guide
Kubernetes for Beginners: An Introductory Guide
Bytemark
 
K8s in 3h - Kubernetes Fundamentals Training
K8s in 3h - Kubernetes Fundamentals TrainingK8s in 3h - Kubernetes Fundamentals Training
K8s in 3h - Kubernetes Fundamentals Training
Piotr Perzyna
 
Quick introduction to Kubernetes
Quick introduction to KubernetesQuick introduction to Kubernetes
Quick introduction to Kubernetes
Eduardo Garcia Moyano
 
Kubernetes - introduction
Kubernetes - introductionKubernetes - introduction
Kubernetes - introduction
Sparkbit
 
Kubernetes Introduction
Kubernetes IntroductionKubernetes Introduction
Kubernetes Introduction
Martin Danielsson
 
An Introduction to Kubernetes
An Introduction to KubernetesAn Introduction to Kubernetes
An Introduction to Kubernetes
Imesh Gunaratne
 
Kubernetes Architecture | Understanding Kubernetes Components | Kubernetes Tu...
Kubernetes Architecture | Understanding Kubernetes Components | Kubernetes Tu...Kubernetes Architecture | Understanding Kubernetes Components | Kubernetes Tu...
Kubernetes Architecture | Understanding Kubernetes Components | Kubernetes Tu...
Edureka!
 
Intro to kubernetes
Intro to kubernetesIntro to kubernetes
Kubernetes Basics
Kubernetes BasicsKubernetes Basics
Kubernetes Basics
Antonin Stoklasek
 
AKS
AKSAKS
Kubernetes PPT.pptx
Kubernetes PPT.pptxKubernetes PPT.pptx
Kubernetes PPT.pptx
ssuser0cc9131
 
Kubernetes 101 for Beginners
Kubernetes 101 for BeginnersKubernetes 101 for Beginners
Kubernetes 101 for Beginners
Oktay Esgul
 
Kubernetes Architecture and Introduction
Kubernetes Architecture and IntroductionKubernetes Architecture and Introduction
Kubernetes Architecture and Introduction
Stefan Schimanski
 
Containers 101
Containers 101Containers 101
Containers 101
Black Duck by Synopsys
 
Docker and kubernetes_introduction
Docker and kubernetes_introductionDocker and kubernetes_introduction
Docker and kubernetes_introduction
Jason Hu
 
Scaling Microservices with Kubernetes
Scaling Microservices with KubernetesScaling Microservices with Kubernetes
Scaling Microservices with Kubernetes
Deivid Hahn Fração
 
DevOps with Kubernetes
DevOps with KubernetesDevOps with Kubernetes
DevOps with Kubernetes
EastBanc Tachnologies
 
Getting Started with Kubernetes
Getting Started with Kubernetes Getting Started with Kubernetes
Getting Started with Kubernetes
VMware Tanzu
 

What's hot (20)

Google Kubernetes Engine (GKE) deep dive
Google Kubernetes Engine (GKE) deep diveGoogle Kubernetes Engine (GKE) deep dive
Google Kubernetes Engine (GKE) deep dive
 
Kubernetes Concepts And Architecture Powerpoint Presentation Slides
Kubernetes Concepts And Architecture Powerpoint Presentation SlidesKubernetes Concepts And Architecture Powerpoint Presentation Slides
Kubernetes Concepts And Architecture Powerpoint Presentation Slides
 
Kubernetes for Beginners: An Introductory Guide
Kubernetes for Beginners: An Introductory GuideKubernetes for Beginners: An Introductory Guide
Kubernetes for Beginners: An Introductory Guide
 
K8s in 3h - Kubernetes Fundamentals Training
K8s in 3h - Kubernetes Fundamentals TrainingK8s in 3h - Kubernetes Fundamentals Training
K8s in 3h - Kubernetes Fundamentals Training
 
Quick introduction to Kubernetes
Quick introduction to KubernetesQuick introduction to Kubernetes
Quick introduction to Kubernetes
 
Kubernetes - introduction
Kubernetes - introductionKubernetes - introduction
Kubernetes - introduction
 
Kubernetes Introduction
Kubernetes IntroductionKubernetes Introduction
Kubernetes Introduction
 
An Introduction to Kubernetes
An Introduction to KubernetesAn Introduction to Kubernetes
An Introduction to Kubernetes
 
Kubernetes Architecture | Understanding Kubernetes Components | Kubernetes Tu...
Kubernetes Architecture | Understanding Kubernetes Components | Kubernetes Tu...Kubernetes Architecture | Understanding Kubernetes Components | Kubernetes Tu...
Kubernetes Architecture | Understanding Kubernetes Components | Kubernetes Tu...
 
Intro to kubernetes
Intro to kubernetesIntro to kubernetes
Intro to kubernetes
 
Kubernetes Basics
Kubernetes BasicsKubernetes Basics
Kubernetes Basics
 
AKS
AKSAKS
AKS
 
Kubernetes PPT.pptx
Kubernetes PPT.pptxKubernetes PPT.pptx
Kubernetes PPT.pptx
 
Kubernetes 101 for Beginners
Kubernetes 101 for BeginnersKubernetes 101 for Beginners
Kubernetes 101 for Beginners
 
Kubernetes Architecture and Introduction
Kubernetes Architecture and IntroductionKubernetes Architecture and Introduction
Kubernetes Architecture and Introduction
 
Containers 101
Containers 101Containers 101
Containers 101
 
Docker and kubernetes_introduction
Docker and kubernetes_introductionDocker and kubernetes_introduction
Docker and kubernetes_introduction
 
Scaling Microservices with Kubernetes
Scaling Microservices with KubernetesScaling Microservices with Kubernetes
Scaling Microservices with Kubernetes
 
DevOps with Kubernetes
DevOps with KubernetesDevOps with Kubernetes
DevOps with Kubernetes
 
Getting Started with Kubernetes
Getting Started with Kubernetes Getting Started with Kubernetes
Getting Started with Kubernetes
 

Similar to Highly available (ha) kubernetes

Intro to Kubernetes
Intro to KubernetesIntro to Kubernetes
Intro to Kubernetes
Joonathan Mägi
 
Kubernetes From Scratch .pdf
Kubernetes From Scratch .pdfKubernetes From Scratch .pdf
Kubernetes From Scratch .pdf
ssuser9b44c7
 
Kubernetes Architecture with Components
 Kubernetes Architecture with Components Kubernetes Architecture with Components
Kubernetes Architecture with Components
Ajeet Singh
 
Kubernetes-introduction to kubernetes for beginers.pptx
Kubernetes-introduction to kubernetes for beginers.pptxKubernetes-introduction to kubernetes for beginers.pptx
Kubernetes-introduction to kubernetes for beginers.pptx
rathnavel194
 
Newesis - Introduction to Containers
Newesis -  Introduction to ContainersNewesis -  Introduction to Containers
Newesis - Introduction to Containers
Rauno De Pasquale
 
Kubernetes
KubernetesKubernetes
Kubernetes
Srinath Reddy
 
Kubernetes acomprehensiveoverview
Kubernetes acomprehensiveoverviewKubernetes acomprehensiveoverview
Kubernetes acomprehensiveoverview
Ankit Shukla
 
Kubernetes - A Comprehensive Overview
Kubernetes - A Comprehensive OverviewKubernetes - A Comprehensive Overview
Kubernetes - A Comprehensive Overview
Bob Killen
 
Kubernetes a comprehensive overview
Kubernetes   a comprehensive overviewKubernetes   a comprehensive overview
Kubernetes a comprehensive overview
Gabriel Carro
 
Advanced Container Management and Scheduling
Advanced Container Management and SchedulingAdvanced Container Management and Scheduling
Advanced Container Management and Scheduling
Amazon Web Services
 
(Draft) Kubernetes - A Comprehensive Overview
(Draft) Kubernetes - A Comprehensive Overview(Draft) Kubernetes - A Comprehensive Overview
(Draft) Kubernetes - A Comprehensive Overview
Bob Killen
 
Kubernetes-Fundamentals.pptx
Kubernetes-Fundamentals.pptxKubernetes-Fundamentals.pptx
Kubernetes-Fundamentals.pptx
satish642065
 
Container Orchestration with Docker Swarm and Kubernetes
Container Orchestration with Docker Swarm and KubernetesContainer Orchestration with Docker Swarm and Kubernetes
Container Orchestration with Docker Swarm and Kubernetes
Will Hall
 
AKS components
AKS componentsAKS components
AKS components
Parisa Moosavinezhad
 
Container Orchestration using kubernetes
Container Orchestration using kubernetesContainer Orchestration using kubernetes
Container Orchestration using kubernetes
Puneet Kumar Bhatia (MBA, ITIL V3 Certified)
 
Containers kuberenetes
Containers kuberenetesContainers kuberenetes
Containers kuberenetes
Gayan Gunarathne
 
Containers kuberenetes
Containers kuberenetesContainers kuberenetes
Containers kuberenetes
Gayan Gunarathne
 
ARCHITECTING TENANT BASED QOS IN MULTI-TENANT CLOUD PLATFORMS
ARCHITECTING TENANT BASED QOS IN MULTI-TENANT CLOUD PLATFORMSARCHITECTING TENANT BASED QOS IN MULTI-TENANT CLOUD PLATFORMS
ARCHITECTING TENANT BASED QOS IN MULTI-TENANT CLOUD PLATFORMS
Arun prasath
 
Kubernetes
KubernetesKubernetes
Kubernetes
Mihir Shah
 
Kubernetes #1 intro
Kubernetes #1   introKubernetes #1   intro
Kubernetes #1 intro
Terry Cho
 

Similar to Highly available (ha) kubernetes (20)

Intro to Kubernetes
Intro to KubernetesIntro to Kubernetes
Intro to Kubernetes
 
Kubernetes From Scratch .pdf
Kubernetes From Scratch .pdfKubernetes From Scratch .pdf
Kubernetes From Scratch .pdf
 
Kubernetes Architecture with Components
 Kubernetes Architecture with Components Kubernetes Architecture with Components
Kubernetes Architecture with Components
 
Kubernetes-introduction to kubernetes for beginers.pptx
Kubernetes-introduction to kubernetes for beginers.pptxKubernetes-introduction to kubernetes for beginers.pptx
Kubernetes-introduction to kubernetes for beginers.pptx
 
Newesis - Introduction to Containers
Newesis -  Introduction to ContainersNewesis -  Introduction to Containers
Newesis - Introduction to Containers
 
Kubernetes
KubernetesKubernetes
Kubernetes
 
Kubernetes acomprehensiveoverview
Kubernetes acomprehensiveoverviewKubernetes acomprehensiveoverview
Kubernetes acomprehensiveoverview
 
Kubernetes - A Comprehensive Overview
Kubernetes - A Comprehensive OverviewKubernetes - A Comprehensive Overview
Kubernetes - A Comprehensive Overview
 
Kubernetes a comprehensive overview
Kubernetes   a comprehensive overviewKubernetes   a comprehensive overview
Kubernetes a comprehensive overview
 
Advanced Container Management and Scheduling
Advanced Container Management and SchedulingAdvanced Container Management and Scheduling
Advanced Container Management and Scheduling
 
(Draft) Kubernetes - A Comprehensive Overview
(Draft) Kubernetes - A Comprehensive Overview(Draft) Kubernetes - A Comprehensive Overview
(Draft) Kubernetes - A Comprehensive Overview
 
Kubernetes-Fundamentals.pptx
Kubernetes-Fundamentals.pptxKubernetes-Fundamentals.pptx
Kubernetes-Fundamentals.pptx
 
Container Orchestration with Docker Swarm and Kubernetes
Container Orchestration with Docker Swarm and KubernetesContainer Orchestration with Docker Swarm and Kubernetes
Container Orchestration with Docker Swarm and Kubernetes
 
AKS components
AKS componentsAKS components
AKS components
 
Container Orchestration using kubernetes
Container Orchestration using kubernetesContainer Orchestration using kubernetes
Container Orchestration using kubernetes
 
Containers kuberenetes
Containers kuberenetesContainers kuberenetes
Containers kuberenetes
 
Containers kuberenetes
Containers kuberenetesContainers kuberenetes
Containers kuberenetes
 
ARCHITECTING TENANT BASED QOS IN MULTI-TENANT CLOUD PLATFORMS
ARCHITECTING TENANT BASED QOS IN MULTI-TENANT CLOUD PLATFORMSARCHITECTING TENANT BASED QOS IN MULTI-TENANT CLOUD PLATFORMS
ARCHITECTING TENANT BASED QOS IN MULTI-TENANT CLOUD PLATFORMS
 
Kubernetes
KubernetesKubernetes
Kubernetes
 
Kubernetes #1 intro
Kubernetes #1   introKubernetes #1   intro
Kubernetes #1 intro
 

Recently uploaded

First Steps with Globus Compute Multi-User Endpoints
First Steps with Globus Compute Multi-User EndpointsFirst Steps with Globus Compute Multi-User Endpoints
First Steps with Globus Compute Multi-User Endpoints
Globus
 
A Comprehensive Look at Generative AI in Retail App Testing.pdf
A Comprehensive Look at Generative AI in Retail App Testing.pdfA Comprehensive Look at Generative AI in Retail App Testing.pdf
A Comprehensive Look at Generative AI in Retail App Testing.pdf
kalichargn70th171
 
Graphic Design Crash Course for beginners
Graphic Design Crash Course for beginnersGraphic Design Crash Course for beginners
Graphic Design Crash Course for beginners
e20449
 
Quarkus Hidden and Forbidden Extensions
Quarkus Hidden and Forbidden ExtensionsQuarkus Hidden and Forbidden Extensions
Quarkus Hidden and Forbidden Extensions
Max Andersen
 
Globus Compute Introduction - GlobusWorld 2024
Globus Compute Introduction - GlobusWorld 2024Globus Compute Introduction - GlobusWorld 2024
Globus Compute Introduction - GlobusWorld 2024
Globus
 
Corporate Management | Session 3 of 3 | Tendenci AMS
Corporate Management | Session 3 of 3 | Tendenci AMSCorporate Management | Session 3 of 3 | Tendenci AMS
Corporate Management | Session 3 of 3 | Tendenci AMS
Tendenci - The Open Source AMS (Association Management Software)
 
Webinar: Salesforce Document Management 2.0 - Smarter, Faster, Better
Webinar: Salesforce Document Management 2.0 - Smarter, Faster, BetterWebinar: Salesforce Document Management 2.0 - Smarter, Faster, Better
Webinar: Salesforce Document Management 2.0 - Smarter, Faster, Better
XfilesPro
 
OpenFOAM solver for Helmholtz equation, helmholtzFoam / helmholtzBubbleFoam
OpenFOAM solver for Helmholtz equation, helmholtzFoam / helmholtzBubbleFoamOpenFOAM solver for Helmholtz equation, helmholtzFoam / helmholtzBubbleFoam
OpenFOAM solver for Helmholtz equation, helmholtzFoam / helmholtzBubbleFoam
takuyayamamoto1800
 
Custom Healthcare Software for Managing Chronic Conditions and Remote Patient...
Custom Healthcare Software for Managing Chronic Conditions and Remote Patient...Custom Healthcare Software for Managing Chronic Conditions and Remote Patient...
Custom Healthcare Software for Managing Chronic Conditions and Remote Patient...
Mind IT Systems
 
AI Pilot Review: The World’s First Virtual Assistant Marketing Suite
AI Pilot Review: The World’s First Virtual Assistant Marketing SuiteAI Pilot Review: The World’s First Virtual Assistant Marketing Suite
AI Pilot Review: The World’s First Virtual Assistant Marketing Suite
Google
 
TROUBLESHOOTING 9 TYPES OF OUTOFMEMORYERROR
TROUBLESHOOTING 9 TYPES OF OUTOFMEMORYERRORTROUBLESHOOTING 9 TYPES OF OUTOFMEMORYERROR
TROUBLESHOOTING 9 TYPES OF OUTOFMEMORYERROR
Tier1 app
 
Accelerate Enterprise Software Engineering with Platformless
Accelerate Enterprise Software Engineering with PlatformlessAccelerate Enterprise Software Engineering with Platformless
Accelerate Enterprise Software Engineering with Platformless
WSO2
 
Lecture 1 Introduction to games development
Lecture 1 Introduction to games developmentLecture 1 Introduction to games development
Lecture 1 Introduction to games development
abdulrafaychaudhry
 
Vitthal Shirke Microservices Resume Montevideo
Vitthal Shirke Microservices Resume MontevideoVitthal Shirke Microservices Resume Montevideo
Vitthal Shirke Microservices Resume Montevideo
Vitthal Shirke
 
BoxLang: Review our Visionary Licenses of 2024
BoxLang: Review our Visionary Licenses of 2024BoxLang: Review our Visionary Licenses of 2024
BoxLang: Review our Visionary Licenses of 2024
Ortus Solutions, Corp
 
Cracking the code review at SpringIO 2024
Cracking the code review at SpringIO 2024Cracking the code review at SpringIO 2024
Cracking the code review at SpringIO 2024
Paco van Beckhoven
 
Innovating Inference - Remote Triggering of Large Language Models on HPC Clus...
Innovating Inference - Remote Triggering of Large Language Models on HPC Clus...Innovating Inference - Remote Triggering of Large Language Models on HPC Clus...
Innovating Inference - Remote Triggering of Large Language Models on HPC Clus...
Globus
 
Prosigns: Transforming Business with Tailored Technology Solutions
Prosigns: Transforming Business with Tailored Technology SolutionsProsigns: Transforming Business with Tailored Technology Solutions
Prosigns: Transforming Business with Tailored Technology Solutions
Prosigns
 
SOCRadar Research Team: Latest Activities of IntelBroker
SOCRadar Research Team: Latest Activities of IntelBrokerSOCRadar Research Team: Latest Activities of IntelBroker
SOCRadar Research Team: Latest Activities of IntelBroker
SOCRadar
 
Globus Connect Server Deep Dive - GlobusWorld 2024
Globus Connect Server Deep Dive - GlobusWorld 2024Globus Connect Server Deep Dive - GlobusWorld 2024
Globus Connect Server Deep Dive - GlobusWorld 2024
Globus
 

Recently uploaded (20)

First Steps with Globus Compute Multi-User Endpoints
First Steps with Globus Compute Multi-User EndpointsFirst Steps with Globus Compute Multi-User Endpoints
First Steps with Globus Compute Multi-User Endpoints
 
A Comprehensive Look at Generative AI in Retail App Testing.pdf
A Comprehensive Look at Generative AI in Retail App Testing.pdfA Comprehensive Look at Generative AI in Retail App Testing.pdf
A Comprehensive Look at Generative AI in Retail App Testing.pdf
 
Graphic Design Crash Course for beginners
Graphic Design Crash Course for beginnersGraphic Design Crash Course for beginners
Graphic Design Crash Course for beginners
 
Quarkus Hidden and Forbidden Extensions
Quarkus Hidden and Forbidden ExtensionsQuarkus Hidden and Forbidden Extensions
Quarkus Hidden and Forbidden Extensions
 
Globus Compute Introduction - GlobusWorld 2024
Globus Compute Introduction - GlobusWorld 2024Globus Compute Introduction - GlobusWorld 2024
Globus Compute Introduction - GlobusWorld 2024
 
Corporate Management | Session 3 of 3 | Tendenci AMS
Corporate Management | Session 3 of 3 | Tendenci AMSCorporate Management | Session 3 of 3 | Tendenci AMS
Corporate Management | Session 3 of 3 | Tendenci AMS
 
Webinar: Salesforce Document Management 2.0 - Smarter, Faster, Better
Webinar: Salesforce Document Management 2.0 - Smarter, Faster, BetterWebinar: Salesforce Document Management 2.0 - Smarter, Faster, Better
Webinar: Salesforce Document Management 2.0 - Smarter, Faster, Better
 
OpenFOAM solver for Helmholtz equation, helmholtzFoam / helmholtzBubbleFoam
OpenFOAM solver for Helmholtz equation, helmholtzFoam / helmholtzBubbleFoamOpenFOAM solver for Helmholtz equation, helmholtzFoam / helmholtzBubbleFoam
OpenFOAM solver for Helmholtz equation, helmholtzFoam / helmholtzBubbleFoam
 
Custom Healthcare Software for Managing Chronic Conditions and Remote Patient...
Custom Healthcare Software for Managing Chronic Conditions and Remote Patient...Custom Healthcare Software for Managing Chronic Conditions and Remote Patient...
Custom Healthcare Software for Managing Chronic Conditions and Remote Patient...
 
AI Pilot Review: The World’s First Virtual Assistant Marketing Suite
AI Pilot Review: The World’s First Virtual Assistant Marketing SuiteAI Pilot Review: The World’s First Virtual Assistant Marketing Suite
AI Pilot Review: The World’s First Virtual Assistant Marketing Suite
 
TROUBLESHOOTING 9 TYPES OF OUTOFMEMORYERROR
TROUBLESHOOTING 9 TYPES OF OUTOFMEMORYERRORTROUBLESHOOTING 9 TYPES OF OUTOFMEMORYERROR
TROUBLESHOOTING 9 TYPES OF OUTOFMEMORYERROR
 
Accelerate Enterprise Software Engineering with Platformless
Accelerate Enterprise Software Engineering with PlatformlessAccelerate Enterprise Software Engineering with Platformless
Accelerate Enterprise Software Engineering with Platformless
 
Lecture 1 Introduction to games development
Lecture 1 Introduction to games developmentLecture 1 Introduction to games development
Lecture 1 Introduction to games development
 
Vitthal Shirke Microservices Resume Montevideo
Vitthal Shirke Microservices Resume MontevideoVitthal Shirke Microservices Resume Montevideo
Vitthal Shirke Microservices Resume Montevideo
 
BoxLang: Review our Visionary Licenses of 2024
BoxLang: Review our Visionary Licenses of 2024BoxLang: Review our Visionary Licenses of 2024
BoxLang: Review our Visionary Licenses of 2024
 
Cracking the code review at SpringIO 2024
Cracking the code review at SpringIO 2024Cracking the code review at SpringIO 2024
Cracking the code review at SpringIO 2024
 
Innovating Inference - Remote Triggering of Large Language Models on HPC Clus...
Innovating Inference - Remote Triggering of Large Language Models on HPC Clus...Innovating Inference - Remote Triggering of Large Language Models on HPC Clus...
Innovating Inference - Remote Triggering of Large Language Models on HPC Clus...
 
Prosigns: Transforming Business with Tailored Technology Solutions
Prosigns: Transforming Business with Tailored Technology SolutionsProsigns: Transforming Business with Tailored Technology Solutions
Prosigns: Transforming Business with Tailored Technology Solutions
 
SOCRadar Research Team: Latest Activities of IntelBroker
SOCRadar Research Team: Latest Activities of IntelBrokerSOCRadar Research Team: Latest Activities of IntelBroker
SOCRadar Research Team: Latest Activities of IntelBroker
 
Globus Connect Server Deep Dive - GlobusWorld 2024
Globus Connect Server Deep Dive - GlobusWorld 2024Globus Connect Server Deep Dive - GlobusWorld 2024
Globus Connect Server Deep Dive - GlobusWorld 2024
 

Highly available (ha) kubernetes

  • 1. Highly Available (HA) Kubernetes Kubernetes — master and worker (node) components Kubernetes is a master-slave type of architecture with certain components (master components) calling the shots in the cluster, and other components (node components) executing application workloads (containers) as decided by the master components. Kubernetes — master and worker (node) components The node components run on every cluster node. The master components manage the state of the cluster. This includes accepting client requests (describing the desired state), scheduling containers and running control loops to drive the actual cluster state towards the desired state. These components are: apiserver: a REST API supporting basic CRUD operations on API objects (such as pods, deployments, and services). This is the endpoint that a cluster administrator communicates with, for example, using kubectl. The apiserver, in itself, is stateless. Instead it uses a distributed key-value storage system (etcd) as its backend for storing all cluster state. Controller managers: runs the control/reconciliation loops that watch the desired state in the apiserver and attempt to move the actual state towards the desired state. Internally it consists of many different controllers, of which the replication controller is a prominent one ensuring that the right numbers of replica pods are running for each deployment. scheduler: takes care of pod placement across the set of available nodes, striving to balance resource consumption to not place excessive load on any cluster node. It also takes user scheduling restrictions into account, such as (anti-)affinity rules. These components are: Container runtime: such as Docker, to execute containers on the node. kubelet: executes containers (pods) on the node as dictated by the control plane’s scheduling, and ensures the health of those pods (for example, by restarting failed pods). kube-proxy: a network proxy/LoadBalancer that implements the Service abstraction. It programs the iptables rules on the node to redirect service IP requests to one of its registered backend pods. There is no formal requirement for master components to run on any particular host in the cluster, however, typically these control-plane components are grouped together to run one or more master nodes. With the exception of etcd, which may be set up to run on its own machine(s), these master nodes typically include all components — both the control plane master components and the node components — but are dedicated to running the control plane and to not process any application workloads (typically by being tainted). Other nodes are typically designated as worker nodes (or simply nodes) and only include the node components.
  • 2. Achieving scalability and availability A common requirement is for a Kubernetes cluster to both scales to accommodate increasing workloads and to be fault-tolerant, remaining available even in the presence of failures (datacenter outages, machine failures, network partitions). Scalability and availability for the node plane Scalability and availability for the control plane A cluster can be scaled by adding worker nodes, which increases the workload capacity of the cluster, giving Kubernetes more room to schedule containers. Kubernetes is self-healing in that it keeps track of its nodes and, when a node is deemed missing (no longer passing heartbeat status messages to the master), the control plane is clever enough to re-schedule the pods from the missing node onto other (still reachable) nodes. Adding more nodes to the cluster therefore also makes the cluster more fault tolerant, as it gives Kubernetes more freedom in rescheduling pods from failed nodes onto new nodes. Adding nodes to a cluster is commonly carried out manually when a cluster administrator detects that the cluster is heavily loaded or cannot fit additional pods. Monitoring and managing the cluster size manually is tedious. Through the use of AutoScaling, this task can be automated. The Kubernetes cluster-autoScaler is one such solution. At the time of writing, it only supports Kubernetes clusters running on GCE and AWS and only provides simplistic scaling that grows the cluster whenever a pod fails to be scheduled. With the AutoScaling engine, more sophisticated and diverse AutoScaling schemes can be implemented, utilizing predictive algorithms to stay ahead of demand and scaling not only to fit the current set of pods, but also taking into account other cluster metrics such as actual CPU/memory usage, network traffic, etc. Furthermore, the autoScaler supports a wide range of cloud providers and can easily be extended to include more via our cloud pool abstraction. To handle node scale-downs gracefully, a Kubernetes-aware cloud pool proxy is placed between the autoScaler and the chosen cloud pool. Adding more workers does not make the cluster resilient to all sorts of failures. For example, if the master API server goes down (for example, due to its machine failing or a network partition cutting it off from the rest of the cluster) it will no longer be possible to control the cluster (via kubectl). For a true highly available cluster, we also need to replicate the control plane components. Such a control plane can remain reachable and functional even in the face of failures of one or a few nodes, depending on the replication factor. A HA control plane setup requires at least three masters to withstand the loss of one master, since etcd needs to be able to form a quorum (a majority) to continue operating.
  • 3. The anatomy of a HA cluster setup Given the loosely coupled nature of Kubernetes’ components, a HA cluster can be realized in many ways, But generally there are a few common guidelines: o We need a replicated, distributed etcd storage layer. o We need to replicate the apiserver across several machines and front them with a load-balancer. o We need to replicate the controllers and schedulers and set them up for leader-election. o We need to configure the (worker) nodes’ kubelet and kube-proxy to access the apiserver through the LoadBalancer. The replication factor to use depends on the level of availability one wishes to achieve. With three sets of master components the cluster can tolerate a failure of one master node since, in that case, etcd requires two live members to be able to form a quorum (a node majority) and continue working. This table provides the fault- tolerance of different etcd cluster sizes. A HA cluster with a replication factor of three could be realized as illustrated in the following schematical image: Some further considerations: etcd replicates the cluster state to all master nodes. Therefore, to lose all data, all three nodes must experience simultaneous disk failures. Although unlikely, one may want to make the storage layer even more reliable by using a separate disk that is decoupled from the lifecycle of the machine/VM (for example, an AWS EBS volume). You can also use a RAID setup to mirror disks, and finally set up etcd to take periodical backups. The etcd replicas can be placed on separate, dedicated machines to isolate and give them dedicated machine resources for improved performance. The load-balancer must monitor the health of its apiservers and only forward traffic to live servers. Spread masters across data centers (availability zones in AWS lingo) to increase the overall uptime of the cluster. If the masters are placed in different zones, the cluster can tolerate the outage of an entire availability zone. (Worker) node kubelets and kube-proxys must access the apiserver via the load- balancer to not tie them to a particular master instance. The LoadBalancer must not become a single point of failure. Most cloud providers can offer fault-tolerant LoadBalancer services. For on-premise setups, one can make use of an active/passive nginx/HAProxy setup with a virtual/ floating IP that is re-assigned by keepalived when failover is required.
  • 4. Do I need a HA control plane? So now that we know how to build a HA Kubernetes cluster, let’s get started. But wait, not so fast. Before heading on to build a HA cluster you may need to ask yourself: do I really need a HA cluster? A HA solution may not be strictly necessary to run a successful Kubernetes cluster. Before deciding on a HA setup, one needs to consider the requirements and needs of the cluster and the workloads one intends to run on it. One needs to take into account that a HA solution involves more moving parts and, therefore, higher costs. A single master can work quite well most of the time. Machine failures in most clouds are quite rare, as are availability zone outages. Can you tolerate the master being unreachable occasionally? If you have a rather static workload that rarely changes, a few hours of downtime a year may be acceptable. Furthermore, a nice thing about the loosely coupled architecture of Kubernetes is that worker nodes remain functional, running whatever they were instructed to run, even if masters go down. Hence worker nodes, in particular if spread over different zones can probably continue delivering the application service(s) even when the master is down. If you do decide to run a single master, you need to ensure that the etcd data is stored reliably, preferably backed up. When we have deployed single-master Kubernetes clusters, we typically mount two disks to the master, one to hold the etcd data directory and the second disk to hold snapshots. In this way, a failed master node can be replaced (make sure you use a static IP which you can assign to a replacement node!) and the etcd disk can be attached to the replacement node. Also, on disk corruption, a backup can be restored from the backup disk. In the end it’s a balancing act where you need to trade off the need to always have a master available against the cost of keeping additional servers running. What tools do I use? Say that you decide that your business-level objectives requires you to run a HA setup. Then it’s time to decide how to set up your cluster. There are a bewildering number of options available, reflecting both that Kubernetes can run “anywhere” (clouds, hardware architectures, OSes, etc) and that there are a lot of stakeholders that want to monetize. One may opt for a hosted solution (“Kubernetes-as-a-Service”) such as Google Container Engine or Azure Container Service. These typically reduce the administrative burden at the expense of higher costs, reduced flexibility, and vendor lock-in. If you want to build a HA cluster yourself, there are many different options ranging from building your own from scratch (Kubernetes the hard way may lead you in the right direction), to cloud-specific installers (like kops for AWS), to installers suitable for clouds and/or bare-metal scenarios (like kubespray and kubeadm). A lot of factors affect which solution is right for you. Using an available installer allows you to benefit from the experience of others, with the risk of ending up with a “black boxed” solution that is difficult to adapt to your needs. At the other extreme, building from scratch will be a serious investment in time which will give you full insight and control over your solution, with the risk of repeating the same mistakes as others have already made. References: Kubernetes Design and Architecture Building High-Availability Clusters Eng. / Tarek Ali - MicroServices CI/CD & AWS Consultant.