Highly Available (HA) Kubernetes
Kubernetes — master and worker (node) components
Kubernetes is a master-slave type of architecture, with certain components (master
components) calling the shots in the cluster and other components (node
components) executing application workloads (containers) as decided by the
master components.
The master components manage the state of the cluster. This includes accepting
client requests (describing the desired state), scheduling containers and running
control loops to drive the actual cluster state towards the desired state.
The master components are:
apiserver: a REST API supporting basic CRUD operations on API objects (such as
pods, deployments, and services). This is the endpoint that a cluster
administrator communicates with, for example using kubectl.
The apiserver itself is stateless; it uses a distributed key-value store
(etcd) as its backend for storing all cluster state.
controller manager: runs the control/reconciliation loops that watch the desired
state in the apiserver and attempt to move the actual state towards the desired
state. Internally it consists of many different controllers, of which the replication
controller is a prominent one, ensuring that the right number of replica pods is
running for each deployment.
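This reconciliation behavior can be seen with an ordinary Deployment. In a sketch like the following (name and image are illustrative), the controller manager continuously ensures that three replicas exist, recreating any pod that disappears:

```yaml
# Illustrative Deployment: the replication machinery keeps three
# replicas of this pod running, replacing any that fail.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: web
        image: nginx:1.21
```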
scheduler: takes care of pod placement across the set of available nodes, striving
to balance resource consumption so as not to place excessive load on any cluster
node. It also takes user scheduling restrictions into account, such as
(anti-)affinity rules.
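As an illustration of such a restriction, a pod spec can carry an anti-affinity rule. The fragment below (the app label is an assumption) asks the scheduler never to co-locate two "web" replicas on the same node:

```yaml
# Illustrative pod spec fragment: spread replicas of app "web"
# across nodes via a hard anti-affinity rule on the node hostname.
affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
    - labelSelector:
        matchLabels:
          app: web
      topologyKey: kubernetes.io/hostname
```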
The node components, which run on every cluster node, are:
container runtime: such as Docker, executes containers on the node.
kubelet: executes containers (pods) on the node as dictated by the control plane’s
scheduling, and ensures the health of those pods (for example, by restarting failed
pods).
kube-proxy: a network proxy/load balancer that implements the Service
abstraction. It programs the iptables rules on the node to redirect service IP
requests to one of the service's registered backend pods.
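A minimal Service manifest sketch (name, label, and ports are illustrative) shows what kube-proxy implements: traffic to the Service's cluster IP on port 80 is redirected to a backend pod labeled app=web on port 8080:

```yaml
# Illustrative Service: kube-proxy programs iptables rules so that
# requests to this Service's cluster IP:80 reach a matching pod on 8080.
apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  selector:
    app: web
  ports:
  - port: 80
    targetPort: 8080
```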
There is no formal requirement for master components to run on any particular
host in the cluster; typically, however, these control-plane components are grouped
together on one or more master nodes. With the exception of etcd,
which may be set up to run on its own machine(s), these master nodes typically
include all components (both the control-plane master components and the
node components) but are dedicated to running the control plane and do not
process any application workloads (typically enforced by tainting them).
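Such a taint can be expressed on the Node object itself. A sketch (the node name is hypothetical), using the conventional master taint key: only pods carrying a matching toleration may be scheduled onto this node.

```yaml
# Illustrative: a tainted master node; ordinary pods without a
# matching toleration will not be scheduled here.
apiVersion: v1
kind: Node
metadata:
  name: master-1
spec:
  taints:
  - key: node-role.kubernetes.io/master
    effect: NoSchedule
```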
Other nodes are typically designated as worker nodes (or simply nodes) and only
include the node components.
Achieving scalability and availability
A common requirement is for a Kubernetes cluster to both scale to accommodate increasing workloads and remain available even in the
presence of failures (datacenter outages, machine failures, network partitions).
Scalability and availability for the node plane
A cluster can be scaled by adding worker nodes, which increases the workload
capacity of the cluster, giving Kubernetes more room to schedule containers.
Kubernetes is self-healing in that it keeps track of its nodes and, when a node is
deemed missing (no longer passing heartbeat status messages to the master), the
control plane is clever enough to re-schedule the pods from the missing node onto
other (still reachable) nodes. Adding more nodes to the cluster therefore also
makes the cluster more fault tolerant, as it gives Kubernetes more freedom in
rescheduling pods from failed nodes onto new nodes.
Adding nodes to a cluster is commonly carried out manually when a cluster
administrator detects that the cluster is heavily loaded or cannot fit additional
pods.
Monitoring and managing the cluster size manually is tedious. This task can be
automated through autoscaling. The Kubernetes cluster-autoscaler is one
such solution. At the time of writing, it only supports Kubernetes clusters running
on GCE and AWS, and only provides simplistic scaling that grows the cluster
whenever a pod fails to be scheduled.
With an autoscaling engine, more sophisticated and diverse autoscaling schemes
can be implemented, utilizing predictive algorithms to stay ahead of demand and
scaling not only to fit the current set of pods, but also taking into account other
cluster metrics such as actual CPU/memory usage, network traffic, etc.
Furthermore, the autoscaler supports a wide range of cloud providers and can
easily be extended to include more via our cloud pool abstraction. To handle node
scale-downs gracefully, a Kubernetes-aware cloud pool proxy is placed between
the autoscaler and the chosen cloud pool.
Scalability and availability for the control plane
Adding more workers does not make the cluster resilient to all sorts of failures.
For example, if the master API server goes down (for example, due to its machine
failing or a network partition cutting it off from the rest of the cluster) it will no
longer be possible to control the cluster (via kubectl).
For a true highly available cluster, we also need to replicate the control plane
components.
Such a control plane can remain reachable and functional even in the face of
failures of one or a few nodes, depending on the replication factor.
A HA control plane setup requires at least three masters to withstand the loss of
one master, since etcd needs to be able to form a quorum (a majority) to continue
operating.
The anatomy of a HA cluster setup
Given the loosely coupled nature of Kubernetes' components, a HA cluster can be realized in many ways, but there are a few common guidelines:
- We need a replicated, distributed etcd storage layer.
- We need to replicate the apiserver across several machines and front them with a load balancer.
- We need to replicate the controllers and schedulers and set them up for leader election.
- We need to configure the (worker) nodes' kubelet and kube-proxy to access the apiserver through the load balancer.
The replication factor to use depends on the level of availability one wishes to achieve. With three sets of master components, the cluster can tolerate the failure
of one master node, since the two remaining etcd members can still form a quorum (a node majority) and continue working. The fault tolerance of different etcd
cluster sizes is as follows:

Cluster size   Majority   Failure tolerance
1              1          0
3              2          1
5              3          2
7              4          3
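The quorum arithmetic behind these numbers can be sketched in a few lines of shell (the cluster sizes are chosen for illustration):

```shell
# For an etcd cluster of n members, a quorum needs floor(n/2)+1 members,
# so the cluster tolerates n - (floor(n/2)+1) member failures.
for n in 1 3 5 7; do
  quorum=$(( n / 2 + 1 ))
  tolerance=$(( n - quorum ))
  echo "size=$n quorum=$quorum tolerates=$tolerance"
done
```

This is also why even cluster sizes buy nothing: going from 3 to 4 members raises the quorum from 2 to 3 while the failure tolerance stays at 1.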
A HA cluster with a replication factor of three could be realized as illustrated in the following schematic image:
Some further considerations:
etcd replicates the cluster state to all master nodes. Therefore, to lose all data, all
three nodes must experience simultaneous disk failures. Although unlikely, one
may want to make the storage layer even more reliable by using a separate disk
that is decoupled from the lifecycle of the machine/VM (for example, an AWS EBS
volume).
You can also use a RAID setup to mirror disks, and finally set up etcd to take
periodical backups.
The etcd replicas can be placed on separate, dedicated machines to isolate them
and give them dedicated machine resources for improved performance.
The load balancer must monitor the health of its apiservers and only forward
traffic to live servers.
Spread masters across data centers (availability zones in AWS lingo) to increase
the overall uptime of the cluster. If the masters are placed in different zones, the
cluster can tolerate the outage of an entire availability zone.
(Worker) node kubelets and kube-proxies must access the apiserver via the load
balancer so they are not tied to a particular master instance.
The load balancer must not become a single point of failure. Most cloud providers
offer fault-tolerant load balancer services. For on-premise setups, one can
make use of an active/passive nginx/HAProxy setup with a virtual/floating IP that
is re-assigned by keepalived when failover is required.
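For the on-premise case, a minimal HAProxy sketch (the IP addresses and server names are assumptions) might look as follows; keepalived would hold the virtual IP 10.0.0.100 and move it to the standby HAProxy instance on failure, while the check option ensures only live apiservers receive traffic:

```
# Illustrative HAProxy config fronting three replicated apiservers.
frontend kubernetes-api
    bind 10.0.0.100:6443        # virtual IP managed by keepalived
    mode tcp
    default_backend apiservers

backend apiservers
    mode tcp
    option tcp-check            # health-check backends; skip dead ones
    balance roundrobin
    server master-1 10.0.0.11:6443 check
    server master-2 10.0.0.12:6443 check
    server master-3 10.0.0.13:6443 check
```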
Do I need a HA control plane?
So now that we know how to build a HA Kubernetes cluster, let's get started. But wait, not so fast. Before heading off to build a HA cluster, you should ask yourself:
do I really need a HA cluster?
A HA solution may not be strictly necessary to run a successful Kubernetes cluster. Before deciding on a HA setup, one needs to consider the requirements and needs of
the cluster and the workloads one intends to run on it.
One needs to take into account that a HA solution involves more moving parts and, therefore, higher costs.
A single master can work quite well most of the time. Machine failures in most clouds are quite rare, as are availability zone outages.
Can you tolerate the master being unreachable occasionally? If you have a rather static workload that rarely changes, a few hours of downtime a year may be
acceptable. Furthermore, a nice thing about the loosely coupled architecture of Kubernetes is that worker nodes remain functional, running whatever they were
instructed to run, even if the masters go down. Hence worker nodes, in particular if spread over different zones, can probably continue delivering the application
service(s) even when the master is down.
If you do decide to run a single master, you need to ensure that the etcd data is stored reliably and, preferably, backed up.
When we have deployed single-master Kubernetes clusters, we typically mount two disks to the master: one to hold the etcd data directory and a second to hold
snapshots. In this way, a failed master node can be replaced (make sure you use a static IP which you can assign to a replacement node!) and the etcd disk can be
attached to the replacement node. On disk corruption, a backup can be restored from the snapshot disk.
In the end, it's a balancing act where you need to trade off the need to always have a master available against the cost of keeping additional servers running.
What tools do I use?
Say that you decide that your business-level objectives require you to run a HA setup. Then it's time to decide how to set up your cluster.
There is a bewildering number of options available, reflecting both that Kubernetes can run "anywhere" (clouds, hardware architectures, OSes, etc.) and that there
are a lot of stakeholders that want to monetize it.
One may opt for a hosted solution (“Kubernetes-as-a-Service”) such as Google Container Engine or Azure Container Service. These typically reduce the administrative
burden at the expense of higher costs, reduced flexibility, and vendor lock-in.
If you want to build a HA cluster yourself, there are many different options ranging from building your own from scratch (Kubernetes the hard way may lead you in the
right direction), to cloud-specific installers (like kops for AWS), to installers suitable for clouds and/or bare-metal scenarios (like kubespray and kubeadm).
A lot of factors affect which solution is right for you. Using an available installer allows you to benefit from the experience of others, with the risk of ending up with a
"black box" solution that is difficult to adapt to your needs. At the other extreme, building from scratch is a serious investment of time that gives you full insight
into and control over your solution, with the risk of repeating the same mistakes others have already made.
Eng. / Tarek Ali - MicroServices CI/CD & AWS Consultant.

 
why an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfwhy an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdf
 
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
 

Highly Available (HA) Kubernetes

Kubernetes — master and worker (node) components

Kubernetes has a master-slave type of architecture: certain components (the master components) call the shots in the cluster, while other components (the node components) execute application workloads (containers) as decided by the master components.

The master components manage the state of the cluster. This includes accepting client requests (describing the desired state), scheduling containers, and running control loops that drive the actual cluster state towards the desired state. These components are:

o apiserver: a REST API supporting basic CRUD operations on API objects (such as pods, deployments, and services). This is the endpoint that a cluster administrator communicates with, for example using kubectl. The apiserver is itself stateless; it uses a distributed key-value store (etcd) as its backend for all cluster state.

o controller-manager: runs the control/reconciliation loops that watch the desired state in the apiserver and attempt to move the actual state towards it. Internally it consists of many different controllers, of which the replication controller is a prominent one, ensuring that the right number of replica pods is running for each deployment.

o scheduler: takes care of pod placement across the set of available nodes, striving to balance resource consumption so as not to place excessive load on any cluster node. It also takes user scheduling restrictions into account, such as (anti-)affinity rules.

The node components run on every cluster node. These components are:

o Container runtime: such as Docker, to execute containers on the node.

o kubelet: executes containers (pods) on the node as dictated by the control plane's scheduling, and ensures the health of those pods (for example, by restarting failed pods).
o kube-proxy: a network proxy/load balancer that implements the Service abstraction. It programs the iptables rules on the node to redirect service IP requests to one of the service's registered backend pods.

There is no formal requirement for master components to run on any particular host in the cluster. Typically, however, the control-plane components are grouped together to run on one or more master nodes. With the exception of etcd, which may be set up to run on its own machine(s), these master nodes include all components (both the control-plane master components and the node components), but are dedicated to running the control plane and do not process any application workloads (typically enforced by tainting them). The other nodes are designated as worker nodes (or simply nodes) and only include the node components.
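The tainting mentioned above can be sketched as follows. This is an illustrative sketch, not part of the original slides: the node name, pod name, and image are hypothetical, and the taint key shown is the one kubeadm-style installers have conventionally applied to masters.

```shell
# Dedicating a node to the control plane (hypothetical node name "master-0"):
#   kubectl taint nodes master-0 node-role.kubernetes.io/master=:NoSchedule
#
# Regular pods are then repelled from that node. A pod that *should* run
# there anyway (e.g. a monitoring agent) needs a matching toleration:
cat > control-plane-pod.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: control-plane-agent
spec:
  tolerations:
    - key: node-role.kubernetes.io/master
      effect: NoSchedule
  containers:
    - name: agent
      image: nginx
EOF
```

Without such a toleration, the scheduler will never place an application pod on the tainted master, which is what keeps the control plane isolated from application workloads.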
Achieving scalability and availability

A common requirement is for a Kubernetes cluster to both scale to accommodate increasing workloads and to be fault-tolerant, remaining available even in the presence of failures (datacenter outages, machine failures, network partitions).

Scalability and availability for the node plane

A cluster can be scaled by adding worker nodes, which increases the workload capacity of the cluster, giving Kubernetes more room to schedule containers. Kubernetes is self-healing: it keeps track of its nodes and, when a node is deemed missing (no longer passing heartbeat status messages to the master), the control plane re-schedules the pods from the missing node onto other (still reachable) nodes. Adding more nodes to the cluster therefore also makes the cluster more fault-tolerant, as it gives Kubernetes more freedom in rescheduling pods from failed nodes onto new nodes.

Adding nodes to a cluster is commonly carried out manually, when a cluster administrator notices that the cluster is heavily loaded or cannot fit additional pods. Monitoring and managing the cluster size manually is tedious; through the use of autoscaling, this task can be automated. The Kubernetes cluster-autoscaler is one such solution. At the time of writing, it only supports Kubernetes clusters running on GCE and AWS, and only provides simplistic scaling that grows the cluster whenever a pod fails to be scheduled. With the autoscaling engine, more sophisticated and diverse autoscaling schemes can be implemented: predictive algorithms can be used to stay ahead of demand, and scaling decisions can take into account not only the current set of pods but also other cluster metrics such as actual CPU/memory usage, network traffic, etc. Furthermore, the autoscaler supports a wide range of cloud providers and can easily be extended to include more via our cloud pool abstraction.
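The "grow the cluster whenever a pod fails to be scheduled" behavior can be illustrated with a naive sketch. The function and its capacity estimate are hypothetical simplifications; in a real cluster the pending-pod count would come from the apiserver, e.g. `kubectl get pods --all-namespaces --field-selector=status.phase=Pending`.

```shell
# Naive sketch of the scale-up decision that the cluster-autoscaler automates:
# grow the node group just enough to fit the pods that failed to schedule.
desired_nodes() {
  # args: current_nodes pending_pods pods_per_node (rough capacity estimate)
  current=$1; pending=$2; per_node=$3
  # Round up: ceil(pending / per_node) extra nodes.
  echo $(( current + (pending + per_node - 1) / per_node ))
}

desired_nodes 3 5 4   # 5 pending pods, ~4 pods per node -> grow from 3 to 5
```

A real autoscaler would also handle scale-down, cooldown periods, and per-pod resource requests rather than a flat pods-per-node estimate.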
To handle node scale-downs gracefully, a Kubernetes-aware cloud pool proxy is placed between the autoscaler and the chosen cloud pool.

Scalability and availability for the control plane

Adding more workers does not make the cluster resilient to all sorts of failures. For example, if the master apiserver goes down (due to its machine failing, or a network partition cutting it off from the rest of the cluster), it is no longer possible to control the cluster (via kubectl). For a truly highly available cluster, we also need to replicate the control-plane components. Such a control plane can remain reachable and functional even in the face of failures of one or a few nodes, depending on the replication factor. A HA control-plane setup requires at least three masters to withstand the loss of one master, since etcd needs to be able to form a quorum (a majority) to continue operating.
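The quorum arithmetic behind the "at least three masters" rule can be checked with a couple of lines of shell (an illustration, not part of the original slides):

```shell
# etcd needs a majority (quorum) of members to keep operating:
quorum()    { echo $(( $1 / 2 + 1 )); }
# ...so an n-member cluster tolerates the loss of n - quorum(n) members.
tolerance() { echo $(( $1 - ($1 / 2 + 1) )); }

for n in 1 3 5 7; do
  echo "members=$n quorum=$(quorum $n) tolerated_failures=$(tolerance $n)"
done
```

Note that a two-member cluster tolerates zero failures (quorum is still 2), which is why even cluster sizes buy no extra fault tolerance over the next-smaller odd size.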
The anatomy of a HA cluster setup

Given the loosely coupled nature of Kubernetes' components, a HA cluster can be realized in many ways, but there are a few common guidelines:

o We need a replicated, distributed etcd storage layer.
o We need to replicate the apiserver across several machines and front them with a load balancer.
o We need to replicate the controllers and schedulers and set them up for leader election.
o We need to configure the (worker) nodes' kubelet and kube-proxy to access the apiserver through the load balancer.

The replication factor to use depends on the level of availability one wishes to achieve. With three sets of master components, the cluster can tolerate the failure of one master node, since the two remaining etcd members can still form a quorum (a node majority) and continue working. The fault tolerance of different etcd cluster sizes is as follows:

  Cluster size   Majority   Failure tolerance
  1              1          0
  3              2          1
  5              3          2
  7              4          3

A HA cluster with a replication factor of three places one full set of master components (including an etcd member) on each of three master nodes, all fronted by a load balancer. Some further considerations:

o etcd replicates the cluster state to all master nodes. Therefore, to lose all data, all three nodes must experience simultaneous disk failures. Although this is unlikely, one may want to make the storage layer even more reliable by using a separate disk that is decoupled from the lifecycle of the machine/VM (for example, an AWS EBS volume), by using a RAID setup to mirror disks, and by setting up etcd to take periodical backups.
o The etcd replicas can be placed on separate, dedicated machines to isolate them and give them dedicated machine resources for improved performance.
o The load balancer must monitor the health of its apiservers and only forward traffic to live servers.
o Spread masters across data centers (availability zones in AWS lingo) to increase the overall uptime of the cluster. If the masters are placed in different zones, the cluster can tolerate the outage of an entire availability zone.
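As one concrete illustration of the load-balancer guideline: installers such as kubeadm let you point the whole cluster at the load-balanced apiserver endpoint. This is a sketch with a made-up DNS name, and the exact API version may differ between kubeadm releases:

```shell
cat > kubeadm-ha.yaml <<'EOF'
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
# All kubelets, kube-proxies and kubectl users talk to the load balancer,
# never to an individual master:
controlPlaneEndpoint: "apiserver-lb.example.com:6443"
EOF
# On the first master (requires a real machine):
#   kubeadm init --config kubeadm-ha.yaml --upload-certs
# Additional masters then join as control-plane members:
#   kubeadm join apiserver-lb.example.com:6443 --control-plane ...
```

Because every component addresses the cluster through the endpoint name rather than a specific master's IP, individual masters can fail or be replaced without reconfiguring the nodes.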
o (Worker) node kubelets and kube-proxies must access the apiserver via the load balancer, so as not to tie them to a particular master instance.
o The load balancer must not itself become a single point of failure. Most cloud providers offer fault-tolerant load balancer services. For on-premise setups, one can use an active/passive nginx/HAProxy setup with a virtual/floating IP that is re-assigned by keepalived when failover is required.
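A minimal sketch of the on-premise HAProxy approach, assuming three masters with hypothetical IP addresses; the `check` keyword makes HAProxy health-check each apiserver so traffic only reaches live ones:

```shell
cat > haproxy-apiserver.cfg <<'EOF'
# TCP pass-through to the apiservers; TLS is terminated by the apiservers
# themselves, so HAProxy never needs the cluster certificates.
frontend kubernetes-api
    bind *:6443
    mode tcp
    default_backend apiservers

backend apiservers
    mode tcp
    balance roundrobin
    server master-0 10.0.1.10:6443 check
    server master-1 10.0.2.10:6443 check
    server master-2 10.0.3.10:6443 check
EOF
```

In the active/passive setup described above, an identical HAProxy instance runs on a second machine, and keepalived moves the floating IP to it if the active one fails.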
Do I need a HA control plane?

So now that we know how to build a HA Kubernetes cluster, let's get started. But wait, not so fast. Before heading on to build a HA cluster, you should ask yourself: do I really need a HA cluster?

A HA solution may not be strictly necessary to run a successful Kubernetes cluster. Before deciding on a HA setup, consider the requirements and needs of the cluster and the workloads you intend to run on it, and take into account that a HA solution involves more moving parts and, therefore, higher costs.

A single master can work quite well most of the time. Machine failures in most clouds are quite rare, as are availability zone outages. Can you tolerate the master being unreachable occasionally? If you have a rather static workload that rarely changes, a few hours of downtime a year may be acceptable. Furthermore, a nice property of the loosely coupled architecture of Kubernetes is that worker nodes remain functional, running whatever they were instructed to run, even if the masters go down. Hence the worker nodes, in particular if spread over different zones, can probably continue delivering the application service(s) even while the master is down.

If you do decide to run a single master, you need to ensure that the etcd data is stored reliably, and preferably backed up. When we have deployed single-master Kubernetes clusters, we have typically mounted two disks to the master: one to hold the etcd data directory, and a second disk to hold snapshots. In this way, a failed master node can be replaced (make sure you use a static IP which you can assign to a replacement node!) and the etcd disk can be attached to the replacement node. Also, on disk corruption, a backup can be restored from the snapshot disk.

In the end it is a balancing act, where you need to trade off the need to always have a master available against the cost of keeping additional servers running.

What tools do I use?
Say that you decide that your business-level objectives require you to run a HA setup. Then it is time to decide how to set up your cluster. There is a bewildering number of options available, reflecting both that Kubernetes can run "anywhere" (clouds, hardware architectures, OSes, etc.) and that there are a lot of stakeholders who want to monetize on it.

One may opt for a hosted solution ("Kubernetes-as-a-Service") such as Google Container Engine or Azure Container Service. These typically reduce the administrative burden, at the expense of higher costs, reduced flexibility, and vendor lock-in.

If you want to build a HA cluster yourself, there are many different options, ranging from building your own from scratch (Kubernetes the hard way may lead you in the right direction), to cloud-specific installers (like kops for AWS), to installers suitable for both cloud and bare-metal scenarios (like kubespray and kubeadm). A lot of factors affect which solution is right for you. Using an available installer allows you to benefit from the experience of others, with the risk of ending up with a "black-boxed" solution that is difficult to adapt to your needs. At the other extreme, building from scratch is a serious investment in time which gives you full insight into, and control over, your solution, with the risk of repeating the mistakes that others have already made.

Eng. / Tarek Ali - MicroServices CI/CD & AWS Consultant.