
Consolidating Infrastructure with Azure Kubernetes Service - MS Online Tech Forum

DevOps Architect at Microsoft
Apr. 16, 2020

Consolidating Infrastructure with Azure Kubernetes Service - MS Online Tech Forum

  1. Davide Benvegnu Consolidating Infrastructure with Azure Kubernetes Service Microsoft Online Tech Forum
  2. Davide Benvegnu DevOps Architect Azure DevOps Customer Advisory Team • Microsoft Certified Professional • Microsoft Certified Azure Solution Architect Expert • Microsoft MVP in VSALM - 3 years • Microsoft Event Speaker – Gold (2018 and 2019) • MMA fighter
  3. Agenda
      App Intro - Tailwind Traders
      AKS architecture - Introduction to Kubernetes and components
      Scale - Scale your applications in and out
      Network & Security - Pod identity and Calico network policies
      Handling Failures - Cluster and Application Error Management
  4. App Intro Tailwind Traders
  5. Tailwind Traders components
  6. Management's Ask of Us: Resiliency, Security, Flexibility, Scale
  7. Why Kubernetes? Standardized API for infrastructure abstractions, self-healing, scalability, extensibility
  8. AKS Architecture
  9. Kubernetes Architecture (diagram): the master node hosts the Kubernetes control plane - the API server, the controller-manager (replication, namespaces, service accounts, etc.), the scheduler, and etcd. Each worker node runs kubelet, kube-proxy, and Docker, hosting pods and their containers. Traffic reaches the cluster from the internet.
  10. AKS Architecture (diagram): the Azure-managed control plane (self-managed master nodes running the API server, scheduler, controller manager, etcd store, and cloud controller) exposes the Kubernetes API endpoint. The user submits app/workload definitions, and pods are scheduled over a private tunnel onto the customer's VMs, which run Docker and host the pods.
  11. AKS Architecture - Networking (diagram): the Kubernetes cluster lives in an Azure VNET. Traffic flows from App Gateway through an internal load balancer to an ingress controller, and on to the pods running on the worker nodes (each with kubelet); External DNS and the control plane sit alongside the namespace.
  12. AKS Architecture - Virtual Node (diagram): the Kubernetes control plane schedules pods both onto regular nodes and onto a virtual node backed by Azure Container Instances (ACI).
  13. AKS Architecture - Availability Zones (diagram): AKS worker nodes spread across availability zones (AZs) within a region.
  14. Create a vnet
      az network vnet create --resource-group myResGroup --name myVnet --address-prefixes 10.0.0.0/8 --subnet-name myVnetSub --subnet-prefix 10.240.0.0/16
      We also create a subnet for our cluster.
  15. Create a subnet for virtual node
      az network vnet subnet create --resource-group myResGroup --vnet-name myVnet --name VNSubnet --address-prefix 10.241.0.0/16
  16. Create a service principal
      az ad sp create-for-rbac --name mySPk8s --role Contributor
      The service principal allows us to create other cloud resources.
  17. Create a base AKS cluster
      az aks create --resource-group myResGroup --name myAKSCluster --node-count 3 --generate-ssh-keys
      A basic cluster.
  18. Create an AKS cluster (all addon flags)
      az aks create --resource-group myResGroup --name myAKSCluster --node-count 3 --service-principal <appId> --client-secret <password> --generate-ssh-keys --network-plugin azure --dns-service-ip $KUBE_DNS_IP --docker-bridge-address 172.17.0.1/16 --vnet-subnet-id <vnet id> --load-balancer-sku standard --enable-vmss --node-zones 1 2 3 --network-policy calico
  19. Add virtual node
      az aks enable-addons --resource-group myResGroup --name myAKSCluster --addons virtual-node --subnet-name VNSubnet
      Adds the virtual node addon.
  20. Get the cluster connection
      az aks get-credentials --resource-group myResGroup --name myAKSCluster --admin
      kubectl get pods
      kubectl apply -f myfile.yaml
      ...
      Retrieves the configuration and keys for connecting to the AKS cluster.
  21. Future-proof your cluster by enabling Virtual Node, CNI, and availability zones
  22. Scale
  23. Feature Request From Management Management has asked us for a new service. The service must: • Generate customer recommendations from previous orders • Have its own deployable artifact • Have a documented API to interface with existing services
  24. Solution to the new request
  25. Scaling Technologies Cluster Autoscaler
  26. Scaling Technologies Horizontal Pod Autoscaler (HPA)
  27. Virtual Node is Based on Virtual Kubelet
  28. Virtual Node supports: Linux containers, Windows containers, GPU.
      Tip: in the backend, virtual node uses Helm to deploy the binary needed to connect to ACI.
  29. Tell Your Pods to Use Virtual Node (Example.yaml)
      nodeSelector:
        beta.kubernetes.io/os: linux
        kubernetes.io/role: agent
        type: virtual-kubelet
      tolerations:
      - key: virtual-kubelet.io/provider
        operator: Equal
        value: azure
        effect: NoSchedule
      When using virtual node you need to specify it in the node selector.
  30. Demo Scaling with Virtual Node
  31. Network and Security
  32. Introduction to AKS security
  33. Introduction to Pod Identity
  34. Introduction to Pod Identity: Node Management Identity (NMI) and Managed Identity Controller (MIC)
  35. Pod Identity
  36. Network Policy Options in AKS
  37. Network Policy Options in AKS
  38. Azure Network Policy
  39. Calico Network Policy
  40. Demo Network Policies
  41. Handling Failures
  42. Availability Zones (diagram): AKS worker nodes spread across availability zones within a region.
  43. Availability Zones
      Resiliency to datacenter failures.
      Nodes are split across 3 datacenters in a region.
      Gives us fault domains to plan our deployments around.
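      To check which zone each node landed in, you can use the same command the speaker notes use later (the failure-domain label shown is the beta label current at the time):
      kubectl describe nodes | grep -e "Name:" -e "failure-domain.beta.kubernetes.io/zone"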
  44. Availability Zones are in public beta: aka.ms/AKSavailability
  45. Handling Application Failure
      Use deployments with the replica count set to the number of zones you are using.
      Use an ingress controller that is highly available.
      Understand your disk mounts in pods.
  46. Example deployment (Example.yaml) - to handle failure
      apiVersion: apps/v1
      kind: Deployment
      metadata:
        name: webapp-deployment
      spec:
        selector:
          matchLabels:
            app: webapp
        replicas: 3
        template:
          metadata:
            labels:
              app: webapp
          spec:
            containers:
            - name: webapp
              image: scottyc/webapp:latest
              ports:
              - containerPort: 3000
                hostPort: 3000
  47. In review…
      AKS architecture - Kubernetes is complex, with AKS it's easy
      Scale - Scalability is a first-class citizen
      Network & Security - Pod identity and Calico network policies FTW
      Handling Failures - Manage failures with AZs and proper settings
  48. © Copyright Microsoft Corporation. All rights reserved. Thank you

Editor's Notes

  1. In this slide we are showing the traditional Azure resources that make up Tailwind Traders: Postgres, SQL, multiple VMs, ACI (Azure Container Instances), and Blob storage.
  2. Kubernetes is complex, and running Kubernetes yourself is not the easiest of tasks without that skillset in your business. In this slide we are showing the high-level architecture of Kubernetes and all the moving parts. We want to call out the control plane (master node) and the worker nodes, but we also want the audience to understand the flow from the API server to the kubelet to the worker nodes.
  3. The main point we want to make in this slide, leading on from the last one, is that the Kubernetes control plane is managed by Azure, so the customer does not need to worry about it (greatly reduced complexity, letting your business concentrate on your application rather than cluster administration). The end user is responsible for the Kubernetes worker nodes (this means operational tasks like OS patching).
  4. In this slide we want to tie together the architecture of Kubernetes with the Tailwind Traders application. We want to articulate how traffic flows through the cluster to the pod, tying in the two slides before that covered the control plane (master nodes) and the worker nodes. For more information on Kubernetes ingress: https://kubernetes.io/docs/concepts/services-networking/ingress-controllers/ This is the first slide in which we introduce Calico network policies (https://azure.microsoft.com/en-au/blog/integrating-azure-cni-and-calico-a-technical-deep-dive/), which act as a soft network switch and set security boundaries.
  5. In this slide we need to show the audience how virtual node connects ACI to Kubernetes. Virtual node installs a Go binary on the control plane that translates the communications between ACI and Kubernetes, so Kubernetes sees ACI as a worker node in the cluster. For more reading on how the horizontal pod autoscaler works: https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/ ACI creates a container runtime on demand with the security of VM isolation. We also create a subnet on our vnet to talk to ACI, so virtual node communications are private.
  6. In this slide we want to show the audience that when we set up availability zones in AKS, the worker nodes are split across zones (separate datacenters) within an Azure region to make the cluster more highly available. This is done using an Azure load balancer; for more reading: https://docs.microsoft.com/bs-latn-ba/azure/aks/availability-zones This slide gives the audience context for the failure section.
  7. From this slide on we walk through how to create an AKS cluster with the added features we covered in the previous architecture slides: CNI, virtual node, and availability zones. The first step is to create an Azure resource group. We define the region with the -l flag and the name with the -n flag.
  8. In this slide we are adding a vnet named myVnet to our resource group, and also creating a subnet (myVnetSub) for our cluster to use.
  9. In this slide we are creating a second subnet that will be used for communication between Kubernetes and the ACI API in the cluster's region. We tie it to the resource group we created earlier, and to our vnet myVnet.
  10. In this slide we are creating a service principal that the cluster will use to create any other Azure resources it might need. Examples include a public IP for ingress, or scheduling pods on ACI for virtual node.
  11. In this slide we are showing how to create your first AKS cluster. This is meant to be an introduction to creating the cluster; by not adding any flags you get the default networking stack and will not be able to add any of the features we need for management's ask. This leads into the next slide, where we add flags to make our cluster fit the functionality that management has requested. Remember: once a cluster is created you can't add network policy or availability zones.
  12. In this slide we contrast the previous slide, which was just getting a cluster up and running in 5 minutes. Here we are looking at a production use case with business needs. In this command we create the cluster like the last slide, and:
      * Add the cluster to our resource group
      * Give the cluster a name
      * Add 3 worker nodes; this puts one node in each availability zone
      * Add the service principal and client secret so the cluster can provision other cloud resources
      * Generate a set of SSH keys to access the worker nodes
      * Specify the network plugin, Azure CNI; this is a prerequisite for creating network policies
      * Specify the DNS address the cluster will use for internal service discovery
      * Set the Docker bridge address, an IP range that has to be there for Docker to start (systemd will fail without it)
      * Add the vnet ID of the subnet we created in the earlier steps
      * Add a standard load balancer, a prerequisite for availability zones; it is used for the control plane to speak to the worker nodes
      * Enable VMSS (virtual machine scale sets, https://docs.microsoft.com/en-us/azure/virtual-machine-scale-sets/overview), which is also needed for availability zones
      * Specify the availability zones the worker nodes will go in; this ties into the node count, placing one node in each zone
      * Lastly, add the network policy plugin
  13. Now we have our cluster created, we add the virtual node addon. This allows us to burst our workloads to ACI (https://docs.microsoft.com/en-gb/azure/container-instances/). It installs a service on our cluster that talks to both Kubernetes and ACI, and represents ACI as a node to Kubernetes.
      * First we enable the addon
      * We tie the addon to the resource group that holds our cluster
      * We tie it to the cluster we just created by the cluster's name
      * The addon flag is virtual-node, as that is the addon we want to use
      * We use the subnet we created earlier to communicate with ACI over an internal connection
  14. In this slide we are emulating a feature request coming in from management. Management wants us to add a new consumer that takes customer data and collates all their orders, to see if we can better serve them when they hit the site. We are going to add a new message queue with RabbitMQ, and then have our new service take the data from the queue and do its work.
  15. Cluster autoscaler: the cluster autoscaler watches for pods that can't be scheduled on nodes because of resource constraints, and automatically increases the number of nodes. 1) The CA watches for pods in a pending state. 2) If any are found, it starts a scale-out of the cluster by adding additional nodes. 3) The CA also watches for underutilized nodes. 4) If one is found, and evicting its pods would not violate other rules, the node is drained and removed.
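      As a sketch of how this is switched on for the cluster from the earlier slides (the min/max node counts here are illustrative):
      az aks update --resource-group myResGroup --name myAKSCluster --enable-cluster-autoscaler --min-count 1 --max-count 5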
  16. Horizontal pod autoscaler: the horizontal pod autoscaler (HPA) uses the Metrics Server in a Kubernetes cluster to monitor the resource demand of pods. If a service needs more resources, the number of pods is automatically increased to meet the demand. 1) The HPA obtains resource metrics and compares them to a user-specified threshold. 2) The HPA evaluates whether the threshold is met. 3) The HPA increases or decreases the replicas based on the threshold. 4) The Deployment controller adjusts the deployment based on the change in replicas. For Tailwind Traders we are going to use the HPA with virtual node, as we want the speed of processing and the cluster autoscaler will not suit our needs. Also, with virtual node we pay per second for the usage on ACI, so we only pay when we use it.
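      A minimal HPA manifest sketch targeting the webapp deployment from the example deployment slide (the CPU threshold and replica bounds are illustrative, and the HPA needs CPU requests set on the containers):
      apiVersion: autoscaling/v1
      kind: HorizontalPodAutoscaler
      metadata:
        name: webapp-hpa
      spec:
        scaleTargetRef:
          apiVersion: apps/v1
          kind: Deployment
          name: webapp-deployment    # the deployment the HPA scales
        minReplicas: 3
        maxReplicas: 10
        targetCPUUtilizationPercentage: 50    # scale out when average CPU exceeds 50%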
  17. In the diagram we list the pod primitives virtual kubelet exposes to Kubernetes via the ACI API. This allows the user to define their pods and execute the workloads on ACI, with Kubernetes orchestrating. For more information on virtual kubelet: https://docs.microsoft.com/en-us/azure/aks/virtual-kubelet For more information on pods and PodSpecs: https://kubernetes.io/docs/concepts/workloads/pods/pod/ More reading - an explanation of a kubelet: the kubelet is the primary "node agent" that runs on each node. The kubelet works in terms of a PodSpec, a YAML or JSON object that describes a pod. The kubelet takes a set of PodSpecs that are provided through various mechanisms (primarily through the apiserver) and ensures that the containers described in those PodSpecs are running and healthy. The kubelet doesn't manage containers which were not created by Kubernetes. Other than a PodSpec from the apiserver, there are three ways that a container manifest can be provided to the kubelet. Virtual Kubelet is an open-source Kubernetes kubelet implementation that masquerades as a kubelet; this allows Kubernetes nodes to be backed by Virtual Kubelet providers such as serverless cloud container platforms.
  18. In this slide we look at the different types of pods that are supported. Virtual node supports most types of workloads.
  19. To make sure workloads are scheduled on virtual node, we have to specify a node selector in our deployments. Node selectors are used to define the node a workload will run on; by default, no workload will be scheduled on virtual node without the above code added to your YAML at deployment time. Here is a full example: https://github.com/scotty-c/kubecon-china/blob/master/demo/manifests/consumer/consumer.yaml#L26
  20. This slide introduces the full suite of security features that we offer at Azure. This is a 30,000-foot overview, and it leads into the networking and pod identity topics that we will go deeper into.
      Image and container level security: AAD-authenticated Container Registry access; ACR image scanning and content trust for image validation.
      Node and cluster level security: automatic security patching nightly; nodes deployed in a private virtual network subnet without public addresses; network policy to secure communication paths between namespaces (and nodes); Pod Security Policies; Kubernetes RBAC and AAD for authentication.
      Pod level security: pod-level control using AAD Pod Identity; Pod Security Context.
      Workload level security: Azure role-based access control (RBAC) and security policy groups; secure access to resources and services (e.g. Azure Key Vault) via Pod Identity; storage encryption; App Gateway with WAF to protect against threats and intrusions.
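      As one concrete example of the pod-level controls listed above, a pod securityContext can refuse root and lock down the filesystem (a minimal sketch; the pod name, image, and user ID are illustrative):
      apiVersion: v1
      kind: Pod
      metadata:
        name: secure-webapp
      spec:
        securityContext:
          runAsNonRoot: true             # kubelet refuses to start containers that run as root
          runAsUser: 1000                # illustrative non-root UID
        containers:
        - name: webapp
          image: scottyc/webapp:latest
          securityContext:
            allowPrivilegeEscalation: false
            readOnlyRootFilesystem: true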
  21. When pods need access to other Azure services, such as Cosmos DB, Key Vault, or Blob Storage, the pod needs access credentials. These access credentials could be defined with the container image or injected as a Kubernetes secret, but they need to be manually created and assigned. Often, the credentials are reused across pods and aren't regularly rotated. Managed identities for Azure resources (currently implemented as an associated AKS open source project) let you automatically request access to services through Azure AD. You don't manually define credentials for pods; instead, they request an access token in real time and can use it to access only their assigned services. In AKS, two components are deployed by the cluster operator to allow pods to use managed identities:
  22. These are the two components that handle the communication between AAD (Azure Active Directory) and Kubernetes to lease an MSI (Managed Service Identity).
  23. The steps of the pod identity flow are as follows: 1) The cluster operator first creates a service account that can be used to map identities when pods request access to services. 2) The NMI server and MIC are deployed to relay any pod requests for access tokens to Azure AD. 3) A developer deploys a pod with a managed identity that requests an access token through the NMI server. 4) The token is returned to the pod and used to access an Azure SQL Server instance.
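      The aad-pod-identity project models this mapping with two custom resources; a sketch assuming its v1 CRD schema (all names, the resource ID, and the selector are placeholders):
      apiVersion: aadpodidentity.k8s.io/v1
      kind: AzureIdentity
      metadata:
        name: webapp-identity
      spec:
        type: 0                          # 0 = user-assigned managed identity
        resourceID: /subscriptions/<sub-id>/resourcegroups/myResGroup/providers/Microsoft.ManagedIdentity/userAssignedIdentities/webapp-msi
        clientID: <client-id>
      ---
      apiVersion: aadpodidentity.k8s.io/v1
      kind: AzureIdentityBinding
      metadata:
        name: webapp-identity-binding
      spec:
        azureIdentity: webapp-identity
        selector: webapp                 # pods labeled aadpodidbinding: webapp get this identity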
  24. All pods in an AKS cluster can send and receive traffic without limitations by default. To improve security, you can define rules that control the flow of traffic. For example, back-end applications are often only exposed to required front-end services, or database components are only accessible to the application tiers that connect to them. Network Policy is a Kubernetes specification that defines access policies for communication between pods. Using network policies, you define an ordered set of rules to send and receive traffic, and apply them to a collection of pods that match one or more label selectors. By default the network in Kubernetes is flat; without a policy engine, all pods can speak to one another.
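      A minimal NetworkPolicy sketch in the spirit of this note, exposing a back end only to front-end pods (the labels and port are illustrative):
      apiVersion: networking.k8s.io/v1
      kind: NetworkPolicy
      metadata:
        name: backend-allow-frontend
      spec:
        podSelector:
          matchLabels:
            app: backend                 # the pods this policy protects
        ingress:
        - from:
          - podSelector:
              matchLabels:
                app: frontend            # only front-end pods may connect
          ports:
          - protocol: TCP
            port: 3000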
  25. In this slide we introduce the two options we offer for network security policy. This leads into the next two slides, where we deep-dive further into the implementation details. The main point to call out between the two is that Calico sets up kernel routes, while Azure network policy filters on the bridge. This detail will be important for highly regulated customers that go through audits. Both implementations use Linux iptables to enforce the specified policies. Policies are translated into sets of allowed and disallowed IP pairs. These pairs are then programmed as iptables filter rules.
  26. In this slide we visualize the packet flow and how all the components hang together. We have two logical sets of pods in this diagram, one labeled "test" and the other "prod"; both policy engines work on labels to define the pod grouping. Both groups of pods talk directly to the bridge; this is where the Azure policy engine plugs in and enforces iptables rules, preventing test from talking to prod. The bridge then connects to the VM NIC, which is protected by the network security group on the Azure physical network.
  27. In this slide we visualize the packet flow for Calico. The two logical sets of pods, labeled "test" and "prod", are grouped by labels as in the last slide. Here, both groups of pods use the network layer in the kernel to apply iptables and layer-3 routing, so no packets leave the kernel and make their way to the bridge; Calico plugs into the kernel to apply the policies there. The bridge then connects to the VM NIC, which is protected by the network security group on the Azure physical network.
  28. Availability Zones is a high-availability offering that protects your applications and data from datacenter failures. Zones are unique physical locations within an Azure region. Each zone is made up of one or more datacenters equipped with independent power, cooling, and networking. To ensure resiliency, there's a minimum of three separate zones in all enabled regions.
  29. Here are the main features that using availability zones gives us.
  30. AKS clusters can currently be created using availability zones in the following regions: East US 2, North Europe, Southeast Asia, West Europe, West US 2. Other limitations that need to be called out:
      * You can only enable availability zones when the cluster is created. Availability zone settings can't be updated after the cluster is created, and you can't update an existing, non-availability zone cluster to use availability zones.
      * You can't disable availability zones for an AKS cluster once it has been created.
      * The node size (VM SKU) selected must be available across all availability zones.
      * Clusters with availability zones enabled require Azure Standard Load Balancers for distribution across zones. You must use Kubernetes version 1.13.5 or greater in order to deploy Standard Load Balancers.
  31. Here are the main features that using availability zones gives us.
  32. The code example on this slide shows a deployment that has 3 replicas (highlighted in red). This is to show how we would handle failures if a pod, node, or region died. After this slide we will cut to the terminal and show the availability zones the cluster has with: kubectl describe nodes | grep -e "Name:" -e "failure-domain.beta.kubernetes.io/zone" We will then turn a node off and check that the application is still available.