
KubeCon EU 2016: A Practical Guide to Container Scheduling



Containers are at the forefront of a new wave of technology innovation, but the methods for scheduling and managing them are still new to most developers. In this talk we'll look at the kinds of problems that container scheduling solves, and at how maximising efficiency and maximising QoS don't have to be exclusive goals. We'll take a behind-the-scenes look at the Kubernetes scheduler: How does it prioritize? What about node selection and external dependencies? How do you schedule based on your own specific needs? How does it scale, and what's in it both for developers already using containers and for those that aren't? We'll use a combination of slides, code, and demos to answer all these questions and hopefully all of yours.

Sched Link: http://sched.co/6BZa



  1. Container Scheduling: A Practical Guide
  2. @tekgrrl #kubecon #kubernetes +MandyWaite
  3. [Diagram: Borg architecture — borgcfg and web browsers talk to a replicated BorgMaster (link shard, UI shard, persistent Paxos store) and its Scheduler; a Borglet runs tasks on each machine in the Cell; binaries and config files live in Cell Storage]
  4. Developer View

         job hello_world = {
           runtime = { cell = 'ic' }              // Cell (cluster) to run in
           binary = '.../hello_world_webserver'   // Program to run
           args = { port = '%port%' }             // Command-line parameters
           requirements = {                       // Resource requirements
             ram = 100M
             disk = 100M
             cpu = 0.1
           }
           replicas = 5                           // Number of tasks
         }
  5. Developer View
  6. [Image: datacenter machines, each packed with many "Hello world!" task replicas — image by Connie Zhou]
  7. Developer View — "Internally, we don't use VMs - we just use containers to pack multiple tasks onto one machine, and stop them treading on one another." - John Wilkes
  8. Developer View
  9. Failures — task-eviction rates and causes
  10. A 2000-machine service will have >10 task exits per day. This is not a problem: it's normal. (Images by Connie Zhou)
  11. Efficiency — advanced bin-packing algorithms. [Chart: experimental placement of a production VM workload, July 2014, showing stranded resources against one machine's available resources]
  12. Efficiency — [Charts: used CPU (in cores) and used memory plotted against available resources; the unusable remainder is stranded resources]
  13. Efficiency — multiple applications per machine. [Chart: median tasks per machine, from the CPI² paper, EuroSys 2013]
  14. Efficiency — Cells run both prod and non-prod (batch) tasks. [Diagram: Borg architecture as on slide 3, with batch workloads sharing the Cell with prod]
  15. Efficiency — sharing Cells between prod and non-prod is better. [Chart: a shared cell (original and compacted) compared with prod and non-prod loads each compacted into their own cells; the difference represents the overhead of running prod and non-prod in separate cells]
  16. Efficiency — resource reclamation over time: limit = amount of resource requested; reservation = estimate of future usage; usage = actual resource consumption. The gap between limit and reservation is potentially reusable resources.
  17. Efficiency — resource reclamation could be more aggressive (Nov/Dec 2013)
  18. Efficiency — resource reclamation could be more aggressive (Nov/Dec 2013, continued)
  19. Kubernetes
  20. [Diagram: Kubernetes architecture — kubectl and web browsers talk to the K8s Master (API Server, Scheduler, Controllers, etcd, Dashboard); a Kubelet on each node pulls images from a Container Registry]
  21. Kubernetes without a Scheduler — a pod manifest with no placement information is submitted to the API Server. [Diagram: K8s Master (API Server, Scheduler, Controllers, etcd, Dashboard) and nodes k8s-minion-xyz, k8s-minion-abc, k8s-minion-fig, k8s-minion-cat, each running a Kubelet]

         apiVersion: v1
         kind: Pod
         metadata:
           name: bursty-static
         spec:
           containers:
           - name: nginx
             image: nginx
             ports:
             - containerPort: 80
  22. Kubernetes without a Scheduler — setting spec.nodeName places the pod directly on the named node, bypassing the scheduler. [Diagram: the pod "poddy" lands on k8s-minion-xyz]

         apiVersion: v1
         kind: Pod
         metadata:
           name: poddy
         spec:
           nodeName: k8s-minion-xyz
           containers:
           - name: nginx
             image: nginx
             ports:
             - containerPort: 80
  23. Resources
  24. Kubernetes Resources — a Resource is something that can be requested by, allocated to, or consumed by a pod or a container. CPU is specified in units of cores (what a core corresponds to depends on the provider); memory is specified in bytes. CPU is compressible (it has a rate and can be throttled); memory is incompressible (it can't be throttled).
  25. Kubernetes Resources (contd) — future plans, more resources:
      ● Network ops
      ● Network bandwidth
      ● Storage
      ● IOPS
      ● Storage time
      ● Kubernetes Compute Unit (KCU)
  26. Resource-based Scheduling — my-controller.yaml:

         ...
         spec:
           containers:
           - name: locust
             image: gcr.io/rabbit-skateboard/guestbook:gdg-rtv
             resources:
               requests:
                 memory: "300Mi"
                 cpu: "100m"
               limits:
                 memory: "300Mi"
                 cpu: "100m"
  27. Resource-based Scheduling (work in progress) — provides QoS for scheduled pods. Per-container CPU and memory requirements are specified as a request and a limit. Future releases will [better] support:
      ● Best Effort (request == 0)
      ● Burstable (request < limit)
      ● Guaranteed (request == limit)
      Best-effort scheduling of low-priority workloads improves utilization at Google by 20%.
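The three QoS classes above can be illustrated with container resource stanzas. A sketch, with hypothetical pod names; only the shape of the `resources` section matters:

```yaml
# Guaranteed: request == limit for every resource
apiVersion: v1
kind: Pod
metadata:
  name: qos-guaranteed        # hypothetical name
spec:
  containers:
  - name: app
    image: nginx
    resources:
      requests: { cpu: "500m", memory: "200Mi" }
      limits:   { cpu: "500m", memory: "200Mi" }
---
# Burstable: request < limit; can use spare capacity up to the limit
apiVersion: v1
kind: Pod
metadata:
  name: qos-burstable         # hypothetical name
spec:
  containers:
  - name: app
    image: nginx
    resources:
      requests: { cpu: "100m", memory: "100Mi" }
      limits:   { cpu: "500m", memory: "200Mi" }
---
# Best Effort: no requests or limits (request == 0); first to be evicted
apiVersion: v1
kind: Pod
metadata:
  name: qos-besteffort        # hypothetical name
spec:
  containers:
  - name: app
    image: nginx
```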
  28. Scheduling Pods: Nodes — nodes may not be homogeneous; they can differ in important ways:
      ● CPU and memory resources
      ● Attached disks
      ● Specific hardware
      Location may also be important. [Diagram: a K8s node with its Kubelet, resources, disks, and labels, e.g. disk = ssd]
  29. Pod Scheduling: Identifying Potential Nodes — what CPU and memory resources does the pod need? These can also be used as a measure of priority.
  30. Pod Scheduling: Finding Potential Nodes — what resources does the pod need? What disk(s) does it need (GCE PD and EBS), and can it/they be mounted without conflict? Note: 1.1 limits to
  31. Pod Scheduling: Identifying Potential Nodes — what resources does it need? What disk(s) does it need? What node(s) can it run on (node selector)?

         kubectl label nodes node-3 disktype=ssd

         (pod) spec:
           nodeSelector:
             disktype: ssd
  32. nodeAffinity (alpha in 1.2) — implemented through annotations in 1.2, through fields in 1.3. Can be 'Required' or 'Preferred' during scheduling; in future it can also be 'Required' during execution (node labels can change). Will eventually replace nodeSelector. If you specify both nodeSelector and nodeAffinity, both must be satisfied.

         {
           "nodeAffinity": {
             "requiredDuringSchedulingIgnoredDuringExecution": {
               "nodeSelectorTerms": [
                 {
                   "matchExpressions": [
                     {
                       "key": "beta.kubernetes.io/instance-type",
                       "operator": "In",
                       "values": ["n1-highmem-2", "n1-highmem-4"]
                     }
                   ]
                 }
               ]
             }
           }
         }

      http://kubernetes.github.io/docs/user-guide/node-selection/
  33. Pod Scheduling: Ranking Potential Nodes [default rules]:
      ● Prefer the node with the most free resources left after the pod is deployed
      ● Prefer nodes that have the specified label
      ● Minimise the number of pods from the same service on the same node
      ● Prefer nodes where CPU and memory usage is balanced after the pod is deployed
  34. Extending the Scheduler:
      1. Add rules to the scheduler and recompile
      2. Run your own scheduler process instead of, or as well as, the Kubernetes scheduler
      3. Implement a "scheduler extender" that the Kubernetes scheduler calls out to as a final pass when making scheduling decisions
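For option 2, a pod opts in to a custom scheduler by name. A sketch, assuming a hypothetical second scheduler registered as `my-scheduler`; in the 1.2 era this was expressed as an alpha annotation (later releases moved it to a `spec.schedulerName` field):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: custom-scheduled-pod          # hypothetical name
  annotations:
    # Alpha annotation in the 1.2 era: pods without it go to the default
    # scheduler; pods carrying it are ignored by the default scheduler and
    # picked up by the named one.
    scheduler.alpha.kubernetes.io/name: my-scheduler
spec:
  containers:
  - name: nginx
    image: nginx
```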
  35. Admission Control — Admission Control (AC) enforces certain conditions before a request is accepted by the API Server. AC functionality is implemented as plugins, which are executed in the sequence they are specified. AC is performed after AuthN checks. Enforcement usually results in either:
      ● a request denial
      ● mutation of the request resource
      ● mutation of related resources
      [Diagram: K8s Master — Admission Control sits inside the API Server, alongside the Scheduler and Controllers]
  36. Admission Control Examples — default plugins in 1.2: --admission-control=NamespaceLifecycle,LimitRanger,ServiceAccount,ResourceQuota,PersistentVolumeLabel
      ● NamespaceLifecycle: enforces that a Namespace undergoing termination cannot have new objects created in it, and ensures that requests in a non-existent Namespace are rejected
      ● LimitRanger: observes the incoming request and ensures that it does not violate any of the constraints enumerated in the LimitRange object in a Namespace
      ● ServiceAccount: implements automation for serviceAccounts
      ● ResourceQuota: observes the incoming request and ensures that it does not violate any of the constraints enumerated in the ResourceQuota object in a Namespace
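To make LimitRanger concrete, a namespace can carry a LimitRange object like the sketch below (name and values are illustrative). Requests that exceed the bounds are denied, and the defaults are injected into containers that omit their own requests/limits:

```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: demo-limits            # illustrative name
  namespace: default
spec:
  limits:
  - type: Container
    default:                   # limit applied when a container sets none
      cpu: "500m"
      memory: "256Mi"
    defaultRequest:            # request applied when a container sets none
      cpu: "100m"
      memory: "128Mi"
    max:                       # upper bound a single container may use
      cpu: "2"
      memory: "1Gi"
```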
  37. Resources:
      ● Mandy's canonical K8s deck: http://bit.ly/1oRMS0r (that's one, lowercase o, R, M, S, zero, lowercase r)
      ● Setting Pod and CPU Limits
      ● Runtime Constraints Example
      ● Extending the Scheduler
      ● Resource Model Design Doc (beyond 1.1)
  38. Kubernetes is Open Source — we want your help!
      ● http://kubernetes.io
      ● https://github.com/kubernetes/kubernetes
      ● Slack: #kubernetes-users
      ● @kubernetesio
  39. cloud.google.com (images by Connie Zhou)
