
Orchestration for the rest of us


Orchestration, resource scheduling... What does that mean? Is this only relevant for data centers with thousands of nodes? Should I care about Mesos, Kubernetes, Swarm, when all I have is a handful of virtual machines? The motto of public cloud IAAS is "pay for what you use," so in theory, if I deploy my apps there, I'm already getting the best "resource utilization" aka "bang for my buck," right? In this talk, we will answer those questions, and a few more. We will define orchestration, scheduling, and other terms, and show what it's like to use a scheduler to run containerized applications.



  1. Orchestration for the rest of us
  2. Disclaimer: I gave this talk in 2015. Since then, the landscape of container orchestration has changed quite a bit. While the ideas, concepts, and challenges that I mention remain valid today, take the examples with a grain of salt. (In fact, you should always take everything I say with a grain of salt, lest I become lazy and complacent.) Thank you!
  3. Who am I? French software engineer living in California. I have built and scaled the dotCloud PaaS. I know a few things about running containers (in production).
  4. Outline: What's orchestration? (And when do we need it?) What's scheduling? (And why is it hard?) Taxonomy of schedulers (depending on how they handle concurrency). Mesos in action. Swarm in action.
  5. What's orchestration?
  6. (image-only slide)
  7. Wikipedia to the rescue! Orchestration describes the automated arrangement, coordination, and management of complex computer systems, middleware, and services.
  8. Wikipedia to the rescue! Orchestration describes the automated arrangement, coordination, and management of complex computer systems, middleware, and services. [...] orchestration is often discussed in the context of service-oriented architecture, virtualization, provisioning, Converged Infrastructure and dynamic datacenter topics.
  9. Wikipedia to the rescue! Orchestration describes the automated arrangement, coordination, and management of complex computer systems, middleware, and services. [...] orchestration is often discussed in the context of service-oriented architecture, virtualization, provisioning, Converged Infrastructure and dynamic datacenter topics. Uhhh, OK, what does that mean, exactly?
  10. Example 1: dynamic cloud instances
  11. Example 1: dynamic cloud instances. Q: do we always use 100% of our servers?
  12. Example 1: dynamic cloud instances. Q: do we always use 100% of our servers? A: obviously not!
  13. Example 1: dynamic cloud instances. Every night, scale down (by shutting down extraneous replicated instances). Every morning, scale up (by deploying new copies). "Pay for what you use" (i.e. save big $$$ here).
  14. Example 1: dynamic cloud instances. How do we implement this? Crontab. Autoscaling (save even bigger $$$). That's relatively easy. Now, how are things for our IAAS provider?
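A minimal sketch of the "crontab" approach, assuming a job that cron runs every hour; set_desired_instances, the group name, and the business-hours window are all hypothetical placeholders to be replaced with your IAAS provider's real autoscaling API:

```python
# Hypothetical nightly scale-down / morning scale-up, driven by cron.
from datetime import datetime

DAY_INSTANCES = 10    # capacity during business hours (made-up numbers)
NIGHT_INSTANCES = 2   # skeleton crew at night

def set_desired_instances(group: str, count: int) -> None:
    # Placeholder: call your cloud provider's autoscaling API here.
    print(f"scaling {group} to {count} instances")

def main() -> None:
    hour = datetime.now().hour
    # Assume business hours are 08:00-20:00 local time.
    desired = DAY_INSTANCES if 8 <= hour < 20 else NIGHT_INSTANCES
    set_desired_instances("web-frontend", desired)

if __name__ == "__main__":
    main()
```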
  15. Example 2: dynamic datacenter. Q: what's the #1 cost in a datacenter?
  16. Example 2: dynamic datacenter. Q: what's the #1 cost in a datacenter? A: electricity!
  17. Example 2: dynamic datacenter. Q: what's the #1 cost in a datacenter? A: electricity! Q: what uses electricity?
  18. Example 2: dynamic datacenter. Q: what's the #1 cost in a datacenter? A: electricity! Q: what uses electricity? A: servers, obviously. A: ... and associated cooling.
  19. Example 2: dynamic datacenter. Q: what's the #1 cost in a datacenter? A: electricity! Q: what uses electricity? A: servers, obviously. A: ... and associated cooling. Q: do we always use 100% of our servers?
  20. Example 2: dynamic datacenter. Q: what's the #1 cost in a datacenter? A: electricity! Q: what uses electricity? A: servers, obviously. A: ... and associated cooling. Q: do we always use 100% of our servers? A: obviously not!
  21. Example 2: dynamic datacenter. If only we could turn off unused servers during the night... Problem: we can only turn off a server if it's totally empty! (i.e. all VMs on it are stopped or moved) Solution: migrate VMs and shut down empty servers (e.g. combine two hypervisors with 40% load into 80%+0%, and shut down the one at 0%).
  22. Example 2: dynamic datacenter. How do we implement this? Shut down empty hosts (but make sure that there is spare capacity!). Restart hosts when capacity is low. Ability to "live migrate" VMs (Xen already did this 10+ years ago). Rebalance VMs on a regular basis. What if a VM is stopped while we move it? Should we allow provisioning on hosts involved in a migration? Scheduling becomes more complex.
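The 40%+40% into 80%+0% idea from the previous slide can be sketched as a greedy consolidation pass. This toy model uses a single utilization number per hypervisor and invented host names; a real placement engine would track RAM, CPU, disk, and the cost of each live migration:

```python
def consolidate(loads: dict[str, float], ceiling: float = 0.8) -> list[str]:
    """Greedily drain the least-loaded hypervisors onto the fullest ones
    that still have room; return the hosts that end up empty (and can be
    shut down). Loads are fractions of total capacity."""
    empty: set[str] = set()
    for donor in sorted(loads, key=loads.get):              # least loaded first
        if donor in empty or loads[donor] == 0:
            continue
        candidates = [h for h in loads
                      if h != donor and h not in empty
                      and loads[h] + loads[donor] <= ceiling]
        if not candidates:
            continue
        receiver = max(candidates, key=lambda h: loads[h])  # fullest receiver
        loads[receiver] += loads[donor]                      # "live migrate" all VMs
        loads[donor] = 0.0
        empty.add(donor)
    return sorted(empty)

# Two hypervisors at 40% collapse into one at 80%, freeing one to power off:
print(consolidate({"hv1": 0.4, "hv2": 0.4, "hv3": 0.7}))   # -> ['hv1']
```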
  23. What is scheduling?
  24. Wikipedia to the rescue! (Again!) In computing, scheduling is the method by which threads, processes or data flows are given access to system resources. The scheduler is concerned mainly with: throughput (total amount of work done per time unit); turnaround time (between submission and completion); response time (between submission and start); waiting time (between job readiness and execution); fairness (appropriate times according to priorities). In practice, these goals often conflict. "Scheduling" = deciding which resources to use.
  25. Exercise 1. You have: 5 hypervisors (physical machines). Each server has: 16 GB RAM, 8 cores, 1 TB disk. Each week, your team asks for: one VM with X RAM, Y CPU, Z disk. Scheduling = deciding which hypervisor to use for each VM. Difficulty: easy!
  26. Exercise 2. You have: 1000+ hypervisors (and counting!). Each server has different resources: 8-500 GB of RAM, 4-64 cores, 1-100 TB disk. Multiple times a day, a different team asks for: up to 50 VMs with different characteristics. Scheduling = deciding which hypervisor to use for each VM. Difficulty: ???
  27. Exercise 2. You have: 1000+ hypervisors (and counting!). Each server has different resources: 8-500 GB of RAM, 4-64 cores, 1-100 TB disk. Multiple times a day, a different team asks for: up to 50 VMs with different characteristics. Scheduling = deciding which hypervisor to use for each VM.
  28. Exercise 3. You have machines (physical and/or virtual). You have containers. You are trying to put the containers on the machines. Sound familiar?
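Exercise 1 is easy precisely because a naive strategy works. Here is a first-fit sketch for a single resource (RAM); machine names and sizes are invented for illustration, and the next slides show why this stops being good enough:

```python
machines = {"node1": 16, "node2": 16, "node3": 16}   # free RAM per machine, in GB

def first_fit(request_gb: int) -> str | None:
    """Place a container/VM on the first machine with enough free RAM."""
    for name, free in machines.items():
        if free >= request_gb:
            machines[name] = free - request_gb
            return name
    return None   # nothing fits: the request has to wait (or fail)

for ram in [8, 8, 4, 12, 6]:
    print(f"{ram} GB -> {first_fit(ram)}")
```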
  29. Scheduling with one resource. Can we do better?
  30. Scheduling with one resource. Yup!
  31. Scheduling with two resources
  32. Scheduling with three resources
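With two or three resources, "does it fit?" becomes a per-dimension comparison, and "which machine should I pick?" needs a scoring heuristic. The sketch below uses a crude "least slack after placement" score (which naively mixes units); node names and sizes are invented, and real schedulers use far more elaborate strategies:

```python
free = {
    "node1": {"ram": 16, "cpu": 8, "disk": 1000},
    "node2": {"ram": 64, "cpu": 4, "disk": 2000},
}

def fits(node: dict, req: dict) -> bool:
    # A request fits only if every dimension fits.
    return all(node[r] >= req[r] for r in req)

def place(req: dict) -> str | None:
    candidates = [n for n in free if fits(free[n], req)]
    if not candidates:
        return None
    # Best fit: leave the least total headroom behind (crude: mixes GB and cores).
    best = min(candidates, key=lambda n: sum(free[n][r] - req[r] for r in req))
    for r in req:
        free[best][r] -= req[r]
    return best

print(place({"ram": 8, "cpu": 4, "disk": 500}))   # -> node1
```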
  33. You need to be good at this
  34. But also, you must be quick!
  35. And be web scale!
  36. And think outside (?) of the box!
  37. Good luck!
  38. TL;DR: Scheduling with multiple resources (dimensions) is hard. Don't expect to solve the problem with a Tiny Shell Script. There are literally tons of research papers written on this.
  39. TL;DR: Scheduling with multiple resources (dimensions) is hard. Don't expect to solve the problem with a Tiny Shell Script. There are literally tons of research papers written on this. Speaking of which...
  40. Taxonomy of schedulers (according to the famous "Omega paper")
  41. Monolithic schedulers. Concurrency model: none. All scheduling requests go through a central place. The scheduler examines requests one at a time (usually). No conflict is possible.
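A few lines are enough to caricature this model (a sketch, not any particular scheduler): one central loop, one request at a time, so decisions never conflict, but one slow decision delays everything queued behind it (head-of-line blocking):

```python
from queue import Queue

requests: Queue = Queue()          # every scheduling request goes through here

def decide(request) -> str:
    # Placeholder for the actual placement logic (bin packing, constraints...).
    return "node1"

def scheduler_loop() -> None:
    while True:
        request = requests.get()   # strictly one request at a time
        node = decide(request)     # if this is slow, everyone behind waits
        print(f"placing {request!r} on {node}")
```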
  42. Monolithic schedulers: ranking. Pros: simple to understand; no concurrency issues. Cons: SPOF (need replication + master election); prone to feature creep; head-of-line blocking (slow jobs blocking everybody); supposedly not web scale (more on this later).
  43. Monolithic schedulers: examples. One-person manual scheduling ("Hello, IT?"). Hadoop YARN. Most grid schedulers for scientific compute. Google Borg (so they kind of scale anyway...) http://static.googleusercontent.com/media/research.google.com/en/us/pubs/archive/43438.pdf
  44. "We are not sure where the ultimate scalability limit to Borg's centralized architecture will come from; so far, every time we have approached a limit, we've managed to eliminate it. A single Borgmaster can manage many thousands of machines in a cell, and several cells have arrival rates above 10,000 tasks per minute. A busy Borgmaster uses 10-14 CPU cores and up to 50 GiB RAM. We use several techniques to achieve this scale." (from the Borg paper)
  45. Two-level schedulers. Concurrency model: pessimistic. Top level: a master who holds all the resources. Second level: frameworks. To run something, you talk to a framework. The frameworks are given offers by the master (chunks of resources). A given resource is offered only once (hence "pessimistic" concurrency; no conflict can happen). If a framework needs more resources, it hoards them (i.e. keeps them, without using them, waiting for more).
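A rough sketch of the offer-and-hoard flow just described; the Framework class, offer dicts, and thresholds are invented for illustration (real Mesos frameworks speak a much richer protocol):

```python
offers = [
    {"id": 1, "node": "node1", "ram": 8},
    {"id": 2, "node": "node2", "ram": 32},
]

class Framework:
    """Wants `need_ram` GB before it can launch its task."""
    def __init__(self, name: str, need_ram: int):
        self.name, self.need_ram, self.hoard = name, need_ram, []

    def on_offer(self, offer: dict) -> None:
        self.hoard.append(offer)       # keep the offer, whether we use it or not
        if sum(o["ram"] for o in self.hoard) >= self.need_ram:
            print(self.name, "launches on", [o["node"] for o in self.hoard])
            self.hoard.clear()
        else:
            print(self.name, "hoards the offer and waits for more")

fw = Framework("marathon-ish", need_ram=24)
for offer in offers:    # the master hands each offer to exactly one framework
    fw.on_offer(offer)
```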
  46. Two-level schedulers: examples. Mesos. Frameworks correspond to different ways to consume resources: Marathon (keep something running forever), Chronos (cron-like periodic execution), Jenkins (spin up Jenkins slaves on demand), and many more.
  47. Two-level schedulers: ranking. Pros: easy to implement custom behavior; (supposedly) reduced wait times; DEM SCALES! (run multiple copies of a framework). Cons: SPOF (need replication + master election); hoarding is inefficient; well-suited for small, short-lived jobs, not so much for big, long-lived ones (increased decision time = bad!).
  48. Shared-state schedulers. Concurrency model: optimistic. A master holds the authoritative state of the whole cluster. Multiple schedulers hold a (read-only) copy of that state (and keep it in sync). You submit jobs to one of those schedulers. The scheduler does its magic and submits a transaction. The master can accept the transaction fully or partially (e.g. if another transaction caused overcommit on a specific resource: memory >100% on a machine).
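The "accept fully or partially" behavior can be illustrated with a toy optimistic commit (this is not Omega's or Flynn's actual implementation): schedulers decide against a possibly stale copy of the state, and the master only grants the claims that are still valid:

```python
cluster = {"node1": {"free_ram": 16}, "node2": {"free_ram": 16}}

def commit(transaction: list[tuple[str, int]]) -> list[tuple[str, int]]:
    """Try to claim (node, ram_gb) pairs; return only the claims accepted."""
    accepted = []
    for node, ram in transaction:
        if cluster[node]["free_ram"] >= ram:    # still true on the master?
            cluster[node]["free_ram"] -= ram
            accepted.append((node, ram))
        # else: rejected; the losing scheduler retries with fresher state
    return accepted

# Two schedulers decided concurrently from the same (now stale) snapshot:
print(commit([("node1", 12)]))                  # fully accepted
print(commit([("node1", 12), ("node2", 8)]))    # node1 claim rejected: partial commit
```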
  49. Shared-state schedulers: examples. Flynn?
  50. Shared-state schedulers: ranking. Pros: easy to implement custom behavior; reduced wait times; super duper awesome scalability. Cons: SPOF (need replication + master election); need to handle partial transactions (I think); haven't seen it in action at scale yet (but I'd be delighted to be enlightened!).
  51. Demo
  52. Thanks! Questions? @jpetazzo @docker
