Mesos: A State-of-the-art Container Orchestrator

Video and slides synchronized, mp3 and slide download available at URL http://bit.ly/2jk5yDt.

Jie Yu overviews Mesos and its API, which allows users to deploy stateless and stateful services. He discusses how containers are managed in Mesos and the future of container support in Mesos, and shows some of the recently built container networking and storage features. Filmed at qconsf.com.

Jie Yu is a Software Engineer at Mesosphere, leading the containerization development in Mesos. He is a PMC member of the Apache Mesos project, and one of the top committers. Before joining Mesosphere, he was a Software Engineer at Twitter.

Published in: Technology

  1. MESOS: A State-Of-The-Art Container Orchestrator. © 2016 Mesosphere, Inc. All Rights Reserved.
  2. InfoQ.com: News & Community Site
     • 750,000 unique visitors/month
     • Published in 4 languages (English, Chinese, Japanese and Brazilian Portuguese)
     • Post content from our QCon conferences
     • News 15-20 / week
     • Articles 3-4 / week
     • Presentations (videos) 12-15 / week
     • Interviews 2-3 / week
     • Books 1 / month
     Watch the video with slide synchronization on InfoQ.com! https://www.infoq.com/presentations/mesos-api
  3. Purpose of QCon
     - to empower software development by facilitating the spread of knowledge and innovation
     Strategy
     - practitioner-driven conference designed for YOU: influencers of change and innovation in your teams
     - speakers and topics driving the evolution and innovation
     - connecting and catalyzing the influencers and innovators
     Highlights
     - attended by more than 12,000 delegates since 2007
     - held in 9 cities worldwide
     Presented at QCon San Francisco, www.qconsf.com
  4. About me: Jie Yu (@jie_yu)
     ● Tech Lead at Mesosphere
     ● Mesos PMC member and committer
     ● Formerly worked at Twitter
     ● PhD from University of Michigan
     ● Worked on Mesos since 2012
     http://people.apache.org/~jieyu/
  5. Outline
     ● Mesos overview and fundamentals
     ● Why should I pick Mesos?
     ● Containerization in Mesos
  6. Mesos: A kernel for data center applications (Mesos overview and fundamentals)
     ● What does a traditional OS kernel provide?
       ○ Resource management: host cpu, memory, etc.
       ○ Programming abstractions: POSIX API (processes, threads, etc.)
       ○ Security and isolation: virtual memory, user, etc.
     ● Mesos: A kernel for data center applications
       ○ Resource management: cluster cpu, memory, etc.
       ○ Programming abstractions: Mesos API (Task, Resource, etc.)
       ○ Security and isolation: containerization
  7. Programming abstractions
     ● Key concepts
       ○ Framework
       ○ Resource/Offer
       ○ Task
       ○ Executor
     [Diagram: Framework, Master, Agent, Executors and Tasks. Labels: Offer (Resources); Task/Executor; Resources.]
  8. Case study: Marathon
     [Diagram: Master, Agent X (8 cpus, 16G mem), Marathon. Labels: Offer X: 8 cpus, 16G mem; Decline Offer.]
  9. Create a Marathon app
     [Diagram: Master, Agent X (Executor, Task), Marathon. Labels: POST /v2/apps; Offer X: 8 cpus, 16G mem; Accept Offer; LAUNCH(Task: 2 cpus, 2G mem).]
  10. Create a Marathon app
      [Diagram: Master, Agent X (Executor, Task), Marathon. Labels: TASK_RUNNING (twice); Offer X: 6 cpus, 14G mem.]
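
The offer cycle on the three Marathon slides above comes down to resource bookkeeping: an 8 cpus / 16G offer, minus a 2 cpus / 2G task, leaves 6 cpus / 14G for the next offer. The Python sketch below only illustrates that arithmetic. It is not the Mesos or Marathon API; `Resources` and `launch` are hypothetical names, and memory is tracked in whole gigabytes for simplicity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resources:
    """Hypothetical resource bundle; not a Mesos type."""
    cpus: int
    mem_gb: int

    def fits(self, other: "Resources") -> bool:
        # True when `other` can be carved out of this bundle.
        return other.cpus <= self.cpus and other.mem_gb <= self.mem_gb

    def minus(self, other: "Resources") -> "Resources":
        return Resources(self.cpus - other.cpus, self.mem_gb - other.mem_gb)


def launch(offer: Resources, task: Resources) -> Resources:
    """Accept `offer` for `task` and return what the agent can offer next."""
    if not offer.fits(task):
        raise ValueError("task does not fit in the offer; decline instead")
    return offer.minus(task)


# Agent X offers 8 cpus / 16G; the task asks for 2 cpus / 2G.
remaining = launch(Resources(cpus=8, mem_gb=16), Resources(cpus=2, mem_gb=2))
print(remaining)
```

A framework that has nothing to run would skip `launch` and decline the offer, which leaves the agent's resources unchanged.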
  11. A typical Mesos cluster
      [Diagram: three Masters with Zookeeper, a pool of Agents, and the frameworks Marathon, Spark, Kafka and Cassandra.]
  12. Mesos helps improve cluster utilization
      [Figure: two plots with time on the horizontal axis.]
  13. DC/OS vs. Mesos
      [Diagram layers: Existing Infrastructure; Mesosphere DCOS; Services & Containers]
      ● Kernel alone is not enough
      ● DC/OS: the easiest way to run Mesos
        ○ CLI/UI
        ○ Package management
        ○ Service discovery
        ○ Load balancing
        ○ Day2 ops
        ○ Security
        ○ Framework SDK
      ● Yes, it is open source!
  14. Why should I pick Mesos? (Why Mesos?)
      ● Production ready
      ● Proven scalability
      ● Highly customizable and extensible
  15. Production Ready
  16. The birth of Mesos
      ● Spring 2009, CS262B: Ben Hindman, Andy Konwinski and Matei Zaharia create “Nexus” as their CS262B class project.
      ● March 2010, Twitter tech talk: The grad students working on Mesos give a tech talk at Twitter.
      ● September 2010, Mesos published: Mesos: A Platform for Fine-Grained Resource Sharing in the Data Center is published as a technical report.
      ● December 2010, Apache incubation: Mesos enters the Apache Incubator.
  17. Widely adopted
      ● April 2013, Mesosphere: Mesosphere is formed by engineers who have been using Mesos at Twitter and AirBnB.
      ● June 2013, Mesos graduates: Mesos graduates from the Apache Incubator to become a top level project.
      ● April 2015, Apple announces J.A.R.V.I.S.: Apple announces that the Siri infrastructure now runs on Mesos, atop “thousands” of nodes.
      ● August 2015, Verizon scale demo: Verizon demonstrates launching 50,000 containers in less than 90 seconds using Mesos and Mesosphere’s Marathon scheduler.
  18. Production Mesos users
  19. Proven Scalability
  20. Twitter
      ● Largest Mesos cluster
        ○ > 30000 nodes
        ○ > 250K containers
  21. Apple
      ● Siri is powered by Mesos!
  22. Verizon
      ● 50K containers in 50 seconds
  23. Why is Mesos so scalable?
      ● Stateless master
        ○ Inspired by the GFS design
        ○ Agents hold truth about running tasks (distributed)
        ○ Master state can be reconstructed when agents register
      ● Simple, only cares about
        ○ Resource allocation and isolation
        ○ Task management
      ● Implemented in C++
        ○ Native performance
        ○ No GC issues
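
The stateless-master point above can be pictured with a toy example: a newly elected master starts with no task state and rebuilds its view as agents re-register and report what they are running. This Python sketch rests on stated assumptions and is not Mesos code. `rebuild_master_state` and the agent and task identifiers are hypothetical, and a real re-registration carries far more than task ids.

```python
from collections import defaultdict


def rebuild_master_state(registrations):
    """Rebuild a master's view from (agent_id, running_task_ids) pairs.

    Each pair stands in for one agent re-registering after a master
    failover and reporting the tasks it is currently running.
    """
    tasks_by_agent = defaultdict(set)
    for agent_id, task_ids in registrations:
        tasks_by_agent[agent_id].update(task_ids)
    return dict(tasks_by_agent)


# The master's view is derived entirely from what the agents report.
state = rebuild_master_state([
    ("agent-1", ["web.1", "web.2"]),
    ("agent-2", ["db.1"]),
])
print(sorted(state))
```

Because the agents hold the truth, nothing about running tasks has to be recovered from the failed master itself.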
  24. What does it mean to you?
      ● Mesos is known to scale to Twitter/Apple level
        ○ Features are easy to add; making them scalable takes time
      ● Quality assurance for free
        ○ Imagine a test environment having 30k+ nodes with real workload
      ● We take backwards compatibility seriously
        ○ We don’t want to break their production environment
  25. Highly Customizable and Extensible
  26. Why is this important?
      ● Every company’s environment is different
        ○ Scheduling
        ○ Service discovery
        ○ Container image format
        ○ Networking
        ○ Storage
        ○ Special hardware/accelerators (e.g., GPU, FPGA)
      ● Typically there is no one-size-fits-all solution
  27. Pluggable schedulers
      ● For instance, you need separate schedulers for
        ○ Long running stateless services
        ○ Cron jobs
        ○ Stateful services (e.g., database, DFS)
        ○ Batch jobs (e.g., map-reduce)
      ● Monolithic scheduler?
        “Monolithic schedulers do not make it easy to add new policies and specialized implementations, and may not scale up to the cluster sizes we are planning for.” --- From Google Omega Paper (EuroSys’13)
      Mesos frameworks == pluggable schedulers
  28. Flexible service discovery
      ● Mesos is not opinionated about service discovery
        ○ DNS based
        ○ ZK/Etcd/Chubby based (e.g., Twitter, Google, with client libraries)
        ○ Your custom way, every company is different
        ○ Mesos provides an endpoint to stream SD information
      ● DNS-based solutions do not scale well
        “Larger jobs create worse problems, and several jobs may be running at once. The variability in our DNS load had been a serious problem for Google before Chubby was introduced.” --- From Google Chubby paper (OSDI’06)
  29. Pluggable and extensible containerization
      ● Container image format
      ● Networking
      ● Storage
      ● Custom isolation
      ● Container lifecycle hooks
  30. Outline
      ● Mesos overview and fundamentals
      ● Why should I pick Mesos?
      ● Containerization in Mesos
        ○ Pluggable architecture
        ○ Container image
        ○ Container network
        ○ Container storage
        ○ Customization and extensions
        ○ Nesting container support
  31. What is Containerizer? (Containerization in Mesos)
      Containerizer
      ● Between agents and containers
      ● Launch/update/destroy containers
      ● Provide isolation between containers
      ● Report container stats and status
      [Diagram: three Mesos Masters with Zookeeper; Marathon and Cassandra frameworks; three Mesos Agents, each with a Containerizer and a Container holding an Executor and tasks T1 and T2.]
  32. Currently supported containerizers
      Docker containerizer
      ● Delegate to Docker daemon
      Mesos containerizer
      ● Using standard OS features (e.g., cgroups, namespaces)
      ● Pluggable architecture allowing customization and extension
      Note on slide: Very stable. Used in large scale production clusters
  33. Currently supported containerizers
      Docker containerizer
      ● Delegate to Docker daemon
      Mesos containerizer
      ● Using standard OS features (e.g., cgroups, namespaces)
      ● Pluggable architecture allowing customization and extension
      ● Support Docker, Appc, OCI (soon) images natively w/o dependency
      Note on slide: Very stable. Used in large scale production clusters
  34. Currently supported containerizers
      Docker containerizer
      ● Delegate to Docker daemon
      Unified containerizer
      ● Using standard OS features (e.g., cgroups, namespaces)
      ● Pluggable architecture allowing customization and extension
      ● Support Docker, Appc, OCI (soon) images natively w/o dependency
      Note on slide: Very stable. Used in large scale production clusters
  35. Unified Containerizer
      ● Pluggable architecture
      ● Container image
      ● Container network
      ● Container storage
      ● Customization and extensions
      ● Nesting container support
  36. Pluggable architecture (Unified Containerizer)
      [Diagram: the unified containerizer with a Launcher (process management), Isolators (container lifecycle hook) and a Provisioner (container image support).]
  37. Launcher
      Responsible for process management
      ● Spawn containers
      ● Kill and wait for containers
      Supported launchers:
      ● Posix launcher
      ● Linux launcher
      ● Windows launcher
  38. Isolator
      Interface for extensions during the life cycle of a container
      ● Pre-launch - prepare()
      ● Post-launch (both in parent and child context) - isolate()
      ● Termination - cleanup()
      ● Resources update - update()
      ● Resources limitation reached - watch()
      ● Agent restart and recovery - recover()
      ● Stats and status pulling - usage()
      Sufficient for most of the extensions!
  39. Isolator example: cgroups memory isolator
      Agent process:
      ● LaunchInfo = Isolator::prepare()
        ○ Create a cgroup for the container in the memory cgroup hierarchy: /sys/fs/cgroup/memory/mesos/…
        ○ Start listening for OOM event
      ● Launcher creates subprocess
      ● Isolator::isolate(pid)
        ○ Move ‘pid’ to the memory cgroup just created
      ● Signal the child to continue
      Container process:
      ● Block on pipe
      ● Invoke ‘LaunchInfo.script’
      ● Exec the executor: execve()
  40. Isolator example: cgroups memory isolator
      Agent process:
      ● Sending a new Task to the Executor changes the ‘resources’ of the Executor
      ● Isolator::update()
        ○ Change cgroup control: memory.limit_in_bytes
      ● Send Task to Executor
  41. Isolator example: cgroups memory isolator
      Agent process:
      ● Shutdown Executor or kill Task
      ● Container terminated
      ● Destroy container
      ● Isolator::cleanup()
        ○ Remove the memory cgroup associated with the container
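
The lifecycle on the three isolator slides above (prepare, isolate, update, cleanup) can be sketched in a few lines. This is an illustrative Python stand-in, not the Mesos C++ `Isolator` interface. `MemoryIsolatorSketch` is a hypothetical class that writes plain files under a temporary directory instead of a real cgroup hierarchy, so it runs without privileges, and it omits OOM listening, `watch()`, `recover()` and `usage()`.

```python
import tempfile
from pathlib import Path


class MemoryIsolatorSketch:
    """Illustrative stand-in for a cgroups memory isolator.

    It writes plain files under `root` rather than a real cgroup
    hierarchy, so it runs anywhere without privileges.
    """

    def __init__(self, root: Path):
        self.root = root

    def _cgroup(self, container_id: str) -> Path:
        return self.root / "mesos" / container_id

    def prepare(self, container_id: str) -> None:
        # Pre-launch: create the per-container cgroup directory.
        self._cgroup(container_id).mkdir(parents=True)

    def isolate(self, container_id: str, pid: int) -> None:
        # Post-launch: move the child pid into the cgroup.
        (self._cgroup(container_id) / "cgroup.procs").write_text(f"{pid}\n")

    def update(self, container_id: str, limit_bytes: int) -> None:
        # Resources changed: rewrite the memory limit control file.
        (self._cgroup(container_id) / "memory.limit_in_bytes").write_text(
            f"{limit_bytes}\n"
        )

    def cleanup(self, container_id: str) -> None:
        # Termination: remove the control files, then the cgroup itself.
        cgroup = self._cgroup(container_id)
        for control in cgroup.iterdir():
            control.unlink()
        cgroup.rmdir()


root = Path(tempfile.mkdtemp())
isolator = MemoryIsolatorSketch(root)
isolator.prepare("c1")
isolator.isolate("c1", pid=4242)
isolator.update("c1", limit_bytes=2 * 1024**3)
limit = (root / "mesos" / "c1" / "memory.limit_in_bytes").read_text().strip()
isolator.cleanup("c1")
print(limit, (root / "mesos" / "c1").exists())
```

The order of the calls mirrors the slides: the cgroup exists before the child is spawned, the pid is moved in while the child is still blocked on the pipe, and the limit is rewritten whenever the executor's resources change.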
  42. Built-in isolators
      ● Cgroups isolators: cgroups/cpu, cgroups/mem, ...
      ● Disk isolators: disk/du, disk/xfs
      ● Filesystem isolators: filesystem/posix, filesystem/linux
      ● Volume isolators: docker/volume
      ● Network isolators: network/cni, network/port_mapping
      ● GPU isolators: gpu/nvidia
      ● … and more! Need your contribution!
  43. Container image support
      Starting from 0.28, you can run your Docker container on Mesos without a Docker daemon installed!
      ● One less dependency in your stack
      ● Agent restart handled gracefully, task not affected
      ● Composes well with all existing isolators
      ● Easier to add extensions
  44. Pluggable container image format
      ● Mesos supports multiple container image formats
        ○ Docker (without docker daemon)
        ○ Appc (without rkt)
        ○ OCI (ready soon)
        ○ CVMFS (experimental)
        ○ Host filesystem with tars/jars
        ○ Your own image format!
      Note on slide: Used in large scale production clusters
  45. Provisioner
      ● Manage container images
        ○ Store: fetch and cache image layers
        ○ Backend: assemble rootfs from image layers
          ■ E.g., copy, overlayfs, bind, aufs
      ● Store can be extended
        ○ Currently supported: Docker, Appc
        ○ Plan to support: OCI (ongoing), CVMFS
        ○ Custom fetching (e.g., p2p)
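
The split between Store and Backend described above can be illustrated with a toy model: the store fetches each layer once and caches it, and a copy-style backend stacks the layers into a root filesystem. This is a Python sketch under stated assumptions, not the Mesos provisioner. `Store`, `copy_backend` and `REGISTRY` are hypothetical, a layer is modeled as a dict from path to contents rather than a tarball, and whiteout (deletion) handling is left out.

```python
from typing import Callable, Dict, List

Layer = Dict[str, str]  # path -> file contents; a toy stand-in for a layer tarball


class Store:
    """Fetches layers on first use and caches them by id."""

    def __init__(self, fetch: Callable[[str], Layer]):
        self._fetch = fetch
        self._cache: Dict[str, Layer] = {}
        self.fetch_count = 0

    def get(self, layer_id: str) -> Layer:
        if layer_id not in self._cache:
            self._cache[layer_id] = self._fetch(layer_id)
            self.fetch_count += 1
        return self._cache[layer_id]


def copy_backend(layers: List[Layer]) -> Layer:
    """Assemble a rootfs by copying layers bottom-up; upper layers win."""
    rootfs: Layer = {}
    for layer in layers:
        rootfs.update(layer)
    return rootfs


# Hypothetical registry contents, keyed by layer id.
REGISTRY = {
    "base": {"/bin/sh": "shell", "/etc/motd": "base"},
    "app": {"/app/run": "binary", "/etc/motd": "app"},
}
store = Store(fetch=lambda layer_id: dict(REGISTRY[layer_id]))

rootfs = copy_backend([store.get("base"), store.get("app")])
copy_backend([store.get("base"), store.get("app")])  # second container: cache hits
print(rootfs["/etc/motd"], store.fetch_count)
```

Swapping the `fetch` callable is the analogue of custom fetching (for example p2p), and swapping `copy_backend` is the analogue of choosing overlayfs or bind instead of copy.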
  46. Demo
  47. Container network support
      ● Support Container Network Interface (CNI) from 1.0
        ○ A spec for container networking
        ○ Supported by most network vendors
      ● Implemented as an isolator
        ○ --isolation=network/cni,...
  48. Container Network Interface (CNI)
      ● Proposed by CoreOS: https://github.com/containernetworking/cni
      ● Simple contract between container runtime and CNI plugin defined in the form of a JSON schema
        ○ CLI interface
        ○ ADD: attach to network
        ○ DEL: detach from network
      [Diagram: Mesos Agent with a Containerizer and a Container (Executor, T1, T2); a CNI Plugin with IPAM; a veth link to the Network.]
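
The ADD/DEL contract above works roughly like this: the runtime executes the plugin binary named by the network configuration's `type`, passes parameters in `CNI_*` environment variables, and writes the JSON configuration to the plugin's standard input. The Python sketch below only builds such an invocation and does not execute anything. `cni_invocation` is a hypothetical helper, and the plugin directory, network name, bridge name and subnet are made-up example values.

```python
import json


def cni_invocation(command, container_id, netns_path, network_config,
                   plugin_dir="/opt/cni/bin", ifname="eth0"):
    """Return (argv, env, stdin) for invoking a CNI plugin.

    Nothing is executed here; the result could be handed to subprocess.run.
    """
    if command not in ("ADD", "DEL"):
        raise ValueError(f"unsupported CNI command: {command}")
    # The plugin executable is looked up by the config's "type" field.
    argv = [f"{plugin_dir}/{network_config['type']}"]
    env = {
        "CNI_COMMAND": command,
        "CNI_CONTAINERID": container_id,
        "CNI_NETNS": netns_path,
        "CNI_IFNAME": ifname,
        "CNI_PATH": plugin_dir,
    }
    return argv, env, json.dumps(network_config)


# Example network configuration; the name and subnet are made up.
config = {
    "cniVersion": "0.3.1",
    "name": "example-net",
    "type": "bridge",
    "bridge": "cni0",
    "ipam": {"type": "host-local", "subnet": "192.168.0.0/16"},
}

argv, env, stdin = cni_invocation("ADD", "c1", "/var/run/netns/c1", config)
print(argv[0], env["CNI_COMMAND"])
```

A runtime would issue the same call with `"DEL"` when the container is destroyed, which is why the network namespace handle has to stay available until cleanup.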
  49. Why CNI?
      ● Simpler, with fewer dependencies than Docker CNM
      ● Backed by Kubernetes community as well
      ● Rich plugins from network vendors
      ● Clear separation between container and network management
      ● IPAM has its own pluggable interface
  50. CNI plugins
      Existing CNI plugins
      ● ipvlan
      ● macvlan
      ● bridge
      ● flannel
      ● calico
      ● contiv
      ● contrail
      ● weave
      ● …
      You can write your own plugin, and Mesos supports it!
  51. Container storage support
      ● Support Docker volume plugins from 1.0
        ○ Define the interface between container runtime and storage provider
        ○ https://docs.docker.com/engine/extend/plugins_volume/
      ● A variety of Docker volume plugins
        ○ Ceph
        ○ Convoy
        ○ Flocker
        ○ Glusterfs
        ○ Rexray
  52. Extensions
      Launcher
      ● Custom container processes management
      Isolator
      ● Extension to the life cycle of a container
      Provisioner
      ● New type of images
      ● Custom fetching and caching
  53. Nested container support
      ● New in Mesos 1.1
        ○ Building block for supporting Pod-like features
      ● Highlighted features
        ○ Support arbitrary levels of nesting
        ○ Re-use all existing isolators
        ○ Allow dynamic creation of nested containers
  54. Nested container support
      [Diagram: three Mesos Masters with Zookeeper; Marathon and Cassandra frameworks; three Mesos Agents, each with a Containerizer and a Container holding an Executor and tasks T1 and T2; one container also holds Nested Containers.]
  55. New Agent API for Nested Containers
      message agent::Call {
        enum Type {
          // Calls for managing nested containers
          // under an executor's container.
          LAUNCH_NESTED_CONTAINER = 14;
          WAIT_NESTED_CONTAINER = 15;
          KILL_NESTED_CONTAINER = 16;
        }
      }
  56. Launch nested container
      [Diagram: Executor in a Container; Mesos Agent with Containerizer. Label: LAUNCH; nested container: Nginx.]
  57. Watch nested container
      [Diagram: Executor in a Container; Mesos Agent with Containerizer. Label: WAIT; nested container: Nginx, Exit Status = 0.]
  58. Arbitrary levels of nesting
      [Diagram: Executor and Nginx in a Container; Mesos Agent with Containerizer. Label: LAUNCH; nested container: Debug.]
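
The three call types above can be exercised by building request bodies for the agent. The Python sketch below only constructs JSON and sends nothing. The call type names come from the slide; everything else is an assumption for illustration: `container_id` and `nested_container_call` are hypothetical helpers, and the payload field names (`container_id`, `parent`, `value`, `command`) and the convention that the payload key is the lowercased call type reflect my reading of the Mesos v1 agent API and should be checked against its documentation.

```python
import json


def container_id(*path):
    """Build a nested container id from outermost to innermost name."""
    if not path:
        raise ValueError("at least one container name is required")
    node = None
    for name in path:
        node = {"value": name} if node is None else {"value": name, "parent": node}
    return node


def nested_container_call(call_type, cid, command=None):
    """Build a JSON body for one of the nested-container agent calls."""
    allowed = {
        "LAUNCH_NESTED_CONTAINER",
        "WAIT_NESTED_CONTAINER",
        "KILL_NESTED_CONTAINER",
    }
    if call_type not in allowed:
        raise ValueError(f"unknown call type: {call_type}")
    payload = {"container_id": cid}
    if call_type == "LAUNCH_NESTED_CONTAINER":
        if command is None:
            raise ValueError("launch requires a command")
        payload["command"] = {"value": command}
    # Assumed convention: the payload key is the lowercased call type.
    return json.dumps({"type": call_type, call_type.lower(): payload})


# Launch Nginx under the executor's container, then a Debug container under Nginx.
launch_nginx = nested_container_call(
    "LAUNCH_NESTED_CONTAINER", container_id("executor", "nginx"), command="nginx"
)
launch_debug = nested_container_call(
    "LAUNCH_NESTED_CONTAINER", container_id("executor", "nginx", "debug"), command="sh"
)
wait_nginx = nested_container_call(
    "WAIT_NESTED_CONTAINER", container_id("executor", "nginx")
)
print(launch_debug)
```

Chaining `parent` references is what makes arbitrary nesting possible: the Debug container names Nginx as its parent, and Nginx names the executor's container.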
  59. Demo
  60. Summary
      ● Mesos: state of the art container orchestrator
        ○ Production ready
        ○ Proven scalability
        ○ Highly customizable and extensible
      ● Containerization in Mesos
        ○ Pluggable architecture
        ○ Native support for Docker/Appc images (w/o Docker daemon or rkt)
        ○ Container network: CNI
        ○ Container storage: DVD (Docker volume driver)
        ○ Nested container support
  61. Questions?
  62. CNI support using an isolator
      Agent process:
      ● LaunchInfo = Isolator::prepare()
        ○ Tell the launcher to create the child process in a new NET, UTS and MNT namespace
      ● Launcher creates subprocess
      ● Isolator::isolate(pid)
        ○ Bind mount the NET namespace to keep it open
        ○ Invoke ‘ADD’ of the CNI plugin with the NET namespace associated with pid
        ○ Set up network related /etc/xx files for the container
      ● Signal the child to continue
      Container process:
      ● Block on pipe
      ● Invoke ‘LaunchInfo.script’
      ● Exec the executor: execve()
  63. CNI support using an isolator
      Agent process:
      ● Shutdown Executor or kill Task
      ● Container terminated
      ● Destroy container
      ● Isolator::cleanup()
        ○ Invoke ‘DEL’ of the CNI plugin with the NET namespace handle
        ○ Unmount and remove the NET namespace handle
  64. Mesos, as one of the most powerful container orchestrators, greatly simplifies the deployment, provisioning and execution of containerized workloads. It automates the distribution of preprovisioned container images, injection of configuration, scheduling onto machines, life-cycle management, and monitoring of applications, microservices, and jobs in the cloud. In this talk, Jie Yu will first give an overview of Mesos and its powerful API, which allows users to easily deploy their stateless and stateful services. Then, Jie will talk about how containers are managed in Mesos. In particular, Jie will provide a deep dive into the unified containerizer, which was first introduced in Mesos 1.0. Jie will show some of the recently built container networking and storage features, and how they benefit from the pluggable and extensible architecture of the unified containerizer. Finally, Jie will discuss the future of container support in Mesos.