MESOS
A State-Of-The-Art Container Orchestrator
© 2016 Mesosphere, Inc. All Rights Reserved.
Watch the video with slide synchronization on InfoQ.com!
https://www.infoq.com/presentations/mesos-api
Presented at QCon San Francisco
www.qconsf.com
About me
Jie Yu (@jie_yu)
● Tech Lead at Mesosphere
● Mesos PMC member and committer
● Formerly worked at Twitter
● PhD from University of Michigan
● Worked on Mesos since 2012
http://people.apache.org/~jieyu/
Outline
● Mesos overview and fundamentals
● Why should I pick Mesos?
● Containerization in Mesos
Mesos overview and fundamentals
Mesos: A kernel for data center applications
● What does a traditional OS kernel provide?
○ Resource management: host CPU, memory, etc.
○ Programming abstractions: POSIX API (processes, threads, etc.)
○ Security and isolation: virtual memory, users, etc.
● Mesos: A kernel for data center applications
○ Resource management: cluster CPU, memory, etc.
○ Programming abstractions: Mesos API (Task, Resource, etc.)
○ Security and isolation: containerization
Programming abstractions
● Key concepts
○ Framework
○ Resource/Offer
○ Task
○ Executor
[Diagram: the Agent reports Resources to the Master; the Master sends an Offer (Resources) to the Framework; the Framework replies with Task/Executor, which the Master forwards to the Agent, where Executors run Tasks.]
Case study: Marathon
[Diagram: the Master offers Marathon the resources of Agent X (8 cpus, 16G mem). Marathon declines the offer, so the 8 cpus, 16G mem remain available.]
Create a Marathon app
[Diagram: an app is created with POST /v2/apps. The Master offers Agent X (8 cpus, 16G mem) to Marathon; Marathon accepts the offer with LAUNCH(Task: 2 cpus, 2G mem), and an Executor on Agent X runs the Task.]
Create a Marathon app (continued)
[Diagram: Agent X reports TASK_RUNNING to the Master, which forwards TASK_RUNNING to Marathon. The next offer for Agent X carries the remainder: 6 cpus, 14G mem.]
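The offer cycle in the Marathon case study can be traced with a few lines of arithmetic. The sketch below is only an illustration of the accounting, not Mesos or Marathon code; the `Resources` class and the `respond_to_offer` helper are hypothetical names introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resources:
    """Hypothetical stand-in for a resource bundle (cpus, memory in GB)."""
    cpus: float
    mem_gb: float

    def fits(self, other: "Resources") -> bool:
        return other.cpus <= self.cpus and other.mem_gb <= self.mem_gb

    def minus(self, other: "Resources") -> "Resources":
        return Resources(self.cpus - other.cpus, self.mem_gb - other.mem_gb)


def respond_to_offer(offer, pending_tasks):
    """Accept the offer for the first pending task that fits, else decline.

    Returns (launched_task_or_None, resources_left_on_the_agent).
    """
    for task in pending_tasks:
        if offer.fits(task):
            return task, offer.minus(task)
    return None, offer


offer_x = Resources(cpus=8, mem_gb=16)

# Nothing to run yet: the offer is declined and all resources stay available.
print(respond_to_offer(offer_x, []))

# After POST /v2/apps there is one pending task of 2 cpus, 2G mem.
print(respond_to_offer(offer_x, [Resources(cpus=2, mem_gb=2)]))
```

The second call leaves 6 cpus and 14G of memory, which matches the follow-up offer shown for Agent X.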
A typical Mesos cluster
[Diagram: three Masters coordinated through Zookeeper, many Agents, and several frameworks (Marathon, Kafka, Cassandra, Spark) sharing the cluster.]
Mesos helps improve cluster utilization
[Figure: two utilization charts plotted against time.]
DC/OS vs. Mesos
[Diagram: Existing Infrastructure, Mesosphere DC/OS, Services & Containers.]
● Kernel alone is not enough
● DC/OS: the easiest way to run Mesos
○ CLI/UI
○ Package management
○ Service discovery
○ Load balancing
○ Day 2 ops
○ Security
○ Framework SDK
● Yes, it is open source!
Why Mesos?
Why should I pick Mesos?
● Production ready
● Proven scalability
● Highly customizable and extensible
Production Ready
The birth of Mesos
● Spring 2009, CS262B: Ben Hindman, Andy Konwinski and Matei Zaharia create “Nexus” as their CS262B class project.
● March 2010, Twitter tech talk: The grad students working on Mesos give a tech talk at Twitter.
● September 2010, Mesos published: “Mesos: A Platform for Fine-Grained Resource Sharing in the Data Center” is published as a technical report.
● December 2010, Apache incubation: Mesos enters the Apache Incubator.
Widely adopted
● April 2013, Mesosphere: Mesosphere is formed by engineers who have been using Mesos at Twitter and AirBnB.
● June 2013, Mesos graduates: Mesos graduates from the Apache Incubator to become a top level project.
● April 2015, Apple announces J.A.R.V.I.S.: Apple announces that the Siri infrastructure now runs on Mesos, atop “thousands” of nodes.
● August 2015, Verizon scale demo: Verizon demonstrates launching 50,000 containers in less than 90 seconds using Mesos and Mesosphere’s Marathon scheduler.
Production Mesos users
Why Mesos?
Proven Scalability
Twitter
● Largest Mesos cluster
○ > 30,000 nodes
○ > 250K containers
Apple
● Siri is powered by Mesos!
Verizon
● 50K containers in 50 seconds
Why is Mesos so scalable?
● Stateless master
○ Inspired by the GFS design
○ Agents hold the truth about running tasks (distributed)
○ Master state can be reconstructed when agents register
● Simple: it only cares about
○ Resource allocation and isolation
○ Task management
● Implemented in C++
○ Native performance
○ No GC issues
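The stateless-master idea can be illustrated with a toy model: because each agent holds the truth about its own tasks, a freshly elected master can rebuild its view purely from agent registrations. This is a minimal sketch under that assumption, not the Mesos implementation; `Master`, `on_agent_registered` and the agent and task names are hypothetical.

```python
from collections import defaultdict


class Master:
    """Toy master holding only in-memory state (illustration, not Mesos code)."""

    def __init__(self):
        self.tasks_by_agent = defaultdict(set)

    def on_agent_registered(self, agent_id, running_tasks):
        # The agent is the source of truth, so its report replaces whatever
        # the master believed before.
        self.tasks_by_agent[agent_id] = set(running_tasks)

    def all_tasks(self):
        return set().union(*self.tasks_by_agent.values())


# Hypothetical cluster state as known by the agents themselves.
agents = {
    "agent-1": ["web.1", "web.2"],
    "agent-2": ["db.1"],
}

old_master = Master()
for agent_id, tasks in agents.items():
    old_master.on_agent_registered(agent_id, tasks)

# Failover: the new master starts empty and learns everything again
# as the agents re-register with it.
new_master = Master()
for agent_id, tasks in agents.items():
    new_master.on_agent_registered(agent_id, tasks)

print(sorted(new_master.all_tasks()))
```

Nothing has to be replicated between the old and the new master for the two views to agree, which is the property the slide attributes to the GFS-inspired design.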
What does it mean to you?
● You know that Mesos will scale to Twitter/Apple levels
○ Features are easy to add; making them scalable takes time
● Quality assurance for free
○ Imagine a test environment having 30k+ nodes with real workload
● We take backwards compatibility seriously
○ We don’t want to break their production environment
Highly Customizable and Extensible
Why is this important?
● Every company’s environment is different
○ Scheduling
○ Service discovery
○ Container image format
○ Networking
○ Storage
○ Special hardware/accelerators (e.g., GPU, FPGA)
● Typically there is no one-size-fits-all solution
Pluggable schedulers
● For instance, you need separate schedulers for
○ Long running stateless services
○ Cron jobs
○ Stateful services (e.g., database, DFS)
○ Batch jobs (e.g., map-reduce)
● Monolithic scheduler?
“Monolithic schedulers do not make it easy to add new policies and specialized implementations, and may not scale up to the cluster sizes we are planning for.”
--- From Google Omega Paper (EuroSys’13)
Mesos frameworks == pluggable schedulers
Flexible service discovery
● Mesos is not opinionated about service discovery
○ DNS based
○ ZK/Etcd/Chubby based (e.g., Twitter, Google, with client libraries)
○ Your custom way, every company is different
○ Mesos provides an endpoint to stream SD information
● DNS based solution does not scale well
“Larger jobs create worse problems, and several jobs may be running at once. The variability in our DNS load had been a serious problem for Google before Chubby was introduced.”
--- From Google Chubby paper (OSDI’06)
Pluggable and extensible containerization
● Container image format
● Networking
● Storage
● Custom isolation
● Container lifecycle hooks
Outline
● Mesos overview and fundamentals
● Why should I pick Mesos?
● Containerization in Mesos
○ Pluggable architecture
○ Container image
○ Container network
○ Container storage
○ Customization and extensions
○ Nested container support
Containerization in Mesos
What is a Containerizer?
● Sits between agents and containers
● Launches, updates and destroys containers
● Provides isolation between containers
● Reports container stats and status
[Diagram: three Mesos Masters coordinated through Zookeeper; Marathon and Cassandra frameworks; three Mesos Agents, each with a Containerizer managing a Container that holds an Executor running tasks T1 and T2.]
Currently supported containerizers
Docker containerizer
● Delegates to the Docker daemon
Mesos containerizer (also called the unified containerizer)
● Uses standard OS features (e.g., cgroups, namespaces)
● Pluggable architecture allowing customization and extension
● Supports Docker, Appc and OCI (soon) images natively, without additional dependencies
[Callout on the slide: Very stable. Used in large scale production clusters.]
Unified Containerizer
● Pluggable architecture
● Container image
● Container network
● Container storage
● Customization and extensions
● Nested container support
Pluggable architecture
[Diagram: the unified containerizer is composed of three pluggable parts.]
● Launcher: process management
● Isolators: container lifecycle hook
● Provisioner: container image support
Launcher
Responsible for process management
● Spawn containers
● Kill and wait for containers
Supported launchers:
● Posix launcher
● Linux launcher
● Windows launcher
Isolator
Interface for extensions during the life cycle of a container
● Pre-launch - prepare()
● Post-launch (both in parent and child context) - isolate()
● Termination - cleanup()
● Resources update - update()
● Resources limitation reached - watch()
● Agent restart and recovery - recover()
● Stats and status pulling - usage()
Sufficient for most of the extensions!
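To make the shape of this interface concrete, here is a simplified Python mirror of the hooks listed on the Isolator slide. The real interface is a C++ class inside Mesos, so the signatures, return values and the `PidRecorder` and `run_lifecycle` names below are assumptions made for illustration only, as is the order in which the driver calls the hooks.

```python
class Isolator:
    """Simplified mirror of the hooks on the slide; each default is a no-op."""

    def recover(self, container_ids):            # agent restart and recovery
        pass

    def prepare(self, container_id):             # pre-launch
        return None                              # optional launch info

    def isolate(self, container_id, pid):        # post-launch
        pass

    def update(self, container_id, resources):   # resources update
        pass

    def watch(self, container_id):               # resource limitation reached
        return None

    def usage(self, container_id):               # stats and status pulling
        return {}

    def cleanup(self, container_id):             # termination
        pass


class PidRecorder(Isolator):
    """Hypothetical extension: remembers which pid belongs to which container."""

    def __init__(self):
        self.pids = {}

    def isolate(self, container_id, pid):
        self.pids[container_id] = pid

    def usage(self, container_id):
        return {"pid": self.pids[container_id]}

    def cleanup(self, container_id):
        self.pids.pop(container_id, None)


def run_lifecycle(isolators, container_id, pid):
    """Drive one container through launch, one stats poll and termination."""
    for isolator in isolators:
        isolator.prepare(container_id)
    for isolator in isolators:
        isolator.isolate(container_id, pid)
    stats = [isolator.usage(container_id) for isolator in isolators]
    for isolator in isolators:
        isolator.cleanup(container_id)
    return stats


recorder = PidRecorder()
print(run_lifecycle([recorder], "container-1", 4242))
print(recorder.pids)
```

An extension only overrides the hooks it cares about, which is one way to read the slide's remark that this interface is sufficient for most extensions.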
Isolator example: cgroups memory isolator (launch)
● Agent process: LaunchInfo = Isolator::prepare()
  * Create a cgroup for the container in the memory cgroup hierarchy: /sys/fs/cgroup/memory/mesos/…
  * Start listening for OOM events
● The Launcher creates a subprocess, which blocks on a pipe
● Agent process: Isolator::isolate(pid) moves ‘pid’ to the memory cgroup just created
● The agent signals the child to continue
● Subprocess: invokes ‘LaunchInfo.script’, then execs the executor (execve()) and becomes the container process
Isolator example: cgroups memory isolator (update)
● Sending a new Task to the Executor changes the ‘resources’ of the Executor
● Agent process: Isolator::update() changes the cgroup control memory.limit_in_bytes
● The agent sends the Task to the Executor in the container process
Isolator example: cgroups memory isolator (cleanup)
● The agent shuts down the Executor or kills the Task, or the container is destroyed
● The container process terminates
● Agent process: Isolator::cleanup() removes the memory cgroup associated with the container
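The cgroups memory isolator slides reduce to a handful of filesystem operations. The sketch below replays them against a temporary directory that stands in for the memory cgroup hierarchy, so it runs without root or a real cgroup mount. `MemoryCgroupSketch` is a hypothetical class, the OOM listener is left out, and only the control file names `cgroup.procs` and `memory.limit_in_bytes` are taken from cgroup v1.

```python
import os
import shutil
import tempfile


class MemoryCgroupSketch:
    """Replays prepare/isolate/update/cleanup against a stand-in directory."""

    def __init__(self, hierarchy_root):
        self.root = hierarchy_root

    def _cgroup(self, container_id):
        return os.path.join(self.root, "mesos", container_id)

    def prepare(self, container_id):
        # Create a cgroup for the container in the memory hierarchy.
        os.makedirs(self._cgroup(container_id))

    def isolate(self, container_id, pid):
        # Move the pid into the cgroup that was just created.
        with open(os.path.join(self._cgroup(container_id), "cgroup.procs"), "w") as f:
            f.write(str(pid))

    def update(self, container_id, mem_bytes):
        # Change the memory limit when the executor's resources change.
        with open(os.path.join(self._cgroup(container_id), "memory.limit_in_bytes"), "w") as f:
            f.write(str(mem_bytes))

    def cleanup(self, container_id):
        # On a real cgroup filesystem a plain rmdir would be used; the
        # stand-in directory holds regular files, so remove the whole tree.
        shutil.rmtree(self._cgroup(container_id))


with tempfile.TemporaryDirectory() as fake_hierarchy:
    isolator = MemoryCgroupSketch(fake_hierarchy)
    isolator.prepare("container-1")
    isolator.isolate("container-1", pid=4242)
    isolator.update("container-1", mem_bytes=2 * 1024**3)
    cgroup_dir = os.path.join(fake_hierarchy, "mesos", "container-1")
    with open(os.path.join(cgroup_dir, "memory.limit_in_bytes")) as f:
        print(f.read())
    isolator.cleanup("container-1")
    print(os.path.exists(cgroup_dir))
```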
Built-in isolators
● Cgroups isolators: cgroups/cpu, cgroups/mem, ...
● Disk isolators: disk/du, disk/xfs
● Filesystem isolators: filesystem/posix, filesystem/linux
● Volume isolators: docker/volume
● Network isolators: network/cni, network/port_mapping
● GPU isolators: gpu/nvidia
● ... and more! Need your contribution!
Container image support
Starting with Mesos 0.28, you can run your Docker container on
Mesos without a Docker daemon installed!
● One less dependency in your stack
● Agent restarts are handled gracefully; tasks are not affected
● Composes well with all existing isolators
● Easier to add extensions
Pluggable container image format
● Mesos supports multiple container image formats
○ Docker (without docker daemon)
○ Appc (without rkt)
○ OCI (ready soon)
○ CVMFS (experimental)
○ Host filesystem with tars/jars
○ Your own image format!
[Callout on the slide: Used in large scale production clusters.]
Provisioner
● Manage container images
○ Store: fetch and cache image layers
○ Backend: assemble rootfs from image layers
■ E.g., copy, overlayfs, bind, aufs
● Store can be extended
○ Currently supported: Docker, Appc
○ Plan to support: OCI (ongoing), CVMFS
○ Custom fetching (e.g., p2p)
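As an illustration of what a backend does, the sketch below assembles a root filesystem from already-fetched layers in the spirit of the "copy" backend named on the Provisioner slide: layers are applied oldest first, and later layers overwrite earlier ones. This is a simplified assumption of the behaviour, not the Mesos code; `provision_rootfs` and `write` are hypothetical helpers, and whiteout handling for deleted files is ignored.

```python
import os
import shutil
import tempfile


def provision_rootfs(layers, rootfs):
    """Copy each layer directory into rootfs, oldest layer first."""
    os.makedirs(rootfs, exist_ok=True)
    for layer in layers:
        # Files in later layers overwrite files from earlier ones.
        shutil.copytree(layer, rootfs, dirs_exist_ok=True)
    return rootfs


def write(path, text):
    """Create a small file, making parent directories as needed."""
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "w") as f:
        f.write(text)


with tempfile.TemporaryDirectory() as work:
    # Two hypothetical cached layers, as a store might have left them on disk.
    base = os.path.join(work, "layers", "base")
    app = os.path.join(work, "layers", "app")
    write(os.path.join(base, "etc", "motd"), "base image\n")
    write(os.path.join(base, "bin", "tool"), "v1\n")
    write(os.path.join(app, "etc", "motd"), "app image\n")

    rootfs = provision_rootfs([base, app], os.path.join(work, "rootfs"))
    with open(os.path.join(rootfs, "etc", "motd")) as f:
        print(f.read().strip())
    print(sorted(os.listdir(rootfs)))
```

A copy-style backend trades disk space and launch time for simplicity; union filesystems such as overlayfs avoid the copy by stacking the layers instead.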
Demo
Unified Containerizer
Container network support
● Supports the Container Network Interface (CNI) since Mesos 1.0
○ A spec for container networking
○ Supported by most network vendors
● Implemented as an isolator
○ --isolation=network/cni,...
Container Network Interface (CNI)
● Proposed by CoreOS: https://github.com/containernetworking/cni
● Simple contract between container runtime and CNI plugin, defined in the form of a JSON schema
○ CLI interface
○ ADD: attach to network
○ DEL: detach from network
[Diagram: the Containerizer in the Mesos Agent calls the CNI Plugin, which uses IPAM and connects the Container (Executor with tasks T1 and T2) to the Network through a veth.]
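The CNI contract is small enough to sketch: a runtime executes the plugin binary, passes the operation and container details in environment variables, and feeds the network configuration as JSON on stdin. The snippet below only builds those inputs and does not execute anything. The `cni_invocation` helper, the plugin directory, the network name, the subnet and the version string are placeholders assumed for illustration; the `CNI_*` variable names come from the CNI specification.

```python
import json


def cni_invocation(command, container_id, netns_path, network_conf,
                   ifname="eth0", plugin_dir="/opt/cni/bin"):
    """Return (plugin_path, env, stdin_text) for one CNI plugin call."""
    env = {
        "CNI_COMMAND": command,            # "ADD" or "DEL"
        "CNI_CONTAINERID": container_id,
        "CNI_NETNS": netns_path,           # path to the network namespace
        "CNI_IFNAME": ifname,              # interface name inside the container
        "CNI_PATH": plugin_dir,            # where plugin binaries are looked up
    }
    # The plugin binary is named after the "type" field of the configuration.
    plugin_path = f"{plugin_dir}/{network_conf['type']}"
    return plugin_path, env, json.dumps(network_conf)


# Hypothetical network definition; all values are placeholders.
network_conf = {
    "cniVersion": "0.3.0",
    "name": "example-net",
    "type": "bridge",
    "bridge": "cni0",
    "ipam": {"type": "host-local", "subnet": "10.1.0.0/16"},
}

plugin, env, stdin_text = cni_invocation(
    "ADD", "container-1", "/var/run/netns/container-1", network_conf)
print(plugin)
print(env["CNI_COMMAND"], env["CNI_NETNS"])
print(stdin_text)
```

With a plugin actually installed, the three values could be handed to `subprocess.run([plugin], env=env, input=stdin_text, text=True)`; a `DEL` call uses the same inputs with only `CNI_COMMAND` changed.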
Why CNI?
● Simpler, with fewer dependencies than Docker CNM
● Backed by the Kubernetes community as well
● Rich set of plugins from network vendors
● Clear separation between container and network management
● IPAM has its own pluggable interface
CNI plugins
Existing CNI plugins
● ipvlan
● macvlan
● bridge
● flannel
● calico
● contiv
● contrail
● weave
● …
You can write your own plugin, and Mesos supports it!
Container storage support
● Supports Docker volume plugins since Mesos 1.0
○ Define the interface between container runtime and storage provider
○ https://docs.docker.com/engine/extend/plugins_volume/
● A variety of Docker volume plugins
○ Ceph
○ Convoy
○ Flocker
○ Glusterfs
○ Rexray
Extensions
Launcher
● Custom container process management
Isolator
● Extension to the life cycle of a container
Provisioner
● New type of images
● Custom fetching and caching
Nested container support
● New in Mesos 1.1
○ Building block for supporting Pod-like features
● Highlighted features
○ Supports arbitrary levels of nesting
○ Re-uses all existing isolators
○ Allows dynamic creation of nested containers
Nested container support
[Diagram: three Mesos Masters coordinated through Zookeeper; Marathon and Cassandra frameworks; three Mesos Agents, each with a Containerizer managing a Container that holds an Executor running tasks T1 and T2. On one agent, the Executor's Container holds two Nested Containers.]
New Agent API for Nested Containers
message agent::Call {
  enum Type {
    // Calls for managing nested containers
    // under an executor's container.
    LAUNCH_NESTED_CONTAINER = 14;
    WAIT_NESTED_CONTAINER = 15;
    KILL_NESTED_CONTAINER = 16;
  }
}
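A client of this API would send one of these call types as a JSON message. The sketch below only builds such messages and sends nothing. The field names (`launch_nested_container`, `container_id`, `parent`, `command`) are an assumption about the JSON form of the agent's `Call` message and should be checked against the `agent.proto` of the Mesos version in use; the helper functions and container names are hypothetical.

```python
import json


def container_id(*path):
    """Build a ContainerID dict from the outermost to the innermost container."""
    cid = None
    for value in path:
        cid = {"value": value} if cid is None else {"parent": cid, "value": value}
    return cid


def launch_nested(cid, shell_command):
    return {
        "type": "LAUNCH_NESTED_CONTAINER",
        "launch_nested_container": {
            "container_id": cid,
            "command": {"value": shell_command},
        },
    }


def wait_nested(cid):
    return {"type": "WAIT_NESTED_CONTAINER",
            "wait_nested_container": {"container_id": cid}}


def kill_nested(cid):
    return {"type": "KILL_NESTED_CONTAINER",
            "kill_nested_container": {"container_id": cid}}


# An Nginx container nested under the executor's container, then a debug
# container nested one level deeper.
nginx = container_id("executor-container", "nginx")
debug = container_id("executor-container", "nginx", "debug")

print(json.dumps(launch_nested(nginx, "nginx -g 'daemon off;'"), indent=2))
print(json.dumps(wait_nested(nginx)))
print(json.dumps(launch_nested(debug, "sleep 1000")))
```

Because each ContainerID simply points at its parent, the same three calls cover any nesting depth.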
Launch nested container
[Diagram: the Executor sends LAUNCH to the Mesos Agent, whose Containerizer starts a nested Nginx container inside the Executor's Container.]
Watch nested container
[Diagram: the Executor sends WAIT to the Mesos Agent and receives the result for the nested Nginx container: Exit Status = 0.]
Arbitrary levels of nesting
[Diagram: a LAUNCH call starts a Debug container nested inside the Nginx container, which is itself nested in the Executor's Container.]
Demo
Unified Containerizer
Summary
● Mesos: a state-of-the-art container orchestrator
○ Production ready
○ Proven scalability
○ Highly customizable and extensible
● Containerization in Mesos
○ Pluggable architecture
○ Native support for Docker/Appc images (w/o Docker daemon or rkt)
○ Container network: CNI
○ Container storage: Docker volume driver (DVD) plugins
○ Nested container support
Questions?
CNI support using an isolator (launch)
● Agent process: LaunchInfo = Isolator::prepare() tells the launcher to create the child process in new NET, UTS and MNT namespaces
● The Launcher creates a subprocess, which blocks on a pipe
● Agent process: Isolator::isolate(pid)
○ Bind mount the NET namespace to keep it open
○ Invoke ‘ADD’ of the CNI plugin with the NET namespace associated with pid
○ Set up network related /etc/xx files for the container
● The agent signals the child to continue
● Subprocess: invokes ‘LaunchInfo.script’, then execs the executor (execve()) and becomes the container process
CNI support using an isolator (cleanup)
● The agent shuts down the Executor or kills the Task, or the container is destroyed
● The container process terminates
● Agent process: Isolator::cleanup()
○ Invoke ‘DEL’ of the CNI plugin with the NET namespace handle
○ Unmount and remove the NET namespace handle
Mesos, as one of the most powerful container orchestrators, greatly simplifies the
deployment, provisioning and execution of containerized workloads. It automates the
distribution of preprovisioned container images, injection of configuration,
scheduling onto machines, life-cycle management, and monitoring of applications,
microservices, and jobs in the cloud.
In this talk, Jie Yu will first give you an overview of Mesos and its powerful API,
which allows users to easily deploy their stateless and stateful services. Then, Jie will
talk about how containers are managed in Mesos. In particular, Jie will provide a deep
dive into the unified containerizer, which was first introduced in Mesos 1.0.
Jie will show some of the new container networking and storage features that were
built recently, and how they benefit from the pluggable and extensible architecture of
the unified containerizer. Finally, Jie will discuss the future of container support in
Mesos.

Mesos: A State-of-the-art Container Orchestrator

  • 1.
    © 2016 Mesosphere,Inc. All Rights Reserved. 1 MESOS A State-Of-The-Art Container Orchestrator
  • 2.
    InfoQ.com: News &Community Site • 750,000 unique visitors/month • Published in 4 languages (English, Chinese, Japanese and Brazilian Portuguese) • Post content from our QCon conferences • News 15-20 / week • Articles 3-4 / week • Presentations (videos) 12-15 / week • Interviews 2-3 / week • Books 1 / month Watch the video with slide synchronization on InfoQ.com! https://www.infoq.com/presentations/ mesos-api
  • 3.
    Purpose of QCon -to empower software development by facilitating the spread of knowledge and innovation Strategy - practitioner-driven conference designed for YOU: influencers of change and innovation in your teams - speakers and topics driving the evolution and innovation - connecting and catalyzing the influencers and innovators Highlights - attended by more than 12,000 delegates since 2007 - held in 9 cities worldwide Presented at QCon San Francisco www.qconsf.com
  • 4.
    © 2016 Mesosphere,Inc. All Rights Reserved. 2 About me Jie Yu (@jie_yu) ● Tech Lead at Mesosphere ● Mesos PMC member and committer ● Formerly worked at Twitter ● PhD from University of Michigan ● Worked on Mesos since 2012 http://people.apache.org/~jieyu/
  • 5.
    © 2016 Mesosphere,Inc. All Rights Reserved. 3 ● Mesos overview and fundamentals ● Why should I pick Mesos? ● Containerization in Mesos Outline
  • 6.
    © 2016 Mesosphere,Inc. All Rights Reserved. 4 ● What does a traditional OS kernel provide? ○ Resource management Host cpu, memory, etc. ○ Programming abstractions POSIX API: processes, threads, etc. ○ Security and isolation Virtual memory, user, etc. ● Mesos: A kernel for data center applications ○ Resource management Cluster cpu, memory, etc. ○ Programming abstractions Mesos API: Task, Resource, etc. ○ Security and isolation Containerization Mesos: A kernel for data center applications Mesos overview and fundamentals
  • 7.
    © 2016 Mesosphere,Inc. All Rights Reserved. 5 ● Key concepts ○ Framework ○ Resource/Offer ○ Task ○ Executor Programming abstractions Mesos overview and fundamentals Master Agent Framework Executor Task Task Executor Task Offer (Resources) Task/Executor Resources Task/Executor
  • 8.
    © 2016 Mesosphere,Inc. All Rights Reserved. 6 Case study: Marathon Mesos overview and fundamentals Master Agent X Marathon Offer X: 8 cpus, 16G mem Decline Offer 8 cpus, 16G mem
  • 9.
    © 2016 Mesosphere,Inc. All Rights Reserved. 7 Create a Marathon app Mesos overview and fundamentals Master Agent X Marathon Executor Task Offer X: 8 cpus, 16G mem Accept Offer LAUNCH(Task: 2 cpus, 2G mem) POST /v2/apps
  • 10.
    © 2016 Mesosphere,Inc. All Rights Reserved. 8 Create a Marathon app Mesos overview and fundamentals Master Agent X Marathon Executor Task TASK_RUNNING TASK_RUNNING Offer X: 6 cpus, 14G mem
  • 11.
    © 2016 Mesosphere,Inc. All Rights Reserved. 9 A typical Mesos cluster Mesos overview and fundamentals Master Agent Marathon Agent Agent Agent Agent Agent Agent Agent Kafka Cassandra MarathonSpark Master Master Zookeeper
  • 12.
    © 2016 Mesosphere,Inc. All Rights Reserved. 10 Mesos helps improve cluster utilization Mesos overview and fundamentals time time
  • 13.
    © 2016 Mesosphere,Inc. All Rights Reserved. 11 DS/OS vs. Mesos Mesos overview and fundamentals Existing Infrastructure Mesosphere DCOS Services & Containers ● Kernel alone is not enough ● DC/OS: the easiest way to run Mesos ○ CLI/UI ○ Package management ○ Service discovery ○ Load balancing ○ Day2 ops ○ Security ○ Framework SDK ● Yes, it is open source!
  • 14.
    © 2016 Mesosphere,Inc. All Rights Reserved. 12 ● Production ready ● Proven scalability ● Highly customizable and extensible Why should I pick Mesos? Why Mesos?
  • 15.
    © 2016 Mesosphere,Inc. All Rights Reserved. 13 Production Ready
  • 16.
    © 2016 Mesosphere,Inc. All Rights Reserved. 14 The birth of Mesos Why Mesos? TWITTER TECH TALK The grad students working on Mesos give a tech talk at Twitter. March 2010 APACHE INCUBATION Mesos enters the Apache Incubator. Spring 2009 CS262B Ben Hindman, Andy Konwinski and Matei Zaharia create “Nexus” as their CS262B class project. MESOS PUBLISHED Mesos: A Platform for Fine-Grained Resource Sharing in the Data Center is published as a technical report. September 2010 December 2010
  • 17.
    © 2016 Mesosphere,Inc. All Rights Reserved. 15 Widely adopted Why Mesos? MESOS GRADUATES Mesos graduates from the Apache Incubator to become a top level project. June 2013 VERIZON SCALE DEMO Verizon demonstrates launching 50,000 containers in less than 90 seconds using Mesos and Mesosphere’s Marathon scheduler. April 2013 MESOSPHERE Mesosphere is formed by engineers who have been using Mesos at Twitter and AirBnB. APPLE ANNOUNCES J.A.R.V.I.S. Apple announces that the Siri infrastructure now runs on Mesos, atop “thousands” of nodes. April 2015 August 2015
  • 18.
    © 2016 Mesosphere,Inc. All Rights Reserved. 16 Production Mesos users Why Mesos?
  • 19.
    © 2016 Mesosphere,Inc. All Rights Reserved. 17 Proven Scalability
  • 20.
    © 2016 Mesosphere,Inc. All Rights Reserved. 18 Twitter ● Largest Mesos cluster ○ > 30000 nodes ○ > 250K containers
  • 21.
    © 2016 Mesosphere,Inc. All Rights Reserved. 19 Apple ● Siri is powered by Mesos!
  • 22.
    © 2016 Mesosphere,Inc. All Rights Reserved. 20 Verizon ● 50K containers in 50 seconds
  • 23.
    © 2016 Mesosphere,Inc. All Rights Reserved. 21 ● Stateless master ○ Inspired from the GFS design ○ Agents hold truth about running tasks (distributed) ○ Master state can be reconstructed when agents register ● Simple, only cares about ○ Resource allocation and isolation ○ Task management ● Implemented in C++ ○ Native performance ○ No GC issue Why Mesos is so scalable? Why Mesos?
  • 24.
    © 2016 Mesosphere,Inc. All Rights Reserved. 22 ● Known that Mesos will scale to Twitter/Apple level ○ Feature is easy to add, took time to make it scalable ● Quality assurance for free ○ Imagine a test environment having 30k+ nodes with real workload ● Take backwards compatibility seriously ○ We don’t want to break their production environment What does it mean to you? Why Mesos?
  • 25.
    © 2016 Mesosphere,Inc. All Rights Reserved. 23 Highly Customizable and Extensible
  • 26.
    © 2016 Mesosphere,Inc. All Rights Reserved. 24 ● Every company’s environment is different ○ Scheduling ○ Service discovery ○ Container image format ○ Networking ○ Storage ○ Special hardware/accelerators (e.g., GPU, FPGA) ● No one-fits-all solution typically Why this is important? Why Mesos?
  • 27.
    © 2016 Mesosphere,Inc. All Rights Reserved. 25 Pluggable schedulers Why Mesos? ● For instance, you need separate schedulers for ○ Long running stateless services ○ Cron jobs ○ Stateful services (e.g., database, DFS) ○ Batch jobs (e.g., map-reduce) ● Monolithic scheduler? Monolithic schedulers do not make it easy to add new policies and specialized implementations, and may not scale up to the cluster sizes we are planning for. --- From Google Omega Paper (EuroSys’13) Mesos frameworks == pluggable schedulers
  • 28.
    © 2016 Mesosphere,Inc. All Rights Reserved. 26 Flexible service discovery Why Mesos? ● Mesos is not opinionated about service discovery ○ DNS based ○ ZK/Etcd/Chubby based (e.g., twitter, google, with client libraries) ○ Your custom way, every company is different ○ Mesos provides an endpoint to stream SD information ● DNS based solution does not scale well Larger jobs create worse problems, and several jobs many be running at once. The variability in our DNS load had been a serious problem for Google before Chubby was introduced. --- From Google Chubby paper (OSDI’06)
  • 29.
    © 2016 Mesosphere,Inc. All Rights Reserved. 27 ● Container image format ● Networking ● Storage ● Custom isolation ● Container lifecycle hooks Pluggable and extensible containerization Why Mesos?
  • 30.
    © 2016 Mesosphere,Inc. All Rights Reserved. 28 ● Mesos overview and fundamentals ● Why should I pick Mesos? ● Containerization in Mesos ○ Pluggable architecture ○ Container image ○ Container network ○ Container storage ○ Customization and extensions ○ Nesting container support Outline
  • 31.
    © 2016 Mesosphere,Inc. All Rights Reserved. 29 What is Containerizer? Containerization in Mesos 29 Containerizer ● Between agents and containers ● Launch/update/destroy containers ● Provide isolations between containers ● Report container stats and status Mesos Master Mesos Master Mesos Master Zookeeper Marathon Framework Cassandra Framework Mesos Agent Containerizer Container Executor T1 T2 Mesos Agent Containerizer Container Executor T1 T2 Mesos Agent Containerizer Container Executor T1 T2
  • 32.
    © 2016 Mesosphere,Inc. All Rights Reserved. 30 Docker containerizer ● Delegate to Docker daemon Mesos containerizer ● Using standard OS features (e.g., cgroups, namespaces) ● Pluggable architecture allowing customization and extension Currently supported containerizers Containerization in Mesos Very stable. Used in large scale production clusters
  • 33.
    © 2016 Mesosphere,Inc. All Rights Reserved. 31 Docker containerizer ● Delegate to Docker daemon Mesos containerizer ● Using standard OS features (e.g., cgroups, namespaces) ● Pluggable architecture allowing customization and extension ● Support Docker, Appc, OCI (soon) images natively w/o dependency Currently supported containerizers Containerization in Mesos Very stable. Used in large scale production clusters
  • 34.
    © 2016 Mesosphere,Inc. All Rights Reserved. 32 Docker containerizer ● Delegate to Docker daemon Unified containerizer ● Using standard OS features (e.g., cgroups, namespaces) ● Pluggable architecture allowing customization and extension ● Support Docker, Appc, OCI (soon) images natively w/o dependency Currently supported containerizers Containerization in Mesos Very stable. Used in large scale production clusters
  • 35.
    © 2016 Mesosphere,Inc. All Rights Reserved. 33 ● Pluggable architecture ● Container image ● Container network ● Container storage ● Customization and extensions ● Nesting container support Unified Containerizer Containerization in Mesos
  • 36.
    © 2016 Mesosphere,Inc. All Rights Reserved. 34 Pluggable architecture Unified Containerizer Launcher Isolators Unified containerizer Provisioner Process management Container lifecycle hook Container image support
  • 37.
    35 Responsible for processmanagement ● Spawn containers ● Kill and wait containers Supported launchers: ● Posix launcher ● Linux launcher ● Windows launcher Launcher Unified Containerizer
  • 38.
    36 Interface for extensionsduring the life cycle of a container ● Pre-launch - prepare() ● Post-launch (both in parent and child context) - isolate() ● Termination - cleanup() ● Resources update - update() ● Resources limitation reached - watch() ● Agent restart and recovery - recover() ● Stats and status pulling - usage() Isolator Unified Containerizer Sufficient for most of the extensions!
  • 39.
    37 Isolator example: cgroupsmemory isolator Unified Containerizer Agent Process Launcher creates Subprocess Container Process execve() LaunchInfo = Isolator::prepare() * Create a cgroup for the container in memory cgroup hierarchy: /sys/fs/cgroup/memory/mesos/… * Start listening for OOM event Isolator::isolate(pid) Block on pipe Move ‘pid’ to the memory cgroup just created Invoke ‘LaunchInfo.script’ Exec the executor Signal the Child to continue
  • 40.
    38 Isolator example: cgroupsmemory isolator Unified Containerizer Agent Process Container Process Isolator::update() Change cgroup control: memory.limit_in_bytes Sending a new Task to Executor, ‘resources’ of the Executor changes Send Task to Executor
  • 41.
    39 Isolator example: cgroupsmemory isolator Unified Containerizer Agent Process Container Process Isolator::cleanup() Remove the memory cgroup associated with the container Shutdown Executor or kill Task Destroy container Container terminated
  • 42.
    40 Cgroups isolators: cgroups/cpu,cgroups/mem, ... Disk isolators: disk/du, disk/xfs Filesystem isolators: filesystem/posix, filesystem/linux Volume isolators: docker/volume Network isolators: network/cni, network/port_mapping GPU isolators: gpu/nvidia …... and more! Need your contribution! Built-in isolators Unified Containerizer
  • 43.
    41 Start from 0.28,you can run your Docker container on Mesos without a Docker daemon installed! ● One less dependency in your stack ● Agent restart handled gracefully, task not affected ● Compose well with all existing isolators ● Easier to add extensions Container image support Unified Containerizer
  • 44.
    42 ● Mesos supportsmultiple container image format ○ Docker (without docker daemon) ○ Appc (without rkt) ○ OCI (ready soon) ○ CVMFS (experimental) ○ Host filesystem with tars/jars ○ Your own image format! Pluggable container image format Unified Containerizer Used in large scale production clusters
  • 45.
    43 ● Manage containerimages ○ Store: fetch and cache image layers ○ Backend: assemble rootfs from image layers ■ E.g., copy, overlayfs, bind, aufs ● Store can be extended ○ Currently supported: Docker, Appc ○ Plan to support: OCI (ongoing), CVMFS ○ Custom fetching (e.g., p2p) Provisioner Unified Containerizer
  • 46.
  • 47.
    45 ● Support ContainerNetwork Interface (CNI) from 1.0 ○ A spec for container networking ○ Supported by most network vendors ● Implemented as an isolator ○ --isolation=network/cni,... Container network support Unified Containerizer
  • 48.
    46 ● Proposed byCoreOS : https://github.com/containernetworking/cni ● Simple contract between container runtime and CNI plugin defined in the form of a JSON schema ○ CLI interface ○ ADD: attach to network ○ DEL: detach from network Container Network Interface (CNI) Unified Containerizer Mesos Agent Containerizer Container Executor T1 T2 CNI Plugin IPAM veth Network
  • 49.
    ● Simpler and fewer dependencies than Docker CNM ● Backed by Kubernetes community as well ● Rich plugins from network vendors ● Clear separation between container and network management ● IPAM has its own pluggable interface 47 Why CNI? Unified Containerizer
  • 50.
    48 Existing CNI plugins ● ipvlan ● macvlan ● bridge ● flannel ● calico ● contiv ● contrail ● weave ● … CNI plugins Unified Containerizer You can write your own plugin, and Mesos supports it!
  • 51.
    49 ● Support Docker volume plugins from 1.0 ○ Define the interface between container runtime and storage provider ○ https://docs.docker.com/engine/extend/plugins_volume/ ● A variety of Docker volume plugins ○ Ceph ○ Convoy ○ Flocker ○ Glusterfs ○ Rexray Container storage support Unified Containerizer
  • 52.
    50 Launcher ● Custom container process management Isolator ● Extension to the life cycle of a container Provisioner ● New types of images ● Custom fetching and caching Extensions Unified Containerizer
  • 53.
    51 ● New in Mesos 1.1 ○ Building block for supporting Pod-like features ● Highlighted features ○ Support arbitrary levels of nesting ○ Re-use all existing isolators ○ Allow dynamic creation of nested containers Nested container support Nested container support
  • 54.
    52 Nested container support Nested container support Mesos Master Mesos Master Mesos Master Zookeeper Marathon Framework Cassandra Framework Mesos Agent Containerizer Container Executor T1 T2 Mesos Agent Containerizer Container Executor T1 T2 Mesos Agent Containerizer Container Executor T1 T2 Container Executor T1 T2 Nested Container Nested Container
  • 55.
    53 New Agent API for Nested Containers Nested container support
        message agent::Call {
          enum Type {
            // Calls for managing nested containers
            // under an executor's container.
            LAUNCH_NESTED_CONTAINER = 14;
            WAIT_NESTED_CONTAINER = 15;
            KILL_NESTED_CONTAINER = 16;
          }
        }
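As a rough illustration of how an executor might use these three calls, the sketch below builds JSON request bodies for them. The helper names are hypothetical, and the field names (`launch_nested_container`, `container_id`, `parent`, `command`) are assumed to mirror the agent's v1 `Call` protobuf in its JSON form; check them against the `agent.proto` of the Mesos version in use. The container IDs and command are made up.

```python
import json
from typing import Any, Dict, Optional


def container_id(value: str, parent: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Build a ContainerID-like dict; nested containers point at their parent."""
    cid: Dict[str, Any] = {"value": value}
    if parent is not None:
        cid["parent"] = parent
    return cid


def launch_nested_container(cid: Dict[str, Any], command: str) -> Dict[str, Any]:
    return {
        "type": "LAUNCH_NESTED_CONTAINER",
        "launch_nested_container": {
            "container_id": cid,
            "command": {"value": command},
        },
    }


def wait_nested_container(cid: Dict[str, Any]) -> Dict[str, Any]:
    return {
        "type": "WAIT_NESTED_CONTAINER",
        "wait_nested_container": {"container_id": cid},
    }


def kill_nested_container(cid: Dict[str, Any]) -> Dict[str, Any]:
    return {
        "type": "KILL_NESTED_CONTAINER",
        "kill_nested_container": {"container_id": cid},
    }


# Example: an executor's container launching an Nginx container inside itself.
executor = container_id("executor-container")
nginx = container_id("nginx", parent=executor)
launch_body = json.dumps(launch_nested_container(nginx, "nginx -g 'daemon off;'"))
```

Each body would then be sent as an HTTP POST to the agent's v1 API endpoint with a JSON content type; the endpoint path and authentication details depend on the deployment and are not shown here.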
  • 56.
    54 Launch nested container Nested container support Container Executor Mesos Agent Containerizer LAUNCH Nginx
  • 57.
    55 Watch nested container Nested container support Container Executor Mesos Agent Containerizer WAIT Nginx Exit Status = 0
  • 58.
    56 Arbitrary levels of nesting Nested container support Container Executor Nginx Mesos Agent Containerizer LAUNCH Debug
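Arbitrary nesting falls out of the container ID being recursive: each nested container's ID refers to its parent's ID. A minimal sketch, assuming a dict shape that mirrors Mesos' `ContainerID` (`value` plus optional `parent`); the helper names and container names are hypothetical.

```python
from typing import Any, Dict, List, Optional


def nested_container_id(path: List[str]) -> Dict[str, Any]:
    """Build a ContainerID-like dict from outermost to innermost name."""
    if not path:
        raise ValueError("path must contain at least one container name")
    cid: Optional[Dict[str, Any]] = None
    for name in path:
        node: Dict[str, Any] = {"value": name}
        if cid is not None:
            node["parent"] = cid
        cid = node
    assert cid is not None
    return cid


def nesting_depth(cid: Dict[str, Any]) -> int:
    """Count levels by walking the parent chain up to the top-level container."""
    levels = 0
    node: Optional[Dict[str, Any]] = cid
    while node is not None:
        levels += 1
        node = node.get("parent")
    return levels


# A debug container launched inside the Nginx container from the slide.
debug = nested_container_id(["executor", "nginx", "debug"])
```

Because the structure is just a parent chain, nothing limits how deep it can go.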
  • 59.
  • 60.
    ● Mesos: state-of-the-art container orchestrator ○ Production ready ○ Proven scalability ○ Highly customizable and extensible ● Containerization in Mesos ○ Pluggable architecture ○ Native support for Docker/Appc images (w/o Docker daemon or rkt) ○ Container network: CNI ○ Container storage: DVD (Docker Volume Driver) ○ Nested container support 58 Summary
  • 61.
    59 Questions?
  • 62.
    60 CNI support using an isolator Unified Containerizer Agent Process Launcher creates Subprocess Container Process execve() LaunchInfo = Isolator::prepare() Tell the launcher to create the child process in a new NET, UTS and MNT namespace. Isolator::isolate(pid) Block on pipe Bind mount the NET namespace to keep it open Invoke ‘ADD’ of the CNI plugin with the NET namespace associated with pid Setup network related /etc/xx files for the container Invoke ‘LaunchInfo.script’ Exec the executor Signal the Child to continue
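The ordering on this slide (the child blocks on a pipe until the agent has finished isolating it, and only then execs the executor) can be illustrated with a small handshake. This is only a sketch of the synchronization pattern: it creates no namespaces and calls no CNI plugin, and `launch_with_isolation` and the event names are hypothetical.

```python
import subprocess
import sys
from typing import List, Tuple

# The child blocks until one byte arrives on stdin, then "execs" the executor.
CHILD = (
    "import sys\n"
    "sys.stdin.buffer.read(1)\n"
    "print('exec executor')\n"
)


def launch_with_isolation() -> Tuple[List[str], str]:
    events: List[str] = []

    # Stand-in for Isolator::prepare(): decide namespaces and the launch script.
    events.append("prepare")

    # The launcher creates the child, which immediately blocks on the pipe.
    child = subprocess.Popen(
        [sys.executable, "-c", CHILD],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
    )
    events.append("launch")

    # Stand-in for Isolator::isolate(pid): the network would be attached here,
    # while the child is guaranteed not to have started the executor yet.
    events.append("isolate")

    # Signal the child to continue, then collect its output.
    events.append("signal")
    out, _ = child.communicate(input=b"x")
    return events, out.decode().strip()
```

Blocking the child first means the executor never observes a half-configured network, which is why the isolation step sits between process creation and exec.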
  • 63.
    61 CNI support using an isolator Unified Containerizer Container Process Isolator::cleanup() Invoke ‘DEL’ of the CNI plugin with the NET namespace handle Umount and remove the NET namespace handle Shutdown Executor or kill Task Destroy container Container terminated Agent Process
  • 64.
    62 Mesos, as one of the most powerful container orchestrators, greatly simplifies the deployment, provisioning, and execution of containerized workloads. It automates the distribution of preprovisioned container images, injection of configuration, scheduling onto machines, life-cycle management, and monitoring of applications, microservices, and jobs in the cloud. In this talk, Jie Yu will first give an overview of Mesos and its powerful API, which allows users to easily deploy their stateless and stateful services. Then, Jie will talk about how containers are managed in Mesos. In particular, Jie will provide a deep dive into the unified containerizer, which was first introduced in Mesos 1.0. Jie will show some of the recently built container networking and storage features, and how they benefit from the pluggable and extensible architecture of the unified containerizer. Finally, Jie will discuss the future of container support in Mesos.