Containers @ Google
Slides from our presentation at the SF Bay Area Large Scale Production Engineering meetup on Lightweight Containers.
Transcript of "Containers @ Google"

  1. Let Me Contain That For You
     Containers @ Google
     Victor Marmol (vmarmol@google.com)
     Rohit Jnagal (jnagal@google.com)
     SF Bay Area Large-Scale Production Engineering: Lightweight Containers Meetup
     February 20, 2014
     Google Confidential and Proprietary
  2. Containers in the Wild
     [Diagram: User 1, User 2, User 3, and User 4 on a shared Linux Kernel]
     ● Used to provide VM-like instances
     ● High density (lower costs) and high performance
     ● Fast to start
     ● Migration is hard, but possible
  3. The Need for Isolation: A Shared Google Machine
     [Diagram labels: I/O:CPU:Mem, Sensitive Task, Front End Task, Back End Task, Alloc, BACKGROUND, System Daemons, Batch workload, TASKS, Soaker workload]
  4. Containers @ Google
     [Diagram: SS1 to SS4, Sub 1 to Sub 4, Alloc 1, Task 1, and Task 2 on the Linux Kernel]
     ● Container-aware tasks use asymmetric subcontainers
     ● Provide different guarantees of quality of service
     ● Overcommit resources to achieve high utilization
     ● Early users, few namespaces, and near-zero overhead
  5. Asymmetric Isolation
     Isolating only certain resources (e.g., CPU but not memory).
     [Diagram: CPU, Memory, and Net isolation across Container 1, Container 2, and Container 3]
  6. Containers @ Google Today
     ● Historically
       ○ 2004: No isolation
       ○ 2006: Cgroups
       ○ Now: Namespaces
     ● Primarily Linux cgroups + user-space policies and monitoring
     ● We skipped VMs due to high overhead
     ● Used everywhere: SaaS, PaaS, IaaS; Android, Chrome OS
     ● Heterogeneous workloads: Latency, bandwidth, and priority
     ● High task churn
  7. Goals
     ● Isolation
       ○ Tasks do not impact each other
       ○ The behavior of a Task is the same regardless of what else is on the machine
     ● Predictability
       ○ Tasks behave the same each time they run
       ○ Unless they are specifically configured to use "slack"
     ● Quality of Service
       ○ Different tasks get different quality of resources
     ● Overcommitment
       ○ Oversell machine resources within QoS guarantees
  8. lmctfy: Let Me Contain That For You
     Open source containers stack based on Google’s.
     github.com/google/lmctfy/
     Provides the Container abstraction to higher levels by abstracting away the kernel interfaces.
     Motivation
     ● Existing code, systems, and design around containers
     ● Problems with LXC
       ○ No abstraction (direct knob exposure)
       ○ No easy way to access programmatically
  9. lmctfy: Let Me Contain That For You
     Objectives
     ● Abstract away enforcement: separate policy from enforcement
     ● Scalability and parallel access
     ● Intent-based container specifications
     ● Asymmetric isolation
     ● Subcontainer support
     ● Provides tiers of quality of service
     System Layers
     ● CL1
       ○ Container abstraction and enforcement
       ○ Thin and light layer
       ○ Current lmctfy
     ● CL2
       ○ Sets policy (QoS, overcommitment)
       ○ Higher level logic, monitoring, and control loops
       ○ Stateful entity
  10. lmctfy: Fine-tuned resource isolation
      Current cgroup API is complicated with lots of knobs (each a cgroup file):
      Common: 5+ files
        cgroup.clone_children cgroup.event_control cgroup.procs notify_on_release release_agent
      CPU: 8+ files
        cpuacct.stat cpuacct.usage cpuacct.usage_percpu cpu.cfs_period_us cpu.cfs_quota_us
        cpu.rt_period_us cpu.rt_runtime_us cpu.shares cpu.stat
      Memory: 12+ files
        memory.failcnt memory.force_empty memory.limit_in_bytes memory.max_usage_in_bytes
        memory.move_charge_at_immigrate memory.numa_stat memory.oom_control memory.pressure_level
        memory.soft_limit_in_bytes memory.stat memory.swappiness memory.usage_in_bytes
        memory.use_hierarchy
      Cpuset: 12+ files
        cpuset.cpu_exclusive cpuset.cpus cpuset.mem_exclusive cpuset.mem_hardwall
        cpuset.memory_migrate cpuset.memory_pressure cpuset.memory_pressure_enabled
        cpuset.memory_spread_page cpuset.memory_spread_slab cpuset.mems
        cpuset.sched_load_balance cpuset.sched_relax_domain_level
      +DiskIO +Net +...
  11. Released 0.4.0 (This Week!)
      Initial version of lowest layer
      ● Written entirely in C++
      ● Delivered as a CLI and a C++ library (C and Go bindings soon)
      ● Isolation for CPU, memory, and perf event
      ● Full support for subcontainers
      ● “Stateless” and lightweight
      ● Initial support for namespaces, more to come in the next week
      Can be augmented with custom kernel patches
      ● CPU latency and accounting
      ● OOM priority
      Supported configurations
      ● Target configuration is well supported
      ● Designed to be flexible, but we test on a limited set of them
      ● More target configurations being added
      ● Contributions to add more are welcome
  12. Container Specifications
      message ContainerSpec {
        optional int64 owner = 1;
        optional CpuSpec cpu = 2;
        optional MemorySpec memory = 3;
        optional DiskIoSpec diskio = 4;
        optional NetworkSpec network = 5;
        optional VirtualHost virtualhost = 6;
        ...
      }
      message CpuSpec {
        optional SchedulingLatency scheduling_latency = 1;
        optional uint64 limit = 2;
        optional uint64 max_limit = 3;
        ...
      }
      Create: “cpu:<limit:1000 max_limit:2000> memory:<limit:4096000 reservation:1024000>”
  13. Cgroup Specifications
      Create: “cpu:<limit:1000 max_limit:2000 scheduling_latency:PRIORITY> memory:<limit:4096000 reservation:1024000>”
      Equivalent lxc cgroup config:
        lxc.cgroup.cpu.shares = 2048
        lxc.cgroup.cpu.cfs_period_us = 50000
        lxc.cgroup.cpu.cfs_quota_us = 10000
        lxc.cgroup.cpu.lat = 25
        .. cpu performance knobs ..
        lxc.cgroup.memory.limit_in_bytes = 4096000
        lxc.cgroup.memory.soft_limit_in_bytes = 1024000
        .. memory performance knobs ..
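To make the contrast in slide 13 concrete, here is a minimal C++ sketch of what translating intent into knobs might look like. It covers only the two memory values whose mapping the slide shows directly (limit to memory.limit_in_bytes, reservation to memory.soft_limit_in_bytes), plus the standard CFS arithmetic that quota divided by period gives the number of CPUs' worth of time a cgroup may use. `MemoryIntent`, `ToLxcMemoryConfig`, and `CfsCpuFraction` are hypothetical names for illustration and are not part of lmctfy; how lmctfy derives cpu.shares or the CFS values from `limit` and `max_limit` is not stated in the slides and is not modeled here.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for the memory part of a container spec (bytes).
// This is NOT the lmctfy MemorySpec type.
struct MemoryIntent {
  std::int64_t limit;
  std::int64_t reservation;
};

// Renders the two memory knobs shown on the slide as lxc config lines.
std::vector<std::string> ToLxcMemoryConfig(const MemoryIntent& memory) {
  return {
      "lxc.cgroup.memory.limit_in_bytes = " + std::to_string(memory.limit),
      "lxc.cgroup.memory.soft_limit_in_bytes = " +
          std::to_string(memory.reservation),
  };
}

// CFS bandwidth control: CPUs' worth of run time allowed per period.
// Assumes a finite quota; the kernel's "unlimited" value (-1) is not handled.
double CfsCpuFraction(std::int64_t quota_us, std::int64_t period_us) {
  assert(period_us > 0);
  return static_cast<double>(quota_us) / static_cast<double>(period_us);
}
```

With the slide's values, `ToLxcMemoryConfig({4096000, 1024000})` yields the two memory lines of the lxc config, and a quota of 10000 µs over a period of 50000 µs corresponds to one fifth of a CPU.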
  14. C++ API
      ::containers::lmctfy::ContainerApi
      ● Create
      ● Get
      ● Destroy
      ● Detect
      ● InitMachine
      ::containers::lmctfy::Container
      ● Update
      ● Run
      ● Notifications
      ● List (threads, PIDs, and subcontainers)
      ● Stats
      ● Pause/Resume
      ● KillAll
      CLI is a thin wrapper around the C++ API
  15. Container Names
      Path-like hierarchy of container names:
        Absolute: /parent/self
        Relative: self when in /parent

      Container Name                    Refers To
      /                                 The root top-level container
      /sys                              The sys top-level container
      /sys/sub                          The sub subcontainer of the sys top-level container
      . or ./                           The current container (current relative to the calling process)
      ..                                The parent container (parent relative to the calling process)
      ./foo_container or foo_container  The foo_container subcontainer of the current container
      /foo_container                    The foo_container top-level container
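The naming rules in slide 15 behave much like file system paths, and a small resolver makes them easy to check. The sketch below is an illustration only: `ResolveContainerName` is a hypothetical helper, not an lmctfy function, and it assumes the caller already knows the absolute name of its current container.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not part of lmctfy): resolves `name` against the
// caller's current container, following the rules listed on the slide.
std::string ResolveContainerName(const std::string& current,
                                 const std::string& name) {
  assert(!current.empty() && current[0] == '/');  // current must be absolute

  // An absolute name ignores the current container; a relative one extends it.
  const bool absolute = !name.empty() && name[0] == '/';
  const std::string joined = absolute ? name : current + "/" + name;

  std::vector<std::string> parts;
  std::istringstream stream(joined);
  std::string part;
  while (std::getline(stream, part, '/')) {
    if (part.empty() || part == ".") continue;  // skips "//", "./", trailing "/"
    if (part == "..") {
      if (!parts.empty()) parts.pop_back();  // parent; the root has no parent
      continue;
    }
    parts.push_back(part);
  }

  std::string resolved;
  for (const std::string& p : parts) resolved += "/" + p;
  return resolved.empty() ? "/" : resolved;
}
```

For a process in /sys, `foo_container` and `./foo_container` both resolve to /sys/foo_container, while `/foo_container` names a top-level container regardless of the caller. Treating `..` at the root as the root itself is a design choice of this sketch, not something the slides specify.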
  16. Roadmap
      Towards Version 1.0
      ● Improve VirtualHost support
      ● Root file systems
      ● Checkpoint restore
      ● Support and target most major distros
      ● Fully compatible with Docker’s use of containers
      Higher Layer
      ● Admission control and feasibility checks
      ● Monitoring, notifications, and statistics
      ● Tiers of quality of service guarantees
      Contributions Welcome!
  17. Questions?
      Repository: https://github.com/google/lmctfy/
      Mailing list: lmctfy@googlegroups.com
      Victor Marmol: vmarmol@google.com
      Rohit Jnagal: jnagal@google.com