Realizing Linux Containers (LXC)
Building Blocks, Underpinnings &
Motivations
Boden Russell – IBM Global Technology Services
(brussell@us.ibm.com)
Definitions
 Linux Containers (LXC for LinuX Containers) are lightweight virtual machines (VMs)
which are realized using features provided by a modern Linux kernel – VMs without
the hypervisor
 Containerization of:
– (Linux) Operating Systems
– Single or multiple applications
 LXC as a technology should not be confused with the LXC tools, a user space
toolset for creating & managing Linux Containers
 From wikipedia:
LXC (LinuX Containers) is an operating system–level virtualization method for running multiple
isolated Linux systems (containers) on a single control host… LXC provides operating system-level
virtualization not via a virtual machine, but rather provides a virtual environment that has its own
process and network space.
3/11/2014 2
Why LXC
 Provision in seconds / milliseconds
 Near bare metal runtime performance
 VM-like agility – it’s still “virtualization”
 Flexibility
– Containerize a “system”
– Containerize “application(s)”
 Lightweight
– Just enough Operating System (JeOS)
– Minimal per container penalty
 Open source – free – lower TCO
 Supported with OOTB modern Linux kernel
 Growing in popularity
“Linux Containers are poised as the next VM in our modern Cloud era…”
Provision time — manual: days; VM: minutes; LXC: seconds / milliseconds
[Chart: linpack performance @ 45000 — GFlops vs. number of vCPUs, LXC vs. bare metal (BM)]
[Charts: Google Trends interest over time for “LXC” and “docker”]
Hypervisors vs. Linux Containers
[Diagram: Type 1 hypervisor — hardware → hypervisor → VMs, each with its own OS, bins / libs & apps. Type 2 hypervisor — hardware → host OS → hypervisor → VMs. Linux Containers — hardware → host OS → containers, each holding only bins / libs & apps.]
Containers share the OS kernel of the host and thus are lightweight.
However, every container on a host runs on that same kernel.
Containers are isolated, but share the OS kernel and, where appropriate, libs / bins.
LXC Technology Stack
 LXCs are built on modern kernel features
– cgroups; limits, prioritization, accounting & control
– namespaces; process based resource isolation
– chroot; apparent root FS directory
– Linux Security Modules (LSM); Mandatory Access Control (MAC)
 User space interfaces for kernel functions
 LXC tools
– Tools to isolate process(es) virtualizing kernel resources
 LXC commoditization
– Dead easy LXC
– LXC virtualization
 Orchestration & management
– Scheduling across multiple hosts
– Monitoring
– Uptime
Linux cgroups
 History
– Work started in 2006 by Google engineers
– Merged into the upstream 2.6.24 kernel as LXC usage spread
– A number of features still a WIP
 Functionality
– Access; which devices can be used per cgroup
– Resource limiting; memory, CPU, device accessibility, block I/O, etc.
– Prioritization; who gets more of the CPU, memory, etc.
– Accounting; resource usage per cgroup
– Control; freezing & check pointing
– Injection; packet tagging
 Usage
– cgroup functionality exposed as “resource controllers” (aka “subsystems”)
– Subsystems mounted on FS
– Top-level subsystem mount is the root cgroup; all procs on host
– Directories under top-level mounts created per cgroup
– Procs put in tasks file for group assignment
– Interface via read / write pseudo files in group
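The read / write pseudo-file pattern above can be sketched in Python. `write_cgroup_value` and `read_cgroup_value` are hypothetical helper names, and the demo operates on a temp directory standing in for a real group directory (writing to a mounted cgroup FS such as /sys/fs/cgroup/memory/my-lxc requires root):

```python
import os
import tempfile

def write_cgroup_value(group_dir, pseudo_file, value):
    # cgroup pseudo files are manipulated with ordinary file I/O
    with open(os.path.join(group_dir, pseudo_file), "w") as f:
        f.write(str(value))

def read_cgroup_value(group_dir, pseudo_file):
    with open(os.path.join(group_dir, pseudo_file)) as f:
        return f.read().strip()

# Demo against a temp directory mimicking a per-container group directory
group = tempfile.mkdtemp()
write_cgroup_value(group, "memory.limit_in_bytes", 512 * 1024 * 1024)
write_cgroup_value(group, "tasks", os.getpid())  # group assignment via tasks

print(read_cgroup_value(group, "memory.limit_in_bytes"))  # 536870912
```

On a real host the only difference is the directory path; the interface is exactly this kind of file read / write.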
Linux cgroup Subsystems
 cgroups provided via kernel modules
– Not always loaded / provided by default
– Locate and load with modprobe
 Some features tied to kernel version
 See: https://www.kernel.org/doc/Documentation/cgroups/
Subsystem Tunable Parameters
blkio - Weighted proportional block I/O access. Group wide or per device.
- Per device hard limits on block I/O read/write, specified in bytes per second or I/O operations per second.
cpu - Time period (microseconds per second) a group should have CPU access.
- Group wide upper limit on CPU time per second.
- Weighted proportional value of relative CPU time for a group.
cpuset - CPUs (cores) the group can access.
- Memory nodes the group can access and migrate ability.
- Memory hardwall, pressure, spread, etc.
devices - Define which devices and access type a group can use.
freezer - Suspend/resume group tasks.
memory - Max memory limits for the group (in bytes).
- Memory swappiness, OOM control, hierarchy, etc.
hugetlb - Limit HugeTLB size usage.
- Per cgroup HugeTLB metrics.
net_cls - Tag network packets with a class ID.
- Use tc to prioritize tagged packets.
net_prio - Weighted proportional priority on egress traffic (per interface).
Linux cgroups FS Layout
[Diagram: cgroup subsystems mounted as directories on the file system, e.g. under /sys/fs/cgroup]
Linux cgroups Pseudo FS Interface
/sys/fs/cgroup/my-lxc
|-- blkio
| |-- blkio.io_merged
| |-- blkio.io_queued
| |-- blkio.io_service_bytes
| |-- blkio.io_serviced
| |-- blkio.io_service_time
| |-- blkio.io_wait_time
| |-- blkio.reset_stats
| |-- blkio.sectors
| |-- blkio.throttle.io_service_bytes
| |-- blkio.throttle.io_serviced
| |-- blkio.throttle.read_bps_device
| |-- blkio.throttle.read_iops_device
| |-- blkio.throttle.write_bps_device
| |-- blkio.throttle.write_iops_device
| |-- blkio.time
| |-- blkio.weight
| |-- blkio.weight_device
| |-- cgroup.clone_children
| |-- cgroup.event_control
| |-- cgroup.procs
| |-- notify_on_release
| |-- release_agent
| `-- tasks
|-- cpu
| |-- ...
|-- ...
`-- perf_event
echo "8:16 1048576" > blkio.throttle.read_bps_device

cat blkio.weight_device
dev weight
8:1 200
8:16 500
 Linux pseudo FS is the interface to cgroups
– Read / write to pseudo file(s) in your cgroup directory
 Some libs exist to interface with pseudo FS programmatically
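Since the pseudo files are plain text, reading state from them is straightforward. As an illustrative sketch (helper name `parse_weight_device` is hypothetical), parsing the `blkio.weight_device` listing shown above:

```python
def parse_weight_device(text):
    """Parse 'cat blkio.weight_device' output (a 'dev weight' header
    followed by 'major:minor weight' rows) into a dict."""
    weights = {}
    for line in text.strip().splitlines()[1:]:  # skip the header row
        dev, weight = line.split()
        weights[dev] = int(weight)
    return weights

listing = """dev weight
8:1 200
8:16 500"""
print(parse_weight_device(listing))  # {'8:1': 200, '8:16': 500}
```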
Linux cgroups: CPU Usage
 Use CPU shares (and other controls) to prioritize jobs / containers
 Carry out complex scheduling schemes
 Segment host resources
 Adhere to SLAs
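CPU shares are proportional weights that only matter under contention: a group's slice of CPU time is its share divided by the total shares of all runnable groups (the default weight is 1024). A small sketch of that arithmetic, with hypothetical group names:

```python
def cpu_share_fractions(shares):
    """Relative CPU time each group receives when all groups are
    runnable, given their cpu.shares values (default weight 1024)."""
    total = sum(shares.values())
    return {name: s / total for name, s in shares.items()}

# Two containers at the default weight and one at half weight:
fractions = cpu_share_fractions({"lxc-a": 1024, "lxc-b": 1024, "lxc-c": 512})
print(fractions["lxc-c"])  # 0.2
```

An idle group does not waste its allocation; the remaining runnable groups simply split the CPU in proportion to their weights.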
Linux cgroups: CPU Pinning
 Pin containers / jobs to CPU cores
 Carry out complex scheduling schemes
 Reduce core switching costs
 Adhere to SLAs
Linux cgroups: Device Access
 Limit device visibility; isolation
 Implement device access controls
– Secure sharing
 Segment device access
 Device whitelist / blacklist
LXC Realization: Linux cgroups
 cgroup created per container (in each cgroup subsystem)
 Prioritization, access, limits per container a la cgroup controls
 Per container metrics (bean counters)
Linux namespaces
 History
– Initial kernel patches in 2.4.19
– Recent 3.8 patches for user namespace support
– A number of features still a WIP
 Functionality
– Provide process level isolation of global resources
• MNT (mount points, file systems, etc.)
• PID (process)
• NET (NICs, routing, etc.)
• IPC (System V IPC resources)
• UTS (host & domain name)
• USER (UID + GID)
– Process(es) in namespace have illusion they are the only processes on the system
– Generally constructs exist to permit “connectivity” with parent namespace
 Usage
– Construct namespace(s) of desired type
– Create process(es) in namespace (typically done when creating namespace)
– If necessary, initialize “connectivity” to parent namespace
– Process(es) in the namespace internally function as if they are the only proc(s) on the system
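A process's namespace memberships can be inspected without any special tooling: on Linux, each namespace is exposed as a symlink under /proc/&lt;pid&gt;/ns, and two processes share a namespace exactly when those links resolve to the same inode. A minimal (Linux-only) sketch:

```python
import os

# Each namespace the current process belongs to appears as a symlink in
# /proc/self/ns; the link target encodes the namespace type and inode.
for ns in ("mnt", "pid", "net", "ipc", "uts", "user"):
    print(ns, os.readlink(f"/proc/self/ns/{ns}"))  # e.g. uts uts:[4026531838]
```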
Linux namespaces: Conceptual Overview
[Diagram: each container receives its own set of MNT, PID, NET, IPC, UTS and USER namespaces within the host kernel]
Linux namespaces: MNT namespace
 Isolates the mount table – per namespace mounts
 mount / unmount operations isolated to namespace
 Mount propagation
– Shared; mount objects propagate events to one another
– Slave; one mount propagates events to another, but not
vice versa
– Private; no event propagation (default)
 Unbindable mount forbids bind mounting itself
 Various tools / APIs support the mount namespace such
as the mount command
– Options to make shared, private, slave, etc.
– Mount with namespace support
 Typically used with chroot or pivot_root for
effective root FS isolation
[Diagram: per-namespace mount tables — “global” (root) namespace: /, /proc, /mnt/fsrd, /mnt/fsrw, /mnt/cdrom, /run2; “green” namespace: /, /proc, /mnt/greenfs, /mnt/fsrw, /mnt/cdrom; “red” namespace: /, /proc, /mnt/cdrom, /redns]
Linux namespaces: UTS namespace
 Per namespace
– Hostname
– NIS domain name
 Reported by commands such as hostname
 Processes in namespace can change UTS values – only
reflected in the child namespace
 Allows containers to have their own FQDN
[Diagram: per-namespace UTS values — “global” (root): globalhost / rootns.com; “green”: greenhost / greenns.org; “red”: redhost / redns.com]
Linux namespaces: PID namespace
 Per namespace PID mapping
– PID 1 in namespace not the same as PID 1 in parent namespace
– No PID conflicts between namespaces
– Effectively 2 PIDs; the PID in the namespace and the PID outside
the namespace
 Permits migrating namespace processes between hosts
while keeping same PID
 Only processes in the namespace are visible within the
namespace (visibility limited)
[Diagram: per-namespace process tables — “global” (root): PID 1 /sbin/init, 2 [kthreadd], 3 [ksoftirqd], 4 [cpuset], 5 /sbin/udevd; “green”: PID 1 /bin/bash, 2 /bin/vim; “red”: PID 1 /bin/bash, 2 python, 3 node]
Linux namespaces: IPC namespace
 System V IPC object & POSIX message queue isolation
between namespaces
– Semaphores
– Shared memory
– Message queues
 Parent namespace connectivity
– Signals
– Memory polling
– Sockets (if no NET namespace)
– Files / file descriptors (if no mount namespace)
– Events over pipe pair
[Diagram: per-namespace System V IPC tables — “global” (root): shared memory segments and semaphores owned by root and boden; “green” and “red”: their own, nearly empty, SHM / SEM / MSQ tables]
Linux namespaces: NET namespace
 Per namespace network objects
– Network devices (eths)
– Bridges
– Routing tables
– IP address(es)
– ports
– Etc
 Various commands support network namespace such as ip
 Connectivity to other namespaces
– veths – create veth pair, move one inside the namespace and
configure
– Acts as a pipe between the 2 namespaces
 LXCs can have their own IPs, routes, bridges, etc.
[Diagram: per-namespace network objects — “global” (root): lo, eth0, eth1, br0 and apps on ports 5000 / 6000 / 7000; “green”: lo, eth0 and apps on ports 1000 / 7000; “red”: lo, eth0 (down), eth1 and apps on ports 7000 / 9000 — note that port 7000 is reused in each namespace without conflict]
Linux namespaces: USER namespace
 A long work in progress – still development for XFS and other
FS support
– Significant security impacts
– A handful of security holes already found + fixed
 Two major features provided:
– Map UID / GID from outside the container to UID / GID inside the
container
– Permit non-root users to launch LXCs
– Distros rolling out phased support, with UID / GID mapping typically first
 First process in USER namespace has full CAPs; perform
initializations before other processes are created
– No CAPs in parent namespace
 UID / GID map can be pre-configured via FS
 Eventually USER namespace will mitigate many perceived LXC
security concerns
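The UID / GID mapping follows the semantics of /proc/&lt;pid&gt;/uid_map: each map entry covers a contiguous range of `count` IDs starting at `inside` (in the container) and maps them onto IDs starting at `outside` (on the host). A small sketch of that translation (`map_to_host_uid` is a hypothetical helper):

```python
def map_to_host_uid(container_uid, uid_map):
    """Apply uid_map semantics: entries are (inside, outside, count)
    tuples; each maps container IDs [inside, inside+count) onto host
    IDs starting at outside."""
    for inside, outside, count in uid_map:
        if inside <= container_uid < inside + count:
            return outside + (container_uid - inside)
    raise ValueError("UID not mapped in this namespace")

# e.g. one entry mapping container UIDs 0-65535 onto host 100000-165535:
uid_map = [(0, 100000, 65536)]
print(map_to_host_uid(0, uid_map))     # 100000 (container root is unprivileged on the host)
print(map_to_host_uid(1000, uid_map))  # 101000
```

This is why USER namespaces mitigate security concerns: a process that is root inside the container is an ordinary, unprivileged UID on the host.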
[Diagram: per-namespace UID / GID maps — “global” (root): root 0:0, ntp 104:109, mysql 105:110, boden 106:111; “green”: root 0:0, app 106:111; “red”: root 0:0, app 104:109]
LXC Realization: Linux namespaces
 A set of namespaces created for the container
 Container process(es) “executed” in the namespace set
 Process(es) in the container have isolated view of resources
 Connectivity to parent where needed (via lxc tooling)
Linux namespaces & cgroups: Availability
[Table: kernel versions in which each namespace and cgroup subsystem became available]
Note: user namespace support in
upstream kernel 3.8+, but
distributions rolling out phased
support:
- Map LXC UID/GID between
container and host
- Non-root LXC creation
Linux chroots
 Changes apparent root directory for process and children
– Search paths
– Relative directories
– Etc
 A chroot can be escaped given the proper capabilities, thus pivot_root is often used instead
– chroot; points the process’s file system root to a new directory
– pivot_root; detaches the new root and attaches it to the process root directory
 Often used when building system images
– Chroot to temp directory
– Download and install packages in chroot
– Compress chroot as a system root FS
 LXC realization
– Bind mount container root FS (image)
– Launch (unshare or clone) LXC init process in a new MNT namespace
– pivot_root to the bind mount (root FS)
Linux chroot vs pivot_root
 Using pivot_root with MNT namespace addresses escaping chroot concerns
 The pivot_root target directory becomes the “new root FS”
LXC Realization: Images
LXC images provide a flexible means to deliver only what you need – lightweight and minimal footprint
 Basic constraints
– Same architecture
– Same endian
– Linux’ish Operating System; you can run different Linux distros on same host
 Image types
– System; images intended to virtualize Operating System(s) – standard distro root FS less the
kernel
– Application; images intended to virtualize application(s) – only package apps + dependencies
(aka JeOS – Just enough Operating System)
 Bind mount host libs / bins into LXC to share host resources
 Container image init process
– Container init command provided on invocation – can be an application or a full fledged init
process
– Init script customized for image – skinny SysVinit, upstart, etc.
– Reduces overhead of lxc start-up and runtime foot print
 Various tools to build images
– SuSE Kiwi
– Debootstrap
– Etc.
 LXC tooling options often include numerous image templates
Linux Security Modules & MAC
 Linux Security Modules (LSM) – kernel modules which provide a framework for
Mandatory Access Control (MAC) security implementations
 MAC vs DAC
– In MAC, admin (user or process) assigns access controls to subject / initiator
• Most MAC implementations provide the notion of profiles
• Profiles define access restrictions and are said to “confine” a subject
– In DAC, resource owner (user) assigns access controls to individual resources
 Existing LSM implementations include: AppArmor, SELinux, GRSEC, etc.
Linux Capabilities & Other Security Measures
 Linux capabilities
– Per process privileges which define operational (sys call) access
– Typically checked based on process EUID and EGID
– Root processes (i.e. EUID = EGID = 0) bypass capability checks
 Capabilities can be assigned to LXC processes to restrict their privileges
 Other LXC security mitigations
– Reduce shared FS access using RO bind mounts
– Keep Linux kernel up to date
– User namespaces in 3.8+ kernel
• Allow to launch containers as non-root user
• Map UID / GID inside / outside of container
LXC Realization
[Diagram: a container is realized by combining cgroups, namespaces, chroot / pivot_root and LSM-based MAC]
LXC Tooling
 LXC is not a kernel feature – it’s a technology enabled via kernel features
– User space tooling required to manage LXCs effectively
 Numerous toolsets exist
– Then: add-on patches to upstream kernel due to slow kernel acceptance
– Now: upstream LXC feature support is growing – less need for patches
 More popular GNU Linux toolsets include libvirt-lxc and lxc (tools)
– OpenVZ is likely the most mature toolset, but it requires kernel patches
– Note: I would consider docker a commoditization of LXC
 Non-GNU Linux based LXC
– Solaris zones
– BSD jails
– Illumos / SmartOS (solaris derivatives)
– Etc.
LXC Industry Tooling
Libvirt-lxc
 Perhaps the simplest to learn through a familiar virsh interface
 Libvirt provides LXC support by connecting to lxc:///
 Many virsh commands work
• virsh -c lxc:/// define sample.xml
• virsh -c lxc:/// start sample
• virsh -c lxc:/// console sample
• virsh -c lxc:/// shutdown sample
• virsh -c lxc:/// undefine sample
 No snapshotting, templates…
 OpenStack support since Grizzly
 No VNC
 No Cinder support in Grizzly
 Config drive not supported
 Alternative means of accessing metadata
 Attached disk rather than http calls
<domain type='lxc'>
<name>sample</name>
<memory>32768</memory>
<os> <type>exe</type> <init>/init</init> </os>
<vcpu>1</vcpu>
<clock offset='utc'/>
<on_poweroff>destroy</on_poweroff>
<on_reboot>restart</on_reboot>
<on_crash>destroy</on_crash>
<devices>
<emulator>/usr/libexec/libvirt_lxc</emulator>
<filesystem type='mount'> <source dir='/opt/vm-1-root'/> <target dir='/'/> </filesystem>
<interface type='network'> <source network='default'/> </interface>
<console type='pty' />
</devices>
</domain>
LXC (tools)
 A little more functionality
 Supported by the major distributions
 LXC 1.0 recently released
– Cloning supported: lxc-clone
– Templates… btrfs
– lxc-create -t ubuntu -n CN creates a new ubuntu container
• “template” is downloaded from Ubuntu
• Some support for Fedora <= 14
• Debian is supported
– lxc-start -d -n CN starts the container
– lxc-destroy -n CN destroys the container
– /etc/lxc/lxc.conf has default settings
– /var/lib/lxc/CN is the default place for each container
LXC Commoditization: docker
 Young project with great vibrancy in the industry
 Currently based on unmodified LXC – but the goal is to make it dirt easy
 As of March 10th, 2014 at v0.9. Monthly releases, 1.0 should be ready for production use
 What docker adds to LXC
– Portable deployment across machines
• In Cloud terms, think of LXC as the hypervisor and docker as the Open Virtualization Appliance (OVA) and the provision engine
• Docker images can run unchanged on any platform supporting docker
– Application-centric
• User facing function geared towards application deployment, not VM analogs [!]
– Automatic build
• Create containers from build files
• Builders can use chef, maven, puppet, etc.
– Versioning support
• Think of git for docker containers
• Only delta of base container is tracked
– Component re-use
• Any container can be used as a base, specialized and saved
– Sharing
• Support for public/private repositories of containers
– Tools
• CLI / REST API for interacting with docker
• Vendors adding tools daily
 Docker containers are self contained – no more “dependency hell”
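The “automatic build” item refers to build files that docker consumes to produce an image layer by layer. As a purely illustrative, hypothetical example (package names and base image chosen only to echo the CentOS + apache test later in this deck):

```dockerfile
# Hypothetical build file: a CentOS base image running apache
FROM centos
RUN yum install -y httpd
EXPOSE 80
CMD ["/usr/sbin/httpd", "-D", "FOREGROUND"]
```

Each instruction produces a layer, which is what enables the versioning and component re-use bullets above: only the delta from the base is tracked, and any resulting image can itself be used as a base.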
Docker vs. LXC vs. Hypervisor
Docker: LXC Virtualization?
 Docker decouples the LXC provider from the operations
– LXC provider agnostic
 Docker “images” run anywhere docker is supported
– Portability
LXC Orchestration & Management
 Docker & libvirt-lxc in OpenStack
– Manage containers heterogeneously with traditional VMs… but not with the level of support & features we might like
 CoreOS
– Zero-touch admin Linux distro with docker images as the unit of operation
– Centralized key/value store to coordinate distributed environment
 Various other 3rd party apps
– Maestro for docker
– Shipyard for docker
– Fleet for CoreOS
– Etc.
 LXC migration
– Container migration via criu
 But…
– Still no great way to tie all virtual resources together with LXC – e.g. storage + networking
• IMO; an area which needs focus for LXC to become more generally applicable
Docker in OpenStack
 Introduced in Havana
– A nova driver to integrate with docker REST API
– A Glance translator to integrate containers with Glance
• A docker container which implements a docker registry API
 The claim is that docker will become a “group A” hypervisor
– In its current form it’s effectively a “tech preview”
LXC Evaluation
 Goal: validate the promise with an eye towards practical applicability
 Dimensions evaluated:
– Runtime performance benefits
– Density / footprint
– Workload isolation
– Ease of use and tooling
– Cloud Integration
– Security
– Ease of use / feature set
NOTE: tests were performed in a passive manner – deeper analysis is warranted.
Runtime Performance Benefits - CPU
 Tested using libvirt lxc on Ubuntu 13.10 using linpack 11.1
 Cpuset was used to limit the number of CPUs that the containers could use
 The performance overhead falls within the error of measurement of this test
 In fact, bare metal performance was lower than some container results
[Chart: linpack performance @ 45000 — GFlops vs. number of vCPUs; annotated points: 220.77, bare metal 220.5 @ 32 vCPUs, 220.9 @ 31 vCPUs]
Runtime Performance Benefits – I/O
 I/O Tests using libvirt lxc show a < 1 % degradation
 Tested with a pass-through mount
Sync read I/O test: rw=read, size=1024m, bs=128m, direct=1, sync=1
Sync write I/O test: rw=write, size=1024m, bs=128m, direct=1, sync=1

I/O throughput (MB/s): lxc write 1711.2; bare metal write 1724.9; lxc read 1626.4; bare metal read 1633.4
Runtime Performance Benefits – Block I/O
 Tested with [standard] AUFS
Density & Footprint – libvirt-lxc
Starting 500 containers
Mon Nov 11 13:38:49 CST 2013 ... all threads done
in 157
(sequential I/O bound)
Stopping 500 containers
Mon Nov 11 13:42:20 CST 2013 ... all threads done
in 162
Active memory delta: 417.2 KB
Starting 1000 containers
Mon Nov 11 13:59:19 CST 2013 ... all threads done
in 335
Stopping 1000 containers
Mon Nov 11 14:14:26 CST 2013 ... all threads done
in 339
Active memory delta: 838.4KB
Using libvirt lxc on RHEL 6.4, we found that empty container overhead was just 840 bytes. A container could be started in
about 330ms, which was an I/O bound process
This represents the lower limit of lxc footprint
Containers ran /bin/sh
Density & Footprint – Docker
 In this test, we created 150 Docker containers with CentOS, started
apache & then removed them
 Average footprint was ~10MB per container
 Average start time was 240ms
 Serially booting 150 containers which run apache
– Takes on average 36 seconds
– Consumes about 2 % of the CPU
– Negligible HDD space
– Spawns around 225 processes for create
– Around 1.5 GB of memory ~ 10 MB per container
– Expect faster results once docker addresses performance topics in the
next few months
 Serially destroying 150 containers running apache
– On average takes 9 seconds
– We would expect destroy to be faster – likely a docker bug and will triage
with the docker community
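The headline figures above are simple averages over the run; a back-of-envelope check (using only the numbers reported in this test) ties them together:

```python
# Density figures from this test: 150 containers, ~1.5 GB total memory,
# 36 s serial boot.
containers = 150
total_memory_mb = 1.5 * 1024  # ~1.5 GB expressed in MB

per_container_mb = total_memory_mb / containers
avg_start_ms = 36 * 1000 / containers

print(per_container_mb)  # 10.24 -> the "~10 MB per container" above
print(avg_start_ms)      # 240.0 -> the "240ms" average start time
```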
[Charts: I/O and CPU profiles during container creation and deletion]
Workload Isolation: Examples
 Using the blkio cgroup (lxc.cgroup.blkio.throttle.read_bps_device) to cap the I/O of a container
 Both the total bps and iops_device on read / write could be capped
 Better async BIO support in kernel 3.10+
 We used fio with oflag=sync, direct to test the ability to cap the reads:
– With limit set to 6 MB / second
READ: io=131072KB, aggrb=6147KB/s, minb=6295KB/s, maxb=6295KB/s, mint=21320msec,
maxt=21320msec
– With limit set to 60 MB / second
READ: io=131072KB, aggrb=61134KB/s, minb=62601KB/s, maxb=62601KB/s,
mint=2144msec, maxt=2144msec
– No read limit
READ: io=131072KB, aggrb=84726KB/s, minb=86760KB/s, maxb=86760KB/s,
mint=1547msec, maxt=1547msec
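The fio aggrb figures above are just total I/O divided by elapsed time, which confirms the caps held (`aggregate_bandwidth_kbs` is a hypothetical helper name):

```python
def aggregate_bandwidth_kbs(io_kb, elapsed_ms):
    """fio's aggrb is total I/O divided by elapsed wall time."""
    return io_kb / (elapsed_ms / 1000.0)

# Figures from the capped runs above: 131072 KB read in 21320 ms
# (6 MB/s cap) and in 2144 ms (60 MB/s cap).
print(round(aggregate_bandwidth_kbs(131072, 21320)))  # 6148  (~ the 6 MB/s cap)
print(round(aggregate_bandwidth_kbs(131072, 2144)))   # 61134 (~ the 60 MB/s cap)
```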
OpenStack VM Operations
NOTE: orchestration / management overheads cap LXC performance
Who’s Using LXC
 Google app engine & infra is said to be using some form of LXC
 RedHat OpenShift
 dotCloud (now docker inc)
 CloudFoundry (early versions)
 Rackspace Cloud Databases
– Outperforms AWS (Xen) according to perf results
 Parallels Virtuozzo (commercial product)
 Etc..
LXC Gaps
There are gaps…
 Lack of industry tooling / support
 Live migration still a WIP
 Full orchestration across resources (compute / storage / networking)
 Fears of security
 Not a well known technology… yet
 Integration with existing virtualization and Cloud tooling
 Not much / any industry standards
 Missing skillset
 Slower upstream support due to kernel dev process
 Etc.
LXC: Use Cases For Traditional VMs
There are still use cases where traditional VMs are warranted.
 Virtualization of non Linux based OSs
– Windows
– AIX
– Etc.
 LXC not supported on host
 VM requires unique kernel setup which is not applicable to other VMs on the host
(i.e. per VM kernel config)
 Etc.
LXC Recommendations
 Public & private Clouds
– Increase VM density 2-3x
– Accommodate Big Data & HPC type applications
– Move the support of Linux distros to containers
 PaaS & managed services
– Realize “as a Service” and managed services using LXC
 Operations management
– Ease management + increase agility of bare metal components
 DevOps
 Development & test
– Sandboxes
– Dev / test envs
– Etc.
 If you are just starting with LXC and don’t have in-depth skillset
– Start with LXC for private solutions (trusted code)
LXC Resources
 https://www.kernel.org/doc/Documentation/cgroups/
 http://www.blaess.fr/christophe/2012/01/07/linux-3-2-cfs-cpu-bandwidth-english-version/
 http://atmail.com/kb/2009/throttling-bandwidth/
 https://access.redhat.com/site/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Resource_Management_Guide/ch-
Subsystems_and_Tunable_Parameters.html
 http://www.janoszen.com/2013/02/06/limiting-linux-processes-cgroups-explained/
 http://www.mattfischer.com/blog/?p=399
 http://oakbytes.wordpress.com/2012/09/02/cgroup-cpu-allocation-cpu-shares-examples/
 http://fritshoogland.wordpress.com/2012/12/15/throttling-io-with-linux/
 https://lwn.net/Articles/531114/
 https://www.kernel.org/doc/Documentation/filesystems/sharedsubtree.txt
 http://www.ibm.com/developerworks/library/l-mount-namespaces/
 http://blog.endpoint.com/2012/01/linux-unshare-m-for-per-process-private.html
 http://timothysc.github.io/blog/2013/02/22/perprocess/
 http://www.evolware.org/?p=293
 http://s3hh.wordpress.com/2012/05/10/user-namespaces-available-to-play/
 http://libvirt.org/drvlxc.html
 https://help.ubuntu.com/lts/serverguide/lxc.html
 https://linuxcontainers.org/
 https://wiki.ubuntu.com/AppArmor
 http://linux.die.net/man/7/capabilities
 http://docs.openstack.org/trunk/config-reference/content/lxc.html
 https://wiki.openstack.org/wiki/Docker
 https://www.docker.io/
 http://marceloneves.org/papers/pdp2013-containers.pdf
 http://openvz.org/Main_Page
 http://criu.org/Main_Page
Vaibhav Sharma
 
The building blocks of docker.
The building blocks of docker.The building blocks of docker.
The building blocks of docker.
Chafik Belhaoues
 
Lxc – next gen virtualization for cloud intro (cloudexpo)
Lxc – next gen virtualization for cloud   intro (cloudexpo)Lxc – next gen virtualization for cloud   intro (cloudexpo)
Lxc – next gen virtualization for cloud intro (cloudexpo)
Boden Russell
 
Lightweight Virtualization in Linux
Lightweight Virtualization in LinuxLightweight Virtualization in Linux
Lightweight Virtualization in Linux
Sadegh Dorri N.
 
First steps on CentOs7
First steps on CentOs7First steps on CentOs7
First steps on CentOs7
Marc Cortinas Val
 
Linux High Availability Overview - openSUSE.Asia Summit 2015
Linux High Availability Overview - openSUSE.Asia Summit 2015 Linux High Availability Overview - openSUSE.Asia Summit 2015
Linux High Availability Overview - openSUSE.Asia Summit 2015
Roger Zhou 周志强
 
Introduction to linux containers
Introduction to linux containersIntroduction to linux containers
Introduction to linux containers
Google
 
20240201 [HPC Containers] Rootless Containers.pdf
20240201 [HPC Containers] Rootless Containers.pdf20240201 [HPC Containers] Rootless Containers.pdf
20240201 [HPC Containers] Rootless Containers.pdf
Akihiro Suda
 
2. Vagin. Linux containers. June 01, 2013
2. Vagin. Linux containers. June 01, 20132. Vagin. Linux containers. June 01, 2013
2. Vagin. Linux containers. June 01, 2013
ru-fedora-moscow-2013
 
Visual comparison of Unix-like systems & Virtualisation
Visual comparison of Unix-like systems & VirtualisationVisual comparison of Unix-like systems & Virtualisation
Visual comparison of Unix-like systems & Virtualisation
wangyuanyi
 
Fedora Virtualization Day: Linux Containers & CRIU
Fedora Virtualization Day: Linux Containers & CRIUFedora Virtualization Day: Linux Containers & CRIU
Fedora Virtualization Day: Linux Containers & CRIUAndrey Vagin
 
Ospresentation 120112074429-phpapp02 (1)
Ospresentation 120112074429-phpapp02 (1)Ospresentation 120112074429-phpapp02 (1)
Ospresentation 120112074429-phpapp02 (1)Vivian Vhaves
 
Evolution of containers to kubernetes
Evolution of containers to kubernetesEvolution of containers to kubernetes
Evolution of containers to kubernetes
Krishna-Kumar
 
Linux container, namespaces & CGroup.
Linux container, namespaces & CGroup. Linux container, namespaces & CGroup.
Linux container, namespaces & CGroup.
Neeraj Shrimali
 
Oracle rac 10g best practices
Oracle rac 10g best practicesOracle rac 10g best practices
Oracle rac 10g best practicesHaseeb Alam
 

Similar to Realizing Linux Containers (LXC) (20)

Linux Container Brief for IEEE WG P2302
Linux Container Brief for IEEE WG P2302Linux Container Brief for IEEE WG P2302
Linux Container Brief for IEEE WG P2302
 
Linux containers – next gen virtualization for cloud (atl summit) ar4 3 - copy
Linux containers – next gen virtualization for cloud (atl summit) ar4 3 - copyLinux containers – next gen virtualization for cloud (atl summit) ar4 3 - copy
Linux containers – next gen virtualization for cloud (atl summit) ar4 3 - copy
 
Introduction to OS LEVEL Virtualization & Containers
Introduction to OS LEVEL Virtualization & ContainersIntroduction to OS LEVEL Virtualization & Containers
Introduction to OS LEVEL Virtualization & Containers
 
The building blocks of docker.
The building blocks of docker.The building blocks of docker.
The building blocks of docker.
 
Lxc – next gen virtualization for cloud intro (cloudexpo)
Lxc – next gen virtualization for cloud   intro (cloudexpo)Lxc – next gen virtualization for cloud   intro (cloudexpo)
Lxc – next gen virtualization for cloud intro (cloudexpo)
 
Lightweight Virtualization in Linux
Lightweight Virtualization in LinuxLightweight Virtualization in Linux
Lightweight Virtualization in Linux
 
First steps on CentOs7
First steps on CentOs7First steps on CentOs7
First steps on CentOs7
 
Linux High Availability Overview - openSUSE.Asia Summit 2015
Linux High Availability Overview - openSUSE.Asia Summit 2015 Linux High Availability Overview - openSUSE.Asia Summit 2015
Linux High Availability Overview - openSUSE.Asia Summit 2015
 
Introduction to linux containers
Introduction to linux containersIntroduction to linux containers
Introduction to linux containers
 
20240201 [HPC Containers] Rootless Containers.pdf
20240201 [HPC Containers] Rootless Containers.pdf20240201 [HPC Containers] Rootless Containers.pdf
20240201 [HPC Containers] Rootless Containers.pdf
 
2. Vagin. Linux containers. June 01, 2013
2. Vagin. Linux containers. June 01, 20132. Vagin. Linux containers. June 01, 2013
2. Vagin. Linux containers. June 01, 2013
 
Visual comparison of Unix-like systems & Virtualisation
Visual comparison of Unix-like systems & VirtualisationVisual comparison of Unix-like systems & Virtualisation
Visual comparison of Unix-like systems & Virtualisation
 
Fedora Virtualization Day: Linux Containers & CRIU
Fedora Virtualization Day: Linux Containers & CRIUFedora Virtualization Day: Linux Containers & CRIU
Fedora Virtualization Day: Linux Containers & CRIU
 
Ospresentation 120112074429-phpapp02 (1)
Ospresentation 120112074429-phpapp02 (1)Ospresentation 120112074429-phpapp02 (1)
Ospresentation 120112074429-phpapp02 (1)
 
Evolution of containers to kubernetes
Evolution of containers to kubernetesEvolution of containers to kubernetes
Evolution of containers to kubernetes
 
Linux container, namespaces & CGroup.
Linux container, namespaces & CGroup. Linux container, namespaces & CGroup.
Linux container, namespaces & CGroup.
 
Oracle rac 10g best practices
Oracle rac 10g best practicesOracle rac 10g best practices
Oracle rac 10g best practices
 
Ubuntu OS Presentation
Ubuntu OS PresentationUbuntu OS Presentation
Ubuntu OS Presentation
 
Libra Library OS
Libra Library OSLibra Library OS
Libra Library OS
 
Studies
StudiesStudies
Studies
 

Recently uploaded

LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
DanBrown980551
 
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Jeffrey Haguewood
 
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Ramesh Iyer
 
"Impact of front-end architecture on development cost", Viktor Turskyi
"Impact of front-end architecture on development cost", Viktor Turskyi"Impact of front-end architecture on development cost", Viktor Turskyi
"Impact of front-end architecture on development cost", Viktor Turskyi
Fwdays
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
BookNet Canada
 
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
Product School
 
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance
 
When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...
Elena Simperl
 
Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...
Product School
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
Laura Byrne
 
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Thierry Lestable
 
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
Product School
 
Key Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdfKey Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdf
Cheryl Hung
 
Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*
Frank van Harmelen
 
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMsTo Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
Paul Groth
 
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
Guy Korland
 
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Tobias Schneck
 
Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........
Alison B. Lowndes
 
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualitySoftware Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Inflectra
 

Recently uploaded (20)

LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
 
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
 
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
 
"Impact of front-end architecture on development cost", Viktor Turskyi
"Impact of front-end architecture on development cost", Viktor Turskyi"Impact of front-end architecture on development cost", Viktor Turskyi
"Impact of front-end architecture on development cost", Viktor Turskyi
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
 
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
 
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
 
When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...
 
Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
 
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
 
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
 
Key Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdfKey Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdf
 
Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*
 
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMsTo Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
 
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
 
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
 
Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........
 
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualitySoftware Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
 

Realizing Linux Containers (LXC)

  • 1. Realizing Linux Containers (LXC) Building Blocks, Underpinnings & Motivations Boden Russell – IBM Global Technology Services (brussell@us.ibm.com)
  • 2. Definitions  Linux Containers (LXC for LinuX Containers) are lightweight virtual machines (VMs) which are realized using features provided by a modern Linux kernel – VMs without the hypervisor  Containerization of: – (Linux) Operating Systems – Single or multiple applications  LXC as a technology not to be confused with LXC (tools) which are a user space toolset for creating & managing Linux Containers  From wikipedia: LXC (LinuX Containers) is an operating system–level virtualization method for running multiple isolated Linux systems (containers) on a single control host… LXC provides operating system-level virtualization not via a virtual machine, but rather provides a virtual environment that has its own process and network space. 3/11/2014 2
  • 3. Why LXC  Provision in seconds / milliseconds  Near bare metal runtime performance  VM-like agility – it’s still “virtualization”  Flexibility – Containerize a “system” – Containerize “application(s)”  Lightweight – Just enough Operating System (JeOS) – Minimal per container penalty  Open source – free – lower TCO  Supported with OOTB modern Linux kernel  Growing in popularity 3/11/2014 3 “Linux Containers are poised to be the next VM in our modern Cloud era…” [Table: provision time – manual: days; VM: minutes; LXC: seconds / milliseconds] [Chart: linpack performance @ 45000 – GFlops vs. vCPUs, bare metal vs. LXC] [Charts: Google Trends interest over time for LXC and for docker]
  • 4. Hypervisors vs. Linux Containers [Diagram: Type 1 Hypervisor – hardware → hypervisor → VMs, each with its own Operating System, bins / libs and apps; Type 2 Hypervisor – hardware → host Operating System → hypervisor → VMs; Linux Containers – hardware → host Operating System → containers holding only bins / libs and apps] 3/11/2014 4 Containers share the OS kernel of the host and thus are lightweight. However, each container must use the same OS kernel as the host. Containers are isolated, but share the OS and, where appropriate, libs / bins.
  • 5. LXC Technology Stack  LXCs are built on modern kernel features – cgroups; limits, prioritization, accounting & control – namespaces; process based resource isolation – chroot; apparent root FS directory – Linux Security Modules (LSM); Mandatory Access Control (MAC)  User space interfaces for kernel functions  LXC tools – Tools to isolate process(es) virtualizing kernel resources  LXC commoditization – Dead easy LXC – LXC virtualization  Orchestration & management – Scheduling across multiple hosts – Monitoring – Uptime 3/11/2014 5
  • 6. Linux cgroups  History – Work started in 2006 by Google engineers – Merged into the upstream 2.6.24 kernel due to widespread LXC usage – A number of features still a WIP  Functionality – Access; which devices can be used per cgroup – Resource limiting; memory, CPU, device accessibility, block I/O, etc. – Prioritization; who gets more of the CPU, memory, etc. – Accounting; resource usage per cgroup – Control; freezing & check pointing – Injection; packet tagging  Usage – cgroup functionality exposed as “resource controllers” (aka “subsystems”) – Subsystems mounted on FS – Top-level subsystem mount is the root cgroup; all procs on host – Directories under top-level mounts created per cgroup – Procs put in tasks file for group assignment – Interface via read / write pseudo files in group 3/11/2014 6
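The usage flow above (subsystem mounted on the FS, a directory created per cgroup, PIDs dropped into the tasks file, limits set via pseudo files) can be sketched in Python. This is a minimal illustration that runs against a scratch directory standing in for a mounted cgroup v1 subsystem, since writing under the real /sys/fs/cgroup requires root; the my-lxc group name and the cpu.shares value are just examples.

```python
import os
import tempfile

def make_cgroup(subsys_root, name, limits, pids):
    """Create a cgroup directory, write its limit files, and assign tasks.

    subsys_root stands in for a mounted subsystem, e.g. /sys/fs/cgroup/cpu.
    """
    path = os.path.join(subsys_root, name)
    os.makedirs(path, exist_ok=True)
    # each tunable is a pseudo file; writing it applies the setting
    for pseudo_file, value in limits.items():
        with open(os.path.join(path, pseudo_file), "w") as f:
            f.write(str(value))
    # processes are assigned by appending their PID to the tasks file
    with open(os.path.join(path, "tasks"), "a") as f:
        for pid in pids:
            f.write("%d\n" % pid)
    return path

# demo against a temp dir (on a real host: /sys/fs/cgroup/cpu, as root)
root = tempfile.mkdtemp()
g = make_cgroup(root, "my-lxc", {"cpu.shares": 512}, [os.getpid()])
print(open(os.path.join(g, "cpu.shares")).read())
```

On a real host the same reads and writes go through the kernel's cgroup pseudo FS rather than plain files, but the interaction pattern is identical.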
  • 7. Linux cgroup Subsystems  cgroups provided via kernel modules – Not always loaded / provided by default – Locate and load with modprobe  Some features tied to kernel version  See: https://www.kernel.org/doc/Documentation/cgroups/ 3/11/2014 7 Subsystem Tunable Parameters blkio - Weighted proportional block I/O access. Group wide or per device. - Per device hard limits on block I/O read/write specified as bytes per second or IOPS per second. cpu - Time period (microseconds per second) a group should have CPU access. - Group wide upper limit on CPU time per second. - Weighted proportional value of relative CPU time for a group. cpuset - CPUs (cores) the group can access. - Memory nodes the group can access and migrate ability. - Memory hardwall, pressure, spread, etc. devices - Define which devices and access type a group can use. freezer - Suspend/resume group tasks. memory - Max memory limits for the group (in bytes). - Memory swappiness, OOM control, hierarchy, etc.. hugetlb - Limit HugeTLB size usage. - Per cgroup HugeTLB metrics. net_cls - Tag network packets with a class ID. - Use tc to prioritize tagged packets. net_prio - Weighted proportional priority on egress traffic (per interface).
  • 8. Linux cgroups FS Layout 3/11/2014 8
  • 9. Linux cgroups Pseudo FS Interface 3/11/2014 9 /sys/fs/cgroup/my-lxc |-- blkio | |-- blkio.io_merged | |-- blkio.io_queued | |-- blkio.io_service_bytes | |-- blkio.io_serviced | |-- blkio.io_service_time | |-- blkio.io_wait_time | |-- blkio.reset_stats | |-- blkio.sectors | |-- blkio.throttle.io_service_bytes | |-- blkio.throttle.io_serviced | |-- blkio.throttle.read_bps_device | |-- blkio.throttle.read_iops_device | |-- blkio.throttle.write_bps_device | |-- blkio.throttle.write_iops_device | |-- blkio.time | |-- blkio.weight | |-- blkio.weight_device | |-- cgroup.clone_children | |-- cgroup.event_control | |-- cgroup.procs | |-- notify_on_release | |-- release_agent | `-- tasks |-- cpu | |-- ... |-- ... `-- perf_event echo "8:16 1048576" > blkio.throttle.read_bps_device cat blkio.weight_device dev weight 8:1 200 8:16 500  Linux pseudo FS is the interface to cgroups – Read / write to pseudo file(s) in your cgroup directory  Some libs exist to interface with pseudo FS programmatically
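The echo above writes a value of the form "major:minor bytes-per-second" into blkio.throttle.read_bps_device. A small helper (hypothetical, purely for illustration) makes that encoding explicit:

```python
def bps_throttle(major, minor, mebibytes_per_sec):
    """Encode a blkio.throttle.*_bps_device value: '<major>:<minor> <bytes/s>'."""
    return "%d:%d %d" % (major, minor, mebibytes_per_sec * 1024 * 1024)

# a 1 MiB/s read limit on device 8:16, matching the slide's echo example
print(bps_throttle(8, 16, 1))  # 8:16 1048576
```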
  • 10. Linux cgroups: CPU Usage  Use CPU shares (and other controls) to prioritize jobs / containers  Carry out complex scheduling schemes  Segment host resources  Adhere to SLAs 3/11/2014 10
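cpu.shares values are relative weights, not absolute limits: under full contention each cgroup receives CPU time in proportion to its share of the sum of all weights. A sketch of that arithmetic (the group names are made up):

```python
def cpu_fraction(shares):
    """Relative CPU each cgroup gets under full contention, from cpu.shares weights."""
    total = sum(shares.values())
    return {name: s / total for name, s in shares.items()}

# the default weight is 1024; a 512-share group gets half as much CPU
print(cpu_fraction({"gold": 1024, "bronze": 512}))
```

Note that when the host is not contended, a low-share group may still use idle CPU; shares only bite when groups compete.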
  • 11. Linux cgroups: CPU Pinning  Pin containers / jobs to CPU cores  Carry out complex scheduling schemes  Reduce core switching costs  Adhere to SLAs 3/11/2014 11
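Pinning is done by writing a core list such as "0-3,7" into the cpuset.cpus pseudo file of the container's cgroup. As an illustration (not part of any real tool), a helper that renders a set of cores in that comma-and-range syntax:

```python
def cpuset_spec(cores):
    """Render a core set as a cpuset.cpus string, e.g. [0, 1, 2, 3, 7] -> '0-3,7'."""
    cores = sorted(set(cores))
    runs, start = [], None
    for i, c in enumerate(cores):
        if start is None:
            start = c
        # close the run when the next core is not consecutive
        if i + 1 == len(cores) or cores[i + 1] != c + 1:
            runs.append(str(start) if start == c else "%d-%d" % (start, c))
            start = None
    return ",".join(runs)

print(cpuset_spec([0, 1, 2, 3, 7]))  # 0-3,7
```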
  • 12. Linux cgroups: Device Access  Limit device visibility; isolation  Implement device access controls – Secure sharing  Segment device access  Device whitelist / blacklist 3/11/2014 12
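The devices controller consumes whitelist entries of the form "type major:minor access" (e.g. "c 1:3 rwm" for character device 1:3 with read, write and mknod rights, and "*" as a wildcard). A sketch of how such a whitelist is evaluated — this mimics the entry format, not the kernel's implementation:

```python
def device_allowed(entries, dev_type, major, minor, access):
    """Check a 'type major:minor access' whitelist (e.g. 'c 1:3 rwm').

    'a' matches any device type; '*' matches any major or minor number.
    """
    for entry in entries:
        etype, nums, eaccess = entry.split()
        emajor, eminor = nums.split(":")
        if etype not in ("a", dev_type):
            continue
        if emajor not in ("*", str(major)) or eminor not in ("*", str(minor)):
            continue
        # every requested access mode must be granted by the entry
        if all(ch in eaccess for ch in access):
            return True
    return False

# whitelist only /dev/null (char device 1:3)
wl = ["c 1:3 rwm"]
print(device_allowed(wl, "c", 1, 3, "rw"))  # True
print(device_allowed(wl, "b", 8, 0, "r"))   # False
```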
  • 13. LXC Realization: Linux cgroups  cgroup created per container (in each cgroup subsystem)  Prioritization, access, limits per container a la cgroup controls  Per container metrics (bean counters) 3/11/2014 13
  • 14. Linux namespaces  History – Initial kernel patches in 2.4.19 – Recent 3.8 patches for user namespace support – A number of features still a WIP  Functionality – Provide process level isolation of global resources • MNT (mount points, file systems, etc.) • PID (process) • NET (NICs, routing, etc.) • IPC (System V IPC resources) • UTS (host & domain name) • USER (UID + GID) – Process(es) in namespace have illusion they are the only processes on the system – Generally constructs exist to permit “connectivity” with parent namespace  Usage – Construct namespace(s) of desired type – Create process(es) in namespace (typically done when creating namespace) – If necessary, initialize “connectivity” to parent namespace – Process(es) in name space internally function as if they are only proc(s) on system 3/11/2014 14
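On a running system, each namespace a process belongs to appears as a symlink under /proc/<pid>/ns whose target looks like "pid:[4026531836]"; two processes are in the same namespace exactly when those inode numbers match. The parsing below is demonstrated on literal strings so it runs anywhere, but the same function works on os.readlink("/proc/self/ns/pid") on a Linux host:

```python
def parse_ns_link(target):
    """Split a /proc/<pid>/ns/* symlink target like 'pid:[4026531836]'."""
    kind, _, rest = target.partition(":")
    return kind, int(rest.strip("[]"))

a = parse_ns_link("pid:[4026531836]")
b = parse_ns_link("pid:[4026531836]")
c = parse_ns_link("pid:[4026532201]")
# equal namespace inodes -> the processes share that namespace
print(a == b, a == c)  # True False
```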
  • 15. Linux namespaces: Conceptual Overview 3/11/2014 15
  • 16. Linux namespaces: MNT namespace  Isolates the mount table – per namespace mounts  mount / unmount operations isolated to namespace  Mount propagation – Shared; mount objects propagate events to one another – Slave; one mount propagates events to another, but not vice versa – Private; no event propagation (default)  Unbindable mount forbids bind mounting itself  Various tools / APIs support the mount namespace such as the mount command – Options to make shared, private, slave, etc. – Mount with namespace support  Typically used with chroot or pivot_root for effective root FS isolation 3/11/2014 16 “global” (i.e. root) namespace “green” namespace “red” namespace MNT NS / /proc /mnt/fsrd /mnt/fsrw /mnt/cdrom /run2 MNT NS / /proc /mnt/greenfs /mnt/fsrw /mnt/cdrom MNT NS / /proc /mnt/cdrom /redns
  • 17. Linux namespaces: UTS namespace  Per namespace – Hostname – NIS domain name  Reported by commands such as hostname  Processes in namespace can change UTS values – only reflected in the child namespace  Allows containers to have their own FQDN 3/11/2014 17 “global” (i.e. root) namespace “green” namespace “red” namespace UTS NS globalhost rootns.com UTS NS greenhost greenns.org UTS NS redhost redns.com
  • 18. Linux namespaces: PID namespace  Per namespace PID mapping – PID 1 in namespace not the same as PID 1 in parent namespace – No PID conflicts between namespaces – Effectively 2 PIDs; the PID in the namespace and the PID outside the namespace  Permits migrating namespace processes between hosts while keeping same PID  Only processes in the namespace are visible within the namespace (visibility limited) 3/11/2014 18 “global” (i.e. root) namespace “green” namespace “red” namespace PID NS PID COMMAND 1 /sbin/init 2 [kthreadd] 3 [ksoftirqd] 4 [cpuset] 5 /sbin/udevd PID NS PID COMMAND 1 /bin/bash 2 /bin/vim PID NS PID COMMAND 1 /bin/bash 2 python 3 node
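The "effectively 2 PIDs" point above can be pictured as a translation table the kernel keeps per PID namespace: the first process attached becomes PID 1 (the namespace's init), while it keeps its ordinary PID in the parent namespace. This is a conceptual sketch only — the host PID 23411 is invented for the example:

```python
class PidNamespace:
    """Conceptual model of the per-namespace PID mapping (not a kernel API)."""

    def __init__(self):
        self._next = 1        # PID 1 is the namespace's init process
        self.to_global = {}   # namespace-local PID -> parent-namespace PID

    def attach(self, global_pid):
        local = self._next
        self._next += 1
        self.to_global[local] = global_pid
        return local

ns = PidNamespace()
init_local = ns.attach(23411)  # host PID 23411 is PID 1 inside the namespace
print(init_local, ns.to_global[1])  # 1 23411
```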
  • 19. Linux namespaces: IPC namespace  System V IPC object & POSIX message queue isolation between namespaces – Semaphores – Shared memory – Message queues  Parent namespace connectivity – Signals – Memory polling – Sockets (if no NET namespace) – Files / file descriptors (if no mount namespace) – Events over pipe pair 3/11/2014 19 “global” (i.e. root) namespace “green” namespace “red” namespace IPC NS SHMID OWNER 32452 root 43321 boden SEMID OWNER 0 root 1 boden IPC NS SHMID OWNER SEMID OWNER 0 root IPC NS SHMID OWNER SEMID OWNER MSQID OWNER
  • 20. Linux namespaces: NET namespace  Per namespace network objects – Network devices (eths) – Bridges – Routing tables – IP address(es) – ports – Etc  Various commands support network namespace such as ip  Connectivity to other namespaces – veths – create veth pair, move one inside the namespace and configure – Acts as a pipe between the 2 namespaces  LXCs can have their own IPs, routes, bridges, etc. 3/11/2014 20 “global” (i.e. root) namespace “green” namespace “red” namespace NET NS lo: UNKNOWN… eth0: UP… eth1: UP… br0: UP… app1 IP:5000 app2 IP:6000 app3 IP:7000 NET NS lo: UNKNOWN… eth0: UP… app1 IP:1000 app2 IP:7000 NET NS lo: UNKNOWN… eth0: DOWN… eth1: UP app1 IP:7000 app2 IP:9000
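The veth plumbing described above is normally driven with iproute2 as root: create the pair, push one end into the namespace, then address it there. As an illustration, a small helper (hypothetical; the interface, namespace and address values are made up) generates that command sequence:

```python
def veth_setup_cmds(ns, host_if, cont_if, cont_ip):
    """iproute2 commands (run as root) to plumb a veth pair into a netns."""
    return [
        "ip link add %s type veth peer name %s" % (host_if, cont_if),
        "ip link set %s netns %s" % (cont_if, ns),  # move one end inside
        "ip netns exec %s ip addr add %s dev %s" % (ns, cont_ip, cont_if),
        "ip netns exec %s ip link set %s up" % (ns, cont_if),
        "ip link set %s up" % host_if,
    ]

for cmd in veth_setup_cmds("green", "veth0", "eth0", "10.0.3.2/24"):
    print(cmd)
```

The host-side end is then typically attached to a bridge so several containers can reach each other and the outside world.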
  • 21. Linux namespaces: USER namespace  A long work in progress – still development for XFS and other FS support – Significant security impacts – A handful of security holes already found + fixed  Two major features provided: – Map UID / GID from outside the container to UID / GID inside the container – Permit non-root users to launch LXCs – Distros rolling out phased support, with UID / GID mapping typically 1st  First process in USER namespace has full CAPs; perform initializations before other processes are created – No CAPs in parent namespace  UID / GID map can be pre-configured via FS  Eventually USER namespace will mitigate many perceived LXC security concerns 3/11/2014 21 “global” (i.e. root) namespace “green” namespace “red” namespace USER NS root 0:0 ntp 104:109 mysql 105:110 boden 106:111 USER NS root 0:0 app 106:111 USER NS root 0:0 app 104:109
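The UID / GID mapping is configured through /proc/<pid>/uid_map (and gid_map), whose lines are triples "first-ID-inside first-ID-on-host count". A sketch of how a host UID translates through such a map — the 100000 base offset is just an example value:

```python
def map_uid(uid_map, host_uid):
    """Translate a host UID to the in-namespace UID via uid_map triples.

    Each entry is (first_uid_inside, first_uid_on_host, count), the format of
    /proc/<pid>/uid_map. Returns None (unmapped -> overflow UID) when no
    entry covers the host UID.
    """
    for inside, outside, count in uid_map:
        if outside <= host_uid < outside + count:
            return inside + (host_uid - outside)
    return None

m = [(0, 100000, 65536)]  # host UIDs 100000..165535 appear as 0..65535 inside
print(map_uid(m, 100000), map_uid(m, 100033), map_uid(m, 5))  # 0 33 None
```

This is what lets a container's "root" (UID 0 inside) be an unprivileged UID on the host.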
  • 22. LXC Realization: Linux namespaces 3/11/2014 22  A set of namespaces created for the container  Container process(es) “executed” in the namespace set  Process(es) in the container have isolated view of resources  Connectivity to parent where needed (via lxc tooling)
  • 23. Linux namespaces & cgroups: Availability 3/11/2014 23 Note: user namespace support in upstream kernel 3.8+, but distributions rolling out phased support: - Map LXC UID/GID between container and host - Non-root LXC creation
  • 24. Linux chroots  Changes the apparent root directory for a process and its children – Search paths – Relative directories – Etc  Using chroot can be escaped given proper capabilities, thus pivot_root is often used instead – chroot; points the process’s file system root to a new directory – pivot_root; detaches the new root and attaches it to the process root directory  Often used when building system images – Chroot to temp directory – Download and install packages in chroot – Compress chroot as a system root FS  LXC realization – Bind mount container root FS (image) – Launch (unshare or clone) LXC init process in a new MNT namespace – pivot_root to the bind mount (root FS) 3/11/2014 24
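The "apparent root" idea can be illustrated as a path translation: for a process whose root has been changed, an absolute path resolves inside the container's root FS on the host. This sketch only models the path arithmetic, not the chroot / pivot_root syscalls themselves (which require privilege), and the rootfs path is an example:

```python
import os.path

def apparent_path(rootfs, path):
    """Host path a root-switched process actually touches for an absolute path."""
    return os.path.join(rootfs, path.lstrip("/"))

# inside the container, /etc/passwd resolves to the container image's copy
print(apparent_path("/var/lib/lxc/c1/rootfs", "/etc/passwd"))
```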
  • 25. Linux chroot vs pivot_root
 Using pivot_root with a MNT namespace addresses chroot-escape concerns
 The pivot_root target directory becomes the "new root FS"
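The bind mount + MNT namespace + pivot_root sequence can be sketched in a few shell commands. This is illustrative only: the /opt/rootfs path is an assumption, root privileges are required, and real LXC tooling does considerably more setup and cleanup:

```shell
#!/bin/sh
# Sketch of the mount namespace + pivot_root sequence LXC tooling performs.
# Requires root and a prepared root FS; otherwise only the guard message prints.
if [ "$(id -u)" -ne 0 ] || [ ! -d /opt/rootfs ]; then
    echo "requires root and a root FS at /opt/rootfs; steps shown only"
else
    # 1. unshare -m: enter a private MNT namespace so mounts stay invisible to the host
    # 2. bind mount the image onto itself so it is a mount point (pivot_root needs one)
    # 3. pivot_root into it, stashing the old root under ./oldroot
    # 4. lazily unmount the old root so the container cannot reach the host FS
    unshare -m sh -c '
        mount --bind /opt/rootfs /opt/rootfs
        cd /opt/rootfs
        mkdir -p oldroot
        pivot_root . oldroot
        umount -l /oldroot
        ls /
    '
fi
```

After the umount of the old root there is nothing left to "escape" to, which is the key difference from a plain chroot.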
  • 26. LXC Realization: Images
 LXC images provide a flexible means to deliver only what you need – lightweight, minimal footprint
 Basic constraints
– Same architecture
– Same endianness
– Linux'ish Operating System; you can run different Linux distros on the same host
 Image types
– System: images intended to virtualize Operating System(s) – a standard distro root FS less the kernel
– Application: images intended to virtualize application(s) – package only the apps + dependencies (aka JeOS – Just enough Operating System)
 Bind mount host libs / bins into the LXC to share host resources
 Container image init process
– Container init command provided on invocation – can be an application or a full-fledged init process
– Init script customized for the image – skinny SysVinit, upstart, etc.
– Reduces LXC start-up overhead and runtime footprint
 Various tools to build images
– SUSE Kiwi
– debootstrap
– Etc.
 LXC tooling options often include numerous image templates
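The chroot-based image-build workflow from the previous slides can be sketched with debootstrap. This is a hedged sketch, not a complete image builder: the scratch path, the `ALLOW_NET` guard variable, and the openssh-server package choice are all illustrative assumptions:

```shell
#!/bin/sh
# Sketch: build a minimal Debian root FS with debootstrap and pack it as an image.
# Needs root, debootstrap and network access; set ALLOW_NET=1 (an illustrative
# guard, not a debootstrap flag) to actually run the build.
if [ "$(id -u)" -ne 0 ] || ! command -v debootstrap >/dev/null || [ -z "$ALLOW_NET" ]; then
    echo "requires root, debootstrap and ALLOW_NET=1; steps shown only"
else
    ROOTFS=/tmp/lxc-rootfs                      # assumed scratch location
    # download and install a minimal package set into the chroot directory
    debootstrap --variant=minbase stable "$ROOTFS"
    # chroot in to add packages before packing
    chroot "$ROOTFS" apt-get install -y --no-install-recommends openssh-server
    # compress the chroot as a reusable system root FS
    tar -C "$ROOTFS" -czf /tmp/rootfs.tar.gz .
fi
```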
  • 27. Linux Security Modules & MAC
 Linux Security Modules (LSM) – kernel modules which provide a framework for Mandatory Access Control (MAC) security implementations
 MAC vs DAC
– In MAC, an admin (user or process) assigns access controls to the subject / initiator
• Most MAC implementations provide the notion of profiles
• Profiles define access restrictions and are said to "confine" a subject
– In DAC, the resource owner (user) assigns access controls to individual resources
 Existing LSM implementations include: AppArmor, SELinux, GRSEC, etc.
  • 28. Linux Capabilities & Other Security Measures
 Linux capabilities
– Per-process privileges which define operational (sys call) access
– Typically checked based on process EUID and EGID
– Root processes (i.e. EUID = EGID = 0) bypass capability checks
 Capabilities can be assigned to LXC processes to restrict their privileges
 Other LXC security mitigations
– Reduce shared FS access using RO bind mounts
– Keep the Linux kernel up to date
– User namespaces in 3.8+ kernels
• Allow containers to be launched as a non-root user
• Map UID / GID inside / outside of the container
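A process's capability sets can be inspected directly from /proc, which is handy when checking what an LXC process has actually been granted. A minimal sketch (works on any modern Linux, no container needed):

```shell
#!/bin/sh
# The kernel exposes each process's capability sets as hex bitmasks:
#   CapPrm = permitted, CapEff = effective, CapBnd = bounding set
grep -E '^Cap(Prm|Eff|Bnd)' /proc/self/status

# Root typically shows all-ones masks; a fully unprivileged process
# shows 0000000000000000 for CapEff.
```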
  • 30. LXC Tooling
 LXC is not a kernel feature – it's a technology enabled via kernel features
– User space tooling is required to manage LXCs effectively
 Numerous toolsets exist
– Then: add-on patches to the upstream kernel due to slow kernel acceptance
– Now: upstream LXC feature support is growing – less need for patches
 More popular GNU/Linux toolsets include libvirt-lxc and lxc (tools)
– OpenVZ is likely the most mature toolset, but it requires kernel patches
– Note: I would consider docker a commoditization of LXC
 Non-GNU/Linux container analogs
– Solaris zones
– BSD jails
– Illumos / SmartOS (Solaris derivatives)
– Etc.
  • 32. libvirt-lxc
 Perhaps the simplest to learn through the familiar virsh interface
 libvirt provides LXC support by connecting to lxc:///
 Many virsh commands work
• virsh -c lxc:/// define sample.xml
• virsh -c lxc:/// start sample
• virsh -c lxc:/// console sample
• virsh -c lxc:/// shutdown sample
• virsh -c lxc:/// undefine sample
 No snapshotting, templates…
 OpenStack support since Grizzly
– No VNC
– No Cinder support in Grizzly
– Config drive not supported
• Alternative means of accessing metadata: attached disk rather than HTTP calls
<domain type='lxc'>
  <name>sample</name>
  <memory>32768</memory>
  <os>
    <type>exe</type>
    <init>/init</init>
  </os>
  <vcpu>1</vcpu>
  <clock offset='utc'/>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>destroy</on_crash>
  <devices>
    <emulator>/usr/libexec/libvirt_lxc</emulator>
    <filesystem type='mount'>
      <source dir='/opt/vm-1-root'/>
      <target dir='/'/>
    </filesystem>
    <interface type='network'>
      <source network='default'/>
    </interface>
    <console type='pty'/>
  </devices>
</domain>
  • 33. LXC (tools)
 A little more functionality
 Supported by the major distributions
 LXC 1.0 recently released
– Cloning supported: lxc-clone
– Template and btrfs backing-store support
– lxc-create -t ubuntu -n CN creates a new Ubuntu container
• The "template" root FS is downloaded from Ubuntu
• Some support for Fedora <= 14
• Debian is supported
– lxc-start -d -n CN starts the container
– lxc-destroy -n CN destroys the container
– /etc/lxc/lxc.conf holds the default settings
– /var/lib/lxc/CN is the default location for each container
  • 34. LXC Commoditization: docker
 Young project with great vibrancy in the industry
 Currently based on unmodified LXC – but the goal is to make it dirt easy
 As of March 10th, 2014 at v0.9; monthly releases, and 1.0 should be ready for production use
 What docker adds to LXC
– Portable deployment across machines
• In Cloud terms, think of LXC as the hypervisor and docker as the Open Virtualization Appliance (OVA) and the provisioning engine
• Docker images can run unchanged on any platform supporting docker
– Application-centric
• User-facing function geared towards application deployment, not VM analogs
– Automatic build
• Create containers from build files
• Builders can use chef, maven, puppet, etc.
– Versioning support
• Think of git for docker containers
• Only the delta of the base container is tracked
– Component re-use
• Any container can be used as a base, specialized and saved
– Sharing
• Support for public/private repositories of containers
– Tools
• CLI / REST API for interacting with docker
• Vendors adding tools daily
 Docker containers are self-contained – no more "dependency hell"
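The "automatic build", "versioning" and "component re-use" points above can be illustrated with a docker build file. A minimal sketch in Dockerfile format (the base image and package choices are illustrative assumptions):

```text
# Any existing image can serve as the base layer ("component re-use")
FROM ubuntu:12.04

# Each instruction adds a tracked delta layer ("versioning support")
RUN apt-get update && apt-get install -y apache2

# Application-centric: declare what the container runs, not how to boot an OS
EXPOSE 80
CMD ["apachectl", "-D", "FOREGROUND"]
```

Built once with `docker build`, the resulting image runs unchanged on any host supporting docker.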
  • 35. Docker vs. LXC vs. Hypervisor 3/11/2014 35
  • 36. Docker: LXC Virtualization? 3/11/2014 36  Docker decouples the LXC provider from the operations – LXC provider agnostic  Docker “images” run anywhere docker is supported – Portability
  • 37. LXC Orchestration & Management
 Docker & libvirt-lxc in OpenStack
– Manage containers heterogeneously with traditional VMs… but not with the level of support & features we might like
 CoreOS
– Zero-touch admin Linux distro with docker images as the unit of operation
– Centralized key/value store to coordinate the distributed environment
 Various other 3rd party apps
– Maestro for docker
– Shipyard for docker
– Fleet for CoreOS
– Etc.
 LXC migration
– Container migration via CRIU
 But…
– Still no great way to tie all virtual resources together with LXC – e.g. storage + networking
• IMO an area which needs focus for LXC to become more generally applicable
  • 38. Docker in OpenStack
 Introduced in Havana
– A nova driver to integrate with the docker REST API
– A Glance translator to integrate containers with Glance
• A docker container which implements a docker registry API
 The claim is that docker will become a "group A" hypervisor
– In its current form it's effectively a "tech preview"
  • 39. LXC Evaluation
 Goal: validate the promise with an eye towards practical applicability
 Dimensions evaluated:
– Runtime performance benefits
– Density / footprint
– Workload isolation
– Ease of use / feature set and tooling
– Cloud integration
– Security
NOTE: tests performed in a passive manner – deeper analysis warranted.
  • 40. Runtime Performance Benefits – CPU
 Tested using libvirt-lxc on Ubuntu 13.10 using linpack 11.1
 cpuset was used to limit the number of CPUs the containers could use
 The performance overhead falls within the measurement error of this test
 Bare metal performance was actually lower than some container results
[Chart: linpack GFlops @ problem size 45000 vs. vcpus 1–31 – bare metal 220.77 GFlops; containers 220.5 @ 32 vcpus, 220.9 @ 31 vcpus]
  • 41. Runtime Performance Benefits – I/O
 I/O tests using libvirt-lxc show < 1% degradation
 Tested with a pass-through mount
 Sync read I/O test: rw=read, size=1024m, bs=128m, direct=1, sync=1
 Sync write I/O test: rw=write, size=1024m, bs=128m, direct=1, sync=1
[Chart: I/O throughput – lxc write 1711.2 vs. bare metal write 1724.9 MB/s; lxc read 1626.4 vs. bare metal read 1633.4 MB/s]
  • 42. Runtime Performance Benefits – Block I/O  Tested with [standard] AUFS 3/11/2014 42
  • 43. Density & Footprint – libvirt-lxc
 Using libvirt-lxc on RHEL 6.4, we found that empty container overhead was just ~840 bytes each
 A container could be started in about 330 ms – an I/O-bound process
 This represents the lower limit of LXC footprint (containers ran /bin/sh)
Starting 500 containers (Mon Nov 11 13:38:49 CST 2013): all threads done in 157 s (sequential, I/O bound)
Stopping 500 containers (Mon Nov 11 13:42:20 CST 2013): all threads done in 162 s
Active memory delta: 417.2 KB
Starting 1000 containers (Mon Nov 11 13:59:19 CST 2013): all threads done in 335 s
Stopping 1000 containers (Mon Nov 11 14:14:26 CST 2013): all threads done in 339 s
Active memory delta: 838.4 KB
  • 44. Density & Footprint – Docker
 In this test we created 150 docker containers with CentOS, started apache & then removed them
– Average footprint was ~10 MB per container
– Average start time was 240 ms
 Serially booting 150 containers which run apache
– Takes 36 seconds on average
– Consumes about 2% of the CPU
– Negligible HDD space
– Spawns around 225 processes for create
– Around 1.5 GB of memory ≈ 10 MB per container
– Expect faster results once docker addresses performance topics in the next few months
 Serially destroying 150 containers running apache
– Takes 9 seconds on average
– We would expect destroy to be faster – likely a docker bug; will triage with the docker community
[Charts: container creation / deletion times, I/O profile, CPU profile]
  • 45. Workload Isolation: Examples
 Using the blkio cgroup (lxc.cgroup.blkio.throttle.read_bps_device) to cap the I/O of a container
– Both bytes/sec and IOPS per device can be capped on read / write
– Better async BIO support in kernel 3.10+
 We used fio (direct=1, sync=1) to test the ability to cap reads:
– Limit set to 6 MB/s: READ: io=131072KB, aggrb=6147KB/s, minb=6295KB/s, maxb=6295KB/s, mint=21320msec, maxt=21320msec
– Limit set to 60 MB/s: READ: io=131072KB, aggrb=61134KB/s, minb=62601KB/s, maxb=62601KB/s, mint=2144msec, maxt=2144msec
– No read limit: READ: io=131072KB, aggrb=84726KB/s, minb=86760KB/s, maxb=86760KB/s, mint=1547msec, maxt=1547msec
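The 6 MB/s read cap above would be configured along these lines in the container config (the 8:0 major:minor device number, corresponding to /dev/sda on many systems, is an illustrative assumption):

```text
# Throttle reads from device 8:0 to 6 MB/s (6291456 bytes/s)
lxc.cgroup.blkio.throttle.read_bps_device = 8:0 6291456

# Equivalent knobs exist for writes and for IOPS
# lxc.cgroup.blkio.throttle.write_bps_device = 8:0 6291456
# lxc.cgroup.blkio.throttle.read_iops_device = 8:0 100
```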
  • 46. OpenStack VM Operations 3/11/2014 46 NOTE: orchestration / management overheads cap LXC performance
  • 47. Who's Using LXC
 Google App Engine & infrastructure are said to be using some form of LXC
 Red Hat OpenShift
 dotCloud (now Docker Inc)
 Cloud Foundry (early versions)
 Rackspace Cloud Databases
– Outperforms AWS (Xen) according to perf results
 Parallels Virtuozzo (commercial product)
 Etc.
  • 48. LXC Gaps
There are gaps…
 Lack of industry tooling / support
 Live migration still a WIP
 Full orchestration across resources (compute / storage / networking)
 Security concerns
 Not a well-known technology… yet
 Integration with existing virtualization and Cloud tooling
 Few (if any) industry standards
 Missing skill set
 Slower upstream support due to the kernel dev process
 Etc.
  • 49. LXC: Use Cases For Traditional VMs
There are still use cases where traditional VMs are warranted:
 Virtualization of non-Linux-based OSs
– Windows
– AIX
– Etc.
 LXC not supported on the host
 VM requires a unique kernel setup which is not applicable to other VMs on the host (i.e. per-VM kernel config)
 Etc.
  • 50. LXC Recommendations
 Public & private Clouds
– Increase VM density 2-3x
– Accommodate Big Data & HPC type applications
– Move the support of Linux distros to containers
 PaaS & managed services
– Realize "as a Service" and managed services using LXC
 Operations management
– Ease management + increase agility of bare metal components
 DevOps
 Development & test
– Sandboxes
– Dev / test envs
– Etc.
 If you are just starting with LXC and don't have an in-depth skill set
– Start with LXC for private solutions (trusted code)
  • 51. LXC Resources
 https://www.kernel.org/doc/Documentation/cgroups/
 http://www.blaess.fr/christophe/2012/01/07/linux-3-2-cfs-cpu-bandwidth-english-version/
 http://atmail.com/kb/2009/throttling-bandwidth/
 https://access.redhat.com/site/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Resource_Management_Guide/ch-Subsystems_and_Tunable_Parameters.html
 http://www.janoszen.com/2013/02/06/limiting-linux-processes-cgroups-explained/
 http://www.mattfischer.com/blog/?p=399
 http://oakbytes.wordpress.com/2012/09/02/cgroup-cpu-allocation-cpu-shares-examples/
 http://fritshoogland.wordpress.com/2012/12/15/throttling-io-with-linux/
 https://lwn.net/Articles/531114/
 https://www.kernel.org/doc/Documentation/filesystems/sharedsubtree.txt
 http://www.ibm.com/developerworks/library/l-mount-namespaces/
 http://blog.endpoint.com/2012/01/linux-unshare-m-for-per-process-private.html
 http://timothysc.github.io/blog/2013/02/22/perprocess/
 http://www.evolware.org/?p=293
 http://s3hh.wordpress.com/2012/05/10/user-namespaces-available-to-play/
 http://libvirt.org/drvlxc.html
 https://help.ubuntu.com/lts/serverguide/lxc.html
 https://linuxcontainers.org/
 https://wiki.ubuntu.com/AppArmor
 http://linux.die.net/man/7/capabilities
 http://docs.openstack.org/trunk/config-reference/content/lxc.html
 https://wiki.openstack.org/wiki/Docker
 https://www.docker.io/
 http://marceloneves.org/papers/pdp2013-containers.pdf
 http://openvz.org/Main_Page
 http://criu.org/Main_Page