DevOpsMtl January 2014 - A Decade of Linux Containers


Container technologies have been gaining a lot of traction in the DevOps world, especially with the arrival of [...]. But these technologies have been around for more than 10 years. In this talk, we will dive into the history of Linux containers and how they differ from traditional virtualization technologies: from the User-Mode Linux (UML) and Linux-VServer days in the early 2000s, through OpenVZ and the rise of Linux namespaces and cgroups, to LXC, the new kid on the block.

Published in: Technology, News & Politics

  1. A Decade of Linux Containers. Simon Boulet, Consultant, Deployment and Automation
  2. A Decade of Linux Containers
  3. An Introduction to Containers
     “[...] should it be possible for the operating system to ensure that excessive resource usage by one group of processes doesn't interfere with another group of processes? Should it be possible for a single kernel to provide resource-usage statistics for a logical group of processes? Likewise, should the kernel be able to allow multiple processes to transparently use port 80?”
     Glauber Costa, Parallels (formerly SWsoft, the company behind OpenVZ)
  4. Containers (vs virtualization)
     ● Group processes together to create secure, isolated virtual environments
     ● Share the host kernel / operating system
     ● Generally perform better than traditional virtualization
     ● Often have limitations with kernel features (VPN, loopback devices, iptables, FUSE, NFS, etc.)
  5. User-Mode Linux (UML)
     ● Kernel patch to compile the Linux kernel as a “regular” binary. Run Linux inside Linux: ./linux
     ● First paper in August 2000, Linux 2.2.x [1]
     ● In the mainline kernel since 2.6.0 (December 2003)
     ● No root access needed (networking requires TUN/TAP)
     ● Linode initially offered UML containers and switched to Xen on March 28, 2008 [2]
     ● Works out of the box with all recent kernels [3]
  6. Linux-VServer
     ● Created by Jacques Gelinas, a Montrealer
     ● First public announcement in October 2001 [1]
     ● Uses a “security context” concept to isolate processes (similar to Linux namespaces)
     ● Still alive (latest patch for Linux 3.10.21)
     ● Dreamhost (the company behind Ceph) still uses Linux-VServer for its VPS offering
  7. OpenVZ
     ● Patch based on the latest RHEL kernel (currently 2.6.32; 40 MB gzipped patch). Extends the Linux cgroups/namespaces features
     ● Mature (initial release in 2005); the open source base of Parallels Virtuozzo (commercial)
     ● The future of OpenVZ lies within Linux cgroups/namespaces. Recent versions of the OpenVZ tools work partially with recent mainline kernels
     ● OpenVZ developers are very active in the Linux kernel/namespaces community
  8. OpenVZ Contributions to Linux Kernel
  9. OpenVZ: LXC/Namespaces older brother
     “OpenVZ is great, and it has been around for longer than LXC, so some people consider it to be more stable and secure. However, one has to keep in mind that LXC and OpenVZ share many developers in common, and that LXC is nothing else than ‘OpenVZ redesigned to be able to be merged into the mainline kernel’. Therefore, OpenVZ will eventually sunset, to be fully replaced by LXC.”
     Jérôme Petazzoni, Senior Engineer at dotCloud (company behind Docker)
  10. LXC
      ● Docker uses LXC for creating containers
      ● First release of LXC in September 2008
      ● Set of userspace tools to create containers on top of Linux cgroups and namespaces
      ● LXC containers are not fully secure yet. It is possible for root inside a container to escape and gain root on the host, so AppArmor/SELinux is needed. The future lies in the user namespace.
  11. Linux Namespaces
      Different namespaces = different “views” of the kernel
      ● Mount namespace (Linux 2.4.19, 3 Aug 2002): mount points
      ● UTS namespace (Linux 2.6.19, 29 Nov 2006): hostname
      ● IPC namespace (Linux 2.6.19, 29 Nov 2006): interprocess communication
      ● PID namespace (Linux 2.6.24, 24 Jan 2008): processes in different PID namespaces can have the same PID
      ● Network namespace (Linux 2.6.24, 24 Jan 2008): network devices, IP addresses, routing tables, iptables entries
      ● User namespace (Linux 3.8, 18 Feb 2013): root privileges for operations inside a user namespace, but unprivileged outside the namespace. A number of Linux filesystems are not yet user-namespace aware.
  12. Linux Cgroups
      ● Virtually group processes together; apply limits, priorities, accounting, etc.
      ● Divided into subsystems, each subsystem representing a resource (CPU, memory, etc.):
        blkio: limits input/output access to and from block devices
        cpu: uses the scheduler to provide access to the CPU
        devices: allows or denies access to devices
        freezer: suspends or resumes tasks in a cgroup
        memory: sets limits on memory use by tasks in a cgroup, and generates automatic reports on memory resources used by those tasks
        ...
  13. Playing with Cgroups
      ● Cgroups are configured through the cgroup virtual filesystem (similar to /proc)
      ● Mount the cgroup virtual filesystem for the desired subsystem (e.g. blkio):
          sudo mkdir -p /sys/fs/cgroup/blkio
          sudo mount -t cgroup -o blkio blkio /sys/fs/cgroup/blkio
      ● Create a new cgroup named “1mbsec” in the blkio subsystem:
          sudo mkdir /sys/fs/cgroup/blkio/1mbsec
  14. Playing with Cgroups (cont.)
      ● Set a write limit of 1 MB/sec on this cgroup (253:2 is the major:minor number of the target block device):
          echo "253:2 $((1024*1024))" | sudo tee /sys/fs/cgroup/blkio/1mbsec/blkio.throttle.write_bps_device
      ● Attach the current process (the shell) to the 1mbsec cgroup:
          echo $$ | sudo tee /sys/fs/cgroup/blkio/1mbsec/tasks
      ● Writes are now throttled to 1 MB/sec:
          dd if=/dev/zero of=100mbtest.bin bs=1M count=100 conv=fdatasync
          100+0 records in
          100+0 records out
          104857600 bytes (105 MB) copied, 100.055 s, 1.0 MB/s
  15. My Personal Experience
      ● OpenVZ is generally the “go-to” for public / production containers (unless you need some of the recent kernel features)
      ● LXC is gaining a lot of interest, especially with tools like Docker. Escaping LXC containers is a major security issue; you will need to learn AppArmor/SELinux to secure LXC
      ● User-Mode Linux is a very well kept secret. It’s a great way to quickly run containers, especially in non-root environments, and it works out of the box with all recent kernels.
  16. Thank you! Questions? Simon Boulet
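
To make the “views” idea from the Linux Namespaces slide (11) more concrete, here is a minimal sketch that prints the namespace identifiers of a process. It assumes a Linux system that exposes /proc/<pid>/ns; the function name list_namespaces is hypothetical and not part of the talk.

```shell
# Print the namespace identifiers of a process (default: the current shell).
# Assumption: the kernel exposes /proc/<pid>/ns as a directory of symlinks.
list_namespaces() {
    pid="${1:-$$}"
    for ns in /proc/"$pid"/ns/*; do
        # Skip anything that is not a symlink (also covers an unmatched glob).
        [ -L "$ns" ] || continue
        # Each link target looks like "uts:[<number>]"; two processes share
        # a namespace when their links point at the same identifier.
        printf '%s -> %s\n' "${ns##*/}" "$(readlink "$ns")"
    done
}

list_namespaces
```

Running it for two different PIDs and comparing the output is one way to see whether a process lives in its own namespaces, for example inside a container, or shares them with the host.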
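
For the Linux Cgroups slide (12), a quick way to see which cgroups a process belongs to is to read /proc/<pid>/cgroup, where each line has the form "hierarchy-id:subsystems:path". The sketch below only reformats those lines; the function name show_cgroups is hypothetical. On systems using the unified (v2) hierarchy the subsystems field is empty, whereas the slides assume per-subsystem (v1) hierarchies such as blkio.

```shell
# Reformat lines of the form "hierarchy-id:subsystems:path" read from stdin.
show_cgroups() {
    while IFS=: read -r id subsystems path; do
        # An empty subsystems field is shown as "(none)".
        printf 'hierarchy=%s subsystems=%s path=%s\n' \
            "$id" "${subsystems:-(none)}" "$path"
    done
}

# Show the cgroup membership of the current shell, if the file is available.
if [ -r /proc/self/cgroup ]; then
    show_cgroups < /proc/self/cgroup
fi
```

After attaching the shell to a cgroup such as 1mbsec, its path would be expected to appear in the line for the blkio subsystem.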
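
The numbers in the Playing with Cgroups slides (13 and 14) can be checked with a little shell arithmetic. This sketch builds the "major:minor bytes_per_second" rule written to blkio.throttle.write_bps_device and estimates how long the dd test should take. The helper names throttle_rule and expected_seconds are hypothetical, and 1 MB is taken as 1024*1024 bytes, as on the slide.

```shell
# Build a throttle rule from a device number and a limit in MB/sec.
throttle_rule() {
    printf '%s %d\n' "$1" "$(( $2 * 1024 * 1024 ))"
}

# Estimate the duration in seconds of writing size_mb at limit_mb_per_sec
# (integer division, so the result is rounded down).
expected_seconds() {
    echo "$(( $1 / $2 ))"
}

throttle_rule 253:2 1     # prints: 253:2 1048576
expected_seconds 100 1    # prints: 100
```

The estimate of 100 seconds matches the 100.055 s reported by dd for the 100 MB write.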