LXC – NextGen Virtualization for Cloud: Intro & Overview (Cloud Expo)
Intro to Linux Containers presented @ cloudexpo 2014 east in NYC.


Transcript

  • 1. Linux Containers – NextGen Virtualization for Cloud (Intro & Overview) Cloud Expo June 10-12, 2014 New York City, NY Boden Russell (brussell@us.ibm.com)
  • 2. Why LXC: Performance 6/13/2014 2 — Provision time: manual install = days; VM = minutes; LXC = seconds to milliseconds. [Chart: Linpack performance @ 45000 — GFlops vs. number of vCPUs.]
  • 3. Why LXC: Industry Uptrend — [Charts: Google Trends interest over time for LXC and for Docker.]
  • 4. Why LXC: Flexible & Lightweight — [Diagram: each virtual machine packages a full OS plus bins/libs per app, while Linux containers share the host OS (and bins/libs where appropriate), so many more app instances fit on the same hardware — greater density and flexibility.]
  • 5. Why LXC: Lower TCO  Supported out of the box by a modern Linux kernel  Open source toolsets  Cloud integration 6/13/2014 5
  • 6. Definitions  Linux Containers (LXC = LinuX Containers) – Lightweight virtualization – Realized using features provided by a modern Linux kernel – VMs without the hypervisor (kind of)  Containerization of – (Linux) operating systems – Single or multiple applications  LXC as a technology ≠ the LXC “tools” 6/13/2014 6
  • 7. Hypervisors vs. Linux Containers 6/13/2014 7 — [Diagram: a Type 1 hypervisor runs VMs directly on hardware; a Type 2 hypervisor runs VMs on top of a host operating system; Linux containers run directly on the host OS kernel.] Containers share the OS kernel of the host and are therefore lightweight; however, every container on a host must use that same kernel. Containers are isolated, but share the OS and, where appropriate, libs/bins.
  • 8. LXC Technology Stack 6/13/2014 8 — [Diagram: kernel space provides the system call interface, architecture-dependent kernel code, and the container primitives (cgroups, namespaces, chroots, LSM); user space layers glibc, the pseudo file systems, and user-space tools & libs; above that sit Linux container tooling (lxc), container commoditization, and orchestration & management — all atop the hardware.]
  • 9. So You Want To Build A Container?  High level checklist – Process(es) – Throttling / limits – Prioritization – Resource isolation – Root file system – Security 6/13/2014 9
  • 10. Linux Control Groups (cgroups)  Problem – How do I throttle, prioritize, control, and obtain metrics for a group of tasks (processes)?  Solution: control groups (cgroups) – Device access – Resource limiting – Prioritization – Accounting – Control – Injection 6/13/2014 10
  • 11. Linux cgroup Subsystems 6/13/2014 11
    – blkio: weighted proportional block I/O access (group-wide or per device); per-device hard limits on block I/O read/write, specified as bytes per second or IOPS.
    – cpu: time period (microseconds per second) a group should have CPU access; group-wide upper limit on CPU time per second; weighted proportional share of relative CPU time for a group.
    – cpuset: CPUs (cores) the group can access; memory nodes the group can access, and migrate ability; memory hardwall, pressure, spread, etc.
    – devices: define which devices, and which access types, a group can use.
    – freezer: suspend/resume group tasks.
    – memory: max memory limits for the group (in bytes); memory swappiness, OOM control, hierarchy, etc.
    – hugetlb: limit HugeTLB size usage; per-cgroup HugeTLB metrics.
    – net_cls: tag network packets with a class ID; use tc to prioritize tagged packets.
    – net_prio: weighted proportional priority on egress traffic (per interface).
  • 12. Linux cgroups Pseudo FS Interface 6/13/2014 12  The Linux pseudo FS is the interface to cgroups – One directory per subsystem per cgroup – Read / write pseudo file(s) in your cgroup directory. Example layout:
    /sys/fs/cgroup/my-lxc
    |-- blkio
    |   |-- blkio.io_merged
    |   |-- blkio.io_queued
    |   |-- blkio.io_service_bytes
    |   |-- blkio.io_serviced
    |   |-- blkio.io_service_time
    |   |-- blkio.io_wait_time
    |   |-- blkio.reset_stats
    |   |-- blkio.sectors
    |   |-- blkio.throttle.io_service_bytes
    |   |-- blkio.throttle.io_serviced
    |   |-- blkio.throttle.read_bps_device
    |   |-- blkio.throttle.read_iops_device
    |   |-- blkio.throttle.write_bps_device
    |   |-- blkio.throttle.write_iops_device
    |   |-- blkio.time
    |   |-- blkio.weight
    |   |-- blkio.weight_device
    |   |-- cgroup.clone_children
    |   |-- cgroup.event_control
    |   |-- cgroup.procs
    |   |-- notify_on_release
    |   |-- release_agent
    |   `-- tasks
    |-- cpu
    |   |-- ...
    |-- ...
    `-- perf_event
    Example write and read:
    echo "8:16 1048576" > blkio.throttle.read_bps_device
    cat blkio.weight_device
    dev weight
    8:1 200
    8:16 500
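The read/write pattern above can be sketched end to end. A hedged example, assuming a cgroup v1 hierarchy mounted at /sys/fs/cgroup, root privileges, and the made-up group name my-lxc:

```shell
# Hedged sketch: cgroup v1 pseudo FS, run as root; "my-lxc" is illustrative.

# Create a group under the memory subsystem (one directory per cgroup)
mkdir -p /sys/fs/cgroup/memory/my-lxc

# Write a pseudo file to set a knob: cap the group at 256 MB
echo $((256 * 1024 * 1024)) > /sys/fs/cgroup/memory/my-lxc/memory.limit_in_bytes

# Move the current shell into the group; its children inherit membership
echo $$ > /sys/fs/cgroup/memory/my-lxc/tasks

# Read pseudo files back for state / metrics
cat /sys/fs/cgroup/memory/my-lxc/memory.usage_in_bytes
```

Removing the group later is just `rmdir` on the directory once no tasks remain in it.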
  • 13. Linux cgroups FS Layout 6/13/2014 13 — [Diagram: example cgroup file system directory hierarchy.]
  • 14. Linux cgroups: CPU Usage  Use CPU shares (and other controls) to prioritize jobs / containers  Carry out complex scheduling schemes  Segment host resources  Adhere to SLAs 6/13/2014 14
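The shares-based prioritization above can be sketched with cgroup v1 knobs. A hedged example, run as root; the group names "gold" and "bronze" and the PID variables are invented for illustration:

```shell
# Hedged sketch (root, cgroup v1): relative CPU time via cpu.shares.
mkdir -p /sys/fs/cgroup/cpu/gold /sys/fs/cgroup/cpu/bronze

# Shares are relative weights (default 1024): under contention,
# "gold" tasks get roughly 3x the CPU time of "bronze" tasks.
echo 3072 > /sys/fs/cgroup/cpu/gold/cpu.shares
echo 1024 > /sys/fs/cgroup/cpu/bronze/cpu.shares

# $HIGH_PRIO_PID / $LOW_PRIO_PID are placeholders for real process IDs
echo "$HIGH_PRIO_PID" > /sys/fs/cgroup/cpu/gold/tasks
echo "$LOW_PRIO_PID"  > /sys/fs/cgroup/cpu/bronze/tasks
```

Note that shares only matter under contention; an otherwise idle host lets a low-share group use all available CPU.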
  • 15. Linux cgroups: CPU Pinning  Pin containers / jobs to CPU cores  Carry out complex scheduling schemes  Reduce core switching costs  Adhere to SLAs 6/13/2014 15
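Pinning works through the cpuset subsystem. A hedged sketch (root, cgroup v1; the group name "pinned" is invented):

```shell
# Hedged sketch (root, cgroup v1): pin a group to cores 0-1.
mkdir -p /sys/fs/cgroup/cpuset/pinned

# cpuset requires both cpus and mems to be set before tasks can join
echo 0-1 > /sys/fs/cgroup/cpuset/pinned/cpuset.cpus
echo 0   > /sys/fs/cgroup/cpuset/pinned/cpuset.mems

# Pin the current shell (and its future children) to those cores
echo $$ > /sys/fs/cgroup/cpuset/pinned/tasks

# Confirm the effective affinity
taskset -cp $$
```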
  • 16. Linux cgroups: Device Access  Limit device visibility; isolation  Implement device access controls – Secure sharing  Segment device access  Device whitelist / blacklist 6/13/2014 16
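The whitelist/blacklist model above maps onto the devices subsystem's deny/allow files. A hedged sketch (root, cgroup v1; "my-lxc" is illustrative):

```shell
# Hedged sketch (root, cgroup v1): device whitelisting for group "my-lxc".
mkdir -p /sys/fs/cgroup/devices/my-lxc

# Start closed: deny all device access for the group...
echo a > /sys/fs/cgroup/devices/my-lxc/devices.deny

# ...then whitelist /dev/null (char device 1:3) for read/write/mknod
echo 'c 1:3 rwm' > /sys/fs/cgroup/devices/my-lxc/devices.allow

# Inspect the effective whitelist
cat /sys/fs/cgroup/devices/my-lxc/devices.list
```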
  • 17. So You Want To Build A Container? 6/13/2014 17
  • 18. Linux namespaces  Problem – How do I provide an isolated view of global resources to a group of tasks (processes)?  Solution: namespaces – MNT: mount points, file systems, etc. – PID: processes – NET: NICs, routing, etc. – IPC: System V IPC – UTS: host and domain name – USER: UIDs and GIDs 6/13/2014 18
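Namespace membership is visible per process, and new namespaces can be tried from the shell. A hedged sketch using util-linux `unshare` (the inner part needs root; the hostname "blue-ns" is invented):

```shell
# Every process's namespace membership is visible unprivileged:
ls -l /proc/self/ns        # e.g. ipc, mnt, net, pid, uts, user

# Hedged sketch (root): give a shell its own UTS + PID + MNT namespaces;
# the hostname change is visible only inside the new namespace.
unshare --uts --pid --mount --fork /bin/sh -c '
  hostname blue-ns         # only this namespace sees "blue-ns"
  hostname
'
hostname                   # the host still reports its original name
```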
  • 19. Linux namespaces: Conceptual Overview 6/13/2014 19 — [Diagram: a global (root) namespace and two child namespaces ("purple" and "blue"), each with its own view of MNT (mount points), UTS (host/domain name), PID (a process table starting at PID 1 — e.g. /sbin/init in the root namespace, /bin/bash in the purple one), IPC (shared memory, semaphores, message queues), NET (interfaces, addresses, apps bound to ports), and USER (UID/GID mappings).]
  • 20. Linux namespaces: Common Idioms  It’s not required to use all namespaces – Pick & choose, if your toolset allows it  Constructs exist to permit “connectivity” between parent / child namespaces  Various Linux user-space tools have namespace support  The Linux system API supports flexible namespace creation 6/13/2014 20
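A common connectivity idiom is a veth pair bridging the host and a child NET namespace. A hedged sketch using iproute2 (root; the names "demo", "veth0"/"veth1", and the 10.0.0.0/24 addresses are invented):

```shell
# Create a named NET namespace and a veth pair
ip netns add demo
ip link add veth0 type veth peer name veth1

# Push one end of the pair into the child namespace
ip link set veth1 netns demo

# Address and bring up both ends
ip addr add 10.0.0.1/24 dev veth0
ip link set veth0 up
ip netns exec demo ip addr add 10.0.0.2/24 dev veth1
ip netns exec demo ip link set veth1 up
ip netns exec demo ip link set lo up

# Host <-> namespace connectivity through the pair
ping -c 1 10.0.0.2
```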
  • 21. Linux namespaces & cgroups: Availability 6/13/2014 21 Note: user namespace support in upstream kernel 3.8+, but distributions rolling out phased support: - Map LXC UID/GID between container and host - Non-root LXC creation
  • 22. So You Want To Build A Container? 6/13/2014 22
  • 23. Linux chroot & pivot_root  Using pivot_root with MNT namespace addresses escaping chroot concerns  The pivot_root target directory becomes the “new root FS” 6/13/2014 23
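The pivot_root flow above can be sketched with the util-linux `pivot_root` command. Hedged: it must run as root inside a fresh MNT namespace, and $NEW_ROOT is an assumed path to a prepared root filesystem:

```shell
# Hedged sketch (root, inside a new MNT namespace): make a prepared
# root FS the new "/" and detach the old one.
NEW_ROOT=/srv/rootfs       # assumed: contains a usable root filesystem

# pivot_root needs new_root to be a mount point: bind-mount it onto itself
mount --bind "$NEW_ROOT" "$NEW_ROOT"
cd "$NEW_ROOT"
mkdir -p old_root

# "." becomes the new /; the previous root is re-rooted at /old_root
pivot_root . old_root

# Lazily unmount the old root so no path back out of the container remains
umount -l /old_root
```

Unlike a plain chroot, once the old root is unmounted there is no mount left to escape to, which is why pivot_root + MNT namespace addresses the chroot-escape concern.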
  • 24. LXC Images 6/13/2014 24  LXC images provide a flexible means to deliver only what you need – lightweight and minimal footprint.
    – Basic constraints: same architecture & endianness; a Linux-ish operating system (you can run different Linux distros on the same host).
    – Image types: system images virtualize operating system(s) – a standard distro root FS less the kernel; application images virtualize application(s) – only the apps plus their dependencies (aka JeOS – Just enough Operating System).
    – Bind-mount host libs / bins into the LXC to share host resources.
    – Container image init process: the init command is provided on invocation and can be an application or a full-fledged init process; an init script customized for the image (skinny SysVinit, upstart, etc.) reduces LXC start-up overhead and runtime footprint.
    – Various tools build images: SUSE Kiwi, debootstrap, etc.
    – LXC tooling options often include numerous image templates.
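The image-building options above can be sketched for a Debian-family host. Hedged: both commands need root, and the suite, target path, container name, and template name are illustrative and vary by distro and LXC version:

```shell
# Build a minimal (JeOS-style) Debian root FS into a target directory
debootstrap --variant=minbase stable /var/lib/rootfs/debian

# Or use an LXC template, which fetches/builds the root FS and writes
# the container config in one step
lxc-create -n my-lxc -t debian
```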
  • 25. So You Want To Build A Container? 6/13/2014 25
  • 26. Linux Security Modules & MAC  Linux Security Modules (LSM) – kernel modules which provide a framework for Mandatory Access Control (MAC) security implementations  MAC vs DAC – In MAC, admin (user or process) assigns access controls to subject / initiator – In DAC, resource owner (user) assigns access controls to individual resources  Existing LSM implementations include: AppArmor, SELinux, GRSEC, etc. 6/13/2014 26
  • 27. Linux Capabilities  Per process privileges which define sys call access  Can be assigned to LXC process(es) 6/13/2014 27
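Capabilities can be inspected unprivileged and, with the lxc tools, dropped per container. A hedged sketch (the decode mask and the config path/values are illustrative):

```shell
# Capability sets of the current process (readable unprivileged)
grep Cap /proc/self/status

# Decode a capability mask into names; the mask below is illustrative --
# substitute the CapBnd/CapEff value printed above
capsh --decode=0000001fffffffff

# Hedged sketch: with the lxc tools, capabilities are dropped per
# container via its config file, conventionally e.g.:
#   /var/lib/lxc/my-lxc/config:
#     lxc.cap.drop = sys_module sys_time mac_admin
```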
  • 28. Other Security Measures  Reduce shared FS access using RO bind mounts  Linux seccomp – Confine system calls  Keep Linux kernel up to date  User namespaces in 3.8+ kernel – Launching containers as non-root user – Mapping UID / GID into container 6/13/2014 28
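Two of the measures above map directly onto lxc config entries. A hedged fragment — the paths follow the conventional lxc layout, and the seccomp policy file format varies by LXC version:

```shell
# Hedged sketch: harden a container via its lxc config file.
cat >> /var/lib/lxc/my-lxc/config <<'EOF'
# Share host binaries read-only instead of copying them into the image
lxc.mount.entry = /usr usr none ro,bind 0 0

# Confine system calls with a seccomp policy file
lxc.seccomp = /var/lib/lxc/my-lxc/seccomp.policy
EOF
```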
  • 29. So You Want To Build A Container? 6/13/2014 29
  • 30. LXC Industry Tooling 6/13/2014 30
    – Virtuozzo: commercial product using OpenVZ under the hood; not in the upstream kernel; commercial license; CLI + API; Parallels Virtuozzo management plane.
    – OpenVZ: custom kernel providing well-seasoned LXC support; not in the upstream kernel; GNU GPL v2; CLI + C; Parallels Virtuozzo + other management planes.
    – Linux-VServer: a set of kernel patches providing LXC, not based on cgroups or namespaces; partially upstream; GNU GPL v2; CLI.
    – libvirt-lxc: libvirt support for LXC via cgroups and namespaces; upstream; GNU LGPL; C, Python, Java, C#, and PHP bindings; OpenStack, Archipel, and Virt-Manager management.
    – lxc (tools): library plus a set of user-space tools / bindings for LXC; upstream; GNU LGPL; CLI plus C, Python, Lua, and Go bindings; LXC Web Panel and Lexy management.
    – Warden: LXC management tooling used by Cloud Foundry; upstream (cgroups / namespaces); Apache v2.
    – lmctfy: similar to LXC, but with a more intent-based focus; upstream, though additional patches are needed for specific features; Apache v2; CLI + Go.
    – Docker: commoditization of LXC, adding support for images, build files, etc.; upstream; Apache v2; REST, CLI, Python, and other 3rd-party libs; OpenStack, Shipyard, and Docker UI management.
  • 31. LXC Orchestration & Management  Docker & libvirt-lxc in OpenStack – Manage containers heterogeneously alongside traditional VMs… but not with the level of support & features we might like  CoreOS – Zero-touch admin Linux distro with docker images as the unit of operation – Centralized key/value store to coordinate the distributed environment  Various other 3rd party apps – Maestro for docker – Shipyard for docker – Fleet for CoreOS – Etc.  LXC migration – Container migration via CRIU  But… – Still no great way to tie all virtual resources together with LXC – e.g. storage + networking • IMO, an area which needs focus for LXC to become more generally applicable 6/13/2014 31
  • 32. LXC Gaps There are gaps…  Lack of industry tooling / support  Live migration still a work in progress  No full orchestration across resources (compute / storage / networking)  Security concerns  Not a well-known technology… yet  Limited integration with existing virtualization and cloud tooling  Few, if any, industry standards  Missing skill sets  Slower upstream support due to the kernel dev process  Etc. 6/13/2014 32
  • 33. LXC: Use Cases For Traditional VMs There are still use cases where traditional VMs are warranted.  Virtualization of non-Linux OSs – Windows – AIX – Etc.  LXC not supported on the host  VM requires a unique kernel setup that is not applicable to other VMs on the host (i.e. per-VM kernel config)  Etc. 6/13/2014 33
  • 34. References & Related Links  http://www.slideshare.net/BodenRussell/realizing-linux-containerslxc  http://bodenr.blogspot.com/2014/05/kvm-and-docker-lxc-benchmarking- with.html  https://www.docker.io/  http://sysbench.sourceforge.net/  http://dag.wiee.rs/home-made/dstat/  http://www.openstack.org/  https://wiki.openstack.org/wiki/Rally  https://wiki.openstack.org/wiki/Docker  http://devstack.org/  http://www.linux-kvm.org/page/Main_Page  https://github.com/stackforge/nova-docker  https://github.com/dotcloud/docker-registry  http://www.netperf.org/netperf/  http://www.tokutek.com/products/iibench/  http://www.brendangregg.com/activebenchmarking.html  http://wiki.openvz.org/Performance 6/13/2014 34