Containers are everywhere. But what exactly is a container? What are they made from? What's the difference between LXC, butts-nspawn, Docker, and the other container systems out there? And why should we bother about specific filesystems?
In this talk, Jérôme will show the individual roles and behaviors of the components making up a container: namespaces, control groups, and copy-on-write systems. Then, he will use them to assemble a container from scratch, and highlight the differences (and likelinesses) with existing container systems.
Cgroups, namespaces, and beyond: what are containers made from? (DockerCon Eu...Jérôme Petazzoni
Linux containers are different from Solaris Zones or BSD Jails: they use discrete kernel features like cgroups, namespaces, SELinux, and more. We will describe those mechanisms in depth, as well as demo how to put them together to produce a container. We will also highlight how different container runtimes compare to each other.
This talk was delivered at DockerCon Europe 2015 in Barcelona.
Linux Performance Analysis: New Tools and Old SecretsBrendan Gregg
Talk for USENIX/LISA2014 by Brendan Gregg, Netflix. At Netflix performance is crucial, and we use many high to low level tools to analyze our stack in different ways. In this talk, I will introduce new system observability tools we are using at Netflix, which I've ported from my DTraceToolkit, and are intended for our Linux 3.2 cloud instances. These show that Linux can do more than you may think, by using creative hacks and workarounds with existing kernel features (ftrace, perf_events). While these are solving issues on current versions of Linux, I'll also briefly summarize the future in this space: eBPF, ktap, SystemTap, sysdig, etc.
New Ways to Find Latency in Linux Using TracingScyllaDB
Ftrace is the official tracer of the Linux kernel. It originated from the real-time patch (now known as PREEMPT_RT), as developing an operating system for real-time use requires deep insight and transparency of the happenings of the kernel. Not only was tracing useful for debugging, but it was critical for finding areas in the kernel that was causing unbounded latency. It's no wonder why the ftrace infrastructure has a lot of tooling for seeking out latency. Ftrace was introduced into mainline Linux in 2008, and several talks have been done on how to utilize its tracing features. But a lot has happened in the past few years that makes the tooling for finding latency much simpler. Other talks at P99 will discuss the new ftrace tracers "osnoise" and "timerlat", but this talk will focus more on the new flexible and dynamic aspects of ftrace that facilitates finding latency issues which are more specific to your needs. Some of this work may still be in a proof of concept stage, but this talk will give you the advantage of knowing what tools will be available to you in the coming year.
Presentation at Android Builders Summit 2012.
Based on the experience of working with ODM companies and SoC vendors, this session would discuss how to figure out the performance hotspot of certain Android devices and then improve in various areas including graphics and boot time. This session consists of the detailed components which seem to be independent from each other in traditional view. However, the situation changes a lot in Android system view since everything is coupled in a mass. Three frequently mentioned items in Android engineering are selected as the entry points: 2D/3D graphics, runtime, and boot time. Audience: Developers who work on Android system integration and platform enablement.
Describes what is lightweight virtualization and containers, and the low-level mechanisms in the Linux kernel that it relies on: namespaces, cgroups. It also gives details on AUFS. Those component together are the key to understanding how modern systems like Docker (http://www.docker.io/) work.
qemu + gdb: The efficient way to understand/debug Linux kernel code/data stru...Adrian Huang
Note: When you view the the slide deck via web browser, the screenshots may be blurred. You can download and view them offline (Screenshots are clear).
Cgroups, namespaces, and beyond: what are containers made from? (DockerCon Eu...Jérôme Petazzoni
Linux containers are different from Solaris Zones or BSD Jails: they use discrete kernel features like cgroups, namespaces, SELinux, and more. We will describe those mechanisms in depth, as well as demo how to put them together to produce a container. We will also highlight how different container runtimes compare to each other.
This talk was delivered at DockerCon Europe 2015 in Barcelona.
Linux Performance Analysis: New Tools and Old SecretsBrendan Gregg
Talk for USENIX/LISA2014 by Brendan Gregg, Netflix. At Netflix performance is crucial, and we use many high to low level tools to analyze our stack in different ways. In this talk, I will introduce new system observability tools we are using at Netflix, which I've ported from my DTraceToolkit, and are intended for our Linux 3.2 cloud instances. These show that Linux can do more than you may think, by using creative hacks and workarounds with existing kernel features (ftrace, perf_events). While these are solving issues on current versions of Linux, I'll also briefly summarize the future in this space: eBPF, ktap, SystemTap, sysdig, etc.
New Ways to Find Latency in Linux Using TracingScyllaDB
Ftrace is the official tracer of the Linux kernel. It originated from the real-time patch (now known as PREEMPT_RT), as developing an operating system for real-time use requires deep insight and transparency of the happenings of the kernel. Not only was tracing useful for debugging, but it was critical for finding areas in the kernel that was causing unbounded latency. It's no wonder why the ftrace infrastructure has a lot of tooling for seeking out latency. Ftrace was introduced into mainline Linux in 2008, and several talks have been done on how to utilize its tracing features. But a lot has happened in the past few years that makes the tooling for finding latency much simpler. Other talks at P99 will discuss the new ftrace tracers "osnoise" and "timerlat", but this talk will focus more on the new flexible and dynamic aspects of ftrace that facilitates finding latency issues which are more specific to your needs. Some of this work may still be in a proof of concept stage, but this talk will give you the advantage of knowing what tools will be available to you in the coming year.
Presentation at Android Builders Summit 2012.
Based on the experience of working with ODM companies and SoC vendors, this session would discuss how to figure out the performance hotspot of certain Android devices and then improve in various areas including graphics and boot time. This session consists of the detailed components which seem to be independent from each other in traditional view. However, the situation changes a lot in Android system view since everything is coupled in a mass. Three frequently mentioned items in Android engineering are selected as the entry points: 2D/3D graphics, runtime, and boot time. Audience: Developers who work on Android system integration and platform enablement.
Describes what is lightweight virtualization and containers, and the low-level mechanisms in the Linux kernel that it relies on: namespaces, cgroups. It also gives details on AUFS. Those component together are the key to understanding how modern systems like Docker (http://www.docker.io/) work.
qemu + gdb: The efficient way to understand/debug Linux kernel code/data stru...Adrian Huang
Note: When you view the the slide deck via web browser, the screenshots may be blurred. You can download and view them offline (Screenshots are clear).
Talk by Brendan Gregg for USENIX LISA 2019: Linux Systems Performance. Abstract: "
Systems performance is an effective discipline for performance analysis and tuning, and can help you find performance wins for your applications and the kernel. However, most of us are not performance or kernel engineers, and have limited time to study this topic. This talk summarizes the topic for everyone, touring six important areas of Linux systems performance: observability tools, methodologies, benchmarking, profiling, tracing, and tuning. Included are recipes for Linux performance analysis and tuning (using vmstat, mpstat, iostat, etc), overviews of complex areas including profiling (perf_events) and tracing (Ftrace, bcc/BPF, and bpftrace/BPF), and much advice about what is and isn't important to learn. This talk is aimed at everyone: developers, operations, sysadmins, etc, and in any environment running Linux, bare metal or the cloud."
How Zalando runs Kubernetes clusters at scale on AWS - AWS re:InventHenning Jacobs
Many clusters, many problems? Having many clusters has benefits: reduced blast radius, less vertical scaling of cluster components, and a natural trust boundary. In this session, Zalando shows its approach for running 140+ clusters on AWS, how it does continuous delivery for its cluster infrastructure, and how it created open-source tooling to manage cost efficiency and improve developer experience. The company openly shares its failures and the learnings collected during three years of Kubernetes in production.
AWS re:Invent session OPN211 on 2019-12-05
The Linux Block Layer - Built for Fast StorageKernel TLV
The arrival of flash storage introduced a radical change in performance profiles of direct attached devices. At the time, it was obvious that Linux I/O stack needed to be redesigned in order to support devices capable of millions of IOPs, and with extremely low latency.
In this talk we revisit the changes the Linux block layer in the
last decade or so, that made it what it is today - a performant, scalable, robust and NUMA-aware subsystem. In addition, we cover the new NVMe over Fabrics support in Linux.
Sagi Grimberg
Sagi is Principal Architect and co-founder at LightBits Labs.
Note: When you view the the slide deck via web browser, the screenshots may be blurred. You can download and view them offline (Screenshots are clear).
Virtual File System in Linux Kernel
Note: When you view the the slide deck via web browser, the screenshots may be blurred. You can download and view them offline (Screenshots are clear).
Delivered as plenary at USENIX LISA 2013. video here: https://www.youtube.com/watch?v=nZfNehCzGdw and https://www.usenix.org/conference/lisa13/technical-sessions/plenary/gregg . "How did we ever analyze performance before Flame Graphs?" This new visualization invented by Brendan can help you quickly understand application and kernel performance, especially CPU usage, where stacks (call graphs) can be sampled and then visualized as an interactive flame graph. Flame Graphs are now used for a growing variety of targets: for applications and kernels on Linux, SmartOS, Mac OS X, and Windows; for languages including C, C++, node.js, ruby, and Lua; and in WebKit Web Inspector. This talk will explain them and provide use cases and new visualizations for other event types, including I/O, memory usage, and latency.
Broken benchmarks, misleading metrics, and terrible tools. This talk will help you navigate the treacherous waters of Linux performance tools, touring common problems with system tools, metrics, statistics, visualizations, measurement overhead, and benchmarks. You might discover that tools you have been using for years, are in fact, misleading, dangerous, or broken.
The speaker, Brendan Gregg, has given many talks on tools that work, including giving the Linux PerformanceTools talk originally at SCALE. This is an anti-version of that talk, to focus on broken tools and metrics instead of the working ones. Metrics can be misleading, and counters can be counter-intuitive! This talk will include advice for verifying new performance tools, understanding how they work, and using them successfully.
LXC, Docker, security: is it safe to run applications in Linux Containers?Jérôme Petazzoni
Linux Containers (or LXC) is now a popular choice for development and testing environments. As more and more people use them in production deployments, they face a common question: are Linux Containers secure enough? It is often claimed that containers have weaker isolation than virtual machines. We will explore whether this is true, if it matters, and what can be done about it.
OSNoise Tracer: Who Is Stealing My CPU Time?ScyllaDB
In the context of high-performance computing (HPC), the Operating System Noise (osnoise) refers to the interference experienced by an application due to activities inside the operating system. In the context of Linux, NMIs, IRQs, softirqs, and any other system thread can cause noise to the application. Moreover, hardware-related jobs can also cause noise, for example, via SMIs.
HPC users and developers that care about every microsecond stolen by the OS need not only a precise way to measure the osnoise but mainly to figure out who is stealing cpu time so that they can pursue the perfect tune of the system. These users and developers are the inspiration of Linux's osnoise tracer.
The osnoise tracer runs an in-kernel loop measuring how much time is available. It does it with preemption, softirq and IRQs enabled, thus allowing all the sources of osnoise during its execution. The osnoise tracer takes note of the entry and exit point of any source of interferences. When the noise happens without any interference from the operating system level, the tracer can safely point to a hardware-related noise. In this way, osnoise can account for any source of interference. The osnoise tracer also adds new kernel tracepoints that auxiliaries the user to point to the culprits of the noise in a precise and intuitive way.
At the end of a period, the osnoise tracer prints the sum of all noise, the max single noise, the percentage of CPU available for the thread, and the counters for the noise sources, serving as a benchmark tool.
Note: When you view the the slide deck via web browser, the screenshots may be blurred. You can download and view them offline (Screenshots are clear).
Containerd Internals: Building a Core Container RuntimePhil Estes
A talk given at OpenSource Summit, North America in Los Angeles, CA on September 11th, 2017. Stephen Day (Docker) and Phil Estes (IBM) presented the history, design, architecture, and use cases for the containerd 1.0 core container runtime open source CNCF project.
Video: https://www.youtube.com/watch?v=FJW8nGV4jxY and https://www.youtube.com/watch?v=zrr2nUln9Kk . Tutorial slides for O'Reilly Velocity SC 2015, by Brendan Gregg.
There are many performance tools nowadays for Linux, but how do they all fit together, and when do we use them? This tutorial explains methodologies for using these tools, and provides a tour of four tool types: observability, benchmarking, tuning, and static tuning. Many tools will be discussed, including top, iostat, tcpdump, sar, perf_events, ftrace, SystemTap, sysdig, and others, as well observability frameworks in the Linux kernel: PMCs, tracepoints, kprobes, and uprobes.
This tutorial is updated and extended on an earlier talk that summarizes the Linux performance tool landscape. The value of this tutorial is not just learning that these tools exist and what they do, but hearing when and how they are used by a performance engineer to solve real world problems — important context that is typically not included in the standard documentation.
Cgroups, namespaces and beyond: what are containers made from?Docker, Inc.
Linux containers are different from Solaris Zones or BSD Jails: they use discrete kernel features like cgroups, namespaces, SELinux, and more. We will describe those mechanisms in depth, as well as demo how to put them together to produce a container. We will also highlight how different container runtimes compare to each other.
Advanced cgroups and namespaces
This talk picks up where we left off in the previous cgroups and namespaces talk and dive in even deeper!
Agenda:
* cgroups v2 design (cgroup v2 was started to be merged in the current kernel, 4.4)
* cgroups v2 examples (migrating tasks, enabling and disabling controllers, and more).
* comparison between cgroup v2 unified hierarchy and cgroup v1 legacy hierarchy.
* PIDs namespaces (from kernel 4.3)
* cgroup namespaces (not merged yet)
Talk by Brendan Gregg for USENIX LISA 2019: Linux Systems Performance. Abstract: "
Systems performance is an effective discipline for performance analysis and tuning, and can help you find performance wins for your applications and the kernel. However, most of us are not performance or kernel engineers, and have limited time to study this topic. This talk summarizes the topic for everyone, touring six important areas of Linux systems performance: observability tools, methodologies, benchmarking, profiling, tracing, and tuning. Included are recipes for Linux performance analysis and tuning (using vmstat, mpstat, iostat, etc), overviews of complex areas including profiling (perf_events) and tracing (Ftrace, bcc/BPF, and bpftrace/BPF), and much advice about what is and isn't important to learn. This talk is aimed at everyone: developers, operations, sysadmins, etc, and in any environment running Linux, bare metal or the cloud."
How Zalando runs Kubernetes clusters at scale on AWS - AWS re:InventHenning Jacobs
Many clusters, many problems? Having many clusters has benefits: reduced blast radius, less vertical scaling of cluster components, and a natural trust boundary. In this session, Zalando shows its approach for running 140+ clusters on AWS, how it does continuous delivery for its cluster infrastructure, and how it created open-source tooling to manage cost efficiency and improve developer experience. The company openly shares its failures and the learnings collected during three years of Kubernetes in production.
AWS re:Invent session OPN211 on 2019-12-05
The Linux Block Layer - Built for Fast StorageKernel TLV
The arrival of flash storage introduced a radical change in performance profiles of direct attached devices. At the time, it was obvious that Linux I/O stack needed to be redesigned in order to support devices capable of millions of IOPs, and with extremely low latency.
In this talk we revisit the changes the Linux block layer in the
last decade or so, that made it what it is today - a performant, scalable, robust and NUMA-aware subsystem. In addition, we cover the new NVMe over Fabrics support in Linux.
Sagi Grimberg
Sagi is Principal Architect and co-founder at LightBits Labs.
Note: When you view the the slide deck via web browser, the screenshots may be blurred. You can download and view them offline (Screenshots are clear).
Virtual File System in Linux Kernel
Note: When you view the the slide deck via web browser, the screenshots may be blurred. You can download and view them offline (Screenshots are clear).
Delivered as plenary at USENIX LISA 2013. video here: https://www.youtube.com/watch?v=nZfNehCzGdw and https://www.usenix.org/conference/lisa13/technical-sessions/plenary/gregg . "How did we ever analyze performance before Flame Graphs?" This new visualization invented by Brendan can help you quickly understand application and kernel performance, especially CPU usage, where stacks (call graphs) can be sampled and then visualized as an interactive flame graph. Flame Graphs are now used for a growing variety of targets: for applications and kernels on Linux, SmartOS, Mac OS X, and Windows; for languages including C, C++, node.js, ruby, and Lua; and in WebKit Web Inspector. This talk will explain them and provide use cases and new visualizations for other event types, including I/O, memory usage, and latency.
Broken benchmarks, misleading metrics, and terrible tools. This talk will help you navigate the treacherous waters of Linux performance tools, touring common problems with system tools, metrics, statistics, visualizations, measurement overhead, and benchmarks. You might discover that tools you have been using for years, are in fact, misleading, dangerous, or broken.
The speaker, Brendan Gregg, has given many talks on tools that work, including giving the Linux PerformanceTools talk originally at SCALE. This is an anti-version of that talk, to focus on broken tools and metrics instead of the working ones. Metrics can be misleading, and counters can be counter-intuitive! This talk will include advice for verifying new performance tools, understanding how they work, and using them successfully.
LXC, Docker, security: is it safe to run applications in Linux Containers?Jérôme Petazzoni
Linux Containers (or LXC) is now a popular choice for development and testing environments. As more and more people use them in production deployments, they face a common question: are Linux Containers secure enough? It is often claimed that containers have weaker isolation than virtual machines. We will explore whether this is true, if it matters, and what can be done about it.
OSNoise Tracer: Who Is Stealing My CPU Time?ScyllaDB
In the context of high-performance computing (HPC), the Operating System Noise (osnoise) refers to the interference experienced by an application due to activities inside the operating system. In the context of Linux, NMIs, IRQs, softirqs, and any other system thread can cause noise to the application. Moreover, hardware-related jobs can also cause noise, for example, via SMIs.
HPC users and developers that care about every microsecond stolen by the OS need not only a precise way to measure the osnoise but mainly to figure out who is stealing cpu time so that they can pursue the perfect tune of the system. These users and developers are the inspiration of Linux's osnoise tracer.
The osnoise tracer runs an in-kernel loop measuring how much time is available. It does it with preemption, softirq and IRQs enabled, thus allowing all the sources of osnoise during its execution. The osnoise tracer takes note of the entry and exit point of any source of interferences. When the noise happens without any interference from the operating system level, the tracer can safely point to a hardware-related noise. In this way, osnoise can account for any source of interference. The osnoise tracer also adds new kernel tracepoints that auxiliaries the user to point to the culprits of the noise in a precise and intuitive way.
At the end of a period, the osnoise tracer prints the sum of all noise, the max single noise, the percentage of CPU available for the thread, and the counters for the noise sources, serving as a benchmark tool.
Note: When you view the the slide deck via web browser, the screenshots may be blurred. You can download and view them offline (Screenshots are clear).
Containerd Internals: Building a Core Container RuntimePhil Estes
A talk given at OpenSource Summit, North America in Los Angeles, CA on September 11th, 2017. Stephen Day (Docker) and Phil Estes (IBM) presented the history, design, architecture, and use cases for the containerd 1.0 core container runtime open source CNCF project.
Video: https://www.youtube.com/watch?v=FJW8nGV4jxY and https://www.youtube.com/watch?v=zrr2nUln9Kk . Tutorial slides for O'Reilly Velocity SC 2015, by Brendan Gregg.
There are many performance tools nowadays for Linux, but how do they all fit together, and when do we use them? This tutorial explains methodologies for using these tools, and provides a tour of four tool types: observability, benchmarking, tuning, and static tuning. Many tools will be discussed, including top, iostat, tcpdump, sar, perf_events, ftrace, SystemTap, sysdig, and others, as well observability frameworks in the Linux kernel: PMCs, tracepoints, kprobes, and uprobes.
This tutorial is updated and extended on an earlier talk that summarizes the Linux performance tool landscape. The value of this tutorial is not just learning that these tools exist and what they do, but hearing when and how they are used by a performance engineer to solve real world problems — important context that is typically not included in the standard documentation.
Cgroups, namespaces and beyond: what are containers made from?Docker, Inc.
Linux containers are different from Solaris Zones or BSD Jails: they use discrete kernel features like cgroups, namespaces, SELinux, and more. We will describe those mechanisms in depth, as well as demo how to put them together to produce a container. We will also highlight how different container runtimes compare to each other.
Advanced cgroups and namespaces
This talk picks up where we left off in the previous cgroups and namespaces talk and dive in even deeper!
Agenda:
* cgroups v2 design (cgroup v2 was started to be merged in the current kernel, 4.4)
* cgroups v2 examples (migrating tasks, enabling and disabling controllers, and more).
* comparison between cgroup v2 unified hierarchy and cgroup v1 legacy hierarchy.
* PIDs namespaces (from kernel 4.3)
* cgroup namespaces (not merged yet)
A guest lecture at National University of Defense Technology (NUDT) in 2016 to postgraduate students in China about emerging technologies in the Linux operating system.
Containerization is more than the new Virtualization: enabling separation of ...Jérôme Petazzoni
Docker offers a new, lightweight approach to application
portability. Applications are shipped using a common container format,
and managed with a high-level API. Their processes run within isolated
namespaces which abstract the operating environment, independently of
the distribution, versions, network setup, and other details of this
environment.
This "containerization" has often been nicknamed "the new
virtualization". But containers are more than lightweight virtual
machines. Beyond their smaller footprint, shorter boot times, and
higher consolidation factors, they also bring a lot of new features
and use cases which were not possible with classical virtual machines.
We will focus on one of those features: separation of operational
concerns. Specifically, we will demonstrate how some fundamental tasks
like logging, remote access, backups, and troubleshooting can be
entirely decoupled from the deployment of applications and
services. This decoupling results in independent, smaller, simpler
moving parts; just like microservice architectures break down large
monolithic apps in more manageable components.
Immutable infrastructure with Docker and containers (GlueCon 2015)Jérôme Petazzoni
"Never upgrade a server again. Never update your code. Instead, create new servers, and throw away the old ones!"
That's the idea of immutable servers, or immutable infrastructure. This makes many things easier: rollbacks (you can always bring back the old servers), A/B testing (put old and new servers side by side), security (use the latest and safest base system at each deploy), and more.
However, throwing in a bunch of new servers at each one-line CSS change is going to be complicated, not to mention costly.
Containers to the rescue! Creating container "golden images" is easy, fast, dare I say painless. Replacing your old containers with new ones is also easy to do; much easier than virtual machines, let alone physical ones.
In this talk, we'll quickly recap the pros (and cons) of immutable servers; then explain how to implement that pattern with containers. We will use Docker as an example, but the technique can easily be adapted to Rocket or even plain LXC containers.
"Lightweight Virtualization with Linux Containers and Docker". Jerome Petazzo...Yandex
Lightweight virtualization", also called "OS-level virtualization", is not new. On Linux it evolved from VServer to OpenVZ, and, more recently, to Linux Containers (LXC). It is not Linux-specific; on FreeBSD it's called "Jails", while on Solaris it’s "Zones". Some of those have been available for a decade and are widely used to provide VPS (Virtual Private Servers), cheaper alternatives to virtual machines or physical servers. But containers have other purposes and are increasingly popular as the core components of public and private Platform-as-a-Service (PAAS), among others.
Just like a virtual machine, a Linux Container can run (almost) anywhere. But containers have many advantages over VMs: they are lightweight and easier to manage. After operating a large-scale PAAS for a few years, dotCloud realized that with those advantages, containers could become the perfect format for software delivery, since that is how dotCloud delivers from their build system to their hosts. To make it happen everywhere, dotCloud open-sourced Docker, the next generation of the containers engine powering its PAAS. Docker has been extremely successful so far, being adopted by many projects in various fields: PAAS, of course, but also continuous integration, testing, and more.
Containerization Is More than the New VirtualizationC4Media
Video and slides synchronized, mp3 and slide download available at URL http://bit.ly/1E5GzZX.
Jérôme Petazzoni borrows from his experience at Docker Inc. to explain live applications running in Docker, including reading logs, remote access, and troubleshooting tips. Filmed at qconsf.com.
Jérôme Petazzoni is a senior engineer at dotCloud, where he rotates between Ops, Support and Evangelist duties and the nickname of “master Yoda”, has earned.
Introduction to Docker (as presented at December 2013 Global Hackathon)Jérôme Petazzoni
Not on board of the Docker ship yet? This presentation will get you up to speed, and explain everything you want to know about Linux Containers and Docker, including the new features of the latest 0.7 version (which brings support for all Linux distros and kernels).
Use the Source or Join the Dark Side: differences between Docker Community an...Jérôme Petazzoni
The Docker Project delivers a complete open source platform to “build, ship, and run” any application, anywhere, using containers. The Docker Engine and the other main components (Compose, Machine,
and the SwarmKit orchestration system) are free; but Docker Inc. (the company who started the Docker Project) also has a complete commercial offering named “Docker EE” (for Enterprise Edition) that adds an extra set of features geared at larger organizations, as well as an extended support and release cycle.
In this talk, I will explain (and show with demos) what you can do using exclusively Docker CE (community, free edition) and which features are added by Docker EE. This talk is for you if you are in the process of selecting a container platform; or if you’re just curious, and want to know exactly what you can do (and cannot do) with Docker CE and EE.
Orchestration, resource scheduling…What does that mean? Is this only relevant for data centers with thousands of nodes? Should I care about Mesos, Kubernetes, Swarm, when all I have is a handful of virtual machines? The motto of public cloud IAAS is "pay for what you use," so in theory, if I deploy my apps there, I'm already getting the best "resource utilization" aka "bang for my buck," right? In this talk, we will answer those questions, and a few more. We will define orchestration, scheduling, and others, and show what it's like to use a scheduler to run containerized applications there.
Docker : quels enjeux pour le stockage et réseau ? Paris Open Source Summit ...Jérôme Petazzoni
Présentation donnée le 18 novembre 2015 au Paris Open Source Summit par Hervé Leclerc (Alterway) et Jérôme Petazzoni (Docker), présentant entre autres les nouvelles fonctionalités de Docker pour le stockage et le réseau arrivées dans la version 1.9 du Docker Engine.
Making DevOps Secure with Docker on Solaris (Oracle Open World, with Jesse Bu...Jérôme Petazzoni
Docker, the container Engine and Platform, is coming to Oracle Solaris! This is the talk that Jérôme Petazzoni (Docker) and Jesse Butler (Oracle) gave at Oracle Open World in November 2015.
Containers, docker, and security: state of the union (Bay Area Infracoders Me...Jérôme Petazzoni
Docker is two years old. While security has always been at the core of the questions revolving around Docker, the nature of those questions has changed. Last year, the main concern was "can I safely colocate containers on the same machine?" and it elicited various responses. Dan Walsh, SELinux expert, notoriously said: "containers do not contain!", and at last year's LinuxCon, Jérôme delivered a presentation detailing how to harden Docker and containers to isolate them better. Today, people have new concerns. They include image transport, vulnerability mitigation, and more.
After a recap about the current state of container security, Jérôme will explain why those new questions showed up, and most importantly, how to address them and safely deploy containers in general, and Docker in particular.
From development environments to production deployments with Docker, Compose,...Jérôme Petazzoni
In this session, we will learn how to define and run multi-container applications with Docker Compose. Then, we will show how to deploy and scale them seamlessly to a cluster with Docker Swarm; and how Amazon EC2 Container Service (ECS) eliminates the need to install,operate, and scale your own cluster management infrastructure. We will also walk through some best practice patterns used by customers for running their microservices platforms or batch jobs. Sample code and Compose templates will be provided on GitHub afterwards.
How to contribute to large open source projects like Docker (LinuxCon 2015)Jérôme Petazzoni
Contributing to a large open source project can seem daunting at first; but fear not! You too can join thousands of successful contributors. First, you don't have to be an expert in Golang, Python, or C, to contribute to Docker, OpenStack, or the Linux Kernel. Many projects also need help with documentation, translation, testing, triaging issues, and more. Very often, just going through bug reports to reproduce them and confirm "this also happens on my setup, with version XYZ" is extremely helpful.
If you decide to take the leap and propose a change (be it code or documentation), each open source project has different contribution guidelines and workflows.
In this talk, Arnaud and Jérôme will explain some of those workflows, how maintainers review your patches, and highlight the details that make your changes more likely to be merged into the project.
Containers, Docker, and Security: State Of The Union (LinuxCon and ContainerC...Jérôme Petazzoni
Docker is two years old. While security has always been at the core of the questions revolving around Docker, the nature of those questions has changed. Last year, the main concern was "can I safely colocate containers on the same machine?" and it elicited various responses. Dan Walsh, SELinux expert, notoriously said: "containers do not contain!", and at last year's LinuxCon, Jérôme delivered a presentation detailing how to harden Docker and containers to isolate them better.
Today, people have new concerns. They include image transport, vulnerability mitigation, and more.
After a recap about the current state of container security, Jérôme will explain why those new questions showed up, and most importantly, how to address them and safely deploy containers in general, and Docker in particular.
Deploy microservices in containers with Docker and friends - KCDC2015Jérôme Petazzoni
Docker lets us build, ship, and run any Linux application, on any platform. It found many early adopters in the CI/CD industry, long before it reached the symbolic 1.0 milestone and was considered "production-ready." Since then, its stability and features attracted enterprise users in many different fields, including very demanding ones like finance, banking, or intelligence agencies.
We will see how Docker is particularly suited to the deployment of distributed applications, and why it is an ideal platform for microservice architectures. In particular, we will look into three Docker related projects that have been announced at DockerCon Europe last December: Machine, Swarm, and Compose, and we will explain how they improve the way we build, deploy, and scale distributed applications.
Containers: from development to production at DevNation 2015Jérôme Petazzoni
In Docker, applications are shipped using a lightweight format, managed with a high-level API, and run within software containers which abstract the host environment. Operating details like distributions, versions, and network setup no longer matter to the application developer.
Thanks to this abstraction level, we can use the same container across all steps of the life cycle of an application, from development to production. This eliminates problems stemming from discrepancies between those environments.
Even so, these environments will always have different requirements. If our quality assurance (QA) and production systems use different logging systems, how can we still ship the same container to both? How can we satisfy the backup and security requirements of our production stack without bloating our development stack?
In this sess, you will learn about the unique features in containers that allow you to cleanly decouple system administrator tasks from the core of your application. We’ll show you how this decoupling results in smaller, simpler containers, and gives you more flexibility when building, managing, and evolving your application stacks.
The Docker ecosystem and the future of application deploymentJérôme Petazzoni
Ten years ago, virtualization ignited a revolution which gave birth to the Cloud and the DevOps initiative. Today, with containers, we are at the dawn of a similar breakthrough.
How can we capture the value of containers? How can we use their features to implement microservices and immutable infrastructures, while retaining as much as possible of our existing practices? The answer is in the rich ecosystem that developed around Docker, an open-source platform to build, ship, and run applications in containers.
In this keynote we’ll explore what the applications of tomorrow will look like, how they’ll be deployed and distributed – and how to leverage those tools today.
Docker landed almost two years ago, making it possible to build, ship, and run
any Linux application, on any platform, it was quickly adopted by developers
and ops, like no other tool before. The CI/CD industry even took it to
production long before it was stamped "production-ready."
Why does everyone (or almost!) love Docker? Because it puts powerful
automation abilities within the hands of normal developers. Automation
almost always involves building distribution packages, virtual machine
images, or writing configuration management manifests. With Docker,
those tasks are radically transformed: sometimes they're far easier than before,
other times they're no longer needed at all. Either way, the intervention
of a seasoned sysadmin guru is no longer required.
Introduction to Docker, December 2014 "Tour de France" Bordeaux Special EditionJérôme Petazzoni
Docker, the Open Source container Engine, lets you build, ship and run, any app, anywhere.
This is the presentation which was shown in December 2014 for the last stop of the "Tour de France" in Bordeaux. It is slightly different from the presentation which was shown in the other cities (http://www.slideshare.net/jpetazzo/introduction-to-docker-december-2014-tour-de-france-edition), and includes a detailed history of dotCloud and Docker and a few other differences.
Special thanks to https://twitter.com/LilliJane and https://twitter.com/zirkome, who gave me the necessary motivation to put together this slightly different presentation, since they had already seen the other presentation in Paris :-)
Introduction to Docker, December 2014 "Tour de France" EditionJérôme Petazzoni
Docker, the Open Source container Engine, lets you build, ship and run, any app, anywhere.
This is the presentation which was shown in December 2014 for the "Tour de France" in Paris, Lille, Lyon, Nice...
Containers, Docker, and Microservices: the Terrific TrioJérôme Petazzoni
One of the upsides of Microservices is the ability to deploy often,at arbitrary schedules, and independently of other services, instead of requiring synchronized deployments happening on a fixed time.
But to really leverage this advantage, we need fast, efficient, and reliable deployment processes. That's one of the value propositions of Containers in general, and Docker in particular.
Docker offers a new, lightweight approach to application portability.It can build applications using easy-to-write, repeatable, efficient recipes; then it can ship them across environments using a common container format; and it can run them within isolated namespaces which abstract the operating environment, independently of the distribution,versions, network setup, and other details of this environment.
But Docker can do way more than deploy your apps. Docker also enables you to generalize Microservices principles and apply them on operational tasks like logging, remote access, backups, and troubleshooting.This decoupling results in independent, smaller, simpler moving parts.
Pipework: Software-Defined Network for Containers and DockerJérôme Petazzoni
Pipework lets you connect together containers in arbitrarily complex scenarios. Pipework uses cgroups and namespaces and works with "plain" LXC containers (created with lxc-start), and with the awesome Docker.
It's nothing less than Software-Defined Networking for Linux Containers!
This is a short presentation about Pipework, given at the Docker Networking meet-up November 6th in Mountain View.
More information:
- https://github.com/jpetazzo/pipework
- http://www.meetup.com/Docker-Networking/
Docker Tips And Tricks at the Docker Beijing MeetupJérôme Petazzoni
This talk was presented in October at the Docker Beijing Meetup, in the VMware offices.
It presents some of the latest features of Docker, discusses orchestration possibilities with Docker, then gives a briefing about the performance of containers; and finally shows how to use volumes to decouple components in your applications.
Introduction to Docker at Glidewell Laboratories in Orange CountyJérôme Petazzoni
In this presentation we will introduce Docker, and how you can use it to build, ship, and run any application, anywhere. The presentation included short demos, links to further material, and of course Q&As. If you are already a seasoned Docker user, this presentation will probably be redundant; but if you started to use Docker and are still struggling with some of his facets, you'll learn some!
DevOps and Testing slides at DASA ConnectKari Kakkonen
My and Rik Marselis slides at 30.5.2024 DASA Connect conference. We discuss about what is testing, then what is agile testing and finally what is Testing in DevOps. Finally we had lovely workshop with the participants trying to find out different ways to think about quality and testing in different parts of the DevOps infinity loop.
Epistemic Interaction - tuning interfaces to provide information for AI supportAlan Dix
Paper presented at SYNERGY workshop at AVI 2024, Genoa, Italy. 3rd June 2024
https://alandix.com/academic/papers/synergy2024-epistemic/
As machine learning integrates deeper into human-computer interactions, the concept of epistemic interaction emerges, aiming to refine these interactions to enhance system adaptability. This approach encourages minor, intentional adjustments in user behaviour to enrich the data available for system learning. This paper introduces epistemic interaction within the context of human-system communication, illustrating how deliberate interaction design can improve system understanding and adaptation. Through concrete examples, we demonstrate the potential of epistemic interaction to significantly advance human-computer interaction by leveraging intuitive human communication strategies to inform system design and functionality, offering a novel pathway for enriching user-system engagements.
JMeter webinar - integration with InfluxDB and GrafanaRTTS
Watch this recorded webinar about real-time monitoring of application performance. See how to integrate Apache JMeter, the open-source leader in performance testing, with InfluxDB, the open-source time-series database, and Grafana, the open-source analytics and visualization application.
In this webinar, we will review the benefits of leveraging InfluxDB and Grafana when executing load tests and demonstrate how these tools are used to visualize performance metrics.
Length: 30 minutes
Session Overview
-------------------------------------------
During this webinar, we will cover the following topics while demonstrating the integrations of JMeter, InfluxDB and Grafana:
- What out-of-the-box solutions are available for real-time monitoring JMeter tests?
- What are the benefits of integrating InfluxDB and Grafana into the load testing stack?
- Which features are provided by Grafana?
- Demonstration of InfluxDB and Grafana using a practice web application
To view the webinar recording, go to:
https://www.rttsweb.com/jmeter-integration-webinar
Key Trends Shaping the Future of Infrastructure.pdfCheryl Hung
Keynote at DIGIT West Expo, Glasgow on 29 May 2024.
Cheryl Hung, ochery.com
Sr Director, Infrastructure Ecosystem, Arm.
The key trends across hardware, cloud and open-source; exploring how these areas are likely to mature and develop over the short and long-term, and then considering how organisations can position themselves to adapt and thrive.
Search and Society: Reimagining Information Access for Radical FuturesBhaskar Mitra
The field of Information retrieval (IR) is currently undergoing a transformative shift, at least partly due to the emerging applications of generative AI to information access. In this talk, we will deliberate on the sociotechnical implications of generative AI for information access. We will argue that there is both a critical necessity and an exciting opportunity for the IR community to re-center our research agendas on societal needs while dismantling the artificial separation between the work on fairness, accountability, transparency, and ethics in IR and the rest of IR research. Instead of adopting a reactionary strategy of trying to mitigate potential social harms from emerging technologies, the community should aim to proactively set the research agenda for the kinds of systems we should build inspired by diverse explicitly stated sociotechnical imaginaries. The sociotechnical imaginaries that underpin the design and development of information access technologies needs to be explicitly articulated, and we need to develop theories of change in context of these diverse perspectives. Our guiding future imaginaries must be informed by other academic fields, such as democratic theory and critical theory, and should be co-developed with social science scholars, legal scholars, civil rights and social justice activists, and artists, among others.
UiPath Test Automation using UiPath Test Suite series, part 4DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 4. In this session, we will cover Test Manager overview along with SAP heatmap.
The UiPath Test Manager overview with SAP heatmap webinar offers a concise yet comprehensive exploration of the role of a Test Manager within SAP environments, coupled with the utilization of heatmaps for effective testing strategies.
Participants will gain insights into the responsibilities, challenges, and best practices associated with test management in SAP projects. Additionally, the webinar delves into the significance of heatmaps as a visual aid for identifying testing priorities, areas of risk, and resource allocation within SAP landscapes. Through this session, attendees can expect to enhance their understanding of test management principles while learning practical approaches to optimize testing processes in SAP environments using heatmap visualization techniques
What will you get from this session?
1. Insights into SAP testing best practices
2. Heatmap utilization for testing
3. Optimization of testing processes
4. Demo
Topics covered:
Execution from the test manager
Orchestrator execution result
Defect reporting
SAP heatmap example with demo
Speaker:
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...UiPathCommunity
💥 Speed, accuracy, and scaling – discover the superpowers of GenAI in action with UiPath Document Understanding and Communications Mining™:
See how to accelerate model training and optimize model performance with active learning
Learn about the latest enhancements to out-of-the-box document processing – with little to no training required
Get an exclusive demo of the new family of UiPath LLMs – GenAI models specialized for processing different types of documents and messages
This is a hands-on session specifically designed for automation developers and AI enthusiasts seeking to enhance their knowledge in leveraging the latest intelligent document processing capabilities offered by UiPath.
Speakers:
👨🏫 Andras Palfi, Senior Product Manager, UiPath
👩🏫 Lenka Dulovicova, Product Program Manager, UiPath
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualityInflectra
In this insightful webinar, Inflectra explores how artificial intelligence (AI) is transforming software development and testing. Discover how AI-powered tools are revolutionizing every stage of the software development lifecycle (SDLC), from design and prototyping to testing, deployment, and monitoring.
Learn about:
• The Future of Testing: How AI is shifting testing towards verification, analysis, and higher-level skills, while reducing repetitive tasks.
• Test Automation: How AI-powered test case generation, optimization, and self-healing tests are making testing more efficient and effective.
• Visual Testing: Explore the emerging capabilities of AI in visual testing and how it's set to revolutionize UI verification.
• Inflectra's AI Solutions: See demonstrations of Inflectra's cutting-edge AI tools like the ChatGPT plugin and Azure Open AI platform, designed to streamline your testing process.
Whether you're a developer, tester, or QA professional, this webinar will give you valuable insights into how AI is shaping the future of software delivery.
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Tobias Schneck
As AI technology is pushing into IT I was wondering myself, as an “infrastructure container kubernetes guy”, how get this fancy AI technology get managed from an infrastructure operational view? Is it possible to apply our lovely cloud native principals as well? What benefit’s both technologies could bring to each other?
Let me take this questions and provide you a short journey through existing deployment models and use cases for AI software. On practical examples, we discuss what cloud/on-premise strategy we may need for applying it to our own infrastructure to get it to work from an enterprise perspective. I want to give an overview about infrastructure requirements and technologies, what could be beneficial or limiting your AI use cases in an enterprise environment. An interactive Demo will give you some insides, what approaches I got already working for real.
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Jeffrey Haguewood
Sidekick Solutions uses Bonterra Impact Management (fka Social Solutions Apricot) and automation solutions to integrate data for business workflows.
We believe integration and automation are essential to user experience and the promise of efficient work through technology. Automation is the critical ingredient to realizing that full vision. We develop integration products and services for Bonterra Case Management software to support the deployment of automations for a variety of use cases.
This video focuses on the notifications, alerts, and approval requests using Slack for Bonterra Impact Management. The solutions covered in this webinar can also be deployed for Microsoft Teams.
Interested in deploying notification automations for Bonterra Impact Management? Contact us at sales@sidekicksolutionsllc.com to discuss next steps.
2. Who am I?
Jérôme Petazzoni (@jpetazzo)
French software engineer living in California
I have built and scaled the dotCloud PaaS
(almost 5 years ago, time flies!)
I talk (too much) about containers
(first time was maybe in February 2013, at SCALE in L.A.)
Proud member of the House of Bash
(when annoyed, we replace things with tiny shell scripts)
2 / 59
3. Outline
What is a container?
(high level and low level overviews)
The building blocks
(namespaces, cgroups, copy-on-write storage)
Container runtimes
(Docker, LXC, rkt, runC, systemd-nspawn...)
Other containers
(OpenVZ, Jails, Zones...)
Hand-crafted, artisan-pick, house-blend containers
(with demos!)
3 / 59
5. High level approach:
it's a lightweight VM
I can get a shell on it
(through SSH or otherwise)
It "feels" like a VM:
own process space
own network interface
can run stuff as root
can install packages
can run services
can mess up routing, iptables...
5 / 59
6. Low level approach:
it's chroot on steroids
It's not quite like a VM:
uses the host kernel
can't boot a different OS
can't have its own modules
doesn't need initas PID 1
doesn't need syslogd, cron...
It's just a bunch of processes visible on the host machine
(contrast with VMs which are opaque)
6 / 59
7. How are they implemented?
Let's look in the kernel source!
Go to LXR
Look for "LXC" → zero result
Look for "container" → 1000+ results
Almost all of them are about data structures
(or other unrelated concepts like "ACPI containers")
There are some references to "our" containers,
in documentation
7 / 59
8. How are they implemented?
Let's look in the kernel source!
Go to LXR
Look for "LXC" → zero result
Look for "container" → 1000+ results
Almost all of them are about data structures
(or other unrelated concepts like "ACPI containers")
There are some references to "our" containers,
in documentation
??? Are containers even in the kernel ???
8 / 59
9. Ancient archeology
Five years ago...
When using the LXC tools (lxc-start, lxc-stop...)
you would often get weird messages mentioning
/cgroup/container_name/...
The cgrouppseudo filesystem has counters for CPU, RAM,
I/O, and device access (/dev/*) but nothing about network
I didn't find about "namespaces" by myself
(maybe noticing /proc/$PID/nswould have helped;
more on that later)
9 / 59
10. Modern archeology
Nowadays:
Cgroups are often in /sys/fs/cgroup
You quickly stumble on nsenter
(which tips you off about namespaces)
There is significantly more documentation everywhere
10 / 59
13. Control groups
Resource metering and limiting
memory
CPU
block I/O
network*
Device node (/dev/*) access control
*With cooperation from iptables/tc; more on that later
13 / 59
14. Generalities
Each subsystem (memory, CPU...) has a hierarchy (tree)
Hierarchies are independent
(the trees for e.g. memory and CPU can be different)
Each process belongs to exactly 1 node in each hierarchy
(think of each hierarchy as a different dimension or axis)
Each hierarchy starts with 1 node (the root)
All processes initially belong to the root of each hierarchy*
Each node = group of processes
(sharing the same resources)
*It's a bit more subtle; more on that later
14 / 59
16. Memory cgroup: accounting
Keeps track of pages used by each group:
file (read/write/mmap from block devices)
anonymous (stack, heap, anonymous mmap)
active (recently accessed)
inactive (candidate for eviction)
Each page is "charged" to a group
Pages can be shared across multiple groups
(e.g. multiple processes reading from the same files)
When pages are shared, the groups "split the bill"
16 / 59
17. Memory cgroup: limits
Each group can have (optional) hard and soft limits
Soft limits are not enforced
(they influence reclaim under memory pressure)
Hard limits will trigger a per-group OOM killer
The OOM killer can be customized (oom-notifier);
when the hard limit is exceeded:
freeze all processes in the group
notify user space (instead of going rampage)
we can kill processes, raise limits, migrate containers ...
when we're in the clear again, unfreeze the group
Limits can be set for physical, kernel, total memory
17 / 59
18. Memory cgroup: tricky details
Each time the kernel gives a page to a process,
or takes it away, it updates the counters
This adds some overhead
Unfortunately, this cannot be enabled/disabled per process
(it has to be done at boot time)
Cost sharing means thata process leaving a group
(e.g. because it terminates) can theoretically cause
an out of memory condition
18 / 59
19. Cpu cgroup
Keeps track of user/system CPU time
Keeps track of usage per CPU
Allows to set weights
Can't set CPU limits
OK, let's say you give N%
then the CPU throttles to a lower clock speed
now what?
same if you give a time slot
instructions? their exec speed varies wildly
¯_ツ_/¯
19 / 59
20. Cpuset cgroup
Pin groups to specific CPU(s)
Reserve CPUs for specific apps
Avoid processes bouncing between CPUs
Also relevant for NUMA systems
Provides extra dials and knobs
(per zone memory pressure, process migration costs...)
20 / 59
21. Blkio cgroup
Keeps track of I/Os for each group
per block device
read vs write
sync vs async
Set throttle (limits) for each group
per block device
read vs write
ops vs bytes
Set relative weights for each group
Full disclosure: this used to be clunky for async operations, and I haven't tested it in ages. ☹
21 / 59
22. Net_cls and net_prio cgroup
Automatically set traffic class or priority,
for traffic generated by processes in the group
Only works for egress traffic
Net_cls will assign traffic to a class
(that has to be matched with tc/iptables,
otherwise traffic just flows normally)
Net_prio will assign traffic to a priority
(priorities are used by queuing disciplines)
22 / 59
23. Devices cgroup
Controls what the group can do on device nodes
Permissions include read/write/mknod
Typical use:
allow /dev/{tty,zero,random,null}...
deny everything else
A few interesting nodes:
/dev/net/tun(network interface manipulation)
/dev/fuse(filesystems in user space)
/dev/kvm(VMs in containers, yay inception!)
/dev/dri(GPU)
23 / 59
24. Subtleties
PID 1 is placed at the root of each hierarchy
When a process is created, it is placed in the same groups
as its parent
Groups are materialized by one (or multiple) pseudo-fs
(typically mounted in /sys/fs/cgroup)
Groups are created by mkdirin the pseudo-fs
To move a process:
echo$PID>/sys/fs/cgroup/.../tasks
The cgroup wars: systemd vs cgmanager vs ...
24 / 59
26. Namespaces
Provide processes with their own view of the system
Cgroups = limits how much you can use;
namespaces = limits what you can see (and therefore use)
Multiple namespaces:
pid
net
mnt
uts
ipc
user
Each process is in one namespace of each type
26 / 59
27. Pid namespace
Processes within a PID namespace only see processes in
the same PID namespace
Each PID namespace has its own numbering
(starting at 1)
When PID 1 goes away, the whole namespace is killed
Those namespaces can be nested
A process ends up having multiple PIDs
(one per namespace in which its nested)
27 / 59
28. Net namespace: in theory
Processes within a given network namespace
get their own private network stack, including:
network interfaces (including lo)
routing tables
iptables rules
sockets (ss, netstat)
You can move a network interface from a netns to another
iplinksetdeveth0netnsPID
28 / 59
29. Net namespace: in practice
Typical use-case:
use vethpairs
(two virtual interfaces acting as a cross-over cable)
eth0in container network namespace
paired with vethXXXin host network namespace
all the vethXXXare bridged together
(Docker calls the bridge docker0)
But also: the magic of --netcontainer
shared localhost(and more!)
29 / 59
30. Mnt namespace
Processes can have their own root fs (à la chroot)
Processes can also have "private" mounts
/tmp(scoped per user, per service...)
Masking of /proc, /sys
NFS auto-mounts (why not?)
Mounts can be totally private, or shared
No easy way to pass along a mount
from a namespace to another ☹
30 / 59
35. Ipc namespace
Does anybody knows about IPC?
Does anybody cares about IPC?
Allows a process (or group of processes) to have own:
IPC semaphores
IPC message queues
IPC shared memory
... without risk of conflict with other instances.
35 / 59
36. User namespace
Allows to map UID/GID; e.g.:
UID 0→1999 in container C1 is mapped to
UID 10000→11999 on host
UID 0→1999 in container C2 is mapped to
UID 12000→13999 on host
etc.
Avoids extra configuration in containers
UID 0 (root) can be squashed to a non-privileged user
Security improvement
But: devil is in the details
36 / 59
37. Namespace manipulation
Namespaces are created with the clone()system call
(i.e. with extra flags when creating a new process)
Namespaces are materialized by pseudo-files in
/proc/<pid>/ns
When the last process of a namespace exits, it is destroyed
(but can be preserved by bind-mounting the pseudo-file)
It is possible to "enter" a namespace with setns()
(exposed by the nsenterwrapper in util-linux)
37 / 59
39. Copy-on-write storage
Create a new container instantly
(instead of copying its whole filesystem)
Storage keeps track of what has changed
Many options available
AUFS, overlay (file level)
device mapper thinp (block level)
BTRFS, ZFS (FS level)
Considerably reduces footprint and "boot" times
See also: Deep dive into Docker storage drivers
39 / 59
41. Orthogonality
All those things can be used independently
Use a few cgroups if you just need resource isolation
Simulate a network of routers with network namespaces
Put the debugger in a container's namespaces,
but not its cgroups (to not use its resource quotas)
Setup a network interface in an isolated environment,
then move it to another
etc.
41 / 59
42. One word about overhead
Even when you don't run containers ...
... you are in a container
Your host processes still execute in the root
namespaces and cgroups
Remember: there are three kind of lies
42 / 59
43. One word about overhead
Even when you don't run containers ...
... you are in a container
Your host processes still execute in the root
namespaces and cgroups
Remember: there are three kind of lies
Lies, damn lies, and benchmarks
43 / 59
44. Some missing bits
Capabilities
break down "root / non-root" into fine-grained rights
allow to keep root, but without the dangerous bits
however: CAP_SYS_ADMINremains a big catchall
SELinux / AppArmor ...
containers that actually contain
deserve a whole talk on their own
44 / 59
46. LXC
Set of userland tools
A container is a directory in /var/lib/lxc
Small config file + root filesystem
Early versions had no support for CoW
Early versions had no support to move images around
Requires significant amount of elbow grease
(easy for sysadmins/ops, hard for devs)
46 / 59
47. systemd-nspawn
From its manpage:
"For debugging, testing and building"
"Similar to chroot, but more powerful"
"Implements the Container Interface"
Seems to position itself as plumbing
47 / 59
48. systemd-nspawn
From its manpage:
"For debugging, testing and building"
"Similar to chroot, but more powerful"
"Implements the Container Interface"
Seems to position itself as plumbing
Recently added support for ɹǝʞɔop images
48 / 59
49. systemd-nspawn
From its manpage:
"For debugging, testing and building"
"Similar to chroot, but more powerful"
"Implements the Container Interface"
Seems to position itself as plumbing
Recently added support for ɹǝʞɔop images
#defineINDEX_HOST"index.do"/*theURLwegetthedatafrom*/"cker.io"
#defineHEADER_TOKEN"X-Do"/*theHTTPheaderfortheauthtoken*/"cker-Token:"
#defineHEADER_REGISTRY"X-Do"/*theHTTPheaderfortheregistry*/"cker-Endpoints:"
49 / 59
50. Docker Engine
Daemon controlled by REST(ish) API
First versions shelled out to LXC,
now uses its own libcontainerruntime
Manages containers, images, builds, and more
Some people think it does too many things
50 / 59
51. rkt, runC
Back to the basics!
Focus on container execution
(no API, no image management, no build, etc.)
They implement different specifications:
rkt implements appc (App Container)
runC implements OCP (Open Container Project),
leverages Docker's libcontainer
51 / 59
52. Which one is best?
They all use the same kernel features
Performance will be exactly the same
Look at:
features
design
ecosystem
52 / 59
54. OpenVZ
Also Linux
Older, but battle-tested
(e.g. Travis CI gives you root in OpenVZ)
Tons of neat features too
ploop (efficient block device for containers)
checkpoint/restore, live migration
venet (~more efficient veth)
Still developed
54 / 59
55. Jails / Zones
FreeBSD / Solaris
Coarser granularity than namespaces and cgroups
Strong emphasis on security
Great for hosting providers
Not so much for developers
(where's the equivalent of dockerrun-tiubuntu?)
Note: Solaris branded zones can run Linux binaries
55 / 59