The State of Linux Containers
Gaikai
PS Now announcement at CES 2014
Gaikai
- caching
+ controller feedback
Agenda
1. “Linux Container” / “Docker Ecosystem” in a Nutshell
2. Confusion about Ecosystem / Vision to tackle it
3. Docker -> SWARM -> SLURM -> BigData
4. Discussion of Opportunities and Problems
The Bits and Pieces…
Linux Containers
[Diagram: Traditional Virtualisation (SERVER, HOST KERNEL, HYPERVISOR, then one KERNEL and one Userland (OS) per SERVICE) vs. Containerisation (SERVER, HOST KERNEL, then one Userland (OS) per SERVICE); example userlands: Ubuntu:14.04, Ubuntu:15.10, RHEL7.2, Tiny Core Linux]
Containers do not spin up a distinct kernel
all containers & the host share the same kernel
user-lands are independent
they are separated by Kernel Namespaces
Kernel Namespaces
Containers are ‘grouped processes’
isolated by Kernel Namespaces
resource restrictions applicable through CGroups (disk/netIO)
[Diagram: HOST running container1 (bash, ls -l), container2 (apache, consul), container3 (mysqld, consul), container4 (slurmd, ssh, consul); Namespaces: PID, Network, Mount, IPC, UTS]
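The namespace membership of a process can be inspected from the host through /proc. A minimal sketch, assuming a Linux host where /proc/<pid>/ns/* links are readable; the function names are made up for illustration, and on other systems the lookup simply returns an empty result.

```python
import os

# Namespace types named on the slide: PID, Network, Mount, IPC, UTS.
NAMESPACES = ("pid", "net", "mnt", "ipc", "uts")


def parse_ns_link(link):
    """Split a link target such as 'pid:[4026531836]' into (type, inode)."""
    ns_type, _, rest = link.partition(":")
    return ns_type, int(rest.strip("[]"))


def namespaces_of(pid="self"):
    """Return {namespace type: inode} for one process (Linux only)."""
    result = {}
    for ns in NAMESPACES:
        try:
            ns_type, inode = parse_ns_link(os.readlink(f"/proc/{pid}/ns/{ns}"))
        except (OSError, ValueError):
            continue  # not Linux, process gone, or not permitted
        result[ns_type] = inode
    return result


def shared_namespaces(pid_a, pid_b):
    """List the namespace types two processes have in common."""
    a, b = namespaces_of(pid_a), namespaces_of(pid_b)
    return sorted(ns for ns in a if ns in b and a[ns] == b[ns])
```

Two processes inside the same container would be expected to share all five types, while a containerised process and a plain host process would typically differ in most of them.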
Docker Engine
Container Runtime Daemon
creates/…/removes containers, exposes REST API
handles Namespaces, CGroups, bind-mounts, etc.
IP connectivity by default via ‘host-only’ network bridge
[Diagram: SERVER with eth0 and the docker0 bridge; container1 and container2 attached to docker0; Docker-Engine]
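The REST API of the engine can be reached over its local Unix socket. A minimal sketch, assuming the default socket path /var/run/docker.sock and the engine's container listing endpoint (GET /containers/json); the class and function names are hypothetical helpers, not part of any Docker library.

```python
import http.client
import json
import socket


class UnixHTTPConnection(http.client.HTTPConnection):
    """HTTP connection that talks to a Unix domain socket instead of TCP."""

    def __init__(self, socket_path, timeout=5.0):
        super().__init__("localhost", timeout=timeout)
        self.socket_path = socket_path

    def connect(self):
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        sock.settimeout(self.timeout)
        sock.connect(self.socket_path)
        self.sock = sock


def list_containers(socket_path="/var/run/docker.sock"):
    """Ask the engine for its running containers (needs a running engine)."""
    conn = UnixHTTPConnection(socket_path)
    try:
        conn.request("GET", "/containers/json")
        return json.loads(conn.getresponse().read().decode("utf-8"))
    finally:
        conn.close()


def container_names(containers):
    """Extract names from the listing; the API reports them with a leading '/'."""
    return [name.lstrip("/") for c in containers for name in c.get("Names", [])]
```

Calling container_names(list_containers()) on a host with a running engine would list what the docker CLI shows under NAMES; the CLI itself is just another client of this API.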
Docker Compose
Describes a stack of container configurations
instead of writing a small bash script…
… it holds the runtime configuration as a YAML file.
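The difference between a script and a declarative description can be sketched as follows. This is only an illustration of the idea, not how Compose works internally: the stack is plain data, and the `docker run` invocations a bash script would contain are derived from it. The image names, the password and the helper names are made up; only the flags -d, --name, -p and -e are real `docker run` options.

```python
# Hypothetical two-service stack, described as data instead of as a script.
STACK = {
    "web": {"image": "example/apache", "ports": ["8080:80"]},
    "db": {
        "image": "example/mysql",
        "environment": {"MYSQL_ROOT_PASSWORD": "secret"},
    },
}


def to_run_command(name, service):
    """Translate one service description into a 'docker run' argument list."""
    cmd = ["docker", "run", "-d", "--name", name]
    for port in service.get("ports", []):
        cmd += ["-p", port]
    for key, value in sorted(service.get("environment", {}).items()):
        cmd += ["-e", f"{key}={value}"]
    cmd.append(service["image"])
    return cmd


def stack_commands(stack):
    """One command per service, in a stable (alphabetical) order."""
    return [to_run_command(name, stack[name]) for name in sorted(stack)]
```

Keeping the description as data means it can be versioned, diffed and reused across hosts, which is the point of holding the runtime configuration in a file.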
Docker Networking
Docker Networking spans networks across engines
KV-store to synchronise (Zookeeper, etcd, Consul)
VXLAN to pass messages along
[Diagram: SERVER0, SERVER1 … SERVER<n>, each running Consul and a Docker-Engine; the Consul agents form one Consul DC; container0, container1 … containerN attached to the network ‘global’]
Docker Swarm
Docker Swarm proxies docker-engines
serves an API endpoint in front of multiple docker-engines
makes placement decisions.
[Diagram: SERVER0, SERVER1 … SERVER<n>, each with a Docker-Engine (:2376) and a swarm-client; swarm-master (:2375); container1 started with -e constraint:node==SERVER0]
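A placement decision of this kind can be sketched in a few lines. This is a simplified illustration loosely modelled on a ‘spread’ strategy (prefer the node with the fewest containers), not Swarm's actual scheduler; only `==` constraints are handled, and the node data structure is an assumption made for the example.

```python
def parse_constraint(expr):
    """Split 'key==value' into (key, value); other operators are not handled."""
    key, sep, value = expr.partition("==")
    if not sep or not key or not value:
        raise ValueError(f"unsupported constraint: {expr!r}")
    return key.strip(), value.strip()


def matches(name, info, key, value):
    """'node' compares against the node name, anything else against its labels."""
    if key == "node":
        return name == value
    return info.get("labels", {}).get(key) == value


def place(nodes, constraints=()):
    """Pick a node: filter by constraints, then take the least loaded one.

    nodes maps a node name to {"containers": int, "labels": {...}}.
    """
    candidates = dict(nodes)
    for expr in constraints:
        key, value = parse_constraint(expr)
        candidates = {
            n: i for n, i in candidates.items() if matches(n, i, key, value)
        }
    if not candidates:
        raise RuntimeError("no node satisfies the constraints")
    # Sorting first makes ties resolve to the alphabetically first node.
    return min(sorted(candidates), key=lambda n: candidates[n].get("containers", 0))
```

With three nodes holding 3, 1 and 1 containers, an unconstrained request lands on the first of the two lightly loaded nodes, while `node==SERVER0` pins the container to SERVER0 regardless of load.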
Docker Swarm [cont]
query docker-engine
query docker-swarm
Introduce new Technologies
Introducing new Tech
Self-perception when introducing new tech…
credit: TF2 - Meet the Pyro
Introducing new Tech
… not always the same as the perception of others.
credit: TF2 - Meet the Pyro
Docker Buzzword Chaos!
Distributions
Solutions
Auto-Scaling
On-Premise & OverSpill
Orchestration
self-healing
production-ready
enterprise-grade
Vision
1. No special distributions
useful for certain use-cases, such as elasticity and green-field deployment
not so much for an on-premise datacenter w/ legacy in it.
2. Leverage existing processes/resources
install workflow, syslog, monitoring
security (ssh infrastructure), user auth.
3. Keep up with the docker ecosystem
incorporate new features of engine, swarm, compose
networking, volumes, user-namespaces
Reduce to the max!
Testbed
Hardware (courtesy of )
8x Sun Fire x2250 (2x 4core XEON, 32GB, Mellanox ConnectX-2)
Software
Base installation
CentOS 7.2 base installation (updated from 7-alpha)
Ansible
consul, sensu
docker v1.10, docker-compose
docker SWARM
Docker Networking
Synchronised by Consul
[Diagram: node1, node2 … node8, each running Consul and a Docker-Engine; the Consul agents form one Consul DC]
Docker SWARM
Synchronised by Consul KV-store
[Diagram: node1, node2 … node8, each running Consul (one Consul DC) and a Docker-Engine; swarm agents and a swarm master on top form the SWARM]
SLURM Cluster
SLURM within SWARM
[Diagram: node1, node2 … node8 with Consul (one Consul DC), Docker-Engines and the SWARM (swarm agents, swarm master); slurmctld and slurmd containers form the SLURM cluster]
SLURM Cluster [cont]
SLURM Cluster [cont]
SLURM within SWARM
slurmd within app-container
pre-stage containers
[Diagram: node1, node2 … node8 with Consul (one Consul DC), Docker-Engines and the SWARM (swarm agents, swarm master); slurmctld and slurmd form the SLURM cluster; hpcg containers on the nodes]
MPI Benchmark
http://qnib.org/mpi
http://qnib.org/mpi-paper
SLURM Cluster [cont]
SLURM within SWARM
slurmd within app-container
pre-stage containers
[Diagram: node1, node2 … node8 with Consul (one Consul DC), Docker-Engines and the SWARM (swarm agents, swarm master); slurmctld and slurmd form the SLURM cluster; hpcg and OpenFOAM containers on the nodes]
OpenFOAM Benchmark
http://qnib.org/immutable
http://qnib.org/immutable-paper
Samza Cluster
Distributed Samza
Zookeeper and Kafka cluster
Samza instances to run jobs
[Diagram: node1, node2 … node8 with Consul (one Consul DC), Docker-Engines and the SWARM (swarm agents, swarm master); zookeeper, kafka and samza containers on the nodes]
$ cat test.log | awk '{print $1}' | sed -e 's/HPC/BigData/g' | tee out.log
To Be Explored
Small vs. Big
1. What to base images on?
Ubuntu/Fedora: ~200MB
Debian: ~100MB
Alpine Linux: 5MB (musl-libc)
2. Trim the images down at all cost?
How about debugging tools? Possibility to run tools on the host and ‘inspect’ namespaced processes inside of a container.
If PID-sharing arrives, carving out (e.g.) monitoring could be a thing.
One vs. Many Processes
1. In an ideal world…
a container only runs one process, e.g. the HPC solver.
2. In reality…
MPI wants to connect to an sshd within the job-peers
monitoring, syslog, service discovery should be present as well.
3. How fast / aggressively to break with traditional approaches?
Docker Network
Plugin System
VXLAN
MACVLAN
How about IPoIB?
Reproducibility / Downscaling
Running OpenFOAM on small scale is cumbersome
manually install OpenFOAM on a workstation
be confident that the installation works correctly
A containerised OpenFOAM installation tackles both
http://qnib.org/immutable
http://qnib.org/immutable-paper
Service Discovery
1. Since the environments are rather dynamic…
how do the containers discover services?
external registry as part of the framework?
discovery service as part of the container stacks?
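With a discovery service such as Consul in the stack, a container can look up its peers at start-up instead of having addresses baked in. A minimal sketch, assuming a Consul agent reachable on its default HTTP port 8500 and using its catalog endpoint /v1/catalog/service/<name>; the function names and the service name in the usage note are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Assumed default address of a local Consul agent.
DEFAULT_BASE = "http://127.0.0.1:8500"


def catalog_url(service, base=DEFAULT_BASE):
    """Build the catalog query URL for one service name."""
    return f"{base}/v1/catalog/service/{urllib.parse.quote(service)}"


def endpoints(entries):
    """Turn catalog entries into (host, port) pairs.

    An empty ServiceAddress means the service uses its node's Address.
    """
    return [
        (entry.get("ServiceAddress") or entry["Address"], entry["ServicePort"])
        for entry in entries
    ]


def discover(service, base=DEFAULT_BASE):
    """Query the agent and return the endpoints (needs a running agent)."""
    with urllib.request.urlopen(catalog_url(service, base), timeout=5) as resp:
        return endpoints(json.loads(resp.read().decode("utf-8")))
```

A job container could call discover("slurmctld") on start-up to find its controller; whether the registry lives outside the stack or inside it only changes the base URL.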
Orchestration Frameworks
With Docker Swarm it is rather easy
to spin up a Kubernetes or Mesos cluster within Swarm.
[Diagram: SERVER0, SERVER1 … SERVER<n>, each with a Docker-Engine and a swarm-client under one swarm-master; etcd and kubelet containers on each server, plus scheduler and apiserver]
Immutable vs. Config Mgmt
1. Containers should be controlled via ENV or flags
External access/change of a running container is discouraged
2. Configuration management
Downgraded to bootstrap a host?
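Controlling a container via ENV means the entrypoint reads its settings once at start-up and nothing is changed from the outside afterwards. A minimal sketch of such an entrypoint helper; the variable names SOLVER, THREADS and DEBUG and their defaults are made up for illustration.

```python
import os


def load_settings(environ=None):
    """Read the container's settings from environment variables.

    SOLVER, THREADS and DEBUG are hypothetical names; unset variables
    fall back to defaults so the image also runs without any -e flags.
    """
    env = os.environ if environ is None else environ
    return {
        "solver": env.get("SOLVER", "hpcg"),
        "threads": int(env.get("THREADS", "1")),
        "debug": env.get("DEBUG", "false").lower() in ("1", "true", "yes"),
    }
```

The same image can then be started with different `-e` values per job, and a changed setting means starting a new container rather than modifying a running one.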
Continuous Dev./Integration
If containers are immutable within the pipeline
testing/deployment should be automated
developers should have a production replica
Docker Momentum
Software Dev
DatacenterOps
IT Tinkering (Hello World)
Continuous Dev/Int/Dep
Microservices, hyper scale
Big Data
High Performance Computing
HPC
Disclaimer: subjective exaggeration
Docker in Software Development
Spinning up a production-like environment is great
MongoDB, PostgreSQL, memcached as separate containers
python2.7, python3.4
Like python’s virtualenv on steroids,
iteration speedup through reproducibility
Docker in HPC development
Spinning up a production-like environment is…
…not that easy
focus more on engineer/scientist, not the software-developer
1. For development it might work
close to non-HPC software dev
2. But is that the iteration-focus?
rather job settings / input data?
Separation of Concerns?
Split input iteration / development from operation
non-distributed stays vanilla
transition to HPC cluster using tech to foster operation
http://gmkurtzer.github.io/singularity
Input/Dev
containerd Integration
Docker-Engine 1.11 will not be the parent of containers
runC usage under the hood
Recap aka. IMHO
1. Separate Dev and Ops
don’t block the momentum fostering iteration speed in Development
2. Using vanilla docker-tech
keep up with the ecosystem and prevent vendor/ecosystem lock-in
3. 80/20 rule
have caveats on the radar but don’t bother too much
everything is so fast moving - it’s hard to predict
Q&A
https://github.com/qnib/hpcac-cluster2016
http://qnib.org
eGalea Workshop (Pisa)

<plz ping me if you are interested>
23.06.2016
