Talk from Docker meetup Jakarta. Presents and demos the Linux kernel features that container runtimes are built on: chroot, namespaces, cgroups, and capabilities.
12. Ingredient #3: namespaces
Limit the “view” of a container:
Process namespace (pid)
Network namespace (net)
Mount namespace (mnt)
https://en.wikipedia.org/wiki/Linux_namespaces
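To see which namespaces a process lives in, one option is to read the symlinks under /proc. The snippet below is a minimal sketch, assuming a Linux /proc filesystem; the helper name `ns_id` is made up for illustration and is not from the talk.

```shell
# ns_id TYPE [PID]: print the namespace identifier, e.g. "pid:[4026531836]".
# Hypothetical helper for illustration; defaults to the current shell.
ns_id() {
    readlink "/proc/${2:-$$}/ns/$1"
}

# Show the three namespace types named on the slide for this shell.
for ns in pid net mnt; do
    printf '%-4s %s\n' "$ns" "$(ns_id "$ns")"
done
```

Two processes that print the same identifier for a type share that namespace.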
13. Ingredient #3: namespaces
Like chroot, but for other parts of the system:
Process trees
Network interfaces
Mount volumes
clone(2): http://man7.org/linux/man-pages/man2/clone.2.html
unshare(2): http://man7.org/linux/man-pages/man2/unshare.2.html
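The unshare(1) command wraps the unshare(2) syscall, so the effect on the process tree can be tried from a shell. This is a rough sketch, assuming util-linux `unshare` is installed and unprivileged user namespaces are allowed; where they are not, the hypothetical helper `pid_in_new_ns` just reports "unavailable".

```shell
# Run a shell in fresh user + PID namespaces and print the PID it sees for itself.
# --map-root-user lets an unprivileged user attempt this; it may still be blocked.
pid_in_new_ns() {
    unshare --user --map-root-user --pid --fork sh -c 'echo $$' 2>/dev/null \
        || echo "unavailable"
}

echo "PID of this shell on the host: $$"
echo "PID seen inside new namespace: $(pid_in_new_ns)"
```

When the namespaces can be created, the inner shell is the first process in its own process tree.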
15. Ingredient #4: enter namespaces
Namespaces are composable
Example: Kubernetes pod
setns(2): http://man7.org/linux/man-pages/man2/setns.2.html
[Diagram: k8s pod]
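Whether two processes share a namespace, as containers in a pod typically do for networking, can be checked by comparing their /proc entries. A small sketch follows; `same_ns` is a hypothetical helper, and a background `sleep` stands in for a second container.

```shell
# same_ns TYPE PID1 PID2: succeed only if both processes are in the same namespace.
same_ns() {
    a=$(readlink "/proc/$2/ns/$1") && b=$(readlink "/proc/$3/ns/$1") && [ "$a" = "$b" ]
}

sleep 30 &    # stand-in for a second process in the same "pod"
child=$!
for ns in net ipc uts mnt; do
    if same_ns "$ns" $$ "$child"; then echo "$ns: shared"; else echo "$ns: separate"; fi
done
kill "$child"
wait "$child" 2>/dev/null || true
```

A plain child process inherits every namespace from its parent; a process only ends up in a different one after clone(2), unshare(2), or setns(2).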
16. # PID=321
# ls /proc/$PID/ns
cgroup ipc mnt net pid user uts
# nsenter \
    --pid=/proc/$PID/ns/pid \
    --mnt=/proc/$PID/ns/mnt \
    chroot $PWD/rootfs /bin/bash
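nsenter also accepts `--target PID`, which resolves the /proc/PID/ns/* paths itself, so the command above can likely be shortened. The sketch below only prints the command rather than running it, since entering namespaces needs root; `nsenter_cmd` is a hypothetical helper.

```shell
# nsenter_cmd PID ROOTFS: print (do not run) an nsenter invocation for that PID.
# Hypothetical helper; refuses PIDs that have no /proc entry.
nsenter_cmd() {
    if [ ! -d "/proc/$1/ns" ]; then
        echo "no such process: $1" >&2
        return 1
    fi
    echo "nsenter --target $1 --pid --mount chroot $2 /bin/bash"
}

nsenter_cmd $$ "$PWD/rootfs"
```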
17. Ingredient #5: volume mounts
Inject files into our chroot
$ docker run -d \
    --name=nginxtest \
    -v nginx-vol:/usr/share/nginx/html \
    nginx:latest
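Under the hood, injecting files into a chroot comes down to a bind mount. Here is a minimal sketch of that mechanism, assuming unprivileged user and mount namespaces are permitted; the directory names and the helper `bind_mount_demo` are invented for the example, and the helper prints "unavailable" when the mount is not allowed.

```shell
# Bind-mount a host directory into a throwaway "rootfs" and read a file through it.
bind_mount_demo() {
    work=$(mktemp -d) || return 1
    mkdir -p "$work/host-vol" "$work/rootfs/data"
    echo "hello from host" > "$work/host-vol/index.html"
    # The mount lives only inside the new mount namespace and vanishes on exit.
    unshare --user --map-root-user --mount sh -c '
        mount --bind "$1/host-vol" "$1/rootfs/data" &&
        cat "$1/rootfs/data/index.html"
    ' sh "$work" 2>/dev/null || echo "unavailable"
    rm -rf "$work"
}

bind_mount_demo
```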
27. Ingredient #8: capabilities
SELinux, seccomp, and AppArmor deserve coverage here too
This talk shows Linux capabilities instead
http://man7.org/linux/man-pages/man7/capabilities.7.html
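The effective capabilities of a process are exposed as a hex bitmask in the `CapEff` field of /proc/PID/status. The sketch below decodes a few bits of that mask; `cap_bit_set` is a hypothetical helper, and the bit numbers are taken from <linux/capability.h>.

```shell
# cap_bit_set HEXMASK BIT: succeed if capability number BIT is set in HEXMASK.
cap_bit_set() {
    [ $(( (0x$1 >> $2) & 1 )) -eq 1 ]
}

# Read the effective capability mask of this shell; fall back to 0 if missing.
capeff=$(awk '/^CapEff:/ { print $2 }' "/proc/$$/status" 2>/dev/null)
capeff=${capeff:-0}

for entry in CAP_NET_ADMIN:12 CAP_SYS_CHROOT:18 CAP_SYS_ADMIN:21; do
    name=${entry%%:*}
    bit=${entry##*:}
    if cap_bit_set "$capeff" "$bit"; then echo "$name: yes"; else echo "$name: no"; fi
done
```

Run as root on the host, all three usually report "yes"; an unprivileged shell usually reports "no" for each.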
30. Ingredient #9: network namespace
Huge topic, so only a simple demo for now
For the impatient (probably the next talk):
https://github.com/girikuncoro/netns-demo
31. $ sudo unshare -n chroot rootfs
# ip addr
# ip link set dev lo up
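A fresh network namespace starts out nearly empty, which is why `lo` has to be brought up by hand. The sketch below lists the interfaces visible inside a new namespace without needing `ip` or a rootfs, assuming unprivileged user namespaces are allowed; `ifaces_in_new_netns` is a hypothetical helper. Expect `lo`, possibly alongside fallback tunnel devices on some kernels.

```shell
# List interface names from /proc/net/dev, which reflects the reader's own netns.
list_ifaces() {
    awk -F: 'NR > 2 { gsub(/ /, "", $1); print $1 }' /proc/net/dev
}

# Same listing, but taken from inside fresh user + network namespaces.
ifaces_in_new_netns() {
    unshare --user --map-root-user --net \
        awk -F: 'NR > 2 { gsub(/ /, "", $1); print $1 }' /proc/net/dev 2>/dev/null \
        || echo "unavailable"
}

echo "host interfaces:";          list_ifaces
echo "new namespace interfaces:"; ifaces_in_new_netns
```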
32. $ sudo ip link add veth0 type veth peer name veth1
$ sudo ip link set veth1 netns $PID
$ sudo ip address add 10.1.1.2/24 dev veth0
$ sudo ip link set dev veth0 up
# (inside namespace)
# ip address add 10.1.1.3/24 dev veth1
# ip link set dev veth1 up
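The two ends can reach each other without any extra routes because 10.1.1.2 and 10.1.1.3 fall in the same /24 network. The arithmetic behind that can be sketched in plain shell; `ip_to_int` and `same_subnet` are hypothetical helpers written for this example.

```shell
# ip_to_int A.B.C.D: pack the four octets of an IPv4 address into one integer.
ip_to_int() {
    old_ifs=$IFS; IFS=.
    set -- $1          # split the dotted quad on "."
    IFS=$old_ifs
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# same_subnet ADDR1 ADDR2 PREFIXLEN: succeed if both addresses share the prefix.
same_subnet() {
    mask=$(( (0xFFFFFFFF << (32 - $3)) & 0xFFFFFFFF ))
    [ $(( $(ip_to_int "$1") & mask )) -eq $(( $(ip_to_int "$2") & mask )) ]
}

if same_subnet 10.1.1.2 10.1.1.3 24; then
    echo "veth0 and veth1 are on the same /24"
fi
```

With the links up on both sides, `ping -c 1 10.1.1.3` from the host is a simple way to confirm the pair works.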
33. Conclusion
Containers are a combination of Linux kernel features
Docker, rkt, and LXC (container runtimes) are just opinionated wrappers around these
34. References
Containers from scratch, Eric Chiang
https://ericchiang.github.io/post/containers-from-scratch/
Building minimal containers, Brian Redbeard
https://github.com/brianredbeard/minimal_containers
Namespaces in operation, Michael Kerrisk
https://lwn.net/Articles/531114/
cgroups v1, Paul Menage
https://www.kernel.org/doc/Documentation/cgroup-v1/cgroups.txt
Bocker, Docker implemented in 100 lines of bash
https://github.com/p8952/bocker