CRIU: Time and Space Travel for Linux Containers

Kirill Kolyshkin
ContainerDays NYC, 30 Oct 2015

Agenda
• Why would we want to migrate containers
• Why wouldn't we want to migrate containers
• How complex it is to migrate containers

Live migration at a glance
• Save the state
• Transfer the state
• Restore the state
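
As a rough sketch, the three steps above could map onto command lines like the ones built below. The PID, image directory, and host name are hypothetical, and rsync and ssh are just one assumed way to move the images and reach the destination. Real invocations usually need extra options depending on the workload, so the commands are only printed here, not executed.

```python
import shlex


def migration_commands(pid, images_dir, dest_host):
    """Build (but do not run) commands for a simple, non-live migration."""
    return [
        # Save: dump the process tree rooted at pid into image files.
        ["criu", "dump", "-t", str(pid), "-D", images_dir],
        # Transfer: copy the image directory to the destination host
        # (rsync is an assumption; any file transfer would do).
        ["rsync", "-a", images_dir + "/", f"{dest_host}:{images_dir}/"],
        # Restore: executed on the destination against the copied images.
        ["ssh", dest_host, "criu", "restore", "-D", images_dir],
    ]


# Hypothetical values, for illustration only.
for cmd in migration_commands(1234, "/tmp/ckpt", "dest.example.com"):
    print(shlex.join(cmd))
```

The container is down for the whole sequence here, which is the problem the later slides on pre-copy and post-copy address.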

Container live migration

Why would we want to migrate containers?
• It's awesome!
• Load balancing in a cluster
• Kernel upgrade
– Can be done without migration
• Hardware upgrade

Why wouldn't we want to live migrate containers?

How to avoid live migrating containers
• Incoming traffic load balancing
• Microservices
• Crash-driven upgrades
• Scheduled downtimes

How to make live migration really live?
• Need to avoid migrating memory while the container is frozen
• Two ways:
– Pre-copy the memory
– Post-copy the memory

Live migration in more detail
• Pre-copy: collect and transfer the memory (might be iterative)
• Freeze the container
• Save its state
• Copy the state
• Restore
• Unfreeze
• Post-copy: swap in the memory over the network
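
The iterative pre-copy step can be illustrated with a toy simulation. This is only a sketch of the idea, not how CRIU tracks memory: pages are dictionary entries, and `dirty_rounds` is a made-up input describing which pages the still-running container rewrites during each transfer round.

```python
def precopy_migrate(source, dirty_rounds, threshold=2, max_rounds=10):
    """Simulate iterative pre-copy followed by a short frozen copy.

    source: dict page_number -> contents; mutated in place to mimic the
        container writing to memory while it keeps running.
    dirty_rounds: list of dicts; dirty_rounds[i] holds the pages rewritten
        while round i is being transferred (hypothetical input).
    Returns (destination_pages, pages_copied_while_frozen).
    """
    dest = {}
    to_send = set(source)  # the first round sends every page
    rounds = 0
    while len(to_send) > threshold and rounds < max_rounds:
        for page in to_send:
            dest[page] = source[page]
        # The container is still running and dirties pages meanwhile.
        writes = dirty_rounds[rounds] if rounds < len(dirty_rounds) else {}
        source.update(writes)
        to_send = set(writes)  # only dirtied pages need another round
        rounds += 1
    # Freeze: nothing changes any more, so copy what is left and stop.
    frozen_copy = len(to_send)
    for page in to_send:
        dest[page] = source[page]
    return dest, frozen_copy


pages = {n: f"v0-{n}" for n in range(8)}
dest, frozen = precopy_migrate(pages, [{0: "a", 1: "b", 2: "c"}, {1: "d"}])
print(frozen, dest == pages)
```

The point of iterating is that the set of pages left to copy during the freeze shrinks, as long as the container dirties memory more slowly than it can be transferred.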

Obstacles, booby traps, and rakes
• Virtual machine vs. container

What do we need to migrate
• Virtual Machine
– Environment (i.e. virtual hardware)
– CPU state
– Memory
• Container
– Environment (cgroups, namespaces)
– Processes and stuff
– Memory

Collect and copy the memory
• Virtual Machine
– All memory is at hand
• Container
– Memory is spread through the processes
– Different types of memory (shared/private, backed by a file or not)
– Need to collect the processes first
● Only then collect the memory
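
To make the "different types of memory" point concrete, here is a small sketch that classifies lines in the format of `/proc/<pid>/maps`. The sample lines are invented, and the file-versus-anonymous test (inode non-zero) is a simplification, since some shared anonymous mappings also show up with an inode.

```python
def classify_mapping(line):
    """Classify one line in /proc/<pid>/maps format.

    Fields: address range, permissions, offset, device, inode, pathname.
    """
    fields = line.split(None, 5)
    addr_range, perms = fields[0], fields[1]
    inode = int(fields[4])
    path = fields[5].strip() if len(fields) > 5 else ""
    start, end = (int(part, 16) for part in addr_range.split("-"))
    return {
        "size": end - start,
        # The fourth permission character is 's' (shared) or 'p' (private).
        "sharing": "shared" if perms[3] == "s" else "private",
        # Simplification: a non-zero inode is treated as file-backed.
        "backing": "file" if inode != 0 else "anonymous",
        "path": path,
    }


# Invented sample lines, for illustration only.
sample = [
    "00400000-0040c000 r-xp 00000000 08:01 131 /bin/cat",
    "7f1c00000000-7f1c00021000 rw-p 00000000 00:00 0 ",
    "7f1c10000000-7f1c10001000 rw-s 00000000 00:05 42 /dev/shm/x",
]
for line in sample:
    info = classify_mapping(line)
    print(info["size"], info["sharing"], info["backing"], info["path"])
```

Each combination needs different handling: a private file-backed page that was never written can be re-read from the file, while private anonymous pages exist nowhere else and have to be copied.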

Freezing
• Virtual Machine
– Suspend all CPUs
• Container
– Walk the process tree (via /proc), catch the processes, and freeze them
– The freezer cgroup helps a bit
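
Walking the tree could look roughly like the sketch below. It works on strings in the format of `/proc/<pid>/stat` so that it runs anywhere; the sample lines are invented, and this is an illustration rather than CRIU's actual code. On a live system, processes can fork while the walk is in progress, which is one reason a freezer cgroup helps.

```python
def parse_stat(stat_line):
    """Return (pid, ppid) from text in /proc/<pid>/stat format."""
    # The command name sits in parentheses and may itself contain spaces
    # or ')', so split on the last ')' rather than on whitespace.
    head, tail = stat_line.rsplit(")", 1)
    pid = int(head.split("(", 1)[0])
    ppid = int(tail.split()[1])  # tail starts with: state ppid pgrp ...
    return pid, ppid


def subtree(root, stat_lines):
    """List root and all its descendants, parents before children."""
    children = {}
    for line in stat_lines:
        pid, ppid = parse_stat(line)
        children.setdefault(ppid, []).append(pid)
    order, queue = [], [root]
    while queue:
        pid = queue.pop(0)
        order.append(pid)
        queue.extend(sorted(children.get(pid, [])))
    return order


# Invented sample data, for illustration only.
stats = [
    "10 (my (odd) name) S 1 10 10 0",
    "11 (worker) R 10 10 10 0",
    "12 (worker) S 10 10 10 0",
    "99 (other) S 2 99 99 0",
]
print(subtree(10, stats))
```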

Saving the state
• Virtual Machine
– Hardware state, tree, 300K, ~70 objects
• Container
– State of all objects, graph, 160K, ~1000 objects
– Not all objects have a decent API to get the state

Copying the state
• Virtual Machine
– Can read and copy at once, easy to serialize
• Container
– Not easy to serialize, as it's a graph, not a tree
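
A small sketch of why a graph is harder than a tree: when two processes share one object (say, a pipe), a naive recursive dump would write it twice and lose the sharing. One common approach, shown here with made-up dictionaries rather than CRIU's real image format, is to give every object an ID and store references by ID.

```python
import json


def serialize_graph(roots):
    """Flatten an object graph to JSON, storing each shared object once."""
    ids, records = {}, []

    def visit(obj):
        key = id(obj)
        if key in ids:  # already emitted: just reference it
            return ids[key]
        ids[key] = index = len(records)
        records.append(None)  # reserve the slot first, so cycles terminate
        records[index] = {
            "name": obj["name"],
            "refs": [visit(child) for child in obj.get("refs", [])],
        }
        return index

    root_ids = [visit(root) for root in roots]
    return json.dumps({"roots": root_ids, "objects": records})


# Two hypothetical processes sharing one pipe.
pipe = {"name": "pipe"}
proc1 = {"name": "proc-1", "refs": [pipe]}
proc2 = {"name": "proc-2", "refs": [pipe]}
print(serialize_graph([proc1, proc2]))
```

With a tree, the recursion alone would be enough; the ID table is the extra bookkeeping that a graph forces on the serializer, and the restore side has to rebuild the sharing from it.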

Restoring the state
• VM: recreate the memory, state of CPUs and virtual hardware
• Containers
– In-kernel: create a myriad of small objects
– In CRIU: same, but there might not be a convenient API
● Over 1000 syscalls
● Need to sort it all out
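
"Sorting it all out" is essentially a dependency-ordering problem. The sketch below uses a topological sort from Python's standard library (3.9 or later); the object names and their dependencies are invented for illustration and are not CRIU's actual restore order.

```python
from graphlib import TopologicalSorter

# Hypothetical dependencies: each key can only be restored after
# everything in its set has been restored.
DEPENDENCIES = {
    "mount namespace": set(),
    "process tree": {"mount namespace"},
    "memory mappings": {"process tree"},
    "open files": {"process tree", "mount namespace"},
    "sockets": {"open files"},
    "threads and registers": {"memory mappings", "open files", "sockets"},
}


def restore_order(dependencies):
    """Return one valid order in which the objects can be recreated."""
    return list(TopologicalSorter(dependencies).static_order())


for step in restore_order(DEPENDENCIES):
    print(step)
```

If the dependencies contain a cycle, `static_order()` raises `graphlib.CycleError`, which is the case where a restorer has to break the cycle by other means.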

Unfreeze
• VM: resume the virtual CPUs
• Container
– Either send SIGCONT through the tree
– Or “unfreeze” the cgroup
– Problem: need to wake processes in the proper order
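
For the cgroup route, freezing and thawing come down to writing a control file. The sketch below assumes the cgroup v1 freezer interface (`freezer.state`, values `FROZEN` and `THAWED`) and, if present, the cgroup v2 file `cgroup.freeze`, which appeared in later kernels. The demo writes to a scratch directory standing in for a real cgroup directory, so it needs no privileges.

```python
import tempfile
from pathlib import Path


def set_frozen(cgroup_dir, frozen):
    """Freeze or thaw every process in a cgroup; returns the file written."""
    cgroup_dir = Path(cgroup_dir)
    v2_file = cgroup_dir / "cgroup.freeze"
    if v2_file.exists():  # cgroup v2 interface
        v2_file.write_text("1" if frozen else "0")
        return v2_file
    v1_file = cgroup_dir / "freezer.state"  # cgroup v1 freezer controller
    v1_file.write_text("FROZEN" if frozen else "THAWED")
    return v1_file


# Demo on a scratch directory; a real path would live under /sys/fs/cgroup.
with tempfile.TemporaryDirectory() as scratch:
    (Path(scratch) / "freezer.state").write_text("THAWED")
    written = set_frozen(scratch, True)
    print(written.name, written.read_text())
    set_frozen(scratch, False)
    print(written.name, written.read_text())
```

Thawing the whole cgroup in one write sidesteps sending signals process by process, though it does not by itself answer the ordering problem noted above.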

Post-copy memory migration: network swap device
• Not yet ready for either VMs or CTs
• userfaultfd by Andrea Arcangeli of Red Hat
– a file descriptor that reports page faults to userspace and lets it supply the missing memory
– merged into the 4.3 kernel
– work in progress to use it for KVM/QEMU
• Container
– userfaultfd alone is not sufficient for the CRIU case
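
The post-copy idea itself can be sketched without any kernel support: the restored container starts with no memory, and each first access to a page "faults" and pulls that page from the source. The class below is a pure-Python toy model of that behaviour, not the userfaultfd API; `fetch_page` stands in for a network request to the source host.

```python
class LazyMemory:
    """Toy model of post-copy: pages are fetched on first access."""

    def __init__(self, fetch_page):
        self._fetch = fetch_page  # hypothetical callback: page number -> bytes
        self._pages = {}          # pages already present on the destination
        self.faults = 0

    def read(self, page_no):
        if page_no not in self._pages:
            # "Page fault": the page is still on the source, go and get it.
            self.faults += 1
            self._pages[page_no] = self._fetch(page_no)
        return self._pages[page_no]


# The dictionary plays the role of the memory left behind on the source.
source_pages = {0: b"code", 1: b"heap", 2: b"stack"}
memory = LazyMemory(source_pages.__getitem__)
memory.read(1)
memory.read(1)  # second access is served locally
print(memory.faults, sorted(memory._pages))
```

Only pages that are actually touched get transferred, at the cost of a network round trip on each first access.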

Implementation
• https://criu.org
• criu@openvz.org
• plus.google.com/+CriuOrg
• @__criu__
• github: xemul/criu

CRIU uses beyond live migration
• HPC jobs: periodic checkpoints
• Speeding up slow-booting services
• That magical SAVE button, e.g. in games
• Speeding up software testing
• Reverse debugging

Live migration
• P.Haul
– Process hauler
– http://criu.org/P.Haul
– Uses CRIU for checkpoint/restore (c/r)

That's all Folks!
Kirill Kolyshkin
kir@openvz.org
