Checkpoint/Restore: are we there yet?

Checkpoint/Restore In Userspace, or CRIU, is a software tool for the Linux operating system. Using this tool, you can freeze a running application (or part of it) and checkpoint it to a hard drive as a collection of files. You can then use those files to restore the application and run it from the point at which it was frozen. The distinctive feature of the CRIU project is that it is implemented mainly in user space.
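
The freeze-then-restore cycle described above can be sketched as a pair of command lines. This is a minimal illustration that assumes a current `criu` binary and its `dump`/`restore` subcommands with the `--tree`, `--images-dir` and `--shell-job` options; the 2012-era tool discussed in the slides was called `crtools` and its options may differ. The helper names and the example PID and directory are hypothetical. The code only builds the argument lists and does not run anything.

```python
def criu_dump_command(pid, images_dir):
    """Build the argument list that checkpoints the process tree rooted at `pid`.

    Assumption: a modern `criu` CLI. --tree selects the root task,
    --images-dir is the directory that receives the image files, and
    --shell-job permits dumping a task started from an interactive shell.
    """
    return ["criu", "dump", "--tree", str(pid),
            "--images-dir", images_dir, "--shell-job"]


def criu_restore_command(images_dir):
    """Build the argument list that restores tasks from the images in `images_dir`."""
    return ["criu", "restore", "--images-dir", images_dir, "--shell-job"]


if __name__ == "__main__":
    # Hypothetical PID and directory, for illustration only.
    print(" ".join(criu_dump_command(1234, "/tmp/ckpt")))
    print(" ".join(criu_restore_command("/tmp/ckpt")))
```

To actually checkpoint something, the lists could be passed to `subprocess.run(..., check=True)`, typically with root privileges.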

  • Everything is on the slides.
  • A brief C/R history: the OpenVZ version, Oren's version, the attempt to merge Oren's version upstream, the CRIU proof of concept, Linus' “OK, let's take it”, and the first two releases.
  • Suppose you have an application. It has a variety of resources associated with it: memory, open files, credentials, etc. There can be more than one application involved, and some of them may share resources. And that's not all: they may live in an environment (which we call a container) with its own resources that are not bound to any task, such as the networking configuration or System V IPC objects. What we do in CRIU is serialize the state of this whole thing into an image file (well, it's a set of files, but still). Later we can take this image and recreate the applications, with their resources and environment, in the very same state they were in before we dumped them.
  • - Linked clones. Disk space. I/O performance. GPL and ESXi

    1. Checkpoint-restore in userspace. Are we there yet? Pavel Emelyanov, LinuxCon Europe 2012
    2. What is C/R and what is it for? C/R is the ability to snapshot an application's state and restore it from that state at any later time and place. Usage scenarios:
        - Live migration
        - Reboot-less kernel update
        - Application start-up boost
        - Working environment snapshots
        - HPC load balancing
        - ...
    3. Is it possible to do all these nice things now? Yes! Almost. And we're close to it! This talk answers:
        - How shall we be able to do it?
        - How close to it are we?
        - How far from “impossible to” are we?
        - What has happened since then?
    4. A brief history of C/R in Linux
        - 2005: OpenVZ project starts, with live-migration support as an all-in-kernel feature
        - 2008: First collaborative attempt to get C/R upstream
        - 2010: First more-or-less complete version (over 100 patches)
        - 2011: First attempt to do C/R mostly in user space
        - Jan 2012: Linus decided to merge the first set of patches upstream
        - Jul 2012: CRIU v0.1
        - Sep 2012: CRIU v0.2 + LXC support
    5. CRIU project ultimate goal. [Diagram: applications (APP) with their resources (timers, FS, creds, MM, ...), some of them shared, plus environment resources (IPC, network, ...); "dump" writes them into an image and "restore" recreates them from it.]
    6. CRIU project concept. [Diagram: on dump, the CRIU tool asks the kernel "What files are opened?" for an APP holding an FD; on restore, the recreated ~APP opens the file again to get the FD back.]
    7. Existing kernel APIs. [Diagram: the kernel interfaces available for dump and restore: proc, system calls and netlink, labelled "about self" and "about anybody".]
    8. How CRIU grows up. [Diagram: an APP holds a resource FOO; on dump the CRIU tool needs "info on FOO-s" from the kernel, which cannot provide it yet, so the kernel is extended ("Get FOO ++"); on restore the ~APP gets FOO back.]
    9. CRIU project grow-up concept (Linus vision): "... this is a project by various mad Russians to perform c/r mainly from userspace, with various oddball helper code added into the kernel where the need is demonstrated. So rather than some large central lump of code, what we have is little bits and pieces popping up in various places which either expose something new or which permit something which is normally kernel-private to be modified ..."
    10. Kernel impact
        - ~110 patches merged
        - ~15 patches in flight
        - 9 new features appeared (1 C/R-only)
        - 2 new features to come
    11. The most interesting new features in kernel
        - Parasite code injection: reads task state that a task can currently retrieve only about itself
        - The kcmp system call: helps check which kernel objects are shared between processes
        - Socket information dumping via netlink (sock_diag): an extensible engine for retrieving socket state
        - TCP repair mode: reads the internal state of a TCP connection and reconstructs it from scratch on a freshly created socket
    12. Other new features in kernel
        - Virtual net device indices: allow restoring network devices in a namespace
        - Proc map_files directory: find out exactly which file is mapped; mapping sharing info
        - Socket peeking offset: allows peeking at socket queues (reading without removing data from the queue)
        - More gettable socket options: bound device, packet filter
    13. CRIU features so far
        - x86_64 architecture
        - Process tree linkage
        - Multi-threaded apps
        - Memory mappings of all kinds
        - Terminals, groups and sessions
        - Open files (+ shared and unlinked)
        - Established TCP connections
        - UNIX sockets
        - LXC container environment
        - Kernel V3.6
        - IPC
        - ...
        - Network
        - Non-POSIX files (inotify, epoll, etc.)
    14. How we test it
        - ZDTM: a set of atomic tests for every new piece of functionality
        - Real software:
            - Apache
            - MySQL
            - Make and gcc
            - Tar and gzip
            - Sshd with connections
            - Screen with top inside
            - VNC with xscreensaver and client connection
            - NGINX
            - MongoDB
            - tcpdump
    15. Main plans for the nearest future
        - Full OS resources coverage
        - Merge in-flight patches, so that everything works on a vanilla kernel
        - Properly integrate crtools with LXC and OpenVZ
        - Live-migration script
        - Pre-migrate app memory before freeze (speeds things up)
    16. CRIU project resources
        - http://criu.org: project news and documentation
        - http://git.criu.org: git repo with tool sources
        - https://github.com/cyrillos/linux-2.6/: kernel with all in-flight patches applied
        - criu@openvz.org mailing list
        - +CRIU page
    17. Pavel Emelyanov, xemul@parallels.com (Parallels)
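
Slide 12 describes peeking at socket queues as "reading without removing data from the queue". The sketch below illustrates that idea with the long-standing `MSG_PEEK` receive flag on a local socket pair. It does not use the newer peek-offset socket option the slide refers to, which additionally lets successive peeks advance through the queue. The function name and payload are made up for the illustration.

```python
import socket


def peek_then_read(payload: bytes, peek_len: int):
    """Send `payload` over a local socket pair, peek at it twice, then consume it."""
    left, right = socket.socketpair()
    try:
        left.sendall(payload)
        # MSG_PEEK returns data but leaves it in the receive queue,
        # so a second peek sees the same bytes again.
        first_peek = right.recv(peek_len, socket.MSG_PEEK)
        second_peek = right.recv(peek_len, socket.MSG_PEEK)
        # A normal recv() finally removes the data from the queue.
        consumed = right.recv(len(payload))
        return first_peek, second_peek, consumed
    finally:
        left.close()
        right.close()


if __name__ == "__main__":
    print(peek_then_read(b"hello world", 5))
```

With plain `MSG_PEEK` every peek starts at the head of the queue, which is why a dump tool that needs to read an entire queue without disturbing it benefits from a configurable peek offset.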

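
Slides 6 and 7 ask "What files are opened?" and name proc as one of the existing kernel interfaces. As a rough illustration of that kind of query (not CRIU's actual code), the sketch below reads the `/proc/<pid>/fd` symlinks that Linux exposes. The function name is hypothetical, and on systems without procfs it simply returns an empty mapping.

```python
import os


def list_open_files(pid="self"):
    """Map each open descriptor of `pid` to the target reported by procfs."""
    fd_dir = f"/proc/{pid}/fd"
    result = {}
    if not os.path.isdir(fd_dir):
        # No procfs here (for example, not Linux): nothing to report.
        return result
    for name in os.listdir(fd_dir):
        try:
            result[int(name)] = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            # The descriptor was closed between listdir() and readlink().
            continue
    return result


if __name__ == "__main__":
    for fd, target in sorted(list_open_files().items()):
        print(fd, "->", target)
```

Reading another task's descriptors this way requires sufficient privileges, and the link target alone does not carry everything a restore would need, such as the file position or open flags.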