1. 18/04/2018 EOSC-hub public day, Malaga, 16-17 April, 2018
VM image analysis and optimised fragmentation
Jozsef Kovacs
(Akos Hajnal, Attila Marosi, Peter Kacsuk)
MTA SZTAKI, Budapest, Hungary
2. VM Image analysis and fragmentation
• Problem statement
o Virtual machine images are stored as a whole in the proprietary VM image repository of each cloud, even if they share common parts that are potentially large (e.g. OS files)
• Goals
o Store common parts of the different VM images once
o Re-assemble on-demand by merging unique and common
parts
o Terminology: breaking monolithic images into parts is called fragmentation or image decomposition; the image parts are called fragments or delta packages; assembling an image from its parts is called image composition
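The two goals above can be illustrated with a minimal sketch. It assumes each image is modelled as a mapping from file path to a content identifier; the paths, identifiers and the function names `decompose` and `compose` are hypothetical and are not taken from ENTICE.

```python
# Each toy "image" maps a file path to a content identifier (hypothetical values).
image1 = {"/bin/sh": "a1", "/etc/os-release": "b2", "/opt/app1": "c3"}
image2 = {"/bin/sh": "a1", "/etc/os-release": "b2", "/opt/app2": "d4"}


def decompose(images):
    """Split images into one common part and one unique part per image."""
    common = set.intersection(*(set(image.items()) for image in images))
    unique = [set(image.items()) - common for image in images]
    return common, unique


def compose(common, unique_part):
    """Re-assemble an image by merging the common part with its unique part."""
    return dict(common | unique_part)


common, unique = decompose([image1, image2])
print(sorted(common))                        # the shared OS files, stored once
print(compose(common, unique[0]) == image1)  # True
```

In this toy model the common part is stored once, and each image is restored exactly from the common part plus its own unique part.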
3. Design space
• Which approach to choose?
o top-down: take existing images and break into parts
o bottom-up: build all images from fragments
• How to choose the comparison operator (which identifies common parts)?
o Fixed-size parts (e.g., disk blocks, or 4k): How to choose the proper size?
o Variable-size parts (binary): Where to start the comparison? How to re-synchronise after a difference?
o File system-level (compare files): By hash, or by size and timestamp? (The MBR is left out.)
• How to implement the proper assembly algorithm?
o Before deployment: Safe, but can be slow if assembly is done at a different site and the whole image must be transferred to the VM host
o At deployment time: Could it interfere with the boot-up procedure?
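The re-synchronisation question for fixed-size comparison can be made concrete with a small sketch. It assumes 4 KiB blocks compared by SHA-256 digest; the function names and the toy data are hypothetical and only illustrate the design option, not the approach finally chosen.

```python
import hashlib

BLOCK_SIZE = 4096  # the "4k" option from the slide; other sizes are possible


def block_hashes(data: bytes, block_size: int = BLOCK_SIZE) -> list[str]:
    """Split raw image bytes into fixed-size blocks and hash each block."""
    return [
        hashlib.sha256(data[offset:offset + block_size]).hexdigest()
        for offset in range(0, len(data), block_size)
    ]


def shared_block_count(image_a: bytes, image_b: bytes) -> int:
    """Count blocks of image_b whose content also occurs somewhere in image_a."""
    known = set(block_hashes(image_a))
    return sum(1 for digest in block_hashes(image_b) if digest in known)


# Two toy "images": four distinct 4 KiB blocks, and a copy with one byte inserted in front.
original = b"".join(bytes([i]) * BLOCK_SIZE for i in range(4))
shifted = b"\xff" + original

print(shared_block_count(original, original))  # 4: every block matches
print(shared_block_count(original, shifted))   # 0: the one-byte shift misaligns all blocks
```

A single inserted byte is enough to hide all common content from a fixed-size comparison, which is why variable-size schemes need a way to re-synchronise and why a file-level comparison avoids the problem altogether.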
4. VM Image decomposition and composition
[Figure: Original virtual appliances (image1, image2, …, imagen) are decomposed (fragmentation) into fragments (delta package1, delta package2, delta package3, delta package4, …, delta packagem); the fragments are composed (fragment merging) into re-built virtual appliances (image1, image2, …, imagen).]
5. Our applied solution in ENTICE
• Cloud image repository must contain only “base OS images”, which are small and bootable
• Every image built later must be built upon one of the existing base OSs
• When a new image is created (installations, removals, configuration, etc.), its difference from the base OS image, the increment fragment, is computed and stored in a post-processing step.
• The new image is called a “virtual image” because it is not stored as a whole: virtual_image1 = base_image + fragment1.
• When another image is built upon a previous virtual image (further installations), the result is a new fragment containing the difference from the assembled content of the previous virtual image: virtual_image2 = (base_image + fragment1) + fragment2
• Fragments may be stored outside the cloud’s proprietary image storage (in a “fragment storage”).
• Fragments are created by file system-level comparison. Path, size and last modification date are used to determine differences (using the rsync tool, which is faster than hashing). Permissions, timestamps, symbolic and hard links, and user IDs are properly handled.
• Fragments can contain removals and changes (on change, the whole file is stored even if only a single byte has changed)
• Fragments are compressed as tar.gz (lower transfer time, additional CPU for decompression)
• When launching a virtual image, we launch the appropriate (bootable) base image first and merge (download + extract) the fragments sequentially at boot time (cloud-init runcmd)
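The path + size + last-modification comparison described above can be sketched as follows. This is a simplified model, not the rsync invocation used in ENTICE: it assumes each image is represented by a listing of path to (size, mtime), and it ignores permissions, links and user IDs. `FileMeta`, `compute_fragment` and all paths and values are hypothetical.

```python
from typing import NamedTuple


class FileMeta(NamedTuple):
    size: int   # file size in bytes
    mtime: int  # last modification time (seconds since the epoch)


Listing = dict[str, FileMeta]  # path -> metadata


def compute_fragment(parent: Listing, child: Listing) -> tuple[list[str], list[str]]:
    """Return (changed_or_added_paths, removed_paths) of child relative to parent."""
    changed = sorted(
        path for path, meta in child.items()
        if parent.get(path) != meta  # new path, or size/mtime differs
    )
    removed = sorted(path for path in parent if path not in child)
    return changed, removed


base = {
    "/etc/hostname": FileMeta(7, 1_500_000_000),
    "/etc/motd": FileMeta(120, 1_500_000_000),
    "/usr/bin/vi": FileMeta(900_000, 1_500_000_000),
}
snapshot = {
    "/etc/hostname": FileMeta(7, 1_500_000_000),              # unchanged
    "/etc/motd": FileMeta(120, 1_520_000_000),                # same size, newer mtime
    "/usr/sbin/mysqld": FileMeta(24_000_000, 1_520_000_000),  # newly installed
}

changed, removed = compute_fragment(base, snapshot)
print(changed)  # ['/etc/motd', '/usr/sbin/mysqld']
print(removed)  # ['/usr/bin/vi']
```

Changed files would be stored whole in the fragment, and removed paths recorded as removals. Because no file content is read, the comparison is cheap, at the cost of missing a change that preserves both size and timestamp.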
6. VM Image analysis and fragmentation services - Virtual image tree
7. Creating a new virtual image
• Users can start any base or virtual image, perform installations and configuration changes manually, and create a snapshot of the new image. This snapshot is then compared to its “parent” virtual or base image
• Users can select any virtual image that has a certain set of functionalities (called tags) and create a new virtual image with extended functionality by selecting installers
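Because every new virtual image is compared to its parent, each image resolves to one base image plus an ordered list of fragments. A minimal sketch of that resolution follows; it assumes the tree is kept as a mapping from image name to (parent, fragment). The `ImageTree` type and the `resolve` function are hypothetical, while the image and fragment names reuse the notation from slide 5.

```python
from typing import Optional

# image name -> (parent image name, or None for a base image; fragment id, or None)
ImageTree = dict[str, tuple[Optional[str], Optional[str]]]


def resolve(tree: ImageTree, image: str) -> tuple[str, list[str]]:
    """Return the base image and the fragments to merge, oldest first."""
    fragments: list[str] = []
    parent, fragment = tree[image]
    while parent is not None:
        fragments.append(fragment)
        image = parent
        parent, fragment = tree[image]
    fragments.reverse()  # merge order: closest to the base image first
    return image, fragments


tree: ImageTree = {
    "base_image": (None, None),
    "virtual_image1": ("base_image", "fragment1"),
    "virtual_image2": ("virtual_image1", "fragment2"),
}

print(resolve(tree, "virtual_image2"))  # ('base_image', ['fragment1', 'fragment2'])
```

The order matters: a later fragment may change or remove files that an earlier one added, so fragments are applied from the base image outwards.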
8. VM Image analysis and fragmentation services - Installers
• The user has three options for adding new software packages to the selected virtual image:
o A custom shell script, resulting in a new virtual image
o An Ansible playbook (YAML)
o One or more “pre-made installers”
9. Virtual Image Tree – FlexiOps use cases
[Figure: virtual image tree. Nodes, each with the installer that produced it:]
• Ubuntu 16.04 (base image)
• Ubuntu 16.04 updated (virtual image), installer: update
• Ubuntu 16.04 updated with MySQL (virtual image), installer: mysql-server
• Tomcat+MySQL (virtual image), installer: tomcat
• LAMP (virtual image), installer: apache
• WordPress (virtual image), installer: wordpress
• Ubuntu 16.04 updated with MySQL and php (virtual image), installer: php
Identifiers shown in the figure: 03ec74a6-e3a0-4666-8897-5b4e7f65a8da, 62fc58a9-8828-4645-bd4e-e2307e07536f, 085205dd-93bd-4fd5-bf22-f0a8d75f0693, dcd1b5e1-4fda-4d8f-968c-3538291a8bcd, ae15ce6b-b009-4525-960b-80d0b16d46bf, ea96e38e-2e7b-404a-a51c-a1b7b2ed7826, 93b4e771-eecb-45c8-8f6b-ce265ef9c895
Numbers shown in the figure: 29 475 713; 100 927 299; 121 271 776; 50 982 511; 25 806 101; 58 137 671; 130 403 012
10. Advantages and disadvantages
• This method precisely restores the “original” image
content
• The selection of which images to compare is implicit
• All virtual images will re-use base OS images and
fragments (within the same subtree)
• Due to file system comparison, the method is not tied to
a particular file system (ext2, xfs, fat, …)
• Fragmentation is independent of the underlying
virtualization (kvm, xen).
• Cannot detect changes outside the file system (master boot record, boot loader, grub updates)
11. Performance goals and results
• Original goals were
o 25% faster VMI Delivery (Objective 1.1)
o 60% smaller VMIs (Objective 1.2)
o 80% less storage space (Objective 2.1)
• Based on measurements on the images provided by the
industrial partners of ENTICE we reached
o up to 82.82% (avg 60.55%) faster VMI delivery
o up to 87.83% (avg 52.58%) smaller VMIs
o up to 86.00% (avg 77.56%) less storage space
12. Screenshots of the ENTICE demo site
13. Thank you for your attention!
• For further project details, please visit
http://www.entice-project.eu/
• For further technical details, please read
Hajnal, A., Kecskemeti, G., Marosi, A. C., et al.: "ENTICE VM Image Analysis and Optimised Fragmentation", Journal of Grid Computing, 2018, 1-17. http://rdcu.be/HxkQ
• For further discussion, please contact the authors of the paper above
14. ENTICE Virtual Image Management GUI
[Figure: architecture diagram.]
Components: Virtual Image Management Knowledge Base Backend; Virtual Image Manager; Virtual Image Decomposer (Virtual image builder, Image comparator, Installer); Virtual Image Composer; Virtual Image Launcher; Fragment Storage; Installer Storage; Base Image Storage; Cloud Img. Repo; VM with Fragment Merger.
Labelled operations: register base image, create virtual image, list all virtual images; upload base images; get status; get/add installers; get installer; get base image; compute fragment; store fragment; get fragment; get fragment ids; get fragment merger; distribute; launch; launch, contextualize (EC2) base image; apply fragment.
15. Implementation details
• We use qemu-nbd to mount file systems of
images
• We use chroot to perform installation
• We use rsync to compute differences in images
(their file systems)
• Fragments are compressed (tar.gz) contents of
file system differences
• We use cloud-init (shell script) to assemble
disks of VMs launched from virtual images (tar)
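How a tar.gz fragment could carry both changed files and removals, and how it could be merged onto a running base image, is sketched below with Python's standard `tarfile` module. This is not the ENTICE implementation (which uses rsync and a cloud-init shell script). The removal-list file name `.removed-paths` and the function names are assumptions made for illustration, and only regular files are handled.

```python
import os
import tarfile
import tempfile

REMOVALS_FILE = ".removed-paths"  # hypothetical name for the list of deletions


def pack_fragment(root, changed, removed, out_path):
    """Write changed files (paths relative to root) and a removal list to a tar.gz."""
    with tempfile.TemporaryDirectory() as tmp:
        listing = os.path.join(tmp, REMOVALS_FILE)
        with open(listing, "w") as handle:
            handle.write("\n".join(removed))
        with tarfile.open(out_path, "w:gz") as archive:
            for rel in changed:
                archive.add(os.path.join(root, rel), arcname=rel)
            archive.add(listing, arcname=REMOVALS_FILE)


def merge_fragment(fragment_path, target_root):
    """Extract a fragment over target_root, then delete the paths it marks as removed."""
    with tarfile.open(fragment_path, "r:gz") as archive:
        archive.extractall(target_root)  # only safe for trusted fragments
    listing = os.path.join(target_root, REMOVALS_FILE)
    with open(listing) as handle:
        for rel in handle.read().splitlines():
            path = os.path.join(target_root, rel)
            if rel and os.path.lexists(path):
                os.remove(path)
    os.remove(listing)
```

Merging the fragments of a virtual image then amounts to calling `merge_fragment` once per fragment, in order from the base image outwards, with `target_root` set to the root of the launched base image.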
Editor's Notes
Virtualization is a key technology for cloud computing that allows users to run multiple virtual machines, each with its own application environment, on top of the physical hardware layer.
TODO
The nodes are the images; the edges are the lists of installed software.
The system stores the installers of frequently used software.
A csom
- They wanted to store 3 images in fragmented form.
MySQL is used frequently, so it should be factored out into a common fragment.