18/04/2018 EOSC-hub public day, Malaga, 16-17 April, 2018 1
VM image analysis and optimised fragmentation
Jozsef Kovacs
(Akos Hajnal, Attila Marosi, Peter Kacsuk)
MTA SZTAKI, Budapest, Hungary
VM Image analysis and fragmentation
• Problem statement
o Virtual machine images are stored as a whole in the
proprietary VM image repository of each cloud, even
if they share potentially large common parts (e.g.
OS files)
• Goals
o Store the common parts of different VM images only once
o Re-assemble images on demand by merging unique and common
parts
• Terminology
o Fragmentation (or image decomposition): breaking
monolithic images into parts
o Fragments (or delta packages): the image parts
o Image composition: assembling an image from its parts
Design space
• Which approach to choose?
o top-down: take existing images and break into parts
o bottom-up: build all images from fragments
• How to choose the comparison operator (which identifies common parts)?
o Fixed-size parts (e.g., disk blocks, or 4k): how to choose the proper size?
o Variable-size parts (binary): where to start the comparison? How to re-synch
after a difference?
o File system-level (compare files): by hash, or by size and timestamp?
(The MBR is not covered.)
• How to implement the proper assembly algorithm?
o Before deployment: safe, but can be slow if assembly is done at a
different site and the whole image must be transferred to the VM host
o At deployment time: could it interfere with the boot-up procedure?
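The fixed-size option and its re-synchronisation problem can be illustrated with a minimal Python sketch. This is not part of the ENTICE solution; the function names and the 4096-byte block size are assumptions chosen for illustration only.

```python
import hashlib


def block_hashes(data: bytes, block_size: int = 4096) -> list[str]:
    """Split data into fixed-size blocks and return one SHA-256 digest per block."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def shared_block_ratio(image_a: bytes, image_b: bytes, block_size: int = 4096) -> float:
    """Fraction of image_b's blocks whose content also occurs somewhere in image_a."""
    known = set(block_hashes(image_a, block_size))
    blocks_b = block_hashes(image_b, block_size)
    if not blocks_b:
        return 0.0
    return sum(h in known for h in blocks_b) / len(blocks_b)


# Two toy "images" that share one aligned block.
image_a = b"A" * 4096 + b"B" * 4096
image_b = b"B" * 4096 + b"C" * 4096
aligned = shared_block_ratio(image_a, image_b)

# The same content as image_a, shifted by a single byte: no block lines up any more.
shifted = shared_block_ratio(image_a, b"x" + image_a)
```

In this toy case half of the blocks of `image_b` are found in `image_a`, while the one-byte shift leaves no common block at all, which is the re-synchronisation issue raised above for block-based comparison.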
VM Image decomposition and composition
[Figure: the original virtual appliances (image1, image2, …, imagen) are decomposed (fragmentation) into fragments (delta package1, delta package2, delta package3, delta package4, …, delta packagem); the fragments are composed (fragment merging) into re-built virtual appliances (image1, image2, …, imagen).]
Our applied solution in ENTICE
• The cloud image repository must contain only “base OS images”, which are small and bootable
• Every image built later must be built upon one of the existing base OS images
• When a new image is created (installations, removals, configuration, etc.), its difference from the base
OS image, the incremental fragment, is computed and stored in a post-processing step.
• The new image is called a “virtual image”, as it is not stored as a whole:
virtual_image1 = base_image + fragment1
• When another image is built upon a previous virtual image (further installations), the result is a
new fragment containing the difference from the previous (assembled) virtual image content:
virtual_image2 = (base_image + fragment1) + fragment2
• Fragments may be stored outside the cloud’s proprietary image storage (in a “fragment storage”).
• Fragments are created by file system-level comparison. Path + size + last modification
date are used to determine differences (using the rsync tool, which is faster than hashing). Permissions,
timestamps, symbolic and hard links, and user ids are properly handled.
• Fragments can contain removals and changes (on change, the whole file is stored even if only a single
byte has changed)
• Fragments are compressed as tar.gz (lower transfer time, additional CPU cost for decompression)
• When launching a virtual image, the appropriate (bootable) base image is launched first, and the fragments
are merged (downloaded and extracted) sequentially at boot time (cloud-init runcmd)
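The comparison rule described above (path + size + last modification date) can be sketched in a few lines of Python. The slides state that ENTICE uses rsync for this step; the code below is only a simplified stand-in, with hypothetical function names, that ignores permissions, ownership, directories and special files.

```python
import os


def scan(root: str) -> dict[str, tuple[int, int]]:
    """Map each relative file path under root to (size, mtime in whole seconds)."""
    entries = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            st = os.lstat(full)  # lstat: describe a symlink itself, not its target
            entries[os.path.relpath(full, root)] = (st.st_size, int(st.st_mtime))
    return entries


def compare_trees(parent_root: str, child_root: str) -> tuple[list[str], list[str]]:
    """Return (changed_or_new, removed) paths of the child relative to the parent.

    A file counts as changed when its size or modification time differs;
    file contents are never read, so no hashing is involved.
    """
    parent, child = scan(parent_root), scan(child_root)
    changed = sorted(p for p, meta in child.items() if parent.get(p) != meta)
    removed = sorted(p for p in parent if p not in child)
    return changed, removed
```

Called on the mounted file systems of a parent image and of a new snapshot, `compare_trees` yields the two lists a fragment needs: files to store in full and files to delete.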
VM Image analysis and fragmentation
services - Virtual image tree
Creating a new virtual image
• Users are allowed to start any base or virtual image,
perform any installation, configuration changes
manually, and create a snapshot of the new image.
Then this snapshot is compared to the “parent” virtual or
base image
• Users can select any virtual image having a certain set of
functionalities (called tags) and create a new virtual
image with extended functionality by selecting installers
VM Image analysis and fragmentation
services - Installers
• There are three options for the user to add new software
packages to the selected virtual image, each resulting in a
new virtual image:
o A custom shell script
o An Ansible playbook (YAML)
o One or more “pre-made installers”
Virtual Image Tree – FlexiOps use cases
[Figure: virtual image tree. Nodes and the installer that produced each virtual image:]
• Ubuntu 16.04 (base image)
• Ubuntu 16.04 updated (virtual image), installer: update
• Ubuntu 16.04 updated with MySQL (virtual image), installer: mysql-server
• Tomcat+MySQL (virtual image), installer: tomcat
• LAMP (virtual image), installer: apache
• WordPress (virtual image), installer: wordpress
• Ubuntu 16.04 updated with MySQL and php (virtual image), installer: php
Identifiers shown in the figure (their assignment to the nodes is not recoverable from the extracted text):
03ec74a6-e3a0-4666-8897-5b4e7f65a8da
62fc58a9-8828-4645-bd4e-e2307e07536f
085205dd-93bd-4fd5-bf22-f0a8d75f0693
dcd1b5e1-4fda-4d8f-968c-3538291a8bcd
ae15ce6b-b009-4525-960b-80d0b16d46bf
ea96e38e-2e7b-404a-a51c-a1b7b2ed7826
93b4e771-eecb-45c8-8f6b-ce265ef9c895
Sizes shown in the figure (assignment likewise not recoverable):
29 475 713
100 927 299
121 271 776
50 982 511
25 806 101
58 137 671
130 403 012
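Launching a node of such a tree means finding its base image and the ordered list of fragments on the path from the base. The sketch below shows one way to resolve that chain in Python. The parent links are assumptions read off the node names (the extracted figure does not preserve the edges), and the fragment labels are hypothetical placeholders, not the identifiers listed above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Image:
    name: str
    parent: Optional[str] = None    # None marks a bootable base image
    fragment: Optional[str] = None  # fragment holding the difference from the parent


def fragment_chain(images: dict[str, Image], name: str) -> tuple[str, list[str]]:
    """Return (base image name, fragments to merge in order) for an image."""
    fragments = []
    node = images[name]
    while node.parent is not None:
        fragments.append(node.fragment)
        node = images[node.parent]
    fragments.reverse()  # collected leaf-to-root, but must be applied root-to-leaf
    return node.name, fragments


# Assumed chain; fragment labels are hypothetical.
images = {
    "Ubuntu 16.04": Image("Ubuntu 16.04"),
    "Ubuntu 16.04 updated": Image(
        "Ubuntu 16.04 updated", "Ubuntu 16.04", "fragment-update"),
    "Ubuntu 16.04 updated with MySQL": Image(
        "Ubuntu 16.04 updated with MySQL", "Ubuntu 16.04 updated", "fragment-mysql-server"),
    "Ubuntu 16.04 updated with MySQL and php": Image(
        "Ubuntu 16.04 updated with MySQL and php",
        "Ubuntu 16.04 updated with MySQL", "fragment-php"),
}

base, chain = fragment_chain(images, "Ubuntu 16.04 updated with MySQL and php")
```

Here `base` is the base image to boot and `chain` lists the fragments in the order they would be merged, mirroring virtual_image2 = (base_image + fragment1) + fragment2.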
Advantages and disadvantages
• This method precisely restores the “original” image
content
• The selection of which images to compare is implicit
• All virtual images will re-use base OS images and
fragments (within the same subtree)
• Due to file system comparison, the method is not tied to
a particular file system (ext2, xfs, fat, …)
• Fragmentation is independent of the underlying
virtualization (kvm, xen).
• Cannot detect changes outside the file system (master
boot record, boot loader, grub updates)
Performance goals and results
• Original goals were
o 25% faster VMI Delivery (Objective 1.1)
o 60% smaller VMIs (Objective 1.2)
o 80% less storage space (Objective 2.1)
• Based on measurements on the images provided by the
industrial partners of ENTICE we reached
o up to 82.82% (avg 60.55%) faster VMI delivery
o up to 87.83% (avg 52.58%) smaller VMIs
o up to 86.00% (avg 77.56%) less storage space
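As a rough illustration of how a storage saving of this kind can be computed, the sketch below compares storing every image as a whole with storing one base image plus fragments. The formula is an assumption about the metric, not the project's published measurement method, and all sizes are made-up illustrative values.

```python
def storage_saving(full_image_sizes, base_size, fragment_sizes):
    """Relative storage saved by keeping one base image plus fragments
    instead of every image as a whole (0.0 = no saving, 1.0 = everything saved)."""
    monolithic = sum(full_image_sizes)
    fragmented = base_size + sum(fragment_sizes)
    return 1.0 - fragmented / monolithic


# Hypothetical example: three images (in MB) built on a shared 900 MB base.
saving = storage_saving(
    full_image_sizes=[1000, 1100, 1200],
    base_size=900,
    fragment_sizes=[100, 200, 300],
)
```

The saving grows with the number of images sharing a base, because the base is counted once while each additional image contributes only its fragment.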
Screenshots of the ENTICE demo site
Thank you for your attention!
• For further project details, please visit
http://www.entice-project.eu/
• For further technical details, please read
Hajnal, A., Kecskemeti, G., Marosi, A. C., et al.: "ENTICE VM
Image Analysis and Optimised Fragmentation", Journal of Grid
Computing, 2018, 1-17. http://rdcu.be/HxkQ
• For further discussion, please contact
the authors of the paper above
[Figure: component diagram.
Components: ENTICE Virtual Image Management GUI; Virtual Image Management Knowledge Base Backend; Virtual Image Manager; Virtual Image Decomposer; Virtual image builder; Image comparator; Installer; Virtual Image Composer; Virtual Image Launcher; Fragment Storage; Installer Storage; Base Image Storage; Cloud Img. Repo; VM; Fragment Merger.
Labelled interactions: register base image, create virtual image, list all virtual images; compute fragment; get installer; store fragment; get base image; get fragment; distribute; get/add installers; launch, contextualize (EC2) base image; apply fragment; upload base images; get status; launch; get fragment merger; get fragment ids.]
Implementation details
• We use qemu-nbd to mount file systems of
images
• We use chroot to perform installation
• We use rsync to compute differences in images
(their file systems)
• Fragments are compressed (tar.gz) contents of
file system differences
• We use cloud-init (shell script) to assemble
disks of VMs launched from virtual images (tar)
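The packing and merging steps can be sketched with Python's standard `tarfile` module. According to the slides, ENTICE performs the merge with a cloud-init shell script and tar; the code below is a substitute illustration, and the removal-list file name `.fragment-removals` and both function names are hypothetical.

```python
import io
import os
import tarfile

REMOVALS = ".fragment-removals"  # hypothetical name for the list of deleted paths


def build_fragment(child_root, changed, removed, out_path):
    """Pack changed files (stored whole) plus a removal list into a tar.gz fragment."""
    with tarfile.open(out_path, "w:gz") as tar:
        for rel in changed:
            tar.add(os.path.join(child_root, rel), arcname=rel)
        payload = "\n".join(removed).encode()
        info = tarfile.TarInfo(REMOVALS)
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))


def apply_fragment(fragment_path, target_root):
    """Merge a fragment into a file system tree: extract files, then delete removals.

    Only meant for fragments from a trusted source, as with any archive extraction.
    """
    with tarfile.open(fragment_path, "r:gz") as tar:
        removals = tar.extractfile(REMOVALS).read().decode().splitlines()
        members = [m for m in tar.getmembers() if m.name != REMOVALS]
        tar.extractall(target_root, members=members)
    for rel in removals:
        path = os.path.join(target_root, rel)
        if os.path.lexists(path):
            os.remove(path)
```

Applying the fragments of a virtual image one after another with `apply_fragment`, starting from the base image's file system, corresponds to the sequential merge performed at boot time.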

VM: Image analysis and fragmentation

  • 1. 18/04/2018 EOSC-hub public day, Malaga, 16-17 April, 2018 1 VM image analysis and optimised fragmentation Jozsef Kovacs (Akos Hajnal, Attila Marosi, Peter Kacsuk) MTA SZTAKI, Budapest, Hungary
  • 2. VM Image analysis and fragmentation • Problem statement o Virtual machine images are stored as a whole in the proprietary VM image repository of the different clouds, even if they share common parts of potentially large-size (e.g. OS files) • Goals o Store common parts of the different VM images once o Re-assemble on-demand by merging unique and common parts o The process of breaking monolithic images into parts is called fragmentation or image decomposition, image parts are called fragments or delta packages, image assembly is called image composition EOSC-hub public day, Malaga, 16-17 April, 2018
  • 3. Design space • Which approach to choose? o top-down: take existing images and break into parts o bottom-up: build all images from fragments • How to choose comparison operator? (identifies common parts) o Fixed-size (e.g., disk blocks, or 4k) How to choose the proper size? o Variable-size parts (binary) Where to start comparison? How to re-synch after a difference? o File system-level (compare files) By hash or by size and timestamp? (MBR is out) • How to implement the proper assembly algorithm? o Before deployment: Safe but can be slow if assembly is done at a different site and the whole image must be transferred to the VM host o At deployment-time: Could it interfere with the boot up procedure? 18/04/2018 3EOSC-hub public day, Malaga, 16-17 April, 2018
  • 4. VM Image decomposition and composition
      [Figure: the original virtual appliances (image 1, image 2, …, image n) are decomposed (fragmentation) into fragments (delta package 1, delta package 2, …, delta package m); the fragments are composed (fragment merging) into the re-built virtual appliances (image 1, image 2, …, image n).]
  • 5. Our applied solution in ENTICE
      • The cloud image repository must contain only “base OS images”, which are small and bootable.
      • Every image built later must be built upon one of the existing base OS images.
      • When a new image is created (installations, removals, configuration, etc.), its difference from the base OS image, the incremental fragment, is computed and stored in a post-processing step.
      • The new image is called a “virtual image” because it is not stored as a whole: virtual_image1 = base_image + fragment1.
      • When another image is built upon a previous virtual image (further installations), the result is a new fragment containing the difference from the assembled content of the previous virtual image: virtual_image2 = (base_image + fragment1) + fragment2.
      • Fragments may be stored outside the cloud’s proprietary image storage (“fragment storage”).
      • Fragments are created by file system-level comparison. Path + size + last modification date are used to determine differences (using the rsync tool, which is faster than hashing). Permissions, timestamps, symbolic and hard links, and user ids are properly handled.
      • Fragments can contain removals and changes (on a change, the whole file is stored even if only a single byte has changed).
      • Fragments are compressed as tar.gz (lower transfer time, additional CPU for decompression).
      • When launching a virtual image, the appropriate (bootable) base image is launched first, and the fragments are merged (download + extract) sequentially at boot time (cloud-init runcmd).
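The relation virtual_image2 = (base_image + fragment1) + fragment2 on slide 5 can be sketched on in-memory file listings. This is a minimal illustration, not the ENTICE implementation (which uses rsync on mounted file systems). The types `FileMeta`, `Manifest` and `Fragment`, the function names, and the example paths and values are all hypothetical. Files are compared by path, size and modification time, as the slide describes.

```python
from typing import NamedTuple


class FileMeta(NamedTuple):
    size: int
    mtime: int  # last modification time, seconds since the epoch


Manifest = dict[str, FileMeta]  # path -> metadata of one image's file system


class Fragment(NamedTuple):
    changed: Manifest   # new or modified files (stored whole in a real fragment)
    removed: set[str]   # paths deleted relative to the parent image


def compute_fragment(parent: Manifest, child: Manifest) -> Fragment:
    """Difference of `child` from `parent`, compared by path + size + mtime."""
    changed = {p: m for p, m in child.items() if parent.get(p) != m}
    removed = set(parent) - set(child)
    return Fragment(changed, removed)


def apply_fragment(image: Manifest, fragment: Fragment) -> Manifest:
    """Re-assemble a child image by applying removals, then changes."""
    result = {p: m for p, m in image.items() if p not in fragment.removed}
    result.update(fragment.changed)
    return result


# Hypothetical listings: a base image and two images built on top of it.
base_image = {
    "/etc/hostname": FileMeta(7, 1000),
    "/var/cache/apt.bin": FileMeta(900, 1000),
}
image1 = {
    "/etc/hostname": FileMeta(7, 1000),
    "/usr/bin/mysqld": FileMeta(5000, 2000),   # installed
}                                              # apt cache removed
image2 = {
    "/etc/hostname": FileMeta(9, 3000),        # modified
    "/usr/bin/mysqld": FileMeta(5000, 2000),
    "/usr/bin/php": FileMeta(3000, 3000),      # installed
}

fragment1 = compute_fragment(base_image, image1)
fragment2 = compute_fragment(image1, image2)

virtual_image1 = apply_fragment(base_image, fragment1)
virtual_image2 = apply_fragment(virtual_image1, fragment2)
```

Because fragment2 is computed against the assembled content of the first virtual image, it carries only the two entries that changed after MySQL was installed, and applying both fragments in order reproduces the final listing.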
  • 6. VM Image analysis and fragmentation services - Virtual image tree
  • 7. Creating a new virtual image
      • Users may start any base or virtual image, perform any installation or configuration changes manually, and create a snapshot of the new image. This snapshot is then compared to the “parent” virtual or base image.
      • Users can select any virtual image that has a certain set of functionalities (called tags) and create a new virtual image with extended functionality by selecting installers.
  • 8. VM Image analysis and fragmentation services - Installers
      • There are three options for the user to add new software packages to the selected virtual image, each resulting in a new virtual image:
          o A custom shell script
          o An Ansible playbook (YAML)
          o One or more “pre-made installers”
  • 9. Virtual Image Tree – FlexiOps use cases
      [Figure: virtual image tree. Nodes and the installers that produced them:]
          o Ubuntu 16.04 (base image)
          o Ubuntu 16.04 updated (virtual image), installer: update
          o Ubuntu 16.04 updated with MySQL (virtual image), installer: mysql-server
          o Tomcat+MySQL (virtual image), installer: tomcat
          o LAMP (virtual image), installer: apache
          o WordPress (virtual image), installer: wordpress
          o Ubuntu 16.04 updated with MySQL and php (virtual image), installer: php
      [Identifiers shown in the figure: 03ec74a6-e3a0-4666-8897-5b4e7f65a8da, 62fc58a9-8828-4645-bd4e-e2307e07536f, 085205dd-93bd-4fd5-bf22-f0a8d75f0693, dcd1b5e1-4fda-4d8f-968c-3538291a8bcd, ae15ce6b-b009-4525-960b-80d0b16d46bf, ea96e38e-2e7b-404a-a51c-a1b7b2ed7826, 93b4e771-eecb-45c8-8f6b-ce265ef9c895]
      [Numeric values shown in the figure: 29 475 713, 100 927 299, 121 271 776, 50 982 511, 25 806 101, 58 137 671, 130 403 012]
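How a launcher could derive the list of fragments to merge from such a tree can be sketched as follows. The node names come from slide 9, but the parent links in `PARENT` are an assumption: the transcript lists the nodes without the edges, so the tree shape below is a plausible guess, and `merge_plan` is a hypothetical helper rather than an ENTICE API.

```python
from typing import Optional

# child -> parent; None marks the bootable base image.
# ASSUMPTION: these edges are guessed from the node names on slide 9.
PARENT: dict[str, Optional[str]] = {
    "Ubuntu 16.04": None,
    "Ubuntu 16.04 updated": "Ubuntu 16.04",
    "Ubuntu 16.04 updated with MySQL": "Ubuntu 16.04 updated",
    "Tomcat+MySQL": "Ubuntu 16.04 updated with MySQL",
    "Ubuntu 16.04 updated with MySQL and php": "Ubuntu 16.04 updated with MySQL",
    "LAMP": "Ubuntu 16.04 updated with MySQL and php",
    "WordPress": "LAMP",
}


def merge_plan(image: str, parent: dict[str, Optional[str]] = PARENT) -> tuple[str, list[str]]:
    """Return (base image, virtual images whose fragments are merged, in order)."""
    chain = []
    node = image
    while parent[node] is not None:   # walk up until the base image is reached
        chain.append(node)
        node = parent[node]
    return node, list(reversed(chain))


base, wordpress_chain = merge_plan("WordPress")
_, tomcat_chain = merge_plan("Tomcat+MySQL")

# Fragments that both images re-use (stored once, merged by both).
shared = [name for name in wordpress_chain if name in tomcat_chain]
```

Under the assumed edges, WordPress and Tomcat+MySQL share the "updated" and "with MySQL" fragments, which matches the slide 10 remark that images within the same subtree re-use fragments.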
  • 10. Advantages and disadvantages
      • The method precisely restores the “original” image content.
      • The selection of which images to compare is implicit.
      • All virtual images re-use base OS images and fragments (within the same subtree).
      • Because the comparison is done at file system level, the method is not tied to a particular file system (ext2, xfs, fat, …).
      • Fragmentation is independent of the underlying virtualization (kvm, xen).
      • The method cannot detect changes outside the file system (master boot record, boot loader, grub updates).
  • 11. Performance goals and results
      • The original goals were
          o 25% faster VMI delivery (Objective 1.1)
          o 60% smaller VMIs (Objective 1.2)
          o 80% less storage space (Objective 2.1)
      • Based on measurements on the images provided by the industrial partners of ENTICE, we reached
          o up to 82.82% (avg 60.55%) faster VMI delivery
          o up to 87.83% (avg 52.58%) smaller VMIs
          o up to 86.00% (avg 77.56%) less storage space
  • 12. Screenshots of the ENTICE demo site
  • 13. Thank you for your attention!
      • For further project details, please visit http://www.entice-project.eu/
      • For further technical details, please read Hajnal, A., Kecskemeti, G., Marosi, A. C., et al.: "ENTICE VM Image Analysis and Optimised Fragmentation", Journal of Grid Computing, 2018, 1-17. http://rdcu.be/HxkQ
      • For further discussion, please contact the authors of the paper above.
  • 14. ENTICE Virtual Image Management
      [Figure: architecture diagram. Component labels: GUI, Virtual Image Management, Knowledge Base, Backend, Virtual Image Manager, Virtual Image Decomposer, Virtual Image Launcher, Virtual Image Composer, Virtual image builder, Image comparator, Installer, Fragment Storage, Installer Storage, Base Image Storage, Cloud Img. Repo, VM, Fragment Merger.]
      [Interaction labels: register base image, create virtual image, list all virtual images; compute fragment; get installer; store fragment; get base image; get fragment; distribute; get/add installers; launch, contextualize (EC2); base image; apply fragment; upload base images; get status; launch; get fragment merger; get fragment ids.]
  • 15. Implementation details
      • We use qemu-nbd to mount the file systems of images
      • We use chroot to perform installations
      • We use rsync to compute the differences between images (their file systems)
      • Fragments are the compressed (tar.gz) contents of file system differences
      • We use cloud-init (shell script) to assemble the disks of VMs launched from virtual images (tar)
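The tar.gz packaging and boot-time merge mentioned on slide 15 can be approximated with Python's standard `tarfile` module. This is only a sketch under stated assumptions: ENTICE itself uses rsync, tar and a cloud-init shell script, and the helper names `pack_fragment`, `merge_fragment` and `write`, as well as the example files, are hypothetical.

```python
import os
import tarfile
import tempfile


def pack_fragment(root: str, changed_paths: list[str], archive: str) -> None:
    """Write the listed files (relative to `root`) into a gzip-compressed tar."""
    with tarfile.open(archive, "w:gz") as tar:
        for rel in changed_paths:
            tar.add(os.path.join(root, rel), arcname=rel)


def merge_fragment(archive: str, removed_paths: list[str], target: str) -> None:
    """Apply the removals, then extract the archive over the `target` tree."""
    for rel in removed_paths:
        path = os.path.join(target, rel)
        if os.path.exists(path):
            os.remove(path)
    with tarfile.open(archive, "r:gz") as tar:
        tar.extractall(target)


def write(path: str, text: str) -> None:
    """Create a small text file, including its parent directories."""
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "w") as f:
        f.write(text)


with tempfile.TemporaryDirectory() as work:
    child = os.path.join(work, "child")      # file system of the new image
    target = os.path.join(work, "target")    # file system of the running base VM

    write(os.path.join(child, "etc/app.conf"), "port=8080\n")
    write(os.path.join(target, "etc/app.conf"), "port=80\n")
    write(os.path.join(target, "tmp/old.log"), "stale\n")

    archive = os.path.join(work, "fragment1.tar.gz")
    pack_fragment(child, ["etc/app.conf"], archive)    # changed file, stored whole
    merge_fragment(archive, ["tmp/old.log"], target)   # one removal + extraction

    with open(os.path.join(target, "etc/app.conf")) as f:
        merged_conf = f.read()
    old_log_exists = os.path.exists(os.path.join(target, "tmp/old.log"))
```

Removals are applied before extraction so that a path which is deleted in one fragment and re-created by the same archive still ends up present. Whether ENTICE orders the two steps this way is not stated in the slides.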

Editor's Notes

  1. Virtualization is a key technology for cloud computing that allows users to run multiple virtual machines, each with its own application environment, on top of the physical hardware layer.
  2. TODO
  3. The nodes are the images; the edges are the lists of installed software. The system stores the installers of frequently used software. The nod…
  4. They wanted to store 3 images in fragmented form. MySQL is used frequently, so it should be factored out into a common fragment.