SlideShare a Scribd company logo
1 of 19
Download to read offline
1
Maxim Patlasov
<mpatlasov@parallels.com>
Containers in a File
2
Agenda
•Containers overview
•Current way of keeping container files and its problems
•Basics of container-in-a-file approach
•Limitations of existing facilities
•Solution
•Benefits
•Plans for future
3
Virtualization: VM-style vs. containers
CT 1
Hardware Hardware
Host OSHost OS / Hypervisor
OS Virtualization LayerVirtual Machine Monitor
Virtual Hardware Virtual Hardware
Guest OS Guest OS
VM 1 VM 2
CT 2
Containers:
•OpenVZ / Parallels Containers
•Solaris containers/Zones
•FreeBSD jails
VMMs:
•KVM / Qemu
•VMware
•Parallels Hypervisor
4
Containers Virtualization
Each container has its own:
•Files
•Process tree
•Network
•Devices
•IPC objects
5
Containers: file system view
Natural approach: each container chroot-s to a per-container sub-tree of
some system-wide host file system.
CT 1 CT 2
Block device
Host file system
/
1 2
CTs
chroot chroot
6
Problems
•File system journal is a bottleneck:
1. CT1 performs a lot of operations leading to metadata updates (e.g.
truncates). Consequently, the journal is getting full.
2. CT2 issuing a single I/O operation blocks until journal checkpoint is
completed. Up to 15 seconds in some tests!
Host file system100% full
Journal
Lots of random writes
7
Problems (cont.)
•Lots of small-size files I/O on any CT operation (backup, clone)
•Sub-tree disk quota support is absent in mainstream kernel
•Hard to manage:
•No per-container snapshots, live backup is problematic
•Live migration – rsync unreliable, changed inode numbers
•File system type and its properties are fixed for all containers
•Need to limit number of inodes per container
8
Container in a file
Basic idea: assign virtual block device to container, keep container’ file-
system on top of virtual block device.
CT 1
Per-container
file systems
Host file system
Per-container
virtual block devices
virtual block device
image-file image-file
virtual block device
/ /
CT 2
9
LVM limitations
•Not flexible enough – works only on top of block device
•Hard to manage (e.g. how to migrate huge volume?)
•No dynamic allocation (--virtualsize 16T --size 16M)
•Management is not as convenient as for files
Block device
Per-container
file systems
Per-container
logical volumes
CT 1
/ /
CT 2
Logical volume Logical volume
10
Ordinary loop device limitations
•VFS operations leads to double page-caching
•No dynamic allocation (only “plain” format is supported)
•No helps to backup/migrate containers
•No snapshot functionality
Per-container
file systems
Per-container
loop devices
CT 1
/ /
CT 2
/dev/loop1 /dev/loop2
image-file1 image-file2 Host file system
11
Design of new loop device (ploop)
main ploop module
Container file system
kernel block layer
mount
physical medium
image file
Maps virtual
block number
to image
block number
Maps file offset
to physical sector
I/O module
DIO
VFS
format module
plain
QCOW
ploop block device
v-blocks
i-blocks
expandable
NFS
12
Stacked image configuration
Ploop snapshots are represented by image files. They are stuffed in ploop
device in stacked manner and obey the following rules:
•Only top image is opened in RW mode (others – RO)
•Every mapping bears info about ‘level’
•READ leads to access to an image of proper ‘level’
•WRITE may lead to transferring proper block to top image
Snapshot
image1
(RW)
image
(RO)
level 1
level 0
top
WRITE
Copy (if any)
complete
READ
complete
READ
complete
13
Backup via snapshot
Operations involved:
•Create snapshot
•Backup base image
•Merge
CT
image
ploop
bdev
snapshot CT
ploop
bdev
image
(RW)
(RO)
copy
image.tmp
(RW)
merge CT
ploop
bdev
image
(RW)
image.tmp
merge
Key features:
•On-line
•consistent
14
Migrate via write-tracker
Operations involved:
•Turn write-tracker on
•Iterative copy
•Freeze container
•Copy the rest
time
track on, copy all
check tracker, copy modified
check tracker, copy the rest
image a copy of image
< freeze container>
Key features:
•On-line
•I/O efficient
15
Problems solved
•File system journal is not bottleneck anymore
1. CT1 performs a lot of operations leading to metadata updates (e.g.
truncates). Consequently, the journal is getting full.
2. CT2 has its own file system. So, it’s not directly affected by CT1.
CT1 file system100% full
Journal
Lots of random writes
CT2 file system
16
Problems solved (cont.)
•Large-size image files I/O instead of lots of small-size files I/O on
management operations
•Disk quota can be implemented based on virtual device sizes. No need
for sub-tree quotas
•Live backup is easy and consistent
•Live migration is reliable and efficient
•Different containers may use file systems of different types and properties
•No need to limit “number-of-inodes-per-container”
17
Additional benefits
•Efficient container creation
•Snapshot/merge feature
•Can support QCOW2 and other image formats
•Can support arbitrary storage types
•Can help KVM to keep all disk I/O in-kernel
18
Plans
•Implement I/O module on top of vfs ops
•Add support of QCOW2 image format
•Optimizations:
•discard/TRIM events support
•online images defragmentation
•support sparse files
•Integration into mainline kernel
19
Questions

More Related Content

What's hot

Process, Threads, Symmetric Multiprocessing and Microkernels in Operating System
Process, Threads, Symmetric Multiprocessing and Microkernels in Operating SystemProcess, Threads, Symmetric Multiprocessing and Microkernels in Operating System
Process, Threads, Symmetric Multiprocessing and Microkernels in Operating System
LieYah Daliah
 
Processes and Threads in Windows Vista
Processes and Threads in Windows VistaProcesses and Threads in Windows Vista
Processes and Threads in Windows Vista
Trinh Phuc Tho
 
Backup, Restore, and Disaster Recovery
Backup, Restore, and Disaster RecoveryBackup, Restore, and Disaster Recovery
Backup, Restore, and Disaster Recovery
MongoDB
 
Lecture 6.1
Lecture  6.1Lecture  6.1
Lecture 6.1
Mr SMAK
 
Multithreading computer architecture
 Multithreading computer architecture  Multithreading computer architecture
Multithreading computer architecture
Haris456
 
10 Multicore 07
10 Multicore 0710 Multicore 07
10 Multicore 07
timcrack
 

What's hot (20)

BiTteRsweet FS
BiTteRsweet FSBiTteRsweet FS
BiTteRsweet FS
 
Process, Threads, Symmetric Multiprocessing and Microkernels in Operating System
Process, Threads, Symmetric Multiprocessing and Microkernels in Operating SystemProcess, Threads, Symmetric Multiprocessing and Microkernels in Operating System
Process, Threads, Symmetric Multiprocessing and Microkernels in Operating System
 
SYNCHRONIZATION IN MULTIPROCESSING
SYNCHRONIZATION IN MULTIPROCESSINGSYNCHRONIZATION IN MULTIPROCESSING
SYNCHRONIZATION IN MULTIPROCESSING
 
Cache coherence
Cache coherenceCache coherence
Cache coherence
 
A20 dfs2
A20 dfs2A20 dfs2
A20 dfs2
 
Processes and Threads in Windows Vista
Processes and Threads in Windows VistaProcesses and Threads in Windows Vista
Processes and Threads in Windows Vista
 
18 philbe replication stanford99
18 philbe replication stanford9918 philbe replication stanford99
18 philbe replication stanford99
 
Scalability fs v2
Scalability fs v2Scalability fs v2
Scalability fs v2
 
Multithreaded processors ppt
Multithreaded processors pptMultithreaded processors ppt
Multithreaded processors ppt
 
File replication
File replicationFile replication
File replication
 
Backup, Restore, and Disaster Recovery
Backup, Restore, and Disaster RecoveryBackup, Restore, and Disaster Recovery
Backup, Restore, and Disaster Recovery
 
Lecture 6.1
Lecture  6.1Lecture  6.1
Lecture 6.1
 
Multithreading computer architecture
 Multithreading computer architecture  Multithreading computer architecture
Multithreading computer architecture
 
Memory management in linux
Memory management in linuxMemory management in linux
Memory management in linux
 
10 replication
10 replication10 replication
10 replication
 
Linux Memory Management
Linux Memory ManagementLinux Memory Management
Linux Memory Management
 
10 Multicore 07
10 Multicore 0710 Multicore 07
10 Multicore 07
 
Real time operating systems (rtos) concepts 5
Real time operating systems (rtos) concepts 5Real time operating systems (rtos) concepts 5
Real time operating systems (rtos) concepts 5
 
Process management in linux
Process management in linuxProcess management in linux
Process management in linux
 
Parallel computing in bioinformatics t.seemann - balti bioinformatics - wed...
Parallel computing in bioinformatics   t.seemann - balti bioinformatics - wed...Parallel computing in bioinformatics   t.seemann - balti bioinformatics - wed...
Parallel computing in bioinformatics t.seemann - balti bioinformatics - wed...
 

Similar to Containers in a File

Deployment Strategy
Deployment StrategyDeployment Strategy
Deployment Strategy
MongoDB
 

Similar to Containers in a File (20)

PFcache (Linuxcon, Seattle, 2015)
PFcache (Linuxcon, Seattle, 2015)PFcache (Linuxcon, Seattle, 2015)
PFcache (Linuxcon, Seattle, 2015)
 
PFcache - LinuxCon 2015
PFcache - LinuxCon 2015PFcache - LinuxCon 2015
PFcache - LinuxCon 2015
 
Denser containers with PF cache - Pavel Emelyanov
Denser containers with PF cache - Pavel EmelyanovDenser containers with PF cache - Pavel Emelyanov
Denser containers with PF cache - Pavel Emelyanov
 
Containers and Namespaces in the Linux Kernel
Containers and Namespaces in the Linux KernelContainers and Namespaces in the Linux Kernel
Containers and Namespaces in the Linux Kernel
 
Deployment Strategies (Mongo Austin)
Deployment Strategies (Mongo Austin)Deployment Strategies (Mongo Austin)
Deployment Strategies (Mongo Austin)
 
Deployment Strategies
Deployment StrategiesDeployment Strategies
Deployment Strategies
 
Cassandra Day Atlanta 2015: Recording the Web: High-Fidelity Storage and Play...
Cassandra Day Atlanta 2015: Recording the Web: High-Fidelity Storage and Play...Cassandra Day Atlanta 2015: Recording the Web: High-Fidelity Storage and Play...
Cassandra Day Atlanta 2015: Recording the Web: High-Fidelity Storage and Play...
 
Advanced Storage Area Network
Advanced Storage Area NetworkAdvanced Storage Area Network
Advanced Storage Area Network
 
File000128
File000128File000128
File000128
 
Exploring Docker Security
Exploring Docker SecurityExploring Docker Security
Exploring Docker Security
 
CvmFS Workshop
CvmFS WorkshopCvmFS Workshop
CvmFS Workshop
 
Deployment Strategy
Deployment StrategyDeployment Strategy
Deployment Strategy
 
Ceph in the GRNET cloud stack
Ceph in the GRNET cloud stackCeph in the GRNET cloud stack
Ceph in the GRNET cloud stack
 
Container & kubernetes
Container & kubernetesContainer & kubernetes
Container & kubernetes
 
Live Container Migration: OpenStack Summit Barcelona 2016
Live Container Migration: OpenStack Summit Barcelona 2016Live Container Migration: OpenStack Summit Barcelona 2016
Live Container Migration: OpenStack Summit Barcelona 2016
 
.ppt
.ppt.ppt
.ppt
 
”Bare-Metal Container" presented at HPCC2016
”Bare-Metal Container" presented at HPCC2016”Bare-Metal Container" presented at HPCC2016
”Bare-Metal Container" presented at HPCC2016
 
Considerations when implementing_ha_in_dmf
Considerations when implementing_ha_in_dmfConsiderations when implementing_ha_in_dmf
Considerations when implementing_ha_in_dmf
 
What'sNnew in 3.0 Webinar
What'sNnew in 3.0 WebinarWhat'sNnew in 3.0 Webinar
What'sNnew in 3.0 Webinar
 
Building the Internet of Things with Thingsquare and Contiki - day 2 part 1
Building the Internet of Things with Thingsquare and Contiki - day 2 part 1Building the Internet of Things with Thingsquare and Contiki - day 2 part 1
Building the Internet of Things with Thingsquare and Contiki - day 2 part 1
 

More from OpenVZ

Speeding up ps and top
Speeding up ps and topSpeeding up ps and top
Speeding up ps and top
OpenVZ
 
Live migration: pros, cons and gotchas -- Pavel Emelyanov
Live migration: pros, cons and gotchas -- Pavel EmelyanovLive migration: pros, cons and gotchas -- Pavel Emelyanov
Live migration: pros, cons and gotchas -- Pavel Emelyanov
OpenVZ
 
Containers in a file
Containers in a fileContainers in a file
Containers in a file
OpenVZ
 

More from OpenVZ (20)

Speeding up ps and top
Speeding up ps and topSpeeding up ps and top
Speeding up ps and top
 
Live migration: pros, cons and gotchas -- Pavel Emelyanov
Live migration: pros, cons and gotchas -- Pavel EmelyanovLive migration: pros, cons and gotchas -- Pavel Emelyanov
Live migration: pros, cons and gotchas -- Pavel Emelyanov
 
Live migrating a container: pros, cons and gotchas -- Pavel Emelyanov
Live migrating a container: pros, cons and gotchas -- Pavel EmelyanovLive migrating a container: pros, cons and gotchas -- Pavel Emelyanov
Live migrating a container: pros, cons and gotchas -- Pavel Emelyanov
 
CRIU: time and space travel for Linux containers -- Kir Kolyshkin
CRIU: time and space travel for Linux containers -- Kir KolyshkinCRIU: time and space travel for Linux containers -- Kir Kolyshkin
CRIU: time and space travel for Linux containers -- Kir Kolyshkin
 
Тестирование ПО, основанного на сторонних компонентах - Денис Силаков, SECR 2015
Тестирование ПО, основанного на сторонних компонентах - Денис Силаков, SECR 2015Тестирование ПО, основанного на сторонних компонентах - Денис Силаков, SECR 2015
Тестирование ПО, основанного на сторонних компонентах - Денис Силаков, SECR 2015
 
Живая миграция: плюсы, минусы и подводные камни - Павел Емельянов
Живая миграция: плюсы, минусы и подводные камни - Павел ЕмельяновЖивая миграция: плюсы, минусы и подводные камни - Павел Емельянов
Живая миграция: плюсы, минусы и подводные камни - Павел Емельянов
 
What's missing from upstream kernel containers? - Sergey Bronnikov
What's missing from upstream kernel containers? - Sergey BronnikovWhat's missing from upstream kernel containers? - Sergey Bronnikov
What's missing from upstream kernel containers? - Sergey Bronnikov
 
Проблема фрагментации виртуальных дисков и способы её решения -- Дмитрий Монахов
Проблема фрагментации виртуальных дисков и способы её решения -- Дмитрий МонаховПроблема фрагментации виртуальных дисков и способы её решения -- Дмитрий Монахов
Проблема фрагментации виртуальных дисков и способы её решения -- Дмитрий Монахов
 
Развёртывание приложений Docker в контейнерах Virtuozzo -- Павел Тихомиров
Развёртывание приложений Docker в контейнерах Virtuozzo -- Павел ТихомировРазвёртывание приложений Docker в контейнерах Virtuozzo -- Павел Тихомиров
Развёртывание приложений Docker в контейнерах Virtuozzo -- Павел Тихомиров
 
CRIU: ускорение запуска PHP в CloudLinux OS -- Руслан Купреев
CRIU: ускорение запуска PHP в CloudLinux OS  -- Руслан КупреевCRIU: ускорение запуска PHP в CloudLinux OS  -- Руслан Купреев
CRIU: ускорение запуска PHP в CloudLinux OS -- Руслан Купреев
 
LibCT и контейнеры на уровне приложений -- Александр Бурлука
	LibCT и контейнеры на уровне приложений -- Александр Бурлука	LibCT и контейнеры на уровне приложений -- Александр Бурлука
LibCT и контейнеры на уровне приложений -- Александр Бурлука
 
Управление памятью контейнеров в проекте OpenVZ -- Владимир Давыдов
Управление памятью контейнеров в проекте OpenVZ -- Владимир ДавыдовУправление памятью контейнеров в проекте OpenVZ -- Владимир Давыдов
Управление памятью контейнеров в проекте OpenVZ -- Владимир Давыдов
 
Живая миграция контейнеров: плюсы, минусы, подводные камни -- Павел Емельянов
Живая миграция контейнеров: плюсы, минусы, подводные камни -- Павел ЕмельяновЖивая миграция контейнеров: плюсы, минусы, подводные камни -- Павел Емельянов
Живая миграция контейнеров: плюсы, минусы, подводные камни -- Павел Емельянов
 
LibCT: one lib to rule them all -- Andrey Vagin
LibCT: one lib to rule them all -- Andrey VaginLibCT: one lib to rule them all -- Andrey Vagin
LibCT: one lib to rule them all -- Andrey Vagin
 
CGroups kernel memory controller -- Pavel Emelyanov
CGroups kernel memory controller -- Pavel EmelyanovCGroups kernel memory controller -- Pavel Emelyanov
CGroups kernel memory controller -- Pavel Emelyanov
 
What's missing from upstream kernel containers? - Kir Kolyshkin, Sergey Bronn...
What's missing from upstream kernel containers? - Kir Kolyshkin, Sergey Bronn...What's missing from upstream kernel containers? - Kir Kolyshkin, Sergey Bronn...
What's missing from upstream kernel containers? - Kir Kolyshkin, Sergey Bronn...
 
Not so brief history of Linux Containers - Kir Kolyshkin
Not so brief history of Linux Containers - Kir KolyshkinNot so brief history of Linux Containers - Kir Kolyshkin
Not so brief history of Linux Containers - Kir Kolyshkin
 
Openvz booth
Openvz boothOpenvz booth
Openvz booth
 
Управление ресурсами в Linux и OpenVZ
Управление ресурсами в Linux и OpenVZ Управление ресурсами в Linux и OpenVZ
Управление ресурсами в Linux и OpenVZ
 
Containers in a file
Containers in a fileContainers in a file
Containers in a file
 

Recently uploaded

AI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM TechniquesAI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
VictorSzoltysek
 
TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service provider
mohitmore19
 
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
Health
 
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
9953056974 Low Rate Call Girls In Saket, Delhi NCR
 

Recently uploaded (20)

AI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM TechniquesAI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
 
Diamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionDiamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with Precision
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.com
 
Software Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsSoftware Quality Assurance Interview Questions
Software Quality Assurance Interview Questions
 
Introducing Microsoft’s new Enterprise Work Management (EWM) Solution
Introducing Microsoft’s new Enterprise Work Management (EWM) SolutionIntroducing Microsoft’s new Enterprise Work Management (EWM) Solution
Introducing Microsoft’s new Enterprise Work Management (EWM) Solution
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
 
TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service provider
 
How To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.jsHow To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.js
 
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS LiveVip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
 
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
 
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
 
Microsoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdfMicrosoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdf
 
A Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxA Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docx
 
Azure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdf
Azure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdfAzure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdf
Azure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdf
 
Right Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsRight Money Management App For Your Financial Goals
Right Money Management App For Your Financial Goals
 
Define the academic and professional writing..pdf
Define the academic and professional writing..pdfDefine the academic and professional writing..pdf
Define the academic and professional writing..pdf
 
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
 
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
 
AI & Machine Learning Presentation Template
AI & Machine Learning Presentation TemplateAI & Machine Learning Presentation Template
AI & Machine Learning Presentation Template
 

Containers in a File

  • 2. 2 Agenda •Containers overview •Current way of keeping container files and its problems •Basics of container-in-a-file approach •Limitations of existing facilities •Solution •Benefits •Plans for future
  • 3. 3 Virtualization: VM-style vs. containers CT 1 Hardware Hardware Host OSHost OS / Hypervisor OS Virtualization LayerVirtual Machine Monitor Virtual Hardware Virtual Hardware Guest OS Guest OS VM 1 VM 2 CT 2 Containers: •OpenVZ / Parallels Containers •Solaris containers/Zones •FreeBSD jails VMMs: •KVM / Qemu •VMware •Parallels Hypervisor
  • 4. 4 Containers Virtualization Each container has its own: •Files •Process tree •Network •Devices •IPC objects
  • 5. 5 Containers: file system view Natural approach: each container chroot-s to a per-container sub-tree of some system-wide host file system. CT 1 CT 2 Block device Host file system / 1 2 CTs chroot chroot
  • 6. 6 Problems •File system journal is a bottleneck: 1. CT1 performs a lot of operations leading to metadata updates (e.g. truncates). Consequently, the journal is getting full. 2. CT2 issuing a single I/O operation blocks until journal checkpoint is completed. Up to 15 seconds in some tests! Host file system100% full Journal Lots of random writes
  • 7. 7 Problems (cont.) •Lots of small-size files I/O on any CT operation (backup, clone) •Sub-tree disk quota support is absent in mainstream kernel •Hard to manage: •No per-container snapshots, live backup is problematic •Live migration – rsync unreliable, changed inode numbers •File system type and its properties are fixed for all containers •Need to limit number of inodes per container
  • 8. 8 Container in a file Basic idea: assign virtual block device to container, keep container’ file- system on top of virtual block device. CT 1 Per-container file systems Host file system Per-container virtual block devices virtual block device image-file image-file virtual block device / / CT 2
  • 9. 9 LVM limitations •Not flexible enough – works only on top of block device •Hard to manage (e.g. how to migrate huge volume?) •No dynamic allocation (--virtualsize 16T --size 16M) •Management is not as convenient as for files Block device Per-container file systems Per-container logical volumes CT 1 / / CT 2 Logical volume Logical volume
  • 10. 10 Ordinary loop device limitations •VFS operations leads to double page-caching •No dynamic allocation (only “plain” format is supported) •No helps to backup/migrate containers •No snapshot functionality Per-container file systems Per-container loop devices CT 1 / / CT 2 /dev/loop1 /dev/loop2 image-file1 image-file2 Host file system
  • 11. 11 Design of new loop device (ploop) main ploop module Container file system kernel block layer mount physical medium image file Maps virtual block number to image block number Maps file offset to physical sector I/O module DIO VFS format module plain QCOW ploop block device v-blocks i-blocks expandable NFS
  • 12. 12 Stacked image configuration Ploop snapshots are represented by image files. They are stuffed in ploop device in stacked manner and obey the following rules: •Only top image is opened in RW mode (others – RO) •Every mapping bears info about ‘level’ •READ leads to access to an image of proper ‘level’ •WRITE may lead to transferring proper block to top image Snapshot image1 (RW) image (RO) level 1 level 0 top WRITE Copy (if any) complete READ complete READ complete
  • 13. 13 Backup via snapshot Operations involved: •Create snapshot •Backup base image •Merge CT image ploop bdev snapshot CT ploop bdev image (RW) (RO) copy image.tmp (RW) merge CT ploop bdev image (RW) image.tmp merge Key features: •On-line •consistent
  • 14. 14 Migrate via write-tracker Operations involved: •Turn write-tracker on •Iterative copy •Freeze container •Copy the rest time track on, copy all check tracker, copy modified check tracker, copy the rest image a copy of image < freeze container> Key features: •On-line •I/O efficient
  • 15. 15 Problems solved •File system journal is not bottleneck anymore 1. CT1 performs a lot of operations leading to metadata updates (e.g. truncates). Consequently, the journal is getting full. 2. CT2 has its own file system. So, it’s not directly affected by CT1. CT1 file system100% full Journal Lots of random writes CT2 file system
  • 16. 16 Problems solved (cont.) •Large-size image files I/O instead of lots of small-size files I/O on management operations •Disk quota can be implemented based on virtual device sizes. No need for sub-tree quotas •Live backup is easy and consistent •Live migration is reliable and efficient •Different containers may use file systems of different types and properties •No need to limit “number-of-inodes-per-container”
  • 17. 17 Additional benefits •Efficient container creation •Snapshot/merge feature •Can support QCOW2 and other image formats •Can support arbitrary storage types •Can help KVM to keep all disk I/O in-kernel
  • 18. 18 Plans •Implement I/O module on top of vfs ops •Add support of QCOW2 image format •Optimizations: •discard/TRIM events support •online images defragmentation •support sparse files •Integration into mainline kernel