SlideShare a Scribd company logo
1 of 50
Download to read offline
The Turtles Project: 
Design and Implementation of Nested Virtualization 
Muli Ben-Yehuday Michael D. Dayz Zvi Dubitzkyy Michael Factory 
Nadav Har’Ely Abel Gordony Anthony Liguoriz Orit Wassermany 
Ben-Ami Yassoury 
yIBM Research – Haifa 
zIBM Linux Technology Center 
Ben-Yehuda et al. (IBM Research) The Turtles Project: Nested Virtualization OSDI ’10 1 / 22
What is nested x86 virtualization? 
Running multiple unmodified 
hypervisors 
With their associated 
unmodified VM’s 
Guest 
Simultaneously 
Hypervisor 
On the x86 architecture 
Which does not support 
nesting in hardware. . . 
. . . but does support a single 
level of virtualization Hardware 
Hypervisor 
Guest 
OS 
Guest 
OS 
Guest 
OS 
Ben-Yehuda et al. (IBM Research) The Turtles Project: Nested Virtualization OSDI ’10 2 / 22
Why? 
Operating systems are already hypervisors (Windows 7 with XP 
mode, Linux/KVM) 
To be able to run other hypervisors in clouds 
Security (e.g., hypervisor-level rootkits) 
Co-design of x86 hardware and system software 
Testing, demonstrating, debugging, live migration of hypervisors 
Ben-Yehuda et al. (IBM Research) The Turtles Project: Nested Virtualization OSDI ’10 3 / 22
Related work 
First models for nested virtualization [PopekGoldberg74, 
BelpaireHsu75, LauerWyeth73] 
First implementation in the IBM z/VM; relies on architectural 
support for nested virtualization (sie) 
Microkernels meet recursive VMs [FordHibler96]: assumes we 
can modify software at all levels 
x86 software based approaches (slow!) [Berghmans10] 
KVM [KivityKamay07] with AMD SVM [RoedelGraf09] 
Early Xen prototype [He09] 
Blue Pill rootkit hiding from other hypervisors [Rutkowska06] 
Ben-Yehuda et al. (IBM Research) The Turtles Project: Nested Virtualization OSDI ’10 4 / 22
What is the Turtles project? 
Efficient nested virtualization for Intel x86 based on KVM 
Multiple guest hypervisors and VMs: VMware, Windows, . . . 
Code publicly available 
Ben-Yehuda et al. (IBM Research) The Turtles Project: Nested Virtualization OSDI ’10 5 / 22
What is the Turtles project? (cont’) 
Nested VMX virtualization for nested CPU virtualization 
Multi-dimensional paging for nested MMU virtualization 
Multi-level device assignment for nested I/O virtualization 
Micro-optimizations to make it go fast (see paper) 
+ + = 
Ben-Yehuda et al. (IBM Research) The Turtles Project: Nested Virtualization OSDI ’10 6 / 22
Theory of nested CPU virtualization 
Single-level architectural support (x86) vs. multi-level architectural 
support (e.g., z/VM) 
Single level ) one hypervisor, many guests 
Turtles approach: L0 multiplexes the hardware between L1 and L2, 
running both as guests of L0—without either being aware of it 
(Scheme generalized for n levels; Our focus is n=2) 
Guest 
Host Hypervisor 
Hardware 
Host Hypervisor 
Hardware 
Multiple logical levels Multiplexed on a single level 
L1 
L0 
L2 
L1 
Guest 
L2 
Guest 
L2 
L0 
Guest 
L2 Guest 
Hypervisor 
Guest 
Hypervisor Guest Guest 
Ben-Yehuda et al. (IBM Research) The Turtles Project: Nested Virtualization OSDI ’10 7 / 22
Theory of nested CPU virtualization 
Single-level architectural support (x86) vs. multi-level architectural 
support (e.g., z/VM) 
Single level ) one hypervisor, many guests 
Turtles approach: L0 multiplexes the hardware between L1 and L2, 
running both as guests of L0—without either being aware of it 
(Scheme generalized for n levels; Our focus is n=2) 
Guest 
Host Hypervisor 
Hardware 
Host Hypervisor 
Hardware 
Multiple logical levels Multiplexed on a single level 
L1 
L0 
L2 
L1 
Guest 
L2 
Guest 
L2 
L0 
Guest 
L2 Guest 
Hypervisor 
Guest 
Hypervisor Guest Guest 
Ben-Yehuda et al. (IBM Research) The Turtles Project: Nested Virtualization OSDI ’10 7 / 22
Theory of nested CPU virtualization 
Single-level architectural support (x86) vs. multi-level architectural 
support (e.g., z/VM) 
Single level ) one hypervisor, many guests 
Turtles approach: L0 multiplexes the hardware between L1 and L2, 
running both as guests of L0—without either being aware of it 
(Scheme generalized for n levels; Our focus is n=2) 
Guest 
Host Hypervisor 
Hardware 
Host Hypervisor 
Hardware 
Multiple logical levels Multiplexed on a single level 
L1 
L0 
L2 
L1 
Guest 
L2 
Guest 
L2 
L0 
Guest 
L2 Guest 
Hypervisor 
Guest 
Hypervisor Guest Guest 
Ben-Yehuda et al. (IBM Research) The Turtles Project: Nested Virtualization OSDI ’10 7 / 22
Theory of nested CPU virtualization 
Single-level architectural support (x86) vs. multi-level architectural 
support (e.g., z/VM) 
Single level ) one hypervisor, many guests 
Turtles approach: L0 multiplexes the hardware between L1 and L2, 
running both as guests of L0—without either being aware of it 
(Scheme generalized for n levels; Our focus is n=2) 
Guest 
Host Hypervisor 
Hardware 
Host Hypervisor 
Hardware 
Multiple logical levels Multiplexed on a single level 
L1 
L0 
L2 
L1 
Guest 
L2 
Guest 
L2 
L0 
Guest 
L2 Guest 
Hypervisor 
Guest 
Hypervisor Guest Guest 
Ben-Yehuda et al. (IBM Research) The Turtles Project: Nested Virtualization OSDI ’10 7 / 22
Nested VMX virtualization: flow 
L0 runs L1 with VMCS0!1 
L1 prepares VMCS1!2 and 
executes vmlaunch 
vmlaunch traps to L0 
L0 merges VMCS’s: 
VMCS0!1 merged with 
VMCS1!2 is VMCS0!2 
L0 launches L2 
L2 causes a trap 
L0 handles trap itself or 
forwards it to L1 
. . . 
eventually, L0 resumes L2 
repeat 
L1 L2 
Host Hypervisor 
Hardware 
Guest 
OS 
Guest 
Hypervisor 
VMCS 
Memory 
VMCS Tables 
Memory 
Tables VMCS Memory 
Tables 
L0 
1-2 State 
0-1 State 0-2 State 
Ben-Yehuda et al. (IBM Research) The Turtles Project: Nested Virtualization OSDI ’10 8 / 22
Nested VMX virtualization: flow 
L0 runs L1 with VMCS0!1 
L1 prepares VMCS1!2 and 
executes vmlaunch 
vmlaunch traps to L0 
L0 merges VMCS’s: 
VMCS0!1 merged with 
VMCS1!2 is VMCS0!2 
L0 launches L2 
L2 causes a trap 
L0 handles trap itself or 
forwards it to L1 
. . . 
eventually, L0 resumes L2 
repeat 
L1 L2 
Host Hypervisor 
Hardware 
Guest 
OS 
Guest 
Hypervisor 
VMCS 
Memory 
VMCS Tables 
Memory 
Tables VMCS Memory 
Tables 
L0 
1-2 State 
0-1 State 0-2 State 
Ben-Yehuda et al. (IBM Research) The Turtles Project: Nested Virtualization OSDI ’10 8 / 22
Nested VMX virtualization: flow 
L0 runs L1 with VMCS0!1 
L1 prepares VMCS1!2 and 
executes vmlaunch 
vmlaunch traps to L0 
L0 merges VMCS’s: 
VMCS0!1 merged with 
VMCS1!2 is VMCS0!2 
L0 launches L2 
L2 causes a trap 
L0 handles trap itself or 
forwards it to L1 
. . . 
eventually, L0 resumes L2 
repeat 
L1 L2 
Host Hypervisor 
Hardware 
Guest 
OS 
Guest 
Hypervisor 
VMCS 
Memory 
VMCS Tables 
Memory 
Tables VMCS Memory 
Tables 
L0 
1-2 State 
0-1 State 0-2 State 
Ben-Yehuda et al. (IBM Research) The Turtles Project: Nested Virtualization OSDI ’10 8 / 22
Nested VMX virtualization: flow 
L0 runs L1 with VMCS0!1 
L1 prepares VMCS1!2 and 
executes vmlaunch 
vmlaunch traps to L0 
L0 merges VMCS’s: 
VMCS0!1 merged with 
VMCS1!2 is VMCS0!2 
L0 launches L2 
L2 causes a trap 
L0 handles trap itself or 
forwards it to L1 
. . . 
eventually, L0 resumes L2 
repeat 
L1 L2 
Host Hypervisor 
Hardware 
Guest 
OS 
Guest 
Hypervisor 
VMCS 
Memory 
VMCS Tables 
Memory 
Tables VMCS Memory 
Tables 
L0 
1-2 State 
0-1 State 0-2 State 
Ben-Yehuda et al. (IBM Research) The Turtles Project: Nested Virtualization OSDI ’10 8 / 22
Nested VMX virtualization: flow 
L0 runs L1 with VMCS0!1 
L1 prepares VMCS1!2 and 
executes vmlaunch 
vmlaunch traps to L0 
L0 merges VMCS’s: 
VMCS0!1 merged with 
VMCS1!2 is VMCS0!2 
L0 launches L2 
L2 causes a trap 
L0 handles trap itself or 
forwards it to L1 
. . . 
eventually, L0 resumes L2 
repeat 
L1 L2 
Host Hypervisor 
Hardware 
Guest 
OS 
Guest 
Hypervisor 
VMCS 
Memory 
VMCS Tables 
Memory 
Tables VMCS Memory 
Tables 
L0 
1-2 State 
0-1 State 0-2 State 
Ben-Yehuda et al. (IBM Research) The Turtles Project: Nested Virtualization OSDI ’10 8 / 22
Nested VMX virtualization: flow 
L0 runs L1 with VMCS0!1 
L1 prepares VMCS1!2 and 
executes vmlaunch 
vmlaunch traps to L0 
L0 merges VMCS’s: 
VMCS0!1 merged with 
VMCS1!2 is VMCS0!2 
L0 launches L2 
L2 causes a trap 
L0 handles trap itself or 
forwards it to L1 
. . . 
eventually, L0 resumes L2 
repeat 
L1 L2 
Host Hypervisor 
Hardware 
Guest 
OS 
Guest 
Hypervisor 
VMCS 
Memory 
VMCS Tables 
Memory 
Tables VMCS Memory 
Tables 
L0 
1-2 State 
0-1 State 0-2 State 
Ben-Yehuda et al. (IBM Research) The Turtles Project: Nested Virtualization OSDI ’10 8 / 22
Nested VMX virtualization: flow 
L0 runs L1 with VMCS0!1 
L1 prepares VMCS1!2 and 
executes vmlaunch 
vmlaunch traps to L0 
L0 merges VMCS’s: 
VMCS0!1 merged with 
VMCS1!2 is VMCS0!2 
L0 launches L2 
L2 causes a trap 
L0 handles trap itself or 
forwards it to L1 
. . . 
eventually, L0 resumes L2 
repeat 
L1 L2 
Host Hypervisor 
Hardware 
Guest 
OS 
Guest 
Hypervisor 
VMCS 
Memory 
VMCS Tables 
Memory 
Tables VMCS Memory 
Tables 
L0 
1-2 State 
0-1 State 0-2 State 
Ben-Yehuda et al. (IBM Research) The Turtles Project: Nested Virtualization OSDI ’10 8 / 22
Nested VMX virtualization: flow 
L0 runs L1 with VMCS0!1 
L1 prepares VMCS1!2 and 
executes vmlaunch 
vmlaunch traps to L0 
L0 merges VMCS’s: 
VMCS0!1 merged with 
VMCS1!2 is VMCS0!2 
L0 launches L2 
L2 causes a trap 
L0 handles trap itself or 
forwards it to L1 
. . . 
eventually, L0 resumes L2 
repeat 
L1 L2 
Host Hypervisor 
Hardware 
Guest 
OS 
Guest 
Hypervisor 
VMCS 
Memory 
VMCS Tables 
Memory 
Tables VMCS Memory 
Tables 
L0 
1-2 State 
0-1 State 0-2 State 
Ben-Yehuda et al. (IBM Research) The Turtles Project: Nested Virtualization OSDI ’10 8 / 22
Nested VMX virtualization: flow 
L0 runs L1 with VMCS0!1 
L1 prepares VMCS1!2 and 
executes vmlaunch 
vmlaunch traps to L0 
L0 merges VMCS’s: 
VMCS0!1 merged with 
VMCS1!2 is VMCS0!2 
L0 launches L2 
L2 causes a trap 
L0 handles trap itself or 
forwards it to L1 
. . . 
eventually, L0 resumes L2 
repeat 
L1 L2 
Host Hypervisor 
Hardware 
Guest 
OS 
Guest 
Hypervisor 
VMCS 
Memory 
VMCS Tables 
Memory 
Tables VMCS Memory 
Tables 
L0 
1-2 State 
0-1 State 0-2 State 
Ben-Yehuda et al. (IBM Research) The Turtles Project: Nested Virtualization OSDI ’10 8 / 22
Nested VMX virtualization: flow 
L0 runs L1 with VMCS0!1 
L1 prepares VMCS1!2 and 
executes vmlaunch 
vmlaunch traps to L0 
L0 merges VMCS’s: 
VMCS0!1 merged with 
VMCS1!2 is VMCS0!2 
L0 launches L2 
L2 causes a trap 
L0 handles trap itself or 
forwards it to L1 
. . . 
eventually, L0 resumes L2 
repeat 
L1 L2 
Host Hypervisor 
Hardware 
Guest 
OS 
Guest 
Hypervisor 
VMCS 
Memory 
VMCS Tables 
Memory 
Tables VMCS Memory 
Tables 
L0 
1-2 State 
0-1 State 0-2 State 
Ben-Yehuda et al. (IBM Research) The Turtles Project: Nested Virtualization OSDI ’10 8 / 22
Exit multiplication makes angry turtle angry 
To handle a single L2 exit, L1 does many things: read and write 
the VMCS, disable interrupts, . . . 
Those operations can trap, leading to exit multiplication 
Exit multiplication: a single L2 exit can cause 40-50 L1 exits! 
Optimize: make a single exit fast and reduce frequency of exits 
… 
… 
… 
… 
L3 
L2 
L1 
L0 
Single Two 
Three Levels 
Level 
Levels 
Ben-Yehuda et al. (IBM Research) The Turtles Project: Nested Virtualization OSDI ’10 9 / 22
Exit multiplication makes angry turtle angry 
To handle a single L2 exit, L1 does many things: read and write 
the VMCS, disable interrupts, . . . 
Those operations can trap, leading to exit multiplication 
Exit multiplication: a single L2 exit can cause 40-50 L1 exits! 
Optimize: make a single exit fast and reduce frequency of exits 
… 
… 
… 
… 
L3 
L2 
L1 
L0 
Single Two 
Three Levels 
Level 
Levels 
Ben-Yehuda et al. (IBM Research) The Turtles Project: Nested Virtualization OSDI ’10 9 / 22
Exit multiplication makes angry turtle angry 
To handle a single L2 exit, L1 does many things: read and write 
the VMCS, disable interrupts, . . . 
Those operations can trap, leading to exit multiplication 
Exit multiplication: a single L2 exit can cause 40-50 L1 exits! 
Optimize: make a single exit fast and reduce frequency of exits 
… 
… 
… 
… 
L3 
L2 
L1 
L0 
Single Two 
Three Levels 
Level 
Levels 
Ben-Yehuda et al. (IBM Research) The Turtles Project: Nested Virtualization OSDI ’10 9 / 22
Exit multiplication makes angry turtle angry 
To handle a single L2 exit, L1 does many things: read and write 
the VMCS, disable interrupts, . . . 
Those operations can trap, leading to exit multiplication 
Exit multiplication: a single L2 exit can cause 40-50 L1 exits! 
Optimize: make a single exit fast and reduce frequency of exits 
… 
… 
… 
… 
L3 
L2 
L1 
L0 
Single Two 
Three Levels 
Level 
Levels 
Ben-Yehuda et al. (IBM Research) The Turtles Project: Nested Virtualization OSDI ’10 9 / 22
MMU virtualization via multi-dimensional paging 
Three logical translations: L2 virt ! phys, L2 ! L1, L1 ! L0 
Only two tables in hardware with EPT: 
virt ! phys and guest physical ! host physical 
L0 compresses three logical translations onto two hardware tables 
SPT12 
Shadow on top of 
L2 virtual 
L2 physical 
L1 physical 
L0 physical 
GPT 
L2 virtual 
L2 physical 
L1 physical 
L0 physical 
GPT 
SPT02 
shadow 
SPT12 
EPT01 
Multi-dimensional 
L2 virtual 
L2 physical 
L1 physical 
L0 physical 
GPT 
EPT02 
EPT12 
paging 
Shadow on top 
of EPT 
EPT01 
baseline better best 
Ben-Yehuda et al. (IBM Research) The Turtles Project: Nested Virtualization OSDI ’10 10 / 22
Baseline: shadow-on-shadow 
SPT12 
L2 virtual 
L2 physical 
L1 physical 
L0 physical 
GPT 
SPT02 
Assume no EPT table; all hypervisors use shadow paging 
Useful for old machines and as a baseline 
Maintaining shadow page tables is expensive 
Compress: three logical translations ) one table in hardware 
Ben-Yehuda et al. (IBM Research) The Turtles Project: Nested Virtualization OSDI ’10 11 / 22
Better: shadow-on-EPT 
L2 virtual 
L2 physical 
L1 physical 
L0 physical 
SPT12 
EPT01 
GPT 
Instead of one hardware table we have two 
Compress: three logical translations ) two in hardware 
Simple approach: L0 uses EPT, L1 uses shadow paging for L2 
Every L2 page fault leads to multiple L1 exits 
Ben-Yehuda et al. (IBM Research) The Turtles Project: Nested Virtualization OSDI ’10 12 / 22
Best: multi-dimensional paging 
L2 virtual 
L2 physical 
L1 physical 
L0 physical 
GPT 
EPT02 
EPT12 
EPT01 
EPT table rarely changes; guest page table changes a lot 
Again, compress three logical translations ) two in hardware 
L0 emulates EPT for L1 
L0 uses EPT0!1 and EPT1!2 to construct EPT0!2 
End result: a lot less exits! 
Ben-Yehuda et al. (IBM Research) The Turtles Project: Nested Virtualization OSDI ’10 13 / 22
Introduction to I/O virtualization 
Device emulation [Sugerman01] 
HOST 
GUEST 
1 
2 
device 
4 3 
device 
driver 
device 
emulation 
driver 
Ben-Yehuda et al. (IBM Research) The Turtles Project: Nested Virtualization OSDI ’10 14 / 22
Introduction to I/O virtualization 
Device emulation [Sugerman01] 
HOST 
GUEST 
1 
2 
device 
4 3 
device 
driver 
device 
emulation 
driver 
Para-virtualized drivers [Barham03, Russell08] 
HOST 
GUEST 
front−end 
virtual 
driver 
1 
driver 
3 2 
back−end 
device virtual 
driver 
Ben-Yehuda et al. (IBM Research) The Turtles Project: Nested Virtualization OSDI ’10 14 / 22
Introduction to I/O virtualization 
Device emulation [Sugerman01] 
HOST 
GUEST 
1 
2 
device 
4 3 
device 
driver 
device 
emulation 
driver 
Para-virtualized drivers [Barham03, Russell08] 
HOST 
GUEST 
front−end 
virtual 
driver 
1 
driver 
3 2 
back−end 
device virtual 
driver 
Direct device assignment [Levasseur04,Yassour08] 
HOST 
GUEST 
device 
driver 
Ben-Yehuda et al. (IBM Research) The Turtles Project: Nested Virtualization OSDI ’10 14 / 22
Introduction to I/O virtualization 
Device emulation [Sugerman01] 
HOST 
GUEST 
1 
2 
device 
4 3 
device 
driver 
device 
emulation 
driver 
Para-virtualized drivers [Barham03, Russell08] 
HOST 
GUEST 
front−end 
virtual 
driver 
1 
driver 
3 2 
back−end 
device virtual 
driver 
Direct device assignment [Levasseur04,Yassour08] 
HOST 
GUEST 
device 
driver 
Direct assignment best performing option 
Ben-Yehuda et al. (IBM Research) The Turtles Project: Nested Virtualization OSDI ’10 14 / 22
Introduction to I/O virtualization 
Device emulation [Sugerman01] 
HOST 
GUEST 
1 
2 
device 
4 3 
device 
driver 
device 
emulation 
driver 
Para-virtualized drivers [Barham03, Russell08] 
HOST 
GUEST 
front−end 
virtual 
driver 
1 
driver 
3 2 
back−end 
device virtual 
driver 
Direct device assignment [Levasseur04,Yassour08] 
HOST 
GUEST 
device 
driver 
Direct assignment best performing option 
Direct assignment requires IOMMU for safe DMA bypass 
Ben-Yehuda et al. (IBM Research) The Turtles Project: Nested Virtualization OSDI ’10 14 / 22
Multi-level device assignment 
With nested 3x3 options for I/O virtualization (L2 , L1 , L0) 
Multi-level device assignment means giving an L2 guest direct 
access to L0’s devices, safely bypassing both L0 and L1 
L1 
hypervisor 
physical 
device 
L0 
hypervisor 
L2 device 
driver 
MMIOs and PIOs 
L L0 IOMMU 1 IOMMU 
Device DMA via 
platform IOMMU 
How? L0 emulates an IOMMU for L1 [Amit10] 
L0 compresses multiple IOMMU translations onto the single 
hardware IOMMU page table 
L2 programs the device directly 
Device DMA’s into L2 memory space directly 
Ben-Yehuda et al. (IBM Research) The Turtles Project: Nested Virtualization OSDI ’10 15 / 22
Multi-level device assignment 
With nested 3x3 options for I/O virtualization (L2 , L1 , L0) 
Multi-level device assignment means giving an L2 guest direct 
access to L0’s devices, safely bypassing both L0 and L1 
L1 
hypervisor 
physical 
device 
L0 
hypervisor 
L2 device 
driver 
MMIOs and PIOs 
L L0 IOMMU 1 IOMMU 
Device DMA via 
platform IOMMU 
How? L0 emulates an IOMMU for L1 [Amit10] 
L0 compresses multiple IOMMU translations onto the single 
hardware IOMMU page table 
L2 programs the device directly 
Device DMA’s into L2 memory space directly 
Ben-Yehuda et al. (IBM Research) The Turtles Project: Nested Virtualization OSDI ’10 15 / 22
Multi-level device assignment 
With nested 3x3 options for I/O virtualization (L2 , L1 , L0) 
Multi-level device assignment means giving an L2 guest direct 
access to L0’s devices, safely bypassing both L0 and L1 
L1 
hypervisor 
physical 
device 
L0 
hypervisor 
L2 device 
driver 
MMIOs and PIOs 
L L0 IOMMU 1 IOMMU 
Device DMA via 
platform IOMMU 
How? L0 emulates an IOMMU for L1 [Amit10] 
L0 compresses multiple IOMMU translations onto the single 
hardware IOMMU page table 
L2 programs the device directly 
Device DMA’s into L2 memory space directly 
Ben-Yehuda et al. (IBM Research) The Turtles Project: Nested Virtualization OSDI ’10 15 / 22
Multi-level device assignment 
With nested 3x3 options for I/O virtualization (L2 , L1 , L0) 
Multi-level device assignment means giving an L2 guest direct 
access to L0’s devices, safely bypassing both L0 and L1 
L1 
hypervisor 
physical 
device 
L0 
hypervisor 
L2 device 
driver 
MMIOs and PIOs 
L L0 IOMMU 1 IOMMU 
Device DMA via 
platform IOMMU 
How? L0 emulates an IOMMU for L1 [Amit10] 
L0 compresses multiple IOMMU translations onto the single 
hardware IOMMU page table 
L2 programs the device directly 
Device DMA’s into L2 memory space directly 
Ben-Yehuda et al. (IBM Research) The Turtles Project: Nested Virtualization OSDI ’10 15 / 22
Multi-level device assignment 
With nested 3x3 options for I/O virtualization (L2 , L1 , L0) 
Multi-level device assignment means giving an L2 guest direct 
access to L0’s devices, safely bypassing both L0 and L1 
L1 
hypervisor 
physical 
device 
L0 
hypervisor 
L2 device 
driver 
MMIOs and PIOs 
L L0 IOMMU 1 IOMMU 
Device DMA via 
platform IOMMU 
How? L0 emulates an IOMMU for L1 [Amit10] 
L0 compresses multiple IOMMU translations onto the single 
hardware IOMMU page table 
L2 programs the device directly 
Device DMA’s into L2 memory space directly 
Ben-Yehuda et al. (IBM Research) The Turtles Project: Nested Virtualization OSDI ’10 15 / 22
Multi-level device assignment 
With nested 3x3 options for I/O virtualization (L2 , L1 , L0) 
Multi-level device assignment means giving an L2 guest direct 
access to L0’s devices, safely bypassing both L0 and L1 
L1 
hypervisor 
physical 
device 
L0 
hypervisor 
L2 device 
driver 
MMIOs and PIOs 
L L0 IOMMU 1 IOMMU 
Device DMA via 
platform IOMMU 
How? L0 emulates an IOMMU for L1 [Amit10] 
L0 compresses multiple IOMMU translations onto the single 
hardware IOMMU page table 
L2 programs the device directly 
Device DMA’s into L2 memory space directly 
Ben-Yehuda et al. (IBM Research) The Turtles Project: Nested Virtualization OSDI ’10 15 / 22
Experimental Setup 
Running Linux, Windows, 
KVM, VMware, SMP, . . . 
Macro workloads: 
kernbench 
SPECjbb 
netperf 
Multi-dimensional paging? 
Multi-level device assignment? 
KVM as L1 vs. VMware as L1? 
See paper for full experimental 
details, more benchmarks and 
analysis, including worst case 
synthetic micro-benchmark 
Ben-Yehuda et al. (IBM Research) The Turtles Project: Nested Virtualization OSDI ’10 16 / 22
Macro: SPECjbb and kernbench 
kernbench 
Host Guest Nested NestedDRW 
Run time 324.3 355 406.3 391.5 
% overhead vs. host - 9.5 25.3 20.7 
% overhead vs. guest - - 14.5 10.3 
SPECjbb 
Host Guest Nested NestedDRW 
Score 90493 83599 77065 78347 
% degradation vs. host - 7.6 14.8 13.4 
% degradation vs. guest - - 7.8 6.3 
Table: kernbench and SPECjbb results 
Exit multiplication effect not as bad as we feared 
Direct vmread and vmwrite (DRW) give an immediate boost 
Take-away: each level of virtualization adds approximately the same 
overhead! 
Ben-Yehuda et al. (IBM Research) The Turtles Project: Nested Virtualization OSDI ’10 17 / 22
Macro: multi-dimensional paging 
Shadow on EPT 
Multi−dimensional paging
 
3.5
 
3.0
 
2.5
 
2.0
 
1.5
 
1.0
 
0.5
 
0.0
 
kernbench specjbb netperf 
Improvement ratio
 
Impact of multi-dimensional paging depends on rate of page faults 
Shadow-on-EPT: every L2 page fault causes L1 multiple exits 
Multi-dimensional paging: only EPT violations cause L1 exits 
EPT table rarely changes: #(EPT violations) << #(page faults) 
Multi-dimensional paging huge win for page-fault intensive 
kernbench 
Ben-Yehuda et al. (IBM Research) The Turtles Project: Nested Virtualization OSDI ’10 18 / 22
Macro: multi-level device assignment 
throughput (Mbps) 
%cpu
 
1,000 
900 
800 
700 
600 
500 
400 
300 
200 
100 
0 
native 
nested guest 
nested guest 
nested guest 
emulation / emulation 
single level guest 
single emulation 
level virtio 
guest 
single direct level access 
guest 
nested guest 
direct / virtio 
virtio / emulation 
virtio / virtio 
100 
80 
60 
40 
20 
0 
nested guest 
direct / direct 
throughput (Mbps)
 
% cpu 
Benchmark: netperf TCP_STREAM (transmit) 
Multi-level device assignment best performing option 
But: native at 20%, multi-level device assignment at 60% (x3!) 
Interrupts considered harmful, cause exit multiplication 
Ben-Yehuda et al. (IBM Research) The Turtles Project: Nested Virtualization OSDI ’10 19 / 22
Macro: multi-level device assignment (sans interrupts) 
1000 
900 
800 
700 
600 
500 
400 
300 
200 
100 
L0 (bare metal) 
L2 (direct/direct) 
L2 (direct/virtio) 
16 32 64 128 256 512 
Throughput (Mbps) 
Message size (netperf -m) 
What if we could deliver device interrupts directly to L2? 
Only 7% difference between native and nested guest! 
Ben-Yehuda et al. (IBM Research) The Turtles Project: Nested Virtualization OSDI ’10 20 / 22
Conclusions 
Efficient nested x86 
virtualization is challenging but 
feasible 
A whole new ballpark opening 
up many exciting 
applications—security, cloud, 
architecture, . . . 
Current overhead of 6-14% 
Negligible for some 
workloads, not yet for others 
Work in progress—expect at 
most 5% eventually 
Code is available 
It’s turtles all the way down 
Ben-Yehuda et al. (IBM Research) The Turtles Project: Nested Virtualization OSDI ’10 21 / 22
Conclusions 
Efficient nested x86 
virtualization is challenging but 
feasible 
A whole new ballpark opening 
up many exciting 
applications—security, cloud, 
architecture, . . . 
Current overhead of 6-14% 
Negligible for some 
workloads, not yet for others 
Work in progress—expect at 
most 5% eventually 
Code is available 
It’s turtles all the way down 
Ben-Yehuda et al. (IBM Research) The Turtles Project: Nested Virtualization OSDI ’10 21 / 22
Conclusions 
Efficient nested x86 
virtualization is challenging but 
feasible 
A whole new ballpark opening 
up many exciting 
applications—security, cloud, 
architecture, . . . 
Current overhead of 6-14% 
Negligible for some 
workloads, not yet for others 
Work in progress—expect at 
most 5% eventually 
Code is available 
It’s turtles all the way down 
Ben-Yehuda et al. (IBM Research) The Turtles Project: Nested Virtualization OSDI ’10 21 / 22
Conclusions 
Efficient nested x86 
virtualization is challenging but 
feasible 
A whole new ballpark opening 
up many exciting 
applications—security, cloud, 
architecture, . . . 
Current overhead of 6-14% 
Negligible for some 
workloads, not yet for others 
Work in progress—expect at 
most 5% eventually 
Code is available 
It’s turtles all the way down 
Ben-Yehuda et al. (IBM Research) The Turtles Project: Nested Virtualization OSDI ’10 21 / 22
Conclusions 
Efficient nested x86 
virtualization is challenging but 
feasible 
A whole new ballpark opening 
up many exciting 
applications—security, cloud, 
architecture, . . . 
Current overhead of 6-14% 
Negligible for some 
workloads, not yet for others 
Work in progress—expect at 
most 5% eventually 
Code is available 
It’s turtles all the way down 
Ben-Yehuda et al. (IBM Research) The Turtles Project: Nested Virtualization OSDI ’10 21 / 22
Questions? 
Ben-Yehuda et al. (IBM Research) The Turtles Project: Nested Virtualization OSDI ’10 22 / 22

More Related Content

What's hot

Cloudstack collaboration conference Europe - SDN and Devops
Cloudstack collaboration conference Europe - SDN and DevopsCloudstack collaboration conference Europe - SDN and Devops
Cloudstack collaboration conference Europe - SDN and Devops
John Willis
 
Linux Container Technology 101
Linux Container Technology 101Linux Container Technology 101
Linux Container Technology 101
inside-BigData.com
 
Windows Internals for Linux Kernel Developers
Windows Internals for Linux Kernel DevelopersWindows Internals for Linux Kernel Developers
Windows Internals for Linux Kernel Developers
Kernel TLV
 
JAVA AND ANDROID OS_PRESENTATION
JAVA AND ANDROID OS_PRESENTATIONJAVA AND ANDROID OS_PRESENTATION
JAVA AND ANDROID OS_PRESENTATION
Benjamin Agboola
 
Virtualization - Kernel Virtual Machine (KVM)
Virtualization - Kernel Virtual Machine (KVM)Virtualization - Kernel Virtual Machine (KVM)
Virtualization - Kernel Virtual Machine (KVM)
Wan Leung Wong
 
BlueHat v18 || Memory resident implants - code injection is alive and well
BlueHat v18 || Memory resident implants - code injection is alive and wellBlueHat v18 || Memory resident implants - code injection is alive and well
BlueHat v18 || Memory resident implants - code injection is alive and well
BlueHat Security Conference
 

What's hot (19)

Docker en kernel security
Docker en kernel securityDocker en kernel security
Docker en kernel security
 
sVirt: Hardening Linux Virtualization with Mandatory Access Control
sVirt: Hardening Linux Virtualization with Mandatory Access ControlsVirt: Hardening Linux Virtualization with Mandatory Access Control
sVirt: Hardening Linux Virtualization with Mandatory Access Control
 
Cloudstack collaboration conference Europe - SDN and Devops
Cloudstack collaboration conference Europe - SDN and DevopsCloudstack collaboration conference Europe - SDN and Devops
Cloudstack collaboration conference Europe - SDN and Devops
 
Security of Linux containers in the cloud
Security of Linux containers in the cloudSecurity of Linux containers in the cloud
Security of Linux containers in the cloud
 
Linux Container Technology 101
Linux Container Technology 101Linux Container Technology 101
Linux Container Technology 101
 
Containers and Cloud: From LXC to Docker to Kubernetes
Containers and Cloud: From LXC to Docker to KubernetesContainers and Cloud: From LXC to Docker to Kubernetes
Containers and Cloud: From LXC to Docker to Kubernetes
 
Docker introduction
Docker introductionDocker introduction
Docker introduction
 
Running Applications on the NetBSD Rump Kernel by Justin Cormack
Running Applications on the NetBSD Rump Kernel by Justin Cormack Running Applications on the NetBSD Rump Kernel by Justin Cormack
Running Applications on the NetBSD Rump Kernel by Justin Cormack
 
Introduction to Docker, December 2014 "Tour de France" Edition
Introduction to Docker, December 2014 "Tour de France" EditionIntroduction to Docker, December 2014 "Tour de France" Edition
Introduction to Docker, December 2014 "Tour de France" Edition
 
Windows Internals for Linux Kernel Developers
Windows Internals for Linux Kernel DevelopersWindows Internals for Linux Kernel Developers
Windows Internals for Linux Kernel Developers
 
JAVA AND ANDROID OS_PRESENTATION
JAVA AND ANDROID OS_PRESENTATIONJAVA AND ANDROID OS_PRESENTATION
JAVA AND ANDROID OS_PRESENTATION
 
Locally run a FIWARE Lab Instance In another Hypervisors
Locally run a FIWARE Lab Instance In another HypervisorsLocally run a FIWARE Lab Instance In another Hypervisors
Locally run a FIWARE Lab Instance In another Hypervisors
 
"Lightweight Virtualization with Linux Containers and Docker". Jerome Petazzo...
"Lightweight Virtualization with Linux Containers and Docker". Jerome Petazzo..."Lightweight Virtualization with Linux Containers and Docker". Jerome Petazzo...
"Lightweight Virtualization with Linux Containers and Docker". Jerome Petazzo...
 
Virtualization - Kernel Virtual Machine (KVM)
Virtualization - Kernel Virtual Machine (KVM)Virtualization - Kernel Virtual Machine (KVM)
Virtualization - Kernel Virtual Machine (KVM)
 
Introduction to Docker (as presented at December 2013 Global Hackathon)
Introduction to Docker (as presented at December 2013 Global Hackathon)Introduction to Docker (as presented at December 2013 Global Hackathon)
Introduction to Docker (as presented at December 2013 Global Hackathon)
 
Learn OpenStack from trystack.cn ——Folsom in practice
Learn OpenStack from trystack.cn  ——Folsom in practiceLearn OpenStack from trystack.cn  ——Folsom in practice
Learn OpenStack from trystack.cn ——Folsom in practice
 
Lightweight Virtualization with Linux Containers and Docker | YaC 2013
Lightweight Virtualization with Linux Containers and Docker | YaC 2013Lightweight Virtualization with Linux Containers and Docker | YaC 2013
Lightweight Virtualization with Linux Containers and Docker | YaC 2013
 
PIC your malware
PIC your malwarePIC your malware
PIC your malware
 
BlueHat v18 || Memory resident implants - code injection is alive and well
BlueHat v18 || Memory resident implants - code injection is alive and wellBlueHat v18 || Memory resident implants - code injection is alive and well
BlueHat v18 || Memory resident implants - code injection is alive and well
 

Similar to Ben yehuda

Nova for Physicalization and Virtualization compute models
Nova for Physicalization and Virtualization compute modelsNova for Physicalization and Virtualization compute models
Nova for Physicalization and Virtualization compute models
openstackindia
 
SYSAD323 Virtualization Basics
SYSAD323 Virtualization BasicsSYSAD323 Virtualization Basics
SYSAD323 Virtualization Basics
Don Bosco BSIT
 

Similar to Ben yehuda (20)

Nova for Physicalization and Virtualization compute models
Nova for Physicalization and Virtualization compute modelsNova for Physicalization and Virtualization compute models
Nova for Physicalization and Virtualization compute models
 
open source virtualization
open source virtualizationopen source virtualization
open source virtualization
 
Virtual machines and containers
Virtual machines and containersVirtual machines and containers
Virtual machines and containers
 
Proxmox for DevOps
Proxmox for DevOpsProxmox for DevOps
Proxmox for DevOps
 
Virtual Container - Docker
Virtual Container - Docker Virtual Container - Docker
Virtual Container - Docker
 
Linux virtualization
Linux virtualizationLinux virtualization
Linux virtualization
 
DPDK Summit - 08 Sept 2014 - Futurewei - Jun Xu - Revisit the IP Stack in Lin...
DPDK Summit - 08 Sept 2014 - Futurewei - Jun Xu - Revisit the IP Stack in Lin...DPDK Summit - 08 Sept 2014 - Futurewei - Jun Xu - Revisit the IP Stack in Lin...
DPDK Summit - 08 Sept 2014 - Futurewei - Jun Xu - Revisit the IP Stack in Lin...
 
Unikernels: the rise of the library hypervisor in MirageOS
Unikernels: the rise of the library hypervisor in MirageOSUnikernels: the rise of the library hypervisor in MirageOS
Unikernels: the rise of the library hypervisor in MirageOS
 
Unikernels: Rise of the Library Hypervisor
Unikernels: Rise of the Library HypervisorUnikernels: Rise of the Library Hypervisor
Unikernels: Rise of the Library Hypervisor
 
20150531 virtualizatino station 2.0 partner's day
20150531 virtualizatino station 2.0 partner's day20150531 virtualizatino station 2.0 partner's day
20150531 virtualizatino station 2.0 partner's day
 
F19 slidedeck (OpenStack^H^H^H^Hhift, what the)
F19 slidedeck (OpenStack^H^H^H^Hhift, what the)F19 slidedeck (OpenStack^H^H^H^Hhift, what the)
F19 slidedeck (OpenStack^H^H^H^Hhift, what the)
 
Handout2o
Handout2oHandout2o
Handout2o
 
Noah - Robust and Flexible Operating System Compatibility Architecture - Cont...
Noah - Robust and Flexible Operating System Compatibility Architecture - Cont...Noah - Robust and Flexible Operating System Compatibility Architecture - Cont...
Noah - Robust and Flexible Operating System Compatibility Architecture - Cont...
 
RunningFreeBSDonLinuxKVM
RunningFreeBSDonLinuxKVMRunningFreeBSDonLinuxKVM
RunningFreeBSDonLinuxKVM
 
Introduction to Linux for Windows Users
Introduction to Linux for Windows UsersIntroduction to Linux for Windows Users
Introduction to Linux for Windows Users
 
Lightning talk unikernels
Lightning talk unikernelsLightning talk unikernels
Lightning talk unikernels
 
Hyper-V support for OpenStack Grizzly
Hyper-V support for OpenStack GrizzlyHyper-V support for OpenStack Grizzly
Hyper-V support for OpenStack Grizzly
 
Virtualization concept slideshare
Virtualization concept slideshareVirtualization concept slideshare
Virtualization concept slideshare
 
SYSAD323 Virtualization Basics
SYSAD323 Virtualization BasicsSYSAD323 Virtualization Basics
SYSAD323 Virtualization Basics
 
Virtualization technolegys for amdocs
Virtualization technolegys for amdocsVirtualization technolegys for amdocs
Virtualization technolegys for amdocs
 

Recently uploaded

Recently uploaded (20)

WSO2CON 2024 - Unlocking the Identity: Embracing CIAM 2.0 for a Competitive A...
WSO2CON 2024 - Unlocking the Identity: Embracing CIAM 2.0 for a Competitive A...WSO2CON 2024 - Unlocking the Identity: Embracing CIAM 2.0 for a Competitive A...
WSO2CON 2024 - Unlocking the Identity: Embracing CIAM 2.0 for a Competitive A...
 
WSO2CON 2024 - How CSI Piemonte Is Apifying the Public Administration
WSO2CON 2024 - How CSI Piemonte Is Apifying the Public AdministrationWSO2CON 2024 - How CSI Piemonte Is Apifying the Public Administration
WSO2CON 2024 - How CSI Piemonte Is Apifying the Public Administration
 
WSO2CON 2024 - Building the API First Enterprise – Running an API Program, fr...
WSO2CON 2024 - Building the API First Enterprise – Running an API Program, fr...WSO2CON 2024 - Building the API First Enterprise – Running an API Program, fr...
WSO2CON 2024 - Building the API First Enterprise – Running an API Program, fr...
 
The mythical technical debt. (Brooke, please, forgive me)
The mythical technical debt. (Brooke, please, forgive me)The mythical technical debt. (Brooke, please, forgive me)
The mythical technical debt. (Brooke, please, forgive me)
 
Evolving Data Governance for the Real-time Streaming and AI Era
Evolving Data Governance for the Real-time Streaming and AI EraEvolving Data Governance for the Real-time Streaming and AI Era
Evolving Data Governance for the Real-time Streaming and AI Era
 
WSO2CON 2024 - Not Just Microservices: Rightsize Your Services!
WSO2CON 2024 - Not Just Microservices: Rightsize Your Services!WSO2CON 2024 - Not Just Microservices: Rightsize Your Services!
WSO2CON 2024 - Not Just Microservices: Rightsize Your Services!
 
WSO2Con2024 - Hello Choreo Presentation - Kanchana
WSO2Con2024 - Hello Choreo Presentation - KanchanaWSO2Con2024 - Hello Choreo Presentation - Kanchana
WSO2Con2024 - Hello Choreo Presentation - Kanchana
 
WSO2Con2024 - GitOps in Action: Navigating Application Deployment in the Plat...
WSO2Con2024 - GitOps in Action: Navigating Application Deployment in the Plat...WSO2Con2024 - GitOps in Action: Navigating Application Deployment in the Plat...
WSO2Con2024 - GitOps in Action: Navigating Application Deployment in the Plat...
 
WSO2CON 2024 - Software Engineering for Digital Businesses
WSO2CON 2024 - Software Engineering for Digital BusinessesWSO2CON 2024 - Software Engineering for Digital Businesses
WSO2CON 2024 - Software Engineering for Digital Businesses
 
WSO2CON 2024 - Does Open Source Still Matter?
WSO2CON 2024 - Does Open Source Still Matter?WSO2CON 2024 - Does Open Source Still Matter?
WSO2CON 2024 - Does Open Source Still Matter?
 
WSO2Con2024 - From Blueprint to Brilliance: WSO2's Guide to API-First Enginee...
WSO2Con2024 - From Blueprint to Brilliance: WSO2's Guide to API-First Enginee...WSO2Con2024 - From Blueprint to Brilliance: WSO2's Guide to API-First Enginee...
WSO2Con2024 - From Blueprint to Brilliance: WSO2's Guide to API-First Enginee...
 
Devoxx UK 2024 - Going serverless with Quarkus, GraalVM native images and AWS...
Devoxx UK 2024 - Going serverless with Quarkus, GraalVM native images and AWS...Devoxx UK 2024 - Going serverless with Quarkus, GraalVM native images and AWS...
Devoxx UK 2024 - Going serverless with Quarkus, GraalVM native images and AWS...
 
WSO2CON 2024 - Designing Event-Driven Enterprises: Stories of Transformation
WSO2CON 2024 - Designing Event-Driven Enterprises: Stories of TransformationWSO2CON 2024 - Designing Event-Driven Enterprises: Stories of Transformation
WSO2CON 2024 - Designing Event-Driven Enterprises: Stories of Transformation
 
Driving Innovation: Scania's API Revolution with WSO2
Driving Innovation: Scania's API Revolution with WSO2Driving Innovation: Scania's API Revolution with WSO2
Driving Innovation: Scania's API Revolution with WSO2
 
WSO2Con2024 - Software Delivery in Hybrid Environments
WSO2Con2024 - Software Delivery in Hybrid EnvironmentsWSO2Con2024 - Software Delivery in Hybrid Environments
WSO2Con2024 - Software Delivery in Hybrid Environments
 
WSO2CON 2024 - Cloud Native Middleware: Domain-Driven Design, Cell-Based Arch...
WSO2CON 2024 - Cloud Native Middleware: Domain-Driven Design, Cell-Based Arch...WSO2CON 2024 - Cloud Native Middleware: Domain-Driven Design, Cell-Based Arch...
WSO2CON 2024 - Cloud Native Middleware: Domain-Driven Design, Cell-Based Arch...
 
Novo Nordisk: When Knowledge Graphs meet LLMs
Novo Nordisk: When Knowledge Graphs meet LLMsNovo Nordisk: When Knowledge Graphs meet LLMs
Novo Nordisk: When Knowledge Graphs meet LLMs
 
WSO2CON2024 - Why Should You Consider Ballerina for Your Next Integration
WSO2CON2024 - Why Should You Consider Ballerina for Your Next IntegrationWSO2CON2024 - Why Should You Consider Ballerina for Your Next Integration
WSO2CON2024 - Why Should You Consider Ballerina for Your Next Integration
 
WSO2CON 2024 Slides - Unlocking Value with AI
WSO2CON 2024 Slides - Unlocking Value with AIWSO2CON 2024 Slides - Unlocking Value with AI
WSO2CON 2024 Slides - Unlocking Value with AI
 
WSO2CON 2024 - Freedom First—Unleashing Developer Potential with Open Source
WSO2CON 2024 - Freedom First—Unleashing Developer Potential with Open SourceWSO2CON 2024 - Freedom First—Unleashing Developer Potential with Open Source
WSO2CON 2024 - Freedom First—Unleashing Developer Potential with Open Source
 

Ben yehuda

  • 1. The Turtles Project: Design and Implementation of Nested Virtualization Muli Ben-Yehuday Michael D. Dayz Zvi Dubitzkyy Michael Factory Nadav Har’Ely Abel Gordony Anthony Liguoriz Orit Wassermany Ben-Ami Yassoury yIBM Research – Haifa zIBM Linux Technology Center Ben-Yehuda et al. (IBM Research) The Turtles Project: Nested Virtualization OSDI ’10 1 / 22
  • 2. What is nested x86 virtualization? Running multiple unmodified hypervisors With their associated unmodified VM’s Guest Simultaneously Hypervisor On the x86 architecture Which does not support nesting in hardware. . . . . . but does support a single level of virtualization Hardware Hypervisor Guest OS Guest OS Guest OS Ben-Yehuda et al. (IBM Research) The Turtles Project: Nested Virtualization OSDI ’10 2 / 22
  • 3. Why? Operating systems are already hypervisors (Windows 7 with XP mode, Linux/KVM) To be able to run other hypervisors in clouds Security (e.g., hypervisor-level rootkits) Co-design of x86 hardware and system software Testing, demonstrating, debugging, live migration of hypervisors Ben-Yehuda et al. (IBM Research) The Turtles Project: Nested Virtualization OSDI ’10 3 / 22
  • 4. Related work First models for nested virtualization [PopekGoldberg74, BelpaireHsu75, LauerWyeth73] First implementation in the IBM z/VM; relies on architectural support for nested virtualization (sie) Microkernels meet recursive VMs [FordHibler96]: assumes we can modify software at all levels x86 software based approaches (slow!) [Berghmans10] KVM [KivityKamay07] with AMD SVM [RoedelGraf09] Early Xen prototype [He09] Blue Pill rootkit hiding from other hypervisors [Rutkowska06] Ben-Yehuda et al. (IBM Research) The Turtles Project: Nested Virtualization OSDI ’10 4 / 22
  • 5. What is the Turtles project? Efficient nested virtualization for Intel x86 based on KVM Multiple guest hypervisors and VMs: VMware, Windows, . . . Code publicly available Ben-Yehuda et al. (IBM Research) The Turtles Project: Nested Virtualization OSDI ’10 5 / 22
  • 6. What is the Turtles project? (cont’) Nested VMX virtualization for nested CPU virtualization Multi-dimensional paging for nested MMU virtualization Multi-level device assignment for nested I/O virtualization Micro-optimizations to make it go fast (see paper) + + = Ben-Yehuda et al. (IBM Research) The Turtles Project: Nested Virtualization OSDI ’10 6 / 22
  • 7. Theory of nested CPU virtualization Single-level architectural support (x86) vs. multi-level architectural support (e.g., z/VM) Single level ) one hypervisor, many guests Turtles approach: L0 multiplexes the hardware between L1 and L2, running both as guests of L0—without either being aware of it (Scheme generalized for n levels; Our focus is n=2) Guest Host Hypervisor Hardware Host Hypervisor Hardware Multiple logical levels Multiplexed on a single level L1 L0 L2 L1 Guest L2 Guest L2 L0 Guest L2 Guest Hypervisor Guest Hypervisor Guest Guest Ben-Yehuda et al. (IBM Research) The Turtles Project: Nested Virtualization OSDI ’10 7 / 22
  • 8. Theory of nested CPU virtualization Single-level architectural support (x86) vs. multi-level architectural support (e.g., z/VM) Single level ) one hypervisor, many guests Turtles approach: L0 multiplexes the hardware between L1 and L2, running both as guests of L0—without either being aware of it (Scheme generalized for n levels; Our focus is n=2) Guest Host Hypervisor Hardware Host Hypervisor Hardware Multiple logical levels Multiplexed on a single level L1 L0 L2 L1 Guest L2 Guest L2 L0 Guest L2 Guest Hypervisor Guest Hypervisor Guest Guest Ben-Yehuda et al. (IBM Research) The Turtles Project: Nested Virtualization OSDI ’10 7 / 22
  • 9. Theory of nested CPU virtualization Single-level architectural support (x86) vs. multi-level architectural support (e.g., z/VM) Single level ) one hypervisor, many guests Turtles approach: L0 multiplexes the hardware between L1 and L2, running both as guests of L0—without either being aware of it (Scheme generalized for n levels; Our focus is n=2) Guest Host Hypervisor Hardware Host Hypervisor Hardware Multiple logical levels Multiplexed on a single level L1 L0 L2 L1 Guest L2 Guest L2 L0 Guest L2 Guest Hypervisor Guest Hypervisor Guest Guest Ben-Yehuda et al. (IBM Research) The Turtles Project: Nested Virtualization OSDI ’10 7 / 22
  • 10. Theory of nested CPU virtualization Single-level architectural support (x86) vs. multi-level architectural support (e.g., z/VM) Single level ) one hypervisor, many guests Turtles approach: L0 multiplexes the hardware between L1 and L2, running both as guests of L0—without either being aware of it (Scheme generalized for n levels; Our focus is n=2) Guest Host Hypervisor Hardware Host Hypervisor Hardware Multiple logical levels Multiplexed on a single level L1 L0 L2 L1 Guest L2 Guest L2 L0 Guest L2 Guest Hypervisor Guest Hypervisor Guest Guest Ben-Yehuda et al. (IBM Research) The Turtles Project: Nested Virtualization OSDI ’10 7 / 22
  • 11. Nested VMX virtualization: flow L0 runs L1 with VMCS0!1 L1 prepares VMCS1!2 and executes vmlaunch vmlaunch traps to L0 L0 merges VMCS’s: VMCS0!1 merged with VMCS1!2 is VMCS0!2 L0 launches L2 L2 causes a trap L0 handles trap itself or forwards it to L1 . . . eventually, L0 resumes L2 repeat L1 L2 Host Hypervisor Hardware Guest OS Guest Hypervisor VMCS Memory VMCS Tables Memory Tables VMCS Memory Tables L0 1-2 State 0-1 State 0-2 State Ben-Yehuda et al. (IBM Research) The Turtles Project: Nested Virtualization OSDI ’10 8 / 22
  • 12. Nested VMX virtualization: flow L0 runs L1 with VMCS0!1 L1 prepares VMCS1!2 and executes vmlaunch vmlaunch traps to L0 L0 merges VMCS’s: VMCS0!1 merged with VMCS1!2 is VMCS0!2 L0 launches L2 L2 causes a trap L0 handles trap itself or forwards it to L1 . . . eventually, L0 resumes L2 repeat L1 L2 Host Hypervisor Hardware Guest OS Guest Hypervisor VMCS Memory VMCS Tables Memory Tables VMCS Memory Tables L0 1-2 State 0-1 State 0-2 State Ben-Yehuda et al. (IBM Research) The Turtles Project: Nested Virtualization OSDI ’10 8 / 22
  • 13. Nested VMX virtualization: flow L0 runs L1 with VMCS0!1 L1 prepares VMCS1!2 and executes vmlaunch vmlaunch traps to L0 L0 merges VMCS’s: VMCS0!1 merged with VMCS1!2 is VMCS0!2 L0 launches L2 L2 causes a trap L0 handles trap itself or forwards it to L1 . . . eventually, L0 resumes L2 repeat L1 L2 Host Hypervisor Hardware Guest OS Guest Hypervisor VMCS Memory VMCS Tables Memory Tables VMCS Memory Tables L0 1-2 State 0-1 State 0-2 State Ben-Yehuda et al. (IBM Research) The Turtles Project: Nested Virtualization OSDI ’10 8 / 22
  • 14. Nested VMX virtualization: flow L0 runs L1 with VMCS0!1 L1 prepares VMCS1!2 and executes vmlaunch vmlaunch traps to L0 L0 merges VMCS’s: VMCS0!1 merged with VMCS1!2 is VMCS0!2 L0 launches L2 L2 causes a trap L0 handles trap itself or forwards it to L1 . . . eventually, L0 resumes L2 repeat L1 L2 Host Hypervisor Hardware Guest OS Guest Hypervisor VMCS Memory VMCS Tables Memory Tables VMCS Memory Tables L0 1-2 State 0-1 State 0-2 State Ben-Yehuda et al. (IBM Research) The Turtles Project: Nested Virtualization OSDI ’10 8 / 22
  • 15. Nested VMX virtualization: flow L0 runs L1 with VMCS0!1 L1 prepares VMCS1!2 and executes vmlaunch vmlaunch traps to L0 L0 merges VMCS’s: VMCS0!1 merged with VMCS1!2 is VMCS0!2 L0 launches L2 L2 causes a trap L0 handles trap itself or forwards it to L1 . . . eventually, L0 resumes L2 repeat L1 L2 Host Hypervisor Hardware Guest OS Guest Hypervisor VMCS Memory VMCS Tables Memory Tables VMCS Memory Tables L0 1-2 State 0-1 State 0-2 State Ben-Yehuda et al. (IBM Research) The Turtles Project: Nested Virtualization OSDI ’10 8 / 22
  • 16. Nested VMX virtualization: flow L0 runs L1 with VMCS0!1 L1 prepares VMCS1!2 and executes vmlaunch vmlaunch traps to L0 L0 merges VMCS’s: VMCS0!1 merged with VMCS1!2 is VMCS0!2 L0 launches L2 L2 causes a trap L0 handles trap itself or forwards it to L1 . . . eventually, L0 resumes L2 repeat L1 L2 Host Hypervisor Hardware Guest OS Guest Hypervisor VMCS Memory VMCS Tables Memory Tables VMCS Memory Tables L0 1-2 State 0-1 State 0-2 State Ben-Yehuda et al. (IBM Research) The Turtles Project: Nested Virtualization OSDI ’10 8 / 22
  • 17. Nested VMX virtualization: flow L0 runs L1 with VMCS0!1 L1 prepares VMCS1!2 and executes vmlaunch vmlaunch traps to L0 L0 merges VMCS’s: VMCS0!1 merged with VMCS1!2 is VMCS0!2 L0 launches L2 L2 causes a trap L0 handles trap itself or forwards it to L1 . . . eventually, L0 resumes L2 repeat L1 L2 Host Hypervisor Hardware Guest OS Guest Hypervisor VMCS Memory VMCS Tables Memory Tables VMCS Memory Tables L0 1-2 State 0-1 State 0-2 State Ben-Yehuda et al. (IBM Research) The Turtles Project: Nested Virtualization OSDI ’10 8 / 22
  • 18. Nested VMX virtualization: flow L0 runs L1 with VMCS0!1 L1 prepares VMCS1!2 and executes vmlaunch vmlaunch traps to L0 L0 merges VMCS’s: VMCS0!1 merged with VMCS1!2 is VMCS0!2 L0 launches L2 L2 causes a trap L0 handles trap itself or forwards it to L1 . . . eventually, L0 resumes L2 repeat L1 L2 Host Hypervisor Hardware Guest OS Guest Hypervisor VMCS Memory VMCS Tables Memory Tables VMCS Memory Tables L0 1-2 State 0-1 State 0-2 State Ben-Yehuda et al. (IBM Research) The Turtles Project: Nested Virtualization OSDI ’10 8 / 22
  • 19. Nested VMX virtualization: flow L0 runs L1 with VMCS0!1 L1 prepares VMCS1!2 and executes vmlaunch vmlaunch traps to L0 L0 merges VMCS’s: VMCS0!1 merged with VMCS1!2 is VMCS0!2 L0 launches L2 L2 causes a trap L0 handles trap itself or forwards it to L1 . . . eventually, L0 resumes L2 repeat L1 L2 Host Hypervisor Hardware Guest OS Guest Hypervisor VMCS Memory VMCS Tables Memory Tables VMCS Memory Tables L0 1-2 State 0-1 State 0-2 State Ben-Yehuda et al. (IBM Research) The Turtles Project: Nested Virtualization OSDI ’10 8 / 22
  • 20. Nested VMX virtualization: flow L0 runs L1 with VMCS0!1 L1 prepares VMCS1!2 and executes vmlaunch vmlaunch traps to L0 L0 merges VMCS’s: VMCS0!1 merged with VMCS1!2 is VMCS0!2 L0 launches L2 L2 causes a trap L0 handles trap itself or forwards it to L1 . . . eventually, L0 resumes L2 repeat L1 L2 Host Hypervisor Hardware Guest OS Guest Hypervisor VMCS Memory VMCS Tables Memory Tables VMCS Memory Tables L0 1-2 State 0-1 State 0-2 State Ben-Yehuda et al. (IBM Research) The Turtles Project: Nested Virtualization OSDI ’10 8 / 22
  • 21. Exit multiplication makes angry turtle angry To handle a single L2 exit, L1 does many things: read and write the VMCS, disable interrupts, . . . Those operations can trap, leading to exit multiplication Exit multiplication: a single L2 exit can cause 40-50 L1 exits! Optimize: make a single exit fast and reduce frequency of exits … … … … L3 L2 L1 L0 Single Two Three Levels Level Levels Ben-Yehuda et al. (IBM Research) The Turtles Project: Nested Virtualization OSDI ’10 9 / 22
  • 22. Exit multiplication makes angry turtle angry To handle a single L2 exit, L1 does many things: read and write the VMCS, disable interrupts, . . . Those operations can trap, leading to exit multiplication Exit multiplication: a single L2 exit can cause 40-50 L1 exits! Optimize: make a single exit fast and reduce frequency of exits … … … … L3 L2 L1 L0 Single Two Three Levels Level Levels Ben-Yehuda et al. (IBM Research) The Turtles Project: Nested Virtualization OSDI ’10 9 / 22
  • 23. Exit multiplication makes angry turtle angry To handle a single L2 exit, L1 does many things: read and write the VMCS, disable interrupts, . . . Those operations can trap, leading to exit multiplication Exit multiplication: a single L2 exit can cause 40-50 L1 exits! Optimize: make a single exit fast and reduce frequency of exits … … … … L3 L2 L1 L0 Single Two Three Levels Level Levels Ben-Yehuda et al. (IBM Research) The Turtles Project: Nested Virtualization OSDI ’10 9 / 22
  • 24. Exit multiplication makes angry turtle angry To handle a single L2 exit, L1 does many things: read and write the VMCS, disable interrupts, . . . Those operations can trap, leading to exit multiplication Exit multiplication: a single L2 exit can cause 40-50 L1 exits! Optimize: make a single exit fast and reduce frequency of exits … … … … L3 L2 L1 L0 Single Two Three Levels Level Levels Ben-Yehuda et al. (IBM Research) The Turtles Project: Nested Virtualization OSDI ’10 9 / 22
  • 25. MMU virtualization via multi-dimensional paging Three logical translations: L2 virt ! phys, L2 ! L1, L1 ! L0 Only two tables in hardware with EPT: virt ! phys and guest physical ! host physical L0 compresses three logical translations onto two hardware tables SPT12 Shadow on top of L2 virtual L2 physical L1 physical L0 physical GPT L2 virtual L2 physical L1 physical L0 physical GPT SPT02 shadow SPT12 EPT01 Multi-dimensional L2 virtual L2 physical L1 physical L0 physical GPT EPT02 EPT12 paging Shadow on top of EPT EPT01 baseline better best Ben-Yehuda et al. (IBM Research) The Turtles Project: Nested Virtualization OSDI ’10 10 / 22
  • 26. Baseline: shadow-on-shadow SPT12 L2 virtual L2 physical L1 physical L0 physical GPT SPT02 Assume no EPT table; all hypervisors use shadow paging Useful for old machines and as a baseline Maintaining shadow page tables is expensive Compress: three logical translations ) one table in hardware Ben-Yehuda et al. (IBM Research) The Turtles Project: Nested Virtualization OSDI ’10 11 / 22
  • 27. Better: shadow-on-EPT L2 virtual L2 physical L1 physical L0 physical SPT12 EPT01 GPT Instead of one hardware table we have two Compress: three logical translations ) two in hardware Simple approach: L0 uses EPT, L1 uses shadow paging for L2 Every L2 page fault leads to multiple L1 exits Ben-Yehuda et al. (IBM Research) The Turtles Project: Nested Virtualization OSDI ’10 12 / 22
  • 28. Best: multi-dimensional paging L2 virtual L2 physical L1 physical L0 physical GPT EPT02 EPT12 EPT01 EPT table rarely changes; guest page table changes a lot Again, compress three logical translations ) two in hardware L0 emulates EPT for L1 L0 uses EPT0!1 and EPT1!2 to construct EPT0!2 End result: a lot less exits! Ben-Yehuda et al. (IBM Research) The Turtles Project: Nested Virtualization OSDI ’10 13 / 22
  • 29. Introduction to I/O virtualization Device emulation [Sugerman01] HOST GUEST 1 2 device 4 3 device driver device emulation driver Ben-Yehuda et al. (IBM Research) The Turtles Project: Nested Virtualization OSDI ’10 14 / 22
  • 30. Introduction to I/O virtualization Device emulation [Sugerman01] HOST GUEST 1 2 device 4 3 device driver device emulation driver Para-virtualized drivers [Barham03, Russell08] HOST GUEST front−end virtual driver 1 driver 3 2 back−end device virtual driver Ben-Yehuda et al. (IBM Research) The Turtles Project: Nested Virtualization OSDI ’10 14 / 22
  • 31. Introduction to I/O virtualization Device emulation [Sugerman01] HOST GUEST 1 2 device 4 3 device driver device emulation driver Para-virtualized drivers [Barham03, Russell08] HOST GUEST front−end virtual driver 1 driver 3 2 back−end device virtual driver Direct device assignment [Levasseur04,Yassour08] HOST GUEST device driver Ben-Yehuda et al. (IBM Research) The Turtles Project: Nested Virtualization OSDI ’10 14 / 22
  • 32. Introduction to I/O virtualization Device emulation [Sugerman01] HOST GUEST 1 2 device 4 3 device driver device emulation driver Para-virtualized drivers [Barham03, Russell08] HOST GUEST front−end virtual driver 1 driver 3 2 back−end device virtual driver Direct device assignment [Levasseur04,Yassour08] HOST GUEST device driver Direct assignment best performing option Ben-Yehuda et al. (IBM Research) The Turtles Project: Nested Virtualization OSDI ’10 14 / 22
  • 33. Introduction to I/O virtualization Device emulation [Sugerman01] HOST GUEST 1 2 device 4 3 device driver device emulation driver Para-virtualized drivers [Barham03, Russell08] HOST GUEST front−end virtual driver 1 driver 3 2 back−end device virtual driver Direct device assignment [Levasseur04,Yassour08] HOST GUEST device driver Direct assignment best performing option Direct assignment requires IOMMU for safe DMA bypass Ben-Yehuda et al. (IBM Research) The Turtles Project: Nested Virtualization OSDI ’10 14 / 22
  • 34. Multi-level device assignment With nested 3x3 options for I/O virtualization (L2 , L1 , L0) Multi-level device assignment means giving an L2 guest direct access to L0’s devices, safely bypassing both L0 and L1 L1 hypervisor physical device L0 hypervisor L2 device driver MMIOs and PIOs L L0 IOMMU 1 IOMMU Device DMA via platform IOMMU How? L0 emulates an IOMMU for L1 [Amit10] L0 compresses multiple IOMMU translations onto the single hardware IOMMU page table L2 programs the device directly Device DMA’s into L2 memory space directly Ben-Yehuda et al. (IBM Research) The Turtles Project: Nested Virtualization OSDI ’10 15 / 22
  • 35. Multi-level device assignment With nested 3x3 options for I/O virtualization (L2 , L1 , L0) Multi-level device assignment means giving an L2 guest direct access to L0’s devices, safely bypassing both L0 and L1 L1 hypervisor physical device L0 hypervisor L2 device driver MMIOs and PIOs L L0 IOMMU 1 IOMMU Device DMA via platform IOMMU How? L0 emulates an IOMMU for L1 [Amit10] L0 compresses multiple IOMMU translations onto the single hardware IOMMU page table L2 programs the device directly Device DMA’s into L2 memory space directly Ben-Yehuda et al. (IBM Research) The Turtles Project: Nested Virtualization OSDI ’10 15 / 22
  • 36. Multi-level device assignment With nested 3x3 options for I/O virtualization (L2 , L1 , L0) Multi-level device assignment means giving an L2 guest direct access to L0’s devices, safely bypassing both L0 and L1 L1 hypervisor physical device L0 hypervisor L2 device driver MMIOs and PIOs L L0 IOMMU 1 IOMMU Device DMA via platform IOMMU How? L0 emulates an IOMMU for L1 [Amit10] L0 compresses multiple IOMMU translations onto the single hardware IOMMU page table L2 programs the device directly Device DMA’s into L2 memory space directly Ben-Yehuda et al. (IBM Research) The Turtles Project: Nested Virtualization OSDI ’10 15 / 22
  • 37. Multi-level device assignment With nested 3x3 options for I/O virtualization (L2 , L1 , L0) Multi-level device assignment means giving an L2 guest direct access to L0’s devices, safely bypassing both L0 and L1 L1 hypervisor physical device L0 hypervisor L2 device driver MMIOs and PIOs L L0 IOMMU 1 IOMMU Device DMA via platform IOMMU How? L0 emulates an IOMMU for L1 [Amit10] L0 compresses multiple IOMMU translations onto the single hardware IOMMU page table L2 programs the device directly Device DMA’s into L2 memory space directly Ben-Yehuda et al. (IBM Research) The Turtles Project: Nested Virtualization OSDI ’10 15 / 22
  • 38. Multi-level device assignment With nested 3x3 options for I/O virtualization (L2 , L1 , L0) Multi-level device assignment means giving an L2 guest direct access to L0’s devices, safely bypassing both L0 and L1 L1 hypervisor physical device L0 hypervisor L2 device driver MMIOs and PIOs L L0 IOMMU 1 IOMMU Device DMA via platform IOMMU How? L0 emulates an IOMMU for L1 [Amit10] L0 compresses multiple IOMMU translations onto the single hardware IOMMU page table L2 programs the device directly Device DMA’s into L2 memory space directly Ben-Yehuda et al. (IBM Research) The Turtles Project: Nested Virtualization OSDI ’10 15 / 22
  • 39. Multi-level device assignment With nested 3x3 options for I/O virtualization (L2 , L1 , L0) Multi-level device assignment means giving an L2 guest direct access to L0’s devices, safely bypassing both L0 and L1 L1 hypervisor physical device L0 hypervisor L2 device driver MMIOs and PIOs L L0 IOMMU 1 IOMMU Device DMA via platform IOMMU How? L0 emulates an IOMMU for L1 [Amit10] L0 compresses multiple IOMMU translations onto the single hardware IOMMU page table L2 programs the device directly Device DMA’s into L2 memory space directly Ben-Yehuda et al. (IBM Research) The Turtles Project: Nested Virtualization OSDI ’10 15 / 22
  • 40. Experimental Setup Running Linux, Windows, KVM, VMware, SMP, . . . Macro workloads: kernbench SPECjbb netperf Multi-dimensional paging? Multi-level device assignment? KVM as L1 vs. VMware as L1? See paper for full experimental details, more benchmarks and analysis, including worst case synthetic micro-benchmark Ben-Yehuda et al. (IBM Research) The Turtles Project: Nested Virtualization OSDI ’10 16 / 22
  • 41. Macro: SPECjbb and kernbench kernbench Host Guest Nested NestedDRW Run time 324.3 355 406.3 391.5 % overhead vs. host - 9.5 25.3 20.7 % overhead vs. guest - - 14.5 10.3 SPECjbb Host Guest Nested NestedDRW Score 90493 83599 77065 78347 % degradation vs. host - 7.6 14.8 13.4 % degradation vs. guest - - 7.8 6.3 Table: kernbench and SPECjbb results Exit multiplication effect not as bad as we feared Direct vmread and vmwrite (DRW) give an immediate boost Take-away: each level of virtualization adds approximately the same overhead! Ben-Yehuda et al. (IBM Research) The Turtles Project: Nested Virtualization OSDI ’10 17 / 22
  • 42. Macro: multi-dimensional paging Shadow on EPT Multi−dimensional paging 3.5 3.0 2.5 2.0 1.5 1.0 0.5 0.0 kernbench specjbb netperf Improvement ratio Impact of multi-dimensional paging depends on rate of page faults Shadow-on-EPT: every L2 page fault causes L1 multiple exits Multi-dimensional paging: only EPT violations cause L1 exits EPT table rarely changes: #(EPT violations) << #(page faults) Multi-dimensional paging huge win for page-fault intensive kernbench Ben-Yehuda et al. (IBM Research) The Turtles Project: Nested Virtualization OSDI ’10 18 / 22
  • 43. Macro: multi-level device assignment throughput (Mbps) %cpu 1,000 900 800 700 600 500 400 300 200 100 0 native nested guest nested guest nested guest emulation / emulation single level guest single emulation level virtio guest single direct level access guest nested guest direct / virtio virtio / emulation virtio / virtio 100 80 60 40 20 0 nested guest direct / direct throughput (Mbps) % cpu Benchmark: netperf TCP_STREAM (transmit) Multi-level device assignment best performing option But: native at 20%, multi-level device assignment at 60% (x3!) Interrupts considered harmful, cause exit multiplication Ben-Yehuda et al. (IBM Research) The Turtles Project: Nested Virtualization OSDI ’10 19 / 22
  • 44. Macro: multi-level device assignment (sans interrupts) 1000 900 800 700 600 500 400 300 200 100 L0 (bare metal) L2 (direct/direct) L2 (direct/virtio) 16 32 64 128 256 512 Throughput (Mbps) Message size (netperf -m) What if we could deliver device interrupts directly to L2? Only 7% difference between native and nested guest! Ben-Yehuda et al. (IBM Research) The Turtles Project: Nested Virtualization OSDI ’10 20 / 22
  • 45. Conclusions Efficient nested x86 virtualization is challenging but feasible A whole new ballpark opening up many exciting applications—security, cloud, architecture, . . . Current overhead of 6-14% Negligible for some workloads, not yet for others Work in progress—expect at most 5% eventually Code is available It’s turtles all the way down Ben-Yehuda et al. (IBM Research) The Turtles Project: Nested Virtualization OSDI ’10 21 / 22
  • 46. Conclusions Efficient nested x86 virtualization is challenging but feasible A whole new ballpark opening up many exciting applications—security, cloud, architecture, . . . Current overhead of 6-14% Negligible for some workloads, not yet for others Work in progress—expect at most 5% eventually Code is available It’s turtles all the way down Ben-Yehuda et al. (IBM Research) The Turtles Project: Nested Virtualization OSDI ’10 21 / 22
  • 47. Conclusions Efficient nested x86 virtualization is challenging but feasible A whole new ballpark opening up many exciting applications—security, cloud, architecture, . . . Current overhead of 6-14% Negligible for some workloads, not yet for others Work in progress—expect at most 5% eventually Code is available It’s turtles all the way down Ben-Yehuda et al. (IBM Research) The Turtles Project: Nested Virtualization OSDI ’10 21 / 22
  • 48. Conclusions Efficient nested x86 virtualization is challenging but feasible A whole new ballpark opening up many exciting applications—security, cloud, architecture, . . . Current overhead of 6-14% Negligible for some workloads, not yet for others Work in progress—expect at most 5% eventually Code is available It’s turtles all the way down Ben-Yehuda et al. (IBM Research) The Turtles Project: Nested Virtualization OSDI ’10 21 / 22
  • 49. Conclusions Efficient nested x86 virtualization is challenging but feasible A whole new ballpark opening up many exciting applications—security, cloud, architecture, . . . Current overhead of 6-14% Negligible for some workloads, not yet for others Work in progress—expect at most 5% eventually Code is available It’s turtles all the way down Ben-Yehuda et al. (IBM Research) The Turtles Project: Nested Virtualization OSDI ’10 21 / 22
  • 50. Questions? Ben-Yehuda et al. (IBM Research) The Turtles Project: Nested Virtualization OSDI ’10 22 / 22