Towards a configurable and slimmer x86
hypervisor
Liu Wei
Budapest – July 11-13, 2017
Current state of affairs
PV mode: no hardware extension needed, used in legacy
systems, useful in certain cases like running unikernel and
nested-virt without vVMX or vSVM
HVM mode: needs hardware support and QEMU for
emulation, has become the mainstream Xen VM mode
PVH mode: essentially HVM without QEMU, under
development
Budapest – July 11-13, 2017 Towards a configurable and slimmer x86 hypervisor 2 / 19
Two (big) projects
Splitting PV and HVM code: Refactor x86 hypervisor code.
Make guest supporting code configurable via Kconfig
PV ABI in PVH container: Implement a PV ABI shim. Use it
to translate PV hypercalls into PVH ones when necessary
Budapest – July 11-13, 2017 Towards a configurable and slimmer x86 hypervisor 3 / 19
Why splitting PV and HVM code?
Users can pick and choose the guest interfaces
Smaller binary, smaller attack surface
Reclaim precious address space if PV is disabled, to let Xen
support >16TB host memory more easily
Improve x86 hypervisor code base
*NOT* intending to kill PV in the hypervisor
Budapest – July 11-13, 2017 Towards a configurable and slimmer x86 hypervisor 4 / 19
Xen x86 PV memory layout
Budapest – July 11-13, 2017 Towards a configurable and slimmer x86 hypervisor 5 / 19
The conceptual map of code
Budapest – July 11-13, 2017 Towards a configurable and slimmer x86 hypervisor 6 / 19
The reality
Budapest – July 11-13, 2017 Towards a configurable and slimmer x86 hypervisor 7 / 19
Current code
do foo ( . . . )
{
/∗ . . . ∗/
i f (hvm) {
do foo hvm ( ) ;
return ;
}
/∗ l o t s of code to do foo f o r pv ∗/
return ;
}
Budapest – July 11-13, 2017 Towards a configurable and slimmer x86 hypervisor 8 / 19
Current code
do bar ( . . . )
{
/∗ . . . ∗/
i f (hvm) { /∗ Some code ∗/ };
i f ( pv ) { /∗ Some code ∗/ };
/∗ l o t s of code f o r common case ∗/
i f (hvm) { /∗ Some code ∗/ };
i f ( pv ) { /∗ Some code ∗/ };
return ;
}
Budapest – July 11-13, 2017 Towards a configurable and slimmer x86 hypervisor 9 / 19
Future code
do baz ( . . . )
{
/∗ code f o r common case ∗/
i f (hvm)
do baz hvm ( ) ;
i f ( pv )
do baz pv ( ) ;
return ;
}
Budapest – July 11-13, 2017 Towards a configurable and slimmer x86 hypervisor 10 / 19
Game plan for splitting PV and HVM code
Identify all the components that need refactoring
Dom0 builder
Domain handling code
Trap handling code
Memory management code
Guest memory accessor
...
Budapest – July 11-13, 2017 Towards a configurable and slimmer x86 hypervisor 11 / 19
Game plan for splitting PV and HVM code
Coarse-grained refactoring – mostly for PV code
Move code around
Split code into manageable trunks
Do some basic cleanups:
Use better function names
Use better coding style
Budapest – July 11-13, 2017 Towards a configurable and slimmer x86 hypervisor 12 / 19
Game plan for splitting PV and HVM code
Fine-grained refactoring – for both PV and HVM code
Abstract out a set of guest interfaces
Adjust internal interfaces between components if necessary
Fix x86 common code
Make PV and HVM configurable
Budapest – July 11-13, 2017 Towards a configurable and slimmer x86 hypervisor 13 / 19
Why PV ABI in PVH container?
Continue to support PV in a more secure manner
Have more than 128GB worth of 32bit PV guests
Budapest – July 11-13, 2017 Towards a configurable and slimmer x86 hypervisor 14 / 19
PV ABI in PVH container
Budapest – July 11-13, 2017 Towards a configurable and slimmer x86 hypervisor 15 / 19
Game plan for PV ABI in PVH container
Build the PV shim – essentially a stripped-down Xen
hypervisor
Go through all PV hypercall handlers, categorize them into the
aforementeioned groups
Further refactor PV guest supporting code: provide the ”real
PV” handlers and ”PV shim” handlers while sharing as much
code as possible
Change the build system to pull in the right objects
Budapest – July 11-13, 2017 Towards a configurable and slimmer x86 hypervisor 16 / 19
Game plan for PV ABI in PVH container
Adjust Xen toolstack
Construct a PVH guest while using the PV shim as ”firmware”
Further improvements (open questions at the moment)
Provide mechanism to parse guest kernel inside the container,
something like pvgrub
Provide mechanism to pass-through PCI devices (if that’s still
relevant)
Budapest – July 11-13, 2017 Towards a configurable and slimmer x86 hypervisor 17 / 19
Current status
Doing coarse-grained refactoring:
Dom0 builder (done)
Domain handling code (done)
Trap handling code (done)
Memory management code (doing)
Guest memory accessor
...
ETA: Some point in the future :)
Budapest – July 11-13, 2017 Towards a configurable and slimmer x86 hypervisor 18 / 19
Q&A
Budapest – July 11-13, 2017 Towards a configurable and slimmer x86 hypervisor 19 / 19

XPDDS17: Keynote: Towards a Configurable and Slimmer x86 Hypervisor - Wei Liu, Citrix

  • 1.
    Towards a configurableand slimmer x86 hypervisor Liu Wei Budapest – July 11-13, 2017
  • 2.
    Current state ofaffairs PV mode: no hardware extension needed, used in legacy systems, useful in certain cases like running unikernel and nested-virt without vVMX or vSVM HVM mode: needs hardware support and QEMU for emulation, has become the mainstream Xen VM mode PVH mode: essentially HVM without QEMU, under development Budapest – July 11-13, 2017 Towards a configurable and slimmer x86 hypervisor 2 / 19
  • 3.
    Two (big) projects SplittingPV and HVM code: Refactor x86 hypervisor code. Make guest supporting code configurable via Kconfig PV ABI in PVH container: Implement a PV ABI shim. Use it to translate PV hypercalls into PVH ones when necessary Budapest – July 11-13, 2017 Towards a configurable and slimmer x86 hypervisor 3 / 19
  • 4.
    Why splitting PVand HVM code? Users can pick and choose the guest interfaces Smaller binary, smaller attack surface Reclaim precious address space if PV is disabled, to let Xen support >16TB host memory more easily Improve x86 hypervisor code base *NOT* intending to kill PV in the hypervisor Budapest – July 11-13, 2017 Towards a configurable and slimmer x86 hypervisor 4 / 19
  • 5.
    Xen x86 PVmemory layout Budapest – July 11-13, 2017 Towards a configurable and slimmer x86 hypervisor 5 / 19
  • 6.
    The conceptual mapof code Budapest – July 11-13, 2017 Towards a configurable and slimmer x86 hypervisor 6 / 19
  • 7.
    The reality Budapest –July 11-13, 2017 Towards a configurable and slimmer x86 hypervisor 7 / 19
  • 8.
    Current code do foo( . . . ) { /∗ . . . ∗/ i f (hvm) { do foo hvm ( ) ; return ; } /∗ l o t s of code to do foo f o r pv ∗/ return ; } Budapest – July 11-13, 2017 Towards a configurable and slimmer x86 hypervisor 8 / 19
  • 9.
    Current code do bar( . . . ) { /∗ . . . ∗/ i f (hvm) { /∗ Some code ∗/ }; i f ( pv ) { /∗ Some code ∗/ }; /∗ l o t s of code f o r common case ∗/ i f (hvm) { /∗ Some code ∗/ }; i f ( pv ) { /∗ Some code ∗/ }; return ; } Budapest – July 11-13, 2017 Towards a configurable and slimmer x86 hypervisor 9 / 19
  • 10.
    Future code do baz( . . . ) { /∗ code f o r common case ∗/ i f (hvm) do baz hvm ( ) ; i f ( pv ) do baz pv ( ) ; return ; } Budapest – July 11-13, 2017 Towards a configurable and slimmer x86 hypervisor 10 / 19
  • 11.
    Game plan forsplitting PV and HVM code Identify all the components that need refactoring Dom0 builder Domain handling code Trap handling code Memory management code Guest memory accessor ... Budapest – July 11-13, 2017 Towards a configurable and slimmer x86 hypervisor 11 / 19
  • 12.
    Game plan forsplitting PV and HVM code Coarse-grained refactoring – mostly for PV code Move code around Split code into manageable trunks Do some basic cleanups: Use better function names Use better coding style Budapest – July 11-13, 2017 Towards a configurable and slimmer x86 hypervisor 12 / 19
  • 13.
    Game plan forsplitting PV and HVM code Fine-grained refactoring – for both PV and HVM code Abstract out a set of guest interfaces Adjust internal interfaces between components if necessary Fix x86 common code Make PV and HVM configurable Budapest – July 11-13, 2017 Towards a configurable and slimmer x86 hypervisor 13 / 19
  • 14.
    Why PV ABIin PVH container? Continue to support PV in a more secure manner Have more than 128GB worth of 32bit PV guests Budapest – July 11-13, 2017 Towards a configurable and slimmer x86 hypervisor 14 / 19
  • 15.
    PV ABI inPVH container Budapest – July 11-13, 2017 Towards a configurable and slimmer x86 hypervisor 15 / 19
  • 16.
    Game plan forPV ABI in PVH container Build the PV shim – essentially a stripped-down Xen hypervisor Go through all PV hypercall handlers, categorize them into the aforementeioned groups Further refactor PV guest supporting code: provide the ”real PV” handlers and ”PV shim” handlers while sharing as much code as possible Change the build system to pull in the right objects Budapest – July 11-13, 2017 Towards a configurable and slimmer x86 hypervisor 16 / 19
  • 17.
    Game plan forPV ABI in PVH container Adjust Xen toolstack Construct a PVH guest while using the PV shim as ”firmware” Further improvements (open questions at the moment) Provide mechanism to parse guest kernel inside the container, something like pvgrub Provide mechanism to pass-through PCI devices (if that’s still relevant) Budapest – July 11-13, 2017 Towards a configurable and slimmer x86 hypervisor 17 / 19
  • 18.
    Current status Doing coarse-grainedrefactoring: Dom0 builder (done) Domain handling code (done) Trap handling code (done) Memory management code (doing) Guest memory accessor ... ETA: Some point in the future :) Budapest – July 11-13, 2017 Towards a configurable and slimmer x86 hypervisor 18 / 19
  • 19.
    Q&A Budapest – July11-13, 2017 Towards a configurable and slimmer x86 hypervisor 19 / 19