This talk will start with an overview of the x86 PVH Dom0 architecture, together with some basic background, so that the audience understands why Xen needs PVH and, moreover, why Xen needs a PVH Dom0 at all.
It will then dive into the hypervisor-side implementation, explaining how PVH Dom0 is implemented and how it interacts with the existing Xen internal interfaces and subsystems. Details about the current implementation status will also be provided, together with a roadmap of the missing bits.
After the talk the audience should have a good understanding of what this new PVH mode is and how it is implemented inside Xen.
XPDDS17: PVH Dom0: The Road so Far - Roger Pau Monné, Citrix
1. PVH Dom0: The Road so Far
Roger Pau Monné <roger.pau@citrix.com>
Budapest – July 12th, 2017
2. Current Dom0 interface
Due to the nature of the Xen architecture, an interface different from the native one is used in order to perform several tasks:
MMU and privileged instructions.
CPU handling.
Setup and delivery of interrupts.
ACPI tables.
3. MMU and privileged instructions
Traditional PV Dom0 uses the PV MMU:
Specific Xen MMU code in OSes.
Very intrusive.
Limited to 4KB pages.
Involves using hypercalls in order to set up page tables.
Hypercalls are used to request that the hypervisor execute privileged instructions on behalf of the guest.
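As a rough illustration of the PV page-table interface, the sketch below models how a PV guest encodes a batch of PTE writes for the `HYPERVISOR_mmu_update` hypercall: each request is a (ptr, val) pair, where the low 2 bits of ptr select the command and the rest is the machine address of the PTE. The command constants mirror Xen's public headers; the addresses and PTE values are made-up examples, and no actual hypercall is issued.

```python
# Simplified model of Xen's PV mmu_update request encoding (public/xen.h).
MMU_NORMAL_PT_UPDATE = 0   # write val to the PTE at machine address ptr
MMU_MACHPHYS_UPDATE = 1    # update the machine-to-physical table instead

def mmu_update_req(pte_machine_addr, new_pte, cmd=MMU_NORMAL_PT_UPDATE):
    # PTEs are 8-byte aligned, so the low bits are free to carry the command.
    assert pte_machine_addr & 0x3 == 0, "PTE address must be aligned"
    return (pte_machine_addr | cmd, new_pte)

# A PV guest batches many such requests into one hypercall rather than
# writing page-table entries directly (hypothetical addresses/flags):
batch = [mmu_update_req(0x10000000 + 8 * i, (0x20000000 + 0x1000 * i) | 0x67)
         for i in range(4)]
```

Batching matters here: every page-table update costs a hypervisor transition, which is part of why the PV MMU is both intrusive and slower than a hardware-assisted MMU.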
4. CPU handling
                       Native                           PV
Boot-time enumeration  ACPI MADT                        Hypercalls
AP bringup             Local/x2APIC                     Hypercalls
Hotplug                ACPI GPE and processor objects   Xenstore
5. Setup and delivery of interrupts
On x86 systems interrupts are delivered from the APIC to the CPU. There are several kinds of interrupts:
Legacy PCI: implemented using side-band signals, delivered to the IO APIC and then injected into the local APIC.
MSI/MSI-X: implemented using in-band signals delivered directly to the local APIC.
Interrupts are configured through the PCI configuration space and, for legacy PCI interrupts, through the IO APIC.
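To make the IO APIC side concrete, here is a sketch of how a 64-bit redirection-table entry is laid out, following the bit positions in the 82093AA IO APIC datasheet (vector in bits 0-7, polarity bit 13, trigger mode bit 15, mask bit 16, destination APIC ID in bits 56-63). The vector and APIC ID values used are arbitrary examples; delivery and destination modes are fixed/physical for simplicity.

```python
def ioapic_rte(vector, dest_apic_id, level_trigger=True, active_low=True,
               masked=False):
    """Encode a 64-bit IO APIC redirection-table entry (fixed delivery,
    physical destination mode). Legacy PCI INTx lines are level-triggered
    and active-low, hence the defaults."""
    rte = vector & 0xFF                        # bits 0-7: interrupt vector
    rte |= (1 if active_low else 0) << 13      # bit 13: pin polarity
    rte |= (1 if level_trigger else 0) << 15   # bit 15: trigger mode
    rte |= (1 if masked else 0) << 16          # bit 16: mask
    rte |= (dest_apic_id & 0xFF) << 56         # bits 56-63: destination
    return rte

# Example: route a legacy PCI interrupt on some pin to vector 0x30 on
# the CPU with APIC ID 1 (hypothetical values):
entry = ioapic_rte(0x30, 1)
```

It is exactly writes of entries like this that a hypervisor can intercept to learn how the OS wants a given interrupt pin routed.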
6. Setup and delivery of interrupts
PV guests don’t have an emulated APIC.
Interrupts are delivered using event channels, the paravirtualized interrupt interface provided by Xen.
Configuration of interrupts is performed using hypercalls.
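A rough model of the delivery side: in the 2-level event channel ABI, the guest's shared info page holds a per-vCPU selector word plus pending and mask bitmaps, and the guest scans them to find which channels fired. The sketch below simulates that scan (no real shared memory involved; the bitmap values are made up).

```python
WORD_BITS = 64  # one selector bit per 64-bit word in the 2-level ABI

def pending_events(pending_sel, pending, mask):
    """Model of how a PV guest scans the 2-level event-channel bitmaps:
    pending_sel picks which words of `pending` to scan; a channel only
    fires if its pending bit is set and its mask bit is clear."""
    fired = []
    sel = pending_sel
    while sel:
        w = (sel & -sel).bit_length() - 1     # lowest set word index
        sel &= sel - 1
        active = pending[w] & ~mask[w]
        while active:
            bit = (active & -active).bit_length() - 1
            active &= active - 1
            fired.append(w * WORD_BITS + bit)
    return fired
```

This replaces the APIC's vector-based delivery entirely, which is why event channels require Xen-specific interrupt code in the guest OS.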
7. Setup and delivery of interrupts
[Diagram: Xen sits between the hardware (CPU, APIC, MMU, ...) and the guests; the hardware domain and the other guests receive interrupts through an event channel driver.]
8. ACPI tables
Two different kinds of ACPI tables can be found as part of a system description:
Static tables: binary structures in memory that can be directly mapped onto a C struct.
Dynamic tables: described using ACPI Machine Language (AML); an AML parser is required in order to access them. They can contain both data and methods, which are executed by the OS.
On a traditional PV Dom0 all tables are passed as-is to Dom0, which forces Xen to use side-band methods for CPU enumeration.
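The "directly mapped onto a C struct" point can be shown with the 36-byte common table header every static ACPI table starts with (signature, length, revision, checksum, OEM fields), where all the table's bytes must sum to zero modulo 256. The sketch below builds a fake table with a hypothetical "TEST" signature and parses it back; the field layout follows the ACPI specification, everything else is illustrative.

```python
import struct

# ACPI common table header: Signature(4) Length(4) Revision(1) Checksum(1)
# OemId(6) OemTableId(8) OemRevision(4) CreatorId(4) CreatorRevision(4)
ACPI_HEADER = struct.Struct("<4sIBB6s8sI4sI")   # 36 bytes total

def parse_header(blob):
    sig, length, rev, chksum, oem, oem_tbl, oem_rev, cid, crev = \
        ACPI_HEADER.unpack_from(blob)
    # A valid static table's bytes sum to 0 modulo 256.
    assert sum(blob[:length]) % 256 == 0, "bad checksum"
    return sig.decode(), length, rev

# Build a minimal fake table (hypothetical signature and OEM strings):
hdr = bytearray(ACPI_HEADER.pack(b"TEST", ACPI_HEADER.size, 6, 0,
                                 b"XENOEM", b"XENTBLID", 1, b"XEN ", 1))
hdr[9] = (256 - sum(hdr) % 256) % 256   # fix up the checksum byte
table = bytes(hdr)
```

Parsing like this needs no interpreter at all, which is exactly why Xen can consume static tables itself but must leave AML to Dom0.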
9. ACPI tables
Xen can only parse information from static ACPI tables.
But there is information required by Xen that resides in dynamic tables:
Hotplug of physical CPUs.
CPU C states.
Sleep states.
Dom0 has to provide this information to Xen.
Although it would be possible for Xen to import a simple AML parser, there can only be one OSPM, so Xen could only look at the tables but not execute any methods.
10. A new interface for PVH Dom0
As close as possible to the native interface.
Only use hypercalls as a last resort.
Take advantage of the hardware virtualization extensions.
11. MMU
Use the hardware virtualization extensions to provide a stage-2 page table for the guest:
Completely transparent from the guest's point of view.
The guest can use the virtual MMU provided by the hardware.
Can use pages bigger than 4KB (2MB, 1GB).
No modification to the OS is needed.
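A quick sketch of where those page sizes come from: a stage-2 (EPT/NPT-style) walk uses four levels of 512-entry tables, each level consuming 9 bits of the guest-physical address on top of the 12-bit page offset, and superpages fall out of terminating the walk one or two levels early. The address used below is an arbitrary example.

```python
PAGE_SHIFT = 12   # 4 KB base pages
LEVEL_BITS = 9    # 512 eight-byte entries per 4 KB table page

def walk_indexes(gpa):
    """Split a guest-physical address into the four 9-bit stage-2
    table indexes (top level first) plus the 12-bit page offset."""
    offset = gpa & ((1 << PAGE_SHIFT) - 1)
    idx = [(gpa >> (PAGE_SHIFT + LEVEL_BITS * lvl)) & ((1 << LEVEL_BITS) - 1)
           for lvl in range(3, -1, -1)]
    return idx, offset

# Superpage sizes come from stopping the walk early:
SIZE_4K = 1 << PAGE_SHIFT                       # leaf at the last level
SIZE_2M = 1 << (PAGE_SHIFT + LEVEL_BITS)        # leaf one level up
SIZE_1G = 1 << (PAGE_SHIFT + 2 * LEVEL_BITS)    # leaf two levels up
```

The PV MMU has no such shortcut, which is why it is stuck at 4KB pages while a stage-2 guest gets 2MB and 1GB mappings for free.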
12. Interrupt management
Provide Dom0 with an emulated local APIC and IO APICs.
Legacy PCI interrupts:
Snoop writes to the emulated IO APIC and set up interrupts on behalf of Dom0.
MSI and MSI-X:
Trap accesses to the MSI/MSI-X capabilities in the PCI config space.
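To see what trapping an MSI capability write buys the hypervisor, here is a sketch of the x86 MSI address/data encoding from the Intel SDM: the address targets the 0xFEE00000 region with the destination APIC ID in bits 12-19, and the data word carries the vector in bits 0-7. The APIC ID and vector values are arbitrary examples; only fixed delivery in physical destination mode is modeled.

```python
MSI_ADDR_BASE = 0xFEE00000   # x86 MSI address window

def msi_address(dest_apic_id):
    # Bits 12-19 carry the destination APIC ID (physical mode).
    return MSI_ADDR_BASE | ((dest_apic_id & 0xFF) << 12)

def msi_data(vector, delivery_mode=0):   # 0 = fixed delivery
    return ((delivery_mode & 0x7) << 8) | (vector & 0xFF)

def decode_trapped_write(addr, data):
    """What a hypervisor trapping a write to the MSI capability can
    recover: which CPU and which vector the device was programmed with."""
    assert addr & 0xFFF00000 == MSI_ADDR_BASE, "not an MSI address"
    return (addr >> 12) & 0xFF, data & 0xFF
```

Once decoded, the hypervisor can substitute its own routing while the guest believes it programmed the hardware directly.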
13. Interrupt management
[Diagram: Xen sits between the hardware (CPU, APIC, MMU, ...) and the guests; the hardware domain and the other guests now receive interrupts through an emulated vAPIC rather than event channels.]
14. ACPI tables
Provide Dom0 with the correct CPU topology in the ACPI tables (MADT).
Hide tables not usable by Dom0, or that contain devices hidden from Dom0 (in use by Xen): HPET, SLIT, SRAT, MPST, PMTT, DMAR.
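As an illustration of the MADT rewrite, the sketch below builds Processor Local APIC entries (type 0, 8 bytes each: type, length, ACPI processor UID, APIC ID, 32-bit flags with bit 0 = enabled) following the ACPI specification's layout, so that the CPU list matches the vCPUs a domain actually has. The vCPU count and the 1:1 UID/APIC ID mapping are illustrative assumptions.

```python
import struct

MADT_LAPIC = 0   # MADT entry type: Processor Local APIC

def lapic_entry(acpi_uid, apic_id, enabled=True):
    # type(1) length(1) acpi_processor_uid(1) apic_id(1) flags(4)
    return struct.pack("<BBBBI", MADT_LAPIC, 8, acpi_uid, apic_id,
                       1 if enabled else 0)

# Rebuild the MADT CPU list so the domain sees exactly its vCPUs,
# not the host's physical CPUs (hypothetical count and IDs):
dom0_vcpus = 4
entries = b"".join(lapic_entry(i, i) for i in range(dom0_vcpus))
```

Because these entries are pure static data, Xen can synthesize them itself without touching AML.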
15. Current status
Done and current items:
PVH-specific Dom0 builder in place; no longer shared with the PV Dom0 builder.
ACPI tables modified to suit Dom0's needs.
Routing of legacy PCI interrupts to the emulated IO APIC.
Trap accesses to PCI config space: header (BARs), MSI and MSI-X capabilities. This is currently a work in progress (patches posted to the mailing list).
Future items:
Further traps/emulation in the PCI config space: SR-IOV, AER, DPC?
Make the PCI code suitable for DomU use (both PVH and HVM).
16. Conclusions
Based on the original PVH effort, but more similar to HVM than PV; the PVH name was re-used to avoid confusing people.
Completely new interface, very similar to native:
Takes advantage of hardware virtualization extensions.
Reduces the Xen-specific code in OSes.
Makes it easier to port new OSes.
Removes code duplication in Xen: PCI emulation code already exists in both pciback and QEMU; move it to Xen where it can be used by Dom0/DomU.
17. Q&A
Thanks
Questions?