This document summarizes OpenStack Compute features related to the Libvirt/KVM driver, including updates in Kilo and predictions for Liberty. Key Kilo features discussed include CPU pinning for performance, huge page support, and I/O-based NUMA scheduling. Predictions for Liberty include improved hardware policy configuration, post-plug networking scripts, further SR-IOV support, and hot resize capability. The document provides examples of how these features can be configured and their impact on guest virtual machine configuration and performance.
8. Libvirt/KVM
● Driver used for 85% of production OpenStack deployments. [1]
● Free and Open Source Software end-to-end stack:
○ Libvirt - Abstraction layer providing an API for hypervisor and virtual
machine lifecycle management. Supports many hypervisors and
architectures.
○ QEMU - Machine emulator and virtualizer, using either dynamic
translation or hardware-assisted virtualization (e.g. via KVM).
○ KVM - Kernel-based Virtual Machine; a kernel module providing full
virtualization support in the Linux kernel.
● Why Libvirt instead of speaking straight to QEMU?
[1] http://superuser.openstack.org/articles/openstack-users-share-how-their-deployments-stack-up
10. Libvirt/KVM Guest Configuration
● CPU
● NIC
● Disks
● PCI devices
● Serial consoles
● SMBios info
● CPU pinning
● VNC or SPICE
● QEMU + SPICE agents
● Clock (PIT, RTC) parameters
● Scheduler, disk, and network tunables
11. Supporting Tool Highlights
● virsh - CLI for interacting with Libvirt.
● virt-rescue - Run a rescue shell on a virtual machine (using
libguestfs).
● virt-sysprep - Reset a virtual machine so that clones can be
made. Removes SSH host keys, udev rules, etc.
● virt-v2v - Convert guests from other platforms (VMware, Xen,
Hyper-V).
● virt-sparsify - Convert disk image to thin provisioned.
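For example, a clone-ready template can be prepared with virt-sysprep and then thinned with virt-sparsify (the image filename is illustrative):

```shell
# Reset a guest disk image so clones are unique (removes SSH host
# keys, udev persistent rules, etc.):
virt-sysprep -a rhel-guest.qcow2

# Re-sparsify the image so unused space becomes thin provisioned:
virt-sparsify rhel-guest.qcow2 rhel-guest-sparse.qcow2
```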
12. Libvirt/KVM
● nova-compute agent
communicates with Libvirt.
● Libvirt launches and
manages qemu processes
for each guest.
● KVM uses the Linux kernel
for direct hardware access
as needed.
13. Guest Enhancements
● VirtIO drivers provide paravirtualized devices to virtual
machines, improving speed over full emulation.
○ Built into modern enterprise Linux guest operating systems.
○ Available for Windows.
● QEMU guest agent optionally runs inside guests and
facilitates external interaction by users and/or management
platforms including OpenStack.
● Anti-VENOM protection provided using sVirt (SELinux and
AppArmor security drivers supported).
14. Virtual Interface Drivers
● Responsible for plugging/unplugging guest interfaces.
● Different interface types = different Libvirt XML definitions.
● Simplified LibvirtGenericVIFDriver implementation supports a
wide range of VIF types.
● Not easily pluggable by out-of-tree implementations.
○ Live in nova/virt/libvirt/vif.py
○ More on this later...
19. CPU Pinning
● Extends NUMATopologyFilter added in Juno:
○ Adds concept of a “dedicated resource” guest.
○ Implicitly pins vCPUs and emulator threads to pCPU cores for increased
performance, trading off the ability to overcommit.
● Combine with existing techniques for isolating cores for
maximum benefit.
23. Example - Configuration
● Scheduler:
○ Enable NUMATopologyFilter, and AggregateInstanceExtraSpecsFilter
● Compute Node(s):
○ Alter kernel boot params to add isolcpus=2,3,6,7
○ Set vcpu_pin_set=2,3,6,7 in /etc/nova/nova.conf
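AggregateInstanceExtraSpecsFilter is typically paired with a host aggregate grouping the pinned-CPU compute nodes; a sketch of one way to set this up (the aggregate, host, and metadata names are illustrative):

```shell
# Group the performance compute nodes into an aggregate and tag it:
nova aggregate-create performance
nova aggregate-add-host performance compute-1
nova aggregate-set-metadata performance pinned=true

# Match flavors to the aggregate via the filter's scoped key:
nova flavor-key m1.small.performance set \
    aggregate_instance_extra_specs:pinned=true
```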
25. Example - Configuration
● Flavor:
○ Add hw:cpu_policy=dedicated extra specification:
$ nova flavor-key m1.small.performance set hw:cpu_policy=dedicated
● Instance:
$ nova boot --image rhel-guest-image-7.1-20150224 \
    --flavor m1.small.performance test-instance
26. Example - Resultant Libvirt XML
● vCPU placement is static, with a 1:1 vCPU:pCPU mapping:
<vcpu placement='static'>2</vcpu>
<cputune>
  <vcpupin vcpu='0' cpuset='2'/>
  <vcpupin vcpu='1' cpuset='3'/>
  <emulatorpin cpuset='2-3'/>
</cputune>
● Memory is strictly aligned to the NUMA node:
<numatune>
  <memory mode='strict' nodeset='0'/>
  <memnode cellid='0' mode='strict' nodeset='0'/>
</numatune>
27. Huge Pages
● Huge pages allow the use of larger page sizes (2 MB, 1 GB),
increasing CPU TLB cache efficiency.
○ Backing guest memory with huge pages allows predictable memory
access, at the expense of the ability to over-commit.
○ Different workloads extract different performance characteristics from
different page sizes - bigger is not always better!
● Administrator reserves large pages during compute node
setup and creates flavors to match:
○ hw:mem_page_size=large|small|any|2048|1048576
● User requests using flavor or image properties.
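Huge pages must be reserved on the compute node before guests can use them; a sketch using kernel boot parameters or sysfs (the page counts here are illustrative):

```shell
# At boot, via the kernel command line:
#   hugepagesz=2M hugepages=2048

# Or at runtime, per NUMA node, via sysfs:
echo 1024 > /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages

# Verify the reservation:
grep -i hugepages /proc/meminfo
```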
30. Example - Flavor Configuration
$ nova flavor-key m1.small.performance set hw:mem_page_size=2048
$ nova boot --flavor=m1.small.performance \
    --image=rhel-guest-image-7.1-20150224 numa-lp-test
31. Example - Result
$ virsh dumpxml instance-00000001
...
<memoryBacking>
  <hugepages>
    <page size='2048' unit='KiB' nodeset='0'/>
  </hugepages>
</memoryBacking>
...
33. I/O-based NUMA Scheduling
● Extends PciDevice model to include NUMA node the device
is associated with.
● Extends NUMATopologyFilter to make use of this information
when scheduling.
34. Quiesce Guest Filesystem
● Libvirt >= 1.2.5 supports the fsFreeze/fsThaw API.
● Freezes/thaws guest filesystem(s) using QEMU guest agent.
● Ensures consistent snapshots.
● To enable:
○ hw_qemu_guest_agent image property must be set to yes.
○ hw_require_fsfreeze image property must be set to yes.
○ QEMU guest agent must be installed inside guest.
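A sketch of setting the required image properties with the Kilo-era glance client (the image name is illustrative):

```shell
glance image-update \
    --property hw_qemu_guest_agent=yes \
    --property hw_require_fsfreeze=yes \
    rhel-guest-image-7.1-20150224
```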
35. Hyper-V Enlightenment
● Windows guests support several additional paravirt features
when running on Hyper-V (similar to virtio, kvmclock, etc. on
KVM).
● Helps avoid BSODs in guests on heavily loaded hosts and
enhances performance.
● QEMU/KVM is able to support several of these natively.
● Expands behavior of os_type='windows' image property.
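With os_type set to windows, the resulting domain XML gains Hyper-V enlightenment elements along these lines (the exact flags enabled vary by release):

```xml
<features>
  <hyperv>
    <relaxed state='on'/>
    <vapic state='on'/>
    <spinlocks state='on' retries='8191'/>
  </hyperv>
</features>
```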
36. vhost-user support
● VIF driver for new type of network interface implemented in
QEMU/Libvirt.
● Intended to provide a more efficient path between a guest
and userspace vswitches.
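In Libvirt domain XML, a vhost-user interface attaches the guest to a userspace vswitch over a unix socket; a sketch (the socket path and MAC address are illustrative):

```xml
<interface type='vhostuser'>
  <mac address='52:54:00:3b:83:1a'/>
  <source type='unix' path='/var/run/vhost-user/sock0' mode='client'/>
  <model type='virtio'/>
</interface>
```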
38. Liberty Predictions/Speculation
● Libvirt hardware policy from libosinfo (approved)
● Post-plug VIF scripts (under review)
● Further work around SR-IOV incl.:
○ Interface attach/detach (under review)
○ Live migration when using macvtap (under review)
● Ability to select guest CPU model and/or features (under
review)
● VM HA (under review)
● VirtIO network performance enhancements (under review)
● Hot resize (under review)