connect.linaro.org
SFO17-403: Optimizing the Design and Implementation of KVM/ARM
Christoffer Dall
ENGINEERS AND DEVICES

WORKING TOGETHER
“Efficient, isolated duplicate of the real machine”
– Popek and Goldberg
[Formal requirements for virtualizable third generation architectures ’74]

Virtualization
[Diagram: two software stacks.
 Native: Apps on an OS Kernel on Hardware.
 Virtual Machines: VMs, each with its own Kernel and Apps, on a Hypervisor on Hardware.]

Hypervisor Design
[Diagram: two hypervisor types.
 Type 1 (Standalone): VMs, each with its own Kernel and Apps, on a Hypervisor that runs directly on Hardware.
 Type 2 (Hosted): a host OS Kernel with a Hypervisor on Hardware, hosting VMs (Kernel and Apps) alongside a regular App.]

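Hypervisor Design
[Diagram: the two types with concrete hypervisors.
 Xen (Type 1): Xen on Hardware, running Dom0 (Linux and Apps) and DomU (Linux and Apps).
 KVM (Type 2): Linux with KVM on Hardware, running VMs (Linux and Apps) alongside a host App.]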

ARM Virtualization Extensions
EL0: User
EL1: Kernel
EL2: Hypervisor

ARM VE and Hypervisors
[Diagram: Xen mapped onto the ARM exception levels.
 EL0: Apps
 EL1: Dom0 Linux, DomU Linux
 EL2: Xen
 A "?" marks an open question on the slide.]

KVM/ARM
[Diagram: KVM/ARM split across exception levels.
 EL0: host Apps, VM Apps
 EL1: host Linux with KVM, VM Kernel
 EL2: KVM lowvisor
 Transitions: 1. Hypercall 2. Return 3. Hypercall 4. Return (switch state)]

KVM/ARM
[Diagram: Host (Linux, KVM, Apps) and VM (Kernel, Apps) across EL0, EL1, and EL2.
 Transitions: 1. Hypercall 2. Return]

ARMv8.1 VHE
• Virtualization Host Extensions

• Supports running unmodified OSes in EL2 without using EL1
[Diagram: Apps at EL0 and Linux at EL2, drawn against EL0, EL1, and EL2.]

ARMv8.1 VHE in Detail
1. HCR_EL2.E2H completely enables and disables VHE

2. Additional registers in EL2 give it the same support as EL1

3. TGE bit routes EL0 exceptions to EL2 without disabling the EL0 Stage 1 MMU

4. EL2 translation regime works like EL1

5. Redirects system register accesses from EL1 to EL2
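The two HCR_EL2 control bits named above can be sketched as plain bit masks. This is a minimal user-space model, not kernel code. The bit positions (TGE at bit 27, E2H at bit 34) follow the Arm architecture's HCR_EL2 layout; the helper names vhe_enabled and host_el0_runs_under_el2 are hypothetical and exist only for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* HCR_EL2 fields referenced on the slide. */
#define HCR_TGE (UINT64_C(1) << 27) /* Trap General Exceptions */
#define HCR_E2H (UINT64_C(1) << 34) /* EL2 Host: switches VHE on or off */

/* Hypothetical helper: is VHE enabled in this HCR_EL2 value? */
bool vhe_enabled(uint64_t hcr)
{
    return (hcr & HCR_E2H) != 0;
}

/*
 * Hypothetical helper: does host EL0 run directly under EL2?
 * This needs both E2H and TGE to be set.
 */
bool host_el0_runs_under_el2(uint64_t hcr)
{
    const uint64_t both = HCR_E2H | HCR_TGE;
    return (hcr & both) == both;
}
```

For example, host_el0_runs_under_el2(HCR_E2H | HCR_TGE) is true, while setting E2H alone is not enough.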

VHE Register Redirection
mrs x0, TCR_EL1 accesses:
  TCR_EL1 when HCR_EL2.E2H==0
  TCR_EL2 when HCR_EL2.E2H==1
mrs x0, TCR_EL12 accesses TCR_EL1
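The redirection on these slides can be modeled as a small lookup. The sketch below assumes the access is executed at EL2 and covers only the TCR names shown. resolve_tcr_access is a hypothetical helper; it returns NULL for TCR_EL12 when E2H is clear because, as far as the architecture defines it, the _EL12 alias is only available with VHE enabled.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical model: map the register name written in an EL2
 * instruction to the register that is actually accessed.
 * Returns NULL when the name is unknown or not accessible.
 */
const char *resolve_tcr_access(const char *encoding, bool e2h)
{
    if (strcmp(encoding, "TCR_EL1") == 0)
        return e2h ? "TCR_EL2" : "TCR_EL1"; /* redirected under VHE */
    if (strcmp(encoding, "TCR_EL12") == 0)
        return e2h ? "TCR_EL1" : NULL;      /* alias reaches the real EL1 register */
    if (strcmp(encoding, "TCR_EL2") == 0)
        return "TCR_EL2";                   /* never redirected */
    return NULL;
}
```

This is why an unmodified kernel written against the EL1 names can run at EL2, while the hypervisor code uses the _EL12 names when it needs the guest's EL1 state.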

More VHE Register Redirection
• Some registers change bit position and content

• Example: CNTHCTL_EL2 changes layout to match CNTKCTL_EL1 with extra bits
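One consequence of the layout change is that code touching this register has to pick bit positions based on E2H. The sketch below is an illustration under stated assumptions: it takes the EL1 physical timer/counter enable bits to sit at bits 0 and 1 without VHE and at bits 10 and 11 with E2H set, which should be checked against the Arm Architecture Reference Manual. The function name is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed non-VHE positions of the EL1 physical counter/timer enables. */
#define CNTHCTL_EL1PCTEN (UINT32_C(1) << 0)
#define CNTHCTL_EL1PCEN  (UINT32_C(1) << 1)

/* Assumed shift applied to both fields when HCR_EL2.E2H is set. */
#define CNTHCTL_E2H_SHIFT 10

/* Hypothetical helper: mask that grants EL1 physical timer/counter access. */
uint32_t cnthctl_el1_phys_access_mask(bool e2h)
{
    uint32_t mask = CNTHCTL_EL1PCTEN | CNTHCTL_EL1PCEN;
    return e2h ? (mask << CNTHCTL_E2H_SHIFT) : mask;
}
```

With these assumptions the mask is 0x3 without VHE and 0xC00 with it.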

KVM/ARM with VHE
[Diagram: Linux and the hypervisor drawn against EL2 and EL1; KVM reaches the lowvisor with a trap to run the VM.]

KVM/ARM with VHE
[Diagram: Linux and the hypervisor both at EL2; KVM reaches the lowvisor with a function call to run the VM.]

Experimental Setup
• AMD Seattle B0
• 64-bit ARMv8-A
• 2.0 GHz AMD A1100 CPU
• 8-way SMP
• 16 GB RAM
• 10 Gb Ethernet (passthrough)
*Measurements obtained using Linux in EL2. See BKK16 talk.

VHE Performance at First Glance
CPU Clock Cycles    non-VHE     VHE*
Hypercall             3,181    3,045
*Measurements obtained using Linux in EL2. See BKK16 talk.

KVM/ARM Optimization #1
• Avoid saving/restoring EL1 register state
[Diagram: VM with Apps at EL0 and Kernel at EL1; Host with Apps at EL0 and Linux with KVM at EL2.]

KVM/ARM Optimization #2
• Leave virtualization features enabled

• No traps from host EL0 when E2H and TGE are both set

• EL2 not affected by traps
[Diagram: VM with Apps at EL0 and Kernel at EL1; Host with Apps at EL0 and Linux with KVM at EL2.]

KVM/ARM Optimization #3
• Don’t context switch the timer on every exit from the VM

• Completely reworks the timer code to take interrupts and load/put state as necessary

• 20 patches on list

KVM/ARM Optimization #4
• Defer as much work as possible to vcpu_load and vcpu_put

• Called when entering/exiting run-loop

• Called when preempted/scheduled
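The effect of deferring work to vcpu_load and vcpu_put can be illustrated with a toy counter. This is a user-space sketch, not the KVM implementation: struct vcpu_model and the model_* functions are hypothetical, and "state" stands in for whatever register context would otherwise be switched on every VM entry and exit.

```c
#include <stdbool.h>

/* Hypothetical stand-in for per-vCPU bookkeeping. */
struct vcpu_model {
    bool loaded;        /* guest state is currently live on the CPU */
    unsigned restores;  /* times guest state was loaded onto the CPU */
    unsigned saves;     /* times guest state was written back */
};

void model_vcpu_load(struct vcpu_model *v)
{
    if (!v->loaded) { v->restores++; v->loaded = true; }
}

void model_vcpu_put(struct vcpu_model *v)
{
    if (v->loaded) { v->saves++; v->loaded = false; }
}

/* Count save + restore operations for a run loop with `exits` VM exits. */
unsigned model_run_loop(unsigned exits, bool deferred)
{
    struct vcpu_model v = { false, 0, 0 };

    if (deferred)
        model_vcpu_load(&v);        /* once, on entering the run loop */
    for (unsigned i = 0; i < exits; i++) {
        if (!deferred) {            /* eager: pay on every entry and exit */
            model_vcpu_load(&v);
            model_vcpu_put(&v);
        }
    }
    if (deferred)
        model_vcpu_put(&v);         /* once, on leaving the run loop */
    return v.restores + v.saves;
}
```

In this model, 1000 exits cost 2000 state switches when done eagerly and 2 when deferred. Preemption would add one put/load pair per reschedule, which the sketch leaves out.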

KVM/ARM Optimization #5
• Rewrite the world switch code

• Avoids many conditionals

• Adds has_vhe() calls in pre_run_check and post_run_check.

kvm_arch_vcpu_ioctl_run
{
        ...
        while (1) {
                pre_run_checks();
                if (has_vhe())  /* static key */
                        ret = kvm_vcpu_vhe_run(vcpu);              /* VHE: plain function call */
                else
                        ret = kvm_call_hyp(__kvm_vcpu_run, vcpu);  /* non-VHE: hypercall into EL2 */
                post_run_checks();
        }
        ...
}
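The dispatch in the run loop above can be exercised in user space with stubs. This is a sketch under stated assumptions: struct vcpu_model, set_vhe, and the two stub run functions are hypothetical, and a plain boolean stands in for the kernel's static key, which is patched once at boot instead of being tested at run time.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a vCPU; counts which entry path was taken. */
struct vcpu_model {
    int exits_left;   /* guest exits to simulate before the loop ends */
    int direct_runs;  /* entries through the VHE function-call path */
    int hyp_calls;    /* entries through the hypercall path */
};

static bool vhe_present;  /* decided once; models the static key */

void set_vhe(bool present) { vhe_present = present; }
bool has_vhe(void) { return vhe_present; }

/* Stub for the VHE entry path: a direct call. */
int vcpu_vhe_run(struct vcpu_model *v) { v->direct_runs++; return --v->exits_left; }

/* Stub for the non-VHE entry path: would trap to EL2. */
int call_hyp_vcpu_run(struct vcpu_model *v) { v->hyp_calls++; return --v->exits_left; }

int vcpu_run_loop(struct vcpu_model *v)
{
    int ret = 0;

    while (v->exits_left > 0) {
        if (has_vhe())
            ret = vcpu_vhe_run(v);
        else
            ret = call_hyp_vcpu_run(v);
    }
    return ret;
}
```

Keeping the single has_vhe() branch at the top of the loop lets each path carry its own world switch code without further conditionals inside it.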

Microbenchmark Results
CPU Clock Cycles    non-VHE    VHE OPT*      x86
Hypercall             3,181         752    1,437
I/O Kernel            3,992       1,604    2,565
I/O User              6,665       7,630    6,732
Virtual IPI          14,155       2,526    3,102
*Measurements obtained using Linux in EL2. See BKK16 talk.
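The relative change in each row can be worked out from the cycle counts. The helper below is a hypothetical convenience function, not part of the talk; it only restates the arithmetic on the numbers in the table.

```c
/*
 * Percentage by which the optimized cost undercuts the baseline.
 * A negative result means the optimized case is slower.
 */
double reduction_percent(double baseline_cycles, double optimized_cycles)
{
    return 100.0 * (baseline_cycles - optimized_cycles) / baseline_cycles;
}
```

Applied to the non-VHE and VHE OPT columns, this gives roughly a 76% reduction for Hypercall (3,181 to 752), 60% for I/O Kernel, and 82% for Virtual IPI, while I/O User comes out about 14% higher.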
Application Workloads
Application    Description
Kernbench      Kernel compile
Hackbench      Scheduler stress
Netperf        Network performance
Apache         Web server stress
Memcached      Key-Value store

Application Workloads
[Bar chart: normalized overhead (lower is better), axis from 0.00 to 2.00, comparing non-VHE and VHE OPT* for Kernbench, Hackbench, TCP_STREAM, TCP_MAERTS, TCP_RR, Apache, and Memcached.]
*Measurements obtained using Linux in EL2. See BKK16 talk.

Conclusions
• Optimize and redesign KVM/ARM for VHE

• Reduce hypercall overhead by more than 75% (better than x86!)

• Network benchmark overhead reduced by 50%

• Memcached overhead reduced by more than 80%

Upstream Status
• Timer patches on list

• Optimization patches in final stages

• Priority to upstream soon, hopefully ready for v4.16
