Virtualization Support in
ARMv8+
Aananth C N
c.n.aananth@gmail.com
Version 1.3, 24 Oct 2020
Agenda
• To provide a simplified view of virtualization for automotive ECUs.
• To understand and compare the different solutions available.
• To share this knowledge, so that this drop in the ocean joins other drops and
eventually quenches the thirst of some good souls in the universe!
Note: All contents, pictures etc. are based either on what is already published on the web or on my own experience /
learning / creations. My intent is not to violate any copyrights or NDA content. Please let me know if any violation has occurred.
What is Virtualization?
• Operating System (OS) abstracts
hardware from its applications.
Virtualization abstracts hardware
from one or more OS.
• In automotive world, Virtualization
is about abstracting applications,
operating systems, vehicle
network, displays, audio systems
etc. away from the hardware.
Virtualization Types
• Fundamental types
• Type 1 (bare-metal): the hypervisor takes control of the
hardware and hosts the guest OSes directly. With full virtualization, the guests are completely
unaware of running in a virtualized environment.
• Type 2 (hosted): one operating system (called the
Host OS) takes charge of the hardware and runs the hypervisor. With para-virtualization, the guest OS is modified to
connect with either the Host OS or the hardware devices.
• Derived Types
• Hardware assisted virtualization: Here the virtualization solution utilizes
the support provided by hardware to realize the virtualization goals.
• Example: Linux/KVM falls under this category. We will see this in detail later.
• Hybrid types: Here the virtualization is realized by combining
other types.
• For example, the core virtualization functions are realized using a Type 1
hypervisor, while peripheral / device virtualization is done using Type 2 or other
types. Graphics and display virtualization, for instance, uses a server in the Host OS and clients
running in the Guest OSes. We will see this in detail later.
• ... and many more
[Figure: Type 1 stack, bottom to top: Hardware, Hypervisor, OS 1 and OS 2. Type 2 stack, bottom to top: Hardware, Host OS, Apps and Hypervisor, Modified Guest OS.]
Stop all old stories! How to realize Virtualization?
• System virtualization involves the following functions
• Virtualization of CPU cores or Processing Elements
• Virtualization of memory and the memory management
• Virtualization of Interrupts
• Virtualization of Timers
• I/O or Peripheral Virtualization
• To get a better understanding of the virtualization functions listed
above, we need some example hardware, such as the Raspberry Pi 3
(ARM Cortex A53).
• Raspberry Pi is chosen because it is the most open and commonly available hardware.
Overview of ARM Cortex A53
- CPU cores
- Exception Levels of ARMv8
- Memory management
- Memory Mapped I/O
- Interrupts
- Timers, Clocks, Resets
ARM Cortex A53 – CPU core hardware blocks
• 4 CPU Cores with
• Timer block
• Interrupt block
• Core includes
• NEON Coprocessor
• FPU
• Crypto extensions
• L1 Cache [, L2 Cache]
• Debug & trace
• Trace block
• Debug block
• ACP - Accelerator Coherency
Ports for AXI slaves
• Master memory interface
• Power management interface
• Test interface
The Cortex-A53 processor is a mid-range, low-power processor that implements the ARMv8-A
architecture. The Cortex-A53 processor has one to four cores, each with an L1 memory system
and a single shared L2 cache.
Figure 1-1 shows an example of a Cortex-A53 MPCore configuration with four cores and either
an ACE or a CHI interface.
[Figure 1-1: Example Cortex-A53 processor configuration. Core 0 plus optional cores 1 to 3, each with timer, interrupt, debug and trace blocks; optional ACP (AXI slave interface); ACE or CHI master interface; power management, test (DFT, MBIST), APB debug, clocks, resets and configuration inputs.]
Ref: https://developer.arm.com/documentation/ddi0500/e/introduction/about-the-cortex-a53-processor
ARM Cortex A53 – CPU Functional Blocks
• APB – Advanced Peripheral
Bus (slower than AXI)
• CTM – CoreSight Trigger
Matrix (Debug & Trace)
• CTI – CoreSight Trigger
Interface (Debug & Trace)
• GIC – Generic Interrupt
Controller
• SCU – Snoop Control Unit that
maintains cache coherency
• ACE – an extension to AXI
protocol
• CHI – a scalable protocol
supporting multi-node
interconnect
• ACP - an AMBA 4 AXI slave
interface
[Figure 2-1: Cortex-A53 processor block diagram. Each of cores 0 to 3 has an L1 ICache, L1 DCache, debug and trace, FPU and NEON extension, Crypto extension, and a per-core governor block (Arch timer, GIC CPU interface, clock and reset, CTI, retention control, debug over power down). Shared blocks: Level 2 memory system (L2 cache, SCU, ACP slave, ACE/AMBA 5 CHI master bus interface), APB decoder, APB ROM, APB multiplexer and CTM.]
Ref: https://developer.arm.com/documentation/ddi0500/d/functional-description/about-the-cortex-a53-processor-functions?lang=en
CPU Virtualization – Virtual CPU Cores (vCPU)
• A Raspberry Pi3 has 4 physical cores or CPUs (refer to the previous slide).
• A vCPU is basically a time slot of a physical CPU.
• Note: ARM uses the term “vPE” (virtual Processing Element)
• There can be a 1-to-1 or many-to-1 relation between vCPUs and a real CPU
core.
• For illustration, imagine a single-core ARMv8 processor.
If we schedule 2 vCPUs on it (as shown in the picture
below), this system has a 2 vCPU to 1 real CPU relationship.
7 Virtualizing the Generic Timers
The Arm architecture includes the Generic Timer, which is a standardized set of timers avai
each processor. The Generic Timer consists of a set of comparators that compare against a
system count. A comparator generates an interrupt when its value is equal to or less than th
count. In the following diagram, we can see the Generic Timer in a system (orange), and its
components of comparators and a counter module.
The following diagram shows an example system with a hypervisor that hosts two virtual
CPUs (vCPUs):
[Figure: a single-core ARMv8 processor running a Hypervisor that hosts VM 1 and VM 2.]
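To make the "vCPU is a time slot" idea concrete, here is a minimal sketch of round-robin time slicing of one physical core across two vCPUs. The names `VCpu` and `run_time_slices` are made up for this illustration, and real hypervisors use more elaborate schedulers and save/restore the full register context on every switch.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class VCpu:
    """Hypothetical vCPU record; a real one would also hold saved registers."""
    name: str
    slices_run: int = 0


def run_time_slices(vcpus, n_slices):
    """Hand out n_slices time slots of ONE physical core, round-robin."""
    queue = deque(vcpus)
    timeline = []
    for _ in range(n_slices):
        current = queue.popleft()      # pick the next runnable vCPU
        current.slices_run += 1        # it "runs" for one time slot
        timeline.append(current.name)
        queue.append(current)          # back to the end of the queue
    return timeline


vm1 = VCpu("VM1-vCPU0")
vm2 = VCpu("VM2-vCPU0")
timeline = run_time_slices([vm1, vm2], 4)
print(timeline)
```

With two vCPUs and four slots, each vCPU gets every other slot, which is the 2-to-1 relationship described above.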
ARMv8 – Switching to Hypervisor context
• ARM supports the WFI instruction to put
the CPU in a low power state.
• But when the HCR_EL2.TWI bit is set and
either an application or the OS executes
WFI, the CPU traps
to the Hypervisor instead.
• ARM also supports the ‘HVC #[0-65535]’
instruction, which can be called from
OS context to switch to the
Hypervisor.
• Note: ‘HVC #imm’ is undefined in
application context.
https://developer.arm.com/architectures/learn-the-architecture/armv8-a-virtualization/trapping-and-emulation-of-instructions
Copyright © 2019 Arm Limited (or its affiliates). All rights reserved.
Non-Confidential
Note: Traps are not just for virtualization. There are EL3 and EL1 controlled traps
traps are particularly useful to virtualization software. This guide only discusses th
typically associated with virtualization.
In our WFI example, an OS would usually execute a WFI as part of an idle loop. Wi
within a VM, the hypervisor can trap this operation and schedule a different vCPU
diagram shows:
5.1 Presenting virtual values of registers
Another example of using traps is to present virtual values of registers. For examp
ID_AA64MMFR0_EL1 reports support for memory system-related features in the
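As an illustration of how a hypervisor might tell a trapped WFI from an HVC call, the sketch below decodes an ESR_EL2 value. The field positions (EC in bits [31:26], ISS in bits [24:0]) and the exception class values follow the Armv8-A architecture as I understand it. `decode_trap` and `on_trap` are hypothetical helper names, and the scheduling decision is reduced to a string.

```python
HCR_EL2_TWI = 1 << 13     # HCR_EL2.TWI: trap WFI from EL0/EL1 to EL2

EC_WFX = 0x01             # exception class: trapped WFI or WFE
EC_HVC64 = 0x16           # exception class: HVC executed in AArch64 state


def decode_trap(esr_el2):
    """Classify a trap from the syndrome register value."""
    ec = (esr_el2 >> 26) & 0x3F          # EC, bits [31:26]
    iss = esr_el2 & 0x1FFFFFF            # ISS, bits [24:0]
    if ec == EC_WFX:
        return "WFE" if iss & 1 else "WFI"
    if ec == EC_HVC64:
        return f"HVC #{iss & 0xFFFF}"    # imm16 is in ISS[15:0]
    return f"unhandled EC 0x{ec:02x}"


def on_trap(esr_el2, runnable_vcpus):
    """Toy policy: on WFI, give the core to another runnable vCPU."""
    kind = decode_trap(esr_el2)
    if kind == "WFI" and runnable_vcpus:
        return f"schedule {runnable_vcpus[0]}"
    return f"handle {kind}"


hcr_el2 = HCR_EL2_TWI                    # hypervisor enabled WFI trapping
print(on_trap(EC_WFX << 26, ["vCPU1"]))
print(on_trap((EC_HVC64 << 26) | 42, []))
```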
Memory & Memory Management
ARM Cortex A53 (Raspberry Pi3) Memory Map
• Picture on right shows physical and virtual
memory addresses of RPi3 B.
• Physical memory map
• RAM (1 GB “bcm2837-rpi-3-b.dts”)
memory {
reg = <0 0x40000000>;
};
• Memory Mapped I/O
• See the map for I/O Peripherals on the right.
• Virtual memory map
• User space: 0x0000 0000 to 0xBFFF FFFF
• Kernel space: 0xC000 0000 to 0xFFFF FFFF
[Figure: RPi3 memory map. Physical address side: DDR2 and I/O Peripherals, with labels 0x0000 0000, 0x4000 0000 and 0x4003 FFFF. Virtual address side (32-bit split): User Space from 0x0000 0000, with labels 0xC000 0000 and 0xFFFF FFFF. The ARM MMU sits between the two.]
ARM Cortex A53 – MMU
• ARM64 instructions such as LDR/STR and registers such as PC/LR all refer to
virtual addresses.
• This means any address or pointer in an application program is a virtual
address.
• MMU
• Sits between CPU and DDR controller (see next slide).
• Translates virtual to physical addresses. Once configured, a translation that hits in the TLB adds no penalty.
• Address translation granules (pages) of 4KB (AArch32 & AArch64) and 64KB (AArch64 only).
• 16-bit ASID (AArch32 uses 8-bit), used in the TLB (see TLB slides).
• Max supported physical address size = 40 bits, i.e. 2^40 bytes = 1024 GB (1 TB).
• Provides fine-grained control through virt-to-phys addr. mappings and memory attributes
held in page tables, loaded into the TLB (translation lookaside buffer).
• Generates an exception if any access violation happens.
ARM Cortex A53 – MMU mapping with Example
• Let us take an example application
that needs 16k of RAM (.text + .bss
+ .data and others).
• As soon as the application is
spawned, assume that the OS
allocates 16k at virtual address:
0x80000000.
• As the page size is 4k, the OS allots
4 contiguous pages as shown in the
table below:
[Figure: pages 1 to 4 of the 16k region starting at 0x8000_0000 mapped to scattered physical pages; physical-address labels in the figure: 0x00000000, 0x00016000, 0x00024000, 0x0004D000.]
Page | VA: start | VA: end  | PA: start | PA: end
1    | 80000000  | 80000FFF | 00017000  | 00017FFF
2    | 80001000  | 80001FFF | 00029000  | 00029FFF
3    | 80002000  | 80002FFF | 00003000  | 00003FFF
4    | 80003000  | 80003FFF | 0004F000  | 0004FFFF
This table illustrates the MMU lookup
table. Note that the physical addresses in the last 2
columns are not contiguous.
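The lookup in the table above can be sketched as a single-level page table held in a dictionary. This is a simplification for illustration only: the real ARMv8 MMU walks multi-level tables, and the names `PAGE_TABLE` and `translate` are made up here.

```python
PAGE_SIZE = 4096  # 4k pages, as in the example

# Virtual page number -> physical page number, taken from the table above.
PAGE_TABLE = {
    0x80000: 0x00017,
    0x80001: 0x00029,
    0x80002: 0x00003,
    0x80003: 0x0004F,
}


def translate(va):
    """Translate a virtual address; unmapped pages raise a fault."""
    vpn, offset = divmod(va, PAGE_SIZE)
    if vpn not in PAGE_TABLE:
        raise LookupError(f"translation fault at VA 0x{va:08x}")
    return PAGE_TABLE[vpn] * PAGE_SIZE + offset


print(hex(translate(0x80001234)))
```

The page offset (low 12 bits) passes through unchanged; only the page number is replaced. An access just past the 16k region, such as 0x80004000, raises the fault.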
ARM Cortex A53 – Translation Lookaside Buffer (TLB)
• Based on the example in the previous slide, assume that for every 5 to 10
instructions, the CPU has to read a look-up table that is stored in external
DDR memory. Do you think the CPU will be efficiently utilized? No.
• What is TLB?
• TLB is a ‘memory cache’ that contains recent Virtual Address (VA) to Physical Address
(PA) translations. This saves CPU time for addresses that are accessed often.
• In case of a TLB miss, the address translation is generally fetched from the look-up table
and the TLB is updated.
• TLB is organized as 4 major blocks listed below:
• Micro TLB
• A 10-entry first-level TLB each for the data side and the instruction side.
• Main TLB
• Second layer of the TLB structure that catches the misses from the Micro TLBs.
• Supports all VMSAv8 (Virtual Memory System Architecture) block sizes except 1 GB.
• IPA cache RAM
• The intermediate physical address (IPA) cache RAM holds mappings between intermediate
physical addresses and physical addresses.
• Only non-secure EL1 and EL0 “stage 2” translation uses this cache.
• Walk cache RAM
• Holds the result of stage 1 (OS controlled) translation.
• If a stage 1 translation results in a section or larger mapping, then nothing is placed in the walk
cache RAM.
[Figure: inside the SoC, the CPU registers feed the MMU, whose TLB holds va-to-pa entries; the MMU connects to the DDR controller and the DDR memory.]
Note: The IPA is part of “stage 2” address translation, i.e., the hypervisor controlled address translation. Will be discussed later.
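The benefit of the TLB can be sketched as a small cache in front of the (slow) table walk. This is an illustrative model only: the class name `Tlb`, the least-recently-used replacement and the dictionary used as a page table are assumptions, not the Cortex-A53 implementation.

```python
from collections import OrderedDict


class Tlb:
    """Toy TLB: caches page translations, evicts the least recently used."""

    def __init__(self, capacity, walk):
        self.capacity = capacity
        self.walk = walk               # slow path: page-table lookup
        self.entries = OrderedDict()   # virtual page -> physical page
        self.hits = 0
        self.misses = 0

    def translate(self, va, page_size=4096):
        vpn, offset = divmod(va, page_size)
        if vpn in self.entries:
            self.hits += 1
            self.entries.move_to_end(vpn)          # mark as recently used
        else:
            self.misses += 1
            self.entries[vpn] = self.walk(vpn)     # fetch from the table
            if len(self.entries) > self.capacity:
                self.entries.popitem(last=False)   # evict the oldest entry
        return self.entries[vpn] * page_size + offset


page_table = {0x80000: 0x17, 0x80001: 0x29}
tlb = Tlb(capacity=10, walk=page_table.__getitem__)
results = [tlb.translate(0x80000010),
           tlb.translate(0x80000020),   # same page: served from the TLB
           tlb.translate(0x80001000)]
print(tlb.hits, tlb.misses)
```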
ARM Cortex A53 – TLB matching and Cache handling
TLB match process
• Each TLB entry contains a VA, block-size, PA and a set of memory
properties (type, access permissions, ...)
• Each entry is associated with a particular ASID, contains a field to store
VMID
• A TLB entry match occurs, when the following conditions are met:
• Its VA bits [47:N] match those of the requested address, where N is log2(page size) =
12 for 4k
• Memory space matches the memory space state of request. The
memory space can be one of four values:
• Secure EL3 (AArch64)
• Non-secure EL2
• Secure EL0, EL1 (and EL3 - AArch32)
• Non-secure EL0 or EL1
• The ASID matches the current ASID held in the CONTEXTIDR, TTBR0
or TTBR1, or the entry is marked as global
• The VMID matches the current VMID held in the VTTBR register
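The match conditions listed above can be expressed as one predicate. This is a simplified sketch: the `TlbEntry` fields and the memory-space strings are invented for illustration, and details such as when the VMID is actually compared are glossed over.

```python
from dataclasses import dataclass

VA_MASK = (1 << 48) - 1   # only VA bits [47:0] take part in the comparison


@dataclass(frozen=True)
class TlbEntry:
    va: int
    page_shift: int        # N = log2(page size), 12 for 4k
    space: str             # e.g. "NS-EL1/0", "NS-EL2", "S-EL3"
    asid: int
    vmid: int
    is_global: bool = False


def matches(entry, va, space, asid, vmid):
    """True when all four conditions from the slide hold."""
    same_page = ((entry.va & VA_MASK) >> entry.page_shift
                 == (va & VA_MASK) >> entry.page_shift)
    same_space = entry.space == space
    same_asid = entry.is_global or entry.asid == asid
    same_vmid = entry.vmid == vmid
    return same_page and same_space and same_asid and same_vmid


entry = TlbEntry(va=0x80001000, page_shift=12, space="NS-EL1/0", asid=5, vmid=1)
print(matches(entry, 0x80001ABC, "NS-EL1/0", asid=5, vmid=1))
```

Tagging entries with ASID and VMID is what lets translations of different processes and different VMs coexist in the TLB without a flush on every context switch.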
Data cache coherency
• Uses MOESI protocol to maintain data coherency between
multiple cores.
• M - Modified - The line is in only this cache and is dirty (Unique Dirty)
• O - Owned - The line is possibly in more than one cache and is dirty
(Shared Dirty)
• E - Exclusive - The line is in only this cache and is clean. (Unique
Clean)
• S - Shared - The line is possibly in more than one cache and is clean
(Shared Clean)
• I - Invalid - The line is not in this cache
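[Figure: inside the SoC, CPU registers, then MMU + TLB, then L1 cache, L2 cache and DDR memory. Virtual addresses are used before the MMU and physical addresses after it. The MMU + TLB block is marked "We will discuss this soon".]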
Key takeaway: Memory is already virtualized for a single OS. Virtualizing it for more than one OS is done by adding stage-2 address translation.
ARMv8 stage 2 translation, MMIO & SMMU
ARMv8 Stage2 Address Translation
• Allows the Hypervisor to control which memory mapped system resources a VM can access and
how they appear within the VM.
• It can be used to ensure that the VM sees only its allocated regions.
• In short, the OS controlled translation is called stage 1 translation and the Hypervisor controlled
translation is called stage 2 translation.
https://developer.arm.com/architectures/learn-the-architecture/armv8-a-virtualization/stage-2-translation
For memory address translation, stage 2 translation is a second stage of translation. To support this, a
new set of translation tables known as Stage 2 tables, are required, as shown here:
An Operating System (OS) controls a set of translation tables that map from the virtual address space
to what it thinks is the physical address space. However, this process undergoes a second translation
into the real physical address space. This second stage is controlled by the hypervisor.
The OS-controlled translation is called stage 1 translation, and the hypervisor-controlled translation
is called stage 2 translation. The address space that the OS thinks is physical memory is referred to as
the Intermediate Physical Address (IPA) space.
Note: For an introduction to how address translation works, see our guide on Memory Management.
[Figure: three address spaces, left to right: OS or VM, Hypervisor (IPA), RPi3 physical address space.]
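A minimal sketch of the two stages, assuming single-level tables held in dictionaries (real hardware walks multi-level tables; all names and addresses here are invented). The guest owns the stage 1 table, the hypervisor owns one stage 2 table per VM.

```python
PAGE = 4096


def lookup(table, addr, stage):
    """One translation stage: replace the page number, keep the offset."""
    page, offset = divmod(addr, PAGE)
    if page not in table:
        raise LookupError(f"stage {stage} fault at 0x{addr:08x}")
    return table[page] * PAGE + offset


def translate(va, stage1, stage2):
    ipa = lookup(stage1, va, 1)    # OS controlled: VA -> IPA
    pa = lookup(stage2, ipa, 2)    # hypervisor controlled: IPA -> PA
    return ipa, pa


guest_stage1 = {0x80000: 0x00010}   # same guest mapping in both VMs
vm1_stage2 = {0x00010: 0x20010}     # VM 1 lands in one physical page
vm2_stage2 = {0x00010: 0x30010}     # VM 2 lands in a different one

print([hex(a) for a in translate(0x80000123, guest_stage1, vm1_stage2)])
print([hex(a) for a in translate(0x80000123, guest_stage1, vm2_stage2)])
```

Both guests see the same IPA, yet end up in different physical pages, which is how the hypervisor keeps the VMs isolated.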
ARMv8 - Virtual Peripheral Emulation using MMU
• There are 2 ways to assign a
peripheral to a VM
• Pass-through or “Assigned”
• Shared or “Virtual”
• Assigned Peripheral – the physical device
is fully assigned to a VM.
• Virtual Peripheral – the device is shared
between 2 or more VMs; a stage-2 fault
is generated to trap the access, which is then
emulated in the Hypervisor.
• Why stage-2 fault? Because a stage-1 fault
reports the virtual address of the OS, which is
meaningless to the hypervisor, so it can’t
decide which peripheral it needs to emulate.
• Instead, the hypervisor can read the HPFAR_EL2 register
to determine the faulting IPA, which
maps to a specific peripheral.
The VM can use peripheral regions to access both real physical peripherals, which are often referred
to as directly assigned peripherals, and virtual peripherals.
Virtual peripherals are completely emulated in software by the hypervisor, as this diagram highlights:
ARMv8 – Trapping and Emulation of Virtual Peripherals
• Let us take a “shared” UART (serial
comm.) as an example.
• An app running on a vCPU (VM) tries to
read data from the UART.
• Since it is not a pass-through device,
the read creates a stage 2 fault and
the context switches to the Hypervisor.
• The Hypervisor reads the HPFAR_EL2
register to find out which peripheral the VM
was trying to access.
• It then emulates the read operation
and returns the result to the VM.
• Note that this example read results in
2 context switches.
by the hypervisor, it can use this information to determine the register that it needs to emulate.
Exception Model shows how the ESR_ELx registers report information about the exception. For
single general-purpose register loads or stores that trigger a stage 2 fault, additional syndrome
information is provided. This information includes the size of the accesses and the source or
destination register, and allows a hypervisor to determine the type of access that is being made to the
virtual peripheral.
This diagram illustrates the process of trapping then emulating the access:
This process is described in these steps:
1. Software in the VM attempts to access the virtual peripheral. In this example, this is the receive
FIFO of a virtual UART.
2. This access is blocked at stage 2 translation, leading to an abort routed to EL2.
https://developer.arm.com/architectures/learn-the-architecture/armv8-a-virtualization/trapping-and-emulation-of-instructions
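The trap-and-emulate flow can be sketched as follows. The decode assumes that HPFAR_EL2 bits [39:4] hold IPA bits [47:12] and that the page offset comes from the low 12 bits of FAR_EL2, which is my reading of the Armv8-A register description. The UART base address, register offset and class names are hypothetical.

```python
UART_BASE_IPA = 0x0900_0000   # hypothetical IPA of the virtual UART
UART_SIZE = 0x1000
UART_DR = 0x00                # hypothetical receive data register offset


def fault_ipa(hpfar_el2, far_el2):
    """Rebuild the faulting IPA from the two fault address registers."""
    page = (hpfar_el2 >> 4) & ((1 << 36) - 1)   # IPA[47:12]
    return (page << 12) | (far_el2 & 0xFFF)     # offset within the page


class VirtualUart:
    """Software model of the UART that the hypervisor presents to a VM."""

    def __init__(self, rx_bytes):
        self.rx_fifo = list(rx_bytes)

    def read(self, offset):
        if offset == UART_DR and self.rx_fifo:
            return self.rx_fifo.pop(0)
        return 0


def handle_stage2_fault(hpfar_el2, far_el2, uart):
    """Hypervisor side: find the peripheral, emulate the read."""
    ipa = fault_ipa(hpfar_el2, far_el2)
    if UART_BASE_IPA <= ipa < UART_BASE_IPA + UART_SIZE:
        return uart.read(ipa - UART_BASE_IPA)   # value returned to the VM
    raise LookupError(f"no virtual peripheral at IPA 0x{ipa:x}")


uart = VirtualUart(b"A")
hpfar = (UART_BASE_IPA >> 12) << 4              # what hardware would report
value = handle_stage2_fault(hpfar, far_el2=0x7000, uart=uart)
print(value)
```

A real handler would also read the syndrome information in ESR_EL2 to learn the access size and the destination register before returning to the guest.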
ARMv8 – System Memory Management Units (SMMU)
• In the “shared” UART peripheral example discussed in the previous
slide, what will happen if we need to use DMA for the UART?
• There are 2 problems (see Fig-A):
• Isolation of VMs (Guests) is not possible, as the DMA controller is not
subject to the stage 2 tables and the address space would have to be shared.
• The VM translates addresses to IPAs, and the UART driver in the unmodified
guest believes those IPAs are PAs. But DMA operates on PAs. To fix the IPA <-->
PA incompatibility, the hypervisor (software) would have to trap every DMA
transaction, which defeats the purpose of using DMA.
• To overcome these problems, ARM has come up with the
SMMU (or IOMMU, see Fig-B).
• The fix is that the SMMU applies to DMA traffic the same stage 2 translation
that the MMU applies to the CPU. The guest driver programs the DMA with IPAs
for its copy operations.
• This means that if 2 VMs want to do DMA operations, each VM's IPAs
go through that VM's own stage 2 mapping. So isolation is maintained.
• During the copy operations, the SMMU translates the IPAs to
PAs before the transaction goes to DDR (via and after the
Interconnect box, shown in Fig-B).
• The hypervisor is responsible for programming the SMMU so that
the DMA sees the same view of memory as the VMs.
Armv8-A virtualization
In this system, a hypervisor is using stage 2 to provide isolation b
to see memory is limited by the stage 2 tables that the hyperviso
Allowing a driver in the VM to directly interact with the DMA con
Isolation: The DMA controller is not subject to the stage 2 tables
VM’s sandbox.
Address space: With two stages of translation, what the kernel b
controller still sees PAs, therefore the kernel and DMA controlle
overcome this problem, the hypervisor could trap every interacti
controller, providing the necessary translation. When memory is
inefficient and problematic.
An alternative to trapping and emulating driver accesses is to ext
other masters, like our DMA controller. When this happens, thos
referred to as a System Memory Management Unit (SMMU, som
Fig – A: DMA access without SMMU
Fig – B: DMA access with SMMU
https://developer.arm.com/architectures/learn-the-architecture/armv8-a-virtualization/stage-2-translation
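The role of the SMMU can be sketched as a per-device stage 2 lookup that sits between the DMA controller and memory. This is a toy model: `Smmu`, `attach`, the stream IDs and the dictionary used as physical memory are all invented for illustration.

```python
PAGE = 4096


class Smmu:
    """Toy SMMU: one hypervisor-programmed stage 2 table per DMA stream."""

    def __init__(self):
        self.stage2 = {}

    def attach(self, stream_id, table):
        self.stage2[stream_id] = table       # done by the hypervisor

    def translate(self, stream_id, ipa):
        page, offset = divmod(ipa, PAGE)
        table = self.stage2[stream_id]
        if page not in table:
            raise PermissionError(f"SMMU fault: stream {stream_id}, IPA 0x{ipa:x}")
        return table[page] * PAGE + offset


def dma_write(memory, smmu, stream_id, ipa, data):
    """DMA copy: the device issues IPAs, the SMMU turns them into PAs."""
    for i, byte in enumerate(data):
        memory[smmu.translate(stream_id, ipa + i)] = byte


memory = {}                                  # physical address -> byte
smmu = Smmu()
smmu.attach(1, {0x10: 0x200})                # VM 1: IPA page 0x10 -> PA page 0x200
smmu.attach(2, {0x10: 0x300})                # VM 2: same IPA page, different PA page

dma_write(memory, smmu, 1, 0x10000, b"hi")
dma_write(memory, smmu, 2, 0x10000, b"yo")
print(sorted(hex(pa) for pa in memory))
```

Both guests program the same IPA, but the data lands in different physical pages, and a DMA access outside a VM's mapping faults instead of reaching another VM's memory.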
ARMv8 Exception Levels & Secure States
ARMv8 Exception Levels
• The ARMv8 model defines 4 exception
levels
• EL0 – least privileged
• EL1 – more privileged (OS)
• EL2 – Hypervisor mode.
• EL3 – most privileged, Secure
monitor mode.
• On processor reset (power on
reset), the system enters EL3.
• On taking an exception, the exception
level either increases or remains
the same. It never decreases.
• On return from an exception, the
exception level decreases or
remains the same.
• Every exception level has its own
stack pointer. The boot loader or
the initialization part of the operating
system software has to set up these
registers for all exception levels.
[Figure 3-1: ARMv8 security model when EL3 is using AArch64. Non-secure state: App1 and App2 at EL0 (under Guest OS1 and Guest OS2), Guest OS1 and Guest OS2 at EL1, Hypervisor at EL2. Secure state: Secure App1 and Secure App2 at EL0, Secure OS at EL1. Secure monitor at EL3 (AArch64). Lower levels may use AArch32 or AArch64, with these restrictions:
† AArch64 permitted only if EL1 is using AArch64
‡ AArch64 permitted only if EL2 is using AArch64]
Ref: https://developer.arm.com/documentation/100095/0003/programmers-model/armv8-a-architecture-concepts/armv8-security-model
ARMv8 Security States & Virtualization Support
• Secure State
• Can access both the secure memory
space and the non-secure memory
space.
• When executing at EL3, it can access
all system control resources.
• Non-Secure State
• Can only access the non-secure memory
space.
• Cannot access the secure
system control resources.
• Virtualization support
• Software running in EL2 has access to
several controls for virtualization
• Stage 2 translation
• EL1/0 instruction and register access
trapping.
• Virtual exception generation
https://developer.arm.com/architectures/learn-the-architecture/armv8-a-virtualization/virtualization-in-aarch64
[Figure, from the Arm guide "Virtualization in AArch64": the Exception Levels (ELs) in Non-Secure and Secure states, with Secure EL2 shown in gray.]
ARMv8 Interrupts
ARM Cortex A53 – Interrupt Controller GICv4
• Interrupt Sources.
• Message-based interrupts are generated by memory-write to an assigned address.
• Wired-based interrupts are generated by peripherals such as UART or I2C via I/O
pins.
• SPI – Shared Peripheral Interrupts, which can be either message-based or wire-
based. Can be routed to any PEs configured to handle interrupts.
• PPI - Private Peripheral Interrupt, targets single specific PE (Processing Element).
• LPI - Locality-specific Peripheral Interrupts are interrupts that use the ITS (interrupt
translation service) to route an interrupt to a specific redistributor and PE.
• SGI – Software Generated Interrupts, generated by PEs.
• Distributor
• performs interrupt prioritization & distribution of SPIs and SGIs to the
Redistributors & CPU interfaces.
• Redistributor (red box)
• Holds the control, prioritization and pending information for all physical LPIs using
data structures that are held in memory. Provides programming interface for:
• Enabling / disabling SGIs and PPIs, Setting priority levels for SGIs and PPIs
• Setting each PPI to be level sensitive or edge triggered.
• ...
• CPU interface (blue box)
• Provides register interface to PE. Provides programming interface for:
• Control and configuration to enable interrupt handling in accordance with the security state and
legacy support requirements of the implementation
• Acknowledging an interrupt
• Performing a priority drop and deactivation of an interrupt
• ...
Figure 3-2 shows the GIC partitioning in an implementation that includes an ITS.
[Figure 3-2: GIC logical partitioning with an ITS. Blocks: Distributor, Interrupt Translation Service (ITS), and per-PE Redistributor and CPU interface for PEs x.y.0.0 to x.y.n.1 in clusters C0 to Cn. Interrupt labels: SPIs, PPIs, LPIs and SGIs, from wire-based and message-based sources.
a. The inclusion of an ITS is optional, and there might be more than one ITS in a GIC.
b. SGIs are generated by a PE and routed through the Distributor.]
The mechanism for communication between the ITS and the Redistributors is IMPLEMENTATION DEFINED.
The mechanism for communication between the CPU interfaces and the Redistributors is also IMPLEMENTATION
DEFINED.
Ref: https://static.docs.arm.com/ihi0069/c/IHI0069C_gic_architecture_specification.pdf, section 3.1
ARM Cortex A53 – Interrupt Lifecycle & Interrupt numbers
[Figure 4-1: Physical interrupt lifecycle. Start, Generate (a device generates an interrupt), Distribute, Deliver (the CPU interface delivers the interrupt to the PE), Activate (the PE acknowledges the interrupt), Priority drop and Deactivation (the PE ends the interrupt), End. The Deactivation step does not apply to LPIs.]
INTID | Interrupt type | Details
0 - 15 | SGI | These interrupts are local to a CPU interface.
16 - 31 | PPI | These interrupts are local to a CPU interface.
32 - 1019 | SPI | Shared peripheral interrupts that the Distributor can route to either a specific PE, or to any one of the PEs in the system that is a participating node.
1020 - 1023 | Special interrupt number | 1020: GIC returns this at EL3 -> handled at Secure EL1. 1021: GIC returns this at EL3 -> handled at Non-Secure EL1. 1022: legacy operations only. 1023: GIC returns this on interrupt acknowledge when there is no interrupt to handle, or if there are errors handling the interrupt.
1024 - 8191 | - | Reserved
8192 and above (upper limit implementation defined) | LPI | Peripheral hardware interrupts that are routed to a specific PE (directly).
Note: INTIDs 0-1023 are compatible with earlier versions of the GIC architecture.
Ref: https://static.docs.arm.com/ihi0069/c/IHI0069C_gic_architecture_specification.pdf, section 4.1
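The INTID ranges in the table can be captured in a small classifier. The function name `classify_intid` is made up for this sketch, and the upper limit of the LPI range is left open because it is implementation defined.

```python
def classify_intid(intid):
    """Map a GIC interrupt ID to its interrupt type, per the table above."""
    if intid < 0:
        raise ValueError("INTID cannot be negative")
    if intid <= 15:
        return "SGI"
    if intid <= 31:
        return "PPI"
    if intid <= 1019:
        return "SPI"
    if intid <= 1023:
        return "special"
    if intid <= 8191:
        return "reserved"
    return "LPI"   # 8192 and above; upper bound is implementation defined


for n in (0, 27, 32, 1023, 4000, 8192):
    print(n, classify_intid(n))
```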
ARMv8 Interrupt Grouping
• GICv3 onwards supports Interrupt Grouping as a mechanism to align interrupt handling with ARMv8
exception & security model.
• In a system with two Security states (secure, non-secure), an interrupt is configured as one of the
following:
• A Group 0 physical interrupt:
• ARM expects these interrupts to be handled at EL3.
• A Secure Group 1 physical interrupt:
• ARM expects these interrupts to be handled at Secure EL1.
• A Non-secure Group 1 physical interrupt:
• ARM expects these interrupts to be handled at Non-secure EL2 in systems using virtualization, or at Non-secure EL1 in systems not using
virtualization
• In a system with one Security state an interrupt is configured to be either:
• Group 0.
• Group 1.
• At the System level, GICD_CTLR.DS indicates if the GIC is configured with one or two Security states.
ARM Cortex A53 – Virtual Interrupt Handling
• Say, a serial input device asserts its
interrupt signal to GIC.
• During initialization software
executing at EL3 or EL2 configures
PE to route interrupts to EL2
(hypervisor)
• GIC generates a physical interrupt
exception, either IRQ or FIQ, which
then gets routed to EL2
(hypervisor)
• The hypervisor then configures the
GIC to forward the physical
interrupt as Virtual Interrupt (vIRQ
or vFIQ) to the right vCPU/VM.
• The hypervisor then returns the
control to the vCPU/VM.
• The vCPU/VM uses Virtual CPU
Interface to read and respond to
the interrupts.
Note: From GICv4 onwards, LPIs can be directly injected into a VM, which reduces the context switching to the hypervisor.
The diagram illustrates these steps:
1. The physical peripheral asserts its interrupt signal into the GIC.
2. The GIC generates a physical interrupt exception, either IRQ or FIQ, which gets routed to EL2 by
the configuration of HCR_EL2.IMO/FMO. The hypervisor identifies the peripheral and
determines that it has been assigned to a VM. It checks which vCPU the interrupt should be
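The forwarding step can be modelled in a few lines: the hypervisor takes the physical interrupt, looks up which vCPU owns the device, and queues a virtual interrupt for it. This is a behavioural sketch only. In real hardware the queue is held in GIC list registers, and the class and method names here are invented.

```python
from collections import defaultdict

SPURIOUS = 1023   # INTID meaning "nothing pending"


class Hypervisor:
    def __init__(self, owner_of):
        # physical INTID -> (owning vCPU, virtual INTID the guest will see)
        self.owner_of = owner_of
        self.pending = defaultdict(list)

    def on_physical_irq(self, pintid):
        """Physical IRQ routed to EL2: forward it as a virtual interrupt."""
        vcpu, vintid = self.owner_of[pintid]
        self.pending[vcpu].append(vintid)
        return vcpu                      # the vCPU that should run next


def guest_acknowledge(hyp, vcpu):
    """Guest side: read the virtual CPU interface to get the INTID."""
    queue = hyp.pending[vcpu]
    return queue.pop(0) if queue else SPURIOUS


hyp = Hypervisor({57: ("VM1-vCPU0", 33)})   # hypothetical serial device
target = hyp.on_physical_irq(57)
print(target, guest_acknowledge(hyp, target))
```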
ARMv8 Generic Timer, Clock Tree & Resets
ARM Cortex A53 – Generic Timer
Functional Description
• Each core has the following set of 64-bit timers:
• EL1 Non-secure physical timer
• EL1 Secure physical timer
• EL2 physical timer
• Virtual timer
• The system counter value (which resides in SoC) is
distributed to the Cortex-A53 processor via
CNTVALUEB[63:0]
• The system counter typically operates at a lower frequency
than the CLKIN (main processor clock)
• Each timer provides an active-LOW interrupt output to the
SoC.
• External interrupt output pins (n = no-of-cores -1)
• nCNTPNSIRQ[n:0] - EL1 Non-secure physical timer event
• nCNTPSIRQ[n:0] - EL1 Secure physical timer event
• nCNTHPIRQ[n:0] - EL2 physical timer event
• nCNTVIRQ[n:0] - Virtual timer event
https://developer.arm.com/documentation/ddi0500/e/generic-timer/generic-timer-functional-description
frequency than the main processor CLKIN, the CNTCLKEN input is provided as a clock
enable for the CNTVALUEB bus. CNTCLKEN is registered inside the Cortex-A53 processor
before being used as a clock enable for the CNTVALUEB[63:0] registers. This allows a
multicycle path to be applied to the CNTVALUEB[63:0] bus. Figure 10-1 shows the interface.
Figure 10-1 Architectural counter interface
The value on the CNTVALUEB[63:0] bus is required to be stable whenever the internally
registered version of the CNTCLKEN clock enable is asserted. CNTCLKEN must be
synchronous and balanced with CLK and must toggle at integer ratios of the processor CLK.
See Clocks on page 2-9 for more information about CNTCLKEN.
Each timer provides an active-LOW interrupt output to the SoC.
Table 10-1 shows the signals that are the external interrupt output pins.
[Figure 10-1 labels: Cortex-A53 processor containing a clock gate, the CNTCLKEN register and the architectural counter registers, with CNTVALUEB[63:0] and CNTCLKEN as inputs.]
Table 10-1 Generic Timer signals
• The timer schedules events and triggers
interrupts based on an incrementing
counter value.
• It provides
• Generation of timer events as
interrupt outputs
• Generation of event streams
ARM Cortex A53 – System Counter for Timer
• System counter (in SoC) generates the count value and
distributes to all cores (PEs)
• System counter measures real time and is not
affected by DVFS
• Provides interface to programmers via following frames
• CNTControlBase accessible only in EL3, contains following
registers
• CNTCR – Control Register, contains enable, freq. selection, scaling
selection etc.
• CNTSR – Status Register, reports whether timer is running or not.
• CNTCV – Reports the current count value.
• ...
• CNTReadBase
• This is a copy of CNTControlBase but includes CNTCV register only.
5 External timers
In What is the Generic Timer, we introduced the timers that are in the processor. A syste
timers. The following diagram shows an example of this:
The programming interface for these timers mirrors that of the internal timers, but these
mapped registers. The location of these registers is determined by the SoC implementor,
datasheet for the SoC that you are working with.
Interrupts from the external memory-mapped timers will typically be delivered as Shared
https://developer.arm.com/architectures/learn-the-architecture/generic-timer/what-is-the-generic-timer
ARMv8 – Timer Virtualization
• As with the “shared” UART driver example (slide 20), we could also trap timer
interrupts in the hypervisor. But the timer is at the core of any OS, so this would
add considerable CPU overhead.
• The good news is that ARMv8 allows a vCPU to access the following timers for
its scheduling needs:
• EL1 Non-secure physical timer – Read Only
• Virtual Timer – Read Write
• To deliver the timer interrupt, the GIC (GICv4) must be configured to route
it to a specific vCPU.
• Note: as discussed earlier (slide 29), using LPIs should reduce the interrupt context
switches. This is something I need to research further and confirm.
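One reason the virtual timer can be handed to a vCPU is that the architecture lets the hypervisor apply an offset (CNTVOFF_EL2) between the physical count and the count the guest sees. The sketch below is only a model of that relationship, not hypervisor code; the class and method names are hypothetical.

```python
# Sketch of a per-vCPU virtual timer view (a model, not hypervisor code).
from dataclasses import dataclass

MASK64 = (1 << 64) - 1


@dataclass
class VirtualTimer:
    offset: int            # models CNTVOFF_EL2, programmed by the hypervisor
    compare: int = 0       # models the compare value programmed by the guest
    enabled: bool = False  # models the guest's timer enable bit

    def virtual_count(self, physical_count: int) -> int:
        """Virtual count = physical count minus the hypervisor offset."""
        return (physical_count - self.offset) & MASK64

    def irq_pending(self, physical_count: int) -> bool:
        """The timer fires once the virtual count reaches the compare value."""
        return self.enabled and self.virtual_count(physical_count) >= self.compare


if __name__ == "__main__":
    timer = VirtualTimer(offset=1_000, compare=500, enabled=True)
    for physical in (1_200, 1_500):
        print(physical, timer.virtual_count(physical), timer.irq_pending(physical))
```

With an offset per VM, a guest that was not running for a while need not observe that time on its virtual counter, and it can program its own compare value without a trap.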
ARM Cortex A53 – Clocks and Resets
Clock Tree
• The Cortex-A53 processor has a single
clock input, CLKIN
• RPi3 uses a 1.2 GHz clock
• All cores in the Cortex-A53 and the SCU are
clocked with a distributed version of
CLKIN.
• Clock Tree
• PCLK - APB interface / bus
• ACLK - ACE (extension to AXI) bus, ACP
slave interface
• SCLK - SCU interface only if CHI protocol
is used.
• ATCLK - ATB interface, which can operate
at any integer multiple of main clock
• CNTCLK - 64-bit counter
Reset Inputs
• Cortex-A53 processor has the
following active-LOW reset input
signals
• nCPUPORESET[N:0] - primary, cold reset
signals that initialize all resettable registers
• nCORERESET[N:0] - same as above,
except debug registers and ETM registers
• nPRESETDBG - single, cluster-wide signal that
resets the integrated CoreSight
components that connect to the external
PCLK domain, such as debug logic
• nL2RESET - resets all resettable registers
in L2 memory system and the logic in the
SCU
• nMBISTRESET - an external MBIST
controller can use this signal to reset the
entire SoC.
The clock tree and resets are typically handled by the Host OS. VMs generally don’t modify these.
Linux KVM/ARM implementation
An overview
Linux KVM/ARM
• KVM stands for Kernel-based Virtual
Machine, which can run “unmodified”
guests.
• As discussed in slide 4, this is a derived
type, where the hypervisor is part of an OS
and uses hardware-assisted features. It
doesn’t fall under type 1 or 2.
• As shown in the picture on the right, the KVM
implementation is split into two parts
• Highvisor – runs in EL1
• Lowvisor – runs in EL2, to trap hypervisor
calls and exceptions.
in slow and convoluted code paths. As a simple example, a page
fault handler needs to obtain the virtual address causing the page
fault. In Hyp mode this address is stored in a different register
than in kernel mode.
Second, running the entire kernel in Hyp mode would adversely
affect native performance. For example, Hyp mode has
its own separate address space. Whereas kernel mode uses two
page table base registers to provide the familiar 3GB/1GB split
between user address space and kernel address space, Hyp mode
uses a single page table register and therefore cannot have direct
access to the user space portion of the address space. Frequently
used functions to access user memory would require the kernel
to explicitly map user space data into kernel address space and
subsequently perform necessary teardown and TLB maintenance
operations, resulting in poor native performance on ARM.
These problems with running a Linux hypervisor using ARM
Hyp mode do not occur for x86 hardware virtualization. x86 root
mode is orthogonal to its CPU privilege modes. The entire Linux
kernel can run in root mode as a hypervisor because the same set
of CPU modes available in non-root mode are available in root
mode. Nevertheless, given the widespread use of ARM and the
advantages of Linux on ARM, finding an efficient virtualization
solution for ARM that can leverage Linux and take advantage
[Figure 2: KVM/ARM System Architecture. PL 0 (User): Host User (QEMU) and VM User. PL 1 (Kernel): Host Kernel (KVM Highvisor) and VM Kernel. PL 2 (Hyp): Lowvisor, reached through traps.]
processing required and defers the bulk of the work to be done
to the highvisor after a world switch to the highvisor is complete.
The highvisor runs in kernel mode as part of the host Linux
kernel. It can therefore directly leverage existing Linux functionality
such as the scheduler, and can make use of standard kernel
software data structures and mechanisms to implement its functionality,
such as locking mechanisms and memory allocation
functions. This makes higher-level functionality easier to implement
in the highvisor. For example, while the lowvisor provides
https://www.cs.columbia.edu/~nieh/pubs/asplos2014_kvmarm.pdf
KVM/ARM Linux
• The source tree (picture) on the right shows the files
that realize the Highvisor and Lowvisor described in
the previous slide.
• Most of the file names should be familiar by now.
• It should also give a feel for how simple a hypervisor
implementation can be (~16k lines of code).
<Linux Kernel Src>
├── COPYING
├── CREDITS
├── Documentation
~~~
└── virt
    ├── Makefile
    ├── built-in.a
    ├── kvm
    │   ├── Kconfig
    │   ├── arm
    │   │   ├── aarch32.c
    │   │   ├── arch_timer.c
    │   │   ├── arm.c
    │   │   ├── hyp
    │   │   ├── mmio.c
    │   │   ├── mmu.c
    │   │   ├── perf.c
    │   │   ├── pmu.c
    │   │   ├── psci.c
    │   │   ├── trace.h
    │   │   └── vgic
    │   ├── async_pf.c
    │   ├── async_pf.h
    │   ├── coalesced_mmio.c
    │   ├── coalesced_mmio.h
    │   ├── eventfd.c
    │   ├── irqchip.c
    │   ├── kvm_main.c
    │   ├── vfio.c
    │   └── vfio.h
    ├── lib
    │   ├── Kconfig
    │   ├── Makefile
    │   ├── built-in.a
    │   ├── irqbypass.c
    │   ├── modules.builtin
    │   └── modules.order
    ├── modules.builtin
    └── modules.order
KVM and its suitability for Automotive
• The highvisor of KVM is basically a character device driver.
• From a software component perspective, KVM is a kernel module loaded after
the Linux kernel is initialized.
• Creating, modifying and starting a VM all happen through ioctl() calls from user space.
• Auto-start of a VM is also possible under KVM.
• This means the first virtual machine can start only after the Linux kernel is initialized.
• So a product with an architecture similar to the one in slide 3 will have difficulty
meeting the safety and start-up time needs of Automotive.
• But if we use a safety-certified Linux as the hypervisor (i.e., a kernel configured with minimum
features, ~2 MB in size) on a system with high-speed eMMC or UFS (more than 400 Mbps),
then there is a possibility of meeting the timing and safety requirements of Automotive.
• Note: a 2 MB image read at 400 Mbps gets loaded in 40 ms.
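The 40 ms figure in the note can be checked with a short calculation. This is a sketch under stated assumptions: "2 MB" is taken as 2,000,000 bytes, "Mbps" as megabits per second, and storage latency and protocol overhead are ignored. The function name `load_time_ms` is made up for this example.

```python
# Sketch: time to read a boot image at a given raw transfer rate.
def load_time_ms(image_bytes: int, rate_megabits_per_s: int) -> float:
    """Transfer time in milliseconds, ignoring latency and protocol overhead."""
    image_bits = image_bytes * 8
    return image_bits * 1_000 / (rate_megabits_per_s * 1_000_000)


IMAGE_BYTES = 2 * 1_000_000  # ASSUMPTION: "2 MB" read as decimal megabytes

if __name__ == "__main__":
    print(load_time_ms(IMAGE_BYTES, 400))  # 40.0
```

If the storage figure were instead read as 400 megabytes per second, the same image would transfer in about 5 ms, so the unit matters when budgeting start-up time.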
My view: It is not wise to go down this kind of path for Automotive. Strategically, we need a light-weight Type 1 hypervisor for Automotive.
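To make the "character device plus ioctl()" point concrete, here is a minimal sketch that opens /dev/kvm and asks for the API version. The request numbers follow the definitions in <linux/kvm.h> (KVMIO is 0xAE, KVM_GET_API_VERSION is _IO(KVMIO, 0x00), KVM_CREATE_VM is _IO(KVMIO, 0x01)). The helper names `_io` and `kvm_api_version` are hypothetical, and the `_io` encoding assumes the common ioctl layout used on ARM and x86.

```python
# Sketch: talking to the KVM character device from user space.
import fcntl
import os

KVMIO = 0xAE  # ioctl "type" byte used by KVM in <linux/kvm.h>


def _io(ioc_type: int, nr: int) -> int:
    """Equivalent of the C _IO() macro (no direction bits, no size)."""
    return (ioc_type << 8) | nr


KVM_GET_API_VERSION = _io(KVMIO, 0x00)
KVM_CREATE_VM = _io(KVMIO, 0x01)  # shown for reference, not issued here


def kvm_api_version(path: str = "/dev/kvm"):
    """Return the KVM API version, or None when KVM is not usable here."""
    try:
        fd = os.open(path, os.O_RDWR | os.O_CLOEXEC)
    except OSError:
        return None  # no KVM device, or no permission to open it
    try:
        return fcntl.ioctl(fd, KVM_GET_API_VERSION, 0)
    except OSError:
        return None
    finally:
        os.close(fd)


if __name__ == "__main__":
    print("KVM API version:", kvm_api_version())
```

A VM would then be created by issuing KVM_CREATE_VM on the same file descriptor, which returns a new descriptor for the VM. All of this needs a running Linux kernel and user space first, which is the start-up concern raised above.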
Minos Hypervisor (Type 1)
• Though there are many Type 1 hypervisors, the Minos project sounded
interesting as it supports virtualization on Raspberry Pi.
• Please evaluate it if you get time
• https://github.com/minosproject/minos
• I will provide more details and my views as time permits.
Graphics, Display & Audio
How do current suppliers virtualize these peripherals, which are critical for
Automotive?
Graphics, Display & Audio Virtualization
• Most suppliers & tier 1s will go for a
para-virtualization solution (shown below) for
these devices, as they are fairly complex.
• These solutions add some context-switch
overhead, but keep latency low thanks to
shared memory.
• In future we might see solutions similar
to what ARM has done for DMA
(picture below) for these peripherals
also.
• Alternatively, SoCs can provide more than one
peripheral block, so that each VM can
use one of them as pass-through.
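The frontend / backend split over shared memory can be pictured with a small ring buffer. This is only a sketch in plain Python: the `bytearray` stands in for pages shared between guest and host, and the class name `SharedRing` and the request strings are invented for illustration. It is not how any particular supplier implements it.

```python
# Sketch of a frontend/backend pair exchanging requests through shared memory.
class SharedRing:
    """Single-producer, single-consumer ring kept in a 'shared' bytearray."""

    def __init__(self, slots: int, slot_size: int):
        self.slots = slots
        self.slot_size = slot_size
        self.mem = bytearray(slots * slot_size)  # stands in for the shared pages
        self.head = 0  # next slot the frontend (guest) writes
        self.tail = 0  # next slot the backend (host) reads

    def push(self, payload: bytes) -> bool:
        """Frontend side: copy one request into the ring."""
        if len(payload) > self.slot_size or self.head - self.tail == self.slots:
            return False  # payload too large, or ring full
        start = (self.head % self.slots) * self.slot_size
        self.mem[start:start + self.slot_size] = payload.ljust(self.slot_size, b"\0")
        self.head += 1
        return True

    def pop(self):
        """Backend side: take the oldest request, or None if the ring is empty."""
        if self.tail == self.head:
            return None
        start = (self.tail % self.slots) * self.slot_size
        data = bytes(self.mem[start:start + self.slot_size]).rstrip(b"\0")
        self.tail += 1
        return data


if __name__ == "__main__":
    ring = SharedRing(slots=2, slot_size=8)
    ring.push(b"draw")       # guest application asks for a frame
    print(ring.pop())        # host backend picks the request up
```

The data itself moves without a copy through the hypervisor. The cost is typically in the notification that tells the other side a request is waiting, which is where the context-switch overhead comes from.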
Copyright © 2019 Arm Limited (or its affiliates). All rights reserved.
Non-Confidential
In this system, a hypervisor is using stage 2 to provide isolation between VMs. The ability
to see memory is limited by the stage 2 tables that the hypervisor controls.
Allowing a driver in the VM to directly interact with the DMA controller creates two prob
Isolation: The DMA controller is not subject to the stage 2 tables, and could be used to br
VM’s sandbox.
Address space: With two stages of translation, what the kernel believes to be PAs are IPA
controller still sees PAs, therefore the kernel and DMA controller have different views o
overcome this problem, the hypervisor could trap every interaction between the VM and
controller, providing the necessary translation. When memory is fragmented, this proces
inefficient and problematic.
An alternative to trapping and emulating driver accesses is to extend the stage 2 regime
other masters, like our DMA controller. When this happens, those masters also need an M
referred to as a System Memory Management Unit (SMMU, sometimes also called IOMM
https://developer.arm.com/architectures/learn-the-architecture/armv8-a-virtualization/stage-2-translation
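The address-space problem in the excerpt above can be sketched as a lookup table. The model below is an illustration only: the 4 KB page size, the class name `Stage2Table`, the exception name and the example addresses are all assumptions, and a real SMMU walks translation tables in hardware.

```python
# Sketch: why a DMA master needs the same stage 2 view as the VM.
PAGE_SIZE = 4096  # ASSUMPTION: 4 KB pages, chosen for illustration


class Stage2Fault(Exception):
    """Raised when an IPA has no stage 2 mapping for this VM."""


class Stage2Table:
    def __init__(self, pages: dict):
        self.pages = dict(pages)  # IPA page number -> PA page number

    def translate(self, ipa: int) -> int:
        """Translate an intermediate physical address to a physical address."""
        page, offset = divmod(ipa, PAGE_SIZE)
        if page not in self.pages:
            raise Stage2Fault(hex(ipa))
        return self.pages[page] * PAGE_SIZE + offset


if __name__ == "__main__":
    vm_stage2 = Stage2Table({0x1: 0x80})  # the VM's page 1 lives in PA page 0x80
    buffer_ipa = 0x1010                   # address the guest driver gives to the DMA

    # Without an SMMU the DMA controller would use 0x1010 as a PA.
    # With an SMMU applying the VM's stage 2 tables it reaches the right page:
    print(hex(vm_stage2.translate(buffer_ipa)))
```

An access to an IPA outside the table raises a fault in this model, which mirrors the isolation property: the device cannot reach memory that the hypervisor did not map for that VM.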
[Figure: para-virtualized GPU / display / sound. Labels: SoC, Hypervisor, Shared Memory; Host OS with Driver(s), GPU/Display/Sound Backend Server and Graphics / Sound Application; Guest OS with Frontend Client Driver and Graphics / Sound Application.]
Acronyms – 1 / 3
• ACE – an extension to AXI protocol
• ACLK – ACE (extension to AXI bus) Clock
• ACP – Accelerator Coherency Ports for AXI slaves
• APB – Advanced Peripheral Bus for slower speed interface by ARM
• App - Software Application (e.g. Calculator App in Android)
• ASID – Address Space Identifier
• ATB – Interface for Trace by ARM
• ATCLK – ATB Clock
• AXI – Advanced eXtensible Interface by ARM
• BCM – Broadcom (maker of the SoC used in Raspberry Pi)
• CHI – a scalable protocol supporting multi-node interconnect
• CLKIN – Clock In (main processor clock, 1.2 GHz)
• CNTCLK – Timer / Counter Clock
• CNTCLKEN – Counter Clock Enable
• CNTCR – Timer Control Register
• CNTCV – Timer Count Value
• CNTHPIRQ – EL2 physical timer event pin
• CNTPNSIRQ – EL1 Non-secure physical timer event pin
• CNTPSIRQ - EL1 Secure physical timer event pin
• CNTSR – Timer Status Register
• CNTVALUE – 64bit counter value
• CNTVIRQ - Virtual timer event pin
• CONTEXTIDR – Context ID Register (identifies current process ID and ASID)
• CORERESET – Core reset (debug and ETM registers are preserved)
• CPU – Central Processing Unit
• CPUPORESET – CPU Power On Reset
• CTI – CoreSight Trigger Interface (Debug & Trace)
• CTLR.DS – Control Register -> Disable Security bit
• CTM – CoreSight Trigger Matrix (Debug & Trace)
• DDR – Double Data Rate
• DFT – Design For Test
• DMA – Direct Memory Access (the unit that offloads data copies from the CPU)
Acronyms – 2 / 3
• DTS – Device Tree Source
• DVFS – Dynamic Voltage and Frequency Scaling
• ECC – Error Correction Code
• ECU – Electronic Control Units in Cars
• ELx – Exception Level ‘x’ [x: 0 to 3 for ARMv8]
• ERET – Exception Return
• ESR – Exception Syndrome Register
• FIQ – Fast Interrupt Request (takes higher priority than IRQ)
• FPU – Floating Point Unit
• GIC – Generic Interrupt Controller
• GICD - GIC Distributor
• GPU – Graphics Processing Unit
• HCR – Hypervisor Configuration Register
• HPFAR – Hyp IPA Fault Address Register. Holds the faulting IPA for some
aborts on stage 2 translation.
• HVC – Hypervisor Call
• I/O – Inputs and / or Outputs
• INTID – Interrupt Identifier
• IOMMU - I/O MMU, same as SMMU.
• IPA - Intermediate Physical Address
• IRQ – Interrupt Request (from I/O to CPU)
• ITS – Interrupt Translation Service that injects Interrupt directly to VMs.
• IVI – In-Vehicle Infotainment unit.
• KVM – Kernel-based Virtual Machine
• L1 – Level 1
• L2RESET – L2 Memory system reset
• LDR – Load Register (loads a register from memory)
• log2 – Binary Logarithm
• LPI - Locality-specific Peripheral Interrupt, an interrupt that uses the ITS
• LR – Link Register
• MBIST – Memory Built In Self Test
• MBISTRESET – reset signal that an external MBIST controller can use to reset the SoC
• MMU – Memory Management Unit
• NDA – Non Disclosure Agreement
Acronyms – 3 / 3
• OS – Operating Systems
• PA – Physical Address
• PC – Program Counter
• PCLK – Peripheral Clock (clock of the APB interface)
• PE – Processing Element
• PMU – Performance Monitoring Unit
• PoC – Proof of Concept
• PPI - Private Peripheral Interrupt, targets single specific PE
• PRESETDBG - single, cluster-wide signal that resets the integrated CoreSight debug components
• QNX – QNX Operating System
• RAM – Random Access Memory
• ROM – Read Only Memory
• SCLK – SCU Clock
• SCU – Snoop Control Unit that maintains cache coherency
• SGI – Software Generated Interrupts, generated by PEs.
• SMMU – System Memory Management Unit (the MMU for peripherals)
• SoC – System on Chip
• SPI – Shared Peripheral Interrupt
• SRAM – Static Random Access Memory
• STR - Store Register (stores a register to memory)
• TLB – Translation Lookaside Buffer
• TTBRx – Translation Table Base Register x [x: 0 or 1]
• UART – Universal Asynchronous Receiver/Transmitter. Serial Comms.
• VA – Virtual Address
• vCPU – Virtual CPU
• VM – Virtual Machine
• VMID – Virtual Machine Identifier
• VMSA – Virtual Memory System Architecture
• VTTBR – Virtualization Translation Table Base Register
• WFI – Wait For Interrupt
Thank you!
Please send your feedback, comments to c.n.aananth@gmail.com

Virtualization Support in ARMv8+

  • 1. Virtualization Support in ARMv8+ Aananth C N c.n.aananth@gmail.com Version 1.3, 24 Oct 2020
  • 2. Agenda
      • To provide a simplified view of virtualization for automotive ECUs.
      • To understand and compare the different solutions available.
      • To share this knowledge, so that this drop in the ocean joins with other drops and eventually quenches the thirst of some good souls in the universe!
      Note: All contents, pictures etc. are based either on what is already published on the web or on my own experience, learning and creations. My intent is not to violate any copyrights or NDA content. Please let me know if any violation has occurred.
  • 3. What is Virtualization?
      • An Operating System (OS) abstracts the hardware from its applications. Virtualization abstracts the hardware from one or more OSes.
      • In the automotive world, virtualization is about abstracting operating systems, applications, vehicle networks, displays, audio systems etc. away from the hardware.
  • 4. Virtualization Types
      • Fundamental types
          • Type 1: Full virtualization, where the hypervisor takes control of the hardware and hosts the guest OSes, and the guests are completely unaware of running in a virtualized environment.
          • Type 2: Para-virtualization, where one of the operating systems (called the Host OS) takes charge of the hardware and the guest OS is modified to connect with either the Host OS or the hardware devices.
      • Derived types
          • Hardware-assisted virtualization: the virtualization solution uses the support provided by the hardware to realize the virtualization goals.
              • Example: Linux/KVM falls under this category. We will see this in detail later.
          • Hybrid types: the virtualization is realized by combining other types.
              • For example, the core virtualization functions are realized using a Type 1 hypervisor, and peripheral / device virtualization is done using Type 2 or other types. Graphics and display virtualization, for instance, use a server in the Host OS and clients running in the Guest OSes. We will see this in detail later.
          • ... and many more
      [Figure: Type 1 stack (Hardware, Hypervisor, OS 1 and OS 2) next to Type 2 stack (Hardware, Host OS with Apps and Hypervisor, Modified Guest OS)]
  • 5. Stop all old stories! How to realize Virtualization?
      • System virtualization involves the following functions:
          • Virtualization of CPU cores or Processing Elements
          • Virtualization of memory and memory management
          • Virtualization of interrupts
          • Virtualization of timers
          • I/O or peripheral virtualization
      • To get a better understanding of the virtualization functions listed above, we need some example hardware, such as the Raspberry Pi 3 (ARM Cortex A53).
          • Raspberry Pi is chosen because it is the most open and common hardware available.
  • 6. Overview of ARM Cortex A53 - CPU cores - Exception Levels of ARMv8 - Memory management - Memory Mapped I/O - Interrupts - Timers, Clocks, Resets
  • 7. ARM Cortex A53 – CPU core hardware blocks
      • 4 CPU cores, each with
          • Timer block
          • Interrupt block
      • Each core includes
          • NEON coprocessor
          • FPU
          • Crypto extensions
          • L1 cache [, L2 cache]
      • Debug & trace
          • Trace block
          • Debug block
      • ACP - Accelerator Coherency Port for AXI slaves
      • Master memory interface
      • Power management interface
      • Test interface
      From the Arm documentation: The Cortex-A53 processor is a mid-range, low-power processor that implements the ARMv8-A architecture. The Cortex-A53 processor has one to four cores, each with an L1 memory system and a single shared L2 cache. Figure 1-1 shows an example of a Cortex-A53 MPCore configuration with four cores and either an ACE or a CHI interface.
      [Figure 1-1: Example Cortex-A53 processor configuration. Core 0 plus optional Cores 1 to 3, each with timer, interrupt, debug and trace blocks; optional ACP (AXI slave interface); ACE or CHI master interface; power management, test (DFT, MBIST), APB debug, clocks, resets and configuration signals]
      Ref: https://developer.arm.com/documentation/ddi0500/e/introduction/about-the-cortex-a53-processor
  • 8. ARM Cortex A53 – CPU Functional Blocks
      • APB – Advanced Peripheral Bus, slow compared to AXI
      • CTM – CoreSight Trigger Matrix (Debug & Trace)
      • CTI – CoreSight Trigger Interface (Debug & Trace)
      • GIC – Generic Interrupt Controller
      • SCU – Snoop Control Unit, which maintains cache coherency
      • ACE – an extension to the AXI protocol
      • CHI – a scalable protocol supporting multi-node interconnect
      • ACP – an AMBA 4 AXI slave interface
      [Figure 2-1: Cortex-A53 processor block diagram. Four cores, each with L1 ICache, L1 DCache, FPU and NEON extension, Crypto extension, debug and trace, and a per-core governor (arch timer, GIC CPU interface, clock and reset, CTI, retention control, debug over power down); a Level 2 memory system with L2 cache, SCU, ACE/AMBA 5 CHI master bus interface and ACP slave; plus a shared governor with APB decoder, APB ROM, APB multiplexer and CTM]
      Ref: https://developer.arm.com/documentation/ddi0500/d/functional-description/about-the-cortex-a53-processor-functions?lang=en
  • 9. CPU Virtualization – Virtual CPU Cores (vCPU)
      • A Raspberry Pi 3 has 4 physical cores or CPUs (refer to the previous slide).
      • A vCPU is basically a time slot of a physical CPU.
          • Note: ARM uses the term “vPE” (virtual Processing Element).
      • There can be a 1-to-1 or a many-to-1 relation between vCPUs and a real CPU core.
      • For the purpose of understanding, let us imagine a single-core ARMv8 processor. If we schedule 2 vCPUs on it (as shown in the picture), then this system has a 2 vCPU to 1 real CPU relationship.
      [Figure: example system with a single-core ARMv8 processor and a hypervisor that hosts two virtual CPUs (VM 1 and VM 2)]
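The many-to-1 relation can be pictured as a simple round-robin scheduler. The sketch below is only an illustration under assumed names (`run_round_robin` and the vCPU labels are made up for this example); a real hypervisor scheduler also deals with priorities, blocking and time-slice lengths.

```python
from collections import deque


def run_round_robin(vcpus, slots):
    """Return which vCPU occupies the single physical core in each time slot."""
    ready = deque(vcpus)
    timeline = []
    for _ in range(slots):
        current = ready.popleft()   # hypervisor picks the next runnable vCPU
        timeline.append(current)    # that vCPU runs for one time slot
        ready.append(current)       # then it goes to the back of the queue
    return timeline


timeline = run_round_robin(["vCPU0 (VM 1)", "vCPU1 (VM 2)"], slots=4)
print(timeline)
```

With two vCPUs and four slots, each vCPU gets every other slot of the one physical core.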
  • 10. ARMv8 – Switching to Hypervisor context
      • ARM supports the WFI instruction to put the CPU in a low-power state.
      • But when the HCR_EL2.TWI bit is set and either an application or the OS executes WFI, the instruction is trapped and the CPU switches to the Hypervisor context. The hypervisor can then schedule a different vCPU.
      • ARM also supports the ‘HVC #[0-65535]’ instruction, which can be called from the OS context to switch to the Hypervisor context.
          • Note: ‘HVC #imm’ is undefined in the application context.
      • Note: Traps are not just for virtualization. There are EL3 and EL1 controlled traps too.
      Ref: https://developer.arm.com/architectures/learn-the-architecture/armv8-a-virtualization/trapping-and-emulation-of-instructions
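The WFI trap decision can be sketched as follows. HCR_EL2.TWI is bit 13 of the register; everything else here (`on_wfi`, the action strings, the vCPU names) is a hypothetical model for illustration and not hypervisor code.

```python
HCR_EL2_TWI = 1 << 13  # when set, WFI executed at EL0/EL1 traps to EL2


def on_wfi(hcr_el2, current_vcpu, other_runnable):
    """Return (next_vcpu, action) when current_vcpu executes WFI."""
    if not hcr_el2 & HCR_EL2_TWI:
        # No trap configured: the physical core really goes to sleep.
        return current_vcpu, "core enters low-power state"
    if other_runnable:
        # Trap taken: the hypervisor uses the idle time for another vCPU.
        return other_runnable[0], "trapped to EL2, switched vCPU"
    return current_vcpu, "trapped to EL2, nothing else to run"


print(on_wfi(HCR_EL2_TWI, "vCPU0", ["vCPU1"]))
```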
  • 11. Memory & Memory Management
  • 12. ARM Cortex A53 (Raspberry Pi3) Memory Map
      • The picture on the right shows the physical and virtual memory addresses of the RPi3 B.
      • Physical memory map
          • RAM (1 GB, from “bcm2837-rpi-3-b.dts”): memory { reg = <0 0x40000000>; }
          • Memory Mapped I/O: see the map for I/O Peripherals on the right.
      • Virtual memory map
          • User space: 0x0000 0000 to 0xBFFF FFFF
          • Kernel space: 0xC000 0000 to 0xFFFF FFFF
      [Figure: physical address map (DDR2 from 0x0000 0000, I/O Peripherals from 0x4000 0000) and virtual address map (user space from 0x0000 0000, 32-bit split at 0xC000 0000, up to 0xFFFF FFFF), connected through the ARM MMU]
  • 13. ARM Cortex A53 – MMU
      • ARM64 instructions such as LDR/STR and registers such as PC/LR all use virtual addresses.
          • This means any address or pointer in an application program is a virtual address.
      • MMU
          • Sits between the CPU and the DDR controller (see next slide).
          • Translates virtual addresses to physical addresses. Once configured, translation is done in hardware, and repeated translations are served from the TLB (see the TLB slides).
          • Address translation granules (page sizes) of 4KB (AArch32 & AArch64) and 64KB (AArch64 only).
          • 16-bit ASID (AArch32 uses 8-bit), used in the TLB (see the TLB slides).
          • Max supported physical address size = 40 bits = 2^40 bytes = 1024 GB (1 TB).
          • Provides fine-grained control through virtual-to-physical address mappings and memory attributes held in page tables, which are loaded into the TLB (translation lookaside buffer).
          • Generates an exception if any access violation happens.
  • 14. ARM Cortex A53 – MMU mapping with Example
      • Let us take an example application that needs 16k of RAM (.text + .bss + .data and others).
      • As soon as the application is spawned, assume that the OS allocates 16k at virtual address 0x80000000.
      • As the page size is 4k, the OS allots 4 pages that are contiguous in virtual address space, as shown in the table below:

      Page   VA: start   VA: end    PA: start   PA: end
      1      80000000    80000FFF   00017000    00017FFF
      2      80001000    80001FFF   00029000    00029FFF
      3      80002000    80002FFF   00003000    00003FFF
      4      80003000    80003FFF   0004F000    0004FFFF

      This table illustrates the MMU lookup table. Note that the physical addresses in the last 2 columns are not contiguous.
      [Figure: the 4 virtual pages starting at 0x8000_0000 mapped to scattered pages in physical memory]
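The lookup in the table above can be sketched as a flat dictionary keyed by virtual page number. This is a deliberately simplified model (real ARMv8 page tables are multi-level and carry attributes); `PAGE_TABLE` and `translate` are names invented for this illustration, and the values are taken from the example table.

```python
PAGE_SIZE = 4096  # 4 KB granule

# Virtual page number -> physical page number, from the example table.
PAGE_TABLE = {
    0x80000: 0x00017,
    0x80001: 0x00029,
    0x80002: 0x00003,
    0x80003: 0x0004F,
}


def translate(va):
    """Translate a virtual address, or raise if the page is not mapped."""
    vpn, offset = divmod(va, PAGE_SIZE)
    if vpn not in PAGE_TABLE:
        raise MemoryError(f"access violation at VA {va:#010x}")
    return PAGE_TABLE[vpn] * PAGE_SIZE + offset


print(hex(translate(0x80001234)))
```

An address in page 2 keeps its 12-bit offset (0x234) and only the page number is swapped, so 0x80001234 lands in the physical page starting at 0x00029000.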
  • 15. ARM Cortex A53 – Translation Lookaside Buffer (TLB)
      • Based on the example in the previous slide, assume that for every 5 to 10 instructions the CPU has to read a look-up table that is stored in external DDR memory. Do you think the CPU will be efficiently utilized? No.
      • What is the TLB?
          • The TLB is a ‘memory cache’ that contains recent Virtual Address (VA) to Physical Address (PA) translations. This saves CPU time for translations that are used often.
          • In case of a TLB miss, the address translation info is generally fetched from the look-up table and the TLB is updated.
      • The TLB is organized as 4 major blocks:
          • Micro TLB
              • First level; 10 entries each for the data side and the instruction side.
          • Main TLB
              • Second level of the TLB structure, which handles the misses from the micro TLBs.
              • Supports all VMSAv8 (Virtual Memory System Architecture) block sizes except 1 GB.
          • IPA cache RAM
              • The intermediate physical address (IPA) cache RAM holds mappings between intermediate physical addresses and physical addresses.
              • Only Non-secure EL1 and EL0 “stage 2” translations use this cache.
          • Walk cache RAM
              • Holds the result of a stage 1 (OS controlled) translation.
              • If a stage 1 translation results in a section or larger mapping, then nothing is placed in the walk cache RAM.
      Note: The IPA is part of the “stage 2” address translation, i.e., the hypervisor controlled address translation. It will be discussed later.
      [Figure: SoC with CPU registers, MMU with TLB entries (va ... pa), DDR controller and DDR memory]
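The benefit of caching translations can be shown with a toy TLB that counts hits and misses. This is a sketch under stated assumptions: the class name `TinyTlb` is invented, the capacity of 10 only echoes the micro TLB size mentioned above, and the least-recently-used replacement is an arbitrary choice, since the real replacement policy is implementation specific.

```python
from collections import OrderedDict

PAGE_SIZE = 4096


class TinyTlb:
    """Toy TLB in front of a flat page table (virtual page -> physical page)."""

    def __init__(self, page_table, capacity=10):
        self.page_table = page_table
        self.capacity = capacity
        self.entries = OrderedDict()
        self.hits = 0
        self.misses = 0

    def translate(self, va):
        vpn, offset = divmod(va, PAGE_SIZE)
        if vpn in self.entries:
            self.hits += 1
            self.entries.move_to_end(vpn)             # mark as recently used
        else:
            self.misses += 1
            self.entries[vpn] = self.page_table[vpn]  # slow path: table walk in DDR
            if len(self.entries) > self.capacity:
                self.entries.popitem(last=False)      # evict least recently used
        return self.entries[vpn] * PAGE_SIZE + offset


tlb = TinyTlb({0x80000: 0x00017, 0x80001: 0x00029})
for va in (0x80000000, 0x80000004, 0x80000008, 0x80001000):
    tlb.translate(va)
print(tlb.hits, tlb.misses)
```

Only the first access to each page pays for a table walk; the following accesses to the same page are hits.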
  • 16. ARM Cortex A53 – TLB matching and Cache handling
      TLB match process
      • Each TLB entry contains a VA, block size, PA and a set of memory properties (type, access permissions, ...).
      • Each entry is associated with a particular ASID and contains a field to store the VMID.
      • A TLB entry match occurs when the following conditions are met:
          • Its VA bits [47:N] match those of the requested address, where N is log2(page size), e.g. 12 for a 4k page.
          • Its memory space matches the memory space state of the request. The memory space can be one of four values:
              • Secure EL3 (AArch64)
              • Non-secure EL2
              • Secure EL0, EL1 (and EL3 in AArch32)
              • Non-secure EL0 or EL1
          • The ASID matches the current ASID held in CONTEXTIDR, TTBR0 or TTBR1, or the entry is marked as global.
          • The VMID matches the current VMID held in the VTTBR register.
      Data cache coherency
      • Uses the MOESI protocol to maintain data coherency between multiple cores.
          • M - Modified - The line is in only this cache and is dirty (Unique Dirty)
          • O - Owned - The line is possibly in more than one cache and is dirty (Shared Dirty)
          • E - Exclusive - The line is in only this cache and is clean (Unique Clean)
          • S - Shared - The line is possibly in more than one cache and is clean (Shared Clean)
          • I - Invalid - The line is not in this cache
      [Figure: SoC data path from CPU registers through MMU + TLB (virtual address to physical address), L1 cache and L2 cache to DDR memory]
      Key takeaway: Memory is already virtualized on a single OS. Virtualizing it for more than one OS is done by adding stage-2 address translation, which we will discuss soon.
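The match conditions listed above can be written as one boolean expression. The sketch below is a simplified model with invented names (`TlbEntry`, `matches`); in particular it applies the VMID check to every entry, whereas in hardware the VMID only matters for the Non-secure EL1/EL0 regime.

```python
from dataclasses import dataclass


@dataclass
class TlbEntry:
    vpn: int          # VA bits [47:N] for this entry
    asid: int         # address space ID of the owning process
    vmid: int         # virtual machine ID of the owning VM
    is_global: bool   # global entries ignore the ASID
    space: str        # e.g. "NS-EL1/0", "NS-EL2", "S-EL1/0", "S-EL3"


def matches(entry, va, asid, vmid, space, page_shift=12):
    """True when the entry can be used for this request."""
    return (
        entry.vpn == va >> page_shift
        and entry.space == space
        and (entry.is_global or entry.asid == asid)
        and entry.vmid == vmid
    )


entry = TlbEntry(vpn=0x80001, asid=7, vmid=1, is_global=False, space="NS-EL1/0")
print(matches(entry, 0x80001234, asid=7, vmid=1, space="NS-EL1/0"))
```

Because the ASID and VMID are part of the match, entries of different processes and different VMs can coexist in the TLB without being flushed on every switch.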
  • 17. ARMv8 stage 2 translation, MMIO & SMMU
  • 18. ARMv8 Stage2 Address Translation
      • Allows the Hypervisor to control which memory mapped system resources a VM can access and how they appear within the VM.
      • It can be used to ensure that the VM sees only the regions allocated to it.
      • In short, the OS controlled translation is called stage 1 translation and the Hypervisor controlled translation is called stage 2 translation.
      From the Arm guide: For memory address translation, stage 2 translation is a second stage of translation. To support this, a new set of translation tables, known as Stage 2 tables, are required. An Operating System (OS) controls a set of translation tables that map from the virtual address space to what it thinks is the physical address space. However, this process undergoes a second translation into the real physical address space. This second stage is controlled by the hypervisor. The address space that the OS thinks is physical memory is referred to as the Intermediate Physical Address (IPA) space.
      [Figure: virtual address space of the OS or VM, mapped by the OS to the IPA space, mapped by the Hypervisor to the RPi3 physical address space]
      Ref: https://developer.arm.com/architectures/learn-the-architecture/armv8-a-virtualization/stage-2-translation
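The two stages can be sketched by chaining two lookups, VA to IPA and then IPA to PA. The tables, addresses and names below (`STAGE1`, `STAGE2`, `TranslationFault`) are made up for illustration; real tables are multi-level structures in memory.

```python
PAGE_SIZE = 4096


class TranslationFault(Exception):
    """Raised when a stage has no mapping; records which stage faulted."""

    def __init__(self, stage, addr):
        super().__init__(f"stage {stage} fault at {addr:#x}")
        self.stage = stage
        self.addr = addr


def walk(table, addr, stage):
    page, offset = divmod(addr, PAGE_SIZE)
    if page not in table:
        raise TranslationFault(stage, addr)
    return table[page] * PAGE_SIZE + offset


def translate(va, stage1, stage2):
    ipa = walk(stage1, va, stage=1)    # OS controlled: VA -> IPA
    return walk(stage2, ipa, stage=2)  # hypervisor controlled: IPA -> PA


STAGE1 = {0x80000: 0x40000}  # guest OS page table (hypothetical values)
STAGE2 = {0x40000: 0x12345}  # hypervisor table for this VM (hypothetical values)
print(hex(translate(0x80000ABC, STAGE1, STAGE2)))
```

If the hypervisor leaves an IPA page out of the stage 2 table, the guest access faults at stage 2 no matter what the guest's own stage 1 table says, which is how the VM is confined to its allocated regions.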
  • 19. ARMv8 - Virtual Peripheral Emulation using MMU
      • There are 2 ways you can assign a peripheral to a VM:
          • Pass-through or “Assigned”
          • Shared or “Virtual”
      • Assigned peripheral – the physical device is fully assigned to one VM.
      • Virtual peripheral – the device is shared between 2 or more VMs. A stage-2 fault is generated to trap each access, and the access is emulated in the Hypervisor.
      • Why a stage-2 fault? Because a stage-1 fault reports the virtual address used by the OS, which is meaningless to the hypervisor, so it cannot decide which peripheral it needs to emulate.
          • Instead, the hypervisor can read the HPFAR_EL2 register to get the faulting IPA, which identifies the specific peripheral.
      From the Arm guide: The VM can use peripheral regions to access both real physical peripherals, which are often referred to as directly assigned peripherals, and virtual peripherals. Virtual peripherals are completely emulated in software by the hypervisor.
  • 20. ARMv8 – Trapping and Emulation of Virtual Peripherals
      • Let us take a “shared” UART (serial comm.) as an example.
      • An app running on a vCPU (VM) tries to read data from the UART.
      • Since it is not a pass-through device, the read creates a stage 2 fault and the context switches to the Hypervisor.
      • The Hypervisor reads the HPFAR_EL2 register to find out which peripheral the VM was trying to access.
      • It then emulates the read operation and returns the result to the VM.
      • Note that this single read results in 2 context switches.
      From the Arm guide: Exception Model shows how the ESR_ELx registers report information about the exception. For single general-purpose register loads or stores that trigger a stage 2 fault, additional syndrome information is provided. This information includes the size of the accesses and the source or destination register, and allows a hypervisor to determine the type of access that is being made to the virtual peripheral. The process of trapping then emulating the access is described in these steps: 1. Software in the VM attempts to access the virtual peripheral. In this example, this is the receive FIFO of a virtual UART. 2. This access is blocked at stage 2 translation, leading to an abort routed to EL2.
      Ref: https://developer.arm.com/architectures/learn-the-architecture/armv8-a-virtualization/trapping-and-emulation-of-instructions
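The trap-and-emulate flow for the shared UART read can be sketched as below. The register layout used is architectural (HPFAR_EL2 bits [39:4] hold IPA bits [47:12], and the low 12 bits come from FAR_EL2), but the virtual UART itself is an assumption: its base IPA, its register offset and the function names are invented for this example.

```python
from collections import deque

VUART_BASE_IPA = 0x0900_0000   # hypothetical IPA of the virtual UART
VUART_DATA_REG = 0x00          # hypothetical offset of the receive/data register
rx_fifo = deque(b"Hi")         # bytes the hypervisor has queued for the guest


def faulting_ipa(hpfar_el2, far_el2):
    """Rebuild the IPA: page bits from HPFAR_EL2, page offset from FAR_EL2."""
    page = ((hpfar_el2 >> 4) & ((1 << 36) - 1)) << 12
    return page | (far_el2 & 0xFFF)


def handle_stage2_read(hpfar_el2, far_el2):
    """Return the value the hypervisor would place in the guest's destination register."""
    offset = faulting_ipa(hpfar_el2, far_el2) - VUART_BASE_IPA
    if offset == VUART_DATA_REG:
        return rx_fifo.popleft() if rx_fifo else 0
    raise LookupError("no virtual peripheral at this IPA")


hpfar = (VUART_BASE_IPA >> 12) << 4   # what the hardware would report for this fault
value = handle_stage2_read(hpfar, far_el2=0xFFFF_0000_1234_5000)
print(chr(value))
```

After the emulated read, a real hypervisor would write `value` into the guest register named in the syndrome information, skip the faulting instruction and return to the VM.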
  • 21. ARMv8 – System Memory Management Units (SMMU)
      • In the “shared” UART peripheral example discussed in the previous slide, what will happen if we need to use DMA for the UART?
      • There are 2 problems (see Fig-A):
          • Isolation: the DMA controller is not subject to the stage 2 tables, so VMs (guests) cannot be isolated. The address space would have to be shared to make DMA work for more than one VM.
          • Address space: the guest translates addresses to IPAs, and the UART driver in the unmodified guest believes those IPAs are PAs. But the DMA controller operates on PAs. To fix this IPA <--> PA mismatch in software, the hypervisor has to trap every interaction with the DMA controller, which defeats the original purpose of using DMA.
      • To overcome these problems, ARM has come up with the SMMU (sometimes called IOMMU, see Fig-B).
          • The SMMU extends the stage 2 translation to other masters such as the DMA controller, so the guest driver can program the DMA controller with IPAs.
          • This means that if 2 VMs want to do DMA operations, each VM provides IPAs from its own address space to the DMA controller, so isolation is maintained.
          • During the copy operations, the SMMU translates the IPAs to PAs before the access goes to DDR (via the Interconnect box shown in Fig-B).
          • The hypervisor is responsible for programming the SMMU so that the DMA controller sees the same view of memory as the VM.
      Fig – A: DMA access without SMMU
      Fig – B: DMA access with SMMU
      Ref: https://developer.arm.com/architectures/learn-the-architecture/armv8-a-virtualization/stage-2-translation
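The role of the SMMU can be sketched as a per-device stage 2 lookup. Everything in this model is an assumption made for illustration: the stream IDs, the table contents and the function name `smmu_translate` are invented, and a real SMMU is configured through in-memory stream tables and translation tables.

```python
PAGE_SIZE = 4096

# Hypothetical stage 2 tables (IPA page -> PA page), one per VM.
STAGE2 = {
    "VM1": {0x40000: 0x80000},
    "VM2": {0x40000: 0x90000},
}
# Hypothetical binding of DMA stream IDs to VMs, programmed by the hypervisor.
STREAM_TO_VM = {1: "VM1", 2: "VM2"}


def smmu_translate(stream_id, ipa):
    """Translate an address issued by a DMA master using its VM's stage 2 table."""
    table = STAGE2[STREAM_TO_VM[stream_id]]
    page, offset = divmod(ipa, PAGE_SIZE)
    if page not in table:
        raise PermissionError(f"stream {stream_id}: IPA {ipa:#x} is outside the VM")
    return table[page] * PAGE_SIZE + offset


print(hex(smmu_translate(1, 0x40000010)), hex(smmu_translate(2, 0x40000010)))
```

Both guests program the same IPA, yet the accesses end up in different physical pages, and an IPA outside a VM's table is rejected instead of reaching memory.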
  • 22. ARMv8 Exception Levels & Secure States
  • 23. ARMv8 Exception Levels
      • The ARMv8 model defines 4 exception levels:
          • EL0 – least privileged (applications)
          • EL1 – increased privilege (OS)
          • EL2 – Hypervisor mode
          • EL3 – highest privilege, Secure monitor mode
      • On processor reset (power on reset), the system enters EL3.
      • On taking an exception, the exception level either increases or remains the same. It does not decrease.
      • On return from an exception, the exception level either decreases or remains the same.
      • Every exception level has its own stack pointer. The boot loader or the initialization part of the operating system software has to set up these registers for all exception levels.
      [Figure 3-1: ARMv8 security model when EL3 is using AArch64. Non-secure state: App1 and App2 at EL0, Guest OS1 and Guest OS2 at EL1, Hypervisor at EL2. Secure state: Secure App1 and Secure App2 at EL0, Secure OS at EL1. Secure monitor at EL3. Lower levels may use AArch32 or AArch64; AArch64 is permitted at a level only if the next higher level is using AArch64]
      Ref: https://developer.arm.com/documentation/100095/0003/programmers-model/armv8-a-architecture-concepts/armv8-security-model
  • 24. ARMv8 Security States & Virtualization Support
      • Secure state
          • Can access both the secure and the non-secure memory spaces.
          • When executing at EL3, it can access all system control resources.
      • Non-secure state
          • Can only access the non-secure memory space.
          • Cannot access the secure system control resources.
      • Virtualization support
          • Software running at EL2 or higher has access to several controls for virtualization:
              • Stage 2 translation
              • EL1/0 instruction and register access trapping
              • Virtual exception generation
      [Figure: Exception Levels (ELs) in Non-secure and Secure states, with Secure EL2 shown in gray]
      Ref: https://developer.arm.com/architectures/learn-the-architecture/armv8-a-virtualization/virtualization-in-aarch64
  • 26. ARM Cortex A53 – Interrupt Controller GICv4
      • Interrupt sources
          • Message-based interrupts are generated by a memory write to an assigned address.
          • Wire-based interrupts are generated by peripherals such as UART or I2C via I/O pins.
          • SPI – Shared Peripheral Interrupts, which can be either message-based or wire-based. Can be routed to any PE configured to handle interrupts.
          • PPI – Private Peripheral Interrupt, targets a single, specific PE (Processing Element).
          • LPI – Locality-specific Peripheral Interrupts use the ITS (Interrupt Translation Service) to route an interrupt to a specific Redistributor and PE.
          • SGI – Software Generated Interrupts, generated by PEs.
      • Distributor
          • Performs interrupt prioritization and distribution of SPIs and SGIs to the Redistributors and CPU interfaces.
      • Redistributor (red box)
          • Holds the control, prioritization and pending information for all physical LPIs using data structures that are held in memory. Provides a programming interface for:
              • Enabling / disabling SGIs and PPIs, and setting priority levels for SGIs and PPIs
              • Setting each PPI to be level sensitive or edge triggered
              • ...
      • CPU interface (blue box)
          • Provides the register interface to the PE. Provides a programming interface for:
              • Control and configuration, to enable interrupt handling in accordance with the security state and legacy support requirements of the implementation
              • Acknowledging an interrupt
              • Performing a priority drop
              • Deactivating an interrupt
              • ...
      [Figure 3-2: GIC logical partitioning with an ITS. Wire-based and message-based SPIs enter the Distributor, message-based LPIs enter the ITS; each cluster (C0 to Cn) has a Redistributor and a CPU interface per PE, with PPIs delivered to the Redistributors and SGIs routed between PEs]
      Notes from the figure: The inclusion of an ITS is optional, and there might be more than one ITS in a GIC. SGIs are generated by a PE and routed through the Distributor. The mechanisms for communication between the ITS and the Redistributors, and between the CPU interfaces and the Redistributors, are IMPLEMENTATION DEFINED.
      Ref: https://static.docs.arm.com/ihi0069/c/IHI0069C_gic_architecture_specification.pdf, section 3.1
  • 27. ARM Cortex A53 – Interrupt Lifecycle & Interrupt numbers
      [Figure 4-1: Physical interrupt lifecycle. Generate (a device generates an interrupt), Distribute, Deliver (the CPU interface delivers the interrupt to the PE), Activate (the PE acknowledges the interrupt), Priority drop, Deactivation (the PE ends the interrupt). The deactivation step does not apply to LPIs]

      INTID                                Interrupt type             Details
      0 - 15                               SGI                        Local to the CPU interface.
      16 - 31                              PPI                        Local to the CPU interface.
      32 - 1019                            SPI                        Shared peripheral interrupts that the Distributor can route to either a specific PE, or to any one of the PEs in the system that is a participating node.
      1020 - 1023                          Special interrupt number   See the list below.
      1024 - 8191                          Reserved                   -
      8192 - (implementation defined)      LPI                        Peripheral hardware interrupts that are routed (directly) to a specific PE.

      • INTIDs 0 - 1023 are compatible with earlier versions of the GIC architecture.
      • Special interrupt numbers:
          • 1020 – the GIC returns this at EL3 for an interrupt that is to be handled at Secure EL1
          • 1021 – the GIC returns this at EL3 for an interrupt that is to be handled at Non-secure EL1
          • 1022 – legacy operations only
          • 1023 – the GIC returns this in response to an interrupt acknowledge, or if there are errors handling the interrupt
      Ref: https://static.docs.arm.com/ihi0069/c/IHI0069C_gic_architecture_specification.pdf, section 4.1
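The INTID ranges in the table translate directly into a small classifier. The function name `classify_intid` and the returned labels are chosen for this sketch; the upper bound of the LPI range is implementation defined, so the sketch accepts any value from 8192 upwards.

```python
def classify_intid(intid):
    """Map a GIC interrupt ID to its interrupt type, following the table above."""
    if intid < 0:
        raise ValueError("INTID cannot be negative")
    if intid <= 15:
        return "SGI"
    if intid <= 31:
        return "PPI"
    if intid <= 1019:
        return "SPI"
    if intid <= 1023:
        return "special"
    if intid <= 8191:
        return "reserved"
    return "LPI"  # upper bound is implementation defined


for intid in (3, 27, 100, 1023, 4096, 8192):
    print(intid, classify_intid(intid))
```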
  • 28. ARMv8 Interrupt Grouping
      • GICv3 onwards supports interrupt grouping as a mechanism to align interrupt handling with the ARMv8 exception and security model.
      • In a system with two Security states (secure, non-secure), an interrupt is configured as one of the following:
          • A Group 0 physical interrupt:
              • ARM expects these interrupts to be handled at EL3.
          • A Secure Group 1 physical interrupt:
              • ARM expects these interrupts to be handled at Secure EL1.
          • A Non-secure Group 1 physical interrupt:
              • ARM expects these interrupts to be handled at Non-secure EL2 in systems using virtualization, or at Non-secure EL1 in systems not using virtualization.
      • In a system with one Security state, an interrupt is configured to be either:
          • Group 0
          • Group 1
      • At the system level, GICD_CTLR.DS indicates whether the GIC is configured with one or two Security states.
  • 29. ARM Cortex A53 – Virtual Interrupt Handling
      • Say a serial input device asserts its interrupt signal to the GIC.
      • During initialization, software executing at EL3 or EL2 configures the PE to route interrupts to EL2 (hypervisor).
      • The GIC generates a physical interrupt exception, either IRQ or FIQ, which gets routed to EL2 (hypervisor) by the configuration of HCR_EL2.IMO/FMO.
      • The hypervisor identifies the peripheral, determines which VM it has been assigned to, and configures the GIC to forward the physical interrupt as a virtual interrupt (vIRQ or vFIQ) to the right vCPU/VM.
      • The hypervisor then returns control to the vCPU/VM.
      • The vCPU/VM uses the Virtual CPU Interface to read and respond to the interrupt.
      Note: From GICv4 onwards, LPIs can be directly injected into a VM, which reduces the context switching to the hypervisor.
  • 30. ARMv8 Generic Timer, Clock Tree & Resets
  • 31. ARM Cortex A53 – Generic Timer Functional Description
      • The timer schedules events and triggers interrupts based on an incrementing counter value. It provides:
          • Generation of timer events as interrupt outputs
          • Generation of event streams
      • Each core has the following set of 64-bit timers:
          • EL1 Non-secure physical timer
          • EL1 Secure physical timer
          • EL2 physical timer
          • Virtual timer
      • The system counter value (which resides in the SoC) is distributed to the Cortex-A53 processor via CNTVALUEB[63:0].
      • The system counter typically operates at a lower frequency than CLKIN (the main processor clock), so the CNTCLKEN input is provided as a clock enable for the CNTVALUEB bus.
      • Each timer provides an active-LOW interrupt output to the SoC.
      • External interrupt output pins (n = number of cores - 1):
          • nCNTPNSIRQ[n:0] - EL1 Non-secure physical timer event
          • nCNTPSIRQ[n:0] - EL1 Secure physical timer event
          • nCNTHPIRQ[n:0] - EL2 physical timer event
          • nCNTVIRQ[n:0] - Virtual timer event
      [Figure 10-1: Architectural counter interface. CNTVALUEB[63:0] and CNTCLKEN enter the Cortex-A53 processor; CNTCLKEN is registered and gates the clock of the architectural counter registers]
      Ref: https://developer.arm.com/documentation/ddi0500/e/generic-timer/generic-timer-functional-description
  • 32. ARM Cortex A53 – System Counter for Timer
      • The system counter (in the SoC) generates the count value and distributes it to all cores (PEs).
      • The system counter measures real time and is not affected by DVFS.
      • It provides an interface to programmers via the following frames:
          • CNTControlBase, accessible only in EL3, contains the following registers:
              • CNTCR – Control Register, contains enable, frequency selection, scaling selection etc.
              • CNTSR – Status Register, reports whether the timer is running or not.
              • CNTCV – Reports the current count value.
              • ...
          • CNTReadBase
              • This is a copy of CNTControlBase that includes the CNTCV register only.
      Ref: https://developer.arm.com/architectures/learn-the-architecture/generic-timer/what-is-the-generic-timer
  • 33. ARMv8 – Timer Virtualization • Similar to the “shared” UART driver example (slide 20), we can also trap timer interrupts in the hypervisor. But this would add considerable CPU overhead, as the timer is at the core of any OS. • The good news here is that ARMv8 allows a vCPU to access the following timers for its scheduling needs: • EL1 Non-secure physical timer – Read Only • Virtual Timer – Read Write • To deliver a timer interrupt, the GICv4 must be configured to route the interrupt to a specific vCPU. • Note: as discussed earlier (slide 29), using LPIs should reduce the interrupt context switches. This is something I need to research further and confirm.
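To make the virtual timer more concrete, here is a small Python model of how it relates to the physical count. It is a simplified sketch, not taken from the slides: in the Arm architecture the virtual count is the physical count minus an offset that the hypervisor programs (CNTVOFF_EL2), and a timer asserts its interrupt when it is enabled, not masked, and the count has reached the compare value. The function names are hypothetical, and details such as the TVAL view are left out.

```python
MASK64 = (1 << 64) - 1  # counters and compare values are 64 bits wide


def virtual_count(physical_count, cntvoff):
    """Virtual count seen by a guest: physical count minus the hypervisor's offset."""
    return (physical_count - cntvoff) & MASK64


def timer_irq_asserted(count, compare_value, enable, imask):
    """Simplified timer condition: enabled, not masked, and count >= compare value."""
    condition_met = count >= compare_value
    return enable and condition_met and not imask


# A guest that was paused for 400 ticks can be given an offset of 400,
# so its virtual time does not jump forward.
guest_now = virtual_count(physical_count=1000, cntvoff=400)
fires = timer_irq_asserted(guest_now, compare_value=500, enable=True, imask=False)
```

The design point is that the guest programs its own compare value without trapping, while the hypervisor only has to maintain the per-VM offset.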
  • 34. ARM Cortex A53 – Clocks and Resets • Clock Tree • The Cortex-A53 processor has a single clock input, CLKIN • RPi3 uses a 1.2 GHz clock • All cores in the Cortex-A53 and the SCU are clocked with a distributed version of CLKIN. • Clocks in the tree • PCLK - APB interface / bus • ACLK - ACE (extension to AXI) bus, ACP slave interface • SCLK - SCU interface, only if the CHI protocol is used. • ATCLK - ATB interface, which can operate at any integer ratio equal to or slower than the main clock • CNTCLK - 64-bit counter • Reset Inputs • The Cortex-A53 processor has the following active-LOW reset input signals • nCPUPORESET[N:0] - primary, cold reset signals that initialize all resettable registers • nCORERESET[N:0] - same as above, except that debug registers and ETM registers are not reset • nPRESETDBG - single, cluster-wide signal that resets the integrated CoreSight components that connect to the external PCLK domain, such as debug logic • nL2RESET - resets all resettable registers in the L2 memory system and the logic in the SCU • nMBISTRESET - an external MBIST controller can use this signal to reset the entire SoC. • The clock tree and resets are typically handled by the Host OS. VMs generally don’t modify these.
  • 36. Linux KVM/ARM • KVM stands for Kernel-based Virtual Machine, which can run “unmodified” guests. • As discussed in slide 4, this is a derived type, where the hypervisor is part of an OS and uses hardware-assisted features. It doesn’t fall under Type 1 or Type 2. • As shown in the picture on the right, the KVM implementation is split into 2 parts • Highvisor – runs in EL1 (kernel mode), as part of the host Linux kernel • Lowvisor – runs in EL2 (Hyp mode), to trap hypervisor calls and exceptions. • [Figure 2: KVM/ARM System Architecture. PL 0 (User): Host User (QEMU), VM User. PL 1 (Kernel): Host Kernel with KVM Highvisor, VM Kernel. PL 2 (Hyp): Lowvisor. Traps from both the host and the VM enter the Lowvisor.]
  Excerpt from the KVM/ARM paper (cropped in the slide): “... in slow and convoluted code paths. As a simple example, a page fault handler needs to obtain the virtual address causing the page fault. In Hyp mode this address is stored in a different register than in kernel mode. Second, running the entire kernel in Hyp mode would adversely affect native performance. For example, Hyp mode has its own separate address space. Whereas kernel mode uses two page table base registers to provide the familiar 3GB/1GB split between user address space and kernel address space, Hyp mode uses a single page table register and therefore cannot have direct access to the user space portion of the address space. Frequently used functions to access user memory would require the kernel to explicitly map user space data into kernel address space and subsequently perform necessary teardown and TLB maintenance operations, resulting in poor native performance on ARM. These problems with running a Linux hypervisor using ARM Hyp mode do not occur for x86 hardware virtualization. x86 root mode is orthogonal to its CPU privilege modes. The entire Linux kernel can run in root mode as a hypervisor because the same set of CPU modes available in non-root mode are available in root mode. Nevertheless, given the widespread use of ARM and the advantages of Linux on ARM, finding an efficient virtualization solution for ARM that can leverage Linux and take advantage ... processing required and defers the bulk of the work to be done to the highvisor after a world switch to the highvisor is complete. The highvisor runs in kernel mode as part of the host Linux kernel. It can therefore directly leverage existing Linux functionality such as the scheduler, and can make use of standard kernel software data structures and mechanisms to implement its functionality, such as locking mechanisms and memory allocation functions. This makes higher-level functionality easier to implement in the highvisor. For example, while the lowvisor provides ...” https://www.cs.columbia.edu/~nieh/pubs/asplos2014_kvmarm.pdf
  • 37. KVM/ARM Linux • The source tree (picture) on the right shows all the files that realize the Highvisor and Lowvisor described in the previous slide. • Most of the file names should be familiar by now. • It should also give a feel for how simple a hypervisor implementation can be (~16k lines of code).
  <Linux Kernel Src>
  ├── COPYING
  ├── CREDITS
  ├── Documentation
  ~~~
  └── virt
      ├── Makefile
      ├── built-in.a
      ├── kvm
      │   ├── Kconfig
      │   ├── arm
      │   │   ├── aarch32.c
      │   │   ├── arch_timer.c
      │   │   ├── arm.c
      │   │   ├── hyp
      │   │   ├── mmio.c
      │   │   ├── mmu.c
      │   │   ├── perf.c
      │   │   ├── pmu.c
      │   │   ├── psci.c
      │   │   ├── trace.h
      │   │   └── vgic
      │   ├── async_pf.c
      │   ├── async_pf.h
      │   ├── coalesced_mmio.c
      │   ├── coalesced_mmio.h
      │   ├── eventfd.c
      │   ├── irqchip.c
      │   ├── kvm_main.c
      │   ├── vfio.c
      │   └── vfio.h
      ├── lib
      │   ├── Kconfig
      │   ├── Makefile
      │   ├── built-in.a
      │   ├── irqbypass.c
      │   ├── modules.builtin
      │   └── modules.order
      ├── modules.builtin
      └── modules.order
  • 38. KVM and its suitability for Automotive • The highvisor of KVM is basically a character device driver. • From a software component perspective, KVM is a kernel module loaded after the Linux kernel is initialized. • Creating, modifying and starting a VM all happen through ioctl() calls from user space. • Auto-start of a VM is also possible under KVM. • This means the first virtual machine can start only after the Linux kernel is initialized. • So a product with an architecture similar to the one in slide 3 will have difficulty in meeting the safety and start-up time needs of Automotive. • But if we use a safety-certified Linux as the hypervisor (i.e., a kernel configured with minimum features, ~2 MB in size) on a system with high-speed eMMC or UFS (more than 400 Mbps), then there is a possibility of meeting the timing and safety requirements of Automotive. • Note: a 2 MB (16 Mbit) image at 400 Mbps gets loaded in about 40 ms. • My view: It is not wise to go down this kind of path for Automotive. Strategically, we need a light-weight Type 1 hypervisor for Automotive.
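The 40 ms figure in the note can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: it assumes the raw link rate is fully available and ignores storage command overhead, file system access and decompression. The function name is hypothetical.

```python
def load_time_ms(image_bytes, link_bits_per_second):
    """Ideal transfer time in milliseconds for an image over a given link rate."""
    return image_bytes * 8 * 1000 / link_bits_per_second


# 2 MB taken as 2,000,000 bytes (16 Mbit) over a 400 Mbps link.
decimal_mb = load_time_ms(2_000_000, 400_000_000)

# The same image size taken as 2 MiB (2 * 1024 * 1024 bytes).
binary_mb = load_time_ms(2 * 1024 * 1024, 400_000_000)
```

With decimal megabytes the result is 40 ms; with binary megabytes it is just under 42 ms, so the note holds either way as a rough lower bound on the load time.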
  • 39. Minos Hypervisor (Type 1) • Though there are many Type 1 hypervisors, the Minos project sounded interesting as it supports virtualization on Raspberry Pi. • Please evaluate it if you get time • https://github.com/minosproject/minos • I will provide more details and my views as time permits.
  • 40. Graphics, Display & Audio How do current suppliers virtualize these peripherals, which are critical for Automotive?
  • 41. Graphics, Display & Audio Virtualization • Most suppliers and tier 1s will go for a para-virtualization solution (shown below) for these devices, as they are a bit complex. • These solutions add some context-switch overhead, but keep latency low thanks to shared memory. • In future we might see solutions similar to what ARM has done for the DMA (picture below) for these peripherals as well. • Alternatively, SoCs may provide more than one instance of a peripheral block, so that each VM can use one of them as pass-through. • [Screenshot, cropped at the right margin, from the Arm guide (Copyright © 2019 Arm Limited): a hypervisor uses stage 2 translation to isolate VMs. Letting a driver in a VM program the DMA controller directly creates two problems. Isolation: the DMA controller is not subject to the stage 2 tables and could be used to break out of the VM’s sandbox. Address space: with two stages of translation, what the kernel believes to be PAs are IPAs, while the DMA controller still sees PAs. The hypervisor could trap and translate every interaction between the VM and the DMA controller, but this is inefficient when memory is fragmented. The alternative is to extend the stage 2 regime to other masters, which then need an MMU of their own, referred to as a System MMU (SMMU, sometimes also called IOMMU).] https://developer.arm.com/architectures/learn-the-architecture/armv8-a-virtualization/stage-2-translation • [Diagram: para-virtualization on one SoC. Host OS side: driver(s), GPU/Display/Sound backend server, graphics / sound application. Guest OS side: frontend client driver, graphics / sound application. Both sides run on the hypervisor and exchange data through shared memory.]
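The frontend / backend split over shared memory can be illustrated with a tiny ring buffer. The Python sketch below is purely illustrative and assumes a single producer (the guest's frontend client driver) and a single consumer (the host's backend server); the class name, slot layout and method names are hypothetical and do not correspond to any real para-virtualization API. A `bytearray` stands in for the shared memory region.

```python
class SharedRing:
    """Single-producer / single-consumer ring over a flat buffer.

    Each slot holds one length byte followed by the payload.
    """

    def __init__(self, slots=4, slot_size=64):
        self.slots = slots
        self.slot_size = slot_size
        self.mem = bytearray(slots * slot_size)  # stand-in for shared memory
        self.head = 0  # number of requests written by the frontend
        self.tail = 0  # number of requests consumed by the backend

    def push(self, payload):
        """Frontend side: queue a request; returns False if full or too large."""
        too_large = len(payload) >= self.slot_size or len(payload) > 255
        if self.head - self.tail == self.slots or too_large:
            return False
        off = (self.head % self.slots) * self.slot_size
        self.mem[off] = len(payload)
        self.mem[off + 1:off + 1 + len(payload)] = payload
        self.head += 1
        return True

    def pop(self):
        """Backend side: take the oldest request, or None if the ring is empty."""
        if self.head == self.tail:
            return None
        off = (self.tail % self.slots) * self.slot_size
        length = self.mem[off]
        data = bytes(self.mem[off + 1:off + 1 + length])
        self.tail += 1
        return data


ring = SharedRing(slots=2, slot_size=16)
ring.push(b"draw")          # guest frontend queues a graphics request
request = ring.pop()        # host backend picks it up and drives the real device
```

In a real system the payload would stay in shared memory without copying, and only a notification (for example an interrupt or hypercall) would cross the VM boundary, which is where the context-switch overhead mentioned above comes from.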
  • 42. Acronyms – 1 / 3 • ACE – AXI Coherency Extensions, an extension to the AXI protocol • ACLK – ACE (extension to AXI bus) Clock • ACP – Accelerator Coherency Port for AXI slaves • APB – Advanced Peripheral Bus, a lower-speed interface by ARM • App - Software Application (e.g. Calculator App in Android) • ASID – Address Space Identifier • ATB – Interface for Trace by ARM • ATCLK – ATB Clock • AXI – Advanced eXtensible Interface by ARM • BCM – Broadcom (the maker of the SoC used in Raspberry Pi) • CHI – Coherent Hub Interface, a scalable protocol supporting multi-node interconnect • CLKIN – Clock In (main processor clock, 1.2 GHz on RPi3) • CNTCLK – Timer / Counter Clock • CNTCLKEN – Counter Clock Enable • CNTCR – Counter Control Register • CNTCV – Counter Count Value • CNTHPIRQ – EL2 physical timer event pin • CNTPNSIRQ – EL1 Non-secure physical timer event pin • CNTPSIRQ - EL1 Secure physical timer event pin • CNTSR – Counter Status Register • CNTVALUEB – 64-bit counter value • CNTVIRQ - Virtual timer event pin • CONTEXTIDR – Context ID Register (identifies current process ID and ASID) • CORERESET – Core reset (debug and ETM registers are preserved) • CPU – Central Processing Unit • CPUPORESET – CPU Power On Reset • CTI – Cross Trigger Interface (CoreSight Debug & Trace) • CTLR.DS – Control Register -> Disable Security bit • CTM – Cross Trigger Matrix (CoreSight Debug & Trace) • DDR – Double Data Rate • DFT – Design For Test • DMA – Direct Memory Access (the unit that offloads data copies from the CPU)
  • 43. Acronyms – 2 / 3 • DTS – Device Tree Source • DVFS – Dynamic Voltage and Frequency Scaling • ECC – Error Correction Code • ECU – Electronic Control Unit (in cars) • ELx – Exception Level ‘x’ [x: 0 to 3 for ARMv8] • ERET – Exception Return • ESR – Exception Syndrome Register • FIQ – Fast Interrupt Request (takes higher priority than IRQ) • FPU – Floating Point Unit • GIC – Generic Interrupt Controller • GICD - GIC Distributor • GPU – Graphics Processing Unit • HCR – Hypervisor Configuration Register • HPFAR – Hyp IPA Fault Address Register. Holds the faulting IPA for some aborts on stage 2 translation. • HVC – Hypervisor Call • I/O – Inputs and / or Outputs • INTID – Interrupt Identifier • IOMMU - I/O MMU, same as SMMU. • IPA - Intermediate Physical Address • IRQ – Interrupt Request (from I/O to CPU) • ITS – Interrupt Translation Service that injects interrupts directly into VMs. • IVI – In-Vehicle Infotainment unit. • KVM – Kernel-based Virtual Machine • L1 – Level 1 • L2RESET – L2 Memory system reset • LDR – Load Register (loads a register from memory) • log2 – Binary Logarithm • LPI - Locality-specific Peripheral Interrupt, an interrupt that uses the ITS • LR – Link Register • MBIST – Memory Built In Self Test • MBISTRESET – signal with which an external MBIST controller can reset the SoC • MMU – Memory Management Unit • NDA – Non Disclosure Agreement
  • 44. Acronyms – 3 / 3 • OS – Operating System • PA – Physical Address • PC – Program Counter • PCLK – Peripheral Clock (APB interface) • PE – Processing Element • PMU – Performance Monitoring Unit • PoC – Proof of Concept • PPI - Private Peripheral Interrupt, targets a single specific PE • PRESETDBG - single, cluster-wide signal that resets the integrated CoreSight (debug) components • QNX – QNX Operating System • RAM – Random Access Memory • ROM – Read Only Memory • SCLK – SCU Clock • SCU – Snoop Control Unit that maintains cache coherency • SGI – Software Generated Interrupt, generated by PEs. • SMMU – System Memory Management Unit (the MMU for peripherals) • SoC – System on Chip • SPI – Shared Peripheral Interrupt • SRAM – Static Random Access Memory • STR - Store Register (stores a register to memory) • TLB – Translation Lookaside Buffer • TTBRx – Translation Table Base Register x [x: 0 or 1] • UART – Universal Asynchronous Receiver/Transmitter. Serial Comms. • VA – Virtual Address • vCPU – Virtual CPU • VM – Virtual Machine • VMID – Virtual Machine Identifier • VMSA – Virtual Memory System Architecture • VTTBR – Virtualization Translation Table Base Register • WFI – Wait For Interrupt
  • 45. Thank you! Please send your feedback and comments to c.n.aananth@gmail.com