2. 2 |
Xen Hypervisor safety certification plan
Xen Hypervisor Safety-Certifiability Plan
May 2023
3. 3 |
Embedded Hypervisors
• Hypervisors are becoming key enablers in embedded systems (avionics, industrial,
medical, automotive, etc.). Key features:
• Run multiple different Operating Systems in isolation
• Real-time support & cache partitioning
• Mixed-criticality & Fault-tolerance
• Running everything in the same environment puts the critical function at risk
• A failure in the non-critical component causes a critical system crash
• The non-critical component using too much CPU time causes a missed deadline
• Don’t put all your eggs in one basket
• Static partitioning allows you to run them separately in isolation
• Many baskets!
• Each domain has direct hardware access (IOMMU protected)
• Real-time
• Strong isolation between the domains
• Real-time isolation
• Failure isolation
• By default, each component has only enough privilege to do what it needs to do
• → Safer and more secure architectures
[Figure: Xen hosting one critical domain and two non-critical domains]
4. 4 |
Xen: Open Source Community
• Xen Project is an Open Source project under the Linux Foundation
• Well known and widely used in the industry
• Extremely strong review process and security process
• Reference Open Source hypervisor for Embedded and Automotive
• x86 and ARM supported, RISC-V and PPC ports in progress
• The Xen Open Source Community is a diverse multi-vendor community
• Maintainers from Amazon, Arm, Citrix, AMD, SUSE, and more
• Independent panel of experts
• AMD joined Xen Project as a member in 2022
• AMD is one of the top three contributors to Xen
[Chart: Xen 4.18 contributors]
cloud.com 28%, SUSE 23%, AMD* 23%, Others 7%, Amazon 6%, Arm 5%, Invisible Things Lab 3%, Linaro 3%, Raptor Engineering 2%
5. 5 |
Xen: from the Datacenter to Embedded
A long road started in 2010
The panda is the Xen mascot
6. 6 |
Xen for Embedded
• Microkernel architecture
• Only the hypervisor requires privileges
• Xen supports disaggregation and driver domains: large amounts of code run unprivileged
• Freedom From Interference and Isolation
• Small code size (less than 50K LOC on ARM)
• Dom0less
• Parallel boot
• Static Partitioning
• No dependency on Linux; Dom0 becomes optional
• Invisible at runtime
• Real-time with the null scheduler
• Cache isolation with cache coloring
• Virtio and Xen PV Drivers support with freedom from interference
• Cortex-R52 and R82 support, Xen on MPU systems (MMU-less)
7. 7 |
• Xen is the Open-Source reference hypervisor for embedded and automotive at AMD
(ARM, AMD x86, …)
• AMD has an in-house engineering team to develop, enhance, and support Xen for embedded
and automotive
• Xen is delivered to customers today as a reference and is supported through the forum, premium
technical support, and engineering
• We have many customers using Xen in production across multiple verticals: industrial,
automotive, medical, …
• Many customers require real-time isolation between VMs (null scheduler, cache coloring)
8. 8 |
Safety certifying Xen Hypervisor
• AMD is working on making Xen safety-certifiable for AMD platforms
• ARM and AMD x86 platforms
• IEC 61508 SIL 3 (Systematic Capability 3) & ISO 26262 ASIL D
• Certification based on Xen upstream community processes and upstream codebase
• Not working with a private fork -- ability to update the certification with limited effort
• Certification docs & artifacts available for AMD customers
• Open to collaboration with other community members on upstreaming
• Assumptions and Scope
• Common code and core components in Xen
• AMD x86: AMD-V, AMD-Vi, IOMMU, HPET, vPCI
• ARM: SMMUv3, GICv3, Arch Timer, Hypervisor Extensions, vPCI
• Easy to port to future generations of hardware
• Xen enabling components for Virtio and Xen PV Drivers
• Safe memory sharing using Virtio with grant table
• Virtio with grant table enables Virtio frontends in Safe VMs
• No OS/hypervisor dependencies: run (multiple) Safe VMs and QM VMs of your choice
• Dom0less for domain creation: Dom0 is not required and it is not in scope
• Real-time and low interrupt latency
[Figure: Xen creates a SafeOS VM (e.g. Zephyr) and a QM VM (e.g. Linux); each VM accesses its own HW partition. QM == Non-Safe]
10. 10 |
Xen Safety Progress
• Xen MISRA C compliance
• MISRA C: coding guidelines for safe C programming
• Goal: improve the Xen codebase
• MISRA C compliance is never at the expense of quality
• Adopt MISRA C rules as part of the Xen coding guidelines
• Address or deviate MISRA C violations in Xen
• ECLAIR MISRA C checker integrated in the CI-loop
• Software Safety Requirements
• Define scope and requirements structure
• “market”, “product” and “software safety” requirements
• Traceability
• Link requirements, tests, and code
11. 11 |
Xen Safety Progress: MISRA C
• Preliminary tailoring resulted in the selection of 143 MISRA C rule candidates
• MISRA C rules adoption in progress:
• 118 discussed among maintainers and BUGSENG experts
• 96 rules adopted and added to docs/misra/rules.rst
• 17 candidates for mass adoption (already clean)
• Only 8 rules left to discuss!
• Rules added to docs/misra/rules.rst
• Xen 4.18 release: 148 commits to fix MISRA C violations by
• MISRA C unjustified violations down from 2 million to 100,000!
• ECLAIR MISRA C scanner integrated in the upstream Xen GitLab CI-loop
• 69 rules checked with zero unjustified violations (“clean” and checked against regressions)
• 7 more rules are also clean on arm64 only and 4 rules on x86_64 only
• 35 additional rules will also be checked against regressions (some violations are present)
12. 12 |
Xen Safety Progress: MISRA C
• Xen infrastructure for MISRA C deviations
• Project-wide deviations: deviations.rst and ECLAIR/deviations.ecl
• “SAF” tag for deviations as in-code comments
• Document the reason for the deviation appropriately in safe.json
• A tool is provided to convert the Xen deviation tag into the tool-specific tag
• we can support multiple tools with a single tag
/* SAF-1-safe R8.6 linker defined symbols */
extern char _start[], _end[], start[];
[Figure: the same Xen code shown with the Xen tag and with the ECLAIR tag]
13. 13 |
MISRA C for Xen: Lessons Learned
• Tailoring of the MISRA C guidelines is essential
• Selection of the (non-mandatory) guidelines to comply with
• Global deviations for required guidelines where justifiable
• Adapt some required guidelines to the project
• Partial deviations of overly restrictive rules
• The reason for the rule and the rule itself make sense
• Taken as-is it would lead to undesirable code changes
• Focus on undefined and implementation-specific behaviors
• Focus on the spirit of the rule
• Deprioritize “developer confusion”
• Encapsulate and deviate clever tricks
• KConfig support and MISRA C: deviations required
14. 14 |
Overly Restrictive Rules: Comply with the Spirit of the Rule
• Rule 16.3: “An unconditional break statement shall terminate every switch-clause”
• Good idea in principle: lack of “break” causes severe bugs
• Deviations for “fallthrough” cases are expected
• But the rule is too restrictive:
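    switch ( state )
    {
    case IO_ABORT:
        goto inject_abt;
        break;          /* unreachable: present only to satisfy Rule 16.3 */
    case IO_HANDLED:
        /* ... */
        return;
        break;          /* unreachable: present only to satisfy Rule 16.3 */
    /* ... */
    }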
15. 15 |
Overly Restrictive Rules: Comply with the Spirit of the Rule
• Rule 16.3: “An unconditional break statement shall terminate every switch-clause”
• Xen Project also allows the following:
• continue, goto, return
• the “fallthrough” keyword
• a call to a function that does not give the control back
    switch ( state )
    {
    case IO_ABORT:
        goto inject_abt;
    case IO_HANDLED:
        /* ... */
        return;
    case IO_RETRY:
        /* ... */
        continue;
    /* ... */
    }
16. 16 |
Encapsulate and Deviate Clever Tricks
• Clever bit trick that not all developers know:
    /* extract LSB */
    align &= -align; /* Unary minus of unsigned: Rule 10.1 violation */
• This is now encapsulated into a macro, which is globally deviated as far as Rule 10.1 is concerned:
    /*
     * Given an unsigned integer argument, expands to
     * a mask where just the least significant nonzero bit
     * of the argument is set, or 0 if no bits are set.
     */
    #define ISOLATE_LSB(x) ((x) & -(x))
    […]
    align = ISOLATE_LSB(align);
17. 17 |
Partial Deviation Example: Rule 10.1
• Rule 10.1: “Operands shall not be of an inappropriate essential type”
• Multiple rationales
• unspecified and undefined behavior
• implementation-defined behavior
• developer confusion
• Initially, 269,393 unjustified violations on x86_64 and 65,731 on arm64, but:
• a subset of 10.1 is guaranteed to be safe by the C standard
• Given the architecture and toolchains supported by Xen, we can further narrow down the rule
• All Xen developers understand how implicit Boolean conversions work
• Resulting project-wide deviations adopted by Xen:
• Value-preserving conversions of integer constants
• Bitwise and, or, xor, one's complement, bitwise and assignment, bitwise or assignment, bitwise xor assignment
• Left shift, right shift, left shift assignment, right shift assignment
• Implicit conversions to boolean for conditionals (?: if while for) and logical operators (! || &&)
• From ~335,000 initial violations we are now down to 95
18. 18 |
KConfig and MISRA C: a difficult marriage
• Many OSS projects (Linux, Xen, ...) are moving away from #ifdef to use IS_ENABLED checks instead
• Better readability, especially with nested checks
• Syntax is always checked, also in excluded parts
• Dead code is optimized out by the compiler anyway
With #ifdef:
    #ifdef CONFIG_FOO
        if ( cmdline_option_foo )
    #endif
        {
            do_A();
        }
With IS_ENABLED:
    if ( !IS_ENABLED(CONFIG_FOO) ||
         cmdline_option_foo )
    {
        do_A();
    }
19. 19 |
KConfig and MISRA C: a difficult marriage
• MISRA C view: excluded code should be removed by preprocessing
• MISRA C Decidable guidelines apply to all code that is present after preprocessing
including code guarded by if(0)
• Global MISRA C deviation which is justified if:
• the compiler does indeed eliminate code that is guarded by if(0)
• no jumps inside code excluded by if(0) are possible from outside
• Also guaranteed by other MISRA C guidelines
20. 20 |
Xen Safety Requirements
o Requirements: detailed documentation of all expected software behaviors
o Define the Market Requirements first
o Market Requirements are linked to Product Requirements. Product Requirements explain how
the hypervisor implements Market Requirements
o Product Requirements are further split into numerous Software Safety Requirements, which are
individually testable
[Figure: traceability chain Market requirements → Product requirements → Software safety requirements → Test case → Test code → Test job; link cardinalities shown: M to N, M to N, 1 to N, N to 1, 1 to 1]
21. 21 |
Market Requirements
• Identify the scope of the safety certification for Xen
• Define the expectations of Xen for automotive and embedded use cases
• Written with a high-level view of the system
• Example:
Name | Description
Static VM definition | Xen shall specify the resources required to boot and run safe and non-safe VMs.
Run Arm64 and AMD-x86 VMs | Xen shall run Arm64 and AMD x86 VMs.
VM device assignment | Xen shall be able to assign devices to each VM. For example, it should be able to assign GPU to VM1, MMC to VM2. Only the VM assigned to a device shall have exclusive access to the device.
22. 22 |
1 Market Requirement → N Product Requirements
• Product Requirements explain how Market Requirements are fulfilled by Xen
• Product Requirements are Xen specific
• Still written with a high-level view of the system
• Product Requirements can sometimes be linked to more than one Market Requirement
Emulated UART
23. 23 |
1 Product Requirement → N Software Safety Requirements
Domain shall be able to read the frequency of the system counter (either via
CNTFRQ_EL0 register or "clock-frequency" device tree property if present).
Access virtual timer from a domain
Trigger the physical timer interrupt from a domain
Trigger the virtual timer interrupt from a domain
24. 24 |
Software Safety Requirements
• The requirements are written in plain English, from the perspective of what Xen is expected to fulfil
• The software safety requirements (SSR) are the most granular form
• Engineers are expected to refer to an SSR (and architecture spec) to write a test to validate it
• Each SSR should be tested independently
• Market Requirements, Product Requirements and SSR should be independently baselined
• SSR should be unambiguous, complete, consistent, correct
• SSR should be traceable all the way to market requirements
25. 25 |
Requirements-as-Code
• Traceability typically handled with complex proprietary solutions
• They do not work well in an Open Source community environment
• Requirements are documents → Reuse the same processes we already have in Xen also for Requirements
• Benefits:
• The Xen Community is already familiar with it
• Zero ramp up time, high speed of development and review
• Alignment with Zephyr, ELISA, and other Open Source software projects
26. 26 |
Requirements-as-Code -- Documents as Code
• For years now, the larger Open Source Community has been re-using the same powerful tools and
processes for both code and documentation
• Write plain text documents using formats like RST and Markdown
• Easily readable and modifiable in source format
• They can also be rendered to PDF and HTML
• Same submission and review process as code; same version control as code
27. 27 |
Requirements-as-Code
• One Gap: Linking and Traceability
• Open Source projects started to address this specific need: OpenFastTrace, StrictDoc, Basil, and others
• We are using OpenFastTrace for linking and traceability reports
• Requirements-as-Code is a great fit for Open Source software projects
• No need for proprietary tools
28. 28 |
OpenFastTrace
• OpenFastTrace: Open Source requirement tracing tool
• Handles requirements written in Markdown; RST support is pending
• Link requirements all the way to code
• Independent versions for each requirement
• Detect missing dependencies (missing links)
• Detect obsolete requirements (old versions)
• Generate reports in HTML and XML
• Extremely lightweight and fast
• Very mature (~7 years old), actively maintained, proven in use
• https://github.com/itsallcode/openfasttrace/blob/main/README.md
29. 29 |
OpenFastTrace – writing a requirement
A Requirement Title With an Underline
-------------------------------------
`req~this-is-the-id~1` This is the unique specification id
Description:
Each specification id consists of 3 parts
- Artifact type – 'req'
- Name – 'this-is-the-id'
- Revision number – 1
Needs:
- subreq The artifact type of its sub requirement (i.e. child requirement)
30. 30 |
OpenFastTrace - keywords
OpenFastTrace relies on the following keywords to be placed within requirements written in markdown
• Covers – denotes the specification id of the parent
• Needs – denotes the artifact type of the children
A Sub requirement
-----------------
`subreq~this-is-the-id~1` This is the unique specification id
Description:
This is a sub requirement.
Covers:
req~this-is-the-id~1 This is the unique specification id of the parent
31. 31 |
OpenFastTrace – Tracing Requirements in Code
The following is the format for placing tags in the code:
• impl: implement a given requirement
• test: test a given requirement
// [impl->subreq~this-is-the-id~1>>test]
private validate(final AuthenticationRequest request) {
// Implements subreq~this-is-the-id~1 blah blah
}
# [test->impl~this-is-the-id~1]
echo "Testing the implementation"
34. 34 |
Conclusions
• Xen is an Open Source project under the Linux Foundation with a healthy and diverse community
• Xen as embedded hypervisor: dom0less, cache coloring, real-time, MPU support, etc.
• AMD is working on making Xen safety-certifiable for AMD platforms
• Certification based on Xen upstream community processes and upstream codebase
• Great benefits to the Xen community: MISRA C, documentation, GitLab CI tests, and more
• MISRA C
• Configurability of ECLAIR, the MISRA C scanning tool used, is absolutely essential
• Continuous Integration is crucial to lighten the burden on maintainers
• On existing high-quality code we can expect more deviations
• The MISRA potential for suggesting improvements is still high
• Requirements
• Requirements-as-Code is the way to go for Open Source projects
• OpenFastTrace handles traceability, detects missing links, and generates the report
• Requirements upstreaming coming soon!