The beautiful thing about software engineering is that it gives you the warm and fuzzy illusion of total understanding: I control this machine because I know how it operates. This is the result of layers upon layers of successful abstractions, which hide immense sophistication and complexity. As with any abstraction, though, these sometimes leak, and that's when a good grounding in what's under the hood pays off.
The second talk in this series peels a few layers of abstraction and takes a look under the hood of our "car engine", the CPU. While hardly anyone codes in assembly language anymore, your C# or JavaScript (or Scala or...) application still ends up executing machine code instructions on a processor; that is why Java has a memory model, why memory layout still matters at scale, and why you're usually free to ignore these considerations and go about your merry way.
You'll come away knowing a little bit about a lot of different moving parts under the hood; after all, isn't understanding how the machine operates what this is all about?
(From a talk given at BuildStuff 2016 in Vilnius, Lithuania.)
The GNU toolchain is the de facto standard of the IT industry and has been improved by extensive open source contributions. This session covers the mechanics of the compiler driver, system interaction (taking GNU/Linux as the example), the linker, the C runtime library, and the related dynamic linker. Instead of analyzing the system design, the session is use-case driven and illustrated progressively.
UM2019 Extended BPF: A New Type of Software (Brendan Gregg)
Keynote for Ubuntu Masters 2019 by Brendan Gregg, Netflix. Video https://www.youtube.com/watch?v=7pmXdG8-7WU&feature=youtu.be . "Extended BPF is a new type of software, and the first fundamental change to how kernels are used in 50 years. This new type of software is already in use by major companies: Netflix has 14 BPF programs running by default on all of its cloud servers, which run Ubuntu Linux. Facebook has 40 BPF programs running by default. Extended BPF is composed of an in-kernel runtime for executing a virtual BPF instruction set through a safety verifier and with JIT compilation. So far it has been used for software defined networking, performance tools, security policies, and device drivers, with more uses planned and more we have yet to think of. It is changing how we use and think about systems. This talk explores the past, present, and future of BPF, with BPF performance tools as a use case."
The promise of the IoT won’t be fulfilled until integrated software platforms are available that allow software developers to develop these devices efficiently and in the most cost-effective manner possible.
This presentation introduces the F9 microkernel, a new open source implementation built from scratch, which deploys modern kernel techniques dedicated to deeply embedded devices.
Netronome's half-day tutorial on host data plane acceleration at ACM SIGCOMM 2018 introduced attendees to models for host data plane acceleration and provided an in-depth understanding of SmartNIC deployment models at hyperscale cloud vendors and telecom service providers.
Presenter Bios
Jakub Kicinski is a long-term Linux kernel contributor who has been leading the kernel team at Netronome for the last two years. Jakub’s major contributions include the creation of the BPF hardware offload mechanisms in the kernel and the bpftool user space utility, as well as work on the Linux kernel side of OVS offload.
David Beckett is a Software Engineer at Netronome with a strong technical background in computer networks, including academic research on DDoS. David has expertise in Linux architecture and computer programming. He holds a Master’s degree in Electrical and Electronic Engineering from Queen’s University Belfast and continues as a PhD student studying emerging application-layer DDoS threats.
Introducing the F9 microkernel, a new open source implementation built from scratch, which brings modern kernel techniques derived from L4 microkernel designs to deeply embedded devices.
:: https://github.com/f9micro
Characteristics of F9 microkernel
– Efficiency: performance + power consumption
– Security: memory protection + isolated execution
– Flexible development environment
The Linux kernel is undergoing the most fundamental architectural evolution in its history: it is becoming a microkernel, potentially the biggest fundamental change ever to happen to it. Why is the Linux kernel evolving into a microkernel? This talk covers how companies like Facebook and Google use BPF to patch 0-day exploits, how BPF will change the way features are added to the kernel forever, and how BPF is introducing a new application deployment method for the Linux kernel.
This is the presentation file used by Jim Huang (jserv) at OSDC.tw 2009. New compiler technologies are invisible yet deeply integrated into the world around us, and we can enrich the experience by leveraging LLVM.
qemu + gdb: The efficient way to understand/debug Linux kernel code/data stru... (Adrian Huang)
Note: when you view the slide deck in a web browser, the screenshots may be blurred. You can download the deck and view it offline (the screenshots are clear there).
Presentation at Android Builders Summit 2012.
Based on experience working with ODM companies and SoC vendors, this session discusses how to find the performance hotspots of an Android device and then improve various areas, including graphics and boot time. The session covers components that look independent of one another in the traditional view; in an Android system view, however, the situation changes considerably, since everything is tightly coupled. Three frequently mentioned items in Android engineering are selected as entry points: 2D/3D graphics, runtime, and boot time. Audience: developers who work on Android system integration and platform enablement.
Note: when you view the slide deck in a web browser, the screenshots may be blurred. You can download the deck and view it offline (the screenshots are clear there).
These slides provide a basic understanding of hypervisor support in ARMv8 and later processors, and they aim to give automotive engineers guidelines for comparing and choosing the right solution.
Video: https://www.youtube.com/watch?v=JRFNIKUROPE . Talk for linux.conf.au 2017 (LCA2017) by Brendan Gregg, about Linux enhanced BPF (eBPF). Abstract:
A world of new capabilities is emerging for the Linux 4.x series, thanks to enhancements that have been included in Linux for Berkeley Packet Filter (BPF): an in-kernel virtual machine that can execute user space-defined programs. It is finding uses for security auditing and enforcement, enhancing networking (including eXpress Data Path), and performance observability and troubleshooting. Many new open source tools for performance analysis that use BPF have been written in the past 12 months. Tracing superpowers have finally arrived for Linux!
For its use with tracing, BPF provides the programmable capabilities to the existing tracing frameworks: kprobes, uprobes, and tracepoints. In particular, BPF allows timestamps to be recorded and compared from custom events, allowing latency to be studied in many new places: kernel and application internals. It also allows data to be efficiently summarized in-kernel, including as histograms. This has allowed dozens of new observability tools to be developed so far, including measuring latency distributions for file system I/O and run queue latency, printing details of storage device I/O and TCP retransmits, investigating blocked stack traces and memory leaks, and a whole lot more.
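The in-kernel histogram summarization described above can be sketched in miniature. The following is plain Python illustrating the power-of-two bucketing that BPF tracing tools maintain in kernel maps; it is a conceptual sketch under that assumption, not the actual BPF API, and the function names are invented:

```python
# Conceptual sketch (plain Python, not BPF): the log2 histogram that
# tools like biolatency keep in an in-kernel BPF map.

def log2_bucket(value):
    """Power-of-two bucket index: 0->0, 1->1, 2..3->2, 4..7->3, ..."""
    return value.bit_length()

def record(hist, latency_us):
    """A BPF program would do the equivalent increment on a kernel map entry."""
    b = log2_bucket(latency_us)
    hist[b] = hist.get(b, 0) + 1

def render(hist):
    """User space reads the map and prints ranges, like bcc's print_log2_hist."""
    rows = []
    for b in sorted(hist):
        lo = 0 if b == 0 else 1 << (b - 1)
        hi = (1 << b) - 1
        rows.append("%8d -> %-8d : %d" % (lo, hi, hist[b]))
    return rows

hist = {}
for lat in (3, 5, 6, 120, 130, 4000):   # latencies in microseconds
    record(hist, lat)
print("\n".join(render(hist)))
```

Keeping only bucket counts in the kernel, instead of streaming every event to user space, is what makes this kind of tracing cheap enough to run in production.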
This talk will summarize BPF capabilities and use cases so far, and then focus on its use to enhance Linux tracing, especially with the open source bcc collection. bcc includes BPF versions of old classics, and many new tools, including execsnoop, opensnoop, funccount, ext4slower, and more (many of which I developed). Perhaps you'd like to develop new tools, or use the existing tools to find performance wins large and small, especially when instrumenting areas that previously had zero visibility. I'll also summarize how we intend to use these new capabilities to enhance systems analysis at Netflix.
* Know the reasons why various operating systems exist and how they function for dedicated purposes
* Understand the basic concepts of building system software from scratch
• How can we benefit from cheap ARM boards and the related open source tools?
- Raspberry Pi & STM32F4-Discovery
Build a fully functional virtual machine from scratch, using Brainfuck as the language. Covers basic concepts of interpreters, optimization techniques, language specialization, and platform-specific tweaks.
(1) Analyze large-scale system software
(2) Diagnose faults inside system software, especially device drivers
(3) Deal with faulty device driver implementations
(Presentation at COSCUP 2012) Discusses why you should try to develop your own operating system and how you can speed up development by taking the microkernel approach.
University of Virginia
cs4414: Operating Systems
http://rust-class.org
For embedded notes, see:
http://rust-class.org/class-22-microkernels-and-beyond.html
Manta: a new internet-facing object storage facility that features compute by... (Hakka Labs)
As the amount of unstructured data has greatly exceeded a single computer's ability to process it, data has become increasingly isolated from the compute elements. The resulting haul from stores of record (e.g., SAN, NAS, S3) to transient compute (e.g., Hadoop, EC2) creates needless mechanical work and human labor. Is there a better way? In this talk, we'll explore the coming convergence of data and compute in the cloud, focusing in particular on Joyent's Manta, a new internet-facing object storage facility that features compute. We will describe the design principles for Manta, the engineering challenges in building it, and, more generally, the opportunities presented by the convergence of compute and data.
UPDATED OCTOBER 2015: Unikernels are small, fast, easily deployable, and very secure application stacks. Lacking a traditional operating system layer, they provide a new way of looking at the cloud which goes beyond the methodologies used by Docker and other container technologies.
This is an update of the deck as delivered by Russell Pavlicek. It includes some ground-breaking work done in the Rump Kernel project to bring web servers, databases, and scripting languages into the world of unikernels.
This deck is the result of the Ohio Linuxfest 2015 in Columbus, OH.
Xen Project Evangelist Russell Pavlicek talks about how the growing area of hypervisor-leveraging unikernels will help redefine the cloud.
MAJOR UPDATE: Deck is now the result of 2015 Ohio Linuxfest, about a year after the initial talk. Deck now contains almost twice as much information as the original talk.
Unikernel User Summit 2015: The Next Generation Cloud: Unleashing the Power o... (The Linux Foundation)
Xen Project Evangelist Russell Pavlicek's presentation at the Unikernel User Summit at Texas Linux Fest 2015. An overview of the world of unikernels and their importance for the future. Beyond Docker and containers, unikernels are smaller, lighter, and more secure than any workload currently in the cloud.
The Performance of μ-Kernel-Based Systems, from Mach to L4: analyzing the evolution of kernel structures and IPC (message-based communication) in the monolithic Linux kernel and in microkernels.
Analysis of Practicality and Performance Evaluation for Monolithic Kernel and... (CSCJournals)
Microkernel systems (as opposed to monolithic systems) have been developed for years, in the hope that microkernels could solve the problems of other operating systems. However, the evolution of microkernel systems did not go as many people expected. Because of design flaws in the system structure, the performance of the first generation of microkernel operating systems was disappointing: the system overhead was too high for users to bear. The second-generation microkernel systems, however, use an improved design architecture that substantially reduces the overhead of the earlier microkernel systems. This project evaluates the system performance of MINIX 3.1.2a against Linux using the Unixbench evaluation tool, to test whether microkernel systems can be more flexible, portable, and secure than monolithic operating systems. Unixbench provides statistics on different capacities of MINIX 3 and Linux, such as system call overhead, pipe throughput, and arithmetic performance. The results show that MINIX 3 performs better on shell script execution and the arithmetic test, while Linux performs better on other aspects such as system call overhead and process creation. Furthermore, we provide a more detailed analysis of the MINIX 3 microkernel system and propose a method to improve its performance.
Introduces Brainf*ck, another Turing-complete programming language, then implements the following from scratch: an interpreter, a compiler (x86_64 and ARM), and a JIT compiler.
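As a sketch of the interpreter stage (the compiler and JIT reuse the same dispatch structure, emitting native code instead of switching in a loop), a minimal Brainf*ck interpreter could look like the following; the function name `bf_run` and the 30,000-cell tape are conventional choices, not taken from any particular deck:

```python
def bf_run(program, input_bytes=b"", tape_size=30000):
    """Minimal Brainf*ck interpreter: 8 commands over a byte tape."""
    # Precompute matching brackets so '[' / ']' jumps are O(1).
    jumps, stack = {}, []
    for pos, ch in enumerate(program):
        if ch == '[':
            stack.append(pos)
        elif ch == ']':
            start = stack.pop()
            jumps[start], jumps[pos] = pos, start

    tape = bytearray(tape_size)
    out = bytearray()
    pc = ptr = inp = 0
    while pc < len(program):
        ch = program[pc]
        if ch == '>':
            ptr += 1
        elif ch == '<':
            ptr -= 1
        elif ch == '+':
            tape[ptr] = (tape[ptr] + 1) % 256
        elif ch == '-':
            tape[ptr] = (tape[ptr] - 1) % 256
        elif ch == '.':
            out.append(tape[ptr])
        elif ch == ',':
            tape[ptr] = input_bytes[inp] if inp < len(input_bytes) else 0
            inp += 1
        elif ch == '[' and tape[ptr] == 0:
            pc = jumps[pc]          # skip loop body when cell is zero
        elif ch == ']' and tape[ptr] != 0:
            pc = jumps[pc]          # loop back while cell is nonzero
        pc += 1
    return bytes(out)

# 8 * 8 = 64, then +1 = 65 = ASCII 'A':
print(bf_run("++++++++[>++++++++<-]>+."))  # b'A'
```

A compiler simply emits the machine-code equivalent of each branch above, and a JIT does the same at run time; the optimization passes mentioned in the abstract (e.g. folding runs of `+`/`>`) apply to all three.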
This presentation covers the general concepts of real-time systems, how the Linux kernel supports preemption, latency in Linux, rt-preempt, and Xenomai, the real-time extension that takes the dual-kernel approach.
5. Types of Kernel Designs
• Monolithic kernel
• Microkernel
• Hybrid Kernel
• Exokernel
• Virtual Machine / Hypervisor
Source: Michael Roitzsch, TU Dresden
6. Monolithic Kernel
• All OS services operate in kernel space
• Good performance
– fewer context switches and TLB flushes
• Disadvantages
– Dependencies between system components
– Complex & huge (millions(!) of lines of code)
– The larger size makes it hard to maintain
• Examples: MULTICS, Unix, FreeBSD, Linux
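The performance point above (fewer kernel/user crossings) can be glimpsed even from a high-level language. Below is a rough microbenchmark comparing a plain user-space call with a call that enters the kernel; note that Python's own call overhead dominates here, so the gap is far smaller than a C measurement would show, and absolute numbers vary by machine:

```python
import os
import timeit

def user_space_only():
    # Never leaves user space: no mode switch, no kernel entry.
    return 42

N = 100_000
call_s = timeit.timeit(user_space_only, number=N)
# os.getpid() typically issues a real getpid(2) syscall on each call
# (glibc stopped caching the PID in 2.25), crossing into the kernel.
syscall_s = timeit.timeit(os.getpid, number=N)

print("plain call    : %.1f ns/call" % (call_s / N * 1e9))
print("getpid syscall: %.1f ns/call" % (syscall_s / N * 1e9))
```

The same crossing cost is why the microkernel designs discussed later work hard to make IPC (which multiplies these crossings) as cheap as possible.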
7. TCB (Trusted Computing Base)
[Diagram comparing the size of the system TCB by architecture:
– traditional embedded: all code
– Linux/Windows: 100,000 LoC
– microkernel-based: 10,000 LoC]
Diagram from Kashin Lin (NEWS Lab)
8. Case Study: Bugs inside big kernels
• Drivers cause 85% of Windows XP crashes.
– Michael M. Swift, Brian N. Bershad, Henry M. Levy: “Improving the Reliability of Commodity Operating Systems”, SOSP 2003
• The error rate in Linux drivers is 3x that of the rest of the kernel (maximum: 10x)
– Andy Chou, Junfeng Yang, Benjamin Chelf, Seth Hallem, Dawson R. Engler: “An Empirical Study of Operating System Errors”, SOSP 2001
• Causes of driver bugs
– 23% programming errors
– 38% mismatches regarding the device specification
– 39% OS-driver-interface misconceptions
– Leonid Ryzhyk, Peter Chubb, Ihor Kuz, Gernot Heiser: “Dingo: Taming device drivers”, EuroSys 2009
9. Microkernel
• Minimalist approach
– IPC, virtual memory, thread scheduling
• Put the rest into user space
– Device drivers, networking, file system, user interface
• Disadvantages
– Lots of system calls and context switches
• Examples: Mach, L4, QNX, MINIX, IBM K42
10. Microkernel
• Put the rest into user space
– Device drivers, networking, file system, user interface
[Diagram: file system, networking, multimedia, windowing, the process manager, and applications plug into a message bus on top of the microkernel; the microkernel and process manager are the only trusted components]
• Applications and drivers are processes which plug into a message bus
– Reside in their own memory-protected address space
– Have a well-defined message interface
– Cannot corrupt other software components
– Can be started, stopped and upgraded on the fly
11. Microkernel: Definitions
• A kernel technique that provides only the minimum OS services:
– Address spaces
– Inter-process communication (IPC)
– Thread management
– Unique identifiers
• All other services are implemented independently in user space.
12. Microkernel
[Diagram: device drivers, user programs, and memory managers run in user mode; the microkernel provides address spaces, thread management and IPC, and unique identifiers on top of the hardware]
14. Tanenbaum-Torvalds Debate
• Andrew Tanenbaum
– Author of Minix, Amoeba, Globe
– Many books on OS and distributed systems
– One of the most influential figures in OS research
• Linus Torvalds
– Chief architect of the Linux kernel
– One of the most influential figures in open source
15. round 1: “Linux is obsolete” (1992)
• Historical context – early 1990s:
– AT&T USL vs BSDi lawsuit.
– The GNU kernel was not out yet.
• Andrew:
– “Linux is obsolete”
– “The limitations of Minix are partly due to my being a prof.”
– w.r.t. OS design, the debate is over: “Microkernels have won.” and “Linux is a giant step back into the 1970s”.
– “writing a new operating system that is closely tied to any particular piece of hardware, especially a weird one like the Intel line, is basically wrong.”
– “Designing a monolithic kernel in 1991 is a fundamental error. Be thankful you are not my student. You would not get a high grade for such a design :-)”
16. Linus' response
• On “for me MINIX is a hobby”: “You use this as an excuse for the limitations of minix? Sorry, but you loose: I've got more excuses than you have, and linux still beats the pants of minix in almost all areas. Look at who makes money off minix, and who gives linux out for free”
• On “Minix is a micro-kernel design, Linux is a monolithic style sys”: “If this was the only criterion for the "goodness" of a kernel, you'd be right. What you don't mention is that minix doesn't do the micro-kernel thing very well, and has problems with real multitasking (in the kernel). If I had made an OS that had problems with a multithreading filesystem, I wouldn't be so fast to condemn others: in fact, I'd do my damndest to make others forget about the fiasco.”
• On “Portability is for people who cannot write new programs”: “I agree that portability is a good thing: but only where it actually has some meaning. There is no idea in trying to make an operating system overly portable: adhering to a portable API is good enough. The very idea of an operating system is to use the hardware features, and hide them behind a layer of …”
17. round 2: Minix 3 (2006)
• Context: “Can we make OS reliable and secure?” & Slashdot
• Andrew:
– “Be sure brain is in gear before engaging mouth”
– Encouraged people to try Minix 3 before commenting on microkernels
• Linus:
– “The real reason people do microkernels .. and make up new reasons for why they are better is that the concept sounds so good. It sounded good to me too!”
– “…real progress is made. Not by people who are afraid of the complexity. Real progress is made by people who are too ignorant to even realize that there is complexity, and how things are "supposed" to work, and they just do it their own way. Over-thinking things is never how you do anything real.”
18. Reliability, Security, and Complexity
• Andrew:
– Reliability & security have now become more important than performance for most users. Systems in military and aerospace use microkernels.
– Complexity:
Shared data structures are bad and hard to get right: “you want to avoid shared data structures as much as possible”. “That's what object-oriented programming is all about” (thousands of bugs have been found in the Linux kernel in this area alone)
Distributed algorithms are only a problem for multiple machines. The complexity is manageable; we implemented it.
• Linus:
– Reliability & security: “The fact that each individual piece is simple and secure does not make the aggregate either simple or secure”, “when one node goes down, often the rest comes down too.”
– Complexity:
Shared data structures are good: besides the performance benefit, they make it easier to develop the system: “in the absense of a shared state, you have a hell of a lot of problems trying to make any decision that spans more than one entity in the system.”
Microkernels require distributed algorithms, which are difficult to get right: “whenever you compare the speed of development of a microkernel and a traditional kernel, the traditional kernel wins. By a huge amount, too.”
19. 3 Generations of Microkernel
• Mach (1985-1994)
– replace pipes with IPC (more general)
– improved stability (vs monolithic kernels)
– poor performance
• L3 & L4 (1990-2001)
– order of magnitude improvement in IPC performance
• written in assembly, sacrificed CPU portability
• only synchronous IPC (build async on top of sync)
– very small kernel: more functions moved to userspace
• seL4, Fiasco.OC, Coyotos, NOVA (2000-)
– platform independence
– verification, security, multiple CPUs, etc.
21. Hybrid Kernel
• Combine the best of both worlds
• Speed and simple design of a monolithic kernel
• Modularity and stability of a microkernel
• Still similar to a monolithic kernel
• Disadvantages still apply here
• Examples: Windows NT, NetWare, BeOS, Darwin
(MacOS X), DragonFly BSD
– Darwin is built around XNU, a hybrid kernel that combines
the Mach 3 microkernel, various elements of BSD (including
the process model, network stack, and virtual file system),
and an object-oriented device driver API called I/O Kit.
22. Exokernel
• Follows end-to-end principle ("Library" OS)
– Extremely minimal
– As few hardware abstractions as possible
– Just allocates physical resources to apps
• Disadvantages
– More work for application developers
• Examples: MIT Exokernel, Nemesis
23. “Worse Is Better”, Richard P. Gabriel
• Simplicity is the No. 1 consideration in both styles:
– New Jersey style [UNIX, Bell Labs]: Implementation > Interface
– MIT style [Multics]: Interface > Implementation
• Correctness: New Jersey mostly, MIT 100%
• Consistency: New Jersey mostly, MIT 100%
• Completeness: New Jersey de facto, MIT mostly
25. Contrasting Reading
• Paper: The Duality of Memory and Communication in the Implementation of a Multiprocessor Operating System
– SOSP 1987, Young et al.
• Examples: CMU Mach (1985), Chorus (1987)
• Richard Rashid
– Lead developer of Mach
– Lead, Microsoft Research
• William Bolosky
– Microsoft Research
• Avie Tevanian
– former Apple CTO
• Brian Bershad
– Professor, University of Washington
26. Mach
• 1st generation microkernel
– OpenStep / Apple Darwin (XNU), GNU Hurd, IBM Workplace OS, OSF/1
• Derived from the CMU Accent operating system
– Accent had no ability to execute UNIX applications (1981)
– Mach = BSD Unix system + Accent concepts
• Memory object
– Manages system services like network paging and file systems
• Memory via communication
UNIX was owned by AT&T, which controlled the market. IBM, DEC, etc. got together and formed the OSF (Open Software Foundation). In an effort to conquer market share, OSF took the Mach 2.5 release and made it the OSF/1 system.
27. Mach failed because of performance
• Mach was notorious for suffering from excessive performance limitations.
• Mach on a DECstation 5200/200 was found to endure peak degradations of up to 66% when compared to Ultrix running on the same hardware. [1]
• The Mach-based OSF/1 is cited to perform on average at only half the performance level of the monolithic OSF/1. [2]
[1] J. Bradley Chen and Brian N. Bershad. The impact of operating system structure on memory system performance. In Proceedings of the 14th ACM Symposium on OS Principles, pages 120–133, Asheville, NC, USA, December 1993.
[2] Michael Condict, Don Bolinger, Dave Mitchell, and Eamonn McManus. Microkernel modularity with integrated kernel performance. Technical report, OSF Research Institute, Cambridge, 1994.
28. Mach in a nutshell
• Simple, extensible communication kernel
– “Everything is a pipe.” – ports as secure communication
channels
• Multiprocessor support
• Message passing by mapping
• Multi-server OS personality
• POSIX-compatibility
• Shortcomings
– Performance
– drivers still in the kernel
29. Mach Design Principles
• Maintain BSD compatibility
– Simple programmer interface
– Easy portability
– Extensive library of utilities /
applications
– Combine utilities via pipes
• In addition,
– Diverse architectures
– Varying network speed
– Simple kernel
– Distributed operation
– Integrated memory management
and IPC
– Heterogeneous systems
30. Mach Abstractions
• Task
– Basic unit of resource allocation
– Virtual address space, communication capabilities
• Thread
– Basic unit of computation
• Port
– Communication channel for IPC
• Message
– May contain port capabilities, pointers
• Memory Object
32. Ports in Mach microkernel
• Dedicated kernel objects
• Applications hold send/recv rights for ports
• Kernel checks whether task owns sufficient rights
before doing IPC
33. Memory Management and IPC
• Memory management using IPC:
– Memory object represented by port(s)
– IPC messages are sent to those ports to request operations on the object
– Memory objects can be remote → kernel caches the contents
• IPC using memory-management techniques:
– Pass messages by moving pointers to shared memory objects
– Virtual-memory remapping to transfer large contents (virtual copy or copy-on-write)
34. External Memory Management
• No kernel-based file system
– Kernel is just a cache manager
• Memory object
– Also known as a “paging object”
• Pager
– Task that implements a memory object
35. Benefits from such design
• Key point: consistent network shared memory
• Examples:
– Each client maps X with shared pager
– Use primitives to tell kernel cache what to do:
• Locking
• Flushing
36. Problems of External Memory Management
• External data manager failure looks like communication failure
– timeouts are needed
• Opportunities for the data manager to deadlock on itself
37. Performance Impacts
• Does not prohibit caching
• Reduces the number of copies of data occupying memory
– Copy-to-use, copy-to-kernel
– More memory for caching
• “compiling a small program cached in memory…is twice as fast”
• I/O operations reduced by a factor of 10
• Context switch overhead
– Cost of kernel overhead can be up to 800 cycles.
• Address space switches
– Expensive page table and segment switch overhead
– Untagged TLB = bad performance
39. Contrasting Reading
• Paper: On μ-Kernel Construction
– 15th Symposium on Operating Systems Principles (1995)
– Microkernels can have adequate performance if they are architecture-dependent.
• Jochen Liedtke (1953-2001)
– worked on Eumel, L3, L4
– worked at IBM Research and GMD (German National Research Center for IT)
“Our vision is a microkernel technology that can be and is used advantageously for constructing any general or customized operating system.” [Liedtke et al 2001]
40. What goes into the kernel?
• Only determined by function, not performance.
• The key issue is protection.
– Principle of independence
• Servers can’t stomp on each other.
– Principle of integrity
• Communication channels can’t be interfered with.
• Usually the following go into the kernel:
– Address spaces
– IPC
– Basic scheduling
41. L4: the 2nd Generation
• Similar to Mach
– Started from scratch, rather than from a monolithic kernel
– But even more minimal
• Minimality principle for L4:
A concept is tolerated inside the microkernel only if moving it outside the kernel, i.e., permitting competing implementations, would prevent the implementation of the system's required functionality.
• Tasks, threads, IPC
– Contains only 13 system calls; 7 of them are for IPC
– Uses only 12k of memory
42. Recall: Mach kernel
• API Size: 140 functions
– Asynchronous IPC
– Threads
– Scheduling
– Memory management
– Resource access permissions
– Device drivers (in some variants)
• All other functions are implemented outside the kernel.
43. Performance Issues of the Mach kernel
• Checking resource access permissions on system calls.
– Single-user machines do not need to do this.
• Cache misses.
– Critical sections were too large.
• Asynchronous IPC
– Most calls only need synchronous IPC.
– Synchronous IPC can be faster than asynchronous.
– Asynchronous IPC can be built on top of synchronous.
• Virtual memory
– How to prevent key processes from being paged out?
44. Performance Issues for 1st Generation
• First-generation microkernels were slow
• Reason: poor design [Liedtke SOSP'95]
– complex API
– too many features
– poor design and implementation
– large cache footprint ⇒ memory-bandwidth limited
• L4 is fast due to its small cache footprint
– 10–14 I-cache lines
– 8 D-cache lines
– Small cache footprint ⇒ CPU limited
45. Designs taken by L4 over Mach
• API Size: 7 functions (vs 140 for Mach3)
– Synchronous IPC
– Threads
– Scheduling
– Memory management
46. IPC Performance of L4
• “Radical” approach
• Strict minimality
• From-scratch design
• Fast primitives
47. Typical code size of 2nd Generation Microkernels
• Source code (OKL4)
– ~9k LOC architecture-independent
– ~0.5–6k LOC architecture/platform-specific
• Memory footprint of kernel (not aggressively minimized):
– Using gcc (poor code density on RISC/EPIC architectures)
Architecture Version Text Total
X86 L4Ka 52k 98k
Itanium L4Ka 173k 417k
ARM OKL4 48k 78k
PPC-32 L4Ka 41k 135k
PPC-64 L4Ka 60k 205k
MIPS-64 NICTA 61k 100k
x86 seL4 74k 98k
ARMv6 seL4 64k 112k
48. L4 Abstractions, Mechanisms, Concepts
• 3 basic Abstractions
– Address spaces (or virtual MMUs) — for protection
– Threads (or virtual CPUs) — for execution
– Capabilities (for naming and access control) — from OKL4 2.1
– Time (for scheduling) — removed recently
• 2 basic Mechanisms
– Message-passing communication (IPC)
– Mapping memory to address spaces
• Other core concepts
– Root task — Removed in OKL4 2.2
– Exceptions
49. L4 based Systems
(each variant differs in some parts)
• L4 kernel
– variants
• Device Driver
• Server
• Application
50. Abstractions: Address Spaces
• Definition: a mapping which associates each virtual page with a physical page. (Liedtke)
• The microkernel has to hide the hardware concept of address spaces, since otherwise implementing protection would be impossible
• The microkernel concept of address spaces must be tamed, but must permit the implementation of arbitrary protection schemes on top of the microkernel
51. Abstractions: Address Spaces (AS)
• Address space is unit of protection
– Initially empty
– Populated by mapping in frames
• Mapping performed by privileged MapControl() syscall
– Can only be called from root task
– Also used for revoking mappings (unmap operation)
• Root task [removed]
– Initial AS created at boot time, controlling system resources
– Privileged system calls can only be performed from the root task
– Privileged syscalls identified by names ending in “Control”
53. Abstractions: Address Spaces
• L4 provides 3 operations:
– Grant: the owner of an address space can grant any of its pages to another space
– Map: the owner of an address space can map any of its pages into another space
– Flush: the owner of an address space can flush any of its pages
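The three operations can be modelled with a toy mapping database. This is only an illustrative sketch: the types, sizes, and bookkeeping are invented, and real L4 uses flexpages and a far more compact mapping database:

```c
/* Toy model of the recursive address-space operations. Frames are just
 * integers; each address space maps page -> frame and remembers which
 * space/page the mapping came from, so flush can revoke recursively. */
#define PAGES  8
#define SPACES 4

typedef struct {
    int frame;                /* -1 = unmapped                 */
    int src_space, src_page;  /* where this mapping came from  */
} pte_t;

static pte_t as[SPACES][PAGES];

void as_init(void) {
    for (int s = 0; s < SPACES; s++)
        for (int p = 0; p < PAGES; p++)
            as[s][p] = (pte_t){ -1, -1, -1 };
}

/* map: the owner keeps the page, the receiver gets a dependent copy */
int as_map(int from, int fp, int to, int tp) {
    if (as[from][fp].frame < 0) return -1;
    as[to][tp] = (pte_t){ as[from][fp].frame, from, fp };
    return 0;
}

/* grant: the owner loses the page, the receiver takes its place */
int as_grant(int from, int fp, int to, int tp) {
    if (as_map(from, fp, to, tp) < 0) return -1;
    as[to][tp].src_space = as[from][fp].src_space;
    as[to][tp].src_page  = as[from][fp].src_page;
    as[from][fp] = (pte_t){ -1, -1, -1 };
    return 0;
}

/* flush: revoke the page from every space that mapped it from here,
 * recursively; the flusher itself keeps the page */
void as_flush(int s, int p) {
    for (int ds = 0; ds < SPACES; ds++)
        for (int dp = 0; dp < PAGES; dp++)
            if (as[ds][dp].src_space == s && as[ds][dp].src_page == p) {
                as_flush(ds, dp);
                as[ds][dp] = (pte_t){ -1, -1, -1 };
            }
}
```

The recursive revocation in as_flush is exactly the per-mapping bookkeeping whose cost later motivated dropping the recursive address-space model.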
54. Abstractions: Address Spaces – I/O
• I/O ports treated as address spaces
• As address spaces, they can be mapped and unmapped
• Hardware interrupts are handled by user-level processes; the L4 kernel sends a message via IPC.
55. Abstractions: Address Spaces – I/O
• Recall that I/O can be done either with special “ports” or memory-mapped.
• Incorporated into the address space
– Natural for memory-mapped I/O (RISC machines)
– Also works on I/O ports (x86 permits control per port, but no mapping)
• Do not confuse memory-mapped I/O with memory-mapped files.
56. Abstractions: AS – I/O Memory
• Devices often contain on-chip memory (NIC, graphics, etc.)
• Instead of accessing it through I/O ports, drivers can map this memory into their address space just like normal RAM
– no need for special instructions
– increased flexibility by using the underlying virtual memory management
57. Abstractions: AS – I/O Memory
• Device memory looks just like physical memory
• The chipset needs to
– map I/O memory to exclusive address ranges
– distinguish physical and I/O memory accesses
58. Abstractions: AS – Device Driver
• Hardware interrupts: mapped to IPC
• I/O memory & I/O ports: mapped via flexpages
60. Abstractions: Threads
• Thread is unit of execution
– Kernel-scheduled
• Thread is addressable unit for IPC
– Thread capability used for addressing and establishing send rights
– Called Thread-ID for backward compatibility
• Threads managed by user-level servers
– Creation, destruction, association with address space
• Thread attributes:
– Scheduling parameters (time slice, priority)
– Unique ID (hidden from userland)
• referenced via thread capability (local name)
– Address space
– Page-fault and exception handler
61. Abstractions: Threads
• Thread
– Implemented as kernel object
• Properties managed by the kernel:
– Instruction Pointer (EIP)
– Stack (ESP)
– Registers
– User-level TCB
• User-level applications need to
– allocate stack memory
– provide memory for application binary
– find entry point
62. Abstractions: Time
• Used for scheduling time slices
– Thread has fixed-length time slice for preemption
– Time slices allocated from (finite or infinite) time quantum
• Notification when exceeded
• In earlier L4 versions also used for IPC timeouts
– Removed in OKL4
• Future versions remove time completely from the
kernel
– If scheduling (including timer management) is completely
exported to user level
63. Mechanism: IPC
• Synchronous message-passing operation
• Data copied directly from sender to receiver
– Short messages passed in registers
– Long messages copied by kernel (semi-)asynchronously
[OKL4]
• Can be blocking or polling (fail if partner not ready)
• Asynchronous notification variant
– No data transfer, only sets notification bit in receiver
– Receiver can wait (block) or poll
• In earlier L4 versions [removed in OKL4]:
– IPC also used for mapping
– long synchronous messages
64. IPC: Inter-Process Communication
• Definition: exchange of data between two processes.
– IPC is one-way communication
– RPC (remote procedure call) is round-trip communication
• The microkernel handles message transfers between threads.
• Grant and Map operations rely on IPC.
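The blocking rendezvous that makes L4 IPC synchronous can be sketched in ordinary C with pthreads. The channel type and function names are invented, and a real kernel implements this with thread states and direct register transfer rather than condition variables:

```c
#include <pthread.h>
#include <stdint.h>

/* One-slot rendezvous: a send blocks until the receiver has taken the
 * message, so nothing is ever buffered. That no-buffering property is
 * the defining trait of L4-style synchronous IPC. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    int             full;
    uintptr_t       msg;    /* "register-only" short message */
} ipc_chan_t;

void ipc_chan_init(ipc_chan_t *c) {
    pthread_mutex_init(&c->lock, NULL);
    pthread_cond_init(&c->cond, NULL);
    c->full = 0;
}

void ipc_send(ipc_chan_t *c, uintptr_t msg) {
    pthread_mutex_lock(&c->lock);
    while (c->full)                    /* another send in flight */
        pthread_cond_wait(&c->cond, &c->lock);
    c->msg  = msg;
    c->full = 1;
    pthread_cond_broadcast(&c->cond);
    while (c->full)                    /* rendezvous: wait for the receiver */
        pthread_cond_wait(&c->cond, &c->lock);
    pthread_mutex_unlock(&c->lock);
}

uintptr_t ipc_recv(ipc_chan_t *c) {
    pthread_mutex_lock(&c->lock);
    while (!c->full)
        pthread_cond_wait(&c->cond, &c->lock);
    uintptr_t m = c->msg;
    c->full = 0;
    pthread_cond_broadcast(&c->cond);  /* release the blocked sender */
    pthread_mutex_unlock(&c->lock);
    return m;
}

static void *sender(void *arg) {
    ipc_send(arg, 7);
    return NULL;
}

/* spawn a sender thread and receive its message */
uintptr_t ipc_demo(void) {
    ipc_chan_t ch;
    pthread_t t;
    ipc_chan_init(&ch);
    pthread_create(&t, NULL, sender, &ch);
    uintptr_t m = ipc_recv(&ch);
    pthread_join(&t, NULL);
    return m;
}
```

Because the send blocks until the matching receive, the kernel never buffers messages, which is what keeps short IPC register-only and fast.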
65. Mechanism: IPC
• synchronous (no buffering)
• Payloads
– registers only (short IPC), fast
– strings (long IPC)
– access rights (“mappings”)
– Faults
– interrupts
66. Mechanism: IPC – Copy Data
• Direct and indirect data copy
• UTCB (User-level Thread Control Block) message
(special area; shared memory between user and kernel)
• Special case: register-only message
• Page faults during user-level memory access possible
67. Mechanism: IPC – Direct/Indirect Copy
• UTCB: set of “virtual” registers
• Message Registers
– System call parameters
– IPC: direct copy to receiver
• Buffer Registers
– Receive flexpage descriptors
• Thread Control Registers
– Thread-private data
– Preserved, not copied
68. Message: Map Reference
• Used to transfer memory pages and capabilities
• Kernel manipulates page tables
• Used to implement the map/grant operations
69. Mechanism: Mapping
• Create a mapping from a physical frame to a page in an address space
– Privileged syscall MapControl
– unprivileged in newer L4 (access control via memory caps)
• Typically done in response to a page fault
– VM server acting as pager
– can pre-map, of course
• Also used for mapping device memory
70. Page Fault Handling
• Page fault exception is caught by kernel page fault handler
• No management of user memory in kernel
• Invoke user-level memory management
– Pager
71. Pager
• Pager is the thread invoked on a page fault
• Each thread has a (potentially different) pager assigned
Pager Invocation
• Communication with the pager thread uses IPC
• The kernel page fault handler sets up IPC to the pager
• The pager sees the faulting thread as the sender of the IPC
72. Communication and Resource Control
• Need to control who can send data to whom
– Security and isolation
– Access to resources
• Approaches
– IPC redirection/introspection
– Central vs. Distributed policy and mechanism
– ACL-based vs. capability-based
73. Abstractions: Capabilities
(in recent L4 implementations)
• actors in the system are objects
– objects have local state and behavior
• capabilities are references to objects
– any object interaction requires a capability
– inseparable and unforgeable combination of reference and access right
74. Abstractions: Capabilities
• Capabilities reference threads
– the actual cap word (TID) is an index into the per-address-space capability list (Clist)
• Capability conveys privilege
– Right to send a message to the thread
– May also convey rights to other operations on the thread
• Capabilities are local names for global resources
75. Abstractions: Capabilities
• If a thread has access to a capability, it can map this capability to another thread
• Mapping / not mapping of capabilities is used for implementing access control
• Abstraction for mapping: flexpage
• Flexpages describe a mapping
– location and size of the resource
– receiver's rights (read-only, mappable)
– type (memory, I/O, communication capability)
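A toy version of a per-address-space Clist shows how a cap word is just a local index and how copying a capability with diminished rights implements access control. All names and the rights encoding are invented for illustration:

```c
/* Toy per-address-space capability list (Clist): the "cap word" an
 * application holds is just an index; the kernel checks rights before
 * performing any operation. */
enum { CAP_SEND = 1, CAP_RECV = 2, CAP_MAP = 4 };

#define CLIST_SIZE 16

typedef struct { int object; unsigned rights; } cap_t;
typedef struct { cap_t slot[CLIST_SIZE]; } clist_t;

/* returns the object id, or -1 if the cap is absent or lacks rights */
int cap_lookup(const clist_t *cl, int capword, unsigned need) {
    if (capword < 0 || capword >= CLIST_SIZE) return -1;
    const cap_t *c = &cl->slot[capword];
    if (c->object == 0) return -1;              /* empty slot          */
    if ((c->rights & need) != need) return -1;  /* insufficient rights */
    return c->object;
}

/* "mapping" a capability: copy it into another space's Clist with
 * possibly diminished rights; granting or withholding such copies is
 * how access control is implemented. */
int cap_map(const clist_t *from, int capword, clist_t *to, int dst,
            unsigned keep) {
    int obj = cap_lookup(from, capword, 0);
    if (obj < 0 || dst < 0 || dst >= CLIST_SIZE) return -1;
    to->slot[dst].object = obj;
    to->slot[dst].rights = from->slot[capword].rights & keep;
    return 0;
}
```

Because the cap word is only meaningful relative to one Clist, capabilities are local names for global resources, exactly as the slide states.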
77. Mechanism: Exception Handling
• Interrupts
– Modelled as hardware “thread” sending messages
– Received by registered (user-level) interrupt-handler thread
– Interrupt acknowledged by handler via syscall (optionally waiting for next)
– Timer interrupt handled in-kernel
• Page Faults
– Kernel fakes IPC message from faulting thread to its pager
– Pager requests root task to set up a mapping
– Pager replies to faulting client, message intercepted by kernel
• Other Exceptions
– Kernel fakes IPC message from exceptor thread to its exception handler
– Exception handler may reply with message specifying new IP, SP
– Can be signal handler, emulation code, stub for IPCing to server, ...
78. Page Fault and Pager
• Page faults are mapped to IPC
• Pager is a special thread that receives page faults
• Page fault IPC cannot trigger another page fault
• Kernel receives the flexpage from the pager and inserts the mapping into the page table of the application
• Other faults normally terminate threads
80. L4 API vs. ABI
• The L4 project was originally established by Jochen Liedtke in the 1990s and is actively researched by the L4Ka team at the University of Karlsruhe in collaboration with NICTA / University of New South Wales and the Dresden University of Technology.
• L4 is defined by a platform-independent API and a platform-dependent ABI
81. Background: Liedtke's involvement
• Jochen Liedtke and his colleagues continued research
on L4 and microkernel based systems in general.
• In 1996, Liedtke started to work at IBM's Thomas J.
Watson Research Center.
• In 1999, Liedtke took over the Systems Architecture
Group at the University of Karlsruhe, Germany
– L4Ka::Hazelnut
82. Commercial L4: from NICTA to OKLabs
• The L4::Pistachio microkernel was originally developed at Karlsruhe University. NICTA ported it to a number of architectures, including ARM, and optimized it for use in resource-constrained embedded systems.
• In 2004, Qualcomm engaged NICTA in a consulting arrangement to deploy L4 on Qualcomm's wireless communication chips.
• The engagement with Qualcomm grew to a volume where it was too significant a development/engineering effort to be done inside the research organization.
– Commercialized!
Source: http://microkerneldude.wordpress.com/2012/10/02/giving-it-away-part-2-on-microkernels-and-the-national-interes/
83. AMSS
• AMSS = Advanced Mobile Subscriber Software, developed by Qualcomm
– RTOS that runs atop the baseband processor
– based on the L4 microkernel
– a commercially proven, real-world microkernel
• Outside of some fixed functionality and off-die RF, the baseband really is just a CPU, a big DSP or two, and a collection of management tasks that run as software - GPS, GLONASS, talking with the USIM, being compliant with the pages and pages of 3GPP specifications, etc.
85. L4 Family: Fiasco
• Designed at TU Dresden
• Preemptible real-time kernel
• Used for DROPS project (GPL)
– Dresden Real-Time Operating Systems Project
– Find design techniques for the construction of distributed real
time operating systems
– Every component guarantees a certain level of service
87. Virtual Machines based on L4Re
• Isolate not only processes, but also complete operating systems (compartments)
– “Server consolidation”
88. L4 Family: Pistachio
• Pistachio was designed by L4Ka team and NICTA
EROPS Group
– L4 API V X.2
• NICTA::Pistachio-embedded
– Designed by UNSW and NICTA EROPS Group
– Based on L4Ka::Pistachio, designed for embedded
– keep almost all system calls short
89. L4 Family: OKL4
• Designed by OKLab (Open Kernel Labs)
– old source available
• OKLab spun out from NICTA
• Based on NICTA::Pistachio-embedded
• Commercial deployment
– Adopted by Qualcomm for CDMA chipsets
90. L4 Implementations by UNSW/NICTA
L4-embedded
• Fast context-switching on ARMv5
– context switching without cache flush on
virtually-addressed caches
– 55-cycle IPC on Xscale
– virtualized Linux faster than native
• Event-based kernel (single kernel stack)
• Removed IPC timeouts, “long” IPC
• Introduced asynchronous notifications
OKL4
• Dumped recursive address-space model
– reduced kernel complexity
– First L4 kernel with capability-based
access control
OKL4 Microvisor
• Removed synchronous IPC
• Removed kernel-scheduled threads
91. L4 Family: OKL4
• L4 implementations on embedded
processors
– ARM, MIPS
• Wombat: portable virtualized Linux
for embedded systems
• ARMv4/v5 thanks to fast
context-switching extension
92. Bring Linux to L4
• Some portions of Linux are architecture dependent:
– Interrupts
– Low-level device drivers (DMA)
– Methods for interaction with user processes.
– Switching between kernel contexts
– Copyin/copyout between kernel and user space
– Mapping/unmapping for address space
– System calls
93. L4Linux
• Is a paravirtualized Linux first presented at SOSP’97
running on the original L4 kernel.
– L4Linux predates the x86 virtualization hype
– L4Linux 2.4 first version to run on L4Env
– L4Linux 2.6 uses Fiasco.OC’s paravirtualization features
• Current status
– based on Linux 3.8
– x86 and ARM support
– SMP
95. L4Linux
• Linux kernel as L4 user service
– Runs as an L4 thread in a single L4 address space
– Creates L4 threads for its user processes
– Maps parts of its address space to user process threads
(using L4 primitives)
– Acts as pager thread for its user threads
– Has its own logical page table
– Multiplexes its own single thread (to avoid having to change
Linux source code)
97. L4Linux: Interrupt Handling
• All interrupt handlers are mapped to messages.
• The Linux server contains threads that do nothing but
wait for interrupt messages.
• Interrupt threads have a higher priority than the main
thread.
98. L4Linux: User Process
• Each user process is implemented as a different L4 task: it has its own address space and threads.
• The Linux server is the pager for these processes. Any fault by the user-level processes is sent by RPC from the L4 kernel to the server.
99. L4Linux: System Calls
• The statically linked and shared C libraries are modified
– System calls in the lib call the Linux kernel using IPC
• For unmodified Linux programs, use “trampoline”
– The application traps
– Control bounces to a user-level exception handler
– The handler calls the modified shared library
– Binary compatible
100. L4Linux: Signals
• Each user-level process has an additional thread for signal handling.
• The main server thread cannot directly manipulate the user process stack, so it sends a message to the signal handling thread, telling the user thread to save its state and enter Linux.
101. L4Linux: Scheduling
• All thread scheduling is done by the L4 kernel
• The Linux server's schedule() routine is only used for multiplexing its single thread.
• After each system call, if no other system call is pending, it simply resumes the user process thread and sleeps.
102. TLB
• A Translation Look-aside Buffer (TLB) caches page
table lookups
• On context switch, TLB needs to be flushed
• A tagged TLB tags each entry with an address space
label, avoiding flushes
• Pentium-class CPU can emulate a tagged TLB for
small address spaces
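The ASID trick can be sketched as a tiny software cache: a context switch merely changes the current ASID, and stale entries from other address spaces are simply never matched. Names and sizes here are invented for illustration:

```c
/* Toy tagged TLB: every entry carries an address-space ID (ASID), so a
 * context switch only changes the current ASID instead of flushing the
 * whole TLB. */
#define TLB_ENTRIES 8

typedef struct { int valid, asid; unsigned vpn, pfn; } tlb_entry_t;

static tlb_entry_t tlb[TLB_ENTRIES];
static int current_asid;
static int next_victim;

void tlb_switch(int asid) { current_asid = asid; }   /* no flush needed */

/* called after a page-table walk resolved a miss */
void tlb_fill(unsigned vpn, unsigned pfn) {
    tlb[next_victim].valid = 1;
    tlb[next_victim].asid  = current_asid;
    tlb[next_victim].vpn   = vpn;
    tlb[next_victim].pfn   = pfn;
    next_victim = (next_victim + 1) % TLB_ENTRIES;   /* FIFO replacement */
}

/* returns the frame number, or -1 on a TLB miss */
int tlb_lookup(unsigned vpn) {
    for (int i = 0; i < TLB_ENTRIES; i++)
        if (tlb[i].valid && tlb[i].asid == current_asid && tlb[i].vpn == vpn)
            return (int)tlb[i].pfn;
    return -1;
}
```

Two address spaces can cache entries for the same virtual page without conflict, which is why the IPC-heavy microkernel designs care so much about tagged TLBs.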
103. Small Memory Space
• To reduce TLB conflicts, L4Linux has a special library to customize code and data for communicating with the Linux server
• The emulation library and signal thread are mapped close to the application, instead of the default high-memory area.
104. Benchmark Matrix
• Compared the following systems
– Native Linux
– L4Linux
8.3% slower; Only 6.8% slower at maximum load.
– MkLinux (in-kernel)
• Linux ported to run inside Mach
• 29% slower
– MkLinux (user)
• Linux ported to run as a user process on top of Mach
• 49% slower
107. Contrasting Readings
• seL4: Formal Verification of an Operating-System
Kernel
– Gerwin Klein et al.
– SOSP'09 Best Paper
• Goals
– General-purpose
– Formal verification
• Functional correctness
• Security/safety properties
– High performance
108. Problems in the 2nd Generation
• The microkernel needs memory for its abstractions
– tasks: page tables
– threads: kernel TCBs
– capability tables
– IPC wait queues
– mapping database
• Kernel memory is limited
– opens the possibility of DoS attacks
109. Ideas to solve the 2nd Generation's Problems
• Memory management policy should not be in the kernel
• Account all memory to the application it is needed for (directly or indirectly)
• Kernel provides memory control mechanisms
– exception for bootstrapping: initial kernel memory is managed by the kernel
• Untyped memory in seL4 (3rd Generation)
– all physical memory unused after bootstrap is represented by untyped memory capabilities
110. Moving from 2nd to 3rd Generation
OKL4
• Dumped recursive address-space model
– reduced kernel complexity
– First L4 kernel with capability-based
access control
OKL4 Microvisor
• Removed synchronous IPC
• Removed kernel-scheduled threads
seL4
• All memory management at user level
– no kernel heap!
• Formal proof of functional correctness
• Performance on par with fastest kernels
– <200 cycle IPC on ARM11 without
assembler fastpath
111. Why were Recursive Address Spaces removed?
Reasons:
• Complex & large mapping database
→ may account for 50% of memory use!
• Lack of control over resource use
→ implicit allocation of mapping nodes
• Potential covert channels
112. seL4 Novelty: Kernel Resource Management
• No kernel heap: all memory left after boot is handed to userland
– Resource manager can delegate to subsystems
• Operations requiring memory explicitly provide memory to
kernel
• Result: strong isolation of subsystems
– Operate within delegated resources
– No interference
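The "no kernel heap" idea can be caricatured as a bump allocator handed to userland: creating a kernel object consumes part of an explicitly provided untyped region. This ignores seL4's real retype rules, object types, alignment, and revocation; all names are invented:

```c
#include <stddef.h>

/* Caricature of seL4-style untyped memory: the kernel has no heap, so
 * every kernel object is carved out of an untyped region that userland
 * explicitly provides. */
typedef struct { size_t base, size, watermark; } untyped_t;

/* carve an object of `bytes` out of the untyped region;
 * returns its address, or 0 when the region is exhausted */
size_t untyped_retype(untyped_t *ut, size_t bytes) {
    if (ut->watermark + bytes > ut->size)
        return 0;                 /* no kernel heap to fall back on */
    size_t addr = ut->base + ut->watermark;
    ut->watermark += bytes;
    return addr;
}
```

Because allocation can only fail against the caller's own region, one subsystem exhausting its memory cannot interfere with another, which is the isolation property the slide describes.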
113. Conceptual Differences from the original L4
(feature in the original L4 → how seL4 and the OKL4 Microvisor handle it)
• Abstractions: present in all three, but the abstractions are quite different
• Threads: seL4 has threads; the OKL4 Microvisor has virtual CPUs
• Address spaces: seL4 has address spaces; the OKL4 Microvisor has a virtual MMU
• Synchronous IPC: seL4 has sync IPC + async notify; the OKL4 Microvisor has virtual IRQs (async)
• Rich message structure: dropped in both
• Unique thread IDs: dropped; both use capabilities
• Virtual TCB array: dropped
• Per-thread kernel stack: dropped; the kernel is event-based
115. Mobile Virtualization
• Why mobile virtualization?
– Current and next-generation smartphones increasingly
resemble desktop computers in terms of computing power,
memory, and storage
• OKL4 Microvisor
– API optimized for low-overhead virtualization
– Eliminated:
• recursive address spaces
• Synchronous IPC
• Kernel-scheduled threads
– API closely models hardware:
• vCPU, vMMU, vIRQ + “channels” (FIFOs)
• Capabilities for resource control
116. OKL4 Microvisor: Benefits and Capabilities
• Virtualization
• Resource management
• Lightweight components
• Real-time capability and low performance overhead
• Small memory footprint
• High-performance IPC between secure cells
• Minimal trusted computing base (TCB)
• Single-core and multicore support
• Ready-to-integrate guest OS support with available OS support packages (paravirtualized guest OSes)
117. Use Cases
• Each secure cell in the system offers isolation from software in other cells
• Existing software components can be reused in new designs
• The Microvisor tames the complexity of dispatching multi-OS workloads across multiple physical CPUs
118. Use Case: Low-cost 3G Handset
• Mobile handsets
– Major applications run on Linux
– The 3G modem software stack runs in an RTOS domain
• Virtualization in multimedia devices
– Reduces BOM (bill of materials)
– Enables reuse of legacy code/applications
– Reduces system development time
• Instrumentation, automation
– Run an RTOS for measurement and analysis
– Run a GPOS for the graphical interface
119. Reference
• Wikipedia - http://en.wikipedia.org/wiki/L4_microkernel
• Microkernels, Arun Krishnamurthy, University of
Central Florida
• Microkernel-based Operating Systems – Introduction,
Carsten Weinhold, TU Dresden
• Threads, Michael Roitzsch, TU Dresden
• Virtualization, Julian Stecklina, TU Dresden