The document describes the process of implementing SMP support for OpenBSD on a SGI Octane 2 machine. Key steps included restructuring per-processor data, implementing locking primitives, handling hardware aspects like spinning up secondary processors, and debugging challenges like detecting deadlocks. Debugging was made difficult by timing issues but was aided by tools like JTAG, DDB, printfs, and modifying locks to record stuck locations. Interrupts could block inter-processor communication so the clock handler was modified to re-enable interrupts during locking.
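The idea of modifying locks to record where they got stuck can be illustrated with a small sketch. This is not the OpenBSD code; it is a user-space Python analogy, and the class name `DebugLock`, the `max_spins` parameter and the error message are all assumptions made for illustration.

```python
import threading
import traceback


class DebugLock:
    """Lock that remembers where it was taken and reports stuck waiters.

    Hypothetical illustration only: a kernel would store a return address
    and spin on an atomic word rather than use threading.Lock.
    """

    def __init__(self, name):
        self.name = name
        self._lock = threading.Lock()
        self.holder_site = None  # "file:line" of the current holder

    def acquire(self, max_spins=1000):
        # Record the caller's location before spinning.
        frame = traceback.extract_stack(limit=2)[0]
        site = f"{frame.filename}:{frame.lineno}"
        for _ in range(max_spins):
            if self._lock.acquire(blocking=False):
                self.holder_site = site
                return
        # Spin budget exhausted: report both the waiter and the holder.
        raise RuntimeError(
            f"lock {self.name!r} stuck at {site}; held since {self.holder_site}"
        )

    def release(self):
        self.holder_site = None
        self._lock.release()
```

A waiter that exhausts its spin budget fails loudly and names both its own location and the holder's, which is the kind of information that makes a deadlock diagnosable without relying on timing-sensitive tracing.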
This presentation is about creating software for hardware which does not exist yet. In particular, it explains how to add support for new hardware (an I2C accelerometer) to QEMU, how to simulate the new hardware, how to write a simple application that works with the accelerometer, and demonstrates that it works on the real platform as well as under QEMU.
Presentation by Igor Kaplinsky (Senior Embedded Software Developer, GlobalLogic, Kyiv), Taras Protsiv (Embedded Software Developer, GlobalLogic, Kyiv), and Volodymyr Shymanskyy (Embedded Software Developer, GlobalLogic, Kyiv), Embedded TechTalk, Lviv, 2014.
More details -
http://www.globallogic.com.ua/press-releases/embedded-lviv-techtalk-2-coverage
Kernel Recipes 2016 - entry_*.S: A carefree stroll through kernel entry code (Anne Nicolas)
I have always wondered what happens when we enter the kernel from userspace: what preparations the hardware makes when the instructions that switch from userspace to kernel space and back are executed, and what the kernel does when it executes a system call. There are also a bunch of things it does before it executes the actual syscall, so I try to look at those too.
This talk is an attempt to demystify some aspects of the cryptic x86 entry code in arch/x86/entry/, which is written in assembly: how it all fits with the software-visible architecture of x86, which hardware features are used and how.
The hope is to get more people excited about this funky piece of the kernel and maybe have the same fun we’re having.
Borislav Petkov, SUSE
Translation Cache Policies for Dynamic Binary Translation (Saber Ferjani)
Our project aims to improve QEMU simulation speed by proposing a new cache algorithm that detects frequently used blocks and improves their reuse ratio.
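The abstract does not describe the algorithm itself, so the following is only a minimal sketch of one generic frequency-aware policy (least-frequently-used eviction), not the authors' proposal. The class name `FrequencyAwareCache`, the use of a guest program counter as key and the capacity value are assumptions.

```python
class FrequencyAwareCache:
    """Toy translation cache that evicts the least frequently hit block.

    Hypothetical sketch: keys stand for guest program counters and values
    for translated blocks; a real translator stores host machine code.
    """

    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = {}  # guest pc -> translated block (opaque)
        self.hits = {}    # guest pc -> number of lookups that hit

    def lookup(self, pc):
        block = self.blocks.get(pc)
        if block is not None:
            self.hits[pc] += 1  # count reuse so hot blocks survive eviction
        return block

    def insert(self, pc, block):
        if pc not in self.blocks and len(self.blocks) >= self.capacity:
            # Evict the coldest block instead of discarding everything.
            coldest = min(self.hits, key=self.hits.get)
            del self.blocks[coldest]
            del self.hits[coldest]
        self.blocks[pc] = block
        self.hits.setdefault(pc, 0)
```

With a capacity of two, a block that has been looked up several times stays resident when a third block arrives, while the block that was never reused is the one evicted.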
Kernel Recipes 2016 - kernelci.org: 1.5 million kernel boots (and counting) (Anne Nicolas)
The kernelci.org project performs over 2000 kernel boot tests per day for upstream kernels on a wide variety of hardware. This talk will provide an overview of kernelci.org, how distributed board farms are used, how it is used by kernel maintainers and developers, and how you can make use of the reports, logs and pre-built images.
Kevin Hilman, BayLibre
This presentation is about a methodology which allows patching of a running Linux kernel, its technical details, limitations as well as kpatch tools.
The talk was delivered by Ruslan Bilovol (Associate Manager, Consultant, GlobalLogic) at GlobalLogic Embedded Career Day #2 on February 10, 2018.
More about GlobalLogic Embedded Career Day #2: https://www.globallogic.com/ua/events/globallogic-kyiv-embedded-career-day-2-materials
Kernel Recipes 2018 - New GPIO interface for linux user space - Bartosz Golas... (Anne Nicolas)
Since Linux 4.8 the GPIO sysfs interface is deprecated. Due to its many drawbacks and bad design decisions, a new user space interface has been implemented in the form of the GPIO character device, which is now the preferred method of interacting with GPIOs that can’t otherwise be serviced by a kernel driver. The character device brings many interesting new features, such as polling for line events, finding GPIO chips and lines by name, changing and reading the values of multiple lines with a single ioctl (one context switch) and many more. In this presentation Bartosz will showcase the new features of the GPIO UAPI, discuss the current state of libgpiod (a C library, a set of user space tools, and C++ and Python bindings for using the character device) and tell you why it’s beneficial to switch to the new interface.
Linux Kernel Platform Development: Challenges and Insights (GlobalLogic Ukraine)
This presentation is about the main tasks that Linux kernel platform engineers take care of. The talk includes real-life cases which help explain the role of these specialists and may be useful to those considering such a change in their careers.
The talk was delivered by Sam Protsenko (Software Engineer, Consultant, GlobalLogic) at GlobalLogic Embedded Career Day #2 on February 10, 2018.
More about GlobalLogic Embedded Career Day #2: https://www.globallogic.com/ua/events/globallogic-kyiv-embedded-career-day-2-materials
Kernel Recipes 2018 - KernelShark 1.0; What's new and what's coming - Steven ... (Anne Nicolas)
Ftrace is the official tracer of the Linux kernel. It was added in 2008, and in 2009 came trace-cmd, a command line tool that makes interaction with ftrace easier. Shortly after that, KernelShark was created as a GUI for the trace-cmd interface. But as KernelShark and trace-cmd were mostly side projects, they did not get as much activity as they deserved. trace-cmd was updated more often, but KernelShark suffered from bit rot for some time. All that has changed recently, as VMware has active developers working on it.
KernelShark has been completely rewritten from scratch and version 1.0 is due to be released in August of 2018 (has already been released as of this talk). This talk will discuss what changed, how to use the new tool and what is coming in the future.
Greybus is the name for a new application layer protocol on top of Unipro that controls the Ara Phone from Google. This protocol turns a phone into a modular device, allowing any part of the system to be hotplugged while the phone is running.
This talk will describe what this protocol is, why it was designed, and give the basics for how it works. It will discuss how this is implemented in the Linux kernel, and how it easily bridges existing hardware like USB, I2C, GPIO and others with little to no changes needed to existing kernel drivers.
Greg KH, Linux Foundation
The presentation deals with the set of tools and features that Linux kernel developers can use for kernel debugging. Static analysis of kernel patches was also addressed during the talk. Special attention was given to access tools, tracing tools, and interactive debugging tools, namely DebugFS, ftrace, and GDB.
This presentation by Aleksandr Bulyshchenko (Software Engineer, Consultant, GlobalLogic Kharkiv) was delivered at GlobalLogic Kharkiv Embedded TechTalk #1 on March 13, 2018.
Bits of Advice for the VM Writer, by Cliff Click @ Curry On 2015 (curryon)
http://curry-on.org/2015/sessions/bits-of-advice-for-vm-writers.html
This is a talk about the choices one makes when building a Virtual Machine. Many of these choices aren’t even obviously being made when you first get the machine running - it’s not until years later when you look at your limitations that you even realize there was a choice. There’s the obvious Big VM (server, desktop, laptop, cell phone?) vs Small VM (embedded device, cell phone?) choice. But also: GC-or-no-GC. Portable or not (X86 vs ARM? vs Power/Sparc/tiny-DSP)? Multi-threaded or not? Run any “native” code - or only highly cooperative code? Run inside a pre-emptive multi-tasking OS? Or bare metal? Interpret bytecodes/p-codes vs dumb template-JIT vs Multi-tier-highly-optimizing-JIT? The set of choices goes on and on.
Most of these choices interact in Bad Ways… and usually the interactions are not obvious until long after the design decisions are made and locked in. And worse: most of the choices have to be made from the start, when you don’t really know the answers. Coding for yourself & your PhD advisor? Coding for a fortune-1000 company? Coding for the Internet-Scale-Masses? All different scenarios, with radically different goals. While the talk is based on my experience with the HotSpot Java VM, the bits of advice are only loosely tied to Java, and can equally be applied to a host of other (VM) hosted languages.
Bio:
Cliff Click is the CTO and Co-Founder of H2O, makers of H2O, the open source math and machine learning engine for Big Data. Cliff wrote his first compiler when he was 15 (Pascal to TRS Z-80!), although Cliff’s most famous compiler is the HotSpot Server Compiler (the Sea of Nodes IR). Cliff helped Azul Systems build an 864 core pure-Java mainframe that keeps GC pauses on 500Gb heaps to under 10ms, and worked on all aspects of that JVM. Before that he worked on HotSpot at Sun Microsystems, and was at least partially responsible for bringing Java into the mainstream. Cliff is invited to speak regularly at industry and academic conferences and has published many papers about HotSpot technology. He holds a PhD in Computer Science from Rice University and about 20 patents.
Experience shows that many developers working on the Xen/Linux kernel rely on only a small set of debugging tools. Often these are sufficient for everyday work. However, when an unusual problem arises that cannot easily be debugged with familiar tools, developers sometimes try to reinvent the wheel. The goal of this session is to present a wide range of debugging tools, from the simplest to the most feature-rich, in the context of Xen/Linux kernel debugging. It will describe the pros and cons of printk (serial, debug console, etc.), gdb, gdbsx, kgdb, QEMU, kdump and others. Additionally, there will be some information about possible new solutions and current kexec/kdump developments for Xen.
Kernel Recipes 2015: Speed up your kernel development cycle with QEMU (Anne Nicolas)
Kernel development is often associated with rebooting crashed machines, debugging over serial consoles, and an unwieldy development cycle. Developers know that short development cycles are incredibly important for programmer productivity.
The QEMU machine emulator and virtualizer offers a way to test kernels inside virtual machines without the risk of hanging the physical machine. It also makes kernel debugging easier than debugging between physical machines. Kernel development with QEMU allows kernel code changes to be tested within seconds.
This talk covers methods of compiling, testing, and debugging kernels using QEMU. Common approaches include building a custom initramfs or sharing the host file system with a virtual machine. Advanced use cases like cross-architecture development and device driver bringup are also possible using QEMU.
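As a rough illustration of the custom-initramfs approach, the sketch below assembles a QEMU command line for booting a freshly built kernel. The function name `qemu_command` and the example paths are assumptions; `-kernel`, `-initrd`, `-append`, `-nographic`, `-s` and `-S` are standard QEMU options, but the exact set used in the talk is not stated here.

```python
def qemu_command(kernel, initramfs, debug=False):
    """Build an argv list that boots `kernel` with `initramfs` under QEMU.

    Hypothetical helper; paths are supplied by the caller.
    """
    cmd = [
        "qemu-system-x86_64",
        "-kernel", kernel,            # boot the kernel image directly
        "-initrd", initramfs,         # custom initramfs as root file system
        "-append", "console=ttyS0",   # send kernel messages to the serial port
        "-nographic",                 # serial console on the terminal
    ]
    if debug:
        # -s opens a gdb stub on TCP port 1234; -S halts the CPU at startup
        # so a debugger can attach before the first instruction runs.
        cmd += ["-s", "-S"]
    return cmd


if __name__ == "__main__":
    # Example paths are placeholders, not taken from the talk.
    print(" ".join(qemu_command("arch/x86/boot/bzImage", "initramfs.cpio.gz")))
```

The list can be passed to `subprocess.run` on a machine with QEMU installed; with `debug=True`, gdb can then connect using `target remote :1234`.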
This presentation is aimed at anyone wishing to shorten their kernel development cycle and overcome some of the hurdles of developing low-level software.
Stefan Hajnoczi, Red Hat
The Linux Block Layer - Built for Fast Storage (Kernel TLV)
The arrival of flash storage introduced a radical change in the performance profiles of direct attached devices. At the time, it was obvious that the Linux I/O stack needed to be redesigned in order to support devices capable of millions of IOPS with extremely low latency.
In this talk we revisit the changes to the Linux block layer over the last decade or so that made it what it is today: a performant, scalable, robust and NUMA-aware subsystem. In addition, we cover the new NVMe over Fabrics support in Linux.
Sagi Grimberg
Sagi is Principal Architect and co-founder at LightBits Labs.
LAS16-403: GDB Linux Kernel Awareness
Speakers: Peter Griffin
Date: September 29, 2016
★ Session Description ★
The presentation will look at the ways in which GDB can be enhanced when debugging the Linux kernel to give it better knowledge of the underlying operating system to enable a better debugging experience. It will also provide a status of the current work being undertaken in this area by the ST landing team, a demo and potential future work.
★ Resources ★
Etherpad: pad.linaro.org/p/las16-403
Presentations & Videos: http://connect.linaro.org/resource/las16/las16-403/
★ Event Details ★
Linaro Connect Las Vegas 2016 – #LAS16
September 26-30, 2016
http://www.linaro.org
http://connect.linaro.org
Kernel Recipes 2015: Anatomy of an atomic KMS driver (Anne Nicolas)
The DRM and KMS APIs have won in the Linux graphics ecosystem. Long gone are the days when KMS meant only a handful of desktop graphics drivers. As a side effect, new problems have been uncovered, and API extensions are being designed to address advanced use cases. Atomic updates are the latest significant such extension.
While the userspace API extension is simple, a lot of work went under the hood and the in-kernel KMS helpers went through major changes that are not trivial to implement in drivers. This talk will present KMS atomic updates and explain how to update KMS drivers to take advantage of the new API, using the Renesas rcar-du-drm driver as an example.
Laurent Pinchart, Ideas on Board
Compromising Linux Virtual Machines with Debugging Mechanisms (Russell Sanford)
This presentation covers using VMware's GDB debugging protocol to invasively inject commands into a Linux x64 target. Kernel APIs are detected automatically in order to locate the _vmalloc and call_usermodehelper* functions across all 3.x and 4.x kernels.
Kernel Recipes 2015: Kernel packet capture technologies (Anne Nicolas)
Sniffing through the ages
Capturing packets off the wire and handing them to analysis software seems at first sight a simple task. But one must not forget that on current networks this can mean capturing 30M packets per second. The objective of this talk is to show what methods and techniques have been implemented in Linux and how they have evolved over time.
The talk will cover AF_PACKET capture as well as PF_RING, dpdk and netmap. It will try to show how the evolution of hardware and software has influenced the design of these technologies. On the software side, a special focus will be placed on the Suricata IDS, which implements most of these capture methods.
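To put the 30M packets per second figure in perspective, the per-packet time budget can be worked out directly. This is a back-of-the-envelope sketch: the function name `per_packet_budget` is made up, and the 3 GHz clock rate is an assumption, not a number from the talk.

```python
def per_packet_budget(packets_per_second, cpu_hz):
    """Return (nanoseconds, CPU cycles) available per packet on one core."""
    ns = 1e9 / packets_per_second
    cycles = cpu_hz / packets_per_second
    return ns, cycles


if __name__ == "__main__":
    # 30M packets/s from the abstract; 3 GHz is an assumed clock rate.
    ns, cycles = per_packet_budget(30e6, 3e9)
    print(f"{ns:.1f} ns and {cycles:.0f} cycles per packet")
```

Under these assumptions a single core has roughly 33 ns, or about 100 cycles, per packet, which gives a sense of why per-packet system calls and copies become a bottleneck at such rates.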
Eric Leblond, Stamus Networks
Kernel Recipes 2015 - Porting Linux to a new processor architecture (Anne Nicolas)
Getting the Linux kernel running on a new processor architecture is a difficult process. Worse still, there is not much documentation available describing the porting process.
After spending countless hours becoming almost fluent in many of the supported architectures, I discovered that a well-defined skeleton shared by the majority of ports exists. Such a skeleton can logically be split into two parts that intersect a great deal.
The first part is the boot code, meaning the architecture-specific code that is executed from the moment the kernel takes over from the bootloader until init is finally executed. The second part concerns the architecture-specific code that is regularly executed once the booting phase has been completed and the kernel is running normally. This second part includes starting new threads, dealing with hardware interrupts or software exceptions, copying data from/to user applications, serving system calls, and so on.
In this talk I will provide an overview of the procedure, or at least one possible procedure, that can be followed when porting the Linux kernel to a new processor architecture.
Joël Porquet – Joël was a post-doc at Pierre and Marie Curie University (UPMC) where he ported Linux to TSAR, an academic processor. He is now looking for new adventures.
Kernel Recipes 2015 - So you want to write a Linux driver framework (Anne Nicolas)
Writing a new driver framework in Linux is hard. There are many pitfalls along the way; this talk hopes to point out some of those pitfalls and hard lessons learned through examples, advice and humorous anecdotes, in the hope that it will aid those adventurous enough to take on the task of writing a new driver framework. The scope of the talk includes internal framework design as well as the external API design exposed to drivers and consumers of the framework. This presentation draws directly on Michael Turquette’s experience authoring the Common Clock Framework and maintaining that code for the last four years.
Additionally Mike has solicited tips and advice from other subsystem maintainers, for a well-rounded overview. Be prepared to learn some winning design patterns and hear some embarrassing stories of framework design gone wrong.
Mike Turquette, BayLibre
Hse alert 2013 35 two fatalities as a result of a failure of a bonnet-to...Alan Bassett
Process Safety - On the 19th of November 2013, 3pm, a steam explosion occurred at the Antwerp Refinery, Belgium. The accident involved the boiler feed water system (operating pressure 70 bar (1015 psi), operating temperature 280 °C (536°F)) of unit 72 (continuous catalytic reforming, CCR).
Two contractors were reinjecting sealant on a leak box on the bonnet-to-body flange of a 16 inch motorized valve, when suddenly the studs of the flange failed. The bonnet was launched into the process area, and landed at a 25 m distance.
The investigation that is looking into the causes of the accident is ongoing.
This powerpoint is from my presentation at the KDLA KY Bookmobile & Outreach Services conference held in Lexington, KY 8/31/09 - 9/01/09. I hope it can help other libraries with issues they may have, and I\'ll be happy to answer any questions you have as well!
Design innovation: 10 ways to improve the learner experienceBrightwave Group
http://www.brightwave.co.uk/emosaic
However, innovation is not always about major technological advancements that disrupt the status quo.
James Cory-Wright, Head of Learning Design at Brightwave, demonstrates how it's often the butterfly moments, the little design innovations, that make the biggest difference. 10 design innovations that have built unstoppable momentum.
Ice Age melting down: Intel features considered usefull!Peter Hlavaty
Decades history of kernel exploitation, however still most used techniques are such as ROP. Software based approaches comes finally challenge this technique, one more successful than the others. Those approaches usually trying to solve far more than ROP only problem, and need to handle not only security but almost more importantly performance issues. Another common attacker vector for redirecting control flow is stack what comes from design of today’s architectures, and once again some software approaches lately tackling this as well. Although this software based methods are piece of nice work and effective to big extent, new game changing approach seems coming to the light. Methodology closing this attack vector coming right from hardware - intel. We will compare this way to its software alternatives, how one interleaving another and how they can benefit from each other to challenge attacker by breaking his most fundamental technologies. However same time we go further, to challenge those approaches and show that even with those technologies in place attackers is not yet in the corner.
You didnt see it’s coming? "Dawn of hardened Windows Kernel" Peter Hlavaty
Past few years our team was focusing on different operating systems including Microsoft windows kernel. Honestly our first pwn at Windows kernel was not that challenging. Number of available targets with friendly environment for straightforward pwn, from user up to reliable kernel code execution.
However, step by step, security policies continue to evolve, and it becomes more troublesome to choose ideal attack surface from various sandboxes. In addition, what steps to follow for digging security holes is highly dependent upon the chosen target. In general, a few common strategies are available for researchers to choose: e.g choose “unknown” one which hasn’t been researched before; Select well fuzzed or well audited one, or research on kernel module internals to find “hidden” attack surfaces which are not explicitly interconnected. In the first part of the talk we introduce our methodology of selecting, alongside with cost of tricks around to choose seemingly banned targets, illustrated by notable examples.
After getting hands on potential bug available from targeted sandbox, it is time for Microsoft windows taking hardening efforts to put attacker into corner. Strong mitigations are being introduced more frequently than ever, with promising direction which cuts lots of attack surface off, and a several exploitation techniques being killed. We will show difficulties of developing universal exploitation techniques, and demonstrate needed technical level depending on code quality of target. We will examine how different it becomes with era of Redstone and following versions even with those techniques and good vulnerability in hand. How it changed attacker landscape and how it will (and will not) kill those techniques and applications. However will it really change the game or not?
Stealing from Thieves: Breaking IonCUBE VM to RE Exploit KitsМохачёк Сахер
Stealing from Thieves: Breaking IonCUBE VM to RE Exploit Kits is a walkthrough through breaking IonCUBE's DRM protection scheme to allow researches in decompiling and analyzing Exploit Kits protected with such protection.
The JVM memory model describes how threads in the Java eco-system interact through memory. While the memory model impact on developing for the JVM may not be obvious, it is the cause for certain number of "anomalies" that are, well, by design.
In this presentation we will explore the aspects of the memory model, including things like reordering of instructions, volatile members, monitors, atomics and JIT.
Summer training embedded system and its scopeArshit Rai
CETPA INFOTECH PVT LTD is one of the IT education and training service provider brands of India that is preferably working in 3 most important domains. It includes IT Training services, software and embedded product development and consulting services.
http://www.cetpainfotech.com
BSides LV 2016 - Beyond the tip of the iceberg - fuzzing binary protocols for...Alexandre Moneger
This presentation shows that code coverage guided fuzzing is possible in the context of network daemon fuzzing.
Some fuzzers are blackbox while others are protocol aware. Even ones which are made protocol aware, fuzzer writers typically model the protocol specification and implement packet awareness logic in the fuzzer. Unfortunately, just because the fuzzer is protocol aware, it does not guarantee that sufficient code paths have been reached.
The presentation deals with specific scenarios where the target protocol is completely unknown (proprietary) and no source code or protocol specs are accessible. The tool developed builds a feedback loop between the client and the server components using the concept of "gate functions". A gate function triggers monitoring. The pintool component tracks the binary code coverage for all the functions untill it reaches an exit gate. By instrumenting such gated functions, the tool is able to measure code coverage during packet processing.
Practical IoT Exploitation (DEFCON23 IoTVillage) - Lyon YangLyon Yang
This is a light training/presentation talk.
My name is Lyon Yang and I am an IoT hacker. I live in sunny Singapore where IoT is rapidly being deployed – in production. This walkthrough will aim to shed light on the subject of IoT, from finding vulnerabilities in IoT devices to getting shiny hash prompts.
Our journey starts with a holistic view of IoT security, the issues faced by IoT devices and the common mistakes made by IoT developers. Things will then get technical as we progress into a both ARM and MIPS exploitation, followed by a ‘hack-along-with-us’ workshop where you will be exploiting a commonly found IoT daemon. If you are new to IoT or a seasoned professional you will likely learn something new in this workshop.
https://www.iotvillage.org/#schedule
Summer training embedded system and its scopeArshit Rai
CETPA INFOTECH PVT LTD is one of the IT education and training service provider brands of India that is preferably working in 3 most important domains. It includes IT Training services, software and embedded product development and consulting services.
Epistemic Interaction - tuning interfaces to provide information for AI supportAlan Dix
Paper presented at SYNERGY workshop at AVI 2024, Genoa, Italy. 3rd June 2024
https://alandix.com/academic/papers/synergy2024-epistemic/
As machine learning integrates deeper into human-computer interactions, the concept of epistemic interaction emerges, aiming to refine these interactions to enhance system adaptability. This approach encourages minor, intentional adjustments in user behaviour to enrich the data available for system learning. This paper introduces epistemic interaction within the context of human-system communication, illustrating how deliberate interaction design can improve system understanding and adaptation. Through concrete examples, we demonstrate the potential of epistemic interaction to significantly advance human-computer interaction by leveraging intuitive human communication strategies to inform system design and functionality, offering a novel pathway for enriching user-system engagements.
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Albert Hoitingh
In this session I delve into the encryption technology used in Microsoft 365 and Microsoft Purview. Including the concepts of Customer Key and Double Key Encryption.
Full-RAG: A modern architecture for hyper-personalizationZilliz
Mike Del Balso, CEO & Co-Founder at Tecton, presents "Full RAG," a novel approach to AI recommendation systems, aiming to push beyond the limitations of traditional models through a deep integration of contextual insights and real-time data, leveraging the Retrieval-Augmented Generation architecture. This talk will outline Full RAG's potential to significantly enhance personalization, address engineering challenges such as data management and model training, and introduce data enrichment with reranking as a key solution. Attendees will gain crucial insights into the importance of hyperpersonalization in AI, the capabilities of Full RAG for advanced personalization, and strategies for managing complex data integrations for deploying cutting-edge AI solutions.
Threats to mobile devices are more prevalent and increasing in scope and complexity. Users of mobile devices desire to take full advantage of the features
available on those devices, but many of the features provide convenience and capability but sacrifice security. This best practices guide outlines steps the users can take to better protect personal devices and information.
In his public lecture, Christian Timmerer provides insights into the fascinating history of video streaming, starting from its humble beginnings before YouTube to the groundbreaking technologies that now dominate platforms like Netflix and ORF ON. Timmerer also presents provocative contributions of his own that have significantly influenced the industry. He concludes by looking at future challenges and invites the audience to join in a discussion.
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...James Anderson
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. Constant focus on speed to release software to market, along with the traditional slow and manual security checks has caused gaps in continuous security as an important piece in the software supply chain. Today organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their applications supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with PASSION for technology and making things work along with a knack for helping others understand how things work. He comes with around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations in CI/CD and application security integrated in software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
Sudheer Mechineni, Head of Application Frameworks, Standard Chartered Bank
Discover how Standard Chartered Bank harnessed the power of Neo4j to transform complex data access challenges into a dynamic, scalable graph database solution. This keynote will cover their journey from initial adoption to deploying a fully automated, enterprise-grade causal cluster, highlighting key strategies for modelling organisational changes and ensuring robust disaster recovery. Learn how these innovations have not only enhanced Standard Chartered Bank’s data infrastructure but also positioned them as pioneers in the banking sector’s adoption of graph technology.
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!SOFTTECHHUB
As the digital landscape continually evolves, operating systems play a critical role in shaping user experiences and productivity. The launch of Nitrux Linux 3.5.0 marks a significant milestone, offering a robust alternative to traditional systems such as Windows 11. This article delves into the essence of Nitrux Linux 3.5.0, exploring its unique features, advantages, and how it stands as a compelling choice for both casual users and tech enthusiasts.
In the rapidly evolving landscape of technologies, XML continues to play a vital role in structuring, storing, and transporting data across diverse systems. The recent advancements in artificial intelligence (AI) present new methodologies for enhancing XML development workflows, introducing efficiency, automation, and intelligent capabilities. This presentation will outline the scope and perspective of utilizing AI in XML development. The potential benefits and the possible pitfalls will be highlighted, providing a balanced view of the subject.
We will explore the capabilities of AI in understanding XML markup languages and autonomously creating structured XML content. Additionally, we will examine the capacity of AI to enrich plain text with appropriate XML markup. Practical examples and methodological guidelines will be provided to elucidate how AI can be effectively prompted to interpret and generate accurate XML markup.
Further emphasis will be placed on the role of AI in developing XSLT, or schemas such as XSD and Schematron. We will address the techniques and strategies adopted to create prompts for generating code, explaining code, or refactoring the code, and the results achieved.
The discussion will extend to how AI can be used to transform XML content. In particular, the focus will be on the use of AI XPath extension functions in XSLT, Schematron, Schematron Quick Fixes, or for XML content refactoring.
The presentation aims to deliver a comprehensive overview of AI usage in XML development, providing attendees with the necessary knowledge to make informed decisions. Whether you’re at the early stages of adopting AI or considering integrating it in advanced XML development, this presentation will cover all levels of expertise.
By highlighting the potential advantages and challenges of integrating AI with XML development tools and languages, the presentation seeks to inspire thoughtful conversation around the future of XML development. We’ll not only delve into the technical aspects of AI-powered XML development but also discuss practical implications and possible future directions.
Securing your Kubernetes cluster_ a step-by-step guide to success !KatiaHIMEUR1
Today, after several years of existence, an extremely active community and an ultra-dynamic ecosystem, Kubernetes has established itself as the de facto standard in container orchestration. Thanks to a wide range of managed services, it has never been so easy to set up a ready-to-use Kubernetes cluster.
However, this ease of use means that the subject of security in Kubernetes is often left for later, or even neglected. This exposes companies to significant risks.
In this talk, I'll show you step-by-step how to secure your Kubernetes cluster for greater peace of mind and reliability.
A tale of scale & speed: How the US Navy is enabling software delivery from l...sonjaschweigert1
Rapid and secure feature delivery is a goal across every application team and every branch of the DoD. The Navy’s DevSecOps platform, Party Barge, has achieved:
- Reduction in onboarding time from 5 weeks to 1 day
- Improved developer experience and productivity through actionable findings and reduction of false positives
- Maintenance of superior security standards and inherent policy enforcement with Authorization to Operate (ATO)
Development teams can ship efficiently and ensure applications are cyber ready for Navy Authorizing Officials (AOs). In this webinar, Sigma Defense and Anchore will give attendees a look behind the scenes and demo secure pipeline automation and security artifacts that speed up application ATO and time to production.
We will cover:
- How to remove silos in DevSecOps
- How to build efficient development pipeline roles and component templates
- How to deliver security artifacts that matter for ATO’s (SBOMs, vulnerability reports, and policy evidence)
- How to streamline operations with automated policy checks on container images
20 Comprehensive Checklist of Designing and Developing a WebsitePixlogix Infotech
Dive into the world of Website Designing and Developing with Pixlogix! Looking to create a stunning online presence? Look no further! Our comprehensive checklist covers everything you need to know to craft a website that stands out. From user-friendly design to seamless functionality, we've got you covered. Don't miss out on this invaluable resource! Check out our checklist now at Pixlogix and start your journey towards a captivating online presence today.
2. Introduction
• I was working to add SMP & 64bit support to
a BSD-based embedded OS at
The target device was MIPS64
• There was no complete *BSD/MIPS SMP implementation at that time, so I implemented one
• The implementation was proprietary, but I wanted to contribute something to the BSD community
• I decided to implement SMP from scratch, and looked for a suitable MIPS SMP machine
4. Finding MIPS/SMP
Machine(1)
• Broadcom SiByte BCM1250 looks nice -
2core 1GHz MIPS64, DDR DRAM, GbE,
PCI, HT
• Cisco 3845 integrated this processor
• How much is it?
from $14,200!
Totally unacceptable!!!
6. Finding MIPS/SMP
Machine(2)
• Some antique SGI machines are available on eBay
• These are basically very cheap
• I realized the SGI Octane 2 has 2 cores and is already supported by OpenBSD
Just $33!
Do you remember how much it used to cost?
7. This is my Octane2
Processors: MIPS R12000 400MHz x2
Memory: 1GB SDRAM
Graphics: 3D Graphics Card
Sound: Integrated Digital Audio
Storage: 35GB SCSI HDD
Ethernet: 100BASE-T
12. Become an OpenBSD developer
• I started working on the Octane2 in Apr 2009 and wrote about it on my blog
• Miod Vallat discovered it and suggested my code be merged into the OpenBSD main tree
• I became an OpenBSD developer in Sep 2009, started merging the code, joined a hackathon, and worked with Miod and Joel Sing
13. NOW IT WORKS!!!!
• Merged into OpenBSD main tree
• You can try it now!
• It seems stable -
I ran “make build” again and again for over a day, and it kept working
17. What did we need to do for the SMP work
There was a lot of work...
• Support multiple cpu_info and processor related macros
• Move per-processor data into cpu_info
• Implement lock primitives
• Acquiring giant lock
• Implement atomic operations
• Spin up secondary processors
• Secondary processor entry point
• IPI: Inter-Processor Interrupt
• Per-processor ASID management
• TLB shootdown
• Lazy FPU handling
• Per-processor clock
18. Describe it more simply
We can classify the tasks into three kinds of problems:
• Restructuring per-processor information
• Implementing and using lock/atomic primitives
• Hardware related problems
19. Restructuring per-processor information
• In the original sgi port, some processor-related information is allocated only once, as a single instance
• Simple case:
Information is stored in a global variable
Move it into “cpu_info”, the per-processor structure
• Complex case:
In pmap, we need to maintain some information per-processor * per-process
20. Simple case - clock.c: original code
// defined as global variables
u_int32_t cpu_counter_last;
u_int32_t cpu_counter_interval;
u_int32_t pendingticks;

uint32_t clock_int5(uint32_t mask, struct trap_frame *tf)
...
    clkdiff = cp0_get_count() - cpu_counter_last;
    while (clkdiff >= cpu_counter_interval) {
        cpu_counter_last += cpu_counter_interval;
        clkdiff = cp0_get_count() - cpu_counter_last;
        pendingticks++;
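The modified version of this code is not part of the transcript. As a minimal sketch of the "simple case", the three globals could move into the per-processor structure roughly as follows. Only `ci_pendingticks` appears elsewhere in the slides; the other field names, `MAXCPUS`, `cpu_info_store` and `clock_account()` are hypothetical, and the counter value is passed in as a parameter instead of being read from the CP0 Count register.

```c
#include <assert.h>
#include <stdint.h>

#define MAXCPUS 2 /* assumption: two processors, as on the Octane 2 */

/* Per-processor state that used to live in global variables. */
struct cpu_info {
    uint32_t ci_cpu_counter_last;     /* hypothetical field name */
    uint32_t ci_cpu_counter_interval; /* hypothetical field name */
    uint32_t ci_pendingticks;         /* name used on the later slides */
};

struct cpu_info cpu_info_store[MAXCPUS]; /* hypothetical storage */

/*
 * Account for elapsed clock intervals on one processor.
 * `count` stands in for the value read from the CP0 Count register.
 * Returns the number of ticks now pending on that processor.
 */
uint32_t
clock_account(struct cpu_info *ci, uint32_t count)
{
    uint32_t clkdiff;

    assert(ci->ci_cpu_counter_interval != 0); /* otherwise the loop never ends */

    clkdiff = count - ci->ci_cpu_counter_last;
    while (clkdiff >= ci->ci_cpu_counter_interval) {
        ci->ci_cpu_counter_last += ci->ci_cpu_counter_interval;
        clkdiff = count - ci->ci_cpu_counter_last;
        ci->ci_pendingticks++;
    }
    return ci->ci_pendingticks;
}
```

Each processor passes its own `struct cpu_info`, so ticks accumulated on one CPU never touch the counters of the other.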
22. Complex case: pmap
• MIPS TLB entries are tagged with an 8bit process id called ASID, used to improve performance
The MMU skips TLB entries belonging to other processes on lookup
We don’t need to flush the TLB on every context switch
• We need to maintain the process:ASID assignment
Because the ASID space is smaller than the PID space, we need to rotate ASIDs
• The assignment should be kept across context switches
• We maintain ASIDs individually per-processor
• What we need is:
ASID information per-processor * per-process
23. Complex case - pmap.c: original and modified code
// original
uint pmap_alloc_tlbpid(struct proc *p)
...
    tlbpid_cnt = id + 1;
    pmap->pm_tlbpid = id;
// modified
uint pmap_alloc_tlbpid(struct proc *p)
...
    tlbpid_cnt[cpuid] = id + 1;
    pmap->pm_tlbpid[cpuid] = id;
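To make "per-processor * per-process" concrete, here is a small stand-alone sketch of ASID allocation with one counter per CPU and one slot per CPU in each pmap. It is not the OpenBSD code: the structure names, the generation counter used to recycle the 8bit ASID space, and the choice to reserve ASID 0 are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define MAXCPUS   2    /* assumption: two processors */
#define NUM_ASIDS 256  /* an 8bit ASID gives 256 values */

/* One ASID assignment, valid only on one CPU. */
struct asid_slot {
    uint32_t asid; /* ASID assigned on this CPU */
    uint32_t gen;  /* generation it was assigned in; 0 means none */
};

/* Hypothetical per-process structure: one slot per processor. */
struct pmap_sketch {
    struct asid_slot pm_asid[MAXCPUS];
};

/* Hypothetical per-processor allocator state. */
struct cpu_asid_state {
    uint32_t next; /* next ASID to hand out on this CPU */
    uint32_t gen;  /* current generation on this CPU */
};

struct cpu_asid_state asid_state[MAXCPUS] = { { 1, 1 }, { 1, 1 } };

/* Return the ASID of `pm` on processor `cpuid`, allocating one if needed. */
uint32_t
asid_alloc(struct pmap_sketch *pm, int cpuid)
{
    struct cpu_asid_state *cs;
    struct asid_slot *slot;

    assert(cpuid >= 0 && cpuid < MAXCPUS);
    cs = &asid_state[cpuid];
    slot = &pm->pm_asid[cpuid];

    if (slot->gen == cs->gen)
        return slot->asid; /* assignment is still valid on this CPU */

    if (cs->next >= NUM_ASIDS) {
        /* Out of ASIDs: start a new generation. A real kernel would
         * also flush this CPU's TLB at this point. */
        cs->next = 1; /* assumption: ASID 0 is kept reserved */
        cs->gen++;
    }
    slot->asid = cs->next++;
    slot->gen = cs->gen;
    return slot->asid;
}
```

The same process can end up with different ASIDs on different processors, which is why a single `pm_tlbpid` field is no longer enough.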
24. Implement and use lock/atomic primitives
• Needed to implement
  • lock primitives: mutex, mp_lock
  • atomic primitives: CAS, 64bit add, etc.
• Acquiring the giant lock prior to entering the kernel context
  • hardware interrupts
  • software interrupts
  • trap()
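As a rough illustration of how a lock primitive can be built on CAS, the sketch below uses C11 `<stdatomic.h>` rather than the MIPS instructions the kernel would use. The type and function names are hypothetical and this is not the OpenBSD mutex or mp_lock implementation.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define SPIN_UNOWNED (-1) /* hypothetical marker for "no owner" */

struct spinlock_sketch {
    atomic_int owner; /* SPIN_UNOWNED when free, else the owning CPU id */
};

/* One CAS attempt: succeeds only if the lock is currently unowned. */
bool
spin_try_lock(struct spinlock_sketch *l, int cpuid)
{
    int expected = SPIN_UNOWNED;

    return atomic_compare_exchange_strong_explicit(&l->owner, &expected,
        cpuid, memory_order_acquire, memory_order_relaxed);
}

/* Spin until the CAS succeeds. */
void
spin_lock(struct spinlock_sketch *l, int cpuid)
{
    while (!spin_try_lock(l, cpuid))
        ; /* busy-wait */
}

/* Release the lock; the release ordering publishes the protected data. */
void
spin_unlock(struct spinlock_sketch *l)
{
    assert(atomic_load(&l->owner) != SPIN_UNOWNED);
    atomic_store_explicit(&l->owner, SPIN_UNOWNED, memory_order_release);
}
```

Storing the owner's CPU id in the lock word, instead of a plain 0/1 flag, costs nothing extra and helps when debugging.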
26. Acquiring giant lock on clock interrupt handler
uint32_t clock_int5(uint32_t mask, struct trap_frame *tf)
...
    if (tf->ipl < IPL_CLOCK) {
#ifdef MULTIPROCESSOR
        __mp_lock(&kernel_lock);
#endif
        while (ci->ci_pendingticks) {
            clk_count.ec_count++;
            hardclock(tf);
            ci->ci_pendingticks--;
        }
#ifdef MULTIPROCESSOR
        __mp_unlock(&kernel_lock);
#endif
Actually, it causes a bug... described later
27. Hardware related problems
• Spin up secondary processor
• Keeping TLB consistency by software
• Cache coherency
28. Spin up secondary processor
• We need to launch the secondary processor
  • Access a hardware register on the controller device to power on the processor
• The secondary processor needs its own bootstrap code
  • It has to be similar to the primary processor’s
  • But it doesn’t need kernel, memory, and device initialization
  • Because the primary processor has already done it
29. Keeping TLB consistency by software
• The MIPS TLB has no mechanism to keep itself consistent, so we need to do it in software
Just like other typical architectures
• Invalidate/update the other processor’s TLB using TLB shootdown
• It is implemented with IPI (Inter-Processor Interrupt)
30. Cache coherency
• MIPS R10000/R12000 processors have full cache coherency
• We don’t have to care about it on Octane
• But some other processors don’t have full cache coherency
31. Ideas implementing SMP
• We faced a number of issues while implementing SMP
  • Fighting against deadlock
  • Dynamic memory allocation without using virtual address
  • Reduce frequency of TLB shootdown
32. Fighting against deadlock
• It’s hard to find the cause of deadlocks because both processors run concurrently
• This causes timing bugs - the conditions depend on timing
• We need to be able to determine what happened on both processors at that time
• There are tools for debugging it
33. JTAG ICE
• Very useful for debugging
• We can get any data for debugging at the desired time, even after the kernel hangs
• I used it when I was implementing SMP for the embedded OS
• Not for Octane, there’s no way to connect it
34. ddb
• The OpenBSD kernel has an in-kernel debugger, named ddb
• We can get data similar to JTAG ICE, but the kernel needs to be alive - because ddb is part of the kernel
• Missing features:
We hadn’t implemented “machine ddbcpu<#>” - which is the processor switching function of ddb
Without this, we can only debug the processor which invoked ddb on a breakpoint
• Not always useful
35. printf()
• Most popular kernel debugging tool
• Just write printf(message) in your code, easy to use ;)
• Unfortunately, it has some problems
  • printf() wastes a lot of cycles and changes the timing between processors
  We may miss a timing bug because of it
  • Some points in the code are printf() unsafe and cause a kernel hang
• We use it anyway
36. Divide printf output between two serial ports
• There are two serial ports and two processors
• If we have lots of debug prints, it’s hard to tell which lines belong to which processor
• I implemented a dirty hack which outputs strings directly to the secondary serial port, named combprintf()
• Rewrote the debug prints so that
the primary processor outputs to the primary serial port,
the secondary processor outputs to the secondary serial port
37. What do we need to print for debugging?
• To know roughly where the kernel is running
  • put debug prints at every point the kernel may run through
  • dump all system calls using SYSCALL_DEBUG
• How can we determine a deadlock point?
38. Determine a deadlock point
• Deadlocks occur on spinlocks
• A spinlock loops until a condition becomes true, but in a deadlock that condition never does
• To at least know which lock primitive we are stuck on, we need to stop the endless loop by implementing a timeout counter and printing a debug message
40. Adding timeout counter into mutex
void mtx_enter(struct mutex *mtx)
...
    for (;;) {
        if (mtx->mtx_wantipl != IPL_NONE)
            s = splraise(mtx->mtx_wantipl);
        if (try_lock(mtx)) {
            if (mtx->mtx_wantipl != IPL_NONE)
                mtx->mtx_oldipl = s;
            mtx->mtx_owner = curcpu();
            return;
        }
        if (mtx->mtx_wantipl != IPL_NONE)
            splx(s);
        /* give up after MTX_TIMEOUT failed attempts instead of spinning forever */
        if (++i > MTX_TIMEOUT)
            panic("mtx deadlocked\n");
    }
But, this is not enough
41. Why it’s not enough
• CPU A acquires lock A, CPU B acquires lock B
• CPU A then spins until lock B is released, and CPU B spins until lock A is released
• With the timeout counter we can break here, in the spin loops
• But we want to know which function acquired each lock
44. Remember who acquired it
void mtx_enter(struct mutex *mtx)
...
    for (;;) {
        if (mtx->mtx_wantipl != IPL_NONE)
            s = splraise(mtx->mtx_wantipl);
        if (try_lock(mtx)) {
            if (mtx->mtx_wantipl != IPL_NONE)
                mtx->mtx_oldipl = s;
            mtx->mtx_owner = curcpu();
            /* remember the caller that took the lock */
            mtx->mtx_ra =
                __builtin_return_address(0);
            return;
        }
        if (mtx->mtx_wantipl != IPL_NONE)
            splx(s);
        if (++i > MTX_TIMEOUT)
            panic("mtx deadlocked ra:%p\n",
                mtx->mtx_ra);
    }
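The same idea can be tried outside the kernel. The sketch below is a user-space approximation under stated assumptions: it uses a C11 `atomic_flag` instead of the kernel's `try_lock()`, returns an error instead of calling `panic()`, and the names `dbg_lock`, `dbg_lock_enter()`, `dbg_lock_leave()` and `SPIN_TIMEOUT` are hypothetical. `__builtin_return_address()` is a GCC/Clang builtin.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define SPIN_TIMEOUT 100000UL /* assumption: arbitrary iteration limit */

struct dbg_lock {
    atomic_flag held;
    void *ra; /* return address of the caller that took the lock */
};

/*
 * Try to take the lock, giving up after SPIN_TIMEOUT attempts.
 * Returns 0 on success. On timeout returns -1 and stores the recorded
 * return address of the current holder in *holder_ra, so the caller
 * can report where the lock was taken.
 */
__attribute__((noinline)) int
dbg_lock_enter(struct dbg_lock *l, void **holder_ra)
{
    unsigned long i;

    assert(holder_ra != NULL);

    for (i = 0; i < SPIN_TIMEOUT; i++) {
        if (!atomic_flag_test_and_set_explicit(&l->held,
            memory_order_acquire)) {
            l->ra = __builtin_return_address(0);
            return 0;
        }
    }
    *holder_ra = l->ra; /* racy read, acceptable for a debug report */
    return -1;
}

void
dbg_lock_leave(struct dbg_lock *l)
{
    l->ra = NULL;
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}
```

The reported address points into the function that acquired the lock, which is the piece of information the plain timeout counter could not give.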
45. Interrupt blocks IPI
• CPU B takes an interrupt, disables interrupts, and waits until the lock is released
• CPU A takes a fault, acquires the lock, starts a TLB shootdown, sends an IPI, and waits for the ACK
• The IPI is blocked on CPU B because interrupts are disabled there, so neither CPU can proceed
• Solution: enable the IPI interrupt in interrupt handlers
• CPU B re-enables the interrupt, accepts the IPI, and sends the ACK for the rendezvous
51. splhigh() blocks IPI
• CPU B calls splhigh() and waits until the lock is released
• CPU A takes a fault, acquires the lock, starts a TLB shootdown, sends an IPI, and waits for the ACK
• The IPI is blocked on CPU B by splhigh()
• Solution: define a new interrupt priority level named IPL_IPI, higher than IPL_HIGH
• CPU B can then accept the IPI and send the ACK for the rendezvous
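Why a level above IPL_HIGH helps can be shown with a toy model of priority masking. The numeric values below are made up; only their ordering matters. `intr_deliverable()` is a hypothetical helper, not a kernel function.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical numeric values; only the ordering matters here. */
enum ipl_sketch {
    IPL_NONE  = 0,
    IPL_CLOCK = 5,
    IPL_HIGH  = 6,
    IPL_IPI   = 7 /* above IPL_HIGH, so splhigh() does not mask IPIs */
};

static_assert(IPL_IPI > IPL_HIGH, "IPIs must stay deliverable at IPL_HIGH");

/* An interrupt is taken only if its level is above the current level. */
bool
intr_deliverable(enum ipl_sketch current, enum ipl_sketch incoming)
{
    return incoming > current;
}
```

In this model a CPU spinning at IPL_HIGH still accepts an interrupt at IPL_IPI, so it can acknowledge the shootdown while it waits for the lock.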
57. Dynamic memory allocation without using virtual address
• To support N(>2) processors, the cpu_info structure and the bootstrap kernel stack for secondary processors should be allocated dynamically
• But we can’t use virtual addresses for them
  • The stack may be used before TLB initialization, which would make the processor fault
  • MIPS has a software-managed TLB, so the TLB must be maintained by software
  The TLB miss handler is the code that handles it
  This handler refers to cpu_info, which would cause a TLB miss loop
• To avoid these problems, we implemented a wrapper function that allocates memory dynamically, then gets the physical address and returns it
58. The wrapper function
vaddr_t smp_malloc(size_t size)
...
    if (size < PAGE_SIZE) {
        va = (vaddr_t)malloc(size, M_DEVBUF, M_NOWAIT);
        if (va == NULL)
            return NULL;
        error = pmap_extract(pmap_kernel(), va, &pa);
        if (error == FALSE)
            return NULL;
    } else {
        TAILQ_INIT(&mlist);
        error = uvm_pglistalloc(size, 0, -1L, 0, 0,
            &mlist, 1, UVM_PLA_NOWAIT);
        if (error)
            return NULL;
        m = TAILQ_FIRST(&mlist);
        pa = VM_PAGE_TO_PHYS(m);
    }
    return PHYS_TO_XKPHYS(pa, CCA_CACHED);
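The last line works because XKPHYS is an unmapped window of the MIPS64 address space: accesses through it do not go through the TLB. The sketch below shows how such an address is composed, following the MIPS64 layout (top two address bits 0b10, a 3 bit cache attribute in bits 61:59, the physical address below). The function names and the attribute value 3 are illustrative assumptions, not the definitions of `PHYS_TO_XKPHYS` or `CCA_CACHED` in the OpenBSD tree.

```c
#include <assert.h>
#include <stdint.h>

#define XKPHYS_BASE      0x8000000000000000ULL /* address bits 63:62 = 0b10 */
#define XKPHYS_CCA_SHIFT 59                    /* cache attribute, bits 61:59 */

/* Compose an XKPHYS address from a physical address and a cache attribute. */
uint64_t
phys_to_xkphys(uint64_t pa, uint64_t cca)
{
    assert(cca < 8);                         /* 3 bit field */
    assert(pa < (1ULL << XKPHYS_CCA_SHIFT)); /* must not spill into the CCA */
    return XKPHYS_BASE | (cca << XKPHYS_CCA_SHIFT) | pa;
}

/* Recover the physical address from an XKPHYS address. */
uint64_t
xkphys_to_phys(uint64_t va)
{
    return va & ((1ULL << XKPHYS_CCA_SHIFT) - 1);
}
```

For example, physical address 0x1000 with attribute 3 becomes 0x9800000000001000, and that pointer can be dereferenced before any TLB entry exists.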
59. Reduce frequency of TLB shootdown
• There is a condition under which we can skip TLB shootdown in invalidate/update:

              using the pagetable   not using
kernel mode   need                  need
user mode     need                  don’t need

• In user mode, if the shootee processor isn’t using the pagetable, we don’t need a shootdown; just changing the ASID assignment is enough
• In the reference pmap implementation, a TLB shootdown is performed unconditionally, even when it isn’t really needed
• We added this condition to reduce its frequency
61. CPU_INFO_FOREACH(cii, ci)
    if (cpuset_isset(&cpus_running, ci)) {
        unsigned int i = ci->ci_cpuid;
        unsigned int m = 1 << i;
        if (pmap->pm_asid[i].pma_asidgen !=
            pmap_asid_info[i].pma_asidgen)
            continue;
        else if (ci->ci_curpmap != pmap) {
            pmap->pm_asid[i].pma_asidgen = 0;
            continue;
        }
        cpumask |= m;
    }
if (cpumask == 1 << cpuid) {
    u_long asid;
    asid = pmap->pm_asid[cpuid].pma_asid << VMTLB_PID_SHIFT;
    tlb_flush_addr(va | asid);
} else if (cpumask) {
    struct pmap_invalidate_page_arg arg;
    arg.pmap = pmap;
    arg.va = va;
    smp_rendezvous_cpus(cpumask, pmap_invalidate_user_page_action,
        &arg);
}
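The need/don’t need table on slide 59 boils down to a small decision when building the set of processors to interrupt. The sketch below is a simplified stand-alone version of that decision; `shootdown_mask()` is a hypothetical function, and it leaves out the ASID generation check that the kernel code above performs.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Build a bitmask of the processors that must receive a TLB shootdown.
 * using_pmap[i] says whether processor i is currently running on the
 * pagetable being changed. Kernel mappings always need a shootdown;
 * user mappings only on processors that are using the pagetable.
 */
unsigned int
shootdown_mask(const bool using_pmap[], int ncpus, bool kernel_mapping)
{
    unsigned int mask = 0;

    assert(ncpus > 0 && ncpus <= 32);

    for (int i = 0; i < ncpus; i++) {
        if (kernel_mapping || using_pmap[i])
            mask |= 1u << i;
    }
    return mask;
}
```

An empty mask means no IPI is sent at all; for the skipped processors it is enough to invalidate the ASID assignment, as the kernel code does by resetting `pma_asidgen` to 0.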
62. Future work
• Commit machine ddbcpu<#>
• New port for Cavium OCTEON, if it’s acceptable to the OpenBSD project
• Maybe SMP support for SGI Origin 350
• Also interested in the MI part of SMP