Building a full kernel takes time but is often necessary during development or when backporting patches. The nature of the kernel makes it easy to distribute its build on multiple cheap machines. This presentation will explain how to set up a build farm based on cost, size, and performance.
Willy Tarreau, HaProxy
Kernel Recipes 2016 - kernelci.org: 1.5 million kernel boots (and counting)Anne Nicolas
The kernelci.org project performs over 2000 kernel boot tests per day for upstream kernels on a wide variety of hardware. This talk will provide an overview of kernelci.org, how distributed board farms are used, how it is used by kernel maintainers and developers, and how you can make use of the reports, logs and pre-built images.
Kevin Hilman, BayLibre
The Linux driver model was created over a decade ago with the goal of unifying all hardware drivers in the kernel in a way to provide both consitant device naming and properly power management control. This talk will go into how well those goals were reached, how the model works today, and what remains to be done.
Greg KH, The Linux Foundation
Kernel Recipes 2016 - entry_*.S: A carefree stroll through kernel entry codeAnne Nicolas
I have always wondered what happens when we enter the kernel from userspace: what preparations does the hardware meet when the userspace to kernel space switch instructions are executed and back, and what does the kernel do when it executes a system call. There are also a bunch of things it does before it executes the actual syscall so I try to look at those too.
This talk is an attempt to demystify some of the aspects of the cryptic x86 entry code in arch/x86/entry/ written in assembly and how does that all fit with software-visible architecture of x86, what hardware features are being used and how.
With the hope to get more people excited about this funky piece of the kernel and maybe have the same fun we’re having.
Borislav Petkov, SUSE
Kernel Recipes 2015 - So you want to write a Linux driver frameworkAnne Nicolas
Writing a new driver framework in Linux is hard. There are many pitfalls along the way; this talk hopes to point out some of those pitfalls and hard lessons learned through examples, advice and humorous anecdotes in the hope that it will aid those adventurous enough to take on the task of writing a new driver framework. The scope of the talk includes internal framework design as well as external API design exposed to drivers and consumers of the framework. This presentation pulls directly from the Michael Turquette’s experience authoring the Common Clock
Framework and maintaining that code for the last four years.
Additionally Mike has solicited tips and advice from other subsystem maintainers, for a well-rounded overview. Be prepared to learn some winning design patterns and hear some embarrassing stories of framework design gone wrong.
Mike Turquette, BayLibre
Kernel Recipes 2016 - Kernel documentation: what we have and where it’s goingAnne Nicolas
The Linux kernel features an extensive array of, to put it kindly, somewhat disorganized documentation. A significant effort is underway to make things better, though. This talk will review the state of kernel documentation, cover the changes that are being made (including the adoption of a new system for formatted documentation), and discuss how interested developers can help.
Jonathan Corbet, LWN.net
Spying on the Linux kernel for fun and profitAndrea Righi
Do you ever wonder what the kernel is doing while your code is running? This talk will explore some methodologies and techniques (eBPF, ftrace, etc.) to look under the hood of the Linux kernel and understand what it’s actually doing behind the scenes.
Agenda:
In this talk we will present various locking mechanisms implemented in the linux kernel.
From System V locks to raw spinlocks and the RT patch.
Speaker:
Mark Veltzer - CTO of Hinbit and a senior instructor at John Bryce. Mark is also a member of the Free Source Foundation and contributes to many free projects.
https://github.com/veltzer
Kernel Recipes 2016 - kernelci.org: 1.5 million kernel boots (and counting)Anne Nicolas
The kernelci.org project performs over 2000 kernel boot tests per day for upstream kernels on a wide variety of hardware. This talk will provide an overview of kernelci.org, how distributed board farms are used, how it is used by kernel maintainers and developers, and how you can make use of the reports, logs and pre-built images.
Kevin Hilman, BayLibre
The Linux driver model was created over a decade ago with the goal of unifying all hardware drivers in the kernel in a way to provide both consitant device naming and properly power management control. This talk will go into how well those goals were reached, how the model works today, and what remains to be done.
Greg KH, The Linux Foundation
Kernel Recipes 2016 - entry_*.S: A carefree stroll through kernel entry codeAnne Nicolas
I have always wondered what happens when we enter the kernel from userspace: what preparations does the hardware meet when the userspace to kernel space switch instructions are executed and back, and what does the kernel do when it executes a system call. There are also a bunch of things it does before it executes the actual syscall so I try to look at those too.
This talk is an attempt to demystify some of the aspects of the cryptic x86 entry code in arch/x86/entry/ written in assembly and how does that all fit with software-visible architecture of x86, what hardware features are being used and how.
With the hope to get more people excited about this funky piece of the kernel and maybe have the same fun we’re having.
Borislav Petkov, SUSE
Kernel Recipes 2015 - So you want to write a Linux driver frameworkAnne Nicolas
Writing a new driver framework in Linux is hard. There are many pitfalls along the way; this talk hopes to point out some of those pitfalls and hard lessons learned through examples, advice and humorous anecdotes in the hope that it will aid those adventurous enough to take on the task of writing a new driver framework. The scope of the talk includes internal framework design as well as external API design exposed to drivers and consumers of the framework. This presentation pulls directly from the Michael Turquette’s experience authoring the Common Clock
Framework and maintaining that code for the last four years.
Additionally Mike has solicited tips and advice from other subsystem maintainers, for a well-rounded overview. Be prepared to learn some winning design patterns and hear some embarrassing stories of framework design gone wrong.
Mike Turquette, BayLibre
Kernel Recipes 2016 - Kernel documentation: what we have and where it’s goingAnne Nicolas
The Linux kernel features an extensive array of, to put it kindly, somewhat disorganized documentation. A significant effort is underway to make things better, though. This talk will review the state of kernel documentation, cover the changes that are being made (including the adoption of a new system for formatted documentation), and discuss how interested developers can help.
Jonathan Corbet, LWN.net
Spying on the Linux kernel for fun and profitAndrea Righi
Do you ever wonder what the kernel is doing while your code is running? This talk will explore some methodologies and techniques (eBPF, ftrace, etc.) to look under the hood of the Linux kernel and understand what it’s actually doing behind the scenes.
Agenda:
In this talk we will present various locking mechanisms implemented in the linux kernel.
From System V locks to raw spinlocks and the RT patch.
Speaker:
Mark Veltzer - CTO of Hinbit and a senior instructor at John Bryce. Mark is also a member of the Free Source Foundation and contributes to many free projects.
https://github.com/veltzer
Current experience shows that a lot of developers working on Xen/Linux kernel use mainly only small set of debugging tools. Often they are sufficient for generic work. However, when unusual problem arises which could not be easily debugged using known tools sometimes they are trying to reinvent the wheel. Goal of this session is to present wide range of debugging tools starting from simplest one to most feature reach solutions in context of Xen/Linux kernel debugging. It will describe pros and cons of printk (serial, debug console, etc.), gdb, gdbsx, kgdb, QEMU, kdump and others. Additionally, there will be some information about possible new solutions and current kexec/kdump developments for Xen.
Kernel Recipes 2016 - Understanding a Real-Time System (more than just a kernel)Anne Nicolas
The PREEMPT_RT patch turns Linux into a hard Real-Time designed operating system. But it takes more than just a kernel to make sure you can meet all your requirements. This talk explains all aspects of the system that is being used for a mission critical project that must be considered. Creating a Real-Time environment is difficult and there is no simple solution to make sure that your system is capable to fulfill its needs. One must be vigilant with all aspects of the system to make sure there are no surprises. This talk will discuss most of the “gotchas” that come with putting together a Real-Time system.
You don’t need to be a developer to enjoy this talk. If you are curious to know how your computer is an unpredictable mess you should definitely come to this talk.
Steven Rostedt - Red Hat
Have you ever heard of FreeBSD? Probably.
Have you ever interacted with its kernel? Probably not.
In this talk, Gili Yankovitch (nyxsecuritysolutions.com) will talk about the FreeBSD operating system, its network stack and how to write network drivers for it.
The talk will cover the following topics:
* Kernel/User interation in FreeBSD
* The FreeBSD Network Stack
* Network Buffers API
* L2 and L3 Hooking
LAS16-101: Efficient kernel backporting
Speakers: Alex Shi
Date: September 26, 2016
★ Session Description ★
In computer/mobile product world, due to the stability, project timeline, etc considerations, latest upstream kernel isn’t their preference. The long term stable kernel is. But if you want to some of the latest features which only is in upstream kernel,you will have to backport them to old stable kernel. This presentation will share the kernel feature backport experience with audience, help them understand how to do backports quickly and effectively without detailed knowledge of the target feature, thus giving more flexibility and Improving productivity when making products. We will use some examples, to discuss how to get info from backport request, how to find necessary commits, how to get dependency, how to resolve conflicts, and finally how to test it.
★ Resources ★
Etherpad: pad.linaro.org/p/las16-101
Presentations & Videos: http://connect.linaro.org/resource/las16/las16-101/
★ Event Details ★
Linaro Connect Las Vegas 2016 – #LAS16
September 26-30, 2016
http://www.linaro.org
http://connect.linaro.org
The Linux Block Layer - Built for Fast StorageKernel TLV
The arrival of flash storage introduced a radical change in performance profiles of direct attached devices. At the time, it was obvious that Linux I/O stack needed to be redesigned in order to support devices capable of millions of IOPs, and with extremely low latency.
In this talk we revisit the changes the Linux block layer in the
last decade or so, that made it what it is today - a performant, scalable, robust and NUMA-aware subsystem. In addition, we cover the new NVMe over Fabrics support in Linux.
Sagi Grimberg
Sagi is Principal Architect and co-founder at LightBits Labs.
High-Performance Networking Using eBPF, XDP, and io_uringScyllaDB
In the networking world there are a number of ways to increase performance over naive use of basic Berkeley sockets. These techniques have ranged from polling blocking sockets, non-blocking sockets controlled by Epoll, all the way through completely bypassing the Linux kernel for maximum network performance where you talk directly to the network interface card by using something like DPDK or Netmap. All these tools have their place, and generally occupy a space from convenience to performance. But in recent years, that landscape has changed massively.. The tools available to the average Linux systems developer have improved from the creation of io_uring, to the expansion of bpf from a simple filtering language to a full-on programming environment embedded directly in the kernel. Along with that came something called XDP (express datapath). This was Linux kernel's answer to kernel-bypass networking. AF_XDP is the new socket type created by this feature, and generally works very similarly to something like DPDK. History lessons out of the way, this talk will look into, and discuss the merits of this technology, it's place in the broader ecosystem and how it can be used to attain the highest level of performance possible. This talk will dive into crucial details, such as how AF_XDP works, how it can be integrated into a larger system and finally more advanced topics such as request sharding/load balancing. There will be detailed look at the design of AF_XDP, the eBpf code used, as well as the userspace code required to drive it all. It will also include performance numbers from this setup compared to regular kernel networking. And most importantly how to put all this together to handle as much data as possible on a single modern multi-core system.
Introduction to binary translation in QEMU(TCG). Describe how it works. In addition, there is a section which demonstrate qemu-monitor, a debug tool for AArch64/QEMU.
There are lots of animations in the slides so download and open it with Microsoft PowerPoint for the best experience. Below is the download link.
Google Driver Link: http://goo.gl/XXMC9X
IRQs: the Hard, the Soft, the Threaded and the PreemptibleAlison Chaiken
The Linux kernel supports a diverse set of interrupt handlers that partition work into immediate and deferred tasks. The talk introduces the major varieties and explains how IRQs differ in the real-time kernel.
The Linux Kernel Scheduler (For Beginners) - SFO17-421Linaro
Session ID: SFO17-421
Session Name: The Linux Kernel Scheduler (For Beginners) - SFO17-421
Speaker: Viresh Kumar
Track: Power Management
★ Session Summary ★
This talk will take you through the internals of the Linux Kernel scheduler.
---------------------------------------------------
★ Resources ★
Event Page: http://connect.linaro.org/resource/sfo17/sfo17-421/
Presentation:
Video: https://www.youtube.com/watch?v=q283Wm__QQ0
---------------------------------------------------
★ Event Details ★
Linaro Connect San Francisco 2017 (SFO17)
25-29 September 2017
Hyatt Regency San Francisco Airport
---------------------------------------------------
Keyword:
'http://www.linaro.org'
'http://connect.linaro.org'
---------------------------------------------------
Follow us on Social Media
https://www.facebook.com/LinaroOrg
https://twitter.com/linaroorg
https://www.youtube.com/user/linaroorg?sub_confirmation=1
https://www.linkedin.com/company/1026961
Agenda:
This talk will provide an in-depth review of the usage of canaries in the kernel and the interaction with userspace, as well as a short review of canaries and why they are needed in general so don't be afraid if you never heard of them.
Speaker:
Gil Yankovitch, CEO, Chief Security Researcher from Nyx Security Solutions
Kernel Recipes 2015 - The Dronecode Project – A step in open source dronesAnne Nicolas
UAVs are becoming more and more present in our everyday life and there are lots of different projects that are being currently developed in order to control their flight, handle their stability, make it possible to edit automatic missions that the drones will execute and anything that the developers can think of.
On October 2014, the Linux Foundation announced the creation of the DroneCode Project which is to become “a common, shared open source platform for Unmanned Aerial Vehicles (UAVs)”. Parrot started to sell Linux based drones in 2010 and obviously needed to take part in that adventure.
This Lightning Talk will try to give a quick overview of the projects that are developed by the Dronecode community and explain why and how I started a few months ago to port an open source autopilot name Ardupilot to Parrot’s drones. This Lightning Talk will also present the current status of this project, and the many possibilities that can come from it.
Julien BERAUD
Let's trace Linux Lernel with KGDB @ COSCUP 2021Jian-Hong Pan
https://coscup.org/2021/en/session/39M73K
https://www.youtube.com/watch?v=L_Gyvdl_d_k
Engineers have plenty of debug tools for user space programs development, code tracing, debugging and analyzing. Except “printk”, do we have any other debug tools for Linux kernel development? The “KGDB” mentioned in Linux kernel document provides another possibility.
Will share how to experiment with the KGDB in a virtual machine. And, use GDB + OpenOCD + JTAG + Raspberry Pi in the real environment as the demo in this talk.
開發 user space 軟體時,工程師們有方便的 debug 工具進行查找、分析、除錯。但在 Linux kernel 的開發,除了 printk 外,還可以有哪些工具可以使用呢?從 Linux kernel document 可以看到 KGDB 相關的資訊,提供了在 kernel 除錯時的另一個可能性。
本次將分享,從建立最簡單環境的虛擬機機開始,到實際使用 GDB + OpenOCD + JTAG + Raspberry Pi 當作展示範例。
The presentation deals with the set of tools and features that can be used by Linux kernel developers for kernel debugging. Also, static analysis of kernel patches was addressed during speech. Special attention was given to access tools, tracing tools, and interactive debugging tools, namely: DebugFS, ftrace, and GDB.
This presentation by Aleksandr Bulyshchenko (Software Engineer, Consultant, GlobalLogic Kharkiv) was delivered at GlobalLogic Kharkiv Embedded TechTalk #1 on March 13, 2018.
This talk will provide several examples of how Facebook engineers use BPF to scale the networking, prevent denial of service, secure containers, analyze performance. It’s suitable for BPF newbies and experts.
Alexei Starovoitov, Facebook
The objective of this conference is to explain how to develop on small machines that can boot fast, fanless … It’s convenient and often more realistic than to work in virtual machines, especially when working on developping drivers.
Willy will list all the necessary items to setup an optimal environment to be as efficient as possible. Among these one: the importance of investing time to speed up the boot, etc …
He will also speak a bit about crosstool-ng, used in this environement.
Current experience shows that a lot of developers working on Xen/Linux kernel use mainly only small set of debugging tools. Often they are sufficient for generic work. However, when unusual problem arises which could not be easily debugged using known tools sometimes they are trying to reinvent the wheel. Goal of this session is to present wide range of debugging tools starting from simplest one to most feature reach solutions in context of Xen/Linux kernel debugging. It will describe pros and cons of printk (serial, debug console, etc.), gdb, gdbsx, kgdb, QEMU, kdump and others. Additionally, there will be some information about possible new solutions and current kexec/kdump developments for Xen.
Kernel Recipes 2016 - Understanding a Real-Time System (more than just a kernel)Anne Nicolas
The PREEMPT_RT patch turns Linux into a hard Real-Time designed operating system. But it takes more than just a kernel to make sure you can meet all your requirements. This talk explains all aspects of the system that is being used for a mission critical project that must be considered. Creating a Real-Time environment is difficult and there is no simple solution to make sure that your system is capable to fulfill its needs. One must be vigilant with all aspects of the system to make sure there are no surprises. This talk will discuss most of the “gotchas” that come with putting together a Real-Time system.
You don’t need to be a developer to enjoy this talk. If you are curious to know how your computer is an unpredictable mess you should definitely come to this talk.
Steven Rostedt - Red Hat
Have you ever heard of FreeBSD? Probably.
Have you ever interacted with its kernel? Probably not.
In this talk, Gili Yankovitch (nyxsecuritysolutions.com) will talk about the FreeBSD operating system, its network stack and how to write network drivers for it.
The talk will cover the following topics:
* Kernel/User interation in FreeBSD
* The FreeBSD Network Stack
* Network Buffers API
* L2 and L3 Hooking
LAS16-101: Efficient kernel backporting
Speakers: Alex Shi
Date: September 26, 2016
★ Session Description ★
In computer/mobile product world, due to the stability, project timeline, etc considerations, latest upstream kernel isn’t their preference. The long term stable kernel is. But if you want to some of the latest features which only is in upstream kernel,you will have to backport them to old stable kernel. This presentation will share the kernel feature backport experience with audience, help them understand how to do backports quickly and effectively without detailed knowledge of the target feature, thus giving more flexibility and Improving productivity when making products. We will use some examples, to discuss how to get info from backport request, how to find necessary commits, how to get dependency, how to resolve conflicts, and finally how to test it.
★ Resources ★
Etherpad: pad.linaro.org/p/las16-101
Presentations & Videos: http://connect.linaro.org/resource/las16/las16-101/
★ Event Details ★
Linaro Connect Las Vegas 2016 – #LAS16
September 26-30, 2016
http://www.linaro.org
http://connect.linaro.org
The Linux Block Layer - Built for Fast StorageKernel TLV
The arrival of flash storage introduced a radical change in performance profiles of direct attached devices. At the time, it was obvious that Linux I/O stack needed to be redesigned in order to support devices capable of millions of IOPs, and with extremely low latency.
In this talk we revisit the changes the Linux block layer in the
last decade or so, that made it what it is today - a performant, scalable, robust and NUMA-aware subsystem. In addition, we cover the new NVMe over Fabrics support in Linux.
Sagi Grimberg
Sagi is Principal Architect and co-founder at LightBits Labs.
High-Performance Networking Using eBPF, XDP, and io_uringScyllaDB
In the networking world there are a number of ways to increase performance over naive use of basic Berkeley sockets. These techniques have ranged from polling blocking sockets, non-blocking sockets controlled by Epoll, all the way through completely bypassing the Linux kernel for maximum network performance where you talk directly to the network interface card by using something like DPDK or Netmap. All these tools have their place, and generally occupy a space from convenience to performance. But in recent years, that landscape has changed massively.. The tools available to the average Linux systems developer have improved from the creation of io_uring, to the expansion of bpf from a simple filtering language to a full-on programming environment embedded directly in the kernel. Along with that came something called XDP (express datapath). This was Linux kernel's answer to kernel-bypass networking. AF_XDP is the new socket type created by this feature, and generally works very similarly to something like DPDK. History lessons out of the way, this talk will look into, and discuss the merits of this technology, it's place in the broader ecosystem and how it can be used to attain the highest level of performance possible. This talk will dive into crucial details, such as how AF_XDP works, how it can be integrated into a larger system and finally more advanced topics such as request sharding/load balancing. There will be detailed look at the design of AF_XDP, the eBpf code used, as well as the userspace code required to drive it all. It will also include performance numbers from this setup compared to regular kernel networking. And most importantly how to put all this together to handle as much data as possible on a single modern multi-core system.
Introduction to binary translation in QEMU(TCG). Describe how it works. In addition, there is a section which demonstrate qemu-monitor, a debug tool for AArch64/QEMU.
There are lots of animations in the slides so download and open it with Microsoft PowerPoint for the best experience. Below is the download link.
Google Driver Link: http://goo.gl/XXMC9X
IRQs: the Hard, the Soft, the Threaded and the PreemptibleAlison Chaiken
The Linux kernel supports a diverse set of interrupt handlers that partition work into immediate and deferred tasks. The talk introduces the major varieties and explains how IRQs differ in the real-time kernel.
The Linux Kernel Scheduler (For Beginners) - SFO17-421Linaro
Session ID: SFO17-421
Session Name: The Linux Kernel Scheduler (For Beginners) - SFO17-421
Speaker: Viresh Kumar
Track: Power Management
★ Session Summary ★
This talk will take you through the internals of the Linux Kernel scheduler.
---------------------------------------------------
★ Resources ★
Event Page: http://connect.linaro.org/resource/sfo17/sfo17-421/
Presentation:
Video: https://www.youtube.com/watch?v=q283Wm__QQ0
---------------------------------------------------
★ Event Details ★
Linaro Connect San Francisco 2017 (SFO17)
25-29 September 2017
Hyatt Regency San Francisco Airport
---------------------------------------------------
Keyword:
'http://www.linaro.org'
'http://connect.linaro.org'
---------------------------------------------------
Follow us on Social Media
https://www.facebook.com/LinaroOrg
https://twitter.com/linaroorg
https://www.youtube.com/user/linaroorg?sub_confirmation=1
https://www.linkedin.com/company/1026961
Agenda:
This talk will provide an in-depth review of the usage of canaries in the kernel and the interaction with userspace, as well as a short review of canaries and why they are needed in general so don't be afraid if you never heard of them.
Speaker:
Gil Yankovitch, CEO, Chief Security Researcher from Nyx Security Solutions
Kernel Recipes 2015 - The Dronecode Project – A step in open source dronesAnne Nicolas
UAVs are becoming more and more present in our everyday life and there are lots of different projects that are being currently developed in order to control their flight, handle their stability, make it possible to edit automatic missions that the drones will execute and anything that the developers can think of.
On October 2014, the Linux Foundation announced the creation of the DroneCode Project which is to become “a common, shared open source platform for Unmanned Aerial Vehicles (UAVs)”. Parrot started to sell Linux based drones in 2010 and obviously needed to take part in that adventure.
This Lightning Talk will try to give a quick overview of the projects that are developed by the Dronecode community and explain why and how I started a few months ago to port an open source autopilot name Ardupilot to Parrot’s drones. This Lightning Talk will also present the current status of this project, and the many possibilities that can come from it.
Julien BERAUD
Let's trace Linux Lernel with KGDB @ COSCUP 2021Jian-Hong Pan
https://coscup.org/2021/en/session/39M73K
https://www.youtube.com/watch?v=L_Gyvdl_d_k
Engineers have plenty of debug tools for user space programs development, code tracing, debugging and analyzing. Except “printk”, do we have any other debug tools for Linux kernel development? The “KGDB” mentioned in Linux kernel document provides another possibility.
Will share how to experiment with the KGDB in a virtual machine. And, use GDB + OpenOCD + JTAG + Raspberry Pi in the real environment as the demo in this talk.
開發 user space 軟體時,工程師們有方便的 debug 工具進行查找、分析、除錯。但在 Linux kernel 的開發,除了 printk 外,還可以有哪些工具可以使用呢?從 Linux kernel document 可以看到 KGDB 相關的資訊,提供了在 kernel 除錯時的另一個可能性。
本次將分享,從建立最簡單環境的虛擬機機開始,到實際使用 GDB + OpenOCD + JTAG + Raspberry Pi 當作展示範例。
The presentation deals with the set of tools and features that can be used by Linux kernel developers for kernel debugging. Also, static analysis of kernel patches was addressed during speech. Special attention was given to access tools, tracing tools, and interactive debugging tools, namely: DebugFS, ftrace, and GDB.
This presentation by Aleksandr Bulyshchenko (Software Engineer, Consultant, GlobalLogic Kharkiv) was delivered at GlobalLogic Kharkiv Embedded TechTalk #1 on March 13, 2018.
This talk will provide several examples of how Facebook engineers use BPF to scale the networking, prevent denial of service, secure containers, analyze performance. It’s suitable for BPF newbies and experts.
Alexei Starovoitov, Facebook
The objective of this conference is to explain how to develop on small machines that can boot fast, fanless … It’s convenient and often more realistic than to work in virtual machines, especially when working on developping drivers.
Willy will list all the necessary items to setup an optimal environment to be as efficient as possible. Among these one: the importance of investing time to speed up the boot, etc …
He will also speak a bit about crosstool-ng, used in this environement.
"Lightweight Virtualization with Linux Containers and Docker". Jerome Petazzo...Yandex
Lightweight virtualization", also called "OS-level virtualization", is not new. On Linux it evolved from VServer to OpenVZ, and, more recently, to Linux Containers (LXC). It is not Linux-specific; on FreeBSD it's called "Jails", while on Solaris it’s "Zones". Some of those have been available for a decade and are widely used to provide VPS (Virtual Private Servers), cheaper alternatives to virtual machines or physical servers. But containers have other purposes and are increasingly popular as the core components of public and private Platform-as-a-Service (PAAS), among others.
Just like a virtual machine, a Linux Container can run (almost) anywhere. But containers have many advantages over VMs: they are lightweight and easier to manage. After operating a large-scale PAAS for a few years, dotCloud realized that with those advantages, containers could become the perfect format for software delivery, since that is how dotCloud delivers from their build system to their hosts. To make it happen everywhere, dotCloud open-sourced Docker, the next generation of the containers engine powering its PAAS. Docker has been extremely successful so far, being adopted by many projects in various fields: PAAS, of course, but also continuous integration, testing, and more.
cachegrand: A Take on High Performance CachingScyllaDB
cachegrand is what happens when you throw in a mix a SIMD-accelerated hashtable — capable of performing parallel GET operations without locks or busy-wait loops (e.g. atomic operations) — with fibers, io_uring, your own I/O library, your own memory allocator, and an in-memory & on-disk time series database!
Written in C, built from scratch, natively modular - currently working on Redis compatibility — it's a platform that can deliver very high QPS with low latencies for caching and data streaming with the door open to supporting business logic in Rust & WebAssembly down the line.
This session will focus on developing techniques and OS components used highlighting how they can provide an extra boost to your platforms, no matter the programming language.
When working with big data or complex algorithms, we often look to parallelize our code to optimize runtime. By taking advantage of a GPUs 1000+ cores, a data scientist can quickly scale out solutions inexpensively and sometime more quickly than using traditional CPU cluster computing. In this webinar, we will present ways to incorporate GPU computing to complete computationally intensive tasks in both Python and R.
See the full presentation here: 👉 https://vimeo.com/153290051
Learn more about the Domino data science platform: https://www.dominodatalab.com
UWE Linux Boot Camp 2007: Hacking embedded Linux on the cheapedlangley
Slides from a talk at the first ever UWE Linux Boot Camp in 2007, about getting started playing around with embedded Linux on a budget. The example system used is the Mattel Juicebox.
strangeloop 2012 apache cassandra anti patternsMatthew Dennis
random list of Apache Cassndra Anti Patterns. There is a lot of info on what to use Cassandra for and how, but not a lot of information on what not to do. This presentation works towards filling that gap.
C for Cuda - Small Introduction to GPU computingIPALab
In this talk, we are presenting a short introduction to CUDA and GPU computing to help anyone who reads it to get started with this technology.
At first, we are introducing the GPU from the hardware point of view: what is it? How is it built? Why use it for General Purposes (GPGPU)? How does it differ from the CPU?
The second part of the presentation is dealing with the software abstraction and the use of CUDA to implement parallel computing. The software architecture, the kernels and the different types of memories are tackled in this part.
Finally, and to illustrate what has been presented previously, examples of codes are given. These examples are also highlighting the issues that may occur while using parallel-computing.
In this session, Boyan Krosnov, CPO of StorPool will discuss a private cloud setup with KVM achieving 1M IOPS per hyper-converged (storage+compute) node. We will answer the question: What is the optimum architecture and configuration for performance and efficiency?
Some resources how to navigate in the hardware space in order to build your own workstation for training deep learning models.
Alternative download link: https://www.dropbox.com/s/o7cwla30xtf9r74/deepLearning_buildComputer.pdf?dl=0
Parallel computing in bioinformatics t.seemann - balti bioinformatics - wed...Torsten Seemann
I describe the three levels of parallelism that can be exploited in bioinformatics software (1) clusters of multiple computers; (2) multiple cores on each computer; and (3) vector machine code instructions.
This session will cover performance-related developments in Red Hat Gluster Storage 3 and share best practices for testing, sizing, configuration, and tuning.
Join us to learn about:
Current features in Red Hat Gluster Storage, including 3-way replication, JBOD support, and thin-provisioning.
Features that are in development, including network file system (NFS) support with Ganesha, erasure coding, and cache tiering.
New performance enhancements related to the area of remote directory memory access (RDMA), small-file performance, FUSE caching, and solid state disks (SSD) readiness.
This presentation demonstrates how to efficiently manage GPU buffers using today's APIs. It describes why buffer management is so important, and how inefficient buffer management can cut frame rates in half. Finally, it demonstrates a couple of new techniques; the first being discard-free circular buffers and the second transient buffers.
Data Policies for the Kafka-API with WebAssembly | Alexander Gallego, VectorizedHostedbyConfluent
Enforcing format, changing schema, introducing privacy filters have always been a challenge with the classical Kafka-API. In this talk we'll cover how to extend existing applications with webassembly, allowing developers to change the shape of data at runtime, per application without creating additional topics. By leveraging WebAssembly, we can extend the capabilities of the Kafka-API beyond what it was initially imagined. Come and learn about the future of the Kafka-API
Similar to Kernel Recipes 2016 - Speeding up development by setting up a kernel build farm (20)
Kernel Recipes 2019 - Driving the industry toward upstream firstAnne Nicolas
Wanting to avoid the Android experience, Google developers always aimed to make their Chrome OS Linux kernels as close to mainline as possible. However, when Chromebooks were first created, Google was left with no choice, the mainline kernel, in some subsystems, still did not have all the functionalities needed by Chromebooks. Hence, similarly to Android, Chrome OS had to develop their own out-of-tree code for the kernel and maintain that for a few different kernel versions.
Luckily, over the last few years a strong and consistent effort has been happening to bring Chromebook devices closer to mainline. It has led to significant improvements that now make it possible to run mainline on Chrome OS devices. And not only Chromebooks, as these significant strides are also improving Arm-based SOCs and other key components of the rich Chromebook hardware ecosystem. In this talk, we will look at how and why upstream support for Chromebooks improved, the current status of various models, and what we expect in the future.
Enric Balletbò i Serra
Kernel Recipes 2019 - No NMI? No Problem! – Implementing Arm64 Pseudo-NMIAnne Nicolas
As the name would suggest, a Non-Maskable Interrupt (NMI) is an interrupt-like feature that is unaffected by the disabling of classic interrupts. In Linux, NMIs are involved in some features such as performance event monitoring, hard-lockup detector, on demand state dumping, etc… Their potential to fire when least expected can fill the most seasoned kernel hackers with dread.
AArch64 (aka arm64 in the Linux tree) does not provide architected NMIs, a consequence being that features benefiting from NMIs see their use limited on AArch64. However, the Arm Generic Interrupt Controller (GIC) supports interrupt prioritization and masking, which, among other things, provides a way to control whether or not a set of interrupts can be signaled to a CPU.
This talk will cover how, using the GIC interrupt priorities, we provide a way to configure some interrupts to behave in an NMI-like manner on AArch64. We’ll discuss the implementation, some of the complications that ensued and also some of the benefits obtained from it.
Julien Thierry
Kernel Recipes 2019 - Hunting and fixing bugs all over the Linux kernelAnne Nicolas
At a rate of almost 9 changes per hour (24/7), the Linux kernel is definitely a scary beast. Bugs are introduced on a daily basis and, through the use of multiple code analyzers, *some* of them are detected and fixed before they hit mainline. Over the course of the last few years, Gustavo has been fixing such bugs and many different issues in every corner of the Linux kernel. Recently, he was in charge of leading the efforts to globally enable -Wimplicit-fallthrough; which appears by default in Linux v5.3. This presentation is a report on all the stuff Gustavo has found and fixed in the kernel with the support of the Core Infrastructure Initiative.
Gustavo A.R. Silva
Kernel Recipes 2019 - Metrics are moneyAnne Nicolas
In I.T. we all use all kinds of metrics. Operations teams rely heavily on these, especially when things go south. These metrics are sometimes overrated. Let’s dive into a few real life stories together.
Aurélien Rougemont
Kernel Recipes 2019 - Kernel documentation: past, present, and futureAnne Nicolas
The Linux kernel project includes a huge amount of documentation, but that information has seen little in the way of care over the
years. The amount of care has increased significantly recently, though, and things are improving quickly. Listen as the kernel’s documentation maintainer discusses the current state of the kernel’s docs, how we got here, where we’re trying to go, and how you can help.
Jonathan Corbet
Embedded Recipes 2019 - Knowing your ARM from your ARSE: wading through the t...Anne Nicolas
Modern SoC designs incorporate technologies from numerous vendors, each with their own inconsistent, confusing, undocumented and even contradictory terminology. The result is a mess of acronyms and product names which have a surprising impact on the ability to develop reusable, modular code thanks to the nature of the underlying IP being obscured.
This presentation will dive into some of the misnomers plaguing the Arm ecosystem, with the aim of explaining why things are like they are, how they fit together under the architectural umbrella and how you, as a developer, can decipher the baffling ingredients list of your next SoC design!
Will Deacon
Kernel Recipes 2019 - GNU poke, an extensible editor for structured binary dataAnne Nicolas
GNU poke is a new interactive editor for binary data. Not limited to editing basic ntities such as bits and bytes, it provides a full-fledged procedural, interactive programming language designed to describe data structures and to operate on them. Once a user has defined a structure for binary data (usually matching some file format) she can search, inspect, create, shuffle and modify abstract entities such as ELF relocations, MP3 tags, DWARF expressions, partition table entries, and so on, with primitives resembling simple editing of bits and bytes. The program comes with a library of already written descriptions (or “pickles” in poke parlance) for many binary formats.
GNU poke is useful in many domains. It is very well suited to aid in the development of programs that operate on binary files, such as assemblers and linkers. This was in fact the primary inspiration that brought me to write it: easily injecting flaws into ELF files in order to reproduce toolchain bugs. Also, due to its flexibility, poke is also very useful for reverse engineering, where the real structure of the data being edited is discovered by experiment, interactively. It is also good for the fast development of prototypes for programs like linkers, compressors or filters, and it provides a convenient foundation to write other utilities such as diff and patch tools for binary files.
This talk (unlike Gaul) is divided into four parts. First I will introduce the program and show what it does: from simple bits/bytes editing to user-defined structures. Then I will show some of the internals, and how poke is implemented. The third block will cover the way of using Poke to describe user data, which is to say the art of writing “pickles”. The presentation ends with a status of the project, a call for hackers, and a hint at future works.
Jose E. Marchesi
Kernel Recipes 2019 - Analyzing changes to the binary interface exposed by th...Anne Nicolas
Operating system distributors often face challenges that are somewhat different from that of upstream kernel developers. For instance, some kernel updates often need to stay at least binary compatible with modules that might be “out of tree” for some time.
In that context, being able to automatically detect and analyze changes to the binary interface exposed by the kernel to its module does have some noticeable value.
The Libabigail framework is capable of analyzing ELF binaries along with their accompanying debug info in the DWARF format, detect and report changes in types, functions, variables and ELF symbols. It has historically supported that for user space shared libraries and application so we worked to make it understand the Linux kernel
binaries.
In this presentation, we are going to present the current support of ABI analysis for Linux Kernel binaries, the challenges we face, how we address them and the plans we have for the future.
Dodji Seketeli, Jessica Yu, Matthias Männich
Embedded Recipes 2019 - Remote update adventures with RAUC, Yocto and BareboxAnne Nicolas
Different upgrade and update strategies exist when it comes to embedded Linux system. If at development time none of these strategies have been chosen, adding them afterwards can be tedious task.
Even harder it gets when the system is already deployed in the field and only accessible via a 3G connection.
This talk is a developer experience of putting in place exactly that. Giving a return of experience on one way of doing it on a system running Barebox and a Yocto-based distribution.
Patrick Boettcher
Embedded Recipes 2019 - Making embedded graphics less specialAnne Nicolas
Traditionally graphics drivers were one of the last hold-outs of proprietary software in an embedded Linux system. This situation is changing with open-source graphics drivers showing up for almost all of the graphics acceleration peripherals on the market right now. This talk will show how open-source graphics drivers are making embedded systems less special, as well as trying to provide an overview of the Linux graphics stack, de-mystifying what is often seen as black magic GPU stuff from outside observers.
Lucas Stach
Embedded Recipes 2019 - Linux on Open Source Hardware and Libre SiliconAnne Nicolas
This talk will explore Open Source Hardware projects relevant to Linux, including boards like BeagleBone, Olimex OLinuXino, Giant board and more. Looking at the benefits and challenges of designing Open Source Hardware for a Linux system, along with BeagleBoard.org’s experience of working with community, manufacturers, and distributors to create an Open Source Hardware platform. In closing also looking at the future, Libre Silicon like RISC-V designs, and where this might take Linux.
Drew Fustini
Embedded Recipes 2019 - From maintaining I2C to the big (embedded) pictureAnne Nicolas
The I2C subsystem is not the shiniest part of the Linux Kernel. For embedded devices, though, it is one of the many puzzle pieces which just have to work. Wolfram Sang has the experience of maintaining this subsystem for nearly 7 years now. This talk gives a short overview of how maintaining works in general and specifically in this subsystem. But mainly, it will highlight noteworthy points in the timeline and lessons learnt from that. It will present trends, not so much regarding I2C but more the Linux Kernel and the embedded ecosystem in general. And of course, there will be plenty of anecdotes and bits from behind the scenes for your entertainment.
Wolfram Sang
Embedded Recipes 2019 - Testing firmware the devops wayAnne Nicolas
ITRenew is selling recertified OCP servers under the Sesame brand, those servers come either with their original UEFI BIOS or with LinuxBoot. The LinuxBoot project is pushing the Linux kernel inside bios flash and using userland programs as bootloader.
To achieve quality on our software stack, as any project, we need to test it. Traditional BIOS are tested by hand, this is 2019 we need to do it automatically! We already presented the hardware setup behind the LinuxBoot CI, this talk will focus on the software.
We use u-root for our userland bootloader; this software is written in Go so we naturally choose to use Go for our testing too. We will present how we are using and extending the Go native test framework `go test` for testing embedded systems (serial console) and improving the report format for integration to a CI.
Julien Viard de Galbert
Embedded Recipes 2019 - Herd your socs become a matchmakerAnne Nicolas
About 60% of the Linux kernel source tree is devoted to drivers for a large variety of supported hardware components. Especially in the embedded world, the number of different SoC families, versions, and revisions, integrating a myriad of “IP cores”, keeps on growing.
In this presentation, Geert will explain how to match drivers against hardware, and how to support a wide variety of (dis)similar devices, without turning platform and driver code into an entangled bowl of spaghetti.tra
Starting with a brief history of driver matching in Linux, he will fast-forward to device-tree based matching. He will discuss ways to handle slight variations of the same hardware devices, and different SoC revisions, each with their own quirks and bugs. Finally, Geert will show best practices for evolving device drivers in a maintainable way, based on his experiences as an embedded Linux kernel developer and maintainer.
Geert Uytterhoeven
Embedded Recipes 2019 - LLVM / Clang integrationAnne Nicolas
Buildroot is a popular and easy to use embedded Linux build system. It generates, in few minutes, lightweight and customized Linux systems, including the cross-compilation toolchain, kernel and bootloader images, as well as a wide variety of userspace libraries and programs.
This talk is about the integration of LLVM/clang into Buildroot.
In 2018, Valentin Korenblit, supervised by Romain Naour, worked on this topic during his internship at Smile ECS. After a short introduction about llvm/clang and Buildroot, this talk will go through the numerous issues discovered while adding llvm/clang componants and how these issues were fixed. Romain will also detail the work in progress and the work to be done based on llvm/clang libraries (OpenCL, Compiler-rt, BCC. Chromium, ldd).
Romain Naour
Embedded Recipes 2019 - Introduction to JTAG debuggingAnne Nicolas
This talk introduces JTAG debugging capabilities, both for debugging hardware and software. Marek first explains what the JTAG stands for and explains the operation of the JTAG state machine. This is followed by an introduction to free software JTAG tools, OpenOCD and urJTAG. Marek shortly explains how to debug software using those tools and how that ties into the JTAG state machine. However, JTAG was designed for testing hardware. Marek explains what boundary scan testing (BST) is, what are BSDL files and their format, and practically demonstrates how to blink an LED using BST and only free software tools.
Marek Vasut
Embedded Recipes 2019 - Pipewire a new foundation for embedded multimediaAnne Nicolas
PipeWire is an open source project that aims to greatly improve audio and video handling under Linux. Utilising a fresh design, it bridges use cases that have been previously addressed by different tools – or not addressed at all -, providing ground for building complex, yet secure and efficient, multimedia systems.
In this talk, Julien is going to present the PipeWire project and the concepts that make up its design. In addition, he is going to give an update of the current and future work going on around PipeWire, both upstream and in Automotive Grade Linux, an early adopter that Julien is actively working on.
Julian Bouzas
Kernel Recipes 2019 - ftrace: Where modifying a running kernel all startedAnne Nicolas
Ftrace’s most powerful feature is the function tracer (and function graph tracer which is built from it). But to have this enabled on production systems, it had to have its overhead be negligible when disabled. As the function tracer uses gcc’s profiling mechanism, which adds a call to “mcount” (or more recently fentry, don’t worry if you don’t know what this is, it will all be explained) at the start of almost all functions, it had to do something about the overhead that causes. The solution was to turn those calls into “nops” (an instruction that the CPU simply ignores). But this was no easy feat. It took a lot to come up with a solution (and also turning a few network cards into bricks). This talk will explain the history of how ftrace came about implementing the function tracer, and brought with it the possibility of static branches and soon static calls!
Steven Rostedt
Kernel Recipes 2019 - Suricata and XDPAnne Nicolas
Suricata is a network threat detection engine using network packets capture to reconstruct the traffic till the application layer and find threats on the network using rules that define behavior to detect. This task is really CPU intensive and discarding non interesting traffic is a solution to enable a scaling of Suricata to 40gbps and other.
This talk will present the latest evolution of Suricata that knows uses eBPF and XDP to bypass traffic. Suricata 5.0 is supporting the hardware XDP to provide ypass with network card such as Netronome. It also takes advantage of pinned maps to get persistance of the bypassed flows. This talk will cover the different usage of XDP and eBPF in Suricata and shows how it impact performance and usability. If development time permit, the talk will also cover AF_XDP and the impact on this new capture method on Suricata.
Eric Leblond
Kernel Recipes 2019 - Marvels of Memory Auto-configuration (SPD)Anne Nicolas
System memory configuration is a transparent operation nowadays, something that we all came to expect to just work out of the box. Still, it does happen behind the scenes every single time we boot our computers. This requires the cooperation of hardware components on the mainboard and on memory modules themselves, as well as firmware code to drive these. While it is possible to just let it happen, having a deeper understanding of how it works makes it possible to access valuable information from the operating system at run-time.
I will take you through the history of system memory configuration from the mid 70s to now. We will explore the different types of memory modules, how their configuration data is stored and how the firmware can access them. We will see which problems had to be solved along the way and how they were solved. Lastly we will see how Linux supports reading the memory configuration information and what you can do with that information.
Jean Delvare
In software engineering, the right architecture is essential for robust, scalable platforms. Wix has undergone a pivotal shift from event sourcing to a CRUD-based model for its microservices. This talk will chart the course of this pivotal journey.
Event sourcing, which records state changes as immutable events, provided robust auditing and "time travel" debugging for Wix Stores' microservices. Despite its benefits, the complexity it introduced in state management slowed development. Wix responded by adopting a simpler, unified CRUD model. This talk will explore the challenges of event sourcing and the advantages of Wix's new "CRUD on steroids" approach, which streamlines API integration and domain event management while preserving data integrity and system resilience.
Participants will gain valuable insights into Wix's strategies for ensuring atomicity in database updates and event production, as well as caching, materialization, and performance optimization techniques within a distributed system.
Join us to discover how Wix has mastered the art of balancing simplicity and extensibility, and learn how the re-adoption of the modest CRUD has turbocharged their development velocity, resilience, and scalability in a high-growth environment.
Enterprise Resource Planning System includes various modules that reduce any business's workload. Additionally, it organizes the workflows, which drives towards enhancing productivity. Here are a detailed explanation of the ERP modules. Going through the points will help you understand how the software is changing the work dynamics.
To know more details here: https://blogs.nyggs.com/nyggs/enterprise-resource-planning-erp-system-modules/
Unleash Unlimited Potential with One-Time Purchase
BoxLang is more than just a language; it's a community. By choosing a Visionary License, you're not just investing in your success, you're actively contributing to the ongoing development and support of BoxLang.
Quarkus Hidden and Forbidden ExtensionsMax Andersen
Quarkus has a vast extension ecosystem and is known for its subsonic and subatomic feature set. Some of these features are not as well known, and some extensions are less talked about, but that does not make them less interesting - quite the opposite.
Come join this talk to see some tips and tricks for using Quarkus and some of the lesser known features, extensions and development techniques.
Code reviews are vital for ensuring good code quality. They serve as one of our last lines of defense against bugs and subpar code reaching production.
Yet, they often turn into annoying tasks riddled with frustration, hostility, unclear feedback and lack of standards. How can we improve this crucial process?
In this session we will cover:
- The Art of Effective Code Reviews
- Streamlining the Review Process
- Elevating Reviews with Automated Tools
By the end of this presentation, you'll have the knowledge on how to organize and improve your code review proces
Enhancing Project Management Efficiency_ Leveraging AI Tools like ChatGPT.pdfJay Das
With the advent of artificial intelligence or AI tools, project management processes are undergoing a transformative shift. By using tools like ChatGPT, and Bard organizations can empower their leaders and managers to plan, execute, and monitor projects more effectively.
Globus Compute wth IRI Workflows - GlobusWorld 2024Globus
As part of the DOE Integrated Research Infrastructure (IRI) program, NERSC at Lawrence Berkeley National Lab and ALCF at Argonne National Lab are working closely with General Atomics on accelerating the computing requirements of the DIII-D experiment. As part of the work the team is investigating ways to speedup the time to solution for many different parts of the DIII-D workflow including how they run jobs on HPC systems. One of these routes is looking at Globus Compute as a way to replace the current method for managing tasks and we describe a brief proof of concept showing how Globus Compute could help to schedule jobs and be a tool to connect compute at different facilities.
Innovating Inference - Remote Triggering of Large Language Models on HPC Clus...Globus
Large Language Models (LLMs) are currently the center of attention in the tech world, particularly for their potential to advance research. In this presentation, we'll explore a straightforward and effective method for quickly initiating inference runs on supercomputers using the vLLM tool with Globus Compute, specifically on the Polaris system at ALCF. We'll begin by briefly discussing the popularity and applications of LLMs in various fields. Following this, we will introduce the vLLM tool, and explain how it integrates with Globus Compute to efficiently manage LLM operations on Polaris. Attendees will learn the practical aspects of setting up and remotely triggering LLMs from local machines, focusing on ease of use and efficiency. This talk is ideal for researchers and practitioners looking to leverage the power of LLMs in their work, offering a clear guide to harnessing supercomputing resources for quick and effective LLM inference.
AI Pilot Review: The World’s First Virtual Assistant Marketing SuiteGoogle
AI Pilot Review: The World’s First Virtual Assistant Marketing Suite
👉👉 Click Here To Get More Info 👇👇
https://sumonreview.com/ai-pilot-review/
AI Pilot Review: Key Features
✅Deploy AI expert bots in Any Niche With Just A Click
✅With one keyword, generate complete funnels, websites, landing pages, and more.
✅More than 85 AI features are included in the AI pilot.
✅No setup or configuration; use your voice (like Siri) to do whatever you want.
✅You Can Use AI Pilot To Create your version of AI Pilot And Charge People For It…
✅ZERO Manual Work With AI Pilot. Never write, Design, Or Code Again.
✅ZERO Limits On Features Or Usages
✅Use Our AI-powered Traffic To Get Hundreds Of Customers
✅No Complicated Setup: Get Up And Running In 2 Minutes
✅99.99% Up-Time Guaranteed
✅30 Days Money-Back Guarantee
✅ZERO Upfront Cost
See My Other Reviews Article:
(1) TubeTrivia AI Review: https://sumonreview.com/tubetrivia-ai-review
(2) SocioWave Review: https://sumonreview.com/sociowave-review
(3) AI Partner & Profit Review: https://sumonreview.com/ai-partner-profit-review
(4) AI Ebook Suite Review: https://sumonreview.com/ai-ebook-suite-review
OpenFOAM solver for Helmholtz equation, helmholtzFoam / helmholtzBubbleFoamtakuyayamamoto1800
In this slide, we show the simulation example and the way to compile this solver.
In this solver, the Helmholtz equation can be solved by helmholtzFoam. Also, the Helmholtz equation with uniformly dispersed bubbles can be simulated by helmholtzBubbleFoam.
Software Engineering, Software Consulting, Tech Lead.
Spring Boot, Spring Cloud, Spring Core, Spring JDBC, Spring Security,
Spring Transaction, Spring MVC,
Log4j, REST/SOAP WEB-SERVICES.
Enhancing Research Orchestration Capabilities at ORNL.pdfGlobus
Cross-facility research orchestration comes with ever-changing constraints regarding the availability and suitability of various compute and data resources. In short, a flexible data and processing fabric is needed to enable the dynamic redirection of data and compute tasks throughout the lifecycle of an experiment. In this talk, we illustrate how we easily leveraged Globus services to instrument the ACE research testbed at the Oak Ridge Leadership Computing Facility with flexible data and task orchestration capabilities.
Large Language Models and the End of ProgrammingMatt Welsh
Talk by Matt Welsh at Craft Conference 2024 on the impact that Large Language Models will have on the future of software development. In this talk, I discuss the ways in which LLMs will impact the software industry, from replacing human software developers with AI, to replacing conventional software with models that perform reasoning, computation, and problem-solving.
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...Juraj Vysvader
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I didn't get rich from it but it did have 63K downloads (powered possible tens of thousands of websites).
We describe the deployment and use of Globus Compute for remote computation. This content is aimed at researchers who wish to compute on remote resources using a unified programming interface, as well as system administrators who will deploy and operate Globus Compute services on their research computing infrastructure.
Gamify Your Mind; The Secret Sauce to Delivering Success, Continuously Improv...Shahin Sheidaei
Games are powerful teaching tools, fostering hands-on engagement and fun. But they require careful consideration to succeed. Join me to explore factors in running and selecting games, ensuring they serve as effective teaching tools. Learn to maintain focus on learning objectives while playing, and how to measure the ROI of gaming in education. Discover strategies for pitching gaming to leadership. This session offers insights, tips, and examples for coaches, team leads, and enterprise leaders seeking to teach from simple to complex concepts.
3. Target
● Developers who build a lot of code
● Maintainers who backport lots of patches
● People who have to debug and bisect
● Developers having to use very slow laptops
(“ultrabooks”)
● Those who like to have fun with clusters
● Others ?
4. Observations
● Backporting fixes into old kernels is not trivial
● It often causes build failures
● Need to build a lot to validate backports
● Build time dominates in a backport session
● Not always building from the same place
● I spend a lot of time building distros as well...
5. How is time spent
● Lots of time spent between keyboard and chair
● Few short build sessions (< 10s)
● Many medium build sessions (~ 1mn)
● Few long build sessions (>15mn)
● You never know how long it takes
The goal is in fact to reduce the wait time!
6. Ideas on how to reduce build time
● Stop testing backports and rely on previews :-)
● Release more often with less patches
● Buy a bigger machine
● Use a compiler cache
● Distribute the build over several machines
7. Ideas on how to reduce build time
● Stop testing backports and rely on previews :-)
● Release more often with less patches
● Buy a bigger machine
● Use a compiler cache
● Distribute the build over several machines
8. Ideas on how to reduce build time
● Stop testing backports and rely on previews :-)
● Release more often with less patches
● Buy a bigger machine
● Use a compiler cache
● Distribute the build over several machines
9. Ideas on how to reduce build time
● Stop testing backports and rely on previews :-)
● Release more often with less patches
● Buy a bigger machine
● Use a compiler cache
● Distribute the build over several machines
10. Ideas on how to reduce build time
● Stop testing backports and rely on previews :-)
● Release more often with less patches
● Buy a bigger machine
● Use a compiler cache
● Distribute the build over several machines
11. Ideas on how to reduce build time
● Stop testing backports and rely on previews :-)
● Release more often with less patches
● Buy a bigger machine
● Use a compiler cache
● Distribute the build over several machines
12. Distributing the build ?
For this, you need :
● A distributable workload (not kidding)
● Multiple machines
● The exact same compiler everywhere
● Some way to submit the job to these machines
● A low enough latency
13. Distributable workloads
Quite a few requirements :
● Strong support for parallel builds (make -j)
● No dependency on the build node...
● ...or ability to replicate the build environment
● Many more C files to build than machines
● Homogenous build times (hence sizes)
=> The kernel fits pretty well here.
14. Distributing the build ?
For this, you need :
● A distributable workload (not kidding)
● Multiple machines
● The exact same compiler everywhere
● Some way to submit the job to these machines
● A low enough latency
15. Distributing the build ?
For this, you need :
● A distributable workload (not kidding)
● Multiple machines
● The exact same compiler everywhere
● Some way to submit the job to these machines
● A low enough latency
16. Same compiler everywhere ?
It is absolutelely mandatory
=> Use crosstool-ng for this even for the local node
● Reusable, archivable configurations
● Builds relocatable toolchains that run everywhere
● Supports Canadian Cross compilers
● Supports bare-metal compilers
● Use of arch-vendor-os-abi naming allows many different toolchains to
coexist
Hint: set the gcc version in the vendor field, eg x86_64-gcc47-linux-gnu
17. Distributing the build ?
For this, you need :
● A distributable workload (not kidding)
● Multiple machines
● The exact same compiler everywhere
● Some way to submit the job to these machines
● A low enough latency
18. Distributed build controller
This component will be responsible for distributing the load across all the
machines. There are a few prerequisites :
● Must not be intrusive in the build process :
– No extra steps
– No patching
● Must present a very low overhead
● Must support cross-compilers
● Must be smart enough to fall back on the local node if any risk
● Should ignore unreachable machines
=> distcc is exactly all this
19. A few words on distcc
● Works either as a wrapper or in masqueraded mode :
$ ln -s /usr/bin/distcc x86_64-gcc47-linux-gnu-gcc
$ make -j 20 CC=$PWD/x86_64-gcc47-linux-gnu-gcc
● Automatically detects options involving the local node.
● Can use environment variables for nodes list
● Supports per-node usage limits
● Uses file-based node locking (no deamon)
● Detects dead nodes and can sometimes retry locally
20. Warnings on distcc
● Will not use remote nodes if gcov profiling is
enabled
=> sed -i ‘s,.*(CONFIG_GCOV_KERNEL).*,1=n,’ .config
● Pre-processing and linking are local and
consume some time (approx 20-30%)
21. How to compare performance
● Depends on number of files, CPU count,
parallelism, etc...
● Need to use a portable project as a reference to
compare all machines, and stick to that version
● Count the project’s line count to establish a metric
in lines per second.
– Hint: replace “gcc -c” with “gcc -S”, build, and count
non-empty lines not starting with “#”.
– Eg: 458870 lines for haproxy-6bcb0a84.
22. How to compare performance (...)
● Set cpufreq governor to “performance”
● Run multiple times (get at least 3 similar results)
● For more details on the method, please consult
the wiki link at the end.
23. Suitable machines
The first question is :
What do we want to optimize ?
● Build performance for a given cost ?
● Number of nodes (ie switch ports) ?
● Hardware cost for a given performance level ?
● Hardware size / weight ?
● Power consumption / cooling / noise ?
24. Suitable machines (...)
The second question is :
What impacts performance ?
● CPU architecture and family (AMD Phenom and later, intel core
i3/i5/i7, ARM cortex A17 & A53 are efficient)
●
DRAM latency (target high frequency and large busses)
●
CPU caches sizes and latencies
● Storage access time if any (building in /dev/shm is better)
● Compiler build options (save ~10% by playing with ct-ng)
25. Suitable machines (...)
● The “BogoLocs” metric models expected performance
numbers and avoids uselessly testing unsuitable hardware :
26. Suitable machines (...)
● BogoLocs only uses ramlat output
● It is surprizingly accurate for a prediction
● It reveals is that the most important speed limiting
factors are the following, from most to least important :
– DRAM latency (time to load a cache line)
– L3 cache latency (if any)
– L2 cache latency (if any)
– CPU frequency
– Core count.
27. Optimizing for performance
Always ensure the build nodes are properly used :
● Top/vmstat should almost always show ~100% CPU
● Local machine should almost never stay around 100% CPU
● Network must not be saturated
Real world example :
● Controller: Thinkpad T430s (core i5 2*3.1 GHz). Avg 70% CPU
● Build nodes: 4*CS-008 (RK3288 4*1.8 GHz). 95% CPU
● Network: 180 Mbps avg, peaks at 450 Mbps.
● Theorical limit for this controller: ~6 build nodes.
28. Optimizing for performance (...)
PC-type machines are the fastest and can be tuned :
● Fill all memory channels with fastest DRAM available (20-50% gains possible)
● Enable Hyperthreading when available (~50% gain max)
● Overclocking is a possibility (~15% gain max)
●
Add more RAM to improve file caching
●
Less gain over 8 cores due to memory bottlenecks
… but they (are expensive) and (are huge or noisy (or both)) and (draw a lot of
power) but they are versatile.
Hint: chose motherboards, CPUs and RAM designed for gamers.
29. Optimizing for the number of nodes
Use case : limited connectivity, limited build
parallelism, boss willing to buy only one server, etc.
=> Need for highest performance per node
Dual-socket high-frequency, 8-core x86 machines with
all RAM slots populated have a huge memory
bandwidth and are not very expensive.
=> often bigger than the typical PC and draw more
power.
30. Optimizing for hardware costs
In 2016, a quad-core 1.2 GHz machine costs $8.
Eg: NanoPI NEO :
http://www.friendlyarm.com/index.php?route=product/product&path=69&product_id=132
31. Optimizing for hardware costs (...)
But does it really work ?
● Yes it does, but it will be around 16 times slower
than a $800 PC.
● It may be even slower once it starts heating and
throttles
● It is limited to 100 Mbps per node (ok for this
one)
… and there are hidden costs
32. Hidden costs of very small hardware
● Shipping costs : ~$5 per machine
● Connectivity : a switch costs at least ~$2 per port
● Ethernet cable : ~$1 per cable
● A heatsink is needed to avoid throttling : $1
● A micro-SD card is needed to boot : $3
● A USB power supply is needed : $2
=> Total: ~$22 per machine for 1/16 of a $800 PC =
$350 per PC equivalent, not very interesting
33. Pitfalls of very small hardware
● Large files will take 15-20 times as long as they take on
a PC. A single large file can extend the build time.
● Quickly require some enclosures or rack mount options
● Heating, heating, heating...
● Stability issues : many vendors sell overclocked CPUs
(eg: some OrangePI claim 1.6 GHz for this 1.2 GHz
CPU)
● Sometimes advertised frequency cannot be reached
34. Mid-range hardware
● Often sold as “small PCs”
● $25-$60 price range
● CPUs up to quad-cortex A9 at 2 GHz
● Eg: NanoPI2-Fire and Odroid-C2:
35. Mid-range hardware (...)
Pros:
● Very interesting form factor, better cooling
● High performance density
● Some available in 8 A53 cores (eg: NanoPI-M3)
Cons:
● Comparatively more expensive per board
● Some platforms heat a lot (eg: Allwinner chips)
● Often forced to run their own kernel
36. High-range hardware
● Often sold as “set-top boxes” (STB)
● $50-$200 price range
● CPUs: 4-8 A17/A53 at 1.5-2.0 GHz
● Eg: MiQi, RKM-v5, CS-008 (clone) :
37. High-range hardware (...)
Pros:
● Very high performance density (up to ¼ of a $800 PC)
● Gigabit Ethernet connectivity (not always)
● On-board eMMC storage
● Often ultimately gets supported in mainline (eg: MiQi)
Cons:
● Often sold on Android, porting can be a pain (but fun)
● Setting up a chroot in android doesn’t always work due to
selinux
38. High-range hardware (...)
Warnings:
● Varying build quality (eg: CS-008 clones with fake RAM
chips)
● Varying software quality for cheaper clones
● Kernel lying on the real frequency
● Advertised power draw is lower than reality, need for a
strong USB power supply.
● Thermal throttling may happen on heating, heatsinks
are always needed even if sold without
40. Current Implementation
4 CS-008 at
1.8 GHz, 5-
port GigE
switch, 5x2A
PSU. Total
cost: $280 for
16 A17 cores
and 8 GB
RAM. Builds
allmodconfig
in 14 mn vs
43 mn on
laptop.
41. Future research
● Experiment with distcc’s “pump” mode
● Use haproxy in front of distccd :
● Avoid the switch for the portable version :
– Try WiFi with LZO (unsuited without, 180-450 Mbps for 4
machines).
– Try USB-NET on OTG ports with LZO over a USB hub (some USB
power supplies include the hub).
42. More or less important hints
● Don’t focus on kernel optimization/support as long as it is able to quickly and
reliably execute gcc.
● Any lightweight distro works (even a chroot inside Android if selinux is not too
strict). Using “formilux” here (~10 MB).
● Check with “top” that there’s no CPU hog on the machine (lightdm, systemd)
● Use “ramlat” to measure the performance impact of graphics mode. Disable,
reduce, rebuild without support, or kill Xorg.
● Unlocked core i7 are amazing when stuffed with fast DRAM
● Xeons and Atoms are not really worth the price.
● Cortex A17 is very strong (half a core i5 at 2/3 the frequency) but expensive
● Cortex A53 is 2/3 its performance and much cheaper but often provided with slow
RAM and more cores
● Cooling: it’s up to you (passive, active, ...)
43. More or less important hints (...)
● configure leds as heartbeat to spot dead machines
● On a desk, daisy-chain clusters : 3 per 5-port
switch, 6 per 8-port switch.
● Overclocking: sometimes OK, especially for build
testing, or on “gamers” machines.
● Use power-meters to ensure the power usage
remains below 10W per node for fanless operation
44. More or less important hints (...)
● configure leds as heartbeat to spot dead machines
● On a desk, daisy-chain clusters : 3 per 5-port switch, 6 per 8-port
switch:
● Overclocking: sometimes OK, especially for build testing, or on
“gamers” machines.
● Use power-meters to ensure the power usage remains below 10W
per node for fanless operation