Status update the Hardware Procurement section's effort in implementing hardware monitoring with collectd by Luca Gardi luca.gardi@CERN.CH IT-CF-FPP (CERN IT Computing Facilities - Facilitiy Planning and procurement)
AndesClarity is a pipeline visualizer and analyzer for Andes V5 vector processors. It graphically represents instruction execution and pipeline stages with performance information. It helps optimize algorithms by identifying bottlenecks and stalls. The document provides an example of using AndesClarity to optimize a fast discrete cosine transform algorithm through four iterations. Each optimization interleaves instructions to better utilize the vector processor's functional units and reduce dependencies between iterations.
This document summarizes a presentation on static partitioning virtualization for RISC-V. It discusses the motivation for embedded virtualization, an overview of static partitioning hypervisors like Jailhouse and Xen, and the Bao hypervisor. It then provides an overview of the RISC-V hypervisor specification and extensions, including implemented features. It evaluates the performance overhead and interrupt latency of a prototype RISC-V hypervisor implementation with and without interference mitigations like cache partitioning.
In recent years Graphic Processing Units (GPUs) have seen widespread adoption in many scientific fields, from Machine Learning (ML) to Genomics. Their use makes it possible to achieve significant speedups and improvements in power efficiency over computationally intensive algorithms compared to General Purpose Central Processing Units (CPUs).
However, algorithms require specific knowledge of the GPU architecture and expertise to achieve significant results.
In this work, we describe a methodology for automatic GPU kernel optimization.
Our methodology exploits the Berkeley Roofline Model to perform a performance analysis of the algorithm considered and aims to increase the accessibility of GPU programming automatizing the optimization process of the kernel.
We provide an in-depth analysis of this methodology, an overview on the state of the art, and a description of a tool we developed that automatically applies our methodology to obtain a highly optimized GPU version of two of the most popular algorithms used in computational biology, the X-drop and Smith-Waterman algorithms.
The Smith-Waterman algorithm is one of the most used algorithms in genomics pipelines.
The algorithm finds the optimal local alignment between two genomic sequences, at the cost of being particularly compute-intensive.
The popular X-drop algorithm reduces the time required by the alignment by searching only for high-quality alignments.
The algorithms accelerated using our methodology achieve more than 6x and 3x speed-up, for the X-drop and Smith-Waterman algorithms respectively, with respect to the state of the art implementation of these algorithms.
Getting started with RISC-V verification what's next after compliance testingRISC-V International
The document discusses the CPU design verification (DV) process for RISC-V processors and the challenges presented by RISC-V's open standard nature. It covers developing a verification plan, obtaining tests and models, running simulations, and verifying until coverage metrics are met. Key aspects include using a reference model for configuration and comparison, techniques like self-check, signature comparison, trace logging and step-and-compare, and test suites like riscv-compliance. The presenter demonstrates step-and-compare verification between an Imperas reference model and RISC-V RTL using open source tools and models.
This document provides an overview of upstreaming code to the Linux kernel. It defines upstreaming as moving software into the main Linux repository hosted on kernel.org. It describes the process of upstreaming, including preparing code, creating patches, getting feedback on mailing lists, and maintaining code after it is accepted. It emphasizes the importance of understanding kernel frameworks, mailing lists, merge windows, and working with maintainers. The target audience is developers and engineering managers interested in contributing code to the Linux kernel.
1. Logically split the work between those responsible for the device tree binding, any framework changes, the driver code, and DTS additions.
2. Create git commits for the device tree binding, driver implementation, and DTS changes in a logical series.
3. Post the commit series to the appropriate mailing lists after addressing any feedback, with cover letter, signatures, and CCing maintainers.
Accelerated Linux Core Dump Analysis training public slidesDmitry Vostokov
The slides from Software Diagnostics Services Linux core dump analysis training. The training description: "Learn how to analyse Linux process crashes and hangs, navigate through process core memory dump space and diagnose corruption, memory leaks, CPU spikes, blocked threads, deadlocks, wait chains, and much more. This book uses a unique and innovative pattern-oriented diagnostic analysis approach to speed up the learning curve. The training consists of 13 practical step-by-step exercises using GDB debugger highlighting more than 25 memory analysis patterns diagnosed in 64-bit process core memory dumps. The training also includes source code of modelling applications, a catalogue of relevant patterns from Software Diagnostics Institute, and an overview of relevant similarities and differences between Windows and Linux user space memory dump analysis useful for engineers with Wintel background."
LAS16-500: The Rise and Fall of Assembler and the VGIC from HellLinaro
LAS16-500: The Rise and Fall of Assembler and the VGIC from Hell
Speakers: Marc Zyngier, Christoffer Dall
Date: September 30, 2016
★ Session Description ★
KVM/ARM has grown up. While the initial implementation of virtualization support for ARM processors in Linux was a quality upstream software project, there were initial design decisions simply not suitable for a long-term maintained hypervisor code base. For example, the way KVM/ARM utilized the hardware support for virtualization, was by running a ‘switching’ layer of code in EL2, purely written in assembly. This was a reasonable design decision in the initial implementation, as the switching layer only had to do one thing: Switch between a VM and the host. But as we began to optimize the implementation, add support for ARMv8.1 and VHE, and added features such as debugging support, we had to move to a more integrated approach, writing the switching logic in C code as well. As another example, the support for virtual interrupts, famously known as the VGIC, was designed with a focus on optimizing MMIO operations. As it turns out, MMIO operations is a less important and infrequent operation on the GIC, and the design had some serious negative consequences for supporting other state transitions for virtual interrupts and had negative performance implications. Therefore, we completely redesigned the VGIC support, and implemented the whole thing from scratch as a team effort, with a very promising result, upstream since Linux v4.7. In this talk we will cover the evolution of this software project and give an overview of the state of the project as it is today.
★ Resources ★
Etherpad: pad.linaro.org/p/las16-500
Presentations & Videos: http://connect.linaro.org/resource/las16/las16-500/
★ Event Details ★
Linaro Connect Las Vegas 2016 – #LAS16
September 26-30, 2016
http://www.linaro.org
http://connect.linaro.org
AndesClarity is a pipeline visualizer and analyzer for Andes V5 vector processors. It graphically represents instruction execution and pipeline stages with performance information. It helps optimize algorithms by identifying bottlenecks and stalls. The document provides an example of using AndesClarity to optimize a fast discrete cosine transform algorithm through four iterations. Each optimization interleaves instructions to better utilize the vector processor's functional units and reduce dependencies between iterations.
This document summarizes a presentation on static partitioning virtualization for RISC-V. It discusses the motivation for embedded virtualization, an overview of static partitioning hypervisors like Jailhouse and Xen, and the Bao hypervisor. It then provides an overview of the RISC-V hypervisor specification and extensions, including implemented features. It evaluates the performance overhead and interrupt latency of a prototype RISC-V hypervisor implementation with and without interference mitigations like cache partitioning.
In recent years Graphic Processing Units (GPUs) have seen widespread adoption in many scientific fields, from Machine Learning (ML) to Genomics. Their use makes it possible to achieve significant speedups and improvements in power efficiency over computationally intensive algorithms compared to General Purpose Central Processing Units (CPUs).
However, algorithms require specific knowledge of the GPU architecture and expertise to achieve significant results.
In this work, we describe a methodology for automatic GPU kernel optimization.
Our methodology exploits the Berkeley Roofline Model to perform a performance analysis of the algorithm considered and aims to increase the accessibility of GPU programming automatizing the optimization process of the kernel.
We provide an in-depth analysis of this methodology, an overview on the state of the art, and a description of a tool we developed that automatically applies our methodology to obtain a highly optimized GPU version of two of the most popular algorithms used in computational biology, the X-drop and Smith-Waterman algorithms.
The Smith-Waterman algorithm is one of the most used algorithms in genomics pipelines.
The algorithm finds the optimal local alignment between two genomic sequences, at the cost of being particularly compute-intensive.
The popular X-drop algorithm reduces the time required by the alignment by searching only for high-quality alignments.
The algorithms accelerated using our methodology achieve more than 6x and 3x speed-up, for the X-drop and Smith-Waterman algorithms respectively, with respect to the state of the art implementation of these algorithms.
Getting started with RISC-V verification what's next after compliance testingRISC-V International
The document discusses the CPU design verification (DV) process for RISC-V processors and the challenges presented by RISC-V's open standard nature. It covers developing a verification plan, obtaining tests and models, running simulations, and verifying until coverage metrics are met. Key aspects include using a reference model for configuration and comparison, techniques like self-check, signature comparison, trace logging and step-and-compare, and test suites like riscv-compliance. The presenter demonstrates step-and-compare verification between an Imperas reference model and RISC-V RTL using open source tools and models.
This document provides an overview of upstreaming code to the Linux kernel. It defines upstreaming as moving software into the main Linux repository hosted on kernel.org. It describes the process of upstreaming, including preparing code, creating patches, getting feedback on mailing lists, and maintaining code after it is accepted. It emphasizes the importance of understanding kernel frameworks, mailing lists, merge windows, and working with maintainers. The target audience is developers and engineering managers interested in contributing code to the Linux kernel.
1. Logically split the work between those responsible for the device tree binding, any framework changes, the driver code, and DTS additions.
2. Create git commits for the device tree binding, driver implementation, and DTS changes in a logical series.
3. Post the commit series to the appropriate mailing lists after addressing any feedback, with cover letter, signatures, and CCing maintainers.
Accelerated Linux Core Dump Analysis training public slidesDmitry Vostokov
The slides from Software Diagnostics Services Linux core dump analysis training. The training description: "Learn how to analyse Linux process crashes and hangs, navigate through process core memory dump space and diagnose corruption, memory leaks, CPU spikes, blocked threads, deadlocks, wait chains, and much more. This book uses a unique and innovative pattern-oriented diagnostic analysis approach to speed up the learning curve. The training consists of 13 practical step-by-step exercises using GDB debugger highlighting more than 25 memory analysis patterns diagnosed in 64-bit process core memory dumps. The training also includes source code of modelling applications, a catalogue of relevant patterns from Software Diagnostics Institute, and an overview of relevant similarities and differences between Windows and Linux user space memory dump analysis useful for engineers with Wintel background."
LAS16-500: The Rise and Fall of Assembler and the VGIC from HellLinaro
LAS16-500: The Rise and Fall of Assembler and the VGIC from Hell
Speakers: Marc Zyngier, Christoffer Dall
Date: September 30, 2016
★ Session Description ★
KVM/ARM has grown up. While the initial implementation of virtualization support for ARM processors in Linux was a quality upstream software project, there were initial design decisions simply not suitable for a long-term maintained hypervisor code base. For example, the way KVM/ARM utilized the hardware support for virtualization, was by running a ‘switching’ layer of code in EL2, purely written in assembly. This was a reasonable design decision in the initial implementation, as the switching layer only had to do one thing: Switch between a VM and the host. But as we began to optimize the implementation, add support for ARMv8.1 and VHE, and added features such as debugging support, we had to move to a more integrated approach, writing the switching logic in C code as well. As another example, the support for virtual interrupts, famously known as the VGIC, was designed with a focus on optimizing MMIO operations. As it turns out, MMIO operations is a less important and infrequent operation on the GIC, and the design had some serious negative consequences for supporting other state transitions for virtual interrupts and had negative performance implications. Therefore, we completely redesigned the VGIC support, and implemented the whole thing from scratch as a team effort, with a very promising result, upstream since Linux v4.7. In this talk we will cover the evolution of this software project and give an overview of the state of the project as it is today.
★ Resources ★
Etherpad: pad.linaro.org/p/las16-500
Presentations & Videos: http://connect.linaro.org/resource/las16/las16-500/
★ Event Details ★
Linaro Connect Las Vegas 2016 – #LAS16
September 26-30, 2016
http://www.linaro.org
http://connect.linaro.org
This document discusses using fuzzing to generate tests for RISC-V compliance testing. It proposes extending an LLVM-based fuzzer with custom mutators and coverage metrics tailored for RISC-V. Experimental results found bugs in several RISC-V simulators, demonstrating the effectiveness of fuzzing for negative compliance testing. The approach generates platform-independent assembly tests and filters invalid tests. It leverages an open-source RISC-V virtual prototype for test execution.
An Essential Relationship between Real-time and Resource PartitioningYoshitake Kobayashi
(ELC Europe 2013)
Running real-time and general purpose applications on a same hardware is normally a crazy idea in most case. However, we strongly focus to run both applications on a hardware without virtualization. Resource Partitioning enables the assignment of hardware resource (e.g.: core, execution time, memory bandwidth or device access) to processes with special requirements (e.g: real-time performance or safety requirements).
In this talk, we would like to discuss current limitation on Linux kernel and describe how to solve it.
BKK16-302: Android Optimizing Compiler: New Member Assimilation GuideLinaro
A tour of essential topics for working on the Android Optimizing Compiler, with a special emphasis on helping new engineers integrate and hit the ground running. Learn how to work on intrinsics, instruction simplification, platform specific optimizations, how to submit good patches, write Checker tests, analyse IR, take boot.oat measurements, and debug performance and execution issues with Streamline and GDB.
RISC-V NOEL-V - A new high performance RISC-V Processor FamilyRISC-V International
This document summarizes the NOEL-V processor family from Cobham Gaisler. It describes the NOEL-V as a RISC-V compliant 64-bit processor with fault tolerance features. It provides details on the processor architecture, peripherals, software ecosystem, verification process, and commercial and open source availability. Examples of projects adopting the NOEL-V include the European H2020 funded De-RISC and SELENE projects for safety-critical computing.
BKK16-400A LuvOS and ACPI Compliance TestingLinaro
ARM server hardware will be shipping in 2016. An incredible amount of work has been done to get this far -- defining and implementing industry standards used by servers, development and testing of SoCs, and all sorts of Linux kernel work. So, how do we make sure we meet all these industry standards?
To a great extent, we've relied on magical thinking so far. That works, but only for so long. LuvOS and FWTS were created in order to catch many of the problems users have found; in this presentation, we describe how we have started extending FWTS to check for standards compliance, specifically ACPI and the SBBR, and how we can use LuvOS to run FWTS and other test suites so that we can rely on hard data, and not just wishful thinking.
The document discusses Macpaul Lin's experience porting U-boot to the NDS32 architecture and sharing lessons learned from open source embedded software development. It covers the software architecture of U-boot, the boot process, code commit rules, patch workflow, and coding style guidelines. The presentation provides an introduction and technical overview of porting U-boot to a new CPU architecture.
LAS16-201: ART JIT in Android N
Speakers: Xueliang Zhong
Date: September 27, 2016
★ Session Description ★
Android runtime (ART) has evolved from an AOT compiler (in Android L & M) to a hybrid mode runtime (in Android N) which combines fast interpreter, JIT compiler and profile guided AOT compiler. In this talk, we’ll take a look at all these important changes in Android N. For example, the design and implementation of JIT, hybrid mode, tooling support, etc. This talk is meant to help Linaro members and developers to have a deeper understanding of ART in Android N, and to help them face the challenges of the new behaviors of Android runtime.
★ Resources ★
Etherpad: pad.linaro.org/p/las16-201
Presentations & Videos: http://connect.linaro.org/resource/las16/las16-201/
★ Event Details ★
Linaro Connect Las Vegas 2016 – #LAS16
September 26-30, 2016
http://www.linaro.org
http://connect.linaro.org
LAS16-106: GNU Toolchain Development LifecycleLinaro
LAS16-106: GNU Toolchain Development Lifecycle
Speakers: Ryan Arnold
Date: September 26, 2016
★ Session Description ★
This presentation will examine the lifecycle of toolchain development from inception of the micro-architecture, to development of the ISA, to delivery of OS enablement in FOSS projects, to adoption in Linux Distributions. It will examine the behaviors of successful silicon vendors as well as behaviors of vendors that struggle to get their platform fully enabled in the GNU/Linux OS.
★ Resources ★
Etherpad: pad.linaro.org/p/las16-106
Presentations & Videos: http://connect.linaro.org/resource/las16/las16-106/
★ Event Details ★
Linaro Connect Las Vegas 2016 – #LAS16
September 26-30, 2016
http://www.linaro.org
http://connect.linaro.org
Kernel Recipes 2013 - Deciphering OopsiesAnne Nicolas
The Linux kernel is a very complex beast living in millions of households and data centers around the world. Normally, you’re not supposed to notice its presence but when it gets cranky because of something not suiting it, it spits crazy messages called colloquially
oopses and panics.
In this talk, we’re going to try to understand how to read those messages in order to be able to address its complaints so that it can get back to work for us.
Tech talk with lampro mellon an open source solution for accelerating verific...RISC-V International
This document describes an open-source design verification environment called LM RISC-V DV for accelerating the design and verification of RISC-V processors. It leverages existing open-source RISC-V IPs, including the SweRV EH1 core, RISCV-DV instruction generator, and Spike ISS. The environment provides automated test generation, RTL simulation, ISS simulation, post-simulation comparison and coverage analysis. The environment is available on GitHub to help the open-source RISC-V ecosystem. Future work includes support for co-simulation and additional SweRV cores.
Tommaso Cucinotta - Low-latency and power-efficient audio applications on Linuxlinuxlab_conf
Building Linux-based low-latency audio processing software for nowadays multi-core devices can be cumbersome. I’ll present some of our on-going research on the topic at the Real-Time Systems Lab of Scuola Superiore Sant’Anna, focusing on sound synthesis on Android where power-efficiency is a must.
The talk will provide basic background information on how the audio sub-system of Linux works, in terms of interactions between the Linux kernel and the ALSA sound architecture, including how user-space applications normally cope with low-latency requirements, touching briefly on design concepts behind the existence of the JACK low-latency framework. Then, a few concepts will be provided on the peculiarities of the Android audio processing pipeline, crossing the concepts with the due complications arising from the world of mobile and power-efficient devices. Throughout the talk, I’ll touch upon concepts behind our research efforts on the topic, describing how properly designed real-time CPU scheduling strategies can make a difference in what is achievable in this area.
LAS16-407: Internet of Tiny Linux (IoTL): the sequel.Linaro
LAS16-407: Internet of Tiny Linux (IoTL): the sequel.
Speakers: Nicolas Pitre
Date: September 29, 2016
★ Session Description ★
This is a discussion on various methods put forward to reduce the size of Linux kernel and user space binaries to make them suitable for small IoT applications, ranging from low hanging fruits to more daring approaches. Results from on-going work will also be presented.
★ Resources ★
Etherpad: pad.linaro.org/p/las16-407
Presentations & Videos: http://connect.linaro.org/resource/las16/las16-407/
★ Event Details ★
Linaro Connect Las Vegas 2016 – #LAS16
September 26-30, 2016
http://www.linaro.org
http://connect.linaro.org
LAS16-210: Hardware Assisted Tracing on ARM with CoreSight and OpenCSDLinaro
LAS16-210: Hardware Assisted Tracing on ARM with CoreSight and OpenCSD
Speakers: Mathieu Poirier
Date: September 27, 2016
★ Session Description ★
The CoreSight framework available in the Linux kernel has recently been integrated with the standard Perf trace system, making HW assisted tracing on ARM systems accessible to developers working on a wide spectrum of products. This presentation will start by giving a brief overview of the CoreSight technology itself before presenting the current solution, from trace collection in kernel space to off system trace decoding. To help with the latter part the Open CoreSight Decoding Library (openCSD) is introduced. OpenCSD is an open source library assisting with the decoding of collected trace data. We will see how it is used with the existing perf tools to provide an end-to-end solution for CoreSight trace decoding. The presentation will conclude with trace acquisition and decoding scenarios, along with tips on how to interpret trace information rendered by the perf tools.
★ Resources ★
Etherpad: pad.linaro.org/p/las16-210
Presentations & Videos: http://connect.linaro.org/resource/las16/las16-210/
★ Event Details ★
Linaro Connect Las Vegas 2016 – #LAS16
September 26-30, 2016
http://www.linaro.org
http://connect.linaro.org
LAS16-501: Introduction to LLVM - Projects, Components, Integration, InternalsLinaro
LAS16-501: Introduction to LLVM - Projects, Components, Integration, Internals
Speakers: Renato Golin
Date: September 30, 2016
★ Session Description ★
Deep dive into LLVM internals, middle/back-ends, libraries, sanitizers, linker, debugger and overall compilation process. The focus is to show how LLVM works under the hood, which is useful for GCC compiler engineers getting into LLVM development, as well as for other engineers to learn more about parts of the toolchain they’re not familiar with. This presentation also touches on frequent LLVM-specific errors, so GCC users may find useful, if they’re moving to LLVM.
★ Resources ★
Etherpad: pad.linaro.org/p/las16-501
Presentations & Videos: http://connect.linaro.org/resource/las16/las16-501/
★ Event Details ★
Linaro Connect Las Vegas 2016 – #LAS16
September 26-30, 2016
http://www.linaro.org
http://connect.linaro.org
Перехват функций (хуки) под Windows в приложениях с помощью C/C++corehard_by
This document provides an overview of function hooking techniques in Windows applications using C/C++. It discusses various hooking methods like DLL injection, API hooking using IAT/EAT patching, splicing hooks that modify machine code, and virtual table hooks for C++ applications. It covers topics like portable executable format, system calls, virtual method tables, and how to inject and hook functions in the Windows environment while avoiding detection. Examples of applications that use hooks are also provided, along with pros and cons of different techniques.
Andes enhancing verification coverage for risc v vector extension using riscv-dvRISC-V International
1) The document discusses using RISCV-DV, an open source framework, to enhance verification coverage for the RISC-V Vector extension.
2) It describes how RISCV-DV implements the Vector extension through instruction support, programmer's model support, and constrained random generation of vector programs.
3) A case study shows how Andes customized RISCV-DV to verify their NX27V processor implementation of the Vector extension through co-simulation and directed instruction streams.
TMT SequenceL customer use cases and resultsDoug Norton
1. SequenceL is a programming language that allows users to easily write parallel code that utilizes all CPU cores and GPUs for improved performance.
2. Several customers were able to achieve significant speedups over existing solutions by rewriting algorithms and applications in SequenceL, including over 2x, 17.8x, and 26x speedups in computational fluid dynamics simulations.
3. SequenceL also offers advantages in code readability, portability across platforms, and faster development times compared to alternative solutions like C, C++, OpenMP, and Matlab.
OSDC 2018 | Hardware-level data-center monitoring with Prometheus by Conrad H...NETWAYS
This document discusses hardware-level data center monitoring using Prometheus. It outlines the speaker's data center which contains over 2,000 servers and 200 network devices. It then provides a brief introduction to Prometheus, highlighting its reliability, scalability, flexibility and ease of integration. Several Prometheus exporters are described that monitor nodes, network devices, and other systems, replacing tools like Nagios, Ganglia and Cacti. Methods for merging data from different sources are demonstrated. The transition to Prometheus monitoring is deemed successful due to the many available integrations and ease of developing new ones.
Introduction to Civil Infrastructure PlatformSZ Lin
CIP is target to establish an open source base layer of industrial grade software to enable the use and implementation of software. This slide will introduce the current status and road map in CIP
This document discusses using fuzzing to generate tests for RISC-V compliance testing. It proposes extending an LLVM-based fuzzer with custom mutators and coverage metrics tailored for RISC-V. Experimental results found bugs in several RISC-V simulators, demonstrating the effectiveness of fuzzing for negative compliance testing. The approach generates platform-independent assembly tests and filters invalid tests. It leverages an open-source RISC-V virtual prototype for test execution.
An Essential Relationship between Real-time and Resource PartitioningYoshitake Kobayashi
(ELC Europe 2013)
Running real-time and general purpose applications on a same hardware is normally a crazy idea in most case. However, we strongly focus to run both applications on a hardware without virtualization. Resource Partitioning enables the assignment of hardware resource (e.g.: core, execution time, memory bandwidth or device access) to processes with special requirements (e.g: real-time performance or safety requirements).
In this talk, we would like to discuss current limitation on Linux kernel and describe how to solve it.
BKK16-302: Android Optimizing Compiler: New Member Assimilation GuideLinaro
A tour of essential topics for working on the Android Optimizing Compiler, with a special emphasis on helping new engineers integrate and hit the ground running. Learn how to work on intrinsics, instruction simplification, platform specific optimizations, how to submit good patches, write Checker tests, analyse IR, take boot.oat measurements, and debug performance and execution issues with Streamline and GDB.
RISC-V NOEL-V - A new high performance RISC-V Processor FamilyRISC-V International
This document summarizes the NOEL-V processor family from Cobham Gaisler. It describes the NOEL-V as a RISC-V compliant 64-bit processor with fault tolerance features. It provides details on the processor architecture, peripherals, software ecosystem, verification process, and commercial and open source availability. Examples of projects adopting the NOEL-V include the European H2020 funded De-RISC and SELENE projects for safety-critical computing.
BKK16-400A LuvOS and ACPI Compliance TestingLinaro
ARM server hardware will be shipping in 2016. An incredible amount of work has been done to get this far -- defining and implementing industry standards used by servers, development and testing of SoCs, and all sorts of Linux kernel work. So, how do we make sure we meet all these industry standards?
To a great extent, we've relied on magical thinking so far. That works, but only for so long. LuvOS and FWTS were created in order to catch many of the problems users have found; in this presentation, we describe how we have started extending FWTS to check for standards compliance, specifically ACPI and the SBBR, and how we can use LuvOS to run FWTS and other test suites so that we can rely on hard data, and not just wishful thinking.
The document discusses Macpaul Lin's experience porting U-boot to the NDS32 architecture and sharing lessons learned from open source embedded software development. It covers the software architecture of U-boot, the boot process, code commit rules, patch workflow, and coding style guidelines. The presentation provides an introduction and technical overview of porting U-boot to a new CPU architecture.
LAS16-201: ART JIT in Android N
Speakers: Xueliang Zhong
Date: September 27, 2016
★ Session Description ★
Android runtime (ART) has evolved from an AOT compiler (in Android L & M) to a hybrid mode runtime (in Android N) which combines fast interpreter, JIT compiler and profile guided AOT compiler. In this talk, we’ll take a look at all these important changes in Android N. For example, the design and implementation of JIT, hybrid mode, tooling support, etc. This talk is meant to help Linaro members and developers to have a deeper understanding of ART in Android N, and to help them face the challenges of the new behaviors of Android runtime.
★ Resources ★
Etherpad: pad.linaro.org/p/las16-201
Presentations & Videos: http://connect.linaro.org/resource/las16/las16-201/
★ Event Details ★
Linaro Connect Las Vegas 2016 – #LAS16
September 26-30, 2016
http://www.linaro.org
http://connect.linaro.org
LAS16-106: GNU Toolchain Development LifecycleLinaro
LAS16-106: GNU Toolchain Development Lifecycle
Speakers: Ryan Arnold
Date: September 26, 2016
★ Session Description ★
This presentation will examine the lifecycle of toolchain development from inception of the micro-architecture, to development of the ISA, to delivery of OS enablement in FOSS projects, to adoption in Linux Distributions. It will examine the behaviors of successful silicon vendors as well as behaviors of vendors that struggle to get their platform fully enabled in the GNU/Linux OS.
★ Resources ★
Etherpad: pad.linaro.org/p/las16-106
Presentations & Videos: http://connect.linaro.org/resource/las16/las16-106/
★ Event Details ★
Linaro Connect Las Vegas 2016 – #LAS16
September 26-30, 2016
http://www.linaro.org
http://connect.linaro.org
Kernel Recipes 2013 - Deciphering OopsiesAnne Nicolas
The Linux kernel is a very complex beast living in millions of households and data centers around the world. Normally, you’re not supposed to notice its presence but when it gets cranky because of something not suiting it, it spits crazy messages called colloquially
oopses and panics.
In this talk, we’re going to try to understand how to read those messages in order to be able to address its complaints so that it can get back to work for us.
Tech talk with lampro mellon an open source solution for accelerating verific...RISC-V International
This document describes an open-source design verification environment called LM RISC-V DV for accelerating the design and verification of RISC-V processors. It leverages existing open-source RISC-V IPs, including the SweRV EH1 core, RISCV-DV instruction generator, and Spike ISS. The environment provides automated test generation, RTL simulation, ISS simulation, post-simulation comparison and coverage analysis. The environment is available on GitHub to help the open-source RISC-V ecosystem. Future work includes support for co-simulation and additional SweRV cores.
Tommaso Cucinotta - Low-latency and power-efficient audio applications on Linuxlinuxlab_conf
Building Linux-based low-latency audio processing software for nowadays multi-core devices can be cumbersome. I’ll present some of our on-going research on the topic at the Real-Time Systems Lab of Scuola Superiore Sant’Anna, focusing on sound synthesis on Android where power-efficiency is a must.
The talk will provide basic background information on how the audio sub-system of Linux works, in terms of interactions between the Linux kernel and the ALSA sound architecture, including how user-space applications normally cope with low-latency requirements, touching briefly on design concepts behind the existence of the JACK low-latency framework. Then, a few concepts will be provided on the peculiarities of the Android audio processing pipeline, crossing the concepts with the due complications arising from the world of mobile and power-efficient devices. Throughout the talk, I’ll touch upon concepts behind our research efforts on the topic, describing how properly designed real-time CPU scheduling strategies can make a difference in what is achievable in this area.
LAS16-407: Internet of Tiny Linux (IoTL): the sequel.Linaro
LAS16-407: Internet of Tiny Linux (IoTL): the sequel.
Speakers: Nicolas Pitre
Date: September 29, 2016
★ Session Description ★
This is a discussion on various methods put forward to reduce the size of Linux kernel and user space binaries to make them suitable for small IoT applications, ranging from low hanging fruits to more daring approaches. Results from on-going work will also be presented.
★ Resources ★
Etherpad: pad.linaro.org/p/las16-407
Presentations & Videos: http://connect.linaro.org/resource/las16/las16-407/
★ Event Details ★
Linaro Connect Las Vegas 2016 – #LAS16
September 26-30, 2016
http://www.linaro.org
http://connect.linaro.org
LAS16-210: Hardware Assisted Tracing on ARM with CoreSight and OpenCSDLinaro
LAS16-210: Hardware Assisted Tracing on ARM with CoreSight and OpenCSD
Speakers: Mathieu Poirier
Date: September 27, 2016
★ Session Description ★
The CoreSight framework available in the Linux kernel has recently been integrated with the standard Perf trace system, making HW assisted tracing on ARM systems accessible to developers working on a wide spectrum of products. This presentation will start by giving a brief overview of the CoreSight technology itself before presenting the current solution, from trace collection in kernel space to off system trace decoding. To help with the latter part the Open CoreSight Decoding Library (openCSD) is introduced. OpenCSD is an open source library assisting with the decoding of collected trace data. We will see how it is used with the existing perf tools to provide an end-to-end solution for CoreSight trace decoding. The presentation will conclude with trace acquisition and decoding scenarios, along with tips on how to interpret trace information rendered by the perf tools.
★ Resources ★
Etherpad: pad.linaro.org/p/las16-210
Presentations & Videos: http://connect.linaro.org/resource/las16/las16-210/
★ Event Details ★
Linaro Connect Las Vegas 2016 – #LAS16
September 26-30, 2016
http://www.linaro.org
http://connect.linaro.org
LAS16-501: Introduction to LLVM - Projects, Components, Integration, InternalsLinaro
LAS16-501: Introduction to LLVM - Projects, Components, Integration, Internals
Speakers: Renato Golin
Date: September 30, 2016
★ Session Description ★
Deep dive into LLVM internals, middle/back-ends, libraries, sanitizers, linker, debugger and overall compilation process. The focus is to show how LLVM works under the hood, which is useful for GCC compiler engineers getting into LLVM development, as well as for other engineers to learn more about parts of the toolchain they’re not familiar with. This presentation also touches on frequent LLVM-specific errors, so GCC users may find useful, if they’re moving to LLVM.
★ Resources ★
Etherpad: pad.linaro.org/p/las16-501
Presentations & Videos: http://connect.linaro.org/resource/las16/las16-501/
★ Event Details ★
Linaro Connect Las Vegas 2016 – #LAS16
September 26-30, 2016
http://www.linaro.org
http://connect.linaro.org
Перехват функций (хуки) под Windows в приложениях с помощью C/C++corehard_by
This document provides an overview of function hooking techniques in Windows applications using C/C++. It discusses various hooking methods like DLL injection, API hooking using IAT/EAT patching, splicing hooks that modify machine code, and virtual table hooks for C++ applications. It covers topics like portable executable format, system calls, virtual method tables, and how to inject and hook functions in the Windows environment while avoiding detection. Examples of applications that use hooks are also provided, along with pros and cons of different techniques.
Andes enhancing verification coverage for risc v vector extension using riscv-dvRISC-V International
1) The document discusses using RISCV-DV, an open source framework, to enhance verification coverage for the RISC-V Vector extension.
2) It describes how RISCV-DV implements the Vector extension through instruction support, programmer's model support, and constrained random generation of vector programs.
3) A case study shows how Andes customized RISCV-DV to verify their NX27V processor implementation of the Vector extension through co-simulation and directed instruction streams.
TMT SequenceL customer use cases and resultsDoug Norton
1. SequenceL is a programming language that allows users to easily write parallel code that utilizes all CPU cores and GPUs for improved performance.
2. Several customers were able to achieve significant speedups over existing solutions by rewriting algorithms and applications in SequenceL, including over 2x, 17.8x, and 26x speedups in computational fluid dynamics simulations.
3. SequenceL also offers advantages in code readability, portability across platforms, and faster development times compared to alternative solutions like C, C++, OpenMP, and Matlab.
OSDC 2018 | Hardware-level data-center monitoring with Prometheus by Conrad H...NETWAYS
This document discusses hardware-level data center monitoring using Prometheus. It outlines the speaker's data center which contains over 2,000 servers and 200 network devices. It then provides a brief introduction to Prometheus, highlighting its reliability, scalability, flexibility and ease of integration. Several Prometheus exporters are described that monitor nodes, network devices, and other systems, replacing tools like Nagios, Ganglia and Cacti. Methods for merging data from different sources are demonstrated. The transition to Prometheus monitoring is deemed successful due to the many available integrations and ease of developing new ones.
Introduction to Civil Infrastructure PlatformSZ Lin
CIP is target to establish an open source base layer of industrial grade software to enable the use and implementation of software. This slide will introduce the current status and road map in CIP
In this video from the Blue Waters 2018 Symposium, Maxim Belkin presents a tutorial on Containers: Shifter and Singularity on Blue Waters.
Container solutions are a great way to seamlessly execute code on a variety of platforms. Not only they are used to abstract away from the software stack of the underlying operating system, they also enable reproducible computational research. In this mini-tutorial, I will review the process of working with Shifter and Singularity on Blue Waters.
Watch the video: https://wp.me/p3RLHQ-iXO
Learn more: https://bluewaters.ncsa.illinois.edu/blue-waters-symposium-2018
Sign up for our insideHPC Newsletter: http://insidehpc.com/newsletter
Using the new extended Berkley Packet Filter capabilities in Linux to the improve performance of auditing security relevant kernel events around network, file and process actions.
This document provides an update on PGI compilers and tools for heterogeneous supercomputing. It discusses PGI's support for OpenACC directives to accelerate applications on multicore CPUs and NVIDIA GPUs from a single source. It highlights new compiler features including support for Intel Skylake, AMD EPYC and IBM POWER9 CPUs as well as NVIDIA Volta GPUs. Benchmark results show strong performance of OpenACC applications on these platforms. The document also discusses the growing adoption of OpenACC in HPC applications and resources available to support OpenACC development.
Benmarking Orange Forge with CLIF, OW2con'15, November 17, Paris OW2
This presentation is given by Brunon Dillenseger, Orange.
Abstract: At the middle of this year, Orange switched its internal software forge (so-called Orange Forge) to a completely new hardware infrastructure and a new version of the Tuleap forge software. With more than 1000 connected users daily, among 16000 registered users worldwide, working on more than 5000 active projects, this switch was definitely a critical operation.
Could this new brand new Orange Forge platform cope with such a traffic? If so, what about users' quality of experience? This is what we tried to know using OW2's load testing project CLIF. Feedback says it was a good idea to do so.
2022 Cauldron analyzer talk from david malcolmssuser866937
The document summarizes new developments in GCC's static analyzer (-fanalyzer) option. Key points include 15 new -Wanalyzer warnings added in GCC 13 covering issues like misuse of variable argument functions and file descriptor handling. Diagnostic serialization formats were improved, allowing recording and playback of analysis results. Internal changes improved inlining and plugin support. Work is ongoing to analyze the Linux kernel and add new warnings for infinite recursion, dereferencing before checking for null, and other issues.
Puppet Camp Berlin 2015: Configuration Management @ CERN: Going Agile with StylePuppet
CERN uses Puppet for configuration management of its large infrastructure including 15k servers and 300PB of storage. Puppet is used to configure services across the infrastructure through modules, hostgroups, and environments. Changes go through a review process involving testing in custom environments before being merged to production. Tools like Jens, Jenkins, Rundeck, and Mcollective help automate operations and configuration changes at scale. Configuration drift and package inventory are challenges addressed through tools and snapshots. Puppet allows CERN to efficiently manage its resources in an agile manner while reducing intervention times.
Puppet Camp Berlin 2015: Andrea Giardini | Configuration Management @ CERN: G...NETWAYS
In 2011, CERN decided to start using Puppet as main tool for development, machines configuration and provisioning as replacement of Quattor.
Since then the infrastructure has changed a lot, the "Agile infrastructure" project evolved is a series of tools and softwares that currently allow more than 10.000 nodes to be configured and provisioned following custom definitions.
Foreman, Git, Openstack and our homemade librarian Jens are only a few of the tools that will be described during the talk, that aims to give an overview about the current workflow for machines lifecycle at CERN.
This talk will cover how Puppet allows us to deal with several hundred of installations a day and, at the same time, provide highly customizable machine configurations for service owners.
Review of the CLIP (Cloud Infrastructure Project) at the Vienna BioCenter. CLIP is an OpenStack deployment for research computing, set out to replace on-site legacy HPC systems. We talk about how we have setup our continuous deployment process, our infrastructure as code approach, continuous testing and verification, monitoring and the pitfalls and surprises we encountered along the way.
The document describes a Cisco Live 2014 presentation on advanced troubleshooting of Cisco Nexus 7000 series switches. It includes an agenda that covers system, data plane, and control plane troubleshooting over 120 minutes. It also discusses strategies, tools, and techniques for troubleshooting these different areas. Some key tools highlighted include show commands, scripts like SystemCheck, packet capture with ELAME, and analyzing logs. The presentation provides guidance on approaches for each troubleshooting area and highlights the extensive logging capabilities of NX-OS.
NVIDIA GTC 2019: Red Hat and the NVIDIA DGX: Tried, Tested, TrustedJeremy Eder
Red Hat and NVIDIA collaborated to bring together two of the technology industry's most popular products: Red Hat Enterprise Linux 7 and the NVIDIA DGX system. This talk will cover how the combination of RHELs rock-solid stability with the incredible DGX hardware can deliver tremendous value to enterprise data scientists. We will also show how to leverage NVIDIA GPU Cloud container images with Kubernetes and RHEL to reap maximum benefits from this incredible hardware.
Profiling deep learning network using NVIDIA nsight systemsJack (Jaegeun) Han
Jack Han presented on profiling deep learning networks using NVIDIA tools. He discussed annotating PyTorch models with NVTX to identify bottlenecks, optimizing data loading in PyTorch, and achieving a 4x speedup on BERT by using mixed precision and Tensor Cores. He also covered profiling TensorFlow graphs with NVTX plugins and command examples for profiling multi-GPU applications with Nsight Systems.
Bounded Model Checking for C Programs in an Enterprise EnvironmentAdaCore
This document discusses using bounded model checking to analyze C programs at scale in an enterprise environment. It describes compiling thousands of software packages using a tool called goto-cc that converts C code to an intermediate representation. This allows running verification tools to find bugs. Many bugs were found and reported, improving quality. The goal is to focus on developing verification methods and analyzing a large codebase to find more bugs and security issues.
FIWARE Global Summit - Real-time Media Stream Processing Using KurentoFIWARE
Kurento Media Server is an open source platform for processing audio and video streams. It allows input streams to be processed and output streams to be manipulated or redistributed. The server has endpoints to receive media and filters that can transform or process the media. Client applications connect to Kurento to build processing pipelines with these components and control the streaming applications.
Slides that were presented during the webrtc Qt Cmake tutorial at IIT-RTC in October 2017 in Chicago. The slides are not yet complete, and will be updated later.
This document provides an overview of a 10-week course on embedded systems using C programming. It discusses the design and implementation of a flexible scheduler for single-processor embedded systems over the course of two seminar sessions. The scheduler will allow tasks to be added and deleted dynamically. The document outlines the key topics that will be covered in each seminar, including an introduction to schedulers, building a basic scheduler, analog I/O, PID control, and linking multiple processors together using various communication protocols.
Similar to Hardware monitoring with collectd at CERN (20)
Google Calendar is a versatile tool that allows users to manage their schedules and events effectively. With Google Calendar, you can create and organize calendars, set reminders for important events, and share your calendars with others. It also provides features like creating events, inviting attendees, and accessing your calendar from mobile devices. Additionally, Google Calendar allows you to embed calendars in websites or platforms like SlideShare, making it easier for others to view and interact with your schedules.
Building a Raspberry Pi Robot with Dot NET 8, Blazor and SignalR - Slides Onl...Peter Gallagher
In this session delivered at Leeds IoT, I talk about how you can control a 3D printed Robot Arm with a Raspberry Pi, .NET 8, Blazor and SignalR.
I also show how you can use a Unity app on an Meta Quest 3 to control the arm VR too.
You can find the GitHub repo and workshop instructions here;
https://bit.ly/dotnetrobotgithub
1. Computing Facilities
CERN IT Department
CH-1211 Geneva 23
Switzerland
www.cern.ch/it
CF
Hardware monitoring with collectd
Luca Gardi - luca.gardi@cern.ch
2. CERN IT Department
CH-1211 Geneva 23
Switzerland
www.cern.ch/it
CF Introduction
● explain the differences between Lemon and collectd
● summarize needed changes for hardware monitoring
● explain the choices made during the process
● provide a status update
● explain current issues and proposed fixes
3. CERN IT Department
CH-1211 Geneva 23
Switzerland
www.cern.ch/it
CF 1 - Lemon and collectd
Lemon
● developed by CERN
● in production since 2006 (at least)
● old monitoring infrastructure has been replaced
● retirement efforts started mid-2017
collectd
● open source project
● collects system and service metrics
● optimized to handle thousands of metrics
● modular and portable with community plugins
● easy to develop new plugins in Python/Java/C/Perl
● continuously improving and well documented
m
4. CERN IT Department
CH-1211 Geneva 23
Switzerland
www.cern.ch/it
CF 2 - Why collectd?
Pros:
● community-driven and rich ecosystem
● alarms and plugins definitions are puppet-based
● better reusability, documentation
● easier to set up for quick metric collection
● easier metric dispatch in plugins
Cons:
● alarms generated on transition
● existing plugins require re-writing
● MONIT provides a lemon-sensor wrapper but is deprecated
6. CERN IT Department
CH-1211 Geneva 23
Switzerland
www.cern.ch/it
CF 4 - Moving towards a collectd era
● very specific and complex needs
○ heterogeneity of hardware and configurations
○ hardware RAID controllers
○ intense use of IPMI
● no community sensors we could adopt
● good news! code can be ported from lemon sensors
● adopt TDD (Test-Driven Development)
● compatibility with python 2.4, 2.7, 3.4
● Continuous Development (CI/CD) using GitLab
8. CERN IT Department
CH-1211 Geneva 23
Switzerland
www.cern.ch/it
CF 6 - The big migration
Collectd plugins:
● collectd-mdstat: in production (new) ✔
● collectd-smart-tests: in production ✔
● collectd-megaraidsas: in QA ✔
● collectd-sasarray: in development
● collectd-blockdevices: in pipeline
● collectd-adaptec: in pipeline
● mcelog: from the community
Centralized monitoring:
● CINNAMON: in production ✔ (requires minor changes)
● PODIUM: in development
9. CERN IT Department
CH-1211 Geneva 23
Switzerland
www.cern.ch/it
CF 7 - Plugin development workflow
● identify output metrics and write the tests
● write the plugin
● if tests.color == green: plugin.puppet_deploy()
10. CERN IT Department
CH-1211 Geneva 23
Switzerland
www.cern.ch/it
CF 8 - Plugin deployment workflow
● RPM packaging and repositories using Koji
● Collectd plugin definition on Puppet
○ it-puppet-module-cerncollectd_contrib on GitLab
● standard CERN CRM QA -> Production pipeline (1 week)
● deployed on physical machines
○ it-puppet-module-hardware: physical.pp
11. CERN IT Department
CH-1211 Geneva 23
Switzerland
www.cern.ch/it
CF 9 - Alarms
● based on standard collectd Threshold plugin
● checks local metrics against defined thresholds
● states: OK, WARNING, FAILURE
● puppet defined (metricmgr is already read-only)
● Service Managers can override thresholds and SNOW targets
12. CERN IT Department
CH-1211 Geneva 23
Switzerland
www.cern.ch/it
CF
● finish porting of the sensors to collectd
● start retirement of old lemon sensors
● too many tickets:
○ fine tuning of the alarms is necessary
○ waiting for better SNOW tickets deduplication
● tickets are not very descriptive:
○ a pull request has been sent to the upstream community
● no lemon-host-check
○ do we need it?
○ is collectdctl enough?
10 - What’s left and current issues
13. CERN IT Department
CH-1211 Geneva 23
Switzerland
www.cern.ch/it
CF
● collectd provides a mature environment for HW monitoring
● using puppet for alarms definition is definitely a plus for
versioning and maintenance, compared to metricmgr
● after an initial series of delays, mainly due to our early adoption,
we are now more than half-way there and progressing steadily
● targeting end of the year for finishing the migration
● a good occasion for collaboration with IT-CM-MM
11 - Conclusions
17. CERN IT Department
CH-1211 Geneva 23
Switzerland
www.cern.ch/it
CF Backup slides - Plugin deployment
if (versioncmp($::operatingsystemmajrelease,'6') >= 0) or (versioncmp($::operatingsystemmajrelease,'7') >= 0){
# Software RAID failures (see target in YAML file data)
include ::cerncollectd_contrib::alarm::mdstat_wrong
include ::cerncollectd_contrib::plugin::mdstat
# SMART attributes failures
include ::cerncollectd_contrib::alarm::smart_wrong
include ::cerncollectd_contrib::plugin::smart_tests
# MegaRAID failures
include ::cerncollectd_contrib::alarm::megaraidsas::bbu_status_wrong
include ::cerncollectd_contrib::alarm::megaraidsas::controller_status_wrong
include ::cerncollectd_contrib::alarm::megaraidsas::controller_correctable_errors
include ::cerncollectd_contrib::alarm::megaraidsas::controller_uncorrectable_errors
include ::cerncollectd_contrib::alarm::megaraidsas::cache_policy_on_faulty_bbu_wrong
include ::cerncollectd_contrib::alarm::megaraidsas::cache_policy_on_raid_array_wrong
include ::cerncollectd_contrib::alarm::megaraidsas::raid_array_status_wrong
include ::cerncollectd_contrib::alarm::megaraidsas::missing_drives
include ::cerncollectd_contrib::alarm::megaraidsas::unconfigured_good_drives
include ::cerncollectd_contrib::alarm::megaraidsas::unconfigured_bad_drives
include ::cerncollectd_contrib::alarm::megaraidsas::offline_drives
if (versioncmp($::operatingsystemmajrelease,'6') >= 0){
class {'::cerncollectd_contrib::plugin::megaraidsas':
lsmod_path => '/sbin/lsmod',
}
} else {
include ::cerncollectd_contrib::plugin::megaraidsas
}
}