Talk by Brendan Gregg at Kernel Recipes 2017 (Paris): "The in-kernel Berkeley Packet Filter (BPF) has been enhanced in recent kernels to do much more than just filtering packets. It can now run user-defined programs on events, such as on tracepoints, kprobes, uprobes, and perf_events, allowing advanced performance analysis tools to be created. These can be used in production as the BPF virtual machine is sandboxed and will reject unsafe code, and are already in use at Netflix.
Beginning with the bpf() syscall in 3.18, enhancements have been added in many kernel versions since, with major features for BPF analysis landing in Linux 4.1, 4.4, 4.7, and 4.9. Specific capabilities these provide include custom in-kernel summaries of metrics, custom latency measurements, and frequency counting kernel and user stack traces on events. One interesting case involves saving stack traces on wake up events, and associating them with the blocked stack trace: so that we can see the blocking stack trace and the waker together, merged in kernel by a BPF program (that particular example is in the kernel as samples/bpf/offwaketime).
This talk will discuss the new BPF capabilities for performance analysis and debugging, and demonstrate the new open source tools that have been developed to use it, many of which are in the Linux Foundation iovisor bcc (BPF Compiler Collection) project. These include tools to analyze the CPU scheduler, TCP performance, file system performance, block I/O, and more."
Talk for Facebook Systems@Scale 2021 by Brendan Gregg: "BPF (eBPF) tracing is the superpower that can analyze everything, helping you find performance wins, troubleshoot software, and more. But with many different front-ends and languages, and years of evolution, finding the right starting point can be hard. This talk will make it easy, showing how to install and run selected BPF tools in the bcc and bpftrace open source projects for some quick wins. Think like a sysadmin, not like a programmer."
zfsday talk (a video is on the last slide). The performance of the file system, or disks, is often the target of blame, especially in multi-tenant cloud environments. At Joyent we deploy a public cloud on ZFS-based systems, and frequently investigate performance with a wide variety of applications in growing environments. This talk is about ZFS performance observability, showing the tools and approaches we use to quickly show what ZFS is doing. This includes observing ZFS I/O throttling, an enhancement added to illumos-ZFS to isolate performance between neighbouring tenants, and the use of DTrace and heat maps to examine latency distributions and locate outliers.
Performance Wins with BPF: Getting StartedBrendan Gregg
Keynote by Brendan Gregg for the eBPF summit, 2020. How to get started finding performance wins using the BPF (eBPF) technology. This short talk covers the quickest and easiest way to find performance wins using BPF observability tools on Linux.
OSSNA 2017 Performance Analysis Superpowers with Linux BPFBrendan Gregg
Talk by Brendan Gregg for OSSNA 2017. "Advanced performance observability and debugging have arrived built into the Linux 4.x series, thanks to enhancements to Berkeley Packet Filter (BPF, or eBPF) and the repurposing of its sandboxed virtual machine to provide programmatic capabilities to system tracing. Netflix has been investigating its use for new observability tools, monitoring, security uses, and more. This talk will be a dive deep on these new tracing, observability, and debugging capabilities, which sooner or later will be available to everyone who uses Linux. Whether you’re doing analysis over an ssh session, or via a monitoring GUI, BPF can be used to provide an efficient, custom, and deep level of detail into system and application performance.
This talk will also demonstrate the new open source tools that have been developed, which make use of kernel- and user-level dynamic tracing (kprobes and uprobes), and kernel- and user-level static tracing (tracepoints). These tools provide new insights for file system and storage performance, CPU scheduler performance, TCP performance, and a whole lot more. This is a major turning point for Linux systems engineering, as custom advanced performance instrumentation can be used safely in production environments, powering a new generation of tools and visualizations."
re:Invent 2019 BPF Performance Analysis at NetflixBrendan Gregg
Talk by Brendan Gregg at AWS re:Invent 2019. Abstract: "Extended BPF (eBPF) is an open source Linux technology that powers a whole new class of software: mini programs that run on events. Among its many uses, BPF can be used to create powerful performance analysis tools capable of analyzing everything: CPUs, memory, disks, file systems, networking, languages, applications, and more. In this session, Netflix's Brendan Gregg tours BPF tracing capabilities, including many new open source performance analysis tools he developed for his new book "BPF Performance Tools: Linux System and Application Observability." The talk includes examples of using these tools in the Amazon EC2 cloud."
Talk for Facebook Systems@Scale 2021 by Brendan Gregg: "BPF (eBPF) tracing is the superpower that can analyze everything, helping you find performance wins, troubleshoot software, and more. But with many different front-ends and languages, and years of evolution, finding the right starting point can be hard. This talk will make it easy, showing how to install and run selected BPF tools in the bcc and bpftrace open source projects for some quick wins. Think like a sysadmin, not like a programmer."
zfsday talk (a video is on the last slide). The performance of the file system, or disks, is often the target of blame, especially in multi-tenant cloud environments. At Joyent we deploy a public cloud on ZFS-based systems, and frequently investigate performance with a wide variety of applications in growing environments. This talk is about ZFS performance observability, showing the tools and approaches we use to quickly show what ZFS is doing. This includes observing ZFS I/O throttling, an enhancement added to illumos-ZFS to isolate performance between neighbouring tenants, and the use of DTrace and heat maps to examine latency distributions and locate outliers.
Performance Wins with BPF: Getting StartedBrendan Gregg
Keynote by Brendan Gregg for the eBPF summit, 2020. How to get started finding performance wins using the BPF (eBPF) technology. This short talk covers the quickest and easiest way to find performance wins using BPF observability tools on Linux.
OSSNA 2017 Performance Analysis Superpowers with Linux BPFBrendan Gregg
Talk by Brendan Gregg for OSSNA 2017. "Advanced performance observability and debugging have arrived built into the Linux 4.x series, thanks to enhancements to Berkeley Packet Filter (BPF, or eBPF) and the repurposing of its sandboxed virtual machine to provide programmatic capabilities to system tracing. Netflix has been investigating its use for new observability tools, monitoring, security uses, and more. This talk will be a dive deep on these new tracing, observability, and debugging capabilities, which sooner or later will be available to everyone who uses Linux. Whether you’re doing analysis over an ssh session, or via a monitoring GUI, BPF can be used to provide an efficient, custom, and deep level of detail into system and application performance.
This talk will also demonstrate the new open source tools that have been developed, which make use of kernel- and user-level dynamic tracing (kprobes and uprobes), and kernel- and user-level static tracing (tracepoints). These tools provide new insights for file system and storage performance, CPU scheduler performance, TCP performance, and a whole lot more. This is a major turning point for Linux systems engineering, as custom advanced performance instrumentation can be used safely in production environments, powering a new generation of tools and visualizations."
re:Invent 2019 BPF Performance Analysis at NetflixBrendan Gregg
Talk by Brendan Gregg at AWS re:Invent 2019. Abstract: "Extended BPF (eBPF) is an open source Linux technology that powers a whole new class of software: mini programs that run on events. Among its many uses, BPF can be used to create powerful performance analysis tools capable of analyzing everything: CPUs, memory, disks, file systems, networking, languages, applications, and more. In this session, Netflix's Brendan Gregg tours BPF tracing capabilities, including many new open source performance analysis tools he developed for his new book "BPF Performance Tools: Linux System and Application Observability." The talk includes examples of using these tools in the Amazon EC2 cloud."
Velocity 2017 Performance analysis superpowers with Linux eBPFBrendan Gregg
Talk by for Velocity 2017 by Brendan Gregg: Performance analysis superpowers with Linux eBPF.
"Advanced performance observability and debugging have arrived built into the Linux 4.x series, thanks to enhancements to Berkeley Packet Filter (BPF, or eBPF) and the repurposing of its sandboxed virtual machine to provide programmatic capabilities to system tracing. Netflix has been investigating its use for new observability tools, monitoring, security uses, and more. This talk will investigate this new technology, which sooner or later will be available to everyone who uses Linux. The talk will dive deep on these new tracing, observability, and debugging capabilities. Whether you’re doing analysis over an ssh session, or via a monitoring GUI, BPF can be used to provide an efficient, custom, and deep level of detail into system and application performance.
This talk will also demonstrate the new open source tools that have been developed, which make use of kernel- and user-level dynamic tracing (kprobes and uprobes), and kernel- and user-level static tracing (tracepoints). These tools provide new insights for file system and storage performance, CPU scheduler performance, TCP performance, and a whole lot more. This is a major turning point for Linux systems engineering, as custom advanced performance instrumentation can be used safely in production environments, powering a new generation of tools and visualizations."
Talk by Brendan Gregg for USENIX LISA 2019: Linux Systems Performance. Abstract: "
Systems performance is an effective discipline for performance analysis and tuning, and can help you find performance wins for your applications and the kernel. However, most of us are not performance or kernel engineers, and have limited time to study this topic. This talk summarizes the topic for everyone, touring six important areas of Linux systems performance: observability tools, methodologies, benchmarking, profiling, tracing, and tuning. Included are recipes for Linux performance analysis and tuning (using vmstat, mpstat, iostat, etc), overviews of complex areas including profiling (perf_events) and tracing (Ftrace, bcc/BPF, and bpftrace/BPF), and much advice about what is and isn't important to learn. This talk is aimed at everyone: developers, operations, sysadmins, etc, and in any environment running Linux, bare metal or the cloud."
Linux 4.x Tracing: Performance Analysis with bcc/BPFBrendan Gregg
Talk about bcc/eBPF for SCALE15x (2017) by Brendan Gregg. "BPF (Berkeley Packet Filter) has been enhanced in the Linux 4.x series and now powers a large collection of performance analysis and observability tools ready for you to use, included in the bcc (BPF Complier Collection) open source project. BPF nowadays can do system tracing, software defined networks, and kernel fast path: much more than just filtering packets! This talk will focus on the bcc/BPF tools for performance analysis, which make use of other built in Linux capabilities: dynamic tracing (kprobes and uprobes) and static tracing (tracepoints and USDT). There are now bcc tools for measuring latency distributions for file system I/O and run queue latency, printing details of storage device I/O and TCP retransmits, investigating blocked stack traces and memory leaks, and a whole lot more. These lead to performance wins large and small, especially when instrumenting areas that previously had zero visibility. Tracing superpowers have finally arrived, built in to Linux."
Kernel Recipes 2017 - Understanding the Linux kernel via ftrace - Steven RostedtAnne Nicolas
Ftrace is the official tracer of the Linux kernel. It has been apart of Linux since 2.6.31, and has grown tremendously ever since. Ftrace’s name comes from its most powerful feature: function tracing. But the ftrace infrastructure is much more than that. It also encompasses the trace events that are used by perf, as well as kprobes that can dynamically add trace events that the user defines.
This talk will focus on learning how the kernel works by using the ftrace infrastructure. It will show how to see what happens within the kernel during a system call; learn how interrupts work; see how ones processes are being scheduled, and more. A quick introduction to some tools like trace-cmd and KernelShark will also be demonstrated.
Steven Rostedt, VMware
Talk for YOW! by Brendan Gregg. "Systems performance studies the performance of computing systems, including all physical components and the full software stack to help you find performance wins for your application and kernel. However, most of us are not performance or kernel engineers, and have limited time to study this topic. This talk summarizes the topic for everyone, touring six important areas: observability tools, methodologies, benchmarking, profiling, tracing, and tuning. Included are recipes for Linux performance analysis and tuning (using vmstat, mpstat, iostat, etc), overviews of complex areas including profiling (perf_events) and tracing (ftrace, bcc/BPF, and bpftrace/BPF), advice about what is and isn't important to learn, and case studies to see how it is applied. This talk is aimed at everyone: developers, operations, sysadmins, etc, and in any environment running Linux, bare metal or the cloud.
"
A brief talk on systems performance for the July 2013 meetup "A Midsummer Night's System", video: http://www.youtube.com/watch?v=P3SGzykDE4Q. This summarizes how systems performance has changed from the 1990's to today. This was the reason for writing a new book on systems performance, to provide a reference that is up to date, covering new tools, technologies, and methodologies.
Talk by Brendan Gregg for All Things Open 2018. "At over one thousand code commits per week, it's hard to keep up with Linux developments. This keynote will summarize recent Linux performance features,
for a wide audience: the KPTI patches for Meltdown, eBPF for performance observability and the new open source tools that use it, Kyber for disk I/O sc
heduling, BBR for TCP congestion control, and more. This is about exposure: knowing what exists, so you can learn and use it later when needed. Get the
most out of your systems with the latest Linux kernels and exciting features."
EuroBSDcon 2017 System Performance Analysis MethodologiesBrendan Gregg
keynote by Brendan Gregg. "Traditional performance monitoring makes do with vendor-supplied metrics, often involving interpretation and inference, and with numerous blind spots. Much in the field of systems performance is still living in the past: documentation, procedures, and analysis GUIs built upon the same old metrics. Modern BSD has advanced tracers and PMC tools, providing virtually endless metrics to aid performance analysis. It's time we really used them, but the problem becomes which metrics to use, and how to navigate them quickly to locate the root cause of problems.
There's a new way to approach performance analysis that can guide you through the metrics. Instead of starting with traditional metrics and figuring out their use, you start with the questions you want answered then look for metrics to answer them. Methodologies can provide these questions, as well as a starting point for analysis and guidance for locating the root cause. They also pose questions that the existing metrics may not yet answer, which may be critical in solving the toughest problems. System methodologies include the USE method, workload characterization, drill-down analysis, off-CPU analysis, chain graphs, and more.
This talk will discuss various system performance issues, and the methodologies, tools, and processes used to solve them. Many methodologies will be discussed, from the production proven to the cutting edge, along with recommendations for their implementation on BSD systems. In general, you will learn to think differently about analyzing your systems, and make better use of the modern tools that BSD provides."
Kernel Recipes 2019 - ftrace: Where modifying a running kernel all startedAnne Nicolas
Ftrace’s most powerful feature is the function tracer (and function graph tracer which is built from it). But to have this enabled on production systems, it had to have its overhead be negligible when disabled. As the function tracer uses gcc’s profiling mechanism, which adds a call to “mcount” (or more recently fentry, don’t worry if you don’t know what this is, it will all be explained) at the start of almost all functions, it had to do something about the overhead that causes. The solution was to turn those calls into “nops” (an instruction that the CPU simply ignores). But this was no easy feat. It took a lot to come up with a solution (and also turning a few network cards into bricks). This talk will explain the history of how ftrace came about implementing the function tracer, and brought with it the possibility of static branches and soon static calls!
Steven Rostedt
Delivered as plenary at USENIX LISA 2013. video here: https://www.youtube.com/watch?v=nZfNehCzGdw and https://www.usenix.org/conference/lisa13/technical-sessions/plenary/gregg . "How did we ever analyze performance before Flame Graphs?" This new visualization invented by Brendan can help you quickly understand application and kernel performance, especially CPU usage, where stacks (call graphs) can be sampled and then visualized as an interactive flame graph. Flame Graphs are now used for a growing variety of targets: for applications and kernels on Linux, SmartOS, Mac OS X, and Windows; for languages including C, C++, node.js, ruby, and Lua; and in WebKit Web Inspector. This talk will explain them and provide use cases and new visualizations for other event types, including I/O, memory usage, and latency.
Computing Performance: On the Horizon (2021)Brendan Gregg
Talk by Brendan Gregg for USENIX LISA 2021. https://www.youtube.com/watch?v=5nN1wjA_S30 . "The future of computer performance involves clouds with hardware hypervisors and custom processors, servers running a new type of BPF software to allow high-speed applications and kernel customizations, observability of everything in production, new Linux kernel technologies, and more. This talk covers interesting developments in systems and computing performance, their challenges, and where things are headed."
UM2019 Extended BPF: A New Type of SoftwareBrendan Gregg
Keynote for Ubuntu Masters 2019 by Brendan Gregg, Netflix. Video https://www.youtube.com/watch?v=7pmXdG8-7WU&feature=youtu.be . "Extended BPF is a new type of software, and the first fundamental change to how kernels are used in 50 years. This new type of software is already in use by major companies: Netflix has 14 BPF programs running by default on all of its cloud servers, which run Ubuntu Linux. Facebook has 40 BPF programs running by default. Extended BPF is composed of an in-kernel runtime for executing a virtual BPF instruction set through a safety verifier and with JIT compilation. So far it has been used for software defined networking, performance tools, security policies, and device drivers, with more uses planned and more we have yet to think of. It is changing how we use and think about systems. This talk explores the past, present, and future of BPF, with BPF performance tools as a use case."
Video: https://www.youtube.com/watch?v=JRFNIKUROPE . Talk for linux.conf.au 2017 (LCA2017) by Brendan Gregg, about Linux enhanced BPF (eBPF). Abstract:
A world of new capabilities is emerging for the Linux 4.x series, thanks to enhancements that have been included in Linux for to Berkeley Packet Filter (BPF): an in-kernel virtual machine that can execute user space-defined programs. It is finding uses for security auditing and enforcement, enhancing networking (including eXpress Data Path), and performance observability and troubleshooting. Many new open source tools that have been written in the past 12 months for performance analysis that use BPF. Tracing superpowers have finally arrived for Linux!
For its use with tracing, BPF provides the programmable capabilities to the existing tracing frameworks: kprobes, uprobes, and tracepoints. In particular, BPF allows timestamps to be recorded and compared from custom events, allowing latency to be studied in many new places: kernel and application internals. It also allows data to be efficiently summarized in-kernel, including as histograms. This has allowed dozens of new observability tools to be developed so far, including measuring latency distributions for file system I/O and run queue latency, printing details of storage device I/O and TCP retransmits, investigating blocked stack traces and memory leaks, and a whole lot more.
This talk will summarize BPF capabilities and use cases so far, and then focus on its use to enhance Linux tracing, especially with the open source bcc collection. bcc includes BPF versions of old classics, and many new tools, including execsnoop, opensnoop, funcccount, ext4slower, and more (many of which I developed). Perhaps you'd like to develop new tools, or use the existing tools to find performance wins large and small, especially when instrumenting areas that previously had zero visibility. I'll also summarize how we intend to use these new capabilities to enhance systems analysis at Netflix.
USENIX ATC 2017 Performance Superpowers with Enhanced BPFBrendan Gregg
Talk for USENIX ATC 2017 by Brendan Gregg
"The Berkeley Packet Filter (BPF) in Linux has been enhanced in very recent versions to do much more than just filter packets, and has become a hot area of operating systems innovation, with much more yet to be discovered. BPF is a sandboxed virtual machine that runs user-level defined programs in kernel context, and is part of many kernels. The Linux enhancements allow it to run custom programs on other events, including kernel- and user-level dynamic tracing (kprobes and uprobes), static tracing (tracepoints), and hardware events. This is finding uses for the generation of new performance analysis tools, network acceleration technologies, and security intrusion detection systems.
This talk will explain the BPF enhancements, then discuss the new performance observability tools that are in use and being created, especially from the BPF compiler collection (bcc) open source project. These tools provide new insights for file system and storage performance, CPU scheduler performance, TCP performance, and much more. This is a major turning point for Linux systems engineering, as custom advanced performance instrumentation can be used safely in production environments, powering a new generation of tools and visualizations.
Because these BPF enhancements are only in very recent Linux (such as Linux 4.9), most companies are not yet running new enough kernels to be exploring BPF yet. This will change in the next year or two, as companies including Netflix upgrade their kernels. This talk will give you a head start on this growing technology, and also discuss areas of future work and unsolved problems."
Palestra realizada por Toronto Garcez aka torontux durante a 3a. edição da Nullbyte Security Conference em 26 de novembro de 2016.
Resumo:
O objetivo da apresentação é demonstrar de forma prática, o passo-a-passo para criar uma botnet com roteadores wi-fi e/ou embarcados em geral. Será demonstrado o desenvolvimento de um comando e controle e a utilização de firmwares "backdorados" para tornar dispositivos em bots.
Velocity 2017 Performance analysis superpowers with Linux eBPFBrendan Gregg
Talk by for Velocity 2017 by Brendan Gregg: Performance analysis superpowers with Linux eBPF.
"Advanced performance observability and debugging have arrived built into the Linux 4.x series, thanks to enhancements to Berkeley Packet Filter (BPF, or eBPF) and the repurposing of its sandboxed virtual machine to provide programmatic capabilities to system tracing. Netflix has been investigating its use for new observability tools, monitoring, security uses, and more. This talk will investigate this new technology, which sooner or later will be available to everyone who uses Linux. The talk will dive deep on these new tracing, observability, and debugging capabilities. Whether you’re doing analysis over an ssh session, or via a monitoring GUI, BPF can be used to provide an efficient, custom, and deep level of detail into system and application performance.
This talk will also demonstrate the new open source tools that have been developed, which make use of kernel- and user-level dynamic tracing (kprobes and uprobes), and kernel- and user-level static tracing (tracepoints). These tools provide new insights for file system and storage performance, CPU scheduler performance, TCP performance, and a whole lot more. This is a major turning point for Linux systems engineering, as custom advanced performance instrumentation can be used safely in production environments, powering a new generation of tools and visualizations."
Talk by Brendan Gregg for USENIX LISA 2019: Linux Systems Performance. Abstract: "
Systems performance is an effective discipline for performance analysis and tuning, and can help you find performance wins for your applications and the kernel. However, most of us are not performance or kernel engineers, and have limited time to study this topic. This talk summarizes the topic for everyone, touring six important areas of Linux systems performance: observability tools, methodologies, benchmarking, profiling, tracing, and tuning. Included are recipes for Linux performance analysis and tuning (using vmstat, mpstat, iostat, etc), overviews of complex areas including profiling (perf_events) and tracing (Ftrace, bcc/BPF, and bpftrace/BPF), and much advice about what is and isn't important to learn. This talk is aimed at everyone: developers, operations, sysadmins, etc, and in any environment running Linux, bare metal or the cloud."
Linux 4.x Tracing: Performance Analysis with bcc/BPFBrendan Gregg
Talk about bcc/eBPF for SCALE15x (2017) by Brendan Gregg. "BPF (Berkeley Packet Filter) has been enhanced in the Linux 4.x series and now powers a large collection of performance analysis and observability tools ready for you to use, included in the bcc (BPF Complier Collection) open source project. BPF nowadays can do system tracing, software defined networks, and kernel fast path: much more than just filtering packets! This talk will focus on the bcc/BPF tools for performance analysis, which make use of other built in Linux capabilities: dynamic tracing (kprobes and uprobes) and static tracing (tracepoints and USDT). There are now bcc tools for measuring latency distributions for file system I/O and run queue latency, printing details of storage device I/O and TCP retransmits, investigating blocked stack traces and memory leaks, and a whole lot more. These lead to performance wins large and small, especially when instrumenting areas that previously had zero visibility. Tracing superpowers have finally arrived, built in to Linux."
Kernel Recipes 2017 - Understanding the Linux kernel via ftrace - Steven RostedtAnne Nicolas
Ftrace is the official tracer of the Linux kernel. It has been apart of Linux since 2.6.31, and has grown tremendously ever since. Ftrace’s name comes from its most powerful feature: function tracing. But the ftrace infrastructure is much more than that. It also encompasses the trace events that are used by perf, as well as kprobes that can dynamically add trace events that the user defines.
This talk will focus on learning how the kernel works by using the ftrace infrastructure. It will show how to see what happens within the kernel during a system call; learn how interrupts work; see how ones processes are being scheduled, and more. A quick introduction to some tools like trace-cmd and KernelShark will also be demonstrated.
Steven Rostedt, VMware
Talk for YOW! by Brendan Gregg. "Systems performance studies the performance of computing systems, including all physical components and the full software stack to help you find performance wins for your application and kernel. However, most of us are not performance or kernel engineers, and have limited time to study this topic. This talk summarizes the topic for everyone, touring six important areas: observability tools, methodologies, benchmarking, profiling, tracing, and tuning. Included are recipes for Linux performance analysis and tuning (using vmstat, mpstat, iostat, etc), overviews of complex areas including profiling (perf_events) and tracing (ftrace, bcc/BPF, and bpftrace/BPF), advice about what is and isn't important to learn, and case studies to see how it is applied. This talk is aimed at everyone: developers, operations, sysadmins, etc, and in any environment running Linux, bare metal or the cloud.
"
A brief talk on systems performance for the July 2013 meetup "A Midsummer Night's System", video: http://www.youtube.com/watch?v=P3SGzykDE4Q. This summarizes how systems performance has changed from the 1990's to today. This was the reason for writing a new book on systems performance, to provide a reference that is up to date, covering new tools, technologies, and methodologies.
Talk by Brendan Gregg for All Things Open 2018. "At over one thousand code commits per week, it's hard to keep up with Linux developments. This keynote will summarize recent Linux performance features,
for a wide audience: the KPTI patches for Meltdown, eBPF for performance observability and the new open source tools that use it, Kyber for disk I/O sc
heduling, BBR for TCP congestion control, and more. This is about exposure: knowing what exists, so you can learn and use it later when needed. Get the
most out of your systems with the latest Linux kernels and exciting features."
EuroBSDcon 2017 System Performance Analysis MethodologiesBrendan Gregg
keynote by Brendan Gregg. "Traditional performance monitoring makes do with vendor-supplied metrics, often involving interpretation and inference, and with numerous blind spots. Much in the field of systems performance is still living in the past: documentation, procedures, and analysis GUIs built upon the same old metrics. Modern BSD has advanced tracers and PMC tools, providing virtually endless metrics to aid performance analysis. It's time we really used them, but the problem becomes which metrics to use, and how to navigate them quickly to locate the root cause of problems.
There's a new way to approach performance analysis that can guide you through the metrics. Instead of starting with traditional metrics and figuring out their use, you start with the questions you want answered then look for metrics to answer them. Methodologies can provide these questions, as well as a starting point for analysis and guidance for locating the root cause. They also pose questions that the existing metrics may not yet answer, which may be critical in solving the toughest problems. System methodologies include the USE method, workload characterization, drill-down analysis, off-CPU analysis, chain graphs, and more.
This talk will discuss various system performance issues, and the methodologies, tools, and processes used to solve them. Many methodologies will be discussed, from the production proven to the cutting edge, along with recommendations for their implementation on BSD systems. In general, you will learn to think differently about analyzing your systems, and make better use of the modern tools that BSD provides."
Kernel Recipes 2019 - ftrace: Where modifying a running kernel all startedAnne Nicolas
Ftrace’s most powerful feature is the function tracer (and function graph tracer which is built from it). But to have this enabled on production systems, it had to have its overhead be negligible when disabled. As the function tracer uses gcc’s profiling mechanism, which adds a call to “mcount” (or more recently fentry, don’t worry if you don’t know what this is, it will all be explained) at the start of almost all functions, it had to do something about the overhead that causes. The solution was to turn those calls into “nops” (an instruction that the CPU simply ignores). But this was no easy feat. It took a lot to come up with a solution (and also turning a few network cards into bricks). This talk will explain the history of how ftrace came about implementing the function tracer, and brought with it the possibility of static branches and soon static calls!
Steven Rostedt
Delivered as plenary at USENIX LISA 2013. video here: https://www.youtube.com/watch?v=nZfNehCzGdw and https://www.usenix.org/conference/lisa13/technical-sessions/plenary/gregg . "How did we ever analyze performance before Flame Graphs?" This new visualization invented by Brendan can help you quickly understand application and kernel performance, especially CPU usage, where stacks (call graphs) can be sampled and then visualized as an interactive flame graph. Flame Graphs are now used for a growing variety of targets: for applications and kernels on Linux, SmartOS, Mac OS X, and Windows; for languages including C, C++, node.js, ruby, and Lua; and in WebKit Web Inspector. This talk will explain them and provide use cases and new visualizations for other event types, including I/O, memory usage, and latency.
Computing Performance: On the Horizon (2021)Brendan Gregg
Talk by Brendan Gregg for USENIX LISA 2021. https://www.youtube.com/watch?v=5nN1wjA_S30 . "The future of computer performance involves clouds with hardware hypervisors and custom processors, servers running a new type of BPF software to allow high-speed applications and kernel customizations, observability of everything in production, new Linux kernel technologies, and more. This talk covers interesting developments in systems and computing performance, their challenges, and where things are headed."
UM2019 Extended BPF: A New Type of SoftwareBrendan Gregg
Keynote for Ubuntu Masters 2019 by Brendan Gregg, Netflix. Video https://www.youtube.com/watch?v=7pmXdG8-7WU&feature=youtu.be . "Extended BPF is a new type of software, and the first fundamental change to how kernels are used in 50 years. This new type of software is already in use by major companies: Netflix has 14 BPF programs running by default on all of its cloud servers, which run Ubuntu Linux. Facebook has 40 BPF programs running by default. Extended BPF is composed of an in-kernel runtime for executing a virtual BPF instruction set through a safety verifier and with JIT compilation. So far it has been used for software defined networking, performance tools, security policies, and device drivers, with more uses planned and more we have yet to think of. It is changing how we use and think about systems. This talk explores the past, present, and future of BPF, with BPF performance tools as a use case."
Video: https://www.youtube.com/watch?v=JRFNIKUROPE . Talk for linux.conf.au 2017 (LCA2017) by Brendan Gregg, about Linux enhanced BPF (eBPF). Abstract:
A world of new capabilities is emerging for the Linux 4.x series, thanks to enhancements that have been included in Linux for to Berkeley Packet Filter (BPF): an in-kernel virtual machine that can execute user space-defined programs. It is finding uses for security auditing and enforcement, enhancing networking (including eXpress Data Path), and performance observability and troubleshooting. Many new open source tools that have been written in the past 12 months for performance analysis that use BPF. Tracing superpowers have finally arrived for Linux!
For its use with tracing, BPF provides the programmable capabilities to the existing tracing frameworks: kprobes, uprobes, and tracepoints. In particular, BPF allows timestamps to be recorded and compared from custom events, allowing latency to be studied in many new places: kernel and application internals. It also allows data to be efficiently summarized in-kernel, including as histograms. This has allowed dozens of new observability tools to be developed so far, including measuring latency distributions for file system I/O and run queue latency, printing details of storage device I/O and TCP retransmits, investigating blocked stack traces and memory leaks, and a whole lot more.
This talk will summarize BPF capabilities and use cases so far, and then focus on its use to enhance Linux tracing, especially with the open source bcc collection. bcc includes BPF versions of old classics, and many new tools, including execsnoop, opensnoop, funcccount, ext4slower, and more (many of which I developed). Perhaps you'd like to develop new tools, or use the existing tools to find performance wins large and small, especially when instrumenting areas that previously had zero visibility. I'll also summarize how we intend to use these new capabilities to enhance systems analysis at Netflix.
USENIX ATC 2017 Performance Superpowers with Enhanced BPFBrendan Gregg
Talk for USENIX ATC 2017 by Brendan Gregg
"The Berkeley Packet Filter (BPF) in Linux has been enhanced in very recent versions to do much more than just filter packets, and has become a hot area of operating systems innovation, with much more yet to be discovered. BPF is a sandboxed virtual machine that runs user-level defined programs in kernel context, and is part of many kernels. The Linux enhancements allow it to run custom programs on other events, including kernel- and user-level dynamic tracing (kprobes and uprobes), static tracing (tracepoints), and hardware events. This is finding uses for the generation of new performance analysis tools, network acceleration technologies, and security intrusion detection systems.
This talk will explain the BPF enhancements, then discuss the new performance observability tools that are in use and being created, especially from the BPF compiler collection (bcc) open source project. These tools provide new insights for file system and storage performance, CPU scheduler performance, TCP performance, and much more. This is a major turning point for Linux systems engineering, as custom advanced performance instrumentation can be used safely in production environments, powering a new generation of tools and visualizations.
Because these BPF enhancements are only in very recent Linux (such as Linux 4.9), most companies are not yet running new enough kernels to be exploring BPF yet. This will change in the next year or two, as companies including Netflix upgrade their kernels. This talk will give you a head start on this growing technology, and also discuss areas of future work and unsolved problems."
Palestra realizada por Toronto Garcez aka torontux durante a 3a. edição da Nullbyte Security Conference em 26 de novembro de 2016.
Resumo:
O objetivo da apresentação é demonstrar de forma prática, o passo-a-passo para criar uma botnet com roteadores wi-fi e/ou embarcados em geral. Será demonstrado o desenvolvimento de um comando e controle e a utilização de firmwares "backdorados" para tornar dispositivos em bots.
Presented at LISA18: https://www.usenix.org/conference/lisa18/presentation/babrou
This is a technical dive into how we used eBPF to solve real-world issues uncovered during an innocent OS upgrade. We'll see how we debugged 10x CPU increase in Kafka after Debian upgrade and what lessons we learned. We'll get from high-level effects like increased CPU to flamegraphs showing us where the problem lies to tracing timers and functions calls in the Linux kernel.
The focus is on tools what operational engineers can use to debug performance issues in production. This particular issue happened at Cloudflare on a Kafka cluster doing 100Gbps of ingress and many multiple of that egress.
You have a system with an advanced programmatic tracer: do you know what to do with it? Brendan has used numerous tracers in production environments, and has published hundreds of tracing-based tools. In this talk he will share tips and know-how for creating CLI tracing tools and GUI visualizations, to solve real problems effectively. Programmatic tracing is an amazing superpower, and this talk will show you how to wield it!
Video: https://www.youtube.com/watch?v=uibLwoVKjec . Talk by Brendan Gregg for Sysdig CCWFS 2016. Abstract:
"You have a system with an advanced programmatic tracer: do you know what to do with it? Brendan has used numerous tracers in production environments, and has published hundreds of tracing-based tools. In this talk he will share tips and know-how for creating CLI tracing tools and GUI visualizations, to solve real problems effectively. Programmatic tracing is an amazing superpower, and this talk will show you how to wield it!"
OSDC 2017 - Werner Fischer - Linux performance profiling and monitoringNETWAYS
Nowadays system administrators have great choices when it comes down to Linux performance profiling and monitoring. The challenge is to pick the appropriate tools and interpret their results correctly.
This talk is a chance to take a tour through various performance profiling and benchmarking tools, focusing on their benefit for every sysadmin.
More than 25 different tools are presented. Ranging from well known tools like strace, iostat, tcpdump or vmstat to new features like Linux tracepoints or perf_events. You will also learn which tools can be monitored by Icinga and which monitoring plugins are already available for that.
At the end the goal is to gather reference points to look at, whenever you are faced with performance problems.
Take the chance to close your knowledge gaps and learn how to get the most out of your system.
This slide will show you how to use SOFA to do performance analysis of CPU/GPU cooperative programs, especially programs running with deep software stacks like TensorFlow, PyTorch, etc.
source code at:
https://github.com/cyliustack/sofa
O'Reilly Velocity New York 2016 presentation on modern Linux tracing tools and technology. Highlights the available tracing data sources on Linux (ftrace, perf_events, BPF) and demonstrates some tools that can be used to obtain traces, including DebugFS, the perf front-end, and most importantly, the BCC/BPF tool collection.
Talk for USENIX LISA17: "Containers pose interesting challenges for performance monitoring and analysis, requiring new analysis methodologies and tooling. Resource-oriented analysis, as is common with systems performance tools and GUIs, must now account for both hardware limits and soft limits, as implemented using cgroups. A reverse diagnosis methodology can be applied to identify whether a container is resource constrained, and by which hard or soft resource. The interaction between the host and containers can also be examined, and noisy neighbors identified or exonerated. Performance tooling can need special usage or workarounds to function properly from within a container or on the host, to deal with different privilege levels and name spaces. At Netflix, we're using containers for some microservices, and care very much about analyzing and tuning our containers to be as fast and efficient as possible. This talk will show you how to identify bottlenecks in the host or container configuration, in the applications by profiling in a container environment, and how to dig deeper into kernel and container internals."
Similar to Kernel Recipes 2017: Performance Analysis with BPF (20)
Talk by Brendan Gregg for YOW! 2021. "The pursuit of faster performance in computing is the driving reason for many new technologies and updates. This talk discusses performance improvements now underway that you will likely be adopting soon, for processors (including 3D stacking and cloud vendor CPUs), memory (including DDR5 and high-bandwidth memory [HBM]), disks (including 3D Xpoint as a 3D NAND accelerator), networking (including QUIC and eXpress Data Path [XDP]), runtimes, hypervisors, and more. The future of performance is increasingly cloud-based, with hardware hypervisors and custom processors, meaningful observability of everything down to cycle stalls (even as cloud guests), and high-speed syscall-avoiding applications that use eBPF, FPGAs, and io_uring. The talk also discusses where future performance improvements might be expected, with predictions for new technologies."
USENIX LISA2021 talk by Brendan Gregg (https://www.youtube.com/watch?v=_5Z2AU7QTH4). This talk is a deep dive that describes how BPF (eBPF) works internally on Linux, and dissects some modern performance observability tools. Details covered include the kernel BPF implementation: the verifier, JIT compilation, and the BPF execution environment; the BPF instruction set; different event sources; and how BPF is used by user space, using bpftrace programs as an example. This includes showing how bpftrace is compiled to LLVM IR and then BPF bytecode, and how per-event data and aggregated map data are fetched from the kernel.
YOW2018 Cloud Performance Root Cause Analysis at NetflixBrendan Gregg
Keynote by Brendan Gregg for YOW! 2018. Video: https://www.youtube.com/watch?v=03EC8uA30Pw . Description: "At Netflix, improving the performance of our cloud means happier customers and lower costs, and involves root cause
analysis of applications, runtimes, operating systems, and hypervisors, in an environment of 150k cloud instances
that undergo numerous production changes each week. Apart from the developers who regularly optimize their own code
, we also have a dedicated performance team to help with any issue across the cloud, and to build tooling to aid in
this analysis. In this session we will summarize the Netflix environment, procedures, and tools we use and build t
o do root cause analysis on cloud performance issues. The analysis performed may be cloud-wide, using self-service
GUIs such as our open source Atlas tool, or focused on individual instances, and use our open source Vector tool, f
lame graphs, Java debuggers, and tooling that uses Linux perf, ftrace, and bcc/eBPF. You can use these open source
tools in the same way to find performance wins in your own environment."
Talk by Brendan Gregg and Martin Spier for the Linkedin Performance Engineering meetup on Nov 8, 2018. FlameScope is a visualization for performance profiles that helps you study periodic activity, variance, and perturbations, with a heat map for navigation and flame graphs for code analysis.
Linux Performance 2018 (PerconaLive keynote)Brendan Gregg
Keynote for PerconaLive 2018 by Brendan Gregg. Video: https://youtu.be/sV3XfrfjrPo?t=30m51s . "At over one thousand code commits per week, it's hard to keep up with Linux developments. This keynote will summarize recent Linux performance features, for a wide audience: the KPTI patches for Meltdown, eBPF for performance observability, Kyber for disk I/O scheduling, BBR for TCP congestion control, and more. This is about exposure: knowing what exists, so you can learn and use it later when needed. Get the most out of your systems, whether they are databases or application servers, with the latest Linux kernels and exciting features."
How Netflix Tunes EC2 Instances for PerformanceBrendan Gregg
CMP325 talk for AWS re:Invent 2017, by Brendan Gregg. "
At Netflix we make the best use of AWS EC2 instance types and features to create a high performance cloud, achieving near bare metal speed for our workloads. This session will summarize the configuration, tuning, and activities for delivering the fastest possible EC2 instances, and will help other EC2 users improve performance, reduce latency outliers, and make better use of EC2 features. We'll show how we choose EC2 instance types, how we choose between EC2 Xen modes: HVM, PV, and PVHVM, and the importance of EC2 features such SR-IOV for bare-metal performance. SR-IOV is used by EC2 enhanced networking, and recently for the new i3 instance type for enhanced disk performance as well. We'll also cover kernel tuning and observability tools, from basic to advanced. Advanced performance analysis includes the use of Java and Node.js flame graphs, and the new EC2 Performance Monitoring Counter (PMC) feature released this year."
Kernel Recipes 2017: Using Linux perf at NetflixBrendan Gregg
Talk for Kernel Recipes 2017 by Brendan Gregg. "Linux perf is a crucial performance analysis tool at Netflix, and is used by a self-service GUI for generating CPU flame graphs and other reports. This sounds like an easy task, however, getting perf to work properly in VM guests running Java, Node.js, containers, and other software, has been at times a challenge. This talk summarizes Linux perf, how we use it at Netflix, the various gotchas we have encountered, and a summary of advanced features."
USENIX ATC 2017: Visualizing Performance with Flame GraphsBrendan Gregg
Talk by Brendan Gregg for USENIX ATC 2017.
"Flame graphs are a simple stack trace visualization that helps answer an everyday problem: how is software consuming resources, especially CPUs, and how did this change since the last software version? Flame graphs have been adopted by many languages, products, and companies, including Netflix, and have become a standard tool for performance analysis. They were published in "The Flame Graph" article in the June 2016 issue of Communications of the ACM, by their creator, Brendan Gregg.
This talk describes the background for this work, and the challenges encountered when profiling stack traces and resolving symbols for different languages, including for just-in-time compiler runtimes. Instructions will be included generating mixed-mode flame graphs on Linux, and examples from our use at Netflix with Java. Advanced flame graph types will be described, including differential, off-CPU, chain graphs, memory, and TCP events. Finally, future work and unsolved problems in this area will be discussed."
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Tobias Schneck
As AI technology is pushing into IT I was wondering myself, as an “infrastructure container kubernetes guy”, how get this fancy AI technology get managed from an infrastructure operational view? Is it possible to apply our lovely cloud native principals as well? What benefit’s both technologies could bring to each other?
Let me take this questions and provide you a short journey through existing deployment models and use cases for AI software. On practical examples, we discuss what cloud/on-premise strategy we may need for applying it to our own infrastructure to get it to work from an enterprise perspective. I want to give an overview about infrastructure requirements and technologies, what could be beneficial or limiting your AI use cases in an enterprise environment. An interactive Demo will give you some insides, what approaches I got already working for real.
UiPath Test Automation using UiPath Test Suite series, part 3DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 3. In this session, we will cover desktop automation along with UI automation.
Topics covered:
UI automation Introduction,
UI automation Sample
Desktop automation flow
Pradeep Chinnala, Senior Consultant Automation Developer @WonderBotz and UiPath MVP
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
Neuro-symbolic is not enough, we need neuro-*semantic*Frank van Harmelen
Neuro-symbolic (NeSy) AI is on the rise. However, simply machine learning on just any symbolic structure is not sufficient to really harvest the gains of NeSy. These will only be gained when the symbolic structures have an actual semantics. I give an operational definition of semantics as “predictable inference”.
All of this illustrated with link prediction over knowledge graphs, but the argument is general.
Search and Society: Reimagining Information Access for Radical FuturesBhaskar Mitra
The field of Information retrieval (IR) is currently undergoing a transformative shift, at least partly due to the emerging applications of generative AI to information access. In this talk, we will deliberate on the sociotechnical implications of generative AI for information access. We will argue that there is both a critical necessity and an exciting opportunity for the IR community to re-center our research agendas on societal needs while dismantling the artificial separation between the work on fairness, accountability, transparency, and ethics in IR and the rest of IR research. Instead of adopting a reactionary strategy of trying to mitigate potential social harms from emerging technologies, the community should aim to proactively set the research agenda for the kinds of systems we should build inspired by diverse explicitly stated sociotechnical imaginaries. The sociotechnical imaginaries that underpin the design and development of information access technologies needs to be explicitly articulated, and we need to develop theories of change in context of these diverse perspectives. Our guiding future imaginaries must be informed by other academic fields, such as democratic theory and critical theory, and should be co-developed with social science scholars, legal scholars, civil rights and social justice activists, and artists, among others.
Accelerate your Kubernetes clusters with Varnish CachingThijs Feryn
A presentation about the usage and availability of Varnish on Kubernetes. This talk explores the capabilities of Varnish caching and shows how to use the Varnish Helm chart to deploy it to Kubernetes.
This presentation was delivered at K8SUG Singapore. See https://feryn.eu/presentations/accelerate-your-kubernetes-clusters-with-varnish-caching-k8sug-singapore-28-2024 for more details.
GraphRAG is All You need? LLM & Knowledge GraphGuy Korland
Guy Korland, CEO and Co-founder of FalkorDB, will review two articles on the integration of language models with knowledge graphs.
1. Unifying Large Language Models and Knowledge Graphs: A Roadmap.
https://arxiv.org/abs/2306.08302
2. Microsoft Research's GraphRAG paper and a review paper on various uses of knowledge graphs:
https://www.microsoft.com/en-us/research/blog/graphrag-unlocking-llm-discovery-on-narrative-private-data/
Connector Corner: Automate dynamic content and events by pushing a buttonDianaGray10
Here is something new! In our next Connector Corner webinar, we will demonstrate how you can use a single workflow to:
Create a campaign using Mailchimp with merge tags/fields
Send an interactive Slack channel message (using buttons)
Have the message received by managers and peers along with a test email for review
But there’s more:
In a second workflow supporting the same use case, you’ll see:
Your campaign sent to target colleagues for approval
If the “Approve” button is clicked, a Jira/Zendesk ticket is created for the marketing design team
But—if the “Reject” button is pushed, colleagues will be alerted via Slack message
Join us to learn more about this new, human-in-the-loop capability, brought to you by Integration Service connectors.
And...
Speakers:
Akshay Agnihotri, Product Manager
Charlie Greenberg, Host
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Jeffrey Haguewood
Sidekick Solutions uses Bonterra Impact Management (fka Social Solutions Apricot) and automation solutions to integrate data for business workflows.
We believe integration and automation are essential to user experience and the promise of efficient work through technology. Automation is the critical ingredient to realizing that full vision. We develop integration products and services for Bonterra Case Management software to support the deployment of automations for a variety of use cases.
This video focuses on the notifications, alerts, and approval requests using Slack for Bonterra Impact Management. The solutions covered in this webinar can also be deployed for Microsoft Teams.
Interested in deploying notification automations for Bonterra Impact Management? Contact us at sales@sidekicksolutionsllc.com to discuss next steps.
PHP Frameworks: I want to break free (IPC Berlin 2024)Ralf Eggert
In this presentation, we examine the challenges and limitations of relying too heavily on PHP frameworks in web development. We discuss the history of PHP and its frameworks to understand how this dependence has evolved. The focus will be on providing concrete tips and strategies to reduce reliance on these frameworks, based on real-world examples and practical considerations. The goal is to equip developers with the skills and knowledge to create more flexible and future-proof web applications. We'll explore the importance of maintaining autonomy in a rapidly changing tech landscape and how to make informed decisions in PHP development.
This talk is aimed at encouraging a more independent approach to using PHP frameworks, moving towards a more flexible and future-proof approach to PHP development.
DevOps and Testing slides at DASA ConnectKari Kakkonen
My and Rik Marselis slides at 30.5.2024 DASA Connect conference. We discuss about what is testing, then what is agile testing and finally what is Testing in DevOps. Finally we had lovely workshop with the participants trying to find out different ways to think about quality and testing in different parts of the DevOps infinity loop.
Key Trends Shaping the Future of Infrastructure.pdfCheryl Hung
Keynote at DIGIT West Expo, Glasgow on 29 May 2024.
Cheryl Hung, ochery.com
Sr Director, Infrastructure Ecosystem, Arm.
The key trends across hardware, cloud and open-source; exploring how these areas are likely to mature and develop over the short and long-term, and then considering how organisations can position themselves to adapt and thrive.
Essentials of Automations: Optimizing FME Workflows with ParametersSafe Software
Are you looking to streamline your workflows and boost your projects’ efficiency? Do you find yourself searching for ways to add flexibility and control over your FME workflows? If so, you’re in the right place.
Join us for an insightful dive into the world of FME parameters, a critical element in optimizing workflow efficiency. This webinar marks the beginning of our three-part “Essentials of Automation” series. This first webinar is designed to equip you with the knowledge and skills to utilize parameters effectively: enhancing the flexibility, maintainability, and user control of your FME projects.
Here’s what you’ll gain:
- Essentials of FME Parameters: Understand the pivotal role of parameters, including Reader/Writer, Transformer, User, and FME Flow categories. Discover how they are the key to unlocking automation and optimization within your workflows.
- Practical Applications in FME Form: Delve into key user parameter types including choice, connections, and file URLs. Allow users to control how a workflow runs, making your workflows more reusable. Learn to import values and deliver the best user experience for your workflows while enhancing accuracy.
- Optimization Strategies in FME Flow: Explore the creation and strategic deployment of parameters in FME Flow, including the use of deployment and geometry parameters, to maximize workflow efficiency.
- Pro Tips for Success: Gain insights on parameterizing connections and leveraging new features like Conditional Visibility for clarity and simplicity.
We’ll wrap up with a glimpse into future webinars, followed by a Q&A session to address your specific questions surrounding this topic.
Don’t miss this opportunity to elevate your FME expertise and drive your projects to new heights of efficiency.
24. Pre-BPF: Linux Perf Analysis in 60s
1. uptime
2. dmesg -T | tail
3. vmstat 1
4. mpstat -P ALL 1
5. pidstat 1
6. iostat -xz 1
7. free -m
8. sar -n DEV 1
9. sar -n TCP,ETCP 1
10. top
hcp://techblog.neRlix.com/2015/11/linux-performance-analysis-in-60s.html
25. bcc InstallaCon
• hcps://github.com/iovisor/bcc/blob/master/INSTALL.md
• eg, Ubuntu Xenial:
– Also available as an Ubuntu snap
– Ubuntu 16.04 is good, 16.10 becer: more tools work
• Installs many tools
– In /usr/share/bcc/tools, and …/tools/old for older kernels
# echo "deb [trusted=yes] https://repo.iovisor.org/apt/xenial xenial-nightly main" |
sudo tee /etc/apt/sources.list.d/iovisor.list
# sudo apt-get update
# sudo apt-get install bcc-tools
29. Exonerate or confirm storage latency outliers with ext4slower
# /usr/share/bcc/tools/ext4slower 1
Tracing ext4 operations slower than 1 ms
TIME COMM PID T BYTES OFF_KB LAT(ms) FILENAME
17:31:42 postdrop 15523 S 0 0 2.32 5630D406E4
17:31:42 cleanup 15524 S 0 0 1.89 57BB7406EC
17:32:09 titus-log-ship 19735 S 0 0 1.94 slurper_checkpoint.db
17:35:37 dhclient 1061 S 0 0 3.32 dhclient.eth0.leases
17:35:39 systemd-journa 504 S 0 0 26.62 system.journal
17:35:39 systemd-journa 504 S 0 0 1.56 system.journal
17:35:39 systemd-journa 504 S 0 0 1.73 system.journal
17:35:45 postdrop 16187 S 0 0 2.41 C0369406E4
17:35:45 cleanup 16188 S 0 0 6.52 C1B90406EC
[…]
Tracing at the file system is a more reliable and complete indicator than measuring disk I/O latency
Also: btrfsslower, xfsslower, zfsslower
30. Exonerate or confirm storage latency outliers with ext4slower
# /usr/share/bcc/tools/ext4slower 1
Tracing ext4 operations slower than 1 ms
TIME COMM PID T BYTES OFF_KB LAT(ms) FILENAME
17:31:42 postdrop 15523 S 0 0 2.32 5630D406E4
17:31:42 cleanup 15524 S 0 0 1.89 57BB7406EC
17:32:09 titus-log-ship 19735 S 0 0 1.94 slurper_checkpoint.db
17:35:37 dhclient 1061 S 0 0 3.32 dhclient.eth0.leases
17:35:39 systemd-journa 504 S 0 0 26.62 system.journal
17:35:39 systemd-journa 504 S 0 0 1.56 system.journal
17:35:39 systemd-journa 504 S 0 0 1.73 system.journal
17:35:45 postdrop 16187 S 0 0 2.41 C0369406E4
17:35:45 cleanup 16188 S 0 0 6.52 C1B90406EC
[…]
Tracing at the file system is a more reliable and complete indicator than measuring disk I/O latency
Also: btrfsslower, xfsslower, zfsslower
31. Iden?fy mul?modal disk I/O latency and outliers with biolatency
# biolatency -mT 10
Tracing block device I/O... Hit Ctrl-C to end.
19:19:04
msecs : count distribution
0 -> 1 : 238 |********* |
2 -> 3 : 424 |***************** |
4 -> 7 : 834 |********************************* |
8 -> 15 : 506 |******************** |
16 -> 31 : 986 |****************************************|
32 -> 63 : 97 |*** |
64 -> 127 : 7 | |
128 -> 255 : 27 |* |
19:19:14
msecs : count distribution
0 -> 1 : 427 |******************* |
2 -> 3 : 424 |****************** |
[…]
Average latency (iostat/sar) may not be represenCCve with mulCple modes or outliers
The "count" column is
summarized in-kernel
32. Iden?fy mul?modal disk I/O latency and outliers with biolatency
# biolatency -mT 10
Tracing block device I/O... Hit Ctrl-C to end.
19:19:04
msecs : count distribution
0 -> 1 : 238 |********* |
2 -> 3 : 424 |***************** |
4 -> 7 : 834 |********************************* |
8 -> 15 : 506 |******************** |
16 -> 31 : 986 |****************************************|
32 -> 63 : 97 |*** |
64 -> 127 : 7 | |
128 -> 255 : 27 |* |
19:19:14
msecs : count distribution
0 -> 1 : 427 |******************* |
2 -> 3 : 424 |****************** |
[…]
Average latency (iostat/sar) may not be represenCCve with mulCple modes or outliers
The "count" column is
summarized in-kernel
39. Construct programma?c one-liners with trace
# trace 'sys_read (arg3 > 20000) "read %d bytes", arg3'
TIME PID COMM FUNC -
05:18:23 4490 dd sys_read read 1048576 bytes
05:18:23 4490 dd sys_read read 1048576 bytes
05:18:23 4490 dd sys_read read 1048576 bytes
^C
argdist by Sasha Goldshtein
# trace -h
[...]
trace –K blk_account_io_start
Trace this kernel function, and print info with a kernel stack trace
trace 'do_sys_open "%s", arg2'
Trace the open syscall and print the filename being opened
trace 'sys_read (arg3 > 20000) "read %d bytes", arg3'
Trace the read syscall and print a message for reads >20000 bytes
trace r::do_sys_return
Trace the return from the open syscall
trace 'c:open (arg2 == 42) "%s %d", arg1, arg2'
Trace the open() call from libc only if the flags (arg2) argument is 42
[...]
e.g. reads over 20000 bytes:
51. BCC Improvements
• Challenges
– IniCalize all variables
– BPF_PERF_OUTPUT()
– Verifier errors
– SCll explicit bpf_probe_read()s.
It's getng becer (thanks):
• High-Level Languages
– One-liners and scripts
– Can use libbcc
tcpconnlat.py
52. ply
• A new BPF-based language and tracer for Linux
– Created by Tobias Waldekranz
– hcps://github.com/iovisor/ply hcps://wkz.github.io/ply/
– Promising, was in development
# ply -c 'kprobe:do_sys_open { printf("opened: %sn", mem(arg(1), "128s")); }'
1 probe active
opened: /sys/kernel/debug/tracing/events/enable
opened: /etc/ld.so.cache
opened: /lib/x86_64-linux-gnu/libselinux.so.1
opened: /lib/x86_64-linux-gnu/libc.so.6
opened: /proc/filesystems
opened: /usr/lib/locale/locale-archive
opened: .
[...]
54. bpVrace
• Another new BPF-based language and tracer for Linux
– Created by Alastair Robertson
– hcps://github.com/ajor/bpVrace
– In acCve development
# bpftrace -e 'kprobe:sys_open { printf("opened: %sn", str(arg0)); }'
Attaching 1 probe...
opened: /sys/devices/system/cpu/online
opened: /proc/1956/stat
opened: /proc/1241/stat
opened: /proc/net/dev
opened: /proc/net/if_inet6
opened: /sys/class/net/eth0/device/vendor
opened: /proc/sys/net/ipv4/neigh/eth0/retrans_time_ms
[...]
58. Case Studies
• Use it
• Solve something
• Write about it
• Talk about it
• Recent posts:
– hcps://blogs.dropbox.com/tech/2017/09/opCmizing-web-servers-for-high-throughput-and-
low-latency/
– hcps://joseuacik.github.io/kernel/scheduler/bcc/bpf/2017/08/03/sched-Cme.html
61. Thank You
– QuesCons?
– iovisor bcc: hcps://github.com/iovisor/bcc
– hcp://www.brendangregg.com
– hcp://slideshare.net/brendangregg
– bgregg@neRlix.com
– @brendangregg
Thanks to Alexei Starovoitov (Facebook), Brenden Blanco (PLUMgrid/VMware), Sasha Goldshtein (Sela),
Teng Qin (Facebook), Yonghong Song (Facebook), Daniel Borkmann (Cisco/Covalent), Wang Nan
(Huawei), Vicent Mar| (GitHub), Paul Chaignon (Orange), and other BPF and bcc contributors!