Tracing Summit 2014, Düsseldorf. What can Linux learn from DTrace: what went well, and what didn't go well, on its path to success? This talk will discuss not just the DTrace software, but lessons from the marketing and adoption of a system tracer, and an inside look at how DTrace was really deployed and used in production environments. It will also cover ongoing problems with DTrace, and how Linux may surpass them and continue to advance the field of system tracing. A world expert and core contributor to DTrace, Brendan now works at Netflix on Linux performance with the various Linux tracers (ftrace, perf_events, eBPF, SystemTap, ktap, sysdig, LTTng, and the DTrace Linux ports), and will summarize his experiences and suggestions for improvements. He has also been contributing to various tracers: recently promoting ftrace and perf_events adoption through articles and front-end scripts, and testing eBPF.
Broken benchmarks, misleading metrics, and terrible tools. This talk will help you navigate the treacherous waters of Linux performance tools, touring common problems with system tools, metrics, statistics, visualizations, measurement overhead, and benchmarks. You might discover that tools you have been using for years are, in fact, misleading, dangerous, or broken.
The speaker, Brendan Gregg, has given many talks on tools that work, including giving the Linux Performance Tools talk originally at SCALE. This is an anti-version of that talk, to focus on broken tools and metrics instead of the working ones. Metrics can be misleading, and counters can be counter-intuitive! This talk will include advice for verifying new performance tools, understanding how they work, and using them successfully.
Video: https://www.youtube.com/watch?v=FJW8nGV4jxY and https://www.youtube.com/watch?v=zrr2nUln9Kk . Tutorial slides for O'Reilly Velocity SC 2015, by Brendan Gregg.
There are many performance tools nowadays for Linux, but how do they all fit together, and when do we use them? This tutorial explains methodologies for using these tools, and provides a tour of four tool types: observability, benchmarking, tuning, and static tuning. Many tools will be discussed, including top, iostat, tcpdump, sar, perf_events, ftrace, SystemTap, sysdig, and others, as well as observability frameworks in the Linux kernel: PMCs, tracepoints, kprobes, and uprobes.
This tutorial is updated and extended from an earlier talk that summarizes the Linux performance tool landscape. The value of this tutorial is not just learning that these tools exist and what they do, but hearing when and how they are used by a performance engineer to solve real world problems: important context that is typically not included in the standard documentation.
Linux Performance Analysis: New Tools and Old Secrets (Brendan Gregg)
Talk for USENIX/LISA2014 by Brendan Gregg, Netflix. At Netflix performance is crucial, and we use many high to low level tools to analyze our stack in different ways. In this talk, I will introduce new system observability tools we are using at Netflix, which I've ported from my DTraceToolkit, and are intended for our Linux 3.2 cloud instances. These show that Linux can do more than you may think, by using creative hacks and workarounds with existing kernel features (ftrace, perf_events). While these are solving issues on current versions of Linux, I'll also briefly summarize the future in this space: eBPF, ktap, SystemTap, sysdig, etc.
High-Performance Networking Using eBPF, XDP, and io_uring (ScyllaDB)
In the networking world there are a number of ways to increase performance over naive use of basic Berkeley sockets. These techniques range from polling blocking sockets, to non-blocking sockets controlled by epoll, all the way through to completely bypassing the Linux kernel for maximum network performance, where you talk directly to the network interface card by using something like DPDK or Netmap. All these tools have their place, and generally occupy a space from convenience to performance. But in recent years, that landscape has changed massively. The tools available to the average Linux systems developer have improved, from the creation of io_uring to the expansion of BPF from a simple filtering language to a full-on programming environment embedded directly in the kernel. Along with that came something called XDP (eXpress Data Path). This was the Linux kernel's answer to kernel-bypass networking. AF_XDP is the new socket type created by this feature, and generally works very similarly to something like DPDK. History lessons out of the way, this talk will look into and discuss the merits of this technology, its place in the broader ecosystem, and how it can be used to attain the highest level of performance possible. This talk will dive into crucial details, such as how AF_XDP works, how it can be integrated into a larger system, and finally more advanced topics such as request sharding/load balancing. There will be a detailed look at the design of AF_XDP, the eBPF code used, as well as the userspace code required to drive it all. It will also include performance numbers from this setup compared to regular kernel networking. And most importantly, how to put all this together to handle as much data as possible on a single modern multi-core system.
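The request sharding mentioned in this abstract can be illustrated with a small sketch. This is not the talk's implementation: it is a pure-Python stand-in that assumes each flow is identified by its 5-tuple and that a fixed number of per-core shards exists. The names `Flow` and `shard_for` are hypothetical, and CRC32 is used only because it is deterministic and available in the standard library.

```python
import zlib
from collections import namedtuple

# Hypothetical 5-tuple describing one network flow.
Flow = namedtuple("Flow", "src_ip src_port dst_ip dst_port proto")

def shard_for(flow, n_shards):
    """Map a flow to a shard index in [0, n_shards).

    Every packet of the same flow lands on the same shard, so one core
    can own all state for that flow without cross-core locking.
    """
    if n_shards <= 0:
        raise ValueError("n_shards must be positive")
    key = (f"{flow.src_ip}:{flow.src_port}>"
           f"{flow.dst_ip}:{flow.dst_port}/{flow.proto}").encode()
    # CRC32 is an illustrative hash choice, not what any NIC or XDP program uses.
    return zlib.crc32(key) % n_shards

if __name__ == "__main__":
    f = Flow("10.0.0.1", 40000, "10.0.0.2", 9042, "tcp")
    print("flow goes to shard", shard_for(f, 8))
```

The design point is only that the mapping is a pure function of the flow identity, so it can be computed independently wherever the packet is first seen.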
Ariel Waizel discusses the Data Plane Development Kit (DPDK), an API for developing fast packet processing code in user space.
* Who needs this library? Why bypass the kernel?
* How does it work?
* How good is it? What are the benchmarks?
* Pros and cons
Ariel worked on kernel development at the IDF, Ben Gurion University, and several companies. He is interested in networking, security, machine learning, and basically everything except UI development. He is currently a Solution Architect at ConteXtream (an HPE company), which specializes in SDN solutions for the telecom industry.
DPDK (Data Plane Development Kit) Overview by Rami Rosen
* Background and short history
* Advantages and disadvantages
- Very High speed networking acceleration in L2
- How this acceleration is achieved (hugepages, optimizations)
- rte_kni (and KCP)
- VPP (and FD.io project), providing routing and switching.
- TLDK (Transport Layer Development Kit, TCP/UDP)
* Anatomy of a simple DPDK application.
* Development and governance model
* Testpmd: DPDK CLI tool
* DDP - Dynamic Device Profiles
Rami Rosen is a Linux Kernel expert, the author of "Linux Kernel Networking", Apress, 2014.
Rami has published two articles about DPDK in the last year:
"Network acceleration with DPDK"
https://lwn.net/Articles/725254/
"Userspace Networking with DPDK"
https://www.linuxjournal.com/content/userspace-networking-dpdk
Talk by Brendan Gregg for USENIX LISA 2019: Linux Systems Performance. Abstract: "
Systems performance is an effective discipline for performance analysis and tuning, and can help you find performance wins for your applications and the kernel. However, most of us are not performance or kernel engineers, and have limited time to study this topic. This talk summarizes the topic for everyone, touring six important areas of Linux systems performance: observability tools, methodologies, benchmarking, profiling, tracing, and tuning. Included are recipes for Linux performance analysis and tuning (using vmstat, mpstat, iostat, etc), overviews of complex areas including profiling (perf_events) and tracing (Ftrace, bcc/BPF, and bpftrace/BPF), and much advice about what is and isn't important to learn. This talk is aimed at everyone: developers, operations, sysadmins, etc, and in any environment running Linux, bare metal or the cloud."
Talk for PerconaLive 2016 by Brendan Gregg. Video: https://www.youtube.com/watch?v=CbmEDXq7es0 . "Systems performance provides a different perspective for analysis and tuning, and can help you find performance wins for your databases, applications, and the kernel. However, most of us are not performance or kernel engineers, and have limited time to study this topic. This talk summarizes six important areas of Linux systems performance in 50 minutes: observability tools, methodologies, benchmarking, profiling, tracing, and tuning. Included are recipes for Linux performance analysis and tuning (using vmstat, mpstat, iostat, etc), overviews of complex areas including profiling (perf_events), static tracing (tracepoints), and dynamic tracing (kprobes, uprobes), and much advice about what is and isn't important to learn. This talk is aimed at everyone: DBAs, developers, operations, etc, and in any environment running Linux, bare-metal or the cloud."
New Ways to Find Latency in Linux Using Tracing (ScyllaDB)
Ftrace is the official tracer of the Linux kernel. It originated from the real-time patch (now known as PREEMPT_RT), as developing an operating system for real-time use requires deep insight and transparency of the happenings of the kernel. Not only was tracing useful for debugging, but it was critical for finding areas in the kernel that were causing unbounded latency. It's no wonder that the ftrace infrastructure has a lot of tooling for seeking out latency. Ftrace was introduced into mainline Linux in 2008, and several talks have been done on how to utilize its tracing features. But a lot has happened in the past few years that makes the tooling for finding latency much simpler. Other talks at P99 will discuss the new ftrace tracers "osnoise" and "timerlat", but this talk will focus more on the new flexible and dynamic aspects of ftrace that facilitate finding latency issues which are more specific to your needs. Some of this work may still be in a proof of concept stage, but this talk will give you the advantage of knowing what tools will be available to you in the coming year.
FOSDEM15 SDN developer room talk
DPDK performance
How to not just do a demo with DPDK
The Intel DPDK provides a platform for building high performance Network Function Virtualization applications. But it is hard to get high performance unless certain design tradeoffs are made. This talk focuses on the lessons learned in creating the Brocade vRouter using DPDK. It covers some of the architecture, locking and low level issues that all have to be dealt with to achieve 80 Million packets per second forwarding.
re:Invent 2019 BPF Performance Analysis at Netflix (Brendan Gregg)
Talk by Brendan Gregg at AWS re:Invent 2019. Abstract: "Extended BPF (eBPF) is an open source Linux technology that powers a whole new class of software: mini programs that run on events. Among its many uses, BPF can be used to create powerful performance analysis tools capable of analyzing everything: CPUs, memory, disks, file systems, networking, languages, applications, and more. In this session, Netflix's Brendan Gregg tours BPF tracing capabilities, including many new open source performance analysis tools he developed for his new book "BPF Performance Tools: Linux System and Application Observability." The talk includes examples of using these tools in the Amazon EC2 cloud."
Talk for SCaLE13x. Video: https://www.youtube.com/watch?v=_Ik8oiQvWgo . Profiling can show what your Linux kernel and applications are doing in detail, across all software stack layers. This talk shows how we are using Linux perf_events (aka "perf") and flame graphs at Netflix to understand CPU usage in detail, to optimize our cloud usage, solve performance issues, and identify regressions. This will be more than just an intro: profiling difficult targets, including Java and Node.js, will be covered, which includes ways to resolve JITed symbols and broken stacks. Included are the easy examples, the hard, and the cutting edge.
USENIX LISA2021 talk by Brendan Gregg (https://www.youtube.com/watch?v=_5Z2AU7QTH4). This talk is a deep dive that describes how BPF (eBPF) works internally on Linux, and dissects some modern performance observability tools. Details covered include the kernel BPF implementation: the verifier, JIT compilation, and the BPF execution environment; the BPF instruction set; different event sources; and how BPF is used by user space, using bpftrace programs as an example. This includes showing how bpftrace is compiled to LLVM IR and then BPF bytecode, and how per-event data and aggregated map data are fetched from the kernel.
Video: https://www.youtube.com/watch?v=JRFNIKUROPE . Talk for linux.conf.au 2017 (LCA2017) by Brendan Gregg, about Linux enhanced BPF (eBPF). Abstract:
A world of new capabilities is emerging for the Linux 4.x series, thanks to enhancements that have been included in Linux to the Berkeley Packet Filter (BPF): an in-kernel virtual machine that can execute user space-defined programs. It is finding uses for security auditing and enforcement, enhancing networking (including eXpress Data Path), and performance observability and troubleshooting. Many new open source tools have been written in the past 12 months for performance analysis that use BPF. Tracing superpowers have finally arrived for Linux!
For its use with tracing, BPF provides the programmable capabilities to the existing tracing frameworks: kprobes, uprobes, and tracepoints. In particular, BPF allows timestamps to be recorded and compared from custom events, allowing latency to be studied in many new places: kernel and application internals. It also allows data to be efficiently summarized in-kernel, including as histograms. This has allowed dozens of new observability tools to be developed so far, including measuring latency distributions for file system I/O and run queue latency, printing details of storage device I/O and TCP retransmits, investigating blocked stack traces and memory leaks, and a whole lot more.
This talk will summarize BPF capabilities and use cases so far, and then focus on its use to enhance Linux tracing, especially with the open source bcc collection. bcc includes BPF versions of old classics, and many new tools, including execsnoop, opensnoop, funccount, ext4slower, and more (many of which I developed). Perhaps you'd like to develop new tools, or use the existing tools to find performance wins large and small, especially when instrumenting areas that previously had zero visibility. I'll also summarize how we intend to use these new capabilities to enhance systems analysis at Netflix.
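The abstract above describes recording timestamps on custom events, comparing them to get latency, and summarizing the result as a histogram. The sketch below illustrates that idea in plain Python only. It is not BPF or bcc code and does not run in the kernel; the event tuple layout and the function names (`log2_bucket`, `latency_histogram`, `format_histogram`) are assumptions made for the example.

```python
from collections import Counter

def log2_bucket(value):
    """Power-of-two bucket index: 0 -> 0, 1 -> 1, 2-3 -> 2, 4-7 -> 3, ..."""
    return value.bit_length()

def latency_histogram(events):
    """Pair 'start' and 'done' events by key and bucket the time deltas.

    events: iterable of (kind, key, timestamp_ns) tuples (hypothetical layout).
    A 'done' event with no matching 'start' is ignored.
    """
    start = {}        # key -> start timestamp, like a per-request hash map
    hist = Counter()  # bucket index -> count
    for kind, key, ts in events:
        if kind == "start":
            start[key] = ts
        elif kind == "done" and key in start:
            hist[log2_bucket(ts - start.pop(key))] += 1
    return hist

def format_histogram(hist):
    """Render each bucket as 'low -> high : count'."""
    lines = []
    for b in sorted(hist):
        low = 0 if b == 0 else 1 << (b - 1)
        high = 0 if b == 0 else (1 << b) - 1
        lines.append(f"{low:>8} -> {high:<8} : {hist[b]}")
    return lines

if __name__ == "__main__":
    sample = [("start", 1, 100), ("start", 2, 150),
              ("done", 1, 105), ("done", 2, 1150)]
    print("\n".join(format_histogram(latency_histogram(sample))))
```

Only the small table of bucket counts needs to be read out at the end, rather than every individual event, which is the efficiency argument the abstract makes for in-kernel summaries.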
Note: When you view the slide deck via a web browser, the screenshots may be blurred. You can download and view them offline (the screenshots are clear).
Ceph is an open source, software-defined storage system and, I would say, the only excellent storage backend for cloud storage. Ceph is the future of storage. In this presentation I explain Ceph and OpenStack briefly; you will definitely enjoy it.
The BPF, or Berkeley Packet Filter, mechanism was first introduced in Linux in 1997 in version 2.1.75. It has seen a number of extensions over the years. Recently, in versions 3.15 - 3.19, it received a major overhaul which drastically expanded its applicability. This talk will cover how the instruction set looks today and why, as well as its architecture, capabilities, interface, and just-in-time compilers. We will also talk about how it's being used in different areas of the kernel like tracing and networking, and future plans.
Linux 4.x Tracing: Performance Analysis with bcc/BPF (Brendan Gregg)
Talk about bcc/eBPF for SCALE15x (2017) by Brendan Gregg. "BPF (Berkeley Packet Filter) has been enhanced in the Linux 4.x series and now powers a large collection of performance analysis and observability tools ready for you to use, included in the bcc (BPF Compiler Collection) open source project. BPF nowadays can do system tracing, software defined networks, and kernel fast path: much more than just filtering packets! This talk will focus on the bcc/BPF tools for performance analysis, which make use of other built in Linux capabilities: dynamic tracing (kprobes and uprobes) and static tracing (tracepoints and USDT). There are now bcc tools for measuring latency distributions for file system I/O and run queue latency, printing details of storage device I/O and TCP retransmits, investigating blocked stack traces and memory leaks, and a whole lot more. These lead to performance wins large and small, especially when instrumenting areas that previously had zero visibility. Tracing superpowers have finally arrived, built in to Linux."
Linux 4.x Tracing Tools: Using BPF Superpowers (Brendan Gregg)
Talk for USENIX LISA 2016 by Brendan Gregg.
"Linux 4.x Tracing Tools: Using BPF Superpowers
The Linux 4.x series heralds a new era of Linux performance analysis, with the long-awaited integration of a programmable tracer: Enhanced BPF (eBPF). Formerly the Berkeley Packet Filter, BPF has been enhanced in Linux to provide system tracing capabilities, and integrates with dynamic tracing (kprobes and uprobes) and static tracing (tracepoints and USDT). This has allowed dozens of new observability tools to be developed so far: for example, measuring latency distributions for file system I/O and run queue latency, printing details of storage device I/O and TCP retransmits, investigating blocked stack traces and memory leaks, and a whole lot more. These lead to performance wins large and small, especially when instrumenting areas that previously had zero visibility. Tracing superpowers have finally arrived.
In this talk I'll show you how to use BPF in the Linux 4.x series, and I'll summarize the different tools and front ends available, with a focus on iovisor bcc. bcc is an open source project to provide a Python front end for BPF, and comes with dozens of new observability tools (many of which I developed). These tools include new BPF versions of old classics, and many new tools, including: execsnoop, opensnoop, funccount, trace, biosnoop, bitesize, ext4slower, ext4dist, tcpconnect, tcpretrans, runqlat, offcputime, offwaketime, and many more. I'll also summarize use cases and some long-standing issues that can now be solved, and how we are using these capabilities at Netflix."
Talk by Brendan Gregg for USENIX LISA 2019: Linux Systems Performance. Abstract: "
Systems performance is an effective discipline for performance analysis and tuning, and can help you find performance wins for your applications and the kernel. However, most of us are not performance or kernel engineers, and have limited time to study this topic. This talk summarizes the topic for everyone, touring six important areas of Linux systems performance: observability tools, methodologies, benchmarking, profiling, tracing, and tuning. Included are recipes for Linux performance analysis and tuning (using vmstat, mpstat, iostat, etc), overviews of complex areas including profiling (perf_events) and tracing (Ftrace, bcc/BPF, and bpftrace/BPF), and much advice about what is and isn't important to learn. This talk is aimed at everyone: developers, operations, sysadmins, etc, and in any environment running Linux, bare metal or the cloud."
Talk for PerconaLive 2016 by Brendan Gregg. Video: https://www.youtube.com/watch?v=CbmEDXq7es0 . "Systems performance provides a different perspective for analysis and tuning, and can help you find performance wins for your databases, applications, and the kernel. However, most of us are not performance or kernel engineers, and have limited time to study this topic. This talk summarizes six important areas of Linux systems performance in 50 minutes: observability tools, methodologies, benchmarking, profiling, tracing, and tuning. Included are recipes for Linux performance analysis and tuning (using vmstat, mpstat, iostat, etc), overviews of complex areas including profiling (perf_events), static tracing (tracepoints), and dynamic tracing (kprobes, uprobes), and much advice about what is and isn't important to learn. This talk is aimed at everyone: DBAs, developers, operations, etc, and in any environment running Linux, bare-metal or the cloud."
New Ways to Find Latency in Linux Using TracingScyllaDB
Ftrace is the official tracer of the Linux kernel. It originated from the real-time patch (now known as PREEMPT_RT), as developing an operating system for real-time use requires deep insight and transparency of the happenings of the kernel. Not only was tracing useful for debugging, but it was critical for finding areas in the kernel that was causing unbounded latency. It's no wonder why the ftrace infrastructure has a lot of tooling for seeking out latency. Ftrace was introduced into mainline Linux in 2008, and several talks have been done on how to utilize its tracing features. But a lot has happened in the past few years that makes the tooling for finding latency much simpler. Other talks at P99 will discuss the new ftrace tracers "osnoise" and "timerlat", but this talk will focus more on the new flexible and dynamic aspects of ftrace that facilitates finding latency issues which are more specific to your needs. Some of this work may still be in a proof of concept stage, but this talk will give you the advantage of knowing what tools will be available to you in the coming year.
FOSDEM15 SDN developer room talk
DPDK performance
How to not just do a demo with DPDK
The Intel DPDK provides a platform for building high performance Network Function Virtualization applications. But it is hard to get high performance unless certain design tradeoffs are made. This talk focuses on the lessons learned in creating the Brocade vRouter using DPDK. It covers some of the architecture, locking and low level issues that all have to be dealt with to achieve 80 Million packets per second forwarding.
re:Invent 2019 BPF Performance Analysis at NetflixBrendan Gregg
Talk by Brendan Gregg at AWS re:Invent 2019. Abstract: "Extended BPF (eBPF) is an open source Linux technology that powers a whole new class of software: mini programs that run on events. Among its many uses, BPF can be used to create powerful performance analysis tools capable of analyzing everything: CPUs, memory, disks, file systems, networking, languages, applications, and more. In this session, Netflix's Brendan Gregg tours BPF tracing capabilities, including many new open source performance analysis tools he developed for his new book "BPF Performance Tools: Linux System and Application Observability." The talk includes examples of using these tools in the Amazon EC2 cloud."
Talk for SCaLE13x. Video: https://www.youtube.com/watch?v=_Ik8oiQvWgo . Profiling can show what your Linux kernel and appliacations are doing in detail, across all software stack layers. This talk shows how we are using Linux perf_events (aka "perf") and flame graphs at Netflix to understand CPU usage in detail, to optimize our cloud usage, solve performance issues, and identify regressions. This will be more than just an intro: profiling difficult targets, including Java and Node.js, will be covered, which includes ways to resolve JITed symbols and broken stacks. Included are the easy examples, the hard, and the cutting edge.
USENIX LISA2021 talk by Brendan Gregg (https://www.youtube.com/watch?v=_5Z2AU7QTH4). This talk is a deep dive that describes how BPF (eBPF) works internally on Linux, and dissects some modern performance observability tools. Details covered include the kernel BPF implementation: the verifier, JIT compilation, and the BPF execution environment; the BPF instruction set; different event sources; and how BPF is used by user space, using bpftrace programs as an example. This includes showing how bpftrace is compiled to LLVM IR and then BPF bytecode, and how per-event data and aggregated map data are fetched from the kernel.
Video: https://www.youtube.com/watch?v=JRFNIKUROPE . Talk for linux.conf.au 2017 (LCA2017) by Brendan Gregg, about Linux enhanced BPF (eBPF). Abstract:
A world of new capabilities is emerging for the Linux 4.x series, thanks to enhancements that have been included in Linux for to Berkeley Packet Filter (BPF): an in-kernel virtual machine that can execute user space-defined programs. It is finding uses for security auditing and enforcement, enhancing networking (including eXpress Data Path), and performance observability and troubleshooting. Many new open source tools that have been written in the past 12 months for performance analysis that use BPF. Tracing superpowers have finally arrived for Linux!
For its use with tracing, BPF provides the programmable capabilities to the existing tracing frameworks: kprobes, uprobes, and tracepoints. In particular, BPF allows timestamps to be recorded and compared from custom events, allowing latency to be studied in many new places: kernel and application internals. It also allows data to be efficiently summarized in-kernel, including as histograms. This has allowed dozens of new observability tools to be developed so far, including measuring latency distributions for file system I/O and run queue latency, printing details of storage device I/O and TCP retransmits, investigating blocked stack traces and memory leaks, and a whole lot more.
This talk will summarize BPF capabilities and use cases so far, and then focus on its use to enhance Linux tracing, especially with the open source bcc collection. bcc includes BPF versions of old classics, and many new tools, including execsnoop, opensnoop, funcccount, ext4slower, and more (many of which I developed). Perhaps you'd like to develop new tools, or use the existing tools to find performance wins large and small, especially when instrumenting areas that previously had zero visibility. I'll also summarize how we intend to use these new capabilities to enhance systems analysis at Netflix.
Note: When you view the the slide deck via web browser, the screenshots may be blurred. You can download and view them offline (Screenshots are clear).
Ceph is a open source , software defined storage excellent and the only ( i would say ) storage backend as a cloud storage. Ceph is the Future of Storage. In this presentation i am explaining ceph and openstack briefly , you would definitely enjoy it.
BPF of Berkeley Packet Filter mechanism was first introduced in linux in 1997 in version 2.1.75. It has seen a number of extensions of the years. Recently in versions 3.15 - 3.19 it received a major overhaul which drastically expanded it's applicability. This talk will cover how the instruction set looks today and why. It's architecture, capabilities, interface, just-in-time compilers. We will also talk about how it's being used in different areas of the kernel like tracing and networking and future plans.
Linux 4.x Tracing: Performance Analysis with bcc/BPF Brendan Gregg
Talk about bcc/eBPF for SCALE15x (2017) by Brendan Gregg. "BPF (Berkeley Packet Filter) has been enhanced in the Linux 4.x series and now powers a large collection of performance analysis and observability tools ready for you to use, included in the bcc (BPF Compiler Collection) open source project. BPF nowadays can do system tracing, software defined networks, and kernel fast path: much more than just filtering packets! This talk will focus on the bcc/BPF tools for performance analysis, which make use of other built in Linux capabilities: dynamic tracing (kprobes and uprobes) and static tracing (tracepoints and USDT). There are now bcc tools for measuring latency distributions for file system I/O and run queue latency, printing details of storage device I/O and TCP retransmits, investigating blocked stack traces and memory leaks, and a whole lot more. These lead to performance wins large and small, especially when instrumenting areas that previously had zero visibility. Tracing superpowers have finally arrived, built in to Linux."
Linux 4.x Tracing Tools: Using BPF Superpowers Brendan Gregg
Talk for USENIX LISA 2016 by Brendan Gregg.
"Linux 4.x Tracing Tools: Using BPF Superpowers
The Linux 4.x series heralds a new era of Linux performance analysis, with the long-awaited integration of a programmable tracer: Enhanced BPF (eBPF). Formerly the Berkeley Packet Filter, BPF has been enhanced in Linux to provide system tracing capabilities, and integrates with dynamic tracing (kprobes and uprobes) and static tracing (tracepoints and USDT). This has allowed dozens of new observability tools to be developed so far: for example, measuring latency distributions for file system I/O and run queue latency, printing details of storage device I/O and TCP retransmits, investigating blocked stack traces and memory leaks, and a whole lot more. These lead to performance wins large and small, especially when instrumenting areas that previously had zero visibility. Tracing superpowers have finally arrived.
In this talk I'll show you how to use BPF in the Linux 4.x series, and I'll summarize the different tools and front ends available, with a focus on iovisor bcc. bcc is an open source project to provide a Python front end for BPF, and comes with dozens of new observability tools (many of which I developed). These tools include new BPF versions of old classics, and many new tools, including: execsnoop, opensnoop, funccount, trace, biosnoop, bitesize, ext4slower, ext4dist, tcpconnect, tcpretrans, runqlat, offcputime, offwaketime, and many more. I'll also summarize use cases and some long-standing issues that can now be solved, and how we are using these capabilities at Netflix."
Video: https://www.facebook.com/atscaleevents/videos/1693888610884236/ . Talk by Brendan Gregg from Facebook's Performance @Scale: "Linux performance analysis has been the domain of ancient tools and metrics, but that's now changing in the Linux 4.x series. A new tracer is available in the mainline kernel, built from dynamic tracing (kprobes, uprobes) and enhanced BPF (Berkeley Packet Filter), aka, eBPF. It allows us to measure latency distributions for file system I/O and run queue latency, print details of storage device I/O and TCP retransmits, investigate blocked stack traces and memory leaks, and a whole lot more. These lead to performance wins large and small, especially when instrumenting areas that previously had zero visibility. This talk will summarize this new technology and some long-standing issues that it can solve, and how we intend to use it at Netflix."
Delivered as a plenary at USENIX LISA 2013. Video here: https://www.youtube.com/watch?v=nZfNehCzGdw and https://www.usenix.org/conference/lisa13/technical-sessions/plenary/gregg . "How did we ever analyze performance before Flame Graphs?" This new visualization invented by Brendan can help you quickly understand application and kernel performance, especially CPU usage, where stacks (call graphs) can be sampled and then visualized as an interactive flame graph. Flame Graphs are now used for a growing variety of targets: for applications and kernels on Linux, SmartOS, Mac OS X, and Windows; for languages including C, C++, node.js, ruby, and Lua; and in WebKit Web Inspector. This talk will explain them and provide use cases and new visualizations for other event types, including I/O, memory usage, and latency.
Video: https://www.youtube.com/watch?v=uibLwoVKjec . Talk by Brendan Gregg for Sysdig CCWFS 2016. Abstract:
"You have a system with an advanced programmatic tracer: do you know what to do with it? Brendan has used numerous tracers in production environments, and has published hundreds of tracing-based tools. In this talk he will share tips and know-how for creating CLI tracing tools and GUI visualizations, to solve real problems effectively. Programmatic tracing is an amazing superpower, and this talk will show you how to wield it!"
SREcon 2016 Performance Checklists for SREs Brendan Gregg
Talk from SREcon2016 by Brendan Gregg. Video: https://www.usenix.org/conference/srecon16/program/presentation/gregg . "There's limited time for performance analysis in the emergency room. When there is a performance-related site outage, the SRE team must analyze and solve complex performance issues as quickly as possible, and under pressure. Many performance tools and techniques are designed for a different environment: an engineer analyzing their system over the course of hours or days, and given time to try dozens of tools: profilers, tracers, monitoring tools, benchmarks, as well as different tunings and configurations. But when Netflix is down, minutes matter, and there's little time for such traditional systems analysis. As with aviation emergencies, short checklists and quick procedures can be applied by the on-call SRE staff to help solve performance issues as quickly as possible.
In this talk, I'll cover a checklist for Linux performance analysis in 60 seconds, as well as other methodology-derived checklists and procedures for cloud computing, with examples of performance issues for context. Whether you are solving crises in the SRE war room, or just have limited time for performance engineering, these checklists and approaches should help you find some quick performance wins. Safe flying."
LinuxCon Europe, 2014. Video: https://www.youtube.com/watch?v=SN7Z0eCn0VY . There are many performance tools nowadays for Linux, but how do they all fit together, and when do we use them? This talk summarizes the three types of performance tools: observability, benchmarking, and tuning, providing a tour of what exists and why they exist. Advanced tools including those based on tracepoints, kprobes, and uprobes are also included: perf_events, ktap, SystemTap, LTTng, and sysdig. You'll gain a good understanding of the performance tools landscape, knowing what to reach for to get the most out of your systems.
A brief talk on systems performance for the July 2013 meetup "A Midsummer Night's System", video: http://www.youtube.com/watch?v=P3SGzykDE4Q. This summarizes how systems performance has changed from the 1990s to today. This was the reason for writing a new book on systems performance, to provide a reference that is up to date, covering new tools, technologies, and methodologies.
Introduction to DTrace (Dynamic Tracing), written by Brendan Gregg and delivered in 2007. While aimed at a Solaris-based audience, this introduction is still largely relevant today (2012). Since then, DTrace has appeared in other operating systems (Mac OS X, FreeBSD, and is being ported to Linux), and many user-level providers have been developed to aid tracing of other languages.
Systems Performance: Enterprise and the Cloud Brendan Gregg
My talk for BayLISA, Oct 2013, launching the Systems Performance book. Operating system performance analysis and tuning leads to a better end-user experience and lower costs, especially for cloud computing environments that pay by the operating system instance. This book covers concepts, strategy, tools and tuning for Unix operating systems, with a focus on Linux- and Solaris-based systems. The book covers the latest tools and techniques, including static and dynamic tracing, to get the most out of your systems.
Stop the Guessing: Performance Methodologies for Production Systems Brendan Gregg
Talk presented at Velocity 2013. Description: When faced with performance issues on complex production systems and distributed cloud environments, it can be difficult to know where to begin your analysis, or to spend much time on it when it isn’t your day job. This talk covers various methodologies, and anti-methodologies, for systems analysis, which serve as guidance for finding fruitful metrics from your current performance monitoring products. Such methodologies can help check all areas in an efficient manner, and find issues that can be easily overlooked, especially for virtualized environments which impose resource controls. Some of the tools and methodologies covered, including the USE Method, were developed by the speaker and have been used successfully in enterprise and cloud environments.
Delivered at the FISL13 conference in Brazil: http://www.youtube.com/watch?v=K9w2cipqfvc
This talk introduces the USE Method: a simple strategy for performing a complete check of system performance health, identifying common bottlenecks and errors. This methodology can be used early in a performance investigation to quickly identify the most severe system performance issues, and is a methodology the speaker has used successfully for years in both enterprise and cloud computing environments. Checklists have been developed to show how the USE Method can be applied to Solaris/illumos-based and Linux-based systems.
Many hardware and software resource types have been commonly overlooked, including memory and I/O busses, CPU interconnects, and kernel locks. Any of these can become a system bottleneck. The USE Method provides a way to find and identify these.
This approach focuses on the questions to ask of the system, before reaching for the tools. Tools that are ultimately used include all the standard performance tools (vmstat, iostat, top), and more advanced tools, including dynamic tracing (DTrace), and hardware performance counters.
Other performance methodologies are included for comparison: the Problem Statement Method, Workload Characterization Method, and Drill-Down Analysis Method.
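As a rough sketch of how such a checklist could be automated: USE stands for utilization, saturation, and errors, checked for every resource. The plain-Python example below walks a list of resources and reports findings. The `ResourceMetrics` type, the `use_check` function, the 70% utilization threshold, and the sample values are all assumptions made for illustration; they are not taken from the talk or its checklists.

```python
from dataclasses import dataclass


@dataclass
class ResourceMetrics:
    name: str
    utilization_pct: float  # share of time or capacity in use, 0-100
    saturation: float       # queued or waiting work (e.g. run-queue length)
    errors: int             # error events counted for this resource


def use_check(resources, util_threshold=70.0):
    """Check errors, then saturation, then utilization for each resource.

    Returns a list of (resource name, metric type, value) findings.
    The default threshold is a hypothetical value for this sketch.
    """
    findings = []
    for r in resources:
        if r.errors > 0:
            findings.append((r.name, "errors", r.errors))
        if r.saturation > 0:
            findings.append((r.name, "saturation", r.saturation))
        if r.utilization_pct >= util_threshold:
            findings.append((r.name, "utilization", r.utilization_pct))
    return findings


if __name__ == "__main__":
    # Made-up sample data, including a commonly overlooked resource type.
    sample = [
        ResourceMetrics("cpu", 95.0, 4, 0),
        ResourceMetrics("memory bus", 20.0, 0, 0),
        ResourceMetrics("disk", 30.0, 0, 2),
    ]
    for name, metric, value in use_check(sample):
        print(f"{name}: {metric} = {value}")
```

In practice the metric values would come from the tools named above (vmstat, iostat, top, tracing, hardware counters); the sketch only shows the shape of iterating over every resource with the same three questions.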
ACM Applicative System Methodology 2016 Brendan Gregg
Video: https://youtu.be/eO94l0aGLCA?t=3m37s . Talk by Brendan Gregg for ACM Applicative 2016
"System Methodology - Holistic Performance Analysis on Modern Systems
Traditional systems performance engineering makes do with vendor-supplied metrics, often involving interpretation and inference, and with numerous blind spots. Much in the field of systems performance is still living in the past: documentation, procedures, and analysis GUIs built upon the same old metrics. For modern systems, we can choose the metrics, and can choose ones we need to support new holistic performance analysis methodologies. These methodologies provide faster, more accurate, and more complete analysis, and can provide a starting point for unfamiliar systems.
Methodologies are especially helpful for modern applications and their workloads, which can pose extremely complex problems with no obvious starting point. There are also continuous deployment environments such as the Netflix cloud, where these problems must be solved in shorter time frames. Fortunately, with advances in system observability and tracers, we have virtually endless custom metrics to aid performance analysis. The problem becomes which metrics to use, and how to navigate them quickly to locate the root cause of problems.
System methodologies provide a starting point for analysis, as well as guidance for quickly moving through the metrics to root cause. They also pose questions that the existing metrics may not yet answer, which may be critical in solving the toughest problems. System methodologies include the USE method, workload characterization, drill-down analysis, off-CPU analysis, and more.
This talk will discuss various system performance issues, and the methodologies, tools, and processes used to solve them. The focus is on single systems (any operating system), including single cloud instances, and quickly locating performance issues or exonerating the system. Many methodologies will be discussed, along with recommendations for their implementation, which may be as documented checklists of tools, or custom dashboards of supporting metrics. In general, you will learn to think differently about your systems, and how to ask better questions."
Surge 2014: From Clouds to Roots: root cause performance analysis at Netflix. Brendan Gregg.
At Netflix, high scale and fast deployment rule. The possibilities for failure are endless, and the environment excels at handling this, regularly tested and exercised by the simian army. But, when this environment automatically works around systemic issues that aren’t root-caused, they can grow over time. This talk describes the challenge of not just handling failures of scale on the Netflix cloud, but also new approaches and tools for quickly diagnosing their root cause in an ever changing environment.
Monitorama 2015 talk by Brendan Gregg, Netflix. With our large and ever-changing cloud environment, it can be vital to debug instance-level performance quickly. There are many instance monitoring solutions, but few come close to meeting our requirements, so we've been building our own and open sourcing them. In this talk, I will discuss our real-world requirements for instance-level analysis and monitoring: not just the metrics and features we desire, but the methodologies we'd like to apply. I will also cover the new and novel solutions we have been developing ourselves to meet these needs and desires, which include use of advanced Linux performance technologies (eg, ftrace, perf_events), and on-demand self-service analysis (Vector).
Video: http://joyent.com/blog/linux-performance-analysis-and-tools-brendan-gregg-s-talk-at-scale-11x ; This talk for SCaLE11x covers system performance analysis methodologies and the Linux tools to support them, so that you can get the most out of your systems and solve performance issues quickly. This includes a wide variety of tools, including basics like top(1), advanced tools like perf, and new tools like the DTrace for Linux prototypes.
The Dirty Little Secrets They Didn’t Teach You In Pentesting Class Chris Gates
Derbycon 2011
This talk is about methodologies and tools that we use or have coded that make our lives and pentest schedule a little easier, and why we do things the way we do. Of course, there will be a healthy dose of Metasploit in the mix.
With Dask and Numba, you can write NumPy-like and Pandas-like code and have it run very fast on multi-core systems as well as at scale on many-node clusters.
Deep Learning on Apache® Spark™: Workflows and Best Practices Databricks
The combination of Deep Learning with Apache Spark has the potential for tremendous impact in many sectors of the industry. This webinar, based on the experience gained in assisting customers with the Databricks Virtual Analytics Platform, will present some best practices for building deep learning pipelines with Spark.
Rather than comparing deep learning systems or specific optimizations, this webinar will focus on issues that are common to deep learning frameworks when running on a Spark cluster, including:
* optimizing cluster setup;
* configuring the cluster;
* ingesting data; and
* monitoring long-running jobs.
We will demonstrate the techniques we cover using Google’s popular TensorFlow library. More specifically, we will cover typical issues users encounter when integrating deep learning libraries with Spark clusters.
Clusters can be configured to avoid task conflicts on GPUs and to allow using multiple GPUs per worker. Setting up pipelines for efficient data ingest improves job throughput, and monitoring facilitates both the work of configuration and the stability of deep learning jobs.
Deep Learning on Apache® Spark™: Workflows and Best Practices Jen Aman
The combination of Deep Learning with Apache Spark has the potential for tremendous impact in many sectors of the industry. This webinar, based on the experience gained in assisting customers with the Databricks Virtual Analytics Platform, will present some best practices for building deep learning pipelines with Spark.
Rather than comparing deep learning systems or specific optimizations, this webinar will focus on issues that are common to deep learning frameworks when running on a Spark cluster, including:
* optimizing cluster setup;
* configuring the cluster;
* ingesting data; and
* monitoring long-running jobs.
We will demonstrate the techniques we cover using Google’s popular TensorFlow library. More specifically, we will cover typical issues users encounter when integrating deep learning libraries with Spark clusters.
Clusters can be configured to avoid task conflicts on GPUs and to allow using multiple GPUs per worker. Setting up pipelines for efficient data ingest improves job throughput, and monitoring facilitates both the work of configuration and the stability of deep learning jobs.
Linux Distribution Collaboration …on a Mainframe! All Things Open
Presented at All Things Open 2023
Presented by Elizabeth K. Joseph - IBM
Title: Linux Distribution Collaboration …on a Mainframe!
Abstract: Linux has run on the mainframe architecture (s390x) for over 20 years now, and there’s even Linux-only mainframe hardware! But tight collaboration between the Linux distributions is rather new. Enter the Open Mainframe Project Linux Distributions Working Group, founded in late 2021.
Bringing together various Linux distributions, both corporate-backed and community-driven, representatives from openSUSE, Debian, Fedora, SUSE, and more immediately joined the effort to share bug reports and patches that impact all the distributions. Issues are often shared and discussed on the mailing list, and more complicated topics covered during the monthly meetings. The working group has a number of success stories that will be shared.
Future potential issues are also tackled, and notes shared about upstream changes that may soon impact the package processes. In the latest effort, the team has started thinking about actual upstream projects to invite to our group to be more pro-active about changes that may cause problems on the s390x architecture.
But more importantly, this is a story about community and collaboration. Many people view the various Linux distributions as a competitive space, but like so much of the open source software community, we are all more successful when we share knowledge about our core. The success of this working group, and growing enthusiasm for it from new Linux distributions who are joining, is a great example of this.
Find more info about All Things Open:
On the web: https://www.allthingsopen.org/
Twitter: https://twitter.com/AllThingsOpen
LinkedIn: https://www.linkedin.com/company/all-things-open/
Instagram: https://www.instagram.com/allthingsopen/
Facebook: https://www.facebook.com/AllThingsOpen
Mastodon: https://mastodon.social/@allthingsopen
Threads: https://www.threads.net/@allthingsopen
2023 conference: https://2023.allthingsopen.org/
Solving Large Scale Optimization Problems using CPLEX Optimization Studio optimizatiodirectdirect
Recent advancements in Linear and Mixed Programming give us the capability to solve larger optimization problems. In this talk, using CPLEX Optimization Studio, we will discuss modeling practices and case studies, and demonstrate good practices for solving hard optimization problems. We will also discuss recent CPLEX performance improvements and recently added features.
Notes from 2016 bay area deep learning school Niketan Pansare
Slide-deck for the lunch talk at IBM Almaden Research Center on Oct 11, 2016.
Abstract: In this lunch talk, I will give a high-level summary of the Bay Area Deep Learning School, which was held at Stanford on Sept 24 and 25. The videos and slides of the lectures are available online at http://www.bayareadlschool.org/. I will also give a very brief introduction to deep learning.
Choosing the right parallel compute architecture corehard_by
Multi-core architecture is the present and future way in which the market is addressing the limitations of Moore's law. Multi-core workstations, high performance computers, GPUs, and the focus on hybrid/public cloud technologies for offloading and scaling applications are the direction in which development is heading. Leveraging multiple cores to increase application performance and responsiveness is expected, especially for classic high-throughput executions such as rendering, simulations, and heavy calculations. Choosing the correct multi-core strategy for your software requirements is essential: making the wrong decision can have serious implications for software performance, scalability, memory usage, and other factors. In this overview, we will inspect various considerations for choosing the correct multi-core strategy for your application's requirements and investigate the pros and cons of multi-threaded development vs multi-process development. For example, Boost's GIL (Generic Image Library) provides you with the ability to efficiently code image processing algorithms. However, deciding whether your algorithms should be executed as multi-threaded or multi-process has a high impact on your design, coding, future maintenance, scalability, performance, and other factors.
A partial list of considerations to take into account before taking this architectural decision includes:
- How big are the images I need to process?
- What risks can I have in terms of race-conditions, timing issues, sharing violations – does it justify multi-threading programming?
- Do I have any special communication and synchronization requirements?
- How much time would it take my customers to execute a large scenario?
- Would I like to scale processing performance by using the cloud or cluster?
We will then examine these issues in real-world environments: we will look at common development and testing environments we use in our daily work and compare the multi-core strategies they have implemented in order to promote higher development productivity.
Introducing TensorFlow: The game changer in building "intelligent" applications Rokesh Jankie
This is the slide deck used for the presentation at the Amsterdam Pipeline of Data Science, held in December 2016. TensorFlow is the open source library from Google for implementing deep learning and neural networks. This is an introduction to TensorFlow.
Note: The videos shown during the presentation are not included.
Talk by Brendan Gregg for YOW! 2021. "The pursuit of faster performance in computing is the driving reason for many new technologies and updates. This talk discusses performance improvements now underway that you will likely be adopting soon, for processors (including 3D stacking and cloud vendor CPUs), memory (including DDR5 and high-bandwidth memory [HBM]), disks (including 3D Xpoint as a 3D NAND accelerator), networking (including QUIC and eXpress Data Path [XDP]), runtimes, hypervisors, and more. The future of performance is increasingly cloud-based, with hardware hypervisors and custom processors, meaningful observability of everything down to cycle stalls (even as cloud guests), and high-speed syscall-avoiding applications that use eBPF, FPGAs, and io_uring. The talk also discusses where future performance improvements might be expected, with predictions for new technologies."
Talk for Facebook Systems@Scale 2021 by Brendan Gregg: "BPF (eBPF) tracing is the superpower that can analyze everything, helping you find performance wins, troubleshoot software, and more. But with many different front-ends and languages, and years of evolution, finding the right starting point can be hard. This talk will make it easy, showing how to install and run selected BPF tools in the bcc and bpftrace open source projects for some quick wins. Think like a sysadmin, not like a programmer."
Computing Performance: On the Horizon (2021) Brendan Gregg
Talk by Brendan Gregg for USENIX LISA 2021. https://www.youtube.com/watch?v=5nN1wjA_S30 . "The future of computer performance involves clouds with hardware hypervisors and custom processors, servers running a new type of BPF software to allow high-speed applications and kernel customizations, observability of everything in production, new Linux kernel technologies, and more. This talk covers interesting developments in systems and computing performance, their challenges, and where things are headed."
Performance Wins with BPF: Getting Started Brendan Gregg
Keynote by Brendan Gregg for the eBPF summit, 2020. How to get started finding performance wins using the BPF (eBPF) technology. This short talk covers the quickest and easiest way to find performance wins using BPF observability tools on Linux.
Talk for YOW! by Brendan Gregg. "Systems performance studies the performance of computing systems, including all physical components and the full software stack to help you find performance wins for your application and kernel. However, most of us are not performance or kernel engineers, and have limited time to study this topic. This talk summarizes the topic for everyone, touring six important areas: observability tools, methodologies, benchmarking, profiling, tracing, and tuning. Included are recipes for Linux performance analysis and tuning (using vmstat, mpstat, iostat, etc), overviews of complex areas including profiling (perf_events) and tracing (ftrace, bcc/BPF, and bpftrace/BPF), advice about what is and isn't important to learn, and case studies to see how it is applied. This talk is aimed at everyone: developers, operations, sysadmins, etc, and in any environment running Linux, bare metal or the cloud."
UM2019 Extended BPF: A New Type of Software Brendan Gregg
Keynote for Ubuntu Masters 2019 by Brendan Gregg, Netflix. Video https://www.youtube.com/watch?v=7pmXdG8-7WU&feature=youtu.be . "Extended BPF is a new type of software, and the first fundamental change to how kernels are used in 50 years. This new type of software is already in use by major companies: Netflix has 14 BPF programs running by default on all of its cloud servers, which run Ubuntu Linux. Facebook has 40 BPF programs running by default. Extended BPF is composed of an in-kernel runtime for executing a virtual BPF instruction set through a safety verifier and with JIT compilation. So far it has been used for software defined networking, performance tools, security policies, and device drivers, with more uses planned and more we have yet to think of. It is changing how we use and think about systems. This talk explores the past, present, and future of BPF, with BPF performance tools as a use case."
YOW2018 Cloud Performance Root Cause Analysis at Netflix Brendan Gregg
Keynote by Brendan Gregg for YOW! 2018. Video: https://www.youtube.com/watch?v=03EC8uA30Pw . Description: "At Netflix, improving the performance of our cloud means happier customers and lower costs, and involves root cause analysis of applications, runtimes, operating systems, and hypervisors, in an environment of 150k cloud instances that undergo numerous production changes each week. Apart from the developers who regularly optimize their own code, we also have a dedicated performance team to help with any issue across the cloud, and to build tooling to aid in this analysis. In this session we will summarize the Netflix environment, procedures, and tools we use and build to do root cause analysis on cloud performance issues. The analysis performed may be cloud-wide, using self-service GUIs such as our open source Atlas tool, or focused on individual instances, and use our open source Vector tool, flame graphs, Java debuggers, and tooling that uses Linux perf, ftrace, and bcc/eBPF. You can use these open source tools in the same way to find performance wins in your own environment."
Talk by Brendan Gregg and Martin Spier for the LinkedIn Performance Engineering meetup on Nov 8, 2018. FlameScope is a visualization for performance profiles that helps you study periodic activity, variance, and perturbations, with a heat map for navigation and flame graphs for code analysis.
Talk by Brendan Gregg for All Things Open 2018. "At over one thousand code commits per week, it's hard to keep up with Linux developments. This keynote will summarize recent Linux performance features, for a wide audience: the KPTI patches for Meltdown, eBPF for performance observability and the new open source tools that use it, Kyber for disk I/O scheduling, BBR for TCP congestion control, and more. This is about exposure: knowing what exists, so you can learn and use it later when needed. Get the most out of your systems with the latest Linux kernels and exciting features."
Linux Performance 2018 (PerconaLive keynote) Brendan Gregg
Keynote for PerconaLive 2018 by Brendan Gregg. Video: https://youtu.be/sV3XfrfjrPo?t=30m51s . "At over one thousand code commits per week, it's hard to keep up with Linux developments. This keynote will summarize recent Linux performance features, for a wide audience: the KPTI patches for Meltdown, eBPF for performance observability, Kyber for disk I/O scheduling, BBR for TCP congestion control, and more. This is about exposure: knowing what exists, so you can learn and use it later when needed. Get the most out of your systems, whether they are databases or application servers, with the latest Linux kernels and exciting features."
How Netflix Tunes EC2 Instances for Performance Brendan Gregg
CMP325 talk for AWS re:Invent 2017, by Brendan Gregg. "At Netflix we make the best use of AWS EC2 instance types and features to create a high performance cloud, achieving near bare metal speed for our workloads. This session will summarize the configuration, tuning, and activities for delivering the fastest possible EC2 instances, and will help other EC2 users improve performance, reduce latency outliers, and make better use of EC2 features. We'll show how we choose EC2 instance types, how we choose between EC2 Xen modes: HVM, PV, and PVHVM, and the importance of EC2 features such as SR-IOV for bare-metal performance. SR-IOV is used by EC2 enhanced networking, and recently for the new i3 instance type for enhanced disk performance as well. We'll also cover kernel tuning and observability tools, from basic to advanced. Advanced performance analysis includes the use of Java and Node.js flame graphs, and the new EC2 Performance Monitoring Counter (PMC) feature released this year."
Talk for USENIX LISA17: "Containers pose interesting challenges for performance monitoring and analysis, requiring new analysis methodologies and tooling. Resource-oriented analysis, as is common with systems performance tools and GUIs, must now account for both hardware limits and soft limits, as implemented using cgroups. A reverse diagnosis methodology can be applied to identify whether a container is resource constrained, and by which hard or soft resource. The interaction between the host and containers can also be examined, and noisy neighbors identified or exonerated. Performance tooling can need special usage or workarounds to function properly from within a container or on the host, to deal with different privilege levels and name spaces. At Netflix, we're using containers for some microservices, and care very much about analyzing and tuning our containers to be as fast and efficient as possible. This talk will show you how to identify bottlenecks in the host or container configuration, in the applications by profiling in a container environment, and how to dig deeper into kernel and container internals."
Kernel Recipes 2017: Using Linux perf at Netflix, by Brendan Gregg
Talk for Kernel Recipes 2017 by Brendan Gregg. "Linux perf is a crucial performance analysis tool at Netflix, and is used by a self-service GUI for generating CPU flame graphs and other reports. This sounds like an easy task, however, getting perf to work properly in VM guests running Java, Node.js, containers, and other software, has been at times a challenge. This talk summarizes Linux perf, how we use it at Netflix, the various gotchas we have encountered, and a summary of advanced features."
Kernel Recipes 2017: Performance Analysis with BPF, by Brendan Gregg
Talk by Brendan Gregg at Kernel Recipes 2017 (Paris): "The in-kernel Berkeley Packet Filter (BPF) has been enhanced in recent kernels to do much more than just filtering packets. It can now run user-defined programs on events, such as on tracepoints, kprobes, uprobes, and perf_events, allowing advanced performance analysis tools to be created. These can be used in production as the BPF virtual machine is sandboxed and will reject unsafe code, and are already in use at Netflix.
Beginning with the bpf() syscall in 3.18, enhancements have been added in many kernel versions since, with major features for BPF analysis landing in Linux 4.1, 4.4, 4.7, and 4.9. Specific capabilities these provide include custom in-kernel summaries of metrics, custom latency measurements, and frequency counting kernel and user stack traces on events. One interesting case involves saving stack traces on wake up events, and associating them with the blocked stack trace: so that we can see the blocking stack trace and the waker together, merged in kernel by a BPF program (that particular example is in the kernel as samples/bpf/offwaketime).
This talk will discuss the new BPF capabilities for performance analysis and debugging, and demonstrate the new open source tools that have been developed to use it, many of which are in the Linux Foundation iovisor bcc (BPF Compiler Collection) project. These include tools to analyze the CPU scheduler, TCP performance, file system performance, block I/O, and more."
From DTrace to Linux
1. TRACING SUMMIT EUROPE
From DTrace To Linux: What can Linux learn from DTrace?
Oct, 2014
Brendan Gregg
Senior Performance Architect
bgregg@netflix.com
@brendangregg
2. Brendan Gregg
• DTrace contributions include:
  – Primary author of the DTrace book
  – DTraceToolkit
  – dtrace-cloud-tools
  – DTrace network providers
• I now work on Linux at Netflix
  – using: ftrace, perf_events, SystemTap, ktap, eBPF, …
  – created: perf-tools, msr-cloud-tools
• Opinions in this talk are my own
3. Agenda
1. DTrace
  – What is DTrace, really?
  – Who is DTrace for, really?
  – Why doesn’t Linux have DTrace?
  – What worked well?
  – What didn’t work well?
2. Linux Tracers
  – ftrace, perf_events, eBPF, …
Topics include adoption, marketing, technical challenges, and our usage at Netflix.
8. Prior Technology
• Also:
  – Sun’s TNF
  – DProbes: user + kernel dynamic tracing
  – Linux Trace Toolkit (LTT)
  – Others, including offline binary instrumentation
• DProbes and LTT were combined in Nov 2000, but not integrated into the Linux kernel¹
• Sun set forth to produce a production-safe tool
¹ http://lkml.iu.edu/hypermail/linux/kernel/0011.3/0183.html
9.
10. Technology
• DTrace:
  – Safe for production use
    • You might step on your foot (overhead), but you won’t shoot it off
  – Dynamic tracing, static tracing, and profiling
  – User- and kernel-level, unified
  – Programmatic: filters and summaries
  – Solved countless issues in dev and prod
• That’s what DTrace is for me
  – An awesome technology, often needed to root cause kernel & app issues
• But for most people….
11. A Typical Conversation…
“Does Linux have DTrace yet?”
“No.”
“That’s a pity”
“Why?”
“DTrace is awesome!”
“Why, specifically?”
“I’m not sure”
“Have you used it?”
“No.”
14. Early Marketing
• DTrace had awesome marketing
  – People still want it but don’t really know why
• Early marketing: traditional, $$$
  – Great marketing product managers
    • 10 Moves Ahead campaign: airports, stations, etc.
  – Sun sales staff pitched DTrace directly
  – Sun technology evangelists
• Benefits
  – Not another Sun tech no one knew about
  – Compelled people to learn more, try it out
15. Marketing Evolved
• Sun marketing became innovative
  – Engineering blogs, BigAdmin
  – Marketing staff who used and understood DTrace
    • Who could better articulate its value
  – Marketing more directly from the engineers
17. Later Marketing
• Many initiatives by Deirdré Straughan:
  – Social media, blogs, events, the ponycorn mascot, ...
  – Video and share everything: all meetups, talks
• Blogs:
  – including http://dtrace.org; my own > 1M views
• Books:
  – my own > 30k sold
• Videos:
  – me shouting while DTracing disks, ~1M views
• Language support exposed new communities to DTrace
22. DTrace end-users: Current
Estimated
DTrace guide users: ~100
Script end-users: ~5,000
Note: 91.247% of statistics are made up
23. DTrace end-users: Current
• DTrace guide users: ~100
  – Understand the ~400 page Dynamic Tracing Guide
  – Develop their own scripts from scratch
  – Understand overhead intuitively
• Script end-users: ~5000
  – DTraceToolkit, Google
  – Run scripts. Some tweaks/customizations.
27. Company Usage
• Practical usage for most companies:
  – A) A performance team (or person)
    • Acquires useful scripts
    • Develops custom scripts
  – B) The rest of the company asks (A) for script/help
    • They need to know what’s possible, to know to ask
  – Or, you buy/develop a GUI that everyone can use
• There are some exceptions
31. 1. It does (sort of)
• Linux has changed
  – In 2005, numerous Linux issues were difficult or impossible to solve. Linux needed a DTrace equivalent.
  – By 2014, many of these are now solvable, especially using ftrace, perf_events, kprobes, uprobes: all part of the Linux kernel
33. 2. Technical
• Linux is a more difficult environment
  – Solaris always has symbols, via CTF, which DTrace uses for dynamic tracing
  – Linux doesn’t always have symbols/debuginfo
34. 3. Linux isn’t a Company
“All the wood behind one arrow”
– Scott McNealy, CEO, Sun Microsystems
35. 3. Linux isn’t a Company
• Linus can refuse patches, but can’t stop projects
  – The tracing wood is split between many arrows
    • ftrace, perf_events, eBPF, SystemTap, ktap, LTTng, …
  – And we are a small community: there’s not much wood to go around!
37. 4. No Trace Race
• Post 2001, Solaris was losing ground to Linux. Sun desperately needed differentiators to survive
  – Three top Sun engineers spent years on DTrace
  – Sun marketing gave it their best shot…
• This circumstance will never exist again
  – For Linux today, it would be like having Linus, Ingo, and Steven do tracing full-time for three years, followed by a major marketing campaign
• There may never be another trace race. Unless…
39. 1. The CDDL
From: Claire Giordano <claire.giordano@sun.com>
To: license-discuss@opensource.org [Open Source Initiative]
Subject: For Approval: Common Development and Distribution License (CDDL)
Date: Wed, 01 Dec 2004 19:47:39 -0800
[…]
Like the MPL, the CDDL is not expected to be compatible with the GPL,
since it contains requirements that are not in the GPL (for example,
the "patent peace" provision in section 6). Thus, it is likely that
files released under the CDDL will not be able to be combined with
files released under the GPL to create a larger program.
[…]
CDDL Team, Sun Microsystems
Source: http://lwn.net/Articles/114840/
40. 1. The CDDL
• Linux traditionally includes the tracer/profiler in the (GPL) kernel, but the DTrace license is CDDL
  – Out-of-tree projects have maintenance difficulties
  – Oracle (who own the DTrace copyrights) could relicense it as GPL, but haven’t, and may never do this
• Note that ZFS on Linux is doing well, despite being CDDL, and out of tree
42. 2. DTrace ports
• There are two ports, but both currently incomplete
• A) https://github.com/dtrace4linux/linux:
  – Mostly one UK developer, Paul Fox, as a hobby since 2008 (when he isn’t developing on the Raspberry Pi)
• B) Oracle Linux DTrace:
  – Open source kernel, closed source user-level ($)
    • We pay for monitoring tools; why not this too?
  – Experienced engineers, test suite focused
  – Had been good progress, but no updates for months
45. 1. Production Safety
• DTrace architecture
  – Restricted probe context: no kernel facility calls, restricted instructions, no backwards branches, restricted loads/stores
  – Heartbeat: aborted due to systemic unresponsiveness
• DTrace Test Suite
  – Hundreds of tests
• Linux is learning this:
  – Oracle Linux DTrace is taking the test suite seriously
  – ftracetest
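The "no backwards branches" restriction can be illustrated with a toy check. The sketch below is not DTrace's verifier: the instruction format (a list of `(opcode, target)` tuples) and the names `has_backward_branch` and `max_steps` are hypothetical, invented only to show why forbidding backward jumps bounds the work a probe can do.

```python
# Toy illustration only: a hypothetical instruction format, not DTrace's own.
# Each instruction is (opcode, jump_target_or_None); targets are list indexes.

def has_backward_branch(program):
    """Return True if any jump targets its own index or an earlier one."""
    for index, (_opcode, target) in enumerate(program):
        if target is not None and target <= index:
            return True
    return False


def max_steps(program):
    """With forward-only jumps, each instruction runs at most once."""
    if has_backward_branch(program):
        raise ValueError("backward branch: execution time is unbounded")
    return len(program)


safe = [("load", None), ("jump_if_zero", 3), ("add", None), ("ret", None)]
loop = [("load", None), ("add", None), ("jump", 0)]

print(has_backward_branch(safe))  # False
print(has_backward_branch(loop))  # True
print(max_steps(safe))            # 4
```

A program that passes this check cannot loop, so its worst-case cost is its length, which is the property a production-safe probe context relies on.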
47. 2. All the wood behind one arrow
• Can Linux learn this?
  – Can we vote some off the Linux tracing island?
• At least, no new tracers in 2015, please!
52. 4. Many Example Scripts
• Scripts serve many needs:
  – tools: ready to run
  – examples: learn by-example
  – marketing: each is a use case
• DTrace book scripts
  – 150+ short examples
• DTraceToolkit
  – 230 more scripts
  – all DTraceToolkit scripts have man pages, example files, and are tested
  – An essential factor in DTrace’s adoption
53. 4. Many Example Scripts
• Linux can learn:
  – Many users will just run scripts, not write them
  – People want good short examples
  – If they aren’t tested, they don’t work
    • It’s easy to generate metrics that kind-of work; it’s hard to make them reliable for different workloads.
  – Maintenance of dynamic tracing scripts is painful
    • The instrumented code can change
    • Need more static tracepoints
55. 5. Marketing
• DTrace was effectively marketed in many ways
  – Traditional, social, blogs, scripts, ponycorn, …
• Linux has virtually no marketing for its tracers
  – ftrace is great, if you ever discover it; etc.
  – Marketing spend is on commercial products instead
• Linux can learn to market what it has
  – Tracers may also benefit from “a great name and a cute logo”¹
  – “eBPF” is not catchy, and doesn’t convey meaning
¹ http://thenewstack.io/why-did-docker-catch-on-quickly-and-why-is-it-so-interesting/
61. 1. Adoption
• Few customers ever wrote DTrace scripts
  – DTrace should have been used more than it was
  – Sun’s “killer” tool just wasn’t
  – Better pickup rate with developers, not sysadmins
• Many customers just ran my scripts
  – Not ideal, but better than nothing
  – This wasn’t what many at Sun dreamed
• Internal adoption was slow, limited
  – Sun could have done much more, but didn’t
• The problem was knowing what to do with it
  – The syntax was the easy part
62. 1. Adoption
• Linux can learn:
  – Adoption is about more than just the technology
    • Documentation, marketing, training, community
  – Teaching what it does is more important than how
    • Everyone needs to know when to ask for it, not necessarily how to use it
  – Needs an adoption curve (not a step function)
    • Tools, one-liners, short scripts, …
63. 2. Training
This is to certify that
Brendan Gregg
Has Completed the Sun Educational course
DTrace is a Solaris differentiator
64. 2. Training
• Early training was not very effective
  – Sun began including the DTraceToolkit in courses, with better success
• It gradually improved
  – The last courses I developed and taught (after Sun) used simulated problems for the students to solve on their own with DTrace
• Linux can learn:
  – Lab-based training is most effective. Online tutorials?
66. 3. GUIs
• Dozens of performance monitoring products, but almost no meaningful DTrace support
• A couple of exceptions:
  – Oracle ZFS Storage Appliance Analytics
    • Formerly the Sun Storage 7000 Analytics
    • Should be generalized. Oracle Solaris 11.3?
  – Joyent Cloud Analytics
67. 3. GUIs
• Linux can learn:
  – Real adoption possible through scripts & GUIs
  – Use the GUI to add value to the data
    • Heat maps: latency, utilization, offset
    • Flame graphs
    • Time series thread visualizations (Trace Compass)
    • ie, not just line graphs!
  – Commercial GUI products have marketing budget
    • Application perf monitoring was $2.4B in 2013¹
¹ https://www.gartner.com/doc/2752217/market-share-analysis-application-performance
68. 3. GUIs
• Heat maps are an example must-have use case for trace data
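As a rough illustration of why heat maps suit trace data, the sketch below bins per-event latencies into (time column, latency row) buckets and counts events per bucket; a GUI would then colour each bucket by its count. The function name `heat_map`, the bucket sizes, and the sample events are all assumptions made for this example.

```python
from collections import Counter


def heat_map(events, time_step_s, latency_step_ms):
    """Count events per (time column, latency row) bucket."""
    counts = Counter()
    for end_time_s, latency_ms in events:
        column = int(end_time_s // time_step_s)
        row = int(latency_ms // latency_step_ms)
        counts[(column, row)] += 1
    return counts


# Hypothetical (completion time in seconds, latency in ms) samples.
events = [(0.1, 0.4), (0.2, 0.6), (0.7, 0.5), (0.9, 8.2), (1.3, 0.3), (1.8, 9.1)]

for bucket, count in sorted(heat_map(events, 1.0, 1.0).items()):
    print(bucket, count)
```

Unlike a line graph of average latency, the bucket counts keep both the fast and the slow populations visible.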
69. 4. Overheads
While the DTrace technology is awesome, it does have some minor technical challenges as well
70. 4. Overheads
• While optimized, for many targets the DTrace CPU overheads can still be too high
  – Scheduler tracing, memory allocation tracing
  – User-level dynamic tracing (fast trap)
  – VM probes (eg, Java disables some probes by default)
  – 10 GbE Network I/O, etc…
• In some cases it doesn’t matter
  – Desperation: system already melting down
  – Troubleshooting in dev: speed not a concern
• Linux can learn:
  – Speed can matter, faster makes more possible
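A back-of-the-envelope way to see why high-frequency targets are a problem: overhead scales with event rate times per-event cost. The sketch below is plain arithmetic; the per-event cost and event rates are placeholder numbers picked to show the scaling, not measurements of DTrace or any Linux tracer.

```python
def tracing_overhead_percent(events_per_second, cost_per_event_us):
    """CPU time spent in instrumentation, as a percentage of one CPU."""
    busy_us_per_second = events_per_second * cost_per_event_us
    return 100.0 * busy_us_per_second / 1_000_000


# Placeholder numbers, chosen only to show the scaling.
low_rate = tracing_overhead_percent(1_000, 2.0)       # e.g. disk I/O
high_rate = tracing_overhead_percent(1_000_000, 2.0)  # e.g. scheduler events

print(low_rate)
print(high_rate)
```

With the same assumed per-event cost, the low-rate target costs a fraction of a percent of one CPU while the high-rate target would need more than a whole CPU, which is why a cheaper per-event cost makes more targets practical.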
72. 5. Syscall Provider
• Solaris DTrace instrumented the trap table, and called it the syscall provider
  – Which is actually an unstable interface
  – Breaks between Solaris versions
    • And really broke in Oracle Solaris 11
  – Other weird caveats
• Linux can learn:
  – syscalls are the #1 target for users learning system tracers. The API should be easy and stable.
73. Other Issues
• The lack of:
  – Bounded loops (like SystemTap)
  – Kernel instruction tracing (like perf_events)
  – Easy PMC interface (like perf stat)
  – Aggregation key/value access (stap, ktap, eBPF)
  – Kernel source (issue for Oracle Solaris only)
• 4+ second startup times
  – Several Linux tracers start instantly
75.
• Massive AWS EC2 Linux cloud, with FreeBSD appliances for content delivery
• Performance is critical: >50M subscribers
• Just launched in Europe!
76. System Tracing at Netflix
• Present:
  – ftrace can serve many needs
  – perf_events some more, esp. with debuginfo
  – SystemTap as needed, esp. for Java
  – ad hoc other tools
• Future:
  – ftrace/perf_events/ktap with eBPF, for a fully featured and mainline tracer?
  – One of the other tracers going mainline?
• Summarizing 4 tracers…
78. 1. ftrace
• Tracing and profiling: /sys/kernel/debug/tracing
  – added by Steven Rostedt and others since 2.6.27, and already enabled on our servers (3.2+)
• Experiences:
  – very useful capabilities: tracing, counting
  – surprising features: graphing (latencies), filters
• Front-end tools to ease use
  – https://github.com/brendangregg/perf-tools
  – WARNING: these are unsupported hacks
  – There’s also the trace-cmd front-end by Steven
• 4 examples…
79. perf-tools: iosnoop
• Block I/O (disk) events with latency:
# ./iosnoop -ts
Tracing block I/O. Ctrl-C to end.
STARTs          ENDs            COMM       PID   TYPE DEV    BLOCK     BYTES  LATms
5982800.302061  5982800.302679  supervise  1809  W    202,1  17039600  4096   0.62
5982800.302423  5982800.302842  supervise  1809  W    202,1  17039608  4096   0.42
5982800.304962  5982800.305446  supervise  1801  W    202,1  17039616  4096   0.48
5982800.305250  5982800.305676  supervise  1801  W    202,1  17039624  4096   0.43
[…]
# ./iosnoop -h
USAGE: iosnoop [-hQst] [-d device] [-i iotype] [-p PID] [-n name] [duration]
    -d device   # device string (eg, "202,1)
    -i iotype   # match type (eg, '*R*' for all reads)
    -n name     # process name to match on I/O issue
    -p PID      # PID to match on I/O issue
    -Q          # include queueing time in LATms
    -s          # include start time of I/O (s)
    -t          # include completion time of I/O (s)
    -h          # this usage message
    duration    # duration seconds, and use buffers
[…]
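Per-event output like this is easy to post-process. The following sketch parses the columns shown above and groups latencies by PID. It assumes the "Tracing block I/O" banner has already been stripped so the header row comes first, and that the COMM column contains no spaces; the helper names `parse_iosnoop` and `latency_by_pid` are made up for this example and are not part of perf-tools.

```python
SAMPLE = """\
STARTs ENDs COMM PID TYPE DEV BLOCK BYTES LATms
5982800.302061 5982800.302679 supervise 1809 W 202,1 17039600 4096 0.62
5982800.302423 5982800.302842 supervise 1809 W 202,1 17039608 4096 0.42
5982800.304962 5982800.305446 supervise 1801 W 202,1 17039616 4096 0.48
5982800.305250 5982800.305676 supervise 1801 W 202,1 17039624 4096 0.43
"""


def parse_iosnoop(text):
    """Yield one dict per I/O line, keyed by the header's column names."""
    lines = [line.split() for line in text.splitlines() if line.strip()]
    header, rows = lines[0], lines[1:]
    for fields in rows:
        if len(fields) == len(header):  # skip truncated or unrelated lines
            yield dict(zip(header, fields))


def latency_by_pid(text):
    """Return {pid: [latencies in ms]} for the parsed events."""
    result = {}
    for event in parse_iosnoop(text):
        result.setdefault(int(event["PID"]), []).append(float(event["LATms"]))
    return result


for pid, latencies in sorted(latency_by_pid(SAMPLE).items()):
    print(pid, len(latencies), max(latencies))
```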
82. perf-tools: kprobe
• Just wrapping capabilities eases use. Eg, kprobes:
# ./kprobe 'p:open do_sys_open filename=+0(%si):string' 'filename ~ "*stat"'
Tracing kprobe myopen. Ctrl-C to end.
postgres-1172 [000] d... 6594028.787166: open: (do_sys_open+0x0/0x220) filename="pg_stat_tmp/pgstat.stat"
postgres-1172 [001] d... 6594028.797410: open: (do_sys_open+0x0/0x220) filename="pg_stat_tmp/pgstat.stat"
postgres-1172 [001] d... 6594028.797467: open: (do_sys_open+0x0/0x220) filename="pg_stat_tmp/pgstat.stat"
^C
Ending tracing...
• By some definition of “ease”. Would like easier symbol usage, instead of +0(%si).
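The trace lines printed by this probe follow a regular layout (task-PID, CPU, flags, timestamp, probe name, function, arguments), so they can be summarized in user space. The sketch below counts opens per process and filename; the regular expression is an assumption fitted to the three lines shown on this slide, not a general ftrace parser, and `count_opens` is a hypothetical helper name.

```python
import re
from collections import Counter

# Pattern fitted to the sample lines only; other probes print other arguments.
LINE = re.compile(
    r'^\s*(?P<comm>.+)-(?P<pid>\d+)\s+\[(?P<cpu>\d+)\]\s+(?P<flags>\S+)\s+'
    r'(?P<time>\d+\.\d+):\s+(?P<probe>\w+):\s+\((?P<func>[^)]+)\)\s+'
    r'filename="(?P<filename>[^"]*)"'
)

SAMPLE = '''\
postgres-1172 [000] d... 6594028.787166: open: (do_sys_open+0x0/0x220) filename="pg_stat_tmp/pgstat.stat"
postgres-1172 [001] d... 6594028.797410: open: (do_sys_open+0x0/0x220) filename="pg_stat_tmp/pgstat.stat"
postgres-1172 [001] d... 6594028.797467: open: (do_sys_open+0x0/0x220) filename="pg_stat_tmp/pgstat.stat"
'''


def count_opens(text):
    """Count matching trace lines per (process name, filename)."""
    counts = Counter()
    for line in text.splitlines():
        match = LINE.match(line)
        if match:
            counts[(match.group("comm"), match.group("filename"))] += 1
    return counts


print(count_opens(SAMPLE))
```

Doing this counting in user space means every event is copied out of the kernel first, which is the cost that in-kernel aggregation avoids.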
83. 1. ftrace
• Suggestions:
  – I’m blogging and so can you!
  – Function profiler:
    • Can these in-kernel counts be used for other vars? Eg, associative array or histogram of %dx
  – Function grapher:
    • Can the timing be exposed by some vars? Picture histogram of latency
  – Multi-user access possible?
85. 2. perf_events
• In-kernel, tools/perf, multi-tool, “perf” command
• Experiences:
  – Stable, powerful, reliable
  – The sub options can feel inconsistent (perf bench?)
  – Amazing with kernel debuginfo, when we have it
  – We use it for CPU stack profiles all the time
    • And turn them into flame graphs, which have solved numerous issues so far…
86. perf CPU Flame Graph
[Figure: CPU flame graph with annotated regions: broken Java stacks (missing frame pointer), kernel TCP/IP, GC, idle thread, time, locks, epoll]
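The step that turns stack samples into flame graph input is a simple fold: identical stacks are merged and counted. The sketch below shows that idea on made-up samples; the frame names are hypothetical and `fold_stacks` is an illustrative helper, not the script used at Netflix.

```python
from collections import Counter


def fold_stacks(samples):
    """Collapse stack samples (root frame first) into 'a;b;c' -> count."""
    folded = Counter()
    for frames in samples:
        folded[";".join(frames)] += 1
    return folded


# Hypothetical samples, one list of frames per profiler sample.
samples = [
    ["java", "run", "read", "tcp_recvmsg"],
    ["java", "run", "read", "tcp_recvmsg"],
    ["java", "run", "gc"],
    ["swapper", "cpu_idle"],
]

for stack, count in sorted(fold_stacks(samples).items()):
    print(stack, count)
```

Each output line is one unique stack and its sample count; the width of a frame in the rendered graph is proportional to the counts of all stacks that pass through it.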
87. 2. perf_events
• Suggestions:
  – Support for function argument symbols without a full debuginfo
  – Rework scripting framework (eg, try porting iosnoop)
    • eg, “perf record” may need a tunable timeout to trigger data writes, for efficient interactive scripts
  – Break up the multi-tool a bit (separate perf bench)
  – eBPF integration for custom aggregations?
89. 3. SystemTap
• The most powerful of the tracers
• Used for the deepest custom tracing
  – Especially Java hotspot probes
• Experiences:
  – Undergoing a reset. Switching to the latest SystemTap version, and a newer kernel. So far, so good.
  – Trying out nd_syscall for debuginfo-less tracing
• Suggestions:
  – More non-debuginfo tapset functionality
91. 4. eBPF
• Extended BPF: programs on tracepoints
  – High performance filtering: JIT
  – In-kernel summaries: maps
• eg, in-kernel latency heat map (showing bimodal):
[Figure: latency heat map over time, with two annotated modes: low latency cache hits, high latency device I/O]
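The idea behind in-kernel summaries can be sketched outside the kernel: instead of emitting every event, each latency is counted into a bucket, and only the small table of counts is read out. The Python below is a stand-in for what a map would hold, not eBPF code; the power-of-two bucketing and the sample latencies are assumptions for illustration.

```python
from collections import Counter


def log2_bucket(value_us):
    """Index of the power-of-two bucket holding value_us (0 for values < 2)."""
    return max(int(value_us), 1).bit_length() - 1


def summarize(latencies_us):
    """Count latencies per power-of-two bucket."""
    histogram = Counter()
    for latency in latencies_us:
        histogram[log2_bucket(latency)] += 1
    return histogram


# Hypothetical bimodal latencies in microseconds: cache hits and device I/O.
latencies_us = [90, 110, 120, 100, 8000, 9500, 105, 7900]

for bucket, count in sorted(summarize(latencies_us).items()):
    print(f"{2 ** bucket:>6} -> {2 ** (bucket + 1) - 1:>6} us: {count}")
```

However many events occur, the summary stays a handful of integers, which is why this approach can have lower overhead than copying each event to user space.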
92. 4. eBPF
• Experiences:
  – Can have lower CPU overhead than DTrace
  – Very powerful: really custom maps
  – Assembly version very hard to use; C is better, but still not easy
• Suggestions:
  – Integrate: custom in-kernel aggregations is the missing piece
93. Other Tracers
• Experiences and suggestions:
  – ktap
  – LTTng
  – Oracle Linux DTrace
  – dtrace4linux
  – sysdig
94. The Tracing Landscape, Oct 2014
[Figure: tracers plotted by ease of use (brutal to less brutal) against stage of development (alpha to mature; my opinion), with scope & capability also shown: sysdig, perf, ftrace, eBPF, ktap, stap, dtrace4L.]
95. Summary
• DTrace is an awesome technology
  – Which has also had awesome marketing
    • Traditional, social, sales, blogs, …
  – Most people won’t use it directly, and that’s ok
    • Drive usage via GUIs and scripts
• Linux Tracers are catching up, and may surpass
  – It’s not 2005 anymore
    • Now we have ftrace, perf_events, kprobes, uprobes, …
  – Speed and aggregations matter
    • If DTrace is Kitty Hawk, eBPF is a jet engine
96. Acks
• dtrace.conf X-ray pony art by substack
• http://www.raspberrypi.org/ Raspberry Pi image
• http://en.wikipedia.org/wiki/Crash_test_dummy photo by Brady Holt
• https://findery.com/johnfox/notes/all-the-wood-behind-one-arrow
• http://en.wikipedia.org/wiki/Early_flying_machines hang glider image
• http://www.beginningwithi.com/2010/09/12/how-the-dtrace-book-got-done/
• http://www.cafepress.com/joyentsmartos.724465338
• http://generalzoi.deviantart.com/art/Pony-Creator-v3-397808116
• Tux by Larry Ewing; Linux® is the registered trademark of Linus Torvalds in the U.S. and other countries.
• Thanks Dominic Kay and Deirdré Straughan for feedback