This document provides information on monitoring Linux system resources and performance. It discusses tools like vmstat, sar, iostat for monitoring CPU usage, memory usage, I/O usage, and other metrics. It also covers Linux processes, memory management, and block device monitoring.
A brief talk on systems performance for the July 2013 meetup "A Midsummer Night's System", video: http://www.youtube.com/watch?v=P3SGzykDE4Q. This summarizes how systems performance has changed from the 1990s to today. This was the reason for writing a new book on systems performance: to provide a reference that is up to date, covering new tools, technologies, and methodologies.
The document provides an overview of common Linux performance analysis tools including top, mpstat, ps, sar, and others. It discusses how these tools can be used to monitor CPU, memory, disk, network, and process-level performance metrics. Examples are given showing output from top displaying real-time process activity, mpstat showing a CPU utilization breakdown, ps displaying currently running processes, and sar showing historical CPU utilization. The tools can help identify potential performance issues like high CPU usage, memory pressure, and disk or network bottlenecks.
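As a hedged illustration (not taken from the slides; mpstat and sar ship in the sysstat package and may not be installed everywhere, so those calls are guarded), a quick CPU triage along these lines might look like:

```shell
# Top CPU consumers; ps ships with procps and is available on any Linux box.
ps -eo pid,pcpu,pmem,comm --sort=-pcpu | head -n 5

# Per-CPU utilization breakdown and an interval CPU sample; both come
# from the sysstat package, so skip them quietly if absent.
command -v mpstat >/dev/null && mpstat -P ALL 1 1 || true
command -v sar    >/dev/null && sar -u 1 1        || true
```

The same guard-then-run pattern works for any of the sysstat tools when scripting checks across a mixed fleet.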
Kernel Recipes 2015 - Porting Linux to a new processor architecture - Anne Nicolas
Getting the Linux kernel running on a new processor architecture is a difficult process. Worse still, there is not much documentation available describing the porting process.
After spending countless hours becoming almost fluent in many of the supported architectures, I discovered that a well-defined skeleton shared by the majority of ports exists. Such a skeleton can logically be split into two parts that intersect a great deal.
The first part is the boot code, meaning the architecture-specific code that is executed from the moment the kernel takes over from the bootloader until init is finally executed. The second part concerns the architecture-specific code that is regularly executed once the booting phase has been completed and the kernel is running normally. This second part includes starting new threads, dealing with hardware interrupts or software exceptions, copying data from/to user applications, serving system calls, and so on.
In this talk I will provide an overview of the procedure, or at least one possible procedure, that can be followed when porting the Linux kernel to a new processor architecture.
Joël Porquet – Joël was a post-doc at Pierre and Marie Curie University (UPMC) where he ported Linux to TSAR, an academic processor. He is now looking for new adventures.
USENIX ATC 2017 - Performance Superpowers with Enhanced BPF - Brendan Gregg
Talk for USENIX ATC 2017 by Brendan Gregg
"The Berkeley Packet Filter (BPF) in Linux has been enhanced in very recent versions to do much more than just filter packets, and has become a hot area of operating systems innovation, with much more yet to be discovered. BPF is a sandboxed virtual machine that runs user-level defined programs in kernel context, and is part of many kernels. The Linux enhancements allow it to run custom programs on other events, including kernel- and user-level dynamic tracing (kprobes and uprobes), static tracing (tracepoints), and hardware events. This is finding uses for the generation of new performance analysis tools, network acceleration technologies, and security intrusion detection systems.
This talk will explain the BPF enhancements, then discuss the new performance observability tools that are in use and being created, especially from the BPF compiler collection (bcc) open source project. These tools provide new insights for file system and storage performance, CPU scheduler performance, TCP performance, and much more. This is a major turning point for Linux systems engineering, as custom advanced performance instrumentation can be used safely in production environments, powering a new generation of tools and visualizations.
Because these BPF enhancements are only in very recent Linux (such as Linux 4.9), most companies are not yet running new enough kernels to be exploring BPF yet. This will change in the next year or two, as companies including Netflix upgrade their kernels. This talk will give you a head start on this growing technology, and also discuss areas of future work and unsolved problems."
This document discusses stateless hypervisors that are booted from a live image rather than persisting to local storage. Some key points:
- Rackspace uses stateless hypervisors booted from a network image to improve consistency and allow easy updating of all servers.
- The hypervisors are built using Ansible from a base operating system chroot. Common configurations are applied and different "personalities" like KVM or Xen are configured.
- Servers boot the image over the network using iPXE or locally using GRUB. The image runs in memory and mounts persistent storage.
- This approach allows rapid, consistent provisioning of thousands of hypervisors across different hardware with reproducible builds.
Linux Tracing Superpowers by Eugene Pirogov - Pivorak MeetUp
This document discusses Linux tracing tools and the evolution from DTrace on BSD to eBPF on Linux. It begins with an overview of DTrace and its capabilities on BSD, then discusses the limitations of early Linux tracing tools. It introduces eBPF and BCC (the BPF Compiler Collection), which make it easier to write and use eBPF programs. Examples are given showing how BCC can be used to trace system calls, file opens, and command executions. The document argues that BCC and eBPF address the problems of early Linux tracing by making the tools more approachable and powerful for production use.
Systems Performance: Enterprise and the Cloud - Brendan Gregg
My talk for BayLISA, Oct 2013, launching the Systems Performance book. Operating system performance analysis and tuning leads to a better end-user experience and lower costs, especially for cloud computing environments that pay by the operating system instance. This book covers concepts, strategy, tools and tuning for Unix operating systems, with a focus on Linux- and Solaris-based systems. The book covers the latest tools and techniques, including static and dynamic tracing, to get the most out of your systems.
Embedded Recipes 2018 - Finding sources of Latency In your system - Steven Ro... - Anne Nicolas
Having just an RTOS is not enough for a real-time system. The hardware must be deterministic, as must the applications that run on the system. When you are missing deadlines, the first thing that must be done is to find the source of the latency that caused the issue. It could be the hardware, the operating system, the application, or a combination of the above. This talk will discuss how to determine where the latency is using tools that come with the Linux kernel, and will explain a few cases that caused issues.
This document summarizes an Italian presentation on monitoring and tuning I/O performance on Linux. It covers I/O monitoring tools such as iostat, iotop, and blktrace; tuning techniques such as dirty-page writeback and filesystem options; and ensuring the reliability of data writes through proper synchronization and error handling. The presentation provides an overview of the Linux I/O subsystems and dives into specific tools and parameters for optimizing I/O.
While probably the most prominent, Docker is not the only tool for building and managing containers. Originally meant to be a "chroot on steroids" to help debug systemd, systemd-nspawn provides a fairly uncomplicated approach to work with containers. Being part of systemd, it is available on most recent distributions out-of-the-box and requires no additional dependencies.
This deck will introduce a few concepts involved in containers and will guide you through the steps of building a container from scratch. The payload will be a simple service, which will be automatically activated by systemd when the first request arrives.
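By way of illustration (the unit names, port, and binary path here are hypothetical, not taken from the deck), socket activation of such a payload is wired up with a socket/service unit pair; systemd listens on the socket and starts the service only when the first request arrives:

```
# demo.socket -- systemd owns this socket and starts demo.service on demand
[Unit]
Description=Listening socket for the demo payload

[Socket]
ListenStream=8080

[Install]
WantedBy=sockets.target

# demo.service -- the payload itself; activated by demo.socket, never at boot
[Unit]
Description=Demo payload service

[Service]
ExecStart=/usr/local/bin/demo-server
```

Because the two units share a name, systemd matches them automatically; only the .socket unit needs to be enabled.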
This document discusses Exadata and OLTP performance. It summarizes the results of running the Silly Little Oracle Benchmark (SLOB) on different Exadata and non-Exadata systems to evaluate logical and physical I/O performance at different concurrency levels. The tests show that Exadata's flash cache allows it to maintain much lower and more consistent physical I/O response times as concurrency increases compared to systems without flash. Logical I/O performance depends more on CPU and memory configuration.
Video and slides synchronized, mp3 and slide download available at http://bit.ly/1N4GN6z.
Brendan Gregg focuses on broken tools and metrics instead of the working ones. Metrics can be misleading, and counters can be counter-intuitive. Gregg includes advice and methodologies for verifying new performance tools, understanding how they work, and using them successfully. Filmed at qconsf.com.
Brendan Gregg is a senior performance architect at Netflix, where he does large scale computer performance design, analysis, and tuning. He is the author of multiple technical books including Systems Performance published by Prentice Hall, and received the USENIX LISA Award for Outstanding Achievement in System Administration.
This document introduces Docker and provides an overview of its key features and benefits. It explains that Docker allows developers to package applications into lightweight containers that can run on any Linux server. Containers deploy instantly and consistently across environments due to their isolation via namespaces and cgroups. The document also summarizes Docker's architecture including storage drivers, images, and the Dockerfile for building images.
This talk is about a new interface to get information about processes, called task_diag, which we developed.
Currently the /proc file system is used to get information about the processes running on the system. All information is presented as text files, which is convenient for humans but not for programs such as ps and top. This incurs significant delays, especially on systems with lots of containers running, which is frequently the case nowadays.
Ideally, tools such as top and ps would get information in binary format, and use flexible means to specify which kinds of information are required and for which tasks. Presented is a new interface with all these features, called task_diag.
task_diag is based on netlink sockets and resembles sock_diag, which is used to get information about sockets. It uses a request-response model: a request specifies a set of processes and the required properties for them, and a response contains the requested information, divided into several netlink packets if it is too long.
task_diag is much faster than the /proc file system. For example, when reading from /proc, ps opens, reads, and closes many files, and repeats this for every single process. With task_diag, it is just a matter of sending one request and receiving one response.
Besides ps and top, the proposed interface is to be used by CRIU, a container checkpoint/restore and live migration mechanism. Developers of the perf tool also found it useful and implemented a prototype, which showed big performance improvements when using task_diag instead of procfs.
Our performance measurements show that the ps tool works at least four times faster if task_diag is used instead of procfs.
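To make the /proc overhead concrete, here is a minimal sketch (mine, Linux-only) of the per-process open/read/close pattern that ps performs today, and which task_diag replaces with a single netlink round trip:

```shell
# Mimic ps: one open()+read()+close() of /proc/<pid>/stat per task.
count=0
for stat in /proc/[0-9]*/stat; do
    # A task may exit between globbing and reading; skip it quietly.
    read -r line 2>/dev/null < "$stat" || continue
    count=$((count + 1))
done
echo "touched $count per-process files"
```

On a busy container host this loop performs thousands of syscalls; that is the cost the single request/response of task_diag avoids.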
This document provides an overview of systemd and how it differs from traditional init systems. It discusses systemd units and how to manage services using systemctl. It covers customizing units using drop-ins, managing resources with cgroups, converting init scripts, and using the systemd journal. The presentation aims to demystify systemd and provide administrators with practical guidance on using its main features.
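For instance (the service name and values here are illustrative, not from the presentation), a drop-in overrides only the directives it names, leaving the vendor unit untouched; resource directives are enforced through cgroups:

```
# /etc/systemd/system/myapp.service.d/override.conf
# (hypothetical drop-in, typically created via: systemctl edit myapp)
[Service]
Restart=on-failure
# Resource limits, enforced by the kernel via cgroups:
MemoryMax=512M
CPUQuota=50%
```

After editing, `systemctl daemon-reload` makes systemd pick up the drop-in.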
The document discusses diagnosing and mitigating MySQL performance issues. It describes using various operating system monitoring tools like vmstat, iostat, and top to analyze CPU, memory, disk, and network utilization. It also discusses using MySQL-specific tools like the MySQL command line, mysqladmin, mysqlbinlog, and external tools to diagnose issues like high load, I/O wait, or slow queries by examining metrics like queries, connections, storage engine statistics, and InnoDB logs and data written. The agenda covers identifying system and MySQL-specific bottlenecks by verifying OS metrics and running diagnostics on the database, storage engines, configuration, and queries.
Talk for PerconaLive 2016 by Brendan Gregg. Video: https://www.youtube.com/watch?v=CbmEDXq7es0 . "Systems performance provides a different perspective for analysis and tuning, and can help you find performance wins for your databases, applications, and the kernel. However, most of us are not performance or kernel engineers, and have limited time to study this topic. This talk summarizes six important areas of Linux systems performance in 50 minutes: observability tools, methodologies, benchmarking, profiling, tracing, and tuning. Included are recipes for Linux performance analysis and tuning (using vmstat, mpstat, iostat, etc), overviews of complex areas including profiling (perf_events), static tracing (tracepoints), and dynamic tracing (kprobes, uprobes), and much advice about what is and isn't important to learn. This talk is aimed at everyone: DBAs, developers, operations, etc, and in any environment running Linux, bare-metal or the cloud."
This slide will show you how to use SOFA to do performance analysis of CPU/GPU cooperative programs, especially programs running with deep software stacks like TensorFlow, PyTorch, etc.
source code at:
https://github.com/cyliustack/sofa
Kernel Recipes 2015: Solving the Linux storage scalability bottlenecks - Anne Nicolas
Flash devices introduced a sudden shift in the performance profile of direct-attached storage. With IOPS rates orders of magnitude higher than rotating storage, it became clear that Linux needed a redesign of its storage stack to properly support and get the most out of these new devices.
This talk will detail the architecture of blk-mq, the redesign of the core of the Linux storage stack, and the later set of changes made to adapt the SCSI stack to this new queuing model. Early results of running Facebook infrastructure production workloads on top of the new stack will also be shared.
Jens Axboe, Facebook
This document provides an overview of blktrace, a Linux kernel feature and set of utilities that allow detailed tracing of operations within the block I/O layer. Blktrace captures events for each I/O request as it is processed, including queue operations, merges, remapping by software RAID, and driver handling. The blktrace utilities extract these events and allow live tracing or storage for later analysis. Analysis tools like btt can analyze the stored blktrace data to measure processing times and identify bottlenecks or anomalies in how I/O requests are handled throughout the block I/O stack.
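A typical capture-then-analyze workflow looks roughly like this (a sketch, not from the document itself; the device name is illustrative, and since blktrace needs root, debugfs, and the blktrace package, the commands are guarded):

```shell
dev=/dev/sda   # illustrative; substitute a real block device
if command -v blktrace >/dev/null && [ "$(id -u)" -eq 0 ] && [ -b "$dev" ]; then
    blktrace -d "$dev" -w 5 -o trace &&   # capture 5 seconds of block-layer events
    blkparse -i trace -d trace.bin   &&   # decode to text, keep a merged binary
    btt -i trace.bin                 ||   # summarize time spent in each stage
    echo "capture failed (is debugfs mounted?)"
else
    echo "blktrace unavailable or not root; skipping capture"
fi
```

btt's per-stage breakdown (queueing, merging, driver, completion) is what makes the bottleneck hunting described above practical.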
The document discusses different types of local Linux filesystems and their features. It focuses on the btrfs filesystem, describing its main features like copy-on-write, snapshots, and subvolumes. It recommends using btrfs or xfs for new installations where snapshots are needed, and converting existing filesystems to btrfs. It demonstrates how btrfs and its snapshot tool snapper can be used for system administration tasks and on desktops.
Kdump and the kernel crash dump analysis - Buland Singh
Kdump is a kernel crash dumping mechanism that uses kexec to load a separate crash kernel to capture a kernel memory dump (vmcore file) when the primary kernel crashes. It can be configured to dump the vmcore file to local storage or over the network. Testing involves triggering a kernel panic using SysRq keys, which causes the crash kernel to load and dump diagnostic information to the configured target path for analysis.
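As a hedged sketch (these are the common defaults on RHEL-style systems, not values taken from the slides), the dump target lives in /etc/kdump.conf and the mechanism is exercised via SysRq:

```
# /etc/kdump.conf -- where and how the crash kernel saves the vmcore
path /var/crash
core_collector makedumpfile -l --message-level 1 -d 31

# Memory for the crash kernel is reserved at boot on the kernel
# command line, e.g.:  crashkernel=256M
#
# To test (this deliberately panics the machine!):
#   echo 1 > /proc/sys/kernel/sysrq
#   echo c > /proc/sysrq-trigger
```

The `-d 31` dump level tells makedumpfile to skip zero, cache, and free pages, keeping the vmcore small.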
This document discusses Linux kernel debugging. It provides an overview of debugging techniques including collecting system information, handling failures, and using printk(), KGDB, and debuggers. Key points covered are the components of a debugger, how KGDB can be used with gdb to debug interactively, analyzing crash data, and some debugging tricks and print functions.
From USENIX LISA 2010, San Jose.
Visualizations that include heat maps can be an effective way to present performance data: I/O latency, resource utilization, and more. Patterns can emerge that would be difficult to notice from columns of numbers or line graphs, revealing previously unknown behavior. These visualizations are used in a product as a replacement for traditional metrics such as %CPU, and are allowing end users to identify more issues much more easily (including some that are nearly impossible to identify with tools such as vmstat(1)). This talk covers what has been learned, crazy heat map discoveries, and thoughts for future applications beyond performance analysis.
This document provides an overview of the differences between SystemV and systemd for initializing Linux systems. It begins with some background on systemd and its objectives to improve on outdated SystemV startup processes. The document then covers key aspects of systemd such as its functions, strategy of on-demand starting of services, and implementation details. It also discusses the benefits of systemd and compares some pros and cons between the two approaches.
This document summarizes the Texas Advanced Computing Center's (TACC) experience using DDN's Infinite Memory Engine (IME) as a burst buffer for three HPC applications on the Stampede supercomputer. Initial testing showed I/O bottlenecks that were addressed by improving the InfiniBand topology. Performance testing found the IME provided significant acceleration over the Lustre parallel file system, with speedups ranging from 3.7x to 28x for the HACC cosmology code, 6.8x to 22.3x for the S3D combustion code, and 6.2x to 10.1x for the MADBench mini-app. The IME demonstrated its ability to scale and improve
Talk for YOW! by Brendan Gregg. "Systems performance studies the performance of computing systems, including all physical components and the full software stack to help you find performance wins for your application and kernel. However, most of us are not performance or kernel engineers, and have limited time to study this topic. This talk summarizes the topic for everyone, touring six important areas: observability tools, methodologies, benchmarking, profiling, tracing, and tuning. Included are recipes for Linux performance analysis and tuning (using vmstat, mpstat, iostat, etc), overviews of complex areas including profiling (perf_events) and tracing (ftrace, bcc/BPF, and bpftrace/BPF), advice about what is and isn't important to learn, and case studies to see how it is applied. This talk is aimed at everyone: developers, operations, sysadmins, etc, and in any environment running Linux, bare metal or the cloud."
This document provides an overview of Linux performance monitoring tools including mpstat, top, htop, vmstat, iostat, free, strace, and tcpdump. It discusses what each tool measures and how to use it to observe system performance and diagnose issues. The tools presented provide visibility into CPU usage, memory usage, disk I/O, network traffic, and system call activity which are essential for understanding workload performance on Linux systems.
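For the memory side specifically, a hedged sketch of what these observations look like in practice (free and vmstat come from procps and are near-universal, but are guarded anyway):

```shell
# Human-friendly summaries, skipped quietly if procps is not installed.
command -v free   >/dev/null && free -m    || true   # memory/swap totals in MiB
command -v vmstat >/dev/null && vmstat 1 2 || true   # two one-second samples

# The raw kernel counters those tools are built from:
grep -E 'MemTotal|MemAvailable|Dirty' /proc/meminfo
```

MemAvailable is usually the number to watch for memory pressure, since it accounts for reclaimable page cache rather than just free pages.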
This document summarizes an Italian presentation on monitoring and tuning I/O performance on Linux. It discusses key topics like I/O monitoring tools like iostat, iotop and blktrace, tuning techniques like dirty page writeback and filesystem options, and ensuring reliability of data writes through proper synchronization and error handling. The presentation provides an overview of I/O subsystems in Linux and dives into specific tools and parameters for optimizing I/O.
While probably the most prominent, Docker is not the only tool for building and managing containers. Originally meant to be a "chroot on steroids" to help debug systemd, systemd-nspawn provides a fairly uncomplicated approach to work with containers. Being part of systemd, it is available on most recent distributions out-of-the-box and requires no additional dependencies.
This deck will introduce a few concepts involved in containers and will guide you through the steps of building a container from scratch. The payload will be a simple service, which will be automatically activated by systemd when the first request arrives.
This document discusses Exadata and OLTP performance. It summarizes the results of running the Silly Little Oracle Benchmark (SLOB) on different Exadata and non-Exadata systems to evaluate logical and physical I/O performance at different concurrency levels. The tests show that Exadata's flash cache allows it to maintain much lower and more consistent physical I/O response times as concurrency increases compared to systems without flash. Logical I/O performance depends more on CPU and memory configuration.
Video and slides synchronized, mp3 and slide download available at URL http://bit.ly/1N4GN6z.
Brendan Gregg focuses on broken tools and metrics instead of the working ones. Metrics can be misleading, and counters can be counter-intuitive. Gregg includes advice and methodologies for verifying new performance tools, understanding how they work, and using them successfully. Filmed at qconsf.com.
Brendan Gregg is a senior performance architect at Netflix, where he does large scale computer performance design, analysis, and tuning. He is the author of multiple technical books including Systems Performance published by Prentice Hall, and received the USENIX LISA Award for Outstanding Achievement in System Administration.
This document introduces Docker and provides an overview of its key features and benefits. It explains that Docker allows developers to package applications into lightweight containers that can run on any Linux server. Containers deploy instantly and consistently across environments due to their isolation via namespaces and cgroups. The document also summarizes Docker's architecture including storage drivers, images, and the Dockerfile for building images.
This talk is about a new interface to get information about processes, called task_diag, which we developed.
Currently /proc file system is used to get information about the processes running on the system. All information are presented as text files, which is convenient for humans, but not for programs such as ps and top. This incurs significant delays, especially on a systems with lots of containers running, which is frequently the case nowdays.
Ideally, tools such top and ps would get information in binary format, and use flexible means to specify which kinds of information and for which tasks is required. Presented is a new interface with all these features, called task_diag.
task_diag is based on netlink sockets and looks like socket-diag, which is used to get information about sockets. It uses the request-response model. An request specifies a set of processes and required properties for them. A response contains requested information and can be divided into a few netlink packets if it's too long.
The task diag is much faster than the /proc file system. For example, when reading from /proc, ps opens, reads, and closes many files -- and iterates this for every single processes. With task_diag, it's just sending a request and getting a response.
Except for ps and top, the proposed interface is to be used by CRIU, a containers checkpoint/restore and live migration mechanism. Also, developers of perf tool found that it can be useful to them and implemented a prototype which show a big performance improvements in case of using task_diag instead of procfs.
Our performance measurements show that the ps tool works at least four times faster if task_diag is used instead of procfs.
This document provides an overview of systemd and how it differs from traditional init systems. It discusses systemd units and how to manage services using systemctl. It covers customizing units using drop-ins, managing resources with cgroups, converting init scripts, and using the systemd journal. The presentation aims to demystify systemd and provide administrators with practical guidance on using its main features.
The document discusses diagnosing and mitigating MySQL performance issues. It describes using various operating system monitoring tools like vmstat, iostat, and top to analyze CPU, memory, disk, and network utilization. It also discusses using MySQL-specific tools like the MySQL command line, mysqladmin, mysqlbinlog, and external tools to diagnose issues like high load, I/O wait, or slow queries by examining metrics like queries, connections, storage engine statistics, and InnoDB logs and data written. The agenda covers identifying system and MySQL-specific bottlenecks by verifying OS metrics and running diagnostics on the database, storage engines, configuration, and queries.
Talk for PerconaLive 2016 by Brendan Gregg. Video: https://www.youtube.com/watch?v=CbmEDXq7es0 . "Systems performance provides a different perspective for analysis and tuning, and can help you find performance wins for your databases, applications, and the kernel. However, most of us are not performance or kernel engineers, and have limited time to study this topic. This talk summarizes six important areas of Linux systems performance in 50 minutes: observability tools, methodologies, benchmarking, profiling, tracing, and tuning. Included are recipes for Linux performance analysis and tuning (using vmstat, mpstat, iostat, etc), overviews of complex areas including profiling (perf_events), static tracing (tracepoints), and dynamic tracing (kprobes, uprobes), and much advice about what is and isn't important to learn. This talk is aimed at everyone: DBAs, developers, operations, etc, and in any environment running Linux, bare-metal or the cloud."
This slide deck shows how to use SOFA for performance analysis of CPU/GPU cooperative programs, especially programs running with deep software stacks like TensorFlow, PyTorch, etc.
source code at:
https://github.com/cyliustack/sofa
Kernel Recipes 2015: Solving the Linux storage scalability bottlenecks (Anne Nicolas)
Flash devices introduced a sudden shift in the performance profile of direct attached storage. With IOPS rates orders of magnitude higher than rotating storage, it became clear that Linux needed a re-design of its storage stack to properly support and get the most out of these new devices.
This talk will detail the architecture of blk-mq, the redesign of the core of the Linux storage stack, and the later set of changes made to adapt the SCSI stack to this new queuing model. Early results of running Facebook infrastructure production workloads on top of the new stack will also be shared.
Jens Axboe, Facebook
This document provides an overview of blktrace, a Linux kernel feature and set of utilities that allow detailed tracing of operations within the block I/O layer. Blktrace captures events for each I/O request as it is processed, including queue operations, merges, remapping by software RAID, and driver handling. The blktrace utilities extract these events and allow live tracing or storage for later analysis. Analysis tools like btt can analyze the stored blktrace data to measure processing times and identify bottlenecks or anomalies in how I/O requests are handled throughout the block I/O stack.
The document discusses different types of local Linux filesystems and their features. It focuses on the btrfs filesystem, describing its main features like copy-on-write, snapshots, and subvolumes. It recommends using btrfs or xfs for new installations where snapshots are needed, and converting existing filesystems to btrfs. It demonstrates how btrfs and its snapshot tool snapper can be used for system administration tasks and on desktops.
Kdump and the kernel crash dump analysis (Buland Singh)
Kdump is a kernel crash dumping mechanism that uses kexec to load a separate crash kernel to capture a kernel memory dump (vmcore file) when the primary kernel crashes. It can be configured to dump the vmcore file to local storage or over the network. Testing involves triggering a kernel panic using SysRq keys which causes the crash kernel to load and dump diagnostic information to the configured target path for analysis.
This document discusses Linux kernel debugging. It provides an overview of debugging techniques including collecting system information, handling failures, and using printk(), KGDB, and debuggers. Key points covered are the components of a debugger, how KGDB can be used with gdb to debug interactively, analyzing crash data, and some debugging tricks and print functions.
From USENIX LISA 2010, San Jose.
Visualizations that include heat maps can be an effective way to present performance data: I/O latency, resource utilization, and more. Patterns can emerge that would be difficult to notice from columns of numbers or line graphs, revealing previously unknown behavior. These visualizations are used in a product as a replacement for traditional metrics such as %CPU and are allowing end users to identify more issues much more easily (and some issues are nearly impossible to identify with tools such as vmstat(1)). This talk covers what has been learned, crazy heat map discoveries, and thoughts for future applications beyond performance analysis.
This document provides an overview of the differences between SystemV and systemd for initializing Linux systems. It begins with some background on systemd and its objectives to improve on outdated SystemV startup processes. The document then covers key aspects of systemd such as its functions, strategy of on-demand starting of services, and implementation details. It also discusses the benefits of systemd and compares some pros and cons between the two approaches.
This document summarizes the Texas Advanced Computing Center's (TACC) experience using DDN's Infinite Memory Engine (IME) as a burst buffer for three HPC applications on the Stampede supercomputer. Initial testing showed I/O bottlenecks that were addressed by improving the InfiniBand topology. Performance testing found the IME provided significant acceleration over the Lustre parallel file system, with speedups ranging from 3.7x to 28x for the HACC cosmology code, 6.8x to 22.3x for the S3D combustion code, and 6.2x to 10.1x for the MADBench mini-app. The IME demonstrated its ability to scale and improve
Talk for YOW! by Brendan Gregg. "Systems performance studies the performance of computing systems, including all physical components and the full software stack to help you find performance wins for your application and kernel. However, most of us are not performance or kernel engineers, and have limited time to study this topic. This talk summarizes the topic for everyone, touring six important areas: observability tools, methodologies, benchmarking, profiling, tracing, and tuning. Included are recipes for Linux performance analysis and tuning (using vmstat, mpstat, iostat, etc), overviews of complex areas including profiling (perf_events) and tracing (ftrace, bcc/BPF, and bpftrace/BPF), advice about what is and isn't important to learn, and case studies to see how it is applied. This talk is aimed at everyone: developers, operations, sysadmins, etc, and in any environment running Linux, bare metal or the cloud."
This document provides an overview of Linux performance monitoring tools including mpstat, top, htop, vmstat, iostat, free, strace, and tcpdump. It discusses what each tool measures and how to use it to observe system performance and diagnose issues. The tools presented provide visibility into CPU usage, memory usage, disk I/O, network traffic, and system call activity which are essential for understanding workload performance on Linux systems.
Talk by Brendan Gregg for USENIX LISA 2019: Linux Systems Performance. Abstract: "
Systems performance is an effective discipline for performance analysis and tuning, and can help you find performance wins for your applications and the kernel. However, most of us are not performance or kernel engineers, and have limited time to study this topic. This talk summarizes the topic for everyone, touring six important areas of Linux systems performance: observability tools, methodologies, benchmarking, profiling, tracing, and tuning. Included are recipes for Linux performance analysis and tuning (using vmstat, mpstat, iostat, etc), overviews of complex areas including profiling (perf_events) and tracing (Ftrace, bcc/BPF, and bpftrace/BPF), and much advice about what is and isn't important to learn. This talk is aimed at everyone: developers, operations, sysadmins, etc, and in any environment running Linux, bare metal or the cloud."
OSDC 2017 - Werner Fischer - Linux performance profiling and monitoring (NETWAYS)
Nowadays system administrators have great choices when it comes down to Linux performance profiling and monitoring. The challenge is to pick the appropriate tools and interpret their results correctly.
This talk is a chance to take a tour through various performance profiling and benchmarking tools, focusing on their benefit for every sysadmin.
More than 25 different tools are presented. Ranging from well known tools like strace, iostat, tcpdump or vmstat to new features like Linux tracepoints or perf_events. You will also learn which tools can be monitored by Icinga and which monitoring plugins are already available for that.
At the end the goal is to gather reference points to look at, whenever you are faced with performance problems.
Take the chance to close your knowledge gaps and learn how to get the most out of your system.
Analyzing OS X Systems Performance with the USE Method (Brendan Gregg)
Talk for MacIT 2014. This talk is about systems performance on OS X, and introduces the USE Method to check for common performance bottlenecks and errors. This methodology can be used by beginners and experts alike, and begins by constructing a checklist of the questions we’d like to ask of the system, before reaching for tools to answer them. The focus is resources: CPUs, GPUs, memory capacity, network interfaces, storage devices, controllers, interconnects, as well as some software resources such as mutex locks. These areas are investigated by a wide variety of tools, including vm_stat, iostat, netstat, top, latency, the DTrace scripts in /usr/bin (which were written by Brendan), custom DTrace scripts, Instruments, and more. This is a tour of the tools needed to solve our performance needs, rather than understanding tools just because they exist. This talk will make you aware of many areas of OS X that you can investigate, which will be especially useful for the time when you need to get to the bottom of a performance issue.
MeetBSDCA 2014 Performance Analysis for BSD, by Brendan Gregg. A tour of five relevant topics: observability tools, methodologies, benchmarking, profiling, and tracing. Tools summarized include pmcstat and DTrace.
This document provides an overview of performance analysis tools for Linux systems. It describes Brendan Gregg's background and work analyzing performance at Netflix. It then discusses different types of tools, including observability tools to monitor systems, benchmarking tools to test performance, and tuning tools to optimize systems. A number of command line monitoring tools are outlined, such as vmstat, iostat, mpstat, and netstat, as well as more advanced tools like strace and tcpdump.
Broken benchmarks, misleading metrics, and terrible tools. This talk will help you navigate the treacherous waters of Linux performance tools, touring common problems with system tools, metrics, statistics, visualizations, measurement overhead, and benchmarks. You might discover that tools you have been using for years are, in fact, misleading, dangerous, or broken.
The speaker, Brendan Gregg, has given many talks on tools that work, including giving the Linux PerformanceTools talk originally at SCALE. This is an anti-version of that talk, to focus on broken tools and metrics instead of the working ones. Metrics can be misleading, and counters can be counter-intuitive! This talk will include advice for verifying new performance tools, understanding how they work, and using them successfully.
Analyze Virtual Machine Overhead Compared to Bare Metal with Tracing (ScyllaDB)
The document compares the performance of running a MySQL database benchmark (Sysbench) on virtual machines versus bare metal machines. On Fedora, the benchmark achieved 6-7% higher transactions per second, queries per second, and lower latency when run on the bare metal host compared to the virtual machine guest. Similarly, on Debian, the benchmark achieved significantly higher transactions per second (over 500 vs under 80) and lower latency when run on bare metal. Tracing tools like trace-cmd can be used to analyze the additional overhead introduced by the virtualization layer.
OSDC 2015: Georg Schönberger | Linux Performance Profiling and Monitoring (NETWAYS)
Nowadays system administrators have great choices when it comes down to performance profiling and monitoring. The challenge is to pick the appropriate tools and interpret their results correctly.
This talk is a chance to take a tour through various performance profiling and benchmarking tools, focusing on their benefit for every sysadmin. The topics will range from simple application profiling over sysstat utilities to low-level tracing methods. Besides traditional Linux methods a short glance at MySQL and Linux containers will be taken, too, as they are widely spread technologies.
At the end the goal is to gather reference points to look at, if you are faced with performance problems. Take the chance to close your knowledge gaps and learn how to get the most out of your system.
Talk for QConSF 2015: "Broken benchmarks, misleading metrics, and terrible tools. This talk will help you navigate the treacherous waters of system performance tools, touring common problems with system metrics, monitoring, statistics, visualizations, measurement overhead, and benchmarks. This will likely involve some unlearning, as you discover that tools you have been using for years are, in fact, misleading, dangerous, or broken.
The speaker, Brendan Gregg, has given many popular talks on operating system performance tools. This is an anti-version of these talks, to focus on broken tools and metrics instead of the working ones. Metrics can be misleading, and counters can be counter-intuitive! This talk will include advice and methodologies for verifying new performance tools, understanding how they work, and using them successfully."
LinuxCon Europe, 2014. Video: https://www.youtube.com/watch?v=SN7Z0eCn0VY . There are many performance tools nowadays for Linux, but how do they all fit together, and when do we use them? This talk summarizes the three types of performance tools: observability, benchmarking, and tuning, providing a tour of what exists and why they exist. Advanced tools including those based on tracepoints, kprobes, and uprobes are also included: perf_events, ktap, SystemTap, LTTng, and sysdig. You'll gain a good understanding of the performance tools landscape, knowing what to reach for to get the most out of your systems.
Kernel Recipes 2015: Linux Kernel IO subsystem - How it works and how can I s... (Anne Nicolas)
Understanding how Linux kernel IO subsystem works is a key to analysis of a wide variety of issues occurring when running a Linux system. This talk is aimed at helping Linux users understand what is going on and how to get more insight into what is happening.
First we present an overview of Linux kernel block layer including different IO schedulers. We also talk about a new block multiqueue implementation that gets used for more and more devices.
After surveying the basic architecture we will be prepared to talk about tools to peek into it. We start with lightweight monitoring like iostat and continue with more heavy blktrace and variety of tools that are based on it. We demonstrate use of the tools on analysis of real world issues.
Jan Kara, SUSE
- The document discusses various Linux system log files such as /var/log/messages, /var/log/secure, and /var/log/cron and provides examples of log entries.
- It also covers log rotation tools like logrotate and logwatch that are used to manage log files.
- Networking topics like IP addressing, subnet masking, routing, ARP, and tcpdump for packet sniffing are explained along with examples.
OSMC 2015: Linux Performance Profiling and Monitoring by Werner Fischer (NETWAYS)
Nowadays system administrators have great choices when it comes down to Linux performance profiling and monitoring. The challenge is to pick the appropriate tools and interpret their results correctly.
This talk is a chance to take a tour through various performance profiling and benchmarking tools, focusing on their benefit for every sysadmin.
More than 25 different tools are presented. Ranging from well known tools like strace, iostat, tcpdump or vmstat to new features like Linux tracepoints or perf_events. You will also learn which tools can be monitored by Icinga and which monitoring plugins are already available for that.
At the end the goal is to gather reference points to look at, whenever you are faced with performance problems.
Take the chance to close your knowledge gaps and learn how to get the most out of your system.
OSMC 2015 | Linux Performance Profiling and Monitoring by Werner Fischer (NETWAYS)
The document discusses various Linux tools for profiling and monitoring system performance and resources. It provides examples of using mpstat to monitor CPU usage, vmstat to view memory and I/O statistics, and pidstat to analyze resource usage of specific processes. It also covers using iostat to monitor I/O subsystem performance and device utilization. The document aims to help understand how to use these tools to collect statistics and identify potential performance bottlenecks.
This document discusses security concepts like minimizing risk, managing vulnerabilities, and monitoring logs. It then focuses on OpenSSL and cryptography, explaining how to set up an OpenSSL certificate authority (CA) to generate and sign certificates. It covers generating keys, creating certificate signing requests, signing certificates, managing certificate revocation lists, and configuring applications to use the CRL for validation.
This document provides an overview of Linux basics including the directory structure, file permissions, common commands, shells, and Bash shell scripting. It covers key topics such as navigating the file system, setting file permissions, using commands like ls, cd, and rm, defining environment and shell variables, and controlling program flow using conditional statements and loops in Bash scripts.
Process management (continued)
● Context switching
● Interrupt handling
● Process states (running, stopped, interruptible, uninterruptible, zombie)
[State diagram] fork() creates a task in TASK_RUNNING (ready); the scheduler dispatches it onto a processor (TASK_RUNNING, running); while waiting it enters TASK_INTERRUPTIBLE, TASK_UNINTERRUPTIBLE, or TASK_STOPPED; after exit() it becomes TASK_ZOMBIE.
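The same states are visible from user space in the STAT column of ps. As a small sketch (a hypothetical helper, not from the slides; the slides' TASK_ZOMBIE name is kept even though recent kernels call it EXIT_ZOMBIE):

```python
# Map the leading character of ps's STAT column to the kernel task state
PS_STATE = {
    "R": "TASK_RUNNING",
    "S": "TASK_INTERRUPTIBLE",    # interruptible sleep (waiting for an event)
    "D": "TASK_UNINTERRUPTIBLE",  # uninterruptible sleep (usually I/O)
    "T": "TASK_STOPPED",
    "Z": "TASK_ZOMBIE",
}

def task_state(stat: str) -> str:
    """Translate a ps STAT value such as 'Ss' or 'R+' to a kernel state name."""
    return PS_STATE.get(stat[0], "unknown")

print(task_state("Ss"))   # TASK_INTERRUPTIBLE
print(task_state("R+"))   # TASK_RUNNING
```

In practice this could be fed from `ps -eo stat,comm` output to count how many processes sit in each state.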
Processor metrics
● CPU utilization (sustained use above 80% suggests a bottleneck)
● User time, system time, waiting (I/O wait), idle time, nice time
● Load average
● Runnable processes (more than 10 times the number of processors suggests a bottleneck)
● Blocked processes (uninterruptible, e.g. waiting for I/O)
● Context switches and interrupts (high rates can indicate a software bottleneck)
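As a minimal sketch of reading these metrics programmatically (an illustrative helper, not from the slides), the following parses a /proc/loadavg-style line and applies the runnable-process rule of thumb above:

```python
def check_load(loadavg_line: str, ncpus: int) -> dict:
    """Parse a /proc/loadavg-style line: three load averages, then
    'runnable/total' scheduling entities, then the last PID."""
    fields = loadavg_line.split()
    load1, load5, load15 = (float(f) for f in fields[:3])
    running, total = (int(n) for n in fields[3].split("/"))
    return {
        "load1": load1, "load5": load5, "load15": load15,
        "running": running, "total": total,
        # slides' rule of thumb: >10 runnable processes per processor
        "runnable_bottleneck": running > 10 * ncpus,
    }

# Synthetic sample line in /proc/loadavg format
result = check_load("0.93 0.65 0.58 2/221 12345", ncpus=4)
print(result["runnable_bottleneck"])   # False
```

On a real system the line would come from `open("/proc/loadavg").read()`.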
vmstat
$ vmstat
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
 1  0 262008 544552  21036 392400    1    4   173   142  136  332 15  5 77  2
Process (procs)
  r: the number of processes waiting for run time
  b: the number of processes in uninterruptible sleep
Memory
  swpd: the amount of virtual memory used (KB)
  free: the amount of idle memory (KB)
  buff: the amount of memory used as buffers (KB)
  cache: the amount of memory used as cache (KB)
Swap
  si: amount of memory swapped in from disk (KB/s)
  so: amount of memory swapped out to disk (KB/s)
IO
  bi: blocks received from a block device (blocks/s)
  bo: blocks sent to a block device (blocks/s)
System
  in: the number of interrupts per second, including the clock
  cs: the number of context switches per second
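As an illustration of these fields (a hypothetical helper, not from the slides; the column layout matches the vmstat output shown here, but varies between vmstat versions, e.g. newer builds add an `st` column):

```python
# Field names in the order they appear on a vmstat data line (this layout)
VMSTAT_FIELDS = ["r", "b", "swpd", "free", "buff", "cache",
                 "si", "so", "bi", "bo", "in", "cs",
                 "us", "sy", "id", "wa"]

def parse_vmstat_line(line: str) -> dict:
    """Map one vmstat data line to named integer values."""
    return dict(zip(VMSTAT_FIELDS, (int(v) for v in line.split())))

sample = "1 0 262008 544552 21036 392400 1 4 173 142 136 332 15 5 77 2"
stats = parse_vmstat_line(sample)
print(stats["cs"])   # 332 context switches per second
```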
sar
$ sar -w
Linux 3.2.0-4-amd64 (laptop)  01/29/2015  _x86_64_  (4 CPU)

01:51:22 AM  LINUX RESTART

01:55:01 AM    proc/s   cswch/s
02:05:01 AM      1.14   7087.09
02:15:01 AM      1.19   6955.23
02:25:01 AM      1.12   6855.21
02:35:01 AM      1.21   6311.18
02:45:01 AM      1.20   6006.53
Linux memory architecture
● Physical and virtual memory
● Virtual memory manager
● Page frames
● Page frame reclaiming (swapping, ...)
● Buddy system
Memory metrics
● Free memory
● Swap usage
● Buffers and caches
● Slabs (kernel memory usage)
● Active versus inactive memory
  ● Inactive memory is a likely candidate to be swapped out to disk
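These metrics can be read from /proc/meminfo. A small sketch (synthetic sample values, not measurements from the slides):

```python
def parse_meminfo(text: str) -> dict:
    """Parse /proc/meminfo-style 'Key: value kB' lines into a dict of kB values."""
    info = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        info[key.strip()] = int(rest.split()[0])
    return info

sample = """MemTotal: 2051244 kB
MemFree: 450732 kB
Buffers: 29236 kB
Cached: 445900 kB
Inactive: 613696 kB"""

mi = parse_meminfo(sample)
# free + buffers + cache: a common approximation of memory available
# to applications, since buffers and cache can be reclaimed
avail = mi["MemFree"] + mi["Buffers"] + mi["Cached"]
print(avail)   # 925868
```

Modern kernels also export a MemAvailable field that accounts for reclaimable memory more accurately than this approximation.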
free, top, vmstat
$ top -b -n 1 | head -5
top - 16:54:00 up 5:39, 3 users, load average: 0.93, 0.65, 0.58
Tasks: 221 total, 1 running, 220 sleeping, 0 stopped, 0 zombie
%Cpu(s): 14.9 us, 4.6 sy, 0.0 ni, 78.2 id, 1.8 wa, 0.0 hi, 0.5 si, 0.0 st
KiB Mem: 2051244 total, 1600512 used, 450732 free, 29236 buffers
KiB Swap: 4954776 total, 254804 used, 4699972 free, 445900 cached
$ vmstat -a
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free  inact active   si   so    bi    bo   in   cs us sy id wa
 0  1 254804 453648 613696 844384    1    3   155   129  309  480 15  5 78  2
sar
$ sar -r
Linux 3.2.0-4-amd64 (laptop)  01/29/2015  _x86_64_  (4 CPU)

01:51:22 AM  LINUX RESTART

01:55:01 AM  kbmemfree kbmemused  %memused kbbuffers  kbcached  kbcommit   %commit  kbactive   kbinact
02:05:01 AM     289476   1761768     85.89     75248    547476   3017772     43.07   1141260    480772
Block device metrics
● I/O wait
● Average queue length
● Average wait
● Transfers per second
● Blocks read/written per second
● Kilobytes read/written per second
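These metrics are related: by Little's law, the average queue length equals the transfer rate multiplied by the average wait. A quick check with made-up numbers (not measurements from the slides):

```python
def avg_queue_length(transfers_per_sec: float, avg_wait_ms: float) -> float:
    """Little's law: L = lambda * W, with the wait converted from ms to s."""
    return transfers_per_sec * (avg_wait_ms / 1000.0)

# e.g. 200 transfers/s with an average wait of 5 ms keeps ~1 request queued
print(avg_queue_length(200, 5.0))
```

This is the same relation iostat uses between its r/s + w/s, await, and aqu-sz columns, which makes it a handy sanity check on reported numbers.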
Network interface metrics
● Packets received and sent, bytes received and sent
● Collisions per second
● Packets dropped
● Overruns (ran out of buffer space)
● Errors
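These counters are exposed per interface in /proc/net/dev. A small sketch (hypothetical parsing helper with a synthetic sample line; the column order follows the standard /proc/net/dev layout of 8 receive fields followed by 8 transmit fields):

```python
def parse_net_dev_line(line: str) -> dict:
    """Parse one /proc/net/dev data line into its most useful counters."""
    iface, _, rest = line.partition(":")
    f = [int(v) for v in rest.split()]
    return {
        "iface": iface.strip(),
        "rx_bytes": f[0], "rx_packets": f[1], "rx_errs": f[2], "rx_drop": f[3],
        "tx_bytes": f[8], "tx_packets": f[9], "tx_errs": f[10], "tx_drop": f[11],
    }

sample = "eth0: 1500000 12000 0 3 0 0 0 0 900000 8000 0 0 0 0 0 0"
print(parse_net_dev_line(sample)["rx_drop"])   # 3
```

Since these are cumulative counters, per-second rates (as shown by sar -n DEV) come from sampling twice and dividing the difference by the interval.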