Presentació a càrrec d'Ismael Fernández i Cristian
Gomollón (tècnics d'Aplicacions al CSUC) duta a terme a la jornada de formació "Com usar el servei de càlcul del CSUC" celebrada el 8 d'octubre de 2019 al CSUC.
Presentació a càrrec d'Ismael Fernández, tècnic d'Aplicacions al CSUC, duta a terme a la "5a Jornada de formació sobre l'ús del servei de càlcul" celebrada el 16 de desembre de 2021 en format virtual.
Presentació a càrrec d'Ismael Fernández i Cristian
Gomollón (tècnics d'Aplicacions al CSUC) duta a terme a la "3a Jornada de formació sobre l'ús del servei de càlcul" celebrada el 29 d'octubre de 2020 en format virtual.
Presentació a càrrec d'Ismael Fernández, tècnic d'Aplicacions al CSUC, duta a terme a la "4a Jornada de formació sobre l'ús del servei de càlcul" celebrada el 17 de març de 2021 en format virtual.
Video: https://www.youtube.com/watch?v=JRFNIKUROPE . Talk for linux.conf.au 2017 (LCA2017) by Brendan Gregg, about Linux enhanced BPF (eBPF). Abstract:
A world of new capabilities is emerging for the Linux 4.x series, thanks to enhancements that have been included in Linux for to Berkeley Packet Filter (BPF): an in-kernel virtual machine that can execute user space-defined programs. It is finding uses for security auditing and enforcement, enhancing networking (including eXpress Data Path), and performance observability and troubleshooting. Many new open source tools that have been written in the past 12 months for performance analysis that use BPF. Tracing superpowers have finally arrived for Linux!
For its use with tracing, BPF provides the programmable capabilities to the existing tracing frameworks: kprobes, uprobes, and tracepoints. In particular, BPF allows timestamps to be recorded and compared from custom events, allowing latency to be studied in many new places: kernel and application internals. It also allows data to be efficiently summarized in-kernel, including as histograms. This has allowed dozens of new observability tools to be developed so far, including measuring latency distributions for file system I/O and run queue latency, printing details of storage device I/O and TCP retransmits, investigating blocked stack traces and memory leaks, and a whole lot more.
This talk will summarize BPF capabilities and use cases so far, and then focus on its use to enhance Linux tracing, especially with the open source bcc collection. bcc includes BPF versions of old classics, and many new tools, including execsnoop, opensnoop, funcccount, ext4slower, and more (many of which I developed). Perhaps you'd like to develop new tools, or use the existing tools to find performance wins large and small, especially when instrumenting areas that previously had zero visibility. I'll also summarize how we intend to use these new capabilities to enhance systems analysis at Netflix.
Webinar: 99 Ways to Enrich Streaming Data with Apache Flink - Konstantin KnaufVerverica
The need to enrich a fast, high volume data stream with slow-changing reference data is probably one of the most wide-spread requirements in stream processing applications. Apache Flink's built-in join functionalities and its flexible lower-level APIs support stream enrichment in various ways depending on the specific requirements of the use case at hand. In this webinar, I like to provide an overview of the basic methods to enrich a data stream with Apache Flink and highlight use cases, limitations, advantages and disadvantages of each.
Presentació a càrrec d'Ismael Fernández, tècnic d'Aplicacions al CSUC, duta a terme a la "5a Jornada de formació sobre l'ús del servei de càlcul" celebrada el 16 de desembre de 2021 en format virtual.
Presentació a càrrec d'Ismael Fernández i Cristian
Gomollón (tècnics d'Aplicacions al CSUC) duta a terme a la "3a Jornada de formació sobre l'ús del servei de càlcul" celebrada el 29 d'octubre de 2020 en format virtual.
Presentació a càrrec d'Ismael Fernández, tècnic d'Aplicacions al CSUC, duta a terme a la "4a Jornada de formació sobre l'ús del servei de càlcul" celebrada el 17 de març de 2021 en format virtual.
Video: https://www.youtube.com/watch?v=JRFNIKUROPE . Talk for linux.conf.au 2017 (LCA2017) by Brendan Gregg, about Linux enhanced BPF (eBPF). Abstract:
A world of new capabilities is emerging for the Linux 4.x series, thanks to enhancements that have been included in Linux for to Berkeley Packet Filter (BPF): an in-kernel virtual machine that can execute user space-defined programs. It is finding uses for security auditing and enforcement, enhancing networking (including eXpress Data Path), and performance observability and troubleshooting. Many new open source tools that have been written in the past 12 months for performance analysis that use BPF. Tracing superpowers have finally arrived for Linux!
For its use with tracing, BPF provides the programmable capabilities to the existing tracing frameworks: kprobes, uprobes, and tracepoints. In particular, BPF allows timestamps to be recorded and compared from custom events, allowing latency to be studied in many new places: kernel and application internals. It also allows data to be efficiently summarized in-kernel, including as histograms. This has allowed dozens of new observability tools to be developed so far, including measuring latency distributions for file system I/O and run queue latency, printing details of storage device I/O and TCP retransmits, investigating blocked stack traces and memory leaks, and a whole lot more.
This talk will summarize BPF capabilities and use cases so far, and then focus on its use to enhance Linux tracing, especially with the open source bcc collection. bcc includes BPF versions of old classics, and many new tools, including execsnoop, opensnoop, funcccount, ext4slower, and more (many of which I developed). Perhaps you'd like to develop new tools, or use the existing tools to find performance wins large and small, especially when instrumenting areas that previously had zero visibility. I'll also summarize how we intend to use these new capabilities to enhance systems analysis at Netflix.
Webinar: 99 Ways to Enrich Streaming Data with Apache Flink - Konstantin KnaufVerverica
The need to enrich a fast, high volume data stream with slow-changing reference data is probably one of the most wide-spread requirements in stream processing applications. Apache Flink's built-in join functionalities and its flexible lower-level APIs support stream enrichment in various ways depending on the specific requirements of the use case at hand. In this webinar, I like to provide an overview of the basic methods to enrich a data stream with Apache Flink and highlight use cases, limitations, advantages and disadvantages of each.
In this deck from the 2019 Stanford HPC Conference, Todd Gamblin from Lawrence Livermore National Laboratory presents: Spack - A Package Manager for HPC.
"Spack is a package manager for cluster users, developers and administrators. Rapidly gaining popularity in the HPC community, like other HPC package managers, Spack was designed to build packages from source. Spack supports relocatable binaries for specific OS releases, target architectures, MPI implementations, and other very fine-grained build options.
This talk will introduce some of the open infrastructure for distributing packages, challenges to providing binaries for a large package ecosystem and what we're doing to address problems. We'll also talk about challenges for implementing relocatable binaries with a multi-compiler system like Spack. Finally, we'll talk about how Spack integrates with the US Exascale project's open source software release plan and how this will help glue together the HPC OSS ecosystem.
Todd is a computer scientist in the Center for Applied Scientific Computing at Lawrence Livermore National Laboratory. His research focuses on scalable tools for measuring, analyzing, and visualizing performance data from massively parallel applications. Todd is also involved with many production projects at LLNL. He works with Livermore Computing’s Development Environment Group to build tools that allow users to deploy, run, debug, and optimize their software for machines with million-way concurrency.
Todd received his Ph.D. in computer science from the University of North Carolina at Chapel Hill in 2009. His dissertation investigated parallel methods for compressing and sampling performance measurements from hundreds of thousands of concurrent processors. He received his B.A. in Computer Science and Japanese from Williams College in 2002. He has also worked as a software developer in Tokyo and held research internships at the University of Tokyo and IBM Research.
Watch the video: https://youtu.be/DhUVbroMLJY
Learn more: https://computation.llnl.gov/projects/spack-hpc-package-manager
and
http://hpcadvisorycouncil.com/events/2019/stanford-workshop/
Sign up for our insideHPC Newsletter: http://insidehpc.com/newsletter
P99 Pursuit: 8 Years of Battling P99 LatencyScyllaDB
Performance engineering is a Sisyphean hill climb for perfection. Those who climb the hill are hardly ever satisfied with the results. You should always ask yourself where the bottleneck is today and what’s holding you back. Great performance improves your software. It enables you to run fewer layers, manage 10x less machines, simplifies your stack, and more.
In this keynote session, ScyllaDB CEO Dor Laor will cover the principles for successful creation of projects like ScyllaDB, KVM, the Linux kernel and explain why they spurred his vision for the P99 CONF.
Presentation at Strata Data Conference 2018, New York
The controller is the brain of Apache Kafka. A big part of what the controller does is to maintain the consistency of the replicas and determine which replica can be used to serve the clients, especially during individual broker failure.
Jun Rao outlines the main data flow in the controller—in particular, when a broker fails, how the controller automatically promotes another replica as the leader to serve the clients, and when a broker is started, how the controller resumes the replication pipeline in the restarted broker.
Jun then describes recent improvements to the controller that allow it to handle certain edge cases correctly and increase its performance, which allows for more partitions in a Kafka cluster.
Where is my bottleneck? Performance troubleshooting in FlinkFlink Forward
Flinkn Forward San Francisco 2022.
In this talk, we will cover various topics around performance issues that can arise when running a Flink job and how to troubleshoot them. We’ll start with the basics, like understanding what the job is doing and what backpressure is. Next, we will see how to identify bottlenecks and which tools or metrics can be helpful in the process. Finally, we will also discuss potential performance issues during the checkpointing or recovery process, as well as and some tips and Flink features that can speed up checkpointing and recovery times.
by
Piotr Nowojski
This document discusses PostgreSQL statistics and how to use them effectively. It provides an overview of various PostgreSQL statistics sources like views, functions and third-party tools. It then demonstrates how to analyze specific statistics like those for databases, tables, indexes, replication and query activity to identify anomalies, optimize performance and troubleshoot issues.
The document discusses various techniques for profiling CPU and memory performance in Rust programs, including:
- Using the flamegraph tool to profile CPU usage by sampling a running process and generating flame graphs.
- Integrating pprof profiling into Rust programs to expose profiles over HTTP similar to how it works in Go.
- Profiling heap usage by integrating jemalloc profiling and generating heap profiles on program exit.
- Some challenges with profiling asynchronous Rust programs due to the lack of backtraces.
The key takeaways are that there are crates like pprof-rs and techniques like jemalloc integration that allow collecting CPU and memory profiles from Rust programs, but profiling asynchronous programs
Broken benchmarks, misleading metrics, and terrible tools. This talk will help you navigate the treacherous waters of Linux performance tools, touring common problems with system tools, metrics, statistics, visualizations, measurement overhead, and benchmarks. You might discover that tools you have been using for years, are in fact, misleading, dangerous, or broken.
The speaker, Brendan Gregg, has given many talks on tools that work, including giving the Linux PerformanceTools talk originally at SCALE. This is an anti-version of that talk, to focus on broken tools and metrics instead of the working ones. Metrics can be misleading, and counters can be counter-intuitive! This talk will include advice for verifying new performance tools, understanding how they work, and using them successfully.
Message Passing Interface (MPI)-A means of machine communicationHimanshi Kathuria
MPI (Message Passing Interface) is a standard for writing message passing programs between parallel processes. It was developed in the late 1980s and early 1990s due to increasing computational needs. An MPI program typically initializes and finalizes the MPI environment, declares variables, includes MPI header files, contains parallel code using MPI calls, and terminates the environment before ending. Key MPI calls initialize and finalize the environment, determine the process rank and number of processes, and get the processor name.
Linux Performance Analysis: New Tools and Old SecretsBrendan Gregg
Talk for USENIX/LISA2014 by Brendan Gregg, Netflix. At Netflix performance is crucial, and we use many high to low level tools to analyze our stack in different ways. In this talk, I will introduce new system observability tools we are using at Netflix, which I've ported from my DTraceToolkit, and are intended for our Linux 3.2 cloud instances. These show that Linux can do more than you may think, by using creative hacks and workarounds with existing kernel features (ftrace, perf_events). While these are solving issues on current versions of Linux, I'll also briefly summarize the future in this space: eBPF, ktap, SystemTap, sysdig, etc.
The document summarizes a talk on container performance analysis. It discusses identifying bottlenecks at the host, container, and kernel level using various Linux performance tools. It then provides an overview of how containers work in Linux using namespaces and control groups (cgroups). Finally, it demonstrates some example commands like docker stats, systemd-cgtop, and bcc/BPF tools that can be used to analyze containers and cgroups from the host system.
"In this follow-up to his 2014 presentation at the Stanford HPCAC Conference, Michael will provide an update on the latest happenings with the LBNL NHC project, new features in the latest release, and a brief overview of the roadmap for future development."
Learn more: https://github.com/mej/nhc
Sign up for our insideHPC Newsletter: http://insideHPC.com/newsletter
Linux offers an extensive selection of programmable and configurable networking components from traditional bridges, encryption, to container optimized layer 2/3 devices, link aggregation, tunneling, several classification and filtering languages all the way up to full SDN components. This talk will provide an overview of many Linux networking components covering the Linux bridge, IPVLAN, MACVLAN, MACVTAP, Bonding/Team, OVS, classification & queueing, tunnel types, hidden routing tricks, IPSec, VTI, VRF and many others.
Improved alerting with Prometheus and AlertmanagerJulien Pivotto
This document summarizes a presentation on improving alerting with Prometheus and Alertmanager. It discusses using PromQL and YAML to configure alerts and receivers. Key points covered include using thresholds, hysteresis, absence of metrics, computed thresholds, time windows like business hours, holidays and time zones, and inhibition rules to suppress notifications.
Aggregator Leaf Tailer: Bringing Data to Your Users with Ultra Low LatencyScyllaDB
theScore is one of the leading sports news and media platforms in North America. With the recent growth of sports betting, theScore is diving in head-first. Province and state regulations often enforce strict data-locality requirements often prohibiting the use of centralized hosted data storage solutions.
In this talk, we'll learn how theScore built Datadex, a geographically distributed system for low-latency queries and real-time updates. We'll touch on the architecture of the system (Aggregator Leaf Tailer), the underlying technologies (RocksDB, Kafka, and the JVM), as well as some nitty-gritty tips for optimizing the JVM and RocksDB for high throughput.
The document discusses improvements to the implementation of futexes (fast userspace mutexes) in the Linux kernel to improve scaling on multicore systems. Some key issues with the original futex implementation are a global hash table that does not scale well with NUMA, hash collisions, and contention on hash bucket locks. Improvements discussed include using per-process or per-thread hash tables to address NUMA issues, improving hashing to reduce collisions, releasing hash bucket locks before waking tasks to allow concurrent wakeups, and replacing spinlocks with queued/MCS locks to reduce cacheline bouncing under contention. These changes aim to improve futex performance and scalability as the number of cores in systems increases.
Evening out the uneven: dealing with skew in FlinkFlink Forward
Flink Forward San Francisco 2022.
When running Flink jobs, skew is a common problem that results in wasted resources and limited scalability. In the past years, we have helped our customers and users solve various skew-related issues in their Flink jobs or clusters. In this talk, we will present the different types of skew that users often run into: data skew, key skew, event time skew, state skew, and scheduling skew, and discuss solutions for each of them. We hope this will serve as a guideline to help you reduce skew in your Flink environment.
by
Jun Qin & Karl Friedrich
This document provides an overview of five steps to improve PostgreSQL performance: 1) hardware optimization, 2) operating system and filesystem tuning, 3) configuration of postgresql.conf parameters, 4) application design considerations, and 5) query tuning. The document discusses various techniques for each step such as selecting appropriate hardware components, spreading database files across multiple disks or arrays, adjusting memory and disk configuration parameters, designing schemas and queries efficiently, and leveraging caching strategies.
Presentació a càrrec d'Ismael Fernández i Cristian
Gomollón (tècnics d'Aplicacions al CSUC) duta a terme a la "2a Jornada de formació sobre l'ús del servei de càlcul" celebrada el 19 de febrer de 2020 al CSUC.
In this deck from the 2019 Stanford HPC Conference, Todd Gamblin from Lawrence Livermore National Laboratory presents: Spack - A Package Manager for HPC.
"Spack is a package manager for cluster users, developers and administrators. Rapidly gaining popularity in the HPC community, like other HPC package managers, Spack was designed to build packages from source. Spack supports relocatable binaries for specific OS releases, target architectures, MPI implementations, and other very fine-grained build options.
This talk will introduce some of the open infrastructure for distributing packages, challenges to providing binaries for a large package ecosystem and what we're doing to address problems. We'll also talk about challenges for implementing relocatable binaries with a multi-compiler system like Spack. Finally, we'll talk about how Spack integrates with the US Exascale project's open source software release plan and how this will help glue together the HPC OSS ecosystem.
Todd is a computer scientist in the Center for Applied Scientific Computing at Lawrence Livermore National Laboratory. His research focuses on scalable tools for measuring, analyzing, and visualizing performance data from massively parallel applications. Todd is also involved with many production projects at LLNL. He works with Livermore Computing’s Development Environment Group to build tools that allow users to deploy, run, debug, and optimize their software for machines with million-way concurrency.
Todd received his Ph.D. in computer science from the University of North Carolina at Chapel Hill in 2009. His dissertation investigated parallel methods for compressing and sampling performance measurements from hundreds of thousands of concurrent processors. He received his B.A. in Computer Science and Japanese from Williams College in 2002. He has also worked as a software developer in Tokyo and held research internships at the University of Tokyo and IBM Research.
Watch the video: https://youtu.be/DhUVbroMLJY
Learn more: https://computation.llnl.gov/projects/spack-hpc-package-manager
and
http://hpcadvisorycouncil.com/events/2019/stanford-workshop/
Sign up for our insideHPC Newsletter: http://insidehpc.com/newsletter
P99 Pursuit: 8 Years of Battling P99 LatencyScyllaDB
Performance engineering is a Sisyphean hill climb for perfection. Those who climb the hill are hardly ever satisfied with the results. You should always ask yourself where the bottleneck is today and what’s holding you back. Great performance improves your software. It enables you to run fewer layers, manage 10x less machines, simplifies your stack, and more.
In this keynote session, ScyllaDB CEO Dor Laor will cover the principles for successful creation of projects like ScyllaDB, KVM, the Linux kernel and explain why they spurred his vision for the P99 CONF.
Presentation at Strata Data Conference 2018, New York
The controller is the brain of Apache Kafka. A big part of what the controller does is to maintain the consistency of the replicas and determine which replica can be used to serve the clients, especially during individual broker failure.
Jun Rao outlines the main data flow in the controller—in particular, when a broker fails, how the controller automatically promotes another replica as the leader to serve the clients, and when a broker is started, how the controller resumes the replication pipeline in the restarted broker.
Jun then describes recent improvements to the controller that allow it to handle certain edge cases correctly and increase its performance, which allows for more partitions in a Kafka cluster.
Where is my bottleneck? Performance troubleshooting in FlinkFlink Forward
Flinkn Forward San Francisco 2022.
In this talk, we will cover various topics around performance issues that can arise when running a Flink job and how to troubleshoot them. We’ll start with the basics, like understanding what the job is doing and what backpressure is. Next, we will see how to identify bottlenecks and which tools or metrics can be helpful in the process. Finally, we will also discuss potential performance issues during the checkpointing or recovery process, as well as and some tips and Flink features that can speed up checkpointing and recovery times.
by
Piotr Nowojski
This document discusses PostgreSQL statistics and how to use them effectively. It provides an overview of various PostgreSQL statistics sources like views, functions and third-party tools. It then demonstrates how to analyze specific statistics like those for databases, tables, indexes, replication and query activity to identify anomalies, optimize performance and troubleshoot issues.
The document discusses various techniques for profiling CPU and memory performance in Rust programs, including:
- Using the flamegraph tool to profile CPU usage by sampling a running process and generating flame graphs.
- Integrating pprof profiling into Rust programs to expose profiles over HTTP similar to how it works in Go.
- Profiling heap usage by integrating jemalloc profiling and generating heap profiles on program exit.
- Some challenges with profiling asynchronous Rust programs due to the lack of backtraces.
The key takeaways are that there are crates like pprof-rs and techniques like jemalloc integration that allow collecting CPU and memory profiles from Rust programs, but profiling asynchronous programs
Broken benchmarks, misleading metrics, and terrible tools. This talk will help you navigate the treacherous waters of Linux performance tools, touring common problems with system tools, metrics, statistics, visualizations, measurement overhead, and benchmarks. You might discover that tools you have been using for years, are in fact, misleading, dangerous, or broken.
The speaker, Brendan Gregg, has given many talks on tools that work, including giving the Linux PerformanceTools talk originally at SCALE. This is an anti-version of that talk, to focus on broken tools and metrics instead of the working ones. Metrics can be misleading, and counters can be counter-intuitive! This talk will include advice for verifying new performance tools, understanding how they work, and using them successfully.
Message Passing Interface (MPI)-A means of machine communicationHimanshi Kathuria
MPI (Message Passing Interface) is a standard for writing message passing programs between parallel processes. It was developed in the late 1980s and early 1990s due to increasing computational needs. An MPI program typically initializes and finalizes the MPI environment, declares variables, includes MPI header files, contains parallel code using MPI calls, and terminates the environment before ending. Key MPI calls initialize and finalize the environment, determine the process rank and number of processes, and get the processor name.
Linux Performance Analysis: New Tools and Old SecretsBrendan Gregg
Talk for USENIX/LISA2014 by Brendan Gregg, Netflix. At Netflix performance is crucial, and we use many high to low level tools to analyze our stack in different ways. In this talk, I will introduce new system observability tools we are using at Netflix, which I've ported from my DTraceToolkit, and are intended for our Linux 3.2 cloud instances. These show that Linux can do more than you may think, by using creative hacks and workarounds with existing kernel features (ftrace, perf_events). While these are solving issues on current versions of Linux, I'll also briefly summarize the future in this space: eBPF, ktap, SystemTap, sysdig, etc.
The document summarizes a talk on container performance analysis. It discusses identifying bottlenecks at the host, container, and kernel level using various Linux performance tools. It then provides an overview of how containers work in Linux using namespaces and control groups (cgroups). Finally, it demonstrates some example commands like docker stats, systemd-cgtop, and bcc/BPF tools that can be used to analyze containers and cgroups from the host system.
"In this follow-up to his 2014 presentation at the Stanford HPCAC Conference, Michael will provide an update on the latest happenings with the LBNL NHC project, new features in the latest release, and a brief overview of the roadmap for future development."
Learn more: https://github.com/mej/nhc
Sign up for our insideHPC Newsletter: http://insideHPC.com/newsletter
Linux offers an extensive selection of programmable and configurable networking components from traditional bridges, encryption, to container optimized layer 2/3 devices, link aggregation, tunneling, several classification and filtering languages all the way up to full SDN components. This talk will provide an overview of many Linux networking components covering the Linux bridge, IPVLAN, MACVLAN, MACVTAP, Bonding/Team, OVS, classification & queueing, tunnel types, hidden routing tricks, IPSec, VTI, VRF and many others.
Improved alerting with Prometheus and AlertmanagerJulien Pivotto
This document summarizes a presentation on improving alerting with Prometheus and Alertmanager. It discusses using PromQL and YAML to configure alerts and receivers. Key points covered include using thresholds, hysteresis, absence of metrics, computed thresholds, time windows like business hours, holidays and time zones, and inhibition rules to suppress notifications.
Aggregator Leaf Tailer: Bringing Data to Your Users with Ultra Low LatencyScyllaDB
theScore is one of the leading sports news and media platforms in North America. With the recent growth of sports betting, theScore is diving in head-first. Province and state regulations often enforce strict data-locality requirements often prohibiting the use of centralized hosted data storage solutions.
In this talk, we'll learn how theScore built Datadex, a geographically distributed system for low-latency queries and real-time updates. We'll touch on the architecture of the system (Aggregator Leaf Tailer), the underlying technologies (RocksDB, Kafka, and the JVM), as well as some nitty-gritty tips for optimizing the JVM and RocksDB for high throughput.
The document discusses improvements to the implementation of futexes (fast userspace mutexes) in the Linux kernel to improve scaling on multicore systems. Some key issues with the original futex implementation are a global hash table that does not scale well with NUMA, hash collisions, and contention on hash bucket locks. Improvements discussed include using per-process or per-thread hash tables to address NUMA issues, improving hashing to reduce collisions, releasing hash bucket locks before waking tasks to allow concurrent wakeups, and replacing spinlocks with queued/MCS locks to reduce cacheline bouncing under contention. These changes aim to improve futex performance and scalability as the number of cores in systems increases.
Evening out the uneven: dealing with skew in FlinkFlink Forward
Flink Forward San Francisco 2022.
When running Flink jobs, skew is a common problem that results in wasted resources and limited scalability. In the past years, we have helped our customers and users solve various skew-related issues in their Flink jobs or clusters. In this talk, we will present the different types of skew that users often run into: data skew, key skew, event time skew, state skew, and scheduling skew, and discuss solutions for each of them. We hope this will serve as a guideline to help you reduce skew in your Flink environment.
by
Jun Qin & Karl Friedrich
This document provides an overview of five steps to improve PostgreSQL performance: 1) hardware optimization, 2) operating system and filesystem tuning, 3) configuration of postgresql.conf parameters, 4) application design considerations, and 5) query tuning. The document discusses various techniques for each step such as selecting appropriate hardware components, spreading database files across multiple disks or arrays, adjusting memory and disk configuration parameters, designing schemas and queries efficiently, and leveraging caching strategies.
Presentació a càrrec d'Ismael Fernández i Cristian
Gomollón (tècnics d'Aplicacions al CSUC) duta a terme a la "2a Jornada de formació sobre l'ús del servei de càlcul" celebrada el 19 de febrer de 2020 al CSUC.
Lecture 7: Introduction to Quantum Chemical Simulation graduate course taught at MIT in Fall 2014 by Heather Kulik. This course covers: wavefunction theory, density functional theory, force fields and molecular dynamics and sampling.
Performance tweaks and tools for Linux (Joe Damato)Ontico
The document discusses various Linux performance analysis tools including lsof to list open files, strace to trace system calls, tcpdump to dump network traffic, perftools from Google for profiling CPU usage, and a Ruby library called perftools.rb for profiling Ruby code. Examples are provided for using these tools to analyze memory usage, slow queries, Ruby interpreter signals, thread scheduling overhead, and identifying hot spots in Ruby web applications.
A short overview of current technologies plucked from the Texas Linux Fest schedule for 2014. Includes overviews of systemd, popular configuration management tools, docker, distributed log collection, and openstack.
Solr Troubleshooting - Treemap Approach: Presented by Alexandre Rafolovitch, ...Lucidworks
The document discusses troubleshooting techniques for Solr, including using a "TreeMap" process of establishing boundaries, splitting the problem, identifying relevant parts, zooming in, and re-formulating boundaries in an iterative process until the problem is fixed. It outlines how to establish boundaries by defining the identity, location, timing, and magnitude of the problem. Examples are provided for indexing and searching processes in Solr.
This document provides information on various debugging and profiling tools that can be used for Ruby including:
- lsof to list open files for a process
- strace to trace system calls and signals
- tcpdump to dump network traffic
- google perftools profiler for CPU profiling
- pprof to analyze profiling data
It also discusses how some of these tools have helped identify specific performance issues with Ruby like excessive calls to sigprocmask and memcpy calls slowing down EventMachine with threads.
LinuxCon Europe, 2014. Video: https://www.youtube.com/watch?v=SN7Z0eCn0VY . There are many performance tools nowadays for Linux, but how do they all fit together, and when do we use them? This talk summarizes the three types of performance tools: observability, benchmarking, and tuning, providing a tour of what exists and why they exist. Advanced tools including those based on tracepoints, kprobes, and uprobes are also included: perf_events, ktap, SystemTap, LTTng, and sysdig. You'll gain a good understanding of the performance tools landscape, knowing what to reach for to get the most out of your systems.
This document provides an overview of systemd and how it differs from traditional init systems. It discusses systemd units and how to manage services using systemctl. It covers customizing units using drop-ins, managing resources with cgroups, converting init scripts, and using the systemd journal. The presentation aims to demystify systemd and provide administrators with practical guidance on using its main features.
pg_proctab: Accessing System Stats in PostgreSQLMark Wong
pg_proctab is a collection of PostgreSQL stored functions that provide access to the operating system process table using SQL. We'll show you which functions are available and where they collect the data, and give examples of their use to collect processor and I/O statistics on SQL queries. These stored functions currently only work on Linux-based systems.
The document discusses using pg_proctab, a PostgreSQL extension that provides functions to query operating system process and statistics tables from within PostgreSQL. It demonstrates how to use pg_proctab to monitor CPU and memory usage, I/O, and other process-level metrics for queries. The document also shows how to generate custom reports on database activity and performance by taking snapshots before and after queries and analyzing the differences.
This document discusses the slides for Unit 2 of the Operating Systems course. It includes an index of lecture topics that will be covered, such as process concepts and threads, scheduling criteria and algorithms, thread scheduling, case studies of UNIX/Linux and Windows operating systems, and revision. Key concepts that will be covered include processes and threads, process state diagrams, process control blocks, CPU scheduling queues, producer-consumer problem solutions, scheduling criteria and algorithms like FCFS, SJF, priority and round robin, and thread scheduling models.
This document provides an overview of basic UNIX commands and utilities for general use, performance monitoring and debugging. It discusses commands for file manipulation, permissions, searching, networking and more. Specific utilities covered include ls, cat, grep, find, top and netstat. The document also reviews UNIX concepts like HOME directories, hidden files and aliases. Examples are given for many commands.
Infrastructure review - Shining a light on the Black BoxMiklos Szel
Scenario: You work as a consultant and a new client has just signed on. Their DBA left suddenly leaving nothing but some outdated documentation in their wiki. After the kick-off meeting you realise that the operations and the development teams know little to none about the databases. They have been encountering intermittent problems with the application’s performance and suspect it’s related to the databases. You are told: "Please fix it ASAP!” So you have your public key installed on their jumphost and they manage to provide you with a 6 character long mysql root password. This is where your journey begins! During this session you will learn some of the best practices around discovering a new environment, finding possible threats and weaknesses and determining what key metrics to focus on for performance and reliability. We will cover architecture, replication, OS and MySQL level configuration, storage engines, failover strategies, backup and restores, monitoring, query tuning and possible ways to save money. The goal at the end of the presentation is to have a prioritized action plan. I will also explain the usage and the output of some tools/wrappers that help during an infrastructure review. Examples include creation of maximum integetr usage reports, table fragmentation and duplicate keys (we will be leveraging multiple Percona Toolkit scripts but also some lesser known tools as well). It is often easy to overlook underlying problems in the infrastructure during day-to-day operations, so this presentation will aim to highlight how to identify and resolve potential bottlenecks with your systems.
Training Slides: 104 - Basics - Working With Command Line ToolsContinuent
This 62min training session takes an in-depth look at the command line tools used in conjunction with Tungsten Clustering.
TOPICS COVERED
- Re-cap the previous Installation
- Explore the main Command Line Tools
- tpm
- cctrl
- trepctl
- thl
This document discusses processes in Linux and tools for managing them. It begins by explaining that a process is a set of instructions loaded into memory that has a numeric PID for identification. It then covers commands like ps, pgrep, and pidof for listing and finding processes. The document also discusses signals for inter-process communication, scheduling priority, job control, and tools like top, at and crontab for interactive process management and scheduling jobs.
Video: https://www.youtube.com/watch?v=uibLwoVKjec . Talk by Brendan Gregg for Sysdig CCWFS 2016. Abstract:
"You have a system with an advanced programmatic tracer: do you know what to do with it? Brendan has used numerous tracers in production environments, and has published hundreds of tracing-based tools. In this talk he will share tips and know-how for creating CLI tracing tools and GUI visualizations, to solve real problems effectively. Programmatic tracing is an amazing superpower, and this talk will show you how to wield it!"
You have a system with an advanced programmatic tracer: do you know what to do with it? Brendan has used numerous tracers in production environments, and has published hundreds of tracing-based tools. In this talk he will share tips and know-how for creating CLI tracing tools and GUI visualizations, to solve real problems effectively. Programmatic tracing is an amazing superpower, and this talk will show you how to wield it!
Modern operating systems are complex beasts, responsible for sharing hardware resources between many competing programs. For low-latency systems, sometimes it's necessary to subvert the OS to grab back the resources your program needs. In this talk, we will explore what is actually going on when you run a program, how much time it actually gets on CPU, and strategies to help make your code run as fast as possible. By the end of this talk, you will know how to tune your software and the linux kernel to get the most from your hardware, and more importantly, how to validate that your changes have worked.
Activitat de formació, impartida pel CSUC, per sensibilitzar de la importància de fer una bona gestió de les dades de recerca i, sobretot, de la necessitat de publicar les dades seguint els principis FAIR (trobables, accessibles, interoperables i reutilitzables).
Presentació a càrrec de Clara Llebot, tècnica de curació de dades, al 18è congrés International Digital Curation Conference (IDCC). En són coautors la Mireia Alcalá, tècnica de recursos d'informació, i en Lluís Anglada, assessor de ciència oberta del CSUC.
Activitat de formació, impartida pel CSUC, per sensibilitzar de la importància de fer una bona gestió de les dades de recerca, començant per la generació de plans de gestió de dades.
Presentació a càrrec de Mireia Alcalá (CSUC). La presentació s'emmarca dins el taller "Com pot ajudar la gestió de les dades de recerca a posar en pràctica la ciència oberta?".
El taller va ser organitzat per la UVic-UCC des de la Biblioteca i l'OTRI i relitza una introducció a la ciència oberta i a la gestió de dades de recerca.
The Network eAcademy provides training and resources related to network automation. It was created through discussions between 2019-2020 and focuses on orchestration, automation and virtualization. The training includes self-paced courses and interactive sessions. While the eAcademy has been successful with over 2,600 users in 2023, live sessions have had low attendance, suggesting people may be shy about participating. The most popular training content introduces OAV concepts while Ansible and specific protocols also see high usage. Creation of the eAcademy involved collaboration across multiple organizations and took over three years to develop fully.
Participació de Mireia Alcalá, tècnica líder en Curació de Dades al CSUC, a la jornada de REBIUN-ANECA sobre l'avaluació de la recerca a les biblioteques universitàries i científiques espanyoles.
El documento habla sobre la importancia de priorizar estratégicamente los proyectos de TI en una universidad a través de una cartera de proyectos de TI. Explica que aunque no se disponga de recursos ilimitados, una cartera estratégica permite alinear los proyectos con los objetivos de la universidad y maximizar el valor de las inversiones en TI. También describe elementos clave de un proceso efectivo de gestión de cartera como la definición de roles, criterios de evaluación y seguimiento del impacto de los proyectos.
The document is a presentation by Gartner discussing the transformation of roles and skills in an AI-filled world. It contains information about generative AI and its applications, including using AI to extend legacy systems and automate parts of the software development process. It also discusses risks of generative AI like intellectual property and privacy issues. The presentation provides guidance to organizations on becoming "AI ready" and how CIOs can empower teams to safely develop digital capabilities.
More from CSUC - Consorci de Serveis Universitaris de Catalunya (20)
Conversational agents, or chatbots, are increasingly used to access all sorts of services using natural language. While open-domain chatbots - like ChatGPT - can converse on any topic, task-oriented chatbots - the focus of this paper - are designed for specific tasks, like booking a flight, obtaining customer support, or setting an appointment. Like any other software, task-oriented chatbots need to be properly tested, usually by defining and executing test scenarios (i.e., sequences of user-chatbot interactions). However, there is currently a lack of methods to quantify the completeness and strength of such test scenarios, which can lead to low-quality tests, and hence to buggy chatbots.
To fill this gap, we propose adapting mutation testing (MuT) for task-oriented chatbots. To this end, we introduce a set of mutation operators that emulate faults in chatbot designs, an architecture that enables MuT on chatbots built using heterogeneous technologies, and a practical realisation as an Eclipse plugin. Moreover, we evaluate the applicability, effectiveness and efficiency of our approach on open-source chatbots, with promising results.
"Choosing proper type of scaling", Olena SyrotaFwdays
Imagine an IoT processing system that is already quite mature and production-ready and for which client coverage is growing and scaling and performance aspects are life and death questions. The system has Redis, MongoDB, and stream processing based on ksqldb. In this talk, firstly, we will analyze scaling approaches and then select the proper ones for our system.
QA or the Highway - Component Testing: Bridging the gap between frontend appl...zjhamm304
These are the slides for the presentation, "Component Testing: Bridging the gap between frontend applications" that was presented at QA or the Highway 2024 in Columbus, OH by Zachary Hamm.
High performance Serverless Java on AWS- GoTo Amsterdam 2024Vadym Kazulkin
Java is for many years one of the most popular programming languages, but it used to have hard times in the Serverless community. Java is known for its high cold start times and high memory footprint, comparing to other programming languages like Node.js and Python. In this talk I'll look at the general best practices and techniques we can use to decrease memory consumption, cold start times for Java Serverless development on AWS including GraalVM (Native Image) and AWS own offering SnapStart based on Firecracker microVM snapshot and restore and CRaC (Coordinated Restore at Checkpoint) runtime hooks. I'll also provide a lot of benchmarking on Lambda functions trying out various deployment package sizes, Lambda memory settings, Java compilation options and HTTP (a)synchronous clients and measure their impact on cold and warm start times.
The Microsoft 365 Migration Tutorial For Beginner.pptxoperationspcvita
This presentation will help you understand the power of Microsoft 365. However, we have mentioned every productivity app included in Office 365. Additionally, we have suggested the migration situation related to Office 365 and how we can help you.
You can also read: https://www.systoolsgroup.com/updates/office-365-tenant-to-tenant-migration-step-by-step-complete-guide/
Dandelion Hashtable: beyond billion requests per second on a commodity serverAntonios Katsarakis
This slide deck presents DLHT, a concurrent in-memory hashtable. Despite efforts to optimize hashtables, that go as far as sacrificing core functionality, state-of-the-art designs still incur multiple memory accesses per request and block request processing in three cases. First, most hashtables block while waiting for data to be retrieved from memory. Second, open-addressing designs, which represent the current state-of-the-art, either cannot free index slots on deletes or must block all requests to do so. Third, index resizes block every request until all objects are copied to the new index. Defying folklore wisdom, DLHT forgoes open-addressing and adopts a fully-featured and memory-aware closed-addressing design based on bounded cache-line-chaining. This design offers lock-free index operations and deletes that free slots instantly, (2) completes most requests with a single memory access, (3) utilizes software prefetching to hide memory latencies, and (4) employs a novel non-blocking and parallel resizing. In a commodity server and a memory-resident workload, DLHT surpasses 1.6B requests per second and provides 3.5x (12x) the throughput of the state-of-the-art closed-addressing (open-addressing) resizable hashtable on Gets (Deletes).
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdfChart Kalyan
A Mix Chart displays historical data of numbers in a graphical or tabular form. The Kalyan Rajdhani Mix Chart specifically shows the results of a sequence of numbers over different periods.
ScyllaDB is making a major architecture shift. We’re moving from vNode replication to tablets – fragments of tables that are distributed independently, enabling dynamic data distribution and extreme elasticity. In this keynote, ScyllaDB co-founder and CTO Avi Kivity explains the reason for this shift, provides a look at the implementation and roadmap, and shares how this shift benefits ScyllaDB users.
Main news related to the CCS TSI 2023 (2023/1695)Jakub Marek
An English 🇬🇧 translation of a presentation to the speech I gave about the main changes brought by CCS TSI 2023 at the biggest Czech conference on Communications and signalling systems on Railways, which was held in Clarion Hotel Olomouc from 7th to 9th November 2023 (konferenceszt.cz). Attended by around 500 participants and 200 on-line followers.
The original Czech 🇨🇿 version of the presentation can be found here: https://www.slideshare.net/slideshow/hlavni-novinky-souvisejici-s-ccs-tsi-2023-2023-1695/269688092 .
The videorecording (in Czech) from the presentation is available here: https://youtu.be/WzjJWm4IyPk?si=SImb06tuXGb30BEH .
The Department of Veteran Affairs (VA) invited Taylor Paschal, Knowledge & Information Management Consultant at Enterprise Knowledge, to speak at a Knowledge Management Lunch and Learn hosted on June 12, 2024. All Office of Administration staff were invited to attend and received professional development credit for participating in the voluntary event.
The objectives of the Lunch and Learn presentation were to:
- Review what KM ‘is’ and ‘isn’t’
- Understand the value of KM and the benefits of engaging
- Define and reflect on your “what’s in it for me?”
- Share actionable ways you can participate in Knowledge - - Capture & Transfer
Northern Engraving | Modern Metal Trim, Nameplates and Appliance PanelsNorthern Engraving
What began over 115 years ago as a supplier of precision gauges to the automotive industry has evolved into being an industry leader in the manufacture of product branding, automotive cockpit trim and decorative appliance trim. Value-added services include in-house Design, Engineering, Program Management, Test Lab and Tool Shops.
"Frontline Battles with DDoS: Best practices and Lessons Learned", Igor IvaniukFwdays
At this talk we will discuss DDoS protection tools and best practices, discuss network architectures and what AWS has to offer. Also, we will look into one of the largest DDoS attacks on Ukrainian infrastructure that happened in February 2022. We'll see, what techniques helped to keep the web resources available for Ukrainians and how AWS improved DDoS protection for all customers based on Ukraine experience
Discover top-tier mobile app development services, offering innovative solutions for iOS and Android. Enhance your business with custom, user-friendly mobile applications.
"$10 thousand per minute of downtime: architecture, queues, streaming and fin...Fwdays
Direct losses from downtime in 1 minute = $5-$10 thousand dollars. Reputation is priceless.
As part of the talk, we will consider the architectural strategies necessary for the development of highly loaded fintech solutions. We will focus on using queues and streaming to efficiently work and manage large amounts of data in real-time and to minimize latency.
We will focus special attention on the architectural patterns used in the design of the fintech system, microservices and event-driven architecture, which ensure scalability, fault tolerance, and consistency of the entire system.
Northern Engraving | Nameplate Manufacturing Process - 2024Northern Engraving
Manufacturing custom quality metal nameplates and badges involves several standard operations. Processes include sheet prep, lithography, screening, coating, punch press and inspection. All decoration is completed in the flat sheet with adhesive and tooling operations following. The possibilities for creating unique durable nameplates are endless. How will you create your brand identity? We can help!
Essentials of Automations: Exploring Attributes & Automation ParametersSafe Software
Building automations in FME Flow can save time, money, and help businesses scale by eliminating data silos and providing data to stakeholders in real-time. One essential component to orchestrating complex automations is the use of attributes & automation parameters (both formerly known as “keys”). In fact, it’s unlikely you’ll ever build an Automation without using these components, but what exactly are they?
Attributes & automation parameters enable the automation author to pass data values from one automation component to the next. During this webinar, our FME Flow Specialists will cover leveraging the three types of these output attributes & parameters in FME Flow: Event, Custom, and Automation. As a bonus, they’ll also be making use of the Split-Merge Block functionality.
You’ll leave this webinar with a better understanding of how to maximize the potential of automations by making use of attributes & automation parameters, with the ultimate goal of setting your enterprise integration workflows up on autopilot.
3. What is SLURM?
• Allocates access to resources for some duration of time.
• Provides a framework for starting, executing, and
monitoring work (normally a parallel job).
• Arbitrates contention for resources by managing
a queue of pending work.
Cluster manager and job scheduler
system for large and small Linux
clusters.
8. SLURM: Resource Management
Partitions:
• Associatedwith specific
set of nodes
• Nodes can be in more
than one partition
• Job size and time limits
• Access control list
• State information
− Up
− Drain
− Down
Partitions
12. SLURM: Job Scheduling
Scheduling: The process of determining next job to run and
on which resources.
FIFO Scheduler
Backfill Scheduler
Resources
Time
13. SLURM: Job Scheduling
Scheduling: The process of determining next job to run and
on which resources.
Backfill Scheduler:
• Based on the job request, resources available, and
policy limits imposed.
• Starts with job priority.
• Results in a resource allocation over a period.
15. •sbatch – Submit a batch script to Slurm.
•salloc – Request resources to SLURM for an interactive
job.
•srun – Start a new job step.
•scancel – Cancel a job.
SLURM: Commands
16. • sinfo – Report system status (nodes, queues, etc.).
PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
rest up infinite 3 idle~ pirineusgpu[1-2],pirineusknl1
rest up infinite 1 idle canigo2
std* up infinite 11 idle~ pirineus[14,19-20,23,25-26,29-30,33-34,40]
std* up infinite 18 mix pirineus[13,15-16,18,21-22,27-28,35,38-39,41-45,48-49]
std* up infinite 7 alloc pirineus[17,24,31,36-37,46-47]
gpu up infinite 2 alloc pirineusgpu[3-4]
knl up infinite 3 idle~ pirineusknl[2-4]
mem up infinite 1 mix canigo1
class_a up infinite 8 mix canigo1,pirineus[1-7]
class_a up infinite 1 alloc pirineus8
class_b up infinite 8 mix canigo1,pirineus[1-7]
class_b up infinite 1 alloc pirineus8
class_c up infinite 8 mix canigo1,pirineus[1-7]
class_c up infinite 1 alloc pirineus8
std_curs up infinite 5 idle~ pirineus[9-12,50]
gpu_curs up infinite 2 idle~ pirineusgpu[1-2]
SLURM: Commands
24. Login on CSUC infrastructure
• Login
ssh –p 2122 username@hpc.csuc.cat
• Transferfiles
scp -P 2122 local_file username@hpc.csuc.cat:[path to your folder]
sftp -oPort=2122 username@hpc.csuc.cat
• Useful paths
Name Variable Availability Quote/project Time limit Backup
/home/$user $HOME global 4 GB unlimited Yes
/scratch/$user $SCRATCH global unlimited 30 days No
/scratch/$user/tmp/jobid $TMPDIR Local to each node job file limit 1 week No
/tmp/$user/jobid $TMPDIR Local to each node job file limit 1 week No
• Get HC consumption
consum -a ‘any’ (group consumption)
consum -a ‘any’ -u ‘nom_usuari’ (user consumption)
25. Batch job submission: Default settings
• 4Gb/core (excepting on mem partition).
• 24Gb/core on mem partition.
• 1 core on std and mem partitions.
• 24 cores on gpu partition
• The whole node on KNL partition
• Non-exclusive, multinode job.
• Scratch and Output directory are the submit directory.
26. Batch job submission
• Basic Linux commands:
Description Command Exemple
List files ls ls /home/user
Making folders mkdir mkdir /home/prova
Changing folder cd cd /home/prova
Copy files cp cp nom_arxiu1 nom_arxiu2
Move file mv mv /home/prova.txt /cescascratch/prova.txt
Delete file rm rm filename
Print file content cat cat filename
Find string into files grep grep ‘word’ filename
List last lines on file tail tail filename
• Text editors : vim, nano, emacs,etc.
• More detailed info and options about the commands:
‘command’ –help
man ‘command’
27. Scheduler directives/Options
• -c, --cpus-per-task=ncpus number of cpus required per task
• --gres=list required generic resources
• -J, --job-name=jobname name of job
• -n, --ntasks=ntasks number of tasks to run
• --ntasks-per-node=n number of tasks to invoke on each node
• -N, --nodes=N number of nodes on which to run (N = min[-max])
• -o, --output=out file for batch script's standard output
• -p, --partition=partition partition requested
• -t, --time=minutes time limit (format: dd-hh:mm)
28. • -C, --constraint=list specify a list of constraints(mem, vnc , ....)
• --mem=MB minimum amount of total real memory
• --reservation=name allocate resources from named reservation
• -w, --nodelist=hosts... request a specific list of hosts
• --mem-per-cpu=MB amount of real memory per allocated core
Scheduler directives/Options
29. #!/bin/bash
#SBATCH–jtreball_prova
#SBATCH-o treball_prova.log
#SBATCH-e treball_prova.err
#SBATCH-p std
#SBATCH-n 48
module load mpi/intel/openmpi/3.1.0
cp –r $input $SCRATCH
Cd $SCRATCH
srun $APPLICATION
mkdir -p $OUTPUT_DIR
cp -r * $output
Batch job submission
Schedulerdirectives
Setting up the environment
Move the input files to the working directory
Launch the application(similar to mpirun)
Create the output folderand move the outputs
34. Best Practices
• Use $SCRATCHas workingdirectory.
• Move only the necessaryfiles(notall files in the folder each time).
• Try to keep importantfiles only at $HOME
• Try to choose the partition and resoruces whose mostfit to your job.