Greg Smith
Cover how to use simple low-level tools such as memtest86, dd, bonnie++, and sysbench to benchmark the hardware of a server intended for database use. A heavy dose of vendor management suggestions will be included as well, for the inevitable time when your shiny new server fails to deliver the performance it should.
Optimizing Servers for High-Throughput and Low-Latency at DropboxScyllaDB
I'm going to discuss the efficiency/performance optimizations of different layers of the system. Starting from the lowest levels like hardware and drivers: these tunings can be applied to pretty much any high-load server. Then we’ll move to Linux kernel and its TCP/IP stack: these are the knobs you want to try on any of your TCP-heavy boxes. Finally, we’ll discuss library and application-level tunings, which are mostly applicable to HTTP servers in general and nginx/envoy specifically.
For each potential area of optimization I’ll try to give some background on latency/throughput tradeoffs (if any), monitoring guidelines, and, finally, suggest tunings for different workloads.
Also, I'll cover more theoretical approaches to performance analysis and the newly developed tooling like `bpftrace` and new `perf` features.
Evaluating GPU programming Models for the LUMI SupercomputerGeorge Markomanolis
It is common in the HPC community that the achieved performance with just CPUs is limited for many computational cases. The EuroHPC pre-exascale and the coming exascale systems are mainly focused on accelerators, and some of the largest upcoming supercomputers such as LUMI and Frontier will be powered by AMD Instinct accelerators. However, these new systems create many challenges for developers who are not familiar with the new ecosystem or with the required programming models that can be used to program for heterogeneous architectures. In this paper, we present some of the more well-known programming models to program for current and future GPU systems. We then measure the performance of each approach using a benchmark and a mini-app, test with various compilers, and tune the codes where necessary. Finally, we compare the performance, where possible, between the NVIDIA Volta (V100), Ampere (A100) GPUs, and the AMD MI100 GPU.
Presentation of a paper accepted in Supercomputing Frontiers Asia 2022
The Linux kernel is undergoing the most fundamental architecture evolution in history and is becoming a microkernel. Why is the Linux kernel evolving into a microkernel? The potentially biggest fundamental change ever happening to the Linux kernel. This talk covers how companies like Facebook and Google use BPF to patch 0-day exploits, how BPF will change the way features are added to the kernel forever, and how BPF is introducing a new type of application deployment method for the Linux kernel.
Velocity 2017 Performance analysis superpowers with Linux eBPFBrendan Gregg
Talk by for Velocity 2017 by Brendan Gregg: Performance analysis superpowers with Linux eBPF.
"Advanced performance observability and debugging have arrived built into the Linux 4.x series, thanks to enhancements to Berkeley Packet Filter (BPF, or eBPF) and the repurposing of its sandboxed virtual machine to provide programmatic capabilities to system tracing. Netflix has been investigating its use for new observability tools, monitoring, security uses, and more. This talk will investigate this new technology, which sooner or later will be available to everyone who uses Linux. The talk will dive deep on these new tracing, observability, and debugging capabilities. Whether you’re doing analysis over an ssh session, or via a monitoring GUI, BPF can be used to provide an efficient, custom, and deep level of detail into system and application performance.
This talk will also demonstrate the new open source tools that have been developed, which make use of kernel- and user-level dynamic tracing (kprobes and uprobes), and kernel- and user-level static tracing (tracepoints). These tools provide new insights for file system and storage performance, CPU scheduler performance, TCP performance, and a whole lot more. This is a major turning point for Linux systems engineering, as custom advanced performance instrumentation can be used safely in production environments, powering a new generation of tools and visualizations."
PowerPoint Presentation on the popular topic Multi Core Processors,History of multi core processors, comparison between single core and multi core processors, advantages and disadvantages of multi core processors.
YOW2018 Cloud Performance Root Cause Analysis at NetflixBrendan Gregg
Keynote by Brendan Gregg for YOW! 2018. Video: https://www.youtube.com/watch?v=03EC8uA30Pw . Description: "At Netflix, improving the performance of our cloud means happier customers and lower costs, and involves root cause
analysis of applications, runtimes, operating systems, and hypervisors, in an environment of 150k cloud instances
that undergo numerous production changes each week. Apart from the developers who regularly optimize their own code
, we also have a dedicated performance team to help with any issue across the cloud, and to build tooling to aid in
this analysis. In this session we will summarize the Netflix environment, procedures, and tools we use and build t
o do root cause analysis on cloud performance issues. The analysis performed may be cloud-wide, using self-service
GUIs such as our open source Atlas tool, or focused on individual instances, and use our open source Vector tool, f
lame graphs, Java debuggers, and tooling that uses Linux perf, ftrace, and bcc/eBPF. You can use these open source
tools in the same way to find performance wins in your own environment."
Optimizing Servers for High-Throughput and Low-Latency at DropboxScyllaDB
I'm going to discuss the efficiency/performance optimizations of different layers of the system. Starting from the lowest levels like hardware and drivers: these tunings can be applied to pretty much any high-load server. Then we’ll move to Linux kernel and its TCP/IP stack: these are the knobs you want to try on any of your TCP-heavy boxes. Finally, we’ll discuss library and application-level tunings, which are mostly applicable to HTTP servers in general and nginx/envoy specifically.
For each potential area of optimization I’ll try to give some background on latency/throughput tradeoffs (if any), monitoring guidelines, and, finally, suggest tunings for different workloads.
Also, I'll cover more theoretical approaches to performance analysis and the newly developed tooling like `bpftrace` and new `perf` features.
Evaluating GPU programming Models for the LUMI SupercomputerGeorge Markomanolis
It is common in the HPC community that the achieved performance with just CPUs is limited for many computational cases. The EuroHPC pre-exascale and the coming exascale systems are mainly focused on accelerators, and some of the largest upcoming supercomputers such as LUMI and Frontier will be powered by AMD Instinct accelerators. However, these new systems create many challenges for developers who are not familiar with the new ecosystem or with the required programming models that can be used to program for heterogeneous architectures. In this paper, we present some of the more well-known programming models to program for current and future GPU systems. We then measure the performance of each approach using a benchmark and a mini-app, test with various compilers, and tune the codes where necessary. Finally, we compare the performance, where possible, between the NVIDIA Volta (V100), Ampere (A100) GPUs, and the AMD MI100 GPU.
Presentation of a paper accepted in Supercomputing Frontiers Asia 2022
The Linux kernel is undergoing the most fundamental architecture evolution in history and is becoming a microkernel. Why is the Linux kernel evolving into a microkernel? The potentially biggest fundamental change ever happening to the Linux kernel. This talk covers how companies like Facebook and Google use BPF to patch 0-day exploits, how BPF will change the way features are added to the kernel forever, and how BPF is introducing a new type of application deployment method for the Linux kernel.
Velocity 2017 Performance analysis superpowers with Linux eBPFBrendan Gregg
Talk by for Velocity 2017 by Brendan Gregg: Performance analysis superpowers with Linux eBPF.
"Advanced performance observability and debugging have arrived built into the Linux 4.x series, thanks to enhancements to Berkeley Packet Filter (BPF, or eBPF) and the repurposing of its sandboxed virtual machine to provide programmatic capabilities to system tracing. Netflix has been investigating its use for new observability tools, monitoring, security uses, and more. This talk will investigate this new technology, which sooner or later will be available to everyone who uses Linux. The talk will dive deep on these new tracing, observability, and debugging capabilities. Whether you’re doing analysis over an ssh session, or via a monitoring GUI, BPF can be used to provide an efficient, custom, and deep level of detail into system and application performance.
This talk will also demonstrate the new open source tools that have been developed, which make use of kernel- and user-level dynamic tracing (kprobes and uprobes), and kernel- and user-level static tracing (tracepoints). These tools provide new insights for file system and storage performance, CPU scheduler performance, TCP performance, and a whole lot more. This is a major turning point for Linux systems engineering, as custom advanced performance instrumentation can be used safely in production environments, powering a new generation of tools and visualizations."
PowerPoint Presentation on the popular topic Multi Core Processors,History of multi core processors, comparison between single core and multi core processors, advantages and disadvantages of multi core processors.
YOW2018 Cloud Performance Root Cause Analysis at NetflixBrendan Gregg
Keynote by Brendan Gregg for YOW! 2018. Video: https://www.youtube.com/watch?v=03EC8uA30Pw . Description: "At Netflix, improving the performance of our cloud means happier customers and lower costs, and involves root cause
analysis of applications, runtimes, operating systems, and hypervisors, in an environment of 150k cloud instances
that undergo numerous production changes each week. Apart from the developers who regularly optimize their own code
, we also have a dedicated performance team to help with any issue across the cloud, and to build tooling to aid in
this analysis. In this session we will summarize the Netflix environment, procedures, and tools we use and build t
o do root cause analysis on cloud performance issues. The analysis performed may be cloud-wide, using self-service
GUIs such as our open source Atlas tool, or focused on individual instances, and use our open source Vector tool, f
lame graphs, Java debuggers, and tooling that uses Linux perf, ftrace, and bcc/eBPF. You can use these open source
tools in the same way to find performance wins in your own environment."
Delivered as plenary at USENIX LISA 2013. video here: https://www.youtube.com/watch?v=nZfNehCzGdw and https://www.usenix.org/conference/lisa13/technical-sessions/plenary/gregg . "How did we ever analyze performance before Flame Graphs?" This new visualization invented by Brendan can help you quickly understand application and kernel performance, especially CPU usage, where stacks (call graphs) can be sampled and then visualized as an interactive flame graph. Flame Graphs are now used for a growing variety of targets: for applications and kernels on Linux, SmartOS, Mac OS X, and Windows; for languages including C, C++, node.js, ruby, and Lua; and in WebKit Web Inspector. This talk will explain them and provide use cases and new visualizations for other event types, including I/O, memory usage, and latency.
Talk for Facebook Systems@Scale 2021 by Brendan Gregg: "BPF (eBPF) tracing is the superpower that can analyze everything, helping you find performance wins, troubleshoot software, and more. But with many different front-ends and languages, and years of evolution, finding the right starting point can be hard. This talk will make it easy, showing how to install and run selected BPF tools in the bcc and bpftrace open source projects for some quick wins. Think like a sysadmin, not like a programmer."
Video: http://joyent.com/blog/linux-performance-analysis-and-tools-brendan-gregg-s-talk-at-scale-11x ; This talk for SCaLE11x covers system performance analysis methodologies and the Linux tools to support them, so that you can get the most out of your systems and solve performance issues quickly. This includes a wide variety of tools, including basics like top(1), advanced tools like perf, and new tools like the DTrace for Linux prototypes.
High-Performance Networking Using eBPF, XDP, and io_uringScyllaDB
In the networking world there are a number of ways to increase performance over naive use of basic Berkeley sockets. These techniques have ranged from polling blocking sockets, non-blocking sockets controlled by Epoll, all the way through completely bypassing the Linux kernel for maximum network performance where you talk directly to the network interface card by using something like DPDK or Netmap. All these tools have their place, and generally occupy a space from convenience to performance. But in recent years, that landscape has changed massively.. The tools available to the average Linux systems developer have improved from the creation of io_uring, to the expansion of bpf from a simple filtering language to a full-on programming environment embedded directly in the kernel. Along with that came something called XDP (express datapath). This was Linux kernel's answer to kernel-bypass networking. AF_XDP is the new socket type created by this feature, and generally works very similarly to something like DPDK. History lessons out of the way, this talk will look into, and discuss the merits of this technology, it's place in the broader ecosystem and how it can be used to attain the highest level of performance possible. This talk will dive into crucial details, such as how AF_XDP works, how it can be integrated into a larger system and finally more advanced topics such as request sharding/load balancing. There will be detailed look at the design of AF_XDP, the eBpf code used, as well as the userspace code required to drive it all. It will also include performance numbers from this setup compared to regular kernel networking. And most importantly how to put all this together to handle as much data as possible on a single modern multi-core system.
Amazon Elastic Block Store (Amazon EBS) provides persistent block level storage volumes for use with Amazon EC2 instances. In this technical session, we conduct a detailed analysis of the differences among the three types of Amazon EBS block storage: General Purpose (SSD), Provisioned IOPS (SSD), and Magnetic. We discuss how to maximize Amazon EBS performance, with a special eye towards low-latency, high-throughput applications like databases. We discuss the performance implications of our new larger and faster SSD volumes (up to 16 TB with increased max throughput levels), as well as Amazon EBS encryption. Throughout, we share tips for success.
Introduction to Database Benchmarking with Benchmark FactoryMichael Micalizzi
When you are faced with making changes to your database environment such as upgrades, patches, hardware installation or virtual machine implementations, there are inherent risks that database performance or availability can suffer.
Oracle ACE and author Bert Scalzo will show how to reduce these risks and demonstrate how Benchmark Factory can help you scale your database for best possible results.
Delivered as plenary at USENIX LISA 2013. video here: https://www.youtube.com/watch?v=nZfNehCzGdw and https://www.usenix.org/conference/lisa13/technical-sessions/plenary/gregg . "How did we ever analyze performance before Flame Graphs?" This new visualization invented by Brendan can help you quickly understand application and kernel performance, especially CPU usage, where stacks (call graphs) can be sampled and then visualized as an interactive flame graph. Flame Graphs are now used for a growing variety of targets: for applications and kernels on Linux, SmartOS, Mac OS X, and Windows; for languages including C, C++, node.js, ruby, and Lua; and in WebKit Web Inspector. This talk will explain them and provide use cases and new visualizations for other event types, including I/O, memory usage, and latency.
Talk for Facebook Systems@Scale 2021 by Brendan Gregg: "BPF (eBPF) tracing is the superpower that can analyze everything, helping you find performance wins, troubleshoot software, and more. But with many different front-ends and languages, and years of evolution, finding the right starting point can be hard. This talk will make it easy, showing how to install and run selected BPF tools in the bcc and bpftrace open source projects for some quick wins. Think like a sysadmin, not like a programmer."
Video: http://joyent.com/blog/linux-performance-analysis-and-tools-brendan-gregg-s-talk-at-scale-11x ; This talk for SCaLE11x covers system performance analysis methodologies and the Linux tools to support them, so that you can get the most out of your systems and solve performance issues quickly. This includes a wide variety of tools, including basics like top(1), advanced tools like perf, and new tools like the DTrace for Linux prototypes.
High-Performance Networking Using eBPF, XDP, and io_uringScyllaDB
In the networking world there are a number of ways to increase performance over naive use of basic Berkeley sockets. These techniques have ranged from polling blocking sockets, non-blocking sockets controlled by Epoll, all the way through completely bypassing the Linux kernel for maximum network performance where you talk directly to the network interface card by using something like DPDK or Netmap. All these tools have their place, and generally occupy a space from convenience to performance. But in recent years, that landscape has changed massively.. The tools available to the average Linux systems developer have improved from the creation of io_uring, to the expansion of bpf from a simple filtering language to a full-on programming environment embedded directly in the kernel. Along with that came something called XDP (express datapath). This was Linux kernel's answer to kernel-bypass networking. AF_XDP is the new socket type created by this feature, and generally works very similarly to something like DPDK. History lessons out of the way, this talk will look into, and discuss the merits of this technology, it's place in the broader ecosystem and how it can be used to attain the highest level of performance possible. This talk will dive into crucial details, such as how AF_XDP works, how it can be integrated into a larger system and finally more advanced topics such as request sharding/load balancing. There will be detailed look at the design of AF_XDP, the eBpf code used, as well as the userspace code required to drive it all. It will also include performance numbers from this setup compared to regular kernel networking. And most importantly how to put all this together to handle as much data as possible on a single modern multi-core system.
Amazon Elastic Block Store (Amazon EBS) provides persistent block level storage volumes for use with Amazon EC2 instances. In this technical session, we conduct a detailed analysis of the differences among the three types of Amazon EBS block storage: General Purpose (SSD), Provisioned IOPS (SSD), and Magnetic. We discuss how to maximize Amazon EBS performance, with a special eye towards low-latency, high-throughput applications like databases. We discuss the performance implications of our new larger and faster SSD volumes (up to 16 TB with increased max throughput levels), as well as Amazon EBS encryption. Throughout, we share tips for success.
Introduction to Database Benchmarking with Benchmark FactoryMichael Micalizzi
When you are faced with making changes to your database environment such as upgrades, patches, hardware installation or virtual machine implementations, there are inherent risks that database performance or availability can suffer.
Oracle ACE and author Bert Scalzo will show how to reduce these risks and demonstrate how Benchmark Factory can help you scale your database for best possible results.
This tutorial was held at IEEE BigData '14 on October 29, 2014 in Bethesda, ML, USA.
Presenters: Chaitan Baru and Tilmann Rabl
More information available at:
http://msrg.org/papers/BigData14-Rabl
Summary:
This tutorial will introduce the audience to the broad set of issues involved in defining big data benchmarks, for creating auditable industry-standard benchmarks that consider performance as well as price/performance. Big data benchmarks must capture the essential characteristics of big data applications and systems, including heterogeneous data, e.g. structured, semi- structured, unstructured, graphs, and streams; large-scale and evolving system configurations; varying system loads; processing pipelines that progressively transform data; workloads that include queries as well as data mining and machine learning operations and algorithms. Different benchmarking approaches will be introduced, from micro-benchmarks to application- level benchmarking.
Since May 2012, five workshops have been held on Big Data Benchmarking including participation from industry and academia. One of the outcomes of these meetings has been the creation of industry’s first big data benchmark, viz., TPCx-HS, the Transaction Processing Performance Council’s benchmark for Hadoop Systems. During these workshops, a number of other proposals have been put forward for more comprehensive big data benchmarking. The tutorial will present and discuss salient points and essential features of such benchmarks that have been identified in these meetings, by experts in big data as well as benchmarking. Two key approaches are now being pursued—one, called BigBench, is based on extending the TPC- Decision Support (TPC-DS) benchmark with big data applications characteristics. The other called Deep Analytics Pipeline, is based on modeling processing that is routinely encountered in real-life big data applications. Both will be discussed.
We conclude with a discussion of a number of future directions for big data benchmarking
MeetBSDCA 2014 Performance Analysis for BSD, by Brendan Gregg. A tour of five relevant topics: observability tools, methodologies, benchmarking, profiling, and tracing. Tools summarized include pmcstat and DTrace.
KVM and docker LXC Benchmarking with OpenStackBoden Russell
Passive benchmarking with docker LXC and KVM using OpenStack hosted in SoftLayer. These results provide initial incite as to why LXC as a technology choice offers benefits over traditional VMs and seek to provide answers as to the typical initial LXC question -- "why would I consider Linux Containers over VMs" from a performance perspective.
Results here provide insight as to:
- Cloudy ops times (start, stop, reboot) using OpenStack.
- Guest micro benchmark performance (I/O, network, memory, CPU).
- Guest micro benchmark performance of MySQL; OLTP read, read / write complex and indexed insertion.
- Compute node resource consumption; VM / Container density factors.
- Lessons learned during benchmarking.
The tests here were performed using OpenStack Rally to drive the OpenStack cloudy tests and various other linux tools to test the guest performance on a "micro level". The nova docker virt driver was used in the Cloud scenario to realize VMs as docker LXC containers and compared to the nova virt driver for libvirt KVM.
Please read the disclaimers in the presentation as this is only intended to be the "chip of the ice burg".
Технологии работы с дисковыми хранилищами и файловыми системами Windows Serve...Виталий Стародубцев
##Что такое Storage Replica
##Архитектура и сценарии
##Синхронная и асинхронная репликация
##Междисковая, межсерверная, внутрикластерная и межкластерная репликация
##Дизайн и проектирование Storage Replica
##Нововведения в Windows Server 2016 TP5
##Графический интерфейс управления, и другие возможности - демонстрация и планы развития
##Интеграция Storage Replica с Storage Spaces Direct
Database performance tuning for SSD based storageAngelo Rajadurai
Databases are a key part of any application. The storage subsystem contributes most to performance of the database. In recent days, new storage technologies like Solid State Storage (SSD) and high performance drives are becoming cheaper and more accessible, but it takes a lot of planning to use these technologies in a cost effective way for best price-performance.
Databases are a key part of any application. The storage subsystem contributes most to performance of the database. In recent days, new storage technologies like Solid State Storage (SSD) and high performance drives are becoming cheaper and more accessible, but it takes a lot of planning to use these technologies in a cost effective way for best price-performance.
Running Apache Spark on a High-Performance Cluster Using RDMA and NVMe Flash ...Databricks
Effectively leveraging fast networking and storage hardware (e.g., RDMA, NVMe, etc.) in Apache Spark remains challenging. Current ways to integrate the hardware at the operating system level fall short, as the hardware performance advantages are shadowed by higher layer software overheads. This session will show how to integrate RDMA and NVMe hardware in Spark in a way that allows applications to bypass both the operating system and the Java virtual machine during I/O operations. With such an approach, the hardware performance advantages become visible at the application level, and eventually translate into workload runtime improvements. Stuedi will demonstrate how to run various Spark workloads (e.g, SQL, Graph, etc.) effectively on 100Gbit/s networks and NVMe flash.
Data deduplication is a hot topic in storage and saves significant disk space for many environments, with some trade offs. We’ll discuss what deduplication is and where the Open Source solutions are versus commercial offerings. Presentation will lean towards the practical – where attendees can use it in their real world projects (what works, what doesn’t, should you use in production, etcetera).
Hands-on Lab: How to Unleash Your Storage Performance by Using NVM Express™ B...Odinot Stanislas
(FR)
Voici un excellent document qui explique étape après étape comment installer, monitorer et surtout correctement benchmarker ses SSD PCIe/NVMe (pas si simple que ça). Autre élément clé : comment analyser la charge I/O de véritables applications? Combien d'IOPS, en read, en write, quelle bande passante et surtout quel impact sur la durée de vie des SSD? Bref à mettre en toute les mains, et un merci à mon collègue Andrey Kudryavtsev.
(EN)
An excellent content which describe step by step how to install, monitor and benchmark PCIe/NVMe SSD (many trick not so simple). Another key learning: how to measure real I/O activities on a real workload? How many R/W IOPS, block size, throughtput, and finally what's the impact on SSD endurance and (real)life? A must read, and a huge thanks to my colleague Andrey Kudryavtsev.
Auteurs/Authors:
Andrey Kudryavtsev, SSD Solution Architect, Intel Corporation
Zhdan Bybin, Application Engineer, Intel Corporation
Nagios Conference 2012 - Dan Wittenberg - Case Study: Scaling Nagios Core at ...Nagios
Dan Wittenberg's presentation on using Nagios at a Fortune 50 Company
The presentation was given during the Nagios World Conference North America held Sept 25-28th, 2012 in Saint Paul, MN. For more information on the conference (including photos and videos), visit: http://go.nagios.com/nwcna
This presentation provides an overview of the Dell PowerEdge R730xd server performance results with Red Hat Ceph Storage. It covers the advantages of using Red Hat Ceph Storage on Dell servers with their proven hardware components that provide high scalability, enhanced ROI cost benefits, and support of unstructured data.
Speaker: Akira Kurogane, Senior Technical Services Engineer, MongoDB
Level: 300 (Advanced)
Track: Performance
One week your active dataset consumes 90% of available RAM. The next week it's 110%. Is that a 10% or 99% performance degradation? Let's discover what it looks like when different hardware capacity limitations are hit. For example, memory vs. disk bottlenecks, the rare CPU bottleneck and network bottlenecks, seeing what happens when you drop a crucial index during peak load, or what happens when you run multiple WiredTiger nodes on the same server without limiting their cache size.
What You Will Learn:
- Performance analysis
- Post-mortem log analysis
- Capacity planning
Slide chia sẻ công nghệ về caching, thông qua slide này bạn sẽ trả lời được những câu hỏi như:
- Caching là gì
- Làm sao sử dụng cũng như xây dựng hệ thống caching
- Tại sao cache giúp tăng tốc ứng dụng lên vài chục, vài trăm lần
- Các hệ thống lớn của Facebook, Twitter, ... đang sử dụng cache thế nào
- ...
Slide chia sẻ về công nghệ về caching, thông qua slide này bạn sẽ trả lời được những câu hỏi như:
- Caching là gì
- Làm sao sử dụng cũng như xây dựng hệ thống caching
- Tại sao cache giúp tăng tốc ứng dụng lên vài chục, vài trăm lần
- Các hệ thống lớn của Facebook, Twitter, ... đang sử dụng cache thế nào
- ...
Vous avez récemment commencé à travailler sur Spark et vos jobs prennent une éternité pour se terminer ? Cette présentation est faite pour vous.
Himanshu Arora et Nitya Nand YADAV ont rassemblé de nombreuses bonnes pratiques, optimisations et ajustements qu'ils ont appliqué au fil des années en production pour rendre leurs jobs plus rapides et moins consommateurs de ressources.
Dans cette présentation, ils nous apprennent les techniques avancées d'optimisation de Spark, les formats de sérialisation des données, les formats de stockage, les optimisations hardware, contrôle sur la parallélisme, paramétrages de resource manager, meilleur data localité et l'optimisation du GC etc.
Ils nous font découvrir également l'utilisation appropriée de RDD, DataFrame et Dataset afin de bénéficier pleinement des optimisations internes apportées par Spark.
Howdah - An Application using Pylons, PostgreSQL, Simpycity and ExceptableCommand Prompt., Inc
Aurynn Shaw
This mini-tutorial covers building a small application on Howdah, an open source, Python based web development framework by Commandprompt, Inc. We will cover the full process of designing a vertically coherent application on Howdah, integrating DB-level stored procedures, DB exception propagation through Exceptable, DB access through Simpycity, authentication through repoze.who, permissions through VerticallyChallenged, and application views through Pylons. By the end of the talk, we will have covered a full application built on The Stack, and how to cover common pitfalls in using Howdah components.
Scott Bailey
Few things we model in our databases are as complicated as time. The major database vendors have struggled for years with implementing the base data types to represent time. And the capabilities and functionality vary wildly among databases. Fortunately PostgreSQL has one of the best implementations out there. We will look at PostgreSQL's core functionality, discuss temporal extensions, modeling temporal data, time travel and bitemporal data.
Bruce Momjian
Pg_Migrator allows data to be transfered between major Postgres versions
without a dump/restore. This talk explains the internal workings of
pg_migrator and includes a pg_migrator demonstration
Adrian Klaver
An exploration of various Python projects (PyRTF,ReportLab,xlwt) that help with presenting your data in formats (rtf,pdf,xls) that other people want. I will step through a simple data extraction and conversion process using the above software to create an RTF,PDF and XLS file respectively.
PostgreSQL, Extensible to the Nth Degree: Functions, Languages, Types, Rules,...Command Prompt., Inc
Jeff Davis
I'll be showing how the extensible pieces of PostgreSQL fit together to give you the full power of native functionality -- including performance. These pieces, when combined, make PostgreSQL able to do almost anything you can imagine. A variety add-ons have been very successful in PostgreSQL merely by using this extensibility. Examples in this talk will range from PostGIS (a GIS extension for PostgreSQL) to DBI-Link (manage any data source accessible via perl DBI).
Jeff Davis
UNIQUE indexes have long held a unique position among constraints: they are the only way to express a constraint that two tuples in a table conflict without resorting to triggers and locks (which severely impact performance). But what if you want to impose the constraint that one person can't be in two places at the same time? In other words, you have a schedule, and you want to be sure that two periods of time for the same person do not overlap. This is nearly impossible to do efficiently with the current version of PostgreSQL -- and most other database systems. I will be presenting Generalized Index Constraints, which is being submitted for inclusion in the next PostgreSQL release, along with the PERIOD data type (available now from PgFoundry). I will show how these can, together, offer a fast, scalable, and highly concurrent solution to a very common business requirement. A business requirement is still a requirement even if your current database system can't do it!
Implementing the Future of PostgreSQL Clustering with TungstenCommand Prompt., Inc
Robert Hodges
Users have traditionally used database clusters to solve database availability and performance requirements. However, clustering requirements are changing as hardware improvements make performance concerns obsolete for many users. In this talk I will discuss how the Tungsten project uses master/slave replication, group communications, and rules processing to develop easy-to-manage database clusters that solve database availability, protect data, and address hardware utilization. Our implementation is based on existing PostgreSQL capabilities like Londiste and WAL shipping, which we eventually plan to replace with our own log-based replication. Come see the future of database clustering with Tungsten!
Josh Berkus
Most users know that PostgreSQL has a 23-year development history. But did you know that Postgres code is used for over a dozen other database systems? Thanks to our liberal licensing, many companies and open source projects over the years have taken the Postgres or PostgreSQL code, changed it, added things to it, and/or merged it into something else. Illustra, Truviso, Aster, Greenplum, and others have seen the value of Postgres not just as a database but as some darned good code they could use. We'll explore the lineage of these forks, and go into the details of some of the more interesting ones.
Joshua D. Drake
Are you tired of not having a real solution for PITR? Enter PITRTools, a single and secure solution for using Point In Time Recovery for PostgreSQL.
Selena Deckelmann
Bucardo is a mature replication system for PostgreSQL that supports asynchronous replication for both master-slave and multi-master systems. Originally designed for slow and unreliable networks, it has remarkable recovery ability, an easy to use command-line interface and development is active! Uses for Bucardo include: a slave read-only database, multi-master replication, data warehousing and just having fun moving your data around! Will include overview replication for PostgreSQL in general, a tour of features, a basic configuration walk through and the much feared live demo!
Matt Smiley
This is a basic primer aimed primarily at developers or DBAs new to Postgres. The format is a Q/A style tour with examples, based on common questions and pitfalls. Begin with a quick tour of relevant parts of the postgres catalog, with an aim to answer simple but important questions like:
How many rows does the optimizer think my table has?
When was it last analyzed?
Which other tables also have a column named "foo"?
How often is this index used?
Rod Anderson
For the small business support person being able to provide PostgreSQL hosting for a small set of specific applications without having to build and support several Pg installations is necessary. By building a multi-tenant Pg cluster with one tenant per database and each application in it's own schema maintenance and support is much simpler. The issues that present themselves are how to provide and control dba and user access to the database and get the applications into their own schema. With this comes need to make logging in to the database (pg_hba.conf) as non-complex as possible.
Josh Berkus
You've heard that PostgreSQL is the highest-performance transactional open source database, but you're not seeing it on YOUR server. In fact, your PostgreSQL application is kind of poky. What should you do? While doing advanced performance engineering for really high-end systems takes years to learn, you can learn the basics to solve performance issues for 80% of PostgreSQL installations in less than an hour. In this session, you will learn: -- The parts of database application performance -- The performance setup procedure -- Basic troubleshooting tools -- The 13 postgresql.conf settings you need to know -- Where to look for more information.
Brent Friedman
- Three hour workshop format, 30 minutes or so on history, trends, quick review of normalization, 45 minutes on a normalization walk-through including everyone, then break into small teams (3-6 ppl) to do a 'normalization challenge' on a real world practical problem
Brent Friedman
- Three hour workshop format, 30 minutes or so on history, trends, quick review of normalization, 45 minutes on a normalization walk-through including everyone, then break into small teams (3-6 ppl) to do a 'normalization challenge' on a real world practical problem
2. About this presentation
The master source for these slides is
http://www.2ndquadrant.com/talks
Slides are released under the Creative Commons Attribution
3.0 United States License
http://creativecommons.org/licenses/by/3.0/us
Greg Smith Database Hardware Benchmarking
3. Why should you always benchmark your hardware?
Many useful tests will only run when the server isn’t being
used yet
Software stacks are complicated
Spending money on upgrades only helps if you upgrade the
right thing usefully
Vendors lie
Greg Smith Database Hardware Benchmarking
4. Systematic Benchmarking
Memory
CPU
Disk
Database server
Application
Greg Smith Database Hardware Benchmarking
5. Memory Tests
DOS: memtest86+ (also on many Linux CDs, like the Ubuntu
installer)
Windows: SiSoftware Sandra
UNIX: sysbench, stream
Greg Smith Database Hardware Benchmarking
6. sysbench memory write and CPU tests
sysbench --test=memory --memory-oper=write
--memory-block-size=64MB --memory-total-size=1024MB
run
sysbench --test=cpu run
Greg Smith Database Hardware Benchmarking
7. stream memory read test
wget http://www.cs.virginia.edu/stream/FTP/Code/stream.c
gcc -O stream.c -o stream
./stream
Greg Smith Database Hardware Benchmarking
10. Oracle Calling Center OLTP Benchmark
http://it.anandtech.com/IT/showdoc.aspx?i=3769&p=4
Greg Smith Database Hardware Benchmarking
11. Sources for slow memory results
Single channel RAM/slot mistakes
Incorrect SPD/timing/voltage
Bad RAM/CPU multiplier combination
Poor quality RAM
Greg Smith Database Hardware Benchmarking
12. PostgreSQL and the CPU
PostgreSQL uses only a single CPU per query
Queries executing against cached data will bottleneck on CPU
COPY is CPU intensive
Greg Smith Database Hardware Benchmarking
13. CPU Tests
Windows: SiSoftware Sandra
UNIX: sysbench CPU test
Custom test with timing and generate series
pgbench select-only on small database
Vary pgbench client count to test single or multiple CPUs
Greg Smith Database Hardware Benchmarking
15. Sources for slow CPU results
Slow memory
Power management throttling
Linux: /proc/cpuinfo shows 1000MHz suggests you need to
adjust the CPUFreq Governor to “performance”
Greg Smith Database Hardware Benchmarking
16. Disk Tests
Sequential write: INSERT, COPY FROM (when not CPU
limited)
Sequential read: SELECT * FROM and similar table
sequential scans
Seeks: SELECT using index, UPDATE
Commit fsync rate: INSERT, UPDATE
Greg Smith Database Hardware Benchmarking
17. dd test
Compute 2X the size of your RAM in 8KB blocks
blocks = 250,000 * gigabytes of RAM
time sh -c "dd if=/dev/zero of=bigfile bs=8k count=X &&
sync"
time dd if=bigfile of=/dev/null bs=8k
Watch vmstat and/or iostat during disk tests
vmstat’s bi and bo will match current read/write rate
Note the CPU percentage required to reach the peak rate
Greg Smith Database Hardware Benchmarking
18. bonnie++
./bonnie++
bon csv2html
Ignore the per-character and create results, look at the block
output/input ones
Random Seeks:
The test runs SeekProcCount processes (default 3) in parallel,
doing a total of 8000 random seek reads to locations in the
file. In 10% of cases, the block read is changed and written
back.
Greg Smith Database Hardware Benchmarking
19. bonnie++ ZCAV
./zcav -f/dev/sda > t500
Must get a recent version of bonnie++ for ZCAV to scale
properly for TB drives (1.03e works)
ZCAV on experimental branch (1.96) gave useless results for
me
Download somewhat broken gnuplot script sample and typical
results from:
http://www.coker.com.au/bonnie++/zcav/results.html
Greg Smith Database Hardware Benchmarking
20. Improved bonnie++ ZCAV gnuplot script
unset autoscale x
set autoscale xmax
unset autoscale y
set autoscale ymax
set xlabel "Position GB"
set ylabel "MB/s"
set key right bottom
set terminal png
set output "zcav.png"
plot "raid0" title "7200RPM RAID 0 3 Spindles",
"single" title "7200RPM Single Drive"
Greg Smith Database Hardware Benchmarking
25. Sample sysbench random read results
Read 78.125Mb Written 0b
Total transferred 78.125Mb (1.0059Mb/sec)
128.75 Requests/sec executed
That’s 128.75 seeks/second over 10GB, resulting in a net
throughput of 128.75 * 8KB/s = 1.01MB/s
Consider both the size of the disk used and the number of
clients doing seeks
Greg Smith Database Hardware Benchmarking
26. More customizable seek tests
bonnie++ experimental (currently at 1.96)
iozone
fio
Windows: HD Tune does everything but commit rate
Greg Smith Database Hardware Benchmarking
27. Sources for slow disk results
Poor mapping to underlying hardware
Buggy driver
Insufficient bandwidth to storage
Bottlenecking at CPU/memory limits
Bad performing filesystem or filesystem misaligned with stripe
sizes
Writes faster than reads? Probably low read-ahead settings
somewhere.
Vibration: don’t shout at your JBODs! They don’t like it!
http://it.toolbox.com/blogs/database-soup/the-problem-with-iscsi-30602
http://blog.endpoint.com/2008/09/filesystem-io-what-we-presented.html
http://www.youtube.com/watch?v=tDacjrSCeq4
Greg Smith Database Hardware Benchmarking
28. fsync tests
sysbench --test=fileio --file-fsync-freq=1 --file-num=1
--file-total-size=16384 --file-test-mode=rndwr run
| grep "Requests/sec"
pgbench insert-only test
PostgreSQL contrib/test fsync might work, but isn’t really
reliable
Greg Smith Database Hardware Benchmarking
29. Sample laptop disk specification
ST9320423AS Momentus 7200.4 320GB
7200 RPM
16MB Cache
Average seek: 11ms read/13ms write
Average rotational latency: 4.17ms
Greg Smith Database Hardware Benchmarking
31. IOPS Calculators and Info
http://www.wmarow.com/strcalc/
http://www.dbasupport.com/oracle/ora10g/disk_IO_02.shtml
http://storageadvisors.adaptec.com/2007/03/20/sata-iops-measurement/
Greg Smith Database Hardware Benchmarking
32. Sample disk results
Disk Seq Seq bonnie++ Read-only Commit Drive
Count Read Write seeks seeks Rate Model
1 71 58 232 @ 4GB 194 @ 4GB 105/s 7200.4
1 59 54 177 @ 16GB 56 @ 100GB 10212/s WD160
3 125 119 371 @ 16GB 60 @ 100GB 10855/s RAID0
Commit rate for 7200.4 laptop drive is 1048/s with unsafe
volatile write cache
Non-laptop drives include a 256MB battery-backed write
cache, Linux SW RAID
Greg Smith Database Hardware Benchmarking
33. Custom PostgreSQL tests
Quick CPU test (about 1140ms on my laptop):
timing
select sum(generate series) from
generate series(1,1000000);
Quick insert/plan test:
timing
CREATE TABLE test (id INTEGER PRIMARY KEY);
INSERT INTO test VALUES (generate series(1,100000));
EXPLAIN ANALYZE SELECT COUNT(*) FROM test;
Greg Smith Database Hardware Benchmarking
34. Benchmarking functions
http://justatheory.com/computers/databases/postgresql/benchmarking_upc_validation.html
http://justatheory.com/computers/databases/postgresql/benchmarking_functions.html
pg stat user functions handles this specific job in 8.4
The general technique is applicable for all sorts of custom
benchmarks
Greg Smith Database Hardware Benchmarking
35. pgbench select only test
set naccounts 100000 * :scale
setrandom aid 1 :naccounts
SELECT abalance FROM accounts WHERE aid = :aid;
Greg Smith Database Hardware Benchmarking
36. pgbench custom test: insert only, to measure commit rate
set nbranches :scale
set ntellers 10 * :scale
set naccounts 100000 * :scale
setrandom aid 1 :naccounts
setrandom aid 1 :naccounts
setrandom bid 1 :nbranches
setrandom tid 1 :ntellers
setrandom delta -5000 5000
BEGIN
INSERT INTO history (tid, bid, aid, delta, mtime)
VALUES (:tid, :bid, :aid, :delta, CURRENT TIMESTAMP);
END;
Greg Smith Database Hardware Benchmarking
37. pgbench standard test
BEGIN;
UPDATE accounts SET abalance = abalance + :delta
WHERE aid = :aid;
SELECT abalance FROM accounts WHERE aid = :aid;
UPDATE tellers SET tbalance = tbalance + :delta
WHERE tid = :tid;
UPDATE branches SET bbalance = bbalance + :delta
WHERE bid = :bid;
INSERT INTO history (tid, bid, aid, delta, mtime)
VALUES (:tid, :bid, :aid, :delta, CURRENT TIMESTAMP);
END;
Greg Smith Database Hardware Benchmarking
38. pgbench-tools
Available from git.postgresql.org and github
Set of shell scripts to automate running many pgbench tests
Results are saved into a database for analysis
Saves TPS, latency, background writer statistics
Most of the cool features require PostgreSQL 8.3
Inspired by the dbt2 benchmark framework
Greg Smith Database Hardware Benchmarking
39. Server configuration for pgbench results
Quad-Core Intel Q6600
8GB DDR2-800 RAM
Areca ARC-1210 SATA II PCI-e x8 RAID controller, 256MB
write cache
DB: 3x640GB Western Digital SATA disks, short-stroked,
Linux software RAID-0
WAL: 160GB Western Digital SATA disk
CentOS 5.4, Linux Kernel 2.6.18-164.11.1.el5xen x86 64
OS on separate disk
Untuned ext3 filesystems
Greg Smith Database Hardware Benchmarking
45. What should you do?
Trust no one
Don’t start on application benchmarks until you’ve proven
basic performance
Don’t spend too long on basic performance if you can switch
to application benchmarks
Vendors alternate among lying, misunderstanding what you
want, and trying to make you feel dumb
Use simple, standard tools whenever possible to minimize
vendor disputes
Be prepared to translate to your vendor’s language and
subvert their agenda
Never spend real money on hardware unless you can return it
if it sucks
Greg Smith Database Hardware Benchmarking
46. Credits
Special thanks to:
Mark Wong and the other contributors to dbt-2
Truviso for providing some of the test hardware included here
Greg Smith Database Hardware Benchmarking
47. For more information...
Some slides are taken from the upcoming book:
“High Performance PostgreSQL” by Greg Smith
Planned for release this summer by Packt Publishing
Performance tuning of PostgreSQL 8.1 through 9.0, from
hardware to scaling via replication
Greg Smith Database Hardware Benchmarking
48. Questions?
Remember that your coffee break awaits you
Greg Smith Database Hardware Benchmarking