This document provides an introduction to pgbench, a benchmarking tool for PostgreSQL. It discusses pgbench's history and origins, how to generate databases of different scales for testing, the database schema and scripting language used, and how to run standard, read-only, and custom tests. The author also analyzes results from tests at different database scales and client counts, discusses warm vs. cold cache performance, and provides tips for thorough benchmarking.
This presentation is about advanced indexing in PostgreSQL. It guides you through basic concepts as well as advanced techniques to speed up the database.
All important PostgreSQL index types explained: btree, gin, gist, sp-gist, and hash.
Regular expression indexes and LIKE queries are also covered.
This presentation is primarily focused on how to use collectd (http://collectd.org/) to gather data from the Postgres statistics tables. Examples of how to use collectd with Postgres will be shown. There is some hackery involved to make collectd do a little more and collect more meaningful data from Postgres. These small patches will be explored. A small portion of the discussion will be about how to visualize the data.
Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024 (VictoriaMetrics)
This presentation covers the following topics:
What is logging?
The purpose of logging: Debugging
The purpose of logging: Security
The purpose of logging: Stats & analytics
Traditional logging
Traditional logging: Advantages
Traditional logging: Disadvantages
The solution: Large-scale logging
Large-scale logging: Core principles
Large-scale logging: Solution types
Large-scale logging: Cloud vs on-prem
Large-scale logging: Operational complexity
Large-scale logging: Security
Large-scale logging: Costs
Large-scale logging: On-prem comparison
- Elasticsearch
- Grafana Loki
- VictoriaLogs
On-prem comparison: Setup and operation
On-prem comparison: Costs
On-prem comparison: Full-text search support
On-prem comparison: How to efficiently query 100TB of logs?
On-prem comparison: Integration with CLI tools
VictoriaLogs for large-scale logging
VictoriaLogs demo instance
- Ingestion rate: 3600 messages / minute
- The number of log messages: 1.1 billion
- Uncompressed log messages’ size: 1.5TB
- Compressed log messages’ size: 23GB
- Compression ratio: 47x
- Memory usage: 150MB
VictoriaLogs CLI integration demo
- Which errors have occurred in all the apps during the last hour?
- How many errors have occurred during the last hour?
- Which apps generated the most errors during the last hour?
- The number of per-minute errors for the last 10 minutes
- Status codes for the last hour
- Non-200 status codes for the last week
- Top client IPs for the last 4 weeks with 404 and 500 response status codes
- Per-month stats for the given IP across all the logs
Large-scale logging solution MUST provide excellent CLI integration
VictoriaLogs: (temporary) drawbacks
VictoriaLogs: Recap
- Easy to set up and operate
- The lowest RAM usage and disk space usage (up to 30x less than Elasticsearch and Grafana Loki)
- Fast full-text search
- Excellent integration with traditional command-line tools for log analysis
- Accepts logs from popular log shippers (Filebeat, Fluentbit, Logstash, Vector, Promtail, Grafana Agent)
- Open source and free to use!
MySQL Performance Schema in Action: the Complete Tutorial (Sveta Smirnova)
Performance Schema is a powerful diagnostic instrument for:
- Query performance
- Complicated locking issues
- Memory leaks
- Resource usage
- Problematic behavior, caused by inappropriate settings
- More
It comes with hundreds of options which allow precisely tuning what to instrument. More than 100 consumers store collected data.
In this tutorial we will try all the important instruments out. We will provide a test environment and a few typical problems that could hardly be solved without Performance Schema. You will not only learn how to collect and use this information, but also gain hands-on experience with it.
Presented at Percona Live Frankfurt 2018: https://www.percona.com/live/e18/sessions/mysql-performance-schema-in-action-the-complete-tutorial
Performance Schema is a powerful diagnostic instrument for:
- Query performance
- Complicated locking issues
- Memory leaks
- Resource usage
- Problematic behavior, caused by inappropriate settings
- More
It comes with hundreds of options which allow precisely tuning what to instrument. More than 100 consumers store collected data.
In this tutorial, we will try all the important instruments out. We will provide a test environment and a few typical problems which could be hardly solved without Performance Schema. You will not only learn how to collect and use this information but have experience with it.
Tutorial at Percona Live Austin 2019
Have you ever wondered about the internal structure of indexes in Postgres? What's the difference between a btree and a gin index? How are indexes searched? This talk covers exactly that.
There are many ways to run high availability with PostgreSQL. Here, we present a template for you to create your own customized high-availability solution using Python and, for maximum accessibility, a distributed configuration store like ZooKeeper or etcd.
Distributed Point-in-Time Recovery with Postgres | PGConf.Russia 2018 | Eren ... (Citus Data)
Postgres has a nice feature called Point-in-Time Recovery (PITR) that allows you to go back in time. In this talk, we will discuss the use cases of PITR and how to prepare your database for PITR by setting up good base backup and WAL shipping, with some examples. We will expand the discussion to how to achieve PITR if you have a distributed and sharded Postgres setup, mentioning challenges such as clock differences and ways to overcome them, such as two-phase commit and pg_create_restore_point.
Talk for PerconaLive 2016 by Brendan Gregg. Video: https://www.youtube.com/watch?v=CbmEDXq7es0 . "Systems performance provides a different perspective for analysis and tuning, and can help you find performance wins for your databases, applications, and the kernel. However, most of us are not performance or kernel engineers, and have limited time to study this topic. This talk summarizes six important areas of Linux systems performance in 50 minutes: observability tools, methodologies, benchmarking, profiling, tracing, and tuning. Included are recipes for Linux performance analysis and tuning (using vmstat, mpstat, iostat, etc), overviews of complex areas including profiling (perf_events), static tracing (tracepoints), and dynamic tracing (kprobes, uprobes), and much advice about what is and isn't important to learn. This talk is aimed at everyone: DBAs, developers, operations, etc, and in any environment running Linux, bare-metal or the cloud."
PostgreSQL 9.6 introduced wait events and PostgreSQL 10 progressed them, but what are they? What do they mean? How do I find them and how do I make them go away? Wait events are one of the most significant advancements in observability for PostgreSQL databases; their usefulness is unparalleled in troubleshooting performance. This talk will go into all that and more as we explore the world of PostgreSQL wait events.
Materialized Views and Secondary Indexes in Scylla: They Are Finally Here! (ScyllaDB)
Materialized Views and Secondary Indexes are finally ready for prime time and are going GA. In this talk we will cover the unique aspects of the Scylla implementation and what you can expect to do with it.
This deck was used by Devrim at pgDay Asia 2017. He talked about some important facts about WAL (transaction logs, or xlogs) in PostgreSQL. Some of these can really come in handy on a bad day.
Redis in a Multi Tenant Environment – High Availability, Monitoring & Much More! (Redis Labs)
Running any application in a multi-tenant environment poses its challenges. This talk focuses on how we at Rackspace run Redis in a multi-tenant environment, ensuring security, performance, fault tolerance and high availability. This talk will cover: an architecture deep dive of multi-tenant Redis on the cloud, management of sentinels, monitoring and operations of a large-scale Redis deployment, introducing new Redis versions, scaling, security, and some lessons learnt. The target audience for this talk is anyone who is interested in the deployment/operational aspects of running Redis. This is relevant not only for those who want to run Redis themselves, but also those interested in how a Redis provider might be doing it for them.
My talk for "MySQL, MariaDB and Friends" devroom at Fosdem on February 2, 2019
Born in 2010 in MySQL 5.5.3 as "a feature for monitoring server execution at a low level," and grown in 5.6 with performance fixes and DBA-facing features, Performance Schema in MySQL 5.7 is a mature tool, used by humans and more and more monitoring products. It has become more popular over the years. In this talk I will give an overview of Performance Schema, focusing on its tuning, performance, and usability.
Performance Schema helps to troubleshoot query performance, complicated locking issues, memory leaks, resource usage, problematic behavior caused by inappropriate settings, and much more. It comes with hundreds of options which allow precisely tuning what to instrument. More than 100 consumers store collected data.
Performance Schema is a potent tool, and very complicated at the same time. It does not affect performance in most cases, but can slow down the server dramatically if configured without care. It collects a lot of data, and sometimes this data is hard to read.
This talk will start with an introduction to how Performance Schema is designed, so you will understand why it slows down the server in some cases and does not affect your queries in others. Then we will discuss which information you can retrieve from Performance Schema and how to do it effectively.
I will cover its companion sys schema and graphical monitoring tools.
This presentation covers all aspects of PostgreSQL administration, including installation, security, file structure, configuration, reporting, backup, daily maintenance, monitoring activity, disk space computations, and disaster recovery. It shows how to control host connectivity, configure the server, find the query being run by each session, and find the disk space used by each database.
Robert Haas
Why does my query need a plan? Sequential scan vs. index scan. Join strategies. Join reordering. Joins you can't reorder. Join removal. Aggregates and DISTINCT. Using EXPLAIN. Row count and cost estimation. Things the query planner doesn't understand. Other ways the planner can fail. Parameters you can tune. Things that are nearly always slow. Redesigning your schema. Upcoming features and future work.
PostgreSQL Portland Performance Practice Project - Database Test 2 Howto (Mark Wong)
Fourth presentation in a speaker series sponsored by the Portland State University Computer Science Department. The series covers PostgreSQL performance with an OLTP (on-line transaction processing) workload called Database Test 2 (DBT-2). This presentation is a set of examples to go along with the live presentation given on March 12, 2009.
AWS Webcast - Achieving consistent high performance with Postgres on Amazon W... (Amazon Web Services)
Postgres is a popular relational database and is the backend of a number of high-traffic applications. Join AWS and PalominoDB, the company that helped the Obama for America campaign optimize its database infrastructure on AWS, to learn how you can run high-throughput, I/O-intensive Postgres clusters on the Amazon EBS storage platform. We will go over best practices including performance, durability and optimization related to deploying Postgres on AWS.
You will hear about the best practices learned and applied for the Obama for America campaign.
In this webinar, you will learn about:
- Amazon Elastic Block Store (EBS)
- Why Provisioned IOPS volumes fit the needs of high I/O intensive applications
- Best practices for deploying Postgres on AWS
- How to leverage Provisioned IOPS volumes for Postgres
Architecture for building scalable and highly available Postgres Cluster (Ashnikbiz)
As PostgreSQL has made its way into business-critical applications, many customers who are using Oracle RAC for high availability and load balancing have asked for similar functionality when using PostgreSQL.
In this Hangout session we discuss architectures and alternatives, based on real-life experience, for achieving high availability and load balancing when you deploy PostgreSQL. We will also present some of the key tools and how to deploy them to make this architecture effective.
NEW LAUNCH! Introducing PostgreSQL compatibility for Amazon Aurora (Amazon Web Services)
After we launched Amazon Aurora, a cloud-native relational database with region-wide durability, high availability, fast failover, up to 15 read replicas, and up to five times the performance of MySQL, many of you asked us whether we could deliver the same features - but with PostgreSQL compatibility. We are now delivering a preview of Amazon Aurora with this functionality: we have built a PostgreSQL-compatible edition of Amazon Aurora, sharing the core Amazon Aurora innovations with the object-oriented capabilities, language interfaces, JSON compatibility, ANSI SQL:2008 compliance, and broad functional richness of PostgreSQL. Amazon Aurora will provide full PostgreSQL compatibility while delivering more than twice the performance of the community PostgreSQL database on many workloads. At this session, we will be discussing the newest addition to Amazon Aurora in detail.
Scott Callaghan from the Southern California Earthquake Center presented this deck in a recent Blue Waters Webinar.
"I will present an overview of scientific workflows. I'll discuss what the community means by "workflows" and what elements make up a workflow. We'll talk about common problems that users might be facing, such as automation, job management, data staging, resource provisioning, and provenance tracking, and explain how workflow tools can help address these challenges. I'll present a brief example from my own work with a series of seismic codes showing how using workflow tools can improve scientific applications. I'll finish with an overview of high-level workflow concepts, with an aim to preparing users to get the most out of discussions of specific workflow tools and identify which tools would be best for them."
Watch the video: http://wp.me/p3RLHQ-gtH
Sign up for our insideHPC Newsletter: http://insidehpc.com/newsletter
Josh Berkus
You've heard that PostgreSQL is the highest-performance transactional open source database, but you're not seeing it on YOUR server. In fact, your PostgreSQL application is kind of poky. What should you do? While doing advanced performance engineering for really high-end systems takes years to learn, you can learn the basics to solve performance issues for 80% of PostgreSQL installations in less than an hour. In this session, you will learn:
- The parts of database application performance
- The performance setup procedure
- Basic troubleshooting tools
- The 13 postgresql.conf settings you need to know
- Where to look for more information
Talk given at PHP Conference 2018 about advanced PostgreSQL SQL features that drastically increase performance and simplify application development.
How to analyze execution plans and statistics in PostgreSQL:
- Tracking down slow queries
- Using EXPLAIN
- Access methods
- Joins
- Parameters relevant to the optimizer
Traditionally, database systems were optimized either for OLAP or for OLTP workloads. Mainstream DBMSes like Postgres, MySQL, ... are mostly used for OLTP, while Greenplum, Vertica, Clickhouse, SparkSQL, ... are oriented toward analytic queries. But right now many companies do not want to have two different data stores for OLAP/OLTP and need to perform analytic queries on the most recent data. I want to discuss which features should be added to Postgres to efficiently handle HTAP workloads.
Right now Postgres can't compress its data in many situations, and that sometimes leads to storage overhead an order of magnitude higher compared with commercial DBMSes. A common viewpoint is that this task can be accomplished by file-system-level compression, but the most popular and well-tested Linux file systems can't do that. I will talk about our patches that implement page compression on disk or on disk + in memory; in what situations it is better to use which kind of compression; and also discuss the experience of using compression in production.
Uncover the hidden challenges that plague production environments in this eye-opening session. Join us as we explore the five most common performance problems that emerge in live systems. Gain invaluable insights into detecting these issues early on, before they wreak havoc on your operations. Discover practical solutions that empower you to address these challenges head-on, ensuring optimal performance and seamless user experiences.
Speedrunning the Open Street Map osm2pgsql Loader (GregSmith458515)
The Open Street Map project provides invaluable data that keeps driving users toward the PostGIS and PostgreSQL stacks. Loading today’s full Planet data set takes a 120GB XML file and unrolls it into over a terabyte of database data. Crunchy’s benchmark labs have followed the expansion of that Planet data over the last six database releases, as the re-ignition of the CPU wars combined with parallel execution features landing in the database. We’ll take a look at that data evolution, which server configurations worked, and which metrics techniques still matter in the all SSD era.
Adaptive Query Execution: Speeding Up Spark SQL at Runtime (Databricks)
Over the years, there has been extensive and continuous effort on improving Spark SQL’s query optimizer and planner, in order to generate high quality query execution plans. One of the biggest improvements is the cost-based optimization framework that collects and leverages a variety of data statistics (e.g., row count, number of distinct values, NULL values, max/min values, etc.) to help Spark make better decisions in picking the most optimal query plan.
Are you building a high-throughput, low-latency application? Are you trying to figure out the perfect JVM heap size? Are you struggling to choose the right garbage collection algorithm and settings? Are you striving to achieve pauseless GC? Do you know the right tools & best practices to tame the GC? Do you know how to troubleshoot memory problems using GC logs? You will get complete answers to several such questions in this presentation.
Web Performance Madness - brightonSEO 2018 (Bastian Grimm)
My talk from brightonSEO 2018 covering various web performance strategies, this time mainly focussing on critical rendering path, various image optimisation strategies as well as how to handle custom web fonts.
Talk by Brendan Gregg for USENIX LISA 2019: Linux Systems Performance. Abstract: "
Systems performance is an effective discipline for performance analysis and tuning, and can help you find performance wins for your applications and the kernel. However, most of us are not performance or kernel engineers, and have limited time to study this topic. This talk summarizes the topic for everyone, touring six important areas of Linux systems performance: observability tools, methodologies, benchmarking, profiling, tracing, and tuning. Included are recipes for Linux performance analysis and tuning (using vmstat, mpstat, iostat, etc), overviews of complex areas including profiling (perf_events) and tracing (Ftrace, bcc/BPF, and bpftrace/BPF), and much advice about what is and isn't important to learn. This talk is aimed at everyone: developers, operations, sysadmins, etc, and in any environment running Linux, bare metal or the cloud."
(DAT402) Amazon RDS PostgreSQL: Lessons Learned & New Features (Amazon Web Services)
Learn the specifics of Amazon RDS for PostgreSQL’s capabilities and extensions that make it powerful. This session begins with a brief overview of the RDS PostgreSQL service, how it provides High Availability & Durability and will then deep dive into the new features that we have released since re:Invent 2014, including major version upgrade and newly added PostgreSQL extensions to RDS PostgreSQL. During the session, we will also discuss lessons learned running a large fleet of PostgreSQL instances, including specific recommendations. In addition we will present benchmarking results looking at differences between the 9.3, 9.4 and 9.5 releases.
WebSocket is cool, and you probably already played with it. But it’s just a transport technology. If you have thousands of client connections you need to do lots of improvements to make it scalable, reliable and achieve high performance. You need to implement many things on top of it.
We are building a financial data streaming platform for thousands of traders using WebSocket. I'm going to share my experience and cover such techniques as delta delivery, conflation, dynamic throttling, bandwidth and frequency limitation, and others. I will also do a live demo of how to build a scalable WebSocket backend from scratch using Java and Spring.
This talk is divided into multiple parts, starting with a basic introduction to the popular open-source database management software PostgreSQL.
Building upon that, operational aspects of a PostgreSQL cluster instance will be discussed, with a focus on what can, should and must be monitored, and what kind of tools and Nagios-related plugins are available.
The final part of the talk gives an insight into the complex and distributed nature of the postgresql.org infrastructure and how Nagios plays a crucial role as the central monitoring and alerting platform for the sysadmin team.
Defining Your Goal: Starting Your Own Business (Joshua Drake)
Joshua Drake will still go through the "business" steps of starting your own business, but will focus on the real decisions and sacrifices required that a pure business standpoint will not get into. Self improvement/discovery and focusing on your entrepreneurial nature will be the core theme.
A 45-minute talk discussing various in-production performance enhancements for PostgreSQL. We touch on hard drives, including SSD and RAID. We also discuss memory, PostgreSQL settings, and various other topics such as DAS vs NAS vs SAN.
Epistemic Interaction - tuning interfaces to provide information for AI support (Alan Dix)
Paper presented at SYNERGY workshop at AVI 2024, Genoa, Italy. 3rd June 2024
https://alandix.com/academic/papers/synergy2024-epistemic/
As machine learning integrates deeper into human-computer interactions, the concept of epistemic interaction emerges, aiming to refine these interactions to enhance system adaptability. This approach encourages minor, intentional adjustments in user behaviour to enrich the data available for system learning. This paper introduces epistemic interaction within the context of human-system communication, illustrating how deliberate interaction design can improve system understanding and adaptation. Through concrete examples, we demonstrate the potential of epistemic interaction to significantly advance human-computer interaction by leveraging intuitive human communication strategies to inform system design and functionality, offering a novel pathway for enriching user-system engagements.
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do... (UiPathCommunity)
💥 Speed, accuracy, and scaling – discover the superpowers of GenAI in action with UiPath Document Understanding and Communications Mining™:
See how to accelerate model training and optimize model performance with active learning
Learn about the latest enhancements to out-of-the-box document processing – with little to no training required
Get an exclusive demo of the new family of UiPath LLMs – GenAI models specialized for processing different types of documents and messages
This is a hands-on session specifically designed for automation developers and AI enthusiasts seeking to enhance their knowledge in leveraging the latest intelligent document processing capabilities offered by UiPath.
Speakers:
👨🏫 Andras Palfi, Senior Product Manager, UiPath
👩🏫 Lenka Dulovicova, Product Program Manager, UiPath
Key Trends Shaping the Future of Infrastructure.pdf (Cheryl Hung)
Keynote at DIGIT West Expo, Glasgow on 29 May 2024.
Cheryl Hung, ochery.com
Sr Director, Infrastructure Ecosystem, Arm.
The key trends across hardware, cloud and open-source; exploring how these areas are likely to mature and develop over the short and long-term, and then considering how organisations can position themselves to adapt and thrive.
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality (Inflectra)
In this insightful webinar, Inflectra explores how artificial intelligence (AI) is transforming software development and testing. Discover how AI-powered tools are revolutionizing every stage of the software development lifecycle (SDLC), from design and prototyping to testing, deployment, and monitoring.
Learn about:
• The Future of Testing: How AI is shifting testing towards verification, analysis, and higher-level skills, while reducing repetitive tasks.
• Test Automation: How AI-powered test case generation, optimization, and self-healing tests are making testing more efficient and effective.
• Visual Testing: Explore the emerging capabilities of AI in visual testing and how it's set to revolutionize UI verification.
• Inflectra's AI Solutions: See demonstrations of Inflectra's cutting-edge AI tools like the ChatGPT plugin and Azure Open AI platform, designed to streamline your testing process.
Whether you're a developer, tester, or QA professional, this webinar will give you valuable insights into how AI is shaping the future of software delivery.
Connector Corner: Automate dynamic content and events by pushing a button (DianaGray10)
Here is something new! In our next Connector Corner webinar, we will demonstrate how you can use a single workflow to:
Create a campaign using Mailchimp with merge tags/fields
Send an interactive Slack channel message (using buttons)
Have the message received by managers and peers along with a test email for review
But there’s more:
In a second workflow supporting the same use case, you’ll see:
Your campaign sent to target colleagues for approval
If the “Approve” button is clicked, a Jira/Zendesk ticket is created for the marketing design team
But—if the “Reject” button is pushed, colleagues will be alerted via Slack message
Join us to learn more about this new, human-in-the-loop capability, brought to you by Integration Service connectors.
And...
Speakers:
Akshay Agnihotri, Product Manager
Charlie Greenberg, Host
DevOps and Testing slides at DASA Connect (Kari Kakkonen)
Slides by me and Rik Marselis at the DASA Connect conference on 30.5.2024. We discuss what testing is, then what agile testing is, and finally what testing in DevOps is. We also had a lovely workshop with the participants, trying to find different ways to think about quality and testing in different parts of the DevOps infinity loop.
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...) (Jeffrey Haguewood)
Sidekick Solutions uses Bonterra Impact Management (fka Social Solutions Apricot) and automation solutions to integrate data for business workflows.
We believe integration and automation are essential to user experience and the promise of efficient work through technology. Automation is the critical ingredient to realizing that full vision. We develop integration products and services for Bonterra Case Management software to support the deployment of automations for a variety of use cases.
This video focuses on the notifications, alerts, and approval requests using Slack for Bonterra Impact Management. The solutions covered in this webinar can also be deployed for Microsoft Teams.
Interested in deploying notification automations for Bonterra Impact Management? Contact us at sales@sidekicksolutionsllc.com to discuss next steps.
PHP Frameworks: I want to break free (IPC Berlin 2024) (Ralf Eggert)
In this presentation, we examine the challenges and limitations of relying too heavily on PHP frameworks in web development. We discuss the history of PHP and its frameworks to understand how this dependence has evolved. The focus will be on providing concrete tips and strategies to reduce reliance on these frameworks, based on real-world examples and practical considerations. The goal is to equip developers with the skills and knowledge to create more flexible and future-proof web applications. We'll explore the importance of maintaining autonomy in a rapidly changing tech landscape and how to make informed decisions in PHP development.
This talk is aimed at encouraging a more independent approach to using PHP frameworks, moving towards a more flexible and future-proof approach to PHP development.
2. About this presentation
The master source for these slides is http://www.westnet.com/~gsmith/content/postgresql
Slides are released under the Creative Commons Attribution 3.0 United States License
See http://creativecommons.org/licenses/by/3.0/us
3. pgbench History
TPC-B: http://www.tpc.org/tpcb/
State of the art...in 1990
Models a bank with accounts, tellers, and branches
pgbench actually derived from JDBCbench
http://developer.mimer.com/features/feature_16.htm
But aimed to eliminate Java overhead
The JDBCbench code is buggy
Early pgbench versions inherited some of those bugs
4. Database scale
Each scale increment adds another branch to the database
1 branch + 10 tellers + 100000 accounts
Total database size is approximately 16MB * scale
createdb pgbench
pgbench -i -s 10
scale=10 makes for a 160MB database
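As a quick sanity check of the 16MB-per-scale rule of thumb, you can ask PostgreSQL for the size right after initialization (a minimal sketch using standard catalog functions):
psql -d pgbench -c "SELECT pg_size_pretty(pg_database_size('pgbench'));"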
5. Database schema
CREATE TABLE branches(bid int not null,
bbalance int, filler char(88));
ALTER TABLE branches add primary key (bid);
CREATE TABLE tellers(tid int not null,bid int,
tbalance int,filler char(84));
ALTER TABLE tellers add primary key (tid);
CREATE TABLE accounts(aid int not null,bid int,
abalance int,filler char(84));
ALTER TABLE accounts add primary key (aid);
CREATE TABLE history(tid int,bid int,aid int,delta int,
mtime timestamp,filler char(22));
6. Filler
Filler intended to make each row at least 100 bytes long
Fail: null data inserted into it doesn't do that
Rows are therefore very narrow
Not fixed because it would put new pgbench results on a different scale
7. Scripting language
\set nbranches :scale
\set ntellers 10 * :scale
\set naccounts 100000 * :scale
\setrandom aid 1 :naccounts
\setrandom bid 1 :nbranches
\setrandom tid 1 :ntellers
\setrandom delta -5000 5000
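These lines only set up variables; a complete custom test puts them in a file together with the SQL to run, then points pgbench at it. A minimal sketch (custom.sql is a hypothetical file name; -n skips vacuuming, -c and -t set clients and transactions per client):
pgbench -n -c 8 -t 10000 -f custom.sql pgbench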
8. Standard test
BEGIN;
UPDATE accounts SET abalance = abalance + :delta
WHERE aid = :aid;
SELECT abalance FROM accounts WHERE aid = :aid;
UPDATE tellers SET tbalance = tbalance + :delta
WHERE tid = :tid;
UPDATE branches SET bbalance = bbalance + :delta
WHERE bid = :bid;
INSERT INTO history (tid, bid, aid, delta, mtime)
VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);
END;
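This transaction is what pgbench runs by default, so the standard test needs no script file; a sketch matching the 8-client run shown later:
pgbench -c 8 -t 25000 pgbench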
9. No teller/branch test
BEGIN;
UPDATE accounts SET abalance = abalance + :delta
WHERE aid = :aid;
SELECT abalance FROM accounts WHERE aid = :aid;
INSERT INTO history (tid, bid, aid, delta, mtime)
VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);
END;
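pgbench has a built-in switch for this variant, so no custom script is needed here either; to the best of my knowledge -N skips exactly the tellers and branches updates:
pgbench -N -c 8 -t 25000 pgbench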
10. Select only test
\set naccounts 100000 * :scale
\setrandom aid 1 :naccounts
SELECT abalance FROM accounts WHERE aid = :aid;
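The select-only script is likewise built in, behind the -S flag:
pgbench -S -c 8 -t 100000 pgbench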
14. postgresql.conf parameters that matter
shared_buffers
checkpoint_segments
autovacuum
synchronous_commit + wal_writer_delay
wal_buffers
checkpoint_completion_target
wal_sync_method
http://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server
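Purely as an illustration, hypothetical starting values for those settings on hardware like the test server described later; the right numbers depend on your RAM, disks, and durability needs, and none of these come from the slides:
shared_buffers = 2GB
checkpoint_segments = 32
wal_buffers = 16MB
checkpoint_completion_target = 0.9
synchronous_commit = off   # only if losing the last few transactions in a crash is acceptable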
15. postgresql.conf parameters that don’t matter
effective_cache_size
default_statistics_target
work_mem
random_page_cost
16. Gotchas
Long enough runtime: 1 minute or 100,000 transactions minimum
Not paying attention to DB size relative to the various cache sizes
Update contention warnings not as important
pgbench client parsing and process switching limitations
vacuum and bloating
Checkpoints adding variation
Custom scripts don't detect scale
Developer PostgreSQL builds
17. pgbench-tools
Set of shell scripts to automate running many pgbench tests
Results are saved into a database for analysis
Saves TPS, latency, background writer statistics
Most of the cool features require PostgreSQL 8.3
Inspired by the dbt2 benchmark framework
18. Server configuration
Quad-Core Intel Q6600
8GB DDR2-800 RAM
Areca ARC-1210 SATA II PCI-e x8 RAID controller, 256MB write cache
DB: 3x160GB Maxtor SATA disks, Linux software RAID-0
WAL: 160GB Maxtor SATA disk
Linux Kernel 2.6.22 x86_64
Untuned ext3 filesystems
23. pgbench TPC-B Size and Client Scaling
(3D glasses not included)
24. Don’t be fooled by TPS
scaling factor: 600
number of clients: 8
number of transactions per client: 25000
number of transactions actually processed: 200000/200000
tps = 226.228995 (including connections establishing)
scaling factor: 600
number of clients: 32
number of transactions per client: 6250
number of transactions actually processed: 200000/200000
tps = 376.016694 (including connections establishing)
29. Thorough testing takes a long time
psql pgbench -c "select sum(end_time - start_time) from tests"
       sum
-----------------
 53:45:32.383059
...and even then you may not get it right!
Need to aim for an equal number of checkpoints
33. Cold vs. Warm Database Cache
Results so far have all been warm cache
Database initialization and benchmark run had no cache clearing in between
This is fairly real-world once your app has been running for a while
The situation just after a reboot will be far worse
Both are interesting benchmark numbers; neither is 'right'
34. Cold Cache Comparison
The select-only test can work as a simple way to measure real-world row+index seeks/second over some data size
Create a pgbench database with the size you want to test normally
Clear all caches: reboot, or on Linux use pg_ctl stop plus drop_caches
Run the select-only test, disabling any init/VACUUM steps (SKIPINIT feature)
Watch the bi column in vmstat to see effective seek MB/s
At the beginning of the test, you'll be fetching a lot of the index along with the table
That makes for two seeks per row until that's populated
Can look at pg_buffercache for more insight into when the index is all in RAM
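A sketch of that procedure on Linux, run as root; $PGDATA and the client/transaction counts are assumptions, not from the slides:
pg_ctl -D $PGDATA stop
sync
echo 3 > /proc/sys/vm/drop_caches    # clear the OS page cache
pg_ctl -D $PGDATA start
vmstat 1 > vmstat.log &              # watch the bi column for read throughput
pgbench -n -S -c 8 -t 100000 pgbench # -n skips vacuum/init, -S is select-only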
37. Zoom on start: low of 37 TPS
38. Database typical row query
select relname, relpages, reltuples,
pg_size_pretty(pg_relation_size(relname::regclass))
as relsize,
round(pg_relation_size(relname::regclass) / reltuples)
as "bytes per row",
round(8192 / (pg_relation_size(relname::regclass)
/ reltuples)) as rows_per_page,
round(1024*1024 / (pg_relation_size(relname::regclass)
/ reltuples)) as rows_per_mb
from pg_class where relname='accounts';
39. Using seek rate to estimate fill time
reltuples     relsize  bytes per row  rows per mb
9.99977e+06   1240 MB  130            8064
Measured vmstat bi=1.1MB/s
1.1 MB/s * 8064 rows/MB = 8870 rows/second
Estimated cache fill time:
10M rows / 8870 rows/sec = 1127 seconds = 18 minutes
Database pages/MB are normally (1024*1024)/8192 = 128
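The arithmetic above is easy to re-check; a throwaway awk sketch with values copied from the slide:
awk 'BEGIN {
  mbps = 1.1; rows_per_mb = 8064; rows = 10000000
  rps = mbps * rows_per_mb                  # about 8870 rows/second
  printf "%.0f seconds = %.1f minutes\n", rows/rps, rows/rps/60
}'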
40. Cold Cache Fill Time Measurement
Had reached near full warm cache speed, 16799 TPS, by the end of the run
select end_time - start_time as run_time from tests
00:23:27.890015
Didn't account for index overhead
Would have made the estimate above even closer to reality
41. Custom test: insert size
create table data(filler text);
insert into data (filler) values (repeat('X',:scale));
SCALES="10 100 1000 10000"
SCRIPT="insert-size.sql"
TOTTRANS=100000
SETTIMES=1
SETCLIENTS="1 2 4 8 16"
SKIPINIT=1
43. Conclusions
Pay attention to your database scale
Client scaling may not work as you expect
Automate your tests and be consistent in how you run them
Results in a database make life easier
Generating pretty graphs isn’t just for marketing
pgbench can be used for running your specific tests, too