This document provides an overview of PostgreSQL architecture, transactions, connection pooling, monitoring, and tips. It discusses:
- PostgreSQL architecture including processes like the postmaster, background writer, and WAL writer.
- Transactions and concurrency using MVCC, with snapshots of data at a point in time and increasing transaction IDs for consistency.
- Connection pooling tools like PgPool and PgBouncer that help reuse connections and lower impact on the database.
- Monitoring options including Graphite, Zabbix, Grafana, Log Insight, and specific queries for stats, sessions, replication, checkpoints, caching, and queries.
- Tips such as analyzing index usage, identifying duplicate indexes, and finding missing indexes.
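The monitoring queries mentioned above mostly read PostgreSQL's built-in statistics views; a minimal sketch (view and column names follow the standard pg_stat catalogs in PostgreSQL 10+):

```sql
-- Current sessions and what they are doing
SELECT pid, state, wait_event_type, query
FROM pg_stat_activity
WHERE state <> 'idle';

-- Cache hit ratio per database (higher is better)
SELECT datname,
       round(blks_hit * 100.0 / nullif(blks_hit + blks_read, 0), 2) AS cache_hit_pct
FROM pg_stat_database;

-- Replication lag in bytes, as seen from the primary
SELECT client_addr, state,
       pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn) AS replay_lag_bytes
FROM pg_stat_replication;
```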
JSON is an important data type for transporting data between servers and is used by many modern applications. PostgreSQL has been at the forefront of bringing these capabilities into the hands of database users, and its JSONB data type allows for faster operations on JSON documents.
At this webinar we will look at:
- How to use JSON from applications
- How to store it in the database
- How to index JSON data
- Tips and tricks to optimize usage
The webinar closes with a review of the roadmap for new PostgreSQL JSON features and JSON standards compliance.
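A minimal sketch of the storage and indexing points above (table, column, and document contents are illustrative placeholders):

```sql
-- Store documents in a JSONB column and index them for containment queries
CREATE TABLE events (
    id      bigserial PRIMARY KEY,
    payload jsonb NOT NULL
);

CREATE INDEX events_payload_gin ON events USING gin (payload);

INSERT INTO events (payload)
VALUES ('{"type": "login", "user": {"id": 42, "name": "alice"}}');

-- The @> containment operator can use the GIN index
SELECT id, payload->'user'->>'name' AS user_name
FROM events
WHERE payload @> '{"type": "login"}';
```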
[Pgday.Seoul 2021] 2. Porting Oracle UDF and Optimization - PgDay.Seoul
The document discusses porting functions from Oracle to PostgreSQL and optimizing performance, including different function types in PostgreSQL like SQL functions and PL/pgSQL functions, as well as volatility categories. It also provides examples of test data created for use in examples and covers strategies for analyzing inefficient Oracle functions and improving them to leverage the PostgreSQL optimizer.
This document provides an introduction and overview of PostgreSQL, including its history, features, installation, usage and SQL capabilities. It describes how to create and manipulate databases, tables, views, and how to insert, query, update and delete data. It also covers transaction management, functions, constraints and other advanced topics.
This document provides an introduction to MongoDB, including what it is, why it may be used, and how its data model works. Some key points:
- MongoDB is a non-relational database that stores data in flexible, JSON-like documents rather than fixed schema tables.
- It offers advantages like dynamic schemas, embedding of related data, and fast performance at large scales.
- Data is organized into collections of documents, which can contain sub-documents to represent one-to-many relationships without joins.
- Queries use JSON-like syntax to search for patterns in documents, and indexes can improve performance.
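As a sketch of the document model and query syntax described above (a mongosh session; collection and field names are illustrative):

```javascript
// A document with an embedded sub-document and an embedded array:
// one-to-many data stored inline, no join needed
db.users.insertOne({
  name: "alice",
  address: { city: "Dallas", zip: "75201" },
  orders: [{ sku: "A1", qty: 2 }, { sku: "B7", qty: 1 }]
});

// JSON-like query syntax: match on a field inside a sub-document
db.users.find({ "address.city": "Dallas" });

// An index on the queried field avoids a full collection scan
db.users.createIndex({ "address.city": 1 });
```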
PostgreSQL (or Postgres) began its life in 1986 as POSTGRES, a research project of the University of California at Berkeley.
PostgreSQL isn't just relational; it's object-relational. This gives it some advantages over other open source SQL databases such as MySQL, MariaDB, and Firebird.
This presentation is for people who want to understand how PostgreSQL shares information among processes using shared memory. Topics covered include the internal data page format, usage of the shared buffers, locking methods, and various other shared memory data structures.
This document provides an overview of how to use various Oracle performance monitoring and diagnostic tools like ASH, AWR, and SQL Monitor to analyze and troubleshoot performance issues. It begins with introductions and background on the speaker. It then demonstrates how to generate and interpret reports from these tools using the Oracle Enterprise Manager console and command line. It provides examples of querying ASH data directly and using tools like Compare ADDM and SQL Monitor. The document aims to help users quickly understand performance problems by leveraging these built-in Oracle performance diagnostics.
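Querying ASH data directly, as the document demonstrates, can be as simple as aggregating recent samples (note that `v$active_session_history` requires the Oracle Diagnostics Pack license):

```sql
-- Top wait events over the last 15 minutes, sampled from ASH
SELECT session_state, event, COUNT(*) AS samples
FROM   v$active_session_history
WHERE  sample_time > SYSDATE - INTERVAL '15' MINUTE
GROUP  BY session_state, event
ORDER  BY samples DESC;
```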
This document provides an introduction to NoSQL and MongoDB. It discusses that NoSQL is a non-relational database management system that avoids joins and is easy to scale. It then summarizes the different flavors of NoSQL including key-value stores, graphs, BigTable, and document stores. The remainder of the document focuses on MongoDB, describing its structure, how to perform inserts and searches, features like map-reduce and replication. It concludes by encouraging the reader to try MongoDB themselves.
PostgreSQL Internals (1) for PostgreSQL 9.6 (English) - Noriyoshi Shinoda
This document provides an overview of PostgreSQL internals including its process and memory architecture, storage architecture, and file formats. It discusses topics like processes and signals, shared buffers, huge pages, checkpoints, WAL logs, the database directory structure, tablespaces, visibility maps, VACUUM behavior, online backups, and key configuration files. The document is intended for engineers using PostgreSQL and aims to help them better understand its internal workings.
Spencer Christensen
There are many aspects to managing an RDBMS. Some of these are handled by an experienced DBA, but there are a good many things that any sys admin should be able to take care of if they know what to look for.
This presentation will cover basics of managing Postgres, including creating database clusters, overview of configuration, and logging. We will also look at tools to help monitor Postgres and keep an eye on what is going on. Some of the tools we will review are:
* pgtop
* pg_top
* pgfouine
* check_postgres.pl
Check_postgres.pl is a great tool that can plug into your Nagios or Cacti monitoring systems, giving you even better visibility into your databases.
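As an illustration of that Nagios integration, a command and service definition might look like the following (paths, host names, and thresholds are placeholders; `backends` is one of check_postgres.pl's documented actions):

```
define command {
    command_name    check_pg_backends
    command_line    /usr/local/bin/check_postgres.pl --action=backends \
                    --host=$HOSTADDRESS$ --dbuser=nagios \
                    --warning=80 --critical=100
}

define service {
    use                 generic-service
    host_name           db01
    service_description PostgreSQL backends
    check_command       check_pg_backends
}
```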
Presentation that I gave as a guest lecture for a summer intensive development course at nod coworking in Dallas, TX. The presentation targets beginning web developers with little to no experience in databases, SQL, or PostgreSQL. I cover the creation of a database, creating records, reading/querying records, updating records, destroying records, joining tables, and a brief introduction to transactions.
This document provides an overview of PostgreSQL, including its history, capabilities, advantages over other databases, best practices, and references for further learning. PostgreSQL is an open source relational database management system that has been in development for over 30 years. It offers rich SQL support, high performance, ACID transactions, and extensive extensibility through features like JSON, XML, and embedded programming languages.
Indexes are references to documents that are efficiently ordered by key and maintained in a tree structure for fast lookup. They improve the speed of document retrieval, range scanning, ordering, and other operations by enabling the use of the index instead of a collection scan. While indexes improve query performance, they can slow down document inserts and updates since the indexes also need to be maintained. The query optimizer aims to select the best index for each query but can sometimes be overridden.
Working with JSON Data in PostgreSQL vs. MongoDB - ScaleGrid.io
In this post, we are going to show you tips and techniques on how to effectively store and index JSON data in PostgreSQL vs. MongoDB. Learn more in the blog post: https://scalegrid.io/blog/using-jsonb-in-postgresql-how-to-effectively-store-index-json-data-in-postgresql
This document discusses building dynamic web sites using databases. It begins by explaining that truly dynamic sites have content that changes over time, is customized for users, and can be automatically generated. It recommends using a database rather than storing content in files, as databases are faster, more efficient, and easier to manage when content grows large. The document then provides an overview of key database concepts like tables, fields, queries, and the relational structure. It gives an example of how a student database might be implemented and why a database is better than flat files for such an application. Finally, it discusses MySQL as a popular open-source database and shows basic concepts like connecting to the database, selecting a database, performing queries, and extracting records.
This document summarizes the results of an OLTP performance benchmark test comparing PostgreSQL and Oracle databases. The test used HammerDB to run the same workload against each database on a server with 2x8 core CPUs and 192GB RAM. With 8 vCPUs, Oracle was 2.6% faster, used 16% less CPU, and had 9.3% more transactions per minute than PostgreSQL. When scaled to 16 vCPUs, Oracle was 3.4% faster, used 12.3% less CPU and had 22.43% more transactions per minute.
An overview of techniques for defending against SQL Injection using Python tools. This slide deck was presented at the DC Python Meetup on October 4th, 2011 by Edgar Roman, Sr Director of Application Development at PBS
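The deck's central defense, parameterized queries, is easy to demonstrate; a minimal sketch using the stdlib sqlite3 module as a stand-in for a production driver (table and data are invented for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, role TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 'admin')")

# Untrusted input attempting a classic injection
user_input = "alice' OR '1'='1"

# BAD: string formatting splices the input into the SQL text itself
vulnerable = "SELECT role FROM users WHERE name = '%s'" % user_input
print(len(conn.execute(vulnerable).fetchall()))  # injection succeeds: 1 row leaks

# GOOD: a parameterized query treats the input as a plain value
safe = conn.execute("SELECT role FROM users WHERE name = ?", (user_input,))
print(len(safe.fetchall()))  # no match: 0 rows
```

The same placeholder style (with `%s` instead of `?`) applies to drivers like psycopg2; the key point is that the value never becomes part of the SQL text.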
Introduction to Oracle Data Guard Broker - Zohar Elkayam
This is an old deck I recently renewed for a customer session. It introduces the Oracle Data Guard broker feature: how to deploy it, how to use it, and what its benefits are.
This presentation is based on version 11g, but most of it also applies to Oracle 12c.
Agenda:
- Oracle Data Guard overview
- Data Guard broker introduction
- Configuring and using the Data Guard broker
- Live Demos
This document discusses using social media for business. It explains why social media is important, noting statistics about major platforms like Facebook, Twitter, YouTube and LinkedIn. It discusses choosing the right social media platforms for a business based on its audience. Platforms like Facebook, Twitter, LinkedIn, blogs, online communities and social bookmarking are compared in terms of their suitability for customer communication, brand exposure and search engine optimization. The document also provides tips for how to engage and behave on social media to develop business skills.
This document describes Jessica Aguilera's history with tennis and how the sport has helped her grow personally and professionally. Tennis has been an important part of her life since childhood and has given her opportunities such as becoming a departmental and national champion in her country and earning an athletic scholarship. Practicing tennis has also allowed her to stay fit, relieve stress, make friends, travel, and strengthen herself physically and mentally. Tennis has also taught her values such as discipline
1. Ayurveda views neurology through the lens of Vaayu (kinetic energy) which is seen as the fundamental energy underlying all bodily functions. Vaayu manifests individually as dosha prakriti (inherent constitution) which determines an individual's functional patterns.
2. During conception, as the metaphysical person and zygote unite, dosha prakriti is established based on the panchamahabhuta patterns present, with a Vaayu-dominant stage setting the individual's permanent tridoshic balance.
3. Neurological and rheumatic conditions are seen as later stages in a unified model of pathogenesis where inflammation progresses from Jvara
Analyzuj a Proveď ("Analyze and Execute") is the trade name of a financial analysis service for companies and municipalities.
The basic purpose of Analyzuj a Proveď is to give you concrete, working tools for increasing the performance of your companies, which in turn helps you meet your personal and professional goals.
Analyzuj a Proveď is a product of Edolo Consult s.r.o.
1) The document discusses solving absolute value inequalities by isolating the absolute value, setting up two inequalities with the original inequality and its reverse, and solving each inequality.
2) Examples show setting up the inequalities, solving them individually, and combining the solutions using "and" or "or" depending on whether one or both solutions are possible.
3) One example is always true since absolute values are positive and greater than a negative number, while another is always false since absolute values cannot be less than a negative number.
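The setup described above can be made concrete with short worked examples covering the "and" case, the "or" case, and the always-true/always-false cases:

```latex
|x - 3| < 5
\;\Longrightarrow\; -5 < x - 3 < 5 \quad \text{("and": both bounds hold)}
\;\Longrightarrow\; -2 < x < 8

|x + 1| > 4
\;\Longrightarrow\; x + 1 > 4 \;\text{ or }\; x + 1 < -4 \quad \text{("or" case)}
\;\Longrightarrow\; x > 3 \;\text{ or }\; x < -5

|x| > -2 \;\text{ is always true;}\qquad |x| < -2 \;\text{ is always false.}
```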
Digital storytelling is an engaging way for teachers to involve students in lessons through hands-on projects using digital media like photos, video, and music to create stories, which can increase information retention by 80% and give students a fun, creative way to learn and express themselves. Software like PowerPoint makes it easy for students to design digital stories they can share worldwide via the web.
This Truebridge workshop can teach you how content marketing can drive brand awareness and generate sales for banks and credit unions. People are looking for answers to their financial questions. They will overwhelmingly buy from the answer provider. See how your bank or CU can become that kind of valuable resource.
The document provides details about a student's foundation production portfolio for a psychological thriller film. It includes descriptions of the codes and conventions used for elements like camerawork, editing, characters, settings, and sound. The student discusses stereotypes portrayed, the intended audience, and potential pathways for distribution. Feedback from early test viewers praised the tension and ability to keep them interested. The student also reflects on what was learned from the continuity task to the final product, including improving filming and editing skills through practice.
This document contains a list of website addresses for government, religion, news, education, social media, and search engines, covering the official websites of state institutions such as the DPR, MPR, DPD, MK, and MA, popular news sites such as detik, okezone, kompas, liputan6, and vivanews, educational sites such as e-dukasi, pesonaedu, fisikanet, chem-istry, and ixl, and social media such as Facebook, Twitter, LinkedIn, M
The document summarizes a social media listening study conducted for Coca-Cola brands Coke Zero, Dasani, and VitaminWater over 6 months. Key findings for each brand are presented along with 3 top-line strategic insights. The approach involved using Sysomos tools and custom queries to analyze conversations. For Coke Zero, events were found to be content-rich opportunities but some comments questioned aspartame safety. For Dasani, the PlantBottle was discussed positively but consumers questioned the difference from tap water. For VitaminWater, many events were identified as engagement opportunities but the brands were sometimes compared. The insights recommend optimizing media based on purchase funnel priorities and social insights for improved ROI.
Professional products and skincare and body care programs BECOS by ALFAPARF Group MILANO and Danielle Laroche, exclusively at www.laduchesse.ro
Queens of the Stone Age have maintained an engaged online presence for over 18 years through various social media platforms and websites that guide fans to purchase music and merchandise. While they do not frequently post or interact on social media, frontman Joshua Homme is known for unconventional interactions with fans. Their consistent release of high quality music and media campaigns have helped Queens of the Stone Age find and retain a dedicated global fanbase.
On Friday the 8th and Saturday the 9th we are holding a big bike sale in the shop, with bikes from Rocky Mountain.
You will be able to test lots of awesome bikes.
Prices will be fantastic, with 40 to 50% off everything from hardtails and trail bikes to downhill bikes.
The document discusses tuning autovacuum in PostgreSQL. It provides an overview of autovacuum, how it helps prevent database bloat, and best practices for configuring autovacuum parameters like autovacuum_vacuum_threshold, autovacuum_analyze_threshold, autovacuum_naptime, and autovacuum_max_workers. It emphasizes regularly monitoring for bloat, configuring autovacuum appropriately based on table sizes and usage, and avoiding manual vacuuming when not needed.
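The per-table configuration the document recommends uses PostgreSQL's storage parameters; a sketch (table name and values are illustrative, not recommendations):

```sql
-- postgresql.conf style global settings:
--   autovacuum_naptime = 30s
--   autovacuum_max_workers = 4

-- Per-table overrides for a large, heavily updated table
ALTER TABLE orders SET (
    autovacuum_vacuum_scale_factor  = 0.01,  -- vacuum after ~1% dead rows
    autovacuum_vacuum_threshold     = 1000,
    autovacuum_analyze_scale_factor = 0.005
);
```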
Peeking into the Black Hole Called PL/pgSQL - the New PL Profiler / Jan Wieck - Ontico
The new PL profiler lets you easily get through the dark barrier that PL/pgSQL puts between tools like pgBadger and the queries you are looking for.
Query and schema tuning is tough enough by itself. But queries buried many call levels deep in PL/pgSQL functions make it torture. The reason is that the default monitoring tools (logs, pg_stat_activity, and pg_stat_statements) cannot penetrate into PL/pgSQL. All they report is that your query calling function X is slow. That is useful if function X has 20 lines of simple code. Not so useful if it calls other functions and the actual problem query is many call levels down in a dungeon of 100,000 lines of PL code.
Learn from the original author of PL/pgSQL and current maintainer of the plprofiler extension how you can easily analyze, what is going on inside your PL code.
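Per the plprofiler README, a profiling session can be as simple as the following (database and function names are placeholders; exact CLI flags may differ by version):

```shell
# Install the extension in the target database, then profile one call
psql -d mydb -c "CREATE EXTENSION plprofiler;"
plprofiler run --dbname mydb --command "SELECT my_slow_func(42)" --output profile.html
```

The resulting HTML report breaks execution time down per PL/pgSQL statement, which is exactly the visibility that logs and pg_stat_statements cannot provide.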
Docker Logging and analysing with Elastic Stack - Jakub Hajek
Collecting logs from an entirely stateless environment is one of the challenging parts of the application lifecycle. Correlating business logs with operating system metrics to provide insights is crucial for the entire organization. What aspects should be considered while you design your logging solution?
Docker Logging and analysing with Elastic Stack - Jakub Hajek, PROIDEA
Collecting logs from an entirely stateless environment is one of the challenging parts of the application lifecycle. Correlating business logs with operating system metrics to provide insights is crucial for the entire organization. This technical presentation shows how to manage a large amount of data in a typical microservices environment.
The document summarizes the conceptual architecture of PostgreSQL. It describes PostgreSQL as having three main subsystems: the client interface, server processes, and database control, arranged in a client-server architecture. The postmaster process uses implicit invocation to handle connections from clients and launch server processes. Server processes employ a hybrid pipe-and-filter and repository architecture to process queries, and database control manages data storage and access using various independent subsystems.
Intro to Apache Apex - Next Gen Platform for Ingest and TransformApache Apex
Introduction to Apache Apex - The next generation native Hadoop platform. This talk will cover details about how Apache Apex can be used as a powerful and versatile platform for big data processing. Common usage of Apache Apex includes big data ingestion, streaming analytics, ETL, fast batch alerts, real-time actions, threat detection, etc.
Bio:
Pramod Immaneni is Apache Apex PMC member and senior architect at DataTorrent, where he works on Apache Apex and specializes in big data platform and applications. Prior to DataTorrent, he was a co-founder and CTO of Leaf Networks LLC, eventually acquired by Netgear Inc, where he built products in core networking space and was granted patents in peer-to-peer VPNs.
[Auto]Vacuum is a complex topic in PostgreSQL. All too often, we ignore tuning it and end up with big problems. Let's talk about how to tune [Auto]Vacuum strategically.
This technical presentation by Dave Thomas, Systems Engineer at EnterpriseDB, provides an overview of:
1) BGWriter/Writer Process
2) WAL Writer Process
3) Stats Collector Process
4) Autovacuum Launcher Process
5) Syslogger/Logger Process
6) Archiver Process
7) WAL Sender/Receiver Processes
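In PostgreSQL 10 and later, most of these background processes are visible directly from SQL, which makes the list above easy to verify on a live server:

```sql
SELECT pid, backend_type
FROM pg_stat_activity
WHERE backend_type <> 'client backend';
-- Typical rows include 'background writer', 'walwriter', 'checkpointer',
-- 'autovacuum launcher', and (on recent versions) 'archiver'.
```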
Kubernetes can orchestrate and manage container workloads through components like Pods, Deployments, DaemonSets, and StatefulSets. It schedules containers across a cluster based on resource needs and availability. Services enable discovery and network access to Pods, while ConfigMaps and Secrets allow injecting configuration and credentials into applications.
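A minimal sketch of how those pieces fit together: a Deployment that schedules Pods with resource requests, exposed through a Service (names and image are placeholders):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3
  selector:
    matchLabels: { app: web }
  template:
    metadata:
      labels: { app: web }
    spec:
      containers:
        - name: web
          image: nginx:1.25
          resources:
            requests: { cpu: 100m, memory: 128Mi }
---
apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  selector: { app: web }
  ports:
    - port: 80
```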
The document discusses strategic autovacuum configuration and monitoring in PostgreSQL. It begins by explaining the ACID properties and how MVCC and transactions work. It then discusses how to monitor workloads for heavily updated tables, adjust per-table autovacuum thresholds to prioritize those tables, monitor autovacuum behavior over time using logs and queries, and tune the autovacuum throttle settings based on that monitoring to optimize autovacuum performance. The key steps are to start with defaults, monitor workload changes, adjust settings for busy tables, continue monitoring, and refine settings as needed.
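The monitoring step described above usually starts with a query like this, which surfaces the heavily updated tables whose thresholds deserve per-table adjustment:

```sql
-- Which tables are accumulating dead rows, and when were they last vacuumed?
SELECT relname, n_live_tup, n_dead_tup,
       last_autovacuum, last_autoanalyze
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 10;
```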
Performance Scenario: Diagnosing and resolving sudden slow down on two node RAC - Kristofferson A
This document summarizes the steps taken to diagnose and resolve a sudden slow down issue affecting applications running on a two node Real Application Clusters (RAC) environment. The troubleshooting process involved systematically measuring performance at the operating system, database, and session levels. Key findings included high wait times and fragmentation issues on the network interconnect, which were resolved by replacing the network switch. Measuring performance using tools like ASH, AWR, and OS monitoring was essential to systematically diagnose the problem.
About a year ago I was caught in the line of fire when a production system abruptly started misbehaving:
- A batch process that used to finish in 15 minutes started taking 1.5 hours
- OLTP read queries on the standby started being cancelled
- We faced sudden slowness on the primary server and were forced to do a forceful switch to the standby.
We were able to figure out that some peculiarities of the application code and batch process were responsible for this. But we could not fix the application code (as it is packaged application).
In this talk I would like to share more details of how we debugged the issue, what problem we were facing, and how we applied a workaround for it. We also learnt that a query returning in 10 minutes may not be as dangerous as a query returning in 10 seconds but executed hundreds of times an hour.
I will share in detail-
- How to map the process/top stats from OS with pg_stat_activity
- How to get and read explain plan
- How to judge if a query is costly
- What tools helped us
- A peculiar autovacuum/vacuum vs. replication conflict we ran into
- Various parameters to tune the autovacuum and auto-analyze processes
- What we have done to work-around the problem
- What we have put in place for better monitoring and information gathering
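The first two bullets can be sketched directly: a PID taken from top can be looked up in pg_stat_activity, and a plan with timing and buffer usage can then be captured (the PID and query text are placeholders):

```sql
-- Map an OS process seen in top to its database session
SELECT pid, usename, state, wait_event, query
FROM pg_stat_activity
WHERE pid = 12345;

-- Capture the actual plan, timings, and buffer usage for the suspect query
EXPLAIN (ANALYZE, BUFFERS)
SELECT ...;  -- the query text taken from pg_stat_activity
```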
This was my second presentation given at MySQL User Camp (Bangalore), held on Nov 08, 2013. It covers the most commonly used Percona tools.
Presto is used to process 15 trillion rows per day for Treasure Data customers. Treasure Data developed tools to manage Presto performance and optimize queries. They collect Presto query logs to analyze performance bottlenecks and classify queries to set implicit service level objectives. Tools like Prestobase Proxy and Presto Stella storage optimizer were created to improve low-latency access and optimize storage partitioning. Workflows using DigDag and a new tabular data format called MessageFrame are being explored to split huge queries and support incremental processing.
Using Groovy? Got lots of stuff to do at the same time? Then you need to take a look at GPars (“Jeepers!”), a library providing support for concurrency and parallelism in Groovy. GPars brings powerful concurrency models from other languages to Groovy and makes them easy to use with custom DSLs:
- Actors (Erlang and Scala)
- Dataflow (Io)
- Fork/join (Java)
- Agent (Clojure agents)
In addition to this support, GPars integrates with standard Groovy frameworks like Grails and Griffon.
Background, comparisons to other languages, and motivating examples will be given for the major GPars features.
Parallel processing involves executing multiple tasks simultaneously using multiple cores or processors. It can provide performance benefits over serial processing by reducing execution time. When developing parallel applications, developers must identify independent tasks that can be executed concurrently and avoid issues like race conditions and deadlocks. Effective parallelization requires analyzing serial code to find optimization opportunities, designing and implementing concurrent tasks, and testing and tuning to maximize performance gains.
The document discusses analyzing database systems using a 3D method for performance analysis. It introduces the 3D method, which looks at performance from the perspectives of the operating system (OS), Oracle database, and applications. The 3D method provides a holistic view of the system that can help identify issues and direct solutions. It also covers topics like time-based analysis in Oracle, how wait events are classified, and having a diagnostic framework for quick troubleshooting using tools like the Automatic Workload Repository report.
Hadoop Summit SJ 2016: Next Gen Big Data Analytics with Apache ApexApache Apex
This is an overview of architecture with use cases for Apache Apex, a big data analytics platform. It comes with a powerful stream processing engine, rich set of functional building blocks and an easy to use API for the developer to build real-time and batch applications. Apex runs natively on YARN and HDFS and is used in production in various industries. You will learn more about two use cases: A leading Ad Tech company serves billions of advertising impressions and collects terabytes of data from several data centers across the world every day. Apex was used to implement rapid actionable insights, for real-time reporting and allocation, utilizing Kafka and files as source, dimensional computation and low latency visualization. A customer in the IoT space uses Apex for Time Series service, including efficient storage of time series data, data indexing for quick retrieval and queries at high scale and precision. The platform leverages the high availability, horizontal scalability and operability of Apex.
To understand an application’s performance, first you have to know what to measure. That’s the easy part. How do you take those measurements? Store them? Analyze them? Get them to the people who need them? Well, that’s where things get complicated, especially in the high-traffic distributed systems of the modern web! Like careful scientists, we must observe our subjects without altering them, and we must report our findings quickly so that we have the data necessary to make smart choices about the health and growth of the system.
Let’s explore the lessons learned by engineers at one of the world’s top web companies in their quest to find meaning at 5 MB/s. We’ll discuss the tools and techniques that enable the collection, indexing, and analysis of billions or more datapoints each hour, and learn how these same approaches can empower your applications and your business, no matter the scale.
1404 app dev series - session 8 - monitoring & performance tuningMongoDB
This document discusses MongoDB monitoring tools and key metrics. It provides an overview of tools like mongostat, the MongoDB shell, MMS, and mtools for monitoring operations per second, memory usage, page faults, and other metrics. It also discusses using logs to analyze query performance and disk saturation. The importance of monitoring queued readers/writers, page faults, background flush processes, memory usage, locks, and other core metrics is highlighted.
Similar to What you need to know for postgresql operation (20)
How to Build a Module in Odoo 17 Using the Scaffold MethodCeline George
Odoo provides an option for creating a module by using a single line command. By using this command the user can make a whole structure of a module. It is very easy for a beginner to make a module. There is no need to make each file manually. This slide will show how to create a module using the scaffold method.
বাংলাদেশের অর্থনৈতিক সমীক্ষা ২০২৪ [Bangladesh Economic Review 2024 Bangla.pdf] কম্পিউটার , ট্যাব ও স্মার্ট ফোন ভার্সন সহ সম্পূর্ণ বাংলা ই-বুক বা pdf বই " সুচিপত্র ...বুকমার্ক মেনু 🔖 ও হাইপার লিংক মেনু 📝👆 যুক্ত ..
আমাদের সবার জন্য খুব খুব গুরুত্বপূর্ণ একটি বই ..বিসিএস, ব্যাংক, ইউনিভার্সিটি ভর্তি ও যে কোন প্রতিযোগিতা মূলক পরীক্ষার জন্য এর খুব ইম্পরট্যান্ট একটি বিষয় ...তাছাড়া বাংলাদেশের সাম্প্রতিক যে কোন ডাটা বা তথ্য এই বইতে পাবেন ...
তাই একজন নাগরিক হিসাবে এই তথ্য গুলো আপনার জানা প্রয়োজন ...।
বিসিএস ও ব্যাংক এর লিখিত পরীক্ষা ...+এছাড়া মাধ্যমিক ও উচ্চমাধ্যমিকের স্টুডেন্টদের জন্য অনেক কাজে আসবে ...
This slide is special for master students (MIBS & MIFB) in UUM. Also useful for readers who are interested in the topic of contemporary Islamic banking.
This presentation includes basic of PCOS their pathology and treatment and also Ayurveda correlation of PCOS and Ayurvedic line of treatment mentioned in classics.
June 3, 2024 Anti-Semitism Letter Sent to MIT President Kornbluth and MIT Cor...Levi Shapiro
Letter from the Congress of the United States regarding Anti-Semitism sent June 3rd to MIT President Sally Kornbluth, MIT Corp Chair, Mark Gorenberg
Dear Dr. Kornbluth and Mr. Gorenberg,
The US House of Representatives is deeply concerned by ongoing and pervasive acts of antisemitic
harassment and intimidation at the Massachusetts Institute of Technology (MIT). Failing to act decisively to ensure a safe learning environment for all students would be a grave dereliction of your responsibilities as President of MIT and Chair of the MIT Corporation.
This Congress will not stand idly by and allow an environment hostile to Jewish students to persist. The House believes that your institution is in violation of Title VI of the Civil Rights Act, and the inability or
unwillingness to rectify this violation through action requires accountability.
Postsecondary education is a unique opportunity for students to learn and have their ideas and beliefs challenged. However, universities receiving hundreds of millions of federal funds annually have denied
students that opportunity and have been hijacked to become venues for the promotion of terrorism, antisemitic harassment and intimidation, unlawful encampments, and in some cases, assaults and riots.
The House of Representatives will not countenance the use of federal funds to indoctrinate students into hateful, antisemitic, anti-American supporters of terrorism. Investigations into campus antisemitism by the Committee on Education and the Workforce and the Committee on Ways and Means have been expanded into a Congress-wide probe across all relevant jurisdictions to address this national crisis. The undersigned Committees will conduct oversight into the use of federal funds at MIT and its learning environment under authorities granted to each Committee.
• The Committee on Education and the Workforce has been investigating your institution since December 7, 2023. The Committee has broad jurisdiction over postsecondary education, including its compliance with Title VI of the Civil Rights Act, campus safety concerns over disruptions to the learning environment, and the awarding of federal student aid under the Higher Education Act.
• The Committee on Oversight and Accountability is investigating the sources of funding and other support flowing to groups espousing pro-Hamas propaganda and engaged in antisemitic harassment and intimidation of students. The Committee on Oversight and Accountability is the principal oversight committee of the US House of Representatives and has broad authority to investigate “any matter” at “any time” under House Rule X.
• The Committee on Ways and Means has been investigating several universities since November 15, 2023, when the Committee held a hearing entitled From Ivory Towers to Dark Corners: Investigating the Nexus Between Antisemitism, Tax-Exempt Universities, and Terror Financing. The Committee followed the hearing with letters to those institutions on January 10, 202
1. What you need to know for
PostgreSQL operation
Materials: https://orabase.org/owncloud/index.php/s/vdqNlAqNqIPLxil
2. Who Am I ?
Oracle OCP 12c
OCE 11g PT
OCE 11g RAC
Senior Specialist at RT Labs
The guy on the left ^_^
PostgreSQL 9.3 Associate
3. Presentation plan
1. Architecture of PostgreSQL
2. Transactions and Concurrency, MVCC
3. Connection Pooling (pgpool, pgbouncer)
4. Tips & Tricks + Monitoring
4. Architectural Summary :
• PostgreSQL uses processes, not threads
• Postmaster process acts as supervisor
• Several utility processes perform background work
• postmaster starts them, restarts them if they die
• postmaster listens for new connections
6. Main Utility Processes:
• Background writer − Writes dirty data blocks to disk
• WAL writer − Flushes write-ahead log to disk
• Checkpointer process − Automatically performs a checkpoint based
on config parameters
• Autovacuum launcher − Starts Autovacuum workers as needed
• Autovacuum workers − Recover free space for reuse
• Stats Collector – collects runtime statistics
8. What is a Transaction?
• A transaction is set of statements bundled into a single step, all-or-
nothing operation
• A transaction must possess ACID properties:
• An all-or-nothing operation (Atomicity).
• Only valid data is written to the database (Consistency).
• The intermediate states between the steps are not visible to other concurrent
transactions (Isolation).
• Once the database reports a transaction as committed, its changes survive
subsequent crashes and power failures (Durability).
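The all-or-nothing behaviour described above can be shown in a few lines. A minimal sketch in Python, using the stdlib sqlite3 module purely so the example is self-contained; the transactional semantics it demonstrates are the same ones PostgreSQL provides:

```python
import sqlite3

# Atomicity sketch: two updates bundled into one transaction; a failure
# between them rolls back both. SQLite (stdlib) is used only to keep the
# example self-contained.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 0)")
conn.commit()

try:
    with conn:  # commits on success, rolls back on an exception
        conn.execute("UPDATE accounts SET balance = balance - 100 WHERE name = 'alice'")
        raise RuntimeError("simulated crash before the matching credit")
except RuntimeError:
    pass

# The debit was rolled back together with the rest of the transaction.
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
assert balances == {'alice': 100, 'bob': 0}
```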
9. 2. Transactions and Concurrency, MVCC
• Snapshot of data at a point in time.
• Updates, inserts and deletes cause the creation of a new row version.
Row version stored in same page.
• MVCC uses increasing transaction IDs to achieve consistency.
• Each row has 2 transaction ids: created and expired
• Queries check:
• the creating transaction id is committed and < the current transaction counter
• the row has no expiring transaction id, or the expiring transaction was still in
progress when the query started
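A toy sketch of the visibility rule just described; the function and the committed-set representation are illustrative, not PostgreSQL internals:

```python
# Toy model of the visibility rule above (illustrative, not PostgreSQL
# internals): each row version carries a creating xid and an optional
# expiring xid, and a query checks both against its own xid.

def row_visible(created_xid, expired_xid, current_xid, committed):
    """Visible iff the creator committed before us and no committed
    transaction expired the row before us."""
    created_ok = created_xid in committed and created_xid < current_xid
    expired = (expired_xid is not None
               and expired_xid in committed
               and expired_xid < current_xid)
    return created_ok and not expired

committed = {100, 101}
assert row_visible(100, None, 105, committed)       # live row: visible
assert not row_visible(100, 101, 105, committed)    # expired by xid 101
assert not row_visible(200, None, 105, committed)   # created "in the future"
```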
10. MVCC Maintenance
• MVCC creates multiple versions of a row for concurrency
• Old row versions can cause “bloat”
• Rows no longer needed are recovered for reuse/removed via
vacuuming or autovacuum
• To prevent transaction wraparound failure each table must be
vacuumed periodically
• PostgreSQL reserves a special XID as FrozenXID
• This XID is always considered older than every normal XID
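The wraparound problem comes from comparing 32-bit transaction IDs circularly; a rough Python sketch (real PostgreSQL uses signed 32-bit arithmetic and reserves a few special low XIDs, which this toy ignores):

```python
# Sketch of why wraparound matters: xids are 32 bits and compared
# circularly, so an xid "precedes" another if it lies in the 2^31 ids
# behind it. After ~2 billion transactions the counter wraps, and an
# unfrozen old xid would suddenly look like it is in the future; freezing
# replaces old xids with FrozenXID, defined to precede everything.

XID_MASK = 0xFFFFFFFF

def precedes(a, b):
    """Circular comparison: does xid a precede xid b?"""
    return ((b - a) & XID_MASK) < 2**31

assert precedes(100, 200)
assert precedes(2**32 - 10, 5)       # near wraparound: old xid still "older"
assert not precedes(5, 2**32 - 10)
```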
11. Presentation plan
1. Architecture of PostgreSQL
2. Transactions and Concurrency, MVCC
3. Connection Pooling (pgpool, pgbouncer)
4. Tips & Tricks + Monitoring
12. Pgpool
• At first, developed for connection pooling
• Replication Master/Slave mode
• Load balancing
• Automatic failover on desync detection
• Online recovery
• Parallel Query
14. PgBouncer
• Lightweight connection pooler for PostgreSQL
• Any application can connect to PgBouncer just as it would connect to
PostgreSQL
• PgBouncer helps lower the impact of connections on the
PostgreSQL server
• PgBouncer provides connection pooling, and thus reuses existing
connections
15. Types of Connections
• pgbouncer supports several types of pooling when rotating
connections:
• Session pooling − A server connection is assigned to the client application for
the life of the client connection.
• Transaction pooling − A server connection is assigned to the client application
for the duration of a transaction
• Statement pooling − A server connection is assigned to the client application
for each statement
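The reuse idea behind session pooling can be sketched in a few lines of Python; the Pool and Connection classes are hypothetical stand-ins, not PgBouncer's actual implementation:

```python
# Minimal sketch of connection reuse (hypothetical classes, not PgBouncer
# internals): connections are created lazily and returned to a free list
# instead of being closed when the client disconnects.

class Connection:
    _created = 0  # counts how many "real" server connections were opened
    def __init__(self):
        Connection._created += 1

class Pool:
    def __init__(self):
        self.free = []
    def acquire(self):
        # Reuse an idle server connection if one exists, else open a new one.
        return self.free.pop() if self.free else Connection()
    def release(self, conn):
        self.free.append(conn)

pool = Pool()
c1 = pool.acquire()   # first client: a real connection is opened
pool.release(c1)      # client logs off; connection goes back to the pool
c2 = pool.acquire()   # next client reuses it: no new backend process
assert c1 is c2
assert Connection._created == 1
```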
16. How Connections are Established
• An application connects to PgBouncer as if it were a PostgreSQL database
• PgBouncer then creates a connection to the actual database server, or it
reuses one of the existing connections from the pool
• Step 1: The client application attempts to connect to PostgreSQL on the port where
pgbouncer is running
• Step 2: The database name supplied by the client application must match an entry in
pgbouncer.ini
• Step 3: The user name and password supplied must match an entry in users.txt
• Step 4: If a connection with the same settings is available in the pool, it is assigned
to the client; otherwise a new connection object is created
• Step 5: Once the client logs off, the connection object is returned to the pool
17. Manage pgbouncer
• The show stats, servers, clients, pools, lists, databases, and fds commands can
be used.
• Manage pgbouncer by connecting to the special administration database
pgbouncer and issuing show help;
• $ psql -p 6543 -U someuser pgbouncer
• pgbouncer=# show help;
• NOTICE: Console usage
• DETAIL: SHOW [HELP|CONFIG|DATABASES|FDS|POOLS|CLIENTS|SERVERS|SOCKETS|LISTS|VERSION]
19. quick test with pgbouncer
• Connecting to the bouncer over local unix socket, it took 31s to
perform all the queries.
• Connecting to the bouncer over localhost, it took 45s to perform all
the queries.
• Connecting to the bouncer running on the remote server, it took
1m6s
• Without using pgbouncer, it took 3m34s
23. How Graphite is populated with data
• We wrote a function that gathers all the information in a single pass (still under
development)
• adm-get_stat_activity.sql
24. Autovacuum & DB activity monitoring
• autovacuum_count.Query=select count(*) from pg_stat_activity
where state = 'active' AND query LIKE 'autovacuum:%'
• autovacuum_max.Query=select coalesce(max(round(extract(epoch
FROM age(statement_timestamp(), state_change)))),0)
active_seconds from pg_stat_activity where state = 'active' AND
query LIKE 'autovacuum:%'
• xactcommit.Query=SELECT sum(xact_commit) FROM
pg_stat_database
• xactrollback.Query=SELECT sum(xact_rollback) FROM
pg_stat_database
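Since xact_commit and xact_rollback are cumulative counters, a monitoring agent turns them into TPS by differencing two samples and dividing by the sampling interval; a minimal sketch (sample values are made up):

```python
# pg_stat_database counters are cumulative since stats reset, so TPS is
# derived from two samples: (later - earlier) / sampling interval.
# The numbers below are made up for illustration.

def tps(prev_count, curr_count, interval_seconds):
    return (curr_count - prev_count) / interval_seconds

sample_t0 = 1_000_000          # sum(xact_commit + xact_rollback) at time t
sample_t1 = 1_000_600          # the same sum 30 seconds later
assert tps(sample_t0, sample_t1, 30) == 20.0  # 20 transactions per second
```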
28. Session activity monitoring
• active_session_cnt.Query=select count(*) from pg_stat_activity where state='active' and pid != pg_backend_pid()
• active_5s.Query=select count(*) from pg_stat_activity where state='active' and statement_timestamp() - state_change >
INTERVAL '5s' AND query not LIKE 'autovacuum:%'
• active_max.Query=select coalesce(abs(max(round(extract(epoch FROM age(statement_timestamp(), state_change))))),0)
active_max_seconds from pg_stat_activity where state='active' AND query not LIKE 'autovacuum:%'
• idle_session_cnt.Query=select count(*) from pg_stat_activity where state='idle'
• idle_in_trans_cnt.Query=select count(*) from pg_stat_activity where state like 'idle in trans%'
• idle_in_trans_5s.Query=select count(*) from pg_stat_activity where state like 'idle in trans%' and statement_timestamp() -
state_change > INTERVAL '5s'
• idle_in_trans_max.Query=select coalesce(max(round(extract(epoch FROM age(statement_timestamp(), state_change)))),0)
max_idle_in_trans from eyes.get_pg_stat_activity() where state like 'idle in trans%'
• waiting_session_cnt.Query=select count(*) from eyes.get_pg_stat_activity() where waiting is true
• waiting_session_5s.Query=select count(*) from pg_stat_activity where waiting is true and statement_timestamp() - state_change
> INTERVAL '5s'
• waiting_session_max.Query=select coalesce(abs(max(round(extract(epoch FROM age(statement_timestamp(),
state_change))))),0) waiting_max from pg_stat_activity where waiting is true
48. Top query by avg runtime
• select md5(query), calls, total_time, rows, shared_blks_hit, shared_blks_read,
(total_time/calls) as avg_time
from pg_stat_statements
order by avg_time desc limit 5;
51. pg_stat_kcache
SELECT datname, queryid, round(total_time::numeric, 2) AS total_time, calls,
pg_size_pretty((shared_blks_hit+shared_blks_read)*8192 - reads) AS memory_hit,
pg_size_pretty(reads) AS disk_read, pg_size_pretty(writes) AS disk_write,
round(user_time::numeric, 2) AS user_time, round(system_time::numeric, 2) AS system_time
FROM pg_stat_statements s
JOIN pg_stat_kcache() k USING (userid, dbid, queryid)
JOIN pg_database d ON s.dbid = d.oid
WHERE datname != 'postgres' AND datname NOT LIKE 'template%' ORDER BY total_time DESC LIMIT 10;
54. Top object in cache
SELECT c.relname ,
pg_size_pretty(count(*) * 8192) as buffered
, round(100.0 * count(*) / ( SELECT setting FROM pg_settings WHERE
name='shared_buffers')::integer,1) AS buffers_percent
, round(100.0 * count(*) * 8192 / pg_relation_size(c.oid),1) AS percent_of_relation
FROM pg_class c
JOIN pg_buffercache b ON b.relfilenode = c.relfilenode
JOIN pg_database d ON (b.reldatabase = d.oid AND d.datname = current_database())
WHERE pg_relation_size(c.oid) > 0 GROUP BY c.oid, c.relname
ORDER BY 3 DESC LIMIT 10;
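The percentages in the query above come from simple block arithmetic (each shared buffer holds one 8 KB page); restated as a small Python sketch with made-up numbers:

```python
# Each row in pg_buffercache is one 8 KB buffer, so an object's cached
# share follows directly from its buffer count. Values are illustrative.

BLCKSZ = 8192  # PostgreSQL default block size in bytes

def cache_stats(buffers_for_object, shared_buffers_total, relation_size_bytes):
    buffered_bytes = buffers_for_object * BLCKSZ
    buffers_percent = round(100.0 * buffers_for_object / shared_buffers_total, 1)
    percent_of_relation = round(100.0 * buffered_bytes / relation_size_bytes, 1)
    return buffered_bytes, buffers_percent, percent_of_relation

# An object holding 2048 buffers of a 16384-buffer (128 MB) cache,
# with a total relation size of 64 MB:
assert cache_stats(2048, 16384, 64 * 1024 * 1024) == (16777216, 12.5, 25.0)
```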
56. Top 20 unused indexes
SELECT relid::regclass AS table,
indexrelid::regclass AS index,
pg_size_pretty(pg_relation_size(indexrelid::regclass)) AS index_size,
idx_tup_read,
idx_tup_fetch,
idx_scan
FROM pg_stat_user_indexes
JOIN pg_index USING (indexrelid)
WHERE idx_scan = 0 AND indisunique IS FALSE
order by pg_relation_size(indexrelid::regclass) desc limit 20;
58. indexes on nulls
Select
pg_index.indrelid::regclass as table,
pg_index.indexrelid::regclass as index,
pg_attribute.attname as field,pg_statistic.stanullfrac,
pg_size_pretty(pg_relation_size(pg_index.indexrelid)) as indexsize,
pg_get_indexdef(pg_index.indexrelid) as indexdef
from pg_index
join pg_attribute ON pg_attribute.attrelid=pg_index.indrelid AND
pg_attribute.attnum=ANY(pg_index.indkey)
join pg_statistic ON pg_statistic.starelid=pg_index.indrelid AND
pg_statistic.staattnum=pg_attribute.attnum
where pg_statistic.stanullfrac>0.5
AND pg_relation_size(pg_index.indexrelid)>10*8192
order by pg_relation_size(pg_index.indexrelid) desc,1,2,3;
60. Duplicate indexes
SELECT pg_size_pretty(SUM(pg_relation_size(idx))::BIGINT) AS SIZE,
(array_agg(idx))[1] AS idx1, (array_agg(idx))[2] AS idx2,
(array_agg(idx))[3] AS idx3, (array_agg(idx))[4] AS idx4
FROM ( SELECT indexrelid::regclass AS idx, (indrelid::text ||E'\n'|| indclass::text ||E'\n'||
indkey::text ||E'\n'||
COALESCE(indexprs::text,'')||E'\n' || COALESCE(indpred::text,'')) AS KEY
FROM pg_index) sub
GROUP BY KEY HAVING COUNT(*)>1 ORDER BY SUM(pg_relation_size(idx)) DESC;
62. Missing index
SELECT relname, seq_scan-idx_scan AS too_much_seq, case when
seq_scan-idx_scan>0 THEN 'Missing Index?' ELSE 'OK' END,
pg_size_pretty(pg_relation_size(relname::regclass)) AS rel_size,
seq_scan, idx_scan FROM pg_stat_all_tables WHERE
schemaname='public' AND
pg_relation_size(relname::regclass)>10*1024*1024 ORDER BY
too_much_seq DESC nulls last;
65. Write activity
SELECT s.relname,
pg_size_pretty(pg_relation_size(relid)),
coalesce(n_tup_ins,0) + 2 * coalesce(n_tup_upd,0) - coalesce(n_tup_hot_upd,0) +
coalesce(n_tup_del,0) AS total_writes,
(coalesce(n_tup_hot_upd,0)::float * 100 / (CASE WHEN n_tup_upd > 0 THEN n_tup_upd
ELSE 1 END)::float)::numeric(10,2) AS hot_rate,
(SELECT v[1]
FROM regexp_matches(reloptions::text, e'fillfactor=(\\d+)') AS r(v) LIMIT 1) AS fillfactor
FROM pg_stat_all_tables s
JOIN pg_class c ON c.oid=relid
ORDER BY total_writes DESC LIMIT 50;
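The total_writes and hot_rate expressions in the query above, restated as plain Python so the weighting is explicit (counter values are made up):

```python
# Restatement of the write-activity arithmetic from the SQL above,
# with illustrative counter values.

def total_writes(n_tup_ins, n_tup_upd, n_tup_hot_upd, n_tup_del):
    # An ordinary UPDATE writes a new row version and touches indexes,
    # hence the 2x weight; HOT updates skip the index part.
    return n_tup_ins + 2 * n_tup_upd - n_tup_hot_upd + n_tup_del

def hot_rate(n_tup_upd, n_tup_hot_upd):
    # Share of updates that were HOT, guarding against division by zero.
    return round(100.0 * n_tup_hot_upd / max(n_tup_upd, 1), 2)

assert total_writes(n_tup_ins=100, n_tup_upd=50, n_tup_hot_upd=30, n_tup_del=20) == 190
assert hot_rate(n_tup_upd=50, n_tup_hot_upd=30) == 60.0
```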
67. Useful materials
• Postgrespro course (in Russian)
• PostgreSQL diagnostic techniques (Yandex)
• Deep dive into PostgreSQL statistics
• PostgreSQL meetup @ Avito (9.04.2016)
• What is HOT
Given recent events and the trend toward import substitution, the PostgreSQL database is rapidly gaining popularity.
Talk topic: what you need to know to operate Postgres.
Colleagues, good afternoon. By tradition, a little about myself: for almost two years I have worked at RT Labs as a senior specialist in the DBMS operations team.
I managed to pass the certifications: (click, click)
Expert in Oracle clusters, performance tuning, and so on; but in light of recent events Oracle's fate in Russia is uncertain, so I had to take up Postgres, and this talk is about what you need to know to operate it.
Click
I have since passed my first PostgreSQL certification.
Click
The guy on the left :) on the right is Ilya Kosmodemyansky (Postgrespro)
Presentation plan:
We'll start with the architecture (no way around it).
Then transaction isolation levels and concurrency control through multiversioning (MVCC, Multi-Version Concurrency Control).
Third, connection pooling implemented with external tools (pgpool, pgbouncer).
Finally, an overview of what is available in Postgres for analysis, and how to use it.
PostgreSQL uses processes, not threads; this means every user who connects to the database is served by a separate backend process.
All processes are spawned by a single process, the postmaster.
The main postmaster process starts the auxiliary processes that do the background work and restarts them if they die; if the restart fails, the postmaster shuts down the instance.
The postmaster also listens for new connection requests.
This slide shows the overall process and memory architecture.
You can also see the files; let's start with those:
Data files, where the data is stored.
WAL segments: the transaction logs.
Archived WAL: if configured, once a log segment is full and switched, it can be archived to a separate directory.
Error log: if configured, all errors, long-running transactions, and connection-establishment messages are written to the log.
In memory:
When the instance starts, it is allocated a shared memory segment.
It can be divided into:
Shared buffers: used for data file operations (the block cache).
WAL buffers: hold WAL records before they are flushed to disk.
Process array: since every user is a process, the database must maintain this list of processes, along with the locks those processes hold and other information needed for operation.
The auxiliary processes (bgwriter etc.) run in the background.
The main auxiliary processes, all running in the background:
Bgwriter writes dirty buffers to disk.
WAL writer writes data to the WAL on every commit.
Checkpointer (every 5 min, manual, or when WAL is full): the process wakes up and makes sure everything in memory is flushed to disk.
Autovacuum launcher: enabled by default (which is a good thing); it starts the autovacuum workers.
Autovacuum workers: do the work of cleaning up old row versions and gathering statistics.
Stats collector: an important process that collects runtime statistics; the settings that control how much is collected:
The parameter track_activities enables monitoring of the current command being executed by any server process.
The parameter track_counts controls whether statistics are collected about table and index accesses. (pg_stat_activity)
The parameter track_functions enables tracking of usage of user-defined functions.
The parameter track_io_timing enables monitoring of block read and write times.
An example of the running processes.
Let's walk through transactions and see how the ACID principle is implemented.
What is a transaction?
A transaction is a set of statements bundled into a single step, an all-or-nothing operation.
A transaction must satisfy the ACID properties:
Atomicity guarantees that no transaction is committed partially: either all of its sub-operations are performed, or none at all.
Consistency: a transaction that reaches its normal end (EOT, end of transaction) and thereby commits its results preserves the consistency of the database. In other words, every successful transaction by definition commits only valid results. This condition is necessary to support the fourth property.
Isolation: while a transaction is running, concurrent transactions must not influence its result.
Durability: regardless of problems at lower levels (for example, a power failure or hardware fault), the changes made by a successfully completed transaction must remain in place after the system comes back up. In other words, once the user has received confirmation from the system that the transaction completed, they can be sure the changes they made will not be undone by some failure.
How ACID is implemented in Postgres:
It is common for several users to work with the same data set; to provide concurrent access, snapshots are used.
Row versions are stored in the same page; MVCC uses an increasing transaction ID counter to provide consistency.
MVCC keeps multiple versions of each row to support concurrent access.
Old row versions can cause bloat; vacuum or autovacuum fights it by removing row versions that are no longer needed.
The MVCC machinery would be impossible without the transaction counter. Why this counter is still 32-bit is a mystery, but we have what we have: every 2 billion transactions the counter is due to wrap around. To avoid data loss, old rows are assigned a special reserved FrozenXID. When the transaction counter reaches the value set in the config, autovacuum starts with the lovely comment "to prevent wraparound". On large tables (tens of GB) this process can take hours, during which the table is unavailable for reads and writes.
The point is that Postgres creates a new process for every connection. To make database connections cheaper, modern libraries use a connection pool: they connect to the database once and then reuse that connection many times. If the database access library has no way to organize connection pools, pgpool and pgbouncer come to the rescue.
Pgpool is a good utility: it can run parallel queries, route read queries to the standbys and write queries to the master on its own, and can perform automatic failover.
But failover has to be configured by hand, so it requires expertise.
I didn't have it, and after we ended up with two masters in production we decided to drop pgpool.
More lightweight than pgpool.
The slide shows that after switching from pgpool to pgbouncer the CPU and memory load dropped.
Pgbouncer acts as an intermediate layer between client and server: the client connects to pgbouncer exactly as it would connect to the database, but when the client session ends pgbouncer does not close the connection to the database; it returns it to the pool of free connections.
This reduces the load that opening new connections places on the database.
Connection types:
Pgbouncer supports several connection types:
Click
Session-level pooling: a server connection is assigned to the client for the whole lifetime of the client connection.
Click
Operation-level pooling: a connection is given to the client for the duration of a single operation.
Click
We use session pooling.
Since third-party software may set session variables, which could affect other sessions, developers should connect to the database directly.
We'll skip this...
Skip.
For those interested in how all of this works under heavy load, here is an example from Avito, where they implemented a setup with pgbouncer placed on the load balancers and in front of the database.
There is a link to the talk.
How effective is it?
A small test.
We measured 50,000 `select now()` queries, establishing a connection for each query and closing it afterwards.
Results: with pgbouncer the test ran at least 3 times faster.
In this part we'll go through what we monitor (or plan to monitor), plus a set of useful scripts.
What can we look at...
:) it seems you can look at practically everything,
but sometimes even that is not enough.
We'll start with monitoring, the things plotted in charts; afterwards I'll show the queries you can use to pull out useful information.
Let's go in order:
Click
The main interest here is autovacuum activity and instance activity (the number of commits and rollbacks), also known as TPS.
These are Zabbix metrics: we watch the number of autovacuum sessions and the duration of the longest autovacuum operation, plus the number of commits and rollbacks, from which TPS is computed:
SELECT sum(xact_commit+xact_rollback) FROM pg_stat_database;
Autovacuum: we chart the session count plus the longest-running session.
TPS: commit and rollback counts are charted separately.
The chart shows the number of active or waiting sessions, including autovacuum, plus information about their duration.
This block is the most useful, since it holds information about the current (active or waiting) sessions, as well as query statistics.
Here we look at the number of sessions that are active, and active for more than 5 seconds,
as well as at the longest-running session.
We chart separately the sessions that are 'idle in transaction' or waiting on a lock.
At the moment each Zabbix metric is polled separately; for Grafana my colleague Alexander Samoylov implemented all of this as a function that makes a single query per view and returns a final data set, which is then rendered as charts; this is obviously more efficient.
Very useful information comes from the pg_stat_statements module.
It shows which queries were active in the database and when, but there is one nuance: statistics reach pg_stat_statements only after a query finishes.
We don't have this implemented at the moment, but there is an analogue (click).
Something similar is currently implemented using VMware's log analyzer; the picture is incomplete, since only queries longer than 200 ms end up in the log.
Buffer statistics and overall per-database activity statistics.
Number of connections to the database.
Number of tuples (roughly speaking, rows) returned.
Number of rows inserted/deleted.
For Zabbix you can add a temp-file usage metric.
Keep in mind that since we sample the metrics once every N seconds, the result must be divided by N to get operations per second.
Here we split RO and RW activity.
The same thing in Grafana.
In this block we monitor standby lag plus checkpoint process statistics.
At the moment this exists only in Grafana, since we started rolling out Zabbix just a couple of weeks ago.
The charts show replica lag over time and the amount of data that still needs to be shipped.
Timed checkpoints are normal; if requested checkpoints start appearing, that needs to be investigated.
In this block most of the information comes from the operating system: disk and network utilization, I/O operation counts, disk wait times.
I think everyone knows where to find these metrics, and how to visualize them.
We'll cover this part using concrete queries that can (and should) be used in day-to-day work.
For those with access to our Confluence there is a link to the document.
In this part we'll go through the extensions I find most interesting: pg_buffercache, pg_stat_statements, pg_stat_kcache.
Then I'll show a few queries that help assess what can be improved in the application architecture.
This module holds cumulative query statistics (since instance start, though they can be reset).
It lets you see the top queries.
There is a set of ready-made scripts from PostgreSQL-Consulting.
On top of this view it is easy to build reports like this one in various cuts (this one is by average execution time).
This example uses the blk_read_time and blk_write_time statistics, which are controlled by the track_io_timing parameter.
Let's start with pg_stat_kcache, which collects statistics on the actual reads and writes performed by the file system.
It requires 9.4.
And, unfortunately, it is not in contrib at the moment.
With this query we find the top queries by total execution time, now taking physical reads and writes into account.
pg_buffercache is used to determine what portion of a table or index is cached in shared buffers.
One row is shown for each buffer in the shared cache, so counting rows under the right conditions gives the total number of buffers occupied by an object.
Accessing the pg_buffercache view takes internal buffer-manager locks for long enough to copy all the buffer state data shown in the view; this can affect database performance if the view is queried frequently.
With this query we find the top 10 objects in the cache and compute their percentage of the total shared buffers and of the whole object's size.
Next come useful queries against the system catalog; this one finds indexes that are not used, plus their sizes.
Indexes on nulls, where the null fraction exceeds 50%.
Duplicate indexes.
HOT stands for Heap-Only Tuple, an attempt to solve some of the problems associated with frequently updated tables. This design optimizes updates when none of the indexed columns are modified and the new row version fits on the same page: the new version is stored on the same heap page and reached by following a pointer chain from the original tuple, so no new index entries are needed, and dead versions in the chain can later be pruned within the page.
Colleagues, if I don't manage to answer your questions (or can't), please send them to me; for my part, I promise to look into them and reply within a reasonable time.