Modern PostgreSQL Monitoring & Diagnostics
Mahadevan Ramachandran
Co-founder & CEO, RapidLoop
September 10, 2019
Mahadevan Ramachandran PostgreSQL Monitoring September 10, 2019 2 / 55
Hello!
Mahadevan Ramachandran
Co-founder & CEO, RapidLoop
We build monitoring products:
OpsDash – server & service monitoring
pgDash – dedicated PostgreSQL monitoring
In The Beginning...
OpsDash was our first product. It does server and service monitoring. It
also does basic PostgreSQL monitoring. OpsDash itself uses Postgres
internally.
As we grew, we realized that we needed more in-depth monitoring for
Postgres.
But:
What should we collect? There was no standard list of metrics to
collect.
How should we collect it? There was no standard way to collect that
worked across Postgres versions, distros, etc.
Why should we collect what we were supposed to collect? How were
we supposed to interpret it?
What we needed
A modern system by which we could collect all relevant info about a
Postgres server into one place
Extract & store relevant metrics into timeseries db
Display collected information in a rich set of dashboards – not just as
a bunch of timeseries graphs
More importantly: let algorithms have a look at all this information
and perform diagnostics – for example:
WAL file count increased steadily over the last 24 hours
Look at transactions running for more than 24 hours – none
Look at wal archiving info – not failing – what then?
Look at replication slots – one of them has become inactive – aha!
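The diagnostic walk-through above could be automated roughly as follows. This is a minimal Python sketch, not pgmetrics or pgDash code: the `Snapshot` and `Slot` types, their fields, and `diagnose_wal_growth` are hypothetical names chosen for illustration, and the checks simply follow the order given on the slide.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Slot:
    name: str
    active: bool


@dataclass
class Snapshot:
    # Hypothetical, simplified view of the information collected over 24 hours.
    wal_file_counts: List[int]      # WAL file count samples, oldest first
    longest_xact_hours: float       # age of the longest-running transaction
    archive_failures: int           # failed WAL archive attempts
    slots: List[Slot] = field(default_factory=list)


def is_steadily_growing(samples: List[int]) -> bool:
    """True if the samples never decrease and end higher than they started."""
    return (
        len(samples) >= 2
        and samples[-1] > samples[0]
        and all(b >= a for a, b in zip(samples, samples[1:]))
    )


def diagnose_wal_growth(snap: Snapshot) -> List[str]:
    """Return possible causes for a steadily growing WAL file count."""
    if not is_steadily_growing(snap.wal_file_counts):
        return []
    causes = []
    if snap.longest_xact_hours > 24:
        causes.append("transaction running for more than 24 hours")
    if snap.archive_failures > 0:
        causes.append("WAL archiving is failing")
    for slot in snap.slots:
        if not slot.active:
            causes.append(f"replication slot {slot.name!r} is inactive")
    return causes
```

Keeping each check as a plain condition over collected data makes it easy to add further rules without touching the collection side.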
So we built pgmetrics
We did some research:
Started with our requirements
Actually read the Postgres documentation :-)
Looked at various metrics collection agents/plugins and check_nagios.pl
Read a few Postgres administration books
and then built an open source tool called pgmetrics
Added more features and tweaks based on user feedback
It now collects over 350 metrics, using over 50 different queries
pgmetrics
Open source CLI tool - also usable as a library
Run anywhere - Go, single binary, statically linked, no dependencies -
Windows, Linux, FreeBSD, macOS
No installation or Postgres extensions required - so works with AWS
RDS, Heroku too
Easy to use - exact same command-line arguments and env. vars as
psql
Usable for scripting and automation - JSON, CSV outputs
Accommodates Postgres version differences - works with v9.3+
How It Works
pgmetrics collects information from:
Statistics collector views – pg_stat_archiver, pg_stat_bgwriter,
pg_stat_replication, pg_stat_wal_receiver, pg_stat_activity,
pg_stat_subscription, ...
Functions – pg_is_in_recovery(), pg_last_wal_receive_lsn(),
pg_last_wal_replay_lsn(), pg_tablespace_size(), ...
System catalog views – pg_database, pg_tablespace, pg_class, ...
PSS Extension – pg_stat_statements
Configuration settings – pg_settings
Bloat – using the query from check_nagios.pl
Filesystem – pg_ls_dir('pg_wal')
System metrics – /proc filesystem
Demo Time
Let’s have a look!
PostgreSQL Metrics
Cluster-Level
*cluster = a set of databases managed by a single postgres process
Cluster Overview
Name: 10/main
Server Version: 10.10 (Ubuntu 10.10-1.pgdg18.04+1)
Server Started: 23 Aug 2019 3:47:32 AM (24 minutes ago)
System Identifier: 6685915216424112509
Timeline: 1
Last Checkpoint: 23 Aug 2019 4:07:36 AM (4 minutes ago)
Prior LSN: 13E/4100F370
REDO LSN: 13E/4103E9F8 (190 KiB since Prior)
Checkpoint LSN: 13E/410411D8 (10 KiB since REDO)
Transaction IDs: 1499764934 to 1640241289 (diff = 140476355)
Notification Queue: 0.0% used
Active Backends: 3 (max 100)
Recovery Mode? no
Cluster Overview
What to monitor:
Transaction ID range
Time since last checkpoint
Number of (client) backends
Notification queue usage
Time since server start
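To make "transaction ID range" concrete: PostgreSQL transaction IDs are 32-bit counters, and the oldest unfrozen ID must stay within about 2 billion (2^31) of the newest one. The sketch below, with the hypothetical helper name `xid_range_used`, expresses the range from the sample report as a fraction of that limit.

```python
XID_WRAPAROUND_LIMIT = 2**31  # about 2.1 billion transactions


def xid_range_used(oldest_xid: int, next_xid: int) -> float:
    """Fraction of the wraparound limit consumed by the current XID range."""
    # XIDs wrap around at 2**32, so take the difference modulo 2**32.
    diff = (next_xid - oldest_xid) % 2**32
    return diff / XID_WRAPAROUND_LIMIT


# Values from the "Transaction IDs" line of the sample report.
used = xid_range_used(1499764934, 1640241289)
print(f"{used:.1%} of the wraparound limit is in use")
```

Tracking this fraction over time shows whether vacuuming is keeping up long before the range gets anywhere near the limit.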
Backends
Backends:
Total Backends: 17 (17.0% of max 100)
Problematic: 5 waiting on locks, 6 waiting on other, 1 xact too long, 3 idle in xact
Waiting for Locks:
+-------+--------+---------+-------------+----------+----------------------+------------------------+
| PID   | User   | App     | Client Addr | Database | Wait                 | Query Start            |
+-------+--------+---------+-------------+----------+----------------------+------------------------+
| 28525 | mdevan | psql    |             | bench    | Lock / relation      | 28 Aug 2019 6:42:13 AM |
| 28539 | mdevan | pgbench |             | bench    | Lock / transactionid | 28 Aug 2019 6:42:56 AM |
| 28541 | mdevan | pgbench |             | bench    | Lock / transactionid | 28 Aug 2019 6:42:56 AM |
| 28565 | mdevan | psql    |             | bench    | Lock / relation      | 28 Aug 2019 6:42:26 AM |
| 28588 | mdevan | psql    |             | bench    | Lock / relation      | 28 Aug 2019 6:42:45 AM |
+-------+--------+---------+-------------+----------+----------------------+------------------------+
Other Waiting Backends:
+-------+--------+---------+-------------+----------+-----------------------+------------------------+
| PID   | User   | App     | Client Addr | Database | Wait                  | Query Start            |
+-------+--------+---------+-------------+----------+-----------------------+------------------------+
| 22066 | pgdash |         |             | pgdash   | Client / ClientRead   | 28 Aug 2019 6:40:06 AM |
| 27976 | pgdash |         |             | pgdash   | Client / ClientRead   | 28 Aug 2019 6:40:06 AM |
| 28174 | mdevan | psql    |             | bench    | Client / ClientRead   | 28 Aug 2019 6:41:47 AM |
| 28534 | mdevan | pgbench |             | bench    | LWLock / WALWriteLock | 28 Aug 2019 6:42:56 AM |
| 28536 | mdevan | pgbench |             | bench    | LWLock / WALWriteLock | 28 Aug 2019 6:42:56 AM |
| 28542 | mdevan | pgbench |             | bench    | LWLock / WALWriteLock | 28 Aug 2019 6:42:56 AM |
+-------+--------+---------+-------------+----------+-----------------------+------------------------+
Long Running (>60 sec) Transactions:
+-------+--------+------+-------------+----------+---------------------------------------+
| PID   | User   | App  | Client Addr | Database | Transaction Start                     |
+-------+--------+------+-------------+----------+---------------------------------------+
| 28174 | mdevan | psql |             | bench    | 28 Aug 2019 6:41:40 AM (1 minute ago) |
+-------+--------+------+-------------+----------+---------------------------------------+
Idling in Transaction:
+-------+--------+---------+-------------+----------+----------+------------------------+
| PID   | User   | App     | Client Addr | Database | Aborted? | State Change           |
+-------+--------+---------+-------------+----------+----------+------------------------+
| 28174 | mdevan | psql    |             | bench    | no       | 28 Aug 2019 6:41:47 AM |
| 28535 | mdevan | pgbench |             | bench    | no       | 28 Aug 2019 6:42:56 AM |
| 28540 | mdevan | pgbench |             | bench    | no       | 28 Aug 2019 6:42:56 AM |
+-------+--------+---------+-------------+----------+----------+------------------------+
Backends
What to monitor:
Backends waiting for locks
Backends that have been running for too long
Backends that are idling in transaction
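The three checks above can be expressed over rows shaped like `pg_stat_activity`. The following is a small sketch under assumptions: rows are plain dicts using the view's column names (`pid`, `state`, `wait_event_type`, `xact_start`), the 60-second threshold mirrors the sample report, and `problematic_backends` is a hypothetical helper, not a pgmetrics function.

```python
from datetime import datetime, timedelta


def problematic_backends(rows, now, max_xact_age=timedelta(seconds=60)):
    """Group backend PIDs by problem, given pg_stat_activity-like dicts."""
    report = {"waiting_on_locks": [], "xact_too_long": [], "idle_in_xact": []}
    for row in rows:
        # Backends blocked on a heavyweight lock report wait_event_type = 'Lock'.
        if row.get("wait_event_type") == "Lock":
            report["waiting_on_locks"].append(row["pid"])
        # xact_start is NULL (None) when no transaction is open.
        started = row.get("xact_start")
        if started is not None and now - started > max_xact_age:
            report["xact_too_long"].append(row["pid"])
        # Covers 'idle in transaction' and 'idle in transaction (aborted)'.
        if (row.get("state") or "").startswith("idle in transaction"):
            report["idle_in_xact"].append(row["pid"])
    return report
```

A backend can land in more than one group, as PID 28174 does in the sample report.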
WAL Files
WAL Files:
WAL Archiving? yes
WAL Files: 19
Ready Files: 0
Archive Rate: 1.20 per min
Last Archived: 27 Aug 2019 8:46:52 AM (13 seconds ago)
Last Failure:
Totals: 7265 succeeded, 0 failed
Totals Since: 23 Aug 2019 3:47:36 AM (4 days ago)
WAL Files
What to monitor:
Number of WAL files
Number of WAL files ready for archiving
Archiving failures
Rate of archiving
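The archive rate is the archived count divided by the time since the counters were last reset. A minimal sketch, assuming the count and reset timestamp come from `pg_stat_archiver` (`archived_count`, `stats_reset`); `rate_per_minute` is a hypothetical helper name.

```python
from datetime import datetime


def rate_per_minute(count: int, since: datetime, now: datetime) -> float:
    """Average number of events per minute between `since` and `now`."""
    elapsed_minutes = (now - since).total_seconds() / 60
    return count / elapsed_minutes if elapsed_minutes > 0 else 0.0


# Values from the sample report: 7265 files archived since 23 Aug 2019 3:47:36 AM,
# with the report taken 13 seconds after the last archive at 8:46:52 AM on 27 Aug.
rate = rate_per_minute(
    7265,
    since=datetime(2019, 8, 23, 3, 47, 36),
    now=datetime(2019, 8, 27, 8, 47, 5),
)
print(f"Archive rate: {rate:.2f} per min")
```

Comparing this rate with the rate at which WAL files are being generated indicates whether archiving is keeping up.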
BG Writer
Checkpoint Rate: 0.20 per min
Average Write: 6.0 MiB per checkpoint
Total Checkpoints: 1214 sched (100.0%) + 0 req (0.0%) = 1214
Total Write: 1.3 TiB, @ 3.7 MiB per sec
Buffers Allocated: 257126091 (1.9 TiB)
Buffers Written: 938583 chkpt (0.5%) + 84471609 bgw (48.4%) +
89281998 be (51.1%)
Clean Scan Stops: 769729
BE fsyncs: 0
Counts Since: 23 Aug 2019 3:47:36 AM (4 days ago)
BG Writer
What to monitor:
Checkpoint rate (compare with checkpoint_timeout)
Ratio of scheduled to requested checkpoints
Percentage of buffers written by the bgwriter
Stops of the bgwriter clean scan run
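The percentages in the sample report follow directly from the `pg_stat_bgwriter` counters. The sketch below assumes the column names of the PostgreSQL 10 view shown in the slides (`checkpoints_timed`, `checkpoints_req`, `buffers_checkpoint`, `buffers_clean`, `buffers_backend`); `bgwriter_summary` is a hypothetical helper.

```python
def pct(part: int, whole: int) -> float:
    """Percentage of `whole` represented by `part` (0.0 if `whole` is zero)."""
    return 100.0 * part / whole if whole else 0.0


def bgwriter_summary(checkpoints_timed, checkpoints_req,
                     buffers_checkpoint, buffers_clean, buffers_backend):
    """Derive the ratios worth monitoring from pg_stat_bgwriter counters."""
    checkpoints = checkpoints_timed + checkpoints_req
    buffers = buffers_checkpoint + buffers_clean + buffers_backend
    return {
        # A high share of requested checkpoints suggests they are being forced by WAL volume.
        "requested_checkpoint_pct": pct(checkpoints_req, checkpoints),
        "checkpoint_write_pct": pct(buffers_checkpoint, buffers),
        "bgwriter_write_pct": pct(buffers_clean, buffers),
        # Buffers that backends had to write out themselves.
        "backend_write_pct": pct(buffers_backend, buffers),
    }


# Counter values from the sample report.
summary = bgwriter_summary(1214, 0, 938583, 84471609, 89281998)
for name, value in summary.items():
    print(f"{name}: {value:.1f}%")
```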
Tablespaces
+------------+----------+---------------------------------------+---------+---------------------------+
| Name       | Owner    | Location                              | Size    | Disk Used                 |
+------------+----------+---------------------------------------+---------+---------------------------+
| pg_default | postgres | $PGDATA = /var/lib/postgresql/10/main | 1.5 GiB | 6.2 GiB (10.7%) of 58 GiB |
| pg_global  | postgres | $PGDATA = /var/lib/postgresql/10/main | 582 KiB | 6.2 GiB (10.7%) of 58 GiB |
| ts1        | postgres | /opt/ts1                              | 2.6 MiB | 6.2 GiB (10.7%) of 58 GiB |
+------------+----------+---------------------------------------+---------+---------------------------+
Tablespaces
What to monitor:
Disk usage per tablespace
Inode usage (typically not an issue)
Locks
Locks:
+---------------+-------------+-------+
| Lock Type     | Not Granted | Total |
+---------------+-------------+-------+
| relation      |           3 |    85 |
| transactionid |           4 |    17 |
| virtualxid    |           0 |    18 |
+---------------+-------------+-------+
|               |           7 |   120 |
+---------------+-------------+-------+
Locks
What to monitor:
Overall number of locks
‘relation’ locks (see also Blocked Queries at database level)
‘advisory’ locks
PostgreSQL Metrics
Database-Level
Database Overview
Database #4:
Name: bench
Owner: mdevan
Tablespace: pg_default
Connections: 1 (no max limit)
Frozen Xid Age: 149467276
Transactions: 159267223 (57.6%) commits,
117062056 (42.4%) rollbacks
Cache Hits: 98.1%
Rows Changed: ins 25.0%, upd 75.0%, del 0.0%
Total Temp: 0 B in 0 files
Problems: 0 deadlocks, 0 conflicts
Totals Since: 23 Aug 2019 3:47:39 AM (4 days ago)
Size: 1.4 GiB
Database Overview
What to monitor:
Number of connections
Transaction ID range
Cache efficiency
Commit ratio
Deadlock Count
Query Conflict Count (on standbys)
Size of database on-disk
Temporary files
Total data written into temporary files
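Commit ratio and cache efficiency are both simple ratios over `pg_stat_database` counters (`xact_commit` and `xact_rollback`, `blks_hit` and `blks_read`). A minimal sketch, with `ratio_pct` as a hypothetical helper and the block counts in the second call made up for illustration:

```python
from typing import Optional


def ratio_pct(good: int, bad: int) -> Optional[float]:
    """Percentage of `good` events out of all events, or None if there were none."""
    total = good + bad
    return 100.0 * good / total if total else None


# Commit ratio, using xact_commit and xact_rollback from the sample report.
commit_ratio = ratio_pct(159267223, 117062056)
print(f"Commit ratio: {commit_ratio:.1f}%")

# Cache efficiency, using illustrative (not real) blks_hit and blks_read values.
cache_hits = ratio_pct(981, 19)
print(f"Cache hits: {cache_hits:.1f}%")
```

Returning `None` for an idle database avoids reporting a misleading 0% when there is simply no activity to measure.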
Slow Queries
Slow Queries:
+-----------+----------+---------------+-----------+----------------------------------------------------+
|     Calls | Avg Time |    Total Time | Rows/Call | Query                                              |
+-----------+----------+---------------+-----------+----------------------------------------------------+
| 291994217 |      2ms | 197h0m57.062s |         1 | UPDATE pgbench_branches SET bbalance = bbalance +  |
| 291994221 |       0s | 60h30m15.222s |         1 | UPDATE pgbench_tellers SET tbalance = tbalance + $ |
| 291994222 |       0s | 33h17m46.907s |         1 | UPDATE pgbench_accounts SET abalance = abalance +  |
| 291994222 |       0s |   3h39m9.976s |         1 | SELECT abalance FROM pgbench_accounts WHERE aid =  |
| 291994217 |       0s |   2h9m23.344s |         1 | INSERT INTO pgbench_history (tid, bid, aid, delta, |
| 291994222 |       0s |    11m57.856s |         0 | BEGIN                                              |
| 291994217 |       0s |     8m54.985s |         0 | END                                                |
|      2392 |    207ms |     8m15.573s |        13 | SELECT current_database() AS db, schemaname, tab   |
|     16730 |      6ms |     1m44.987s |         1 | SELECT pg_database_size($1)                        |
|      2392 |     41ms |      1m38.98s |         1 | WITH pc AS (SELECT pubname, COUNT(*) AS c FROM pg_ |
|      2392 |     29ms |      1m10.38s |         2 | SELECT name, current_database(), COALESCE(default_ |
|      7170 |      9ms |     1m10.199s |         1 | SELECT pg_tablespace_size($1)                      |
|    327353 |       0s |       50.747s |         3 | SELECT a.attname, a.atttypid, a.atttyp             |
|    327353 |       0s |       33.834s |         1 | SELECT c.oid, c.relreplident FROM pg_catalog.pg_c  |
|      4780 |      6ms |       33.173s |         1 | SELECT COUNT(*) FROM pg_ls_dir($1) WHERE pg_ls_dir |
|    327353 |       0s |       25.619s |         0 | BEGIN READ ONLY ISOLATION LEVEL REPEATABLE READ    |
|      2390 |     10ms |       25.193s |         1 | SELECT archived_count, COALESCE(last_archived_     |
|      2389 |      8ms |        20.97s |        99 | SELECT userid, dbid, queryid, LEFT(query, $1), cal |
Slow Queries
What to monitor:
Average time taken and rows/call for specific queries
Blocked Queries
Blocked Query #1:
Query: TRUNCATE TABLE locktest;
Started By: psql mdevan/bench (PID 28565)
Waiting Since: 28 Aug 2019 6:42:26 AM (30 seconds ago)
Waiting For:
Query: LOCK TABLE locktest IN ACCESS EXCLUSIVE MODE;
Lock: relation, AccessExclusiveLock, table public.locktest
Started By: psql mdevan/bench (PID 28174)
Waiting For:
Query: SELECT * FROM locktest FOR UPDATE;
Lock: relation, AccessExclusiveLock, table public.locktest
Started By: psql mdevan/bench (PID 28525)
Blocked Query #2:
Query: CLUSTER VERBOSE locktest;
Started By: psql mdevan/bench (PID 28588)
Waiting Since: 28 Aug 2019 6:42:45 AM (11 seconds ago)
Waiting For:
Query: LOCK TABLE locktest IN ACCESS EXCLUSIVE MODE;
Lock: relation, AccessExclusiveLock, table public.locktest
Started By: psql mdevan/bench (PID 28174)
Waiting For:
Query: SELECT * FROM locktest FOR UPDATE;
Lock: relation, AccessExclusiveLock, table public.locktest
Started By: psql mdevan/bench (PID 28525)
Waiting For:
Query: TRUNCATE TABLE locktest;
Lock: relation, AccessExclusiveLock, table public.locktest
Started By: psql mdevan/bench (PID 28565)
Blocked Queries
What to monitor:
Number of blocked queries
Record query information for later analysis
Tables
Table #2 in "bench":
Name: bench.public.pgbench_tellers
Columns: 4
Manual Vacuums: never
Manual Analyze: never
Auto Vacuums: 7237, last 30 seconds ago
Auto Analyze: 6910, last 30 seconds ago
Post-Analyze: 0.0% est. rows modified
Row Estimate: 100.0% live of total 100
Rows Changed: ins 0.0%, upd 99.4%, del 0.0%
HOT Updates: 99.4% of all updates
Seq Scans: 93933255, 100.0 rows/scan
Idx Scans: 95608483, 1.0 rows/scan
Cache Hits: 100.0% (idx=94.5%)
Size: 744 KiB
Bloat: 704 KiB (94.6%)
Tables
What to monitor:
Time since last vacuum
Time since last analyze
HOT updates
Cache efficiency
Number of sequential scans
Size of the table on-disk
Bloat, both in bytes as well as a % of table size
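Tracking bloat both in bytes and as a percentage matters because a small table can be almost entirely bloat without being worth an alert. The sketch below uses the size-qualified rule quoted later in the talk ("bloat > 10% for tables of size 100 MiB or more"); the function names and default thresholds are illustrative assumptions, not pgDash settings.

```python
KIB = 1024
MIB = 1024 * KIB


def bloat_pct(size_bytes: int, bloat_bytes: int) -> float:
    """Bloat as a percentage of the table's on-disk size."""
    return 100.0 * bloat_bytes / size_bytes if size_bytes else 0.0


def should_alert(size_bytes: int, bloat_bytes: int,
                 min_size_bytes: int = 100 * MIB, max_bloat_pct: float = 10.0) -> bool:
    """Alert only when the table is large enough for its bloat to matter."""
    return size_bytes >= min_size_bytes and bloat_pct(size_bytes, bloat_bytes) > max_bloat_pct


# pgbench_tellers from the sample report: 744 KiB in size, 704 KiB of it bloat.
print(f"{bloat_pct(744 * KIB, 704 * KIB):.1f}% bloat")
print("alert:", should_alert(744 * KIB, 704 * KIB))
```

Here the sample table is heavily bloated in relative terms but far too small to trigger the rule.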
Indexes
+----------------------+-------+--------+----------------+------------+----------+----------------+-------------------+
| Index                | Type  | Size   | Bloat          | Cache Hits | Scans    | Rows Read/Scan | Rows Fetched/Scan |
+----------------------+-------+--------+----------------+------------+----------+----------------+-------------------+
| pgbench_tellers_pkey | btree | 27 MiB | 27 MiB (99.9%) | 94.5%      | 95608483 | 1.6            | 1.0               |
+----------------------+-------+--------+----------------+------------+----------+----------------+-------------------+
Indexes
What to monitor:
Unused indexes (same scan count over n days)
Cache efficiency
Size of the index on-disk
Bloat, both in bytes as well as a % of index size
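Detecting unused indexes by "same scan count over n days" amounts to comparing two snapshots of the per-index scan counter. A minimal sketch, assuming each snapshot is a dict mapping index name to its `idx_scan` value from `pg_stat_user_indexes`; `unused_indexes` is a hypothetical helper, and the index names in the example other than `pgbench_tellers_pkey` are made up.

```python
def unused_indexes(old_scans: dict, new_scans: dict) -> list:
    """Names of indexes whose scan counter did not change between two snapshots."""
    return sorted(
        name
        for name, count in new_scans.items()
        # Indexes created after the old snapshot are skipped: there is no baseline yet.
        if name in old_scans and old_scans[name] == count
    )


# Snapshots taken n days apart (illustrative values).
thirty_days_ago = {"pgbench_tellers_pkey": 95000000, "idx_old_report": 42}
today = {"pgbench_tellers_pkey": 95608483, "idx_old_report": 42, "idx_new": 0}
print(unused_indexes(thirty_days_ago, today))
```

A statistics reset between the two snapshots would distort the comparison, so a real check should also compare the reset timestamps.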
PostgreSQL Metrics
Replication
Outgoing Replication, Physical
Outgoing physical replication connection to a standby:
Destination #2:
User: repluser
Application: walreceiver
Client Address: 127.0.0.1/32
State: streaming
Started At: 23 Aug 2019 6:00:15 AM (10 minutes ago)
Sent LSN: 13E/DD556190
Written Until: 13E/DD556190 (no write lag)
Flushed Until: 13E/DD555D30 (flush lag = 1.1 KiB)
Replayed Until: 13E/DD555D30 (no replay lag)
Sync Priority: 0
Sync State: async
Outgoing Replication, Physical
Generic outgoing physical replication connection:
Destination #3:
User: mdevan
Application: pg_receivewal
Client Address:
State: streaming
Started At: 23 Aug 2019 4:32:35 AM (49 seconds ago)
Sent LSN: 13E/63643DC8
Written Until: 13E/636439B0 (write lag = 1.0 KiB)
Flushed Until: 13E/63000000 (flush lag = 6.3 MiB)
Replayed Until:
Sync Priority: 0
Sync State: async
Outgoing Replication, Physical
What to monitor:
Write lag
Flush lag
Replay lag
WAL sender state – "streaming", "catchup"
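The lag figures in the reports are differences between LSNs. An LSN such as `13E/DD556190` is a 64-bit WAL position written as two hexadecimal halves, so lag in bytes is plain subtraction once it is parsed. The sketch below does this client-side with the hypothetical helpers `parse_lsn` and `lag_bytes`; on the server, `pg_wal_lsn_diff()` performs the same calculation.

```python
def parse_lsn(lsn: str) -> int:
    """Convert an LSN like '13E/DD556190' into a 64-bit byte position."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def lag_bytes(ahead: str, behind: str) -> int:
    """Number of WAL bytes by which `behind` trails `ahead`."""
    return parse_lsn(ahead) - parse_lsn(behind)


# Sent vs. flushed LSN for the standby in the first sample report.
flush_lag = lag_bytes("13E/DD556190", "13E/DD555D30")
print(f"flush lag = {flush_lag / 1024:.1f} KiB")
```

Measuring lag in bytes works even when the primary is idle, whereas time-based lag only means something while WAL is being generated.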
Physical Replication Slots
Physical Replication Slots:
+-------------+--------+---------------+--------------+-----------+
| Name        | Active | Oldest Txn ID | Restart LSN  | Temporary |
+-------------+--------+---------------+--------------+-----------+
| repl_test_1 | yes    |               | 13E/63000000 | no        |
| standby1    | yes    |               | 13E/63643DC8 | no        |
+-------------+--------+---------------+--------------+-----------+
Physical Replication Slots
What to monitor:
Is the slot active or not?
Incoming Replication
Recovery Status:
Replay paused: no
Received LSN: 160/2D09CE0
Replayed LSN: 160/2D09CE0 (no lag)
Last Replayed Txn: 28 Aug 2019 4:52:39 AM (now)
Incoming Replication Stats:
Status: streaming
Received LSN: 160/2D09CE0 (started at 153/48000000, 51 GiB)
Timeline: 1 (was 1 at start)
Latency: 58us
Replication Slot: standby1
Incoming Replication
What to monitor:
Replay lag
Latency
Outgoing Replication, Logical
Outgoing logical replication connection:
Destination #1:
User: mdevan
Application: pg_recvlogical
Client Address:
State: streaming
Started At: 23 Aug 2019 3:47:39 AM (45 minutes ago)
Sent LSN: 13E/63643DC8
Written Until: 13E/636439B0 (write lag = 1.0 KiB)
Flushed Until: 13E/636439B0 (no flush lag)
Replayed Until:
Sync Priority: 0
Sync State: async
Outgoing Replication, Logical
What to monitor:
Write lag
Flush lag
Replay lag
Logical Replication Slots
Logical Replication Slots:
+-------------+---------------+----------+--------+---------------+--------------+---------------+-----------+
| Name        | Plugin        | Database | Active | Oldest Txn ID | Restart LSN  | Flushed Until | Temporary |
+-------------+---------------+----------+--------+---------------+--------------+---------------+-----------+
| repl_test_2 | test_decoding | bench    | yes    |               | 15A/A05A42A8 | 15A/A14270F8  | no        |
+-------------+---------------+----------+--------+---------------+--------------+---------------+-----------+
Logical Replication Slots
What to monitor:
Is the slot active or not?
Logical Replication Publication
This is at database level.
Logical Replication Publications:
+------------+-------------+---------------------------+--------+
| Name       | All Tables? | Propagate                 | Tables |
+------------+-------------+---------------------------+--------+
| pub_test_1 | yes         | inserts, updates, deletes | 5      |
+------------+-------------+---------------------------+--------+
Logical Replication Subscription
This is at database level.
Logical Replication Subscriptions:
Subscription #1:
Name: sub_test_1
Enabled? yes
Publications: 1
Tables: 5
Workers: 1
Received Until: 160/208CBED0
Logical Replication Subscription
What to monitor:
Disabled subscriptions
PostgreSQL Metrics
Other
Other
What to monitor:
vacuum jobs
disabled triggers
changes to configuration settings
changes to the list of roles & membership
system-level
maximum load average in the last 24 hours
maximum memory usage in the last 24 hours
disk IOPS and bandwidth (MB/s)
CPU usage (esp. user, iowait)
pgDash
We spun off our Postgres monitoring into its own product.
pgDash is what you’d do with the output of pgmetrics – dashboards,
diagnostics, query performance, alerting and more.
pgDash Features - https://pgdash.io
Diagnostics - Automatically analyze and call out situations like high
bloat, tables not vacuumed in a week, inactive replication slots
Query Performance - Queries executed during a time range
Blocked Queries - Historical information about locks and blocked
queries, including the SQL itself
Index Management - Looking for indexes not used in the last 30
days?
Alerts - Meaningful alerts: "bloat > 10% for tables of size 100 MiB
or more"
And More - Replication, In-Depth metrics about Tables and Indexes,
Tablespaces, Backends, ...
pgmetrics Roadmap
Features in the pipeline:
Collect data from log files
Deadlock details
auto_explain output to get query plans
vacuum job information
Collect/monitor data from more sources
AWS CloudWatch & AWS Enhanced Monitoring for RDS
PgBouncer (already available)
PgPool
Collect query plans directly..
from Postgres, maybe via an enhanced pg_stat_statements?
Thank you!
We’d love to hear your thoughts about pgmetrics!
pgmetrics Home https://pgmetrics.io
GitHub https://github.com/rapidloop/pgmetrics
pgDash https://pgdash.io
RapidLoop https://www.rapidloop.com
Me! mahadevan@rapidloop.com
Thanks for your time and have a great evening!
Mahadevan Ramachandran PostgreSQL Monitoring September 10, 2019 55 / 55

More Related Content

What's hot

PGConf APAC 2018 - A PostgreSQL DBAs Toolbelt for 2018
PGConf APAC 2018 - A PostgreSQL DBAs Toolbelt for 2018PGConf APAC 2018 - A PostgreSQL DBAs Toolbelt for 2018
PGConf APAC 2018 - A PostgreSQL DBAs Toolbelt for 2018PGConf APAC
 
Greenplum Kontained: Coordinating Many PostgreSQL Instances on Kubernetes: Cl...
Greenplum Kontained: Coordinating Many PostgreSQL Instances on Kubernetes: Cl...Greenplum Kontained: Coordinating Many PostgreSQL Instances on Kubernetes: Cl...
Greenplum Kontained: Coordinating Many PostgreSQL Instances on Kubernetes: Cl...VMware Tanzu
 
PGConf APAC 2018 - Tale from Trenches
PGConf APAC 2018 - Tale from TrenchesPGConf APAC 2018 - Tale from Trenches
PGConf APAC 2018 - Tale from TrenchesPGConf APAC
 
Transparent GPU Exploitation on Apache Spark with Kazuaki Ishizaki and Madhus...
Transparent GPU Exploitation on Apache Spark with Kazuaki Ishizaki and Madhus...Transparent GPU Exploitation on Apache Spark with Kazuaki Ishizaki and Madhus...
Transparent GPU Exploitation on Apache Spark with Kazuaki Ishizaki and Madhus...Databricks
 
PGConf APAC 2018 Keynote: PostgreSQL goes eleven
PGConf APAC 2018 Keynote: PostgreSQL goes elevenPGConf APAC 2018 Keynote: PostgreSQL goes eleven
PGConf APAC 2018 Keynote: PostgreSQL goes elevenPGConf APAC
 
Taming GC Pauses for Humongous Java Heaps in Spark Graph Computing-(Eric Kacz...
Taming GC Pauses for Humongous Java Heaps in Spark Graph Computing-(Eric Kacz...Taming GC Pauses for Humongous Java Heaps in Spark Graph Computing-(Eric Kacz...
Taming GC Pauses for Humongous Java Heaps in Spark Graph Computing-(Eric Kacz...Spark Summit
 
PostgreSQL on AWS: Tips & Tricks (and horror stories)
PostgreSQL on AWS: Tips & Tricks (and horror stories)PostgreSQL on AWS: Tips & Tricks (and horror stories)
PostgreSQL on AWS: Tips & Tricks (and horror stories)Alexander Kukushkin
 
Spark performance tuning - Maksud Ibrahimov
Spark performance tuning - Maksud IbrahimovSpark performance tuning - Maksud Ibrahimov
Spark performance tuning - Maksud IbrahimovMaksud Ibrahimov
 
Understanding Memory Management In Spark For Fun And Profit
Understanding Memory Management In Spark For Fun And ProfitUnderstanding Memory Management In Spark For Fun And Profit
Understanding Memory Management In Spark For Fun And ProfitSpark Summit
 
Spark meetup feb 2016
Spark meetup feb 2016Spark meetup feb 2016
Spark meetup feb 2016Todd Niven
 
Distributed Databases Deconstructed: CockroachDB, TiDB and YugaByte DB
Distributed Databases Deconstructed: CockroachDB, TiDB and YugaByte DBDistributed Databases Deconstructed: CockroachDB, TiDB and YugaByte DB
Distributed Databases Deconstructed: CockroachDB, TiDB and YugaByte DBYugabyteDB
 
RedisConf17 - Geofencing using Redis Geospatial Queries
RedisConf17 - Geofencing using Redis Geospatial QueriesRedisConf17 - Geofencing using Redis Geospatial Queries
RedisConf17 - Geofencing using Redis Geospatial QueriesRedis Labs
 
TuneIn: How to Get Your Hadoop/Spark Jobs Tuned While You’re Sleeping with Ma...
TuneIn: How to Get Your Hadoop/Spark Jobs Tuned While You’re Sleeping with Ma...TuneIn: How to Get Your Hadoop/Spark Jobs Tuned While You’re Sleeping with Ma...
TuneIn: How to Get Your Hadoop/Spark Jobs Tuned While You’re Sleeping with Ma...Databricks
 
PGConf.ASIA 2019 Bali - Partitioning in PostgreSQL - Amit Langote
PGConf.ASIA 2019 Bali -  Partitioning in PostgreSQL - Amit LangotePGConf.ASIA 2019 Bali -  Partitioning in PostgreSQL - Amit Langote
PGConf.ASIA 2019 Bali - Partitioning in PostgreSQL - Amit LangoteEqunix Business Solutions
 
The state of Spark in the cloud
The state of Spark in the cloudThe state of Spark in the cloud
The state of Spark in the cloudNicolas Poggi
 
Spark 2.x Troubleshooting Guide
Spark 2.x Troubleshooting GuideSpark 2.x Troubleshooting Guide
Spark 2.x Troubleshooting GuideIBM
 
GPU Support In Spark And GPU/CPU Mixed Resource Scheduling At Production Scale
GPU Support In Spark And GPU/CPU Mixed Resource Scheduling At Production ScaleGPU Support In Spark And GPU/CPU Mixed Resource Scheduling At Production Scale
GPU Support In Spark And GPU/CPU Mixed Resource Scheduling At Production ScaleSpark Summit
 
Apache Spark Performance Troubleshooting at Scale, Challenges, Tools, and Met...
Apache Spark Performance Troubleshooting at Scale, Challenges, Tools, and Met...Apache Spark Performance Troubleshooting at Scale, Challenges, Tools, and Met...
Apache Spark Performance Troubleshooting at Scale, Challenges, Tools, and Met...Databricks
 

What's hot (20)

PGConf APAC 2018 - A PostgreSQL DBAs Toolbelt for 2018
PGConf APAC 2018 - A PostgreSQL DBAs Toolbelt for 2018PGConf APAC 2018 - A PostgreSQL DBAs Toolbelt for 2018
PGConf APAC 2018 - A PostgreSQL DBAs Toolbelt for 2018
 
Greenplum Kontained: Coordinating Many PostgreSQL Instances on Kubernetes: Cl...
Greenplum Kontained: Coordinating Many PostgreSQL Instances on Kubernetes: Cl...Greenplum Kontained: Coordinating Many PostgreSQL Instances on Kubernetes: Cl...
Greenplum Kontained: Coordinating Many PostgreSQL Instances on Kubernetes: Cl...
 
PGConf APAC 2018 - Tale from Trenches
PGConf APAC 2018 - Tale from TrenchesPGConf APAC 2018 - Tale from Trenches
PGConf APAC 2018 - Tale from Trenches
 
Transparent GPU Exploitation on Apache Spark with Kazuaki Ishizaki and Madhus...
Transparent GPU Exploitation on Apache Spark with Kazuaki Ishizaki and Madhus...Transparent GPU Exploitation on Apache Spark with Kazuaki Ishizaki and Madhus...
Transparent GPU Exploitation on Apache Spark with Kazuaki Ishizaki and Madhus...
 
PGConf APAC 2018 Keynote: PostgreSQL goes eleven
PGConf APAC 2018 Keynote: PostgreSQL goes elevenPGConf APAC 2018 Keynote: PostgreSQL goes eleven
PGConf APAC 2018 Keynote: PostgreSQL goes eleven
 
Taming GC Pauses for Humongous Java Heaps in Spark Graph Computing-(Eric Kacz...
Taming GC Pauses for Humongous Java Heaps in Spark Graph Computing-(Eric Kacz...Taming GC Pauses for Humongous Java Heaps in Spark Graph Computing-(Eric Kacz...
Taming GC Pauses for Humongous Java Heaps in Spark Graph Computing-(Eric Kacz...
 
PostgreSQL on AWS: Tips & Tricks (and horror stories)
PostgreSQL on AWS: Tips & Tricks (and horror stories)PostgreSQL on AWS: Tips & Tricks (and horror stories)
PostgreSQL on AWS: Tips & Tricks (and horror stories)
 
Spark performance tuning - Maksud Ibrahimov
Spark performance tuning - Maksud IbrahimovSpark performance tuning - Maksud Ibrahimov
Spark performance tuning - Maksud Ibrahimov
 
Understanding Memory Management In Spark For Fun And Profit
Understanding Memory Management In Spark For Fun And ProfitUnderstanding Memory Management In Spark For Fun And Profit
Understanding Memory Management In Spark For Fun And Profit
 
Exploiting GPUs in Spark
Exploiting GPUs in SparkExploiting GPUs in Spark
Exploiting GPUs in Spark
 
Spark meetup feb 2016
PGConf.ASIA 2019 Bali - Modern PostgreSQL Monitoring & Diagnostics - Mahadevan Ramachandran

  • 2. Modern PostgreSQL Monitoring & Diagnostics
       Mahadevan Ramachandran, Co-founder & CEO, RapidLoop
       September 10, 2019
  • 3. Hello!
       Mahadevan Ramachandran, Co-founder & CEO, RapidLoop
       We build monitoring products:
         - OpsDash: server & service monitoring
         - pgDash: dedicated PostgreSQL monitoring
  • 4. In The Beginning...
       OpsDash was our first product. It does server and service monitoring, plus basic PostgreSQL monitoring. OpsDash itself uses Postgres internally.
       As we grew, we realized that we needed more in-depth monitoring for Postgres. But:
         - What should we collect? There was no standard list of metrics to collect.
         - How should we collect it? There was no standard way to collect that worked across Postgres versions, distros, etc.
         - Why should we collect each item, and how were we supposed to interpret it?
  • 5. What we needed
         - A modern system that collects all relevant information about a Postgres server into one place
         - Extract and store relevant metrics in a time-series database
         - Display the collected information in a rich set of dashboards, not just as a bunch of time-series graphs
         - More importantly: let algorithms look at all this information and perform diagnostics. For example:
             WAL file count increased steadily over the last 24 hours
             Look at transactions running for more than 24 hours: none
             Look at WAL archiving info: not failing, so what then?
             Look at replication slots: one of them has become inactive. Aha!
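The diagnostic walk-through on this slide can be sketched as a small rule chain. This is only an illustration in Python: the `Snapshot` fields and the `diagnose_wal_growth` function are hypothetical names invented here, not part of pgmetrics or pgDash, and the checks simply follow the order given on the slide.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Snapshot:
    """Hypothetical summary of collected data (all field names are assumptions)."""
    wal_file_counts_24h: List[int]        # WAL file count samples, oldest first
    longest_xact_hours: float = 0.0       # age of the longest-running transaction
    archive_failing: bool = False         # is WAL archiving currently failing?
    inactive_slots: List[str] = field(default_factory=list)  # inactive slot names


def steadily_increasing(samples: List[int]) -> bool:
    """True if the samples never decrease and end higher than they started."""
    if len(samples) < 2:
        return False
    never_drops = all(b >= a for a, b in zip(samples, samples[1:]))
    return never_drops and samples[-1] > samples[0]


def diagnose_wal_growth(s: Snapshot) -> str:
    """Walk the same checks as the slide, in the same order."""
    if not steadily_increasing(s.wal_file_counts_24h):
        return "WAL file count is not growing steadily"
    if s.longest_xact_hours > 24:
        return "transaction running for more than 24 hours"
    if s.archive_failing:
        return "WAL archiving is failing"
    if s.inactive_slots:
        return "inactive replication slot(s): " + ", ".join(s.inactive_slots)
    return "no cause found by these checks"
```

For the scenario on the slide, `diagnose_wal_growth(Snapshot([10, 12, 15, 20], inactive_slots=["standby1"]))` falls through the first three checks and reports the inactive slot (`standby1` is a made-up slot name).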
  • 6. So we built pgmetrics
       We did some research:
         - Started with our requirements
         - Actually read the Postgres documentation :-)
         - Looked at various metrics collection agents/plugins, check_nagios.pl
         - Read a few Postgres administration books
       and then built an open source tool called pgmetrics.
         - Added more features and tweaks based on user input
         - It now collects over 350 metrics, using over 50 different queries
  • 7. pgmetrics
         - Open source CLI tool, also usable as a library
         - Runs anywhere: written in Go, single statically linked binary, no dependencies; Windows, Linux, FreeBSD, macOS
         - No installation or Postgres extensions required, so it works with AWS RDS and Heroku too
         - Easy to use: exact same command-line arguments and environment variables as psql
         - Usable for scripting and automation: JSON and CSV outputs
         - Accommodates Postgres version differences: works with v9.3+
  • 8. How It Works
       pgmetrics collects information from:
         - Statistics collector views: pg_stat_archiver, pg_stat_bgwriter, pg_stat_replication, pg_stat_wal_receiver, pg_stat_activity, pg_stat_subscription, ...
         - Functions: pg_is_in_recovery(), pg_last_wal_receive_lsn(), pg_last_wal_replay_lsn(), pg_tablespace_size(), ...
         - System catalog views: pg_database, pg_tablespace, pg_class, ...
         - PSS extension: pg_stat_statements
         - Configuration settings: pg_settings
         - Bloat: using the query from check_nagios.pl
         - Filesystem: pg_ls_dir('pg_wal')
         - System metrics: /proc filesystem
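To make the sources listed above concrete, here is a minimal sketch of how a collector might query a few of them through any Python DB-API cursor. The `QUERIES` table and the `collect` function are illustrative names chosen here, and the SQL is not the set of queries pgmetrics itself runs. The sketch assumes Postgres 10 or later (where the WAL directory is named `pg_wal`) and a role allowed to call `pg_ls_dir`, which is restricted to superusers by default.

```python
# Illustrative queries only; these are NOT the queries pgmetrics itself runs.
QUERIES = {
    # Function: is this server a standby in recovery?
    "in_recovery": "SELECT pg_is_in_recovery()",
    # Statistics view: WAL archiving success and failure counters
    "archiver": "SELECT archived_count, failed_count FROM pg_stat_archiver",
    # Filesystem: number of entries in the WAL directory
    "wal_entries": "SELECT count(*) FROM pg_ls_dir('pg_wal')",
    # Statistics view: number of rows in pg_stat_activity
    "backends": "SELECT count(*) FROM pg_stat_activity",
}


def collect(cursor):
    """Run each query on a DB-API cursor and return {name: first row}."""
    results = {}
    for name, sql in QUERIES.items():
        cursor.execute(sql)
        results[name] = cursor.fetchone()
    return results
```

With a DB-API driver such as psycopg2, the cursor would come from an open connection; the function only relies on `execute` and `fetchone`, so it can also be exercised with a stub cursor.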
  • 9. Demo Time: Let’s have a look!
  • 10. PostgreSQL Metrics: Cluster-Level
      *cluster = a set of databases managed by a single postgres process
  • 11. Cluster Overview
      Name:                10/main
      Server Version:      10.10 (Ubuntu 10.10-1.pgdg18.04+1)
      Server Started:      23 Aug 2019 3:47:32 AM (24 minutes ago)
      System Identifier:   6685915216424112509
      Timeline:            1
      Last Checkpoint:     23 Aug 2019 4:07:36 AM (4 minutes ago)
      Prior LSN:           13E/4100F370
      REDO LSN:            13E/4103E9F8 (190 KiB since Prior)
      Checkpoint LSN:      13E/410411D8 (10 KiB since REDO)
      Transaction IDs:     1499764934 to 1640241289 (diff = 140476355)
      Notification Queue:  0.0% used
      Active Backends:     3 (max 100)
      Recovery Mode?       no
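The "KiB since" figures in this report are distances between LSNs. A minimal sketch of that arithmetic follows, relying on the fact that a textual LSN is two hexadecimal halves (the high and low 32 bits of a 64-bit WAL position). The helper names `lsn_to_int` and `lsn_diff` are made up for this example and are not part of pgmetrics.

```python
def lsn_to_int(lsn: str) -> int:
    """Convert a textual LSN such as '13E/4100F370' into a 64-bit byte position."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def lsn_diff(later: str, earlier: str) -> int:
    """Number of WAL bytes between two LSNs."""
    return lsn_to_int(later) - lsn_to_int(earlier)


# Values taken from the Cluster Overview above.
redo_since_prior = lsn_diff("13E/4103E9F8", "13E/4100F370")
checkpoint_since_redo = lsn_diff("13E/410411D8", "13E/4103E9F8")

print(round(redo_since_prior / 1024), "KiB")       # 190 KiB
print(round(checkpoint_since_redo / 1024), "KiB")  # 10 KiB
```

The same subtraction applies to the write, flush and replay lag figures in the replication slides later in the deck.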
  • 12. Cluster Overview
      What to monitor:
      • Transaction ID range
      • Time since last checkpoint
      • Number of (client) backends
      • Notification queue usage
      • Time since server start
  • 13. Backends
      Backends:
          Total Backends:  17 (17.0% of max 100)
          Problematic:     5 waiting on locks, 6 waiting on other, 1 xact too long, 3 idle in xact

      Waiting for Locks:
      +-------+--------+---------+-------------+----------+----------------------+------------------------+
      | PID   | User   | App     | Client Addr | Database | Wait                 | Query Start            |
      +-------+--------+---------+-------------+----------+----------------------+------------------------+
      | 28525 | mdevan | psql    |             | bench    | Lock / relation      | 28 Aug 2019 6:42:13 AM |
      | 28539 | mdevan | pgbench |             | bench    | Lock / transactionid | 28 Aug 2019 6:42:56 AM |
      | 28541 | mdevan | pgbench |             | bench    | Lock / transactionid | 28 Aug 2019 6:42:56 AM |
      | 28565 | mdevan | psql    |             | bench    | Lock / relation      | 28 Aug 2019 6:42:26 AM |
      | 28588 | mdevan | psql    |             | bench    | Lock / relation      | 28 Aug 2019 6:42:45 AM |
      +-------+--------+---------+-------------+----------+----------------------+------------------------+

      Other Waiting Backends:
      +-------+--------+---------+-------------+----------+-----------------------+------------------------+
      | PID   | User   | App     | Client Addr | Database | Wait                  | Query Start            |
      +-------+--------+---------+-------------+----------+-----------------------+------------------------+
      | 22066 | pgdash |         |             | pgdash   | Client / ClientRead   | 28 Aug 2019 6:40:06 AM |
      | 27976 | pgdash |         |             | pgdash   | Client / ClientRead   | 28 Aug 2019 6:40:06 AM |
      | 28174 | mdevan | psql    |             | bench    | Client / ClientRead   | 28 Aug 2019 6:41:47 AM |
      | 28534 | mdevan | pgbench |             | bench    | LWLock / WALWriteLock | 28 Aug 2019 6:42:56 AM |
      | 28536 | mdevan | pgbench |             | bench    | LWLock / WALWriteLock | 28 Aug 2019 6:42:56 AM |
      | 28542 | mdevan | pgbench |             | bench    | LWLock / WALWriteLock | 28 Aug 2019 6:42:56 AM |
      +-------+--------+---------+-------------+----------+-----------------------+------------------------+
      (continued..)
  • 14. Backends
      Long Running (>60 sec) Transactions:
      +-------+--------+------+-------------+----------+---------------------------------------+
      | PID   | User   | App  | Client Addr | Database | Transaction Start                     |
      +-------+--------+------+-------------+----------+---------------------------------------+
      | 28174 | mdevan | psql |             | bench    | 28 Aug 2019 6:41:40 AM (1 minute ago) |
      +-------+--------+------+-------------+----------+---------------------------------------+

      Idling in Transaction:
      +-------+--------+---------+-------------+----------+----------+------------------------+
      | PID   | User   | App     | Client Addr | Database | Aborted? | State Change           |
      +-------+--------+---------+-------------+----------+----------+------------------------+
      | 28174 | mdevan | psql    |             | bench    | no       | 28 Aug 2019 6:41:47 AM |
      | 28535 | mdevan | pgbench |             | bench    | no       | 28 Aug 2019 6:42:56 AM |
      | 28540 | mdevan | pgbench |             | bench    | no       | 28 Aug 2019 6:42:56 AM |
      +-------+--------+---------+-------------+----------+----------+------------------------+
  • 15. Backends
      What to monitor:
      • Backends waiting for locks
      • Backends whose transactions have been running for too long
      • Backends that are idling in transaction
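The backend categories above can be derived from the `wait_event_type`, `state` and `xact_start` columns of `pg_stat_activity`. The following is a rough sketch rather than the pgmetrics logic: the function name, the category labels and the 60-second default are assumptions made for illustration, and rows are modelled as plain dictionaries.

```python
from datetime import datetime, timedelta


def classify_backend(row: dict, now: datetime,
                     max_xact: timedelta = timedelta(seconds=60)) -> set[str]:
    """Return the problem categories that apply to one pg_stat_activity row."""
    problems = set()
    wait_type = row.get("wait_event_type")
    if wait_type == "Lock":
        problems.add("waiting on lock")
    elif wait_type:
        problems.add("waiting on other")
    if row.get("state") in ("idle in transaction", "idle in transaction (aborted)"):
        problems.add("idle in xact")
    xact_start = row.get("xact_start")
    if xact_start is not None and now - xact_start > max_xact:
        problems.add("xact too long")
    return problems


# PID 28174 from the tables above: waiting on Client / ClientRead,
# idle in transaction, transaction open since 6:41:40 AM.
row = {
    "pid": 28174,
    "wait_event_type": "Client",
    "state": "idle in transaction",
    "xact_start": datetime(2019, 8, 28, 6, 41, 40),
}
now = datetime(2019, 8, 28, 6, 42, 56)  # assumed report time
print(sorted(classify_backend(row, now)))
```

A single backend can land in several categories at once, which is why PID 28174 shows up in three of the tables on the previous two slides.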
  • 16. WAL Files
      WAL Files:
          WAL Archiving?  yes
          WAL Files:      19
          Ready Files:    0
          Archive Rate:   1.20 per min
          Last Archived:  27 Aug 2019 8:46:52 AM (13 seconds ago)
          Last Failure:
          Totals:         7265 succeeded, 0 failed
          Totals Since:   23 Aug 2019 3:47:36 AM (4 days ago)
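The archive rate shown here is consistent with dividing the cumulative archived count by the time elapsed since the counters were last reset. The sketch below makes that assumption explicit. The actual pgmetrics calculation may differ, and `per_minute_rate` is a name invented for this example.

```python
from datetime import datetime


def per_minute_rate(count: int, since: datetime, now: datetime) -> float:
    """Average events per minute between `since` and `now`."""
    minutes = (now - since).total_seconds() / 60
    if minutes <= 0:
        raise ValueError("'now' must be later than 'since'")
    return count / minutes


since = datetime(2019, 8, 23, 3, 47, 36)  # "Totals Since"
now = datetime(2019, 8, 27, 8, 47, 5)     # "Last Archived" plus 13 seconds
rate = per_minute_rate(7265, since, now)
print(f"{rate:.2f} per min")  # 1.20 per min
```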
  • 17. WAL Files
      What to monitor:
      • Number of WAL files
      • Number of WAL files ready for archiving
      • Archiving failures
      • Rate of archiving
  • 18. BG Writer
      Checkpoint Rate:    0.20 per min
      Average Write:      6.0 MiB per checkpoint
      Total Checkpoints:  1214 sched (100.0%) + 0 req (0.0%) = 1214
      Total Write:        1.3 TiB, @ 3.7 MiB per sec
      Buffers Allocated:  257126091 (1.9 TiB)
      Buffers Written:    938583 chkpt (0.5%) + 84471609 bgw (48.4%) + 89281998 be (51.1%)
      Clean Scan Stops:   769729
      BE fsyncs:          0
      Counts Since:       23 Aug 2019 3:47:36 AM (4 days ago)
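The percentages on the "Buffers Written" line are each writer's share of the total buffers written by the checkpointer, the background writer and the backends. A small sketch of that calculation, with a made-up function name, looks like this:

```python
def buffer_write_shares(checkpoint: int, bgwriter: int, backend: int) -> dict[str, float]:
    """Percentage of written buffers attributable to each writer, to one decimal."""
    counts = {"chkpt": checkpoint, "bgw": bgwriter, "be": backend}
    total = sum(counts.values())
    if total == 0:
        return {name: 0.0 for name in counts}
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


# Counts taken from the BG Writer report above.
shares = buffer_write_shares(938583, 84471609, 89281998)
print(shares)
```

In this sample more than half of the buffers were written by the backends themselves, which is the situation the "percentage of buffers written by the bgwriter" metric is meant to expose.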
  • 19. BG Writer
      What to monitor:
      • Checkpoint rate (compare with checkpoint_timeout)
      • Ratio of scheduled to requested checkpoints
      • Percentage of buffers written by the bgwriter
      • Stops of the bgwriter clean scan run
  • 20. Tablespaces
      +------------+----------+---------------------------------------+---------+---------------------------+
      | Name       | Owner    | Location                              | Size    | Disk Used                 |
      +------------+----------+---------------------------------------+---------+---------------------------+
      | pg_default | postgres | $PGDATA = /var/lib/postgresql/10/main | 1.5 GiB | 6.2 GiB (10.7%) of 58 GiB |
      | pg_global  | postgres | $PGDATA = /var/lib/postgresql/10/main | 582 KiB | 6.2 GiB (10.7%) of 58 GiB |
      | ts1        | postgres | /opt/ts1                              | 2.6 MiB | 6.2 GiB (10.7%) of 58 GiB |
      +------------+----------+---------------------------------------+---------+---------------------------+
  • 21. Tablespaces
      What to monitor:
      • Disk usage per tablespace
      • Inode usage (typically not an issue)
  • 22. Locks
      Locks:
      +---------------+-------------+-------+
      | Lock Type     | Not Granted | Total |
      +---------------+-------------+-------+
      | relation      | 3           | 85    |
      | transactionid | 4           | 17    |
      | virtualxid    | 0           | 18    |
      +---------------+-------------+-------+
      |               | 7           | 120   |
      +---------------+-------------+-------+
  • 23. Locks
      What to monitor:
      • Overall number of locks
      • 'relation' locks (see also Blocked Queries at database level)
      • 'advisory' locks
  • 24. PostgreSQL Metrics: Database-Level
  • 25. Database Overview
      Database #4:
          Name:            bench
          Owner:           mdevan
          Tablespace:      pg_default
          Connections:     1 (no max limit)
          Frozen Xid Age:  149467276
          Transactions:    159267223 (57.6%) commits, 117062056 (42.4%) rollbacks
          Cache Hits:      98.1%
          Rows Changed:    ins 25.0%, upd 75.0%, del 0.0%
          Total Temp:      0 B in 0 files
          Problems:        0 deadlocks, 0 conflicts
          Totals Since:    23 Aug 2019 3:47:39 AM (4 days ago)
          Size:            1.4 GiB
  • 26. Database Overview
      What to monitor:
      • Number of connections
      • Transaction ID range
      • Cache efficiency
      • Commit ratio
      • Deadlock count
      • Query conflict count (on standbys)
      • Size of database on-disk
      • Temporary files
      • Total data written into temporary files
  • 27. Slow Queries
      Slow Queries:
      +-----------+----------+---------------+-----------+----------------------------------------------------+
      | Calls     | Avg Time | Total Time    | Rows/Call | Query                                              |
      +-----------+----------+---------------+-----------+----------------------------------------------------+
      | 291994217 | 2ms      | 197h0m57.062s | 1         | UPDATE pgbench_branches SET bbalance = bbalance +  |
      | 291994221 | 0s       | 60h30m15.222s | 1         | UPDATE pgbench_tellers SET tbalance = tbalance + $ |
      | 291994222 | 0s       | 33h17m46.907s | 1         | UPDATE pgbench_accounts SET abalance = abalance +  |
      | 291994222 | 0s       | 3h39m9.976s   | 1         | SELECT abalance FROM pgbench_accounts WHERE aid =  |
      | 291994217 | 0s       | 2h9m23.344s   | 1         | INSERT INTO pgbench_history (tid, bid, aid, delta, |
      | 291994222 | 0s       | 11m57.856s    | 0         | BEGIN                                              |
      | 291994217 | 0s       | 8m54.985s     | 0         | END                                                |
      | 2392      | 207ms    | 8m15.573s     | 13        | SELECT current_database() AS db, schemaname, tab   |
      | 16730     | 6ms      | 1m44.987s     | 1         | SELECT pg_database_size($1)                        |
      | 2392      | 41ms     | 1m38.98s      | 1         | WITH pc AS (SELECT pubname, COUNT(*) AS c FROM pg_ |
      | 2392      | 29ms     | 1m10.38s      | 2         | SELECT name, current_database(), COALESCE(default_ |
      | 7170      | 9ms      | 1m10.199s     | 1         | SELECT pg_tablespace_size($1)                      |
      | 327353    | 0s       | 50.747s       | 3         | SELECT a.attname, a.atttypid, a.atttyp             |
      | 327353    | 0s       | 33.834s       | 1         | SELECT c.oid, c.relreplident FROM pg_catalog.pg_c  |
      | 4780      | 6ms      | 33.173s       | 1         | SELECT COUNT(*) FROM pg_ls_dir($1) WHERE pg_ls_dir |
      | 327353    | 0s       | 25.619s       | 0         | BEGIN READ ONLY ISOLATION LEVEL REPEATABLE READ    |
      | 2390      | 10ms     | 25.193s       | 1         | SELECT archived_count, COALESCE(last_archived_     |
      | 2389      | 8ms      | 20.97s        | 99        | SELECT userid, dbid, queryid, LEFT(query, $1), cal |
  • 28. Slow Queries
      What to monitor:
      • Average time taken and rows/call for specific queries
  • 29. Blocked Queries
      Blocked Query #1:
          Query:          TRUNCATE TABLE locktest;
          Started By:     psql mdevan/bench (PID 28565)
          Waiting Since:  28 Aug 2019 6:42:26 AM (30 seconds ago)
          Waiting For:
              Query:       LOCK TABLE locktest IN ACCESS EXCLUSIVE MODE;
              Lock:        relation, AccessExclusiveLock, table public.locktest
              Started By:  psql mdevan/bench (PID 28174)
          Waiting For:
              Query:       SELECT * FROM locktest FOR UPDATE;
              Lock:        relation, AccessExclusiveLock, table public.locktest
              Started By:  psql mdevan/bench (PID 28525)

      Blocked Query #2:
          Query:          CLUSTER VERBOSE locktest;
          Started By:     psql mdevan/bench (PID 28588)
          Waiting Since:  28 Aug 2019 6:42:45 AM (11 seconds ago)
          Waiting For:
              Query:       LOCK TABLE locktest IN ACCESS EXCLUSIVE MODE;
              Lock:        relation, AccessExclusiveLock, table public.locktest
              Started By:  psql mdevan/bench (PID 28174)
          Waiting For:
              Query:       SELECT * FROM locktest FOR UPDATE;
              Lock:        relation, AccessExclusiveLock, table public.locktest
              Started By:  psql mdevan/bench (PID 28525)
          Waiting For:
              Query:       TRUNCATE TABLE locktest;
              Lock:        relation, AccessExclusiveLock, table public.locktest
              Started By:  psql mdevan/bench (PID 28565)
  • 30. Blocked Queries
      What to monitor:
      • Number of blocked queries
      • Record query information for later analysis
  • 31. Tables
      Table #2 in "bench":
          Name:            bench.public.pgbench_tellers
          Columns:         4
          Manual Vacuums:  never
          Manual Analyze:  never
          Auto Vacuums:    7237, last 30 seconds ago
          Auto Analyze:    6910, last 30 seconds ago
          Post-Analyze:    0.0% est. rows modified
          Row Estimate:    100.0% live of total 100
          Rows Changed:    ins 0.0%, upd 99.4%, del 0.0%
          HOT Updates:     99.4% of all updates
          Seq Scans:       93933255, 100.0 rows/scan
          Idx Scans:       95608483, 1.0 rows/scan
          Cache Hits:      100.0% (idx=94.5%)
          Size:            744 KiB
          Bloat:           704 KiB (94.6%)
  • 32. Tables
      What to monitor:
      • Time since last vacuum
      • Time since last analyze
      • HOT updates
      • Cache efficiency
      • Number of sequential scans
      • Size of the table on-disk
      • Bloat, both in bytes and as a % of table size
  • 33. Indexes
      +----------------------+-------+--------+----------------+------------+----------+----------------+-------------------+
      | Index                | Type  | Size   | Bloat          | Cache Hits | Scans    | Rows Read/Scan | Rows Fetched/Scan |
      +----------------------+-------+--------+----------------+------------+----------+----------------+-------------------+
      | pgbench_tellers_pkey | btree | 27 MiB | 27 MiB (99.9%) | 94.5%      | 95608483 | 1.6            | 1.0               |
      +----------------------+-------+--------+----------------+------------+----------+----------------+-------------------+
  • 34. Indexes
      What to monitor:
      • Unused indexes (same scan count over n days)
      • Cache efficiency
      • Size of the index on-disk
      • Bloat, both in bytes and as a % of index size
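Because index scan counts are cumulative (the `idx_scan` column of `pg_stat_user_indexes`), "unused" means the counter did not move between two snapshots taken n days apart. A minimal sketch of that comparison follows; the function name and the second index name are hypothetical, and snapshots are modelled as plain name-to-count mappings.

```python
def unused_indexes(older: dict[str, int], newer: dict[str, int]) -> list[str]:
    """Indexes present in both snapshots whose cumulative scan count is unchanged."""
    return sorted(
        name for name, scans in newer.items()
        if name in older and older[name] == scans
    )


# Scan counts n days apart; "idx_hypothetical" is an invented example index.
older = {"pgbench_tellers_pkey": 95000000, "idx_hypothetical": 12}
newer = {"pgbench_tellers_pkey": 95608483, "idx_hypothetical": 12}
print(unused_indexes(older, newer))  # ['idx_hypothetical']
```

Indexes that only appear in the newer snapshot are skipped, since there is no earlier count to compare against. A statistics reset between the two snapshots would also need separate handling, which this sketch does not attempt.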
  • 35. PostgreSQL Metrics: Replication
  • 36. Outgoing Replication, Physical
      Outgoing physical replication connection to a standby:
      Destination #2:
          User:            repluser
          Application:     walreceiver
          Client Address:  127.0.0.1/32
          State:           streaming
          Started At:      23 Aug 2019 6:00:15 AM (10 minutes ago)
          Sent LSN:        13E/DD556190
          Written Until:   13E/DD556190 (no write lag)
          Flushed Until:   13E/DD555D30 (flush lag = 1.1 KiB)
          Replayed Until:  13E/DD555D30 (no replay lag)
          Sync Priority:   0
          Sync State:      async
  • 37. Outgoing Replication, Physical
      Generic outgoing physical replication connection:
      Destination #3:
          User:            mdevan
          Application:     pg_receivewal
          Client Address:
          State:           streaming
          Started At:      23 Aug 2019 4:32:35 AM (49 seconds ago)
          Sent LSN:        13E/63643DC8
          Written Until:   13E/636439B0 (write lag = 1.0 KiB)
          Flushed Until:   13E/63000000 (flush lag = 6.3 MiB)
          Replayed Until:
          Sync Priority:   0
          Sync State:      async
  • 38. Outgoing Replication, Physical
      What to monitor:
      • Write lag
      • Flush lag
      • Replay lag
      • WAL sender state: "streaming", "catchup"
  • 39. Physical Replication Slots
      Physical Replication Slots:
      +-------------+--------+---------------+--------------+-----------+
      | Name        | Active | Oldest Txn ID | Restart LSN  | Temporary |
      +-------------+--------+---------------+--------------+-----------+
      | repl_test_1 | yes    |               | 13E/63000000 | no        |
      | standby1    | yes    |               | 13E/63643DC8 | no        |
      +-------------+--------+---------------+--------------+-----------+
  • 40. Physical Replication Slots
      What to monitor:
      • Is the slot active or not?
  • 41. Incoming Replication
      Recovery Status:
          Replay paused:      no
          Received LSN:       160/2D09CE0
          Replayed LSN:       160/2D09CE0 (no lag)
          Last Replayed Txn:  28 Aug 2019 4:52:39 AM (now)

      Incoming Replication Stats:
          Status:            streaming
          Received LSN:      160/2D09CE0 (started at 153/48000000, 51 GiB)
          Timeline:          1 (was 1 at start)
          Latency:           58us
          Replication Slot:  standby1
  • 42. Incoming Replication
      What to monitor:
      • Replay lag
      • Latency
  • 43. Outgoing Replication, Logical
      Outgoing logical replication connection:
      Destination #1:
          User:            mdevan
          Application:     pg_recvlogical
          Client Address:
          State:           streaming
          Started At:      23 Aug 2019 3:47:39 AM (45 minutes ago)
          Sent LSN:        13E/63643DC8
          Written Until:   13E/636439B0 (write lag = 1.0 KiB)
          Flushed Until:   13E/636439B0 (no flush lag)
          Replayed Until:
          Sync Priority:   0
          Sync State:      async
  • 44. Outgoing Replication, Logical
      What to monitor:
      • Write lag
      • Flush lag
      • Replay lag
  • 45. Logical Replication Slots
      Logical Replication Slots:
      +-------------+---------------+----------+--------+---------------+--------------+---------------+-----------+
      | Name        | Plugin        | Database | Active | Oldest Txn ID | Restart LSN  | Flushed Until | Temporary |
      +-------------+---------------+----------+--------+---------------+--------------+---------------+-----------+
      | repl_test_2 | test_decoding | bench    | yes    |               | 15A/A05A42A8 | 15A/A14270F8  | no        |
      +-------------+---------------+----------+--------+---------------+--------------+---------------+-----------+
  • 46. Logical Replication Slots
      What to monitor:
      • Is the slot active or not?
  • 47. Logical Replication Publication
      This is at database level.
      Logical Replication Publications:
      +------------+-------------+---------------------------+--------+
      | Name       | All Tables? | Propagate                 | Tables |
      +------------+-------------+---------------------------+--------+
      | pub_test_1 | yes         | inserts, updates, deletes | 5      |
      +------------+-------------+---------------------------+--------+
  • 48. Logical Replication Subscription
      This is at database level.
      Logical Replication Subscriptions:
      Subscription #1:
          Name:            sub_test_1
          Enabled?         yes
          Publications:    1
          Tables:          5
          Workers:         1
          Received Until:  160/208CBED0
  • 49. Logical Replication Subscription
      What to monitor:
      • Disabled subscriptions
  • 50. PostgreSQL Metrics: Other
  • 51. Other
      What to monitor:
      • vacuum jobs
      • disabled triggers
      • changes to configuration settings
      • changes to the list of roles & membership
      • system-level:
          • maximum load average in the last 24 hours
          • maximum memory usage in the last 24 hours
          • disk IOPS and bandwidth (MB/s)
          • CPU usage (esp. user, iowait)
  • 52. pgDash
      We spun off our Postgres monitoring into its own product.
      pgDash is what you’d do with the output of pgmetrics: dashboards, diagnostics, query performance, alerting and more.
  • 53. pgDash Features - https://pgdash.io
      • Diagnostics: automatically analyze and call out situations like high bloat, tables not vacuumed in a week, inactive replication slots
      • Query Performance: queries executed during a time range
      • Blocked Queries: historical information about locks and blocked queries, including the SQL itself
      • Index Management: looking for indexes not used in the last 30 days?
      • Alerts: meaningful alerts, such as "bloat > 10% for tables of size 100 MiB or more"
      • And More: Replication, in-depth metrics about Tables and Indexes, Tablespaces, Backends, ...
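The example alert, "bloat > 10% for tables of size 100 MiB or more", combines a size floor with a percentage threshold so that tiny, heavily bloated tables do not raise noise. The following is a sketch of such a rule as a predicate. It is not pgDash's implementation, and the function and parameter names are invented for illustration.

```python
MIB = 1024 * 1024
KIB = 1024


def bloat_alert(size_bytes: int, bloat_bytes: int,
                min_size_bytes: int = 100 * MIB,
                max_bloat_pct: float = 10.0) -> bool:
    """True when a table is large enough to matter and its bloat exceeds the threshold."""
    if size_bytes <= 0 or size_bytes < min_size_bytes:
        return False
    return 100 * bloat_bytes / size_bytes > max_bloat_pct


# pgbench_tellers from the Tables slide: 744 KiB with 704 KiB (94.6%) bloat.
print(bloat_alert(744 * KIB, 704 * KIB))  # False: below the 100 MiB floor
# A hypothetical 200 MiB table with 30 MiB (15%) of bloat.
print(bloat_alert(200 * MIB, 30 * MIB))   # True
```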
  • 54. pgmetrics Roadmap
      Features in the pipeline:
      • Collect data from log files
          • Deadlock details
          • auto_explain output to get query plans
          • vacuum job information
      • Collect/monitor data from more sources
          • AWS CloudWatch & AWS Enhanced Monitoring for RDS
          • PgBouncer (already available)
          • PgPool
      • Collect query plans directly from Postgres, maybe via an enhanced pg_stat_statements?
  • 55. Thank you!
      We’d love to hear your thoughts about pgmetrics!
      • pgmetrics Home: https://pgmetrics.io
      • GitHub: https://github.com/rapidloop/pgmetrics
      • pgDash: https://pgdash.io
      • RapidLoop: https://www.rapidloop.com
      • Me! mahadevan@rapidloop.com
      Thanks for your time and have a great evening!