This document provides an overview of PostgreSQL architecture, transactions, connection pooling, monitoring, and tips. It discusses:
- PostgreSQL architecture including processes like the postmaster, background writer, and WAL writer.
- Transactions and concurrency using MVCC, with snapshots of data at a point in time and increasing transaction IDs for consistency.
- Connection pooling tools like PgPool and PgBouncer that help reuse connections and lower impact on the database.
- Monitoring options including Graphite, Zabbix, Grafana, Log Insight, and specific queries for stats, sessions, replication, checkpoints, caching, and queries.
- Tips such as analyzing index usage, identifying duplicate indexes, and finding missing indexes.
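The monitoring queries mentioned above mostly read PostgreSQL's built-in statistics views; a minimal sketch (view and column names follow the standard pg_stat catalogs in PostgreSQL 10+):

```sql
-- Current sessions and what they are doing
SELECT pid, state, wait_event_type, query
FROM pg_stat_activity
WHERE state <> 'idle';

-- Cache hit ratio per database (higher is better)
SELECT datname,
       round(blks_hit * 100.0 / nullif(blks_hit + blks_read, 0), 2) AS cache_hit_pct
FROM pg_stat_database;

-- Replication lag in bytes, as seen from the primary
SELECT client_addr, state,
       pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn) AS replay_lag_bytes
FROM pg_stat_replication;
```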
JSON is an important data type for transporting data between servers and is used by many modern applications. PostgreSQL has been at the forefront of bringing these capabilities into the hands of database users, and its JSONB data type allows for faster operations on JSON documents.
At this webinar we will look at:
- How to use JSON from applications
- How to store it in the database
- How to index JSON data
- Tips and tricks to optimize usage
The webinar closes with a review of the roadmap for new PostgreSQL JSON features and JSON standards compliance.
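A minimal sketch of the storage and indexing points above (table, column, and document contents are illustrative placeholders):

```sql
-- Store documents in a JSONB column and index them for containment queries
CREATE TABLE events (
    id      bigserial PRIMARY KEY,
    payload jsonb NOT NULL
);

CREATE INDEX events_payload_gin ON events USING gin (payload);

INSERT INTO events (payload)
VALUES ('{"type": "login", "user": {"id": 42, "name": "alice"}}');

-- The @> containment operator can use the GIN index
SELECT id, payload->'user'->>'name' AS user_name
FROM events
WHERE payload @> '{"type": "login"}';
```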
[Pgday.Seoul 2021] 2. Porting Oracle UDF and Optimization - PgDay.Seoul
The document discusses porting functions from Oracle to PostgreSQL and optimizing performance, including different function types in PostgreSQL like SQL functions and PL/pgSQL functions, as well as volatility categories. It also provides examples of test data created for use in examples and covers strategies for analyzing inefficient Oracle functions and improving them to leverage the PostgreSQL optimizer.
This document provides an introduction and overview of PostgreSQL, including its history, features, installation, usage and SQL capabilities. It describes how to create and manipulate databases, tables, views, and how to insert, query, update and delete data. It also covers transaction management, functions, constraints and other advanced topics.
This document provides an introduction to MongoDB, including what it is, why it may be used, and how its data model works. Some key points:
- MongoDB is a non-relational database that stores data in flexible, JSON-like documents rather than fixed schema tables.
- It offers advantages like dynamic schemas, embedding of related data, and fast performance at large scales.
- Data is organized into collections of documents, which can contain sub-documents to represent one-to-many relationships without joins.
- Queries use JSON-like syntax to search for patterns in documents, and indexes can improve performance.
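As a sketch of the document model and query syntax described above (a mongosh session; collection and field names are illustrative):

```javascript
// A document with an embedded sub-document and an embedded array:
// one-to-many data stored inline, no join needed
db.users.insertOne({
  name: "alice",
  address: { city: "Dallas", zip: "75201" },
  orders: [{ sku: "A1", qty: 2 }, { sku: "B7", qty: 1 }]
});

// JSON-like query syntax: match on a field inside a sub-document
db.users.find({ "address.city": "Dallas" });

// An index on the queried field avoids a full collection scan
db.users.createIndex({ "address.city": 1 });
```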
PostgreSQL (or Postgres) began its life in 1986 as POSTGRES, a research project of the University of California at Berkeley.
PostgreSQL isn't just relational; it's object-relational. This gives it some advantages over other open source SQL databases such as MySQL, MariaDB, and Firebird.
This presentation is for people who want to understand how PostgreSQL shares information among processes using shared memory. Topics covered include the internal data page format, usage of the shared buffers, locking methods, and various other shared memory data structures.
This document provides an overview of how to use various Oracle performance monitoring and diagnostic tools like ASH, AWR, and SQL Monitor to analyze and troubleshoot performance issues. It begins with introductions and background on the speaker. It then demonstrates how to generate and interpret reports from these tools using the Oracle Enterprise Manager console and command line. It provides examples of querying ASH data directly and using tools like Compare ADDM and SQL Monitor. The document aims to help users quickly understand performance problems by leveraging these built-in Oracle performance diagnostics.
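Querying ASH data directly, as the document demonstrates, can be as simple as aggregating recent samples (note that `v$active_session_history` requires the Oracle Diagnostics Pack license):

```sql
-- Top wait events over the last 15 minutes, sampled from ASH
SELECT session_state, event, COUNT(*) AS samples
FROM   v$active_session_history
WHERE  sample_time > SYSDATE - INTERVAL '15' MINUTE
GROUP  BY session_state, event
ORDER  BY samples DESC;
```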
This document provides an introduction to NoSQL and MongoDB. It discusses that NoSQL is a non-relational database management system that avoids joins and is easy to scale. It then summarizes the different flavors of NoSQL including key-value stores, graphs, BigTable, and document stores. The remainder of the document focuses on MongoDB, describing its structure, how to perform inserts and searches, features like map-reduce and replication. It concludes by encouraging the reader to try MongoDB themselves.
PostgreSQL Internals (1) for PostgreSQL 9.6 (English) - Noriyoshi Shinoda
This document provides an overview of PostgreSQL internals including its process and memory architecture, storage architecture, and file formats. It discusses topics like processes and signals, shared buffers, huge pages, checkpoints, WAL logs, the database directory structure, tablespaces, visibility maps, VACUUM behavior, online backups, and key configuration files. The document is intended for engineers using PostgreSQL and aims to help them better understand its internal workings.
Spencer Christensen
There are many aspects to managing an RDBMS. Some of these are handled by an experienced DBA, but there are a good many things that any sys admin should be able to take care of if they know what to look for.
This presentation will cover basics of managing Postgres, including creating database clusters, overview of configuration, and logging. We will also look at tools to help monitor Postgres and keep an eye on what is going on. Some of the tools we will review are:
* pgtop
* pg_top
* pgfouine
* check_postgres.pl
Check_postgres.pl is a great tool that can plug into your Nagios or Cacti monitoring systems, giving you even better visibility into your databases.
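As an illustration of that Nagios integration, a command and service definition might look like the following (paths, host names, and thresholds are placeholders; `backends` is one of check_postgres.pl's documented actions):

```
define command {
    command_name    check_pg_backends
    command_line    /usr/local/bin/check_postgres.pl --action=backends \
                    --host=$HOSTADDRESS$ --dbuser=nagios \
                    --warning=80 --critical=100
}

define service {
    use                 generic-service
    host_name           db01
    service_description PostgreSQL backends
    check_command       check_pg_backends
}
```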
Presentation that I gave as a guest lecture for a summer intensive development course at nod coworking in Dallas, TX. The presentation targets beginning web developers with little to no experience in databases, SQL, or PostgreSQL. I cover the creation of a database, creating records, reading/querying records, updating records, destroying records, joining tables, and a brief introduction to transactions.
This document provides an overview of PostgreSQL, including its history, capabilities, advantages over other databases, best practices, and references for further learning. PostgreSQL is an open source relational database management system that has been in development for over 30 years. It offers rich SQL support, high performance, ACID transactions, and extensive extensibility through features like JSON, XML, and embedded programming languages.
Indexes are references to documents that are efficiently ordered by key and maintained in a tree structure for fast lookup. They improve the speed of document retrieval, range scanning, ordering, and other operations by enabling the use of the index instead of a collection scan. While indexes improve query performance, they can slow down document inserts and updates since the indexes also need to be maintained. The query optimizer aims to select the best index for each query but can sometimes be overridden.
Working with JSON Data in PostgreSQL vs. MongoDB - ScaleGrid.io
In this post, we are going to show you tips and techniques on how to effectively store and index JSON data in PostgreSQL vs. MongoDB. Learn more in the blog post: https://scalegrid.io/blog/using-jsonb-in-postgresql-how-to-effectively-store-index-json-data-in-postgresql
This document discusses building dynamic web sites using databases. It begins by explaining that truly dynamic sites have content that changes over time, is customized for users, and can be automatically generated. It recommends using a database rather than storing content in files, as databases are faster, more efficient, and easier to manage when content grows large. The document then provides an overview of key database concepts like tables, fields, queries, and the relational structure. It gives an example of how a student database might be implemented and why a database is better than flat files for such an application. Finally, it discusses MySQL as a popular open-source database and shows basic concepts like connecting to the database, selecting a database, performing queries, and extracting records.
This document summarizes the results of an OLTP performance benchmark test comparing PostgreSQL and Oracle databases. The test used HammerDB to run the same workload against each database on a server with 2x8 core CPUs and 192GB RAM. With 8 vCPUs, Oracle was 2.6% faster, used 16% less CPU, and had 9.3% more transactions per minute than PostgreSQL. When scaled to 16 vCPUs, Oracle was 3.4% faster, used 12.3% less CPU and had 22.43% more transactions per minute.
An overview of techniques for defending against SQL Injection using Python tools. This slide deck was presented at the DC Python Meetup on October 4th, 2011 by Edgar Roman, Sr Director of Application Development at PBS
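The deck's central defense, parameterized queries, is easy to demonstrate; a minimal sketch using the stdlib sqlite3 module as a stand-in for a production driver (table and data are invented for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, role TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 'admin')")

# Untrusted input attempting a classic injection
user_input = "alice' OR '1'='1"

# BAD: string formatting splices the input into the SQL text itself
vulnerable = "SELECT role FROM users WHERE name = '%s'" % user_input
print(len(conn.execute(vulnerable).fetchall()))  # injection succeeds: 1 row leaks

# GOOD: a parameterized query treats the input as a plain value
safe = conn.execute("SELECT role FROM users WHERE name = ?", (user_input,))
print(len(safe.fetchall()))  # no match: 0 rows
```

The same placeholder style (with `%s` instead of `?`) applies to drivers like psycopg2; the key point is that the value never becomes part of the SQL text.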
Introduction to Oracle Data Guard Broker - Zohar Elkayam
This is an old deck I recently renewed for a customer session. It introduces the Oracle Data Guard broker feature: how to deploy it, how to use it, and what its benefits are.
This presentation is based on version 11g, but most of it also applies to Oracle 12c.
Agenda:
- Oracle Data Guard overview
- Data Guard broker introduction
- Configuring and using the Data Guard broker
- Live Demos
This document discusses using social media for business. It explains why social media is important, noting statistics about major platforms like Facebook, Twitter, YouTube and LinkedIn. It discusses choosing the right social media platforms for a business based on its audience. Platforms like Facebook, Twitter, LinkedIn, blogs, online communities and social bookmarking are compared in terms of their suitability for customer communication, brand exposure and search engine optimization. The document also provides tips for how to engage and behave on social media to develop business skills.
This document describes Jessica Aguilera's history with tennis and how the sport has helped her grow personally and professionally. Tennis has been an important part of her life since childhood and has given her opportunities such as becoming a departmental and national champion in her country and earning an athletic scholarship. Practicing tennis has also allowed her to stay fit, relieve stress, make friends, travel, and strengthen herself physically and mentally. Tennis has also taught her values such as discipline
1. Ayurveda views neurology through the lens of Vaayu (kinetic energy) which is seen as the fundamental energy underlying all bodily functions. Vaayu manifests individually as dosha prakriti (inherent constitution) which determines an individual's functional patterns.
2. During conception, as the metaphysical person and zygote unite, dosha prakriti is established based on the panchamahabhuta patterns present, with a Vaayu-dominant stage setting the individual's permanent tridoshic balance.
3. Neurological and rheumatic conditions are seen as later stages in a unified model of pathogenesis where inflammation progresses from Jvara
Analyzuj a Proveď ("Analyze and Execute") is the trade name of a financial analysis service for companies and municipalities.
The basic purpose of Analyzuj a Proveď is to give you concrete, working tools for increasing the performance of your companies, which in turn helps you meet your personal and professional goals.
Analyzuj a Proveď is a product of Edolo Consult s.r.o.
1) The document discusses solving absolute value inequalities by isolating the absolute value, setting up two inequalities with the original inequality and its reverse, and solving each inequality.
2) Examples show setting up the inequalities, solving them individually, and combining the solutions using "and" or "or" depending on whether one or both solutions are possible.
3) One example is always true since absolute values are positive and greater than a negative number, while another is always false since absolute values cannot be less than a negative number.
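The setup described above can be made concrete with short worked examples covering the "and" case, the "or" case, and the always-true/always-false cases:

```latex
|x - 3| < 5
\;\Longrightarrow\; -5 < x - 3 < 5 \quad \text{("and": both bounds hold)}
\;\Longrightarrow\; -2 < x < 8

|x + 1| > 4
\;\Longrightarrow\; x + 1 > 4 \;\text{ or }\; x + 1 < -4 \quad \text{("or" case)}
\;\Longrightarrow\; x > 3 \;\text{ or }\; x < -5

|x| > -2 \;\text{ is always true;}\qquad |x| < -2 \;\text{ is always false.}
```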
Digital storytelling is an engaging way for teachers to involve students in lessons through hands-on projects using digital media like photos, video, and music to create stories, which can increase information retention by 80% and give students a fun, creative way to learn and express themselves. Software like PowerPoint makes it easy for students to design digital stories they can share worldwide via the web.
This Truebridge workshop can teach you how content marketing can drive brand awareness and generate sales for banks and credit unions. People are looking for answers to their financial questions. They will overwhelmingly buy from the answer provider. See how your bank or CU can become that kind of valuable resource.
The document provides details about a student's foundation production portfolio for a psychological thriller film. It includes descriptions of the codes and conventions used for elements like camerawork, editing, characters, settings, and sound. The student discusses stereotypes portrayed, the intended audience, and potential pathways for distribution. Feedback from early test viewers praised the tension and ability to keep them interested. The student also reflects on what was learned from the continuity task to the final product, including improving filming and editing skills through practice.
This document contains a list of website addresses for government, religion, news, education, social media, and search engines, covering the official websites of state institutions such as the DPR, MPR, DPD, MK, and MA, popular news sites such as detik, okezone, kompas, liputan6, and vivanews, educational sites such as e-dukasi, pesonaedu, fisikanet, chem-istry, and ixl, and social media such as Facebook, Twitter, LinkedIn, M
The document summarizes a social media listening study conducted for Coca-Cola brands Coke Zero, Dasani, and VitaminWater over 6 months. Key findings for each brand are presented along with 3 top-line strategic insights. The approach involved using Sysomos tools and custom queries to analyze conversations. For Coke Zero, events were found to be content-rich opportunities but some comments questioned aspartame safety. For Dasani, the PlantBottle was discussed positively but consumers questioned the difference from tap water. For VitaminWater, many events were identified as engagement opportunities but the brands were sometimes compared. The insights recommend optimizing media based on purchase funnel priorities and social insights for improved ROI.
Professional products and skincare and body care programs BECOS by ALFAPARF Group MILANO and Danielle Laroche, exclusively at www.laduchesse.ro
Queens of the Stone Age have maintained an engaged online presence for over 18 years through various social media platforms and websites that guide fans to purchase music and merchandise. While they do not frequently post or interact on social media, frontman Joshua Homme is known for unconventional interactions with fans. Their consistent release of high quality music and media campaigns have helped Queens of the Stone Age find and retain a dedicated global fanbase.
On Friday the 8th and Saturday the 9th we are holding a big bike sale in the shop, with bikes from Rocky Mountain.
You will be able to test lots of awesome bikes.
Prices will be fantastic, with 40 to 50% off everything from hardtails and trail bikes to downhill bikes.
The document discusses tuning autovacuum in PostgreSQL. It provides an overview of autovacuum, how it helps prevent database bloat, and best practices for configuring autovacuum parameters like autovacuum_vacuum_threshold, autovacuum_analyze_threshold, autovacuum_naptime, and autovacuum_max_workers. It emphasizes regularly monitoring for bloat, configuring autovacuum appropriately based on table sizes and usage, and avoiding manual vacuuming when not needed.
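The per-table configuration the document recommends uses PostgreSQL's storage parameters; a sketch (table name and values are illustrative, not recommendations):

```sql
-- postgresql.conf style global settings:
--   autovacuum_naptime = 30s
--   autovacuum_max_workers = 4

-- Per-table overrides for a large, heavily updated table
ALTER TABLE orders SET (
    autovacuum_vacuum_scale_factor  = 0.01,  -- vacuum after ~1% dead rows
    autovacuum_vacuum_threshold     = 1000,
    autovacuum_analyze_scale_factor = 0.005
);
```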
Peeking into the Black Hole Called PL/pgSQL - the New PL Profiler / Jan Wieck - Ontico
The new PL profiler lets you easily get through the dark barrier that PL/pgSQL puts between tools like pgBadger and the queries you are looking for.
Query and schema tuning is tough enough by itself. But queries buried many call levels deep in PL/pgSQL functions make it torture. The reason is that the default monitoring tools (logs, pg_stat_activity, and pg_stat_statements) cannot penetrate into PL/pgSQL. All they report is that your query calling function X is slow. That is useful if function X has 20 lines of simple code. Not so useful if it calls other functions and the actual problem query is many call levels down in a dungeon of 100,000 lines of PL code.
Learn from the original author of PL/pgSQL and current maintainer of the plprofiler extension how you can easily analyze, what is going on inside your PL code.
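Per the plprofiler README, a profiling session can be as simple as the following (database and function names are placeholders; exact CLI flags may differ by version):

```shell
# Install the extension in the target database, then profile one call
psql -d mydb -c "CREATE EXTENSION plprofiler;"
plprofiler run --dbname mydb --command "SELECT my_slow_func(42)" --output profile.html
```

The resulting HTML report breaks execution time down per PL/pgSQL statement, which is exactly the visibility that logs and pg_stat_statements cannot provide.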
Docker Logging and analysing with Elastic Stack - Jakub Hajek
Collecting logs from an entirely stateless environment is one of the challenging parts of the application lifecycle. Correlating business logs with operating system metrics to provide insights is crucial for the entire organization. What aspects should be considered while you design your logging solution?
Docker Logging and analysing with Elastic Stack - Jakub Hajek, PROIDEA
Collecting logs from an entirely stateless environment is one of the challenging parts of the application lifecycle. Correlating business logs with operating system metrics to provide insights is crucial for the entire organization. This technical presentation shows how to manage a large amount of data in a typical microservices environment.
The document summarizes the conceptual architecture of PostgreSQL. It describes PostgreSQL as having three main subsystems: the client interface, server processes, and database control, arranged in a client-server architecture. The postmaster process uses implicit invocation to handle connections from clients and launch server processes. Server processes employ a hybrid pipe-and-filter and repository architecture to process queries, and database control manages data storage and access using various independent subsystems.
Intro to Apache Apex - Next Gen Platform for Ingest and TransformApache Apex
Introduction to Apache Apex - The next generation native Hadoop platform. This talk will cover details about how Apache Apex can be used as a powerful and versatile platform for big data processing. Common usage of Apache Apex includes big data ingestion, streaming analytics, ETL, fast batch alerts, real-time actions, threat detection, etc.
Bio:
Pramod Immaneni is Apache Apex PMC member and senior architect at DataTorrent, where he works on Apache Apex and specializes in big data platform and applications. Prior to DataTorrent, he was a co-founder and CTO of Leaf Networks LLC, eventually acquired by Netgear Inc, where he built products in core networking space and was granted patents in peer-to-peer VPNs.
[Auto]Vacuum is a complex topic in PostgreSQL. All too often, we ignore tuning it and end up with big problems. Let's talk about how to tune [Auto]Vacuum strategically.
This technical presentation by Dave Thomas, Systems Engineer at EnterpriseDB, provides an overview of:
1) BGWriter/Writer Process
2) WAL Writer Process
3) Stats Collector Process
4) Autovacuum Launcher Process
5) Syslogger/Logger Process
6) Archiver Process
7) WAL Sender/Receiver Processes
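In PostgreSQL 10 and later, most of these background processes are visible directly from SQL, which makes the list above easy to verify on a live server:

```sql
SELECT pid, backend_type
FROM pg_stat_activity
WHERE backend_type <> 'client backend';
-- Typical rows include 'background writer', 'walwriter', 'checkpointer',
-- 'autovacuum launcher', and (on recent versions) 'archiver'.
```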
Kubernetes can orchestrate and manage container workloads through components like Pods, Deployments, DaemonSets, and StatefulSets. It schedules containers across a cluster based on resource needs and availability. Services enable discovery and network access to Pods, while ConfigMaps and Secrets allow injecting configuration and credentials into applications.
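A minimal sketch of how those pieces fit together: a Deployment that schedules Pods with resource requests, exposed through a Service (names and image are placeholders):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3
  selector:
    matchLabels: { app: web }
  template:
    metadata:
      labels: { app: web }
    spec:
      containers:
        - name: web
          image: nginx:1.25
          resources:
            requests: { cpu: 100m, memory: 128Mi }
---
apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  selector: { app: web }
  ports:
    - port: 80
```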
The document discusses strategic autovacuum configuration and monitoring in PostgreSQL. It begins by explaining the ACID properties and how MVCC and transactions work. It then discusses how to monitor workloads for heavily updated tables, adjust per-table autovacuum thresholds to prioritize those tables, monitor autovacuum behavior over time using logs and queries, and tune the autovacuum throttle settings based on that monitoring to optimize autovacuum performance. The key steps are to start with defaults, monitor workload changes, adjust settings for busy tables, continue monitoring, and refine settings as needed.
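The monitoring step described above usually starts with a query like this, which surfaces the heavily updated tables whose thresholds deserve per-table adjustment:

```sql
-- Which tables are accumulating dead rows, and when were they last vacuumed?
SELECT relname, n_live_tup, n_dead_tup,
       last_autovacuum, last_autoanalyze
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 10;
```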
Performance Scenario: Diagnosing and resolving sudden slow down on two node RAC - Kristofferson A
This document summarizes the steps taken to diagnose and resolve a sudden slow down issue affecting applications running on a two node Real Application Clusters (RAC) environment. The troubleshooting process involved systematically measuring performance at the operating system, database, and session levels. Key findings included high wait times and fragmentation issues on the network interconnect, which were resolved by replacing the network switch. Measuring performance using tools like ASH, AWR, and OS monitoring was essential to systematically diagnose the problem.
About a year ago I was caught in the line of fire when a production system abruptly started misbehaving:
- A batch process that used to finish in 15 minutes started taking 1.5 hours
- OLTP read queries on the standby started being cancelled
- We faced sudden slowness on the primary server and were forced to do a forceful switch to the standby.
We were able to figure out that some peculiarities of the application code and batch process were responsible for this. But we could not fix the application code (as it is packaged application).
In this talk I would like to share more details of how we debugged the issue, what problem we were facing, and how we applied a workaround for it. We also learnt that a query returning in 10 minutes may not be as dangerous as a query returning in 10 seconds but executed hundreds of times an hour.
I will share in detail-
- How to map the process/top stats from OS with pg_stat_activity
- How to get and read explain plan
- How to judge if a query is costly
- What tools helped us
- A peculiar autovacuum/vacuum vs. replication conflict we ran into
- Various parameters to tune the autovacuum and auto-analyze processes
- What we have done to work-around the problem
- What we have put in place for better monitoring and information gathering
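The first two bullets can be sketched directly: a PID taken from top can be looked up in pg_stat_activity, and a plan with timing and buffer usage can then be captured (the PID and query text are placeholders):

```sql
-- Map an OS process seen in top to its database session
SELECT pid, usename, state, wait_event, query
FROM pg_stat_activity
WHERE pid = 12345;

-- Capture the actual plan, timings, and buffer usage for the suspect query
EXPLAIN (ANALYZE, BUFFERS)
SELECT ...;  -- the query text taken from pg_stat_activity
```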
This was my second presentation given at MySQL User Camp (Bangalore), held on Nov 08, 2013. It covers the most commonly used Percona tools.
Presto is used to process 15 trillion rows per day for Treasure Data customers. Treasure Data developed tools to manage Presto performance and optimize queries. They collect Presto query logs to analyze performance bottlenecks and classify queries to set implicit service level objectives. Tools like Prestobase Proxy and Presto Stella storage optimizer were created to improve low-latency access and optimize storage partitioning. Workflows using DigDag and a new tabular data format called MessageFrame are being explored to split huge queries and support incremental processing.
Using Groovy? Got lots of stuff to do at the same time? Then you need to take a look at GPars (“Jeepers!”), a library providing support for concurrency and parallelism in Groovy. GPars brings powerful concurrency models from other languages to Groovy and makes them easy to use with custom DSLs:
- Actors (Erlang and Scala)
- Dataflow (Io)
- Fork/join (Java)
- Agent (Clojure agents)
In addition to this support, GPars integrates with standard Groovy frameworks like Grails and Griffon.
Background, comparisons to other languages, and motivating examples will be given for the major GPars features.
Parallel processing involves executing multiple tasks simultaneously using multiple cores or processors. It can provide performance benefits over serial processing by reducing execution time. When developing parallel applications, developers must identify independent tasks that can be executed concurrently and avoid issues like race conditions and deadlocks. Effective parallelization requires analyzing serial code to find optimization opportunities, designing and implementing concurrent tasks, and testing and tuning to maximize performance gains.
The document discusses analyzing database systems using a 3D method for performance analysis. It introduces the 3D method, which looks at performance from the perspectives of the operating system (OS), Oracle database, and applications. The 3D method provides a holistic view of the system that can help identify issues and direct solutions. It also covers topics like time-based analysis in Oracle, how wait events are classified, and having a diagnostic framework for quick troubleshooting using tools like the Automatic Workload Repository report.
Hadoop Summit SJ 2016: Next Gen Big Data Analytics with Apache ApexApache Apex
This is an overview of architecture with use cases for Apache Apex, a big data analytics platform. It comes with a powerful stream processing engine, rich set of functional building blocks and an easy to use API for the developer to build real-time and batch applications. Apex runs natively on YARN and HDFS and is used in production in various industries. You will learn more about two use cases: A leading Ad Tech company serves billions of advertising impressions and collects terabytes of data from several data centers across the world every day. Apex was used to implement rapid actionable insights, for real-time reporting and allocation, utilizing Kafka and files as source, dimensional computation and low latency visualization. A customer in the IoT space uses Apex for Time Series service, including efficient storage of time series data, data indexing for quick retrieval and queries at high scale and precision. The platform leverages the high availability, horizontal scalability and operability of Apex.
To understand an application’s performance, first you have to know what to measure. That’s the easy part. How do you take those measurements? Store them? Analyze them? Get them to the people who need them? Well, that’s where things get complicated, especially in the high-traffic distributed systems of the modern web! Like careful scientists, we must observe our subjects without altering them, and we must report our findings quickly so that we have the data necessary to make smart choices about the health and growth of the system.
Let’s explore the lessons learned by engineers at one of the world’s top web companies in their quest to find meaning at 5 MB/s. We’ll discuss the tools and techniques that enable the collection, indexing, and analysis of billions or more datapoints each hour, and learn how these same approaches can empower your applications and your business, no matter the scale.
1404 app dev series - session 8 - monitoring & performance tuningMongoDB
This document discusses MongoDB monitoring tools and key metrics. It provides an overview of tools like mongostat, the MongoDB shell, MMS, and mtools for monitoring operations per second, memory usage, page faults, and other metrics. It also discusses using logs to analyze query performance and disk saturation. The importance of monitoring queued readers/writers, page faults, background flush processes, memory usage, locks, and other core metrics is highlighted.
Similar to What you need to know for postgresql operation (20)
How to Build a Module in Odoo 17 Using the Scaffold MethodCeline George
Odoo provides an option for creating a module by using a single line command. By using this command the user can make a whole structure of a module. It is very easy for a beginner to make a module. There is no need to make each file manually. This slide will show how to create a module using the scaffold method.
বাংলাদেশের অর্থনৈতিক সমীক্ষা ২০২৪ [Bangladesh Economic Review 2024 Bangla.pdf] কম্পিউটার , ট্যাব ও স্মার্ট ফোন ভার্সন সহ সম্পূর্ণ বাংলা ই-বুক বা pdf বই " সুচিপত্র ...বুকমার্ক মেনু 🔖 ও হাইপার লিংক মেনু 📝👆 যুক্ত ..
আমাদের সবার জন্য খুব খুব গুরুত্বপূর্ণ একটি বই ..বিসিএস, ব্যাংক, ইউনিভার্সিটি ভর্তি ও যে কোন প্রতিযোগিতা মূলক পরীক্ষার জন্য এর খুব ইম্পরট্যান্ট একটি বিষয় ...তাছাড়া বাংলাদেশের সাম্প্রতিক যে কোন ডাটা বা তথ্য এই বইতে পাবেন ...
তাই একজন নাগরিক হিসাবে এই তথ্য গুলো আপনার জানা প্রয়োজন ...।
বিসিএস ও ব্যাংক এর লিখিত পরীক্ষা ...+এছাড়া মাধ্যমিক ও উচ্চমাধ্যমিকের স্টুডেন্টদের জন্য অনেক কাজে আসবে ...
This slide is special for master students (MIBS & MIFB) in UUM. Also useful for readers who are interested in the topic of contemporary Islamic banking.
This presentation includes basic of PCOS their pathology and treatment and also Ayurveda correlation of PCOS and Ayurvedic line of treatment mentioned in classics.
June 3, 2024 Anti-Semitism Letter Sent to MIT President Kornbluth and MIT Cor...Levi Shapiro
Letter from the Congress of the United States regarding Anti-Semitism sent June 3rd to MIT President Sally Kornbluth, MIT Corp Chair, Mark Gorenberg
Dear Dr. Kornbluth and Mr. Gorenberg,
The US House of Representatives is deeply concerned by ongoing and pervasive acts of antisemitic
harassment and intimidation at the Massachusetts Institute of Technology (MIT). Failing to act decisively to ensure a safe learning environment for all students would be a grave dereliction of your responsibilities as President of MIT and Chair of the MIT Corporation.
This Congress will not stand idly by and allow an environment hostile to Jewish students to persist. The House believes that your institution is in violation of Title VI of the Civil Rights Act, and the inability or
unwillingness to rectify this violation through action requires accountability.
Postsecondary education is a unique opportunity for students to learn and have their ideas and beliefs challenged. However, universities receiving hundreds of millions of federal funds annually have denied
students that opportunity and have been hijacked to become venues for the promotion of terrorism, antisemitic harassment and intimidation, unlawful encampments, and in some cases, assaults and riots.
The House of Representatives will not countenance the use of federal funds to indoctrinate students into hateful, antisemitic, anti-American supporters of terrorism. Investigations into campus antisemitism by the Committee on Education and the Workforce and the Committee on Ways and Means have been expanded into a Congress-wide probe across all relevant jurisdictions to address this national crisis. The undersigned Committees will conduct oversight into the use of federal funds at MIT and its learning environment under authorities granted to each Committee.
• The Committee on Education and the Workforce has been investigating your institution since December 7, 2023. The Committee has broad jurisdiction over postsecondary education, including its compliance with Title VI of the Civil Rights Act, campus safety concerns over disruptions to the learning environment, and the awarding of federal student aid under the Higher Education Act.
• The Committee on Oversight and Accountability is investigating the sources of funding and other support flowing to groups espousing pro-Hamas propaganda and engaged in antisemitic harassment and intimidation of students. The Committee on Oversight and Accountability is the principal oversight committee of the US House of Representatives and has broad authority to investigate “any matter” at “any time” under House Rule X.
• The Committee on Ways and Means has been investigating several universities since November 15, 2023, when the Committee held a hearing entitled From Ivory Towers to Dark Corners: Investigating the Nexus Between Antisemitism, Tax-Exempt Universities, and Terror Financing. The Committee followed the hearing with letters to those institutions on January 10, 202
1. What you need to know for
PostgreSQL operation
Materials: https://orabase.org/owncloud/index.php/s/vdqNlAqNqIPLxil
2. Who Am I ?
Oracle OCP 12c
OCE 11g PT
OCE 11g RAC
Senior Specialist at RT Labs
The guy on the left ^_^
PostgreSQL 9.3 Associate
3. Presentation plan
1. Architecture of PostgreSQL
2. Transactions and Concurrency, MVCC
3. Connection Pooling (pgpool, pgbouncer)
4. Tips & Tricks + Monitoring
4. Architectural Summary :
• PostgreSQL uses processes, not threads
• Postmaster process acts as supervisor
• Several utility processes perform background work
• postmaster starts them, restarts them if they die
• postmaster listens for new connections
6. Main Utility Processes:
• Background writer − Writes dirty data blocks to disk
• WAL writer − Flushes write-ahead log to disk
• Checkpointer process − Automatically performs a checkpoint based
on config parameters
• Autovacuum launcher − Starts Autovacuum workers as needed
• Autovacuum workers − Recover free space for reuse
• Stats Collector – collects runtime statistics
8. What is a Transaction?
• A transaction is set of statements bundled into a single step, all-or-
nothing operation
• A transaction must possess ACID properties:
• An all-or-nothing operation (Atomicity).
• Only valid data is written to the database (Consistency).
• The intermediate states between the steps are not visible to other concurrent
transactions (Isolation).
• Once the database reports a transaction as committed, its changes survive
subsequent crashes and power failures (Durability).
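The all-or-nothing behaviour described above can be shown in a few lines. A minimal sketch in Python, using the stdlib sqlite3 module purely so the example is self-contained; the transactional semantics it demonstrates are the same ones PostgreSQL provides:

```python
import sqlite3

# Atomicity sketch: two updates bundled into one transaction; a failure
# between them rolls back both. SQLite (stdlib) is used only to keep the
# example self-contained.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 0)")
conn.commit()

try:
    with conn:  # commits on success, rolls back on an exception
        conn.execute("UPDATE accounts SET balance = balance - 100 WHERE name = 'alice'")
        raise RuntimeError("simulated crash before the matching credit")
except RuntimeError:
    pass

# The debit was rolled back together with the rest of the transaction.
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
assert balances == {'alice': 100, 'bob': 0}
```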
9. 2. Transactions and Concurrency, MVCC
• Snapshot of data at a point in time.
• Updates, inserts and deletes cause the creation of a new row version.
Row version stored in same page.
• MVCC uses increasing transaction IDs to achieve consistency.
• Each row has 2 transaction ids: created and expired
• Queries check:
• the creating transaction id is committed and < the current transaction counter
• the row has no expiring transaction id, or the expiring transaction was still in
progress when the query started
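A toy sketch of the visibility rule just described; the function and the committed-set representation are illustrative, not PostgreSQL internals:

```python
# Toy model of the visibility rule above (illustrative, not PostgreSQL
# internals): each row version carries a creating xid and an optional
# expiring xid, and a query checks both against its own xid.

def row_visible(created_xid, expired_xid, current_xid, committed):
    """Visible iff the creator committed before us and no committed
    transaction expired the row before us."""
    created_ok = created_xid in committed and created_xid < current_xid
    expired = (expired_xid is not None
               and expired_xid in committed
               and expired_xid < current_xid)
    return created_ok and not expired

committed = {100, 101}
assert row_visible(100, None, 105, committed)       # live row: visible
assert not row_visible(100, 101, 105, committed)    # expired by xid 101
assert not row_visible(200, None, 105, committed)   # created "in the future"
```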
10. MVCC Maintenance
• MVCC creates multiple versions of a row for concurrency
• Old row versions can cause “bloat”
• Rows no longer needed are recovered for reuse/removed via
vacuuming or autovacuum
• To prevent transaction wraparound failure each table must be
vacuumed periodically
• PostgreSQL reserves a special XID as FrozenXID
• This XID is always considered older than every normal XID
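The wraparound problem comes from comparing 32-bit transaction IDs circularly; a rough Python sketch (real PostgreSQL uses signed 32-bit arithmetic and reserves a few special low XIDs, which this toy ignores):

```python
# Sketch of why wraparound matters: xids are 32 bits and compared
# circularly, so an xid "precedes" another if it lies in the 2^31 ids
# behind it. After ~2 billion transactions the counter wraps, and an
# unfrozen old xid would suddenly look like it is in the future; freezing
# replaces old xids with FrozenXID, defined to precede everything.

XID_MASK = 0xFFFFFFFF

def precedes(a, b):
    """Circular comparison: does xid a precede xid b?"""
    return ((b - a) & XID_MASK) < 2**31

assert precedes(100, 200)
assert precedes(2**32 - 10, 5)       # near wraparound: old xid still "older"
assert not precedes(5, 2**32 - 10)
```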
11. Presentation plan
1. Architecture of PostgreSQL
2. Transactions and Concurrency, MVCC
3. Connection Pooling (pgpool, pgbouncer)
4. Tips & Tricks + Monitoring
12. Pgpool
• At first, developed for connection pooling
• Replication Master/Slave mode
• Load balancing
• Automatic failover on desync detection
• Online recovery
• Parallel Query
14. PgBouncer
• Lightweight connection pooler for PostgreSQL
• Any application can connect to PgBouncer just as it would connect to
PostgreSQL
• PgBouncer helps lower the impact of connections on the
PostgreSQL server
• PgBouncer provides connection pooling, and thus reuses existing
connections
15. Types of Connections
• pgbouncer supports several types of pooling when rotating
connections:
• Session pooling − A server connection is assigned to the client application for
the life of the client connection.
• Transaction pooling − A server connection is assigned to the client application
for the duration of a transaction
• Statement pooling − A server connection is assigned to the client application
for each statement
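The reuse idea behind session pooling can be sketched in a few lines of Python; the Pool and Connection classes are hypothetical stand-ins, not PgBouncer's actual implementation:

```python
# Minimal sketch of connection reuse (hypothetical classes, not PgBouncer
# internals): connections are created lazily and returned to a free list
# instead of being closed when the client disconnects.

class Connection:
    _created = 0  # counts how many "real" server connections were opened
    def __init__(self):
        Connection._created += 1

class Pool:
    def __init__(self):
        self.free = []
    def acquire(self):
        # Reuse an idle server connection if one exists, else open a new one.
        return self.free.pop() if self.free else Connection()
    def release(self, conn):
        self.free.append(conn)

pool = Pool()
c1 = pool.acquire()   # first client: a real connection is opened
pool.release(c1)      # client logs off; connection goes back to the pool
c2 = pool.acquire()   # next client reuses it: no new backend process
assert c1 is c2
assert Connection._created == 1
```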
16. How Connections are Established
• An application connects to PgBouncer as if it were a PostgreSQL database
• PgBouncer then creates a connection to the actual database server, or it
reuses one of the existing connections from the pool
• Step 1: The client application attempts to connect to PostgreSQL on the port where
pgbouncer is running
• Step 2: The database name supplied by the client application must match an entry in
pgbouncer.ini
• Step 3: The user name and password supplied must match an entry in users.txt
• Step 4: If a connection with the same settings is available in the pool, it is assigned
to the client; otherwise a new connection object is created
• Step 5: Once the client logs off, the connection object is returned to the pool
17. Manage pgbouncer
• The show stats, servers, clients, pools, lists, databases, and fds commands can
be used.
• Manage pgbouncer by connecting to the special administration database
pgbouncer and issuing show help;
• $ psql -p 6543 -U someuser pgbouncer
• pgbouncer=# show help;
• NOTICE: Console usage
• DETAIL: SHOW [HELP|CONFIG|DATABASES|FDS|POOLS|CLIENTS|SERVERS|SOCKETS|LISTS|VERSION]
19. quick test with pgbouncer
• Connecting to the bouncer over local unix socket, it took 31s to
perform all the queries.
• Connecting to the bouncer over localhost, it took 45s to perform all
the queries.
• Connecting to the bouncer running on the remote server, it took
1m6s
• Without using pgbouncer, it took 3m34s
23. How Graphite is populated with data
• We wrote a function that gathers all the information in a single pass (still under
development)
• adm-get_stat_activity.sql
24. Autovacuum & DB activity monitoring
• autovacuum_count.Query=select count(*) from pg_stat_activity
where state = 'active' AND query LIKE 'autovacuum:%'
• autovacuum_max.Query=select coalesce(max(round(extract(epoch
FROM age(statement_timestamp(), state_change)))),0)
active_seconds from pg_stat_activity where state = 'active' AND
query LIKE 'autovacuum:%'
• xactcommit.Query=SELECT sum(xact_commit) FROM
pg_stat_database
• xactrollback.Query=SELECT sum(xact_rollback) FROM
pg_stat_database
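Since xact_commit and xact_rollback are cumulative counters, a monitoring agent turns them into TPS by differencing two samples and dividing by the sampling interval; a minimal sketch (sample values are made up):

```python
# pg_stat_database counters are cumulative since stats reset, so TPS is
# derived from two samples: (later - earlier) / sampling interval.
# The numbers below are made up for illustration.

def tps(prev_count, curr_count, interval_seconds):
    return (curr_count - prev_count) / interval_seconds

sample_t0 = 1_000_000          # sum(xact_commit + xact_rollback) at time t
sample_t1 = 1_000_600          # the same sum 30 seconds later
assert tps(sample_t0, sample_t1, 30) == 20.0  # 20 transactions per second
```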
28. Session activity monitoring
• active_session_cnt.Query=select count(*) from pg_stat_activity where state='active' and pid != pg_backend_pid()
• active_5s.Query=select count(*) from pg_stat_activity where state='active' and statement_timestamp() - state_change >
INTERVAL '5s' AND query not LIKE 'autovacuum:%'
• active_max.Query=select coalesce(abs(max(round(extract(epoch FROM age(statement_timestamp(), state_change))))),0)
active_max_seconds from pg_stat_activity where state='active' AND query not LIKE 'autovacuum:%'
• idle_session_cnt.Query=select count(*) from pg_stat_activity where state='idle'
• idle_in_trans_cnt.Query=select count(*) from pg_stat_activity where state like 'idle in trans%'
• idle_in_trans_5s.Query=select count(*) from pg_stat_activity where state like 'idle in trans%' and statement_timestamp() -
state_change > INTERVAL '5s'
• idle_in_trans_max.Query=select coalesce(max(round(extract(epoch FROM age(statement_timestamp(), state_change)))),0)
max_idle_in_trans from eyes.get_pg_stat_activity() where state like 'idle in trans%'
• waiting_session_cnt.Query=select count(*) from eyes.get_pg_stat_activity() where waiting is true
• waiting_session_5s.Query=select count(*) from pg_stat_activity where waiting is true and statement_timestamp() - state_change
> INTERVAL '5s'
• waiting_session_max.Query=select coalesce(abs(max(round(extract(epoch FROM age(statement_timestamp(),
state_change))))),0) waiting_max from pg_stat_activity where waiting is true
48. Top query by avg runtime
• select md5(query), calls, total_time, rows, shared_blks_hit, shared_blks_read,
(total_time/calls) as avg_time
from pg_stat_statements
order by avg_time desc limit 5;
51. pg_stat_kcache
SELECT datname, queryid, round(total_time::numeric, 2) AS total_time, calls,
pg_size_pretty((shared_blks_hit+shared_blks_read)*8192 - reads) AS memory_hit,
pg_size_pretty(reads) AS disk_read, pg_size_pretty(writes) AS disk_write,
round(user_time::numeric, 2) AS user_time, round(system_time::numeric, 2) AS system_time
FROM pg_stat_statements s
JOIN pg_stat_kcache() k USING (userid, dbid, queryid)
JOIN pg_database d ON s.dbid = d.oid
WHERE datname != 'postgres' AND datname NOT LIKE 'template%' ORDER BY total_time DESC LIMIT 10;
54. Top object in cache
SELECT c.relname ,
pg_size_pretty(count(*) * 8192) as buffered
, round(100.0 * count(*) / ( SELECT setting FROM pg_settings WHERE
name='shared_buffers')::integer,1) AS buffers_percent
, round(100.0 * count(*) * 8192 / pg_relation_size(c.oid),1) AS percent_of_relation
FROM pg_class c
JOIN pg_buffercache b ON b.relfilenode = c.relfilenode
JOIN pg_database d ON (b.reldatabase = d.oid AND d.datname = current_database())
WHERE pg_relation_size(c.oid) > 0 GROUP BY c.oid, c.relname
ORDER BY 3 DESC LIMIT 10;
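The percentages in the query above come from simple block arithmetic (each shared buffer holds one 8 KB page); restated as a small Python sketch with made-up numbers:

```python
# Each row in pg_buffercache is one 8 KB buffer, so an object's cached
# share follows directly from its buffer count. Values are illustrative.

BLCKSZ = 8192  # PostgreSQL default block size in bytes

def cache_stats(buffers_for_object, shared_buffers_total, relation_size_bytes):
    buffered_bytes = buffers_for_object * BLCKSZ
    buffers_percent = round(100.0 * buffers_for_object / shared_buffers_total, 1)
    percent_of_relation = round(100.0 * buffered_bytes / relation_size_bytes, 1)
    return buffered_bytes, buffers_percent, percent_of_relation

# An object holding 2048 buffers of a 16384-buffer (128 MB) cache,
# with a total relation size of 64 MB:
assert cache_stats(2048, 16384, 64 * 1024 * 1024) == (16777216, 12.5, 25.0)
```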
56. Top 20 unused indexes
SELECT relid::regclass AS table,
indexrelid::regclass AS index,
pg_size_pretty(pg_relation_size(indexrelid::regclass)) AS index_size,
idx_tup_read,
idx_tup_fetch,
idx_scan
FROM pg_stat_user_indexes
JOIN pg_index USING (indexrelid)
WHERE idx_scan = 0 AND indisunique IS FALSE
order by pg_relation_size(indexrelid::regclass) desc limit 20;
58. indexes on nulls
Select
pg_index.indrelid::regclass as table,
pg_index.indexrelid::regclass as index,
pg_attribute.attname as field,pg_statistic.stanullfrac,
pg_size_pretty(pg_relation_size(pg_index.indexrelid)) as indexsize,
pg_get_indexdef(pg_index.indexrelid) as indexdef
from pg_index
join pg_attribute ON pg_attribute.attrelid=pg_index.indrelid AND
pg_attribute.attnum=ANY(pg_index.indkey)
join pg_statistic ON pg_statistic.starelid=pg_index.indrelid AND
pg_statistic.staattnum=pg_attribute.attnum
where pg_statistic.stanullfrac>0.5
AND pg_relation_size(pg_index.indexrelid)>10*8192
order by pg_relation_size(pg_index.indexrelid) desc,1,2,3;
60. Duplicate indexes
SELECT pg_size_pretty(SUM(pg_relation_size(idx))::BIGINT) AS SIZE,
(array_agg(idx))[1] AS idx1, (array_agg(idx))[2] AS idx2,
(array_agg(idx))[3] AS idx3, (array_agg(idx))[4] AS idx4
FROM ( SELECT indexrelid::regclass AS idx, (indrelid::text ||E'\n'|| indclass::text ||E'\n'||
indkey::text ||E'\n'||
COALESCE(indexprs::text,'')||E'\n' || COALESCE(indpred::text,'')) AS KEY
FROM pg_index) sub
GROUP BY KEY HAVING COUNT(*)>1 ORDER BY SUM(pg_relation_size(idx)) DESC;
62. Missing index
SELECT relname, seq_scan-idx_scan AS too_much_seq, case when
seq_scan-idx_scan>0 THEN 'Missing Index?' ELSE 'OK' END,
pg_size_pretty(pg_relation_size(relname::regclass)) AS rel_size,
seq_scan, idx_scan FROM pg_stat_all_tables WHERE
schemaname='public' AND
pg_relation_size(relname::regclass)>10*1024*1024 ORDER BY
too_much_seq DESC nulls last;
65. Write activity
SELECT s.relname,
pg_size_pretty(pg_relation_size(relid)),
coalesce(n_tup_ins,0) + 2 * coalesce(n_tup_upd,0) - coalesce(n_tup_hot_upd,0) +
coalesce(n_tup_del,0) AS total_writes,
(coalesce(n_tup_hot_upd,0)::float * 100 / (CASE WHEN n_tup_upd > 0 THEN n_tup_upd
ELSE 1 END)::float)::numeric(10,2) AS hot_rate,
(SELECT v[1]
FROM regexp_matches(reloptions::text, e'fillfactor=(\\d+)') AS r(v) LIMIT 1) AS fillfactor
FROM pg_stat_all_tables s
JOIN pg_class c ON c.oid=relid
ORDER BY total_writes DESC LIMIT 50;
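The total_writes and hot_rate expressions in the query above, restated as plain Python so the weighting is explicit (counter values are made up):

```python
# Restatement of the write-activity arithmetic from the SQL above,
# with illustrative counter values.

def total_writes(n_tup_ins, n_tup_upd, n_tup_hot_upd, n_tup_del):
    # An ordinary UPDATE writes a new row version and touches indexes,
    # hence the 2x weight; HOT updates skip the index part.
    return n_tup_ins + 2 * n_tup_upd - n_tup_hot_upd + n_tup_del

def hot_rate(n_tup_upd, n_tup_hot_upd):
    # Share of updates that were HOT, guarding against division by zero.
    return round(100.0 * n_tup_hot_upd / max(n_tup_upd, 1), 2)

assert total_writes(n_tup_ins=100, n_tup_upd=50, n_tup_hot_upd=30, n_tup_del=20) == 190
assert hot_rate(n_tup_upd=50, n_tup_hot_upd=30) == 60.0
```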
67. Useful materials
• Postgrespro course (in Russian)
• PostgreSQL diagnostic techniques (Yandex)
• Deep dive into PostgreSQL statistics
• PostgreSQL meetup @ Avito (9.04.2016)
• What is HOT
Given recent events and the trend toward import substitution, the PostgreSQL database is rapidly gaining popularity.
Talk topic: what you need to know to operate Postgres.
Colleagues, good afternoon. By tradition, a little about myself: for almost two years I have worked at RT Labs as a senior specialist in the DBMS operations team.
I managed to pass the certifications: (click, click)
Expert in Oracle clusters, performance tuning, and so on; but in light of recent events Oracle's fate in Russia is uncertain, so I had to take up Postgres, and this talk is about what you need to know to operate it.
Click
I have since passed my first PostgreSQL certification.
Click
The guy on the left :) on the right is Ilya Kosmodemyansky (Postgrespro)
Presentation plan:
We'll start with the architecture (no way around it).
Then transaction isolation levels and concurrency control through multiversioning (MVCC, Multi-Version Concurrency Control).
Third, connection pooling implemented with external tools (pgpool, pgbouncer).
Finally, an overview of what is available in Postgres for analysis, and how to use it.
PostgreSQL uses processes, not threads; this means every user who connects to the database is served by a separate backend process.
All processes are spawned by a single process, the postmaster.
The main postmaster process starts the auxiliary processes that do the background work and restarts them if they die; if the restart fails, the postmaster shuts down the instance.
The postmaster also listens for new connection requests.
This slide shows the overall process and memory architecture.
You can also see the files; let's start with those:
Data files, where the data is stored.
WAL segments: the transaction logs.
Archived WAL: if configured, once a log segment is full and switched, it can be archived to a separate directory.
Error log: if configured, all errors, long-running transactions, and connection-establishment messages are written to the log.
In memory:
When the instance starts, it is allocated a shared memory segment.
It can be divided into:
Shared buffers: used for data file operations (the block cache).
WAL buffers: hold WAL records before they are flushed to disk.
Process array: since every user is a process, the database must maintain this list of processes, along with the locks those processes hold and other information needed for operation.
The auxiliary processes (bgwriter etc.) run in the background.
The main auxiliary processes, all running in the background:
Bgwriter writes dirty buffers to disk.
WAL writer writes data to the WAL on every commit.
Checkpointer (every 5 min, manual, or when WAL is full): the process wakes up and makes sure everything in memory is flushed to disk.
Autovacuum launcher: enabled by default (which is a good thing); it starts the autovacuum workers.
Autovacuum workers: do the work of cleaning up old row versions and gathering statistics.
Stats collector: an important process that collects runtime statistics; the settings that control how much is collected:
The parameter track_activities enables monitoring of the current command being executed by any server process.
The parameter track_counts controls whether statistics are collected about table and index accesses. (pg_stat_activity)
The parameter track_functions enables tracking of usage of user-defined functions.
The parameter track_io_timing enables monitoring of block read and write times.
An example of the running processes.
Let's walk through transactions and see how the ACID principle is implemented.
What is a transaction?
A transaction is a set of statements bundled into a single step, an all-or-nothing operation.
A transaction must satisfy the ACID properties:
Atomicity guarantees that no transaction is committed partially: either all of its sub-operations are performed, or none at all.
Consistency: a transaction that reaches its normal end (EOT, end of transaction) and thereby commits its results preserves the consistency of the database. In other words, every successful transaction by definition commits only valid results. This condition is necessary to support the fourth property.
Isolation: while a transaction is running, concurrent transactions must not influence its result.
Durability: regardless of problems at lower levels (for example, a power failure or hardware fault), the changes made by a successfully completed transaction must remain in place after the system comes back up. In other words, once the user has received confirmation from the system that the transaction completed, they can be sure the changes they made will not be undone by some failure.
How ACID is implemented in Postgres:
It is common for several users to work with the same data set; to provide concurrent access, snapshots are used.
Row versions are stored in the same page; MVCC uses an increasing transaction ID counter to provide consistency.
MVCC keeps multiple versions of each row to support concurrent access.
Old row versions can cause bloat; vacuum or autovacuum fights it by removing row versions that are no longer needed.
The MVCC machinery would be impossible without the transaction counter. Why this counter is still 32-bit is a mystery, but we have what we have: every 2 billion transactions the counter is due to wrap around. To avoid data loss, old rows are assigned a special reserved FrozenXID. When the transaction counter reaches the value set in the config, autovacuum starts with the lovely comment "to prevent wraparound". On large tables (tens of GB) this process can take hours, during which the table is unavailable for reads and writes.
The point is that Postgres creates a new process for every connection. To make database connections cheaper, modern libraries use a connection pool: they connect to the database once and then reuse that connection many times. If the database access library has no way to organize connection pools, pgpool and pgbouncer come to the rescue.
Pgpool is a good utility: it can run parallel queries, route read queries to the standbys and write queries to the master on its own, and can perform automatic failover.
But failover has to be configured by hand, so it requires expertise.
I didn't have it, and after we ended up with two masters in production we decided to drop pgpool.
More lightweight than pgpool.
The slide shows that after switching from pgpool to pgbouncer the CPU and memory load dropped.
Pgbouncer acts as an intermediate layer between client and server: the client connects to pgbouncer exactly as it would connect to the database, but when the client session ends pgbouncer does not close the connection to the database; it returns it to the pool of free connections.
This reduces the load that opening new connections places on the database.
Connection types:
Pgbouncer supports several connection types:
Click
Session-level pooling: a server connection is assigned to the client for the whole lifetime of the client connection.
Click
Operation-level pooling: a connection is given to the client for the duration of a single operation.
Click
We use session pooling.
Since third-party software may set session variables, which could affect other sessions, developers should connect to the database directly.
We'll skip this...
Skip.
For those interested in how all of this works under heavy load, here is an example from Avito, where they implemented a setup with pgbouncer placed on the load balancers and in front of the database.
There is a link to the talk.
How effective is it?
A small test.
We measured 50,000 `select now()` queries, establishing a connection for each query and closing it afterwards.
Results: with pgbouncer the test ran at least 3 times faster.
In this part we'll go through what we monitor (or plan to monitor), plus a set of useful scripts.
What can we look at...
:) it seems you can look at practically everything,
but sometimes even that is not enough.
We'll start with monitoring, the things plotted in charts; afterwards I'll show the queries you can use to pull out useful information.
Let's go in order:
Click
The main interest here is autovacuum activity and instance activity (the number of commits and rollbacks), also known as TPS.
These are Zabbix metrics: we watch the number of autovacuum sessions and the duration of the longest autovacuum operation, plus the number of commits and rollbacks, from which TPS is computed:
SELECT sum(xact_commit+xact_rollback) FROM pg_stat_database;
Autovacuum: we chart the session count plus the longest-running session.
TPS: commit and rollback counts are charted separately.
The chart shows the number of active or waiting sessions, including autovacuum, plus information about their duration.
This block is the most useful, since it holds information about the current (active or waiting) sessions, as well as query statistics.
Here we look at the number of sessions that are active, and active for more than 5 seconds,
as well as at the longest-running session.
We chart separately the sessions that are 'idle in transaction' or waiting on a lock.
At the moment each Zabbix metric is polled separately; for Grafana my colleague Alexander Samoylov implemented all of this as a function that makes a single query per view and returns a final data set, which is then rendered as charts; this is obviously more efficient.
Very useful information comes from the pg_stat_statements module.
It shows which queries were active in the database and when, but there is one nuance: statistics reach pg_stat_statements only after a query finishes.
We don't have this implemented at the moment, but there is an analogue (click).
Something similar is currently implemented using VMware's log analyzer; the picture is incomplete, since only queries longer than 200 ms end up in the log.
Buffer statistics and overall per-database activity statistics.
Number of connections to the database.
Number of tuples (roughly speaking, rows) returned.
Number of rows inserted/deleted.
For Zabbix you can add a temp-file usage metric.
Keep in mind that since we sample the metrics once every N seconds, the result must be divided by N to get operations per second.
Here we split RO and RW activity.
The same thing in Grafana.
In this block we monitor standby lag plus checkpoint process statistics.
At the moment this exists only in Grafana, since we started rolling out Zabbix just a couple of weeks ago.
The charts show replica lag over time and the amount of data that still needs to be shipped.
Timed checkpoints are normal; if requested checkpoints start appearing, that needs to be investigated.
In this block most of the information comes from the operating system: disk and network utilization, I/O operation counts, disk wait times.
I think everyone knows where to find these metrics, and how to visualize them.
We'll cover this part using concrete queries that can (and should) be used in day-to-day work.
For those with access to our Confluence there is a link to the document.
In this part we'll go through the extensions I find most interesting: pg_buffercache, pg_stat_statements, pg_stat_kcache.
Then I'll show a few queries that help assess what can be improved in the application architecture.
This module holds cumulative query statistics (since instance start, though they can be reset).
It lets you see the top queries.
There is a set of ready-made scripts from PostgreSQL-Consulting.
On top of this view it is easy to build reports like this one in various cuts (this one is by average execution time).
This example uses the blk_read_time and blk_write_time statistics, which are controlled by the track_io_timing parameter.
Let's start with pg_stat_kcache, which collects statistics on the actual reads and writes performed by the file system.
It requires 9.4.
And, unfortunately, it is not in contrib at the moment.
With this query we find the top queries by total execution time, now taking physical reads and writes into account.
pg_buffercache is used to determine what portion of a table or index is cached in shared buffers.
One row is shown for each buffer in the shared cache, so counting rows under the right conditions gives the total number of buffers occupied by an object.
Accessing the pg_buffercache view takes internal buffer-manager locks for long enough to copy all the buffer state data shown in the view; this can affect database performance if the view is queried frequently.
With this query we find the top 10 objects in the cache and compute their percentage of the total shared buffers and of the whole object's size.
Next come useful queries against the system catalog; this one finds indexes that are not used, plus their sizes.
Indexes on nulls, where the null fraction exceeds 50%.
Duplicate indexes.
HOT stands for Heap-Only Tuple, an attempt to solve some of the problems associated with frequently updated tables. This design optimizes updates when none of the indexed columns are modified and the new row version fits on the same page: the new version is stored on the same heap page and reached by following a pointer chain from the original tuple, so no new index entries are needed, and dead versions in the chain can later be pruned within the page.
Colleagues, if I don't manage to answer your questions (or can't), please send them to me; for my part, I promise to look into them and reply within a reasonable time.