Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

Introduction to PostgreSQL for System Administrators

3,795 views

Published on

Published in: Technology
  • Be the first to comment

Introduction to PostgreSQL for System Administrators

  1. 1. Introduction to PostgreSQL for System Administrators Jignesh Shah Staff Engineer, VMware Inc PGEast March 2011 – NYC
  2. 2. About MySelf Joined VMware in 2010  Database performance on vSphere Previously at Sun Microsystems from 2000-2010  Database Performance on Solaris/Sun Systems Work with PostgreSQL Performance Community  Scaling, Bottlenecks using various workloads My Blog: http://jkshah.blogspot.com © 2011 VMware Inc 2
  3. 3. Content Why OpenSource Databases? Why PostgreSQL? Quick Start to PostgreSQL PostgreSQL Internals PostgreSQL Monitoring © 2011 VMware Inc 3
  4. 4. Why Open Source Databases? License costs Need for “Just Enough” databases Source code access if vendor goes under or above Many available but currently two popular ones  MySQL – Great for simple, web-based workloads  PostgreSQL – Great for complex, enterprise transactional oriented workloads © 2011 VMware Inc 4
  5. 5. Why PostgreSQL? PostgreSQL is community driven  Consists of Core team, Committers, Contributors, Developers No one company owns it Great Mailing List Support Yearly releases Great strides in overall performance, replication in last few years License is more flexible than GPL (OEM, Embedded customers more welcome) © 2011 VMware Inc 5
  6. 6. Quick Start Guide for PostgreSQL Install PostgreSQL binaries or self build/install them Set $PGDATA variable to a valid directory path $ initdb $ pg_ctl start -l logfile $ createdb dbname $ psql dbname © 2011 VMware Inc 6
  7. 7. Quick Start Guide for Remote Access If you want to make the database access from remote servers Add network/remote host entry in $PGDATA/pg_hba.conf Add listen=* or specific IP address of NIC in $PGDATA/postgresql.conf $ pg_ctl reload $ psql -h hostname -u username -d dbname © 2011 VMware Inc 7
  8. 8. Quick Start Guide for Performance Tuning Modify following parameters in $PGDATA/postgresql.conf  checkpoint_segments=16  shared_buffers=512MB  wal_buffers=1MB If disks are slow to write and willing to take few transaction loss without corrupting databases  synchronous_commit=off Separate pg_xlog on separate filesystem  pg_ctl stop  mv $PGDATA/pg_xlog /path/to/pg_xlog  ln -s /path/to/pg_xlog $PGDATA/pg_xlog  pg_ctl start -l logfile © 2011 VMware Inc 8
  9. 9. Quick Start Guide for Client Connections psql - SQL Interpreter client libpq – API access to write C applications to access the database (used by PHP, Python, etc) JDBC – Standard JDBC for Java Programs to connect with PostgreSQL databases Npgsql – Database access for .NET/Mono applications UnixODBC – Standard ODBC for applications to connect with PostgreSQL © 2011 VMware Inc 9
  10. 10. Quick Start Guide - PostgreSQL Administration SQL Support: SELECT, INSERT, UPDATE, INSERT Utilities:  DDL Support: CREATE, ALTER, etc  Backup: pg_dump, pg_restore  Statistics: analyze  Cleanup: vacuumdb (or Auto-vacuum)  Misc: oid2name © 2011 VMware Inc 10
  11. 11. PostgreSQL Internals – File System Layout Files in $PGDATA  postgresql.conf – PostgreSQL Configuration File  pg_hba.conf - PostgreSQL Host Based Access Configuration  pg_ident.conf - Mapping System names to PostgreSQL user names  PG_VERSION – Contains Version Information  postmaster.pid – Contains PID of PostMaster, $PGDATA and shared memory segment ids  postmaster.opts – Contains the record for command line options for the last stared database instance © 2011 VMware Inc 11
  12. 12. PostgreSQL Internals – File System Layout Directories in $PGDATA  base – Contains per-database subdirectories  pg_stat_tmp - Temporary directory for Statistics subsystem  global - Database Cluster-wide tables (pg_control)  pg_log – Server System logs (text or csv readable)  pg_subtrans – Contains sub-transaction status data  pg_xlog – WAL or Write Ahead Log directory  pg_clog - Commit Status data  pg_multixact - Contains multi-transaction status data  pg_tblspc – Contains links to tablespace locations  pg_notify - LISTEN/NOTIFY status data  pg_twophase – Contains state files for prepared transactions © 2011 VMware Inc 12
  13. 13. PostgreSQL Internals – Processes Layout> pgrep -lf postgres6688 /usr/local/postgres/bin/postgres(POSTMASTER)6689 postgres: logger process6691 postgres: writer process6692 postgres: wal writer process6693 postgres: autovacuum launcher process6694 postgres: stats collector process13583 postgres: username dbname [local] idle intransaction © 2011 VMware Inc 13
  14. 14. PostgreSQL Internals – Memory Layout PostgreSQL Shared Memory PostgreSQL Backend Process PostgreSQL Backend Process PostgreSQL Backend Process © 2011 VMware Inc 14
  15. 15. PostgreSQL Internals – Shared Memory LayoutPROCProcArray XLOG Buffers CLOG BuffersAuto Vacuum Subtrans BuffersBtree Vacuum Two-phase StructsFree Space Map Multi-xact buffersBackground Writer Shared Invalidation Buffer Descriptors Shared BuffersLWLocksLock HashesPROCLOCKStatisticsSynchronized Scans © 2011 VMware Inc 15
  16. 16. PostgreSQL Internals – WAL Layout WAL – Write Ahead Log pg_xlog contains 16MB segments All transactions are first logged here before they are committed checkpoint_segments and checkpoint_timeout drives the number of WAL/pg_xlog segments Can grow pretty big based on transaction load © 2011 VMware Inc 16
  17. 17. PostgreSQL Internals – Checkpoint Checkpoint does the following:  Pushes dirty bufferpool pages to storage  Syncs all filesystem cache  Recycle WAL files  Check for server messages indicating too-frequent checkpoints Checkpoint causes performance spikes (drops) while checkpoint is in progress © 2011 VMware Inc 17
  18. 18. PostgreSQL Internals – MVCC Default isolation level in PostgreSQL is READ- COMMITTED Each Query only sees transactions completed before it started On Query Start, PostgreSQL records:  The transaction counter  All transactions that are in progress In a multi-statement transaction, a transactions own previous queries are also visible © 2011 VMware Inc 18
  19. 19. PostgreSQL Internals – MVCC Visible Tuples must have xmin or creation xact id that:  Is a committed transaction  Is less than the transaction counter stored at query start  Was not in-process at query start Visible tuples must also have an expiration transaction id (xmax) that:  Is blank or aborted or  Is greater than the transaction counter at query start or  Was in-process at query start © 2011 VMware Inc 19
  20. 20. PostgreSQL Internals – VACUUM VACUUM is basically MVCC Cleanup Get rid of deleted rows (or earlier versions of rows) Records free space in .fsm files Often used with Analyze to collect optimizer statistics Auto vacuum helps to do the vacuum task as and when needed © 2011 VMware Inc 20
  21. 21. PostgreSQL Internals – Lock Types Access Share Lock – SELECT Row Share Lock - SELECT FOR UPDATE Row Exclusive Lock – INSERT, UPDATE, DELETE Share Lock – CREATE INDEX Share Row Exclusive Lock – EXCLUSIVE MODE but allows ROW SHARE LOCK Exclusive Lock – Blocks even Row Share Lock Access Exclusive Lock – ALTER TABLE, DROP TABLE, VACUUM and LOCK TABLE © 2011 VMware Inc 21
  22. 22. PostgreSQL Internals – Query Execution Parse Statement Separate Utilities from actual queries Rewrite Queries in a standard format Generate Paths Select Optimal Path Generate Plan Execute Plan © 2011 VMware Inc 22
  23. 23. Monitoring PostgreSQL – Monitoring IO Iostat -xcmt 5 (Helps if pg_xlog is on separate filesystem)12/02/2010 01:09:56 AMavg-cpu: %user %nice %system %iowait %steal %idle 8.21 0.00 2.40 0.77 0.00 88.62Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await svctm %utilsda 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00sdd ($PGDATA) 0.00 46.67 0.00 3.67 0.00 0.20 110.55 0.01 1.45 0.73 0.27sde (pg_xlog) 0.00 509.00 0.00 719.33 0.00 3.39 9.66 0.11 0.15 0.15 10.53sdc 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00sdb 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 © 2011 VMware Inc 23
  24. 24. Monitoring PostgreSQL – Monitoring CPU/Mem top -ctop - 01:06:27 up 20 days, 1:48, 1 user, load average: 1.41, 5.30, 2.95Tasks: 157 total, 1 running, 156 sleeping, 0 stopped, 0 zombieCpu(s): 5.0%us, 0.5%sy, 0.0%ni, 93.2%id, 0.6%wa, 0.0%hi, 0.7%si, 0.0%stMem: 16083M total, 5312M used, 10770M free, 213M buffersSwap: 8189M total, 0M used, 8189M free, 4721M cached PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND29725 pguser 20 0 2105m 55m 53m S 21 0.3 0:14.66 postgres: pgusersbtest 192.168.0.65(20121) idle in transaction29726 pguser 20 0 2105m 55m 53m S 21 0.3 0:14.56 postgres: pgusersbtest 192.168.0.65(20122) idle in transaction29727 pguser 20 0 2105m 55m 53m S 21 0.3 0:14.58 postgres: pgusersbtest 192.168.0.65(20123) idle in transaction29728 pguser 20 0 2105m 55m 53m S 21 0.3 0:14.62 postgres: pgusersbtest 192.168.0.65(20124) idle in transaction © 2011 VMware Inc 24
  25. 25. Monitoring PostgreSQL – Monitoring SQL select * from pg_stat_activity;datid | datname | procpid | usesysid | usename | application_name | client_addr | client_port |backend_start | xact_start | query_start | waiting | current_query-------+---------+---------+----------+---------+------------------+--------------+-------------+------------------------------- 16384 | sbtest | 29601 | 10 | pguser | psql | | -1 | 2010-12-02 00:59:26.883118+00| 2010-12-02 01:00:43.088853+00 | 2010-12-02 01:00:43.088853+00 | f | select * from pg_stat_activity; 16384 | sbtest | 29624 | 10 | pguser | | 192.168.0.65 | 50659 | 2010-12-0201:00:27.406842+00 | 2010-12-02 01:00:43.087634+00 | 2010-12-02 01:00:43.089749+00 | f | SELECT cfrom sbtest where id=$1 16384 | sbtest | 29626 | 10 | pguser | | 192.168.0.65 | 50661 | 2010-12-0201:00:27.439644+00 | 2010-12-02 01:00:43.060565+00 | 2010-12-02 01:00:43.087203+00 | f | <IDLE> intransaction16384 | sbtest | 29627 | 10 | pguser | | 192.168.0.65 | 50662 | 2010-12-0201:00:27.458667+00 | 2010-12-02 01:00:43.082811+00 | 2010-12-02 01:00:43.089694+00 | f | SELECTSUM(K) from sbtest where id between $1 and $2 © 2011 VMware Inc 25
  26. 26. Monitoring PostgreSQL – EXPLAIN ANALYZE Explain Analyze select count(*) from sbtest where k > 1;QUERY PLAN---------------------------------------------------------- Aggregate (cost=380.48..380.49 rows=1 width=0) (actualtime=6.929..6.929 rows=1 loops=1) -> Index Scan using k on sbtest (cost=0.00..374.58rows=2356 width=0) (actual time=0.023..6.364 rows=3768loops=1) Index Cond: (k > 1) Total runtime: 6.977 ms(4 rows) © 2011 VMware Inc 26
  27. 27. Monitoring PostgreSQL – Monitoring DB Statistics select * from pg_stat_database;[ RECORD 4 ]-+-----------datid | 16384datname | sbtestnumbackends | 1xact_commit | 136978505xact_rollback | 95blks_read | 99161blks_hit | 1481038913tup_returned | 1138850495tup_fetched | 986059963tup_inserted | 3150677tup_updated | 6145651tup_deleted | 2050612 © 2011 VMware Inc 27
  28. 28. Monitoring PostgreSQL – Monitoring Table Statistics select * from pg_stat_user_tables;-[ RECORD 1 ]----+------------------------------relid | 16405schemaname | publicrelname | sbtestseq_scan | 2seq_tup_read | 0idx_scan | 2001731idx_tup_fetch | 46158444n_tup_ins | 211207n_tup_upd | 333527n_tup_del | 111207n_tup_hot_upd | 108633n_live_tup | 100000n_dead_tup | 0last_vacuum |last_autovacuum | 2010-12-02 01:38:44.821662+00last_analyze |last_autoanalyze | 2010-12-02 01:38:47.598785+00 © 2011 VMware Inc 28
  29. 29. Monitoring PostgreSQL – Monitoring IO Statistics select * from pg_statio_user_tables;-[ RECORD 1 ]---+---------relid | 16405schemaname | publicrelname | sbtestheap_blks_read | 4588heap_blks_hit | 29358125idx_blks_read | 1919idx_blks_hit | 6551658toast_blks_read |toast_blks_hit |tidx_blks_read |tidx_blks_hit | © 2011 VMware Inc 29
  30. 30. Monitoring PostgreSQL – Monitoring IO Statistics select * from pg_statio_user_indexes;-[ RECORD 1 ]-+------------relid | 16405indexrelid | 16412schemaname | publicrelname | sbtestindexrelname | sbtest_pkeyidx_blks_read | 391idx_blks_hit | 5210450-[ RECORD 2 ]-+------------relid | 16405indexrelid | 16414schemaname | publicrelname | sbtestindexrelname | kidx_blks_read | 1528idx_blks_hit | 1341208 © 2011 VMware Inc 30
  31. 31. Questions / More Information Email: jshah@vmware.com Learn more about PostgreSQL  http://www.postgresql.org Blog: http://jkshah.blogspot.com © 2011 VMware Inc 31

×