Pro PostgreSQL

Loading...

Flash Player 9 (or above) is needed to view presentations.
We have detected that you do not have it on your computer. To install it, go here.

0 comments

Post a comment

    Post a comment
    Embed Video
    Edit your comment Cancel

    4 Favorites

    Pro PostgreSQL - Presentation Transcript

    1. Pro PostgreSQL Robert Treat omniti.com brighterlamp.org
    2. Who Am I? (Why Listen To Me) A-0
      • PostgreSQL User Since 6.5.x
      • DBA of High Traffic / Large PostgreSQL Instances
      • Long Time Contributor to PostgreSQL Project
      • Contribute / Maintain Several Open Source Projects
      • Co-Author Beginning PHP & PostgreSQL 8 (Apress)
    3. Outline A-1
      • Installation
      • Upgrading
      • Configuration
      • Routine maintenance
      • Replication / Availability
      • Advanced SQL
      • Query tuning
      • Indexing
      • Tablespaces
      • Partitioning
      http://pgfoundry.org/projects/dbsamples/
    4. C-0 Get Off To A Good Start
    5. C-1 Get Off To A Good Start Use package management
      • Consistent
      • Standardized
      • Simple
      • Different across systems
      • Upgrades are an issue
      • Trust your packager?
      C-2 Get Off To A Good Start Use package management
      • Different across systems
      • Upgrades are an issue
      • Trust your packager?
      C-3 Get Off To A Good Start Use package management Don't Be Afraid To Roll Your Own
    6. C-4 Get Off To A Good Start
      • $PGDATA/pg_log
      • /var/log/pgsql
      • when in doubt... (postgresql.conf)
      Configure Logging Logging is often overlooked, but is the first step toward troubleshooting!
    7. C-5 Get Off To A Good Start
      • most systems have different defaults
      • firewalls/ selinux (FATAL)
      • rtfm (pg_hba.conf, grant, revoke)
      Configure Authentication
    8. C-6 Get Off To A Good Start
      • TRUST
      • md5
      • IDENT
      Authentication Methods
    9. C-7 Get Off To A Good Start
      • trust these more than your own code
      • package dependent
      • use different schemas
      • tsearch2, pgcrypto
      • pgstatstuple, pg_buffercache, pg_freespacemap
      /contrib
    10. C-8 Get Off To A Good Start
      • package dependent
      • some are non-core (plruby, plr, plphp)
      • varying functionality
      • varying levels of trust
      • don't be afraid, test!
      procedural languages
    11. D-0 Let's Talk About Upgrades
    12. D-1 Let's Talk About Upgrades Versioning First Digit ( 7 .4.16 -> 8 .2.0) Second Digit (8.2.4 -> 8.3.0) Third Digit (8.3.0 -> 8.3.1)
    13. D-2 Let's Talk About Upgrades Versioning First Digit (7.4.16 -> 8.2.0) Second Digit (8. 2 .4 -> 8. 3 .0) Third Digit (8.3.0 -> 8.3.1)
    14. D-3 Let's Talk About Upgrades Versioning First Digit (7.4.16 -> 8.2.0) Second Digit (8.2.4 -> 8.3.0) Third Digit (8.3. 0 -> 8.3. 1 )
    15. D-4 Let's Talk About Upgrades Achtung!!
      • Make Backups!
      • Read the Release Notes!
    16. D-5 Let's Talk About Upgrades pg_dump/pg_restore
      • simple
      • -Fc is your friend
      • dump with new version of pg_dump
      • pitfalls (time, hdd)
    17. D-6 Let's Talk About Upgrades the slony method
      • not simple
      • create slave on new version
      • switchover (switch back?)
      • pitfalls (initial synch, compatibility)
    18. D-7 Let's Talk About Upgrades pg_migrator
      • in place upgrades
      • rewrites system catalog info
      • no way to go back (fs snapshots)
      • still new, getting better
      • 8.1 -> 8.2 only (for now)
    19. D-8 Let's Talk About Upgrades upgrading older db
      • <= 7.2 is no longer supported (upgrade now!)
      • pg_dump 8.2 has issues with <= 7.2
          • you can upgrade to 7.3 first
          • use adddepends on 7.3 install
      • slony requires 7.3 (or 7.4) or newer
      • pg_migrator (lol)
    20. E-0 Figure Your Configure
    21. E-1 Figure Your Configure the basics : performance
      • effective_cache_size
      • shared_buffers
      • default_statistics_target
      • sort_mem
      • checkpoint_segments
      • checkpoint_timeout
    22. E-2 Figure Your Configure the basics : logging
      • stderr/pg_log vs. syslog/eventlog
      • log_min_error_statement (error!)
      • log_min_duration_statement
      • log_line_prefix (%d, %p, %t)
    23. E-5 Figure Your Configure other stuff worth looking at
      • maintenance_work_mem
      • max_prepared_transactions
      • update_process_title
      • max_fsm_pages
    24. P-0 Routine Maintenance
    25. P-1 Routine Maintenance a word about vacuum
      • reclaim usable space
      • update table stats
      • avoid xid wraparound
    26. P-2 Routine Maintenance autovacuum : just do it!
      • autovacuum
      • stats_start_collector
      • stats_row_level
      • pg_autovacuum ?
    27. P-3 Routine Maintenance other stuff worth looking at
      • reindexing
      • logfiles
      • backups
      • failover
    28. G-0 Availability
    29. G-1 Availability what do we mean by availability?
      • not backups (exactly)
      • not replication (necessarily)
      • not clustering (even less so)
    30. G-2 Availability what do we mean by availability? if (kablooy) then (ok)
      • not backups (exactly)
      • not replication (necessarily)
      • not clustering (even less so)
    31. G-3 Availability pg_dump
      • traditionally used for backups
      • send dump to another server
      • constantly run restore process
      • large time, i/o constraints
    32. G-4 Availability pitr
      • create second, standby server
      • ship wal logs to new server
      • less time/io than pg_dump
      • 8.1 -> cold standby
      • 8.2 -> warm standby
      • 8.4 -> hot standby ?
    33. G-5 Availability slony
      • asynchronous, master-slave replication
      • controlled switchover, failover
      • low i/o, time constraints
      • other benefits (upgrades, scaling)
    34. G-5 Availability bucardo
      • asynchronous, multi-master replication
      • conflict resolution
      • low i/o, time constraints
      • other benefits (upgrades, scaling)
    35. G-6 Availability shared disk
      • one copy of PGDATA on shared storage
      • standby takes over akin to db crash
      • shared disk is point of failure (raid)
      • must ensure only one postmaster running
    36. G-7 Availability filesystem replication
      • drbd, zfs
      • filesystem mirrored between servers
      • synchronized, ordered writes
      • single disk system?
    37. G-8 Availability pgpool
      • dual-master, statement based
      • little caveats (random(),now(),sequences)
      • bigger caveats (security, password, pg_hba)
      • pgpool becomes failure point
    38. I-0 Beyond Simple SQL
    39. I-1 Beyond Simple SQL generate series pagila=# select * from generate_series(1,3); generate_series ----------------- 1 2 3 (3 rows)
      • behold the power of loops!
    40. I-2 Beyond Simple SQL generate series
      • extrapolate many uses
      pagila=# select '2007-05-22 09:00:00'::timestamp - x * '1 hour'::interval as countdown from generate_series(1,10) x; countdown --------------------- 2007-05-22 08:00:00 2007-05-22 07:00:00 2007-05-22 06:00:00 2007-05-22 05:00:00 2007-05-22 04:00:00 2007-05-22 03:00:00 2007-05-22 02:00:00 2007-05-22 01:00:00 2007-05-22 00:00:00 2007-05-21 23:00:00 (10 rows)
    41. I-3 Beyond Simple SQL rownum()
    42. I-4 Beyond Simple SQL row numbering
    43. I-5 Beyond Simple SQL row numbering pagila=# select (select count(*)+1 from customer where customer_id < x.customer_id) as rownum, customer_id, first_name, last_name from customer x limit 5; rownum | customer_id | first_name | last_name --------+-------------+------------+----------- 1 | 1 | MARY | SMITH 2 | 3 | LINDA | WILLIAMS 3 | 4 | BARBARA | JONES 4 | 5 | ELIZABETH | BROWN 5 | 6 | JENNIFER | DAVIS (5 rows)
    44. I-6 Beyond Simple SQL row numbering pagila=# select (select count(*)+1 from customer where customer_id < x.customer_id) as rownum, customer_id, first_name, last_name from customer x limit 5; rownum | customer_id | first_name | last_name --------+-------------+------------+----------- 1 | 1 | MARY | SMITH 2 | 3 | LINDA | WILLIAMS 3 | 4 | BARBARA | JONES 4 | 5 | ELIZABETH | BROWN 5 | 6 | JENNIFER | DAVIS (5 rows)
    45. I-7 Beyond Simple SQL row numbering pagila=# select (select count(*)+1 from customer where customer_id < x.customer_id) as rownum, customer_id, first_name, last_name from customer x limit 5; rownum | customer_id | first_name | last_name --------+-------------+------------+----------- 1 | 1 | MARY | SMITH 2 | 3 | LINDA | WILLIAMS 3 | 4 | BARBARA | JONES 4 | 5 | ELIZABETH | BROWN 5 | 6 | JENNIFER | DAVIS (5 rows)
      • fill in missing gaps
    46. I-8 Beyond Simple SQL row numbering
      • watch out for order by
      pagila=# select (select count(*)+1 from customer where last_name < x.last_name) as rownum, first_name, last_name from customer x order by last_name limit 5; rownum | first_name | last_name --------+------------+----------- 1 | RAFAEL | ABNEY 2 | NATHANIEL | ADAM 3 | KATHLEEN | ADAMS 4 | DIANA | ALEXANDER 5 | GORDON | ALLARD (5 rows)
    47. I-9 Beyond Simple SQL rollup() pagila=# select customer_id, amount from payment limit 10; customer_id | amount -------------+-------- 267 | 7.98 267 | 0.00 269 | 3.98 269 | 0.00 274 | 0.99 279 | 4.99 282 | 0.99 284 | 5.98 284 | 0.00 287 | 0.99 (10 rows)
    48. I-10 Beyond Simple SQL rollup() pagila=# select customer_id, amount, total from payment JOIN (select customer_id, sum(amount) as total from payment group by customer_id) x using (customer_id) limit 10; customer_id | amount | total -------------+--------+-------- 267 | 7.98 | 159.64 267 | 0.00 | 159.64 269 | 3.98 | 129.70 269 | 0.00 | 129.70 274 | 0.99 | 152.65 279 | 4.99 | 135.69 282 | 0.99 | 103.73 284 | 5.98 | 126.72 284 | 0.00 | 126.72 287 | 0.99 | 115.71 (10 rows)
      • Include totals with given row
    49. I-11 Beyond Simple SQL rank() SELECT * FROM (select c1.first_name, c1.last_name, c1.store_id, p1.total, (select 1 + count(*) from customer c2 join (select customer_id, sum(amount) as total from only payment group by customer_id) p2 using (customer_id) where c2.store_id = c1.store_id and p2.total > p1.total) as rank from customer c1 join (select customer_id, sum(amount) as total from only payment group by customer_id) p1 using (customer_id) ) x WHERE x.rank <= 3 ORDER BY x.store_id, x.rank; first_name | last_name | store_id | total | rank ------------+-----------+----------+--------+------ ELEANOR | HUNT | 1 | 216.54 | 1 CLARA | SHAW | 1 | 195.58 | 2 TOMMY | COLLAZO | 1 | 186.62 | 3 KARL | SEAL | 2 | 221.55 | 1 MARION | SNYDER | 2 | 194.61 | 2 RHONDA | KENNEDY | 2 | 194.61 | 2 (6 rows)
    50. I-12 Beyond Simple SQL SELECT * FROM (SELECT c1.first_name, c1.last_name, c1.store_id, p1.total, (SELECT 1 + count(*) FROM customer c2 JOIN (SELECT customer_id, sum(amount) AS total FROM payment GROUP BY customer_id) p2 USING (customer_id) WHERE c2.store_id = c1.store_id AND p2.total > p1.total ) AS rank FROM customer c1 JOIN (SELECT customer_id, sum(amount) AS total FROM payment GROUP BY customer_id) p1 USING (customer_id) ) x WHERE x.rank <= 3 ORDER BY x.store_id, x.rank;
    51. I-13 Beyond Simple SQL SELECT * FROM (SELECT c1.first_name, c1.last_name, c1.store_id, p1.total, (SELECT 1 + count(*) FROM customer c2 JOIN (SELECT customer_id, sum(amount) AS total FROM payment GROUP BY customer_id) p2 USING (customer_id) WHERE c2.store_id = c1.store_id AND p2.total > p1.total ) AS rank FROM customer c1 JOIN (SELECT customer_id, sum(amount) AS total FROM payment GROUP BY customer_id) p1 USING (customer_id) ) x WHERE x.rank <= 3 ORDER BY x.store_id, x.rank;
    52. I-14 Beyond Simple SQL SELECT * FROM (SELECT c1.first_name, c1.last_name, c1.store_id, p1.total, (SELECT 1 + count(*) FROM customer c2 JOIN (SELECT customer_id, sum(amount) AS total FROM payment GROUP BY customer_id) p2 USING (customer_id) WHERE c2.store_id = c1.store_id AND p2.total > p1.total ) AS rank FROM customer c1 JOIN (SELECT customer_id, sum(amount) AS total FROM payment GROUP BY customer_id) p1 USING (customer_id) ) x WHERE x.rank <= 3 ORDER BY x.store_id, x.rank;
    53. I-15 Beyond Simple SQL SELECT * FROM (SELECT c1.first_name, c1.last_name, c1.store_id, p1.total, (SELECT 1 + count(*) FROM customer c2 JOIN (SELECT customer_id, sum(amount) AS total FROM payment GROUP BY customer_id) p2 USING (customer_id) WHERE c2.store_id = c1.store_id AND p2.total > p1.total ) AS rank FROM customer c1 JOIN (SELECT customer_id, sum(amount) AS total FROM payment GROUP BY customer_id) p1 USING (customer_id) ) x WHERE x.rank <= 3 ORDER BY x.store_id, x.rank;
    54. I-16 Beyond Simple SQL SELECT * FROM (SELECT c1.first_name, c1.last_name, c1.store_id, p1.total, (SELECT 1 + count(*) FROM customer c2 JOIN (SELECT customer_id, sum(amount) AS total FROM payment GROUP BY customer_id) p2 USING (customer_id) WHERE c2.store_id = c1.store_id AND p2.total > p1.total ) AS rank FROM customer c1 JOIN (SELECT customer_id, sum(amount) AS total FROM payment GROUP BY customer_id) p1 USING (customer_id) ) x WHERE x.rank <= 3 ORDER BY x.store_id, x.rank;
    55. J-0 Query Your Queries
    56. J-1 Query Your Queries finding slow queries: log_min_duration_statement
      • -1, 0 , n
      • superuser only
      • alter user
      LOG: duration: 5005.273 ms statement: select pg_sleep(5);
    57. J-2 Query Your Queries finding slow queries: pgfouine / pqa
      • log analyzers
      • command line, generate reports
      • i/o load
      http://pgfouine.projects.postgresql.org/reports.html http://pqa.projects.postgresql.org/example.html
    58. J-3 Query Your Queries finding slow queries: pg_stat_all_tables pagila=# d pg_stat_all_tables View &quot;pg_catalog.pg_stat_all_tables&quot; Column | Type | ----------------+-------------------------- relid | oid | schemaname | name | relname | name | seq_scan | bigint | seq_tup_read | bigint | idx_scan | bigint | idx_tup_fetch | bigint | n_tup_ins | bigint | n_tup_upd | bigint | n_tup_del | bigint |
    59. J-4 Query Your Queries finding slow queries: pg_stat_all_tables pagila=# d pg_stat_all_tables View &quot;pg_catalog.pg_stat_all_tables&quot; Column | Type | ----------------+-------------------------- relid | oid | schemaname | name | relname | name | seq_scan | bigint | seq_tup_read | bigint | idx_scan | bigint | idx_tup_fetch | bigint | n_tup_ins | bigint | n_tup_upd | bigint | n_tup_del | bigint |
    60. J-5 Query Your Queries finding slow queries: pg_stat_all_tables pagila=# d pg_stat_all_tables View &quot;pg_catalog.pg_stat_all_tables&quot; Column | Type | ----------------+-------------------------- relid | oid | schemaname | name | relname | name | seq_scan | bigint | seq_tup_read | bigint | idx_scan | bigint | idx_tup_fetch | bigint | n_tup_ins | bigint | n_tup_upd | bigint | n_tup_del | bigint |
    61. J-6 Query Your Queries finding slow queries: pg_stat_all_tables pagila=# d pg_stat_all_tables View &quot;pg_catalog.pg_stat_all_tables&quot; Column | Type | ----------------+-------------------------- relid | oid | schemaname | name | relname | name | seq_scan | bigint | seq_tup_read | bigint | idx_scan | bigint | idx_tup_fetch | bigint | n_tup_ins | bigint | n_tup_upd | bigint | n_tup_del | bigint |
    62. J-7 Query Your Queries finding slow queries: pg_stat_all_indexes pagila=# d pg_stat_all_indexes View &quot;pg_catalog.pg_stat_all_indexes&quot; Column | Type | ---------------+--------+ relid | oid | indexrelid | oid | schemaname | name | relname | name | indexrelname | name | idx_scan | bigint | idx_tup_read | bigint | idx_tup_fetch | bigint |
    63. J-8 Query Your Queries finding slow queries: pg_stat_all_indexes pagila=# d pg_stat_all_indexes View &quot;pg_catalog.pg_stat_all_indexes&quot; Column | Type | ---------------+--------+ relid | oid | indexrelid | oid | schemaname | name | relname | name | indexrelname | name | idx_scan | bigint | idx_tup_read | bigint | idx_tup_fetch | bigint |
    64. J-9 Query Your Queries finding slow queries: pg_statio_all_tables pagila=# d pg_statio_all_tables View &quot;pg_catalog.pg_statio_all_tables&quot; Column | Type | -----------------+--------+ relid | oid | schemaname | name | relname | name | heap_blks_read | bigint | heap_blks_hit | bigint | idx_blks_read | bigint | idx_blks_hit | bigint | toast_blks_read | bigint | toast_blks_hit | bigint | tidx_blks_read | bigint | tidx_blks_hit | bigint |
    65. J-9 Query Your Queries finding slow queries: pg_statio_all_tables pagila=# d pg_statio_all_tables View &quot;pg_catalog.pg_statio_all_tables&quot; Column | Type | -----------------+--------+ relid | oid | schemaname | name | relname | name | heap_blks_read | bigint | heap_blks_hit | bigint | idx_blks_read | bigint | idx_blks_hit | bigint | toast_blks_read | bigint | toast_blks_hit | bigint | tidx_blks_read | bigint | tidx_blks_hit | bigint |
    66. J-10 Query Your Queries fixing slow queries: explain analyze
      • universal tool
      • good for specific queries
      • “ explain” for large queries
      • could be it's own talk
    67. J-11 Query Your Queries fixing slow queries: explain analyze
      • universal tool
      • good for specific queries
      • “ explain” for large queries
      • could be it's own talk
      http://www.postgresql.org/docs/techdocs.38
    68. L-0 Indexing Options
    69. L-1 Indexing Options indexing (basic)
      • use explain to find large sequential reads
      • use pg_stat_* tables to find numerous reads
      • btree – (gist/gin)
      • enable_indexscan, enable_bitmapscan
      • dual column vs. single column
    70. L-2 Indexing Options indexing (partial) create index address_ba_part_idx on address (district) where district = 'Buenos Aires';
      • restrain index to rows that matter
      • can give significant speed improvements
      • where clause of index should match
      • where clause of query
    71. L-3 Indexing Options indexing (partial) create index customer_active_part_idx on customer (customer_id) where activebool is true;
      • restrain index to rows that matter
      • can give significant speed improvements
      • where clause of index should match
      • where clause of query
    72. L-4 Indexing Options indexing (functional)
      • some people prefer to call these expressional indexes
    73. L-5 Indexing Options indexing (expressional) create unique index one_true_email_xidx on customer (lower(email));
      • push expensive functions into your index
      • system sees just WHERE indexedcolumn = 'constant'
      • expression of index should match expression of queries
      • narrow scope, but nice gains
    74. L-6 Indexing Options indexing (expressional) create index fullname_xidx on customer ((first_name||' '||last_name));
      • push expensive functions into your index
      • system sees just WHERE indexedcolumn = 'constant'
      • expression of index should match expression of queries
      • narrow scope, but nice gains
    75. L-7 Indexing Options full text indexing gist vs. gin
      • old school
      • slower for queries
      • faster insert / update
      • mature
      • new in 8.2
      • faster for queries
      • slower insert / update
      • green
    76. N-0 PostgreSQL Tablespaces
    77. N-1 PostgreSQL Tablespaces tablespaces?
      • define logical locations for object placement
      • point to locations on disk (uses symlinks)
      • size determined by disk size (not pre-ordained)
      • dedicate per db, split db across multiple tblspc
    78. N-2 PostgreSQL Tablespaces tablespaces!
      • split database over separate disks
      • use stat, statio tables to gauge disk access
      • create dedicated storage for workloads
          • disk for read / write
          • disk for read only
          • large, slow disk for archiving
          • disk for indexes
    79. Q-0 PostgreSQL Partitioning
    80. Q-1 PostgreSQL Partitioning partitioning?
      • as table size grows, it becomes unmanageable
      • use inheritance, rules, constraints to split data
      • queries ignore non-relevant partitions
      • could be it's own talk
    81. Q-2 PostgreSQL Partitioning partitioning!
      • as table size grows, it becomes unmanageable
      • use inheritance, rules, constraints to split data
      • queries ignore non-relevant partitions
      • could be it's own talk
      http://www.pgcon.org/2007/schedule/events/41.en.html
    82. Q-3 PostgreSQL Partitioning partitioning : key points
      • determine list vs. range
      • use triggers rather than rules
      • partition creation vs. data population
      • automate maintenance
    83. K-0 Other Stuff I Should Mention
    84. K-1 Other Stuff I Should Mention tsearch2
      • full text search interface
      • /contrib module (integrated in 8.3?)
      • very advanced capabilities (rank, headline)
      • missing some things (wildcard)
      • good performance for db's
      • better performance with problem specific tools
      • make it live in its own schema
    85. K-2 Other Stuff I Should Mention pgcrypto
      • cryptography type functions
      • /contrib (export issues)
      • md5, sha1, blowfish, many more
    86. K-3 Other Stuff I Should Mention dblink
      • pg -> pg connections
      • /contrib (still under development?)
      • can have performance issues on large queries
      • make it live in its own schema
      • beware security issue
    87. K-4 Other Stuff I Should Mention autonomous logging tool
      • persistent logging for postgresql functions
      • built on top of dblink
      • make it live in its own schema
      https://labs.omniti.com/trac/pgsoltools
    88. K-4 Other Stuff I Should Mention snapshot pitr clones
      • full read/write copy of pitr slave
      • static snapshot
      • need solaris (zfs zone mojo)
      https://labs.omniti.com/trac/pgsoltools
    89. K-4 Other Stuff I Should Mention pgbouncer
      • connection pooling application
      • 100-1 pgb/db connection ratio
      • small community, good results
      https://developer.skype.com/SkypeGarage/DbProjects/PgBouncer
    90. K-5 Other Stuff I Should Mention dbi-link
      • heterogeneous connections for postgresql
      • built on plperl / dbi
      • moving toward sql-med support
      • similar performance issues to dblink
      http://pgfoundry.org/projects/dbi-link/
    91. K-6 Other Stuff I Should Mention phppgadmin
      • web based gui for postgresql
      • remote administration of multiple servers
      • implements much of postgresql functionality
      • support back to 7.1?
      http://phppgadmin.sourceforge.net/
    92. K-7 Other Stuff I Should Mention ;-) my book?
    93. K-8 Other Stuff I Should Mention ;-) we're hiring
      • software engineers
      • web programmers
      • ui/graphics
      http://www.omniti.com/people/jobs
    94. X-0 El Fin

    + Robert TreatRobert Treat, 2 years ago

    custom

    3059 views, 4 favs, 0 embeds more stats

    This is the version of my "Pro PostgreSQL" talk giv more

    More info about this document

    © All Rights Reserved

    Go to text version

    • Total Views 3059
      • 3059 on SlideShare
      • 0 from embeds
    • Comments 0
    • Favorites 4
    • Downloads 0
    Most viewed embeds

    more

    All embeds

    less

    Flagged as inappropriate Flag as inappropriate
    Flag as inappropriate

    Select your reason for flagging this presentation as inappropriate. If needed, use the feedback form to let us know more details.

    Cancel
    File a copyright complaint
    Having problems? Go to our helpdesk?

    Categories