Pro PostgreSQL, OSCon 2008

    Pro PostgreSQL, OSCon 2008 - Presentation Transcript

    1. Pro PostgreSQL Robert Treat omniti.com brighterlamp.org
    2. Who Am I? (Why Listen To Me) O-0
      • PostgreSQL User Since 6.5.x
      • DBA of High Traffic / Large PostgreSQL Instances
      • Long Time Contributor to PostgreSQL Project
      • Contribute / Maintain Several Open Source Projects
      • Co-Author Beginning PHP & PostgreSQL 8 (Apress)
    3. Outline O-1
      • What you need to know about the project
      • Getting started
      • Upgrading
      • Configuring your server
      • Hardware
      • Availability
      • Scalability
      • Query tuning
      • Tablespaces
      • Partitioning
      • Stuff you should know about
    4. K-0 Know Your Way Around The Project
    5. Know Your Way Around The Project K-1 www.postgresql.org
      • downloads
      • documentation
      • bug reports
      • security alerts
      • wiki
      • support companies
      • rss -> news, events, versions
    6. Know Your Way Around The Project K-2 www.pgfoundry.org projects.postgresql.org
      • Modules
      • Programs
      • Resources
      • URI Type
      • CIText
      • SkyTools
      • Npgsql
      • Pl/Proxy
      • pg_bulkload
      • plpgsql-debugger
      • sample databases
    7. Know Your Way Around The Project K-3 www.planetpostgresql.org
      • Project News
      • Community News
      • Helpful Tips / Examples
    8. Know Your Way Around The Project K-4 archives.postgresql.org
      • mailing list archives back to 1997
      • full text search via PostgreSQL 8.3
      • keyword search suggestions
      • lists for users, developers, regional, user groups
    9. Know Your Way Around The Project K-5 #postgresql
      • irc.freenode.net
      • real time help
      • rtfm_please - ??help
    10. Know Your Way Around The Project K-6 project management
      • core team
      • committers
      • -hackers
      • roadmap
      • web team
    11. S-0 Get Off To A Good Start
    12. S-1 Get Off To A Good Start Use package management
      • Consistent
      • Standardized
      • Simple
      • Different across systems
      • Upgrades are an issue
      • Trust your packager?
      Don't Be Afraid To Roll Your Own
    13. S-4 Get Off To A Good Start Configure Logging
      • $PGDATA/pg_log
      • /var/log/pgsql
      • when in doubt... (postgresql.conf)
      • separate disk
      Logging is often overlooked, but it is the first step toward troubleshooting!
    14. S-5 Get Off To A Good Start Configure Authentication
      • most systems have different defaults
      • firewalls / selinux (FATAL)
      • rtfm (pg_hba.conf, grant, revoke)
    15. S-6 Get Off To A Good Start Authentication Methods
      • TRUST
      • md5
      • IDENT
    16. S-7 Get Off To A Good Start /contrib
      • trust these more than your own code
      • package dependent
      • use different schemas (when able)
      • pgcrypto
      • pgstattuple, pg_buffercache, pg_freespacemap
    17. S-8 Get Off To A Good Start procedural languages
      • package dependent
      • some are non-core (plruby, plr, plphp)
      • varying functionality
      • varying levels of trust
      • don't be afraid, test!
    18. U-0 Let's Talk About Upgrades
    19. U-1 Let's Talk About Upgrades Versioning
      • First Digit (7.4.16 -> 8.2.0)
      • Second Digit (8.2.4 -> 8.3.0)
      • Third Digit (8.3.0 -> 8.3.1)
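The three version digits on the versioning slide can be told apart mechanically. The sketch below is only an illustration in Python: the function names are made up for this example, and the rule that a first or second digit change needs a dump/restore (or one of the other upgrade methods) while a third digit change does not is an assumption based on the three-part numbering used in the 7.x and 8.x releases.

```python
def parse_version(text):
    """Split a three-part version string such as '8.3.1' into three ints."""
    major, minor, patch = (int(part) for part in text.split("."))
    return major, minor, patch


def upgrade_kind(old, new):
    """Name the most significant digit that differs between two versions."""
    old_v, new_v = parse_version(old), parse_version(new)
    if old_v[0] != new_v[0]:
        return "first digit"
    if old_v[1] != new_v[1]:
        return "second digit"
    if old_v[2] != new_v[2]:
        return "third digit"
    return "same version"


def needs_dump_or_migration(old, new):
    """Assumption: first or second digit changes alter the on-disk format."""
    return upgrade_kind(old, new) in ("first digit", "second digit")


# The three example upgrades shown on the slide.
for old, new in [("7.4.16", "8.2.0"), ("8.2.4", "8.3.0"), ("8.3.0", "8.3.1")]:
    print(old, "->", new, ":", upgrade_kind(old, new),
          "| needs dump/restore or similar:", needs_dump_or_migration(old, new))
```

Whatever the digit, the next slide still applies: make backups and read the release notes first.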
    22. U-4 Let's Talk About Upgrades Achtung!!
      • Make Backups!
      • Read the Release Notes!
    23. U-5 Let's Talk About Upgrades pg_dump/pg_restore
      • simple
      • -Fc is your friend
      • dump with new version of pg_dump
      • pitfalls (time, hdd)
    24. U-6 Let's Talk About Upgrades the slony method
      • not simple
      • create slave on new version
      • switchover (switch back?)
      • pitfalls (initial synch, compatibility)
    25. U-7 Let's Talk About Upgrades pg_migrator
      • in place upgrades
      • rewrites system catalog info
      • no way to go back (fs snapshots)
      • still new, in flux
      • 8.1 -> 8.2 only (for now)
    26. U-8 Let's Talk About Upgrades upgrading older db
      • <= 7.3 is no longer supported (upgrade now!)
      • pg_dump 8.2 has issues with <= 7.2
          • you can upgrade to 7.3 first
          • use adddepends on 7.3 install
      • slony requires 7.3 (or 7.4) (or 8.3) or newer
      • pg_migrator (lol)
    27. C-0 Figure Your Configure
    28. C-1 Figure Your Configure the basics : performance
      • effective_cache_size
      • shared_buffers
      • default_statistics_target
      • work_mem
      • checkpoint_segments
    29. C-2 Figure Your Configure the basics : logging
      • stderr/pg_log vs. syslog/eventlog
      • log_min_error_statement (error!)
      • log_min_duration_statement
      • log_line_prefix (%d, %p, %t)
    30. C-3 Figure Your Configure other stuff worth looking at
      • maintenance_work_mem
      • max_prepared_transactions
      • update_process_title
      • max_fsm_pages
      • synchronous_commit
    31. M-0 Routine Maintenance
    32. M-1 Routine Maintenance a word about vacuum
      • reclaim usable space
      • update table stats
      • avoid xid wraparound
    33. M-2 Routine Maintenance autovacuum : just do it!
      • autovacuum
      • track_activities
      • track_counts
      • autovacuum_freeze_max_age
      • pg_autovacuum ?
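To make "just do it" a little more concrete, here is a rough Python sketch of the check autovacuum applies when deciding whether a table is due for a vacuum. The formula follows the PostgreSQL documentation (autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor * number of rows); the default values of 50 and 0.2 are assumptions that should be checked against your own postgresql.conf, and the function names are invented for this example.

```python
def autovacuum_vacuum_trigger(reltuples, threshold=50, scale_factor=0.2):
    """Dead-row count above which autovacuum vacuums a table.

    threshold    ~ autovacuum_vacuum_threshold (assumed default: 50)
    scale_factor ~ autovacuum_vacuum_scale_factor (assumed default: 0.2)
    reltuples    ~ estimated row count of the table (pg_class.reltuples)
    """
    return threshold + scale_factor * reltuples


def needs_vacuum(n_dead_tup, reltuples, threshold=50, scale_factor=0.2):
    """True when the dead rows exceed the trigger point for this table."""
    return n_dead_tup > autovacuum_vacuum_trigger(reltuples, threshold, scale_factor)


# A 1,000-row table is vacuumed once it has more than about 250 dead rows.
print(autovacuum_vacuum_trigger(1000))
print(needs_vacuum(n_dead_tup=300, reltuples=1000))
```

The dead row counts come from the statistics collector, which is why track_counts has to be on for autovacuum to work.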
    34. M-3 Routine Maintenance other stuff worth looking at
      • reindexing
      • logfiles
      • backups
      • failover
    35. P-0 Hardware For Software People
    36. P-1 Hardware For Software People you can be cheap on the licensing, don't be cheap on the hardware
      • Cram the (ecc) RAM
      • Redundant Disks
      • (Software|Hardware) RAID
      • Prefer SCSI, Accept SATA
      • RAID Z (JBOD)
    37. P-2 Hardware For Software People cpu and you
      • Multiple CPUs work wonders, up to 8 processors (16 OK)
      • Opteron, Xeon, Core 2 Duo
      • http://tweakers.net/
    38. P-3 Hardware For Software People disk do's
      • Put WAL on its own disk (RAID 1)
      • Put DATA directory on its own disk (RAID 10)
      • More Spindles is Good
      • More Controllers Even Gooder (tablespaces!)
      • Battery Backed Controllers, Write cache enabled
    39. P-4 Hardware For Software People disk don'ts
      • NFS == Not Fully Safe!
      • RAID 5
      • Beware disks / controllers that Lie
    40. A-0 Availability (Not Scalability)
    42. A-2 Availability what do we mean by availability? if (kablooy) then (ok)
      • not backups (exactly)
      • not replication (necessarily)
      • not clustering (even less so)
    43. A-3 Availability pg_dump
      • traditionally used for backups
      • send dump to another server
      • constantly run restore process
      • large time, i/o constraints
    44. A-4 Availability filesystem snapshots
      • zfs, lvm
      • built on top of pitr backup commands
      • traditionally used for backups
      • copy snapshot to another server
      • cumbersome time, i/o constraints
    45. A-5 Availability pitr
      • create second, standby server
      • ship wal logs to new server
      • less time/io than pg_dump
      • 8.1 -> cold standby
      • 8.2 -> warm standby
      • 8.4 -> hot standby ?
    46. A-6 Availability slony
      • asynchronous, master-slave replication
      • controlled switchover, failover
      • low i/o, time constraints
      • other benefits (upgrades, scaling)
    47. A-7 Availability bucardo
      • asynchronous, multi-master replication
      • also does master-slave
      • low i/o, time constraints
      • other benefits (upgrades, scaling)
    48. A-8 Availability shared disk
      • one copy of PGDATA on shared storage
      • standby takes over akin to db crash
      • shared disk is point of failure (raid)
      • STONITH
    49. A-9 Availability filesystem replication
      • drbd
      • filesystem mirrored between servers
      • synchronized, ordered writes
      • single disk system?
    50. A-10 Availability pgpool
      • dual-master, statement based
      • little caveats (random(),now(),sequences)
      • bigger caveats (security, password, pg_hba)
      • pgpool becomes failure point
    51. A-11 Availability postgres-r
      • multi-master, synchronous
      • just open sourced this month!
      • small community
      • not proven
    52. H-0 Scalability
    53. H-1 Scalability what is scaling? “ How well a solution to some problem will work when the relative size of the problem increases” - Theo Schlossnagle
    54. H-2 Scalability bigger, better, faster, more!
      • postgresql scales up pretty well
      • more disks (tablespaces)
      • more cpus, more ram
      • connection pooling
      • 1000+ connections, TB+ data
    55. H-3 Scalability pgpool
      • dual-master, statement based
      • little caveats (random(),now(),sequences)
      • bigger caveats (security, password, pg_hba)
      • pgpool becomes failure point
    56. H-4 Scalability pgbouncer
      • simple connection pooler
      • 10/1 -> 40/1
      • caveats (prepared statements, temp tables)
      • skype, myyearbook.com
    57. H-5 Scalability slony
      • asynchronous, master-slave replication
      • multiple, cascading slaves
      • scales read operations
      • other benefits (upgrades, scaling)
      • solid user base
    58. H-6 Scalability bucardo
      • asynchronous, multi-master replication
      • also does master-slave
      • low i/o, time constraints
      • other benefits (upgrades, scaling)
    59. H-7 Scalability pgpool-II
      • single db over multiple machines
      • scales read operations
      • replication, load balance, parallel query
      • green technology
    60. H-8 Scalability pgcluster
      • synchronous multi-master replication
      • significant complexity
      • scales read operations
      • other uses (failover abilities)
      • green technology
    61. H-9 Scalability postgres-r
      • multi-master, synchronous
      • just open sourced this month!
      • small community
      • other uses (failover abilities)
      • not proven
    62. H-10 Scalability pitr read-only slaves
      • based on pitr, warm standby operation
      • core team officially supporting development
      • 8.4 -> synchronous wal shipping
      • 8.? -> read only slaves
    63. J-0 Query Your Queries
    64. J-1 Query Your Queries finding slow queries: log_min_duration_statement
      • -1, 0 , n
      • superuser only
      • alter user
      LOG: duration: 5005.273 ms statement: select pg_sleep(5);
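Once log_min_duration_statement is writing lines like the one above, they are easy to pull apart with a script. The following is a minimal Python sketch, assuming the log line has the shape shown on the slide; the regular expression and function name are illustrative only, and real logs usually carry a log_line_prefix in front, which the pattern skips over.

```python
import re

# Matches "LOG: duration: <ms> ms statement: <sql>", tolerating extra spaces
# and any prefix text before "LOG:".
DURATION_RE = re.compile(
    r"LOG:\s+duration:\s+(?P<ms>\d+(?:\.\d+)?)\s+ms\s+statement:\s+(?P<sql>.*)"
)


def parse_slow_query(line):
    """Return (duration_ms, statement), or None if the line does not match."""
    match = DURATION_RE.search(line)
    if match is None:
        return None
    return float(match.group("ms")), match.group("sql").strip()


sample = "LOG: duration: 5005.273 ms statement: select pg_sleep(5);"
print(parse_slow_query(sample))
```

For anything beyond a quick look, the log analyzers on the next slide do this job far more thoroughly.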
    65. J-2 Query Your Queries finding slow queries: pgfouine / pqa
      • log analyzers
      • command line, generate reports
      • i/o load
      http://pgfouine.projects.postgresql.org/reports.html http://pqa.projects.postgresql.org/example.html
    66. J-3 Query Your Queries finding slow queries: pg_stat_all_tables
      pagila=# \d pg_stat_all_tables
      View "pg_catalog.pg_stat_all_tables"
            Column      |    Type     |
      ------------------+-------------+
       relid            | oid         |
       schemaname       | name        |
       relname          | name        |
       seq_scan         | bigint      |
       seq_tup_read     | bigint      |
       idx_scan         | bigint      |
       idx_tup_fetch    | bigint      |
       n_tup_ins        | bigint      |
       n_tup_upd        | bigint      |
       n_tup_del        | bigint      |
       n_tup_hot_upd    | bigint      |
       n_live_tup       | bigint      |
       n_dead_tup       | bigint      |
       last_vacuum      | timestamptz |
       last_autovacuum  | timestamptz |
       last_analyze     | timestamptz |
       last_autoanalyze | timestamptz |
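One way to use pg_stat_all_tables is to look for tables that are mostly read by sequential scans and that return a lot of rows that way. The Python sketch below assumes the rows have already been fetched into dictionaries keyed by the column names above; the function name, the 100000 row cutoff, and the sample counters are all made up for illustration.

```python
def seq_scan_suspects(rows, min_seq_tup_read=100000):
    """Names of tables read mostly by sequential scan, biggest readers first."""
    suspects = []
    for row in rows:
        # idx_scan can be NULL (None) for a table that has no indexes at all.
        idx_scan = row["idx_scan"] or 0
        if row["seq_tup_read"] >= min_seq_tup_read and row["seq_scan"] > idx_scan:
            suspects.append((row["seq_tup_read"], row["relname"]))
    return [name for _, name in sorted(suspects, reverse=True)]


# Invented counters, shaped like rows from pg_stat_all_tables.
sample_rows = [
    {"relname": "rental", "seq_scan": 1200, "seq_tup_read": 19000000, "idx_scan": 40},
    {"relname": "customer", "seq_scan": 3, "seq_tup_read": 1800, "idx_scan": 52000},
    {"relname": "payment_log", "seq_scan": 900, "seq_tup_read": 450000, "idx_scan": None},
]
print(seq_scan_suspects(sample_rows))
```

A table on this list is a candidate for a closer look with explain, not proof that an index is missing; small tables are often cheapest to scan sequentially.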
    72. J-9 Query Your Queries finding slow queries: pg_stat_all_indexes
      pagila=# \d pg_stat_all_indexes
      View "pg_catalog.pg_stat_all_indexes"
          Column     |  Type  |
      ---------------+--------+
       relid         | oid    |
       indexrelid    | oid    |
       schemaname    | name   |
       relname       | name   |
       indexrelname  | name   |
       idx_scan      | bigint |
       idx_tup_read  | bigint |
       idx_tup_fetch | bigint |
    74. J-11 Query Your Queries finding slow queries: pg_statio_all_tables
      pagila=# \d pg_statio_all_tables
      View "pg_catalog.pg_statio_all_tables"
           Column      |  Type  |
      -----------------+--------+
       relid           | oid    |
       schemaname      | name   |
       relname         | name   |
       heap_blks_read  | bigint |
       heap_blks_hit   | bigint |
       idx_blks_read   | bigint |
       idx_blks_hit    | bigint |
       toast_blks_read | bigint |
       toast_blks_hit  | bigint |
       tidx_blks_read  | bigint |
       tidx_blks_hit   | bigint |
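The *_blks_hit and *_blks_read pairs in pg_statio_all_tables are usually turned into a buffer cache hit ratio. A small Python sketch of that arithmetic follows; the function name and the sample numbers are invented for this example.

```python
def hit_ratio(blks_hit, blks_read):
    """Fraction of block requests served from shared_buffers, or None if idle."""
    total = blks_hit + blks_read
    if total == 0:
        return None  # no I/O recorded yet, so the ratio is undefined
    return blks_hit / total


# Invented counters for one table: heap_blks_hit=990, heap_blks_read=10.
print(hit_ratio(990, 10))
```

Keep in mind that a "read" here only means the block was not in shared_buffers; it may still have come from the operating system's cache rather than from disk.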
    77. J-14 Query Your Queries fixing slow queries: explain analyze
      • universal tool
      • good for specific queries
      • “explain” for large queries
      • could be its own talk
      http://wiki.postgresql.org/Using_EXPLAIN
    78. J-15 Query Your Queries fixing slow queries: indexing (basic)
      • use explain to find large sequential reads
      • use pg_stat_* tables to find numerous reads
      • btree – (gist/gin)
      • enable_indexscan, enable_bitmapscan
      • dual column vs. single column
    79. J-16 Query Your Queries fixing slow queries: indexing (partial)
      create index address_ba_part_idx on address (district) where district = 'Buenos Aires';
      • restrict index to rows that matter
      • can give significant speed improvements
      • where clause of index should match where clause of query
    80. J-17 Query Your Queries fixing slow queries: indexing (partial)
      create index customer_active_part_idx on customer (customer_id) where activebool is true;
      • restrict index to rows that matter
      • can give significant speed improvements
      • where clause of index should match where clause of query
    81. J-18 Query Your Queries fixing slow queries: indexing (functional)
      • some people prefer to call these expressional indexes
    82. J-19 Query Your Queries fixing slow queries: indexing (expressional)
      create unique index one_true_email_xidx on customer (lower(email));
      • push expensive functions into your index
      • system sees just WHERE indexedcolumn = 'constant'
      • expression of index should match expression of queries
      • narrow scope, but nice gains
    83. J-20 Query Your Queries fixing slow queries: indexing (expressional)
      create index fullname_xidx on customer ((first_name||' '||last_name));
      • push expensive functions into your index
      • system sees just WHERE indexedcolumn = 'constant'
      • expression of index should match expression of queries
      • narrow scope, but nice gains
    84. J-21 Query Your Queries fixing slow queries: full text search
      • uses lexemes and word stemming to find common words
      • replacement for LIKE '%x%', ~* 'x';
      • supports multiple languages, custom dictionaries
      • special indexing options
    85. J-22 Indexing Options full text indexing gist vs. gin
      gist
      • “old school”
      • slower for queries
      • faster insert / update
      • mature
      gin
      • new in 8.2
      • faster for queries
      • slower insert / update
      • stable
    86. N-0 PostgreSQL Tablespaces
    87. N-1 PostgreSQL Tablespaces tablespaces?
      • define logical locations for object placement
      • point to locations on disk (uses symlinks)
      • size determined by disk size (not pre-ordained)
      • dedicate per db, split db across multiple tblspc
    88. N-2 PostgreSQL Tablespaces tablespaces!
      • split database over separate disks
      • use stat, statio tables to gauge disk access
      • create dedicated storage for workloads
          • disk for read / write
          • disk for read only
          • large, slow disk for archiving
          • disk for indexes
    89. Q-0 PostgreSQL Partitioning
    90. Q-1 PostgreSQL Partitioning partitioning?
      • as table size grows, it becomes unmanageable
      • use inheritance, rules, constraints to split data
      • queries ignore non-relevant partitions
      • could be its own talk
    91. Q-2 PostgreSQL Partitioning partitioning!
      http://www.pgcon.org/2007/schedule/events/41.en.html
    92. Q-3 PostgreSQL Partitioning partitioning : key points
      • determine list vs. range
      • use triggers rather than rules
      • partition creation vs. data population
      • automate maintenance
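For range partitioning by date, both the insert trigger and the maintenance job that creates partitions ahead of time need the same two pieces of logic: which child table a row belongs to, and what range that child covers. The Python sketch below only illustrates that logic outside the database; the naming scheme (parent_YYYY_MM), the monthly granularity, and the "payment" table name are assumptions made for this example, and in practice the routing would live in a trigger function.

```python
from datetime import date


def monthly_partition(parent, day):
    """Child table name for a table range-partitioned with one child per month."""
    return "%s_%04d_%02d" % (parent, day.year, day.month)


def month_bounds(day):
    """Half-open [start, end) date range for the child's CHECK constraint."""
    start = date(day.year, day.month, 1)
    if day.month == 12:
        end = date(day.year + 1, 1, 1)
    else:
        end = date(day.year, day.month + 1, 1)
    return start, end


row_date = date(2008, 7, 24)
print(monthly_partition("payment", row_date))
print(month_bounds(row_date))
```

Using a half-open range keeps neighbouring CHECK constraints from overlapping, which is what lets queries ignore non-relevant partitions.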
    93. I-0 Other Stuff I Should Mention
    94. I-1 Other Stuff I Should Mention pgcrypto
      • cryptography type functions
      • /contrib (export issues)
      • md5, sha1, blowfish, many more
    95. I-2 Other Stuff I Should Mention dblink
      • pg -> pg connections
      • /contrib (still under development?)
      • can have performance issues on large queries
      • make it live in its own schema
    96. I-3 Other Stuff I Should Mention *-link
      • heterogeneous connections for postgresql
      • db specific and db independent options
      • any pl/u language can implement this
      • similar performance issues to dblink
      • dblink-tds, dbi-link, oralink, odbclink
      http://www.pgfoundry.org/ (db link)
    97. I-4 Other Stuff I Should Mention autonomous logging tool
      • persistent logging for postgresql functions
      • built on top of dblink
      • make it live in its own schema
      https://labs.omniti.com/trac/pgsoltools
    98. I-5 Other Stuff I Should Mention snapshot pitr clones
      • full read/write copy of pitr slave
      • static snapshot
      • need solaris (zfs zone mojo)
      • could re-implement on other systems
      https://labs.omniti.com/trac/pgsoltools
    99. I-6 Other Stuff I Should Mention check_postgres
      • nagios based monitoring script
      • common items for warnings and alerts
      • can be adapted to other uses
      http://bucardo.org/check_postgres
    100. I-7 Other Stuff I Should Mention reconnoiter
      • monitoring / graphing tool
      • postgres based
      • still pretty green
      https://labs.omniti.com/trac/reconnoiter
    101. I-8 Other Stuff I Should Mention phpPgAdmin
      • web based gui for postgresql
      • remote administration of multiple servers
      • implements much of postgresql functionality
      • support back to 7.2?
      http://phppgadmin.sourceforge.net/
    102. I-9 Other Stuff I Should Mention ;-) my book?
    103. I-10 Other Stuff I Should Mention ;-) we're hiring
      • Ops Ninjas
      • Perl Kung-Fu Artists
      • PHP Ninjas
      • Database Samurai
      http://omniti.com/is/hiring
    104. L-0 El Fin


    My Pro PostgreSQL talk, given at OSCon 2008

    © All Rights Reserved
