12 ft tall, 5+ tons (12,000 lbs). Their trunk has over 100,000 muscle units. They eat and walk most of the day (terrible digestive system). Other animals eat their poop. One of the big five. Intimidating. Aggressive. Nobody messes with them. They change their teeth six times in a lifetime. At age 60 they don’t grow them again and die of starvation. Four toes.
An expected feature, it guarantees the reliability of DB transactions.
A: Transactional operations either succeed completely or fail completely. In Postgres, DDL operations are also transactional.
C: The database goes from one consistent state to the next.
I: Data generated in the interim steps of a transaction can never be seen by other readers/queries/viewers. This is what MVCC is all about.
D: Once the user/client is notified of success, the data is persisted and the transaction will not be undone.
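To make the "A" and "C" concrete, here is a minimal sketch in Python. It uses the standard-library sqlite3 module as a stand-in so it runs without a server; it is not Postgres, and the `accounts` table and `transfer` helper are made up for illustration. The same all-or-nothing pattern applies to a Postgres transaction.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for Postgres here.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"  # consistency rule: no negative balances
)
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 0)")
conn.commit()


def transfer(conn, src, dst, amount):
    """Move money in one transaction; return True if it committed."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint failed, so the credit above was rolled back too.
        return False


def balances(conn):
    """Return {name: balance} for every account."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))
```

If the debit would break the constraint, the credit that already ran inside the same transaction is undone as well, so no reader ever sees half a transfer.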
MVCC - Multi Version Concurrency Control.
An alternative to MVCC is read-locking. Every time a query reads from the DB, it locks the rows so that no other statement can change the data, and therefore it is reading “real and consistent” data. However, there can be many users reading from the same table. Hundreds of thousands in bigger websites. What if there’s a queue of hundreds of reads, and you need to update the table? The update statement must wait until all reads are done before it can run. Additional reads must wait until the update is completed before retrieving new data.
With MVCC, none of the reads need to lock the rows, so reads never block the update and the update never blocks reads. They all execute immediately.
Postgres keeps multiple versions (the MV in MVCC) of the data every time the data changes (to clean up older versions, use VACUUM).
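The idea can be sketched with a toy in-memory store in Python. This is only an illustration of versions and snapshots, not how Postgres implements MVCC; the `ToyMVCC` class and its method names are invented for this example.

```python
class ToyMVCC:
    """Toy multi-version store: writers append versions, readers use snapshots."""

    def __init__(self):
        self.versions = {}   # key -> list of (txid, value), oldest first
        self.next_txid = 1

    def write(self, key, value):
        """Append a new version instead of overwriting the old one."""
        txid = self.next_txid
        self.next_txid += 1
        self.versions.setdefault(key, []).append((txid, value))
        return txid

    def snapshot(self):
        """A reader remembers the newest txid that existed when it started."""
        return self.next_txid - 1

    def read(self, key, snap):
        """Return the newest version visible to the snapshot; no locks are taken."""
        visible = [value for txid, value in self.versions.get(key, []) if txid <= snap]
        return visible[-1] if visible else None

    def vacuum(self, oldest_snapshot):
        """Drop versions that no snapshot at or after oldest_snapshot can see."""
        for key, history in list(self.versions.items()):
            keep_from = 0
            for i, (txid, _) in enumerate(history):
                if txid <= oldest_snapshot:
                    keep_from = i  # newest version still visible to that snapshot
            self.versions[key] = history[keep_from:]


db = ToyMVCC()
db.write("balance", 100)
snap = db.snapshot()                      # a reader starts here
db.write("balance", 50)                   # the writer is not blocked by the reader
old = db.read("balance", snap)            # the reader still sees its own version
new = db.read("balance", db.snapshot())   # a new reader sees the update
```

The `vacuum` method plays the role the notes assign to VACUUM: once no reader can need an old version, it can be thrown away.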
http://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server
http://www.postgresql.org/docs/current/static/runtime-config-client.html
http://www.postgresql.org/docs/current/static/runtime-config-resource.html
http://www.pgcon.org/2008/schedule/events/104.en.html
http://vimeo.com/7109722
http://wiki.postgresql.org/wiki/GUCS_Overhaul
http://dimitrik.free.fr/db_STRESS_PostgreSQL_837_and_84_May2009.html
http://www.pgexperts.com/documents.html
Oracle has over 500 config settings.
shared_buffers: PostgreSQL’s dedicated RAM. It is the 2nd level cache (1st level is the OS cache). A good starting point is 1/4 of available RAM. cache_miss statistics can tell you if you need more.
work_mem: memory limit on per-query operations like sort, count, etc. If RAM swapping is high, work_mem is too high. In general, OS swapping == too much work_mem, while caching sorts in pg_temp == not enough work_mem.
    work_mem = 32MB
maintenance_work_mem: VACUUM and ANALYZE use it. Query expiration. Could be in the 256MB-1GB range for larger DBs.
wal_buffers: the default is appropriate for one CPU and a small DB. Increase it to maybe 8MB for SMPs. The default is wimpy because of the Linux shared memory limits.
synchronous_commit: turn it off, but know that there are risks, as you may lose 0.5 seconds’ worth of data. If it’s on, the WAL gets written immediately.
effective_io_concurrency: the number of disks or channels. Only if your OS supports async I/O. Linux and FreeBSD do.
effective_cache_size: this setting simply hints the planner, so it can make better cost estimates.
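The rules of thumb above can be turned into a first-pass config fragment. The Python sketch below is only a starting point under stated assumptions: `starting_settings` and `render` are hypothetical helpers, the mapping of each note to a setting name (`maintenance_work_mem`, `wal_buffers`) is my reading of the notes, and the values are the ones quoted in them, not tuned recommendations.

```python
def starting_settings(total_ram_mb, smp=True):
    """Return rough starting values based on the rules of thumb in these notes."""
    return {
        "shared_buffers": f"{total_ram_mb // 4}MB",  # about 1/4 of available RAM
        "work_mem": "32MB",                           # example value from the notes
        "maintenance_work_mem": "256MB",              # low end of the 256MB-1GB range
        "wal_buffers": "8MB" if smp else None,        # None means keep the default
    }


def render(settings):
    """Format as postgresql.conf lines, skipping settings left at their default."""
    return "\n".join(
        f"{name} = {value}"
        for name, value in settings.items()
        if value is not None
    )


# Assumption: a machine with 8 GB of RAM and several CPUs.
fragment = render(starting_settings(8192))
```

Measure before and after applying values like these; the right numbers depend on the workload.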
http://vimeo.com/9889075
So, I’ve convinced you and you will port the app to Postgres. Here’s what to look out for to avoid pitfalls.
If any one component fails, the whole system suffers.
Hardware: I/O is a very common bottleneck. Many writes cause the transaction log to slow the whole process down. If the database is 3x larger than RAM, queries become I/O bound. Use SAS, not SATA (SATA is half duplex). RAID 1+0 over RAID 5, especially for writes. The more spindles the better: many small drives are better than fewer bigger drives. Move the transaction log to a separate disk.
BEWARE OF CLOUD and IO!
Low RAM is also common, and running out of memory makes the server hit the disks, which is bad. It is ideal to fit the entire database in RAM, using shared_buffers. If that’s not possible, cache as much of the database as possible, in which case think about how big the working dataset that needs caching is. Think about sorting, aggregates and other operations that may benefit from in-RAM caching.
CPU: Pick more CPUs over more cores on same processor. L2 cache, speed, and 64 bit.
OS: Use something that supports direct I/O: Linux, Solaris, FreeBSD.
* Tablespaces for large tables
* Tuning Linux:
  * XFS & JFS for database files. Otherwise ext3 if on Red Hat (no support for XFS). UFS if on Solaris.
  * Reduce logging: data=writeback, noatime, nodiratime.
  * Pre-2.6.9 kernels must upgrade.
  * Use the deadline scheduler for write speed.
Schema:
* 1NF. Postgres is optimized for normalized lookups. Wait for an actual issue before denormalizing.
* You may benefit from separating out tables (1-to-1s) based on read/write patterns.
* Indexes: FKs, WHERE clauses, aggregations. Use expression indexes and partial indexes. Don’t over-index. Don’t index small tables. Look at pg_stat_user_indexes and pg_stat_user_tables.
* Partitioning: historical data, very big tables, any big deletes. The app must know how to deal with it, by querying the partition key as part of the WHERE criteria.
* Query design: do more in each query.
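The partitioning point is the one that most affects application code, so here is a small Python sketch of a query helper that always includes the partition key. Everything specific in it is an assumption for illustration: a hypothetical `events` table partitioned by month on `created_at`, and the `%s` placeholder style used by drivers such as psycopg2.

```python
from datetime import date


def monthly_range(day):
    """Return (first day of the month, first day of the next month) for a date."""
    start = day.replace(day=1)
    if start.month == 12:
        end = date(start.year + 1, 1, 1)
    else:
        end = date(start.year, start.month + 1, 1)
    return start, end


def events_query(user_id, day):
    """Build a parameterized query that always filters on the partition key."""
    start, end = monthly_range(day)
    sql = (
        "SELECT * FROM events "
        "WHERE user_id = %s AND created_at >= %s AND created_at < %s"
    )
    return sql, (user_id, start, end)


sql, params = events_query(7, date(2010, 3, 9))
```

Because the range on `created_at` is always present, the planner has what it needs to skip partitions outside that month.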
Have a mechanism for measuring the changes you make. How much CPU, RAM and swapping is happening before and after tuning? Use top, vmstat, etc.
* pgPool is more complex: replication, distributed query management. pgBouncer is more lightweight.
* Table partitioning allows you to manage dead data without overhead, keeping your database healthy. Performance, of course.
Grabbing the PostgreSQL Elephant by the Trunk
• Init the database cluster
sudo -u postgres pg_ctl -D /usr/local/pgsql/data init
• Start the server
sudo -u postgres pg_ctl -D /usr/local/pgsql/data start
(there are SysV init scripts in contrib/start-scripts)
• Change postgres password:
sudo -u postgres psql template1
ALTER USER postgres WITH PASSWORD 'new_password';
• Create role
sudo -u postgres createuser --no-superuser --createdb --no-createrole --login --pwprompt --encrypted -h 127.0.0.1 -p 5432 hgimenez
• pg_hba.conf (host based auth)
type   database  user  cidr-address  method
local  all       all                 ident
host   all       all   10.2.0.0/16   md5
• Create a “virgin” database
From the psql prompt
create database my_awesome_app_development owner=hgimenez;
From your shell
createdb -U postgres -O db_user -T template0 -E 'utf8' db_name
Get acquainted with psql
psql -d db_name
\set ECHO_HIDDEN true
Rails plugins and Ruby tools
Constraints, views and
other migration helpers:
Full text search (tsearch2):
The database toolkit:
“ Permission to use, copy, modify, and distribute
this software and its documentation for any
purpose, without fee, and without a written
agreement is hereby granted, provided that the
above copyright notice and this paragraph and
the following two paragraphs appear in all