PostgreSQL and the future
Aaron Thul
Who am I?
•  Computer & Database Geek
•  Formerly a Debian SysAdmin
•  Presently Vice President of Technology at a
   EMOL
•  PostgreSQL Evangelist
•  Penguicon Organizer
Ground rules
•  Questions are good
•  Arguments are bad
Agenda
•  Big News
•  New Features
•  State of PostgreSQL Community
•  Evil
•  Profit
First Things First

•  There is some big news
First Things First

•  There is some big news
•  PostgreSQL 8.5 is…
First Things First

•  There is some big news
•  PostgreSQL 8.5 is…


• PostgreSQL 9.0 !
Why 8.5 became 9.0

•  Built-in log-streaming replication
•  Hot standby
New Features
Streaming replication

•  Warm Standby 8.x to Hot Standby
•  New kinds of postmaster processes, walsenders
   and walreceiver
•  Thanks to Streaming Replication, lag should be
   nearly zero
DROP IF EXISTS
•  Columns
•  Constraints
Better messages for unique violation
•  Improve unique-constraint-violation error
   messages to include the exact values being
   complained of.
Deferrable Uniqueness
•  Massive unique-key reassignments
MOVE FORWARD or BACKWARD
•  MOVE {FORWARD,BACKWARD} X
•  More easily move around in a curser
’samehost’ and ’samenet’
•  Does anyone know what HBA stands for?
•  To lazy to list all the hosts on your network
•  CIDR-ADDRESS
GRANT ALL
•  Grant privileges on all existing tables
•  Automatically grant privileges on all tables that
   will be created in this database in the future
TRIGGERS on columns
•  SQL-compliant triggers on columns, ie fire only
   if certain columns are named in the UPDATE's
   SET list
•  Help resolve circular TRIGGER issues
Conditional TRIGGERS
CREATE TRIGGER test_u
 AFTER UPDATE ON test
 FOR EACH ROW
 WHEN ( NEW.i <= 0 )
 EXECUTE PROCEDURE test_u();
VACUUM FULL
•  VACUUM FULL will start to behave like
   CLUSTER
•  Will not actually be reordering rows
•  VACUUM FULL INPLACE
Buffers info for explain
explain ( analyze on, buffers on ) select count(*)
  from pg_attribute ;

  QUERY PLAN
----------------------------------------------------
  ----------------------

 Aggregate (cost=64.70..64.71 rows=1 width=0)
  (actual time=0.466..0.466 rows=1 loops=1)
   Buffers: shared hit=18 read=21
   -> Seq Scan on pg_attribute (cost=0.00..59.56
  rows=2056 width=0) (actual time=0.002..0.301
  rows=2002 loops=1)
         Buffers: shared hit=18 read=21
 Total runtime: 0.492 ms
The list goes on
•  Multi-threaded pgbench
•  Hinting for number of distinct values
•  Machine readable EXPLAIN
•  Checking password strength
•  PL/pgSQL by default
PostgreSQL Community
Who makes database Software
•  Oracle/SUN
•  IBM
•  Microsoft
PostgreSQL Community
•  Community not company
•  Growing
•  International
•  PostgreSQL Association
•  Nothing happening with SUN/MySQL/Oracle is
   hurting the PostgreSQL community
www.postgresql.org
•  downloads
•  documentation
•  bug reports
•  security alerts
•  wiki
•  support companies
www.pgfoundry.org
•  Modules
•  Programs
•  Resources
www.planetpostgresql.org
•  Project News
•  Community News
•  Helpful Tips / Examples
archives.postgresql.org
•  mailing list archives back to 1997
•  full text search via postgtresql 8.3
•  keyword search suggestions
#postgresql
•  irc.freenode.net
•  real time help
•  rtfm_please - ??help
PostgreSQL Community Events
•  PGCon
•  PostgreSQL Conference 2009 Japan
•  European PGDay
•  PgWest
•  PgEast
•  PGCon Brazil
•  Local PUGs
•  http://wiki.postgresql.org/wiki/Events
Project Management
•  Core team
•  Committers
•  -hackers
•  Roadmap
•  Web team
Threats to PostgreSQL
•  Patent attacks
•  Hiring of volunteers to work on unrelated
   projects
Evil
Data Types
•  Just use text
  ▫  char/varchar/text the same under the hood
  ▫  avoid artificial limits
•  Focus on functions
  ▫  Phone numbers often require string manipulation
  ▫  Unix timestamp vs. Date arithmetic
•  Minimize typecasts
Take advantage of strong data typing
•  CHECK limits input at column level
•  ENUM limits specific values at type level
 ▫  Allows you to define a custom order, provides
    compact storage
•  DOMAIN defines a data type within constraint
   boundaries
•  Often outperforms JOIN on lookup tables
•  Allows for simpler schema design
Normalization (3NF)
•  All non-key columns must be directly dependent on PK
   ▫  Head is only dependent on the Name through the
      Workgroup column
Surrogate Keys
•  Natural Key (NK) is a CK with a natural
   relationship to that row
•  Surrogate Key (SK) is an artificially added
   unique identifier
 ▫  Since they are artificial they make queries harder
    to readand can lead to more joins
•  Integers do not significantly improve JOIN
   performance or reduce file I/O for many data
   sets
Bareword ids
•  Makes SQL less obvious to read
 ▫  SELECT id, id, id FROM foo, bar, baz WHERE …
 ▫  Makes ANSI JOIN syntax more cumbersome
    JOIN foo USING (bar_id)
 ▫  JOIN foo ON (foo.bar_id = bar.id)
•  Often resort to alias columns to add clarity,
   scoping
•  Some ORMs really like this (can be overridden
•  Use verbose id names instead
 ▫  Create table actor (actor_id, full_name text);
Use standard definitions
•  Often data has been designed in a standard way
 ▫  Country Code
 ▫  Email address
 ▫  Zip Code
 ▫  VIN
 ▫  SEX (ISO 5218)
•  Helps eliminate short-sightedness
•  Increases commonality across projects
Images in the database
•  Replication
•  Backups
•  Access control
•  Transactions
•  OS Portability
Over indexing
•  Indexes must be updated when data changes
   occur
 ▫  INSERT/UPDATE/DELETE all touch indexes
 ▫  Some like it HOT, pg_stat_all_tables
•  BitMap vs. Multi-Column Indexes
 ▫  Combine index on (a) and (b) in memory
 ▫  Index on (x,y,z) implies index on (x) and (x,y)
•  Make sure indexes are used
 ▫  pg_stat_all_indexes
Covering indexes
•  Creating indexes to avoid accessing data in the
   table
 ▫  TOAST makes this less necessary
 ▫  Visibility information stored in the table
Full text indexing
•  IT ROCKS!
•  Add search engine style functionality to DBMS
 ▫  LIKE '%foo%' and LIKE '%foo' cannot use index
 ▫  Regex searching has similar issues
 ▫  Built-in tsearch functionality in 8.3+
     GIN, expensive to update, very fast for searching
     GIST, cheaper to update, not as fast for searching
•  Database Specific Syntax
Profit
Lessons To Take Away
•  PostgreSQL 9.0 is going to rock
•  Never confuse a company with community
•  Some mistakes are just to much fun to make
   only once
•  Never ask for directions from a two-headed
   tourist!
   -Big Bird
Thanks to
•  The PostgreSQL Community!
•  Robert Treat
•  Hubert Lubaczewski
•  More info:
 ▫  http://www.depesz.com/
 ▫  http://www.xzilla.net/
Questions
•  Web: http://www.chasingnuts.com
•  Email: aaron@chasingnuts.com
•  IRC: AaronThul on irc.freenode.org
•  Jabber: apthul@gmail.com
•  Twitter: @AaronThul
•  AIM: AaronThul

PostgreSQL 9.0 & The Future

  • 1.
    PostgreSQL and thefuture Aaron Thul
  • 2.
    Who am I? • Computer & Database Geek •  Formerly a Debian SysAdmin •  Presently Vice President of Technology at a EMOL •  PostgreSQL Evangelist •  Penguicon Organizer
  • 5.
    Ground rules •  Questionsare good •  Arguments are bad
  • 6.
    Agenda •  Big News • New Features •  State of PostgreSQL Community •  Evil •  Profit
  • 7.
    First Things First • There is some big news
  • 8.
    First Things First • There is some big news •  PostgreSQL 8.5 is…
  • 9.
    First Things First • There is some big news •  PostgreSQL 8.5 is… • PostgreSQL 9.0 !
  • 10.
    Why 8.5 became9.0 •  Built-in log-streaming replication •  Hot standby
  • 11.
  • 12.
    Streaming replication •  WarmStandby 8.x to Hot Standby •  New kinds of postmaster processes, walsenders and walreceiver •  Thanks to Streaming Replication, lag should be nearly zero
  • 13.
    DROP IF EXISTS • Columns •  Constraints
  • 14.
    Better messages forunique violation •  Improve unique-constraint-violation error messages to include the exact values being complained of.
  • 15.
    Deferrable Uniqueness •  Massiveunique-key reassignments
  • 16.
    MOVE FORWARD orBACKWARD •  MOVE {FORWARD,BACKWARD} X •  More easily move around in a curser
  • 17.
    ’samehost’ and ’samenet’ • Does anyone know what HBA stands for? •  To lazy to list all the hosts on your network •  CIDR-ADDRESS
  • 18.
    GRANT ALL •  Grantprivileges on all existing tables •  Automatically grant privileges on all tables that will be created in this database in the future
  • 19.
    TRIGGERS on columns • SQL-compliant triggers on columns, ie fire only if certain columns are named in the UPDATE's SET list •  Help resolve circular TRIGGER issues
  • 20.
    Conditional TRIGGERS CREATE TRIGGERtest_u AFTER UPDATE ON test FOR EACH ROW WHEN ( NEW.i <= 0 ) EXECUTE PROCEDURE test_u();
  • 21.
    VACUUM FULL •  VACUUMFULL will start to behave like CLUSTER •  Will not actually be reordering rows •  VACUUM FULL INPLACE
  • 22.
    Buffers info forexplain explain ( analyze on, buffers on ) select count(*) from pg_attribute ; QUERY PLAN ---------------------------------------------------- ---------------------- Aggregate (cost=64.70..64.71 rows=1 width=0) (actual time=0.466..0.466 rows=1 loops=1) Buffers: shared hit=18 read=21 -> Seq Scan on pg_attribute (cost=0.00..59.56 rows=2056 width=0) (actual time=0.002..0.301 rows=2002 loops=1) Buffers: shared hit=18 read=21 Total runtime: 0.492 ms
  • 23.
    The list goeson •  Multi-threaded pgbench •  Hinting for number of distinct values •  Machine readable EXPLAIN •  Checking password strength •  PL/pgSQL by default
  • 24.
  • 25.
    Who makes databaseSoftware •  Oracle/SUN •  IBM •  Microsoft
  • 26.
    PostgreSQL Community •  Communitynot company •  Growing •  International •  PostgreSQL Association •  Nothing happening with SUN/MySQL/Oracle is hurting the PostgreSQL community
  • 27.
    www.postgresql.org •  downloads •  documentation • bug reports •  security alerts •  wiki •  support companies
  • 28.
  • 29.
    www.planetpostgresql.org •  Project News • Community News •  Helpful Tips / Examples
  • 30.
    archives.postgresql.org •  mailing listarchives back to 1997 •  full text search via postgtresql 8.3 •  keyword search suggestions
  • 31.
    #postgresql •  irc.freenode.net •  realtime help •  rtfm_please - ??help
  • 32.
    PostgreSQL Community Events • PGCon •  PostgreSQL Conference 2009 Japan •  European PGDay •  PgWest •  PgEast •  PGCon Brazil •  Local PUGs •  http://wiki.postgresql.org/wiki/Events
  • 33.
    Project Management •  Coreteam •  Committers •  -hackers •  Roadmap •  Web team
  • 34.
    Threats to PostgreSQL • Patent attacks •  Hiring of volunteers to work on unrelated projects
  • 35.
  • 36.
    Data Types •  Justuse text ▫  char/varchar/text the same under the hood ▫  avoid artificial limits •  Focus on functions ▫  Phone numbers often require string manipulation ▫  Unix timestamp vs. Date arithmetic •  Minimize typecasts
  • 37.
    Take advantage ofstrong data typing •  CHECK limits input at column level •  ENUM limits specific values at type level ▫  Allows you to define a custom order, provides compact storage •  DOMAIN defines a data type within constraint boundaries •  Often outperforms JOIN on lookup tables •  Allows for simpler schema design
  • 38.
    Normalization (3NF) •  Allnon-key columns must be directly dependent on PK ▫  Head is only dependent on the Name through the Workgroup column
  • 39.
    Surrogate Keys •  NaturalKey (NK) is a CK with a natural relationship to that row •  Surrogate Key (SK) is an artificially added unique identifier ▫  Since they are artificial they make queries harder to readand can lead to more joins •  Integers do not significantly improve JOIN performance or reduce file I/O for many data sets
  • 40.
    Bareword ids •  MakesSQL less obvious to read ▫  SELECT id, id, id FROM foo, bar, baz WHERE … ▫  Makes ANSI JOIN syntax more cumbersome JOIN foo USING (bar_id) ▫  JOIN foo ON (foo.bar_id = bar.id) •  Often resort to alias columns to add clarity, scoping •  Some ORMs really like this (can be overridden •  Use verbose id names instead ▫  Create table actor (actor_id, full_name text);
  • 41.
    Use standard definitions • Often data has been designed in a standard way ▫  Country Code ▫  Email address ▫  Zip Code ▫  VIN ▫  SEX (ISO 5218) •  Helps eliminate short-sightedness •  Increases commonality across projects
  • 42.
    Images in thedatabase •  Replication •  Backups •  Access control •  Transactions •  OS Portability
  • 43.
    Over indexing •  Indexesmust be updated when data changes occur ▫  INSERT/UPDATE/DELETE all touch indexes ▫  Some like it HOT, pg_stat_all_tables •  BitMap vs. Multi-Column Indexes ▫  Combine index on (a) and (b) in memory ▫  Index on (x,y,z) implies index on (x) and (x,y) •  Make sure indexes are used ▫  pg_stat_all_indexes
  • 44.
    Covering indexes •  Creatingindexes to avoid accessing data in the table ▫  TOAST makes this less necessary ▫  Visibility information stored in the table
  • 45.
    Full text indexing • IT ROCKS! •  Add search engine style functionality to DBMS ▫  LIKE '%foo%' and LIKE '%foo' cannot use index ▫  Regex searching has similar issues ▫  Built-in tsearch functionality in 8.3+   GIN, expensive to update, very fast for searching   GIST, cheaper to update, not as fast for searching •  Database Specific Syntax
  • 46.
  • 47.
    Lessons To TakeAway •  PostgreSQL 9.0 is going to rock •  Never confuse a company with community •  Some mistakes are just to much fun to make only once •  Never ask for directions from a two-headed tourist! -Big Bird
  • 48.
    Thanks to •  ThePostgreSQL Community! •  Robert Treat •  Hubert Lubaczewski •  More info: ▫  http://www.depesz.com/ ▫  http://www.xzilla.net/
  • 49.
    Questions •  Web: http://www.chasingnuts.com • Email: aaron@chasingnuts.com •  IRC: AaronThul on irc.freenode.org •  Jabber: apthul@gmail.com •  Twitter: @AaronThul •  AIM: AaronThul