8.4 Upcoming Features

Version 8.4
Holy Frijole,
That's a lot of features!
February 2009
San Francisco
Josh Berkus, PostgreSQL Core Team

8.4: A Few Patches
5 CommitFests
9 Months of Development
Over 1600 GIT Updates
More Than 2 Dozen Major Features

Looks like some database projects still
know how to put out a new version.

Some-but-not-all 8.4 Features
● Windowing Functions
● Common Table Expressions
● Parallel Restore
● CIText
● array_agg
● auto_explain
● SQL/MED connection manager
● Per-Database Collations
● \d commands improved
● Multicolumn GIN indexes
● Column-level permissions
● Unsigned Integers
● Boyer-Moore String Searching
● Improved Hash Indexes
● More DTrace probes
● Default & Variadic parameters
● New PL/pgSQL statements
● pg_stat_statements
● pg_stat_user_functions
● SSL refactor
● pg_hba improvements
● Performance improvements

SQL Features
● Windowing Functions
● Common Table Expressions
● array_agg
● Per-database Collations
● New data types
– Unsigned Integers
– CIText
● Improved \d commands
● Add columns to existing VIEWs

Windowing Functions
● Aggregate over part of the data
– SQL 2008 standard
– Great for BI, OLAP
● Functions:
– row_number()
– rank()
– lead()
– lag()
● More from David Fetter later!
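
A small, runnable sketch of lag() and row_number(), which the list above names without showing. Assumptions: Python's bundled SQLite (3.25 or later) stands in for a PostgreSQL server so this runs anywhere, the empsalary rows are invented sample data, and the helper name salary_gaps is made up. The SELECT sticks to standard window syntax that 8.4 is expected to accept too.

```python
import sqlite3

# Invented sample rows: (depname, empno, salary)
SAMPLE = [
    ("develop", 7, 4200),
    ("develop", 8, 6000),
    ("develop", 9, 4500),
    ("sales", 1, 5000),
    ("sales", 3, 4800),
]


def salary_gaps(rows):
    """Rank salaries inside each department and show the gap to the next-lower one."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE empsalary (depname TEXT, empno INTEGER, salary INTEGER)"
    )
    conn.executemany("INSERT INTO empsalary VALUES (?, ?, ?)", rows)
    query = """
        SELECT depname,
               empno,
               salary,
               row_number() OVER w          AS pos,  -- 1, 2, 3... per department
               salary - lag(salary) OVER w  AS gap   -- NULL for the first row
        FROM empsalary
        WINDOW w AS (PARTITION BY depname ORDER BY salary, empno)
        ORDER BY depname, pos
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in salary_gaps(SAMPLE):
        print(row)
```

Unlike GROUP BY, the window keeps every input row: each employee still appears, with the per-department position and gap added alongside.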

Windowing Functions
SELECT
y,
m,
SUM(SUM(people)) OVER (PARTITION BY y ORDER BY m),
AVG(people)
FROM(
SELECT
EXTRACT(YEAR FROM accident_date) AS y,
EXTRACT(MONTH FROM accident_date) AS m,
*
FROM
accident
)s
GROUP BY y, m;

SELECT
depname,
empno,
salary,
rank() OVER
(PARTITION BY depname
ORDER BY salary)
FROM
empsalary;

Common Table Expressions
● Ability to create "named subqueries" for your
query.
● Best use: WITH RECURSIVE
– real recursive queries
– "walk" trees with one query

● more from David Fetter later

Common Table Expressions
WITH RECURSIVE subdepartment AS
(
-- non-recursive term
SELECT * FROM department WHERE id = 'A'

UNION ALL

-- recursive term referring to "subdepartment"
SELECT d.* FROM department AS d, subdepartment AS sd
WHERE d.id = sd.parent_department
)
SELECT * FROM subdepartment;

array_agg
● History:
– added Arrays in 7.4
● array_accum() aggregate example code
– intarray contrib module in 8.0
● only ints, but very fast
● array_agg() in 8.4: all arrays, fast C Code from Robert Haas, new contributor!

SELECT status, array_agg(username) FROM
logins GROUP BY status;

Per-Database Collations
● Collations (ordering character sets) used to be
per installation
● Now they are per database
● Someday they will be per column
● Google Summer of Code Project!
CREATE DATABASE mydb
LC_COLLATE 'sv_SE.UTF-8'
LC_CTYPE 'sv_SE.UTF-8'
TEMPLATE template0;

New Data Types
● Make migrating from other DBMSes easier
● CIText (in /contrib)
– Case Insensitive Text
– Full CI indexing, comparisons
● Unsigned Integers (in pgFoundry)
– migrate from MySQL, others

Better \d in psql
● \d is now multi-version compatible
– \dt etc. won't error if you connect an 8.4 client to an
8.2 database
● \df for user functions only
– \dfS for system functions
● \ef to edit a function

Add columns to VIEWs
● In the bad old days:
– need to add another column to your VIEW?
– have to drop it & recreate it
– have to drop & recreate all dependencies
– enter the World Of Pain
● In 8.4:
– CREATE OR REPLACE VIEW lets you add columns at the end
– Can't rename or modify existing columns though

Performance & Monitoring
● Parallel Restore
● Improved Hash Indexes
● pg_stat_user_functions
● pg_stat_statements
● More DTrace probes
● auto_explain
● Other Performance Improvements

Parallel Restore
● In 8.3, we were single-threaded

pg_dump -> dump file -> Restore

8 Hours

Parallel Restore
● In 8.4: Multi-core Restore!

pg_dump -> dump file -> 8 x Restore (in parallel)

2 Hours

Improved Hash Indexes
● Our old hash indexes were slow and useless
● Improved hash indexes are fast!
– use them for ID columns
● or other unique keys
– not completely recovery-safe yet though
● don't switch over production DBs until 8.5
● Google Summer of Code project!

pg_stat_user_functions
● For each of your functions, see
– # of times called
– amount of time spent
– amount of time spent excluding other functions

pg_stat_statements

log -> log file -> pgFouine

More DTrace Probes
* Probes to measure query time
query-parse-start (int, char *)
query-parse-done (int, char *)
query-plan-start ()
query-plan-done ()
query-execute-start ()
query-execute-done ()
query-statement-start (int, char *)
query-statement-done (int, char *)

* Probes to measure dirty buffer writes by the backend because bgwriter is not effective
dirty-buffer-write-start (int, int, int, int)
dirty-buffer-write-done (int, int, int, int)

* Probes to measure physical writes from the shared buffer
buffer-write-start (int, int, int, int)
buffer-write-done (int, int, int, int, int)

* Probes to measure reads of a relation from a particular buffer block
buffer-read-start (int, int, int, int, int)
buffer-read-done (int, int, int, int, int, int)

* Probes to measure the effectiveness of buffer caching
buffer-hit ()
buffer-miss ()

* Probes to measure I/O time because wal_buffers is too small
wal-buffer-write-start ()
wal-buffer-write-done ()

* Probes to measure checkpoint stats such as running time, buffers written, xlog files added, removed, recycled, etc
checkpoint-start (int)
checkpoint-done (int, int, int, int, int)

* Probes to measure Idle in Transaction and client/network time
idle-transaction-start (int, int)
idle-transaction-done ()

* Probes to measure sort time
sort-start (int, int, int, int, int)
sort-done (int, long)

* Probes to determine whether or not the deadlock detector has found a deadlock
deadlock-found ()
deadlock-notfound (int)

* Probes to measure reads/writes by block numbers and relations
smgr-read-start (int, int, int, int)
smgr-read-end (int, int, int, int, int, int)
smgr-write-start (int, int, int, int)
smgr-write-end (int, int, int, int, int, int)

auto_explain
● misnamed; actually allows you to manually set specific
queries/sessions/functions to output explain plans to the log
postgres=# LOAD 'auto_explain';
postgres=# SET auto_explain.log_min_duration = 0;
postgres=# SELECT count(*)
FROM pg_class, pg_index
WHERE oid = indrelid AND indisunique;

This might produce log output such as:

LOG: duration: 0.986 ms plan:
Aggregate (cost=14.90..14.91 rows=1 width=0)
-> Hash Join (cost=3.91..14.70 rows=81 width=0)
Hash Cond: (pg_class.oid = pg_index.indrelid)
-> Seq Scan on pg_class (cost=0.00..8.27 rows=227 width
-> Hash (cost=2.90..2.90 rows=81 width=4)
-> Seq Scan on pg_index (cost=0.00..2.90 rows=81

More Performance Improvements
● Free Space Map is dynamically sized (no more
max_fsm_pages!)
● Visibility Map
– VACUUM only changed pages
– Index-only Scans in 8.5
● Less writing to pgstat file
– plus you can move it

Stored Procedures
● Default Parameters
● Variadic Parameters
● New PL/pgSQL Statements
● PL/pythonU OUT Parameters

DEFAULT parameters
CREATE OR REPLACE FUNCTION
adder(a int DEFAULT 40,
b int DEFAULT 2)
RETURNS int LANGUAGE 'sql'
AS 'select $1 + $2';

SELECT adder();
SELECT adder(1);
SELECT adder(1, 2);

VARIADIC parameters
CREATE OR REPLACE FUNCTION
adder(VARIADIC v int[])
RETURNS int AS $$
DECLARE s int; i int;
BEGIN
s:=0;
FOR i IN SELECT generate_subscripts(v,1) LOOP
s := s + v[i];
END LOOP;
RETURN s;
END;
$$ LANGUAGE 'plpgsql';

SELECT adder(1);
SELECT adder(1,2,3);
SELECT adder(40,2);

New PL/pgSQL Statements
● RETURNS TABLE
– SQL-compliant alias for "SETOF"
● CASE statement
– real switching logic
CASE
WHEN x BETWEEN 0 AND 10 THEN
msg := 'value is between zero and ten';
WHEN x BETWEEN 11 AND 20 THEN
msg := 'value is between eleven and twenty';
END CASE;

PL/pythonU OUT Parameters
● You now can use IN, OUT and INOUT
parameters with PL/pythonU functions.
● That's it!

Exotic Features
● SQL/MED Connection Manager
● Multi-column GIN Indexes
● Boyer-Moore String Searching

SQL/MED
● Foundation for connecting to external servers
– Future of PL/proxy and DBconnect
– Future of DBI-Connect

CREATE FOREIGN DATA WRAPPER pgsql LIBRARY
'pgsql_fdw';
CREATE SERVER foo FOREIGN DATA WRAPPER pgsql
OPTIONS (host 'remotehost', dbname 'remotedb');
CREATE USER MAPPING FOR PUBLIC SERVER foo OPTIONS
(username 'bob', password 'secret');

Multi-Column GIN Indexes
● Bad Old Days: to do a single Full Text Search
index over several columns, you had to
concatenate them.
● New Goodness: you can now do a proper
multicolumn index
– and it's faster!

Boyer-Moore String Searching

No, I don't know what it is either.

But we have it now.
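
For the curious: as far as the 8.4 release notes describe it, this item is Boyer-Moore-Horspool searching inside text_position() and related functions, which is what backs calls like strpos() and replace(). The sketch below shows the idea in Python. It is an illustration only, not PostgreSQL's C code, and horspool_find is a made-up name.

```python
def horspool_find(haystack: str, needle: str) -> int:
    """Return the index of the first occurrence of needle in haystack, or -1."""
    n, m = len(haystack), len(needle)
    if m == 0:
        return 0
    if m > n:
        return -1

    # Skip table: for every character of the needle except its last one,
    # the distance from that character's last occurrence to the needle's end.
    # Characters not in the table allow a full skip of m positions.
    shift = {ch: m - 1 - i for i, ch in enumerate(needle[:-1])}

    pos = 0
    while pos <= n - m:
        # Compare right to left, as Boyer-Moore style searches do.
        j = m - 1
        while j >= 0 and haystack[pos + j] == needle[j]:
            j -= 1
        if j < 0:
            return pos
        # Slide the window based on the text character under the needle's end.
        pos += shift.get(haystack[pos + m - 1], m)
    return -1


if __name__ == "__main__":
    text = "PostgreSQL 8.4 has faster string searching"
    print(horspool_find(text, "string"))
```

The win comes from the skip table: when the character under the end of the needle does not occur in the needle at all, the search jumps ahead by the whole needle length instead of one position at a time.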

Security
● Refactored SSL
● Improved pg_hba.conf
● Column-level Permissions
● SE-Postgres

Refactored SSL by Magnus

● Proper certificate verification
– Choose level, full verification is default
● Control over all key and certificate files
● SSL certificate authentication
– Trusted root certificate
– Map «cn» value of certificate

pg_hba Improvements
● "crypt" is gone (insecure)
● «ident sameuser» => «ident»
● New format for options
– name=value for all options
● usermaps for all external methods
– with regexp support
● Parsed on reload

Column Permissions
REVOKE SELECT (col1, col2), INSERT (col1, col2)
ON tab1 FROM role2;

● Restrict access to sensitive columns from
unprivileged ROLEs
– more fine-grained security
– no longer need to use VIEWs to do this

Many Patches == Lots of Testing
● Bug Testing
– can you make 8.4 crash?
● Specification Testing
– do the features do what the docs say they do?
● Performance Testing
– is 8.4 really faster? How much?
● Combinational Testing
– what happens when you put several new features
together?

Many Patches == Lots of Testing
1. Take a copy of your production applications
2. Port them to 8.4
3. Report breakage and issues
4. Play with implementing new features

Do It Now!
We're counting on you!

Contact Information
● Josh Berkus
– josh@postgresql.org
– http://it.toolbox.com/blogs/database-soup
● Upcoming events
– SCALE 7, Los Angeles, Feb. 20
– pgCon 2009, Ottawa, May 20

This talk is copyright 2009 Josh Berkus, and is licensed under the Creative Commons Attribution License
