SQL Features● Windowing Functions● Common Table Expressions● array_agg● Per-database Collations● New data types – Unsigned Integers – CIText● Improved d commands● Add columns to existing VIEWs
Windowing Functions● Aggregate over part of the data – SQL 2008 standard – Great for BI, OLAP● Functions: – row_number() – rank() – lead() – lag()● More from David Fetter later!
Windowing FunctionsSELECT y, m, SUM(SUM(people)) OVER (PARTITION BY y ORDER BY m), AVG(people)FROM( SELECT EXTRACT(YEAR FROM accident_date) AS y, EXTRACT(MONTH FROM accident_date) AS m, * FROM accident SELECT)s depname,GROUP BY y, m; empno, salary, rank() OVER (PARTITION BY depname ORDER BY salary) FROM empsalary;
Common Table Expressions● Ability to create "named subqueries" for your query.● Best use: WITH RECURSIVE – real recursive queries – "walk" trees with one query● more from David Fetter later
Common Table ExpressionsWITH RECURSIVE subdepartment AS( -- SELECT * FROM department WHERE id = A UNION ALL -- recursive term referring to "subdepartment" SELECT d.* FROM department AS d, subdepartmentAS sd WHERE d.id = sd.parent_department)SELECT * FROM subdepartment;
array_agg● History: – added Arrays in 7.4 ● array_accum() aggregate example code – intarray contrib module in 8.0 ● only ints, but very fast● array_agg() in 8.4: all arrays, fast C Code from Robert Haas, new contributor! –SELECT status, array_agg(username) FROM logins GROUP BY status;
Per-Database Collations● Collations (ordering character sets) used to be per installation● Now they are per database● Someday they will be per column● Google Summer of Code Project!CREATE DATABASE mydbCOLLATE sv_se.UTF-8CTYPE sv_se.UTF-8TEMPLATE template0
New Data Types● Make migrating from other DBMSes easier● CIText (in /contrib) – Case Insensitive Text – Full CI indexing, comparisons● Unsigned Integers (in pgFoundry) – migrate from MySQL, others
Better d in psql● d is now multi-version compatible – dt etc. wont error if you connect an 8.4 client to an 8.2 database● df for user functions only – dfS for system functions● ef to edit a funcion
Add columns to VIEWs● In the bad old days: – need to add another column to your VIEW? – have to drop it & recreate it – have to drop & recreate all dependancies – enter the World Of Pain● In 8.4: – ALTER VIEW lets you add columns – Cant rename or modify though
Performance & Monitoring● Parallel Restore● Improved Hash Indexes● pg_stat_user_functions● pg_stat_statements● More Dtrace probes● auto_explain● Other Performance Improvements
Parallel Restore● In 8.3, we were single-threaded pg_dump dump Restore file 8 Hours
Improved Hash Indexes● Our old hash indexes were slow and useless● Improved hash indexes are fast! – use them for ID columns ● or other unique keys – not completely recovery-safe yet though ● dont switch over production DBs until 8.5● Google Summer of Code project!
pg_stat_user_functions● For each of your functions, see – # of times called – amount of time spent – amount of time spent excluding other functions
pg_stat_statementspostgres=# SELECT * FROM pg_stat_statements ORDER BY total_time DESCLIMIT 3;-[ RECORD 1 ]------------------------------------------------------------userid | 10dbid | 63781query | UPDATE branches SET bbalance = bbalance + $1 WHERE bid = $2;calls | 3000total_time | 20.716706rows | 3000-[ RECORD 2 ]------------------------------------------------------------userid | 10dbid | 63781query | UPDATE tellers SET tbalance = tbalance + $1 WHERE tid = $2;calls | 3000total_time | 17.1107649999999rows | 3000-[ RECORD 3 ]------------------------------------------------------------userid | 10dbid | 63781query | UPDATE accounts SET abalance = abalance + $1 WHERE aid = $2;calls | 3000total_time | 0.645601rows | 3000
More DTrace Probes* Probes to measure query time * Probes to measure checkpoint stats such as running time,query-parse-start (int, char *) buffers written, xlog files added, removed, recycled, etcquery-parse-done (int, char *)query-plan-start () checkpoint-start (int)query-plan-done () checkpoint-done (int, int, int, int, int)query-execute-start ()query-execute-done () * Probes to measure Idle in Transaction and client/networkquery-statement-start (int, char *) timequery-statement-done (int, char *) idle-transaction-start (int, int) idle-transaction-done ()* Probes to measure dirty buffer writes by the backend because * Probes to measure sort timebgwriter is not effective sort-start (int, int, int, int, int) sort-done (int, long)dirty-buffer-write-start (int, int, int, int)dirty-buffer-write-done (int, int, int, int) * Probes to determine whether or not the deadlock detector* Probes to measure physical writes from the shared buffer has found a deadlockbuffer-write-start (int, int, int, int)buffer-write-done (int, int, int, int, int) deadlock-found () deadlock-notfound (int)* Probes to measure reads of a relation from a particular bufferblock * Probes to measure reads/writes by block numbers andbuffer-read-start (int, int, int, int, int) relationsbuffer-read-done (int, int, int, int, int, int) smgr-read-start (int, int, int, int) smgr-read-end (int, int, int, int, int, int)* Probes to measure the effectiveness of buffer caching smgr-write-start (int, int, int, int)buffer-hit () smgr-write-end (int, int, int, int, int, int)buffer-miss ()* Probes to measure I/O time because wal_buffers is too smallwal-buffer-write-start ()wal-buffer-write-done ()
auto_explain● misnamed; actually allows you to manually set specific queries/sessions/functions to output explain plans to the log postgres=# LOAD auto_explain; postgres=# SET auto_explain.log_min_duration = 0; postgres=# SELECT count(*) FROM pg_class, pg_index WHERE oid = indrelid AND indisunique; This might produce log output such as: LOG: duration: 0.986 ms plan: Aggregate (cost=14.90..14.91 rows=1 width=0) -> Hash Join (cost=3.91..14.70 rows=81 width=0) Hash Cond: (pg_class.oid = pg_index.indrelid) -> Seq Scan on pg_class (cost=0.00..8.27 rows=227 width -> Hash (cost=2.90..2.90 rows=81 width=4) -> Seq Scan on pg_index (cost=0.00..2.90 rows=81●
More Performance Improvements● Free Space Map is dynamically sized (no more max_fsm_pages!)● Visibility Map – VACUUM only changed pages – Index-only Scans in 8.5● Less writing to pgstat file – plus you can move it
Stored Procedures● Default Parameters● Variadic Parameters● New PL/pgSQL Statements● PL/pythonU OUT Parameters
DEFAULT parametersCREATE OR REPLACE FUNCTI ONadder ) a i nt de f a ul t 4 0 , b i nt de f a ul t 2 (RETURNS i nt LANGUAGE sql AS sel ect $ 1 + $ 2 ;SELECT adder ) ( ;SELECT adder ) 1( ;SELECT adder ) 1, 2( ;
VARIADIC parametersCREATE OR REPLACE FUNCTION adder(VARIADIC v int) RETURNS int AS $$DECLARE s int; i int;BEGIN s:=0; FOR i IN SELECT generate_subscripts(v,1) LOOP s := s + i; END LOOP;RETURN s;END;$$ LANGUAGE plpgsql;SELECT adder(1);SELECT adder(1,2,3);SELECT adder(40,2);
New PL/PgSQL Statements● RETURNS TABLE – SQL-compliant alias for "SETOF"● CASE statement – real switching logicCASE WHEN x BETWEEN 0 AND 10 THEN msg := value is between zero and ten; WHEN x BETWEEN 11 AND 20 THEN msg := value is between eleven and twenty;END CASE;
PL/pythonU OUT Parameters● You now can use IN, OUT and INOUT parameters with PL/pythonU functions.● Thats it!
SQL/MED● Foundation for connecting to external servers – Future of PL/proxy and DBconnect – Future of DBI-ConnectCREATE FOREIGN DATA WRAPPER pgsql LIBRARY pgsql_fdw;CREATE SERVER foo FOREIGN DATA WRAPPER pgsql OPTIONS (host remotehost, dbname remotedb);CREATE USER MAPPING FOR PUBLIC SERVER foo OPTIONS (username bob, password secret);
Multi-Column GIN Indexes● Bad Old Days: to do a single Full Text Search index over several columns, you had to concatenate them.● New Goodness: you can now do a proper multicolumn index – and its faster!
Refactored SSL by Magnus● Proper certificate verification – Choose level, full verification is default● Control over all key and certificate files● SSL certificate authentication – Trusted root certificate – Map «cn» value of certificate
pg_hba Improvements● "crypt" is gone (insecure)● «ident sameuser» => «ident»● New format for options – name=value for all options● usermaps for all external methods – with regexp support● Parsed on reload
Column PermissionsREVOKE SELECT (col1, col2), INSERT (col1, col2) ON tab1 FROM role2;● Restrict access to sensitive columns from unprivileged ROLEs – more fine-grained security – no longer need to use VIEWs to do this
Many Patches == Lots of Testing● Bug Testing – can you make 8.4 crash?● Specification Testing – do the features do what the docs say they do?● Performance Testing – is 8.4 really faster? How much?● Combinational Testing – what happens when you put several new features together?
Many Patches == Lots of Testing1. Take a copy of your production applications2. Port them to 8.43. Report breakage and issues4. Play with implementing new features Do It Now! Were counting on you!
Contact Information● Josh Berkus ● Upcoming events – email@example.com – SCALE 7, Los – http://it.toolbox.com/ Angeles, Feb. 20 blogs/database-soup – pgCon 2009, Ottawa, May 20 This talk is copyright 2009 Josh Berkus, and is licensed under the Creative Commons Attribution License