PostgreSQL 8.4 TriLUG 2009-11-12


  1. 1. PostgreSQL 8.4 features <ul>Andrew Dunstan [email_address] [email_address] </ul>
  2. 2. Topics <ul><li>General info and history
  3. 3. PostgreSQL general features (briefly)
  4. 4. PostgreSQL 8.4 features </li><ul><li>Not comprehensive
  5. 5. See the release notes </li></ul><li>A few looks at what's in the pipeline </li></ul>
  6. 6. Untopics <ul><li>Why PostgreSQL is better or worse than <fill in blank here>
  7. 7. Should the name be PostgreSQL or Postgres?
  8. 8. GPL vs BSD
  9. 9. Four legs Unix good, two legs Windows bad
  10. 10. emacs vs. vi
  11. 11. GUI vs command line
  12. 12. ... </li></ul>
  13. 13. Play along <ul><li>wget ftp://ftp10.us.postgresql.org/pub/postgresql/source/v8.4.1/postgresql-8.4.1.tar.bz2
  14. 14. tar -j -xf postgresql-8.4.1.tar.bz2
  15. 15. cd postgresql-8.4.1
  16. 16. ./configure --prefix=`pwd`/../pg84 --with-pgport=5678
  17. 17. make && make install && cd contrib && make && make install </li></ul>
  18. 18. Who uses PostgreSQL? <ul><li>Yahoo
  19. 19. myyearbook
  20. 20. Skype
  21. 21. Etsy
  22. 22. New York Post
  23. 23. Afilias
  24. 24. Whitepages.com </li></ul><ul><li>JourneyX
  25. 25. IMDB
  26. 26. Rockport
  27. 27. Apple
  28. 28. NTT
  29. 29. Cisco
  30. 30. National Weather Service </li></ul>
  31. 31. What uses PostgreSQL? <ul><li>Bugzilla
  32. 32. Wikipedia
  33. 33. Drupal
  34. 34. Bricolage
  35. 35. OpenACS
  36. 36. Gforge
  37. 37. xTuple/OpenRPT </li></ul><ul><li>OpenBravo
  38. 38. Serendipity
  39. 39. PostGIS
  40. 40. OpenStreetMap
  41. 41. Reddit
  42. 42. Trac
  43. 43. LedgerSMB </li></ul>
  44. 44. PostgreSQL history: Original Postgres project started at Berkeley in 1986 by Michael Stonebraker. SQL added in 1995. Current community project dates from 1996. Several developers from then are still active, e.g. Bruce Momjian.
  45. 45. PostgreSQL License <ul><li>BSD
  46. 46. Nobody owns the code, anyone can use the code
  47. 47. No monopoly </li></ul>
  48. 48. PostgreSQL philosophy <ul><li>Stability
  49. 49. Safety
  50. 50. Correctness
  51. 51. Robustness
  52. 52. Standards compliance
  53. 53. Performance </li></ul>
  54. 54. PostgreSQL Features <ul><li>Multi Version Concurrency Control (MVCC) </li><ul><li>Readers don't block writers
  55. 55. Writers don't block readers </li></ul><li>Extensive and extensible type system
  56. 56. Joins and Subqueries
  57. 57. Foreign keys
  58. 58. Namespaces (schemas)
  59. 59. Triggers </li></ul>
  60. 60. More PostgreSQL features <ul><li>Stored functions </li><ul><li>C
  61. 61. PL/pgSQL </li><ul><li>partial clone of Oracle's PL/SQL </li></ul><li>plPerl, plTcl, plPython, plRuby, plJava, plR, ..... </li></ul><li>Standard modules </li><ul><li>pgcrypto, dblink, uuid-ossp, ltree, .... </li></ul><li>Transactional DDL
  62. 62. Point In Time Recovery </li></ul>
  63. 63. PostgreSQL 8.4 <ul><li>Released 1 July 2009
  64. 64. 17 months in development
  65. 65. Over 200 new features and improvements </li></ul>
  66. 66. PostgreSQL 8.4 killer features <ul><li>Common Table Expressions
  67. 67. Window Functions
  68. 68. Parallel Restore </li></ul>
  69. 69. PostgreSQL 8.4 GBH features <ul><li>Column permissions
  70. 70. Variadic functions
  71. 71. Per database locales
  72. 72. Significant performance improvements
  73. 73. Version aware psql command </li></ul>
  74. 74. Common Table Expressions <ul><li>In SQL standard </li><ul><li>Put a query in a CTE and later treat it as a table </li></ul><li>with t as ( select a,b,c from foo ) select * from t;
  75. 75. with t as ( select a,b,c from foo ), s as (select a,d,e from bar) select b,d from t,s where t.a=s.a; </li></ul>
  76. 76. Recursive CTEs <ul><li>with recursive f as (select 0::numeric as a, 1::numeric as b, 1::int as r union select b, a+b, r+1 from f where r < 100 ) select b from f where r = 100;
  77. 77.  354224848179261915075 </li></ul>
  78. 78. Transitive closure with CTEs <ul><li>with recursive ancestors as (select 1::int as gen, parent as anc, child as des from children union select gen+1, anc, child from ancestors join children on des = parent ) select gen, anc from ancestors where des = 'fred';
  79. 79. Forget nested sets and similar monstrosities </li></ul>
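The ancestors query relies on a `children(parent, child)` table that the slides do not show. The sketch below assumes a two-column text layout and uses made-up sample rows so the query can be run as-is; only the recursive query itself comes from the slide.

```sql
-- Hypothetical table layout and sample data (not from the slides).
CREATE TEMP TABLE children (parent text, child text);
INSERT INTO children VALUES
    ('alice', 'bob'),
    ('bob',   'carol'),
    ('carol', 'fred');

-- The slide's query: start with every direct parent/child pair, then keep
-- extending each pair downwards by one generation until nothing new appears.
WITH RECURSIVE ancestors AS (
    SELECT 1::int AS gen, parent AS anc, child AS des
    FROM children
  UNION
    SELECT gen + 1, anc, child
    FROM ancestors
    JOIN children ON des = parent
)
SELECT gen, anc
FROM ancestors
WHERE des = 'fred'
ORDER BY gen;
```

With this sample data the query returns three rows: carol at generation 1, bob at generation 2 and alice at generation 3.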
  80. 80. Interesting effects with CTEs
      create or replace function hanoi
          (discs integer, move out integer, a out int[], b out int[], c out int[])
      returns setof record
      language sql as
      $$
      with recursive han as (
          select 1::int as move, $1 as ndiscs,
                 '{99}'::int[] || array_agg(discs) as a,
                 '{99}'::int[] as b,
                 '{99}'::int[] as c
          from generate_series($1,1,-1) as discs
        union all
          select move + 1, ndiscs,
                 hnext(move, ndiscs, a,b,c),
                 hnext(move, ndiscs, b,c,a),
                 hnext(move, ndiscs, c,a,b)
          from han
          where array_length(b,1) < ndiscs + 1
      )
      select move, a[2:$1+1] as a, b[2:$1+1] as b, c[2:$1+1] as c
      from han
      order by move
      $$;
      select * from hanoi(4);
  81. 81. Results:
       move |     a     |     b     |    c
      ------+-----------+-----------+---------
          1 | {4,3,2,1} | {}        | {}
          2 | {4,3,2}   | {}        | {1}
          3 | {4,3}     | {2}       | {1}
          4 | {4,3}     | {2,1}     | {}
          5 | {4}       | {2,1}     | {3}
          6 | {4,1}     | {2}       | {3}
          7 | {4,1}     | {}        | {3,2}
          8 | {4}       | {}        | {3,2,1}
          9 | {}        | {4}       | {3,2,1}
         10 | {}        | {4,1}     | {3,2}
         11 | {2}       | {4,1}     | {3}
         12 | {2,1}     | {4}       | {3}
         13 | {2,1}     | {4,3}     | {}
         14 | {2}       | {4,3}     | {1}
         15 | {}        | {4,3,2}   | {1}
         16 | {}        | {4,3,2,1} | {}
      (16 rows)
  82. 82. Sneak Peek: CTEs in 8.5 <ul><li>with t as (delete from foo where bar returning *) select * from t;
  83. 83. And similar for insert and update queries </li></ul>
  84. 84. Window functions <ul><li>In SQL standard
  85. 85. Similar to aggregates, but do not collapse rows
  86. 86. All aggregate functions can be used as window functions </li></ul>
  87. 87. Window function examples <ul><li>select *, row_number() over (order by foo) as rownum from bar order by foo
  88. 88. select *, row_number() over t as rownum from bar window t as (order by foo) order by foo
  89. 89. (you can omit the outer order by, but it's best not to). </li></ul>
  90. 90. More window function examples <ul><li>select salary, salary / avg(salary) over t, rank() over t from employees window t as ( partition by dept order by salary rows between unbounded preceding and unbounded following ) </li></ul>
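To see the "does not collapse rows" point in practice, here is a small self-contained sketch. The `employees` table layout and its rows are assumptions made up for illustration; the slides only name the table and the `dept` and `salary` columns.

```sql
-- Hypothetical sample data (not from the slides).
CREATE TEMP TABLE employees (name text, dept text, salary numeric);
INSERT INTO employees VALUES
    ('ann', 'eng',   100),
    ('bob', 'eng',   200),
    ('cat', 'eng',   300),
    ('dan', 'sales', 150),
    ('eve', 'sales', 250);

-- avg() is an ordinary aggregate used as a window function: every input row
-- is kept, and each one carries the average of its own department.
SELECT name, dept, salary,
       avg(salary) OVER (PARTITION BY dept) AS dept_avg,
       rank() OVER (PARTITION BY dept ORDER BY salary DESC) AS dept_rank
FROM employees
ORDER BY dept, dept_rank;
```

All five rows come back. A plain `GROUP BY dept` would have returned only two.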
  91. 91. Major missing SQL features <ul><li>grouping sets
  92. 92. merge </li></ul>
  93. 93. Parallel pg_restore <ul><li>My humble contribution :-)
  94. 94. Only works with custom format dumps
  95. 95. Uses specified number of connections to the database
  96. 96. Especially useful for partitioned databases
  97. 97. Typical speedup is around number_of_processors/2
  98. 98. Sweet spot is between number_of_processors and 2 * number_of_processors </li></ul>
  99. 99. Parallel pg_restore continued <ul><li>Uses a separate connection for each step
  100. 100. Individual steps are not parallelized
  101. 101. Uses dependency information to make sure steps are done in right order
  102. 102. Parallel clients are forked processes on *nix, threads on Windows
  103. 103. pg_restore -d dbname -j 4 dumpfile </li></ul>
  104. 104. Parallel pg_restore TODO <ul><li>Support tar format dumps
  105. 105. Parallel pg_dump </li><ul><li>Needs snapshot cloning
  106. 106. Needs new archive format (directory) </li></ul><li>Parallel COPY, index build </li></ul>
  107. 107. A small digression <ul><li>Three monkeys on our back
  108. 108. All being addressed </li></ul>
  109. 109. First monkey on our back <ul><li>No upgrade in place </li><ul><li>Attempted for 8.4, but many caveats </li></ul><li>Parallel pg_restore is some help
  110. 110. Two efforts underway ...
  111. 111. Core team committed to providing less painful upgrade mechanisms </li><ul><li>Makes it harder to make improvements in certain areas – can't just change on-disk format when we want </li></ul></ul>
  112. 112. Replication <ul><li>Often a “box to be checked”
  113. 113. Three forms: </li><ul><li>Statement replay
  114. 114. Data replay via triggers </li><ul><li>Needs unique key
  115. 115. Can cause foreign key constraint problems </li></ul><li>Log shipping </li><ul><li>Uses Write Ahead Log, for a whole cluster
  116. 116. Transparent to applications </li></ul></ul></ul>
  117. 117. The second monkey on our back <ul><li>No built-in management for log shipping </li><ul><li>Need third party tools or custom scripts to manage </li></ul><li>8.5 feature (nearly): Streaming Replication </li><ul><li>Developed by NTT in Japan
  118. 118. New special built in WALsender daemon
  119. 119. After initial backup (rsync) all managed via config file settings
  120. 120. Streams log records rather than waiting for complete WAL files – much lower latency </li></ul></ul>
  121. 121. Third monkey on our back <ul><li>Can't read from WAL based replicas </li><ul><li>Useful for failover, useless for load balancing
  122. 122. Many users thus have a combination of log shipping and trigger based replicas </li></ul><li>8.5 feature (nearly): Hot Standby </li><ul><li>Makes WAL based replicas available for read-only queries
  123. 123. Ideal for a report server, decision support etc.
  124. 124. Take load off transaction processing system </li></ul></ul>
  125. 125. Significant features <ul><li>Standard module citext: compares text case insensitively
  126. 126. Column level privileges </li><ul><li>grant select(colname), update(colname) on tabname to rolename;
  127. 127. Previously only way to do this was via a view
  128. 128. In SQL Standard </li></ul></ul>
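A slightly fuller sketch of column-level privileges follows. The table and role names are hypothetical, and it assumes the session is allowed to create roles and to `SET ROLE` to them (for example, a superuser session).

```sql
-- Hypothetical table and role, for illustration only.
CREATE TABLE staff (id int PRIMARY KEY, name text, salary numeric);
CREATE ROLE clerk;

-- clerk may read id and name and may change name, but gets no access to salary.
GRANT SELECT (id, name), UPDATE (name) ON staff TO clerk;

SET ROLE clerk;
SELECT id, name FROM staff;       -- allowed
-- SELECT salary FROM staff;      -- would fail with "permission denied"
RESET ROLE;
```

Before 8.4 the same restriction needed a view exposing only `id` and `name`, with privileges granted on the view instead of the table.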
  129. 129. Locales <ul><li>Locale is now per-database instead of per-cluster
  130. 130. Can be set at time of createdb instead of initdb
  131. 131. Ultimate goal is to be able to set locale and charset / encoding for each column </li><ul><li>Can involve performance issues </li></ul></ul>
  132. 132. Performance improvements <ul><li>suppress_redundant_updates_trigger </li><ul><li>Update normally writes a new record whether or not there is a change
  133. 133. This trigger inhibits the update if the new and old records are identical
  134. 134. Very fast (uses memcmp() on whole data area)
  135. 135. Break even point is at worst around 30%, i.e. you're better off using it if more than 30% of the updates are in fact redundant.
  136. 136. Also saves vacuuming etc. </li></ul></ul>
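Attaching the trigger is a single statement per table. The sketch below uses a hypothetical `settings` table; the trigger name `z_min_update` is only a convention, chosen so that it sorts last and therefore fires after any other BEFORE UPDATE triggers on the table.

```sql
-- Hypothetical table, for illustration only.
CREATE TEMP TABLE settings (key text PRIMARY KEY, value text);
INSERT INTO settings VALUES ('theme', 'dark');

-- Triggers of the same kind fire in alphabetical order by name,
-- so a name starting with "z" runs this check last.
CREATE TRIGGER z_min_update
    BEFORE UPDATE ON settings
    FOR EACH ROW EXECUTE PROCEDURE suppress_redundant_updates_trigger();

UPDATE settings SET value = 'dark'  WHERE key = 'theme';  -- identical row: skipped
UPDATE settings SET value = 'light' WHERE key = 'theme';  -- real change: applied
```

The first UPDATE reports zero rows affected, because the trigger cancels it before a new row version is written.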
  137. 137. Performance improvements continued <ul><li>Relation “forks” </li><ul><li>Information about a relation stored out of line </li><ul><li>“visibility map” fork </li><ul><li>Reduces vacuum cost on tables that don't change a lot </li></ul><li>“free space map” fork </li><ul><li>Now stored per table and not in shared memory
  138. 138. No more config settings required </li></ul><li>Infrastructure exists for other such forks if required. </li></ul></ul><li>Hash indexes now much faster </li><ul><li>Still not crash safe </li></ul></ul>
  139. 139. SQL Features - Arrays <ul><li>array_agg() </li><ul><li>Turn rows into array entries </li></ul><li>unnest() </li><ul><li>Turn array entries into rows
  140. 140. Opposite of array_agg() </li></ul></ul>
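A quick way to try both functions without any tables is `generate_series()`. This is a minimal sketch; the column aliases are made up.

```sql
-- Rows into a single array value.
SELECT array_agg(i * i) AS squares
FROM generate_series(1, 4) AS g(i);

-- An array back into one row per element.
SELECT unnest(ARRAY[1, 4, 9, 16]) AS n;

-- Round trip: unnest() undoes array_agg(), so four rows come back.
SELECT count(*)
FROM (
    SELECT unnest(array_agg(i)) AS i
    FROM generate_series(1, 4) AS g(i)
) AS s;
```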
  141. 141. SQL Features - Views <ul><li>CREATE OR REPLACE VIEW can now add columns at the end </li></ul>
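As a sketch of what this allows, with hypothetical table and view names:

```sql
-- Hypothetical table and view, for illustration only.
CREATE TABLE items (id int, name text, price numeric);
CREATE VIEW item_list AS SELECT id, name FROM items;

-- Accepted in 8.4: the existing columns keep their names, types and order,
-- and the new column is appended after them.
CREATE OR REPLACE VIEW item_list AS SELECT id, name, price FROM items;
```

Reordering, renaming or dropping the existing columns this way is still rejected; for that the view has to be dropped and recreated.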
  142. 142. SQL Features – limit/offset <ul><li>No longer require a static expression, can now take a volatile expression or subquery instead
  143. 143. SQL Standard syntax now supported: </li><ul><li>OFFSET start [ ROW | ROWS ]
  144. 144. FETCH { FIRST | NEXT } [ count ] { ROW | ROWS } ONLY </li></ul></ul>
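Both changes can be tried against `generate_series()` without creating any tables. A minimal sketch:

```sql
-- SQL standard syntax: skip 5 rows, then return the next 3 (6, 7 and 8).
SELECT i
FROM generate_series(1, 20) AS g(i)
ORDER BY i
OFFSET 5 ROWS
FETCH FIRST 3 ROWS ONLY;

-- LIMIT computed by a subquery instead of a constant: 20 / 10 = 2 rows.
SELECT i
FROM generate_series(1, 20) AS g(i)
ORDER BY i
LIMIT (SELECT count(*) / 10 FROM generate_series(1, 20));
```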
  145. 145. Function features <ul><li>Variadic functions </li><ul><li>CREATE FUNCTION mleast(VARIADIC numeric[]) RETURNS numeric AS $$
  146. 146. SELECT min($1[i]) FROM generate_subscripts($1, 1) g(i);
  147. 147. $$ LANGUAGE SQL; select mleast(1,4,9,-6,2,5);
  148. 148. Result: -6 </li></ul></ul>
  149. 149. More Function features <ul><li>Default argument values </li><ul><li>CREATE FUNCTION foo(a int, b int DEFAULT 2, c int DEFAULT 3) RETURNS int LANGUAGE SQL AS $$ SELECT $1 + $2 + $3; $$;
       select foo(10,20,30); -- 60
       select foo(10,20); -- 33
       select foo(10); -- 15
  150. 150. Previously required overloading to get same effect. </li></ul></ul>
  151. 151. PL/PGSQL features <ul><li>Pass parameters to dynamic query </li><ul><li>EXECUTE ... USING ... </li></ul><li>CASE statement </li></ul>
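Both PL/pgSQL additions fit in one small function. The sketch below uses a hypothetical `events` table and function name. On 8.4 itself the language may first need to be installed in the database with `CREATE LANGUAGE plpgsql`.

```sql
-- Hypothetical table and function, for illustration only.
CREATE TEMP TABLE events (id int, kind text);
INSERT INTO events VALUES (1, 'error'), (2, 'info'), (3, 'error');

CREATE FUNCTION count_kind(tab regclass, wanted text) RETURNS text
LANGUAGE plpgsql AS $$
DECLARE
    n bigint;
BEGIN
    -- The value is passed as parameter $1 via USING, so it is never
    -- spliced into the query string and needs no quoting.
    EXECUTE 'SELECT count(*) FROM ' || tab::text || ' WHERE kind = $1'
        INTO n
        USING wanted;

    -- CASE as a statement (not just an expression) is new in 8.4.
    CASE
        WHEN n = 0 THEN RETURN 'none';
        WHEN n = 1 THEN RETURN 'one';
        ELSE RETURN 'many';
    END CASE;
END;
$$;

SELECT count_kind('events', 'error');   -- many
```

Only the table name is concatenated into the query text. It arrives as a `regclass`, so it has already been checked against the catalog.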
  152. 152. psql features <ul><li>version-aware queries </li><ul><li>Now makes the right queries to suit the server it's connected to </li></ul><li>\ef function_name </li></ul>
  153. 153. Questions?
