Your SlideShare is downloading. ×
0
You can do THAT without Perl?
You can do THAT without Perl?
You can do THAT without Perl?
You can do THAT without Perl?
You can do THAT without Perl?
You can do THAT without Perl?
You can do THAT without Perl?
You can do THAT without Perl?
You can do THAT without Perl?
You can do THAT without Perl?
You can do THAT without Perl?
You can do THAT without Perl?
You can do THAT without Perl?
You can do THAT without Perl?
You can do THAT without Perl?
You can do THAT without Perl?
You can do THAT without Perl?
You can do THAT without Perl?
You can do THAT without Perl?
You can do THAT without Perl?
You can do THAT without Perl?
You can do THAT without Perl?
You can do THAT without Perl?
You can do THAT without Perl?
You can do THAT without Perl?
You can do THAT without Perl?
You can do THAT without Perl?
You can do THAT without Perl?
You can do THAT without Perl?
You can do THAT without Perl?
You can do THAT without Perl?
You can do THAT without Perl?
You can do THAT without Perl?
You can do THAT without Perl?
You can do THAT without Perl?
You can do THAT without Perl?
You can do THAT without Perl?
You can do THAT without Perl?
You can do THAT without Perl?
You can do THAT without Perl?
You can do THAT without Perl?
You can do THAT without Perl?
You can do THAT without Perl?
You can do THAT without Perl?
You can do THAT without Perl?
You can do THAT without Perl?
You can do THAT without Perl?
You can do THAT without Perl?
You can do THAT without Perl?
You can do THAT without Perl?
You can do THAT without Perl?
You can do THAT without Perl?
You can do THAT without Perl?
You can do THAT without Perl?
You can do THAT without Perl?
You can do THAT without Perl?
You can do THAT without Perl?
You can do THAT without Perl?
You can do THAT without Perl?
You can do THAT without Perl?
Upcoming SlideShare
Loading in...5
×

Thanks for flagging this SlideShare!

Oops! An error has occurred.

×
Saving this for later? Get the SlideShare app to save on your phone or tablet. Read anywhere, anytime – even offline.
Text the download link to your phone
Standard text messaging rates apply

You can do THAT without Perl?

445

Published on

Published in: Technology
0 Comments
2 Likes
Statistics
Notes
  • Be the first to comment

No Downloads
Views
Total Views
445
On Slideshare
0
From Embeds
0
Number of Embeds
1
Actions
Shares
0
Downloads
5
Comments
0
Likes
2
Embeds 0
No embeds

Report content
Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

Cancel
No notes for slide

Transcript

  • 1. You Can Do That Without Perl? Lists and Recursion and Trees, Oh My! YAPC10, Pittsburgh 2009Copyright © 2009David Fetter david.fetter@pgexperts.comAll Rights Reserved
  • 2. Subject: The YAPC::NA 2009 Survey
  • 3. Lists:Windowing Functions
  • 4. What are Windowing Functions? PGCon2009, 21-22 May 2009 Ottawa
  • 5. What are the Windowing Functions?• Similar to classical aggregates but does more!• Provides access to set of rows from the current row• Introduced SQL:2003 and more detail in SQL:2008 – SQL:2008 (http://wiscorp.com/sql200n.zip) 02: SQL/Foundation • 4.14.9 Window tables • 4.15.3 Window functions • 6.10 <window function> • 7.8 <window clause>• Supported by Oracle, SQL Server, Sybase and DB2 – No open source RDBMS so far except PostgreSQL (Firebird trying)• Used in OLAP mainly but also useful in OLTP – Analysis and reporting by rankings, cumulative aggregates PGCon2009, 21-22 May 2009 Ottawa
  • 6. How they work and what you get PGCon2009, 21-22 May 2009 Ottawa
  • 7. How they work and what do you get• Windowed table – Operates on a windowed table – Returns a value for each row – Returned value is calculated from the rows in the window PGCon2009, 21-22 May 2009 Ottawa
  • 8. How they work and what do you get • You can use… – New window functions – Existing aggregate functions – User-defined window functions – User-defined aggregate functions PGCon2009, 21-22 May 2009 Ottawa
  • 9. How they work and what do you get• Completely new concept! – With Windowing Functions, you can reach outside the current row PGCon2009, 21-22 May 2009 Ottawa
  • 10. How they work and what you get[Aggregates] SELECT key, SUM(val) FROM tbl GROUP BY key; PGCon2009, 21-22 May 2009 Ottawa
  • 11. How they work and what you get[Windowing Functions] SELECT key, SUM(val) OVER (PARTITION BY key) FROM tbl; PGCon2009, 21-22 May 2009 Ottawa
  • 12. How they work and what you getWho is the highest paid relatively compared with the department average? depname empno salary develop 10 5200 sales 1 5000 personnel 5 3500 sales 4 4800 sales 6 550 personnel 2 3900 develop 7 4200 develop 9 4500 sales 3 4800 develop 8 6000 develop 11 5200 PGCon2009, 21-22 May 2009 Ottawa
  • 13. How they work and what you getSELECT depname, empno, salary, avg(salary) OVER (PARTITION BY depname)::int, salary - avg(salary) OVER (PARTITION BY depname)::int AS diffFROM empsalary ORDER BY diff DESC depname empno salary avg diff develop 8 6000 5020 980 sales 6 5500 5025 475 personnel 2 3900 3700 200 develop 11 5200 5020 180 develop 10 5200 5020 180 sales 1 5000 5025 -25 personnel 5 3500 3700 -200 sales 4 4800 5025 -225 sales 3 4800 5025 -225 develop 9 4500 5020 -520 develop 7 4200 5020 -820 PGCon2009, 21-22 May 2009 Ottawa
  • 14. Anatomy of a Window• Represents set of rows abstractly as:• A partition – Specified by PARTITION BY clause – Never moves – Can contain:• A frame – Specified by ORDER BY clause and frame clause – Defined in a partition – Moves within a partition – Never goes across two partitions• Some functions take values from a partition. Others take them from a frame. PGCon2009, 21-22 May 2009 Ottawa
  • 15. A partition• Specified by PARTITION BY clause in OVER()• Allows to subdivide the table, much like GROUP BY clause• Without a PARTITION BY clause, the whole table is in a single partitionfunc(args) OVER ( ) partition-clause order- clause frame- clause PARTITION BY expr, … PGCon2009, 21-22 May 2009 Ottawa
  • 16. A frame• Specified by ORDER BY clause and frame clause in OVER()• Allows to tell how far the set is applied• Also defines ordering of the set• Without order and frame clauses, the whole of partition is a single framefunc(args) OVER ( ) partition-clause order- clause frame- clauseORDER BY expr [ASC|DESC] (ROWS|RANGE) BETWEEN UNBOUNDED[NULLS FIRST|LAST], … (PRECEDING|FOLLOWING) AND CURRENT ROW PGCon2009, 21-22 May 2009 Ottawa
  • 17. The WINDOW clause• You can specify common window definitions in one place and name it, for convenience• You can use the window at the same query levelSELECT target_list, … wfunc() OVER wFROM table_list, …WHERE qual_list, ….GROUP BY groupkey_list, ….HAVING groupqual_list, …WINDOW w A ( ) S partition-clause order- clause frame- clause PGCon2009, 21-22 May 2009 Ottawa
  • 18. Built-in Windowing Functions PGCon2009, 21-22 May 2009 Ottawa
  • 19. List of built-in Windowing Functions• row_number() • lag()• rank() • lead()• dense_rank() • first_value()• percent_rank() • last_value()• cume_dist() • nth_value()• ntile() PGCon2009, 21-22 May 2009 Ottawa
  • 20. row_number() • Returns number of the current row SELECT val, row_number() OVER (ORDER BY val DESC) FROM tbl; val row_number() 5 1 5 2 3 3 1 4Note: row_number() always incremented values independent of frame PGCon2009, 21-22 May 2009 Ottawa
  • 21. rank()• Returns rank of the current row with gap SELECT val, rank() OVER (ORDER BY val DESC) FROM tbl; val rank() 5 1 gap 5 1 3 3 1 4 Note: rank() OVER(*empty*) returns 1 for all rows, since all rows are peers to each other PGCon2009, 21-22 May 2009 Ottawa
  • 22. dense_rank()• Returns rank of the current row without gap SELECT val, dense_rank() OVER (ORDER BY val DESC) FROM tbl; val dense_rank() 5 1 no gap 5 1 3 2 1 3Note: dense_rank() OVER(*empty*) returns 1 for all rows, since all rowsare peers to each other PGCon2009, 21-22 May 2009 Ottawa
  • 23. percent_rank()• Returns relative rank; (rank() – 1) / (total row – 1) SELECT val, percent_rank() OVER (ORDER BY val DESC) FROM tbl; val percent_rank() 5 0 5 0 3 0.666666666666667 1 1 values are always between 0 and 1 inclusive.Note: percent_rank() OVER(*empty*) returns 0 for all rows, since all rowsare peers to each other PGCon2009, 21-22 May 2009 Ottawa
  • 24. cume_dist()• Returns relative rank; (# of preced. or peers) / (total row) SELECT val, cume_dist() OVER (ORDER BY val DESC) FROM tbl; val cume_dist() 5 0.5 =2/4 5 0.5 =2/4 3 0.75 =3/4 1 1 =4/4 The result can be emulated by “count(*) OVER (ORDER BY val DESC) / count(*) OVER ()” Note: cume_dist() OVER(*empty*) returns 1 for all rows, since all rows are peers to each other PGCon2009, 21-22 May 2009 Ottawa
  • 25. ntile()• Returns dividing bucket number SELECT val, ntile(3) OVER (ORDER BY val DESC) FROM tbl; val ntile(3) 5 1 4%3=1 5 1 3 2 1 3 The results are the divided positions, but if there’s remainder add row from the head Note: ntile() OVER (*empty*) returns same values as above, since ntile() doesn’t care the frame but works against the partition PGCon2009, 21-22 May 2009 Ottawa
  • 26. lag() • Returns value of row aboveSELECT val, lag(val) OVER (ORDER BY val DESC) FROM tbl;val lag(val)5 NULL5 53 51 3 Note: lag() only acts on a partition. PGCon2009, 21-22 May 2009 Ottawa
  • 27. lead()• Returns value of the row belowSELECT val, lead(val) OVER (ORDER BY val DESC) FROM tbl;val lead(val)5 55 33 11 NULL Note: lead() acts against a partition. PGCon2009, 21-22 May 2009 Ottawa
  • 28. first_value()• Returns the first value of the frame SELECT val, first_value(val) OVER (ORDER BY val DESC) FROM tbl; val first_value(val) 5 5 5 5 3 5 1 5 PGCon2009, 21-22 May 2009 Ottawa
  • 29. last_value()• Returns the last value of the frame SELECT val, last_value(val) OVER (ORDER BY val DESC ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) FROM tbl; val last_value(val) 5 1 5 1 3 1 1 1 Note: frame clause is necessary since you have a frame between the first row and the current row by only the order clause PGCon2009, 21-22 May 2009 Ottawa
  • 30. nth_value()• Returns the n-th value of the frame SELECT val, nth_value(val, val) OVER (ORDER BY val DESC ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) FROM tbl; val nth_value(val, val) 5 NULL 5 NULL 3 3 1 5 Note: frame clause is necessary since you have a frame between the first row and the current row by only the order clause PGCon2009, 21-22 May 2009 Ottawa
  • 31. aggregates(all peers)• Returns the same values along the frame SELECT val, sum(val) OVER () FROM tbl; val sum(val) 5 14 5 14 3 14 1 14 Note: all rows are the peers to each other PGCon2009, 21-22 May 2009 Ottawa
  • 32. cumulative aggregates• Returns different values along the frame SELECT val, sum(val) OVER (ORDER BY val DESC) FROM tbl; val sum(val) 5 10 5 10 3 13 1 14 Note: row#1 and row#2 return the same value since they are the peers. the result of row#3 is sum(val of row#1…#3) PGCon2009, 21-22 May 2009 Ottawa
  • 33. Recursion and Trees, Oh My:Common Table Expressions
  • 34. Thanks to:• The ISO SQL Standards Committee• Yoshiyuki Asaba• Ishii Tatsuo• Jeff Davis• Gregory Stark• Tom Lane• etc., etc., etc.
  • 35. Recursion
  • 36. Recursion in General•Initial Condition•Recursion step•Terminationcondition
  • 37. E1 List TableCREATE TABLE employee( id INTEGER NOT NULL, boss_id INTEGER, UNIQUE(id, boss_id)/*, etc., etc. */);INSERT INTO employee(id, boss_id)VALUES(1,NULL), /* El capo di tutti capi */(2,1),(3,1),(4,1),(5,2),(6,2),(7,2),(8,3),(9,3),(10,4),(11,5),(12,5),(13,6),(14,7),(15,8),(1,9);
  • 38. Tree Query InitiationWITH RECURSIVE t(node, path) AS ( SELECT id, ARRAY[id] FROM employee WHERE boss_id IS NULL /* Initiation Step */UNION ALL SELECT e1.id, t.path || ARRAY[e1.id] FROM employee e1 JOIN t ON (e1.boss_id = t.node) WHERE id NOT IN (t.path))SELECT CASE WHEN array_upper(path,1)>1 THEN +- ELSE END || REPEAT(--, array_upper(path,1)-2) || node AS "Branch"FROM tORDER BY path;
  • 39. Tree Query RecursionWITH RECURSIVE t(node, path) AS ( SELECT id, ARRAY[id] FROM employee WHERE boss_id IS NULLUNION ALL SELECT e1.id, t.path || ARRAY[e1.id] /* Recursion */ FROM employee e1 JOIN t ON (e1.boss_id = t.node) WHERE id NOT IN (t.path))SELECT CASE WHEN array_upper(path,1)>1 THEN +- ELSE END || REPEAT(--, array_upper(path,1)-2) || node AS "Branch"FROM tORDER BY path;
  • 40. Tree Query TerminationWITH RECURSIVE t(node, path) AS ( SELECT id, ARRAY[id] FROM employee WHERE boss_id IS NULLUNION ALL SELECT e1.id, t.path || ARRAY[e1.id] FROM employee e1 JOIN t ON (e1.boss_id = t.node) WHERE id NOT IN (t.path) /* Termination Condition */)SELECT CASE WHEN array_upper(path,1)>1 THEN +- ELSE END || REPEAT(--, array_upper(path,1)-2) || node AS "Branch"FROM tORDER BY path;
  • 41. Tree Query DisplayWITH RECURSIVE t(node, path) AS ( SELECT id, ARRAY[id] FROM employee WHERE boss_id IS NULLUNION ALL SELECT e1.id, t.path || ARRAY[e1.id] FROM employee e1 JOIN t ON (e1.boss_id = t.node) WHERE id NOT IN (t.path))SELECT CASE WHEN array_upper(path,1)>1 THEN +- ELSE END || REPEAT(--, array_upper(path,1)-2) || node AS "Branch" /* Display */FROM tORDER BY path;
  • 42. Tree Query Initiation Branch ---------- 1 +-2 +---5 +-----11 +-----12 +---6 +-----13 +---7 +-----14 +-3 +---8 +-----15 +---9 +-4 +---10 +-9 (16 rows)
  • 43. Travelling Salesman ProblemGiven a number of cities and the costs of travellingfrom any city to any other city, what is the least-cost round-trip route that visits each city exactlyonce and then returns to the starting city?
  • 44. TSP SchemaCREATE TABLE pairs ( from_city TEXT NOT NULL, to_city TEXT NOT NULL, distance INTEGER NOT NULL, PRIMARY KEY(from_city, to_city), CHECK (from_city < to_city));
  • 45. TSP DataINSERT INTO pairsVALUES (Bari,Bologna,672), (Bari,Bolzano,939), (Bari,Firenze,723), (Bari,Genova,944), (Bari,Milan,881), (Bari,Napoli,257), (Bari,Palermo,708), (Bari,Reggio Calabria,464), ....
  • 46. TSP Program: Symmetric Setup WITH RECURSIVE both_ways( from_city, to_city, distance ) /* Working Table */ AS ( SELECT from_city, to_city, distance FROM pairs UNION ALL SELECT to_city AS "from_city", from_city AS "to_city", distance FROM pairs ),
  • 47. TSP Program: Symmetric Setup WITH RECURSIVE both_ways( from_city, to_city, distance ) AS (/* Distances One Way */ SELECT from_city, to_city, distance FROM pairs UNION ALL SELECT to_city AS "from_city", from_city AS "to_city", distance FROM pairs ),
  • 48. TSP Program: Symmetric Setup WITH RECURSIVE both_ways( from_city, to_city, distance ) AS ( SELECT from_city, to_city, distance FROM pairs UNION ALL /* Distances Other Way */ SELECT to_city AS "from_city", from_city AS "to_city", distance FROM pairs ),
  • 49. TSP Program: Path Initialization Steppaths ( from_city, to_city, distance, path)AS ( SELECT from_city, to_city, distance, ARRAY[from_city] AS "path" FROM both_ways b1 WHERE b1.from_city = RomaUNION ALL
  • 50. TSP Program: Path Recursion Step SELECT b2.from_city, b2.to_city, p.distance + b2.distance, p.path || b2.from_city FROM both_ways b2 JOIN paths p ON ( p.to_city = b2.from_city AND b2.from_city <> ALL (p.path[ 2:array_upper(p.path,1) ]) /* Prevent re-tracing */ AND array_upper(p.path,1) < 6 ))
  • 51. TSP Program: Timely Termination Step SELECT b2.from_city, b2.to_city, p.distance + b2.distance, p.path || b2.from_city FROM both_ways b2 JOIN paths p ON ( p.to_city = b2.from_city AND b2.from_city <> ALL (p.path[ 2:array_upper(p.path,1) ]) /* Prevent re-tracing */ AND array_upper(p.path,1) < 6 /* Timely Termination */ ))
  • 52. TSP Program: Filter and DisplaySELECT path || to_city AS "path", distanceFROM pathsWHERE to_city = RomaAND ARRAY[Milan,Firenze,Napoli] <@ pathORDER BY distance, pathLIMIT 1;
  • 53. TSP Program: Filter and Displaydavidfetter@tsp=# i travelling_salesman.sql path | distance----------------------------------+---------- {Roma,Firenze,Milan,Napoli,Roma} | 1553(1 row)Time: 11679.503 ms
  • 54. TSP Program: Gritty Details QUERY PLAN------------------------------------------------------------------------------------------------------------------------------------------ Limit (cost=44.14..44.15 rows=1 width=68) (actual time=16131.326..16131.327 rows=1 loops=1) CTE both_ways -> Append (cost=0.00..4.64 rows=132 width=19) (actual time=0.011..0.090 rows=132 loops=1) -> Seq Scan on pairs (cost=0.00..1.66 rows=66 width=19) (actual time=0.010..0.026 rows=66 loops=1) -> Seq Scan on pairs (cost=0.00..1.66 rows=66 width=19) (actual time=0.004..0.029 rows=66 loops=1) CTE paths -> Recursive Union (cost=0.00..38.72 rows=31 width=104) (actual time=0.102..10631.385 rows=1092883 loops=1) -> CTE Scan on both_ways b1 (cost=0.00..2.97 rows=1 width=68) (actual time=0.100..0.251 rows=11 loops=1) Filter: (from_city = Roma::text) -> Hash Join (cost=0.29..3.51 rows=3 width=104) (actual time=180.125..958.459 rows=182145 loops=6) Hash Cond: (b2.from_city = p.to_city) Join Filter: (b2.from_city <> ALL (p.path[2:array_upper(p.path, 1)])) -> CTE Scan on both_ways b2 (cost=0.00..2.64 rows=132 width=68) (actual time=0.001..0.056 rows=132 loops=5) -> Hash (cost=0.25..0.25 rows=3 width=68) (actual time=172.292..172.292 rows=22427 loops=6) -> WorkTable Scan on paths p (cost=0.00..0.25 rows=3 width=68) (actual time=123.167..146.659 rows=22427 loops=6) Filter: (array_upper(path, 1) < 6) -> Sort (cost=0.79..0.79 rows=1 width=68) (actual time=16131.324..16131.324 rows=1 loops=1) Sort Key: paths.distance, ((paths.path || paths.to_city)) Sort Method: top-N heapsort Memory: 17kB -> CTE Scan on paths (cost=0.00..0.78 rows=1 width=68) (actual time=36.621..16127.841 rows=4146 loops=1) Filter: (({Milan,Firenze,Napoli}::text[] <@ path) AND (to_city = Roma::text)) Total runtime: 16180.335 ms(22 rows)
  • 55. Hausdorf-Besicovich- WTF DimensionWITH RECURSIVEZ(Ix, Iy, Cx, Cy, X, Y, I)AS ( SELECT Ix, Iy, X::float, Y::float, X::float, Y::float, 0 FROM (SELECT -2.2 + 0.031 * i, i FROM generate_series(0,101) AS i) AS xgen(x,ix) CROSS JOIN (SELECT -1.5 + 0.031 * i, i FROM generate_series(0,101) AS i) AS ygen(y,iy) UNION ALL SELECT Ix, Iy, Cx, Cy, X * X - Y * Y + Cx AS X, Y * X * 2 + Cy, I + 1 FROM Z WHERE X * X + Y * Y < 16::float AND I < 100),
  • 56. FilterZt (Ix, Iy, I) AS ( SELECT Ix, Iy, MAX(I) AS I FROM Z GROUP BY Iy, Ix ORDER BY Iy, Ix)
  • 57. DisplaySELECT array_to_string( array_agg( SUBSTRING( .,,,-----++++%%%%@@@@#### , GREATEST(I,1) ),)FROM ZtGROUP BY IyORDER BY Iy;
  • 58. Questions?Comments?Straitjackets?
  • 59. Thank You!Copyright © 2009David Fetter david.fetter@pgexperts.comAll Rights Reserved

×