Extreme querying with analytics


Presentation given to the Sydney Oracle meetup on June 30th 2010.
Covering Oracle analytics and advanced aggregate functions

Published in: Technology

  1. 2. blah blah NOT LIABLE blah blah blah, I NEVER SAID THAT blah blah READ THE DOCUMENTATION blah blah blah NO PROMISES blah I GET PAID BY THE WORD blah blah
        Read my blog at HTTP://BLOG.SYDORACLE.COM
  2. 6.
        • Aggregate functions are the basis of many Analytics
        • All the standard aggregates (MIN, MAX, COUNT, SUM, etc.) can be used with analytic clauses.
  3. 7.
        • Min / Max (with added KEEP)
        • KEEP means keep the column value for the highest-ranked record.
  4. 8. Which of their cities has the most potential slaves?
  5. 9. SYDNEY and X both have a population of 2 million
  6. 10. MIN or MAX only makes a difference if there are multiple entries with the same ORDER BY rank
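
A minimal sketch of the KEEP idea from the slides above, assuming a `cities` table with `name`, `state` and `pop` columns (only `state` and `pop` appear in the talk; `name` is an assumption). For each state it returns the largest population and the city that holds it. Where two cities tie on population, MIN returns the alphabetically first name and MAX would return the last.

```sql
-- Assumed table: cities(name, state, pop). The name column is hypothetical.
SELECT state,
       MAX(pop) AS biggest_pop,
       -- DENSE_RANK LAST ORDER BY pop picks the row(s) with the highest pop;
       -- MIN then chooses among any ties.
       MIN(name) KEEP (DENSE_RANK LAST ORDER BY pop) AS biggest_city
FROM   cities
GROUP  BY state;
```
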
  7. 11.
        • Min / Max (with added KEEP)
        • Collect
            • Create a collection of all the individual values
            • A list of large cities …
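
COLLECT returns a nested table, so it is usually cast to a named collection type. A sketch, assuming a hypothetical type `city_name_list` and a `cities` table with `name`, `state` and `pop` columns (`name` is an assumption). The one-million cutoff for a "large" city is an illustration, not a figure from the talk.

```sql
-- Hypothetical collection type to hold the collected names.
CREATE TYPE city_name_list AS TABLE OF VARCHAR2(100);
/

-- One row per state, carrying a collection of its large cities.
SELECT state,
       CAST(COLLECT(name) AS city_name_list) AS large_cities
FROM   cities
WHERE  pop > 1000000
GROUP  BY state;
```
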
  8. 13.
        • Min / Max (with added KEEP)
        • Collect
        • XMLAgg (in four steps)
            • Collect the column(s) into an XML document
  9. 18.
        • Min / Max (with added KEEP)
        • Collect
        • XMLAGG
        • ListAgg
            • 11g Release 2 function to create a single VARCHAR2 value from a collection of individual VARCHAR2s
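
A short LISTAGG sketch, assuming the `cities` table has a `name` column alongside `state` (the `name` column is not shown in the talk). It produces one comma-separated string of city names per state.

```sql
-- Assumed table: cities(name, state, pop). The name column is hypothetical.
SELECT state,
       LISTAGG(name, ', ') WITHIN GROUP (ORDER BY name) AS city_list
FROM   cities
GROUP  BY state;
```

The result is a single VARCHAR2, so a very long list can exceed the VARCHAR2 size limit and raise ORA-01489.
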
  10. 20. Wrap the aggregate around a CASE expression to give more aggregation possibilities.
        SELECT
          SUM(case when state='VIC' then pop end) vic_pop,
          SUM(case when state='NSW' then pop end) nsw_pop
        FROM cities;
  11. 21. (at last)
  12. 22.
        • Dense Rank / Rank / Row Number
  13. 23. Smithers, Bring me a list of our highest paid employees … and the poisoned donuts.
  14. 24.
        select name, wage, sector,
          row_number () over
            (partition by sector order by wage desc) rn,
          rank () over
            (partition by sector order by wage desc) rnk,
          dense_rank () over
            (partition by sector order by wage desc) drnk
        from emp
        order by sector, wage desc;
  15. 27. Using ROW_NUMBER with other analytics can confuse…
        select name, wage, cum_wage from
          (select name, wage,
             sum(wage) over (order by wage desc) cum_wage,
             row_number() over (order by wage desc) rn
           from emp
           where sector = '7G')
        where rn < 3

        NAME   WAGE  CUM_WAGE
        Homer   200       200
        Lenny   100       400
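
The CUM_WAGE of 400 for Lenny is most likely explained by the default window. With an ORDER BY and no windowing clause, SUM uses RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, so every row that ties on wage is added in at once, while ROW_NUMBER breaks the tie arbitrarily. A sketch of one way to keep the two in step, assuming `name` is unique within the sector so that it can serve as a tie-breaker:

```sql
SELECT name, wage, cum_wage
FROM  (SELECT name, wage,
              -- ROWS counts physical rows, so tied wages are added one at a time.
              SUM(wage) OVER (ORDER BY wage DESC, name
                              ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS cum_wage,
              -- The same ORDER BY (with tie-breaker) keeps rn aligned with cum_wage.
              ROW_NUMBER() OVER (ORDER BY wage DESC, name) AS rn
       FROM   emp
       WHERE  sector = '7G')
WHERE  rn < 3;
```
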
  16. 29.
        • Dense Rank / Rank / Row Number
        • NTILE
            • The "Snobs" and "Yobs" function
            • Ignore the outliers and extremes
            • Or ignore the 'huddled masses'
  17. 31. Exclude the most common 90%. Focus on the most common 10%.
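
A sketch of the "snobs" case. NTILE(10) deals the rows into ten buckets of near-equal size, so bucket 1 holds the top tenth by wage. It assumes the `emp` table with `name` and `wage` columns from the ranking slides; the choice of ten buckets is illustrative.

```sql
SELECT name, wage
FROM  (SELECT name, wage,
              -- Bucket 1 = highest wages, bucket 10 = lowest.
              NTILE(10) OVER (ORDER BY wage DESC) AS decile
       FROM   emp)
WHERE  decile = 1;   -- use "decile > 1" instead to drop the top tenth
```
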
  18. 32.
        • Dense Rank / Rank / Row Number
        • NTILE
        • Lag / Lead
            • Look around for the previous or next row
  19. 33.
        MONTH      AMOUNT  PREV_AMT    PERC
        January       340
        February      340       340     .00
        March         150       340  -55.88
        April         130       150  -13.33
        May           170       130   30.77
        June          210       170   23.53
        July          350       210   66.67
        August        270       350  -22.86
        September     380       270   40.74
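
The output above is consistent with a LAG query along the following lines. This is a reconstruction, not the query from the slide; it assumes the `sales` table with `period` and `amount` columns used in the FIRST_VALUE example, and that PERC is the percentage change from the previous month.

```sql
SELECT TO_CHAR(period, 'Month') AS mon,
       amount,
       -- Previous month's amount; NULL for the first row.
       LAG(amount) OVER (ORDER BY period) AS prev_amt,
       -- Percentage change, e.g. (150 - 340) / 340 * 100 = -55.88 for March.
       ROUND((amount - LAG(amount) OVER (ORDER BY period))
             / LAG(amount) OVER (ORDER BY period) * 100, 2) AS perc
FROM   sales
ORDER  BY period;
```

A previous amount of zero would raise a divide-by-zero error, so real data may need a NULLIF around the divisor.
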
  20. 34.
        MON            AMOUNT   PREV_AMT
        ---------- ---------- ----------
        January           340
        February          340        340
        March             150        340
        April             130        150
        May               170        130
        June                         170
        July              350        170
        August            270        350
        September         380        270
  21. 35.
        • Dense Rank / Rank / Row Number
        • Percent Rank
        • Lag / Lead
        • First / Last
            • Look further ahead or behind
  22. 36.
        select to_char(period,'Month') mon,
          amount,
          first_value (amount) over
            (partition by trunc(period,'Q')
             order by period) prev_amt
        from sales
        order by period
  23. 37.
        MON            AMOUNT   PREV_AMT
        ---------- ---------- ----------
        January           340        340
        February          340        340
        March             150        340
        April             130        130
        May               170        130
        June              210        130
        July              350        350
        August            270        350
        September         380        350
  24. 38.
        • Rarely needed in practice
        • Partition By and Order By normally enough
  25. 39. If you omit the PARTITION clause, especially with inline views, the results can be BAD
  26. 42. In the inline view, the SUM analytic applies to ALL the Orders in the table.
  27. 44. (if we have time)
  28. 45.
        • Rollup
        • Grouping sets
        • Cube
  29. 48.
        • Rollup
        • Cube
            • CUBE allows combinations of columns to be totaled
  30. 50.
        • Rollup
        • Cube
        • Grouping sets
            • Perform grouping across multiple columns
            • Without the lower level totals of CUBE
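
A side-by-side sketch of the three GROUP BY extensions. The `region_sales` table and its `region`, `product` and `amount` columns are hypothetical, chosen only to give two grouping columns to work with.

```sql
-- Hypothetical table: region_sales(region, product, amount).

-- ROLLUP: totals per (region, product), per region, plus a grand total.
SELECT region, product, SUM(amount) AS total
FROM   region_sales
GROUP  BY ROLLUP (region, product);

-- CUBE: everything ROLLUP gives, plus totals per product.
SELECT region, product, SUM(amount) AS total
FROM   region_sales
GROUP  BY CUBE (region, product);

-- GROUPING SETS: only the groupings listed, here per region and per product.
SELECT region, product, SUM(amount) AS total
FROM   region_sales
GROUP  BY GROUPING SETS ((region), (product));
```

In the subtotal rows the rolled-up column comes back as NULL; the GROUPING function can tell those rows apart from genuine NULL values.
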
  31. 52.
        • If you think you have a problem which the MODEL clause solves, then
            • Go have a coffee
            • Go have a bar of chocolate
            • Go have a beer
            • Go have a lie down
        • BUT do something else until the feeling wears off