Slideshare uses cookies to improve functionality and performance, and to provide you with relevant advertising. If you continue browsing the site, you agree to the use of cookies on this website. See our User Agreement and Privacy Policy.

Slideshare uses cookies to improve functionality and performance, and to provide you with relevant advertising. If you continue browsing the site, you agree to the use of cookies on this website. See our Privacy Policy and User Agreement for details.

Like this presentation? Why not share!

- SQL for pattern matching (Oracle 12c) by Logan Palanisamy 918 views
- Oracle database - Analytic function... by Kim Berg Hansen 876 views
- Oracle sql analytic functions by mamamowebby 865 views
- Oracle Advanced SQL by Marcin Blaszczyk 851 views
- Olap Functions Suport in Informix by Bingjie Miao 306 views
- New Not Your Father's Enterprise Ma... by Kellyn Pot'Vin-Go... 634 views

3,859 views

Published on

Do you know the difference between aggregate and analytic functions?

Without complex sub-queries or self-joins, do you know how to:

Calculate running/cumulative totals and moving/centered averages?

List products with revenues above or below their peers or product groups?

Compute the ratio of one category’s sales to the total sales?

Select the Top-N or Top N % of the customers/products?

Classify advertisers into quartiles/n-tiles based on the revenue potential?

Compare period-over-period (year-over-year, month-over-month) growth and rank advancement?

Convert rows into columns (pivot), columns into rows (unpivot) or aggregate strings?

Perform what-if analysis and hypothetical ranking?

Analytic functions are more performant because tables need to be scanned only once. They make you more productive because there is no need to write procedural code. No wonder Tom Kyte, a well-respected Oracle guru, says analytic functions are the best thing to happen after the sliced bread.

In the first half, I will cover the basics of the various analytic functions:

Ranking: RANK, DENSE_RANK, ROW_NUMBER, NTILE, CUME_DIST, PERCENTILE_RANK

Windowing: SUM, AVG, MAX, MIN, FIRST_VALUE, LAST_VALUE

Reporting: RATIO_TO_REPORT

Others: FIRST/LAST, LEAD/LAG, hypothetical ranking,

In the second half, I will show how powerful these functions are with a few examples.

If there is time, I will cover enhanced aggregation (ROLLUP, CUBE, GROUPING SET extensions to GROUP BY clause)

This class would be useful for both developers and DBAs alike, especially for those working in Analytic, Business Intelligence, and Datawarehouse environments.

Are you already an expert in analytic functions? Then come and help me refine the content.

For more info, read

http://download.oracle.com/docs/cd/E11882_01/server.112/e16579/analysis.htm

http://download.oracle.com/docs/cd/E11882_01/server.112/e16579/aggreg.htm

rollup, cross-tabulation across different dimensions using ROLLUP, CUBE and GROUPING SETS extension to GROUP BY clause

, most active time-periods (i.e. days when the most number of tickets are open in BZ, hours with the most take-off and landings, months with the highest sales, 5-minute periods with the maximum number of calls made, etc)

data densification?

their rank last year, this year, rank growth, running/cumulative total (Year-To-Date/Month-To-Date summation), moving averages, Year-Over-Year comparison, sales projection, average/min/max time between one sale and the next sale, products with above and below average sales.

overall average, sum, departmental average, sum, ranking, job wise ranking in one SQL.

Published in:
Technology

No Downloads

Total views

3,859

On SlideShare

0

From Embeds

0

Number of Embeds

20

Shares

0

Downloads

94

Comments

0

Likes

1

No embeds

No notes for slide

- 1. Analytic and Window Functions in Oracle Logan Palanisamy
- 2. Agenda Difference between aggregate and analytic functions Introduction to various analytic functions Functions that are both aggregate and analytic Break More examples Enhanced Aggregation (CUBE, ROLLUP)
- 3. Meeting Basics Put your phones/pagers on vibrate/mute Messenger: Change the status to offline or in-meeting Remote attendees: Mute yourself (*6). Ask questions via Adobe Connect.
- 4. Aggregates vs. Analytics Aggregate functions Rows are collapsed. One row per group Non-Group-By columns not allowed in SELECT list. Analytic functions Rows are not collapsed As many rows in the output as in the input No restrictions on the columns in the SELECT list Evaluated after joins, WHERE, GROUP BY, HAVING clauses Nesting not allowed Can appear only in the SELECT or ORDER BY clause analytic_aggr_diff.sql
- 5. Analytics vs. other methods Show the dept, empno, sal and the sum of all salaries in their dept Three possible ways Using Joins Using Scalar Sub-queries Using Analytic Functions analytics_vs_others.sql
- 6. Anatomy of an analytic funcion function (arg1, ..., argN) OVER ([partition_by_clause] [order_by_clause [windowing_clause]]) The OVER keyword partition_by_clause: Optional. Not related to table/index partitions. Analogous to GROUP BY order_by_clause: Mandatory for Ranking and Windowing functions. Optional or meaningless for others windowing_clause: Optional. Should always be preceded by ORDER BY clause
- 7. Types of analytical functions Ranking functions FIRST_VALUE/LAST_VALUE/NTH_VALUE Windowing functions Reporting functions LAG/LEAD FIRST/LAST
- 8. Ranking Functions ROW_NUMBER() RANK() – Skips ranks after duplicate ranks DENSE_RANK() – Doesnt skip rank after duplicate ranks NTILE(n) – Sorts the rows into N equi-sized buckets CUME_DIST() – % of rows with values lower or equal PERCENT_RANK() - (rank of row -1)/(#rows – 1) function OVER ([PARTITION BY <c1,c2..>] ORDER BY <c3, ..>) PARTITION BY clause: Optional ORDER BY clause: Mandatory rank_dense_rank.sql
- 9. FIRST_VALUE/LAST_VALUE/NTH_VALUE Returns the first/last/nth value from an ordered set FIRST_VALUE(expr, [IGNORE NULLS]) OVER ([partitonby_clause] orderby_clause) IGNORE NULLS options helps you "carry forward". Often used in "Data Densification" Operates on Default Window (unbounded preceding and current row) when a window is not explicitly specified. NTH_VALUE introduced in 11gR2 flnth_value.sql
- 10. Window functions Used for computing cumulative/running totals (YTD, MTD, QTD), moving/centered averages function(args) OVER([partition_by_clause] order_by_clause [windowing_clause]) ORDER BY clause: mandatory. Windowing Clause: Optional. Defaults to: UNBOUNDED PRECEDING and CURRENT ROW anchored or sliding windows Two ways to specify windows: ROWS, RANGE [ROW | RANGE ] BETWEEN <start_exp> AND <end_exp> window.sql
- 11. ROWS type windows Physical offset. Number of rows before or after current row Non deterministic results if rows are not sorted uniquely Any number of columns in the ORDER BY clause ORDER BY columns can be of any type function(args) OVER ([partition_by_clause] order by c1, .., cN ROWS between <start_exp> and <end_exp>) windows_rows.sql
- 12. RANGE type Windows Logical offset non-unique rows treated as one logical row Only one column allowed in ORDER BY clause ORDER BY column should be numeric or date function(args) OVER ([partition_by_clause] order by c1 RANGE between <start_exp> and <end_exp>) windows_range.sql
- 13. Reporting function Computes the ratio of a value to the sum of a set of values RATIO_TO_REPORT(arg) OVER ([PARTITION BY <c1, .., cN>] PARTITION BY clause: Optional. ratio_to_report.sql
- 14. LAG/LEAD Gives the ability to access other rows without self-join. Allows you to treat cursor as an array Useful for making inter-row calculations (year-over-year comparison, time between events) LEAD (expr, <offset>, <default value>) [IGNORE NULLS] OVER ([partioning_clause] orderby_clause) Physical offset. Can be fixed or varying. default offset is 1 default value: value returned if offset points to a non- existent row IGNORE NULLS determines whether null values of are included or eliminated from the calculation. lead_lag.sql
- 15. FIRST/LAST Very different from FIRST_VALUE/LAST_VALUE Returns the results of aggregate/analytic function applied on column B on the first or last ranked rows sorted by column A function (expr_with_colB) KEEP (DENSE_RANK FIRST/LAST ORDER BY colA) [OVER (<partitioning_clause)>)] Slightly different syntax. Note the word KEEP analytic clause is optional. first_last.sql
- 16. Above/Below average calculation Find the list of employees whose salary is higher than the department average. above_average.sql
- 17. Top-N queries Find the full details of "set of" employees with the top- N salaries Find the two most recent hires in each department List the names and employee count of departments with the highest employee count top_n.sql
- 18. Top-N% queries List the top 5% of the customers by revenue top_np.sql
- 19. Multi Top-N queries For each customer, find out the maximum sale in the last 7 days the date of that sale the maximum sale in the last 30 days the date of that sale multi_top.sql
- 20. De-Duping Deleting duplicate records dedup.txt
- 21. Hypothetical Ranking CUME_DIST, DENSE_RANK, RANK, PERCENT_RANK Used for what-if analysis hypothetical_rank.sql
- 22. Inverse Percentile functions Return the value corresponding to a certain percentile (opposite of CUME_DIST) PERCENTILE_CONT (continuous) PERCENTILE_DISC (discrete) PERCENTILE_CONT(0.5) is the same as MEDIAN inverse_p.sql
- 23. String Aggregation: LISTAGG, STRAGG Concatenated string of values for a particular group (e.g. employees working in a dept) Tom Kytes STRAGG 11gR2 has LISTAGG 10g has COLLECT listagg.sql
- 24. Pivoting/Unpivoting Pivoting transposes rows to columns DECODE/CASE and GROUP BY used Unpivoting Converts columns to rows Join the base table with a one column serial number table 11gR2 introduced PIVOT and UNPIVOT clauses to SELECT pivot.sql
- 25. Data Densification Data normally stored in sparse form (e.g. No rows if there is no sales for a particular period) Missing data needed for comparison (e.g. month- over-month comparison) Data Densification comes in handy LAG (col, INGORE NULLS), and PARTITION BY OUTER JOIN are used. http://hoopercharles.wordpress.com/2009/12/07 /sql-filling-in-gaps-in-the-source-data/
- 26. When not to use analytics When a simple group by would do the job when_not_to_use_analytics.sq
- 27. Drawback of analytics Lot of sorting. Set PGA_AGGREGATE_TARGET/SORT_AREA_SIZE appropriately New versions reduce the number of sorts (same partition_by and order_by clauses on multiple analytic functions use single sort) http://asktom.oracle.com/pls/asktom/f?p=100:11:0:: NO::P11_QUESTION_ID:1137250200346660664 http://jonathanlewis.wordpress.com/2009/09/07/an alytic-agony/
- 28. Recap of Analytic Functions Analytic Functions: Were introduced in 8.1.6 (~1998) Are supported within PL/SQL only from 10g. Use "view" or "dynamic sql" older versions. Compute the aggregates while preserving the details Eliminate the need for self-joins or multiple passes on the same table Reduce the amount of data transferred between DB and client. Can be used only in SELECT and ORDER BY clauses. Use sub- queries if there is a need to filter. Are computed at the end - after join, where, group by, having clauses
- 29. Advanced Aggregation GROUP BY col1, col2 GROUP BY ROLLUP(col1, col2) GROUP BY CUBE(col1, col2) GROUP BY GROUPING SETS ((col1, col2), col1)
- 30. ROLLUP GROUP BY ROLLUP(col1, col2) Generates subtotals automatically Generally used in hierarchical dimensions (region, state, city), (year, quarter, month, day) n + 1 different groupings where n is the number of expressions in the ROLLUP operator in the GROUP BY clause. Order of the columns in ROLLUP matter. ROLLUP(col1, col2), ROLLUP(col2, col1) produce different outputs
- 31. CUBE GROUP BY CUBE(col1, col2) Gives subtotals automatically for every possible combination Used in cross-tabular reports. Suitable when dimensions are independent of each other 2n different groupings where n is the number of expressions in the CUBE operator in the GROUP BY clause. Have to be careful with higher values for n Order of the columns in CUBE doesn’t really matter. CUBE(col1, col2), CUBE(col2, col1) produce same results, but in a different order.
- 32. Grouping Sets GROUP BY GROUPING SETS (col1, (col1, col2)) Explicitly lists the needed groupings GROUPING, GROUPING_ID, GROUP_ID functions help you differentiate one grouping from the other. Advanced aggregation functions more efficient than their UNION ALL equivalents (why?) Grouping Equivalent GROUPING SETS advanced_agg.sql CUBE(a,b) GROUPING SETS((a,b), (a), (b), ()) ROLLUP(a,b) GROUPING SETS((a,b), (a), ()) ROLLUP(b,a) GROUPING SETS((a,b), (b), ()) ROLLUP(a) GROUPING SETS((a), ())
- 33. Composite Columns Treat multiple columns as a single column Composite_columns.sql
- 34. Concatenated Groupings GROUP BY GROUPING SETs (a,b), GROUPING SETS (c,d) The above is same as GROUP BY GROUPING SETS ((a,c), (a,d), (b,c), (b,d))
- 35. References http://orafaq.com/node/55 http://orafaq.com/node/56 http://www.orafaq.com/node/1874 http://www.morganslibrary.org/reference/a nalytic_functions.html http://morganslibrary.org/reference/rollup. html http://www.oracle.com/technology/oramag/ oracle/05-mar/o25dba.html http://www.gennick.com/magic.html
- 36. References Chapter 12 of "Expert Database Architecture" by Tom Kyte Business-Savy SQL by Ganesh Variar, Oracle magazine, Mar/April 2002 http://asktom.oracle.com/pls/asktom/asktom.searc h?p_string=rock+and+roll http://forums.oracle.com/forums/search.jspa?q=an alytic&objID=f75&dateRange=all&numResults=30& forumID=75&rankBy=10001&start=0
- 37. Q&A devel_oracle@
- 38. Predicate merging in views with analytics create view v select .. over(partition by ...) from t; select ... from v where col1 = A In some cases predicates dont get merged. Reasons: http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QU ESTION_ID:12864646978683#30266389821111 http://asktom.oracle.com/pls/asktom/f?p=100:11:0::NO::P11_ QUESTION_ID:1137250200346660664 http://forums.oracle.com/forums/thread.jspa?messageID=416 9151�

No public clipboards found for this slide

×
### Save the most important slides with Clipping

Clipping is a handy way to collect and organize the most important slides from a presentation. You can keep your great finds in clipboards organized around topics.

Be the first to comment