Oracle SQL Model Clause
 

Abstract: The session will break down the Model clause into its fundamental components and provide some basic real-world examples to demonstrate its greater potential.

Though most developers have heard of the SQL Model clause in 10g, many may baulk at the idea of using it - daunted by seemingly foreign syntax that might well have come out of a FORTRAN program.

Look a little closer and you'll find it's just like building a spreadsheet: concise, easy-to-read syntax that provides the functionality for demanding calculations that would normally require elaborate joins, unions, analytics or PL/SQL. In addition to the development and maintenance burden, we are also faced with the all-too-familiar problem of business customers duplicating data to an Excel spreadsheet that is shared and erroneously modified around the workplace.

This session presents the Model clause as a high-performance tool that can simplify approaches to everyday problems. It demonstrates that Model is an extension to SQL that forms multi-dimensional arrays with inter-row and inter-array calculations that automatically resolve formula dependencies.

  • This is my basic agenda for today. I hope you come out of it understanding how to use the Model clause and when and where to start using it, particularly to replace spreadsheet functionality.
  • Experts can either save me time and come up to explain this, or leave the room so as not to embarrass me!
  • The inspiration for this presentation came when I saw a blog post from Tom Kyte talking about “Why can’t programmers program?”. The following example was posed, something you might see in a Year 12 computing exam. Someone said he thought it was an old university drinking game (one that I’d fail badly).
  • Various responses were given; I took the best of breed and cleaned it up to this level.
  • But someone suggested this… so I wondered how it worked. Not having worked on a 10g database before, the only other time I remember seeing it was in a presentation by Penny, where she labelled it a real beast! It’s really not too hard to understand; I want to break it down and demystify it for all of you.
  • I’ve always wanted to include a pie chart in my presentation. The Model clause is just like a spreadsheet, and I love spreadsheets, as my colleagues can testify. I have a spreadsheet for everything. Here we’re going to combine SQL and spreadsheets, with a little bit of craft. I mean, who hasn’t had a user ask “can’t we just have it in a spreadsheet?” As a 2003 Oracle white paper says, it’s a good replacement for spreadsheets, which all businesses know can be dodgy.
  • Not all features of Model are illustrated here, but this is the core functionality most likely to be used, and it is what is being covered today. I’ll come back to this slide when I can.
  • Dimensions are identifying characteristics like Date, Region, Product.
  • Measures are analogous to the measures of a fact table in a star schema, for example Sales or Cost. Each cell is accessed within its partition by specifying its full combination of dimensions.
  • Partitions are similar to those in analytical functions: rules are applied only within each partition. Each partition is viewed by the formulas as an independent array.
  • Now the business end of the clause, in particular the cell assignments.
  • All calculations are done on the database. Model is the last clause processed in the query, except for the final ORDER BY. It does not update or insert data in existing tables, but of course you can supply its output to an INSERT, UPDATE or MERGE statement. So we can use Model to just view the rows we logically create or update. RETURN UPDATED ROWS is a convenient way to limit the result to newly calculated values. Again, this is useful to apply to our DML.
  • Back to our rules slide… Model could also be a better version of GROUP BY ROLLUP.
  • Now to revert to the examples in the SH schema. (Remember this is an old sample schema with old data; Model has been around since 2003 and is still not widely adopted.) This is single-cell access. For creation of new cells, such as projections for future years, we need to use these positional references, or the FOR loops discussed later. Positional references both update and insert: UPSERT.
  • Multi-cell access: update only. Also referred to as multi-cell access on the left side. This updates all rows after 1999 WHERE WE ALREADY HAVE ROWS. Conditions can use operators such as <>, IN and BETWEEN; I like to think of them as boolean comparisons. Symbolic references are powerful, but solely for updating cells. If a symbolic reference is used in any dimension, the rule will update only. FOR loops provide a concise technique for creating multiple cells from a single formula such as this.
  • Simple aggregate functions, like finding 100 more than the maximum sales for Bounce between those years.
  • Or analytical functions, like finding 30% more than the median sales for those products for years before 2003. Instantly, we’ve got more power and potential than spreadsheets, while using the same type of cell referencing as a basis.
  • CV() is used on the right side of formulas to copy left-side specifications that refer to multiple cells. Each year’s sales for Bounce is the sum of Mouse Pad sales and 20% of Y Box sales for that year. A few things to note.
  • The dimension argument to CV() is optional; without it, CV() uses positional reference.
  • You might have spotted the lack of data. Data is in the database only for 1999 to 2001, and since this uses symbolic references, data is only updated, not created.
  • So it’s the same as writing multiple rules: the one compact and flexible rule evaluates to three separate rules. It’s almost equivalent to a SQL join, and it’s essential to efficient usage of the Model clause.
  • Now we start getting into some impressive expressions: inter-row calculations.
  • Here is a common performance and readability issue: referring to the previous year’s sales. We can do this with analytical functions, but only to a certain extent. There are restrictions on where in the query they may appear, and when using the values in calculations, you really need an inline view to make it readable. Here we need to select the column, define the measure and define the rule, and we have a calculated field.
  • A calculated growth column has been added, referring to the value of last year’s sales multiple times in the calculation rather simply.
  • “dimension IS ANY” is symbolic. “ANY” is positional, but treated as symbolic. Either prevents cell insertion, because it really means (dimension IS NOT NULL OR dimension IS NULL).
  • What’s the difference between these two functions? Note that with Model it’s a little easier to reference year-1 without needing to eliminate it from the final result set. We’ll explore that a little further later on. Now to focus…
  • If we want to start forecasting data for several products, I could list out several rules. This could quickly become unmanageable, and doesn’t demonstrate code reuse.
  • We could use a familiar construct, but this is symbolic notation, which doesn’t allow creation of rows. You’d get UPDATE behaviour even though you state UPSERT.
  • This is where the FOR loop is handy. Think of it as a tool to make a single formula generate multiple formulas with positional references, thus allowing UPSERT. You can also see how we begin to get the power of PL/SQL-esque constructs in SQL.
  • Many options are available for the FOR construct. All of these return the same results. The last option opens up a world of possibilities, for example introducing new data.
  • Another FOR loop construct highlights a trap to be careful of. Something always to be wary of when assessing results is rule evaluation order. Take SEQUENTIAL order, which is the default. In this case, the FOR construct rules are evaluated in the order they are generated.
  • Adding a little French culture here… When explicitly using AUTOMATIC order, Oracle knows that there is a dependency starting from 2000. So you get vastly different results! There are a number of other scenarios to consider with FOR loops that I don’t have time to go into today, relating to usage of UPDATE, UPSERT, or UPSERT ALL. Just be aware of it if your results or performance are unexpected.
  • We can have an ORDER BY for each rule as well, and the same cautions apply. In sequential order, a rule just can’t refer to itself without further direction. Ordering ascending is going to have the same issue when travelling down the list. So we can order descending to ensure correct results.
  • Not too much of a difference, unless you have some records with null sales figures.
  • Suppose we have a table of exchange rates and want to combine it with our data from the Model query.
  • We can start using something called reference models, this little baby here…
  • Which translates into this query.
  • It looks daunting, but note we have exactly the same type of query you’ve been seeing all morning, with our little FOR loop here.
  • But this time we also have a read-only reference model. Note the alias and reference semantics. There’s no reason you can’t have more than one reference model either; note the “1.0” where the documented example had a growth rate as well.
  • Some of the sharper ones out there who are still awake may be thinking: eh? Why don’t I just use a simple join?! Sure, if this example is all you need to do, stick with the wheel that works, but…
  • … remember our amortization example? Please believe me when I say it’s going to be a little difficult converting that back to a simple join. There’s recursion, a multi-dimensional reference model and a heavy need for Nurofen in there…
  • Good segue onto restrictions… We all want to know what sort of restrictions apply to new features. I remember back in 8i, materialized views were patience-building, with all the restrictions one would face for fast refresh. We don’t really have that here. Most are pretty obvious. The Data Warehousing Guide lists quite a few, but most are concerned with semantics already mentioned, such as using DECREMENT instead of INCREMENT -1. Here are some examples of what not to do… Don’t do this… although at this stage this is a bug, but you can’t update a reference model. Try in 11g? Reference models also can’t have a partition clause; they don’t need it, as it’s not relevant.
  • The measure is an important and flexible part of the Model clause.
  • The only time sub-queries are allowed in the rules is in the FOR construct. Something like this would need to be rewritten into the measure.
  • The sub-query can’t contain a WITH clause, or return more than 10,000 rows. And really, if you’ve got that many rows in your sub-query, are you approaching the problem correctly?
  • You can’t supply a bind variable for the number of iterations you desire. However, you can define a figure larger than expected, and use a bind variable for where to iterate until.
  • For starters, the whole point of using the Model clause over past approaches is that we no longer need multiple joins to the same data source and/or unions added to our dataset. We can query the data set we’d like to manipulate and apply our entire rule set over that single set of data.
  • To follow this scenario, here is a simple list of customers. But in a report, we want to fill in missing months and add a cumulative figure.
  • Tell Model what you want, and the optimizer figures it out for you. Don’t worry about analytics.
  • Model clause computation is scalable in terms of the number of processors you have. That goes for most parallel queries you may have worked with in the past. In the Model computation, it is achieved via the PARTITION BY clause. It essentially creates parallel query slaves for each partition, and each slave independently evaluates the rules based on the data it receives. Obviously, if your partition has low cardinality, the level of parallelism will be limited. The documentation even suggests that if no partition clause is present, it may consider dimensions for partitioning.
  • Oracle’s explain plans are fully aware of Model computation. There is a line in the query’s main explain plan output showing the model and algorithm used. Reference models are tagged with the keyword REFERENCE.
  • Explain plan detail is predominantly based on the order of evaluation of the rules: sequential vs automatic. If automatic, are there any cyclic dependencies in the formulas? FAST is used if left-side cell references are single-cell references and aggregates, if any, are simple arithmetic non-distinct aggregates like SUM, COUNT, AVG etc. According to the documentation, basically the further to the left on my wonderful chart here, the less CPU-intensive your query will be. I analysed similar queries using tkprof and Tom Kyte’s runstats test harness for latches, and couldn’t find any significant differences between these scenarios. However, for your particular situation, take a leaf out of Connor’s book and prove to yourself that the example you would like to use does actually perform better depending on how it’s written.
  • Performance is an important aspect of development. Despite what some DBAs believe, we developers want to make sure our queries run like greased lightning, and we want to show our skills in getting the job done efficiently. The Model clause is up for that.
  • When: As a developer, you may have trained your eyes to see solutions using decode, subqueries, unions, analytics, or ultimately a combination. Over time you’ll see opportunities for the Model clause, just as we’ve been finding more and more opportunities for analytics ever since 8.1.7, to the point where Tom Kyte is thinking of publishing a book solely on analytical functions. Why: It’s almost like a new language, where you’ll start to think about your data and result sets differently. I actually stumbled across the patent for this concept during my research. Procedural programmers might find the array access to rows and set-based programming an easy transition, further allowing you to finalise your solutions in SQL without the need for PL/SQL. Something I didn’t get a chance to go into was recursion. Up to 9i, the power of recursion relied on PL/SQL; the Model clause allows recursion using SQL alone. It combines procedural flexibility with set-based processing power. Just consider memory requirements. Some cases will allow you to move PL/SQL logic to SQL, removing context switching and ultimately improving performance. Hence it gives you the opportunity to do even more things exclusively in SQL that would normally require some PL/SQL. Golden rule? Can it be done in SQL?
  • The Model clause may seem more like a niche addition to SQL rather than a long-awaited solution. Once this new feature has been accepted and used by a large number of developers, its usefulness will grow as developers will undoubtedly discover clever and unexpected uses for it. In other words, give it time, and it will grow on you. - Anthony Molinaro

Oracle SQL Model Clause: Presentation Transcript

  • The Model Clause: Spreadsheet Techniques using SQL. Scott Wesley
  • Agenda
      • Concepts
      • Cell Referencing
      • Focus
      • Performance
      • Final thoughts…
  • Any Model experts?
      SELECT c, p, m, pp, ip
      FROM mortgage
      MODEL
        REFERENCE R ON
          (SELECT customer, fact, amt
           FROM mortgage_facts
           MODEL DIMENSION BY (customer, fact) MEASURES (amount amt)
             RULES
               (amt[any, 'PaymentAmt'] =
                  (amt[CV(),'Loan'] *
                   Power(1 + (amt[CV(),'Annual_Interest']/100/12), amt[CV(),'Payments']) *
                   (amt[CV(),'Annual_Interest']/100/12)) /
                  (Power(1 + (amt[CV(),'Annual_Interest']/100/12), amt[CV(),'Payments']) - 1)))
          DIMENSION BY (customer cust, fact) MEASURES (amt)
        MAIN amortization
          PARTITION BY (customer c)
          DIMENSION BY (0 p)
          MEASURES (principalp pp, interestp ip, mort_balance m, customer mc)
          RULES ITERATE(1000) UNTIL (ITERATION_NUMBER+1 = r.amt[mc[0],'Payments'])
            (ip[ITERATION_NUMBER+1] = m[CV()-1] * r.amt[mc[0], 'Annual_Interest']/1200
            ,pp[ITERATION_NUMBER+1] = r.amt[mc[0], 'PaymentAmt'] - ip[CV()]
            ,m[ITERATION_NUMBER+1] = m[CV()-1] - pp[CV()])
      ORDER BY c, p;
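To make the amortization query less intimidating, here is a minimal Python sketch of the same arithmetic. It is an emulation rather than Oracle code: the function names and the sample loan (100,000 at 12% a year over 12 payments) are assumptions made up for illustration, and it assumes the starting balance equals the loan amount.

```python
# Sketch of the amortization logic; names and sample figures are hypothetical.

def payment_amount(loan, annual_interest_pct, payments):
    """Fixed periodic payment, mirroring the reference model's 'PaymentAmt' rule."""
    r = annual_interest_pct / 100 / 12       # monthly interest rate
    growth = (1 + r) ** payments
    return loan * growth * r / (growth - 1)


def amortize(loan, annual_interest_pct, payments, max_iterations=1000):
    """Emulate RULES ITERATE(1000) UNTIL (ITERATION_NUMBER+1 = payments)."""
    payment = payment_amount(loan, annual_interest_pct, payments)
    balance = {0: loan}   # m[p]: balance after payment p (assumed to start at the loan)
    interest = {}         # ip[p]: interest part of payment p
    principal = {}        # pp[p]: principal part of payment p
    for iteration_number in range(max_iterations):
        p = iteration_number + 1
        interest[p] = balance[p - 1] * annual_interest_pct / 1200
        principal[p] = payment - interest[p]
        balance[p] = balance[p - 1] - principal[p]
        if p == payments:  # the UNTIL condition stops the iteration early
            break
    return payment, interest, principal, balance


payment, interest, principal, balance = amortize(100_000, 12, 12)
```

Each pass of the loop corresponds to one ITERATE step: the three assignments run in the same order as the three rules, and the balance reaches roughly zero on the final payment.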
  • FizzBuzz
      • Write a program that prints the numbers from 1 to 100.
      • But for multiples of three print “Fizz” instead of the number, and for multiples of five print “Buzz”.
      • For numbers which are multiples of both three and five print “FizzBuzz”.
      http://tkyte.blogspot.com/2007/02/what-is-your-fizzbuzz-factor.html
  • SQL> WITH data AS (SELECT LEVEL AS n FROM DUAL CONNECT BY LEVEL <= 100)
         SELECT n,
                NVL(CASE WHEN MOD(n,3) = 0 THEN 'Fizz' END ||
                    CASE WHEN MOD(n,5) = 0 THEN 'Buzz' END,
                    TO_CHAR(n)) AS answer
         FROM data;

         N ANSWER
      ---- -----------
         1 1
         2 2
         3 Fizz
         4 4
         5 Buzz
         6 Fizz
         7 7
         8 8
         9 Fizz
        10 Buzz
        11 11
        12 Fizz
        13 13
        14 14
        15 FizzBuzz
        16 16
        17 17
        18 Fizz
      ...
  • Anthony Wilson said....
      So... do we win points by using MODEL, or get thrown out of the interview on mental health grounds? Nobody in their right mind actually uses this stuff, right? :)
      select id, n
      from all_objects
      where object_id = 1
      model
        dimension by (object_id id)
        measures (object_name n)
        rules (
          n[for id from 1 to 100 increment 1] = to_char(cv(id)),
          n[mod(id, 3) = 0] = 'fizz',
          n[mod(id, 5) = 0] = 'buzz',
          n[mod(id, 15) = 0] = 'fizzbuzz'
        )
      order by id
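How the four rules in that MODEL query interact can be sketched in Python. This is only an emulation under a simplifying assumption: the measure `n` is held in a plain dictionary keyed by the `id` dimension, and the rules are applied one after another in the order written.

```python
# Emulates the four MODEL rules with a dict keyed by the id dimension.
cells = {}  # measure n, keyed by dimension id

# Rule 1: "for id from 1 to 100 increment 1" is positional, so it creates cells.
for i in range(1, 101):
    cells[i] = str(i)

# Rules 2-4 are symbolic: they only update cells that already exist.
# They run in the order written, so the later, more specific rule wins.
for divisor, word in [(3, "fizz"), (5, "buzz"), (15, "fizzbuzz")]:
    for i in cells:
        if i % divisor == 0:
            cells[i] = word
```

The ordering is what makes multiples of 15 end up as 'fizzbuzz': they are first set to 'fizz', then 'buzz', and finally overwritten by the last rule.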
  • SQL Spreadsheets Craft
  • Basic Syntax
      <prior clauses of SELECT statements>
      MODEL [main] [RETURN {ALL|UPDATED} ROWS]
        [reference models]
        [PARTITION BY (<cols>)]
        DIMENSION BY (<cols>)
        MEASURES (<cols>) [IGNORE NAV] | [KEEP NAV]
        [RULES [UPSERT | UPDATE] [AUTOMATIC ORDER | SEQUENTIAL ORDER]
          [ITERATE (n) [UNTIL <condition>] ]
          ( <cell_assignment> = <expression> ... ) ]
  • SELECT country
          ,year
          ,sales
          ,previous_year
          ,before_that
      FROM sales_view
      GROUP BY year, country
      -- Define model structure/options
      MODEL RETURN ALL ROWS
        PARTITION BY (country)
        DIMENSION BY (year)
        MEASURES (SUM(sales) AS sales
                 -- Define 'calculated columns'
                 ,CAST(NULL AS NUMBER) previous_year
                 ,CAST(NULL AS NUMBER) before_that)
        -- Define rule options
        RULES AUTOMATIC ORDER
        ( -- Define actual rules
          previous_year[ANY] = sales[CV()-1]
         ,before_that [ANY] = sales[CV()-2]
        )
      -- Define order of entire result set
      ORDER BY country, year;
      Apply model clause to this query
  • Concepts
      Country    Product    Year       Sales
      Partition  Dimension  Dimension  Measure
      ---------  ---------  ---------  -------
      AUS        HAMMER     2006       10
      AUS        NAIL       2006       200
      NZ         NAIL       2006       300
      NZ         DRILL      2007       50
  • Dimension
      Country    Product    Year       Sales
      Partition  Dimension  Dimension  Measure
      <prior clauses of SELECT statements>
      MODEL [main] [RETURN {ALL|UPDATED} ROWS]
        [reference models]
        [PARTITION BY (<cols>)]
        DIMENSION BY (product p, year y)
        MEASURES (<cols>) [IGNORE NAV] | [KEEP NAV]
        [RULES …
  • Measure
      Country    Product    Year       Sales
      Partition  Dimension  Dimension  Measure
      <prior clauses of SELECT statements>
      MODEL [main] [RETURN {ALL|UPDATED} ROWS]
        [reference models]
        [PARTITION BY (<cols>)]
        DIMENSION BY (<cols>)
        MEASURES (sales) [IGNORE NAV] | [KEEP NAV]
        [RULES …
  • Partition
      Country    Product    Year       Sales
      Partition  Dimension  Dimension  Measure
      <prior clauses of SELECT statements>
      MODEL [main] [RETURN {ALL|UPDATED} ROWS]
        [reference models]
        [PARTITION BY (country)]
        DIMENSION BY (<cols>)
        MEASURES (<cols>) [IGNORE NAV] | [KEEP NAV]
        [RULES …
  • Array [figure: three-dimensional array of sales cells (values shown: 20, 15, 5, 35, 15) with axes Product (Hammer, Drill), Year (2006, 2007, 2008) and Country (Aus, NZ)]
  • Rules
      <prior clauses of SELECT statements>
      MODEL [main] [RETURN {ALL|UPDATED} ROWS]
        [reference models]
        [PARTITION BY (<cols>)]
        DIMENSION BY (<cols>)
        MEASURES (<cols>) [IGNORE NAV] | [KEEP NAV]
        [RULES [UPSERT | UPDATE] [AUTOMATIC ORDER | SEQUENTIAL ORDER]
          [ITERATE (n) [UNTIL <condition>] ]
          ( <cell_assignment> = <expression> ... ) ]
  • Rules
      COUNTRY    PRODUCT    YEAR       SALES
      Partition  Dimension  Dimension  Measure
      ---------  ---------  ---------  -------
      Original Data
      AUS        Hammer     2006       20
      AUS        Hammer     2007       15
      AUS        Drill      2007       5
      NZ         Hammer     2006       15
      NZ         Hammer     2007       10
      NZ         Drill      2007       10
      Rule Results
      AUS        Hammer     2008       35
      AUS        Drill      2008       15
      NZ         Hammer     2008       25
      NZ         Drill      2008       30
      Sales[Hammer,2008] = Sales[Hammer,2006]+Sales[Hammer,2007]
      Sales[Drill ,2008] = Sales[Drill,2007]*3
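The slide's numbers can be reproduced with a small Python sketch that treats each partition as its own array. The layout is an assumption for illustration: one dictionary per country, keyed by the (product, year) dimensions, holding the sales measure.

```python
# Hypothetical emulation: one dict per partition, keyed by (product, year).
rows = [
    ("AUS", "Hammer", 2006, 20), ("AUS", "Hammer", 2007, 15), ("AUS", "Drill", 2007, 5),
    ("NZ", "Hammer", 2006, 15), ("NZ", "Hammer", 2007, 10), ("NZ", "Drill", 2007, 10),
]

# PARTITION BY (country), DIMENSION BY (product, year), MEASURES (sales)
partitions = {}
for country, product, year, sales in rows:
    partitions.setdefault(country, {})[(product, year)] = sales


def apply_rules(sales):
    """The two rules from the slide, applied to a single partition."""
    sales[("Hammer", 2008)] = sales[("Hammer", 2006)] + sales[("Hammer", 2007)]
    sales[("Drill", 2008)] = sales[("Drill", 2007)] * 3


# Every partition is evaluated independently with the same rule set.
for sales in partitions.values():
    apply_rules(sales)
```

Because the rules never see another country's cells, the same two formulas yield different 2008 figures for AUS and NZ.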
  • Spreadsheet
      Sales[Hammer,2008] = Sales[Hammer,2006]+Sales[Hammer,2007]
      Sales[Drill ,2008] = Sales[Drill,2007]*3
  • Return Rows
      <prior clauses of SELECT statements>
      MODEL [main] [RETURN {ALL|UPDATED} ROWS]
        [reference models]
        [PARTITION BY (<cols>)]
        DIMENSION BY (<cols>)
        MEASURES (<cols>) [IGNORE NAV] | [KEEP NAV]
        [RULES [UPSERT | UPDATE] [AUTOMATIC ORDER | SEQUENTIAL ORDER]
          [ITERATE (n) [UNTIL <condition>] ]
          ( <cell_assignment> = <expression> ... ) ]
  • Rules
      COUNTRY    PRODUCT    YEAR       SALES
      Partition  Dimension  Dimension  Measure
      ---------  ---------  ---------  -------
      AUS        Hammer     2006       20
      AUS        Hammer     2007       15
      AUS        Drill      2007       5
      NZ         Hammer     2006       15
      NZ         Hammer     2007       10
      NZ         Drill      2007       10
      AUS        Hammer     2008       35
      AUS        Drill      2008       15
      NZ         Hammer     2008       25
      NZ         Drill      2008       30
      RETURN ALL ROWS: all ten rows
      RETURN UPDATED ROWS: only the four 2008 rows created by the rules
      Sales[Hammer,2008] = Sales[Hammer,2006]+Sales[Hammer,2007]
      Sales[Drill ,2008] = Sales[Drill,2007]*3
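The difference between the two RETURN options can be sketched in Python by remembering which cells a rule has touched. The `Model` class below is a hypothetical helper for illustration, not an Oracle API, and it only covers the AUS partition from the slide.

```python
class Model:
    """Toy stand-in for one partition of a MODEL query (hypothetical helper)."""

    def __init__(self, cells):
        self.cells = dict(cells)   # (product, year) -> sales
        self.updated = set()       # keys assigned by a rule

    def assign(self, key, value):
        self.cells[key] = value
        self.updated.add(key)

    def return_all_rows(self):
        return sorted(self.cells.items())

    def return_updated_rows(self):
        return sorted((k, v) for k, v in self.cells.items() if k in self.updated)


aus = Model({("Hammer", 2006): 20, ("Hammer", 2007): 15, ("Drill", 2007): 5})
aus.assign(("Hammer", 2008), aus.cells[("Hammer", 2006)] + aus.cells[("Hammer", 2007)])
aus.assign(("Drill", 2008), aus.cells[("Drill", 2007)] * 3)
```

`return_all_rows()` gives the three source rows plus the two new ones, while `return_updated_rows()` gives just the two 2008 cells, which is the shape you would typically feed into DML.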
  • Positional Cell Referencing
      SELECT country, product, year, sales
      FROM sales_view
      WHERE country = 'Australia'
      MODEL RETURN UPDATED ROWS
        PARTITION BY (country)
        DIMENSION BY (product, year)
        MEASURES (sales)
        RULES (
          sales['Bounce', 2000] = 10
         ,sales['Bounce', 2008] = 20 );

      COUNTRY      PROD         YEAR  SALES
      ------------ ------------ ----- ----------
      Australia    Bounce       2000  10
      Australia    Bounce       2008  20
  • Symbolic Cell Referencing
      SELECT country, product, year, sales
      FROM sales_view
      WHERE country = 'Australia'
      MODEL RETURN UPDATED ROWS
        PARTITION BY (country)
        DIMENSION BY (product, year)
        MEASURES (sales)
        RULES (
          sales[product = 'Bounce', year > 1999] = 10 )
      ORDER BY country, product, year;

      COUNTRY      PRODUCT      YEAR  SALES
      ------------ ------------ ----- ----------
      Australia    Bounce       2000  10
      Australia    Bounce       2001  10
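The update-versus-upsert distinction between the two reference styles can be sketched in Python. This is an emulation with assumed helper names; the Bounce sales figures are the unrounded values that appear in later slides of this deck.

```python
# Bounce sales for Australia (unrounded values quoted elsewhere in the deck).
sales = {("Bounce", 1999): 1877.67, ("Bounce", 2000): 3151.35, ("Bounce", 2001): 3760.51}


def assign_positional(cells, key, value):
    """Positional reference: names one cell exactly and creates it if missing (UPSERT)."""
    cells[key] = value


def assign_symbolic(cells, predicate, value):
    """Symbolic reference: a boolean condition that only touches existing cells (UPDATE)."""
    for key in cells:
        if predicate(*key):
            cells[key] = value


positional = dict(sales)
assign_positional(positional, ("Bounce", 2000), 10)   # existing cell: updated
assign_positional(positional, ("Bounce", 2008), 20)   # missing cell: inserted

symbolic = dict(sales)
assign_symbolic(symbolic, lambda product, year: product == "Bounce" and year > 1999, 10)
```

The symbolic version changes 2000 and 2001 but never produces a 2008 cell, which matches the two result sets on the slides.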
  • Multi-Cell Access on Right Side
      SELECT country, product, year, sales
      FROM sales_view
      WHERE country = 'Australia'
      MODEL RETURN UPDATED ROWS
        PARTITION BY (country)
        DIMENSION BY (product, year)
        MEASURES (sales)
        RULES (
          sales['Bounce',2008] =
            100 + MAX(sales)['Bounce', year BETWEEN 1998 AND 2002]
         ,sales['Y Box',2008] =
            1.3 * PERCENTILE_DISC(0.5) WITHIN GROUP (ORDER BY sales)
                  [product IN ('Bounce', 'Y Box'), year < 2003]);

      COUNTRY    PRODUCT  YEAR  SALES
      ---------- -------- ----- -------
      Australia  Bounce   2008  3861
      Australia  Y Box    2008  4889
  • CV()
  • SELECT country, product, year, sales
      FROM sales_view
      WHERE country = 'Australia'
      MODEL RETURN UPDATED ROWS
        PARTITION BY (country)
        DIMENSION BY (product, year)
        MEASURES (sales)
        RULES (
          sales['Bounce', year BETWEEN 1995 AND 2002] =
            sales['Mouse Pad', CV(year)] + 0.2 * sales['Y Box', CV()]);

      COUNTRY    PRODUCT  YEAR  SALES
      ---------- -------- ----- -------
      Australia  Bounce   1999  6061
      Australia  Bounce   2000  8154
      Australia  Bounce   2001  15211
  • SELECT country, product, year, sales
      FROM sales_view
      WHERE country = 'Australia'
      MODEL RETURN UPDATED ROWS
        PARTITION BY (country)
        DIMENSION BY (product, year)
        MEASURES (sales)
        RULES (
          sales['Bounce', 1999] = sales['Mouse Pad',1999]+0.2*sales['Y Box',1999]
         ,sales['Bounce', 2000] = sales['Mouse Pad',2000]+0.2*sales['Y Box',2000]
         ,sales['Bounce', 2001] = sales['Mouse Pad',2001]+0.2*sales['Y Box',2001]);
  • CV(year)-1
  • SELECT country, product, year, sales, p_year, growth_pct
      FROM sales_view
      WHERE country = 'Australia'
      MODEL RETURN UPDATED ROWS
        PARTITION BY (country)
        DIMENSION BY (product, year)
        MEASURES (sales, 0 p_year, 0 growth_pct)
        RULES (
          p_year[product IN ('Bounce','Y Box'), year BETWEEN 1998 AND 2001] =
            sales[CV(product), CV(year) -1]
         ,growth_pct[product IN ('Bounce','Y Box'), year BETWEEN 1998 AND 2001] =
            100 * (sales[CV(product), CV(year)] - sales[CV(product), CV(year) -1])
                / sales[CV(product), CV(year) -1])
      ORDER BY country, product, year;

      COUNTRY    PRODUCT   YEAR  SALES   P_YEAR  GROWTH_PCT
      ---------- --------- ----- ------- ------- ----------
      Australia  Bounce    1999  1878
      Australia  Bounce    2000  3151    1878    67.83
      Australia  Bounce    2001  3761    3151    19.33
      Australia  Y Box     1999  13773
      Australia  Y Box     2000  27007   13773   96.09
      Australia  Y Box     2001  59291   27007   119.54
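What CV() does on the right-hand side can be sketched in Python: for every existing cell matched by the left side, the current product and year are carried across, so `CV(year) - 1` becomes a plain lookup of the previous year's cell. The dictionary layout is an assumption, only the Bounce rows are modelled, and the figures are the unrounded Bounce values quoted elsewhere in this deck.

```python
sales = {("Bounce", 1999): 1877.67, ("Bounce", 2000): 3151.35, ("Bounce", 2001): 3760.51}

p_year, growth_pct = {}, {}
for (product, year) in sales:   # symbolic left side: existing cells only
    if product in ("Bounce", "Y Box") and 1998 <= year <= 2001:
        # sales[CV(product), CV(year) - 1]; a missing cell reads as NULL (None)
        previous = sales.get((product, year - 1))
        p_year[(product, year)] = previous
        if previous is None:
            growth_pct[(product, year)] = None   # NULL propagates through arithmetic
        else:
            growth_pct[(product, year)] = (
                100 * (sales[(product, year)] - previous) / previous
            )
```

Rounded to two decimals, the 2000 and 2001 growth figures come out as 67.83 and 19.33, and 1999 stays empty because there is no 1998 cell to read.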
  • SELECT country, product, year, sales, p_year, growth_pct
      FROM sales_view
      WHERE country = 'Australia'
      MODEL RETURN UPDATED ROWS
        PARTITION BY (country)
        DIMENSION BY (product, year)
        MEASURES (sales, 0 p_year, 0 growth_pct)
        RULES (
          p_year[product IN ('Bounce','Y Box'), ANY] =
            sales[CV(product), CV(year) -1]
         ,growth_pct[product IN ('Bounce','Y Box'), year IS ANY] =
            100 * (sales[CV(product), CV(year)] - sales[CV(product), CV(year) -1])
                / sales[CV(product), CV(year) -1])
      ORDER BY country, product, year;

      COUNTRY    PRODUCT   YEAR  SALES   P_YEAR  GROWTH_PCT
      ---------- --------- ----- ------- ------- ----------
      Australia  Bounce    1999  1878
      Australia  Bounce    2000  3151    1878    67.83
      Australia  Bounce    2001  3761    3151    19.33
      Australia  Y Box     1999  13773
      Australia  Y Box     2000  27007   13773   96.09
      Australia  Y Box     2001  59291   27007   119.54
  • Analytics vs Model
      LAG(sales,1) OVER (PARTITION BY prod ORDER BY year) p_year
      p_year[prod IS ANY, year IS ANY] = sales[CV(prod), CV(year)-1]
      SAME! Or is it…?
  • Model method:
      COUNTRY    PRODUCT   YEAR  SALES        P_YEAR  GROWTH_PCT
      ---------- --------- ----- ------------ ------- ----------
      Australia  Bounce    1999  1,878
      Australia  Bounce    2000  3,151        1878    67.83
      Australia  Bounce    2001  3,761        3151    19.33
      Australia  Bounce    2003  8,790

    Analytic method:
      COUNTRY    PRODUCT   YEAR  SALES        P_YEAR  GROWTH_PCT
      ---------- --------- ----- ------------ ------- ----------
      Australia  Bounce    1999  1,878
      Australia  Bounce    2000  3,151        1878    67.83
      Australia  Bounce    2001  3,761        3151    19.33
      Australia  Bounce    2003  8,790        3761    133.73
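Why the two methods disagree for 2003 can be sketched in Python. LAG looks at the previous row in the ordered set, whatever year it holds, whereas the Model rule addresses the cell for year minus one. The dictionary below is an assumed layout; the 2003 figure is the rounded value printed on the slide.

```python
# Bounce sales by year; there is no row for 2002.
sales_by_year = {1999: 1877.67, 2000: 3151.35, 2001: 3760.51, 2003: 8790}

years = sorted(sales_by_year)

# Analytic LAG(sales, 1) ... ORDER BY year: the previous row, regardless of its year.
lag = {
    year: (sales_by_year[years[i - 1]] if i > 0 else None)
    for i, year in enumerate(years)
}

# Model sales[CV(year) - 1]: the cell addressed by year - 1, or NULL if absent.
model = {year: sales_by_year.get(year - 1) for year in years}
```

For 2000 and 2001 the two agree. For 2003, `lag` returns the 2001 figure while `model` returns nothing, so the analytic version silently reports growth over two years as if it were one.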
  • Rule Repetition
      SELECT product, year, sales
      FROM sales_view
      WHERE country = 'Australia'
      MODEL RETURN UPDATED ROWS
        DIMENSION BY (product, year)
        MEASURES (sales)
        RULES UPSERT (
          sales['Mouse Pad', 2008] = 1.3 * sales['Mouse Pad', 2001]
         ,sales['Bounce', 2008] = 1.3 * sales['Bounce', 2001]
         ,sales['Y Box', 2008] = 1.3 * sales['Y Box', 2001] );

      PRODUCT    YEAR  SALES
      ---------- ----- -------
      Y Box      2008  77078
      Bounce     2008  4888
      Mouse Pad  2008  4358
      3 rows selected.
  • Symbolic Reference
      SELECT product, year, sales
      FROM sales_view
      WHERE country = 'Australia'
      MODEL RETURN UPDATED ROWS
        DIMENSION BY (product, year)
        MEASURES (sales)
        RULES UPSERT (
          sales[product IN ('Mouse Pad', 'Bounce', 'Y Box'), 2008] =
            1.3 * sales[CV(product), 2001] );

      no rows selected
  • Positional Reference
      SELECT product, year, sales
      FROM sales_view
      WHERE country = 'Australia'
      MODEL RETURN UPDATED ROWS
        DIMENSION BY (product, year)
        MEASURES (sales)
        RULES UPSERT (
          sales[FOR product IN ('Mouse Pad', 'Bounce', 'Y Box'), 2008] =
            1.3 * sales[CV(product), 2001] );

      PRODUCT    YEAR  SALES
      ---------- ----- -------
      Y Box      2008  77078
      Bounce     2008  4888
      Mouse Pad  2008  4358
      3 rows selected.
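The contrast between `product IN (...)` and `FOR product IN (...)` can be sketched in Python. Both functions are hypothetical stand-ins for the rules above; the Bounce figure is the unrounded value quoted later in the deck, the Y Box figure is the rounded one from an earlier slide, and Mouse Pad is left out because its 2001 value is not shown.

```python
source = {("Bounce", 2001): 3760.51, ("Y Box", 2001): 59291}
products = ["Bounce", "Y Box"]


def symbolic_rule(cells, products, year):
    """sales[product IN (...), 2008]: the IN condition makes the rule update-only."""
    touched = []
    for (product, y) in list(cells):
        if product in products and y == year:
            cells[(product, y)] = 1.3 * cells[(product, 2001)]
            touched.append((product, y))
    return touched


def for_rule(cells, products, year):
    """sales[FOR product IN (...), 2008]: expands to one positional rule per product."""
    touched = []
    for product in products:
        cells[(product, year)] = 1.3 * cells[(product, 2001)]
        touched.append((product, year))
    return touched


symbolic_cells = dict(source)
symbolic_touched = symbolic_rule(symbolic_cells, products, 2008)

for_cells = dict(source)
for_touched = for_rule(for_cells, products, 2008)
```

With no 2008 cells to match, the symbolic rule touches nothing (the "no rows selected" case), while the FOR version creates one new cell per product.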
  • Constants, Cell References, Queries, et al…
      RULES UPSERT ( -- Constants
        sales[FOR product IN ('Mouse Pad', 'Bounce', 'Y Box'), 2002] =
          1.3 * sales[CV(product), CV(year)-1] );

      RULES UPSERT ( -- Cell References
        sales[FOR (product, year) IN (
                ('Mouse Pad', 2002), ('Bounce',2002), ('Y Box',2002))] =
          1.3 * sales[CV(product), CV(year)-1] );

      RULES UPSERT ( -- Queries
        sales[FOR product IN (SELECT product FROM my_products)
             ,FOR year IN (2002)] =
          1.3 * sales[CV(product), CV(year)-1] );

      PRODUCT    YEAR  SALES
      ---------- ----- -------
      Y Box      2002  77078
      Bounce     2002  4888
      Mouse Pad  2002  4358
  • et allia – en trappe!
      RULES UPSERT SEQUENTIAL ORDER (
        sales['Bounce', FOR year FROM 2004 TO 2001 DECREMENT 1] =
          1.3 * sales[CV(), CV()-1] )

      PRODUCT  YEAR  SALES
      -------- ----- -------
      Bounce   2001  4097
      Bounce   2002  4889
      Bounce   2003
      Bounce   2004
      4 rows selected.
  • Ordering FOR loops
      RULES UPSERT AUTOMATIC ORDER (
        sales['Bounce', FOR year FROM 2004 TO 2001 DECREMENT 1] =
          1.3 * sales[CV(), CV()-1] )

      PRODUCT  YEAR  SALES
      -------- ----- -------
      Bounce   2001  4097
      Bounce   2002  5326
      Bounce   2003  6924
      Bounce   2004  9001
      4 rows selected.
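The effect of evaluation order on this FOR loop can be sketched in Python. The sketch is an emulation with assumed names: sequential order runs the generated rules exactly as the loop emits them (2004 down to 2001), while the stand-in for automatic order runs a rule only after the rule that writes the cell it reads. The Bounce figures are the unrounded values quoted elsewhere in the deck.

```python
base = {2000: 3151.35, 2001: 3760.51}   # Bounce sales before the rules run

# FOR year FROM 2004 TO 2001 DECREMENT 1 generates the rules in this order.
rule_years = [2004, 2003, 2002, 2001]


def run(order):
    """Apply sales[year] = 1.3 * sales[year - 1] for each year in the given order."""
    sales = dict(base)
    for year in order:
        previous = sales.get(year - 1)            # missing cell reads as NULL
        sales[year] = None if previous is None else 1.3 * previous
    return sales


def dependency_order(targets):
    """Order rules so each runs after the rule producing the cell it reads."""
    ordered, pending = [], set(targets)
    while pending:
        ready = sorted(y for y in pending if y - 1 not in pending)
        ordered.extend(ready)
        pending.difference_update(ready)
    return ordered


sequential = run(rule_years)
automatic = run(dependency_order(rule_years))
```

Rounded to whole numbers, `sequential` gives 4097 and 4889 for 2001 and 2002 and nothing for 2003 and 2004, while `automatic` gives 4097, 5326, 6924 and 9001, matching the two slides.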
  • ORDER BY in rules
      RULES SEQUENTIAL ORDER
        (sales[ANY, ANY] = sales[cv(),cv()-1] )
      ORA-32637: Self cyclic rule in sequential order MODEL

      RULES
        (sales[ANY, ANY] ORDER BY YEAR ASC = sales[cv(),cv()-1])

      PRODUCT   YEAR  SALES        NEW_SALES
      --------- ----- ------------ ----------
      Bounce    1999  1,878        0
      Bounce    2000  3,151        0
      Bounce    2001  3,761        0

      RULES
        (sales[ANY, ANY] ORDER BY YEAR DESC = sales[cv(),cv()-1])

      PRODUCT   YEAR  SALES        NEW_SALES
      --------- ----- ------------ ----------
      Bounce    1999  1,878        0
      Bounce    2000  3,151        1877.67
      Bounce    2001  3,761        3151.35
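Why ascending order wipes out the values while descending order works can be sketched in Python. The function below is a simplified emulation for a single product; treating a missing previous-year cell as 0 is an assumption chosen to match the NEW_SALES column on the slide.

```python
original = {1999: 1877.67, 2000: 3151.35, 2001: 3760.51}   # Bounce sales


def shift_previous_year(cells, descending):
    """Emulate sales[ANY] ORDER BY year ASC|DESC = sales[CV() - 1] on a copy."""
    sales = dict(cells)
    for year in sorted(cells, reverse=descending):
        # Reads the *current* state, so earlier assignments are visible here.
        sales[year] = sales.get(year - 1, 0)
    return sales


ascending = shift_previous_year(original, descending=False)
descending = shift_previous_year(original, descending=True)
```

Going up the years, 1999 is overwritten first and that overwritten value is then copied forward, so every cell ends up 0. Going down, each cell reads its neighbour before the neighbour is changed.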
  • NULL Measures
    • Cells that exist in array but are NULL
    • Cells not in the array at all (treated as NULL)
    <prior clauses of SELECT statements>
    MODEL [main]
      [RETURN {ALL| UPDATED} ROWS]
      [reference models]
      [PARTITION BY (<cols>)]
      DIMENSION BY (<cols>)
      MEASURES (<cols>)
      [IGNORE NAV] | [KEEP NAV]
      [RULES...
  • KEEP NAV SELECT product, year, sales FROM sales_view WHERE country = 'Australia' MODEL RETURN UPDATED ROWS DIMENSION BY (product, year) MEASURES (sales) KEEP NAV RULES UPSERT (sales['Widget', 2003] = sales['Bounce', 2002] + sales['Bounce', 2001] ,sales['Widget', 2002] = sales['Bounce', 2002] ,sales['Widget', 2001] = sales['Bounce', 2001]); PRODUCT YEAR SALES -------- ----- ---------- Widget 2001 3760.51 Widget 2002 Widget 2003 3 rows selected.
  • IGNORE NAV SELECT product, year, sales FROM sales_view WHERE country = 'Australia' MODEL RETURN UPDATED ROWS DIMENSION BY (product, year) MEASURES (sales) IGNORE NAV RULES UPSERT (sales['Widget', 2003] = sales['Bounce', 2002] + sales['Bounce', 2001] ,sales['Widget', 2002] = sales['Bounce', 2002] ,sales['Widget', 2001] = sales['Bounce', 2001]); PRODUCT YEAR SALES -------- ----- ---------- Widget 2001 3760.51 Widget 2002 0 Widget 2003 3760.51 3 rows selected.
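The two NAV slides differ only in what a reference to the unavailable cell sales['Bounce', 2002] returns. A minimal Python sketch of that difference, using the one known value from the slides; None stands in for NULL and the helper names are illustrative only.

```python
# 'Bounce' has data for 2001 only; 2002 is not available (NAV).
BOUNCE = {2001: 3760.51}

def cell(year, ignore_nav):
    value = BOUNCE.get(year)
    if value is None and ignore_nav:
        return 0                       # IGNORE NAV: numeric default
    return value                       # KEEP NAV: stays NULL

def add(a, b):
    # NULL propagates through arithmetic, as in SQL.
    return None if a is None or b is None else a + b

def widget(ignore_nav):
    return {
        2001: cell(2001, ignore_nav),
        2002: cell(2002, ignore_nav),
        2003: add(cell(2002, ignore_nav), cell(2001, ignore_nav)),
    }

keep = widget(ignore_nav=False)
ignore = widget(ignore_nav=True)
print(keep)
print(ignore)
```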
  • NULL Defaults (values returned for NULL or missing cells under IGNORE NAV)
    • 0 for numeric data
    • Empty string '' for character/string data
    • '01-JAN-2001' for date type data
    • NULL for all other data types
  • NULL Cell Reference
    • Positional
      – sales [ANY]
      – sales [NULL]
    • Symbolic
      – sales [product IS ANY]
      – sales [product IS NULL]
      – sales [product = NULL] -- FALSE
  • PRESENTV, PRESENTNNV SELECT product, year, sales FROM sales_view WHERE country = 'Australia' MODEL RETURN UPDATED ROWS DIMENSION BY (product, year) MEASURES (sales) --IGNORE NAV -- Makes no difference RULES UPSERT (sales['Widget', 2003] = sales['Bounce', 2002] + sales['Bounce', 2001] ,sales['Widget', 2002] = PRESENTV(sales['Bounce', 2002],1,0) ,sales['Widget', 2001] = PRESENTNNV(sales['Bounce', 2001],1,0)); PRODUCT YEAR SALES -------- ----- ---------- Widget 2001 1 Widget 2002 0 Widget 2003 3760.51 3 rows selected.
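PRESENTV tests whether a cell existed before the model ran; PRESENTNNV additionally requires it to be non-NULL. A small Python sketch of that distinction follows. The ('Demo', 2001) cell is a hypothetical NULL-valued cell added only to show the case where the two functions disagree.

```python
# Snapshot of the cells as they were before any rule ran.
before = {
    ("Bounce", 2001): 3760.51,   # exists, not NULL (from the slides)
    ("Demo", 2001): None,        # hypothetical: exists but is NULL
}                                # ('Bounce', 2002) does not exist at all

def presentv(cells, key, if_present, otherwise):
    """Cell existed before the model ran (NULL or not)."""
    return if_present if key in cells else otherwise

def presentnnv(cells, key, if_present, otherwise):
    """Cell existed before the model ran and was not NULL."""
    return if_present if cells.get(key) is not None else otherwise

widget_2002 = presentv(before, ("Bounce", 2002), 1, 0)
widget_2001 = presentnnv(before, ("Bounce", 2001), 1, 0)
demo_v = presentv(before, ("Demo", 2001), 1, 0)
demo_nnv = presentnnv(before, ("Demo", 2001), 1, 0)
print(widget_2001, widget_2002, demo_v, demo_nnv)
```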
  • Reference Models scott@sw10g> select * from exch_rates; COUNTRY EXCHANGE_RATE ---------------- ------------- Poland .25 France .14 Australia .93 New Zealand 1.12 United Kingdom .45 5 rows selected. … COUNTRY YEAR SALES DOLLAR_SALES ---------------- ----- ------------ ------------ Australia 2001 1,164,871 1,083,330 France 2001 1,025,157 143,522 United Kingdom 2001 1,674,752 753,638
  • Reference Models <prior clauses of SELECT statements> MODEL [main] [RETURN {ALL|UPDATED} ROWS] [reference models] [PARTITION BY (<cols>)] DIMENSION BY (<cols>) MEASURES (<cols>) [IGNORE NAV] | [KEEP NAV] [RULES [UPSERT | UPDATE] [AUTOMATIC ORDER | SEQUENTIAL ORDER] [ITERATE (n) [UNTIL <condition>] ] ( <cell_assignment> = <expression> ... )
  • Reference Models SELECT country, year, ROUND(sales) sales, ROUND(dollar_sales) dollar_sales FROM sales_view GROUP BY country, year MODEL RETURN UPDATED ROWS REFERENCE conv_ref ON (SELECT country, exchange_rate FROM exch_rates) DIMENSION BY (country) MEASURES (exchange_rate) MAIN conversion DIMENSION BY (country, year) MEASURES (SUM(sales) sales, SUM(sales) dollar_sales) RULES (dollar_sales[FOR conversion.country IN ('France','Australia','United Kingdom'), 2001] = sales[CV(country), CV(year)] * 1.0 * conv_ref.exchange_rate[CV(country)]); COUNTRY YEAR SALES DOLLAR_SALES ---------------- ----- ------------ ------------ Australia 2001 1,164,871 1,083,330 France 2001 1,025,157 143,522 United Kingdom 2001 1,674,752 753,638 3 rows selected.
  • Reference Models (same query, reference model parts highlighted)
    REFERENCE conv_ref ON
      (SELECT country, exchange_rate FROM exch_rates)
      DIMENSION BY (country) MEASURES (exchange_rate)
    MAIN conversion
    ...
      * conv_ref.exchange_rate[CV(country)]
  • Reference Models (same query, main model parts highlighted)
    SELECT country, year,
           ROUND(sales) sales, ROUND(dollar_sales) dollar_sales
    FROM sales_view
    GROUP BY country, year
    MODEL RETURN UPDATED ROWS
    ...
      DIMENSION BY (country, year)
      MEASURES (SUM(sales) sales, SUM(sales) dollar_sales)
      RULES
    ...
    Callout: Australia 0.93
  • Eh? SELECT v.country, v.year ,ROUND(SUM(sales)) sales ,ROUND(SUM(v.sales * r.exchange_rate)) dollar_sales FROM sales_view v, exch_rates r WHERE v.country IN ('France', 'Australia', 'United Kingdom') AND v.year = 2001 AND v.country = r.country GROUP BY v.country, v.year ORDER BY v.country; COUNTRY YEAR SALES DOLLAR_SALES ---------------- ----- ------------ ------------ Australia 2001 1,164,871 1,083,330 France 2001 1,025,157 143,522 United Kingdom 2001 1,674,752 753,638 3 rows selected.
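Both the reference model and the plain join boil down to a keyed lookup into the exchange-rate table. A minimal Python sketch using the figures shown on the slides; it starts from the already rounded SALES column, which is an assumption made here for illustration (the queries multiply the unrounded sums).

```python
# exch_rates, as listed on the slide
rates = {
    "Poland": 0.25,
    "France": 0.14,
    "Australia": 0.93,
    "New Zealand": 1.12,
    "United Kingdom": 0.45,
}

# 2001 sales per country, taken from the rounded SALES column
sales_2001 = {
    "Australia": 1164871,
    "France": 1025157,
    "United Kingdom": 1674752,
}

# dollar_sales[country, 2001] = sales[country, 2001] * conv_ref.exchange_rate[country]
dollar_sales = {country: round(amount * rates[country])
                for country, amount in sales_2001.items()}
print(dollar_sales)
```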
  • Amortization
    SELECT c, p, m, pp, ip
    FROM mortgage
    MODEL
      REFERENCE R ON
        (SELECT customer, fact, amt
         FROM mortgage_facts
         MODEL DIMENSION BY (customer, fact) MEASURES (amount amt)
         RULES
           (amt[any, 'PaymentAmt'] =
              (amt[CV(),'Loan'] *
               Power(1 + (amt[CV(),'Annual_Interest']/100/12), amt[CV(),'Payments']) *
               (amt[CV(),'Annual_Interest']/100/12)) /
              (Power(1 + (amt[CV(),'Annual_Interest']/100/12), amt[CV(),'Payments']) - 1)))
        DIMENSION BY (customer cust, fact) MEASURES (amt)
      MAIN amortization
        PARTITION BY (customer c)
        DIMENSION BY (0 p)
        MEASURES (principalp pp, interestp ip, mort_balance m, customer mc)
        RULES ITERATE(1000) UNTIL (ITERATION_NUMBER+1 = r.amt[mc[0],'Payments'])
          (ip[ITERATION_NUMBER+1] = m[CV()-1] * r.amt[mc[0], 'Annual_Interest']/1200
          ,pp[ITERATION_NUMBER+1] = r.amt[mc[0], 'PaymentAmt'] - ip[CV()]
          ,m[ITERATION_NUMBER+1]  = m[CV()-1] - pp[CV()])
    ORDER BY c, p;
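The arithmetic inside the amortization query is easier to follow outside SQL. Below is a minimal Python sketch of the same formulas: the payment amount from the reference model and the per-iteration interest, principal and balance from the main model. The loan figures are hypothetical and a non-zero interest rate is assumed.

```python
def payment(loan, annual_interest_pct, payments):
    """amt[any, 'PaymentAmt'] from the reference model (rate must be non-zero)."""
    r = annual_interest_pct / 100 / 12          # monthly rate
    growth = (1 + r) ** payments
    return loan * growth * r / (growth - 1)

def schedule(loan, annual_interest_pct, payments):
    """One row per iteration: (p, balance m, principal pp, interest ip)."""
    pay = payment(loan, annual_interest_pct, payments)
    balance = loan                               # m[0]
    rows = []
    for p in range(1, payments + 1):             # ITERATE ... UNTIL p = Payments
        interest = balance * annual_interest_pct / 1200   # ip[p] = m[p-1] * rate/1200
        principal = pay - interest                        # pp[p] = PaymentAmt - ip[p]
        balance = balance - principal                     # m[p]  = m[p-1] - pp[p]
        rows.append((p, balance, principal, interest))
    return rows

# Hypothetical loan: 100,000 at 12% a year over 12 monthly payments.
monthly = payment(100000, 12, 12)
rows = schedule(100000, 12, 12)
for p, m, pp, ip in rows:
    print(p, round(m, 2), round(pp, 2), round(ip, 2))
```

The balance after the last payment comes out at zero (up to floating-point error), which is a quick sanity check on the payment formula.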
  • Reference models are read only
    SELECT country, year,
           ROUND(sales) sales, ROUND(dollar_sales) dollar_sales
    FROM sales_view
    GROUP BY country, year
    MODEL RETURN UPDATED ROWS
      REFERENCE conv_ref ON
        (SELECT country, exchange_rate FROM exch_rates)
        DIMENSION BY (country) MEASURES (exchange_rate) IGNORE NAV
      MAIN conversion
        DIMENSION BY (country, year)
        MEASURES (SUM(sales) sales, SUM(sales) dollar_sales) IGNORE NAV
        RULES (conv_ref.country['Australia'] = 0.97);

    ERROR at line 1:
    ORA-03113: end-of-file on communication channel
  • Aggregate can’t be in SELECT/ORDER BY
    SELECT country, year, SUM(sales) sales
    FROM sales_view
    GROUP BY country, year
    MODEL RETURN UPDATED ROWS
      MAIN conversion
        DIMENSION BY (country, year)
        MEASURES (SUM(sales) sales) IGNORE NAV
        RULES (sales[FOR country IN ('France','Australia','United Kingdom'), 2002]
                 = sales[CV(country), CV()-1] * 1.2);

    ERROR at line 1:
    ORA-00934: group function is not allowed here
  • Sub-queries
    SELECT * FROM sales_view
    WHERE country = 'Australia'
    MODEL
      DIMENSION BY (product, year)
      MEASURES (sales)
      RULES UPSERT
        (sales['Bounce', 2003] = sales[CV(), CV()-1]
           + (SELECT SUM(sales) FROM sales_view));

    ERROR at line 1:
    ORA-32620: illegal subquery within MODEL rules

    MEASURES (sales, (SELECT SUM(sales) FROM sales_view) AS grand_total)
    RULES UPSERT
      (sales['Bounce', 2003] = sales[CV(), CV()-1]
         + grand_total[CV(), CV()-1]);
  • FOR Construct Limitations
    RULES UPSERT
      (sales[FOR product IN (
                 WITH kludge_with AS (SELECT prod_name FROM sh.products)
                 SELECT prod_name FROM kludge_with)
            , 2003] = sales[CV(), CV()-1]);

    ERROR at line 1:
    ORA-32632: incorrect subquery in MODEL FOR cell index

    RULES UPSERT
      (sales[FOR cust_id IN (SELECT cust_id FROM sh.customers), 2003]
         = sales[CV(), CV()-1]);

    ERROR at line 1:
    ORA-32633: MODEL subquery FOR cell index returns too many rows
  • ITERATE UNTIL
    SELECT id, n
    FROM all_objects
    WHERE object_id=1
    MODEL DIMENSION BY (object_id id) MEASURES (object_name n)
    RULES ITERATE (:v)
      (n[iteration_number] =iteration_number)

    ERROR at line 5:
    ORA-32607: invalid ITERATE value in MODEL clause

    SELECT id, n
    FROM all_objects
    WHERE object_id=1
    MODEL DIMENSION BY (object_id id) MEASURES (object_name n)
    RULES ITERATE (100) UNTIL :v
      (n[iteration_number] =iteration_number)
  • Performance
    • Replaces multiple joins/unions
    • Scalable in size & parallelism
    • Explain plan
  • Fill Dates scott@sw10g> select * from customer; NAME AMT DT ------ ---------- ----------- Scott 117 17-jan-2008 Scott 250 17-feb-2008 Scott 300 17-apr-2008 Scott 50 17-jun-2008 Wade 1231 17-mar-2008 Wade 2321 17-apr-2008 Wade 3122 17-sep-2008 Wade 59 17-oct-2008 8 rows selected.
  • Fill Dates NAME MTH AMT CUM_AMT ------ --------- ---------- ---------- Scott January 117 117 Scott February 250 367 Scott March 0 367 Scott April 300 667 Scott May 0 667 Scott June 50 717 Scott July 0 717 Scott August 0 717 Scott September 0 717 Scott October 0 717 Scott November 0 717 Scott December 0 717 Wade January 0 0 Wade February 0 0 Wade March 1231 1231 Wade April 2321 3552 Wade May 0 3552 Wade June 0 3552 Wade July 0 3552 Wade August 0 3552 Wade September 3122 6674 Wade October 59 6733 Wade November 0 6733 Wade December 0 6733 24 rows selected. NAME AMT DT ------ ---------- ----------- Scott 117 17-jan-2007 Scott 250 17-feb-2007 Scott 300 17-apr-2007 Scott 50 17-jun-2007 Wade 1231 17-mar-2007 Wade 2321 17-apr-2007 Wade 3122 17-sep-2007 Wade 59 17-oct-2007 8 rows selected.
  • Analytic Solution
    SELECT date_fill.name, TO_CHAR(real_dt,'Month') mth, NVL(amt,0) amt
          ,NVL(SUM(amt) OVER (PARTITION BY date_fill.name
                              ORDER BY real_dt ),0) cum_amt
    FROM
      -- Actual data
      (SELECT name, TRUNC(dt,'mm') dt, SUM(amt) amt
       FROM customer
       GROUP BY name, TRUNC(dt,'mm')
      ) actual_data,
      -- Distinct list joined with conjured dates
      (SELECT name, real_dt
       FROM (SELECT DISTINCT name FROM customer)
           ,(WITH mths AS (SELECT TRUNC(SYSDATE,'YYYY') real_dt
                           FROM DUAL CONNECT BY LEVEL <= 12)
             SELECT ADD_MONTHS(real_dt,ROWNUM-1) real_dt
             FROM mths)
      ) date_fill
    -- Outer join actual data with full date list
    WHERE date_fill.real_dt = actual_data.dt(+)
    AND   date_fill.name    = actual_data.name(+)
    ORDER BY date_fill.name, date_fill.real_dt
    /
  • Analytic Explain
    Execution Plan
    ----------------------------------------------------------
     0      SELECT STATEMENT Optimizer=ALL_ROWS (Cost=13 Card=8 Bytes=31
     1    0   SORT (ORDER BY) (Cost=13 Card=8 Bytes=312)
     2    1     WINDOW (SORT) (Cost=13 Card=8 Bytes=312)
     3    2       HASH JOIN (OUTER) (Cost=11 Card=8 Bytes=312)
     4    3         VIEW (Cost=6 Card=8 Bytes=104)
     5    4           MERGE JOIN (CARTESIAN) (Cost=6 Card=8 Bytes=104)
     6    5             VIEW (Cost=2 Card=1 Bytes=6)
     7    6               COUNT
     8    7                 VIEW (Cost=2 Card=1 Bytes=6)
     9    8                   CONNECT BY (WITHOUT FILTERING)
    10    9                     FAST DUAL (Cost=2 Card=1)
    11    5             BUFFER (SORT) (Cost=6 Card=8 Bytes=56)
    12   11               VIEW (Cost=4 Card=8 Bytes=56)
    13   12                 SORT (UNIQUE) (Cost=4 Card=8 Bytes=56)
    14   13                   TABLE ACCESS (FULL) OF 'CUSTOMER' (TABLE)
    15    3         VIEW (Cost=4 Card=8 Bytes=208)
    16   15           SORT (GROUP BY) (Cost=4 Card=8 Bytes=232)
    17   16             TABLE ACCESS (FULL) OF 'CUSTOMER' (TABLE) (Cost=

    Statistics
    ----------------------------------------------------
     9 recursive calls
    30 consistent gets
     6 sorts (memory)
  • Model Solution
    SELECT name, TO_CHAR(dt,'DD-MM-YYYY') dt, amt, cum_amt -- Model results
    FROM (
      -- Essentially apply model to this data set
      SELECT name, TRUNC(dt, 'MM') dt, SUM(amt) amt
      FROM customer
      GROUP BY name, TRUNC(dt, 'MM')
    )
    MODEL
      PARTITION BY (name)
      DIMENSION BY (dt)
      MEASURES (amt, cast(NULL AS NUMBER) cum_amt) -- Define calculated col
      IGNORE NAV
      RULES SEQUENTIAL ORDER(
        -- Conjure dates
        amt[FOR dt FROM TO_DATE('01-01-2007', 'DD-MM-YYYY')
                   TO   TO_DATE('01-12-2007', 'DD-MM-YYYY')
                   INCREMENT NUMTOYMINTERVAL(1, 'MONTH')
           ] = amt[CV(dt)]                    -- Apply amt for given date, if found
       ,cum_amt[ANY] = SUM(amt)[dt <= CV(dt)] -- Calculate cumulative
      )
    ORDER BY name, dt
    /
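What the Model solution computes per customer (one cell per month, 0 where no row exists, then a running total) can be sketched in a few lines of Python. This is an emulation of the logic only, using the 2007 rows from the Fill Dates slide; the function name `fill_dates` is made up for illustration.

```python
from datetime import date

# customer table, 2007 version from the Fill Dates slide
rows = [
    ("Scott", 117, date(2007, 1, 17)), ("Scott", 250, date(2007, 2, 17)),
    ("Scott", 300, date(2007, 4, 17)), ("Scott", 50, date(2007, 6, 17)),
    ("Wade", 1231, date(2007, 3, 17)), ("Wade", 2321, date(2007, 4, 17)),
    ("Wade", 3122, date(2007, 9, 17)), ("Wade", 59, date(2007, 10, 17)),
]

def fill_dates(rows, year):
    # Inner query: SUM(amt) grouped by name and month.
    monthly = {}
    for name, amt, dt in rows:
        if dt.year == year:
            key = (name, dt.month)
            monthly[key] = monthly.get(key, 0) + amt

    result = {}
    for name in sorted({r[0] for r in rows}):        # PARTITION BY (name)
        running = 0
        filled = []
        for month in range(1, 13):                   # FOR dt FROM ... TO ... INCREMENT 1 month
            amt = monthly.get((name, month), 0)      # IGNORE NAV: missing month -> 0
            running += amt                           # SUM(amt)[dt <= CV(dt)]
            filled.append((month, amt, running))
        result[name] = filled
    return result

filled = fill_dates(rows, 2007)
for name, months in filled.items():
    for month, amt, cum_amt in months:
        print(name, month, amt, cum_amt)
```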
  • Model Explain
    Execution Plan
    ----------------------------------------------------------
    0      SELECT STATEMENT Optimizer=ALL_ROWS (Cost=5 Card=8 Bytes=232
    1    0   SORT (ORDER BY) (Cost=5 Card=8 Bytes=232)
    2    1     SQL MODEL (ORDERED) (Cost=5 Card=8 Bytes=232)
    3    2       SORT (GROUP BY) (Cost=5 Card=8 Bytes=232)
    4    3         TABLE ACCESS (FULL) OF 'CUSTOMER' (TABLE) (Cost=3 Ca
    5    2       WINDOW (IN SQL MODEL) (SORT)

    Statistics
    ----------------------------------------------------
     5 recursive calls
    15 consistent gets
     4 sorts (memory)

    …and no joins
  • Also noteworthy
    cum_amt[any] = sum(amt)[dt <= cv(dt)] -- Sum dates before iteration date

    Execution Plan
    ----------------------------------------------------------
    0      SELECT STATEMENT Optimizer=ALL_ROWS (Cost=5 Card=8 Bytes=232
    1    0   SORT (ORDER BY) (Cost=5 Card=8 Bytes=232)
    2    1     SQL MODEL (ORDERED) (Cost=5 Card=8 Bytes=232)
    3    2       SORT (GROUP BY) (Cost=5 Card=8 Bytes=232)
    4    3         TABLE ACCESS (FULL) OF 'CUSTOMER' (TABLE) (Cost=3 Ca
    5    2       WINDOW (IN SQL MODEL) (SORT)

    cum_amt[any] = sum(amt)[any] -- Aggregate all dates, no window sort
  • 2^16 + 1
    SELECT COUNT(*) FROM (
      SELECT * FROM sales_mth_view
      MODEL RETURN ALL ROWS
        PARTITION BY (country)
        DIMENSION BY (product, year, month)
        MEASURES (sales)
        RULES UPSERT SEQUENTIAL ORDER
          (sales[ANY,ANY,ANY] = sales[CV(), CV(), CV()]*1.2
           -- Plus forecast sales...
          ,sales[FOR product IN (SELECT prod_name FROM sh.products)
                ,FOR year FROM 2008 TO 2015 INCREMENT 1
                ,FOR month FROM 1 TO 12 INCREMENT 1
                ] = sales[CV(), CV()-8, CV()]*1.5)
    )
    sw10g> /

      COUNT(*)
    ----------
        156157

    1 row selected.

    Try that in Excel…
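  • [Diagram: rule order and the resulting SQL MODEL plan operation. Sequential order: Ordered or Ordered Fast. Automatic order: Acyclic or Acyclic Fast when the rules are not cyclic, Cyclic when they are.]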
  • Sequential order: Ordered Fast
    SELECT country, product, year, sales
    FROM sales_view
    WHERE country IN ('Australia','Japan')
    MODEL UNIQUE DIMENSION
      PARTITION BY (country)
      DIMENSION BY (product, year)
      MEASURES (sales)
      RULES UPSERT SEQUENTIAL ORDER
        (sales['Bounce',2003] = AVG(sales)[ANY,2002] * 1.5
        ,sales['Y Box', 2003] = sales['Bounce',2003] * .25);

    Execution Plan
    ----------------------------------------------------------
     0      SELECT STATEMENT Optimizer=ALL_ROWS
     1    0   SQL MODEL (ORDERED FAST)
     2    1     SORT (GROUP BY)
     3    2       HASH JOIN
     4    3         TABLE ACCESS (FULL) OF 'TIMES' (TABL
     5    3         HASH JOIN
     6    5           TABLE ACCESS (FULL) OF 'PRODUCTS'
     7    5           HASH JOIN
     8    7             HASH JOIN
     9    8               TABLE ACCESS (FULL) OF 'COUNTR
    10    8               TABLE ACCESS (FULL) OF 'CUSTOM
    11    7             PARTITION RANGE (ALL)
    12   11               TABLE ACCESS (FULL) OF 'SALES'
  • Automatic order, cyclic rules: Cyclic
    SELECT country, product, year, sales
    FROM sales_view
    WHERE country in ('Australia','Japan')
    MODEL UNIQUE DIMENSION
      PARTITION BY (country)
      DIMENSION BY (product, year)
      MEASURES (sales) IGNORE NAV
      RULES UPSERT AUTOMATIC ORDER
        (sales['Y Box',2003] = 0.25 * sales['Bounce', 2003]
        ,sales['Bounce',2003] = sales['Y Box',2003] + sales['Bounce',2002]);

    Execution Plan
    ----------------------------------------------------------
     0      SELECT STATEMENT Optimizer=ALL_ROWS
     1    0   SQL MODEL (CYCLIC)
     2    1     SORT (GROUP BY)
     3    2       HASH JOIN
     4    3         TABLE ACCESS (FULL) OF 'TIMES' (TABLE)
     5    3         HASH JOIN
     6    5           TABLE ACCESS (FULL) OF 'PRODUCTS' (TABL
     7    5           HASH JOIN
     8    7             HASH JOIN
     9    8               TABLE ACCESS (FULL) OF 'COUNTRIES'
    ...
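Whether a rule set is cyclic is a property of its cell dependencies: each rule's target depends on the cells its right-hand side reads. The sketch below checks the two rule sets from these slides for a cycle with a depth-first search. It only illustrates the idea; it is not a description of how Oracle performs its analysis, and the ('*', 2002) node is a stand-in chosen here for the AVG(sales)[ANY,2002] reference.

```python
def has_cycle(deps):
    """deps maps a target cell to the set of cells its rule reads."""
    WHITE, GREY, BLACK = 0, 1, 2          # unvisited, on current path, finished
    state = {}

    def visit(node):
        state[node] = GREY
        for nxt in deps.get(node, ()):
            colour = state.get(nxt, WHITE)
            if colour == GREY:            # back edge: we returned to our own path
                return True
            if colour == WHITE and visit(nxt):
                return True
        state[node] = BLACK
        return False

    return any(state.get(n, WHITE) == WHITE and visit(n) for n in list(deps))

# sales['Bounce',2003] = AVG(sales)[ANY,2002] * 1.5
# sales['Y Box',2003]  = sales['Bounce',2003] * .25
first_rules = {
    ("Bounce", 2003): {("*", 2002)},
    ("Y Box", 2003): {("Bounce", 2003)},
}

# sales['Y Box',2003]  = 0.25 * sales['Bounce',2003]
# sales['Bounce',2003] = sales['Y Box',2003] + sales['Bounce',2002]
second_rules = {
    ("Y Box", 2003): {("Bounce", 2003)},
    ("Bounce", 2003): {("Y Box", 2003), ("Bounce", 2002)},
}

first_cyclic = has_cycle(first_rules)
second_cyclic = has_cycle(second_rules)
print(first_cyclic, second_cyclic)
```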
  • Why? When?
    1. New language?
    2. Array/Sets
    3. Recursion
    4. Flexibility & Power
    5. Can it be done in SQL?
  • sales[cv()-1]
    • DECODE
    • Analytics
    • Model
      – CV()
      – Reference Model
  • DECODE SELECT country ,SUM(DECODE(year,2007,sales,NULL)) current_year ,SUM(DECODE(year,2006,sales,NULL)) previous_year ,SUM(DECODE(year,2005,sales,NULL)) before_that FROM sales_view WHERE year BETWEEN 2005 AND 2007 GROUP BY country; COUNTRY CURRENT_YEAR PREVIOUS_YEAR BEFORE_THAT -------- ------------ ------------- ----------- Greece 27540 25067 20835 Italy 23738 20831 22867
  • DECODE SELECT country, year ,SUM(DECODE(year,2007,sales,NULL)) current_year ,SUM(DECODE(year,2006,sales,NULL)) previous_year ,SUM(DECODE(year,2005,sales,NULL)) before_that FROM sales_view WHERE year BETWEEN 2005 AND 2007 GROUP BY country, year; COUNTRY YEAR CURRENT_YEAR PREVIOUS_YEAR BEFORE_THAT -------- ----- ------------ ------------- ----------- Italy 2005 22867 Italy 2006 20831 Italy 2007 23738 Greece 2005 20835 Greece 2006 25067 Greece 2007 27540
  • Analytics SELECT country, year, current_year, previous_year, before_that FROM ( SELECT year, country, tot_sales current_year ,LAG(tot_sales) OVER (PARTITION BY country ORDER BY year) previous_year ,LAG(tot_sales,2) OVER (PARTITION BY country ORDER BY year) before_that FROM ( SELECT country ,year ,SUM(sales) tot_sales FROM sales_view WHERE year BETWEEN 2005 AND 2007 GROUP BY year, country) ) WHERE year = 2007; COUNTRY YEAR CURRENT_YEAR PREVIOUS_YEAR BEFORE_THAT ---------- ----- ------------ ------------- ----------- Greece 2007 27540 25067 20835 Italy 2007 23738 20831 22867
  • Analytics SELECT country, year, current_year, previous_year, before_that FROM ( SELECT year, country, tot_sales current_year ,LAG(tot_sales) OVER (PARTITION BY country ORDER BY year) previous_year ,LAG(tot_sales,2) OVER (PARTITION BY country ORDER BY year) before_that FROM ( SELECT country ,year ,SUM(sales) tot_sales FROM sales_view WHERE year BETWEEN 2005 AND 2007 GROUP BY year, country) ) WHERE year >= 2005; COUNTRY YEAR CURRENT_YEAR PREVIOUS_YEAR BEFORE_THAT ---------- ----- ------------ ------------- ----------- Greece 2005 20835 Greece 2006 25067 20835 Greece 2007 27540 25067 20835 Italy 2005 22867 Italy 2006 20831 22867 Italy 2007 23738 20831 22867
  • Model – CV() SELECT country, year, sales, previous_year, before_that FROM sales_view GROUP BY year, country MODEL RETURN UPDATED ROWS PARTITION BY (country) DIMENSION BY (year) MEASURES (SUM(sales) AS sales ,CAST(NULL AS NUMBER) previous_year ,CAST(NULL AS NUMBER) before_that) RULES ( previous_year[2007] = sales[CV()-1] ,before_that [2007] = sales[CV()-2] ); COUNTRY YEAR SALES PREVIOUS_YEAR BEFORE_THAT -------- ----- ------------ ------------- ----------- Greece 2007 27,540 25067 20835 Italy 2007 23,738 20831 22867
  • Model – CV() SELECT country, year, sales, previous_year, before_that FROM sales_view GROUP BY year, country MODEL RETURN UPDATED ROWS PARTITION BY (country) DIMENSION BY (year) MEASURES (SUM(sales) AS sales ,CAST(NULL AS NUMBER) previous_year ,CAST(NULL AS NUMBER) before_that) RULES ( previous_year[ANY] = sales[CV()-1] ,before_that [ANY] = sales[CV()-2] ); COUNTRY YEAR SALES PREVIOUS_YEAR BEFORE_THAT -------- ----- ------------ ------------- ----------- Greece 2005 20,835 Greece 2006 25,067 20835 Greece 2007 27,540 25067 20835 Italy 2005 22,867 Italy 2006 20,831 22867 Italy 2007 23,738 20831 22867
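The rules previous_year[ANY] = sales[CV()-1] and before_that[ANY] = sales[CV()-2] are keyed lookups relative to the current year, evaluated per country. A minimal Python sketch using the totals shown on the slides; None stands in for the blank (NULL) cells and the function name `lagged` is illustrative.

```python
# SUM(sales) per country and year, from the slide output
SALES = {
    "Greece": {2005: 20835, 2006: 25067, 2007: 27540},
    "Italy": {2005: 22867, 2006: 20831, 2007: 23738},
}

def lagged(by_year):
    """For each year: (sales, sales[CV()-1], sales[CV()-2]); missing cells are None."""
    return {year: (amount, by_year.get(year - 1), by_year.get(year - 2))
            for year, amount in by_year.items()}

# PARTITION BY (country): each country is processed independently.
result = {country: lagged(by_year) for country, by_year in SALES.items()}
for country, years in result.items():
    for year, (sales, previous_year, before_that) in sorted(years.items()):
        print(country, year, sales, previous_year, before_that)
```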
  • Model - Reference
    SELECT country, year, sales, prev, bef
    FROM sales_view
    GROUP BY country, year
    MODEL
      REFERENCE r ON
        (WITH years AS (SELECT TO_CHAR(SYSDATE,'YYYY')-LEVEL+1 y
                        FROM DUAL CONNECT BY LEVEL <=10)
         SELECT y, y-1 prev, y-2 bef FROM years)
        DIMENSION BY (y)
        MEASURES (prev, bef)
      MAIN calc
        PARTITION BY (country)
        DIMENSION BY (year)
        MEASURES (SUM(sales) AS sales
                 ,CAST(NULL AS NUMBER) prev, 0 bef)
        RULES (
          prev[ANY] = sales[r.prev[CV(year)]]
         ,bef [ANY] = sales[r.bef [CV(year)]]
        );
  • Give it time, it will grow on you…