
Advanced row pattern matching


Presentation made at #UKOUG_Tech17, #DOAG2016 and #OUGN17. Download to see animations!

Published in: Technology

  1. Meet Your Match: Advanced row pattern matching (12c)
     Stew Ashton, UKOUG Tech 17
     Can you read the following line? If not, please move closer. It's much better when you can read the code ;)
  2. Advanced usage, not all the syntax
     • Reminder of the basics
     • Exercises
     • Bin fitting
     • Positive and negative sequencing
     • Hierarchical summaries
       – Thanks, Kim Berg Hansen
     • Alternatives to inequality joining
       – Thanks, Jonathan Lewis
  3. Reminder: the Basics
     • To illustrate: table with PAGE column
       – Group consecutive pages together

     Input:
       PAGE
          1
          2
          3
          5

     Output:
       FIRSTPAGE  LASTPAGE  CNT
               1         3    3
               5         5    1
  4. Pattern and Matching Rows
     • PATTERN
       – Uninterrupted series of input rows
       – Described as list of conditions (≅ "regular expressions")
           PATTERN (A B*)
         "A" : 1 row, "B*" : 0 or more rows, as many as possible
     • DEFINE (at least one) row condition
           [A undefined = TRUE]
           B AS page = PREV(page)+1
     • Each series that matches the pattern is a "match"
       – "A" and "B" identify the rows that meet their conditions
       – There can be unmatched rows between series
  5. Input, Processing, Output
     1. Define input
     2. Order input
     3. Process pattern
     4. using defined conditions
     5. Output: rows per match
     6. Output: columns per row
     7. Go where after match?

       SELECT *
       FROM t
       MATCH_RECOGNIZE (
         ORDER BY page
         MEASURES A.page as firstpage,
                  LAST(page) as lastpage,
                  COUNT(*) cnt
         ONE ROW PER MATCH
         AFTER MATCH SKIP PAST LAST ROW
         PATTERN (A B*)
         DEFINE B AS page = PREV(page)+1
       );
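To make the processing order concrete, here is a minimal Python sketch of what the query above computes. It is only an emulation under simple assumptions (a plain list of page numbers, the hypothetical function name `group_consecutive`), not how Oracle executes MATCH_RECOGNIZE.

```python
def group_consecutive(pages):
    """Emulate PATTERN (A B*) with DEFINE B AS page = PREV(page)+1,
    ONE ROW PER MATCH, AFTER MATCH SKIP PAST LAST ROW."""
    rows = sorted(pages)                  # ORDER BY page
    matches = []
    i = 0
    while i < len(rows):
        first = last = rows[i]            # row A: no condition, always matches
        count = 1
        i += 1
        # rows B: as many consecutive pages as possible (greedy)
        while i < len(rows) and rows[i] == last + 1:
            last = rows[i]
            count += 1
            i += 1
        # one output row per match: firstpage, lastpage, cnt
        matches.append((first, last, count))
    return matches


print(group_consecutive([1, 2, 3, 5]))
```

With the PAGE values from slide 3, this returns one tuple per group of consecutive pages.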
  6. Which row do we mean?
                  DEFINE                 ALL ROWS PER MATCH               ONE ROW PER MATCH
     pg  id   first current last    first current last final-last    first current last final-last
      1  A      1      1     1        1      1     1       3
      2  B      1      2     2        1      2     2       3
      3  B      1      3     3        1      3     3       3           1      3     3       3
      5  B?     1      5     5
     • Column name by itself = "current" row
     • DEFINE: row being evaluated; ALL ROWS: each row; ONE ROW: last row
  7. Exercise: what output from this input?
       CUST_ID  TX_DATE     DESCR
       C001     2016-01-01  Inquiry
       C001     2016-01-01  Inquiry
       C001     2016-01-10  Sales
       C001     2016-01-21  Repeat Inquiry
       C001     2016-02-10  Repeat Inquiry
       C001     2016-05-01  Sales
       C001     2016-05-06  Sales
       C001     2016-06-10  Inquiry 1
       C001     2016-09-01  Inquiry 2
       C002     2016-02-01  Inquiry 1
       C002     2016-02-25  Inquiry 2
       C003     2016-02-01  Inquiry 2
       C003     2016-02-10  Sales
       C003     2016-02-10  Sales
       C003     2016-03-10  Inquiry 2
       C004     2016-04-15  Sales

       select * from t
       match_recognize(
         all rows per match
         pattern (a*)
         define a as 1=1
       );
  10. Add sequence number, starting over after 40 days
      Input as on slide 7. The query is built up in three steps.

      Step 1: partition and order the input (condition still to be written)
        select * from t
        match_recognize(
          partition by cust_id
          order by tx_date, descr
          all rows per match
          pattern (a*)
          define a as
        );

      Step 2: add the condition
        select * from t
        match_recognize(
          partition by cust_id
          order by tx_date, descr
          all rows per match
          pattern (a*)
          define a as tx_date <= first(tx_date) + 40
        );

      Step 3: add the sequence number as a measure
        select * from t
        match_recognize(
          partition by cust_id
          order by tx_date, descr
          measures count(*) as seq
          all rows per match
          pattern (a*)
          define a as tx_date <= first(tx_date) + 40
        );
  11. Add sequence number, starting over after 40 days
        CUST_ID  TX_DATE     DESCR           SEQ
        C001     2016-01-01  Inquiry           1
        C001     2016-01-01  Inquiry           2
        C001     2016-01-10  Sales             3
        C001     2016-01-21  Repeat Inquiry    4
        C001     2016-02-10  Repeat Inquiry    5
        C001     2016-05-01  Sales             1
        C001     2016-05-06  Sales             2
        C001     2016-06-10  Inquiry 1         3
        C001     2016-09-01  Inquiry 2         1
        C002     2016-02-01  Inquiry 1         1
        C002     2016-02-25  Inquiry 2         2
        C003     2016-02-01  Inquiry 2         1
        C003     2016-02-10  Sales             2
        C003     2016-02-10  Sales             3
        C003     2016-03-10  Inquiry 2         4
        C004     2016-04-15  Sales             1

        select * from t
        match_recognize(
          partition by cust_id
          order by tx_date, descr
          measures count(*) as seq
          all rows per match
          pattern (a*)
          define a as tx_date <= first(tx_date) + 40
        );
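The 40-day restart rule can be checked outside the database. The following Python sketch emulates the query above on the slide data; the helper names `parse` and `add_seq` are hypothetical, and the loop is a simplification that relies on the pattern being a single greedy `a*`.

```python
from datetime import date, timedelta
from itertools import groupby

# Slide data: cust_id, tx_date, descr
DATA = """\
C001 2016-01-01 Inquiry
C001 2016-01-01 Inquiry
C001 2016-01-10 Sales
C001 2016-01-21 Repeat Inquiry
C001 2016-02-10 Repeat Inquiry
C001 2016-05-01 Sales
C001 2016-05-06 Sales
C001 2016-06-10 Inquiry 1
C001 2016-09-01 Inquiry 2
C002 2016-02-01 Inquiry 1
C002 2016-02-25 Inquiry 2
C003 2016-02-01 Inquiry 2
C003 2016-02-10 Sales
C003 2016-02-10 Sales
C003 2016-03-10 Inquiry 2
C004 2016-04-15 Sales
"""


def parse(text):
    rows = []
    for line in text.strip().splitlines():
        cust_id, tx_date, descr = line.split(maxsplit=2)
        rows.append((cust_id, date.fromisoformat(tx_date), descr))
    return rows


def add_seq(rows, days=40):
    """Number rows per customer, restarting when a row is more than
    `days` after the first row of the current match."""
    out = []
    ordered = sorted(rows)        # partition by cust_id, order by tx_date, descr
    for _, partition in groupby(ordered, key=lambda r: r[0]):
        first_date, seq = None, 0
        for cust_id, tx_date, descr in partition:
            # DEFINE a AS tx_date <= FIRST(tx_date) + 40 fails: start a new match
            if first_date is None or tx_date > first_date + timedelta(days=days):
                first_date, seq = tx_date, 0
            seq += 1              # running COUNT(*) within the match
            out.append((cust_id, tx_date, descr, seq))
    return out


seqs = [row[3] for row in add_seq(parse(DATA))]
print(seqs)
```

The printed list can be compared line by line with the SEQ column on the slide.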
  12. Sequence starts from First Sale, Inquiry outside 40 days = 0
      Starting point: the slide 11 output and query, where SEQ restarts 40 days after the first row of each match.
  14. Sequence starts from Sale, Inquiry outside 40 days = 0
      Input and SEQ values still as on slide 11. Changes to the query:
        measures count(*) - count(inq.*) as seq
        pattern (inq* sale1{0,1} more_tx*)
        define inq as descr != 'Sales',
               sale1 as descr = 'Sales',
               more_tx as tx_date <= sale1.tx_date + 40
  15. Sequence starts from Sale, Inquiry outside 40 days = 0
        CUST_ID  TX_DATE     DESCR           SEQ
        C001     2016-01-01  Inquiry           0
        C001     2016-01-01  Inquiry           0
        C001     2016-01-10  Sales             1
        C001     2016-01-21  Repeat Inquiry    2
        C001     2016-02-10  Repeat Inquiry    3
        C001     2016-05-01  Sales             1
        C001     2016-05-06  Sales             2
        C001     2016-06-10  Inquiry 1         3
        C001     2016-09-01  Inquiry 2         0
        C002     2016-02-01  Inquiry 1         0
        C002     2016-02-25  Inquiry 2         0
        C003     2016-02-01  Inquiry 2         0
        C003     2016-02-10  Sales             1
        C003     2016-02-10  Sales             2
        C003     2016-03-10  Inquiry 2         3
        C004     2016-04-15  Sales             1

        select * from t
        match_recognize(
          partition by cust_id
          order by tx_date, descr
          measures count(*) - count(inq.*) as seq
          all rows per match
          pattern (inq* sale1{0,1} more_tx*)
          define inq as descr != 'Sales',
                 sale1 as descr = 'Sales',
                 more_tx as tx_date <= sale1.tx_date + 40
        );
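As a cross-check of the pattern `(inq* sale1{0,1} more_tx*)`, here is a Python sketch that walks each customer's rows in the same three phases. It is an illustration under assumptions: the names `parse` and `seq_from_sale` are hypothetical, and the greedy loop works here only because every row is either a sale or not, so no backtracking is ever needed.

```python
from datetime import date, timedelta
from itertools import groupby

DATA = """\
C001 2016-01-01 Inquiry
C001 2016-01-01 Inquiry
C001 2016-01-10 Sales
C001 2016-01-21 Repeat Inquiry
C001 2016-02-10 Repeat Inquiry
C001 2016-05-01 Sales
C001 2016-05-06 Sales
C001 2016-06-10 Inquiry 1
C001 2016-09-01 Inquiry 2
C002 2016-02-01 Inquiry 1
C002 2016-02-25 Inquiry 2
C003 2016-02-01 Inquiry 2
C003 2016-02-10 Sales
C003 2016-02-10 Sales
C003 2016-03-10 Inquiry 2
C004 2016-04-15 Sales
"""


def parse(text):
    rows = []
    for line in text.strip().splitlines():
        cust_id, tx_date, descr = line.split(maxsplit=2)
        rows.append((cust_id, date.fromisoformat(tx_date), descr))
    return rows


def seq_from_sale(rows, days=40):
    out = []
    for _, partition in groupby(sorted(rows), key=lambda r: r[0]):
        part = list(partition)
        i = 0
        while i < len(part):
            # INQ*: rows before the first sale; count(*) - count(inq.*) = 0
            while i < len(part) and part[i][2] != "Sales":
                out.append(part[i] + (0,))
                i += 1
            if i == len(part):
                break                      # SALE1{0,1} matched zero rows
            # SALE1: the first sale is number 1
            sale_date = part[i][1]
            seq = 1
            out.append(part[i] + (seq,))
            i += 1
            # MORE_TX*: any row within `days` of the first sale
            while i < len(part) and part[i][1] <= sale_date + timedelta(days=days):
                seq += 1
                out.append(part[i] + (seq,))
                i += 1
    return out


seqs = [row[3] for row in seq_from_sale(parse(DATA))]
print(seqs)
```

The subtraction in the SQL measure has no counterpart here because the sketch simply starts counting at the sale row; both approaches give inquiries before the sale a value of 0.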
  16. Negative sequence for Inquiries within 10 days prior to Sale
      Starting point: the slide 15 output and query.
  18. Negative sequence for Inquiries within 10 days prior to Sale
      Input and SEQ values still as on slide 15; only the measure changes:
        select * from t
        match_recognize(
          partition by cust_id
          order by tx_date, descr
          measures case
              when classifier() = 'INQ'
               and tx_date >= final first(sale1.tx_date) - 10
              then count(inq.*) - final count(inq.*) - 1
              else count(*) - count(inq.*)
            end as seq
          all rows per match
          pattern (inq* sale1{0,1} more_tx*)
          define inq as descr != 'Sales',
                 sale1 as descr = 'Sales',
                 more_tx as tx_date <= sale1.tx_date + 40
        );
  19. Negative sequence for Inquiries within 10 days prior to Sale
        CUST_ID  TX_DATE     DESCR           SEQ
        C001     2016-01-01  Inquiry          -2
        C001     2016-01-01  Inquiry          -1
        C001     2016-01-10  Sales             1
        C001     2016-01-21  Repeat Inquiry    2
        C001     2016-02-10  Repeat Inquiry    3
        C001     2016-05-01  Sales             1
        C001     2016-05-06  Sales             2
        C001     2016-06-10  Inquiry 1         3
        C001     2016-09-01  Inquiry 2         0
        C002     2016-02-01  Inquiry 1         0
        C002     2016-02-25  Inquiry 2         0
        C003     2016-02-01  Inquiry 2        -1
        C003     2016-02-10  Sales             1
        C003     2016-02-10  Sales             2
        C003     2016-03-10  Inquiry 2         3
        C004     2016-04-15  Sales             1

        select * from t
        match_recognize(
          partition by cust_id
          order by tx_date, descr
          measures case
              when classifier() = 'INQ'
               and tx_date >= final first(sale1.tx_date) - 10
              then count(inq.*) - final count(inq.*) - 1
              else count(*) - count(inq.*)
            end as seq
          all rows per match
          pattern (inq* sale1{0,1} more_tx*)
          define inq as descr != 'Sales',
                 sale1 as descr = 'Sales',
                 more_tx as tx_date <= sale1.tx_date + 40
        );
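The negative numbering relies on FINAL semantics: each inquiry row needs to know how many inquiries the whole match will contain and when the sale happens. A Python sketch of that idea follows, for one customer at a time. The function name `signed_seq` and the two sample partitions (C001 and C003 from the slide) are choices made for this illustration only.

```python
from datetime import date, timedelta


def signed_seq(part, after_days=40, before_days=10):
    """part: one customer's rows as (tx_date, descr), sorted by tx_date, descr."""
    out = []
    i = 0
    while i < len(part):
        start = i
        while i < len(part) and part[i][1] != "Sales":          # INQ*
            i += 1
        inq_rows = part[start:i]
        sale_date = part[i][0] if i < len(part) else None       # SALE1{0,1}
        for n, (tx_date, descr) in enumerate(inq_rows, start=1):
            near = (sale_date is not None
                    and tx_date >= sale_date - timedelta(days=before_days))
            # running count(inq.*) - final count(inq.*) - 1, else 0
            out.append((tx_date, descr, n - len(inq_rows) - 1 if near else 0))
        if sale_date is not None:
            seq = 0
            # the sale row itself, then MORE_TX* within `after_days` of it
            while i < len(part) and part[i][0] <= sale_date + timedelta(days=after_days):
                seq += 1
                out.append(part[i] + (seq,))
                i += 1
    return out


C001 = [(date(2016, 1, 1), "Inquiry"), (date(2016, 1, 1), "Inquiry"),
        (date(2016, 1, 10), "Sales"), (date(2016, 1, 21), "Repeat Inquiry"),
        (date(2016, 2, 10), "Repeat Inquiry"), (date(2016, 5, 1), "Sales"),
        (date(2016, 5, 6), "Sales"), (date(2016, 6, 10), "Inquiry 1"),
        (date(2016, 9, 1), "Inquiry 2")]
C003 = [(date(2016, 2, 1), "Inquiry 2"), (date(2016, 2, 10), "Sales"),
        (date(2016, 2, 10), "Sales"), (date(2016, 3, 10), "Inquiry 2")]

c001_seq = [row[2] for row in signed_seq(C001)]
c003_seq = [row[2] for row in signed_seq(C003)]
print(c001_seq, c003_seq)
```

Looking ahead to `sale_date` and `len(inq_rows)` before emitting the inquiry rows plays the role that the FINAL keyword plays in the measure.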
  20. Hierarchical Summary: get salaries of mgr + subordinates
        select level lvl, ename, sal
        from scott.emp
        start with mgr is null
        connect by mgr = prior empno;

        LVL  ENAME    SAL
          1  KING    5000
          2  JONES   2975
          3  SCOTT   3000
          4  ADAMS   1100
          3  FORD    3000
          4  SMITH    800
          2  BLAKE   2850
          3  ALLEN   1600
          3  WARD    1250
          3  MARTIN  1250
          3  TURNER  1500
          3  JAMES    950
          2  CLARK   2450
          3  MILLER  1300
  21. Hierarchical Summary: get salaries of mgr + subordinates
      Input rows as on slide 20, now wrapped in MATCH_RECOGNIZE:
        select * from (
          select level lvl, ename, sal
          from scott.emp
          start with mgr is null
          connect by mgr = prior empno
        )
        match_recognize(
          measures a.lvl lvl, a.ename ename, a.sal sal,
                   sum(sal) as sum_sal
          pattern(a b*)
          define b as lvl > a.lvl
        );
  22. Hierarchical Summary: get salaries of mgr + subordinates
      The slide 21 query returns a single row:
        LVL  ENAME  SAL   SUM_SAL
          1  KING   5000    29025
  23. Hierarchical Summary: get salaries of mgr + subordinates
      Same result; the query now spells out the default:
        after match skip past last row
  24. Hierarchical Summary: get salaries of mgr + subordinates
      The query is changed to:
        after match skip to next row
  25. Hierarchical Summary: get salaries of mgr + subordinates
        LVL  ENAME    SAL  SUM_SAL
          1  KING    5000    29025
          2  JONES   2975    10875
          3  SCOTT   3000     4100
          4  ADAMS   1100     1100
          3  FORD    3000     3800
          4  SMITH    800      800
          2  BLAKE   2850     9400
          3  ALLEN   1600     1600
          3  WARD    1250     1250
          3  MARTIN  1250     1250
          3  TURNER  1500     1500
          3  JAMES    950      950
          2  CLARK   2450     3750
          3  MILLER  1300     1300

        select * from (
          select level lvl, ename, sal
          from scott.emp
          start with mgr is null
          connect by mgr = prior empno
        )
        match_recognize(
          measures a.lvl lvl, a.ename ename, a.sal sal,
                   sum(sal) as sum_sal
          after match skip to next row
          pattern(a b*)
          define b as lvl > a.lvl
        );

      http://www.kibeha.dk/2015/07/row-pattern-matching-nested-within.html
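The effect of AFTER MATCH SKIP TO NEXT ROW can be emulated with a nested loop over the hierarchy in its CONNECT BY order: every row starts its own match and collects the following rows at a deeper level. This Python sketch assumes the rows arrive in that order; `subtree_sums` is a hypothetical name.

```python
# (lvl, ename, sal) in the order returned by the hierarchical query
EMP = [
    (1, "KING", 5000), (2, "JONES", 2975), (3, "SCOTT", 3000),
    (4, "ADAMS", 1100), (3, "FORD", 3000), (4, "SMITH", 800),
    (2, "BLAKE", 2850), (3, "ALLEN", 1600), (3, "WARD", 1250),
    (3, "MARTIN", 1250), (3, "TURNER", 1500), (3, "JAMES", 950),
    (2, "CLARK", 2450), (3, "MILLER", 1300),
]


def subtree_sums(rows):
    result = []
    for i, (lvl, ename, sal) in enumerate(rows):    # skip to next row: every row is an A
        total = sal
        for next_lvl, _, next_sal in rows[i + 1:]:
            if next_lvl <= lvl:                     # DEFINE b AS lvl > a.lvl fails
                break
            total += next_sal                       # SUM(sal) over A and all B rows
        result.append((lvl, ename, sal, total))
    return result


sums = [row[3] for row in subtree_sums(EMP)]
print(sums)
```

With the default SKIP PAST LAST ROW, only the first iteration of the outer loop would produce output, which is why slide 22 shows KING alone.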
  26. Inequality joins
        create table t1(id, jd, v) cache as
        select level, level + .1, level
        from dual
        connect by level <= 20000;

        create table t2 cache as select * from t1;

      • Equality
      • Band Join: compare T1.ID to T2.ID + a constant
      • Range Join: T1.ID within a T2 range (ID to JD)
      • Overlap Join: T1 range (ID to JD) overlaps T2 range
  27. Equality
        select t1.id id1, t2.id id2, t1.v v1, t2.v v2
        from t1, t2
        where t1.id = t2.id

      Elapsed: .01 seconds
  28. Band Join
        select t1.id id1, t2.id id2, t1.v v1, t2.v v2
        from t1, t2
        where t1.id between t2.id and t2.id + .1

      Elapsed: .04 seconds
      • New implementation in 12.2
      • Before 12.2, about the same time as range join =>
  29. Range Join
        select t1.id id1, t2.id id2, t1.v v1, t2.v v2
        from t1, t2
        where t1.id between t2.id and t2.jd

      Elapsed: 30 seconds (Equality: .01, Band: .04)
  30. Range Join Execution Plan
      --------------------------------------------------------
      | Id | Operation             | Name | Starts | A-Rows |
      --------------------------------------------------------
      |  0 | SELECT STATEMENT      |      |      1 |  20000 |
      |  1 |  MERGE JOIN           |      |      1 |  20000 |
      |  2 |   SORT JOIN           |      |      1 |  20000 |
      |  3 |    TABLE ACCESS FULL  | T1   |      1 |  20000 |
      |* 4 |   FILTER              |      |  20000 |  20000 |
      |* 5 |    SORT JOIN          |      |  20000 |   200M |
      |  6 |     TABLE ACCESS FULL | T2   |      1 |  20000 |
      --------------------------------------------------------
  31. All possible combinations
        T1    T2
        ID    ID   JD
         1     1  1.1
         1     2  2.1
         1     3  3.1
         2     1  1.1
         2     2  2.1
         2     3  3.1
         3     1  1.1
         3     2  2.1
         3     3  3.1
  32. Same table, with the access predicate applied:
        5 - access("T1"."ID">="T2"."ID")
  33. Same table, with both predicates applied:
        4 - filter("T1"."ID"<="T2"."JD")
        5 - access("T1"."ID">="T2"."ID")
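The 200M figure in the range join plan follows from the two predicates. The Python sketch below counts, for a small table, how many row combinations pass the access predicate and how many survive the filter, then applies the closed form to 20000 rows. It assumes the slide 26 test data (integer IDs 1..n and JD = ID + .1); the function name `merge_join_work` is hypothetical.

```python
def merge_join_work(n):
    """Count combinations passing access (T1.ID >= T2.ID) and,
    of those, the ones passing filter (T1.ID <= T2.JD)."""
    accessed = kept = 0
    for t1_id in range(1, n + 1):
        for t2_id in range(1, n + 1):
            if t1_id >= t2_id:                # 5 - access
                accessed += 1
                if t1_id <= t2_id + 0.1:      # 4 - filter, JD = ID + .1
                    kept += 1
    return accessed, kept


small = merge_join_work(100)
print(small)                                  # brute force on 100 rows

n = 20000
accessed_20000 = n * (n + 1) // 2             # 1 + 2 + ... + n
print(accessed_20000)
```

For each T1 row the access predicate lets through every T2 row with a smaller or equal ID, so the work grows with the square of the table size while only n rows are finally returned.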
  34. Sort all and Match?
      [Diagram: T2 ranges and T1 rows interleaved along the ID axis, from 1 to 5.1]
      Sort by ID, T2 first (order shown from left to right).
      T2 range diff. is now 1.1, so will match 2 T1 rows.
  35. Sort all and Match?
      Start at the T2 row (1 to 2.1). Look for following T1 rows with ID <= 2.1.
      Due to sort, their IDs must be >= T2.ID.
  36. Sort all and Match?
      Start, then Join: T1.ID < 2.1. Due to sort, T1.ID is automatically >= T2.ID.
  37. Sort all and Match?
      Start, Join, then match: T2.ID < 2.1. So match, but do not output.
  38. Sort all and Match?
      Start, Join, match, then Join: T1.ID < 2.1. So match and output.
  39. Sort all and Match?
      Start, Join, match, Join, then X: match ended. Skip to next.
  40. Sort all and Match?
      Start, Join, match, Join, X (full diagram, as on slide 34).
  41. (Almost) All Rows per Match
      • PATTERN ( A {- B A -} B)
        – The parts of the pattern enclosed between {- and -} are excluded from the output.
        – Here only two rows per match will be returned
        – More granular than using a WHERE clause
      • Alternation: | means OR
        – "Alternatives are preferred in the order they are specified."
          PATTERN ( A | B ) = If A condition is true then A, else if B condition is true then B
  42. Range Match
        select ID ID1, ID2, JD2 from (
          select t2.*, 1 is_t2 from t2
          union all
          select t1.*, null from t1
        )
        match_recognize(
          order by id, is_t2
          measures t2.id id2, t2.jd jd2
          all rows per match
          after match skip to next row
          pattern({-T2-} ( T1 | {-T2-} )* T1)
          define T2 as is_t2 = 1 and id < first(t2.jd),
                 T1 as is_t2 is null and id < first(t2.jd)
        );

      Elapsed: .12 secs (Equality: .01, Band: .04, Range join: 30.00)
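The "sort all and match" idea behind this query can be sketched in Python: merge both tables into one sorted list, start a match at each T2 row, and emit the T1 rows that follow while their ID is still below that T2 row's JD. The names `range_match` and `range_join` are hypothetical, and the comparison with the brute-force join assumes no T1.ID is exactly equal to a T2.JD, since the match uses `<` where the join uses BETWEEN.

```python
def range_match(t1_ids, t2_ranges):
    # kind 0 = T2, kind 1 = T1, so T2 sorts first on equal IDs (order by id, is_t2)
    rows = [(id2, 0, jd2) for id2, jd2 in t2_ranges]
    rows += [(id1, 1, None) for id1 in t1_ids]
    rows.sort(key=lambda r: (r[0], r[1]))
    out = []
    for start, (id2, kind, jd2) in enumerate(rows):
        if kind != 0:
            continue                            # a match must start with a T2 row
        nxt = start + 1
        while nxt < len(rows) and rows[nxt][0] < jd2:
            if rows[nxt][1] == 1:               # T1 row: output; T2 row: excluded
                out.append((rows[nxt][0], id2, jd2))
            nxt += 1
    return out


def range_join(t1_ids, t2_ranges):
    """Brute-force equivalent of the slide 29 range join."""
    return [(id1, id2, jd2)
            for id1 in t1_ids
            for id2, jd2 in t2_ranges
            if id2 <= id1 <= jd2]


t1 = [1, 2, 3, 4, 5]
t2 = [(i, i + 1.1) for i in range(1, 5)]        # range width 1.1, as on slide 34
matched = sorted(range_match(t1, t2))
joined = sorted(range_join(t1, t2))
print(len(matched), matched == joined)
```

Each match only scans forward until the first row at or beyond the end of the range, which is what keeps the work close to linear when ranges are short.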
  43. Overlap Join
        select t1.id id1, t2.id id2, t1.v v1, t2.v v2
        from t1, t2
        where (t2.id <= t1.id and t1.id < t2.jd)
           or (t1.id <= t2.id and t2.id < t1.jd)

      Elapsed: 50 seconds
  44. Overlap Join Execution Plan
      |   0 | SELECT STATEMENT        |       |      1 |       |      1 |
      |   1 |  SORT AGGREGATE         |       |      1 |     1 |      1 |
      |   2 |   VIEW                  | VW... |      1 |  175M |  20000 |
      |   3 |    UNION-ALL            |       |      1 |       |  20000 |
      |   4 |     MERGE JOIN          |       |      1 |  100M |  20000 |
      |   5 |      SORT JOIN          |       |      1 | 20000 |  20000 |
      |   6 |       TABLE ACCESS FULL | T2    |      1 | 20000 |  20000 |
      |*  7 |      FILTER             |       |  20000 |       |  20000 |
      |*  8 |       SORT JOIN         |       |  20000 | 20000 |   200M |
      |   9 |        TABLE ACCESS FULL| T1    |      1 | 20000 |  20000 |
      |  10 |     MERGE JOIN          |       |      1 |   75M |      0 |
      |  11 |      SORT JOIN          |       |      1 | 20000 |  20000 |
      |  12 |       TABLE ACCESS FULL | T2    |      1 | 20000 |  20000 |
      |* 13 |      FILTER             |       |  20000 |       |      0 |
      |* 14 |       SORT JOIN         |       |  20000 | 20000 |   200M |
      |  15 |        TABLE ACCESS FULL| T1    |      1 | 20000 |  20000 |
  45. Overlap Match
        select * from (
          select t1.*, 1 table_num from t1
          union all
          select t2.*, 2 from t2
        )
        match_recognize(
          order by id, jd
          all rows per match
          after match skip to next row
          pattern({-ta-} ( tb | {-x-} )* tb)
          define tb as table_num != ta.table_num and id < first(jd),
                 x as table_num = ta.table_num and id < first(jd)
        );

      Elapsed: .12 secs (Equality: .01, Band: .04, Range match: .12)
      No need to wait for 18c
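The overlap match uses the same sweep, except that any row can start a match and partners are the following rows from the other table. A Python sketch, compared against the slide 43 join condition on a few made-up ranges, is shown here. The names `overlap_match` and `overlap_join` and the sample data are assumptions for illustration, and every range is assumed to have ID < JD.

```python
def overlap_match(t1, t2):
    """t1, t2: lists of (id, jd) ranges. Returns (t1_range, t2_range) pairs."""
    rows = sorted([(i, j, 1) for i, j in t1] + [(i, j, 2) for i, j in t2])
    pairs = []
    for pos, (ta_id, ta_jd, ta_num) in enumerate(rows):   # every row is a TA once
        nxt = pos + 1
        while nxt < len(rows) and rows[nxt][0] < ta_jd:   # id < first(jd)
            tb_id, tb_jd, tb_num = rows[nxt]
            if tb_num != ta_num:                          # TB: row from the other table
                if ta_num == 1:
                    pairs.append(((ta_id, ta_jd), (tb_id, tb_jd)))
                else:
                    pairs.append(((tb_id, tb_jd), (ta_id, ta_jd)))
            nxt += 1                                      # X rows are simply skipped
    return pairs


def overlap_join(t1, t2):
    """Brute-force equivalent of the slide 43 WHERE clause."""
    return [((a, b), (c, d))
            for a, b in t1
            for c, d in t2
            if (c <= a < d) or (a <= c < b)]


t1 = [(1, 3), (4, 6), (10, 11)]
t2 = [(2, 5), (6, 7), (10, 11)]
matched = sorted(overlap_match(t1, t2))
joined = sorted(overlap_join(t1, t2))
print(matched)
```

Because each row only looks forward in the sorted list, a pair of overlapping ranges is found once, by whichever of the two sorts first.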
  46. Child's play
  47. Solving Problems with pattern matching
      • Clear knowledge of input & requirement
        – Beware of assumptions
      • Identify typical problems and solutions
        – Consecutive sequences
        – Ad hoc grouping
        – Bin fitting
        – Ranges
      • Visualize the data processing flow
        – Output from other rows is not available, input is.
  48. Meet Your Match: Advanced row pattern matching (12c)
      Stew Ashton, UKOUG Tech 17
      https://stewashton.wordpress.com/
      Twitter: @stewashton
  49. Anchors
      • Anchors
        – ^ matches the position before the first row in the partition.
        – $ matches the position after the last row in the partition.
          PATTERN(^ A $) = partition must have 1 row
  50. JOIN alternative: CDC compare
        T1                   T2
        PK  VAL              PK  VAL
         1  Same value        1  Same value
         2  Delete this       3  New value
         3  Old value         4  Insert this

        select pk, op, val, oldrid from (
          select pk, val, rowid rid from t1
          union all
          select pk, val, null from t2
        )
        match_recognize(
          partition by pk
          order by rid
          measures classifier() op, first(rid) oldrid
          all rows per match
          pattern(^ D $ | ^ I $ | (^ O U $) )
          define D as rid is not null,
                 U as decode(O.val, val, 0, 1) = 1
        );

        PK  OP  VAL          OLDRID
         2  D   Delete this  AAAkdlAAH…MAAB
         3  O   Old value    AAAkdlAAH…MAAC
         3  U   New value    AAAkdlAAH…MAAC
         4  I   Insert this
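The three alternatives in the pattern correspond to three cases per primary key: present only in T1 (D), present only in T2 (I), or present in both with different values (O then U). The Python sketch below reproduces that classification with plain dictionaries. The name `cdc_compare` is hypothetical, and the OLDRID column is left out because ROWID has no counterpart here.

```python
def cdc_compare(old, new):
    """old, new: dicts mapping pk -> val (T1 and T2 on the slide)."""
    changes = []
    for pk in sorted(old.keys() | new.keys()):        # partition by pk
        if pk not in new:
            changes.append((pk, "D", old[pk]))        # ^ D $ : single row, from T1
        elif pk not in old:
            changes.append((pk, "I", new[pk]))        # ^ I $ : single row, from T2
        elif old[pk] != new[pk]:
            changes.append((pk, "O", old[pk]))        # ^ O U $ : old row first...
            changes.append((pk, "U", new[pk]))        # ...then the changed value
        # identical rows match no alternative and produce no output
    return changes


T1 = {1: "Same value", 2: "Delete this", 3: "Old value"}
T2 = {1: "Same value", 3: "New value", 4: "Insert this"}
result = cdc_compare(T1, T2)
for row in result:
    print(row)
```

Primary key 1 is absent from the result, matching the slide, where the unchanged row does not appear in the output.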
