My presentation on the Oracle Database feature called "Row Pattern Matching", presented on 9 December 2014 at UKOUG Tech 14. Be sure to download the slides to see the animations.
ISO SQL:2016 introduced Row Pattern Matching: a feature that applies (limited) regular expressions to table rows and performs analysis on each match. As of this writing, the feature is supported only by Oracle Database (12c and later).
1. Database 12c Row Pattern Matching
Beating the Best Pre-12c Solutions
[CON3450]
Stew ASHTON
Oracle OpenWorld 2014
2. Photo Opportunity
• Presentation available on www.slideshare.net
• For exact link:
– See @StewAshton on Twitter
– Or see http://stewashton.wordpress.com
3. Agenda
• Who am I?
• Pre-12c solutions compared to row pattern
matching with MATCH_RECOGNIZE
– For all sizes of data
– Thinking in patterns
• Watch out for “catastrophic backtracking”
• Other things to keep in mind (time permitting)
OOW CON3450, Stew Ashton 3
4. Who am I?
• 33 years in IT
– Developer, Technical Sales Engineer, Technical Architect
– Aeronautics, IBM, Finance
– Mainframe, client-server, Web apps
• 25 years as an American in Paris
• 9 years using Oracle database
– Performance analysis
– Replace Java with SQL
• 2 years as internal “Oracle Development Expert”
5. 1) “Fixed Difference”
• Identify and group rows with consecutive values
• My presentation: print slides to keep
• Math: subtract known consecutives
– If A-1 = B-2 then A = B-1
– Else A <> B-1
– Consecutive becomes equality,
non-consecutive becomes inequality
• “Consecutive” = fixed difference of 1
PAGE: 1, 2, 3, 5, 6, 7, 10, 11, 12, 42
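The "fixed difference" math above leads to the classic pre-12c query. A minimal sketch, assuming the table T and column PAGE used in the surrounding slides:

```sql
-- Pre-12c "fixed difference" grouping: subtracting ROW_NUMBER()
-- from PAGE yields the same constant for every consecutive run,
-- so grouping by that difference collapses each run to one row.
SELECT MIN(page) firstpage,
       MAX(page) lastpage,
       COUNT(*)  cnt
FROM (
  SELECT page,
         page - ROW_NUMBER() OVER (ORDER BY page) grp
  FROM t
)
GROUP BY grp
ORDER BY firstpage;
```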
7. Think “match a row pattern”
• PATTERN
– Uninterrupted series of input rows
– Described as a list of conditions (“regular expressions”)
PATTERN (A B*)
"A" : 1 row, "B" : 0 or more rows, as many as possible
• DEFINE each row condition
[A undefined = TRUE]
B AS page = PREV(page)+1
• Each series that matches the pattern is a “match”
– "A" and "B" identify the rows that meet their conditions
8. Input, Processing, Output
1. Define input
2. Order input
3. Process pattern
4. using defined conditions
5. Output: rows per match
6. Output: columns per row
7. Go where after match?

SELECT *
FROM t
MATCH_RECOGNIZE (
  ORDER BY page
  MEASURES
    A.page firstpage,
    LAST(page) lastpage,
    COUNT(*) cnt
  ONE ROW PER MATCH
  AFTER MATCH SKIP PAST LAST ROW
  PATTERN (A B*)
  DEFINE B AS page = PREV(page)+1
);
9. 1) Run_Stats comparison
For one million rows:

Stat                      Pre 12c  Match_R  Pct
Latches                   4090     4079     100%
Elapsed Time              5.51     5.56     101%
CPU used by this session  5.5      5.55     101%

"Latches" are serialization devices: fewer means more scalable
10. 1) Execution Plans
Pre-12c plan:

Operation             Used-Mem
SELECT STATEMENT
 HASH GROUP BY        40M (0)
  VIEW
   WINDOW SORT        20M (0)
    TABLE ACCESS FULL T

MATCH_RECOGNIZE plan:

Operation                                         Used-Mem
SELECT STATEMENT
 VIEW
  MATCH RECOGNIZE SORT DETERMINISTIC FINITE AUTO  20M (0)
   TABLE ACCESS FULL T
11. 2) “Start of Group”
• Identify group boundaries, often using LAG()
• 3 steps instead of 2:
1. For each row: if start of group, assign 1
Else assign 0
2. Running total of 1s and 0s produces a group
identifier
3. Group by the group identifier
12. 2) Requirement
GROUP_NAME EFF_DATE TERM_DATE
X 2014-01-01 00:00 2014-02-01 00:00
X 2014-03-01 00:00 2014-04-01 00:00
X 2014-04-01 00:00 2014-05-01 00:00
X 2014-06-01 00:00 2014-06-01 01:00
X 2014-06-01 01:00 2014-06-01 02:00
X 2014-06-01 02:00 2014-06-01 03:00
Y 2014-06-01 03:00 2014-06-01 04:00
Y 2014-06-01 04:00 2014-06-01 05:00
Y 2014-07-03 08:00 2014-09-29 17:00
Merge contiguous date ranges in same group
13.
GROUP_NAME  START_TS     END_TS       GRP_START  GRP_ID
X           01-01 00:00  02-01 00:00  1          1
X           03-01 00:00  04-01 00:00  1          2
X           04-01 00:00  05-01 00:00  0          2
X           06-01 00:00  06-01 01:00  1          3
X           06-01 01:00  06-01 02:00  0          3
X           06-01 02:00  06-01 03:00  0          3
Y           06-01 03:00  06-01 04:00  1          1
Y           06-01 04:00  06-01 05:00  0          1
Y           07-03 08:00  09-29 17:00  1          2
with grp_starts as (
select a.*,
case when start_ts =
lag(end_ts) over(
partition by group_name
order by start_ts
)
then 0 else 1 end grp_start
from t a
), grps as (
select b.*,
sum(grp_start) over(
partition by group_name
order by start_ts
) grp_id
from grp_starts b)
select group_name,
min(start_ts) start_ts,
max(end_ts) end_ts
from grps
group by group_name, grp_id;
14. 2) Match_Recognize
SELECT * FROM t
MATCH_RECOGNIZE(
PARTITION BY group_name
ORDER BY start_ts
MEASURES
A.start_ts start_ts,
end_ts end_ts,
next(start_ts) - end_ts gap
PATTERN(A B*)
DEFINE B AS start_ts = prev(end_ts)
);
New this time:
• Added PARTITION BY
• MEASURES: added gap using a row outside the match!
• ONE ROW PER MATCH and AFTER MATCH SKIP PAST LAST ROW are the defaults
One solution replaces two methods: simple!
15. Which row do we mean?

Expression          In DEFINE / ALL ROWS PER MATCH           In MEASURES with ONE ROW PER MATCH
start_ts            current row                              last row of match
FIRST(start_ts)     first row of match                       first row of match
LAST(end_ts)        current row                              last row of match
FINAL LAST(end_ts)  ORA-62509 (not allowed in DEFINE)        last row of match
B.start_ts          most recent B row                        last B row
PREV(), NEXT()      physical offset from the referenced row  physical offset from the referenced row
COUNT(*)            from first row to current row            all rows in match
COUNT(B.*)          B rows up to and including current row   all B rows
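The running-versus-final distinction in this table can be seen directly. A minimal sketch, reusing the T / GROUP_NAME / START_TS / END_TS columns from the preceding slides, that outputs both values for every row of each match:

```sql
SELECT * FROM t
MATCH_RECOGNIZE (
  PARTITION BY group_name
  ORDER BY start_ts
  MEASURES
    LAST(end_ts)       running_end, -- as of the current row
    FINAL LAST(end_ts) final_end,   -- last row of the whole match
    COUNT(*)           running_cnt  -- rows so far in this match
  ALL ROWS PER MATCH
  PATTERN (A B*)
  DEFINE B AS start_ts = PREV(end_ts)
);
```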
16. 2) Run_Stats comparison
For 500,000 rows:

Stat                      Pre 12c  Match_R  Pct
Latches                   10165    8066     79%
Elapsed Time              32.16    20.58    64%
CPU used by this session  31.94    19.67    62%
17. 2) Execution Plans
Pre-12c plan:

Operation             Used-Mem
SELECT STATEMENT
 HASH GROUP BY        20M (0)
  VIEW
   WINDOW BUFFER      32M (0)
    VIEW
     WINDOW SORT      27M (0)
      TABLE ACCESS FULL

MATCH_RECOGNIZE plan:

Operation                                         Used-Mem
SELECT STATEMENT
 VIEW
  MATCH RECOGNIZE SORT DETERMINISTIC FINITE AUTO  27M (0)
   TABLE ACCESS FULL
18. 2) Predicate pushing
SELECT * FROM <view> WHERE group_name = 'X'

Operation                                        Name  A-Rows  Buffers
SELECT STATEMENT                                       3       4
 VIEW                                                  3       4
  MATCH RECOGNIZE SORT DETERMINISTIC FINITE AUTO       3       4
   TABLE ACCESS BY INDEX ROWID BATCHED           T     6       4
    INDEX RANGE SCAN                             TI    6       3
20.
SELECT s first_site, MAX(e) last_site, MAX(sm) sum_cnt FROM (
  SELECT s, e, cnt, sm FROM t
  MODEL
    DIMENSION BY (row_number() over(order by study_site) rn)
    MEASURES (study_site s, study_site e, cnt, cnt sm)
    RULES (
      sm[rn > 1] =
        CASE WHEN sm[cv() - 1] + cnt[cv()] > 65000 OR cnt[cv()] > 65000
             THEN cnt[cv()]
             ELSE sm[cv() - 1] + cnt[cv()]
        END,
      s[rn > 1] =
        CASE WHEN sm[cv() - 1] + cnt[cv()] > 65000 OR cnt[cv()] > 65000
             THEN s[cv()]
             ELSE s[cv() - 1]
        END
    )
)
GROUP BY s;

• DIMENSION with row_number orders data and processing
• rn can be used like a subscript
• cv() means current row
• cv()-1 means previous row
21.
SELECT * FROM t
MATCH_RECOGNIZE (
ORDER BY study_site
MEASURES
FIRST(study_site) first_site,
LAST(study_site) last_site,
SUM(cnt) sum_cnt
PATTERN (A+)
DEFINE A AS SUM(cnt) <= 65000
);
New this time:
• PATTERN (A+) replaces (A B*): means 1 or more rows
• Why? In previous examples I used PREV(), which returns NULL on the first row.
One solution replaces 3 methods: simpler!
22. 3) Run_Stats comparison
For one million rows:

Stat                      Pre 12c  Match_R  Pct
Latches                   357448   4622     1%
Elapsed Time              32.85    2.9      9%
CPU used by this session  31.31    2.88     9%
23. 3) Execution Plans
Pre-12c plan:

Id  Operation              Used-Mem
0   SELECT STATEMENT
1    HASH GROUP BY         7534K (0)
2     VIEW
3      SQL MODEL ORDERED   105M (0)
4       WINDOW SORT        27M (0)
5        TABLE ACCESS FULL

MATCH_RECOGNIZE plan:

Id  Operation                                         Used-Mem
0   SELECT STATEMENT
1    VIEW
2     MATCH RECOGNIZE SORT DETERMINISTIC FINITE AUTO  27M (0)
3      TABLE ACCESS FULL
24. 4) "Bin fitting": fixed number

Name  Val  Val(desc)  BIN1  BIN2  BIN3
1     1    10         10
2     2    9          10    9
3     3    8          10    9     8
4     4    7          10    9     15
5     5    6          10    15    15
6     6    5          15    15    15
7     7    4          19    15    15
8     8    3          19    18    15
9     9    2          19    18    17
10    10   1          19    18    18

• Requirement
  – Distribute values in 3 "bins" as equally as possible
• "Best fit decreasing"
  – Sort values in decreasing order
  – Put each value in the least full bin
25. 4) Brilliant pre 12c solution
SELECT bin, Max (bin_value) bin_value
FROM (
SELECT * FROM items
MODEL
DIMENSION BY
(Row_Number() OVER
(ORDER BY item_value DESC) rn)
MEASURES (
item_name,
item_value,
Row_Number() OVER
(ORDER BY item_value DESC) bin,
item_value bin_value,
Row_Number() OVER
(ORDER BY item_value DESC) rn_m,
0 min_bin,
Count(*) OVER () - 3 - 1 n_iters
)
RULES ITERATE(100000)
UNTIL (ITERATION_NUMBER >= n_iters[1]) (
min_bin[1] = Min(rn_m) KEEP (DENSE_RANK
FIRST ORDER BY bin_value)[rn<= 3],
bin[ITERATION_NUMBER + 3 + 1] =
min_bin[1],
bin_value[min_bin[1]] =
bin_value[CV()] +
Nvl(item_value[ITERATION_NUMBER+4], 0))
)
WHERE item_name IS NOT NULL
group by bin;
26.
SELECT * from items
MATCH_RECOGNIZE (
ORDER BY item_value desc
MEASURES
sum(bin1.item_value) bin1,
sum(bin2.item_value) bin2,
sum(bin3.item_value) bin3
PATTERN ((bin1|bin2|bin3)+)
DEFINE
bin1 AS count(bin1.*) = 1
OR sum(bin1.item_value)-bin1.item_value
<= least(
sum(bin2.item_value),
sum(bin3.item_value)
),
bin2 AS count(bin2.*) = 1
OR sum(bin2.item_value)-bin2.item_value
<= sum(bin3.item_value)
);
• ()+ = 1 or more of whatever is inside
• '|' = alternatives, "preferred in the order specified"
• Bin1 condition: no rows here yet, or this bin least full
• Bin2 condition: no rows here yet, or this bin less full than bin 3
27. 4) Run_Stats comparison
For 10,000 rows:

Stat                      Pre 12c  Match_R  Pct
Latches                   3124     47       2%
Elapsed Time              28       0.02     0%
CPU used by this session  26.39    0.03     0%
28. 4) Execution Plans
Pre-12c plan:

Id  Operation              Used-Mem
0   SELECT STATEMENT
1    HASH GROUP BY         817K (0)
2     VIEW
3      SQL MODEL ORDERED   1846K (0)
4       WINDOW SORT        424K (0)
5        TABLE ACCESS FULL

MATCH_RECOGNIZE plan:

Id  Operation                 Used-Mem
0   SELECT STATEMENT
1    VIEW
2     MATCH RECOGNIZE SORT    330K (0)
3      TABLE ACCESS FULL
29. Backtracking
• What happens when there is no match???
• “Greedy” quantifiers - * + {2,}
– are not that greedy
– Take all the rows they can, BUT
give rows back if necessary – one at a time
• Regular expression engines will test all possible
combinations to find a match
30. Repeating conditions
select 'match' from (
select level n from dual
connect by level <= 100
)
match_recognize(
pattern(a b* c)
define b as n > prev(n)
, c as n = 0
);
Runs in 0.005 secs
select 'match' from (
select level n from dual
connect by level <= 100
)
match_recognize(
pattern(a b* b* b* c)
define b as n > prev(n)
, c as n = 0
);
Runs in 5.4 secs
31. Imprecise Conditions
SELECT * FROM Ticker
MATCH_RECOGNIZE (
PARTITION BY symbol
ORDER BY tstamp
MEASURES FIRST(tstamp) AS start_tstamp,
LAST(tstamp) AS end_tstamp
AFTER MATCH SKIP TO LAST UP
PATTERN (STRT DOWN+ UP+ DOWN+ UP+)
DEFINE DOWN AS price < PREV(price),
UP AS price > PREV(price),
STRT AS price >= nvl(PREV(PRICE),0)
);
Runs in 0.02 seconds
CREATE TABLE Ticker (
SYMBOL VARCHAR2(10),
tstamp DATE,
price NUMBER
);
insert into ticker
select 'ACME',
sysdate + level/24/60/60,
10000-level
from dual
connect by level <= 5000;
With a less precise STRT condition, the same query runs in 24 seconds (INMEMORY: 13 seconds)
32. Keep in Mind
• Backtracking
  – Precise conditions
  – Test data with no matches
• To debug:
  MEASURES classifier() cl, match_number() mn
  ALL ROWS PER MATCH WITH UNMATCHED ROWS
• No DISTINCT, no LISTAGG
• MEASURES columns must have aliases
• "Reluctant quantifier" = ? = JDBC bind variable marker, so they can clash
• "Pattern variables" are range variables, not bind variables
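The debugging advice above can be sketched as follows, a minimal example assuming the T / PAGE data from the first demo:

```sql
-- Show which condition each row matched (CLASSIFIER) and which
-- match it belongs to (MATCH_NUMBER); unmatched rows come out
-- with NULL measures thanks to WITH UNMATCHED ROWS.
SELECT * FROM t
MATCH_RECOGNIZE (
  ORDER BY page
  MEASURES
    CLASSIFIER()   cl,
    MATCH_NUMBER() mn
  ALL ROWS PER MATCH WITH UNMATCHED ROWS
  PATTERN (A B*)
  DEFINE B AS page = PREV(page)+1
);
```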
33. Output Row "shape"

Per Match  PARTITION BY  ORDER BY  MEASURES  Other input
ONE ROW    X             omitted   X         omitted
ALL ROWS   X             X         X         X

ORA-00918, anyone?