Row Pattern Matching
@MarkusWinand
Image: 72hoursamericanpower.com
Row Pattern Matching Availability
1999
2001
2003
2005
2007
2009
2011
2013
2015
MariaDB
MySQL
PostgreSQL
SQLite
DB2 LUW
12cR1 Oracle
SQL Server
Grouping Consecutive Events
Time
30 minutes
Example: Logfile
Grouping Consecutive Events
Example: Logfile
Time
30 minutes
Session 1 Session 2
Session 3
Session 4
Example problems:
‣ Count sessions
‣ Average session duration
Two approaches:
‣ Start-of-group tagging
‣ Row pattern matching
SELECT	COUNT(grp_start)	groups	
		FROM	(SELECT	CASE	WHEN	ts	>	LAG(	ts,	1,	DATE'1900-01-01'	)	
																														OVER(	ORDER	BY	ts	)	
																														+	INTERVAL	'30'	minute	
																				THEN	1	
																END	grp_start	
										FROM	log	
							)	T
Consecutive Events: Counting Start-of-group tagging
Time
30 minutes
SELECT	COUNT(grp_start)	groups	
		FROM	(SELECT	CASE	WHEN	ts	>	LAG(	ts,	1,	DATE'1900-01-01'	)	
																														OVER(	ORDER	BY	ts	)	
																														+	INTERVAL	'30'	minute	
																				THEN	1	
																END	grp_start	
										FROM	log	
							)	T
Consecutive Events: Counting Start-of-group tagging
Time
30 minutes
count the

non-NULL

values
SELECT	COUNT(*)	sessions	
		FROM	log	
							MATCH_RECOGNIZE(	
								ORDER	BY	ts	
								PATTERN	(	new	)	
								DEFINE	new	AS	ts	>	COALESCE(	PREV(ts)	
																																			,	DATE	'1900-01-01'	
																																			)	
																											+	INTERVAL	'30'	minute		
							)	t
Consecutive Events: Counting Row Pattern Matching
Time
30 minutes
SELECT	COUNT(*)	sessions	
		FROM	log	
							MATCH_RECOGNIZE(	
								ORDER	BY	ts	
								PATTERN	(	new	)	
								DEFINE	new	AS	ts	>	COALESCE(	PREV(ts)	
																																			,	DATE	'1900-01-01'	
																																			)	
																											+	INTERVAL	'30'	minute		
							)	t
Consecutive Events: Counting Row Pattern Matching
Time
30 minutes
row pattern
variable
SELECT	COUNT(*)	sessions	
		FROM	log	
							MATCH_RECOGNIZE(	
								ORDER	BY	ts	
								PATTERN	(	new	)	
								DEFINE	new	AS	ts	>	COALESCE(	PREV(ts)	
																																			,	DATE	'1900-01-01'	
																																			)	
																											+	INTERVAL	'30'	minute		
							)	t
Consecutive Events: Counting Row Pattern Matching
Time
30 minutes
match

only “new”
rows
SELECT	COUNT(*)	sessions	
		FROM	log	
							MATCH_RECOGNIZE(	
								ORDER	BY	ts	
								PATTERN	(	new	)	
								DEFINE	new	AS	ts	>	COALESCE(	PREV(ts)	
																																			,	DATE	'1900-01-01'	
																																			)	
																											+	INTERVAL	'30'	minute		
							)	t
Consecutive Events: Counting Row Pattern Matching
Time
30 minutes
count

rows
SELECT	COUNT(*)	sessions	
					,	AVG(duration)	avg_duration	
		FROM	log	
							MATCH_RECOGNIZE(	
								ORDER	BY	ts	
								MEASURES	
									LAST(ts)	-	FIRST(ts)	AS	duration	
								ONE	ROW	PER	MATCH	
								PATTERN	(	new	cont*	)	
								DEFINE	cont	AS	ts	<	PREV(ts)	
																										+	INTERVAL	'30'	minute		
							)	t
Row Pattern MatchingConsecutive Events: Statistics
Time
30 minutes
define

continuation
Oracle doesn’t support avg on intervals — query doesn’t work as shown
SELECT	COUNT(*)	sessions	
					,	AVG(duration)	avg_duration	
		FROM	log	
							MATCH_RECOGNIZE(	
								ORDER	BY	ts	
								MEASURES	
									LAST(ts)	-	FIRST(ts)	AS	duration	
								ONE	ROW	PER	MATCH	
								PATTERN	(	new	cont*	)	
								DEFINE	cont	AS	ts	<	PREV(ts)	
																										+	INTERVAL	'30'	minute		
							)	t
Row Pattern MatchingConsecutive Events: Statistics
Time
30 minutes
undefined

pattern variable:
matches any row
Oracle doesn’t support avg on intervals — query doesn’t work as shown
SELECT	COUNT(*)	sessions	
					,	AVG(duration)	avg_duration	
		FROM	log	
							MATCH_RECOGNIZE(	
								ORDER	BY	ts	
								MEASURES	
									LAST(ts)	-	FIRST(ts)	AS	duration	
								ONE	ROW	PER	MATCH	
								PATTERN	(	new	cont*	)	
								DEFINE	cont	AS	ts	<	PREV(ts)	
																										+	INTERVAL	'30'	minute		
							)	t
Row Pattern MatchingConsecutive Events: Statistics
Time
30 minutes
any number

of “cont”

rows
Oracle doesn’t support avg on intervals — query doesn’t work as shown
SELECT	COUNT(*)	sessions	
					,	AVG(duration)	avg_duration	
		FROM	log	
							MATCH_RECOGNIZE(	
								ORDER	BY	ts	
								MEASURES	
									LAST(ts)	-	FIRST(ts)	AS	duration	
								ONE	ROW	PER	MATCH	
								PATTERN	(	new	cont*	)	
								DEFINE	cont	AS	ts	<	PREV(ts)	
																										+	INTERVAL	'30'	minute		
							)	t
Row Pattern MatchingConsecutive Events: Statistics
Time
30 minutes
Very much

like GROUP BY
Oracle doesn’t support avg on intervals — query doesn’t work as shown
SELECT	COUNT(*)	sessions	
					,	AVG(duration)	avg_duration	
		FROM	log	
							MATCH_RECOGNIZE(	
								ORDER	BY	ts	
								MEASURES	
									LAST(ts)	-	FIRST(ts)	AS	duration	
								ONE	ROW	PER	MATCH	
								PATTERN	(	new	cont*	)	
								DEFINE	cont	AS	ts	<	PREV(ts)	
																										+	INTERVAL	'30'	minute		
							)	t
Row Pattern MatchingConsecutive Events: Statistics
Time
30 minutes
Very much

like SELECT
Oracle doesn’t support avg on intervals — query doesn’t work as shown
SELECT	COUNT(*)	sessions	
					,	AVG(duration)	avg_duration	
		FROM	log	
							MATCH_RECOGNIZE(	
								ORDER	BY	ts	
								MEASURES	
									LAST(ts)	-	FIRST(ts)	AS	duration	
								ONE	ROW	PER	MATCH	
								PATTERN	(	new	cont*	)	
								DEFINE	cont	AS	ts	<	PREV(ts)	
																										+	INTERVAL	'30'	minute		
							)	t
Row Pattern MatchingConsecutive Events: Statistics
Time
30 minutes
Oracle doesn’t support avg on intervals — query doesn’t work as shown
Consecutive Events: Statistics Start-of-group tagging
Time
30 minutes
Now, let’s try using window functions
SELECT	count(*)	sessions,	avg(duration)	avg_duration	
		FROM	(SELECT	MAX(ts)	-	MIN(ts)	duration	
										FROM	(SELECT	ts,	COUNT(grp_start)	OVER(ORDER	BY	ts)	session_no	
																		FROM	(SELECT	ts,	CASE	WHEN	ts	>=	LAG(	ts,	1,	DATE’1900-01-1'	)	
																																																			OVER(	ORDER	BY	ts	)	
																																																			+	INTERVAL	'30'	minute	
																																								THEN	1	
																																				END	grp_start	
																										FROM	log	
																							)	tagged	
															)	numbered	
									GROUP	BY	session_no	
							)	grouped
Consecutive Events: Statistics Start-of-group tagging
Time
30 minutes
Start-of-group
tags
SELECT	count(*)	sessions,	avg(duration)	avg_duration	
		FROM	(SELECT	MAX(ts)	-	MIN(ts)	duration	
										FROM	(SELECT	ts,	COUNT(grp_start)	OVER(ORDER	BY	ts)	session_no	
																		FROM	(SELECT	ts,	CASE	WHEN	ts	>=	LAG(	ts,	1,	DATE’1900-01-1'	)	
																																																			OVER(	ORDER	BY	ts	)	
																																																			+	INTERVAL	'30'	minute	
																																								THEN	1	
																																				END	grp_start	
																										FROM	log	
																							)	tagged	
															)	numbered	
									GROUP	BY	session_no	
							)	grouped
Consecutive Events: Statistics Start-of-group tagging
Time
30 minutes
number
sessions
2222 2 33 3 44 42 3 4
1
SELECT	count(*)	sessions,	avg(duration)	avg_duration	
		FROM	(SELECT	MAX(ts)	-	MIN(ts)	duration	
										FROM	(SELECT	ts,	COUNT(grp_start)	OVER(ORDER	BY	ts)	session_no	
																		FROM	(SELECT	ts,	CASE	WHEN	ts	>=	LAG(	ts,	1,	DATE’1900-01-1'	)	
																																																			OVER(	ORDER	BY	ts	)	
																																																			+	INTERVAL	'30'	minute	
																																								THEN	1	
																																				END	grp_start	
																										FROM	log	
																							)	tagged	
															)	numbered	
									GROUP	BY	session_no	
							)	grouped
Consecutive Events: Statistics Start-of-group tagging
Time
30 minutes 2222 2 33 3 44 42 3 4
1
SELECT	count(*)	sessions,	avg(duration)	avg_duration	
		FROM	(SELECT	MAX(ts)	-	MIN(ts)	duration	
										FROM	(SELECT	ts,	COUNT(grp_start)	OVER(ORDER	BY	ts)	session_no	
																		FROM	(SELECT	ts,	CASE	WHEN	ts	>=	LAG(	ts,	1,	DATE’1900-01-1'	)	
																																																			OVER(	ORDER	BY	ts	)	
																																																			+	INTERVAL	'30'	minute	
																																								THEN	1	
																																				END	grp_start	
																										FROM	log	
																							)	tagged	
															)	numbered	
									GROUP	BY	session_no	
							)	grouped
Consecutive Events: Statistics Start-of-group tagging
Time
30 minutes
4 Levels:
2 with window functions
2 for grouping


What about performance?
2222 2 33 3 44 42 3 4
1
Tolerating Gaps
Example: Comments (new vs. read)
Tolerating Gaps
Example: Comments (new vs. read)
Show comments which…
‣ …are new or
‣ …between two new ones

(show the comment instead of a “load more” button)
Two approaches:
‣ Start-of-group tagging
‣ Row pattern matching
Comments
new commentread comment
SELECT	id,	marker	
		FROM	msg	
							MATCH_RECOGNIZE	(	
								ORDER	BY	id	
								ALL	ROWS	PER	MATCH	
								PATTERN	(	new+	(read	new+)*	)	
								DEFINE	
									new	AS	(marker	=	'X')	
							)	T	
ORDER	BY	id
Tolerating Gaps Row Pattern Matching
Comments
Start with one

or more NEW

comment(s)
SELECT	id,	marker	
		FROM	msg	
							MATCH_RECOGNIZE	(	
								ORDER	BY	id	
								ALL	ROWS	PER	MATCH	
								PATTERN	(	new+	(read	new+)*	)	
								DEFINE	
									new	AS	(marker	=	'X')	
							)	T	
ORDER	BY	id
Tolerating Gaps Row Pattern Matching
Comments
Start with one

or more NEW

comment(s)
Doesn’t match
SELECT	id,	marker	
		FROM	msg	
							MATCH_RECOGNIZE	(	
								ORDER	BY	id	
								ALL	ROWS	PER	MATCH	
								PATTERN	(	new+	(read	new+)*	)	
								DEFINE	
									new	AS	(marker	=	'X')	
							)	T	
ORDER	BY	id
Tolerating Gaps Row Pattern Matching
Comments
Start with one

or more NEW

comment(s)
Two rows
match “new+”
SELECT	id,	marker	
		FROM	msg	
							MATCH_RECOGNIZE	(	
								ORDER	BY	id	
								ALL	ROWS	PER	MATCH	
								PATTERN	(	new+	(read	new+)*	)	
								DEFINE	
									new	AS	(marker	=	'X')	
							)	T	
ORDER	BY	id
Tolerating Gaps Row Pattern Matching
Comments
Match exactly

one row (any)
SELECT	id,	marker	
		FROM	msg	
							MATCH_RECOGNIZE	(	
								ORDER	BY	id	
								ALL	ROWS	PER	MATCH	
								PATTERN	(	new+	(read	new+)*	)	
								DEFINE	
									new	AS	(marker	=	'X')	
							)	T	
ORDER	BY	id
Tolerating Gaps Row Pattern Matching
Comments
SELECT	id,	marker	
		FROM	msg	
							MATCH_RECOGNIZE	(	
								ORDER	BY	id	
								ALL	ROWS	PER	MATCH	
								PATTERN	(	new+	(read	new+)*	)	
								DEFINE	
									new	AS	(marker	=	'X')	
							)	T	
ORDER	BY	id
Tolerating Gaps Row Pattern Matching
Comments
Repeat group

any number

of times
SELECT	id,	marker	
		FROM	msg	
							MATCH_RECOGNIZE	(	
								ORDER	BY	id	
								ALL	ROWS	PER	MATCH	
								PATTERN	(	new+	(read	new+)*	)	
								DEFINE	
									new	AS	(marker	=	'X')	
							)	T	
ORDER	BY	id
Tolerating Gaps Row Pattern Matching
Comments
Repeat group

any number

of times
SELECT	id,	marker	
		FROM	msg	
							MATCH_RECOGNIZE	(	
								ORDER	BY	id	
								ALL	ROWS	PER	MATCH	
								PATTERN	(	new+	(read	new+)*	)	
								DEFINE	
									new	AS	(marker	=	'X')	
							)	T	
ORDER	BY	id
Tolerating Gaps Row Pattern Matching
Comments
First match
SELECT	id,	marker	
		FROM	msg	
							MATCH_RECOGNIZE	(	
								ORDER	BY	id	
								ALL	ROWS	PER	MATCH	
								PATTERN	(	new+	(read	new+)*	)	
								DEFINE	
									new	AS	(marker	=	'X')	
							)	T	
ORDER	BY	id
Tolerating Gaps Row Pattern Matching
CommentsSecond

match
SELECT	id,	marker	
		FROM	msg	
							MATCH_RECOGNIZE	(	
								ORDER	BY	id	
								ALL	ROWS	PER	MATCH	
								PATTERN	(	new+	(read	new+)*	)	
								DEFINE	
									new	AS	(marker	=	'X')	
							)	T	
ORDER	BY	id
Tolerating Gaps Row Pattern Matching
Comments
Tolerating Gaps (also first/last) Row Pattern Matching
Comments
What about
this?
SELECT	id,	marker	
		FROM	msg	
							MATCH_RECOGNIZE	(	
								ORDER	BY	id	
								ALL	ROWS	PER	MATCH	
								PATTERN	(	(^read)?	new+	(read	new+)*	(read$)?	)	
								DEFINE	
									new	AS	(marker	=	'X')	
							)	T	
ORDER	BY	thread_id,	id
Tolerating Gaps (also first/last) Row Pattern Matching
Comments
What about
this?
SELECT	id,	marker	
		FROM	msg	
							MATCH_RECOGNIZE	(	
								ORDER	BY	id	
								ALL	ROWS	PER	MATCH	
								PATTERN	(	(^read)?	new+	(read	new+)*	(read$)?	)	
								DEFINE	
									new	AS	(marker	=	'X')	
							)	T	
ORDER	BY	thread_id,	id
Tolerating Gaps (also first/last) Row Pattern Matching
Comments
Match

“read” at the very

beginning
SELECT	id,	marker	
		FROM	msg	
							MATCH_RECOGNIZE	(	
								ORDER	BY	id	
								ALL	ROWS	PER	MATCH	
								PATTERN	(	(^read)?	new+	(read	new+)*	(read$)?	)	
								DEFINE	
									new	AS	(marker	=	'X')	
							)	T	
ORDER	BY	thread_id,	id
Tolerating Gaps (also first/last) Row Pattern Matching
Comments
Optionally match

“read” at the very

beginning
SELECT	id,	marker	
		FROM	msg	
							MATCH_RECOGNIZE	(	
								ORDER	BY	id	
								ALL	ROWS	PER	MATCH	
								PATTERN	(	(^read)?	new+	(read	new+)*	(read$)?	)	
								DEFINE	
									new	AS	(marker	=	'X')	
							)	T	
ORDER	BY	thread_id,	id
Tolerating Gaps (also first/last) Row Pattern Matching
Comments
Tolerating Gaps lead & lag
Comments
Now, let’s try using window functions
SELECT	t.*	
		FROM	(SELECT	msg.*	
													,	LAG	(	marker,	1,	'X'	)	OVER(	ORDER	BY	id	)	prev_marker	
													,	LEAD(	marker,	1,	'X'	)	OVER(	ORDER	BY	id	)	next_marker	
										FROM	msg	
							)	t	
WHERE	marker	=	'X'	
			OR	(prev_marker	=	'X'	and	next_marker	=	‘X')	
ORDER	BY	id
Tolerating Gaps lead & lag
Comments
I don't care what anything was designed to do,
I care about what it can do.
—Apollo 13, Universal Pictures
Tolerating Gaps (with grouped gaps) Row Pattern Matching
Comments
Tolerating Gaps (with grouped gaps) Row Pattern Matching
Comments
Load 2 more Load 9 more
Tolerating Gaps (with grouped gaps) Row Pattern Matching
Comments
Load 2 more Load 9 more
Tell me how many rows you skipped in between
SELECT	id,	marker,	gap_length	
		FROM	msg	
							MATCH_RECOGNIZE	(	
								ORDER	BY	id	
								MEASURES	
										FINAL	COUNT(more.*)	AS	gap_length	
								ALL	ROWS	PER	MATCH	
								PATTERN	(	(^read)?	new+	(read	new+)*	(read$)?	|	more	{-	more*	-}	)	
								DEFINE	new		AS	(marker		=	'X'),	
															more	AS	(marker	!=	'X')	
							)	T	
ORDER	BY	id
Tolerating Gaps (with grouped gaps) Row Pattern Matching
Comments
Load 2 more Load 9 more
SELECT	id,	marker,	gap_length	
		FROM	msg	
							MATCH_RECOGNIZE	(	
								ORDER	BY	id	
								MEASURES	
										FINAL	COUNT(more.*)	AS	gap_length	
								ALL	ROWS	PER	MATCH	
								PATTERN	(	(^read)?	new+	(read	new+)*	(read$)?	|	more	{-	more*	-}	)	
								DEFINE	new		AS	(marker		=	'X'),	
															more	AS	(marker	!=	'X')	
							)	T	
ORDER	BY	id
Tolerating Gaps (with grouped gaps) Row Pattern Matching
Comments
Load 2 more Load 9 more
Alternative
Match, but
don’t return
SELECT	id,	marker,	gap_length	
		FROM	msg	
							MATCH_RECOGNIZE	(	
								ORDER	BY	id	
								MEASURES	
										FINAL	COUNT(more.*)	AS	gap_length	
								ALL	ROWS	PER	MATCH	
								PATTERN	(	(^read)?	new+	(read	new+)*	(read$)?	|	more	{-	more*	-}	)	
								DEFINE	new		AS	(marker		=	'X'),	
															more	AS	(marker	!=	'X')	
							)	T	
ORDER	BY	id
Tolerating Gaps (with grouped gaps) Row Pattern Matching
Comments
Load 2 more Load 9 more
Consider

all rows
SELECT	id,	marker,	gap_length	
		FROM	msg	
							MATCH_RECOGNIZE	(	
								ORDER	BY	id	
								MEASURES	
										FINAL	COUNT(more.*)	AS	gap_length	
								ALL	ROWS	PER	MATCH	
								PATTERN	(	(^read)?	new+	(read	new+)*	(read$)?	|	more	{-	more*	-}	)	
								DEFINE	new		AS	(marker		=	'X'),	
															more	AS	(marker	!=	'X')	
							)	T	
ORDER	BY	id
Tolerating Gaps (with grouped gaps) Row Pattern Matching
Comments
Load 2 more Load 9 more
Only rows
matched to the
pattern variable
“more”
SELECT	id,	marker	
													,	CASE	WHEN	marker	!=	'X'	AND	gap_length	>	2	
																				THEN	gap_length	
																END	gap_length	
										FROM	(SELECT	t2.*,	COUNT(	CASE	WHEN	marker	!=	'X'	THEN	1	END	)	
																														OVER(	PARTITION	BY	new_counter	)	gap_length			
																		FROM	(SELECT	msg.*,	COUNT(	CASE	WHEN	marker	=	'X'	THEN	1	END	)	
																																							OVER(	ORDER	BY	id	)	new_counter	
																																				,	LAG		(	marker,	1,	'X'	)	
																																							OVER(	ORDER	BY	id	)	prev_marker	
																										FROM	msg	
																							)	t2	
															)	t3	
									WHERE	marker	=	'X'	OR	gap_length	=	1	OR	prev_marker=	'X'	
									ORDER	BY	id
Start-of-group taggingTolerating Gaps (with grouped gaps)
Comments
Top-N Per Group
Example: List 3 most recent comments per topic
Top-N Per Group
Example: List 3 most recent comments per topic
Time
Topic 3
Topic 2
Topic 1
Top-N Per Group
Example: List 3 most recent comments per topic
Three approaches:
‣ lateral sub-query (requires specific indexing)
‣ Row pattern matching (requires 12c)
‣ row_number() window function
Time
Topic 3
Topic 2
Topic 1
SELECT	*	
		FROM	t	
							MATCH_RECOGNIZE(	
								PARTITION	BY	topic	
								ORDER	BY	val	
								MEASURES	
									RUNNING	COUNT(*)	AS	rn	
								ALL	ROWS	PER	MATCH	
								PATTERN	(	^a{1,3}	)	
								DEFINE	
									a	AS	1=1	
							)
Time
Topic 3
Topic 2
Topic 1
Top-N Per Group
per “topic”
processing
SELECT	*	
		FROM	t	
							MATCH_RECOGNIZE(	
								PARTITION	BY	topic	
								ORDER	BY	val	
								MEASURES	
									RUNNING	COUNT(*)	AS	rn	
								ALL	ROWS	PER	MATCH	
								PATTERN	(	^a{1,3}	)	
								DEFINE	
									a	AS	1=1	
							)
Time
Topic 3
Topic 2
Topic 1
Top-N Per Group
Consider rows
up till current
row
SELECT	*	
		FROM	t	
							MATCH_RECOGNIZE(	
								PARTITION	BY	topic	
								ORDER	BY	val	
								MEASURES	
									RUNNING	COUNT(*)	AS	rn	
								ALL	ROWS	PER	MATCH	
								PATTERN	(	^a{1,3}	)	
								DEFINE	
									a	AS	1=1	
							)
Time
Topic 3
Topic 2
Topic 1
Top-N Per Group
1, 2, or 3 times
SELECT	*	
		FROM	t	
							MATCH_RECOGNIZE(	
								PARTITION	BY	topic	
								ORDER	BY	val	
								MEASURES	
									RUNNING	COUNT(*)	AS	rn	
								ALL	ROWS	PER	MATCH	
								PATTERN	(	^a{1,3}	)	
								DEFINE	
									a	AS	1=1	
							)
Time
Topic 3
Topic 2
Topic 1
Top-N Per Group
DEFINE is
non-optional:
Use dummy
SELECT	*	
		FROM	t	
							MATCH_RECOGNIZE(	
								PARTITION	BY	topic	
								ORDER	BY	val	
								MEASURES	
									RUNNING	COUNT(*)	AS	rn	
								ALL	ROWS	PER	MATCH	
								PATTERN	(	^a{1,3}	)	
								DEFINE	
									a	AS	1=1	
							)
SELECT	*		
		FROM	(	
			SELECT	t.*	
								,	ROW_NUMBER()	
											OVER	(PARTITION	BY	topic	
																	ORDER	BY	val)	rn	
				FROM	t	
		)	t	
	WHERE	rn	<=	3
Time
Topic 3
Topic 2
Topic 1
Top-N Per Group
SELECT	*	
		FROM	t	
							MATCH_RECOGNIZE(	
								PARTITION	BY	topic	
								ORDER	BY	val	
								MEASURES	
									RUNNING	COUNT(*)	AS	rn	
								ALL	ROWS	PER	MATCH	
								PATTERN	(	^a+	)	
								DEFINE	
									a	AS	count(*)	<=	3	
							)
SELECT	*		
		FROM	(	
			SELECT	t.*	
								,	ROW_NUMBER()	
											OVER	(PARTITION	BY	topic	
																	ORDER	BY	val)	rn	
				FROM	t	
		)	t	
	WHERE	rn	<=	3
Time
Topic 3
Topic 2
Topic 1
Top-N Per Group
Always
RUNNING
semantic
Time Intervals (non-overlapping)
Example: Bookings are stored as [begin; end[ intervals
Time Intervals (non-overlapping)
Example: Bookings are stored as [begin; end[ intervals
Two problems:
‣ Find free time-slots
‣ Close free time-slots
Time
Busy Busy Busy
Time
SELECT	*	
		FROM	reservations	
							MATCH_RECOGNIZE(	
								ORDER	BY	begin	
								MEASURES	
									a.end			AS	begin,	
									b.begin	AS	end	
								ONE	ROW	PER	MATCH	
								AFTER	MATCH	SKIP	TO	b	
								PATTERN	(	a	b	)	
								DEFINE	
									b	AS	a.end	<	begin	
							)
Time Intervals (non-overlapping)
Time
SELECT	*	
		FROM	reservations	
							MATCH_RECOGNIZE(	
								ORDER	BY	begin	
								MEASURES	
									a.end			AS	begin,	
									b.begin	AS	end	
								ONE	ROW	PER	MATCH	
								AFTER	MATCH	SKIP	TO	b	
								PATTERN	(	a	b	)	
								DEFINE	
									b	AS	a.end	<	begin	
							)
a b
Time Intervals (non-overlapping)
Time
SELECT	*	
		FROM	reservations	
							MATCH_RECOGNIZE(	
								ORDER	BY	begin	
								MEASURES	
									a.end			AS	begin,	
									b.begin	AS	end	
								ONE	ROW	PER	MATCH	
								AFTER	MATCH	SKIP	TO	b	
								PATTERN	(	a	b	)	
								DEFINE	
									b	AS	a.end	<	begin	
							)
a b
Time Intervals (non-overlapping)
Time
SELECT	*	
		FROM	reservations	
							MATCH_RECOGNIZE(	
								ORDER	BY	begin	
								MEASURES	
									a.end			AS	begin,	
									b.begin	AS	end	
								ONE	ROW	PER	MATCH	
								AFTER	MATCH	SKIP	TO	b	
								PATTERN	(	a	b	)	
								DEFINE	
									b	AS	a.end	<	begin	
							)
a b
Time Intervals (non-overlapping)
SELECT	*	
		FROM	reservations	
							MATCH_RECOGNIZE(	
								ORDER	BY	begin	
								MEASURES	
									a.end			AS	begin,	
									b.begin	AS	end	
								ONE	ROW	PER	MATCH	
								AFTER	MATCH	SKIP	TO	b	
								PATTERN	(	a	b	)	
								DEFINE	
									b	AS	a.end	<	begin	
							)
a b
Time Intervals (non-overlapping)
Time
SELECT	*	
		FROM	reservations	
							MATCH_RECOGNIZE(	
								ORDER	BY	begin	
								MEASURES	
									a.end			AS	begin,	
									b.begin	AS	end	
								ONE	ROW	PER	MATCH	
								AFTER	MATCH	SKIP	TO	b	
								PATTERN	(	a	b	)	
								DEFINE	
									b	AS	a.end	<	begin	
							)
a b
Time Intervals (non-overlapping)
Time
SELECT	*	
		FROM	reservations	
							MATCH_RECOGNIZE(	
								ORDER	BY	begin	
								MEASURES	
									a.end			AS	begin,	
									b.begin	AS	end	
								ONE	ROW	PER	MATCH	
								AFTER	MATCH	SKIP	TO	b	
								PATTERN	(	a	b	)	
								DEFINE	
									b	AS	a.end	<	begin	
							)
a b
Time Intervals (non-overlapping)
Time
Default is to

continue AFTER

last matched row
SELECT	*	
		FROM	reservations	
							MATCH_RECOGNIZE(	
								ORDER	BY	begin	
								MEASURES	
									a.end			AS	begin,	
									b.begin	AS	end	
								ONE	ROW	PER	MATCH	
								AFTER	MATCH	SKIP	TO	b	
								PATTERN	(	a	b	)	
								DEFINE	
									b	AS	a.end	<	begin	
							)
Time Intervals (non-overlapping)
Time
SELECT	*	
		FROM	reservations	
							MATCH_RECOGNIZE(	
								ORDER	BY	begin	
								MEASURES	
									a.end			AS	begin,	
									b.begin	AS	end	
								ONE	ROW	PER	MATCH	
								AFTER	MATCH	SKIP	TO	b	
								PATTERN	(	a	b	)	
								DEFINE	
									b	AS	a.end	<	begin	
							)
Time Intervals (non-overlapping)
Time
a b
SELECT	*	
		FROM	reservations	
							MATCH_RECOGNIZE(	
								ORDER	BY	begin	
								MEASURES	
									a.end			AS	begin,	
									b.begin	AS	end	
								ONE	ROW	PER	MATCH	
								AFTER	MATCH	SKIP	TO	b	
								PATTERN	(	a	b	)	
								DEFINE	
									b	AS	a.end	<	begin	
							)
Time Intervals (non-overlapping)
a b
Time
SELECT	*	
		FROM	reservations	
							MATCH_RECOGNIZE(	
								ORDER	BY	begin	
								MEASURES	
									a.end			AS	begin,	
									b.begin	AS	end	
								ONE	ROW	PER	MATCH	
								AFTER	MATCH	SKIP	TO	b	
								PATTERN	(	a	b	)	
								DEFINE	
									b	AS	a.end	<	begin	
							)
Time Intervals (non-overlapping)
Time
SELECT	*	
		FROM	reservations	
							MATCH_RECOGNIZE(	
								ORDER	BY	begin	
								MEASURES	
									a.end			AS	begin,	
									b.begin	AS	end	
								ONE	ROW	PER	MATCH	
								AFTER	MATCH	SKIP	TO	b	
								PATTERN	(	a	b	)	
								DEFINE	
									b	AS	a.end	<	begin	
							)
Time Intervals (non-overlapping)
Time
SELECT	*	
		FROM	reservations	
							MATCH_RECOGNIZE(	
								ORDER	BY	begin	
								MEASURES	
									a.end			AS	begin,	
									b.begin	AS	end	
								ONE	ROW	PER	MATCH	
								AFTER	MATCH	SKIP	TO	b	
								PATTERN	(	a	b	)	
								DEFINE	
									b	AS	a.end	<	begin	
							)
Time Intervals (non-overlapping)
Time
SELECT	*	
		FROM	(SELECT	end	begin	
													,	LEAD(begin)	
															OVER(ORDER	BY	begin)	end	
										FROM	reservations	
							)	
	WHERE	begin	<	end
Row Pattern Matching
Time
Time Intervals (close gaps)
SELECT	b	begin,	e	end,	type	
		FROM	reservations	MATCH_RECOGNIZE(	
																						ORDER	BY	begin	
																						MEASURES	CASE	WHEN	free.begin	IS	NULL	THEN	busy.begin	
																																																												ELSE	busy.end	
																																END	AS	b	
																														,	COALESCE(free.begin,	busy.end)	AS	e	
																														,	CLASSIFIER()	as	type	
																						ALL	ROWS	PER	MATCH	
																						AFTER	MATCH	SKIP	TO	NEXT	ROW	
																						PATTERN	(	busy	free?	)	
																						DEFINE	free	AS	begin	>	PREV(end)	
																				)
Row Pattern Matching
Time
Time Intervals (close gaps)
Always match

one row. Second only

if there is a gap
Busy Free
SELECT	b	begin,	e	end,	type	
		FROM	reservations	MATCH_RECOGNIZE(	
																						ORDER	BY	begin	
																						MEASURES	CASE	WHEN	free.begin	IS	NULL	THEN	busy.begin	
																																																												ELSE	busy.end	
																																END	AS	b	
																														,	COALESCE(free.begin,	busy.end)	AS	e	
																														,	CLASSIFIER()	as	type	
																						ALL	ROWS	PER	MATCH	
																						AFTER	MATCH	SKIP	TO	NEXT	ROW	
																						PATTERN	(	busy	free?	)	
																						DEFINE	free	AS	begin	>	PREV(end)	
																				)
Row Pattern Matching
Time
Time Intervals (close gaps)
Busy Free
SELECT	b	begin,	e	end,	type	
		FROM	reservations	MATCH_RECOGNIZE(	
																						ORDER	BY	begin	
																						MEASURES	CASE	WHEN	free.begin	IS	NULL	THEN	busy.begin	
																																																												ELSE	busy.end	
																																END	AS	b	
																														,	COALESCE(free.begin,	busy.end)	AS	e	
																														,	CLASSIFIER()	as	type	
																						ALL	ROWS	PER	MATCH	
																						AFTER	MATCH	SKIP	TO	NEXT	ROW	
																						PATTERN	(	busy	free?	)	
																						DEFINE	free	AS	begin	>	PREV(end)	
																				)
Row Pattern Matching
Time
Time Intervals (close gaps)
Busy Free
If it is not a

“free” row, pass

row through
SELECT	b	begin,	e	end,	type	
		FROM	reservations	MATCH_RECOGNIZE(	
																						ORDER	BY	begin	
																						MEASURES	CASE	WHEN	free.begin	IS	NULL	THEN	busy.begin	
																																																												ELSE	busy.end	
																																END	AS	b	
																														,	COALESCE(free.begin,	busy.end)	AS	e	
																														,	CLASSIFIER()	as	type	
																						ALL	ROWS	PER	MATCH	
																						AFTER	MATCH	SKIP	TO	NEXT	ROW	
																						PATTERN	(	busy	free?	)	
																						DEFINE	free	AS	begin	>	PREV(end)	
																				)
Row Pattern MatchingTime Intervals (close gaps)
Busy Free
Time
SELECT	b	begin,	e	end,	type	
		FROM	reservations	MATCH_RECOGNIZE(	
																						ORDER	BY	begin	
																						MEASURES	CASE	WHEN	free.begin	IS	NULL	THEN	busy.begin	
																																																												ELSE	busy.end	
																																END	AS	b	
																														,	COALESCE(free.begin,	busy.end)	AS	e	
																														,	CLASSIFIER()	as	type	
																						ALL	ROWS	PER	MATCH	
																						AFTER	MATCH	SKIP	TO	NEXT	ROW	
																						PATTERN	(	busy	free?	)	
																						DEFINE	free	AS	begin	>	PREV(end)	
																				)
Row Pattern MatchingTime Intervals (close gaps)
Busy Free
Time
Free
SELECT	b	begin,	e	end,	type	
		FROM	reservations	MATCH_RECOGNIZE(	
																						ORDER	BY	begin	
																						MEASURES	CASE	WHEN	free.begin	IS	NULL	THEN	busy.begin	
																																																												ELSE	busy.end	
																																END	AS	b	
																														,	COALESCE(free.begin,	busy.end)	AS	e	
																														,	CLASSIFIER()	as	type	
																						ALL	ROWS	PER	MATCH	
																						AFTER	MATCH	SKIP	TO	NEXT	ROW	
																						PATTERN	(	busy	free?	)	
																						DEFINE	free	AS	begin	>	PREV(end)	
																				)
Row Pattern MatchingTime Intervals (close gaps)
Busy Free
Time
Free
SELECT	b	begin,	e	end,	type	
		FROM	reservations	MATCH_RECOGNIZE(	
																						ORDER	BY	begin	
																						MEASURES	CASE	WHEN	free.begin	IS	NULL	THEN	busy.begin	
																																																												ELSE	busy.end	
																																END	AS	b	
																														,	COALESCE(free.begin,	busy.end)	AS	e	
																														,	CLASSIFIER()	as	type	
																						ALL	ROWS	PER	MATCH	
																						AFTER	MATCH	SKIP	TO	NEXT	ROW	
																						PATTERN	(	busy	free?	)	
																						DEFINE	free	AS	begin	>	PREV(end)	
																				)
Row Pattern MatchingTime Intervals (close gaps)
Busy
Time
Free
SELECT	b	begin,	e	end,	type	
		FROM	reservations	MATCH_RECOGNIZE(	
																						ORDER	BY	begin	
																						MEASURES	CASE	WHEN	free.begin	IS	NULL	THEN	busy.begin	
																																																												ELSE	busy.end	
																																END	AS	b	
																														,	COALESCE(free.begin,	busy.end)	AS	e	
																														,	CLASSIFIER()	as	type	
																						ALL	ROWS	PER	MATCH	
																						AFTER	MATCH	SKIP	TO	NEXT	ROW	
																						PATTERN	(	busy	free?	)	
																						DEFINE	free	AS	begin	>	PREV(end)	
																				)
Row Pattern MatchingTime Intervals (close gaps)
Busy
Time
Busy FreeFree
SELECT	b	begin,	e	end,	type	
		FROM	reservations	MATCH_RECOGNIZE(	
																						ORDER	BY	begin	
																						MEASURES	CASE	WHEN	free.begin	IS	NULL	THEN	busy.begin	
																																																												ELSE	busy.end	
																																END	AS	b	
																														,	COALESCE(free.begin,	busy.end)	AS	e	
																														,	CLASSIFIER()	as	type	
																						ALL	ROWS	PER	MATCH	
																						AFTER	MATCH	SKIP	TO	NEXT	ROW	
																						PATTERN	(	busy	free?	)	
																						DEFINE	free	AS	begin	>	PREV(end)	
																				)
Row Pattern MatchingTime Intervals (close gaps)
Busy Busy FreeFree
Time
Free
SELECT	b	begin,	e	end,	type	
		FROM	reservations	MATCH_RECOGNIZE(	
																						ORDER	BY	begin	
																						MEASURES	CASE	WHEN	free.begin	IS	NULL	THEN	busy.begin	
																																																												ELSE	busy.end	
																																END	AS	b	
																														,	COALESCE(free.begin,	busy.end)	AS	e	
																														,	CLASSIFIER()	as	type	
																						ALL	ROWS	PER	MATCH	
																						AFTER	MATCH	SKIP	TO	NEXT	ROW	
																						PATTERN	(	busy	free?	)	
																						DEFINE	free	AS	begin	>	PREV(end)	
																				)
Row Pattern MatchingTime Intervals (close gaps)
Busy BusyFree
Time
Free Busy
SELECT	begin,	end,	type	
		FROM	(SELECT	end	begin	
													,	LEAD(begin)	OVER(ORDER	BY	begin)	end	
													,	'FREE'	type	
										FROM	reservations	
							)	
	WHERE	begin	<	end	
	UNION	ALL	
SELECT	begin	
					,	end	
					,	'BUSY'	type	
		FROM	reservations
Busy BusyFree
Time
Free Busy
Time Intervals (close gaps) Window function
Endless possibilitesRow Pattern Matching
GROUP	BY

			➡	ONE	ROW	PER	MATCH
OVER	()

			➡	ALL	ROWS	PER	MATCH,	FINAL,	RUNNING
HAVING,	WHERE

			➡	PATTERN (unmatched, suppressed {-	…	-})
Mixing GROUP	BY and OVER()

			➡	ALL	ROWS	PER	MATCH + all-but-one rows suppressed
Data-driven match length 

			➡ SUM, COUNT, … in DEFINE
Duplicating rows (to some extend)

			➡ ALL	ROWS	PER	MATCH + AFTER	MATCH	SKIP	TO	…
What if I told you,
you can also
find patterns?
Not/Barely covered in this presentationRow Pattern Matching
‣ Reluctant (non-greedy) matching
‣ SHOW/OMIT	EMPTY	MATCHES

WITH	UNMATCHED	ROWS
‣ SUBSET (define pattern-vars for use in MEASURES,

							DEFINE and AFTER	MATCH	SKIP	TO)
‣ PREV, NEXT, FIRST, LAST (with some nesting!)
‣MATCH_NUMBER()
Obstacles and Issues (as of 12r1)Row Pattern Matching
‣ JDBC !!!!!

Tokens ?, {, } have special meaning in JDBC.

You have to escape them using {	...	}

https://docs.oracle.com/database/121/JJDBC/apxref.htm#CHECHCJH
‣ ORA-62513: Quantified subpatterns that can have empty matches are not yet supported



PATTERN	(	x	(a*	b*)+	y	)
‣ ORA-62512: This aggregate is not yet supported in MATCH_RECOGNIZE clause.

(only COUNT, SUM, AVG, MIN, and MAX)
About @MarkusWinand
‣Training for Developers
‣ SQL Performance (Indexing)
‣ Modern SQL
‣ On-Site or Online
‣SQL Tuning
‣ Index-Redesign
‣ Query Improvements
‣ On-Site or Online
http://winand.at/
About @MarkusWinand
€0,-
€10-30
sql-performance-explained.com
About @MarkusWinand
@ModernSQL
http://modern-sql.com

Row Pattern Matching in SQL:2016