Detecting Patterns in Event Streams with Flink SQL
Dawid Wysakowicz (@dwysakowicz), Software Engineer
© 2019 Ververica
About Ververica
• Original creators of Apache Flink®
• Complete stream processing infrastructure
• Ververica Platform: ververica.com/download
Why do people like SQL?
• Well known interface: "lingua franca of data processing"
• No programming required - easier to learn
• Declarative way to express your query: WHAT, not HOW
• Out-of-the-box optimization
• Reusability
‒ built-in functions
‒ user defined functions
SQL on Streams
• Unified semantics for bounded and unbounded input
‒ Same query, same input, same result
• Stream is converted into a dynamic table
• Continuous query on the dynamic table produces a result table (itself a dynamic table)
• Dynamic table is converted into a stream
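A minimal Python sketch of the "same query, same input, same result" idea (plain Python, not Flink code): a per-key count is evaluated once over a bounded input and, alternatively, maintained as a continuously updated result table while the same rows arrive one by one. The function names and the sample rows are illustrative assumptions.

```python
from collections import Counter


def batch_count(rows):
    """Bounded input: evaluate the query once over all rows."""
    return dict(Counter(rows))


def continuous_count(stream):
    """Unbounded input: update the result table per row and emit each change."""
    table = {}
    for key in stream:
        table[key] = table.get(key, 0) + 1
        yield key, table[key]  # changelog entry: the new count for this key


rows = ["a", "b", "a", "c", "a"]
changelog = list(continuous_count(rows))
# Applying the changelog in order (later entries overwrite earlier ones)
# reconstructs the result table.
final_table = dict(changelog)
```

Once all rows have been consumed, the table rebuilt from the changelog equals the batch result.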
Flink SQL in Action
by Fabian Hueske (link)
The New York Taxi Rides Data Set
• The New York City Taxi & Limousine Commission provides a public data set about past taxi rides in New York City
• We can derive a streaming table from the data
• Table: TaxiRides
rideId:   BIGINT    // ID of the taxi ride
taxiId:   BIGINT    // ID of the taxi cab
driverId: BIGINT    // ID of the driver
isStart:  BOOLEAN   // flag for pick-up (true) or drop-off (false) event
lon:      DOUBLE    // longitude of pick-up or drop-off location
lat:      DOUBLE    // latitude of pick-up or drop-off location
rowTime:  TIMESTAMP // time of pick-up or drop-off event
psgCnt:   INT       // number of passengers
Number of rides per 30 minutes per area
[Figure: number of rides over time]
SQL Example
SELECT
toAreaId(lat, lon) as cellId,
COUNT(distinct rideId) as rideCount,
TUMBLE_ROWTIME(rowTime, INTERVAL '30' minute) AS rowTime,
TUMBLE_START(rowTime, INTERVAL '30' minute) AS startTime,
TUMBLE_END(rowTime, INTERVAL '30' minute) AS endTime
FROM
TaxiRides
GROUP BY
toAreaId(lat, lon),
TUMBLE(rowTime, INTERVAL '30' minute)
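The grouping logic of this query can be sketched in plain Python. This is only an illustration under simplifying assumptions: the area id is taken as already computed (the `cell_id` field stands in for `toAreaId(lat, lon)`), and `rowTime` is given as minutes since midnight. Names such as `tumble_counts` are hypothetical.

```python
from collections import defaultdict

WINDOW_MINUTES = 30


def tumble_counts(rides):
    """rides: iterable of (cell_id, ride_id, minute).

    Returns {(cell_id, window_start_minute): number of distinct ride ids}.
    """
    windows = defaultdict(set)
    for cell_id, ride_id, minute in rides:
        # A tumbling window assigns each row to exactly one fixed-size window.
        window_start = minute - minute % WINDOW_MINUTES
        windows[(cell_id, window_start)].add(ride_id)  # COUNT(DISTINCT rideId)
    return {key: len(ride_ids) for key, ride_ids in windows.items()}


rides = [(7, 1, 5), (7, 1, 20), (7, 2, 29), (7, 3, 30), (8, 4, 31)]
counts = tumble_counts(rides)
```

Ride 1 appears twice in the first window of cell 7 but is counted once, mirroring the `distinct` in the query.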
SQL Support in Flink 1.8
• SELECT FROM WHERE
• GROUP BY / HAVING
‒ Non-windowed, TUMBLE, HOP, SESSION windows
• JOIN
‒ Windowed INNER, LEFT / RIGHT / FULL OUTER JOIN
‒ Non-windowed INNER / LEFT / RIGHT / FULL OUTER JOIN
‒ [streaming only] Temporal Tables
• Scalar, aggregation, table-valued UDFs
• Many built-in functions
• [streaming only] OVER / WINDOW
‒ UNBOUNDED / BOUNDED PRECEDING
• [streaming only] support of MATCH_RECOGNIZE
What is hard with SQL?
• Find rides with mid-stops
Mid stops
// The pattern expressed with the FlinkCEP Java API.
// Slide-level pseudocode: isStart and getRideId() stand for field access on the row.
Pattern<Row, ?> pattern = Pattern.<Row>begin("S").where(
    (row) -> {
        return row.isStart == true;
    }).next("E").where((row) -> {
        return row.isStart == true;
    });
CEP.pattern(input.keyBy("driverId"), pattern)
    .flatSelect(
        new PatternFlatSelectFunction<Row, Row>() {
            @Override
            public void flatSelect(
                    Map<String, List<Row>> pattern,
                    Collector<Row> out) throws Exception {
                // emit the ride id of the first start event
                out.collect(
                    pattern.get("S").get(0).getRideId()
                );
            }
        }
    );
MATCH_RECOGNIZE
SQL:2016 extension
Common use-cases
• stock market analysis
• customer behaviour
• tracking money laundering
• service quality
• network intrusion detection
Position in a SQL Query
SELECT ...
FROM ...
MATCH_RECOGNIZE
(...)
WHERE ...
GROUP BY ...
Rides with mid-stops
SELECT *
FROM TaxiRides
MATCH_RECOGNIZE (
  PARTITION BY driverId
  ORDER BY rowTime
  MEASURES
    S.rideId AS sRideId
  AFTER MATCH SKIP PAST LAST ROW
  PATTERN (S E)
  DEFINE
    S AS S.isStart = true,
    E AS E.isStart = true
)
● PARTITION BY driverId: partition the data by given field = keyBy
Rides with mid-stops: ORDER BY rowTime
● specify order
● primary order = event or processing time
[Figure: events numbered 1 3 2 5 4]
Rides with mid-stops: PATTERN (S E)
● construct pattern: S followed by E
PATTERN: Defining a Pattern
• Concatenation:
‒ All rows of a pattern must be mapped to pattern variables
‒ A pattern like (A B) means that the contiguity is strict between A and B
‒ In other words: No rows between A and B
• Quantifiers
‒ Number of rows mapped to a pattern variable
    *                                 0 or more rows
    +                                 1 or more rows
    ?                                 0 or 1 rows
    { n }, { n, }, { n, m }, { , m }  define intervals (inclusive)
    B*?                               perform mapping reluctant instead of greedy (greedy is the default behavior)
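Row-pattern quantifiers behave much like regular-expression quantifiers, so greedy versus reluctant mapping can be tried out with Python's `re` module. This is an analogy only: here every row is pre-labelled with one letter, whereas MATCH_RECOGNIZE classifies rows by evaluating the DEFINE conditions. The helper `first_match` and the sample string are assumptions made for illustration.

```python
import re

# One letter per row: the pattern variable the row would qualify for.
rows = "ABBBC"


def first_match(pattern):
    """Return the rows matched when matching starts at the first row, or None."""
    match = re.match(pattern, rows)
    return match.group(0) if match else None


greedy = first_match(r"AB*")              # B* takes as many rows as possible
reluctant = first_match(r"AB*?")          # B*? takes as few rows as possible
reluctant_then_c = first_match(r"AB*?C")  # reluctant, but C must still match
interval = first_match(r"AB{2,3}")        # between 2 and 3 rows (inclusive)
too_many = first_match(r"AB{4,}")         # at least 4 rows: no match here
```

Strict contiguity shows up as well: the pattern `AC` does not match these rows, because B rows lie between A and C.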
Rides with mid-stops: MEASURES S.rideId AS sRideId
● extract measures from matched sequence
DEFINE/MEASURES: Define/Access Variables
• MEASURES
‒ Defines what will be included in the output of a matching pattern
‒ Project columns and define expressions for evaluation
‒ Number of produced rows depends on the output mode. Currently only ONE ROW PER MATCH is supported: one output summary row per match
‒ Output schema: [partitioning columns] + [measures columns]
• DEFINE
‒ Conditions that a row has to fulfill to be classified to the corresponding variable
‒ A variable without a condition evaluates to TRUE for every row
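To make the clauses concrete, the "rides with mid-stops" query can be imitated in a few lines of plain Python. This is a sketch under assumptions, not how Flink evaluates the query: events are dicts carrying the TaxiRides field names, everything fits in memory, and the function name `mid_stops` is made up.

```python
from collections import defaultdict


def mid_stops(events):
    """Return one (driverId, sRideId) row per match of PATTERN (S E)."""
    by_driver = defaultdict(list)
    for event in events:  # PARTITION BY driverId
        by_driver[event["driverId"]].append(event)

    output = []
    for driver_id, rows in by_driver.items():
        rows.sort(key=lambda row: row["rowTime"])  # ORDER BY rowTime
        i = 0
        while i + 1 < len(rows):
            s_row, e_row = rows[i], rows[i + 1]  # strict contiguity: neighbours
            if s_row["isStart"] and e_row["isStart"]:  # DEFINE S, E
                # ONE ROW PER MATCH: partitioning column + measure
                output.append((driver_id, s_row["rideId"]))
                i += 2  # AFTER MATCH SKIP PAST LAST ROW
            else:
                i += 1
    return output


events = [
    {"driverId": 1, "rideId": 10, "isStart": True, "rowTime": 1},
    {"driverId": 2, "rideId": 20, "isStart": True, "rowTime": 1},
    {"driverId": 1, "rideId": 11, "isStart": True, "rowTime": 2},
    {"driverId": 2, "rideId": 20, "isStart": False, "rowTime": 2},
    {"driverId": 1, "rideId": 10, "isStart": False, "rowTime": 3},
    {"driverId": 1, "rideId": 11, "isStart": False, "rowTime": 4},
]
result = mid_stops(events)
```

Driver 1 starts ride 11 before ride 10 has ended, so one row is produced for ride 10; driver 2 has no mid-stop.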
Multi-Stop
Rides with more than one mid-stop
SELECT *
FROM TaxiRides
MATCH_RECOGNIZE (
PARTITION BY driverId
ORDER BY rowTime
MEASURES
S.rideId as sRideId,
COUNT(M.rideId) as countMidStops
AFTER MATCH SKIP PAST LAST ROW
PATTERN (S M{2,} E)
DEFINE
S AS S.isStart = true,
M AS M.rideId <> S.rideId,
E AS E.isStart = false AND E.rideId = S.rideId
)
Rush (peak) hours - V shape
[Figure: number of rides over time]
Statistics per Area
CREATE VIEW RidesInArea AS
SELECT
toAreaId(lat, lon) as cellId,
COUNT(distinct rideId) as rideCount,
TUMBLE_ROWTIME(rowTime, INTERVAL '30' minute) AS rowTime,
TUMBLE_START(rowTime, INTERVAL '30' minute) AS startTime,
TUMBLE_END(rowTime, INTERVAL '30' minute) AS endTime
FROM
TaxiRides
GROUP BY
toAreaId(lat, lon),
TUMBLE(rowTime, INTERVAL '30' minute)
Rush (peak) hours
SELECT * FROM RidesInArea
MATCH_RECOGNIZE(
PARTITION BY cellId
ORDER BY rowTime
MEASURES
FIRST(UP.startTime) as rushStart,
LAST(DOWN.endTime) AS rushEnd,
SUM(UP.rideCount) + SUM(DOWN.rideCount) AS rideSum
AFTER MATCH SKIP PAST LAST ROW
PATTERN (UP{4,} DOWN{2,} E)
DEFINE
UP AS UP.rideCount > LAST(UP.rideCount, 1) or LAST(UP.rideCount, 1) IS NULL,
DOWN AS DOWN.rideCount < LAST(DOWN.rideCount, 1) OR
LAST(DOWN.rideCount, 1) IS NULL,
E AS E.rideCount > LAST(DOWN.rideCount)
)
[Figure: number of rides over time]
● FROM RidesInArea: use previous table/view
Rush (peak) hours: MATCH_RECOGNIZE over RidesInArea
● apply match to result of the inner query
Rush (peak) hours: LAST(UP.rideCount, 1), FIRST(UP.startTime), LAST(DOWN.endTime)
● access elements of looping pattern
Rush (peak) hours: SUM(UP.rideCount) + SUM(DOWN.rideCount)
● aggregate values from looping patterns
DEFINE/MEASURES: Define/Access Variables
• Pattern Variable Referencing
‒ Access to the set of rows mapped to a particular pattern variable (so far)
‒ A.price = set of rows mapped so far to A plus the current row if we try to match
the current row to A
‒ If A is a set, the last row is selected for scalar operations.
‒ If no pattern variable is specified (e.g. SUM(price)), the default pattern variable
"*" is used. This set contains all rows matched for pattern + current row.
PATTERN (A B+)
DEFINE
A AS A.price > 10,
B AS B.price > A.price AND
SUM(price) < 100 AND SUM(B.price) < 80
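As a rough illustration of these referencing rules, the DEFINE conditions of the example can be evaluated by hand in Python. The sketch assumes the match is anchored at the first row and that rows are plain prices; because B+ is the last variable, a simple greedy loop without backtracking is enough. The function name is hypothetical.

```python
def match_a_b_plus(prices):
    """Greedy match of PATTERN (A B+) from the first row.

    Returns (a_price, b_prices) or None if the pattern does not match.
    """
    if not prices or not prices[0] > 10:  # A AS A.price > 10
        return None
    a_price, b_prices = prices[0], []
    for price in prices[1:]:
        # B's row set while testing the current row: rows mapped so far + current row
        candidate_b = b_prices + [price]
        # default variable "*": all rows matched so far + current row
        all_rows = [a_price] + candidate_b
        if price > a_price and sum(all_rows) < 100 and sum(candidate_b) < 80:
            b_prices = candidate_b
        else:
            break
    return (a_price, b_prices) if b_prices else None  # B+ needs at least one row


matched = match_a_b_plus([12, 20, 30, 40])
```

The row with price 40 is rejected because the sums would reach 102 and 90, so the match ends after the rows 20 and 30.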
DEFINE/MEASURES: Define/Access Variables
• Pattern Variable Navigation
‒ Logical offsets enable navigation within the events that were mapped to a
particular pattern variable.
‒ FIRST(variable.field, n) n starts from the beginning
‒ LAST(variable.field, n) n starts from the end
‒ Expressions on same "list" are supported: LAST(A.price * A.tax)
PATTERN (A B+)
DEFINE
A AS A.price > 10,
B AS (LAST(B.price, 1) IS NULL OR B.price > LAST(B.price, 1)) AND
(LAST(B.price, 2) IS NULL OR B.price > 2 * LAST(B.price, 2))
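The logical offsets can be mimicked with two small Python helpers. This is a sketch, assuming the rows mapped to a variable are kept in a list with the current row as the last element, and that an offset outside the list yields None in place of SQL NULL. The helper names mirror the SQL functions but are otherwise made up.

```python
def first(rows, n=0):
    """FIRST(variable.field, n): n counts from the beginning."""
    return rows[n] if n < len(rows) else None


def last(rows, n=0):
    """LAST(variable.field, n): n counts from the end."""
    return rows[-1 - n] if n < len(rows) else None


def b_condition(b_prices):
    """Condition of B from the example; b_prices ends with the current row."""
    price = last(b_prices)
    prev1, prev2 = last(b_prices, 1), last(b_prices, 2)
    return (prev1 is None or price > prev1) and (
        prev2 is None or price > 2 * prev2
    )


accepted = b_condition([10, 12, 25])  # 25 > 12 and 25 > 2 * 10
rejected = b_condition([10, 12, 15])  # 15 > 12 but 15 <= 2 * 10
```

The `IS NULL` branches are what let the first and second row of B pass, since they have no predecessor at offset 1 or 2.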
Driver fatigue
[Figure: ride durations of a driver summed up over time; Sum: 5h, then Sum: 7h > 6h]
Rides durations
CREATE VIEW RidesDurations AS
SELECT * FROM TaxiRides
MATCH_RECOGNIZE (
PARTITION BY driverId
ORDER BY rowTime
MEASURES
rideId as rideId,
timeDiff(S.rowTime, E.rowTime) as rideDuration,
MATCH_ROWTIME() as rowTime,
S.rowTime as startTime,
E.rowTime AS endTime
AFTER MATCH SKIP PAST LAST ROW
PATTERN (S E)
DEFINE
S AS S.isStart = true,
E AS E.isStart = false AND E.rideId = S.rideId
);
● MATCH_ROWTIME(): time attribute of the match
Drivers fatigue
SELECT *
FROM RidesDurations
MATCH_RECOGNIZE (
PARTITION BY driverId
ORDER BY rowTime
MEASURES
formatDuration(SUM(rideDuration)) as totalRideDuration,
FIRST(R.startTime) as startTime,
LAST(R.endTime) as endTime
AFTER MATCH SKIP PAST LAST ROW
PATTERN (R+? E) WITHIN INTERVAL '1' DAY
DEFINE
E AS SUM(rideDuration) >= durationOfHours(2)
);
● Aggregation in condition: E AS SUM(rideDuration) >= durationOfHours(2)
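The effect of the reluctant quantifier combined with an aggregate in the condition can be sketched in Python. Several simplifications are assumed: durations are plain non-negative numbers of hours for a single driver in time order, the WITHIN INTERVAL '1' DAY constraint is left out, and the function name and sample values are invented for illustration.

```python
def fatigue_matches(durations, limit):
    """Return (start_index, end_index, total) per match of PATTERN (R+? E)."""
    matches, start = [], 0
    while start < len(durations):
        total = durations[start]  # the first row is always R (R+ needs one row)
        end = None
        for i in range(start + 1, len(durations)):
            total += durations[i]  # SUM over rows matched so far + current row
            if total >= limit:  # E's condition holds; reluctant R+? stops here
                end = i
                break
        if end is None:
            # No row can close the match. With non-negative durations a later
            # start cannot succeed either, so the search ends.
            break
        matches.append((start, end, total))
        start = end + 1  # AFTER MATCH SKIP PAST LAST ROW
    return matches


matches = fatigue_matches([1.0, 0.5, 1.0, 2.0, 0.5, 1.5, 0.25], limit=2.0)
```

Each match ends at the first ride whose running total reaches the limit, which is what the reluctant `R+?` expresses; a greedy `R+` would try to absorb as many rides as possible first.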
Feature set of MATCH_RECOGNIZE
• Quantifiers support:
‒ + (one or more), * (zero or more), {x,y} (times)
‒ greedy (default), ? (reluctant)
   • with some restrictions (not working for the last pattern variable)
• After Match Skip
‒ SKIP TO FIRST / LAST, SKIP PAST LAST ROW, SKIP TO NEXT ROW
• Aggregates (since 1.8)
• Time attribute extraction (since 1.8)
• Not supported:
‒ alternation (|), PERMUTE, exclusion '{- -}'
AFTER MATCH SKIP: Continuation strategy
• Location where to start a new matching procedure after a complete
match was found
• Thus, also specifies how many matches a single event can belong to
Strategy                Next matching procedure starts at
SKIP PAST LAST ROW      next row after the last row of the current match
SKIP TO NEXT ROW        next row after the starting row of the match
SKIP TO LAST variable   last row that is mapped to the specified pattern variable
SKIP TO FIRST variable  first row that is mapped to the specified pattern variable
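How the strategy changes the number of matches an event can belong to can be seen in a small Python sketch. It assumes the two-row pattern (S E) from the mid-stops example, with both variables defined as isStart = true, and only covers the two row-based strategies; `find_pairs` is a hypothetical helper.

```python
def find_pairs(is_start, skip):
    """Match PATTERN (S E) over a list of isStart flags.

    skip is "PAST LAST ROW" or "TO NEXT ROW".
    Returns the (S index, E index) of every match.
    """
    matches, i = [], 0
    while i + 1 < len(is_start):
        if is_start[i] and is_start[i + 1]:
            matches.append((i, i + 1))
            # Resume after the last matched row, or right after the first one.
            i += 2 if skip == "PAST LAST ROW" else 1
        else:
            i += 1
    return matches


flags = [True, True, True, False]
past_last = find_pairs(flags, "PAST LAST ROW")
to_next = find_pairs(flags, "TO NEXT ROW")
```

With SKIP TO NEXT ROW the second start event belongs to two matches, once as E and once as S; with SKIP PAST LAST ROW every event belongs to at most one match.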
Summary
● opens SQL to new use cases that depend on the order of events
● enables reuse of what SQL already offers (e.g. built-in and user defined functions)
● Available since Flink 1.7
● Try it yourself: https://github.com/ververica/sql-training/wiki
Thank you!
dawid@ververica.com www.ververica.com @VervericaData

More Related Content

What's hot

Apache Spark Data Source V2 with Wenchen Fan and Gengliang Wang
Apache Spark Data Source V2 with Wenchen Fan and Gengliang WangApache Spark Data Source V2 with Wenchen Fan and Gengliang Wang
Apache Spark Data Source V2 with Wenchen Fan and Gengliang WangDatabricks
 
Apache Kafka – (Pattern and) Anti-Pattern
Apache Kafka – (Pattern and) Anti-PatternApache Kafka – (Pattern and) Anti-Pattern
Apache Kafka – (Pattern and) Anti-Patternconfluent
 
WSO2 Charon
WSO2 CharonWSO2 Charon
WSO2 CharonHasiniG
 
Apache Kudu: Technical Deep Dive


Apache Kudu: Technical Deep Dive

Apache Kudu: Technical Deep Dive


Apache Kudu: Technical Deep Dive

Cloudera, Inc.
 
Redis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen SinkRedis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen SinkDatabricks
 
Apache Flink 101 - the rise of stream processing and beyond
Apache Flink 101 - the rise of stream processing and beyondApache Flink 101 - the rise of stream processing and beyond
Apache Flink 101 - the rise of stream processing and beyondBowen Li
 
Using the New Apache Flink Kubernetes Operator in a Production Deployment
Using the New Apache Flink Kubernetes Operator in a Production DeploymentUsing the New Apache Flink Kubernetes Operator in a Production Deployment
Using the New Apache Flink Kubernetes Operator in a Production DeploymentFlink Forward
 
[FFE19] Build a Flink AI Ecosystem
[FFE19] Build a Flink AI Ecosystem[FFE19] Build a Flink AI Ecosystem
[FFE19] Build a Flink AI EcosystemJiangjie Qin
 
Introduction to Kafka Streams
Introduction to Kafka StreamsIntroduction to Kafka Streams
Introduction to Kafka StreamsGuozhang Wang
 
CI/CD Templates: Continuous Delivery of ML-Enabled Data Pipelines on Databricks
CI/CD Templates: Continuous Delivery of ML-Enabled Data Pipelines on DatabricksCI/CD Templates: Continuous Delivery of ML-Enabled Data Pipelines on Databricks
CI/CD Templates: Continuous Delivery of ML-Enabled Data Pipelines on DatabricksDatabricks
 
Solving Enterprise Data Challenges with Apache Arrow
Solving Enterprise Data Challenges with Apache ArrowSolving Enterprise Data Challenges with Apache Arrow
Solving Enterprise Data Challenges with Apache ArrowWes McKinney
 
CI/CD with an Idempotent Kafka Producer & Consumer | Kafka Summit London 2022
CI/CD with an Idempotent Kafka Producer & Consumer | Kafka Summit London 2022CI/CD with an Idempotent Kafka Producer & Consumer | Kafka Summit London 2022
CI/CD with an Idempotent Kafka Producer & Consumer | Kafka Summit London 2022HostedbyConfluent
 
Apache Kafka Architecture & Fundamentals Explained
Apache Kafka Architecture & Fundamentals ExplainedApache Kafka Architecture & Fundamentals Explained
Apache Kafka Architecture & Fundamentals Explainedconfluent
 
Data Loss and Duplication in Kafka
Data Loss and Duplication in KafkaData Loss and Duplication in Kafka
Data Loss and Duplication in KafkaJayesh Thakrar
 
Towards Flink 2.0: Unified Batch & Stream Processing - Aljoscha Krettek, Verv...
Towards Flink 2.0: Unified Batch & Stream Processing - Aljoscha Krettek, Verv...Towards Flink 2.0: Unified Batch & Stream Processing - Aljoscha Krettek, Verv...
Towards Flink 2.0: Unified Batch & Stream Processing - Aljoscha Krettek, Verv...Flink Forward
 
promgen - prometheus managemnet tool / simpleclient_java hacks @ Prometheus c...
promgen - prometheus managemnet tool / simpleclient_java hacks @ Prometheus c...promgen - prometheus managemnet tool / simpleclient_java hacks @ Prometheus c...
promgen - prometheus managemnet tool / simpleclient_java hacks @ Prometheus c...Tokuhiro Matsuno
 
Extending Complex Event Processing to Graph-structured Information
Extending Complex Event Processing to Graph-structured InformationExtending Complex Event Processing to Graph-structured Information
Extending Complex Event Processing to Graph-structured InformationAntonio Vallecillo
 

What's hot (20)

Apache Spark Data Source V2 with Wenchen Fan and Gengliang Wang
Apache Spark Data Source V2 with Wenchen Fan and Gengliang WangApache Spark Data Source V2 with Wenchen Fan and Gengliang Wang
Apache Spark Data Source V2 with Wenchen Fan and Gengliang Wang
 
Apache Kafka – (Pattern and) Anti-Pattern
Apache Kafka – (Pattern and) Anti-PatternApache Kafka – (Pattern and) Anti-Pattern
Apache Kafka – (Pattern and) Anti-Pattern
 
WSO2 Charon
WSO2 CharonWSO2 Charon
WSO2 Charon
 
Apache Kudu: Technical Deep Dive


Apache Kudu: Technical Deep Dive

Apache Kudu: Technical Deep Dive


Apache Kudu: Technical Deep Dive


 
Redis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen SinkRedis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
 
Apache Flink 101 - the rise of stream processing and beyond
Apache Flink 101 - the rise of stream processing and beyondApache Flink 101 - the rise of stream processing and beyond
Apache Flink 101 - the rise of stream processing and beyond
 
Flink Streaming
Flink StreamingFlink Streaming
Flink Streaming
 
Using the New Apache Flink Kubernetes Operator in a Production Deployment
Using the New Apache Flink Kubernetes Operator in a Production DeploymentUsing the New Apache Flink Kubernetes Operator in a Production Deployment
Using the New Apache Flink Kubernetes Operator in a Production Deployment
 
[FFE19] Build a Flink AI Ecosystem
[FFE19] Build a Flink AI Ecosystem[FFE19] Build a Flink AI Ecosystem
[FFE19] Build a Flink AI Ecosystem
 
Introduction to Kafka Streams
Introduction to Kafka StreamsIntroduction to Kafka Streams
Introduction to Kafka Streams
 
CI/CD Templates: Continuous Delivery of ML-Enabled Data Pipelines on Databricks
CI/CD Templates: Continuous Delivery of ML-Enabled Data Pipelines on DatabricksCI/CD Templates: Continuous Delivery of ML-Enabled Data Pipelines on Databricks
CI/CD Templates: Continuous Delivery of ML-Enabled Data Pipelines on Databricks
 
Solving Enterprise Data Challenges with Apache Arrow
Solving Enterprise Data Challenges with Apache ArrowSolving Enterprise Data Challenges with Apache Arrow
Solving Enterprise Data Challenges with Apache Arrow
 
Memory Analyzer Tool (MAT)
Memory Analyzer Tool (MAT) Memory Analyzer Tool (MAT)
Memory Analyzer Tool (MAT)
 
CI/CD with an Idempotent Kafka Producer & Consumer | Kafka Summit London 2022
CI/CD with an Idempotent Kafka Producer & Consumer | Kafka Summit London 2022CI/CD with an Idempotent Kafka Producer & Consumer | Kafka Summit London 2022
CI/CD with an Idempotent Kafka Producer & Consumer | Kafka Summit London 2022
 
Apache Kafka Architecture & Fundamentals Explained
Apache Kafka Architecture & Fundamentals ExplainedApache Kafka Architecture & Fundamentals Explained
Apache Kafka Architecture & Fundamentals Explained
 
Data Loss and Duplication in Kafka
Data Loss and Duplication in KafkaData Loss and Duplication in Kafka
Data Loss and Duplication in Kafka
 
Towards Flink 2.0: Unified Batch & Stream Processing - Aljoscha Krettek, Verv...
Towards Flink 2.0: Unified Batch & Stream Processing - Aljoscha Krettek, Verv...Towards Flink 2.0: Unified Batch & Stream Processing - Aljoscha Krettek, Verv...
Towards Flink 2.0: Unified Batch & Stream Processing - Aljoscha Krettek, Verv...
 
promgen - prometheus managemnet tool / simpleclient_java hacks @ Prometheus c...
promgen - prometheus managemnet tool / simpleclient_java hacks @ Prometheus c...promgen - prometheus managemnet tool / simpleclient_java hacks @ Prometheus c...
promgen - prometheus managemnet tool / simpleclient_java hacks @ Prometheus c...
 
Extending Complex Event Processing to Graph-structured Information
Extending Complex Event Processing to Graph-structured InformationExtending Complex Event Processing to Graph-structured Information
Extending Complex Event Processing to Graph-structured Information
 
Kafka presentation
Kafka presentationKafka presentation
Kafka presentation
 

Similar to Webinar: Detecting row patterns with Flink SQL - Dawid Wysakowicz

Flink Forward Berlin 2018: Dawid Wysakowicz - "Detecting Patterns in Event St...
Flink Forward Berlin 2018: Dawid Wysakowicz - "Detecting Patterns in Event St...Flink Forward Berlin 2018: Dawid Wysakowicz - "Detecting Patterns in Event St...
Flink Forward Berlin 2018: Dawid Wysakowicz - "Detecting Patterns in Event St...Flink Forward
 
Webinar: Secrets of ClickHouse Query Performance, by Robert Hodges
Webinar: Secrets of ClickHouse Query Performance, by Robert HodgesWebinar: Secrets of ClickHouse Query Performance, by Robert Hodges
Webinar: Secrets of ClickHouse Query Performance, by Robert HodgesAltinity Ltd
 
Webinar slides: MORE secrets of ClickHouse Query Performance. By Robert Hodge...
Webinar slides: MORE secrets of ClickHouse Query Performance. By Robert Hodge...Webinar slides: MORE secrets of ClickHouse Query Performance. By Robert Hodge...
Webinar slides: MORE secrets of ClickHouse Query Performance. By Robert Hodge...Altinity Ltd
 
Flink Forward San Francisco 2018: Fabian Hueske & Timo Walther - "Why and how...
Flink Forward San Francisco 2018: Fabian Hueske & Timo Walther - "Why and how...Flink Forward San Francisco 2018: Fabian Hueske & Timo Walther - "Why and how...
Flink Forward San Francisco 2018: Fabian Hueske & Timo Walther - "Why and how...Flink Forward
 
Why and how to leverage the simplicity and power of SQL on Flink
Why and how to leverage the simplicity and power of SQL on FlinkWhy and how to leverage the simplicity and power of SQL on Flink
Why and how to leverage the simplicity and power of SQL on FlinkDataWorks Summit
 
ClickHouse Query Performance Tips and Tricks, by Robert Hodges, Altinity CEO
ClickHouse Query Performance Tips and Tricks, by Robert Hodges, Altinity CEOClickHouse Query Performance Tips and Tricks, by Robert Hodges, Altinity CEO
ClickHouse Query Performance Tips and Tricks, by Robert Hodges, Altinity CEOAltinity Ltd
 
Practical Data Science : Data Cleaning and Summarising
Practical Data Science : Data Cleaning and SummarisingPractical Data Science : Data Cleaning and Summarising
Practical Data Science : Data Cleaning and SummarisingHariniMS1
 
Real-Time Analytics at Uber Scale
Real-Time Analytics at Uber ScaleReal-Time Analytics at Uber Scale
Real-Time Analytics at Uber ScaleSingleStore
 
Deep Dive Into Swift
Deep Dive Into SwiftDeep Dive Into Swift
Deep Dive Into SwiftSarath C
 
Chaincode Development 區塊鏈鏈碼開發
Chaincode Development 區塊鏈鏈碼開發Chaincode Development 區塊鏈鏈碼開發
Chaincode Development 區塊鏈鏈碼開發HO-HSUN LIN
 
Les03 (Using Single Row Functions To Customize Output)
Les03 (Using Single Row Functions To Customize Output)Les03 (Using Single Row Functions To Customize Output)
Les03 (Using Single Row Functions To Customize Output)Achmad Solichin
 
Trees And More With Postgre S Q L
Trees And  More With  Postgre S Q LTrees And  More With  Postgre S Q L
Trees And More With Postgre S Q LPerconaPerformance
 
Intro to tsql unit 10
Intro to tsql   unit 10Intro to tsql   unit 10
Intro to tsql unit 10Syed Asrarali
 
I really need help with this assignment it is multiple parts# Part.pdf
I really need help with this assignment it is multiple parts# Part.pdfI really need help with this assignment it is multiple parts# Part.pdf
I really need help with this assignment it is multiple parts# Part.pdfillyasraja7
 
DSL - expressive syntax on top of a clean semantic model
DSL - expressive syntax on top of a clean semantic modelDSL - expressive syntax on top of a clean semantic model
DSL - expressive syntax on top of a clean semantic modelDebasish Ghosh
 
Vertical Recommendation Using Collaborative Filtering
Vertical Recommendation Using Collaborative FilteringVertical Recommendation Using Collaborative Filtering
Vertical Recommendation Using Collaborative Filteringgorass
 
Loops in C# for loops while and do while loop.
Loops in C# for loops while and do while loop.Loops in C# for loops while and do while loop.
Loops in C# for loops while and do while loop.Abid Kohistani
 

Similar to Webinar: Detecting row patterns with Flink SQL - Dawid Wysakowicz (20)

Flink Forward Berlin 2018: Dawid Wysakowicz - "Detecting Patterns in Event St...
Flink Forward Berlin 2018: Dawid Wysakowicz - "Detecting Patterns in Event St...Flink Forward Berlin 2018: Dawid Wysakowicz - "Detecting Patterns in Event St...
Flink Forward Berlin 2018: Dawid Wysakowicz - "Detecting Patterns in Event St...
 
Webinar: Secrets of ClickHouse Query Performance, by Robert Hodges
Webinar: Secrets of ClickHouse Query Performance, by Robert HodgesWebinar: Secrets of ClickHouse Query Performance, by Robert Hodges
Webinar: Secrets of ClickHouse Query Performance, by Robert Hodges
 
Webinar slides: MORE secrets of ClickHouse Query Performance. By Robert Hodge...
Webinar slides: MORE secrets of ClickHouse Query Performance. By Robert Hodge...Webinar slides: MORE secrets of ClickHouse Query Performance. By Robert Hodge...
Webinar slides: MORE secrets of ClickHouse Query Performance. By Robert Hodge...
 
Flink Forward San Francisco 2018: Fabian Hueske & Timo Walther - "Why and how...
Flink Forward San Francisco 2018: Fabian Hueske & Timo Walther - "Why and how...Flink Forward San Francisco 2018: Fabian Hueske & Timo Walther - "Why and how...
Flink Forward San Francisco 2018: Fabian Hueske & Timo Walther - "Why and how...
 
Why and how to leverage the simplicity and power of SQL on Flink
Why and how to leverage the simplicity and power of SQL on FlinkWhy and how to leverage the simplicity and power of SQL on Flink
Why and how to leverage the simplicity and power of SQL on Flink
 
ClickHouse Query Performance Tips and Tricks, by Robert Hodges, Altinity CEO
ClickHouse Query Performance Tips and Tricks, by Robert Hodges, Altinity CEOClickHouse Query Performance Tips and Tricks, by Robert Hodges, Altinity CEO
ClickHouse Query Performance Tips and Tricks, by Robert Hodges, Altinity CEO
 
SAS functions.pptx
SAS functions.pptxSAS functions.pptx
SAS functions.pptx
 
Practical Data Science : Data Cleaning and Summarising
Practical Data Science : Data Cleaning and SummarisingPractical Data Science : Data Cleaning and Summarising
Practical Data Science : Data Cleaning and Summarising
 
Real-Time Analytics at Uber Scale
Real-Time Analytics at Uber ScaleReal-Time Analytics at Uber Scale
Real-Time Analytics at Uber Scale
 
Deep Dive Into Swift
Deep Dive Into SwiftDeep Dive Into Swift
Deep Dive Into Swift
 
Les03
Les03Les03
Les03
 
Chaincode Development 區塊鏈鏈碼開發
Chaincode Development 區塊鏈鏈碼開發Chaincode Development 區塊鏈鏈碼開發
Chaincode Development 區塊鏈鏈碼開發
 
Les03 (Using Single Row Functions To Customize Output)
Les03 (Using Single Row Functions To Customize Output)Les03 (Using Single Row Functions To Customize Output)
Les03 (Using Single Row Functions To Customize Output)
 
Trees And More With Postgre S Q L
Trees And  More With  Postgre S Q LTrees And  More With  Postgre S Q L
Trees And More With Postgre S Q L
 
Intro to tsql unit 10
Intro to tsql   unit 10Intro to tsql   unit 10
Intro to tsql unit 10
 
I really need help with this assignment it is multiple parts# Part.pdf
I really need help with this assignment it is multiple parts# Part.pdfI really need help with this assignment it is multiple parts# Part.pdf
I really need help with this assignment it is multiple parts# Part.pdf
 
DSL - expressive syntax on top of a clean semantic model
DSL - expressive syntax on top of a clean semantic modelDSL - expressive syntax on top of a clean semantic model
DSL - expressive syntax on top of a clean semantic model
 
Vertical Recommendation Using Collaborative Filtering
Vertical Recommendation Using Collaborative FilteringVertical Recommendation Using Collaborative Filtering
Vertical Recommendation Using Collaborative Filtering
 
Exploiting vectorization with ISPC
Exploiting vectorization with ISPCExploiting vectorization with ISPC
Exploiting vectorization with ISPC
 
Loops in C# for loops while and do while loop.
Loops in C# for loops while and do while loop.Loops in C# for loops while and do while loop.
Loops in C# for loops while and do while loop.
 

Webinar: Detecting row patterns with Flink SQL - Dawid Wysakowicz

  • 1. Dawid Wysakowicz, @dwysakowicz, SOFTWARE ENGINEER Detecting Patterns in Event Streams with Flink SQL
  • 2. © 2019 Ververica2 Original creators of Apache Flink® Complete Stream Processing Infrastructure About Ververica
  • 4. Why do people like SQL?
      • Well-known interface: the "lingua franca of data processing"
      • No programming required, so it is easier to learn
      • Declarative: you express WHAT you want, not HOW to compute it
      • Out-of-the-box optimization
      • Reusability
        ‒ built-in functions
        ‒ user-defined functions
  • 5. SQL on Streams
      • Unified semantics for bounded and unbounded input
        ‒ Same query, same input, same result
      • A stream is converted into a dynamic table
      • A continuous query on the dynamic table produces a new dynamic table as its result
      • The resulting dynamic table is converted back into a stream
  • 6. SQL on Streams
      • Unified semantics for stream and batch input
        ‒ Same query, same input, same result
      • A stream is converted into a dynamic table
      • A continuous query on the dynamic table produces a new dynamic table as its result
      • The resulting dynamic table is converted back into a stream
      • See also: Flink SQL in Action by Fabian Hueske (link)
  • 7. The New York Taxi Rides Data Set
      • The New York City Taxi & Limousine Commission provides a public data set about past taxi rides in New York City
      • We can derive a streaming table from the data
      • Table: TaxiRides
          rideId:   BIGINT     // ID of the taxi ride
          taxiId:   BIGINT     // ID of the taxi cab
          driverId: BIGINT     // ID of the driver
          isStart:  BOOLEAN    // flag for pick-up (true) or drop-off (false) event
          lon:      DOUBLE     // longitude of pick-up or drop-off location
          lat:      DOUBLE     // latitude of pick-up or drop-off location
          rowTime:  TIMESTAMP  // time of pick-up or drop-off event
          psgCnt:   INT        // number of passengers
  • 8. Number of rides per 30 minutes per area [figure: number of rides over time]
  • 9. SQL Example
      SELECT
        toAreaId(lat, lon) AS cellId,
        COUNT(DISTINCT rideId) AS rideCount,
        TUMBLE_ROWTIME(rowTime, INTERVAL '30' MINUTE) AS rowTime,
        TUMBLE_START(rowTime, INTERVAL '30' MINUTE) AS startTime,
        TUMBLE_END(rowTime, INTERVAL '30' MINUTE) AS endTime
      FROM TaxiRides
      GROUP BY
        toAreaId(lat, lon),
        TUMBLE(rowTime, INTERVAL '30' MINUTE)
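To make the grouping in the query above concrete, here is a rough plain-Java sketch of what a 30-minute tumbling window with COUNT(DISTINCT rideId) computes. It is not Flink code: the `TumbleSketch` and `Ride` names are made up for illustration, rides are assumed to be held in memory, and a precomputed `cellId` stands in for toAreaId(lat, lon).

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class TumbleSketch {
    static final long WINDOW_MILLIS = 30L * 60L * 1000L; // INTERVAL '30' MINUTE

    /** Hypothetical ride event; cellId stands in for toAreaId(lat, lon). */
    static final class Ride {
        final long rideId;
        final int cellId;
        final long rowTimeMillis;

        Ride(long rideId, int cellId, long rowTimeMillis) {
            this.rideId = rideId;
            this.cellId = cellId;
            this.rowTimeMillis = rowTimeMillis;
        }
    }

    /** Start of the tumbling window containing the timestamp (cf. TUMBLE_START). */
    static long windowStart(long rowTimeMillis) {
        return rowTimeMillis - Math.floorMod(rowTimeMillis, WINDOW_MILLIS);
    }

    /** Distinct ride count per group; the key is "cellId@windowStart". */
    static Map<String, Integer> rideCounts(List<Ride> rides) {
        Map<String, Set<Long>> distinctIds = new HashMap<>();
        for (Ride r : rides) {
            String key = r.cellId + "@" + windowStart(r.rowTimeMillis);
            distinctIds.computeIfAbsent(key, k -> new HashSet<>()).add(r.rideId);
        }
        Map<String, Integer> counts = new HashMap<>();
        distinctIds.forEach((key, ids) -> counts.put(key, ids.size()));
        return counts;
    }
}
```

Every row falls into exactly one window, which is why the window can be part of the GROUP BY key.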
  • 10. SQL Support in Flink 1.8
      • SELECT FROM WHERE
      • GROUP BY / HAVING
        ‒ Non-windowed, TUMBLE, HOP, SESSION windows
      • JOIN
        ‒ Windowed INNER, LEFT / RIGHT / FULL OUTER JOIN
        ‒ Non-windowed INNER / LEFT / RIGHT / FULL OUTER JOIN
        ‒ [streaming only] Temporal Tables
      • Scalar, aggregation, table-valued UDFs
      • Many built-in functions
      • [streaming only] OVER / WINDOW
        ‒ UNBOUNDED / BOUNDED PRECEDING
      • [streaming only] MATCH_RECOGNIZE
  • 11. What is hard with SQL?
      • Find rides with mid-stops
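  • 12. Mid stops (FlinkCEP, Java)
      import java.util.List;
      import java.util.Map;
      import org.apache.flink.cep.CEP;
      import org.apache.flink.cep.PatternFlatSelectFunction;
      import org.apache.flink.cep.pattern.Pattern;
      import org.apache.flink.cep.pattern.conditions.SimpleCondition;
      import org.apache.flink.util.Collector;

      // Row stands for the ride record type used on the slide; its isStart field and
      // getRideId() accessor are taken from the slide and are assumed to exist on that type.
      Pattern<Row, ?> pattern = Pattern.<Row>begin("S")
          .where(new SimpleCondition<Row>() {
            @Override
            public boolean filter(Row row) { return row.isStart; }
          })
          .next("E") // strict contiguity: E must directly follow S
          .where(new SimpleCondition<Row>() {
            @Override
            public boolean filter(Row row) { return row.isStart; }
          });

      CEP.pattern(input.keyBy("driverId"), pattern)
          .flatSelect(new PatternFlatSelectFunction<Row, Long>() {
            @Override
            public void flatSelect(Map<String, List<Row>> pattern, Collector<Long> out) {
              // emit the ride ID of the first start event of the match
              out.collect(pattern.get("S").get(0).getRideId());
            }
          });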
  • 14. Common use-cases
      • stock market analysis
      • customer behaviour
      • tracking money laundering
      • service quality
      • network intrusion detection
  • 15. Position in a SQL Query
      SELECT ...
      FROM ...
      MATCH_RECOGNIZE (...)
      WHERE ...
      GROUP BY ...
  • 16. Rides with mid-stops
      SELECT * FROM TaxiRides
      MATCH_RECOGNIZE (
        PARTITION BY driverId   -- partition the data by the given field (= keyBy)
        ORDER BY rowTime
        MEASURES S.rideId AS sRideId
        AFTER MATCH SKIP PAST LAST ROW
        PATTERN (S E)
        DEFINE
          S AS S.isStart = true,
          E AS E.isStart = true
      )
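For readers who think in imperative terms, the mid-stop query can be approximated by the following plain-Java sketch. It is only an emulation under stated assumptions: `MidStopSketch` and `Event` are hypothetical names, the input list is assumed to be already ordered by rowTime, and a per-driver map plays the role of PARTITION BY driverId.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class MidStopSketch {
    /** Hypothetical event type carrying the TaxiRides columns the query uses. */
    static final class Event {
        final long driverId;
        final long rideId;
        final boolean isStart;

        Event(long driverId, long rideId, boolean isStart) {
            this.driverId = driverId;
            this.rideId = rideId;
            this.isStart = isStart;
        }
    }

    /** Returns sRideId for every match; events must be ordered by rowTime. */
    static List<Long> midStopRideIds(List<Event> events) {
        // PARTITION BY driverId: keep one candidate S row per driver.
        Map<Long, Event> pendingStart = new HashMap<>();
        List<Long> result = new ArrayList<>();
        for (Event e : events) {
            Event s = pendingStart.get(e.driverId);
            if (s != null && e.isStart) {
                result.add(s.rideId);            // S directly followed by E: a match
                pendingStart.remove(e.driverId); // SKIP PAST LAST ROW: E is not reused as S
            } else if (e.isStart) {
                pendingStart.put(e.driverId, e); // new candidate for S
            } else {
                pendingStart.remove(e.driverId); // a drop-off breaks strict contiguity
            }
        }
        return result;
    }
}
```

Events of other drivers may be interleaved in the input; they do not break contiguity because each driver is matched independently.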
  • 17. Rides with mid-stops (query as on slide 16): ORDER BY rowTime
      ● specifies the order in which rows are matched
      ● the primary order must be a time attribute (event time or processing time)
  • 18. Rides with mid-stops (query as on slide 16): PATTERN (S E) constructs the pattern, a row mapped to S directly followed by a row mapped to E
  • 19. PATTERN: Defining a Pattern
      • Concatenation
        ‒ All rows of a pattern must be mapped to pattern variables
        ‒ A pattern like (A B) means that the contiguity between A and B is strict
        ‒ In other words: no rows between A and B
      • Quantifiers
        ‒ Number of rows mapped to a pattern variable
            *                                 0 or more rows
            +                                 1 or more rows
            ?                                 0 or 1 rows
            { n }, { n, }, { n, m }, { , m }  intervals (inclusive)
            B*?                               reluctant mapping instead of greedy (the default)
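The quantifiers behave much like their regular-expression counterparts, so greedy versus reluctant matching can be tried out with java.util.regex. This is an analogy only: in the sketch below each character stands for a row that has already been classified to a pattern variable, whereas MATCH_RECOGNIZE classifies rows during matching through the DEFINE conditions. The class name `QuantifierSketch` is made up.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class QuantifierSketch {
    /** Returns the first match of the regex in the row string, or null if there is none. */
    static String firstMatch(String regex, String rows) {
        Matcher m = Pattern.compile(regex).matcher(rows);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String rows = "ABBBC"; // one character per row
        System.out.println(firstMatch("AB*", rows));     // greedy: ABBB
        System.out.println(firstMatch("AB*?", rows));    // reluctant: A
        System.out.println(firstMatch("AB{2,}C", rows)); // at least two B rows: ABBBC
        System.out.println(firstMatch("AB{4,}C", rows)); // too few B rows: null
    }
}
```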
  • 20. Rides with mid-stops (query as on slide 16): MEASURES S.rideId AS sRideId extracts measures from the matched sequence S E
  • 21. DEFINE/MEASURES: Define/Access Variables
      • MEASURES
        ‒ Defines what will be included in the output of a matching pattern
        ‒ Projects columns and defines expressions for evaluation
        ‒ The number of produced rows depends on the output mode. Currently only ONE ROW PER MATCH is available: one output summary row per match
        ‒ Output schema: [partitioning columns] + [measures columns]
      • DEFINE
        ‒ Conditions that a row has to fulfill to be classified to the corresponding variable
        ‒ A variable without a condition evaluates to TRUE for every row
  • 23. Rides with more than one mid-stop
      SELECT * FROM TaxiRides
      MATCH_RECOGNIZE (
        PARTITION BY driverId
        ORDER BY rowTime
        MEASURES
          S.rideId AS sRideId,
          COUNT(M.rideId) AS countMidStops
        AFTER MATCH SKIP PAST LAST ROW
        PATTERN (S M{2,} E)
        DEFINE
          S AS S.isStart = true,
          M AS M.rideId <> S.rideId,
          E AS E.isStart = false AND E.rideId = S.rideId
      )
      For comparison, the single mid-stop query from slide 16 uses PATTERN (S E) with E AS E.isStart = true.
  • 25. Rush (peak) hours - V shape [figure: number of rides over time]
  • 26. Statistics per Area
      CREATE VIEW RidesInArea AS
      SELECT
        toAreaId(lat, lon) AS cellId,
        COUNT(DISTINCT rideId) AS rideCount,
        TUMBLE_ROWTIME(rowTime, INTERVAL '30' MINUTE) AS rowTime,
        TUMBLE_START(rowTime, INTERVAL '30' MINUTE) AS startTime,
        TUMBLE_END(rowTime, INTERVAL '30' MINUTE) AS endTime
      FROM TaxiRides
      GROUP BY
        toAreaId(lat, lon),
        TUMBLE(rowTime, INTERVAL '30' MINUTE)
  • 27. Rush (peak) hours [figure: number of rides over time]
      SELECT * FROM RidesInArea   -- use the previously defined table/view
      MATCH_RECOGNIZE (
        PARTITION BY cellId
        ORDER BY rowTime
        MEASURES
          FIRST(UP.startTime) AS rushStart,
          LAST(DOWN.endTime) AS rushEnd,
          SUM(UP.rideCount) + SUM(DOWN.rideCount) AS rideSum
        AFTER MATCH SKIP PAST LAST ROW
        PATTERN (UP{4,} DOWN{2,} E)
        DEFINE
          UP AS UP.rideCount > LAST(UP.rideCount, 1) OR LAST(UP.rideCount, 1) IS NULL,
          DOWN AS DOWN.rideCount < LAST(DOWN.rideCount, 1) OR LAST(DOWN.rideCount, 1) IS NULL,
          E AS E.rideCount > LAST(DOWN.rideCount)
      )
  • 28. Rush (peak) hours (query as on slide 27): the match is applied to the result of the inner query, i.e. the previously defined RidesInArea view
  • 29. Rush (peak) hours (query as on slide 27): navigation functions access elements of a looping pattern, e.g. LAST(UP.rideCount, 1), FIRST(UP.startTime) and LAST(DOWN.endTime)
  • 30. Rush (peak) hours (query as on slide 27): SUM(UP.rideCount) + SUM(DOWN.rideCount) aggregates values from looping patterns
  • 31. DEFINE/MEASURES: Define/Access Variables
      • Pattern Variable Referencing
        ‒ Gives access to the set of rows mapped (so far) to a particular pattern variable
        ‒ A.price = the set of rows mapped so far to A, plus the current row if we try to match the current row to A
        ‒ If A is a set, the last row is selected for scalar operations
        ‒ If no pattern variable is specified (e.g. SUM(price)), the default pattern variable "*" is used. This set contains all rows matched for the pattern plus the current row
      PATTERN (A B+)
      DEFINE
        A AS A.price > 10,
        B AS B.price > A.price AND
             SUM(price) < 100 AND
             SUM(B.price) < 80
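The difference between SUM(price) and SUM(B.price) in the example above can be traced with a small plain-Java sketch. It is an illustration only: `VariableReferenceSketch` is a made-up name, matching is attempted from the first row only, and the loop simply stops at the first row that fails the B condition instead of modelling a full matching engine.

```java
import java.util.ArrayList;
import java.util.List;

final class VariableReferenceSketch {
    /**
     * Tries to match PATTERN (A B+) starting at the first price.
     * Returns the prices mapped to B, or an empty list if there is no match.
     */
    static List<Double> matchB(double[] prices) {
        List<Double> mappedToB = new ArrayList<>();
        if (prices.length == 0 || !(prices[0] > 10)) {
            return mappedToB; // A AS A.price > 10 is not fulfilled
        }
        double a = prices[0];
        double sumAll = a; // SUM(price): all rows matched so far, any variable
        double sumB = 0;   // SUM(B.price): only rows mapped to B
        for (int i = 1; i < prices.length; i++) {
            double p = prices[i];
            // Both sums include the current row while it is tested against B.
            boolean isB = p > a && sumAll + p < 100 && sumB + p < 80;
            if (!isB) {
                break;
            }
            mappedToB.add(p);
            sumAll += p;
            sumB += p;
        }
        return mappedToB;
    }
}
```

For the prices 12, 20, 30, 40 the row 40 is rejected because SUM(price) would reach 102, although SUM(B.price) would still be below... no further rows are mapped after that point.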
  • 32. DEFINE/MEASURES: Define/Access Variables
      • Pattern Variable Navigation
        ‒ Logical offsets enable navigation within the events that were mapped to a particular pattern variable
        ‒ FIRST(variable.field, n): n is counted from the beginning
        ‒ LAST(variable.field, n): n is counted from the end
        ‒ Expressions on the same "list" are supported: LAST(A.price * A.tax)
      PATTERN (A B+)
      DEFINE
        A AS A.price > 10,
        B AS (LAST(B.price, 1) IS NULL OR B.price > LAST(B.price, 1)) AND
             (LAST(B.price, 2) IS NULL OR B.price > 2 * LAST(B.price, 2))
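The logical offsets can be pictured as index arithmetic on the list of rows mapped to a variable. The sketch below is a plain-Java illustration with made-up names (`NavigationSketch`, `first`, `last`, `bCondition`); it assumes zero-based offsets, so that LAST(B.price, 1) is the row before the current one, which is how the IS NULL checks on the slide read.

```java
import java.util.List;

final class NavigationSketch {
    /** LAST(variable.field, n): n-th element counted from the end, null if out of range. */
    static <T> T last(List<T> mapped, int n) {
        int idx = mapped.size() - 1 - n;
        return idx >= 0 && idx < mapped.size() ? mapped.get(idx) : null;
    }

    /** FIRST(variable.field, n): n-th element counted from the beginning, null if out of range. */
    static <T> T first(List<T> mapped, int n) {
        return n >= 0 && n < mapped.size() ? mapped.get(n) : null;
    }

    /** The B condition from the slide; the list already contains the current row as last element. */
    static boolean bCondition(List<Double> bIncludingCurrent) {
        Double current = last(bIncludingCurrent, 0);
        Double previous = last(bIncludingCurrent, 1);
        Double beforePrevious = last(bIncludingCurrent, 2);
        return (previous == null || current > previous)
            && (beforePrevious == null || current > 2 * beforePrevious);
    }
}
```

The null results are what make the first one or two B rows pass: there is nothing to compare against yet.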
  • 33. Driver fatigue [figure: ride durations, sum: 5h]
  • 34. Driver fatigue [figure: ride durations, sum: 7h > 6h]
  • 36. Rides durations
      CREATE VIEW RidesDurations AS
      SELECT * FROM TaxiRides
      MATCH_RECOGNIZE (
        PARTITION BY driverId
        ORDER BY rowTime
        MEASURES
          rideId AS rideId,
          timeDiff(S.rowTime, E.rowTime) AS rideDuration,
          MATCH_ROWTIME() AS rowTime,
          S.rowTime AS startTime,
          E.rowTime AS endTime
        AFTER MATCH SKIP PAST LAST ROW
        PATTERN (S E)
        DEFINE
          S AS S.isStart = true,
          E AS E.isStart = false AND E.rideId = S.rideId
      );
  • 37. Rides durations (query as on slide 36): MATCH_ROWTIME() AS rowTime provides the time attribute of the match
  • 38. Drivers fatigue
      SELECT * FROM RidesDurations
      MATCH_RECOGNIZE (
        PARTITION BY driverId
        ORDER BY rowTime
        MEASURES
          formatDuration(SUM(rideDuration)) AS totalRideDuration,
          FIRST(R.startTime) AS startTime,
          LAST(R.endTime) AS endTime
        AFTER MATCH SKIP PAST LAST ROW
        PATTERN (R+? E) WITHIN INTERVAL '1' DAY
        DEFINE
          E AS SUM(rideDuration) >= durationOfHours(2)
      );
  • 39. Drivers fatigue (same query as slide 38): the DEFINE condition for E uses an aggregation, SUM(rideDuration)
  • 40. Feature set of MATCH_RECOGNIZE
      • Quantifier support:
        ‒ + (one or more), * (zero or more), {x,y} (between x and y times)
        ‒ greedy (default), ? (reluctant), with some restrictions (does not work for the last pattern variable)
      • AFTER MATCH SKIP strategies:
        ‒ SKIP TO FIRST/LAST variable, SKIP PAST LAST ROW, SKIP TO NEXT ROW
      • Aggregates (since 1.8)
      • Time attribute extraction (since 1.8)
      • Not supported:
        ‒ alternation (|), PERMUTE, exclusion '{- -}'
  • 41. AFTER MATCH SKIP: Continuation strategy
      • Specifies where to start a new matching procedure after a complete match was found
      • Thus, it also determines how many matches a single event can belong to
        ‒ SKIP PAST LAST ROW: resume at the next row after the last row of the current match
        ‒ SKIP TO NEXT ROW: resume at the next row after the starting row of the match
        ‒ SKIP TO LAST variable: resume at the last row that is mapped to the specified pattern variable
        ‒ SKIP TO FIRST variable: resume at the first row that is mapped to the specified pattern variable
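The four continuation strategies on slide 41 can be compared side by side with a short Python sketch. It only computes the row index at which matching would resume; the function `next_start`, the strategy strings, and the representation of a match as (row index, pattern variable) pairs are all assumptions made for this illustration, not part of any Flink API.

```python
def next_start(strategy, match, variable=None):
    """Return the row index where the next matching attempt begins.

    `match` is a list of (row_index, pattern_variable) pairs in row order.
    """
    indices = [i for i, _ in match]
    if strategy == "PAST LAST ROW":
        return indices[-1] + 1      # row after the last row of the match
    if strategy == "TO NEXT ROW":
        return indices[0] + 1       # row after the starting row of the match
    mapped = [i for i, v in match if v == variable]
    if not mapped:
        # Sketch-level choice: there is no valid row to continue from.
        raise ValueError(f"no row mapped to variable {variable!r}")
    if strategy == "TO FIRST":
        return mapped[0]            # first row mapped to the variable
    if strategy == "TO LAST":
        return mapped[-1]           # last row mapped to the variable
    raise ValueError(f"unknown strategy {strategy!r}")


# A match over rows 3..6 for a pattern such as (A B+ C).
match = [(3, "A"), (4, "B"), (5, "B"), (6, "C")]
for strategy, var in [("PAST LAST ROW", None), ("TO NEXT ROW", None),
                      ("TO FIRST", "B"), ("TO LAST", "B")]:
    print(strategy, var, next_start(strategy, match, var))
```

Only SKIP PAST LAST ROW resumes strictly after the match, so it is the one strategy under which an event belongs to at most one match; the other three can reuse rows of the previous match.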
  • 42. Summary
      • MATCH_RECOGNIZE opens SQL to new use cases that depend on the order of events
      • Enables reuse of existing SQL features, such as built-in and user-defined functions
      • Available since Flink 1.7
      • Try it yourself: https://github.com/ververica/sql-training/wiki
  • 43. Thank you!
  • 44. Questions?
  • 45. Thank you! dawid@ververica.com www.ververica.com @VervericaData