SlideShare a Scribd company logo
1 of 29
Download to read offline
Till Rohrmann
trohrmann@apache.org
@stsffap
Fabian Hueske
fhueske@apache.org
@fhueske
Streaming Analytics & CEP
Two sides of the same coin?
Streams are Everywhere
§  Most data is continuously produced as stream
§  Processing data as it arrives
is becoming very popular
§  Many diverse applications
and use cases
2
Complex Event Processing
§  Analyzing a stream of events and drawing conclusions
•  Detect patterns and assemble new events
§  Applications
•  Network intrusion
•  Process monitoring
•  Algorithmic trading
§  Demanding requirements on stream processor
•  Low latency!
•  Exactly-once semantics & event-time support
3
Batch Analytics
4
§  The batch approach to data analytics
Streaming Analytics
§  Online aggregation of streams
•  No delay – Continuous results
§  Stream analytics subsumes batch analytics
•  Batch is a finite stream
§  Demanding requirements on stream processor
•  High throughput
•  Exactly-once semantics
•  Event-time & advanced window support
5
Apache Flink™
§  Platform for scalable stream processing
§  Meets requirements of CEP and stream analytics
•  Low latency and high throughput
•  Exactly-once semantics
•  Event-time & advanced windowing
§  Core DataStream API available for Java & Scala
6
This Talk is About
§  Flink’s new APIs for CEP and Stream Analytics
•  DSL to define CEP patterns and actions
•  Stream SQL to define queries on streams
§  Integration of CEP and Stream SQL
§  Early stage à Work in progress
7
Tracking an Order Process
Use Case
8
Order Fulfillment Scenario
9
Order Events
§  Process is reflected in a stream of order events
§  Order(orderId,	tStamp,	“received”)	
§  Shipment(orderId,	tStamp,	“shipped”)	
§  Delivery(orderId,	tStamp,	“delivered”)	
§  orderId: Identifies the order
§  tStamp: Time at which the event happened
10
Aggregating Massive Streams
Stream Analytics
11
Stream Analytics
§  Traditional batch analytics
•  Repeated queries on finite and changing data sets
•  Queries join and aggregate large data sets
§  Stream analytics
•  “Standing” query produces continuous results
from infinite input stream
•  Query computes aggregates on high-volume streams
§  How to compute aggregates on infinite streams?
12
Compute Aggregates on Streams
§  Split infinite stream into finite “windows”
•  Split usually by time
§  Tumbling windows
•  Fixed size & consecutive
§  Sliding windows
•  Fixed size & may overlap
§  Event time mandatory for correct & consistent results!
13
Example: Count Orders by Hour
14
Example: Count Orders by Hour
15
SELECT STREAM
TUMBLE_START(tStamp, INTERVAL ‘1’ HOUR) AS hour,
COUNT(*) AS cnt
FROM events
WHERE
status = ‘received’
GROUP BY
TUMBLE(tStamp, INTERVAL ‘1’ HOUR)
Stream SQL Architecture
§  Flink features SQL on static
and streaming tables
§  Parsing and optimization by
Apache Calcite
§  SQL queries are translated
into native Flink programs
16
Pattern Matching on Streams
Complex Event Processing
17
Real-time Warnings
18
CEP to the Rescue
§  Define processing and delivery intervals (SLAs)
§  ProcessSucc(orderId,	tStamp,	duration)	
§  ProcessWarn(orderId,	tStamp)	
§  DeliverySucc(orderId,	tStamp,	duration)	
§  DeliveryWarn(orderId,	tStamp)	
§  orderId:	Identifies the order
§  tStamp:	Time when the event happened
§  duration:	Duration of the processing/delivery
19
CEP Example
20
Processing: Order à Shipment
21
val processingPattern = Pattern
.begin[Event]("received").subtype(classOf[Order])
.followedBy("shipped").where(_.status == "shipped")
.within(Time.hours(1))
val processingPatternStream = CEP.pattern(
input.keyBy("orderId"),
processingPattern)
val procResult: DataStream[Either[ProcessWarn, ProcessSucc]] =
processingPatternStream.select {
(pP, timestamp) => // Timeout handler
ProcessWarn(pP("received").orderId, timestamp)
} {
fP => // Select function
ProcessSucc(
fP("received").orderId, fP("shipped").tStamp,
fP("shipped").tStamp – fP("received").tStamp)
}
… and both at the same time!
Integrated Stream Analytics with CEP
22
Count Delayed Shipments
23
Compute Avg Processing Time
24
CEP + Stream SQL
25
// complex event processing result
val delResult: DataStream[Either[DeliveryWarn, DeliverySucc]] = …
val delWarn: DataStream[DeliveryWarn] = delResult.flatMap(_.left.toOption)
val deliveryWarningTable: Table = delWarn.toTable(tableEnv)
tableEnv.registerTable(”deliveryWarnings", deliveryWarningTable)
// calculate the delayed deliveries per day
val delayedDeliveriesPerDay = tableEnv.sql(
"""SELECT STREAM
| TUMBLE_START(tStamp, INTERVAL ‘1’ DAY) AS day,
| COUNT(*) AS cnt
|FROM deliveryWarnings
|GROUP BY TUMBLE(tStamp, INTERVAL ‘1’ DAY)""".stripMargin)
CEP-enriched Stream SQL
26
SELECT
TUMBLE_START(tStamp, INTERVAL '1' DAY) as day,
AVG(duration) as avgDuration
FROM (
// CEP pattern
SELECT (b.tStamp - a.tStamp) as duration, b.tStamp as tStamp
FROM inputs
PATTERN
a FOLLOW BY b PARTITION BY orderId ORDER BY tStamp
WITHIN INTERVAL '1’ HOUR
WHERE
a.status = ‘received’ AND b.status = ‘shipped’
)
GROUP BY
TUMBLE(tStamp, INTERVAL '1’ DAY)
Conclusion
§  Apache Flink handles CEP and analytical
workloads
§  Apache Flink offers intuitive APIs
§  New class of applications by CEP and
Stream SQL integration J
27
Flink Forward 2016, Berlin
Submission deadline: June 30, 2016
Early bird deadline: July 15, 2016
www.flink-forward.org
We are hiring!
data-artisans.com/careers

More Related Content

What's hot

What's hot (20)

Integrating Apache NiFi and Apache Flink
Integrating Apache NiFi and Apache FlinkIntegrating Apache NiFi and Apache Flink
Integrating Apache NiFi and Apache Flink
 
File Format Benchmark - Avro, JSON, ORC and Parquet
File Format Benchmark - Avro, JSON, ORC and ParquetFile Format Benchmark - Avro, JSON, ORC and Parquet
File Format Benchmark - Avro, JSON, ORC and Parquet
 
The Parquet Format and Performance Optimization Opportunities
The Parquet Format and Performance Optimization OpportunitiesThe Parquet Format and Performance Optimization Opportunities
The Parquet Format and Performance Optimization Opportunities
 
Apache Flink Stream Processing
Apache Flink Stream ProcessingApache Flink Stream Processing
Apache Flink Stream Processing
 
Introduction to Apache Flink
Introduction to Apache FlinkIntroduction to Apache Flink
Introduction to Apache Flink
 
Apache NiFi Record Processing
Apache NiFi Record ProcessingApache NiFi Record Processing
Apache NiFi Record Processing
 
Analyzing Petabyte Scale Financial Data with Apache Pinot and Apache Kafka | ...
Analyzing Petabyte Scale Financial Data with Apache Pinot and Apache Kafka | ...Analyzing Petabyte Scale Financial Data with Apache Pinot and Apache Kafka | ...
Analyzing Petabyte Scale Financial Data with Apache Pinot and Apache Kafka | ...
 
Apache flink
Apache flinkApache flink
Apache flink
 
One sink to rule them all: Introducing the new Async Sink
One sink to rule them all: Introducing the new Async SinkOne sink to rule them all: Introducing the new Async Sink
One sink to rule them all: Introducing the new Async Sink
 
Flink Streaming
Flink StreamingFlink Streaming
Flink Streaming
 
Streaming SQL with Apache Calcite
Streaming SQL with Apache CalciteStreaming SQL with Apache Calcite
Streaming SQL with Apache Calcite
 
Dr. Elephant for Monitoring and Tuning Apache Spark Jobs on Hadoop with Carl ...
Dr. Elephant for Monitoring and Tuning Apache Spark Jobs on Hadoop with Carl ...Dr. Elephant for Monitoring and Tuning Apache Spark Jobs on Hadoop with Carl ...
Dr. Elephant for Monitoring and Tuning Apache Spark Jobs on Hadoop with Carl ...
 
On-boarding with JanusGraph Performance
On-boarding with JanusGraph PerformanceOn-boarding with JanusGraph Performance
On-boarding with JanusGraph Performance
 
Unified Stream and Batch Processing with Apache Flink
Unified Stream and Batch Processing with Apache FlinkUnified Stream and Batch Processing with Apache Flink
Unified Stream and Batch Processing with Apache Flink
 
Tuning Apache Spark for Large-Scale Workloads Gaoxiang Liu and Sital Kedia
Tuning Apache Spark for Large-Scale Workloads Gaoxiang Liu and Sital KediaTuning Apache Spark for Large-Scale Workloads Gaoxiang Liu and Sital Kedia
Tuning Apache Spark for Large-Scale Workloads Gaoxiang Liu and Sital Kedia
 
Netflix Data Pipeline With Kafka
Netflix Data Pipeline With KafkaNetflix Data Pipeline With Kafka
Netflix Data Pipeline With Kafka
 
Apache Flink Training: System Overview
Apache Flink Training: System OverviewApache Flink Training: System Overview
Apache Flink Training: System Overview
 
Apache Flink internals
Apache Flink internalsApache Flink internals
Apache Flink internals
 
Integrating NiFi and Flink
Integrating NiFi and FlinkIntegrating NiFi and Flink
Integrating NiFi and Flink
 
Flink history, roadmap and vision
Flink history, roadmap and visionFlink history, roadmap and vision
Flink history, roadmap and vision
 

Similar to Streaming Analytics & CEP - Two sides of the same coin?

Flink Streaming Hadoop Summit San Jose
Flink Streaming Hadoop Summit San JoseFlink Streaming Hadoop Summit San Jose
Flink Streaming Hadoop Summit San Jose
Kostas Tzoumas
 
Advanced Streaming Analytics with Apache Flink and Apache Kafka, Stephan Ewen
Advanced Streaming Analytics with Apache Flink and Apache Kafka, Stephan EwenAdvanced Streaming Analytics with Apache Flink and Apache Kafka, Stephan Ewen
Advanced Streaming Analytics with Apache Flink and Apache Kafka, Stephan Ewen
confluent
 

Similar to Streaming Analytics & CEP - Two sides of the same coin? (20)

Fabian Hueske_Till Rohrmann - Declarative stream processing with StreamSQL an...
Fabian Hueske_Till Rohrmann - Declarative stream processing with StreamSQL an...Fabian Hueske_Till Rohrmann - Declarative stream processing with StreamSQL an...
Fabian Hueske_Till Rohrmann - Declarative stream processing with StreamSQL an...
 
Flink 0.10 @ Bay Area Meetup (October 2015)
Flink 0.10 @ Bay Area Meetup (October 2015)Flink 0.10 @ Bay Area Meetup (October 2015)
Flink 0.10 @ Bay Area Meetup (October 2015)
 
Apache Flink @ Tel Aviv / Herzliya Meetup
Apache Flink @ Tel Aviv / Herzliya MeetupApache Flink @ Tel Aviv / Herzliya Meetup
Apache Flink @ Tel Aviv / Herzliya Meetup
 
Streaming SQL
Streaming SQLStreaming SQL
Streaming SQL
 
Flink Streaming Hadoop Summit San Jose
Flink Streaming Hadoop Summit San JoseFlink Streaming Hadoop Summit San Jose
Flink Streaming Hadoop Summit San Jose
 
Advanced Streaming Analytics with Apache Flink and Apache Kafka, Stephan Ewen
Advanced Streaming Analytics with Apache Flink and Apache Kafka, Stephan EwenAdvanced Streaming Analytics with Apache Flink and Apache Kafka, Stephan Ewen
Advanced Streaming Analytics with Apache Flink and Apache Kafka, Stephan Ewen
 
K. Tzoumas & S. Ewen – Flink Forward Keynote
K. Tzoumas & S. Ewen – Flink Forward KeynoteK. Tzoumas & S. Ewen – Flink Forward Keynote
K. Tzoumas & S. Ewen – Flink Forward Keynote
 
Flink System Overview
Flink System OverviewFlink System Overview
Flink System Overview
 
Data Stream Processing - Concepts and Frameworks
Data Stream Processing - Concepts and FrameworksData Stream Processing - Concepts and Frameworks
Data Stream Processing - Concepts and Frameworks
 
Streaming SQL
Streaming SQLStreaming SQL
Streaming SQL
 
Streaming SQL
Streaming SQLStreaming SQL
Streaming SQL
 
Kafka Streams: the easiest way to start with stream processing
Kafka Streams: the easiest way to start with stream processingKafka Streams: the easiest way to start with stream processing
Kafka Streams: the easiest way to start with stream processing
 
Streaming SQL (at FlinkForward, Berlin, 2016/09/12)
Streaming SQL (at FlinkForward, Berlin, 2016/09/12)Streaming SQL (at FlinkForward, Berlin, 2016/09/12)
Streaming SQL (at FlinkForward, Berlin, 2016/09/12)
 
Julian Hyde - Streaming SQL
Julian Hyde - Streaming SQLJulian Hyde - Streaming SQL
Julian Hyde - Streaming SQL
 
Data Stream Processing with Apache Flink
Data Stream Processing with Apache FlinkData Stream Processing with Apache Flink
Data Stream Processing with Apache Flink
 
Cooking with Data and Processes
Cooking with Data and ProcessesCooking with Data and Processes
Cooking with Data and Processes
 
Arbitrary Stateful Aggregations using Structured Streaming in Apache Spark
Arbitrary Stateful Aggregations using Structured Streaming in Apache SparkArbitrary Stateful Aggregations using Structured Streaming in Apache Spark
Arbitrary Stateful Aggregations using Structured Streaming in Apache Spark
 
Flexible and Real-Time Stream Processing with Apache Flink
Flexible and Real-Time Stream Processing with Apache FlinkFlexible and Real-Time Stream Processing with Apache Flink
Flexible and Real-Time Stream Processing with Apache Flink
 
Spark Barcelona Meetup: Migrating Batch Jobs into Structured Streaming
Spark Barcelona Meetup: Migrating Batch Jobs into Structured StreamingSpark Barcelona Meetup: Migrating Batch Jobs into Structured Streaming
Spark Barcelona Meetup: Migrating Batch Jobs into Structured Streaming
 
Introduction to Data streaming - 05/12/2014
Introduction to Data streaming - 05/12/2014Introduction to Data streaming - 05/12/2014
Introduction to Data streaming - 05/12/2014
 

More from Till Rohrmann

Gilbert: Declarative Sparse Linear Algebra on Massively Parallel Dataflow Sys...
Gilbert: Declarative Sparse Linear Algebra on Massively Parallel Dataflow Sys...Gilbert: Declarative Sparse Linear Algebra on Massively Parallel Dataflow Sys...
Gilbert: Declarative Sparse Linear Algebra on Massively Parallel Dataflow Sys...
Till Rohrmann
 

More from Till Rohrmann (18)

Future of Apache Flink Deployments: Containers, Kubernetes and More - Flink F...
Future of Apache Flink Deployments: Containers, Kubernetes and More - Flink F...Future of Apache Flink Deployments: Containers, Kubernetes and More - Flink F...
Future of Apache Flink Deployments: Containers, Kubernetes and More - Flink F...
 
Apache flink 1.7 and Beyond
Apache flink 1.7 and BeyondApache flink 1.7 and Beyond
Apache flink 1.7 and Beyond
 
Elastic Streams at Scale @ Flink Forward 2018 Berlin
Elastic Streams at Scale @ Flink Forward 2018 BerlinElastic Streams at Scale @ Flink Forward 2018 Berlin
Elastic Streams at Scale @ Flink Forward 2018 Berlin
 
Scaling stream data pipelines with Pravega and Apache Flink
Scaling stream data pipelines with Pravega and Apache FlinkScaling stream data pipelines with Pravega and Apache Flink
Scaling stream data pipelines with Pravega and Apache Flink
 
Modern Stream Processing With Apache Flink @ GOTO Berlin 2017
Modern Stream Processing With Apache Flink @ GOTO Berlin 2017Modern Stream Processing With Apache Flink @ GOTO Berlin 2017
Modern Stream Processing With Apache Flink @ GOTO Berlin 2017
 
Apache Flink Meets Apache Mesos And DC/OS @ Mesos Meetup Berlin
Apache Flink Meets Apache Mesos And DC/OS @ Mesos Meetup BerlinApache Flink Meets Apache Mesos And DC/OS @ Mesos Meetup Berlin
Apache Flink Meets Apache Mesos And DC/OS @ Mesos Meetup Berlin
 
Apache Flink® Meets Apache Mesos® and DC/OS
Apache Flink® Meets Apache Mesos® and DC/OSApache Flink® Meets Apache Mesos® and DC/OS
Apache Flink® Meets Apache Mesos® and DC/OS
 
From Apache Flink® 1.3 to 1.4
From Apache Flink® 1.3 to 1.4From Apache Flink® 1.3 to 1.4
From Apache Flink® 1.3 to 1.4
 
Apache Flink and More @ MesosCon Asia 2017
Apache Flink and More @ MesosCon Asia 2017Apache Flink and More @ MesosCon Asia 2017
Apache Flink and More @ MesosCon Asia 2017
 
Redesigning Apache Flink's Distributed Architecture @ Flink Forward 2017
Redesigning Apache Flink's Distributed Architecture @ Flink Forward 2017Redesigning Apache Flink's Distributed Architecture @ Flink Forward 2017
Redesigning Apache Flink's Distributed Architecture @ Flink Forward 2017
 
Gilbert: Declarative Sparse Linear Algebra on Massively Parallel Dataflow Sys...
Gilbert: Declarative Sparse Linear Algebra on Massively Parallel Dataflow Sys...Gilbert: Declarative Sparse Linear Algebra on Massively Parallel Dataflow Sys...
Gilbert: Declarative Sparse Linear Algebra on Massively Parallel Dataflow Sys...
 
Dynamic Scaling: How Apache Flink Adapts to Changing Workloads (at FlinkForwa...
Dynamic Scaling: How Apache Flink Adapts to Changing Workloads (at FlinkForwa...Dynamic Scaling: How Apache Flink Adapts to Changing Workloads (at FlinkForwa...
Dynamic Scaling: How Apache Flink Adapts to Changing Workloads (at FlinkForwa...
 
Apache Flink: Streaming Done Right @ FOSDEM 2016
Apache Flink: Streaming Done Right @ FOSDEM 2016Apache Flink: Streaming Done Right @ FOSDEM 2016
Apache Flink: Streaming Done Right @ FOSDEM 2016
 
Streaming Data Flow with Apache Flink @ Paris Flink Meetup 2015
Streaming Data Flow with Apache Flink @ Paris Flink Meetup 2015Streaming Data Flow with Apache Flink @ Paris Flink Meetup 2015
Streaming Data Flow with Apache Flink @ Paris Flink Meetup 2015
 
Fault Tolerance and Job Recovery in Apache Flink @ FlinkForward 2015
Fault Tolerance and Job Recovery in Apache Flink @ FlinkForward 2015Fault Tolerance and Job Recovery in Apache Flink @ FlinkForward 2015
Fault Tolerance and Job Recovery in Apache Flink @ FlinkForward 2015
 
Interactive Data Analysis with Apache Flink @ Flink Meetup in Berlin
Interactive Data Analysis with Apache Flink @ Flink Meetup in BerlinInteractive Data Analysis with Apache Flink @ Flink Meetup in Berlin
Interactive Data Analysis with Apache Flink @ Flink Meetup in Berlin
 
Computing recommendations at extreme scale with Apache Flink @Buzzwords 2015
Computing recommendations at extreme scale with Apache Flink @Buzzwords 2015Computing recommendations at extreme scale with Apache Flink @Buzzwords 2015
Computing recommendations at extreme scale with Apache Flink @Buzzwords 2015
 
Machine Learning with Apache Flink at Stockholm Machine Learning Group
Machine Learning with Apache Flink at Stockholm Machine Learning GroupMachine Learning with Apache Flink at Stockholm Machine Learning Group
Machine Learning with Apache Flink at Stockholm Machine Learning Group
 

Recently uploaded

Audience Researchndfhcvnfgvgbhujhgfv.pptx
Audience Researchndfhcvnfgvgbhujhgfv.pptxAudience Researchndfhcvnfgvgbhujhgfv.pptx
Audience Researchndfhcvnfgvgbhujhgfv.pptx
Stephen266013
 
1:1原版定制伦敦政治经济学院毕业证(LSE毕业证)成绩单学位证书留信学历认证
1:1原版定制伦敦政治经济学院毕业证(LSE毕业证)成绩单学位证书留信学历认证1:1原版定制伦敦政治经济学院毕业证(LSE毕业证)成绩单学位证书留信学历认证
1:1原版定制伦敦政治经济学院毕业证(LSE毕业证)成绩单学位证书留信学历认证
dq9vz1isj
 
Data Analytics for Digital Marketing Lecture for Advanced Digital & Social Me...
Data Analytics for Digital Marketing Lecture for Advanced Digital & Social Me...Data Analytics for Digital Marketing Lecture for Advanced Digital & Social Me...
Data Analytics for Digital Marketing Lecture for Advanced Digital & Social Me...
Valters Lauzums
 
Fuzzy Sets decision making under information of uncertainty
Fuzzy Sets decision making under information of uncertaintyFuzzy Sets decision making under information of uncertainty
Fuzzy Sets decision making under information of uncertainty
RafigAliyev2
 
Abortion pills in Dammam Saudi Arabia// +966572737505 // buy cytotec
Abortion pills in Dammam Saudi Arabia// +966572737505 // buy cytotecAbortion pills in Dammam Saudi Arabia// +966572737505 // buy cytotec
Abortion pills in Dammam Saudi Arabia// +966572737505 // buy cytotec
Abortion pills in Riyadh +966572737505 get cytotec
 
如何办理新加坡国立大学毕业证(NUS毕业证)学位证成绩单原版一比一
如何办理新加坡国立大学毕业证(NUS毕业证)学位证成绩单原版一比一如何办理新加坡国立大学毕业证(NUS毕业证)学位证成绩单原版一比一
如何办理新加坡国立大学毕业证(NUS毕业证)学位证成绩单原版一比一
hwhqz6r1y
 
一比一原版麦考瑞大学毕业证成绩单如何办理
一比一原版麦考瑞大学毕业证成绩单如何办理一比一原版麦考瑞大学毕业证成绩单如何办理
一比一原版麦考瑞大学毕业证成绩单如何办理
cyebo
 
一比一原版纽卡斯尔大学毕业证成绩单如何办理
一比一原版纽卡斯尔大学毕业证成绩单如何办理一比一原版纽卡斯尔大学毕业证成绩单如何办理
一比一原版纽卡斯尔大学毕业证成绩单如何办理
cyebo
 
一比一原版西悉尼大学毕业证成绩单如何办理
一比一原版西悉尼大学毕业证成绩单如何办理一比一原版西悉尼大学毕业证成绩单如何办理
一比一原版西悉尼大学毕业证成绩单如何办理
pyhepag
 
如何办理英国卡迪夫大学毕业证(Cardiff毕业证书)成绩单留信学历认证
如何办理英国卡迪夫大学毕业证(Cardiff毕业证书)成绩单留信学历认证如何办理英国卡迪夫大学毕业证(Cardiff毕业证书)成绩单留信学历认证
如何办理英国卡迪夫大学毕业证(Cardiff毕业证书)成绩单留信学历认证
ju0dztxtn
 
如何办理澳洲悉尼大学毕业证(USYD毕业证书)学位证书成绩单原版一比一
如何办理澳洲悉尼大学毕业证(USYD毕业证书)学位证书成绩单原版一比一如何办理澳洲悉尼大学毕业证(USYD毕业证书)学位证书成绩单原版一比一
如何办理澳洲悉尼大学毕业证(USYD毕业证书)学位证书成绩单原版一比一
w7jl3eyno
 
Exploratory Data Analysis - Dilip S.pptx
Exploratory Data Analysis - Dilip S.pptxExploratory Data Analysis - Dilip S.pptx
Exploratory Data Analysis - Dilip S.pptx
DilipVasan
 

Recently uploaded (20)

2024 Q2 Orange County (CA) Tableau User Group Meeting
2024 Q2 Orange County (CA) Tableau User Group Meeting2024 Q2 Orange County (CA) Tableau User Group Meeting
2024 Q2 Orange County (CA) Tableau User Group Meeting
 
Data Visualization Exploring and Explaining with Data 1st Edition by Camm sol...
Data Visualization Exploring and Explaining with Data 1st Edition by Camm sol...Data Visualization Exploring and Explaining with Data 1st Edition by Camm sol...
Data Visualization Exploring and Explaining with Data 1st Edition by Camm sol...
 
Audience Researchndfhcvnfgvgbhujhgfv.pptx
Audience Researchndfhcvnfgvgbhujhgfv.pptxAudience Researchndfhcvnfgvgbhujhgfv.pptx
Audience Researchndfhcvnfgvgbhujhgfv.pptx
 
1:1原版定制伦敦政治经济学院毕业证(LSE毕业证)成绩单学位证书留信学历认证
1:1原版定制伦敦政治经济学院毕业证(LSE毕业证)成绩单学位证书留信学历认证1:1原版定制伦敦政治经济学院毕业证(LSE毕业证)成绩单学位证书留信学历认证
1:1原版定制伦敦政治经济学院毕业证(LSE毕业证)成绩单学位证书留信学历认证
 
Data Analytics for Digital Marketing Lecture for Advanced Digital & Social Me...
Data Analytics for Digital Marketing Lecture for Advanced Digital & Social Me...Data Analytics for Digital Marketing Lecture for Advanced Digital & Social Me...
Data Analytics for Digital Marketing Lecture for Advanced Digital & Social Me...
 
Fuzzy Sets decision making under information of uncertainty
Fuzzy Sets decision making under information of uncertaintyFuzzy Sets decision making under information of uncertainty
Fuzzy Sets decision making under information of uncertainty
 
Abortion pills in Dammam Saudi Arabia// +966572737505 // buy cytotec
Abortion pills in Dammam Saudi Arabia// +966572737505 // buy cytotecAbortion pills in Dammam Saudi Arabia// +966572737505 // buy cytotec
Abortion pills in Dammam Saudi Arabia// +966572737505 // buy cytotec
 
Aggregations - The Elasticsearch "GROUP BY"
Aggregations - The Elasticsearch "GROUP BY"Aggregations - The Elasticsearch "GROUP BY"
Aggregations - The Elasticsearch "GROUP BY"
 
如何办理新加坡国立大学毕业证(NUS毕业证)学位证成绩单原版一比一
如何办理新加坡国立大学毕业证(NUS毕业证)学位证成绩单原版一比一如何办理新加坡国立大学毕业证(NUS毕业证)学位证成绩单原版一比一
如何办理新加坡国立大学毕业证(NUS毕业证)学位证成绩单原版一比一
 
123.docx. .
123.docx.                                 .123.docx.                                 .
123.docx. .
 
NOAM AAUG Adobe Summit 2024: Summit Slam Dunks
NOAM AAUG Adobe Summit 2024: Summit Slam DunksNOAM AAUG Adobe Summit 2024: Summit Slam Dunks
NOAM AAUG Adobe Summit 2024: Summit Slam Dunks
 
一比一原版麦考瑞大学毕业证成绩单如何办理
一比一原版麦考瑞大学毕业证成绩单如何办理一比一原版麦考瑞大学毕业证成绩单如何办理
一比一原版麦考瑞大学毕业证成绩单如何办理
 
一比一原版纽卡斯尔大学毕业证成绩单如何办理
一比一原版纽卡斯尔大学毕业证成绩单如何办理一比一原版纽卡斯尔大学毕业证成绩单如何办理
一比一原版纽卡斯尔大学毕业证成绩单如何办理
 
一比一原版西悉尼大学毕业证成绩单如何办理
一比一原版西悉尼大学毕业证成绩单如何办理一比一原版西悉尼大学毕业证成绩单如何办理
一比一原版西悉尼大学毕业证成绩单如何办理
 
如何办理英国卡迪夫大学毕业证(Cardiff毕业证书)成绩单留信学历认证
如何办理英国卡迪夫大学毕业证(Cardiff毕业证书)成绩单留信学历认证如何办理英国卡迪夫大学毕业证(Cardiff毕业证书)成绩单留信学历认证
如何办理英国卡迪夫大学毕业证(Cardiff毕业证书)成绩单留信学历认证
 
如何办理澳洲悉尼大学毕业证(USYD毕业证书)学位证书成绩单原版一比一
如何办理澳洲悉尼大学毕业证(USYD毕业证书)学位证书成绩单原版一比一如何办理澳洲悉尼大学毕业证(USYD毕业证书)学位证书成绩单原版一比一
如何办理澳洲悉尼大学毕业证(USYD毕业证书)学位证书成绩单原版一比一
 
2024 Q1 Tableau User Group Leader Quarterly Call
2024 Q1 Tableau User Group Leader Quarterly Call2024 Q1 Tableau User Group Leader Quarterly Call
2024 Q1 Tableau User Group Leader Quarterly Call
 
Artificial_General_Intelligence__storm_gen_article.pdf
Artificial_General_Intelligence__storm_gen_article.pdfArtificial_General_Intelligence__storm_gen_article.pdf
Artificial_General_Intelligence__storm_gen_article.pdf
 
Exploratory Data Analysis - Dilip S.pptx
Exploratory Data Analysis - Dilip S.pptxExploratory Data Analysis - Dilip S.pptx
Exploratory Data Analysis - Dilip S.pptx
 
Formulas dax para power bI de microsoft.pdf
Formulas dax para power bI de microsoft.pdfFormulas dax para power bI de microsoft.pdf
Formulas dax para power bI de microsoft.pdf
 

Streaming Analytics & CEP - Two sides of the same coin?

  • 2. Streams are Everywhere §  Most data is continuously produced as stream §  Processing data as it arrives is becoming very popular §  Many diverse applications and use cases 2
  • 3. Complex Event Processing §  Analyzing a stream of events and drawing conclusions •  Detect patterns and assemble new events §  Applications •  Network intrusion •  Process monitoring •  Algorithmic trading §  Demanding requirements on stream processor •  Low latency! •  Exactly-once semantics & event-time support 3
  • 4. Batch Analytics 4 §  The batch approach to data analytics
  • 5. Streaming Analytics §  Online aggregation of streams •  No delay – Continuous results §  Stream analytics subsumes batch analytics •  Batch is a finite stream §  Demanding requirements on stream processor •  High throughput •  Exactly-once semantics •  Event-time & advanced window support 5
  • 6. Apache Flink™ §  Platform for scalable stream processing §  Meets requirements of CEP and stream analytics •  Low latency and high throughput •  Exactly-once semantics •  Event-time & advanced windowing §  Core DataStream API available for Java & Scala 6
  • 7. This Talk is About §  Flink’s new APIs for CEP and Stream Analytics •  DSL to define CEP patterns and actions •  Stream SQL to define queries on streams §  Integration of CEP and Stream SQL §  Early stage à Work in progress 7
  • 8. Tracking an Order Process Use Case 8
  • 10. Order Events §  Process is reflected in a stream of order events §  Order(orderId, tStamp, “received”) §  Shipment(orderId, tStamp, “shipped”) §  Delivery(orderId, tStamp, “delivered”) §  orderId: Identifies the order §  tStamp: Time at which the event happened 10
  • 12. Stream Analytics §  Traditional batch analytics •  Repeated queries on finite and changing data sets •  Queries join and aggregate large data sets §  Stream analytics •  “Standing” query produces continuous results from infinite input stream •  Query computes aggregates on high-volume streams §  How to compute aggregates on infinite streams? 12
  • 13. Compute Aggregates on Streams §  Split infinite stream into finite “windows” •  Split usually by time §  Tumbling windows •  Fixed size & consecutive §  Sliding windows •  Fixed size & may overlap §  Event time mandatory for correct & consistent results! 13
  • 14. Example: Count Orders by Hour 14
  • 15. Example: Count Orders by Hour 15 SELECT STREAM TUMBLE_START(tStamp, INTERVAL ‘1’ HOUR) AS hour, COUNT(*) AS cnt FROM events WHERE status = ‘received’ GROUP BY TUMBLE(tStamp, INTERVAL ‘1’ HOUR)
  • 16. Stream SQL Architecture §  Flink features SQL on static and streaming tables §  Parsing and optimization by Apache Calcite §  SQL queries are translated into native Flink programs 16
  • 17. Pattern Matching on Streams Complex Event Processing 17
  • 19. CEP to the Rescue §  Define processing and delivery intervals (SLAs) §  ProcessSucc(orderId, tStamp, duration) §  ProcessWarn(orderId, tStamp) §  DeliverySucc(orderId, tStamp, duration) §  DeliveryWarn(orderId, tStamp) §  orderId: Identifies the order §  tStamp: Time when the event happened §  duration: Duration of the processing/delivery 19
  • 21. Processing: Order à Shipment 21 val processingPattern = Pattern .begin[Event]("received").subtype(classOf[Order]) .followedBy("shipped").where(_.status == "shipped") .within(Time.hours(1)) val processingPatternStream = CEP.pattern( input.keyBy("orderId"), processingPattern) val procResult: DataStream[Either[ProcessWarn, ProcessSucc]] = processingPatternStream.select { (pP, timestamp) => // Timeout handler ProcessWarn(pP("received").orderId, timestamp) } { fP => // Select function ProcessSucc( fP("received").orderId, fP("shipped").tStamp, fP("shipped").tStamp – fP("received").tStamp) }
  • 22. … and both at the same time! Integrated Stream Analytics with CEP 22
  • 25. CEP + Stream SQL 25 // complex event processing result val delResult: DataStream[Either[DeliveryWarn, DeliverySucc]] = … val delWarn: DataStream[DeliveryWarn] = delResult.flatMap(_.left.toOption) val deliveryWarningTable: Table = delWarn.toTable(tableEnv) tableEnv.registerTable(”deliveryWarnings", deliveryWarningTable) // calculate the delayed deliveries per day val delayedDeliveriesPerDay = tableEnv.sql( """SELECT STREAM | TUMBLE_START(tStamp, INTERVAL ‘1’ DAY) AS day, | COUNT(*) AS cnt |FROM deliveryWarnings |GROUP BY TUMBLE(tStamp, INTERVAL ‘1’ DAY)""".stripMargin)
  • 26. CEP-enriched Stream SQL 26 SELECT TUMBLE_START(tStamp, INTERVAL '1' DAY) as day, AVG(duration) as avgDuration FROM ( // CEP pattern SELECT (b.tStamp - a.tStamp) as duration, b.tStamp as tStamp FROM inputs PATTERN a FOLLOW BY b PARTITION BY orderId ORDER BY tStamp WITHIN INTERVAL '1’ HOUR WHERE a.status = ‘received’ AND b.status = ‘shipped’ ) GROUP BY TUMBLE(tStamp, INTERVAL '1’ DAY)
  • 27. Conclusion §  Apache Flink handles CEP and analytical workloads §  Apache Flink offers intuitive APIs §  New class of applications by CEP and Stream SQL integration J 27
  • 28. Flink Forward 2016, Berlin Submission deadline: June 30, 2016 Early bird deadline: July 15, 2016 www.flink-forward.org