Efficient Window Aggregation with Stream Slicing
Berlin, September 3-5, 2018
Philipp M. Grulich
Research Assistant (DFKI)
Jonas Traub
Research Associate (TU Berlin)
Jonas Traub (TU Berlin), Philipp M. Grulich (DFKI) - Efficient Window Aggregation with Stream Slicing
Stream Slicing Example
Stream Slicing Research
ICDE 2018
CIKM 2016
Flink Windowing Bottlenecks
Example Query:
.window(SlidingEventTimeWindows.of(Time.minutes(1), Time.seconds(10)))
.sum()
Processing with Buckets:
Events (<event time, value>) --> Buckets updated (<window start:window end, partial sum>):
<4,3> --> <0:60, 3>
<15,6> --> <0:60, 9>, <10:70, 6>
<55,6> --> <0:60, 15>, <10:70, 12>, ...
<66,1> --> <10:70, 13>, ..., <60:120, 1>
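The bucket updates above can be replayed with a short plain-Java sketch. It does not use Flink. The class and method names (BucketAggregation, windowStarts, aggregate) are made up for illustration, timestamps are assumed to be seconds, and windows starting before time 0 are ignored so that the result lines up with the buckets shown on the slide.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class BucketAggregation {

    /** Starts of all sliding windows [start, start + size) that contain the timestamp. */
    public static List<Long> windowStarts(long timestamp, long size, long slide) {
        List<Long> starts = new ArrayList<>();
        long lastStart = Math.floorDiv(timestamp, slide) * slide;
        for (long start = lastStart; start > timestamp - size; start -= slide) {
            if (start >= 0) { // assumption: the stream starts at time 0
                starts.add(start);
            }
        }
        return starts;
    }

    /** Adds every event to the partial sum of each bucket (window) that contains it. */
    public static Map<Long, Long> aggregate(long[][] events, long size, long slide) {
        Map<Long, Long> buckets = new TreeMap<>(); // window start -> partial sum
        for (long[] event : events) {
            for (long start : windowStarts(event[0], size, slide)) {
                buckets.merge(start, event[1], Long::sum);
            }
        }
        return buckets;
    }

    public static void main(String[] args) {
        // <event time, value> pairs from the slide
        long[][] events = {{4, 3}, {15, 6}, {55, 6}, {66, 1}};
        aggregate(events, 60, 10).forEach((start, sum) ->
                System.out.println("<" + start + ":" + (start + 60) + ", " + sum + ">"));
    }
}
```

Under these assumptions the four events cause 1 + 2 + 6 + 6 = 15 bucket updates, which is the repeated work that the following slides describe.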
Number of Buckets = Window Length / Slide Length
SlidingEventTimeWindows.of(Time.minutes(1), Time.seconds(10)) --> 6 Buckets
SlidingEventTimeWindows.of(Time.days(1), Time.seconds(10)) --> 8640 Buckets
Overlapping windows cause:
● Every event is assigned to many windows.
● Repeated aggregations --> aggregation function is called on every window
● High memory consumption --> especially for windows without incremental aggregation
● Check for merging windows
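The bucket formula can be checked with a few lines of arithmetic. The sketch below uses java.time.Duration rather than Flink's Time class. The helper name numberOfBuckets is hypothetical, and the integer division assumes that the window length is a multiple of the slide length.

```java
import java.time.Duration;

public class BucketCount {

    /** Number of concurrent buckets per event: window length divided by slide length. */
    public static long numberOfBuckets(Duration windowLength, Duration slideLength) {
        return windowLength.toMillis() / slideLength.toMillis();
    }

    public static void main(String[] args) {
        Duration slide = Duration.ofSeconds(10);
        System.out.println(numberOfBuckets(Duration.ofMinutes(1), slide)); // 6
        System.out.println(numberOfBuckets(Duration.ofDays(1), slide));    // 8640
    }
}
```

Since every event is assigned to each of these buckets, the count is also the number of aggregation calls per event in the bucket approach.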
Architecture Overview
Session Window Aggregate Sharing
Out-of-Order Processing and Sessions
Multi-Window Processing (Example: Fitness Tracker)
[...].window(
// Daily report:
TumblingEventTimeWindows.of(Time.days(1)),
// Monitoring dashboard (last hour):
SlidingEventTimeWindows.of(Time.hours(1), Time.seconds(1)),
// Activity periods:
EventTimeSessionWindows.withGap(Time.minutes(1)))
.sum()
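One way to picture how a single operator could serve all three window types is to keep one set of small pre-aggregated slices and combine them per window. The following plain-Java sketch is an illustration only, not Scotty's or Flink's API. The class SharedSlices and its methods are hypothetical, timestamps are assumed to be whole seconds, and each slice covers one second.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class SharedSlices {
    // One-second slices: slice start (event time in seconds) -> partial sum.
    private final TreeMap<Long, Long> slices = new TreeMap<>();

    /** Each event updates exactly one slice, regardless of how many windows exist. */
    public void add(long second, long value) {
        slices.merge(second, value, Long::sum);
    }

    /** Sum for any time-based window [start, end), e.g. tumbling or sliding. */
    public long sum(long start, long end) {
        long total = 0;
        for (long partial : slices.subMap(start, end).values()) {
            total += partial;
        }
        return total;
    }

    /** Session sums: a new session starts after at least gap seconds without events. */
    public List<Long> sessionSums(long gap) {
        List<Long> sums = new ArrayList<>();
        long previous = 0;
        long current = 0;
        boolean open = false;
        for (Map.Entry<Long, Long> slice : slices.entrySet()) {
            if (open && slice.getKey() - previous >= gap) {
                sums.add(current); // close the previous session
                current = 0;
            }
            current += slice.getValue();
            previous = slice.getKey();
            open = true;
        }
        if (open) {
            sums.add(current);
        }
        return sums;
    }

    public static void main(String[] args) {
        SharedSlices tracker = new SharedSlices();
        tracker.add(10, 5);
        tracker.add(40, 2);
        tracker.add(200, 7);
        tracker.add(90_000, 1);
        System.out.println(tracker.sum(0, 86_400));   // daily report for the first day
        System.out.println(tracker.sum(60, 3_660));   // one sliding one-hour window
        System.out.println(tracker.sessionSums(60));  // activity periods with a 1-minute gap
    }
}
```

The point of the sketch is that the per-event cost stays at one slice update while the daily, hourly, and session results are all derived from the same partial sums.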
Stream Slicing Performance
Runtime-Dynamic Windows
.dynamicWindow(windowDefinitionStream)
.sum()
Event Stream + Window Definition Stream (<WindowDefinition>) --> Dynamic Window Operator --> Output Stream (<Window, Agg>)
From Research to Production
● Implement complete fault-tolerance and state-management
● State migration
○ Hard limitation: Aggregated buckets in state snapshots cannot be migrated
● Sophisticated testing
How to expose multi-windows and dynamic-windows to users?
Wrap-Up
Scotty Features:
- stream slicing
- pre-aggregation
- aggregate sharing
- out-of-order processing
- session window support
- multi-window support
- runtime-dynamic window support
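To make the first three features concrete, here is a minimal plain-Java sketch of slicing, pre-aggregation, and aggregate sharing for the sliding-window sum used earlier in the talk (1-minute windows, 10-second slide). It is not Scotty's implementation. The class SliceAggregation and its methods are hypothetical, timestamps are assumed to be seconds, and the window size is assumed to be a multiple of the slide so that slices of one slide length suffice.

```java
import java.util.TreeMap;

public class SliceAggregation {
    private final long size;   // window length
    private final long slide;  // slide length, also used as slice length
    private final TreeMap<Long, Long> slices = new TreeMap<>(); // slice start -> partial sum

    public SliceAggregation(long size, long slide) {
        this.size = size;
        this.slide = slide;
    }

    /** Pre-aggregation: each event is added to exactly one slice. */
    public void add(long timestamp, long value) {
        long sliceStart = Math.floorDiv(timestamp, slide) * slide;
        slices.merge(sliceStart, value, Long::sum);
    }

    /** Aggregate sharing: a window result combines the slices it covers. */
    public long windowSum(long windowStart) {
        long sum = 0;
        for (long partial : slices.subMap(windowStart, windowStart + size).values()) {
            sum += partial;
        }
        return sum;
    }

    public static void main(String[] args) {
        SliceAggregation agg = new SliceAggregation(60, 10);
        agg.add(4, 3);
        agg.add(15, 6);
        agg.add(55, 6);
        agg.add(66, 1);
        System.out.println(agg.windowSum(0));   // 15, window 0:60
        System.out.println(agg.windowSum(10));  // 13, window 10:70
        System.out.println(agg.windowSum(60));  // 1, window 60:120
    }
}
```

The window results match the bucket example from the bottleneck slides, but the four events cause only four slice updates, one per event.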
Let’s bring it to production!
JIRA: [FLINK-7001]
This talk is supported by the European Union Horizon 2020 Projects
Proteus (687691), Streamline (688191), SAGE (671500), and
E2Data (780245) and by the German Ministry for Education and
Research as Berlin Big Data Center (01IS14013A) and Software
Campus (01IS12056).

Flink Forward 2018: Efficient Window Aggregation with Stream Slicing

  • 1. Efficient Window Aggregation with Stream Slicing Berlin, September 3-5, 2018 Philipp M. Grulich Research Assistant (DFKI) Jonas Traub Research Associate (TU Berlin)
  • 2. Jonas Traub (TU Berlin), Philipp M. Grulich (DFKI) - Efficient Window Aggregation with Stream Slicing Stream Slicing Example 2
  • 3. Jonas Traub (TU Berlin), Philipp M. Grulich (DFKI) - Efficient Window Aggregation with Stream Slicing Stream Slicing Example 3
  • 4. Jonas Traub (TU Berlin), Philipp M. Grulich (DFKI) - Efficient Window Aggregation with Stream Slicing Stream Slicing Example 4
  • 5. Jonas Traub (TU Berlin), Philipp M. Grulich (DFKI) - Efficient Window Aggregation with Stream Slicing Stream Slicing Example 5
  • 6. Jonas Traub (TU Berlin), Philipp M. Grulich (DFKI) - Efficient Window Aggregation with Stream Slicing Stream Slicing Example 6
  • 7. Jonas Traub (TU Berlin), Philipp M. Grulich (DFKI) - Efficient Window Aggregation with Stream Slicing Stream Slicing Example 7
  • 8. Jonas Traub (TU Berlin), Philipp M. Grulich (DFKI) - Efficient Window Aggregation with Stream Slicing Stream Slicing Example 8
  • 9. Jonas Traub (TU Berlin), Philipp M. Grulich (DFKI) - Efficient Window Aggregation with Stream Slicing Stream Slicing Example 9
  • 10. Jonas Traub (TU Berlin), Philipp M. Grulich (DFKI) - Efficient Window Aggregation with Stream Slicing Stream Slicing Example 10
  • 11. Jonas Traub (TU Berlin), Philipp M. Grulich (DFKI) - Efficient Window Aggregation with Stream Slicing Stream Slicing Example 11
  • 12. Jonas Traub (TU Berlin), Philipp M. Grulich (DFKI) - Efficient Window Aggregation with Stream Slicing Stream Slicing Research 12
  • 13. Jonas Traub (TU Berlin), Philipp M. Grulich (DFKI) - Efficient Window Aggregation with Stream Slicing Stream Slicing Research 13 CIKM 2016
  • 14. Jonas Traub (TU Berlin), Philipp M. Grulich (DFKI) - Efficient Window Aggregation with Stream Slicing Stream Slicing Research 14 ICDE 2018 CIKM 2016
  • 15. Jonas Traub (TU Berlin), Philipp M. Grulich (DFKI) - Efficient Window Aggregation with Stream Slicing Flink Windowing Bottlenecks 15 .window(SlidingEventTimeWindows.of(Time.minutes(1), Time.seconds(10))) .sum() Example Query:
  • 16. Jonas Traub (TU Berlin), Philipp M. Grulich (DFKI) - Efficient Window Aggregation with Stream Slicing Flink Windowing Bottlenecks 16 .window(SlidingEventTimeWindows.of(Time.minutes(1), Time.seconds(10))) .sum() Example Query: Processing with Buckets: Events: Buckets: Eventtime
  • 17. Jonas Traub (TU Berlin), Philipp M. Grulich (DFKI) - Efficient Window Aggregation with Stream Slicing Flink Windowing Bottlenecks 16 .window(SlidingEventTimeWindows.of(Time.minutes(1), Time.seconds(10))) .sum() Example Query: Processing with Buckets: <4,3> Events: Buckets: Eventtime
  • 18. Jonas Traub (TU Berlin), Philipp M. Grulich (DFKI) - Efficient Window Aggregation with Stream Slicing Flink Windowing Bottlenecks 16 .window(SlidingEventTimeWindows.of(Time.minutes(1), Time.seconds(10))) .sum() Example Query: Processing with Buckets: <0:60, 3><4,3> Events: Buckets: Eventtime
  • 19. Jonas Traub (TU Berlin), Philipp M. Grulich (DFKI) - Efficient Window Aggregation with Stream Slicing Flink Windowing Bottlenecks 16 .window(SlidingEventTimeWindows.of(Time.minutes(1), Time.seconds(10))) .sum() Example Query: Processing with Buckets: <0:60, 3><4,3> <15,6> Events: Buckets: Eventtime
  • 20. Jonas Traub (TU Berlin), Philipp M. Grulich (DFKI) - Efficient Window Aggregation with Stream Slicing Flink Windowing Bottlenecks 16 .window(SlidingEventTimeWindows.of(Time.minutes(1), Time.seconds(10))) .sum() Example Query: Processing with Buckets: <0:60, 3> <10:70, 6> <4,3> <15,6> <0:60, 9> Events: Buckets: Eventtime
  • 21. Jonas Traub (TU Berlin), Philipp M. Grulich (DFKI) - Efficient Window Aggregation with Stream Slicing Flink Windowing Bottlenecks 16 .window(SlidingEventTimeWindows.of(Time.minutes(1), Time.seconds(10))) .sum() Example Query: Processing with Buckets: <0:60, 3> <10:70, 6> <4,3> <15,6> <0:60, 9> <55,6> Events: Buckets: Eventtime
  • 22. Jonas Traub (TU Berlin), Philipp M. Grulich (DFKI) - Efficient Window Aggregation with Stream Slicing Flink Windowing Bottlenecks 16 .window(SlidingEventTimeWindows.of(Time.minutes(1), Time.seconds(10))) .sum() Example Query: Processing with Buckets: <0:60, 3> <10:70, 6> ... <4,3> <15,6> <0:60, 9> <55,6> <0:60, 15> <10:70, 12> Events: Buckets: Eventtime
  • 23. Jonas Traub (TU Berlin), Philipp M. Grulich (DFKI) - Efficient Window Aggregation with Stream Slicing Flink Windowing Bottlenecks 16 .window(SlidingEventTimeWindows.of(Time.minutes(1), Time.seconds(10))) .sum() Example Query: Processing with Buckets: <0:60, 3> <10:70, 6> ... <4,3> <15,6> <0:60, 9> <55,6> <0:60, 15> <10:70, 12> <66,1> Events: Buckets: Eventtime
  • 24. Jonas Traub (TU Berlin), Philipp M. Grulich (DFKI) - Efficient Window Aggregation with Stream Slicing Flink Windowing Bottlenecks 16 .window(SlidingEventTimeWindows.of(Time.minutes(1), Time.seconds(10))) .sum() Example Query: Processing with Buckets: <0:60, 3> <10:70, 6> ... <60:120, 1> <4,3> <15,6> <0:60, 9> <55,6> <0:60, 15> <10:70, 12> <66,1> <10:70, 13> Events: Buckets: Eventtime
  • 32. Flink Windowing Bottlenecks
    Number of Buckets = Window Length / Slide Length
    SlidingEventTimeWindows.of(Time.minutes(1), Time.seconds(10)) --> 6 Buckets
    SlidingEventTimeWindows.of(Time.days(1), Time.seconds(10)) --> 8640 Buckets
    Overlapping windows cause:
    ● Every event is assigned to many windows.
    ● Repeated aggregations --> the aggregation function is called for every window an event belongs to
    ● High memory consumption --> especially for windows without incremental aggregation
    ● Check for merging windows
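The two bucket counts above follow directly from the formula. As a quick check, here is a minimal sketch using `java.time.Duration`; the class and method names are hypothetical, and it assumes the window length is an exact multiple of the slide length.

```java
import java.time.Duration;

final class BucketCount {
    /** Number of Buckets = Window Length / Slide Length (assumes length is a multiple of slide). */
    static long bucketsPerEvent(Duration length, Duration slide) {
        return length.toMillis() / slide.toMillis();
    }

    public static void main(String[] args) {
        // 1 minute window, 10 second slide: prints 6
        System.out.println(bucketsPerEvent(Duration.ofMinutes(1), Duration.ofSeconds(10)));
        // 1 day window, 10 second slide: prints 8640
        System.out.println(bucketsPerEvent(Duration.ofDays(1), Duration.ofSeconds(10)));
    }
}
```

With a one-day window, each incoming event therefore updates 8640 buckets when windows are processed as independent buckets.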
  • 38. Architecture Overview
  • 41. Session Window Aggregate Sharing
  • 47. Out-of-Order Processing and Sessions
  • 48. Multi-Window Processing (Example: Fitness Tracker)
    [...].window(
        // Daily report:
        TumblingEventTimeWindows.of(Time.days(1)),
        // Monitoring dashboard (last hour):
        SlidingEventTimeWindows.of(Time.hours(1), Time.seconds(1)),
        // Activity periods:
        EventTimeSessionWindows.of(Time.minutes(1)))
    .sum()
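To illustrate why several windows can share work, the following minimal sketch keeps one partial sum per non-overlapping slice and answers any window by combining the slices it covers. This is an illustration of the general slicing idea, not Scotty's implementation: the class `SliceSumSketch` is hypothetical, it uses fixed-size slices, it assumes non-negative timestamps in seconds and window edges aligned to the slice size, and it covers only tumbling and sliding windows (session windows and out-of-order events are left out).

```java
import java.util.TreeMap;

final class SliceSumSketch {
    private final long sliceSize; // seconds per slice (assumed to divide all window edges)
    private final TreeMap<Long, Long> slices = new TreeMap<>(); // slice start -> partial sum

    SliceSumSketch(long sliceSize) {
        this.sliceSize = sliceSize;
    }

    /** Each event updates exactly one slice, no matter how many windows overlap it. */
    void add(long timestamp, long value) {
        long sliceStart = timestamp - (timestamp % sliceSize);
        slices.merge(sliceStart, value, Long::sum);
    }

    /** A window result is the sum of the partial sums of all slices in [start, end). */
    long windowSum(long start, long end) {
        long sum = 0;
        for (long partial : slices.subMap(start, end).values()) {
            sum += partial;
        }
        return sum;
    }

    public static void main(String[] args) {
        SliceSumSketch sketch = new SliceSumSketch(10);
        sketch.add(4, 3);
        sketch.add(15, 6);
        sketch.add(55, 6);
        sketch.add(66, 1);
        System.out.println(sketch.windowSum(0, 60));   // sliding window 0:60, prints 15
        System.out.println(sketch.windowSum(10, 70));  // sliding window 10:70, prints 13
        System.out.println(sketch.windowSum(0, 120));  // a longer tumbling window, prints 16
    }
}
```

The sliding windows and the longer tumbling window read the same slices, so adding another window definition adds no per-event work in this sketch; only the final combination step is repeated per window.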
  • 49. Stream Slicing Performance
  • 50. Stream Slicing Performance
  • 55. Runtime-Dynamic Windows
    .dynamicWindow(windowDefinitionStream) .sum()
    Inputs: Event Stream; Window Definition Stream: <WindowDefinition>
    Operator: Dynamic Window Operator
    Output: Output Stream: <Window, Agg>
  • 61. From Research to Production
    ● Implement complete fault-tolerance and state-management
    ● State migration
      ○ Hard limitation: Aggregated buckets in state snapshots cannot be migrated
    ● Sophisticated testing
    How to expose multi-windows and dynamic-windows to users?
  • 62. Wrap-Up
    Scotty Features:
    - stream slicing
    - pre-aggregation
    - aggregate sharing
    - out-of-order processing
    - session window support
    - multi-window support
    - runtime-dynamic window support
    Let’s bring it to production! JIRA: [FLINK-7001]
    This talk is supported by the European Union Horizon 2020 Projects Proteus (687691), Streamline (688191), SAGE (671500), and E2Data (780245) and by the German Ministry for Education and Research as Berlin Big Data Center (01IS14013A) and Software Campus (01IS12056).