Jonas Traub Philipp M. Grulich Alejandro Rodríguez Cuéllar Sebastian Breß
Asterios Katsifodimos Tilmann Rabl Volker Markl
Efficient Window Aggregation with
General Stream Slicing
22nd International Conference on Extending Database Technology
March 26-29, 2019, Lisbon, Portugal
Stream Processing Pipelines
27.03.2019 Efficient Window Aggregation with General Stream Slicing 2
A stream processing pipeline is a series of concurrently running operators.
Window Aggregation
Motivation
Stream Slicing Example
The number of slices depends on the workload.
We store partial aggregates instead of all tuples, which keeps the memory footprint small.
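A minimal sketch of this idea in Python, assuming a sum aggregation. The `Slice` class and its field names are hypothetical and chosen for illustration only; they are not taken from the talk or from the Scotty code base.

```python
from dataclasses import dataclass


@dataclass
class Slice:
    """One slice of the stream that keeps only a running partial aggregate."""
    start: int            # inclusive start of the slice (hypothetical field)
    end: int              # exclusive end of the slice (hypothetical field)
    partial_sum: int = 0  # partial aggregate, here a sum
    count: int = 0        # number of tuples folded into the slice

    def add(self, value: int) -> None:
        # Fold the tuple into the partial aggregate; the tuple itself is not stored.
        self.partial_sum += value
        self.count += 1


s = Slice(start=0, end=10)
for v in [1, 2, 1, 4, 3]:
    s.add(v)
print(s.partial_sum, s.count)  # 11 5
```

The slice holds two integers no matter how many tuples arrive, which is where the memory saving comes from.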
We assign each tuple to exactly one slice, which gives O(1) per-tuple complexity.
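The constant per-tuple cost can be illustrated with a small Python sketch. It assumes fixed-length slices of 10 time units and a sum aggregation; both are simplifying assumptions made here, since the slice edges in the talk depend on the workload. The names `SLICE_LENGTH`, `slice_index`, and `process` are hypothetical.

```python
from collections import defaultdict

SLICE_LENGTH = 10  # assumed fixed slice length in time units

partials = defaultdict(int)  # slice index -> partial sum


def slice_index(timestamp: int) -> int:
    """Map a timestamp to the single slice that contains it."""
    return timestamp // SLICE_LENGTH


def process(timestamp: int, value: int) -> None:
    # One index computation and one update per tuple, independent of the
    # number of windows that later read this slice.
    partials[slice_index(timestamp)] += value


for ts, v in [(5, 1), (12, 2), (13, 1), (20, 4), (35, 3)]:
    process(ts, v)
print(dict(partials))  # {0: 1, 1: 3, 2: 4, 3: 3}
```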
We require just a few computation steps to calculate final aggregates, which keeps latency low.
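As a rough illustration in Python, a final window aggregate can be computed by combining the partial aggregates of the slices the window covers. The slice indices, the sum aggregation, and the helper name `window_sum` are assumptions made for this sketch.

```python
def window_sum(partials: dict, first_slice: int, last_slice: int) -> int:
    """Combine the partial sums of slices first_slice..last_slice (inclusive)."""
    return sum(partials.get(i, 0) for i in range(first_slice, last_slice + 1))


# Partial sums per slice (slice index -> partial sum), illustrative values.
partials = {0: 1, 1: 3, 2: 4, 3: 3}

print(window_sum(partials, 0, 1))  # 4
print(window_sum(partials, 1, 3))  # 10
```

The work per window depends on the number of slices it spans, not on the number of tuples it contains.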
We share partial aggregates among all users and queries, which avoids redundant computation.
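A small Python sketch of sharing: two hypothetical queries with different window definitions read the same per-slice partial sums, so each tuple is aggregated only once. The slice length of 10 time units and the two query definitions are assumptions for illustration.

```python
# Shared state: slice index -> partial sum, with slices of 10 time units each.
partials = {0: 1, 1: 3, 2: 4, 3: 3}


def combine(first: int, last: int) -> int:
    """Sum the shared partial aggregates of slices first..last (inclusive)."""
    return sum(partials[i] for i in range(first, last + 1))


# Query A: tumbling windows of length 20 (two slices per window).
query_a = [combine(0, 1), combine(2, 3)]
# Query B: sliding windows of length 30 with a slide of 10 (three slices per window).
query_b = [combine(0, 2), combine(1, 3)]

print(query_a)  # [4, 7]
print(query_b)  # [8, 10]
```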
General Stream Slicing
Workload Characteristics
• Window Types: context free, forward context free, forward context aware
• Stream Order: in-order, out-of-order
• Window Measures: time, tuple count, arbitrary
• Aggregation Functions: distributive, algebraic, holistic; associativity, commutativity, invertibility
General Stream Slicing combines generality and efficiency in a single solution.
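The aggregation function classes named among the workload characteristics can be illustrated with a short Python sketch. The two example slices and the helper names `avg_partial` and `merge` are assumptions; the classification itself (distributive, algebraic, holistic) follows the standard definitions.

```python
from statistics import median

slice_a = [1, 2, 1, 4, 3]
slice_b = [1, 5, 2, 2, 3]

# Distributive (sum): the window result is the sum of the per-slice sums.
assert sum(slice_a) + sum(slice_b) == sum(slice_a + slice_b)


# Algebraic (average): a fixed-size partial (sum, count) per slice suffices.
def avg_partial(values):
    return (sum(values), len(values))


def merge(p, q):
    return (p[0] + q[0], p[1] + q[1])


total, n = merge(avg_partial(slice_a), avg_partial(slice_b))
print(total / n)  # 2.4

# Holistic (median): no fixed-size partial exists, so the values must be kept.
print(median(slice_a + slice_b))  # 2.0
```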
Window Aggregation Concepts
Variations of Stream Slicing
Non-Slicing Techniques
General Slicing Core
The General Slicing Core adapts to workload characteristics
and provides extension points for user-defined window types and aggregation functions.
General Stream Slicing Internals
Part 1: Three Fundamental Operations on Slices
• Merge Slices
• Split Slices
• Update Slices
Part 2: Adapt to Workload Characteristics
• Do we need to store original tuples?
• Do we potentially need to split slices?
• Do we potentially need to remove tuples from slices?
General Stream Slicing adapts to current workload characteristics.
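The three slice operations can be sketched in Python as follows. This is a simplified illustration, not the Scotty implementation: the `Slice` class and its method names are hypothetical, it assumes a sum aggregation, and it always stores the original tuples so that a split is possible. A real implementation would only store tuples when the workload requires it.

```python
from dataclasses import dataclass, field


@dataclass
class Slice:
    start: int  # inclusive start timestamp
    end: int    # exclusive end timestamp
    # (timestamp, value) pairs, kept here so that the slice can be split later.
    tuples: list = field(default_factory=list)

    @property
    def aggregate(self) -> int:
        # Recomputed from the stored tuples for simplicity.
        return sum(value for _, value in self.tuples)

    def update(self, timestamp: int, value: int) -> None:
        """Update: add a tuple to this slice."""
        self.tuples.append((timestamp, value))

    def merge(self, other: "Slice") -> "Slice":
        """Merge: combine this slice with the directly following slice."""
        return Slice(self.start, other.end, self.tuples + other.tuples)

    def split(self, timestamp: int):
        """Split: cut this slice in two at the given timestamp."""
        left = [t for t in self.tuples if t[0] < timestamp]
        right = [t for t in self.tuples if t[0] >= timestamp]
        return Slice(self.start, timestamp, left), Slice(timestamp, self.end, right)


a = Slice(0, 10)
a.update(5, 1)
b = Slice(10, 20)
b.update(12, 2)
b.update(13, 1)

merged = a.merge(b)
print(merged.start, merged.end, merged.aggregate)  # 0 20 4

left, right = merged.split(13)
print(left.aggregate, right.aggregate)  # 3 1
```

The split is the operation that forces tuple storage: a partial sum alone cannot be divided at an arbitrary timestamp.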
Impact of Workload Characteristics (Example)
Count-based tumbling window with a length of 5 tuples.
Value:       1  2  1  4  3  1  5  2  2  3  6  1  2  2  1
Tuple Count: 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
Event Time:  5 12 13 20 35 37 42 46 48 51 52 57 63 64 65
Window aggregates (sum of values): 11 13 12
What if the stream is out-of-order?
Out-of-order tuple: value 5, event time 49
Second window: 13 + 5 - 3
Third window: 12 + 3 - 1
What if the aggregation function is not invertible?
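The example on this slide can be replayed in Python. The sketch below uses the values and event times shown on the slide and assumes that the aggregation is a sum, which matches the window results 11, 13, and 12. The helper name `tumbling` and the use of max as the non-invertible function are choices made here for illustration.

```python
import bisect

WINDOW = 5  # count-based tumbling window length from the slide
values = [1, 2, 1, 4, 3, 1, 5, 2, 2, 3, 6, 1, 2, 2, 1]
event_times = [5, 12, 13, 20, 35, 37, 42, 46, 48, 51, 52, 57, 63, 64, 65]


def tumbling(vals, agg):
    """Aggregate every complete window of WINDOW consecutive tuples."""
    return [agg(vals[i:i + WINDOW]) for i in range(0, len(vals) - WINDOW + 1, WINDOW)]


sums = tumbling(values, sum)
print(sums)  # [11, 13, 12]

# Out-of-order tuple from the slide: value 5, event time 49.
late_value, late_time = 5, 49
pos = bisect.bisect(event_times, late_time)  # position in event-time order

# Invertible aggregation (sum): patch each affected window incrementally.
carry = late_value
for w in range(pos // WINDOW, len(sums)):
    shifted_out = values[(w + 1) * WINDOW - 1]  # last tuple moves to the next window
    sums[w] += carry - shifted_out
    carry = shifted_out
print(sums)  # [11, 15, 14]

# Non-invertible aggregation (max): there is no inverse to "subtract" a value,
# so the affected windows are recomputed from the stored tuples.
updated = values[:pos] + [late_value] + values[pos:]
maxima = tumbling(updated, max)
print(maxima)  # [4, 5, 6]
```

With a sum, each affected window needs one addition and one subtraction. With max, the original tuples must be available, which ties back to the question of whether tuples need to be stored.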
In-order Processing with Context Free Windows
Slicing techniques scale to large numbers of concurrent windows.
Impact of Stream Order
Slicing techniques are robust against out-of-order tuples.
Impact of Aggregation Functions (20% out-of-order)
Stream Slicing performs well on many different kinds of aggregation functions.
Efficient Window Aggregation with General Stream Slicing
• We identify workload characteristics which impact
applicability and performance of window aggregation techniques.
• We present a generally applicable and highly efficient solution for
streaming window aggregation.
• We show that general stream slicing is generally applicable and
offers better performance than alternative approaches.
Open Source Repository: tu-berlin-dima.github.io/scotty-window-processor

Efficient Window Aggregation with General Stream Slicing (EDBT 2019, Best Paper)
