The Flux Capacitor of
Kafka Streams and ksqlDB
Matthias J. Sax | Software Engineer
@MatthiasJSax
Back to the Time Topic
https://www.confluent.io/kafka-summit-san-francisco-2019/whats-the-time-and-why/
Stream Processing is our Density.
Recap: Time 101
Event Time
• When an event happened (embedded in the message/record)
• Ensures deterministic processing
• Used to express processing semantics, i.e., impacts the result
Processing Time (aka Wall-clock Time)
• When an event/message/record is processed
• Used for non-functional properties
  • Timeouts
  • Data rate control
  • Periodic actions
• Should not impact the result: otherwise, non-deterministic
First, you turn the time circuits on.
Tracking Time
Stream-time: the maximum observed input event timestamp (aka ROWTIME)
• Monotonically increasing
• Makes it possible to identify out-of-order and late input
• Tracked per task / used instead of watermarks
[Figure: input records with event timestamps 14:01, 14:03, 14:08, 14:01, 14:02, 14:11; stream-time advances 14:01 -> 14:03 -> 14:08 -> 14:11]
Yeah, well, history is gonna change
Input records with descending event timestamp are considered out-of-order
• Out-of-order if event-time < stream-time
[Figure: same input records; stream-time advances 14:01 -> 14:03 -> 14:08 -> 14:11; the repeated 14:01 record and the 14:02 record are marked out-of-order]
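The stream-time rule above can be sketched in a few lines of plain Java. This is a simulation and not Kafka Streams code: the class `StreamTimeTracker`, the helper `hm`, and the arrival order of the six records are assumptions chosen so that stream-time advances 14:01, 14:03, 14:08, 14:11 as on the slide. Timestamps are minutes since midnight.

```java
/** Tracks stream-time as the maximum event timestamp observed so far. */
final class StreamTimeTracker {
    private long streamTime = Long.MIN_VALUE;

    /** Observes one record and returns true if it is out-of-order (event-time < stream-time). */
    boolean observe(long eventTime) {
        boolean outOfOrder = eventTime < streamTime;
        streamTime = Math.max(streamTime, eventTime); // stream-time never goes backwards
        return outOfOrder;
    }

    long streamTime() {
        return streamTime;
    }

    /** Hypothetical helper: converts hour and minute to minutes since midnight. */
    static long hm(int hour, int minute) {
        return hour * 60L + minute;
    }

    public static void main(String[] args) {
        // Assumed arrival order, not taken verbatim from the slide.
        long[] input = {hm(14, 1), hm(14, 3), hm(14, 1), hm(14, 8), hm(14, 2), hm(14, 11)};
        StreamTimeTracker tracker = new StreamTimeTracker();
        for (long ts : input) {
            boolean outOfOrder = tracker.observe(ts);
            System.out.printf("event=%d stream-time=%d out-of-order=%b%n",
                    ts, tracker.streamTime(), outOfOrder);
        }
    }
}
```

With this input, two records are flagged out-of-order, and stream-time ends at 14:11.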
You are not thinking fourth-dimensionally
[Figure: a task reads Topic-A, Partition 0 and Topic-B, Partition 0 and merges their records by timestamp; processing order: 14:01, 14:02, 14:04, 14:03, 14:05, 14:08; the 14:03 record is out-of-order; 14:11 is still buffered]
[Figure: Topic-A, Partition 0 still holds the record with timestamp 14:11; Topic-B, Partition 0 is empty]
Pause processing and poll() for new data.
Unblock when timeout max.task.idle.ms hits.
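A minimal sketch of the merge shown above, assuming the task always picks the buffered record with the smallest head timestamp and stops once one partition buffer runs empty. The class `TimestampSynchronizer` and the split of the records across the two partitions are assumptions for illustration. The real implementation additionally waits up to max.task.idle.ms before it continues with the remaining data.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

final class TimestampSynchronizer {

    /**
     * Repeatedly processes the record with the smaller head timestamp.
     * Returns when one buffer is empty, which is the point where a task
     * would pause and poll() for new data.
     */
    static List<Long> drain(Deque<Long> partitionA, Deque<Long> partitionB) {
        List<Long> processed = new ArrayList<>();
        while (!partitionA.isEmpty() && !partitionB.isEmpty()) {
            Deque<Long> next =
                    partitionA.peekFirst() <= partitionB.peekFirst() ? partitionA : partitionB;
            processed.add(next.pollFirst());
        }
        return processed;
    }

    public static void main(String[] args) {
        // Timestamps in minutes since midnight; the partition layout is assumed.
        // Topic-A: 14:01, 14:04, 14:03, 14:05, 14:11
        Deque<Long> topicA = new ArrayDeque<>(List.of(841L, 844L, 843L, 845L, 851L));
        // Topic-B: 14:02, 14:08
        Deque<Long> topicB = new ArrayDeque<>(List.of(842L, 848L));

        System.out.println("processed: " + drain(topicA, topicB));
        System.out.println("still buffered in Topic-A: " + topicA);
    }
}
```

Only the head of each buffer is compared, so the 14:03 record that sits behind 14:04 in its own partition is still processed out-of-order.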
When the hell are they?
Time Windows
Tumbling Windows
• fixed size / non-overlapping / grouped (i.e., GROUP BY)
[Figure: window boundaries at 14:00, 14:05, 14:10, 14:15]
No variable-size window support yet:
• Weeks, months, years
• No out-of-the-box time zone support
• https://github.com/confluentinc/kafka-streams-examples/blob/5.5.0-post/src/test/java/io/confluent/examples/streams/window/DailyTimeWindows.java
Hopping Windows
• fixed size / overlapping / grouped (i.e., GROUP BY)
• Different from a sliding window!
[Figure: overlapping 5-minute windows that advance by 1 minute: 14:00-14:05, 14:01-14:06, 14:02-14:07, 14:03-14:08, 14:04-14:09, and so on]
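To make the difference between tumbling and hopping windows concrete, the sketch below computes which windows a single record falls into. It assumes windows of the form [start, start + size) whose start times are multiples of the advance. The class `WindowAssignment` is hypothetical, and timestamps are minutes since midnight. A tumbling window is the special case advance == size.

```java
import java.util.ArrayList;
import java.util.List;

final class WindowAssignment {

    /** Start times of all windows [start, start + size) that contain the timestamp. */
    static List<Long> windowStartsFor(long timestamp, long size, long advance) {
        List<Long> starts = new ArrayList<>();
        // Earliest window that can still contain the timestamp, aligned to the advance.
        long start = Math.max(0, timestamp - size + advance) / advance * advance;
        for (; start <= timestamp; start += advance) {
            starts.add(start);
        }
        return starts;
    }

    public static void main(String[] args) {
        long ts = 14 * 60 + 3; // 14:03
        // Tumbling, size 5 min: exactly one window, starting at 14:00 (= 840).
        System.out.println(windowStartsFor(ts, 5, 5));
        // Hopping, size 5 min, advance 1 min: five windows, starting at 13:59 .. 14:03.
        System.out.println(windowStartsFor(ts, 5, 1));
    }
}
```

A record belongs to exactly one tumbling window but to size / advance hopping windows, which is why hopping aggregations update several result rows per input record.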
Sliding Windows
Different use-case: aggregate the data of the last (e.g.) 10 minutes
• Window boundaries are data dependent and unknown upfront (cf. KIP-450)
[Figure: records at 14:03, 14:07, 14:12, 14:19, 14:26 with 10-minute windows 13:53-14:03, 13:57-14:07, 14:02-14:12, 14:04-14:14, 14:08-14:18, 14:09-14:19, 14:13-14:23, 14:16-14:26, 14:20-14:30]
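The window boundaries in the figure can be reproduced with a short sketch. It assumes that every record closes a window ending at its own timestamp, and that a window starting right after a record exists only if a later record falls into it. The class `SlidingWindows` is hypothetical, and the one-minute step mirrors the slide's granularity; it is an assumption, not the granularity an actual implementation uses.

```java
import java.util.List;
import java.util.TreeSet;

final class SlidingWindows {

    /**
     * Returns the start times of all windows [start, start + size] that the
     * given record timestamps create (all values in minutes since midnight).
     */
    static TreeSet<Long> windowStarts(List<Long> timestamps, long size) {
        TreeSet<Long> starts = new TreeSet<>();
        for (long record : timestamps) {
            // Window that ends at the record itself.
            starts.add(record - size);
            // Window that starts right after the record, kept only if it is non-empty.
            long nextStart = record + 1;
            for (long other : timestamps) {
                if (other >= nextStart && other <= nextStart + size) {
                    starts.add(nextStart);
                    break;
                }
            }
        }
        return starts;
    }

    public static void main(String[] args) {
        // 14:03, 14:07, 14:12, 14:19, 14:26
        List<Long> records = List.of(843L, 847L, 852L, 859L, 866L);
        for (long start : windowStarts(records, 10)) {
            System.out.printf("%02d:%02d - %02d:%02d%n",
                    start / 60, start % 60, (start + 10) / 60, (start + 10) % 60);
        }
    }
}
```

The nine windows printed match the nine boundaries on the slide, which illustrates that the boundaries depend on the data and cannot be enumerated upfront.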
When we are processing, we don’t need watermarks
Grace period: defines a cut-off for out-of-order records that are (too) late
• Grace period is defined per operator
• Late if stream-time - event-time > grace period
• Late data is ignored and not processed by the operator
[Figure: same input records with grace := 5min; the 14:02 record arrives when stream-time is 14:08 -> late (delay: 6min)]
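The lateness rule is a single comparison. The sketch below restates it as plain Java with the numbers from the figure. The class `GracePeriodCheck` is hypothetical, and the second call uses an assumed arrival (14:01 at stream-time 14:03) to show a record that is out-of-order but not late.

```java
final class GracePeriodCheck {

    /** A record is late if stream-time - event-time > grace period (all values in minutes). */
    static boolean isLate(long streamTime, long eventTime, long gracePeriod) {
        return streamTime - eventTime > gracePeriod;
    }

    public static void main(String[] args) {
        long grace = 5;
        // 14:02 arriving at stream-time 14:08: delay 6 min > 5 min, so the record is dropped.
        System.out.println(isLate(14 * 60 + 8, 14 * 60 + 2, grace)); // true
        // 14:01 arriving at stream-time 14:03: delay 2 min, still processed.
        System.out.println(isLate(14 * 60 + 3, 14 * 60 + 1, grace)); // false
    }
}
```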
Retention Time
How long to store data in a (windowed) table.
TimeWindows.of(Duration.ofMinutes(5L)).grace(Duration.ofMinutes(1L))
Materialized.as(…).withRetention(Duration.ofHours(1L))
WINDOW TUMBLING(SIZE 5 MINUTES, GRACE PERIOD 1 MINUTE, RETENTION TIME 1 HOUR)
[Figure: along stream-time, a window with SIZE 5 MINUTES and GRACE PERIOD 1 MINUTE: windowStart @14:00, windowEnd @14:05, window close @14:06; retention (1 hour) from 14:05 to 15:05]
If my calculations are correct…
Table is continuously updated, but when to emit data to the result stream?
• Non-deterministic via caching (default)
  • Output data rate reduction (non-functional)
• Deterministic rate control via suppress() | EMIT FINAL
  • Periodic or final (for window operations)
  • Stream-time based!
[Figure: table rows with timestamps: Marty 14:01, Einstein 14:05, Elaine 14:15, Biff 14:23, George 14:23, Doc 14:26; input records 14:25… and 14:32…; stream-time: 14:26; what to emit: ?]
Crossing Timelines (aka Joins)
Stream-Stream Join
Streams are conceptually unbounded
• Limited join scope via a sliding time window
leftStream.join(rightStream, JoinWindows.of(Duration.ofMinutes(5L)));
SELECT * FROM leftStream AS l JOIN rightStream AS r WITHIN 5 MINUTES ON l.id = r.id;
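[Figure: left stream records 1 (14:04), 2 (14:16), 3 (14:08); right stream records A (14:01), B (14:11), C (14:23); join results 1⨝A (14:04), 2⨝B (14:16), 3⨝B (14:11); result timestamp: max(l.ts; r.ts)]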
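The join result in the figure can be checked with a small batch simulation. This is a nested-loop sketch under the assumption that all records share the same key, not how Kafka Streams evaluates the join incrementally. The class `WindowedJoin` and its `Rec` type are hypothetical, and labels are concatenated with `+` in place of the ⨝ symbol.

```java
import java.util.ArrayList;
import java.util.List;

final class WindowedJoin {

    /** A record with a label and an event timestamp in minutes since midnight. */
    static final class Rec {
        final String label;
        final long ts;

        Rec(String label, long ts) {
            this.label = label;
            this.ts = ts;
        }

        @Override
        public String toString() {
            return String.format("%s@%02d:%02d", label, ts / 60, ts % 60);
        }
    }

    /** Joins all pairs whose timestamps differ by at most `window`; result ts = max of both. */
    static List<Rec> join(List<Rec> left, List<Rec> right, long window) {
        List<Rec> out = new ArrayList<>();
        for (Rec l : left) {
            for (Rec r : right) {
                if (Math.abs(l.ts - r.ts) <= window) {
                    out.add(new Rec(l.label + "+" + r.label, Math.max(l.ts, r.ts)));
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Rec> left = List.of(new Rec("1", 844), new Rec("2", 856), new Rec("3", 848));
        List<Rec> right = List.of(new Rec("A", 841), new Rec("B", 851), new Rec("C", 863));
        System.out.println(join(left, right, 5)); // [1+A@14:04, 2+B@14:16, 3+B@14:11]
    }
}
```

Record C finds no partner because both 2 (14:16) and 3 (14:08) are more than 5 minutes away from 14:23.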
N-way Stream-Stream Join
Chaining stream-stream joins is not associative!
• Order matters: ⨝(s1,s2,s3) != (s1 ⨝ s2) ⨝ s3 != (s1 ⨝ s3) ⨝ s2 != (s2 ⨝ s3) ⨝ s1
Example (window size = 5min):
• s1: 1 (14:01), 2 (14:16), 3 (14:26)
• s2: X (14:06), Y (14:21)
• s3: a (14:11), b (14:16)
• ⨝(s1,s2,s3): 2⨝Y⨝b (14:21)
• s1 ⨝ s2: 1⨝X (14:06), 2⨝Y (14:21), 3⨝Y (14:26)
  • (s1 ⨝ s2) ⨝ s3: 1⨝X⨝a (14:11), 2⨝Y⨝b (14:21)
• s1 ⨝ s3: 2⨝a (14:16), 2⨝b (14:16)
  • (s1 ⨝ s3) ⨝ s2: 2⨝Y⨝a (14:21), 2⨝Y⨝b (14:21)
• s2 ⨝ s3: X⨝a (14:11), Y⨝b (14:21)
  • (s2 ⨝ s3) ⨝ s1: 2⨝X⨝a (14:16), 2⨝Y⨝b (14:21), 3⨝Y⨝b (14:26)
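The three chained orders from the example can be replayed with a batch simulation that feeds the output of one binary join, stamped with max(l.ts, r.ts), into the next. The class `ChainedJoins`, its `Rec` type, and the constants `S1`, `S2`, `S3` are hypothetical names. All records are assumed to share one key, and `+` stands in for ⨝ in the labels.

```java
import java.util.ArrayList;
import java.util.List;

final class ChainedJoins {

    /** A record with a label and an event timestamp in minutes since midnight. */
    static final class Rec {
        final String label;
        final long ts;

        Rec(String label, long ts) {
            this.label = label;
            this.ts = ts;
        }

        @Override
        public String toString() {
            return String.format("%s@%02d:%02d", label, ts / 60, ts % 60);
        }
    }

    static final List<Rec> S1 = List.of(new Rec("1", 841), new Rec("2", 856), new Rec("3", 866));
    static final List<Rec> S2 = List.of(new Rec("X", 846), new Rec("Y", 861));
    static final List<Rec> S3 = List.of(new Rec("a", 851), new Rec("b", 856));

    /** Binary windowed join; the result carries the larger of the two input timestamps. */
    static List<Rec> join(List<Rec> left, List<Rec> right, long window) {
        List<Rec> out = new ArrayList<>();
        for (Rec l : left) {
            for (Rec r : right) {
                if (Math.abs(l.ts - r.ts) <= window) {
                    out.add(new Rec(l.label + "+" + r.label, Math.max(l.ts, r.ts)));
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        long window = 5;
        System.out.println("(s1 x s2) x s3: " + join(join(S1, S2, window), S3, window));
        System.out.println("(s1 x s3) x s2: " + join(join(S1, S3, window), S2, window));
        System.out.println("(s2 x s3) x s1: " + join(join(S2, S3, window), S1, window));
    }
}
```

The three orders produce two, two, and three results respectively, with different members, because each intermediate result moves to the later of its two timestamps before it meets the third stream.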
Stream-Table Join
Stream-Table join is a temporal join
[Figure: table updates a (14:01), b (14:03), c (14:05), b (14:08), a (14:11) and stream records at 14:02, 14:04, 14:06, 14:07, 14:10; table versions: 14:01 {a 14:01}, 14:03 {a 14:01, b 14:03}, 14:05 {a 14:01, b 14:03, c 14:05}, 14:08 {a 14:01, b 14:08, c 14:05}, 14:11 {a 14:11, b 14:08, c 14:05}]
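One way to picture the temporal semantics is a table that keeps every version of a key and answers lookups as of a given timestamp. The sketch below is a conceptual model only: the class `VersionedTable`, its methods, and the keys used for the stream records are assumptions, and it does not describe how Kafka Streams stores table state.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

final class VersionedTable {
    // key -> (update timestamp in minutes since midnight -> value)
    private final Map<String, TreeMap<Long, String>> versions = new HashMap<>();

    void update(String key, long ts, String value) {
        versions.computeIfAbsent(key, k -> new TreeMap<>()).put(ts, value);
    }

    /** Returns the value of `key` as of `ts`, or null if the key did not exist yet. */
    String lookupAsOf(String key, long ts) {
        TreeMap<Long, String> history = versions.get(key);
        if (history == null) {
            return null;
        }
        Map.Entry<Long, String> entry = history.floorEntry(ts); // latest update with timestamp <= ts
        return entry == null ? null : entry.getValue();
    }

    public static void main(String[] args) {
        VersionedTable table = new VersionedTable();
        table.update("a", 841, "a@14:01");
        table.update("b", 843, "b@14:03");
        table.update("c", 845, "c@14:05");
        table.update("b", 848, "b@14:08");
        table.update("a", 851, "a@14:11");

        System.out.println(table.lookupAsOf("b", 842)); // stream record at 14:02: null, b not yet in the table
        System.out.println(table.lookupAsOf("b", 844)); // stream record at 14:04: b@14:03
        System.out.println(table.lookupAsOf("b", 850)); // stream record at 14:10: b@14:08
    }
}
```

A stream record therefore joins with the table version that was valid at its own timestamp, not with whatever the table contains when the record happens to be processed.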
Time Traveling is just too Dangerous
Is it? Well, mind compaction!
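[Figure: after compaction the table changelog holds only c (14:05), b (14:08), a (14:11); the older updates a (14:01) and b (14:03) are gone; remaining table versions: 14:05 {c 14:05}, 14:08 {b 14:08, c 14:05}, 14:11 {a 14:11, b 14:08, c 14:05}; stream records at 14:02, 14:04, 14:06, 14:07, 14:10]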
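Why compaction breaks time travel can be seen by applying the compaction rule, keep only the latest update per key, to the changelog from the figure. The class `CompactionDemo` and its `Update` type are hypothetical. The sketch compacts the whole log at once, whereas a broker compacts in the background and leaves the most recent part of the log untouched.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class CompactionDemo {

    /** One changelog entry: key plus update timestamp in minutes since midnight. */
    static final class Update {
        final String key;
        final long ts;

        Update(String key, long ts) {
            this.key = key;
            this.ts = ts;
        }

        @Override
        public String toString() {
            return String.format("%s@%02d:%02d", key, ts / 60, ts % 60);
        }
    }

    /** Keeps only the latest update per key, preserving the log order of the survivors. */
    static List<Update> compact(List<Update> changelog) {
        Map<String, Update> latest = new LinkedHashMap<>();
        for (Update u : changelog) {
            latest.remove(u.key); // drop the older version so the new one moves to the end
            latest.put(u.key, u);
        }
        return new ArrayList<>(latest.values());
    }

    public static void main(String[] args) {
        List<Update> changelog = List.of(
                new Update("a", 841), new Update("b", 843), new Update("c", 845),
                new Update("b", 848), new Update("a", 851));
        System.out.println(compact(changelog)); // [c@14:05, b@14:08, a@14:11]
    }
}
```

After compaction the oldest surviving update is c (14:05), so a reprocessed stream record at 14:02 or 14:04 no longer finds the table versions it originally joined against.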
You Need to Know your History
[Figure: the stream is truncated by retention time; the table changelog appends new data at the tail, preserves full history within the compaction lag, and has a compacted head (old data)]
[Figure: Lost History: the table changelog is fully compacted up to the tail where new data is appended, while the stream is truncated only by retention time]
You are the doc, Doc
Wrapping up
• Event time vs processing time
• Stream-time, grace period, and retention time (no watermarks)
• Tumbling/hopping vs sliding windows
• Join:
  • Temporal semantics
  • Stream-stream and stream-table
• Tables and time traveling
Hope it was educational.
Thanks! We are hiring!
@MatthiasJSax
matthias@confluent.io | mjsax@apache.org

Future with Zero Down-Time: End-to-end Resiliency with Chaos Engineering and ...
 
Navigating Private Network Connectivity Options for Kafka Clusters
Navigating Private Network Connectivity Options for Kafka ClustersNavigating Private Network Connectivity Options for Kafka Clusters
Navigating Private Network Connectivity Options for Kafka Clusters
 
Apache Flink: Building a Company-wide Self-service Streaming Data Platform
Apache Flink: Building a Company-wide Self-service Streaming Data PlatformApache Flink: Building a Company-wide Self-service Streaming Data Platform
Apache Flink: Building a Company-wide Self-service Streaming Data Platform
 
Explaining How Real-Time GenAI Works in a Noisy Pub
Explaining How Real-Time GenAI Works in a Noisy PubExplaining How Real-Time GenAI Works in a Noisy Pub
Explaining How Real-Time GenAI Works in a Noisy Pub
 
TL;DR Kafka Metrics | Kafka Summit London
TL;DR Kafka Metrics | Kafka Summit LondonTL;DR Kafka Metrics | Kafka Summit London
TL;DR Kafka Metrics | Kafka Summit London
 
A Window Into Your Kafka Streams Tasks | KSL
A Window Into Your Kafka Streams Tasks | KSLA Window Into Your Kafka Streams Tasks | KSL
A Window Into Your Kafka Streams Tasks | KSL
 
Mastering Kafka Producer Configs: A Guide to Optimizing Performance
Mastering Kafka Producer Configs: A Guide to Optimizing PerformanceMastering Kafka Producer Configs: A Guide to Optimizing Performance
Mastering Kafka Producer Configs: A Guide to Optimizing Performance
 
Data Contracts Management: Schema Registry and Beyond
Data Contracts Management: Schema Registry and BeyondData Contracts Management: Schema Registry and Beyond
Data Contracts Management: Schema Registry and Beyond
 
Code-First Approach: Crafting Efficient Flink Apps
Code-First Approach: Crafting Efficient Flink AppsCode-First Approach: Crafting Efficient Flink Apps
Code-First Approach: Crafting Efficient Flink Apps
 
Debezium vs. the World: An Overview of the CDC Ecosystem
Debezium vs. the World: An Overview of the CDC EcosystemDebezium vs. the World: An Overview of the CDC Ecosystem
Debezium vs. the World: An Overview of the CDC Ecosystem
 
Beyond Tiered Storage: Serverless Kafka with No Local Disks
Beyond Tiered Storage: Serverless Kafka with No Local DisksBeyond Tiered Storage: Serverless Kafka with No Local Disks
Beyond Tiered Storage: Serverless Kafka with No Local Disks
 

Recently uploaded

"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsMiki Katsuragi
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024The Digital Insurer
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Enterprise Knowledge
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitecturePixlogix Infotech
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 

Recently uploaded (20)

"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food Manufacturing
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024
 
Hot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort Service
Hot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort ServiceHot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort Service
Hot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort Service
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 

The Flux Capacitor of Kafka Streams and ksqlDB (Matthias J. Sax, Confluent) Kafka Summit 2020

  • 1. The Flux Capacitor of Kafka Streams and ksqlDB Matthias J. Sax | Software Engineer @MatthiasJSax
  • 2. Back to the Time Topic https://www.confluent.io/kafka-summit-san-francisco-2019/whats-the-time-and-why/
  • 3. Stream Processing is our Density.
  • 4. Recap: Time 101. Event Time: • When an event happened (embedded in the message/record) • Ensures deterministic processing • Used to express processing semantics, i.e., impacts the result. Processing Time (aka Wall-clock Time): • When an event/message/record is processed • Used for non-functional properties: timeouts, data rate control, periodic actions • Should not impact the result; otherwise, processing is non-deterministic
  • 5. First, you turn the time circuits on.
  • 6. Tracking Time. Stream-time: the maximum observed input event timestamp (aka ROWTIME) • Monotonically increasing • Allows identifying out-of-order and late input • Tracked per task / used instead of watermarks [Diagram: input records timestamped 14:01, 14:03, 14:08, 14:01, 14:02, 14:11 (order as extracted); stream-time advances through 14:01, 14:03, 14:08, 14:11]
  • 7. Yeah, well, history is gonna change. Input records with descending event timestamp are considered out-of-order • Out-of-order if event-time < stream-time [Diagram: same input records as slide 6; stream-time advances through 14:01, 14:03, 14:08, 14:11; two records are marked out-of-order]
  • 8. You are not thinking fourth-dimensionally [Diagram: records from Topic-A, Partition 0 and Topic-B, Partition 0 (timestamps 14:01, 14:02, 14:03, 14:04, 14:05, 14:08, 14:11) are interleaved into a single processing order; a record is marked out-of-order]
  • 9. You are not thinking fourth-dimensionally. Topic-B, Partition 0 is empty: pause processing and poll() for new data. Unblock when timeout max.task.idle.ms hits. [Diagram: Topic-A, Partition 0 still holds 14:11…; records 14:01, 14:02, 14:04, 14:03, 14:05, 14:08 appear in the processing order]
  • 10. When the hell are they?
  • 11. Time Windows. Tumbling Windows • fixed size / non-overlapping / grouped (i.e., GROUP BY) [Diagram: window boundaries at 14:00, 14:05, 14:10, 14:15] No variable-size window support yet: • Weeks, months, years • No out-of-the-box time zone support • https://github.com/confluentinc/kafka-streams-examples/blob/5.5.0-post/src/test/java/io/confluent/examples/streams/window/DailyTimeWindows.java
  • 12. Time Windows. Hopping Windows • fixed size / overlapping / grouped (i.e., GROUP BY) • Different from a sliding window! [Diagram: five rows of window boundaries, each row shifted by one minute: 14:00/14:05/14:10/14:15, 14:01/14:06/14:11/14:16, 14:02/14:07/14:12/14:17, 14:03/14:08/14:13/14:18, 14:04/14:09/14:14/14:19]
  • 13. Sliding Windows. Different use-case: aggregate the data of the last (e.g.) 10 minutes • Window boundaries are data dependent and unknown upfront (cf. KIP-450) [Diagram: input records at 14:03, 14:07, 14:12, 14:19, 14:26; window boundaries 13:53 | 14:03, 13:57 | 14:07, 14:02 | 14:12, 14:04 | 14:14, 14:08 | 14:18, 14:09 | 14:19, 14:13 | 14:23, 14:16 | 14:26, 14:20 | 14:30]
  • 14. When we are processing, we don’t need watermarks. Grace period: defines a cut-off for out-of-order records that are (too) late • Grace period is defined per operator • Late if stream-time - event-time > grace period • Late data is ignored and not processed by the operator [Diagram: same input records as slide 6; stream-time advances through 14:01, 14:03, 14:08, 14:11; with grace := 5min, one record is late (delay: 6min)]
  • 15. Retention Time. How long to store data in a (windowed) table. Kafka Streams: TimeWindows.of(Duration.ofMinutes(5L)).grace(Duration.ofMinutes(1L)) and Materialized.as(…).withRetention(Duration.ofHours(1L)). ksqlDB: WINDOW TUMBLING(SIZE 5 MINUTES, GRACE PERIOD 1 MINUTE, RETENTION TIME 1 HOUR) [Diagram along stream-time: windowStart @14:00, windowEnd @14:05 (SIZE 5 MINUTES), window close @14:06 (GRACE PERIOD 1 MINUTE), retention (1 hour) from 14:05 to 15:05]
  • 16. If my calculations are correct… Table is continuously updated, but when to emit data to the result stream? • Non-deterministic via caching (default) • Output data rate reduction (non-functional) • Deterministic rate control via suppress() | EMIT FINAL • Periodic or final (for window operations) • Stream-time based! [Diagram: table rows 14:01 Marty, 14:26 Doc, 14:05 Einstein, 14:23 Biff, 14:15 Elaine, 14:23 George; stream-time: 14:26; input records 14:25…, 14:32…; result stream marked "?"]
  • 18. Stream-Stream Join. Streams are conceptually unbounded • Limited join scope via a sliding time window. Kafka Streams: leftStream.join(rightStream, JoinWindows.of(Duration.ofMinutes(5L))); ksqlDB: SELECT * FROM leftStream AS l JOIN rightStream AS r WITHIN 5 MINUTES ON l.id = r.id; [Diagram: records 14:04 1, 14:16 2, 14:08 3 and 14:01 A, 14:11 B, 14:23 C; join results 14:04 1⨝A, 14:16 2⨝B, 14:11 3⨝B; result timestamp is max(l.ts; r.ts)]
  • 19. N-way Stream-Stream Join. Chaining stream-stream joins is not associative! • Order matters: ⨝(s1,s2,s3) != (s1 ⨝ s2) ⨝ s3 != (s1 ⨝ s3) ⨝ s2 != (s2 ⨝ s3) ⨝ s1
  • 20. N-way Stream-Stream Join. Chaining stream-stream joins is not associative! • Order matters: ⨝(s1,s2,s3) != (s1 ⨝ s2) ⨝ s3 != (s1 ⨝ s3) ⨝ s2 != (s2 ⨝ s3) ⨝ s1 [Diagram, window size = 5min: 14:06 X, 14:21 Y; 14:11 a, 14:16 b; 14:01 1, 14:16 2, 14:26 3; 14:21 2⨝Y⨝b]
  • 21. N-way Stream-Stream Join. Chaining stream-stream joins is not associative! • Order matters: ⨝(s1,s2,s3) != (s1 ⨝ s2) ⨝ s3 != (s1 ⨝ s3) ⨝ s2 != (s2 ⨝ s3) ⨝ s1 [Diagram, window size = 5min: 14:06 X, 14:21 Y; 14:01 1, 14:16 2, 14:26 3; 14:06 1⨝X, 14:21 2⨝Y, 14:26 3⨝Y]
  • 22. N-way Stream-Stream Join. Chaining stream-stream joins is not associative! • Order matters: ⨝(s1,s2,s3) != (s1 ⨝ s2) ⨝ s3 != (s1 ⨝ s3) ⨝ s2 != (s2 ⨝ s3) ⨝ s1 [Diagram, window size = 5min: 14:11 a, 14:16 b; 14:06 1⨝X, 14:21 2⨝Y, 14:26 3⨝Y; 14:11 1⨝Y⨝a, 14:21 2⨝Y⨝b]
  • 23. N-way Stream-Stream Join. Chaining stream-stream joins is not associative! • Order matters: ⨝(s1,s2,s3) != (s1 ⨝ s2) ⨝ s3 != (s1 ⨝ s3) ⨝ s2 != (s2 ⨝ s3) ⨝ s1 [Diagram, window size = 5min: 14:11 a, 14:16 b; 14:01 1, 14:16 2, 14:26 3; 14:16 2⨝a, 14:16 2⨝b]
  • 24. N-way Stream-Stream Join. Chaining stream-stream joins is not associative! • Order matters: ⨝(s1,s2,s3) != (s1 ⨝ s2) ⨝ s3 != (s1 ⨝ s3) ⨝ s2 != (s2 ⨝ s3) ⨝ s1 [Diagram, window size = 5min: 14:06 X, 14:21 Y; 14:16 2⨝a, 14:16 2⨝b; 14:21 2⨝Y⨝a, 14:21 2⨝Y⨝b]
  • 25. N-way Stream-Stream Join. Chaining stream-stream joins is not associative! • Order matters: ⨝(s1,s2,s3) != (s1 ⨝ s2) ⨝ s3 != (s1 ⨝ s3) ⨝ s2 != (s2 ⨝ s3) ⨝ s1 [Diagram, window size = 5min: 14:06 X, 14:21 Y; 14:11 a, 14:16 b; 14:11 X⨝a, 14:21 Y⨝b]
  • 26. N-way Stream-Stream Join. Chaining stream-stream joins is not associative! • Order matters: ⨝(s1,s2,s3) != (s1 ⨝ s2) ⨝ s3 != (s1 ⨝ s3) ⨝ s2 != (s2 ⨝ s3) ⨝ s1 [Diagram, window size = 5min: 14:01 1, 14:16 2, 14:26 3; 14:11 X⨝a, 14:21 Y⨝b; 14:16 2⨝X⨝a, 14:21 2⨝Y⨝b, 14:26 3⨝Y⨝b]
  • 27. Stream-Table Join. Stream-Table join is a temporal join [Diagram: table updates 14:01 a, 14:03 b, 14:05 c, 14:08 b, 14:11 a; stream records 14:02…, 14:04…, 14:06…, 14:07…, 14:10…; table versions: @14:01 {14:01 a}, @14:03 {14:01 a, 14:03 b}, @14:05 {14:01 a, 14:03 b, 14:05 c}, @14:08 {14:01 a, 14:08 b, 14:05 c}, @14:11 {14:11 a, 14:08 b, 14:05 c}]
  • 28. Time Traveling is just too Dangerous. Is it? Well, mind compaction! [Diagram: remaining table updates 14:05 c, 14:08 b, 14:11 a; stream records 14:02…, 14:04…, 14:06…, 14:07…, 14:10…; table versions: @14:05 {14:05 c}, @14:08 {14:08 b, 14:05 c}, @14:11 {14:11 a, 14:08 b, 14:05 c}; updates 14:01 a and 14:03 b shown separately]
  • 29. You Need to Know your History [Diagram: Table Changelog Stream; labels: append new data (tail), compaction lag (preserves full history), compacted head (old data), retention time, truncation]
  • 30. You Need to Know your History [Diagram: Table Changelog Stream; labels: append new data (tail), fully compacted, Lost History, retention time, truncation]
  • 31. You are the doc, Doc. Wrapping up • Event time vs processing time • Stream-time, grace period, and retention time (no watermarks) • Tumbling/hopping vs sliding windows • Join: temporal semantics; stream-stream and stream-table • Tables and time traveling
  • 32. Hope it was educational.
  • 33. Thanks! We are hiring! @MatthiasJSax matthias@confluent.io | mjsax@apache.org
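
The rules on slides 6, 7, and 14 (stream-time is the maximum observed event timestamp, a record is out-of-order if event-time < stream-time, and late if stream-time - event-time > grace period) can be sketched in a few lines of plain Java. This is only an illustration and not Kafka Streams code: the class `StreamTimeSketch`, its `Status` enum, and the helper `at` are hypothetical names, and the input order 14:01, 14:03, 14:01, 14:08, 14:02, 14:11 is one assumed reading of the slide diagram.

```java
import java.time.Duration;

// Hypothetical illustration of the slide rules; not part of the Kafka Streams API.
public class StreamTimeSketch {

    /** How a record relates to the stream-time observed so far. */
    enum Status { IN_ORDER, OUT_OF_ORDER, LATE }

    private final long graceMs;
    // No record observed yet, so any first timestamp advances stream-time.
    private long streamTimeMs = Long.MIN_VALUE;

    StreamTimeSketch(Duration grace) {
        this.graceMs = grace.toMillis();
    }

    /** Classifies a record and advances stream-time if its timestamp is a new maximum. */
    Status observe(long eventTimeMs) {
        if (eventTimeMs >= streamTimeMs) {
            streamTimeMs = eventTimeMs;      // stream-time only moves forward
            return Status.IN_ORDER;
        }
        // event-time < stream-time: the record is out-of-order.
        long delayMs = streamTimeMs - eventTimeMs;
        // It is additionally late once the delay exceeds the grace period.
        return delayMs > graceMs ? Status.LATE : Status.OUT_OF_ORDER;
    }

    long streamTimeMs() {
        return streamTimeMs;
    }

    /** Timestamp helper: milliseconds since midnight for hour:minute. */
    static long at(int hour, int minute) {
        return Duration.ofHours(hour).plusMinutes(minute).toMillis();
    }

    /** Formats a timestamp produced by at() back to HH:mm. */
    static String format(long timestampMs) {
        long totalMinutes = timestampMs / 60_000L;
        return String.format("%02d:%02d", totalMinutes / 60, totalMinutes % 60);
    }

    public static void main(String[] args) {
        // grace := 5min, as on slide 14.
        StreamTimeSketch tracker = new StreamTimeSketch(Duration.ofMinutes(5));
        // Assumed arrival order of the records from the slide diagram.
        long[] input = {
            at(14, 1), at(14, 3), at(14, 1), at(14, 8), at(14, 2), at(14, 11)
        };
        for (long eventTime : input) {
            Status status = tracker.observe(eventTime);
            System.out.println(format(eventTime) + " -> " + status
                + " (stream-time " + format(tracker.streamTimeMs()) + ")");
        }
    }
}
```

Under the assumed input order, the record at 14:02 arrives when stream-time is 14:08, so its delay of 6 minutes exceeds the 5-minute grace period, which matches the "late (delay: 6min)" annotation on slide 14. The sketch applies the slide's simplified rule directly to each record; it does not model windows, tasks, or per-operator grace settings.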