Temporal Joins in
Kafka Streams and ksqlDB
Matthias J. Sax | Software Engineer
@MatthiasJSax
Ecosystem
ksqlDB: streaming database for Apache Kafka
• SQL interface to process data stored in Apache Kafka
• Declarative approach to stream processing
• Queries instead of “programming”
Kafka Streams: Java library for stream processing
• Part of Apache Kafka
• “Functional” DSL, but still programming
Both ksqlDB and Kafka Streams support joins.
Joins are powerful but streaming joins can be difficult to understand.
Joins: The Basics
https://www.confluent.io/kafka-summit-ny19/zen-and-the-art-of-streaming-joins/
Temporal Joins – Why should I give a Damn?
Static Data vs Streaming Data
• Data is constantly in motion
• Input tables are not static but updated all the time
• The result must be updated continuously and with deterministic semantics
Relational Joins are Defined over (static) Tables only:
• What about joining streams?
• What about joining a stream and a table?
Temporal Joins define deterministic (event-time) semantics
over continuously changing inputs.
Event-time vs Processing-time
Database transactions are not predictable!
Database transactions offer ACID guarantees that are defined over processing time:
• If you run a set of concurrent (read/write) transactions over a database multiple times, there is no guarantee that you get the same result!
• You “only” get a guarantee that each “run” produces a consistent result
Example: Tx Processing
[Figure: transactions Tx1 (w), Tx3 (r, join), and Tx2 (w); the outcome is marked “?”]
Streams, Records, Timestamps
Topic can be processed as:
• Event Stream (STREAM in ksqlDB / KStream in Kafka Streams)
• Changelog Stream (TABLE in ksqlDB / KTable in Kafka Streams)
• “Tx Order” is determined upstream
Topic contains:
• Timestamped records
• Timestamps define “Tx Order”
• Need to obey pre-defined “Tx Order” when processing the data streams (ie, event-time semantics)
• Timestamps are data!
• Temporal joins are defined on event-time: provides deterministic processing semantics
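A minimal sketch of the idea behind event-time processing: if records from several inputs are always consumed in timestamp order, the processing order no longer depends on which input happens to be read first. The record layout `(timestamp, label)`, the sample timestamps, and the function name are assumptions for illustration; this is not Kafka Streams code.

```python
import heapq

def merge_by_event_time(*inputs):
    """Merge inputs (each already sorted by timestamp) into one event-time order."""
    # Timestamps are zero-padded "HH:MM" strings, so string order equals time order.
    return list(heapq.merge(*inputs, key=lambda record: record[0]))

# Hypothetical records: two table writes and one stream read (join)
table_updates = [("14:01", "Tx1 w"), ("14:05", "Tx2 w")]
stream_events = [("14:03", "Tx3 r (join)")]

order = merge_by_event_time(table_updates, stream_events)
same_order = merge_by_event_time(stream_events, table_updates)
```

Both calls yield the same sequence, so the read at 14:03 always sees the write from 14:01 and never the one from 14:05. Records with identical timestamps would need an additional tie-breaking rule, which this sketch leaves out.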
All* joins in Kafka Streams and ksqlDB are temporal joins!
* GlobalKTables in Kafka Streams are the one exception (ie, the stream-GlobalKTable join is non-deterministic)
Versioned Tables
Tables evolve over time:
We can associate a different table version for each point in stream-time
Changelog Stream (timestamp, key):
(14:01, a) (14:03, b) (14:05, c) (14:08, b) (14:11, a)
Table Versions (stream-time: key@timestamp of the latest update):
14:01: a@14:01
14:03: a@14:01, b@14:03
14:05: a@14:01, b@14:03, c@14:05
14:08: a@14:01, b@14:08, c@14:05
14:11: a@14:11, b@14:08, c@14:05
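The table versions can be derived mechanically from the changelog. The sketch below is a model of that idea, not how Kafka Streams stores state: it keeps one snapshot per changelog record and looks up the version valid at a given stream-time. The changelog is the one from this slide; since the slide shows no values, each snapshot maps a key to the timestamp of its latest update. Function names are made up for illustration.

```python
from bisect import bisect_right

def build_versions(changelog):
    """Return [(timestamp, snapshot), ...], one table version per changelog record."""
    versions = []
    snapshot = {}
    for ts, key in changelog:  # assumed to be in timestamp order
        snapshot = {**snapshot, key: ts}  # copy, so older versions stay untouched
        versions.append((ts, snapshot))
    return versions

def version_at(versions, ts):
    """Return the table version valid at stream-time ts (empty before the first update)."""
    timestamps = [version_ts for version_ts, _ in versions]
    index = bisect_right(timestamps, ts)  # number of versions with timestamp <= ts
    return versions[index - 1][1] if index else {}

# Zero-padded "HH:MM" strings compare in time order
changelog = [("14:01", "a"), ("14:03", "b"), ("14:05", "c"), ("14:08", "b"), ("14:11", "a")]
versions = build_versions(changelog)
at_1408 = version_at(versions, "14:08")
at_1404 = version_at(versions, "14:04")
```

A lookup between two updates (for example at 14:04) returns the version created by the last update at or before that time.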
Temporal Table-Table Join
Join tables with the same version (ie, event-time)
[Figure: versions of the Left Table, Right Table, and Result Table along a shared stream-time axis; tick marks at 14:01, 14:03, 14:05, 14:08, 14:11 and at 14:02, 14:04, 14:06, 14:07, 14:09, 14:10]
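One way to picture a temporal table-table join: apply the updates of both changelogs in timestamp order and, after each update, emit the join result for the updated key. This is a simplified model (inner join only, no deletes), and the keys, values, and function name are hypothetical.

```python
import heapq

def table_table_join(left_changelog, right_changelog):
    """Inner-join two changelogs of (ts, key, value); return the result changelog."""
    tagged_left = [(ts, "left", key, value) for ts, key, value in left_changelog]
    tagged_right = [(ts, "right", key, value) for ts, key, value in right_changelog]
    state = {"left": {}, "right": {}}
    result = []
    # Both inputs are assumed to be sorted by timestamp already
    for ts, side, key, value in heapq.merge(tagged_left, tagged_right,
                                            key=lambda update: update[0]):
        state[side][key] = value
        if key in state["left"] and key in state["right"]:
            result.append((ts, key, (state["left"][key], state["right"][key])))
    return result

# Hypothetical updates for a single key "k"
left = [("14:01", "k", "L1"), ("14:03", "k", "L2")]
right = [("14:02", "k", "R1"), ("14:04", "k", "R2")]
joined = table_table_join(left, right)
```

Every result update carries the timestamp of the input update that triggered it, so the result is itself a versioned table.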
Example: Table-Table Join
Data Enrichment: Stream-Table Join
Enrich events with table data: “lookup join”
For each event-stream record, do a table lookup:
• Temporal table lookup: join a stream record with event-time T to table version T
[Figure: timelines of the changelog stream, the input table, the input stream, and the result stream, with records between 14:02 and 14:10]
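The temporal lookup can be sketched as follows: for a stream record with event-time T, use the latest table update with timestamp at most T. The sketch keeps the whole update history in memory, which a real system would not do; the keys, values, and function name are invented for illustration.

```python
from bisect import bisect_right

def stream_table_join(table_changelog, stream):
    """Join each stream record with the table version valid at the record's timestamp."""
    history = {}  # key -> list of (timestamp, value), in timestamp order
    for ts, key, value in table_changelog:
        history.setdefault(key, []).append((ts, value))
    result = []
    for ts, key, event in stream:
        updates = history.get(key, [])
        # Number of updates with timestamp <= ts; the last of them is the valid one
        index = bisect_right([update_ts for update_ts, _ in updates], ts)
        if index:  # inner join: events without a table entry at time ts are dropped
            result.append((ts, key, event, updates[index - 1][1]))
    return result

# Hypothetical data; zero-padded "HH:MM" strings compare in time order
table_changelog = [("14:02", "k", "v1"), ("14:05", "k", "v2"), ("14:10", "k", "v3")]
stream = [("14:04", "k", "e1"), ("14:07", "k", "e2")]
enriched = stream_table_join(table_changelog, stream)
```

The event at 14:04 is enriched with `v1` and the event at 14:07 with `v2`, regardless of when the update from 14:10 is read.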
Example: Stream-Table Join
There is no concept of “bootstrapping” a table:
• Table versions evolve based on processing progress, ie, stream-time.
• This ensures that the correct table version is loaded at each point in stream-time.
Joining Event Streams – How to Handle Infinite Input
Event Streams are infinite and there is no concept of “versions”
Limit join “scope” with a temporal join condition, ie, a time-band-join.
-- mental model
SELECT * FROM stream1, stream2
WHERE
-- equi-join condition
stream1.key = stream2.key
AND
-- time condition
stream1.ts - windowSize <= stream2.ts
AND stream2.ts <= stream1.ts + windowSize
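The mental model above translates almost literally into a nested loop. The sketch below is only that mental model, not an efficient or incremental implementation. Timestamps are written as minutes past 14:00, and the sample records follow the window-size-5 example of this deck as far as it can be read from the slides.

```python
def window_join(left, right, window_size):
    """Inner join of (ts, key) records: equal keys, timestamps at most window_size apart."""
    result = []
    for left_ts, left_key in left:
        for right_ts, right_key in right:
            same_key = left_key == right_key
            in_band = left_ts - window_size <= right_ts <= left_ts + window_size
            if same_key and in_band:
                result.append((left_key, left_ts, right_ts))
    return result

left_stream = [(4, 1), (11, 2), (12, 3)]   # 14:04, 14:11, 14:12
right_stream = [(1, 1), (16, 3)]           # 14:01, 14:16
pairs = window_join(left_stream, right_stream, window_size=5)
```

Key 1 joins (14:04 with 14:01) and key 3 joins (14:12 with 14:16); key 2 has no partner within the window.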
Joining Event Streams – How to Handle Infinite Input
Example: join window size 5
Records are shown as (timestamp, id).
Left Stream: (14:04, 1) (14:11, 2) (14:12, 3)
Right Stream: (14:01, 1) (14:16, 3)
Result Stream: (14:04, 1) (14:16, 3)
SELECT *
FROM leftStream AS l JOIN rightStream AS r
WITHIN 5 minutes ON l.id = r.id;
Left/Outer Stream-Stream Join
Example: spurious left join result with window size 5
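Records are shown as (timestamp, id).
Left Stream: (14:04, 1) (14:11, 2) (14:12, 3)
Right Stream: (14:01, 1) (14:16, 3)
Result Stream (joined): (14:04, 1) (14:16, 3)
Result Stream (left record only): (14:11, 2) (14:12, 3)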
Left/Outer Stream-Stream Join
Example: delayed left join result with window size 5 (WIP)
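Records are shown as (timestamp, id).
Left Stream: (14:04, 1) (14:11, 2) (14:12, 3)
Right Stream: (14:01, 1) (14:16, 3)
Result Stream (joined): (14:04, 1) (14:16, 3)
Result Stream (left record only): (14:11, 2)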
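The difference between the two left-join behaviors can be modeled in a few lines. In the eager variant, a left record without a current match is emitted right away, which is spurious if a matching right record shows up later. In the delayed variant, the left-only result is held back until stream-time has passed the end of the record's join window. This is a sketch of the semantics only, not the Kafka Streams implementation. It assumes in-order input, uses minutes past 14:00 as timestamps, and flushes pending records at the end of the finite sample, which a real unbounded stream would never do.

```python
def left_join(left, right, window_size, delayed):
    """Left stream-stream join of (ts, key) records.

    Returns (key, left_ts, right_ts) tuples; right_ts is None for left-only results.
    """
    events = sorted([(ts, "left", key) for ts, key in left] +
                    [(ts, "right", key) for ts, key in right])
    seen = {"left": [], "right": []}
    pending = []  # delayed mode only: unmatched left records as (ts, key)
    result = []
    for ts, side, key in events:
        # A pending left record cannot match anymore once stream-time
        # has moved past the end of its window.
        for expired in [p for p in pending if p[0] + window_size < ts]:
            pending.remove(expired)
            result.append((expired[1], expired[0], None))
        other = "right" if side == "left" else "left"
        matches = [other_ts for other_ts, other_key in seen[other]
                   if other_key == key and abs(other_ts - ts) <= window_size]
        for other_ts in matches:
            left_ts, right_ts = (ts, other_ts) if side == "left" else (other_ts, ts)
            result.append((key, left_ts, right_ts))
            if (left_ts, key) in pending:
                pending.remove((left_ts, key))
        if side == "left" and not matches:
            if delayed:
                pending.append((ts, key))
            else:
                result.append((key, ts, None))  # eager: may turn out to be spurious
        seen[side].append((ts, key))
    # End of the finite sample input: flush what is still pending (a simplification)
    result.extend((key, ts, None) for ts, key in pending)
    return result

left_stream = [(4, 1), (11, 2), (12, 3)]
right_stream = [(1, 1), (16, 3)]
eager = left_join(left_stream, right_stream, window_size=5, delayed=False)
delayed = left_join(left_stream, right_stream, window_size=5, delayed=True)
```

In the eager run, key 3 appears twice: first as a left-only result at 14:12 and then as a joined result once the right record from 14:16 arrives. The delayed run emits a left-only result for key 2 only.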
Timestamping Result Records
Result determinism requires deterministic result record event-timestamps
Out-of-order data processing needs to be considered
Example: Stream-Stream join with window size 5
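Records are shown as (timestamp, id).
First input stream: (14:04, 1) (14:16, 2) (14:08, 2)
Second input stream: (14:01, 1) (14:11, 2) (14:23, 2)
Result stream: (14:04, 1) (14:16, 2) (14:11, 2)
Result timestamp: max(l.ts; r.ts)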
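A small worked version of the max rule, using the joined pairs from this example as far as they can be read from the slide (timestamps as minutes past 14:00; which input is the left one is not recoverable, so the two inputs are simply called first and second). The function name is made up.

```python
def result_timestamp(first_ts, second_ts):
    """Result event-time depends only on the two input timestamps, not on arrival order."""
    return max(first_ts, second_ts)

# (key, timestamp in first input, timestamp in second input), window size 5
joined_pairs = [(1, 4, 1), (2, 16, 11), (2, 8, 11)]
result = [(key, result_timestamp(a, b)) for key, a, b in joined_pairs]

timestamps = [ts for _, ts in result]
in_order = timestamps == sorted(timestamps)
```

The out-of-order input record at 14:08 produces a result with timestamp 14:11 after a result with timestamp 14:16 was already emitted, so downstream operators have to handle out-of-order data as well.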
The Outlier: GlobalKTables
GlobalKTables have no concept of stream-time
Designed for “static” (but still mutable) data
• In contrast to regular tables, a GlobalKTable is bootstrapped at startup
• GlobalKTable updates are applied unsynchronized
• Stream-GlobalKTable join is non-deterministic on GlobalKTable updates
[Figure: timelines of the global changelog, the global table, and the input stream; timestamps shown: 14:02, 14:04, 14:05, 14:07, 14:09, 14:10]
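To see why this join is non-deterministic, consider a model in which the lookup ignores record timestamps and simply uses whatever table state has been applied so far. The parameter `applied_updates` stands in for "how far the global table has been loaded when the stream is processed", which can differ from run to run. All names, keys, and values are hypothetical, and treating the table as fixed during the run is a simplification.

```python
def global_table_join(global_changelog, stream, applied_updates):
    """Join stream records against a table state that is not synchronized on time."""
    table = {}
    for _ts, key, value in global_changelog[:applied_updates]:
        table[key] = value  # timestamps of the updates are ignored
    return [(ts, key, event, table[key]) for ts, key, event in stream if key in table]

global_changelog = [("14:02", "k", "v1"), ("14:05", "k", "v2"), ("14:10", "k", "v3")]
stream = [("14:04", "k", "e1"), ("14:07", "k", "e2"), ("14:09", "k", "e3")]

fully_bootstrapped = global_table_join(global_changelog, stream, applied_updates=3)
partially_loaded = global_table_join(global_changelog, stream, applied_updates=1)
```

With the table fully bootstrapped, the event from 14:04 is joined with `v3`, a value written at 14:10. A run that has applied fewer updates produces a different result for the same input.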
Broadcast vs Replication and Temporal Semantics
TABLE
KTable
n/a
GlobalKTable
TABLE*
KTable*
n/a
(*) with a custom timestamp extractor that ensures “preferred processing”, e.g., one that always returns timestamp zero
Wrapping Up
Temporal Joins are a Key Concept in Data Stream Processing
• Generalization of SQL joins (for snapshots) to continuously changing data
• Ensure deterministic / reproducible results
• Types of Temporal Joins:
• Joining evolving tables
• Joining streams to evolving tables
• Stream-Stream join
• Outlier: GlobalKTables
• Sharding vs replication & time synchronized vs unsynchronized/non-deterministic
Thanks! We are hiring!
@MatthiasJSax
matthias@confluent.io | mjsax@apache.org
Joins: The Basics
Join Types
INNER, LEFT (OUTER), RIGHT (OUTER), (FULL) OUTER
R left-join S <=> S right-join R
Join Conditions
Most distributed systems only support equi-joins (ie, left.attribute = right.otherAttribute) because they can be computed efficiently.
Joining Event Streams – How to Handle Infinite Input
Example: join window size 5
Records are shown as (timestamp, id).
Left Stream: (14:04, 1) (14:11, 2) (14:23, 3)
Right Stream: (14:01, 1) (14:12, 3) (14:16, 2)
Result Stream: (14:04, 1) (14:16, 2)
SELECT *
FROM leftStream AS l JOIN rightStream AS r
WITHIN 5 minutes ON l.id = r.id;
Left/Outer Stream-Stream Join
Example: spurious left join result with window size 5
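Records are shown as (timestamp, id).
Left Stream: (14:04, 1) (14:11, 2) (14:23, 3)
Right Stream: (14:01, 1) (14:12, 3) (14:16, 2)
Result Stream (joined): (14:04, 1) (14:16, 2)
Result Stream (left record only): (14:11, 2) (14:23, 3)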
Left/Outer Stream-Stream Join
Example: delayed left join result with window size 5 (WIP)
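Records are shown as (timestamp, id).
Left Stream: (14:04, 1) (14:11, 2) (14:10, 3)
Right Stream: (14:01, 1) (14:16, 2)
Result Stream (joined): (14:04, 1) (14:16, 2)
Result Stream (left record only): (14:10, 3)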
Bending Time: Timestamp Extractor
Fighting Time Jitter (max.task.idle.ms)
Future Work: outer-s-s-join / versioned tables / s-t join / time synchronization / custom topic prioritization

 
Tiered Storage 101 | Kafla Summit London
Tiered Storage 101 | Kafla Summit LondonTiered Storage 101 | Kafla Summit London
Tiered Storage 101 | Kafla Summit London
 
Building a Self-Service Stream Processing Portal: How And Why
Building a Self-Service Stream Processing Portal: How And WhyBuilding a Self-Service Stream Processing Portal: How And Why
Building a Self-Service Stream Processing Portal: How And Why
 
From the Trenches: Improving Kafka Connect Source Connector Ingestion from 7 ...
From the Trenches: Improving Kafka Connect Source Connector Ingestion from 7 ...From the Trenches: Improving Kafka Connect Source Connector Ingestion from 7 ...
From the Trenches: Improving Kafka Connect Source Connector Ingestion from 7 ...
 
Future with Zero Down-Time: End-to-end Resiliency with Chaos Engineering and ...
Future with Zero Down-Time: End-to-end Resiliency with Chaos Engineering and ...Future with Zero Down-Time: End-to-end Resiliency with Chaos Engineering and ...
Future with Zero Down-Time: End-to-end Resiliency with Chaos Engineering and ...
 
Navigating Private Network Connectivity Options for Kafka Clusters
Navigating Private Network Connectivity Options for Kafka ClustersNavigating Private Network Connectivity Options for Kafka Clusters
Navigating Private Network Connectivity Options for Kafka Clusters
 
Apache Flink: Building a Company-wide Self-service Streaming Data Platform
Apache Flink: Building a Company-wide Self-service Streaming Data PlatformApache Flink: Building a Company-wide Self-service Streaming Data Platform
Apache Flink: Building a Company-wide Self-service Streaming Data Platform
 
Explaining How Real-Time GenAI Works in a Noisy Pub
Explaining How Real-Time GenAI Works in a Noisy PubExplaining How Real-Time GenAI Works in a Noisy Pub
Explaining How Real-Time GenAI Works in a Noisy Pub
 
TL;DR Kafka Metrics | Kafka Summit London
TL;DR Kafka Metrics | Kafka Summit LondonTL;DR Kafka Metrics | Kafka Summit London
TL;DR Kafka Metrics | Kafka Summit London
 
A Window Into Your Kafka Streams Tasks | KSL
A Window Into Your Kafka Streams Tasks | KSLA Window Into Your Kafka Streams Tasks | KSL
A Window Into Your Kafka Streams Tasks | KSL
 
Mastering Kafka Producer Configs: A Guide to Optimizing Performance
Mastering Kafka Producer Configs: A Guide to Optimizing PerformanceMastering Kafka Producer Configs: A Guide to Optimizing Performance
Mastering Kafka Producer Configs: A Guide to Optimizing Performance
 
Data Contracts Management: Schema Registry and Beyond
Data Contracts Management: Schema Registry and BeyondData Contracts Management: Schema Registry and Beyond
Data Contracts Management: Schema Registry and Beyond
 
Code-First Approach: Crafting Efficient Flink Apps
Code-First Approach: Crafting Efficient Flink AppsCode-First Approach: Crafting Efficient Flink Apps
Code-First Approach: Crafting Efficient Flink Apps
 
Debezium vs. the World: An Overview of the CDC Ecosystem
Debezium vs. the World: An Overview of the CDC EcosystemDebezium vs. the World: An Overview of the CDC Ecosystem
Debezium vs. the World: An Overview of the CDC Ecosystem
 
Beyond Tiered Storage: Serverless Kafka with No Local Disks
Beyond Tiered Storage: Serverless Kafka with No Local DisksBeyond Tiered Storage: Serverless Kafka with No Local Disks
Beyond Tiered Storage: Serverless Kafka with No Local Disks
 

Recently uploaded

State of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 previewState of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
Prayukth K V
 
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdfSmart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
91mobiles
 
Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*
Frank van Harmelen
 
IOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptx
IOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptxIOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptx
IOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptx
Abida Shariff
 
The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
Jemma Hussein Allen
 
Connector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a buttonConnector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a button
DianaGray10
 
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Ramesh Iyer
 
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualitySoftware Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Inflectra
 
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Tobias Schneck
 
ODC, Data Fabric and Architecture User Group
ODC, Data Fabric and Architecture User GroupODC, Data Fabric and Architecture User Group
ODC, Data Fabric and Architecture User Group
CatarinaPereira64715
 
JMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and GrafanaJMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and Grafana
RTTS
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance
 
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
Product School
 
Essentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with ParametersEssentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with Parameters
Safe Software
 
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMsTo Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
Paul Groth
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance
 
Leading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdfLeading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdf
OnBoard
 
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance
 
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance
 
UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3
DianaGray10
 

Recently uploaded (20)

State of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 previewState of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
 
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdfSmart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
 
Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*
 
IOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptx
IOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptxIOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptx
IOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptx
 
The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
 
Connector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a buttonConnector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a button
 
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
 
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualitySoftware Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
 
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
 
ODC, Data Fabric and Architecture User Group
ODC, Data Fabric and Architecture User GroupODC, Data Fabric and Architecture User Group
ODC, Data Fabric and Architecture User Group
 
JMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and GrafanaJMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and Grafana
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
 
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
 
Essentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with ParametersEssentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with Parameters
 
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMsTo Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
 
Leading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdfLeading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdf
 
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
 
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
 
UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3
 

Temporal-Joins in Kafka Streams and ksqlDB | Matthias Sax, Confluent

  • 1. Temporal Joins in Kafka Streams and ksqlDB Matthias J. Sax | Software Engineer @MatthiasJSax
  • 2. Ecosystem. ksqlDB: streaming database for Apache Kafka • SQL interface to process data stored in Apache Kafka • Declarative approach to stream processing • Queries instead of “programming”. Kafka Streams: Java library for stream processing • Part of Apache Kafka • “Functional” DSL but still programming. Both ksqlDB and Kafka Streams support joins. Joins are powerful but streaming joins can be difficult to understand.
  • 4. Temporal Joins – Why should I give a Damn? Static Data vs Streaming Data: • Data is constantly in motion • Input tables are not static but updated all the time • The result must be updated continuously and with deterministic semantics. Relational Joins are Defined over (static) Tables only: • What about joining streams? • What about joining a stream and a table? Temporal Joins define deterministic (event-time) semantics over continuously changing inputs.
  • 5. Event-time vs Processing-time. Database Transactions are not predictable! Database Txs offer ACID guarantees that are defined over processing time: • If you run a set of concurrent (read/write) transactions over a database multiple times, there is no guarantee that you get the same result! • You “only” get a guarantee that each “run” produces a consistent result.
  • 6. Example: Tx Processing. [Figure: transactions Tx1 (w), Tx2 (w), and Tx3 (r, join); the outcome is marked “?”.]
  • 7. Streams, Records, Timestamps. Topic can be processed as: • Event Stream (STREAM in ksqlDB / KStream in Kafka Streams) • Changelog Stream (TABLE in ksqlDB / KTable in Kafka Streams) • “Tx Order” is determined upstream. Topic contains: • Timestamped records • Timestamps define “Tx Order” • Need to obey pre-defined “Tx Order” when processing the data streams (ie, event-time semantics) • Timestamps are data! • Temporal joins are defined on event-time: provides deterministic processing semantics.
  • 8. All* joins in Kafka Streams and ksqlDB are temporal joins! (*) GlobalKTables in Kafka Streams are one exception (ie, non-deterministic stream-globalTable-join).
  • 9. Versioned Tables. Tables evolve over time: We can associate a different table version for each point in stream-time. Changelog Stream: 14:01 a, 14:03 b, 14:05 c, 14:08 b, 14:11 a. Table Versions by stream-time: 14:01: {14:01 a}; 14:03: {14:01 a, 14:03 b}; 14:05: {14:01 a, 14:03 b, 14:05 c}; 14:08: {14:01 a, 14:08 b, 14:05 c}; 14:11: {14:11 a, 14:08 b, 14:05 c}.
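The idea of one table version per point in stream-time can be sketched in a few lines of Python. This is only an illustrative model, not how Kafka Streams or ksqlDB store tables. The helper names `minutes` and `table_version` are assumptions made for this sketch. The changelog records are the ones listed on the slide above (timestamp and key); since the slide shows no values, the sketch stores each key's update timestamp as its value.

```python
def minutes(hhmm):
    """Convert an 'HH:MM' string to minutes since midnight."""
    h, m = hhmm.split(":")
    return int(h) * 60 + int(m)

# Changelog records from the slide: (timestamp, key).
CHANGELOG = [("14:01", "a"), ("14:03", "b"), ("14:05", "c"),
             ("14:08", "b"), ("14:11", "a")]

def table_version(changelog, stream_time):
    """Return the table as of stream_time.

    For each key, the result holds the timestamp of its latest update
    at or before stream_time.
    """
    table = {}
    for ts, key in sorted(changelog, key=lambda rec: minutes(rec[0])):
        if minutes(ts) <= minutes(stream_time):
            table[key] = ts  # a later update replaces the earlier one
    return table

for t in ["14:01", "14:03", "14:05", "14:08", "14:11"]:
    print(t, table_version(CHANGELOG, t))
```

Asking for the version at 14:08 yields b's update from 14:08 but still a's update from 14:01, because a is not updated again until 14:11.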
  • 10. Temporal Table-Table Join. Join tables with the same version (ie, event-time). [Figure: Left Table, Right Table, and Result Table over stream-time.]
  • 12. Data Enrichment: Stream-Table Join. Enrich events with table data: “lookup join”. For each event-stream record, do a table lookup: • Temporal table lookup: join a stream record with event-time T to table version T. [Figure: Changelog Stream, Input Table, Input Stream, and Result Stream.]
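A temporal table lookup can be modeled as "find the latest table update with timestamp <= T". The sketch below is a simplified, in-memory illustration under stated assumptions: the class `VersionedTable`, the function `stream_table_join`, and all sample records are hypothetical and not part of the Kafka Streams or ksqlDB APIs. It also keeps the full table history available up front, whereas a real system advances table versions as stream-time progresses.

```python
from bisect import bisect_right

class VersionedTable:
    """Keeps every update per key, sorted by timestamp (hypothetical helper)."""

    def __init__(self):
        self._history = {}  # key -> (sorted timestamps, values in same order)

    def put(self, key, ts, value):
        times, values = self._history.setdefault(key, ([], []))
        idx = bisect_right(times, ts)
        times.insert(idx, ts)
        values.insert(idx, value)

    def get_as_of(self, key, ts):
        """Return the latest value for key with timestamp <= ts, or None."""
        times, values = self._history.get(key, ([], []))
        idx = bisect_right(times, ts)
        return values[idx - 1] if idx > 0 else None

def stream_table_join(events, table):
    """Inner join: enrich each (ts, key, value) event with table version ts."""
    result = []
    for ts, key, value in events:
        row = table.get_as_of(key, ts)
        if row is not None:
            result.append((ts, key, value, row))
    return result

# Hypothetical data; timestamps are minutes past 14:00.
table = VersionedTable()
table.put("alice", 2, "Berlin")
table.put("alice", 6, "Paris")
events = [(4, "alice", "click"), (7, "alice", "buy"), (1, "alice", "view")]
joined = stream_table_join(events, table)
print(joined)
```

The event at minute 4 joins with "Berlin" and the event at minute 7 with "Paris". The event at minute 1 predates the first table update, so the inner join drops it.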
  • 14. There is no concept of “bootstrapping” a table: • Table versions will be evolved based on processing progress, ie, stream-time. • This ensures that the correct table version is loaded at each point in stream-time.
  • 15. Joining Event Streams – How to Handle Infinite Input. Event Streams are infinite and there is no concept of “versions”. Limit join “scope” with a temporal join condition, ie, a time-band-join.
      -- mental model
      SELECT * FROM stream1, stream2
      WHERE
        -- equi-join condition
        stream1.key = stream2.key
        AND
        -- time condition
        stream1.ts - windowSize <= stream2.ts
        AND stream2.ts <= stream1.ts + windowSize
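The mental-model query above translates directly into a predicate over pairs of records. The following Python sketch evaluates that predicate with a nested loop over two finite lists. It illustrates the join condition only, not how a streaming engine executes the join (which would keep windowed state instead of rescanning the inputs). The function name `band_join` and the sample records are assumptions made for this example.

```python
def band_join(left, right, window_size):
    """Inner join of two lists of (ts, key, value) records within a time band."""
    out = []
    for l_ts, l_key, l_val in left:
        for r_ts, r_key, r_val in right:
            same_key = l_key == r_key  # equi-join condition
            in_band = l_ts - window_size <= r_ts <= l_ts + window_size  # time condition
            if same_key and in_band:
                out.append((l_key, l_val, r_val))
    return out

# Hypothetical records; timestamps are minutes past 14:00.
left = [(4, 1, "L1"), (11, 2, "L2"), (16, 3, "L3")]
right = [(1, 1, "R1"), (12, 3, "R3"), (23, 2, "R2")]
result = band_join(left, right, window_size=5)
print(result)
```

With a window size of 5, keys 1 and 3 join because their timestamps are 3 and 4 minutes apart. Key 2 does not join because its two records are 12 minutes apart.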
  • 16. Joining Event Streams – How to Handle Infinite Input. Example: join window size 5. SELECT * FROM leftStream AS l JOIN rightStream AS r WITHIN 5 minutes ON l.id = r.id; [Figure: Left Stream, Right Stream, and Result Stream.]
  • 17. Left/Outer Stream-Stream Join. Example: spurious left join result with window size 5. [Figure: Left Stream, Right Stream, and Result Stream.]
  • 18. Left/Outer Stream-Stream Join. Example: delayed left join result with window size 5 (WIP). [Figure: Left Stream, Right Stream, and Result Stream.]
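One possible way to model the "delayed" left join is to hold back an unmatched left record until stream-time has passed the end of its join window, and only then emit it with an empty right side. The Python sketch below is a simplified simulation under that assumption. It is not the Kafka Streams implementation, and the function name `left_join_delayed` and the sample records are hypothetical.

```python
def left_join_delayed(records, window_size):
    """Left stream-stream join that delays unmatched left results.

    records: list of (side, ts, key) in processing order, side is 'L' or 'R'.
    Returns a list of (left_ts, right_ts_or_None, key).
    """
    left_buf, right_buf = [], []  # all (ts, key) records seen so far
    pending = []                  # left records that have no match yet
    results = []
    stream_time = float("-inf")
    for side, ts, key in records:
        stream_time = max(stream_time, ts)
        if side == "L":
            matches = [r_ts for r_ts, r_key in right_buf
                       if r_key == key and abs(r_ts - ts) <= window_size]
            for r_ts in matches:
                results.append((ts, r_ts, key))
            if not matches:
                pending.append((ts, key))
            left_buf.append((ts, key))
        else:
            for l_ts, l_key in left_buf:
                if l_key == key and abs(l_ts - ts) <= window_size:
                    results.append((l_ts, ts, key))
                    if (l_ts, l_key) in pending:
                        pending.remove((l_ts, l_key))
            right_buf.append((ts, key))
        # Emit unmatched left records only once their window has closed.
        still_open = []
        for l_ts, l_key in pending:
            if stream_time > l_ts + window_size:
                results.append((l_ts, None, l_key))
            else:
                still_open.append((l_ts, l_key))
        pending = still_open
    return results

# Hypothetical records; timestamps are minutes past 14:00.
records = [("R", 1, 1), ("L", 4, 1), ("L", 11, 2),
           ("R", 12, 3), ("L", 16, 3), ("R", 23, 2)]
print(left_join_delayed(records, window_size=5))
```

In this run the left record with key 2 at minute 11 is emitted with an empty right side only after the record at minute 23 advances stream-time past minute 16. An eager left join would instead emit it immediately on arrival, which is spurious whenever a matching right record shows up later within the window.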
  • 19. Timestamping Result Records. Result determinism requires deterministic result record event-timestamps. Out-of-order data processing needs to be considered. Example: Stream-Stream join with window size 5; the result record timestamp is max(l.ts; r.ts). [Figure: example input and result records with timestamps.]
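Why max(l.ts; r.ts) gives a deterministic result timestamp can be checked with a small simulation: whichever side arrives first, the joined record gets the same timestamp. The Python sketch below is illustrative only. The function name `symmetric_join` and the sample records are assumptions, and the join logic is a simplified in-memory model rather than the Kafka Streams implementation.

```python
def symmetric_join(records, window_size):
    """Inner stream-stream join over (side, ts, key) records in arrival order.

    Returns sorted (key, result_ts) pairs, where result_ts is the larger
    of the two input timestamps.
    """
    seen = {"L": [], "R": []}
    results = []
    for side, ts, key in records:
        other = "R" if side == "L" else "L"
        for o_ts, o_key in seen[other]:
            if o_key == key and abs(o_ts - ts) <= window_size:
                results.append((key, max(ts, o_ts)))
        seen[side].append((ts, key))
    return sorted(results)

# Hypothetical records; timestamps are minutes past 14:00.
in_order = [("L", 4, 1), ("R", 8, 1)]
out_of_order = [("R", 8, 1), ("L", 4, 1)]
print(symmetric_join(in_order, 5), symmetric_join(out_of_order, 5))
```

Both arrival orders produce the same result record with timestamp 8. Stamping the result with the timestamp of whichever record triggered the join would instead give 8 in the first run and 4 in the second.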
  • 20. The Outlier: GlobalKTables. GlobalKTables have no concept of stream-time. Designed for “static” (but still mutable) data: • In contrast to regular tables, a GlobalKTable is bootstrapped at startup • GlobalKTable updates are applied unsynchronized • Stream-GlobalKTable join is non-deterministic on GlobalKTable updates. [Figure: Global Changelog, Global Table, and Input Stream.]
  • 21. Broadcast vs Replication and Temporal Semantics. TABLE KTable n/a GlobalKTable TABLE* KTable* n/a (*) with custom timestamp extractor that ensures “preferred processing”, e.g., always returns timestamp zero
  • 22. Wrapping Up. Temporal Joins are a Key Concept in Data Stream Processing • Generalization of SQL joins (for snapshots) to continuously changing data • Ensure deterministic / reproducible results • Types of Temporal Joins: • Joining evolving tables • Joining streams to evolving tables • Stream-Stream join • Outlier: GlobalKTables • Sharding vs replication & time synchronized vs unsynchronized/non-deterministic
  • 23. Thanks! We are hiring! @MatthiasJSax matthias@confluent.io | mjsax@apache.org
  • 25. Joins: The Basics. Join Types: INNER, LEFT (OUTER), RIGHT (OUTER), (FULL) OUTER. R left-join S <=> S right-join R. Join Conditions: Most distributed systems only support equi-joins (ie, left.attribute = right.otherAttribute) because they can be computed efficiently.
  • 26. Joining Event Streams – How to Handle Infinite Input. Example: join window size 5. SELECT * FROM leftStream AS l JOIN rightStream AS r WITHIN 5 minutes ON l.id = r.id; [Figure: Left Stream, Right Stream, and Result Stream.]
  • 27. Left/Outer Stream-Stream Join. Example: spurious left join result with window size 5. [Figure: Left Stream, Right Stream, and Result Stream.]
  • 28. Left/Outer Stream-Stream Join. Example: delayed left join result with window size 5 (WIP). [Figure: Left Stream, Right Stream, and Result Stream.]
  • 29. Bending Time: Timestamp Extractor
  • 30. Fighting Time Jitter (max.task.idle.ms)
  • 31. Future Work: outer-s-s-join / versioned tables / s-t join / time synchronization / custom topic prioritization