Interactive Query (IQ) in Kafka Streams:
The Next Generation
Vicky Papavasileiou, Software Engineer @ Confluent
John Roesler, Software Engineer @ Confluent, Apache Kafka PMC
What is Kafka Streams?
Stream processing framework
● Stateless stream processing: filters, transformations
● Stateful stream processing: aggregations, joins
● Big data processing: partitioning, grouping (aka shuffle)
● Event stream processing: windowing
● Relational stream processing: tables, joins, foreign-key joins
What else is Kafka Streams?
Streaming application framework
● Stateful applications
● Scalable cluster
● High availability
● Streaming environment
Application developers only need to handle:
● Metadata APIs
● Interactive Query APIs
Use Case: Current Balance
What is Interactive Query?
● Query the state of streaming applications from outside the application
transactions:
1 Jay $10 burger
2 Sue $11 pizza
3 Jay $5 coffee
currentBalance (one update per transaction):
1 Jay $10
2 Sue $11
3 Jay $15
currentBalance (customer, balance, last_purchase):
Jay $15 coffee (Jay $10 burger before transaction 3)
Sue $11 pizza
IQ: get (Jay)
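The currentBalance table above can be reproduced with a small fold over the transactions. This is a minimal sketch in plain Java, not Kafka Streams code: the class and field names are hypothetical, and amounts are whole dollars to keep it short.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class CurrentBalanceDemo {
    // One record of the transactions stream.
    static final class Transaction {
        final String customer;
        final int amount; // whole dollars, to keep the sketch simple
        final String item;

        Transaction(String customer, int amount, String item) {
            this.customer = customer;
            this.amount = amount;
            this.item = item;
        }
    }

    // One row of the currentBalance table: balance and last_purchase.
    static final class Balance {
        final int balance;
        final String lastPurchase;

        Balance(int balance, String lastPurchase) {
            this.balance = balance;
            this.lastPurchase = lastPurchase;
        }

        @Override
        public String toString() {
            return "$" + balance + " " + lastPurchase;
        }
    }

    // Folds the transactions, in offset order, into one row per customer.
    static Map<String, Balance> aggregate(List<Transaction> transactions) {
        Map<String, Balance> table = new LinkedHashMap<>();
        for (Transaction t : transactions) {
            Balance previous = table.get(t.customer);
            int newBalance = (previous == null ? 0 : previous.balance) + t.amount;
            table.put(t.customer, new Balance(newBalance, t.item));
        }
        return table;
    }

    public static void main(String[] args) {
        Map<String, Balance> currentBalance = aggregate(List.of(
            new Transaction("Jay", 10, "burger"),
            new Transaction("Sue", 11, "pizza"),
            new Transaction("Jay", 5, "coffee")));
        // The equivalent of get (Jay) and get (Sue) against the finished table.
        System.out.println("Jay " + currentBalance.get("Jay")); // prints "Jay $15 coffee"
        System.out.println("Sue " + currentBalance.get("Sue")); // prints "Sue $11 pizza"
    }
}
```

In a real application the table lives inside the Streams instance, and IQ is what makes the final lookup available to outside callers.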
Use IQ and Metadata API to build a Streaming App
Instance A: REST API, Kafka Streams, Meta, currentBalance part 0
Instance B: REST API, Kafka Streams, Meta, currentBalance part 1
GET bal?id=Jay
1. Send the request to an app instance
2. Consult Streams Metadata
3. Forward to correct instance
4. Fetch from local state with IQ
5. Return local state(s) to the query handler
6. Send the results back to the caller
200: {id:Jay, bal:$15, last_transaction:coffee}
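The six steps can be sketched without any Kafka dependency. Everything below is a hypothetical stand-in: the Instance class plays the role of one app instance, the shared cluster list plays the role of the Streams metadata, String.hashCode replaces Kafka's real partitioner, and forwarding is a direct method call rather than an HTTP request.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class QueryRoutingDemo {
    // What the caller gets back: the value and the instance that served it.
    static final class Response {
        final String servedBy;
        final String value;

        Response(String servedBy, String value) {
            this.servedBy = servedBy;
            this.value = value;
        }
    }

    // Hypothetical app instance holding one partition of the currentBalance store.
    static final class Instance {
        final String name;
        final int ownedPartition;
        final List<Instance> cluster; // stand-in for the Streams metadata
        final Map<String, String> localState = new HashMap<>();

        Instance(String name, int ownedPartition, List<Instance> cluster) {
            this.name = name;
            this.ownedPartition = ownedPartition;
            this.cluster = cluster;
        }

        // Steps 2 to 5: consult metadata, forward if needed, fetch from local state.
        Response handle(String key) {
            int partition = partitionFor(key, cluster.size());
            if (partition != ownedPartition) {
                return cluster.get(partition).handle(key); // forward to the owner
            }
            return new Response(name, localState.get(key));
        }
    }

    // Assumption: String.hashCode stands in for Kafka's real partitioner.
    static int partitionFor(String key, int numPartitions) {
        return Math.floorMod(key.hashCode(), numPartitions);
    }

    // Stores a row in the instance that owns the key's partition.
    static void put(List<Instance> cluster, String key, String value) {
        cluster.get(partitionFor(key, cluster.size())).localState.put(key, value);
    }

    // Builds a two-instance cluster where instance i owns partition i.
    static List<Instance> buildCluster() {
        List<Instance> cluster = new ArrayList<>();
        cluster.add(new Instance("A", 0, cluster));
        cluster.add(new Instance("B", 1, cluster));
        put(cluster, "Jay", "$15 coffee");
        put(cluster, "Sue", "$11 pizza");
        return cluster;
    }

    public static void main(String[] args) {
        // Step 1: the caller may contact any instance. Step 6: the answer is the same.
        for (Instance entry : buildCluster()) {
            Response r = entry.handle("Jay");
            System.out.println("asked " + entry.name + ", served by " + r.servedBy + ": " + r.value);
        }
    }
}
```

Whichever instance receives the request, the row is always served by the instance that owns the key's partition.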
Problems with IQ v1
● Customization is hard.
○ Both when plugging in custom storage engines and adding new query types
● Not enough control: encapsulation is too aggressive
○ Simple abstraction that gets in the way of creating performant, reliable applications
● The consistency model is too sparse
○ Have to choose between "strong" and "eventual"
IQ v2 to the rescue
Preview release in Apache Kafka 3.2
● Customization
○ Easy to add custom queries
○ Easy to support custom state stores
● Control
○ Request API: Attach specific requirements to the query (e.g. skip cache)
○ Response API: Get results of individual partitions and metadata (e.g. execution time)
● Consistency
○ Adds "Position" concept to implement various consistency guarantees
Simplicity of IQ v2
IQ v1:
val currentBalanceStore =
    kafkaStreams.store(
        StoreQueryParameters.fromNameAndType(
            "currentBalance",
            QueryableStoreTypes.timestampedKeyValueStore()
        )
    );
val balance =
    currentBalanceStore.get("Jay").value();
IQ v2:
val request =
    inStore("currentBalance")
        .withQuery(KeyQuery.withKey("Jay"));
val result = kafkaStreams
    .query(request)
    .getOnlyPartitionResult().getResult();
Easy to use and intuitive API
Roadmap
1. Customization
○ Add custom queries in user space
2. Control
○ Intuitive query interface
○ Flexible response handling
3. Consistency
○ Broad range of consistency levels
1. Customization
Customize IQ by implementing custom queries and custom stores
● IQ v1
○ Kafka Streams allows custom stores, but exposing them through IQ requires changes to Apache Kafka
○ A new query must be added to the ReadOnlyKeyValueStore interface and to all classes that implement it
● IQ v2
○ Add custom queries to custom stores in user space
○ Easy to contribute new queries to Apache Kafka
Customization: Anatomy of a state store
MeteredKeyValueStore: serializes/deserializes keys and values
CachingKeyValueStore: buffers writes
ChangeLoggingKeyValueStore: makes writes durable
RocksDB, InMemory, custom:
● implement KeyValueStore
● stores serialized data
● pluggable
Anatomy of IQ
get (Jay) passes down through the store layers:
● Metered: get(String)
○ 1. Serialize
● Caching: get(Bytes), with its write buffer
○ 2. Check if it is in the buffer, else get from state store
● Change logging, backed by the changelog topic
● RocksDB: get(Bytes)
Customization: Add new query using IQ v1
● Add to Metered store
● Add to Caching store
● Add to Change logging store
● Add to RocksDB store
● And every other store that implements the ReadOnlyKeyValueStore interface
Customization: Add new query using IQ v2
● Only add to the store that evaluates the query (e.g. custom)
● If you want to use cached data, then integrate with the cache
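The idea of adding a query only to the store that evaluates it can be modeled in a few lines. This is a simplified, hypothetical model and not the Kafka Streams API: Query, KeyLookup, PrefixCount, BaseStore and CustomStore are all invented names. The custom store handles the one query type it knows and delegates the rest, so no other layer has to change.

```java
import java.util.Optional;
import java.util.TreeMap;

final class CustomQueryDemo {
    // Marker for a query that produces a result of type R.
    interface Query<R> {}

    // A built-in style key lookup.
    static final class KeyLookup implements Query<String> {
        final String key;

        KeyLookup(String key) {
            this.key = key;
        }
    }

    // Hypothetical custom query defined in user space: count keys with a prefix.
    static final class PrefixCount implements Query<Integer> {
        final String prefix;

        PrefixCount(String prefix) {
            this.prefix = prefix;
        }
    }

    // Base store: only knows key lookups. An empty result means either a
    // missing key or a query type this store does not understand.
    static class BaseStore {
        final TreeMap<String, String> data = new TreeMap<>();

        @SuppressWarnings("unchecked")
        <R> Optional<R> query(Query<R> query) {
            if (query instanceof KeyLookup) {
                return Optional.ofNullable((R) data.get(((KeyLookup) query).key));
            }
            return Optional.empty();
        }
    }

    // Custom store: evaluates the new query itself and delegates everything else.
    static final class CustomStore extends BaseStore {
        @Override
        @SuppressWarnings("unchecked")
        <R> Optional<R> query(Query<R> query) {
            if (query instanceof PrefixCount) {
                String prefix = ((PrefixCount) query).prefix;
                Integer count = (int) data.keySet().stream()
                    .filter(key -> key.startsWith(prefix))
                    .count();
                return Optional.of((R) count);
            }
            return super.query(query);
        }
    }

    public static void main(String[] args) {
        CustomStore store = new CustomStore();
        store.data.put("Jay", "$15 coffee");
        store.data.put("Sue", "$11 pizza");
        System.out.println(store.query(new KeyLookup("Jay")));
        System.out.println(store.query(new PrefixCount("J")));
        // A store that was never taught the custom query returns nothing for it.
        System.out.println(new BaseStore().query(new PrefixCount("J")));
    }
}
```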
Customization: Add reverseRange and reverseAll queries in IQ v1
KIP-617 PR #9137: Changes to 44 files / 111 files in total
Customization: Add RangeQuery in IQ v2
KIP-805 PR #11598: Changes to 5 files (the rest is internal refactoring)
2. Control
Control which partitions and stores to query
● IQ v1
○ Either query one specific partition or all partitions
○ All queries compose the cache with the underlying bytes stores
● IQ v2
○ Specify subset of partitions per query
○ Specify whether to use cache per query
○ Implement custom logic per query in user code
Control: IQ v2 request
val request =
    inStore("currentBalance")                 // 1. Specify the store to query
        .withQuery(KeyQuery.withKey("Jay"))   // 2. Specify the query
        .withPartitions(Sets.newHashSet(1));  // 3. Specify the partitions
Predefined queries include key lookups, range queries, windowed queries
2. Control
Control what to return and how to handle a response
● IQ v1
○ Results of all partitions are combined into one iterator; it is not possible to distinguish which rows come from which partition
○ If a partition fails, the entire query fails. It is impossible to know which partition failed, so the entire query must be repeated
● IQ v2
○ Iterator per partition
○ Failures per partition. If a partition has failed, repeat the query only for that partition
○ Return extra information such as execution time, tracing information, etc.
Control: IQ v2 response
Issue the query:
val result = kafkaStreams
    .query(request)
    .getOnlyPartitionResult()
    .getResult();
public final class StateQueryResult<R> {
    Map<Integer, QueryResult<R>> getPartitionResults();  // Map of results per partition
    QueryResult<R> getOnlyPartitionResult();             // Get results of single partition
}
public final class QueryResult<R> {
    R getResult();                     // Get actual rows
    List<String> getExecutionInfo();   // Execution info: execution time, store info, etc.
    FailureReason getFailureReason();  // Failure reason: store was not the active, partition does not exist, etc.
    Position getPosition();            // Position of store at time of query evaluation
}
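Per-partition results make it possible to repeat a query only for the partitions that failed. The sketch below illustrates that idea with hypothetical names (PartitionResult, queryWithRetry, flakyCluster). It does not call Kafka Streams; the cluster is simulated so that partition 1 fails on its first call.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;
import java.util.function.IntFunction;

final class PartitionRetryDemo {
    // Simplified stand-in for a per-partition query result.
    static final class PartitionResult {
        final String result;        // null when the partition failed
        final String failureReason; // null when the partition succeeded

        PartitionResult(String result, String failureReason) {
            this.result = result;
            this.failureReason = failureReason;
        }

        boolean isSuccess() {
            return failureReason == null;
        }
    }

    // Queries every partition once, then repeats the query only for failed partitions.
    static Map<Integer, PartitionResult> queryWithRetry(
            Set<Integer> partitions, IntFunction<PartitionResult> queryPartition, int maxAttempts) {
        Map<Integer, PartitionResult> results = new TreeMap<>();
        Set<Integer> pending = new TreeSet<>(partitions);
        for (int attempt = 0; attempt < maxAttempts && !pending.isEmpty(); attempt++) {
            for (int partition : pending) {
                results.put(partition, queryPartition.apply(partition));
            }
            pending.removeIf(partition -> results.get(partition).isSuccess());
        }
        return results;
    }

    // Simulated cluster: partition 1 fails on its first call, then succeeds.
    static IntFunction<PartitionResult> flakyCluster(Map<Integer, Integer> callCounts) {
        return partition -> {
            int calls = callCounts.merge(partition, 1, Integer::sum);
            if (partition == 1 && calls == 1) {
                return new PartitionResult(null, "store was not the active");
            }
            return new PartitionResult("rows of partition " + partition, null);
        };
    }

    public static void main(String[] args) {
        Map<Integer, Integer> callCounts = new HashMap<>();
        Map<Integer, PartitionResult> results =
            queryWithRetry(Set.of(0, 1), flakyCluster(callCounts), 3);
        results.forEach((partition, r) -> System.out.println(partition + ": " + r.result));
        System.out.println("calls per partition: " + callCounts);
    }
}
```

Partition 0 is queried once and partition 1 twice, which is the saving over IQ v1, where the whole query would have to be repeated.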
3. Consistency
Streaming applications require a broad range of consistency levels
● Eventual consistency doesn’t cut it:
○ Developers cannot validate correctness
○ Applications are not user-friendly
● Kafka Streams offers:
○ Strong consistency by default
○ Eventual consistency with StoreQueryParameters#enableStaleStores
Strong consistency through failure recovery
transactions:
1 Jay $10 burger
2 Sue $11 pizza
3 Jay $5 coffee
currentBalance changelog:
1 Jay $10
2 Sue $11
3 Jay $15
1. Active fails
2. New active gets elected
3. Restore from changelog
get (Jay)
● During restoration, IQ will fail
● IQ succeeds only after the new active has fully caught up
● This ensures strong consistency: IQ is guaranteed to see the most recent write
Eventual consistency through standbys
[Diagram: the transactions topic and currentBalance changelog from the previous slide, replicated from an active instance to a standby instance]
1. Configure Streams with replicas
2. Query standby with StoreQueryParameters#enableStaleStores
3. Return stale results for get (Jay)
● Query during replication
● Eventual consistency as there is no guarantee on staleness
3. Consistency
[Figure: the consistency levels Strong, Monotonic reads, Bounded and Eventual, compared on staleness, availability and latency, each on a Low / Middle / High scale]
● Strong consistency: See most recent write
○ Only query the active (no load balancing)
● Monotonic reads: No time travel in query results
○ Query any instance that is up to bound of last read
● Bounded consistency: Eventual consistency but with a limit on staleness
○ Query any instance that is up to bound
● Eventual consistency: No guarantee how stale the results are
○ Query any instance
● IQ v1 offers Strong (default) and Eventual (StoreQueryParameters#enableStaleStores)
● IQ v2 offers the levels in between as well, such as Monotonic reads and Bounded
Consistency: Position tracking in IQ v2
Track the position of the state store with respect to the input topic offset
transactions:
1 Jay $10 burger
2 Sue $11 pizza
3 Jay $5 coffee
currentBalance changelog, with the position after each update:
1 Jay $10, Position: {0: 1}
2 Sue $11, Position: {0: 2}
3 Jay $15, Position: {0: 3}
currentBalance (customer, total, last_reason):
Jay $15 coffee
Sue $11 pizza
Standby instance, currentBalance (customer, total, last_reason):
Jay $10 burger
Sue $11 pizza
Position: {0: 2} (previously {0: 1})
Consistency: Monotonic reads
Within a single client session, a query is guaranteed to see the same or newer values than a previous query
[Diagram: a client querying Server A and Server B]
1. query("Jay") reaches a server at Position: 53, which returns {Jay $10 burger}, {position: 53}
2. query("Jay") with position ≥ 53 reaches a server at Position: 54, which returns {Jay $15 coffee}, {position: 54}
3. query("Jay") with position ≥ 54 reaches a server at Position: 53, which returns ERR: Not up to bound
Consistency: Bounded by Offset Lag
Bound is: (highest offset seen - acceptable lag)
Less strict than monotonic, but within reason
[Diagram: a client querying Server A and Server B]
1. query("Jay") reaches a server at Position: 53, which returns {Jay $10 burger}, {position: 53}. Bound: 53-5=48
2. query("Jay") with position ≥ 48 reaches a server at Position: 54, which returns {Jay $15 coffee}, {position: 54}. Bound: 54-5=49
3. query("Jay") with position ≥ 49 reaches a server at Position: 53, which returns {Jay $10 burger}, {position: 53}
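Both monotonic reads and the offset-lag bound come down to the same client-side bookkeeping: remember the highest position seen and send (highest offset seen - acceptable lag) as the bound. The sketch below is plain Java with hypothetical Server and Session classes, not the IQ v2 API. An acceptable lag of 0 gives monotonic reads, and a lag of 5 gives the bounded example from the slides.

```java
import java.util.Optional;

final class PositionBoundDemo {
    // Hypothetical replica: a value plus the position its state has reached.
    static final class Server {
        final long position;
        final String value;

        Server(long position, String value) {
            this.position = position;
            this.value = value;
        }

        // Answers only if caught up to the bound; empty stands for "not up to bound".
        Optional<String> query(long bound) {
            return position >= bound ? Optional.of(value) : Optional.empty();
        }
    }

    // Client session that remembers the highest position it has seen so far.
    static final class Session {
        final long acceptableLag; // 0 gives monotonic reads
        long highestSeen = 0;

        Session(long acceptableLag) {
            this.acceptableLag = acceptableLag;
        }

        // Bound is: (highest offset seen - acceptable lag), never below zero.
        long bound() {
            return Math.max(0, highestSeen - acceptableLag);
        }

        Optional<String> query(Server server) {
            Optional<String> result = server.query(bound());
            if (result.isPresent()) {
                highestSeen = Math.max(highestSeen, server.position);
            }
            return result;
        }
    }

    public static void main(String[] args) {
        Server behind = new Server(53, "Jay $10 burger");
        Server ahead = new Server(54, "Jay $15 coffee");
        for (long lag : new long[] {0, 5}) {
            Session session = new Session(lag);
            System.out.println("lag " + lag + ": "
                + session.query(behind) + " " + session.query(ahead) + " " + session.query(behind));
        }
    }
}
```

With lag 0 the third query is rejected because the server at position 53 is below the bound of 54. With lag 5 the bound is 49, so the same server may answer with the older value.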
Consistency: Bounded by Time
Generally, a more intuitive way to think about bounding eventual consistency.
1. Use Kafka AdminClient to translate time to offsets
a. OffsetSpec.forTimestamp(System.currentTimeMillis() - Duration.ofHours(1).toMillis())
b. Admin.listOffsets(partitions)
2. Use the offsets as a lower bound on query position
a. PositionBound.at(
       Position.emptyPosition().withComponent(topic, partition, offset)
   );
Consistency: IQ v2
Specify position bound:
val request =
    inStore("currentBalance")
        .withQuery(KeyQuery.withKey("Jay"))
        .withPartitions(Sets.newHashSet(1))
        .withPositionBound(PositionBound.at(53));
Get position of result:
val result = kafkaStreams
    .query(request)
    .getOnlyPartitionResult()
    .getPosition();
Take-aways
IQ v2 gives developers:
● Customization: Build custom queries for custom stores in user space
● Control: Handle partitions separately and get meaningful information in the response
● Consistency: Implement a broad range of consistency levels such as monotonic reads and bounded eventual consistency
Conclusion
● Released in preview mode in Apache Kafka 3.2
○ The API may still change
● Future work:
○ Add more queries to Apache Kafka (an excellent first KIP/contribution)
○ Clean up for GA
■ Add options to hit/skip the cache
■ More query coverage for things like prefix scan
○ Extend the consistency model to propagate through the topology
Wanna join our team?
Confluent is hiring for ksqlDB/Streams engineering positions in UK and Germany
https://careers.confluent.io/open-positions

 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhisoniya singh
 
Hyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your Budget
Hyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your BudgetHyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your Budget
Hyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your BudgetEnjoy Anytime
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 3652toLead Limited
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptxLBM Solutions
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxMaking_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxnull - The Open Security Community
 
Artificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraArtificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraDeakin University
 

Recently uploaded (20)

SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
 
Snow Chain-Integrated Tire for a Safe Drive on Winter Roads
Snow Chain-Integrated Tire for a Safe Drive on Winter RoadsSnow Chain-Integrated Tire for a Safe Drive on Winter Roads
Snow Chain-Integrated Tire for a Safe Drive on Winter Roads
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
 
Hyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your Budget
Hyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your BudgetHyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your Budget
Hyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your Budget
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food Manufacturing
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptx
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxMaking_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
 
Artificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraArtificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning era
 

Interactive Query in Kafka Streams: The Next Generation with Vasiliki Papavasileiou and John Roesler | Kafka Summit London 2022

  • 1. IQ in Kafka Streams: The Next Generation. Vicky Papavasileiou, Software Engineer @ Confluent. John Roesler, Software Engineer @ Confluent, Apache Kafka PMC.
  • 2. Interactive Query in Kafka Streams: The Next Generation. Vicky Papavasileiou, Software Engineer @ Confluent. John Roesler, Software Engineer @ Confluent, Apache Kafka PMC.
  • 3. What is Kafka Streams?
  • 4. What is Kafka Streams? Stream processing framework ● Stateless stream processing: filters, transformations ● Stateful stream processing: aggregations, joins ● Big data processing: partitioning, grouping (aka shuffle) ● Event stream processing: windowing ● Relational stream processing: tables, joins, foreign-key joins
  • 5. What else is Kafka Streams? Streaming application framework ● Stateful applications ● Scalable cluster ● High availability ● Streaming environment. Application developers only need to handle: ● Metadata APIs ● Interactive Query APIs
  • 7. What is Interactive Query? ● Query the state of streaming applications from outside the application. [Diagram: the transactions stream (1: Jay, $10, burger; 2: Sue, $11, pizza; 3: Jay, $5, coffee) updates the currentBalance table (customer, balance, last_purchase) through 1: Jay $10, 2: Sue $11, 3: Jay $15, so the table moves from "Jay $10 burger" to "Jay $15 coffee" alongside "Sue $11 pizza"; an IQ call get(Jay) reads the table.]
  • 8. Use IQ and Metadata API to build a Streaming App. [Diagram: two application instances, A and B, each with a REST API, Kafka Streams, and Streams metadata; the currentBalance store is split into partition 0 and partition 1 across them.]
  • 9. Use IQ and Metadata API to build a Streaming App. [Same diagram, with the request GET bal?id=Jay arriving at Instance A.] 1. Send the request to an app instance
  • 10. Use IQ and Metadata API to build a Streaming App. 1. Send the request to an app instance 2. Consult Streams Metadata
  • 11. Use IQ and Metadata API to build a Streaming App. 1. Send the request to an app instance 2. Consult Streams Metadata 3. Forward to correct instance
  • 12. Use IQ and Metadata API to build a Streaming App. 1. Send the request to an app instance 2. Consult Streams Metadata 3. Forward to correct instance 4. Fetch from local state with IQ
  • 13. Use IQ and Metadata API to build a Streaming App. 1. Send the request to an app instance 2. Consult Streams Metadata 3. Forward to correct instance 4. Fetch from local state with IQ 5. Return local state(s) to the query handler
  • 14. Use IQ and Metadata API to build a Streaming App. 1. Send the request to an app instance 2. Consult Streams Metadata 3. Forward to correct instance 4. Fetch from local state with IQ 5. Return local state(s) to the query handler 6. Send the results back to the caller. [Response shown for GET bal?id=Jay: 200: {id:Jay, bal:$15, last_transaction:coffee}]
  • 16. Problems with IQ v1 ● Customization is hard ○ Both when plugging in custom storage engines and when adding new query types ● Not enough control: encapsulation is too aggressive ○ Simple abstraction that gets in the way of creating performant, reliable applications ● The consistency model is too sparse ○ Have to choose between "strong" and "eventual"
  • 17. IQ v2 to the rescue. Preview release in Apache Kafka 3.2 ● Customization ○ Easy to add custom queries ○ Easy to support custom state stores ● Control ○ Request API: Attach specific requirements to the query (e.g. skip cache) ○ Response API: Get results of individual partitions and metadata (e.g. execution time) ● Consistency ○ Adds "Position" concept to implement various consistency guarantees
  • 18. Simplicity of IQ v2: Easy to use and intuitive API. IQ v1: val currentBalanceStore = kafkaStreams.store( StoreQueryParameters.fromNameAndType( "currentBalance", QueryableStoreTypes.timestampedKeyValueStore() ) ); val balance = currentBalanceStore.get("Jay").value(); IQ v2: val request = inStore("currentBalance") .withQuery(KeyQuery.withKey("Jay")); val result = kafkaStreams .query(request) .getOnlyPartitionResult().getResult();
  • 19. Roadmap 1. Customization ○ Add custom queries in user space 2. Control ○ Intuitive query interface ○ Flexible response handling 3. Consistency ○ Broad range of consistency levels
  • 20. 1. Customization. Customize IQ by implementing custom queries and custom stores ● IQ v1 ○ Kafka Streams allows custom stores but requires changes to Apache Kafka to expose them through IQ ○ Need to add the new query to the ReadOnlyKeyValueStore interface and all classes that implement it ● IQ v2 ○ Add custom queries to custom stores in user space ○ Easy to contribute new queries to Apache Kafka
  • 21. Customization: Anatomy of a state store. MeteredKeyValueStore: serialize/deserialize keys and values. CachingKeyValueStore: buffers writes. ChangeLoggingKeyValueStore: make writes durable. RocksDB, InMemory, custom: ● implement KeyValueStore ● stores serialized data ● pluggable
  • 22. Anatomy of IQ. [Diagram: get(Jay) passes through the store layers. Metered: get(String). Caching: get(Bytes), with the write buffer. Change logging, with the changelog topic. RocksDB: get(Bytes).] 1. Serialize 2. Check if it is in the buffer, else get from state store
  • 23. Customization: Add new query using IQ v1. Add to Metered store. Add to Caching store. Add to Change logging store. Add to RocksDB store. And every other store that implements the ReadOnlyKeyValueStore interface. [Store layers as on slide 21: MeteredKeyValueStore: serialize/deserialize keys and values. CachingKeyValueStore: buffers writes. ChangeLoggingKeyValueStore: make writes durable. RocksDB, InMemory, custom: ● implement KeyValueStore ● stores serialized data ● pluggable]
  • 24. Customization: Add new query using IQ v2. Only add to the store that evaluates the query (e.g. custom). If you want to use cached data, then integrate with the cache. [Store layers as on slide 21: MeteredKeyValueStore: serialize/deserialize keys and values. CachingKeyValueStore: buffers writes. ChangeLoggingKeyValueStore: make writes durable. RocksDB, InMemory, custom: ● implement KeyValueStore ● stores serialized data ● pluggable]
  • 25. Customization: Add reverseRange and reverseAll queries in IQ v1. KIP-617 PR #9137: Changes to 44 files / 111 files in total
  • 26. Customization: Add RangeQuery in IQ v2. KIP-805 PR #11598: Changes to 5 files (the rest is internal refactoring)
  • 27. Roadmap 1. Customization ○ Add custom queries in user space 2. Control ○ Intuitive query interface ○ Flexible response handling 3. Consistency ○ Broad range of consistency levels
  • 28. 2. Control. Control which partitions and stores to query ● IQ v1 ○ Either query one specific partition or all partitions ○ All queries compose the cache with the underlying bytes stores ● IQ v2 ○ Specify subset of partitions per query ○ Specify whether to use cache per query ○ Implement custom logic per query in user code
  • 29. Control: IQ v2 request. val request = inStore("currentBalance") .withQuery(KeyQuery.withKey("Jay")) .withPartitions(Sets.newHashSet(1)); 1. Specify the store to query 2. Specify the query. Predefined queries include key lookups, range queries, windowed queries 3. Specify the partitions
  • 30. 2. Control. Control what to return and how to handle a response ● IQ v1 ○ Results of all partitions are combined into one iterator; it is not possible to distinguish which rows come from which partition ○ If a partition fails, the entire query fails. It is impossible to know which partition failed, so the entire query must be repeated ● IQ v2 ○ Iterator per partition ○ Failures per partition. If a partition has failed, repeat the query only for that partition ○ Return extra information such as execution time, tracing information, etc.
  • 31. Control: IQ v2 response. Issue the query: val result = kafkaStreams .query(request) .getOnlyPartitionResult() .getResult(); public final class StateQueryResult<R> { Map<Integer, QueryResult<R>> getPartitionResults() (map of results per partition); QueryResult<R> getOnlyPartitionResult() (get results of single partition) } public final class QueryResult<R> { R getResult() (get actual rows); List<String> getExecutionInfo() (execution info: execution time, store info, etc.); FailureReason getFailureReason() (failure reason: store was not the active, partition does not exist, etc.); Position getPosition() (position of store at time of query evaluation) }
  • 32. Roadmap 1. Customization ○ Add custom queries in user space 2. Control ○ Intuitive query interface ○ Flexible response handling 3. Consistency ○ Broad range of consistency levels
  • 33. 3. Consistency. Streaming applications require a broad range of consistency levels ● Eventual consistency doesn't cut it: ○ Developers cannot validate correctness ○ Applications are not user-friendly ● Kafka Streams offers: ○ Strong consistency by default ○ Eventual consistency with StoreQueryParameters#enableStaleStores
  • 34. Strong consistency through failure recovery. 1. Active fails 2. New active gets elected 3. Restore from changelog ● During restoration, IQ will fail ● Only after the new active has fully caught up will IQ succeed ● Ensures strong consistency: IQ is guaranteed to see the most recent write. [Diagram: transactions (1: Jay, $10, burger; 2: Sue, $11, pizza; 3: Jay, $5, coffee) produce currentBalance updates 1: Jay $10, 2: Sue $11, 3: Jay $15 on the active instance and in the changelog; the new active instance rebuilds currentBalance from the changelog before answering get(Jay).]
  • 35. Eventual consistency through standbys. 1. Configure Streams with replicas 2. Query standby with StoreQueryParameters#enableStaleStores 3. Return stale results ● Query during replication ● Eventual consistency, as there is no guarantee on staleness. [Diagram: same transactions and currentBalance updates as on slide 34; the active instance writes to the changelog, and a standby instance replicating from it answers get(Jay).]
  • 36. 3. Consistency. Strong consistency: See most recent write ● Only query the active (no load balancing). [Chart: Staleness, Availability, and Latency axes, each scaled Low / Middle / High; level shown: Strong.]
  • 38. 3. Consistency. Monotonic reads: No time travel in query results ● Query any instance that is up to the bound of the last read. [Chart: Staleness, Availability, and Latency axes, each scaled Low / Middle / High; levels shown: Strong, Monotonic reads, Eventual.]
  • 39. 3. Consistency. Bounded consistency: Eventual consistency but with a limit on staleness ● Query any instance that is up to the bound. [Chart: Staleness, Availability, and Latency axes, each scaled Low / Middle / High; levels shown: Strong, Monotonic reads, Bounded, Eventual.]
  • 41. Consistency: Position tracking in IQ v2. Track the position of the state store with respect to the input topic offset. [Diagram: as transactions 1 to 3 are processed, the active's currentBalance table (customer, total, last_reason) advances through Position: {0: 1}, {0: 2}, {0: 3}, ending at "Jay $15 coffee" and "Sue $11 pizza"; the standby instance, fed from the changelog, has advanced through Position: {0: 1} to {0: 2} and holds "Jay $10 burger" and "Sue $11 pizza".]
  • 42. Consistency: Monotonic reads. Within a single client session, a query is guaranteed to see the same or newer values than a previous query. [Diagram: the client sends query("Jay") to a server at Position: 53 and receives {Jay $10 burger} {position: 53}; it then sends query("Jay") with position ≥ 53 to a server at Position: 54 and receives {Jay $15 coffee} {position: 54}; a further query("Jay") with position ≥ 54 sent to the server at Position: 53 fails with ERR: Not up to bound.]
  • 43. Consistency: Bounded by Offset Lag. Bound is: (highest offset seen - acceptable lag). Less strict than monotonic, but within reason. [Diagram: the client receives {Jay $10 burger} {position: 53}, so Bound: 53-5=48; query("Jay") with position ≥ 48 reaches a server at Position: 54 and returns {Jay $15 coffee} {position: 54}, so Bound: 54-5=49; query("Jay") with position ≥ 49 reaches the server at Position: 53 and still succeeds with {Jay $10 burger} {position: 53}.]
  • 44. Consistency: Bounded by Time. Generally, a more intuitive way to think about bounding eventual consistency. 1. Use Kafka AdminClient to translate time to offsets a. OffsetSpec.forTimestamp(System.currentTimeMillis() - 1h) b. Admin.listOffsets(partitions) 2. Use the offsets as a lower bound on query position a. PositionBound.at( Position.emptyPosition().withComponent(topic, partition, offset) );
  • 45. Consistency: IQ v2. Specify position bound: val request = inStore("currentBalance") .withQuery(KeyQuery.withKey("Jay")) .withPartitions(Sets.newHashSet(1)) .withPositionBound(PositionBound.at(53)); Get position of result: val result = kafkaStreams .query(request) .getOnlyPartitionResult() .getPosition();
  • 46. Take-aways. IQ v2 gives developers: ● Customization: Build custom queries for custom stores in user space ● Control: Handle partitions separately and get meaningful information in the response ● Consistency: Implement a broad range of consistency levels such as monotonic reads and bounded eventual consistency
  • 47. Conclusion ● Released in preview mode in Apache Kafka 3.2 ○ The API may still change ● Future work: ○ Add more queries to Apache Kafka (an excellent first KIP/contribution) ○ Cleanup for GA ■ Add options to hit/skip the cache ■ More query coverage for things like prefix scan ○ Extend the consistency model to propagate through the topology
  • 48. Wanna join our team? Confluent is hiring for ksqlDB/Streams engineering positions in UK and Germany https://careers.confluent.io/open-positions
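
The position bounds on slides 42 and 43 can be illustrated with a small, self-contained sketch. It is a toy model and does not use the Kafka Streams API: the class PositionBoundSketch, the Replica type, and the helpers monotonicBound and lagBound are hypothetical names introduced only for illustration, and flooring the lag bound at zero is an assumption the slides do not state. The numbers (positions 53 and 54, acceptable lag 5) are the ones from the slides.

```java
/**
 * Toy model of the position-bound checks on slides 42 and 43.
 * Nothing here is Kafka Streams API: Replica, monotonicBound and
 * lagBound are hypothetical names used only for illustration.
 */
public final class PositionBoundSketch {

    /** Hypothetical stand-in for one instance's copy of a store partition. */
    static final class Replica {
        final String name;
        final long position; // highest input-topic offset reflected in the store

        Replica(String name, long position) {
            this.name = name;
            this.position = position;
        }

        /** A replica may answer only if it has caught up to the requested bound. */
        boolean canServe(long bound) {
            return position >= bound;
        }
    }

    /** Monotonic reads: the next query must see at least the last position read. */
    static long monotonicBound(long highestPositionSeen) {
        return highestPositionSeen;
    }

    /** Bounded by offset lag: highest position seen minus the acceptable lag.
     *  Flooring at zero is an assumption of this sketch. */
    static long lagBound(long highestPositionSeen, long acceptableLag) {
        return Math.max(0L, highestPositionSeen - acceptableLag);
    }

    public static void main(String[] args) {
        Replica ahead = new Replica("ahead", 54);
        Replica behind = new Replica("behind", 53);

        long seen = 54; // the client last read a result at position 54

        // Monotonic reads: the replica at 53 must refuse (not up to bound).
        System.out.println(behind.canServe(monotonicBound(seen))); // false
        // Lag bound 54 - 5 = 49: the replica at 53 is acceptable.
        System.out.println(behind.canServe(lagBound(seen, 5)));    // true
        // The replica at 54 satisfies the monotonic bound.
        System.out.println(ahead.canServe(monotonicBound(seen)));  // true
    }
}
```

In this model the only difference between the two consistency levels is how the client derives the bound it attaches to the next query; the check on the serving side is the same comparison in both cases.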