Synchronous Commands over Apache Kafka (Neil Buesing, Object Partners, Inc) Kafka Summit 2020

Synchronous Kafka
Neil Buesing
Kafka Summit 2020
@nbuesing
Synchronous Commands over Apache Kafka

Producer
Messaging
Consumer
Command/
Response
• Something that should happen
• Tell others what to do
• Presumption of a response
• Ask questions from others
Request (Command) Driven
• Something that has happened
• Tell others what you did
• No presumption of a response
• Others determine what they do
Event Driven

Quizzer
Web
Application
Single Page
Application
Load
Balancer
Application Design
Quizzer
quiz-start
quiz-next
quiz-result
quiz-submit
Builder
questions
difficulty
quiz
quiz-status

API Database
Quizzer - Streams Application
Submit
Next
Result
Quiz
Users
Aggregate
(KTable)
Questions
Difficulty
Global KTable
KTable
KTable
Start
Status

Quizzer - Streams Application
• https://events.confluent.io/meetups

• "Building a Web Application with Kafka as your
Database", March 24th, 2020

• "Interactive Kafka Streams", May 21st, 2020

Load
Balancer
Web
Application
Web Application Design
Http Status
Request (Command) Driven Event Driven

Web
Application
202
Accepted
200
OK
Event Driven Design
This is not always possible

Web
Application
200 OK
201 Created
Blocking / Command Driven Design
When redesign is not an option
• Websockets
• Server Sent Events

Web
Application
202
Accepted
CQRS / Command Driven Design
200
OK

Web
Application
202
Accepted
200
OK
CQRS / Command Driven Design

Command Driven Design
the legacy app / blocking

The Legacy App
Web
Application
200
OK
200
OK
• How do you block in the web
application?
• How do you ensure the correct
web application instance that
publishes to Kafka is able to
consume the response topic.
Legacy Application, expects 200/OK Response
Blocking

Blocking
waiting for the response

Blocking Options
• Techniques to block for a response message in a
JVM Application.
• Countdown Latch
• Deferred Result (Spring MVC)

Blocking - Countdown Latch
• Algorithm
• Publish to Kafka
• Block on Latch
• Release Latch (Consumer from Kafka in separate thread)

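import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// latchByRequestId and responseByRequestId are assumed to be concurrent maps
// shared between the request threads and the Kafka consumer thread.
Object waitForResponse(String requestId) throws InterruptedException {
    CountDownLatch l = new CountDownLatch(1);
    // the response may already have arrived and registered its own latch
    CountDownLatch racer = latchByRequestId.putIfAbsent(requestId, l);
    if (racer != null) l = racer;
    try {
        // block
        boolean success = l.await(timeoutMs, TimeUnit.MILLISECONDS);
        if (success) {
            // remove response from shared map
            return responseByRequestId.remove(requestId);
        } else {
            throw new CorrelationException("Timeout: " + requestId);
        }
    } finally {
        // remove the latch so the map does not grow with every request
        latchByRequestId.remove(requestId);
    }
}

void addResponse(String id, Object response) {
    CountDownLatch l = new CountDownLatch(1);
    CountDownLatch r = latchByRequestId.putIfAbsent(id, l);
    if (r != null) l = r; // usually: the request thread registered first
    // make response available for blocking thread
    responseByRequestId.put(id, response);
    l.countDown(); // unblock
}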
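
The interplay of the two methods above can be tried without Kafka. The following is a minimal sketch, assuming a hypothetical LatchCorrelator class that wraps both maps, uses IllegalStateException in place of the slide's CorrelationException, and lets a plain thread stand in for the Kafka consumer thread. A response that arrives after the timeout would still be stored, so a real implementation would need some form of eviction.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical wrapper around the latch technique; all names are illustrative.
final class LatchCorrelator {
    private final Map<String, CountDownLatch> latchByRequestId = new ConcurrentHashMap<>();
    private final Map<String, Object> responseByRequestId = new ConcurrentHashMap<>();
    private final long timeoutMs;

    LatchCorrelator(long timeoutMs) {
        this.timeoutMs = timeoutMs;
    }

    // Request thread: call after publishing the command to Kafka.
    Object waitForResponse(String requestId) {
        CountDownLatch latch =
                latchByRequestId.computeIfAbsent(requestId, id -> new CountDownLatch(1));
        try {
            if (!latch.await(timeoutMs, TimeUnit.MILLISECONDS)) {
                throw new IllegalStateException("Timeout: " + requestId);
            }
            return responseByRequestId.remove(requestId);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted: " + requestId, e);
        } finally {
            latchByRequestId.remove(requestId); // do not leak latches
        }
    }

    // Consumer thread: call for every record read from the response topic.
    void addResponse(String requestId, Object response) {
        CountDownLatch latch =
                latchByRequestId.computeIfAbsent(requestId, id -> new CountDownLatch(1));
        responseByRequestId.put(requestId, response); // publish before releasing
        latch.countDown();
    }
}

final class LatchDemo {
    public static void main(String[] args) {
        LatchCorrelator correlator = new LatchCorrelator(2000);
        // Stand-in for the consumer thread delivering the response.
        new Thread(() -> correlator.addResponse("req-1", "next question")).start();
        System.out.println(correlator.waitForResponse("req-1"));
    }
}
```

The order of the two calls does not matter: whichever side arrives first registers the latch, and the other side picks it up.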

• Pros
• Standard Java code (since 1.5)
• Can be used anywhere
• Cons
• Blocks request thread
• Limits incoming requests (Servlet Container)
• Increases resource consumption

Blocking - Deferred Result
• Offloads to secondary thread
• Less coding
• Specific to Spring MVC
• CompletableFuture interface supported
• Other Web frameworks have this too

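import static org.springframework.http.HttpStatus.OK;

import com.fasterxml.jackson.databind.JsonNode;
import com.google.common.cache.Cache;
import com.google.common.cache.CacheBuilder;
import org.springframework.http.ResponseEntity;
import org.springframework.web.context.request.async.DeferredResult;

// bounded cache of pending requests, keyed by request id
Cache<String, DeferredResult<ResponseEntity<JsonNode>>> cache =
    CacheBuilder.newBuilder().maximumSize(1000).build();

DeferredResult<ResponseEntity<JsonNode>> waitForResponse(String requestId) {
    DeferredResult<ResponseEntity<JsonNode>> deferredResult = new DeferredResult<>(5000L);
    cache.put(requestId, deferredResult);
    return deferredResult; // no actual waiting here, Spring does that.
}

void addResponse(String requestId, JsonNode resp) {
    DeferredResult<ResponseEntity<JsonNode>> result = cache.getIfPresent(requestId);
    if (result != null) {
        ResponseEntity<JsonNode> content = new ResponseEntity<>(resp, OK);
        result.setResult(content); // unblocks response
        cache.invalidate(requestId);
    }
}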
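
Outside Spring, the same non-blocking correlation can be approximated with the JDK's CompletableFuture. This is a sketch of a swapped-in technique, not the code from the talk: FutureCorrelator and its methods are hypothetical names, the response is a plain String, and orTimeout requires Java 9 or later.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.TimeUnit;

// Hypothetical JDK-only counterpart of the DeferredResult cache.
final class FutureCorrelator {
    private final Map<String, CompletableFuture<String>> pending = new ConcurrentHashMap<>();

    // Request side: register a future and return at once; no thread blocks here.
    CompletableFuture<String> waitForResponse(String requestId, long timeoutMs) {
        CompletableFuture<String> future = new CompletableFuture<String>()
                .orTimeout(timeoutMs, TimeUnit.MILLISECONDS);
        pending.put(requestId, future);
        // Drop the entry once the future completes, by response or by timeout.
        future.whenComplete((value, error) -> pending.remove(requestId, future));
        return future;
    }

    // Consumer side: complete the matching future; unknown request ids are ignored.
    boolean addResponse(String requestId, String response) {
        CompletableFuture<String> future = pending.get(requestId);
        return future != null && future.complete(response);
    }

    int pendingCount() {
        return pending.size();
    }
}
```

A web framework that accepts a CompletableFuture as a handler return value can take the returned future directly, which keeps the request thread free while the response is in flight.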

Consuming
consume from same instance

Solutions
• Single Topic & consumer.assign() 
• Topics for each Web Application 
• Single Topic & consumer.subscribe()

Consuming Response Topic
Web
Application
quiz-next-0
quiz-submit
200
OK
quiz-next-1
quiz-next-2
quiz-next-3
Single Topic

consumer.assign()

Consuming - 1 Topic & Assignment
• every Web Application assigns itself to all partitions

• request-id in Kafka Header

• response topic must have header (automatic in Kafka Streams)

• key free for other uses, doesn't have to be the request-id

• all web applications get all messages

• discard messages where request-id doesn't exist

• don't deserialize key/value before checking header

• Pros

• Can spin up additional web-applications w/out creating
topics

• Not limited to the number of partitions

• Correlation ID (request ID) does not have to be the key.

• No pause with a consumer group rebalancing.

• Cons

• Every Web Application has to consume every message

• Have to check and deserialize request-id header
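
The "check the header first, deserialize later" rule can be sketched with plain JDK types. RawRecord below is a stand-in for a consumed Kafka record whose value is still raw bytes, and RequestIdFilter is a hypothetical helper; with the real consumer the header bytes would come from record.headers().lastHeader("request-id").value().

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.Optional;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Stand-in for a consumed record: headers and value are still raw bytes.
final class RawRecord {
    final Map<String, byte[]> headers;
    final byte[] value;

    RawRecord(Map<String, byte[]> headers, byte[] value) {
        this.headers = headers;
        this.value = value;
    }
}

// Hypothetical filter run by every web application instance.
final class RequestIdFilter<V> {
    static final String REQUEST_ID = "request-id";
    private final Set<String> pending = ConcurrentHashMap.newKeySet();
    private final Function<byte[], V> deserializer;

    RequestIdFilter(Function<byte[], V> deserializer) {
        this.deserializer = deserializer;
    }

    // Called before the command is published.
    void expect(String requestId) {
        pending.add(requestId);
    }

    // Deserializes the value only when this instance issued the request.
    Optional<V> accept(RawRecord record) {
        byte[] raw = record.headers.get(REQUEST_ID);
        if (raw == null) {
            return Optional.empty();
        }
        String requestId = new String(raw, StandardCharsets.UTF_8);
        if (!pending.remove(requestId)) {
            return Optional.empty(); // another instance's response: discard cheaply
        }
        return Optional.of(deserializer.apply(record.value));
    }
}
```

Since every instance reads every partition, the only per-record cost for foreign responses is decoding one small header.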

quiz-next-2
quiz-next-1
quiz-next-a-1
Web
Application
quiz-next-a-0
quiz-submit
200
OK
quiz-next-b-0
quiz-next-c-0
Multiple Topics

Topics for each  
Web Application

Consuming - Topic / Web App
• Every web application gets its own topic

• additional header, resp-topic.

• Streaming application responds to the topic defined in
the header

• TopicNameExtractor (access to headers)
.to((k, v, context) ->
    bytesToString(context.headers().lastHeader("resp-topic").value()));

• Pros

• Only consume messages you produced

• No pause from a consumer group rebalancing

• No additional burden or assumption on use of key.

• Cons

• More work on streaming application to respond to the
proper topic.

• Must create a topic for every web application
instance

• Responses spanned across multiple topics
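
Both sides of the resp-topic header can be sketched with plain JDK types. Everything below is illustrative: ResponseTopics, the "base-instance" naming scheme and the fallback topic are assumptions, and extractTopic shows what a helper like bytesToString might do, with a fallback for records that lack the header.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helpers for the topic-per-web-application approach.
final class ResponseTopics {
    static final String RESP_TOPIC = "resp-topic";
    static final String REQUEST_ID = "request-id";

    // Assumed naming scheme: one response topic per web application instance.
    static String topicFor(String baseTopic, String instanceId) {
        return baseTopic + "-" + instanceId;
    }

    // Web application side: headers to attach to the outgoing command.
    static Map<String, byte[]> requestHeaders(String requestId, String respTopic) {
        Map<String, byte[]> headers = new HashMap<>();
        headers.put(REQUEST_ID, requestId.getBytes(StandardCharsets.UTF_8));
        headers.put(RESP_TOPIC, respTopic.getBytes(StandardCharsets.UTF_8));
        return headers;
    }

    // Streams side: resolve the destination topic, falling back when the header is absent.
    static String extractTopic(Map<String, byte[]> headers, String fallbackTopic) {
        byte[] raw = headers.get(RESP_TOPIC);
        return raw == null ? fallbackTopic : new String(raw, StandardCharsets.UTF_8);
    }
}
```

The fallback matters because a record without the header would otherwise fail inside the extractor with a NullPointerException.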

Web
Application
quiz-next-0
quiz-submit
200
OK
quiz-next-1
quiz-next-2
quiz-next-3
Single Topic 
consumer.subscribe()

Consuming - 1 Topic & Subscribe
• consumer.subscribe(List.of("response-topic"), rebalListener)
• considerations

• is the topic key based on data from the 
incoming request?

• how sophisticated is your Load Balancer?

• Topic Key is known value (quiz_id vs request_id)
• route to all, "Not Me"
• Topic Key is not a known value (request_id)
• round-robin route to web-service and check hash before using generated key.

• Have the LB generate the request-id and perform the hash before routing (LB needs more info)
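
"Check hash before using generated key" can be read as: keep generating request ids until one hashes to a partition this instance currently owns. A minimal sketch under that reading follows; OwnedKeyGenerator is a hypothetical name, the owned-partition set is assumed to be maintained by the rebalance listener, and the partitioner is passed in as a function. Kafka's default partitioner applies murmur2 to the serialized key bytes, so a real implementation has to use the same function as the producer of the response topic.

```java
import java.util.Set;
import java.util.UUID;
import java.util.function.ToIntFunction;

// Hypothetical generator of request ids that land on partitions owned by this instance.
final class OwnedKeyGenerator {
    private static final int MAX_ATTEMPTS = 10_000;

    private final ToIntFunction<String> partitioner; // key -> partition of the response topic
    private final Set<Integer> ownedPartitions;      // assumed to track the current assignment

    OwnedKeyGenerator(ToIntFunction<String> partitioner, Set<Integer> ownedPartitions) {
        this.partitioner = partitioner;
        this.ownedPartitions = ownedPartitions;
    }

    // "Not Me" check for keys that arrive with the request (e.g. quiz_id).
    boolean isMine(String key) {
        return ownedPartitions.contains(partitioner.applyAsInt(key));
    }

    // Draws random ids until one maps to an owned partition.
    String nextRequestId() {
        for (int attempt = 0; attempt < MAX_ATTEMPTS; attempt++) {
            String candidate = UUID.randomUUID().toString();
            if (isMine(candidate)) {
                return candidate;
            }
        }
        throw new IllegalStateException("no owned partition reachable");
    }
}
```

With P partitions of which this instance owns k, the expected number of draws is P / k, so the loop is cheap unless the instance owns very few partitions.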

Load Balancer - NGINX
server {
    location / {
        js_content content;
    }
    location /rh1 {
        rewrite /rh1/(.*) /$1 break;
    }
    location /rh2 {
        rewrite /rh2/(.*) /$1 break;
    }
}

function content(r) {
    function done(res) {
        if (303 !== res.status) { // "Not Me" Check
            for (var h in res.headersOut) {
                r.headersOut[h] = res.headersOut[h];
            }
            r.return(res.status, res.responseBody);
        }
    }
    r.subrequest('/rh1' + r.uri, {
        args: r.variables.args,
        body: r.requestBody,
        method: 'POST'},
        done);
    r.subrequest('/rh2' + r.uri, {
        args: r.variables.args,
        body: r.requestBody,
        method: 'POST'},
        done);
}

• Pros

• Leverages most common Consumer Group Pattern

• No burden on streaming applications

• KIP-429 
Kafka Consumer Incremental Rebalance Protocol

• Only a single consumer processes the message

• Cons

• More coordination depending on topic key

• Responses paused during a rebalancing

• Partitions moving consumers on rebalance

• Key and Partitioning concerns  
minimized when using with CQRS.

Command Driven Design
the interactive application

• No need to block in Web
application.
• No need to route request back
to specific instance
• Requires Fully Accessible State
Web
Application
Command Query Responsibility Segregation
200
OK
202
Accepted
Querying

• No need to block in Web
application.
• Route request back to same
instance
• State / Web Application State
Web
Application
Command Query Responsibility Segregation
202
Accepted
200
OK
Querying

Blocking Querying
getting the response

Leveraging Http Redirects
• 303 See Other

• Client will redirect and convert to GET

• Unfortunately, browsers follow the Location header themselves, so AJAX
solutions require additional work.

• CORS and allowed headers

• Building your own rules requires a specific API contract
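
The "redirect and convert to GET" behaviour can be observed with the JDK alone. The sketch below is an illustration, not code from the talk: it starts a throwaway com.sun.net.httpserver.HttpServer on a random local port, answers a POST to the hypothetical path /submit with 303 See Other, and lets java.net.http.HttpClient follow the Location to the hypothetical path /result/42. Java 11 or later is assumed.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

final class RedirectDemo {
    // POSTs a command, follows the 303 and returns the body of the final request.
    static String runOnce() {
        HttpServer server;
        try {
            server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        String base = "http://127.0.0.1:" + server.getAddress().getPort();
        // Command endpoint: accept the POST and point the client at the query endpoint.
        server.createContext("/submit", exchange -> {
            exchange.getResponseHeaders().add("Location", base + "/result/42");
            exchange.sendResponseHeaders(303, -1); // -1: no response body
            exchange.close();
        });
        // Query endpoint: echo the method and path it was called with.
        server.createContext("/result/", exchange -> {
            byte[] body = (exchange.getRequestMethod() + " " + exchange.getRequestURI().getPath())
                    .getBytes(StandardCharsets.UTF_8);
            exchange.sendResponseHeaders(200, body.length);
            exchange.getResponseBody().write(body);
            exchange.close();
        });
        server.start();
        try {
            HttpClient client = HttpClient.newBuilder()
                    .version(HttpClient.Version.HTTP_1_1)
                    .followRedirects(HttpClient.Redirect.NORMAL)
                    .build();
            HttpRequest post = HttpRequest.newBuilder(URI.create(base + "/submit"))
                    .POST(HttpRequest.BodyPublishers.ofString("answer=a"))
                    .build();
            return client.send(post, HttpResponse.BodyHandlers.ofString()).body();
        } catch (IOException | InterruptedException e) {
            throw new IllegalStateException(e);
        } finally {
            server.stop(0);
        }
    }
}
```

The echoed method shows that the second request arrives as a GET, which is what lets the query side stay a plain read endpoint.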

Consuming
consume from state store

State Stores
• Global State Store

• Doesn't matter which Web Service handles the query

• examples

• microservice (web service doesn't need to know)

• ksqlDB (while it might shard the data, it is queried as a single unit)

• any key=value datastore (Cassandra, Mongo, MemCache)

State Stores
• Embedded Sharded State Store

• Need to route/reroute query to the correct Web Service

• Leverage Load Balancer

• Inter web-service communication (as in Kafka Streams Metadata
API)

• Kafka Streams Examples / Microservices / OrdersService.java /
fetchFromOtherHost

https://github.com/confluentinc/kafka-streams-examples/blob/master/src/main/java/io/confluent/examples/streams/microservices/OrdersService.java
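
Routing a query to the instance that holds the key can be sketched independently of Kafka Streams. In the sketch below ShardedQueryRouter is a hypothetical class, ownerOf stands in for a metadata lookup (in Kafka Streams 2.5+ that would be KafkaStreams#queryMetadataForKey), the local store is a plain Map, and the remote call is a function named after the fetchFromOtherHost method referenced above rather than a real HTTP client.

```java
import java.util.Map;
import java.util.Optional;
import java.util.function.BiFunction;
import java.util.function.Function;

// Hypothetical router for an embedded, sharded state store.
final class ShardedQueryRouter {
    private final String self;                         // "host:port" of this instance
    private final Function<String, String> ownerOf;    // key -> "host:port" holding the key
    private final Map<String, String> localStore;      // stand-in for the local state store
    private final BiFunction<String, String, Optional<String>> fetchFromOtherHost;

    ShardedQueryRouter(String self,
                       Function<String, String> ownerOf,
                       Map<String, String> localStore,
                       BiFunction<String, String, Optional<String>> fetchFromOtherHost) {
        this.self = self;
        this.ownerOf = ownerOf;
        this.localStore = localStore;
        this.fetchFromOtherHost = fetchFromOtherHost;
    }

    // Serves the key locally when this instance owns it, otherwise forwards the query.
    Optional<String> get(String key) {
        String owner = ownerOf.apply(key);
        if (self.equals(owner)) {
            return Optional.ofNullable(localStore.get(key));
        }
        return fetchFromOtherHost.apply(owner, key);
    }
}
```

The alternative listed above, leveraging the load balancer, moves the ownerOf decision in front of the web services so that no inter-service hop is needed.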

Kafka Streams State Stores
• 1 Topic & subscribe() streams consumer within Web Service

• KIPs

• KIP-429 (Kafka 2.4) 
Kafka Consumer Incremental Rebalance Protocol

• Allow consumer.poll() to return data in the middle of rebalance (https://issues.apache.org/jira/browse/KAFKA-8421) (Kafka 2.5)

• KIP-535 (Kafka 2.5) 
Allow state stores to serve stale reads during rebalance

• KIP-562 (Kafka 2.5) 
Allow fetching a key from a single partition rather than iterating over all the stores on an instance

• KIP-441 (Expected, Kafka 2.6) 
Smooth Scaling Out for Kafka Streams

Kafka Streams State Stores
• Things to consider

• Minimize duration of data being stored

• Isolate (minimize) topologies, reduce
session.timeout.ms

• standby replicas

ksqlDB State Store
• leverage client for table queries

• Table must be created by KSQL operation

• latest_by_offset() function works well for this

• want state-stores to be self cleaning

• leverage windowing

• ksqlDB state store queries handle all windowed stores

ksqlDB State Store
create stream QUIZ_NEXT with (KAFKA_TOPIC='quiz_next', VALUE_FORMAT='avro');

create table KSQL_QUIZ_NEXT as
  select request_id,
    latest_by_offset(quiz_id) as quiz_id,
    latest_by_offset(user_id) as user_id,
    latest_by_offset(question_id) as question_id,
    latest_by_offset(statement) as statement,
    latest_by_offset(a) as a,
    latest_by_offset(b) as b,
    latest_by_offset(c) as c,
    latest_by_offset(d) as d,
    latest_by_offset(difficulty) as difficulty
  from QUIZ_NEXT
  window tumbling (size 30 seconds)
  group by request_id;

ksqlDB State Store
create stream QUIZ_RESULT with (KAFKA_TOPIC='quiz_result', VALUE_FORMAT='avro');

create table KSQL_QUIZ_RESULT as
  select request_id,
    latest_by_offset(quiz_id) as quiz_id,
    latest_by_offset(user_id) as user_id,
    latest_by_offset(user_name) as user_name,
    latest_by_offset(questions) as questions,
    latest_by_offset(correct) as correct
  from QUIZ_RESULT
  window tumbling (size 30 seconds)
  group by request_id;

ksqlDB Queries
BatchedQueryResult result =
    client.executeQuery(
        "SELECT * FROM KSQL_QUIZ_NEXT where " +
        "REQUEST_ID='" + requestId + "';");
List<Row> list = result.get();
int last = list.size() - 1;

map.put("quiz_id", list.get(last).getString("QUIZ_ID"));
...

Final Thoughts
• if using consumer groups, design with rebalancing in mind.

• Explore options with your Load Balancer.

• CQRS w/ Kafka Streams as your State Store

• 2.5+

• Minimize Topology Complexity

• Minimize changelog data by leveraging windowing or proper
tombstoning.

Resources
• Book: Designing Event-Driven Systems

• https://www.confluent.io/wp-content/uploads/confluent-designing-event-driven-systems.pdf

• Source Code

• http://github.com/nbuesing/quizzer

Thanks
Kafka Summit
Confluent
Object Partners, Inc.

Questions
@nbuesing
https://buesing.dev/post/ks2020
nbuesing http://www.objectpartners.com
