Unlike RPC-style applications, where communication and dependencies between services are explicitly defined, the data flowing between event-driven applications is defined by how they react to and emit events. A trade-off between data-flow explicitness and service autonomy becomes apparent between these two architectural styles. The goal of this presentation is to demonstrate how distributed tracing can help cope with this trade-off, turning message exchange between decoupled, autonomous, event-driven services into explicit data flows. The Zipkin project provides a distributed-tracing infrastructure that enables the collection, processing, and visualization of traces produced by RPC-based as well as messaging-based applications.
This presentation includes demonstrations of how to enable tracing for Kafka Streams applications, Kafka Connectors, and KSQL, showing how implicit service behavior and communication through the event log can become explicit via distributed tracing. But collecting and visualizing traces is just the first step. To create insights from tracing data, models have to be built that enable a better understanding of the system and improve our operational capabilities. Drawing on research-based experiences from Netflix [1] and Facebook [2] on how tracing data has been processed and refined for multiple purposes, this presentation will cover how service-dependency analysis and anomaly-detection models can be built on top of it.
12. @jeqo89 at #kafkasummit
“Complexity is anything [...]
that makes a system
hard to understand
and modify”
John Ousterhout, “A Philosophy of Software Design”
16. @jeqo89 at #kafkasummit
Talk: “Making sense of your event-driven dataflows” (40 min + Q&A)
- Why?
- What is distributed tracing?
- How to instrument Kafka apps?
- Demo
- What’s next?
- Demo time
28. @jeqo89 at #kafkasummit
“The more accurately you
try to measure the position
of a particle,
the less accurately
you can measure its speed”
Heisenberg's uncertainty principle
31. @jeqo89 at #kafkasummit
/** Annotation-based approach **/
ScopedSpan span =
tracer.startScopedSpan("process");
try { // The span is in "scope"
doProcess();
} catch (RuntimeException | Error e) {
span.error(e); // mark as error
throw e;
} finally {
span.finish(); // always finish
}
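The start/catch/finally lifecycle above is the core contract of any span API: record errors, and always finish the span so it gets reported. As a minimal self-contained sketch of that contract (a hypothetical `Span` interface, not Brave's actual types), the pattern can be factored into one helper:

```java
import java.util.ArrayList;
import java.util.List;

public class SpanLifecycle {
    // Hypothetical minimal span contract; Brave's real ScopedSpan carries more.
    interface Span {
        void error(Throwable e); // mark the span as failed
        void finish();           // record the end timestamp and report the span
    }

    // Runs work inside a span: errors are recorded, finish() always happens.
    static void inSpan(Span span, Runnable work) {
        try {
            work.run();
        } catch (RuntimeException | Error e) {
            span.error(e); // mark as error
            throw e;
        } finally {
            span.finish(); // always finish, even when the work throws
        }
    }
}
```

Wrapping `doProcess()` as `inSpan(span, this::doProcess)` reproduces the slide's try/catch/finally block in a single call.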
35. @jeqo89 at #kafkasummit
/** Instrumentation for Kafka Clients **/
Producer<K, V> producer =
new KafkaProducer<>(settings);
Producer<K, V> tracedProducer =
kafkaTracing.producer(producer); // wrap
tracedProducer.send(
new ProducerRecord<>(
"my-topic", key, value
));
37. @jeqo89 at #kafkasummit
/** Instrumentation for Kafka Clients **/
Consumer<K, V> consumer =
new KafkaConsumer<>(settings);
Consumer<K, V> tracedConsumer =
kafkaTracing.consumer(consumer); // wrap
while (running) {
var records = tracedConsumer.poll(1000);
records.forEach(this::process);
}
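What the `kafkaTracing.producer(...)` and `kafkaTracing.consumer(...)` wrappers do is decoration: each send injects the current trace context into record headers, and each poll extracts it so the consumer can continue the same trace. A self-contained sketch of that decorator idea, with hypothetical `Record` and `Sender` types standing in for the real Kafka client API:

```java
import java.util.HashMap;
import java.util.Map;

public class TracedSender {
    // Hypothetical record: just headers here; a real ProducerRecord
    // carries topic, key, and value as well.
    static class Record {
        final Map<String, String> headers = new HashMap<>();
    }

    interface Sender {
        void send(Record record);
    }

    // Decorator: injects the trace context (e.g. a B3 trace id) into the
    // record headers before delegating to the wrapped sender.
    static Sender traced(Sender delegate, String traceId) {
        return record -> {
            record.headers.put("b3", traceId); // propagate context in headers
            delegate.send(record);             // then send as usual
        };
    }
}
```

The consumer side mirrors this: the wrapper reads the `b3` header off each polled record and resumes the trace, which is why records must flow through the traced wrapper rather than the raw client.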
39. @jeqo89 at #kafkasummit
/** Instrumentation for Kafka Streams **/
var b = new StreamsBuilder();
b.stream("input-topic")
  .map(this::parseRecord)
  .join(table, this::tableJoiner)
  .transformValues(this::transform)
  .to("output-topic");
KafkaStreams kafkaStreams =
new KafkaStreams(b.build(), config);
kafkaStreams.start();
40. @jeqo89 at #kafkasummit
/** Instrumentation for Kafka Streams **/
var b = new StreamsBuilder();
b.stream("input-topic")
  .map(this::parseRecord)
  .join(table, this::tableJoiner)
  .transformValues(this::transform)
  .to("output-topic");
KafkaStreams kafkaStreams = // wrap
ksTracing.kafkaStreams(b.build(), config);
kafkaStreams.start();
41. @jeqo89 at #kafkasummit
/** Instrumentation for Kafka Streams **/
var b = new StreamsBuilder();
b.stream("input-topic")
  .transform(ksTracing.map("parse",
      this::parseRecord))
  .join(table, this::tableJoiner)
  .transformValues(ksTracing.transformValues(
      "transform",
      this::transform))
  .to("output-topic");
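`ksTracing.map(...)` and `ksTracing.transformValues(...)` follow one idea: wrap the user's function so that a named span surrounds each invocation. A minimal self-contained sketch of that wrapping using plain `java.util.function.Function` (the `spanLog` stands in for span reporting; none of these names are the real KafkaStreamsTracing API):

```java
import java.util.List;
import java.util.function.Function;

public class TracedStep {
    // Wraps a processing step so each call is bracketed by a named "span".
    // Here the span is simulated by appending start/finish events to a log.
    static <V, R> Function<V, R> traced(String name, Function<V, R> step,
                                        List<String> spanLog) {
        return value -> {
            spanLog.add("start:" + name);
            try {
                return step.apply(value);
            } finally {
                spanLog.add("finish:" + name); // finish even if the step throws
            }
        };
    }
}
```

This is why the named variants are useful: each step in the topology shows up as its own span ("parse", "transform") instead of one opaque unit.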
45. @jeqo89 at #kafkasummit
/** Transports for Zipkin **/
var sender =
URLConnectionSender.create(
"http://localhost:9411/api/v2/spans"
);
var reporter = AsyncReporter.create(sender);
46. @jeqo89 at #kafkasummit
/** Transports for Zipkin **/
var sender =
KafkaSender.newBuilder()
.bootstrapServers(
"localhost:9092")
.build();
var reporter = AsyncReporter.create(sender);
47. @jeqo89 at #kafkasummit
/** Transports for Zipkin **/
var sender =
BringYourOwnSender.newBuilder()
.build();
var reporter = AsyncReporter.create(sender);
56. @jeqo89 at #kafkasummit
[Diagram: services report spans over a transport into storage (data-at-rest); dependencies are computed in batch]
Distributed Tracing IS A Stream Processing Problem
57. @jeqo89 at #kafkasummit
Distributed Tracing IS A Stream Processing Problem
[Diagram: a Span Consumer reads the spans-collected topic; Trace Aggregation emits to traces-completed; results feed a Span Store and a Dependencies Store, with room for custom processors]
github.com/jeqo/zipkin-storage-kafka
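The trace-aggregation step above is, conceptually, a windowed group-by on trace id: spans from `spans-collected` are buffered per trace, and a trace is emitted to `traces-completed` once no new span has arrived for some inactivity gap. In the real topology that is a Kafka Streams session window; the sketch below simulates the same semantics with a map and explicit timestamps, and all names are illustrative:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class TraceAggregator {
    static class Span {
        final String traceId; final String name;
        Span(String traceId, String name) { this.traceId = traceId; this.name = name; }
    }

    final long inactivityGapMs;
    final Map<String, List<Span>> buffered = new HashMap<>();
    final Map<String, Long> lastSeenMs = new HashMap<>();

    TraceAggregator(long inactivityGapMs) { this.inactivityGapMs = inactivityGapMs; }

    // Buffer an incoming span under its trace id and refresh the window.
    void onSpan(Span span, long nowMs) {
        buffered.computeIfAbsent(span.traceId, id -> new ArrayList<>()).add(span);
        lastSeenMs.put(span.traceId, nowMs);
    }

    // Emit (and forget) every trace whose window has been idle long enough.
    Map<String, List<Span>> completedTraces(long nowMs) {
        Map<String, List<Span>> done = new HashMap<>();
        for (var e : new HashMap<>(lastSeenMs).entrySet()) {
            if (nowMs - e.getValue() >= inactivityGapMs) {
                done.put(e.getKey(), buffered.remove(e.getKey()));
                lastSeenMs.remove(e.getKey());
            }
        }
        return done;
    }
}
```

Completed traces are what the downstream stores consume: the span store indexes them for lookup, while the dependency store only needs the service-to-service edges each trace reveals.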