"Can you determine how a given event came to be? Is it an aggregation, a combination of multiple events with different sources? What are its origins?
As event driven architectures become more sophisticated, with features such as stateful stream processing, data joining, and multi-cluster flows, it becomes harder to trace the path of an event, its origins and touch points. At the same time, it also becomes more important.
Using code examples and usage scenarios, we will dive into the tracing capabilities of OpenTelemetry for Kafka clients, including those using the Consumer/Producer and Kafka Streams libraries, as well as the Connect and ksqlDB platforms. This will culminate in an end-to-end tracing pipeline demonstration.
This talk will cover the following topics:
- Distributed tracing concepts, including context propagation and the OpenTelemetry implementation stack
- OpenTelemetry’s Kafka instrumentation, what is supported out of the box, code examples, edge cases, challenges and solutions
- A demonstration of an end-to-end tracing implementation
In this session, you will gain an understanding of the importance of end-to-end traceability, and several tools & examples for improving observability in your own distributed event driven applications."
A Practical Guide To End-to-End Tracing In Event Driven Architectures
1. A Practical Guide To End-to-End Tracing In Event Driven Architectures
by Roman Kolesnev
2. Roman - UK developer at PIE Labs
PIE Labs, Confluent
• What is Distributed Tracing?
• What is OpenTelemetry?
• Kafka Client instrumentation
• Kafka Streams instrumentation
• ksqlDB instrumentation
• Kafka Connect instrumentation
Who we are, what we’ll talk about…
3. Components of a DT system
• Instrumentation
• Collection
• Visualisation
https://blog.gurock.com/distributed-tracig/
4. What makes up a trace?
https://docs.logz.io/user-guide/distributed-tracing/what-is-tracing
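As a rough illustration of the slide's question: a trace is commonly modelled as a set of spans that share one trace ID, with each span pointing at its parent span. The sketch below uses plain Java only; the class, field, and span names are hypothetical and are not taken from OpenTelemetry or from the slides.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class TraceSketch {

    /** One unit of work. parentSpanId is null for the root span of a trace. */
    static final class Span {
        final String traceId;
        final String spanId;
        final String parentSpanId;
        final String name;

        Span(String traceId, String spanId, String parentSpanId, String name) {
            this.traceId = traceId;
            this.spanId = spanId;
            this.parentSpanId = parentSpanId;
            this.name = name;
        }
    }

    /** Groups spans by trace ID: all spans of one trace share the same ID. */
    static Map<String, List<Span>> groupByTrace(List<Span> spans) {
        Map<String, List<Span>> traces = new LinkedHashMap<>();
        for (Span span : spans) {
            traces.computeIfAbsent(span.traceId, id -> new ArrayList<>()).add(span);
        }
        return traces;
    }

    /** Follows parent links from the given span up to the root; returns names root-first. */
    static List<String> pathFromRoot(List<Span> trace, String spanId) {
        Map<String, Span> byId = new HashMap<>();
        for (Span span : trace) {
            byId.put(span.spanId, span);
        }
        List<String> path = new ArrayList<>();
        Span current = byId.get(spanId);
        while (current != null) {
            path.add(0, current.name);
            current = current.parentSpanId == null ? null : byId.get(current.parentSpanId);
        }
        return path;
    }

    public static void main(String[] args) {
        List<Span> spans = new ArrayList<>();
        spans.add(new Span("t1", "a", null, "orders send"));
        spans.add(new Span("t1", "b", "a", "streams process"));
        spans.add(new Span("t1", "c", "b", "billing receive"));
        spans.add(new Span("t2", "x", null, "unrelated send"));

        // Pick one trace and print the chain of steps that led to span "c".
        List<Span> trace = groupByTrace(spans).get("t1");
        System.out.println(pathFromRoot(trace, "c"));
    }
}
```

Walking the parent links in this way is what lets a tracing back-end show the intermediate steps that led to a given result.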
5. Why Distributed Tracing?
Adding context to the message and process flow.
• Dependency graph
• Record of Event flow
• Log correlation
• Contextual metrics
• Answer questions like:
“This result looks weird. Show me all the intermediate
states, so I can debug where the weirdness started…”
9. Section: What is OpenTelemetry?
10. Overview of OpenTelemetry
1. Standardised, vendor-agnostic
2. High-quality, ubiquitous, and portable
3. Collection of tools, APIs, and SDKs
4. Instrument, generate
5. Collect, and export
6. To an observability back-end (not part of OpenTelemetry itself, e.g. Jaeger)
7. To help you analyze your software’s performance and behavior
11. Support for Kafka in OpenTelemetry
Kafka Clients:
• Javaagent
• Tracing Wrappers
• Tracing Interceptors
Kafka Streams:
• Javaagent
• Supply Kafka Clients with Tracing
OpenTelemetry instrumentation
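Whichever of the options above is used, the trace context has to travel with the record from producer to consumer, and Kafka record headers (string key, byte-array value) are the natural carrier. The sketch below shows the idea in plain Java, assuming the W3C Trace Context `traceparent` format (`00-<trace-id>-<parent-id>-<flags>`), which is what OpenTelemetry's default propagator writes. A `Map` stands in for the record headers, and the class and method names are hypothetical; this is not the OpenTelemetry instrumentation code.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

final class HeaderPropagationSketch {

    static final String TRACEPARENT = "traceparent";

    /** Producer side: write the current trace context into the record headers. */
    static void inject(Map<String, byte[]> headers, String traceId, String spanId, boolean sampled) {
        String value = "00-" + traceId + "-" + spanId + "-" + (sampled ? "01" : "00");
        headers.put(TRACEPARENT, value.getBytes(StandardCharsets.UTF_8));
    }

    /**
     * Consumer side: read the context back.
     * Returns {traceId, parentSpanId, flags}, or null if the header is absent or malformed.
     */
    static String[] extract(Map<String, byte[]> headers) {
        byte[] raw = headers.get(TRACEPARENT);
        if (raw == null) {
            return null;
        }
        String[] parts = new String(raw, StandardCharsets.UTF_8).split("-");
        if (parts.length != 4 || parts[1].length() != 32 || parts[2].length() != 16) {
            return null;
        }
        return new String[] {parts[1], parts[2], parts[3]};
    }

    public static void main(String[] args) {
        // A Map stands in for the headers of one Kafka record.
        Map<String, byte[]> headers = new HashMap<>();
        inject(headers, "4bf92f3577b34da6a3ce929d0e0e4736", "00f067aa0ba902b7", true);

        String[] context = extract(headers);
        System.out.println("trace=" + context[0] + " parent=" + context[1] + " flags=" + context[2]);
    }
}
```

A consumer-side span created with the extracted trace ID and parent span ID joins the producer's trace instead of starting a new one.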
12. Section: Kafka Client instrumentation
37. Kafka Streams support - summary
• Supported as is with Javaagent:
• Stateless
• Stateful - with limitations: single thread context, no caching
• Wrapping State Store approach
• Inlining Span creation into Stateful operations
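The summary names a state store wrapping approach without spelling it out here. One possible reading, sketched below in plain Java, is to store each value together with the trace context of the event that wrote it, so that a later event reading the same key can link its span back to the earlier trace. Everything in this sketch is an assumption made for illustration: `TracingStore` and `Traced` are hypothetical names, a `HashMap` stands in for a Kafka Streams state store, and the trace context is reduced to a string.

```java
import java.util.HashMap;
import java.util.Map;

final class TracingStoreSketch {

    /** A stored value plus the trace context of the event that last wrote it. */
    static final class Traced<V> {
        final V value;
        final String traceparent;

        Traced(V value, String traceparent) {
            this.value = value;
            this.traceparent = traceparent;
        }
    }

    /** Hypothetical wrapper; the HashMap stands in for the real state store. */
    static final class TracingStore<K, V> {
        private final Map<K, Traced<V>> delegate = new HashMap<>();

        /** Saves the value together with the trace context of the current event. */
        void put(K key, V value, String currentTraceparent) {
            delegate.put(key, new Traced<>(value, currentTraceparent));
        }

        /** Returns the value and the context that wrote it, or null if the key is absent. */
        Traced<V> get(K key) {
            return delegate.get(key);
        }
    }

    public static void main(String[] args) {
        TracingStore<String, Integer> counts = new TracingStore<>();

        // First event for this key arrives as part of trace A.
        counts.put("user-1", 1, "trace-A");

        // Second event arrives as part of trace B and reads the earlier state.
        Traced<Integer> previous = counts.get("user-1");
        System.out.println("link current span to " + previous.traceparent);
        counts.put("user-1", previous.value + 1, "trace-B");
    }
}
```

The trade-off in a design like this is that every stored value grows by the size of the serialized context, which matters for large state stores.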
38. Section: ksqlDB instrumentation
39. ksqlDB support - summary
• ksqlDB runs Kafka Streams jobs under the hood
• Using Javaagent with ksqlDB - limitations
• Trace grouping implications