Mind the App
How to monitor your Kafka Streams applications
Bruno Cadonna, Kafka Summit 2021 Europe
Copyright 2021, Confluent, Inc. All rights reserved. This document may not be reproduced in any manner without the express written permission of Confluent, Inc.
About me
Bruno Cadonna
Contributor to Apache Kafka & Software Developer at Confluent
Content
• Basics about metrics in Kafka
• Metrics in Kafka Streams
• KIP-444: Improving Kafka Streams’ metrics
• KIP-471 and KIP-607: RocksDB metrics
• KIP-613: End-to-end latency metrics
• Takeaways
Basics about metrics in Kafka
A metric in Kafka
• consists of a name, a value, and a configuration
• a metric name is composed of
• name
• group
• tags
• description
• a metric value is a Java Object, e.g. an integral number, a decimal number, a string, …
• the metric config contains the recording level, which can be INFO, DEBUG, or TRACE
• example:
• name: process-rate
• group: stream-thread-metrics
• tags: thread-id = myapp-2d0b492c-87f1-11eb-8dcd-0242ac130003-StreamThread-1
• description: The average number of processed records per second
• value: 123456.78
• recording level: INFO
A sensor in Kafka
• maintains a sequence of recorded values
• maintains a set of metrics
• each metric specifies an aggregation for the recorded values
• each time a value is recorded all metrics in a sensor are updated
• example:
• process-rate and process-total are recorded by the same sensor
• process-rate computes the number of processed records over time
• process-total computes the total number of processed records
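The relationship between a sensor and its metrics can be sketched in plain Java. This is a simplified model under stated assumptions, not Kafka's `Sensor` class: `SimpleSensor`, `Stat`, `Total`, and `Rate` are hypothetical names, and the rate is computed over the whole elapsed time, whereas Kafka's own rate is computed over sampled windows.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class SimpleSensor {
    /** An aggregation over the values recorded by a sensor (hypothetical interface). */
    public interface Stat {
        void record(double value, long nowMs);
        double measure(long nowMs);
    }

    /** Sums up all recorded values (compare process-total). */
    public static class Total implements Stat {
        private double total;
        public void record(double value, long nowMs) { total += value; }
        public double measure(long nowMs) { return total; }
    }

    /** Recorded values per second since startMs (compare process-rate). */
    public static class Rate implements Stat {
        private final long startMs;
        private double total;
        public Rate(long startMs) { this.startMs = startMs; }
        public void record(double value, long nowMs) { total += value; }
        public double measure(long nowMs) {
            double elapsedSeconds = (nowMs - startMs) / 1000.0;
            return elapsedSeconds <= 0 ? 0.0 : total / elapsedSeconds;
        }
    }

    private final Map<String, Stat> metrics = new LinkedHashMap<>();

    public void add(String metricName, Stat stat) { metrics.put(metricName, stat); }

    /** Recording one value updates every metric registered on this sensor. */
    public void record(double value, long nowMs) {
        for (Stat stat : metrics.values()) {
            stat.record(value, nowMs);
        }
    }

    public double value(String metricName, long nowMs) {
        return metrics.get(metricName).measure(nowMs);
    }

    public static void main(String[] args) {
        SimpleSensor sensor = new SimpleSensor();
        sensor.add("process-total", new Total());
        sensor.add("process-rate", new Rate(0L));
        sensor.record(1, 500L);   // one processed record
        sensor.record(1, 1500L);  // another processed record
        System.out.println(sensor.value("process-total", 2000L)); // 2.0
        System.out.println(sensor.value("process-rate", 2000L));  // 1.0
    }
}
```

Both metrics see every recorded value, which is why process-rate and process-total can share one sensor.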
Metrics in Kafka Streams
Anatomy of a Kafka Streams application
[Figure: a Kafka Streams client contains stream threads (stream thread 1, stream thread 2); the stream threads run tasks (task 1 to task 5); a task contains processor nodes, state stores, and a cache]
How does Kafka Streams report metrics?
Kafka Streams client
• metrics(): returns a read-only map of metrics
• JMX reporter: implements MetricsReporter, registered by default, no need to set
• my reporter: implements MetricsReporter, registered with the Kafka Streams config metric.reporters
interface MetricsReporter {
    // called when a metric is added or updated
    void metricChange(KafkaMetric metric);
    // called when a metric is removed
    void metricRemoval(KafkaMetric metric);
    // init() and close() omitted
}
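As a minimal sketch, the two metrics-related configs could be set as follows. The class name `StreamsMetricsConfig`, the reporter class `com.example.MyReporter`, and the broker address are assumptions for illustration; `metric.reporters` and `metrics.recording.level` are the config names.

```java
import java.util.Properties;

public class StreamsMetricsConfig {
    /** Builds the client properties; pass them to the KafkaStreams constructor. */
    public static Properties build() {
        Properties props = new Properties();
        props.put("application.id", "myapp");
        props.put("bootstrap.servers", "localhost:9092"); // assumed broker address
        // Comma-separated list of classes that implement MetricsReporter.
        props.put("metric.reporters", "com.example.MyReporter"); // hypothetical class
        // INFO is the default; DEBUG additionally records the DEBUG-level metrics.
        props.put("metrics.recording.level", "DEBUG");
        return props;
    }

    public static void main(String[] args) {
        build().forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```

The JMX reporter needs no entry here because it is registered by default.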
jconsole
[Screenshot of jconsole with annotations marking the metric group, the thread-id tag, the metric name, the metric description, and the metric value]
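What jconsole shows can also be read programmatically. The sketch below assumes the MBean naming used by the JMX reporter for Kafka Streams (domain `kafka.streams`, the metric group as `type`, tags as further key properties, the metric name as attribute); the class name `StreamsJmxName` is hypothetical. The lookup only finds something inside a JVM that runs a Kafka Streams client.

```java
import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;

public class StreamsJmxName {
    /** Builds the MBean name of the stream-thread-metrics group for one thread. */
    public static ObjectName threadMetrics(String threadId) {
        try {
            return new ObjectName(
                "kafka.streams:type=stream-thread-metrics,thread-id=" + threadId);
        } catch (MalformedObjectNameException e) {
            throw new IllegalArgumentException("invalid thread id: " + threadId, e);
        }
    }

    public static void main(String[] args) throws Exception {
        ObjectName name =
            threadMetrics("myapp-2d0b492c-87f1-11eb-8dcd-0242ac130003-StreamThread-1");
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        if (server.isRegistered(name)) {
            // The metric name is the attribute of the MBean.
            System.out.println(server.getAttribute(name, "process-rate"));
        } else {
            System.out.println("not registered: " + name);
        }
    }
}
```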
Datadog
[Screenshot of Datadog with annotations marking the metric group, the tags, and the metric name]
What metrics does Kafka Streams expose?
• Kafka Streams client level:
• name: state
• group: stream-metrics
• tags: client-id = myapp-2d0b492c-87f1-11eb-8dcd-0242ac130003
• stream thread level:
• name: process-rate
• group: stream-thread-metrics
• tags: thread-id = myapp-2d0b492c-87f1-11eb-8dcd-0242ac130003-StreamThread-1
• task level:
• name: process-latency-avg
• group: stream-task-metrics
• tags: thread-id = myapp-2d0b492c-87f1-11eb-8dcd-0242ac130003-StreamThread-1,
task-id = 0_1
…some more metrics
• processor node level
• name: process-rate
• group: stream-processor-node-metrics
• tags: thread-id = myapp-2d0b492c-87f1-11eb-8dcd-0242ac130003-StreamThread-1,
task-id = 0_1,
processor-node-id = KSTREAM-SINK-0000000004
• state store level
• name: put-rate
• group: stream-state-metrics
• tags: thread-id = myapp-2d0b492c-87f1-11eb-8dcd-0242ac130003-StreamThread-1,
task-id = 0_1,
rocksdb-state-id = count-items
• cache level
• name: hit-ratio-avg
• group: stream-record-cache-metrics
• tags: thread-id = myapp-2d0b492c-87f1-11eb-8dcd-0242ac130003-StreamThread-1,
task-id = 0_1,
record-cache-id = 0_1-count-items
… and finally
• all metrics of embedded consumers, producers, and admin client
• name: last-rebalance-seconds-ago
• group: consumer-coordinator-metrics
• tags: client-id = myapp-2d0b492c-87f1-11eb-8dcd-0242ac130003-StreamThread-1-consumer
KIP-444: Improving Kafka Streams’ metrics
New metrics
• introduces client-level metrics
• version,
• commit-id,
• application-id,
• topology-description,
• state,
• alive-stream-threads
• introduces new task level metrics
• active-process-ratio,
• standby-process-ratio (not yet implemented),
• dropped-records
Refactorings
• renames some metrics and some metric tags
• records client-level and stream-thread-level metrics at INFO and most metrics on lower levels at DEBUG
• removes all parent metrics except one and lets users do the roll-up themselves
• removes overlapping metrics
• dropped-records (task-level, INFO) replaces
• late-records-drop (processor node, INFO),
• skipped-records (processor node, INFO),
• expired-window-record-drop (state store, DEBUG)
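A roll-up that users now do themselves could look like the following sketch. It assumes the metric values have already been read (for example from the metrics() map or from a monitoring system) into simple samples; `MetricRollUp`, `Sample`, and `sumByTag` are hypothetical names. Summing is only meaningful for totals such as dropped records, not for averages or ratios.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class MetricRollUp {
    /** One metric sample: its tags and its current numeric value. */
    public static class Sample {
        final Map<String, String> tags;
        final double value;

        public Sample(Map<String, String> tags, double value) {
            this.tags = tags;
            this.value = value;
        }
    }

    /** Sums the values of all samples that share the same value for the given tag. */
    public static Map<String, Double> sumByTag(List<Sample> samples, String tagKey) {
        Map<String, Double> rolledUp = new TreeMap<>();
        for (Sample sample : samples) {
            String tagValue = sample.tags.get(tagKey);
            if (tagValue != null) {
                rolledUp.merge(tagValue, sample.value, Double::sum);
            }
        }
        return rolledUp;
    }

    /** Rolls three task-level totals up to the stream thread level. */
    public static Map<String, Double> example() {
        List<Sample> taskLevel = List.of(
            new Sample(Map.of("thread-id", "StreamThread-1", "task-id", "0_0"), 3.0),
            new Sample(Map.of("thread-id", "StreamThread-1", "task-id", "0_1"), 2.0),
            new Sample(Map.of("thread-id", "StreamThread-2", "task-id", "0_2"), 4.0));
        return sumByTag(taskLevel, "thread-id");
    }

    public static void main(String[] args) {
        System.out.println(example()); // {StreamThread-1=5.0, StreamThread-2=4.0}
    }
}
```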
Improving custom metrics
• Sensor addLatencyRateTotalSensor(final String scopeName,
final String entityName,
final String operationName,
final Sensor.RecordingLevel recordingLevel,
final String... tags);
• Sensor addRateTotalSensor(final String scopeName,
final String entityName,
final String operationName,
final Sensor.RecordingLevel recordingLevel,
final String... tags);
• only available where you have access to the ProcessorContext
• you can add additional metrics to the sensor with Sensor#add()
Example of custom metrics
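import java.util.Locale;

import org.apache.kafka.common.metrics.Sensor;
import org.apache.kafka.common.metrics.Sensor.RecordingLevel;
import org.apache.kafka.streams.processor.api.Processor;
import org.apache.kafka.streams.processor.api.ProcessorContext;
import org.apache.kafka.streams.processor.api.Record;
import org.apache.kafka.streams.state.KeyValueStore;

// the class name is illustrative
public class WordCountProcessor implements Processor<String, String, String, String> {
    private ProcessorContext<String, String> context;
    private KeyValueStore<String, Integer> kvStore;
    private Sensor countEmptyRecords;

    @Override
    public void init(final ProcessorContext<String, String> context) {
        this.context = context;
        // adds a sensor with a rate and a total metric for empty messages
        countEmptyRecords = context.metrics().addRateTotalSensor(
            "word-counter",
            "word-counter" + context.taskId(),
            "count-empty-messages",
            RecordingLevel.INFO
        );
        kvStore = context.getStateStore("Counts");
    }

    @Override
    public void process(final Record<String, String> record) {
        // check the value itself: splitting "" yields one empty word, not zero words
        if (record.value().trim().isEmpty()) {
            countEmptyRecords.record();
            return;
        }
        final String[] words = record.value().toLowerCase(Locale.getDefault()).split(" ");
        for (final String word : words) {
            final Integer oldValue = kvStore.get(word);
            if (oldValue == null) {
                kvStore.put(word, 1);
            } else {
                kvStore.put(word, oldValue + 1);
            }
        }
    }
}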
KIP-471 and KIP-607: RocksDB metrics
RocksDB metrics
• RocksDB is the default state store in Kafka Streams
• statistics-based metrics (KIP-471, AK 2.4): cumulative measurements over time collected by RocksDB
• name: bytes-written-rate
• group: stream-state-metrics
• tags: thread-id = myapp-2d0b492c-87f1-11eb-8dcd-0242ac130003-StreamThread-1,
task-id = 0_1,
rocksdb-state-id = count-items
• properties-based metrics (KIP-607, AK 2.7): properties exposed by RocksDB providing current measurements
• name: block-cache-usage
• group: stream-state-metrics
• tags: thread-id = myapp-2d0b492c-87f1-11eb-8dcd-0242ac130003-StreamThread-1,
task-id = 0_1,
rocksdb-state-id = count-items
Recording RocksDB metrics
• statistics-based metrics
• collecting statistics-based metrics may have an impact on performance
• recording metrics during state store operations might be costly
• instead each state store has a metric recorder
• all metric recorders are triggered once per minute by one dedicated thread that is started at Kafka Streams client start-up
• properties-based metrics
• all properties-based metrics are gauges
• a gauge executes some given code each time the metric is queried
• properties-based metrics query RocksDB properties
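The gauge behavior described above can be illustrated with a small stand-alone sketch. It does not use Kafka's or RocksDB's classes: `GaugeSketch` and `Gauge` are hypothetical names, and an `AtomicLong` stands in for a RocksDB property such as the block cache usage.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Supplier;

public class GaugeSketch {
    /** A gauge stores no recorded values; it runs the given code on every query. */
    public static class Gauge<T> {
        private final Supplier<T> valueProvider;

        public Gauge(Supplier<T> valueProvider) {
            this.valueProvider = valueProvider;
        }

        public T value() {
            return valueProvider.get();
        }
    }

    public static void main(String[] args) {
        // Stand-in for a RocksDB property that changes over time.
        AtomicLong blockCacheUsageBytes = new AtomicLong(1024);
        Gauge<Long> gauge = new Gauge<>(blockCacheUsageBytes::get);
        System.out.println(gauge.value()); // 1024
        blockCacheUsageBytes.set(4096);
        System.out.println(gauge.value()); // 4096
    }
}
```

Because the value is computed on demand, a gauge costs nothing while nobody queries the metric, but every query pays for the lookup.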
When to look at RocksDB metrics?
• high memory usage
• size-all-mem-tables
• block-cache-usage
• block-cache-pinned-usage
• estimate-table-readers-mem
• high disk usage
• total-sst-files-size
• high disk I/O and write stalls
• memtable-bytes-flushed-[rate | total]
• bytes-[read | written]-compaction-rate
• write-stall-duration-[avg | total]
• memtable-hit-ratio
• block-cache-[data | index | filter]-hit-ratio
• too many open files
• number-open-files
for more details, check out the blog post:
How to Tune RocksDB for Your Kafka Streams Application
https://www.confluent.io/blog/how-to-tune-rocksdb-kafka-streams-state-stores-performance/
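legend: size-all-mem-tables, block-cache-usage, block-cache-pinned-usage, estimate-table-readers-mem, and total-sst-files-size are properties-based metrics; the other metrics listed are statistics-based metrics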
KIP-613: End-to-end latency metrics
End-to-end-latency metrics
topology: source node → filter → aggregation → sink node
each latency spans from the event time of a record to the processing time at the node or state store
• consumption latency (INFO), measured at the source node
• name: record-e2e-latency-[min | max | avg]
• group: stream-processor-node-metrics
• tags: thread-id = myapp-…,
task-id = 0_1,
processor-node-id = KSTREAM-SOURCE-0000000004
• begin-to-state latency (TRACE), measured at the state store
• name: record-e2e-latency-[min | max | avg]
• group: stream-state-metrics
• tags: thread-id = myapp-…,
task-id = 0_1,
rocksdb-state-id = count-items
• full end-to-end latency (INFO), measured at the sink node
• name: record-e2e-latency-[min | max | avg]
• group: stream-processor-node-metrics
• tags: thread-id = myapp-…,
task-id = 0_1,
processor-node-id = KSTREAM-SINK-0000000004
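The latency behind these metrics is the difference between the processing time at a node and the event time of the record. A minimal sketch of the min, max, and avg aggregation, with the hypothetical class name `E2eLatencySketch` and timestamps in milliseconds, not Kafka's implementation:

```java
public class E2eLatencySketch {
    private long min = Long.MAX_VALUE;
    private long max = Long.MIN_VALUE;
    private long sum;
    private long count;

    /** Records the latency of one record: processing time minus event time. */
    public void record(long eventTimeMs, long processingTimeMs) {
        long latencyMs = processingTimeMs - eventTimeMs;
        min = Math.min(min, latencyMs);
        max = Math.max(max, latencyMs);
        sum += latencyMs;
        count++;
    }

    public long min() { return min; }

    public long max() { return max; }

    public double avg() { return count == 0 ? Double.NaN : (double) sum / count; }

    public static void main(String[] args) {
        E2eLatencySketch sourceNode = new E2eLatencySketch();
        sourceNode.record(1_000L, 1_040L); // latency 40 ms
        sourceNode.record(2_000L, 2_100L); // latency 100 ms
        System.out.println(sourceNode.min()); // 40
        System.out.println(sourceNode.max()); // 100
        System.out.println(sourceNode.avg()); // 70.0
    }
}
```

The same computation at the sink node covers the whole path through the topology, which is why that value is the full end-to-end latency.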
End-to-end-latency metrics (advanced)
[Figure: two tasks, task 1 and task 2, each with the topology source node → filter → aggregation → sink node; arrows from event time to processing time mark the measured latencies, and one span is labeled "processing delay of task 2"]
Takeaways
• Kafka Streams exposes various metrics on different levels
• metrics were consolidated recently-ish
• RocksDB metrics let you gain insight into state stores
• Kafka Streams allows monitoring record end-to-end latencies
Thank you!
bruno@confluent.io
cnfl.io/slack
cnfl.io/blog
cnfl.io/meetups
cnfl.io/forum

More Related Content

What's hot

Apache Kafka Architecture & Fundamentals Explained
Apache Kafka Architecture & Fundamentals ExplainedApache Kafka Architecture & Fundamentals Explained
Apache Kafka Architecture & Fundamentals Explainedconfluent
 
ksqlDB: A Stream-Relational Database System
ksqlDB: A Stream-Relational Database SystemksqlDB: A Stream-Relational Database System
ksqlDB: A Stream-Relational Database Systemconfluent
 
Deep Learning at Extreme Scale (in the Cloud) 
with the Apache Kafka Open Sou...
Deep Learning at Extreme Scale (in the Cloud) 
with the Apache Kafka Open Sou...Deep Learning at Extreme Scale (in the Cloud) 
with the Apache Kafka Open Sou...
Deep Learning at Extreme Scale (in the Cloud) 
with the Apache Kafka Open Sou...Kai Wähner
 
Getting up to speed with MirrorMaker 2 | Mickael Maison, IBM and Ryanne Dolan...
Getting up to speed with MirrorMaker 2 | Mickael Maison, IBM and Ryanne Dolan...Getting up to speed with MirrorMaker 2 | Mickael Maison, IBM and Ryanne Dolan...
Getting up to speed with MirrorMaker 2 | Mickael Maison, IBM and Ryanne Dolan...HostedbyConfluent
 
How Apache Kafka® Works
How Apache Kafka® WorksHow Apache Kafka® Works
How Apache Kafka® Worksconfluent
 
Introduction to Kafka connect
Introduction to Kafka connectIntroduction to Kafka connect
Introduction to Kafka connectKnoldus Inc.
 
Disaster Recovery with MirrorMaker 2.0 (Ryanne Dolan, Cloudera) Kafka Summit ...
Disaster Recovery with MirrorMaker 2.0 (Ryanne Dolan, Cloudera) Kafka Summit ...Disaster Recovery with MirrorMaker 2.0 (Ryanne Dolan, Cloudera) Kafka Summit ...
Disaster Recovery with MirrorMaker 2.0 (Ryanne Dolan, Cloudera) Kafka Summit ...confluent
 
Performance Tuning RocksDB for Kafka Streams’ State Stores
Performance Tuning RocksDB for Kafka Streams’ State StoresPerformance Tuning RocksDB for Kafka Streams’ State Stores
Performance Tuning RocksDB for Kafka Streams’ State Storesconfluent
 
Dynamically Scaling Data Streams across Multiple Kafka Clusters with Zero Fli...
Dynamically Scaling Data Streams across Multiple Kafka Clusters with Zero Fli...Dynamically Scaling Data Streams across Multiple Kafka Clusters with Zero Fli...
Dynamically Scaling Data Streams across Multiple Kafka Clusters with Zero Fli...Flink Forward
 
MirrorMaker: Beyond the Basics with Mickael Maison
MirrorMaker: Beyond the Basics with Mickael MaisonMirrorMaker: Beyond the Basics with Mickael Maison
MirrorMaker: Beyond the Basics with Mickael MaisonHostedbyConfluent
 
Capacity Planning Your Kafka Cluster | Jason Bell, Digitalis
Capacity Planning Your Kafka Cluster | Jason Bell, DigitalisCapacity Planning Your Kafka Cluster | Jason Bell, Digitalis
Capacity Planning Your Kafka Cluster | Jason Bell, DigitalisHostedbyConfluent
 
So You Want to Write a Connector?
So You Want to Write a Connector? So You Want to Write a Connector?
So You Want to Write a Connector? confluent
 
Apache kafka 모니터링을 위한 Metrics 이해 및 최적화 방안
Apache kafka 모니터링을 위한 Metrics 이해 및 최적화 방안Apache kafka 모니터링을 위한 Metrics 이해 및 최적화 방안
Apache kafka 모니터링을 위한 Metrics 이해 및 최적화 방안SANG WON PARK
 
Introducing the Apache Flink Kubernetes Operator
Introducing the Apache Flink Kubernetes OperatorIntroducing the Apache Flink Kubernetes Operator
Introducing the Apache Flink Kubernetes OperatorFlink Forward
 
Performance Tuning RocksDB for Kafka Streams' State Stores (Dhruba Borthakur,...
Performance Tuning RocksDB for Kafka Streams' State Stores (Dhruba Borthakur,...Performance Tuning RocksDB for Kafka Streams' State Stores (Dhruba Borthakur,...
Performance Tuning RocksDB for Kafka Streams' State Stores (Dhruba Borthakur,...confluent
 
Exactly-once Stream Processing with Kafka Streams
Exactly-once Stream Processing with Kafka StreamsExactly-once Stream Processing with Kafka Streams
Exactly-once Stream Processing with Kafka StreamsGuozhang Wang
 
Kafka 101 and Developer Best Practices
Kafka 101 and Developer Best PracticesKafka 101 and Developer Best Practices
Kafka 101 and Developer Best Practicesconfluent
 
A visual introduction to Apache Kafka
A visual introduction to Apache KafkaA visual introduction to Apache Kafka
A visual introduction to Apache KafkaPaul Brebner
 

What's hot (20)

Apache Kafka Architecture & Fundamentals Explained
Apache Kafka Architecture & Fundamentals ExplainedApache Kafka Architecture & Fundamentals Explained
Apache Kafka Architecture & Fundamentals Explained
 
ksqlDB: A Stream-Relational Database System
ksqlDB: A Stream-Relational Database SystemksqlDB: A Stream-Relational Database System
ksqlDB: A Stream-Relational Database System
 
Deep Learning at Extreme Scale (in the Cloud) 
with the Apache Kafka Open Sou...
Deep Learning at Extreme Scale (in the Cloud) 
with the Apache Kafka Open Sou...Deep Learning at Extreme Scale (in the Cloud) 
with the Apache Kafka Open Sou...
Mind the App: How to Monitor Your Kafka Streams Applications | Bruno Cadonna, Confluent

  • 1. Mind the App: How to monitor your Kafka Streams applications. Bruno Cadonna, Kafka Summit 2021 Europe
  • 2. About me: Bruno Cadonna, Contributor to Apache Kafka & Software Developer at Confluent. Copyright 2021, Confluent, Inc. All rights reserved. This document may not be reproduced in any manner without the express written permission of Confluent, Inc.
  • 3. Content
    • Basics about metrics in Kafka
    • Metrics in Kafka Streams
    • KIP-444: Improving Kafka Streams’ metrics
    • KIP-471 and KIP-607: RocksDB metrics
    • KIP-613: End-to-end latency metrics
    • Takeaways
  • 9. A metric in Kafka
    • consists of a name, a value, and a configuration
    • a metric name is composed of
      • name
      • group
      • tags
      • description
    • a metric value inherits from the Object class, e.g. integral number, decimal number, string, …
    • the metric config contains the recording level, which can be INFO, DEBUG, or TRACE
    • example:
      • name: process-rate
      • group: stream-thread-metrics
      • tags: thread-id = myapp-2d0b492c-87f1-11eb-8dcd-0242ac130003-StreamThread-1
      • description: The average number of processed records per second
      • value: 123456.78
      • recording level: INFO
  • 14. A sensor in Kafka
      • maintains a sequence of recorded values
      • maintains a set of metrics
      • each metric specifies an aggregation for the recorded values
      • each time a value is recorded, all metrics in a sensor are updated
      • example:
          • process-rate and process-total are recorded by the same sensor
          • process-rate computes the average number of processed records per second
          • process-total computes the total number of processed records
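The relationship between one sensor and several metrics can be sketched in plain Java. All class names here (`ToyStat`, `ToyTotal`, `ToyRate`, `ToySensor`) are hypothetical and are not Kafka's classes; the rate is simplified to "sum divided by seconds since the first recording", whereas Kafka's own rate uses sampled time windows.

```java
// Toy model of a sensor and its metrics; NOT Kafka's Sensor or stat classes.
interface ToyStat {
    void record(double value, long nowMs);
    double measure(long nowMs);
}

// Aggregation 1: running total of all recorded values.
final class ToyTotal implements ToyStat {
    private double sum;
    public void record(double value, long nowMs) { sum += value; }
    public double measure(long nowMs) { return sum; }
}

// Aggregation 2: recorded values per second since the first recording (simplified).
final class ToyRate implements ToyStat {
    private double sum;
    private long startMs = -1;
    public void record(double value, long nowMs) {
        if (startMs < 0) {
            startMs = nowMs;
        }
        sum += value;
    }
    public double measure(long nowMs) {
        if (startMs < 0 || nowMs <= startMs) {
            return 0.0;
        }
        return sum / ((nowMs - startMs) / 1000.0);
    }
}

final class ToySensor {
    private final java.util.Map<String, ToyStat> metrics = new java.util.LinkedHashMap<>();

    void add(String metricName, ToyStat stat) { metrics.put(metricName, stat); }

    // One recording updates every metric registered on this sensor.
    void record(double value, long nowMs) {
        for (ToyStat stat : metrics.values()) {
            stat.record(value, nowMs);
        }
    }

    double value(String metricName, long nowMs) { return metrics.get(metricName).measure(nowMs); }
}
```

Registering a `ToyRate` as `process-rate` and a `ToyTotal` as `process-total` on the same `ToySensor` mirrors the example on the slide: a single `record(...)` call per processed record feeds both metrics.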
  • 15. Metrics in Kafka Streams
  • 18. Anatomy of a Kafka Streams application [Diagram: a Kafka Streams client runs stream thread 1 and stream thread 2; stream thread 1 holds task 1 to task 5; a task is shown with a processor node, a state store, and a cache]
  • 21. How does Kafka Streams report metrics?
      • the Kafka Streams client exposes metrics(), a read-only map of metrics
      • JMX reporter: implements MetricsReporter; registered by default, no need to set
      • my reporter: implements MetricsReporter; set through the Kafka Streams config metric.reporters
      • excerpt of the interface:
          interface MetricsReporter {
              // called when a metric is added or updated
              void metricChange(KafkaMetric metric);
              // called when a metric is removed
              void metricRemoval(KafkaMetric metric);
          }
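Registering a custom reporter is a matter of configuration. The sketch below builds the properties with plain `java.util.Properties`; the reporter class name `com.example.MyReporter`, the application id, and the broker address are assumptions for illustration only. The keys `metric.reporters` and `metrics.recording.level` are the configuration names as I understand them from the Kafka documentation.

```java
// Sketch of the configuration side only; no Kafka classes are needed to build it.
final class ReporterConfigSketch {
    static java.util.Properties streamsProps() {
        java.util.Properties props = new java.util.Properties();
        props.setProperty("application.id", "myapp");              // assumed application id
        props.setProperty("bootstrap.servers", "localhost:9092");  // assumed broker address
        // Comma-separated list of MetricsReporter implementations (hypothetical class name).
        props.setProperty("metric.reporters", "com.example.MyReporter");
        // Recording level threshold; DEBUG also records the INFO-level metrics.
        props.setProperty("metrics.recording.level", "DEBUG");
        return props;
    }
}
```

These properties would then be passed to the Kafka Streams client at construction time; the JMX reporter needs no entry because it is registered by default.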
  • 24. jconsole [Screenshot of jconsole with callouts for metric name, tag (thread-id), metric group, metric description, and metric value]
  • 27. Datadog [Screenshot of Datadog with callouts for metric group, tags, and metric name]
  • 30. What metrics does Kafka Streams expose?
      • Kafka Streams client level:
          • name: state
          • group: stream-metrics
          • tags: client-id = myapp-2d0b492c-87f1-11eb-8dcd-0242ac130003
      • stream thread level:
          • name: process-rate
          • group: stream-thread-metrics
          • tags: thread-id = myapp-2d0b492c-87f1-11eb-8dcd-0242ac130003-StreamThread-1
      • task level:
          • name: process-latency-avg
          • group: stream-task-metrics
          • tags: thread-id = myapp-2d0b492c-87f1-11eb-8dcd-0242ac130003-StreamThread-1, task-id = 0_1
  • 33. …some more metrics
      • processor node level:
          • name: process-rate
          • group: stream-processor-node-metrics
          • tags: thread-id = myapp-2d0b492c-87f1-11eb-8dcd-0242ac130003-StreamThread-1, task-id = 0_1, processor-node-id = KSTREAM-SINK-0000000004
      • state store level:
          • name: put-rate
          • group: stream-state-metrics
          • tags: thread-id = myapp-2d0b492c-87f1-11eb-8dcd-0242ac130003-StreamThread-1, task-id = 0_1, rocksdb-state-id = count-items
      • cache level:
          • name: hit-ratio-avg
          • group: stream-record-cache-metrics
          • tags: thread-id = myapp-2d0b492c-87f1-11eb-8dcd-0242ac130003-StreamThread-1, task-id = 0_1, record-cache-id = 0_1-count-items
  • 34. … and finally
      • all metrics of the embedded consumers, producers, and admin client, e.g.:
          • name: last-rebalance-seconds-ago
          • group: consumer-coordinator-metrics
          • tags: client-id = myapp-2d0b492c-87f1-11eb-8dcd-0242ac130003-StreamThread-1-consumer
  • 37. New metrics
      • KIP-444 introduces client-level metrics:
          • version
          • commit-id
          • application-id
          • topology-description
          • state
          • alive-stream-threads
      • KIP-444 introduces new task-level metrics:
          • active-process-ratio
          • standby-process-ratio (not yet implemented)
          • dropped-records
  • 38. Refactorings
      • renames some metric names and some metric tags
      • records client-level and stream-thread-level metrics on INFO and most metrics on lower levels on DEBUG
      • removes all parent metrics except one and lets users do the roll-up themselves
      • removes overlapping metrics
          • dropped-records (task level, INFO) replaces
              • late-records-drop (processor node, INFO),
              • skipped-records (processor node, INFO),
              • expired-window-record-drop (state store, DEBUG)
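Since the parent metrics are gone, a roll-up has to be computed on the monitoring side. A minimal sketch of such a roll-up, assuming the task-level values have already been extracted together with their thread-id and task-id tags; `TaskValue` and `RollUp` are hypothetical helper classes, not part of Kafka Streams.

```java
// Hypothetical holder for one task-level metric value and its tags.
final class TaskValue {
    final String threadId;
    final String taskId;
    final double value;

    TaskValue(String threadId, String taskId, double value) {
        this.threadId = threadId;
        this.taskId = taskId;
        this.value = value;
    }
}

final class RollUp {
    // Sums task-level values per thread-id tag, e.g. dropped-records per stream thread.
    static java.util.Map<String, Double> sumByThread(java.util.List<TaskValue> taskValues) {
        java.util.Map<String, Double> perThread = new java.util.TreeMap<>();
        for (TaskValue tv : taskValues) {
            perThread.merge(tv.threadId, tv.value, Double::sum);
        }
        return perThread;
    }
}
```

In practice most monitoring systems can do the same grouping with a query over the thread-id tag, which is probably preferable to doing it in application code.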
  • 41. Improving custom metrics
      • Sensor addLatencyRateTotalSensor(final String scopeName,
                                         final String entityName,
                                         final String operationName,
                                         final Sensor.RecordingLevel recordingLevel,
                                         final String... tags);
      • Sensor addRateTotalSensor(final String scopeName,
                                  final String entityName,
                                  final String operationName,
                                  final Sensor.RecordingLevel recordingLevel,
                                  final String... tags);
      • only available where you have access to the ProcessorContext
      • you can add additional metrics to the sensor with Sensor#add()
  • 42. Example of custom metrics

        import java.util.Locale;
        import org.apache.kafka.common.metrics.Sensor;
        import org.apache.kafka.streams.processor.api.Processor;
        import org.apache.kafka.streams.processor.api.ProcessorContext;
        import org.apache.kafka.streams.processor.api.Record;
        import org.apache.kafka.streams.state.KeyValueStore;

        // The class name WordCountProcessor is chosen here for illustration.
        public class WordCountProcessor implements Processor<String, String, String, String> {
            private ProcessorContext<String, String> context;
            private KeyValueStore<String, Integer> kvStore;
            private Sensor countEmptyRecords;

            @Override
            public void init(final ProcessorContext<String, String> context) {
                this.context = context;
                // Registers a sensor with a rate and a total metric for empty records.
                countEmptyRecords = context.metrics().addRateTotalSensor(
                    "word-counter",
                    "word-counter" + context.taskId(),
                    "count-empty-messages",
                    Sensor.RecordingLevel.INFO
                );
                kvStore = context.getStateStore("Counts");
            }

            @Override
            public void process(final Record<String, String> record) {
                final String line = record.value().toLowerCase(Locale.getDefault()).trim();
                if (line.isEmpty()) {
                    // "".split(" ") yields one empty string, so check for emptiness before splitting.
                    countEmptyRecords.record();
                    return;
                }
                for (final String word : line.split(" ")) {
                    final Integer oldValue = kvStore.get(word);
                    if (oldValue == null) {
                        kvStore.put(word, 1);
                    } else {
                        kvStore.put(word, oldValue + 1);
                    }
                }
            }
        }
  • 46. RocksDB metrics
      • RocksDB is the default state store in Kafka Streams
      • statistics-based metrics (KIP-471, AK 2.4): cumulative measurements over time collected by RocksDB
          • name: bytes-written-rate
          • group: stream-state-metrics
          • tags: thread-id = myapp-2d0b492c-87f1-11eb-8dcd-0242ac130003-StreamThread-1, task-id = 0_1, rocksdb-state-id = count-items
      • properties-based metrics (KIP-607, AK 2.7): properties exposed by RocksDB providing current measurements
          • name: block-cache-usage
          • group: stream-state-metrics
          • tags: thread-id = myapp-2d0b492c-87f1-11eb-8dcd-0242ac130003-StreamThread-1, task-id = 0_1, rocksdb-state-id = count-items
  • 48. Recording RocksDB metrics
      • statistics-based metrics
          • collecting statistics-based metrics may have an impact on performance
          • recording metrics during state store operations might be costly
          • instead, each state store has a metric recorder
          • all metric recorders are triggered once per minute by one dedicated thread that is started at Kafka Streams client start-up
      • properties-based metrics
          • all properties-based metrics are gauges
          • a gauge executes some given code each time the metric is queried
          • properties-based metrics query RocksDB properties
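The gauge idea, code that runs each time the metric is read, can be illustrated without RocksDB. In this sketch `ToyGauge` and `ToyStore` are hypothetical stand-ins; `ToyStore` merely pretends to expose a block-cache-usage property and is not the RocksDB API.

```java
// Toy gauge: evaluates the supplied code on every read. NOT Kafka's Gauge interface.
final class ToyGauge<T> {
    private final java.util.function.Supplier<T> supplier;

    ToyGauge(java.util.function.Supplier<T> supplier) {
        this.supplier = supplier;
    }

    // Nothing is cached: each query runs the supplier again.
    T value() {
        return supplier.get();
    }
}

// Hypothetical stand-in for a state store that exposes a current property value.
final class ToyStore {
    private long blockCacheUsageBytes;

    void cacheBlock(long bytes) {
        blockCacheUsageBytes += bytes;
    }

    long blockCacheUsage() {
        return blockCacheUsageBytes;
    }
}
```

A gauge built as `new ToyGauge<>(store::blockCacheUsage)` always reports the store's current value, so the cost of reading the property is paid only when somebody queries the metric, not on every state store operation.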
  • 53. When to look at RocksDB metrics?
      • high memory usage (properties-based metrics)
          • size-all-mem-tables
          • block-cache-usage
          • block-cache-pinned-usage
          • estimate-table-readers-mem
      • high disk usage (properties-based metrics)
          • total-sst-files-size
      • high disk I/O and write stalls (statistics-based metrics)
          • memtable-bytes-flushed-[rate | total]
          • bytes-[read | written]-compaction-rate
          • write-stall-duration-[avg | total]
          • memtable-hit-ratio
          • block-cache-[data | index | filter]-hit-ratio
      • too many open files (statistics-based metrics)
          • number-open-files
      • for more details, check out the blog post: How to Tune RocksDB for Your Kafka Streams Application https://www.confluent.io/blog/how-to-tune-rocksdb-kafka-streams-state-stores-performance/
  • 58. End-to-end-latency metrics
      [Diagram: topology of source node, filter, aggregation, and sink node; each latency spans from the event time of a record to the processing time at the given node]
      • consumption latency (INFO)
          • name: record-e2e-latency-[min | max | avg]
          • group: stream-processor-node-metrics
          • tags: thread-id = myapp-…, task-id = 0_1, processor-node-id = KSTREAM-SOURCE-0000000004
      • begin-to-state latency (TRACE)
          • name: record-e2e-latency-[min | max | avg]
          • group: stream-state-metrics
          • tags: thread-id = myapp-…, task-id = 0_1, rocksdb-state-id = count-items
      • full end-to-end latency (INFO)
          • name: record-e2e-latency-[min | max | avg]
          • group: stream-processor-node-metrics
          • tags: thread-id = myapp-…, task-id = 0_1, processor-node-id = KSTREAM-SINK-0000000004
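The min, max, and avg variants of the latency metric can be sketched as follows. `E2eLatencyTracker` is a hypothetical class, not Kafka Streams code; it assumes the latency of a record at a node is the wall-clock time at which the node sees the record minus the record's event timestamp, both in milliseconds.

```java
// Toy tracker for record end-to-end latency at one node; NOT Kafka Streams code.
final class E2eLatencyTracker {
    private long min = Long.MAX_VALUE;
    private long max = Long.MIN_VALUE;
    private long sum;
    private long count;

    // latency = processing (wall-clock) time at the node - event time of the record
    void record(long recordTimestampMs, long processingTimeMs) {
        long latency = processingTimeMs - recordTimestampMs;
        min = Math.min(min, latency);
        max = Math.max(max, latency);
        sum += latency;
        count++;
    }

    long min() { return min; }

    long max() { return max; }

    // Average latency; NaN while nothing has been recorded.
    double avg() {
        return count == 0 ? Double.NaN : (double) sum / count;
    }
}
```

One tracker per source node would correspond to the consumption latency, one per state store to the begin-to-state latency, and one per sink node to the full end-to-end latency.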
  • 60. End-to-end-latency metrics (advanced) [Diagram: task 1 and task 2, each with source node, filter, aggregation, and sink node; event time and processing time are marked for each task, and the processing delay of task 2 is highlighted]
  • 62. Takeaways
      • Kafka Streams exposes various metrics on different levels
      • metrics were consolidated recently-ish
      • RocksDB metrics let you gain insight into state stores
      • Kafka Streams allows monitoring record end-to-end latencies