Building a Scalable Real-Time Fleet Management IoT Data Tracker with Kafka Streams and gRPC
Rui Batista | Adrien Bestel | tb.lx by Daimler Truck
Speakers
Rui Batista, Senior Backend Engineer
Adrien Bestel, Principal Ops Engineer
tb.lx, the digital product studio for Daimler Truck
Agenda
• What is IoT Data at Daimler Truck?
• Use case: Last known data of devices
• Make it work; Make it better; Repeat
What is IoT Data at Daimler Truck?
IoT Data @ Daimler Truck

Sensors
• GPS
• Speed
• Battery / fuel level
• …

cTP
• Collects sensor data
• Uploads data

Vehicle
Gathers sensor data and transmits it to the cloud through mobile networks.

Cloud
Entry point into the cloud: receives data from all devices and routes it downstream.

Data Ingestion
Ingests data from all vehicles and routes it to the business units.

Data Processing
Processes the data, persists it, augments it, and makes it available to downstream users.
Use case: Last known data of devices in Daimler Buses
Value Proposition
Access the last known state of vehicles, regardless of their current connectivity status.

Usage
• What was the last position of my fleet?
• What was the last state of charge of my electric vehicle?
• I don’t want to process all the IoT data from my vehicles, but I do want to periodically check the distance travelled by each of them.
Requirements
• Keep the last state of each signal, keyed by tenant, vehicle ID and signal ID
• Allow accessing a set of signals for a set of vehicles
• Data must be available in near real time (<5 seconds end to end)
• The API must be fast (it backs a front end)
Architecture | Data Flow

Input Topic → Kafka Streams Topology → Store
Messages on the input topic carry batches of signal values:

[
  {
    "tenantId": "tenant1",
    "vehicleId": "device1",
    "signalId": "signal1",
    "timestamp": 1708335166982,
    "value": 123.4
  },
  …
]
The topology re-keys each value by tenant, device and signal; when two records share a key, the one with the larger timestamp wins:

key: tenant1;device1;signal1, timestamp: 1708335166982, value: 123.4
  beats
key: tenant1;device1;signal1, timestamp: 1708335100000, value: 0.1
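For illustration, the composite key can be modeled as a small case class whose serialized form matches the tenant1;device1;signal1 layout above; the type and field names are assumptions, not the project's actual classes:

// Hypothetical composite key; serialization matches the layout shown above.
final case class StoreKey(tenantId: String, deviceId: String, signalId: String) {
  def serialized: String = s"$tenantId;$deviceId;$signalId"
}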
The store keeps only the winning record:

key: tenant1;device1;signal1, timestamp: 1708335166982, value: 123.4
Architecture | API

The Kafka cluster feeds several application instances (Instance 1, 2, 3), each holding its own local store.
Data is split across many local stores, each of which holds only part of the entire state store.
1/ Register the instances of the Kafka Streams application. All share application.id: my-topology, and each advertises its own endpoint:

• Instance 1: application.server: instance1:8080, group.instance.id: instance1
• Instance 2: application.server: instance2:8080, group.instance.id: instance2
• Instance 3: application.server: instance3:8080, group.instance.id: instance3
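A minimal sketch of how these settings map to Streams properties; the host and port values are illustrative:

import java.util.Properties
import org.apache.kafka.streams.StreamsConfig

val props = new Properties()
props.put(StreamsConfig.APPLICATION_ID_CONFIG, "my-topology")
// Endpoint advertised to the other instances for interactive queries
props.put(StreamsConfig.APPLICATION_SERVER_CONFIG, "instance1:8080")
// Static membership (a consumer config passed through by Streams):
// keeps partition assignments sticky across restarts
props.put("group.instance.id", "instance1")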
2/ When a request arrives (GET key1, key2, key3, …), the receiving instance queries the Streams metadata to learn where each key is retrievable.
3/ Keys hosted locally are read from the local store; the rest are retrieved from the other instances with RPC requests (a sketch follows).
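A minimal sketch of that scatter/gather step; only queryMetadataForKey and activeHost come from the Kafka Streams API, everything else (the class, the local and remote query functions) is illustrative:

import org.apache.kafka.common.serialization.Serializer
import org.apache.kafka.streams.KafkaStreams
import org.apache.kafka.streams.state.HostInfo

// Hypothetical scatter/gather helper: group the requested keys by the
// instance that currently owns them, read local keys from the local
// store, and fan out to the other instances for the rest.
class KeyValueFetcher[K, V](
    kafkaStreams: KafkaStreams,
    storeName: String,
    keySerializer: Serializer[K],
    localhost: HostInfo,
    queryLocalStore: List[K] => List[V],
    queryRemoteHost: (HostInfo, List[K]) => List[V], // e.g. a gRPC call
) {
  def getKeyValues(keys: List[K]): List[V] =
    keys
      .groupBy { key =>
        kafkaStreams.queryMetadataForKey(storeName, key, keySerializer).activeHost()
      }
      .toList
      .flatMap {
        case (host, hostKeys) if host == localhost => queryLocalStore(hostKeys)
        case (host, hostKeys)                      => queryRemoteHost(host, hostKeys)
      }
}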
4/ The gathered values (value1, value2, value3, …) are returned to the caller.
Architecture | Context

Deployment
• StatefulSet in Kubernetes (sticky assignment)
• Kafka cluster in Kubernetes (Strimzi)

Tech stack
• Scala
• Kafka Streams
• RocksDB
• Interactive Queries with gRPC + Protocol Buffers
• Micrometer (lots of metrics)
Make it work: our first version
First Version | Topology

Sub Topology 1: input → messages-source-v1 → key-values-extractor-v1 → key-value-repartition-v1-repartition (topic)
Sub Topology 2: key-value-repartition-v1-repartition-source → keep-most-recent-reducer-v1 → key-value-store-v1
streamsBuilder
// source of input data
.stream[String, Message](
inputTopic,
Consumed
.`with`(Serdes.String(), messageSerde)
.withName("messages-source-v1")
)
// extract key values - changing the key of the stream
.flatMap(
{ case (_, message) => message.toKeyValues },
Named.as("key-values-extractor-v1"),
)
// Kafka Streams would otherwise insert this repartition implicitly
.repartition(
Repartitioned
.as("key-value-repartition-v1")
.withKeySerde(keyValueKeySerde)
.withValueSerde(keyValueSerde)
)
// group by key, effectively allowing operations at key level
.groupByKey(Grouped.as("group-by-key-v1"))
// Keep the last value based on the timestamp
.reduce(
{ case (keyValue1, keyValue2) => keyValue1.maxByTimestamp(keyValue2) },
Named.as("keep-most-recent-reducer-v1"),
Materialized.as(Stores.persistentKeyValueStore("key-value-store-v1")),
)
First Version | gRPC Service
class GrpcService(
kafkaStreams: KafkaStreams,
storeName: String,
localhost: HostInfo
) {
def getKeyValue(tenantId: String, deviceId: String, signalId: String) = {
val key = (tenantId, deviceId, signalId)
val activeHost = kafkaStreams.queryMetadataForKey(storeName, key, implicitly)
.activeHost()
if (activeHost == localhost) {
queryLocalStore(…)
} else {
queryRemoteHostWithGrpc(…)
}
}
}
First Version | Interactive Query
implicit val bytesOrdering: Ordering[Bytes] = Ordering.fromLessThan { (a, b) =>
Bytes.BYTES_LEXICO_COMPARATOR.compare(a.get(), b.get()) < 0
}
def queryLocalStore(keys: List[(String, String, String)]) = {
if (keys.isEmpty) {
List.empty
} else {
// Serialize all keys to find the min and max by lexicographical order
val allKeys = keys
.map(key => Bytes.wrap(key.toByteArray) -> key)
.sortBy(_._1)(bytesOrdering)
val (_, minKey) = allKeys.head
val (_, maxKey) = allKeys.last
    // Range query to pull all values in a single store access
    val request = StateQueryRequest
      .inStore(storeName)
      .withQuery(RangeQuery.withRange(minKey, maxKey))
      .requireActive()
    // Get the values
    kafkaStreamsWrapper.kafkaStreams.query(request)
      .getPartitionResults.values().asScala
      .flatMap { result =>
        Using(result.getResult)(_.asScala.map(_.value))
          .getOrElse(Iterator.empty)
      }
      // The range can include keys that were not requested; keep only ours
      .filter(value => keys.contains(value.key))
      .toList
  }
}
First Version | Test setup

Load generator
• 100 devices
• 2 messages / device / second
• 10 to 100 signals per message
• 500 different signals
• → 200 messages / second, 10K signals per second

Production
• >10K devices
• 1K different signals
• 500K signals per second
Make it better #1: optimizing the topology
First Optimization | Topology

Most of the network pressure comes from the repartition.
Invariant: the input topic is already partitioned by deviceId.
We can therefore remove the extractor and repartition steps using the Processor API.
First Optimization | Topology V2

input → messages-source-v2 → keep-most-recent-processor-v2 → key-value-store-v2
streamsBuilder
// Create the key value store
.addStateStore(
Stores.keyValueStoreBuilder(
Stores.persistentKeyValueStore(v2StoreName),
keyValueKeySerde,
keyValueSerde,
).withCachingEnabled()
)
// Create the input stream
.stream[String, Message](
inputTopic,
Consumed
.`with`(Serdes.String(), messageSerde)
.withName("messages-source-v2"),
)
// Add the last values processor, connected to the store
.process(
() => new KeyValueStoreProcessor(v2StoreName),
Named.as("keep-most-recent-processor-v2"),
v2StoreName
)
First Optimization | Topology V2 | Processor
class KeyValueStoreProcessor(
storeName: String
) extends Processor[String, Message, Void, Void] {
private var store: KeyValueStore[KeyValue.Key, KeyValue] = _
override def init(context: ProcessorContext[Void, Void]): Unit = {
super.init(context)
// Inject the store on initialization
this.store = context.getStateStore[
KeyValueStore[KeyValue.Key, KeyValue]
](storeName)
}
override def process(record: Record[String, Message]): Unit = {
// For each value
extractLastValues(record.value()).foreach { value =>
// Get the current value in store
val key = value.key
val currentValue = store.get(key)
// Override if the new value has a bigger timestamp
if (null == currentValue ||
currentValue.timestamp < value.timestamp) {
store.put(key, value)
}
}
  }
}
First Optimization | Topology V2 | Impacts

Before
• Input: 183 kB/s
• Repartition: 270 kB/s
• Changelog: 42 kB/s
After
• Input: 183 kB/s
• Repartition: 0 (-270 kB/s)
• Changelog: 36 kB/s (-5 kB/s)
First Optimization | Topology V2 | gRPC?
class GrpcService(
kafkaStreams: KafkaStreams,
storeName: String,
localhost: HostInfo
) {
def getKeyValue(tenantId: String, deviceId: String, signalId: String) = {
val key = (tenantId, deviceId, signalId)
val activeHost = kafkaStreams.queryMetadataForKey(storeName, key, implicitly)
.activeHost()
if (activeHost == localhost) {
queryLocalStore(…)
} else {
queryRemoteHostWithGrpc(…)
}
}
}
With the repartition removed, the store is partitioned by the input topic's key, so the metadata lookup must use the deviceId alone rather than the composite key:
class GrpcService(
kafkaStreams: KafkaStreams,
storeName: String,
localhost: HostInfo
) {
def getKeyValue(deviceId: String, signalId: String) = {
val key = deviceId
val activeHost = kafkaStreams.queryMetadataForKey(storeName, key, implicitly)
.activeHost()
if (activeHost == localhost) {
queryLocalStore(…)
} else {
queryRemoteHostWithGrpc(…)
}
}
}
First Optimization | Topology V2 | Take Away
• Describe your topologies to understand what is done automatically (see the sketch below)
• Study your system to understand whether all steps are required
• Load test to understand the behavior of the different operations
• The low-level API gives you a lot of control
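Describing a topology is a one-liner; it prints every source, processor, sink and sub-topology, including the repartition topics Kafka Streams inserted implicitly (a minimal sketch):

import org.apache.kafka.streams.StreamsBuilder

val streamsBuilder = new StreamsBuilder()
// … build the topology as on the previous slides …
// describe() lists all sub-topologies, processors and repartition topics
println(streamsBuilder.build().describe())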
Make it better #2: improving state persistence
Second Optimization | Persistence

RocksDB Store
• Persisted to disk
• Cache in memory
• Write-ahead log
• Store can exceed available RAM

In-Memory Store
• Faster
• Size constrained by available RAM
• Can throw OOM errors if the store grows too big
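Both flavors come from the same Stores factory, so switching between them is a one-line change (a sketch; the store name is illustrative):

import org.apache.kafka.streams.state.Stores

// Disk-backed RocksDB store: can exceed available RAM
val persistentSupplier = Stores.persistentKeyValueStore("key-value-store")
// In-memory store: faster, but bounded by available RAM
val inMemorySupplier = Stores.inMemoryKeyValueStore("key-value-store")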
We use the RocksDB store in our case.
Second Optimization | Persistence | RocksDB
Tuning RocksDB: limit memory and disk usage
p.put(
StreamsConfig.ROCKSDB_CONFIG_SETTER_CLASS_CONFIG,
classOf[CustomRocksDBConfig]
)
class CustomRocksDBConfig extends RocksDBConfigSetter {
override def setConfig(
storeName: String,
options: Options,
configs: java.util.Map[String, AnyRef],
): Unit = {
// …
}
}
RocksDB allocates off-heap memory that you need to limit:
• Block cache (for reads)
• Index and filter blocks
• Memtable (write buffer)
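A sketch of such a config setter, loosely following the bounded-memory example in the Kafka Streams documentation; all sizes are illustrative, not the values used in this talk:

import java.util
import org.apache.kafka.streams.state.RocksDBConfigSetter
import org.rocksdb.{BlockBasedTableConfig, LRUCache, Options, WriteBufferManager}

// Shared across all stores of the instance: one block cache for reads,
// plus a write buffer manager that charges memtables against that cache.
object BoundedMemory {
  val cache = new LRUCache(512L * 1024 * 1024)
  val writeBufferManager = new WriteBufferManager(128L * 1024 * 1024, cache)
}

class BoundedMemoryRocksDBConfig extends RocksDBConfigSetter {
  override def setConfig(
      storeName: String,
      options: Options,
      configs: util.Map[String, AnyRef],
  ): Unit = {
    val tableConfig = options.tableFormatConfig().asInstanceOf[BlockBasedTableConfig]
    tableConfig.setBlockCache(BoundedMemory.cache)
    // Count index and filter blocks against the block cache too
    tableConfig.setCacheIndexAndFilterBlocks(true)
    options.setWriteBufferManager(BoundedMemory.writeBufferManager)
    options.setTableFormatConfig(tableConfig)
  }
}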
When tuning or optimizing RocksDB usage:
• Check the latest Kafka Streams recommendations in its RocksDB documentation
• The Kafka Streams store cache and the RocksDB caches are not mutually exclusive
• Consider enabling compression (a memory usage vs. performance trade-off)
• Experiment
Operating RocksDB in Kubernetes

Persistent volume
• Each instance of the application has a persistent volume where state is written
• Volumes get detached and reattached during operations
• Moving a volume to another node can be really slow (~30 minutes)

Ephemeral volume
• State is written to the volume
• The volume is destroyed and recreated during operations
• Creating a volume is fast (a few seconds)
• State is loaded from the changelog at startup
We used persistent volumes in our first versions; the slow volume moves were a major issue.
We are moving to ephemeral volumes, which rely on restoring state from the changelog at startup. Let's see how that works.
Second Optimization | Persistence | Changelog

Persistence of the stores' state: every change in the store is replicated to the changelog topic. The changelog is a compacted topic: periodically the topic is compacted and only the last written value of each key is kept.
Initialization of the stores' state: at startup, the changelog is consumed completely to recreate the store's state.
How does the changelog grow? After 12 hours: 2 GB.
INFO o.a.k.s.p.i.StoreChangelogReader stream-thread Restoration in progress for 32 partitions
INFO o.a.k.s.p.internals.StreamThread stream-thread Restoration took 80684 ms for all active tasks
…
Restoration needs ~10 MB/s of bandwidth to read the changelog.
Forcing a compaction shrinks the changelog from 2 GB to 5 MB, for 100 devices and 500 signals per device.
Broker log-cleaner output after the compaction:
Log cleaner thread 0 cleaned log kv-store-v1-key-value-store-v1-
changelog-22 (dirty section = [0, 2447900])
59.9 MB of log processed in 1.3 seconds (44.9 MB/sec).
Indexed 59.9 MB in 0.7 seconds (87.3 Mb/sec, 51.5% of total time)
Buffer utilization: 0.0%
Cleaned 59.9 MB in 0.6 seconds (92.5 Mb/sec, 48.5% of total time)
Start size: 59.9 MB (2,447,900 messages)
End size: 0.0 MB (1,540 messages)
99.9% size reduction (99.9% fewer messages)
After compaction, restoration is fast:

INFO o.a.k.s.p.internals.StreamThread stream-thread Restoration took 1102 ms for all active tasks
Restoration now needs only ~100 kB/s of bandwidth to read the changelog.
Properties of the changelog
• Small when compacted (number of devices × number of signals values to keep)
• The number of values grows organically
• Lots of writes, so the changelog can grow fast
• How can we keep the changelog "small"?
Configuration | Description | Default
max.compaction.lag.ms | Maximum time a message will remain ineligible for compaction in the log | Long.MAX_VALUE
min.cleanable.dirty.ratio | How frequently the log compactor will attempt to clean the log; this ratio bounds the maximum space wasted in the log by duplicates | 0.5
log.segment.bytes | The maximum size of a single log file | 1 GB
How to control compaction frequency?
• Setting a max compaction lag puts an upper bound on the compaction period
• Lowering the segment size and/or the dirty ratio triggers compaction more often
streamsBuilder
.addStateStore(
Stores
.keyValueStoreBuilder(
Stores.persistentKeyValueStore(v3StoreName),
keyValueKeySerde,
keyValueSerde,
)
.withLoggingEnabled(
Map(
TopicConfig.SEGMENT_BYTES_CONFIG -> (128 * 1024 * 1024).toString,
TopicConfig.MAX_COMPACTION_LAG_MS_CONFIG -> (60 * 60 * 1000).toString
).asJava
)
.withCachingEnabled()
)
Second Optimization | Persistence | Take Away
• Persistent volumes can bring uncontrollable entropy
• Ephemeral volumes rely on small changelogs
• Compaction is “lazy” by default
• Optimize changelogs’ compaction based on your data
Make it better #3: optimize for resilience and fault tolerance
Third Optimization | Resilience
What happens when an instance of a Kafka Streams application is down?
• Data assigned to the instance stops being processed
• Local store stops being available
Third Optimization | Resilience | Standby Replicas

Active replica: input topic → topology → store → changelog topic
Standby replica: changelog topic → store
The active replica processes input data and stores the result in its local store.
Third Optimization | Resilience | Standby Replicas
Kafka
Streams
Store
Changelog
topic
Input
topic
Topology
Active Replica
Standby Replica
Kafka
Streams
Store
The store is persisted in
its changelog topic
The local store is replicated to the standby replica by reading the changelog.
• Standby replicas are eventually consistent shadow copies of state stores
• Each task of the topology has one active replica and num.standby.replicas standby replicas (configuration sketch below)

Pros
• Faster rebalancing
• Better reliability

Cons
• More resources
• Eventual consistency
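Enabling standbys is a single Streams property (a sketch; the replica count is illustrative):

import java.util.Properties
import org.apache.kafka.streams.StreamsConfig

val props = new Properties()
// Maintain one standby copy of every state store on another instance
props.put(StreamsConfig.NUM_STANDBY_REPLICAS_CONFIG, "1")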
Third Optimization | Resilience | Interactive Queries
public <K> KeyQueryMetadata queryMetadataForKey(
final String storeName,
final K key,
final Serializer<K> keySerializer
)
public class KeyQueryMetadata {
private final HostInfo activeHost;
private final Set<HostInfo> standbyHosts;
private final int partition;
}
/**
* Returns LagInfo, for all store partitions (active or standby) local
* to this Streams instance. Note that the values returned are just
* estimates and meant to be used for making soft decisions on whether
* the data in the store partition is fresh enough for querying.
*
* Note: Each invocation of this method issues a call to the Kafka
* brokers. Thus it's advisable to limit the frequency of invocation to
* once every few seconds.
*
* @return map of store names to another map of partition to LagInfos
*/
public Map<String, Map<Integer, LagInfo>> allLocalStorePartitionLags()
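Putting the two APIs together, a query router can prefer the active replica and fall back to a standby when the active host is unreachable; a minimal sketch, where activeIsUp is a hypothetical health check, not part of the Kafka Streams API:

import org.apache.kafka.common.serialization.Serializer
import org.apache.kafka.streams.KafkaStreams
import org.apache.kafka.streams.state.HostInfo
import scala.jdk.CollectionConverters._

// Prefer the active replica; fall back to a standby (possibly stale data).
def chooseHost[K](
    kafkaStreams: KafkaStreams,
    storeName: String,
    key: K,
    keySerializer: Serializer[K],
    activeIsUp: HostInfo => Boolean, // hypothetical health check
): Option[HostInfo] = {
  val metadata = kafkaStreams.queryMetadataForKey(storeName, key, keySerializer)
  if (activeIsUp(metadata.activeHost())) {
    Some(metadata.activeHost())
  } else {
    // A standby can use allLocalStorePartitionLags() locally to decide
    // whether its copy is fresh enough to serve.
    metadata.standbyHosts().asScala.headOption
  }
}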
Third Optimization | Resilience | Take Away

Standby Replicas
• Faster recovery when an instance fails
• Allow serving interactive queries during rebalancing…
• …but with potentially stale data
Take Aways
Take Aways
• Describe your topologies to understand what your application really does
• Monitor your system using metrics from Kafka Streams, the Kafka Cluster, your
deployment environment, …
• Load Test as much as possible to understand trends and potential bottlenecks before they
become problematic in Production
• Prototype using the high level APIs, optimize using the low level APIs (at your own risks)
• gRPC 💚 Kafka Streams
• Understand Kafka Streams internals (in particular RocksDB and changelog compaction) to
help optimize further
• Kafka Streams is highly configurable
Thank you!
More Related Content

Similar to Building a Scalable Real-Time Fleet Management IoT Data Tracker with Kafka Streams and gRPC

Using Location Data to Showcase Keys, Windows, and Joins in Kafka Streams DSL...
Using Location Data to Showcase Keys, Windows, and Joins in Kafka Streams DSL...Using Location Data to Showcase Keys, Windows, and Joins in Kafka Streams DSL...
Using Location Data to Showcase Keys, Windows, and Joins in Kafka Streams DSL...confluent
 
"Introduction to Kx Technology", James Corcoran, Head of Engineering EMEA at ...
"Introduction to Kx Technology", James Corcoran, Head of Engineering EMEA at ..."Introduction to Kx Technology", James Corcoran, Head of Engineering EMEA at ...
"Introduction to Kx Technology", James Corcoran, Head of Engineering EMEA at ...Dataconomy Media
 
Exactly-once Stream Processing with Kafka Streams
Exactly-once Stream Processing with Kafka StreamsExactly-once Stream Processing with Kafka Streams
Exactly-once Stream Processing with Kafka StreamsGuozhang Wang
 
Web Scale Reasoning and the LarKC Project
Web Scale Reasoning and the LarKC ProjectWeb Scale Reasoning and the LarKC Project
Web Scale Reasoning and the LarKC ProjectSaltlux Inc.
 
Porting a Streaming Pipeline from Scala to Rust
Porting a Streaming Pipeline from Scala to RustPorting a Streaming Pipeline from Scala to Rust
Porting a Streaming Pipeline from Scala to RustEvan Chan
 
The magic behind your Lyft ride prices: A case study on machine learning and ...
The magic behind your Lyft ride prices: A case study on machine learning and ...The magic behind your Lyft ride prices: A case study on machine learning and ...
The magic behind your Lyft ride prices: A case study on machine learning and ...Karthik Murugesan
 
Containerizing Distributed Pipes
Containerizing Distributed PipesContainerizing Distributed Pipes
Containerizing Distributed Pipesinside-BigData.com
 
Monitoring Akka with Kamon 1.0
Monitoring Akka with Kamon 1.0Monitoring Akka with Kamon 1.0
Monitoring Akka with Kamon 1.0Steffen Gebert
 
Devoxx university - Kafka de haut en bas
Devoxx university - Kafka de haut en basDevoxx university - Kafka de haut en bas
Devoxx university - Kafka de haut en basFlorent Ramiere
 
Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...
Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...
Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...Guido Schmutz
 
SingleStore & Kafka: Better Together to Power Modern Real-Time Data Architect...
SingleStore & Kafka: Better Together to Power Modern Real-Time Data Architect...SingleStore & Kafka: Better Together to Power Modern Real-Time Data Architect...
SingleStore & Kafka: Better Together to Power Modern Real-Time Data Architect...HostedbyConfluent
 
IoT Sensor Analytics with Python, Jupyter, TensorFlow, Keras, Apache Kafka, K...
IoT Sensor Analytics with Python, Jupyter, TensorFlow, Keras, Apache Kafka, K...IoT Sensor Analytics with Python, Jupyter, TensorFlow, Keras, Apache Kafka, K...
IoT Sensor Analytics with Python, Jupyter, TensorFlow, Keras, Apache Kafka, K...Kai Wähner
 
Cloud Native London 2019 Faas composition using Kafka and cloud-events
Cloud Native London 2019 Faas composition using Kafka and cloud-eventsCloud Native London 2019 Faas composition using Kafka and cloud-events
Cloud Native London 2019 Faas composition using Kafka and cloud-eventsNeil Avery
 
Exactly-once Data Processing with Kafka Streams - July 27, 2017
Exactly-once Data Processing with Kafka Streams - July 27, 2017Exactly-once Data Processing with Kafka Streams - July 27, 2017
Exactly-once Data Processing with Kafka Streams - July 27, 2017confluent
 
SingleStore & Kafka: Better Together to Power Modern Real-Time Data Architect...
SingleStore & Kafka: Better Together to Power Modern Real-Time Data Architect...SingleStore & Kafka: Better Together to Power Modern Real-Time Data Architect...
SingleStore & Kafka: Better Together to Power Modern Real-Time Data Architect...HostedbyConfluent
 
Kick your database_to_the_curb_reston_08_27_19
Kick your database_to_the_curb_reston_08_27_19Kick your database_to_the_curb_reston_08_27_19
Kick your database_to_the_curb_reston_08_27_19confluent
 
What is the State of my Kafka Streams Application? Unleashing Metrics. | Neil...
What is the State of my Kafka Streams Application? Unleashing Metrics. | Neil...What is the State of my Kafka Streams Application? Unleashing Metrics. | Neil...
What is the State of my Kafka Streams Application? Unleashing Metrics. | Neil...HostedbyConfluent
 

Similar to Building a Scalable Real-Time Fleet Management IoT Data Tracker with Kafka Streams and gRPC (20)

Using Location Data to Showcase Keys, Windows, and Joins in Kafka Streams DSL...
Using Location Data to Showcase Keys, Windows, and Joins in Kafka Streams DSL...Using Location Data to Showcase Keys, Windows, and Joins in Kafka Streams DSL...
Using Location Data to Showcase Keys, Windows, and Joins in Kafka Streams DSL...
 
"Introduction to Kx Technology", James Corcoran, Head of Engineering EMEA at ...
"Introduction to Kx Technology", James Corcoran, Head of Engineering EMEA at ..."Introduction to Kx Technology", James Corcoran, Head of Engineering EMEA at ...
"Introduction to Kx Technology", James Corcoran, Head of Engineering EMEA at ...
 
RISC V in Spacer
RISC V in SpacerRISC V in Spacer
RISC V in Spacer
 
Exactly-once Stream Processing with Kafka Streams
Exactly-once Stream Processing with Kafka StreamsExactly-once Stream Processing with Kafka Streams
Exactly-once Stream Processing with Kafka Streams
 
Web Scale Reasoning and the LarKC Project
Web Scale Reasoning and the LarKC ProjectWeb Scale Reasoning and the LarKC Project
Web Scale Reasoning and the LarKC Project
 
Porting a Streaming Pipeline from Scala to Rust
Porting a Streaming Pipeline from Scala to RustPorting a Streaming Pipeline from Scala to Rust
Porting a Streaming Pipeline from Scala to Rust
 
The magic behind your Lyft ride prices: A case study on machine learning and ...
The magic behind your Lyft ride prices: A case study on machine learning and ...The magic behind your Lyft ride prices: A case study on machine learning and ...
The magic behind your Lyft ride prices: A case study on machine learning and ...
 
Containerizing Distributed Pipes
Containerizing Distributed PipesContainerizing Distributed Pipes
Containerizing Distributed Pipes
 
Monitoring Akka with Kamon 1.0
Monitoring Akka with Kamon 1.0Monitoring Akka with Kamon 1.0
Monitoring Akka with Kamon 1.0
 
Devoxx university - Kafka de haut en bas
Devoxx university - Kafka de haut en basDevoxx university - Kafka de haut en bas
Devoxx university - Kafka de haut en bas
 
Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...
Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...
Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...
 
SingleStore & Kafka: Better Together to Power Modern Real-Time Data Architect...
SingleStore & Kafka: Better Together to Power Modern Real-Time Data Architect...SingleStore & Kafka: Better Together to Power Modern Real-Time Data Architect...
SingleStore & Kafka: Better Together to Power Modern Real-Time Data Architect...
 
IoT Sensor Analytics with Python, Jupyter, TensorFlow, Keras, Apache Kafka, K...
IoT Sensor Analytics with Python, Jupyter, TensorFlow, Keras, Apache Kafka, K...IoT Sensor Analytics with Python, Jupyter, TensorFlow, Keras, Apache Kafka, K...
IoT Sensor Analytics with Python, Jupyter, TensorFlow, Keras, Apache Kafka, K...
 
So you think you can stream.pptx
So you think you can stream.pptxSo you think you can stream.pptx
So you think you can stream.pptx
 
Cloud Native London 2019 Faas composition using Kafka and cloud-events
Cloud Native London 2019 Faas composition using Kafka and cloud-eventsCloud Native London 2019 Faas composition using Kafka and cloud-events
Cloud Native London 2019 Faas composition using Kafka and cloud-events
 
Exactly-once Data Processing with Kafka Streams - July 27, 2017
Exactly-once Data Processing with Kafka Streams - July 27, 2017Exactly-once Data Processing with Kafka Streams - July 27, 2017
Exactly-once Data Processing with Kafka Streams - July 27, 2017
 
SingleStore & Kafka: Better Together to Power Modern Real-Time Data Architect...
SingleStore & Kafka: Better Together to Power Modern Real-Time Data Architect...SingleStore & Kafka: Better Together to Power Modern Real-Time Data Architect...
SingleStore & Kafka: Better Together to Power Modern Real-Time Data Architect...
 
Kick your database_to_the_curb_reston_08_27_19
Kick your database_to_the_curb_reston_08_27_19Kick your database_to_the_curb_reston_08_27_19
Kick your database_to_the_curb_reston_08_27_19
 
What is the State of my Kafka Streams Application? Unleashing Metrics. | Neil...
What is the State of my Kafka Streams Application? Unleashing Metrics. | Neil...What is the State of my Kafka Streams Application? Unleashing Metrics. | Neil...
What is the State of my Kafka Streams Application? Unleashing Metrics. | Neil...
 
Streaming ETL for All
Streaming ETL for AllStreaming ETL for All
Streaming ETL for All
 

More from HostedbyConfluent

Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...HostedbyConfluent
 
Renaming a Kafka Topic | Kafka Summit London
Renaming a Kafka Topic | Kafka Summit LondonRenaming a Kafka Topic | Kafka Summit London
Renaming a Kafka Topic | Kafka Summit LondonHostedbyConfluent
 
Evolution of NRT Data Ingestion Pipeline at Trendyol
Evolution of NRT Data Ingestion Pipeline at TrendyolEvolution of NRT Data Ingestion Pipeline at Trendyol
Evolution of NRT Data Ingestion Pipeline at TrendyolHostedbyConfluent
 
Ensuring Kafka Service Resilience: A Dive into Health-Checking Techniques
Ensuring Kafka Service Resilience: A Dive into Health-Checking TechniquesEnsuring Kafka Service Resilience: A Dive into Health-Checking Techniques
Ensuring Kafka Service Resilience: A Dive into Health-Checking TechniquesHostedbyConfluent
 
Exactly-once Stream Processing with Arroyo and Kafka
Exactly-once Stream Processing with Arroyo and KafkaExactly-once Stream Processing with Arroyo and Kafka
Exactly-once Stream Processing with Arroyo and KafkaHostedbyConfluent
 
Fish Plays Pokemon | Kafka Summit London
Fish Plays Pokemon | Kafka Summit LondonFish Plays Pokemon | Kafka Summit London
Fish Plays Pokemon | Kafka Summit LondonHostedbyConfluent
 
Tiered Storage 101 | Kafla Summit London
Tiered Storage 101 | Kafla Summit LondonTiered Storage 101 | Kafla Summit London
Tiered Storage 101 | Kafla Summit LondonHostedbyConfluent
 
Building a Self-Service Stream Processing Portal: How And Why
Building a Self-Service Stream Processing Portal: How And WhyBuilding a Self-Service Stream Processing Portal: How And Why
Building a Self-Service Stream Processing Portal: How And WhyHostedbyConfluent
 
From the Trenches: Improving Kafka Connect Source Connector Ingestion from 7 ...
From the Trenches: Improving Kafka Connect Source Connector Ingestion from 7 ...From the Trenches: Improving Kafka Connect Source Connector Ingestion from 7 ...
From the Trenches: Improving Kafka Connect Source Connector Ingestion from 7 ...HostedbyConfluent
 
Future with Zero Down-Time: End-to-end Resiliency with Chaos Engineering and ...
Future with Zero Down-Time: End-to-end Resiliency with Chaos Engineering and ...Future with Zero Down-Time: End-to-end Resiliency with Chaos Engineering and ...
Future with Zero Down-Time: End-to-end Resiliency with Chaos Engineering and ...HostedbyConfluent
 
Navigating Private Network Connectivity Options for Kafka Clusters
Navigating Private Network Connectivity Options for Kafka ClustersNavigating Private Network Connectivity Options for Kafka Clusters
Navigating Private Network Connectivity Options for Kafka ClustersHostedbyConfluent
 
Apache Flink: Building a Company-wide Self-service Streaming Data Platform
Apache Flink: Building a Company-wide Self-service Streaming Data PlatformApache Flink: Building a Company-wide Self-service Streaming Data Platform
Apache Flink: Building a Company-wide Self-service Streaming Data PlatformHostedbyConfluent
 
Explaining How Real-Time GenAI Works in a Noisy Pub
Explaining How Real-Time GenAI Works in a Noisy PubExplaining How Real-Time GenAI Works in a Noisy Pub
Explaining How Real-Time GenAI Works in a Noisy PubHostedbyConfluent
 
TL;DR Kafka Metrics | Kafka Summit London
TL;DR Kafka Metrics | Kafka Summit LondonTL;DR Kafka Metrics | Kafka Summit London
TL;DR Kafka Metrics | Kafka Summit LondonHostedbyConfluent
 
A Window Into Your Kafka Streams Tasks | KSL
A Window Into Your Kafka Streams Tasks | KSLA Window Into Your Kafka Streams Tasks | KSL
A Window Into Your Kafka Streams Tasks | KSLHostedbyConfluent
 
Mastering Kafka Producer Configs: A Guide to Optimizing Performance
Mastering Kafka Producer Configs: A Guide to Optimizing PerformanceMastering Kafka Producer Configs: A Guide to Optimizing Performance
Mastering Kafka Producer Configs: A Guide to Optimizing PerformanceHostedbyConfluent
 
Data Contracts Management: Schema Registry and Beyond
Data Contracts Management: Schema Registry and BeyondData Contracts Management: Schema Registry and Beyond
Data Contracts Management: Schema Registry and BeyondHostedbyConfluent
 
Code-First Approach: Crafting Efficient Flink Apps
Code-First Approach: Crafting Efficient Flink AppsCode-First Approach: Crafting Efficient Flink Apps
Code-First Approach: Crafting Efficient Flink AppsHostedbyConfluent
 
Debezium vs. the World: An Overview of the CDC Ecosystem
Debezium vs. the World: An Overview of the CDC EcosystemDebezium vs. the World: An Overview of the CDC Ecosystem
Debezium vs. the World: An Overview of the CDC EcosystemHostedbyConfluent
 
Beyond Tiered Storage: Serverless Kafka with No Local Disks
Beyond Tiered Storage: Serverless Kafka with No Local DisksBeyond Tiered Storage: Serverless Kafka with No Local Disks
Beyond Tiered Storage: Serverless Kafka with No Local DisksHostedbyConfluent
 

More from HostedbyConfluent (20)

Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
 
Renaming a Kafka Topic | Kafka Summit London
Renaming a Kafka Topic | Kafka Summit LondonRenaming a Kafka Topic | Kafka Summit London
Renaming a Kafka Topic | Kafka Summit London
 
Evolution of NRT Data Ingestion Pipeline at Trendyol
Evolution of NRT Data Ingestion Pipeline at TrendyolEvolution of NRT Data Ingestion Pipeline at Trendyol
Evolution of NRT Data Ingestion Pipeline at Trendyol
 
Ensuring Kafka Service Resilience: A Dive into Health-Checking Techniques
Ensuring Kafka Service Resilience: A Dive into Health-Checking TechniquesEnsuring Kafka Service Resilience: A Dive into Health-Checking Techniques
Ensuring Kafka Service Resilience: A Dive into Health-Checking Techniques
 
Exactly-once Stream Processing with Arroyo and Kafka
Exactly-once Stream Processing with Arroyo and KafkaExactly-once Stream Processing with Arroyo and Kafka
Exactly-once Stream Processing with Arroyo and Kafka
 
Fish Plays Pokemon | Kafka Summit London
Fish Plays Pokemon | Kafka Summit LondonFish Plays Pokemon | Kafka Summit London
Fish Plays Pokemon | Kafka Summit London
 
Tiered Storage 101 | Kafla Summit London
Tiered Storage 101 | Kafla Summit LondonTiered Storage 101 | Kafla Summit London
Tiered Storage 101 | Kafla Summit London
 
Building a Self-Service Stream Processing Portal: How And Why
Building a Self-Service Stream Processing Portal: How And WhyBuilding a Self-Service Stream Processing Portal: How And Why
Building a Self-Service Stream Processing Portal: How And Why
 
From the Trenches: Improving Kafka Connect Source Connector Ingestion from 7 ...
From the Trenches: Improving Kafka Connect Source Connector Ingestion from 7 ...From the Trenches: Improving Kafka Connect Source Connector Ingestion from 7 ...
From the Trenches: Improving Kafka Connect Source Connector Ingestion from 7 ...
 
Future with Zero Down-Time: End-to-end Resiliency with Chaos Engineering and ...
Future with Zero Down-Time: End-to-end Resiliency with Chaos Engineering and ...Future with Zero Down-Time: End-to-end Resiliency with Chaos Engineering and ...
Future with Zero Down-Time: End-to-end Resiliency with Chaos Engineering and ...
 
Navigating Private Network Connectivity Options for Kafka Clusters
Navigating Private Network Connectivity Options for Kafka ClustersNavigating Private Network Connectivity Options for Kafka Clusters
Navigating Private Network Connectivity Options for Kafka Clusters
 
Apache Flink: Building a Company-wide Self-service Streaming Data Platform
Apache Flink: Building a Company-wide Self-service Streaming Data PlatformApache Flink: Building a Company-wide Self-service Streaming Data Platform
Apache Flink: Building a Company-wide Self-service Streaming Data Platform
 
Explaining How Real-Time GenAI Works in a Noisy Pub
Explaining How Real-Time GenAI Works in a Noisy PubExplaining How Real-Time GenAI Works in a Noisy Pub
Explaining How Real-Time GenAI Works in a Noisy Pub
 
TL;DR Kafka Metrics | Kafka Summit London
TL;DR Kafka Metrics | Kafka Summit LondonTL;DR Kafka Metrics | Kafka Summit London
TL;DR Kafka Metrics | Kafka Summit London
 
A Window Into Your Kafka Streams Tasks | KSL
A Window Into Your Kafka Streams Tasks | KSLA Window Into Your Kafka Streams Tasks | KSL
A Window Into Your Kafka Streams Tasks | KSL
 
Mastering Kafka Producer Configs: A Guide to Optimizing Performance
Mastering Kafka Producer Configs: A Guide to Optimizing PerformanceMastering Kafka Producer Configs: A Guide to Optimizing Performance
Mastering Kafka Producer Configs: A Guide to Optimizing Performance
 
Data Contracts Management: Schema Registry and Beyond
Data Contracts Management: Schema Registry and BeyondData Contracts Management: Schema Registry and Beyond
Data Contracts Management: Schema Registry and Beyond
 
Code-First Approach: Crafting Efficient Flink Apps
Code-First Approach: Crafting Efficient Flink AppsCode-First Approach: Crafting Efficient Flink Apps
Code-First Approach: Crafting Efficient Flink Apps
 
Debezium vs. the World: An Overview of the CDC Ecosystem
Debezium vs. the World: An Overview of the CDC EcosystemDebezium vs. the World: An Overview of the CDC Ecosystem
Debezium vs. the World: An Overview of the CDC Ecosystem
 
Beyond Tiered Storage: Serverless Kafka with No Local Disks
Beyond Tiered Storage: Serverless Kafka with No Local DisksBeyond Tiered Storage: Serverless Kafka with No Local Disks
Beyond Tiered Storage: Serverless Kafka with No Local Disks
 

Recently uploaded

Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsAndrey Dotsenko
 
Unlocking the Potential of the Cloud for IBM Power Systems
Unlocking the Potential of the Cloud for IBM Power SystemsUnlocking the Potential of the Cloud for IBM Power Systems
Unlocking the Potential of the Cloud for IBM Power SystemsPrecisely
 
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024BookNet Canada
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Alan Dix
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxMaking_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxnull - The Open Security Community
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 

Recently uploaded (20)

Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food Manufacturing
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping Elbows
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
Unlocking the Potential of the Cloud for IBM Power Systems
Unlocking the Potential of the Cloud for IBM Power SystemsUnlocking the Potential of the Cloud for IBM Power Systems
Unlocking the Potential of the Cloud for IBM Power Systems
 
Hot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort Service
Hot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort ServiceHot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort Service
Hot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort Service
 
The transition to renewables in India.pdf
The transition to renewables in India.pdfThe transition to renewables in India.pdf
The transition to renewables in India.pdf
 
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxMaking_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 

Building a Scalable Real-Time Fleet Management IoT Data Tracker with Kafka Streams and gRPC

  • 1. Building a Scalable Real-Time Fleet Management IoT Data Tracker with Kafka Streams and gRPC Rui Batista | Adrien Bestel | tb.lx by Daimler Truck
  • 2. 2 Speakers Rui Batista Senior Backend Engineer Adrien Bestel Principal Ops Engineer
  • 3. the digital product studio for Daimler Truck 3
  • 4. 4 • What is IoT Data at Daimler Truck? • Use case: Last known data of devices • Make it work; Make it better; Repeat Agenda
  • 5. 5 What is IoT Data at Daimler Truck?
  • 6. 6 IoT Data @ Daimler Truck
  • 7. 7 IoT Data @ Daimler Truck • Gps • Speed • Battery / fuel level • … Sensors
  • 8. 8 IoT Data @ Daimler Truck • Gps • Speed • Battery / fuel level • … Sensors • Collect sensors data • Upload data cTP
  • 9. 9 IoT Data @ Daimler Truck Gather sensors data and transmit it to the Cloud through mobile networks Vehicle
  • 10. 10 IoT Data @ Daimler Truck Entry point into the cloud, receives data from all devices and route it downstream Cloud
  • 11. 11 IoT Data @ Daimler Truck Ingest data from all vehicles and route it to business units Data Ingestion
  • 12. 12 IoT Data @ Daimler Truck Process data, persist it, augment it, make it available to downstream users Data Processing
  • 13. 13 Use case: Last known data of devices in Daimler Buses
  • 14. 14 Use case: Last known data of devices Value Proposition Access the last know state of vehicles, regardless of their current connectivity status Usage • What was the last position of my fleet? • What was the last state of charge of my electric vehicle? • I don’t want to process all the IoT Data from my vehicles, but I want to periodically check the distance travelled by each of my vehicles.
  • 15. 15 Use case: Last known data of devices Requirements • Keep the last state of a signal, keyed by tenant, vehicle ID and signal ID • Allow accessing a set of signals for a set of vehicles • Data must be made available near real time (<5 seconds end to end) • API must be fast (usage in Front End)
  • 16. 16 Architecture | Data Flow Input Topic Kafka Streams Topology Store
  • 17. 17 Architecture | Data Flow Input Topic Kafka Streams Topology Store [ { “tenantId”: “tenant1”, “vehicleId”: “device1”, “signalId”: “signal1, “timestamp”: 1708335166982, “value”: 123.4 }, … ]
  • 18. 18 Architecture | Data Flow Input Topic Kafka Streams Topology Store key: tenant1;device1;signal1 timestamp: 1708335166982 value: 123.4 key: tenant1;device1;signal1 timestamp: 1708335100000 value: 0.1 >
  • 19. 19 Architecture | Data Flow Input Topic Kafka Streams Topology Store key: tenant1;device1;signal1 timestamp: 1708335166982 value: 123.4
  • 20. 20 Architecture | API Kafka Cluster Instance 1 Store Instance 2 Store Instance 3 Store
  • 21. 21 Architecture | API Kafka Cluster Instance 1 Store Instance 2 Store Instance 3 Store Data is split across many local stores, each of which only handles part of the entire state store
  • 22. 22 Architecture | API Kafka Cluster Instance 1 Store Instance 2 Store Instance 3 Store application.id: my-topology application.server: instance1:8080 group.instance.id: instance1 application.id: my-topology application.server: instance2:8080 group.instance.id: instance2 application.id: my-topology application.server: instance3:8080 group.instance.id: instance3 1/ Register instances of the Kafka Streams application
  • 23. 23 Architecture | API Kafka Cluster Instance 1 Store Instance 2 Store Instance 3 Store 2/ When receiving a requests, get metadata to know which data is retrievable where GET key1, key2, key3, … Metadata
  • 24. 24 Architecture | API Kafka Cluster Instance 1 Store Instance 2 Store Instance 3 Store key1 key2 key3 3/ Get some data in local store The rest is retrieved using RPC requests to other instances. RPC
  • 25. 25 Architecture | API Kafka Cluster Instance 1 Store Instance 2 Store Instance 3 Store 4/ Return values value1, value2, value3, …
  • 26. 26 Architecture | Context Deployment • Stateful set in Kubernetes (sticky assignment) • Kafka Cluster in Kubernetes (Strimzi) Tech stack • Scala • Kafka Streams • RocksDB • Interactive Queries with gRPC + Protocol Buffers • Micrometer (lots of metrics)
  • 27. 27 Make it work: our first version
  • 28. 28 First Version | Topology input key-value-repartition- v1-repartition key-value- store-v1 messages-source- v1 key-values- extractor-v1 key-value-repartition- v1-repartition-source keep-most- recent-reducer-v1 Sub Topology 1 Sub Topology 2
  • 29. 29 First Version | Topology input key-value-repartition- v1-repartition key-value- store-v1 messages-source- v1 key-values- extractor-v1 key-value-repartition- v1-repartition-source keep-most- recent-reducer-v1 Sub Topology 1 Sub Topology 2 streamsBuilder // source of input data .stream[String, Message]( inputTopic, Consumed .`with`(Serdes.String(), messageSerde) .withName("messages-source-v1") )
  • 30. 30 First Version | Topology input key-value-repartition- v1-repartition key-value- store-v1 messages-source- v1 key-values- extractor-v1 key-value-repartition- v1-repartition-source keep-most- recent-reducer-v1 Sub Topology 1 Sub Topology 2 // extract key values - changing the key of the stream .flatMap( { case (_, message) => message.toKeyValues }, Named.as("key-values-extractor-v1"), )
  • 31. 31 First Version | Topology input key-value-repartition- v1-repartition key-value- store-v1 messages-source- v1 key-values- extractor-v1 key-value-repartition- v1-repartition-source keep-most- recent-reducer-v1 Sub Topology 1 Sub Topology 2 // Kafka streams does that implicitly .repartition( Repartitioned .as("key-value-repartition-v1") .withKeySerde(keyValueKeySerde) .withValueSerde(keyValueSerde) )
  • 32. 32 First Version | Topology input key-value-repartition- v1-repartition key-value- store-v1 messages-source- v1 key-values- extractor-v1 key-value-repartition- v1-repartition-source keep-most- recent-reducer-v1 Sub Topology 1 Sub Topology 2 // group by key, effectively allowing operations at key level .groupByKey(Grouped.as("group-by-key-v1")) // Keep the last value based on the timestamp .reduce( { case (keyValue1, keyValue2) => keyValue1.maxByTimestamp(keyValue2) }, Named.as("keep-most-recent-reducer-v1"), Materialized.as(Stores.persistentKeyValueStore("key-value-store-v1")), )
  • 33. 33 First Version | gRPC Service class GrpcService( kafkaStreams: KafkaStreams, storeName: String, localhost: HostInfo ) { def getKeyValue(tenantId: String, deviceId: String, signalId: String) = { val key = (tenantId, deviceId, signalId) val activeHost = kafkaStreams.queryMetadataForKey(storeName, key, implicitly) .activeHost() if (activeHost == localhost) { queryLocalStore(…) } else { queryRemoteHostWithGrpc(…) } } }
  • 34. 34 First Version | Interactive Query implicit val bytesOrdering: Ordering[Bytes] = Ordering.fromLessThan { (a, b) => Bytes.BYTES_LEXICO_COMPARATOR.compare(a.get(), b.get()) < 0 } def queryLocalStore(val keys: List<(String, String, String)>) = { if (keys.isEmpty) { List.empty } else { // Serialize all keys to find the min and max by lexicographical order val allKeys = keys .map(key => Bytes.wrap(key.toByteArray) -> key) .sortBy(_._1)(bytesOrdering) val (_, minKey) = allKeys.head val (_, maxKey) = allKeys.last
35
First Version | Interactive Query
    // range query to pull all values between the min and max keys
    val request = StateQueryRequest
      .inStore(storeName)
      .withQuery(RangeQuery.withRange(minKey, maxKey))
      .requireActive()

    // get the values; the range may contain keys that were not requested,
    // so filter the results down to the requested keys
    kafkaStreamsWrapper.kafkaStreams.query(request)
      .getPartitionResults.values().asScala
      .flatMap { result =>
        // materialize inside Using so the iterator is consumed before it is closed
        Using(result.getResult)(_.asScala.map(_.value).toList)
          .getOrElse(List.empty)
      }
      .filter(value => keys.contains(value.key))
      .toList
36
First Version | Test setup
Load generator
• 100 devices
• 2 messages / device / second
• 10 to 100 signals per message
• 500 different signals
• 200 messages / second
• 10K signals per second
Production
• >10K devices
• 1K different signals
• 500K signals per second
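For context, a load generator with this shape fits in a few lines; this is a hypothetical sketch, not the talk's actual harness, and the topic name and encodeMessage helper are assumptions.

import java.util.Properties
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}
import scala.util.Random

val props = new Properties()
props.put("bootstrap.servers", "localhost:9092")
props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
props.put("value.serializer", "org.apache.kafka.common.serialization.ByteArraySerializer")
val producer = new KafkaProducer[String, Array[Byte]](props)

while (true) {
  for (device <- 1 to 100) {
    val signalCount = Random.between(10, 101) // 10 to 100 signals per message
    // encodeMessage is a hypothetical helper picking signalCount of the 500 signal IDs
    producer.send(new ProducerRecord("input", s"device$device", encodeMessage(device, signalCount)))
  }
  Thread.sleep(500) // 2 messages per device per second => 200 messages / second
}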
37
Make it better #1: optimizing the topology
39
First Optimization | Topology
Most of the network pressure comes from the repartition step.
40
First Optimization | Topology
Invariant: the input topic is already partitioned by deviceId.
41
First Optimization | Topology
We can remove these repartition steps by dropping to the Processor API.
42
First Optimization | Topology V2
input → messages-source-v2 → keep-most-recent-processor-v2 (backed by key-value-store-v2)
43
First Optimization | Topology V2
streamsBuilder
  // create the key value store
  .addStateStore(
    Stores.keyValueStoreBuilder(
      Stores.persistentKeyValueStore(v2StoreName),
      keyValueKeySerde,
      keyValueSerde,
    ).withCachingEnabled()
  )
44
First Optimization | Topology V2
  // create the input stream
  .stream[String, Message](
    inputTopic,
    Consumed
      .`with`(Serdes.String(), messageSerde)
      .withName("messages-source-v2"),
  )
45
First Optimization | Topology V2
  // add the last-values processor, connected to the store
  .process(
    () => new KeyValueStoreProcessor(v2StoreName),
    Named.as("keep-most-recent-processor-v2"),
    v2StoreName
  )
46
First Optimization | Topology V2 | Processor
class KeyValueStoreProcessor(
  storeName: String
) extends Processor[String, Message, Void, Void] {

  private var store: KeyValueStore[KeyValue.Key, KeyValue] = _

  override def init(context: ProcessorContext[Void, Void]): Unit = {
    super.init(context)
    // inject the store on initialization
    this.store = context.getStateStore[
      KeyValueStore[KeyValue.Key, KeyValue]
    ](storeName)
  }
47
First Optimization | Topology V2 | Processor
  override def process(record: Record[String, Message]): Unit = {
    // for each extracted value
    extractLastValues(record.value()).foreach { value =>
      // get the current value in the store
      val key = value.key
      val currentValue = store.get(key)
      // overwrite only if the new value has a bigger timestamp
      if (null == currentValue || currentValue.timestamp < value.timestamp) {
        store.put(key, value)
      }
    }
  }
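The keep-most-recent invariant is easy to pin down with TopologyTestDriver. A minimal sketch, assuming the serdes from the talk plus hypothetical helpers (messageSerializer, messageWithTimestamp, keyOf):

import java.util.Properties
import org.apache.kafka.common.serialization.StringSerializer
import org.apache.kafka.streams.{StreamsConfig, TopologyTestDriver}

val testProps = new Properties()
testProps.put(StreamsConfig.APPLICATION_ID_CONFIG, "keep-most-recent-test")
testProps.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "dummy:1234")

val driver = new TopologyTestDriver(streamsBuilder.build(), testProps)
val input = driver.createInputTopic(inputTopic, new StringSerializer, messageSerializer)

input.pipeInput("device1", messageWithTimestamp(1708335166982L))
input.pipeInput("device1", messageWithTimestamp(1708335100000L)) // older: must not overwrite

val store = driver.getKeyValueStore[KeyValue.Key, KeyValue](v2StoreName)
assert(store.get(keyOf("device1")).timestamp == 1708335166982L)
driver.close()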
48
First Optimization | Topology V2 | Impacts
Before
• Input: 183 kB/s
• Repartition: 270 kB/s
• Changelog: 42 kB/s
49
First Optimization | Topology V2 | Impacts
After
• Input: 183 kB/s
• Repartition: 0 (-270 kB/s)
• Changelog: 36 kB/s (-6 kB/s)
52
First Optimization | Topology V2 | gRPC?
Does the interactive query service still work after removing the repartition? Recall the V1 lookup, which uses the composite key:
class GrpcService(
  kafkaStreams: KafkaStreams,
  storeName: String,
  localhost: HostInfo
) {
  def getKeyValue(tenantId: String, deviceId: String, signalId: String) = {
    val key = (tenantId, deviceId, signalId)
    val activeHost = kafkaStreams
      .queryMetadataForKey(storeName, key, implicitly)
      .activeHost()
    if (activeHost == localhost) {
      queryLocalStore(…)
    } else {
      queryRemoteHostWithGrpc(…)
    }
  }
}
54
First Optimization | Topology V2 | gRPC?
Without the repartition, partitioning follows the input topic: records are placed by deviceId, not by the composite key. The metadata lookup must therefore use deviceId:
class GrpcService(
  kafkaStreams: KafkaStreams,
  storeName: String,
  localhost: HostInfo
) {
  def getKeyValue(deviceId: String, signalId: String) = {
    val key = deviceId
    val activeHost = kafkaStreams
      .queryMetadataForKey(storeName, key, implicitly)
      .activeHost()
    if (activeHost == localhost) {
      queryLocalStore(…)
    } else {
      queryRemoteHostWithGrpc(…)
    }
  }
}
55
First Optimization | Topology V2 | Take Away
• Describe your topologies to understand what is done automatically (one line of code, see below)
• Study your system to understand whether all steps are required
• Load test to understand the behavior of the different operations
• The low-level API gives you a lot of control
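Describing a topology really is a one-liner; in the V1 topology above, the repartition topic shows up immediately in the output:

// Print the topology: sub-topologies, sources, processors, sinks and stores.
val topology = streamsBuilder.build()
println(topology.describe())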
56
Make it better #2: improving state persistence
57
Second Optimization | Persistence
RocksDB store
• Persisted to disk
• Cache in memory
• Write-ahead log
• Store can exceed available RAM
In-memory store
• Faster
• Size constrained by available RAM
• OOM can be thrown if the store grows too big
58
Second Optimization | Persistence
We use the RocksDB store in our case: our state can exceed available RAM.
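The two variants are one factory call apart; a sketch using the serdes from the earlier slides:

// Persistent (RocksDB) store: what we use; state can exceed RAM.
val persistentBuilder = Stores.keyValueStoreBuilder(
  Stores.persistentKeyValueStore("key-value-store"),
  keyValueKeySerde,
  keyValueSerde
)

// In-memory store: faster, but bounded by available RAM.
val inMemoryBuilder = Stores.keyValueStoreBuilder(
  Stores.inMemoryKeyValueStore("key-value-store"),
  keyValueKeySerde,
  keyValueSerde
)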
59
Second Optimization | Persistence | RocksDB
Tuning RocksDB: limit memory and disk usage
p.put(
  StreamsConfig.ROCKSDB_CONFIG_SETTER_CLASS_CONFIG,
  classOf[CustomRocksDBConfig]
)

class CustomRocksDBConfig extends RocksDBConfigSetter {
  override def setConfig(
    storeName: String,
    options: Options,
    configs: Map[String, AnyRef],
  ): Unit = {
    // …
  }
}
60
Second Optimization | Persistence | RocksDB
RocksDB allocates off-heap memory that you need to limit:
• Block cache (for reads)
• Index and filter blocks
• Memtable (write buffer)
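A sketch of a config setter bounding all three areas, following the bounded-memory pattern from the Kafka Streams documentation; the sizes are illustrative assumptions, not the talk's values:

import org.apache.kafka.streams.state.RocksDBConfigSetter
import org.rocksdb.{BlockBasedTableConfig, Cache, LRUCache, Options, WriteBufferManager}

object BoundedRocksDBConfig {
  // Illustrative budgets, shared by all stores of the instance
  private val TotalBlockCacheBytes = 256L * 1024 * 1024
  private val TotalMemtableBytes = 64L * 1024 * 1024
  val cache: Cache = new LRUCache(TotalBlockCacheBytes)
  val writeBufferManager = new WriteBufferManager(TotalMemtableBytes, cache)
}

class BoundedRocksDBConfig extends RocksDBConfigSetter {
  override def setConfig(
    storeName: String,
    options: Options,
    configs: java.util.Map[String, AnyRef]
  ): Unit = {
    val tableConfig = options.tableFormatConfig().asInstanceOf[BlockBasedTableConfig]
    tableConfig.setBlockCache(BoundedRocksDBConfig.cache)  // reads share one cache
    tableConfig.setCacheIndexAndFilterBlocks(true)         // index/filter blocks count against it
    options.setWriteBufferManager(BoundedRocksDBConfig.writeBufferManager) // memtables too
    options.setTableFormatConfig(tableConfig)
  }

  // the cache is shared across stores, so nothing to close per store
  override def close(storeName: String, options: Options): Unit = ()
}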
61
Second Optimization | Persistence | RocksDB
When tuning / optimizing RocksDB usage:
• Check the latest Kafka Streams recommendations in the RocksDB documentation
• The Kafka Streams store cache and RocksDB caching are not mutually exclusive
• Consider enabling compression (a memory usage vs. performance trade-off)
• Experiment
62
Second Optimization | Persistence | RocksDB
Operating RocksDB in Kubernetes
Persistent volume
• Each instance of the application has a persistent volume where state is written
• Volumes get detached and reattached during operations
• Moving a volume to another node can be really slow (30m)
Ephemeral volume
• State is written to the volume
• The volume is destroyed and recreated during operations
• Creating a volume is fast (a few seconds)
• State is loaded from the changelog at startup
63
Second Optimization | Persistence | RocksDB
Persistent volumes were used in our first versions; the slow volume moves (up to 30 minutes) were a major issue.
64
Second Optimization | Persistence | RocksDB
We are moving to ephemeral volumes, which rely on restoring state from the changelog at startup. Let's see how that works.
65
Second Optimization | Persistence | Changelog
Persistence of stores' states
Every change in the store is replicated to the changelog topic.
The changelog is a compacted topic: periodically, the topic is compacted and only the last written value of each key is kept.
66
Second Optimization | Persistence | Changelog
Initialization of stores' states (at start)
The changelog is completely consumed at startup to recreate the stores' state.
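One way to observe this restoration (beyond the broker-side logs shown next) is a global StateRestoreListener; a minimal sketch, to be registered before KafkaStreams#start:

import org.apache.kafka.common.TopicPartition
import org.apache.kafka.streams.processor.StateRestoreListener

kafkaStreams.setGlobalStateRestoreListener(new StateRestoreListener {
  override def onRestoreStart(tp: TopicPartition, store: String, startOffset: Long, endOffset: Long): Unit =
    println(s"Restoring $store / ${tp.partition()}: ${endOffset - startOffset} records to go")

  override def onBatchRestored(tp: TopicPartition, store: String, batchEndOffset: Long, numRestored: Long): Unit =
    () // called after each restored batch; useful for progress metrics

  override def onRestoreEnd(tp: TopicPartition, store: String, totalRestored: Long): Unit =
    println(s"Restored $store / ${tp.partition()}: $totalRestored records")
})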
67
Second Optimization | Persistence | Changelog
How does the changelog grow? [Chart] After 12 hours: 2 GB.
68
Second Optimization | Persistence | Changelog
INFO o.a.k.s.p.i.StoreChangelogReader stream-thread Restoration in progress for 32 partitions
INFO o.a.k.s.p.internals.StreamThread stream-thread Restoration took 80684 ms for all active tasks
…
69
Second Optimization | Persistence | Changelog
Restoration needs ~10 MB/s of bandwidth to read the changelog.
70
Second Optimization | Persistence | Changelog
Forcing a compaction shrinks the changelog from 2 GB to 5 MB for 100 devices and 500 signals per device.
71
Second Optimization | Persistence | Changelog
Log cleaner thread 0 cleaned log kv-store-v1-key-value-store-v1-changelog-22 (dirty section = [0, 2447900])
59.9 MB of log processed in 1.3 seconds (44.9 MB/sec).
Indexed 59.9 MB in 0.7 seconds (87.3 Mb/sec, 51.5% of total time)
Buffer utilization: 0.0%
Cleaned 59.9 MB in 0.6 seconds (92.5 Mb/sec, 48.5% of total time)
Start size: 59.9 MB (2,447,900 messages)
End size: 0.0 MB (1,540 messages)
99.9% size reduction (99.9% fewer messages)
72
Second Optimization | Persistence | Changelog
INFO o.a.k.s.p.internals.StreamThread stream-thread Restoration took 1102 ms for all active tasks
73
Second Optimization | Persistence | Changelog
Restoration now needs only ~100 kB/s of bandwidth to read the changelog.
74
Second Optimization | Persistence | Changelog
Properties of the changelog
• Small when compacted (number of devices × number of signal values to keep)
• The number of values grows organically
• Lots of writes: the changelog can grow fast
• How can we keep the changelog "small"?
75
Second Optimization | Persistence | Changelog
• max.compaction.lag.ms: the maximum time a message will remain ineligible for compaction in the log. Default: Long.MAX_VALUE
• min.cleanable.dirty.ratio: controls how frequently the log compactor will attempt to clean the log; this ratio bounds the maximum space wasted in the log by duplicates. Default: 0.5
• log.segment.bytes: the maximum size of a single log file. Default: 1 GB
76
Second Optimization | Persistence | Changelog
How to control compaction frequency?
• Setting a max compaction lag puts an upper bound on the compaction period
• Lowering the segment size and/or the dirty ratio triggers compaction more often
77
Second Optimization | Persistence | Changelog
streamsBuilder
  .addStateStore(
    Stores
      .keyValueStoreBuilder(
        Stores.persistentKeyValueStore(v3StoreName),
        keyValueKeySerde,
        keyValueSerde,
      )
      .withLoggingEnabled(
        Map(
          TopicConfig.SEGMENT_BYTES_CONFIG -> (128 * 1024 * 1024).toString,
          TopicConfig.MAX_COMPACTION_LAG_MS_CONFIG -> (60 * 60 * 1000).toString
        ).asJava
      )
      .withCachingEnabled()
  )
78
Second Optimization | Persistence | Changelog
79
Second Optimization | Persistence | Take Away
• Persistent volumes can bring uncontrollable entropy
• Ephemeral volumes rely on small changelogs
• Compaction is "lazy" by default
• Optimize changelog compaction based on your data
80
Make it better #3: optimize for resilience and fault tolerance
81
Third Optimization | Resilience
What happens when an instance of a Kafka Streams application is down?
• Data assigned to that instance stops being processed
• Its local store stops being available
82
Third Optimization | Resilience | Standby Replicas
Setup: an active replica runs the topology and maintains its local store; a standby replica maintains a copy of that store. The input and changelog topics connect the two, as the next slides show.
83
Third Optimization | Resilience | Standby Replicas
The active replica processes input data and stores the result in its local store.
84
Third Optimization | Resilience | Standby Replicas
The store is persisted in its changelog topic.
85
Third Optimization | Resilience | Standby Replicas
The local store is replicated to the standby replica by reading the changelog.
86
Third Optimization | Resilience | Standby Replicas
• Standby replicas are (eventually consistent) shadow copies of state stores
• Each task of the topology has one active replica and num.standby.replicas standby replicas (see the config sketch below)
Pros
• Faster rebalancing
• Better reliability
Cons
• More resources
• Eventual consistency
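Enabling standbys is a single configuration entry, paying the extra resource cost noted above; for example:

import java.util.Properties
import org.apache.kafka.streams.StreamsConfig

val streamsProps = new Properties()
// One standby copy of every task's state, maintained on another instance
streamsProps.put(StreamsConfig.NUM_STANDBY_REPLICAS_CONFIG, "1")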
87
Third Optimization | Resilience | Interactive Queries
public <K> KeyQueryMetadata queryMetadataForKey(
  final String storeName,
  final K key,
  final Serializer<K> keySerializer
)

public class KeyQueryMetadata {
  private final HostInfo activeHost;
  private final Set<HostInfo> standbyHosts;
  private final int partition;
}
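With standbys enabled, this metadata gives fallback hosts for free. A sketch of building an ordered candidate list (active host first), assuming the store is keyed by deviceId as in the V2 service:

import org.apache.kafka.common.serialization.Serdes
import scala.jdk.CollectionConverters._

// Active host first, then the standby hosts as fallbacks during rebalancing.
def hostsForKey(deviceId: String): Seq[HostInfo] = {
  val metadata = kafkaStreams.queryMetadataForKey(storeName, deviceId, Serdes.String().serializer())
  metadata.activeHost() +: metadata.standbyHosts().asScala.toSeq
}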
88
Third Optimization | Resilience | Interactive Queries
/**
 * Returns LagInfo, for all store partitions (active or standby) local
 * to this Streams instance. Note that the values returned are just
 * estimates and meant to be used for making soft decisions on whether
 * the data in the store partition is fresh enough for querying.
 *
 * Note: Each invocation of this method issues a call to the Kafka
 * brokers. Thus it's advisable to limit the frequency of invocation to
 * once every few seconds.
 *
 * @return map of store names to another map of partition to LagInfos
 */
public Map<String, Map<Integer, LagInfo>> allLocalStorePartitionLags()
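The lag estimates turn "fresh enough" into an explicit decision. A sketch with an assumed threshold (tune it per use case, and cache the result given the broker round trip noted above):

import scala.jdk.CollectionConverters._

val maxAcceptableLag = 1000L // records behind the changelog end; an assumption

def freshEnough(storeName: String, partition: Int): Boolean =
  kafkaStreams.allLocalStorePartitionLags().asScala
    .get(storeName)
    .flatMap(_.asScala.get(partition))
    .exists(_.offsetLag() <= maxAcceptableLag)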
89
Third Optimization | Resilience | Interactive Queries
90
Third Optimization | Resilience | Take Away
Standby Replicas
• Faster recovery when an instance fails
• Allow serving interactive queries during rebalancing…
• …but with stale data
92
Take Aways
• Describe your topologies to understand what your application really does
• Monitor your system using metrics from Kafka Streams, the Kafka cluster, your deployment environment, …
• Load test as much as possible to understand trends and potential bottlenecks before they become problematic in production
• Prototype using the high-level APIs; optimize using the low-level APIs (at your own risk)
• gRPC 💚 Kafka Streams
• Understand Kafka Streams internals (in particular RocksDB and changelog compaction) to help optimize further
• Kafka Streams is highly configurable