Make Your Kafka Cluster Production-Ready - Jakub Scholz, Red Hat
Kubernetes has become the de facto standard for running cloud-native applications, and more and more users also turn to it to run stateful applications such as Apache Kafka. While tools such as Helm charts or operators can get you up and running quickly, there is often still a long way to go before the Kafka cluster is production-ready. This talk takes you through the main aspects you should consider for your Kafka cluster, covering resource management, storage, scheduling, rolling updates, and reliability. It shows how to do this with the Strimzi operator, but the lessons learned also apply to any other Kafka cluster. If you are interested in production-ready Apache Kafka on Kubernetes, this is the talk for you.
2. Make Your Kafka Cluster Production-ready
About me
● Senior Principal Software Engineer @ Red Hat
● Maintainer of the Strimzi project (https://strimzi.io)
● Occasional Apache Kafka contributor
@scholzj
https://github.com/scholzj
https://www.linkedin.com/in/scholzj/
10. Make Your Kafka Cluster Production-ready
Monitoring
● Understanding the state of the Kafka cluster
● Logs, Metrics, Tracing, Dashboards, Alerts, …
● The usual suspects: Prometheus, OpenTelemetry, …
● Consumer Lag
● Kafka Exporter
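
Consumer lag is the distance between the newest offset in a partition and the offset a consumer group has committed for it. The following is a minimal sketch of that arithmetic only, not of how Kafka Exporter is implemented. The function names, the `orders` topic, and the sample offsets are hypothetical, and the offsets are assumed to have been fetched already by whatever client or exporter you use.

```python
from typing import Dict, Optional, Tuple

# (topic, partition) pair; a hypothetical alias used only in this sketch.
TopicPartition = Tuple[str, int]


def partition_lag(
    end_offsets: Dict[TopicPartition, int],
    committed_offsets: Dict[TopicPartition, int],
) -> Dict[TopicPartition, Optional[int]]:
    """Return the lag per partition: log end offset minus committed offset.

    Partitions without a committed offset are reported as None, because
    the real lag there depends on where the consumer would start reading.
    """
    lag: Dict[TopicPartition, Optional[int]] = {}
    for tp, end in end_offsets.items():
        committed = committed_offsets.get(tp)
        if committed is None:
            lag[tp] = None
        else:
            # Clamp at zero in case the two offsets were sampled at different times.
            lag[tp] = max(end - committed, 0)
    return lag


def total_lag(lag: Dict[TopicPartition, Optional[int]]) -> int:
    """Sum the lag over all partitions where it is known."""
    return sum(value for value in lag.values() if value is not None)


if __name__ == "__main__":
    # Made-up sample values for a hypothetical "orders" topic.
    end = {("orders", 0): 120, ("orders", 1): 95, ("orders", 2): 40}
    committed = {("orders", 0): 100, ("orders", 1): 95}

    lag = partition_lag(end, committed)
    print(lag)
    print("total known lag:", total_lag(lag))
```

A lag that keeps growing usually means the consumers cannot keep up with the producers, which makes it a natural candidate for an alert.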
34. Why do I need to care about this?
Why not have production-ready Kafka out of the box?
35. Make Your Kafka Cluster Production-ready
Mind the gap!
● Different environments
  ● Development vs. CIs vs. Production
● Different requirements
  ● How much [Security|Monitoring|Availability|Performance] do I really need?
● Different infrastructure
  ● Labels, versions, topologies, tools, …
● No one-size-fits-all
36. Make Your Kafka Cluster Production-ready
Further resources
● Documentation
  ● Strimzi: https://strimzi.io/documentation/
  ● Apache Kafka: https://kafka.apache.org/documentation/
● Examples: https://github.com/strimzi/strimzi-kafka-operator/tree/main/examples
● Blog posts: https://strimzi.io/blog/
● “Make your Kafka cluster production-ready”: https://youtube.com/c/Strimzi