Getting Kafka running on Kubernetes is only step one of a journey to create a production-ready Kafka cluster. This talk walks through the other steps: 1) Monitoring and remediating faults. 2) Updates to Kubernetes nodes for clusters not using shared storage. 3) Automating Kafka updates and restarts. We present how to create fault-tolerant Kafka clusters on Kubernetes without sacrificing availability, durability, or latency. Learn about Lyft's overlay-free Kubernetes networking driver and how we use it to keep performance on par with non-Kubernetes clusters.
3. Provisioning team vs. Kafka team
Hi Kafka team, you’re using
the old provisioning system.
Please move Kafka to
Kubernetes.
We’d love to. How do we get
started?
4. Provisioning team vs. Kafka team
Here’s an example of an app,
just copy it.
We’ve got a problem: data
keeps getting lost on each
deploy. What do we do?
6. Provisioning team vs. Kafka team
Mmmmmmmmmmmaaahhh!
Yeah, but we’ve got terabytes
of data to copy for each
node, it’ll take days to do
updates.
7. Benefits
Common Operating System provisioning system:
Security upgrades
Logs and metrics
Containerization:
Developer control of software dependencies
Automated deploys
Why Kubernetes?
8. Kafka is a “reliability tool”
Move data without loss
High stakes usage
Why Kafka?
29. Instance Store and K8s
Not as easy as EBS
Kafka container must re-schedule on same node
Recycle broker ids
30. Reschedule on same node
Active Broker 0 | Active Broker 1 | Active Broker 2 | Idle K8s Node
31. Reschedule on same node
Active Broker 0 (kafka-v2.2) | Active Broker 1 (kafka-v2.2) | Active Broker 2 (kafka-v2.2) | Idle K8s Node
33. Reschedule on same node
Active Broker 0 (kafka-v2.2) | Active Broker 1 (kafka-v2.2) | Active Broker 2 (kafka-v2.2) | Active Broker 0 (kafka-v2.3)
34. Reschedule on same node
Idle K8s Node | Active Broker 1 (kafka-v2.2) | Active Broker 2 (kafka-v2.2) | Active Broker 0 (kafka-v2.3)
Wait up to 24 hours for replication
35. Reschedule on same node
Idle K8s Node | Active Broker 1 (kafka-v2.2) | Active Broker 2 (kafka-v2.2) | Active Broker 0 (kafka-v2.3)
36. Reschedule on same node
Active Broker 1 (kafka-v2.3) | Active Broker 1 (kafka-v2.2) | Active Broker 2 (kafka-v2.2) | Active Broker 0 (kafka-v2.3)
37. Reschedule on same node
Active Broker 1 (kafka-v2.3) | Idle K8s Node | Active Broker 2 (kafka-v2.2) | Active Broker 0 (kafka-v2.3)
Wait up to 24 hours for replication
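The slide sequence above upgrades one broker at a time and waits for replication to finish before touching the next. A minimal sketch of that gating loop follows. The callables `upgrade_broker` and `under_replicated_count` are hypothetical hooks that the operator would supply (for example, backed by a deploy tool and a Kafka metric); they are not part of the talk.

```python
import time
from typing import Callable, Iterable, List


def rolling_upgrade(
    broker_ids: Iterable[int],
    upgrade_broker: Callable[[int], None],
    under_replicated_count: Callable[[], int],
    poll_seconds: float = 30.0,
    timeout_seconds: float = 24 * 60 * 60,  # the slides mention waits of up to 24 hours
    sleep: Callable[[float], None] = time.sleep,
) -> List[int]:
    """Upgrade brokers one at a time, waiting for replication to catch up in between."""
    upgraded: List[int] = []
    for broker_id in broker_ids:
        upgrade_broker(broker_id)
        waited = 0.0
        # Do not move on while any partition is still under-replicated.
        while under_replicated_count() > 0:
            if waited >= timeout_seconds:
                raise TimeoutError(f"broker {broker_id} did not catch up in time")
            sleep(poll_seconds)
            waited += poll_seconds
        upgraded.append(broker_id)
    return upgraded
```

Injecting `sleep` keeps the loop testable without real waiting. When a broker returns to the node that still holds its data, the wait should be short; when it lands on an empty node, the loop blocks for the full state transfer.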
41. Kafka as a DaemonSet
DaemonSet declares:
One broker is placed on each K8s node
Static id assigned to K8s node
Ids mounted into broker container
Only Kafka scheduled on K8s cluster
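A minimal sketch of how a broker container might pick up the static id that was assigned to its K8s node. It assumes the id is written to a plain-text file on the node and mounted into the container; the file layout and function names are hypothetical, while `broker.id` is the standard Kafka broker setting.

```python
from pathlib import Path


def read_broker_id(id_file: str) -> int:
    """Read the static broker id from a file mounted from the K8s node."""
    broker_id = int(Path(id_file).read_text().strip())
    if broker_id < 0:
        raise ValueError(f"broker id must be non-negative, got {broker_id}")
    return broker_id


def render_broker_id_property(id_file: str) -> str:
    """Return the server.properties line for this node's broker id."""
    return f"broker.id={read_broker_id(id_file)}\n"
```

Because the id lives on the node rather than in the pod, a restarted broker on the same node comes back with the same id and finds its existing log directories.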
42. Kafka as a DaemonSet
Terraform
resource "aws_instance" "kubelet" { count = x … }
Kubelet 1: Kafka Id 1 | Kubelet 2: Kafka Id 2 | Kubelet 3: Kafka Id 3
43. Kafka Network Architecture
Client 1 | Client 2
Bootstrap Server - Load Balancer or Round-robin DNS
Kafka Node 1 | Kafka Node 2 | Kafka Node 3
Each Kafka node replies: Talk to node 1, 2, and 3
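The flow in this diagram can be illustrated with a toy model: the client contacts the bootstrap address once, receives the list of all brokers, and then connects to each broker directly. This is a simulation only; `Broker`, `fetch_metadata`, and the `cluster` dictionary are hypothetical stand-ins, not a Kafka client API.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Broker:
    node_id: int
    host: str
    port: int


def fetch_metadata(bootstrap: str, cluster: Dict[str, List[Broker]]) -> List[Broker]:
    """Stand-in for the metadata request a client sends to the bootstrap address."""
    return cluster[bootstrap]


def connection_targets(bootstrap: str, cluster: Dict[str, List[Broker]]) -> List[str]:
    """Addresses the client connects to after bootstrapping, ordered by node id."""
    brokers = sorted(fetch_metadata(bootstrap, cluster), key=lambda b: b.node_id)
    return [f"{b.host}:{b.port}" for b in brokers]
```

The point of the model is that the load balancer or DNS name is used only for the first request; every broker address returned afterwards must be directly reachable by clients.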
44. Kubernetes Network w/ DaemonSet
Client 1 | Client 2
Load Balancer or Round-robin DNS
Kubelet 1: Kafka Id 1, HostPort: 9093 | Kubelet 2: Kafka Id 2, HostPort: 9093 | Kubelet 3: Kafka Id 3, HostPort: 9093
hostNetwork: true
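With `hostNetwork: true` and a fixed host port, each broker is reachable at its node's own address, so that is the address it would advertise to clients. A minimal sketch of rendering that setting follows. The function name and the `SSL` listener name are assumptions; `advertised.listeners` is the standard Kafka broker setting, and 9093 is the port shown on the slide.

```python
def advertised_listeners_property(node_hostname: str, port: int = 9093, listener: str = "SSL") -> str:
    """Render the advertised.listeners line for a broker bound to its node's network."""
    if not node_hostname:
        raise ValueError("node hostname must not be empty")
    if not 0 < port < 65536:
        raise ValueError(f"invalid port: {port}")
    return f"advertised.listeners={listener}://{node_hostname}:{port}"
```

For example, calling it with a node hostname of `node-1` yields `advertised.listeners=SSL://node-1:9093`.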
46. Kafka as a StatefulSet
StatefulSet declares:
Kafka pod is pinned to specific local disk
StatefulSet id follows broker
Nodes labeled to have only Kafka
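StatefulSet pods are named `<statefulset-name>-<ordinal>`, so one common way to make the id follow the broker is to derive the Kafka broker id from the pod name. The sketch below assumes that approach; the function name and the `offset` parameter are hypothetical.

```python
import re

# StatefulSet pod names end in "-<ordinal>", for example "kafka-2".
_POD_NAME = re.compile(r"^(?P<name>.+)-(?P<ordinal>\d+)$")


def broker_id_from_pod_name(pod_name: str, offset: int = 0) -> int:
    """Derive a stable broker id from a StatefulSet pod name."""
    match = _POD_NAME.match(pod_name)
    if match is None:
        raise ValueError(f"not a StatefulSet pod name: {pod_name!r}")
    return int(match.group("ordinal")) + offset
```

StatefulSet ordinals start at 0 by default, so `offset=1` would give ids starting at 1 as drawn on the slide.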
47. Kafka as a StatefulSet
Terraform
resource "aws_autoscaling_group" "kubelet" { min_size = x }
Kubelet: Kafka Id 1, StatefulSet Id 1 | Kubelet: Kafka Id 2, StatefulSet Id 2 | Kubelet: Kafka Id 3, StatefulSet Id 3 | Kubelet: Hot spare or other Pods
57. Immutable Infrastructure Upgrade
Active Kubelet: 30 days old (POD A, POD B, POD C, POD D)
Active Kubelet: 30 days old (POD E, POD F, POD G, POD H)
New Kubelet: 0 days old
58. Immutable Infrastructure Upgrade
Deleting Kubelet: 30 days old
Active Kubelet: 30 days old (POD E, POD F, POD G, POD H)
Active Kubelet: 0 days old (POD A, POD B, POD C, POD D)
60. Immutable Infrastructure Upgrade
Active Kubelet: 30 days old (POD E, POD F, POD G, POD H)
Active Kubelet: 0 days old (POD A, POD B, POD C, POD D)
New Kubelet: 0 days old
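One round of the immutable upgrade drawn above can be modeled as: pick a node that has reached the maximum age, move its pods to the empty new node, delete the old node, and bring up a fresh spare. The sketch below is a toy model of that round; `Node`, `replace_one_node`, and the 30-day threshold are illustrative names and values, not a Kubernetes API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    name: str
    age_days: int
    pods: List[str] = field(default_factory=list)


def replace_one_node(nodes: List[Node], max_age_days: int, new_node_name: str) -> List[Node]:
    """Move pods off one expired node onto an empty young node, then add a new spare."""
    old = next((n for n in nodes if n.age_days >= max_age_days and n.pods), None)
    spare = next((n for n in nodes if n.age_days < max_age_days and not n.pods), None)
    if old is None or spare is None:
        return list(nodes)  # nothing to replace, or nowhere to move the pods
    result: List[Node] = []
    for node in nodes:
        if node is old:
            continue  # the expired node is deleted
        if node is spare:
            result.append(Node(node.name, node.age_days, list(old.pods)))
        else:
            result.append(node)
    result.append(Node(new_node_name, 0))  # fresh spare for the next round
    return result
```

For stateless pods this is cheap. For a Kafka broker on local disks, the move in the middle of the function is exactly the state transfer that can take many hours.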
63. Immutable Infrastructure Upgrade
Recall the problem with state transfer
Killed Broker 0 | Active Broker 1 (kafka-v2.3) | Active Broker 2 (kafka-v2.3) | Updated Broker 0 (kafka-v2.3)
Wait up to 24 hours for replication
71. Encryption in Transit
Challenges:
TLS for clients to Kafka
Don’t use wildcard certs
mTLS for inter-broker communications
Don’t check your private keys into VCS
72. Encryption in Transit
KIAM w/ ACM-PCA
# On start:
keytool -genkeypair ...
keytool -certreq ...
aws acm-pca issue-certificate ...
aws acm-pca get-certificate ...
keytool -import ...
Components: Client (private CA in truststore), Kafka Node 1 (ENI) on a Kubelet, KIAM Agent, KIAM Server, AWS IAM (allow Kafka IAM role), AWS ACM-PCA (fetch), External-DNS, AWS DNS (Route53)
Client to Kafka Node 1: TLS Connection
73. Encryption at Rest
On AWS:
Instance Store and EBS both offer encryption
Encrypt before produce and decrypt on consume
75. StatefulSets preferred, DaemonSet a good backup option
Newer versions of Kubernetes support ephemeral disks better
Use AWS VPC Kubernetes CNI driver using IPvlan
With ephemeral disks, do mutable upgrades
Takeaways
76. Summary
K8s Adjustment          | Reliability Wins                  | Monthly Cost Savings Per 100 nodes
Mutable Upgrades        | Increased availability+durability | $4k
Ephemeral Volumes       | Improved tail latency             | $8k-$120k better than EBS
CNI driver using IPvlan | Improved throughput+latency       | $0
Resolve Node Death      | Increased availability+durability | $0
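The savings above are quoted per 100 nodes. As a rough worked example, the helper below scales a figure to another cluster size. Linear scaling is an assumption made here for illustration, not a claim from the talk, and the function name is hypothetical.

```python
def monthly_savings(nodes: int, savings_per_100_nodes: float) -> float:
    """Scale a per-100-node monthly saving to a cluster of the given size (assumes linear scaling)."""
    if nodes < 0:
        raise ValueError("node count must be non-negative")
    return savings_per_100_nodes * nodes / 100
```

Under that assumption, a 250-node cluster would see about $10k per month from mutable upgrades (`monthly_savings(250, 4000)`) and between $20k and $300k per month from ephemeral volumes relative to EBS.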
77. Rolling restarts by AZ in K8s
Remove need to intervene with StatefulSet on node death
Publish comprehensive benchmarks
Other cloud provider benchmarks
Future Work