Connect, Test, Optimize!
Create an Ultimate Kafka Connector Benchmarking Toolkit
Suchi Amalapurapu
Principal Engineer, Confluent
Sudesh Wasnik
Senior Software Engineer II, Confluent
Kafka Streaming Application
A no-code way of connecting external systems to Kafka
[Diagram: SOURCES (e.g. syslog) → Kafka Connect → Kafka → Kafka Connect → SINKS; Apps and ksqlDB sit alongside Kafka]
Connector categories: Database Connectors, File Store Connectors, MQ Connectors, SaaS Connectors
Sample Streaming Application
[Diagram: External System → Kafka Connect → Kafka]
LetsA
● MySQL Database: product catalogue database
● MySQL Debezium Connector: captures change events
● Kafka: stores change events
MySQL Stream Application
[Diagram: LetsA MySQL Database (External System) → MySQL Debezium Connector (Kafka Connect) → Kafka]
● High throughput
● Slow CDC
● Lag!
S3 Sink Streaming Application
[Diagram: LetsA Kafka → Kafka Connector (Kafka Connect) → AWS S3 (External System)]
● High throughput
● Slower file flush rates
● Slower ingestion
Performance Determinants
Recap: the MySQL Debezium Connector faces very high throughput but captures events slowly (lag!), and the S3 Sink Streaming Application suffers from slower file flush rates.
● Untuned connector configuration
● Number of tasks
● CPU and resource crunch
● Unresponsive sink
● Kafka consumer throttling
How do we simulate these test scenarios?
Challenges:
1. High breadth of connectors
2. End-system specifics
3. Various configuration combinations
4. Pre-existing test scripts and systems
5. Integration with CI/CD
Test Setup
[Diagram: Load Generator (Generate Load), External System, Kafka Connector, Kafka, Analyzer (Fetch Data)]
Load Generator
● Pluggable: any load generator can be used
● Sizeable: configurable CPU/memory
● Scalable: increase replicas
Analyzer (Report Generation)
● Pluggable: any analyzer can be used
● Sizeable: configurable CPU/memory
● Custom test designs
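One way to picture the "pluggable" and "scalable" load generator is a small registry of generator classes behind a common interface. This is only a sketch under assumptions: the `LoadGenerator` protocol, the `SequentialRowGenerator` class, the registry key and `run_load` are hypothetical names, not part of the toolkit described in the slides.

```python
from typing import Callable, Dict, Iterator, List, Protocol


class LoadGenerator(Protocol):
    """Anything that can yield records to push at the external system."""

    def records(self, count: int) -> Iterator[dict]: ...


class SequentialRowGenerator:
    """Hypothetical generator: emits rows with increasing ids."""

    def __init__(self, table: str) -> None:
        self.table = table

    def records(self, count: int) -> Iterator[dict]:
        for i in range(count):
            yield {"table": self.table, "id": i, "payload": f"row-{i}"}


# Registry keyed by the name a test spec would use (names are illustrative).
LOADGEN_REGISTRY: Dict[str, Callable[..., LoadGenerator]] = {
    "sequential-rows": SequentialRowGenerator,
}


def run_load(kind: str, count: int, replicas: int = 1, **kwargs) -> List[dict]:
    """Build `replicas` generators of the given kind and collect their output."""
    out: List[dict] = []
    for replica in range(replicas):
        generator = LOADGEN_REGISTRY[kind](**kwargs)
        for record in generator.records(count):
            out.append({**record, "replica": replica})
    return out
```

Swapping in another load generator would then mean registering one more class, and "increase replicas" maps to the `replicas` argument (in a real setup, more pods rather than a loop).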
Test Setup: Analyzer
Metric Analyzer
1. Scrape and replay metrics
2. Compare against baseline metrics
3. Export metrics
Record Analyzer
1. Consumes the destination topic
2. Verifies delivery semantics
   a. At-least-once
   b. Data loss
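The Record Analyzer checks can be sketched as a comparison between the record ids the load generator produced and the ids read back from the destination topic. This is a minimal illustration, assuming every record carries a unique id; `DeliveryReport` and `analyze_records` are hypothetical names, and a real analyzer would consume from Kafka rather than take in-memory lists.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class DeliveryReport:
    missing: List[str]     # produced but never seen downstream (data loss)
    duplicates: List[str]  # expected ids seen more than once downstream
    unexpected: List[str]  # seen downstream but never produced

    @property
    def at_least_once(self) -> bool:
        # At-least-once holds when nothing was lost; duplicates are allowed.
        return not self.missing

    @property
    def exactly_once(self) -> bool:
        return not self.missing and not self.duplicates


def analyze_records(produced_ids: Iterable[str],
                    consumed_ids: Iterable[str]) -> DeliveryReport:
    """Compare produced ids against ids consumed from the destination topic."""
    expected = set(produced_ids)
    seen = Counter(consumed_ids)
    seen_ids = set(seen)
    return DeliveryReport(
        missing=sorted(expected - seen_ids),
        duplicates=sorted(k for k, n in seen.items() if n > 1 and k in expected),
        unexpected=sorted(seen_ids - expected),
    )
```

For example, `analyze_records(["a", "b", "c"], ["a", "a", "c"])` reports `b` as missing and `a` as a duplicate, so the run fails the at-least-once check.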
Performance test pipeline
[Diagram. Stages: Input Specification, CI/CD, Run Test, Analyze Results, Test Results, Dashboards, Notification. Components: Kubernetes, Source System, Load Generator, Kafka Cluster, Connect, Analyzer, Storage]
Input Specification
Test Specification
• Each spec represents a distinct test
• Exhaustive spec
• Template engines such as Jinja2 or Django templating can be used
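To illustrate "each spec represents a distinct test", the sketch below renders one spec per configuration combination from a single template. It uses Python's standard-library `string.Template` as a stand-in for Jinja2 so that it runs without extra dependencies; the spec fields (`test_name`, `tasks_max`, `data_format`) are hypothetical and not the toolkit's actual schema.

```python
import json
from itertools import product
from string import Template
from typing import Iterable, List

# Hypothetical spec layout; a real spec would carry far more settings.
SPEC_TEMPLATE = Template("""{
  "test_name": "${connector}-tasks${tasks}-${fmt}",
  "connector": "${connector}",
  "tasks_max": ${tasks},
  "data_format": "${fmt}"
}""")


def render_specs(connector: str,
                 task_counts: Iterable[int],
                 formats: Iterable[str]) -> List[dict]:
    """Render one spec for every (task count, data format) combination."""
    specs = []
    for tasks, fmt in product(task_counts, formats):
        rendered = SPEC_TEMPLATE.substitute(connector=connector, tasks=tasks, fmt=fmt)
        specs.append(json.loads(rendered))  # parsing also validates the JSON
    return specs
```

Calling `render_specs("s3-sink", [1, 4], ["AVRO", "JSON_SR"])` yields four specs, one per combination, which is how a single template can cover many configuration combinations.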
Source System
➔ Tuned
➔ Pod crash monitoring
➔ Emit pod + end-system metrics
Load Generator
Kafka & Connect Cluster
➔ Set Kafka and Connect base properties
➔ Scale application
➔ Emit JMX metrics
Analyzer
➔ Scrape JMX / custom metrics
➔ Collect resource profiles
➔ Export metrics to a metric store (BigQuery, Datadog)
Observability
Analysis Template
Metric                            Threshold
1. CPU Usage                      20%
2. Heap Memory Usage              20%
3. Source Record Poll Rate        10%
4. Producer Bytes ProduceRate     10%
➔ List metric names and their properties
➔ Example: scrape "kafka-consumer" metrics only
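A baseline comparison driven by such a template might look like the sketch below. The slides do not say how a threshold is applied; this example assumes it is the maximum allowed relative deviation from the baseline value. The metric keys and the `find_regressions` function are hypothetical.

```python
from typing import Dict, List

# Thresholds from the analysis template, read here as the maximum allowed
# relative deviation from baseline (an assumption, not stated in the slides).
THRESHOLDS: Dict[str, float] = {
    "cpu_usage": 0.20,
    "heap_memory_usage": 0.20,
    "source_record_poll_rate": 0.10,
    "producer_bytes_produce_rate": 0.10,
}


def find_regressions(baseline: Dict[str, float],
                     current: Dict[str, float],
                     thresholds: Dict[str, float] = THRESHOLDS) -> List[str]:
    """Return a message for every metric that drifted past its threshold."""
    flagged = []
    for name, allowed in thresholds.items():
        if name not in baseline or name not in current:
            continue  # metric was not scraped in one of the runs
        base = baseline[name]
        if base == 0:
            continue  # relative deviation is undefined for a zero baseline
        deviation = abs(current[name] - base) / abs(base)
        if deviation > allowed:
            flagged.append(
                f"{name}: {deviation:.0%} deviation exceeds {allowed:.0%}"
            )
    return flagged
```

With a baseline CPU usage of 50 and a current value of 65, the 30% deviation is flagged, while a poll rate moving from 1000 to 950 stays within its 10% threshold.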
Framework is ready!
Let's review test scenarios...
S3-Sink Test
[Diagram: Load Generator (Generate Load), Kafka, Schema Registry, Kafka Connector, S3 Sink, Analyzer (Fetch Data)]
S3-Sink Test: Avro vs. JSON_SR
➔ Test input data format: Avro vs. JSON_SR
➔ Expectation:
   ◆ Avro messages are smaller
   ◆ Avro should perform better
➔ Result: low performance boost!
S3-Sink Test: File flush frequency
➔ Test file flush frequency
   ◆ Configuration-A:
      ● Scheduled Rotation Interval = 3 mins
      ● Flush Size = 1000
   ◆ Configuration-B:
      ● Scheduled Rotation Interval = 6 mins
      ● Flush Size = 3000
➔ Expectation: Configuration-B will perform better
➔ Result: high performance boost!
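The two configurations could be expressed as connector config fragments along these lines. This sketch assumes that "Scheduled Rotation Interval" and "Flush Size" map to the S3 sink connector properties `rotate.schedule.interval.ms` and `flush.size`; the connector names, topic and bucket are hypothetical placeholders, and the fragments are not complete, deployable configurations.

```python
from typing import Dict


def s3_sink_config(name: str, rotate_minutes: int, flush_size: int) -> Dict[str, str]:
    """Build a partial S3 sink connector config for one test variant."""
    return {
        "name": name,
        "connector.class": "io.confluent.connect.s3.S3SinkConnector",
        "topics": "perf-test-topic",           # hypothetical topic
        "s3.bucket.name": "perf-test-bucket",  # hypothetical bucket
        # Assumed mapping of "Scheduled Rotation Interval" (minutes -> ms).
        "rotate.schedule.interval.ms": str(rotate_minutes * 60 * 1000),
        # Assumed mapping of "Flush Size" (records per file).
        "flush.size": str(flush_size),
    }


def config_diff(a: Dict[str, str], b: Dict[str, str]) -> Dict[str, tuple]:
    """Return the keys whose values differ, for inclusion in a test report."""
    return {k: (a.get(k), b.get(k))
            for k in sorted(set(a) | set(b)) if a.get(k) != b.get(k)}


config_a = s3_sink_config("s3-sink-config-a", rotate_minutes=3, flush_size=1000)
config_b = s3_sink_config("s3-sink-config-b", rotate_minutes=6, flush_size=3000)
```

Recording `config_diff(config_a, config_b)` next to the results makes it explicit which settings changed between the two runs.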
JMS Source Test: v1 vs. v2 performance
➔ Test and compare: v1 vs. v2
➔ Issue 1: Duplicates
➔ Issue 2: Pod restart during peak load caused data loss
➔ Fault injection tests
➔ Delivery semantics verified
Simulate Chaos in Performance Tests
Chaos Tests: Generic Scenarios
Chaos scenarios common to all connectors:
● Network Loss (Chaos-Mesh): simulate a packet loss scenario
● Pod Kill (Chaos-Mesh): delete k8s pods randomly
● Noisy-Neighbor Scenario: run a resource-intensive connector on the same node
● CPU Stress (Chaos-Mesh): simulate high CPU load
● Memory Stress (Chaos-Mesh): simulate high memory usage
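As a rough illustration, the Chaos-Mesh scenarios could be generated as Kubernetes manifests from the test pipeline. The field names below follow my understanding of the Chaos Mesh `v1alpha1` resources (`PodChaos`, `NetworkChaos`, `StressChaos`) and should be checked against the installed version; the namespace, labels and experiment names are hypothetical.

```python
import json
from typing import Dict


def _base(kind: str, name: str, namespace: str, labels: Dict[str, str]) -> dict:
    """Common skeleton: target one pod matching the labels in the namespace."""
    return {
        "apiVersion": "chaos-mesh.org/v1alpha1",
        "kind": kind,
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "mode": "one",
            "selector": {"namespaces": [namespace], "labelSelectors": labels},
        },
    }


def pod_kill(name: str, namespace: str, labels: Dict[str, str]) -> dict:
    manifest = _base("PodChaos", name, namespace, labels)
    manifest["spec"]["action"] = "pod-kill"
    return manifest


def packet_loss(name: str, namespace: str, labels: Dict[str, str],
                loss_percent: int) -> dict:
    manifest = _base("NetworkChaos", name, namespace, labels)
    manifest["spec"]["action"] = "loss"
    manifest["spec"]["loss"] = {"loss": str(loss_percent)}
    return manifest


def cpu_stress(name: str, namespace: str, labels: Dict[str, str],
               workers: int, load: int) -> dict:
    manifest = _base("StressChaos", name, namespace, labels)
    manifest["spec"]["stressors"] = {"cpu": {"workers": workers, "load": load}}
    return manifest


# Hypothetical target: the Connect worker pods of the test run.
connect_labels = {"app": "kafka-connect"}
experiment = pod_kill("connect-pod-kill", "perf-test", connect_labels)
manifest_json = json.dumps(experiment, indent=2)  # could be piped to kubectl apply -f -
```

Generating manifests from code keeps chaos experiments in the same spec-driven flow as the rest of the pipeline, so the same connector test can be rerun with and without fault injection.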
[Diagram: Performance test pipeline (CI/CD, Kubernetes, Source System, Load Generator, Kafka Cluster, Connect, Analyzer) with an added Inject Chaos step]
Impact
1. Easy + repeatable tests
2. Uncover connector bottlenecks
3. LoadGen integrations
4. Report generation
5. Chaos DR tests on connectors
6. Performance gating of 15+ connectors
Takeaways
1. Importance of Kafka Connect Perf tests
2. Factors affecting connector performance
3. Test platform design guidelines
4. Configuration tuning and impact
5. Chaos tests
Special Thanks
● Scott Hendricks
● Anupam Aggarwal
● Vikas Balani
Thank you!
