"Calling APIs over the network from Kafka Streams is often a necessary evil: although it can incur significant costs by blocking the processing and thus making our pipelines less reliable, we are sometimes forced to integrate business-process-related microservices, machine learning models, and other technology that does not pair well with the JVM. We scale the Kafka Streams applications and the services in Kubernetes to avoid blocked pipelines, but the scaling often comes too late, because it traditionally relies on metrics like the consumer group lag or the number of HTTP requests.
In this talk, we first give an overview of the caveats of integrating such services in Kafka Streams and basic approaches for mitigating them. Second, we present our solution for the timely scaling of complex Kafka Streams pipelines in conjunction with remotely connected APIs. In addition to the well-known metrics, we take further dimensions into account. Powered by Kafka Streams, we observe our data stream, extract metadata, aggregate statistics, and finally expose them as external metrics. We integrate this with auto-scaling frameworks such as KEDA to reliably scale our pipelines just in time."
4. Option 1: Retry Strategy
[Diagram: Display Event Producer, Preprocessing, Predictor (Kafka Streams), User Prediction REST API, Ad Prediction REST API]
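The retry option can be illustrated with a minimal sketch, assuming a blocking call that is retried with a doubling delay. The function names, the attempt limit, and the delays are hypothetical and not taken from the talk; the sleep function is injectable so the example runs without waiting.

```python
import time


def call_with_retry(call, max_attempts=3, base_delay_s=0.1, sleep=time.sleep):
    """Call `call()` and retry on ConnectionError, doubling the delay each time.

    The calling thread stays blocked while it waits, which is the cost
    of this strategy.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except ConnectionError:
            if attempt == max_attempts:
                raise  # give up and let the caller decide what to do
            sleep(base_delay_s * 2 ** (attempt - 1))


# Hypothetical flaky service: fails twice, then answers.
attempts = {"count": 0}


def flaky_user_prediction():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConnectionError("service unavailable")
    return {"user_id": "u1", "score": 0.7}


delays = []  # record the requested delays instead of really sleeping
result = call_with_retry(flaky_user_prediction, sleep=delays.append)
```

Every record behind the failing one waits for the retries to finish, so this approach only suits short outages.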
5. Option 2: Dead-Letter Queue
[Diagram: Display Event Producer, Preprocessing, Predictor (Kafka Streams), User Prediction REST API, Ad Prediction REST API, DLQ]
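As a rough illustration of the dead-letter queue idea, the sketch below parks failing records in a list instead of blocking on them. The list stands in for a Kafka topic, and `hypothetical_predict` and the record fields are assumptions made for this example.

```python
def process_with_dlq(records, predict, dlq):
    """Enrich each record; park failures in `dlq` instead of blocking."""
    results = []
    for record in records:
        try:
            results.append(predict(record))
        except Exception as error:
            # Keep the original record plus the failure reason for later replay.
            dlq.append({"record": record, "error": str(error)})
    return results


def hypothetical_predict(record):
    if record["user_id"] is None:
        raise ValueError("missing user_id")
    return {**record, "score": 0.5}


dead_letters = []  # stands in for the DLQ topic
events = [{"user_id": "u1"}, {"user_id": None}, {"user_id": "u2"}]
enriched = process_with_dlq(events, hypothetical_predict, dead_letters)
```

The pipeline keeps moving, but the parked records lose their position relative to the records that succeeded.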
6. Option 3: Back off Topics
[Diagram: Display Event Producer, Preprocessing, Predictor (Kafka Streams), User Prediction REST API, Ad Prediction REST API, Exponential Back off Topics with delays 10, 100, 10000]
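One possible way to route records across back off topics is sketched below. The delays are the ones listed on the slide, which does not state their unit. The topic names and the fallback to a dead-letter topic after the last delay are assumptions for illustration only.

```python
# Delays as listed on the slide; the unit is not stated there.
BACKOFF_DELAYS = [10, 100, 10000]


def route_failed_record(failed_attempts):
    """Return the topic for a record that has failed `failed_attempts` times (>= 1)."""
    if failed_attempts > len(BACKOFF_DELAYS):
        # Assumption: once every delay has been tried, the record is parked.
        return "predictor-dead-letter"  # hypothetical topic name
    delay = BACKOFF_DELAYS[failed_attempts - 1]
    return f"predictor-backoff-{delay}"  # hypothetical naming scheme


routes = [route_failed_record(n) for n in range(1, 5)]
```

A consumer of each back off topic would wait for the topic's delay before retrying, so the main pipeline is never blocked by a failing record.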
7. Option 4: Let it Crash
[Diagram: Display Event Producer, Preprocessing, Predictor (Kafka Streams), User Prediction REST API, Ad Prediction REST API]
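A small simulation of the let-it-crash option, under the assumption that progress is committed only after a record succeeds and that something outside the worker (for example Kubernetes) restarts it. The `Worker` class and the record values are hypothetical.

```python
class Worker:
    """Toy stand-in for a stream task with a committed offset."""

    def __init__(self):
        self.committed_offset = 0

    def run(self, records, handle):
        for record in records[self.committed_offset:]:
            handle(record)  # no error handling: a failure ends the run
            self.committed_offset += 1  # commit only after success


service_up = {"value": False}
processed = []


def handle(record):
    if record == "e2" and not service_up["value"]:
        raise ConnectionError("Ad Prediction unavailable")
    processed.append(record)


worker = Worker()
records = ["e1", "e2", "e3"]
try:
    worker.run(records, handle)
except ConnectionError:
    crashed_at = worker.committed_offset  # offset of the record that failed

service_up["value"] = True
worker.run(records, handle)  # the "restart" resumes at the failed record
```

No record is skipped or reordered, but nothing is processed while the worker is down.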
9. How fast can we go?
[Diagram: Predictor (Kafka Streams), User Prediction REST API, Ad Prediction REST API; predictor w/s: ?]
10. How fast can we go?
[Diagram: Predictor (Kafka Streams), User Prediction REST API, Ad Prediction REST API; predictor w/s: ?]

Service            RPS
User Prediction    2
Ad Prediction      1.3
11. How fast can we go?
[Diagram: Predictor (Kafka Streams), User Prediction REST API, Ad Prediction REST API; predictor w/s: ?]

Service            RPS    Latency
User Prediction    2      500ms
Ad Prediction      1.3    750ms
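The slide leaves the predictor's rate open. The sketch below works through one plausible reading, assuming a single predictor instance calls both services one after the other for every record and blocks on each call. Under that assumption the listed RPS values are simply the inverse of the latencies.

```python
# Latencies from the slide, in seconds.
LATENCY_S = {"User Prediction": 0.5, "Ad Prediction": 0.75}


def sequential_throughput(latencies_s):
    """Records per second when each record waits for every call in turn."""
    return 1.0 / sum(latencies_s)


# One blocking caller per service: RPS is the inverse of the latency.
per_service_rps = {name: round(1.0 / s, 1) for name, s in LATENCY_S.items()}

# Assumption: both calls happen sequentially for every record.
predictor_rps = sequential_throughput(LATENCY_S.values())
```

With these numbers a single predictor would handle roughly 0.8 records per second, since each record spends 1.25 seconds waiting on the two services.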
15. How fast can we go?
[Diagram: two Predictor (Kafka Streams) instances, User Prediction REST API, Ad Prediction REST API; predictor w/s: ?]

Service            RPS    Latency
User Prediction    2      1000ms
Ad Prediction      1.3    1500ms
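One way to read these numbers is through Little's law (L = λW): if the services are already saturated, adding a second blocking predictor doubles the requests in flight while the service rate stays fixed, so the latency doubles. The sketch below checks that reading against the single-instance latencies of 500ms and 750ms. It is an interpretation, not a calculation shown in the talk.

```python
def latency_at_saturation(concurrent_callers, service_capacity_rps):
    """Little's law (L = lambda * W) solved for W, the time per request in seconds."""
    return concurrent_callers / service_capacity_rps


# Capacities derived from the single-instance latencies (assumption:
# one blocking caller kept each service fully busy).
user_capacity_rps = 1 / 0.5   # 2 RPS
ad_capacity_rps = 1 / 0.75    # about 1.3 RPS

one_predictor_user = latency_at_saturation(1, user_capacity_rps)
two_predictors_user = latency_at_saturation(2, user_capacity_rps)
two_predictors_ad = latency_at_saturation(2, ad_capacity_rps)
```

Under this reading, scaling only the Kafka Streams application leaves the overall rate unchanged. The services have to scale with it.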
32. Autoscaling
[Diagram: Horizontal Pod Autoscaler scaling multiple instances of Predictor (Kafka Streams), User Prediction REST API, and Ad Prediction REST API; predictor w/s: ?]
What is the lag of the consumer group?
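For reference, the Kubernetes Horizontal Pod Autoscaler derives the replica count from the ratio of the current metric value to its target. The sketch below applies that formula to consumer group lag. The lag figures and the target are made-up values, and the sketch ignores the autoscaler's tolerance and its minimum and maximum replica bounds.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric):
    """HPA rule: ceil(currentReplicas * currentMetricValue / desiredMetricValue)."""
    return math.ceil(current_replicas * current_metric / target_metric)


# Hypothetical numbers: 2 predictor pods, target lag of 1000 records.
under_load = desired_replicas(2, current_metric=3000, target_metric=1000)
caught_up = desired_replicas(2, current_metric=500, target_metric=1000)
```

Lag only rises once the pipeline has already fallen behind, so a scaling decision based on it reacts after the fact.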
39. Timely Autoscaling
[Diagram: Display Event Producer, Preprocessing, Predictor (Kafka Streams), User Prediction REST API, Ad Prediction REST API, Metrics (Consumer / REST API), Horizontal Pod Autoscaler (Scale); predictor w/s: ?]
40. Timely Autoscaling
[Diagram: Display Event Producer, Preprocessing, Predictor (Kafka Streams), User Prediction REST API, Ad Prediction REST API, Metrics (Consumer / REST API), Horizontal Pod Autoscaler (Scale); predictor w/s: ?]
How many unique user and ad IDs are there?
48. Timely Autoscaling
[Diagram: Display Event Producer, Preprocessing, Predictor (Kafka Streams), User Prediction REST API, Ad Prediction REST API, Metrics (Consumer / REST API), Horizontal Pod Autoscaler (Scale); predictor w/s: ?]
How many unique user and ad IDs are there? ~ required RPS
49. Timely Autoscaling
[Diagram: Display Event Producer, Preprocessing, two Predictor (Kafka Streams) instances, two User Prediction REST API instances, two Ad Prediction REST API instances, Metrics (Consumer / REST API), Horizontal Pod Autoscaler (Scale); predictor w/s: ?]
How many unique user and ad IDs are there? ~ required RPS
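The idea of estimating the required RPS from the number of unique IDs can be sketched as follows. The sketch assumes that each distinct ID in a time window needs one request to its service (for example because repeated IDs are answered from a cache). The field names `user_id` and `ad_id` and the window length are hypothetical.

```python
def required_rps(events, window_s):
    """Estimate the request rate each service needs for one window of events.

    Assumption: one request per distinct ID within the window.
    """
    unique_users = {event["user_id"] for event in events}
    unique_ads = {event["ad_id"] for event in events}
    return {
        "user_prediction": len(unique_users) / window_s,
        "ad_prediction": len(unique_ads) / window_s,
    }


# Display events observed upstream of the predictor in a 2-second window.
window = [
    {"user_id": "u1", "ad_id": "a1"},
    {"user_id": "u1", "ad_id": "a2"},
    {"user_id": "u2", "ad_id": "a1"},
    {"user_id": "u3", "ad_id": "a1"},
]
estimate = required_rps(window, window_s=2)
```

Because the events are counted before they reach the predictor, an estimate like this is available to the autoscaler before any lag has built up.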