As distributed applications grow more complex, dynamic, and massively scalable, “observability” becomes more critical. Observability is the practice of using metrics, monitoring and distributed tracing to understand how a system works. In this presentation we’ll explore two complementary Open Source technologies: Prometheus for monitoring application metrics, and OpenTracing and Jaeger for distributed tracing. We’ll discover how they improve the observability of a massively scalable Anomaly Detection system - an application built around Apache Cassandra and Apache Kafka for the data layers, and dynamically deployed and scaled on Kubernetes, a container orchestration technology. We will give an overview of Prometheus and OpenTracing/Jaeger, explain how the application is instrumented, and describe how Prometheus and OpenTracing are deployed and configured in a production environment running Kubernetes, to dynamically monitor the application at scale. We conclude by exploring the benefits of monitoring and tracing technologies for understanding, debugging and tuning complex dynamic distributed systems built on Kafka, Cassandra and Kubernetes, and introduce a new use case enabling Cassandra Elastic Autoscaling by combining Prometheus alerts, Instaclustr’s Provisioning API for Dynamic Resizing, and the new Prometheus monitoring API.
ApacheCon 2019 Talk: Improving the Observability of Cassandra, Kafka and Kubernetes applications with Prometheus and OpenTracing
1. Improving the Observability of
Cassandra, Kafka and Kubernetes
applications with Prometheus and
OpenTracing
Paul Brebner
Technology Evangelist
instaclustr.com
Observability track, ApacheCon 2019, Tuesday, 10th September, Las Vegas, USA
https://www.apachecon.com/acna19/s/#/scheduledEvent/1031
3. As distributed cloud
applications grow more
complex, dynamic, and
massively scalable,
“observability” becomes
more critical.
Observability is the practice
of using metrics, monitoring
and distributed tracing to
understand how a system
works.
And find the invisible cows
Observability
Critical
7. OpenAPM Landscape
In this talk we’ll explore two complementary Open
Source technologies:
- Prometheus for monitoring application metrics, and
- OpenTracing and Jaeger for distributed tracing.
We’ll discover how they improve the observability of
- an Anomaly Detection application,
- deployed on AWS Kubernetes, and
- using Instaclustr managed Apache Cassandra and
Kafka clusters.
18. Prometheus
How does metrics
capture work?
Instrumentation and Agents (Exporters)
- Client libraries for instrumenting applications in
multiple programming languages
- Java client collects JVM metrics and enables
custom application metrics
- Node exporter for host hardware metrics
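For example, the JVM metrics mentioned above can be enabled with a single call from the Java client's hotspot module (a sketch, not code from the talk):

```java
import io.prometheus.client.hotspot.DefaultExports;
. . .
// Registers the default JVM collectors: GC, memory pools,
// threads, classloading, etc.
DefaultExports.initialize();
```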
19. Prometheus
Data Model
■ Metrics
● Time series data
ᐨ timestamp and value; name, key:value pairs
● By convention name includes
ᐨ thing being monitored, logical type, and units
ᐨ e.g. http_requests_total, http_duration_seconds
■ Prometheus automatically adds labels
● Job, host:port
■ Metric types (only relevant for instrumentation)
● Counter (increasing values)
● Gauge (values up and down)
● Histogram
● Summary
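When scraped, these time series appear in Prometheus's plain-text exposition format. A hypothetical sample for the conventions above (the metric name and label are illustrative; the job and instance labels are attached by the Prometheus server at scrape time):

```text
# HELP http_requests_total Total count of HTTP requests
# TYPE http_requests_total counter
http_requests_total{stage="producer"} 1027.0
```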
21. Steps
Basic
■ Create and register Prometheus Metric types
● (e.g. Counter) for each timeseries type (e.g. throughputs) including
name and units
■ Instrument the code
● e.g. increment the count, using name of the component (e.g.
producer, consumer, etc) as label
■ Create HTTP server in code
■ Tell Prometheus where to scrape from (config file)
■ Run Prometheus Server
■ Browse to Prometheus server
■ View and select metrics, check that there’s data
■ Construct expression
■ Graph the expression
■ Run and configure Grafana for better graphs
22. Instrumentation
Counter example
// Use a single Counter for throughput metrics
// for all stages of the pipeline
// stages are distinguished by labels
static final Counter pipelineCounter = Counter
.build()
.name(appName + "_requests_total")
.help("Count of executions of pipeline stages")
.labelNames("stage")
.register();
. . .
// After successful execution of each stage:
// increment producer/consumer/detector rate count
pipelineCounter.labels("producer").inc();
. . .
pipelineCounter.labels("consumer").inc();
. . .
pipelineCounter.labels("detector").inc();
23. Instrumentation
Gauge example
// A Gauge can go up and down
// Used to measure the current value of some variable.
// pipelineGauge will measure duration of each labelled stage
static final Gauge pipelineGauge = Gauge
.build()
.name(appName + "_duration_seconds")
.help("Gauge of stage durations in seconds")
.labelNames("stage")
.register();
. . .
// in detector pipeline, compute duration in seconds and set
// (assuming millisecond timestamps, convert to match the metric units)
double duration = (nowTime - startTime) / 1000.0;
pipelineGauge.labels("detector").set(duration);
24. HTTP Server
For metric pulls
// Metrics are pulled by Prometheus
// Create an HTTP server as the endpoint to pull from
// If there are multiple processes running on the same server
// then you need different port numbers
// Add IPs and port numbers to the Prometheus configuration
// file.
HTTPServer server = null;
try {
server = new HTTPServer(1234);
} catch (IOException e) {
e.printStackTrace();
}
25. Using
Prometheus
Configure
Run
■ Configure Prometheus with the IPs and ports to poll
● Edit the default prometheus.yml file
● Includes polling frequency, timeouts, etc.
● OK for testing, but doesn’t scale for production systems
■ Get, install and run Prometheus.
● Initially just running locally.
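A minimal prometheus.yml for this kind of static setup might look like the following (the job name, target, and intervals are assumptions, not the talk's actual configuration):

```yaml
global:
  scrape_interval: 15s   # how often to pull metrics
  scrape_timeout: 10s

scrape_configs:
  - job_name: 'anomaly-detection-pipeline'
    static_configs:
      # one target per process; matches the HTTPServer port in the code
      - targets: ['localhost:1234']
```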
26. Graphs
Counter
■ Browse to Prometheus Server URL
■ No default dashboards
■ View and select metrics
■ Execute them to graph
■ Counter value increases over time
27. Rate
Graph using irate
function
■ Enter expressions, e.g. irate function
■ Expression language has multiple data types and many
functions
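For example, a per-second throughput for a pipeline stage could be computed with an expression like this (assuming the appName in the earlier Counter example resolves to anomalia_machina):

```promql
# per-second rate of the stage counter,
# computed from the two most recent samples in the last minute
irate(anomalia_machina_requests_total{stage="producer"}[1m])
```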
29. Grafana
Prometheus GUI ok
for debugging
Grafana better for
production
■ Install and run Grafana
■ Browse to Grafana URL, create a Prometheus data
source, add a Prometheus Graph.
■ Can enter multiple Prometheus expressions and graph
them on the same graph.
■ Example shows rate and duration metrics
30. Simple Test
configuration
Prometheus Server
outside Kubernetes
cluster, pulls metrics
from Pods
Dynamic/many
Pods are a
challenge
■ IP addresses to pull from are dynamic
● Have to update Prometheus pull configurations
● In production too many Pods to do this manually
31. Prometheus
on
Kubernetes
A few extra steps
makes life easier
■ Create and register Prometheus Metric types
● (e.g. Counter) for each timeseries type (e.g. throughputs) including name and
units
■ Instrument the code
● e.g. increment the count, using name of the component (e.g. producer,
consumer, etc) as label
■ Create HTTP server in code
■ Run Prometheus Server on Kubernetes cluster,
using Kubernetes Operator
■ Configure so it dynamically monitors selected Pods
■ Enable ingress and external access to Prometheus
server
■ Browse to Prometheus server
■ View and select metrics, check that there’s data
■ Construct expression
■ Graph the expression
■ Run and configure Grafana for better graphs
37. Prometheus
In production on
Kubernetes
Use Prometheus
Operator
1 Install Prometheus
Operator and Run
2 Configure Service Objects to
monitor Pods
3 Configure ServiceMonitors to
discover Service Objects
4 Configure Prometheus objects
to specify which ServiceMonitors
should be included
5 Allow ingress to Prometheus
by using a Kubernetes
NodePort Service
6 Create Role-based access
control rules for both
Prometheus and Prometheus
Operator
7 Configure AWS EC2 firewalls
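Steps 2 and 3 can be sketched as a Service selecting the application Pods by label, and a ServiceMonitor selecting that Service by label (names, labels, and the port are illustrative, not the talk's actual manifests):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: anomaly-pipeline-metrics
  labels:
    app: anomaly-pipeline
spec:
  selector:
    app: anomaly-pipeline   # matches the application Pods
  ports:
    - name: metrics
      port: 1234            # the HTTPServer port in the code
---
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: anomaly-pipeline-monitor
spec:
  selector:
    matchLabels:
      app: anomaly-pipeline # discovers the Service above
  endpoints:
    - port: metrics         # scrape the named port
```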
39. OpenTracing
Use Case:
Topology Maps
■ Prometheus collects and displays metric aggregations
● No dependency or order information, no single events
■ Distributed tracing shows “call tree” (causality, timing) for
each event
■ And Topology Maps
40. OpenTracing
Standard API for
distributed tracing
■ Specification, not implementation
■ Need
● Application instrumentation
● OpenTracing tracer
[Diagram: traced applications call the OpenTracing API, which is backed by tracer implementations, Open Source and Datadog]
41. Spans
Smallest logical unit
of work in
distributed system
■ Spans are smallest logical unit of work
● Have name, start time, duration, associated component
■ Simplest trace is a single span
42. Trace
Multi-span trace
■ Spans can be related
● ChildOf = synchronous dependency (wait)
● FollowsFrom = asynchronous relationships (no wait)
■ A Trace is a DAG of Spans.
● 1 or more Spans.
43. Instrumentation
■ Language specific client instrumentation
● Used to create spans in the application within the same process
■ Contributed libraries for frameworks
● E.g. Elasticsearch, Cassandra, Kafka etc
● Used to create spans across process boundaries (Kafka producers
-> consumers)
■ Choose and Instantiate a Tracer implementation
// Example instrumentation for consumer -> detector spans
static Tracer tracer = initTracer("AnomaliaMachina");
. . .
Span span1 = tracer.buildSpan("consumer").start();
. . .
span1.finish();
Span span2 = tracer
    .buildSpan("detector")
    .addReference(References.CHILD_OF, span1.context())
    .start();
. . .
span2.finish();
44. Tracing
across
process
boundaries
Inject/extract
metadata
■ To trace across process boundaries (processes,
servers, clouds) OpenTracing injects metadata into
the cross-process call flows to build traces across
heterogeneous systems.
■ Inject and extract a spanContext, how depends on
protocol.
45. How to do
this for
Kafka?
Producer
Automatically
inserts a span
context into Kafka
headers using
Interceptors
// Register tracer with GlobalTracer:
GlobalTracer.register(tracer);
// Add TracingProducerInterceptor to sender properties:
senderProps.put(ProducerConfig.INTERCEPTOR_CLASSES_CONFIG,
TracingProducerInterceptor.class.getName());
// Instantiate KafkaProducer
KafkaProducer<Integer, String> producer = new
KafkaProducer<>(senderProps);
// Send
producer.send(...);
// 3rd party library
// https://github.com/opentracing-contrib/java-kafka-client
46. Consumer
side
Extract spanContext
// Once you have a consumer record, extract the span context
// and create a new FOLLOWS_FROM span
SpanContext spanContext = tracer.extract(
    Format.Builtin.TEXT_MAP,
    new MyHeadersMapExtractAdapter(record.headers(), false));
newSpan = tracer
    .buildSpan("consumer")
    .addReference(References.FOLLOWS_FROM, spanContext)
    .start();
48. Jaeger
Tracer
How to use?
• Tracers can have different architectures and protocols
• Jaeger should scale well in production, as it
• Can use Cassandra (span storage) and Spark (aggregation jobs)
• Uses adaptive sampling
• Need to instantiate a Jaeger tracer in your code
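A sketch of what initTracer() might look like using the io.jaegertracing client (the sampler type and reporter settings are assumptions, not the talk's actual code):

```java
import io.jaegertracing.Configuration;
import io.jaegertracing.Configuration.ReporterConfiguration;
import io.jaegertracing.Configuration.SamplerConfiguration;
import io.opentracing.Tracer;
. . .
static Tracer initTracer(String service) {
    // "const" sampler with param 1 keeps every trace:
    // fine for testing, use probabilistic/adaptive sampling in production
    SamplerConfiguration sampler =
        SamplerConfiguration.fromEnv().withType("const").withParam(1);
    ReporterConfiguration reporter =
        ReporterConfiguration.fromEnv().withLogSpans(true);
    return new Configuration(service)
        .withSampler(sampler)
        .withReporter(reporter)
        .getTracer();
}
```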
49. Jaeger
GUI
■ Install and start Jaeger
■ Browse to Jaeger URL
■ Find traces by name, operation, and filter.
■ Select to drill down for more detail.
50. Jaeger
Single trace
■ Insight into total trace time, relationships and times
of spans
■ This is a trace of a single event through the
anomaly detector pipeline
● Producer (async)
● Consumer (async)
● Detector (async, with sync children)
ᐨ CassandraWrite
ᐨ CassandraRead
ᐨ AnomalyDetector
51. Jaeger
Dependencies view
■ Correctly shows anomaly detector topology
■ Only metric is number of spans observed
■ Can’t select subset of traces, or filter
■ Force directed view, select node and highlights
dependencies
52. Kafka
Challenge
Multiple Kafka topic
topologies
■ More complex example (application simulates
complex event flows across topics)
■ Show dependencies between source, intermediate
and sink Kafka topics.
54. Conclusions
Observations &
Alternatives
■ Topology view is basic (cf. some commercial APMs)
■ Still need Prometheus for metrics
● in theory OpenTracing has everything needed for metrics.
■ Other OpenTracing tracers may be worth trying, e.g.
Datadog
■ OpenCensus is a competing approach.
■ Manual instrumentation is tedious and potentially
error-prone; commercial APMs use Java agents with
byte-code injection instead. Are there Open Source
alternatives?
■ The future? Kubernetes based service mesh
frameworks can construct traces for microservices
without instrumentation
● as they have visibility into how Pods interact with each other and
external systems
● and Pods only contain a single microservice, not a monolithic
application
● E.g. Kiali, observability for Istio, www.kiali.io
55. Results
Scaled out to 48
Cassandra nodes
Approx 600 cores
for whole system
109 Pods for
Prometheus to
monitor
Producer rate metric
(9 Pods)
Peak Producer rate = 2.3 Million events/s
Prometheus was critical for collecting, computing and displaying
the metrics, from multiple Pods
56. Business
metric
Detector rate
100 Pods
220,000 anomaly
checks/s computed
from 100 stacked
metrics
Anomaly Checks/s = 220,000
Prometheus was critical for tuning the system to achieve near perfect linear
scalability - used metrics for consumer and detector rate to tune thread pool
sizes to optimize anomaly checks/s, for increasingly bigger systems.
OpenTracing and Jaeger were useful during test deployment
- to check/debug whether components were working together as expected
57. Cassandra &
OpenTracing
Visibility into
Cassandra clusters?
■ OpenTracing the example application was
● Across Kafka producers/consumers
● And within the Kubernetes deployed application
■ What options are there for improved visibility of
tracing across Cassandra clusters?
■ Instaclustr managed service
● OpenTracing support for the C* driver
● May not require any support from C* clusters
● https://github.com/opentracing-contrib/java-cassandra-driver
■ Self-managed clusters
● end-to-end OpenTracing through a C* cluster
● May require support from C* cluster
● https://github.com/thelastpickle/cassandra-zipkin-tracing
58. Cassandra &
Prometheus
Visibility into
Cassandra clusters?
■ Prometheus monitoring of the example application
● limited to application metrics collected from Kubernetes Pods
■ What options are there for integration with Cassandra
cluster metrics?
?
59. Cassandra &
Prometheus
Visibility into
Cassandra clusters?
Option 1
Self-managed
clusters
■ Instaclustr's Open Source Tools For Cassandra -
LDAP/Kerberos, Prometheus Exporter, Debug
Tooling and K8s Operator
● Adam Zegelin
● Wednesday, 11th Sep, 16:45 - 17:35
● Mesquite
■ Prometheus cassandra-exporter
● Exports Cassandra metrics to Prometheus
● https://github.com/instaclustr/cassandra-exporter
■ Kubernetes Operator for Cassandra
● The Cassandra operator will create the appropriate objects to inform the
Prometheus operator about the metrics endpoints available from Cassandra
● https://github.com/instaclustr/cassandra-operator/
60. Cassandra &
Prometheus
Visibility into
Cassandra clusters?
Option 2
Instaclustr managed
service
■ Instaclustr managed Cassandra
● NEW Instaclustr Prometheus Monitoring API
● Works with both Cassandra and Kafka
● Exports default Cassandra and Kafka metrics, Server metrics, and extra
managed service metrics
65. Cassandra
Auto scaling
4 Alert when it’s time
to trigger a cluster
resize (taking into
account predicted
rate of increase,
resizing time,
reduced cluster
capacity during
resize, but leave as
late as possible to
prevent unnecessary
resizing)
[Chart: cluster capacity (%) vs time (minutes) for a resize by node, showing CPU % used for regression, the regression prediction, the latest time to trigger the resize by node, reduced cluster capacity during the resize, the resize time, and the final resized capacity]
67. Cassandra
Auto scaling
5 Use alert to trigger
5a Kubernetes Pod
scaling and
5b Dynamic Resizing
of Cassandra Cluster
(Using Webhooks?)
Pod scaling
Dynamic Resizing
68. More
information?
Blogs and another
talk
■ Anomalia Machina Blog Series (10 parts)
● https://www.instaclustr.com/anomalia-machina-10-final-results-
massively-scalable-anomaly-detection-with-apache-kafka-and-
cassandra/
■ All my blogs
● https://www.instaclustr.com/paul-brebner/
■ Kafka, Cassandra and Kubernetes at Scale -
Real-time Anomaly detection on 19 billion events
a day
● Paul Brebner
● Thursday, 12th Sep, 13:00 - 13:50
● Savoy