Nicolas Steinmetz [CérénIT] | Sustain Your Observability from Bare Metal TICK Stack and Apps to a Kubernetes World | InfluxDays Virtual Experience London 2020
When moving your apps to Kubernetes, you need to keep your existing observability at the same level or better. Kubernetes brings some challenges, as you can’t deploy the TICK Stack exactly as you did before, but it also opens up some opportunities. The talk is about my journey on this topic and covers Telegraf as a DaemonSet to collect node resource metrics, as a Deployment to fetch metrics from different endpoints and, hopefully, Telegraf as an operator to illustrate sidecar deployment. All these metrics are pushed to InfluxDB (v1/v2) and can be visualized in Chronograf or Grafana.
And why the TICK/TIG stack?
Some principles to start with...
How did I get there?
∙Custom metrics for home sensors & extended to platform monitoring
∙Best of breed platform
∙Nice UI and Dashboards
∙Ready to use
∙Python API (pre-telegraf world 😉)
∙Raspberry Pi compatible
∙Monitoring outside the platform
∙Telegraf to collect and send metrics
∙Grafana for Alerting and Visualising
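To illustrate the pre-Telegraf approach of sending custom sensor metrics from Python, here is a minimal sketch that formats a reading as InfluxDB line protocol. The measurement, tag and field names are hypothetical, and the helpers are written for illustration only; they are not taken from the talk or from an InfluxDB client library.

```python
def escape_key(text):
    """Escape the characters that are special in tag keys, tag values and field keys."""
    return text.replace(",", "\\,").replace("=", "\\=").replace(" ", "\\ ")


def format_field_value(value):
    """Render a Python value using line protocol field syntax."""
    if isinstance(value, bool):  # must be checked before int: bool is a subclass of int
        return "true" if value else "false"
    if isinstance(value, int):
        return f"{value}i"  # integers carry an "i" suffix
    if isinstance(value, float):
        return repr(value)
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'  # strings are double-quoted


def to_line_protocol(measurement, tags, fields, timestamp_ns=None):
    """Build one line: measurement,tag=value field=value [timestamp]."""
    if not fields:
        raise ValueError("line protocol requires at least one field")
    head = measurement.replace(",", "\\,").replace(" ", "\\ ")
    for key in sorted(tags):  # sorted tags keep the output deterministic
        head += f",{escape_key(key)}={escape_key(str(tags[key]))}"
    field_part = ",".join(
        f"{escape_key(name)}={format_field_value(value)}"
        for name, value in fields.items()
    )
    line = f"{head} {field_part}"
    if timestamp_ns is not None:
        line += f" {timestamp_ns}"
    return line


if __name__ == "__main__":
    # Hypothetical home sensor reading.
    print(to_line_protocol(
        "temperature",
        {"room": "living room", "sensor": "dht22"},
        {"celsius": 21.5, "battery_pct": 87},
        1600000000000000000,
    ))
```

A line built this way could then be posted to an InfluxDB write endpoint or handed to a client library; that transport step is deliberately left out of the sketch.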
Once upon a time...
From bare metal to containers...
∙(Ab)use of /etc/telegraf/telegraf.d
∙Automated by Infrastructure as
∙Host and application metrics 🤩
∙Docker input plugin provides only general metrics (mem, cpu, net,
∙Lost visibility on what happens inside the container 😰
Hello Docker!
∙Add Telegraf in Docker! 😌
∙Get metrics back again for services
∙But not the perfect solution…
∙A new pattern is rising… 😏
To infinity and beyond...
From containers to kubernetes...
k8s world?
∙Nodes: master(s) & workers
∙Kubernetes Core Services (etcd,
∙Application and related kubernetes
∙De facto standard
∙Ecosystem relies on prometheus
∙Core service metrics
∙Alert Manager & Prom UI
∙Already have TIG !
∙Not another / custom kubernetes
∙Long-term storage?
∙Not embedded in the cluster nor
want to enable pull monitoring from
So why not just
∙Do we want to have exactly the same data or something similar?
∙Don’t just try to duplicate Prom Operator dashboards, but question your needs and the existing
∙Available metrics may depend on your Kubernetes provider
Before diving in!
∙Contributed an updated version of
telegraf-ds helm chart 💪
∙Mix of traditional plugins +
kubernetes input plugin
∙Opinionated default configuration
Global & Node
∙Inspired by Prometheus Operator
∙Reproduced the ones that interested me and extended them
∙Most of the metrics are identical; a few are different
∙Telegraf-operator (alpha) chart
∙Inject a telegraf container as sidecar
∙Telegraf classes to define the configuration to apply
∙Interesting for non-service metrics
∙Watch out for Telegraf proliferation
∙Young and promising initiative
∙Ready to use & Dashboard as code
∙Kubernetes dashboards by
∙Based on kubernetes and
kube_inventory input plugins
∙github.com > influxdata > community-templates > k8s
∙Take only node-exporter,
∙Use telegraf to collect prometheus
metrics via prometheus input plugin
A third way ?
∙Explore InfluxDB 2.0 in more depth, especially to dissociate Alerting from
∙Explore telegraf-operator in more depth for in-pod metrics to confirm my
∙Possible to monitor kubernetes
platform with telegraf
∙Don’t need to deploy prometheus in
∙Leverage prometheus exporters
with prometheus input plugin
∙Mix of Prometheus Operator and
Telegraf to have best of both worlds
∙Watch progress of telegraf operator
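
To make the idea of leveraging Prometheus exporters more concrete, the sketch below parses a small sample of the Prometheus text exposition format into (name, labels, value) records. This is roughly the kind of payload a scraper such as Telegraf's prometheus input plugin reads from an exporter endpoint. It is a simplified illustration and not the plugin's implementation: the sample values are made up, and the parser ignores timestamps, escaped label values and HELP/TYPE metadata.

```python
import re

# Made-up sample in the style of node-exporter output.
SAMPLE = """\
# HELP node_cpu_seconds_total Seconds the CPUs spent in each mode.
# TYPE node_cpu_seconds_total counter
node_cpu_seconds_total{cpu="0",mode="idle"} 12345.67
node_cpu_seconds_total{cpu="0",mode="user"} 234.5
node_load1 0.42
"""

# metric_name, optional {labels}, then the sample value.
LINE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
# label="value" pairs; escaped quotes inside values are not handled here.
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="([^"]*)"')


def parse_exposition(text):
    """Return a list of (name, labels, value) tuples, skipping comments."""
    samples = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank line or HELP/TYPE comment
        match = LINE_RE.match(line)
        if match is None:
            continue  # not a sample line this sketch understands
        name, label_blob, value = match.groups()
        labels = dict(LABEL_RE.findall(label_blob or ""))
        samples.append((name, labels, float(value)))
    return samples


if __name__ == "__main__":
    for name, labels, value in parse_exposition(SAMPLE):
        print(name, labels, value)
```

In a time series store such as InfluxDB, the labels would typically be kept as tags and the value as a field, so exporter metrics can sit next to metrics collected by other Telegraf inputs.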