Prometheus has become the de facto monitoring system for cloud native applications, but for a while it was eschewed by the Hashistack in favour of more traditional technologies. That's all changing: HashiCorp's projects are beginning to export metrics in the native Prometheus format, and many exporters exist to bridge the gap.
In this talk Tom will give a brief introduction to Prometheus, show you how to piece it all together, and give some recommendations on what to monitor and alert on.
2. Prometheus
● A monitoring & alerting system.
● Inspired by Google’s BorgMon
● Originally built by SoundCloud in 2012
● Open Source, now part of the CNCF
● Simple text-based metrics format
● Multidimensional data model
● Rich, concise query language
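To make the “simple text-based metrics format” concrete, here is a minimal sketch that renders one metric family in the Prometheus text exposition style. The helper names (escape_label_value, render_metric) and the help text are illustrative assumptions, not part of any Prometheus client library.

```python
def escape_label_value(value: str) -> str:
    # Label values escape backslash, double quote and newline.
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_metric(name, help_text, metric_type, samples):
    """Render one metric family as text.

    samples is a list of (labels_dict, value) pairs (hypothetical input shape).
    """
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} {metric_type}"]
    for labels, value in samples:
        if labels:
            body = ",".join(
                f'{key}="{escape_label_value(val)}"'
                for key, val in sorted(labels.items())
            )
            lines.append(f"{name}{{{body}}} {value}")
        else:
            lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_metric(
        "http_requests_total",
        "Total HTTP requests.",  # assumed help text
        "counter",
        [({"path": "/home", "status": "502"}, 3)],
    ))
```

In practice an official client library would produce this output for you; the sketch only shows how little there is to the format.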
5. Prometheus’ data model is very simple:
<identifier> → [ (t0, v0), (t1, v1), ... ]
Timestamps are millisecond int64, values are float64
https://www.slideshare.net/Docker/monitoring-the-prometheus-way-julius-voltz-prometheus
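The data model above can be sketched as a small Python class: an identifier (metric name plus labels) mapped to a list of (timestamp, value) samples. The Series class and its method names are hypothetical, chosen only for illustration; Prometheus itself stores this far more compactly.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Series:
    name: str
    labels: Dict[str, str]
    # Each sample is (timestamp in milliseconds as int, value as float).
    samples: List[Tuple[int, float]] = field(default_factory=list)

    def identifier(self) -> str:
        # The identifier is the metric name plus its sorted label pairs.
        body = ",".join(f'{k}="{v}"' for k, v in sorted(self.labels.items()))
        return f"{self.name}{{{body}}}"

    def append(self, timestamp_ms: int, value: float) -> None:
        # Samples are kept in strictly increasing timestamp order.
        if self.samples and timestamp_ms <= self.samples[-1][0]:
            raise ValueError("samples must be appended in timestamp order")
        self.samples.append((int(timestamp_ms), float(value)))


if __name__ == "__main__":
    series = Series("http_requests_total", {"job": "nginx", "path": "/home"})
    series.append(1000, 1.0)
    series.append(2000, 2.0)
    print(series.identifier(), series.samples)
```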
10. And aggregate by a dimension…
PromQL: sum by (path) (rate(http_requests_total{job="nginx", status=~"502"}[1m]))
{path="/home"} 0.0666
{path="/settings"} 3.3
...
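As a rough illustration of what this query computes, the sketch below takes a per-second rate for each counter series and then sums the rates that share a path label. It is deliberately simplified: simple_rate only divides the increase between the first and last sample by the elapsed time, and does not reproduce the counter-reset handling or extrapolation that PromQL's rate() performs. The function names and the sample data are assumptions.

```python
from collections import defaultdict


def simple_rate(samples):
    """Per-second increase across a window of (timestamp_ms, value) samples."""
    if len(samples) < 2:
        raise ValueError("need at least two samples to compute a rate")
    (t_first, v_first), (t_last, v_last) = samples[0], samples[-1]
    return (v_last - v_first) / ((t_last - t_first) / 1000.0)


def sum_by(label, series_rates):
    """Sum rates over all series that share the same value for `label`.

    series_rates is a list of (labels_dict, rate) pairs.
    """
    totals = defaultdict(float)
    for labels, rate in series_rates:
        totals[labels[label]] += rate
    return dict(totals)


if __name__ == "__main__":
    # Two hypothetical nginx instances serving the same path.
    window_a = [(0, 0.0), (60_000, 6.0)]
    window_b = [(0, 10.0), (60_000, 16.0)]
    rates = [
        ({"path": "/home", "instance": "a"}, simple_rate(window_a)),
        ({"path": "/home", "instance": "b"}, simple_rate(window_b)),
    ]
    print(sum_by("path", rates))
```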
11. Do binary operations…
PromQL: sum by (path) (rate(http_requests_total{job="nginx", status=~"502"}[1m]))
/
sum by (path) (rate(http_requests_total{job="nginx"}[1m]))
{path="/home"} 0.001
{path="/settings"} 1.0
...
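The division works because both sides carry the same path label after aggregation, so elements are matched one to one by label set. A minimal sketch of that matching follows. The divide_vectors name and the denominator values are assumptions for illustration, and the sketch skips zero denominators, whereas PromQL would return NaN or Inf for them.

```python
def divide_vectors(numerator, denominator):
    """Divide two instant vectors, matching elements on identical label sets.

    Each vector maps a tuple of (label, value) pairs to a float.
    Elements without a partner on the other side are dropped.
    """
    result = {}
    for labels, value in numerator.items():
        if labels in denominator and denominator[labels] != 0:
            result[labels] = value / denominator[labels]
    return result


if __name__ == "__main__":
    errors = {
        (("path", "/home"),): 0.0666,
        (("path", "/settings"),): 3.3,
    }
    # Hypothetical total request rates per path.
    totals = {
        (("path", "/home"),): 66.6,
        (("path", "/settings"),): 3.3,
        (("path", "/about"),): 5.0,  # no errors recorded, so no output element
    }
    for labels, ratio in divide_vectors(errors, totals).items():
        print(dict(labels), round(ratio, 4))
```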
12. Hashistack
● Consul: service discovery, K/V store, service mesh…
● Vault: secret management and automation
● Terraform: infrastructure config as code
13. • All HashiCorp products use github.com/armon/go-metrics for metrics.
• Exposes metrics in statsd, dogstatsd
• Prometheus support added in 2015
• …but not plumbed through in most products (until recently)
14. Consul
● Prometheus support exposed in 1.1.0 (hashicorp/consul#4014, hashicorp/consul#4016)
● Exposed metrics are being improved (hashicorp/consul#4042)
15. Consul (II)
● Alternatively use the statsd_exporter with a custom metrics mapping
● github.com/prometheus/statsd_exporter
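The idea behind a metrics mapping is to turn a dotted statsd name into a Prometheus metric name plus labels. The sketch below shows that idea in Python. It is not the statsd_exporter's configuration syntax or implementation, and the metric names, the MAPPINGS table and the map_metric helper are all hypothetical.

```python
import re


def compile_glob(pattern):
    # Each "*" captures exactly one dot-separated component of the name.
    parts = ["([^.]*)" if part == "*" else re.escape(part)
             for part in pattern.split(".")]
    return re.compile(r"\.".join(parts))


# (glob pattern, Prometheus name, label templates); "$1" is the first capture.
MAPPINGS = [
    ("myapp.http.*.*.latency", "myapp_http_latency",
     {"method": "$1", "endpoint": "$2"}),
]


def map_metric(statsd_name, mappings):
    """Return (prometheus_name, labels) for a dotted statsd metric name."""
    for pattern, prom_name, label_templates in mappings:
        match = compile_glob(pattern).fullmatch(statsd_name)
        if match:
            labels = {
                key: re.sub(r"\$(\d+)",
                            lambda ref: match.group(int(ref.group(1))),
                            template)
                for key, template in label_templates.items()
            }
            return prom_name, labels
    # Fallback for unmapped names: flatten dots into underscores, no labels.
    return statsd_name.replace(".", "_"), {}


if __name__ == "__main__":
    print(map_metric("myapp.http.GET.kv.latency", MAPPINGS))
    print(map_metric("myapp.other.thing", MAPPINGS))
```

Pulling components such as the method out of the name and into labels is what makes the resulting series usable with aggregations like sum by.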
17. Consul (III)
● Still don’t get the “operational” metrics we want.
● Use the consul_exporter: github.com/prometheus/consul_exporter
18. Consul Mixin
● Set of predefined alerts & dashboards for Prometheus / Grafana
● github.com/kausalco/public/consul-mixin
● Uses the “Prometheus Mixin” format:
● Design Doc
19. Vault
● No Prometheus metrics yet (hashicorp/vault#2937)
● Again, use statsd_exporter
● BUT statsd metrics aren’t very “operational”
● Use vault_exporter
● github.com/grapeshot/vault_exporter
20. Vault Mixin
● As per the Consul mixin, a set of alerts and dashboards for Prometheus & Grafana
● github.com/grapeshot/vault_exporter/vault-mixin
21. Terraform
● Terraform is a CLI tool: what does it mean to monitor that?
● I don’t have enough confidence to run it in CI/CD.
● But I do want to know if someone changes something and doesn’t apply it.
22. Terradiff
• Runs on k8s
• Pod containing:
• git-sync (github.com/kubernetes/git-sync)
• prom-run (github.com/tomwilkie/prom-run)
terraform plan -detailed-exitcode
https://www.weave.works/blog/provisioning-lifecycle-production-ready-kubernetes-cluster/
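The check hinges on the exit code of terraform plan -detailed-exitcode: 0 means the plan succeeded with no changes, 1 means an error, and 2 means the plan succeeded and there are changes to apply. A minimal sketch of a drift check built on that follows. The function names and status strings are hypothetical, and this is not how prom-run or Terradiff is implemented.

```python
import subprocess


def classify_plan_exit(code: int) -> str:
    """Map a `terraform plan -detailed-exitcode` exit code to a status."""
    if code == 0:
        return "in_sync"   # plan succeeded, nothing to change
    if code == 2:
        return "drift"     # plan succeeded, changes are pending
    return "error"         # exit code 1, or anything unexpected


def check_drift(workdir: str) -> str:
    """Run terraform plan in workdir (assumes terraform is on PATH and initialised)."""
    result = subprocess.run(
        ["terraform", "plan", "-detailed-exitcode", "-input=false", "-no-color"],
        cwd=workdir,
        capture_output=True,
        text=True,
    )
    return classify_plan_exit(result.returncode)


if __name__ == "__main__":
    for code in (0, 1, 2):
        print(code, classify_plan_exit(code))
```

Exposing the status as a metric lets an alert fire when "drift" persists, which covers the case of someone changing the config without applying it.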
23. Hashistack
● Consul: service discovery, K/V store, service mesh…
● Vault: secret management and automation
● Terraform: infrastructure config as code