4. Questions that come after:
It's up, but is it performant?
It's down, but is it down for everyone?
It's degraded, but are the users impacted?
Is it even relevant?
5. Metrics Monitoring
Gather fine-grained data at frequent intervals
Make them useful by labelling them; store them
Analyze them to understand what is going on
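As a rough illustration of those three steps, the sketch below gathers a few samples, labels them, stores them per series, then analyzes them. Everything in it (the `Sample` type, the metric name, the `dc` label, the values) is hypothetical, and a real time-series database does far more.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    name: str        # metric name (hypothetical)
    labels: tuple    # sorted (key, value) pairs describing the sample
    timestamp: int   # seconds
    value: float


def make_sample(name, labels, timestamp, value):
    return Sample(name, tuple(sorted(labels.items())), timestamp, value)


# Gather, label and store: one series per (name, labels) pair.
store = defaultdict(list)
for step, (paris, berlin) in enumerate([(100, 40), (160, 45), (220, 47)]):
    for dc, value in (("paris", paris), ("berlin", berlin)):
        s = make_sample("http_requests_total", {"dc": dc}, step * 10, value)
        store[(s.name, s.labels)].append(s)


def rate(series):
    """Average per-second increase between the first and last sample."""
    first, last = series[0], series[-1]
    return (last.value - first.value) / (last.timestamp - first.timestamp)


# Analyze: request rate per datacenter, using the label to split the data.
rates = {dict(labels)["dc"]: rate(series) for (_, labels), series in store.items()}
print(rates)
```

Without the `dc` label the two datacenters would be indistinguishable; with it, the same stored data answers both "how busy are we?" and "where?".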
6. We are in the cloud era.
Here are some buzzwords for you
cloud, API, openstack, devops, docker, bimodal,
stateless, kubernetes, orchestration, automation,
serverless, docker, humanops, ansible, continuous
deployment, cri-o, jenkins, agile, docker, red hat,
containers, virtualization, provisioning,
monitoring, observability...
11. We deserve better tools
Our customers ask us to respond fast, in
seconds
We make hundreds of operations per second
What is your monitoring frequency... 5
minutes?
14. Cloud Native
Easy to configure, deploy, maintain
Designed in multiple services
Container ready
Orchestration ready (dynamic config)
Fuzziness
15. Data Centric
A Metric in Prometheus has metadata:
mysql_global_status_handlers_total{handler="tmp_write"} 1122
And lots of functions to filter, change, remove... that metadata while fetching it.
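To make the idea of a metric carrying metadata concrete, here is a minimal sketch that splits such a line into name, labels and value, then filters, renames and drops labels. It is not how Prometheus does it internally: the parser is deliberately simplified (no escaped quotes in label values), and the helper names `keep_if`, `rename_label` and `drop_label` are made up for the example.

```python
import re

LINE = 'mysql_global_status_handlers_total{handler="tmp_write"} 1122'

SAMPLE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)$'
)
LABEL_RE = re.compile(r'(?P<key>[a-zA-Z_][a-zA-Z0-9_]*)="(?P<val>[^"]*)"')


def parse(line):
    """Split one exposition line into (name, labels, value)."""
    m = SAMPLE_RE.match(line)
    if m is None:
        raise ValueError(f"not a sample line: {line!r}")
    labels = {
        lm.group("key"): lm.group("val")
        for lm in LABEL_RE.finditer(m.group("labels") or "")
    }
    return m.group("name"), labels, float(m.group("value"))


def keep_if(sample, key, pattern):
    """Filter: True when the label value fully matches the pattern."""
    _, labels, _ = sample
    return re.fullmatch(pattern, labels.get(key, "")) is not None


def rename_label(sample, old, new):
    """Change: return a copy of the sample with one label renamed."""
    name, labels, value = sample
    labels = dict(labels)
    if old in labels:
        labels[new] = labels.pop(old)
    return name, labels, value


def drop_label(sample, key):
    """Remove: return a copy of the sample without the given label."""
    name, labels, value = sample
    return name, {k: v for k, v in labels.items() if k != key}, value


sample = parse(LINE)
print(sample)
print(keep_if(sample, "handler", r"tmp_.*"))
print(rename_label(sample, "handler", "operation"))
print(drop_label(sample, "handler"))
```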
16. Open Source
Apache 2.0
Go
Support for multiple OS
Many "exporters":
https://github.com/prometheus/prometheus/wiki/Default-port-allocations
17. Efficient
Prometheus is designed to fetch data at an interval measured in seconds
Millions of datapoints
Big improvements in Prometheus 2.0
18. Storage
Prometheus 2.x uses prometheus/tsdb
2-hour blocks; later compacted into blocks of up to 10 days
15000 metrics/s ends up at 800 MiB/day
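The storage figure can be turned into a per-sample cost with a little arithmetic. The sketch below assumes that "metrics/s" means samples ingested per second; the 30-day retention used at the end is a hypothetical value for illustration only.

```python
SAMPLES_PER_SECOND = 15_000
SECONDS_PER_DAY = 24 * 60 * 60
MIB = 1024 * 1024
BYTES_PER_DAY = 800 * MIB

# 15000 samples/s over one day.
samples_per_day = SAMPLES_PER_SECOND * SECONDS_PER_DAY

# On-disk cost of one sample, given 800 MiB/day.
bytes_per_sample = BYTES_PER_DAY / samples_per_day


def disk_gib(retention_days, mib_per_day=800):
    """Rough disk usage in GiB for a given retention (hypothetical helper)."""
    return retention_days * mib_per_day / 1024


print(samples_per_day)             # 1296000000
print(round(bytes_per_sample, 2))  # 0.65
print(disk_gib(30))                # 23.4375
```

That works out to well under one byte per sample on disk, which is what makes scrape intervals of a few seconds affordable.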
24. Exporters
Exporters expose metrics with an HTTP API
Bindings available for many languages
Exporters do not save data ; they are not
"proxies" and don't "cache" anything
36. Security
Prometheus supports TLS as a client (including client authentication)
Exporter side is your business
We use it with Traefik (reverse proxy in Go with native metrics)
We manage certs with ansible
40. What is the Alertmanager doing?
Receives alerts
Groups them
Inhibits them
Dispatches them
Deals with HA
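The receive, group, inhibit and dispatch steps can be sketched in a few lines. This is a toy model, not the Alertmanager's implementation: the inhibition rule (a critical alert mutes a warning with the same name and instance), the label names and the sample alerts are all assumptions made for the example.

```python
from collections import defaultdict


def inhibit(alerts):
    """Drop warnings when a critical alert with the same name and instance fires."""
    critical = {
        (a["alertname"], a["instance"]) for a in alerts if a["severity"] == "critical"
    }
    return [
        a for a in alerts
        if not (a["severity"] == "warning" and (a["alertname"], a["instance"]) in critical)
    ]


def group(alerts, by=("alertname",)):
    """Bundle alerts sharing the same values for the grouping labels."""
    groups = defaultdict(list)
    for a in alerts:
        groups[tuple(a[k] for k in by)].append(a)
    return dict(groups)


def dispatch(groups, send):
    """Send one notification per group instead of one per alert."""
    for key, members in groups.items():
        send(key, members)


received = [
    {"alertname": "DiskFull", "instance": "db1", "severity": "warning"},
    {"alertname": "DiskFull", "instance": "db1", "severity": "critical"},
    {"alertname": "DiskFull", "instance": "db2", "severity": "critical"},
    {"alertname": "HighLatency", "instance": "web1", "severity": "warning"},
]

notifications = []
dispatch(group(inhibit(received)), lambda key, members: notifications.append((key, len(members))))
print(notifications)
```

Four alerts come in, the redundant warning is inhibited, and only two notifications go out.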
41. How to alert?
Email
Some vendors: Slack, HipChat, VictorOps, PagerDuty, ...
Generic webhook -> plug in anything you want
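As a sketch of the "anything you want" end of a webhook, the code below turns a JSON body into one line per alert. It assumes a payload with an `alerts` list whose entries carry `status`, `labels` and `annotations`; the `summary` annotation and the handler class are hypothetical choices for the example, so check the payload your Alertmanager version actually sends.

```python
import json
from http.server import BaseHTTPRequestHandler


def summarize(body: bytes):
    """Turn a webhook JSON body into one human-readable line per alert."""
    payload = json.loads(body)
    lines = []
    for alert in payload.get("alerts", []):
        labels = alert.get("labels", {})
        summary = alert.get("annotations", {}).get("summary", "")
        status = alert.get("status", "unknown")
        lines.append(f'[{status}] {labels.get("alertname", "?")}: {summary}')
    return lines


class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        for line in summarize(self.rfile.read(length)):
            print(line)  # replace with a call to any system you like
        self.send_response(200)
        self.end_headers()
```

Only `summarize` holds logic; the handler can be served with `http.server.HTTPServer` on whatever address the webhook is configured to call.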
42. High Availability
2 Prometheus servers do exactly the same job
They send alerts to Alertmanagers
Alertmanagers are clustered so that the same notification is not sent twice
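The deduplication idea can be sketched as follows: two servers evaluate the same rules and emit the same alert, and the notification goes out once because the Alertmanagers share what has already been sent. The `NotificationLog` class is a hypothetical in-memory stand-in for that shared state; a real cluster replicates it between instances and also handles repeats, resolution and timing.

```python
import hashlib


def fingerprint(labels):
    """Stable identifier for an alert, derived from its sorted label set."""
    canonical = ",".join(f"{k}={v}" for k, v in sorted(labels.items()))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class NotificationLog:
    """Stands in for the state that clustered Alertmanagers share."""

    def __init__(self):
        self._sent = set()

    def should_notify(self, labels):
        fp = fingerprint(labels)
        if fp in self._sent:
            return False
        self._sent.add(fp)
        return True


shared_log = NotificationLog()

# The same alert, as sent by each of the two Prometheus servers.
from_server_a = {"alertname": "DiskFull", "instance": "db1"}
from_server_b = {"instance": "db1", "alertname": "DiskFull"}

sent = [shared_log.should_notify(a) for a in (from_server_a, from_server_b)]
print(sent)  # [True, False]
```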
44. Grafana
Open Source (Apache 2.0)
Web app
Specialized in visualization
Pluggable
Multiple datasources: Prometheus, Graphite, InfluxDB...
Has an API!
45. History of Grafana
Grafana is a fork of Kibana 3; it used to be JS-driven.
Now fully featured, requires a database, multi-project/multi-user support, etc...