4. Questions that come after:
It's up, but is it performant?
It's down, but for everyone?
It's degraded, but are the users impacted?
Is it even relevant?
5. Metrics Monitoring
e.g. traditionally Graphite
Gather fine-grained data at frequent intervals
Make them useful by labelling them; store them
Analyze them to understand what is going on
6. Metrics ARE PART OF monitoring
Do not maintain a metrics + a "traditional
monitoring" stack
Alert from metrics directly!
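Alerting from metrics means evaluating a condition on the samples you already collect, instead of running a separate check. A minimal Python sketch of that idea follows; the `should_alert` helper, the threshold and the sample data are hypothetical, and the "condition must hold for a while" behaviour is only an approximation of what an alerting rule does:

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (unix timestamp in seconds, value)


def should_alert(samples: List[Sample], threshold: float, hold_seconds: float) -> bool:
    """Return True if the latest samples stayed above `threshold`
    without interruption for at least `hold_seconds`."""
    breach_start = None
    for ts, value in sorted(samples):
        if value > threshold:
            if breach_start is None:
                breach_start = ts  # first sample of the current breach
        else:
            breach_start = None  # condition recovered, reset the clock
    if breach_start is None:
        return False
    last_ts = max(ts for ts, _ in samples)
    return last_ts - breach_start >= hold_seconds
```

With samples taken every 15 seconds, a short spike resets the clock and does not fire; only a sustained breach does.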
8. We are in the cloud era.
Here are some buzzwords for you
cloud, API, openstack, devops, docker, bimodal,
stateless, kubernetes, orchestration, automation,
serverless, docker, humanops, ansible, continuous
deployment, cri-o, jenkins, agile, docker, red hat,
containers, virtualization, provisioning, monitoring,
observability...
13. We need (and deserve) better tools
Our customers ask us to respond fast, in
seconds
We make hundreds of operations per second
What is your monitoring frequency... 5
minutes?
16. Cloud Native
Easy to configure, deploy, maintain
Designed as multiple services
Container ready
Orchestration ready (dynamic config)
Fuzziness
17. Data Centric
A Metric in Prometheus has metadata:
mysql_global_status_handlers_total{handler="tmp_write"} 1122
And lots of functions to filter, change, or remove
that metadata while fetching it.
=> OpenMetrics.io
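To make the "metric with metadata" idea concrete, here is a small Python sketch that splits one exposition line into name, labels and value. It is a simplification under stated assumptions: the `parse_sample` helper is hypothetical, and it ignores optional timestamps, escaped quotes in label values, and comment lines.

```python
import re
from typing import Dict, Tuple

# metric name, optional {labels}, then the sample value
_SAMPLE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)$'
)
_LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="([^"]*)"')


def parse_sample(line: str) -> Tuple[str, Dict[str, str], float]:
    """Split 'name{label="value"} 12' into (name, labels, value)."""
    match = _SAMPLE_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not a metric sample: {line!r}")
    labels = dict(_LABEL_RE.findall(match.group("labels") or ""))
    return match.group("name"), labels, float(match.group("value"))
```

Once the labels are a plain dictionary, selecting or dropping series by label is ordinary filtering, for example keeping only samples where `labels.get("handler") == "tmp_write"`.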
18. Open Source
Apache 2.0
Go
Support for multiple OS
Many "exporters":
https://github.com/prometheus/prometheus/wiki/Default-port-allocations
19. Simple
1 service = 1 thing
Takes care of its db (time based retention
and/or disk space based retention)
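In Prometheus the two retention modes are controlled by the `--storage.tsdb.retention.time` and `--storage.tsdb.retention.size` flags. The Python sketch below only illustrates how the two limits can combine; `blocks_to_delete` and the `(timestamp, size)` block model are hypothetical and are not how the TSDB is implemented:

```python
from typing import List, Tuple

Block = Tuple[float, int]  # (newest sample timestamp in seconds, size in bytes)


def blocks_to_delete(blocks: List[Block], now: float,
                     max_age: float, max_bytes: int) -> List[Block]:
    """Walk blocks from newest to oldest; once one block breaks either
    limit, that block and every older one is scheduled for deletion."""
    doomed: List[Block] = []
    kept_bytes = 0
    dropping = False
    for block in sorted(blocks, key=lambda b: b[0], reverse=True):
        newest_ts, size = block
        if not dropping:
            too_old = now - newest_ts > max_age
            too_big = kept_bytes + size > max_bytes
            dropping = too_old or too_big
        if dropping:
            doomed.append(block)
        else:
            kept_bytes += size
    return doomed
```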
26. Exporters
Exporters expose metrics with an HTTP API
Bindings available for many languages
Exporters do not save data; they are not
"proxies" and don't "cache" anything
42. What is the Alertmanager doing?
Receives alerts
Groups them
Inhibits them
Dispatches them
Deals with HA
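Grouping and inhibition can be sketched in a few lines of Python. This is a simplified model and not Alertmanager's implementation: alerts are plain label dictionaries, and the helper names (`group_alerts`, `inhibit`, `matches`) are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Alert = Dict[str, str]  # label name -> label value


def group_alerts(alerts: List[Alert], group_by: List[str]) -> Dict[Tuple[str, ...], List[Alert]]:
    """Bucket alerts that share the same values for the `group_by` labels."""
    groups = defaultdict(list)
    for alert in alerts:
        key = tuple(alert.get(label, "") for label in group_by)
        groups[key].append(alert)
    return dict(groups)


def matches(alert: Alert, matcher: Dict[str, str]) -> bool:
    return all(alert.get(k) == v for k, v in matcher.items())


def inhibit(alerts: List[Alert], source: Dict[str, str],
            target: Dict[str, str], equal: List[str]) -> List[Alert]:
    """Drop `target` alerts while a `source` alert with the same
    values for the `equal` labels is present."""
    sources = [a for a in alerts if matches(a, source)]
    kept = []
    for alert in alerts:
        muted = matches(alert, target) and any(
            all(alert.get(l) == s.get(l) for l in equal)
            for s in sources if s is not alert
        )
        if not muted:
            kept.append(alert)
    return kept
```

For example, a critical "instance down" alert in one cluster can mute the warnings from that same cluster, while warnings from other clusters still go out.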
43. How to alert?
Email
Some vendors: Slack, HipChat, VictorOps,
PagerDuty, ...
Generic Webhook -> Plug in anything you want
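The generic webhook delivers a JSON document over HTTP POST, so "plugging in anything" mostly means parsing that body. A Python sketch, assuming the payload carries an `alerts` list whose items have `status`, `labels` and `annotations`; the `summarize_webhook` helper and the use of a `summary` annotation are assumptions made for illustration:

```python
import json
from typing import List


def summarize_webhook(body: bytes) -> List[str]:
    """Turn a webhook request body into one human-readable line per alert."""
    payload = json.loads(body)
    lines = []
    for alert in payload.get("alerts", []):
        labels = alert.get("labels", {})
        annotations = alert.get("annotations", {})
        lines.append(
            "[{status}] {name}: {summary}".format(
                status=alert.get("status", "unknown"),
                name=labels.get("alertname", "unnamed"),
                summary=annotations.get("summary", "no summary"),
            )
        )
    return lines
```

The resulting lines can then be forwarded to a chat room, a ticketing system, or any other tool that has no native integration.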
44. High Availability
2 Prometheus servers do exactly the same job
They send alerts to Alertmanagers
Alertmanagers are clustered so that the same
notification is not sent twice
46. Grafana
Open Source (Apache 2.0)
Web app
Specialized in visualization
Pluggable
Multiple datasources: Prometheus, Graphite,
InfluxDB...
Has an API!
47. History of Grafana
Grafana is a fork of Kibana 3; it used to be JS-driven.
Now fully featured: requires a database, supports
multiple projects/users, etc.
55. Creating Grafana Dashboards
Takes time
Requires deep knowledge of the tools
Improved over time
Easy to share (json + online library)
Try grafonnet-lib!
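Because a dashboard is just JSON, it can also be generated from code, which is the idea behind grafonnet-lib. The Python sketch below builds a deliberately trimmed payload of the kind accepted by Grafana's `/api/dashboards/db` endpoint. The `build_dashboard` helper, the datasource name "Prometheus" and the panel layout are assumptions, and the exact fields expected vary between Grafana versions:

```python
import json
from typing import Any, Dict


def build_dashboard(title: str, expr: str) -> Dict[str, Any]:
    """Return a minimal one-panel dashboard wrapped for the HTTP API."""
    return {
        "dashboard": {
            "id": None,   # None lets Grafana create a new dashboard
            "uid": None,
            "title": title,
            "panels": [
                {
                    "id": 1,
                    "type": "graph",
                    "title": expr,
                    "datasource": "Prometheus",  # assumed datasource name
                    "gridPos": {"h": 8, "w": 24, "x": 0, "y": 0},
                    "targets": [{"expr": expr, "refId": "A"}],
                }
            ],
        },
        "overwrite": False,
    }


if __name__ == "__main__":
    payload = build_dashboard("MySQL handlers",
                              "rate(mysql_global_status_handlers_total[5m])")
    print(json.dumps(payload, indent=2))
```

Keeping the generator in version control makes dashboards reviewable and repeatable instead of hand-edited.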
56. Conclusion
Lots of data that can be explored in many ways
(subqueries are coming)
Trends and deviations are easy to calculate
Can monitor both business and technical metrics
Very convenient to monitor any kind of stack
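To show why trends and deviations are cheap to compute once the data is a time series, here is a Python sketch of two basic calculations. The helpers `per_second_rate` and `z_score` are hypothetical; PromQL's own `rate()` additionally handles counter resets and extrapolation, which this sketch ignores.

```python
from statistics import mean, pstdev
from typing import List, Tuple

Sample = Tuple[float, float]  # (unix timestamp in seconds, counter value)


def per_second_rate(samples: List[Sample]) -> float:
    """Average per-second increase between the first and last sample."""
    (t0, v0), (t1, v1) = samples[0], samples[-1]
    if t1 <= t0:
        raise ValueError("need at least two samples in ascending time order")
    return (v1 - v0) / (t1 - t0)


def z_score(history: List[float], current: float) -> float:
    """How many standard deviations `current` is away from the historical mean."""
    sigma = pstdev(history)
    if sigma == 0:
        return 0.0  # flat history: no meaningful deviation
    return (current - mean(history)) / sigma
```

A rate gives the trend of a counter, and a z-score on that rate is one simple way to flag a value that deviates from its recent history.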