Prometheus	
A next-generation monitoring system
History	
•  Started in 2012 by ex-Google Site Reliability Engineers
•  Written in Go
•  Inspired	by	Google’s	Borgmon	
– Borgmon	monitors	Borg	
•  Public	announcement	in	January	2015	
http://www.slideshare.net/FabianReinartz/prometheus-a-next-gen-monitoring-system-3
Features	
•  pull	architecture	
– easy to scale out
•  multi-dimensional data model
•  powerful	query	language
pull	architecture	
https://prometheus.io/docs/introduction/overview/
Demo	
•  node_exporter (machine metrics)
•  Prometheus server configuration
multi-dimensional data model
•  metric	types	
– counter	
– gauge	
– histogram	
– summary	
https://prometheus.io/docs/concepts/metric_types/
http://www.boxever.com/the-power-of-multi-dimensional-labels-in-prometheus
Counter	Java	Code	Example
Counter	Metric	Example	
...	
http_response_status_total{status="200"} @1460708848.148 277504
http_response_status_total{status="200"} @1460708843.148 277503
http_response_status_total{status="200"} @1460708838.148 277502
http_response_status_total{status="200"} @1460708833.148 277501
...
metric name | label name | timestamp | sample value
How to handle a counter metric
•  Do you reset the counter?
•  Do you compute a moving average?
http://www.robustperception.io/how-does-a-prometheus-counter-work/
No! Use the rate/irate/increase functions!
rate(http_response_status_total[1m])
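To make the idea behind rate concrete, here is a minimal Python sketch that computes a per-second rate from the counter samples shown earlier. It is a simplification, not the Prometheus implementation: the real function also extrapolates to the edges of the range window. The name simple_rate is hypothetical.

```python
from typing import List, Tuple

# One sample: (unix timestamp in seconds, counter value)
Sample = Tuple[float, float]


def simple_rate(samples: List[Sample]) -> float:
    """Per-second increase over the samples, oldest first.

    Simplified sketch: unlike Prometheus, no extrapolation to the
    window boundaries is done.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")

    increase = 0.0
    prev = samples[0][1]
    for _, value in samples[1:]:
        if value < prev:
            # A drop means the counter was reset (e.g. the process
            # restarted) and began again from zero, so the whole new
            # value counts as increase.
            increase += value
        else:
            increase += value - prev
        prev = value

    elapsed = samples[-1][0] - samples[0][0]
    if elapsed <= 0:
        raise ValueError("timestamps must be increasing")
    return increase / elapsed


# The four samples from the counter example, oldest first.
samples = [
    (1460708833.148, 277501),
    (1460708838.148, 277502),
    (1460708843.148, 277503),
    (1460708848.148, 277504),
]
per_second = simple_rate(samples)
```

Because a drop is treated as a restart from zero, the application never has to reset the counter itself or keep a moving average; the query side derives the rate from the raw, ever-increasing value.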
powerful	query	language	
sum by(status) (
  rate(http_response_status_total[1m])
)
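The aggregation in this query can be pictured as a group-by over label sets. The Python sketch below is only an illustration of that idea: the instance label, its values, and the per-series rates are made up, and sum_by is a hypothetical helper, not a Prometheus API.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# One series: (label set, current per-second rate)
Series = Tuple[Dict[str, str], float]


def sum_by(label: str, series: List[Series]) -> Dict[str, float]:
    """Sum values of all series that share the same value for `label`.

    Every other label is dropped. Series without the label are grouped
    under the empty string.
    """
    totals: Dict[str, float] = defaultdict(float)
    for labels, value in series:
        totals[labels.get(label, "")] += value
    return dict(totals)


# Hypothetical per-instance rates for http_response_status_total.
series = [
    ({"status": "200", "instance": "web-1"}, 0.2),
    ({"status": "200", "instance": "web-2"}, 0.3),
    ({"status": "500", "instance": "web-1"}, 0.01),
]
by_status = sum_by("status", series)
```

Applying rate first and summing afterwards matters: each series gets its own reset handling before the values are combined.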
ALERT	DiskWillFillIn4Hours	
		IF	predict_linear(node_filesystem_free{job='node'}[1h],	4*3600)	<	0	
		FOR	5m	
		LABELS	{	
				severity="page"	
		}	
http://www.robustperception.io/reduce-noise-from-disk-space-alerts/
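The alert relies on predict_linear, which fits a straight line to the samples in the range and extrapolates it. A rough Python sketch of that idea follows, using ordinary least squares. The function name, the sample data, and the choice to extrapolate from the last sample's timestamp are assumptions made for illustration, not the Prometheus implementation.

```python
from typing import List, Tuple

# One sample: (timestamp in seconds, gauge value such as free bytes)
Sample = Tuple[float, float]


def predict_linear_sketch(samples: List[Sample], seconds_ahead: float) -> float:
    """Fit a least-squares line through the samples and return its value
    `seconds_ahead` seconds after the last sample (an assumption of this sketch).
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")

    mean_t = sum(t for t, _ in samples) / n
    mean_v = sum(v for _, v in samples) / n
    cov = sum((t - mean_t) * (v - mean_v) for t, v in samples)
    var = sum((t - mean_t) ** 2 for t, _ in samples)
    if var == 0:
        raise ValueError("timestamps must differ")

    slope = cov / var
    intercept = mean_v - slope * mean_t
    target = samples[-1][0] + seconds_ahead
    return slope * target + intercept


# Made-up data: free space falls by 100 units every 10 minutes for 1 hour.
free_space = [(i * 600.0, 1000.0 - i * 100.0) for i in range(7)]

# Same condition as the alert: predicted value 4 hours ahead is below zero.
will_fill = predict_linear_sketch(free_space, 4 * 3600) < 0
```

Alerting on the predicted value rather than on a fixed threshold is what reduces noise: a nearly full but stable disk does not fire, while a disk that is filling quickly does.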
Demo(access	log)	
•  fluent-plugin-prometheus	
•  prometheus	query	
•  grafana	graph

Prometheus