Prometheus - Utah Software Architecture Meetup - Clint Checketts

Prometheus
Multi-dimensional metrics
Clint Checketts

Stats
Times at bat
Batting Average
Pitching speed
Assists
Bases run
Home runs
Wins and Losses

Application Stats
Deploy speed
Endpoint latency
Database performance
Error counts
Cache hits
Garbage collection
Outages

#gifee - Google Infrastructure for Everyone Else
• Kubernetes (Borg)
• Prometheus (Borgmon)
• OpenTracing (Dapper)

Why Prometheus? Ownership
• To ensure that engineers
  • have confidence in where the metrics come from,
  • can minimize friction in creating the dashboards they need,
  • and can improve the alerts that affect them
• Types of Metrics
  • Infrastructure
  • Application Performance
  • Feature Usage

Prometheus Pedigree
• Open Source (Apache 2)
• Cloud Native Computing Foundation
• Inspired by Google metrics ‘borgmon’
• Created Nov 2012

Prometheus Feature Set
• Metrics gathering
• Infrastructure exporters
• Application instrumentation
• Query language
• Alerting
• Graphing

Technical Summary
• Self-contained, very easy to run
• Doesn’t use external DB
• Can run even if everything else is on fire
• Very efficient memory/disk usage
• Pull model for monitoring
• Service discovery model to determine what to monitor (AWS, DNS, Kubernetes, etc.)
• Keeps active series and queries in memory
• Memory usage dictates scaling model

Metrics Collection
[Diagram: Grafana reads graph data from Prometheus; Prometheus collects metric data from the Java application and uses AWS and Kubernetes for service discovery.]

Alerting
[Diagram: Prometheus sends alerts to Alert Manager, which handles routing.]
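
Routing here means deciding which receiver gets an alert, based on the labels the alert carries. A minimal sketch of that idea follows. The route list, receiver names, and label values are all hypothetical, and the real Alert Manager is configured with a YAML routing tree rather than a flat list like this one.

```python
# Hypothetical routes: (labels that must match, receiver name).
ROUTES = [
    ({"severity": "page"}, "on-call-pager"),
    ({"team": "data"}, "data-team-chat"),
]
DEFAULT_RECEIVER = "email"  # hypothetical fallback receiver


def route(alert_labels):
    """Return the receiver of the first route whose labels all match."""
    for matchers, receiver in ROUTES:
        if all(alert_labels.get(key) == value for key, value in matchers.items()):
            return receiver
    return DEFAULT_RECEIVER


if __name__ == "__main__":
    alert = {"alertname": "HighLatency", "severity": "page", "team": "data"}
    print(route(alert))
```

Because the first matching route wins, the order of the list matters: the paging route is checked before the team route.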

Exporters
Allow Prometheus to scrape services that aren’t Prometheus-aware
Examples
Node Exporter
SNMP Exporter
MySQL Exporter
Jenkins Exporter
RabbitMQ Exporter

Pull Architecture
[Diagram: Grafana graphs from Prometheus; Prometheus collects from the application and discovers targets from Kubernetes.]
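
The pull model can be sketched with the Python standard library alone: the application only exposes its current values over HTTP, and the monitoring side decides when to fetch them. This is an illustration under assumptions, not Prometheus itself. The metric name `app_requests_total`, the fixed value, and the single-entry target list (standing in for service discovery) are all made up for the example.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

REQUESTS_TOTAL = 42  # hypothetical application counter value


class MetricsHandler(BaseHTTPRequestHandler):
    """Application side: serve the current metric values at /metrics."""

    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = f"app_requests_total {REQUESTS_TOTAL}\n".encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the example quiet


def scrape(host, port):
    """Monitoring side: pull one target and parse its `name value` lines."""
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        conn.request("GET", "/metrics")
        text = conn.getresponse().read().decode()
    finally:
        conn.close()
    samples = {}
    for line in text.splitlines():
        if line and not line.startswith("#"):
            name, value = line.rsplit(" ", 1)
            samples[name] = float(value)
    return samples


def demo():
    server = HTTPServer(("127.0.0.1", 0), MetricsHandler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        targets = [("127.0.0.1", server.server_port)]  # stand-in for service discovery
        return {target: scrape(*target) for target in targets}
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(demo())
```

The application never needs to know where the monitoring server lives, which is what lets the monitoring side add or drop targets through service discovery.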

Grafana
• Graphing/Dashboarding service
• Templates
• Multiple query overlays (offset queries)

Metric Types:
• counter - a value that only increases, for example total requests
  • inc()
• gauge - measures a value at a given time
  • inc() dec() set()
• histogram - observations counted into buckets, used to estimate quantiles
  • observe()
• summary - count and total of observations
  • startTimer() observe()
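
What each of these calls does can be sketched in plain Python. This is a minimal illustration and not the Prometheus client library: the class names, the snake_case `start_timer`, and the bucket bounds are assumptions made for the example.

```python
import bisect
import time


class Counter:
    """A value that only goes up, e.g. total requests served."""

    def __init__(self):
        self.value = 0.0

    def inc(self, amount=1.0):
        if amount < 0:
            raise ValueError("counters can only increase")
        self.value += amount


class Gauge:
    """A value measured at a point in time; it can go up or down."""

    def __init__(self):
        self.value = 0.0

    def inc(self, amount=1.0):
        self.value += amount

    def dec(self, amount=1.0):
        self.value -= amount

    def set(self, value):
        self.value = float(value)


class Histogram:
    """Counts observations into buckets and tracks their count and sum."""

    def __init__(self, bounds=(0.1, 0.5, 1.0)):  # hypothetical bucket bounds
        self.bounds = sorted(bounds)
        self.bucket_counts = [0] * (len(self.bounds) + 1)  # last slot is +Inf
        self.count = 0
        self.sum = 0.0

    def observe(self, value):
        # bisect_left finds the first bound that is >= value.
        self.bucket_counts[bisect.bisect_left(self.bounds, value)] += 1
        self.count += 1
        self.sum += value

    def cumulative(self):
        """Return (upper_bound, cumulative_count) pairs, ending with +Inf."""
        total, pairs = 0, []
        for bound, n in zip(self.bounds + [float("inf")], self.bucket_counts):
            total += n
            pairs.append((bound, total))
        return pairs


class Summary:
    """Tracks the count and total of observations, with a timer helper."""

    def __init__(self):
        self.count = 0
        self.sum = 0.0

    def observe(self, value):
        self.count += 1
        self.sum += value

    def start_timer(self):
        start = time.perf_counter()

        def stop():
            elapsed = time.perf_counter() - start
            self.observe(elapsed)
            return elapsed

        return stop
```

A typical use of the timer is `stop = latency.start_timer()` before handling a request and `stop()` afterwards, which records the elapsed seconds as one observation.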

How do I get this goodness?
• Client Libraries
• Exporters
• Baked into tools
  • Docker
  • Kubernetes

Metric names and labels
• Names
  • Explain what is being measured
  • Include the unit (or ‘count’)
• Labels (Examples)
  • Application name
  • Quantile
  • Endpoint
http_request_duration_sec{app="api", method="get", quantile="0.5", handler="/users", statusCode="200"} 0.2
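
A sample like this is a metric name, a set of label pairs, and a value. A minimal sketch of rendering such a line, assuming a hypothetical helper named `format_sample`. The escaping of backslash, double quote, and newline in label values follows the Prometheus text exposition format.

```python
def format_sample(name, labels, value):
    """Render one sample as `name{key="value",...} value`."""

    def escape(text):
        # Label values escape backslash, double quote, and newline.
        return text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")

    if not labels:
        return f"{name} {value}"
    pairs = ",".join(f'{key}="{escape(str(val))}"' for key, val in labels.items())
    return f"{name}{{{pairs}}} {value}"


if __name__ == "__main__":
    print(format_sample(
        "http_request_duration_sec",
        {"app": "api", "method": "get", "handler": "/users", "statusCode": "200"},
        0.2,
    ))
```

Each distinct combination of label values is its own time series, so labels with many possible values multiply the number of series kept in memory.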

PromQL examples
• All current request durations
  http_request_duration{app="apiContent"}
• How many envs in a pool are available?
  env_pool_count{envtype="domo/brief/master"} - env_initing_count{envtype="domo/brief/master"}
• What are my non-success request rates?
  sum(irate(http_request_duration_milliseconds_count{app="apiContent", statusCode!="200"}[1m])) by (statusCode) * 60
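
The last query relies on irate, which looks only at the two most recent samples in the range and divides the increase by the time between them. Multiplying by 60 turns that per-second rate into a per-minute one. A rough sketch of the arithmetic, assuming samples arrive as (timestamp in seconds, counter value) pairs. The function below is a stand-in written for this example, not Prometheus code, and the sample values are invented.

```python
def irate(samples):
    """Per-second rate from the last two (timestamp, value) samples."""
    if len(samples) < 2:
        return None  # not enough data in the range
    (t_prev, v_prev), (t_last, v_last) = samples[-2], samples[-1]
    elapsed = t_last - t_prev
    if elapsed <= 0:
        return None
    # A counter that went down was reset, so count the increase from zero.
    increase = v_last if v_last < v_prev else v_last - v_prev
    return increase / elapsed


# Hypothetical scrapes of a request counter, 15 seconds apart.
samples = [(0, 100), (15, 130), (30, 160)]
per_second = irate(samples)
per_minute = per_second * 60

if __name__ == "__main__":
    print(per_second, per_minute)
```

Because only the last two samples count, the result reacts quickly to spikes but is noisier than an average over the whole range.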

Summary
• Multi-dimensional metrics are powerful
• Library support is ready
• Go!
