Brian Brazil
Founder
Life of a Label
Who am I?
Engineer passionate about running software reliably in production.
Prometheus Core developer
Studied Computer Science in Trinity College Dublin.
Google SRE for 7 years, working on high-scale reliable systems.
Contributor to many open source projects, including Prometheus, Ansible,
Python, Aurora and Zookeeper.
Founder of Robust Perception, provider of commercial support and consulting
for Prometheus.
Who am I really?
The guy responsible for relabelling.
Instrumentation Labels
Instrumentation labels distinguish things happening at the code level within
one metric.
E.g. GET vs POST, Visa vs Mastercard.
Could be from a random library, could be from business logic.
Instrumentation Example
from prometheus_client import Counter
c = Counter('my_requests_total', 'HTTP Failures',
            ['method', 'endpoint'])
c.labels('GET', '/').inc()
my_requests_total{method="GET", endpoint="/"} 1.0
So you've got these labels in your app...
You've gone and instrumented your application.
Well done!
Gold Star!
So how do you get those into Prometheus from your live systems?
Where to find targets to monitor
In a push system targets decide what monitoring systems to talk to.
With Prometheus it's the other way around, allowing each team to choose what
they want to monitor.
This means though that we need a way to find our targets. Listing them all by
hand is not likely to end well.
Enter service discovery.
Service Discovery
There are many supported SD methods: static configs, file, EC2, Consul,
Kubernetes, DNS, Azure, Twitter Serverset and AirBnB Nerve.
File SD reads YAML or JSON files off local disk. Inotify is used to pick up changes
automatically. Intended as a hook for when other options don't suit you.
All others work by asking some system for targets over the network.
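A minimal sketch of feeding file SD from a script, assuming the JSON layout is a list of target groups, each with "targets" and "labels" keys. The function name, output path and label values are hypothetical. Writing to a temporary file and renaming it into place avoids Prometheus picking up a half-written file.

```python
import json
import os
import tempfile


def write_file_sd(path, groups):
    """Atomically write target groups as a file SD JSON document."""
    directory = os.path.dirname(os.path.abspath(path))
    # Write to a temp file in the same directory, then rename over the target.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as f:
        json.dump(groups, f, indent=2)
    os.replace(tmp_path, path)


# Hypothetical targets and labels, for illustration only.
groups = [
    {"targets": ["10.0.0.1:9100", "10.0.0.2:9100"],
     "labels": {"env": "prod", "job": "node"}},
]

if __name__ == "__main__":
    out = os.path.join(tempfile.gettempdir(), "targets.json")
    write_file_sd(out, groups)
    with open(out) as f:
        print(f.read())
```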
You've a list of hosts, now what?
So you've got a list of all instances from an EC2 region.
That's fine for monitoring the node exporter that runs everywhere, but how about
services that run on only some machines?
Need a way to select which machines to scrape.
And here's the crux
Different organisations do it different ways. Some may use the Name tag, others
VPC IDs. It's rarely even consistent within a team, let alone an organisation.
(There are hardcoded assumptions almost every time SD mechanisms are added)
We could represent it as a set of config options, but with so many variants it'd
quickly become unwieldy with likely hundreds of interacting config options.
Instead we have relabelling.
What do we want to allow?
What is relabelling?
Relabelling is a way to take in metadata about a target, and based on that select
which targets to scrape.
You can also use it to choose your target labels, as by default you'll only get an
__address__/instance label.
Choosing Targets
How do you allow for any potential way of going from arbitrary metadata?
When there's little to no structure, regexes are a good choice.
relabel_configs:
  - source_labels: ["__meta_consul_tags"]
    regex: ".*,production,.*"
    action: keep
Keep and drop
The simplest relabel actions are keep and drop.
If the given regex matches with keep, the target continues on.
If it doesn't match, processing halts and we try the next target.
drop is the other way round, halting processing if the regex matches.
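The keep and drop semantics can be sketched in a few lines of Python. This is an illustration, not Prometheus code: it assumes the regex must match the whole value (relabelling regexes are anchored), uses Python's re module rather than Go's RE2, and the tag strings are hypothetical.

```python
import re


def apply_filter(action, regex, value):
    """Return True if the target continues, False if processing halts."""
    # fullmatch models the anchoring of relabelling regexes at both ends.
    matched = re.fullmatch(regex, value) is not None
    if action == "keep":
        return matched
    if action == "drop":
        return not matched
    raise ValueError("unsupported action: " + action)


# Hypothetical Consul tag strings for two targets.
tags_prod = ",web,production,"
tags_dev = ",web,dev,"

print(apply_filter("keep", ".*,production,.*", tags_prod))  # True
print(apply_filter("keep", ".*,production,.*", tags_dev))   # False
print(apply_filter("drop", ".*,production,.*", tags_prod))  # False
```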
What if you want to match against two labels?
source_labels is a list so you can specify as many labels as you like.
Results will be concatenated, separated by a semicolon.
The separator can be changed via the separator option.
Missing labels will have an empty string.
For more complex rules, relabel_configs is a list so you can add as many
actions as you like! All actions are applied until a keep or drop halts it.
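As a rough sketch of how several source labels and several rules combine, the following joins the source label values with the separator and walks the rule list until a keep or drop halts it. The rule dictionaries mirror the YAML fields; the label values are hypothetical and Python's re stands in for the real regex engine.

```python
import re


def source_value(labels, source_labels, separator=";"):
    """Join the source label values; a missing label counts as empty."""
    return separator.join(labels.get(name, "") for name in source_labels)


def should_scrape(labels, rules):
    """Apply keep/drop rules in order; stop at the first one that halts."""
    for rule in rules:
        value = source_value(labels, rule["source_labels"],
                             rule.get("separator", ";"))
        matched = re.fullmatch(rule["regex"], value) is not None
        if rule["action"] == "keep" and not matched:
            return False
        if rule["action"] == "drop" and matched:
            return False
    return True


rules = [
    {"source_labels": ["__meta_consul_tags"],
     "regex": ".*,production,.*", "action": "keep"},
    {"source_labels": ["__meta_consul_service", "__meta_consul_dc"],
     "regex": "batch;.*", "action": "drop"},
]

# Hypothetical target metadata.
web = {"__meta_consul_tags": ",production,",
       "__meta_consul_service": "web", "__meta_consul_dc": "dub"}
print(should_scrape(web, rules))
```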
Label handling
This is relatively simple so far. We have an SD, we use regexes to pick which
targets to scrape.
The real power of Prometheus is labels combined with the query language.
Wouldn't it be nice to be able to make some targets have labels like env="prod"
or env="dev" and aggregate across those?
Munging labels
The core of relabelling is the replace action.
It applies a regex to the source_labels. If it matches, it interpolates the regex
match groups into replacement and writes the result to the target_label.
An empty label means the label is removed. __meta labels are discarded after
relabelling.
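A minimal Python sketch of the replace action follows, to make the steps concrete. It is an illustration under assumptions: Python's re module stands in for the real regex engine, the $1 / ${1} references are translated to Python's group syntax by a helper written for this example, and the label values are hypothetical. The defaults mirror the documented ones (regex "(.*)", replacement "$1", separator ";").

```python
import re


def to_python_template(replacement):
    """Translate $1 / ${1} references into Python's group-reference form."""
    return re.sub(r"\$\{(\w+)\}|\$(\w+)",
                  lambda m: r"\g<%s>" % (m.group(1) or m.group(2)),
                  replacement)


def replace(labels, source_labels, target_label,
            regex="(.*)", replacement="$1", separator=";"):
    """Return a new label set with the replace action applied."""
    value = separator.join(labels.get(n, "") for n in source_labels)
    match = re.fullmatch(regex, value)
    if match is None:
        return dict(labels)  # no match: labels are left untouched
    result = dict(labels)
    new_value = match.expand(to_python_template(replacement))
    if new_value == "":
        result.pop(target_label, None)  # an empty value removes the label
    else:
        result[target_label] = new_value
    return result


def finish(labels):
    """Discard __meta labels once relabelling is done."""
    return {k: v for k, v in labels.items() if not k.startswith("__meta_")}


# Hypothetical target: derive env="prod" from the EC2 Name tag.
labels = {"__address__": "10.0.0.1:9100", "__meta_ec2_tag_Name": "web-prod"}
out = replace(labels, ["__meta_ec2_tag_Name"], "env",
              regex=".*-(prod|dev)", replacement="${1}")
print(finish(out))
```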
This is all simple in theory.
It gets complicated when you try to map your view of the world into Prometheus
labels working off whatever metadata you have.
Example: job name in EC2 Name tag
relabel_configs:
  - source_labels: ["__meta_ec2_tag_Name"]
    regex: "(.*)"
    action: replace
    replacement: "${1}"
    target_label: "job"
Defaults to make things simpler
A label copy is very common, so the defaults reduce this to two lines:
relabel_configs:
  - source_labels: ["__meta_ec2_tag_Name"]
    target_label: "job"
Instance label
The label that SD returns with the host:port is __address__.
If no instance label is present by the end of relabelling, it defaults to the value of
__address__.
This means that you can have the instance label be something more meaningful
than a host:port - such as an EC2 instance id or Zookeeper path.
Avoid adding other labels as readable instance names; it'll break sharing of
expressions based on without.
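The instance default can be sketched as a final step after relabelling. This is an illustrative Python model, not Prometheus code; the function name and the label values are hypothetical.

```python
def finalize_instance(labels):
    """Default the instance label to __address__ if relabelling left it unset."""
    result = dict(labels)
    if not result.get("instance"):
        result["instance"] = result["__address__"]
    return result


# No instance label set during relabelling: falls back to host:port.
plain = finalize_instance({"__address__": "10.0.0.1:9100"})

# Relabelling copied a (hypothetical) EC2 instance id into instance.
named = finalize_instance({"__address__": "10.0.0.1:9100",
                           "instance": "i-0abc123"})
print(plain["instance"], named["instance"])
```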
Other labels
Many other settings are also configurable via relabelling.
scheme, metrics_path and params are just defaults, so whether to use http or
https could come from service discovery.
For params only the first value of each URL parameter is available for relabelling.
This is how the blackbox and SNMP exporters work, changing what would
normally be an __address__ label into a URL parameter.
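To illustrate the blackbox-style pattern, the sketch below models the special labels __scheme__, __address__, __metrics_path__ and __param_<name> as a plain dictionary and builds the resulting scrape URL. The helper names, the exporter address and the probed target are assumptions made for this example.

```python
from urllib.parse import urlencode


def scrape_url(labels):
    """Build the scrape URL from the special labels left after relabelling."""
    params = {name[len("__param_"):]: value
              for name, value in labels.items()
              if name.startswith("__param_")}
    url = "%s://%s%s" % (labels.get("__scheme__", "http"),
                         labels["__address__"],
                         labels.get("__metrics_path__", "/metrics"))
    if params:
        url += "?" + urlencode(sorted(params.items()))
    return url


def blackbox_relabel(labels, exporter_address):
    """Move the discovered address into a URL parameter, then point
    __address__ at the exporter itself."""
    result = dict(labels)
    result["__param_target"] = result["__address__"]
    result["instance"] = result["__param_target"]
    result["__address__"] = exporter_address
    return result


# Hypothetical discovered target and exporter address.
discovered = {"__address__": "example.com:443",
              "__metrics_path__": "/probe",
              "__param_module": "http_2xx"}
probed = blackbox_relabel(discovered, "127.0.0.1:9115")
print(scrape_url(probed))
```

The instance label keeps the name of the thing being probed, while the scrape itself goes to the exporter.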
Other relabel actions
There are two more relabel actions for advanced use cases.
labelmap copies labels based on regex substitution.
It's different in that regex and replacement apply to the label names, not the
label values. Useful if you've a set of key/value tags you want to copy wholesale
without listing every individual label in the relabel config.
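A small Python sketch of labelmap, assuming the regex is matched against each label name in full and the replacement produces the new label name. Python's re module stands in for the real regex engine, and the tag names and values are hypothetical.

```python
import re


def labelmap(labels, regex, replacement="$1"):
    """Copy every label whose name matches regex to a new, rewritten name."""
    # Translate $1 / ${1} references into Python's group-reference form.
    template = re.sub(r"\$\{(\w+)\}|\$(\w+)",
                      lambda m: r"\g<%s>" % (m.group(1) or m.group(2)),
                      replacement)
    result = dict(labels)
    for name, value in labels.items():
        match = re.fullmatch(regex, name)
        if match is not None:
            result[match.expand(template)] = value
    return result


# Hypothetical EC2 tags copied wholesale into target labels.
labels = {"__address__": "10.0.0.1:9100",
          "__meta_ec2_tag_team": "payments",
          "__meta_ec2_tag_env": "prod"}
mapped = labelmap(labels, "__meta_ec2_tag_(.+)")
print(mapped["team"], mapped["env"])
```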
hashmod is used with keep for sharding. It takes a modulus of a hash of the
source labels and puts it in the target label as an integer.
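Sharding with hashmod and keep can be sketched as below. The choice of MD5 and of which digest bytes to use is an assumption of this sketch; any stable hash shows the idea, and the values will not necessarily agree with what a real Prometheus server computes. Target addresses are hypothetical.

```python
import hashlib


def hashmod(labels, source_labels, modulus, separator=";"):
    """Hash the joined source labels and reduce to the range 0..modulus-1."""
    value = separator.join(labels.get(n, "") for n in source_labels)
    digest = hashlib.md5(value.encode("utf-8")).digest()
    # Use the low 8 bytes as an unsigned integer (an assumption of this sketch).
    return int.from_bytes(digest[8:], "big") % modulus


def shard_keeps(labels, shard, total_shards):
    """Model hashmod into a temporary label followed by keep on the shard number."""
    return hashmod(labels, ["__address__"], total_shards) == shard


targets = [{"__address__": "10.0.0.%d:9100" % i} for i in range(20)]
for shard in range(3):
    kept = [t["__address__"] for t in targets if shard_keeps(t, shard, 3)]
    print(shard, len(kept))
```

Each target lands on exactly one shard, so running one Prometheus per shard number covers every target once.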
Other notes on labels: Dealing with label clashes
If there's a clash with instrumentation labels, the target label takes precedence.
The scraped label will be prefixed with exported_
This behaviour can be changed with honor_labels: true, which makes the
scraped label win and discards the target label.
In addition, with honor_labels an empty scraped label will remove the target
label, including instance. Use this for the Pushgateway and other places where
you don't want an instance label.
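The clash handling can be modelled as a merge of two label dictionaries. This is a sketch of the behaviour described above rather than Prometheus code; the function name and the label values are hypothetical.

```python
def merge_labels(target_labels, scraped_labels, honor_labels=False):
    """Combine target labels with the labels found on a scraped sample."""
    result = dict(target_labels)
    for name, value in scraped_labels.items():
        clash = name in target_labels
        if clash and not honor_labels:
            # Target label wins; the scraped label is kept under a new name.
            name = "exported_" + name
        if value == "":
            result.pop(name, None)  # an empty value means "no such label"
        else:
            result[name] = value
    return result


# Hypothetical Pushgateway scrape.
target = {"job": "pushgateway", "instance": "pg:9091"}
scraped = {"job": "batch", "instance": "", "method": "GET"}

print(merge_labels(target, scraped))
print(merge_labels(target, scraped, honor_labels=True))
```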
Metric relabel configs
Sometimes you need to temporarily change the scraped metrics while waiting for
instrumentation to be fixed.
metric_relabel_configs apply to all scraped samples just before they're
added to the database. Could use it to drop expensive metrics, or fix a label value.
Beware using expensive or extensive rules as it's applied to every sample.
As up isn't a scraped metric, only relabel_configs apply.
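As an illustration of dropping an expensive metric, the sketch below filters scraped samples on the __name__ label, which carries the metric name. The sample structure, function name and metric names are assumptions for this example; note that the regex is evaluated once per sample, which is where the cost comes from.

```python
import re


def drop_metrics(samples, name_regex):
    """Return the samples whose metric name does not fully match name_regex."""
    pattern = re.compile(name_regex)
    return [s for s in samples
            if pattern.fullmatch(s["labels"]["__name__"]) is None]


# Hypothetical scraped samples.
samples = [
    {"labels": {"__name__": "my_requests_total", "method": "GET"}, "value": 1.0},
    {"labels": {"__name__": "my_debug_bucket", "le": "0.1"}, "value": 7.0},
    {"labels": {"__name__": "my_debug_bucket", "le": "+Inf"}, "value": 9.0},
]

kept = drop_metrics(samples, "my_debug_.*")
print([s["labels"]["__name__"] for s in kept])
```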
From alerts...
ALERT MyExampleAlert
  IF rate(my_requests_total[5m]) < 10
  FOR 5m
  LABELS { severity = "page" }
The alert will have method and endpoint labels from the alert expression and a
new severity label.
External labels and alert relabelling are applied too.
...to the Alertmanager
Just as with the rest of the Prometheus stack, labels are core to the Alertmanager.
A tree uses labels to route alerts into groups. Each team can have their own route!
Each group can choose which labels to fan-out notifications by, reducing spam.
Silences are specified using labels, suppressing precisely the alerts you want.
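Grouping and silencing by label can be sketched with plain dictionaries. This is a simplified model, not the Alertmanager's implementation: alerts are reduced to their label sets, only equality matchers are handled (the real Alertmanager also supports regex matchers), and all names and values are hypothetical.

```python
from collections import defaultdict


def group_alerts(alerts, group_by):
    """Bucket alerts by the values of the chosen labels."""
    groups = defaultdict(list)
    for alert in alerts:
        key = tuple(alert.get(name, "") for name in group_by)
        groups[key].append(alert)
    return dict(groups)


def is_silenced(alert, matchers):
    """A silence applies when every matcher equals the alert's label value."""
    return all(alert.get(name, "") == value
               for name, value in matchers.items())


# Hypothetical firing alerts, represented by their labels.
alerts = [
    {"alertname": "MyExampleAlert", "env": "prod", "method": "GET"},
    {"alertname": "MyExampleAlert", "env": "prod", "method": "POST"},
    {"alertname": "MyExampleAlert", "env": "dev", "method": "GET"},
]

groups = group_alerts(alerts, ["alertname", "env"])
print({key: len(members) for key, members in groups.items()})
print([is_silenced(a, {"env": "dev"}) for a in alerts])
```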
Summary
Instrumentation labels come from the application.
Service discovery creates targets.
Relabelling filters targets and adds target labels to make them meaningful for you.
Metric relabel configs apply to scraped time series.
Alerts can add labels before sending to the Alertmanager.
The Alertmanager uses labels for routing, grouping, deduplication and silencing.
Resources
Official Docs: https://prometheus.io/docs/operating/configuration/
Label flow: http://www.robustperception.io/life-of-a-label/
Target label best practices: http://www.robustperception.io/target-labels-are-for-life-not-just-for-christmas/
Robust Perception Blog: www.robustperception.io/blog
Queries: prometheus@robustperception.io

More Related Content

What's hot

Systems Monitoring with Prometheus (Devops Ireland April 2015)
Systems Monitoring with Prometheus (Devops Ireland April 2015)Systems Monitoring with Prometheus (Devops Ireland April 2015)
Systems Monitoring with Prometheus (Devops Ireland April 2015)
Brian Brazil
 
Prometheus design and philosophy
Prometheus design and philosophy   Prometheus design and philosophy
Prometheus design and philosophy
Docker, Inc.
 
Counting with Prometheus (CloudNativeCon+Kubecon Europe 2017)
Counting with Prometheus (CloudNativeCon+Kubecon Europe 2017)Counting with Prometheus (CloudNativeCon+Kubecon Europe 2017)
Counting with Prometheus (CloudNativeCon+Kubecon Europe 2017)
Brian Brazil
 
Provisioning and Capacity Planning (Travel Meets Big Data)
Provisioning and Capacity Planning (Travel Meets Big Data)Provisioning and Capacity Planning (Travel Meets Big Data)
Provisioning and Capacity Planning (Travel Meets Big Data)
Brian Brazil
 
Monitoring microservices with Prometheus
Monitoring microservices with PrometheusMonitoring microservices with Prometheus
Monitoring microservices with Prometheus
Tobias Schmidt
 
Prometheus (Monitorama 2016)
Prometheus (Monitorama 2016)Prometheus (Monitorama 2016)
Prometheus (Monitorama 2016)
Brian Brazil
 
Your data is in Prometheus, now what? (CurrencyFair Engineering Meetup, 2016)
Your data is in Prometheus, now what? (CurrencyFair Engineering Meetup, 2016)Your data is in Prometheus, now what? (CurrencyFair Engineering Meetup, 2016)
Your data is in Prometheus, now what? (CurrencyFair Engineering Meetup, 2016)
Brian Brazil
 
OpenMetrics: What Does It Mean for You (PromCon 2019, Munich)
OpenMetrics: What Does It Mean for You (PromCon 2019, Munich)OpenMetrics: What Does It Mean for You (PromCon 2019, Munich)
OpenMetrics: What Does It Mean for You (PromCon 2019, Munich)
Brian Brazil
 
Prometheus with Grafana - AddWeb Solution
Prometheus with Grafana - AddWeb SolutionPrometheus with Grafana - AddWeb Solution
Prometheus with Grafana - AddWeb Solution
AddWeb Solution Pvt. Ltd.
 
Prometheus for Monitoring Metrics (Fermilab 2018)
Prometheus for Monitoring Metrics (Fermilab 2018)Prometheus for Monitoring Metrics (Fermilab 2018)
Prometheus for Monitoring Metrics (Fermilab 2018)
Brian Brazil
 
Staleness and Isolation in Prometheus 2.0 (PromCon 2017)
Staleness and Isolation in Prometheus 2.0 (PromCon 2017)Staleness and Isolation in Prometheus 2.0 (PromCon 2017)
Staleness and Isolation in Prometheus 2.0 (PromCon 2017)
Brian Brazil
 
What does "monitoring" mean? (FOSDEM 2017)
What does "monitoring" mean? (FOSDEM 2017)What does "monitoring" mean? (FOSDEM 2017)
What does "monitoring" mean? (FOSDEM 2017)
Brian Brazil
 
End to-end monitoring with the prometheus operator - Max Inden
End to-end monitoring with the prometheus operator - Max IndenEnd to-end monitoring with the prometheus operator - Max Inden
End to-end monitoring with the prometheus operator - Max Inden
Paris Container Day
 
Prometheus Overview
Prometheus OverviewPrometheus Overview
Prometheus Overview
Brian Brazil
 
Evolving Prometheus for the Cloud Native World (FOSDEM 2018)
Evolving Prometheus for the Cloud Native World (FOSDEM 2018)Evolving Prometheus for the Cloud Native World (FOSDEM 2018)
Evolving Prometheus for the Cloud Native World (FOSDEM 2018)
Brian Brazil
 
Prometheus lightning talk (Devops Dublin March 2015)
Prometheus lightning talk (Devops Dublin March 2015)Prometheus lightning talk (Devops Dublin March 2015)
Prometheus lightning talk (Devops Dublin March 2015)
Brian Brazil
 
No C-QL (Or how I learned to stop worrying, and love eventual consistency) (N...
No C-QL (Or how I learned to stop worrying, and love eventual consistency) (N...No C-QL (Or how I learned to stop worrying, and love eventual consistency) (N...
No C-QL (Or how I learned to stop worrying, and love eventual consistency) (N...
Brian Brazil
 
Evolution of the Prometheus TSDB (Percona Live Europe 2017)
Evolution of the Prometheus TSDB  (Percona Live Europe 2017)Evolution of the Prometheus TSDB  (Percona Live Europe 2017)
Evolution of the Prometheus TSDB (Percona Live Europe 2017)
Brian Brazil
 
Gatling workshop lets test17
Gatling workshop lets test17Gatling workshop lets test17
Gatling workshop lets test17
Gerald Muecke
 
The hitchhiker’s guide to Prometheus
The hitchhiker’s guide to PrometheusThe hitchhiker’s guide to Prometheus
The hitchhiker’s guide to Prometheus
Bol.com Techlab
 

What's hot (20)

Systems Monitoring with Prometheus (Devops Ireland April 2015)
Systems Monitoring with Prometheus (Devops Ireland April 2015)Systems Monitoring with Prometheus (Devops Ireland April 2015)
Systems Monitoring with Prometheus (Devops Ireland April 2015)
 
Prometheus design and philosophy
Prometheus design and philosophy   Prometheus design and philosophy
Prometheus design and philosophy
 
Counting with Prometheus (CloudNativeCon+Kubecon Europe 2017)
Counting with Prometheus (CloudNativeCon+Kubecon Europe 2017)Counting with Prometheus (CloudNativeCon+Kubecon Europe 2017)
Counting with Prometheus (CloudNativeCon+Kubecon Europe 2017)
 
Provisioning and Capacity Planning (Travel Meets Big Data)
Provisioning and Capacity Planning (Travel Meets Big Data)Provisioning and Capacity Planning (Travel Meets Big Data)
Provisioning and Capacity Planning (Travel Meets Big Data)
 
Monitoring microservices with Prometheus
Monitoring microservices with PrometheusMonitoring microservices with Prometheus
Monitoring microservices with Prometheus
 
Prometheus (Monitorama 2016)
Prometheus (Monitorama 2016)Prometheus (Monitorama 2016)
Prometheus (Monitorama 2016)
 
Your data is in Prometheus, now what? (CurrencyFair Engineering Meetup, 2016)
Your data is in Prometheus, now what? (CurrencyFair Engineering Meetup, 2016)Your data is in Prometheus, now what? (CurrencyFair Engineering Meetup, 2016)
Your data is in Prometheus, now what? (CurrencyFair Engineering Meetup, 2016)
 
OpenMetrics: What Does It Mean for You (PromCon 2019, Munich)
OpenMetrics: What Does It Mean for You (PromCon 2019, Munich)OpenMetrics: What Does It Mean for You (PromCon 2019, Munich)
OpenMetrics: What Does It Mean for You (PromCon 2019, Munich)
 
Prometheus with Grafana - AddWeb Solution
Prometheus with Grafana - AddWeb SolutionPrometheus with Grafana - AddWeb Solution
Prometheus with Grafana - AddWeb Solution
 
Prometheus for Monitoring Metrics (Fermilab 2018)
Prometheus for Monitoring Metrics (Fermilab 2018)Prometheus for Monitoring Metrics (Fermilab 2018)
Prometheus for Monitoring Metrics (Fermilab 2018)
 
Staleness and Isolation in Prometheus 2.0 (PromCon 2017)
Staleness and Isolation in Prometheus 2.0 (PromCon 2017)Staleness and Isolation in Prometheus 2.0 (PromCon 2017)
Staleness and Isolation in Prometheus 2.0 (PromCon 2017)
 
What does "monitoring" mean? (FOSDEM 2017)
What does "monitoring" mean? (FOSDEM 2017)What does "monitoring" mean? (FOSDEM 2017)
What does "monitoring" mean? (FOSDEM 2017)
 
End to-end monitoring with the prometheus operator - Max Inden
End to-end monitoring with the prometheus operator - Max IndenEnd to-end monitoring with the prometheus operator - Max Inden
End to-end monitoring with the prometheus operator - Max Inden
 
Prometheus Overview
Prometheus OverviewPrometheus Overview
Prometheus Overview
 
Evolving Prometheus for the Cloud Native World (FOSDEM 2018)
Evolving Prometheus for the Cloud Native World (FOSDEM 2018)Evolving Prometheus for the Cloud Native World (FOSDEM 2018)
Evolving Prometheus for the Cloud Native World (FOSDEM 2018)
 
Prometheus lightning talk (Devops Dublin March 2015)
Prometheus lightning talk (Devops Dublin March 2015)Prometheus lightning talk (Devops Dublin March 2015)
Prometheus lightning talk (Devops Dublin March 2015)
 
No C-QL (Or how I learned to stop worrying, and love eventual consistency) (N...
No C-QL (Or how I learned to stop worrying, and love eventual consistency) (N...No C-QL (Or how I learned to stop worrying, and love eventual consistency) (N...
No C-QL (Or how I learned to stop worrying, and love eventual consistency) (N...
 
Evolution of the Prometheus TSDB (Percona Live Europe 2017)
Evolution of the Prometheus TSDB  (Percona Live Europe 2017)Evolution of the Prometheus TSDB  (Percona Live Europe 2017)
Evolution of the Prometheus TSDB (Percona Live Europe 2017)
 
Gatling workshop lets test17
Gatling workshop lets test17Gatling workshop lets test17
Gatling workshop lets test17
 
The hitchhiker’s guide to Prometheus
The hitchhiker’s guide to PrometheusThe hitchhiker’s guide to Prometheus
The hitchhiker’s guide to Prometheus
 

Similar to Life of a Label (PromCon2016, Berlin)

Salesforce integration questions
Salesforce integration questionsSalesforce integration questions
Salesforce integration questions
Debabrat Rout
 
VB.net
VB.netVB.net
VB.net
PallaviKadam
 
Practical catalyst
Practical catalystPractical catalyst
Practical catalyst
dwm042
 
Lecture: Refactoring
Lecture: RefactoringLecture: Refactoring
Lecture: Refactoring
Marcus Denker
 
Instant DBMS Homework Help
Instant DBMS Homework HelpInstant DBMS Homework Help
Instant DBMS Homework Help
Database Homework Help
 
Design Patterns
Design PatternsDesign Patterns
Design Patterns
imedo.de
 
Code review
Code reviewCode review
Code review
Abhishek Sur
 
On Coding Guidelines
On Coding GuidelinesOn Coding Guidelines
On Coding Guidelines
DIlawar Singh
 
Talend Open Studio Fundamentals #1: Workspaces, Jobs, Metadata and Trips & Tr...
Talend Open Studio Fundamentals #1: Workspaces, Jobs, Metadata and Trips & Tr...Talend Open Studio Fundamentals #1: Workspaces, Jobs, Metadata and Trips & Tr...
Talend Open Studio Fundamentals #1: Workspaces, Jobs, Metadata and Trips & Tr...
Gabriele Baldassarre
 
How to ace your .NET technical interview :: .Net Technical Check Tuneup
How to ace your .NET technical interview :: .Net Technical Check TuneupHow to ace your .NET technical interview :: .Net Technical Check Tuneup
How to ace your .NET technical interview :: .Net Technical Check Tuneup
Bala Subra
 
Stop hardcoding follow parameterization
Stop hardcoding  follow parameterizationStop hardcoding  follow parameterization
Stop hardcoding follow parameterization
Preeti Sagar
 
Custom Metadata Records Deployment From Apex Code
Custom Metadata Records Deployment From Apex CodeCustom Metadata Records Deployment From Apex Code
Custom Metadata Records Deployment From Apex Code
Bohdan Dovhań
 
The Ultimate Guide to Ad0 e103 adobe experience manager sites developer
The Ultimate Guide to Ad0 e103 adobe experience manager sites developerThe Ultimate Guide to Ad0 e103 adobe experience manager sites developer
The Ultimate Guide to Ad0 e103 adobe experience manager sites developer
ShoanSharma
 
Finding Your Way: Understanding Magento Code
Finding Your Way: Understanding Magento CodeFinding Your Way: Understanding Magento Code
Finding Your Way: Understanding Magento Code
Ben Marks
 
Best practices in enterprise applications
Best practices in enterprise applicationsBest practices in enterprise applications
Best practices in enterprise applicationsChandra Sekhar Saripaka
 
Back-2-Basics: .NET Coding Standards For The Real World
Back-2-Basics: .NET Coding Standards For The Real WorldBack-2-Basics: .NET Coding Standards For The Real World
Back-2-Basics: .NET Coding Standards For The Real World
David McCarter
 
C# interview
C# interviewC# interview
C# interview
ajeesharakkal
 
Task Perform addition subtraction division and multiplic.pdf
Task Perform addition subtraction division and multiplic.pdfTask Perform addition subtraction division and multiplic.pdf
Task Perform addition subtraction division and multiplic.pdf
acsmadurai
 
Php
PhpPhp
Back-2-Basics: .NET Coding Standards For The Real World
Back-2-Basics: .NET Coding Standards For The Real WorldBack-2-Basics: .NET Coding Standards For The Real World
Back-2-Basics: .NET Coding Standards For The Real World
David McCarter
 

Similar to Life of a Label (PromCon2016, Berlin) (20)

Salesforce integration questions
Salesforce integration questionsSalesforce integration questions
Salesforce integration questions
 
VB.net
VB.netVB.net
VB.net
 
Practical catalyst
Practical catalystPractical catalyst
Practical catalyst
 
Lecture: Refactoring
Lecture: RefactoringLecture: Refactoring
Lecture: Refactoring
 
Instant DBMS Homework Help
Instant DBMS Homework HelpInstant DBMS Homework Help
Instant DBMS Homework Help
 
Design Patterns
Design PatternsDesign Patterns
Design Patterns
 
Code review
Code reviewCode review
Code review
 
On Coding Guidelines
On Coding GuidelinesOn Coding Guidelines
On Coding Guidelines
 
Talend Open Studio Fundamentals #1: Workspaces, Jobs, Metadata and Trips & Tr...
Talend Open Studio Fundamentals #1: Workspaces, Jobs, Metadata and Trips & Tr...Talend Open Studio Fundamentals #1: Workspaces, Jobs, Metadata and Trips & Tr...
Talend Open Studio Fundamentals #1: Workspaces, Jobs, Metadata and Trips & Tr...
 
How to ace your .NET technical interview :: .Net Technical Check Tuneup
How to ace your .NET technical interview :: .Net Technical Check TuneupHow to ace your .NET technical interview :: .Net Technical Check Tuneup
How to ace your .NET technical interview :: .Net Technical Check Tuneup
 
Stop hardcoding follow parameterization
Stop hardcoding  follow parameterizationStop hardcoding  follow parameterization
Stop hardcoding follow parameterization
 
Custom Metadata Records Deployment From Apex Code
Custom Metadata Records Deployment From Apex CodeCustom Metadata Records Deployment From Apex Code
Custom Metadata Records Deployment From Apex Code
 
The Ultimate Guide to Ad0 e103 adobe experience manager sites developer
The Ultimate Guide to Ad0 e103 adobe experience manager sites developerThe Ultimate Guide to Ad0 e103 adobe experience manager sites developer
The Ultimate Guide to Ad0 e103 adobe experience manager sites developer
 
Finding Your Way: Understanding Magento Code
Finding Your Way: Understanding Magento CodeFinding Your Way: Understanding Magento Code
Finding Your Way: Understanding Magento Code
 
Best practices in enterprise applications
Best practices in enterprise applicationsBest practices in enterprise applications
Best practices in enterprise applications
 
Back-2-Basics: .NET Coding Standards For The Real World
Back-2-Basics: .NET Coding Standards For The Real WorldBack-2-Basics: .NET Coding Standards For The Real World
Back-2-Basics: .NET Coding Standards For The Real World
 
C# interview
C# interviewC# interview
C# interview
 
Task Perform addition subtraction division and multiplic.pdf
Task Perform addition subtraction division and multiplic.pdfTask Perform addition subtraction division and multiplic.pdf
Task Perform addition subtraction division and multiplic.pdf
 
Php
PhpPhp
Php
 
Back-2-Basics: .NET Coding Standards For The Real World
Back-2-Basics: .NET Coding Standards For The Real WorldBack-2-Basics: .NET Coding Standards For The Real World
Back-2-Basics: .NET Coding Standards For The Real World
 

More from Brian Brazil

Evolution of Monitoring and Prometheus (Dublin 2018)
Evolution of Monitoring and Prometheus (Dublin 2018)Evolution of Monitoring and Prometheus (Dublin 2018)
Evolution of Monitoring and Prometheus (Dublin 2018)
Brian Brazil
 
Evaluating Prometheus Knowledge in Interviews (PromCon 2018)
Evaluating Prometheus Knowledge in Interviews (PromCon 2018)Evaluating Prometheus Knowledge in Interviews (PromCon 2018)
Evaluating Prometheus Knowledge in Interviews (PromCon 2018)
Brian Brazil
 
Anatomy of a Prometheus Client Library (PromCon 2018)
Anatomy of a Prometheus Client Library (PromCon 2018)Anatomy of a Prometheus Client Library (PromCon 2018)
Anatomy of a Prometheus Client Library (PromCon 2018)
Brian Brazil
 
Rule 110 for Prometheus (PromCon 2017)
Rule 110 for Prometheus (PromCon 2017)Rule 110 for Prometheus (PromCon 2017)
Rule 110 for Prometheus (PromCon 2017)
Brian Brazil
 
Prometheus: From Berlin to Bonanza (Keynote CloudNativeCon+Kubecon Europe 2017)
Prometheus:  From Berlin to Bonanza (Keynote CloudNativeCon+Kubecon Europe 2017)Prometheus:  From Berlin to Bonanza (Keynote CloudNativeCon+Kubecon Europe 2017)
Prometheus: From Berlin to Bonanza (Keynote CloudNativeCon+Kubecon Europe 2017)
Brian Brazil
 
An Introduction to Prometheus (GrafanaCon 2016)
An Introduction to Prometheus (GrafanaCon 2016)An Introduction to Prometheus (GrafanaCon 2016)
An Introduction to Prometheus (GrafanaCon 2016)
Brian Brazil
 
Prometheus - Open Source Forum Japan
Prometheus  - Open Source Forum JapanPrometheus  - Open Source Forum Japan
Prometheus - Open Source Forum Japan
Brian Brazil
 
Monitoring What Matters: The Prometheus Approach to Whitebox Monitoring (Berl...
Monitoring What Matters: The Prometheus Approach to Whitebox Monitoring (Berl...Monitoring What Matters: The Prometheus Approach to Whitebox Monitoring (Berl...
Monitoring What Matters: The Prometheus Approach to Whitebox Monitoring (Berl...
Brian Brazil
 
An Exploration of the Formal Properties of PromQL
An Exploration of the Formal Properties of PromQLAn Exploration of the Formal Properties of PromQL
An Exploration of the Formal Properties of PromQL
Brian Brazil
 
Prometheus (Prometheus London, 2016)
Prometheus (Prometheus London, 2016)Prometheus (Prometheus London, 2016)
Prometheus (Prometheus London, 2016)
Brian Brazil
 
Microservices and Prometheus (Microservices NYC 2016)
Microservices and Prometheus (Microservices NYC 2016)Microservices and Prometheus (Microservices NYC 2016)
Microservices and Prometheus (Microservices NYC 2016)
Brian Brazil
 
Monitoring Kubernetes with Prometheus (Kubernetes Ireland, 2016)
Monitoring Kubernetes with Prometheus (Kubernetes Ireland, 2016)Monitoring Kubernetes with Prometheus (Kubernetes Ireland, 2016)
Monitoring Kubernetes with Prometheus (Kubernetes Ireland, 2016)
Brian Brazil
 
Prometheus (Microsoft, 2016)
Prometheus (Microsoft, 2016)Prometheus (Microsoft, 2016)
Prometheus (Microsoft, 2016)
Brian Brazil
 
Prometheus: A Next Generation Monitoring System (FOSDEM 2016)
Prometheus: A Next Generation Monitoring System (FOSDEM 2016)Prometheus: A Next Generation Monitoring System (FOSDEM 2016)
Prometheus: A Next Generation Monitoring System (FOSDEM 2016)
Brian Brazil
 
Better Monitoring for Python: Inclusive Monitoring with Prometheus (Pycon Ire...
Better Monitoring for Python: Inclusive Monitoring with Prometheus (Pycon Ire...Better Monitoring for Python: Inclusive Monitoring with Prometheus (Pycon Ire...
Better Monitoring for Python: Inclusive Monitoring with Prometheus (Pycon Ire...
Brian Brazil
 

More from Brian Brazil (15)

Evolution of Monitoring and Prometheus (Dublin 2018)
Evolution of Monitoring and Prometheus (Dublin 2018)Evolution of Monitoring and Prometheus (Dublin 2018)
Evolution of Monitoring and Prometheus (Dublin 2018)
 
Evaluating Prometheus Knowledge in Interviews (PromCon 2018)
Evaluating Prometheus Knowledge in Interviews (PromCon 2018)Evaluating Prometheus Knowledge in Interviews (PromCon 2018)
Evaluating Prometheus Knowledge in Interviews (PromCon 2018)
 
Anatomy of a Prometheus Client Library (PromCon 2018)
Anatomy of a Prometheus Client Library (PromCon 2018)Anatomy of a Prometheus Client Library (PromCon 2018)
Anatomy of a Prometheus Client Library (PromCon 2018)
 
Rule 110 for Prometheus (PromCon 2017)
Rule 110 for Prometheus (PromCon 2017)Rule 110 for Prometheus (PromCon 2017)
Rule 110 for Prometheus (PromCon 2017)
 
Prometheus: From Berlin to Bonanza (Keynote CloudNativeCon+Kubecon Europe 2017)
Prometheus:  From Berlin to Bonanza (Keynote CloudNativeCon+Kubecon Europe 2017)Prometheus:  From Berlin to Bonanza (Keynote CloudNativeCon+Kubecon Europe 2017)
Prometheus: From Berlin to Bonanza (Keynote CloudNativeCon+Kubecon Europe 2017)
 
An Introduction to Prometheus (GrafanaCon 2016)
An Introduction to Prometheus (GrafanaCon 2016)An Introduction to Prometheus (GrafanaCon 2016)
An Introduction to Prometheus (GrafanaCon 2016)
 
Prometheus - Open Source Forum Japan
Prometheus  - Open Source Forum JapanPrometheus  - Open Source Forum Japan
Prometheus - Open Source Forum Japan
 
Monitoring What Matters: The Prometheus Approach to Whitebox Monitoring (Berl...
Monitoring What Matters: The Prometheus Approach to Whitebox Monitoring (Berl...Monitoring What Matters: The Prometheus Approach to Whitebox Monitoring (Berl...
Monitoring What Matters: The Prometheus Approach to Whitebox Monitoring (Berl...
 
An Exploration of the Formal Properties of PromQL
An Exploration of the Formal Properties of PromQLAn Exploration of the Formal Properties of PromQL
An Exploration of the Formal Properties of PromQL
 
Prometheus (Prometheus London, 2016)
Prometheus (Prometheus London, 2016)Prometheus (Prometheus London, 2016)
Prometheus (Prometheus London, 2016)
 
Microservices and Prometheus (Microservices NYC 2016)
Microservices and Prometheus (Microservices NYC 2016)Microservices and Prometheus (Microservices NYC 2016)
Microservices and Prometheus (Microservices NYC 2016)
 
Monitoring Kubernetes with Prometheus (Kubernetes Ireland, 2016)
Monitoring Kubernetes with Prometheus (Kubernetes Ireland, 2016)Monitoring Kubernetes with Prometheus (Kubernetes Ireland, 2016)
Monitoring Kubernetes with Prometheus (Kubernetes Ireland, 2016)
 
Prometheus (Microsoft, 2016)
Prometheus (Microsoft, 2016)Prometheus (Microsoft, 2016)
Prometheus (Microsoft, 2016)
 
Prometheus: A Next Generation Monitoring System (FOSDEM 2016)
Prometheus: A Next Generation Monitoring System (FOSDEM 2016)Prometheus: A Next Generation Monitoring System (FOSDEM 2016)
Prometheus: A Next Generation Monitoring System (FOSDEM 2016)
 
Better Monitoring for Python: Inclusive Monitoring with Prometheus (Pycon Ire...
Better Monitoring for Python: Inclusive Monitoring with Prometheus (Pycon Ire...Better Monitoring for Python: Inclusive Monitoring with Prometheus (Pycon Ire...
Better Monitoring for Python: Inclusive Monitoring with Prometheus (Pycon Ire...
 

Life of a Label (PromCon2016, Berlin)

  • 2. Who am I? Engineer passionate about running software reliably in production. Prometheus core developer. Studied Computer Science at Trinity College Dublin. Google SRE for 7 years, working on high-scale reliable systems. Contributor to many open source projects, including Prometheus, Ansible, Python, Aurora and Zookeeper. Founder of Robust Perception, provider of commercial support and consulting for Prometheus.
  • 3. Who am I really? The guy responsible for relabelling.
  • 4. Instrumentation Labels Instrumentation labels are to distinguish things happening at the code level inside one metric. E.g. GET vs POST, Visa vs Mastercard. Could be from a random library, could be from business logic.
  • 5. Instrumentation Example
      from prometheus_client import Counter
      c = Counter('my_requests_total', 'HTTP Failures',
                  ['method', 'endpoint'])
      c.labels('GET', '/').inc()

      my_requests_total{method="GET", endpoint="/"} 1.0
  • 6. So you've got these labels in your app... You've gone and instrumented your application. Well done! Gold Star! So how do you get those into Prometheus from your live systems?
  • 7. Where to find targets to monitor In a push system targets decide what monitoring systems to talk to. With Prometheus it's the other way around, allowing each team to choose what they want to monitor. This means, though, that we need a way to find our targets. Listing them all by hand is not likely to end well. Enter service discovery.
  • 8. Service Discovery There are many supported SD methods: static configs, file, EC2, Consul, Kubernetes, DNS, Azure, Twitter Serverset and AirBnB Nerve. File SD reads YAML or JSON files off local disk. Inotify is used to pick up changes automatically. It is intended as a hook for when other options don't suit you. All others work by asking some system for targets over the network.
  • 9. You've a list of hosts, now what? So you've got a list of all instances from an EC2 region. That's fine for monitoring the node exporter that runs everywhere, but how about services that run on only some machines? Need a way to select which machines to scrape.
  • 10. And here's the crux Different organisations do it different ways. Some may use the Name tag, others VPC IDs. It's rarely even consistent within a team, let alone an organisation. (There are hardcoded assumptions almost every time SD mechanisms are added.) We could represent it as a set of config options, but with so many variants it'd quickly become unwieldy, with likely hundreds of interacting config options. Instead we have relabelling.
  • 11. What do we want to allow?
  • 12. What is relabelling? Relabelling is a way to take in metadata about a target, and based on that select which targets to scrape. You can also use it to choose your target labels, as by default you'll only get an __address__/instance label.
  • 13. Choosing Targets How do you allow for any potential way of going from arbitrary metadata to a set of targets? When there's little to no structure, regexes are a good choice.
      relabel_configs:
      - source_labels: ["__meta_consul_tags"]
        regex: ".*,production,.*"
        action: keep
  • 14. Keep and drop The simplest relabel actions are keep and drop. If the given regex matches with keep, the target continues on. If it doesn't match, processing halts and we try the next target. drop is the other way round, halting processing if the regex matches.
  • 15. What if you want to match against two labels? source_labels is a list so you can specify as many labels as you like. Their values will be concatenated, separated by a semicolon. The separator can be changed via the separator option. Missing labels are treated as an empty string. For more complex rules, relabel_configs is a list so you can add as many actions as you like! All actions are applied in order until a keep or drop halts processing.
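The keep and drop behaviour over several source labels can be sketched in Python. This is a minimal illustration, not Prometheus's own code: the function names and the dictionary rule format are assumptions made for the example. It relies on the fact that Prometheus relabel regexes are anchored, so the regex has to match the whole joined value.

```python
import re

def joined_value(labels, source_labels, separator=";"):
    # Missing labels contribute an empty string.
    return separator.join(labels.get(name, "") for name in source_labels)

def survives(labels, rules):
    """Return True if the target passes every keep/drop rule, in order."""
    for rule in rules:
        value = joined_value(labels, rule["source_labels"],
                             rule.get("separator", ";"))
        # Relabel regexes are anchored: they must match the whole value.
        matched = re.fullmatch(rule["regex"], value) is not None
        if rule["action"] == "keep" and not matched:
            return False  # keep halts processing on a non-match
        if rule["action"] == "drop" and matched:
            return False  # drop halts processing on a match
    return True

# Hypothetical target metadata, shaped like Consul SD labels.
target = {"__meta_consul_tags": ",web,production,", "__meta_consul_dc": "dc1"}
rules = [
    {"source_labels": ["__meta_consul_tags", "__meta_consul_dc"],
     "regex": ".*,production,.*;dc1", "action": "keep"},
    {"source_labels": ["__meta_consul_tags"],
     "regex": ".*,canary,.*", "action": "drop"},
]
print(survives(target, rules))
```

Here the first rule matches against the two values joined as ",web,production,;dc1", which is why the regex contains the semicolon separator.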
  • 16. Label handling This is relatively simple so far. We have an SD, we use regexes to pick which targets to scrape. The real power of Prometheus is labels combined with the query language. Wouldn't it be nice to be able to make some targets have labels like env="prod" or env="dev" and aggregate across those?
  • 17. Munging labels The core of relabelling is the replace action. It applies a regex to the source_labels and, if it matches, interpolates the regex match groups into replacement and writes the result to the target_label. An empty result means the label is removed. __meta labels are discarded after relabelling. This is all simple in theory. It gets complicated when you try to map your view of the world into Prometheus labels, working off whatever metadata you have.
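A rough Python model of the replace action may make the steps concrete. It is a sketch under stated assumptions: apply_replace is a hypothetical helper, and the translation of $1 / ${1} references into Python's group syntax is a simplification that ignores escaping edge cases. The defaults used here (regex "(.*)", replacement "$1", separator ";") mirror Prometheus's documented defaults.

```python
import re

def apply_replace(labels, source_labels, target_label,
                  regex="(.*)", replacement="$1", separator=";"):
    """Return a new label dict with one replace rule applied."""
    value = separator.join(labels.get(n, "") for n in source_labels)
    match = re.fullmatch(regex, value)
    if match is None:
        return dict(labels)  # no match: labels are left untouched
    # Translate $1 / ${1} group references into Python's \g<1> form.
    template = re.sub(r"\$\{?(\w+)\}?", r"\\g<\1>", replacement)
    result = match.expand(template)
    out = dict(labels)
    if result == "":
        out.pop(target_label, None)  # an empty value removes the label
    else:
        out[target_label] = result
    return out

# Copy a (hypothetical) EC2 Name tag into the job label using the defaults.
print(apply_replace({"__meta_ec2_tag_Name": "webserver"},
                    ["__meta_ec2_tag_Name"], "job"))
```

With the defaults, a rule that names only source_labels and target_label acts as a plain label copy.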
  • 18. Example: job name in EC2 Name tag
      relabel_configs:
      - source_labels: ["__meta_ec2_tag_Name"]
        regex: "(.*)"
        action: replace
        replacement: "${1}"
        target_label: "job"
  • 19. Defaults to make things simpler A label copy is very common, so the defaults reduce this to two lines:
      relabel_configs:
      - source_labels: ["__meta_ec2_tag_Name"]
        target_label: "job"
  • 20. Instance label The label that SD returns with the host:port is __address__. If no instance label is present by the end of relabelling, it defaults to the value of __address__. This means that you can have the instance label be something more meaningful than a host:port, such as an EC2 instance id or Zookeeper path. Avoid adding other labels as readable instance names; it'll break the sharing of without-based expressions.
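The instance default, together with the discarding of __meta labels mentioned earlier, could be modelled as a final step after all relabel rules have run. This is an illustrative sketch only; finalise_target_labels is a hypothetical name, and the label values are made up.

```python
def finalise_target_labels(labels):
    """Apply the end-of-relabelling steps to a target's labels."""
    out = dict(labels)
    # If relabelling did not set instance, fall back to __address__.
    if not out.get("instance"):
        out["instance"] = out.get("__address__", "")
    # Service discovery metadata is discarded once relabelling is done.
    return {k: v for k, v in out.items() if not k.startswith("__meta_")}

print(finalise_target_labels({
    "__address__": "10.0.0.1:9100",
    "__meta_ec2_instance_id": "i-0abc",  # hypothetical metadata value
}))
```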
  • 21. Other labels Many other settings are also configurable via relabelling. scheme, metrics_path and params are just defaults, so whether to use http or https could come from service discovery. For params only the first value of each URL parameter is available for relabelling. This is how the blackbox and SNMP exporters work, changing what would normally be an __address__ label into a URL parameter.
  • 22. Other relabel actions There are two more relabel actions for advanced use cases. labelmap copies labels based on regex substitution. It's different in that regex and replacement apply to the label names, not the label values. Useful if you've a set of key/value tags you want to copy wholesale without listing every individual label in the relabel config. hashmod is used with keep for sharding. It takes a modulus of a hash of the source labels and puts it in the target label as an integer.
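Both actions can be approximated in a few lines of Python. Treat this as a sketch with assumptions: the function names are hypothetical, and the choice of MD5 with the low 64 bits of the digest is an assumption about the hash, used here only so that the result is stable across runs.

```python
import hashlib
import re

def labelmap(labels, regex, replacement="$1"):
    """Copy labels whose *names* match regex to new names."""
    template = re.sub(r"\$\{?(\w+)\}?", r"\\g<\1>", replacement)
    out = dict(labels)
    for name, value in labels.items():
        match = re.fullmatch(regex, name)
        if match:
            out[match.expand(template)] = value
    return out

def hashmod(labels, source_labels, target_label, modulus, separator=";"):
    """Write hash(source labels) % modulus into target_label."""
    value = separator.join(labels.get(n, "") for n in source_labels)
    digest = hashlib.md5(value.encode("utf-8")).digest()
    # Assumption: take the low 64 bits of the MD5 digest as the hash.
    shard = int.from_bytes(digest[8:], "big") % modulus
    out = dict(labels)
    out[target_label] = str(shard)
    return out

labels = {"__address__": "10.0.0.1:9100", "__meta_ec2_tag_team": "infra"}
print(labelmap(labels, "__meta_ec2_tag_(.+)"))
print(hashmod(labels, ["__address__"], "__tmp_hash", 4))
```

For sharding, each Prometheus server would follow the hashmod rule with a keep rule whose regex is its own shard number, so every target is scraped by exactly one server.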
  • 23. Other notes on labels: Dealing with label clashes If there's a clash with instrumentation labels, the target label takes precedence. The scraped label will be prefixed with exported_. This behaviour can be changed with honor_labels: true, which makes the scraped label win and discards the target label. In addition, an empty scraped label will remove labels, including instance. Use this for the Pushgateway and other places where you don't want an instance label.
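The clash rules can be written out as a small merge function. This is a simplified model of the behaviour described on the slide, not the actual scrape code; merge_labels and the sample label values are made up for illustration.

```python
def merge_labels(target_labels, scraped_labels, honor_labels=False):
    """Combine target labels with the labels of one scraped sample."""
    out = dict(target_labels)
    for name, value in scraped_labels.items():
        if honor_labels:
            if value == "":
                out.pop(name, None)  # empty scraped label removes the label
            else:
                out[name] = value    # scraped label wins
        elif value != "":
            # Target label wins; a clashing scraped label is renamed.
            key = "exported_" + name if name in target_labels else name
            out[key] = value
    return out

target = {"job": "api", "instance": "10.0.0.1:9100"}
print(merge_labels(target, {"job": "batch", "method": "GET"}))
print(merge_labels(target, {"job": "batch", "instance": ""}, honor_labels=True))
```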
  • 24. Metric relabel configs Sometimes you need to temporarily change the scraped metrics while waiting for instrumentation to be fixed. metric_relabel_configs apply to all scraped samples just before they're added to the database. Could use it to drop expensive metrics, or fix a label value. Beware using expensive or extensive rules as it's applied to every sample. As up isn't a scraped metric, only relabel_configs apply.
  • 25. From alerts...
      ALERT MyExampleAlert
        IF rate(my_requests_total[5m]) < 10
        FOR 5m
        LABELS { severity = "page" }
    The alert will have method and endpoint labels from the alert expression and a new severity label. External labels and alert relabelling are applied too.
  • 26. ...to the Alertmanager Just as with the rest of the Prometheus stack, labels are core to the Alertmanager. A tree uses labels to route alerts into groups. Each team can have their own route! Each group can choose which labels to fan-out notifications by, reducing spam. Silences are specified using labels, suppressing precisely the alerts you want.
  • 27. Summary Instrumentation labels come from the application. Service discovery creates targets. Relabelling filters targets and adds target labels to make them meaningful for you. Metric relabel configs apply to scraped time series. Alerts can add labels before sending to the Alertmanager. The Alertmanager uses labels for routing, grouping, deduplication and silencing.
  • 28. Resources
      Official Docs: https://prometheus.io/docs/operating/configuration/
      Label flow: http://www.robustperception.io/life-of-a-label/
      Target label best practices: http://www.robustperception.io/target-labels-are-for-life-not-just-for-christmas/
      Robust Perception Blog: www.robustperception.io/blog
      Queries: prometheus@robustperception.io