SlideShare a Scribd company logo
1 of 53
Monitoring your App in
Kubernetes with Prometheus
Jeff Hoffer, Developer Experience
github.com/eudaimos
What does Weave do?
Weave helps devops
iterate faster with:
• observability &
monitoring
• continuous delivery
• container networks &
firewalls
Use Prometheus to
power our Monitoring
solution
What does Weave do?
Weave helps devops
iterate faster with:
• observability &
monitoring
• continuous delivery
• container networks &
firewalls
Use Prometheus to
power our Monitoring
solution
Agenda
1. Prometheus concepts: data model & PromQL
2. Prometheus architecture & pull model
3. Why Prometheus & Kubernetes are a good fit
4. What is Cortex?
5. Kubernetes recap
6. Training on real app
7. What’s next?
Prometheus
Borg —> Kubernetes
Borgmon —> Prometheus
Initially developed at Soundcloud
Data model & PromQL
• Prometheus is a labelled time-series database
• Labels are key-value pairs
• A time-series is [(timestamp, value), …]
• lists of timestamp, value tuples
• values are just floats – PromQL lets you make sense of them
• So the data type of Prometheus is
• {key1=A, key2=B} —> [(t0, v0), (t1, v1), …]
• …
Metrics Types
• count - single numeric metric that only goes up
• gauge - single numeric metric that arbitrarily goes up or down
• histogram - samples observations and counts them in
configurable buckets
• histogram_quantile() to calculate quantiles from histograms
or aggregations
• summary - samples observations and counts them
• configurable quantiles over sliding time windows
{quantile=“.5“}
Data model & PromQL
• __name__ is a magic label, you can
shorten the query syntax from
{__name__=“requests”}
to:
requests
Data model & PromQL
• Example: counter requests over a spike in traffic:
• 1, 2, 3, 13, 23, 33, 34, 35, 36
time
requests
1
3
13
23
33
36
t1 t2 t3 t4 t5 t6 t7 t8 t9
1 2 3 13 23 33 34 35 36
Data model & PromQL
• What Prom is storing
• {__name__=“requests”} —>
[(t1, 1), (t2, 2), (t3, 3), (t4, 13),
(t5, 23), (t6, 33), (t7, 34), (t8, 35),
(t9, 36), (t10, 37)]
or
t1 t2 t3 t4 t5 t6 t7 t8 t9
1 2 3 13 23 33 34 35 36
Data model & PromQL
• the [P] (period) syntax after a label turns an
instant type into a vector type
• for each value, turn the value into a vector
of all the values before and including that
value for the last period P
• Example P: 5s, 1m, 2h…
Data model & PromQL
• Recall our time-series requests
• What is requests[3s]? Vector query:
t1 t2 t3 t4 t5 t6 t7 t8 t9
1 2 3 13 23 33 34 35 36
t1-3 t2-4 t3-5 t4-6 t5-7 t6-8 t7-9
1
2
3
Data model & PromQL
• Recall our time-series requests
• What is requests[3s]? Vector query:
t1 t2 t3 t4 t5 t6 t7 t8 t9
1 2 3 13 23 33 34 35 36
t1-3 t2-4 t3-5 t4-6 t5-7 t6-8 t7-9
1 2
2 3
3 13
Data model & PromQL
• Recall our time-series requests
• What is requests[3s]? Vector query:
t1 t2 t3 t4 t5 t6 t7 t8 t9
1 2 3 13 23 33 34 35 36
t1-3 t2-4 t3-5 t4-6 t5-7 t6-8 t7-9
1 2 3
2 3 13
3 13 23
Data model & PromQL
• Recall our time-series requests
• What is requests[3s]? Vector query:
t1 t2 t3 t4 t5 t6 t7 t8 t9
1 2 3 13 23 33 34 35 36
t1-3 t2-4 t3-5 t4-6 t5-7 t6-8 t7-9
1 2 3 13 23 33 34
2 3 13 23 33 34 35
3 13 23 33 34 35 36
Data model & PromQL
• rate() finds the per second rate of change
over a vector query
• for each vector rate() just does (last_value
- first_value) / (last_time - first_time)
Data model & PromQL
• rate(requests[3s])
• [
t1-3 t2-4 t3-5 t4-6 t5-7 t6-8 t7-9
1 2 3 13 23 33 34
2 3 13 23 33 34 35
3 13 23 33 34 35 36
(last_value - first_value) / (last_time - first_time)
Data model & PromQL
• rate(requests[3s])
• [3-1
t1-3 t2-4 t3-5 t4-6 t5-7 t6-8 t7-9
1 2 3 13 23 33 34
2 3 13 23 33 34 35
3 13 23 33 34 35 36
(last_value - first_value) / (last_time - first_time)
Data model & PromQL
• rate(requests[3s])
• [2
t1-3 t2-4 t3-5 t4-6 t5-7 t6-8 t7-9
1 2 3 13 23 33 34
2 3 13 23 33 34 35
3 13 23 33 34 35 36
(last_value - first_value) / (last_time - first_time)
Data model & PromQL
• rate(requests[3s])
• [2/(3-1)
t1-3 t2-4 t3-5 t4-6 t5-7 t6-8 t7-9
1 2 3 13 23 33 34
2 3 13 23 33 34 35
3 13 23 33 34 35 36
(last_value - first_value) / (last_time - first_time)
Data model & PromQL
• rate(requests[3s])
• [2/2
t1-3 t2-4 t3-5 t4-6 t5-7 t6-8 t7-9
1 2 3 13 23 33 34
2 3 13 23 33 34 35
3 13 23 33 34 35 36
(last_value - first_value) / (last_time - first_time)
Data model & PromQL
• rate(requests[3s])
• [1,
t1-3 t2-4 t3-5 t4-6 t5-7 t6-8 t7-9
1 2 3 13 23 33 34
2 3 13 23 33 34 35
3 13 23 33 34 35 36
(last_value - first_value) / (last_time - first_time)
Data model & PromQL
• rate(requests[3s])
• [1, 13-2
t1-3 t2-4 t3-5 t4-6 t5-7 t6-8 t7-9
1 2 3 13 23 33 34
2 3 13 23 33 34 35
3 13 23 33 34 35 36
(last_value - first_value) / (last_time - first_time)
Data model & PromQL
• rate(requests[3s])
• [1, 11
t1-3 t2-4 t3-5 t4-6 t5-7 t6-8 t7-9
1 2 3 13 23 33 34
2 3 13 23 33 34 35
3 13 23 33 34 35 36
(last_value - first_value) / (last_time - first_time)
Data model & PromQL
• rate(requests[3s])
• [1, 11/(4-2)
t1-3 t2-4 t3-5 t4-6 t5-7 t6-8 t7-9
1 2 3 13 23 33 34
2 3 13 23 33 34 35
3 13 23 33 34 35 36
(last_value - first_value) / (last_time - first_time)
Data model & PromQL
• rate(requests[3s])
• [1, 11/2
t1-3 t2-4 t3-5 t4-6 t5-7 t6-8 t7-9
1 2 3 13 23 33 34
2 3 13 23 33 34 35
3 13 23 33 34 35 36
(last_value - first_value) / (last_time - first_time)
Data model & PromQL
• rate(requests[3s])
• [1, 5.5
t1-3 t2-4 t3-5 t4-6 t5-7 t6-8 t7-9
1 2 3 13 23 33 34
2 3 13 23 33 34 35
3 13 23 33 34 35 36
(last_value - first_value) / (last_time - first_time)
Data model & PromQL
• rate(requests[3s])
• [1, 5.5, 23-3
t1-3 t2-4 t3-5 t4-6 t5-7 t6-8 t7-9
1 2 3 13 23 33 34
2 3 13 23 33 34 35
3 13 23 33 34 35 36
(last_value - first_value) / (last_time - first_time)
Data model & PromQL
• rate(requests[3s])
• [1, 5.5, 20
t1-3 t2-4 t3-5 t4-6 t5-7 t6-8 t7-9
1 2 3 13 23 33 34
2 3 13 23 33 34 35
3 13 23 33 34 35 36
(last_value - first_value) / (last_time - first_time)
Data model & PromQL
• rate(requests[3s])
• [1, 5.5, 20/2
t1-3 t2-4 t3-5 t4-6 t5-7 t6-8 t7-9
1 2 3 13 23 33 34
2 3 13 23 33 34 35
3 13 23 33 34 35 36
(last_value - first_value) / (last_time - first_time)
Data model & PromQL
• rate(requests[3s])
• [1, 5.5, 10
t1-3 t2-4 t3-5 t4-6 t5-7 t6-8 t7-9
1 2 3 13 23 33 34
2 3 13 23 33 34 35
3 13 23 33 34 35 36
(last_value - first_value) / (last_time - first_time)
Data model & PromQL
• rate(requests[3s])
• [1, 5.5, 10, 10
t1-3 t2-4 t3-5 t4-6 t5-7 t6-8 t7-9
1 2 3 13 23 33 34
2 3 13 23 33 34 35
3 13 23 33 34 35 36
(last_value - first_value) / (last_time - first_time)
Data model & PromQL
• rate(requests[3s])
• [1, 5.5, 10, 10, 5.5,
t1-3 t2-4 t3-5 t4-6 t5-7 t6-8 t7-9
1 2 3 13 23 33 34
2 3 13 23 33 34 35
3 13 23 33 34 35 36
(last_value - first_value) / (last_time - first_time)
Data model & PromQL
• rate(requests[3s])
• [1, 5.5, 10, 10, 5.5, 1
t1-3 t2-4 t3-5 t4-6 t5-7 t6-8 t7-9
1 2 3 13 23 33 34
2 3 13 23 33 34 35
3 13 23 33 34 35 36
(last_value - first_value) / (last_time - first_time)
Data model & PromQL
• rate(requests[3s])
• [1, 5.5, 10, 10, 5.5, 1, 1]
t1-3 t2-4 t3-5 t4-6 t5-7 t6-8 t7-9
1 2 3 13 23 33 34
2 3 13 23 33 34 35
3 13 23 33 34 35 36
(last_value - first_value) / (last_time - first_time)
time
requests
1
3
13
23
33
36
t1 t2 t3 t4 t5 t6 t7 t8 t9
1 2 3 13 23 33 34 35 36
t1-3 t2-4 t3-5 t4-6 t5-7 t6-8 t7-9
1 2 3 13 23 33 34
2 3 13 23 33 34 35
3 13 23 33 34 35 36
requests[3s]
time
rate(requests[3s])
1
5
10
t3 t4 t5 t6 t7 t8 t9
1 5.5 10 10 5.5 1 1
Now we can understand irate (“instantaneous rate”)
• irate(requests[3s])
• [
t1-3 t2-4 t3-5 t4-6 t5-7 t6-8 t7-9
1 2 3 13 23 33 34
2 3 13 23 33 34 35
3 13 23 33 34 35 36
(last_value - 2nd_last_value) / (last_time - 2nd_last_time)
Now we can understand irate (“instantaneous rate”)
• irate(requests[3s])
• [1, 10, 10, 10, 1, 1, 1]
t1-3 t2-4 t3-5 t4-6 t5-7 t6-8 t7-9
1 2 3 13 23 33 34
2 3 13 23 33 34 35
3 13 23 33 34 35 36
(last_value - 2nd_last_value) / (last_time - 2nd_last_time)
Labels
• Recall that requests is just shorthand for
{__name__=“requests”}
• We can have more labels: {__name__=“requests”,
job=“frontend”}
• Shortens to requests{job=“frontend”}
• And so we could query
rate(requests{job=“frontend”}[1m])
Label Operators
• = -> exact match string
• != -> exact match string negated
• =~ -> regex match label
• !~ -> regex match negated
• Regex matching is slower b/c Prometheus
can’t use indexes
Architecture
Prometheus
Architecture
Prometheus
Alerts
• You can define PromQL queries that trigger alerts when
the result of a query matches a criteria. Example:
# Alert for any instance that have a median request latency >1s.
ALERT APIHighRequestLatency
IF api_http_request_latencies_second{quantile="0.5"} > 1
FOR 1m
ANNOTATIONS {
summary = "High request latency on {{ $labels.instance }}",
description = "{{ $labels.instance }} has a median request latency above 1s (current value: {{ $value }}s)",
}
Cortex
• Distributed, multi-tenant version of
Prometheus
• Prometheus architecture is single-server
• We wanted to build something scalable
CortexPrometheus
Cortex
• We run it for you
• Long term storage for your metrics
• We open sourced it
• https://github.com/weaveworks/cortex
Recap: all you need to know (Kube)
Pods
containers
ServicesDeployments
Container
Image
Docker container image, contains your application code in an isolated
environment.
Pod A set of containers, sharing network namespace and local volumes, co-
scheduled on one machine. Mortal. Has pod IP. Has labels.
Deployment Specify how many replicas of a pod should run in a cluster. Then ensures that
many are running across the cluster. Has labels.
Service Names things in DNS. Gets virtual IP. Two types: ClusterIP for internal
services, NodePort for publishing to outside. Routes based on labels.
kind: Deployment
metadata:
name: nginx-deployment
spec:
replicas: 2
template:
metadata:
labels:
app: nginx
spec:
containers:
- name: nginx
image: nginx:1.7.9
kind: Service
metadata:
name: frontend
spec:
type: NodePort
selector:
app: nginx
ports:
- port: 80
targetPort: 80
nodePort: 30002
Kubernetes services and deployments
Why Kubernetes <3 Prometheus
• Prom discovers what to scrape by asking Kube
• Prom’s pull model matches Kube dynamic
scheduling
• Allows Prom to identify thing it’s pulling from
• Prom label/value pairs mirror Kube labels
• Pods were made for exporters
Training!
Join the Weave user group!
meetup.com/pro/Weave/
weave.works/help
Other topics
Kubernetes 101
Continuous delivery: hooking up my CI/CD pipeline
to Kubernetes
Network policy for security
We have talks on all these topics in the Weave
user group!
Thanks! Questions?
We are hiring!
DX in San Francisco
Engineers in London & SF
weave.works/weave-company/hiring

More Related Content

What's hot

Exploiting Ranking Factorization Machines for Microblog Retrieval
Exploiting Ranking Factorization Machines for Microblog RetrievalExploiting Ranking Factorization Machines for Microblog Retrieval
Exploiting Ranking Factorization Machines for Microblog RetrievalRunwei Qiang
 
The Ring programming language version 1.9 book - Part 33 of 210
The Ring programming language version 1.9 book - Part 33 of 210The Ring programming language version 1.9 book - Part 33 of 210
The Ring programming language version 1.9 book - Part 33 of 210Mahmoud Samir Fayed
 
Verification of Concurrent and Distributed Systems
Verification of Concurrent and Distributed SystemsVerification of Concurrent and Distributed Systems
Verification of Concurrent and Distributed SystemsMykola Novik
 
The Ring programming language version 1.9 book - Part 43 of 210
The Ring programming language version 1.9 book - Part 43 of 210The Ring programming language version 1.9 book - Part 43 of 210
The Ring programming language version 1.9 book - Part 43 of 210Mahmoud Samir Fayed
 
Flux and InfluxDB 2.0 by Paul Dix
Flux and InfluxDB 2.0 by Paul DixFlux and InfluxDB 2.0 by Paul Dix
Flux and InfluxDB 2.0 by Paul DixInfluxData
 

What's hot (9)

Java file
Java fileJava file
Java file
 
Reactive&amp;reactor
Reactive&amp;reactorReactive&amp;reactor
Reactive&amp;reactor
 
Exploiting Ranking Factorization Machines for Microblog Retrieval
Exploiting Ranking Factorization Machines for Microblog RetrievalExploiting Ranking Factorization Machines for Microblog Retrieval
Exploiting Ranking Factorization Machines for Microblog Retrieval
 
The Ring programming language version 1.9 book - Part 33 of 210
The Ring programming language version 1.9 book - Part 33 of 210The Ring programming language version 1.9 book - Part 33 of 210
The Ring programming language version 1.9 book - Part 33 of 210
 
Verification of Concurrent and Distributed Systems
Verification of Concurrent and Distributed SystemsVerification of Concurrent and Distributed Systems
Verification of Concurrent and Distributed Systems
 
Parallel streams in java 8
Parallel streams in java 8Parallel streams in java 8
Parallel streams in java 8
 
The Ring programming language version 1.9 book - Part 43 of 210
The Ring programming language version 1.9 book - Part 43 of 210The Ring programming language version 1.9 book - Part 43 of 210
The Ring programming language version 1.9 book - Part 43 of 210
 
Data Structure (MC501)
Data Structure (MC501)Data Structure (MC501)
Data Structure (MC501)
 
Flux and InfluxDB 2.0 by Paul Dix
Flux and InfluxDB 2.0 by Paul DixFlux and InfluxDB 2.0 by Paul Dix
Flux and InfluxDB 2.0 by Paul Dix
 

Similar to Monitoring your Application in Kubernetes with Prometheus

Monitoring your Application in Kubernetes with Prometheus
Monitoring your Application in Kubernetes with Prometheus Monitoring your Application in Kubernetes with Prometheus
Monitoring your Application in Kubernetes with Prometheus Weaveworks
 
Just in time (series) - KairosDB
Just in time (series) - KairosDBJust in time (series) - KairosDB
Just in time (series) - KairosDBVictor Anjos
 
Thoth - Real-time Solr Monitor and Search Analysis Engine: Presented by Damia...
Thoth - Real-time Solr Monitor and Search Analysis Engine: Presented by Damia...Thoth - Real-time Solr Monitor and Search Analysis Engine: Presented by Damia...
Thoth - Real-time Solr Monitor and Search Analysis Engine: Presented by Damia...Lucidworks
 
Accurate and Reliable What-If Analysis of Business Processes: Is it Achievable?
Accurate and Reliable What-If Analysis of Business Processes: Is it Achievable?Accurate and Reliable What-If Analysis of Business Processes: Is it Achievable?
Accurate and Reliable What-If Analysis of Business Processes: Is it Achievable?Marlon Dumas
 
Better Full Text Search in PostgreSQL
Better Full Text Search in PostgreSQLBetter Full Text Search in PostgreSQL
Better Full Text Search in PostgreSQLArtur Zakirov
 
Chapter 3 -Built-in Matlab Functions
Chapter 3 -Built-in Matlab FunctionsChapter 3 -Built-in Matlab Functions
Chapter 3 -Built-in Matlab FunctionsSiva Gopal
 
Latency SLOs done right
Latency SLOs done rightLatency SLOs done right
Latency SLOs done rightFred Moyer
 
Automated Parameterization of Performance Models from Measurements
Automated Parameterization of Performance Models from MeasurementsAutomated Parameterization of Performance Models from Measurements
Automated Parameterization of Performance Models from MeasurementsWeikun Wang
 
Generating Automated and Online Test Oracles for Simulink Models with Continu...
Generating Automated and Online Test Oracles for Simulink Models with Continu...Generating Automated and Online Test Oracles for Simulink Models with Continu...
Generating Automated and Online Test Oracles for Simulink Models with Continu...Lionel Briand
 
StrataGEM: A Generic Petri Net Verification Framework
StrataGEM: A Generic Petri Net Verification FrameworkStrataGEM: A Generic Petri Net Verification Framework
StrataGEM: A Generic Petri Net Verification FrameworkEdmundo López Bóbeda
 
Performance Tuning and Optimization
Performance Tuning and OptimizationPerformance Tuning and Optimization
Performance Tuning and OptimizationMongoDB
 
Paul Dix [InfluxData] The Journey of InfluxDB | InfluxDays 2022
Paul Dix [InfluxData] The Journey of InfluxDB | InfluxDays 2022Paul Dix [InfluxData] The Journey of InfluxDB | InfluxDays 2022
Paul Dix [InfluxData] The Journey of InfluxDB | InfluxDays 2022InfluxData
 
Queuing theory and traffic analysis in depth
Queuing theory and traffic analysis in depthQueuing theory and traffic analysis in depth
Queuing theory and traffic analysis in depthIdcIdk1
 
Creating the PromQL Transpiler for Flux by Julius Volz, Co-Founder | Prometheus
Creating the PromQL Transpiler for Flux by Julius Volz, Co-Founder | PrometheusCreating the PromQL Transpiler for Flux by Julius Volz, Co-Founder | Prometheus
Creating the PromQL Transpiler for Flux by Julius Volz, Co-Founder | PrometheusInfluxData
 
"Эффективность и оптимизация кода в Java 8" Сергей Моренец
"Эффективность и оптимизация кода в Java 8" Сергей Моренец"Эффективность и оптимизация кода в Java 8" Сергей Моренец
"Эффективность и оптимизация кода в Java 8" Сергей МоренецFwdays
 
Chapter 5 - Plotting
Chapter 5 - PlottingChapter 5 - Plotting
Chapter 5 - PlottingSiva Gopal
 

Similar to Monitoring your Application in Kubernetes with Prometheus (20)

Monitoring your Application in Kubernetes with Prometheus
Monitoring your Application in Kubernetes with Prometheus Monitoring your Application in Kubernetes with Prometheus
Monitoring your Application in Kubernetes with Prometheus
 
Just in time (series) - KairosDB
Just in time (series) - KairosDBJust in time (series) - KairosDB
Just in time (series) - KairosDB
 
BIRTE-13-Kawashima
BIRTE-13-KawashimaBIRTE-13-Kawashima
BIRTE-13-Kawashima
 
Thoth - Real-time Solr Monitor and Search Analysis Engine: Presented by Damia...
Thoth - Real-time Solr Monitor and Search Analysis Engine: Presented by Damia...Thoth - Real-time Solr Monitor and Search Analysis Engine: Presented by Damia...
Thoth - Real-time Solr Monitor and Search Analysis Engine: Presented by Damia...
 
Accurate and Reliable What-If Analysis of Business Processes: Is it Achievable?
Accurate and Reliable What-If Analysis of Business Processes: Is it Achievable?Accurate and Reliable What-If Analysis of Business Processes: Is it Achievable?
Accurate and Reliable What-If Analysis of Business Processes: Is it Achievable?
 
DEA
DEADEA
DEA
 
Better Full Text Search in PostgreSQL
Better Full Text Search in PostgreSQLBetter Full Text Search in PostgreSQL
Better Full Text Search in PostgreSQL
 
Chapter 3 -Built-in Matlab Functions
Chapter 3 -Built-in Matlab FunctionsChapter 3 -Built-in Matlab Functions
Chapter 3 -Built-in Matlab Functions
 
Latency SLOs done right
Latency SLOs done rightLatency SLOs done right
Latency SLOs done right
 
Automated Parameterization of Performance Models from Measurements
Automated Parameterization of Performance Models from MeasurementsAutomated Parameterization of Performance Models from Measurements
Automated Parameterization of Performance Models from Measurements
 
Generating Automated and Online Test Oracles for Simulink Models with Continu...
Generating Automated and Online Test Oracles for Simulink Models with Continu...Generating Automated and Online Test Oracles for Simulink Models with Continu...
Generating Automated and Online Test Oracles for Simulink Models with Continu...
 
R and data mining
R and data miningR and data mining
R and data mining
 
StrataGEM: A Generic Petri Net Verification Framework
StrataGEM: A Generic Petri Net Verification FrameworkStrataGEM: A Generic Petri Net Verification Framework
StrataGEM: A Generic Petri Net Verification Framework
 
Performance Tuning and Optimization
Performance Tuning and OptimizationPerformance Tuning and Optimization
Performance Tuning and Optimization
 
Paul Dix [InfluxData] The Journey of InfluxDB | InfluxDays 2022
Paul Dix [InfluxData] The Journey of InfluxDB | InfluxDays 2022Paul Dix [InfluxData] The Journey of InfluxDB | InfluxDays 2022
Paul Dix [InfluxData] The Journey of InfluxDB | InfluxDays 2022
 
Queuing theory and traffic analysis in depth
Queuing theory and traffic analysis in depthQueuing theory and traffic analysis in depth
Queuing theory and traffic analysis in depth
 
Distributed systems scheduling
Distributed systems schedulingDistributed systems scheduling
Distributed systems scheduling
 
Creating the PromQL Transpiler for Flux by Julius Volz, Co-Founder | Prometheus
Creating the PromQL Transpiler for Flux by Julius Volz, Co-Founder | PrometheusCreating the PromQL Transpiler for Flux by Julius Volz, Co-Founder | Prometheus
Creating the PromQL Transpiler for Flux by Julius Volz, Co-Founder | Prometheus
 
"Эффективность и оптимизация кода в Java 8" Сергей Моренец
"Эффективность и оптимизация кода в Java 8" Сергей Моренец"Эффективность и оптимизация кода в Java 8" Сергей Моренец
"Эффективность и оптимизация кода в Java 8" Сергей Моренец
 
Chapter 5 - Plotting
Chapter 5 - PlottingChapter 5 - Plotting
Chapter 5 - Plotting
 

More from Weaveworks

Weave AI Controllers (Weave GitOps Office Hours)
Weave AI Controllers (Weave GitOps Office Hours)Weave AI Controllers (Weave GitOps Office Hours)
Weave AI Controllers (Weave GitOps Office Hours)Weaveworks
 
Flamingo: Expand ArgoCD with Flux (Office Hours)
Flamingo: Expand ArgoCD with Flux (Office Hours)Flamingo: Expand ArgoCD with Flux (Office Hours)
Flamingo: Expand ArgoCD with Flux (Office Hours)Weaveworks
 
Webinar: Capabilities, Confidence and Community – What Flux GA Means for You
Webinar: Capabilities, Confidence and Community – What Flux GA Means for YouWebinar: Capabilities, Confidence and Community – What Flux GA Means for You
Webinar: Capabilities, Confidence and Community – What Flux GA Means for YouWeaveworks
 
Six Signs You Need Platform Engineering
Six Signs You Need Platform EngineeringSix Signs You Need Platform Engineering
Six Signs You Need Platform EngineeringWeaveworks
 
SRE and GitOps for Building Robust Kubernetes Platforms.pdf
SRE and GitOps for Building Robust Kubernetes Platforms.pdfSRE and GitOps for Building Robust Kubernetes Platforms.pdf
SRE and GitOps for Building Robust Kubernetes Platforms.pdfWeaveworks
 
Webinar: End to End Security & Operations with Chainguard and Weave GitOps
Webinar: End to End Security & Operations with Chainguard and Weave GitOpsWebinar: End to End Security & Operations with Chainguard and Weave GitOps
Webinar: End to End Security & Operations with Chainguard and Weave GitOpsWeaveworks
 
Flux Beyond Git Harnessing the Power of OCI
Flux Beyond Git Harnessing the Power of OCIFlux Beyond Git Harnessing the Power of OCI
Flux Beyond Git Harnessing the Power of OCIWeaveworks
 
Automated Provisioning, Management & Cost Control for Kubernetes Clusters
Automated Provisioning, Management & Cost Control for Kubernetes ClustersAutomated Provisioning, Management & Cost Control for Kubernetes Clusters
Automated Provisioning, Management & Cost Control for Kubernetes ClustersWeaveworks
 
How to Avoid Kubernetes Multi-tenancy Catastrophes
How to Avoid Kubernetes Multi-tenancy CatastrophesHow to Avoid Kubernetes Multi-tenancy Catastrophes
How to Avoid Kubernetes Multi-tenancy CatastrophesWeaveworks
 
Building internal developer platform with EKS and GitOps
Building internal developer platform with EKS and GitOpsBuilding internal developer platform with EKS and GitOps
Building internal developer platform with EKS and GitOpsWeaveworks
 
GitOps Testing in Kubernetes with Flux and Testkube.pdf
GitOps Testing in Kubernetes with Flux and Testkube.pdfGitOps Testing in Kubernetes with Flux and Testkube.pdf
GitOps Testing in Kubernetes with Flux and Testkube.pdfWeaveworks
 
Intro to GitOps with Weave GitOps, Flagger and Linkerd
Intro to GitOps with Weave GitOps, Flagger and LinkerdIntro to GitOps with Weave GitOps, Flagger and Linkerd
Intro to GitOps with Weave GitOps, Flagger and LinkerdWeaveworks
 
Implementing Flux for Scale with Soft Multi-tenancy
Implementing Flux for Scale with Soft Multi-tenancyImplementing Flux for Scale with Soft Multi-tenancy
Implementing Flux for Scale with Soft Multi-tenancyWeaveworks
 
Accelerating Hybrid Multistage Delivery with Weave GitOps on EKS
Accelerating Hybrid Multistage Delivery with Weave GitOps on EKSAccelerating Hybrid Multistage Delivery with Weave GitOps on EKS
Accelerating Hybrid Multistage Delivery with Weave GitOps on EKSWeaveworks
 
The Story of Flux Reaching Graduation in the CNCF
The Story of Flux Reaching Graduation in the CNCFThe Story of Flux Reaching Graduation in the CNCF
The Story of Flux Reaching Graduation in the CNCFWeaveworks
 
Shift Deployment Security Left with Weave GitOps & Upbound’s Universal Crossp...
Shift Deployment Security Left with Weave GitOps & Upbound’s Universal Crossp...Shift Deployment Security Left with Weave GitOps & Upbound’s Universal Crossp...
Shift Deployment Security Left with Weave GitOps & Upbound’s Universal Crossp...Weaveworks
 
Securing Your App Deployments with Tunnels, OIDC, RBAC, and Progressive Deliv...
Securing Your App Deployments with Tunnels, OIDC, RBAC, and Progressive Deliv...Securing Your App Deployments with Tunnels, OIDC, RBAC, and Progressive Deliv...
Securing Your App Deployments with Tunnels, OIDC, RBAC, and Progressive Deliv...Weaveworks
 
Flux’s Security & Scalability with OCI & Helm Slides.pdf
Flux’s Security & Scalability with OCI & Helm Slides.pdfFlux’s Security & Scalability with OCI & Helm Slides.pdf
Flux’s Security & Scalability with OCI & Helm Slides.pdfWeaveworks
 
Flux Security & Scalability using VS Code GitOps Extension
Flux Security & Scalability using VS Code GitOps Extension Flux Security & Scalability using VS Code GitOps Extension
Flux Security & Scalability using VS Code GitOps Extension Weaveworks
 
Deploying Stateful Applications Securely & Confidently with Ondat & Weave GitOps
Deploying Stateful Applications Securely & Confidently with Ondat & Weave GitOpsDeploying Stateful Applications Securely & Confidently with Ondat & Weave GitOps
Deploying Stateful Applications Securely & Confidently with Ondat & Weave GitOpsWeaveworks
 

More from Weaveworks (20)

Weave AI Controllers (Weave GitOps Office Hours)
Weave AI Controllers (Weave GitOps Office Hours)Weave AI Controllers (Weave GitOps Office Hours)
Weave AI Controllers (Weave GitOps Office Hours)
 
Flamingo: Expand ArgoCD with Flux (Office Hours)
Flamingo: Expand ArgoCD with Flux (Office Hours)Flamingo: Expand ArgoCD with Flux (Office Hours)
Flamingo: Expand ArgoCD with Flux (Office Hours)
 
Webinar: Capabilities, Confidence and Community – What Flux GA Means for You
Webinar: Capabilities, Confidence and Community – What Flux GA Means for YouWebinar: Capabilities, Confidence and Community – What Flux GA Means for You
Webinar: Capabilities, Confidence and Community – What Flux GA Means for You
 
Six Signs You Need Platform Engineering
Six Signs You Need Platform EngineeringSix Signs You Need Platform Engineering
Six Signs You Need Platform Engineering
 
SRE and GitOps for Building Robust Kubernetes Platforms.pdf
SRE and GitOps for Building Robust Kubernetes Platforms.pdfSRE and GitOps for Building Robust Kubernetes Platforms.pdf
SRE and GitOps for Building Robust Kubernetes Platforms.pdf
 
Webinar: End to End Security & Operations with Chainguard and Weave GitOps
Webinar: End to End Security & Operations with Chainguard and Weave GitOpsWebinar: End to End Security & Operations with Chainguard and Weave GitOps
Webinar: End to End Security & Operations with Chainguard and Weave GitOps
 
Flux Beyond Git Harnessing the Power of OCI
Flux Beyond Git Harnessing the Power of OCIFlux Beyond Git Harnessing the Power of OCI
Flux Beyond Git Harnessing the Power of OCI
 
Automated Provisioning, Management & Cost Control for Kubernetes Clusters
Automated Provisioning, Management & Cost Control for Kubernetes ClustersAutomated Provisioning, Management & Cost Control for Kubernetes Clusters
Automated Provisioning, Management & Cost Control for Kubernetes Clusters
 
How to Avoid Kubernetes Multi-tenancy Catastrophes
How to Avoid Kubernetes Multi-tenancy CatastrophesHow to Avoid Kubernetes Multi-tenancy Catastrophes
How to Avoid Kubernetes Multi-tenancy Catastrophes
 
Building internal developer platform with EKS and GitOps
Building internal developer platform with EKS and GitOpsBuilding internal developer platform with EKS and GitOps
Building internal developer platform with EKS and GitOps
 
GitOps Testing in Kubernetes with Flux and Testkube.pdf
GitOps Testing in Kubernetes with Flux and Testkube.pdfGitOps Testing in Kubernetes with Flux and Testkube.pdf
GitOps Testing in Kubernetes with Flux and Testkube.pdf
 
Intro to GitOps with Weave GitOps, Flagger and Linkerd
Intro to GitOps with Weave GitOps, Flagger and LinkerdIntro to GitOps with Weave GitOps, Flagger and Linkerd
Intro to GitOps with Weave GitOps, Flagger and Linkerd
 
Implementing Flux for Scale with Soft Multi-tenancy
Implementing Flux for Scale with Soft Multi-tenancyImplementing Flux for Scale with Soft Multi-tenancy
Implementing Flux for Scale with Soft Multi-tenancy
 
Accelerating Hybrid Multistage Delivery with Weave GitOps on EKS
Accelerating Hybrid Multistage Delivery with Weave GitOps on EKSAccelerating Hybrid Multistage Delivery with Weave GitOps on EKS
Accelerating Hybrid Multistage Delivery with Weave GitOps on EKS
 
The Story of Flux Reaching Graduation in the CNCF
The Story of Flux Reaching Graduation in the CNCFThe Story of Flux Reaching Graduation in the CNCF
The Story of Flux Reaching Graduation in the CNCF
 
Shift Deployment Security Left with Weave GitOps & Upbound’s Universal Crossp...
Shift Deployment Security Left with Weave GitOps & Upbound’s Universal Crossp...Shift Deployment Security Left with Weave GitOps & Upbound’s Universal Crossp...
Shift Deployment Security Left with Weave GitOps & Upbound’s Universal Crossp...
 
Securing Your App Deployments with Tunnels, OIDC, RBAC, and Progressive Deliv...
Securing Your App Deployments with Tunnels, OIDC, RBAC, and Progressive Deliv...Securing Your App Deployments with Tunnels, OIDC, RBAC, and Progressive Deliv...
Securing Your App Deployments with Tunnels, OIDC, RBAC, and Progressive Deliv...
 
Flux’s Security & Scalability with OCI & Helm Slides.pdf
Flux’s Security & Scalability with OCI & Helm Slides.pdfFlux’s Security & Scalability with OCI & Helm Slides.pdf
Flux’s Security & Scalability with OCI & Helm Slides.pdf
 
Flux Security & Scalability using VS Code GitOps Extension
Flux Security & Scalability using VS Code GitOps Extension Flux Security & Scalability using VS Code GitOps Extension
Flux Security & Scalability using VS Code GitOps Extension
 
Deploying Stateful Applications Securely & Confidently with Ondat & Weave GitOps
Deploying Stateful Applications Securely & Confidently with Ondat & Weave GitOpsDeploying Stateful Applications Securely & Confidently with Ondat & Weave GitOps
Deploying Stateful Applications Securely & Confidently with Ondat & Weave GitOps
 

Recently uploaded

Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
Google AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGGoogle AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGSujit Pal
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxOnBoard
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhisoniya singh
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Alan Dix
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Paola De la Torre
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 

Recently uploaded (20)

Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Google AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGGoogle AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAG
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptx
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 

Monitoring your Application in Kubernetes with Prometheus

  • 1. Monitoring your App in Kubernetes with Prometheus Jeff Hoffer, Developer Experience github.com/eudaimos
  • 2. What does Weave do? Weave helps devops iterate faster with: • observability & monitoring • continuous delivery • container networks & firewalls Use Prometheus to power our Monitoring solution
  • 3. What does Weave do? Weave helps devops iterate faster with: • observability & monitoring • continuous delivery • container networks & firewalls Use Prometheus to power our Monitoring solution
  • 4. Agenda 1. Prometheus concepts: data model & PromQL 2. Prometheus architecture & pull model 3. Why Prometheus & Kubernetes are a good fit 4. What is Cortex? 5. Kubernetes recap 6. Training on real app 7. What’s next?
  • 5. Prometheus Borg —> Kubernetes Borgmon —> Prometheus Initially developed at Soundcloud
  • 6. Data model & PromQL • Prometheus is a labelled time-series database • Labels are key-value pairs • A time-series is [(timestamp, value), …] • lists of timestamp, value tuples • values are just floats – PromQL lets you make sense of them • So the data type of Prometheus is • {key1=A, key2=B} —> [(t0, v0), (t1, v1), …] • …
  • 7. Metrics Types • count - single numeric metric that only goes up • gauge - single numeric metric that arbitrarily goes up or down • histogram - samples observations and counts them in configurable buckets • histogram_quantile() to calculate quantiles from histograms or aggregations • summary - samples observations and counts them • configurable quantiles over sliding time windows {quantile=“.5“}
  • 8. Data model & PromQL • __name__ is a magic label, you can shorten the query syntax from {__name__=“requests”} to: requests
  • 9. Data model & PromQL • Example: counter requests over a spike in traffic: • 1, 2, 3, 13, 23, 33, 34, 35, 36 time requests 1 3 13 23 33 36 t1 t2 t3 t4 t5 t6 t7 t8 t9 1 2 3 13 23 33 34 35 36
  • 10. Data model & PromQL • What Prom is storing • {__name__=“requests”} —> [(t1, 1), (t2, 2), (t3, 3), (t4, 13), (t5, 23), (t6, 33), (t7, 34), (t8, 35), (t9, 36), (t10, 37)] or t1 t2 t3 t4 t5 t6 t7 t8 t9 1 2 3 13 23 33 34 35 36
  • 11. Data model & PromQL • the [P] (period) syntax after a label turns an instant type into a vector type • for each value, turn the value into a vector of all the values before and including that value for the last period P • Example P: 5s, 1m, 2h…
  • 12. Data model & PromQL • Recall our time-series requests • What is requests[3s]? Vector query: t1 t2 t3 t4 t5 t6 t7 t8 t9 1 2 3 13 23 33 34 35 36 t1-3 t2-4 t3-5 t4-6 t5-7 t6-8 t7-9 1 2 3
  • 13. Data model & PromQL • Recall our time-series requests • What is requests[3s]? Vector query: t1 t2 t3 t4 t5 t6 t7 t8 t9 1 2 3 13 23 33 34 35 36 t1-3 t2-4 t3-5 t4-6 t5-7 t6-8 t7-9 1 2 2 3 3 13
  • 14. Data model & PromQL • Recall our time-series requests • What is requests[3s]? Vector query: t1 t2 t3 t4 t5 t6 t7 t8 t9 1 2 3 13 23 33 34 35 36 t1-3 t2-4 t3-5 t4-6 t5-7 t6-8 t7-9 1 2 3 2 3 13 3 13 23
  • 15. Data model & PromQL • Recall our time-series requests • What is requests[3s]? Vector query: t1 t2 t3 t4 t5 t6 t7 t8 t9 1 2 3 13 23 33 34 35 36 t1-3 t2-4 t3-5 t4-6 t5-7 t6-8 t7-9 1 2 3 13 23 33 34 2 3 13 23 33 34 35 3 13 23 33 34 35 36
  • 16. Data model & PromQL • rate() finds the per second rate of change over a vector query • for each vector rate() just does (last_value - first_value) / (last_time - first_time)
  • 17. Data model & PromQL • rate(requests[3s]) • [ t1-3 t2-4 t3-5 t4-6 t5-7 t6-8 t7-9 1 2 3 13 23 33 34 2 3 13 23 33 34 35 3 13 23 33 34 35 36 (last_value - first_value) / (last_time - first_time)
  • 18. Data model & PromQL • rate(requests[3s]) • [3-1 t1-3 t2-4 t3-5 t4-6 t5-7 t6-8 t7-9 1 2 3 13 23 33 34 2 3 13 23 33 34 35 3 13 23 33 34 35 36 (last_value - first_value) / (last_time - first_time)
  • 19. Data model & PromQL • rate(requests[3s]) • [2 t1-3 t2-4 t3-5 t4-6 t5-7 t6-8 t7-9 1 2 3 13 23 33 34 2 3 13 23 33 34 35 3 13 23 33 34 35 36 (last_value - first_value) / (last_time - first_time)
  • 20. Data model & PromQL • rate(requests[3s]) • [2/(3-1) t1-3 t2-4 t3-5 t4-6 t5-7 t6-8 t7-9 1 2 3 13 23 33 34 2 3 13 23 33 34 35 3 13 23 33 34 35 36 (last_value - first_value) / (last_time - first_time)
  • 21. Data model & PromQL • rate(requests[3s]) • [2/2 t1-3 t2-4 t3-5 t4-6 t5-7 t6-8 t7-9 1 2 3 13 23 33 34 2 3 13 23 33 34 35 3 13 23 33 34 35 36 (last_value - first_value) / (last_time - first_time)
  • 22. Data model & PromQL • rate(requests[3s]) • [1, t1-3 t2-4 t3-5 t4-6 t5-7 t6-8 t7-9 1 2 3 13 23 33 34 2 3 13 23 33 34 35 3 13 23 33 34 35 36 (last_value - first_value) / (last_time - first_time)
  • 23. Data model & PromQL • rate(requests[3s]) • [1, 13-2 t1-3 t2-4 t3-5 t4-6 t5-7 t6-8 t7-9 1 2 3 13 23 33 34 2 3 13 23 33 34 35 3 13 23 33 34 35 36 (last_value - first_value) / (last_time - first_time)
  • 24. Data model & PromQL • rate(requests[3s]) • [1, 11 t1-3 t2-4 t3-5 t4-6 t5-7 t6-8 t7-9 1 2 3 13 23 33 34 2 3 13 23 33 34 35 3 13 23 33 34 35 36 (last_value - first_value) / (last_time - first_time)
  • 25. Data model & PromQL • rate(requests[3s]) • [1, 11/(4-2) t1-3 t2-4 t3-5 t4-6 t5-7 t6-8 t7-9 1 2 3 13 23 33 34 2 3 13 23 33 34 35 3 13 23 33 34 35 36 (last_value - first_value) / (last_time - first_time)
  • 26. Data model & PromQL • rate(requests[3s]) • [1, 11/2 t1-3 t2-4 t3-5 t4-6 t5-7 t6-8 t7-9 1 2 3 13 23 33 34 2 3 13 23 33 34 35 3 13 23 33 34 35 36 (last_value - first_value) / (last_time - first_time)
  • 27. Data model & PromQL • rate(requests[3s]) • [1, 5.5 t1-3 t2-4 t3-5 t4-6 t5-7 t6-8 t7-9 1 2 3 13 23 33 34 2 3 13 23 33 34 35 3 13 23 33 34 35 36 (last_value - first_value) / (last_time - first_time)
  • 28. Data model & PromQL • rate(requests[3s]) • [1, 5.5, 23-3 t1-3 t2-4 t3-5 t4-6 t5-7 t6-8 t7-9 1 2 3 13 23 33 34 2 3 13 23 33 34 35 3 13 23 33 34 35 36 (last_value - first_value) / (last_time - first_time)
  • 29. Data model & PromQL • rate(requests[3s]) • [1, 5.5, 20 t1-3 t2-4 t3-5 t4-6 t5-7 t6-8 t7-9 1 2 3 13 23 33 34 2 3 13 23 33 34 35 3 13 23 33 34 35 36 (last_value - first_value) / (last_time - first_time)
  • 30. Data model & PromQL • rate(requests[3s]) • [1, 5.5, 20/2 t1-3 t2-4 t3-5 t4-6 t5-7 t6-8 t7-9 1 2 3 13 23 33 34 2 3 13 23 33 34 35 3 13 23 33 34 35 36 (last_value - first_value) / (last_time - first_time)
  • 31. Data model & PromQL • rate(requests[3s]) • [1, 5.5, 10 t1-3 t2-4 t3-5 t4-6 t5-7 t6-8 t7-9 1 2 3 13 23 33 34 2 3 13 23 33 34 35 3 13 23 33 34 35 36 (last_value - first_value) / (last_time - first_time)
  • 32. Data model & PromQL • rate(requests[3s]) • [1, 5.5, 10, 10 t1-3 t2-4 t3-5 t4-6 t5-7 t6-8 t7-9 1 2 3 13 23 33 34 2 3 13 23 33 34 35 3 13 23 33 34 35 36 (last_value - first_value) / (last_time - first_time)
  • 33. Data model & PromQL • rate(requests[3s]) • [1, 5.5, 10, 10, 5.5, t1-3 t2-4 t3-5 t4-6 t5-7 t6-8 t7-9 1 2 3 13 23 33 34 2 3 13 23 33 34 35 3 13 23 33 34 35 36 (last_value - first_value) / (last_time - first_time)
  • 34. Data model & PromQL • rate(requests[3s]) • [1, 5.5, 10, 10, 5.5, 1 t1-3 t2-4 t3-5 t4-6 t5-7 t6-8 t7-9 1 2 3 13 23 33 34 2 3 13 23 33 34 35 3 13 23 33 34 35 36 (last_value - first_value) / (last_time - first_time)
  • 35. Data model & PromQL • rate(requests[3s]) • [1, 5.5, 10, 10, 5.5, 1, 1] t1-3 t2-4 t3-5 t4-6 t5-7 t6-8 t7-9 1 2 3 13 23 33 34 2 3 13 23 33 34 35 3 13 23 33 34 35 36 (last_value - first_value) / (last_time - first_time)
  • 36. time requests 1 3 13 23 33 36 t1 t2 t3 t4 t5 t6 t7 t8 t9 1 2 3 13 23 33 34 35 36 t1-3 t2-4 t3-5 t4-6 t5-7 t6-8 t7-9 1 2 3 13 23 33 34 2 3 13 23 33 34 35 3 13 23 33 34 35 36 requests[3s] time rate(requests[3s]) 1 5 10 t3 t4 t5 t6 t7 t8 t9 1 5.5 10 10 5.5 1 1
  • 37. Now we can understand irate (“instantaneous rate”) • irate(requests[3s]) • [ t1-3 t2-4 t3-5 t4-6 t5-7 t6-8 t7-9 1 2 3 13 23 33 34 2 3 13 23 33 34 35 3 13 23 33 34 35 36 (last_value - 2nd_last_value) / (last_time - 2nd_last_time)
  • 38. Now we can understand irate (“instantaneous rate”) • irate(requests[3s]) • [1, 10, 10, 10, 1, 1, 1] t1-3 t2-4 t3-5 t4-6 t5-7 t6-8 t7-9 1 2 3 13 23 33 34 2 3 13 23 33 34 35 3 13 23 33 34 35 36 (last_value - 2nd_last_value) / (last_time - 2nd_last_time)
  • 39. Labels • Recall that requests is just shorthand for {__name__=“requests”} • We can have more labels: {__name__=“requests”, job=“frontend”} • Shortens to requests{job=“frontend”} • And so we could query rate(requests{job=“frontend”}[1m])
  • 40. Label Operators • = -> exact match string • != -> exact match string negated • =~ -> regex match label • !~ -> regex match negated • Regex matching is slower b/c Prometheus can’t use indexes
  • 43. Alerts • You can define PromQL queries that trigger alerts when the result of a query matches a criteria. Example: # Alert for any instance that have a median request latency >1s. ALERT APIHighRequestLatency IF api_http_request_latencies_second{quantile="0.5"} > 1 FOR 1m ANNOTATIONS { summary = "High request latency on {{ $labels.instance }}", description = "{{ $labels.instance }} has a median request latency above 1s (current value: {{ $value }}s)", }
  • 44. Cortex • Distributed, multi-tenant version of Prometheus • Prometheus architecture is single-server • We wanted to build something scalable
  • 46. Cortex • We run it for you • Long term storage for your metrics • We open sourced it • https://github.com/weaveworks/cortex
  • 47. Recap: all you need to know (Kube) Pods containers ServicesDeployments Container Image Docker container image, contains your application code in an isolated environment. Pod A set of containers, sharing network namespace and local volumes, co- scheduled on one machine. Mortal. Has pod IP. Has labels. Deployment Specify how many replicas of a pod should run in a cluster. Then ensures that many are running across the cluster. Has labels. Service Names things in DNS. Gets virtual IP. Two types: ClusterIP for internal services, NodePort for publishing to outside. Routes based on labels.
  • 48. kind: Deployment metadata: name: nginx-deployment spec: replicas: 2 template: metadata: labels: app: nginx spec: containers: - name: nginx image: nginx:1.7.9 kind: Service metadata: name: frontend spec: type: NodePort selector: app: nginx ports: - port: 80 targetPort: 80 nodePort: 30002 Kubernetes services and deployments
  • 49. Why Kubernetes <3 Prometheus • Prom discovers what to scrape by asking Kube • Prom’s pull model matches Kube dynamic scheduling • Allows Prom to identify thing it’s pulling from • Prom label/value pairs mirror Kube labels • Pods were made for exporters
  • 51. Join the Weave user group! meetup.com/pro/Weave/ weave.works/help
  • 52. Other topics Kubernetes 101 Continuous delivery: hooking up my CI/CD pipeline to Kubernetes Network policy for security We have talks on all these topics in the Weave user group!
  • 53. Thanks! Questions? We are hiring! DX in San Francisco Engineers in London & SF weave.works/weave-company/hiring

Editor's Notes

  1. it’s like numerical differentiation you can think of [3s] like a “smoothing factor” allow you to miss traces
  2. note that prometheus pulls metrics from jobs/exporters this is your app/cluster/network/nodes (anything that can be instrumented)
  3. note that prometheus pulls metrics from jobs/exporters this is your app/cluster/network/nodes (anything that can be instrumented)
  4. allows Prom to know identity of thing it’s pulling from
  5. https://github.com/lukemarsden/wlw
  6. https://www.katacoda.com/courses/weave/kubernetes-training
  7. https://www.katacoda.com/courses/weave/kubernetes-training