SlideShare a Scribd company logo
1 of 24
Download to read offline
Monitoring the Hashistack
with Prometheus
Tom Wilkie @tom_wilkie

August 2018
❤
Prometheus
● A monitoring & alerting system.

● Inspired by Google’s BorgMon

● Originally built by SoundCloud in 2012

● Open Source, now part of the CNCF

● Simple text-based metrics format

● Multidimensional datamodel

● Rich, concise query language
Prometheus’ data model is very simple:

<identifier> → [ (t0, v0), (t1, v1), ... ]
Timestamps are millisecond int64, values are float64
https://www.slideshare.net/Docker/monitoring-the-prometheus-way-julius-voltz-prometheus
Prometheus identifiers

http_requests_total{job=“nginx”, instances=“1.2.3.4:80”, path=“/home”, status=“200”}
http_requests_total{job=“nginx”, instances=“1.2.3.4:80”, path=“/home”, status=“500”}
http_requests_total{job=“nginx”, instances=“1.2.3.4:80”, path=“/settings”, status=“200”}
http_requests_total{job=“nginx”, instances=“1.2.3.4:80”, path=“/settings”, status=“502”}
Prometheus series selector

http_requests_total{job=“nginx”, status=~“5..”}
Building queries usually starts with a selector

PromQL: http_requests_total{job=“nginx”, status=~“5..”}
{job=“nginx”, instances=“1.2.3.4:80”, path=“/home”, status=“500”} 34
{job=“nginx”, instances=“1.2.3.4:80”, path=“/settings”, status=“502”} 56
{job=“nginx”, instances=“2.3.4.5:80”, path=“/home”, status=“500”} 76
{job=“nginx”, instances=“2.3.4.5:80”, path=“/setting”, status=“502”} 96
...
Can select vectors of values…

PromQL: http_requests_total{job=“nginx”, status=~“502”}[1m]
{job=“nginx”, instances=“1.2.3.4:80”, path=“/home”, status=“500”} [30, 31, 32, 34]
{job=“nginx”, instances=“1.2.3.4:80”, path=“/settings”, status=“500”} [4, 24, 56, 56]
{job=“nginx”, instances=“2.3.4.5:80”, path=“/home”, status=“500”} [76, 76, 76, 76]
{job=“nginx”, instances=“2.3.4.5:80”, path=“/setting”, status=“500”} [56, 106, 5, 96]
...
And apply functions…

PromQL: rate(http_requests_total{job=“nginx”, status=~“502”}[1m])
{job=“nginx”, instances=“1.2.3.4:80”, path=“/home”, status=“500”} 0.0666
{job=“nginx”, instances=“1.2.3.4:80”, path=“/settings”, status=“500”} 0.866
{job=“nginx”, instances=“2.3.4.5:80”, path=“/home”, status=“500”} 0.0
{job=“nginx”, instances=“2.3.4.5:80”, path=“/settings”, status=“500”} 2.43
...
And aggregate by a dimension…

PromQL: sum by (path) (rate(http_requests_total{job=“nginx”, status=~“502”}[1m]))
{path=“/home”} 0.0666
{path=“/settings”} 3.3
...
Do binary operations…

PromQL: sum by (path) (rate(http_requests_total{job=“nginx”, status=~“502”}[1m]))
/
sum by (path) (rate(http_requests_total{job=“nginx”}[1m]))
{path=“/home”} 0.001
{path=“/settings”} 1.0
...
Hashistack
● Consul: service discovery, K/V
store, service mesh…
● Vault: secret management and
automation
● Terraform: infrastructure config
as code
• All Hashicorp products use github.com/armon/go-metrics for metrics.

• Exposes metrics in statsd, dogstatsd

• Prometheus support added in 2015

• …but not plumbed through in most products (until recently)
Consul
● Prometheus support exposed in
1.1.0 (hashicorp/consul#4014,
hashicorp/consul#4016)

● Exposed metrics are being improved
(hashicorp/consul#4042)
Consul (II)
● Alternatively use the statsd_exporter
with customer metrics mapping

● github.com/prometheus/
statsd_exporter
https://github.com/kausalco/public/blob/master/consul-mixin/statsd-mapping.yaml
Consul (III)
● Still don’t get the “operational”
metrics we want.

● Use the consul_exporter:
github.com/prometheus/
consul_exporter
Consul Mixin
● Set of predefined alerts &
dashboards for Prometheus /
Grafana

● github.com/kausalco/
public/consul-mixin

● Users “Prometheus Mixin”
format:

● Design Doc
Vault
● No Prometheus metrics yet
(hashicorp/vault#2937)

● Again, use statsd_exporter

● BUT statsd metrics aren’t very
“operational”

● Use vault_exporter

● github.com/grapeshot/
vault_exporter
Vault Mixin
● As per consul mixin, set of alerts
and dashboard for Prometheus
& Grafana

● github.com/grapeshot/
vault_exporter/vault-mixin
Terraform
● Terraform is a CLI tool - what
does it mean to monitor that?

● I don’t have enough confidence
to run it CI/CD..

● But I do want to know if
someone changes something
and doesn’t apply it.
Terradiff
• Runs on k8s

• Pod containing:

• git-sync (github.com/kubernetes/git-sync)

• prom-run (github.com/tomwilkie/prom-run)

terraform plan -detailed-exitcode
https://www.weave.works/blog/provisioning-lifecycle-production-ready-kubernetes-cluster/
Hashistack
● Consul: service discovery, K/V
store, service mesh…
● Vault: secret management and
automation
● Terraform: infrastructure config
as code
Questions?

More Related Content

What's hot

Reverse proxies & Inconsistency
Reverse proxies & InconsistencyReverse proxies & Inconsistency
Reverse proxies & InconsistencyGreenD0g
 
Introduction to EasyBuild: Tutorial Part 1
Introduction to EasyBuild: Tutorial Part 1Introduction to EasyBuild: Tutorial Part 1
Introduction to EasyBuild: Tutorial Part 1inside-BigData.com
 
Dependency Injection in Apache Spark Applications
Dependency Injection in Apache Spark ApplicationsDependency Injection in Apache Spark Applications
Dependency Injection in Apache Spark ApplicationsDatabricks
 
Logs/Metrics Gathering With OpenShift EFK Stack
Logs/Metrics Gathering With OpenShift EFK StackLogs/Metrics Gathering With OpenShift EFK Stack
Logs/Metrics Gathering With OpenShift EFK StackJosef Karásek
 
THE STATE OF OPENTELEMETRY, DOTAN HOROVITS, Logz.io
THE STATE OF OPENTELEMETRY, DOTAN HOROVITS, Logz.ioTHE STATE OF OPENTELEMETRY, DOTAN HOROVITS, Logz.io
THE STATE OF OPENTELEMETRY, DOTAN HOROVITS, Logz.ioDevOpsDays Tel Aviv
 
How to Monitoring the SRE Golden Signals (E-Book)
How to Monitoring the SRE Golden Signals (E-Book)How to Monitoring the SRE Golden Signals (E-Book)
How to Monitoring the SRE Golden Signals (E-Book)Siglos
 
Automating the Cloud with Terraform, and Ansible
Automating the Cloud with Terraform, and AnsibleAutomating the Cloud with Terraform, and Ansible
Automating the Cloud with Terraform, and AnsibleBrian Hogan
 
Monitoring, the Prometheus Way - Julius Voltz, Prometheus
Monitoring, the Prometheus Way - Julius Voltz, Prometheus Monitoring, the Prometheus Way - Julius Voltz, Prometheus
Monitoring, the Prometheus Way - Julius Voltz, Prometheus Docker, Inc.
 
Monitoring With Prometheus
Monitoring With PrometheusMonitoring With Prometheus
Monitoring With PrometheusKnoldus Inc.
 
Prometheus Overview
Prometheus OverviewPrometheus Overview
Prometheus OverviewBrian Brazil
 
Cloud Monitoring with Prometheus
Cloud Monitoring with PrometheusCloud Monitoring with Prometheus
Cloud Monitoring with PrometheusQAware GmbH
 
OpenTelemetry Introduction
OpenTelemetry Introduction OpenTelemetry Introduction
OpenTelemetry Introduction DimitrisFinas1
 
Load Testing - How to Stress Your Odoo with Locust
Load Testing - How to Stress Your Odoo with LocustLoad Testing - How to Stress Your Odoo with Locust
Load Testing - How to Stress Your Odoo with LocustOdoo
 
Securing your Pulsar Cluster with Vault_Chris Kellogg
Securing your Pulsar Cluster with Vault_Chris KelloggSecuring your Pulsar Cluster with Vault_Chris Kellogg
Securing your Pulsar Cluster with Vault_Chris KelloggStreamNative
 
Grafana Loki: like Prometheus, but for Logs
Grafana Loki: like Prometheus, but for LogsGrafana Loki: like Prometheus, but for Logs
Grafana Loki: like Prometheus, but for LogsMarco Pracucci
 
Terraform -- Infrastructure as Code
Terraform -- Infrastructure as CodeTerraform -- Infrastructure as Code
Terraform -- Infrastructure as CodeMartin Schütte
 
Monitoring Kubernetes with Prometheus
Monitoring Kubernetes with PrometheusMonitoring Kubernetes with Prometheus
Monitoring Kubernetes with PrometheusGrafana Labs
 
Getting Started Monitoring with Prometheus and Grafana
Getting Started Monitoring with Prometheus and GrafanaGetting Started Monitoring with Prometheus and Grafana
Getting Started Monitoring with Prometheus and GrafanaSyah Dwi Prihatmoko
 

What's hot (20)

Reverse proxies & Inconsistency
Reverse proxies & InconsistencyReverse proxies & Inconsistency
Reverse proxies & Inconsistency
 
Introduction to EasyBuild: Tutorial Part 1
Introduction to EasyBuild: Tutorial Part 1Introduction to EasyBuild: Tutorial Part 1
Introduction to EasyBuild: Tutorial Part 1
 
Dependency Injection in Apache Spark Applications
Dependency Injection in Apache Spark ApplicationsDependency Injection in Apache Spark Applications
Dependency Injection in Apache Spark Applications
 
Logs/Metrics Gathering With OpenShift EFK Stack
Logs/Metrics Gathering With OpenShift EFK StackLogs/Metrics Gathering With OpenShift EFK Stack
Logs/Metrics Gathering With OpenShift EFK Stack
 
THE STATE OF OPENTELEMETRY, DOTAN HOROVITS, Logz.io
THE STATE OF OPENTELEMETRY, DOTAN HOROVITS, Logz.ioTHE STATE OF OPENTELEMETRY, DOTAN HOROVITS, Logz.io
THE STATE OF OPENTELEMETRY, DOTAN HOROVITS, Logz.io
 
How to Monitoring the SRE Golden Signals (E-Book)
How to Monitoring the SRE Golden Signals (E-Book)How to Monitoring the SRE Golden Signals (E-Book)
How to Monitoring the SRE Golden Signals (E-Book)
 
Automating the Cloud with Terraform, and Ansible
Automating the Cloud with Terraform, and AnsibleAutomating the Cloud with Terraform, and Ansible
Automating the Cloud with Terraform, and Ansible
 
Monitoring, the Prometheus Way - Julius Voltz, Prometheus
Monitoring, the Prometheus Way - Julius Voltz, Prometheus Monitoring, the Prometheus Way - Julius Voltz, Prometheus
Monitoring, the Prometheus Way - Julius Voltz, Prometheus
 
Monitoring With Prometheus
Monitoring With PrometheusMonitoring With Prometheus
Monitoring With Prometheus
 
Prometheus Overview
Prometheus OverviewPrometheus Overview
Prometheus Overview
 
Airflow Intro-1.pdf
Airflow Intro-1.pdfAirflow Intro-1.pdf
Airflow Intro-1.pdf
 
Cloud Monitoring with Prometheus
Cloud Monitoring with PrometheusCloud Monitoring with Prometheus
Cloud Monitoring with Prometheus
 
OpenTelemetry Introduction
OpenTelemetry Introduction OpenTelemetry Introduction
OpenTelemetry Introduction
 
Load Testing - How to Stress Your Odoo with Locust
Load Testing - How to Stress Your Odoo with LocustLoad Testing - How to Stress Your Odoo with Locust
Load Testing - How to Stress Your Odoo with Locust
 
Monitoring With Prometheus
Monitoring With PrometheusMonitoring With Prometheus
Monitoring With Prometheus
 
Securing your Pulsar Cluster with Vault_Chris Kellogg
Securing your Pulsar Cluster with Vault_Chris KelloggSecuring your Pulsar Cluster with Vault_Chris Kellogg
Securing your Pulsar Cluster with Vault_Chris Kellogg
 
Grafana Loki: like Prometheus, but for Logs
Grafana Loki: like Prometheus, but for LogsGrafana Loki: like Prometheus, but for Logs
Grafana Loki: like Prometheus, but for Logs
 
Terraform -- Infrastructure as Code
Terraform -- Infrastructure as CodeTerraform -- Infrastructure as Code
Terraform -- Infrastructure as Code
 
Monitoring Kubernetes with Prometheus
Monitoring Kubernetes with PrometheusMonitoring Kubernetes with Prometheus
Monitoring Kubernetes with Prometheus
 
Getting Started Monitoring with Prometheus and Grafana
Getting Started Monitoring with Prometheus and GrafanaGetting Started Monitoring with Prometheus and Grafana
Getting Started Monitoring with Prometheus and Grafana
 

Similar to Monitoring the Hashistack with Prometheus

Monitoring in Big Data Platform - Albert Lewandowski, GetInData
Monitoring in Big Data Platform - Albert Lewandowski, GetInDataMonitoring in Big Data Platform - Albert Lewandowski, GetInData
Monitoring in Big Data Platform - Albert Lewandowski, GetInDataGetInData
 
Django deployment with PaaS
Django deployment with PaaSDjango deployment with PaaS
Django deployment with PaaSAppsembler
 
MuleSoft Meetup Roma - Processi di Automazione su CloudHub
MuleSoft Meetup Roma - Processi di Automazione su CloudHubMuleSoft Meetup Roma - Processi di Automazione su CloudHub
MuleSoft Meetup Roma - Processi di Automazione su CloudHubAlfonso Martino
 
Prometheus - basics
Prometheus - basicsPrometheus - basics
Prometheus - basicsJuraj Hantak
 
OpenTelemetry 101 FTW
OpenTelemetry 101 FTWOpenTelemetry 101 FTW
OpenTelemetry 101 FTWNGINX, Inc.
 
Monitoring Cloud Native Applications with Prometheus
Monitoring Cloud Native Applications with PrometheusMonitoring Cloud Native Applications with Prometheus
Monitoring Cloud Native Applications with PrometheusJacopo Nardiello
 
Monitoring Kubernetes with Prometheus (Kubernetes Ireland, 2016)
Monitoring Kubernetes with Prometheus (Kubernetes Ireland, 2016)Monitoring Kubernetes with Prometheus (Kubernetes Ireland, 2016)
Monitoring Kubernetes with Prometheus (Kubernetes Ireland, 2016)Brian Brazil
 
Apache Eagle at Hadoop Summit 2016 San Jose
Apache Eagle at Hadoop Summit 2016 San JoseApache Eagle at Hadoop Summit 2016 San Jose
Apache Eagle at Hadoop Summit 2016 San JoseHao Chen
 
Uni w pachube 111108
Uni w pachube 111108Uni w pachube 111108
Uni w pachube 111108Paul Tanner
 
DevOps Braga #15: Agentless monitoring with icinga and prometheus
DevOps Braga #15: Agentless monitoring with icinga and prometheusDevOps Braga #15: Agentless monitoring with icinga and prometheus
DevOps Braga #15: Agentless monitoring with icinga and prometheusDevOps Braga
 
FIWARE Wednesday Webinars - Short Term History within Smart Systems
FIWARE Wednesday Webinars - Short Term History within Smart SystemsFIWARE Wednesday Webinars - Short Term History within Smart Systems
FIWARE Wednesday Webinars - Short Term History within Smart SystemsFIWARE
 
Monitoring Kubernetes with Prometheus
Monitoring Kubernetes with PrometheusMonitoring Kubernetes with Prometheus
Monitoring Kubernetes with PrometheusTobias Schmidt
 
Monitoring at scale: Migrating to Prometheus at Fastly
Monitoring at scale: Migrating to Prometheus at FastlyMonitoring at scale: Migrating to Prometheus at Fastly
Monitoring at scale: Migrating to Prometheus at FastlyMarcus Barczak
 
OSMC 2018 | Thruk 2½ – Current state of development by Sven Nierlein
OSMC 2018 | Thruk 2½ – Current state of development by Sven NierleinOSMC 2018 | Thruk 2½ – Current state of development by Sven Nierlein
OSMC 2018 | Thruk 2½ – Current state of development by Sven NierleinNETWAYS
 
Prometheus (Microsoft, 2016)
Prometheus (Microsoft, 2016)Prometheus (Microsoft, 2016)
Prometheus (Microsoft, 2016)Brian Brazil
 
openATTIC using grafana and prometheus
openATTIC using  grafana and prometheusopenATTIC using  grafana and prometheus
openATTIC using grafana and prometheusAlex Lau
 
Prometheus and Docker (Docker Galway, November 2015)
Prometheus and Docker (Docker Galway, November 2015)Prometheus and Docker (Docker Galway, November 2015)
Prometheus and Docker (Docker Galway, November 2015)Brian Brazil
 
OSDC 2018 | Hardware-level data-center monitoring with Prometheus by Conrad H...
OSDC 2018 | Hardware-level data-center monitoring with Prometheus by Conrad H...OSDC 2018 | Hardware-level data-center monitoring with Prometheus by Conrad H...
OSDC 2018 | Hardware-level data-center monitoring with Prometheus by Conrad H...NETWAYS
 

Similar to Monitoring the Hashistack with Prometheus (20)

Monitoring in Big Data Platform - Albert Lewandowski, GetInData
Monitoring in Big Data Platform - Albert Lewandowski, GetInDataMonitoring in Big Data Platform - Albert Lewandowski, GetInData
Monitoring in Big Data Platform - Albert Lewandowski, GetInData
 
Django deployment with PaaS
Django deployment with PaaSDjango deployment with PaaS
Django deployment with PaaS
 
MuleSoft Meetup Roma - Processi di Automazione su CloudHub
MuleSoft Meetup Roma - Processi di Automazione su CloudHubMuleSoft Meetup Roma - Processi di Automazione su CloudHub
MuleSoft Meetup Roma - Processi di Automazione su CloudHub
 
Prometheus - basics
Prometheus - basicsPrometheus - basics
Prometheus - basics
 
OpenTelemetry 101 FTW
OpenTelemetry 101 FTWOpenTelemetry 101 FTW
OpenTelemetry 101 FTW
 
Docker practical solutions
Docker practical solutionsDocker practical solutions
Docker practical solutions
 
Monitoring Cloud Native Applications with Prometheus
Monitoring Cloud Native Applications with PrometheusMonitoring Cloud Native Applications with Prometheus
Monitoring Cloud Native Applications with Prometheus
 
Monitoring Kubernetes with Prometheus (Kubernetes Ireland, 2016)
Monitoring Kubernetes with Prometheus (Kubernetes Ireland, 2016)Monitoring Kubernetes with Prometheus (Kubernetes Ireland, 2016)
Monitoring Kubernetes with Prometheus (Kubernetes Ireland, 2016)
 
Apache Eagle: Secure Hadoop in Real Time
Apache Eagle: Secure Hadoop in Real TimeApache Eagle: Secure Hadoop in Real Time
Apache Eagle: Secure Hadoop in Real Time
 
Apache Eagle at Hadoop Summit 2016 San Jose
Apache Eagle at Hadoop Summit 2016 San JoseApache Eagle at Hadoop Summit 2016 San Jose
Apache Eagle at Hadoop Summit 2016 San Jose
 
Uni w pachube 111108
Uni w pachube 111108Uni w pachube 111108
Uni w pachube 111108
 
DevOps Braga #15: Agentless monitoring with icinga and prometheus
DevOps Braga #15: Agentless monitoring with icinga and prometheusDevOps Braga #15: Agentless monitoring with icinga and prometheus
DevOps Braga #15: Agentless monitoring with icinga and prometheus
 
FIWARE Wednesday Webinars - Short Term History within Smart Systems
FIWARE Wednesday Webinars - Short Term History within Smart SystemsFIWARE Wednesday Webinars - Short Term History within Smart Systems
FIWARE Wednesday Webinars - Short Term History within Smart Systems
 
Monitoring Kubernetes with Prometheus
Monitoring Kubernetes with PrometheusMonitoring Kubernetes with Prometheus
Monitoring Kubernetes with Prometheus
 
Monitoring at scale: Migrating to Prometheus at Fastly
Monitoring at scale: Migrating to Prometheus at FastlyMonitoring at scale: Migrating to Prometheus at Fastly
Monitoring at scale: Migrating to Prometheus at Fastly
 
OSMC 2018 | Thruk 2½ – Current state of development by Sven Nierlein
OSMC 2018 | Thruk 2½ – Current state of development by Sven NierleinOSMC 2018 | Thruk 2½ – Current state of development by Sven Nierlein
OSMC 2018 | Thruk 2½ – Current state of development by Sven Nierlein
 
Prometheus (Microsoft, 2016)
Prometheus (Microsoft, 2016)Prometheus (Microsoft, 2016)
Prometheus (Microsoft, 2016)
 
openATTIC using grafana and prometheus
openATTIC using  grafana and prometheusopenATTIC using  grafana and prometheus
openATTIC using grafana and prometheus
 
Prometheus and Docker (Docker Galway, November 2015)
Prometheus and Docker (Docker Galway, November 2015)Prometheus and Docker (Docker Galway, November 2015)
Prometheus and Docker (Docker Galway, November 2015)
 
OSDC 2018 | Hardware-level data-center monitoring with Prometheus by Conrad H...
OSDC 2018 | Hardware-level data-center monitoring with Prometheus by Conrad H...OSDC 2018 | Hardware-level data-center monitoring with Prometheus by Conrad H...
OSDC 2018 | Hardware-level data-center monitoring with Prometheus by Conrad H...
 

Recently uploaded

Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 3652toLead Limited
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
costume and set research powerpoint presentation
costume and set research powerpoint presentationcostume and set research powerpoint presentation
costume and set research powerpoint presentationphoebematthew05
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024BookNet Canada
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr LapshynFwdays
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsMiki Katsuragi
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptxLBM Solutions
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 

Recently uploaded (20)

Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
costume and set research powerpoint presentation
costume and set research powerpoint presentationcostume and set research powerpoint presentation
costume and set research powerpoint presentation
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food Manufacturing
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptx
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 

Monitoring the Hashistack with Prometheus

  • 1. Monitoring the Hashistack with Prometheus Tom Wilkie @tom_wilkie August 2018 ❤
  • 2. Prometheus ● A monitoring & alerting system. ● Inspired by Google’s BorgMon ● Originally built by SoundCloud in 2012 ● Open Source, now part of the CNCF ● Simple text-based metrics format ● Multidimensional datamodel ● Rich, concise query language
  • 3.
  • 4.
  • 5. Prometheus’ data model is very simple: <identifier> → [ (t0, v0), (t1, v1), ... ] Timestamps are millisecond int64, values are float64 https://www.slideshare.net/Docker/monitoring-the-prometheus-way-julius-voltz-prometheus
  • 6. Prometheus identifiers http_requests_total{job=“nginx”, instances=“1.2.3.4:80”, path=“/home”, status=“200”} http_requests_total{job=“nginx”, instances=“1.2.3.4:80”, path=“/home”, status=“500”} http_requests_total{job=“nginx”, instances=“1.2.3.4:80”, path=“/settings”, status=“200”} http_requests_total{job=“nginx”, instances=“1.2.3.4:80”, path=“/settings”, status=“502”} Prometheus series selector http_requests_total{job=“nginx”, status=~“5..”}
  • 7. Building queries usually starts with a selector PromQL: http_requests_total{job=“nginx”, status=~“5..”} {job=“nginx”, instances=“1.2.3.4:80”, path=“/home”, status=“500”} 34 {job=“nginx”, instances=“1.2.3.4:80”, path=“/settings”, status=“502”} 56 {job=“nginx”, instances=“2.3.4.5:80”, path=“/home”, status=“500”} 76 {job=“nginx”, instances=“2.3.4.5:80”, path=“/setting”, status=“502”} 96 ...
  • 8. Can select vectors of values… PromQL: http_requests_total{job=“nginx”, status=~“502”}[1m] {job=“nginx”, instances=“1.2.3.4:80”, path=“/home”, status=“500”} [30, 31, 32, 34] {job=“nginx”, instances=“1.2.3.4:80”, path=“/settings”, status=“500”} [4, 24, 56, 56] {job=“nginx”, instances=“2.3.4.5:80”, path=“/home”, status=“500”} [76, 76, 76, 76] {job=“nginx”, instances=“2.3.4.5:80”, path=“/setting”, status=“500”} [56, 106, 5, 96] ...
  • 9. And apply functions… PromQL: rate(http_requests_total{job=“nginx”, status=~“502”}[1m]) {job=“nginx”, instances=“1.2.3.4:80”, path=“/home”, status=“500”} 0.0666 {job=“nginx”, instances=“1.2.3.4:80”, path=“/settings”, status=“500”} 0.866 {job=“nginx”, instances=“2.3.4.5:80”, path=“/home”, status=“500”} 0.0 {job=“nginx”, instances=“2.3.4.5:80”, path=“/settings”, status=“500”} 2.43 ...
  • 10. And aggregate by a dimension… PromQL: sum by (path) (rate(http_requests_total{job=“nginx”, status=~“502”}[1m])) {path=“/home”} 0.0666 {path=“/settings”} 3.3 ...
  • 11. Do binary operations… PromQL: sum by (path) (rate(http_requests_total{job=“nginx”, status=~“502”}[1m])) / sum by (path) (rate(http_requests_total{job=“nginx”}[1m])) {path=“/home”} 0.001 {path=“/settings”} 1.0 ...
  • 12. Hashistack ● Consul: service discovery, K/V store, service mesh… ● Vault: secret management and automation ● Terraform: infrastructure config as code
  • 13. • All Hashicorp products use github.com/armon/go-metrics for metrics. • Exposes metrics in statsd, dogstatsd • Prometheus support added in 2015 • …but not plumbed through in most products (until recently)
  • 14. Consul ● Prometheus support exposed in 1.1.0 (hashicorp/consul#4014, hashicorp/consul#4016) ● Exposed metrics are being improved (hashicorp/consul#4042)
  • 15. Consul (II) ● Alternatively use the statsd_exporter with customer metrics mapping ● github.com/prometheus/ statsd_exporter
  • 17. Consul (III) ● Still don’t get the “operational” metrics we want. ● Use the consul_exporter: github.com/prometheus/ consul_exporter
  • 18. Consul Mixin ● Set of predefined alerts & dashboards for Prometheus / Grafana ● github.com/kausalco/ public/consul-mixin ● Users “Prometheus Mixin” format: ● Design Doc
  • 19. Vault ● No Prometheus metrics yet (hashicorp/vault#2937) ● Again, use statsd_exporter ● BUT statsd metrics aren’t very “operational” ● Use vault_exporter ● github.com/grapeshot/ vault_exporter
  • 20. Vault Mixin ● As per consul mixin, set of alerts and dashboard for Prometheus & Grafana ● github.com/grapeshot/ vault_exporter/vault-mixin
  • 21. Terraform ● Terraform is a CLI tool - what does it mean to monitor that? ● I don’t have enough confidence to run it CI/CD.. ● But I do want to know if someone changes something and doesn’t apply it.
  • 22. Terradiff • Runs on k8s • Pod containing: • git-sync (github.com/kubernetes/git-sync) • prom-run (github.com/tomwilkie/prom-run) terraform plan -detailed-exitcode https://www.weave.works/blog/provisioning-lifecycle-production-ready-kubernetes-cluster/
  • 23. Hashistack ● Consul: service discovery, K/V store, service mesh… ● Vault: secret management and automation ● Terraform: infrastructure config as code