Multi-tenant Kubernetes observability with Prometheus
robusta-dev Natan Yellin aantn
Natan Yellin, robusta.dev
$ whoami
Co-founder of robusta.dev
Multi-cluster Kubernetes observability
Add-on to Prometheus
Substack newsletter: Why this Kubernetes thing?
How should I gather Prometheus metrics from all my tenants?
Assumptions
1. Many Kubernetes tenants
  • Clusters
  • Namespaces
  • Virtual clusters (e.g. capsule, kamaji, vcluster)
  • etc...
2. Tenants need some form of isolation
3. We want to monitor with Prometheus
What should I use?
In the beginning there was one
Advantages:
  • Simple
Disadvantages:
  • No security isolation/RBAC
  • No performance isolation
  • If tenants are clusters, discovery is annoying
"One team broke Prometheus for everyone else"
Then there were many
Advantages:
  • Simple
  • Security isolation
  • Performance isolation
  • Scalable?
Major Disadvantage:
  • No unified queries
Minor Disadvantages:
  • No unified management
  • More resources?
"If you break it, it only breaks for your product line."
What we want
Decentralized:
  • Isolation
  • Scalability
Centralized:
  • Query all Prometheuses at once
What else do we want?
1. Scalability
2. Long-term storage of metrics
Three approaches
Solve it outside Prometheus
Key advantages:
  • Doesn't touch Prometheus itself
  • Delegates problem to other tool
Key disadvantage:
  • Queries need to address one Prometheus at a time
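
One way to picture this approach is a thin routing layer that sits outside Prometheus, maps each tenant to its own Prometheus, and sends every query to exactly one of them. The sketch below is only an illustration: the tenant names, URLs, and the `query_url` helper are hypothetical, and only the `/api/v1/query` path comes from Prometheus's standard HTTP API.

```python
from urllib.parse import urlencode

# Hypothetical tenant -> Prometheus base URL mapping (not from the talk).
TENANT_PROMETHEUS = {
    "team-a": "http://prometheus.team-a.example:9090",
    "team-b": "http://prometheus.team-b.example:9090",
}


def query_url(tenant: str, promql: str) -> str:
    """Build an instant-query URL for the single Prometheus that owns `tenant`."""
    try:
        base = TENANT_PROMETHEUS[tenant]
    except KeyError:
        # Unknown tenants are rejected rather than routed to a default instance.
        raise ValueError(f"unknown tenant: {tenant}") from None
    return f"{base}/api/v1/query?{urlencode({'query': promql})}"
```

Because every call resolves to one base URL, a question that spans tenants has to be asked once per Prometheus and combined by the caller, which is the disadvantage listed above.
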
Multiple + Centralized (take 1)
Key advantages:
  • Reuses existing Prometheus
  • Federated can do roll-up
  • Federated can selectively scrape
Key disadvantages:
  • With roll-up/selective you can't actually query all Prometheuses
  • Scaling
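
Selective scraping in Prometheus federation works by having the central Prometheus scrape each tenant Prometheus's `/federate` endpoint with one `match[]` parameter per series selector. The helper below is a minimal sketch of how such a URL is put together; the function name and host are hypothetical, and in practice the same selectors would usually live in the central Prometheus's scrape configuration rather than in code.

```python
from urllib.parse import urlencode


def federate_url(base_url: str, selectors: list[str]) -> str:
    """Build a /federate URL that returns only series matching `selectors`."""
    # Each selector is sent as its own match[] query parameter.
    params = [("match[]", selector) for selector in selectors]
    return f"{base_url}/federate?{urlencode(params)}"


# Hypothetical usage: pull only the `api` job's series and the `up` metric.
url = federate_url("http://prom-a:9090", ['{job="api"}', "up"])
```

Anything the selectors leave out never reaches the central Prometheus, which is why roll-up or selective federation cannot answer queries over all of the underlying data.
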
Disclaimer: Thanos has lots of options; I'm simplifying a little
Multiple Prometheuses + central Prometheus (take 2)
Key advantages:
  • Super scalable!
  • Reuses existing Prometheus
  • Very common solution, lots of tooling
Key disadvantages:
  • No RBAC built-in
"Most mature option" - most people
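
Conceptually, a central query layer of this kind (the preceding disclaimer suggests Thanos) fans a query out to every Prometheus and merges the answers, so one query can see all tenants. The sketch below illustrates only the merge step and is not how Thanos is implemented: `merge_results` and the `replica` label name are assumptions, and the sample shape mirrors the instant-vector format of the Prometheus HTTP API.

```python
def merge_results(per_source: list[list[dict]], replica_label: str = "replica") -> list[dict]:
    """Merge instant-vector results from several Prometheuses.

    The replica label is dropped, and the first sample seen for each
    remaining label set is kept, so HA replicas collapse into one series.
    """
    merged: dict[tuple, dict] = {}
    for result in per_source:
        for sample in result:
            labels = {k: v for k, v in sample["metric"].items() if k != replica_label}
            key = tuple(sorted(labels.items()))
            merged.setdefault(key, {"metric": labels, "value": sample["value"]})
    return list(merged.values())


# Hypothetical results from two Prometheuses; cluster "a" is scraped by two replicas.
prom_1 = [{"metric": {"__name__": "up", "cluster": "a", "replica": "0"}, "value": [0, "1"]}]
prom_2 = [
    {"metric": {"__name__": "up", "cluster": "a", "replica": "1"}, "value": [0, "1"]},
    {"metric": {"__name__": "up", "cluster": "b", "replica": "0"}, "value": [0, "0"]},
]
combined = merge_results([prom_1, prom_2])
```

Note that nothing in this merge step checks who is asking, which matches the "No RBAC built-in" disadvantage above.
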
One Prometheus to Rule them All
Options:
  • Cortex
  • Grafana Mimir
  • VictoriaMetrics
  • TimescaleDB
  • M3DB
  • ...
Grafana Mimir
Key advantages:
  • Native multi-tenancy!
  • Backed by Grafana
Key disadvantages:
  • Complexity
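
Mimir's native multi-tenancy is driven by the `X-Scope-OrgID` HTTP header: each read or write request names the tenant it belongs to. The sketch below prepares, but does not send, a tenant-scoped instant query. The host, tenant ID, and helper name are hypothetical, and the `/prometheus` path prefix is assumed to be left at Mimir's default.

```python
from urllib.parse import urlencode
from urllib.request import Request


def mimir_query_request(base_url: str, tenant_id: str, promql: str) -> Request:
    """Prepare (but do not send) an instant query scoped to one Mimir tenant."""
    url = f"{base_url}/prometheus/api/v1/query?{urlencode({'query': promql})}"
    # The tenant is selected by a header, not by the URL or by the query text.
    return Request(url, headers={"X-Scope-OrgID": tenant_id})


# Hypothetical usage; pass the result to urllib.request.urlopen to execute it.
req = mimir_query_request("http://mimir.example:8080", "team-a", "up")
```

Since the header alone selects the tenant, it should be set by a trusted gateway or authenticating proxy rather than by the tenant's own client.
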
Other useful tools
  • Add prom-label-proxy to Thanos (and others) to enforce RBAC
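
prom-label-proxy sits in front of a Prometheus-compatible API and forces a label matcher onto every query, so a caller only sees series carrying its tenant's label value. A minimal sketch of a request to such a proxy follows, assuming it was started with `-label namespace` and therefore reads the tenant from a `namespace` URL parameter. The proxy address, tenant name, and helper name are hypothetical.

```python
from urllib.parse import urlencode


def proxied_query_url(proxy_url: str, tenant_label: str, tenant: str, promql: str) -> str:
    """Build an instant-query URL for a prom-label-proxy style endpoint."""
    # The proxy takes the enforced label value from a URL parameter named after the label.
    params = {"query": promql, tenant_label: tenant}
    return f"{proxy_url}/api/v1/query?{urlencode(params)}"


# Hypothetical usage: the proxy would evaluate this as up{namespace="team-a"}.
url = proxied_query_url("http://127.0.0.1:8080", "namespace", "team-a", "up")
```

For this to amount to RBAC, the tenant parameter has to be filled in by something the tenant does not control, such as an authenticating gateway; otherwise a caller could simply ask for another tenant's label value.
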
Thank you!
A special thank you to Shalom Cohen, Evgeny Uklist, and the Racoons team for providing input
Questions?
