Monitoring Cockpit
for Kubernetes Clusters
Ulrike Klusik
5.11.2019
Our Customers' Kubernetes Implementation: OpenShift
• OpenShift is a commercial Kubernetes implementation (OKD is its community version).
• From a monitoring perspective, a cluster consists of:
  • nodes as compute resources,
  • central service URLs with high-availability and performance SLAs,
  • infrastructure Pods that implement the services and can change dynamically.
• The API already provides metadata about the cluster components, which is used to determine the metric targets.
Figure source: https://docs.okd.io/3.11/architecture/index.html
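A minimal sketch of how metric targets could be derived from API metadata. It assumes pod objects shaped like Kubernetes API responses (`metadata`, `status.podIP`) and the widely used but optional `prometheus.io/scrape` and `prometheus.io/port` annotation convention; the function name, the default port, and the example pods are hypothetical and not taken from the talk.

```python
from typing import Dict, List


def targets_from_pods(pods: List[dict]) -> List[Dict[str, str]]:
    """Derive scrape targets from pod metadata (shape assumed to follow the Kubernetes API)."""
    targets = []
    for pod in pods:
        meta = pod.get("metadata", {})
        annotations = meta.get("annotations", {})
        pod_ip = pod.get("status", {}).get("podIP")
        # Assumption: pods opt in via an annotation; pods without an IP are not scrapeable yet.
        if annotations.get("prometheus.io/scrape") != "true" or not pod_ip:
            continue
        # Assumption: 9100 as fallback port is purely illustrative.
        port = annotations.get("prometheus.io/port", "9100")
        targets.append({
            "address": f"{pod_ip}:{port}",
            "namespace": meta.get("namespace", ""),
            "pod": meta.get("name", ""),
        })
    return targets


# Hypothetical example data, not real cluster output.
EXAMPLE_PODS = [
    {"metadata": {"name": "router-1", "namespace": "default",
                  "annotations": {"prometheus.io/scrape": "true",
                                  "prometheus.io/port": "1936"}},
     "status": {"podIP": "10.0.0.5"}},
    {"metadata": {"name": "batch-job", "namespace": "default", "annotations": {}},
     "status": {"podIP": "10.0.0.6"}},
]

EXAMPLE_TARGETS = targets_from_pods(EXAMPLE_PODS)
```

Because the target list is recomputed from the API metadata, Pods that come and go are picked up without manual configuration.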
Monitoring with Prometheus
• Prometheus is well integrated into the Kubernetes ecosystem.
• The idea of monitoring with Prometheus:
  • Monitored targets must expose their specific metrics via HTTP(S) endpoints.
  • Targets are typically determined dynamically via service discovery and scraped regularly.
  • Alert rules are defined as conditions on metrics.
  • Alertmanager deduplicates alerts and routes them to the incident handling tools.
Figure source: https://prometheus.io/assets/architecture.png
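To illustrate what a target returns when it is scraped, the following sketch renders one metric in the Prometheus text exposition format. The function names, the metric name, and the label values are hypothetical; a real target would normally use a Prometheus client library and serve this text from an HTTP(S) endpoint such as `/metrics`.

```python
from typing import Dict, List, Tuple


def escape_label_value(value: str) -> str:
    """Escape backslash, double quote and newline as the text format requires."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_metric(name: str, help_text: str, metric_type: str,
                  samples: List[Tuple[Dict[str, str], float]]) -> str:
    """Render one metric family: HELP line, TYPE line, then one line per sample."""
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} {metric_type}"]
    for labels, value in samples:
        if labels:
            pairs = [f'{key}="{escape_label_value(val)}"'
                     for key, val in sorted(labels.items())]
            lines.append(f"{name}{{{','.join(pairs)}}} {value}")
        else:
            lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


# Hypothetical metric, for illustration only.
EXAMPLE_OUTPUT = render_metric(
    "demo_service_up", "Whether the demo service is reachable.", "gauge",
    [({"service": "registry"}, 1), ({"service": "router"}, 0)],
)
```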
State-of-the-Art Kubernetes Monitoring with Prometheus
• Prometheus Monitoring Mixin for Kubernetes (https://github.com/kubernetes-monitoring/kubernetes-mixin)
  • Provides for the standard Kubernetes services:
    • Alert rules
    • Dashboards
• Red Hat’s OpenShift (a commercial Kubernetes implementation) includes an immutable Prometheus monitoring solution (https://github.com/openshift/cluster-monitoring-operator) with fixed alerts and dashboards. It is also based on the mixins, plus some OpenShift-specific additions.
• What we were missing in these solutions:
  • End-user / application experience of cluster services
  • Longer metric retention (the metrics volume is too large for it)
  • A cluster overview of service and node availability
Monitoring Architecture
[Architecture diagram. Labels recoverable from the slide: Kubernetes/OpenShift Cluster; Nodes (host) with NODE-EXPORTER and Kubelet + cAdvisor; api-servers; kube controllers; infrastructure projects with HAProxy (Router), EFK Logging (via Pods), and GlusterFS (via Heketi-Route); Project prometheus-infra-mon with PROMETHEUS, KSM/OSM, and Blackbox Exporter; remote write (selected metrics) to INFLUXDB; OMD server with INFLUXDB, ALERTMGR (cluster possible), and Grafana (OMD-Service); custom webhook to incident management systems (e.g. Remedy, ServiceNow). Legend: namespace, host, container, Kubernetes metric target, OpenShift metric target.]
• Kube-State-Metrics (KSM) / OpenShift-State-Metrics (OSM): metrics about objects and their states
• Node-Exporter: operating system metrics
• Blackbox Exporter: test calls to service URLs
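The kind of test call a blackbox-style probe makes against a service URL can be sketched as follows. This is not the Blackbox Exporter's implementation, only an illustration under simple assumptions: a plain HTTP GET, success defined as a 2xx status, and an injectable `fetch` function (a hypothetical parameter) so the logic can be exercised without network access. The result keys mirror the exporter's `probe_success` and `probe_duration_seconds` metric names.

```python
import time
import urllib.request
from typing import Callable, Dict


def http_fetch(url: str, timeout: float = 5.0) -> int:
    """Return the HTTP status code of a GET request; raises on network or HTTP errors."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.status


def probe(url: str, fetch: Callable[[str], int] = http_fetch) -> Dict[str, float]:
    """Call the URL once and report success (1.0 or 0.0) and the elapsed time in seconds."""
    start = time.monotonic()
    try:
        status = fetch(url)
        success = 1.0 if 200 <= status < 300 else 0.0
    except Exception:
        # Any failure (DNS, timeout, TLS, HTTP error) counts as an unsuccessful probe.
        success = 0.0
    return {
        "probe_success": success,
        "probe_duration_seconds": time.monotonic() - start,
    }
```

Probing the central service URLs from outside the implementing Pods approximates what an end user of the cluster services would experience.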
Dashboards (Demo)
Conclusion
• Special design decisions:
  • Remote write: the key metrics are stored in an external database for longer retention.
  • Blackbox exporter: used for active service availability tests.
• Grafana and its plugins (especially polystat-panel) are awesome tools for visualizing metrics in a compact way.
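The remote-write decision amounts to forwarding only a selected subset of metrics to the external database. In Prometheus this kind of selection is typically configured with write relabeling on the metric name; the sketch below mimics such a "keep" rule in Python. The sample tuples and the pattern of "key metrics" are hypothetical, since the talk does not list which metrics were selected.

```python
import re
from typing import Dict, Iterable, List, Tuple

Sample = Tuple[str, Dict[str, str], float]  # (metric name, labels, value)

# Hypothetical selection of key metrics; the actual list is not given in the slides.
KEEP_PATTERN = r"probe_success|probe_duration_seconds|kube_node_status_condition|node_load1"


def select_for_remote_write(samples: Iterable[Sample],
                            keep_pattern: str = KEEP_PATTERN) -> List[Sample]:
    """Keep only samples whose metric name fully matches the pattern."""
    regex = re.compile(keep_pattern)
    # fullmatch mirrors the fully anchored regex matching of Prometheus relabeling.
    return [sample for sample in samples if regex.fullmatch(sample[0])]


# Hypothetical scraped samples, for illustration only.
SCRAPED = [
    ("probe_success", {"instance": "https://registry.example.invalid"}, 1.0),
    ("node_load1", {"node": "node-1"}, 0.42),
    ("node_load15", {"node": "node-1"}, 0.30),
    ("container_cpu_usage_seconds_total", {"pod": "router-1"}, 1234.5),
]

FORWARDED = select_for_remote_write(SCRAPED)
```

Keeping the forwarded set small is what makes longer retention in the external database affordable, while the full metric volume stays in the short-lived local Prometheus storage.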
Thank You!
ConSol
Consulting & Solutions Software GmbH
St.-Cajetan-Straße 43
D-81669 München
Tel.: +49-89-45841-100
info@consol.de
www.consol.de
Twitter: @consol_de

OSMC 2019 | Monitoring Cockpit for Kubernetes Clusters by Ulrike Klusik
