Talk by: Eric Lippmann
We recently started researching and developing a Module for Icinga to monitor Kubernetes environments. Over the past months we have learned a lot about the platform and how to monitor Kubernetes with Icinga efficiently. In this talk I will present our challenges as well as the progress we have made. The talk includes a sneak peek into the current state of the Module and outlines our vision of monitoring Kubernetes with Icinga.
8. Monitoring K8s – What to Monitor
• Hosts (where K8s components run)
• K8s itself
• Services, e.g. Deployments, *Sets, Jobs
• Pods
• Containers
• Key metrics
Not only infrastructure but also workloads
9. Challenges – Complexity
• Loads of resource types
• Multiple components and layers
• Different failure points
• Understanding of the entire stack
Via hosts, services and check plugins?
14. K8s Monitoring – Probes
Liveness probes periodically check whether a container is still alive and restart containers that fail them.
Readiness probes indicate whether a container is ready to serve traffic and remove failing containers from their service endpoints.
Startup probes defer the execution of liveness and readiness probes until the container has started successfully, and restart containers that fail them.
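For reference, a minimal sketch of how these three probes are declared, using the Go types from k8s.io/api; the image, paths, ports and thresholds are illustrative assumptions, not recommendations:

```go
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/util/intstr"
)

// containerWithProbes returns a container spec carrying all three
// probe types. Paths, ports and thresholds are made up for the example.
func containerWithProbes() corev1.Container {
	return corev1.Container{
		Name:  "app",
		Image: "example/app:latest", // hypothetical image
		// Liveness: restart the container once this endpoint
		// fails three times in a row.
		LivenessProbe: &corev1.Probe{
			ProbeHandler: corev1.ProbeHandler{
				HTTPGet: &corev1.HTTPGetAction{Path: "/healthz", Port: intstr.FromInt(8080)},
			},
			PeriodSeconds:    10,
			FailureThreshold: 3,
		},
		// Readiness: while this fails, the pod is removed from
		// its service endpoints but not restarted.
		ReadinessProbe: &corev1.Probe{
			ProbeHandler: corev1.ProbeHandler{
				HTTPGet: &corev1.HTTPGetAction{Path: "/ready", Port: intstr.FromInt(8080)},
			},
			PeriodSeconds: 5,
		},
		// Startup: liveness and readiness are deferred until this
		// succeeds once; slow starters get up to 5 minutes here.
		StartupProbe: &corev1.Probe{
			ProbeHandler: corev1.ProbeHandler{
				HTTPGet: &corev1.HTTPGetAction{Path: "/healthz", Port: intstr.FromInt(8080)},
			},
			PeriodSeconds:    10,
			FailureThreshold: 30,
		},
	}
}

func main() {
	c := containerWithProbes()
	fmt.Printf("%s declares liveness, readiness and startup probes\n", c.Name)
}
```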
15. K8s Monitoring – Approaches
• Poll K8s APIs (see the sketch after this list)
• Agent per node via DaemonSet
• Agent per pod (sidecar container)
• Events
• Metrics
• Logs
• APM
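As an illustration of the first approach, here is a minimal sketch that polls the K8s API once with client-go and reports pods that are not running; kubeconfig handling and error reporting are deliberately simplified:

```go
package main

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Build a client from the local kubeconfig; an agent running
	// inside the cluster would use the in-cluster config instead.
	config, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	clientset, err := kubernetes.NewForConfig(config)
	if err != nil {
		panic(err)
	}

	// Poll the API once: list all pods and report those not running.
	pods, err := clientset.CoreV1().Pods(metav1.NamespaceAll).List(context.Background(), metav1.ListOptions{})
	if err != nil {
		panic(err)
	}
	for _, pod := range pods.Items {
		if pod.Status.Phase != "Running" {
			fmt.Printf("%s/%s: %s\n", pod.Namespace, pod.Name, pod.Status.Phase)
		}
	}
}
```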
16. Possible K8s Metric Sources
• Node metrics from Prometheus node exporter (see the sketch after this list)
• Container metrics from cAdvisor (or metrics-server)
• K8s metrics
• API server
• etcd
• scheduler
• controller manager
• kube-state-metrics
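All of these sources expose metrics in the Prometheus text format, so a collector can consume them uniformly. A minimal sketch using the expfmt parser from github.com/prometheus/common; the node exporter URL is a placeholder:

```go
package main

import (
	"fmt"
	"net/http"

	"github.com/prometheus/common/expfmt"
)

func main() {
	// Placeholder endpoint; kube-state-metrics, cAdvisor, etcd etc.
	// expose the same exposition format on their own ports.
	resp, err := http.Get("http://localhost:9100/metrics")
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	// Parse the Prometheus text format into metric families.
	var parser expfmt.TextParser
	families, err := parser.TextToMetricFamilies(resp.Body)
	if err != nil {
		panic(err)
	}
	for name, family := range families {
		fmt.Println(name, family.GetType())
	}
}
```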
18. Icinga K8s Monitoring, at the moment…
• Collects K8s resources and their health, events, certain metrics and logs
• Visualizes K8s resources and hierarchies
19. Icinga K8s Monitoring, should also…
• Correlate health, logs, metrics and events
• Provide alerts
• Of course, via icinga-notifications
• Give configuration tips
20. Icinga K8s Monitoring Architecture
• Icinga Web Module (PHP)
• View resources and hierarchies
• Daemon (Go)
• Collect resources, health, events, logs and certain metrics (see the sketch below)
• Send alerts
• Database (PostgreSQL / MySQL / MariaDB)
• Stores resources, health, …
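A rough sketch of how such a Go collector daemon might watch resources, assuming client-go shared informers; the actual module may be structured differently, and persisting to the database is stubbed out with prints:

```go
package main

import (
	"fmt"
	"time"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/client-go/informers"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/cache"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	config, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	clientset, err := kubernetes.NewForConfig(config)
	if err != nil {
		panic(err)
	}

	// Watch pods via a shared informer; a real collector would
	// upsert these updates into the database instead of printing.
	factory := informers.NewSharedInformerFactory(clientset, 10*time.Minute)
	podInformer := factory.Core().V1().Pods().Informer()
	podInformer.AddEventHandler(cache.ResourceEventHandlerFuncs{
		AddFunc: func(obj interface{}) {
			pod := obj.(*corev1.Pod)
			fmt.Println("add", pod.Namespace, pod.Name)
		},
		UpdateFunc: func(_, obj interface{}) {
			pod := obj.(*corev1.Pod)
			fmt.Println("update", pod.Namespace, pod.Name, pod.Status.Phase)
		},
	})

	stop := make(chan struct{})
	factory.Start(stop)
	cache.WaitForCacheSync(stop, podInformer.HasSynced)
	select {} // run until killed
}
```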
Hosts
• Rather static
• Ping checks
Services
• Resource usage
• CPU, Memory, Storage, Network, Latencies
Apps
• Webserver
• Databases
• URLs
Check Plugins
• Contain logic
• Common understanding of what is wrong
• Not everyone has to find and configure their own rules
Cube
Business Process
vSphere
Hosts
• K8s Nodes
K8s itself
• etcd, scheduler, controller manager, API server
Services, a.k.a. K8s resources
Cluster Monitoring (infrastructure)
Every cluster should monitor the underlying server components, since problems at the server level will show up in the workloads. Metrics to look for when monitoring node resources are CPU, disk, and network bandwidth. An overview of these metrics tells you whether it is time to scale the cluster up or down (especially useful with cloud providers, where running cost matters).
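For example, per-node CPU and memory usage can be read from the metrics-server API. A sketch assuming k8s.io/metrics and a running metrics-server:

```go
package main

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/tools/clientcmd"
	metricsclient "k8s.io/metrics/pkg/client/clientset/versioned"
)

func main() {
	config, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	client, err := metricsclient.NewForConfig(config)
	if err != nil {
		panic(err)
	}

	// Query metrics-server for current per-node CPU and memory usage.
	nodes, err := client.MetricsV1beta1().NodeMetricses().List(context.Background(), metav1.ListOptions{})
	if err != nil {
		panic(err)
	}
	for _, node := range nodes.Items {
		cpu := node.Usage.Cpu().MilliValue()
		mem := node.Usage.Memory().Value() / (1024 * 1024)
		fmt.Printf("%s: cpu=%dm memory=%dMi\n", node.Name, cpu, mem)
	}
}
```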
Workload Monitoring (workload)
Metrics related to deployments and their pods should be taken into consideration here. Comparing the number of pods a deployment currently has with its desired state can be revealing. In addition, we can look at health checks, container metrics, and finally application metrics.
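A sketch of that desired-vs-ready comparison for deployments, using client-go; alerting is reduced to a print:

```go
package main

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	config, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	clientset, err := kubernetes.NewForConfig(config)
	if err != nil {
		panic(err)
	}

	// Compare desired vs. ready replicas for every deployment.
	deployments, err := clientset.AppsV1().Deployments(metav1.NamespaceAll).List(context.Background(), metav1.ListOptions{})
	if err != nil {
		panic(err)
	}
	for _, d := range deployments.Items {
		desired := int32(1) // Kubernetes defaults replicas to 1 when unset
		if d.Spec.Replicas != nil {
			desired = *d.Spec.Replicas
		}
		if d.Status.ReadyReplicas < desired {
			fmt.Printf("%s/%s: %d/%d replicas ready\n", d.Namespace, d.Name, d.Status.ReadyReplicas, desired)
		}
	}
}
```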
Everything is gone
Logs, metrics, events
Jobs
Configuration changes
Scaling
Name changes (not for StatefulSet)
History
Collect everything but alert on service level
In order to determine the health at every level, from the application to the operating system to the infrastructure, you need to monitor metrics in all the different layers and components: services, containers, pods, deployments, nodes, and clusters. And everyone has to understand which metrics exist, what they mean and how to interpret them.
Imagine, for example, a two-node cluster in which a single container leaks memory until its node is exhausted while the other node sits nearly idle. In this scenario, monitoring the cluster metrics would show roughly 50% memory utilization. It's not very useful information, nor is it alarming. But what would happen if you go down a level and monitor the metrics of each node? In that case, one of the nodes would show 100% memory usage, which would reveal a problem, but not its origin. Going down another level to the pod metrics would get you closer to the problem, and going down yet another level to the container metrics would allow you to isolate the culprit of the memory leak.
This simple example shows the value of monitoring the metrics of each Kubernetes layer. Yes, cluster-wide metrics provide a high-level overview of Kubernetes deployment performance, but you’ll need those lower-layer metrics to identify problems and obtain useful insights that will help you administer the cluster and optimize the resources.
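A sketch of that final drill-down step, reading per-container memory usage from metrics-server to isolate the leaking container (again assuming k8s.io/metrics):

```go
package main

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/tools/clientcmd"
	metricsclient "k8s.io/metrics/pkg/client/clientset/versioned"
)

func main() {
	config, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	client, err := metricsclient.NewForConfig(config)
	if err != nil {
		panic(err)
	}

	// List per-container memory usage across all namespaces so the
	// container consuming the most memory stands out.
	pods, err := client.MetricsV1beta1().PodMetricses(metav1.NamespaceAll).List(context.Background(), metav1.ListOptions{})
	if err != nil {
		panic(err)
	}
	for _, pod := range pods.Items {
		for _, c := range pod.Containers {
			mem := c.Usage.Memory().Value() / (1024 * 1024)
			fmt.Printf("%s/%s/%s: %dMi\n", pod.Namespace, pod.Name, c.Name, mem)
		}
	}
}
```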
Cluster
• Kubernetes components
• Resource usage
• Underutilized / over capacity
Nodes
• Number of nodes sufficient?
• Account for node failures
• Capacity of pods, IPs and resources
Pods
• Resource usage against requests and limits
• Running vs. desired
Containers
• Logs
• Metrics
Cluster, Pods, Containers, Deployments, Sets, Applications
Expectations
• Number of replicas
Deployment
• Updated pods