@nileshgule
Improve Monitoring and Observability for Kubernetes with OSS tools
Nilesh Gule
ARCHITECT | MICROSOFT MVP | First Docker Captain in Singapore
“Code with Passion and Strive for Excellence”
nileshgule
@nileshgule
Nilesh Gule
NileshGule
www.handsonarchitect.com
https://www.youtube.com/@nilesh-gule
CNCF Cloud Native Trail Map
https://github.com/cncf/trailmap
CNCF Observability landscape
https://landscape.cncf.io
CNCF Observability Radar
https://radar.cncf.io/2020-09-observability
3 Pillars of Observability
Logs | Metrics | Traces
Centralized Logging
Why centralized logging
Container based workloads
❑ Application specific
❖ Long-term log retention for compliance reasons
❖ Workloads scheduled on different nodes during application restarts / updates
❖ Autoscaling workloads
❑ Kubernetes upgrades
❖ Auto healing can reschedule workloads
❖ Underlying nodes added / deleted during cluster scaling
❖ Underlying nodes replaced during cluster upgrades
PaaS / Serverless services
❖ Not much control over underlying infrastructure
❖ Relies on cloud provider specific logging and monitoring solutions
Financial Services App Loki integration
Log collector | Log storage | Log search, visualise, dashboards
backend-service | account-service | authentication-service | forex-service | transaction-service
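A log collector can only ship what the services emit. A minimal sketch, assuming the collector reads container stdout and that each service writes one JSON object per line. The field names (`ts`, `service`, `level`, `message`) and the helpers `format_log_line` and `log` are illustrative and not taken from the demo app:

```python
import json
import sys
from datetime import datetime, timezone


def format_log_line(service, level, message, **fields):
    """Render one log event as a single-line JSON string (hypothetical schema)."""
    event = {
        "ts": datetime.now(timezone.utc).isoformat(),
        "service": service,
        "level": level,
        "message": message,
    }
    event.update(fields)
    # sort_keys keeps the output stable; one event per line is easy to tail.
    return json.dumps(event, sort_keys=True)


def log(service, level, message, **fields):
    """Write the event to stdout, where a node-level collector can read it."""
    sys.stdout.write(format_log_line(service, level, message, **fields) + "\n")


if __name__ == "__main__":
    log("forex-service", "info", "rate lookup", currency_pair="USD/SGD")
```

Keeping `service` and `level` as top-level fields would let a log backend filter on them without parsing the free-text message.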
Demo 1 – Log Aggregation with Loki
Metrics
Why Metrics
Container based workloads
• Application specific
  • Monitor resource usage
  • Monitor scaling needs
  • Monitor anomalies / outliers
• Kubernetes platform level
  • Monitor cluster resources (CPU / RAM)
  • API health
  • Autoscaling
PaaS / Serverless services
• Monitor resource usage
• Scaling
• Bottlenecks
Prometheus Architecture
Demo 2 – Metrics using Prometheus & Grafana
Financial Services App Prometheus integration
Scrape metrics | Metrics storage | Visualise, dashboards
backend-service | account-service | authentication-service | forex-service | transaction-service
service-monitor
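Scraping works because each service exposes its metrics over HTTP in the Prometheus text exposition format. A minimal stdlib-only sketch of rendering a counter in that format. The metric name `http_requests_total`, its labels, and the helper `render_counter` are hypothetical; a real service would more likely use an official Prometheus client library, and this sketch assumes label values need no escaping:

```python
def render_counter(name, help_text, samples):
    """Render a counter in the Prometheus text exposition format.

    samples maps a tuple of (label, value) pairs to the counter value.
    Assumes every sample has at least one label and no characters to escape.
    """
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} counter"]
    for labels, value in sorted(samples.items()):
        label_str = ",".join(f'{key}="{val}"' for key, val in labels)
        lines.append(f"{name}{{{label_str}}} {value}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    samples = {
        (("service", "account-service"), ("status", "200")): 12,
        (("service", "forex-service"), ("status", "500")): 1,
    }
    print(render_counter("http_requests_total", "Total HTTP requests.", samples), end="")
```

A scraper would fetch this text from an endpoint on each service at a fixed interval and store each sample as a time series keyed by metric name and labels.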
Distributed Tracing
Why Distributed Tracing
• Distributed Tracing
• Understanding complex systems
• Performance monitoring and optimizations
• Debugging and problem resolution
Financial Services App Jaeger integration
Distributed traces | Visualise traces
backend-service | account-service | authentication-service | forex-service | transaction-service
Jaeger Operator
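A trace spans several services only if each hop passes its trace context to the next. A minimal sketch, assuming the services propagate context with the W3C Trace Context `traceparent` HTTP header. The deck does not say which propagation format the demo uses, and the helper names `new_traceparent` and `child_traceparent` are hypothetical:

```python
import secrets


def new_traceparent():
    """Start a new trace: version 00, random trace id and span id, sampled flag 01."""
    trace_id = secrets.token_hex(16)  # 32 hex characters
    span_id = secrets.token_hex(8)    # 16 hex characters
    return f"00-{trace_id}-{span_id}-01"


def child_traceparent(parent):
    """Keep the caller's trace id, but mint a fresh span id for this hop."""
    version, trace_id, _parent_span_id, flags = parent.split("-")
    return f"{version}-{trace_id}-{secrets.token_hex(8)}-{flags}"


if __name__ == "__main__":
    incoming = new_traceparent()            # e.g. created at backend-service
    outgoing = child_traceparent(incoming)  # e.g. forwarded to forex-service
    print(incoming)
    print(outgoing)
```

Because both headers share the same trace id, a tracing backend can group the spans from every service into one end-to-end trace.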
Demo 3 – Distributed Tracing using Jaeger
End to End Observability
backend-service | account-service | authentication-service | forex-service | transaction-service
Analogy: use the right tool for the right purpose
Summary
Modern cloud native applications need new ways to address observability & monitoring
✓ Use best-in-class tools for the given use case
✓ Rely on open standards (e.g. OpenTelemetry)
✓ Build portable observability systems (e.g. hybrid cloud migration)
Log Aggregation
✓ Loki helps in centralized logging
✓ Grafana is used to visualize logs and build dashboards
Metrics
✓ Prometheus provides easy-to-use metrics for platforms and applications
✓ Grafana provides visualization capabilities to build intuitive dashboards
Distributed Tracing
✓ Jaeger provides distributed tracing capabilities
Some Recommendations
Challenges | Tools
♣ Too many agents | ∞ OpenTelemetry collector
♣ Instrumentation, vendor lock-in | ∞ OpenTelemetry, OpenMetrics
♣ Cloud native logs | ∞ Fluent Bit / Fluentd, OpenSearch, Loki
♣ Cloud native metrics | ∞ Prometheus, Cortex, Thanos
♣ Cloud native traces | ∞ OpenTelemetry, Jaeger, Grafana
♣ Single pane of glass, correlation | ∞ Grafana
References
Log Aggregation
❖ Grafana Loki
Monitoring & Alerting
❖ Prometheus
❖ Grafana
❖ Kube Prometheus stack
❖ Houssem Dellai – Prometheus & Grafana for monitoring Kubernetes
Distributed Tracing
❖ Jaeger Tracing
Source Code & slide deck
Financial Services Demo
https://github.com/infofractionalservices/microservices/tree/docker_build_fixes
https://speakerdeck.com/nileshgule/
https://www.slideshare.net/nileshgule/
Q&A
