SlideShare a Scribd company logo
1 of 23
Download to read offline
Principles of Observability
Jānis Orlovs
Riga DevOPS Meetup
22th November, 2017
About Me
• I come from OPS side
• For me most interesting part is
to make operations boring
• Mostly have been working in
Financial Services
• 4finance
• Swedbank
• KPMG
• T
• Playing basketball in amateur
level for ages
Complexity of Systems
Level of System Distribution
simple monolith
modular monolith
complex modular monolith or microservices
Complexity
Failures of Complex Systems
• Complex systems are
intrinsically hazardous systems
• Catastrophe is always just
around the corner
• All practitioner actions are
gambles.
Paper URL:
https://goo.gl/sTvJw8
Failures as Mysteries
https://twitter.com/honest_update
Monitoring System
Monitoring system complexes should address two questions:
what’s broken, and why? ...
“What” versus “why” is one of the most important distinctions in
writing good monitoring with maximum signal and minimum
noise
Source: Service Reliability Engineering Book
General Monitoring House Rules
• Metrics and Checks that catch real incidents most often should be as
simple, predictable, and reliable as possible.
• Data collection, aggregation, and alerting configuration that is rarely
exercised should be up for removal.
• Signals that are collected, but not exposed in any prebaked dashboard
nor used by any alert, are candidates for removal.
Monitoring Approaches
Blackbox Approach: Checks Monitoring
• Checks, not metrics.
• Simple, yes/no questions.
• First generation of monitoring systems
• Not suitable what’s actually happening under the hood, without
guessing
Whitebox Monitoring
Whitebox Approach: Metrics Monitoring
• Addreses known failure vectors.
• There is needed to be developed instrumentation for exposing
data to monitoring
• Proper monitoring is mixture technical data with business data
• Too much monitoring is noise
Whitebox Approach: Logging
• Valuable insigth: place where starts are investigations
• View of Request
• View of System
• Easy to collect data, from data points.
• Plain text
• Structured
• Binary
• LogAll vs LogActionalbe data
• Data sets bloats, large scale ingestions of data tricky
Tracing
• Most challenging part to implement from historical point-of-
view
• Tracing captures the lifetime of requests as they flow through
the various components of a distributed system
• Recent developments in tracing tools gives brigth look in future:
• Dtrace and BFP framework
• OpenTracing: http://opentracing.io/
Observability
In control theory, observability is a measure of how well internal states
of a system can be inferred from knowledge of its external outputs. The
observability and controllability of a system are mathematical duals.
Source: Wikipedia
Choosing Rigth Observability Tools
Privacy and Observability
• Starting 25th May, 2018 EU personal data protection directive or
GDPR will be fully in place.
• Drastic accountability measures:
• Up to 10m EUR or 2% global turnover for the first audit fail
• Up to 20m EUR or 4% global turnover for the second audit fail
• Observability tools are silent huge personal data collectors
• Include in your Company’s data protection Sscope or anonymize
data
Conclusions
• Reliability of systems makes money (not loosing it)
• In distributed systems all teams involved in systems
development has to commit to making systems observable
• For one type of tasks choose one tool
• Review what data you collect, visualize your data
• Pick your own Observability target based on the requirements
of your service.
Principles of Observability
Jānis Orlovs
Riga DevOPS Meetup
22th November, 2017

More Related Content

What's hot

Improve monitoring and observability for kubernetes with oss tools
Improve monitoring and observability for kubernetes with oss toolsImprove monitoring and observability for kubernetes with oss tools
Improve monitoring and observability for kubernetes with oss toolsNilesh Gule
 
Prometheus (Prometheus London, 2016)
Prometheus (Prometheus London, 2016)Prometheus (Prometheus London, 2016)
Prometheus (Prometheus London, 2016)Brian Brazil
 
OpenTelemetry Introduction
OpenTelemetry Introduction OpenTelemetry Introduction
OpenTelemetry Introduction DimitrisFinas1
 
DevOps Monitoring and Alerting
DevOps Monitoring and AlertingDevOps Monitoring and Alerting
DevOps Monitoring and AlertingKhairul Zebua
 
Monitoring and observability
Monitoring and observabilityMonitoring and observability
Monitoring and observabilityTheo Schlossnagle
 
Observability, Distributed Tracing, and Open Source: The Missing Primer
Observability, Distributed Tracing, and Open Source: The Missing PrimerObservability, Distributed Tracing, and Open Source: The Missing Primer
Observability, Distributed Tracing, and Open Source: The Missing PrimerVMware Tanzu
 
Learning how AWS implement AWS VPC CNI
Learning how AWS implement AWS VPC CNILearning how AWS implement AWS VPC CNI
Learning how AWS implement AWS VPC CNIHungWei Chiu
 
Introduction to Prometheus
Introduction to PrometheusIntroduction to Prometheus
Introduction to PrometheusJulien Pivotto
 
Welcome & AWS Big Data Solution Overview
Welcome & AWS Big Data Solution OverviewWelcome & AWS Big Data Solution Overview
Welcome & AWS Big Data Solution OverviewAmazon Web Services
 
OSMC 2022 | OpenTelemetry 101 by Dotan Horovit s.pdf
OSMC 2022 | OpenTelemetry 101 by Dotan Horovit s.pdfOSMC 2022 | OpenTelemetry 101 by Dotan Horovit s.pdf
OSMC 2022 | OpenTelemetry 101 by Dotan Horovit s.pdfNETWAYS
 
Opentelemetry - From frontend to backend
Opentelemetry - From frontend to backendOpentelemetry - From frontend to backend
Opentelemetry - From frontend to backendSebastian Poxhofer
 
Cloud-Native Observability
Cloud-Native ObservabilityCloud-Native Observability
Cloud-Native ObservabilityTyler Treat
 
How to Move from Monitoring to Observability, On-Premises and in a Multi-Clou...
How to Move from Monitoring to Observability, On-Premises and in a Multi-Clou...How to Move from Monitoring to Observability, On-Premises and in a Multi-Clou...
How to Move from Monitoring to Observability, On-Premises and in a Multi-Clou...Splunk
 
Exploring the power of OpenTelemetry on Kubernetes
Exploring the power of OpenTelemetry on KubernetesExploring the power of OpenTelemetry on Kubernetes
Exploring the power of OpenTelemetry on KubernetesRed Hat Developers
 
Big Data Redis Mongodb Dynamodb Sharding
Big Data Redis Mongodb Dynamodb ShardingBig Data Redis Mongodb Dynamodb Sharding
Big Data Redis Mongodb Dynamodb ShardingAraf Karsh Hamid
 
DevOps Transformation in Technical
DevOps Transformation in TechnicalDevOps Transformation in Technical
DevOps Transformation in TechnicalOpsta
 
Terraform GitOps on Codefresh
Terraform GitOps on CodefreshTerraform GitOps on Codefresh
Terraform GitOps on CodefreshCodefresh
 

What's hot (20)

Observability
ObservabilityObservability
Observability
 
Improve monitoring and observability for kubernetes with oss tools
Improve monitoring and observability for kubernetes with oss toolsImprove monitoring and observability for kubernetes with oss tools
Improve monitoring and observability for kubernetes with oss tools
 
Prometheus (Prometheus London, 2016)
Prometheus (Prometheus London, 2016)Prometheus (Prometheus London, 2016)
Prometheus (Prometheus London, 2016)
 
OpenTelemetry Introduction
OpenTelemetry Introduction OpenTelemetry Introduction
OpenTelemetry Introduction
 
DevOps Monitoring and Alerting
DevOps Monitoring and AlertingDevOps Monitoring and Alerting
DevOps Monitoring and Alerting
 
Monitoring and observability
Monitoring and observabilityMonitoring and observability
Monitoring and observability
 
Observability, Distributed Tracing, and Open Source: The Missing Primer
Observability, Distributed Tracing, and Open Source: The Missing PrimerObservability, Distributed Tracing, and Open Source: The Missing Primer
Observability, Distributed Tracing, and Open Source: The Missing Primer
 
Learning how AWS implement AWS VPC CNI
Learning how AWS implement AWS VPC CNILearning how AWS implement AWS VPC CNI
Learning how AWS implement AWS VPC CNI
 
Introduction to Prometheus
Introduction to PrometheusIntroduction to Prometheus
Introduction to Prometheus
 
Welcome & AWS Big Data Solution Overview
Welcome & AWS Big Data Solution OverviewWelcome & AWS Big Data Solution Overview
Welcome & AWS Big Data Solution Overview
 
OSMC 2022 | OpenTelemetry 101 by Dotan Horovit s.pdf
OSMC 2022 | OpenTelemetry 101 by Dotan Horovit s.pdfOSMC 2022 | OpenTelemetry 101 by Dotan Horovit s.pdf
OSMC 2022 | OpenTelemetry 101 by Dotan Horovit s.pdf
 
Opentelemetry - From frontend to backend
Opentelemetry - From frontend to backendOpentelemetry - From frontend to backend
Opentelemetry - From frontend to backend
 
Cloud-Native Observability
Cloud-Native ObservabilityCloud-Native Observability
Cloud-Native Observability
 
Gitops Hands On
Gitops Hands OnGitops Hands On
Gitops Hands On
 
How to Move from Monitoring to Observability, On-Premises and in a Multi-Clou...
How to Move from Monitoring to Observability, On-Premises and in a Multi-Clou...How to Move from Monitoring to Observability, On-Premises and in a Multi-Clou...
How to Move from Monitoring to Observability, On-Premises and in a Multi-Clou...
 
Exploring the power of OpenTelemetry on Kubernetes
Exploring the power of OpenTelemetry on KubernetesExploring the power of OpenTelemetry on Kubernetes
Exploring the power of OpenTelemetry on Kubernetes
 
Big Data Redis Mongodb Dynamodb Sharding
Big Data Redis Mongodb Dynamodb ShardingBig Data Redis Mongodb Dynamodb Sharding
Big Data Redis Mongodb Dynamodb Sharding
 
DevOps Transformation in Technical
DevOps Transformation in TechnicalDevOps Transformation in Technical
DevOps Transformation in Technical
 
Observability driven development
Observability driven developmentObservability driven development
Observability driven development
 
Terraform GitOps on Codefresh
Terraform GitOps on CodefreshTerraform GitOps on Codefresh
Terraform GitOps on Codefresh
 

Similar to Principles of Observability Riga DevOPS Meetup

Secure and Compliant Data Management in FinTech Applications
Secure and Compliant Data Management in FinTech ApplicationsSecure and Compliant Data Management in FinTech Applications
Secure and Compliant Data Management in FinTech ApplicationsLionel Briand
 
IT Operation Analytic for security- MiSSconf(sp1)
IT Operation Analytic for security- MiSSconf(sp1)IT Operation Analytic for security- MiSSconf(sp1)
IT Operation Analytic for security- MiSSconf(sp1)stelligence
 
IBM i Security SIEM Integration
IBM i Security SIEM IntegrationIBM i Security SIEM Integration
IBM i Security SIEM IntegrationPrecisely
 
HPE-Security update talk presented in Vienna to partners on 15th April 2016
HPE-Security update talk presented in Vienna to partners on 15th April 2016HPE-Security update talk presented in Vienna to partners on 15th April 2016
HPE-Security update talk presented in Vienna to partners on 15th April 2016SteveAtHPE
 
Protecting Your Business from Unauthorized IBM i Access
Protecting Your Business from Unauthorized IBM i AccessProtecting Your Business from Unauthorized IBM i Access
Protecting Your Business from Unauthorized IBM i AccessPrecisely
 
2013 Data Protection Maturity Trends: How Do You Compare?
2013 Data Protection Maturity Trends: How Do You Compare?2013 Data Protection Maturity Trends: How Do You Compare?
2013 Data Protection Maturity Trends: How Do You Compare?Lumension
 
MongoDB.local Sydney: The Changing Face of Data Privacy & Ethics, and How Mon...
MongoDB.local Sydney: The Changing Face of Data Privacy & Ethics, and How Mon...MongoDB.local Sydney: The Changing Face of Data Privacy & Ethics, and How Mon...
MongoDB.local Sydney: The Changing Face of Data Privacy & Ethics, and How Mon...MongoDB
 
5 Reasons Observability of your Mainframe and IBM i is Critical for IT
5 Reasons Observability of your Mainframe and IBM i is Critical for IT 5 Reasons Observability of your Mainframe and IBM i is Critical for IT
5 Reasons Observability of your Mainframe and IBM i is Critical for IT Precisely
 
SpiceWorks Webinar: Whose logs, what logs, why logs
SpiceWorks Webinar: Whose logs, what logs, why logs  SpiceWorks Webinar: Whose logs, what logs, why logs
SpiceWorks Webinar: Whose logs, what logs, why logs AlienVault
 
The EU Data Protection Regulation and what it means for your organization
The EU Data Protection Regulation and what it means for your organizationThe EU Data Protection Regulation and what it means for your organization
The EU Data Protection Regulation and what it means for your organizationSophos Benelux
 
IBM i Security: Identifying the Events That Matter Most
IBM i Security: Identifying the Events That Matter MostIBM i Security: Identifying the Events That Matter Most
IBM i Security: Identifying the Events That Matter MostPrecisely
 
Identify and Stop Insider Threats
Identify and Stop Insider ThreatsIdentify and Stop Insider Threats
Identify and Stop Insider ThreatsLancope, Inc.
 
GDPR challenges for the healthcare sector and the practical steps to compliance
GDPR challenges for the healthcare sector and the practical steps to complianceGDPR challenges for the healthcare sector and the practical steps to compliance
GDPR challenges for the healthcare sector and the practical steps to complianceIT Governance Ltd
 
TIC-TOC: Disrupt the Threat Management Conversation with Dominique Singer and...
TIC-TOC: Disrupt the Threat Management Conversation with Dominique Singer and...TIC-TOC: Disrupt the Threat Management Conversation with Dominique Singer and...
TIC-TOC: Disrupt the Threat Management Conversation with Dominique Singer and...SaraPia5
 
CNIT 121: 2 IR Management Handbook
CNIT 121: 2 IR Management HandbookCNIT 121: 2 IR Management Handbook
CNIT 121: 2 IR Management HandbookSam Bowne
 
Automation of document management paul fenton webinar
Automation of document management paul fenton webinarAutomation of document management paul fenton webinar
Automation of document management paul fenton webinarMontrium
 
Webinar - Compliance with the Microsoft Cloud- 2017-04-19
Webinar - Compliance with the Microsoft Cloud- 2017-04-19Webinar - Compliance with the Microsoft Cloud- 2017-04-19
Webinar - Compliance with the Microsoft Cloud- 2017-04-19TechSoup
 

Similar to Principles of Observability Riga DevOPS Meetup (20)

Secure and Compliant Data Management in FinTech Applications
Secure and Compliant Data Management in FinTech ApplicationsSecure and Compliant Data Management in FinTech Applications
Secure and Compliant Data Management in FinTech Applications
 
Wc4
Wc4Wc4
Wc4
 
IT Operation Analytic for security- MiSSconf(sp1)
IT Operation Analytic for security- MiSSconf(sp1)IT Operation Analytic for security- MiSSconf(sp1)
IT Operation Analytic for security- MiSSconf(sp1)
 
IBM i Security SIEM Integration
IBM i Security SIEM IntegrationIBM i Security SIEM Integration
IBM i Security SIEM Integration
 
HPE-Security update talk presented in Vienna to partners on 15th April 2016
HPE-Security update talk presented in Vienna to partners on 15th April 2016HPE-Security update talk presented in Vienna to partners on 15th April 2016
HPE-Security update talk presented in Vienna to partners on 15th April 2016
 
Protecting Your Business from Unauthorized IBM i Access
Protecting Your Business from Unauthorized IBM i AccessProtecting Your Business from Unauthorized IBM i Access
Protecting Your Business from Unauthorized IBM i Access
 
2013 Data Protection Maturity Trends: How Do You Compare?
2013 Data Protection Maturity Trends: How Do You Compare?2013 Data Protection Maturity Trends: How Do You Compare?
2013 Data Protection Maturity Trends: How Do You Compare?
 
MongoDB.local Sydney: The Changing Face of Data Privacy & Ethics, and How Mon...
MongoDB.local Sydney: The Changing Face of Data Privacy & Ethics, and How Mon...MongoDB.local Sydney: The Changing Face of Data Privacy & Ethics, and How Mon...
MongoDB.local Sydney: The Changing Face of Data Privacy & Ethics, and How Mon...
 
Insider threat v3
Insider threat v3Insider threat v3
Insider threat v3
 
5 Reasons Observability of your Mainframe and IBM i is Critical for IT
5 Reasons Observability of your Mainframe and IBM i is Critical for IT 5 Reasons Observability of your Mainframe and IBM i is Critical for IT
5 Reasons Observability of your Mainframe and IBM i is Critical for IT
 
SpiceWorks Webinar: Whose logs, what logs, why logs
SpiceWorks Webinar: Whose logs, what logs, why logs  SpiceWorks Webinar: Whose logs, what logs, why logs
SpiceWorks Webinar: Whose logs, what logs, why logs
 
The EU Data Protection Regulation and what it means for your organization
The EU Data Protection Regulation and what it means for your organizationThe EU Data Protection Regulation and what it means for your organization
The EU Data Protection Regulation and what it means for your organization
 
IBM i Security: Identifying the Events That Matter Most
IBM i Security: Identifying the Events That Matter MostIBM i Security: Identifying the Events That Matter Most
IBM i Security: Identifying the Events That Matter Most
 
Identify and Stop Insider Threats
Identify and Stop Insider ThreatsIdentify and Stop Insider Threats
Identify and Stop Insider Threats
 
Cyber Forensics Module 1
Cyber Forensics Module 1Cyber Forensics Module 1
Cyber Forensics Module 1
 
GDPR challenges for the healthcare sector and the practical steps to compliance
GDPR challenges for the healthcare sector and the practical steps to complianceGDPR challenges for the healthcare sector and the practical steps to compliance
GDPR challenges for the healthcare sector and the practical steps to compliance
 
TIC-TOC: Disrupt the Threat Management Conversation with Dominique Singer and...
TIC-TOC: Disrupt the Threat Management Conversation with Dominique Singer and...TIC-TOC: Disrupt the Threat Management Conversation with Dominique Singer and...
TIC-TOC: Disrupt the Threat Management Conversation with Dominique Singer and...
 
CNIT 121: 2 IR Management Handbook
CNIT 121: 2 IR Management HandbookCNIT 121: 2 IR Management Handbook
CNIT 121: 2 IR Management Handbook
 
Automation of document management paul fenton webinar
Automation of document management paul fenton webinarAutomation of document management paul fenton webinar
Automation of document management paul fenton webinar
 
Webinar - Compliance with the Microsoft Cloud- 2017-04-19
Webinar - Compliance with the Microsoft Cloud- 2017-04-19Webinar - Compliance with the Microsoft Cloud- 2017-04-19
Webinar - Compliance with the Microsoft Cloud- 2017-04-19
 

Recently uploaded

"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 
APIForce Zurich 5 April Automation LPDG
APIForce Zurich 5 April  Automation LPDGAPIForce Zurich 5 April  Automation LPDG
APIForce Zurich 5 April Automation LPDGMarianaLemus7
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024The Digital Insurer
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxhariprasad279825
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Wonjun Hwang
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 

Recently uploaded (20)

"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
APIForce Zurich 5 April Automation LPDG
APIForce Zurich 5 April  Automation LPDGAPIForce Zurich 5 April  Automation LPDG
APIForce Zurich 5 April Automation LPDG
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 

Principles of Observability Riga DevOPS Meetup

  • 1. Principles of Observability Jānis Orlovs Riga DevOPS Meetup 22th November, 2017
  • 2. About Me • I come from OPS side • For me most interesting part is to make operations boring • Mostly have been working in Financial Services • 4finance • Swedbank • KPMG • T • Playing basketball in amateur level for ages
  • 3. Complexity of Systems Level of System Distribution simple monolith modular monolith complex modular monolith or microservices Complexity
  • 4. Failures of Complex Systems • Complex systems are intrinsically hazardous systems • Catastrophe is always just around the corner • All practitioner actions are gambles. Paper URL: https://goo.gl/sTvJw8
  • 6. Monitoring System Monitoring system complexes should address two questions: what’s broken, and why? ... “What” versus “why” is one of the most important distinctions in writing good monitoring with maximum signal and minimum noise Source: Service Reliability Engineering Book
  • 7. General Monitoring House Rules • Metrics and Checks that catch real incidents most often should be as simple, predictable, and reliable as possible. • Data collection, aggregation, and alerting configuration that is rarely exercised should be up for removal. • Signals that are collected, but not exposed in any prebaked dashboard nor used by any alert, are candidates for removal.
  • 9.
  • 10. Blackbox Approach: Checks Monitoring • Checks, not metrics. • Simple, yes/no questions. • First generation of monitoring systems • Not suitable what’s actually happening under the hood, without guessing
  • 12. Whitebox Approach: Metrics Monitoring • Addreses known failure vectors. • There is needed to be developed instrumentation for exposing data to monitoring • Proper monitoring is mixture technical data with business data • Too much monitoring is noise
  • 13.
  • 14. Whitebox Approach: Logging • Valuable insigth: place where starts are investigations • View of Request • View of System • Easy to collect data, from data points. • Plain text • Structured • Binary • LogAll vs LogActionalbe data • Data sets bloats, large scale ingestions of data tricky
  • 15.
  • 16. Tracing • Most challenging part to implement from historical point-of- view • Tracing captures the lifetime of requests as they flow through the various components of a distributed system • Recent developments in tracing tools gives brigth look in future: • Dtrace and BFP framework • OpenTracing: http://opentracing.io/
  • 17.
  • 18. Observability In control theory, observability is a measure of how well internal states of a system can be inferred from knowledge of its external outputs. The observability and controllability of a system are mathematical duals. Source: Wikipedia
  • 20.
  • 21. Privacy and Observability • Starting 25th May, 2018 EU personal data protection directive or GDPR will be fully in place. • Drastic accountability measures: • Up to 10m EUR or 2% global turnover for the first audit fail • Up to 20m EUR or 4% global turnover for the second audit fail • Observability tools are silent huge personal data collectors • Include in your Company’s data protection Sscope or anonymize data
  • 22. Conclusions • Reliability of systems makes money (not loosing it) • In distributed systems all teams involved in systems development has to commit to making systems observable • For one type of tasks choose one tool • Review what data you collect, visualize your data • Pick your own Observability target based on the requirements of your service.
  • 23. Principles of Observability Jānis Orlovs Riga DevOPS Meetup 22th November, 2017