SlideShare a Scribd company logo
1 of 48
From Monitoring to
Domain-Oriented
Observability
What’s the difference between monitoring and
observability and why does it matter?
2. About me
● Took part in developing of microservice architecture based on
Event Sourcing and CQRS
● I have obtained a position of Tech Lead. I implemented Canary
release and Feature Toggles (aka Feature Flags), migrated
microservice from REST to Event-Driven
● Made a webinar about microservices testing.
● I am trying to apply best engineering practices to Safeguard Cyber project
● I want to build such a process in a company (team) in which it will be pleasant to work and develop
professionally. Where ideas will be heard, where a person will not need to sacrifice his family or
health for professional growth.
3. Parts of presentation
● Why you should start thinking about monitoring and our case
● Theoretical minimum about monitoring
● How to switch from monitoring to Domain-oriented Observability
4. Part 1
● Why you should start thinking about monitoring and our case
5. About our product
6. Threat Detection
7. Cyber Defense
8. Machine Learning and AI
9. Fire on Production
10. Fire on production
11. Fire on production
12. Fire on production
12. Fire on production
14. Problems we faced with
● What is the cause of performance drop?
● What can lead to poor system performance?
● How can certain changes influence the system?
● Our product is difficult to fit in SLA
15. Part 2
● Theoretical minimum about monitoring
16 Monitoring
1) Logging
2) Tracing
3) Metrics
4) Alerts
“A log is an immutable,
timestamped record of event
describing what happened over
time “
17 Logging
18. Kibana filters
19. Kibana filters result
20. Errors frequency analysis
“A trace is a representation of series
of causally related distributed events
that encode the end-to-end request
flow through a distributed system”
21. Tracing
22. Tracing
“Metrics are a numeric
representation of data
measure over intervals of
time”
23. Metrics
24. Metrics
25. Metrics
26. Metrics
● throughput
● success
● error
● performance
27. Metrics subtypes
● Rate - the number of requests, per second, you services are serving.
● Errors - the number of failed requests per second.
● Duration - distributions of the amount of time each request takes.
28. Three key metrics by RED methodology
31. Trends are very important.
30. Trends are very important
Automated alerts are essential to
monitoring. They allow you to spot
problems anywhere in your
infrastructure, so that you can
rapidly identify their causes and
minimize service degradation and
disruption. Alerts draw human
attention to the particular systems
that require observation,
inspection, and intervention.
31. Alerts
● There should be people’s reaction
● Alert should have priority
● There should be possibility to disable notifications
● Alert should provide further instructions
32. Alert rules
33. Alerts in Kibana
34. Part 3
● How to switch from monitoring to Domain-oriented Observability
Definition:
“In control theory, observability is a measure of how well internal states of a
system can be inferred from knowledge of its external outputs. The observability
and controllability of a system are mathematical duals.”
- Wikipedia
In English:
Can you understand what’s happening inside your code and system, simply by
asking questions using your tools? Can you answer any new question you think
of, or only the ones you prepared for?
35. Observability
36. Black and White box Component View
37. Observability code example
38. Observability code example
39. Cleanup the mess
40. Cleanup the mess
41. Moving code to class
42. Domain Probe: DiscountInstrumentation
● AOP
● DECORATOR
● λάμβδα
43. Other opportunities
44. Testing + Monitoring
45. Testing + Monitoring
Start being proactive
Don’t be firefighters
● The RED Method
● Monitoring Distributed Systems
● Domain-Oriented Observability
● Distributed Systems Observability by Cindy Sridharan
● Testing in Production, the safe way
● Deploy != Release part1 and part2
● SRE: Observability: Metric Namespaces and Structures
● Observability: Metric, Logging, and Tracing
● Decorator
● Monitoring in the time of Cloud Native
● https://www.elastic.co/learn
46. Resources:
Questions

More Related Content

What's hot

Monitoring and observability
Monitoring and observabilityMonitoring and observability
Monitoring and observabilityTheo Schlossnagle
 
Logging and observability
Logging and observabilityLogging and observability
Logging and observabilityAnton Drukh
 
Monitoring and observability
Monitoring and observabilityMonitoring and observability
Monitoring and observabilityTheo Schlossnagle
 
More Than Monitoring: How Observability Takes You From Firefighting to Fire P...
More Than Monitoring: How Observability Takes You From Firefighting to Fire P...More Than Monitoring: How Observability Takes You From Firefighting to Fire P...
More Than Monitoring: How Observability Takes You From Firefighting to Fire P...DevOps.com
 
How to Move from Monitoring to Observability, On-Premises and in a Multi-Clou...
How to Move from Monitoring to Observability, On-Premises and in a Multi-Clou...How to Move from Monitoring to Observability, On-Premises and in a Multi-Clou...
How to Move from Monitoring to Observability, On-Premises and in a Multi-Clou...Splunk
 
Observability For Modern Applications
Observability For Modern ApplicationsObservability For Modern Applications
Observability For Modern ApplicationsAmazon Web Services
 
Observability at Scale
Observability at Scale Observability at Scale
Observability at Scale Knoldus Inc.
 
Observability – the good, the bad, and the ugly
Observability – the good, the bad, and the uglyObservability – the good, the bad, and the ugly
Observability – the good, the bad, and the uglyTimetrix
 
Observability for modern applications
Observability for modern applications  Observability for modern applications
Observability for modern applications MoovingON
 
Combining logs, metrics, and traces for unified observability
Combining logs, metrics, and traces for unified observabilityCombining logs, metrics, and traces for unified observability
Combining logs, metrics, and traces for unified observabilityElasticsearch
 
Datadog: From a single product to a growing platform by Alexis Lê-Quôc, CTO
Datadog: From a single product to a growing platform by Alexis Lê-Quôc, CTODatadog: From a single product to a growing platform by Alexis Lê-Quôc, CTO
Datadog: From a single product to a growing platform by Alexis Lê-Quôc, CTOTheFamily
 
OpenTelemetry For Developers
OpenTelemetry For DevelopersOpenTelemetry For Developers
OpenTelemetry For DevelopersKevin Brockhoff
 
Monitor every app, in every stage, with free and open Elastic APM
Monitor every app, in every stage, with free and open Elastic APMMonitor every app, in every stage, with free and open Elastic APM
Monitor every app, in every stage, with free and open Elastic APMElasticsearch
 
THE STATE OF OPENTELEMETRY, DOTAN HOROVITS, Logz.io
THE STATE OF OPENTELEMETRY, DOTAN HOROVITS, Logz.ioTHE STATE OF OPENTELEMETRY, DOTAN HOROVITS, Logz.io
THE STATE OF OPENTELEMETRY, DOTAN HOROVITS, Logz.ioDevOpsDays Tel Aviv
 
Monitoring modern applications using Elastic
Monitoring modern applications using ElasticMonitoring modern applications using Elastic
Monitoring modern applications using ElasticElasticsearch
 
Do You Really Need to Evolve From Monitoring to Observability?
Do You Really Need to Evolve From Monitoring to Observability?Do You Really Need to Evolve From Monitoring to Observability?
Do You Really Need to Evolve From Monitoring to Observability?Splunk
 
初探 Elastic Observability 的實踐方法
初探 Elastic Observability 的實踐方法初探 Elastic Observability 的實踐方法
初探 Elastic Observability 的實踐方法Joe Wu
 
Building a centralized observability platform
Building a centralized observability platformBuilding a centralized observability platform
Building a centralized observability platformElasticsearch
 

What's hot (20)

Monitoring and observability
Monitoring and observabilityMonitoring and observability
Monitoring and observability
 
Logging and observability
Logging and observabilityLogging and observability
Logging and observability
 
Monitoring and observability
Monitoring and observabilityMonitoring and observability
Monitoring and observability
 
More Than Monitoring: How Observability Takes You From Firefighting to Fire P...
More Than Monitoring: How Observability Takes You From Firefighting to Fire P...More Than Monitoring: How Observability Takes You From Firefighting to Fire P...
More Than Monitoring: How Observability Takes You From Firefighting to Fire P...
 
How to Move from Monitoring to Observability, On-Premises and in a Multi-Clou...
How to Move from Monitoring to Observability, On-Premises and in a Multi-Clou...How to Move from Monitoring to Observability, On-Premises and in a Multi-Clou...
How to Move from Monitoring to Observability, On-Premises and in a Multi-Clou...
 
Observability For Modern Applications
Observability For Modern ApplicationsObservability For Modern Applications
Observability For Modern Applications
 
Observability at Scale
Observability at Scale Observability at Scale
Observability at Scale
 
Observability – the good, the bad, and the ugly
Observability – the good, the bad, and the uglyObservability – the good, the bad, and the ugly
Observability – the good, the bad, and the ugly
 
Observability for modern applications
Observability for modern applications  Observability for modern applications
Observability for modern applications
 
Combining logs, metrics, and traces for unified observability
Combining logs, metrics, and traces for unified observabilityCombining logs, metrics, and traces for unified observability
Combining logs, metrics, and traces for unified observability
 
Observability
Observability Observability
Observability
 
Datadog: From a single product to a growing platform by Alexis Lê-Quôc, CTO
Datadog: From a single product to a growing platform by Alexis Lê-Quôc, CTODatadog: From a single product to a growing platform by Alexis Lê-Quôc, CTO
Datadog: From a single product to a growing platform by Alexis Lê-Quôc, CTO
 
OpenTelemetry For Developers
OpenTelemetry For DevelopersOpenTelemetry For Developers
OpenTelemetry For Developers
 
Monitor every app, in every stage, with free and open Elastic APM
Monitor every app, in every stage, with free and open Elastic APMMonitor every app, in every stage, with free and open Elastic APM
Monitor every app, in every stage, with free and open Elastic APM
 
THE STATE OF OPENTELEMETRY, DOTAN HOROVITS, Logz.io
THE STATE OF OPENTELEMETRY, DOTAN HOROVITS, Logz.ioTHE STATE OF OPENTELEMETRY, DOTAN HOROVITS, Logz.io
THE STATE OF OPENTELEMETRY, DOTAN HOROVITS, Logz.io
 
Monitoring modern applications using Elastic
Monitoring modern applications using ElasticMonitoring modern applications using Elastic
Monitoring modern applications using Elastic
 
Do You Really Need to Evolve From Monitoring to Observability?
Do You Really Need to Evolve From Monitoring to Observability?Do You Really Need to Evolve From Monitoring to Observability?
Do You Really Need to Evolve From Monitoring to Observability?
 
初探 Elastic Observability 的實踐方法
初探 Elastic Observability 的實踐方法初探 Elastic Observability 的實踐方法
初探 Elastic Observability 的實踐方法
 
Building a centralized observability platform
Building a centralized observability platformBuilding a centralized observability platform
Building a centralized observability platform
 
Observability
ObservabilityObservability
Observability
 

Similar to Monitoring and observability

Observability in highly distributed systems
Observability in highly distributed systemsObservability in highly distributed systems
Observability in highly distributed systemsDevOps Indonesia
 
5 Clear Signs You Need Security Policy Automation
5 Clear Signs You Need Security Policy Automation5 Clear Signs You Need Security Policy Automation
5 Clear Signs You Need Security Policy AutomationTufin
 
ThirdEye - LinkedIn's Business-wide monitoring platform
ThirdEye - LinkedIn's Business-wide monitoring platformThirdEye - LinkedIn's Business-wide monitoring platform
ThirdEye - LinkedIn's Business-wide monitoring platformAkshay Rai
 
Top 10 Practices of Highly Successful DevOps Incident Management Teams
Top 10 Practices of Highly Successful DevOps Incident Management TeamsTop 10 Practices of Highly Successful DevOps Incident Management Teams
Top 10 Practices of Highly Successful DevOps Incident Management TeamsMatthew Boeckman
 
Monitorama - Please, no more Minutes, Milliseconds, Monoliths or Monitoring T...
Monitorama - Please, no more Minutes, Milliseconds, Monoliths or Monitoring T...Monitorama - Please, no more Minutes, Milliseconds, Monoliths or Monitoring T...
Monitorama - Please, no more Minutes, Milliseconds, Monoliths or Monitoring T...Adrian Cockcroft
 
Top 10 Practices of Highly Successful DevOps Incident Management Teams
Top 10 Practices of Highly Successful DevOps Incident Management TeamsTop 10 Practices of Highly Successful DevOps Incident Management Teams
Top 10 Practices of Highly Successful DevOps Incident Management TeamsDeborah Schalm
 
Top 10 Practices of Highly Successful DevOps Incident Management Teams
Top 10 Practices of Highly Successful DevOps Incident Management TeamsTop 10 Practices of Highly Successful DevOps Incident Management Teams
Top 10 Practices of Highly Successful DevOps Incident Management TeamsDevOps.com
 
BSIT3CD_Continuation of Cyber incident response (1).pdf
BSIT3CD_Continuation of Cyber incident response (1).pdfBSIT3CD_Continuation of Cyber incident response (1).pdf
BSIT3CD_Continuation of Cyber incident response (1).pdfStevenJoeBiago
 
Cloud Native DevOps
Cloud Native DevOpsCloud Native DevOps
Cloud Native DevOpsJim Bugwadia
 
Monitoring - deeper dive
Monitoring  - deeper diveMonitoring  - deeper dive
Monitoring - deeper diveRobert Kubiś
 
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...ICS
 
What is Platform Observability? An Overview
What is Platform Observability? An OverviewWhat is Platform Observability? An Overview
What is Platform Observability? An OverviewKumar Kolaganti
 
VTU 5TH SEM CSE SOFTWARE ENGINEERING SOLVED PAPERS - JUN13 DEC13 JUN14 DEC14 ...
VTU 5TH SEM CSE SOFTWARE ENGINEERING SOLVED PAPERS - JUN13 DEC13 JUN14 DEC14 ...VTU 5TH SEM CSE SOFTWARE ENGINEERING SOLVED PAPERS - JUN13 DEC13 JUN14 DEC14 ...
VTU 5TH SEM CSE SOFTWARE ENGINEERING SOLVED PAPERS - JUN13 DEC13 JUN14 DEC14 ...vtunotesbysree
 
From sensor readings to prediction: on the process of developing practical so...
From sensor readings to prediction: on the process of developing practical so...From sensor readings to prediction: on the process of developing practical so...
From sensor readings to prediction: on the process of developing practical so...Manuel Martín
 
Clone of an organization
Clone of an organizationClone of an organization
Clone of an organizationIRJET Journal
 
Agile Gurugram 2023 | Observability for Modern Applications. How does it help...
Agile Gurugram 2023 | Observability for Modern Applications. How does it help...Agile Gurugram 2023 | Observability for Modern Applications. How does it help...
Agile Gurugram 2023 | Observability for Modern Applications. How does it help...AgileNetwork
 
Observability for Application Developers (1)-1.pptx
Observability for Application Developers (1)-1.pptxObservability for Application Developers (1)-1.pptx
Observability for Application Developers (1)-1.pptxOpsTree solutions
 
The differing ways to monitor and instrument
The differing ways to monitor and instrumentThe differing ways to monitor and instrument
The differing ways to monitor and instrumentJonah Kowall
 
WTF is a Microservice - Rafael Schloming, Datawire
WTF is a Microservice - Rafael Schloming, DatawireWTF is a Microservice - Rafael Schloming, Datawire
WTF is a Microservice - Rafael Schloming, DatawireAmbassador Labs
 

Similar to Monitoring and observability (20)

Observability in highly distributed systems
Observability in highly distributed systemsObservability in highly distributed systems
Observability in highly distributed systems
 
5 Clear Signs You Need Security Policy Automation
5 Clear Signs You Need Security Policy Automation5 Clear Signs You Need Security Policy Automation
5 Clear Signs You Need Security Policy Automation
 
ThirdEye - LinkedIn's Business-wide monitoring platform
ThirdEye - LinkedIn's Business-wide monitoring platformThirdEye - LinkedIn's Business-wide monitoring platform
ThirdEye - LinkedIn's Business-wide monitoring platform
 
Top 10 Practices of Highly Successful DevOps Incident Management Teams
Top 10 Practices of Highly Successful DevOps Incident Management TeamsTop 10 Practices of Highly Successful DevOps Incident Management Teams
Top 10 Practices of Highly Successful DevOps Incident Management Teams
 
Monitorama - Please, no more Minutes, Milliseconds, Monoliths or Monitoring T...
Monitorama - Please, no more Minutes, Milliseconds, Monoliths or Monitoring T...Monitorama - Please, no more Minutes, Milliseconds, Monoliths or Monitoring T...
Monitorama - Please, no more Minutes, Milliseconds, Monoliths or Monitoring T...
 
Cerita
CeritaCerita
Cerita
 
Top 10 Practices of Highly Successful DevOps Incident Management Teams
Top 10 Practices of Highly Successful DevOps Incident Management TeamsTop 10 Practices of Highly Successful DevOps Incident Management Teams
Top 10 Practices of Highly Successful DevOps Incident Management Teams
 
Top 10 Practices of Highly Successful DevOps Incident Management Teams
Top 10 Practices of Highly Successful DevOps Incident Management TeamsTop 10 Practices of Highly Successful DevOps Incident Management Teams
Top 10 Practices of Highly Successful DevOps Incident Management Teams
 
BSIT3CD_Continuation of Cyber incident response (1).pdf
BSIT3CD_Continuation of Cyber incident response (1).pdfBSIT3CD_Continuation of Cyber incident response (1).pdf
BSIT3CD_Continuation of Cyber incident response (1).pdf
 
Cloud Native DevOps
Cloud Native DevOpsCloud Native DevOps
Cloud Native DevOps
 
Monitoring - deeper dive
Monitoring  - deeper diveMonitoring  - deeper dive
Monitoring - deeper dive
 
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
 
What is Platform Observability? An Overview
What is Platform Observability? An OverviewWhat is Platform Observability? An Overview
What is Platform Observability? An Overview
 
VTU 5TH SEM CSE SOFTWARE ENGINEERING SOLVED PAPERS - JUN13 DEC13 JUN14 DEC14 ...
VTU 5TH SEM CSE SOFTWARE ENGINEERING SOLVED PAPERS - JUN13 DEC13 JUN14 DEC14 ...VTU 5TH SEM CSE SOFTWARE ENGINEERING SOLVED PAPERS - JUN13 DEC13 JUN14 DEC14 ...
VTU 5TH SEM CSE SOFTWARE ENGINEERING SOLVED PAPERS - JUN13 DEC13 JUN14 DEC14 ...
 
From sensor readings to prediction: on the process of developing practical so...
From sensor readings to prediction: on the process of developing practical so...From sensor readings to prediction: on the process of developing practical so...
From sensor readings to prediction: on the process of developing practical so...
 
Clone of an organization
Clone of an organizationClone of an organization
Clone of an organization
 
Agile Gurugram 2023 | Observability for Modern Applications. How does it help...
Agile Gurugram 2023 | Observability for Modern Applications. How does it help...Agile Gurugram 2023 | Observability for Modern Applications. How does it help...
Agile Gurugram 2023 | Observability for Modern Applications. How does it help...
 
Observability for Application Developers (1)-1.pptx
Observability for Application Developers (1)-1.pptxObservability for Application Developers (1)-1.pptx
Observability for Application Developers (1)-1.pptx
 
The differing ways to monitor and instrument
The differing ways to monitor and instrumentThe differing ways to monitor and instrument
The differing ways to monitor and instrument
 
WTF is a Microservice - Rafael Schloming, Datawire
WTF is a Microservice - Rafael Schloming, DatawireWTF is a Microservice - Rafael Schloming, Datawire
WTF is a Microservice - Rafael Schloming, Datawire
 

More from Danylenko Max

How to write clean tests
How to write clean testsHow to write clean tests
How to write clean testsDanylenko Max
 
Consumer Driven Contract.pdf
Consumer Driven Contract.pdfConsumer Driven Contract.pdf
Consumer Driven Contract.pdfDanylenko Max
 
Consumer driven contract
Consumer driven contractConsumer driven contract
Consumer driven contractDanylenko Max
 
How to successfully grow a code review culture
How to successfullygrow a code review cultureHow to successfullygrow a code review culture
How to successfully grow a code review cultureDanylenko Max
 
Testing microservices
Testing microservicesTesting microservices
Testing microservicesDanylenko Max
 

More from Danylenko Max (6)

How to write clean tests
How to write clean testsHow to write clean tests
How to write clean tests
 
Consumer Driven Contract.pdf
Consumer Driven Contract.pdfConsumer Driven Contract.pdf
Consumer Driven Contract.pdf
 
Consumer driven contract
Consumer driven contractConsumer driven contract
Consumer driven contract
 
Fail fast! approach
Fail fast! approachFail fast! approach
Fail fast! approach
 
How to successfully grow a code review culture
How to successfullygrow a code review cultureHow to successfullygrow a code review culture
How to successfully grow a code review culture
 
Testing microservices
Testing microservicesTesting microservices
Testing microservices
 

Recently uploaded

iGaming Platform & Lottery Solutions by Skilrock
iGaming Platform & Lottery Solutions by SkilrockiGaming Platform & Lottery Solutions by Skilrock
iGaming Platform & Lottery Solutions by SkilrockSkilrock Technologies
 
How Does XfilesPro Ensure Security While Sharing Documents in Salesforce?
How Does XfilesPro Ensure Security While Sharing Documents in Salesforce?How Does XfilesPro Ensure Security While Sharing Documents in Salesforce?
How Does XfilesPro Ensure Security While Sharing Documents in Salesforce?XfilesPro
 
Advanced Flow Concepts Every Developer Should Know
Advanced Flow Concepts Every Developer Should KnowAdvanced Flow Concepts Every Developer Should Know
Advanced Flow Concepts Every Developer Should KnowPeter Caitens
 
A Guideline to Zendesk to Re:amaze Data Migration
A Guideline to Zendesk to Re:amaze Data MigrationA Guideline to Zendesk to Re:amaze Data Migration
A Guideline to Zendesk to Re:amaze Data MigrationHelp Desk Migration
 
AI/ML Infra Meetup | Reducing Prefill for LLM Serving in RAG
AI/ML Infra Meetup | Reducing Prefill for LLM Serving in RAGAI/ML Infra Meetup | Reducing Prefill for LLM Serving in RAG
AI/ML Infra Meetup | Reducing Prefill for LLM Serving in RAGAlluxio, Inc.
 
A Python-based approach to data loading in TM1 - Using Airflow as an ETL for TM1
A Python-based approach to data loading in TM1 - Using Airflow as an ETL for TM1A Python-based approach to data loading in TM1 - Using Airflow as an ETL for TM1
A Python-based approach to data loading in TM1 - Using Airflow as an ETL for TM1KnowledgeSeed
 
Studiovity film pre-production and screenwriting software
Studiovity film pre-production and screenwriting softwareStudiovity film pre-production and screenwriting software
Studiovity film pre-production and screenwriting softwareinfo611746
 
Agnieszka Andrzejewska - BIM School Course in Kraków
Agnieszka Andrzejewska - BIM School Course in KrakówAgnieszka Andrzejewska - BIM School Course in Kraków
Agnieszka Andrzejewska - BIM School Course in Krakówbim.edu.pl
 
How to install and activate eGrabber JobGrabber
How to install and activate eGrabber JobGrabberHow to install and activate eGrabber JobGrabber
How to install and activate eGrabber JobGrabbereGrabber
 
The Impact of PLM Software on Fashion Production
The Impact of PLM Software on Fashion ProductionThe Impact of PLM Software on Fashion Production
The Impact of PLM Software on Fashion ProductionWave PLM
 
OpenChain @ LF Japan Executive Briefing - May 2024
OpenChain @ LF Japan Executive Briefing - May 2024OpenChain @ LF Japan Executive Briefing - May 2024
OpenChain @ LF Japan Executive Briefing - May 2024Shane Coughlan
 
GraphSummit Stockholm - Neo4j - Knowledge Graphs and Product Updates
GraphSummit Stockholm - Neo4j - Knowledge Graphs and Product UpdatesGraphSummit Stockholm - Neo4j - Knowledge Graphs and Product Updates
GraphSummit Stockholm - Neo4j - Knowledge Graphs and Product UpdatesNeo4j
 
Mastering Windows 7 A Comprehensive Guide for Power Users .pdf
Mastering Windows 7 A Comprehensive Guide for Power Users .pdfMastering Windows 7 A Comprehensive Guide for Power Users .pdf
Mastering Windows 7 A Comprehensive Guide for Power Users .pdfmbmh111980
 
Workforce Efficiency with Employee Time Tracking Software.pdf
Workforce Efficiency with Employee Time Tracking Software.pdfWorkforce Efficiency with Employee Time Tracking Software.pdf
Workforce Efficiency with Employee Time Tracking Software.pdfDeskTrack
 
IT Software Development Resume, Vaibhav jha 2024
IT Software Development Resume, Vaibhav jha 2024IT Software Development Resume, Vaibhav jha 2024
IT Software Development Resume, Vaibhav jha 2024vaibhav130304
 
Crafting the Perfect Measurement Sheet with PLM Integration
Crafting the Perfect Measurement Sheet with PLM IntegrationCrafting the Perfect Measurement Sheet with PLM Integration
Crafting the Perfect Measurement Sheet with PLM IntegrationWave PLM
 
WSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital Transformation
WSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital TransformationWSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital Transformation
WSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital TransformationWSO2
 
A Comprehensive Appium Guide for Hybrid App Automation Testing.pdf
A Comprehensive Appium Guide for Hybrid App Automation Testing.pdfA Comprehensive Appium Guide for Hybrid App Automation Testing.pdf
A Comprehensive Appium Guide for Hybrid App Automation Testing.pdfkalichargn70th171
 

Recently uploaded (20)

iGaming Platform & Lottery Solutions by Skilrock
iGaming Platform & Lottery Solutions by SkilrockiGaming Platform & Lottery Solutions by Skilrock
iGaming Platform & Lottery Solutions by Skilrock
 
How Does XfilesPro Ensure Security While Sharing Documents in Salesforce?
How Does XfilesPro Ensure Security While Sharing Documents in Salesforce?How Does XfilesPro Ensure Security While Sharing Documents in Salesforce?
How Does XfilesPro Ensure Security While Sharing Documents in Salesforce?
 
Advanced Flow Concepts Every Developer Should Know
Advanced Flow Concepts Every Developer Should KnowAdvanced Flow Concepts Every Developer Should Know
Advanced Flow Concepts Every Developer Should Know
 
Top Mobile App Development Companies 2024
Top Mobile App Development Companies 2024Top Mobile App Development Companies 2024
Top Mobile App Development Companies 2024
 
A Guideline to Zendesk to Re:amaze Data Migration
A Guideline to Zendesk to Re:amaze Data MigrationA Guideline to Zendesk to Re:amaze Data Migration
A Guideline to Zendesk to Re:amaze Data Migration
 
5 Reasons Driving Warehouse Management Systems Demand
5 Reasons Driving Warehouse Management Systems Demand5 Reasons Driving Warehouse Management Systems Demand
5 Reasons Driving Warehouse Management Systems Demand
 
AI/ML Infra Meetup | Reducing Prefill for LLM Serving in RAG
AI/ML Infra Meetup | Reducing Prefill for LLM Serving in RAGAI/ML Infra Meetup | Reducing Prefill for LLM Serving in RAG
AI/ML Infra Meetup | Reducing Prefill for LLM Serving in RAG
 
A Python-based approach to data loading in TM1 - Using Airflow as an ETL for TM1
A Python-based approach to data loading in TM1 - Using Airflow as an ETL for TM1A Python-based approach to data loading in TM1 - Using Airflow as an ETL for TM1
A Python-based approach to data loading in TM1 - Using Airflow as an ETL for TM1
 
Studiovity film pre-production and screenwriting software
Studiovity film pre-production and screenwriting softwareStudiovity film pre-production and screenwriting software
Studiovity film pre-production and screenwriting software
 
Agnieszka Andrzejewska - BIM School Course in Kraków
Agnieszka Andrzejewska - BIM School Course in KrakówAgnieszka Andrzejewska - BIM School Course in Kraków
Agnieszka Andrzejewska - BIM School Course in Kraków
 
How to install and activate eGrabber JobGrabber
How to install and activate eGrabber JobGrabberHow to install and activate eGrabber JobGrabber
How to install and activate eGrabber JobGrabber
 
The Impact of PLM Software on Fashion Production
The Impact of PLM Software on Fashion ProductionThe Impact of PLM Software on Fashion Production
The Impact of PLM Software on Fashion Production
 
OpenChain @ LF Japan Executive Briefing - May 2024
OpenChain @ LF Japan Executive Briefing - May 2024OpenChain @ LF Japan Executive Briefing - May 2024
OpenChain @ LF Japan Executive Briefing - May 2024
 
GraphSummit Stockholm - Neo4j - Knowledge Graphs and Product Updates
GraphSummit Stockholm - Neo4j - Knowledge Graphs and Product UpdatesGraphSummit Stockholm - Neo4j - Knowledge Graphs and Product Updates
GraphSummit Stockholm - Neo4j - Knowledge Graphs and Product Updates
 
Mastering Windows 7 A Comprehensive Guide for Power Users .pdf
Mastering Windows 7 A Comprehensive Guide for Power Users .pdfMastering Windows 7 A Comprehensive Guide for Power Users .pdf
Mastering Windows 7 A Comprehensive Guide for Power Users .pdf
 
Workforce Efficiency with Employee Time Tracking Software.pdf
Workforce Efficiency with Employee Time Tracking Software.pdfWorkforce Efficiency with Employee Time Tracking Software.pdf
Workforce Efficiency with Employee Time Tracking Software.pdf
 
IT Software Development Resume, Vaibhav jha 2024
IT Software Development Resume, Vaibhav jha 2024IT Software Development Resume, Vaibhav jha 2024
IT Software Development Resume, Vaibhav jha 2024
 
Crafting the Perfect Measurement Sheet with PLM Integration
Crafting the Perfect Measurement Sheet with PLM IntegrationCrafting the Perfect Measurement Sheet with PLM Integration
Crafting the Perfect Measurement Sheet with PLM Integration
 
WSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital Transformation
WSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital TransformationWSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital Transformation
WSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital Transformation
 
A Comprehensive Appium Guide for Hybrid App Automation Testing.pdf
A Comprehensive Appium Guide for Hybrid App Automation Testing.pdfA Comprehensive Appium Guide for Hybrid App Automation Testing.pdf
A Comprehensive Appium Guide for Hybrid App Automation Testing.pdf
 

Monitoring and observability

  • 1. From Monitoring to Domain-Oriented Observability What’s the difference between monitoring and observability and why does it matter?
  • 2. 2. About me ● Took part in developing of microservice architecture based on Event Sourcing and CQRS ● I have obtained a position of Tech Lead. I implemented Canary release and Feature Toggles (aka Feature Flags), migrated microservice from REST to Event-Driven ● Made a webinar about microservices testing. ● I am trying to apply best engineering practices to Safeguard Cyber project ● I want to build such a process in a company (team) in which it will be pleasant to work and develop professionally. Where ideas will be heard, where a person will not need to sacrifice his family or health for professional growth.
  • 3. 3. Parts of presentation ● Why you should start thinking about monitoring and our case ● Theoretical minimum about monitoring ● How to switch from monitoring to Domain-oriented Observability
  • 4. 4. Part 1 ● Why you should start thinking about monitoring and our case
  • 5. 5. About our product
  • 9. 9. Fire on Production
  • 10. 10. Fire on production
  • 11. 11. Fire on production
  • 12. 12. Fire on production
  • 13. 12. Fire on production
  • 14. 14. Problems we faced with ● What is the cause of performance drop? ● What can lead to poor system performance? ● How can certain changes influence the system? ● Our product is difficult to fit in SLA
  • 15. 15. Part 2 ● Theoretical minimum about monitoring
  • 16. 16 Monitoring 1) Logging 2) Tracing 3) Metrics 4) Alerts
  • 17. “A log is an immutable, timestamped record of event describing what happened over time “ 17 Logging
  • 21. “A trace is a representation of series of causally related distributed events that encode the end-to-end request flow through a distributed system” 21. Tracing
  • 23. “Metrics are a numeric representation of data measure over intervals of time” 23. Metrics
  • 27. ● throughput ● success ● error ● performance 27. Metrics subtypes
  • 28. ● Rate - the number of requests, per second, you services are serving. ● Errors - the number of failed requests per second. ● Duration - distributions of the amount of time each request takes. 28. Three key metrics by RED methodology
  • 29. 31. Trends are very important.
  • 30. 30. Trends are very important
  • 31. Automated alerts are essential to monitoring. They allow you to spot problems anywhere in your infrastructure, so that you can rapidly identify their causes and minimize service degradation and disruption. Alerts draw human attention to the particular systems that require observation, inspection, and intervention. 31. Alerts
  • 32. ● There should be people’s reaction ● Alert should have priority ● There should be possibility to disable notifications ● Alert should provide further instructions 32. Alert rules
  • 33. 33. Alerts in Kibana
  • 34. 34. Part 3 ● How to switch from monitoring to Domain-oriented Observability
  • 35. Definition: “In control theory, observability is a measure of how well internal states of a system can be inferred from knowledge of its external outputs. The observability and controllability of a system are mathematical duals.” - Wikipedia In English: Can you understand what’s happening inside your code and system, simply by asking questions using your tools? Can you answer any new question you think of, or only the ones you prepared for? 35. Observability
  • 36. 36. Black and White box Component View
  • 41. 41. Moving code to class
  • 42. 42. Domain Probe: DiscountInstrumentation
  • 43. ● AOP ● DECORATOR ● λάμβδα 43. Other opportunities
  • 44. 44. Testing + Monitoring
  • 45. 45. Testing + Monitoring
  • 46. Start being proactive Don’t be firefighters
  • 47. ● The RED Method ● Monitoring Distributed Systems ● Domain-Oriented Observability ● Distributed Systems Observability by Cindy Sridharan ● Testing in Production, the safe way ● Deploy != Release part1 and part2 ● SRE: Observability: Metric Namespaces and Structures ● Observability: Metric, Logging, and Tracing ● Decorator ● Monitoring in the time of Cloud Native ● https://www.elastic.co/learn 46. Resources: