Neeraj Bagga
Sr Tech Executive
https://www.linkedin.com/in/neerajbagg
https://neerajbagga.medium.com
#2 Neeraj Bagga
(Feel free to use the content and presentation)
Observability | Monitoring
Adjective | Verb
It is a state | Something we do
Drills down into What and Why | Tracks performance
Designed for debugging, granular insight and context | Suitable for overall health
Asks questions about the unknowns | Informs about the knowns
Best for dynamic environments with unknown permutations | Best for static environments
#3 Neeraj Bagga
ANSWER QUESTIONS WE DID NOT KNOW TO ASK
ADAPT TO CHANGES
ALLOWS DATA DRIVEN DECISIONS
#4 Neeraj Bagga
1. Answer questions we did not know to ask
2. Adapt to changes
3. Allows data-driven decisions
4. Reduced MTTR
The higher the risk in an application, the more observable it should be.
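The "Reduced MTTR" point can be made concrete with a small calculation. MTTR is commonly expanded as mean time to recovery (or repair). The following is a minimal sketch in Python, assuming each incident is recorded as a detection time and a resolution time; the incident records and the `mean_time_to_recovery` helper are hypothetical and are not part of the original slides.

```python
from datetime import datetime, timedelta

def mean_time_to_recovery(incidents):
    """Average of (resolved - detected) over a list of (detected, resolved) pairs."""
    if not incidents:
        raise ValueError("need at least one incident")
    total = sum((resolved - detected for detected, resolved in incidents), timedelta())
    return total / len(incidents)

# Hypothetical incident records: (detected, resolved)
incidents = [
    (datetime(2024, 1, 5, 10, 0), datetime(2024, 1, 5, 10, 30)),  # 30 minutes
    (datetime(2024, 1, 9, 14, 0), datetime(2024, 1, 9, 15, 30)),  # 90 minutes
]

mttr = mean_time_to_recovery(incidents)
print(mttr)  # 1:00:00
```

Tracking this value over time is one way to check whether better observability is actually shortening recovery.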
#5 Neeraj Bagga
#6 Neeraj Bagga
LOGS METRICS TRACES CULTURE *
#7 Neeraj Bagga
#8 Neeraj Bagga
Proactive
Deliberate
Commonality
Education
Fail Fast, Fail Often
#9 Neeraj Bagga
HELP: Channels, Embedding Education
SHARE: Champions, SOE
PRACTICE: Wargames, Encouragement
FEEDBACK: Engage, Solicit inputs
#10 Neeraj Bagga
• Periodically measure team maturity
• Consistent questionnaire
• Follows data-driven approach
AWS X-Ray
And more ...
#12 Neeraj Bagga
Infrastructure Monitoring
Application Performance Monitoring (APM)
Log Investigation (Splunk Cloud)
Incident Response (Splunk On-Call)
https://www.observability.splunk.com/
#13 Neeraj Bagga
https://www.elastic.co/observability
#14 Neeraj Bagga
https://aws.amazon.com/xray/
#15 Neeraj Bagga
https://www.sumologic.com/how-it-works/
#16 Neeraj Bagga
Observability, what, why and how

Editor's Notes

  • #8
    Logging – A record of events that helps explain what changed in system or application behavior when things went wrong. For example, using Grafana Loki to log certain events.
    Metrics – A value pertaining to your system or application at a point in time. For example, using Grafana to understand resource utilization, or application performance metrics such as throughput and response time.
    Tracing – A representation of a single user's journey through an application transaction. For example, using Jaeger to understand the call flows between services, or how long it takes a user to finish a transaction.
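
The three signals described in the note above can be illustrated side by side. This is a minimal sketch using only the Python standard library as a stand-in for real backends such as Grafana Loki, Grafana, or Jaeger; the `checkout` function, the `span` helper, and the in-memory `metrics` and `spans` stores are hypothetical names chosen for illustration, not the API of any of those tools.

```python
import logging
import time
import uuid
from collections import defaultdict
from contextlib import contextmanager

logging.basicConfig(level=logging.INFO, format="%(levelname)s %(message)s")
log = logging.getLogger("checkout")

metrics = defaultdict(list)  # metric name -> observed values (stand-in for a metrics backend)
spans = []                   # finished spans (stand-in for a tracing backend)

@contextmanager
def span(name, trace_id):
    """Time one step of a transaction and record it under a shared trace id."""
    start = time.perf_counter()
    try:
        yield
    finally:
        duration = time.perf_counter() - start
        spans.append({"trace_id": trace_id, "name": name, "duration_s": duration})
        metrics[f"{name}.duration_s"].append(duration)  # metric: a value at a point in time

def checkout(user):
    trace_id = uuid.uuid4().hex
    # Log: a record of an event, tagged with the trace id so it can be correlated later.
    log.info("checkout started user=%s trace_id=%s", user, trace_id)
    with span("checkout", trace_id):
        with span("reserve_stock", trace_id):
            time.sleep(0.01)  # simulated work
        with span("charge_card", trace_id):
            time.sleep(0.02)  # simulated work
    metrics["checkout.count"].append(1)
    return trace_id

trace_id = checkout("alice")
# Trace: the steps of one user's transaction, in the order they finished.
journey = [s["name"] for s in spans if s["trace_id"] == trace_id]
print(journey)  # ['reserve_stock', 'charge_card', 'checkout']
```

Carrying the same `trace_id` in the log line and in every span is what lets the three signals be joined when investigating a single transaction.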