Once upon a time…
No Good Deed…
adatole
@LeonAdato
The Four Questions
(every monitoring engineer
is asked)
Hello!
▧ Working in IT 30+ years
▧ 20+ years in monitoring
○ CASE $CompanySize
■ <=100
■ >100 && <1000
■ >1000 && <5000
■ >250,000
▧ Currently “Head Geek” at SolarWinds
○ Head Geek <> Developer
○ Head Geek != Marketing
○ Head Geek ≠ Sales
○ “Head Geek” LIKE “%Advocate%”
○ Head Geek == STORYTELLER
Leon Adato
Where To Find Me
Twitter: @LeonAdato
THWACK.com: AdatoLe
Web: www.AdatoSystems.com
Podcast: TechnicallyReligious.com
MONITORING Engineer??
The Four Questions of Alerting
Why didn’t I get an alert?
Why did I get this alert?
What will alert on my system?
What’s being monitored on my system?
The Jewish Roots of…
Questions
▧ “Du fregst a gutte kashe” (Yiddish: “You ask a good question”)
▧ Nobel Laureate in Physics Dr. Isidor Rabi
“My mother made me a scientist without ever intending to.
Every other mother in Brooklyn would ask her child:
‘So? Did you learn anything today?’
But not my mother.
‘Izzy,’ she would say, ‘did you ask a good question today?’
That difference — asking good questions — made me
become a scientist.”
▧ THE Four Questions
מַה נִּשְׁתַּנָּה, הַלַּיְלָה הַזֶּה מִכָּל הַלֵּילוֹת
Why is this night different from all other nights?
What’s the Teretz?*
▧ We need the same openness to questions
▧ Relish the experience of asking, of discovery
▧ We don’t work in tech because of “I already know that”
▧ We work in tech because we love “I’ll find out”
*Teretz = answer

“Your system is down.”
Question #1: Why did I get that alert?
☺
CPU on the Windows device owned
by Accounting named
Mnth_Reporting (IP: 10.2.3.4, DNS:
MonRep.MyCorp.Net) has been over
80% for more than 15 minutes. CPU
at 2:16am EST is 96%.
Device details: http://blahblah.
Acknowledge this alert: http://ackme
This message brought to you by the
alert: CPU_CRIT_PROD and the
polling engine Poller7
What’s the Teretz?
▧ Name of the system
▧ Specific component or sub-element
▧ Current statistic or status
▧ Time the event occurred
▧ Time the alert was sent
▧ Custom fields like location, owner, etc.
▧ OS type and version
▧ IP address
▧ DNS name or Sysname
▧ The threshold
▧ The duration
▧ A link to the device or metric
▧ The name of the alert
▧ The polling engine
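The checklist above can be sketched as a message template. This is a minimal illustration, not the API of any particular monitoring product: the AlertContext class, its field names, the date, and the example.invalid URL are all assumptions made for the example. The sample values come from the alert text shown earlier.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class AlertContext:
    """Hypothetical container for the fields an alert should carry."""
    system_name: str
    owner: str
    os_type: str
    ip_address: str
    dns_name: str
    component: str
    current_value: float
    threshold: float
    duration_minutes: int
    event_time: datetime
    sent_time: datetime
    details_url: str
    alert_name: str
    polling_engine: str


def format_alert(ctx: AlertContext) -> str:
    """Build a message that answers "why did I get this alert?" up front."""
    return (
        f"{ctx.component} on the {ctx.os_type} device owned by {ctx.owner} "
        f"named {ctx.system_name} (IP: {ctx.ip_address}, DNS: {ctx.dns_name}) "
        f"has been over {ctx.threshold:.0f}% for more than "
        f"{ctx.duration_minutes} minutes. "
        f"{ctx.component} at {ctx.event_time:%H:%M} is {ctx.current_value:.0f}%.\n"
        f"Alert sent: {ctx.sent_time:%H:%M}\n"
        f"Device details: {ctx.details_url}\n"
        f"This message brought to you by the alert: {ctx.alert_name} "
        f"and the polling engine {ctx.polling_engine}"
    )


# Sample values from the slide; the date and URL are placeholders.
example = AlertContext(
    system_name="Mnth_Reporting", owner="Accounting", os_type="Windows",
    ip_address="10.2.3.4", dns_name="MonRep.MyCorp.Net", component="CPU",
    current_value=96.0, threshold=80.0, duration_minutes=15,
    event_time=datetime(2020, 1, 1, 2, 16),
    sent_time=datetime(2020, 1, 1, 2, 17),
    details_url="http://example.invalid/device/1",
    alert_name="CPU_CRIT_PROD", polling_engine="Poller7",
)
message = format_alert(example)
print(message)
```

Carrying both the event time and the sent time makes delivery delays visible in the message itself.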
Jewish Roots:
My Father Was a Wandering Aramean
▧ The story
… before the story
…… before the story
▧ Context matters!
▧ History matters!
▧ Not only “why did I get this alert”
▧ But “why do these alerts exist at all?”
Question #2: Why DIDN’T I Get That Alert?
▧ It was designed like that
○ Alert windows
○ Problem duration
○ Ticket not reset
○ Mute/unmanage/shut-up
○ Parent-child
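One way to make "it was designed like that" answerable is to have the alert logic report its own reason for staying quiet. The sketch below is an assumption-heavy illustration: AlertRule, NodeState, and why_no_alert are hypothetical names, and the alert window is assumed not to cross midnight.

```python
from dataclasses import dataclass
from datetime import datetime, time, timedelta
from typing import Optional


@dataclass
class AlertRule:
    window_start: time        # alert may fire only inside this daily window
    window_end: time          # assumption: window does not cross midnight
    min_duration: timedelta   # problem must persist at least this long


@dataclass
class NodeState:
    problem_since: datetime
    muted: bool = False        # mute / unmanage / shut-up
    parent_down: bool = False  # parent-child dependency
    ticket_open: bool = False  # earlier ticket was never reset


def why_no_alert(rule: AlertRule, node: NodeState, now: datetime) -> Optional[str]:
    """Return the design reason an alert is suppressed, or None if it should fire."""
    if node.muted:
        return "node is muted or unmanaged"
    if node.parent_down:
        return "parent is down, so child alerts are suppressed"
    if node.ticket_open:
        return "earlier ticket was never reset"
    if not (rule.window_start <= now.time() <= rule.window_end):
        return "outside the alert window"
    if now - node.problem_since < rule.min_duration:
        return "problem has not lasted long enough"
    return None


rule = AlertRule(time(8, 0), time(18, 0), timedelta(minutes=15))
node = NodeState(problem_since=datetime(2020, 1, 1, 1, 0))
reason = why_no_alert(rule, node, now=datetime(2020, 1, 1, 2, 16))
print(reason)
```

Logging the returned reason each time an alert is suppressed gives you something concrete to show when the question comes up.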
Question #2: Why DIDN’T I Get That Alert?
▧ Change Un-Control
○ Credential changed
○ Network changed
○ Custom Property
○ Element removed
○ Physical to Virtual
Question #2: Why DIDN’T I Get That Alert?
▧ Monitoring Failed
○ Polling stopped
○ Agent stopped
○ Data throttled
○ Database is out of sync
○ New code/image missing tracing
○ Monitoring “supply chain” failed (email)
○ Event correlation rules
What’s the Teretz?
▧ Understand (and communicate) exceptions
▧ Save your receipts
▧ Save other people’s receipts too, if you can
▧ Monitor your monitoring
▧ Test your notification delivery infrastructure
▧ Have validation steps ready
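"Monitor your monitoring" can start as a simple staleness check on poller heartbeats. This is a minimal sketch under stated assumptions: the stale_pollers function, the five-minute limit, and the poller names and timestamps are all hypothetical, and how heartbeats are collected is left out.

```python
from datetime import datetime, timedelta
from typing import Dict, List


def stale_pollers(
    last_seen: Dict[str, datetime],
    now: datetime,
    max_age: timedelta = timedelta(minutes=5),
) -> List[str]:
    """Return the names of pollers whose last heartbeat is older than max_age."""
    return sorted(name for name, seen in last_seen.items() if now - seen > max_age)


# Hypothetical heartbeat timestamps for two polling engines.
now = datetime(2020, 1, 1, 12, 0)
heartbeats = {
    "Poller7": now - timedelta(minutes=2),
    "Poller3": now - timedelta(minutes=30),
}
stale = stale_pollers(heartbeats, now)
print(stale)
```

The same pattern applies to agents and to the notification path: record the last time each one proved it was alive, and alert through a separate channel when that goes stale.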
Question #3: What’s monitored on my system(s)?
Alerting ≠ Monitoring
What’s the Teretz?
▧ One size fits… some?
▧ Skillcheck: SQL
▧ Skillcheck: Wireshark
▧ Look at the screens
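As a rough illustration of the SQL skillcheck, "what's monitored on my system?" often comes down to one query against the monitoring database. The sketch below uses an in-memory SQLite database with a made-up monitors table. Real monitoring products have their own schemas, so every table name, column name, and row here is an assumption.

```python
import sqlite3

# Hypothetical schema: one row per metric collected on a node.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE monitors (node TEXT, metric TEXT, interval_sec INTEGER)")
conn.executemany(
    "INSERT INTO monitors VALUES (?, ?, ?)",
    [
        ("Mnth_Reporting", "cpu", 60),
        ("Mnth_Reporting", "disk", 300),
        ("OtherBox", "cpu", 60),
    ],
)


def monitored_on(connection: sqlite3.Connection, node: str) -> list:
    """List (metric, polling interval) pairs collected for one node."""
    cursor = connection.execute(
        "SELECT metric, interval_sec FROM monitors WHERE node = ? ORDER BY metric",
        (node,),
    )
    return cursor.fetchall()


result = monitored_on(conn, "Mnth_Reporting")
print(result)
```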
Jewish Roots:
Burning Hail and Black Swans
▧ Why do we remember the plagues?
○ Visceral, unexpected, unique
▧ Let’s talk about “black swans”
▧ The plagues as black swan events
Question #4: What COULD alert for my
systems?
▧ What *IS* an alert?
○ Emergency
○ Interruption
○ Unplanned Work
▧ What does alerting NEED to be
○ Timely
○ Meaningful
○ Actionable
Question #4: What COULD alert for my
systems?
▧ Why does this matter?
○ # of systems
○ # of alerts that can trigger for those systems
○ # of staff hours to address those alerts
○ # of alerts that could trigger simultaneously
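The four counts above multiply together, which is why the question matters. Here is a back-of-the-envelope sketch; the alert_workload function and every number in it are hypothetical, chosen only to show the arithmetic.

```python
def alert_workload(
    systems: int,
    alerts_per_system: int,
    fires_per_alert_per_week: float,
    hours_per_alert: float,
):
    """Estimate potential alerts, expected weekly alerts, and weekly staff hours."""
    potential_alerts = systems * alerts_per_system
    expected_per_week = potential_alerts * fires_per_alert_per_week
    staff_hours_per_week = expected_per_week * hours_per_alert
    return potential_alerts, expected_per_week, staff_hours_per_week


# Hypothetical inputs: 100 systems, 8 alert definitions each, each definition
# firing once every 8 weeks on average, half an hour of work per alert.
potential, weekly, hours = alert_workload(100, 8, 0.125, 0.5)
print(potential, weekly, hours)
```

With these made-up inputs, 800 possible alerts turn into roughly 100 alerts and 50 staff hours per week. That is before counting the worst case, where many of them trigger at once.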
What’s the Teretz?
▧ This can be a VERY difficult question to answer
▧ But its difficulty is in proportion to its importance
▧ It speaks to the potential impact on the company, workload, and interruptions.
Jewish Roots:
Are You Ready For the Hard Questions?
▧ Scholar, Skeptic, Simple, & Silent
▧ Meet each user where they are
▧ Let’s talk about the Skeptic (“the wicked son”)
▧ Listen past the snark for the question
Question #5: What Do You Monitor “Standard”?
Wait, I thought you said FOUR questions!
Jewish Roots:
Four or Five cups?
▧ Symbolism of wine as joy
▧ We need to remember to pause for joyful moments
▧ Despite rigorous Talmudic analysis, there are still questions
without clear answers.
▧ BUT… that doesn’t mean we disengage.
▧ We return to these questions over and over, try new
approaches.
A lot like IT problems.
OK, So That Fifth Question:
What Do You Monitor “Standard”?
▧ When you load up a box into monitoring, what do
consumers automatically get?
▧ If you can’t describe this, how will anyone know
what to ask for “extra”?
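One way to make the "standard" describable is to write it down as data, so that "extra" becomes a simple difference. This is a minimal sketch: the STANDARD_MONITORS table, the device types, and the monitor names are all hypothetical.

```python
from typing import Dict, Iterable, Set

# Hypothetical baseline: what every device of a given type gets automatically.
STANDARD_MONITORS: Dict[str, Set[str]] = {
    "windows": {"availability", "cpu", "memory", "disk"},
    "linux": {"availability", "cpu", "memory", "disk"},
    "network": {"availability", "interface_status", "bandwidth"},
}


def extras_requested(device_type: str, requested: Iterable[str]) -> Set[str]:
    """Return the requested monitors that are not part of the standard set."""
    standard = STANDARD_MONITORS.get(device_type, set())
    return set(requested) - standard


extras = extras_requested("windows", ["cpu", "disk", "sql_query_latency"])
print(extras)
```

Publishing a table like this lets consumers see what they already get before they ask for more.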
The Mostly Un-Necessary Summary
Being prepared for the 4 (ok 5) questions
▧ Your monitoring will be (better) prepared for the stresses it
will be exposed to.
▧ You will be (better) prepared as an advocate for monitoring
▧ You’ll spend less time answering repetitive questions and
more time doing the work of a monitoring engineer
(i.e., the GOOD stuff!)
If you still have
questions…
Thank You!
I’m READY
Tell me what questions you have

 
Connector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a buttonConnector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a button
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
 
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMsTo Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
 
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
 
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
 
Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........
 
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
 

The Four Questions (Every Monitoring Engineer gets asked), by Leon Adato

  • 1. Once upon a time…
  • 3. The Four Questions (every monitoring engineer is asked)
  • 4. Hello! ▧ Working in IT 30+ years ▧ 20+ years in monitoring ○ CASE $CompanySize ■ <=100 ■ >100 && <1000 ■ >1000 && <5000 ■ >250,000 ▧ Currently “Head Geek” at SolarWinds ○ Head Geek <> Developer ○ Head Geek != Marketing ○ Head Geek ≠ Sales ○ “Head Geek” LIKE “%Advocate%” ○ Head Geek == STORYTELLER Leon Adato
  • 5. Where To Find Me Twitter: @LeonAdato THWACK.com: AdatoLe WWWeb www.AdatoSystems.com Podcast TechnicallyReligious.com
  • 7. The Four Questions of Alerting Why didn’t I get an alert? Why did I get this alert? What will alert on my system? What’s being monitored on my system? adatole @LeonAdato
  • 8. The Jewish Roots of… Questions ▧ “Du fregst a gutte kashe” (Yiddish: “You ask a good question”) ▧ Nobel Laureate in Physics Dr. Isidor Rabi
  • 9. “My mother made me a scientist without ever intending to. Every other mother in Brooklyn would ask her child: ‘So? Did you learn anything today?’ But not my mother. ‘Izzy,’ she would say, ‘did you ask a good question today?’ That difference, asking good questions, made me become a scientist.”
  • 10. The Jewish Roots of… Questions ▧ “Du fregst a gutte kashe” ▧ Nobel Laureate in Physics Dr. Isidor Rabi ▧ THE Four Questions: מַה נִּשְׁתַּנָּה הַלַּיְלָה הַזֶּה מִכָּל הַלֵּילוֹת (Why is this night different from all other nights?)
  • 11. What’s the Teretz?* ▧ We need the same openness to questions ▧ Relish the experience of asking, of discovery ▧ We don’t work in tech because “I already know that” ▧ We work in tech because we love “I’ll find out” *Teretz = answer
  • 12. “Your system is down.” Question #1: Why did I get that alert? ☺ CPU on the Windows device owned by Accounting named Mnth_Reporting (IP: 10.2.3.4, DNS: MonRep.MyCorp.Net) has been over 80% for more than 15 minutes. CPU at 2:16am EST is 96%. Device details: http://blahblah. Acknowledge this alert: http://ackme This message brought to you by the alert: CPU_CRIT_PROD and the polling engine Poller7
  • 13. What’s the Teretz? ▧ Name of the system ▧ Specific component or sub-element ▧ Current statistic or status ▧ Time the event occurred ▧ Time the alert was sent ▧ Custom fields like location, owner, etc. ▧ OS type and version ▧ IP address ▧ DNS name or Sysname ▧ The threshold ▧ The duration ▧ A link to the device or metric ▧ The name of the alert ▧ The polling engine
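The checklist on slide 13 can be read as the set of fields an alert message should carry. The sketch below is one possible way to render the "good" alert from slide 12 out of those fields. It is not taken from the talk or from any monitoring product: the `AlertContext` class, the `render_alert` function, and the sample calendar date are all assumptions made for illustration (the slide gives only the time, 2:16am).

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class AlertContext:
    """Fields from the slide 13 checklist; every name here is hypothetical."""
    system_name: str
    component: str
    current_value: float
    threshold: float
    duration_minutes: int
    event_time: datetime
    sent_time: datetime
    owner: str
    os_type: str
    ip_address: str
    dns_name: str
    device_url: str
    alert_name: str
    polling_engine: str


def render_alert(ctx: AlertContext) -> str:
    """Build a message that answers 'why did I get this alert?' by itself."""
    return (
        f"{ctx.component} on the {ctx.os_type} device owned by {ctx.owner} "
        f"named {ctx.system_name} (IP: {ctx.ip_address}, DNS: {ctx.dns_name}) "
        f"has been over {ctx.threshold:.0f}% for more than "
        f"{ctx.duration_minutes} minutes. "
        f"{ctx.component} at {ctx.event_time:%Y-%m-%d %H:%M} is "
        f"{ctx.current_value:.0f}%.\n"
        f"Device details: {ctx.device_url}\n"
        f"Sent {ctx.sent_time:%Y-%m-%d %H:%M} by alert {ctx.alert_name} "
        f"via polling engine {ctx.polling_engine}"
    )


# Sample values follow slide 12; the date itself is an assumed placeholder.
sample = AlertContext(
    system_name="Mnth_Reporting",
    component="CPU",
    current_value=96.0,
    threshold=80.0,
    duration_minutes=15,
    event_time=datetime(2024, 1, 1, 2, 16),
    sent_time=datetime(2024, 1, 1, 2, 17),
    owner="Accounting",
    os_type="Windows",
    ip_address="10.2.3.4",
    dns_name="MonRep.MyCorp.Net",
    device_url="http://blahblah",
    alert_name="CPU_CRIT_PROD",
    polling_engine="Poller7",
)

print(render_alert(sample))
```

Keeping the event time and the sent time as separate fields makes delivery delays visible in the message itself, which helps when someone asks why the alert arrived late.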
  • 14. Jewish Roots: My Father Was a Wandering Aramean ▧ The story … before the story … before the story ▧ Context matters! ▧ History matters! ▧ Not only “why did I get this alert” ▧ But “why do these alerts exist at all?”
  • 15. Question #2: Why DIDN’T I Get That Alert?
  • 16. Question #2: Why DIDN’T I Get That Alert? ▧ It was designed like that ○ Alert windows ○ Problem duration ○ Ticket not reset ○ Mute/unmanage/shut-up ○ Parent-child
  • 17. Question #2: Why DIDN’T I Get That Alert? ▧ Change Un-Control ○ Credential changed ○ Network changed ○ Custom Property ○ Element removed ○ Physical to Virtual
  • 18. Question #2: Why DIDN’T I Get That Alert? ▧ Monitoring Failed ○ Polling stopped ○ Agent stopped ○ Data throttled ○ DB is out of sync ○ New code/image missing tracing ○ Monitoring “supply chain” failed (email) ○ Event correlation rules
  • 19. What’s the Teretz? ▧ Understand (and communicate) exceptions ▧ Save your receipts ▧ Save other people’s receipts too, if you can ▧ Monitor your monitoring ▧ Test your notification delivery infrastructure ▧ Have validation steps ready
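One way to "have validation steps ready" is to encode the common causes from slides 16 to 18 as explicit checks. The following is a minimal sketch under stated assumptions: `AlertRule`, `NodeState`, and `why_no_alert` are hypothetical names, the ten-minute poll-age limit is an arbitrary default, and the alert window is assumed not to span midnight. Real monitoring tools expose this state in their own ways.

```python
from dataclasses import dataclass
from datetime import datetime, time, timedelta
from typing import List, Optional


@dataclass
class AlertRule:
    name: str
    window_start: time        # alerts may only fire inside this daily window
    window_end: time          # assumed to be later than window_start
    min_duration: timedelta   # problem must persist this long before alerting


@dataclass
class NodeState:
    muted: bool
    parent_up: bool
    last_poll: Optional[datetime]
    problem_started: Optional[datetime]


def why_no_alert(
    rule: AlertRule,
    node: NodeState,
    now: datetime,
    max_poll_age: timedelta = timedelta(minutes=10),
) -> List[str]:
    """Return every reason found that would keep the rule from firing."""
    reasons = []
    if node.last_poll is None or now - node.last_poll > max_poll_age:
        reasons.append("monitoring failed: no recent poll data")
    if node.muted:
        reasons.append("by design: node is muted or unmanaged")
    if not node.parent_up:
        reasons.append("by design: parent is down, child alerts are suppressed")
    if not (rule.window_start <= now.time() <= rule.window_end):
        reasons.append("by design: outside the alert window")
    if node.problem_started is None:
        reasons.append("no problem condition was recorded")
    elif now - node.problem_started < rule.min_duration:
        reasons.append("by design: problem has not lasted the required duration")
    return reasons


now = datetime(2024, 1, 1, 2, 16)
rule = AlertRule("CPU_CRIT_PROD", time(8, 0), time(18, 0), timedelta(minutes=15))
node = NodeState(
    muted=False,
    parent_up=True,
    last_poll=now - timedelta(minutes=1),
    problem_started=now - timedelta(minutes=5),
)
for reason in why_no_alert(rule, node, now):
    print(reason)
```

An empty result means none of these checks explains the silence, which points further down the "supply chain": notification delivery, event correlation rules, or ticketing.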
  • 20. Question #3: What’s monitored on my system(s)?
  • 22. Question #3: What’s monitored on my system(s)?
  • 23. What’s the Teretz? ▧ One size fits… some? ▧ Skillcheck: SQL ▧ Skillcheck: Wireshark ▧ Look at the screens
  • 24. Jewish Roots: Burning Hail and Black Swans ▧ Why do we remember the plagues? ○ Visceral, unexpected, unique ▧ Let’s talk about “black swans” ▧ The plagues as black swan events
  • 25. Question #4: What COULD alert for my systems?
  • 26. Question #4: What COULD alert for my systems? ▧ What *IS* an alert? ○ Emergency ○ Interruption ○ Unplanned Work ▧ What does alerting NEED to be? ○ Timely ○ Meaningful ○ Actionable
  • 27. Question #4: What COULD alert for my systems? ▧ Why does this matter? ○ # of systems ○ # of alerts that can trigger for those systems ○ # of staff hours to address those alerts ○ # of alerts that could trigger simultaneously
  • 28. What’s the Teretz? ▧ This can be a VERY difficult question to answer ▧ But its difficulty is in proportion to its importance ▧ Speaks to potential impact to the company, workload, interruptions.
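The four counts on slide 27 combine into a rough workload estimate. The sketch below shows the arithmetic only; the function name, the monthly trigger rate, and the minutes-per-alert figure are assumptions introduced for illustration and do not come from the talk.

```python
def alert_workload(
    systems: int,
    alerts_per_system: int,
    monthly_trigger_rate: float,
    minutes_per_alert: float,
) -> dict:
    """Estimate alert volume and staff time from the slide 27 counts.

    monthly_trigger_rate is the assumed fraction of defined alerts that
    actually fire in a month; it is a planning guess, not a measurement.
    """
    defined_alerts = systems * alerts_per_system
    expected_alerts = defined_alerts * monthly_trigger_rate
    staff_hours = expected_alerts * minutes_per_alert / 60
    return {
        # Upper bound: every defined alert firing at the same time.
        "worst_case_simultaneous": defined_alerts,
        "expected_alerts_per_month": expected_alerts,
        "staff_hours_per_month": staff_hours,
    }


estimate = alert_workload(
    systems=200,
    alerts_per_system=10,
    monthly_trigger_rate=0.05,
    minutes_per_alert=30,
)
for key, value in estimate.items():
    print(f"{key}: {value}")
```

With these made-up inputs, 200 systems carrying 10 alert definitions each give 2000 alerts that could in principle fire at once, about 100 that are expected to fire in a month, and about 50 staff hours to handle them.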
  • 30. Jewish Roots: Are You Ready For the Hard Questions? ▧ Scholar, Skeptic, Simple, & Silent ▧ Meet each user where they are ▧ Let’s talk about the Skeptic (“the wicked son”) ▧ Listen past the snark for the question
  • 31. Question #5: What Do You Monitor “Standard”?
  • 32. Wait, I thought you said FOUR questions!
  • 33. Jewish Roots: Four or Five cups? ▧ Symbolism of wine as joy ▧ We need to remember to pause for joyful moments ▧ Despite rigorous Talmudic analysis, there are still questions without clear answers. ▧ BUT… that doesn’t mean we disengage. ▧ We return to these questions over and over, try new approaches. A lot like IT problems.
  • 34. OK, So That Fifth Question: What Do You Monitor “Standard”? ▧ When you load up a box into monitoring, what do consumers automatically get? ▧ If you can’t describe this, how will anyone know what to ask for “extra”?
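Describing the "standard" set is easier when it is written down as data. The sketch below assumes a hypothetical baseline per OS type and compares it with what is actually monitored on one box; the baseline contents, `STANDARD_BASELINE`, and `describe_coverage` are invented for illustration and do not reflect any product's defaults.

```python
from typing import Dict, List, Set

# Hypothetical "standard" monitors per OS type; adjust to your own environment.
STANDARD_BASELINE: Dict[str, Set[str]] = {
    "windows": {"availability", "cpu", "memory", "disk"},
    "linux": {"availability", "cpu", "memory", "disk", "load_average"},
}


def describe_coverage(os_type: str, monitored: Set[str]) -> Dict[str, List[str]]:
    """Split a system's monitors into standard, extra, and missing."""
    baseline = STANDARD_BASELINE.get(os_type, set())
    return {
        "standard": sorted(baseline & monitored),
        "extra": sorted(monitored - baseline),
        "missing": sorted(baseline - monitored),
    }


report = describe_coverage("windows", {"availability", "cpu", "disk", "sql_jobs"})
for section, items in report.items():
    print(f"{section}: {', '.join(items) or '(none)'}")
```

The same report also serves Question #3: the "standard" and "extra" lists together are what is monitored on the system, and "missing" flags drift from the baseline.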
  • 35. The Mostly Un-Necessary Summary: Being prepared for the 4 (ok, 5) questions ▧ Your monitoring will be (better) prepared for the stresses it will be exposed to. ▧ You will be (better) prepared as an advocate for monitoring. ▧ You’ll spend less time answering repetitive questions and more time doing the work of a monitoring engineer (i.e., the GOOD stuff!).
  • 36. If you still have questions…
  • 37. Thank You! I’m READY. Tell me what questions you have.