@robtreat2
Managing Chaos In Production:
Testing vs Monitoring
Robert Treat
@robtreat2
monitoring,
it’s like testing,
but in production,
so actually important
Robert Treat
@robtreat2
a discussion on trade-offs between testing
and monitoring and how to apply them in a
devops lifecycle
Robert Treat
❖ @robtreat2
❖ robert@omniti.com
❖ xzilla.net
❖ slideshare.net/xzilla
WHO AM I?
๏ Former dev, ops, dba, & more
๏ CEO @ OmniTI
๏ Build & operate systems at scale
๏ IT Services / Consulting
@robtreat2
TESTING
@robtreat2
testing is overrated
@robtreat2
testing is required
@robtreat2
testing is not enough
@robtreat2
> unit testing
> functional testing
> resilience testing
> performance testing
> …
@robtreat2
testing can give a false
sense of security
@robtreat2
testing is deterministic
@robtreat2
data problem
@robtreat2
> quantity of data
> frequency of data
> quality of data
@robtreat2
example
Wolfe+585
@robtreat2
example
Hubert Blaine Wolfeschlegelsteinhausenbergerdorffwelchevoralternwaren-
gewissenhaftschaferswessenschafewarenwohlgepflegeundsorgfaltigkeitbe-
schutzenvorangreifendurchihrraubgierigfeindewelchevoralternzwolfhundert-
tausendjahresvorandieerscheinenvonderersteerdemenschderraumschiff-
genachtmittungsteinundsiebeniridiumelektrischmotorsgebrauchlichtalsseinur-
sprungvonkraftgestartseinlangefahrthinzwischensternartigraumaufdersuchen-
nachbarschaftdersternwelchegehabtbewohnbarplanetenkreisedrehensichund-
wohinderneuerassevonverstandigmenschlichkeitkonntefortpflanzenundsicher-
freuenanlebenslanglichfreudeundruhemitnichteinfurchtvorangreifenvor-
andererintelligentgeschopfsvonhinzwischensternartigraum, Sr.
@robtreat2
user problem
@robtreat2
“Users (n) - distributed fault injection
test suite for production”
@robtreat2
example
the
Corrupted Blood
incident
@robtreat2
example
@robtreat2
other factors
@robtreat2
> lack of foresight
> too many use-cases
> change to assumptions
@robtreat2
you can never add
enough tests
@robtreat2
100% code coverage
is not the goal
@robtreat2
the goal of testing is to win
the confidence game
@robtreat2
we want to be reasonably
confident that the code we
are going to push will not
break production
@robtreat2
reasonably confident
@robtreat2
testing is good for
“known knowns”
@robtreat2
testing is not so good for
“unknown unknowns”
@robtreat2
enter monitoring
@robtreat2
why monitor?
@robtreat2
> software is never perfect
> systems are complex
> external dependency worry
> proactive is better than reactive
> …
@robtreat2
because things change
@robtreat2
because things change
a lot in production
@robtreat2
what to monitor?
@robtreat2
“in God we trust, all others
we monitor”
@robtreat2
> systems
> databases
> applications
> integration points
> performance
> user behavior
> …
@robtreat2
is it enough?
@robtreat2
is it too much?
@robtreat2
what is important?
@robtreat2
what is important?
(i.e. what to alert on)
@robtreat2
example
> servers up and running
> HTTP checks return 200
> tweets are lost
@robtreat2
servers working
!=
business working
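A minimal sketch of the gap between the two kinds of check, assuming a hypothetical `Snapshot` of one sampling interval. All names and the 1% loss threshold are illustrative and not from the talk.

```python
from dataclasses import dataclass


@dataclass
class Snapshot:
    """One sampling interval; every field name here is hypothetical."""
    servers_up: bool
    http_status: int
    tweets_accepted: int   # tweets the API acknowledged
    tweets_persisted: int  # tweets that actually reached storage


def infra_ok(s: Snapshot) -> bool:
    # The infrastructure view: hosts respond and HTTP returns 200.
    return s.servers_up and s.http_status == 200


def business_ok(s: Snapshot, max_loss_ratio: float = 0.01) -> bool:
    # The business view: nearly every accepted tweet must be persisted.
    if s.tweets_accepted == 0:
        return True  # nothing could be lost in this interval
    lost = s.tweets_accepted - s.tweets_persisted
    return lost / s.tweets_accepted <= max_loss_ratio


snapshot = Snapshot(servers_up=True, http_status=200,
                    tweets_accepted=1000, tweets_persisted=600)
print(infra_ok(snapshot), business_ok(snapshot))
```

Here the infrastructure check passes while the business check fails, which is the situation the slide describes.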
@robtreat2
“I don’t give a **** if the
datacenter is on fire as
long as I am still making
money”
- CEO
@robtreat2
most people use
monitoring to focus on the
wrong things
@robtreat2
my fault
@robtreat2
monitoring vendors’ fault
@robtreat2
we need to be smarter
about how we talk about
monitoring
@robtreat2
often when people say
monitoring
they actually mean
visibility
@robtreat2
metrics collection for
“all the things”
monitor
what affects business
@robtreat2
metrics collection for
“all the things”
monitor
what affects business
@robtreat2
top-down approach
> understand business
> define baseline
> correlate data
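One way the baseline and correlation steps could look in code. This is a sketch under stated assumptions: the baseline is taken as the mean and sample standard deviation of recent history, and "correlate" is reduced to listing which metrics left their baseline in the same interval. The function names, the 3-sigma tolerance, and the sample numbers are all hypothetical.

```python
from statistics import mean, stdev


def outside_baseline(history, current, tolerance=3.0):
    """True if `current` is more than `tolerance` standard deviations
    away from the mean of `history` (needs at least two data points)."""
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return current != mu
    return abs(current - mu) / sigma > tolerance


def deviating_metrics(metrics, tolerance=3.0):
    """Names of metrics outside baseline in the current interval, sorted
    so that metrics that moved together can be reviewed together."""
    return sorted(
        name
        for name, (history, current) in metrics.items()
        if outside_baseline(history, current, tolerance)
    )


# Illustrative data only: (recent history, current value) per metric.
metrics = {
    "revenue": ([100, 102, 98, 101, 99], 60),
    "traffic": ([1000, 990, 1010, 1005, 995], 1002),
    "email_bounces": ([5, 6, 4, 5, 5], 40),
}
print(deviating_metrics(metrics))
```

With this sample data, revenue and email bounces fall outside their baselines while traffic does not, which narrows where to look first.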
@robtreat2
example
๏ online marketing company
๏ major e-commerce component
๏ ~100 million users
๏ 1 billion emails/month
๏ 300,000 lines of code
๏ 5600+ metrics collected
@robtreat2
it all starts with a call …
@robtreat2
revenue
@robtreat2
revenue
+ traffic
@robtreat2
revenue + traffic
+ load time
@robtreat2
revenue + traffic + load time
+ db
@robtreat2
revenue + traffic + load time + db
+ email bounces
@robtreat2
… email wasn’t monitored?
what if …
@robtreat2
… email wasn’t monitored?
(it would be after this)
what if …
@robtreat2
instrumentation
is never done
@robtreat2
example
> same symptoms
> all metrics are within norm
> higher decline rates
@robtreat2
example
> same symptoms
> all metrics are within norm
> higher decline rates
AmEx blocked
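A sketch of how segmenting a metric can surface this kind of failure: the decline rate is computed per card brand rather than only in aggregate. The data shape (brand, approved) and the sample numbers are assumptions for illustration, not details from the talk.

```python
from collections import defaultdict


def decline_rates(transactions):
    """Map each card brand to its share of declined transactions.

    `transactions` is an iterable of (brand, approved) pairs.
    """
    totals = defaultdict(int)
    declined = defaultdict(int)
    for brand, approved in transactions:
        totals[brand] += 1
        if not approved:
            declined[brand] += 1
    return {brand: declined[brand] / totals[brand] for brand in totals}


# Illustrative data only: one brand is being declined entirely.
transactions = (
    [("visa", True)] * 95
    + [("visa", False)] * 5
    + [("amex", False)] * 10
)
rates = decline_rates(transactions)
print(rates)
```

An aggregate decline rate blends the brands together; the per-brand view shows one brand at 100% declines.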
@robtreat2
tl;dr
@robtreat2
testing and monitoring
not
testing or monitoring
@robtreat2
understand the business
@robtreat2
increase observability
(metrics, tracing, etc.)
@robtreat2
monitor things that are
impactful
@robtreat2
alert only on actionable
emergencies
@robtreat2
THANK YOU
questions?
