Gamification of Chaos Testing

Bram Vogelaar
@attachmentgenie

29.74 Seconds
What could this number represent?

99.97% Uptime?
That is a weird SLO?
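
A quick check of the arithmetic, assuming (the slide does not say so) that 29.74 seconds is read as downtime per day. The function names are illustrative only.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def uptime_percent(downtime_seconds: float,
                   window_seconds: float = SECONDS_PER_DAY) -> float:
    """Uptime as a percentage of the measurement window."""
    return 100.0 * (1.0 - downtime_seconds / window_seconds)


def downtime_budget_seconds(slo_percent: float,
                            window_seconds: float = SECONDS_PER_DAY) -> float:
    """Seconds of downtime an SLO allows within the window."""
    return window_seconds * (1.0 - slo_percent / 100.0)


if __name__ == "__main__":
    # Assumption: the window is one day.
    print(f"{uptime_percent(29.74):.2f}% uptime")  # prints "99.97% uptime"
    print(f"{downtime_budget_seconds(99.97):.2f} s allowed per day")
```

Under that assumption, 29.74 seconds of daily downtime rounds to 99.97% uptime, which is why the number could pass for an oddly specific SLO.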

Well... actually... it is this guy

“It takes 10K hrs to
become an expert”
Malcolm Gladwell
Because of this!

“Experts have on
average spent 10K hrs
learning their craft”
Malcolm Gladwell
What he really said!

Pilots and firemen spend most of their time doing this

They even spend a lot of time doing this

Large complex systems will always be in a
degraded state

Can you figure out if your
platform is in an error
state?
Can you honestly answer this question?

The discipline of experimenting
on a distributed system in
order to build confidence in
the system’s capability to
withstand turbulent conditions
in production
Chaos Engineering
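
The definition above can be sketched as a loop: state a steady-state hypothesis, measure a baseline, inject a fault, measure again, then roll back. This is a minimal sketch, not a real chaos tool. `ToyService`, its retry behaviour, and the 99% threshold are all hypothetical stand-ins.

```python
import random
from dataclasses import dataclass, field


@dataclass
class ToyService:
    """Hypothetical replicated service (illustration only, not a real API)."""
    replicas: int = 3
    healthy: list = field(default_factory=list)

    def __post_init__(self) -> None:
        self.healthy = [True] * self.replicas

    def handle_request(self, rng: random.Random) -> bool:
        # Route to a random replica and retry once on its neighbour.
        first = rng.randrange(self.replicas)
        second = (first + 1) % self.replicas
        return self.healthy[first] or self.healthy[second]


def success_rate(service: ToyService, rng: random.Random,
                 requests: int = 1000) -> float:
    ok = sum(service.handle_request(rng) for _ in range(requests))
    return ok / requests


def run_experiment(service: ToyService, kill: int,
                   threshold: float = 0.99, seed: int = 42) -> dict:
    """Hypothesis: the success rate stays >= threshold with `kill` replicas down."""
    rng = random.Random(seed)
    baseline = success_rate(service, rng)  # control measurement
    if baseline < threshold:
        raise RuntimeError("steady state not met; do not inject faults")
    try:
        for i in range(kill):              # inject the fault
            service.healthy[i] = False
        observed = success_rate(service, rng)
    finally:
        service.healthy = [True] * service.replicas  # always roll back
    return {"baseline": baseline, "observed": observed,
            "hypothesis_holds": observed >= threshold}


if __name__ == "__main__":
    for kill in (1, 2):
        print(kill, run_experiment(ToyService(), kill))
```

In this toy setup one lost replica is absorbed by the retry, while two lost replicas break the hypothesis. The baseline measurement acts as the control run, and the `finally` block restores the service even if the measurement fails.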

K8S is so complex it grew an entire ecosystem

Look we did some science!
Published Result

Experimental design matters
How representative is one gel of the next?
How many replicates count as significant proof?
How about negative controls?
Is one side of this gel representative of the other?
Are the proteins separated enough?
Are these really the proteins we think they are?
Was this the right technique to begin with?
How much sample do we have for repeats?

Game Day Exercises
https://www.youtube.com/watch?v=xdiGW-RSb2w

Game analytics help guide training
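
One way game-day analytics could be captured is sketched below, assuming each round records when a fault was injected, detected, and resolved. The `Round` fields and the choice of metrics are illustrative assumptions, not something the talk prescribes.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Round:
    """One game-day round; times are seconds since the exercise started."""
    scenario: str
    injected_s: float
    detected_s: float
    resolved_s: float

    @property
    def time_to_detect(self) -> float:
        return self.detected_s - self.injected_s

    @property
    def time_to_recover(self) -> float:
        return self.resolved_s - self.injected_s


def summarize(rounds: list) -> dict:
    """Aggregate the rounds so the next training session can target weak spots."""
    slowest = max(rounds, key=lambda r: r.time_to_recover)
    return {
        "mean_time_to_detect_s": mean(r.time_to_detect for r in rounds),
        "mean_time_to_recover_s": mean(r.time_to_recover for r in rounds),
        "slowest_scenario": slowest.scenario,
    }


if __name__ == "__main__":
    # Made-up example rounds, for illustration only.
    rounds = [
        Round("disk full", injected_s=0, detected_s=60, resolved_s=300),
        Round("dns outage", injected_s=600, detected_s=720, resolved_s=1500),
    ]
    print(summarize(rounds))
```

Comparing these numbers across game days shows where more practice is needed, in the same way pre- and post-game analytics do for athletes.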

Make that sidecar work for you

1. Athletes and Musicians
– Practice makes perfect
– Pre- and post-game analytics point to adjustments that help you win
2. Pilots/Doctors/Firemen
– People respond differently to stress
– Lifelike simulations are critical
– Checklists help you avoid mistakes
3. Scientists
– Break problems down into known tasks
– Lies, damned lies, and statistics... and biases
– Experimental design matters
What have we learned so far

Checklists are also
incredibly boring!
That’s all mighty fine, but...

● Game Day Exercises
● (Non)-Destructive Tests
– Configuration Management / Infrastructure as Code
● Observability
– Metrics
– Logs
– Traces
● Living in the year 3000: Breaking production on purpose
So let’s translate it to IT engineering

Failure is normal and
expected behavior
Convince Management that:

Training
==
Testing
==
Monitoring
Let engineers be scientists

Get Engineers to be comfortable facing failure

Create “real” DoDs
and
Runbooks
When pressing $NewThing™ into service

bram@attachmentgenie.com
@attachmentgenie
https://www.slideshare.net/attachmentgenie
Contact
