Chaos Drills can contribute a lot to your services' resilience, and running them is actually quite fun. We at Outbrain have built a tool called GomJabbar to help you run those drills. GomJabbar is similar to Netflix's Chaos Monkey, but it was built with more relaxed environment and platform requirements, which allow it to run on your private cloud infrastructure.
In this talk I'll explain why we built it, and how you can use it to make your infrastructure and services more resilient to failures.
The video can be found here: https://youtu.be/Jecwwd_xoiI
"I must not fear.
Fear is the mind-killer.
Fear is the little-death that brings total obliteration.
I will face my fear.
I will permit it to pass over me and through me.
And when it has gone past I will turn the inner eye to see its path.
Where the fear has gone there will be nothing.
Only I will remain."
(Litany Against Fear - Frank Herbert - Dune)
Getting Started at your Organization
What did we learn?
● Broken Monitoring
● Broken Alerts
● Unattended Ownerless Services
● SPOF Modules
● Cascading Failures caused by high latency and packet loss