12. Step 1: Build extremely resilient infrastructure to minimize the
outages that Ops normally handles. The best way to avoid the pain
of your team being paged is to make sure the pages never happen.
13. Step 2: Provide developers with strong tooling so that they can easily
manage the infrastructure. This doesn't mean turning them into
experts, but giving them control over what they should control.
14. Step 3: Use Ops as a second line of support for developers
in the event that the previous steps weren't sufficient.
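
One way to picture the three steps is as an ordered chain of responders, where an alert only reaches Ops if nothing earlier absorbed it. The sketch below is illustrative only: the tier names, the alert strings, and the handler functions are all hypothetical and do not come from the talk.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Tier:
    name: str
    handle: Callable[[str], bool]  # returns True if this tier resolved the alert


def route_alert(alert: str, tiers: List[Tier]) -> str:
    """Offer the alert to each tier in order; return the name of the tier that resolved it."""
    for tier in tiers:
        if tier.handle(alert):
            return tier.name
    return "unresolved"


# Hypothetical handlers, one per step.
def resilient_infrastructure(alert: str) -> bool:
    # Step 1: the platform absorbs the failure, so nobody is paged.
    return alert == "node-failure"


def developer_tooling(alert: str) -> bool:
    # Step 2: the owning developer fixes it with self-service tooling.
    return alert == "bad-deploy"


def ops_second_line(alert: str) -> bool:
    # Step 3: Ops picks up whatever the earlier steps could not handle.
    return True


TIERS = [
    Tier("infrastructure", resilient_infrastructure),
    Tier("developer", developer_tooling),
    Tier("ops", ops_second_line),
]

resolved_by = route_alert("bad-deploy", TIERS)
```

The ordering is the point of the design: Ops sits last in the list, so the more failures the first two tiers absorb, the fewer pages reach the Ops team.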
16. Looking ahead…
• Can we extend our use of Mesos to
other systems (MySQL, Kafka,
HBase)?
• Can we develop self-healing systems
that automate manual processes?
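
As a rough illustration of the second question, a self-healing system can be thought of as a loop that runs a health check, applies the remediation a human would otherwise perform by hand, and only escalates when that fails. This is a minimal sketch under stated assumptions: `self_heal`, `FakeService`, and the restart-based remediation are hypothetical names invented for this example, not something the talk describes.

```python
from typing import Callable


def self_heal(check: Callable[[], bool],
              remediate: Callable[[], None],
              max_attempts: int = 3) -> bool:
    """Return True if the system is healthy, possibly after remediation.

    A False result means automation was not enough and a human should be paged.
    """
    for _ in range(max_attempts):
        if check():
            return True
        remediate()  # the manual step, now automated
    return check()


class FakeService:
    """Hypothetical stand-in for a real service, used only to exercise the loop."""

    def __init__(self) -> None:
        self.running = False
        self.restarts = 0

    def is_healthy(self) -> bool:
        return self.running

    def restart(self) -> None:
        self.restarts += 1
        self.running = True


service = FakeService()
healthy = self_heal(service.is_healthy, service.restart)
```

Capping the number of attempts keeps a failing remediation from looping forever; once the cap is hit, the problem falls through to a person.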