All Day DevOps - Practice Makes Perfect: Developing Expertise Through Chaos Engineering

Our technical systems grow more complicated by the day. Whether that complexity arises by intention or by accident, it has the same effect on our ability to manage the applications our clients depend on: it gets a lot harder. When the system produces a ‘surprise’ and no longer performs as assumed, effective incident response is critical. The engineers involved must quickly align behind a common goal, communicate efficiently, and predictably coordinate actions to return system behavior to normal. The high level of cohesion needed to act this way doesn’t happen overnight, and relying on live incidents to build this expertise can be painful and costly. In this talk we’ll cover how teams can prepare themselves for the worst of incidents:

* The critical building blocks of teamwork that are necessary to bring surprises to resolution;
* How to incorporate deliberate practice into the workday to build up incident response muscle memory; and
* How to incorporate Chaos Engineering practices such as GameDays to realistically simulate how the team will react to a real surprise.

Acknowledgement upfront
Hollnagel, Erik, and David D. Woods. Joint Cognitive Systems: Foundations of Cognitive Systems Engineering. CRC Press, 2005.
Klein, Gary, Paul J. Feltovich, Jeffrey M. Bradshaw, and David D. Woods. “Common Ground and Coordination in Joint Activity.” In Organizational Simulation, edited by William B. Rouse and Kenneth R. Boff, 139–84. Hoboken, NJ, USA: John Wiley & Sons, Inc., 2005. https://doi.org/10.1002/0471739448.ch6.
Dekker, Sidney. “Failure to Adapt or Adaptations That Fail: Contrasting Models on Procedures and Safety.” Applied Ergonomics 34, no. 3 (2003): 233–238.
Allspaw, John. “Trade-offs Under Pressure: Heuristics and Observations of Teams Resolving Internet Service Outages.” 2015. http://lup.lub.lu.se/student-papers/record/8084520/file/8084521.pdf
Cook, Richard I. “How Complex Systems Fail.” Cognitive Technologies Laboratory, University of Chicago, Chicago, IL, 1998.
Beyer, Betsy, Chris Jones, Jennifer Petoff, and Niall Richard Murphy. Site Reliability Engineering: How Google Runs Production Systems. O'Reilly Media, Inc., 2016.
Beyer, Betsy, Niall Richard Murphy, David K. Rensin, Kent Kawahara, and Stephen Thorne. The Site Reliability Workbook: Practical Ways to Implement SRE. O'Reilly Media, Inc., 2018.
Rosenthal, Casey, Lorin Hochstein, Aaron Blohowiak, Nora Jones, and Ali Basiri. Chaos Engineering. O'Reilly Media, Inc., 2017.

Our systems are important
It’s not our day job
It’s a team effort

“We hear about the procedures they always follow, and the ones they
sometimes skip because if they followed them blindly and to the letter
they’d have a bad day, guaranteed.
We hear about what they usually do when that alert goes off and
everything is fine again. When “this” and “that” happens, they do “these
things,” but only in certain circumstances. We coax out the hidden
nuances underlying their actions, decisions, and rationales.”
- Etsy Debriefing Facilitation Guide (Allspaw, Evans, Schauenberg)

https://principlesofchaos.org/?lang=ENcontent

1. NOVEMBER 6, 2019

3. Who I am

31. Focus is here

33. Focus is holistic
