Introducing Chaos Engineering
as part of DevOps
What is Chaos Engineering?
“Everything fails all the time. We lose whole data centers!
Those things happen.”
Werner Vogels (CTO @Amazon.com)
What?
Who am I?
Gurtej Pal Singh
Associate Director – Agile Service Delivery @ LoyaltyOne
Enterprise Digital Strategy (Agile Coaching & DevOps Consulting) | Planning & Execution | Technology Consulting
Twitter: @GurtejPalSingh
LinkedIn: https://www.linkedin.com/in/gurtejpalsingh/
Blog: https://gurtejpsingh.wordpress.com/
Agenda
• What?
• Chaos Engineering VS Testing
• The Story of How & Why
• Chaos Engineering & DevOps
• Industry Presence
Ideal Architecture
• Fault Tolerant
• Highly Available
• Auto Scalable
Gurtej Pal Singh | @GurtejPalSingh
Let us see…
[Diagram labels: Users, Web Server, Request, Fetch from Cache, DB Request, three Databases]
Then to now…
Pic courtesy: https://blog.knoldus.com
Let us see…
Chaos Engineering VS Testing
[Figure labels: Chaos Engineering, Software Testing]
Models of the two…
Testing == Validation
Chaos Engineering == Experimentation
The story of How & Why…
Prerequisites
• Review the application architecture to identify:
  • Dependencies
  • Customer impact
  • Failure points
  • Recovery procedures
Principles of Chaos
• Build a Hypothesis around Steady State Behavior
• Vary real-world events
• Experiment in production
• Automate experiments to run continuously
• Minimize Blast Radius
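Two of these principles, the steady-state hypothesis and the blast radius, can be sketched in a few lines of Python. This is only an illustration: the `SteadyState` class, the `pick_targets` helper, the metric name, and all numbers are assumptions made for the example and are not taken from any specific chaos tool.

```python
import random
from dataclasses import dataclass


@dataclass
class SteadyState:
    """Hypothetical steady-state definition: a baseline plus an allowed deviation."""
    metric: str
    baseline: float
    tolerance: float  # allowed absolute deviation from the baseline

    def holds(self, observed: float) -> bool:
        # The hypothesis holds while the observed value stays near the baseline.
        return abs(observed - self.baseline) <= self.tolerance


def pick_targets(hosts, blast_radius, seed=None):
    """Choose a limited random subset of hosts so the blast radius stays small."""
    if not 0 < blast_radius <= 1:
        raise ValueError("blast_radius must be in (0, 1]")
    count = max(1, int(len(hosts) * blast_radius))
    return random.Random(seed).sample(hosts, count)


# Illustrative values only: a 1% baseline error rate and a 20-host fleet.
hypothesis = SteadyState(metric="error_rate", baseline=0.01, tolerance=0.005)
all_hosts = [f"web-{i}" for i in range(20)]
targets = pick_targets(all_hosts, blast_radius=0.1, seed=42)

print(len(targets))             # 2 of the 20 hosts are exposed to the fault
print(hypothesis.holds(0.012))  # True: within tolerance of the baseline
print(hypothesis.holds(0.05))   # False: the hypothesis is disproved
```

Keeping the hypothesis as data rather than as ad hoc checks makes it easier to automate the same experiment so that it runs continuously.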
Sample Chaos Experiments*
• Terminate a random virtual server in a region.
• Subject the entire fleet within a region to high CPU/memory load.
• Increase latency in one or more servers.
• Block access to a storage system.
• Fail over a database to its secondary.
• Kill critical processes at random.
*For a generic three-tier application. The list is not comprehensive and depends on the application's tooling support and maturity, but it is a good starting point.
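As a rough illustration of the latency experiment, the sketch below delays calls to a function inside a single Python process. It is a toy stand-in, not how fleet-level tools inject latency: `inject_latency` and `fetch_profile` are hypothetical names, and the 50 ms delay is an arbitrary assumption.

```python
import random
import time
from functools import wraps


def inject_latency(delay_seconds, probability=1.0, rng=None):
    """Decorator that delays a call with the given probability (hypothetical helper)."""
    rng = rng or random.Random()

    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            # random() returns a value in [0, 1), so probability=1.0 always delays.
            if rng.random() < probability:
                time.sleep(delay_seconds)
            return func(*args, **kwargs)
        return wrapper
    return decorator


@inject_latency(delay_seconds=0.05)
def fetch_profile(user_id):
    # Stand-in for a real downstream call.
    return {"id": user_id}


start = time.perf_counter()
result = fetch_profile(7)
elapsed = time.perf_counter() - start
print(result, f"{elapsed:.3f}s")
```

Lowering `probability` affects only a fraction of calls, which is one simple way to keep the impact of such an experiment small.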
Approach
[Diagram labels: Requests; Web Setup; Sane Setup; Chaos Setup; Expected behavior; Expected benchmarked behavior; Hypothesis (dis)proved behavior; For comparing the results; Incorporate Learnings]
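One possible reading of this approach is a comparison between a sane setup (no fault injected) and a chaos setup (fault injected) that receive the same requests. The sketch below assumes that reading. The function names, the use of HTTP status codes as the measured behavior, and the 2% allowed drop are all assumptions made for illustration.

```python
def success_rate(status_codes):
    """Fraction of responses that are not server errors (5xx)."""
    return sum(code < 500 for code in status_codes) / len(status_codes)


def hypothesis_holds(sane_codes, chaos_codes, max_drop=0.02):
    """True when the chaos setup stays within max_drop of the sane setup's success rate."""
    return success_rate(sane_codes) - success_rate(chaos_codes) <= max_drop


# Made-up response samples for the two setups.
sane = [200] * 99 + [500]             # benchmark: 99% success
chaos_ok = [200] * 98 + [503] * 2     # 98% success with the fault injected
chaos_bad = [200] * 80 + [503] * 20   # 80% success with the fault injected

print(hypothesis_holds(sane, chaos_ok))   # True: hypothesis proved
print(hypothesis_holds(sane, chaos_bad))  # False: hypothesis disproved, incorporate learnings
```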
Levels of Chaos
[Chart axes: Sophistication, Adoption]
Why?
[Figure labels: No chaos story, Frequent crashes, Presumed resiliency, True resiliency]
Image: www.salonuready.info
Chaos Engineering & DevOps
Chaos Engineering in the CI/CD Pipeline
A few tools to get started with Chaos Engineering
Simian Army:
• Chaos Monkey
• Chaos Gorilla
• Chaos Kong
• ChAP (Chaos Automation Platform)
Chaos Engineering as a part of DevOps Culture
Developers & stakeholders continually embrace failure as a way to prepare for and prevent it, resulting in stronger and more resilient applications.
Once the chaos experiments have been continuously validated in a CI/CD pipeline:
• Enable them in production incrementally and at random (without notifying the support teams).
• Tooling must be mature enough to stop an experiment without further impact.
• Initially run chaos experiments at random non-peak times; once mature, move to peak times.
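The last two points, stopping an experiment safely and starting in non-peak hours, could be sketched as follows. The off-peak window, the abort threshold, and both function names are assumptions made for illustration. A real tool would also roll back the injected fault when it aborts.

```python
from datetime import time


def in_window(now, start, end):
    """True if `now` falls in [start, end), including windows that wrap past midnight."""
    if start <= end:
        return start <= now < end
    return now >= start or now < end


def run_with_abort(error_rates, abort_threshold):
    """Walk through error-rate samples taken during an experiment; stop at the first breach.

    Returns (samples_observed, aborted).
    """
    for observed, rate in enumerate(error_rates, start=1):
        if rate > abort_threshold:
            return observed, True  # a real tool would roll back the fault here
    return len(error_rates), False


# Assumed off-peak window of 01:00 to 05:00; real values depend on traffic patterns.
OFF_PEAK = (time(1, 0), time(5, 0))

print(in_window(time(2, 30), *OFF_PEAK))   # True
print(in_window(time(14, 0), *OFF_PEAK))   # False
print(run_with_abort([0.01, 0.02, 0.09, 0.30], abort_threshold=0.05))  # (3, True)
print(run_with_abort([0.01, 0.02], abort_threshold=0.05))              # (2, False)
```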
Industry Presence
Chaos Engineering Companies, People, Tools and Practices
What?
The discipline of experimenting on a distributed system to build confidence in the system's capability to withstand turbulent conditions in “PRODUCTION”.
https://principlesofchaos.org/
The main point is no more to avoid failures, but to limit impact of those failures.
Werner Vogels (VP & CTO at Amazon.com)
References & Resources
• https://principlesofchaos.org/
• https://medium.com/capital-one-tech/continuous-chaos-introducing-chaos-engineering-into-devops-practices-75757e1cca6d
• https://www.gremlin.com/uploads/20171210-Chaos_Engineering_White_Paper.pdf
• https://www.youtube.com/watch?v=6ilMZqKdMMU
• https://www.youtube.com/watch?v=qHykK5pFRW4
• https://medium.com/netflix-techblog/the-netflix-simian-army-16e57fbab116
• https://medium.com/netflix-techblog/chap-chaos-automation-platform-53e6d528371f
• https://www.oreilly.com/library/view/chaos-engineering/9781491988459/
Twitter: @GurtejPalSingh
LinkedIn: https://www.linkedin.com/in/gurtejpalsingh/

An introduction to chaos engineering as part of DevOps at XP2019

  • 3. What is Chaos Engineering?
  • 4. What? “Everything fails all the time. We lose whole data centers! Those things happen.” Werner Vogels (CTO @Amazon.com)
  • 5. Who am I? Gurtej Pal Singh, Associate Director – Agile Service Delivery @ LoyaltyOne. Enterprise Digital Strategy (Agile Coaching & DevOps Consulting) | Planning & Execution | Technology Consulting. Twitter: @GurtejPalSingh LinkedIn: https://www.linkedin.com/in/gurtejpalsingh/ Blog: https://gurtejpsingh.wordpress.com/
  • 6. Agenda • What? • Chaos Engineering VS Testing • The Story of How & Why • Chaos Engineering & DevOps • Industry Presence
  • 8. Let us see… [Diagram labels: Users, Web Server, Fetch from Cache, DB Request, Database] Gurtej Pal Singh | @GurtejPalSingh
  • 9. Then to now… (Pic courtesy: https://blog.knoldus.com)
  • 10. Let us see…
  • 11. Chaos Engineering VS Testing [Diagram labels: Chaos Engineering, Software Testing]
  • 12. Chaos Engineering VS Testing
  • 13. Models of the two… Testing == Validation; Chaos Engineering == Experimentation
  • 14. The story of How & Why…
  • 15. Prerequisites • Review the application architecture to identify: • Dependencies • Customer impact • Failure points • Recovery procedures
  • 16. Principles of Chaos • Build a hypothesis around steady-state behavior • Vary real-world events • Experiment in production • Automate experiments to run continuously • Minimize blast radius
  • 17. Sample Chaos Experiments* • Terminate a random virtual server in a region. • Subject the entire fleet to high CPU/memory load within a region. • Increase latency in one or more servers. • Block access to a storage system. • Fail over a database to its secondary. • Randomly kill critical processes. *For a generic three-tier application. The list is not comprehensive and depends on the application's tooling support and maturity, but it is a good starting point.
  • 18. Approach [Diagram labels: Requests; Web Setup, Sane Setup, Chaos Setup; Expected behavior, Expected benchmarked behavior, Hypothesis (dis)proved behavior; For comparing the results; Incorporate Learnings]
  • 19. Levels of Chaos [Chart labels: Sophistication, Adoption; No chaos Story, Frequent Crashes, Presumed Resiliency, True Resiliency]
  • 20. Why? (Image: www.salonuready.info)
  • 21. Chaos Engineering & DevOps
  • 22. Chaos Engineering in CI/CD Pipeline
  • 23. Few tools to get started with Chaos Engineering • Simian Army: • Chaos Monkey • Chaos Gorilla • Chaos Kong • ChAP (Chaos Automation Platform)
  • 24. Chaos Engineering as a part of DevOps Culture. Once the chaos experiments have been continuously validated in a CI/CD pipeline: • Enable them in production, incrementally and at random (without notifying the support teams). • Tooling must be mature enough to stop an experiment without further impact. • Start by running chaos experiments at random non-peak times; once mature, move to peak times. Developers & stakeholders continually embrace failure as a way to prepare for and prevent it, resulting in stronger and more resilient applications.
  • 25. Industry Presence
  • 26. Industry Presence: Chaos Engineering Companies, People, Tools and Practices
  • 27. What? The discipline of experimenting on a distributed system to build confidence in the system's capability to withstand turbulent conditions in “PRODUCTION”. https://principlesofchaos.org/
  • 28. The main point is no longer to avoid failures, but to limit the impact of those failures. (Werner Vogels, VP & CTO at Amazon.com)
  • 29. References & Resources • https://principlesofchaos.org/ • https://medium.com/capital-one-tech/continuous-chaos-introducing-chaos-engineering-into-devops-practices-75757e1cca6d • https://www.gremlin.com/uploads/20171210-Chaos_Engineering_White_Paper.pdf • https://www.youtube.com/watch?v=6ilMZqKdMMU • https://www.youtube.com/watch?v=qHykK5pFRW4 • https://medium.com/netflix-techblog/the-netflix-simian-army-16e57fbab116 • https://medium.com/netflix-techblog/chap-chaos-automation-platform-53e6d528371f • https://www.oreilly.com/library/view/chaos-engineering/9781491988459/
  • 30. Twitter: @GurtejPalSingh LinkedIn: https://www.linkedin.com/in/gurtejpalsingh/
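
The "Approach" slide and the "Principles of Chaos" slide can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the author's method or the behavior of any real chaos tool: it compares a baseline run against a run with injected latency, limits the blast radius to a fraction of requests, and stops the experiment early if the steady-state metric collapses. Every name here (`simulated_request`, `run_setup`, `hypothesis_holds`), the latency ranges, the timeout and the thresholds are hypothetical and chosen for illustration.

```python
import random
from dataclasses import dataclass


@dataclass
class Result:
    requests: int    # requests actually sent before finishing or aborting
    successes: int   # requests that completed within the timeout
    aborted: bool    # True if the safety guard stopped the run early

    @property
    def success_rate(self) -> float:
        return self.successes / self.requests if self.requests else 0.0


def simulated_request(rng, extra_latency_ms, timeout_ms=200.0):
    """Hypothetical stand-in for a real request: True if it beats the timeout."""
    base_latency_ms = rng.uniform(20.0, 120.0)  # assumed healthy latency range
    return base_latency_ms + extra_latency_ms <= timeout_ms


def run_setup(seed, requests, blast_radius, injected_latency_ms, abort_below):
    """Send requests; roughly a `blast_radius` fraction hit a degraded server.

    Stops early if the running success rate falls below `abort_below`
    once at least 50 requests have been observed.
    """
    rng = random.Random(seed)
    successes = 0
    for sent in range(1, requests + 1):
        degraded = rng.random() < blast_radius
        extra = injected_latency_ms if degraded else 0.0
        if simulated_request(rng, extra):
            successes += 1
        if sent >= 50 and successes / sent < abort_below:
            return Result(sent, successes, aborted=True)
    return Result(requests, successes, aborted=False)


def hypothesis_holds(baseline, chaos, max_drop=0.05):
    """Steady-state hypothesis: success rate drops by at most `max_drop`."""
    if chaos.aborted:
        return False
    return baseline.success_rate - chaos.success_rate <= max_drop


# Baseline run: no fault injected, used for comparing the results.
baseline = run_setup(seed=1, requests=1000, blast_radius=0.0,
                     injected_latency_ms=0.0, abort_below=0.5)
# Chaos run: about 10% of requests see 150 ms of added latency.
chaos = run_setup(seed=1, requests=1000, blast_radius=0.1,
                  injected_latency_ms=150.0, abort_below=0.5)

print("baseline success rate:", baseline.success_rate)
print("chaos success rate:", chaos.success_rate)
print("hypothesis holds:", hypothesis_holds(baseline, chaos))
```

Whether the hypothesis holds depends entirely on the assumed numbers. In a real experiment the steady-state metric, the abort threshold and the blast radius would come from the prerequisites review (dependencies, customer impact, failure points, recovery procedures), and the outcome would feed back into the system as learnings.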