#ATAGTR2021 Presentation : "Chaos engineering: Break it to make it" by Anupam Agarwal, Peeyush Girdhar.

Interactive session on "Chaos engineering: Break it to make it" by Anupam Agarwal (Nagarro) and Peeyush Girdhar (Cloud/DevOps, Nagarro) at #ATAGTR2021.

#ATAGTR2021 was the 6th edition of the Global Testing Retreat.

The video recording of the session is now available at the following link: https://www.youtube.com/watch?v=4bM4f8xNp2A

To learn more about #ATAGTR2021, please visit: https://gtr.agiletestingalliance.org/

#ATAGTR2021 Presentation : "Chaos engineering: Break it to make it" by Anupam Agarwal, Peeyush Girdhar.

  1. #ATAGTR2021. Chaos Engineering: Break It to Make It. Anupam Agarwal & Peeyush Girdhar
  2. KNOW YOUR SPEAKERS: Anupam Agarwal (Cloud/DevOps Architect) and Peeyush Girdhar (Cloud/DevOps Architect)
  3. AGENDA: 01 Concept of Chaos Engineering; 02 Need for Chaos Engineering; 03 Chaos Engineering vs Normal Testing; 04 Start your journey with Chaos Engineering
  4. Why the World Needs More Resilient Systems. Common issues faced by multiple organizations:
     • 1 BREACH: Organizations that confirmed or suspected breaches tied to their applications or infrastructure.
     • 2 MATURITY: Organizations that are in an immature or improving state with respect to environment resilience.
     • 3 TEAMS: Teams that have not incorporated resilience testing in their design during the initial stages of the SDLC.
     • 4 TESTING: Traditional testing is still not helping them find issues within their ecosystems.
     Figures shown on the slide, in order: 24%, 86%, 65%, 47%.
  5. Chaos Engineering: Where Are We? The art of breaking things purposefully. Ever since Netflix introduced Chaos Engineering through their Simian Army toolset in 2012, the idea of inducing failure as a preventative means has become one of the preferred resilience techniques for cloud native distributed systems. “Chaos Engineering is the discipline of experimenting on a distributed system in order to induce artificial failures to build confidence in the system's capability to withstand turbulent conditions in production.” Here's how Netflix describes why they built these chaos tools: “The cloud is all about redundancy and fault-tolerance. Since no single component can guarantee 100% uptime (and even the most expensive hardware eventually fails), we have to design a cloud architecture where individual components can fail without affecting the availability of the entire system. In effect, we have to be stronger than our weakest link.”
  6. Why Chaos Engineering? Chaos Engineering is preventive medicine. Chaos Engineering is an approach for learning how your system behaves by applying a discipline of empirical exploration. It enables organizations to develop reliable and fault-tolerant software systems and builds your team's confidence in them. The more stable your systems are, the more confident you can be that they will function properly. By designing and executing Chaos Engineering experiments, you learn about weaknesses in your system that could lead to outages in customer environments. Key themes: LEARN; PREVENT OUTAGES; BUILD CONFIDENCE.
  7. Getting Started with Chaos Engineering. A disciplined approach to finding failures before they become outages.
     • DEFINE ‘STEADY STATE’: Start by defining ‘steady state’ as some measurable output of a system that indicates normal behavior.
     • CREATE HYPOTHESIS: Hypothesize that this steady state will continue in both the control group and the experimental group.
     • RUN EXPERIMENTS: Introduce attacks that reflect real-world events such as a server crash, a hard drive malfunction, or a network outage.
     • INTERPRET THE RESULTS: Try to disprove the hypothesis by looking for a difference in steady state between the control group and the experimental group.
     • LEARN & IMPROVE: Improve the existing system based on the experiments above and their results.
  8. Chaos Engineering Meets DevOps. Maximize benefits by practicing automated Chaos Engineering within your CI/CD pipelines.
  9. DEVOPS: SLOs/Error Budget; Documentation; Architecture Model; Runbooks; Monitoring; Network Provider; CDNs; Cloud & SaaS Providers; Performance; Error Handling; Timeouts/Retries/Circuit Breakers; Automated Testing; Continuous Integration; Continuous Deployment; Feature Flagging/Progressive; Continuous Chaos. Graduate chaos experiments into different phases.
  10. What is a Game Day? Game Days are like fire drills: dedicated days for running chaos engineering experiments on our systems. How to run a Game Day: Define the timelines; Whiteboarding; Execution; Review; Define the targets. Promote Chaos Days!
  11. How Does Chaos Engineering Differ from Testing? A practice for generating new information.
     • GENERATE NEW INFORMATION: Experiments propose a hypothesis, and if the hypothesis is not disproven, confidence grows in that hypothesis. If it is disproven, then we learn something new.
     • DRAW DISTINCTION: An important distinction can be drawn between testing and experimentation. Tests make an assertion, based on existing knowledge, and then running the test collapses the valence of that assertion, usually into either true or false.
     • EXPLORATION OF UNKNOWN: When you want to explore the many ways a complex system can misbehave, injecting communication failures like latency and errors is one good approach.
     • COMPLEX ECOSYSTEM: Testing, strictly speaking, does not create new knowledge. Testing requires that the engineer writing the test knows, in advance, the specific properties of the system that they are looking for.
  12. Tools to Kickstart Your Chaos Journey: AWS Fault Injection. Which one to choose?
  13. Is It Even Worth Embracing? Opportunities & Obstacles.
     Pros:
     • Insights gained from chaos testing can lead to fewer production incidents in the future.
     • Helps improve the confidence and engagement of team members in carrying out disaster recovery methods, and makes applications highly reliable.
     • At a high level, Chaos Engineering provides an advantage by improving overall system availability.
     • Production outages can lead to huge losses; chaos engineering therefore helps prevent large losses in revenue.
     • The team can verify the system's behavior on failure to take
     Cons:
     • Implementing chaos tools for a large-scale system and experimenting with them can increase cost.
     • Carelessness or incorrect steps in formulation and implementation can impact the application and, in turn, the customer.
     • It does not support all kinds of deployments.
     • Most Chaos Engineering tools do not cover all types of environments and their components.
  14. DEMO: QUICK INSIGHT
  15. ANY QUESTIONS? For any queries, please write to: anupamaggarwal.0611@gmail.com / girdhar.peeyush@gmail.com
  16. THANK YOU
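
The experiment loop from the "Getting Started with Chaos Engineering" slide (define a steady state, form a hypothesis, inject a fault, compare the control and experimental groups) can be sketched in Python. This is a minimal simulation under stated assumptions, not the presenters' demo: the request handler, the fault rates, the retry count, and the 2% tolerance are all hypothetical and chosen only for illustration.

```python
import random
from dataclasses import dataclass


@dataclass
class ExperimentResult:
    control_rate: float       # steady-state metric without injected faults
    experiment_rate: float    # same metric while the fault is injected
    hypothesis_holds: bool    # True if steady state survived the fault


def handle_request(rng, fault_rate, retries):
    """Simulated request handler (hypothetical stand-in for a real service).

    Each attempt fails with probability fault_rate; the handler retries
    up to `retries` times before giving up.
    """
    for _ in range(retries + 1):
        if rng.random() >= fault_rate:
            return True
    return False


def success_rate(rng, fault_rate, retries, requests=2000):
    """Steady-state metric: fraction of requests that succeed."""
    ok = sum(handle_request(rng, fault_rate, retries) for _ in range(requests))
    return ok / requests


def run_experiment(injected_fault_rate, retries, tolerance=0.02, seed=42):
    """Compare a control group (no fault) with an experimental group (fault).

    Hypothesis: the success rate of the experimental group stays within
    `tolerance` of the control group. A larger drop disproves it.
    """
    rng = random.Random(seed)
    control = success_rate(rng, fault_rate=0.0, retries=retries)
    experiment = success_rate(rng, fault_rate=injected_fault_rate, retries=retries)
    holds = (control - experiment) <= tolerance
    return ExperimentResult(control, experiment, holds)


if __name__ == "__main__":
    # Same injected fault, with and without a retry mechanism in place.
    print(run_experiment(injected_fault_rate=0.2, retries=3))
    print(run_experiment(injected_fault_rate=0.2, retries=0))
```

In this toy setup, the same injected fault leaves the steady state intact when retries are in place and disproves the hypothesis when they are not, which is the kind of weakness the "learn & improve" step is meant to surface. In a real system the metric would come from monitoring, and the fault would be injected by a chaos tool rather than a random number generator.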

Editor's Notes

  • Opportunities & Obstacles
