SlideShare a Scribd company logo
Simplify Pharmacy. Save Money.
Shahzad Zafar
Vice President of Engineering
Creating
Chaos…
Engineering for the
Unexpected!
@RxSavings @m_shahzad_z
Simplify Pharmacy. Save Money.
Creating Chaos … Engineering!
@RxSavings @m_shahzad_z
Simplify Pharmacy. Save Money.
Shahzad Zafar
Vice President of Engineering
Creating Chaos
… Engineering!
@RxSavings @m_shahzad_z
Simplify Pharmacy. Save Money.
Why This Topic?
@RxSavings @m_shahzad_z
Simplify Pharmacy. Save Money.
Why This Topic?
"A distributed system is one in which the failure of a computer you didn't even know existed can
render your own computer unusable" - Leslie Lamport
@RxSavings @m_shahzad_z
Simplify Pharmacy. Save Money.
What is Chaos Engineering?
@RxSavings @m_shahzad_z
Simplify Pharmacy. Save Money.
What is Chaos Engineering?
@RxSavings @m_shahzad_z
Simplify Pharmacy. Save Money.
What is Chaos Engineering?
@RxSavings @m_shahzad_z
 Requires
► Having a hypothesis
► Identifying control conditions
► Uses real-world events
► Limiting the scope or blast radius
► Make it as real as possible
- Ideally running it in Prod
Simplify Pharmacy. Save Money.
Chaos Monkey vs. Chaos Engineering
Chaos Monkey
Chaos Gorilla
Chaos Kong
Janitor Monkey
Doctor Monkey
Compliance Monkey
Latency Monkey
Security Monkey
@RxSavings @m_shahzad_z
Chaos
Engineering
Simplify Pharmacy. Save Money.
Principles of Chaos Engineering (aka running the experiments)
 #1 Have a Good Hypothesis
► Start with the Why?
► Like any experiment, know what is the expected behavior
 #2 Use Real-World Events
► Use frequent and/or high impact scenarios
► Review incidents and use them refine scenarios
 #3 Continuous Experimentation
► Automate the process of running experiments
► Tools to both orchestrate and analyze experiments
@RxSavings @m_shahzad_z
Simplify Pharmacy. Save Money.
Principles of Chaos Engineering (aka running the experiments)
 #4 Use Business Metrics
► Start with steady state system metrics such as throughput, error rates etc. (outputs)
► Move quickly to using business metrics such as value added, functionality usage (outcomes)
 #5 Limiting Blast Radius
► Goal is not to experiment against the whole system
► Scale the experiment up and stop when it starts impacting
business metrics
 #6 Run Experiments in Production
► Most realistic setup is in Production
► Use principles #4 and #5 to avoid impacting users
@RxSavings @m_shahzad_z
Simplify Pharmacy. Save Money.
Where to Start?
 Start with Known Weakest Link
► Helps in building practice and muscle memory
► Work your way backwards to find the unknowns
 Monitoring
► First few times could be manual monitoring
- As long as monitoring steps are accounted for in the hypothesis
► Quickly automate, so you can focus on anomalies during an experiment
 Being Inclusive
► Humans are part of the system … test them
► Find your Brent (from the Phoenix Project)
@RxSavings @m_shahzad_z
Simplify Pharmacy. Save Money.
Risk Tolerance
@RxSavings @m_shahzad_z
Simplify Pharmacy. Save Money.
Where to Start?
 Organizational Risk Tolerance
► Starting with planned, announced events
► Run enough experiments to improve tolerance
► High risk times is when to run the experiments
- Work to be done in “off” hours should not be acceptable
- Build our system to be resilient to any change at any time
► Goal: build resilient products
- By running unannounced experiments, all the time
 Understanding the process of creating hypothesis
@RxSavings @m_shahzad_z
Simplify Pharmacy. Save Money.
DevOps & Chaos Engineering
@RxSavings @m_shahzad_z
Simplify Pharmacy. Save Money.
DevOps & Chaos Engineering
 Given the ever increasing toolset
► Need vertical alignment from inception to
delivery
► DevOps mindset and behaviors are needed
truly chaos test your system
► System monitoring and operations need to be
built-in as features from the beginning
► 1 in 2n chance of success
- Where n is the number of dependencies
- Troy Magennis – Agile2018 Keynote
@RxSavings @m_shahzad_z
Simplify Pharmacy. Save Money.
DevOps & Chaos Engineering
 Value Stream Mapping
► Map out the entire system to find bottlenecks and weak spots
@RxSavings @m_shahzad_z
Simplify Pharmacy. Save Money.
DevOps & Chaos Engineering
 Value Stream Mapping
► Map out the entire system to find bottlenecks and weak spots
@RxSavings @m_shahzad_z
Simplify Pharmacy. Save Money.
Real Experiments
 Test failure of a load balancer or service
► Identify resiliency at an individual component level
 Fault testing for an Availability Zone or Region
► Identify failover resiliency
 Test failure of an entire rack
► Identify resiliency when several components fails
@RxSavings @m_shahzad_z
Simplify Pharmacy. Save Money.
Real Experiments
 Power Loss vs. Server Shutdown
► In our first experiment, hypothesis was it would have the same result
► Pulling the power out revealed some other dependencies that did not show up when just
shutting down a server
@RxSavings @m_shahzad_z
Simplify Pharmacy. Save Money.
Scaling Beyond a Team
 Moving from ”The Shadows” to Invested
► Pilot is small and might not need approvals, beyond team buy-in
► Getting investment helps in broader buy-in and support to build tooling around it
 Creating an Automation Tool, which can
► Do canary analysis
► Have default monitoring and controls
 Get to a point where running an experiment needs to be
► Routine
► Not time consuming
@RxSavings @m_shahzad_z
Simplify Pharmacy. Save Money.
Conclusions
 Start small, grow from there
 Spend time writing your hypothesis
 Automate and build-in needed capabilities
 Recognize risk tolerance
► And get comfortable running experiments during ‘high risk’ times
 Run experiments all the time
And to ensure system resiliency…
Create Chaos!
@RxSavings @m_shahzad_z
Simplify Pharmacy. Save Money.
References
 Chaos Engineering
► Building Confidence in System Behavior through
Experiments
► https://www.oreilly.com/webops-perf/free/chaos-
engineering.csp
 Canary Analyze All The Things
► https://www.infoq.com/presentations/canary-analysis-
deployment-pattern
 The Phoenix Project
► https://www.amazon.com/Phoenix-Project-DevOps-
Helping-Business/dp/0988262592
 A comprehensive guide by Gremlin
► https://www.gremlin.com/chaos-monkey/
 Performing Chaos at Netflix Scale
► https://www.youtube.com/watch?v=LaKGx0dAUlo
@RxSavings
Shahzad Zafar
@m_shahzad_z
Thank You!

More Related Content

What's hot

Hazop_study_introduction
Hazop_study_introductionHazop_study_introduction
Hazop_study_introduction
Mehdi Shariatmadar
 
06 overview of_ra1
06 overview of_ra106 overview of_ra1
06 overview of_ra1
Anil Raina
 
Hazop analysis complete report
Hazop analysis complete reportHazop analysis complete report
Hazop analysis complete report
Ravi chandra kancherla
 
Hazop Study
Hazop StudyHazop Study
Hazop Study
Martin Shija
 
Hazop gide line
Hazop gide lineHazop gide line
Balancing cost and performance with Apache Cassandra
Balancing cost and performance with Apache CassandraBalancing cost and performance with Apache Cassandra
Balancing cost and performance with Apache Cassandra
Jonathan Shook
 
Hazop & hazan
Hazop & hazanHazop & hazan
Hazop & hazan
Sahil Gohad
 
09.1 hazop studytrainingcourse
09.1 hazop studytrainingcourse09.1 hazop studytrainingcourse
09.1 hazop studytrainingcourse
joepaulnelson
 
Hazard and Operability Study (HAZOP) | Gaurav Singh Rajput
Hazard and Operability Study (HAZOP) | Gaurav Singh RajputHazard and Operability Study (HAZOP) | Gaurav Singh Rajput
Hazard and Operability Study (HAZOP) | Gaurav Singh Rajput
Gaurav Singh Rajput
 
INTRODUCTION & BASIS OF HAZOP
INTRODUCTION & BASIS OF HAZOPINTRODUCTION & BASIS OF HAZOP
INTRODUCTION & BASIS OF HAZOP
Nimra Nayyar
 
Hazop method
Hazop methodHazop method
Hazop Hazard and Operability Study
Hazop Hazard and Operability StudyHazop Hazard and Operability Study
Hazop Hazard and Operability Study
Anand kumar
 
Hazop leaders manual final
Hazop leaders manual finalHazop leaders manual final
Hazop leaders manual final
umar farooq
 
Equipment management technology solutions wat
Equipment management technology solutions watEquipment management technology solutions wat
Equipment management technology solutions wat
The Business Council of Mongolia
 

What's hot (14)

Hazop_study_introduction
Hazop_study_introductionHazop_study_introduction
Hazop_study_introduction
 
06 overview of_ra1
06 overview of_ra106 overview of_ra1
06 overview of_ra1
 
Hazop analysis complete report
Hazop analysis complete reportHazop analysis complete report
Hazop analysis complete report
 
Hazop Study
Hazop StudyHazop Study
Hazop Study
 
Hazop gide line
Hazop gide lineHazop gide line
Hazop gide line
 
Balancing cost and performance with Apache Cassandra
Balancing cost and performance with Apache CassandraBalancing cost and performance with Apache Cassandra
Balancing cost and performance with Apache Cassandra
 
Hazop & hazan
Hazop & hazanHazop & hazan
Hazop & hazan
 
09.1 hazop studytrainingcourse
09.1 hazop studytrainingcourse09.1 hazop studytrainingcourse
09.1 hazop studytrainingcourse
 
Hazard and Operability Study (HAZOP) | Gaurav Singh Rajput
Hazard and Operability Study (HAZOP) | Gaurav Singh RajputHazard and Operability Study (HAZOP) | Gaurav Singh Rajput
Hazard and Operability Study (HAZOP) | Gaurav Singh Rajput
 
INTRODUCTION & BASIS OF HAZOP
INTRODUCTION & BASIS OF HAZOPINTRODUCTION & BASIS OF HAZOP
INTRODUCTION & BASIS OF HAZOP
 
Hazop method
Hazop methodHazop method
Hazop method
 
Hazop Hazard and Operability Study
Hazop Hazard and Operability StudyHazop Hazard and Operability Study
Hazop Hazard and Operability Study
 
Hazop leaders manual final
Hazop leaders manual finalHazop leaders manual final
Hazop leaders manual final
 
Equipment management technology solutions wat
Equipment management technology solutions watEquipment management technology solutions wat
Equipment management technology solutions wat
 

Similar to Creating Chaos - Engineering for the Unexpected

Creating Chaos … Engineering
Creating Chaos … EngineeringCreating Chaos … Engineering
Creating Chaos … Engineering
Synerzip
 
5 Essential Tips for Load Testing Beginners
5 Essential Tips for Load Testing Beginners5 Essential Tips for Load Testing Beginners
5 Essential Tips for Load Testing Beginners
Neotys
 
Normal accidents and outpatient surgeries
Normal accidents and outpatient surgeriesNormal accidents and outpatient surgeries
Normal accidents and outpatient surgeries
Jonathan Creasy
 
Develop your inner tester
Develop your inner tester Develop your inner tester
Develop your inner tester
Anne-Marie Charrett
 
Protect-Biz for non-profits
Protect-Biz for non-profitsProtect-Biz for non-profits
Protect-Biz for non-profits
Salt Marsh Solutions, Inc.
 
Webinar - 'Test Case Immunity’- Optimize testing
Webinar - 'Test Case Immunity’- Optimize testing Webinar - 'Test Case Immunity’- Optimize testing
Webinar - 'Test Case Immunity’- Optimize testing
STAG Software Private Limited
 
Six Sigma Quality - Overproduction
Six Sigma Quality - OverproductionSix Sigma Quality - Overproduction
Six Sigma Quality - Overproduction
David Nichols
 
Applying SRE techniques to micro service design
Applying SRE techniques to micro service designApplying SRE techniques to micro service design
Applying SRE techniques to micro service design
Theo Schlossnagle
 
What does it take to be a performance tester?
What does it take to be a performance tester?What does it take to be a performance tester?
What does it take to be a performance tester?
SQALab
 
Applying principles of chaos engineering to serverless (O'Reilly Software Arc...
Applying principles of chaos engineering to serverless (O'Reilly Software Arc...Applying principles of chaos engineering to serverless (O'Reilly Software Arc...
Applying principles of chaos engineering to serverless (O'Reilly Software Arc...
Yan Cui
 
Prometheus - Open Source Forum Japan
Prometheus  - Open Source Forum JapanPrometheus  - Open Source Forum Japan
Prometheus - Open Source Forum Japan
Brian Brazil
 
The limits of unit testing by Craig Stuntz
The limits of unit testing by Craig StuntzThe limits of unit testing by Craig Stuntz
The limits of unit testing by Craig Stuntz
QA or the Highway
 
The Limits of Unit Testing by Craig Stuntz
The Limits of Unit Testing by Craig StuntzThe Limits of Unit Testing by Craig Stuntz
The Limits of Unit Testing by Craig Stuntz
QA or the Highway
 
Continues Deployment - Tech Talk week
Continues Deployment - Tech Talk weekContinues Deployment - Tech Talk week
Continues Deployment - Tech Talk week
rantav
 
Applying principles of chaos engineering to serverless (CodeMesh)
Applying principles of chaos engineering to serverless (CodeMesh)Applying principles of chaos engineering to serverless (CodeMesh)
Applying principles of chaos engineering to serverless (CodeMesh)
Yan Cui
 
Yan Cui - Applying principles of chaos engineering to Serverless - Codemotion...
Yan Cui - Applying principles of chaos engineering to Serverless - Codemotion...Yan Cui - Applying principles of chaos engineering to Serverless - Codemotion...
Yan Cui - Applying principles of chaos engineering to Serverless - Codemotion...
Codemotion
 
Applying principles of chaos engineering to Serverless (CodeMotion Berlin)
Applying principles of chaos engineering to Serverless (CodeMotion Berlin)Applying principles of chaos engineering to Serverless (CodeMotion Berlin)
Applying principles of chaos engineering to Serverless (CodeMotion Berlin)
Yan Cui
 
Yan Cui - Applying principles of chaos engineering to Serverless - Codemotion...
Yan Cui - Applying principles of chaos engineering to Serverless - Codemotion...Yan Cui - Applying principles of chaos engineering to Serverless - Codemotion...
Yan Cui - Applying principles of chaos engineering to Serverless - Codemotion...
Codemotion
 
Creating and Implementing Your Analytics Strategy
Creating and Implementing Your Analytics StrategyCreating and Implementing Your Analytics Strategy
Creating and Implementing Your Analytics Strategy
T. Scott Clendaniel
 
Puppeting in a Highly Regulated Industry
Puppeting in a Highly Regulated IndustryPuppeting in a Highly Regulated Industry
Puppeting in a Highly Regulated Industry
Puppet
 

Similar to Creating Chaos - Engineering for the Unexpected (20)

Creating Chaos … Engineering
Creating Chaos … EngineeringCreating Chaos … Engineering
Creating Chaos … Engineering
 
5 Essential Tips for Load Testing Beginners
5 Essential Tips for Load Testing Beginners5 Essential Tips for Load Testing Beginners
5 Essential Tips for Load Testing Beginners
 
Normal accidents and outpatient surgeries
Normal accidents and outpatient surgeriesNormal accidents and outpatient surgeries
Normal accidents and outpatient surgeries
 
Develop your inner tester
Develop your inner tester Develop your inner tester
Develop your inner tester
 
Protect-Biz for non-profits
Protect-Biz for non-profitsProtect-Biz for non-profits
Protect-Biz for non-profits
 
Webinar - 'Test Case Immunity’- Optimize testing
Webinar - 'Test Case Immunity’- Optimize testing Webinar - 'Test Case Immunity’- Optimize testing
Webinar - 'Test Case Immunity’- Optimize testing
 
Six Sigma Quality - Overproduction
Six Sigma Quality - OverproductionSix Sigma Quality - Overproduction
Six Sigma Quality - Overproduction
 
Applying SRE techniques to micro service design
Applying SRE techniques to micro service designApplying SRE techniques to micro service design
Applying SRE techniques to micro service design
 
What does it take to be a performance tester?
What does it take to be a performance tester?What does it take to be a performance tester?
What does it take to be a performance tester?
 
Applying principles of chaos engineering to serverless (O'Reilly Software Arc...
Applying principles of chaos engineering to serverless (O'Reilly Software Arc...Applying principles of chaos engineering to serverless (O'Reilly Software Arc...
Applying principles of chaos engineering to serverless (O'Reilly Software Arc...
 
Prometheus - Open Source Forum Japan
Prometheus  - Open Source Forum JapanPrometheus  - Open Source Forum Japan
Prometheus - Open Source Forum Japan
 
The limits of unit testing by Craig Stuntz
The limits of unit testing by Craig StuntzThe limits of unit testing by Craig Stuntz
The limits of unit testing by Craig Stuntz
 
The Limits of Unit Testing by Craig Stuntz
The Limits of Unit Testing by Craig StuntzThe Limits of Unit Testing by Craig Stuntz
The Limits of Unit Testing by Craig Stuntz
 
Continues Deployment - Tech Talk week
Continues Deployment - Tech Talk weekContinues Deployment - Tech Talk week
Continues Deployment - Tech Talk week
 
Applying principles of chaos engineering to serverless (CodeMesh)
Applying principles of chaos engineering to serverless (CodeMesh)Applying principles of chaos engineering to serverless (CodeMesh)
Applying principles of chaos engineering to serverless (CodeMesh)
 
Yan Cui - Applying principles of chaos engineering to Serverless - Codemotion...
Yan Cui - Applying principles of chaos engineering to Serverless - Codemotion...Yan Cui - Applying principles of chaos engineering to Serverless - Codemotion...
Yan Cui - Applying principles of chaos engineering to Serverless - Codemotion...
 
Applying principles of chaos engineering to Serverless (CodeMotion Berlin)
Applying principles of chaos engineering to Serverless (CodeMotion Berlin)Applying principles of chaos engineering to Serverless (CodeMotion Berlin)
Applying principles of chaos engineering to Serverless (CodeMotion Berlin)
 
Yan Cui - Applying principles of chaos engineering to Serverless - Codemotion...
Yan Cui - Applying principles of chaos engineering to Serverless - Codemotion...Yan Cui - Applying principles of chaos engineering to Serverless - Codemotion...
Yan Cui - Applying principles of chaos engineering to Serverless - Codemotion...
 
Creating and Implementing Your Analytics Strategy
Creating and Implementing Your Analytics StrategyCreating and Implementing Your Analytics Strategy
Creating and Implementing Your Analytics Strategy
 
Puppeting in a Highly Regulated Industry
Puppeting in a Highly Regulated IndustryPuppeting in a Highly Regulated Industry
Puppeting in a Highly Regulated Industry
 

More from Shahzad Zafar

Creating a Culture of Agility
Creating a Culture of AgilityCreating a Culture of Agility
Creating a Culture of Agility
Shahzad Zafar
 
The Team, The Team, The Team
The Team, The Team, The TeamThe Team, The Team, The Team
The Team, The Team, The Team
Shahzad Zafar
 
Running the DevOps Marathon
Running the DevOps MarathonRunning the DevOps Marathon
Running the DevOps Marathon
Shahzad Zafar
 
Up in the Sky! Is it a Bird? Is it a Plane? No, it's a Scrum Master!
Up in the Sky! Is it a Bird? Is it a Plane? No, it's a Scrum Master!Up in the Sky! Is it a Bird? Is it a Plane? No, it's a Scrum Master!
Up in the Sky! Is it a Bird? Is it a Plane? No, it's a Scrum Master!
Shahzad Zafar
 
AgileKC - DEVOPS - June 2014
AgileKC - DEVOPS - June 2014AgileKC - DEVOPS - June 2014
AgileKC - DEVOPS - June 2014
Shahzad Zafar
 
KCDC - Live & Breath Agile
KCDC - Live & Breath AgileKCDC - Live & Breath Agile
KCDC - Live & Breath Agile
Shahzad Zafar
 

More from Shahzad Zafar (6)

Creating a Culture of Agility
Creating a Culture of AgilityCreating a Culture of Agility
Creating a Culture of Agility
 
The Team, The Team, The Team
The Team, The Team, The TeamThe Team, The Team, The Team
The Team, The Team, The Team
 
Running the DevOps Marathon
Running the DevOps MarathonRunning the DevOps Marathon
Running the DevOps Marathon
 
Up in the Sky! Is it a Bird? Is it a Plane? No, it's a Scrum Master!
Up in the Sky! Is it a Bird? Is it a Plane? No, it's a Scrum Master!Up in the Sky! Is it a Bird? Is it a Plane? No, it's a Scrum Master!
Up in the Sky! Is it a Bird? Is it a Plane? No, it's a Scrum Master!
 
AgileKC - DEVOPS - June 2014
AgileKC - DEVOPS - June 2014AgileKC - DEVOPS - June 2014
AgileKC - DEVOPS - June 2014
 
KCDC - Live & Breath Agile
KCDC - Live & Breath AgileKCDC - Live & Breath Agile
KCDC - Live & Breath Agile
 

Recently uploaded

ISPM 15 Heat Treated Wood Stamps and why your shipping must have one
ISPM 15 Heat Treated Wood Stamps and why your shipping must have oneISPM 15 Heat Treated Wood Stamps and why your shipping must have one
ISPM 15 Heat Treated Wood Stamps and why your shipping must have one
Las Vegas Warehouse
 
spirit beverages ppt without graphics.pptx
spirit beverages ppt without graphics.pptxspirit beverages ppt without graphics.pptx
spirit beverages ppt without graphics.pptx
Madan Karki
 
Properties Railway Sleepers and Test.pptx
Properties Railway Sleepers and Test.pptxProperties Railway Sleepers and Test.pptx
Properties Railway Sleepers and Test.pptx
MDSABBIROJJAMANPAYEL
 
cnn.pptx Convolutional neural network used for image classication
cnn.pptx Convolutional neural network used for image classicationcnn.pptx Convolutional neural network used for image classication
cnn.pptx Convolutional neural network used for image classication
SakkaravarthiShanmug
 
Use PyCharm for remote debugging of WSL on a Windo cf5c162d672e4e58b4dde5d797...
Use PyCharm for remote debugging of WSL on a Windo cf5c162d672e4e58b4dde5d797...Use PyCharm for remote debugging of WSL on a Windo cf5c162d672e4e58b4dde5d797...
Use PyCharm for remote debugging of WSL on a Windo cf5c162d672e4e58b4dde5d797...
shadow0702a
 
LLM Fine Tuning with QLoRA Cassandra Lunch 4, presented by Anant
LLM Fine Tuning with QLoRA Cassandra Lunch 4, presented by AnantLLM Fine Tuning with QLoRA Cassandra Lunch 4, presented by Anant
LLM Fine Tuning with QLoRA Cassandra Lunch 4, presented by Anant
Anant Corporation
 
Redefining brain tumor segmentation: a cutting-edge convolutional neural netw...
Redefining brain tumor segmentation: a cutting-edge convolutional neural netw...Redefining brain tumor segmentation: a cutting-edge convolutional neural netw...
Redefining brain tumor segmentation: a cutting-edge convolutional neural netw...
IJECEIAES
 
Generative AI leverages algorithms to create various forms of content
Generative AI leverages algorithms to create various forms of contentGenerative AI leverages algorithms to create various forms of content
Generative AI leverages algorithms to create various forms of content
Hitesh Mohapatra
 
IEEE Aerospace and Electronic Systems Society as a Graduate Student Member
IEEE Aerospace and Electronic Systems Society as a Graduate Student MemberIEEE Aerospace and Electronic Systems Society as a Graduate Student Member
IEEE Aerospace and Electronic Systems Society as a Graduate Student Member
VICTOR MAESTRE RAMIREZ
 
CHINA’S GEO-ECONOMIC OUTREACH IN CENTRAL ASIAN COUNTRIES AND FUTURE PROSPECT
CHINA’S GEO-ECONOMIC OUTREACH IN CENTRAL ASIAN COUNTRIES AND FUTURE PROSPECTCHINA’S GEO-ECONOMIC OUTREACH IN CENTRAL ASIAN COUNTRIES AND FUTURE PROSPECT
CHINA’S GEO-ECONOMIC OUTREACH IN CENTRAL ASIAN COUNTRIES AND FUTURE PROSPECT
jpsjournal1
 
Seminar on Distillation study-mafia.pptx
Seminar on Distillation study-mafia.pptxSeminar on Distillation study-mafia.pptx
Seminar on Distillation study-mafia.pptx
Madan Karki
 
一比一原版(CalArts毕业证)加利福尼亚艺术学院毕业证如何办理
一比一原版(CalArts毕业证)加利福尼亚艺术学院毕业证如何办理一比一原版(CalArts毕业证)加利福尼亚艺术学院毕业证如何办理
一比一原版(CalArts毕业证)加利福尼亚艺术学院毕业证如何办理
ecqow
 
CompEx~Manual~1210 (2).pdf COMPEX GAS AND VAPOURS
CompEx~Manual~1210 (2).pdf COMPEX GAS AND VAPOURSCompEx~Manual~1210 (2).pdf COMPEX GAS AND VAPOURS
CompEx~Manual~1210 (2).pdf COMPEX GAS AND VAPOURS
RamonNovais6
 
AI assisted telemedicine KIOSK for Rural India.pptx
AI assisted telemedicine KIOSK for Rural India.pptxAI assisted telemedicine KIOSK for Rural India.pptx
AI assisted telemedicine KIOSK for Rural India.pptx
architagupta876
 
Software Quality Assurance-se412-v11.ppt
Software Quality Assurance-se412-v11.pptSoftware Quality Assurance-se412-v11.ppt
Software Quality Assurance-se412-v11.ppt
TaghreedAltamimi
 
International Conference on NLP, Artificial Intelligence, Machine Learning an...
International Conference on NLP, Artificial Intelligence, Machine Learning an...International Conference on NLP, Artificial Intelligence, Machine Learning an...
International Conference on NLP, Artificial Intelligence, Machine Learning an...
gerogepatton
 
Optimizing Gradle Builds - Gradle DPE Tour Berlin 2024
Optimizing Gradle Builds - Gradle DPE Tour Berlin 2024Optimizing Gradle Builds - Gradle DPE Tour Berlin 2024
Optimizing Gradle Builds - Gradle DPE Tour Berlin 2024
Sinan KOZAK
 
Curve Fitting in Numerical Methods Regression
Curve Fitting in Numerical Methods RegressionCurve Fitting in Numerical Methods Regression
Curve Fitting in Numerical Methods Regression
Nada Hikmah
 
CEC 352 - SATELLITE COMMUNICATION UNIT 1
CEC 352 - SATELLITE COMMUNICATION UNIT 1CEC 352 - SATELLITE COMMUNICATION UNIT 1
CEC 352 - SATELLITE COMMUNICATION UNIT 1
PKavitha10
 
Hematology Analyzer Machine - Complete Blood Count
Hematology Analyzer Machine - Complete Blood CountHematology Analyzer Machine - Complete Blood Count
Hematology Analyzer Machine - Complete Blood Count
shahdabdulbaset
 

Recently uploaded (20)

ISPM 15 Heat Treated Wood Stamps and why your shipping must have one
ISPM 15 Heat Treated Wood Stamps and why your shipping must have oneISPM 15 Heat Treated Wood Stamps and why your shipping must have one
ISPM 15 Heat Treated Wood Stamps and why your shipping must have one
 
spirit beverages ppt without graphics.pptx
spirit beverages ppt without graphics.pptxspirit beverages ppt without graphics.pptx
spirit beverages ppt without graphics.pptx
 
Properties Railway Sleepers and Test.pptx
Properties Railway Sleepers and Test.pptxProperties Railway Sleepers and Test.pptx
Properties Railway Sleepers and Test.pptx
 
cnn.pptx Convolutional neural network used for image classication
cnn.pptx Convolutional neural network used for image classicationcnn.pptx Convolutional neural network used for image classication
cnn.pptx Convolutional neural network used for image classication
 
Use PyCharm for remote debugging of WSL on a Windo cf5c162d672e4e58b4dde5d797...
Use PyCharm for remote debugging of WSL on a Windo cf5c162d672e4e58b4dde5d797...Use PyCharm for remote debugging of WSL on a Windo cf5c162d672e4e58b4dde5d797...
Use PyCharm for remote debugging of WSL on a Windo cf5c162d672e4e58b4dde5d797...
 
LLM Fine Tuning with QLoRA Cassandra Lunch 4, presented by Anant
LLM Fine Tuning with QLoRA Cassandra Lunch 4, presented by AnantLLM Fine Tuning with QLoRA Cassandra Lunch 4, presented by Anant
LLM Fine Tuning with QLoRA Cassandra Lunch 4, presented by Anant
 
Redefining brain tumor segmentation: a cutting-edge convolutional neural netw...
Redefining brain tumor segmentation: a cutting-edge convolutional neural netw...Redefining brain tumor segmentation: a cutting-edge convolutional neural netw...
Redefining brain tumor segmentation: a cutting-edge convolutional neural netw...
 
Generative AI leverages algorithms to create various forms of content
Generative AI leverages algorithms to create various forms of contentGenerative AI leverages algorithms to create various forms of content
Generative AI leverages algorithms to create various forms of content
 
IEEE Aerospace and Electronic Systems Society as a Graduate Student Member
IEEE Aerospace and Electronic Systems Society as a Graduate Student MemberIEEE Aerospace and Electronic Systems Society as a Graduate Student Member
IEEE Aerospace and Electronic Systems Society as a Graduate Student Member
 
CHINA’S GEO-ECONOMIC OUTREACH IN CENTRAL ASIAN COUNTRIES AND FUTURE PROSPECT
CHINA’S GEO-ECONOMIC OUTREACH IN CENTRAL ASIAN COUNTRIES AND FUTURE PROSPECTCHINA’S GEO-ECONOMIC OUTREACH IN CENTRAL ASIAN COUNTRIES AND FUTURE PROSPECT
CHINA’S GEO-ECONOMIC OUTREACH IN CENTRAL ASIAN COUNTRIES AND FUTURE PROSPECT
 
Seminar on Distillation study-mafia.pptx
Seminar on Distillation study-mafia.pptxSeminar on Distillation study-mafia.pptx
Seminar on Distillation study-mafia.pptx
 
一比一原版(CalArts毕业证)加利福尼亚艺术学院毕业证如何办理
一比一原版(CalArts毕业证)加利福尼亚艺术学院毕业证如何办理一比一原版(CalArts毕业证)加利福尼亚艺术学院毕业证如何办理
一比一原版(CalArts毕业证)加利福尼亚艺术学院毕业证如何办理
 
CompEx~Manual~1210 (2).pdf COMPEX GAS AND VAPOURS
CompEx~Manual~1210 (2).pdf COMPEX GAS AND VAPOURSCompEx~Manual~1210 (2).pdf COMPEX GAS AND VAPOURS
CompEx~Manual~1210 (2).pdf COMPEX GAS AND VAPOURS
 
AI assisted telemedicine KIOSK for Rural India.pptx
AI assisted telemedicine KIOSK for Rural India.pptxAI assisted telemedicine KIOSK for Rural India.pptx
AI assisted telemedicine KIOSK for Rural India.pptx
 
Software Quality Assurance-se412-v11.ppt
Software Quality Assurance-se412-v11.pptSoftware Quality Assurance-se412-v11.ppt
Software Quality Assurance-se412-v11.ppt
 
International Conference on NLP, Artificial Intelligence, Machine Learning an...
International Conference on NLP, Artificial Intelligence, Machine Learning an...International Conference on NLP, Artificial Intelligence, Machine Learning an...
International Conference on NLP, Artificial Intelligence, Machine Learning an...
 
Optimizing Gradle Builds - Gradle DPE Tour Berlin 2024
Optimizing Gradle Builds - Gradle DPE Tour Berlin 2024Optimizing Gradle Builds - Gradle DPE Tour Berlin 2024
Optimizing Gradle Builds - Gradle DPE Tour Berlin 2024
 
Curve Fitting in Numerical Methods Regression
Curve Fitting in Numerical Methods RegressionCurve Fitting in Numerical Methods Regression
Curve Fitting in Numerical Methods Regression
 
CEC 352 - SATELLITE COMMUNICATION UNIT 1
CEC 352 - SATELLITE COMMUNICATION UNIT 1CEC 352 - SATELLITE COMMUNICATION UNIT 1
CEC 352 - SATELLITE COMMUNICATION UNIT 1
 
Hematology Analyzer Machine - Complete Blood Count
Hematology Analyzer Machine - Complete Blood CountHematology Analyzer Machine - Complete Blood Count
Hematology Analyzer Machine - Complete Blood Count
 

Creating Chaos - Engineering for the Unexpected

  • 1. Simplify Pharmacy. Save Money. Shahzad Zafar Vice President of Engineering Creating Chaos… Engineering for the Unexpected! @RxSavings @m_shahzad_z
  • 2. Simplify Pharmacy. Save Money. Creating Chaos … Engineering! @RxSavings @m_shahzad_z
  • 3. Simplify Pharmacy. Save Money. Shahzad Zafar Vice President of Engineering Creating Chaos … Engineering! @RxSavings @m_shahzad_z
  • 4. Simplify Pharmacy. Save Money. Why This Topic? @RxSavings @m_shahzad_z
  • 5. Simplify Pharmacy. Save Money. Why This Topic? "A distributed system is one in which the failure of a computer you didn't even know existed can render your own computer unusable" - Leslie Lamport @RxSavings @m_shahzad_z
  • 6. Simplify Pharmacy. Save Money. What is Chaos Engineering? @RxSavings @m_shahzad_z
  • 7. Simplify Pharmacy. Save Money. What is Chaos Engineering? @RxSavings @m_shahzad_z
  • 8. Simplify Pharmacy. Save Money. What is Chaos Engineering? @RxSavings @m_shahzad_z  Requires ► Having a hypothesis ► Identifying control conditions ► Uses real-world events ► Limiting the scope or blast radius ► Make it as real as possible - Ideally running it in Prod
  • 9. Simplify Pharmacy. Save Money. Chaos Monkey vs. Chaos Engineering Chaos Monkey Chaos Gorilla Chaos Kong Janitor Monkey Doctor Monkey Compliance Monkey Latency Monkey Security Monkey @RxSavings @m_shahzad_z Chaos Engineering
  • 10. Simplify Pharmacy. Save Money. Principles of Chaos Engineering (aka running the experiments)  #1 Have a Good Hypothesis ► Start with the Why? ► Like any experiment, know what is the expected behavior  #2 Use Real-World Events ► Use frequent and/or high impact scenarios ► Review incidents and use them refine scenarios  #3 Continuous Experimentation ► Automate the process of running experiments ► Tools to both orchestrate and analyze experiments @RxSavings @m_shahzad_z
  • 11. Simplify Pharmacy. Save Money. Principles of Chaos Engineering (aka running the experiments)  #4 Use Business Metrics ► Start with steady state system metrics such as throughput, error rates etc. (outputs) ► Move quickly to using business metrics such as value added, functionality usage (outcomes)  #5 Limiting Blast Radius ► Goal is not to experiment against the whole system ► Scale the experiment up and stop when it starts impacting business metrics  #6 Run Experiments in Production ► Most realistic setup is in Production ► Use principles #4 and #5 to avoid impacting users @RxSavings @m_shahzad_z
  • 12. Simplify Pharmacy. Save Money. Where to Start?  Start with Known Weakest Link ► Helps in building practice and muscle memory ► Work your way backwards to find the unknowns  Monitoring ► First few times could be manual monitoring - As long as monitoring steps are accounted for in the hypothesis ► Quickly automate, so you can focus on anomalies during an experiment  Being Inclusive ► Humans are part of the system … test them ► Find your Brent (from the Phoenix Project) @RxSavings @m_shahzad_z
  • 13. Simplify Pharmacy. Save Money. Risk Tolerance @RxSavings @m_shahzad_z
  • 14. Simplify Pharmacy. Save Money. Where to Start?  Organizational Risk Tolerance ► Starting with planned, announced events ► Run enough experiments to improve tolerance ► High risk times is when to run the experiments - Work to be done in “off” hours should not be acceptable - Build our system to be resilient to any change at any time ► Goal: build resilient products - By running unannounced experiments, all the time  Understanding the process of creating hypothesis @RxSavings @m_shahzad_z
  • 15. Simplify Pharmacy. Save Money. DevOps & Chaos Engineering @RxSavings @m_shahzad_z
  • 16. Simplify Pharmacy. Save Money. DevOps & Chaos Engineering  Given the ever increasing toolset ► Need vertical alignment from inception to delivery ► DevOps mindset and behaviors are needed truly chaos test your system ► System monitoring and operations need to be built-in as features from the beginning ► 1 in 2n chance of success - Where n is the number of dependencies - Troy Magennis – Agile2018 Keynote @RxSavings @m_shahzad_z
  • 17. Simplify Pharmacy. Save Money. DevOps & Chaos Engineering  Value Stream Mapping ► Map out the entire system to find bottlenecks and weak spots @RxSavings @m_shahzad_z
  • 18. Simplify Pharmacy. Save Money. DevOps & Chaos Engineering  Value Stream Mapping ► Map out the entire system to find bottlenecks and weak spots @RxSavings @m_shahzad_z
  • 19. Simplify Pharmacy. Save Money. Real Experiments  Test failure of a load balancer or service ► Identify resiliency at an individual component level  Fault testing for an Availability Zone or Region ► Identify failover resiliency  Test failure of an entire rack ► Identify resiliency when several components fails @RxSavings @m_shahzad_z
  • 20. Simplify Pharmacy. Save Money. Real Experiments  Power Loss vs. Server Shutdown ► In our first experiment, hypothesis was it would have the same result ► Pulling the power out revealed some other dependencies that did not show up when just shutting down a server @RxSavings @m_shahzad_z
  • 21. Simplify Pharmacy. Save Money. Scaling Beyond a Team  Moving from ”The Shadows” to Invested ► Pilot is small and might not need approvals, beyond team buy-in ► Getting investment helps in broader buy-in and support to build tooling around it  Creating an Automation Tool, which can ► Do canary analysis ► Have default monitoring and controls  Get to a point where running an experiment needs to be ► Routine ► Not time consuming @RxSavings @m_shahzad_z
  • 22. Simplify Pharmacy. Save Money. Conclusions  Start small, grow from there  Spend time writing your hypothesis  Automate and build-in needed capabilities  Recognize risk tolerance ► And get comfortable running experiments during ‘high risk’ times  Run experiments all the time And to ensure system resiliency… Create Chaos! @RxSavings @m_shahzad_z
  • 23. Simplify Pharmacy. Save Money. References  Chaos Engineering ► Building Confidence in System Behavior through Experiments ► https://www.oreilly.com/webops-perf/free/chaos- engineering.csp  Canary Analyze All The Things ► https://www.infoq.com/presentations/canary-analysis- deployment-pattern  The Phoenix Project ► https://www.amazon.com/Phoenix-Project-DevOps- Helping-Business/dp/0988262592  A comprehensive guide by Gremlin ► https://www.gremlin.com/chaos-monkey/  Performing Chaos at Netflix Scale ► https://www.youtube.com/watch?v=LaKGx0dAUlo @RxSavings Shahzad Zafar @m_shahzad_z Thank You!