Incidents
The Shorter, the Better
qeunit.com
Antoine CRASKE
#digital #architecture
#transformation
#qualityengineering #qe
#testautomation #opensource
@acraske_
linkedin/acraske
La Redoute
Director of Technology Transformation
Director of Architecture & Technology
Senior Director of Engineering
Senior Engineering Manager
Previous positions: Project Director, IT Manager, Project Manager, Software Engineer
Entrepreneurship
Co-founder, atale.io
Co-founder, Cerberus Testing
Co-founder, Test Automation Camp
Communities
Speaker at Software, DevOps, Testing, Quality, Open source conferences
QE Unit, founder & organizer of the Quality Engineering community
TICE.Leiria, Meetup founder & organizer
Ministry of Testing Leiria, Meetup founder & organizer
Apache Kafka User Group Portugal, Meetup founder & organizer
Archilocus, Architecture community co-founder & co-organizer
Publications
On Defining Quality Engineering, QE Unit - with Rémi Dewitte (on Leanpub, Amazon)
Improving La Redoute's CI/CD Pipeline and DevOps Processes by Applying Machine Learning Techniques, ResearchGate.
Collecting Data from Continuous Practices: an Infrastructure to Support Team Development, ResearchGate.
Who am I
"You have failures
because you are successful"
—Dr. Richard Cook, How Complex Systems Fail
We are full of failures, and far from success
Source: The 2022 Accelerate State of DevOps Report
Incidents: if only it were so easy
1,200 incidents per month, including 5 major ones resolved in 5.81 hours
€30k of direct costs, plus 100k of indirect costs and brand impact
96% report an inability to learn from previous incidents
Source: Quocirca (2017), Damage Control – The impact of critical IT incidents
What we have all done
Incident management: methods, organization, tooling
Prioritization matrix
Surviving the latest “P1”(s)
Source: TechTarget
Source: Blameless.io
Source: iStockphoto
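The prioritization matrix mentioned above is commonly built from impact and urgency. A minimal sketch in Python, assuming a 3x3 grid; the `Level` scale, the P1 to P4 thresholds, and the `priority` function are illustrative assumptions, not taken from the slides:

```python
from enum import IntEnum


class Level(IntEnum):
    """Hypothetical three-step scale used for both impact and urgency."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3


def priority(impact: Level, urgency: Level) -> str:
    """Map an (impact, urgency) pair to a priority label from P1 to P4.

    The thresholds are an assumption for illustration; each organization
    tunes its own matrix.
    """
    score = impact + urgency  # ranges from 2 (low/low) to 6 (high/high)
    if score >= 6:
        return "P1"
    if score == 5:
        return "P2"
    if score == 4:
        return "P3"
    return "P4"


if __name__ == "__main__":
    # Print the full matrix, one row per impact level.
    for impact in Level:
        row = [priority(impact, urgency) for urgency in Level]
        print(f"{impact.name:<6} {row}")
```

A matrix like this answers "how urgent is it?" but not the questions that follow, such as which incidents to ignore or how to reverse the trend.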
Our questions
Which incidents to address or ignore?
Which people, at a minimum, need to be involved?
How do we reverse the incident trend?
“Complex systems fail in complex ways”
Complexity is not only in software
A sum of probabilities
[Diagram: Risks A, B, C, and D lead to Problem 1 and Problem 2, which combine into an Incident]
Example:
● Order application does not handle retries
● Financial application has downtime
● Entire order processing flow is impacted
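One way to read the "sum of probabilities" idea: if each risk can independently trigger a problem, the chance that at least one materializes grows quickly with the number of risks. A minimal sketch, assuming independent risks and made-up probabilities (the values and the `probability_of_any` helper are hypothetical):

```python
import math


def probability_of_any(risk_probabilities: list[float]) -> float:
    """Probability that at least one independent risk materializes.

    Uses P(any) = 1 - product(1 - p_i), which holds only under the
    independence assumption.
    """
    return 1.0 - math.prod(1.0 - p for p in risk_probabilities)


if __name__ == "__main__":
    # Hypothetical monthly probabilities for risks A to D.
    risks = {"A": 0.05, "B": 0.10, "C": 0.02, "D": 0.08}
    combined = probability_of_any(list(risks.values()))
    print(f"Largest single risk: {max(risks.values()):.2%}")
    print(f"At least one risk materializes: {combined:.2%}")
```

With these assumed values, the combined probability is higher than any individual risk, which is the point of the slide: incidents come from combinations, not from a single cause.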
With contributing factors influencing the system
[Diagram: an Incident with Problem 1 and Problem 2, surrounded by Risks, each influenced by contributing factors]
Categories of contributing factors:
● Internal/external
● Process/tools
● Human/skills
● Organization
● …
Source: Divya Vohra Behl, Susan Ferreira, Systems Thinking: An Analysis of Key Factors and Relationships, Complex Adaptive Systems.
Source: Ryan Kitchens, SREcon 2019: “the focus should be on remediating the system, not the individual.”
“Success is nothing more than a few
simple disciplines, practiced every day.”
Quality Engineering Incident Discipline
1. Anti-fragility
2. Raise incidents
3. Post-mortem, no excuses
4. Root-cause(s)
5. Blameless transparency
6. Learn
7. Step-by-step
#1 - Anti-fragility¹
Failure is inevitable
● We cannot stop the business
● More speed, more risks
● It’s about building an adaptive capacity
“the ability to continue to adapt to changing environments, stakeholders, demands, contexts”
Invest in guided continuous improvement
● Identify safety boundaries
● Reduce impacts at boundaries
● Inputs for upstream remediation
Source: Riccardo Patriarca, Dynamic Models To Enhance Space Safety. Space Safety Magazine.
¹Nassim Nicholas Taleb, Antifragile: Things That Gain From Disorder.
#2 - Raise incidents
MTTA, MTTD, and MTTR are not sufficient on their own
● Mean is an average
● But… incidents are not average
If you have to pick three indicators
● TTD (Time To Detect) in absolute value
● SLI then SLO
● Volume of people and teams involved
Source: 2021 VOID Report - the Verica Open Incident Database
Source: La Redoute internal, not authorized for disclosure.
Source: Alex Ewerlöf, How to Best Use MTT* Metrics to Optimize Your Incident Response. InfoQ article.
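To see why a mean hides what matters, compare it with the median and the worst case on a skewed sample. A minimal sketch using Python's standard library; the detection times are invented for illustration:

```python
import statistics

# Hypothetical time-to-detect values in minutes: mostly quick detections,
# plus two long-running outliers.
time_to_detect_minutes = [4, 5, 5, 6, 7, 8, 9, 10, 180, 420]

mean_ttd = statistics.mean(time_to_detect_minutes)
median_ttd = statistics.median(time_to_detect_minutes)
worst_ttd = max(time_to_detect_minutes)

# Incidents whose detection time exceeds the mean.
above_mean = [t for t in time_to_detect_minutes if t > mean_ttd]

print(f"mean:   {mean_ttd:.1f} min")
print(f"median: {median_ttd:.1f} min")
print(f"worst:  {worst_ttd} min")
print(f"incidents slower than the mean: {len(above_mean)} of {len(time_to_detect_minutes)}")
```

In this sample the mean describes none of the incidents: most are detected far faster and two far slower, so the absolute TTD of each incident is more informative than the average.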
#3 - Post-mortem, no excuses
All incidents are opportunities to learn
● Increase knowledge of the system
● Incidents have risk and luck factors
● Near-misses are equally important
Develop an organizational discipline
● 0 excuses
● 100% follow-up with executive support
● Build up operational excellence
Source: 2021 VOID Report - the Verica Open Incident Database
Source: La Redoute internal, not authorized for disclosure.
#4 - Root cause(s)
Software is a complex socio-technological system
● “Complex systems fail in complex ways”
● Contributing factors at the source of root causes
● Systemic approach instead of problem resolution
Source: Systems Thinking: Managing Chaos and Complexity. Jamshid Gharajedaghi.
Source: What is the Difference Between Root Cause and Contributing Factor, Peedia (2022)
#5 - Blameless transparency
Leverage the “Speed of Trust”¹
● Transparency builds relationships
● Transparency gives space to fix what’s broken
● The more you understand, the more you can trust
Tackling the hard parts
● “When things go wrong, we all experience fear”
● There’s no “blameless retrospective”
● Make it progressive
Source: Uber concealed huge data breach, BBC news
¹Covey, S. M. R. (2008). The speed of trust: the one thing that changes everything. Simon & Schuster.
Source: La Redoute internal, not authorized for disclosure.
Engineering transparency
Organizational transparency
Stakeholder transparency
Public transparency
Source: Transparency in incident response, Squadcast
#6 - Learn
Solving an incident is not fixing an incident
● Siloed investigations by software engineers
● Investigators are not forensic examiners
● Identify themes and narratives leading to root causes
Dedicated “Incident Analysis” organization
● Staff strong Incident Analysts
● Block continuous time for Problem Management
● Ensure ongoing executive support
“Incident analysis is not actually about the incident, it’s an opportunity we have to see the delta between how we think our organization works and how it actually works”
—Nora Jones, CEO Jeli.io & Founder, LFI
[Diagram: Support, Delivery, Incident Analyst]
#7 - Step-by-step
Act on the existing system first
● Already multiple contributing factors
● Don’t change too many system factors
● Build up the adaptive capacity
Iterate on realistic targets as maturity grows
● Evolving a system takes time
● Ensure continuity in specific periods
● Industrialize SLIs, then SLOs, and only then start SRE
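As a starting point for industrializing SLIs and SLOs, the sketch below computes an availability SLI from request counts and compares it with an SLO target. The request counts, the 99.9% target, and the function names are assumptions for illustration:

```python
def availability_sli(good_requests: int, total_requests: int) -> float:
    """Service Level Indicator: fraction of requests served successfully."""
    if total_requests == 0:
        return 1.0  # no traffic, so nothing failed
    return good_requests / total_requests


def error_budget_remaining(sli: float, slo_target: float) -> float:
    """Share of the error budget still unspent (negative when overspent).

    The error budget is the allowed failure rate, 1 - slo_target.
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    allowed_failure = 1.0 - slo_target
    actual_failure = 1.0 - sli
    return 1.0 - actual_failure / allowed_failure


if __name__ == "__main__":
    # Hypothetical 30-day window for an order service.
    sli = availability_sli(good_requests=999_400, total_requests=1_000_000)
    slo = 0.999
    print(f"SLI: {sli:.4%}, SLO: {slo:.1%}, met: {sli >= slo}")
    print(f"Error budget remaining: {error_budget_remaining(sli, slo):.0%}")
```

Measuring the indicator first and only then committing to an objective matches the step-by-step order on the slide: the SLO is chosen from observed SLI values rather than guessed upfront.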
Quality Engineering Incident Discipline
1. Anti-fragility
2. Raise incidents
3. Post-mortem, no excuses
4. Root-cause(s)
5. Blameless transparency
6. Learn
7. Step-by-step
For more Quality Engineering
#peer-review #support #content-sharing
#mentoring #content
And also
Tech.rocks
moderntesting.org & AB Testing Podcast, Slack
platformengineering.org

My Hashitalk Indonesia April 2024 Presentation
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 

Quality Engineering Incident Discipline

  • 1. Incidents The Shorter, the Better qeunit.com Antoine CRASKE #digital #architecture #transformation #qualityengineering #qe #testautomation #opensource @acraske_ linkedin/acraske qeunit.com
  • 2. Who am I Antoine CRASKE La Redoute Director of Technology Transformation Director of Architecture & Technology Senior Director of Engineering Senior Engineering Manager Previous positions of Project Director, IT Manager, Project Manager, Software Engineer Entrepreneurship Co-founder, atale.io Co-founder, Cerberus Testing Co-founder, Test Automation Camp Communities Speaker at Software, DevOps, Testing, Quality, Open source conferences QE Unit, founder & organizer of the Quality Engineering community TICE.Leiria, Meetup founder & organizer Ministry of Testing Leiria, Meetup founder & organizer Apache Kafka User Group Portugal, Meetup founder & organizer Archilocus, Architecture community co-founder & co-organizer Publications On Defining Quality Engineering, QE Unit - with Rémi Dewitte (on Leanpub, Amazon) Improving La Redoute's CI/CD Pipeline and DevOps Processes by Applying Machine Learning Techniques, ResearchGate. Collecting Data from Continuous Practices: an Infrastructure to Support Team Development, ResearchGate. #digital #architecture #transformation #qualityengineering #qe #testautomation #opensource @acraske_ linkedin/acraske qeunit.com
  • 3. "You have failures because you are successful" —Dr. Richard Cook, How Complex Systems Fail qeunit.com
  • 4. We are full of failures, and far from success Source: The 2022 Accelerate State of DevOps Report qeunit.com
  • 5. Incidents - if it were so easy 1200 incidents/month, with 5 major incidents resolved in 5.81 hours 30k€ of direct costs, plus 100k of indirect costs and brand impact 96% report an inability to learn from previous incidents Source: Quocirca (2017), Damage Control – The impact of critical IT incidents qeunit.com
  • 6. What we have all done Incident management methods, organization, tooling Prioritization matrix Surviving the latest “P1”(s) qeunit.com Source: Tech Target Source: Blameless.io Source: istockphoto
  • 7. Our questions Which incidents should we address or ignore? Who is the minimal set of people to involve? How do we reverse the incident trend? qeunit.com
  • 8. “Complex systems fail in complex ways” qeunit.com
  • 9. Complexity is not only in software qeunit.com
  • 10. A sum of probabilities [Diagram: Risks A, B, C, D lead to Problem 1 and Problem 2, which lead to an Incident] Example: the order application does not handle retries; the financial application has downtime; the entire order processing flow is impacted qeunit.com
  • 11. With contributing factors influencing the system [Diagram: contributing factors act on risks, which lead to Problem 1 and Problem 2 and then to the Incident] Contributing factors: ● Internal/external ● Process/tools ● Human/skills ● Organization ● … Source: Divya Vohra Behl, Susan Ferreira, Systems Thinking: An Analysis of Key Factors and Relationships, Complex Adaptive Systems. Source: Ryan Kitchens, SREcon 2019: “the focus should be on remediating the system, not the individual.” qeunit.com
  • 12. “Success is nothing more than a few simple disciplines, practiced every day.” qeunit.com
  • 13. Quality Engineering Incident Discipline 1. Anti-fragility 2. Raise incidents 3. Post-mortem, no excuses 4. Root-cause(s) 5. Blameless transparency 6. Learn 7. Step-by-step qeunit.com
  • 14. #1 - Anti-fragility¹ Failure is inevitable ● We cannot stop the business ● More speed, more risks ● It’s about building an adaptive capacity “the ability to continue to adapt to changing environments, stakeholders, demands, contexts” Invest for guided continuous improvements ● Identify safety boundaries ● Reduce impacts at boundaries ● Inputs for upstream remediation Source: Riccardo Patriarca, Dynamic Models To Enhance Space Safety. Space Safety Magazine. ¹Nassim Nicholas Taleb, Antifragile: Things That Gain From Disorder. qeunit.com
  • 15. #2 - Raise incidents MTTA/D/R are not sufficient alone ● Mean is an average ● But… incidents are not average If you have to pick three indicators ● TTD (Time To Detect) in absolute value ● SLI then SLO ● Volume of people and teams involved Source : 2021 VOID Report - the Verica Open Incident Database Source : La Redoute internal, not authorized for disclosure. Source: Alex Ewerlöf, How to Best Use MTT* Metrics to Optimize Your Incident Response. InfoQ article. qeunit.com
  • 16. #3 - Post-mortem, no excuses All incidents are opportunities to learn ● Increase knowledge of the system ● Incidents have risk and luck factors ● Near-misses are equally important Develop an organizational discipline ● 0 excuses ● 100% follow-up with executive support ● Build up operational excellence Source : 2021 VOID Report - the Verica Open Incident Database Source : La Redoute internal, not authorized for disclosure. qeunit.com
  • 17. #4 - Root cause(s) Software is a complex socio-technological system ● “Complex systems fail in complex ways” ● Contributing factors at the source of root causes ● Systemic approach instead of problem resolution Source: Systems Thinking: Managing Chaos and Complexity. Jamshid Gharajedaghi. Source: What is the Difference Between Root Cause and Contributing Factor, Peedia (2022) qeunit.com
  • 18. #5 - Blameless transparency Leverage the “Speed of Trust”¹ ● Transparency builds relationships ● Transparency gives space to fix what’s broken ● The more you understand, the more you can trust Tackling the hard parts ● “When things go wrong, we all experience fear” ● There’s no “blameless retrospective” ● Make it progressive Levels of transparency: engineering, organizational, stakeholders, public (Source: Transparency in incident response, Squadcast) Source: Uber concealed huge data breach, BBC news ¹Covey, S. M. R. (2008). The speed of trust: the one thing that changes everything. Simon & Schuster. Source: La Redoute internal, not authorized for disclosure. qeunit.com
  • 19. #6 - Learn Solving an incident is not fixing an incident ● Siloed investigations by software engineers ● Investigators are not forensic examiners ● Identify themes and narratives leading to root causes Dedicated “Incident Analysis” organization ● Staff strong Incident Analysts ● Block continuous time for Problem Management ● Ensure ongoing executive support “Incident analysis is not actually about the incident, it’s an opportunity we have to see the delta between how we think our organization works and how it actually works” (Nora Jones, CEO Jeli.io & Founder, LFI) [Diagram labels: Support, Delivery, Incident Analyst] qeunit.com
  • 20. #7 - Step-by-step Act on the existing system first ● There are already multiple contributing factors ● Don’t change too many system factors at once ● Build up the adaptive capacity Iterate on realistic targets as maturity grows ● Evolving a system takes time ● Ensure continuity in specific periods ● Industrialize SLIs, then SLOs, and only then start SRE qeunit.com
  • 21. Quality Engineering Incident Discipline 1. Anti-fragility 2. Raise incidents 3. Post-mortem, no excuses 4. Root-cause(s) 5. Blameless transparency 6. Learn 7. Step-by-step qeunit.com
  • 22. For more Quality Engineering #peer-review #support #content-sharing #mentoring #content And also Tech.rocks moderntesting.org & AB Testing Podcast, Slack platformengineering.org qeunit.com qeunit.com
  • 23. Incidents The Shorter, the Better qeunit.com Antoine CRASKE #digital #architecture #transformation #qualityengineering #qe #testautomation #opensource @acraske_ linkedin/acraske qeunit.com
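
Slide 15 argues that mean-based metrics (MTTA/D/R) are not sufficient alone because incidents are not average, and suggests tracking TTD in absolute value and an SLI before an SLO. The sketch below illustrates that point under stated assumptions: the `ttd_minutes` values are made up for illustration (they are not from the deck or from La Redoute), and the 15-minute threshold and 95% target are hypothetical.

```python
from statistics import mean, median, quantiles

# Hypothetical time-to-detect (TTD) values in minutes, one per incident.
# Illustrative data only: nine quick detections and one very slow one.
ttd_minutes = [4, 5, 6, 7, 8, 9, 10, 12, 15, 240]


def summarize(durations):
    """Return the mean next to absolute-value views of the same durations."""
    # quantiles(..., n=10) returns the nine decile cut points; the last is p90.
    p90 = quantiles(durations, n=10, method="inclusive")[-1]
    return {
        "mean": mean(durations),
        "median": median(durations),
        "p90": p90,
        "max": max(durations),
    }


def detection_sli(durations, threshold_minutes):
    """SLI: share of incidents detected within the threshold."""
    within = sum(d <= threshold_minutes for d in durations)
    return within / len(durations)


summary = summarize(ttd_minutes)
sli = detection_sli(ttd_minutes, threshold_minutes=15)  # assumed threshold
slo_target = 0.95                                       # assumed SLO target
slo_met = sli >= slo_target

print(summary)
print(f"SLI={sli:.2f}, SLO met: {slo_met}")
```

With this sample, a single 240-minute detection pulls the mean to 31.6 minutes while the median stays at 8.5, which is why the absolute values and the SLI say more about the incident trend than the mean does.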