Resilience Engineering
The field, the community, and some
perspective shifting.
John Allspaw
Adaptive Capacity Labs
example #1
rm -rf $PATHNAME
@@ -1,2 +1,2 @@
-<!-- Status: Ok -->
+<!-- Status: OK -->
Showing 1 changed file with 1 addition and 1 deletion.

index.html
example #2
all work is contextual
Adaptive Capacity Labs
http://bitly.com/AllspawThesis
http://stella.report
Year-long project
Researchers analyzed 3 incidents, at:
Six themes
•Postmortems as re-calibration
•Blameless v. sanctionless after action actions
•Controlling the costs of coordination
•Visualizations during anomaly management
•Strange Loops
•Dark Debt
What You Are In For
1. Resilience Engineering: a field and a community
2. Accentuating the positive
3. Avoidance of shallow data
4. Some food for thought
Resilience Engineering
• A field of study that emerged largely from Cognitive Systems Engineering,
early 2000s.
• 7 symposia over 12 years
Resilience Engineering
Community
is largely made up of practitioners and researchers from…
Human Factors & Ergonomics
Cognitive Systems Engineering
Cybernetics
Complexity Science
Engineering*
Psychology
Sociology
Ecology
Safety Science
working in these domains…
Aviation/ATM
Rail
Maritime
Space
Surgery
Power Plants
Intelligence Agencies
Law Enforcement
Mining
Construction
Explosives
Firefighting
Anesthesia
Pediatrics
Power Grid & Distribution
Military Agencies
Software Engineering
Some of the cast of characters
David Woods
CSEL/OSU
Shawna Perry
Univ of Florida
Emergency Medicine
Dr. Richard Cook
Anesthesiologist
Researcher
Ivonne Andrade Herrera
SINTEF
Erik Hollnagel
Univ of S. Denmark
Anne-Sophie Nyssen
Université de Liège
Johan Bergström
Lund University
Sidney Dekker
Griffith University
Asher Balkin
CSEL/OSU
Laura Maguire
CSEL/OSU
Sample of Research
Experiences in Fukushima Dai-ichi nuclear power plant in light of resilience engineering
Unmanned Aircraft Systems in (Inter)national Airspace: Resilience as a Lever in the Debate
Sociotechnical Networks for Power Grid Resilience: South Korean Case Study
Limits on adaptation: Modeling Resilience and Brittleness in Hospital Emergency Departments
Books
[Diagram, built up over several slides]
the using world
externally sourced code (e.g. DB); internally sourced code; delivery technology stack; results
code repositories; macro descriptions; testing/validation suites; code; code stuff; meta rules; scripts, rules, etc.; test cases
code generating tools; testing tools; deploy tools; organization/encapsulation tools; “monitoring” tools
Getting stuff ready to be part of the running system
Adding stuff to the running system
architectural & structural framing
keeping track of what “the system” is doing
The Work Is Done Here
Your Product Or Service
The Stuff You Build and Maintain With
Copyright © 2016 by R.I. Cook for ACL, all rights reserved
ack: Michael Angeles http://konigi.com/tools/
What matters. Why what matters matters.
[Diagram: the same system, divided by the line of representation]
above the line
goals, purposes, risks
cognition, actions, interactions
speech, gestures, clicks, signals, representations, artifacts
individuals have unique models of the “system”
Getting stuff ready to be part of the running system
Adding stuff to the running system
architectural & structural framing
keeping track of what “the system” is doing
Why is it doing that? What needs to change? What does it mean?
How should this work? What’s it doing? What does it mean?
What is happening? What should be happening? What does it mean?
the line of representation
below the line
code repositories; macro descriptions; testing/validation suites; code; code stuff; meta rules; scripts, rules, etc.; test cases
code generating tools; testing tools; deploy tools; organization/encapsulation tools; “monitoring” tools
externally sourced code (e.g. DB); internally sourced code; delivery technology stack; results
the using world
observing
inferring
anticipating
planning
troubleshooting
diagnosing
correcting
modifying
reacting
Time
things are changing here…
…and things are changing here
“above the line”
…is not “management”
…is not “organization design” or reporting structures
…is how people work (detect/diagnose/solve problems, both acute and
chronic) alongside and with technology and each other, under continual
trade-off scenarios, that provide the audacity to build and sustain adaptive
capacity.
Resilience is something that a system
does, not what a system has.
“Resilience is an expression of how people, alone or together, cope with everyday situations – large and small – by adjusting their performance to the conditions. An organization’s performance is resilient if it can function as required under expected and unexpected conditions alike (changes/disturbances/opportunities).”
Hollnagel, Erik. Safety-II in Practice: Developing the Resilience Potentials
“Resilience is sustained adaptive capacity.”
–David Woods (2015)
Resilience is the story of the outage
that didn’t happen.
If you haven’t found people responsible for
outcomes, you haven’t “seen” the system.
Traditional view on the role of people (“Safety-I”)
Humans are predominantly seen as a liability or hazard.
They are a problem to be fixed.
RE view on the role of people in complex systems (“Safety-II”)
Humans are seen as a resource necessary for system flexibility and resilience.
They provide flexible solutions to many potential problems.
How does our software work, really?
How does our software break, really?
What do we do to keep it all working?
explanations of accidents
Safety-I
Accidents are caused by failures and malfunctions. The purpose of an
investigation is to identify causes and contributory factors.
Safety-II
Things go well and fail in basically the same ways, regardless of outcome.
The purpose of an investigation is to understand how things usually go right
as a basis for explaining how things occasionally go wrong.
Safety - II
Why “audacity”?
incidents
(outages, degradations, breaches, accidents, near-misses, “glitches”,
untoward/unexpected events, etc.)
what makes incidents
interesting & valuable?
incidents as…
drivers of software design
- “incidents of yesterday inform the architectures of tomorrow”
- incidents “below the line” drive changes “above the line”
- staffing, budgets, planning, roadmaps, etc.
- shape the design of new components, subsystems, architectures
💥
incidents as…
motivators for policy
- “incidents of yesterday inform the rules of tomorrow”
- tend also to give birth to new forms of regulations, policies, norms, compliance requirements, explosion of documentation, auditing, constraints, etc.
- influence staffing, budgets, planning, roadmaps, etc.
“Regulation SCI”
5/6/2010 - “Flash Crash” - loss of $1 trillion in market value in <10min
3/23/2012 - BATS IPO - systems issue halted the exchange’s own IPO
5/23/2012 - Facebook IPO - systems issue delayed IPO trading
8/1/2012 - Knight Capital - $461 million in 45 minutes
PCI-DSS
1988-1998, Visa and MasterCard reported credit card losses due to fraud of $750 million
incidents tend to focus our
attention on what matters
💥
incidents help us gauge the delta between
how we think the system works
and
how the system works
Δ: almost always greater than we imagine
“…nonroutine, challenging events, because these tough cases have the
greatest potential for uncovering elements of expertise and related
cognitive phenomena.” (Klein, Crandall, Hoffman, 2006)
A family of well-worn methods, approaches, and techniques
Cognitive task/work analysis
Process tracing
Conversation analysis
Critical decision method
Critical incident technique
more…
research validates these opportunities
A digression.
[Figure, built up over several slides: incident timelines, each marked start, detect, and resolve]
incident: 12 minutes, 54 minutes
incident: 20 minutes, 73 minutes
incident: 5, 25 minutes
incident: 135 minutes, 100 minutes
[Chart: incidents and minutes, by month: jan feb mar apr may jun]
“Resilience is an expression of how people, alone or together, cope with everyday situations – large and small – by adjusting their performance to the conditions. An organization’s performance is resilient if it can function as required under expected and unexpected conditions alike (changes/disturbances/opportunities).”
“Resilience is sustained adaptive capacity.”
What is it doing?!
Why is it doing that?!
What will it do next?
How did it get into this state?
WTF is happening?
If we do Y, will it help us figure out what to do?
Is it getting worse?
It looks like it’s fixed…but is it…?
If we do X, will it prevent it from getting worse…or make it worse?
Who else should we call that can help us?
Is this OUR issue, or are we BEING ATTACKED?!
incidents provide calibration about…
how decisions are focused
how attention flows
how work is coordinated
how escalation manifests
the weight of time pressure
the effects of uncertainty
the impact of ambiguity
what consequences are consequential
What can we learn
about these…
how decisions are focused
how attention flows
how work is coordinated
how escalation manifests
the weight of time pressure
the effects of uncertainty
the impact of ambiguity
what consequences are consequential
…from these?
(M)TTR?
(M)TTD?
Frequency of incidents?
Severity of incidents?
Customer impact?
Number of deploys?
“…while there is value in the items on the right, we value the items on the left more.”
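Metrics such as (M)TTD and (M)TTR are easy to compute, which is part of why they say so little. A minimal sketch, assuming the first number in each pair from the digression is minutes from start to detection and the second is minutes from start to resolution (the slides do not label them). The 135/100 pair is left out because its order is unclear, and `other_quarter` is hypothetical data chosen only to show that very different incidents can share the same mean.

```python
from statistics import mean

# (minutes to detect, minutes to resolve), read off the digression slides.
# Assumption: first number is detection, second is resolution.
from_slides = [(12, 54), (20, 73), (5, 25)]

# Hypothetical data, not from the talk: two trivial incidents and one long one.
other_quarter = [(1, 2), (1, 2), (140, 148)]


def summarize(timelines):
    """Collapse a list of incident timelines into mean time to detect/resolve."""
    time_to_detect = [detect for detect, _ in timelines]
    time_to_resolve = [resolve for _, resolve in timelines]
    return {"mttd": mean(time_to_detect), "mttr": mean(time_to_resolve)}


summary = summarize(from_slides)
print("slides:       ", summary)
print("other quarter:", summarize(other_quarter))
```

Both data sets produce the same MTTR, yet nothing in that number says how attention flowed, how work was coordinated, or what made the long incident hard.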
Thought Food
• We cannot comprehensively understand how our systems behave - we
continually build and revise our understandings based on (relatively sparse)
signals our tech sends us.
• Continuous delivery and “Chaos”/fault injection are coping strategies (hedges)
for the above state of affairs.
• Activities “above the line” are basically unexplored or
ignored in our industry, and this needs to change.
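As an illustration of fault injection as a hedge, here is a minimal sketch rather than any particular tool's API. `with_faults`, `InjectedFault`, `fetch_status`, and `status_with_fallback` are all hypothetical names invented for this example; the idea is only that a deliberately injected failure lets you watch how the calling code copes before a real failure forces the question.

```python
import random


class InjectedFault(Exception):
    """Raised by the wrapper instead of calling the real function."""


def with_faults(func, failure_rate, rng):
    """Return a version of func that fails with probability failure_rate."""
    def wrapper(*args, **kwargs):
        if rng.random() < failure_rate:
            raise InjectedFault(f"injected failure in {func.__name__}")
        return func(*args, **kwargs)
    return wrapper


def fetch_status():
    # Hypothetical stand-in for a call to a real dependency.
    return "OK"


def status_with_fallback(fetch):
    # The coping behaviour being exercised: degrade instead of crashing.
    try:
        return fetch()
    except InjectedFault:
        return "unknown"


rng = random.Random(0)  # seeded so a run can be repeated
flaky = with_faults(fetch_status, failure_rate=0.5, rng=rng)
observed = [status_with_fallback(flaky) for _ in range(10)]
print(observed)
```

The signal is still sparse: the experiment shows how this one caller behaves under this one kind of failure, and nothing more.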
End.

More Related Content

What's hot

Using Models for Incident, Change, Problem and Request Fulfillment Management
Using Models for Incident, Change, Problem and Request Fulfillment ManagementUsing Models for Incident, Change, Problem and Request Fulfillment Management
Using Models for Incident, Change, Problem and Request Fulfillment Management
ITSM Academy, Inc.
 
The vital role of AIOps in overcoming IT operational challenges - DEM07-SR - ...
The vital role of AIOps in overcoming IT operational challenges - DEM07-SR - ...The vital role of AIOps in overcoming IT operational challenges - DEM07-SR - ...
The vital role of AIOps in overcoming IT operational challenges - DEM07-SR - ...
Amazon Web Services
 
ITSM Foundation Course Material
ITSM Foundation Course MaterialITSM Foundation Course Material
ITSM Foundation Course Material
stefanhenry
 
Introduction to automation testing
Introduction  to automation testingIntroduction  to automation testing
Introduction to automation testing
onewomanmore witl
 
[Madrid-Meetup April 22] UAPIM.pptx
[Madrid-Meetup April 22] UAPIM.pptx[Madrid-Meetup April 22] UAPIM.pptx
[Madrid-Meetup April 22] UAPIM.pptx
jorgelebrato
 
Software Engineering UPTU
Software Engineering UPTUSoftware Engineering UPTU
Software Engineering UPTU
Rishi Shukla
 
tanzu_developer_connect.pptx
tanzu_developer_connect.pptxtanzu_developer_connect.pptx
tanzu_developer_connect.pptx
VMware Tanzu
 
Engineering Software Products: 2. agile software engineering
Engineering Software Products: 2. agile software engineeringEngineering Software Products: 2. agile software engineering
Engineering Software Products: 2. agile software engineering
software-engineering-book
 
Chapter 5 Agile Software development
Chapter 5 Agile Software developmentChapter 5 Agile Software development
Chapter 5 Agile Software development
Didarul Amin
 
CapellaDays2022 | ThermoFisher - ESI TNO | A method for quantitative evaluati...
CapellaDays2022 | ThermoFisher - ESI TNO | A method for quantitative evaluati...CapellaDays2022 | ThermoFisher - ESI TNO | A method for quantitative evaluati...
CapellaDays2022 | ThermoFisher - ESI TNO | A method for quantitative evaluati...
Obeo
 
Site reliability engineering
Site reliability engineeringSite reliability engineering
Site reliability engineering
Jason Loeffler
 
Change Management ITIL
Change Management ITILChange Management ITIL
Change Management ITIL
dkmorgan51
 
Request for Proposal (RFP) For Video Conferencing Equipment
Request for Proposal (RFP) For Video Conferencing EquipmentRequest for Proposal (RFP) For Video Conferencing Equipment
Request for Proposal (RFP) For Video Conferencing EquipmentVideoguy
 
SRE in Startup
SRE in StartupSRE in Startup
SRE in Startup
Ladislav Prskavec
 
Is There a Return on Investment from Model-Based Systems Engineering?
Is There a Return on Investment from Model-Based Systems Engineering?Is There a Return on Investment from Model-Based Systems Engineering?
Is There a Return on Investment from Model-Based Systems Engineering?
Elizabeth Steiner
 
INCOSE Systems Engineering Competency Framework ( ISECF)
INCOSE Systems Engineering Competency Framework ( ISECF)INCOSE Systems Engineering Competency Framework ( ISECF)
INCOSE Systems Engineering Competency Framework ( ISECF)
Bernardo A. Delicado
 
Software Maintenance Project Proposal PowerPoint Presentation Slides
Software Maintenance Project Proposal PowerPoint Presentation SlidesSoftware Maintenance Project Proposal PowerPoint Presentation Slides
Software Maintenance Project Proposal PowerPoint Presentation Slides
SlideTeam
 
Digital Assurance: Develop a Comprehensive Testing Strategy for Digital Trans...
Digital Assurance: Develop a Comprehensive Testing Strategy for Digital Trans...Digital Assurance: Develop a Comprehensive Testing Strategy for Digital Trans...
Digital Assurance: Develop a Comprehensive Testing Strategy for Digital Trans...
CA Technologies
 
ISO 15288 Systems Engineering - Application to Air Force
ISO 15288 Systems Engineering - Application to Air ForceISO 15288 Systems Engineering - Application to Air Force
ISO 15288 Systems Engineering - Application to Air Force
Bernardo A. Delicado
 
ISO/IEC 42010 Recommended Practice for Architectural description
ISO/IEC 42010 Recommended Practice for Architectural descriptionISO/IEC 42010 Recommended Practice for Architectural description
ISO/IEC 42010 Recommended Practice for Architectural description
Hongseok Lee
 

What's hot (20)

Using Models for Incident, Change, Problem and Request Fulfillment Management
Using Models for Incident, Change, Problem and Request Fulfillment ManagementUsing Models for Incident, Change, Problem and Request Fulfillment Management
Using Models for Incident, Change, Problem and Request Fulfillment Management
 
The vital role of AIOps in overcoming IT operational challenges - DEM07-SR - ...
The vital role of AIOps in overcoming IT operational challenges - DEM07-SR - ...The vital role of AIOps in overcoming IT operational challenges - DEM07-SR - ...
The vital role of AIOps in overcoming IT operational challenges - DEM07-SR - ...
 
ITSM Foundation Course Material
ITSM Foundation Course MaterialITSM Foundation Course Material
ITSM Foundation Course Material
 
Introduction to automation testing
Introduction  to automation testingIntroduction  to automation testing
Introduction to automation testing
 
[Madrid-Meetup April 22] UAPIM.pptx
[Madrid-Meetup April 22] UAPIM.pptx[Madrid-Meetup April 22] UAPIM.pptx
[Madrid-Meetup April 22] UAPIM.pptx
 
Software Engineering UPTU
Software Engineering UPTUSoftware Engineering UPTU
Software Engineering UPTU
 
tanzu_developer_connect.pptx
tanzu_developer_connect.pptxtanzu_developer_connect.pptx
tanzu_developer_connect.pptx
 
Engineering Software Products: 2. agile software engineering
Engineering Software Products: 2. agile software engineeringEngineering Software Products: 2. agile software engineering
Engineering Software Products: 2. agile software engineering
 
Chapter 5 Agile Software development
Chapter 5 Agile Software developmentChapter 5 Agile Software development
Chapter 5 Agile Software development
 
CapellaDays2022 | ThermoFisher - ESI TNO | A method for quantitative evaluati...
CapellaDays2022 | ThermoFisher - ESI TNO | A method for quantitative evaluati...CapellaDays2022 | ThermoFisher - ESI TNO | A method for quantitative evaluati...
CapellaDays2022 | ThermoFisher - ESI TNO | A method for quantitative evaluati...
 
Site reliability engineering
Site reliability engineeringSite reliability engineering
Site reliability engineering
 
Change Management ITIL
Change Management ITILChange Management ITIL
Change Management ITIL
 
Request for Proposal (RFP) For Video Conferencing Equipment
Request for Proposal (RFP) For Video Conferencing EquipmentRequest for Proposal (RFP) For Video Conferencing Equipment
Request for Proposal (RFP) For Video Conferencing Equipment
 
SRE in Startup
SRE in StartupSRE in Startup
SRE in Startup
 
Is There a Return on Investment from Model-Based Systems Engineering?
Is There a Return on Investment from Model-Based Systems Engineering?Is There a Return on Investment from Model-Based Systems Engineering?
Is There a Return on Investment from Model-Based Systems Engineering?
 
INCOSE Systems Engineering Competency Framework ( ISECF)
INCOSE Systems Engineering Competency Framework ( ISECF)INCOSE Systems Engineering Competency Framework ( ISECF)
INCOSE Systems Engineering Competency Framework ( ISECF)
 
Software Maintenance Project Proposal PowerPoint Presentation Slides
Software Maintenance Project Proposal PowerPoint Presentation SlidesSoftware Maintenance Project Proposal PowerPoint Presentation Slides
Software Maintenance Project Proposal PowerPoint Presentation Slides
 
Digital Assurance: Develop a Comprehensive Testing Strategy for Digital Trans...
Digital Assurance: Develop a Comprehensive Testing Strategy for Digital Trans...Digital Assurance: Develop a Comprehensive Testing Strategy for Digital Trans...
Digital Assurance: Develop a Comprehensive Testing Strategy for Digital Trans...
 
ISO 15288 Systems Engineering - Application to Air Force
ISO 15288 Systems Engineering - Application to Air ForceISO 15288 Systems Engineering - Application to Air Force
ISO 15288 Systems Engineering - Application to Air Force
 
ISO/IEC 42010 Recommended Practice for Architectural description
ISO/IEC 42010 Recommended Practice for Architectural descriptionISO/IEC 42010 Recommended Practice for Architectural description
ISO/IEC 42010 Recommended Practice for Architectural description
 

Similar to Resilience Engineering: A field of study, a community, and some perspective shifting.

Design For Testability
Design For TestabilityDesign For Testability
Design For Testability
Giovanni Asproni
 
Software Analytics: Data Analytics for Software Engineering and Security
Software Analytics: Data Analytics for Software Engineering and SecuritySoftware Analytics: Data Analytics for Software Engineering and Security
Software Analytics: Data Analytics for Software Engineering and Security
Tao Xie
 
The Magic Of Application Lifecycle Management In Vs Public
The Magic Of Application Lifecycle Management In Vs PublicThe Magic Of Application Lifecycle Management In Vs Public
The Magic Of Application Lifecycle Management In Vs Public
David Solivan
 
Explainable Artificial Intelligence (XAI) 
to Predict and Explain Future Soft...
Explainable Artificial Intelligence (XAI) 
to Predict and Explain Future Soft...Explainable Artificial Intelligence (XAI) 
to Predict and Explain Future Soft...
Explainable Artificial Intelligence (XAI) 
to Predict and Explain Future Soft...
Chakkrit (Kla) Tantithamthavorn
 
DSAPA.pdf
DSAPA.pdfDSAPA.pdf
DSAPA.pdf
Luis Pena
 
From Monoliths to Microservices at Realestate.com.au
From Monoliths to Microservices at Realestate.com.auFrom Monoliths to Microservices at Realestate.com.au
From Monoliths to Microservices at Realestate.com.au
evanbottcher
 
Software Security Assurance for DevOps
Software Security Assurance for DevOpsSoftware Security Assurance for DevOps
Software Security Assurance for DevOps
Black Duck by Synopsys
 
#DOAW16 - DevOps@work Roma 2016 - Testing your databases
#DOAW16 - DevOps@work Roma 2016 - Testing your databases#DOAW16 - DevOps@work Roma 2016 - Testing your databases
#DOAW16 - DevOps@work Roma 2016 - Testing your databases
Alessandro Alpi
 
Production Readiness Strategies in an Automated World
Production Readiness Strategies in an Automated WorldProduction Readiness Strategies in an Automated World
Production Readiness Strategies in an Automated World
Sean Chittenden
 
Software Mining and Software Datasets
Software Mining and Software DatasetsSoftware Mining and Software Datasets
Software Mining and Software Datasets
Tao Xie
 
AnalyticOps: Lessons Learned Moving Machine-Learning Algorithms to Production...
AnalyticOps: Lessons Learned Moving Machine-Learning Algorithms to Production...AnalyticOps: Lessons Learned Moving Machine-Learning Algorithms to Production...
AnalyticOps: Lessons Learned Moving Machine-Learning Algorithms to Production...
Robert Grossman
 
Software testing ppt
Software testing pptSoftware testing ppt
Software testing ppt
Poonkodi Jayakumar
 
Ensuring code quality
Ensuring code qualityEnsuring code quality
Ensuring code quality
MikhailVladimirov
 
Software Analytics: Data Analytics for Software Engineering
Software Analytics: Data Analytics for Software EngineeringSoftware Analytics: Data Analytics for Software Engineering
Software Analytics: Data Analytics for Software Engineering
Tao Xie
 
Complex System Engineering
Complex System EngineeringComplex System Engineering
Complex System EngineeringEmmanuel Fuchs
 
Kelly potvin nosurprises_odtug_oow12
Kelly potvin nosurprises_odtug_oow12Kelly potvin nosurprises_odtug_oow12
Kelly potvin nosurprises_odtug_oow12Enkitec
 
Creating An Incremental Architecture For Your System
Creating An Incremental Architecture For Your SystemCreating An Incremental Architecture For Your System
Creating An Incremental Architecture For Your System
Giovanni Asproni
 
Abcd iqs ssoftware-projects-mercecrosas
Abcd iqs ssoftware-projects-mercecrosasAbcd iqs ssoftware-projects-mercecrosas
Abcd iqs ssoftware-projects-mercecrosas
Merce Crosas
 
Machine programming
Machine programmingMachine programming
Machine programming
DESMOND YUEN
 






Resilience Engineering: A field of study, a community, and some perspective shifting.

  • 1. Resilience Engineering The field, the community, and some perspective shifting. John Allspaw Adaptive Capacity Labs
  • 2. example #1 rm -rf $PATHNAME
  • 3. example #2: index.html. Showing 1 changed file with 1 addition and 1 deletion. @@ -1,2 +1,2 @@ -<!-- Status: Ok --> +<!-- Status: OK -->
  • 4. all work is contextual
  • 7. http://stella.report Year-long project Researchers analyzed 3 incidents, at: Six themes •Postmortems as re-calibration •Blameless v. sanctionless after action actions •Controlling the costs of coordination •Visualizations during anomaly management •Strange Loops •Dark Debt
  • 8. What You Are In For 1. Resilience Engineering: a field and a community 2. Accentuating the positive 3. Avoidance of shallow data 4. Some food for thought
  • 9. Resilience Engineering • A field of study that emerged largely from Cognitive Systems Engineering, early 2000s. • 7 symposia over 12 years
  • 10. Resilience Engineering Community is largely made up of practitioners and researchers from: Human Factors & Ergonomics; Cognitive Systems Engineering; Cybernetics; Complexity Science; Engineering*; Psychology; Sociology; Ecology; Safety Science. Working in these domains: Aviation/ATM; Rail; Maritime; Space; Surgery; Power Plants; Intelligence Agencies; Law Enforcement; Mining; Construction; Explosives; Firefighting; Anesthesia; Pediatrics; Power Grid & Distribution; Military Agencies; Software Engineering
  • 11. Some of the cast of characters: David Woods (CSEL/OSU); Shawna Perry (Univ of Florida, Emergency Medicine); Dr. Richard Cook (Anesthesiologist, Researcher); Ivonne Andrade Herrera (SINTEF); Erik Hollnagel (Univ of S. Denmark); Anne-Sophie Nyssen (Université de Liège); Johan Bergström (Lund University); Sidney Dekker (Griffith University); Asher Balkin (CSEL/OSU); Laura Maguire (CSEL/OSU)
  • 12. Sample of Research: “Experiences in Fukushima Dai-ichi nuclear power plant in light of resilience engineering”; “Unmanned Aircraft Systems in (Inter)national Airspace: Resilience as a Lever in the Debate”; “Sociotechnical Networks for Power Grid Resilience: South Korean Case Study”; “Limits on adaptation: Modeling Resilience and Brittleness in Hospital Emergency Departments”
  • 13. Books
  • 14.
  • 15. [Diagram] externally sourced code (e.g. DB); results; the using world; delivery technology stack; internally sourced code; results
  • 16. [Diagram] macro descriptions; externally sourced code (e.g. DB); results; the using world; delivery technology stack; internally sourced code; results
  • 17. [Diagram] code repositories; macro descriptions; testing/validation suites; code; code stuff; meta rules; scripts, rules, etc.; test cases; code generating tools; testing tools; deploy tools; organization/encapsulation tools; “monitoring” tools; externally sourced code (e.g. DB); results; the using world; delivery technology stack; internally sourced code; results
  • 18. [Diagram] code repositories; macro descriptions; testing/validation suites; code; code stuff; meta rules; scripts, rules, etc.; test cases; code generating tools; testing tools; deploy tools; organization/encapsulation tools; “monitoring” tools; externally sourced code (e.g. DB); results; the using world; delivery technology stack; internally sourced code; results
  • 19. [Diagram, tool labels partly clipped: code; deploy; organization/; “monitoring”] Adding stuff to the running system; Getting stuff ready to be part of the running system; architectural & structural framing; keeping track of what “the system” is doing
  • 20. [Diagram] The Work Is Done Here; The Stuff You Build and Maintain With; Your Product Or Service. Adding stuff to the running system; Getting stuff ready to be part of the running system; architectural & structural framing; keeping track of what “the system” is doing. code repositories; macro descriptions; testing/validation suites; code; code stuff; meta rules; scripts, rules, etc.; test cases; code generating tools; testing tools; deploy tools; organization/encapsulation tools; “monitoring” tools; externally sourced code (e.g. DB); results; the using world; delivery technology stack; internally sourced code; results
  • 21. [Diagram] Adding stuff to the running system; Getting stuff ready to be part of the running system; architectural & structural framing; keeping track of what “the system” is doing. code repositories; macro descriptions; testing/validation suites; code; code stuff; meta rules; scripts, rules, etc.; test cases; code generating tools; testing tools; deploy tools; organization/encapsulation tools; “monitoring” tools; externally sourced code (e.g. DB); results; the using world; delivery technology stack; internally sourced code; results
  • 22. [Diagram] What matters. Why what matters matters. above the line; below the line; the line of representation. individuals have unique models of the “system”. goals; purposes; risks; cognition; actions; interactions; speech; gestures; clicks; signals; representations; artifacts. observing; inferring; anticipating; planning; troubleshooting; diagnosing; correcting; modifying; reacting. Why is it doing that? What needs to change? What does it mean? How should this work? What’s it doing? What does it mean? What is happening? What should be happening? What does it mean? Adding stuff to the running system; Getting stuff ready to be part of the running system; architectural & structural framing; keeping track of what “the system” is doing. code repositories; macro descriptions; testing/validation suites; code; code stuff; meta rules; scripts, rules, etc.; test cases; code generating tools; testing tools; deploy tools; organization/encapsulation tools; “monitoring” tools; externally sourced code (e.g. DB); results; the using world; delivery technology stack; internally sourced code; results. Copyright © 2016 by R.I. Cook for ACL, all rights reserved. Ack: Michael Angeles, http://konigi.com/tools/
  • 23. [Diagram] What matters. Why what matters matters. above the line; below the line; the line of representation. individuals have unique models of the “system”. goals; purposes; risks; cognition; actions; interactions; speech; gestures; clicks; signals; representations; artifacts. observing; inferring; anticipating; planning; troubleshooting; diagnosing; correcting; modifying; reacting. Why is it doing that? What needs to change? What does it mean? How should this work? What’s it doing? What does it mean? What is happening? What should be happening? What does it mean? Adding stuff to the running system; Getting stuff ready to be part of the running system; architectural & structural framing; keeping track of what “the system” is doing. code repositories; macro descriptions; testing/validation suites; code; code stuff; meta rules; scripts, rules, etc.; test cases; code generating tools; testing tools; deploy tools; organization/encapsulation tools; “monitoring” tools; externally sourced code (e.g. DB); results; the using world; delivery technology stack; internally sourced code; results
  • 24. [Diagram, zoomed; some labels clipped] What matters. Why what matters matters. goals; purposes; risks; cognition; actions; interactions; speech; gestures; clicks; signals; representations. observing; inferring; anticipating; planning; troubleshooting; diagnosing; correcting; modifying; reacting. Why is it doing that? What needs to change? What does it mean? How should this work? What’s it doing? What does it mean? What is happening? What should be happening? What does it mean? Adding stuff to the running system; Getting stuff ready to be part of the running system; architectural & structural framing; keeping track of what “the system” is doing. code; deploy; organization/encapsulation; “monitoring”
  • 25. [Diagram] Time: things are changing here… …and things are changing here. Adding stuff to the running system; Getting stuff ready to be part of the running system; architectural & structural framing; keeping track of what “the system” is doing. code repositories; macro descriptions; testing/validation suites; code; code stuff; meta rules; scripts, rules, etc.; test cases; code generating tools; testing tools; deploy tools; organization/encapsulation tools; “monitoring” tools; externally sourced code (e.g. DB); results; the using world; delivery technology stack; internally sourced code; results
  • 26. “above the line” …is not “management”; …is not “organization design” or reporting structures; …is how people work (detect/diagnose/solve problems, both acute and chronic) alongside and with technology and each other, under continual trade-off scenarios, that provide the audacity to build and sustain adaptive capacity.
  • 27. Resilience is something that a system does, not what a system has.
  • 28. “Resilience is an expression of how people, alone or together, cope with everyday situations – large and small – by adjusting their performance to the conditions. An organization’s performance is resilient if it can function as required under expected and unexpected conditions alike (changes/disturbances/opportunities).” Hollnagel, Erik. Safety-II in Practice: Developing the Resilience Potentials
  • 29. “Resilience is sustained adaptive capacity.” –David Woods (2015)
  • 30. Resilience is the story of the outage that didn’t happen.
  • 31. If you haven’t found people responsible for outcomes, you haven’t “seen” the system.
  • 32. Traditional view on the role of people (“Safety-I”): Humans are predominantly seen as a liability or hazard. They are a problem to be fixed. RE view on the role of people in complex systems (“Safety-II”): Humans are seen as a resource necessary for system flexibility and resilience. They provide flexible solutions to many potential problems.
  • 33. How does our software work, really? How does our software break, really? What do we do to keep it all working?
  • 34. explanations of accidents. Safety-I: Accidents are caused by failures and malfunctions. The purpose of an investigation is to identify causes and contributory factors. Safety-II: Things go well and fail in basically the same ways, regardless of outcome. The purpose of an investigation is to understand how things usually go right as a basis for explaining how things occasionally go wrong.
  • 37. incidents (outages, degradations, breaches, accidents, near-misses, “glitches”, untoward/unexpected events, etc.)
  • 39. incidents as… drivers of software design - “incidents of yesterday inform the architectures of tomorrow” - incidents “below the line” drive changes “above the line” - staffing, budgets, planning, roadmaps, etc. - shape the design of new components, subsystems, architectures 💥 (diagram labels: code repositories macro descriptions testing/validation suites code code stuff meta rules scripts, rules, etc. test cases code generating tools testing tools deploy tools organization/ encapsulation tools “monitoring” tools above the line below the line externally sourced code (e.g. DB) results the using world delivery technology stack internally sourced code results)
  • 40. incidents as… motivators for policy - tend also to give birth to new forms of regulations, policies, norms, compliance requirements, explosion of documentation, auditing, constraints, etc. - “incidents of yesterday inform the rules of tomorrow” - influence staffing, budgets, planning, roadmaps, etc. “Regulation SCI”: 5/6/2010 - “Flash Crash” - loss of $1 trillion in market value in <10min; 3/23/2012 - BATS IPO - systems issue halted the exchange’s own IPO; 5/23/2012 - Facebook IPO - systems issue delayed IPO trading; 8/1/2012 - Knight Capital - $461 million in 45 minutes. PCI-DSS: 1988-1998, Visa and MasterCard reported credit card losses due to fraud of $750 million.
  • 41. incidents tend to focus our attention on what matters 💥
  • 42. incidents help us gauge the delta (Δ) between how we think the system works and how the system works. Δ is almost always greater than we imagine.
  • 43. research validates these opportunities: “…nonroutine, challenging events, because these tough cases have the greatest potential for uncovering elements of expertise and related cognitive phenomena.” (Klein, Crandall, Hoffman, 2006) A family of well-worn methods, approaches, and techniques: Cognitive task/work analysis; Process tracing; Conversation analysis; Critical decision method; Critical incident technique; more…
  • 46-50. incidents (timeline figure, built up one incident at a time across these slides): four incident timelines, each marked start, detect, and resolve, labeled 12 minutes and 54 minutes; 20 minutes and 73 minutes; 5 and 25 minutes; 135 minutes and 100 minutes.
  • 56. “Resilience is an expression of how people, alone or together, cope with everyday situations – large and small – by adjusting their performance to the conditions. An organization’s performance is resilient if it can function as required under expected and unexpected conditions alike (changes/disturbances/opportunities).”
  • 59. What is it doing?! Why is it doing that?! What will it do next? How did it get into this state? WTF is happening? If we do Y, will it help us figure out what to do? Is it getting worse? It looks like it’s fixed…but is it…? If we do X, will it prevent it from getting worse…or make it worse? Who else should we call that can help us? Is this OUR issue, or are we BEING ATTACKED?!
  • 60. incidents provide calibration about… how decisions are focused; how attention flows; how work is coordinated; how escalation manifests; the weight of time pressure; the effects of uncertainty; the impact of ambiguity; what consequences are consequential
  • 61. What can we learn about these… how decisions are focused; how attention flows; how work is coordinated; how escalation manifests; the weight of time pressure; the effects of uncertainty; the impact of ambiguity; what consequences are consequential …from these? (M)TTR? (M)TTD? Frequency of incidents? Severity of incidents? Customer impact? Number of deploys? “…while there is value in the items on the right, we value the items on the left more.”
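To make the question on this slide concrete, here is a minimal Python sketch that computes (M)TTD and (M)TTR from the four incident timelines shown on slides 46 to 50. It rests on an assumption the transcript does not confirm: that the first duration on each timeline is start-to-detect and the second is detect-to-resolve. The names `incidents` and `summarize` are hypothetical and exist only for illustration.

```python
from statistics import mean

# Durations in minutes, taken from the timelines on slides 46-50.
# ASSUMPTION: each pair is (start-to-detect, detect-to-resolve); the
# transcript does not say which label belongs to which interval.
incidents = [(12, 54), (20, 73), (5, 25), (135, 100)]


def summarize(timelines):
    """Collapse a list of incident timelines into two averages."""
    time_to_detect = [detect for detect, _ in timelines]
    time_to_resolve = [resolve for _, resolve in timelines]
    return {"mttd": mean(time_to_detect), "mttr": mean(time_to_resolve)}


summary = summarize(incidents)
print(summary)
```

Four quite different incidents collapse into two numbers. Nothing in that summary says how decisions were focused, how attention flowed, or how the work was coordinated, which is the contrast the slide draws.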
  • 62. Thought Food • We cannot comprehensively understand how our systems behave - we continually build and revise our understandings based on (relatively sparse) signals our tech sends us. • Continuous delivery and “Chaos”/fault injection are coping strategies (hedges) for the above state of affairs. • Activities “above the line” are basically unexplored or ignored in our industry, and this needs to change.
  • 63. End.