Why Resilience?
A primer at varying flight altitudes

Uwe Friedrichsen, codecentric AG, 2014
@ufried
Uwe Friedrichsen | uwe.friedrichsen@codecentric.de | http://slideshare.net/ufried | http://ufried.tumblr.com
Resilience? Never heard of it …
re•sil•ience (rɪˈzɪl yəns) also re•sil′ien•cy, n.

1.  the power or ability to return to the original form, position,
etc., after being bent, compressed, or stretched; elasticity.
2.  ability to recover readily from illness, depression, adversity,
or the like; buoyancy.

Random House Kernerman Webster's College Dictionary, © 2010 K Dictionaries Ltd.
Copyright 2005, 1997, 1991 by Random House, Inc. All rights reserved.


http://www.thefreedictionary.com/resilience
Resilience (IT)

The ability of an application to handle unexpected situations
-  without the user noticing it (best case)
-  with a graceful degradation of service (worst case)
Resilience is not about testing your application

(You should definitely test your application, but that's a different story)
public class MySUTTest {
    @Test
    public void shouldDoSomething() {
        MySUT sut = new MySUT();
        MyResult result = sut.doSomething();
        assertEquals(<Some expected result>, result);
    }
    …
}
It's all about production!
Why should I care?
Business
Production
Availability
Resilience
Your web server doesn't look good …
The dreaded SiteTooSuccessfulException …
Reasons to care about resilience





•  Loss of lives
•  Loss of goods (manufacturing facilities)
•  Loss of money
•  Loss of reputation
Why should I care about it today?

(The risks you mention are not new)
Resilience drivers


•  Cloud-based systems
•  Highly scalable systems
•  Zero Downtime
•  IoT & Mobile
•  Social

à Reliably running distributed systems
What’s the business case?

(I don’t see any money to be made with it)
Counter question

Can you afford to ignore it?

(It’s not about making money, it’s about not losing money)
Resilience business case

•  Identify risk scenarios

•  Calculate current occurrence probability
•  Calculate future occurrence probability

•  Calculate short-term losses
•  Calculate long-term losses

•  Assess risks and money
•  Do not forget the competitors
Let’s dive deeper into resilience
Classification attempt
Reliability: A set of attributes that bear on the capability of software to maintain its level of performance under stated conditions for a stated period of time.

ISO/IEC 9126 software quality characteristics
•  Functionality
•  Reliability
•  Usability
•  Efficiency
•  Maintainability
•  Portability

Available with acceptable latency
Resilience goes beyond that
How can I maximize availability?
Availability ≔ MTTF / (MTTF + MTTR)
MTTF: Mean Time To Failure
MTTR: Mean Time To Recovery
Traditional approach (robustness)
Availability ≔ MTTF / (MTTF + MTTR)
Maximize MTTF
A distributed system is one in which the failure
of a computer you didn't even know existed
can render your own computer unusable.

Leslie Lamport
Failures in today's complex, distributed,
interconnected systems are not the exception.

They are the normal case.
Contemporary approach (resilience)
Availability ≔ MTTF / (MTTF + MTTR)
Minimize MTTR
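A small worked example may make the formula more tangible. The sketch below computes availability for made-up figures (the class name AvailabilityExample and all numbers are illustrative assumptions, not taken from the talk) and compares doubling MTTF with halving MTTR.

```java
/** Illustrative sketch: availability = MTTF / (MTTF + MTTR). All figures are made up. */
class AvailabilityExample {

    /** Both arguments must use the same time unit (hours in this example). */
    static double availability(double mttf, double mttr) {
        if (mttf < 0 || mttr < 0 || mttf + mttr == 0) {
            throw new IllegalArgumentException("MTTF and MTTR must be non-negative and not both zero");
        }
        return mttf / (mttf + mttr);
    }

    public static void main(String[] args) {
        double mttf = 1000.0; // hypothetical: one failure every 1000 hours
        double mttr = 10.0;   // hypothetical: 10 hours to recover

        System.out.printf("baseline:    %.4f%n", availability(mttf, mttr));
        System.out.printf("double MTTF: %.4f%n", availability(2 * mttf, mttr));
        System.out.printf("halve MTTR:  %.4f%n", availability(mttf, mttr / 2));
    }
}
```

Under this formula, halving MTTR improves availability exactly as much as doubling MTTF. If failures are the normal case, shortening recovery is often the more attainable of the two levers.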
Do not try to avoid failures. Embrace them.
What kinds of failures do I need to deal with?
Failure types



•  Crash failure
•  Omission failure
•  Timing failure
•  Response failure
•  Byzantine failure
How do I implement resilience?
Bulkheads
•  Divide system in failure units
•  Isolate failure units
•  Define fallback strategy
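The three bullets above can be sketched in plain Java as follows. This is a minimal illustration under assumptions, not the speaker's implementation: the class name Bulkhead, the semaphore-based isolation, and the fallback supplier are all hypothetical choices.

```java
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

/** Illustrative bulkhead: caps concurrent calls into one failure unit and falls back when full. */
class Bulkhead<T> {
    private final Semaphore permits;
    private final Supplier<T> fallback;

    Bulkhead(int maxConcurrentCalls, Supplier<T> fallback) {
        this.permits = new Semaphore(maxConcurrentCalls);
        this.fallback = fallback;
    }

    T call(Supplier<T> action) {
        // Reject immediately instead of queueing, so a slow unit cannot tie up all caller threads.
        if (!permits.tryAcquire()) {
            return fallback.get();
        }
        try {
            return action.get();
        } catch (RuntimeException e) {
            // The failure stays inside this unit; the caller receives the fallback.
            return fallback.get();
        } finally {
            permits.release();
        }
    }

    public static void main(String[] args) {
        Bulkhead<String> recommendations = new Bulkhead<>(2, () -> "default recommendations");
        System.out.println(recommendations.call(() -> "personalized recommendations"));
        System.out.println(recommendations.call(() -> { throw new IllegalStateException("unit down"); }));
    }
}
```

One bulkhead instance per failure unit keeps a misbehaving unit from consuming resources that other units need.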
Redundancy
•  Elaborate use case

Minimize MTTR / scale transactions / handle response errors / …
•  Define routing & balancing strategy

Round robin / master-slave / fan-out & quickest one wins / …
•  Consider admin involvement

Automatic vs. manual / notification – monitoring / …
Loose Coupling
•  Isolate failure units (complements bulkheads)
•  Go asynchronous wherever possible
•  Use timeouts & circuit breakers
•  Make actions idempotent
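To illustrate the last bullet: one common way to make an action idempotent is to attach a request ID and remember the outcome, so that a retried request is not applied twice. The sketch below assumes an in-memory store; IdempotentExecutor and the request IDs are hypothetical names chosen for illustration.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Illustrative sketch: remembers results by request ID so a retried request is applied only once. */
class IdempotentExecutor<R> {
    private final Map<String, R> resultsByRequestId = new ConcurrentHashMap<>();

    /** Runs the action at most once per request ID; repeats return the stored result. */
    R execute(String requestId, Supplier<R> action) {
        // If the action throws or returns null, nothing is stored and a retry runs it again.
        return resultsByRequestId.computeIfAbsent(requestId, id -> action.get());
    }

    public static void main(String[] args) {
        AtomicInteger balance = new AtomicInteger(100);
        IdempotentExecutor<Integer> executor = new IdempotentExecutor<>();
        // The same debit arrives twice, e.g. because the client retried after a timeout.
        executor.execute("debit-42", () -> balance.addAndGet(-30));
        executor.execute("debit-42", () -> balance.addAndGet(-30));
        System.out.println(balance.get()); // prints 70
    }
}
```

A production system would typically persist the request IDs and expire them after some time; both aspects are left out here.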
Implementation Example #1

Timeouts
Timeouts (1)
// Basics
myObject.wait(); // Do not use this by default
myObject.wait(TIMEOUT); // Better use this
// Some more basics
myThread.join(); // Do not use this by default
myThread.join(TIMEOUT); // Better use this
Timeouts (2)
// Using the Java concurrent library
Callable<MyActionResult> myAction = <My Blocking Action>;
ExecutorService executor = Executors.newSingleThreadExecutor();
Future<MyActionResult> future = executor.submit(myAction);
MyActionResult result = null;
try {
    result = future.get(); // Blocks indefinitely: do not use this by default
    result = future.get(TIMEOUT, TIMEUNIT); // Better: use this instead
} catch (TimeoutException e) { // Only thrown if timeouts are used
    ...
} catch (...) {
    ...
}
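For readers who want to run the idea, here is a self-contained variant of the Future-based timeout. It is a sketch under assumptions: the helper name callWithTimeout, the fallback parameter, and the per-call executor are illustrative choices, and a real service would more likely reuse a shared thread pool.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class TimeoutExample {

    /** Runs the action on a worker thread; returns the fallback if it fails or exceeds the timeout. */
    static <T> T callWithTimeout(Callable<T> action, long timeout, TimeUnit unit, T fallback) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        Future<T> future = executor.submit(action);
        try {
            return future.get(timeout, unit);
        } catch (TimeoutException e) {
            future.cancel(true); // interrupt the worker so it does not linger
            return fallback;
        } catch (ExecutionException e) {
            return fallback; // the action itself threw an exception
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the caller's interrupt status
            future.cancel(true);
            return fallback;
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        String fast = callWithTimeout(() -> "done", 1, TimeUnit.SECONDS, "timed out");
        String slow = callWithTimeout(() -> { Thread.sleep(5_000); return "done"; },
                200, TimeUnit.MILLISECONDS, "timed out");
        System.out.println(fast + " / " + slow);
    }
}
```

Cancelling the future on timeout matters: without it the blocked action keeps occupying a thread even though nobody is waiting for its result.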
Timeouts (3)
// Using Guava SimpleTimeLimiter
// (API as of the Guava releases current in 2014; newer releases create the
// limiter via SimpleTimeLimiter.create(ExecutorService))
Callable<MyActionResult> myAction = <My Blocking Action>;
SimpleTimeLimiter limiter = new SimpleTimeLimiter();
MyActionResult result = null;
try {
    result =
        limiter.callWithTimeout(myAction, TIMEOUT, TIMEUNIT, false);
} catch (UncheckedTimeoutException e) {
    ...
} catch (...) {
    ...
}
Implementation Example #2

Circuit Breaker
Circuit Breaker – concept
[Diagram: Client → Circuit Breaker → Resource (Request). Lifecycle states: Closed, Open, Half-Open; transition labels: Resource unavailable, Resource available.]
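The Closed / Open / Half-Open lifecycle can be sketched as a small state machine. The following is a deliberately simplified, single-threaded illustration; the class name SimpleCircuitBreaker, the failure threshold, the wait period, and the injectable clock are assumptions made for this example and do not describe any particular library.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/** Illustrative, single-threaded circuit breaker; thresholds and names are assumptions. */
class SimpleCircuitBreaker {
    enum State { CLOSED, OPEN, HALF_OPEN }

    private final int failureThreshold;
    private final long openMillis;
    private final LongSupplier clock; // injectable so the wait period can be tested
    private State state = State.CLOSED;
    private int consecutiveFailures = 0;
    private long openedAt = 0;

    SimpleCircuitBreaker(int failureThreshold, long openMillis, LongSupplier clock) {
        this.failureThreshold = failureThreshold;
        this.openMillis = openMillis;
        this.clock = clock;
    }

    State state() {
        // After the wait period, an open breaker lets a trial request through.
        if (state == State.OPEN && clock.getAsLong() - openedAt >= openMillis) {
            state = State.HALF_OPEN;
        }
        return state;
    }

    <T> T call(Supplier<T> resource, Supplier<T> fallback) {
        if (state() == State.OPEN) {
            return fallback.get(); // fail fast without touching the resource
        }
        try {
            T result = resource.get();
            consecutiveFailures = 0;
            state = State.CLOSED; // a successful call closes the breaker again
            return result;
        } catch (RuntimeException e) {
            consecutiveFailures++;
            if (state == State.HALF_OPEN || consecutiveFailures >= failureThreshold) {
                state = State.OPEN;
                openedAt = clock.getAsLong();
            }
            return fallback.get();
        }
    }

    public static void main(String[] args) {
        SimpleCircuitBreaker breaker = new SimpleCircuitBreaker(2, 1_000, System::currentTimeMillis);
        for (int i = 0; i < 3; i++) {
            String r = breaker.call(() -> { throw new IllegalStateException("down"); }, () -> "fallback");
            System.out.println(r + " (" + breaker.state() + ")");
        }
    }
}
```

While the breaker is open, the resource is not called at all, which gives it time to recover and keeps callers from piling up behind it.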
Implemented patterns






•  Timeout
•  Circuit breaker
•  Load shedder
Supported patterns

•  Bulkheads (a.k.a. Failure Units)
•  Fail fast
•  Fail silently
•  Graceful degradation of service
•  Failover
•  Escalation
•  Retry
•  ...
Hello, world!
public class HelloCommand extends HystrixCommand<String> {
    private static final String COMMAND_GROUP = "default";
    private final String name;

    public HelloCommand(String name) {
        super(HystrixCommandGroupKey.Factory.asKey(COMMAND_GROUP));
        this.name = name;
    }

    @Override
    protected String run() throws Exception {
        return "Hello, " + name;
    }
}

@Test
public void shouldGreetWorld() {
    String result = new HelloCommand("World").execute();
    assertEquals("Hello, World", result);
}
Source: https://github.com/Netflix/Hystrix/wiki/How-it-Works
Fallbacks
•  What will you do if a request fails?
•  Consider failure handling from the very beginning
•  Supplement with general failure handling strategies
Scalability
•  Define scaling strategy
•  Think full stack
•  Apply D-I-D rule
•  Design for elasticity
… and many more


•  Supervision patterns
•  Recovery & mitigation patterns
•  Anti-fragility patterns
•  Supporting patterns
•  A rich pattern family


Different approach than traditional enterprise software development
How do I integrate resilience into my software development process?
Steps to adopt resilient software design






1.  Create awareness: Go DevOps
2.  Create capability: Coach your developers
3.  Create sustainability: Inject errors
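As a rough idea of what injecting errors could look like at code level, the sketch below wraps a call so that a configurable share of invocations fails on purpose. It is only an illustration under assumptions: FaultInjector, the failure rate, and the exception type are hypothetical, and the talk does not prescribe a specific mechanism.

```java
import java.util.Random;
import java.util.function.Supplier;

/** Illustrative fault injector: makes a configurable share of calls fail on purpose. */
class FaultInjector {
    private final double failureRate; // 0.0 = never fail, 1.0 = always fail
    private final Random random;

    FaultInjector(double failureRate, Random random) {
        if (failureRate < 0.0 || failureRate > 1.0) {
            throw new IllegalArgumentException("failureRate must be between 0.0 and 1.0");
        }
        this.failureRate = failureRate;
        this.random = random;
    }

    /** Returns a supplier that throws an injected failure with the configured probability. */
    <T> Supplier<T> wrap(Supplier<T> action) {
        return () -> {
            if (random.nextDouble() < failureRate) {
                throw new IllegalStateException("injected failure");
            }
            return action.get();
        };
    }

    public static void main(String[] args) {
        FaultInjector injector = new FaultInjector(0.1, new Random());
        Supplier<String> flaky = injector.wrap(() -> "ok");
        int failures = 0;
        for (int i = 0; i < 1_000; i++) {
            try {
                flaky.get();
            } catch (IllegalStateException e) {
                failures++;
            }
        }
        System.out.println("injected failures: " + failures);
    }
}
```

Running such a wrapper regularly is one way to check that timeouts, circuit breakers, and fallbacks still work as the code base evolves.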
Related topics





Reactive
Anti-fragility
Fault-tolerant software design
Recovery-oriented computing
Wrap-up



•  Resilience is about availability
•  Crucial for today's complex systems
•  Not caring is a risk
•  Go DevOps to create awareness
Do not avoid failures. Embrace them!
  • 5. Resilience (IT) The ability of an application to handle unexpected situations -  without the user noticing it (best case) -  with a graceful degradation of service (worst case)
  • 6. Resilience is not about testing your application (You should definitely test your application, but that's a different story)
        public class MySUTTest {
            @Test
            public void shouldDoSomething() {
                MySUT sut = new MySUT();
                MyResult result = sut.doSomething();
                assertEquals(<Some expected result>, result);
            }
            …
        }
  • 7. It's all about production!
  • 8. Why should I care?
  • 10. Your web server doesn't look good …
  • 12. Reasons to care about resilience •  Loss of lives •  Loss of goods (manufacturing facilities) •  Loss of money •  Loss of reputation
  • 13. Why should I care about it today? (The risks you mention are not new)
  • 14. Resilience drivers •  Cloud-based systems •  Highly scalable systems •  Zero Downtime •  IoT & Mobile •  Social → Reliably running distributed systems
  • 15. What’s the business case? (I don’t see any money to be made with it)
  • 16. Counter question Can you afford to ignore it? (It’s not about making money, it’s about not losing money)
  • 17. Resilience business case •  Identify risk scenarios •  Calculate current occurrence probability •  Calculate future occurrence probability •  Calculate short-term losses •  Calculate long-term losses •  Assess risks and money •  Do not forget the competitors
  • 18. Let’s dive deeper into resilience
  • 19. Classification attempt. ISO/IEC 9126 software quality characteristics: Functionality, Reliability, Usability, Efficiency, Maintainability, Portability. Reliability: A set of attributes that bear on the capability of software to maintain its level of performance under stated conditions for a stated period of time. Available with acceptable latency. Resilience goes beyond that
  • 20. How can I maximize availability?
  • 21. Availability ≔ MTTF / (MTTF + MTTR), where MTTF: Mean Time To Failure, MTTR: Mean Time To Recovery
  • 22. Traditional approach (robustness): Availability ≔ MTTF / (MTTF + MTTR). Maximize MTTF
  • 23. A distributed system is one in which the failure of a computer you didn't even know existed can render your own computer unusable. Leslie Lamport
  • 24. Failures in today's complex, distributed, interconnected systems are not the exception. They are the normal case.
  • 25. Contemporary approach (resilience): Availability ≔ MTTF / (MTTF + MTTR). Minimize MTTR
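As a worked example of the availability formula, the following minimal sketch evaluates it for a hypothetical system. The class name `Availability` and the figures (1000 hours between failures, recovery in 1 hour versus 6 minutes) are assumptions chosen for illustration and do not come from the talk.

```java
// Illustrative sketch: the class name and the figures in main are invented for this example.
final class Availability {

    // Availability = MTTF / (MTTF + MTTR); both arguments must use the same time unit.
    static double of(double mttf, double mttr) {
        if (mttf < 0 || mttr < 0 || mttf + mttr == 0) {
            throw new IllegalArgumentException("MTTF and MTTR must be non-negative and not both zero");
        }
        return mttf / (mttf + mttr);
    }

    public static void main(String[] args) {
        // Hypothetical system that fails every 1000 hours on average.
        System.out.println("MTTR 1 h:   " + of(1000, 1));
        System.out.println("MTTR 6 min: " + of(1000, 0.1));
    }
}
```

Algebraically, cutting MTTR by a factor of ten yields the same availability as raising MTTF by a factor of ten, since MTTF / (MTTF + MTTR/10) = 10·MTTF / (10·MTTF + MTTR). The resilience argument is that in a distributed system the recovery time is usually the lever that is still within reach.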
  • 26. Do not try to avoid failures. Embrace them.
  • 27. What kinds of failures do I need to deal with?
  • 28. Failure types •  Crash failure •  Omission failure •  Timing failure •  Response failure •  Byzantine failure
  • 29. How do I implement resilience?
  • 31. •  Divide system into failure units •  Isolate failure units •  Define fallback strategy
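The three steps on this slide can be sketched in plain Java as follows. This is a minimal illustration under assumptions: the class `Bulkhead`, its `call` method, and the use of a semaphore to cap concurrent calls are hypothetical choices, not the implementation discussed in the talk.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.Semaphore;

// Illustrative sketch; Bulkhead and its method names are hypothetical.
final class Bulkhead<T> {
    private final Semaphore permits;

    Bulkhead(int maxConcurrentCalls) {
        this.permits = new Semaphore(maxConcurrentCalls);
    }

    // Runs the action inside the failure unit. Returns the fallback if the
    // unit is saturated or if the action throws.
    T call(Callable<T> action, T fallback) {
        if (!permits.tryAcquire()) {
            return fallback; // unit is full: fail fast instead of queueing
        }
        try {
            return action.call();
        } catch (Exception e) {
            return fallback; // the failure stays inside the unit
        } finally {
            permits.release();
        }
    }
}
```

Giving each downstream dependency its own `Bulkhead` instance means a slow or failing dependency can exhaust only its own permits, while callers of other units keep working.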
  • 33. •  Elaborate use case: Minimize MTTR / scale transactions / handle response errors / … •  Define routing & balancing strategy: Round robin / master-slave / fan-out & quickest one wins / … •  Consider admin involvement: Automatic vs. manual / notification – monitoring / …
  • 35. •  Isolate failure units (complements bulkheads) •  Go asynchronous wherever possible •  Use timeouts & circuit breakers •  Make actions idempotent
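One way to make an action idempotent is to deduplicate by a caller-supplied request ID, so that a retried request does not repeat the side effect. The sketch below assumes such an ID exists; `IdempotentExecutor` and the in-memory map are hypothetical simplifications (a real system would need persistent, expiring storage).

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Illustrative sketch; IdempotentExecutor and the request-ID scheme are assumptions.
final class IdempotentExecutor<R> {
    private final Map<String, R> resultsByRequestId = new ConcurrentHashMap<>();

    // Executes the action at most once per request ID. A repeated request
    // receives the stored result instead of triggering the action again.
    // The action must return a non-null result, otherwise nothing is stored.
    R execute(String requestId, Supplier<R> action) {
        return resultsByRequestId.computeIfAbsent(requestId, id -> action.get());
    }
}
```

With this in place, a client that times out and resends the same request ID gets the original result back rather than, for example, a second charge.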
  • 37. Timeouts (1)
        // Basics
        myObject.wait();            // Do not use this by default
        myObject.wait(TIMEOUT);     // Better use this

        // Some more basics
        myThread.join();            // Do not use this by default
        myThread.join(TIMEOUT);     // Better use this
  • 38. Timeouts (2)
        // Using the Java concurrent library
        Callable<MyActionResult> myAction = <My Blocking Action>
        ExecutorService executor = Executors.newSingleThreadExecutor();
        Future<MyActionResult> future = executor.submit(myAction);
        MyActionResult result = null;
        try {
            result = future.get();                    // Do not use this by default
            result = future.get(TIMEOUT, TIMEUNIT);   // Better use this
        } catch (TimeoutException e) {                // Only thrown if timeouts are used
            ...
        } catch (...) {
            ...
        }
  • 39. Timeouts (3)
        // Using Guava SimpleTimeLimiter
        Callable<MyActionResult> myAction = <My Blocking Action>
        SimpleTimeLimiter limiter = new SimpleTimeLimiter();
        MyActionResult result = null;
        try {
            result = limiter.callWithTimeout(myAction, TIMEOUT, TIMEUNIT, false);
        } catch (UncheckedTimeoutException e) {
            ...
        } catch (...) {
            ...
        }
  • 41. Circuit Breaker – concept Client Resource Circuit Breaker Request Resource unavailable Resource available Closed Open Half-Open Lifecycle
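The Closed / Open / Half-Open lifecycle named on this slide can be illustrated with a small state machine. The sketch below is an assumption-laden simplification: it is not thread-safe, the class `CircuitBreaker`, the failure threshold, and the injected clock are all hypothetical, and production libraries handle many more cases.

```java
import java.util.concurrent.Callable;
import java.util.function.LongSupplier;

// Illustrative sketch, not thread-safe; all names and thresholds are hypothetical.
final class CircuitBreaker {
    enum State { CLOSED, OPEN, HALF_OPEN }

    private final int failureThreshold;
    private final long openMillis;
    private final LongSupplier clock; // injected so the open period can be tested
    private State state = State.CLOSED;
    private int failures = 0;
    private long openedAt = 0;

    CircuitBreaker(int failureThreshold, long openMillis, LongSupplier clock) {
        this.failureThreshold = failureThreshold;
        this.openMillis = openMillis;
        this.clock = clock;
    }

    State state() {
        return state;
    }

    <T> T call(Callable<T> request, T fallback) {
        if (state == State.OPEN) {
            if (clock.getAsLong() - openedAt < openMillis) {
                return fallback;          // resource assumed unavailable: fail fast
            }
            state = State.HALF_OPEN;      // open period elapsed: allow one trial request
        }
        try {
            T result = request.call();
            failures = 0;
            state = State.CLOSED;         // resource available again
            return result;
        } catch (Exception e) {
            failures++;
            if (state == State.HALF_OPEN || failures >= failureThreshold) {
                state = State.OPEN;       // trip (or re-trip) the breaker
                openedAt = clock.getAsLong();
            }
            return fallback;
        }
    }
}
```

A caller would typically pass `System::currentTimeMillis` as the clock. While the breaker is open, requests return the fallback immediately and the struggling resource is not called at all, which gives it time to recover.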
  • 43. Implemented patterns •  Timeout •  Circuit breaker •  Load shedder
  • 44. Supported patterns •  Bulkheads (a.k.a. Failure Units) •  Fail fast •  Fail silently •  Graceful degradation of service •  Failover •  Escalation •  Retry •  ...
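To make one of the listed patterns concrete, here is a minimal retry sketch with a bounded number of attempts and a doubling pause. It is written in plain Java and is not a Hystrix API; the class `Retry`, the method `withBackoff`, and its parameters are hypothetical.

```java
import java.util.function.Supplier;

// Illustrative sketch in plain Java; this is not a Hystrix API.
final class Retry {

    // Calls the action up to maxAttempts times, doubling the pause after each
    // failure, and rethrows the last exception if every attempt fails.
    static <T> T withBackoff(Supplier<T> action, int maxAttempts, long initialDelayMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        long delay = initialDelayMillis;
        for (int attempt = 1; ; attempt++) {
            try {
                return action.get();
            } catch (RuntimeException e) {
                if (attempt >= maxAttempts) {
                    throw e; // give up: let the caller escalate or fall back
                }
            }
            try {
                Thread.sleep(delay);
            } catch (InterruptedException ie) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting to retry", ie);
            }
            delay *= 2;
        }
    }
}
```

Retrying is only safe when the retried action is idempotent, and the attempt limit keeps a persistent failure from turning into an endless loop.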
  • 46.
        public class HelloCommand extends HystrixCommand<String> {
            private static final String COMMAND_GROUP = "default";
            private final String name;

            public HelloCommand(String name) {
                super(HystrixCommandGroupKey.Factory.asKey(COMMAND_GROUP));
                this.name = name;
            }

            @Override
            protected String run() throws Exception {
                return "Hello, " + name;
            }
        }

        @Test
        public void shouldGreetWorld() {
            String result = new HelloCommand("World").execute();
            assertEquals("Hello, World", result);
        }
  • 49. •  What will you do if a request fails? •  Consider failure handling from the very beginning •  Supplement with general failure handling strategies
  • 51. •  Define scaling strategy •  Think full stack •  Apply D-I-D rule •  Design for elasticity
  • 52. … and many more •  Supervision patterns •  Recovery & mitigation patterns •  Anti-fragility patterns •  Supporting patterns •  A rich pattern family. Different approach than traditional enterprise software development
  • 53. How do I integrate resilience into my software development process?
  • 54. Steps to adopt resilient software design 1.  Create awareness: Go DevOps 2.  Create capability: Coach your developers 3.  Create sustainability: Inject errors
  • 56. Wrap-up •  Resilience is about availability •  Crucial for today's complex systems •  Not caring is a risk •  Go DevOps to create awareness
  • 57. Do not avoid failures. Embrace them!
  • 58. @ufried Uwe Friedrichsen | uwe.friedrichsen@codecentric.de | http://slideshare.net/ufried | http://ufried.tumblr.com