SlideShare a Scribd company logo
Fault tolerance made easy
Patterns for fault tolerance implemented surprisingly easy

Uwe Friedrichsen, codecentric AG, 2013-2014
@ufried
Uwe Friedrichsen | uwe.friedrichsen@codecentric.de | http://slideshare.net/ufried | http://ufried.tumblr.com
It‘s all about production!
Production
Availability
Resilience
Fault Tolerance
Your web server doesn‘t look good …
Pattern #1

Timeouts
Timeouts (1)
// Basics
myObject.wait(); // Do not use this by default
myObject.wait(TIMEOUT); // Better use this
// Some more basics
myThread.join(); // Do not use this by default
myThread.join(TIMEOUT); // Better use this
Timeouts (2)
// Using the Java concurrent library
Callable<MyActionResult> myAction = <My Blocking Action>
ExecutorService executor = Executors.newSingleThreadExecutor();
Future<MyActionResult> future = executor.submit(myAction);
MyActionResult result = null;
try {
result = future.get(); // Do not use this by default
result = future.get(TIMEOUT, TIMEUNIT); // Better use this
} catch (TimeoutException e) { // Only thrown if timeouts are used
...
} catch (...) {
...
}
Timeouts (3)
// Using Guava SimpleTimeLimiter
Callable<MyActionResult> myAction = <My Blocking Action>
SimpleTimeLimiter limiter = new SimpleTimeLimiter();
MyActionResult result = null;
try {
result =
limiter.callWithTimeout(myAction, TIMEOUT, TIMEUNIT, false);
} catch (UncheckedTimeoutException e) {
...
} catch (...) {
...
}
Determining Timeout Duration

Configurable Timeouts

Self-Adapting Timeouts

Timeouts in JavaEE Containers
Pattern #2

Circuit Breaker
Circuit Breaker (1)
Client
 Resource
Circuit Breaker
Request
Resource unavailable
Resource available
Closed
 Open
Half-Open
Lifecycle
Circuit Breaker (2)
Closed

on call / pass through
call succeeds / reset count
call fails / count failure
threshold reached / trip breaker
Open

on call / fail
on timeout / attempt reset
trip breaker
Half-Open

on call / pass through
call succeeds / reset
call fails / trip breaker
trip breaker
 attempt reset
reset
Source: M. Nygard, „Release It!“
Circuit Breaker (3)
public class CircuitBreaker implements MyResource {
public enum State { CLOSED, OPEN, HALF_OPEN }
final MyResource resource;
State state;
int counter;
long tripTime;
public CircuitBreaker(MyResource r) {
resource = r;
state = CLOSED;
counter = 0;
tripTime = 0L;
}
...
Circuit Breaker (4)
...
public Result access(...) { // resource access
Result r = null;
if (state == OPEN) {
checkTimeout();
throw new ResourceUnavailableException();
}
try {
r = r.access(...); // should use timeout
} catch (Exception e) {
fail();
throw e;
}
success();
return r;
}
...
Circuit Breaker (5)
...
private void success() {
reset();
}
private void fail() {
counter++;
if (counter > THRESHOLD) {
tripBreaker();
}
}
private void reset() {
state = CLOSED;
counter = 0;
}
...
Circuit Breaker (6)
...
private void tripBreaker() {
state = OPEN;
tripTime = System.currentTimeMillis();
}
private void checkTimeout() {
if ((System.currentTimeMillis - tripTime) > TIMEOUT) {
state = HALF_OPEN;
counter = THRESHOLD;
}
}
public State getState()
return state;
}
}
Thread-Safe Circuit Breaker

Failure Types

Tuning Circuit Breakers

Available Implementations
Pattern #3

Fail Fast
Fail Fast (1)
Client
 Resources
Expensive Action
Request
Uses
Fail Fast (2)
Client
 Resources
Expensive Action
Request
Fail Fast Guard
Uses
Check availability
Forward
Fail Fast (3)
public class FailFastGuard {
private FailFastGuard() {}
public static void checkResources(Set<CircuitBreaker> resources) {
for (CircuitBreaker r : resources) {
if (r.getState() != CircuitBreaker.CLOSED) {
throw new ResourceUnavailableException(r);
}
}
}
}
Fail Fast (4)
public class MyService {
Set<CircuitBreaker> requiredResources;
// Initialize resources
...
public Result myExpensiveAction(...) {
FailFastGuard.checkResources(requiredResources);
// Execute core action
...
}
}
The dreaded SiteTooSuccessfulException …
Pattern #4

Shed Load
Shed Load (1)
Clients
 Server
Too many Requests
Shed Load (2)
Server
Too many Requests
Gate Keeper
Monitor
Requests
Request Load Data
 Monitor Load
Shedded Requests
Clients
Shed Load (3)
public class ShedLoadFilter implements Filter {
Random random;
public void init(FilterConfig fc) throws ServletException {
random = new Random(System.currentTimeMillis());
}
public void destroy() {
random = null;
}
...
Shed Load (4)
...
public void doFilter(ServletRequest request,
ServletResponse response,
FilterChain chain)
throws java.io.IOException, ServletException {
int load = getLoad();
if (shouldShed(load)) {
HttpServletResponse res = (HttpServletResponse)response;
res.setIntHeader("Retry-After", RECOMMENDATION);
res.sendError(HttpServletResponse.SC_SERVICE_UNAVAILABLE);
return;
}
chain.doFilter(request, response);
}
...
Shed Load (5)
...
private boolean shouldShed(int load) { // Example implementation
if (load < THRESHOLD) {
return false;
}
double shedBoundary =
((double)(load - THRESHOLD))/
((double)(MAX_LOAD - THRESHOLD));
return random.nextDouble() < shedBoundary;
}
}
Shed Load (6)
Shed Load (7)
Shedding Strategy

Retrieving Load

Tuning Load Shedders

Alternative Strategies
Pattern #5

Deferrable Work
Deferrable Work (1)
Client
Requests
Request Processing
Resources
Use
Routine Work
Use
OVERLOAD
Deferrable Work (2)
Without

Deferrable Work
100%
OVERLOAD
With

Deferrable Work
100%
Request Processing
Routine Work
// Do or wait variant
ProcessingState state = initBatch();
while(!state.done()) {
int load = getLoad();
if (load > THRESHOLD) {
waitFixedDuration();
} else {
state = processNext(state);
}
}
void waitFixedDuration() {
Thread.sleep(DELAY); // try-catch left out for better readability
}
Deferrable Work (3)
// Adaptive load variant
ProcessingState state = initBatch();
while(!state.done()) {
waitLoadBased();
state = processNext(state);
}
void waitLoadBased() {
int load = getLoad();
long delay = calcDelay(load);
Thread.sleep(delay); // try-catch left out for better readability
}
long calcDelay(int load) { // Simple example implementation
if (load < THRESHOLD) {
return 0L;
}
return (load – THRESHOLD) * DELAY_FACTOR;
}
Deferrable Work (4)
Delay Strategy

Retrieving Load

Tuning Deferrable Work
I can hardly hear you …
Pattern #6

Leaky Bucket
Leaky Bucket (1)
Leaky Bucket
Fill
Problem
occured
Periodically
Leak
Error
Handling
Overflowed?
public class LeakyBucket { // Very simple implementation
final private int capacity;
private int level;
private boolean overflow;
public LeakyBucket(int capacity) {
this.capacity = capacity;
drain();
}
public void drain () {
this.level = 0;
this.overflow = false;
}
...
Leaky Bucket (2)
...
public void fill() {
level++;
if (level > capacity) {
overflow = true;
}
}
public void leak() {
level--;
if (level < 0) {
level = 0;
}
}
public boolean overflowed() {
return overflow;
}
}
Leaky Bucket (3)
Thread-Safe Leaky Bucket

Leaking strategies

Tuning Leaky Bucket

Available Implementations
Pattern #7

Limited Retries
// doAction returns true if successful, false otherwise
// General pattern
boolean success = false
int tries = 0;
while (!success && (tries < MAX_TRIES)) {
success = doAction(...);
tries++;
}
// Alternative one-retry-only variant
success = doAction(...) || doAction(...);
Limited Retries (1)
Idempotent Actions

Closures / Lambdas

Tuning Retries
More Patterns






•  Complete Parameter Checking
•  Marked Data
•  Routine Audits
Further reading

1.  Michael T. Nygard, Release It!,
Pragmatic Bookshelf, 2007
2.  Robert S. Hanmer,

Patterns for Fault Tolerant Software,
Wiley, 2007
3.  James Hamilton, On Designing and
Deploying Internet-Scale Services,

21st LISA Conference 2007
4.  Andrew Tanenbaum, Marten van Steen,
Distributed Systems – Principles and
Paradigms,

Prentice Hall, 2nd Edition, 2006
It‘s all about production!
@ufried
Uwe Friedrichsen | uwe.friedrichsen@codecentric.de | http://slideshare.net/ufried | http://ufried.tumblr.com
Fault tolerance made easy

More Related Content

What's hot

J unit스터디슬라이드
J unit스터디슬라이드J unit스터디슬라이드
J unit스터디슬라이드
ksain
 

What's hot (20)

Load Testing with RedLine13: Or getting paid to DoS your own systems
Load Testing with RedLine13: Or getting paid to DoS your own systemsLoad Testing with RedLine13: Or getting paid to DoS your own systems
Load Testing with RedLine13: Or getting paid to DoS your own systems
 
Defcon_Oracle_The_Making_of_the_2nd_sql_injection_worm
Defcon_Oracle_The_Making_of_the_2nd_sql_injection_wormDefcon_Oracle_The_Making_of_the_2nd_sql_injection_worm
Defcon_Oracle_The_Making_of_the_2nd_sql_injection_worm
 
Oaktable World 2014 Toon Koppelaars: database constraints polite excuse
Oaktable World 2014 Toon Koppelaars: database constraints polite excuseOaktable World 2014 Toon Koppelaars: database constraints polite excuse
Oaktable World 2014 Toon Koppelaars: database constraints polite excuse
 
LA Cassandra Day 2015 - Testing Cassandra
LA Cassandra Day 2015  - Testing CassandraLA Cassandra Day 2015  - Testing Cassandra
LA Cassandra Day 2015 - Testing Cassandra
 
Unit testing patterns for concurrent code
Unit testing patterns for concurrent codeUnit testing patterns for concurrent code
Unit testing patterns for concurrent code
 
Unit testing with Spock Framework
Unit testing with Spock FrameworkUnit testing with Spock Framework
Unit testing with Spock Framework
 
Docker and jvm. A good idea?
Docker and jvm. A good idea?Docker and jvm. A good idea?
Docker and jvm. A good idea?
 
Painless JavaScript Testing with Jest
Painless JavaScript Testing with JestPainless JavaScript Testing with Jest
Painless JavaScript Testing with Jest
 
Performance tests with Gatling (extended)
Performance tests with Gatling (extended)Performance tests with Gatling (extended)
Performance tests with Gatling (extended)
 
Connect2017 DEV-1550 Why Java 8? Or, What's a Lambda?
Connect2017 DEV-1550 Why Java 8? Or, What's a Lambda?Connect2017 DEV-1550 Why Java 8? Or, What's a Lambda?
Connect2017 DEV-1550 Why Java 8? Or, What's a Lambda?
 
Real world functional reactive programming
Real world functional reactive programmingReal world functional reactive programming
Real world functional reactive programming
 
Rules With Drools
Rules With DroolsRules With Drools
Rules With Drools
 
Stress test your backend with Gatling
Stress test your backend with GatlingStress test your backend with Gatling
Stress test your backend with Gatling
 
ScalaSwarm 2017 Keynote: Tough this be madness yet theres method in't
ScalaSwarm 2017 Keynote: Tough this be madness yet theres method in'tScalaSwarm 2017 Keynote: Tough this be madness yet theres method in't
ScalaSwarm 2017 Keynote: Tough this be madness yet theres method in't
 
Qunit Java script Un
Qunit Java script UnQunit Java script Un
Qunit Java script Un
 
Spock Framework
Spock FrameworkSpock Framework
Spock Framework
 
Unit testing in JavaScript with Jasmine and Karma
Unit testing in JavaScript with Jasmine and KarmaUnit testing in JavaScript with Jasmine and Karma
Unit testing in JavaScript with Jasmine and Karma
 
C++ Unit Test with Google Testing Framework
C++ Unit Test with Google Testing FrameworkC++ Unit Test with Google Testing Framework
C++ Unit Test with Google Testing Framework
 
From Elixir to Akka (and back) - ElixirConf Mx 2017
From Elixir to Akka (and back) - ElixirConf Mx 2017From Elixir to Akka (and back) - ElixirConf Mx 2017
From Elixir to Akka (and back) - ElixirConf Mx 2017
 
J unit스터디슬라이드
J unit스터디슬라이드J unit스터디슬라이드
J unit스터디슬라이드
 

Viewers also liked

The promises and perils of microservices
The promises and perils of microservicesThe promises and perils of microservices
The promises and perils of microservices
Uwe Friedrichsen
 

Viewers also liked (20)

How to survive in a BASE world
How to survive in a BASE worldHow to survive in a BASE world
How to survive in a BASE world
 
Dr. Hectic and Mr. Hype - surviving the economic darwinism
Dr. Hectic and Mr. Hype - surviving the economic darwinismDr. Hectic and Mr. Hype - surviving the economic darwinism
Dr. Hectic and Mr. Hype - surviving the economic darwinism
 
Self healing data
Self healing dataSelf healing data
Self healing data
 
Devops for Developers
Devops for DevelopersDevops for Developers
Devops for Developers
 
Patterns of resilience
Patterns of resiliencePatterns of resilience
Patterns of resilience
 
Resilience reloaded - more resilience patterns
Resilience reloaded - more resilience patternsResilience reloaded - more resilience patterns
Resilience reloaded - more resilience patterns
 
The promises and perils of microservices
The promises and perils of microservicesThe promises and perils of microservices
The promises and perils of microservices
 
Cloud fuer entscheider
Cloud fuer entscheiderCloud fuer entscheider
Cloud fuer entscheider
 
Komplexität - Na und?
Komplexität - Na und?Komplexität - Na und?
Komplexität - Na und?
 
Cloud Compliance - Bestimmungen, Zertifizierungen und all das
Cloud Compliance - Bestimmungen, Zertifizierungen und all dasCloud Compliance - Bestimmungen, Zertifizierungen und all das
Cloud Compliance - Bestimmungen, Zertifizierungen und all das
 
Scalability patterns
Scalability patternsScalability patterns
Scalability patterns
 
Emergent architecture
Emergent architectureEmergent architecture
Emergent architecture
 
Der Business Case für Architektur
Der Business Case für ArchitekturDer Business Case für Architektur
Der Business Case für Architektur
 
The agile Architect - Craftsmanship on a new Level
The agile Architect - Craftsmanship on a new LevelThe agile Architect - Craftsmanship on a new Level
The agile Architect - Craftsmanship on a new Level
 
Down with the_sandals
Down with the_sandalsDown with the_sandals
Down with the_sandals
 
No crash allowed - Fault tolerance patterns
No crash allowed - Fault tolerance patternsNo crash allowed - Fault tolerance patterns
No crash allowed - Fault tolerance patterns
 
Hochskalierbare Cloud-Architekturen
Hochskalierbare Cloud-ArchitekturenHochskalierbare Cloud-Architekturen
Hochskalierbare Cloud-Architekturen
 
Kommunikation und Qualität - Java Forum Nord 2016
Kommunikation und Qualität - Java Forum Nord 2016Kommunikation und Qualität - Java Forum Nord 2016
Kommunikation und Qualität - Java Forum Nord 2016
 
OAuth 2.0
OAuth 2.0OAuth 2.0
OAuth 2.0
 
OpenStack Summit :: Redundancy Doesn't Always Mean "HA" or "Cluster"
OpenStack Summit :: Redundancy Doesn't Always Mean "HA" or "Cluster"OpenStack Summit :: Redundancy Doesn't Always Mean "HA" or "Cluster"
OpenStack Summit :: Redundancy Doesn't Always Mean "HA" or "Cluster"
 

Similar to Fault tolerance made easy

33rd Degree 2013, Bad Tests, Good Tests
33rd Degree 2013, Bad Tests, Good Tests33rd Degree 2013, Bad Tests, Good Tests
33rd Degree 2013, Bad Tests, Good Tests
Tomek Kaczanowski
 
Effective testing for spark programs Strata NY 2015
Effective testing for spark programs   Strata NY 2015Effective testing for spark programs   Strata NY 2015
Effective testing for spark programs Strata NY 2015
Holden Karau
 
Java 5 concurrency
Java 5 concurrencyJava 5 concurrency
Java 5 concurrency
priyank09
 
Java Concurrency in Practice
Java Concurrency in PracticeJava Concurrency in Practice
Java Concurrency in Practice
ericbeyeler
 
Os Practical Assignment 1
Os Practical Assignment 1Os Practical Assignment 1
Os Practical Assignment 1
Emmanuel Garcia
 

Similar to Fault tolerance made easy (20)

Java Concurrency
Java ConcurrencyJava Concurrency
Java Concurrency
 
2012 JDays Bad Tests Good Tests
2012 JDays Bad Tests Good Tests2012 JDays Bad Tests Good Tests
2012 JDays Bad Tests Good Tests
 
33rd Degree 2013, Bad Tests, Good Tests
33rd Degree 2013, Bad Tests, Good Tests33rd Degree 2013, Bad Tests, Good Tests
33rd Degree 2013, Bad Tests, Good Tests
 
Developer Test - Things to Know
Developer Test - Things to KnowDeveloper Test - Things to Know
Developer Test - Things to Know
 
How to Start Test-Driven Development in Legacy Code
How to Start Test-Driven Development in Legacy CodeHow to Start Test-Driven Development in Legacy Code
How to Start Test-Driven Development in Legacy Code
 
The Promised Land (in Angular)
The Promised Land (in Angular)The Promised Land (in Angular)
The Promised Land (in Angular)
 
Effective testing for spark programs Strata NY 2015
Effective testing for spark programs   Strata NY 2015Effective testing for spark programs   Strata NY 2015
Effective testing for spark programs Strata NY 2015
 
Sane Async Patterns
Sane Async PatternsSane Async Patterns
Sane Async Patterns
 
Java 5 concurrency
Java 5 concurrencyJava 5 concurrency
Java 5 concurrency
 
Resiliency & Security_Ballerina Day CMB 2018
Resiliency & Security_Ballerina Day CMB 2018  Resiliency & Security_Ballerina Day CMB 2018
Resiliency & Security_Ballerina Day CMB 2018
 
Ten useful JavaScript tips & best practices
Ten useful JavaScript tips & best practicesTen useful JavaScript tips & best practices
Ten useful JavaScript tips & best practices
 
Multithreading in Java
Multithreading in JavaMultithreading in Java
Multithreading in Java
 
Confitura 2012 Bad Tests, Good Tests
Confitura 2012 Bad Tests, Good TestsConfitura 2012 Bad Tests, Good Tests
Confitura 2012 Bad Tests, Good Tests
 
Java util concurrent
Java util concurrentJava util concurrent
Java util concurrent
 
GeeCON 2012 Bad Tests, Good Tests
GeeCON 2012 Bad Tests, Good TestsGeeCON 2012 Bad Tests, Good Tests
GeeCON 2012 Bad Tests, Good Tests
 
Tricks to Making a Realtime SurfaceView Actually Perform in Realtime - Maarte...
Tricks to Making a Realtime SurfaceView Actually Perform in Realtime - Maarte...Tricks to Making a Realtime SurfaceView Actually Perform in Realtime - Maarte...
Tricks to Making a Realtime SurfaceView Actually Perform in Realtime - Maarte...
 
Nevyn — Promise, It's Async! Swift Language User Group Lightning Talk 2015-09-24
Nevyn — Promise, It's Async! Swift Language User Group Lightning Talk 2015-09-24Nevyn — Promise, It's Async! Swift Language User Group Lightning Talk 2015-09-24
Nevyn — Promise, It's Async! Swift Language User Group Lightning Talk 2015-09-24
 
Beyond parallelize and collect - Spark Summit East 2016
Beyond parallelize and collect - Spark Summit East 2016Beyond parallelize and collect - Spark Summit East 2016
Beyond parallelize and collect - Spark Summit East 2016
 
Java Concurrency in Practice
Java Concurrency in PracticeJava Concurrency in Practice
Java Concurrency in Practice
 
Os Practical Assignment 1
Os Practical Assignment 1Os Practical Assignment 1
Os Practical Assignment 1
 

More from Uwe Friedrichsen

Timeless design in a cloud-native world
Timeless design in a cloud-native worldTimeless design in a cloud-native world
Timeless design in a cloud-native world
Uwe Friedrichsen
 
Deep learning - a primer
Deep learning - a primerDeep learning - a primer
Deep learning - a primer
Uwe Friedrichsen
 
Real-world consistency explained
Real-world consistency explainedReal-world consistency explained
Real-world consistency explained
Uwe Friedrichsen
 
The 7 quests of resilient software design
The 7 quests of resilient software designThe 7 quests of resilient software design
The 7 quests of resilient software design
Uwe Friedrichsen
 
Excavating the knowledge of our ancestors
Excavating the knowledge of our ancestorsExcavating the knowledge of our ancestors
Excavating the knowledge of our ancestors
Uwe Friedrichsen
 
The truth about "You build it, you run it!"
The truth about "You build it, you run it!"The truth about "You build it, you run it!"
The truth about "You build it, you run it!"
Uwe Friedrichsen
 
Resilient Functional Service Design
Resilient Functional Service DesignResilient Functional Service Design
Resilient Functional Service Design
Uwe Friedrichsen
 
DevOps is not enough - Embedding DevOps in a broader context
DevOps is not enough - Embedding DevOps in a broader contextDevOps is not enough - Embedding DevOps in a broader context
DevOps is not enough - Embedding DevOps in a broader context
Uwe Friedrichsen
 
Modern times - architectures for a Next Generation of IT
Modern times - architectures for a Next Generation of ITModern times - architectures for a Next Generation of IT
Modern times - architectures for a Next Generation of IT
Uwe Friedrichsen
 

More from Uwe Friedrichsen (20)

Timeless design in a cloud-native world
Timeless design in a cloud-native worldTimeless design in a cloud-native world
Timeless design in a cloud-native world
 
Deep learning - a primer
Deep learning - a primerDeep learning - a primer
Deep learning - a primer
 
Life after microservices
Life after microservicesLife after microservices
Life after microservices
 
The hitchhiker's guide for the confused developer
The hitchhiker's guide for the confused developerThe hitchhiker's guide for the confused developer
The hitchhiker's guide for the confused developer
 
Digitization solutions - A new breed of software
Digitization solutions - A new breed of softwareDigitization solutions - A new breed of software
Digitization solutions - A new breed of software
 
Real-world consistency explained
Real-world consistency explainedReal-world consistency explained
Real-world consistency explained
 
The 7 quests of resilient software design
The 7 quests of resilient software designThe 7 quests of resilient software design
The 7 quests of resilient software design
 
Excavating the knowledge of our ancestors
Excavating the knowledge of our ancestorsExcavating the knowledge of our ancestors
Excavating the knowledge of our ancestors
 
The truth about "You build it, you run it!"
The truth about "You build it, you run it!"The truth about "You build it, you run it!"
The truth about "You build it, you run it!"
 
Resilient Functional Service Design
Resilient Functional Service DesignResilient Functional Service Design
Resilient Functional Service Design
 
Watch your communication
Watch your communicationWatch your communication
Watch your communication
 
Life, IT and everything
Life, IT and everythingLife, IT and everything
Life, IT and everything
 
DevOps is not enough - Embedding DevOps in a broader context
DevOps is not enough - Embedding DevOps in a broader contextDevOps is not enough - Embedding DevOps in a broader context
DevOps is not enough - Embedding DevOps in a broader context
 
Production-ready Software
Production-ready SoftwareProduction-ready Software
Production-ready Software
 
Towards complex adaptive architectures
Towards complex adaptive architecturesTowards complex adaptive architectures
Towards complex adaptive architectures
 
Conway's law revisited - Architectures for an effective IT
Conway's law revisited - Architectures for an effective ITConway's law revisited - Architectures for an effective IT
Conway's law revisited - Architectures for an effective IT
 
Microservices - stress-free and without increased heart attack risk
Microservices - stress-free and without increased heart attack riskMicroservices - stress-free and without increased heart attack risk
Microservices - stress-free and without increased heart attack risk
 
Modern times - architectures for a Next Generation of IT
Modern times - architectures for a Next Generation of ITModern times - architectures for a Next Generation of IT
Modern times - architectures for a Next Generation of IT
 
The Next Generation (of) IT
The Next Generation (of) ITThe Next Generation (of) IT
The Next Generation (of) IT
 
Why resilience - A primer at varying flight altitudes
Why resilience - A primer at varying flight altitudesWhy resilience - A primer at varying flight altitudes
Why resilience - A primer at varying flight altitudes
 

Recently uploaded

Essentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with ParametersEssentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with Parameters
Safe Software
 
Search and Society: Reimagining Information Access for Radical Futures
Search and Society: Reimagining Information Access for Radical FuturesSearch and Society: Reimagining Information Access for Radical Futures
Search and Society: Reimagining Information Access for Radical Futures
Bhaskar Mitra
 

Recently uploaded (20)

Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...
 
"Impact of front-end architecture on development cost", Viktor Turskyi
"Impact of front-end architecture on development cost", Viktor Turskyi"Impact of front-end architecture on development cost", Viktor Turskyi
"Impact of front-end architecture on development cost", Viktor Turskyi
 
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMsTo Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
 
Powerful Start- the Key to Project Success, Barbara Laskowska
Powerful Start- the Key to Project Success, Barbara LaskowskaPowerful Start- the Key to Project Success, Barbara Laskowska
Powerful Start- the Key to Project Success, Barbara Laskowska
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
 
AI revolution and Salesforce, Jiří Karpíšek
AI revolution and Salesforce, Jiří KarpíšekAI revolution and Salesforce, Jiří Karpíšek
AI revolution and Salesforce, Jiří Karpíšek
 
UiPath Test Automation using UiPath Test Suite series, part 1
UiPath Test Automation using UiPath Test Suite series, part 1UiPath Test Automation using UiPath Test Suite series, part 1
UiPath Test Automation using UiPath Test Suite series, part 1
 
Speed Wins: From Kafka to APIs in Minutes
Speed Wins: From Kafka to APIs in MinutesSpeed Wins: From Kafka to APIs in Minutes
Speed Wins: From Kafka to APIs in Minutes
 
IESVE for Early Stage Design and Planning
IESVE for Early Stage Design and PlanningIESVE for Early Stage Design and Planning
IESVE for Early Stage Design and Planning
 
Essentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with ParametersEssentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with Parameters
 
JMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and GrafanaJMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and Grafana
 
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
 
Exploring UiPath Orchestrator API: updates and limits in 2024 🚀
Exploring UiPath Orchestrator API: updates and limits in 2024 🚀Exploring UiPath Orchestrator API: updates and limits in 2024 🚀
Exploring UiPath Orchestrator API: updates and limits in 2024 🚀
 
Knowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and backKnowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and back
 
Introduction to Open Source RAG and RAG Evaluation
Introduction to Open Source RAG and RAG EvaluationIntroduction to Open Source RAG and RAG Evaluation
Introduction to Open Source RAG and RAG Evaluation
 
Search and Society: Reimagining Information Access for Radical Futures
Search and Society: Reimagining Information Access for Radical FuturesSearch and Society: Reimagining Information Access for Radical Futures
Search and Society: Reimagining Information Access for Radical Futures
 
Custom Approval Process: A New Perspective, Pavel Hrbacek & Anindya Halder
Custom Approval Process: A New Perspective, Pavel Hrbacek & Anindya HalderCustom Approval Process: A New Perspective, Pavel Hrbacek & Anindya Halder
Custom Approval Process: A New Perspective, Pavel Hrbacek & Anindya Halder
 
Demystifying gRPC in .Net by John Staveley
Demystifying gRPC in .Net by John StaveleyDemystifying gRPC in .Net by John Staveley
Demystifying gRPC in .Net by John Staveley
 
Key Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdfKey Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdf
 
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
 

Fault tolerance made easy

  • 1. Fault tolerance made easy Patterns for fault tolerance implemented surprisingly easy Uwe Friedrichsen, codecentric AG, 2013-2014
  • 2. @ufried Uwe Friedrichsen | uwe.friedrichsen@codecentric.de | http://slideshare.net/ufried | http://ufried.tumblr.com
  • 3. It‘s all about production!
  • 5. Your web server doesn‘t look good …
  • 7. Timeouts (1) // Basics myObject.wait(); // Do not use this by default myObject.wait(TIMEOUT); // Better use this // Some more basics myThread.join(); // Do not use this by default myThread.join(TIMEOUT); // Better use this
  • 8. Timeouts (2) // Using the Java concurrent library Callable<MyActionResult> myAction = <My Blocking Action> ExecutorService executor = Executors.newSingleThreadExecutor(); Future<MyActionResult> future = executor.submit(myAction); MyActionResult result = null; try { result = future.get(); // Do not use this by default result = future.get(TIMEOUT, TIMEUNIT); // Better use this } catch (TimeoutException e) { // Only thrown if timeouts are used ... } catch (...) { ... }
  • 9. Timeouts (3) // Using Guava SimpleTimeLimiter Callable<MyActionResult> myAction = <My Blocking Action> SimpleTimeLimiter limiter = new SimpleTimeLimiter(); MyActionResult result = null; try { result = limiter.callWithTimeout(myAction, TIMEOUT, TIMEUNIT, false); } catch (UncheckedTimeoutException e) { ... } catch (...) { ... }
  • 10. Determining Timeout Duration Configurable Timeouts Self-Adapting Timeouts Timeouts in JavaEE Containers
  • 12. Circuit Breaker (1) Client Resource Circuit Breaker Request Resource unavailable Resource available Closed Open Half-Open Lifecycle
  • 13. Circuit Breaker (2) Closed on call / pass through call succeeds / reset count call fails / count failure threshold reached / trip breaker Open on call / fail on timeout / attempt reset trip breaker Half-Open on call / pass through call succeeds / reset call fails / trip breaker trip breaker attempt reset reset Source: M. Nygard, „Release It!“
  • 14. Circuit Breaker (3) public class CircuitBreaker implements MyResource { public enum State { CLOSED, OPEN, HALF_OPEN } final MyResource resource; State state; int counter; long tripTime; public CircuitBreaker(MyResource r) { resource = r; state = CLOSED; counter = 0; tripTime = 0L; } ...
  • 15. Circuit Breaker (4) ... public Result access(...) { // resource access Result r = null; if (state == OPEN) { checkTimeout(); throw new ResourceUnavailableException(); } try { r = r.access(...); // should use timeout } catch (Exception e) { fail(); throw e; } success(); return r; } ...
  • 16. Circuit Breaker (5) ... private void success() { reset(); } private void fail() { counter++; if (counter > THRESHOLD) { tripBreaker(); } } private void reset() { state = CLOSED; counter = 0; } ...
  • 17. Circuit Breaker (6) ... private void tripBreaker() { state = OPEN; tripTime = System.currentTimeMillis(); } private void checkTimeout() { if ((System.currentTimeMillis - tripTime) > TIMEOUT) { state = HALF_OPEN; counter = THRESHOLD; } } public State getState() return state; } }
  • 18. Thread-Safe Circuit Breaker Failure Types Tuning Circuit Breakers Available Implementations
  • 20. Fail Fast (1) Client Resources Expensive Action Request Uses
  • 21. Fail Fast (2) Client Resources Expensive Action Request Fail Fast Guard Uses Check availability Forward
  • 22. Fail Fast (3) public class FailFastGuard { private FailFastGuard() {} public static void checkResources(Set<CircuitBreaker> resources) { for (CircuitBreaker r : resources) { if (r.getState() != CircuitBreaker.CLOSED) { throw new ResourceUnavailableException(r); } } } }
  • 23. Fail Fast (4) public class MyService { Set<CircuitBreaker> requiredResources; // Initialize resources ... public Result myExpensiveAction(...) { FailFastGuard.checkResources(requiredResources); // Execute core action ... } }
  • 26. Shed Load (1) Clients Server Too many Requests
  • 27. Shed Load (2) Server Too many Requests Gate Keeper Monitor Requests Request Load Data Monitor Load Shedded Requests Clients
  • 28. Shed Load (3) public class ShedLoadFilter implements Filter { Random random; public void init(FilterConfig fc) throws ServletException { random = new Random(System.currentTimeMillis()); } public void destroy() { random = null; } ...
  • 29. Shed Load (4) ... public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain) throws java.io.IOException, ServletException { int load = getLoad(); if (shouldShed(load)) { HttpServletResponse res = (HttpServletResponse)response; res.setIntHeader("Retry-After", RECOMMENDATION); res.sendError(HttpServletResponse.SC_SERVICE_UNAVAILABLE); return; } chain.doFilter(request, response); } ...
  • 30. Shed Load (5) ... private boolean shouldShed(int load) { // Example implementation if (load < THRESHOLD) { return false; } double shedBoundary = ((double)(load - THRESHOLD))/ ((double)(MAX_LOAD - THRESHOLD)); return random.nextDouble() < shedBoundary; } }
  • 33. Shedding Strategy Retrieving Load Tuning Load Shedders Alternative Strategies
  • 35. Deferrable Work (1) Client Requests Request Processing Resources Use Routine Work Use
  • 36. OVERLOAD Deferrable Work (2) Without
 Deferrable Work 100% OVERLOAD With
 Deferrable Work 100% Request Processing Routine Work
  • 37. // Do or wait variant ProcessingState state = initBatch(); while(!state.done()) { int load = getLoad(); if (load > THRESHOLD) { waitFixedDuration(); } else { state = processNext(state); } } void waitFixedDuration() { Thread.sleep(DELAY); // try-catch left out for better readability } Deferrable Work (3)
  • 38. // Adaptive load variant ProcessingState state = initBatch(); while(!state.done()) { waitLoadBased(); state = processNext(state); } void waitLoadBased() { int load = getLoad(); long delay = calcDelay(load); Thread.sleep(delay); // try-catch left out for better readability } long calcDelay(int load) { // Simple example implementation if (load < THRESHOLD) { return 0L; } return (load – THRESHOLD) * DELAY_FACTOR; } Deferrable Work (4)
  • 40. I can hardly hear you …
  • 42. Leaky Bucket (1) Leaky Bucket Fill Problem occured Periodically Leak Error Handling Overflowed?
  • 43. public class LeakyBucket { // Very simple implementation final private int capacity; private int level; private boolean overflow; public LeakyBucket(int capacity) { this.capacity = capacity; drain(); } public void drain () { this.level = 0; this.overflow = false; } ... Leaky Bucket (2)
  • 44. ... public void fill() { level++; if (level > capacity) { overflow = true; } } public void leak() { level--; if (level < 0) { level = 0; } } public boolean overflowed() { return overflow; } } Leaky Bucket (3)
  • 45. Thread-Safe Leaky Bucket Leaking strategies Tuning Leaky Bucket Available Implementations
  • 47. // doAction returns true if successful, false otherwise // General pattern boolean success = false int tries = 0; while (!success && (tries < MAX_TRIES)) { success = doAction(...); tries++; } // Alternative one-retry-only variant success = doAction(...) || doAction(...); Limited Retries (1)
  • 48. Idempotent Actions Closures / Lambdas Tuning Retries
  • 49. More Patterns •  Complete Parameter Checking •  Marked Data •  Routine Audits
  • 50. Further reading 1.  Michael T. Nygard, Release It!, Pragmatic Bookshelf, 2007 2.  Robert S. Hanmer,
 Patterns for Fault Tolerant Software, Wiley, 2007 3.  James Hamilton, On Designing and Deploying Internet-Scale Services,
 21st LISA Conference 2007 4.  Andrew Tanenbaum, Marten van Steen, Distributed Systems – Principles and Paradigms,
 Prentice Hall, 2nd Edition, 2006
  • 51. It‘s all about production!
  • 52. @ufried Uwe Friedrichsen | uwe.friedrichsen@codecentric.de | http://slideshare.net/ufried | http://ufried.tumblr.com