SlideShare a Scribd company logo
Designing Fault Tolerant
Microservices
Speaker
Orkhan Gasimov
Solution design & implementation.
15 years of software engineering;
Teaching training courses.
Architecture, Java, JavaScript / TypeScript.
Author of training courses.
Microservice, Spring Cloud, Akka for Java.
2
Agenda
Forces & Solutions.
Patterns & Approaches.
3
Forces
Network, high-load, RPC mechanics.
DBService
API
DBService
DBService
DBService
Web
API
Mobile
API
4
Possible Issues
Network & services will eventually fail.
DBService
API
DBService
DBService
DBService
Web
API
Mobile
API
5
Network
Service Coordination
Service Coordination
Alternate Routes:
Consumer Producer
Producer
Producer
7
Service Coordination
Service Discovery:
• Dynamic service locations.
Service
Registry
Consumer Producer
8
Service Coordination
Service Discovery:
• Dynamic service locations.
• Scale horizontally on demand.
Service
Registry
Consumer Producer
Producer
Producer
9
Service Coordination
Service Discovery:
• Dynamic service locations.
• Scale horizontally on demand.
• Client-side load balancing on the fly. Service
Registry
Consumer Producer
Producer
Producer
LB
10
Service Coordination
Service Discovery:
• Dynamic service locations.
• Scale horizontally on demand.
• Client-side load balancing on the fly.
• Discovery Cluster.
Service
Registry
Consumer Producer
Producer
Producer
LB
Service
Registry
Service
Registry
11
High Load
High Load
Loaded services may get slower.
Common solutions:
Horizontal scalability and load-balancing.
DBService
API
DBService
DBService
DBService
Web
API
Mobile
API
13
High Load
Scale manually by deploying more instances.
DB
Service
API
DBService
DBService
DB
Web
API
Mobile
API
Service
Service
Service
14
High Load
Scale manually by deploying more instances.
Drawbacks:
• Deployment requires time and resources. Not a real-time solution.
• Lack of automation. Down-scale? Who? When?
15
High Load
Scale dynamically, using an auto-scaler service.
DBService
API
DBService
DBService
DB
Web
API
Mobile
API
Service
Service
Service
16
High Load
Scale dynamically, using an auto-scaler service.
Drawbacks:
• What happens when resource limit is reached?
• Auto-scaler requires support. May fail to down-scale / up-scale.
17
RPC Mechanics
RPC Mechanics
Cascading call stack deliver the final result.
Service ServiceService Service Service
ServiceService
19
RPC Mechanics
If network or service fails,
clients may get errors instead of meaningful results.
Service ServiceService Service Service
ServiceService
20
RPC Mechanics
Sometimes you have to wait, to get an error.
Service ServiceService Service Service
21
RPC Mechanics
Sometimes you have to wait, to get an error.
Users should wait for results, not errors!
Service ServiceService Service Service
22
RPC Mechanics
System needs time to free-up resources.
DBService
API
DBService
DBService
DBService
Web
API
Mobile
API
23
RPC Mechanics
Possible solution:
• Time-outs – consider slow calls erroneous.
• Threshold – e.g. error rate per interval.
• Fallback value/function (no network calls).
• Recovery time.
24
Circuit Breaker
Circuit Breaker
Closed – watch requests, keep track of errors.
26
Closed
Open
Half-Open
Success
Too many fails
Fast Fail
Try one request
Fail
Success
Circuit Breaker
Closed – watch requests, keep track of errors.
Open – if there are too many errors, fast fail to a default value.
27
Closed
Open
Half-Open
Success
Too many fails
Fast Fail
Try one request
Fail
Success
Circuit Breaker
Closed – watch requests, keep track of errors.
Open – if there are too many errors, fast fail to a default value.
Half-Open – try one request some time later.
28
Closed
Open
Half-Open
Success
Too many fails
Fast Fail
Try one request
Fail
Success
Circuit Breaker
Track Exceptions & watch for time-outs.
29
Circuit Breaker
On time-out throw a Timeout Exception.
Track only Exceptions!
30
Circuit Breaker
Usually applied at caller side.
DBService
API
DBService
DBService
DBService
Web
API
Mobile
APICB
CB
31
Example
Discovery:
32
Discovery
Server
Travel API
Ticket
Service
Comment
Service
Attachment
Service
Storage
Service
Example
Circuit Breaker:
33
Discovery
Server
Travel API
Ticket
Service
Comment
Service
Attachment
Service
Storage
Service
Example
Metrics:
34
Discovery
Server
Travel API
Ticket
Service
Comment
Service
Attachment
Service
Storage
Service
Metrics
Aggregator
N-Modular Redundancy
Process faster if possible
N-Modular Redundancy
Send N parallel calls, vote for the first acceptable result.
Caller
Executor 1
Executor 2
Executor N
Voter Receiver
N-Modular Redundancy
Faster processing & lower probability of errors.
Remote calls should be idempotent!
Executor 1 Executor 2 Executor N
ReceiverCaller & Voter
Example
The fastest wins.
Maps 1 Maps 2
Address Validation
Example
1st is ok.
2nd waits for 1st.
Maps 1 Maps 2
Address Validation
Recovery Blocks
Slow down if necessary
Fast Path
Protect from time-outs!
Recovery Blocks
Work faster when everything is fine.
Slow down to protect when necessary.
ReceiverCaller
Request
Response
Result
Basic Recovery
Fast Path
Advanced Recovery
Recovery Blocks
Synchronous
Block N
ReceiverBlock 2
Block 1
Caller
Request
Fail
Fail
Response Result
Recovery Blocks
Asynchronous
Block N
ReceiverBlock 2
Block 1
Caller
Request
Fail
Fail
Response
Example
Prefer the fastest.
Switch to next if prior won’t work.
Maps 1 Maps 2
Address Validation
Maps 1Cached
Error Kernel
Actor Model
Actors
Actors:
• Communicate by sending messages.
• Form into a hierarchy.
Tasks:
• Split tasks to sub-tasks.
• Delegate sub-tasks to subordinates.
Actors
Error handling:
• Actors may fail while executing a task.
• Fails are reported to parent as an error message.
• Parent actors supervise and monitor subordinates.
• Parents may resume, restart, stop subordinates or escalate the
error.
Error Kernel
Error Kernel Pattern:
• The root actor is the Error Kernel of the whole actor system.
• Each parent actor is the local Error Kernel of it’s hierarchy.
Error Kernel
• Split the task to sub-tasks and delegate the execution.
• Aggregate results, supervise and monitor faults.
• Do not execute tasks at root, avoid errors.
• If root fails, the whole task fails.
Instance Healers
Instance Healers
• Services are scaled by running several instances.
• Service instances may fail.
• Need a mechanism to watch for failed instances.
Instance Healers
• Watch for the service instances using health checks.
• If necessary, heal instances (restart or spawn).
• May be the Error Kernel of the system.
Healer
Dead NewHealed
Instance Healers
Healer
Dead NewHealed
Instance Healers
Possible Issues:
• Services may be in the middle of processing.
• Stateful services may loose data.
Instance Healers
Event Sourcing + CQRS:
• Keep track of state-changing events, rebuild state from events.
• Take snapshots for faster rebuilds.
• Services may re-read events after healing.
• Processing should be idempotent!
Command
Service
Query
Service
Event
Store
Data
Store
Event
Processor
Summary
Summary
• Forces:
• Network, Load, RPC.
• Solutions:
• Service Coordination, Horizontal Scalability & Load Balancing.
• Patterns:
• Circuit Breaker, N-Modular Redundancy, Recovery Blocks.
• Approaches:
• Error Kernel & Instance Healer.
58
Thank You!
ogasimov@gmail.com
facebook.com/ogassymov

More Related Content

What's hot

Visualizing Kafka Security
Visualizing Kafka SecurityVisualizing Kafka Security
Visualizing Kafka Security
DataWorks Summit
 
Kafka Security 101 and Real-World Tips
Kafka Security 101 and Real-World Tips Kafka Security 101 and Real-World Tips
Kafka Security 101 and Real-World Tips
confluent
 
Overcoming the Perils of Kafka Secret Sprawl (Tejal Adsul, Confluent) Kafka S...
Overcoming the Perils of Kafka Secret Sprawl (Tejal Adsul, Confluent) Kafka S...Overcoming the Perils of Kafka Secret Sprawl (Tejal Adsul, Confluent) Kafka S...
Overcoming the Perils of Kafka Secret Sprawl (Tejal Adsul, Confluent) Kafka S...
confluent
 
Understanding Apache Kafka® Latency at Scale
Understanding Apache Kafka® Latency at ScaleUnderstanding Apache Kafka® Latency at Scale
Understanding Apache Kafka® Latency at Scale
confluent
 
Take Kafka-on-Pulsar to Production at Internet Scale: Improvements Made for P...
Take Kafka-on-Pulsar to Production at Internet Scale: Improvements Made for P...Take Kafka-on-Pulsar to Production at Internet Scale: Improvements Made for P...
Take Kafka-on-Pulsar to Production at Internet Scale: Improvements Made for P...
StreamNative
 
How to build 1000 microservices with Kafka and thrive
How to build 1000 microservices with Kafka and thriveHow to build 1000 microservices with Kafka and thrive
How to build 1000 microservices with Kafka and thrive
Natan Silnitsky
 
Protecting your data at rest with Apache Kafka by Confluent and Vormetric
Protecting your data at rest with Apache Kafka by Confluent and VormetricProtecting your data at rest with Apache Kafka by Confluent and Vormetric
Protecting your data at rest with Apache Kafka by Confluent and Vormetric
confluent
 
Making Kafka Cloud Native | Jay Kreps, Co-Founder & CEO, Confluent
Making Kafka Cloud Native | Jay Kreps, Co-Founder & CEO, ConfluentMaking Kafka Cloud Native | Jay Kreps, Co-Founder & CEO, Confluent
Making Kafka Cloud Native | Jay Kreps, Co-Founder & CEO, Confluent
HostedbyConfluent
 
Kafka Pluggable Authorization for Enterprise Security (Anna Kepler, Viasat) K...
Kafka Pluggable Authorization for Enterprise Security (Anna Kepler, Viasat) K...Kafka Pluggable Authorization for Enterprise Security (Anna Kepler, Viasat) K...
Kafka Pluggable Authorization for Enterprise Security (Anna Kepler, Viasat) K...
confluent
 
Микросервисы со Spring Boot & Spring Cloud
Микросервисы со Spring Boot & Spring CloudМикросервисы со Spring Boot & Spring Cloud
Микросервисы со Spring Boot & Spring Cloud
Vitebsk DSC
 
Building Event-Driven Systems with Apache Kafka
Building Event-Driven Systems with Apache KafkaBuilding Event-Driven Systems with Apache Kafka
Building Event-Driven Systems with Apache Kafka
Brian Ritchie
 
Strimzi - Where Apache Kafka meets OpenShift - OpenShift Spain MeetUp
Strimzi - Where Apache Kafka meets OpenShift - OpenShift Spain MeetUpStrimzi - Where Apache Kafka meets OpenShift - OpenShift Spain MeetUp
Strimzi - Where Apache Kafka meets OpenShift - OpenShift Spain MeetUp
José Román Martín Gil
 
Altitude SF 2017: Security at the edge
Altitude SF 2017: Security at the edgeAltitude SF 2017: Security at the edge
Altitude SF 2017: Security at the edge
Fastly
 
The best of Apache Kafka Architecture
The best of Apache Kafka ArchitectureThe best of Apache Kafka Architecture
The best of Apache Kafka Architecture
techmaddy
 
Tuning kafka pipelines
Tuning kafka pipelinesTuning kafka pipelines
Tuning kafka pipelines
Sumant Tambe
 
Writing Blazing Fast, and Production-Ready Kafka Streams apps in less than 30...
Writing Blazing Fast, and Production-Ready Kafka Streams apps in less than 30...Writing Blazing Fast, and Production-Ready Kafka Streams apps in less than 30...
Writing Blazing Fast, and Production-Ready Kafka Streams apps in less than 30...
HostedbyConfluent
 
8 Lessons Learned from Using Kafka in 1000 Scala microservices - Scale by the...
8 Lessons Learned from Using Kafka in 1000 Scala microservices - Scale by the...8 Lessons Learned from Using Kafka in 1000 Scala microservices - Scale by the...
8 Lessons Learned from Using Kafka in 1000 Scala microservices - Scale by the...
Natan Silnitsky
 
Big Data Redis Mongodb Dynamodb Sharding
Big Data Redis Mongodb Dynamodb ShardingBig Data Redis Mongodb Dynamodb Sharding
Big Data Redis Mongodb Dynamodb Sharding
Araf Karsh Hamid
 
analytic engine - a common big data computation service on the aws
analytic engine - a common big data computation service on the awsanalytic engine - a common big data computation service on the aws
analytic engine - a common big data computation service on the aws
Scott Miao
 
Kafka Tutorial: Advanced Producers
Kafka Tutorial: Advanced ProducersKafka Tutorial: Advanced Producers
Kafka Tutorial: Advanced Producers
Jean-Paul Azar
 

What's hot (20)

Visualizing Kafka Security
Visualizing Kafka SecurityVisualizing Kafka Security
Visualizing Kafka Security
 
Kafka Security 101 and Real-World Tips
Kafka Security 101 and Real-World Tips Kafka Security 101 and Real-World Tips
Kafka Security 101 and Real-World Tips
 
Overcoming the Perils of Kafka Secret Sprawl (Tejal Adsul, Confluent) Kafka S...
Overcoming the Perils of Kafka Secret Sprawl (Tejal Adsul, Confluent) Kafka S...Overcoming the Perils of Kafka Secret Sprawl (Tejal Adsul, Confluent) Kafka S...
Overcoming the Perils of Kafka Secret Sprawl (Tejal Adsul, Confluent) Kafka S...
 
Understanding Apache Kafka® Latency at Scale
Understanding Apache Kafka® Latency at ScaleUnderstanding Apache Kafka® Latency at Scale
Understanding Apache Kafka® Latency at Scale
 
Take Kafka-on-Pulsar to Production at Internet Scale: Improvements Made for P...
Take Kafka-on-Pulsar to Production at Internet Scale: Improvements Made for P...Take Kafka-on-Pulsar to Production at Internet Scale: Improvements Made for P...
Take Kafka-on-Pulsar to Production at Internet Scale: Improvements Made for P...
 
How to build 1000 microservices with Kafka and thrive
How to build 1000 microservices with Kafka and thriveHow to build 1000 microservices with Kafka and thrive
How to build 1000 microservices with Kafka and thrive
 
Protecting your data at rest with Apache Kafka by Confluent and Vormetric
Protecting your data at rest with Apache Kafka by Confluent and VormetricProtecting your data at rest with Apache Kafka by Confluent and Vormetric
Protecting your data at rest with Apache Kafka by Confluent and Vormetric
 
Making Kafka Cloud Native | Jay Kreps, Co-Founder & CEO, Confluent
Making Kafka Cloud Native | Jay Kreps, Co-Founder & CEO, ConfluentMaking Kafka Cloud Native | Jay Kreps, Co-Founder & CEO, Confluent
Making Kafka Cloud Native | Jay Kreps, Co-Founder & CEO, Confluent
 
Kafka Pluggable Authorization for Enterprise Security (Anna Kepler, Viasat) K...
Kafka Pluggable Authorization for Enterprise Security (Anna Kepler, Viasat) K...Kafka Pluggable Authorization for Enterprise Security (Anna Kepler, Viasat) K...
Kafka Pluggable Authorization for Enterprise Security (Anna Kepler, Viasat) K...
 
Микросервисы со Spring Boot & Spring Cloud
Микросервисы со Spring Boot & Spring CloudМикросервисы со Spring Boot & Spring Cloud
Микросервисы со Spring Boot & Spring Cloud
 
Building Event-Driven Systems with Apache Kafka
Building Event-Driven Systems with Apache KafkaBuilding Event-Driven Systems with Apache Kafka
Building Event-Driven Systems with Apache Kafka
 
Strimzi - Where Apache Kafka meets OpenShift - OpenShift Spain MeetUp
Strimzi - Where Apache Kafka meets OpenShift - OpenShift Spain MeetUpStrimzi - Where Apache Kafka meets OpenShift - OpenShift Spain MeetUp
Strimzi - Where Apache Kafka meets OpenShift - OpenShift Spain MeetUp
 
Altitude SF 2017: Security at the edge
Altitude SF 2017: Security at the edgeAltitude SF 2017: Security at the edge
Altitude SF 2017: Security at the edge
 
The best of Apache Kafka Architecture
The best of Apache Kafka ArchitectureThe best of Apache Kafka Architecture
The best of Apache Kafka Architecture
 
Tuning kafka pipelines
Tuning kafka pipelinesTuning kafka pipelines
Tuning kafka pipelines
 
Writing Blazing Fast, and Production-Ready Kafka Streams apps in less than 30...
Writing Blazing Fast, and Production-Ready Kafka Streams apps in less than 30...Writing Blazing Fast, and Production-Ready Kafka Streams apps in less than 30...
Writing Blazing Fast, and Production-Ready Kafka Streams apps in less than 30...
 
8 Lessons Learned from Using Kafka in 1000 Scala microservices - Scale by the...
8 Lessons Learned from Using Kafka in 1000 Scala microservices - Scale by the...8 Lessons Learned from Using Kafka in 1000 Scala microservices - Scale by the...
8 Lessons Learned from Using Kafka in 1000 Scala microservices - Scale by the...
 
Big Data Redis Mongodb Dynamodb Sharding
Big Data Redis Mongodb Dynamodb ShardingBig Data Redis Mongodb Dynamodb Sharding
Big Data Redis Mongodb Dynamodb Sharding
 
analytic engine - a common big data computation service on the aws
analytic engine - a common big data computation service on the awsanalytic engine - a common big data computation service on the aws
analytic engine - a common big data computation service on the aws
 
Kafka Tutorial: Advanced Producers
Kafka Tutorial: Advanced ProducersKafka Tutorial: Advanced Producers
Kafka Tutorial: Advanced Producers
 

Similar to Designing Fault Tolerant Microservices

Fault Tolerance in Distributed Environment
Fault Tolerance in Distributed EnvironmentFault Tolerance in Distributed Environment
Fault Tolerance in Distributed Environment
Orkhan Gasimov
 
Resilience Planning & How the Empire Strikes Back
Resilience Planning & How the Empire Strikes BackResilience Planning & How the Empire Strikes Back
Resilience Planning & How the Empire Strikes Back
C4Media
 
Reactive solutions using java 9 and spring reactor
Reactive solutions using java 9 and spring reactorReactive solutions using java 9 and spring reactor
Reactive solutions using java 9 and spring reactor
OrenEzer1
 
Reactive Programming in Java 8 with Rx-Java
Reactive Programming in Java 8 with Rx-JavaReactive Programming in Java 8 with Rx-Java
Reactive Programming in Java 8 with Rx-Java
Kasun Indrasiri
 
Patterns of Distributed Application Design
Patterns of Distributed Application DesignPatterns of Distributed Application Design
Patterns of Distributed Application Design
GlobalLogic Ukraine
 
Failover-Apachecon-Asia-2022.pptx
Failover-Apachecon-Asia-2022.pptxFailover-Apachecon-Asia-2022.pptx
Failover-Apachecon-Asia-2022.pptx
DavidKjerrumgaard1
 
AAI-2013 Preparing to Fail: Practical WebSphere Application Server High Avail...
AAI-2013 Preparing to Fail: Practical WebSphere Application Server High Avail...AAI-2013 Preparing to Fail: Practical WebSphere Application Server High Avail...
AAI-2013 Preparing to Fail: Practical WebSphere Application Server High Avail...
WASdev Community
 
Designing Fault Tolerant APIs to keep Application Network Intact | MuleSoft M...
Designing Fault Tolerant APIs to keep Application Network Intact | MuleSoft M...Designing Fault Tolerant APIs to keep Application Network Intact | MuleSoft M...
Designing Fault Tolerant APIs to keep Application Network Intact | MuleSoft M...
MysoreMuleSoftMeetup
 
Patterns of Distributed Application Design
Patterns of Distributed Application DesignPatterns of Distributed Application Design
Patterns of Distributed Application Design
Orkhan Gasimov
 
Client Drivers and Cassandra, the Right Way
Client Drivers and Cassandra, the Right WayClient Drivers and Cassandra, the Right Way
Client Drivers and Cassandra, the Right Way
DataStax Academy
 
Metrics Are Not Enough: Monitoring Apache Kafka and Streaming Applications
Metrics Are Not Enough: Monitoring Apache Kafka and Streaming ApplicationsMetrics Are Not Enough: Monitoring Apache Kafka and Streaming Applications
Metrics Are Not Enough: Monitoring Apache Kafka and Streaming Applications
confluent
 
(APP307) Leverage the Cloud with a Blue/Green Deployment Architecture | AWS r...
(APP307) Leverage the Cloud with a Blue/Green Deployment Architecture | AWS r...(APP307) Leverage the Cloud with a Blue/Green Deployment Architecture | AWS r...
(APP307) Leverage the Cloud with a Blue/Green Deployment Architecture | AWS r...
Amazon Web Services
 
Using Nagios with Chef
Using Nagios with ChefUsing Nagios with Chef
Using Nagios with Chef
Bryan McLellan
 
Embracing Failure - Fault Injection and Service Resilience at Netflix
Embracing Failure - Fault Injection and Service Resilience at NetflixEmbracing Failure - Fault Injection and Service Resilience at Netflix
Embracing Failure - Fault Injection and Service Resilience at Netflix
Josh Evans
 
Production Readiness Strategies in an Automated World
Production Readiness Strategies in an Automated WorldProduction Readiness Strategies in an Automated World
Production Readiness Strategies in an Automated World
Sean Chittenden
 
Net flix embracingfailure re-invent2014-141113085858-conversion-gate02
Net flix embracingfailure re-invent2014-141113085858-conversion-gate02Net flix embracingfailure re-invent2014-141113085858-conversion-gate02
Net flix embracingfailure re-invent2014-141113085858-conversion-gate02
~Eric Principe
 
An Introduction to Reactive Application, Reactive Streams, and options for JVM
An Introduction to Reactive Application, Reactive Streams, and options for JVMAn Introduction to Reactive Application, Reactive Streams, and options for JVM
An Introduction to Reactive Application, Reactive Streams, and options for JVM
Steve Pember
 
Expect the unexpected: Prepare for failures in microservices
Expect the unexpected: Prepare for failures in microservicesExpect the unexpected: Prepare for failures in microservices
Expect the unexpected: Prepare for failures in microservices
Bhakti Mehta
 
Deploying at will - SEI
 Deploying at will - SEI Deploying at will - SEI
(PFC305) Embracing Failure: Fault-Injection and Service Reliability | AWS re:...
(PFC305) Embracing Failure: Fault-Injection and Service Reliability | AWS re:...(PFC305) Embracing Failure: Fault-Injection and Service Reliability | AWS re:...
(PFC305) Embracing Failure: Fault-Injection and Service Reliability | AWS re:...
Amazon Web Services
 

Similar to Designing Fault Tolerant Microservices (20)

Fault Tolerance in Distributed Environment
Fault Tolerance in Distributed EnvironmentFault Tolerance in Distributed Environment
Fault Tolerance in Distributed Environment
 
Resilience Planning & How the Empire Strikes Back
Resilience Planning & How the Empire Strikes BackResilience Planning & How the Empire Strikes Back
Resilience Planning & How the Empire Strikes Back
 
Reactive solutions using java 9 and spring reactor
Reactive solutions using java 9 and spring reactorReactive solutions using java 9 and spring reactor
Reactive solutions using java 9 and spring reactor
 
Reactive Programming in Java 8 with Rx-Java
Reactive Programming in Java 8 with Rx-JavaReactive Programming in Java 8 with Rx-Java
Reactive Programming in Java 8 with Rx-Java
 
Patterns of Distributed Application Design
Patterns of Distributed Application DesignPatterns of Distributed Application Design
Patterns of Distributed Application Design
 
Failover-Apachecon-Asia-2022.pptx
Failover-Apachecon-Asia-2022.pptxFailover-Apachecon-Asia-2022.pptx
Failover-Apachecon-Asia-2022.pptx
 
AAI-2013 Preparing to Fail: Practical WebSphere Application Server High Avail...
AAI-2013 Preparing to Fail: Practical WebSphere Application Server High Avail...AAI-2013 Preparing to Fail: Practical WebSphere Application Server High Avail...
AAI-2013 Preparing to Fail: Practical WebSphere Application Server High Avail...
 
Designing Fault Tolerant APIs to keep Application Network Intact | MuleSoft M...
Designing Fault Tolerant APIs to keep Application Network Intact | MuleSoft M...Designing Fault Tolerant APIs to keep Application Network Intact | MuleSoft M...
Designing Fault Tolerant APIs to keep Application Network Intact | MuleSoft M...
 
Patterns of Distributed Application Design
Patterns of Distributed Application DesignPatterns of Distributed Application Design
Patterns of Distributed Application Design
 
Client Drivers and Cassandra, the Right Way
Client Drivers and Cassandra, the Right WayClient Drivers and Cassandra, the Right Way
Client Drivers and Cassandra, the Right Way
 
Metrics Are Not Enough: Monitoring Apache Kafka and Streaming Applications
Metrics Are Not Enough: Monitoring Apache Kafka and Streaming ApplicationsMetrics Are Not Enough: Monitoring Apache Kafka and Streaming Applications
Metrics Are Not Enough: Monitoring Apache Kafka and Streaming Applications
 
(APP307) Leverage the Cloud with a Blue/Green Deployment Architecture | AWS r...
(APP307) Leverage the Cloud with a Blue/Green Deployment Architecture | AWS r...(APP307) Leverage the Cloud with a Blue/Green Deployment Architecture | AWS r...
(APP307) Leverage the Cloud with a Blue/Green Deployment Architecture | AWS r...
 
Using Nagios with Chef
Using Nagios with ChefUsing Nagios with Chef
Using Nagios with Chef
 
Embracing Failure - Fault Injection and Service Resilience at Netflix
Embracing Failure - Fault Injection and Service Resilience at NetflixEmbracing Failure - Fault Injection and Service Resilience at Netflix
Embracing Failure - Fault Injection and Service Resilience at Netflix
 
Production Readiness Strategies in an Automated World
Production Readiness Strategies in an Automated WorldProduction Readiness Strategies in an Automated World
Production Readiness Strategies in an Automated World
 
Net flix embracingfailure re-invent2014-141113085858-conversion-gate02
Net flix embracingfailure re-invent2014-141113085858-conversion-gate02Net flix embracingfailure re-invent2014-141113085858-conversion-gate02
Net flix embracingfailure re-invent2014-141113085858-conversion-gate02
 
An Introduction to Reactive Application, Reactive Streams, and options for JVM
An Introduction to Reactive Application, Reactive Streams, and options for JVMAn Introduction to Reactive Application, Reactive Streams, and options for JVM
An Introduction to Reactive Application, Reactive Streams, and options for JVM
 
Expect the unexpected: Prepare for failures in microservices
Expect the unexpected: Prepare for failures in microservicesExpect the unexpected: Prepare for failures in microservices
Expect the unexpected: Prepare for failures in microservices
 
Deploying at will - SEI
 Deploying at will - SEI Deploying at will - SEI
Deploying at will - SEI
 
(PFC305) Embracing Failure: Fault-Injection and Service Reliability | AWS re:...
(PFC305) Embracing Failure: Fault-Injection and Service Reliability | AWS re:...(PFC305) Embracing Failure: Fault-Injection and Service Reliability | AWS re:...
(PFC305) Embracing Failure: Fault-Injection and Service Reliability | AWS re:...
 

More from Orkhan Gasimov

Complex Application Design
Complex Application DesignComplex Application Design
Complex Application Design
Orkhan Gasimov
 
Digital Transformation - Why? How? What?
Digital Transformation - Why? How? What?Digital Transformation - Why? How? What?
Digital Transformation - Why? How? What?
Orkhan Gasimov
 
Service Mesh - Why? How? What?
Service Mesh - Why? How? What?Service Mesh - Why? How? What?
Service Mesh - Why? How? What?
Orkhan Gasimov
 
Angular Web Components
Angular Web ComponentsAngular Web Components
Angular Web Components
Orkhan Gasimov
 
Vert.x - Reactive & Distributed [Devoxx version]
Vert.x - Reactive & Distributed [Devoxx version]Vert.x - Reactive & Distributed [Devoxx version]
Vert.x - Reactive & Distributed [Devoxx version]
Orkhan Gasimov
 
Vertx - Reactive & Distributed
Vertx - Reactive & DistributedVertx - Reactive & Distributed
Vertx - Reactive & Distributed
Orkhan Gasimov
 
Refactoring Monolith to Microservices
Refactoring Monolith to MicroservicesRefactoring Monolith to Microservices
Refactoring Monolith to Microservices
Orkhan Gasimov
 
Angular or React
Angular or ReactAngular or React
Angular or React
Orkhan Gasimov
 
Secured REST Microservices with Spring Cloud
Secured REST Microservices with Spring CloudSecured REST Microservices with Spring Cloud
Secured REST Microservices with Spring Cloud
Orkhan Gasimov
 
Data Microservices with Spring Cloud
Data Microservices with Spring CloudData Microservices with Spring Cloud
Data Microservices with Spring Cloud
Orkhan Gasimov
 
Spring Cloud: Why? How? What?
Spring Cloud: Why? How? What?Spring Cloud: Why? How? What?
Spring Cloud: Why? How? What?
Orkhan Gasimov
 

More from Orkhan Gasimov (11)

Complex Application Design
Complex Application DesignComplex Application Design
Complex Application Design
 
Digital Transformation - Why? How? What?
Digital Transformation - Why? How? What?Digital Transformation - Why? How? What?
Digital Transformation - Why? How? What?
 
Service Mesh - Why? How? What?
Service Mesh - Why? How? What?Service Mesh - Why? How? What?
Service Mesh - Why? How? What?
 
Angular Web Components
Angular Web ComponentsAngular Web Components
Angular Web Components
 
Vert.x - Reactive & Distributed [Devoxx version]
Vert.x - Reactive & Distributed [Devoxx version]Vert.x - Reactive & Distributed [Devoxx version]
Vert.x - Reactive & Distributed [Devoxx version]
 
Vertx - Reactive & Distributed
Vertx - Reactive & DistributedVertx - Reactive & Distributed
Vertx - Reactive & Distributed
 
Refactoring Monolith to Microservices
Refactoring Monolith to MicroservicesRefactoring Monolith to Microservices
Refactoring Monolith to Microservices
 
Angular or React
Angular or ReactAngular or React
Angular or React
 
Secured REST Microservices with Spring Cloud
Secured REST Microservices with Spring CloudSecured REST Microservices with Spring Cloud
Secured REST Microservices with Spring Cloud
 
Data Microservices with Spring Cloud
Data Microservices with Spring CloudData Microservices with Spring Cloud
Data Microservices with Spring Cloud
 
Spring Cloud: Why? How? What?
Spring Cloud: Why? How? What?Spring Cloud: Why? How? What?
Spring Cloud: Why? How? What?
 

Recently uploaded

National Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practicesNational Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practices
Quotidiano Piemontese
 
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptx
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptxSecstrike : Reverse Engineering & Pwnable tools for CTF.pptx
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptx
nkrafacyberclub
 
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
Neo4j
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
James Anderson
 
20240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 202420240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 2024
Matthew Sinclair
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
Safe Software
 
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Albert Hoitingh
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
DanBrown980551
 
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
James Anderson
 
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
SOFTTECHHUB
 
Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
Ana-Maria Mihalceanu
 
20240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 202420240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 2024
Matthew Sinclair
 
How to Get CNIC Information System with Paksim Ga.pptx
How to Get CNIC Information System with Paksim Ga.pptxHow to Get CNIC Information System with Paksim Ga.pptx
How to Get CNIC Information System with Paksim Ga.pptx
danishmna97
 
Pushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 daysPushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 days
Adtran
 
Video Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the FutureVideo Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the Future
Alpen-Adria-Universität
 
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex ProofszkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
Alex Pruden
 
Elizabeth Buie - Older adults: Are we really designing for our future selves?
Elizabeth Buie - Older adults: Are we really designing for our future selves?Elizabeth Buie - Older adults: Are we really designing for our future selves?
Elizabeth Buie - Older adults: Are we really designing for our future selves?
Nexer Digital
 
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
Neo4j
 
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfObservability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Paige Cruz
 
Removing Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software FuzzingRemoving Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software Fuzzing
Aftab Hussain
 

Recently uploaded (20)

National Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practicesNational Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practices
 
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptx
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptxSecstrike : Reverse Engineering & Pwnable tools for CTF.pptx
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptx
 
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
 
20240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 202420240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 2024
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
 
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
 
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
 
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
 
Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
 
20240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 202420240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 2024
 
How to Get CNIC Information System with Paksim Ga.pptx
How to Get CNIC Information System with Paksim Ga.pptxHow to Get CNIC Information System with Paksim Ga.pptx
How to Get CNIC Information System with Paksim Ga.pptx
 
Pushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 daysPushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 days
 
Video Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the FutureVideo Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the Future
 
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex ProofszkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
 
Elizabeth Buie - Older adults: Are we really designing for our future selves?
Elizabeth Buie - Older adults: Are we really designing for our future selves?Elizabeth Buie - Older adults: Are we really designing for our future selves?
Elizabeth Buie - Older adults: Are we really designing for our future selves?
 
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
 
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfObservability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
 
Removing Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software FuzzingRemoving Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software Fuzzing
 

Designing Fault Tolerant Microservices

Editor's Notes

  1. Network – fails and heals; High load – rush hours; PRC mechanics – algorithms, code;
  2. Up-scale and down-scale services on demand.
  3. Up-scale and down-scale services on demand.
  4. 1. Some executors may perform faster, while others may be overloaded. 2. More chances to get an acceptable response (in a timely manner).
  5. 1. Some executors may perform faster, while others may be overloaded. 2. More chances to get an acceptable response (in a timely manner).
  6. 1. Some executors may perform faster, while others may be overloaded. 2. More chances to get an acceptable response (in a timely manner).
  7. 1. Recover from errors if possible (more attempts, longer processing). 2. Sequential or asynchronous services, layered over the network. 3. Sequentially layered methods.
  8. 1. Caller may stream to first block; 2. On error, processing is passed to next block; 3. The first acceptable result is sent to Receiver;
  9. 1. Some executors may perform faster, while others may be overloaded. 2. More chances to get an acceptable response (in a timely manner).
  10. Prefer stateless processing, keep state in an external storage.