Service Resiliency in Microservices
Afkham Azeez, Vice President, WSO2
November 2018
Resilience
“Ability to return to the original form or position after being affected by a particular alteration.”
“Ability to recover from illness, depression, adversity or the like.”
Source: Random House Kernerman Webster’s College Dictionary
Why should I bother?
Because real systems need to run in production!
Availability is critical in production
A = MTTF / (MTTF + MTTR)
A = E[Uptime] / (E[Uptime] + E[Downtime])
(MTTF: mean time to failure; MTTR: mean time to recovery)
Traditional approach to improving availability
A = MTTF / (MTTF + MTTR)
Focus: maximize MTTF, i.e. try to prevent failures
Resilience approach to improving availability
A = MTTF / (MTTF + MTTR)
Focus: minimize MTTR, i.e. recover from failures quickly
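A small worked example may make the two approaches easier to compare. The sketch below is not from the talk: the function name and the figures (1000 h MTTF, 10 h MTTR) are illustrative assumptions. It only evaluates A = MTTF / (MTTF + MTTR).

```python
def availability(mttf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability A = MTTF / (MTTF + MTTR)."""
    if mttf_hours < 0 or mttr_hours < 0 or mttf_hours + mttr_hours == 0:
        raise ValueError("MTTF and MTTR must be non-negative and not both zero")
    return mttf_hours / (mttf_hours + mttr_hours)


# Hypothetical figures, chosen only for illustration.
baseline = availability(1000, 10)      # fails every 1000 h, 10 h to recover
longer_mttf = availability(2000, 10)   # traditional: fail half as often
shorter_mttr = availability(1000, 5)   # resilience: recover twice as fast

print(f"baseline      {baseline:.4f}")
print(f"double MTTF   {longer_mttf:.4f}")
print(f"halve MTTR    {shorter_mttr:.4f}")
```

Doubling MTTF and halving MTTR yield the same availability, since 2000 / 2010 = 1000 / 1005. The formula does not say which is cheaper to achieve; the talk's argument is that failures cannot be avoided, so shortening recovery is the lever to invest in.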
Failures are a fact of life.
Don’t avoid failures. Embrace them.
Resilience
○ The ability of an app to recover from certain types of failure yet remain
functional from the users’ perspective
○ Users don’t notice it
○ Graceful degradation
○ Resilience is how you achieve the outcome
○ Also called recoverability
Features of resilient systems
○ Failures are compartmentalized
○ A resilient system automatically cuts off failing components and reintegrates them once they are no longer failing
Microservice architecture (MSA) and resilience
MSA inherently helps with designing resilient systems
Techniques
○ Timeout
○ Retry
○ Failover
○ Load balancing
○ Circuit breaker
○ Transactions
Don’t hide the network
○ Calls over the network should always be able to return an error in addition to the response, and the caller should handle both
○ Network calls should be easily distinguishable
var backendRes = backendClientEP->forward("/hello", request);
match backendRes {
    http:Response res => {
        ...
    }
    error responseError => {
        log:printError("Error sending response", err = responseError);
    }
}
Timeout
○ Shield upstream callers from downstream latency and stay responsive
○ Prevent waiting forever
Timeout Sample
endpoint http:Client backendClientEP {
    url: "http://localhost:8080",
    timeoutMillis: 2000
};
var backendRes = backendClientEP->forward("/hello", request);
match backendRes {
    http:Response res => {
        ...
    }
    error responseError => {
        // On a timeout or any other error, fall back to a cached response
        string resp = cache.get("hello");
        ...
    }
}
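The same idea, a bounded wait followed by a cached fallback, can be sketched outside Ballerina. This is a minimal Python illustration under stated assumptions: `call_with_timeout` and the `fallback` callable are hypothetical names, not part of the talk or of any HTTP library.

```python
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout

# Shared worker pool for backend calls (size chosen arbitrarily for the sketch).
_executor = ThreadPoolExecutor(max_workers=4)


def call_with_timeout(fn, timeout_s, fallback):
    """Run fn() but wait at most timeout_s seconds; otherwise use fallback().

    Note: the abandoned call keeps running in its worker thread. A real client
    would also cancel the underlying network request.
    """
    future = _executor.submit(fn)
    try:
        return future.result(timeout=timeout_s)
    except FutureTimeout:
        return fallback()   # backend too slow: degrade gracefully
    except Exception:
        return fallback()   # backend failed outright: same degraded answer
```

For example, `call_with_timeout(fetch_hello, 2.0, lambda: cache["hello"])` would mirror the 2000 ms timeout above, with `fetch_hello` and `cache` standing in for the backend call and a local cache.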
Demo
Retry
○ Transient failures are not uncommon
○ In such cases, a simple retry is often sufficient
○ Can be handled by
    ○ Retrying immediately
    ○ Retrying with a delay
○ IMPORTANT: Idempotency should be handled at the application layer
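One common way to handle idempotency at the application layer is an idempotency key: the caller attaches the same key to every retry of one logical request, and the service remembers the result per key. The sketch below is an assumption-laden illustration, not something shown in the talk; `PaymentService`, `charge`, and the in-memory store are all hypothetical.

```python
class PaymentService:
    """Toy service that makes 'charge' safe to retry via an idempotency key."""

    def __init__(self):
        self._results = {}   # idempotency key -> receipt already issued
        self.charges = 0     # how many times money was actually taken

    def charge(self, idempotency_key, amount):
        # A retried request carries the same key, so replay the stored result
        # instead of charging a second time.
        if idempotency_key in self._results:
            return self._results[idempotency_key]
        self.charges += 1
        receipt = {"charge_no": self.charges, "amount": amount}
        self._results[idempotency_key] = receipt
        return receipt
```

A production version would keep the key store in durable, shared storage and expire old keys; the in-memory dictionary here only shows the control flow.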
Retry Sample
endpoint http:Client backendClientEP {
    url: "http://localhost:8080",
    retryConfig: {
        interval: 3000,
        count: 3,
        backOffFactor: 2
    },
    timeoutMillis: 2000
};
Retry after 3000, 6000 & 12000 ms
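The delay schedule follows from the three settings: each retry waits `interval` multiplied by `backOffFactor` once more than the previous one. A minimal Python sketch of that arithmetic and of a retry loop built on it follows; the function names are hypothetical, and the reading of `count` as "number of retries after the first attempt" is an assumption taken from the schedule stated on the slide.

```python
import time


def backoff_delays(interval_ms, count, backoff_factor):
    """Delay before each retry: interval, interval*factor, interval*factor^2, ..."""
    return [interval_ms * backoff_factor ** attempt for attempt in range(count)]


def call_with_retry(fn, interval_ms=3000, count=3, backoff_factor=2, sleep=time.sleep):
    """Call fn(); on failure retry up to `count` times with growing delays."""
    try:
        return fn()                      # first attempt, no delay
    except Exception as err:
        last_error = err
    for delay_ms in backoff_delays(interval_ms, count, backoff_factor):
        sleep(delay_ms / 1000)           # sleep() takes seconds
        try:
            return fn()
        except Exception as err:
            last_error = err
    raise last_error                     # every attempt failed


print(backoff_delays(3000, 3, 2))        # [3000, 6000, 12000]
```

The `sleep` parameter is injectable so that the loop can be exercised without actually waiting.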
Demo
Failover
Switch to an alternate endpoint when the primary fails
Failover Sample
endpoint http:FailoverClient foBackendEP {
    timeoutMillis: 5000,
    failoverCodes: [501, 502, 503],
    intervalMillis: 5000,
    targets: [
        { url: "http://localhost:3000/mock1" },
        { url: "http://localhost:8080/echo" },
        { url: "http://localhost:8080/mock" }
    ]
};
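What such a client does with its target list can be sketched in a few lines. This Python version is an illustration only: `call_with_failover` and the `send` callable (which returns a status code and body, or raises `ConnectionError`) are hypothetical, treating a connection error like a failover status code is an assumption, and the pause between targets (`intervalMillis`) is left out.

```python
def call_with_failover(targets, send, failover_codes=(501, 502, 503)):
    """Try each target in order; move on when it is unreachable or answers
    with one of the failover status codes. Returns (url, status, body)."""
    last_error = None
    for url in targets:
        try:
            status, body = send(url)
        except ConnectionError as err:
            last_error = err             # unreachable: try the next target
            continue
        if status in failover_codes:
            last_error = RuntimeError(f"{url} answered {status}")
            continue                     # designated failure code: try the next
        return url, status, body         # first acceptable answer wins
    raise RuntimeError("all targets failed") from last_error
```

Statuses outside `failover_codes`, including other errors such as 404, are returned to the caller unchanged in this sketch.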
Demo
Load Balancing
Load balance across endpoints
Load Balancing Sample
endpoint http:LoadBalanceClient lbBackendEP {
    targets: [
        { url: "http://localhost:8080/mock1" },
        { url: "http://localhost:8080/mock2" },
        { url: "http://localhost:8080/mock3" }
    ],
    algorithm: http:ROUND_ROBIN,
    timeoutMillis: 5000
};
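Round robin itself is simple: hand out the targets in order and wrap around at the end. A minimal Python sketch, with a hypothetical `RoundRobinBalancer` class that ignores health checks and concurrency:

```python
import itertools


class RoundRobinBalancer:
    """Cycle through a fixed list of targets, one per request."""

    def __init__(self, targets):
        if not targets:
            raise ValueError("at least one target is required")
        self._cycle = itertools.cycle(list(targets))

    def next_target(self):
        return next(self._cycle)


lb = RoundRobinBalancer([
    "http://localhost:8080/mock1",
    "http://localhost:8080/mock2",
    "http://localhost:8080/mock3",
])
picks = [lb.next_target() for _ in range(4)]
# picks ends with mock1 again: the fourth request wraps around to the start
```

In practice load balancing is usually combined with failover, so that a failing target is skipped rather than receiving its share of requests.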
Demo
Circuit breaker
○ Some transient failures take much longer to recover from
○ Repeatedly retrying may hinder recovery
○ Retry up to a limit, then cut off the failing component
Circuit Breaker
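[State diagram]
States: CLOSED, OPEN, HALF OPEN
CLOSED -> OPEN: failure threshold exceeded
OPEN -> HALF OPEN: delay
HALF OPEN -> CLOSED: success
HALF OPEN -> OPEN: failure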
Circuit Breaker Sample
endpoint http:Client backendClientEP {
    url: "http://localhost:8080",
    circuitBreaker: {
        rollingWindow: {
            timeWindowMillis: 16000,
            bucketSizeMillis: 2000
        },
        failureThreshold: 0.2,
        resetTimeMillis: 10000,
        statusCodes: [400, 404, 500]
    },
    timeoutMillis: 2000
};
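The three states and the settings above can be tied together in a small state machine. The Python sketch below is a simplified illustration, not Ballerina's implementation: the class and its names are hypothetical, failures are signalled by exceptions rather than by status codes, and the rolling window is a plain list of timestamped outcomes instead of fixed-size buckets.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker rejects a call without contacting the backend."""


class CircuitBreaker:
    def __init__(self, failure_threshold=0.2, window_s=16.0, reset_s=10.0,
                 clock=time.monotonic):
        self.failure_threshold = failure_threshold  # failure ratio that trips it
        self.window_s = window_s                    # rolling window length
        self.reset_s = reset_s                      # time to stay OPEN
        self.clock = clock
        self.state = "CLOSED"
        self.opened_at = 0.0
        self.outcomes = []                          # (timestamp, succeeded)

    def call(self, fn):
        now = self.clock()
        if self.state == "OPEN":
            if now - self.opened_at < self.reset_s:
                raise CircuitOpenError("circuit is open; failing fast")
            self.state = "HALF_OPEN"                # delay elapsed: one trial call
        try:
            result = fn()
        except Exception:
            self._record(now, False)
            raise
        self._record(now, True)
        return result

    def _record(self, now, succeeded):
        if self.state == "HALF_OPEN":
            # The trial call alone decides: close on success, reopen on failure.
            self.outcomes = []
            if succeeded:
                self.state = "CLOSED"
            else:
                self.state, self.opened_at = "OPEN", now
            return
        # CLOSED: keep only outcomes inside the rolling window, then check ratio.
        self.outcomes = [(t, ok) for t, ok in self.outcomes if now - t < self.window_s]
        self.outcomes.append((now, succeeded))
        failures = sum(1 for _, ok in self.outcomes if not ok)
        if failures / len(self.outcomes) > self.failure_threshold:
            self.state, self.opened_at = "OPEN", now
```

While the breaker is OPEN the backend receives no traffic at all, which is what gives a struggling service room to recover.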
Demo
Transactions
[Diagram: microservices µ1 to µ6 and S1 to S3, labelled as the Initiator and the Participants of a transaction]
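2PC Coordination
Completion protocol (Initiator and Coordinator)
Initiator -> Coordinator: Create-Context()
Coordinator -> Initiator: Micro-Transaction-Context
alt
    Initiator -> Coordinator: Commit()
    Coordinator -> Initiator: Committed | Aborted | Mixed
or
    Initiator -> Coordinator: Abort()
    Coordinator -> Initiator: Aborted | Mixed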
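2PC Coordination
Durable protocol - triggered by commit (Coordinator and Participant)
Coordinator -> Participant: Prepare()
Participant -> Coordinator: Prepared | Read-Only | Aborted | Committed
success:
    Coordinator -> Participant: notify(...Commit...)
    Participant -> Coordinator: Committed
otherwise:
    Coordinator -> Participant: notify(...Abort...)
    Participant -> Coordinator: Aborted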
2PC Coordination
Durable protocol - triggered by abort (Coordinator and Participant)
Coordinator -> Participant: notify(...Abort...)
Participant -> Coordinator: Aborted
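The coordinator's decision rule in the durable protocol can be sketched as follows. This is a simplified Python illustration with hypothetical names (`Participant`, `two_phase_commit`): it models only the Prepared, Read-Only, and Aborted votes, assumes every notification is delivered, and therefore never produces the Mixed outcome.

```python
class Participant:
    """Toy participant whose vote in the prepare phase is fixed up front."""

    def __init__(self, name, vote="prepared"):
        self.name = name
        self.vote = vote          # "prepared", "read-only" or "aborted"
        self.state = "active"

    def prepare(self):
        if self.vote == "aborted":
            self.state = "aborted"
        return self.vote

    def notify(self, decision):   # decision is "commit" or "abort"
        self.state = "committed" if decision == "commit" else "aborted"
        return self.state


def two_phase_commit(participants):
    # Phase 1: collect a vote from every participant.
    votes = {p.name: p.prepare() for p in participants}
    decision = "abort" if "aborted" in votes.values() else "commit"
    # Phase 2: tell every participant that voted "prepared" the outcome.
    # In this sketch, read-only participants need no second-phase message.
    for p in participants:
        if votes[p.name] == "prepared":
            p.notify(decision)
    return "committed" if decision == "commit" else "aborted"
```

A single Aborted vote is enough to abort the whole transaction; a commit requires every participant to have voted Prepared or Read-Only.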
Transactions Sample - initiator
transaction with retries = 2 {
    // Calling a local participant
    localParticipantDoSomething();
    // Calling a remote participant
    http:Response res = check participant->get("/ParticipantService/hi");
} onretry {
    // Code here will execute before retrying
} committed {
    // Code here will execute after the initiated transaction
    // has committed
} aborted {
    // Code here will execute after the initiated transaction
    // has been aborted
}
Transactions Sample - participant
service<http:Service> ParticipantService bind listener {
    @transactions:participant {
        oncommit = onTxnCommit,
        onabort = onTxnAbort
    }
    hi(endpoint caller, http:Request req) {
        ...
    }
}
...
function onTxnCommit(string transactionId) {
    ...
}
function onTxnAbort(string transactionId) {
    ...
}
Demo
https://github.com/afkham/ballerinacon2018_resiliency
Q & A
THANK YOU
