SlideShare a Scribd company logo
1 of 19
Download to read offline
Service Mesh vs. Frameworks:
Where to put the resilience?
Michael Hofmann
https://hofmann-itconsulting.de
(1) Distributed Systems and Resilience
(2) Framework
(3) Service Mesh
(4) Framework and Service Mesh Characteristics
(5) Thoughts about Resilience
(6) Essential Requirements
(7) Conclusion
Agenda
Distributed Systems
➔ degree of distribution raises failure rate!
➔ compensation strategy: resilience!
slow response
timeout
aborted network connection
...
Typical Communication Errors Fallacies of Distributed Computing
The network is reliable.
Latency is zero.
Bandwidth is infinite.
The network is secure.
Topology doesn't change.
There is one administrator.
Transport cost is zero.
The network is homogeneous.
Hystrix
Alternative: Service Mesh?!
Resilience
Resilience4J
Failsafe
MicroProfile Fault Tolerance
…
Framework „DEATH“ Framework ACTIVE
Resilience Patterns
─
Timeout
─
Retry
─
Fallback
─
Circuit Breaker
─
Bulkhead
many more:
Uwe Friedrichsen: “Patterns of resilience” https://www.slideshare.net/ufried/patterns-of-resilience
@CircuitBreaker(successThreshold = 10,
requestVolumeThreshold = 4, failureRatio=0.5, delay = 1000)
public Connection serviceA() {
Connection conn = null;
counterForInvokingServiceA++;
conn = connectionService();
return conn;
}
MicroProfile Fault Tolerance
@Retry(maxRetries = 3)
@Fallback(fallbackMethod = "doFallback")
public Result doWork() {
return callServiceA(); // fallback on RuntimeException
}
private Result doFallback() {
return ...;
}
Service Mesh
The term service mesh is used to describe the
network of microservices that make up such
application and the interactions between them.
(istio.io)
Don’t manage a Service Mesh without tooling!
Requirements:
(1) manage calls on layer 7 (application layer, L7)
(2) resilience, routing, security and telemetry
(3) decentralized & transparent for services (implementation independent)
Istio Architecture
Resilience Patterns in Istio
✔
Timeout
✔
Retry
✔
CircuitBreaker
✔
Bulkhead
✗
Fallback?
✗
is a Fallback possible?
✗
less technical, more business driven
https://dzone.com/articles/fallbacks-are-overrated-architecting-for-resilienc
Resilience in Istio
$ kubectl apply -f - <<EOF
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
name: reviews
spec:
hosts:
- reviews
http:
- route:
- destination:
host: reviews
subset: v1
retries:
attempts: 3
perTryTimeout: 2s
EOF
$ kubectl apply -f - <<EOF
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
name: reviews
spec:
host: reviews
trafficPolicy:
connectionPool:
tcp:
maxConnections: 1
http:
http1MaxPendingRequests: 1
maxRequestsPerConnection: 1
outlierDetection:
consecutiveErrors: 1
interval: 1s
baseEjectionTime: 3m
maxEjectionPercent: 100
EOF
Resilience in Istio
Apply to sidecar
Resilience rules
— transparent for service
— act global on all sidecars
Fault Injection
MicroProfile with Istio setting
apiVersion:networking.istio.io/
v1alpha3
kind: VirtualService
metadata:
name: ratings
...
spec:
hosts:
- ratings
http:
- fault:
delay:
fixedDelay: 7s
percent: 100
MP_Fault_Tolerance_NonFallback_Enabled = false
Frameworks Characteristics
—
Java: a lot of different frameworks
—
Team decides framework?!?
—
Learning curve for every framework
—
Different frameworks behave different
—
Same framework in different version behave different
—
Same framework in different versions parallel in use
Frameworks Characteristics
➔ Change of framework:
➔ Replace all positions in code
➔ New behavior
➔ New deployment
➔ New tests
➔ Risk of chain reaction:
framework ➔ load balancing ➔ service registry
➔ Multiple service registries for every different framework?
Service Mesh Characteristics
—
Define new rule
—
Same behavior (… no framework change)
—
unchanged deployed service
—
new tests only for new rules
—
Client-side load balancing in sidecar
—
Service Registry based on endpoints in K8S
$ kubectl apply -f ...
Thoughts about Resilience
Resilience pattern still correct if communication behavior changes?
—
Modified behavior of partner
—
Modified communication partner
—
Modified infrastructure
—
Load changes during day
—
Side effects from other systems
—
Anticipate problems of tomorrow?
Thoughts about Resilience
—
Main problem: choose the right resilience pattern
—
Correct parameters for pattern?
—
Measure resilience
—
Mostly: try & error for suitable pattern/params
(main reason for end of life in hystrix)
—
Often: retry storm
—
Often: missing musketeer principle
(black sheep)
Essential Requirements
—
Modification: Quick and easy change of
(1) params for chosen pattern
(2) resilience pattern
—
Test
—
Monitoring
—
No black sheep
Essential Requirements
Istio Framework
Modification
+ Modify Params
- Change Pattern: Lifecycle
Test Fault Injection complicated
Monitoring + +
Black Sheep
No:
rule in all sidecars
$ kubectl apply -f ...
Conclusion
—
Comparable resilience patterns
—
Missing fallback in service mesh (but overrated)
—
Higher flexibility in service mesh
—
Fault injection easy in service mesh
Solve problems where they arise!
Service Mesh for L4-L7
Developer for L8 (original profession)

More Related Content

What's hot

What's hot (20)

Hyperledger Fabric
Hyperledger FabricHyperledger Fabric
Hyperledger Fabric
 
Hyperledger Fabric - Blockchain for the Enterprise - FOSDEM 20190203
Hyperledger Fabric - Blockchain for the Enterprise - FOSDEM 20190203Hyperledger Fabric - Blockchain for the Enterprise - FOSDEM 20190203
Hyperledger Fabric - Blockchain for the Enterprise - FOSDEM 20190203
 
How to Overcome Network Access Control Limitations for Better Network Security
How to Overcome Network Access Control Limitations for Better Network SecurityHow to Overcome Network Access Control Limitations for Better Network Security
How to Overcome Network Access Control Limitations for Better Network Security
 
Introduction to Blockchain and Hyperledger
Introduction to Blockchain and HyperledgerIntroduction to Blockchain and Hyperledger
Introduction to Blockchain and Hyperledger
 
Authorization and Authentication in Microservice Environments
Authorization and Authentication in Microservice EnvironmentsAuthorization and Authentication in Microservice Environments
Authorization and Authentication in Microservice Environments
 
Blockchain, 
Hyperledger fabric & Hyperledger cello
Blockchain, 
Hyperledger fabric & Hyperledger celloBlockchain, 
Hyperledger fabric & Hyperledger cello
Blockchain, 
Hyperledger fabric & Hyperledger cello
 
WSO2 Identity Server - Product Overview
WSO2 Identity Server - Product OverviewWSO2 Identity Server - Product Overview
WSO2 Identity Server - Product Overview
 
The missing signalling layer for WebRTC
The missing signalling layer for WebRTCThe missing signalling layer for WebRTC
The missing signalling layer for WebRTC
 
Blockchain testing strategy
Blockchain testing strategyBlockchain testing strategy
Blockchain testing strategy
 
Hyperledger fabric 3
Hyperledger fabric 3Hyperledger fabric 3
Hyperledger fabric 3
 
Ibm blockchain - Hyperledger 15.02.18
Ibm blockchain - Hyperledger 15.02.18Ibm blockchain - Hyperledger 15.02.18
Ibm blockchain - Hyperledger 15.02.18
 
GlobalSign's Hosted OCSP for IoT PKIs
GlobalSign's Hosted OCSP for IoT PKIsGlobalSign's Hosted OCSP for IoT PKIs
GlobalSign's Hosted OCSP for IoT PKIs
 
Modeling Multi-Layer Access Control Policies of a Hyperledger-Fabric-Based Ag...
Modeling Multi-Layer Access Control Policies of a Hyperledger-Fabric-Based Ag...Modeling Multi-Layer Access Control Policies of a Hyperledger-Fabric-Based Ag...
Modeling Multi-Layer Access Control Policies of a Hyperledger-Fabric-Based Ag...
 
The DNS Tunneling Blindspot
The DNS Tunneling BlindspotThe DNS Tunneling Blindspot
The DNS Tunneling Blindspot
 
Smart Contract Testing
Smart Contract TestingSmart Contract Testing
Smart Contract Testing
 
Evolution of WAF - Stop Worrying About Vulnerabilities
Evolution of WAF - Stop Worrying About VulnerabilitiesEvolution of WAF - Stop Worrying About Vulnerabilities
Evolution of WAF - Stop Worrying About Vulnerabilities
 
Hyperledger Fabric and Tools
Hyperledger Fabric and ToolsHyperledger Fabric and Tools
Hyperledger Fabric and Tools
 
MRA AMA Part 6: Service Mesh Models
MRA AMA Part 6: Service Mesh ModelsMRA AMA Part 6: Service Mesh Models
MRA AMA Part 6: Service Mesh Models
 
Hyperleger Composer Architecure Deep Dive
Hyperleger Composer Architecure Deep DiveHyperleger Composer Architecure Deep Dive
Hyperleger Composer Architecure Deep Dive
 
Introduction of Hyperledger Fabric & Composer
Introduction of Hyperledger Fabric & Composer Introduction of Hyperledger Fabric & Composer
Introduction of Hyperledger Fabric & Composer
 

Similar to Service Mesh vs. Frameworks: Where to put the resilience?

Three Degrees of Mediation: Challenges and Lessons in building Cloud-agnostic...
Three Degrees of Mediation: Challenges and Lessons in building Cloud-agnostic...Three Degrees of Mediation: Challenges and Lessons in building Cloud-agnostic...
Three Degrees of Mediation: Challenges and Lessons in building Cloud-agnostic...
Alex Maclinovsky
 
Performance testing virtualized systems v5
Performance testing virtualized systems v5Performance testing virtualized systems v5
Performance testing virtualized systems v5
Mentora
 

Similar to Service Mesh vs. Frameworks: Where to put the resilience? (20)

Planning to Fail #phpuk13
Planning to Fail #phpuk13Planning to Fail #phpuk13
Planning to Fail #phpuk13
 
Big Data Streams Architectures. Why? What? How?
Big Data Streams Architectures. Why? What? How?Big Data Streams Architectures. Why? What? How?
Big Data Streams Architectures. Why? What? How?
 
Architecting for failure - Why are distributed systems hard?
Architecting for failure - Why are distributed systems hard?Architecting for failure - Why are distributed systems hard?
Architecting for failure - Why are distributed systems hard?
 
Clipper: A Low-Latency Online Prediction Serving System: Spark Summit East ta...
Clipper: A Low-Latency Online Prediction Serving System: Spark Summit East ta...Clipper: A Low-Latency Online Prediction Serving System: Spark Summit East ta...
Clipper: A Low-Latency Online Prediction Serving System: Spark Summit East ta...
 
Architecture of Falcon, a new chat messaging backend system build on Scala
Architecture of Falcon,  a new chat messaging backend system  build on ScalaArchitecture of Falcon,  a new chat messaging backend system  build on Scala
Architecture of Falcon, a new chat messaging backend system build on Scala
 
Designing distributed systems
Designing distributed systemsDesigning distributed systems
Designing distributed systems
 
Three Degrees of Mediation: Challenges and Lessons in building Cloud-agnostic...
Three Degrees of Mediation: Challenges and Lessons in building Cloud-agnostic...Three Degrees of Mediation: Challenges and Lessons in building Cloud-agnostic...
Three Degrees of Mediation: Challenges and Lessons in building Cloud-agnostic...
 
NoSQL Introduction, Theory, Implementations
NoSQL Introduction, Theory, ImplementationsNoSQL Introduction, Theory, Implementations
NoSQL Introduction, Theory, Implementations
 
AI&BigData Lab 2016. Сарапин Виктор: Размер имеет значение: анализ по требова...
AI&BigData Lab 2016. Сарапин Виктор: Размер имеет значение: анализ по требова...AI&BigData Lab 2016. Сарапин Виктор: Размер имеет значение: анализ по требова...
AI&BigData Lab 2016. Сарапин Виктор: Размер имеет значение: анализ по требова...
 
Performance testing virtualized systems v5
Performance testing virtualized systems v5Performance testing virtualized systems v5
Performance testing virtualized systems v5
 
Infinit's Next Generation Key-value Store - Julien Quintard and Quentin Hocqu...
Infinit's Next Generation Key-value Store - Julien Quintard and Quentin Hocqu...Infinit's Next Generation Key-value Store - Julien Quintard and Quentin Hocqu...
Infinit's Next Generation Key-value Store - Julien Quintard and Quentin Hocqu...
 
Planning to Fail #phpne13
Planning to Fail #phpne13Planning to Fail #phpne13
Planning to Fail #phpne13
 
Design (Cloud systems) for Failures
Design (Cloud systems) for FailuresDesign (Cloud systems) for Failures
Design (Cloud systems) for Failures
 
Amazon EKS 그리고 Service Mesh (김세호 솔루션즈 아키텍트, AWS) :: Gaming on AWS 2018
Amazon EKS 그리고 Service Mesh (김세호 솔루션즈 아키텍트, AWS) :: Gaming on AWS 2018Amazon EKS 그리고 Service Mesh (김세호 솔루션즈 아키텍트, AWS) :: Gaming on AWS 2018
Amazon EKS 그리고 Service Mesh (김세호 솔루션즈 아키텍트, AWS) :: Gaming on AWS 2018
 
Stephane Lapointe & Alexandre Brisebois: Développer des microservices avec Se...
Stephane Lapointe & Alexandre Brisebois: Développer des microservices avec Se...Stephane Lapointe & Alexandre Brisebois: Développer des microservices avec Se...
Stephane Lapointe & Alexandre Brisebois: Développer des microservices avec Se...
 
Microsoft Azure Cloud Basics Tutorial
Microsoft Azure Cloud Basics TutorialMicrosoft Azure Cloud Basics Tutorial
Microsoft Azure Cloud Basics Tutorial
 
CLUSTER COMPUTING
CLUSTER COMPUTINGCLUSTER COMPUTING
CLUSTER COMPUTING
 
Distributed systems and scalability rules
Distributed systems and scalability rulesDistributed systems and scalability rules
Distributed systems and scalability rules
 
170215 msa intro
170215 msa intro170215 msa intro
170215 msa intro
 
Introduction
IntroductionIntroduction
Introduction
 

More from Michael Hofmann

Developer Experience Cloud Native - From Code Gen to Git Commit without a CI/...
Developer Experience Cloud Native - From Code Gen to Git Commit without a CI/...Developer Experience Cloud Native - From Code Gen to Git Commit without a CI/...
Developer Experience Cloud Native - From Code Gen to Git Commit without a CI/...
Michael Hofmann
 

More from Michael Hofmann (13)

Service Specific AuthZ In The Cloud Infrastructure
Service Specific AuthZ In The Cloud InfrastructureService Specific AuthZ In The Cloud Infrastructure
Service Specific AuthZ In The Cloud Infrastructure
 
New Ways To Production - Stress-Free Evolution Of Your Cloud Applications
New Ways To Production - Stress-Free Evolution Of Your Cloud ApplicationsNew Ways To Production - Stress-Free Evolution Of Your Cloud Applications
New Ways To Production - Stress-Free Evolution Of Your Cloud Applications
 
Developer Experience Cloud Native - Become Efficient and Achieve Parity
Developer Experience Cloud Native - Become Efficient and Achieve ParityDeveloper Experience Cloud Native - Become Efficient and Achieve Parity
Developer Experience Cloud Native - Become Efficient and Achieve Parity
 
The Easy Way to Secure Microservices
The Easy Way to Secure MicroservicesThe Easy Way to Secure Microservices
The Easy Way to Secure Microservices
 
Service Mesh vs. Frameworks: Where to put the resilience?
Service Mesh vs. Frameworks: Where to put the resilience?Service Mesh vs. Frameworks: Where to put the resilience?
Service Mesh vs. Frameworks: Where to put the resilience?
 
Developer Experience Cloud Native - From Code Gen to Git Commit without a CI/...
Developer Experience Cloud Native - From Code Gen to Git Commit without a CI/...Developer Experience Cloud Native - From Code Gen to Git Commit without a CI/...
Developer Experience Cloud Native - From Code Gen to Git Commit without a CI/...
 
Servicierung von Monolithen - Der Weg zu neuen Technologien bis hin zum Servi...
Servicierung von Monolithen - Der Weg zu neuen Technologien bis hin zum Servi...Servicierung von Monolithen - Der Weg zu neuen Technologien bis hin zum Servi...
Servicierung von Monolithen - Der Weg zu neuen Technologien bis hin zum Servi...
 
Service Mesh mit Istio und MicroProfile - eine harmonische Kombination?
Service Mesh mit Istio und MicroProfile - eine harmonische Kombination?Service Mesh mit Istio und MicroProfile - eine harmonische Kombination?
Service Mesh mit Istio und MicroProfile - eine harmonische Kombination?
 
Service Mesh - kilometer 30 in a microservice marathon
Service Mesh - kilometer 30 in a microservice marathonService Mesh - kilometer 30 in a microservice marathon
Service Mesh - kilometer 30 in a microservice marathon
 
Service Mesh - Kilometer 30 im Microservices-Marathon
Service Mesh - Kilometer 30 im Microservices-MarathonService Mesh - Kilometer 30 im Microservices-Marathon
Service Mesh - Kilometer 30 im Microservices-Marathon
 
API-Economy bei Financial Services – Kein Stein bleibt auf dem anderen
API-Economy bei Financial Services – Kein Stein bleibt auf dem anderenAPI-Economy bei Financial Services – Kein Stein bleibt auf dem anderen
API-Economy bei Financial Services – Kein Stein bleibt auf dem anderen
 
Microprofile.io - Cloud Native mit Java EE
Microprofile.io - Cloud Native mit Java EEMicroprofile.io - Cloud Native mit Java EE
Microprofile.io - Cloud Native mit Java EE
 
Microservices mit Java EE - am Beispiel von IBM Liberty
Microservices mit Java EE - am Beispiel von IBM LibertyMicroservices mit Java EE - am Beispiel von IBM Liberty
Microservices mit Java EE - am Beispiel von IBM Liberty
 

Recently uploaded

Recently uploaded (20)

A Deep Dive into Secure Product Development Frameworks.pdf
A Deep Dive into Secure Product Development Frameworks.pdfA Deep Dive into Secure Product Development Frameworks.pdf
A Deep Dive into Secure Product Development Frameworks.pdf
 
^Clinic ^%[+27788225528*Abortion Pills For Sale In birch acres
^Clinic ^%[+27788225528*Abortion Pills For Sale In birch acres^Clinic ^%[+27788225528*Abortion Pills For Sale In birch acres
^Clinic ^%[+27788225528*Abortion Pills For Sale In birch acres
 
Spring into AI presented by Dan Vega 5/14
Spring into AI presented by Dan Vega 5/14Spring into AI presented by Dan Vega 5/14
Spring into AI presented by Dan Vega 5/14
 
Abortion Clinic In Pretoria ](+27832195400*)[ 🏥 Safe Abortion Pills in Pretor...
Abortion Clinic In Pretoria ](+27832195400*)[ 🏥 Safe Abortion Pills in Pretor...Abortion Clinic In Pretoria ](+27832195400*)[ 🏥 Safe Abortion Pills in Pretor...
Abortion Clinic In Pretoria ](+27832195400*)[ 🏥 Safe Abortion Pills in Pretor...
 
What is a Recruitment Management Software?
What is a Recruitment Management Software?What is a Recruitment Management Software?
What is a Recruitment Management Software?
 
Secure Software Ecosystem Teqnation 2024
Secure Software Ecosystem Teqnation 2024Secure Software Ecosystem Teqnation 2024
Secure Software Ecosystem Teqnation 2024
 
Abortion Clinic In Polokwane ](+27832195400*)[ 🏥 Safe Abortion Pills in Polok...
Abortion Clinic In Polokwane ](+27832195400*)[ 🏥 Safe Abortion Pills in Polok...Abortion Clinic In Polokwane ](+27832195400*)[ 🏥 Safe Abortion Pills in Polok...
Abortion Clinic In Polokwane ](+27832195400*)[ 🏥 Safe Abortion Pills in Polok...
 
OpenChain @ LF Japan Executive Briefing - May 2024
OpenChain @ LF Japan Executive Briefing - May 2024OpenChain @ LF Japan Executive Briefing - May 2024
OpenChain @ LF Japan Executive Briefing - May 2024
 
Alluxio Monthly Webinar | Simplify Data Access for AI in Multi-Cloud
Alluxio Monthly Webinar | Simplify Data Access for AI in Multi-CloudAlluxio Monthly Webinar | Simplify Data Access for AI in Multi-Cloud
Alluxio Monthly Webinar | Simplify Data Access for AI in Multi-Cloud
 
^Clinic ^%[+27788225528*Abortion Pills For Sale In witbank
^Clinic ^%[+27788225528*Abortion Pills For Sale In witbank^Clinic ^%[+27788225528*Abortion Pills For Sale In witbank
^Clinic ^%[+27788225528*Abortion Pills For Sale In witbank
 
Anypoint Code Builder - Munich MuleSoft Meetup - 16th May 2024
Anypoint Code Builder - Munich MuleSoft Meetup - 16th May 2024Anypoint Code Builder - Munich MuleSoft Meetup - 16th May 2024
Anypoint Code Builder - Munich MuleSoft Meetup - 16th May 2024
 
StrimziCon 2024 - Transition to Apache Kafka on Kubernetes with Strimzi.pdf
StrimziCon 2024 - Transition to Apache Kafka on Kubernetes with Strimzi.pdfStrimziCon 2024 - Transition to Apache Kafka on Kubernetes with Strimzi.pdf
StrimziCon 2024 - Transition to Apache Kafka on Kubernetes with Strimzi.pdf
 
Effective Strategies for Wix's Scaling challenges - GeeCon
Effective Strategies for Wix's Scaling challenges - GeeConEffective Strategies for Wix's Scaling challenges - GeeCon
Effective Strategies for Wix's Scaling challenges - GeeCon
 
Wired_2.0_CREATE YOUR ULTIMATE LEARNING ENVIRONMENT_JCON_16052024
Wired_2.0_CREATE YOUR ULTIMATE LEARNING ENVIRONMENT_JCON_16052024Wired_2.0_CREATE YOUR ULTIMATE LEARNING ENVIRONMENT_JCON_16052024
Wired_2.0_CREATE YOUR ULTIMATE LEARNING ENVIRONMENT_JCON_16052024
 
Prompt Engineering - an Art, a Science, or your next Job Title?
Prompt Engineering - an Art, a Science, or your next Job Title?Prompt Engineering - an Art, a Science, or your next Job Title?
Prompt Engineering - an Art, a Science, or your next Job Title?
 
Auto Affiliate AI Earns First Commission in 3 Hours..pdf
Auto Affiliate  AI Earns First Commission in 3 Hours..pdfAuto Affiliate  AI Earns First Commission in 3 Hours..pdf
Auto Affiliate AI Earns First Commission in 3 Hours..pdf
 
Software Engineering - Introduction + Process Models + Requirements Engineering
Software Engineering - Introduction + Process Models + Requirements EngineeringSoftware Engineering - Introduction + Process Models + Requirements Engineering
Software Engineering - Introduction + Process Models + Requirements Engineering
 
architecting-ai-in-the-enterprise-apis-and-applications.pdf
architecting-ai-in-the-enterprise-apis-and-applications.pdfarchitecting-ai-in-the-enterprise-apis-and-applications.pdf
architecting-ai-in-the-enterprise-apis-and-applications.pdf
 
[GeeCON2024] How I learned to stop worrying and love the dark silicon apocalypse
[GeeCON2024] How I learned to stop worrying and love the dark silicon apocalypse[GeeCON2024] How I learned to stop worrying and love the dark silicon apocalypse
[GeeCON2024] How I learned to stop worrying and love the dark silicon apocalypse
 
Optimizing Operations by Aligning Resources with Strategic Objectives Using O...
Optimizing Operations by Aligning Resources with Strategic Objectives Using O...Optimizing Operations by Aligning Resources with Strategic Objectives Using O...
Optimizing Operations by Aligning Resources with Strategic Objectives Using O...
 

Service Mesh vs. Frameworks: Where to put the resilience?

  • 1. Service Mesh vs. Frameworks: Where to put the resilience? Michael Hofmann https://hofmann-itconsulting.de
  • 2. (1) Distributed Systems and Resilience (2) Framework (3) Service Mesh (4) Framework and Service Mesh Characteristics (5) Thoughts about Resilience (6) Essential Requirements (7) Conclusion Agenda
  • 3. Distributed Systems ➔ degree of distribution raises failure rate! ➔ compensation strategy: resilience! slow response timeout aborted network connection ... Typical Communication Errors Fallacies of Distributed Computing The network is reliable. Latency is zero. Bandwidth is infinite. The network is secure. Topology doesn't change. There is one administrator. Transport cost is zero. The network is homogeneous.
  • 4. Hystrix Alternative: Service Mesh?! Resilience Resilience4J Failsafe MicroProfile Fault Tolerance … Framework „DEATH“ Framework ACTIVE
  • 5. Resilience Patterns ─ Timeout ─ Retry ─ Fallback ─ Circuit Breaker ─ Bulkhead many more: Uwe Friedrichsen: “Patterns of resilience” https://www.slideshare.net/ufried/patterns-of-resilience
  • 6. @CircuitBreaker(successThreshold = 10, requestVolumeThreshold = 4, failureRatio=0.5, delay = 1000) public Connection serviceA() { Connection conn = null; counterForInvokingServiceA++; conn = connectionService(); return conn; } MicroProfile Fault Tolerance @Retry(maxRetries = 3) @Fallback(fallbackMethod = "doFallback") public Result doWork() { return callServiceA(); // fallback on RuntimeException } private Result doFallback() { return ...; }
  • 7. Service Mesh The term service mesh is used to describe the network of microservices that make up such application and the interactions between them. (istio.io) Don’t manage a Service Mesh without tooling! Requirements: (1) manage calls on layer 7 (application layer, L7) (2) resilience, routing, security and telemetry (3) decentralized & transparent for services (implementation independent)
  • 9. Resilience Patterns in Istio ✔ Timeout ✔ Retry ✔ CircuitBreaker ✔ Bulkhead ✗ Fallback? ✗ is a Fallback possible? ✗ less technical, more business driven https://dzone.com/articles/fallbacks-are-overrated-architecting-for-resilienc
  • 10. Resilience in Istio $ kubectl apply -f - <<EOF apiVersion: networking.istio.io/v1alpha3 kind: VirtualService metadata: name: reviews spec: hosts: - reviews http: - route: - destination: host: reviews subset: v1 retries: attempts: 3 perTryTimeout: 2s EOF $ kubectl apply -f - <<EOF apiVersion: networking.istio.io/v1alpha3 kind: DestinationRule metadata: name: reviews spec: host: reviews trafficPolicy: connectionPool: tcp: maxConnections: 1 http: http1MaxPendingRequests: 1 maxRequestsPerConnection: 1 outlierDetection: consecutiveErrors: 1 interval: 1s baseEjectionTime: 3m maxEjectionPercent: 100 EOF
  • 11. Resilience in Istio Apply to sidecar Resilience rules — transparent for service — act global on all sidecars Fault Injection MicroProfile with Istio setting apiVersion:networking.istio.io/ v1alpha3 kind: VirtualService metadata: name: ratings ... spec: hosts: - ratings http: - fault: delay: fixedDelay: 7s percent: 100 MP_Fault_Tolerance_NonFallback_Enabled = false
  • 12. Frameworks Characteristics — Java: a lot of different frameworks — Team decides framework?!? — Learning curve for every framework — Different frameworks behave different — Same framework in different version behave different — Same framework in different versions parallel in use
  • 13. Frameworks Characteristics ➔ Change of framework: ➔ Replace all positions in code ➔ New behavior ➔ New deployment ➔ New tests ➔ Risk of chain reaction: framework ➔ load balancing ➔ service registry ➔ Multiple service registries for every different framework?
  • 14. Service Mesh Characteristics — Define new rule — Same behavior (… no framework change) — unchanged deployed service — new tests only for new rules — Client-side load balancing in sidecar — Service Registry based on endpoints in K8S $ kubectl apply -f ...
  • 15. Thoughts about Resilience Resilience pattern still correct if communication behavior changes? — Modified behavior of partner — Modified communication partner — Modified infrastructure — Load changes during day — Side effects from other systems — Anticipate problems of tomorrow?
  • 16. Thoughts about Resilience — Main problem: choose the right resilience pattern — Correct parameters for pattern? — Measure resilience — Mostly: try & error for suitable pattern/params (main reason for end of life in hystrix) — Often: retry storm — Often: missing musketeer principle (black sheep)
  • 17. Essential Requirements — Modification: Quick and easy change of (1) params for chosen pattern (2) resilience pattern — Test — Monitoring — No black sheep
  • 18. Essential Requirements Istio Framework Modification + Modify Params - Change Pattern: Lifecycle Test Fault Injection complicated Monitoring + + Black Sheep No: rule in all sidecars $ kubectl apply -f ...
  • 19. Conclusion — Comparable resilience patterns — Missing fallback in service mesh (but overrated) — Higher flexibility in service mesh — Fault injection easy in service mesh Solve problems where they arise! Service Mesh for L4-L7 Developer for L8 (original profession)