RESILIENT SYSTEM DESIGN
June 2013
Risk & Compliance Engineering, PayPal
Pradeep Ballal
Staale Nerboe
Greg Berry
This deck contains generic architecture information, and does not
reflect the exact details of current or planned systems.
PROBLEM DEFINITION AND SOLUTION
Problem
In a distributed, virtualized environment, system failures are inevitable.
Solution
Isolate functionality to enable independent implementation of appropriate availability
patterns and increase velocity/flexibility of fixes.
Use asynchronous reconciliation to resolve failures without affecting overall customer
experience.
2 Confidential and Proprietary
HIGH LEVEL ARCHITECTURE
[Figure: high-level architecture. Labels: Clients, Request, Circuit Breakers, Service Container, Orchestration/Response Consolidation, Component Container, Functional Component, Dependency, PPaaS]
Functional Component (FC): Isolated set
of functionality that can be developed,
deployed and executed independently.
• Fits well into the Agile Development
methodology
• Fallback behavior defined
Service Container (SC): Contains
infrastructure to orchestrate FCs,
consolidate responses, and
initiate reconciliation on failure.
• Component-based model (e.g. OSGi),
including support for hot deploy of FCs
without service downtime
• Malfunctioning FCs show up quickly and
can be handled dynamically through properties
or real-time deployments
• Provides a meaningful response back to clients
SERVICE CONTAINER
Service Container (SC): Contains infrastructure to orchestrate FCs, consolidate responses,
and initiate reconciliation on failure.
• Built on top of PayPal Platform as a Service (PPaaS)
• Component-based model (e.g. OSGi), including support for hot deploy of FCs without service
downtime
• Enforces the concept of coarse-grained services
• Malfunctioning FCs show up quickly and can be handled dynamically through properties or
real-time deployments
• Provides a meaningful response back to clients
• Non-intrusive on the clients
Functional Component (FC): Isolated set of functionality that can be developed,
deployed and executed independently.
• Fits well into the Agile Development methodology
• FCs can fail independently
• Fallback behavior defined
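The division of labor between the Service Container and its Functional Components can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the PayPal implementation: the `ServiceContainer` class, its `register` and `handle` methods, the `reconcile_queue` list and the two example components are all hypothetical names introduced here.

```python
from typing import Any, Callable, Dict, Tuple

# Hypothetical type for illustration; the deck does not define an API.
Component = Callable[[dict], Any]


class ServiceContainer:
    """Orchestrates Functional Components (FCs) and consolidates their responses."""

    def __init__(self) -> None:
        # name -> (normal behavior, fallback behavior)
        self._components: Dict[str, Tuple[Component, Component]] = {}
        self.reconcile_queue: list = []  # failures to reconcile asynchronously

    def register(self, name: str, run: Component, fallback: Component) -> None:
        # Registering at runtime stands in for hot deploy of an FC.
        self._components[name] = (run, fallback)

    def handle(self, request: dict) -> dict:
        response = {}
        for name, (run, fallback) in self._components.items():
            try:
                response[name] = run(request)
            except Exception as exc:
                # An FC fails independently: use its fallback and queue
                # the failure for later reconciliation.
                response[name] = fallback(request)
                self.reconcile_queue.append((name, request, repr(exc)))
        return response


def risk_score(request: dict) -> int:
    # Simulated malfunctioning FC.
    raise TimeoutError("risk model unavailable")


container = ServiceContainer()
container.register("limits", lambda r: r["amount"] <= 1000, lambda r: False)
container.register("risk", risk_score, lambda r: 50)
result = container.handle({"amount": 200})
```

The client still receives a consolidated response even though one FC failed, and the failure is recorded for reconciliation instead of being surfaced to the client.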
FALLBACK
To create a resilient system each Functional Component and Dependency SHOULD fail
gracefully and have Fallback Behavior. This can be achieved by utilizing a framework
that enforces normalized behavior across the platform.
Note: Fallback Behavior should not be an
afterthought. It should be specified
in the design, in conjunction with
your business partners.
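One way a framework could enforce normalized fallback behavior is a shared wrapper that every Functional Component and Dependency call goes through. The sketch below assumes a decorator-based approach; `with_fallback`, `lookup_limit`, `default_limit` and the limit value are hypothetical and only illustrate the idea.

```python
import functools
import logging
from typing import Any, Callable

logger = logging.getLogger("fallback")


def with_fallback(fallback: Callable[..., Any]) -> Callable:
    """Wrap a call so that any failure is logged and routed to `fallback`."""

    def decorator(func: Callable[..., Any]) -> Callable[..., Any]:
        @functools.wraps(func)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            try:
                return func(*args, **kwargs)
            except Exception:
                # Logging/monitoring hook: every failure is recorded the same way.
                logger.exception("%s failed; using fallback", func.__name__)
                return fallback(*args, **kwargs)

        return wrapper

    return decorator


def default_limit(account_id: str) -> int:
    # Conservative default agreed with the business (hypothetical value).
    return 100


@with_fallback(default_limit)
def lookup_limit(account_id: str) -> int:
    # Simulated failing dependency.
    raise ConnectionError("limits dependency unreachable")


limit = lookup_limit("acct-1")
```

Because the fallback is passed in explicitly, it has to be decided at design time rather than improvised inside an exception handler.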
[Figure: fallback flow. Labels: Clients, Request, Circuit Breakers (Local / Global), Logging / Monitoring, Functional Component / Dependency, Normal Behavior, Fallback Behavior, FAILURE]
CIRCUIT BREAKERS*
Circuit Breakers (CBs) serve these purposes:
• They protect clients from slow or broken FCs
• They protect services from demand in excess
of capacity
• Most importantly, they protect the
Business from malfunctioning code by
tracking negative actions (such as declined
payments) and shutting down the FC if
abnormal behavior is found
*Concept first discussed in the excellent book Release It! by Michael Nygard.
Example open source implementation by Netflix: https://github.com/Netflix/Hystrix/wiki
CBs are named after their counterparts
in the physical world.
Local CBs: Track the health of services
Global CBs: Track negative behavior
that impacts the Business or the health of
the overall system
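The difference between local and global circuit breakers can be sketched as two small classes. This is an illustrative sketch, not the Hystrix API or PayPal's code: the class names, thresholds and the injectable `clock` parameter are assumptions, and a production breaker would normally use a sliding time window where this one uses simple counters.

```python
import time
from typing import Any, Callable, Optional


class CircuitOpenError(RuntimeError):
    """Raised when a call is rejected because the breaker is open."""


class LocalCircuitBreaker:
    """Opens after `max_failures` consecutive failures; retries after `reset_after` seconds."""

    def __init__(self, max_failures: int = 3, reset_after: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_failures = max_failures
        self.reset_after = reset_after
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None  # None means closed

    def call(self, func: Callable[..., Any], *args: Any) -> Any:
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_after:
                raise CircuitOpenError("circuit open; call rejected")
            # Cool-down elapsed: let one trial call through (half-open).
        try:
            result = func(*args)
        except Exception:
            self._failures += 1
            if self._failures >= self.max_failures:
                self._opened_at = self._clock()
            raise
        self._failures = 0
        self._opened_at = None
        return result


class GlobalCircuitBreaker:
    """Trips when the share of negative outcomes (e.g. declines) is abnormal."""

    def __init__(self, max_negative_ratio: float = 0.5, min_samples: int = 10) -> None:
        self.max_negative_ratio = max_negative_ratio
        self.min_samples = min_samples
        self.total = 0
        self.negative = 0

    def record(self, negative: bool) -> None:
        self.total += 1
        self.negative += int(negative)

    @property
    def tripped(self) -> bool:
        if self.total < self.min_samples:
            return False
        return self.negative / self.total > self.max_negative_ratio


def flaky() -> str:
    raise TimeoutError("dependency timed out")


now = [0.0]  # fake clock so the example does not need to sleep
breaker = LocalCircuitBreaker(max_failures=2, reset_after=10.0, clock=lambda: now[0])
outcomes = []
for _ in range(3):
    try:
        breaker.call(flaky)
    except CircuitOpenError:
        outcomes.append("rejected")
    except TimeoutError:
        outcomes.append("failed")

business = GlobalCircuitBreaker(max_negative_ratio=0.5, min_samples=4)
for declined in (True, True, True, False):
    business.record(declined)
```

The local breaker reacts to technical failures of one dependency, while the global breaker watches business outcomes and can signal that an FC should be shut down even when every call technically succeeds.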
[Figure: local and global circuit breakers. Labels: Service Container, Request, Functional Component, Dependency, Orchestration/Response Consolidation, Circuit Breakers, Circuit Breakers (Global), Config]
DATA ACCESS – NEED MORE
Globally Distributed
• You can’t have a single system of record that contains all data
• Latency matters (you can’t go faster than the speed of light)
• There must be a way to partition data and processing
Always Available
• Everything needs to be redundant (or dispensable)
• Can’t have a single point of failure
Shares Nothing
• Systems must be able to run completely
independently
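One plausible reading of the labels in this slide's diagram (SoR, Journal, Replay, Read Replicas) is that writes go to a system of record, are appended to a journal, and are replayed into independent read replicas. The sketch below illustrates that reading only; the deck does not describe the mechanism, and every class and method name here is hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# One journal entry: (key, value); a value of None marks a delete.
Entry = Tuple[str, Optional[str]]


class SystemOfRecord:
    """Accepts writes and appends each one to an ordered journal."""

    def __init__(self) -> None:
        self.data: Dict[str, str] = {}
        self.journal: List[Entry] = []

    def put(self, key: str, value: str) -> None:
        self.data[key] = value
        self.journal.append((key, value))

    def delete(self, key: str) -> None:
        self.data.pop(key, None)
        self.journal.append((key, None))


class ReadReplica:
    """Serves reads locally and catches up by replaying the journal."""

    def __init__(self) -> None:
        self.data: Dict[str, str] = {}
        self.offset = 0  # index of the next journal entry to apply

    def replay(self, journal: List[Entry]) -> int:
        applied = 0
        for key, value in journal[self.offset:]:
            if value is None:
                self.data.pop(key, None)
            else:
                self.data[key] = value
            applied += 1
        self.offset += applied
        return applied

    def get(self, key: str) -> Optional[str]:
        return self.data.get(key)


sor = SystemOfRecord()
replica = ReadReplica()
sor.put("acct-1", "active")
stale = replica.get("acct-1")   # the replica has not replayed yet
replica.replay(sor.journal)
fresh = replica.get("acct-1")   # now visible on the replica
```

The replica keeps serving reads from its own copy while it lags, which is the trade-off that makes the data eventually consistent.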
[Figure: data access. Labels: Clients, Read Service, Life Cycle (CRUD) Service, Read Replicas, SoR, Journal, Latency Bridge, Replay]
EVENTUALLY CONSISTENT*
CAP theorem: States that of three properties of distributed data systems (data
consistency, system availability, and tolerance to network partition), only two can be
achieved at any given time.
To account for this, a reconciliation system is required to identify issues and try to
correct them automatically. A manual review should be conducted only as a last
resort.
Design considerations:
• Limited DB table scanning: The system should not rely on heavy DB table scans or heavy
queries. If they are required, they SHOULD run in a DW or on a Hadoop cluster, with the results
fed back into the real-time system.
• Non-intrusive: The system only listens to events from other systems and SHOULD NOT touch code
in other parts of the system (and hence does not need to get on their road maps).
Types of reconciliation:
• Stateless: Depends only on the data in the request.
• Stateful: Depends on business processes and on their state when the failure occurred, so the
point at which the system failed may affect the outcome of the reconciliation.
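The two types of reconciliation, and the rule that manual review comes last, can be sketched as below. The `FailureEvent` record, the business rules (amount threshold, state names) and the action strings are hypothetical and serve only to show the shape of the logic.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class FailureEvent:
    request: Dict[str, int]          # data carried by the original request
    state_at_failure: Optional[str]  # business state when the failure occurred


def reconcile_stateless(event: FailureEvent) -> Optional[str]:
    # Depends only on the data in the request (hypothetical rule).
    return "approve" if event.request["amount"] <= 100 else None


def reconcile_stateful(event: FailureEvent) -> Optional[str]:
    # The outcome depends on where the business process was when it failed.
    if event.state_at_failure == "authorized":
        return "capture"
    if event.state_at_failure == "captured":
        return "no_action"
    return None


def reconcile(events: List[FailureEvent]) -> Dict[str, List[FailureEvent]]:
    """Try automatic correction first; manual review is the last resort."""
    outcome: Dict[str, List[FailureEvent]] = {"auto": [], "manual": []}
    for event in events:
        action = reconcile_stateless(event) or reconcile_stateful(event)
        outcome["auto" if action else "manual"].append(event)
    return outcome


events = [
    FailureEvent({"amount": 50}, None),           # resolved from request data alone
    FailureEvent({"amount": 500}, "authorized"),  # resolved from state at failure
    FailureEvent({"amount": 500}, "unknown"),     # cannot be resolved automatically
]
result = reconcile(events)
```

Only the event that neither reconciler can resolve ends up in the manual bucket, which would feed a manual report.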
*See excellent paper “Eventually Consistent” by Werner Vogels, CTO Amazon
DETAILED DESIGN
[Figure: detailed design. Labels: Clients, Service Container, Request, Functional Component, Dependency, Orchestration/Response Consolidation, Circuit Breakers, Circuit Breakers (Global), Config, SoR, Events, Queue, Reconciliation & Actions, Reconcile, Reports (Manual)]
WE ARE HIRING
If you are interested in helping us solve
these problems, you can contact us at:
dwilfred@paypal.com
http://www.ebaycareers.com

From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 

PayPal Resilient System Design

  • 1. RESILIENT SYSTEM DESIGN June 2013 Risk & Compliance Engineering, PayPal Pradeep Ballal Staale Nerboe Greg Berry This deck contains generic architecture information, and does not reflect the exact details of current or planned systems.
  • 2. PROBLEM DEFINITION AND SOLUTION. Problem: In a distributed, virtualized environment, system failures are inevitable. Solution: Isolate functionality to enable independent implementation of appropriate availability patterns and increase velocity/flexibility of fixes. Use asynchronous reconciliation to resolve failures without affecting overall customer experience. Confidential and Proprietary
  • 3. HIGH LEVEL ARCHITECTURE. [Diagram labels: Clients, Request, Service Container, PPaaS, Orchestration/Response Consolidation, Circuit Breakers, Component Container, Functional Component, Dependency] Functional Component (FC): Isolated set of functionality that can be developed, deployed and executed independently. • Fits well into the Agile Development methodology • Fallback behavior defined. Service Container (SC): Contains infrastructure to orchestrate FCs, handle response consolidation, and initiate reconciliation during failure. • Component-based model (e.g. OSGi), including support for hot deploy of FCs without downtime for the service • Malfunctioning FCs show up quickly and can be handled dynamically through properties or real-time deployments • Provides a meaningful response back to clients
  • 4. SERVICE CONTAINER. Service Container (SC): Contains infrastructure to orchestrate FCs, handle response consolidation, and initiate reconciliation during failure. • Built on top of PayPal Platform as a Service (PPaaS) • Component-based model (e.g. OSGi), including support for hot deploy of FCs without downtime for the service • Enforces the concept of coarse-grained services • Malfunctioning FCs show up quickly and can be handled dynamically through properties or real-time deployments • Provides a meaningful response back to clients • Non-intrusive on the clients. Functional Component (FC): Isolated set of functionality that can be developed, deployed and executed independently. • Fits well into the Agile Development methodology • FCs can fail independently • Fallback behavior defined
  • 5. FALLBACK. To create a resilient system, each Functional Component and Dependency SHOULD fail gracefully and have Fallback Behavior. This can be achieved by using a framework that enforces normalized behavior across the platform. PS: Fallback Behavior should not be an afterthought; it should be detailed in the design in conjunction with your business partners. [Diagram labels: Clients, Request, Functional Component / Dependency, Circuit Breakers (Local / Global), Logging / Monitoring, Normal Behavior, FAILURE, Fallback Behavior]
  • 6. CIRCUIT BREAKERS*. Circuit Breakers (CBs) serve these purposes: • They protect clients from slow or broken FCs • They protect services from demand in excess of capacity • Most importantly, they protect the Business from malfunctioning code by tracking negative actions (such as declined payments) and shutting down the FC if abnormal behavior is found. CBs are named after their counterparts in the physical world. Local CBs: Track the health of services. Global CBs: Track negative behavior that impacts the Business or the health of the overall system. *Concept first discussed in the excellent book Release It! by Michael Nygard. Example open source implementation by Netflix: https://github.com/Netflix/Hystrix/wiki [Diagram labels: Clients, Request, Service Container (two instances), Orchestration/Response Consolidation, Circuit Breakers, Circuit Breakers (Global), Config, Functional Component, Dependency]
  • 7. DATA ACCESS – NEED MORE. Globally Distributed • You can’t have a single system of record that contains all data • Latency matters (you can’t go faster than the speed of light) • There must be a way to partition data and processing. Always Available • Everything needs to be redundant (or dispensable) • Can’t have a single point of failure. Shares Nothing • Systems must be able to run completely independently. [Diagram labels: Clients, Read Service, Life Cycle (CRUD) Service, SoR, Journal, Read Replicas, Latency Bridge, Replay]
  • 8. EVENTUALLY CONSISTENT*. CAP theorem: States that of three properties of distributed data systems (data consistency, system availability, and tolerance to network partition), only two can be achieved at any given time. To account for this, a reconciliation system is required to identify issues and try to correct them automatically. Only as a last resort should a Manual Review be conducted. Design considerations: • Limited DB table scanning: The system should not rely on heavy DB table scanning and heavy queries. If required, this SHOULD be done in a DW or on a Hadoop cluster and fed back into the real-time system. • Non-intrusive: Listens only to events from other systems and SHOULD NOT touch code in other parts of the system (and hence doesn’t need to get on their road map). Types of reconciliation: • Stateless: Depends only on the data in the request. • Stateful: Depends on business processes and on the state when the failure occurred. Hence, when the system failed may matter to the outcome of the reconciliation. *See excellent paper “Eventually Consistent” by Werner Vogels, CTO Amazon
  • 9. DETAILED DESIGN. [Diagram labels: Clients, Request, Service Container (two instances), Orchestration/Response Consolidation, Circuit Breakers, Circuit Breakers (Global), Config, Functional Component, Dependency, SoR, Events, Queue, Reconciliation & Actions, Reconcile, Reports (Manual)]
  • 10. WE ARE HIRING. If you are interested in helping us solve these problems, you can contact us at: dwilfred@paypal.com http://www.ebaycareers.com
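
The fallback and local circuit breaker ideas from slides 5 and 6 can be illustrated with a small sketch. This is not the PayPal framework and not the Hystrix API. The class `CircuitBreaker`, its parameters (`max_failures`, `window_seconds`, `cooldown_seconds`), and the functions `flaky_risk_check` and `risk_check_fallback` are hypothetical names chosen for illustration. The sketch assumes a breaker that counts failures in a rolling time window, opens when the count reaches a threshold, and routes calls to the fallback while open.

```python
import time
from collections import deque


class CircuitBreaker:
    """Hypothetical local breaker: opens after too many failures in a rolling window."""

    def __init__(self, max_failures=3, window_seconds=300.0,
                 cooldown_seconds=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.window_seconds = window_seconds
        self.cooldown_seconds = cooldown_seconds
        self.clock = clock          # injectable so the window can be tested
        self.failures = deque()     # timestamps of recent failures
        self.opened_at = None       # None means the breaker is closed

    def is_open(self):
        if self.opened_at is None:
            return False
        if self.clock() - self.opened_at >= self.cooldown_seconds:
            # Cooldown elapsed: close again and let traffic probe the component.
            self.opened_at = None
            self.failures.clear()
            return False
        return True

    def record_failure(self):
        now = self.clock()
        self.failures.append(now)
        # Drop failures that have fallen out of the rolling window.
        while self.failures and now - self.failures[0] > self.window_seconds:
            self.failures.popleft()
        if len(self.failures) >= self.max_failures:
            self.opened_at = now

    def call(self, primary, fallback, *args):
        """Run primary; use fallback when the breaker is open or primary fails."""
        if self.is_open():
            return fallback(*args)
        try:
            return primary(*args)
        except Exception:
            self.record_failure()
            return fallback(*args)


def flaky_risk_check(payment_id):
    # Stand-in for a slow or broken Functional Component (assumption).
    raise TimeoutError("dependency too slow")


def risk_check_fallback(payment_id):
    # Fallback behavior agreed with the business (assumption): defer the decision.
    return {"payment_id": payment_id, "decision": "PENDING_RECONCILIATION"}
```

A caller would wrap each invocation as `breaker.call(flaky_risk_check, risk_check_fallback, "p-1")`. A global breaker as described on slide 6 would track business outcomes (such as declines) in shared storage instead of exceptions in process memory, which this sketch does not attempt.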

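The reconciliation flow from slide 8 (correct issues automatically, use Manual Review only as a last resort) might look roughly like the following. This is a minimal sketch of the stateless case, where the repair depends only on the data carried by the failed event. `FailedEvent`, `Reconciler`, `repair`, and `max_attempts` are hypothetical names, and the in-memory queue stands in for whatever event queue a real system would use.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class FailedEvent:
    event_id: str
    payload: dict
    attempts: int = 0


class Reconciler:
    """Hypothetical stateless reconciler: retry automatically, then escalate."""

    def __init__(self, repair, max_attempts=3):
        self.repair = repair            # callable(payload) -> bool, True when fixed
        self.max_attempts = max_attempts
        self.queue = deque()            # events waiting for automatic repair
        self.manual_review = []         # last resort: events a person must inspect

    def submit(self, event):
        self.queue.append(event)

    def run_once(self):
        """Process every event currently queued; return the ids that were repaired."""
        repaired = []
        for _ in range(len(self.queue)):
            event = self.queue.popleft()
            event.attempts += 1
            try:
                fixed = self.repair(event.payload)
            except Exception:
                fixed = False
            if fixed:
                repaired.append(event.event_id)
            elif event.attempts >= self.max_attempts:
                self.manual_review.append(event)
            else:
                self.queue.append(event)  # retry on a later pass
        return repaired
```

A stateful reconciler would additionally need the business process state at the time of failure, so the event would have to carry or reference that state. The sketch leaves that out.
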
Editor's Notes

  1. Mr. Pradeep Ballal works as a Senior Architect in Core Service Product Development, with a specific focus on Compliance and Risk products, at PayPal Singapore. Mr. Ballal is a software generalist with 13 years of technology experience and a special interest in decision management, business rules, enterprise software and architectures. Mr. Staale Nerboe (snerboe@paypal.com) works as a Senior Architect in the Core Service Product Development organization at PayPal Singapore. Mr. Nerboe has 15+ years of Technology Consulting and Software Architecture experience with large global companies worldwide. Mr. Greg Berry (gberry@paypal.com) works as a Principal Architect at PayPal in the Core Services organization. Greg has been an architect in the payments industry for more than 15 years.
  2. In a complex system you will see multiple levels of fallback behavior, like an onion. A fallback behavior can itself have a fallback. For example, as a last resort, only log the failure and return an error message to the client.
  3. CBs can be implemented in various ways, including Complex Event Processing (CEP), a database, a global cache, or any other fast storage medium. The store needs to support fast reads and writes, and must also handle rolling windows, such as the last 5 minutes, 1 hour, or 24 hours. This gets complex in an environment where the volume of service invocations is high (e.g. with large number of invocations or