June 2013
REAL-TIME DECISION STRATEGIES
Risk & Compliance Engineering, PayPal
Philip Wright
This deck contains generic architecture information, and does not
reflect the exact details of current or planned systems.
Confidential and Proprietary
INTRODUCTION
This proposal discusses strategies and considerations for
real-time automated decision systems.
PROBLEM STATEMENT
In a connected and fast-changing online world, businesses are seeking new strategies to
deliver the best possible customer experience. A key component of these strategies is
improved decision making. An optimal business decision-making strategy might be
measured against the following criteria:
• Fast – decisions are delivered quickly
• Accurate – correct decisions are generated
• Available – decisions are available when needed
Problem: It is challenging and costly to develop a single solution that meets all
business goals.
Goal: Explore major trade-offs that are typically considered when designing
architectures for real-time decisions.
ANALYTICS
Data is the input to the decision-making process. Optimal decisions are obtained by
having the right data for the problem domain. Data is often derived from domain-specific
sources such as business transactions or processes.
Determining what data should be used in a decision is often the result of statistical
research and analytical processes that narrow down the key attributes in a data set to
those that are most predictive.
A decision system may process the data into attributes or variables optimized for
decision making by aggregating, grouping or summarizing using statistical methods or
mathematical formulae.
Data processing may be performed offline or in real-time as needed.
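As a rough illustration of the aggregation step described above, the sketch below groups raw transaction rows by account and derives a few summary variables. The row layout, the `summarize` name and the chosen statistics are assumptions made for this example; the deck does not specify them.

```python
from collections import defaultdict
from statistics import mean


def summarize(transactions):
    """Aggregate raw (account_id, amount) rows into per-account decision variables."""
    grouped = defaultdict(list)
    for account_id, amount in transactions:
        grouped[account_id].append(amount)
    return {
        account_id: {
            "txn_count": len(amounts),
            "avg_amount": mean(amounts),
            "max_amount": max(amounts),
        }
        for account_id, amounts in grouped.items()
    }


# Hypothetical sample rows: (account_id, amount)
rows = [("a1", 20.0), ("a1", 40.0), ("b7", 500.0)]
variables = summarize(rows)
```

The same function could run in an offline batch job or on a recent window of events, which is the offline versus real-time choice mentioned above.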
DATA SET SIZE
Better quality decisions can be generated from larger, richer data sets. Finding new,
more predictive variables is critical to many business strategies, and that process often
starts by collecting more data. Data sets in some business domains can easily exceed
a petabyte in size.
THE NEED FOR SPEED
Business advantage can be achieved through faster decisions. Making decisions
quickly sometimes requires pre-processing a decision in batch before it is needed. This
is how many systems are able to deliver very fast results.
ACCURACY
Pre-calculated results can only reflect the data that was available when the
pre-processing was performed. Decisions that use these results could be made
minutes, hours or days later.
If there is any significant change in the data after pre-processing, a decision can be
made on out-of-date or stale data and could result in a poor business outcome or bad
customer experience.
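One way to picture the staleness risk is a lookup that refuses to serve a pre-calculated result once it is too old. This is a minimal sketch under stated assumptions: the `CachedDecision` type, the `lookup` function and the one-hour `MAX_AGE_SECONDS` limit are all hypothetical and are not taken from the deck.

```python
from dataclasses import dataclass


@dataclass
class CachedDecision:
    outcome: str        # e.g. "Yes" or "No"
    computed_at: float  # seconds since epoch when the batch job ran


MAX_AGE_SECONDS = 3600.0  # hypothetical freshness limit


def lookup(cache, key, now):
    """Return a pre-calculated outcome, or None if it is missing or stale."""
    entry = cache.get(key)
    if entry is None or now - entry.computed_at > MAX_AGE_SECONDS:
        return None
    return entry.outcome


cache = {"a1": CachedDecision("Yes", computed_at=1000.0)}
fresh = lookup(cache, "a1", now=1500.0)  # 500 seconds old, still served
stale = lookup(cache, "a1", now=9000.0)  # 8000 seconds old, rejected
```

A caller that receives `None` would need some other way to reach a decision, for example by processing current data.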
REAL-TIME PROCESSING
Processing data in real-time can significantly improve the accuracy of decisions, since
the system can factor in the most recent changes in data. Systems such as those used
by many financial institutions, banks and the stock markets rely heavily on real-time
data.
REAL-TIME PROCESSING
Processing large amounts of data in real-time to generate a decision can be costly in
terms of the hardware infrastructure needed to execute hundreds or even thousands of
database queries. These systems typically do not scale linearly and can add significant
complexity and latency to the decision system.
SPEED VS ACCURACY
It is challenging to optimize for both speed and accuracy. When a system is optimized
for speed, processing time needs to be kept to a minimum. Reducing the input data to
the decision process is one common way of minimizing processing time, but this has
the potential of generating less accurate decisions than could otherwise be delivered.
When a system is optimized for accuracy, it usually requires more data and more complex
processing, which takes more time to generate the result. A trade-off must be found that
balances speed and accuracy based on the use-case.
AVAILABILITY
The constant quest for richer data typically requires additional hardware infrastructure
and introduces new dependencies, which reduce the overall availability of the decision
solution. This can defeat the benefit of adding the new data source, since a more
accurate decision isn’t useful if it can’t be generated when needed.
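The effect of extra dependencies can be sketched with the usual approximation that a request path is only up when every component it needs is up. The figures below are made up for illustration, and the calculation assumes independent failures, which real systems only approximate.

```python
from math import prod


def path_availability(dependencies):
    """Availability of a decision path that fails if any dependency is down.

    Assumes failures are independent (an assumption of this sketch).
    """
    return prod(dependencies)


# Hypothetical figures: a 99.99% core service, then the same service
# with one additional 99.9% data source on the request path.
core_only = path_availability([0.9999])
with_new_source = path_availability([0.9999, 0.999])
```

Under these assumptions every added dependency can only lower the result, because each factor is at most 1.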
EXAMPLE
We can combine best-in-class strategies to deliver a fast, accurate and reliable
decision infrastructure that can support diverse business solutions.
Consider a theoretical business decision system with the following
requirements:
Requirement              Goal         Priority
Average Response Time    <= 100ms     1
Decision Accuracy        >= 99%       1
Decision Availability    >= 99.99%    2
Here greater business priority has been placed on response time and
accuracy.
TRADE-OFFS
An optimal decision strategy may consider these trade-offs:
• Use real-time data for accuracy/correctness
• Optimize the data set size according to strategy
• Reduce system dependencies
• Pre-calculate and cache decisions whenever possible
• Expand and optimize data sets for deep analytics
DECISION OPTIMIZATION
Breaking a decision response down into its components may reveal properties that
allow us to design a system that can be further optimized. For example, consider a
system that generates decisions that fall into 3 broad segments, ‘Yes’, ‘No’ and ‘Not
sure’. The ‘Yes’ and ‘No’ segments contain decisions with a high degree of certainty
that can be made quickly using mostly pre-calculated data.
The ‘Not sure’ segment contains decisions that are neither a clear ‘Yes’ nor a clear ‘No’.
To minimize the risk of an incorrect decision, we may want to force all decisions into
the ‘Yes’ or ‘No’ segments. To do this, we may need to perform further processing on the
‘Not sure’ segment.
[Diagram: decisions split into three segments: Yes decisions, Not sure (decisions that fall between Yes and No), and No decisions.]
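The three segments could be sketched as two cut-offs on a confidence score. The deck does not say how decisions are assigned to segments, so the score, the `segment` function and both threshold values are assumptions made for this example.

```python
def segment(score, no_below=0.2, yes_above=0.8):
    """Map a confidence score in [0, 1] to 'Yes', 'No' or 'Not sure'.

    The thresholds are hypothetical and would be tuned per use-case.
    """
    if score >= yes_above:
        return "Yes"
    if score <= no_below:
        return "No"
    return "Not sure"


labels = [segment(s) for s in (0.95, 0.5, 0.05)]
```

Moving the two thresholds closer together shrinks the 'Not sure' segment, at the cost of more decisions made with less certainty.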
LAYERED STRATEGY
One approach to solving this problem would be through a layered decision strategy.
Such a solution may combine these components:
• A high-performance tier that delivers fast, highly available decisions for 80% of
business requests. It has limited data requirements and minimal system dependencies,
and is capable of delivering 99% correct decisions in an average response time of 50ms.
• A deep analytics decision tier that uses a larger, richer data set to process the
remaining 20% of requests. It is able to deliver 99% correct decisions in an average
response time of 300ms.
When combined, the systems should deliver 99% availability (slightly below the 99.99%
goal, which has the lower priority), a 100ms average response time and a decision
accuracy of 99% or greater.
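A layered strategy like the one above might be wired together as a simple fall-through: ask the fast tier first and escalate only when it cannot give a clear answer. This is a sketch under stated assumptions; the `decide` function and both stand-in tiers are hypothetical, and the deck does not describe how requests are actually routed between tiers.

```python
def decide(request, fast_tier, deep_tier):
    """Try the fast tier first; escalate only when it is not sure."""
    outcome = fast_tier(request)
    if outcome in ("Yes", "No"):
        return outcome, "fast"
    return deep_tier(request), "deep"


# Hypothetical stand-ins for the two tiers.
def fast_tier(request):
    # Pretend lookup in pre-calculated data; unknown requests are 'Not sure'.
    return {"known-good": "Yes", "known-bad": "No"}.get(request, "Not sure")


def deep_tier(request):
    # Pretend deep analytics that always reaches a 'Yes' or 'No'.
    return "No" if "risky" in request else "Yes"


results = [
    decide(r, fast_tier, deep_tier)
    for r in ("known-good", "risky-new", "plain-new")
]
```

Returning the tier name alongside the outcome makes it possible to measure how much traffic each tier really handles, which matters for the 80/20 split assumed above.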
FAST AND HIGHLY AVAILABLE
Continuing the layered concept, it may be feasible to design a system that supports fast
decisions that are 99% accurate for 80% of the requests entering the system. This
solution could use cached, pre-calculated data to meet the 50ms response time
with 99.99% availability.
[Diagram labels: Client; Fast Decisions, Highly Available (Simple Decision Logic, Pre-calculated Data); Offline; Cached Data (average cache response time 40ms, availability 99.99%).]
System Characteristics
Availability*     >= 99.99%
Response Time*    <= 50ms
Accuracy          >= 99%
* Average
IMPROVED ACCURACY
To meet the accuracy requirements of our system we need to include more data and
analytics processing for 20% of the requests entering the system. To do this we may
choose to add real-time and external data sources. The system will be more accurate
as a result, but its availability will drop to 95% because of the increased number of
components, and the response time will rise to 300ms because of the external data
access.
[Diagram labels: Client; Rich Analytics (Advanced Decision Logic, Real-time Data); three Data stores; External Data. Annotations: average service response time 300ms, service availability 99.99%; average DB response time 200ms, DB availability 95%; overall average response time 300ms, availability < 95%.]
System Characteristics
Availability*     >= 95%
Response Time*    <= 300ms
Accuracy          >= 99%
* Average
LAYERED SOLUTION DESIGN
[Diagram labels: Client in front of two tiers. Fast tier: Fast Decisions, Highly Available (Simple Decision Logic, Pre-calculated Data, Cached Data, Offline); average response time 50ms, availability > 99.99%. Deep tier: Rich Analytics (Advanced Decision Logic, Real-time Data, Data stores, External Data); average response time 300ms, availability > 95%. To the client: average response time 100ms, availability > 99%.]
Combined Characteristics
Availability*     >= 99%
Response Time*    <= 100ms
Accuracy          >= 99%
* Average
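The combined figures are consistent with a traffic-weighted average of the two tiers, using the 80/20 split and the per-tier numbers quoted in the deck. The sketch below shows that arithmetic; the `blend` helper is a name introduced here, and treating the combined values as simple weighted averages is an assumption of this example.

```python
def blend(tiers):
    """Traffic-weighted average of a metric across tiers.

    tiers: list of (traffic_share, metric_value) pairs whose shares sum to 1.
    """
    return sum(share * value for share, value in tiers)


# 80% of requests at 50ms, 20% of requests at 300ms
avg_response_ms = blend([(0.80, 50.0), (0.20, 300.0)])

# 80% of requests at 99.99% availability, 20% at 95%
availability = blend([(0.80, 0.9999), (0.20, 0.95)])
```

With these inputs the average response time works out to about 100ms and the availability to about 98.99%, which rounds to the 99% shown for the combined system.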
CONCLUSION
It is often necessary to make trade-offs when designing a system to meet strict
performance and availability goals. As we have seen with this theoretical example, with
some research into the problem domain, solutions can be found that solve the most
critical business problems without major compromises of key requirements.
In this example we split the solution into two distinct components, each focused on
solving a specific part of the business problem. By layering a solution in this way we
were able to trade off 1% of availability for improved average response time and
decisions with higher accuracy. Fully analyzing the business requirements will give the
designer the greatest flexibility and the appropriate basis for making sound trade-offs
that work for the problem domain.


Editor's Notes

  1. Philip Wright is Director of Architecture at PayPal and has contributed to the strategy and development of major risk management platforms and solutions at eBay/PayPal since 2005.