Shadowing production requests
JAKA ŠUŠTERŠIČ
BACKEND ENGINEER
MOST COMMON QUESTIONS BEFORE A DEPLOY
• Will the newly implemented functionality work?
• Will it break any existing functionality?
- EVERY BACKEND DEVELOPER
ESTABLISHED METHODS OF TESTING
• We only test the edge cases we can think of
• The number of code paths grows with project complexity
• Time and effort
YOLO METHODS
BLUE GREEN DEPLOYMENTS
CANARY DEPLOYMENTS
YOLO DRAWBACKS
• Requires active monitoring and reverting
• Users are still affected by regressions
PRODUCTION REQUEST SHADOWING
SHADOW PROXY SERVER
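A minimal sketch of the duplication step, assuming the production and shadow application servers can be modelled as plain Python callables. The names `shadow_request`, `timed_call`, `Result` and the `record` callback are hypothetical and are not taken from the talk. Both handlers receive the same request, both results are recorded for later comparison, and only the production response is returned to the user. A real proxy would probably avoid making the user wait for the shadow side; this sketch waits for both to keep the flow easy to follow.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, Optional

Handler = Callable[[str], str]


@dataclass
class Result:
    body: Optional[str]                      # response body, None on failure
    seconds: float                           # wall-clock processing time
    error: Optional[BaseException] = None    # captured exception, if any


def timed_call(handler: Handler, request: str) -> Result:
    """Run one handler, capturing its response, duration and any exception."""
    start = time.perf_counter()
    try:
        body = handler(request)
        return Result(body, time.perf_counter() - start)
    except Exception as exc:
        return Result(None, time.perf_counter() - start, exc)


def shadow_request(request: str, production: Handler, shadow: Handler,
                   record: Callable[[str, Result, Result], None]) -> str:
    """Send the same request to both handlers; return only the production body."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        prod_future = pool.submit(timed_call, production, request)
        shadow_future = pool.submit(timed_call, shadow, request)
        prod_result = prod_future.result()
        shadow_result = shadow_future.result()

    # Keep both results so content and timing can be compared later.
    record(request, prod_result, shadow_result)

    # A shadow failure never reaches the user; a production failure still does.
    if prod_result.error is not None:
        raise prod_result.error
    return prod_result.body
```

Because the shadow result is only recorded and never returned, an exception in the new release shows up in the comparison data rather than in front of the user.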
SHADOWING ADVANTAGES
• Users can't be affected by regressions in the new release
• We can compare production and shadow response content and
performance to catch regressions!
SHADOWING ADVANTAGES
• Free real world test inputs
• Free assertions - the last known-good version
REAL TIME
ZERO EFFORT
RISK-FREE
TESTING
WITH PRODUCTION REQUESTS
ARCHITECTURE
EXAMPLE
PROBLEMS
DATABASE
• Side effects of concurrently processing the same request on two
application servers
• For better response comparisons we need a shadow DB that has the same
data
DATABASE SOLUTION
• Dynamically mock the database layer (warning, this may be hard)
• Writable production read replica in the shadow environment
• Re-synced once enough changes happen
• Introduces some comparison noise
• Good enough™
EXTERNAL SERVICES
• Sending emails, processing payments, ...
• Data integrity
• Rate limiting + cost
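The speaker notes for this problem suggest environment flags: identify the external service calls and mock them in the shadow environment. The sketch below shows one way that could look for email. The flag name `APP_SHADOW_MODE` and the classes `SmtpMailer` and `NullMailer` are assumptions made for illustration and do not come from the talk.

```python
import os
from typing import List, Mapping, Protocol, Tuple


class Mailer(Protocol):
    def send(self, to: str, subject: str) -> None: ...


class SmtpMailer:
    """Placeholder for the real client (hypothetical); delivery is out of scope."""

    def send(self, to: str, subject: str) -> None:
        raise NotImplementedError("real delivery is not part of this sketch")


class NullMailer:
    """Mock used in the shadow environment: records messages, sends nothing."""

    def __init__(self) -> None:
        self.outbox: List[Tuple[str, str]] = []

    def send(self, to: str, subject: str) -> None:
        self.outbox.append((to, subject))


def build_mailer(environ: Mapping[str, str] = os.environ) -> Mailer:
    """Choose the mailer from an environment flag (the flag name is assumed)."""
    if environ.get("APP_SHADOW_MODE") == "1":
        return NullMailer()
    return SmtpMailer()
```

The same switch would apply to payment or other paid and rate-limited clients, so that the shadow release exercises its own code paths without duplicating the external call.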
KEEPING STATE ACROSS REQUESTS
• CSRF tokens
• The user doesn't receive the shadow responses
• The production state is invalid in the shadow environment
ACTIVE REQUEST EDITING
• Find state tokens in shadow responses
• Persist tokens in Redis per user IP
• Replace tokens in duplicated shadow requests
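The three steps above can be sketched roughly as follows. A plain dictionary stands in for Redis, and the token locations are assumptions: a `csrf-token` meta tag in the HTML response and an `authenticity_token` field in a form-encoded request body. The function names are hypothetical.

```python
import re
from typing import Dict, Optional
from urllib.parse import quote

# Assumed token locations; a real application may use different names.
META_TOKEN = re.compile(r'<meta name="csrf-token" content="([^"]+)"')
FORM_TOKEN = re.compile(r'(authenticity_token=)[^&]*')


def remember_shadow_token(store: Dict[str, str], client_ip: str,
                          shadow_html: str) -> Optional[str]:
    """Find the state token in a shadow response and persist it per user IP."""
    match = META_TOKEN.search(shadow_html)
    if match is None:
        return None
    store[client_ip] = match.group(1)
    return match.group(1)


def rewrite_shadow_request(store: Dict[str, str], client_ip: str,
                           form_body: str) -> str:
    """Swap the production token in a duplicated request for the shadow one."""
    token = store.get(client_ip)
    if token is None:
        return form_body  # nothing stored yet, forward the request unchanged
    encoded = quote(token, safe="")  # tokens may contain '+', '/' or '='
    return FORM_TOKEN.sub(lambda m: m.group(1) + encoded, form_body)
```

Keying by IP follows the slide. Users who share an address, for example behind the same NAT, would overwrite each other's tokens under this scheme, which adds to the comparison noise.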
RESPONSE COMPARISON NOISE
• Sources of noise
• timestamps
• RNG
• external service data
• intended changes
DEVELOPER'S VIEW
CONCLUSION
• Risk free for end users
• New releases are subject to real world edge cases
• Automatic regression detection
• Increased trust in new releases
Visit infinum.co or find us on social networks:
infinum.co infinumco infinumco infinum
Questions?

Editor's Notes

  1. What are the two main questions you ask before you deploy a new release to production? - does the new functionality work? - did the changes break any existing functionality? My talk is gonna focus on helping us gain confidence to answer the second one with a strong no.
  2. What we usually do to gain confidence is to write a bunch of tests of all shapes and sizes. They are great, but they don't get us all the way there. There's always that lingering feeling of "what did we miss in tests, will this really work in production?" In essence, we are the limiting factor when it comes to tests. We only test the edge cases we can think of; there are a lot of them, and testing all possibilities takes too much time and effort. So what can we do?
  3. Test code in production! This seems funny at first but is the core principle of the most well known methods of deployment.
  4. I call these YOLO methods. We push a release to production even though it may have regressions. We don't care about that too much because we are sure we can catch them in time and revert to the previous version.
  5. The first of these methods is blue-green deployment. It consists of a pair of production-like environments: one contains the current production release and the other contains the next release. The central router routes all user requests to the new release. If we detect regressions or increased error rates, we just reroute everyone back to the old production release.
  6. Canary deployments are very similar, but differ in that we only send a small portion of users to the new release.
  7. Both of these methods work to a degree, but they have some drawbacks. We need a monkey/developer to observe the behaviour of the releases and revert to the old version. What's even worse is that these methods still allow users to experience regressions. They just limit the exposure to a percentage of users or a time period. Bummer.
  8. Production request shadowing is a method that aims at solving these problems. It's not very widely used, I only found mentions in some Amazon patents and Twitter/Cookpad talks.
  9. The core idea of shadowing is that user requests are fed through the shadow proxy server and duplicated to both a production environment and a shadow environment, which contains the new release. Both application servers process the same request and then return their responses to the shadow proxy server. The user is then sent only the production response.
  10. It's a simple idea, but it brings a lot to the table. Because users only see production responses they can't be affected by regressions in the new release. More importantly we can compare production and shadow content and performance to catch regressions!
  11. We get free real world test inputs from users in production AND free assertions. If the new release response is not the same as the production one we classify it as a regression.
  12. This equates to real time, zero effort, risk-free testing with production requests.
  13. So everything up until now was theoretical; now we'll look at the example I implemented for my thesis. Let's start with the architecture.
  14. This looks a bit more complicated than the last picture. We have a bunch of application servers, 2 in the production environment and 1 in the shadow environment. They use corresponding separate databases. The 2 main components that I implemented were the shadow proxy server and the command & control application. User requests flow as follows: the HTTP server forwards them to either the production environment or the shadow proxy server. The shadow proxy server in turn duplicates them to a production app instance and a shadow app instance running the new release. The results are then stored in Redis for comparison by the command & control application. The user then gets sent just the production response.
  15. As with anything once you start looking closer a bunch of issues start popping up.
  16. The biggest one is related to the database. Concurrently processing the same user request on two application servers can lead to some unexpected behaviour. Imagine processing a request that wants to delete a record. The faster of the two servers would fire an SQL query and succeed, while the second one would return an error because the record wouldn't exist anymore. The same goes for uniqueness validations. The responses would differ too much. To achieve the best response comparison, we would want the underlying production and shadow databases to be separate bit-by-bit copies of each other at all times.
  17. The correct way to do this would be to dynamically mock the database layer in the shadow environment, but that is pretty hard. Twitter supposedly did it. The easier but less correct way I chose was to introduce a writable production read replica using Bucardo. This means that the 2 databases are the same most of the time, but the shadow one can deviate to a certain degree. Once it deviates too much we re-sync it with production to improve response comparisons. Having this deviation does introduce some comparison noise, but it allows us to have a much simpler infrastructure: all we need is another DB.
  18. The next problem is external services. There are certain types of requests we don't want to duplicate, such as sending emails to clients or processing payment data. We could also run into rate limiting or high costs just by duplicating external service calls.
  19. The solution for this is quite simple - Environment flags. We need to identify and mock all the external service calls in the shadow environment.
  20. Another issue we would quickly run into, especially if we render views, is keeping state across multiple requests. This can occur, for example, with CSRF tokens. Because the user only receives the production response and the token included in it, all the subsequent requests they send will be invalid in the shadow environment.
  21. To solve this we need to actively edit user requests. We need to search for state tokens in the shadow responses, store them per user IP in Redis and replace them in the duplicated shadow requests.
  22. So once we solve all of those technical problems there comes the biggest usability issue. In a perfect world a request being processed by 2 application servers at the same time with the same release would always return the same response. Sadly this is not true as there are many sources of noise in the responses. A few examples of these are timestamps, values dictated by random number generators, external service data and all the intended changes we made in the new release.
  23. To combat this I implemented global and per-endpoint comparison rules. By default all response changes result in a negative score. Using a regular expression we can catch the part of the data that changes and give it a neutral score modifier. In this example we're ignoring the CSRF token in the meta tag.
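
A rough sketch of how such comparison rules might work, under stated assumptions: the rule tables, the `/orders` endpoint, the `generated_at` field and the line-by-line scoring are all invented for illustration and are not the thesis implementation. Each rule is a regular expression whose matches are masked in both responses before comparison, so they contribute a neutral score; every remaining differing line costs one point.

```python
import re
from typing import Dict, List, Pattern

# Global rules apply to every endpoint; here the CSRF meta tag is ignored.
GLOBAL_RULES: List[Pattern[str]] = [
    re.compile(r'<meta name="csrf-token" content="[^"]*"'),
]

# Per-endpoint rules (the endpoint and field name are hypothetical).
ENDPOINT_RULES: Dict[str, List[Pattern[str]]] = {
    "/orders": [re.compile(r'"generated_at":\s*"[^"]*"')],
}


def neutralize(text: str, rules: List[Pattern[str]]) -> str:
    """Mask every span matched by a rule so it cannot register as a change."""
    for rule in rules:
        text = rule.sub("<ignored>", text)
    return text


def score(endpoint: str, production: str, shadow: str) -> int:
    """Return 0 for equivalent responses, minus one per differing line."""
    rules = GLOBAL_RULES + ENDPOINT_RULES.get(endpoint, [])
    prod_lines = neutralize(production, rules).splitlines()
    shadow_lines = neutralize(shadow, rules).splitlines()
    changed = sum(1 for a, b in zip(prod_lines, shadow_lines) if a != b)
    changed += abs(len(prod_lines) - len(shadow_lines))  # added or missing lines
    return -changed
```

With these rules, two pages that differ only in their CSRF token score 0, while a changed heading still scores -1 and would be flagged as a possible regression.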