Experimentation Platform

      Ashok Banerjee
Motivation
• Innovation iteration -> correct evaluation
   – Blindingly obvious
   – Clear but deductive reasoning (involved)
   – A/B Testing
      • Segment based optimization
      • Multi dimensional impact and stochastic

      • Incremental Radicalism

      • Disclaimer: Some parts of this platform already exist;
        more will come to life, and we will solicit more input and
        involvement
Experimentation Platform: Components
• Bucketing (A or B)
  – Web Bucketing on User Cohorts
  – Supply Chain Bucketing on Order Basket or
    Warehouse (e.g. Packing)
• Control variables – what is being tested
  – Price
  – Gift Wrap
  – Position on Web Page
  – Recommendation Positioning
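
One common way to get stable A-or-B bucketing is to hash the unit being bucketed (a user, an order basket, a warehouse), so the same unit always lands in the same bucket without stored state. This is a minimal sketch under that assumption, not the platform's actual implementation; `assign_bucket`, the salt format, and the 50/50 default split are hypothetical.

```python
import hashlib

def assign_bucket(unit_id: str, experiment: str, treatment_share: float = 0.5) -> str:
    """Deterministically map a unit (user, order, warehouse) to bucket A or B."""
    # Salting with the experiment name keeps assignments independent across experiments.
    digest = hashlib.sha256(f"{experiment}:{unit_id}".encode("utf-8")).digest()
    # Interpret the first 8 bytes as a fraction in [0, 1).
    fraction = int.from_bytes(digest[:8], "big") / 2**64
    return "B" if fraction < treatment_share else "A"

# The same user always gets the same answer for the same experiment.
print(assign_bucket("user-123", "gift_wrap_default"))
```

Hashing on the order or warehouse identifier instead of the user identifier gives the supply chain variant of the same idea.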
Experimentation Platform
• Result variables (often studied for a week to a
  month)
   –   Repeat Visit
   –   Repeat Buy
   –   Repeat Engagement
   –   Spend
• Result interpretation
   – Z-test
   – T-test
   – Chi Squared
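
As an illustration of interpreting a yes/no result variable such as Repeat Buy, the sketch below computes a chi-squared statistic for a 2x2 table (bucket by outcome). The counts are made up for the example, and `chi_squared_2x2` is a hypothetical helper; the p-value formula is the standard one for one degree of freedom.

```python
import math

def chi_squared_2x2(a_yes: int, a_no: int, b_yes: int, b_no: int):
    """Pearson chi-squared statistic and p-value (1 degree of freedom) for a 2x2 table."""
    observed = [[a_yes, a_no], [b_yes, b_no]]
    row_totals = [a_yes + a_no, b_yes + b_no]
    col_totals = [a_yes + b_yes, a_no + b_no]
    total = sum(row_totals)
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed[i][j] - expected) ** 2 / expected
    # For 1 degree of freedom, P(chi2 > stat) = erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

# Hypothetical counts: repeat buyers vs. non-repeat buyers in buckets A and B.
stat, p_value = chi_squared_2x2(120, 880, 150, 850)
print(f"chi2={stat:.3f}, p={p_value:.3f}")
```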
Bucketing (Web)
• Bucketing: Declarative Common Cohorts
  – User (sync): Cohorts are complex queries, often run
    async if sufficiently complex, e.g.
     • Users who bought Books with increasing spend but did not
       buy Electronics
     • User Activity Store: searches, clicks, views, etc.
     • Results are cached and hit at web scale

     • Cohorts can be selected declaratively e.g.
         –   Category Purchased
         –   Search Ranking
         –   Email Marketing
         –   Spend slope
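
A declarative cohort can be pictured as data that a generic evaluator checks against a cached user profile. The sketch below is an assumption about how that might look; the operator names, the profile fields (`categories_purchased`, `spend_slope`), and `in_cohort` are all hypothetical.

```python
from typing import Any, Callable, Dict, List

# Hypothetical operators a declarative cohort spec may use.
OPERATORS: Dict[str, Callable[[Any, Any], bool]] = {
    "contains": lambda actual, expected: expected in actual,
    "not_contains": lambda actual, expected: expected not in actual,
    "gt": lambda actual, expected: actual > expected,
}

# "Bought Books, spend increasing, did not buy Electronics" expressed as data.
BOOK_BUYERS: List[Dict[str, Any]] = [
    {"field": "categories_purchased", "op": "contains", "value": "Books"},
    {"field": "categories_purchased", "op": "not_contains", "value": "Electronics"},
    {"field": "spend_slope", "op": "gt", "value": 0.0},
]

def in_cohort(profile: Dict[str, Any], cohort: List[Dict[str, Any]]) -> bool:
    """A profile is in the cohort only if every rule holds."""
    return all(
        OPERATORS[rule["op"]](profile[rule["field"]], rule["value"])
        for rule in cohort
    )

profile = {"categories_purchased": {"Books", "Toys"}, "spend_slope": 0.4}
print(in_cohort(profile, BOOK_BUYERS))  # True for this hypothetical profile
```

Keeping the cohort as data means a new cohort needs a new spec, not a code change.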
Bucketing (Fulfilment)
– Order Fulfilment (async): Rules
   • RETE evaluation of rules: each predicate is evaluated a
     minimal number of times across ~1000 rules
   • Async process => on the fly evaluation
– Interaction Plots need to be looked into for
  multiple experiments
– Exclusive buckets on control variables
   • e.g. 2 experiments cannot both decide on gift wrap
   • Price cannot be influenced by 2 different experiments
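
The exclusivity rule can be checked mechanically before experiments go live. The sketch below is one possible check, assuming each experiment declares the control variables it touches; `find_conflicts` and the experiment names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Set

def find_conflicts(experiments: Dict[str, Set[str]]) -> Dict[str, List[str]]:
    """Return each control variable claimed by more than one experiment."""
    owners: Dict[str, List[str]] = defaultdict(list)
    for name in sorted(experiments):
        for variable in experiments[name]:
            owners[variable].append(name)
    return {var: names for var, names in owners.items() if len(names) > 1}

# Hypothetical registry: two experiments both want to decide on gift wrap.
registry = {
    "exp_free_wrap": {"gift_wrap"},
    "exp_premium_bundle": {"gift_wrap", "price"},
    "exp_reco_top": {"recommendation_position"},
}
print(find_conflicts(registry))  # {'gift_wrap': ['exp_free_wrap', 'exp_premium_bundle']}
```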
Control Variables
• Control Variables: Configuration Based delta
   – Price elasticity
   – Position on page
   – Recommendation
   – Gift Wrap
   – Business Flow (e.g. in Mumbai a new Packing
     technique) => BPM
Execution
• Execution
  – Client Library to evaluate
  – if (experiment45) { ….. }
  – Configuration based deviators
     • Better still evaluate experiment deviator e.g.
     • SLA = SLA - experimentDelta (experimenting with early
       delivery)
        – experimentDelta comes from config service
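
The deviator idea above can be sketched as follows: business code always applies a delta, and the delta is zero unless configuration says otherwise. The config layout, `experiment_delta`, and `promised_sla_days` are hypothetical stand-ins for the client library and config service.

```python
from typing import Dict

# Hypothetical payload from the config service: delta (in days) per experiment and bucket.
CONFIG: Dict[str, Dict[str, float]] = {
    "early_delivery": {"B": 1.0},
}

def experiment_delta(config: Dict[str, Dict[str, float]], experiment: str, bucket: str) -> float:
    """Delta for this bucket; 0.0 when the experiment or bucket is not configured."""
    return config.get(experiment, {}).get(bucket, 0.0)

def promised_sla_days(base_sla_days: float, config: Dict[str, Dict[str, float]],
                      experiment: str, bucket: str) -> float:
    """SLA = SLA - experimentDelta, with no experiment-specific branch in the caller."""
    return base_sla_days - experiment_delta(config, experiment, bucket)

print(promised_sla_days(3.0, CONFIG, "early_delivery", "B"))  # 2.0
print(promised_sla_days(3.0, CONFIG, "early_delivery", "A"))  # 3.0
```

Compared with an `if (experiment...)` branch, the calling code stays the same when an experiment starts or ends; only the configuration changes.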


Multi-armed bandit to apply the changes?
  90% Greedy and 10% random
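
A 90% greedy, 10% random policy is usually called epsilon-greedy with epsilon = 0.1. The sketch below shows that policy under simple assumptions (rewards are numbers, arms are named by strings); the class and arm names are hypothetical.

```python
import random
from typing import Dict, List, Optional

class EpsilonGreedy:
    """Pick the best-known arm most of the time and a random arm otherwise."""

    def __init__(self, arms: List[str], epsilon: float = 0.1,
                 rng: Optional[random.Random] = None) -> None:
        self.epsilon = epsilon
        self.rng = rng or random.Random()
        self.pulls: Dict[str, int] = {arm: 0 for arm in arms}
        self.rewards: Dict[str, float] = {arm: 0.0 for arm in arms}

    def mean_reward(self, arm: str) -> float:
        return self.rewards[arm] / self.pulls[arm] if self.pulls[arm] else 0.0

    def choose(self) -> str:
        arms = list(self.pulls)
        if self.rng.random() < self.epsilon:
            return self.rng.choice(arms)          # explore (10% by default)
        return max(arms, key=self.mean_reward)    # exploit (90% by default)

    def update(self, arm: str, reward: float) -> None:
        self.pulls[arm] += 1
        self.rewards[arm] += reward

bandit = EpsilonGreedy(["A", "B"], epsilon=0.1, rng=random.Random(42))
bandit.update("A", 0.0)
bandit.update("B", 1.0)
print(bandit.choose())
```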
Binomial at Large # -> Normal
• Binomial (Most human decisions) -> Normal
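  – Expansion: $(p + q)^n = \sum_{r=0}^{n} \binom{n}{r} p^r q^{n-r}$
  – Term for r successes: $Y_r = \binom{n}{r} p^r q^{n-r}$
  – Relative step for large n: $\frac{Y_{r+1} - Y_r}{Y_r} \approx -\frac{x}{\sigma^2}$, where $x = r - np$ and $\sigma^2 = npq$ is (std dev)$^2$
  – Continuum limit: $\frac{dY}{Y} = -\frac{x \, dx}{\sigma^2}$, which integrates to $Y \propto e^{-x^2 / (2\sigma^2)}$ (the Normal curve)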
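
The approximation can be checked numerically. The sketch below compares exact binomial probabilities with the Normal density that has the same mean and standard deviation; the choice of n = 1000 and p = 0.3 is arbitrary, and the function names are hypothetical.

```python
import math

def binomial_pmf(n: int, r: int, p: float) -> float:
    """Exact binomial probability, computed in log space to avoid overflow."""
    log_pmf = (math.lgamma(n + 1) - math.lgamma(r + 1) - math.lgamma(n - r + 1)
               + r * math.log(p) + (n - r) * math.log(1 - p))
    return math.exp(log_pmf)

def normal_pdf(x: float, mean: float, sd: float) -> float:
    """Normal density with the given mean and standard deviation."""
    return math.exp(-((x - mean) ** 2) / (2 * sd * sd)) / (sd * math.sqrt(2 * math.pi))

n, p = 1000, 0.3
mean, sd = n * p, math.sqrt(n * p * (1 - p))
for r in (280, 300, 320):
    print(r, round(binomial_pmf(n, r, p), 5), round(normal_pdf(r, mean, sd), 5))
```

For large n the two columns agree closely near the mean, which is what justifies using Normal-based tests on yes/no outcomes.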
Interaction Plot
  – From Peltier Stats on OKCupid data
  – Smile: no interaction with eye contact
  – Flirty face: significant interaction
  – Beware of interaction between experiments
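
For two experiments running at once, the interaction is the amount by which the effect of one changes depending on the bucket of the other; on an interaction plot this shows up as non-parallel lines. The sketch below computes that contrast from four cell means. The numbers are invented for illustration, and `interaction_effect` is a hypothetical helper.

```python
from typing import Dict, Tuple

def interaction_effect(cell_means: Dict[Tuple[int, int], float]) -> float:
    """cell_means[(a, b)]: mean result with experiment 1 in arm a and experiment 2 in arm b
    (0 = control, 1 = treatment). Returns the difference-in-differences contrast."""
    effect_of_1_without_2 = cell_means[(1, 0)] - cell_means[(0, 0)]
    effect_of_1_with_2 = cell_means[(1, 1)] - cell_means[(0, 1)]
    return effect_of_1_with_2 - effect_of_1_without_2

# Parallel lines: experiment 1 adds 2.0 regardless of experiment 2, so no interaction.
parallel = {(0, 0): 10.0, (1, 0): 12.0, (0, 1): 11.0, (1, 1): 13.0}
# Non-parallel lines: experiment 1 adds 2.0 alone but 5.0 alongside experiment 2.
crossing = {(0, 0): 10.0, (1, 0): 12.0, (0, 1): 11.0, (1, 1): 16.0}
print(interaction_effect(parallel), interaction_effect(crossing))  # 0.0 3.0
```

A contrast near zero suggests the main effects can be read separately; whether a non-zero contrast is significant still needs a statistical test.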
Result Interpretation

• Result Interpretation
   – T-test: fewer than 30 samples [fatter tails]
   – Z-test: (x-m)/(std dev) = 1.96 at 95% confidence, two-sided [Normal]
   – Paired t-test: Return/Refund-> Gift -> Repeat Buys
   – Chi Squared
   – F test
• Do we lose anything by repeated testing until test
  convergence?
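
One way to explore the repeated-testing question is to simulate A/A experiments, where both buckets are identical, and compare how often a z-test flags a difference when checked at every interim look versus only once at the end. The sketch below does that under assumed parameters (10% conversion, 1000 users per arm, a look every 100 users); all names and numbers are hypothetical.

```python
import math
import random

def z_two_proportions(conv_a: int, n_a: int, conv_b: int, n_b: int) -> float:
    """z statistic for the difference between two conversion rates (pooled variance)."""
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 0.0
    return (conv_b / n_b - conv_a / n_a) / se

def simulate_aa(rng: random.Random, p: float = 0.1, users_per_arm: int = 1000,
                look_every: int = 100, z_crit: float = 1.96):
    """Run one A/A experiment; report whether any look, and the final look, was 'significant'."""
    conv_a = conv_b = 0
    flagged_when_peeking = False
    z = 0.0
    for i in range(1, users_per_arm + 1):
        conv_a += rng.random() < p
        conv_b += rng.random() < p
        if i % look_every == 0:
            z = z_two_proportions(conv_a, i, conv_b, i)
            if abs(z) > z_crit:
                flagged_when_peeking = True
    return flagged_when_peeking, abs(z) > z_crit

def false_positive_rates(experiments: int = 300, seed: int = 7):
    rng = random.Random(seed)
    peek = end = 0
    for _ in range(experiments):
        peek_flag, end_flag = simulate_aa(rng)
        peek += peek_flag
        end += end_flag
    return peek / experiments, end / experiments

peek_rate, end_rate = false_positive_rates()
print(f"flagged at any look: {peek_rate:.1%}; flagged at the final look only: {end_rate:.1%}")
```

Because there is no real difference between the buckets, every flag is a false positive; stopping at the first significant look can only raise that rate relative to a single test at a fixed horizon.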
Development Paradigm
–   Simplify during experiment
–   Scalability: Build experiment to work out of memory
–   Availability: Fail-Open
–   Sharding and Database: Not big scale
–   Performance: In Memory for a few nodes
–   Figure out control variables

Upper bound of expected results -> 90% of experiments
may not need to be scaled out
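
Fail-open here can be read as: if the experiment machinery breaks, the customer gets the normal (control) experience and the request still succeeds. The sketch below illustrates that reading; `bucket_or_control` and the lookup functions are hypothetical.

```python
from typing import Callable

def bucket_or_control(lookup: Callable[[str], str], unit_id: str, control: str = "A") -> str:
    """Return the experiment bucket, falling back to control if the lookup fails."""
    try:
        return lookup(unit_id)
    except Exception:
        # Fail open: an experiment outage must never break the production flow.
        return control

def broken_lookup(unit_id: str) -> str:
    raise TimeoutError("experiment service unavailable")

print(bucket_or_control(broken_lookup, "user-123"))      # A
print(bucket_or_control(lambda unit_id: "B", "user-123"))  # B
```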
Decision Paradigm
– No code needed to test an idea
– Experiments run in parallel
– Need to test for interaction and main effects
Summary
• An A/B testing platform becomes key once decisions go
  beyond the trivially obvious
• Configuration-based A/B tests (trivial to check
  on curiosity)
• Result interpretation is non-trivial and varies

Editor's Notes

  1. Don’t be brave, you will be wrong. Your predecessors were bright too; a good breakfast enhanced your mood more than your IQ. Experiment without fear: you are free to experiment, but not free to put things into production until you are sure they will help. Try every experiment. Enable everyone in the company to experiment.