Speakers
Statistical Models,
Explored and Explained
Sara Vafi, Stats Expert, Optimizely
Shana Rusonis, Product Marketing, Optimizely
Today’s Speakers
Sara Vafi Shana Rusonis
Housekeeping
• We’re recording!
• Slides and recording will be
emailed to you tomorrow
• Time for questions at the end
Agenda
• Bayesian & Frequentist Statistics
• Error Control - Average vs. All Error Control
• Bayes Rule
• Benefits & Risks
• Optimizely Stats Engine
• Q&A
Why Do We Experiment?
● Experimentation is essential for learning
● Try new ideas without fear of failure
● Give your business a signal to act on
in a sea of noisy data
What’s Most Important to You?
● Running experiments quickly
● But also reporting on results accurately
● Not all statistical solutions are created equal
Types of Statistical Methods
Bayesian
OR
Frequentist
Bayesian Statistics
● Bayesian statistics takes a more bottom-up approach to data analysis
● Our parameters are unknown
● The data is fixed
● There is a prior probability
● “Opinion-based”
“A Bayesian is one who, vaguely
expecting a horse, and catching a
glimpse of a donkey, strongly believes
he has seen a mule.”
Source
Frequentist Statistics
● Frequentist arguments are more counter-factual in nature
● Parameters remain constant during the repeatable sampling process
● Resemble the type of logic that lawyers use in court
● ‘Is this variation different from the control?’ is a basic building block of this
approach.
Example
Dan & Pete Rolling a 6-Sided Die
Scenario:
● Pete will roll a die and the outcome can be 1, 2, 3, 4, 5, or 6
● If Pete rolls a 4, he will give Dan $1 million
If Dan were a Bayesian statistician, how would he react?
If Dan were a Frequentist statistician, how would he react?
Example
Probability of the sun exploding
Source
Error Control
Error Control Explained
● The likelihood that the observed result of an experiment happened by chance,
rather than because of a change that you introduced
● When we set the statistical significance on an experiment to 90%, that means
there's a 10% chance of a statistical error: when a change has no real effect,
there is a 1 in 10 chance that the experiment still reports a significant result
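The 1 in 10 figure can be checked with a small simulation. The sketch below is an illustration, not the method used in the talk or in Stats Engine: it assumes a fixed-sample two-proportion z-test, a 10% conversion rate, and hypothetical function names. It runs many A/A experiments (no real difference between arms) and counts how often the test still reports significance at the 90% level.

```python
import math
import random
from statistics import NormalDist


def two_sided_p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value of a two-proportion z-test with pooled standard error."""
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 1.0  # both arms converted 0% or 100%: no evidence of a difference
    z = (conv_b / n_b - conv_a / n_a) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def false_positive_rate(n_experiments=1000, n_visitors=500, rate=0.10,
                        significance=0.90, seed=0):
    """Share of A/A experiments (no true difference) that are called significant."""
    rng = random.Random(seed)
    alpha = 1 - significance
    false_positives = 0
    for _ in range(n_experiments):
        # Both arms share the same true conversion rate (assumed value).
        conv_a = sum(rng.random() < rate for _ in range(n_visitors))
        conv_b = sum(rng.random() < rate for _ in range(n_visitors))
        if two_sided_p_value(conv_a, n_visitors, conv_b, n_visitors) < alpha:
            false_positives += 1
    return false_positives / n_experiments


if __name__ == "__main__":
    print(f"Observed false positive rate: {false_positive_rate():.3f}")
```

The observed rate should land close to 0.10, and this holds only because each simulated experiment is evaluated once, at a sample size fixed in advance.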
Average Error Control
● Corresponds to Bayesian A/B Testing
● Less useful for iterating on test results
● Harder to learn from individual experiments with confidence
All Error Control
● Corresponds to Frequentist A/B Testing
● Any experiment will have less than a 10% chance of a mistake
● Rate of errors is at most 1 in 10
Average Error Control vs. All Error Control
● Average error control leads to lower accuracy for small improvements
● All error control is accurate for all users
● There are certain cases where average error control is an appropriate
alternative
Error Rates for Experiments
Bayes Rule
Average Error Control & Bayesian A/B
Testing
● Requires two sources of randomness
○ Randomness or “noise” in the data
○ The makeup of the “typical” experiment group
● Distribution over experiment improvements
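To make the two sources of randomness concrete, here is a minimal sketch of a Bayesian A/B comparison. It is an assumption-laden illustration rather than the model from the talk: it places a Beta prior on each variation's conversion rate (the talk describes a distribution over improvements), and the function name and parameters are hypothetical. The prior encodes beliefs about the "typical" experiment, and the observed conversions supply the noisy data.

```python
import random


def prob_b_beats_a(conv_a, n_a, conv_b, n_b,
                   prior_alpha=1.0, prior_beta=1.0,
                   draws=20000, seed=0):
    """Monte Carlo estimate of P(rate_B > rate_A | data) under Beta priors."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(draws):
        # A Beta(a, b) prior updated with s conversions out of n visitors
        # gives a Beta(a + s, b + n - s) posterior.
        rate_a = rng.betavariate(prior_alpha + conv_a, prior_beta + n_a - conv_a)
        rate_b = rng.betavariate(prior_alpha + conv_b, prior_beta + n_b - conv_b)
        wins += rate_b > rate_a
    return wins / draws


if __name__ == "__main__":
    # Same data, two different beliefs about the typical experiment.
    flat = prob_b_beats_a(100, 1000, 120, 1000)
    skeptical = prob_b_beats_a(100, 1000, 120, 1000,
                               prior_alpha=100, prior_beta=900)
    print(f"Flat prior:      P(B > A) = {flat:.3f}")
    print(f"Skeptical prior: P(B > A) = {skeptical:.3f}")
```

With identical data, the prior concentrated around a 10% conversion rate yields a lower probability that B beats A than the flat prior does, which is why the conclusion of a Bayesian test depends on the assumed makeup of typical experiments.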
Different Beliefs in Composition of ‘Typical’ Experiments
Bayes Rule
Bayes Rule & Bayesian A/B Testing
Bayes Rule & Average Error Value
Recap Average Error Control
Bayesian A/B Testing
Prior Distributions
Bayes Rule
All Error Control is Frequentist A/B Testing
● All error control corresponds to Frequentist A/B testing
● We aim to control the false positive rate
● The false positive rate is the chance that an experiment with no real difference is called a winner or a loser
Benefits & Risks
Benefits of Bayesian A/B Testing
● Average error control can be very
attractive
● Helps solve the “peeking” problem
● Average error control is fast
Risks of Bayesian A/B Testing
● It’s more appealing but it’s risky in practice
● Smaller improvement experiments with fast results = high risk
● Higher error rate than the method actually suggests
Benefits of Frequentist A/B Testing
● This type of test will make fewer mistakes on experiments with
non-zero improvements
● The rate of errors will be less than 1 in 10
● Option to speed up experimentation by using a prior
Learning from A/B Tests
Learning from A/B Tests
Risk Involved with Typical Realistic Experiments
Realistic Bayesian A/B Tests vs. Stats Engine
● The hardest experiments to call correctly are those with small
improvements
● A/B testing in the wild is not easy
● We need more and more data in order to achieve average error control
on realistic experiments
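A rough way to see why small improvements need so much more data is the textbook sample-size approximation for a fixed-horizon two-proportion z-test. This is a frequentist illustration under assumed values (10% baseline conversion rate, 90% significance, 80% power), not a description of how Stats Engine or a Bayesian test decides when to stop; the function name is hypothetical.

```python
import math
from statistics import NormalDist


def visitors_per_variation(baseline_rate, relative_lift,
                           significance=0.90, power=0.80):
    """Approximate visitors needed per arm for a two-sided two-proportion z-test."""
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_lift)
    z_alpha = NormalDist().inv_cdf(1 - (1 - significance) / 2)
    z_power = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_power) ** 2 * variance / (p2 - p1) ** 2)


if __name__ == "__main__":
    for lift in (0.20, 0.10, 0.05, 0.02):
        n = visitors_per_variation(baseline_rate=0.10, relative_lift=lift)
        print(f"{lift:.0%} relative lift: about {n} visitors per variation")
```

Because the required sample grows with the inverse square of the improvement, halving the lift roughly quadruples the number of visitors needed.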
So what does this mean?
Stats Engine
Stats Engine™
Results are valid whenever you
check
Avoid costly statistics errors
Measure real-time results
with confidence
Key Takeaways
● Bayesian vs. Frequentist methods
● All error control vs. average error control
● Blended approach leads to greater confidence
QUESTIONS?
THANK YOU!

 
Experimentation through Clients' Eyes
Experimentation through Clients' EyesExperimentation through Clients' Eyes
Experimentation through Clients' Eyes
 
Shipping to Learn and Accelerate Growth with GitHub
Shipping to Learn and Accelerate Growth with GitHubShipping to Learn and Accelerate Growth with GitHub
Shipping to Learn and Accelerate Growth with GitHub
 
Test Everything: TrustRadius Delivers Customer Value with Experimentation
Test Everything: TrustRadius Delivers Customer Value with ExperimentationTest Everything: TrustRadius Delivers Customer Value with Experimentation
Test Everything: TrustRadius Delivers Customer Value with Experimentation
 
Optimizely Agent: Scaling Resilient Feature Delivery
Optimizely Agent: Scaling Resilient Feature DeliveryOptimizely Agent: Scaling Resilient Feature Delivery
Optimizely Agent: Scaling Resilient Feature Delivery
 

Statistical Models Explored and Explained

  • 1. Statistical Models, Explored and Explained ● Speakers: Sara Vafi, Stats Expert, Optimizely; Shana Rusonis, Product Marketing, Optimizely
  • 3. Housekeeping • We’re recording! • Slides and recording will be emailed to you tomorrow • Time for questions at the end
  • 4. Agenda • Bayesian & Frequentist Statistics • Error Control - Average vs. All Error Control • Bayes Rule • Benefits & Risks • Optimizely Stats Engine • Q&A
  • 5. Why Do We Experiment? ● Experimentation is essential for learning ● Try new ideas without fear of failure ● Give your business a signal to act on in a sea of noisy data
  • 6. What’s most Important to You? ● Running experiments quickly ● But also reporting on results accurately ● When not all statistical solutions are created equal
  • 7. Types of Statistical Methods ● Bayesian or Frequentist
  • 8. Bayesian Statistics ● Bayesian statistics take a more bottom-up approach to data analysis ● Our parameters are unknown ● The data is fixed ● There is a prior probability ● “Opinion-based”
  • 9. “A Bayesian is one who, vaguely expecting a horse, and catching a glimpse of a donkey, strongly believes he has seen a mule.” Source
  • 10. Frequentist Statistics ● Frequentist arguments are more counter-factual in nature ● Parameters remain constant during the repeatable sampling process ● Resemble the type of logic that lawyers use in court ● ‘Is this variation different from the control?’ is a basic building block of this approach.
  • 11. Example Dan & Pete Rolling a 6-Sided Die Scenario: ● Pete will roll a die and the outcome can either be 1, 2, 3, 4, 5, or 6 ● If Pete rolls a 4, he will give Dan $1 million If Dan was a Bayesian statistician, how would he react? If Dan was a Frequentist statistician, how would he react?
  • 12. Example Probability of the sun exploding Source
  • 14. Error Control Explained ● Error control limits the likelihood that the observed result of an experiment happened by chance, rather than because of a change that you introduced ● When we set the statistical significance on an experiment to 90%, we accept up to a 10% chance of a false positive: when there is no real difference, about 1 in 10 experiments will still appear to show one
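The 1-in-10 figure on slide 14 can be checked with a small simulation. This is a minimal sketch, not Optimizely's method: it assumes a classical two-proportion z-test, a hypothetical 10% baseline conversion rate, and 500 visitors per arm, and it runs "A/A" experiments in which both arms are identical, so every declared difference is a false positive.

```python
import math
import random


def two_sided_p_value(conv_a, n_a, conv_b, n_b):
    """Two-proportion z-test p-value (normal approximation)."""
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 1.0
    z = (conv_b / n_b - conv_a / n_a) / se
    # erfc(|z| / sqrt(2)) is the two-sided tail area of the standard normal.
    return math.erfc(abs(z) / math.sqrt(2))


def false_positive_rate(trials=2000, n=500, rate=0.10, alpha=0.10, seed=0):
    """Share of A/A experiments (no real difference) flagged as significant."""
    rng = random.Random(seed)
    flagged = 0
    for _ in range(trials):
        # Both arms draw from the same conversion rate (assumed value).
        a = sum(rng.random() < rate for _ in range(n))
        b = sum(rng.random() < rate for _ in range(n))
        if two_sided_p_value(a, n, b, n) < alpha:
            flagged += 1
    return flagged / trials


fpr = false_positive_rate()
print(f"false positive rate at 90% significance: {fpr:.3f}")
```

With a 90% significance threshold (alpha = 0.10), the printed rate should land near 0.10, which is the "1 in 10" on the slide.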
  • 15. Average Error Control ● Corresponds to Bayesian A/B Testing ● Less useful for iterating on test results ● Harder to learn from individual experiments with confidence
  • 16. All Error Control ● Corresponds to Frequentist A/B Testing ● At 90% statistical significance, any experiment will have less than a 10% chance of a mistake ● Rate of errors is at most 1 in 10
  • 17. Average Error Control vs. All Error Control ● Average error control leads to lower accuracy for small improvements ● All error control is accurate for all users ● There are certain cases where average error control is an appropriate alternative
  • 18. Error Rates for Experiments
  • 20. Average Error Control & Bayesian A/B Testing ● Requires two sources of randomness ○ Randomness or “noise” in the data ○ The makeup of the “typical” experiment group ● Distribution over experiment improvements
  • 21. Different Beliefs in Composition of ‘Typical’ Experiments
  • 23. Bayes Rule & Bayesian A/B Testing
  • 24. Bayes Rule & Average Error Value
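The figures on slides 23 and 24 are not part of this transcript. As a generic illustration of how Bayes rule links a belief about the "typical" experiment to an average error rate, the sketch below computes the probability that an experiment called a winner has a real improvement. The function name and all numbers (prior share of real improvements, 80% power, 10% false positive rate) are assumptions chosen for illustration, not values from the talk.

```python
def prob_real_given_winner(prior_real, power, false_positive_rate):
    """Bayes rule: P(real improvement | experiment called a winner).

    prior_real          -- assumed share of experiments with a real improvement
    power               -- assumed P(called a winner | real improvement)
    false_positive_rate -- assumed P(called a winner | no improvement)
    """
    true_positives = power * prior_real
    false_positives = false_positive_rate * (1 - prior_real)
    return true_positives / (true_positives + false_positives)


# The same test behaves very differently under different priors.
for prior in (0.05, 0.20, 0.50):
    posterior = prob_real_given_winner(prior, power=0.80, false_positive_rate=0.10)
    print(f"prior {prior:.2f} -> P(real | winner) = {posterior:.3f}")
```

Under these assumed inputs, the same test gives very different answers depending on the prior, which is why the belief about what a typical experiment looks like matters so much for average error control.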
  • 25. Recap ● Average Error Control ● Bayesian A/B Testing ● Prior Distributions ● Bayes Rule
  • 26. All Error Control is Frequentist A/B Testing ● All error control corresponds to Frequentist A/B testing ● We want to control the false positive rate ● That is, the chance that an experiment with no real difference is called either a winner or a loser
  • 28. Benefits of Bayesian A/B Testing ● Average error control can be very attractive ● Helps solve the “peeking” problem ● Average error control is fast
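The "peeking" problem on slide 28 can be made concrete with a simulation. This is a minimal sketch under stated assumptions: a classical fixed-horizon two-proportion z-test, a hypothetical 10% conversion rate in both arms, and ten interim looks. It is not an implementation of Stats Engine or of any Bayesian method. It only shows how often an A/A experiment is flagged at some look, compared with a single look at the end.

```python
import math
import random


def p_value(conv_a, conv_b, n):
    """Two-sided z-test p-value for two arms with n visitors each."""
    pooled = (conv_a + conv_b) / (2 * n)
    se = math.sqrt(pooled * (1 - pooled) * 2 / n)
    if se == 0:
        return 1.0
    z = (conv_b - conv_a) / n / se
    return math.erfc(abs(z) / math.sqrt(2))


def simulate(trials=1000, looks=10, batch=100, rate=0.10, alpha=0.10, seed=1):
    """Return (error rate when peeking at every look, error rate at final look)."""
    rng = random.Random(seed)
    peeking_errors = 0
    final_errors = 0
    for _ in range(trials):
        a = b = 0
        flagged = False
        for look in range(1, looks + 1):
            # Both arms share the same true rate, so any "win" is an error.
            a += sum(rng.random() < rate for _ in range(batch))
            b += sum(rng.random() < rate for _ in range(batch))
            p = p_value(a, b, look * batch)
            if p < alpha:
                flagged = True  # a peeker would have stopped and called it here
        peeking_errors += flagged
        final_errors += p < alpha  # p now holds the value from the last look
    return peeking_errors / trials, final_errors / trials


peeking, final = simulate()
print(f"error rate with peeking: {peeking:.3f}, single final look: {final:.3f}")
```

The single final look stays near the nominal 10%, while checking at every look pushes the error rate well above it.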
  • 29. Risks of Bayesian A/B Testing ● It’s more appealing but it’s risky in practice ● Smaller improvement experiments with fast results = high risk ● Higher error rate than the method actually suggests
  • 30. Benefits of Frequentist A/B Testing ● This type of test will make fewer mistakes on experiments with non-zero improvements ● The rate of errors will be less than 1 in 10 ● Option to speed up experimentation by using a prior
  • 33. Risk Involved with Typical Realistic Experiments
  • 34. Realistic Bayesian A/B Tests vs. Stats Engine
  • 35. So what does this mean? ● The hardest experiments to call correctly are those with small improvements ● A/B testing in the wild is not easy ● We need more and more data in order to achieve average error control on realistic experiments
  • 37. Stats Engine™ ● Results are valid whenever you check ● Avoid costly statistics errors ● Measure real-time results with confidence
  • 38. Key Takeaways ● Bayesian vs. Frequentist methods ● All error control vs. average error control ● Blended approach leads to greater confidence