How to Keep Your Gains from A/B Tests Without Accidentally Killing Them Later
 


Limits of A/B Tests
A/B tests don’t give you perfect decisions.
No matter what you do, you’re never 100% certain
If we’re not careful, winners aren’t really winners
Your conversions go up… and then they come back down
The Standard Solution
Run your test until you hit 95% statistical significance.
Go to getdatadriven.com if you need a significance calculator.
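If you'd rather compute significance yourself than lean on a calculator, here is a minimal sketch of one common approach: a pooled two-proportion z-test. This is an assumed method, not necessarily what getdatadriven.com runs under the hood, and calculators differ on one-sided versus two-sided tests. The function name and the visitor numbers are made up for illustration.

```python
from math import sqrt
from statistics import NormalDist


def significance(conv_a: int, n_a: int, conv_b: int, n_b: int) -> float:
    """Return the two-sided confidence level (1 - p-value) that A and B differ."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    # Pooled rate under the null hypothesis that both versions convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if std_err == 0:
        return 0.0  # no conversions (or all conversions): nothing to compare
    z = (p_b - p_a) / std_err
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return 1 - p_value


# Hypothetical test: 1,000 visitors per version, 50 vs. 70 conversions.
result = significance(50, 1000, 70, 1000)
print(f"significance: {result:.1%}", "(winner)" if result >= 0.95 else "(not yet)")
```

With these made-up numbers a 5% vs. 7% split still falls just short of 95%, which is the kind of result that tempts people to call a winner early.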
Martin Goodson’s PDF on poor testing methods: kiss.ly/bad-testing
This gives us the best data but not necessarily the best ROI.
So how far do we take this?
Simulation Time!
We modeled several A/B testing strategies. Using Monte Carlo simulations, we tested different strategies over 1 million observations (people).
Will Kurt gets full credit for all this. @willkurt
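To make the idea concrete, here is a rough Monte Carlo sketch of what "modeling a strategy" can look like. This is not Will Kurt's model. The 5% baseline rate, the way variants are generated, the alternating traffic split, the peek interval, and the function names are all assumptions chosen to keep the example short, and it defaults to 100,000 observations instead of 1 million so it runs quickly.

```python
import random
from math import sqrt
from statistics import NormalDist


def confidence(conv_a, n_a, conv_b, n_b):
    """Two-sided confidence (1 - p-value) from a pooled two-proportion z-test."""
    if min(n_a, n_b) == 0:
        return 0.0
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if std_err == 0:
        return 0.0
    z = (conv_b / n_b - conv_a / n_a) / std_err
    return 1 - 2 * (1 - NormalDist().cdf(abs(z)))


def simulate(threshold, max_people, total_people=100_000,
             base_rate=0.05, check_every=50, seed=0):
    """Run back-to-back tests and return total conversions across all visitors."""
    rng = random.Random(seed)
    control = base_rate
    conversions = seen = 0
    while seen < total_people:
        # ASSUMPTION: a new variant's true rate is the control's rate times a
        # random factor, so a variant is as likely to hurt as to help.
        variant = min(1.0, control * rng.uniform(0.8, 1.2))
        conv, n = [0, 0], [0, 0]
        winner = control
        while seen < total_people and sum(n) < max_people:
            arm = sum(n) % 2  # alternate visitors: 0 = control, 1 = variant
            converted = int(rng.random() < (variant if arm else control))
            conv[arm] += converted
            n[arm] += 1
            conversions += converted
            seen += 1
            # Peek every `check_every` visitors (keep it even so both arms
            # have data) and stop as soon as the threshold is hit.
            if (sum(n) % check_every == 0
                    and confidence(conv[0], n[0], conv[1], n[1]) >= threshold):
                if conv[1] / n[1] > conv[0] / n[0]:
                    winner = variant
                break
        control = winner  # the next test starts from whichever version we kept
    return conversions


# Two hypothetical strategies: (significance threshold, people before moving on).
for name, threshold, cap in [("80%, no cap", 0.80, 100_000),
                             ("99%, 2,000 people", 0.99, 2_000)]:
    print(name, simulate(threshold, cap))
```

Each strategy is just a threshold plus a cap on how long a test may run, so changing those two numbers is enough to try other combinations. The totals depend heavily on the assumed variant model, so treat them as illustrative only.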
The Scientist:
1. Pick the minimal improvement
2. Determine your sample size
3. Determine degree of certainty (95%)
4. Start test but don’t check it early
5. If results aren’t significant, keep control
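For the "determine your sample size" step, one standard way to do it is the normal-approximation formula for comparing two proportions. The sketch below assumes 80% statistical power, which the slides do not specify, and the function name, the 5% baseline, and the 20% minimal improvement are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_variant(base_rate: float, min_lift: float,
                            confidence: float = 0.95, power: float = 0.80) -> int:
    """Visitors needed in EACH version to detect a relative lift of `min_lift`."""
    p1 = base_rate
    p2 = base_rate * (1 + min_lift)  # the minimal improvement we care about
    z_alpha = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # two-sided test
    z_beta = NormalDist().inv_cdf(power)  # ASSUMPTION: 80% power by default
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)


# Hypothetical inputs: 5% baseline conversion rate, 20% minimal improvement.
print(sample_size_per_variant(0.05, 0.20))
```

Smaller minimal improvements push the required sample size up fast, which is why this approach needs lots of data.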
Results for the Scientist:
The Reckless Marketer:
1. Waits until 80% significance
2. Calls a winner as soon as 80% gets hit
Results for the Reckless Marketer:
The Impatient Marketer:
1. Waits for 95% significance
2. Moves on to the next test after 500 people
Results for the Impatient Marketer:
The Realist:
1. Waits for 99% significance
2. Moves on to the next test after 2,000 people
Results for the Realist:
The Persistent Realist:
1. Waits for 99% significance
2. Moves on to the next test after 20,000 people
Results for the Persistent Realist:
The Blitz Realist:
1. Waits for 99% significance
2. Moves on to the next test after 200 people
Results for the Blitz Realist:
Let’s compare them using the area under the curve.
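As a rough illustration of why area under the curve is a useful score, the sketch below sums a conversion-rate curve with the trapezoid rule. The two curves are invented: both end at the same rate, but one strategy finds its winner sooner. The function name and all numbers are assumptions for illustration, not output from the simulation.

```python
def area_under_curve(rates):
    """Trapezoidal area under a curve sampled at evenly spaced points."""
    return sum((a + b) / 2 for a, b in zip(rates, rates[1:]))


# Hypothetical conversion rates at 10 checkpoints. Both strategies end at 6%,
# but the first one ships its winner much earlier.
early_winner = [0.05] * 3 + [0.06] * 7
late_winner = [0.05] * 8 + [0.06] * 2

print("early:", round(area_under_curve(early_winner), 3))
print("late: ", round(area_under_curve(late_winner), 3))
```

The earlier a real improvement ships, the longer you collect the extra conversions, so the area rewards both being right and being fast.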
Don’t make decisions at less than 95% significance.
You’ll waste all the time you spend testing
We have 3 viable strategies for making this work:
1. Be a scientist at 95%
2. Only make changes at 99%
3. Sloppy 95% but make it up in volume
Be a scientist when you have lots of data and resources:
1. Pick the minimal improvement
2. Determine your sample size
3. Determine degree of certainty (95%)
4. Start test but don’t check it early
5. If results aren’t significant, keep control
If you don’t have the data or resources to be a scientist, go fast at 99%.
And if you still want to play at 95% without being a scientist, never stop testing.
How We A/B Test
First, get volume to 4000+ people/month.
Only make changes at 99% significance.
Let the test run at least 1 week before checking results.
If not at 99% after two weeks, launch the next test.
If the next test isn’t ready, let the current test keep running while you build it.
The KISSmetrics A/B Testing Strategy:
1. Get to 4,000 people/month for test
2. Only change the control if you reach 99%
3. Check results after 1 week
4. Launch the next test at 2 weeks
5. Let old tests run if you’re still building
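The five rules above can be written down as a small decision function. This is one reading of the slides, not KISSmetrics' actual tooling: the function name, the parameters, and the action labels are hypothetical, and it assumes "reach 99%" means the variant is ahead of the control at 99% significance.

```python
def next_action(monthly_people: int, days_running: int, significance: float,
                variant_ahead: bool, next_test_ready: bool) -> str:
    """Suggest what to do with a running test under the rules listed above."""
    if monthly_people < 4000:
        return "get more volume"      # rule 1: 4,000 people/month first
    if days_running < 7:
        return "wait"                 # rule 3: don't check before 1 week
    if significance >= 0.99 and variant_ahead:
        return "change control"       # rule 2: only change at 99%
    if days_running < 14:
        return "keep running"         # not at 99% yet, give it up to 2 weeks
    if next_test_ready:
        return "launch next test"     # rule 4: move on at 2 weeks
    return "keep running"             # rule 5: let it run while you build


# Hypothetical check-in: day 10, 96% significance, next test still being built.
print(next_action(5000, 10, 0.96, True, False))
```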
This strategy isn’t perfect. It’s a balance between good data and speed.


    Presentation Transcript

    • Keep Your Gains from A/B Tests Without Killing Them Later, by Lars Lofgren and Will Kurt (May 2014)
    • @larslofgren Hit me up
    • We’ll cover… #KISSwebinar
      1. The limits of A/B tests
      2. The standard solutions
      3. Simulations! Woohoo!
      4. The 3 strategies of A/B testing that work
      5. How we A/B test at KISSmetrics
    • A/B Strategy Scores
      Strategy             Conditions               Score
      Scientist            Stats like a pro         67,759
      Reckless Marketer    80%                      57,649
      Impatient Marketer   95% and 500 people       60,532
      Realist              99% and 2,000 people     67,896
      Persistent Realist   99% and 20,000 people    68,346
      Blitz Realist        99% and 200 people       62,836
      No Testing           Testing? NOPE!           50,000
      Each score is the area under the curve from the simulation. The higher the score, the more conversions you received.
    • [Bar chart: A/B Strategy Scores, showing the same scores ranked from Persistent Realist (highest) down to No Testing (lowest)]
    • Q&A Time! Lars Lofgren @larslofgren llofgren@kissmetrics.com