How to Keep Your Gains from A/B Tests Without Accidentally Killing Them Later

  • 3,986 views
Published

Limits of A/B Tests
A/B tests don’t give you perfect decisions.
No matter what you do, you’re never 100% certain
If we’re not careful, winners aren’t really winners
Your conversions go up… and then they come back down
The Standard Solution
Run your test until you hit 95% statistical significance.
Go to getdatadriven.com if you need a significance calculator.
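For readers who want to see roughly what such a calculator does, here is a minimal sketch using a pooled two-proportion z-test. The deck does not say which method getdatadriven.com uses, so the test choice, the one-sided reading of "significance", and the function name `significance` are all assumptions made for illustration.

```python
from math import sqrt
from statistics import NormalDist


def significance(control_visitors, control_conversions,
                 variant_visitors, variant_conversions):
    """Confidence (0 to 1) that the variant converts better than the control.

    Assumption: a one-sided, pooled two-proportion z-test. Real calculators
    may use a different test or a two-sided reading.
    """
    p_control = control_conversions / control_visitors
    p_variant = variant_conversions / variant_visitors
    pooled = (control_conversions + variant_conversions) / (
        control_visitors + variant_visitors
    )
    std_err = sqrt(pooled * (1 - pooled)
                   * (1 / control_visitors + 1 / variant_visitors))
    if std_err == 0:
        return 0.5  # no conversions (or all conversions): no evidence either way
    z = (p_variant - p_control) / std_err
    return NormalDist().cdf(z)


if __name__ == "__main__":
    # Hypothetical numbers: 1,000 people per variation, 50 vs. 80 conversions.
    print(f"{significance(1000, 50, 1000, 80):.1%}")
```

Identical results on both sides give 0.5, meaning no evidence in either direction; values near 1.0 mean the variant is very likely the better performer.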
Martin Goodson’s PDF on poor testing methods: kiss.ly/bad-testing
This gives us the best data but not necessarily the best ROI.
So how far do we take this?
Simulation Time!
We modeled several A/B testing strategies. Using Monte Carlo simulations, we tested different strategies over 1 million observations (people).
Will Kurt gets full credit for all this. @willkurt
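To make the idea of simulating a testing strategy concrete, here is a minimal Monte Carlo sketch. It is not the model behind the webinar results, which the deck does not describe. The 5% starting conversion rate, the way each new variant's true rate is drawn, the 50/50 traffic split, and the names `confidence` and `simulate` are all assumptions for illustration.

```python
import random
from math import sqrt
from statistics import NormalDist

_NORMAL = NormalDist()


def confidence(n_c, x_c, n_v, x_v):
    """One-sided confidence that the variant beats control (pooled z-test).

    The normal approximation is crude at very small sample sizes.
    """
    if n_c == 0 or n_v == 0:
        return 0.5
    pooled = (x_c + x_v) / (n_c + n_v)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_c + 1 / n_v))
    if std_err == 0:
        return 0.5
    return _NORMAL.cdf((x_v / n_v - x_c / n_c) / std_err)


def simulate(threshold, max_people, total_people=100_000,
             start_rate=0.05, seed=0):
    """Total conversions from running back-to-back tests on total_people.

    A variant replaces the control as soon as confidence >= threshold;
    otherwise the test is abandoned after max_people and a new one starts.
    """
    rng = random.Random(seed)
    control_rate = start_rate
    conversions = 0
    seen_total = 0
    while seen_total < total_people:
        # Assumption: a new variant's true rate is the control rate scaled
        # by a random factor between 0.8 and 1.2.
        variant_rate = min(1.0, control_rate * rng.uniform(0.8, 1.2))
        n_c = x_c = n_v = x_v = 0
        budget = min(max_people, total_people - seen_total)
        for i in range(budget):
            seen_total += 1
            if i % 2 == 0:  # alternate people between control and variant
                hit = int(rng.random() < control_rate)
                n_c += 1
                x_c += hit
            else:
                hit = int(rng.random() < variant_rate)
                n_v += 1
                x_v += hit
            conversions += hit
            if confidence(n_c, x_c, n_v, x_v) >= threshold:
                control_rate = variant_rate  # declare a winner, start over
                break
    return conversions


if __name__ == "__main__":
    for name, threshold, cap in [("99% / 2,000 people", 0.99, 2000),
                                 ("80% / 500 people", 0.80, 500)]:
        print(name, simulate(threshold, cap))
```

Because winners are called the moment the threshold is crossed, low thresholds let noise through, which is the failure mode the deck warns about. The printed totals depend entirely on the assumed distribution of variant quality and should not be compared with the published scores.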
The Scientist:
  1. Pick the minimal improvement
  2. Determine your sample size
  3. Determine degree of certainty (95%)
  4. Start test but don’t check it early
  5. If results aren’t significant, keep control
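Steps 1 to 3 of the Scientist's process can be sketched with a standard sample-size approximation for comparing two proportions. The deck does not give a formula, so this one is swapped in for illustration. The 80% power default is an assumption (the deck only mentions 95% certainty), as are the function and parameter names.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_variation(baseline_rate, minimal_relative_lift,
                              confidence=0.95, power=0.80):
    """People needed in EACH variation to detect the minimal improvement.

    Uses the common approximation
        n = (z_alpha + z_beta)^2 * (p1*q1 + p2*q2) / (p2 - p1)^2
    with a two-sided confidence level. power=0.80 is an assumed default.
    """
    p1 = baseline_rate
    p2 = baseline_rate * (1 + minimal_relative_lift)
    z_alpha = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)


if __name__ == "__main__":
    # Hypothetical inputs: 5% baseline, 10% minimal relative improvement.
    print(sample_size_per_variation(0.05, 0.10))
```

With these hypothetical inputs the answer is roughly 31,000 people per variation, which shows why this approach suits sites with lots of traffic.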
Results for the Scientist:
The Reckless Marketer:
  1. Waits until 80% significance
  2. Calls a winner as soon as 80% gets hit
Results for the Reckless Marketer:
The Impatient Marketer:
  1. Waits for 95% significance
  2. Moves on to the next test after 500 people
Results for the Impatient Marketer:
The Realist:
  1. Waits for 99% significance
  2. Moves on to the next test after 2,000 people
Results for the Realist:
The Persistent Realist:
  1. Waits for 99% significance
  2. Moves on to the next test after 20,000 people
Results for the Persistent Realist:
The Blitz Realist:
  1. Waits for 99% significance
  2. Moves on to the next test after 200 people
Results for the Blitz Realist:
Let’s compare them using the area under the curve.
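As a quick way to read the comparison, the sketch below takes the scores reported on slide 33 of the transcript and expresses each one as a lift over the No Testing score. The scores come from the deck. The helper name `lift_over_baseline` and the choice to present them as percentage lift are illustrative additions.

```python
# Area-under-the-curve scores as reported on slide 33 of the deck.
SCORES = {
    "Scientist": 67759,
    "Reckless Marketer": 57649,
    "Impatient Marketer": 60532,
    "Realist": 67896,
    "Persistent Realist": 68346,
    "Blitz Realist": 62836,
    "No Testing": 50000,
}


def lift_over_baseline(scores, baseline_name="No Testing"):
    """Return each strategy's score as a fractional lift over the baseline."""
    base = scores[baseline_name]
    return {name: (score - base) / base for name, score in scores.items()}


if __name__ == "__main__":
    lifts = lift_over_baseline(SCORES)
    for name, lift in sorted(lifts.items(), key=lambda kv: kv[1], reverse=True):
        print(f"{name:<20} {lift:+.1%}")
```

Seen this way, the three strongest strategies (Persistent Realist, Realist, Scientist) sit within about one percentage point of each other, while the 80% strategy gives up more than half of the achievable lift.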
Don’t make decisions at less than 95% significance.
You’ll waste all the time you spend testing
We have 3 viable strategies for making this work:
  1. Be a scientist at 95%
  2. Only make changes at 99%
  3. Sloppy 95% but make it up in volume
Be a scientist when you have lots of data and resources:
  1. Pick the minimal improvement
  2. Determine your sample size
  3. Determine degree of certainty (95%)
  4. Start test but don’t check it early
  5. If results aren’t significant, keep control
If you don’t have the data or resources to be a scientist, go fast at 99%.
And if you still want to play at 95% without being a scientist, never stop testing.
How We A/B Test
First, get volume to 4000+ people/month.
Only make changes at 99% significance.
Let the test run at least 1 week before checking results.
If not at 99% after two weeks, launch the next test.
If the next test isn’t ready, let it keep running while you build the next one.
The KISSmetrics A/B Testing Strategy:
  1. Get to 4,000 people/month for test
  2. Only change the control if you reach 99%
  3. Check results after 1 week
  4. Launch the next test at 2 weeks
  5. Let old tests run if you’re still building
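The five rules above can be written down as a small decision function. This is only one possible encoding of the slide text. The function name, parameter names, and returned action strings are hypothetical, and "1 week" and "2 weeks" are taken as 7 and 14 days.

```python
def next_action(monthly_people, days_running, significance, next_test_ready):
    """Suggest what to do with a running test, following the five rules.

    significance is a fraction between 0 and 1 (0.99 means 99%).
    """
    if monthly_people < 4000:
        return "build volume first"        # rule 1: 4,000+ people/month
    if days_running < 7:
        return "wait"                      # rule 3: no checking before 1 week
    if significance >= 0.99:
        return "change the control"        # rule 2: only act at 99%
    if days_running >= 14 and next_test_ready:
        return "launch the next test"      # rule 4: move on at 2 weeks
    return "keep running"                  # rule 5: let it run meanwhile


if __name__ == "__main__":
    print(next_action(5000, 10, 0.995, next_test_ready=False))
    print(next_action(5000, 15, 0.90, next_test_ready=True))
```

Ordering the checks this way means a test that looks significant on day 3 still returns "wait", which matches the rule about not checking results before the first week is over.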
This strategy isn’t perfect. It’s a balance between good data and speed.

Published in Marketing , Technology , Education

Transcript

  • 1. Keep Your Gains from A/B Tests Without Killing Them Later. Lars Lofgren and Will Kurt, May 2014
  • 2. @larslofgren Hit me up
  • 3. We’ll cover… 1) The limits of A/B tests; 2) The standard solutions; 3) Simulations! Woohoo!; 4) The 3 strategies of A/B testing that work; 5) How we A/B test at KISSmetrics. #KISSwebinar
  • 4. WATCH WEBINAR RECORDING NOW
  • 5. Limits of A/B Tests
  • 6. A/B tests don’t give you perfect decisions.
  • 7. No matter what you do, you’re never 100% certain
  • 8. If we’re not careful, winners aren’t really winners
  • 9. Your conversions go up… and then they come back down
  • 10. The Standard Solution
  • 11. Run your test until you hit 95% statistical significance.
  • 12. Go to getdatadriven.com if you need a significance calculator.
  • 13. Scientific A/B testing: 1) Pick the minimal improvement; 2) Determine your sample size; 3) Determine degree of certainty (95%); 4) Start test but don’t check it early; 5) If results aren’t significant, keep control
  • 14. Martin Goodson’s PDF on poor testing methods: kiss.ly/bad-testing
  • 15. This gives us the best data but not necessarily the best ROI.
  • 16. So how far do we take this?
  • 17. Simulation Time!
  • 18. We modeled several A/B testing strategies. Using Monte Carlo simulations, we tested different strategies over 1 million observations (people).
  • 19. Will Kurt gets full credit for all this. @willkurt
  • 20. The Scientist: 1) Pick the minimal improvement; 2) Determine your sample size; 3) Determine degree of certainty (95%); 4) Start test but don’t check it early; 5) If results aren’t significant, keep control
  • 21. Results for the Scientist:
  • 22. The Reckless Marketer: 1) Waits until 80% significance; 2) Calls a winner as soon as 80% gets hit
  • 23. Results for the Reckless Marketer:
  • 24. The Impatient Marketer: 1) Waits for 95% significance; 2) Moves on to the next test after 500 people
  • 25. Results for the Impatient Marketer:
  • 26. The Realist: 1) Waits for 99% significance; 2) Moves on to the next test after 2,000 people
  • 27. Results for the Realist:
  • 28. The Persistent Realist: 1) Waits for 99% significance; 2) Moves on to the next test after 20,000 people
  • 29. Results for the Persistent Realist:
  • 30. The Blitz Realist: 1) Waits for 99% significance; 2) Moves on to the next test after 200 people
  • 31. Results for the Blitz Realist:
  • 32. Let’s compare them using the area under the curve.
  • 33. A/B Strategy Scores
      Strategy            | Conditions             | Score
      Scientist           | Stats like a pro       | 67759
      Reckless Marketer   | 80%                    | 57649
      Impatient Marketer  | 95% and 500 people     | 60532
      Realist             | 99% and 2,000 people   | 67896
      Persistent Realist  | 99% and 20,000 people  | 68346
      Blitz Realist       | 99% and 200 people     | 62836
      No Testing          | Testing? NOPE!         | 50000
      Each score is the area under the curve from the simulation. The higher the score, the more conversions you received.
  • 34. A/B Strategy Scores (bar chart, axis from 0 to 70,000): Persistent Realist 68,346; Realist 67,896; Scientist 67,759; Blitz Realist 62,836; Impatient 60,532; Reckless 57,649; No Testing 50,000
  • 35. LOG IN WITH GOOGLE Start Your Free KISSmetrics Trial
  • 36. 3 Strategies
  • 37. Don’t make decisions at less than 95% significance.
  • 38. You’ll waste all the time you spend testing
  • 39. We have 3 viable strategies for making this work: 1) Be a scientist at 95%; 2) Only make changes at 99%; 3) Sloppy 95% but make it up in volume
  • 40. Be a scientist when you have lots of data and resources: 1) Pick the minimal improvement; 2) Determine your sample size; 3) Determine degree of certainty (95%); 4) Start test but don’t check it early; 5) If results aren’t significant, keep control
  • 41. If you don’t have the data or resources to be a scientist, go fast at 99%.
  • 42. And if you still want to play at 95% without being a scientist, never stop testing.
  • 43. How We A/B Test
  • 44. First, get volume to 4000+ people/month.
  • 45. Only make changes at 99% significance.
  • 46. Let the test run at least 1 week before checking results.
  • 47. If not at 99% after two weeks, launch the next test.
  • 48. If the next test isn’t ready, let it keep running while you build the next one.
  • 49. The KISSmetrics A/B Testing Strategy: 1) Get to 4,000 people/month for test; 2) Only change the control if you reach 99%; 3) Check results after 1 week; 4) Launch the next test at 2 weeks; 5) Let old tests run if you’re still building
  • 50. This strategy isn’t perfect. It’s a balance between good data and speed.
  • 51. Remember the 3 strategies: 1) Be a scientist at 95%; 2) Only make changes at 99%; 3) Sloppy 95% but make it up in volume
  • 52. Q&A Time! Lars Lofgren @larslofgren llofgren@kissmetrics.com