A/B tests don’t give you perfect decisions.

No matter what you do, you’re never 100% certain

If we’re not careful, winners aren’t really winners

Your conversions go up… and then they come back down

The Standard Solution

Run your test until you hit 95% statistical significance.

Go to getdatadriven.com if you need a significance calculator.
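For readers who want to see what a significance calculator does, here is a minimal sketch using a pooled two-proportion z-test. The deck does not say which test the calculator at getdatadriven.com uses, so the choice of test, the function name `significance`, and its parameters are all assumptions made for illustration.

```python
from math import sqrt
from statistics import NormalDist


def significance(control_visitors, control_conversions,
                 variant_visitors, variant_conversions):
    """Two-sided confidence (1 - p-value) that the two conversion rates differ.

    Assumption: a pooled two-proportion z-test, which is one common choice.
    """
    p_control = control_conversions / control_visitors
    p_variant = variant_conversions / variant_visitors
    pooled = (control_conversions + variant_conversions) / (
        control_visitors + variant_visitors
    )
    std_err = sqrt(pooled * (1 - pooled)
                   * (1 / control_visitors + 1 / variant_visitors))
    if std_err == 0:
        return 0.0  # no conversions (or all conversions): nothing to compare
    z = (p_variant - p_control) / std_err
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return 1 - p_value


if __name__ == "__main__":
    # Hypothetical traffic numbers, not taken from the deck.
    print(f"{significance(1000, 50, 1000, 80):.3f}")
```

A result of 0.95 or higher corresponds to the 95% bar described above.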

Martin Goodson’s PDF on poor testing methods: kiss.ly/bad-testing

This gives us the best data but not necessarily the best ROI.

So how far do we take this?

Simulation Time!

We modeled several A/B testing strategies. Using Monte Carlo simulations, we tested different strategies over 1 million observations (people).
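The deck does not publish the simulation code, so the following is only a rough sketch of how a Monte Carlo run of this kind could look. Everything specific here is an assumption: the 5% starting conversion rate, the way each new variant's true rate is drawn, the check interval, and the names `confidence` and `simulate`. It models strategies of the form "wait for X% significance, move on after N people".

```python
import random
from math import sqrt
from statistics import NormalDist


def confidence(n_a, conv_a, n_b, conv_b):
    """Two-sided confidence (1 - p-value) that the two conversion rates differ."""
    if n_a == 0 or n_b == 0:
        return 0.0
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if std_err == 0:
        return 0.0
    z = (conv_b / n_b - conv_a / n_a) / std_err
    return 1 - 2 * (1 - NormalDist().cdf(abs(z)))


def simulate(threshold, max_people, total_people=100_000,
             base_rate=0.05, check_every=100, seed=0):
    """Run back-to-back tests and return the total conversions collected."""
    if not 0 < threshold < 1:
        raise ValueError("threshold must be between 0 and 1")
    rng = random.Random(seed)
    control_rate = base_rate
    total_conversions = 0
    people_seen = 0
    while people_seen < total_people:
        # Assumption: each new variant's true rate is the control's rate
        # scaled by a random lift, so many variants are worse than control.
        lift = rng.gauss(0, 0.1)
        variant_rate = min(1.0, max(0.0, control_rate * (1 + lift)))
        n_a = conv_a = n_b = conv_b = 0
        in_test = 0
        while in_test < max_people and people_seen < total_people:
            # Alternate visitors between control (A) and variant (B).
            if in_test % 2 == 0:
                converted = rng.random() < control_rate
                n_a += 1
                conv_a += converted
            else:
                converted = rng.random() < variant_rate
                n_b += 1
                conv_b += converted
            total_conversions += converted
            in_test += 1
            people_seen += 1
            if (in_test % check_every == 0
                    and confidence(n_a, conv_a, n_b, conv_b) >= threshold):
                # Significant result: adopt the variant only if it is ahead.
                if conv_b / n_b > conv_a / n_a:
                    control_rate = variant_rate
                break
        # Cap reached without significance: keep control, start the next test.
    return total_conversions


if __name__ == "__main__":
    for name, threshold, cap in [("Impatient Marketer", 0.95, 500),
                                 ("Realist", 0.99, 2000)]:
        print(name, simulate(threshold, cap))
```

The totals this prints depend entirely on the assumed inputs and are not expected to reproduce the scores in the deck.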

Will Kurt gets full credit for all this. @willkurt

The Scientist:

1. Pick the minimal improvement
2. Determine your sample size
3. Determine degree of certainty (95%)
4. Start test but don’t check it early
5. If results aren’t significant, keep control
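Steps 1 to 3 feed a sample-size calculation. The sketch below uses the standard normal-approximation formula for comparing two proportions. The 80% statistical power default, the function name `sample_size_per_variant`, and the example rates are assumptions; the deck only specifies the 95% certainty level.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_variant(baseline_rate, minimal_lift,
                            confidence=0.95, power=0.80):
    """People needed in EACH variant to detect a relative lift.

    baseline_rate: current conversion rate, e.g. 0.05
    minimal_lift:  smallest relative improvement worth detecting, e.g. 0.20
    power:         assumed here; not stated in the deck
    """
    p1 = baseline_rate
    p2 = baseline_rate * (1 + minimal_lift)
    z_alpha = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # two-sided
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)


if __name__ == "__main__":
    # Hypothetical: 5% baseline, and we care about a 20% relative lift.
    print(sample_size_per_variant(0.05, 0.20))
```

With these hypothetical inputs the answer is on the order of 8,000 people per variant, which shows why the Scientist approach needs a lot of traffic.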

Results for the Scientist:

The Reckless Marketer:

1. Waits until 80% significance
2. Calls a winner as soon as 80% gets hit

Results for the Reckless Marketer:

The Impatient Marketer:

1. Waits for 95% significance
2. Moves on to the next test after 500 people

Results for the Impatient Marketer:

The Realist:

1. Waits for 99% significance
2. Moves on to the next test after 2,000 people

Results for the Realist:

The Persistent Realist:

1. Waits for 99% significance
2. Moves on to the next test after 20,000 people

Results for the Persistent Realist:

The Blitz Realist:

1. Waits for 99% significance
2. Moves on to the next test after 200 people

Results for the Blitz Realist:

Let’s compare them using the area under the curve.
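To make the comparison concrete, the short sketch below takes the area-under-the-curve scores listed on the deck's "A/B Strategy Scores" slide and expresses each one as a relative gain over the No Testing baseline. The scores are copied from that slide; the helper name `lift_over_baseline` is made up for this example.

```python
# Scores as listed on the "A/B Strategy Scores" slide.
SCORES = {
    "Scientist": 67759,
    "Reckless Marketer": 57649,
    "Impatient Marketer": 60532,
    "Realist": 67896,
    "Persistent Realist": 68346,
    "Blitz Realist": 62836,
    "No Testing": 50000,
}


def lift_over_baseline(scores, baseline_name="No Testing"):
    """Return each strategy's relative gain over the baseline score."""
    baseline = scores[baseline_name]
    return {name: (score - baseline) / baseline
            for name, score in scores.items()}


if __name__ == "__main__":
    lifts = lift_over_baseline(SCORES)
    for name in sorted(lifts, key=lifts.get, reverse=True):
        print(f"{name:<20} {SCORES[name]:>6} {lifts[name]:+.1%}")
```

Read this way, the three strongest strategies sit within about one percentage point of each other, while the 80% strategy captures less than half of their gain.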

Don’t make decisions at less than 95% significance.

You’ll waste all the time you spend testing

We have 3 viable strategies for making this work:

1. Be a scientist at 95%
2. Only make changes at 99%
3. Sloppy 95% but make it up in volume

Be a scientist when you have lots of data and resources:

1. Pick the minimal improvement
2. Determine your sample size
3. Determine degree of certainty (95%)
4. Start test but don’t check it early
5. If results aren’t significant, keep control

If you don’t have the data or resources to be a scientist, go fast at 99%.

And if you still want to play at 95% without being a scientist, never stop testing.

How We A/B Test

First, get volume to 4,000+ people/month.

Only make changes at 99% significance.

Let the test run at least 1 week before checking results.

If not at 99% after two weeks, launch the next test.

If the next test isn’t ready, let it keep running while you build the next one.

The KISSmetrics A/B Testing Strategy:

1. Get to 4,000 people/month for test
2. Only change the control if you reach 99%
3. Check results after 1 week
4. Launch the next test at 2 weeks
5. Let old tests run if you’re still building
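The five rules above can be written down as a small decision function. This is only one reading of the slide: the function name `next_action`, its parameters, and the returned action strings are hypothetical, and `significance` is assumed to mean confidence that the variant beats the control.

```python
def next_action(monthly_people, days_running, significance, next_test_ready):
    """Suggest what to do with a running test under the rules listed above."""
    if monthly_people < 4000:
        return "grow traffic first"    # rule 1: 4,000 people/month for the test
    if days_running < 7:
        return "wait"                  # rule 3: check results after 1 week
    if significance >= 0.99:
        return "change control"        # rule 2: only change at 99%
    if days_running < 14:
        return "keep running"
    if next_test_ready:
        return "launch next test"      # rule 4: move on at 2 weeks
    return "keep running"              # rule 5: let it run while you build


if __name__ == "__main__":
    # Hypothetical status of a test on day 15.
    print(next_action(monthly_people=5000, days_running=15,
                      significance=0.97, next_test_ready=False))
```

Writing the rules this way makes the ordering explicit: the one-week wait comes before any look at significance, which guards against calling a winner early.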

This strategy isn’t perfect. It’s a balance between good data and speed.


- 1. Keep Your Gains from A/B Tests Without Killing Them Later. Lars Lofgren and Will Kurt, May 2014
- 2. Hit me up: @larslofgren
- 3. We’ll cover: (1) The limits of A/B tests, (2) The standard solutions, (3) Simulations! Woohoo!, (4) The 3 strategies of A/B testing that work, (5) How we A/B test at KISSmetrics. #KISSwebinar
- 5. Limits of A/B Tests
- 6. A/B tests don’t give you perfect decisions.
- 7. No matter what you do, you’re never 100% certain
- 8. If we’re not careful, winners aren’t really winners
- 9. Your conversions go up… and then they come back down
- 10. The Standard Solution
- 11. Run your test until you hit 95% statistical significance.
- 12. Go to getdatadriven.com if you need a significance calculator.
- 13. Scientific A/B testing: (1) Pick the minimal improvement, (2) Determine your sample size, (3) Determine degree of certainty (95%), (4) Start test but don’t check it early, (5) If results aren’t significant, keep control
- 14. Martin Goodson’s PDF on poor testing methods: kiss.ly/bad-testing
- 15. This gives us the best data but not necessarily the best ROI.
- 16. So how far do we take this?
- 17. Simulation Time!
- 18. We modeled several A/B testing strategies. Using Monte Carlo simulations, we tested different strategies over 1 million observations (people).
- 19. Will Kurt gets full credit for all this. @willkurt
- 20. The Scientist: (1) Pick the minimal improvement, (2) Determine your sample size, (3) Determine degree of certainty (95%), (4) Start test but don’t check it early, (5) If results aren’t significant, keep control
- 21. Results for the Scientist:
- 22. The Reckless Marketer: (1) Waits until 80% significance, (2) Calls a winner as soon as 80% gets hit
- 23. Results for the Reckless Marketer:
- 24. The Impatient Marketer: (1) Waits for 95% significance, (2) Moves on to the next test after 500 people
- 25. Results for the Impatient Marketer:
- 26. The Realist: (1) Waits for 99% significance, (2) Moves on to the next test after 2,000 people
- 27. Results for the Realist:
- 28. The Persistent Realist: (1) Waits for 99% significance, (2) Moves on to the next test after 20,000 people
- 29. Results for the Persistent Realist:
- 30. The Blitz Realist: (1) Waits for 99% significance, (2) Moves on to the next test after 200 people
- 31. Results for the Blitz Realist:
- 32. Let’s compare them using the area under the curve.
- 33. A/B Strategy Scores

  | Strategy | Conditions | Score |
  | --- | --- | --- |
  | Scientist | Stats like a pro | 67759 |
  | Reckless Marketer | 80% | 57649 |
  | Impatient Marketer | 95% and 500 people | 60532 |
  | Realist | 99% and 2,000 people | 67896 |
  | Persistent Realist | 99% and 20,000 people | 68346 |
  | Blitz Realist | 99% and 200 people | 62836 |
  | No Testing | Testing? NOPE! | 50000 |

  Each score is the area under the curve from the simulation. The higher the score, the more conversions you received.
- 34. A/B Strategy Scores (bar chart, axis from 0 to 70,000): Persistent Realist 68,346; Realist 67,896; Scientist 67,759; Blitz Realist 62,836; Impatient 60,532; Reckless 57,649; No Testing 50,000
- 36. 3 Strategies
- 37. Don’t make decisions at less than 95% significance.
- 38. You’ll waste all the time you spend testing
- 39. We have 3 viable strategies for making this work: (1) Be a scientist at 95%, (2) Only make changes at 99%, (3) Sloppy 95% but make it up in volume
- 40. Be a scientist when you have lots of data and resources: (1) Pick the minimal improvement, (2) Determine your sample size, (3) Determine degree of certainty (95%), (4) Start test but don’t check it early, (5) If results aren’t significant, keep control
- 41. If you don’t have the data or resources to be a scientist, go fast at 99%.
- 42. And if you still want to play at 95% without being a scientist, never stop testing.
- 43. How We A/B Test
- 44. First, get volume to 4000+ people/month.
- 45. Only make changes at 99% significance.
- 46. Let the test run at least 1 week before checking results.
- 47. If not at 99% after two weeks, launch the next test.
- 48. If the next test isn’t ready, let it keep running while you build the next one.
- 49. The KISSmetrics A/B Testing Strategy: (1) Get to 4,000 people/month for test, (2) Only change the control if you reach 99%, (3) Check results after 1 week, (4) Launch the next test at 2 weeks, (5) Let old tests run if you’re still building
- 50. This strategy isn’t perfect. It’s a balance between good data and speed.
- 51. Remember the 3 strategies: (1) Be a scientist at 95%, (2) Only make changes at 99%, (3) Sloppy 95% but make it up in volume
- 52. Q&A Time! Lars Lofgren @larslofgren llofgren@kissmetrics.com
