Limits of A/B Tests

A/B tests don’t give you perfect decisions.

No matter what you do, you’re never 100% certain

If we’re not careful, winners aren’t really winners

Your conversions go up… and then they come back down

The Standard Solution

Run your test until you hit 95% statistical significance.

Go to getdatadriven.com if you need a significance calculator.
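
For readers who want to see what a significance calculator does, here is a minimal sketch. It assumes a pooled two-proportion z-test with a two-sided p-value, which is one common choice and not necessarily what the calculator at getdatadriven.com uses. The function name `significance` and its parameters are made up for illustration.

```python
from statistics import NormalDist


def significance(control_visitors, control_conversions,
                 variant_visitors, variant_conversions):
    """Return 1 minus the two-sided p-value of a pooled two-proportion z-test.

    A return value of 0.95 or more corresponds to "95% significance".
    This is an illustrative sketch, not any particular calculator's method.
    """
    control_rate = control_conversions / control_visitors
    variant_rate = variant_conversions / variant_visitors
    # Pool both groups to estimate the shared rate under "no difference".
    pooled = (control_conversions + variant_conversions) / (
        control_visitors + variant_visitors)
    std_err = (pooled * (1 - pooled)
               * (1 / control_visitors + 1 / variant_visitors)) ** 0.5
    if std_err == 0:
        return 0.0  # no conversions (or all conversions): nothing to compare
    z = (variant_rate - control_rate) / std_err
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return 1 - p_value


if __name__ == "__main__":
    # Hypothetical numbers: 1,000 visitors per arm, 50 vs. 80 conversions.
    print(f"{significance(1000, 50, 1000, 80):.3f}")
```

Identical results in both arms give a value of 0, and the value approaches 1 as the gap between the two conversion rates grows relative to the noise.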

Martin Goodson’s PDF on poor testing methods: kiss.ly/bad-testing

This gives us the best data but not necessarily the best ROI.

So how far do we take this?

Simulation Time!

We modeled several A/B testing strategies with Monte Carlo simulations, running each strategy over 1 million observations (people).

Will Kurt gets full credit for all this. @willkurt
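
The original simulation code is not part of this deck, so the following is only a rough sketch of how such a Monte Carlo run might be set up. Everything specific in it is an assumption: the 5% base conversion rate, the way each new variant's true rate is drawn (`effect_sd`), checking results every 100 people, and the names `confidence` and `simulate`.

```python
import random
from statistics import NormalDist


def confidence(n_a, conv_a, n_b, conv_b):
    """1 minus the two-sided p-value of a pooled two-proportion z-test."""
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = (pooled * (1 - pooled) * (1 / n_a + 1 / n_b)) ** 0.5
    if std_err == 0:
        return 0.0
    z = (conv_b / n_b - conv_a / n_a) / std_err
    return 1 - 2 * (1 - NormalDist().cdf(abs(z)))


def simulate(threshold, max_people, total_people=100_000, base_rate=0.05,
             effect_sd=0.1, check_every=100, seed=0):
    """Run back-to-back A/B tests and return the total number of conversions."""
    rng = random.Random(seed)
    control_rate = base_rate
    conversions = 0
    people_left = total_people
    while people_left > 0:
        # Assumption: a new variant's true rate is the control rate scaled by
        # a random relative effect, so losers are as likely as winners.
        effect = rng.gauss(0, effect_sd)
        variant_rate = min(1.0, max(0.0, control_rate * (1 + effect)))
        n_a = n_b = conv_a = conv_b = 0
        while people_left > 0 and n_a + n_b < max_people:
            batch = min(check_every, max_people - (n_a + n_b), people_left)
            for i in range(batch):
                if i % 2 == 0:  # alternate people between control and variant
                    n_a += 1
                    conv_a += rng.random() < control_rate
                else:
                    n_b += 1
                    conv_b += rng.random() < variant_rate
            people_left -= batch
            if n_a and n_b and confidence(n_a, conv_a, n_b, conv_b) >= threshold:
                if conv_b / n_b > conv_a / n_a:
                    control_rate = variant_rate  # adopt the apparent winner
                break  # either way, move on to the next test
        conversions += conv_a + conv_b
    return conversions


if __name__ == "__main__":
    for name, threshold, max_people in [("Impatient Marketer", 0.95, 500),
                                        ("Realist", 0.99, 2000),
                                        ("Blitz Realist", 0.99, 200)]:
        print(name, simulate(threshold, max_people))
```

Setting `check_every` equal to `max_people` approximates a strategy that never checks early. The totals this sketch prints depend entirely on the assumed parameters and will not reproduce the scores reported in the deck.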

The Scientist:

1. Pick the minimal improvement
2. Determine your sample size
3. Determine degree of certainty (95%)
4. Start test but don’t check it early
5. If results aren’t significant, keep control
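
Steps 1 to 3 of the Scientist's method can be sketched with the standard sample-size approximation for comparing two proportions. The deck does not say which formula or statistical power was used, so the 80% power default, the function name, and the example rates below are assumptions.

```python
import math
from statistics import NormalDist


def sample_size_per_variant(base_rate, min_relative_lift,
                            confidence=0.95, power=0.80):
    """Approximate people needed in EACH arm of a two-sided A/B test.

    base_rate:         current conversion rate, e.g. 0.05
    min_relative_lift: smallest improvement worth detecting, e.g. 0.20 for +20%
    confidence:        degree of certainty (0.95 in the Scientist's method)
    power:             chance of detecting a real lift (assumed, not from the deck)
    """
    p1 = base_rate
    p2 = base_rate * (1 + min_relative_lift)
    z_alpha = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2
    numerator = (z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return math.ceil(numerator / (p2 - p1) ** 2)


if __name__ == "__main__":
    # Hypothetical inputs: 5% base rate, looking for at least a 20% relative lift.
    print(sample_size_per_variant(0.05, 0.20))
```

Halving the minimal improvement roughly quadruples the required sample, which is why this approach suits teams with lots of data.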

Results for the Scientist:

The Reckless Marketer:

1. Waits until 80% significance
2. Calls a winner as soon as 80% gets hit
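
To see why calling a winner the moment a threshold is hit is risky, the sketch below runs A/A tests (both arms share the same true rate) and counts how often the second arm is declared a winner anyway. The 5% rate, the 2,000 people per test, the peek interval, and the function names are all assumptions made for illustration.

```python
import random
from statistics import NormalDist


def confidence(n_a, conv_a, n_b, conv_b):
    """1 minus the two-sided p-value of a pooled two-proportion z-test."""
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = (pooled * (1 - pooled) * (1 / n_a + 1 / n_b)) ** 0.5
    if std_err == 0:
        return 0.0
    z = (conv_b / n_b - conv_a / n_a) / std_err
    return 1 - 2 * (1 - NormalDist().cdf(abs(z)))


def false_winner_rate(threshold, peek_every, people_per_test=2000,
                      rate=0.05, trials=200, seed=0):
    """Fraction of A/A tests in which arm B is wrongly called a winner.

    peek_every must be at least 2 so both arms have data at every look.
    Use peek_every == people_per_test for a single look at the end.
    """
    rng = random.Random(seed)
    false_winners = 0
    for _ in range(trials):
        n = [0, 0]
        conv = [0, 0]
        for person in range(people_per_test):
            arm = person % 2  # alternate people between the two arms
            n[arm] += 1
            conv[arm] += rng.random() < rate  # same true rate in both arms
            if (person + 1) % peek_every != 0:
                continue
            if confidence(n[0], conv[0], n[1], conv[1]) >= threshold:
                if conv[1] / n[1] > conv[0] / n[0]:
                    false_winners += 1
                break  # the test is stopped as soon as the threshold is hit
    return false_winners / trials


if __name__ == "__main__":
    print("peeking every 50 people:", false_winner_rate(0.80, 50))
    print("single look at the end: ", false_winner_rate(0.80, 2000))
```

Because there is no real difference between the arms, every "winner" here is noise, and checking repeatedly gives noise many more chances to cross the threshold.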

Results for the Reckless Marketer:

The Impatient Marketer:

1. Waits for 95% significance
2. Moves on to the next test after 500 people

Results for the Impatient Marketer:

The Realist:

1. Waits for 99% significance
2. Moves on to the next test after 2,000 people

Results for the Realist:

The Persistent Realist:

1. Waits for 99% significance
2. Moves on to the next test after 20,000 people

Results for the Persistent Realist:

The Blitz Realist:

1. Waits for 99% significance
2. Moves on to the next test after 200 people

Results for the Blitz Realist:

Let’s compare them using the area under the curve.
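
The scores reported on slide 33 of the transcript can be put on a common footing by expressing each one as a lift over the "No Testing" baseline. The score values below are copied from that slide; the helper name `lift_over_baseline` is made up for this sketch.

```python
# Area-under-the-curve scores as reported on slide 33.
scores = {
    "Scientist": 67759,
    "Reckless Marketer": 57649,
    "Impatient Marketer": 60532,
    "Realist": 67896,
    "Persistent Realist": 68346,
    "Blitz Realist": 62836,
    "No Testing": 50000,
}


def lift_over_baseline(score, baseline):
    """Relative gain of a strategy's score over the no-testing score."""
    return (score - baseline) / baseline


if __name__ == "__main__":
    baseline = scores["No Testing"]
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    for name, score in ranked:
        print(f"{name:<20}{score:>8,}{lift_over_baseline(score, baseline):>8.1%}")
```

Read this way, the three strategies at the top of the ranking sit within about one percentage point of each other, while the 80% strategy gives up more than half of the available gain.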

Don’t make decisions at less than 95% significance.

You’ll waste all the time you spend testing

We have 3 viable strategies for making this work:

1. Be a scientist at 95%
2. Only make changes at 99%
3. Sloppy 95% but make it up in volume

Be a scientist when you have lots of data and resources:

1. Pick the minimal improvement
2. Determine your sample size
3. Determine degree of certainty (95%)
4. Start test but don’t check it early
5. If results aren’t significant, keep control

If you don’t have the data or resources to be a scientist, go fast at 99%.

And if you still want to play at 95% without being a scientist, never stop testing.

How We A/B Test

First, get volume to 4000+ people/month.

Only make changes at 99% significance.

Let the test run at least 1 week before checking results.

If not at 99% after two weeks, launch the next test.

If the next test isn’t ready, let the current test keep running while you build it.

The KISSmetrics A/B Testing Strategy:

1. Get to 4,000 people/month for test
2. Only change the control if you reach 99%
3. Check results after 1 week
4. Launch the next test at 2 weeks
5. Let old tests run if you’re still building
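
The checklist above can be read as a small decision rule. The sketch below is one possible encoding, not KISSmetrics' actual tooling; the function name, parameters, and returned strings are hypothetical, and it assumes the 4,000 people/month volume requirement is already met.

```python
def next_action(days_running, confidence, variant_is_ahead, next_test_ready):
    """Suggest what to do with a running test under the rules listed above.

    days_running:     full days since the test started
    confidence:       current significance as a fraction, e.g. 0.99
    variant_is_ahead: True if the variant converts better than the control
    next_test_ready:  True if the next test is built and ready to launch
    """
    if days_running < 7:
        return "wait"  # do not check results before one week
    if confidence >= 0.99:
        # Only change the control when the variant wins at 99%.
        return "ship variant" if variant_is_ahead else "keep control"
    if days_running < 14:
        return "keep running"
    # Two weeks without reaching 99%: move on if the next test is ready.
    return "launch next test" if next_test_ready else "keep running"


if __name__ == "__main__":
    print(next_action(3, 0.999, True, True))
    print(next_action(10, 0.995, True, False))
    print(next_action(15, 0.90, True, False))
```

Note that the rule ignores even a 99.9% reading during the first week, which reflects the advice to let the test run before looking at it.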

This strategy isn’t perfect. It’s a balance between good data and speed.


- 1. Keep Your Gains from A/B Tests Without Killing Them Later. Lars Lofgren and Will Kurt, May 2014
- 2. @larslofgren Hit me up
- 3. We’ll cover… (1) The limits of A/B tests; (2) The standard solutions; (3) Simulations! Woohoo!; (4) The 3 strategies of A/B testing that work; (5) How we A/B test at KISSmetrics. #KISSwebinar
- 5. Limits of A/B Tests
- 6. A/B tests don’t give you perfect decisions.
- 7. No matter what you do, you’re never 100% certain
- 8. If we’re not careful, winners aren’t really winners
- 9. Your conversions go up… and then they come back down
- 10. The Standard Solution
- 11. Run your test until you hit 95% statistical significance.
- 12. Go to getdatadriven.com if you need a significance calculator.
- 13. Scientific A/B testing: (1) Pick the minimal improvement; (2) Determine your sample size; (3) Determine degree of certainty (95%); (4) Start test but don’t check it early; (5) If results aren’t significant, keep control
- 14. Martin Goodson’s PDF on poor testing methods: kiss.ly/bad-testing
- 15. This gives us the best data but not necessarily the best ROI.
- 16. So how far do we take this?
- 17. Simulation Time!
- 18. We modeled several A/B testing strategies. Using Monte Carlo simulations, we tested different strategies over 1 million observations (people).
- 19. Will Kurt gets full credit for all this. @willkurt
- 20. The Scientist: (1) Pick the minimal improvement; (2) Determine your sample size; (3) Determine degree of certainty (95%); (4) Start test but don’t check it early; (5) If results aren’t significant, keep control
- 21. Results for the Scientist:
- 22. The Reckless Marketer: (1) Waits until 80% significance; (2) Calls a winner as soon as 80% gets hit
- 23. Results for the Reckless Marketer:
- 24. The Impatient Marketer: (1) Waits for 95% significance; (2) Moves on to the next test after 500 people
- 25. Results for the Impatient Marketer:
- 26. The Realist: (1) Waits for 99% significance; (2) Moves on to the next test after 2,000 people
- 27. Results for the Realist:
- 28. The Persistent Realist: (1) Waits for 99% significance; (2) Moves on to the next test after 20,000 people
- 29. Results for the Persistent Realist:
- 30. The Blitz Realist: (1) Waits for 99% significance; (2) Moves on to the next test after 200 people
- 31. Results for the Blitz Realist:
- 32. Let’s compare them using the area under the curve.
- 33. A/B Strategy Scores. Each score is the area under the curve from the simulation. The higher the score, the more conversions you received.

  | Strategy | Conditions | Score |
  | --- | --- | --- |
  | Scientist | Stats like a pro | 67759 |
  | Reckless Marketer | 80% | 57649 |
  | Impatient Marketer | 95% and 500 people | 60532 |
  | Realist | 99% and 2,000 people | 67896 |
  | Persistent Realist | 99% and 20,000 people | 68346 |
  | Blitz Realist | 99% and 200 people | 62836 |
  | No Testing | Testing? NOPE! | 50000 |

- 34. A/B Strategy Scores (bar chart, axis from 0 to 70,000): Persistent Realist 68,346; Realist 67,896; Scientist 67,759; Blitz Realist 62,836; Impatient 60,532; Reckless 57,649; No Testing 50,000
- 36. 3 Strategies
- 37. Don’t make decisions at less than 95% significance.
- 38. You’ll waste all the time you spend testing
- 39. We have 3 viable strategies for making this work: (1) Be a scientist at 95%; (2) Only make changes at 99%; (3) Sloppy 95% but make it up in volume
- 40. Be a scientist when you have lots of data and resources: (1) Pick the minimal improvement; (2) Determine your sample size; (3) Determine degree of certainty (95%); (4) Start test but don’t check it early; (5) If results aren’t significant, keep control
- 41. If you don’t have the data or resources to be a scientist, go fast at 99%.
- 42. And if you still want to play at 95% without being a scientist, never stop testing.
- 43. How We A/B Test
- 44. First, get volume to 4000+ people/month.
- 45. Only make changes at 99% significance.
- 46. Let the test run at least 1 week before checking results.
- 47. If not at 99% after two weeks, launch the next test.
- 48. If the next test isn’t ready, let it keep running while you build the next one.
- 49. The KISSmetrics A/B Testing Strategy: (1) Get to 4,000 people/month for test; (2) Only change the control if you reach 99%; (3) Check results after 1 week; (4) Launch the next test at 2 weeks; (5) Let old tests run if you’re still building
- 50. This strategy isn’t perfect. It’s a balance between good data and speed.
- 51. Remember the 3 strategies: (1) Be a scientist at 95%; (2) Only make changes at 99%; (3) Sloppy 95% but make it up in volume
- 52. Q&A Time! Lars Lofgren @larslofgren llofgren@kissmetrics.com
