Limits of A/B Tests

A/B tests don’t give you perfect decisions.

No matter what you do, you’re never 100% certain

If we’re not careful, winners aren’t really winners

Your conversions go up… and then they come back down

The Standard Solution

Run your test until you hit 95% statistical significance.

Go to getdatadriven.com if you need a significance calculator.
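For readers who want to see roughly what such a calculator does, here is a minimal sketch. It assumes a pooled two-proportion z-test and reports "significance" as 1 minus the two-sided p-value, which is the common marketing usage. The function name `significance` and the example numbers are illustrative, not taken from the slides or from any particular calculator.

```python
import math

def significance(conv_a, n_a, conv_b, n_b):
    """Return 1 - p (two-sided) for the difference between two conversion rates.

    Assumption: pooled two-proportion z-test; other calculators may differ.
    """
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 0.0  # no conversions (or all conversions): nothing to compare
    z = (rate_b - rate_a) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability
    return 1 - p_value

# Hypothetical test: 100/1000 conversions on control, 150/1000 on the variant.
print(f"{significance(100, 1000, 150, 1000):.1%}")
```

A result of 0.95 or higher corresponds to the "95% significance" rule above.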

Martin Goodson’s PDF on poor testing methods: kiss.ly/bad-testing

This gives us the best data but not necessarily the best ROI.

So how far do we take this?

Simulation Time!

We modeled several A/B testing strategies. Using Monte Carlo simulations, we tested different strategies over 1 million observations (people).
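The original simulation code is not part of this deck. As a rough illustration of the idea, here is a minimal Monte Carlo sketch for strategies of the form "wait for X% significance, move on after N people". Everything specific in it is an assumption: the 5% base rate, the way each new variant's true rate is drawn, the `min_per_arm` guard, and the names `confidence` and `simulate`.

```python
import math
import random

def confidence(conv_a, n_a, conv_b, n_b):
    """1 - p (two-sided) from a pooled two-proportion z-test."""
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 0.0
    z = abs(conv_b / n_b - conv_a / n_a) / se
    return 1 - math.erfc(z / math.sqrt(2))

def simulate(threshold, max_people, total_people=100_000,
             base_rate=0.05, min_per_arm=20, seed=0):
    """Run back-to-back tests until total_people have been observed."""
    rng = random.Random(seed)
    control = base_rate
    seen = conversions = tests = 0
    while seen < total_people:
        tests += 1
        # Assumption: a new idea's true rate is the control's rate times a
        # random factor, so ideas are as likely to hurt as to help.
        variant = min(max(control * rng.gauss(1.0, 0.1), 0.0), 1.0)
        conv, n = [0, 0], [0, 0]
        while sum(n) < max_people and seen < total_people:
            arm = sum(n) % 2  # alternate visitors between control and variant
            hit = rng.random() < (variant if arm else control)
            conv[arm] += hit
            n[arm] += 1
            seen += 1
            conversions += hit
            # Check after every control/variant pair once both arms have data.
            if arm == 1 and n[1] >= min_per_arm:
                if confidence(conv[0], n[0], conv[1], n[1]) >= threshold:
                    if conv[1] > conv[0]:
                        control = variant  # the variant becomes the new control
                    break  # winner called either way: start the next test
    return {"realized_rate": conversions / total_people,
            "final_control_rate": control,
            "tests_run": tests}

print(simulate(threshold=0.99, max_people=2000))
```

Under these assumptions, the strategies that state both a threshold and a cutoff map onto `simulate(0.95, 500)`, `simulate(0.99, 2000)`, `simulate(0.99, 20_000)`, and `simulate(0.99, 200)`. The Scientist uses a fixed sample size without early checks and would need a separate loop.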

Will Kurt gets full credit for all this. @willkurt

The Scientist:
1. Pick the minimal improvement
2. Determine your sample size
3. Determine degree of certainty (95%)
4. Start test but don’t check it early
5. If results aren’t significant, keep control
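Steps 1 to 3 can be sketched with the standard sample-size approximation for comparing two proportions. The 80% power default is an assumption (the slides only specify 95% certainty), and the function name and example inputs are illustrative.

```python
import math
from statistics import NormalDist

def sample_size_per_arm(base_rate, min_relative_lift,
                        confidence=0.95, power=0.80):
    """Approximate visitors needed in EACH arm of a two-sided test.

    Assumption: normal approximation; power=0.80 is a conventional default,
    not something the slides state.
    """
    p1 = base_rate
    p2 = base_rate * (1 + min_relative_lift)  # smallest improvement worth detecting
    z_alpha = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)

# Hypothetical inputs: 5% baseline conversion, 10% relative lift as the minimum.
print(sample_size_per_arm(0.05, 0.10))
```

Small minimal improvements on low baseline rates push the required sample into the tens of thousands per arm, which is why this approach needs lots of data.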

Results for the Scientist:

The Reckless Marketer:
1. Waits until 80% significance
2. Calls a winner as soon as 80% gets hit
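To see why calling a winner the moment a threshold is hit is risky, here is a small sketch that runs A/A tests (both arms share the same true rate, so every declared "winner" is false) and compares continuous checking with a single look at the end. The rate, sample sizes, trial count, and the 20-person warm-up before the first peek are all assumptions made for illustration.

```python
import math
import random

def confidence(conv_a, n_a, conv_b, n_b):
    """1 - p (two-sided) from a pooled two-proportion z-test."""
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 0.0
    z = abs(conv_b / n_b - conv_a / n_a) / se
    return 1 - math.erfc(z / math.sqrt(2))

def false_winner_rate(threshold, peek, trials=300, per_arm=500,
                      rate=0.10, seed=1):
    """Share of A/A tests in which a (necessarily false) winner gets called."""
    rng = random.Random(seed)
    false_winners = 0
    for _ in range(trials):
        conv_a = conv_b = 0
        for n in range(1, per_arm + 1):
            conv_a += rng.random() < rate
            conv_b += rng.random() < rate
            last_look = n == per_arm
            # Peeking checks after every pair of visitors (from 20 per arm on);
            # otherwise the result is checked only once, at the planned end.
            if (peek and n >= 20) or last_look:
                if confidence(conv_a, n, conv_b, n) >= threshold:
                    false_winners += 1
                    break
    return false_winners / trials

print("peeking at 80%:    ", false_winner_rate(0.80, peek=True))
print("single look at 80%:", false_winner_rate(0.80, peek=False))
```

The single-look rate should land near the nominal 20%, while the peeking rate comes out well above it.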

Results for the Reckless Marketer:

The Impatient Marketer:
1. Waits for 95% significance
2. Moves on to the next test after 500 people

Results for the Impatient Marketer:

The Realist:
1. Waits for 99% significance
2. Moves on to the next test after 2,000 people

Results for the Realist:

The Persistent Realist:
1. Waits for 99% significance
2. Moves on to the next test after 20,000 people

Results for the Persistent Realist:

The Blitz Realist:
1. Waits for 99% significance
2. Moves on to the next test after 200 people

Results for the Blitz Realist:

Let’s compare them using the area under the curve.
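The slides do not say how the area was computed. One simple way, assuming each strategy's result is a curve of conversion rate against people observed, is the trapezoid rule. The two curves below are made-up placeholders, not results from the simulations.

```python
def area_under_curve(xs, ys):
    """Trapezoid-rule area under the points (xs[i], ys[i]); xs must be increasing."""
    return sum((xs[i + 1] - xs[i]) * (ys[i] + ys[i + 1]) / 2
               for i in range(len(xs) - 1))

# Hypothetical curves: people observed vs. conversion rate of the current control.
people = [0, 250_000, 500_000, 750_000, 1_000_000]
strategy_a = [0.050, 0.052, 0.055, 0.057, 0.060]
strategy_b = [0.050, 0.049, 0.051, 0.050, 0.052]

print(area_under_curve(people, strategy_a))
print(area_under_curve(people, strategy_b))
```

A larger area means the strategy spent more of the million observations at a higher conversion rate, so it rewards both the size of the wins and how early they arrive.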

Don’t make decisions at less than 95% significance.

You’ll waste all the time you spend testing

We have 3 viable strategies for making this work:
1. Be a scientist at 95%
2. Only make changes at 99%
3. Sloppy 95% but make it up in volume

Be a scientist when you have lots of data and resources:
1. Pick the minimal improvement
2. Determine your sample size
3. Determine degree of certainty (95%)
4. Start test but don’t check it early
5. If results aren’t significant, keep control

If you don’t have the data or resources to be a scientist, go fast at 99%.

And if you still want to play at 95% without being a scientist, never stop testing.

How We A/B Test

First, get volume to 4000+ people/month.

Only make changes at 99% significance.

Let the test run at least 1 week before checking results.

If not at 99% after two weeks, launch the next test.

If the next test isn’t ready, let it keep running while you build the next one.

The KISSmetrics A/B Testing Strategy
1. Get to 4,000 people/month for test
2. Only change the control if you reach 99%
3. Check results after 1 week
4. Launch the next test at 2 weeks
5. Let old tests run if you’re still building
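The five rules can be read as a small decision procedure. The sketch below is one interpretation, not KISSmetrics code: the function name, argument names, and returned labels are hypothetical, and it assumes a test that reaches 99% in favor of the control simply ends with the control kept.

```python
def next_action(monthly_people, days_running, confidence,
                variant_is_better, next_test_ready):
    """Suggest what to do with a running test under the five rules above."""
    if monthly_people < 4000:
        return "grow traffic first"      # rule 1: 4,000 people/month
    if days_running < 7:
        return "wait"                    # rule 3: no checking before 1 week
    if confidence >= 0.99:               # rule 2: only change control at 99%
        return "adopt variant" if variant_is_better else "keep control"
    if days_running < 14:
        return "keep running"
    # rules 4 and 5: move on at 2 weeks, unless the next test is not built yet
    return "launch next test" if next_test_ready else "keep running"

print(next_action(5000, 10, 0.995, True, False))   # adopt variant
print(next_action(5000, 15, 0.90, True, False))    # keep running
```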

This strategy isn’t perfect. It’s a balance between good data and speed.
