2. What Is A/B Testing?
• Develop two versions of a page
• Randomly show different versions to users
• Track how users perform
• Evaluate (that's where statistics comes in)
• Use the better version
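The "randomly show different versions" step is often done by hashing a stable user id, so each user sees the same version on every visit. A minimal sketch (the experiment name and function are made-up examples, not from the deck):

```python
import hashlib

def assign_version(user_id: str, experiment: str = "homepage-v2") -> str:
    """Deterministically assign a user to version A or B.

    Hashing the user id together with the experiment name keeps each
    user in the same group on every visit, while splitting traffic
    roughly 50/50 across users.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).digest()
    return "A" if digest[0] % 2 == 0 else "B"
```

Salting the hash with the experiment name means separate experiments split users independently of each other.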
3. Why A/B Test?
• A typical website converts 2% of visitors into customers
• People can't explain why they left
• Small changes can make a big difference
• How about +40%?
• Google believes it works; see Content Experiments in Google Analytics
4. What Can You A/B Test?
• Removing form fields
• Adding relevant form fields
• Marketing landing pages
• Different explanations
• Having interstitial pages
• Email content
• Any casual decisions you care about
5. A/B Tests Do Not Substitute For
• Talking to users
• Usability tests
• Acceptance tests
• Unit tests
• Thinking
6. The G-Test
• A method for comparing 2 data sets
• It is a close relative of the chi-square test, which Karl Pearson introduced in 1900
• It is our main method for evaluating A/B tests
• There are alternatives
7. Limitations Of The G-Test
• Only answers yes/no questions (but you pick the question)
• Only handles 2 versions (there is a workaround)
• Requires independence in samples
• Does not do confidence intervals
8. What To Measure
• Start your A/B test
• Divide your users into groups A and B
• Decide whether each person did what you want
• Reduce your results to 4 numbers: ($a_yes, $a_no, $b_yes, $b_no)
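The reduction to four numbers can be sketched as a small tally. The data shapes here (a dict of assignments and a set of converters) are illustrative assumptions, not the deck's own pipeline:

```python
from collections import Counter

def tally(assignments, converters):
    """Reduce raw data to the four counts the G-test needs.

    assignments: dict mapping user id -> "A" or "B"
    converters:  set of user ids who did what you wanted
    Returns (a_yes, a_no, b_yes, b_no).
    """
    counts = Counter(
        (group, user in converters) for user, group in assignments.items()
    )
    return (counts[("A", True)], counts[("A", False)],
            counts[("B", True)], counts[("B", False)])
```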
9. G-Test Evaluation
• Select a yes/no question about users
• Divide users in A and B into yes/no
• Perform the (complicated) G-test calculation to get $p
• Our confidence is 1 - $p
• Make a decision if our confidence is near 100% and we have enough samples
• "Enough samples" means at least 10 yes and 10 no results in each group
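The "complicated calculation" the slide alludes to is short enough to write out. This is an illustrative implementation, not the deck's own code: it computes the G statistic for the 2x2 table and converts it to a p-value using the chi-square survival function with 1 degree of freedom (expressible via `math.erfc`):

```python
import math

def g_test(a_yes, a_no, b_yes, b_no):
    """Return (G, p) for a 2x2 contingency table.

    G = 2 * sum(observed * ln(observed / expected)), where the expected
    counts assume A and B convert at the same overall rate.  With one
    degree of freedom, the chi-square survival function reduces to
    erfc(sqrt(G / 2)).
    """
    n = a_yes + a_no + b_yes + b_no
    observed = [a_yes, a_no, b_yes, b_no]
    expected = [
        (a_yes + a_no) * (a_yes + b_yes) / n,
        (a_yes + a_no) * (a_no + b_no) / n,
        (b_yes + b_no) * (a_yes + b_yes) / n,
        (b_yes + b_no) * (a_no + b_no) / n,
    ]
    g = 2 * sum(o * math.log(o / e) for o, e in zip(observed, expected) if o > 0)
    p = math.erfc(math.sqrt(g / 2))  # chi-square survival, 1 df
    return g, p
```

For example, `g_test(30, 70, 50, 50)` gives a p-value below 0.01, i.e. confidence above 99% that A and B really differ.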
11. Your Conversion Funnel
• Every company has one or more conversion funnels
• You should know yours, and be actively trying to improve each step
• Each step can be tracked with some metric
• Most A/B tests concentrate on one step in the funnel
• Expect to run multiple A/B tests against each
• Standardize these metrics
12. Examples Of Metrics
• Sessions, sessions with registration
• People who searched, who viewed a detail page, contacted, leased
• People who saved favorites, started a cart, completed a purchase
• People who saw at least 3 pages, clicked on an ad
• Anything measurable and important to your business
13. Too Many Metrics?
• You may have many metrics
• High confidence on one may be chance
• Believe it if it was the metric you tried to change
• Believe it if confidence is very high
• Believe it if several metrics agree
• Conflicting metrics require a business decision
14. Is That It?
• You now know enough to run a successful A/B test!
• If you do everything right
• If you do it wrong, you won't know
• You'll just get random answers
• And believe them
15. Compare Apples To Apples
• Traffic behaves differently at different times
  ◦ Friday night ≠ Monday morning
  ◦ First week in month ≠ last week in month
• Last week's visitors have done more than this week's
• Do not try to compare people from different times
16. Be Careful When Changing The Mix
• A and B can receive unequal traffic
• But do not change the mix they get
• Wrong: changing a (90/10) A vs B split to (80/20)
• You are implicitly comparing old A's with new B's
• Right: changing a (10/10/80) A vs B vs Untested split to (20/20/60)
• This comes up repeatedly
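One way to make the "right" ramp-up safe in code is to carve A's share from the bottom of the hash range and B's from the top, so growing both shares only converts Untested users. This layout is an illustrative assumption, not something the deck specifies:

```python
import hashlib

def assign_bucket(user_id: str, a_pct: int, b_pct: int, total: int = 100) -> str:
    """Assign a user to "A", "B", or "Untested" (a sketch).

    A fills from the bottom of the hash range and B from the top, so
    raising a_pct and b_pct equally (say 10/10 -> 20/20) only pulls in
    Untested users: everyone already in A stays in A, and likewise B.
    """
    digest = hashlib.sha256(user_id.encode()).digest()
    point = int.from_bytes(digest[:4], "big") % total
    if point < a_pct:
        return "A"
    if point >= total - b_pct:
        return "B"
    return "Untested"
```

With this layout, the A and B populations before and after the ramp remain directly comparable, which is exactly what the slide's (90/10) → (80/20) example gets wrong.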
17. What Is Wrong With This?
• Suppose you are A/B testing a change to your product page
• You log hits on your product page
• You log clicks on Buy Now
• You plug those numbers into the A/B calculator
• Is this OK?
18. Beware Hidden Correlations!
• Correlations increase variability beyond what the G-test assumes
• Some people look at many product pages
• Their buying behaviour is correlated on those pages
• This increases the size of chance fluctuations
• Leading to wrong results
19. Guarantee Independence
• Whatever granularity (session, person, event) you make A/B decisions on...
• Needs to be what you test on
• In this case, measure people who hit your product page
• Measure people who clicked on Buy Now
• Those are the right statistics to use
• This comes up repeatedly
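Concretely, measuring people rather than events means collapsing the raw logs to one yes/no answer per person before counting. A minimal sketch, assuming the logs are simply lists of user ids (one entry per event):

```python
def per_person_counts(page_hits, buy_clicks):
    """Collapse raw event logs to one yes/no answer per person.

    page_hits and buy_clicks are lists of user ids, one entry per event;
    a user who viewed ten product pages and bought twice still counts
    as a single "yes", keeping the samples independent.
    """
    viewers = set(page_hits)             # everyone who hit the product page
    buyers = set(buy_clicks) & viewers   # of those, who clicked Buy Now
    return len(buyers), len(viewers) - len(buyers)  # (yes, no)
```

Running this once per group yields the ($a_yes, $a_no, $b_yes, $b_no) counts with per-person independence, rather than the correlated per-hit counts from slide 17.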
20. Wrong Metric
• At Rent.com we changed the title of our lease report email
• The new email had improved opens and clicks
• That was because it interested people who were still looking for a place to live
• That email needed to interest people who had already found a place to live
• We looked at the wrong metric, and it cost us millions
• This mistake is fairly rare
21. That's It!
• Those are the big mistakes that I've seen
• You now know how to do an A/B test
• ...and should have good odds of getting it right
• Of course there is more to know
• But this is the core