2. What Is A/B Testing?
• Develop two versions of a page
• Randomly show different versions to users
• Track how users perform
• Evaluate (that's where statistics comes in)
• Use the better version
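The "randomly show different versions" step is often done by hashing a stable user id, so each user sees the same version on every visit. A minimal sketch (the experiment name and function are made-up examples, not from the deck):

```python
import hashlib

def assign_version(user_id: str, experiment: str = "homepage-v2") -> str:
    """Deterministically assign a user to version A or B.

    Hashing the user id together with the experiment name keeps each
    user in the same group on every visit, while splitting traffic
    roughly 50/50 across users.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).digest()
    return "A" if digest[0] % 2 == 0 else "B"
```

Salting the hash with the experiment name means separate experiments split users independently of each other.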
3. Why A/B Test?
• A typical website converts 2% of visitors into customers
• People can't explain why they left
• Small changes can make a big difference
• How about +40%?
• Google believes it works; see Content Experiments in Google Analytics
4. What Can You A/B Test?
• Removing form fields
• Adding relevant form fields
• Marketing landing pages
• Different explanations
• Having interstitial pages
• Email content
• Any casual decisions you care about
5. A/B Tests Do Not Substitute For
• Talking to users
• Usability tests
• Acceptance tests
• Unit tests
• Thinking
6. The G-Test
• A method for comparing 2 data sets
• It is a close relative of the chi-square test, which Karl Pearson introduced in 1900
• It is our main method for evaluating A/B tests
• There are alternatives
7. Limitations Of The G-Test
• Only answers yes/no questions (but you pick the question)
• Only handles 2 versions (there is a workaround)
• Requires independence in samples
• Does not do confidence intervals
8. What To Measure
• Start your A/B test
• Divide your users into groups A and B
• Decide whether each person did what you want
• Reduce your results to 4 numbers: ($a_yes, $a_no, $b_yes, $b_no)
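The reduction to four numbers can be sketched as a small tally. The data shapes here (a dict of assignments and a set of converters) are illustrative assumptions, not the deck's own pipeline:

```python
from collections import Counter

def tally(assignments, converters):
    """Reduce raw data to the four counts the G-test needs.

    assignments: dict mapping user id -> "A" or "B"
    converters:  set of user ids who did what you wanted
    Returns (a_yes, a_no, b_yes, b_no).
    """
    counts = Counter(
        (group, user in converters) for user, group in assignments.items()
    )
    return (counts[("A", True)], counts[("A", False)],
            counts[("B", True)], counts[("B", False)])
```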
9. G-Test Evaluation
• Select a yes/no question about users
• Divide users in A and B into yes/no
• Perform the (complicated) G-test calculation to get $p
• Our confidence is 1 - $p
• Make a decision if our confidence is near 100% and we have enough samples
• "Enough samples" means at least 10 yes and 10 no results in each group
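The "complicated calculation" the slide alludes to is short enough to write out. This is an illustrative implementation, not the deck's own code: it computes the G statistic for the 2x2 table and converts it to a p-value using the chi-square survival function with 1 degree of freedom (expressible via `math.erfc`):

```python
import math

def g_test(a_yes, a_no, b_yes, b_no):
    """Return (G, p) for a 2x2 contingency table.

    G = 2 * sum(observed * ln(observed / expected)), where the expected
    counts assume A and B convert at the same overall rate.  With one
    degree of freedom, the chi-square survival function reduces to
    erfc(sqrt(G / 2)).
    """
    n = a_yes + a_no + b_yes + b_no
    observed = [a_yes, a_no, b_yes, b_no]
    expected = [
        (a_yes + a_no) * (a_yes + b_yes) / n,
        (a_yes + a_no) * (a_no + b_no) / n,
        (b_yes + b_no) * (a_yes + b_yes) / n,
        (b_yes + b_no) * (a_no + b_no) / n,
    ]
    g = 2 * sum(o * math.log(o / e) for o, e in zip(observed, expected) if o > 0)
    p = math.erfc(math.sqrt(g / 2))  # chi-square survival, 1 df
    return g, p
```

For example, `g_test(30, 70, 50, 50)` gives a p-value below 0.01, i.e. confidence above 99% that A and B really differ.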
11. Your Conversion Funnel
• Every company has one or more conversion funnels
• You should know yours, and be actively trying to improve each step
• Each step can be tracked with some metric
• Most A/B tests concentrate on one step in the funnel
• Expect to run multiple A/B tests against each
• Standardize these metrics
12. Examples Of Metrics
• Sessions, sessions with registration
• People who searched, who viewed a detail page, contacted, leased
• People who saved favorites, started a cart, completed a purchase
• People who saw at least 3 pages, clicked on an ad
• Anything measurable and important to your business
13. Too Many Metrics?
• You may have many metrics
• High confidence on one may be chance
• Believe it if it was the metric you tried to change
• Believe it if confidence is very high
• Believe it if several metrics agree
• Conflicting metrics require a business decision
14. Is That It?
• You now know enough to run a successful A/B test!
• If you do everything right
• If you do it wrong, you won't know
• You'll just get random answers
• And believe them
15. Compare Apples To Apples
• Traffic behaves differently at different times
  ◦ Friday night ≠ Monday morning
  ◦ First week in month ≠ last week in month
• Last week's visitors have done more than this week's
• Do not try to compare people from different times
16. Be Careful When Changing The Mix
• A and B can receive unequal traffic
• But do not change the mix they get
• Wrong: changing a (90/10) A vs B split to (80/20)
• You are implicitly comparing old A's with new B's
• Right: changing a (10/10/80) A vs B vs Untested split to (20/20/60)
• This comes up repeatedly
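One way to make the "right" ramp-up safe in code is to carve A's share from the bottom of the hash range and B's from the top, so growing both shares only converts Untested users. This layout is an illustrative assumption, not something the deck specifies:

```python
import hashlib

def assign_bucket(user_id: str, a_pct: int, b_pct: int, total: int = 100) -> str:
    """Assign a user to "A", "B", or "Untested" (a sketch).

    A fills from the bottom of the hash range and B from the top, so
    raising a_pct and b_pct equally (say 10/10 -> 20/20) only pulls in
    Untested users: everyone already in A stays in A, and likewise B.
    """
    digest = hashlib.sha256(user_id.encode()).digest()
    point = int.from_bytes(digest[:4], "big") % total
    if point < a_pct:
        return "A"
    if point >= total - b_pct:
        return "B"
    return "Untested"
```

With this layout, the A and B populations before and after the ramp remain directly comparable, which is exactly what the slide's (90/10) → (80/20) example gets wrong.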
17. What Is Wrong With This?
• Suppose you are A/B testing a change to your product page
• You log hits on your product page
• You log clicks on Buy Now
• You plug those numbers into the A/B calculator
• Is this OK?
18. Beware Hidden Correlations!
• Correlations increase variability beyond what the G-test assumes
• Some people look at many product pages
• Their buying behaviour is correlated on those pages
• This increases the size of chance fluctuations
• Leading to wrong results
19. Guarantee Independence
• Whatever granularity (session, person, event) you make A/B decisions on...
• Needs to be what you test on
• In this case, measure people who hit your product page
• Measure people who clicked on Buy Now
• Those are the right statistics to use
• This comes up repeatedly
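Concretely, measuring people rather than events means collapsing the raw logs to one yes/no answer per person before counting. A minimal sketch, assuming the logs are simply lists of user ids (one entry per event):

```python
def per_person_counts(page_hits, buy_clicks):
    """Collapse raw event logs to one yes/no answer per person.

    page_hits and buy_clicks are lists of user ids, one entry per event;
    a user who viewed ten product pages and bought twice still counts
    as a single "yes", keeping the samples independent.
    """
    viewers = set(page_hits)             # everyone who hit the product page
    buyers = set(buy_clicks) & viewers   # of those, who clicked Buy Now
    return len(buyers), len(viewers) - len(buyers)  # (yes, no)
```

Running this once per group yields the ($a_yes, $a_no, $b_yes, $b_no) counts with per-person independence, rather than the correlated per-hit counts from slide 17.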
20. Wrong Metric
• At Rent.com we changed the title of our lease report email
• The new email had improved opens and clicks
• That was because it interested people who were still looking for a place to live
• That email needed to interest people who had already found a place to live
• We looked at the wrong metric, and it cost us millions
• This mistake is fairly rare
21. That's It!
• Those are the big mistakes that I've seen
• You now know how to do an A/B test
• ...and should have good odds of getting it right
• Of course there is more to know
• But this is the core