Nikolay Novozhilov gave a presentation on common problems with A/B testing and statistics. He discussed how peeking at the data before a test has finished can invalidate results, and showed through a Monte Carlo simulation that variants can appear to "win" purely by chance. Multivariate testing and multiple comparisons were also cited as issues. Novozhilov recommended starting with a clear hypothesis, replicating tests, and considering sample size, significance, effect size, and power to obtain more reliable A/B test results.
2. A/B TESTING AND THE MOST COMMON PROBLEMS: A LOOK AT STATISTICS AND THE WAYS TO GET IT WRONG
Nikolay Novozhilov - Product director of data
platforms @ Wego.com.
Nikolay is building big data capabilities at Wego.com, the leading travel metasearch engine in Asia Pacific and the Middle East. He has 7+ years of experience in data analytics, working for IT startups and previously in consulting. Nikolay received an MBA from INSEAD in Singapore and before that lived and worked in Moscow.
3. A/B testing and problems with statistics
Web Analytics Wednesday, Singapore
Nikolay Novozhilov, Wego.com
www.novozhilov.co
7. Lies, damned lies, and statistics
All different! All based on assumptions!!!
Tool              | Test used
Optimizely        | Two-tailed sequential likelihood ratio test with false discovery rate controls
Google Analytics  | Bayes estimate with uniform Beta prior
VWO               | Intersection of confidence intervals for the binomial distribution
Leanplum          | Confidence intervals at p = 5%; underlying statistic unknown
Usereffect        | Chi-square statistic
Commerce Sciences | Welch's t-test
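The table's point is that different tools can reach different conclusions on the same data. A minimal sketch contrasting two of the listed approaches, a chi-square test (Usereffect) and a Bayes estimate with a uniform Beta prior (Google Analytics); all conversion counts below are made up for illustration:

```python
import random
from statistics import NormalDist

def chi2_pvalue(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value for a 2x2 chi-square test (1 df), computed via
    the equivalent two-proportion z statistic (chi-square = z**2)."""
    p = (conv_a + conv_b) / (n_a + n_b)               # pooled conversion rate
    se = (p * (1 - p) * (1 / n_a + 1 / n_b)) ** 0.5   # std. error of the difference
    z = (conv_a / n_a - conv_b / n_b) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))

def prob_b_beats_a(conv_a, n_a, conv_b, n_b, draws=20_000, seed=0):
    """P(CR_B > CR_A) under independent Beta(1, 1) (uniform) priors,
    estimated by sampling from the two posterior distributions."""
    rng = random.Random(seed)
    wins = sum(
        rng.betavariate(conv_b + 1, n_b - conv_b + 1)
        > rng.betavariate(conv_a + 1, n_a - conv_a + 1)
        for _ in range(draws)
    )
    return wins / draws

# Hypothetical data: 20/1000 conversions in A, 35/1000 in B.
print(chi2_pvalue(20, 1000, 35, 1000))       # frequentist answer
print(prob_b_beats_a(20, 1000, 35, 1000))    # Bayesian answer
```

The two numbers answer different questions ("how surprising is this if there's no difference?" vs. "how likely is B better than A?"), which is one reason tools disagree.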
8. What is a p-value and why is it 5%?
All tests are based on assumptions!
Assumption #1: you don't peek at the data before the test ends
9. What happens if you look?
I played Monte Carlo in Excel. And here is the result:
• 5% significance level (p-value threshold)
• 1,000 "users" in each sample
• CR of 2%
• A "wins" over A 29% of the time!
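The Excel workbook isn't shown, but the peeking effect it demonstrates can be sketched in Python: run repeated A/A tests (same 2% conversion rate in both arms, 1,000 users per arm) and check for "significance" after every new pair of users. The parameters below, including the minimum sample before the first peek, are assumptions for illustration; the exact rate will differ from the slide's 29%, but it should land well above the nominal 5%.

```python
import random

def aa_test_with_peeking(n_users=1000, cr=0.02, z_crit=1.96, min_n=100,
                         n_runs=300, seed=0):
    """Fraction of A/A tests (identical arms) that look 'significant'
    at least once when a z-test is run after every new pair of users."""
    rng = random.Random(seed)
    false_positives = 0
    for _ in range(n_runs):
        conv_a = conv_b = 0
        for n in range(1, n_users + 1):
            conv_a += rng.random() < cr
            conv_b += rng.random() < cr
            if n < min_n:                      # don't peek on tiny samples
                continue
            p = (conv_a + conv_b) / (2 * n)    # pooled conversion rate
            se = (2 * p * (1 - p) / n) ** 0.5  # std. error of the difference
            if se > 0 and abs(conv_a - conv_b) / n > z_crit * se:
                false_positives += 1           # declared a 'winner' by chance
                break
    return false_positives / n_runs

print(aa_test_with_peeking())  # far above the nominal 0.05
```

Each individual peek has a 5% false-positive rate, but with hundreds of correlated peeks the chance of *ever* crossing the threshold compounds, which is the trap the slide is illustrating.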
10. What do you do about it?
Don’t look! (just kidding)
Google "O'Brien & Fleming interim analysis" (no, still kidding)
Keep calm, more stuff coming!
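The summary mentions sample size, effect size, and power among the recommendations. A standard normal-approximation formula (not from the slides) for the sample needed per arm in a two-proportion test can be sketched as:

```python
from math import ceil
from statistics import NormalDist

def sample_size_per_arm(p_base, p_target, alpha=0.05, power=0.8):
    """Users needed per arm to detect a move from p_base to p_target
    with a two-sided two-proportion z-test (normal approximation)."""
    nd = NormalDist()
    z_alpha = nd.inv_cdf(1 - alpha / 2)   # e.g. 1.96 for alpha = 0.05
    z_beta = nd.inv_cdf(power)            # e.g. 0.84 for 80% power
    variance = p_base * (1 - p_base) + p_target * (1 - p_target)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p_target - p_base) ** 2)

# Detecting a lift from 2.0% to 2.5% CR takes on the order of
# ten thousand users per arm -- far more than the 1,000 in the A/A demo.
print(sample_size_per_arm(0.02, 0.025))
```

This makes the slide's point concrete: small effects on small conversion rates need much larger samples than people expect, and underpowered tests are exactly where chance "winners" thrive.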