Ton Wesseling
How an analyst can add value
Digital Experiments
TON@ONLINEDIALOGUE.COM
Dale Ha's on Researchgate
A/B-testing mastery course
This talk only makes sense if you have 10,000 transactions or more per month – enough to get experimentation into the DNA of your organization.
Data Analyst - The Noun Project icon from the Noun Project
Behavior Analyst Meaning Noun shirt on Amazon.com
DEF
The task of an analyst within an A/B-testing culture
1. Data
2. Effectiveness
3. Finance
Data
Let there be high-quality data
Make sure all funnels are measured…
Make sure your testing solution has all users
Users on template: 42186 (100%)
Users in the tool: 37652 (89%)
Users with code executed: 34312 (81%)
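The three user counts above can be checked with a few lines of Python. This is only a sketch: the function name `coverage` and the stage labels are illustrative, and the counts are the ones from the slide.

```python
def coverage(users_on_template: int, users_in_tool: int, users_with_code: int) -> dict:
    """Return each stage as a share of the users on the template."""
    stages = {
        "on template": users_on_template,
        "in the tool": users_in_tool,
        "code executed": users_with_code,
    }
    return {name: count / users_on_template for name, count in stages.items()}


shares = coverage(42186, 37652, 34312)
for name, share in shares.items():
    print(f"{name}: {share:.0%}")

# Extra users an experiment would have if every user on the template
# also had the experiment code executed.
potential_gain = 42186 / 34312 - 1
print(f"potential extra users: {potential_gain:.0%}")
```

With these counts the potential gain comes out slightly above 20%.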
What if my experiments had 20% more users?
Recognizing returning users
Buddhini S. on Jargon Wall
Be able to segment on page interactions
Be able to segment on who can be influenced
Be able to create behavioral segments
Typical ecommerce flow example:
✓ All users on your website with enough time to take action
✓ All users on your website with at least some interaction
✓ All users on your website with heavy interaction
✓ All users on your website with clear intent to buy
✓ All users on your website that are willing to buy
✓ All users on your website that succeed in buying
✓ All users on your website that return with intent to buy more
Funnel + average time
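One possible way to turn the ecommerce flow above into segments is sketched below. Everything in it is an assumption made for illustration: the `Visitor` fields, the two thresholds, and the mapping of events (add to cart, checkout start, purchase) to the segment names are hypothetical and not taken from the talk.

```python
from dataclasses import dataclass

# Hypothetical thresholds; derive real ones from your own funnel and average time.
MIN_SECONDS_TO_ACT = 10
HEAVY_INTERACTION_COUNT = 5


@dataclass
class Visitor:
    seconds_on_site: float
    interactions: int = 0
    added_to_cart: bool = False
    started_checkout: bool = False
    purchased: bool = False
    returned_and_added_to_cart: bool = False


def deepest_segment(v: Visitor) -> str:
    """Return the deepest segment reached, checking from the last step to the first."""
    if v.purchased and v.returned_and_added_to_cart:
        return "return with intent to buy more"
    if v.purchased:
        return "succeed in buying"
    if v.started_checkout:
        return "willing to buy"
    if v.added_to_cart:
        return "clear intent to buy"
    if v.interactions >= HEAVY_INTERACTION_COUNT:
        return "heavy interaction"
    if v.interactions >= 1:
        return "at least some interaction"
    if v.seconds_on_site >= MIN_SECONDS_TO_ACT:
        return "enough time to take action"
    return "not enough time to take action"


print(deepest_segment(Visitor(seconds_on_site=45, interactions=7)))
```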
Effectiveness
Make sure you work on stuff with the highest potential outcome
Statistical Power
The likelihood that an experiment will detect an effect when there is an effect to be detected
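To make the definition concrete, here is a minimal sketch of a power calculation. It assumes a one-sided two-proportion z-test with a normal approximation, which is a choice made for this example and not necessarily what any particular testing tool uses. The function name and parameters are illustrative.

```python
from math import sqrt
from statistics import NormalDist


def power_one_sided(baseline_rate: float, relative_lift: float,
                    users_per_variant: int, significance: float = 0.90) -> float:
    """Approximate power of a one-sided two-proportion z-test (normal approximation)."""
    p_a = baseline_rate
    p_b = baseline_rate * (1 + relative_lift)
    std_err = sqrt(p_a * (1 - p_a) / users_per_variant
                   + p_b * (1 - p_b) / users_per_variant)
    z_crit = NormalDist().inv_cdf(significance)  # critical value at the chosen significance level
    return NormalDist().cdf((p_b - p_a) / std_err - z_crit)


# Power grows with the number of users per variant (hypothetical inputs).
for n in (5_000, 20_000, 80_000):
    print(n, round(power_one_sided(0.05, 0.10, n), 2))
```

When the true lift is zero, the function returns 1 minus the significance level, which is the false positive rate.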
Power & Significance
                                   Measured: do not reject H0      Measured: reject H0
                                   (new version is NOT better)     (new version is better)
Reality: H0 is true                Correct decision ☺              Type I
(new version is NOT better)                                        False Positive (α)
Reality: H0 is false               Type II                         Correct decision ☺
(new version is better)            False Negative (β)
Power & Significance rule of thumb
Power
When you start: try to test on pages with high power (>80%) → otherwise you won’t detect effects when there is an effect to be detected (false negatives).
Significance
When you start: try to test against a high enough significance level (90%) → otherwise you’ll declare winners when in reality there isn’t an effect (false positives).
https://abtestguide.com/abtestsize/
https://ondi.me/bandwidth
Prioritize based on MDE (minimum detectable effect) to start
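A rough way to rank pages by MDE is sketched below. It assumes a one-sided z-test at 90% significance and 80% power and uses the baseline variance for both variants, so the numbers are approximations. The page names, traffic figures, and conversion rates are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def relative_mde(baseline_rate: float, users_per_variant: int,
                 significance: float = 0.90, power: float = 0.80) -> float:
    """Approximate relative minimum detectable effect for a one-sided z-test."""
    z = NormalDist()
    z_alpha = z.inv_cdf(significance)
    z_beta = z.inv_cdf(power)
    absolute = (z_alpha + z_beta) * sqrt(
        2 * baseline_rate * (1 - baseline_rate) / users_per_variant)
    return absolute / baseline_rate


# Hypothetical pages: (users per variant during the test period, conversion rate).
pages = {
    "product page": (30_000, 0.04),
    "cart": (8_000, 0.30),
    "checkout": (4_000, 0.60),
}

# Lowest MDE first: these pages can detect the smallest effects.
ranked = sorted(pages, key=lambda name: relative_mde(pages[name][1], pages[name][0]))
for name in ranked:
    users, rate = pages[name]
    print(f"{name}: MDE ≈ {relative_mde(rate, users):.1%}")
```

In this made-up example the pages deeper in the funnel rank first despite lower traffic, because their higher conversion rates reduce the relative noise.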
Prioritize based on measured results!
Finance
Business case calculations
What does your calculation look like?
If significant result:
Extra new customers per week * 52 weeks effective * Average lifetime value
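The calculation above, written out as a small function. The input numbers are hypothetical and only show the arithmetic; they are not the figures behind any amount quoted in the talk.

```python
def yearly_business_case(extra_customers_per_week: float,
                         average_lifetime_value: float,
                         weeks_effective: int = 52) -> float:
    """Extra new customers per week * weeks effective * average lifetime value."""
    return extra_customers_per_week * weeks_effective * average_lifetime_value


# Hypothetical inputs: 25 extra customers per week, €300 lifetime value each.
value = yearly_business_case(extra_customers_per_week=25, average_lifetime_value=300.0)
print(f"€{value:,.0f}")
```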
So this experiment will bring us:
€412,390
So this experiment will bring us?
€412,390 * (100% - Type-M error %)?
(Type-M error: the overestimation of the effect size in a significant result)
Prioritize based on measured results?
* (100% - Type-M error) of course!
What is your false discovery rate?
Significance threshold: 90%
100 experiments
20 significant outcomes
False discovery rate: ((100% - 90%) * 100) / 20 = 10 / 20 = 50%* (it’s a little lower, this is the poor man’s calculation)
(with every real win the number of experiments without wins becomes lower, which leads to fewer false positives)
So not really 50%
FDR* = (Measured Wins - ((Measured Wins - ((100% - Confidence Level) * Experiments)) / Confidence Level)) / Measured Wins
= (20 - ((20 - ((100% - 90%) * 100)) / 90%)) / 20
= 44%* (only if your power on all experiments was 100%)
(Your power will be lower, which means you had more real wins that were not measured (false negatives). This leads to fewer experiments without an effect, so the number of false positives will be even lower.)
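Both the poor man’s calculation and the corrected formula can be reproduced in a short sketch. The function names are illustrative; the corrected version keeps the same assumption as the formula above, namely 100% power on every experiment.

```python
def naive_fdr(experiments: int, measured_wins: int, confidence_level: float) -> float:
    """Poor man's FDR: assume every experiment could have produced a false positive."""
    expected_false_positives = (1 - confidence_level) * experiments
    return expected_false_positives / measured_wins


def corrected_fdr(experiments: int, measured_wins: int, confidence_level: float) -> float:
    """FDR corrected for real wins, assuming 100% power on all experiments."""
    # Solve measured_wins = real_wins + (1 - confidence) * (experiments - real_wins)
    real_wins = (measured_wins - (1 - confidence_level) * experiments) / confidence_level
    return (measured_wins - real_wins) / measured_wins


print(f"naive: {naive_fdr(100, 20, 0.90):.0%}")
print(f"corrected: {corrected_fdr(100, 20, 0.90):.0%}")
```

With 100 experiments, 20 measured wins, and a 90% confidence level, about 11 of the 20 wins are estimated to be real, which is where the 44% comes from.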
https://abtestguide.com/fdr/
So all your experiments will bring you:
Sum of (every winner * (100% - Type-M error % per winner))
* (100% - FDR%)
* Implementation % (within x months…)
(assuming every new win is tested on the new default where all earlier wins are implemented)
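The formula above, sketched as a function. The winners, their Type-M error percentages, the FDR, and the implementation percentage in the example are hypothetical values chosen only to show the arithmetic.

```python
def program_value(winners: list[tuple[float, float]], fdr: float,
                  implementation_rate: float) -> float:
    """Discount measured wins for Type-M error, FDR, and implementation rate.

    winners: (measured yearly value, Type-M error as a fraction) per winning experiment.
    """
    discounted = sum(value * (1 - type_m_error) for value, type_m_error in winners)
    return discounted * (1 - fdr) * implementation_rate


# Hypothetical example: two winners, 25% FDR, 80% of winners implemented.
total = program_value([(100_000, 0.30), (50_000, 0.20)], fdr=0.25, implementation_rate=0.80)
print(f"€{total:,.0f}")
```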
You can correct the FDR for the p-value distribution
So all your experiments will bring you:
Sum of (every winner * (100% - Type-M error % per winner))
* (100% - corrected FDR%)
* Implementation % (within x months…)
(assuming every new win is tested on the new default where all earlier wins are implemented)
Maximize your growth with ROI limits:
ROI = Value of A/B-testing for Optimization / Costs of A/B-testing for Optimization
Finance: are you above or below your ROI limit?
1. Above: increase budgets
2. Below: increase knowledge
3. Still below: decrease budgets
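The ROI ratio and the three-step rule above can be sketched as follows. Two things are assumptions of this example: an ROI exactly at the limit is treated as "above", and a hypothetical flag `knowledge_already_increased` stands in for "still below". The amounts are made up.

```python
def roi(value: float, costs: float) -> float:
    """ROI as a plain ratio: value of A/B-testing divided by its costs."""
    return value / costs


def next_step(current_roi: float, roi_limit: float,
              knowledge_already_increased: bool = False) -> str:
    """Pick the next action relative to the ROI limit."""
    if current_roi >= roi_limit:  # assumption: at the limit counts as above
        return "increase budgets"
    if not knowledge_already_increased:
        return "increase knowledge"
    return "decrease budgets"


# Hypothetical figures: €300,000 value against €100,000 costs, ROI limit of 5.
current = roi(value=300_000, costs=100_000)
print(current, next_step(current, roi_limit=5))
```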
An A/B-testing for growth analyst:
1. Makes sure there is high-quality Data available
2. Steers the data chance on Effect
3. Reports on the real Financial impact
→ https://ondi.me/cxlcourse ←
Questions:
Ton Wesseling
https://ondi.me/tonw
Let’s connect on LinkedIn

Latest article on A/B-testing:
Ton Wesseling
How an analyst can add value
Digital Experiments

DDTT11: Ton Wesseling - 21-01-20