3. 3
One of the most common questions we face in
marketing is how to measure incremental effects
- How much incremental revenue did the new pricing
strategy drive?
- What impact did the new feature on the website have?
- How many incremental conversions were achieved by
increasing the commission rate for our affiliates?
- ...
4. 4
The gold standard for estimating causal
effects is a randomised experiment
[Figure: A/B test. 50% of visitors see Version A (10% conversion) and 50% of visitors see Version B (15% conversion).]
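As a minimal sketch of how such an experiment is read, the snippet below computes the absolute and relative lift from the two conversion rates on this slide. The function name `lift` is an illustrative assumption, Version A is assumed to be the control, and sampling uncertainty is ignored.

```python
def lift(rate_control: float, rate_treatment: float) -> tuple[float, float]:
    """Return (absolute lift, relative lift) of treatment over control."""
    absolute = rate_treatment - rate_control
    relative = absolute / rate_control
    return absolute, relative


# Conversion rates from the slide: Version A converts 10%, Version B 15%.
# Treating Version A as the control is an assumption of this sketch.
absolute, relative = lift(0.10, 0.15)
print(f"absolute lift: {absolute:.1%}, relative lift: {relative:.0%}")
```

Because visitors are assigned at random, the difference between the two groups can be attributed to the change itself rather than to who happened to see it.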
5. 5
However, A/B tests are often either too expensive to run
or cannot be run at all, e.g. for legal reasons
[Figure: Version A and Version B. 100% of visitors see Version B, with 15% conversion.]
6. 6
Example: financial performance of Company A
[Figure: Adjusted closing price by date, 2011 to 2017, showing the actual share price. The date the scandal broke is marked.]
7. 7
Approach: estimate the share price had the scandal not
happened
[Figure: Adjusted closing price by date, 2011 to 2017, showing the actual share price and the predicted share price. The date the scandal broke is marked.]
8. 8
By comparing the actual and predicted share price, we
can estimate the drop in stock value due to the scandal
[Figure: Adjusted closing price by date, 2011 to 2017, showing the actual share price and the predicted share price. The date the scandal broke is marked, and the gap between the two lines is labelled "Drop in stock value due to scandal".]
9. 9
Thanks to a fully Bayesian approach, we can quantify
the uncertainty of our predictions
[Figure: Actual share price and predicted share price with a 95% credible interval, 2011 to 2017. The date the scandal broke is marked. The axes are labelled "Day" and "Clicks" in the source.]
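To illustrate what a 95% credible interval means in practice, here is a small sketch that takes posterior draws of a prediction and reports the central 95% range. The draws are simulated from a normal distribution purely for illustration, and the helper name `credible_interval` is an assumption rather than part of any package.

```python
import random
from statistics import quantiles


def credible_interval(draws: list[float]) -> tuple[float, float]:
    """Central 95% interval: the 2.5% and 97.5% quantiles of the draws."""
    cuts = quantiles(draws, n=40)  # 39 cut points, spaced 2.5% apart
    return cuts[0], cuts[-1]


# Hypothetical posterior draws of the predicted price on a single day.
rng = random.Random(1)
draws = [rng.gauss(150.0, 10.0) for _ in range(5000)]
lower, upper = credible_interval(draws)
print(f"95% credible interval: [{lower:.1f}, {upper:.1f}]")
```

Repeating this for every day of the prediction period would give a band like the one in the figure.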
10. 10
How do we construct the counterfactual estimate?
[Figure: Adjusted closing price by date, 2011 to 2017. Series: actual share price, predicted share price with 95% credible interval, Company B share price and Company C share price. The period before the scandal broke is labelled "Training" and the period after it "Prediction".]
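The idea can be sketched with a much simpler stand-in for the actual model: fit a relationship between the affected series and a control series on the training period, then use the control to predict what the affected series would have looked like afterwards. The snippet uses ordinary least squares with a single control, not the Bayesian structural time series model used by Causal Impact, and all prices are made up.

```python
from statistics import mean


def fit_line(x: list[float], y: list[float]) -> tuple[float, float]:
    """Ordinary least squares fit of y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Hypothetical prices: Company B is the control, Company A the affected series.
control = [50.0, 52.0, 55.0, 53.0, 58.0, 60.0, 61.0, 63.0]
actual = [101.0, 105.0, 111.0, 107.0, 117.0, 95.0, 96.0, 99.0]
n_train = 5  # the event happens after the fifth observation

# Learn the relationship on the training period only.
intercept, slope = fit_line(control[:n_train], actual[:n_train])

# Predict the counterfactual for the prediction period from the control.
counterfactual = [intercept + slope * c for c in control[n_train:]]
effect = [a - p for a, p in zip(actual[n_train:], counterfactual)]
print(counterfactual, effect)
```

The control series carries information about market-wide movements after the event, which is what lets the prediction extend beyond the training period.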
11. 11
Causal Impact methodology is based on a Bayesian
structural time series model
Most general form of the model
Observation equation: $y_t = Z_t^T \alpha_t + \varepsilon_t$
State equation: $\alpha_{t+1} = T_t \alpha_t + R_t \eta_t$
Causal Impact model
$y_t = \mu_t + \tau_t + x_t^T \beta + \varepsilon_t$
$\mu_{t+1} = \mu_t + \delta_t + \eta_{\mu,t}$
$\delta_{t+1} = \delta_t + \eta_{\delta,t}$
$\tau_{t+1} = -\sum_{s=0}^{S-2} \tau_{t-s} + \eta_{\tau,t}$
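To make the model concrete, the following sketch simulates a series from it: a level that drifts with a slowly changing slope, a seasonal component whose effects sum to zero over one cycle (up to noise), and a regression on a single control series. The function name, the starting level and slope, and the use of one control rather than several are assumptions made for illustration.

```python
import random


def simulate(n: int, season_length: int, beta: float, x: list[float],
             sd_obs: float, sd_level: float, sd_slope: float,
             sd_season: float, seed: int = 0) -> list[float]:
    """Simulate n observations from a local linear trend + seasonal + regression model."""
    rng = random.Random(seed)
    level, slope = 100.0, 0.5  # hypothetical initial level and slope
    # The S - 1 most recent seasonal effects, most recent first.
    season = [rng.gauss(0.0, 1.0) for _ in range(season_length - 1)]
    y = []
    for t in range(n):
        # Observation equation: level + seasonal effect + regression + noise.
        y.append(level + season[0] + beta * x[t] + rng.gauss(0.0, sd_obs))
        # Seasonal update: the new effect offsets the previous S - 1 effects.
        new_effect = -sum(season) + rng.gauss(0.0, sd_season)
        season = [new_effect] + season[:-1]
        # Trend update: the level moves by the current slope, then the slope drifts.
        level, slope = (level + slope + rng.gauss(0.0, sd_level),
                        slope + rng.gauss(0.0, sd_slope))
    return y


controls = [float(t % 3) for t in range(24)]  # hypothetical control series
series = simulate(24, season_length=4, beta=2.0, x=controls,
                  sd_obs=1.0, sd_level=0.1, sd_slope=0.01, sd_season=0.1)
print([round(v, 1) for v in series[:6]])
```

Setting all noise standard deviations to zero leaves a straight trend plus an exactly repeating seasonal pattern, which is a quick way to check that the state updates behave as intended.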
13. 13
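We impose an inverse-gamma prior on $\sigma_\varepsilon^2$, with parameters $s_\varepsilon$
and $v_\varepsilon$ selected based on the expected goodness-of-fit
$y_t = \mu_t + \tau_t + x_t^T \beta + \varepsilon_t$, with $\varepsilon_t \sim \mathcal{N}(0, \sigma_\varepsilon^2)$
$\mu_{t+1} = \mu_t + \delta_t + \eta_{\mu,t}$, with $\eta_{\mu,t} \sim \mathcal{N}(0, \sigma_\mu^2)$
$\delta_{t+1} = \delta_t + \eta_{\delta,t}$, with $\eta_{\delta,t} \sim \mathcal{N}(0, \sigma_\delta^2)$
$\tau_{t+1} = -\sum_{s=0}^{S-2} \tau_{t-s} + \eta_{\tau,t}$, with $\eta_{\tau,t} \sim \mathcal{N}(0, \sigma_\tau^2)$
Priors
$\sigma_\varepsilon^2 \sim \text{Inv-Gamma}(s_\varepsilon, v_\varepsilon)$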
[Figure: Inv-Gamma(a, b) density for varying values of a and b. Probability density against x from 0 to 3 for (a = 1, b = 1), (a = 2, b = 1), (a = 3, b = 1) and (a = 3, b = 0.5).]
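For readers who want to reproduce the shape of these curves, here is a minimal sketch of the inverse-gamma density, assuming the shape/scale parameterisation in which a is the shape and b is the scale. The function name `inv_gamma_pdf` is illustrative.

```python
import math


def inv_gamma_pdf(x: float, a: float, b: float) -> float:
    """Density of Inv-Gamma(a, b) with shape a and scale b; zero for x <= 0."""
    if x <= 0:
        return 0.0
    return b ** a / math.gamma(a) * x ** (-a - 1) * math.exp(-b / x)


# The four parameter pairs shown in the figure legend.
for a, b in [(1, 1), (2, 1), (3, 1), (3, 0.5)]:
    mode = b / (a + 1)  # under this parameterisation the density peaks here
    print(f"a={a}, b={b}: mode at x={mode:.3f}, "
          f"density {inv_gamma_pdf(mode, a, b):.3f}")
```

A larger shape a or a smaller scale b concentrates the prior mass closer to zero, which corresponds to a stronger prior belief that the variance is small.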
14. 14
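We impose weak priors on $\sigma_\mu^2$, $\sigma_\delta^2$ and $\sigma_\tau^2$, reflecting the
assumption that errors are small in the state process
$y_t = \mu_t + \tau_t + x_t^T \beta + \varepsilon_t$, with $\varepsilon_t \sim \mathcal{N}(0, \sigma_\varepsilon^2)$
$\mu_{t+1} = \mu_t + \delta_t + \eta_{\mu,t}$, with $\eta_{\mu,t} \sim \mathcal{N}(0, \sigma_\mu^2)$
$\delta_{t+1} = \delta_t + \eta_{\delta,t}$, with $\eta_{\delta,t} \sim \mathcal{N}(0, \sigma_\delta^2)$
$\tau_{t+1} = -\sum_{s=0}^{S-2} \tau_{t-s} + \eta_{\tau,t}$, with $\eta_{\tau,t} \sim \mathcal{N}(0, \sigma_\tau^2)$
Priors
$\sigma_\mu^2, \sigma_\delta^2, \sigma_\tau^2 \sim \text{Inv-Gamma}(1, 0.01 \times \operatorname{Var}(y))$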
[Figure: Inv-Gamma(a, b) density for varying values of a and b. Probability density against x from 0 to 3 for (a = 1, b = 1), (a = 2, b = 1), (a = 3, b = 1) and (a = 3, b = 0.5).]
15. 15
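We let the model choose an appropriate set of controls
by placing a spike and slab prior over coefficients $\beta$
$y_t = \mu_t + \tau_t + x_t^T \beta + \varepsilon_t$, with $\varepsilon_t \sim \mathcal{N}(0, \sigma_\varepsilon^2)$
$\mu_{t+1} = \mu_t + \delta_t + \eta_{\mu,t}$, with $\eta_{\mu,t} \sim \mathcal{N}(0, \sigma_\mu^2)$
$\delta_{t+1} = \delta_t + \eta_{\delta,t}$, with $\eta_{\delta,t} \sim \mathcal{N}(0, \sigma_\delta^2)$
$\tau_{t+1} = -\sum_{s=0}^{S-2} \tau_{t-s} + \eta_{\tau,t}$, with $\eta_{\tau,t} \sim \mathcal{N}(0, \sigma_\tau^2)$
Priors
$\beta_\gamma \mid \sigma_\varepsilon^2 \sim \mathcal{N}\left(0, n \sigma_\varepsilon^2 (X^T X)^{-1}\right)$
$p(\gamma) \sim \prod_{j=1}^{J} \pi_j^{\gamma_j} (1 - \pi_j)^{1 - \gamma_j}$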
[Figure: Density functions of spike and slab priors. Probability density against x from -2 to 2, showing the spike and the slab.]
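The mechanics of a spike and slab prior can be sketched as a two-step draw: first decide for each control whether it is included, then draw a coefficient only for the included ones. The sketch below simplifies the slab to independent normal distributions with a common standard deviation, which is an assumption of this example and not the covariance structure shown on the slide. The inclusion probabilities are made up.

```python
import random


def draw_spike_and_slab(inclusion_probs: list[float], slab_sd: float,
                        rng: random.Random) -> tuple[list[int], list[float]]:
    """Draw inclusion indicators and coefficients from a simplified prior."""
    # Spike: each control is included independently with its own probability.
    included = [1 if rng.random() < p else 0 for p in inclusion_probs]
    # Slab: included coefficients get a wide normal draw, excluded ones are exactly zero.
    beta = [rng.gauss(0.0, slab_sd) if flag else 0.0 for flag in included]
    return included, beta


rng = random.Random(42)
probs = [0.5, 0.5, 0.1, 0.9]  # hypothetical prior inclusion probabilities
for _ in range(3):
    included, beta = draw_spike_and_slab(probs, slab_sd=1.0, rng=rng)
    print(included, [round(b, 2) for b in beta])
```

Because excluded coefficients are exactly zero, the posterior over the indicators tells us directly how often each control series is used by the model.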
16. 16
The inference can be performed in R with just 6 lines of
code
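library(CausalImpact)
# `data` is a time series indexed by date: the response goes in the first
# column and the control series in the remaining columns.
pre.period <- as.Date(c("2011-01-03", "2015-09-14"))   # training period
post.period <- as.Date(c("2015-09-21", "2017-03-19"))  # prediction period
impact <- CausalImpact(data, pre.period, post.period)
plot(impact)
summary(impact)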
17. 17
Results can be plotted and summarised in a table
[Figure: CausalImpact plot with three panels (original, pointwise, cumulative) of the adjusted closing price by date, 2011 to 2017.]
The cumulative panel only makes sense when the metric is
additive, such as clicks or the number of orders. It is not
meaningful for a metric such as a share price.
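The difference between the pointwise and cumulative panels can be shown with a few made-up numbers. In this sketch the metric is daily clicks, which is additive, so the running total of the daily gaps has a clear meaning: the total number of extra clicks so far.

```python
from itertools import accumulate

# Hypothetical daily clicks after the event, and the counterfactual prediction.
actual = [120, 95, 110, 130]
predicted = [100, 100, 100, 100]

# Pointwise effect: gap between actual and predicted on each day.
pointwise = [a - p for a, p in zip(actual, predicted)]

# Cumulative effect: running total of the pointwise gaps.
cumulative = list(accumulate(pointwise))

print(pointwise)   # [20, -5, 10, 30]
print(cumulative)  # [20, 15, 25, 55]
```

For a share price, adding up daily price gaps gives a number with no natural interpretation, so the pointwise gap or its average is the quantity to report.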
19. 19
Additional considerations
- It is important that covariates included in the model are not
themselves affected by the event. For each covariate included,
it is critical to reason why this is the case.
- The model can be validated by running the Causal Impact
analysis on an "imaginary event" before the actual event. We
should not see any significant effect, and the actual and
predicted lines should match reasonably closely before the actual
event.
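The validation idea in the second point can be sketched as follows: take data from before the real event only, pretend an event happened part-way through, and check that the gap between actual and predicted values stays close to zero. As a stand-in for the full model, this sketch uses a least squares fit on a single control series. The function name `placebo_gap` and all numbers are hypothetical.

```python
from statistics import mean


def placebo_gap(control: list[float], response: list[float],
                n_train: int) -> float:
    """Mean gap between actual and predicted values after an imaginary event.

    Both series must contain data from before the real event only.
    """
    x, y = control[:n_train], response[:n_train]
    x_bar, y_bar = mean(x), mean(y)
    slope = (sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
             / sum((a - x_bar) ** 2 for a in x))
    intercept = y_bar - slope * x_bar
    # Compare actual values with predictions after the imaginary event.
    gaps = [b - (intercept + slope * a)
            for a, b in zip(control[n_train:], response[n_train:])]
    return mean(gaps)


# Hypothetical pre-event data; the imaginary event is after observation five.
control = [50.0, 52.0, 55.0, 53.0, 58.0, 60.0, 61.0, 63.0]
response = [101.0, 105.0, 111.0, 107.0, 117.0, 121.5, 122.5, 127.0]
gap = placebo_gap(control, response, n_train=5)
print(f"mean gap after imaginary event: {gap:.2f}")
```

A gap far from zero here would suggest that the controls do not predict the response well enough, or that something else changed during the pre-event period. A full analysis would look at whether the credible interval for the effect contains zero, not only at the mean gap.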
20. 20
References
- K. H. Brodersen, F. Gallusser, J. Koehler, N. Remy, S. L. Scott (2015). Inferring Causal Impact Using Bayesian Structural Time-Series Models.
https://research.google.com/pubs/pub41854.html
- S. L. Scott, H. Varian (2013). Predicting the Present with Bayesian Structural Time Series.
https://people.ischool.berkeley.edu/~hal/Papers/2013/pred-present-with-bsts.pdf