This document provides an overview of using data analysis techniques like Causal Impact and Prophet to analyze SEO performance data. Causal Impact can be used to measure the impact of a single change on a metric like traffic or conversions by comparing actual data to a predicted counterfactual. Prophet allows removing noise from overall growth trends by incorporating additional factors like search volume or economic indicators. The document outlines examples, limitations, and resources to get started with these techniques.
7. My goal:
By the end of this,
you’ll be able to:
Causal Impact + TensorFlow Probability
E.g. measuring the effect of Algo update
Before & after test 🧪
Prophet with regressor
E.g. removing noise from growth factors
Remove noise from data 📊
9. Statistically significant =
very unlikely to have occurred
simply by chance
You can feel confident that it’s real, not
that you just got lucky or unlucky
12. Suitable for:
Any change made
on a single day
Did the Google Algo update impact
our traffic or position?
Algorithm update
Did updating the CTA improve our
conversion?
Changing CTA / design
Did optimisation help improve our
position?
On-page optimisation
31. Always check: p-value
p-value = the probability of seeing a result at least this
extreme if the change had no real effect. The closer it gets
to 0%, the less likely it is that the result is down to chance
p-value < 0.05 = it’s statistically significant
p-value ≥ 0.05 = it’s NOT (= could be just chance)
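To make "by chance" concrete, here is a minimal sketch of a permutation test on made-up before/after numbers. It only illustrates the p-value idea; it is not how Causal Impact computes its p-value, and it ignores trend and seasonality (real daily data are not freely shufflable). The function name and the sample values are hypothetical.

```python
import random
from statistics import mean


def permutation_p_value(before, after, n_permutations=5000, seed=0):
    """Share of random relabelings whose before/after gap is at least as big as the real one."""
    rng = random.Random(seed)
    observed = abs(mean(after) - mean(before))
    pooled = list(before) + list(after)
    n_before = len(before)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # pretend the change date carried no information
        gap = abs(mean(pooled[n_before:]) - mean(pooled[:n_before]))
        if gap >= observed:
            extreme += 1
    # +1 in both terms so the estimate is never exactly 0
    return (extreme + 1) / (n_permutations + 1)


# Hypothetical daily sessions before and after a change
before = [100, 102, 98, 101, 99, 103, 97, 100]
after = [90, 88, 91, 87, 89, 92, 86, 90]
p = permutation_p_value(before, after)
print(f"p-value = {p:.4f}", "(significant)" if p < 0.05 else "(NOT significant)")
```

If shuffling the labels almost never reproduces a gap this large, the p-value is small and the drop is unlikely to be luck.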
32. New users = Drop is not statistically significant 😇
43. Python package: tfcausalimpact
Language Python
Author WillianFuks (CausalImpact by Google +
TensorFlow by Google)
Ingredients Data + exact date of the change
Regressor Optional (as many as you want)
Used for Impact of a change made on a single day
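A minimal sketch of how the ingredients above might fit together, assuming tfcausalimpact is installed (pip install tfcausalimpact). The data is simulated and the helper names (make_example_data, split_periods, run_causal_impact) are hypothetical. The package import sits inside the function so the data preparation can be checked on its own.

```python
import numpy as np
import pandas as pd


def make_example_data(change_date="2023-03-01"):
    """Simulated daily data: 'y' is the metric, 'x' a regressor the change did not touch."""
    rng = np.random.default_rng(0)
    dates = pd.date_range("2023-01-01", periods=90, freq="D")
    x = 100 + rng.normal(0, 5, size=len(dates))
    y = 1.2 * x + rng.normal(0, 3, size=len(dates))
    y[dates >= pd.Timestamp(change_date)] -= 15  # simulated drop after the change
    # The metric goes in the first column, regressors after it
    return pd.DataFrame({"y": y, "x": x}, index=dates)


def split_periods(data, change_date):
    """Turn the exact date of the change into pre/post periods (YYYY-MM-DD strings)."""
    change = pd.Timestamp(change_date)
    fmt = "%Y-%m-%d"
    pre_period = [data.index.min().strftime(fmt),
                  (change - pd.Timedelta(days=1)).strftime(fmt)]
    post_period = [change.strftime(fmt), data.index.max().strftime(fmt)]
    return pre_period, post_period


def run_causal_impact(data, change_date):
    """Fit the model; this can be slow."""
    from causalimpact import CausalImpact  # provided by the tfcausalimpact package
    pre_period, post_period = split_periods(data, change_date)
    return CausalImpact(data, pre_period, post_period)
```

Usage would look like ci = run_causal_impact(make_example_data(), "2023-03-01") followed by print(ci.summary()), which reports the estimated effect together with the p-value.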
44. Troubleshooting 🧪🔥
Google spreadsheet
● Dates should be in ascending order, in YYYY-MM-DD
● Numbers should not have thousands separators (1,000 → 1000)
● Make sure there are no empty cells
● Useful formula for Google Trends: =if(len(B2),B2,C1)
Python
● When it fails - start from importing data again
● Check how data looks with data.head() or print(data)
● Check all data types are floats or integers with data.dtypes
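The same checks can also be done on the Python side. Below is a minimal sketch with pandas, assuming a small export with hypothetical column names (date, sessions, trend_index). The forward fill does roughly what the spreadsheet formula above does for empty Google Trends cells.

```python
import io

import pandas as pd

# Hypothetical export: dates out of order, thousands separators, one empty cell
raw = io.StringIO(
    'date,sessions,trend_index\n'
    '2023-01-03,"1,250",80\n'
    '2023-01-01,"1,000",75\n'
    '2023-01-02,"1,100",\n'
)

data = pd.read_csv(raw, dtype=str)  # read everything as text first

# Dates: parse as YYYY-MM-DD and sort ascending
data["date"] = pd.to_datetime(data["date"], format="%Y-%m-%d")
data = data.sort_values("date").set_index("date")

# Numbers: strip thousands separators, then convert to floats
for col in data.columns:
    cleaned = data[col].str.replace(",", "", regex=False)
    data[col] = pd.to_numeric(cleaned).astype(float)

# Empty cells: carry the previous value forward
data = data.ffill()

print(data.head())
print(data.dtypes)
```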
48. Suitable for:
Overall trend
Is the traffic growing? Is it due to
search demand?
Growth (+ relationship
with price changes, etc)
Is there steady growth as we publish
more blog articles?
Growth in blogs
49. Our US blog is growing 🚀
Are we actually growing???
65. Python package: Prophet
Language Python
Author Facebook
Ingredients Data (12+ months)
Regressor Optional (as many as you want)
Used for Forecast / time series trend
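A minimal sketch of Prophet with one regressor, assuming the prophet package is installed. The weekly data is simulated and the names make_frame, fit_with_regressor and search_volume are hypothetical. Prophet expects the date column to be called ds and the metric y. The import sits inside the function so the data preparation can be checked on its own.

```python
import numpy as np
import pandas as pd


def make_frame(n_weeks=104):
    """Simulated weekly traffic: steady growth plus a search-demand effect plus noise."""
    rng = np.random.default_rng(1)
    weeks = np.arange(n_weeks)
    dates = pd.date_range("2021-01-03", periods=n_weeks, freq="W")
    search_volume = 50 + 10 * np.sin(weeks * 2 * np.pi / 52) + rng.normal(0, 2, n_weeks)
    y = 200 + 1.5 * weeks + 4 * search_volume + rng.normal(0, 5, n_weeks)
    return pd.DataFrame({"ds": dates, "y": y, "search_volume": search_volume})


def fit_with_regressor(df, regressor="search_volume"):
    """Fit Prophet with one extra regressor and return the model and fitted components."""
    from prophet import Prophet
    model = Prophet()
    model.add_regressor(regressor)  # must be added before fit()
    model.fit(df)
    forecast = model.predict(df)  # in-sample: the regressor column is already in df
    return model, forecast
```

After model, forecast = fit_with_regressor(make_frame()), the trend column of forecast should show growth with the search-demand effect separated out, and model.plot_components(forecast) draws each component.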
66. Troubleshooting 🧪🔥
Google spreadsheet
● You can have missing data - but preferably not
● Do not name a column ‘trend’ - it’s reserved
● (Everything I said for Causal Impact)
Python
● (Everything I said for Causal Impact)
71. How to spot multicollinearity 🔎
Common sense
✅ No coding needed
❌ Inaccurate
VIF
(Variance inflation factors)
✅ Accurate
❌ Coding needed
VIF colab notebook + testing method in the slide
72. VIF - Remove any variables with VIF > 1.5
(1.5 is VERY conservative; the cut-off can be 2-5)
73. Python package: VIF from Statsmodels
Language Python
Author Open source
Data Data
74. How to run VIF in 3 easy steps 💥
1. Google sheet with all the data (variables)
2. Open Colab Notebook and change here
3. Click buttons
76. My goal:
And now, you should
be able to:
Causal Impact + TensorFlow Probability
E.g. measuring the effect of Algo update
Before & after test 🧪
Prophet with regressor
E.g. removing noise from growth factors
Remove noise from data 📊
✅
✅
77. Cheatsheet
Causal
impact
● Changed template
● Changed CTA
● Google Algorithm update
● Fee change of product
● Russian war on Ukraine
Prophet ● Traffic & Google trend relationship
● Overall trend with seasonality
● Effect of inflation
78. Limitations + cautions
Causal
impact
● Highly dependent on the data points
(provide as much data as possible; consider time lag)
● Not good for multiple changes happening at
once / changes happening over time
● Very slow to run
● Small numbers (e.g. CvR) - changes are hard to detect
Prophet ● Sensitive to seasonality - provide
multiple seasons of data
● Automatically tries to ‘fit the model’ - so we
cannot tell it to ‘prioritise’ one regressor
over another
● Weak on outliers / large-impact events
81. ‘Sometimes the simplest tools are pure and
effective. You only use a complex technique if
there is no simpler way.
It is the principles of analysis (the logic, the
conclusions) that are the most powerful.’