Explainable Machine Learning for Ranking Factors
@OnCrawl
#brightonseo
Vincent Terrasi
Product Director
@OnCrawl
+20 years experience in SEO
Data Science coach
Speaking today
OnCrawl SEO Crawler
OnCrawl Log Analyzer
OnCrawl Data³
OnCrawl Labs
For more information, visit www.oncrawl.com
Flight plan
- Which ranking factors are important?
- How to interpret "black box" machine learning models?
- How does a ranking factor contribute to ranking for individual URLs?
- What's next for interpreting machine learning?
https://www.uptimiser.com.hk/
Machine Learning Model
Input: URL
Ready-To-Use Model
Output: TOP 10 or NOT TOP 10
Accuracy: 92%
Error rate: 8%
Ranking Factors
Accuracy: 92%
Error rate: 8%
With Response Time: error rate 8%
Without Response Time: error rate 20%
Variable Importance: 20% - 8% = 12 percentage points
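The subtraction above can be sketched in a few lines of Python. The labels and predictions are made up for illustration (25 hypothetical URLs, chosen so the error rates match the slide). In practice both prediction lists would come from training the model once with and once without the Response Time feature.

```python
def error_rate(y_true, y_pred):
    """Share of URLs whose predicted class differs from the real one."""
    wrong = sum(1 for t, p in zip(y_true, y_pred) if t != p)
    return wrong / len(y_true)

# Hypothetical ground truth for 25 URLs: 1 = TOP 10, 0 = NOT TOP 10.
y_true = [1] * 10 + [0] * 15

# Hypothetical predictions from the model trained WITH Response Time (2 mistakes).
pred_with = list(y_true)
for i in (0, 24):
    pred_with[i] = 1 - pred_with[i]

# Hypothetical predictions from the model trained WITHOUT it (5 mistakes).
pred_without = list(y_true)
for i in (0, 3, 12, 18, 24):
    pred_without[i] = 1 - pred_without[i]

err_with = error_rate(y_true, pred_with)        # 2 / 25
err_without = error_rate(y_true, pred_without)  # 5 / 25

# Importance = how much the error rate rises once the variable is removed,
# expressed in percentage points.
importance_points = round((err_without - err_with) * 100, 1)
print(f"with: {err_with:.0%}, without: {err_without:.0%}, "
      f"importance: {importance_points} points")
```

Repeating the same comparison for each variable in turn gives a ranking of factors by how much the model depends on them.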
Explain one variable: Basic Method
(e.g.: Citation Flow)
Input: URL with different CF values, scored by the Ready-To-Use Model
HIGH INFLUENCE: URL with CF=90, URL with CF=80, …, URL with CF=50
LOW INFLUENCE: URL with CF<20
Accuracy: 92%
Error rate: 8%
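A minimal sketch of this basic method: keep one URL's other features fixed, change only Citation Flow, and watch how the prediction moves. The `predict_top10` function is a hypothetical stand-in for the trained model, and its weights, feature names and the sample URL are invented for illustration.

```python
import math

def predict_top10(features):
    """Hypothetical stand-in for the trained model: returns P(TOP 10).

    The weights are invented for illustration only.
    """
    score = (0.05 * features["citation_flow"]
             + 0.002 * features["word_count"]
             - 1.5 * features["response_time_s"]
             - 2.0)
    return 1 / (1 + math.exp(-score))

# One URL's feature values; only Citation Flow will be changed.
url = {"citation_flow": 35, "word_count": 800, "response_time_s": 0.6}

cf_values = [10, 20, 50, 80, 90]
sweep = []
for cf in cf_values:
    variant = dict(url, citation_flow=cf)   # copy of the URL with a different CF
    sweep.append((cf, predict_top10(variant)))

for cf, proba in sweep:
    print(f"CF={cf:>2}  P(TOP 10)={proba:.2f}")

# Influence of CF on this URL: spread between the highest and lowest prediction.
probas = [p for _, p in sweep]
cf_influence = max(probas) - min(probas)
```

A large spread suggests the variable has high influence on this URL; a flat curve suggests low influence.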
Black box model: Can you explain per URL?
Interpretable or accurate: choose one
Simple model: interpretable
Complex model: accurate
Can you interpret complex models?
Explainability with Shapley: Nobel Prize in 2012
Prediction for URL 1: 60%
Threshold: 40%
Excellent Load Time: +10%
WordCount > 600: +5%
TrustFlow > 70: +5%
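The arithmetic behind this example can be written out as a short sketch. It uses the numbers from the slide and assumes that the 40% threshold is the starting point to which each factor's contribution is added.

```python
# Numbers taken from the example: a 40% starting point plus three contributions.
# Treating the 40% threshold as the base value is an assumption.
base_value = 0.40
contributions = {
    "Excellent Load Time": 0.10,
    "WordCount > 600": 0.05,
    "TrustFlow > 70": 0.05,
}

# Shapley explanations are additive: base value + contributions = prediction.
prediction = base_value + sum(contributions.values())

# Rank the factors by how much they move this URL's prediction.
ranked = sorted(contributions.items(), key=lambda kv: abs(kv[1]), reverse=True)

print(f"Prediction for URL 1: {prediction:.0%}")
for name, value in ranked:
    print(f"  {name}: {value:+.0%}")
```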
It is mind-blowing to explain a prediction as a cooperative game played by the feature values.
Shapley: Prediction as a game played by feature values
Players: Word Count, Response Time, Inlinks, Citation Flow, Trust Flow
https://clearcode.cc/blog/game-theory-attribution/
Shapley:
Simulation with 3 features = 2^3 cases
https://clearcode.cc/blog/game-theory-attribution/
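To make the 2^3 cases concrete, here is a minimal sketch that computes exact Shapley values for three features by enumerating every coalition. The value table `v` is hypothetical: it stands for the model's output when only the listed features are known, and all of its numbers are invented for illustration.

```python
from itertools import combinations
from math import factorial

features = ["load_time", "word_count", "trust_flow"]

# Hypothetical value function: model output when only the listed features
# are "known". All numbers are invented for illustration.
v = {
    frozenset(): 0.40,
    frozenset({"load_time"}): 0.50,
    frozenset({"word_count"}): 0.44,
    frozenset({"trust_flow"}): 0.44,
    frozenset({"load_time", "word_count"}): 0.56,
    frozenset({"load_time", "trust_flow"}): 0.56,
    frozenset({"word_count", "trust_flow"}): 0.47,
    frozenset(features): 0.60,
}
assert len(v) == 2 ** len(features)  # 2^3 = 8 cases

def shapley_values(features, v):
    """Exact Shapley values: weighted average of each feature's marginal
    contribution over every coalition of the other features."""
    n = len(features)
    phi = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for size in range(n):
            # Weight of a coalition of this size in the Shapley formula.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for coalition in combinations(others, size):
                s = frozenset(coalition)
                total += weight * (v[s | {f}] - v[s])
        phi[f] = total
    return phi

phi = shapley_values(features, v)
for name, value in phi.items():
    print(f"{name}: {value:+.3f}")
```

The contributions always add up to the gap between the full prediction and the empty-coalition baseline (here 0.60 - 0.40), which is what lets a prediction be split fairly among its features.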
Python Library: SHAP
https://github.com/slundberg/shap
Disadvantages
● A lot of computing time (2^k cases for k features).
● Explanations created with the Shapley value method always use all the features.
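One common way to limit the computing time is to average marginal contributions over a sample of random feature orderings instead of enumerating every coalition. This approximation is not described in the talk; the sketch below is an illustration only, and the `value` function with its effects is hypothetical.

```python
import random

def value(coalition):
    """Hypothetical model output for a coalition of known features."""
    # Invented additive effects plus one interaction, for illustration only.
    effects = {"load_time": 0.10, "word_count": 0.05, "trust_flow": 0.05,
               "inlinks": 0.03, "citation_flow": 0.02}
    out = 0.40 + sum(effects[f] for f in coalition)
    if "inlinks" in coalition and "citation_flow" in coalition:
        out += 0.02
    return out

def sampled_shapley(features, value, n_samples=200, seed=0):
    """Estimate Shapley values from n_samples random feature orderings."""
    rng = random.Random(seed)
    phi = {f: 0.0 for f in features}
    for _ in range(n_samples):
        order = features[:]
        rng.shuffle(order)
        known = set()
        previous = value(known)
        for f in order:
            # Marginal contribution of f when added after the features before it.
            known.add(f)
            current = value(known)
            phi[f] += current - previous
            previous = current
    return {f: total / n_samples for f, total in phi.items()}

features = ["load_time", "word_count", "trust_flow", "inlinks", "citation_flow"]
estimates = sampled_shapley(features, value)
print(f"exact method: {2 ** len(features)} coalitions; sampled: 200 orderings")
for name, est in estimates.items():
    print(f"{name}: {est:+.3f}")
```

With only five features the exact method is still cheap; the saving matters when k is large, because the number of orderings sampled stays fixed while 2^k keeps doubling.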
New way to boost your SEO with Data Science
xAI Concept
Source: DARPA
Using Google Colab, we’ll open ‘à la carte’ algorithms to address strategic SEO issues based on Python and R.
First R&D projects available!
Key Takeaways:
● You can use ML to find your own ranking factors
● Shapley values are a good method to better understand a model
● Shapley libraries are available for Python and R, or directly in Dataiku
● Currently, xAI is the best framework to level up your SEO
www.oncrawl.com
Enjoy a 1-month free trial
http://cgntv.co/f
Thank you!
Any questions?
@vincentterrasi