SearchLove London 2017 | Will Critchlow | Seeing the Future: How to Tell the Impact of a Change Before You Make it

15,901 views

Published on

It is often hard and expensive to make major changes to your website and many businesses demand forecasts, predictions, and business cases to prioritise them. Will is going to present tools and approaches for figuring out whether a change is worthwhile before you make it - including ways of thinking about on-page, content quality, usage data impacts, and what happens when you change your internal linking structure.

Published in: Marketing

SearchLove London 2017 | Will Critchlow | Seeing the Future: How to Tell the Impact of a Change Before You Make it

  1. SearchLove London 2017. Let’s do better: we need to have a bigger impact. By Will Critchlow - @willcritchlow
  2. The last 18 months of split-testing has shown me that FAR too often...
  3. ...common recommendations make no difference
  4. or can even be disastrous...
  5. ...you’ll never believe what led to this decline: -27% in two weeks
  6. “Target these pages at the ways that people search.”
  7. WHAT?
  8. And even when we recommend the right kind of thing, we suck at the details. Pretty gloomy. Want to come on a journey to do better?
  9. Let’s do fewer pointless things
  10. Let’s screw things up less often
  11. And let’s make some really EFFECTIVE recommendations. I think it’s a fairly straightforward pitch
  12. Ranking factors on a 2x2. Axes: Control vs. Influence, and “we have the same data as Google” vs. “Google has data we don’t have”. Factors plotted: keyword targeting, external links, internal links, usage data, website “quality”
  13. On a 2x2 like any good consultant
  14. There are areas where Google has data we don’t have
  15. While in others, we have the same information they do
  16. We can only influence these factors
  17. While these, we fully control
  18. [2x2] Keyword targeting and internal links added to the grid
  19. [2x2] Website “quality” and usage data added to the grid
  20. [2x2] External links: out of scope today
  21. [2x2] External links: the less direct control you have over a factor, the harder testing and modelling becomes
  22. [2x2] Keyword targeting: Test & Model
  23. [2x2] Usage data and website “quality”: Survey & Study
  24. [2x2] Internal links: Analyse better
  25. 1. Data we are missing - survey and study
  26. [2x2, repeated] Usage data and website “quality”: Survey & Study
  27. 1. Data we have (or can get) only for our own site. Like usage data - see, for example, this post by @SimoAhava explaining how to capture bounce rate back to the SERP. Also interesting: Rand’s video about a possible organic quality score.
  28. 2. Cases where the real ranking factor is a machine-learned proxy for the real thing, e.g. ● Content quality (Panda ML) ● Link quality ○ Ignored links (ML on disavow). Want to measure: QUALITY. Actually measure: ML PROXY FOR QUALITY
  29. For usage data: it is impossible to guess what people prefer. See whichtestwon
  30. So tools like SERP Turkey can be useful (by our very own @TomAnthonySEO)
  31. When it comes to “quality”: ● How do you define it? ● How do you communicate it to clients / bosses? ● How do you benchmark it against competitors? ● How do you figure out if a change improves it?
  32.-35. Google’s approach to quality, built up over four slides:
    Gather human rater information: Google employs thousands of human quality raters to answer questionnaires about many kinds of website
    Train ML models: Google uses the human questionnaires as training data for ML models of “quality”
    2011, release Panda: the Panda quality algorithm starts being used as a batch process modifying the regular core algorithm
    2016, make Panda real-time: “quality” becomes a first-class ranking factor in the core algorithm
  36. Back in 2011, I was suggesting we run our own Panda-like quality surveys (WBF here, instructions here)
  37. Probably the only thing that’s really changed since then is that you should run it mobile-first now. Hat-tip Tom Capper
  38. More executives are aware of quality as a ranking factor these days. Since Panda went real-time, quality issues don’t necessarily cause obvious drops correlated with algorithm history dates
  39.-49. Survey results, built up one row per slide (Client site 1 / Client site 2 / Key competitor):
    Would you trust information from this website? 72% / 64% / 81%
    Is this website written by experts? 50% / 46% / 65%
    Would you give this site your credit card details? 29% / 21% / 43%
    Are there any noticeable errors on this page? 6% / 4% / 1%
    Does this page provide original content or info? 76% / 72% / 85%
    Would you recognize this site as an authority? 44% / 33% / 58%
    Does this website contain insightful analysis? 72% / 62% / 81%
    Would you consider bookmarking pages on this site? 44% / 38% / 56%
    Are there excessive adverts on this website? 2% / 2% / 8%
    Could pages from this site appear in print? 54% / 54% / 59%
  50. We also asked for free-text feedback and found some surprising priorities from non-SEOs
  51. “The reviews seem fake” - Trust is a huge deal for real-world users
  52. “There's not enough information about the company and why I should use their products” - On a micro-site that doesn’t have an “about” page
  53. “In this day and age every page that has anything at all to do with business should be https” - Security is a big deal in B2B, even without on-site purchases
  54. “The pictures were of low quality and blurry” - We know this matters to users. It’s at the easier end of ML detection
  55.-58. Benefits of running surveys (built up over four slides): real site vs. screenshot; real site vs. staging; your site vs. competitor; competitor vs. tweaked competitor
  59. 2. Factors we need to analyse better
  60. [2x2, repeated] Internal links: Analyse better
  61. “Improve your information architecture by linking more to your product pages.”
  62. Not wrong exactly, but certainly incomplete
  63. Can you figure out: Will we do better if we make this change? How much better could it be? Which of the many ways of doing it is best?
  64. Let’s look at the state of the art: use interactive visualisations to find issues; calculate internal PageRank. Follow Paul Shapiro and Patrick Stox for more
  65. You’ve probably all seen crawl graphs. They are distorted by starting at one page and only showing some paths. Good explainer at sitebulb.com, and Ian Lurie reports some good results from colouring by indexation
  66. Full link graphs are more complete, but I find them hard to interpret
  67. Use static visualisations for: communicating and convincing
  68. they are generally not good for: discovery and diagnosis
  69. Though sometimes you’ll find something interesting, like this entirely-duplicated site. Credit: Paul Shapiro
  70. “Everything looks like a graph but almost nothing should ever be drawn as one” - I found this quote in this interesting presentation
  71. Interactive visualisations in Gephi are more useful for discovery and diagnosis
  72. Link
  73. Internal PageRank is a powerful idea. But by starting from “all pages are equal” we get some odd results, like the contact page being more powerful than the homepage
  74. There are case studies of people seeing real results from radical changes to internal link structure. See Alex’s fascinating MozCon talk [PDF]
  75. but real-world changes are hard to make, hard to undo, and could cause lasting damage. And even worse from my perspective, it’s hard to split-test when the expected changes are everywhere on the site
  76. So our state of the art still has gaps: How much difference will a proposed fix make? Which proposed change is a better idea?
  77. It’s important because our intuition is really bad. Essentially what we want to do is figure out the best link structure for distributing external authority around our site
  78. I mentioned PageRank (PR) before without really explaining it
  79. It’s the algorithm Google developed to measure webpages’ authority based on links
  80. Many people can talk about the random surfer model. For this talk, I’m going to group it with updates like reasonable surfer
  81. Fewer are comfortable with the eigenvector of the stochastic adjacency matrix
  82. But most intuition is based on “flow” of PR - and that’s not really how the algorithm works
  83. I suspect most people’s intuition about PageRank is wrong, so I did some unscientific surveying. See the survey
  84. Let me explain: imagine a typical site
  85. With some external links in to some pages
  86. Now imagine you add a new page, linked only from the homepage
  87. And linking to the same N pages as the homepage
  88. How does its PageRank compare?
  89. I suspect most people’s intuition about PageRank is wrong, so I did some unscientific surveying. See the survey
  90. Over 1 in 5 people got even the simple question wrong. And to be honest, depending what “significantly” means, even the 19% might not be too wrong. But it does hint at single-iteration thinking. We’re all really bad at figuring out the convergence of iterative algorithms.
  91. Now, let’s step it up a notch
  92. You’re on “Who Wants to Be a Millionaire”, you ask the audience, and it comes back like this:
  93. Still sure you’re right?
  94. It’s actually quite sensitive to some assumptions, but almost 3 in 5 people are definitely wrong [chart annotation: NOPE]
  95. I wasn’t 100% sure, but my modelling matched my intuition [chart annotations: NOPE; Right answer]
  96. Though there are some weird site setups where you can find this happens (e.g. no external links at all) [chart annotations: NOPE; Right answer; Possible edge case]
  97. Either way, it was only ~2% of the new page’s PR on Distilled.net [chart annotation: NOPE]
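The thought experiment from slides 84-88 can be checked with a small power-iteration sketch. This is not the deck's own code: the page names and numbers are hypothetical, and the plain-Python loop simply mirrors what a PageRank library computes, so you can see the converged (not single-iteration) result.

```python
def pagerank(links, personalization, alpha=0.85, iters=200):
    """Personalised PageRank by power iteration.
    links: page -> list of pages it links to.
    personalization: where the (1 - alpha) teleport mass lands - here,
    a stand-in for external authority.
    Assumes every page has at least one outgoing link (true of this toy site)."""
    pr = dict(personalization)
    for _ in range(iters):
        nxt = {p: (1 - alpha) * personalization[p] for p in links}
        for page, out in links.items():
            for dest in out:
                nxt[dest] += alpha * pr[page] / len(out)
        pr = nxt
    return pr

# Toy site: homepage links to three category pages, which link back.
# All external links (the personalization mass) point at the homepage.
site = {
    "home": ["cat1", "cat2", "cat3"],
    "cat1": ["home"], "cat2": ["home"], "cat3": ["home"],
}
external = {"home": 1.0, "cat1": 0.0, "cat2": 0.0, "cat3": 0.0}
before = pagerank(site, external)

# Add the new page: linked only from the homepage, and linking out to the
# same pages the homepage links to.
site2 = dict(site, home=["cat1", "cat2", "cat3", "new"], new=["cat1", "cat2", "cat3"])
external2 = dict(external, new=0.0)
after = pagerank(site2, external2)

# The new page converges to much less PageRank than the homepage, and less
# than the category pages it shares outlinks with - the survey's right answer.
```

Running the iteration to convergence is the point: a single-iteration guess (the trap slide 90 describes) gives a different, wrong ordering.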
  98. This is important because it means too many recommendations are based on bad intuition about how PageRank works. None of us have an intuitive sense of random surfer or eigenvectors
  99. There are always trade-offs, but we can’t compare them easily. It’s rare for one approach to strictly dominate another
  100. So let’s try to come up with a better approach
  101. What I really want to do is run PageRank across the whole web graph
  102. Then make changes to my site’s linking structure, and re-run PageRank on the whole web
  103. We can approximate this with a modified form of internal PageRank
  104. 1. Crawl x levels deep & export internal links [site diagram: homepage → categories → subcategories → facets → products]
  105. 2. Gather raw external authority (raw mozRank from the Moz API) [same site diagram]
  106. 3. Normalise the authority data [the column of raw mR values shown normalised on the next slide]
  107. 3. Normalise the authority data. mR raw → normalised:
    3.67E-13 → 1.0%; 3.35E-11 → 94.2%; 1.71E-13 → 0.5%; 1.64E-13 → 0.5%; 1.59E-13 → 0.4%; 3.28E-13 → 0.9%; 6.88E-14 → 0.2%; 2.45E-13 → 0.7%; 7.12E-14 → 0.2%; 3.12E-13 → 0.9%; 1.67E-13 → 0.5%
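Step 3 is a one-liner in practice: divide each page's raw mozRank by the total so the values form a probability distribution. A sketch using the raw numbers from this slide - the URL keys are hypothetical stand-ins for the crawled pages:

```python
# Normalise raw mozRank into the probability distribution that will become
# PageRank's personalization vector. The mR values are from the slide; the
# URL keys are made up.
raw_mr = {
    "/page-a": 3.67e-13, "/": 3.35e-11, "/page-b": 1.71e-13,
    "/page-c": 1.64e-13, "/page-d": 1.59e-13, "/page-e": 3.28e-13,
    "/page-f": 6.88e-14, "/page-g": 2.45e-13, "/page-h": 7.12e-14,
    "/page-i": 3.12e-13, "/page-j": 1.67e-13,
}
total = sum(raw_mr.values())
personalization = {url: mr / total for url, mr in raw_mr.items()}

# The outlier (3.35e-11, two orders of magnitude above the rest) ends up
# with ~94.2% of the mass, matching the normalised column on the slide.
```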
  108. 4. Use NetworkX or similar to run PR. See NetworkX
  109. 5. Set personalization to the mR probabilities, and set alpha, the damping parameter, lower than the usual 0.85
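Steps 4 and 5 together come down to a couple of lines with NetworkX. A sketch with a made-up four-page graph and made-up authority values - the real inputs would be your crawl's edge list and the normalised mR distribution from step 3:

```python
import networkx as nx

# Hypothetical crawl output: (source, destination) internal links.
edges = [("/", "/cat1"), ("/", "/cat2"),
         ("/cat1", "/prod1"), ("/cat2", "/prod1"), ("/prod1", "/")]

# Hypothetical normalised external authority (step 3) - must sum to 1.
authority = {"/": 0.6, "/cat1": 0.15, "/cat2": 0.15, "/prod1": 0.1}

G = nx.DiGraph()
G.add_edges_from(edges)

# personalization biases the teleport step towards externally-linked pages;
# an alpha below the default 0.85 makes that external signal count for more.
pr = nx.pagerank(G, alpha=0.7, personalization=authority)
```

Lowering alpha is a modelling choice, not a NetworkX requirement: it shifts weight from the internal link structure towards the external-authority vector.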
  110. Future enhancements:
    ● Handle nofollow correctly (see Matt Cutts’ old PageRank sculpting post)
    ● Handle redirects and rel canonical sensibly
    ● Include top mR pages (or all pages with mR?) - even if not in the crawl
      ○ Use them as a seed and crawl from these pages
    ● Weight links by type to get closer to the reasonable surfer model
      ○ This is the weight parameter in NetworkX
      ○ Use actual click data for your own site to approximate an actual surfer!
  111. Then we propose a change and see if the treatment works. Step 1 is figuring out how to capture your proposed changes to the internal link structure of your site
  112. You can add or remove small numbers of links by changing the crawl output in a spreadsheet. Example rows, all with the same source:
    Source: https://www.distilled.net/resources/beginners-guide-to-traffic-drop-analysis/
    Destinations: https://www.distilled.net/, https://www.distilled.net/services/, https://www.distilled.net/events/, https://www.distilled.net/resources/, https://www.distilled.net/resources/features/, https://www.distilled.net/u/, https://www.distilled.net/resources/videos/, https://www.distilled.net/about/, https://www.distilled.net/jobs/
  113. It’s easy to make sitewide additions to the navigation as you build the graph: site.add_edges_from([(edge['Source'], 'https://www.distilled.net/events/searchlove-london/')])
  114. Much harder to remove from global navigation, because it’s not the same as removing every link
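To make the sitewide-addition idea concrete, here's a sketch with hypothetical crawl edges (the nav target echoes the slide's snippet), plus a note on why removal is the hard direction:

```python
import networkx as nx

# Hypothetical crawl edges (source page -> destination page).
crawl_edges = [("/", "/blog/"), ("/blog/", "/post-1/"), ("/post-1/", "/")]

site = nx.DiGraph()
site.add_edges_from(crawl_edges)

# Sitewide addition: while building the graph, add an edge from every
# crawled source page to the new navigation target.
nav_target = "/events/searchlove-london/"
site.add_edges_from((src, nav_target) for src in {s for s, _ in crawl_edges})

# Removal is harder: site.remove_edge(page, nav_target) drops the single
# graph edge, but a page may link to that URL from both the navigation and
# the body copy, and a plain edge list cannot tell those apart.
```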
  115. For more complex changes, we can use our ODN
  116. Then crawl the preview environment
  117. Then crawl the preview environment. Subtleties:
    ● Crawl live and preview to x levels deep
    ● Combine into a superset of pages discovered on each crawl
    ● Crawl both again from the list
    Because we are comparing relative weights (normalised PR), we need the same set of pages
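Once live and preview have been crawled over the same superset of pages, comparing the two PageRank runs is just a dictionary diff. A sketch with invented numbers:

```python
# Hypothetical normalised PageRank from the live crawl and the preview
# crawl, computed over the same page set so relative weights are comparable.
pr_live    = {"/": 0.40, "/products/": 0.10, "/blog/": 0.20, "/about/": 0.30}
pr_preview = {"/": 0.38, "/products/": 0.16, "/blog/": 0.18, "/about/": 0.28}

delta = {url: pr_preview[url] - pr_live[url] for url in pr_live}

# Because both distributions sum to 1, the deltas sum to zero: a proposed
# change can only move authority around the site, not create it.
winners = sorted(delta, key=delta.get, reverse=True)
```

Sorting the deltas gives the winners and losers of the proposed change, which is usually the output you'd present.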
  118. Generally we will care about the impact on groups of pages: label them by URL, in the crawl, or using modularity
  119. Might it be possible to come up with a single metric that captures “internal link graph quality”? I’ve been wondering about equality metrics like Gini coefficients. Come back next year to see if I’ve made progress on this!
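As one possible starting point for the Gini idea: the coefficient of a normalised PageRank distribution is easy to compute, with 0 meaning authority is spread perfectly evenly and values approaching 1 meaning it is piled onto a few pages. Whether this is actually a useful "internal link graph quality" metric is exactly the open question on the slide; the numbers below are made up.

```python
def gini(values):
    # Gini coefficient of a distribution: 0 = perfectly equal,
    # (n - 1) / n = everything concentrated on one item.
    xs = sorted(values)
    n = len(xs)
    weighted = sum(i * x for i, x in enumerate(xs, start=1))
    return 2 * weighted / (n * sum(xs)) - (n + 1) / n

# Hypothetical normalised internal PageRank for two link structures:
even = [0.25, 0.25, 0.25, 0.25]    # authority evenly spread
skewed = [0.85, 0.05, 0.05, 0.05]  # almost everything on one page
```

A proposed change that lowers the Gini spreads authority more evenly; whether that is good depends on which pages you want the authority to reach.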
  120. Until then: compare your proposed changes to find the best solution to your issue. For example, find the change that best flows authority to under-indexed product pages.
  121. So I think I’ve presented two key new ideas in this section:
  122. 1. A quantitative way of assessing your internal link setup, by incorporating external authority into internal PR calculations
  123. 2. A way of comparing different proposed changes, by working with the data rather than just with visualisations
  124. And remember, we need this because you need to make bold changes. Small tweaks don’t even move the PageRank needle
  125. Summary
  126. 1. Start gathering qualitative data. For your site, for proposed changes, for competitors. About quality and about usage.
  127. 2. Use more powerful quantitative data. For things like internal linking analysis and recommendations. See my newly-published blog post for the technical details
  128. Let’s stop wasting time with ineffective recommendations, or damaging sites with bad ones
  129. and start making a real difference
  130. Thank you for coming to SearchLove
  131. If you’re interested in the counter-intuitive results I presented at the beginning, check out odn.distilled.net. We’ll be happy to demo for you. We’re serving ~5 billion requests per quarter and recently published everything from response times to our +£100k / month split test.
  132. @willcritchlow
  133. Image credits: Da Vinci helicopter, Niels Bohr, Scream, Statue of Liberty, Complexity, Head in hands, Rorschach Test, State of the art, Axe, Surfer, Clouds / Clouds with sun, Wrong way, Accountant glasses, St Paul’s cathedral, Cactus, Lego heads, Anonymous, Padlock, Blur, Doctor, Boardroom, Repetition, Balance, Smash, Stars, Table football, Equality, Quality, Panda, Leaves
