
SearchLove London 2017 | Will Critchlow | Seeing the Future: How to Tell the Impact of a Change Before You Make it

It is often hard and expensive to make major changes to your website and many businesses demand forecasts, predictions, and business cases to prioritise them. Will is going to present tools and approaches for figuring out whether a change is worthwhile before you make it - including ways of thinking about on-page, content quality, usage data impacts, and what happens when you change your internal linking structure.

Published in: Marketing

  1. 1. SearchLove London 2017: Let’s do better. We need to have a bigger impact. By Will Critchlow (@willcritchlow)
  2. 2. The last 18 months of split-testing have shown me that FAR too often...
  3. 3. ...common recommendations make no difference
  4. 4. or can even be disastrous...
  5. 5. ...you’ll never believe what led to this decline -27% in two weeks
  6. 6. “Target these pages at the ways that people search.”
  7. 7. WHAT?
  8. 8. And even when we recommend the right kind of thing, we suck at the details. Pretty gloomy. Want to come on a journey to do better?
  9. 9. Let’s do fewer pointless things
  10. 10. Let’s screw things up less often
  11. 11. And let’s make some really EFFECTIVE recommendations. I think it’s a fairly straightforward pitch.
  12. 12. Ranking factors on a 2x2. Axes: Control vs. Influence, and “We have the same data as Google” vs. “Google has data we don’t have”. Factors plotted: keyword targeting, external links, internal links, usage data, website “quality”.
  13. 13. On a 2x2 like any good consultant
  14. 14. There are areas where Google has data we don’t have
  15. 15. While in others, we have the same information they do
  16. 16. We can only influence these factors
  17. 17. While these, we fully control
  18. 18. On the 2x2: Keyword targeting, Internal links
  19. 19. On the 2x2: Website “quality”, Usage data
  20. 20. On the 2x2: External links (out of scope today)
  21. 21. On the 2x2: External links. The less direct control you have over a factor, the harder testing and modelling becomes.
  22. 22. On the 2x2: Keyword targeting. Approach: Test & Model
  23. 23. On the 2x2: Usage data, Website “quality”. Approach: Survey & Study
  24. 24. On the 2x2: Internal links. Approach: Analyse better
  25. 25. 1. Data we are missing - survey and study
  26. 26. On the 2x2: Usage data, Website “quality”. Approach: Survey & Study
  27. 27. 1. Data we have (or can get) only for our own site, like usage data: see, for example, this post by @SimoAhava explaining how to capture bounce rate back to the SERP. Also interesting: Rand’s video about a possible organic quality score.
  28. 28. 2. Cases where the real ranking factor is a machine-learned proxy for the real thing, e.g. ● Content quality (Panda ML) ● Link quality ○ Ignored links (ML on disavow). Want to measure: QUALITY. Actually measure: ML PROXY FOR QUALITY.
  29. 29. For usage data: it is impossible to guess what people prefer. See whichtestwon.
  30. 30. So tools like SERP Turkey can be useful (by our very own @TomAnthonySEO)
  31. 31. When it comes to “quality” ● How do you define it? ● How do you communicate it to clients / bosses? ● How do you benchmark it against competitors? ● How do you figure out if a change improves it?
  32. 32. Gather human rater information: Google employs thousands of human quality raters to answer questionnaires about many kinds of website
  33. 33. Train ML models: Google uses the human questionnaires as training data for ML models of “quality”
  34. 34. 2011, release Panda: the Panda quality algorithm starts being used as a batch process modifying the regular core algorithm
  35. 35. 2016, make Panda real-time: “quality” becomes a first-class ranking factor in the core algorithm
  36. 36. Back in 2011, I was suggesting we run our own Panda-like quality surveys (WBF here, instructions here)
  37. 37. Probably the only thing that’s really changed since then is that you should run it mobile-first now. Hat-tip: Tom Capper
  38. 38. More executives are aware of quality as a ranking factor these days. Since Panda went real-time, quality issues don’t necessarily cause obvious drops correlated with algorithm history dates.
  39. 39. Survey results (slides 39 to 49 build this table up one row at a time):

        Question | Client site 1 | Client site 2 | Key competitor
        Would you trust information from this website? | 72% | 64% | 81%
        Is this website written by experts? | 50% | 46% | 65%
        Would you give this site your credit card details? | 29% | 21% | 43%
        Are there any noticeable errors on this page? | 6% | 4% | 1%
        Does this page provide original content or info? | 76% | 72% | 85%
        Would you recognize this site as an authority? | 44% | 33% | 58%
        Does this website contain insightful analysis? | 72% | 62% | 81%
        Would you consider bookmarking pages on this site? | 44% | 38% | 56%
        Are there excessive adverts on this website? | 2% | 2% | 8%
        Could pages from this site appear in print? | 54% | 54% | 59%
  50. 50. We also asked for free-text feedback and found some surprising priorities from non-SEOs
  51. 51. “The reviews seem fake” Trust is a huge deal for real-world users
  52. 52. “There's not enough information about the company and why I should use their products” On a micro-site that doesn’t have an “about” page
  53. 53. “In this day and age every page that has anything at all to do with business should be https” Security is a big deal in B2B - even without on-site purchases
  54. 54. “The pictures were of low quality and blurry” We know this matters to users. It’s at the easier end of ML detection
  55. 55. Benefits of running surveys: Real site vs. Screenshot
  56. 56. Benefits of running surveys: Real site vs. Screenshot; Real site vs. Staging
  57. 57. Benefits of running surveys: Real site vs. Screenshot; Real site vs. Staging; Your site vs. Competitor
  58. 58. Benefits of running surveys: Real site vs. Screenshot; Real site vs. Staging; Your site vs. Competitor; Competitor vs. Tweaked competitor
  59. 59. 2. Factors we need to analyse better
  60. 60. On the 2x2: Internal links. Approach: Analyse better
  61. 61. “Improve your information architecture by linking more to your product pages.”
  62. 62. Not wrong exactly, but certainly incomplete
  63. 63. Can you figure out: Will we do better if we make this change? How much better could it be? Which of the many ways of doing it is best?
  64. 64. Let’s look at the state of the art: use interactive visualisations to find issues, and calculate internal PageRank. Follow Paul Shapiro and Patrick Stox for more.
  65. 65. You’ve probably all seen crawl graphs. They are distorted by starting at one page and only showing some paths. Good explainer at sitebulb.com, and Ian Lurie reports some good results from colouring by indexation.
  66. 66. Full link graphs are more complete, but I find them hard to interpret
  67. 67. Use static visualisations for: Communicating and Convincing
  68. 68. they are generally not good for Discovery and Diagnosis
  69. 69. Though sometimes you’ll find something interesting like this entirely-duplicated site Credit: Paul Shapiro
  70. 70. “Everything looks like a graph but almost nothing should ever be drawn as one” I found this quote in this interesting presentation
  71. 71. Interactive visualisations in Gephi are more useful for discovery and diagnosis
  72. 72. Link
  73. 73. Internal PageRank is a powerful idea. But by starting from “all pages are equal” we get some odd results, like the contact page being more powerful than the homepage.
  74. 74. There are case studies of people seeing real results from radical changes to internal link structure. See Alex’s fascinating Mozcon talk [PDF]
  75. 75. But real-world changes are hard to make, hard to undo, and could cause lasting damage. Even worse, from my perspective, it’s hard to split-test when the expected changes are everywhere on the site.
  76. 76. So our state of the art still has gaps: How much difference will a proposed fix make? Which proposed change is a better idea?
  77. 77. It’s important because our intuition is really bad. Essentially what we want to do is figure out the best link structure for distributing external authority around our site
  78. 78. I mentioned PageRank (PR) before without really explaining it
  79. 79. It’s the algorithm Google developed to measure webpages’ authority based on links
  80. 80. Many people can talk about the random surfer model. For this talk, I’m going to group it with updates like reasonable surfer.
  81. 81. Fewer are comfortable with the eigenvector of the stochastic adjacency matrix
  82. 82. But most intuition is based on “flow” of PR - and that’s not really how the algorithm works
  83. 83. I suspect most people’s intuition about PageRank is wrong, so I did some unscientific surveying. See the survey
  84. 84. Let me explain: Imagine a typical site
  85. 85. With some external links in to some pages
  86. 86. Now imagine you add a new page, linked only from the homepage
  87. 87. And linking to the same N pages as the homepage
  88. 88. How does its PageRank compare? PageRank? PageRank?
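One way to check intuition on a question like this is to model it rather than guess. The sketch below is a toy version only: the site shape, the page names and `N = 5` are hypothetical, it has no external links (so teleportation is uniform), and the talk itself notes that the answer is sensitive to such assumptions.

```python
import networkx as nx

N = 5  # hypothetical number of pages linked from the homepage
pages = [f"page_{i}" for i in range(N)]

# Toy "typical site": the homepage links to N pages, each linking back.
site = nx.DiGraph()
for page in pages:
    site.add_edge("home", page)
    site.add_edge(page, "home")

before = nx.pagerank(site, alpha=0.85)

# Add a new page linked only from the homepage,
# linking to the same N pages as the homepage.
site.add_edge("home", "new")
for page in pages:
    site.add_edge("new", page)

after = nx.pagerank(site, alpha=0.85)

print(f"homepage before: {before['home']:.3f}")
print(f"homepage after:  {after['home']:.3f}")
print(f"new page:        {after['new']:.3f}")
```

Because PageRank is the converged result of an iterative calculation, running the numbers like this avoids the single-iteration thinking that a mental "flow" model encourages.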
  89. 89. I suspect most people’s intuition about PageRank is wrong, so I did some unscientific surveying. See the survey
  90. 90. Over 1 in 5 people got even the simple question wrong. And to be honest, depending what “significantly” means, even the 19% might not be too wrong. But it does hint at single-iteration thinking. We’re all really bad at figuring out the convergence of iterative algorithms.
  91. 91. Now, let’s step it up a notch
  92. 92. You’re on “who wants to be a millionaire”, you ask the audience, and it comes back like this:
  93. 93. Still sure you’re right?
  94. 94. It’s actually quite sensitive to some assumptions, but almost 3 in 5 people are definitely wrong (chart annotation: NOPE)
  95. 95. I wasn’t 100% sure, but my modelling matched my intuition (chart annotations: NOPE, Right answer)
  96. 96. Though there are some weird site setups where you can find this happens, e.g. no external links at all (chart annotations: NOPE, Right answer, Possible edge case)
  97. 97. Either way, it was only ~2% of the new page’s PR on Distilled.net (chart annotation: NOPE)
  98. 98. This is important because it means too many recommendations are based on bad intuition about how PageRank works. None of us have an intuitive sense of random surfer or eigenvectors.
  99. 99. There are always trade-offs, but we can’t compare them easily. It’s rare for one approach strictly to dominate another.
  100. 100. So let’s try to come up with a better approach
  101. 101. What I really want to do is run PageRank across the whole web graph
  102. 102. Then make changes to my site’s linking structure, and re-run PageRank on the whole web
  103. 103. We can approximate this with a modified form of internal PageRank
  104. 104. 1. Crawl x levels deep & export internal links (diagram: site hierarchy of Homepage, Category, Subcategory, Facet and Product pages)
  105. 105. 2. Gather raw external authority (raw mozrank from the moz API) (diagram: the same site hierarchy)
  106. 106. 3. Normalise the authority data. mR raw: 3.67E-13, 3.35E-11, 1.71E-13, 1.64E-13, 1.59E-13, 3.28E-13, 6.88E-14, 2.45E-13, 7.12E-14, 3.12E-13, 1.67E-13
  107. 107. 3. Normalise the authority data

        mR raw | mR raw normalised
        3.67E-13 | 1.0%
        3.35E-11 | 94.2%
        1.71E-13 | 0.5%
        1.64E-13 | 0.5%
        1.59E-13 | 0.4%
        3.28E-13 | 0.9%
        6.88E-14 | 0.2%
        2.45E-13 | 0.7%
        7.12E-14 | 0.2%
        3.12E-13 | 0.9%
        1.67E-13 | 0.5%
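The normalisation step is just dividing each raw value by the total so that the values sum to 1. A minimal sketch using the raw mR figures from the slide follows; the `page_NN` labels are hypothetical placeholders, since the slide does not say which URL each value belongs to.

```python
# Raw mozRank values copied from the slide, in slide order.
raw_values = [3.67e-13, 3.35e-11, 1.71e-13, 1.64e-13, 1.59e-13, 3.28e-13,
              6.88e-14, 2.45e-13, 7.12e-14, 3.12e-13, 1.67e-13]

# Hypothetical page labels: page_01, page_02, ...
raw_mr = {f"page_{i:02d}": value for i, value in enumerate(raw_values, start=1)}


def normalise(raw):
    """Scale raw authority scores so that they sum to 1."""
    total = sum(raw.values())
    if total == 0:
        raise ValueError("no external authority to normalise")
    return {page: value / total for page, value in raw.items()}


normalised_mr = normalise(raw_mr)
for page, share in normalised_mr.items():
    print(f"{page}: {share:.1%}")
```

The result is a probability distribution over pages, which is the form a personalization vector needs.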
  108. 108. 4. Use NetworkX or similar to run PR (see NetworkX)
  109. 109. 5. Set personalization to mR probabilities. Set alpha to damping parameter (normally 0.85, we want lower)
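Steps 4 and 5 might look something like the sketch below. The toy crawl, the authority shares and the choice of `alpha=0.5` are all assumptions for illustration: the slide only says the damping parameter should be lower than the usual 0.85, not what value to use.

```python
import networkx as nx

# Hypothetical toy crawl: (source, destination) internal links.
edges = [
    ("home", "category"), ("home", "contact"),
    ("category", "product"), ("category", "home"), ("category", "contact"),
    ("product", "home"), ("product", "contact"),
    ("contact", "home"),
]
site = nx.DiGraph()
site.add_edges_from(edges)

# Hypothetical normalised external authority (sums to 1).
authority = {"home": 0.90, "category": 0.05, "product": 0.05, "contact": 0.0}

# Plain internal PageRank: every page is an equally likely starting point.
uniform_pr = nx.pagerank(site, alpha=0.85)

# Modified internal PageRank: the surfer restarts on pages in proportion to
# their external authority, and restarts more often (lower alpha, assumed).
weighted_pr = nx.pagerank(site, alpha=0.5, personalization=authority)

for page in site.nodes:
    print(f"{page}: uniform={uniform_pr[page]:.3f} weighted={weighted_pr[page]:.3f}")
```

In NetworkX, `personalization` replaces the uniform teleport distribution, so pages with external links act as the entry points for authority.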
  110. 110. Future enhancements
        ● Handle nofollow correctly (see Matt Cutts’ old PageRank sculpting post)
        ● Handle redirects and rel canonical sensibly
        ● Include top mR pages (or all pages with mR?) - even if not in the crawl
          ○ Use as a seed and crawl from these pages
        ● Weight links by type to get closer to reasonable surfer model
          ○ This is the weight parameter in NetworkX
          ○ Use actual click-data for your own site to approximate an actual surfer!
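As a rough illustration of the link-weighting idea, edge weights can be passed to NetworkX through the `weight` parameter of `pagerank`. The link types and the numbers in `LINK_WEIGHTS` are hypothetical; click data would be a better source for real weights.

```python
import networkx as nx

# Hypothetical weights per link type (not from the talk).
LINK_WEIGHTS = {"content": 3.0, "nav": 1.0, "footer": 0.5}

# Hypothetical links: (source, destination, link type).
links = [
    ("home", "category", "nav"),
    ("home", "about", "footer"),
    ("home", "guide", "content"),
    ("category", "home", "nav"),
    ("about", "home", "nav"),
    ("guide", "home", "nav"),
]

site = nx.DiGraph()
for source, destination, link_type in links:
    site.add_edge(source, destination, weight=LINK_WEIGHTS[link_type])

# weight="weight" uses the edge attribute; weight=None treats all links equally.
weighted = nx.pagerank(site, alpha=0.85, weight="weight")
unweighted = nx.pagerank(site, alpha=0.85, weight=None)

for page in site.nodes:
    print(f"{page}: unweighted={unweighted[page]:.3f} weighted={weighted[page]:.3f}")
```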
  111. 111. Then we propose a change and see if the treatment works. Step 1 is figuring out how to capture your proposed changes to the internal link structure of your site.
  112. 112. You can add or remove small numbers of links by changing the crawl output in a spreadsheet

        Source | Destination
        https://www.distilled.net/resources/beginners-guide-to-traffic-drop-analysis/ | https://www.distilled.net/
        https://www.distilled.net/resources/beginners-guide-to-traffic-drop-analysis/ | https://www.distilled.net/services/
        https://www.distilled.net/resources/beginners-guide-to-traffic-drop-analysis/ | https://www.distilled.net/events/
        https://www.distilled.net/resources/beginners-guide-to-traffic-drop-analysis/ | https://www.distilled.net/resources/
        https://www.distilled.net/resources/beginners-guide-to-traffic-drop-analysis/ | https://www.distilled.net/resources/
        https://www.distilled.net/resources/beginners-guide-to-traffic-drop-analysis/ | https://www.distilled.net/resources/features/
        https://www.distilled.net/resources/beginners-guide-to-traffic-drop-analysis/ | https://www.distilled.net/u/
        https://www.distilled.net/resources/beginners-guide-to-traffic-drop-analysis/ | https://www.distilled.net/resources/videos/
        https://www.distilled.net/resources/beginners-guide-to-traffic-drop-analysis/ | https://www.distilled.net/about/
        https://www.distilled.net/resources/beginners-guide-to-traffic-drop-analysis/ | https://www.distilled.net/jobs/
  113. 113. It’s easy to make sitewide additions to the navigation as you build the graph: site.add_edges_from([(edge['Source'], 'https://www.distilled.net/events/searchlove-london/')])
  114. 114. Much harder to remove from global navigation, because it’s not the same as removing every link. (Code shown on the slide: site.add_edges_from([(edge['Source'], 'https://www.distilled.net/events/searchlove-london/')]))
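The one-line snippet on these slides sits inside a loop over the crawl export. A minimal sketch of that loop, assuming the export is a list of rows with `Source` and `Destination` keys; the three rows in `crawl_rows` are made up for illustration.

```python
import networkx as nx

# Hypothetical crawl export rows using the Source / Destination columns.
crawl_rows = [
    {"Source": "https://www.distilled.net/",
     "Destination": "https://www.distilled.net/services/"},
    {"Source": "https://www.distilled.net/services/",
     "Destination": "https://www.distilled.net/"},
    {"Source": "https://www.distilled.net/resources/",
     "Destination": "https://www.distilled.net/"},
]

NEW_NAV_TARGET = "https://www.distilled.net/events/searchlove-london/"

site = nx.DiGraph()
for edge in crawl_rows:
    # The link as it exists in the crawl.
    site.add_edge(edge["Source"], edge["Destination"])
    # The proposed sitewide navigation link from every crawled source page.
    site.add_edges_from([(edge["Source"], NEW_NAV_TARGET)])

print(site.number_of_nodes(), "pages,", site.number_of_edges(), "links")
```

Pages that only ever appear as a `Destination` get no navigation link in this sketch, which is one reason to crawl deep enough that every page of interest also appears as a source.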
  115. 115. For more complex changes, we can use our ODN
  116. 116. Then crawl the preview environment
  117. 117. Then crawl the preview environment. Subtleties: ● Crawl live and preview to x levels deep ● Combine into a superset of pages discovered on each crawl ● Crawl both again from the list. Because we are comparing relative weights (normalised PR), we need the same set of pages.
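The "same set of pages" requirement can be sketched on the graph side as follows. This covers only the comparison, not the re-crawl itself; the edge lists and the `build_graph` helper are hypothetical.

```python
import networkx as nx


def build_graph(edges, all_pages):
    """Build a link graph that contains every page in the superset."""
    graph = nx.DiGraph()
    graph.add_nodes_from(all_pages)  # pages missing from this crawl still get a node
    graph.add_edges_from(edges)
    return graph


# Hypothetical crawls: the preview environment adds a new page "c".
live_edges = [("home", "a"), ("a", "home"), ("home", "b"), ("b", "home")]
preview_edges = live_edges + [("home", "c"), ("c", "home")]

# Superset of pages discovered on either crawl.
all_pages = sorted({page for edge in live_edges + preview_edges for page in edge})

live_pr = nx.pagerank(build_graph(live_edges, all_pages), alpha=0.85)
preview_pr = nx.pagerank(build_graph(preview_edges, all_pages), alpha=0.85)

# Both results are normalised over the same pages, so differences are comparable.
change = {page: preview_pr[page] - live_pr[page] for page in all_pages}
for page, delta in change.items():
    print(f"{page}: {delta:+.3f}")
```

Since each run sums to 1 over an identical page set, the per-page differences sum to zero: any gain for one page is a loss somewhere else.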
  118. 118. Generally we will care about the impact on groups of pages: Label them by URL / in the crawl / using modularity
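Labelling by URL is the simplest of the three options. A minimal sketch that groups pages by their first path segment and sums PageRank per group; the `group_label` rule and the example scores are assumptions, not figures from the talk.

```python
from collections import defaultdict
from urllib.parse import urlparse


def group_label(url):
    """Label a page by its first path segment, e.g. /resources/videos/ -> resources."""
    segments = [segment for segment in urlparse(url).path.split("/") if segment]
    return segments[0] if segments else "homepage"


def pagerank_by_group(pagerank):
    """Sum per-page PageRank into per-group totals."""
    totals = defaultdict(float)
    for url, score in pagerank.items():
        totals[group_label(url)] += score
    return dict(totals)


# Hypothetical PageRank scores for illustration only.
example_pr = {
    "https://www.distilled.net/": 0.40,
    "https://www.distilled.net/resources/": 0.25,
    "https://www.distilled.net/resources/videos/": 0.15,
    "https://www.distilled.net/events/": 0.20,
}

grouped = pagerank_by_group(example_pr)
for group, total in sorted(grouped.items()):
    print(f"{group}: {total:.2f}")
```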
  119. 119. Might it be possible to come up with a single metric that captures “internal link graph quality”? I’ve been wondering about equality metrics like Gini coefficients. Come back next year to see if I’ve made progress on this!
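For anyone who wants to experiment with the same idea, a Gini coefficient over per-page PageRank is easy to compute. This is only a sketch of the standard formula; whether it is a useful measure of link graph quality is exactly the open question, and the example scores are hypothetical.

```python
def gini(values):
    """Gini coefficient: 0 means perfectly equal; values near 1 mean concentrated."""
    ordered = sorted(values)
    n = len(ordered)
    total = sum(ordered)
    if n == 0 or total == 0:
        raise ValueError("need at least one non-zero value")
    # G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n, with i = 1..n ascending.
    weighted_sum = sum(rank * value for rank, value in enumerate(ordered, start=1))
    return 2 * weighted_sum / (n * total) - (n + 1) / n


# Hypothetical per-page PageRank scores.
example_pr = {"home": 0.55, "category": 0.25, "product_a": 0.10, "product_b": 0.10}
print(f"Gini of PageRank distribution: {gini(example_pr.values()):.3f}")
```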
  120. 120. Until then: compare your proposed changes to find the best solution to your issue. For example, find the change that best flows authority to under-indexed product pages.
  121. 121. So I think I’ve presented two key new ideas in this section:
  122. 122. 1. A quantitative way of assessing your internal link setup by incorporating external authority into internal PR calculations
  123. 123. 2. A way of comparing different proposed changes by working with the data rather than just with visualisations
  124. 124. And remember, we need this because you need to make bold changes. Small tweaks don’t even move the PageRank needle.
  125. 125. Summary
  126. 126. 1. Start gathering qualitative data: for your site, for proposed changes, for competitors. About quality and about usage.
  127. 127. 2. Use more powerful quantitative data, for things like internal linking analysis and recommendations. See my newly-published blog post for the technical details.
  128. 128. Let’s stop wasting time with ineffective recommendations, or damaging sites with bad ones
  129. 129. and start making a real difference
  130. 130. Thank you for coming to SearchLove
  131. 131. If you’re interested in the counter-intuitive results I presented at the beginning, check out odn.distilled.net. We’ll be happy to demo for you. We’re serving ~5 billion requests per quarter and recently published everything from response times to our +£100k / month split test.
  132. 132. @willcritchlow
  133. 133. Image credits: ● Da Vinci helicopter ● Niels Bohr ● Scream ● Statue of Liberty ● Complexity ● Head in hands ● Rorschach Test ● State of the art ● Axe ● Surfer ● Clouds / Clouds with sun ● Wrong way ● Accountant glasses ● St Paul’s cathedral ● Cactus ● Lego heads ● Anonymous ● Padlock ● Blur ● Doctor ● Boardroom ● Repetition ● Balance ● Smash ● Stars ● Table football ● Equality ● Quality ● Panda ● Leaves
