SearchLove London 2019 - Will Critchlow - Misunderstood Concepts at the Heart of SEO

3,426 views

The basics of SEO are technical accessibility, relevance, quality, and authority. Or: can it be crawled, does it meet a keyword need, and is it trustworthy? In each of these areas, we need to build on solid foundational understanding, and find the areas where advanced understanding will give us an edge. Will’s recent research has shown common gaps in understanding, and highlighted interesting advanced topics. In this wide-ranging session, he guarantees you’ll learn something, and you’ll come away with training guidelines for the basics.

Published in: Design

  1. 1. SearchLove London 2019 Will Critchlow (@willcritchlow), CEO, Distilled
  2. 2. The design flaw that almost wiped out a NYC skyscraper [New Yorker]
  3. 3. Shout out to @DianeHartley who was the student, who didn’t realise until years later that her call had made any difference
  4. 4. Some purely theoretical
  5. 5. Some useful and practical Some purely theoretical
  6. 6. Twitter thread reference
  7. 7. User-agent: * Disallow: /mydir/
  8. 8. /mydir/
  9. 9. User-agent: googlebot Disallow: /mydir/
  10. 10. /mydir/
  11. 11. User-agent: * Disallow: /mydir/ User-agent: googlebot Disallow: /secret/
  12. 12. /mydir/
  13. 13. User-agent: * Disallow: /mydir/ User-agent: googlebot Disallow: /secret/
  14. 14. WRONG
  15. 15. /mydir/
  16. 16. User-agent: * Disallow: /mydir/ User-agent: googlebot Disallow: /secret/
  17. 17. User-agent: * Disallow: /mydir/ User-agent: googlebot Disallow: /secret/
  18. 18. User-agent: * Disallow: /a/ User-agent: googlebot Disallow: /b/
  19. 19. Adsbot-Google
  20. 20. User-agent: * Disallow: /a/ User-agent: googlebot Disallow: /b/
  21. 21. User-agent: * Disallow: /a/ User-agent: googlebot Disallow: /b/
  22. 22. not
  23. 23. User-agent: * Disallow: /a/ User-agent: googlebot Disallow: /b/
  24. 24. User-agent: * Disallow: /a/ User-agent: googlebot Disallow: /b/
  25. 25. *
  26. 26. both
  27. 27. User-agent: Adsbot-Google Disallow: /a/ Disallow: /b/
  28. 28. googlebot/1.2 googlebot* googlebot Source: https://developers.google.com/search/reference/robots_txt
  29. 29. User-agent: googlebot/1.2
  30. 30. THIS IS WRONG
  31. 31. Luckily, Google released their parser with source: “The library is slightly modified (i.e. some internal headers and equivalent symbols) production code used by Googlebot”
  32. 32. /*static*/ absl::string_view RobotsMatcher::ExtractUserAgent( absl::string_view user_agent) { // Allowed characters in user-agent are [a-zA-Z_-]. const char* end = user_agent.data(); while (absl::ascii_isalpha(*end) || *end == '-' || *end == '_') { ++end; } return user_agent.substr(0, end - user_agent.data()); } Source: open source robots.txt parser
  33. 33. /*static*/ absl::string_view RobotsMatcher::ExtractUserAgent( absl::string_view user_agent) { // Allowed characters in user-agent are [a-zA-Z_-]. const char* end = user_agent.data(); while (absl::ascii_isalpha(*end) || *end == '-' || *end == '_') { ++end; } return user_agent.substr(0, end - user_agent.data()); } Source: open source robots.txt parser
  34. 34. // Allowed characters in user-agent are [a-zA-Z_-]. Source: open source robots.txt parser
  35. 35. User-agent: googlebot/1.2
  36. 36. User-agent: googlebot/three
  37. 37. User-agent: googlebot-1.2
  38. 38. User-agent: googlebo
  39. 39. User-agent: google
  40. 40. User-agent: googlebotthethird
  41. 41. Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; Googlebot/2.1; +http://www.google.com/bot.html) Chrome/W.X.Y.Z Safari/537.36 Source: updating the user agent of Googlebot
  42. 42. Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; Googlebot/2.1; +http://www.google.com/bot.html) Chrome/W.X.Y.Z Safari/537.36 Source: updating the user agent of Googlebot The “token”
  43. 43. https://support.google.com/webmasters/answer/6062596?hl=en
  44. 44. https://support.google.com/webmasters/answer/1061943?hl=en
  45. 45. https://support.google.com/webmasters/answer/1061943?hl=en
  46. 46. ./robots robots.txt googlebot-image /c/ Source: open source robots.txt parser
  47. 47. ./robots robots.txt googlebot-image /c/ Source: open source robots.txt parser
  48. 48. ./robots robots.txt googlebot-image /c/ Source: open source robots.txt parser
  49. 49. ./robots robots.txt googlebot-image /c/ Source: open source robots.txt parser
  50. 50. ./robots robots.txt googlebot-image /c/ user-agent 'googlebot-image' with URI '/c/': ALLOWED Source: open source robots.txt parser
  51. 51. ./robots robots.txt googlebot-image /c/ user-agent 'googlebot-image' with URI '/c/': ALLOWED Source: open source robots.txt parser
  52. 52. ./robots robots.txt googlebot-image /c/ user-agent 'googlebot-image' with URI '/c/': ALLOWED Source: open source robots.txt parser
  53. 53. Docs Online tool Open source
  54. 54. Docs Online tool Open source Googlebot/1.2
  55. 55. Docs Online tool Open source Rule ordering
  56. 56. Docs Online tool Open source Other gbots
  57. 57. Docs Online tool Open source Other gbots
  58. 58. Docs Online tool Open source
  59. 59. Official Google announcement
  60. 60. Gary Illyes at Pubcon
  61. 61. Docs Online tool Open source
  62. 62. === User-Agent: googlebot Disallow: /foo.png === Target: /foo.png Bot: Googlebot-Image
  63. 63. === User-Agent: googlebot Disallow: /foo.png === Target: /foo.png Bot: Googlebot-Image Results: Documentation: SHOULD BE BLOCKED
  64. 64. === User-Agent: googlebot Disallow: /foo.png === Target: /foo.png Bot: Googlebot-Image Results: Documentation: SHOULD BE BLOCKED Online Tool: BLOCKED
  65. 65. === User-Agent: googlebot Disallow: /foo.png === Target: /foo.png Bot: Googlebot-Image Results: Documentation: SHOULD BE BLOCKED Online Tool: BLOCKED Open Source Parser: ALLOWED
  66. 66. === User-Agent: googlebot Disallow: /foo.png === Target: /foo.png Bot: Googlebot-Image Results: Documentation: SHOULD BE BLOCKED Online Tool: BLOCKED Open Source Parser: ALLOWED Actual Googlebot: BLOCKED
  67. 67. Docs Online tool Open source Actual gbot: treatment of googlebot-image, googlebot-news, etc.
  68. 68. Twitter threads: reference 1 and reference 2
  69. 69. 19 50
  70. 70. 100 19 50
  71. 71. @PeterSokolowski via Lisa Schneider from Merriam Webster at SearchLove
  72. 72. OK. But. 1. We appropriated this language
  73. 73. THE TAIL IS LITERALLY LONG
  74. 74. And 2. It has a meaning in the context of a long tail strategy
  75. 75. A plan to have huge numbers of pages indexed with differentiated content and enough authority to rank widely
  76. 76. Low volume Lots of words Uncompetitive
  77. 77. Targeting uncompetitive queries can be a strategy for smaller businesses. Low volume Lots of words Uncompetitive
  78. 78. x%
  79. 79. WRONG
  80. 80. Thanks to @KaneJamison for the reference link Collection frequency (cf) Document frequency (df) try 10,422 insurance
  81. 81. Thanks to @KaneJamison for the reference link Collection frequency (cf) Document frequency (df) try 10,422 insurance 10,440
  82. 82. Thanks to @KaneJamison for the reference link Collection frequency (cf) Document frequency (df) try 10,422 8,760 insurance 10,440
  83. 83. Thanks to @KaneJamison for the reference link. Collection frequency (cf) / document frequency (df): try 10,422 / 8,760; insurance 10,440 / 3,997
  84. 84. Thanks to @KaneJamison for the reference link. Inverse collection frequency (icf) / inverse document frequency (idf): try 0.000096 / 0.000114; insurance 0.000096 / 0.000250
  85. 85. Thanks to @KaneJamison for the reference link. Inverse collection frequency (icf) / inverse document frequency (idf): try 0.000096 / 0.000114; insurance 0.000096 / 0.000250
  86. 86. Twitter thread reference
  87. 87. y
  88. 88. 0
  89. 89. WRONG
  90. 90. 0.25 0.25 0.25 0.25
  91. 91. 0.25 0.25 0.25 0.25 0.15 / # pages
  92. 92. y
  93. 93. y
  94. 94. Twitter thread reference - more detail in @tomanthony’s presentation
  95. 95. google.com
  96. 96. google.com googleusercontent.com
  97. 97. RIGHT
  98. 98. httpOnly
  99. 99. WRONG
  100. 100. distilled.net
  101. 101. <script src="https://example.com/widget.js"> </script>
  102. 102. 🍪🍪🍪
  103. 103. WRONG
  104. 104. Twitter search: [#SEOquiz from:willcritchlow] Or go here to find the thread that pulls them all together.
  105. 105. ● Citicorp Center ● Spider web ● Ice cream ● Lightning ● Robot ● Gate ● St. John’s ● Who wants to be a millionaire ● Fire ● Long tail ● Lisa Schneider ● Confused ● Haystack ● Graph ● Cookies ● London
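
The robots.txt quiz in slides 7 to 27 turns on which single group of rules a crawler obeys. The sketch below is a deliberately simplified model of that idea, not Google's implementation: a crawler uses the group that names it, otherwise it falls back to `*`. It also assumes, per Google's crawler documentation and the apparent point of slides 19 to 27, that AdsBot crawlers ignore the `*` group. The dictionary layout, the function names, and the `bingbot` example agent are hypothetical. `Allow` rules, wildcards, and the fallback from `googlebot-image` to `googlebot` covered in slides 62 to 66 are not modelled.

```python
def select_group(groups, agent):
    """Return the Disallow prefixes of the one group that applies to `agent`.

    `groups` maps a User-agent name to a list of Disallow path prefixes.
    Simplified model: an exact (case-insensitive) name match wins,
    otherwise the crawler falls back to the "*" group.
    """
    agent = agent.lower()
    named = {name.lower(): rules for name, rules in groups.items()}
    if agent in named:
        return named[agent]
    # Assumption: AdsBot crawlers ignore the "*" group and must be named.
    if agent.startswith("adsbot"):
        return []
    return named.get("*", [])


def is_blocked(groups, agent, path):
    """True if `path` starts with any Disallow prefix in the selected group."""
    return any(path.startswith(prefix) for prefix in select_group(groups, agent))


# The robots.txt files from slides 11, 18 and 27, written as dictionaries.
ROBOTS_SLIDE_11 = {"*": ["/mydir/"], "googlebot": ["/secret/"]}
ROBOTS_SLIDE_18 = {"*": ["/a/"], "googlebot": ["/b/"]}
ROBOTS_SLIDE_27 = {"Adsbot-Google": ["/a/", "/b/"]}

print(is_blocked(ROBOTS_SLIDE_11, "googlebot", "/mydir/"))      # False
print(is_blocked(ROBOTS_SLIDE_11, "bingbot", "/mydir/"))        # True
print(is_blocked(ROBOTS_SLIDE_18, "adsbot-google", "/a/"))      # False
print(is_blocked(ROBOTS_SLIDE_27, "adsbot-google", "/b/"))      # True
```

Under this model the `googlebot` group replaces the `*` group for Googlebot rather than adding to it, which is why `/mydir/` stays crawlable in the slide 11 file.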

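The `ExtractUserAgent` function quoted in slides 32 to 34 keeps only the leading run of characters from `[a-zA-Z_-]`. As a rough illustration, here is a Python port of that one function, applied to the `User-agent` values from slides 35 to 40. The Python function name is an assumption, and the sketch covers only the extraction step, not how the parser then compares the result against a crawler's token.

```python
def extract_user_agent(user_agent: str) -> str:
    """Return the leading run of ASCII letters, '-' and '_' in `user_agent`.

    Mirrors the loop in the quoted C++: stop at the first character
    that is not in [a-zA-Z_-] and return everything before it.
    """
    end = 0
    while end < len(user_agent):
        ch = user_agent[end]
        is_ascii_letter = ch.isascii() and ch.isalpha()
        if not (is_ascii_letter or ch in "-_"):
            break
        end += 1
    return user_agent[:end]


# User-agent values taken from slides 35 to 40.
for value in [
    "googlebot/1.2",
    "googlebot/three",
    "googlebot-1.2",
    "googlebo",
    "google",
    "googlebotthethird",
]:
    print(f"{value!r} -> {extract_user_agent(value)!r}")
```

The first two values reduce to `googlebot` because `/` ends the run. `googlebot-1.2` reduces to `googlebot-`, since the hyphen is an allowed character but the digit is not.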
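
The inverse frequencies in slides 84 and 85 match the plain reciprocals of the counts in slides 80 to 83, as this short check shows. The counts are copied from the slides. The function name and data layout are hypothetical. Many textbook definitions of idf instead use a logarithm of the total number of documents divided by df, which the slides do not show, so that form is left out here.

```python
# Counts copied from slides 80 to 83.
COUNTS = {
    "try": {"cf": 10_422, "df": 8_760},
    "insurance": {"cf": 10_440, "df": 3_997},
}


def inverse_frequencies(counts):
    """Return the plain reciprocals 1/cf and 1/df for each term."""
    return {
        term: {"icf": 1 / c["cf"], "idf": 1 / c["df"]}
        for term, c in counts.items()
    }


for term, inv in inverse_frequencies(COUNTS).items():
    print(f"{term}: icf={inv['icf']:.6f} idf={inv['idf']:.6f}")
# try: icf=0.000096 idf=0.000114
# insurance: icf=0.000096 idf=0.000250
```

The two words occur almost equally often in the collection, so their icf values are nearly identical. "insurance" appears in fewer than half as many documents as "try", so its idf is more than twice as large.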