4 types of indexing problems - Case studies and solutions

In his talk at Digital Elite Day 2020, Bartosz Góralewicz covers 4 distinct types of indexing problems that no website is immune to:
- URL indexing,
- Mobile-first related partial indexing,
- JavaScript partial indexing,
- Layout-based partial indexing.

Published in: Internet

4 types of indexing problems - Case studies and solutions

  1. 1. Bartosz Góralewicz: indexing problems. Examples & solutions! @bart_goralewicz
  2. 2. Helping Fortune 500 companies rank better and get more traffic. Bartosz Góralewicz, @bart_goralewicz, www.onely.com. We're deeply specialized: Technical SEO, JavaScript SEO, Rendering SEO (!), Indexing Issues (!), Web Performance. The link to this deck is on the last slide.
  3. 3. On Google = Indexed
  4. 4. Percentage of URLs NOT indexed: 30%, 76%, 15%, 81%, 14%, 21%, 38%, 71%, 98%
  5. 5. How do we know it?
  6. 6. The Google Indexing Forecast (one.ly/tgif). Indexed pages with JavaScript content not indexed after 2 weeks: avg. 35% not indexed. Rendering in 2019: 5 seconds at median.
  7. 7. Our findings – 4th will shock you. (1) There are huge brands out there barely in Google’s index. (2) Indexing HTML is not as easy as we assumed. (3) Indexing trends fluctuate during Google updates. (4) You can get kicked out of Google’s index.
  8. 8. Every kind of indexing problem* comes from different origins and requires different solutions: URL indexing problems; mobile-first related indexing problems; JavaScript-related indexing problems; layout-based indexing problems. *Sorted by level of complexity, ascending.
  9. 9. How indexing works: Discovery -> Queue -> Crawl -> Rendering -> Index selection -> Indexing -> Ranking
  10. 10. How indexing works: Discovery -> Queue -> Crawl -> Rendering -> Index selection -> Indexing -> Ranking. Partial indexing issue = URL not indexed AFTER it was crawled*. *Please don’t start a Twitter war after this slide.
  11. 11. Why is indexing going to be more and more of a problem?
  12. 12. Google’s challenge (timeline: 2010, 2012, 2014, 2020)
  13. 13. Source: https://twitter.com/methode/status/1261259179983081473
  14. 14. 1-minute crash course: index selection
  15. 15. Index selection for dummies. Limit: 100 people. Rendering, links, efficient crawling, content, indexing strategy. Source: patent “Method and apparatus for managing a backlog of pending URL crawls” (US8676783B1).
  16. 16. Indexing trends - JS. Percentage of indexed JavaScript content in May, with the Google May 2020 core update marked. one.ly/tgif
  17. 17. Indexing trends – URLs. Percentage of indexed HTML content in May, with the Google May 2020 core update marked. one.ly/tgif
  18. 18. Index selection + limited resources = new challenges
  19. 19. Every kind of indexing problem* comes from different origins and requires different solutions: URL indexing problems; mobile-first related indexing problems; JavaScript-related indexing problems; layout-based indexing problems. *Sorted by level of complexity, ascending.
  20. 20. Let’s start easy with a little warm-up
  21. 21. URL indexing - example. one.ly/alba-shoes
  22. 22. URL indexing - example. one.ly/alba-shoes
  23. 23. Problem with the site: command: false negatives
  24. 24. Site: command, new challenges. site:URL – watch out for false negatives*. *Fortunately, there are a few ways to avoid those and get 100% accuracy.
  25. 25. URL indexing - causes: thin content, duplicate content, cannibalization, etc. Content quality. Crawl budget issues. Index bloat.
  26. 26. URL indexing. PARTIAL INDEXING AHEAD
  27. 27. Mobile-first related partial indexing: not visible on mobile
  28. 28. Mobile-first related partial indexing - example (one.ly/yoox-pants): mobile vs. desktop
  29. 29. Desktop: not visible on mobile. one.ly/yoox-pants
  31. 31. *Tomek’s joke
  32. 32. Diagnosing mobile-first related indexing problems. 1. Simple way: side-by-side visual comparison.
  33. 33. Diagnosing mobile-first related indexing problems. 2. Diffchecker.
  34. 34. Make sure that all the content on desktop is on mobile as well.
  35. 35. Thinking about that new, shiny JS framework?
  36. 36. JavaScript indexing: trends over time (≈ 25%)
  37. 37. Before we move forward... We need to talk.
  38. 38. Remember those good old times, when only SOME websites were JS-powered?
  39. 39. In 2020, WordPress, Magento, Wix, and Shopify are usually JS-powered too! DUH!
  40. 40. Google Hangouts (August 23rd 2019)
  41. 41. Is JavaScript SEO dying? JavaScript SEO is not dying. It's getting even more complex.
  42. 42. Or… in simpler terms.
  43. 43. JavaScript SEO is not dying
  44. 44. It is getting even more f..cked up!
  45. 45. JavaScript SEO leveled up over the last few years.
  46. 46. JavaScript-related indexing problems - example: with JavaScript vs. without JavaScript
  47. 47. Diagnosing JS-related partial indexing problems
  48. 48. Diagnosing JS-related partial indexing problems
  49. 49. How to spot JavaScript indexing problems? JavaScript indexing problems = partial indexing. The URL is INDEXED; the JavaScript-dependent content is NOT INDEXED.
  50. 50. WRS (Web Rendering Service): to understand JS-related indexing problems, we need to look under Google’s hood a bit.
  51. 51. WRS: to understand JS-related indexing problems, we need to look under Google’s hood a bit.
  52. 52. Google limits CPU consumption. Source: Google Webmaster Conference Product Summit, Mountain View, CA, http://services.google.com/fh/files/events/wmconf_product_summit_slides_publish.pdf
  53. 53. Rendering - a search engine's perspective
  54. 54. Confession time: “Father, 81% of my content is not indexed.”
  55. 55. Browser vs. BOR. Source: patent “Batch-optimized render and fetch architecture” (US20180276220A1).
  56. 56. How Batch-Optimized Rendering works, step by step. Source: patent “Batch-optimized render and fetch architecture” (US20180276220A1).
  57. 57. How Batch-Optimized Rendering works. Step 1: BOR skips all resources that are not essential to generate a preview of your page. Examples: tracking scripts (Google Analytics, Hotjar, etc.), ads, images*. Source: patent “Batch-optimized render and fetch architecture” (US20180276220A1).
  58. 58. Browser (load: 4.24s) vs. BOR (load: 1.91s)
  59. 59. How Batch-Optimized Rendering works. Step 2: set the value of a Virtual Clock. Source: patent “Batch-optimized render and fetch architecture” (US20180276220A1).
  60. 60. How Batch-Optimized Rendering works. Step 3: (1) the Virtual Clock’s time runs out*; (2) the website’s layout is generated. *Simplification. Source: patent “Batch-optimized render and fetch architecture” (US20180276220A1).
  61. 61. Using this data to rank better: Virtual Clock, Layout. Source: patent “Batch-optimized render and fetch architecture” (US20180276220A1).
  62. 62. Virtual Clock = Rendering Budget*. *Simplification.
  63. 63. Virtual Clock: the cost of our website’s rendering. Rendering pauses while waiting for scripts, CSS files, etc. A script/CSS-heavy website needs more “virtual time” on the virtual clock. Source: patent “Batch-optimized render and fetch architecture” (US20180276220A1).
  64. 64. BOR – a place where real time doesn’t matter. Source: patent “Batch-optimized render and fetch architecture” (US20180276220A1).
  65. 65. Where is the limit?
  66. 66. How resource-hungry is your website? Superfast CPU vs. slower CPU
  67. 67. Measuring the Virtual Clock load* of your website: 2 options. *Ubersimplification.
  68. 68. Use TLDR (one.ly/tldr). Simulate BOR in your Chrome DevTools: detailed walkthrough at one.ly/bor.
  69. 69. The virtual clock’s time runs out -> the LAYOUT is generated. Source: patent “Batch-optimized render and fetch architecture” (US20180276220A1).
  72. 72. Pre-layout times (background: raw HTML source of a page). Labels: Before 2011, rendering, After 2011, Google Panda, content quality updates.
  73. 73. Layout vs. rendering: new findings
  74. 74. A lot of focus on… layout. Source: BOR patents (2012-2018)
  75. 75. Content location matters: “text appearing above-the-fold (e.g., visible without scrolling) may be considered more important than text below-the-line.” Source: patent “Batch-optimized render and fetch architecture” (US20180276220A1).
  76. 76. Patent on scheduling resource crawls (filed in 2011): “The importance of the section is based on (...) prominence of the section within the rendered layout.” Source: patent “Scheduling resource crawls” (US20130144858A1).
  77. 77. The value of links depends on their location and attributes. Size and color of anchor text. Position of the link within the page: in an HTML list; in running text; above or below the screenfold; top, bottom, left, right; footer, sidebar, etc. Source: Google patent “Ranking documents based on user behavior and/or feature data” (US10152520B1).
  78. 78. Some sections may get more “Link Juice”* from Google: “(…) link positioned under the “More Top Stories” heading on the cnn.com has a high probability of being selected.” *Wink, wink, John Mu ;) Source: Google patent “Ranking documents based on user behavior and/or feature data” (US10152520B1).
  79. 79. Partial indexing – our findings. onely.com/tools/tldr
  80. 80. Google seems to struggle with indexing sections such as “related items” and “you may also be interested in”.
  81. 81. ...more findings
  82. 82. Ekhm… Going even further beyond JavaScript…
  83. 83. …which brings us to the 4th kind of partial indexing problem
  84. 84. http://one.ly/target
  85. 85. URL is indexed
  88. 88. Not indexed
  89. 89. Patent on scheduling resource crawls (filed in 2011): “The importance of the section is based on (...) prominence of the section within the rendered layout.” Source: patent “Scheduling resource crawls” (US20130144858A1).
  90. 90. …all those partial indexing problems are not THAT serious.
  91. 91. … but
  92. 92. Let’s recap first
  93. 93. Every kind of indexing problem* comes from different origins and requires different solutions: URL indexing problems; mobile-first related indexing problems; JavaScript-related indexing problems; layout-based indexing problems. *Sorted by level of complexity, ascending.
  94. 94. How are indexing problems killing your traffic?
  95. 95. Let’s investigate Target.com again.
  96. 96. JS + mobile, Quality, Indexed
  97. 97. Shipping info
  98. 98. Main content
  99. 99. Patent on scheduling resource crawls (filed in 2011): “The importance of the section is based on (...) prominence of the section within the rendered layout.” Source: patent “Scheduling resource crawls” (US20130144858A1).
  100. 100. 87%; 62.24%. https://www.target.com/p/nhl-chicago-blackhawks-checkers-game/-/A-54589615
  101. 101. 43.04%. https://www.target.com/p/nhl-chicago-blackhawks-checkers-game/-/A-54589615
  102. 102. JS + mobile; Indexed; 87%; 62.24%. https://www.target.com/p/nhl-chicago-blackhawks-checkers-game/-/A-54589615
  103. 103. Quality; 43.04%. https://www.target.com/p/nhl-chicago-blackhawks-checkers-game/-/A-54589615
  104. 104. Summary
  105. 105. Summary ? ?
  106. 106. Summary
  107. 107. Let’s talk about the results…
  108. 108. Indexed content = rankings.
  109. 109. THANK YOU www.onely.com
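
The side-by-side comparison from slides 32-33 can also be automated. Below is a minimal Python sketch, not the speaker's tooling. It assumes you have already saved the HTML of the desktop and mobile versions of a page; the helper names (TextExtractor, extract_text, missing_from) and the sample markup are made up for illustration. It only compares text that is present in the markup, so content that is in the HTML but hidden with CSS will not be flagged.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect text nodes, skipping the contents of <script> and <style>."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = " ".join(data.split())  # normalize whitespace
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def extract_text(html):
    """Return the text chunks found in an HTML string, in document order."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.chunks


def missing_from(reference_html, candidate_html):
    """Return text chunks that are in reference_html but not in candidate_html."""
    candidate = set(extract_text(candidate_html))
    return [chunk for chunk in extract_text(reference_html) if chunk not in candidate]


# Hypothetical sample pages: the product details exist only on desktop.
DESKTOP = """<html><body><h1>Slim pants</h1>
<p>Composition: 98% cotton</p>
<script>var tracking = 1;</script></body></html>"""
MOBILE = "<html><body><h1>Slim pants</h1></body></html>"

print(missing_from(DESKTOP, MOBILE))
```

The same helper could be pointed at a "with JavaScript" and a "without JavaScript" snapshot of one URL (slide 46) to list the text that only exists after rendering.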

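The three Batch-Optimized Rendering steps from slides 57-60 can be illustrated with a toy model. This is only a sketch of the simplification given on the slides ("Virtual Clock = Rendering Budget"), not Google's implementation. The Resource type, the simulate_render function, the kind labels, the cost values, and the budget are all assumptions invented for the example.

```python
from dataclasses import dataclass

# Kinds that the slides list as skipped in step 1 (the labels are this sketch's own).
NON_ESSENTIAL = {"tracking", "ads", "image"}


@dataclass
class Resource:
    url: str
    kind: str    # e.g. "script", "css", "tracking", "ads", "image"
    cost: float  # hypothetical units of virtual time needed to process it


def simulate_render(resources, budget):
    """Toy virtual clock: return (processed, skipped, unfinished) URL lists."""
    processed, skipped, unfinished = [], [], []
    clock = 0.0
    exhausted = False
    for res in resources:
        if res.kind in NON_ESSENTIAL:
            skipped.append(res.url)      # step 1: skipped, costs no virtual time
        elif not exhausted and clock + res.cost <= budget:
            clock += res.cost            # step 2: spend part of the budget
            processed.append(res.url)
        else:
            exhausted = True             # step 3: the clock ran out
            unfinished.append(res.url)   # not part of the generated layout
    return processed, skipped, unfinished


# Hypothetical page: the last script does not fit into the budget.
PAGE = [
    Resource("/app.css", "css", 1.0),
    Resource("/ga.js", "tracking", 2.0),
    Resource("/app.js", "script", 3.0),
    Resource("/hero.jpg", "image", 1.5),
    Resource("/related-items.js", "script", 2.5),
]

processed, skipped, unfinished = simulate_render(PAGE, budget=5.0)
print(processed, skipped, unfinished)
```

In this toy run the script that loads last is the one left out once the budget is spent, which is one way to picture why a script/CSS-heavy page needs more "virtual time".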