rel canonical audit BrightonSEO September 2018

1,547 views

Why taking a look at your canonical setup is a smart move... Some data to compare with, an audit checklist, and a number of Google's quotes over the years.

Published in: Marketing

  1. 1. Mark Thomas @SearchMATH BOTIFY Why auditing your rel=canonical configuration is a shrewd move http://www.slideshare.net/MarkThomas114
  2. 2. TECHNICAL SEO CONTENT REAL RANKINGS & CTR The first unified suite to drive SEO success in each phase of the organic search process
  3. 3. I would also like to add to this conversation that we have learned the hard way that if we use canonicals for pages that aren’t duplicates or near-duplicates, we have no impact at best and a ranking drop at worst. Please don’t get clever with canonicals in a market that I need to meet my targets.
  4. 4. 45 million URLs are being tagged with an incorrect alternative canonical, which confuses Google and forces the crawl of 45M unnecessary URLs. When a query URL contains a space, the canonical rule replaces the "+" character with the encoded character "%20": https://www.example.com/a/audi-a4?query=audi+a4 (which has the internal links) canonicalizes to https://www.example.com/a/audi-a4?query=audi%20a4
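A minimal sketch of how this kind of mismatch could be detected, assuming all you have is a page URL and the canonical URL it declares. The function names are hypothetical; the idea is simply to re-encode both query strings the same way and see whether the difference disappears.

```python
from urllib.parse import parse_qsl, quote, urlencode, urlsplit, urlunsplit


def normalize_query_encoding(url: str) -> str:
    """Re-encode the query string so a space is always written as %20."""
    parts = urlsplit(url)
    # parse_qsl decodes both "+" and "%20" to a literal space.
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    # quote_via=quote writes spaces back as %20 instead of "+".
    query = urlencode(pairs, quote_via=quote)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, parts.fragment))


def is_encoding_only_difference(url: str, canonical: str) -> bool:
    """True when two URLs differ as strings but match once re-encoded."""
    return url != canonical and (
        normalize_query_encoding(url) == normalize_query_encoding(canonical)
    )


linked = "https://www.example.com/a/audi-a4?query=audi+a4"
declared = "https://www.example.com/a/audi-a4?query=audi%20a4"
print(is_encoding_only_difference(linked, declared))
```

A `True` result here suggests the canonical is pointing at a second spelling of the same page rather than at a genuinely different URL.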
  5. 5. Why Audit Your Canonical Set Up?
  6. 6. Why Audit Your Canonical Set Up? 1) Get More Inventory Ranking 2) Get The Right Inventory Ranking
  7. 7. rel=canonical Issues Detect Fix Agenda
  8. 8. …a process for converting data that has more than one possible representation into a "standard", "normal", or canonical form.
  9. 9. Canonicalization (diagram labels: Canonical, Duplicate, 98% Match, Partner Site)
  10. 10. Including a rel=canonical link in your webpage is a strong hint to search engines about your preferred version to index among duplicate pages on the web.
  11. 11. rel=canonical Fact File • An “Element” rather than a “Tag” • rel=canonical is a hint, not a directive • rel='canonical' or rel="canonical" are fine when placed in the <head> • Google processes rel=canonical as a 2nd/3rd step, not during crawl
  12. 12. What does rel=canonical offer?
  13. 13. What does rel=canonical offer? • Circumvents Duplicate Content • Avoids diluting Link Authority • Avoids Content Cannibalisation in SERPs
  14. 14. Specific cases for rel=canonical • Uppercase/lowercase URL paths, Session IDs, Tracking Codes • Product review pages with /review/product/list/ • Multiple versions of category pages derived from dynamic filters • Product Page: Multiple versions • ‘Show all’ category pages: if a different URL • Content Syndication
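As a rough illustration of the session ID and tracking code case, the sketch below groups crawled URLs that collapse to the same address once ignorable parameters are removed. The parameter names in `IGNORED_PARAMS` are assumptions chosen for the example, not a list from the talk, and lowercasing the path is left optional because it is only safe when the server treats paths case-insensitively.

```python
from collections import defaultdict
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical list: replace with the parameters your own site actually uses.
IGNORED_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "sessionid"}


def canonical_candidate(url: str, lowercase_path: bool = False) -> str:
    """Drop ignorable parameters (and the fragment) to get a candidate canonical."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in IGNORED_PARAMS
    ]
    path = parts.path.lower() if lowercase_path else parts.path
    return urlunsplit((parts.scheme, parts.netloc.lower(), path, urlencode(kept), ""))


def group_duplicates(urls):
    """Map each candidate canonical to the crawled URLs that collapse onto it."""
    groups = defaultdict(list)
    for url in urls:
        groups[canonical_candidate(url)].append(url)
    # Only groups with more than one member are potential duplicates.
    return {key: members for key, members in groups.items() if len(members) > 1}


crawled = [
    "https://www.example.com/shoes?colour=red",
    "https://www.example.com/shoes?colour=red&utm_source=newsletter",
    "https://www.example.com/shoes?colour=red&sessionid=abc123",
    "https://www.example.com/boots",
]
for candidate, members in group_duplicates(crawled).items():
    print(candidate, len(members))
```

Each group is a set of URLs that would normally be expected to declare the same canonical.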
  15. 15. The cleaner you can make your signals, the more likely we'll use them. John Mueller Reddit AMA, April 2018
  16. 16. Google chose different canonical than user – There are many cases where Google simply gets this wrong. Are there any methods that would force Google to honor the canonical specified by the webmaster?
  17. 17. • Redirect to your preferred version • Make internal links, hreflang, rel=next/prev/etc. point to the preferred version • Put it into a sitemap file, etc.
  18. 18. 10 Common Issues
     • Abundance: too many pages canonicalizing to a single page
     • Code: rel=canonical in <body>, multiple declarations, etc.
     • Content: lack of parity between canonical and canonicalized
     • Duplication: too little canonicalization
     • hreflang: canonicalizing pages in a hreflang cluster to one language variant
     • HTTP Codes: canonicalizing to non-200 HTTP status codes
     • Linking: more links to canonicalized page rather than canonical
     • Noindex: noindex present on a canonicalized page
     • Pagination: canonicalizing component pages to the first page in a paginated set
     • Tracking: parameters generating duplicate URLs
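The "Code" issues above (rel=canonical in the <body>, multiple declarations) can be spotted in raw HTML. The following is a minimal sketch using Python's standard `html.parser`; the class and function names are hypothetical, and unlike a browser it does not infer an implicit end of <head>, so treat the output as a first-pass signal only.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collect every <link rel="canonical"> and note whether it sits in <head>."""

    def __init__(self):
        super().__init__()
        self.in_head = False
        self.found = []  # list of (href, was_inside_head)

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self.in_head = True
        elif tag == "link":
            attributes = dict(attrs)
            rel_values = (attributes.get("rel") or "").lower().split()
            if "canonical" in rel_values:
                self.found.append((attributes.get("href"), self.in_head))

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False


def canonical_code_issues(html: str):
    """Return a list of code-level canonical issues found in one HTML document."""
    finder = CanonicalFinder()
    finder.feed(html)
    issues = []
    if not finder.found:
        issues.append("missing canonical")
    if len(finder.found) > 1:
        issues.append("multiple declarations")
    if any(not inside_head for _, inside_head in finder.found):
        issues.append("canonical outside <head>")
    return issues


page = (
    '<html><head><link rel="canonical" href="https://www.example.com/a"></head>'
    '<body><link rel="canonical" href="https://www.example.com/b"></body></html>'
)
print(canonical_code_issues(page))
```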
  19. 19. Non-canonical Gaining Impressions
  20. 20. 11 Step Canonical Audit
     1) Review GSC Index Coverage Report
     2) Build Data Warehouse Including: Simulated Web Crawl, Logs, JS, GSC, GA/Adobe
     3) Review Duplicate Content Situation
     4) Assess Crawl Budget Impact
     5) Assess Canonical Content Similarity
     6) Check Internal Linking Signals (canonical should receive most internal links)
     7) XML Sitemap Check (should only contain canonical URLs)
     8) Check URLs with canonicals pointing to a 404 or noindex
     9) Check URLs missing a canonical element
     10) Check paginated URLs have a self-referencing canonical
     11) Check hreflang clusters self-referencing canonical
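Several of the checks above (internal linking, sitemap contents, canonicals pointing to a 404 or noindex, missing canonicals) reduce to simple rules over crawl data. The sketch below assumes a hypothetical `Page` record per crawled URL; the field names and the report keys are illustrative and do not come from any particular crawler.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Page:
    """One crawled URL. All fields are assumptions about what a crawl export holds."""
    url: str
    status: int
    canonical: Optional[str] = None
    noindex: bool = False
    inlinks: int = 0


def audit(pages, sitemap_urls):
    """Run a handful of canonical checks and return the offending URLs per check."""
    by_url = {page.url: page for page in pages}
    report = {
        "missing_canonical": [],
        "canonical_to_bad_target": [],
        "more_links_than_canonical": [],
        "non_canonical_in_sitemap": [],
    }
    for page in pages:
        if page.status != 200:
            continue
        if page.canonical is None:
            report["missing_canonical"].append(page.url)
            continue
        if page.canonical == page.url:
            continue  # self-referencing canonical, nothing to compare
        target = by_url.get(page.canonical)
        # In this sketch a target that was never crawled is treated as bad.
        if target is None or target.status != 200 or target.noindex:
            report["canonical_to_bad_target"].append(page.url)
        elif page.inlinks > target.inlinks:
            report["more_links_than_canonical"].append(page.url)
        if page.url in sitemap_urls:
            report["non_canonical_in_sitemap"].append(page.url)
    return report


pages = [
    Page("https://www.example.com/a", 200, "https://www.example.com/a", inlinks=3),
    Page("https://www.example.com/a?sort=price", 200, "https://www.example.com/a", inlinks=10),
    Page("https://www.example.com/b", 200),
    Page("https://www.example.com/c", 200, "https://www.example.com/gone"),
    Page("https://www.example.com/gone", 404),
]
sitemap = {"https://www.example.com/a", "https://www.example.com/a?sort=price"}
for check, urls in audit(pages, sitemap).items():
    print(check, urls)
```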
  21. 21. Review GSC Index Coverage Report: use Google’s Index Coverage Report
     • Valid - Indexed; consider marking as canonical: The URL was indexed. Because it has duplicate URLs, we recommend explicitly marking this URL as canonical.
     • Excluded - Duplicate page without canonical tag: This page has duplicates, none of which is marked canonical. We think this page is not the canonical one. You should explicitly mark the canonical for this page.
     • Google chose different canonical than user: This page is marked as canonical for a set of pages, but Google thinks another URL makes a better canonical. Google has indexed the page we consider canonical rather than this one. We recommend that you explicitly mark this page as a duplicate of the canonical URL.
     • Submitted URL not selected as canonical: The difference between this status and "Google chose different canonical than user" is that, in this case, you explicitly requested indexing.
  22. 22. Data Warehouse Build Data Warehouse Including: Simulated Web Crawl, Logs, JS, GSC, GA/Adobe
  23. 23. Assess Crawl Budget Impact Canonical Conversion
  24. 24. Canonical Similarity Assess Canonical Content Similarity
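One way to put a number on content similarity between a canonicalized page and its canonical is Jaccard similarity over word shingles. This is a swapped-in, generic technique for illustration; the slides do not say how the similarity percentages in the deck were computed, and the function names and the shingle size of 3 are assumptions.

```python
import re


def shingles(text: str, size: int = 3):
    """Return the set of overlapping word n-grams ("shingles") in the text."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < size:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def similarity(text_a: str, text_b: str, size: int = 3) -> float:
    """Jaccard similarity of the two shingle sets, from 0.0 to 1.0."""
    set_a, set_b = shingles(text_a, size), shingles(text_b, size)
    if not set_a and not set_b:
        return 1.0  # two empty documents are treated as identical
    return len(set_a & set_b) / len(set_a | set_b)


canonical_text = "used audi a4 for sale in brighton"
duplicate_text = "used audi a4 for sale in london"
score = similarity(canonical_text, duplicate_text)
# Flag pairs under 50%, mirroring the threshold used in the comparison tables.
print(round(score, 2), "flag" if score < 0.5 else "ok")
```

In practice the text compared would be the main content of each page with navigation and boilerplate stripped, since shared templates inflate the score.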
  25. 25. How do sites compare? (Combined benchmark table: ten sites across Travel, Retail, Classified and Publishing, 21 metrics per site. The same figures and flag thresholds appear in two readable halves on slides 26 and 27.)
  26. 26. How do sites compare?

     | Industry | URLs Crawled | Known URLs | Number of Compliant URLs Crawled | Canonical Not Equal Volume | Meta Noindex + Canonical Not Equal or Bad Status Code | Total Canonical Not Equal Volume | % Canonical Not Equal | Duplicate Content: No. of Pages with Similarity > 90% | % of Pages with Similarity > 90% | Pages Less Than 50% Similar to Canonical | % of Canonicalised Pages Less Than 50% Similar to Canonical |
     |---|---|---|---|---|---|---|---|---|---|---|---|
     | Travel | 4616 | 4616 | 3376 | 513 | 2 | 515 | 11% | 419 | 12% | 150 | 29% |
     | Retail | 22161 | 22161 | 8010 | 8890 | 262 | 9152 | 41% | 1386 | 17% | 36 | 0% |
     | Retail | 25720 | 25270 | 19216 | 2130 | 0 | 2130 | 8% | 2770 | 14% | 57 | 3% |
     | Retail | 43,499 | 43,499 | 39,663 | 3751 | 16 | 3767 | 9% | 1112 | 3% | 3465 | 92% |
     | Classified | 123,336 | 123,336 | 122,085 | 34 | 0 | 34 | 0% | 7716 | 6% | 20 | 59% |
     | Publishing | 316487 | 316487 | 220597 | 50278 | 4519 | 54797 | 17% | 4391 | 2% | 5824 | 11% |
     | Travel | 366068 | 366068 | 171113 | 71425 | 69354 | 140779 | 38% | 24256 | 14% | 4927 | 3% |
     | Travel | 421166 | 421166 | 115144 | 43855 | 72 | 43927 | 10% | 33293 | 29% | 2649 | 6% |
     | Retail | 1141182 | 1141182 | 654029 | 142920 | 21480 | 164400 | 14% | 53924 | 8% | 120734 | 73% |
     | Retail | 2798951 | 2798951 | 727844 | 911137 | 701798 | 1612935 | 58% | 15786 | 2% | 280547 | 17% |

     Flag thresholds shown on the slide: >0 Flagged, >10% Flagged, >10% Flagged, >10% Flagged
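To compare your own site against these figures, the three percentage columns can be recomputed from the raw counts. The denominators below are inferred from the numbers on this slide rather than stated on it: total canonical-not-equal over URLs crawled, pages over 90% similar over compliant URLs, and pages under 50% similar over total canonical-not-equal. Treat them as an assumption; the function and key names are hypothetical.

```python
def pct(part: int, whole: int) -> int:
    """Whole-number percentage, or 0 when the denominator is empty."""
    return round(100 * part / whole) if whole else 0


def canonical_ratios(urls_crawled, compliant_urls, total_canonical_not_equal,
                     pages_over_90_similar, pages_under_50_similar):
    """Recompute the percentage columns from raw counts (inferred denominators)."""
    return {
        "pct_canonical_not_equal": pct(total_canonical_not_equal, urls_crawled),
        "pct_pages_over_90_similar": pct(pages_over_90_similar, compliant_urls),
        "pct_canonicalised_under_50_similar": pct(
            pages_under_50_similar, total_canonical_not_equal
        ),
    }


# Counts taken from the first two rows of the slide (Travel, then Retail).
travel = canonical_ratios(4616, 3376, 515, 419, 150)
retail = canonical_ratios(22161, 8010, 9152, 1386, 36)
print(travel)
print(retail)
```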
  27. 27. How do sites compare?

     | Industry | Number of URLs Crawled by Botify & Google | Number of Compliant Pages Crawled by Botify & Google | % of Compliant Pages Crawled by Botify & Google | Number of URLs Crawled >80% | Number of Compliant URLs Crawled >80% | % of Compliant Pages Crawled >80% | Number of URLs Crawled 20%-79% | No. of Incoming Canonical Tags >5 | No. of Incoming Canonical Tags >10 | No. of Incoming Canonical Tags >50 | Canonical Not Equal but Present in Sitemap |
     |---|---|---|---|---|---|---|---|---|---|---|---|
     | Travel | 1061 | 850 | 25% | 186 | 161 | 5% | 570 | 2 | 1 | 0 | 0 |
     | Retail | 10884 | 5830 | 73% | 1223 | 1099 | 14% | 2868 | 681 | 138 | 0 | 0 |
     | Retail | 10681 | 9969 | 52% | 151 | 149 | 1% | 2060 | 87 | 48 | 4 | 0 |
     | Retail | 29328 | 29169 | 74% | 468 | 468 | 1% | 3083 | 132 | 103 | 13 | 0 |
     | Classified | 103098 | 102492 | 84% | 6445 | 6442 | 5% | 28623 | 0 | 0 | 0 | 0 |
     | Publishing | 132887 | 123466 | 56% | 8607 | 8603 | 4% | 38415 | 358 | 243 | 84 | 40 |
     | Travel | 163586 | 94749 | 55% | 6528 | 6376 | 4% | 41905 | 205 | 100 | 27 | 3 |
     | Travel | 115783 | 69293 | 60% | 6312 | 6244 | 5% | 40874 | 297 | 181 | 1 | 3 |
     | Retail | 318000 | 232201 | 36% | 8708 | 8392 | 1% | 26106 | 70 | 58 | 46 | 730 |
     | Retail | 961585 | 704797 | 97% | 170270 | 166363 | 23% | 479940 | 67966 | 40101 | 0 | 0 |

     Flag thresholds shown on the slide: <80% Flagged, <20% Flagged
  28. 28. Fix Upstream Where Possible
  29. 29. rel=canonical Fixes (TECHNIQUE: DETAIL)
     Upstream
     • Standardised URLs: get rid of event tracking, session IDs, query strings, etc.
     • Consistent Internal Links
     • Use 301 if absolutely necessary
     • Robots.txt: Disallow: /*?query=
     • GSC: Exclude Parameters
     Downstream
     • rel=canonical: Consistent Signals
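Before shipping a wildcard rule such as `Disallow: /*?query=`, it helps to check which URLs it would actually match. The sketch below is a simplified approximation of Google-style robots.txt matching (`*` matches any run of characters, a trailing `$` anchors the end, and the rule is compared against the path plus query from its start). It is not a full robots.txt parser, and the function names are hypothetical.

```python
import re
from urllib.parse import urlsplit


def rule_to_regex(rule: str):
    """Translate one Disallow pattern into a compiled regular expression."""
    anchored = rule.endswith("$")
    body = rule[:-1] if anchored else rule
    # Escape the literal chunks and let each "*" match any run of characters.
    pattern = ".*".join(re.escape(chunk) for chunk in body.split("*"))
    return re.compile(pattern + ("$" if anchored else ""))


def is_disallowed(url: str, rule: str = "/*?query=") -> bool:
    """True when the URL's path and query string match the rule from the start."""
    parts = urlsplit(url)
    target = parts.path + ("?" + parts.query if parts.query else "")
    return rule_to_regex(rule).match(target) is not None


for candidate in (
    "https://www.example.com/a/audi-a4?query=audi+a4",
    "https://www.example.com/a/audi-a4",
    "https://www.example.com/a/audi-a4?page=2&query=audi+a4",
):
    print(candidate, is_disallowed(candidate))
```

Note that the third URL is not matched: the pattern contains a literal `?`, so it only catches `query` when it is the first parameter.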
  30. 30. Conclusion
  31. 31. Be consistent: “Consistency is the mother of all good SEO.” Matt Cutts 2009 | John Mueller 2016
  32. 32. @SearchMATH http://www.slideshare.net/MarkThomas114
