
Crawl Optimisation - #Pubcon 2015

6,820 views

Slides from my technical SEO talk about crawl optimisation, part of the SEO Tech Masters session at Pubcon 2015


  1. #pubcon @badams | Crawl Optimisation. Presented by: Barry Adams, Polemic Digital
  2. About Barry Adams • Dutchman in Northern Ireland • Founder of Polemic Digital • Senior editor for StateofDigital.com • Twitter ranter: @badams • Lecturer & educator
  3. What is Crawl Optimisation? Ensuring search engine spiders waste as little time as possible and spend their crawl budget on the right URLs on your site.
  4. Why is Crawl Optimisation important? If you waste crawl budget, the right pages are unlikely to be crawled & indexed.
  5. Crawl Sources • Site crawl • XML Sitemaps • Inbound links • DNS records • Domain registrations • Browsing data
  6. Identifying Crawl Waste
  7. Crawl Waste • Bogus URLs in XML Sitemap
  8. Optimise XML Sitemaps • Ensure your sitemap contains final URLs only • Minimise 301-redirects or other non-200 status codes • Use multiple sitemaps to identify crawl waste in GSC
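The sitemap hygiene checks above can be scripted. A minimal sketch, assuming you can fetch each URL's HTTP status (the sitemap XML, URLs, and status codes below are illustrative, not from the talk):

```python
import xml.etree.ElementTree as ET

# The standard sitemap namespace from sitemaps.org.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

def sitemap_urls(xml_text):
    """Extract all <loc> URLs from a sitemap document."""
    root = ET.fromstring(xml_text)
    return [loc.text.strip() for loc in root.findall(".//sm:loc", NS)]

def find_crawl_waste(xml_text, fetch_status):
    """Return sitemap URLs whose status is not 200 OK.

    fetch_status is any callable mapping a URL to an HTTP status code,
    e.g. a wrapper around urllib.request issuing HEAD requests.
    """
    return [u for u in sitemap_urls(xml_text) if fetch_status(u) != 200]

# Illustrative sitemap and canned status codes (not real data).
SITEMAP = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/old-page</loc></url>
</urlset>"""

STATUSES = {"https://example.com/": 200, "https://example.com/old-page": 301}
print(find_crawl_waste(SITEMAP, STATUSES.get))  # the 301 URL is flagged
```

Running the non-200 URLs through a report like this, per sitemap file, mirrors the "multiple sitemaps in GSC" tactic: each file becomes its own crawl-waste bucket.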
  9. Crawl Waste • Paginated Listings • Especially when combined with faceted navigation
  10. Optimise Paginated Listings • List more items on a single page • Implement rel=prev/next • Block sorting parameters in robots.txt – Disallow: /*?order=* • Use rel=nofollow on sorting links
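As a sketch of the rel=prev/next pattern: the middle page of a paginated listing declares links to its neighbours in the <head> (the URLs here are illustrative, not from the talk):

```html
<!-- On /category?page=2 (illustrative URL) -->
<link rel="prev" href="https://example.com/category?page=1">
<link rel="next" href="https://example.com/category?page=3">
```

The first page omits rel=prev and the last page omits rel=next; the robots.txt rule on the slide then keeps sorted duplicates of these pages out of the crawl entirely.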
  11. Crawl Waste • Internal Site Search Results
  12. Block Internal Site Search Pages • Block in robots.txt:
      User-agent: *
      Disallow: /SearchResults.aspx
      Disallow: /*query=*
      Disallow: /*s=*
  13. Crawl Waste • Internal redirects
  14. Minimise Internal Redirects • Find redirects with Screaming Frog • Internal links should all be 200 OK • Flat site structure
  15. Crawl Waste • Canonicalised Pages
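The redirect clean-up can be checked programmatically. A sketch, assuming you have exported a source-to-target redirect mapping from a crawler such as Screaming Frog (the URLs and mapping below are made up for illustration):

```python
def resolve_chain(url, redirects, max_hops=10):
    """Follow url through a {source: target} redirect map.

    Returns (final_url, hops); hops == 0 means the link already
    points at its final destination. max_hops guards against loops.
    """
    hops = 0
    while url in redirects and hops < max_hops:
        url = redirects[url]
        hops += 1
    return url, hops

# Illustrative crawl export: internal links that 301 elsewhere.
REDIRECTS = {
    "/old-shoes": "/shoes",
    "/shoes": "/footwear",  # two-hop chain: /old-shoes -> /shoes -> /footwear
}

for link in ["/old-shoes", "/footwear"]:
    final, hops = resolve_chain(link, REDIRECTS)
    if hops:
        print(f"update link {link} -> {final} ({hops} redirect(s))")
```

Any link reported with hops > 0 should be rewritten in the site's templates to point straight at the final URL, so spiders only ever see 200 OK responses internally.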
  16. Use Canonicals Wisely • rel=canonical is primarily for index issues – It is not a fix for crawl waste – Search engines need to see the canonical tag before they can act on it – Ergo, pages need to be crawled before rel=canonical has any effect – Ditto with meta noindex tags
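For reference, the canonical tag sits in the <head> of the duplicate page itself, which is exactly why that page must still be crawled before the hint can take effect (the URLs here are illustrative):

```html
<!-- On /product?sessionid=abc123 (illustrative duplicate URL) -->
<link rel="canonical" href="https://example.com/product">
```

This is the crux of the slide: rel=canonical consolidates indexing signals after a crawl, whereas robots.txt prevents the crawl in the first place.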
  17. Crawl Waste • Slow loading pages
  18. Optimise Load Speed • Time to First Byte • Lightweight pages • Caching • Compression
  19. Crawl Optimisation Summarised • Don’t let search engines do the hard work • Tools at your disposal: – DeepCrawl – Google Search Console – Screaming Frog SEO Spider – WebPageTest.org • Solutions: – XML Sitemaps – robots.txt – rel=nofollow – rel=prev/next – Load speed
