Search Engine Crawl Budget - Don't Stop the Bot

Learn how crawl budget impacts your website's rankings and how you can improve your website's crawlability.

  1. Crawl Budget: Don’t Stop the Bot
  2. Overview: (1) Introduction, (2) Crawl Budget, (3) Crawl Rate Limit, (4) Crawl Frequency, (5) Why you should care, (6) Actions
  3. Crawling: “Crawling is the entry point for sites into Google's search results.” (Gary Illyes, Google Webmaster Central Blog)
  4. Google’s Crawl Budget: Google defines crawl budget as “the number of URLs Googlebot can and wants to crawl.”
  5. Google’s Crawl Budget: Two elements decide how much crawl budget a website gets: Crawl Rate + Crawl Demand = Crawl Budget.
     • Crawl Rate: how often the site is visited. This is determined by crawl health. Good health means a quick response time and limited server errors. Fast site = increased crawl rate. Slow site = Google takes it easy, slows down and visits less.
     • Crawl Demand: driven by popularity (how often URLs are visited from the web) and by the removal of stale URLs to keep the SERPs fresh.
  6. Factors affecting crawl budget: Having many low-value-add URLs has a negative impact on budget. These include: • Faceted navigation & session identifiers • On-site duplicate content • Low quality & spam content • Soft error pages (e.g. soft 404s, which show an error message but return a 200 status) • Hacked pages • Infinite spaces & proxies. Source: Google Webmaster blog
  7. Why give a jot about the Bot: Wasting server resources on pages like those mentioned will drain crawl activity from pages that actually have value. This may cause a significant delay in discovering great content on your site or a client’s site.
  8. Crawl Rate Limit: The crawl rate limit is designed to stop Google from crawling your pages so often and so fast that it hurts your server. In other words, if the Bot thinks your site can’t cope, it will take it easy. It will crawl more slowly and therefore reach less of your site.
  9. Crawl Frequency: Crawl frequency is the number of days per month that Googlebot requests a URL. There is a clear relationship between traffic and frequency of crawl, making this a critical SEO indicator. Knowing what Google crawls frequently is a good indicator of what Google thinks is worthwhile and what it needs to keep fresh in the index. Understanding the characteristics of those pages can inform what you might need to do to improve the remainder. Source: Botify
  10. Factors affecting crawl frequency: Things that can reduce crawl frequency include: • Site structure issues • Duplicate content • Publishing pages for which there’s no demand • Publishing at a rate faster than Google is ready to admit pages to the index. Pages with more internal links are likely to be crawled more frequently. Source: Botify
  11. Why give a jot about the Bot: • Crawl budget is a precious resource. We need to use it wisely. It is especially important on larger sites. • More pages crawled = more pages that may be indexed. • More bot energy is focussed on high-value pages rather than poor, low-value content. • New content is discovered quickly and easily, giving it a chance to rank sooner.
  12. What we want to achieve: A well-organized site in which the most important content is easily accessible from the homepage and other important entry points. A speedy site that reflects healthy servers, so the Bot can get more content over the same number of connections.
  13. What do we need to do?
     • Thin Pages: Use Deepcrawl to identify thin pages. Improve these pages where possible. Consider combining them with another page or removing them.
     • Duplicate Content: Always do a ‘site:’ search for a variety of keywords related to your new page or content. Make sure your content is unique.
     • Internal Linking: Make sure your content sits logically in the site’s hierarchy. Eliminate orphaned pages by linking to them from relevant pages, ideally those that are indexed and do well. Make sure important pages are linked to widely, using relevant variations of optimised anchor text. Make sure sitemaps are up to date. Use Search Console to discover the most linked-to pages and make sure these correspond to your priorities.
  14. Further Reading: • Webmaster Blog • Crawl Errors report in Search Console • Crawl Frequency from Botify
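
Crawl frequency, defined in the deck as the number of days per month that Googlebot requests a URL, can be estimated from your own server access logs. The following is a minimal sketch, not the method used by Botify or Google. It assumes logs in the common "combined" format with English month abbreviations, and it identifies Googlebot only by the user-agent string, which can be spoofed (a production check would also verify the requesting IP, for example via reverse DNS). The function name `crawl_frequency` and the sample log lines are hypothetical.

```python
import re
from collections import defaultdict
from datetime import datetime

# Assumption: "combined" access log format, e.g.
# IP - - [10/Mar/2017:06:25:14 +0000] "GET /path HTTP/1.1" 200 5120 "referer" "user agent"
LOG_PATTERN = re.compile(
    r'\[(?P<ts>[^\]]+)\] "(?:GET|HEAD) (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<ua>[^"]*)"'
)


def crawl_frequency(log_lines, year, month):
    """Return {path: number of distinct days in the given month on which
    a request with a Googlebot user agent hit that path}."""
    days_seen = defaultdict(set)
    for line in log_lines:
        match = LOG_PATTERN.search(line)
        # Skip unparseable lines and anything not claiming to be Googlebot.
        if not match or "Googlebot" not in match.group("ua"):
            continue
        ts = datetime.strptime(match.group("ts"), "%d/%b/%Y:%H:%M:%S %z")
        if ts.year == year and ts.month == month:
            # A set keeps one entry per day, so repeat hits on a day count once.
            days_seen[match.group("path")].add(ts.day)
    return {path: len(days) for path, days in days_seen.items()}


GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
BROWSER_UA = "Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Firefox/52.0"

# Hypothetical sample data for illustration only.
sample_log = [
    f'66.249.66.1 - - [10/Mar/2017:06:25:14 +0000] "GET /products/ HTTP/1.1" 200 5120 "-" "{GOOGLEBOT_UA}"',
    f'66.249.66.1 - - [10/Mar/2017:18:02:51 +0000] "GET /products/ HTTP/1.1" 200 5120 "-" "{GOOGLEBOT_UA}"',
    f'66.249.66.1 - - [11/Mar/2017:07:11:09 +0000] "GET /products/ HTTP/1.1" 200 5120 "-" "{GOOGLEBOT_UA}"',
    f'66.249.66.1 - - [10/Mar/2017:06:26:40 +0000] "GET /about HTTP/1.1" 200 2048 "-" "{GOOGLEBOT_UA}"',
    f'203.0.113.7 - - [10/Mar/2017:09:00:00 +0000] "GET /products/ HTTP/1.1" 200 5120 "-" "{BROWSER_UA}"',
    f'66.249.66.1 - - [02/Apr/2017:06:25:14 +0000] "GET /products/ HTTP/1.1" 200 5120 "-" "{GOOGLEBOT_UA}"',
]

if __name__ == "__main__":
    for path, days in crawl_frequency(sample_log, 2017, 3).items():
        print(f"{path}: crawled on {days} day(s) in March 2017")
```

Sorting the result by day count gives a rough view of which pages the Bot revisits most and which important pages it rarely reaches; the latter are candidates for the internal linking and thin-page actions listed in the deck.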