How to deal with crawl budget
Joanna Beech | Deloitte
SLIDESHARE.NET/JOANNABEECH
@JOANNABEECH
How to deal with Crawl Budget?
September, 2021
Joanna Beech
#brightonSEO
Joanna Beech
Dog Lover
Worked in both agency and client-side
Manager (SME) – Deloitte
Key Takeaways
01 What is crawl budget?
02 What can we do to improve our crawl budget?
What is Crawl Budget?
• The number of pages search engine bots crawl and index on a website within a given timeframe.
• Determined by website size, site health, and the number of links to your site.
When does it become an issue?
• If the number of pages on your site exceeds its crawl budget, new sections may not get crawled and new content may not be indexed.
• Mainly an issue for large websites, or medium-sized websites that publish rapidly changing content.
What can we do?
• Improve site speed
• Website hierarchy
• Internal links
• Limit duplicate content
• Fix technical errors
• Limit orphan pages
• Deal with redirects
Improve Site Speed
Why?
A faster site means Google crawls more of your site's URLs.
Improve Site Speed
How?
• Check Google Search Console to see what is affecting your site speed
• Run PageSpeed Insights
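The PageSpeed Insights check can be scripted, too. A minimal Python sketch against the public PageSpeed Insights v5 API; the page URL is a placeholder:

    import json
    import urllib.parse
    import urllib.request

    # Placeholder page to test; the PSI v5 endpoint itself is public
    page = "https://www.example.com/"
    api = ("https://www.googleapis.com/pagespeedonline/v5/runPagespeed?"
           + urllib.parse.urlencode({"url": page, "strategy": "mobile"}))

    with urllib.request.urlopen(api) as resp:
        report = json.load(resp)

    # Lighthouse reports the performance score on a 0-1 scale
    score = report["lighthouseResult"]["categories"]["performance"]["score"]
    print(f"Mobile performance score: {score * 100:.0f}/100")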
Website Hierarchy
Why?
A clear website hierarchy will help optimise each crawl.
Website Hierarchy
How?
• A flat website hierarchy works best, e.g. website.com/category/subcategory/product
• Use breadcrumb navigation
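Breadcrumbs can also be surfaced to crawlers as structured data. A minimal sketch using schema.org BreadcrumbList markup, with URLs that simply mirror the category/subcategory/product path above:

    <script type="application/ld+json">
    {
      "@context": "https://schema.org",
      "@type": "BreadcrumbList",
      "itemListElement": [
        {"@type": "ListItem", "position": 1, "name": "Category",
         "item": "https://website.com/category"},
        {"@type": "ListItem", "position": 2, "name": "Subcategory",
         "item": "https://website.com/category/subcategory"},
        {"@type": "ListItem", "position": 3, "name": "Product",
         "item": "https://website.com/category/subcategory/product"}
      ]
    }
    </script>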
Internal Links
Why?
A strong internal link structure helps crawlers find your pages more easily.
Internal Links
How?
Follow a pyramid structure
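One way to check how flat the pyramid really is: compute each page's click depth from the homepage. A Python sketch, assuming the networkx library and a hypothetical internal_links.txt edge list ("source target" URL pairs) exported from a site crawl:

    import networkx as nx

    # Build a directed graph of internal links from the crawl export
    G = nx.DiGraph()
    with open("internal_links.txt") as f:
        for line in f:
            src, dst = line.split()
            G.add_edge(src, dst)

    # Click depth from the homepage; deeply buried pages are crawled
    # less often, so they should be linked from high-traffic pages
    depth = nx.shortest_path_length(G, source="https://website.com/")
    for url, d in sorted(depth.items(), key=lambda kv: -kv[1])[:20]:
        print(d, url)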
Duplicate Content
Why?
Ensure each crawl is spent on unique content rather than on duplicate URLs serving the same content.
Duplicate Content
How?
• Delete duplicate pages
• Set the necessary parameters in robots.txt
• Set the necessary parameters in meta tags
• Set a 301 redirect
• Use rel=canonical
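Each of those fixes is essentially a one-liner; a sketch with hypothetical parameter names and paths:

    # robots.txt - keep crawlers out of parameterised duplicate URLs
    User-agent: *
    Disallow: /*?sessionid=
    Disallow: /*?sort=

    <!-- meta robots tag on a page that should not be indexed -->
    <meta name="robots" content="noindex, follow">

    <!-- rel=canonical pointing at the preferred version of the page -->
    <link rel="canonical" href="https://website.com/category/product">

    # Apache .htaccess - 301 the duplicate to the preferred URL
    Redirect 301 /old-duplicate-page /category/product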
Technical Errors
Why?
Error responses cause your crawl budget to be reduced.
Technical Errors
How?
Find problem pages in Google Search Console
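Beyond Search Console, server logs show the same errors from the crawler's point of view (see note #12 in the Editor's Notes). A Python sketch that counts Googlebot requests ending in 4xx/5xx; the log path and the Apache combined log format are assumptions:

    import re
    from collections import Counter

    # Hypothetical access log location, Apache combined log format
    LOG = "/var/log/apache2/access.log"
    hit = re.compile(r'"[A-Z]+ (?P<path>\S+) HTTP/[\d.]+" (?P<status>\d{3})')

    errors = Counter()
    with open(LOG) as f:
        for line in f:
            if "Googlebot" not in line:  # only crawler requests matter here
                continue
            m = hit.search(line)
            if m and m.group("status")[0] in "45":  # 4xx and 5xx responses
                errors[(m.group("status"), m.group("path"))] += 1

    # Most frequently hit error URLs first
    for (status, path), count in errors.most_common(20):
        print(status, count, path)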
Orphan Pages
Why?
Search engine crawlers want to crawl a complete website.
Orphan Pages
How?
• Aim for few to no orphan pages
• Remove or redirect orphan pages
• Apply a 'noindex' tag or a robots.txt disallow to orphan pages
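One way to spot orphan pages is to compare the URLs in your sitemap against the URLs reachable through internal links. A Python sketch, assuming a standard sitemap.xml plus a hypothetical crawled_urls.txt export from a site crawler:

    import xml.etree.ElementTree as ET

    # URLs the site says exist
    ns = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}
    tree = ET.parse("sitemap.xml")
    sitemap_urls = {loc.text.strip() for loc in tree.findall(".//sm:loc", ns)}

    # URLs actually reachable by following internal links
    with open("crawled_urls.txt") as f:
        linked_urls = {line.strip() for line in f if line.strip()}

    # In the sitemap but never linked internally: likely orphans
    for url in sorted(sitemap_urls - linked_urls):
        print(url)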
Redirects
Why?
Crawlers will skip long redirect chains and therefore index fewer pages.
Redirects
How?
• Limit the number of redirects and avoid chains
• If a page holds no equity, return a 404 instead of redirecting
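As an illustration, in an Apache .htaccess file (the paths are hypothetical): the first rule sends the old URL straight to its final destination in one hop, and the second retires a no-equity page outright:

    # One hop to the final destination - no chain
    Redirect 301 /old-page /new-page

    # Page with no equity: return 404 instead of redirecting
    Redirect 404 /expired-page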
What can we do?
• Improve site speed ✔
• Website hierarchy ✔
• Internal links ✔
• Limit duplicate content ✔
• Fix technical errors ✔
• Limit orphan pages ✔
• Deal with redirects ✔
Encourage Crawl
• Email campaign
• Social media
• Hero banner
Joanna Beech
Manager – Deloitte
jbeech@Deloitte.co.uk

Editor's Notes

  • #5 What do we mean when we say 'crawl budget' (yes, it is a made-up term by SEOs), and what can we do to improve it? Only two key takeaways, yes, but crawl budget is a complex issue and there is PLENTY to cover on how we can improve it.
  • #7 Put simply, the number of pages search engine bots crawl and index on a website within a given timeframe. There are many things that affect your 'crawl budget', such as site health and the number of links to your website.
  • #8 So when does crawl budget become an issue? Quite simply, when the number of pages exceeds your site's crawl budget! That looks like a new section not getting crawled, or new content that hasn't been indexed; it's important to remember the limit really does take effect. That said, crawl budget should only be an issue for larger websites, or medium-sized sites whose content changes rapidly (telecoms companies are a good example of this). If you don't fit into these camps and there are pages not being indexed, you need to look at your website health.
  • #10 Here is a list of areas that affect how effective the crawl of your website is; for me this list is ordered by importance, with site speed affecting your crawl the most. Others may disagree, but this is just what I have found. Now, I only have 20 minutes, so this will be a top-level insight into each area, so if anything sounds like something you have, please look into it further. Just message me if you want some recommendations on material. Anyway, first let's look at improving website speed.
  • #11 The slow speed of a website will negatively affect the crawl budget. Search engine crawlers use browsers to crawl and index webpages. If a crawler finds a website slow, it can crawl only a smaller number of pages.
  • #12 That's why it's important to fix all the errors flagged in your Google Search Console account. Google Search Console also offers webmasters the option to check the crawl stats of the web property. Crawl stats can help in keeping track of fluctuations in the crawl rate and coming up with quick fixes. Making the site faster, with a server that has a significantly lower response time, means faster crawling, indexing, and a better crawl budget. You can also check the server logs to do an in-depth analysis of how Googlebot treats your website. The server log files also help webmasters see where the crawl budget is getting wasted and come up with actionable solutions.
  • #17 A well-structured internal linking scheme will help crawlers find your pages more easily; make use of your high-traffic pages. A well-organized website with internal links pointing to important pages means a better crawl rate.
  • #18 We want websites to follow a pyramidal internal linking structure. This ensures that important pages buried deep inside the website get crawled, because they are linked from more important, high-traffic pages.
  • #20 The crawl budget of your website will be affected by duplicate content, as a search engine such as Google doesn't want the same content on multiple pages to get indexed. Google has stated that it doesn't want to waste resources crawling copied pages, internal search result pages, and tag pages.
  • #23 The technical glitches that most often send a website's crawl rate for a toss are 404, 410, and 500 errors. If the Google crawler encounters 5xx status codes while crawling a website, there is a chance that the crawl budget for the site is reduced considerably. When search engine crawlers ('crawlers' from here on out) consistently receive 5xx server errors while crawling your site, they'll want to avoid adding to the performance issues your site already has, so they'll dial back their crawl efforts. Essentially, this means they're lowering your site's crawl budget, which results in your content getting (re)indexed more slowly. Adding to this, a website with a lot of 404 errors and soft error pages may deter Googlebot from further crawling the site.
  • #26 Ensuring that there are no orphan pages is an important part of a website's crawl budget optimization efforts. Orphan pages make it hard for search engine crawlers to crawl a complete website, and this can lead to a limited crawl.
  • #29 Search engine crawlers will skip the site from the crawl, or end up indexing fewer pages, if they find a large number of long redirect chains.
  • #30 Even though it is practically impossible for larger websites to live without redirects, we should limit them as much as possible. Only redirect pages that hold equity (Page Authority); if a page has no equity, let it return a 404 instead.