4. The bottom line
● Updated guidelines recognise medium and problematic sites
● Increasing the crawl rate to better fit your site
● Reducing crawl bloat
6. Page is created ➡️ site requests that the page is crawled ➡️ Google crawls the page ➡️ indexing: does the page contain authoritative, high-quality content that matches user intent? Possible outcomes: page is indexed, crawled but not indexed, or discovered but not indexed.
7. Crawl Demand vs Crawl Capacity
Crawl demand:
● Eligible indexable URLs
● How often content is created/updated
● How much traffic you get to folders/pages on the site
Crawl capacity:
● How quickly content loads
● How many errors are on the site
● JS rendering
12. Uncover the coverage report
● Discovered not indexed
● Crawled not indexed
● Server errors
● Large number of canonical tags
● Large number of redirects
● Indexed, though blocked by robots.txt
24. Use internal links to prioritise
● Capitalise on link equity
● Review link hierarchy regularly
● Strengthen content hubs
● Be mindful of where links are pointing
25. Dynamically populate sitemaps
● Include only canonical URLs that return a 200 status (see the sketch after this list)
● Break your sitemaps up
● Don’t forget images and videos
● Minimise manual management
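A minimal sketch of a split sitemap index, generated by the CMS rather than maintained by hand (example.com and the file names are illustrative):
    <?xml version="1.0" encoding="UTF-8"?>
    <!-- sitemap_index.xml: break a large sitemap into smaller, dynamically generated files -->
    <sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
      <sitemap><loc>https://www.example.com/sitemap-products-1.xml</loc></sitemap>
      <sitemap><loc>https://www.example.com/sitemap-categories.xml</loc></sitemap>
      <sitemap><loc>https://www.example.com/sitemap-images.xml</loc></sitemap>
      <sitemap><loc>https://www.example.com/sitemap-videos.xml</loc></sitemap>
    </sitemapindex>
Each child sitemap should list only canonical URLs that return a 200 status.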
26. Use robots.txt
● Prevent Google from crawling specific taxonomies
● Highlight top directories
● Link to sitemaps
● Remember: blocked URLs can still be indexed (see the example below)
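An illustrative robots.txt along those lines (the paths and domain are placeholders):
    # Keep crawlers away from low-value taxonomy and filter URLs
    User-agent: *
    Disallow: /tag/
    Disallow: /*?sort=
    # Top directories stay crawlable (allowed by default; listed here for emphasis)
    Allow: /products/

    # Point crawlers at the sitemap index
    Sitemap: https://www.example.com/sitemap_index.xml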
28. Three Questions to ask yourself
1. What are our KPIs?
2. How do we define user intent?
3. What is our web structure?
Credit: https://www.screamingfrog.co.uk/site-architecture-crawl-visualisations/
29. Low-quality (low-traffic) content
If it doesn’t work, can we cut it?
● Are they high intent pages?
● Do they have quality backlinks?
If the answer is no, what is the value in holding on to pages that aren’t working?
30. Duplication & Cannibalisation
● Canonicals ➡️ Do we need both sets of content?
● Bad taxonomy ➡️ Clean up the site structure
● Internal competition ➡️ Use GSC & keyword mapping
31. Faceted navigation & parameters
● Do not deindex all parameters as a rule
● Are top level parameters valuable?
● Are static alternatives available?
● Is link equity and information architecture protected?
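One common pattern, sketched with hypothetical URLs: keep genuinely useful top-level facets crawlable, and canonicalise low-value parameter combinations to a static alternative.
    <!-- On /shoes?colour=red&sort=price-asc, point to the equivalent static page -->
    <link rel="canonical" href="https://www.example.com/shoes/red/">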
32. Redirects & 404s
● Redirects used for out-of-stock (OOS) products
● Historical build-up of redirects or 404s
● Embedded links will continue to be crawled
● Optimise the user journey, not just site health
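For example, an out-of-stock product can be 301-redirected to its parent category rather than left to 404 (an nginx sketch; the URLs are illustrative):
    # Permanently redirect a discontinued product to its parent category
    location = /products/blue-widget {
        return 301 /products/widgets/;
    }
Point embedded internal links at the final destination too, so Google isn’t crawling the old hop indefinitely.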
34. Noindex directive - if the page serves a purpose in the user journey, the noindex meta tag is probably the best method, but it does not take effect immediately.
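The tag itself is a single line in the page head (an equivalent X-Robots-Tag HTTP response header works for non-HTML files); Google only drops the page once it has been recrawled:
    <meta name="robots" content="noindex">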
35. Blocking via robots.txt - this is a great way of keeping down crawl requests, but it doesn’t stop indexation.
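A sketch of the pitfall (the path is hypothetical): a disallowed URL can still be indexed from external links, and any noindex tag on it will never be seen because the page is no longer crawled.
    User-agent: *
    # Stops crawling of /filters/ URLs - it does not stop them being indexed
    Disallow: /filters/
    # A noindex tag on these pages cannot be read once crawling is blocked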
36. 410 - Removing URLs that do not play a part in the user journey or provide value is a great way of dealing with crawl bloat.
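A minimal server-side sketch (nginx; the path is illustrative):
    # Serve 410 Gone for a retired section that no longer provides value
    location /discontinued/ {
        return 410;
    }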
38. Ensure Content is Accessible
If you have a large JS web app, ensure that Google does not have to allocate more server resources than essential to render important content (see the sketch below).
Credit: https://nextjs.org/learn/basics/data-fetching/two-forms
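A minimal sketch using Next.js static generation, one of the two data-fetching forms in the credited guide (getProducts() is a stand-in for a real CMS or API call): the HTML is pre-rendered at build time, so Googlebot receives the content without a heavy client-side rendering step.
    // pages/products.js - Static Generation: the page is rendered to HTML at build time
    async function getProducts() {
      // Hypothetical data source; in practice this would call a CMS or internal API
      return [
        { id: 1, name: 'Blue widget' },
        { id: 2, name: 'Red widget' },
      ];
    }

    export async function getStaticProps() {
      return { props: { products: await getProducts() } };
    }

    export default function Products({ products }) {
      return (
        <ul>
          {products.map((p) => (
            <li key={p.id}>{p.name}</li>
          ))}
        </ul>
      );
    }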
39. Updating & Creating New Content Regularly
Creating fresh content that meets Google’s E-E-A-T quality rater guidelines will help increase crawl demand for your site.
40. Improving Page Speed and Performance
● Invest in servers
● Improve page performance metrics
● Limit errors such as redirect chains and loops (quick check below)
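A quick way to spot redirect chains and loops (curl is standard; the URL is a placeholder) - follow every hop and check how many responses come back before the final 200:
    # Show each hop in the redirect chain for a single URL
    curl -sIL https://www.example.com/old-page | grep -iE "^(HTTP|location)"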
41. Crawl Budget: A Happy Ending
1. Identifying crawl rate
2. Identifying crawl issues
3. Optimising the crawl budget
42. THANK YOU
Slideshare.Net/SallyRaymer2
@salraym /sally-r-seo/
Some great resources:
Google Documentation: Managing Crawl Budget
Verifying Google Bot for log file analysis
Screaming Frog’s guide to log file analysis
How to identify keyword cannibalisation
SEJ internal link optimisation checklist for enterprise sites
McKinsey Consumer Decision Journey
McKinsey customer touchpoints