Multi-hop site redirects
These should all redirect DIRECTLY to the same page on the designated site. In most cases they do not, which wastes both users' and bots' time.
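For example (all URLs hypothetical), a hop-by-hop chain like
  http://example.com/widget -> https://example.com/widget -> https://www.example.com/widget
costs an extra request per hop; ideally the first URL 301s straight to the final one:
  http://example.com/widget -> https://www.example.com/widget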
Screaming Frog crawled 168,000 URLs overnight and killed my computer. It still said it was only 26% done!
● 20,000 were internal
● Many were external
○ CDN images
○ Resource files with timestamps (cache busting that creates endless unique URLs)
○ bat.bing.com calls, which don't seem to be disallowed?
○ Facebook/YouTube links
The t parameter
Action and pagination URLs include a "t" parameter in their links; it appears to be a random number generated for each page load.
This creates a spider trap with an infinite number of URLs to crawl.
Make sure these URLs are controlled.
Option: noindex, then disallow? (A sketch of the two-step follows.)
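A minimal sketch of that two-step option, assuming the parameter really is named t (the patterns below are assumptions, not taken from the site). The order matters: a disallowed page can never be seen to be noindexed, so the noindex must come first.

Step 1 - add a noindex to the affected pages and wait for them to drop out of the index:
  <meta name="robots" content="noindex">

Step 2 - once deindexed, stop the crawl waste with robots.txt wildcards:
  User-agent: *
  Disallow: /*?t=
  Disallow: /*&t=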
Action URLs (plus the t parameter issue):
● Keep them out of the crawl and index.
● Noindex, then later disallow?
● Consider hiding them from crawling altogether (don't use href).
Pagination URLs:
● We want these in the index, to help Google discover the pages they link to.
● Fix the t parameter.
● Use rel prev/next to indicate they are related (see the sketch below).
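A minimal sketch for page 2 of a category (URLs hypothetical), placed in the <head> of the page:
  <link rel="prev" href="https://www.example.com/brakes?page=1">
  <link rel="next" href="https://www.example.com/brakes?page=3">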
Duplicate URLs with postfixed underscores (_)
Product links sometimes add an _ to the product's URL. This causes two versions of every product to be crawled and indexed.
The issue can be recursive: a single _ URL can generate double __ URLs, and so on.
Google is showing one of these _ URLs in its search results.
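One hedged way to kill the whole recursive family in one go is a server-side redirect that strips trailing underscores; a sketch for an Apache .htaccess (other servers have equivalents):
  RewriteEngine On
  # 301 any path ending in one or more underscores to the clean version
  RewriteRule ^(.*?)_+$ /$1 [R=301,L]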
Screaming Frog picked up some cases where a URL would 302 redirect and add a session parameter to the URL.
Session parameters are another way to cause infinite crawl paths: every request from Googlebot is a new session. Investigate this.
Ask Questions and the rdm parameter
Screaming Frog, with rendering enabled, found many URLs with rdm parameters.
I think this is a false capture, but it is worth checking Google Search Console's URL Parameters tool and the crawling/indexing reports to see if it causes issues.
Pages you don’t want crawled or indexed
Some pages have no value in the search results, so they are best kept out of crawling and indexing. Googlebot checks out every URL it finds in an <a href="">. E.g.
● "Add a review" links
● Login links
Stop bots wasting their time with these URLs:
● Don’t make them links. E.g. use onclick.
● Noindex or disallow the destination pages.
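A minimal sketch of both ideas (the markup and URL are hypothetical). The button version carries no href for Googlebot to collect:
  <button type="button" onclick="window.location.href='/product/123/add-review'">
    Add a review
  </button>
And on the destination page itself:
  <meta name="robots" content="noindex">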
Disallowing almost everyone from crawling
Maybe this is because of the spider traps causing strain on your server?
Be very careful, as you may be blocking a spider that is of use to you.
And it only works with nice bots: robots.txt is a request, not an enforcement mechanism, so bad bots simply ignore it.
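For illustration, the "almost everyone" pattern is usually an allow-list like this sketch; any useful crawler you forget to list is shut out along with the bad ones:
  User-agent: Googlebot
  Disallow:

  User-agent: Bingbot
  Disallow:

  User-agent: *
  Disallow: /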
No canonical tags
Canonical tags can help control these duplicate and infinite-crawl issues by bringing the bots back to one main URL for each piece of content.
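A minimal sketch (URL hypothetical): every duplicate variant, the _ versions, the t-parameter versions, the session versions, would carry the same tag pointing at the one clean URL:
  <link rel="canonical" href="https://www.example.com/product/front-brake-pads">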
Try to use clean URLs
WARNING: I’m not suggesting altering existing URLs.
Some characters in URLs are troublesome. It's best to stick to a few characters known to be safe:
● a-z and A-Z (Some people prefer all lower case to avoid other issues)
● Dashes (-) and Underscores (_). Dashes are best.
The most troublesome characters are spaces (which become + or %20), periods (.), and anything that gets encoded on the fly.
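For example (hypothetical paths):
  Troublesome: /Brake Pads (v2.5).html  (becomes /Brake%20Pads%20(v2.5).html once encoded)
  Safer:       /brake-pads-v2-5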
Good for mobile and speed checks
● Compress images
● Compress resources
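For resources, compression is usually a one-line server setting; a sketch assuming an Apache front end with mod_deflate (nginx and IIS have equivalents):
  AddOutputFilterByType DEFLATE text/html text/css application/javascript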
The Usual eCommerce Suspects
● No home page content
● No content on category pages (that rank and make you money)
● Thin or Duplicated product descriptions
● Faceted navigation indexing
● Duplicate or near duplicate product pages
● Almost perfect Product/Review structured data (the brand is missing its name)
● Add rel prev/next to paginated pages
● Breadcrumb structured data is missing the URL (id) for each step (a sketch follows at the end of this section)
● Pages without breadcrumbs output empty breadcrumb markup (e.g. the home page)
● I could not find Organization markup
Check Search Console's Structured Data report, and its search analytics set to "Rich results", to see how they are performing.
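For the breadcrumb gap, a minimal JSON-LD sketch with the item URL (id) present at each step; all names and URLs here are hypothetical:
  <script type="application/ld+json">
  {
    "@context": "https://schema.org",
    "@type": "BreadcrumbList",
    "itemListElement": [
      {"@type": "ListItem", "position": 1, "name": "Home", "item": "https://www.example.com/"},
      {"@type": "ListItem", "position": 2, "name": "Brakes", "item": "https://www.example.com/brakes/"},
      {"@type": "ListItem", "position": 3, "name": "Brake Pads", "item": "https://www.example.com/brakes/pads/"}
    ]
  }
  </script>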
Stuff I liked
● The parts finder on the home page (analyse it, and possibly make more of it)
● Banner is not too tall
● Trust messages (address, ABN, Terms, Payment methods)
● Filters (If they were taken out of the crawl)
● Detailed tracking
○ Google Analytics Enhanced Ecommerce
○ Google Ads Dynamic Remarketing and Conversion tracking
○ Facebook Pixel
Learn from the New Google Search Console
● URL Parameters tool - how badly are those parameters affecting the crawl?
● URL inspection tool - what does Google think of a specific URL?
● Index coverage reports - why are URLs excluded from the crawl and index?
● Performance report - which pages are bringing you search traffic, and why?
● Which valuable pages stopped working, and what happened?
Learn from Google Analytics
● Which landing pages from search are making you money? Improve them. (Home, categories, products)
● Which valuable pages stopped working for you, and why? Can they be fixed? (Moved, removed, changed)
Priorities
● Fix spider traps and crawling/indexing issues.
● Add more content to your valuable pages. Help them along.
● Fix structured data issues.
● Do an SEMrush audit. Understand the issues and fix those that will benefit you.
Web Site Advantage
Google Webmaster Central
● Big banner
● Blog: enable clicking on post headings
● No Structured Data found (LocalBusiness, Services, BlogPostings…)
● Move Capabilities to their own page?
● The Work page is slow. Reduce image sizes to match how they are displayed.
● The Work page and several other base pages are noindexed?
● The Work page has a rel next pointing to another page with the same content
● Using gtag plus ga three times!!! A fifth tracker points to a different property.
● Big banner
● The not-secure (HTTP) site is live (its canonicals point to the secure version)
● Links on the secure site point to not-secure pages
● The non-www host redirects to Default.asp
● The home page meta description is repeated on many pages
● Articles all use the same title
● Use of old Google Analytics (ga) and very old Google Analytics (gat)?
● Support subdomain is blocking resources
● “Next Page” is a POST-based form, so crawlers won't follow it
● Very basic structured data (Product)