How To Tackle Crawling Large eCommerce Sites
● Maria Camanes
AUTHORITAS
● SEO Jo Blogs - Growth Marketer
● Carrie Shepherd - Marketing Executive
Maria Camanes, Senior SEO Consultant
● Over 6 years in SEO. Now a Senior SEO Consultant at Builtvisible,
which I joined 3 years ago
● Passionate about the technical side of SEO and specialised in site
speed optimisation and ecommerce SEO
● Work across a variety of accounts but mostly ecommerce sites
● Occasional speaker and regular trainer at BrightonSEO
● Twitter @mariacamanes
Common issues:
• A missing or wrongly implemented product retirement strategy can – and will – have a
negative impact on any ecommerce site’s organic performance
• Discontinued or temporarily unavailable products can result in large quantities of 404s,
broken links and empty category pages (thin content)
• Broken links are harmful for all types of sites, but they are more likely to occur on an
ecommerce site
• Displaying a 404 or empty page to your beloved customers results not only in bad UX but also in
large quantities of link equity being lost
Today we’ll focus on how to find out-of-stock products as well as thin category pages and,
as these often occur in large quantities, how to deal with them at scale.
Maria’s tips on crawling large ecommerce sites
For example:
• This product page has 91 backlinks from 28 different referring domains. The site has a
number of out-of-stock products with a significant number of backlinks
• As a result, it’s quite common to find large numbers of out-of-stock product pages from a
single site indexed by search engines
Tip #1:
Crawl your site to find out of stock products at scale
How to do it:
Use the ‘Custom search’ feature in Screaming Frog to find out-of-stock pages that return a 200 status code
(these won’t be picked up by a standard crawl or GSC)
• Step 1: find the identifiable out-of-stock copy on your product page. Using Amazon as an example, their
identifiable out-of-stock copy is “Currently unavailable”
• Step 2: copy and paste this into Screaming Frog’s ‘Custom Search’ feature and run a crawl
• Step 3: the crawl will return all of the product pages that contain the out-of-stock string. Don’t forget to
manually QA the results for any errors
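The same check can be sketched outside of Screaming Frog. The snippet below is a minimal illustration, not the tool's own implementation: it assumes you already have the status code and HTML for each URL, and the `CrawledPage` record, the `find_out_of_stock` helper, and the sample URLs are all hypothetical names made up for this example. The match is case-insensitive by choice.

```python
from typing import Iterable, NamedTuple


class CrawledPage(NamedTuple):
    """One crawled URL; a hypothetical record type for this sketch."""
    url: str
    status: int
    html: str


def find_out_of_stock(pages: Iterable[CrawledPage], marker: str) -> list[str]:
    """Return URLs that respond with 200 but still contain the out-of-stock copy."""
    needle = marker.casefold()
    return [
        page.url
        for page in pages
        if page.status == 200 and needle in page.html.casefold()
    ]


# Made-up sample data standing in for real crawl output.
sample = [
    CrawledPage("https://www.example.com/products/a", 200, "<p>Currently unavailable</p>"),
    CrawledPage("https://www.example.com/products/b", 200, "<p>In stock</p>"),
    CrawledPage("https://www.example.com/products/c", 404, "<p>Currently unavailable</p>"),
]
flagged = find_out_of_stock(sample, "Currently unavailable")
print(flagged)
```

Only the first sample URL is flagged: the second has no out-of-stock copy, and the third already returns a 404, so a standard crawl would surface it anyway.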
Tip #2:
Apply this to category pages to find empty PLPs
• You can use the same process to find product listing pages (PLPs) that are empty (meaning they
have no products)
• Just paste the ‘no products’ identifier into Screaming Frog’s ‘Custom search’ feature, in the same way
we did for out-of-stock products
Tip #3:
Limit your crawl to include/exclude certain URLs
If your site is too big and you are running into memory allocation issues on your desktop, you
can limit your crawl to only include category URLs, or exclude product URLs, by using the
tool’s ‘include’ or ‘exclude’ features (e.g. exclude https://www.example.com/products/.*)
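To see what an exclude pattern like the one above would filter out, here is a small sketch using Python's `re` module. It assumes the pattern has to match the whole URL, as the trailing `.*` in the example suggests; how Screaming Frog itself applies the pattern is not confirmed here. The `is_excluded` helper and the sample URLs are hypothetical.

```python
import re

# Pattern taken from the example above; the unescaped dots are kept as written.
EXCLUDE_PATTERNS = [r"https://www.example.com/products/.*"]


def is_excluded(url: str, patterns: list[str]) -> bool:
    """True if any exclude pattern matches the whole URL."""
    return any(re.fullmatch(pattern, url) for pattern in patterns)


# Made-up URLs for illustration.
urls = [
    "https://www.example.com/products/blue-shirt",
    "https://www.example.com/category/shirts",
]
to_crawl = [url for url in urls if not is_excluded(url, EXCLUDE_PATTERNS)]
print(to_crawl)
```

Testing a pattern against a handful of known URLs like this before starting a long crawl is a cheap way to catch a pattern that excludes too much or too little.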
Common issues:
• Thin category pages with limited stock are also a source of bad UX
• They result in lost sales and, when this happens at scale, can have a significant
impact on revenue (not only on SEO)
• They put the site at risk of algorithm penalties
Tip #4:
Use the ‘Custom extraction’ tool to find thin PLPs
How to do it:
Taking this category page as an example
• Step 1: fetch the contents of any container with class="styleCount" by entering this XPath:
//*[contains(@class, 'styleCount')]
• Step 2: when the crawl finishes, go to the ‘Custom’ tab at the top of the tool to see the extracted values
*Note: if your page doesn’t have a container with the number of products available, you can still count the number
of elements on a page by using the count function: count(//div[@class="offer__content"])
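The counting approach in the note can also be sketched in plain Python. This is an illustration under stated assumptions rather than the method used in the talk: it swaps XPath for the standard library's `HTMLParser`, mirrors `@class="offer__content"` with an exact match on the class attribute, and uses a hypothetical `THIN_THRESHOLD` cut-off. The `ClassCounter` and `count_products` names and the sample pages are made up for the example.

```python
from html.parser import HTMLParser


class ClassCounter(HTMLParser):
    """Counts start tags of a given name whose class attribute equals class_name exactly."""

    def __init__(self, tag: str, class_name: str) -> None:
        super().__init__()
        self.tag = tag
        self.class_name = class_name
        self.count = 0

    def handle_starttag(self, tag, attrs):
        if tag == self.tag and dict(attrs).get("class") == self.class_name:
            self.count += 1


def count_products(html: str, tag: str = "div", class_name: str = "offer__content") -> int:
    """Rough equivalent of count(//div[@class="offer__content"]) for one page."""
    counter = ClassCounter(tag, class_name)
    counter.feed(html)
    return counter.count


THIN_THRESHOLD = 3  # hypothetical cut-off; pick one that suits your catalogue

# Made-up category pages standing in for crawled HTML.
pages = {
    "https://www.example.com/category/shirts": '<div class="offer__content"></div>' * 12,
    "https://www.example.com/category/hats": '<div class="offer__content"></div>' * 2,
    "https://www.example.com/category/belts": "<p>No products found</p>",
}
thin = [url for url, html in pages.items() if count_products(html) < THIN_THRESHOLD]
print(thin)
```

With this sample data the hats and belts categories fall below the cut-off, while the shirts category, with 12 matching elements, does not.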
Tip #5:
Learn more about how to use XPath for SEO
● If you want to learn more about how to use XPath for SEO purposes, you can read this
guide: https://bit.ly/3aeXsX0
● To learn how to extract other elements of your category pages, such as titles and headings,
see this article: https://bit.ly/2VveYRm
● For more details on everything I’ve covered, see: https://bit.ly/2VwMsim
Thank you - over to Q and A
● Great tips from Maria
● Maria Camanes @MariaCamanes
“Technical SEO”
Friday 17th April 2020 @ 4 p.m. SEO Advice, tea and cake with...
● Serena Pearson
● Franco Valentino
● Paul Lovell

How To Tackle Crawling Large eCommerce Sites - 'Tea-Time SEO' Series of Daily SEO Live Talks
