Checking Google Index status at scale with Node.js
Jose Luis Hernando
@jlhernando #BrightonSEO
Senior Technical SEO Consultant
Today’s agenda
1. Why it’s important to know your website’s indexing status
2. The challenge of extracting this data
3. Getting the data with Node.js – Live Demo!
4. Using this data for your SEO strategy
Why is it important?
Reason #1
Not in the Index => Not in the SERPs
Icons from Google, Flaticon & Sitecheckerpro
Reason #2
Google evaluates site quality based on indexed pages
Sources:
Google Only Can Judge Site Quality Based On Pages They Index – Barry Schwartz (Search Engine Roundtable)
English Google Webmaster Central office-hours hangout – Google Webmasters YouTube Channel
Low Quality Pages
• Uncontrolled Faceted Navigation URLs
• Unsupervised User Generated Content
• Indexable Non-Canonical URLs
High Quality Pages
• Category Pages
• Editorial Pages
• Canonical Product Pages
Reason #3
Inefficient use of Google’s resources
https://website.com/category-one/
HTML CSS JS
/category-one/?color=red
/category-one/?color=blue
/category-one/?color=red&blue
…
∞
How much of your site is Googlebot crawling?
Site size (pages) | Avg. Crawl Ratio (%) | Avg. Active Ratio (%)
1-10k | 71.7 | 45.3
10k-100k | 54.3 | 30.2
100k-1M | 41.7 | 15.1
1M+ | 34.4 | 10.1
Crawl Ratio: percentage of pages crawled by Google in 30 days.
Active Ratio: percentage of pages that have generated at least one organic visit in 30 days.
Source: How Does Google Crawl the Web? – (Annabelle Bouard & Dimitri Brunel – Botify)
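The two ratios defined above are simple percentages, so they can be sketched in a few lines of Node.js. This is a minimal illustration, assuming the denominator is the number of pages known on the site. The `ratioPercent` helper and the sample figures are hypothetical and are not taken from the Botify study.

```javascript
// Hypothetical helper: share of `part` in `total`, as a percentage with one decimal.
function ratioPercent(part, total) {
  if (total <= 0) throw new RangeError('total must be positive');
  return Math.round((part / total) * 1000) / 10;
}

// Illustrative numbers only (assumed, not from the study).
const site = {
  knownPages: 50000,      // pages known on the site
  crawledIn30Days: 27150, // pages Googlebot requested in the last 30 days
  activeIn30Days: 15100,  // pages with at least one organic visit in 30 days
};

const crawlRatio = ratioPercent(site.crawledIn30Days, site.knownPages);
const activeRatio = ratioPercent(site.activeIn30Days, site.knownPages);

console.log(`Crawl ratio: ${crawlRatio}%, active ratio: ${activeRatio}%`);
```

The crawled count would typically come from server logs and the active count from analytics data; both sources are assumptions here.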
The challenge: extracting this data
• Googlebot’s crawling behaviour doesn’t determine indexing status
• You rely on partial and sometimes inaccurate data points:
  • site: & inurl: operators
  • GSC Indexing reports:
    • URL Inspection Tool (< 200 URLs / day)
    • Coverage Reports (< 1,000 rows / report)
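To put the limits above in perspective, here is a small arithmetic sketch. It treats 200 URLs per day as the best case for the URL Inspection Tool and uses the 111,772 sitemap URLs from the sitemap use case in this deck as the workload. The `daysToInspect` function name is hypothetical.

```javascript
// Hypothetical helper: minimum number of days needed to inspect every URL
// when at most `dailyLimit` URLs can be checked per day.
function daysToInspect(urlCount, dailyLimit) {
  if (dailyLimit <= 0) throw new RangeError('dailyLimit must be positive');
  return Math.ceil(urlCount / dailyLimit);
}

const sitemapUrls = 111772; // sitemap size quoted in the sitemap health check
const inspectionLimit = 200; // upper bound stated on the slide (< 200 URLs / day)

const days = daysToInspect(sitemapUrls, inspectionLimit);
console.log(`At least ${days} days to inspect ${sitemapUrls} URLs manually`);
```

Because the slide gives the limit as "fewer than 200", the result is a lower bound on the time required.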
Proxy metrics != Accurate data
If you can’t find it, build it
{Live demo}
bit.ly/google-index-checker-script
Quick FYI
Using the following method goes against Google’s Terms of Service, as it automatically sends search queries to Google Search.
Our script outperforms every other method available
How can you use Google index data?
• Identify inefficient use of crawl budget
• Error Prioritisation
• Identify holes in your architecture
• Check for pages from your site that should be indexed but are not.
• Find pages that should not be indexed but are indexed.
• Detect pages that used to exist and now return an error (4xx) but are still indexed.
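The three checks listed above can be expressed as a small classification step once index status is known for each URL. The sketch below is illustrative only: the record shape (`shouldBeIndexed`, `isIndexed`, `statusCode`) and the bucket names are assumptions, not the output format of the script shown in the talk.

```javascript
// Hypothetical record shape: { url, shouldBeIndexed, isIndexed, statusCode }.
// Returns the bucket a URL falls into, following the three checks in the list above.
function classify(record) {
  const { shouldBeIndexed, isIndexed, statusCode } = record;
  // Pages that now return a client error (4xx) but are still in the index.
  if (isIndexed && statusCode >= 400 && statusCode < 500) return 'error-still-indexed';
  // Pages that should be indexed but are not.
  if (shouldBeIndexed && !isIndexed) return 'missing-from-index';
  // Pages that should not be indexed but are.
  if (!shouldBeIndexed && isIndexed) return 'indexed-but-unwanted';
  return 'ok';
}

// Illustrative sample data (assumed).
const records = [
  { url: 'https://example.com/category/', shouldBeIndexed: true, isIndexed: false, statusCode: 200 },
  { url: 'https://example.com/category/?color=red', shouldBeIndexed: false, isIndexed: true, statusCode: 200 },
  { url: 'https://example.com/old-product/', shouldBeIndexed: false, isIndexed: true, statusCode: 404 },
];

for (const record of records) {
  console.log(record.url, '->', classify(record));
}
```

The 4xx check runs first so that an indexed error page is reported as an error rather than as a generic unwanted page.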
Use case #1
Sitemap Health Check
How many URLs from your XML sitemap are indexed?
Sitemaps = 111,772 URLs
Google Index Status of URLs from Sitemap, by status code:
Status code | URLs | Indexed | Not Indexed | % Indexed
200 | 81,688 | 74,223 | 7,465 | 91%
404 | 29,969 | 6,268 | 23,701 | 21%
301 | 365 | 16 | 349 | 4%
Inspired by Data Secrets of the Index Coverage Report – AJ Kohn
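A breakdown like the one in this use case can be produced by grouping sitemap URLs by status code and counting how many are indexed. The following is a minimal sketch, assuming each row already carries a status code and an index flag. The row shape and the `summariseByStatus` name are hypothetical, and the sample rows are made up for illustration.

```javascript
// Hypothetical row shape: { url, statusCode, indexed }.
// Groups rows by status code and reports indexed vs not indexed counts.
function summariseByStatus(rows) {
  const summary = new Map();
  for (const { statusCode, indexed } of rows) {
    const entry = summary.get(statusCode) ?? { total: 0, indexed: 0 };
    entry.total += 1;
    if (indexed) entry.indexed += 1;
    summary.set(statusCode, entry);
  }
  return [...summary].map(([statusCode, { total, indexed }]) => ({
    statusCode,
    total,
    indexed,
    notIndexed: total - indexed,
    percentIndexed: Math.round((indexed / total) * 100),
  }));
}

// Illustrative rows only (assumed data).
const rows = [
  { url: 'https://example.com/a/', statusCode: 200, indexed: true },
  { url: 'https://example.com/b/', statusCode: 200, indexed: true },
  { url: 'https://example.com/c/', statusCode: 200, indexed: true },
  { url: 'https://example.com/d/', statusCode: 200, indexed: false },
  { url: 'https://example.com/e/', statusCode: 404, indexed: true },
  { url: 'https://example.com/f/', statusCode: 404, indexed: false },
  { url: 'https://example.com/g/', statusCode: 301, indexed: false },
];

const report = summariseByStatus(rows);
console.table(report);
```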
Sitemap Health Check
Next Steps
1) Identify if these URLs are important to your site’s bottom line
2) Check if a pool of these URLs has issues in GSC’s Index Coverage Report
3) Choose a tactic to improve the visibility of these URLs
4) Isolate the relevant URLs and modify the existing sitemap or create a new-sitemap.xml to monitor progress
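Step 4 can be sketched as a small generator that turns a list of isolated URLs into sitemap XML. This is one possible approach rather than the speaker's own tooling. The function names are hypothetical, while the `urlset` namespace is the one defined by the sitemaps.org protocol.

```javascript
// Escape the five characters that must be entity-encoded inside XML text.
function escapeXml(value) {
  return value
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&apos;');
}

// Hypothetical helper: build a minimal sitemap document from an array of URLs.
function buildSitemap(urls) {
  const entries = urls.map((url) => `  <url><loc>${escapeXml(url)}</loc></url>`);
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ...entries,
    '</urlset>',
  ].join('\n');
}

// Illustrative URLs (assumed).
const sitemapXml = buildSitemap([
  'https://example.com/category-one/',
  'https://example.com/search?q=shoes&page=2',
]);
console.log(sitemapXml);
```

The output can be saved as new-sitemap.xml and submitted in Search Console. A single sitemap file is limited to 50,000 URLs, so larger pools would need to be split across several files.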
Use case #2
Log File Analysis+
How many URLs with Googlebot hits are indexed?
• ~160k Googlebot hits to non-canonical URLs (/Uppercase/ vs /lowercase/)
• Identified if non-canonical URLs were indexed
• Identified if the referenced canonical URLs were indexed
Indexed Non-Canonical URLs Requested by Googlebot (Undisclosed Client): 35.8% Indexed, 64.2% Not Indexed
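The uppercase versus lowercase case in this analysis can be sketched as follows. The example assumes the canonical version of a URL is the same URL with a lowercased path, and that index status is available as a set of indexed URLs. Both are assumptions for illustration, and all function names are hypothetical.

```javascript
// Return the URL with its path lowercased (query string left untouched).
function toLowercasePath(rawUrl) {
  const url = new URL(rawUrl);
  url.pathname = url.pathname.toLowerCase();
  return url.toString();
}

// A URL is treated as non-canonical here if lowercasing its path changes it.
function isNonCanonical(rawUrl) {
  return toLowercasePath(rawUrl) !== new URL(rawUrl).toString();
}

// Count how many non-canonical URLs requested by Googlebot are indexed.
function indexedShare(requestedUrls, indexedUrls) {
  const nonCanonical = requestedUrls.filter(isNonCanonical);
  const indexed = nonCanonical.filter((url) => indexedUrls.has(url));
  return {
    nonCanonical: nonCanonical.length,
    indexed: indexed.length,
    percentIndexed: nonCanonical.length === 0
      ? 0
      : Math.round((indexed.length / nonCanonical.length) * 1000) / 10,
  };
}

// Illustrative data (assumed): URLs seen in Googlebot log hits, and indexed URLs.
const hits = [
  'https://example.com/Shoes/',
  'https://example.com/shoes/',
  'https://example.com/Bags/',
];
const indexedSet = new Set(['https://example.com/Shoes/', 'https://example.com/shoes/']);

const share = indexedShare(hits, indexedSet);
console.log(share);
```

Lowercasing the path also lowercases any percent-encoded sequences, so URLs that contain them may need separate handling.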
Log File Analysis+
Next Steps
1) Identify if the canonical tag is correctly placed
2) Identify if the root cause is internal linking, external linking or other
3) Consider redirecting non-canonical URLs to canonical URLs
4) Create a new-sitemap.xml with problematic URLs to encourage Googlebot to revisit those URLs and for monitoring purposes
Other use cases
Inform your SEO strategy
• Check real-time indexing (News sites, Offer sites, Job Boards)
• Check uncontrolled faceted navigation (Crawl budget optimisation)
• Check inactive product/category URLs (Site architecture improvements)
• Check old 4xx that are live now & haven't been deindexed yet (Recover organic opportunities)
Further reading
https://bit.ly/google-index-checks
https://bit.ly/gsc-index-coverage
The Google Index Checker script has opened a door to get useful, actionable data at scale for your sites.
Use it, and act on it.
Thank you.
builtvisible.com
Jose Luis Hernando
Senior Technical SEO Consultant
@jlhernando
Sources & additional reading
How Does Google Crawl the Web? – Annabelle Bouard & Dimitri Brunel (Botify)
English Google Webmaster Central office-hours hangout – Google Webmasters YouTube Channel
Google Only Can Judge Site Quality Based On Pages They Index – Barry Schwartz (Search Engine Roundtable)
Data Secrets of the Index Coverage Report – AJ Kohn (Blind Five Year Old)
How Google Search Works – Google Documentation
How Search organises information – Google Documentation
Our new search index: Caffeine – Carrie Grimes
When indexing goes wrong: how Google Search recovered from indexing issues & lessons learned since – Vincent Courson, Google Search Outreach
How Search Engines Work: Crawling, Indexing & Ranking – Moz
(Please) Stop Using Unsafe Characters in URLs – Jeff Starr

More Related Content

What's hot

Seo 101 in 2019
Seo 101 in 2019Seo 101 in 2019
Seo 101 in 2019
Angela Bergmann
 
SEO for website migrations - 53 SEO factors for a successful website relaunch
SEO for website migrations - 53 SEO factors for a successful website relaunchSEO for website migrations - 53 SEO factors for a successful website relaunch
SEO for website migrations - 53 SEO factors for a successful website relaunch
Eoghan Henn
 
The 30 Minute SEO Audit
The 30 Minute SEO AuditThe 30 Minute SEO Audit
The 30 Minute SEO Audit
BrightEdge
 
Headless SEO: Optimising Next Gen Sites | brightonSEO 2021
Headless SEO: Optimising Next Gen Sites | brightonSEO 2021Headless SEO: Optimising Next Gen Sites | brightonSEO 2021
Headless SEO: Optimising Next Gen Sites | brightonSEO 2021
Alex Wright
 
What a search engine can teach you about product sitemaps - BrightonSEO April...
What a search engine can teach you about product sitemaps - BrightonSEO April...What a search engine can teach you about product sitemaps - BrightonSEO April...
What a search engine can teach you about product sitemaps - BrightonSEO April...
Pricesearcher
 
Found vs. Chosen: How to Earn the Long Click with Content Hubs
Found vs. Chosen: How to Earn the Long Click with Content HubsFound vs. Chosen: How to Earn the Long Click with Content Hubs
Found vs. Chosen: How to Earn the Long Click with Content Hubs
Ronell Smith
 
Matching Keywords to Pages - Information Architecture
Matching Keywords to Pages - Information ArchitectureMatching Keywords to Pages - Information Architecture
Matching Keywords to Pages - Information Architecture
Dominic Woodman
 
SEO Audit Workshop : Frameworks , Techniques and Tools
SEO Audit Workshop : Frameworks , Techniques and Tools SEO Audit Workshop : Frameworks , Techniques and Tools
SEO Audit Workshop : Frameworks , Techniques and Tools
NEW MEDIA GURU
 
Technical SEO: How to Perform an SEO Audit (Step by Step Guide)
Technical SEO: How to Perform an SEO Audit (Step by Step Guide)Technical SEO: How to Perform an SEO Audit (Step by Step Guide)
Technical SEO: How to Perform an SEO Audit (Step by Step Guide)
Ryan Stewart
 
BrightonSEO 2019 - Crawl Budget is dead, please welcome Rendering Budget
BrightonSEO 2019 - Crawl Budget is dead, please welcome Rendering BudgetBrightonSEO 2019 - Crawl Budget is dead, please welcome Rendering Budget
BrightonSEO 2019 - Crawl Budget is dead, please welcome Rendering Budget
Botify
 
How Does Google Crawl the Web?
How Does Google Crawl the Web?How Does Google Crawl the Web?
How Does Google Crawl the Web?
Botify
 
Mobile-First Index: A Data-Driven Analysis & Discussion
Mobile-First Index:  A Data-Driven Analysis & DiscussionMobile-First Index:  A Data-Driven Analysis & Discussion
Mobile-First Index: A Data-Driven Analysis & Discussion
Botify
 
How To Successfully Undertake Site Migrations - Search London 2017
How To Successfully Undertake Site Migrations - Search London 2017How To Successfully Undertake Site Migrations - Search London 2017
How To Successfully Undertake Site Migrations - Search London 2017
Chloe Bodard
 
Building an SEO Exponential Growth model by closing your content gaps
Building an SEO Exponential Growth model by closing your content gapsBuilding an SEO Exponential Growth model by closing your content gaps
Building an SEO Exponential Growth model by closing your content gaps
Razvan Gavrilas
 
How to Succeed in B2B SEO
How to Succeed in B2B SEOHow to Succeed in B2B SEO
How to Succeed in B2B SEO
Dominic Woodman
 
Decrypt Google’s Behavior with Botify Log Analyzer
Decrypt Google’s Behavior with Botify Log AnalyzerDecrypt Google’s Behavior with Botify Log Analyzer
Decrypt Google’s Behavior with Botify Log Analyzer
Botify
 
Conflicting Website Signals & Confused Search Engines - Rachel Costello, Tech...
Conflicting Website Signals & Confused Search Engines - Rachel Costello, Tech...Conflicting Website Signals & Confused Search Engines - Rachel Costello, Tech...
Conflicting Website Signals & Confused Search Engines - Rachel Costello, Tech...
Rachel Costello
 
How SEO Has Changed (and what to do about it) - Adam Audette - RKG Summit 2013
How SEO Has Changed (and what to do about it) - Adam Audette - RKG Summit 2013How SEO Has Changed (and what to do about it) - Adam Audette - RKG Summit 2013
How SEO Has Changed (and what to do about it) - Adam Audette - RKG Summit 2013
Adam Audette
 
SEO for Enterprise: Stuff You Can Do Yourself!
SEO for Enterprise: Stuff You Can Do Yourself!SEO for Enterprise: Stuff You Can Do Yourself!
SEO for Enterprise: Stuff You Can Do Yourself!
Adam Audette
 

What's hot (19)

Seo 101 in 2019
Seo 101 in 2019Seo 101 in 2019
Seo 101 in 2019
 
SEO for website migrations - 53 SEO factors for a successful website relaunch
SEO for website migrations - 53 SEO factors for a successful website relaunchSEO for website migrations - 53 SEO factors for a successful website relaunch
SEO for website migrations - 53 SEO factors for a successful website relaunch
 
The 30 Minute SEO Audit
The 30 Minute SEO AuditThe 30 Minute SEO Audit
The 30 Minute SEO Audit
 
Headless SEO: Optimising Next Gen Sites | brightonSEO 2021
Headless SEO: Optimising Next Gen Sites | brightonSEO 2021Headless SEO: Optimising Next Gen Sites | brightonSEO 2021
Headless SEO: Optimising Next Gen Sites | brightonSEO 2021
 
What a search engine can teach you about product sitemaps - BrightonSEO April...
What a search engine can teach you about product sitemaps - BrightonSEO April...What a search engine can teach you about product sitemaps - BrightonSEO April...
What a search engine can teach you about product sitemaps - BrightonSEO April...
 
Found vs. Chosen: How to Earn the Long Click with Content Hubs
Found vs. Chosen: How to Earn the Long Click with Content HubsFound vs. Chosen: How to Earn the Long Click with Content Hubs
Found vs. Chosen: How to Earn the Long Click with Content Hubs
 
Matching Keywords to Pages - Information Architecture
Matching Keywords to Pages - Information ArchitectureMatching Keywords to Pages - Information Architecture
Matching Keywords to Pages - Information Architecture
 
SEO Audit Workshop : Frameworks , Techniques and Tools
SEO Audit Workshop : Frameworks , Techniques and Tools SEO Audit Workshop : Frameworks , Techniques and Tools
SEO Audit Workshop : Frameworks , Techniques and Tools
 
Technical SEO: How to Perform an SEO Audit (Step by Step Guide)
Technical SEO: How to Perform an SEO Audit (Step by Step Guide)Technical SEO: How to Perform an SEO Audit (Step by Step Guide)
Technical SEO: How to Perform an SEO Audit (Step by Step Guide)
 
BrightonSEO 2019 - Crawl Budget is dead, please welcome Rendering Budget
BrightonSEO 2019 - Crawl Budget is dead, please welcome Rendering BudgetBrightonSEO 2019 - Crawl Budget is dead, please welcome Rendering Budget
BrightonSEO 2019 - Crawl Budget is dead, please welcome Rendering Budget
 
How Does Google Crawl the Web?
How Does Google Crawl the Web?How Does Google Crawl the Web?
How Does Google Crawl the Web?
 
Mobile-First Index: A Data-Driven Analysis & Discussion
Mobile-First Index:  A Data-Driven Analysis & DiscussionMobile-First Index:  A Data-Driven Analysis & Discussion
Mobile-First Index: A Data-Driven Analysis & Discussion
 
How To Successfully Undertake Site Migrations - Search London 2017
How To Successfully Undertake Site Migrations - Search London 2017How To Successfully Undertake Site Migrations - Search London 2017
How To Successfully Undertake Site Migrations - Search London 2017
 
Building an SEO Exponential Growth model by closing your content gaps
Building an SEO Exponential Growth model by closing your content gapsBuilding an SEO Exponential Growth model by closing your content gaps
Building an SEO Exponential Growth model by closing your content gaps
 
How to Succeed in B2B SEO
How to Succeed in B2B SEOHow to Succeed in B2B SEO
How to Succeed in B2B SEO
 
Decrypt Google’s Behavior with Botify Log Analyzer
Decrypt Google’s Behavior with Botify Log AnalyzerDecrypt Google’s Behavior with Botify Log Analyzer
Decrypt Google’s Behavior with Botify Log Analyzer
 
Conflicting Website Signals & Confused Search Engines - Rachel Costello, Tech...
Conflicting Website Signals & Confused Search Engines - Rachel Costello, Tech...Conflicting Website Signals & Confused Search Engines - Rachel Costello, Tech...
Conflicting Website Signals & Confused Search Engines - Rachel Costello, Tech...
 
How SEO Has Changed (and what to do about it) - Adam Audette - RKG Summit 2013
How SEO Has Changed (and what to do about it) - Adam Audette - RKG Summit 2013How SEO Has Changed (and what to do about it) - Adam Audette - RKG Summit 2013
How SEO Has Changed (and what to do about it) - Adam Audette - RKG Summit 2013
 
SEO for Enterprise: Stuff You Can Do Yourself!
SEO for Enterprise: Stuff You Can Do Yourself!SEO for Enterprise: Stuff You Can Do Yourself!
SEO for Enterprise: Stuff You Can Do Yourself!
 

Similar to Checking google index status at scale

Evaluating URLs at Scale
Evaluating URLs at ScaleEvaluating URLs at Scale
Evaluating URLs at Scale
BristolSEO
 
Crawl Budget: Everything you Need to Know
Crawl Budget: Everything you Need to KnowCrawl Budget: Everything you Need to Know
Crawl Budget: Everything you Need to Know
SallyR7
 
SEO for Ecommerce: A Comprehensive Guide
SEO for Ecommerce: A Comprehensive GuideSEO for Ecommerce: A Comprehensive Guide
SEO for Ecommerce: A Comprehensive Guide
Adam Audette
 
What is Google Search Console and What is it provide?
What is Google Search Console and What is it provide?What is Google Search Console and What is it provide?
What is Google Search Console and What is it provide?
riteshhsociall
 
Technical SEO - An Introduction to Core Aspects of Technical SEO Best-Practise
Technical SEO - An Introduction to Core Aspects of Technical SEO Best-PractiseTechnical SEO - An Introduction to Core Aspects of Technical SEO Best-Practise
Technical SEO - An Introduction to Core Aspects of Technical SEO Best-Practise
Erudite
 
33 Tactics to Engage and Retain More Customers - IRCE 2016
33 Tactics to Engage and Retain More Customers - IRCE 201633 Tactics to Engage and Retain More Customers - IRCE 2016
33 Tactics to Engage and Retain More Customers - IRCE 2016
Mark Ginsberg
 
Paul Duncan - Advanced Tracking & Enriched SERP Results via Google Tag Manager
Paul Duncan - Advanced Tracking & Enriched SERP Results via Google Tag ManagerPaul Duncan - Advanced Tracking & Enriched SERP Results via Google Tag Manager
Paul Duncan - Advanced Tracking & Enriched SERP Results via Google Tag Manager
Julia Grosman
 
Site Analysis
Site AnalysisSite Analysis
Site Analysis
deepak tiwari
 
Keeping Things Lean & Mean: Crawl Optimisation - Search Marketing Summit AU
Keeping Things Lean & Mean: Crawl Optimisation - Search Marketing Summit AUKeeping Things Lean & Mean: Crawl Optimisation - Search Marketing Summit AU
Keeping Things Lean & Mean: Crawl Optimisation - Search Marketing Summit AU
Jason Mun
 
33 Tactics to Engage and Retain More Customers- IRCE 2016
33 Tactics to Engage and Retain More Customers- IRCE 201633 Tactics to Engage and Retain More Customers- IRCE 2016
33 Tactics to Engage and Retain More Customers- IRCE 2016
Andrew Scarbrough
 
Technical SEO Checklist for Beginners
Technical SEO Checklist for BeginnersTechnical SEO Checklist for Beginners
Technical SEO Checklist for Beginners
BristolSEO
 
Site Migrations by Nik Ranger
 Site Migrations by Nik Ranger Site Migrations by Nik Ranger
Site Migrations by Nik Ranger
Anton Shulke
 
Raven Tools for Reporting, Analysis & Strategy Development
Raven Tools for Reporting, Analysis & Strategy DevelopmentRaven Tools for Reporting, Analysis & Strategy Development
Raven Tools for Reporting, Analysis & Strategy Development
BrettASnyder
 
Dc seo fin
Dc seo finDc seo fin
Dc seo fin
Anton Surov
 
Web Mining.pptx
Web Mining.pptxWeb Mining.pptx
Web Mining.pptx
ScrbifPt
 
Faceted Navigation: (Almost) Everyone is Doing it Wrong
Faceted Navigation: (Almost) Everyone is Doing it WrongFaceted Navigation: (Almost) Everyone is Doing it Wrong
Faceted Navigation: (Almost) Everyone is Doing it Wrong
Botify
 
Basic Search Engine Optimization Strategies
Basic Search Engine Optimization Strategies  Basic Search Engine Optimization Strategies
Basic Search Engine Optimization Strategies
Online Business Owners
 
Search engine optimization
Search engine optimizationSearch engine optimization
Search engine optimization
mds university ajmer
 
Introduction To SEO (SEARCH ENGINE OPTIMIZATION)- Learning Catalyst
Introduction To SEO (SEARCH ENGINE OPTIMIZATION)- Learning CatalystIntroduction To SEO (SEARCH ENGINE OPTIMIZATION)- Learning Catalyst
Introduction To SEO (SEARCH ENGINE OPTIMIZATION)- Learning Catalyst
Learning-Catalyst
 
Seo tutorial
Seo tutorialSeo tutorial
Seo tutorial
Online Net india
 

Similar to Checking google index status at scale (20)

Evaluating URLs at Scale
Evaluating URLs at ScaleEvaluating URLs at Scale
Evaluating URLs at Scale
 
Crawl Budget: Everything you Need to Know
Crawl Budget: Everything you Need to KnowCrawl Budget: Everything you Need to Know
Crawl Budget: Everything you Need to Know
 
SEO for Ecommerce: A Comprehensive Guide
SEO for Ecommerce: A Comprehensive GuideSEO for Ecommerce: A Comprehensive Guide
SEO for Ecommerce: A Comprehensive Guide
 
What is Google Search Console and What is it provide?
What is Google Search Console and What is it provide?What is Google Search Console and What is it provide?
What is Google Search Console and What is it provide?
 
Technical SEO - An Introduction to Core Aspects of Technical SEO Best-Practise
Technical SEO - An Introduction to Core Aspects of Technical SEO Best-PractiseTechnical SEO - An Introduction to Core Aspects of Technical SEO Best-Practise
Technical SEO - An Introduction to Core Aspects of Technical SEO Best-Practise
 
33 Tactics to Engage and Retain More Customers - IRCE 2016
33 Tactics to Engage and Retain More Customers - IRCE 201633 Tactics to Engage and Retain More Customers - IRCE 2016
33 Tactics to Engage and Retain More Customers - IRCE 2016
 
Paul Duncan - Advanced Tracking & Enriched SERP Results via Google Tag Manager
Paul Duncan - Advanced Tracking & Enriched SERP Results via Google Tag ManagerPaul Duncan - Advanced Tracking & Enriched SERP Results via Google Tag Manager
Paul Duncan - Advanced Tracking & Enriched SERP Results via Google Tag Manager
 
Site Analysis
Site AnalysisSite Analysis
Site Analysis
 
Keeping Things Lean & Mean: Crawl Optimisation - Search Marketing Summit AU
Keeping Things Lean & Mean: Crawl Optimisation - Search Marketing Summit AUKeeping Things Lean & Mean: Crawl Optimisation - Search Marketing Summit AU
Keeping Things Lean & Mean: Crawl Optimisation - Search Marketing Summit AU
 
33 Tactics to Engage and Retain More Customers- IRCE 2016
33 Tactics to Engage and Retain More Customers- IRCE 201633 Tactics to Engage and Retain More Customers- IRCE 2016
33 Tactics to Engage and Retain More Customers- IRCE 2016
 
Technical SEO Checklist for Beginners
Technical SEO Checklist for BeginnersTechnical SEO Checklist for Beginners
Technical SEO Checklist for Beginners
 
Site Migrations by Nik Ranger
 Site Migrations by Nik Ranger Site Migrations by Nik Ranger
Site Migrations by Nik Ranger
 
Raven Tools for Reporting, Analysis & Strategy Development
Raven Tools for Reporting, Analysis & Strategy DevelopmentRaven Tools for Reporting, Analysis & Strategy Development
Raven Tools for Reporting, Analysis & Strategy Development
 
Dc seo fin
Dc seo finDc seo fin
Dc seo fin
 
Web Mining.pptx
Web Mining.pptxWeb Mining.pptx
Web Mining.pptx
 
Faceted Navigation: (Almost) Everyone is Doing it Wrong
Faceted Navigation: (Almost) Everyone is Doing it WrongFaceted Navigation: (Almost) Everyone is Doing it Wrong
Faceted Navigation: (Almost) Everyone is Doing it Wrong
 
Basic Search Engine Optimization Strategies
Basic Search Engine Optimization Strategies  Basic Search Engine Optimization Strategies
Basic Search Engine Optimization Strategies
 
Search engine optimization
Search engine optimizationSearch engine optimization
Search engine optimization
 
Introduction To SEO (SEARCH ENGINE OPTIMIZATION)- Learning Catalyst
Introduction To SEO (SEARCH ENGINE OPTIMIZATION)- Learning CatalystIntroduction To SEO (SEARCH ENGINE OPTIMIZATION)- Learning Catalyst
Introduction To SEO (SEARCH ENGINE OPTIMIZATION)- Learning Catalyst
 
Seo tutorial
Seo tutorialSeo tutorial
Seo tutorial
 

More from Builtvisible

Webinar: How to benefit from changing consumer demand
Webinar: How to benefit from changing consumer demandWebinar: How to benefit from changing consumer demand
Webinar: How to benefit from changing consumer demand
Builtvisible
 
GA4 Mini Training Webinar Deck.pdf
GA4 Mini Training Webinar Deck.pdfGA4 Mini Training Webinar Deck.pdf
GA4 Mini Training Webinar Deck.pdf
Builtvisible
 
Webinar: How and why to use social media to inform creative content
Webinar: How and why to use social media to inform creative contentWebinar: How and why to use social media to inform creative content
Webinar: How and why to use social media to inform creative content
Builtvisible
 
Webinar: How to supercharge local SEO strategies for multi-location businesses
Webinar: How to supercharge local SEO strategies for multi-location businessesWebinar: How to supercharge local SEO strategies for multi-location businesses
Webinar: How to supercharge local SEO strategies for multi-location businesses
Builtvisible
 
How to prepare for Google's page experience update
How to prepare for Google's page experience updateHow to prepare for Google's page experience update
How to prepare for Google's page experience update
Builtvisible
 
Optimising your faceted navigation to target long-tail keywords
Optimising your faceted navigation to target long-tail keywordsOptimising your faceted navigation to target long-tail keywords
Optimising your faceted navigation to target long-tail keywords
Builtvisible
 
Ecommerce quick wins you can implement today to boost SEO performance
Ecommerce quick wins you can implement today to boost SEO performanceEcommerce quick wins you can implement today to boost SEO performance
Ecommerce quick wins you can implement today to boost SEO performance
Builtvisible
 
How to build a flexible content strategy
How to build a flexible content strategyHow to build a flexible content strategy
How to build a flexible content strategy
Builtvisible
 
How to make change happen in your organisation by talking your devs language
How to make change happen in your organisation by talking your devs languageHow to make change happen in your organisation by talking your devs language
How to make change happen in your organisation by talking your devs language
Builtvisible
 
Google for jobs – Matt Hunt's top tips from Tea-time SEO
Google for jobs – Matt Hunt's top tips from Tea-time SEOGoogle for jobs – Matt Hunt's top tips from Tea-time SEO
Google for jobs – Matt Hunt's top tips from Tea-time SEO
Builtvisible
 
Crawling ecommerce sites – Maria Camanes' top tips from Tea-time SEO
Crawling ecommerce sites – Maria Camanes' top tips from Tea-time SEOCrawling ecommerce sites – Maria Camanes' top tips from Tea-time SEO
Crawling ecommerce sites – Maria Camanes' top tips from Tea-time SEO
Builtvisible
 
Reducing site speed - Rachel Costello's top tips from Tea-time SEO
Reducing site speed - Rachel Costello's top tips from Tea-time SEOReducing site speed - Rachel Costello's top tips from Tea-time SEO
Reducing site speed - Rachel Costello's top tips from Tea-time SEO
Builtvisible
 
Webinar: Common challenges with e commerce seo optimisation
Webinar: Common challenges with e commerce seo optimisationWebinar: Common challenges with e commerce seo optimisation
Webinar: Common challenges with e commerce seo optimisation
Builtvisible
 
Webinar: Turn browsers to customers with product page improvements
Webinar: Turn browsers to customers with product page improvementsWebinar: Turn browsers to customers with product page improvements
Webinar: Turn browsers to customers with product page improvements
Builtvisible
 
Building a culture of measurement: PR Week Breakfast Briefing
Building a culture of measurement: PR Week Breakfast BriefingBuilding a culture of measurement: PR Week Breakfast Briefing
Building a culture of measurement: PR Week Breakfast Briefing
Builtvisible
 
Getting PR Onside with Data | SearchLove 2018
Getting PR Onside with Data | SearchLove 2018Getting PR Onside with Data | SearchLove 2018
Getting PR Onside with Data | SearchLove 2018
Builtvisible
 
PPC Cost Analysis | Search Marketing Summit Australia 2
PPC Cost Analysis | Search Marketing Summit Australia 2PPC Cost Analysis | Search Marketing Summit Australia 2
PPC Cost Analysis | Search Marketing Summit Australia 2
Builtvisible
 
Addressing Site Quality | Search Marketing Summit Australia
Addressing Site Quality | Search Marketing Summit AustraliaAddressing Site Quality | Search Marketing Summit Australia
Addressing Site Quality | Search Marketing Summit Australia
Builtvisible
 
SEO for Faceted Navigation | Get STAT City Crawl
SEO for Faceted Navigation | Get STAT City CrawlSEO for Faceted Navigation | Get STAT City Crawl
SEO for Faceted Navigation | Get STAT City Crawl
Builtvisible
 
Google Tag Manager Can Do What? | SMX London
Google Tag Manager Can Do What? | SMX LondonGoogle Tag Manager Can Do What? | SMX London
Google Tag Manager Can Do What? | SMX London
Builtvisible
 

More from Builtvisible (20)

Webinar: How to benefit from changing consumer demand
Webinar: How to benefit from changing consumer demandWebinar: How to benefit from changing consumer demand
Webinar: How to benefit from changing consumer demand
 
GA4 Mini Training Webinar Deck.pdf
GA4 Mini Training Webinar Deck.pdfGA4 Mini Training Webinar Deck.pdf
GA4 Mini Training Webinar Deck.pdf
 
Webinar: How and why to use social media to inform creative content
Webinar: How and why to use social media to inform creative contentWebinar: How and why to use social media to inform creative content
Webinar: How and why to use social media to inform creative content
 
Webinar: How to supercharge local SEO strategies for multi-location businesses
Webinar: How to supercharge local SEO strategies for multi-location businessesWebinar: How to supercharge local SEO strategies for multi-location businesses
Webinar: How to supercharge local SEO strategies for multi-location businesses
 
How to prepare for Google's page experience update
How to prepare for Google's page experience updateHow to prepare for Google's page experience update
Checking Google Index status at scale

  • 1. Checking Google Index status at scale with Node.js. Jose Luis Hernando (@jlhernando), Senior Technical SEO Consultant. #BrightonSEO
  • 2. Today’s agenda: 1. Why it’s important to know your website’s indexing status 2. The challenge to extract this data 3. Getting the data with Node.js – Live Demo! 4. Using this data for your SEO strategy
  • 3. Why is it important? Reason #1: Not in the Index => Not in the SERPs. (Icons from Google, Flaticon & Sitecheckerpro)
  • 4. Why is it important? Reason #2: Google evaluates site quality based on indexed pages. Low Quality Pages (Uncontrolled Faceted Navigation URLs, Unsupervised User Generated Content, Indexable Non-Canonical URLs) + High Quality Pages (Category Pages, Editorial Pages, Canonical Product Pages). Sources: Google Only Can Judge Site Quality Based On Pages They Index – Barry Schwartz (Search Engine Roundtable); English Google Webmaster Central office-hours hangout – Google Webmasters YouTube Channel.
  • 5. Why is it important? Reason #3: Inefficient use of Google’s resources. https://website.com/category-one/ (HTML, CSS, JS): /category-one/?color=red, /category-one/?color=blue, /category-one/?color=red&blue, … ∞
  • 6. How much of your site is Googlebot crawling? Avg. Crawl Ratio (%): 1-10k: 71.7%; 10k-100k: 54.3%; 100k-1M: 41.7%; 1M+: 34.4%. Avg. Active Ratio (%): 1-10k: 45.3%; 10k-100k: 30.2%; 100k-1M: 15.1%; 1M+: 10.1%. Crawl Ratio: percentage of pages crawled by Google in 30 days. Active Ratio: percentage of pages that have generated at least one organic visit in 30 days. Source: How Does Google Crawl the Web? – Annabelle Bouard & Dimitri Brunel (Botify)
  • 7. The challenge to extract this data: Googlebot’s crawling behaviour doesn’t determine indexing status.
  • 8. The challenge: extracting this data. Googlebot’s crawling behaviour doesn’t determine indexing status. You rely on partial and sometimes inaccurate data points: site: & inurl: operators; GSC Indexing reports: URL Inspection Tool (< 200 URLs/day) and Coverage Reports (< 1,000 rows/report).
  • 9. Proxy metrics != Accurate data
  • 10. If you can’t find it, build it
  • 11. {Live demo} bit.ly/google-index-checker-script
  • 12. Quick FYI: Using the following method goes against Google’s Terms of Service as it automatically requests search queries from Google Search.
  • 13. Our script outperforms every other method available
  • 14. How can you use Google index data? Identify inefficient use of crawl budget; Error Prioritisation; Identify holes in your architecture. Check for pages from your site that should be indexed but are not. Find pages that should not be indexed but are indexed. Detect pages that used to exist and now return an error (4xx) but are still indexed.
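The three checks on this slide can be sketched as a small classifier. This is only an illustration: the record shape (`url`, `status`, `shouldBeIndexed`, `indexed`) and the label names are assumptions made for the example, not the output format of the script shown in the demo.

```javascript
// Hypothetical input record: { url, status, shouldBeIndexed, indexed }.
// Labels are illustrative names for the three checks listed on the slide.
function classifyIndexFinding({ status, shouldBeIndexed, indexed }) {
  // Pages that now return a 4xx error but are still in the index.
  if (status >= 400 && status < 500 && indexed) return 'error-still-indexed';
  // Pages that should be indexed but are not.
  if (shouldBeIndexed && !indexed) return 'missing-from-index';
  // Pages that should not be indexed but are.
  if (!shouldBeIndexed && indexed) return 'unwanted-in-index';
  return 'ok';
}

// Group URLs by finding so each group can be prioritised separately.
function groupFindings(records) {
  const groups = {};
  for (const record of records) {
    const label = classifyIndexFinding(record);
    if (!groups[label]) groups[label] = [];
    groups[label].push(record.url);
  }
  return groups;
}

// Made-up sample data, for illustration only.
const sampleRecords = [
  { url: '/category-one/', status: 200, shouldBeIndexed: true, indexed: false },
  { url: '/category-one/?color=red', status: 200, shouldBeIndexed: false, indexed: true },
  { url: '/old-product/', status: 404, shouldBeIndexed: false, indexed: true },
  { url: '/editorial/guide/', status: 200, shouldBeIndexed: true, indexed: true },
];

console.log(groupFindings(sampleRecords));
```

The 4xx check runs first so that a dead page that is still indexed is reported as an error rather than as a generic unwanted URL.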
  • 15. Use case #1: Sitemap Health Check. How many URLs from your XML sitemap are indexed? Sitemaps = 111,772 URLs. 200 Status Code – 81,688 (80% Indexed). Google Index Status of 2xx URLs from Sitemap: Indexed 74,223; Not Indexed 7,465. Inspired by Data Secrets of the Index Coverage Report – AJ Kohn.
  • 16. Use case #1: Sitemap Health Check. How many URLs from your XML sitemap are indexed? Sitemaps = 111,772 URLs. 200 Status Code – 81,688 (80% Indexed); 404 Status Code – 29,969 (21% Indexed). Google Index Status of 4xx URLs from Sitemap: Indexed 6,268; Not Indexed 23,701. Inspired by Data Secrets of the Index Coverage Report – AJ Kohn.
  • 17. Use case #1: Sitemap Health Check. How many URLs from your XML sitemap are indexed? Sitemaps = 111,772 URLs. 200 Status Code – 81,688 (80% Indexed); 404 Status Code – 29,969 (21% Indexed); 301 Status Code – 365 (4% Indexed). Google Index Status of 3xx URLs from Sitemap: Indexed 16; Not Indexed 349. Inspired by Data Secrets of the Index Coverage Report – AJ Kohn.
  • 18. Sitemap Health Check: Next Steps. 1) Identify if these URLs are important to your site’s bottom line 2) Check if a pool of these URLs has issues in GSC’s Index Coverage Report 3) Choose a tactic to improve the visibility of these URLs 4) Isolate the relevant URLs and modify the existing sitemap or create a new-sitemap.xml to monitor progress
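The sitemap health check boils down to grouping sitemap URLs by status-code class and computing the share of each group that is indexed. A minimal sketch of that arithmetic, assuming each row has already been given a hypothetical `status` and `indexed` field by whatever crawler and index checker you use:

```javascript
// Hypothetical input row: { url, status, indexed }.
// Returns one summary per status-code class (2xx, 3xx, 4xx, ...).
function indexedShareByStatusClass(rows) {
  const tally = new Map();
  for (const { status, indexed } of rows) {
    const key = `${Math.floor(status / 100)}xx`;
    const entry = tally.get(key) ?? { total: 0, indexed: 0 };
    entry.total += 1;
    if (indexed) entry.indexed += 1;
    tally.set(key, entry);
  }
  return [...tally].map(([statusClass, { total, indexed }]) => ({
    statusClass,
    total,
    indexed,
    // Percentage rounded to one decimal place.
    indexedPct: Math.round((indexed / total) * 1000) / 10,
  }));
}

// Made-up sample rows, for illustration only.
const sitemapRows = [
  { url: '/a/', status: 200, indexed: true },
  { url: '/b/', status: 200, indexed: true },
  { url: '/c/', status: 200, indexed: false },
  { url: '/d/', status: 200, indexed: true },
  { url: '/gone/', status: 404, indexed: true },
  { url: '/gone-too/', status: 404, indexed: false },
  { url: '/moved/', status: 301, indexed: false },
];

console.table(indexedShareByStatusClass(sitemapRows));
```

Tracking the same summary for a dedicated new-sitemap.xml over time is one way to monitor whether the chosen tactic is working.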
  • 19. Use case #2: Log File Analysis Plus+. How many URLs with Googlebot hits are indexed? ~160k Googlebot hits to non-canonical URLs (/Uppercase/ vs /lowercase/); identified if non-canonical URLs were indexed; identified if the referenced canonical URLs were indexed. Indexed Non-Canonical URLs Requested by Googlebot: 35.8%, 64.2% (legend: Indexed, Not Indexed). Undisclosed Client.
  • 20. Log File Analysis+: Next Steps. 1) Identify if the canonical tag is correctly placed 2) Identify if the root cause is internal linking, external linking or other 3) Consider redirecting non-canonical URLs to canonical URLs 4) Create a new-sitemap.xml with the problematic URLs to encourage Googlebot to revisit them and for monitoring purposes
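For the /Uppercase/ vs /lowercase/ case, the first step is to pair each non-canonical URL requested by Googlebot with its expected canonical so both can be sent to the index checker. A rough sketch, assuming the canonical version is simply the lowercase path (true for the example in the slides, not for every site) and that the requested URLs have already been extracted from the log files:

```javascript
// Assumption: the canonical URL is the same URL with a lowercase path.
// Returns a Map of non-canonical URL -> expected canonical URL.
function findUppercaseVariants(requestedUrls) {
  const pairs = new Map();
  for (const raw of requestedUrls) {
    const url = new URL(raw);
    const lowerPath = url.pathname.toLowerCase();
    // Skip URLs whose path is already lowercase.
    if (url.pathname === lowerPath) continue;
    const canonical = new URL(url.href);
    canonical.pathname = lowerPath; // the query string is left untouched
    pairs.set(url.href, canonical.href);
  }
  return pairs;
}

// Made-up URLs, for illustration only.
const googlebotHits = [
  'https://example.com/Shoes/Red/',
  'https://example.com/shoes/red/',
  'https://example.com/Bags/?page=2',
];

for (const [nonCanonical, canonical] of findUppercaseVariants(googlebotHits)) {
  console.log(`${nonCanonical} -> ${canonical}`);
}
```

Checking the index status of both sides of each pair shows whether Google has kept the non-canonical version, the canonical one, or both.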
  • 21. Other use cases: Inform your SEO strategy. Check Real-time indexing (News sites, Offer sites, Job Boards); Check uncontrolled faceted navigation (Crawl budget optimisation); Check inactive product/category URLs (Site architecture improvements); Check old 4xx that are live now & haven't been deindexed yet (Recover organic opportunities).
  • 22. Further reading: https://bit.ly/google-index-checks
  • 23. Further reading: https://bit.ly/gsc-index-coverage
  • 24. The Google Index Checker script has opened a door to get useful, actionable data at scale for your sites. Use it, and act on it.
  • 25. Thank you. builtvisible.com. Jose Luis Hernando, Senior Technical SEO Consultant, @jlhernando
  • 26. Sources & additional reading: How does Google crawl the web – Annabelle Bouard & Dimitri Brunel (Botify); English Google Webmaster Central office-hours hangout – Google Webmasters YouTube Channel; Google Only Can Judge Site Quality Based On Pages They Index – Barry Schwartz (Search Engine Roundtable); Data Secrets of the Index Coverage Report – Blind Five Year Old (AJ Kohn); How Google Search Works – Google Documentation; How Search organises information – Google Documentation; Our new search index: Caffeine – Carrie Grimes; When indexing goes wrong: how Google Search recovered from indexing issues & lessons learned since – Vincent Courson, Google Search Outreach; How Search Engines Work: Crawling, Indexing & Ranking – Moz; (Please) Stop Using Unsafe Characters in URLs – Jeff Starr