Developing SEO Tools:
Custom solutions with free and paid APIs
Max Prin
CONDÉ NAST | TECHNICALSEO.COM
slideshare.net/MaxPrin
@maxxeight
(Traditional) SEO Pillars
Relevance
Content Strategy
Authority & Trust
Link Building
Site Infrastructure
Technical SEO
Improving the site infrastructure in order to optimize
crawling and indexing to eventually increase ranking.
True, but incomplete.
Technical SEO?
Technical SEO?
“Any sufficiently technical action undertaken
with the intent to improve search results.”
– Russ Jones
Technical SEO
Site Health
Fixes + Improvements
01
Tooling
Automation + Reporting
02
Tooling
Getting Started
Getting Started
● Google Cloud Platform Free Tier
○ Virtual machine
○ Storage
○ APIs
https://cloud.google.com/free
Getting Started
● Front-end (UI)
○ HTML / CSS
○ JavaScript
■ React / Angular / Vue
● Back-end
○ Server
○ Database
○ Language: Python, PHP,
Node.js (at least for
Puppeteer)
Efficiency
Automation
Reporting
Monitoring, data export,
time-consuming tasks, etc.
Custom dashboards,
augmented data, etc.
Site Health Monitoring
● Out-of-the-box solutions, scheduled crawls, custom extractions, etc.
● Custom solution - tips and ideas
○ Monitor only a few URLs per page template
○ No database needed, only keep 3 days of
data in .txt or .json files
○ Free API for Domain-level info:
whoisxmlapi.com (screenshot, DNS,
categorization, SSL cert, etc.)
○ Gmail API + Puppeteer:
parse GSC notifications and log in to GSC to get
“# of URLs impacted”
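The "no database" tip can be sketched with the standard library alone. This is a minimal illustration, not the speaker's implementation: the folder layout (one `<unix timestamp>.json` file per run), the function names, and the snapshot shape (`{url: {field: value}}`) are all assumptions.

```python
import json
import time
from pathlib import Path

RETENTION_SECONDS = 3 * 24 * 60 * 60  # keep 3 days of data, as the slide suggests


def save_snapshot(folder, data, now=None):
    """Write one snapshot as <unix timestamp>.json and return its path."""
    now = time.time() if now is None else now
    folder = Path(folder)
    folder.mkdir(parents=True, exist_ok=True)
    path = folder / f"{int(now)}.json"
    path.write_text(json.dumps(data, sort_keys=True), encoding="utf-8")
    return path


def prune_snapshots(folder, now=None):
    """Delete snapshots older than the retention window; return removed names.

    Assumes the folder only holds files written by save_snapshot.
    """
    now = time.time() if now is None else now
    removed = []
    for path in sorted(Path(folder).glob("*.json")):
        if now - int(path.stem) > RETENTION_SECONDS:
            path.unlink()
            removed.append(path.name)
    return removed


def diff_snapshots(previous, current):
    """Return {url: {field: (old, new)}} for every monitored field that changed."""
    changes = {}
    for url, fields in current.items():
        old_fields = previous.get(url, {})
        changed = {
            key: (old_fields.get(key), value)
            for key, value in fields.items()
            if old_fields.get(key) != value
        }
        if changed:
            changes[url] = changed
    return changes
```

A scheduled job would call `save_snapshot`, compare against the previous file with `diff_snapshots`, send an alert if anything changed, then call `prune_snapshots`.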
URL Inspector on Steroids
“I want to know everything about that URL…”
● Fetch and render (custom extractions for
rendering issues)
○ cURL + puppeteer
● On-page elements (meta data, headings, etc.)
● hreflang tags validation
● Schema markup extraction
○ Free Yandex API
● GSC’s URL inspection API
● PageSpeed Insights API (Lighthouse reports)
● Content: entities, classification, sentiment
○ Google Natural Language API
● Traffic: GSC + GA APIs
● CMS API for internal data
● Geotargeting: anonymous-proxies.net
https://technicalseo.com/tools/fetch-render/
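The on-page and hreflang items in the list above could be extracted roughly as follows. This sketch uses only Python's built-in `html.parser`; the class and function names are hypothetical, and it works on whatever HTML string it is given (raw or rendered), so fetching and rendering are out of scope here.

```python
from html.parser import HTMLParser


class OnPageParser(HTMLParser):
    """Collect title, meta description, headings and hreflang annotations."""

    HEADINGS = {"h1", "h2", "h3"}

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta_description = None
        self.headings = []    # list of (tag, text)
        self.hreflang = {}    # language code -> href
        self._current = None  # tag whose text is being captured
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title" or tag in self.HEADINGS:
            self._current, self._buffer = tag, []
        elif tag == "meta" and (attrs.get("name") or "").lower() == "description":
            self.meta_description = attrs.get("content")
        elif tag == "link" and attrs.get("hreflang"):
            if "alternate" in (attrs.get("rel") or "").lower().split():
                self.hreflang[attrs["hreflang"]] = attrs.get("href")

    def handle_data(self, data):
        if self._current:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == self._current:
            # Collapse whitespace so multi-line headings compare cleanly.
            text = " ".join("".join(self._buffer).split())
            if tag == "title":
                self.title = text
            else:
                self.headings.append((tag, text))
            self._current = None


def extract_on_page(html):
    """Return the on-page elements found in an HTML string."""
    parser = OnPageParser()
    parser.feed(html)
    return {
        "title": parser.title,
        "meta_description": parser.meta_description,
        "headings": parser.headings,
        "hreflang": parser.hreflang,
    }
```

Running it on both the raw response and the rendered DOM, then comparing the two results, is one way to surface rendering issues.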
Google-InspectionTool
https://technicalseo.com/tools/robots-txt/
Keyword Tool
Combine:
● Search Console data (ranking
URLs, clicks / impressions)
With (paid APIs from Semrush,
Keywords Everywhere, etc.):
● Search volumes
● Related keywords
● Top ranking URLs
Similar to Bing Webmaster Tools
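A rough sketch of the "combine" step: Search Console rows are joined with search volumes from a second source. The row shape follows Search Analytics API rows requested with dimensions `["query", "page"]`; the `volumes` dictionary stands in for whatever a paid keyword API returns, and its shape is an assumption.

```python
def build_keyword_table(gsc_rows, volumes):
    """Join Search Console query rows with search volumes from another source.

    gsc_rows: dicts with "keys" ([query, page]), "clicks" and "impressions".
    volumes:  {keyword: monthly search volume} (hypothetical shape).
    """
    table = {}
    for row in gsc_rows:
        query, page = row["keys"]
        entry = table.setdefault(
            query,
            {
                "query": query,
                "clicks": 0,
                "impressions": 0,
                "ranking_urls": [],
                # None marks keywords the volume source knows nothing about.
                "search_volume": volumes.get(query),
            },
        )
        entry["clicks"] += row["clicks"]
        entry["impressions"] += row["impressions"]
        entry["ranking_urls"].append(page)
    # Most visible keywords first.
    return sorted(table.values(), key=lambda e: e["impressions"], reverse=True)
```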
Clients’ Tech Stacks
BuiltWith API: https://api.builtwith.com
Automating
Tasks
Example: SERP screenshots
Reporting
Custom dashboards,
augmented data, etc.
Search Console
Analytics API
Combining properties
Pro tip: use “service accounts” so your app
avoids disconnections and repeated
authentication prompts
GCP’s Service Accounts
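Combining properties might look like the sketch below. It builds a Search Analytics query body and merges the rows returned for several properties; the authenticated API call itself (for example with a service account) is left out, and the function names are hypothetical. Position is averaged weighted by impressions, which is a design choice rather than anything the API prescribes.

```python
from collections import defaultdict


def build_query_body(start_date, end_date, dimensions, row_limit=25000, start_row=0):
    """Request body for a Search Analytics query (dates as YYYY-MM-DD)."""
    return {
        "startDate": start_date,
        "endDate": end_date,
        "dimensions": list(dimensions),
        "rowLimit": row_limit,
        "startRow": start_row,
    }


def combine_properties(rows_by_property):
    """Merge rows from several properties that share the same dimension keys.

    rows_by_property: {property URL: [row, ...]}, each row carrying "keys",
    "clicks", "impressions" and "position".
    """
    totals = defaultdict(lambda: {"clicks": 0, "impressions": 0, "weighted_position": 0.0})
    for rows in rows_by_property.values():
        for row in rows:
            total = totals[tuple(row["keys"])]
            total["clicks"] += row["clicks"]
            total["impressions"] += row["impressions"]
            total["weighted_position"] += row["position"] * row["impressions"]

    combined = []
    for keys, total in sorted(totals.items()):
        impressions = total["impressions"]
        combined.append({
            "keys": list(keys),
            "clicks": total["clicks"],
            "impressions": impressions,
            # CTR and position are recomputed from the sums, never averaged directly.
            "ctr": total["clicks"] / impressions if impressions else 0.0,
            "position": total["weighted_position"] / impressions if impressions else 0.0,
        })
    return combined
```

Summing only makes sense for properties that do not overlap; a domain property plus one of its URL-prefix properties would count the same traffic twice.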
Google Algorithm Updates
https://technicalseo.com/tools/google-algorithm-updates/
Google Search Status Dashboard
https://status.search.google.com/incidents.json
maxpr.in/algo-updates-json
More Insights (Just) with Search Console Data
Combine “dimensions” with the API and/or data from multiple requests to analyze:
● Top Keywords of Top URLs
○ e.g. top 5 keywords for each of the top 100 URLs
○ What % of traffic does each keyword represent? e.g. top keyword = 65% of
clicks to page
● Cannibalization
○ How many and which pages rank for each keyword?
○ Within same site or across markets (e.g. UK site outranking US site in US)
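Both analyses above can be derived from a single export with dimensions `["page", "query"]`. The sketch below assumes rows shaped like Search Analytics API rows; the function names and defaults are illustrative. Because Search Console omits anonymized queries, per-page totals computed from query rows can undercount real clicks.

```python
from collections import defaultdict


def top_keywords_per_url(rows, top_urls=100, top_keywords=5):
    """Return {page: [(query, clicks, % of page clicks), ...]} for the top pages."""
    by_page = defaultdict(list)
    for row in rows:
        page, query = row["keys"]
        by_page[page].append((query, row["clicks"]))

    def page_clicks(page):
        return sum(clicks for _, clicks in by_page[page])

    report = {}
    for page in sorted(by_page, key=page_clicks, reverse=True)[:top_urls]:
        total = page_clicks(page)
        ranked = sorted(by_page[page], key=lambda item: item[1], reverse=True)
        report[page] = [
            (query, clicks, round(100 * clicks / total, 1) if total else 0.0)
            for query, clicks in ranked[:top_keywords]
        ]
    return report


def cannibalization(rows, min_pages=2):
    """Return {query: [pages]} for queries where several pages receive impressions."""
    pages_by_query = defaultdict(set)
    for row in rows:
        page, query = row["keys"]
        pages_by_query[query].add(page)
    return {
        query: sorted(pages)
        for query, pages in pages_by_query.items()
        if len(pages) >= min_pages
    }
```

For the cross-market case, the same `cannibalization` function can be fed rows from several properties filtered to one country.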
Core Web Vitals
● Leverage the historical data
endpoint in
CrUX API
● Overlay competitors
(programmatically pull from
Semrush)
● Warning: Search Console report
vs. CrUX origin data
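A sketch of working with the CrUX History API: one helper builds the request body and another turns a response into a dated p75 series that can be charted or overlaid with a competitor's. The endpoint and field names reflect my understanding of the documented request and response shape and should be checked against the current reference; the helper names are hypothetical and the HTTP call is omitted.

```python
# POST the request body to this endpoint with an API key (call not shown).
CRUX_HISTORY_ENDPOINT = (
    "https://chromeuserexperience.googleapis.com/v1/records:queryHistoryRecord"
)


def build_history_request(origin, metrics, form_factor="PHONE"):
    """Request body for origin-level historical data."""
    return {"origin": origin, "formFactor": form_factor, "metrics": list(metrics)}


def p75_series(response, metric):
    """Return [(period end date, p75 value or None), ...] for one metric.

    Assumes the response carries record.metrics[metric].percentilesTimeseries.p75s
    aligned with record.collectionPeriods.
    """
    record = response["record"]
    values = record["metrics"][metric]["percentilesTimeseries"]["p75s"]
    series = []
    for period, value in zip(record["collectionPeriods"], values):
        last = period["lastDate"]
        date = f'{last["year"]:04d}-{last["month"]:02d}-{last["day"]:02d}'
        # Values may arrive as numbers, numeric strings, or null for missing weeks.
        series.append((date, None if value is None else float(value)))
    return series
```

Calling `p75_series` once per origin gives series that share the same dates, which keeps the competitor overlay a simple join.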
Search Console
Sitemaps API
+ robots.txt info
+ HTTP status code checker
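One way to combine the three sources: take the URLs listed in the submitted sitemaps, the robots.txt content, and the status code observed for each URL, then flag conflicts. The sketch below uses `urllib.robotparser` from the standard library; the function name and input shapes are assumptions, and fetching is left out. Note that `urllib.robotparser` does not implement Google's `*` and `$` wildcard matching, so results for wildcard rules may differ from Googlebot's behavior.

```python
from urllib.robotparser import RobotFileParser


def audit_sitemap_urls(robots_txt, url_statuses, user_agent="Googlebot"):
    """Flag sitemap URLs that are blocked by robots.txt or do not return 200.

    robots_txt:   the robots.txt file content as a string.
    url_statuses: {url: HTTP status code} collected by a separate checker.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())

    issues = []
    for url, status in url_statuses.items():
        problems = []
        if not parser.can_fetch(user_agent, url):
            problems.append("blocked by robots.txt")
        if status != 200:
            problems.append(f"returns HTTP {status}")
        if problems:
            issues.append((url, problems))
    return issues
```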
Jira API
Custom table listing all “SEO”
issues (based on name, description,
label, comments, etc.)
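The filtering step behind such a table could be sketched as below. It assumes issues shaped like Jira REST API search results with `fields.summary`, `fields.description` as a plain string, and `fields.labels`; comments are left out for brevity. The term list and function names are illustrative, and plain substring matching will produce some false positives.

```python
SEO_TERMS = ("seo", "canonical", "redirect", "sitemap", "robots.txt", "hreflang", "noindex")


def is_seo_issue(issue, terms=SEO_TERMS):
    """True if any term appears in the issue's summary, description or labels."""
    fields = issue.get("fields", {})
    description = fields.get("description")
    parts = [
        fields.get("summary") or "",
        # Rich-text (non-string) descriptions are skipped in this sketch.
        description if isinstance(description, str) else "",
    ]
    parts.extend(fields.get("labels") or [])
    text = " ".join(parts).lower()
    return any(term in text for term in terms)


def seo_issue_table(issues, terms=SEO_TERMS):
    """Return one row per matching issue: key, summary and status name."""
    return [
        {
            "key": issue["key"],
            "summary": issue["fields"].get("summary", ""),
            "status": (issue["fields"].get("status") or {}).get("name"),
        }
        for issue in issues
        if is_seo_issue(issue, terms)
    ]
```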
Own Data + Natural Language Processing (NLP)
Augmented data for deeper performance analysis
Clicks, impressions, CTR, position from GSC
+ Author
+ Content type (page template)
+ Site section
+ Published/modified date
+ Word count
+ Core Web Vitals
+ Content category (classification,
entities, tags)
+ Sentiment analysis (title and/or
body)
Content category    Clicks      Impressions   CTR     # of Pages
Beauty & Fitness    1,234,567   34,567,891    3.57%   25
Shopping            456,789     22,333,444    2.05%   65
News                123,456     11,789,101    1.05%   58
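A table like this one can be produced by grouping page-level Search Console data on any augmented attribute. In this sketch each page is a dictionary that already carries its GSC metrics plus the extra fields (category, sentiment, author, and so on); that shape and the function name are assumptions.

```python
from collections import defaultdict


def aggregate_by(pages, dimension):
    """Group page-level metrics by one augmented attribute.

    pages: dicts with "clicks", "impressions" and the attribute named by
    `dimension`, e.g. "category" or "sentiment".
    """
    groups = defaultdict(lambda: {"clicks": 0, "impressions": 0, "pages": 0})
    for page in pages:
        group = groups[page[dimension]]
        group["clicks"] += page["clicks"]
        group["impressions"] += page["impressions"]
        group["pages"] += 1

    rows = []
    for label, group in groups.items():
        impressions = group["impressions"]
        rows.append({
            "label": label,
            "clicks": group["clicks"],
            "impressions": impressions,
            # CTR is recomputed from the sums rather than averaged per page.
            "ctr_pct": round(100 * group["clicks"] / impressions, 2) if impressions else 0.0,
            "pages": group["pages"],
        })
    return sorted(rows, key=lambda row: row["clicks"], reverse=True)
```

The same function covers the sentiment breakdown by passing `"sentiment"` as the dimension.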
Own Data + Natural Language Processing (NLP)
Augmented data for deeper performance analysis
+
Sentiment   Clicks      Impressions   CTR     # of Pages
Positive    1,234,567   34,567,891    3.57%   25
Neutral     456,789     22,333,444    2.05%   65
Negative    123,456     11,789,101    1.05%   58
Generative AI
Content briefs, ideation, etc.
Explaining data
Generative AI
OpenAI’s GPT API - Function calling
- Integrate any data into your
chat/assistant
- Eliminate “hallucinations”
- New: cheaper pricing for GPT-4
API and improved function
calling feature
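To illustrate the function calling idea: the model is given a tool definition, replies with a tool call, and the application runs the matching function and returns the result so the answer is grounded in real data. The sketch below shows only the local half. The tool definition and message shapes follow the Chat Completions `tools` format as I understand it, so verify them against the current documentation; `get_page_metrics` and its data are hypothetical.

```python
import json


def get_page_metrics(url):
    """Hypothetical data lookup; a real tool would query GSC or a database."""
    data = {"https://example.com/": {"clicks": 120, "impressions": 4000}}
    return data.get(url, {"error": "unknown URL"})


# Tool definition sent to the model alongside the chat messages.
TOOLS = [{
    "type": "function",
    "function": {
        "name": "get_page_metrics",
        "description": "Return Search Console clicks and impressions for a URL.",
        "parameters": {
            "type": "object",
            "properties": {
                "url": {"type": "string", "description": "Absolute page URL"},
            },
            "required": ["url"],
        },
    },
}]

# Maps tool names the model may request to local Python functions.
REGISTRY = {"get_page_metrics": get_page_metrics}


def run_tool_call(tool_call):
    """Execute one tool call requested by the model and build the reply message."""
    name = tool_call["function"]["name"]
    # The model sends the arguments as a JSON-encoded string.
    arguments = json.loads(tool_call["function"]["arguments"])
    result = REGISTRY[name](**arguments)
    return {
        "role": "tool",
        "tool_call_id": tool_call["id"],
        "content": json.dumps(result),
    }
```

The returned message is appended to the conversation and sent back to the model, which then writes its answer from the supplied numbers.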
Thank you!
Questions?

Max Prin - brightonSEO San Diego 2023 - Developing SEO Tools
