C-137 SEO
Technical SEO
Corpus Control
WHAT?!
Do you know all of the URLs on your website?
This bot does.
New Google Search Console
That’s a lot of excluded pages
That means there are nearly 5 million rogue pages.
XML sitemaps should include all URLs available to be indexed.
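Sitemap generation is easy to automate from whatever inventory of canonical URLs you already maintain. A minimal Python sketch, assuming a hypothetical `urls` list stands in for that inventory:

```python
# Minimal sketch: write a standard XML sitemap from a list of indexable URLs.
import xml.etree.ElementTree as ET

# Hypothetical inventory of canonical, indexable URLs.
urls = [
    "https://www.example.com/",
    "https://www.example.com/products/widget",
]

NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
urlset = ET.Element("urlset", xmlns=NS)
for u in urls:
    ET.SubElement(ET.SubElement(urlset, "url"), "loc").text = u

ET.ElementTree(urlset).write("sitemap.xml", encoding="utf-8",
                             xml_declaration=True)
```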
Rogue pages = Known – Submitted
Uncover Problems by Identifying Patterns
Use the filter functionality
Using the percentage makes sizing the problem easier to communicate.
Parameters are scary
Use robots.txt generously
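Being generous with robots.txt is safer when you test the rules before deploying them. A sketch using Python's standard-library parser; note it only understands simple path prefixes, not the wildcard syntax Googlebot itself supports, and the rules and URLs here are illustrative:

```python
# Sketch: sanity-check robots.txt rules before shipping them.
from urllib.robotparser import RobotFileParser

rules = [
    "User-agent: *",
    "Disallow: /search",  # illustrative: keep bots out of internal search
    "Disallow: /cart",    # illustrative: checkout pages add no index value
]

rp = RobotFileParser()
rp.parse(rules)

for url in [
    "https://www.example.com/products/widget",
    "https://www.example.com/search?q=widget",
]:
    verdict = "allowed" if rp.can_fetch("Googlebot", url) else "blocked"
    print(url, "->", verdict)
```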
This tool is awesome
Noindex
Disallow
4xx
Canonical
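Each of these levers shows up differently in what the server or page returns. A hypothetical Flask sketch (routes and paths invented for illustration) covering noindex via the X-Robots-Tag header and a 4xx for pages that should drop out; Disallow lives in robots.txt, and canonical is a link element in the page head:

```python
# Hypothetical corpus-control responses (Flask; routes invented).
from flask import Flask, make_response

app = Flask(__name__)

@app.route("/internal-report")
def internal_report():
    # Noindex: the page stays reachable but is kept out of the index.
    resp = make_response("Internal report")
    resp.headers["X-Robots-Tag"] = "noindex"
    return resp

@app.route("/discontinued-widget")
def discontinued_widget():
    # 4xx: a 410 tells crawlers the page is gone for good.
    return "Gone", 410
```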
404s are junk food
bit.ly/internal404s
My feelings on rel=canonical are not very nuanced
301 redirect unless the user needs the page
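Put concretely, that rule of thumb is a two-way branch. A sketch under the same invented-URL assumptions: consolidate with a 301 when users no longer need the variant; only when they do need it, serve the page and declare rel=canonical:

```python
# Sketch: 301 unless the user needs the page (Flask; URLs invented).
from flask import Flask, redirect

app = Flask(__name__)

@app.route("/old-widget-page")
def old_widget_page():
    # Users no longer need this variant: consolidate with a 301.
    return redirect("https://www.example.com/widget", code=301)

@app.route("/widget/print-view")
def widget_print_view():
    # Users need this variant: serve it, but point crawlers at the canonical.
    return ('<html><head><link rel="canonical" '
            'href="https://www.example.com/widget"></head>'
            '<body>Print view</body></html>')
```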
Not all status types are equal
Not a good look
Show me what you got
XML sitemaps
50,000 URLs per sitemap and 50,000 sitemaps per sitemap index means 2,500,000,000 potential URLs.
Optimize your sitemaps
bit.ly/sitemapoptimization
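The arithmetic is simply 50,000 × 50,000 = 2,500,000,000. A sketch of sharding a large inventory, assuming the same hypothetical `urls` list as above: chunk it into files of at most 50,000 entries and point a sitemap index at the shards.

```python
# Sketch: shard URLs into <=50,000-entry sitemaps plus a sitemap index.
import xml.etree.ElementTree as ET

NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
LIMIT = 50_000  # 50,000 URLs x 50,000 sitemaps = 2.5 billion potential URLs

def write_sitemaps(urls, base="https://www.example.com"):
    index = ET.Element("sitemapindex", xmlns=NS)
    for n, start in enumerate(range(0, len(urls), LIMIT), start=1):
        urlset = ET.Element("urlset", xmlns=NS)
        for u in urls[start:start + LIMIT]:
            ET.SubElement(ET.SubElement(urlset, "url"), "loc").text = u
        name = f"sitemap-{n}.xml"
        ET.ElementTree(urlset).write(name, encoding="utf-8",
                                     xml_declaration=True)
        ET.SubElement(ET.SubElement(index, "sitemap"),
                      "loc").text = f"{base}/{name}"
    ET.ElementTree(index).write("sitemap-index.xml", encoding="utf-8",
                                xml_declaration=True)
```

Sharding by page type rather than purely by count also makes coverage reporting easier to read per section.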
Things can get out of hand
“Hey boss, we really need crawl tracking.”
Sometimes Google crawls the wrong way
You are what Googlebot eats
bit.ly/crawloptimization
Tracking Crawl by Status Code
Tracking Crawl by Page Type
Page type makes a difference
Log file analysis
bit.ly/ultimatelogfileguide
bit.ly/completelogfileguide
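A minimal sketch of both breakdowns, assuming an Apache/nginx combined-format access log and illustrative URL patterns for page types; a naive substring match stands in for proper Googlebot verification (the guides above cover reverse-DNS checks):

```python
# Sketch: count Googlebot hits by status code and page type from an access log.
import re
from collections import Counter

# Pull the request path and status code from a combined-format log line.
LINE = re.compile(r'"(?:GET|POST) (?P<path>\S+) HTTP/[^"]*" (?P<status>\d{3})')

# Illustrative page-type buckets; swap in your own URL patterns.
PAGE_TYPES = [
    ("product", re.compile(r"^/products/")),
    ("category", re.compile(r"^/category/")),
    ("search", re.compile(r"^/search")),
]

def page_type(path):
    for name, pattern in PAGE_TYPES:
        if pattern.search(path):
            return name
    return "other"

by_status, by_type = Counter(), Counter()
with open("access.log", encoding="utf-8", errors="replace") as log:
    for line in log:
        if "Googlebot" not in line:  # naive filter; verify via reverse DNS
            continue
        m = LINE.search(line)
        if m:
            by_status[m.group("status")] += 1
            by_type[page_type(m.group("path"))] += 1

print("By status code:", by_status.most_common())
print("By page type:", by_type.most_common())
```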
“If you can’t measure it, you can’t manage it.” - Peter Drucker
“If you can’t measure it, you can’t improve it.” - Lord Kelvin
Crawl
Index
Rank
Tidying Up
Corpus control
TL;DL
Measure how Google is crawling your site and take action to ensure they’re crawling the ‘right’ things.
AJ Kohn
Owner, Blind Five Year Old
www.blindfiveyearold.com
aj@blindfiveyearold.com
@ajkohn
bit.ly/C137SEO

C-137 SEO