@earnedMarketing
EPISODE WASTE
Tomas Vaitulevicius
@earnedMarketing
Head of Digital Marketing @ JustPark
[Charts: two SEO Traffic graphs, one with an axis from 0 to 2,000,000 and one from 0 to 25,000,000]
Richard Baxter
@richardbaxter
Benjamin Johnson
@d00berry
Dean Rowe
+DeanRoweSEO
Big thanks for help in putting this content together
SEO ARCHITECTURE
Search Demand
Topic Coverage
Top of Class Content
Dedicated Pages
Flat Prioritised Linking
Monitoring
Devising an SEO architecture follows a very similar set of steps for websites of all sizes
Monitoring
Search Demand
Topic Coverage
Top of Class Content
Dedicated Pages
Flat Prioritised Linking
But large sites have a number of other SEO complications that need to be dealt with. This deck focuses primarily on Waste (of crawl budget, Google index & internal link equity)
SEO TOOLKIT
Tool | Small Sites
Nofollow | keep Googlebot away from parts of site
Robots.txt | keep Googlebot away from parts of site
Noindex | get parts of site out of Google’s index
Canonical | solve duplicate content issues
Another difference between small and large site SEO architecture is that the basic SEO tools…
SEO TOOLKIT BAND AIDS
Tool | Small Sites | Large Sites
Nofollow | keep Googlebot away from parts of site | burn internal link equity
Robots.txt | keep Googlebot away from parts of site | burn internal link equity & block inbound link equity
Noindex | get parts of site out of Google’s index | waste crawl budget and burn 15% of link equity
Canonical | solve duplicate content issues | waste crawl budget, burn 15% of link equity & add uncertainty
…become pretty damaging at scale
This matters because PageRank is still the foundation of Google’s crawl and indexation
[Diagram: a small website with Home, Category, Search and Listing pages]
Let’s say we have a small website
[Diagram: Home receives 1, Category 0.85, Search 0.72 and Listing 0.61]
With 1 unit of PageRank arriving at the homepage and cascading down through links
[Diagram: Home receives 1 and Category 0.85; Category passes 0.36 to Search and 0.36 to a dead-end; Search passes 0.15 to Listing and 0.15 to a dead-end]
If we add a couple of links patched with the SEO band-aids (nofollow or robots.txt Disallow), half of the link equity of the Category and Search pages will evaporate from our site
[Diagram: with the two dead-end links in place, Listing receives 0.15 instead of 0.61 (-75%)]
Making the listing page 75% weaker. Inefficiencies like these are killing large site SEO, as pages with little PageRank don’t get crawled and indexed, and obviously won’t get any traffic
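
The arithmetic behind the two diagrams can be sketched in a few lines of Python. This is a toy model, not how Google actually computes PageRank. It assumes a damping factor of 0.85 (inferred from the 1 to 0.85 step on the slides), an even split of a page's PageRank across its links, and an acyclic link graph. The function name `cascade` and the dead-end labels are made up for illustration.

```python
DAMPING = 0.85  # assumed: inferred from the 1 -> 0.85 step on the slides


def cascade(links, start, inflow=1.0, damping=DAMPING):
    """Push PageRank from `start` down an acyclic link graph.

    `links` maps a page to the pages it links to. A target with no entry
    in `links` (e.g. a nofollowed or robots.txt-blocked URL) is a dead
    end: it takes its share and passes nothing on.
    """
    received = {}
    pending = [(start, inflow)]
    while pending:
        page, amount = pending.pop()
        received[page] = received.get(page, 0.0) + amount
        targets = links.get(page, [])
        for target in targets:
            # Each link carries an equal share of the damped PageRank.
            pending.append((target, amount * damping / len(targets)))
    return received


# The small site from the slides, without band-aids.
clean = cascade(
    {"Home": ["Category"], "Category": ["Search"], "Search": ["Listing"]},
    "Home",
)

# The same site with one patched (dead-end) link on Category and on Search.
patched = cascade(
    {
        "Home": ["Category"],
        "Category": ["Search", "dead-end-1"],
        "Search": ["Listing", "dead-end-2"],
    },
    "Home",
)

for page in ("Home", "Category", "Search", "Listing"):
    print(page, round(clean[page], 2), round(patched[page], 2))
print("Listing change:", f"{patched['Listing'] / clean['Listing'] - 1:.0%}")
```

Two halvings on the way down leave the Listing page with a quarter of what it had, which is where the 75% figure comes from.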
IT IS HARD!
Huge amounts of waste and the damaging effects of SEO band-aids do make large site SEO architecture pretty d*mn hard
Page Oriented Architecture
Destination Oriented Architecture
Single Page Application
But we found inspiration in the new technology of Single Page Applications for a new approach to SEO architecture, which fixes the problems rather than patching them up
PAGES vs DESTINATIONS
justpark.com/london/ | page | destination
justpark.com/london/?page=2 | page? | destination
justpark.com/london/?sort=price | page? | destination
justpark.com/listing-1/ | page | destination
justpark.com/listing-1/photos | page? | destination
justpark.com/listing-1/save | page? | destination
justpark.com/listing-1/enquire | page? | destination
justpark.com/listing-1/book | page? | destination
justpark.com/forgot-password | page? | destination
In Destination Oriented Architecture we want to identify the canonical pages/URLs that represent real Destinations targeting SEO Topics and “kill” all of the other publicly available URLs
1 SEO Topic = 1 Destination
1 Destination = 1 SEO Topic
No SEO Topic = No Destination
We want as many distinct Destinations as we have SEO Topics we’re targeting, with all the supplementary content and functionality living within these Destinations
DESTINATION TO CRAP RATIOS
[Chart: horizontal axis from 0% to 100%; rows for Usage, Internal links, Index and Crawl]
We use Destination to Crap ratios to gauge how well we’re doing on the journey to a fully
Destination Oriented Architecture (it’s also helpful in getting buy-in from the different
stakeholders as no one wants to think of their platform as being 80% crap or waste)
METHODOLOGY
Crawl – split of Googlebot crawl hits in your access logs between (exact) destination URLs and everything else
Index – all of your destinations should be in the sitemaps you submit to Google Search Console. Index ratio = Indexed Destinations (GSC > Crawl > Sitemaps) vs non-destinations in the index, i.e. Total Indexed (GSC > Google Index > Index Status) minus Indexed Destinations
Internal Links – all internal links from a web crawl (Screaming Frog, Deep Crawl, etc.) split between those pointing to (exact) destination URLs and everything else
Usage – page views of your users (web analytics) split between (exact) destination URLs and everything else
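
A rough sketch of how these ratios could be computed once the raw data is exported. The function names and sample URLs are hypothetical, and the inputs are assumed to be already extracted (Googlebot hits from access logs, link targets from a crawler export, page views from analytics, counts read off Google Search Console).

```python
def destination_share(urls, destinations):
    """Fraction of `urls` that are exact destination URLs.

    Works for crawl hits, internal link targets or page views alike:
    pass one URL per hit, link or view.
    """
    urls = list(urls)
    if not urls:
        return 0.0
    on_destination = sum(1 for url in urls if url in destinations)
    return on_destination / len(urls)


def index_share(indexed_destinations, total_indexed):
    """Indexed destinations as a fraction of everything indexed.

    The remainder (total minus destinations) is the non-destination
    part of the index.
    """
    if total_indexed == 0:
        return 0.0
    return indexed_destinations / total_indexed


# Hypothetical sample data for illustration only.
destinations = {"/london/", "/listing-1/"}
googlebot_hits = ["/london/", "/london/?page=2", "/listing-1/", "/listing-1/photos"]

print(f"Crawl: {destination_share(googlebot_hits, destinations):.0%} destinations")
print(f"Index: {index_share(200, 1000):.0%} destinations")
```

Matching on the exact URL is deliberate: a destination with a parameter appended counts as waste, not as a destination.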
REAL WORLD EXAMPLES
SUPPLEMENTARY CONTENT
rightmove.co.uk/fees.html?listing_id=165467654
justpark.com/parking-spaces/…/callout-snippet/
> Js-off – host content within a relevant destination and link with in-page anchors
> Js span trigger for preloaded or AJAX lightbox
! Crawl Waste
! Littered Index
! Duplicate Content
! Thin Content
! Wasted Internal Link Equity
! Scattered Inbound Link Equity
SESSION PARAMETERS
rightmove.co.uk/property/London.html/svr/2124;jsessionid=9BE1415794CEDC5590B1FA11B8817DE0
> Exclude for bots
> Move to cookies
> Go stateless (in extreme circumstances carrying the state in POST form hidden fields)
! Crawl Waste
! Littered Index
! Duplicate Content
! Thin Content
! Wasted Internal Link Equity
! Scattered Inbound Link Equity
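
Whichever fix is chosen, the session token should never end up in a URL that gets linked or crawled. As a minimal illustration, assuming the token is carried as a `;jsessionid=` path parameter like in the rightmove example, it could be stripped before links are rendered. The helper name is hypothetical.

```python
import re

# Matches a ";jsessionid=..." path parameter up to the next /, ? or #.
SESSION_ID = re.compile(r";jsessionid=[^/?#]*", re.IGNORECASE)


def strip_session_id(url):
    """Return `url` without any ;jsessionid=... path parameter."""
    return SESSION_ID.sub("", url)


print(strip_session_id(
    "rightmove.co.uk/property/London.html/svr/2124"
    ";jsessionid=9BE1415794CEDC5590B1FA11B8817DE0"
))
```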
ALTERNATIVE URLs
http://ww.just-park.co.uk/uk/parking/London >> https://www.justpark.com/uk/parking/london/
> Catch-all 301 redirects in server config
> http <> https
> non-www <> www < unrecognised subdomains
> upper case > lower case
> no trailing slash <> trailing slash
! Crawl Waste
! Littered Index
! Duplicate Content
! Thin Content
! Wasted Internal Link Equity
! Scattered Inbound Link Equity
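
In practice these rules live in the web server config, but the logic can be sketched in Python. The sketch assumes a single canonical scheme and host, lower-case paths and directory-style URLs that always end in a slash (so a file-style path such as `/fees.html` would need an exception). The function names are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit

CANONICAL_SCHEME = "https"
CANONICAL_HOST = "www.justpark.com"


def canonical_url(url):
    """Apply the catch-all rules: scheme, host, lower case, trailing slash."""
    parts = urlsplit(url)
    path = parts.path.lower()
    if not path.endswith("/"):
        path += "/"
    return urlunsplit(
        (CANONICAL_SCHEME, CANONICAL_HOST, path, parts.query, parts.fragment)
    )


def redirect_for(url):
    """Return (301, target) if `url` is not canonical, else None."""
    target = canonical_url(url)
    return None if target == url else (301, target)


print(redirect_for("http://ww.just-park.co.uk/uk/parking/London"))
```

Doing all the normalisation in one step means a visitor or bot gets a single 301 to the final URL rather than a chain of redirects.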
FORWARDING PARAMETERS
instagram.com/accounts/login/?next=%2Fabout…
distilled.net/store/profile/login/?next=/resources/
> Hash parameter for Js
> HTTP Referrer
> Cookies / LocalStorage
> Lightbox login form
! Crawl Waste
! Littered Index
! Duplicate Content
! Thin Content
! Wasted Internal Link Equity
! Scattered Inbound Link Equity
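
A small sketch of the hash parameter option: the return path moves from a `?next=` query string into the fragment, so every login link resolves to the same crawlable URL. The `forwarding` parameter name follows the `/login#forwarding=xxx` pattern used later in this deck. The helper names are hypothetical, and in a real site the fragment would be read by JavaScript in the browser, since it is never sent to the server.

```python
from urllib.parse import quote, unquote


def login_link(return_path, login_path="/login"):
    """Build a login link that carries the return path in the fragment."""
    return f"{login_path}#forwarding={quote(return_path, safe='')}"


def forwarding_target(fragment, default="/"):
    """Read the return path back out of a fragment (without the leading #)."""
    prefix = "forwarding="
    if fragment.startswith(prefix):
        return unquote(fragment[len(prefix):])
    return default


link = login_link("/resources/")
print(link)
print(forwarding_target(link.split("#", 1)[1]))
```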
TRACKING PARAMETERS
ufc.com/fightweek?utm_campaign=Intl+Fight+…
ted.com/?utm_medium=email&utm_source=Oxford…
> Special URL tracking redirect loop
> Hash (#) parameter based traffic source tracking
! Crawl Waste
! Littered Index
! Duplicate Content
! Thin Content
! Wasted Internal Link Equity
! Scattered Inbound Link Equity
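
To illustrate the hash-based option, here is a sketch that rewrites a tagged campaign URL so the tracking parameters sit after the `#` instead of in the query string. It assumes tracking parameters share a `utm_` prefix and that the analytics setup is able to read them from the fragment, which needs to be configured separately. The function name and example URLs are hypothetical.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def tracking_to_hash(url, prefix="utm_"):
    """Move query parameters starting with `prefix` into the fragment."""
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    tracking = [(k, v) for k, v in pairs if k.startswith(prefix)]
    kept = [(k, v) for k, v in pairs if not k.startswith(prefix)]
    # Note: an existing fragment is replaced when tracking parameters exist.
    fragment = urlencode(tracking) if tracking else parts.fragment
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), fragment)
    )


print(tracking_to_hash("https://example.com/?utm_medium=email&utm_source=news"))
```

Because search engines ignore the fragment when identifying a URL, every tagged link then points at the one canonical destination.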
FUNCTIONAL PARAMETERS
justpark.com/london/…/garden-car-park/?start_date=2015-08-16&end_date=2015-08-16&start_time=…
> Omit on default
> Server session (but better not)
> Hash parameter for Js
> Cookies / LocalStorage
! Crawl Waste
! Littered Index
! Duplicate Content
! Thin Content
! Wasted Internal Link Equity
! Scattered Inbound Link Equity
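
"Omit on default" can be as simple as a link builder that drops any parameter whose value equals the default, so the default state always maps to the clean destination URL. The parameter names and default values below are made up for illustration.

```python
from urllib.parse import urlencode

# Hypothetical defaults: a parameter set to its default is left out.
DEFAULTS = {"sort": "relevance", "page": "1"}


def build_url(path, params, defaults=DEFAULTS):
    """Build `path?query`, skipping parameters that equal their default."""
    non_default = {
        key: value
        for key, value in sorted(params.items())
        if defaults.get(key) != str(value)
    }
    query = urlencode(non_default)
    return f"{path}?{query}" if query else path


print(build_url("/london/", {"sort": "relevance", "page": 1}))
print(build_url("/london/", {"sort": "price", "page": 1}))
```

Sorting the keys also keeps parameter order stable, which avoids two URLs for the same state.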
PRINT VERSION
worldbank.org/…/modules/economic/gnp/print.html
rightmove.co.uk/…/print.html?listingId=47812940
> Print Stylesheet
! Crawl Waste
! Littered Index
! Duplicate Content
! Thin Content
! Wasted Internal Link Equity
! Scattered Inbound Link Equity
LOGGED-IN FUNCTIONALITY
justpark.com/…/book/?listing_id=148685&…
rightmove.co.uk/addtoshortlist.html?listing_id=478129
> Logged-out version link to /login#forwarding=xxx
> Logged-out version Js span trigger login lightbox
> POST to the product URL
> AJAX for logged-in
! Crawl Waste
! Littered Index
! Duplicate Content
! Thin Content
! Wasted Internal Link Equity
! Scattered Inbound Link Equity
SEARCH PAGINATION & FILTERS
justpark.com/uk/parking/brighton/?page=2
rightmove.co.uk/…/London.html?sortType=1
> InPage-only AJAX manipulations
> Cookies
> Hash parameters and on load AJAX processing
! Crawl Waste
! Littered Index
! Duplicate Content
! Thin Content
! Wasted Internal Link Equity
! Scattered Inbound Link Equity
DYNAMIC SEARCH URLS
gumtree.com/search?q=car&tq=%7B%22i%22%3A...
ebay.co.uk/sch/i.html?_nkw=car&_from=R40&_tr…
> Canonicalising redirects
> Js search form pointing to the canonical URL
> AJAX search with pushState canonical URLs (SPA)
! Crawl Waste
! Littered Index
! Duplicate Content
! Thin Content
! Wasted Internal Link Equity
! Scattered Inbound Link Equity
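
A canonicalising redirect might look roughly like this: when a free-text search matches a known destination, the search URL 301s to that destination instead of rendering its own results page. The `q` parameter, the lookup table and the function name are assumptions for the sketch. A real site would resolve queries against its own location or category data.

```python
import re
from urllib.parse import parse_qs, urlsplit

# Hypothetical lookup from normalised query to destination URL.
KNOWN_DESTINATIONS = {
    "london": "/uk/parking/london/",
    "brighton": "/uk/parking/brighton/",
}


def canonical_search_redirect(url, known=KNOWN_DESTINATIONS):
    """Return (301, destination) if the search query maps to one, else None."""
    query = parse_qs(urlsplit(url).query).get("q", [""])[0]
    # Normalise whitespace and case so variants hit the same destination.
    key = re.sub(r"\s+", " ", query).strip().lower()
    target = known.get(key)
    return (301, target) if target else None


print(canonical_search_redirect("/search?q=London"))
```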
SITE-WIDE LINKS (HEADER / FOOTER)
rightmove.co.uk/…/terms-of-use-and-privacy-policy
justpark.com/uk/airport-parking/
> Js span triggered AJAX lightbox
> Shortlisting only relevant resources by page type (homepage, search, etc.)
> Merge multiple site-wide-linked pieces into a single location with hash deep links
! Crawl Waste
! Littered Index
! Duplicate Content
! Thin Content
! Wasted Internal Link Equity
! Scattered Inbound Link Equity
And, please, crawl your sites to make sure you’re not linking to URLs that redirect or canonicalise to other URLs!
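
Given a crawl export, this check is easy to automate. The sketch below assumes the crawl data has been loaded into two hypothetical structures: a list of (source, target) internal links and a dict of per-URL status code and canonical tag. Real crawler exports will need mapping into this shape.

```python
def wasteful_links(links, pages):
    """Find internal links whose target redirects or canonicalises elsewhere.

    `links` is an iterable of (source_url, target_url) pairs.
    `pages` maps a URL to {"status": int, "canonical": str or None}.
    """
    flagged = []
    for source, target in links:
        info = pages.get(target)
        if info is None:
            continue  # target was not crawled, nothing to judge
        if 300 <= info["status"] < 400:
            flagged.append((source, target, "redirects"))
        elif info.get("canonical") and info["canonical"] != target:
            flagged.append((source, target, "canonicalises elsewhere"))
    return flagged


# Hypothetical crawl data for illustration only.
pages = {
    "/a/": {"status": 200, "canonical": "/a/"},
    "/old/": {"status": 301, "canonical": None},
    "/a/?page=2": {"status": 200, "canonical": "/a/"},
}
links = [("/", "/a/"), ("/", "/old/"), ("/a/", "/a/?page=2")]

for source, target, reason in wasteful_links(links, pages):
    print(f"{source} -> {target}: {reason}")
```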

Large Site SEO Architecture - #BrightonSEO 2015