International SEO: The Weird Technical Parts - Pubcon Vegas 2019 Patrick Stox
Presented by: Patrick Stox
• Technical SEO for IBM - Opinions expressed are
my own and not those of IBM.
• I write, mainly for Search Engine Land
• I speak at some conferences like this one, SMX,
• Organizer for the Raleigh SEO Meetup (most successful
• We also run a conference, the Raleigh SEO Conference
• Also the Beer & SEO Meetup (because beer)
• 2017 + 2018 + 2019 US Search Awards Judge, 2017 +
2018 + 2019 UK Search Awards Judge, 2018 + 2019
Interactive Marketing Awards Judge
• Founder Technical SEO Slack Group
• Moderator /r/TechSEO on Reddit
Who is Patrick Stox?
First Off, Listen to Aleyda!
Those were best practices
• They work in pairs to form a cluster of pages
• When set up correctly, they share signals such as links
from the strongest page in the cluster, which helps the
other versions rank better.
More About Hreflang
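Because the annotations work in pairs, a missing return link is a common way for a cluster to break. The following is a minimal sketch of a reciprocity check, assuming you have already extracted each page's hreflang annotations into a dictionary. The `pages` structure, the function name, and the example.com URLs are hypothetical and used only for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical input shape: page URL -> {hreflang code: target URL found on that page}
Annotations = Dict[str, Dict[str, str]]


def missing_return_links(pages: Annotations) -> List[Tuple[str, str]]:
    """Return (source, target) pairs where target does not point back to source."""
    missing = []
    for source, alternates in pages.items():
        for target in alternates.values():
            if target == source:
                continue  # a self-reference needs no return link
            back_links = pages.get(target, {}).values()
            if source not in back_links:
                missing.append((source, target))
    return missing


# Illustrative data: the en-gb page forgets to point back at the en-us page.
pages = {
    "https://example.com/en-us/": {
        "en-us": "https://example.com/en-us/",
        "en-gb": "https://example.com/en-gb/",
    },
    "https://example.com/en-gb/": {
        "en-gb": "https://example.com/en-gb/",
    },
}

print(missing_return_links(pages))
```

A target that was never crawled is treated as having no annotations, so it is reported as missing its return link as well.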
Common Misconception #2
Sitemaps are better because hreflang is processed
The reality is that signals are checked at the time of crawl,
whether they come from a sitemap or the HTML.
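For reference, the same annotations can be written either way. This sketch renders one set of alternates as HTML link tags and as a sitemap entry; the function names and URLs are made up for illustration. The sitemap form assumes the enclosing urlset element declares the xhtml namespace (xmlns:xhtml="http://www.w3.org/1999/xhtml").

```python
from html import escape
from typing import Dict


def html_link_tags(alternates: Dict[str, str]) -> str:
    """Render hreflang annotations as <link> tags for the <head> section."""
    return "\n".join(
        f'<link rel="alternate" hreflang="{escape(code)}" href="{escape(url)}" />'
        for code, url in alternates.items()
    )


def sitemap_url_entry(loc: str, alternates: Dict[str, str]) -> str:
    """Render the same annotations as one <url> entry of an XML sitemap."""
    links = "\n".join(
        f'  <xhtml:link rel="alternate" hreflang="{escape(code)}" href="{escape(url)}" />'
        for code, url in alternates.items()
    )
    return f"<url>\n  <loc>{escape(loc)}</loc>\n{links}\n</url>"


# Hypothetical alternates for a single page.
alternates = {
    "en-us": "https://example.com/en-us/",
    "en-gb": "https://example.com/en-gb/",
}

print(html_link_tags(alternates))
print(sitemap_url_entry("https://example.com/en-us/", alternates))
```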
Common Misconception #3
Hreflang is based on your canonical tag.
It doesn’t matter what your canonical is set as, it
matters what the indexed version is. (more on why
that is important later)
Here’s What You Can Get Away With
AKA Google is going to ignore or fix these things for you
It looks like Google lets this slide, although standards and best
practices say to use “-” (hyphen) instead of “_” (underscore)
Hreflang – Underscore Instead of Dash
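Even if Google tolerates the underscore, it is cheap to normalize codes before they are published. This is a small sketch and the function name is hypothetical. Lowercasing is only a convenience for consistent comparison, since language tags are case-insensitive.

```python
def normalize_hreflang(code: str) -> str:
    """Replace underscores with hyphens and lowercase the code for comparison."""
    return code.strip().replace("_", "-").lower()


print(normalize_hreflang("en_GB"))  # en-gb
```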
Is it essential to have self-referencing hreflang tags?
John Mueller: No.
Hreflang – Self Referencing
Hreflang – Relative vs Absolute URLs
Hreflang to a URL that Redirects
Mostly true, but X-default can be used for auto-
redirecting, serving dynamic content, or as a language
More about X-default
The way this works is by most specific match, so based on
signals what is the best matching page:
• lc-cc = language in the specified country
• lc = doesn't match a country but matches the language
settings of the user searching
• x-default = overall default if it doesn't match any
language+country or language or if it can't be determined.
This is like a catch all for anything that doesn't match
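The fallback order above can be sketched as a simple lookup. This is a simplified model of the ordering described on the slide, not Google's actual implementation. The function name and the example URLs are assumptions made for illustration.

```python
from typing import Dict, Optional


def pick_version(
    alternates: Dict[str, str],
    language: Optional[str],
    country: Optional[str] = None,
) -> Optional[str]:
    """Pick the most specific match: lc-cc, then lc, then x-default."""
    by_code = {code.lower(): url for code, url in alternates.items()}
    if language and country:
        key = f"{language}-{country}".lower()
        if key in by_code:
            return by_code[key]  # language in the specified country
    if language and language.lower() in by_code:
        return by_code[language.lower()]  # language only
    return by_code.get("x-default")  # catch-all, or None if not declared


# Hypothetical cluster of alternates.
alternates = {
    "en-gb": "https://example.com/en-gb/",
    "en": "https://example.com/en/",
    "x-default": "https://example.com/",
}

print(pick_version(alternates, "en", "gb"))  # https://example.com/en-gb/
print(pick_version(alternates, "en", "us"))  # https://example.com/en/
print(pick_version(alternates, "fr", "fr"))  # https://example.com/
```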
Putting Everything On One Page
Googlebot crawls from the US, putting all versions of
content on one page just means they only see one
Also Be Careful with Auto-Redirecting
People usually use some combination of cookies, IP,
and browser language.
Redirect all users based on location
• Pros: Users mostly end up on the right version
• Cons: cost, complexity, maintenance, accuracy, crawl/index issues, titles
and descriptions in the wrong language
• Would redirect search engines to where they crawl from. Google for instance
mostly crawls from the US so we would effectively de-index all geo pages. For
other search engines and occasionally from Google (they sometimes crawl
from outside the US) we may remove the US pages from the index.
• Potential legal issues with EU Anti-Geoblocking Regulations, which say not to
automatically redirect users to a different country and to give them an option instead
Redirect Users but not Search Engines
Give the user a choice
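One way to give the user a choice is to work out a suggested version from the browser language and show it in a banner, without redirecting anyone. The sketch below is an assumption about how that could look, not a description of any particular site's setup. The function names, the `available` mapping (keys assumed to be lowercase hreflang codes), and the URLs are all hypothetical.

```python
from typing import Dict, List, Optional


def parse_accept_language(header: str) -> List[str]:
    """Return language tags from an Accept-Language header, highest q first."""
    weighted = []
    for part in header.split(","):
        piece = part.strip()
        if not piece:
            continue
        tag, _, params = piece.partition(";")
        quality = 1.0
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0  # treat a malformed weight as "not wanted"
        weighted.append((quality, tag.strip().lower()))
    weighted.sort(key=lambda item: item[0], reverse=True)  # stable sort keeps ties in order
    return [tag for quality, tag in weighted if quality > 0]


def suggest_version(
    header: str, current_code: str, available: Dict[str, str]
) -> Optional[str]:
    """Return a URL to offer in a banner, or None if the current page already fits."""
    for tag in parse_accept_language(header):
        for candidate in (tag, tag.split("-")[0]):
            if candidate == current_code.lower():
                return None  # user is already on a matching version
            if candidate in available:
                return available[candidate]
    return None


# Hypothetical site versions, keyed by lowercase hreflang code.
available = {
    "en-us": "https://example.com/en-us/",
    "de-de": "https://example.com/de-de/",
}

# A German browser on the US page gets a suggestion, not a redirect.
print(suggest_version("de-DE,de;q=0.9,en;q=0.8", "en-us", available))
```

Since nothing here depends on IP or issues a redirect, crawlers and users requesting a URL still receive that URL.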
Be Careful With These Banners
Check Your Stack
Encoding characters in URLs with UTF-8 is fine with
Google, but there may be a point of failure in your
tech stack where it is not supported.
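A quick way to look for such a point of failure is to check that a non-ASCII path survives percent-encoding and decoding. This sketch uses Python's standard `urllib.parse`. The path is an invented example, and the Latin-1 decode simulates a component that assumes the wrong encoding.

```python
from urllib.parse import quote, unquote

path = "/de/über-uns/"  # hypothetical URL path with a non-ASCII character

# quote() percent-encodes using UTF-8 by default and leaves "/" untouched.
encoded = quote(path)
print(encoded)  # /de/%C3%BCber-uns/

# A component that decodes as UTF-8 gets the original path back.
assert unquote(encoded) == path

# A component that assumes Latin-1 produces mojibake instead.
broken = unquote(encoded, encoding="latin-1")
print(broken)  # /de/Ã¼ber-uns/
```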
Page Serves From A Different URL Than
How It’s Indexed
You need the indexed version in your tags.
Canonicalization can be complex: Canonical tag,
redirects, internal linking, sitemap URLs, hreflang???,
You can’t have hreflang tags in the body because they
could be used for hijacking. The tags can be forced
into the body section under certain conditions.
This can be caused by things like iframes or <p> tags in the
<head> section, or by tags being injected there. Use DOM
breakpoints to troubleshoot.
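Before reaching for DOM breakpoints, a rough static check of the raw HTML can flag the obvious cases. The sketch below is a heuristic only: it looks for an element inside the head section that is not normally allowed there and lists any hreflang links that come after it. It does not replicate how a browser parses the page, it will not see tags injected by JavaScript, and the class and function names are hypothetical.

```python
from html.parser import HTMLParser
from typing import List, Optional, Tuple

# Elements that may normally appear inside <head>.
HEAD_ALLOWED = {"title", "meta", "link", "script", "style", "base", "noscript", "template"}


class HeadChecker(HTMLParser):
    def __init__(self) -> None:
        super().__init__()
        self.in_head = False
        self.head_breaker: Optional[str] = None  # first disallowed tag seen in <head>
        self.hreflang_after_break: List[Optional[str]] = []

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self.in_head = True
            return
        if not self.in_head:
            return
        if self.head_breaker is None and tag not in HEAD_ALLOWED:
            self.head_breaker = tag
        attributes = dict(attrs)
        if tag == "link" and "hreflang" in attributes and self.head_breaker:
            self.hreflang_after_break.append(attributes.get("href"))

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False


def find_hreflang_after_head_break(html: str) -> Tuple[Optional[str], List[Optional[str]]]:
    """Return the tag that likely ends <head> early and the hreflang hrefs after it."""
    checker = HeadChecker()
    checker.feed(html)
    return checker.head_breaker, checker.hreflang_after_break


sample = (
    "<html><head><title>t</title><p>oops</p>"
    '<link rel="alternate" hreflang="en-gb" href="https://example.com/en-gb/">'
    "</head><body></body></html>"
)
print(find_hreflang_after_head_break(sample))
```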
What Happens with Different Setups?
Domain names can expire.
Domains can become invalidated in GSC (impacts
sitemap implementation of hreflang)
Difficult to maintain.
Can become invalidated in GSC (impacts sitemap
implementation of hreflang)
SEOs don’t like them and may try to change things.
Change is when things go wrong.
Same page in multiple countries, what happens?
Google views them as duplicates and folds them together in its index
(it is trying to help us by consolidating the duplicates).
The wrong versions of the pages show in the locales: whichever
page became the main version when folded shows instead.
Any – Duplicates
Eventually Google will make the connections via hreflang
tags, and pages will be swapped properly based on locale so
the correct one shows.
Mostly language based with dynamic personalization or
options for different countries.
Fewer, stronger pages.
What I Prefer
GSC Shows Data on Canonical
Checking the GSC URL Inspection tool to see how
pages are indexed
GSC International Targeting Report
The right way to check which page is
ranking in a country
1. Go to the version of Google for the country and
perform your search.
2. At the end of the search URL, add &hl=lc&gl=cc, where
lc is the language code and cc is the country code.
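The two steps above can be sketched as a small URL builder. The function name, the host, and the query are illustrative assumptions. Only the hl and gl parameters come from the slide.

```python
from urllib.parse import urlencode


def localized_search_url(google_host: str, query: str, lc: str, cc: str) -> str:
    """Build a search URL on a country's Google host with hl (language) and gl (country)."""
    params = {"q": query, "hl": lc, "gl": cc}
    return f"https://{google_host}/search?{urlencode(params)}"


print(localized_search_url("www.google.co.uk", "running shoes", "en", "gb"))
# https://www.google.co.uk/search?q=running+shoes&hl=en&gl=gb
```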
Regression tests, crawlers, proactive crawlers
Any number of things can break from masking to things
carrying over from dev/test/staging environments, CMS