Troubleshooting SEO for JS Frameworks - Patrick Stox - DTD 2018

@patrickstox #DTDconf
Troubleshooting SEO for modern JS frameworks

Who is Patrick Stox
Technical SEO for IBM - Opinions expressed are my own and not those of IBM.
I write, mainly for Search Engine Land
I speak at some conferences like this one, SMX, Pubcon, TechSEO Boost, etc.
Organizer for the Raleigh SEO Meetup (most successful in US)
We also run a conference, the Raleigh SEO Conference
Also the Beer & SEO Meetup (because beer)
2017 US Search Awards Judge, 2017 UK Search Awards Judge
2018 Interactive Marketing Awards Judge
Some of you may know me from the Wix SEO Hero contest (I got disqualified
https://beanseohero.com/)

Why do SEOs need to learn about JS?

WordPress is replacing their TinyMCE editor with Gutenberg built in React
The content blocks make it into more of a page builder

Be prepared
Devs will focus on build and functionality; SEO and accessibility are afterthoughts.
Many of these folks are devs who haven’t had to work with SEOs before.
Counterpoint: many SEOs haven’t worked with JS or these devs.

What’s an SEO to do?

Find out about the setup, what framework, how is it rendering?
• Server-Side Rendering (SSR) – renders on demand from the server
• Pre-Rendering – a headless browser records the DOM (Document Object Model) and
creates an HTML snapshot. Like SSR, but done pre-deployment. Prerender.io,
BromBone, PhantomJS
• Client-Side Rendering – rendered in the user's browser
• Isomorphic or Universal – serves a server-rendered version on the first load, then JS
takes over for subsequent loads.

What should I use?
• Server-Side Rendering (SSR) – slower TTFB unless you cache, will work for the ~2% of
users with JS disabled.
• Pre-Rendering – may not serve the latest version, doesn’t allow for personalization, will
work for the ~2% of users with JS disabled.
• Client-Side Rendering - longer to load but everything is available and can be changed
quickly. A loading image is typically used but you may see a blank page.
• Isomorphic or Universal - This is probably the best setup, but it can be a lot of resources
to load.

Can I detect the User Agent and serve an image only to Googlebot?
Sort of. It’s technically cloaking, but if the output is the same I would do it.
Really, make sure your output is the same.
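
A minimal sketch of the idea above, assuming a setup where a pre-rendered snapshot is served to bots and the client-side app to everyone else. The names `BOT_PATTERN`, `chooseVariant` and `sameOutput` are hypothetical, and the bot pattern is illustrative rather than an official list. The second function is the important part: it checks that both variants carry the same content.

```javascript
// Illustrative pattern only (an assumption), not a complete or official bot list.
const BOT_PATTERN = /googlebot|bingbot/i;

// Decide which variant to serve for a given User-Agent header.
function chooseVariant(userAgent) {
  return BOT_PATTERN.test(userAgent || "") ? "prerendered" : "client";
}

// Compare the text served to bots with the text users see after rendering.
// Whitespace is collapsed so formatting differences are ignored.
function sameOutput(prerenderedText, renderedText) {
  const normalize = (s) => s.replace(/\s+/g, " ").trim();
  return normalize(prerenderedText) === normalize(renderedText);
}
```

If `sameOutput` ever returns false for a page, the two variants have drifted apart, and that is where the cloaking risk starts.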

What about Bing or international markets?
Fabrice Canel of Bing said at Pubcon Vegas in 2017 that Bing processes JS. (be wary)
If you’re in Asian markets you’re out of luck for now.
I’d still prefer to use Isomorphic or Universal.

URLs
Most of the frameworks have a router allowing you to customize URLs.
/en/us?Topics%5B0%5D%5B0%5D=cat.topic%3Ainfrastructure
Create patterns to match /{language}/{country}/{category}/{slug}
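
As a rough sketch of that pattern, the hypothetical `parseRoute` below matches /{language}/{country}/{category}/{slug} with a plain regular expression. A real app would use its framework's router. The two-letter language and country codes are an assumption, and the slug in the usage note is made up.

```javascript
// Two-letter language and country codes are an assumption for this sketch.
const ROUTE = /^\/([a-z]{2})\/([a-z]{2})\/([^\/]+)\/([^\/]+)\/?$/;

// Returns the named parts of a clean URL path, or null if it does not match.
function parseRoute(pathname) {
  const match = ROUTE.exec(pathname);
  if (!match) return null;
  const [, language, country, category, slug] = match;
  return { language, country, category, slug };
}
```

For example, `parseRoute("/en/us/infrastructure/some-article")` yields language `en`, country `us`, category `infrastructure` and slug `some-article`, while a bare `/en/us` path returns null.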

They don’t have to change URLs to show different content
To change URLs for different content, the HTML5 History API and its pushState() method are usually used.
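
A small sketch of updating the URL when the content changes. `showCategory` and the `/en/us/` prefix are hypothetical. In a browser you would pass `window.history`; it is a parameter here only so the function can be exercised outside a browser.

```javascript
// historyObj: window.history in a browser (any object with pushState works).
// render: hypothetical callback that swaps the visible content.
function showCategory(historyObj, category, render) {
  const url = "/en/us/" + encodeURIComponent(category);
  // pushState(state, title, url) changes the address bar without a page load.
  historyObj.pushState({ category: category }, "", url);
  render(category);
  return url;
}
```

Each distinct piece of content then has its own URL that can be linked to and crawled.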

Links
You still need <a href=
ng-click, href="javascript:void(0);" – these won’t be seen as links
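
A quick way to audit this is to check each anchor's href value. The helper below is a hypothetical sketch that only covers the cases named above: a missing href (for example, an element that relies on ng-click alone) and a javascript: pseudo-URL.

```javascript
// Returns true when the href looks like a real URL a crawler could follow.
// Only the cases from the slide are covered; this is not a full validator.
function isCrawlableHref(href) {
  if (typeof href !== "string") return false; // no href attribute at all
  const value = href.trim();
  if (value === "") return false;             // empty href
  return !/^javascript:/i.test(value);        // javascript:void(0); and friends
}
```

In a browser console this could be run over `document.querySelectorAll("a")`, passing `a.getAttribute("href")` for each element.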

Noscript
If you put something in <noscript> it’s probably going to be ignored.

It’s hard to 404
You can add a noindex to any error pages along with a message. This will be treated as a
soft 404.
JS redirect to an actual 404 page that returns the status code.
Create a 404 Route.
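
A minimal sketch of a catch-all 404 route, assuming a hypothetical route table (`ROUTES`) and resolver (`resolveRoute`). With server-side rendering the `status` value can be sent as the real HTTP status code. In a purely client-side app only the `robots` value can be applied (as a meta tag), which gives the soft 404 behavior described above.

```javascript
// Hypothetical route table; a real app would use its framework's router.
const ROUTES = { "/": "Home", "/about": "About us" };

function resolveRoute(pathname) {
  if (Object.prototype.hasOwnProperty.call(ROUTES, pathname)) {
    return { status: 200, robots: "index, follow", body: ROUTES[pathname] };
  }
  // Catch-all: anything not in the table gets the 404 route.
  return { status: 404, robots: "noindex", body: "Page not found" };
}
```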

Things start to get interesting…

View Source gets you this:
<!DOCTYPE html><html><head><meta charSet="utf-8" class="next-head"/><script class="next-
head">window.NREUM||(NREUM={});NREUM.info = {"agent":"","beacon":"bam.nr-data.net","errorBeacon":"bam.nr-
data.net","licenseKey":"2961bc4e3a","applicationID":"103427480","applicationTime":0.347295,"transactionName":"YQdSMU
cDXEMAVEUMClhNcxBGFl1dTkBUCQZZD1U=","queueTime":0,"ttGuid":"8aaa5aa3717120","agentToken":null};
(window.NREUM||(NREUM={})).loader_config={xpid:"VQIHUVBSCRABVFJWBQYDXlQ="};window.NREUM||(NREUM={}),_
_nr_require=function(t,e,n){function r(n){if(!e[n]){var o=e[n]={exports:{}};t[n][0].call(o.exports,function(e){var o=t[n][1][e];return
r(o||e)},o,o.exports)}return e[n].exports}if("function"==typeof __nr_require)return __nr_require;for(var
o=0;o<n.length;o++)r(n[o]);return r}({1:[function(t,e,n){function r(t){try{c.console&&console.log(t)}catch(e){}}var
o,i=t("ee"),a=t(20),c={};try{o=localStorage.getItem("__nr_flags").split(","),console&&"function"==typeof
console.log&&(c.console=!0,o.indexOf("dev")!==-1&&(c.dev=!0),o.indexOf("nr_dev")!==-
1&&(c.nrDev=!0))}catch(s){}c.nrDev&&i.on("internal-error",function(t){r(t.stack)}),c.dev&&i.on("fn-
err",function(t,e,n){r(n.stack)}),c.dev&&(r("NR AGENT IN DEVELOPMENT MODE"),r("flags: "+a(c,function(t,e){return
t}).join(", ")))},{}],2:[function(t,e,n){function r(t,e,n,r,c){try{h?h-=1:o(c||new
UncaughtException(t,e,n),!0)}catch(f){try{i("ierr",[f,s.now(),!0])}catch(d){}}return"function"==typeof
u&&u.apply(this,a(arguments))}function UncaughtException(t,e,n){this.message=t||"Uncaught error with no additional

Source and Google’s Cache
Both the source code and Google’s cache are raw HTML, before JS is processed.
You can’t rely on these when working with JS.

Inspect or Inspect Element
Viewing the DOM shows you the HTML after the JS has been processed.
If you want to make sure Google sees it, load in the DOM by default without needing an
action like click or scroll.

Render a page as Googlebot
• Google Search Console / Fetch and Render - renders a page with the real Googlebot as
either desktop or mobile, lists blocked / unreachable resources. Lets you submit rendered
pages to Google search for indexing. This is stricter than the system normally used.
• Google Mobile Friendly Test - renders a page with smartphone Googlebot.
• Merkle Fetch and Render - fetch & render a page with any bot
• Google Rich Results Test - shows the rendered DOM
• Crawlers: Screaming Frog, Sitebulb, Botify

To check indexing
Search Google:
site:domain.com “use a snippet of text from your page”
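
If you check many pages, the query can be built programmatically. This is a small sketch with hypothetical helper names; the search URL format is the ordinary Google search `q` parameter.

```javascript
// Builds the site: query with the snippet in straight double quotes.
function siteQuery(domain, snippet) {
  return "site:" + domain + ' "' + snippet + '"';
}

// Wraps the query in a Google search URL that can be opened in a browser.
function siteSearchUrl(domain, snippet) {
  return "https://www.google.com/search?q=" + encodeURIComponent(siteQuery(domain, snippet));
}
```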

It’s not quite that simple

What gets indexed first may be the raw HTML, before the render happens,
so you may not find your content in a site: search yet.
You’ll see errors in the GSC HTML Improvements Report.
You might need to check the source code vs the DOM to see what will change once the Web
Rendering Service (WRS) renders the page.
Nofollow added via JS is a bad idea because it may show as follow before rendered.
Internal links may not be picked up and added to crawl before the render happens.
We set up a test site in October and couldn’t get everything indexed, but we put it live
and everything was fine. Google may not want to crawl everything on a test site.
If you need the information indexed fast, use server-side rendering or pre-rendering.
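
One way to compare the source code with the DOM is to list the links that only exist after rendering. The sketch below uses a simple regular expression, which is an assumption that works for tidy markup but is not a real HTML parser. The function names are hypothetical. Save the output of View Source and the rendered DOM (copied from Inspect) as strings and pass them in.

```javascript
// Collects href values from <a> tags using a rough regex (not a full parser).
function extractHrefs(html) {
  const hrefs = new Set();
  const pattern = /<a\b[^>]*\bhref\s*=\s*["']([^"']+)["']/gi;
  let match;
  while ((match = pattern.exec(html)) !== null) {
    hrefs.add(match[1]);
  }
  return hrefs;
}

// Links present in the rendered DOM but missing from the raw source.
// These are the ones that may not be crawled until the render happens.
function linksAddedByJs(sourceHtml, renderedHtml) {
  const before = extractHrefs(sourceHtml);
  return Array.from(extractHrefs(renderedHtml)).filter((href) => !before.has(href));
}
```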

Experiments in June 2017 – Slack log
"user": "U0SB51WPJ",
"text": "I requested it and the timing puts 2 different bots there. Mozilla/5.0 (compatible; Googlebot/2.1;
+<http://www.google.com/bot.html>) and Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko;
Google Search Console) Chrome/41.0.2272.118 Safari/537.36",
"ts": "1496434137.849731"
"user": "U2FNA0DQB",
"text": "So it's possible that the same user agent that we know as console - when we request it manually - is the same
user agent we'd see when they hit us with the simlauted browser on their own without us manually doing via fetch",
"ts": "1496434501.951224"
"user": "U0SB51WPJ",
"text": "^thinking about this a second, GSC shows how googlebot saw and how a visitor saw the page, so first is google
and second is the second tab for how visitors saw the page.",
"ts": "1496434516.955902"

Conclusion:

Confirmed in August
https://developers.google.com/search/docs/guides/rendering

Highlights:
Googlebot and WRS only speak HTTP/1.x and FTP, with and without TLS.
For rendering, Googlebot is stateless: Local Storage, Session Storage, HTTP Cookies are all
cleared across page loads
Not available: Service Workers, IndexedDB, WebGL, WebSQL, Fetch, ES6 / ECMAScript 6
support
Transpile ES6 to ES5 for now (thanks Bartosz!)
Recommend using feature detection and Polyfills & treat Googlebot like any browser
WRS declines permission requests like Camera API, Geolocation API, and Notifications API.
Googlebot and WRS continuously analyze and identify resources that don’t contribute to
essential page content and may not fetch such resources.
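
Feature detection can be sketched like this: test for the capability instead of sniffing for Googlebot. `pickTransport` is a hypothetical helper. In a browser you would pass `window`; the scope is a parameter so the logic can be checked anywhere.

```javascript
// Chooses how to make requests based on what the environment supports.
// Fetch is listed above as not available in WRS, so XHR is the fallback.
function pickTransport(scope) {
  if (typeof scope.fetch === "function") return "fetch";
  if (typeof scope.XMLHttpRequest === "function") return "xhr";
  return "none";
}
```

The same pattern applies to the other unavailable features: check that the API exists, then load a polyfill or fall back when it does not.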

Tracking Errors and Debugging
https://developers.google.com/search/docs/guides/debug-rendering
You can also download Chrome 41.

Myth busting
Because of this experiment from Max Prin, most people believe that Googlebot will only wait
5 seconds for a page to load. https://maxxeight.com/tests/js-timer/

What else we know
Mostly crawls from West Coast US (Mountain View).
According to Gary Illyes at SMX Advanced they don’t throttle speeds when checking mobile
sites.
They are very aggressive with caching everything (you may want to use file versioning).
In fact, because of the caching and, you know, being a bot and all, they actually run things as
fast as they can with a sped-up clock. Check out this post from Tom Anthony:
http://www.tomanthony.co.uk/blog/googlebot-javascript-random/
Mobile Googlebot's screen size is 431 X 731 and Google resizes to 12,140 pixels high, while
the desktop version is 768 X 1024 but Google only resizes to 9,307 pixels high. Check out
JR Oakes https://codeseo.io/console-log-hacking-for-googlebot/ This can screw with you in
GSC Fetch and Render if you resize images based on height.
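
On the file versioning point above: one common approach is to put a content hash in the file name, so a changed file gets a new URL and aggressive caching cannot serve a stale copy. This is a sketch under assumptions. `versionedName` is hypothetical, and the hash is a simple FNV-1a-style hash over UTF-16 code units chosen to keep the example self-contained. Build tools normally handle this for you.

```javascript
// Simple FNV-1a-style hash over UTF-16 code units, as 8 hex characters.
function fnv1a(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0; // keep as unsigned 32-bit
  }
  return hash.toString(16).padStart(8, "0");
}

// Inserts the content hash before the extension: app.js -> app.<hash>.js
function versionedName(fileName, contents) {
  const dot = fileName.lastIndexOf(".");
  const hash = fnv1a(contents);
  if (dot === -1) return fileName + "." + hash;
  return fileName.slice(0, dot) + "." + hash + fileName.slice(dot);
}
```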

Thank You, that’s it for me.
Now ask Bartosz all of your questions.
