React JS and Search Engines - Patrick Stox at Triangle ReactJS Meetup

Who is Patrick Stox?
Patrick Stox | @patrickstox
• Technical SEO for IBM - Opinions expressed are my own and not those of IBM.
• I write, mainly for Search Engine Land
• I speak at some conferences like SMX, Pubcon, TechSEO Boost
• Organizer for the Raleigh SEO Meetup (most successful in US) and the Beer & SEO Meetup
• We also run a conference, the Raleigh SEO Conference
• Judge: 2017/18/19 US Search Awards, 2017/18/19 UK Search Awards, 2018/19 Interactive Marketing Awards, 2019 SEMrush AU Search Awards
• Founder Technical SEO Slack Group
• Moderator /r/TechSEO on Reddit
• Finalist for SEO Speaker of the Year and SEO Contributor of the Year - 2018 Search Engine Land Awards.
• On a lot of top SEO lists like 140 of Today's Top SEO Experts to Follow.
• Part of SERoundtable’s Honor an SEO series

Why I’m Here:

Googlebot Before
Chrome 41
Polyfills
ES5
Googlebot Now
Evergreen
All newer features should work
Tools are updated also
https://webmasters.googleblog.com/2019/05/the-new-evergreen-googlebot.html
https://webmasters.googleblog.com/2019/08/evergreen-googlebot-in-testing-tools.html

Content Initially Indexed is HTML Snapshot
Source Google: https://developers.google.com/search/docs/guides/javascript-seo-basics

Render Options
Source Google: https://developers.google.com/web/updates/2019/02/rendering-on-the-web

Google Recommends Dynamic Rendering
Dynamic rendering means switching between client-side rendered and pre-rendered content for specific user agents.
Options: Puppeteer, Rendertron, prerender.io
Source Google: https://developers.google.com/search/docs/guides/dynamic-rendering
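The switch described above can be sketched as a small user-agent check. This is only a rough illustration: the pattern list, the `RenderMode` type, and `chooseRenderMode` are hypothetical names, and the list of crawlers is deliberately incomplete. The pre-rendered output should carry the same content the client-side version would show.

```typescript
// Hypothetical, incomplete list of crawler user-agent patterns (an assumption).
const BOT_PATTERNS: RegExp[] = [/googlebot/i, /bingbot/i, /yandex/i, /baiduspider/i];

type RenderMode = "prerendered" | "client-side";

// Decide which version of a page to serve for a given User-Agent header.
function chooseRenderMode(userAgent: string | undefined): RenderMode {
  if (!userAgent) {
    // No header at all: treat the request like a normal browser.
    return "client-side";
  }
  for (const pattern of BOT_PATTERNS) {
    if (pattern.test(userAgent)) {
      // Known crawler: serve the pre-rendered HTML.
      return "prerendered";
    }
  }
  return "client-side";
}
```

A server would call `chooseRenderMode(request.headers["user-agent"])` and then either proxy to a pre-renderer such as Rendertron or serve the normal app shell.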

Google’s View of JS Devs

For SEOs: Cloaking
Using dynamic rendering to serve completely different content to users and crawlers can be considered cloaking.
Cloaking refers to the practice of presenting different content or URLs to human users and search engines. Cloaking is considered a violation of Google’s Webmaster Guidelines because it provides our users with different results than they expected.
https://support.google.com/webmasters/answer/66355?hl=en

Testing
Don’t use Google’s cache. That’s an HTML snapshot processed by your browser.
You all know you shouldn’t use view-source, but many SEOs don’t know that yet. Help them.
Do use the Mobile Friendly Test: https://search.google.com/test/mobile-friendly
URL Inspection Tool: https://search.google.com/search-console
- Shows loaded/blocked resources, console output and exceptions, rendered DOM
Rich Results (desktop): https://search.google.com/test/rich-results
Google: site:whatever.com "part of your text" to check if text is seen

Other Search Engines
Bing has the capability, but not the scale. They mostly use this for top pages and web spam. Personal note: they may be ramping this up, but they haven’t announced anything yet.
Yandex = limited
Most Asian Search Engines = no. I’ve seen nothing from Baidu, Naver, etc.

Need To Know About Googlebot
Declines user permission requests
Stateless, doesn't navigate
• Local Storage and Session Storage data are cleared across page loads.
• HTTP Cookies are cleared across page loads.
Use feature detection to identify supported APIs and capabilities
Make sure your web components are search-friendly:
• To encapsulate and hide implementation details, use shadow DOM.
• Put your content into light DOM whenever possible.
https://developers.google.com/search/docs/guides/fix-search-javascript
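Since Googlebot declines permission requests, content should not depend on them. A minimal sketch of feature detection with a fallback might look like this. `NavigatorLike`, `DEFAULT_REGION`, and `resolveRegion` are hypothetical names for illustration; in a browser you would pass the real `navigator`.

```typescript
// Minimal shape of the browser objects this sketch touches.
interface PositionLike {
  coords: { latitude: number; longitude: number };
}
interface GeoLike {
  getCurrentPosition(
    onSuccess: (pos: PositionLike) => void,
    onError: (err: unknown) => void
  ): void;
}
interface NavigatorLike {
  geolocation?: GeoLike;
}

// Hypothetical default used when no location is available (an assumption).
const DEFAULT_REGION = "us";

// Resolve a region for the page without ever depending on the permission.
function resolveRegion(
  nav: NavigatorLike,
  regionFor: (lat: number, lon: number) => string,
  done: (region: string) => void
): void {
  // Feature detection: check for the API itself, not the user agent string.
  if (!nav.geolocation) {
    done(DEFAULT_REGION);
    return;
  }
  nav.geolocation.getCurrentPosition(
    (pos) => done(regionFor(pos.coords.latitude, pos.coords.longitude)),
    // Permission declined (as Googlebot does) or lookup failed: use the default.
    () => done(DEFAULT_REGION)
  );
}
```

The point of the design is that the page still renders useful content on every path, including the one where the request is declined.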

Will hit APIs if it’s allowed
They need to access resources (like JavaScript), don’t block them
Between the initial snapshot and the rendered version they will choose the most
restrictive statements (nofollow vs follow, noindex vs index, etc)
Some pages take longer to be processed than others:
https://webmasters.googleblog.com/2017/01/what-crawl-budget-means-for-googlebot.html
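The point above about choosing the most restrictive statements can be modeled in a few lines. This is a toy model of the behavior described on the slide, not Google's implementation; `RobotsDirectives`, `parseRobots`, and `mostRestrictive` are hypothetical names.

```typescript
interface RobotsDirectives {
  index: boolean;
  follow: boolean;
}

// Parse a robots meta content value such as "noindex, follow".
function parseRobots(content: string): RobotsDirectives {
  const tokens = content.toLowerCase().split(",").map((t) => t.trim());
  const has = (token: string): boolean => tokens.indexOf(token) !== -1;
  // "none" is shorthand for "noindex, nofollow".
  return {
    index: !(has("none") || has("noindex")),
    follow: !(has("none") || has("nofollow")),
  };
}

// Combine the initial HTML and the rendered DOM: a restriction in either wins.
function mostRestrictive(initial: RobotsDirectives, rendered: RobotsDirectives): RobotsDirectives {
  return {
    index: initial.index && rendered.index,
    follow: initial.follow && rendered.follow,
  };
}
```

Under this model, a `noindex` in the initial HTML cannot be undone by JavaScript that later switches the tag to `index`.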

Internal links may not be picked up and added to the crawl queue before the render happens
Mobile First: https://webmasters.googleblog.com/2018/03/rolling-out-mobile-first-indexing.html

You May See Another Domain Indexed
If you’re using an app shell model, pages may be folded together in Google’s index. Basically, the HTML looked the same as something else, so Google figured the pages were duplicates and only wanted one record in its index. This should resolve once the page has been through the renderer.

Please
Don't use hash routing
Load content by default without needing an action like click, mouseover, scroll
Make sure links are links: <a href="/good-link">Will be crawled</a>
• No
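One way to catch links that are not really links is a small check you can run in a lint step or a test. The rules below are heuristics based on the advice above (real `href`, no hash routing), not Google's exact logic, and `looksCrawlable` is a hypothetical helper name.

```typescript
// Heuristic check: does this href point at a real, distinct URL?
function looksCrawlable(href: string | null | undefined): boolean {
  if (!href) {
    // <a> with no href, or a non-link element with only a click handler.
    return false;
  }
  const value = href.trim();
  if (value === "" || value.charAt(0) === "#") {
    // Empty or fragment-only targets do not lead to a new URL.
    return false;
  }
  if (/^javascript:/i.test(value)) {
    return false;
  }
  if (value.indexOf("#/") !== -1 || value.indexOf("#!") !== -1) {
    // Hash routing: everything after "#" is not a separate page to a crawler.
    return false;
  }
  return true;
}
```

An in-page anchor such as `/good-link#section` still passes, since the path before the fragment is a real URL.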

Please Clean URLs
Change URLs for different content: History API and HTML5 pushState()
Use your router:
Clean your URLs, none of this:
?Topics%5B0%5D%5B0%5D=cat.topic%3Ainfrastructure
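The query string above decodes to `?Topics[0][0]=cat.topic:infrastructure`. A rough sketch of mapping that kind of state to a readable path and pushing it with the History API could look like this. The `/topics/...` URL scheme, `topicToPath`, and `showTopic` are assumptions for illustration; a router would normally handle the `pushState` call for you.

```typescript
// Minimal shape of the History API that this sketch relies on.
interface HistoryLike {
  pushState(data: unknown, unused: string, url?: string | null): void;
}

// Hypothetical helper: turn a topic id such as "cat.topic:infrastructure"
// into a readable path. The "/topics/" prefix is an assumed URL scheme.
function topicToPath(topicId: string): string {
  const name = topicId.split(":").pop() ?? "";
  const slug = name
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-") // collapse anything unusual into a dash
    .replace(/^-+|-+$/g, ""); // trim leading and trailing dashes
  return `/topics/${slug}`;
}

// Update the address bar to the clean URL when the content changes.
function showTopic(history: HistoryLike, topicId: string): string {
  const path = topicToPath(topicId);
  history.pushState({ topicId }, "", path);
  return path;
}
```

In a browser, `showTopic(window.history, "cat.topic:infrastructure")` would change the URL to `/topics/infrastructure` without a page load.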

Use a 404 When Possible
Help your SEOs understand how this works. JS can’t throw a 404, but you can:
Use a JS redirect or route to a 404 page on the server that returns an actual 404 response.
Not as great: add a noindex to any error pages along with a message like “404 Page Not Found”. Google will treat this as a soft 404 even though the status code shown is 200.
Lots of analytics may fire, and lots of SEO tools and other tools don’t look for soft 404s, so not having the real status code causes issues when looking at data.
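The first option can be sketched in two parts: a server that answers unknown paths with a real 404, and a client that redirects there when a route's data turns out not to exist. The path list, the `/not-found` URL, and the function names are all hypothetical.

```typescript
interface PageResponse {
  status: number;
  body: string;
}

// Hypothetical list of paths the app can actually render.
const KNOWN_PATHS: string[] = ["/", "/products", "/about"];

// Hypothetical URL that is intentionally absent from KNOWN_PATHS.
const NOT_FOUND_URL = "/not-found";

// Server side: unknown paths get a real 404 status code, not a 200 app shell.
function respond(path: string): PageResponse {
  if (KNOWN_PATHS.indexOf(path) !== -1) {
    return { status: 200, body: '<div id="root"></div>' };
  }
  return { status: 404, body: "<h1>404 Page Not Found</h1>" };
}

// Client side: if the API says the item does not exist, return the URL to
// redirect to so the server can send the actual 404 response.
function notFoundRedirect(apiStatus: number): string | null {
  return apiStatus === 404 ? NOT_FOUND_URL : null;
}
```

In the browser, a non-null result would be passed to something like `window.location.replace(target)`, so the 404 comes from the server rather than from a 200 page that only looks like an error.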

Make a Sitemap
Search engines like Google read this file to more intelligently crawl your site. A
sitemap tells the crawler which files you think are important in your site.
React Router Sitemap:
https://www.npmjs.com/package/react-router-sitemap
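To show what such a file contains, here is a minimal hand-rolled sketch that builds sitemap XML from a list of route paths. It does not use the react-router-sitemap API; `escapeXml` and `buildSitemap` are hypothetical helpers, and the origin and paths in the usage note are made up.

```typescript
// Escape characters that are not allowed as-is inside XML text.
function escapeXml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Build a sitemap document with one <url> entry per path.
function buildSitemap(origin: string, paths: string[]): string {
  const entries = paths
    .map((path) => `  <url><loc>${escapeXml(origin + path)}</loc></url>`)
    .join("\n");
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    entries,
    "</urlset>",
    "",
  ].join("\n");
}
```

For example, `buildSitemap("https://example.com", ["/", "/products"])` produces a file you could serve at `/sitemap.xml` and submit in Google Search Console.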

As an SEO
https://www.npmjs.com/package/react-helmet

Other things to know about
Googlebot mostly crawls from the West Coast US (Mountain View), with some crawling internationally.
Google is very aggressive about caching everything (you may want to use file versioning). This can lead to some impossible states being indexed if parts of old files are cached.
Google downloads pages and resources, but all of this is stored and then run as fast as possible as “rendering”.

Other things to know about
Googlebot renders with a long viewport. Mobile screen size is
431 X 731 and Google resizes to 12,140 pixels high, while the
desktop version is 768 X 1024 but Google only resizes to 9,307
pixels high.
Credit to JR Oakes @jroakes https://codeseo.io/console-log-hacking-for-googlebot/

Resources
SEO Mythbusting Series + JS SEO Series
https://www.youtube.com/channel/UCWf2ZlNsCGDS89VBF_awNvA
JS SEO Basics: https://developers.google.com/search/docs/guides/javascript-seo-basics
Dynamic Rendering: https://developers.google.com/search/docs/guides/dynamic-rendering
Mobile Friendly Test: https://search.google.com/test/mobile-friendly
Google Search Console: https://search.google.com/search-console
JavaScript Working Group: https://groups.google.com/forum/#!forum/js-sites-wg
