7. Max Prin @maxxeight | MnSearch Summit 2017 #MNSummit7
Search Engines’ Goal
Serving the best results:
Most relevant
Great UX
How Do (Some) Search Engines Work?
1. Crawl the Web
2. (Render Pages)
3. Index URLs/Content
4. Rank URLs
Technical SEO’s Goal
Making sure search engines can access and understand your awesome content, and the wonderful user experience you provide to your visitors.
Crawling: can search engines access the pages?
Rendering: can search engines see the content/UX?
Indexing: are URLs/content indexed?
Crawling
“Crawling is the entry point for sites into Google's search results. Efficient crawling of a website helps with its indexing in Google Search.”
- Gary Illyes, Webmaster Trends Analyst, Google
Crawling
How to make sure search engines crawl all of your important pages?
Provide clean URLs
Crawling: provide clean URLs (JavaScript-based websites)
Fragment identifier: example.com/#url
Not supported; ignored. URL = example.com
Hashbang: example.com/#!url
Google and Bing will request example.com/?_escaped_fragment_=url; that _escaped_fragment_ URL should return an HTML snapshot.
Clean URL: example.com/url
Using the pushState function of the HTML5 History API.
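In code, the clean-URL approach looks roughly like this. A minimal sketch, assuming a client-side router; the route paths and the renderView() call are hypothetical placeholders, while history.pushState is the real History API method:

```javascript
// Map a legacy hash/hashbang URL to its clean equivalent,
// e.g. "#!/products/shoes" -> "/products/shoes"
function cleanUrlFromHash(hash) {
  return hash.replace(/^#!?\/?/, '/');
}

function navigate(path) {
  // In a browser, update the address bar without a full page reload:
  if (typeof history !== 'undefined' && history.pushState) {
    history.pushState({ path: path }, '', path);
  }
  // renderView(path); // hypothetical: render the matching content client-side
}
```

Each clean URL should also return its content with a 200 status when requested directly, so a crawler landing on it sees the same page a user would.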
Crawling
How to make sure search engines crawl all of your important pages?
Provide clean URLs
Lead bots to valuable pages
Crawling: leading bots to valuable pages
Clear/clean site navigation
Internal linking is a powerful signal:
Use <a href> elements
Avoid oversized “mega” menus
Accurate/up-to-date sitemaps
Sitemaps (XML or HTML) should only include URLs that:
Return a 200 OK status code
Have a self-referencing canonical tag (or no tag)
Properly handle duplicate content and low-value pages
Parameterized URLs, sorting and faceted navigation, etc.
Use canonical/noindex tags
Non-indexable URLs are crawled less often
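In practice, handling a duplicate or low-value page comes down to one of two tags in its <head>. A hedged illustration; the URLs are hypothetical:

```html
<!-- Preferred for duplicates: point parameterized/faceted variants
     at the canonical URL, e.g. on example.com/shoes?sort=price -->
<link rel="canonical" href="https://example.com/shoes">

<!-- Or, for low-value pages: keep them out of the index entirely -->
<meta name="robots" content="noindex">
```

Pick one per page: a canonical tag consolidates signals to the preferred URL, while noindex simply removes the page from search results.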
Crawling: recommendations and tools
Analyze your log files and crawl your site to find out if:
Search engine bots are not crawling some URLs
Pages are not properly linked to internally
Sitemaps contain non-canonical URLs
Low-value and duplicate pages are indexable
Crawling
How to make sure search engines crawl all of your important pages?
Provide clean URLs
Lead bots to valuable pages
Make bots crawl more pages
Crawling: making bots crawl more pages
Decreased load time = increased crawl rate
Fast and reliable server (no 5xx errors)
Content Delivery Network
Rendering = requesting ALL resources
Reduce the size of resources (compression, minification, etc.)
Reduce the number of requests (redirects, icons, fonts, etc.)
(Fast site: better user experience)
Optimize the critical rendering path (perceived latency)
Make the content above the fold appear faster
Use HTTP/2
Multiplexing, binary headers, header compression, server push
Better engagement metrics: lower bounce rate, higher time on site (short vs. long clicks)
https://raventools.com/blog/free-ssl-http2/
Rendering
Google is leveraging a headless browser to fully render webpages:
Executing JavaScript/CSS
“Understanding web pages better”
Source code → DOM
Rendering: mobile-friendliness
(Mobile-First Index)
Someday, Google will primarily crawl the web with its mobile user agent.
Make sure ALL of your valuable content is available on your mobile site.
http://maxpr.in/merkle-mobile-first
Rendering: intrusive interstitials
As of January 10, 2017, pages with intrusive interstitials may not rank as high.
Rendering: Progressive Web Apps (PWA)
Rendering
How to make sure Google can “understand” your pages?
Don’t block resources
Rendering: don’t block resources (robots.txt)
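A common mistake on JavaScript-based sites is disallowing the script and CSS directories Googlebot needs to render the page. A robots.txt sketch; the directory paths are hypothetical:

```
# Bad (commented out here): blocking resources breaks rendering
# User-agent: *
# Disallow: /assets/js/
# Disallow: /assets/css/

# Better: keep crawlers out of low-value areas only,
# leaving all rendering resources fetchable
User-agent: *
Disallow: /search/
```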
Rendering: robots.txt testing tool
Rendering
How to make sure Google can “understand” your pages?
Don’t block resources
Load content automatically
Rendering: load content automatically (vs. based on user interaction)
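Googlebot renders the page but does not click, mouse over, or scroll, so only content fetched on page load reaches the rendered DOM. A minimal sketch; the loadReviews() handler and button element are hypothetical:

```javascript
// Bad: the crawler never fires 'click', so the reviews never enter the DOM
// button.addEventListener('click', loadReviews);

// Good: fetch automatically, as soon as the page loads
// document.addEventListener('DOMContentLoaded', loadReviews);

// The rule of thumb: content is only visible to a crawler
// when its load trigger fires without any user input.
function visibleToCrawler(loadTrigger) {
  return loadTrigger === 'pageload';
}
```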
Rendering: fetch & render tool
Indexing
Part of the indexing process is to “annotate semantics” in order to retrieve relevant pages.
Indexing
How to, technically, make your content more relevant?
Optimize metadata
Leverage structured data markup
Indexing: leverage structured data markup
Structured data markup has 2 components:
Vocabulary: schema.org
Format:
• Microdata
• JSON-LD
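For example, a Product with its rating can be marked up in JSON-LD as a single script block; the values below are hypothetical:

```html
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Product",
  "name": "Example Running Shoe",
  "aggregateRating": {
    "@type": "AggregateRating",
    "ratingValue": "4.4",
    "reviewCount": "89"
  }
}
</script>
```

With Microdata, the same data would instead be spread across itemscope/itemprop attributes on the HTML elements themselves.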
Indexing: leverage structured data markup
Indexing: leverage structured data markup
Google supports many markup types that can enhance search results:
- Product, reviews, ratings
- Events, music, movies, recipes, etc.
Rich snippet = higher click-through rate (CTR)
Wrap-up
Technical optimization can greatly
improve overall online performance
Crawling
More valuable pages indexed
Unique content → more authority
Fast site → better engagement metrics
Rendering
A great UX (mobile, PWA) understood by search engines
Indexing
More relevance and rich snippets (better CTR) through structured data markup
Search engines’ work can be boiled down to three core functions: Crawl, Index and Rank. As Google is able to execute JavaScript and fully render webpages, an additional step needs to be considered in the process: rendering, in between crawling and indexing.
While technical SEO is only one aspect of SEO, overall online performance can be greatly improved through technical optimization of a website.
In this session, we’ll go over technical recommendations that help search engines work efficiently: optimizing crawling, rendering and indexing to eventually improve rankings.
Search engines’ mission:
Serving the best results to their users (based on their intent)
Relevance (content) + Popularity (links)
UX (mobile-friendly, fast)
Your goal when doing technical SEO work is to make sure search engines can access and index your awesome content, and understand the great UX you provide to your visitors.
Why is “technical SEO more important than ever”? Because while search engines get smarter and smarter, they’re not moving as fast as web development technologies, and simply not human (machine learning: great, but not the answer to everything).
Sometimes, the quality of the content or the UX is not the cause of poor rankings. It’s their lack of visibility. The inability for search engines to understand that they’re there and they’re great.
https://webmasters.googleblog.com/2017/01/what-crawl-budget-means-for-googlebot.html
- Small websites: not something to really worry about, but…
Fragment identifier: this URL structure is already a concept in the web and relates to deep linking into content on a particular page (“jump links”).
Can’t be accessed/crawled/indexed.
Hashbang: Used with the “old” AJAX crawling scheme. Not recommended, more complex to implement.
Clean URL using History API’s pushState function.
Must return a 200 status code when loaded directly
Ranking positive outcome:
Unique content -> more authority
Googlebot is leveraging a headless browser (most likely a version of Chromium) to fully render webpages.
“Understanding web pages better”: https://webmasters.googleblog.com/2014/05/understanding-web-pages-better.html
Indexing dynamic content
Understanding UX: mobile friendliness, intrusive/content-blocking interstitials
https://webmasters.googleblog.com/2016/11/building-indexable-progressive-web-apps.html
Usually, PWAs are JavaScript-based websites
Server-side vs. client-side rendering
Googlebot is able to execute JavaScript/CSS to crawl and index dynamic content as well as “understand web pages better”: mobile-friendliness (mobile-first index coming), intrusive interstitials, PWAs. As advanced as the search engine is, there are a few things to remember and implement.
Googlebot is able to execute JavaScript to crawl and index dynamic content. As advanced as the search engine is, there are a few things to remember and implement.
Mega menu – mouseover + AJAX
Tabs/accordions – click + AJAX
Load more/infinite scroll – click/scroll + AJAX
https://technicalseo.com/seo-tools/fetch-render/
- DOM Snapshot – dynamic “hidden” content
Paul Haahr -> analyze crawled pages: extract links, render contents, annotate semantics…
“Indexing” is not only for search engines to add URLs/pages to their indices. While scoring happens right before ranking, search engines index and flag pages based on their content, and more specifically, based on what they understand about the content => understanding what the content is about is the key to relevance.
Leveraging structured data markup, where applicable, in order to create context and improve relevance
Microdata: introduced with HTML5. Attributes and values directly integrated into HTML elements -> can quickly become complicated to implement, especially when the data is not grouped together in the code.
JSON-LD (JavaScript Object Notation for Linked Data): much easier to implement (1 block of script)
Bing doesn’t officially support JSON-LD yet
It’s easy to forget the golden rule of SDM: do not mark up non-visible data.