Advanced Technical SEO - Index Bloat & Discovery: from Facets to Javascript Frameworks - SMX Munich 2016
Ari Nahmani covers the latest in advanced technical SEO at SMX Munich (Muenchen) 2016. Discussions of the deprecated HTML snapshot, Javascript crawlability and indexing, new frameworks, prerendering, server side rendering, prerender.io, isomorphic javascript, and other technical issues related to the future of protecting your index health.
9.
Today’s Session
• Technical SEO issues around e-commerce / large site architecture
• Preventing index bloat & preserving crawl budget as a core methodology
• Current solutions & upcoming threats (JS, AJAX, new frameworks, pre-rendering)
18.
Index Bloat Prevention: Sorts & Filters
<link rel="canonical" href="http://www.site.com/guys/tees/" />
• Basic Solution: Strip out the unnecessary parameters
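A minimal sketch of the basic solution, assuming the canonical target is simply the requested URL with its whole query string and fragment removed. The helper names canonicalWithoutParams and canonicalTag are hypothetical and not taken from the deck.

```javascript
// Hypothetical helper: build the canonical URL by dropping every
// query parameter (sort, filter, tracking) and any #fragment.
function canonicalWithoutParams(rawUrl) {
  const url = new URL(rawUrl);
  url.search = "";
  url.hash = "";
  return url.toString();
}

// Render the tag that would go in the <head> of every sorted/filtered variant.
function canonicalTag(rawUrl) {
  return `<link rel="canonical" href="${canonicalWithoutParams(rawUrl)}" />`;
}

console.log(canonicalTag("http://www.site.com/guys/tees/?sort=price&color=green"));
// prints: <link rel="canonical" href="http://www.site.com/guys/tees/" />
```

Every sorted or filtered variant then points at the same clean category URL.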
19.
Solution: Filtering Out All Facet Params
• PROS:
– Avoids diluted / duplicate URLs (the canonical is a request, not a directive)
• CONS:
– If you want or need specific parameters (size, color) indexed and exposed, you need properly coded canonical tag logic, which can easily become a recipe for major leaks and confusion.
– Considerations with pagination & view-all pages
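One way the "properly coded canonical tag logic" could look, as a sketch only: keep an allowlist of indexable facet parameters, drop everything else, and emit the kept parameters in a fixed order. The allowlist contents and the function name canonicalWithFacets are assumptions for illustration.

```javascript
// Assumption: only these facet parameters should stay indexable.
const INDEXABLE_PARAMS = ["color", "size"];

// Keep allowlisted parameters, drop the rest, and emit them in a fixed
// (alphabetical) order so one facet combination yields one canonical URL.
function canonicalWithFacets(rawUrl, allowed = INDEXABLE_PARAMS) {
  const url = new URL(rawUrl);
  const kept = new URLSearchParams();
  for (const name of [...allowed].sort()) {
    const value = url.searchParams.get(name);
    if (value !== null && value !== "") kept.set(name, value);
  }
  const query = kept.toString();
  return url.origin + url.pathname + (query ? "?" + query : "");
}

console.log(canonicalWithFacets("http://www.site.com/guys/tees/?size=large&sort=price&color=green"));
// prints: http://www.site.com/guys/tees/?color=green&size=large
```

The fixed ordering matters because ?size=large&color=green and ?color=green&size=large would otherwise count as two different URLs.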
24.
Index Bloat Prevention: JS + AJAX
AJAX Refinement V1 =
NO URL CHANGE
25.
Index Bloat Prevention: JS + AJAX
AJAX Refinement V1 = NO URL CHANGE, but an inactive, different href= URL exists
26.
AJAX Facet Refinements V1 (NO URL CHANGE)
• PROS:
– Theoretically no parameters exposed to bloat the index
• CONS:
– Users can't share refined / filtered content with friends, and can't bookmark it accurately. (Terrible UX)
– Googlebot will still crawl hidden href="" links or other JS framework links such as Angular's ng-href="" (check canonical logic!!)
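To see which URLs a page still exposes even though the visible URL never changes, a quick audit could list every href and ng-href value in the markup. This is a rough sketch under the assumption that a regular expression is good enough for a spot check; a real audit would parse the DOM. The function name findLinkTargets and the sample markup are hypothetical.

```javascript
// Collect every URL exposed through href="..." or ng-href="..." attributes.
// Note: this simple pattern also matches attributes that end in "-href".
function findLinkTargets(html) {
  const pattern = /\b(ng-href|href)\s*=\s*(["'])(.*?)\2/gi;
  const targets = [];
  for (const match of html.matchAll(pattern)) {
    targets.push({ attribute: match[1].toLowerCase(), url: match[3] });
  }
  return targets;
}

const sampleHtml = `
  <a href="/guys/tees/?color=green">Green</a>
  <a ng-href="/guys/tees/?size=large" ng-click="refine()">Large</a>
  <a href="/guys/tees/">All tees</a>`;

// Parameterized targets are the ones whose canonical logic needs checking.
const parameterized = findLinkTargets(sampleHtml).filter((t) => t.url.includes("?"));
console.log(parameterized);
```

Each parameterized target found this way is a URL Googlebot may request, so each one should resolve to a page with the intended canonical.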
27.
Index Bloat Prevention: JS + AJAX
AJAX Refinement V2 = HTML5 history.pushState()
28.
Index Bloat Prevention: JS + AJAX
HTML5 history.pushState()
http://www.site.com/guys/tees/?color=green&size=large
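A sketch of how a facet refinement might update the address bar with history.pushState(). The function names facetUrl and recordRefinement and the shape of the facets object are assumptions; the pushState call is guarded so the snippet also runs outside a browser.

```javascript
// Build the shareable URL for the current facet selection.
function facetUrl(baseUrl, facets) {
  const url = new URL(baseUrl);
  for (const [name, value] of Object.entries(facets)) {
    url.searchParams.set(name, value);
  }
  return url.toString();
}

// After the AJAX refinement has updated the listing, record the new URL.
// history.pushState exists only in browsers and requires a same-origin URL.
function recordRefinement(baseUrl, facets) {
  const target = facetUrl(baseUrl, facets);
  if (typeof history !== "undefined" && typeof history.pushState === "function") {
    history.pushState({ facets }, "", target);
  }
  return target;
}

console.log(recordRefinement("http://www.site.com/guys/tees/", { color: "green", size: "large" }));
// prints: http://www.site.com/guys/tees/?color=green&size=large
```

In a browser, each call adds a history entry, so the back button steps through earlier refinements.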
31.
Index Bloat Prevention: JS + AJAX
Google preferred the pushState URL version; we had to reinforce (via normal inline href="", canonical, XML sitemap)
32.
AJAX Facet Refinements V2 (PushState URL Change)
• PROS:
– Users can now share / bookmark the correct content
– Added to browser history
• CONS:
– Still need a consistent canonical structure, because Googlebot crawls pushState() URLs
– A different hidden URL structure via AJAX facets may require further unpredictable canonicalization logic / further dev work
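A consistency check along these lines could confirm that the inline href, canonical, and XML sitemap entries all agree with the pushState URL. This is a sketch with hypothetical names (normalizeForCompare, conflictingSignals), and it assumes parameter order and fragments are the only differences that should be ignored.

```javascript
// Normalize a URL so that parameter order and fragments do not affect comparison.
function normalizeForCompare(rawUrl) {
  const url = new URL(rawUrl);
  url.searchParams.sort();
  url.hash = "";
  return url.toString();
}

// Return the names of the signals whose URL differs from the pushState URL.
function conflictingSignals(pushStateUrl, signals) {
  const expected = normalizeForCompare(pushStateUrl);
  return Object.entries(signals)
    .filter(([, url]) => normalizeForCompare(url) !== expected)
    .map(([name]) => name);
}

console.log(conflictingSignals("http://www.site.com/guys/tees/?color=green&size=large", {
  inlineHref: "http://www.site.com/guys/tees/?size=large&color=green",
  canonical: "http://www.site.com/guys/tees/?color=green&size=large",
  xmlSitemap: "http://www.site.com/guys/tees/?colour=green&size=large",
}));
// prints: [ 'xmlSitemap' ]
```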
40.
Indexing AJAX & JS: How To Decide?
[Decision diagram; options shown:]
• HTML snapshot (_escaped_fragment_=)
• 'Dumbed down' HTML template
• Pre-rendered (to bots)
• 3rd-party service (prerender.io)
• Server side (PhantomJS / headless browser)
• Pre- or realtime-rendered (to users & bots)
• Trust Googlebot (VALIDATE!)
• Progressive enhancement
42.
Indexing AJAX & JS: HTML Snapshot
• Upon crawl of a URL with _escaped_fragment_=, serve a 'dumbed down' HTML version of the page.
• Not pre-rendered, rather simplified.
• For example, on e-commerce → a view-all category listing with no dynamic facets.
Amazing results from our clients.
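A sketch of the routing decision behind the HTML snapshot, based on the (deprecated) AJAX crawling scheme in which a crawler requests "#!state" URLs as "?_escaped_fragment_=state". The function names and the template names "view-all-static" and "full-js-app" are hypothetical.

```javascript
// Inspect a requested URL and map it back to the URL users would see.
function parseSnapshotRequest(rawUrl) {
  const url = new URL(rawUrl);
  if (!url.searchParams.has("_escaped_fragment_")) {
    return { wantsSnapshot: false, prettyUrl: url.toString() };
  }
  const fragment = url.searchParams.get("_escaped_fragment_");
  url.searchParams.delete("_escaped_fragment_");
  url.hash = fragment ? "#!" + fragment : "";
  return { wantsSnapshot: true, prettyUrl: url.toString() };
}

// Hypothetical router: pick the simplified template for snapshot requests.
function chooseTemplate(rawUrl) {
  return parseSnapshotRequest(rawUrl).wantsSnapshot ? "view-all-static" : "full-js-app";
}

console.log(chooseTemplate("http://www.site.com/guys/tees/?_escaped_fragment_="));
// prints: view-all-static
console.log(chooseTemplate("http://www.site.com/guys/tees/"));
// prints: full-js-app
```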
44.
Indexing AJAX & JS: Pre-rendering
Upon crawl of a URL with _escaped_fragment_=, either:
1. prerender.io – middleware via reverse proxy that serves a pre-rendered, cached HTML page to bots
OR
2. Server side – the server pre-renders the JS into cached HTML pages to serve to bots, or renders it in real time (headless browser).
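The cached-or-realtime flow described above could be sketched as follows. This is not prerender.io's actual API: renderPage is a hypothetical stand-in for a headless-browser render or a call to a third-party service, and the in-memory Map and the maxAgeMs default are assumptions.

```javascript
// Sketch of "serve cached pre-rendered HTML to bots, render on a miss".
// renderPage(url) is injected by the caller and returns an HTML string.
function createPrerenderHandler(renderPage, maxAgeMs = 24 * 60 * 60 * 1000) {
  const cache = new Map(); // url -> { html, storedAt }

  return function handle(rawUrl, now = Date.now()) {
    const url = new URL(rawUrl);
    if (!url.searchParams.has("_escaped_fragment_")) {
      return { source: "app", html: null }; // regular visitors get the JS app
    }
    url.searchParams.delete("_escaped_fragment_");
    const key = url.toString();
    const hit = cache.get(key);
    if (hit && now - hit.storedAt < maxAgeMs) {
      return { source: "cache", html: hit.html };
    }
    const html = renderPage(key); // real-time render on a miss or stale entry
    cache.set(key, { html, storedAt: now });
    return { source: "rendered", html };
  };
}

const handle = createPrerenderHandler((url) => `<html><!-- rendered ${url} --></html>`);
console.log(handle("http://www.site.com/guys/tees/?_escaped_fragment_=").source); // prints: rendered
console.log(handle("http://www.site.com/guys/tees/?_escaped_fragment_=").source); // prints: cache
```

The first bot request pays the rendering cost; later requests within the cache lifetime are served from memory.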
50.
Indexing AJAX & JS: Server Side
bit.ly/javascriptseo
54.
Indexing AJAX & JS: Trust Googlebot
read these first…
65.
Summing It Up
• Index Bloat, Crawl Budget, & Testing: Large sites are prone to serious index bloat and wasted crawl budget. Preventing both needs diligent testing and an OCD-like attention to detail with the basics. Test often & automate!
• JS/AJAX: pushState(), JS frameworks, and AJAX present both discovery and bloat challenges. Know the options: short-term fixes like the HTML snapshot (G+B), and long-term redesigns with modern frameworks with built-in server-side rendering.
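As one illustration of "test often & automate", a scheduled check could flag pages whose canonical is missing or still carries unexpected parameters. The function name auditCanonicals and the input shape (an array of { url, canonical } records, for example from a crawl export) are assumptions.

```javascript
// Flag pages whose canonical is missing or keeps parameters outside an allowlist.
function auditCanonicals(pages, allowedParams = []) {
  const issues = [];
  for (const { url, canonical } of pages) {
    if (!canonical) {
      issues.push({ url, problem: "missing canonical" });
      continue;
    }
    const extra = [...new URL(canonical).searchParams.keys()]
      .filter((name) => !allowedParams.includes(name));
    if (extra.length > 0) {
      issues.push({ url, problem: "unexpected params: " + extra.join(", ") });
    }
  }
  return issues;
}

console.log(auditCanonicals([
  { url: "http://www.site.com/guys/tees/?sort=price", canonical: "http://www.site.com/guys/tees/" },
  { url: "http://www.site.com/guys/tees/?color=green&sort=price", canonical: "http://www.site.com/guys/tees/?color=green&sort=price" },
  { url: "http://www.site.com/guys/hats/", canonical: null },
], ["color", "size"]));
```

Here the second page is flagged for the leftover sort parameter and the third for a missing canonical.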
66.
Dankeschön!
Questions?
Ari Nahmani
CEO / Founder
Kahena Digital Marketing
ari@kahenadigital.com
@AriNahmani
67.
References:
• Can You Now Trust Google To Crawl Ajax Sites?
• Search Engine Optimization Best Practices for AJAX URLs | Webmaster Blog
• We Tested How Googlebot Crawls Javascript And Here's What We Learned
• Prerender - AngularJS SEO, BackboneJS SEO, or EmberJS SEO
• SMX Munich Advanced Technical SEO Brainstorm - Google Docs
• www.simoahava.com/seo/dynamically-added-meta-data-indexed-google-crawlers/
• Speakers | Search Marketing Expo – SMX Munich
• JavaScript + SEO: Better Together — Medium
• SEO AJAX Crawlability in a Responsive Publisher World
• SEO Strategies for JavaScript-Heavy Single Page Applications or AJAX Sites | Search Engine Watch
• The Basics of JavaScript Framework SEO in AngularJS - Builtvisible
• Can Search Engines Crawl Javascript?
• https://www.w3.org/wiki/Graceful_degradation_versus_progressive_enhancement#Graceful_degradation_and_progressive_enhancement_in_a_nutshell
• SEO and JS: New Challenges
• BromBone | SEO for your AngularJS, EmberJS, or BackboneJS website.
• DIY AngularJS SEO with PhantomJS (the easy way!) | Lawsonry
• https://scotch.io/tutorials/angularjs-seo-with-prerender-io