
The importance of JavaScript crawling: why, how, and with what benefits?

Slides from the talk 'The importance of JavaScript crawling: why, how, and with what benefits?' given at SEO Camp'us 2018

Published in: Internet

  1. #seocamp Demystifying JavaScript & SEO
  2. Demystifying JavaScript & SEO: shedding light on common misconceptions in the JavaScript for SEO space. Robin Eisenberg, Engineering Manager @botify.com - Full-Stack Developer, Architect - Web and Mobile solutions - Currently Manager of the Botify Engineering Team. Dimitri Brunel, Search Data Strategist @botify.com - Previously SEO manager in marketing agencies & pure players - Currently part of the Botify Search Data Strategist team
  3. Conference Agenda ▪ What is JavaScript? Why is it used? ▪ Why is JavaScript important for SEO? ▪ How does JavaScript affect the website? ▪ What are the takeaways?
  4. What is JavaScript? Why is there a need for JavaScript in the first place? HTML is the browser’s display language: ● Tells the browser what to display ● Tree of nodes, described by tags ● Nodes can contain: ○ Text ○ Metadata ○ Links ○ (Assets) - not relevant ○ (Style) - not relevant. JavaScript is the browser’s behaviour language: ● Uses HTML to tell the browser what to display ● Allows a developer to dictate (‘script’) a browser’s behaviour after receiving a web page ● Can make extra requests ● Can react to user interaction (clicks, mouse movement, keyboard, etc.) ● Can store information on the user’s computer ● Can modify the HTML
  5. JavaScript as seen by Browsers. How do browsers handle JavaScript? 1) Browser asks Server what it should do 2) Server responds with HTML + JavaScript (“display this, and do that”) 3) Browser reads and displays HTML 4) Browser executes the JavaScript • Multiple browser/server communications • Browser modifies what it displays through JavaScript • User interaction can be handled by the browser, without server communication
  6. JS Rendering #1: product shelves. How does JavaScript fetch and render affect this page?
  7. Result #1: product tiles now rendered with JS. How does JavaScript fetch and render affect this page? Product tiles now rendered: 1. In-page contents are now accessible 2. New internal outlinks are now crawlable
  8. JS Rendering #2: product page contents. How does JavaScript fetch and render affect this page?
  9. Result #2: contents are now inserted with JS. How does JavaScript fetch and render affect this page? Contents are now rendered: 1. FULL in-page contents are now accessible 2. A few new internal outlinks are now crawlable
  10. History of JavaScript. How has JavaScript’s use on the Web evolved? Way back in time: ■ First we used JS just for tracking (such as web analytics) ■ Then we used JS for dynamic content, and now we use JS for all content. Back in time for SEOs, we all had a long and hard relationship with JS and Google: ■ For some SEOs, JS meant little more than .onclick handlers, whereas asynchronous JS is now everywhere ■ We also knew the AJAX crawling scheme (#! URLs and _escaped_fragment_) before Google deprecated it ■ We also knew the HTML5 History API and its .pushState() method, which Googlebot handles. Back to mid-2016: ● We now know that legacy JavaScript vs. modern JavaScript makes a difference ● We now know that JavaScript language features make a huge difference for Googlebot. Now: ● The JavaScript language (ECMAScript) is more standardized and less fragmented ● Google communicates more on what works and what doesn’t
  11. Why is JavaScript inevitable? Why is the web moving towards more and more JavaScript on websites? ● JavaScript allows for client-side interactions ● JavaScript allows for client-side processing ● JavaScript offloads processing costs to clients ● JavaScript allows websites to access native browser features (geolocation, notifications, media, etc.) ● POs/PMs prefer JavaScript because it delivers the most interactive product ● Developers prefer JavaScript because it is much easier to use than HTML for complex websites ● Infrastructure prefers JavaScript because it offloads processing to the clients’ computers ● Finance prefers JavaScript because it is cheaper
  12. How is JavaScript used on the Web? JavaScript in the wild. “Some JavaScript” websites: ■ Serve a pre-filled HTML page ■ Augment the page using JavaScript ■ Examples: ○ Comments ○ Personalized content ○ Recommendation systems ○ Navigation such as faceted navigation ○ Filtering such as asc./desc. sorting. “Full JavaScript” websites: ● Smart blacklisting ● Serve an empty HTML page ● Solely use JavaScript to insert HTML elements ● Examples: ○ Single-Page Apps ○ PWAs ○ AngularJS, ReactJS, Ember, VueJS, Polymer
  13. Is JavaScript a problem for my SEO? “My dev team wants to use React, what do I do?” Absolutely not. ● Bots currently execute JavaScript on pages ● JavaScript can be written to make sure it is crawlable ● Most JavaScript non-compliance can be fixed automatically! If your development team still produces non-compliant JavaScript: ● Explain that they can fix non-compliance automatically ● The major JavaScript frameworks offer a server-side rendering (SSR) fallback ● If all else fails, pre-rendering can be used (but it is expensive)
  14. Frameworks: crawling & indexing experiments. For any JS framework, run experiments to understand what Googlebot is able to fetch > render > index. SEO managers are always worried about JS frameworks (= seo-pocalypse). This problem is easily solved with good JS code hygiene. A JS checklist could be: ■ JavaScript for SEO compliance ⇔ JS is crawl-compliant ■ Fetch and Render via GSC ⇔ does rendering look OK or KO? ■ A test URL should be indexed ⇔ (site:) ■ A test CONTENT should be indexed ⇔ (site:) “search your unique sentence” ■ A test LINK should be found ⇔ (search for your secret child inlink) ■ A */react/test/* URL should be crawled ⇔ (dig into the LOG files!) Is it enough? No. You need to check standard SEO indicators and make sure.
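The last checklist item (digging into the log files) could be sketched as follows. This is a minimal illustration, assuming access logs in the common "combined" format; the sample lines, IP addresses and the `googlebotHits` helper are hypothetical. Matching on the user-agent string alone is only a first filter, since that string can be spoofed.

```javascript
// Count log lines where a client announcing itself as Googlebot
// requested a URL containing the given path fragment.
function googlebotHits(logLines, pathFragment) {
  return logLines.filter(function (line) {
    // Extract the requested path from the quoted request, e.g. "GET /x HTTP/1.1".
    var request = /"(?:GET|HEAD) (\S+) HTTP\/[\d.]+"/.exec(line);
    return request !== null &&
      request[1].indexOf(pathFragment) !== -1 &&
      /Googlebot/i.test(line);
  }).length;
}

// Hypothetical sample lines.
var sampleLog = [
  '66.249.66.1 - - [10/Mar/2018:10:00:00 +0000] "GET /react/test/page-1 HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
  '203.0.113.7 - - [10/Mar/2018:10:00:05 +0000] "GET /react/test/page-1 HTTP/1.1" 200 512 "-" "Mozilla/5.0 (Windows NT 10.0)"',
  '66.249.66.1 - - [10/Mar/2018:10:00:09 +0000] "GET /about HTTP/1.1" 200 300 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"'
];

console.log(googlebotHits(sampleLog, '/react/test/'));
```

A count of zero for the test path after a reasonable delay would suggest the test URLs were never discovered or fetched.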
  15. What is JavaScript crawling? How do bots see your JavaScript website?
  16. Crawling: JavaScript as seen by Crawlers. How do crawlers handle JavaScript? 1) Crawler asks Server what it should do (pretending it’s a browser) 2) Server responds with HTML + JavaScript (“display this, and do that”) 3) Crawler launches Browser, feeds it the HTML and JavaScript 4) Browser executes the JavaScript 5) JavaScript modifies the HTML 6) Browser sends modified HTML back to Crawler 7) Crawler reads and analyzes HTML 8) Crawler finds links 9) Crawler goes back to (1) for every link found. In steps (5) and (6), the JavaScript can make requests, set up user interaction triggers, and modify the page. This results in modified HTML that has been ‘augmented’ by the execution of the page’s JavaScript. The modified HTML is what is fed back to the crawler for analysis.
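The nine steps above can be sketched as a small simulation. This is a toy model under stated assumptions, not how any real crawler is implemented: the `site` object, its URLs and the `script` functions are hypothetical stand-ins for a server and for each page's JavaScript, and "rendering" is reduced to a function that rewrites the HTML string.

```javascript
// Hypothetical in-memory site: URL -> initial HTML plus an optional script.
var site = {
  '/': {
    html: '<a href="/about">About</a><div id="shelf"></div>',
    // Stand-in for the page's JavaScript: fills the empty product shelf.
    script: function (html) {
      return html.replace('<div id="shelf"></div>',
        '<div id="shelf"><a href="/product-1">Product 1</a></div>');
    }
  },
  '/about': { html: '<p>About us</p>', script: null },
  '/product-1': { html: '<p>Product 1</p>', script: null }
};

// Steps 1-2: ask the "server" for a page.
function fetchPage(url) {
  return site.hasOwnProperty(url) ? site[url] : null;
}

// Steps 3-6: execute the script and hand back the modified HTML.
function render(page, executeJs) {
  return executeJs && page.script ? page.script(page.html) : page.html;
}

// Steps 7-8: read the HTML and collect links.
function extractLinks(html) {
  var links = [];
  var re = /href="([^"]+)"/g;
  var match;
  while ((match = re.exec(html)) !== null) links.push(match[1]);
  return links;
}

// Step 9: repeat for every newly found link.
function crawl(startUrl, executeJs) {
  var queue = [startUrl];
  var seen = {};
  seen[startUrl] = true;
  var crawled = [];
  while (queue.length > 0) {
    var url = queue.shift();
    var page = fetchPage(url);
    if (!page) continue;
    crawled.push(url);
    extractLinks(render(page, executeJs)).forEach(function (link) {
      if (!seen[link]) { seen[link] = true; queue.push(link); }
    });
  }
  return crawled;
}

console.log(crawl('/', false)); // without JS execution
console.log(crawl('/', true));  // with JS execution
```

In this toy site, the product page is only discovered when the script runs, which mirrors the "new internal outlinks are now crawlable" results shown on the earlier slides.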
  17. Crawling: Key Takeaways ● Crawlers can and do crawl JavaScript (they execute JavaScript on pages) ● JavaScript crawling is a very expensive operation, so optimize your render budget: ○ optimize render time ○ optimize fetch time ○ use smart resource blacklists in robots.txt ■ Gotcha: some websites may still block their JS scripts in robots.txt (for historical reasons) ● The same SEO techniques apply: ○ Crawl budgets apply ○ Internal / external linking ○ Content & perceived quality ○ Page similarity & uniqueness studies apply
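To illustrate the robots.txt gotcha, here is a rough sketch that lists which script URLs a robots.txt would block. It is deliberately simplified and is an assumption-laden illustration, not a compliant parser: it only handles `User-agent` and `Disallow` with plain prefix matching, ignores `Allow` rules and wildcards, and merges the `*` group with agent-specific groups. The helper names and sample paths are hypothetical.

```javascript
// Collect Disallow prefixes from groups that apply to the given user agent.
function disallowedPrefixes(robotsTxt, userAgent) {
  var agent = userAgent.toLowerCase();
  var applies = false;     // does the current group apply to our agent?
  var inAgentRun = false;  // true while reading consecutive User-agent lines
  var prefixes = [];
  robotsTxt.split(/\r?\n/).forEach(function (raw) {
    var line = raw.replace(/#.*$/, '').trim();
    var idx = line.indexOf(':');
    if (idx === -1) return;
    var field = line.slice(0, idx).trim().toLowerCase();
    var value = line.slice(idx + 1).trim();
    if (field === 'user-agent') {
      if (!inAgentRun) applies = false; // a new group starts here
      inAgentRun = true;
      var name = value.toLowerCase();
      if (name === '*' || agent.indexOf(name) !== -1) applies = true;
    } else {
      inAgentRun = false;
      if (field === 'disallow' && applies && value !== '') prefixes.push(value);
    }
  });
  return prefixes;
}

// Return the resource paths that start with a disallowed prefix.
function blockedResources(robotsTxt, userAgent, resourcePaths) {
  var prefixes = disallowedPrefixes(robotsTxt, userAgent);
  return resourcePaths.filter(function (path) {
    return prefixes.some(function (prefix) { return path.indexOf(prefix) === 0; });
  });
}

// Hypothetical robots.txt that still blocks a scripts folder.
var sampleRobots = [
  'User-agent: *',
  'Disallow: /assets/js/',
  'Disallow: /admin/'
].join('\n');

console.log(blockedResources(sampleRobots, 'Googlebot',
  ['/assets/js/app.js', '/assets/css/site.css', '/static/vendor.js']));
```

Any script that shows up in this list cannot be fetched by a compliant bot, so the content or links it inserts would not be rendered.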
  18. JS Rendering: compare without JS / with JS. How does JavaScript fetch and render affect a set of pages? Impact on page inventory and linking: without JS vs. with JS
  19. JS Rendering: full JavaScript website. How does JavaScript fetch and render affect this page? JavaScript enabled vs. JavaScript disabled
  20. Your JavaScript (best practices). How to handle your site’s JavaScript SEO. What your dev team needs to know: ● Target Chromium 41 or earlier ● Transpile ES6 to ES5 ● Provide polyfills ● Optimize fetch time ● Optimize render time. If they can’t provide these points, turn to Server-Side Rendering (SSR). If they can’t provide SSR, turn to pre-rendering. The cost should come out of the tech budget. What your SEO tools need to do: ■ Smart blacklisting ■ Blacklist specific resources ■ Cache resources (for at-scale crawls) ■ Analyze fetch time (avg. number of resources) ■ Analyze render time (delay to render) ■ Detail resources (allowed, failed, from cache)
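As an illustration of "Transpile ES6 to ES5", here is a hand-written before/after pair. It only approximates what a transpiler such as Babel would emit (the real output depends on its configuration), and the `buildLabels` function and product data are hypothetical.

```javascript
// ES6 source, as a developer might write it (kept as a comment here):
//   const labels = products.map(p => `${p.name}: ${p.price} EUR`);

// Hand-written ES5 equivalent: no arrow function, no template literal, no const.
function buildLabels(products) {
  return products.map(function (p) {
    return p.name + ': ' + p.price + ' EUR';
  });
}

// Hypothetical sample data.
var labels = buildLabels([
  { name: 'Red shoes', price: 49 },
  { name: 'Blue shoes', price: 59 }
]);
console.log(labels);
```

Both versions produce the same result; the ES5 form simply avoids syntax that an older rendering engine may fail to parse.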
  21. How should you analyze your JavaScript website? How to level up your JavaScript SEO assessment?
  22. JS Rendering #1: faceted navigation. How does JavaScript fetch and render affect this page?
  23. Result #1: faceted nav links now rendered with JS. How does JavaScript fetch and render affect this page? Faceted navigation links now rendered: 1. New links available (faceted navigation) 2. New HTML blocks (recommendation system)
  24. JS Rendering #2: ongoing news publishing. How does JavaScript fetch and render affect this page?
  25. Result #2: infinite scroll now inserted with JS. How does JavaScript fetch and render affect this page? Lazy loading now rendered: ● infinite scrolling ● async article display. 1. Many new links 2. Much new content 3. That is, new page states (from 1 to n articles)
  26. Google cache vs. Google index. You may see larger or smaller differences… and this is not the right thing to assess! From these current takeaways: ▪ The contents can be cached, but not indexed ▪ (De)activating JS will display different results ⇔ not 100% reliable ▪ GSC Fetch and Render has no debug mode ⇔ it explains nothing. The remaining problems are: ▪ Google cache & Google index could be separate ⇔ unclear ▪ Commitments on crawling / indexing remain unclear ⇔ something is missing ▪ People’s approach is weak: they check the cache or the index rather than the JS code. In fact, a crucial clue was still missing to understand Googlebot.
  27. What you (SEO manager) have to tell your dev team: ▪ Do not trust JS frameworks ⇔ build your own tests and transpile your code ▪ Do not trust GSC ⇔ choose an SEO tool with a fetch & render feature ▪ Do not trust Google Cache ⇔ not 100% reliable for the moment. Your existing SEO analysis techniques still apply. Some must be modified to account for JavaScript: - Fetch and render time - Resource optimization - Page and resource sizes. So keep all of these guidelines in mind.
  28. Full-JS website: focus on all aspects of load time (initial HTML + JS rendering)
  29. Full-JS website: understanding the JS load time (initial HTML + JS rendering + number of executed resources)
  30. Compare without JS / with JS: understanding the impact on content, on internal linking, etc. Without JS: avg. number of words (with the templates), avg. number of words (excluding the templates). With JS: avg. number of words (with the templates), avg. number of words (excluding the templates). ▪ Focus on the important (top tail) pages: those that generate the most traffic on the website. ▪ Further analysis: does additional content enhance content uniqueness in pages? ▪ Further analysis: do additional links improve the crawl ratio on priority pages?
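The with/without JS comparison can be approximated on a single page as follows. This is a minimal sketch, assuming you already have both the raw HTML (as served) and the rendered HTML (after JavaScript execution) as strings; the helper names and sample markup are hypothetical, and the tag stripping is a crude regex rather than a real HTML parser.

```javascript
// Count words in the visible text, after crudely stripping tags.
function countWords(html) {
  var text = html.replace(/<[^>]*>/g, ' ').trim();
  return text === '' ? 0 : text.split(/\s+/).length;
}

// Collect href values from anchor-like markup.
function extractHrefs(html) {
  var hrefs = [];
  var re = /href="([^"]+)"/g;
  var match;
  while ((match = re.exec(html)) !== null) hrefs.push(match[1]);
  return hrefs;
}

// Compare the page as served with the page after JavaScript rendering.
function compareRenders(rawHtml, renderedHtml) {
  var rawLinks = extractHrefs(rawHtml);
  return {
    wordsWithoutJs: countWords(rawHtml),
    wordsWithJs: countWords(renderedHtml),
    linksOnlyWithJs: extractHrefs(renderedHtml).filter(function (link) {
      return rawLinks.indexOf(link) === -1;
    })
  };
}

// Hypothetical product page with a recommendation block filled in by JS.
var rawPage = '<h1>Red shoes</h1><div id="reco"></div>';
var renderedPage = '<h1>Red shoes</h1><div id="reco"><a href="/blue-shoes">Blue shoes</a></div>';

console.log(compareRenders(rawPage, renderedPage));
```

Averaging these figures over a set of pages gives the kind of "avg. number of words" and linking indicators described on the slide.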
  31. Conclusion: Key Takeaways ■ JavaScript assessment on websites has become mandatory - there is no ignoring it. ■ It is new territory for SEO managers. ■ You need the right tools to analyze the JavaScript indicators in your website. ● JavaScript is perfectly crawlable in the right circumstances. ● It is not an indicator of bad SEO. ● It does not change key SEO factors used during ranking. ■ JavaScript SEO analysis requires working closely with the website's technical team. ■ Most, if not all, compliance issues can be fixed easily and automatically. ■ It will soon be an important differentiator for rankings.
  32. Very warm thanks and people to follow. Take time to read their articles: ● Rendering on Google Search ⇔ https://developers.google.com/search/docs/guides/rendering ● Ilya Grigorik / Engineer @google - https://twitter.com/igrigorik Web performance engineer, co-chair of W3C Webperf WG ● Eric Bidelman / Engineer @google - https://twitter.com/ebidel Working on headless Chrome, Lighthouse, dev tools ● Sam Li / Engineer @googlechrome @polymer - https://twitter.com/samdotli ● Great polyfills ⇔ an already vetted set of polyfills that you can trust, such as those provided by ES5-Shim and ES6-Shim ● Great transpilers ⇔ Babel (formerly 6to5): transpiles ES6+ into ES5 // Traceur: transpiles ES6, ES7, and beyond into ES5 ● Can I Use… / @caniuse - https://twitter.com/caniuse Compatibility tables for features in HTML5, CSS3, SVG and other upcoming web technologies ● Bartosz Goralewicz / Head of SEO @ Elephate - https://twitter.com/bart_goralewicz ● Tomek Rudzki / R&D Specialist @ Elephate - https://twitter.com/TomekRudzki
  33. Thank you for your attention. Get in touch! hello@botify.com
  34. Question Mug ● Technical questions? ● SEO questions? ● Google questions?
  35. Addendum: Special note for your JS developers. Script to detect the HTML/CSS/JS features used by a site and cross-reference them with the features supported by the Google Search bot: https://twitter.com/ebidel/status/973306463081738240
  36. Addendum: Special note from Ilya Grigorik. Once you understand what ECMAScript means, everything becomes clear step by step when you talk with your dev team! Supported by Chrome 41: features are functions that regular browsers can use. If an ES5 feature works in Chrome 41, it should also work for Googlebot. CanIUse.com shows which features are fully supported by Chrome version 41 (Googlebot). Unsupported by Chrome 41: Chrome 41 has a debug mode: 1/ Click on INSPECT 2/ Click on CONSOLE 3/ If you see JS errors, go and see your front-end JS developer. Extra browser fallback: a polyfill is a browser fallback, made in JavaScript, that allows functionality you expect to work in modern browsers to work in older browsers, e.g., to support canvas (an HTML5 feature) in older browsers.
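A polyfill in the sense described above might look like the sketch below. It uses `Array.prototype.includes` as the example, on the assumption (to our knowledge, worth confirming on caniuse.com) that Chrome 41 does not ship it. The `includesFallback` name is hypothetical, and the fallback is simplified: it ignores the optional `fromIndex` argument, so a vetted polyfill such as those mentioned in the thanks slide is preferable in production.

```javascript
// Simplified fallback: returns true if the array contains the element.
// Like the native method, it treats NaN as equal to NaN.
function includesFallback(searchElement) {
  for (var i = 0; i < this.length; i++) {
    var item = this[i];
    if (item === searchElement ||
        (item !== item && searchElement !== searchElement)) {
      return true;
    }
  }
  return false;
}

// Install the fallback only when the browser lacks the native method.
if (!Array.prototype.includes) {
  Object.defineProperty(Array.prototype, 'includes', {
    value: includesFallback,
    writable: true,
    configurable: true
  });
}

// Works whether the native method or the fallback is in place.
console.log(['red', 'blue'].includes('blue'));
```

The feature check before installing is the key design point: modern browsers keep their native implementation, and only older engines receive the fallback.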
