
What To Do When Google Can't Understand Your JavaScript By Jody O'Donnell

From the SMX Advanced Conference in Seattle, Washington June 22-23, 2016. SESSION: What To Do When Google Can't Understand Your JavaScript. PRESENTATION: What To Do When Google Can't Understand Your JavaScript - Given by Jody O'Donnell. #SMX #21A2

Published in: Marketing

  1. #SMX #21A2 @gimpslice What To Do When Google Can't Understand Your JavaScript
  2. Jody J. O’Donnell, SEO Technical Director, Merkle, Inc.
  3. Let’s Start with the Basics: Websites need to be discoverable and crawlable. Webpages need to be understood through SEO signals. Webpages need their content scored correctly to be indexed. Websites need to be indexed properly to rank.
  4. Oh…Bing: We are not going to talk about Bing except in the traditional sense of reading the HTTP headers and the HTML source. Bing and the other search engines still require HTML snapshots. HTML snapshots and the DOM rendered by a normal browser engine (which is what Google reads) should be identical.
  5. The Browser Wars: The first thing you did in a new browser was turn JS off. JavaScript wasn’t the problem, but it took the blame. Lookin’ at you, MSIE 3-6.
  6. So…What Happened? It wasn’t that JavaScript got better. Microsoft decided to play ball, and Internet Explorer got better. Today, JavaScript is one of the most popular programming languages in the world.
  7. JavaScript Libraries: jQuery and MooTools came out within a year of each other. Out-of-the-box solutions for free! Client-side applications that use traditional HTML and CSS statements.
  8. What did this do for Us? Development is much faster when you reference a free library rather than create it each time (browser compatibility was a bigger deal, too). It worked with SEO!
  9. At the End of the Day: SEO Ain’t All That. User experience is more important than ever. Attention spans are short. We can't hold technology hostage because of SEO.
  10. JavaScript Frameworks: Libraries evolved into entire frameworks. Frameworks were out-of-the-box solutions for creating apps: apps can run on the client or the server, and the frameworks provided out-of-the-box MVCs. Programmers are now free to concentrate on functionality.
  11. SPAs and JavaScript Frameworks: From an SEO standpoint, we look at the result: our examination is of the code, not the backend that produces it. We are not going to discuss the differences between the frameworks, nor client-side vs. server-side rendering. SPAs and frameworks should be able to produce one URL for each piece of unique content.
  13. History API - pushState(): They built a History API call between the code and the browser, specifically a function called pushState(). Two of its arguments matter here: the title tag and the URL.
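A minimal sketch of the pushState() call the slide refers to. The helper name `showView` and the shape of the `view` object are assumptions made for illustration; only `history.pushState(state, title, url)` and `document.title` are real browser APIs. The History object and document are passed in so the function can also be exercised outside a browser.

```javascript
// Hypothetical helper: give a new SPA view its own URL and title.
// `historyApi` and `doc` fall back to the browser globals when they exist.
function showView(view, historyApi, doc) {
  const h = historyApi || (typeof history !== 'undefined' ? history : null);
  const d = doc || (typeof document !== 'undefined' ? document : null);
  if (!h || !d) {
    throw new Error('showView needs a History API object and a document');
  }
  // pushState(state, title, url): most browsers ignore the title argument,
  // so the title is also set on the document directly.
  h.pushState({ view: view.name }, view.title, view.url);
  d.title = view.title;
}
```

In a browser, `showView({ name: 'shoes', title: 'Shoes', url: '/shoes' })` would change the address bar to `/shoes` without a page reload.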
  15. Different Places to Look: SEOs need to understand the difference between the HTTP headers, the HTML source, the HTML snapshot, and the rendered DOM.
  16. SEO Signals by Order of Precedence: All search engines have two orders. First, the HTTP headers: x-robots-tag, link:canonical, link:hreflang. Second, the HTML source: meta robots tag, <link> canonical, <link> hreflang.
  17. DOM – The Third Order
  18. What were the Results? JavaScript redirects are treated in a similar manner to 301 redirects.
  19. What were the Results? Dynamically inserted content is treated equivalently whether it is in the HTML source or in the DOM. This goes for SEO signals as well.
  20. One Discrepancy: One of the tests failed: the rel="nofollow" attribute was completely ignored in the DOM. We think this is an Order of Precedence problem. Crawl signals are picked up starting in the HTML source, looking for the rel="nofollow" signal. Essentially, a deduping mechanism may be responsible.
  21. SEO Signal vs. Content: An SEO signal is picked up and used the first time it is seen. Content is more of a mixed bag: Google could choose which content it scores and indexes.
  22. Hacker News: As a secondary effect, our article got picked up in a Hacker News thread. The single piece that came from this whole discussion was a small exchange with a self-described ex-Google employee. The discussion was about whether Google would wait 120 seconds before taking the snapshot in case of injected content. The Google engineer agreed that they did wait and that the time was fixed. Was it true?
  23. Google Fetch & Render, PageSpeed Insights, www.maxxeight.com/js-timer/
  24. The Category Page Conundrum: Most ecommerce sites have category pages with hundreds of products to list.
  25. The Category Page Conundrum: You had three choices: view-all pages, paginated pages, or lazy loading (which didn’t have an SEO option).
  26. Infinite Scroll + HTML5: Infinite scroll changed with HTML5’s History API. Now we can tie a JS event listener to a pushState() call. You can push {{ URL + ?page=2 }} into the URL bar. REMEMBER: You can update the title tag here, too.
  27. Infinite Scroll + HTML5: Search engines can reference any of the individual pages and render the HTML equivalent of that single page. THIS IS WHAT YOU DO WITH PAGINATION ALREADY!!!
  28. Infinite Scroll + HTML5: The user scrolls as far as they want. Fantastic user experience.
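The infinite scroll pattern from the last three slides might be wired up roughly as follows. This is a sketch under stated assumptions: the function names, the fixed 2000px page height, and the title format are all hypothetical, and a real listing would measure its own page boundaries. The browser wiring is skipped when no `window` exists.

```javascript
// Build the URL for page n of a category listing (page 1 keeps the bare URL).
function pageUrl(baseUrl, n) {
  const url = new URL(baseUrl);
  if (n > 1) {
    url.searchParams.set('page', String(n));
  } else {
    url.searchParams.delete('page');
  }
  return url.toString();
}

// Which page of results is at the top of the viewport, assuming every page
// of products occupies the same pixel height.
function pageAtOffset(scrollY, pageHeight) {
  return Math.floor(Math.max(0, scrollY) / pageHeight) + 1;
}

// Browser-only wiring: push a paginated URL and title as the user scrolls.
if (typeof window !== 'undefined') {
  const PAGE_HEIGHT = 2000; // assumption: ~2000px per page of products
  const base = window.location.origin + window.location.pathname;
  let current = 1;
  window.addEventListener('scroll', () => {
    const page = pageAtOffset(window.scrollY, PAGE_HEIGHT);
    if (page !== current) {
      current = page;
      history.pushState({ page }, '', pageUrl(base, page));
      document.title = `Products, page ${page}`; // hypothetical title format
    }
  });
}
```

Each pushed URL should also be served by the server as a normal paginated page, so that a crawler requesting `?page=2` directly gets the HTML equivalent of that page.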
  29. Partial Solution: There is a problem. We still need links!
  31. The Rendered DOM: The rendered HTML version of the DOM should be a working HTML copy of the page. When you “Inspect Element” you are seeing the visual representation of the DOM.
  32. Let’s Talk Snapshots: HTML snapshots should be as close to an exact HTML instance of the DOM as possible. Googlebot does not need these at this time.
  33. Conflicting SEO Signals: Do not have conflicting signals between the Orders of Precedence. Be consistent between the HTTP headers, HTML source and rendered DOM.
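One way to audit for conflicts is to collect the same signal from each place and compare the values. The sketch below does this for a canonical URL. The function name and the keys `httpHeader`, `htmlSource` and `renderedDom` are illustrative assumptions, and gathering the three values is left to whatever crawler or browser tooling is in use.

```javascript
// Compare one signal (here: the canonical URL) across the places a crawler
// may read it. Missing values (null or undefined) are ignored.
// Returns null when all present values agree, otherwise the present values.
function findCanonicalConflict(signals) {
  const present = Object.entries(signals).filter(([, value]) => value != null);
  const distinct = new Set(present.map(([, value]) => value));
  return distinct.size > 1 ? Object.fromEntries(present) : null;
}
```

For example, passing `{ httpHeader: null, htmlSource: 'https://example.com/a', renderedDom: 'https://example.com/b' }` returns the two disagreeing values, which flags the page for review.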
  34. Crawl Signals – Robots.txt: In order for the DOM to render fully and correctly, the browser needs access to all the assets requested by the page. This is true for Googlebot rendering the DOM, too. If Googlebot can’t access assets like JavaScript, page renders can be incomplete.
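A quick way to spot this problem is to check a page's asset paths against the site's Disallow rules. The sketch below is deliberately simplified: it assumes plain prefix rules only, with no wildcards, no Allow overrides and no per-user-agent groups, so it is not a full robots.txt parser. The function name is hypothetical.

```javascript
// Simplified check: which asset paths start with a Disallow prefix?
// Assumption: plain prefix matching only (no wildcards, no Allow rules).
function blockedAssets(disallowPrefixes, assetPaths) {
  return assetPaths.filter((path) =>
    disallowPrefixes.some((prefix) => prefix !== '' && path.startsWith(prefix))
  );
}
```

With rules `['/js/', '/private/']` and assets `['/js/app.js', '/css/site.css']`, only `/js/app.js` is reported as blocked, which would be enough to leave a JavaScript-driven page partially rendered.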
  35. Crawl Signals: Crawl signals can be picked up anywhere in the HTTP headers (canonicals), the HTML source (links), or the DOM (links). Crawl signals should be consistent between the Orders of Precedence. Conflicting signals, or signals that appear only in the DOM (such as rel="nofollow"), might not be seen or interpreted correctly.
  36. Onclick + window.location != <a href="#">: The web is still based on links. A JS function is not a hyperlink element. We have seen Google incorrectly create URL “strings” and generate URLs that don’t exist.
  37. Crawl Signals - Navigation: Navigation still needs to be links that go to the correct page. The same goes for faceted navigation: each facet link needs to correspond to an actual page in the click-and-reload method, while the actual loading can be AJAX calls for the user.
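A sketch of how facet links can stay real links while users get AJAX loading. The `facetUrl` helper, the `data-facet` attribute and the sorted-key convention are assumptions for illustration; the idea is only that each facet combination maps to one stable URL that is placed in an ordinary `<a href>`.

```javascript
// Build one stable URL per facet combination by sorting the facet keys,
// so {size, color} and {color, size} produce the same address.
function facetUrl(baseUrl, facets) {
  const url = new URL(baseUrl);
  Object.keys(facets)
    .sort()
    .forEach((key) => url.searchParams.set(key, facets[key]));
  return url.toString();
}

// Browser-only: intercept clicks on real <a href> facet links. Crawlers
// follow the href; users get a pushState update instead of a full reload.
if (typeof document !== 'undefined') {
  document.addEventListener('click', (event) => {
    const target = event.target;
    if (!(target instanceof Element)) return;
    const link = target.closest('a[data-facet]'); // hypothetical attribute
    if (!link) return;
    event.preventDefault();
    history.pushState({}, '', link.href);
    // The results for link.href would be fetched and rendered here.
  });
}
```

Because the href is a working URL, the same link still functions with the click-and-reload method when JavaScript is unavailable.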
  38. Content Considerations: Googlebot is a lame user. Googlebot doesn’t click on buttons, doesn’t scroll down the page, etc. Therefore the content needs to be loaded in the DOM automatically, not based on user interactions. We haven't seen AJAX sequences being indexed and interpreted by Googlebot.
  39. Changing Content Best Practices: One URL per piece of content and one piece of content per URL. It is essential to have every piece of content accessible via its own URL. Single Page Applications (SPAs) should not actually use a “single page” or single URL when delivering the content.
  40. Changing Content: Tabbed content should all be in the DOM, the same way we would want it all in the HTML source to begin with.
  41. Content - DOM Timeouts: Because of the DOM snapshot (5 seconds), content injected automatically after 5 seconds won’t be scored or indexed, and SEO signals injected after 5 seconds won’t be included in the scoring. HTML snapshots need to align with the content present within that 5-second cutoff.
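As a rough planning aid, the cutoff can be applied to a list of scheduled injections to see which ones risk being missed. The 5000 ms constant is the figure quoted on the slide, not an official Google value, and the function name and input shape are assumptions.

```javascript
// Assumption: 5000 ms is the cutoff quoted on the slide, not an official
// Google constant.
const SNAPSHOT_CUTOFF_MS = 5000;

// Return the names of injections scheduled after the cutoff, i.e. the ones
// that may not be in the DOM when the snapshot is taken.
function lateInjections(injections, cutoffMs = SNAPSHOT_CUTOFF_MS) {
  return injections
    .filter((injection) => injection.delayMs > cutoffMs)
    .map((injection) => injection.name);
}
```

Anything this reports is a candidate for loading earlier or rendering into the initial HTML.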
  42. Content - Reviews: Reviews on pages generally behave like old lazy-load pages: the first grouping is loaded in the DOM and the rest are AJAX calls. Reviews behind the AJAX calls will most likely still not be indexed.
  43. Index Signals: Indexation signal directives should be aligned as well: X-Robots-Tag and meta robots “noindex”, and JavaScript redirects. Indexation signal hints: rel next/prev consolidation signals, link canonical tags, and link alternate tags.
  45. SPAs and Status Codes
  46. SPAs and Status Codes
  47. Status Code Challenges: If the URL doesn’t exist, use a JS redirect to a page that actually returns a 404. Do not use a meta http-equiv refresh to redirect. If you need to redirect: a 302 needs to be done server side before the JS app loads (rewrite rule); a 301 can be done the same way as a 302, or with a JS redirect (considered a 301 by Google).
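The unknown-URL case could be handled along these lines. This is a sketch, not the speaker's implementation: the route list, the function name and the `/404` path are hypothetical, and it assumes the server answers `/404` with a real 404 status code.

```javascript
// Decide what the SPA should do with a requested path.
// Assumption: notFoundUrl is served by the server with a real 404 status.
function resolveRoute(path, knownRoutes, notFoundUrl = '/404') {
  return knownRoutes.includes(path)
    ? { action: 'render', path }
    : { action: 'redirect', to: notFoundUrl };
}

// Browser-only: JS-redirect unknown URLs to the page that returns a 404.
if (typeof window !== 'undefined') {
  const routes = ['/', '/shoes', '/shirts']; // hypothetical route table
  const result = resolveRoute(window.location.pathname, routes);
  if (result.action === 'redirect') {
    // replace() keeps the dead URL out of the back-button history.
    window.location.replace(result.to);
  }
}
```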
  49. A Signal is a Signal is a Signal: HTTP headers, HTML source, HTML snapshots and the DOM all contain SEO signals. Google is looking at all three Orders of Precedence for signals. Bing and the rest of the world look at it through the traditional two orders. Be consistent in your content and SEO signals.
  50. Googlebot only as Good as the DOM: Google is dumping (snapshotting) the DOM. Line up content and SEO signals at all levels of precedence. The DOM should be an HTML representation of the working page. Look at your pages in Fetch and Render to see how Google is able to render the page.
  51. Googlebot is a Lame User: Googlebot can’t render dynamic content driven by user interactions such as click and mouseover. It isn’t a user, so it isn’t going to interact with the page beyond a link or a post. It won’t “scroll” to the bottom of the page. You need to assign a unique URL to each piece of unique content.
  52. Links to articles referenced: We Tested JavaScript! (http://searchengineland.com/tested-googlebot-crawls-javascript-heres-learned-220157), Hacker News thread (https://news.ycombinator.com/item?id=9529782), Angular Air Hangout (https://www.youtube.com/watch?v=lxulee01zyY)
  53. LEARN MORE: UPCOMING @SMX EVENTS. THANK YOU! SEE YOU AT THE NEXT #SMX
