
Max Prin - SMX West 2018 - JavaScript & PWAs - What SEOs Need To Know

  1. What SEOs Need To Know: JavaScript & Progressive Web Apps (PWAs) – #SMX #23A @maxxeight
  2. #SMX #23A @maxxeight
  3. What’s a Web App?
  4. What’s a Web App?
      – Traditional page lifecycle: initial GET request → HTML; POST request → HTML
      – Web application lifecycle: initial GET request → HTML (app shell); AJAX call → JSON, HTML, etc. (see the lifecycle sketch after this transcript)
  5. What’s a Progressive Web App? Native Apps vs. Web Apps
  6. What’s a Progressive Web App?
      – Reliable & fast: app shell cached locally on first load; fast loading when offline or on a slow connection on subsequent loads (see the service-worker sketch after this transcript)
      – Mobile-friendly (responsive)
      – Secure (HTTPS)
      – Engaging: bookmark (icon) on the device’s homepage; push notifications
  7. What’s a Progressive Web App?
  8. WHAT ABOUT ACCESSIBILITY FOR SEARCH ENGINE BOTS?
  9. What’s a Progressive Web App? Native Apps vs. Web Apps
  10. How Search Engines Typically Work: Render
  11. Rendering on Google Search
      – Googlebot uses a web rendering service (WRS) that is based on Chrome 41 (M41)
      – However, some features and APIs, such as IndexedDB or Service Workers, are disabled; Google doesn’t install or use service workers when crawling PWAs #SMXInsights
  12. Web Apps (SPAs, PWAs)
      – Issues for all crawlers: potentially a unique URL (or non-crawlable URLs); a unique HTML document (the “app shell”) with the same <head> section (title, meta and link tags, etc.)
      – Issues for crawlers other than Google (and Baidu): client-side rendering of content (HTML source code vs. DOM)
  13. Making Sure Search Engines Can Understand Your Pages
      – Crawling: 1 unique “clean” URL per piece of content (and vice versa)
  14. Crawling: Provide “Clean”/Crawlable URLs
      – Fragment identifier: example.com/#url – not supported, ignored (URL = example.com)
      – Hashbang: example.com/#!url (pretty URL) – Google and Bing will request example.com/?_escaped_fragment_=url (ugly URL), which should return an HTML snapshot
      – Clean URL: example.com/url – leverages the pushState function from the History API; must return a 200 status code when loaded directly (see the pushState sketch after this transcript)
  15. Making Sure Search Engines Can Understand Your Pages
      – Crawling: 1 unique “clean” URL per piece of content (and vice versa); onclick + window.location ≠ <a href="link.html"> (see the crawlable-links sketch after this transcript)
  16. Making Sure Search Engines Can Understand Your Pages
      – Crawling: 1 unique “clean” URL per piece of content (and vice versa); onclick + window.location ≠ <a href="link.html">
      – Rendering: don’t block JavaScript resources via robots.txt
  17. Rendering: Don’t Block Resources (see the robots.txt sketch after this transcript)
  18. Making Sure Search Engines Can Understand Your Pages
      – Crawling: 1 unique “clean” URL per piece of content (and vice versa); onclick + window.location ≠ <a href="link.html">
      – Rendering: don’t block JavaScript resources via robots.txt; load content automatically, not based on user interaction (click, mouseover, scroll)
  19. Rendering: Load Content Automatically (see the auto-loading sketch after this transcript)
  20. Making Sure Search Engines Can Understand Your Pages
      – Crawling: 1 unique “clean” URL per piece of content (and vice versa); onclick + window.location ≠ <a href="link.html">
      – Rendering: don’t block JavaScript resources via robots.txt; load content automatically, not based on user interaction (click, mouseover, scroll); for Bing and other crawlers: HTML snapshots
  21. Making Sure Search Engines Can Understand Your Pages
      – Crawling: 1 unique “clean” URL per piece of content (and vice versa); onclick + window.location ≠ <a href="link.html">
      – Rendering: don’t block JavaScript resources via robots.txt; load content automatically, not based on user interaction (click, mouseover, scroll); for Bing and other crawlers: HTML snapshots
      – Indexing: avoid duplicate <head> section elements (title, meta description, etc.) (see the <head> metadata sketch after this transcript)
  22. Main content gets rendered here. Same title, description, canonical tag, etc. for every URL.
  23. Share these #SMXInsights on your social channels! SEO Best Practices For JavaScript Sites:
      – Crawling: use clean URLs and proper <a href> elements
      – Rendering: avoid blocking resources and loading content upon user interaction
      – Indexing: make sure metadata in <head> is not duplicated across pages
  24. Tools
  25. #SMX #23A @maxxeight
  26. #SMX #23A @maxxeight
  27. #SMX #23A @maxxeight
  28. Share these #SMXInsights on your social channels! The Rich Results Testing Tool is the only Google-provided tool showing the rendered code (DOM)
  29. #SMX #23A @maxxeight
  30. SEO Crawlers
  31. LEARN MORE: UPCOMING @SMX EVENTS – THANK YOU! SEE YOU AT THE NEXT #SMX
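
Code sketch for slide 4. A minimal illustration of the web-app lifecycle, where the initial GET returns only the app shell and the content arrives through a later AJAX call. The /api/products endpoint and the #content element are hypothetical, not from the deck.

    // Hypothetical app-shell page: the initial HTML is mostly empty, and the
    // main content is fetched and rendered client-side after load.
    document.addEventListener('DOMContentLoaded', async () => {
      const response = await fetch('/api/products'); // AJAX call returning JSON
      const products = await response.json();
      document.querySelector('#content').innerHTML = products
        .map((p) => `<li>${p.name}</li>`)
        .join('');
    });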
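
Code sketch for slide 6. A minimal service worker showing the “app shell cached locally” behavior; the cache name and file list are placeholders, and a production PWA would also handle cache versioning and cleanup.

    const CACHE = 'app-shell-v1';
    const SHELL = ['/', '/app.js', '/styles.css']; // hypothetical shell files

    // On first load, cache the app shell.
    self.addEventListener('install', (event) => {
      event.waitUntil(caches.open(CACHE).then((cache) => cache.addAll(SHELL)));
    });

    // On subsequent loads, serve cache-first so the shell appears quickly
    // even offline or on a slow connection.
    self.addEventListener('fetch', (event) => {
      event.respondWith(
        caches.match(event.request).then((hit) => hit || fetch(event.request))
      );
    });

Note that, per slide 11, Google doesn’t install or use service workers when crawling, so indexable content can’t depend on this cache.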
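
Code sketch for slide 14. A minimal client-side router built on the History API’s pushState; the loadContent() helper and the /fragments/ URL scheme are hypothetical. As the slide notes, each clean URL must also return a 200 when loaded directly, which requires matching server-side routing.

    // Hypothetical helper: fetch a content fragment for a path and render it.
    async function loadContent(path) {
      const html = await (await fetch('/fragments' + path + '.html')).text();
      document.querySelector('#content').innerHTML = html;
    }

    // Navigate to a clean URL (example.com/url) without a full page reload.
    function navigate(path) {
      history.pushState({}, '', path); // updates the address bar
      loadContent(path);
    }

    // Keep the browser's back/forward buttons working.
    window.addEventListener('popstate', () => loadContent(location.pathname));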
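
Code sketch for slide 15. The contrast between JavaScript-only navigation, which crawlers don’t follow, and a real <a href> that is progressively enhanced. This builds on the hypothetical navigate() helper from the pushState sketch above.

    // Anti-pattern: crawlers don't execute click handlers, so this
    // "link" is invisible to them:
    //   <span onclick="window.location = '/shoes'">Shoes</span>
    //
    // Crawlable pattern: a real anchor (<a href="/shoes">Shoes</a>),
    // optionally enhanced to use client-side routing:
    document.querySelectorAll('a').forEach((link) => {
      link.addEventListener('click', (event) => {
        event.preventDefault();                // skip the full page reload
        navigate(link.getAttribute('href'));   // client-side route instead
      });
    });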
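
Config sketch for slides 16 and 17. A hypothetical robots.txt showing the mistake the slides warn against: disallowing script directories prevents the rendering service from executing the page’s JavaScript (the directory names are made up).

    # Anti-pattern (hypothetical paths): blocking script directories keeps
    # Google's web rendering service from executing the page's JavaScript.
    User-agent: *
    Disallow: /js/
    Disallow: /assets/scripts/
    #
    # The fix is simply to remove such Disallow rules so JavaScript (and CSS)
    # resources stay fetchable by crawlers.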
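
Code sketch for slides 18 and 19. Content that only loads after a click, mouseover, or scroll never enters the rendered DOM that crawlers see; loading it automatically on page load does. The /api/reviews endpoint and #reviews container are hypothetical.

    // Anti-pattern: reviews exist in the DOM only after a user clicks.
    //   button.addEventListener('click', loadReviews);

    // Crawler-friendly: load the same content automatically on page load.
    document.addEventListener('DOMContentLoaded', async () => {
      const reviews = await (await fetch('/api/reviews')).json();
      document.querySelector('#reviews').innerHTML = reviews
        .map((r) => `<p>${r.text}</p>`)
        .join('');
    });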
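
Code sketch for slides 21 and 22. When every route reuses the app shell’s <head>, titles and descriptions are duplicated across URLs; updating them per route avoids that. The route-to-metadata map and page names are hypothetical.

    const META = {
      '/':      { title: 'Acme Store',         description: 'Shop everything Acme.' },
      '/shoes': { title: 'Shoes | Acme Store', description: 'All Acme shoes.' },
    };

    // Swap the title and meta description whenever the route changes.
    function updateHead(path) {
      const entry = META[path] || META['/'];
      document.title = entry.title;
      document
        .querySelector('meta[name="description"]')
        .setAttribute('content', entry.description);
    }

In the pushState sketch above, navigate() would call updateHead(path) alongside loadContent(path).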

Editor's Notes

  • In computing, a web application or web app is a client–server computer program in which the client (including the user interface and client-side logic) runs in a web browser. Common web applications include webmail, online retail sales, online auctions, wikis, instant messaging services and many other functions.
    https://en.wikipedia.org/wiki/Web_application

    Any website can be a web app. But in general, a web app provides some type of functionality or interactive experience, such as ordering something online.
    In general, sites with static content, such as corporate websites and news publishers, are not web apps. This changed with the rise of PWAs.
  • Single-Page Applications (SPAs) are Web apps that load a single HTML page and dynamically update that page as the user interacts with the app. SPAs use AJAX and HTML5 to create fluid and responsive Web apps, without constant page reloads. However, this means much of the work happens on the client side, in JavaScript.
    https://msdn.microsoft.com/en-us/magazine/dn463786.aspx
  • Why is the reach of web apps higher? They are discoverable through search engines (vs. app stores).
  • No need to be indexed
  • “Rendering” is the keyword. For a few years now, Google has been rendering web pages after crawling and before indexing, in order to understand them better.
  • https://developers.google.com/search/docs/guides/rendering
  • Fragment identifier: this URL structure is already a concept in the web and relates to deep linking into content on a particular page (“jump links”).
    Can’t be accessed/crawled/indexed.
    Hashbang: Used with the “old” AJAX crawling scheme. Not recommended, more complex to implement.
    Clean URL using History API’s pushState function.

    AJAX crawling scheme: Google deprecated this recommendation in October 2015; it won’t be supported beyond ~Q2 2018.
  • Mega menus – mouseover + AJAX
    Tabs/accordions – click + AJAX
    Load more/infinite scroll – click/scroll + AJAX
  • But it’s still a better source of info than the cache.

    It fetches pages from a Google IP (this sometimes matters, since some websites block the “Googlebot” user-agent when requests don’t come from a known Google IP).
    It leverages Googlebot’s JavaScript rendering engine, which is likely more advanced than PhantomJS.
  • Botify
    ScreamingFrog
    Sitebulb
