Understanding Page Load / Ziling Zhao (Google)

Large websites with large customer bases should load quickly no matter where their customers are coming from. In this day and age, speed is expected. Getting there requires engineers to have both data and the ability to analyze it and find problems.

This talk addresses page load speed in two parts: a "cold" load, where a user first comes to your site, and a "warm" load, which deals with intra-site page load speed. We will dive into the details of each kind of page load and what is really going on. From network optimization to browser render performance, everything matters when it comes to optimizing the load of your web page. We will also look at some tools that help developers discover, analyze, and address problems.

Published in: Engineering

  1. 1. Understanding Page Load Ziling Zhao - Engineer at Google
  2. 2. What I will talk about ● Page Load Latency ○ Time leading from a click to a usable website ● Traditional and dynamic page loads ● A few tips and things to avoid ● How the browser works ● Tools that can help you explore site performance ● Most of this was picked up while optimizing YouTube
  3. 3. Who am I? ● Software Engineer on YouTube ○ Working on Desktop and Mobile Web ● Been at Google for 4 ½ years ● Prior to Google, I worked at Cisco Systems ● Graduated from the University of Washington ● Specialize in web performance
  4. 4. Why care about page load speed? ● Customers are from different parts of the world ● People have slow connections and slow CPUs ● Malware, antivirus, memory pressure, etc. ● Mobile users have resource constraints ● People want smooth app-like experiences ● Slow websites turn users away!
  5. 5. Two types of page loads.
  6. 6. Traditional Page Load First page load, as well as any full page load. Measured starting from when the request is sent for the page. Ends when the important parts of the requested page are viewable and ready. "Cold"
  7. 7. Dynamic Page Load AJAX driven site navigation. Starts when an action (typically user driven) triggers a navigation to the next page. This should be before the actual request. Ends when the important parts of the requested page are viewable and ready. "Warm"
  8. 8. Monitor your performance.
  9. 9. Response-Time Rules of Thumb ● 0.1s - Instantaneous ● 1s - Seamless ● 10s - Attention threshold ● Jakob Nielsen, http://www.nngroup.com/articles/response-times-3-important-limits/
  10. 10. Network dominates load time.
  11. 11. Examining a web page load.
  12. 12. Chrome DevTools - Network Panel
  13. 13. WebPagetest
  14. 14. WebPagetest - highload.co http://www.webpagetest.org
  15. 15. Hmmm... WebPagetest results
  16. 16. Page Load Basics A CDN serves content closer to your customers. Low latency. Cookieless, smaller response/request size. Extra DNS lookup and TCP connection for separate domain ● Use a CDN ● Cache static content ● Connection keep-alive ● Use gzip transfers ● Compress Images ● Minify your javascript/css ● Minimize first byte time ● Reduce number of downloads
  17. 17. List static assets not being served off of a CDN
  18. 18. Page Load Basics Avoid the roundtrip. ● Use a CDN ● Cache static content ● Connection keep-alive ● Use gzip transfers ● Compress Images ● Minify your javascript/css ● Minimize first byte time ● Reduce number of downloads
  19. 19. Check for cache headers These images never change ● Set a max-age or expires header ETag still incurs a round trip ● Browser needs to ask the server if content has changed ○ RTT 289ms!
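The difference between a max-age header and an ETag on the slide above can be sketched as a small decision function. This is a deliberately simplified model, not how any particular browser implements its cache: it ignores Expires, heuristic freshness, `no-cache`, and `no-store`, and the names `CachedResponse`, `decideCacheAction`, and `parseMaxAge` are hypothetical.

```typescript
// Simplified, hypothetical model of a cache lookup. Real browser caches
// handle many more directives than this sketch does.
type CacheAction = "use-cached-copy" | "revalidate" | "full-download";

interface CachedResponse {
  maxAgeSeconds?: number; // from "Cache-Control: max-age=N", if present
  etag?: string;          // from the "ETag" header, if present
  ageSeconds: number;     // how long ago the response was stored
}

// Extracts the max-age value from a Cache-Control header value, if any.
function parseMaxAge(cacheControl: string): number | undefined {
  const match = /(?:^|,)\s*max-age=(\d+)/i.exec(cacheControl);
  return match ? parseInt(match[1], 10) : undefined;
}

function decideCacheAction(entry: CachedResponse): CacheAction {
  if (entry.maxAgeSeconds !== undefined && entry.ageSeconds < entry.maxAgeSeconds) {
    // Still fresh: served locally with no network round trip.
    return "use-cached-copy";
  }
  if (entry.etag !== undefined) {
    // A conditional request is needed: one full round trip even when the
    // server answers that nothing changed.
    return "revalidate";
  }
  return "full-download";
}
```

Under these assumptions, an image stored with only an ETag always lands in the "revalidate" branch, which is the round trip the slide warns about.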
  20. 20. Page Load Basics Do your handshake once. Reduce your transfer size. ● Use a CDN ● Cache static content ● Connection keep-alive ● Use gzip transfers ● Compress Images ● Minify your javascript/css ● Minimize first byte time ● Reduce number of downloads
  21. 21. Keep connections open and gzip
  22. 22. Page Load Basics ● Use a CDN ● Cache static content ● Connection keep-alive ● Use gzip transfers ● Compress Images ● Minify your javascript/css ● Minimize first byte time ● Reduce number of downloads
  23. 23. Image size reduction
  24. 24. Compress Images Original Image ● JPEG Quality 94 ● Resolution: 1500 x 1001 pixels ● File Size: 795KB Compressed Image ● JPEG Quality 70 ● Resolution: 1024 x 683 pixels ● File Size: 152.5KB WebP ● WebP Quality 70 ● Resolution: 1024 x 683 pixels ● File Size: 136.2KB
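To put the file sizes on the slide above in perspective, the savings can be worked out with a one-line helper. The function name `savingsPercent` is hypothetical; the inputs are the sizes quoted on the slide.

```typescript
// Percentage saved when going from originalKb to compressedKb,
// rounded to one decimal place.
function savingsPercent(originalKb: number, compressedKb: number): number {
  const fraction = 1 - compressedKb / originalKb;
  return Math.round(fraction * 1000) / 10;
}
```

With the slide's numbers, `savingsPercent(795, 152.5)` gives 80.8 and `savingsPercent(795, 136.2)` gives 82.9, while WebP over the already compressed JPEG, `savingsPercent(152.5, 136.2)`, gives 10.7. Note that the first two figures combine the lower quality setting with the smaller resolution.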
  25. 25. Page Load Basics Use something like uglify/closure compiler/YUICompressor . ● Use a CDN ● Cache static content ● Connection keep-alive ● Use gzip transfers ● Compress Images ● Minify your javascript/css ● Minimize first byte time ● Reduce number of downloads
  26. 26. Minify your large javascript files
  27. 27. Probably don't need another version.
  28. 28. Page Load Basics Chunk your responses if you have RPCs or heavy server processing. Serve static content early. ● Use a CDN ● Cache static content ● Connection keep-alive ● Use gzip transfers ● Compress Images ● Minify your javascript/css ● Minimize first byte time ● Reduce number of downloads
  29. 29. Transfer-Encoding: Chunked
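For readers unfamiliar with the wire format behind `Transfer-Encoding: chunked`, here is a minimal sketch of the HTTP/1.1 framing: each chunk is its size in hexadecimal, a CRLF, the data, and another CRLF, and a zero-size chunk ends the body. The function name `encodeChunked` is hypothetical, and the sketch assumes ASCII-only input so that string length equals byte length; a real server would count bytes and would normally let its HTTP library do this.

```typescript
// Frames the given parts as an HTTP/1.1 chunked body.
// Assumption: every part is ASCII, so part.length is its byte length.
function encodeChunked(parts: string[]): string {
  let body = "";
  for (let i = 0; i < parts.length; i++) {
    const part = parts[i];
    if (part.length === 0) {
      continue; // a zero-size chunk would terminate the body early
    }
    body += part.length.toString(16) + "\r\n" + part + "\r\n";
  }
  return body + "0\r\n\r\n"; // terminating chunk
}
```

The point for page load is that the first part (for example the static `<head>` with its stylesheet links) can be flushed as its own chunk before slower server work for the rest of the page has finished.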
  30. 30. Page Load Basics ● Use a CDN ● Cache static content ● Connection keep-alive ● Use gzip transfers ● Compress Images ● Minify your javascript/css ● Minimize first byte time ● Reduce number of resources
  31. 31. Reduce the number of resources ● This used to be simple ○ HTTP/2 makes things complicated ● Under HTTP/1 ○ ~6 concurrent connections per server, shared across all tabs ■ Reduce the total number of resources needed ○ Sprite your assets, reduce the number of small images ○ Reduce CSS/JS downloads ● But can you serve SPDY/HTTP2?
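A rough way to see why the resource count matters under HTTP/1 is to count how many sequential "waves" of requests fit through the connection limit. The sketch below assumes one request at a time per connection, six connections, and that every request costs at least one round trip; it ignores bandwidth, server time, and connection setup, so it gives a floor, not a prediction. The function names are hypothetical.

```typescript
// Number of sequential request waves needed when at most maxConnections
// requests can be in flight at once.
function requestWaves(resourceCount: number, maxConnections: number): number {
  if (resourceCount <= 0) {
    return 0;
  }
  return Math.ceil(resourceCount / maxConnections);
}

// Lower bound on total latency: every wave costs at least one round trip.
function latencyFloorMs(resourceCount: number, maxConnections: number, rttMs: number): number {
  return requestWaves(resourceCount, maxConnections) * rttMs;
}
```

Borrowing the 289 ms round trip from the cache-header slide as an example value, 30 resources over 6 connections cannot finish in less than `latencyFloorMs(30, 6, 289)`, which is 1445 ms, while spriting and bundling down to 12 resources lowers the floor to 578 ms.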
  32. 32. SPDY/HTTP2 and QUIC ● Connection multiplexing ○ Allows multiple requests over one socket ■ Only creates one connection per server ■ Reduces the need to sprite your images or shard static domains ● Header compression ● Faster SSL handshake ● Server push: send resources before the client asks for them ● QUIC is a UDP-based transport ○ currently used for YouTube video serving
  33. 33. Dev Tools
  34. 34. After the first byte
  35. 35. The browser starts getting busy ● The browser must take the CSS and HTML and join them together ○ HTML is parsed into a DOM (Document Object Model) ○ CSS is parsed into a CSSOM ● Styles are applied/recalculated ● The page is laid out and painted ○ Elements are positioned and pixels are drawn ● Resources linked to in your page must be downloaded ● Javascript must be executed
  36. 36. Rendering is (mostly) single threaded.
  37. 37. The render-thread has a lot to do ● A single render thread (per tab) handles a lot of important tasks ○ Parsing ○ Styling ○ Layout ○ Painting ○ Javascript ○ Event handling, etc. ● Network operations are on a separate process ○ One shared network manager across tabs
  38. 38. Page load walkthrough.
  39. 39. After the first byte ● Browser parses HTML to construct the DOM (Document Object Model) ● CSS is parsed to form the CSSOM ● DOM + CSSOM = Render Tree ○ This is how the visible elements will be displayed. ● Render Tree is painted HTML DOM Tree
  40. 40. After the first byte ● Browser parses HTML to construct the DOM (Document Object Model) ● CSS is parsed to form the CSSOM ● DOM + CSSOM = Render Tree ○ This is how the visible elements will be displayed. ● Render Tree is painted HTML DOM Tree CSS CSSOM
  41. 41. Where does the CSS come from?
  42. 42. The browser has to read the <link> before CSS downloads.
  43. 43. After the first byte ● Browser parses HTML to construct the DOM (Document Object Model) ● CSS is parsed to form the CSSOM ● DOM + CSSOM = Render Tree ○ This is how the visible elements will be displayed. ● Render Tree is painted HTML DOM Tree CSS CSSOM HTML Parser Network Manager
  44. 44. After the first byte ● Browser parses HTML to construct the DOM (Document Object Model) ● CSS is parsed to form the CSSOM ● DOM + CSSOM = Render Tree ○ This is how the visible elements will be displayed. ● Render Tree is painted HTML DOM Tree CSS CSSOM HTML Parser Network Manager Render Tree
  45. 45. Javascript is blocking.
  46. 46. Parser + Javascript on the same thread ● Javascript can read and write to the DOM ○ The parser is cautious and will pause until javascript is downloaded and run ● The browser can paint the partial page while blocked ○ Being blocked in <head> means there is nothing to paint ● General advice: ○ Have stylesheets in the head and send them early ○ Put javascript at the bottom of your page to avoid blocking the parser
  47. 47. Be careful with JS in <head> ● Stylesheets block painting ● Putting an inline <script> right after <link rel="stylesheet"> will block the parser until the stylesheet is downloaded ○ Browser assumes js will read the stylesheet ● If you must put scripts in the head: ○ Keep them small, maybe inline ○ It may be better to put them before your external CSS ■ Otherwise they will wait for the CSS to finish downloading before executing
  48. 48. After the first byte ● Browser parses HTML to construct the DOM (Document Object Model) ● CSS is parsed to form the CSSOM ● DOM + CSSOM = Render Tree ○ This is how the visible elements will be displayed. ● Render Tree is painted HTML DOM Tree CSS CSSOM HTML Parser Network Manager Render Tree Paint Display
  49. 49. The Preloader
  50. 50. The Preloader ● Also known as: Speculative/look-ahead pre-parser ● Making your site run faster since 2008 ● ~20% page load improvement ● While the parser is blocked, another parser flies ahead and finds all downloadable resources ○ Stylesheets, scripts, images, html imports
  51. 51. Network Priorities - Chrome ● Network requests are executed in priority order ○ Layout blocking network requests are HIGHEST/MEDIUM priority (CSS/HTML) ○ Layout blocking mode allows one low priority request at a time (this may change). ○ Scripts are LOW priority ○ Async XHRs are LOW priority ○ Images are LOWEST or IDLE priority
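The idea of "executed in priority order" on the slide above can be illustrated with a toy ordering function. This is only a sketch of the concept: the priority names are taken from the slide, but the function `dispatchOrder`, the `PendingRequest` shape, and the tie-breaking rule (discovery order) are assumptions, and it does not model layout blocking mode or anything else Chrome's real scheduler does.

```typescript
// Priority names as listed on the slide, from most to least urgent.
type Priority = "HIGHEST" | "MEDIUM" | "LOW" | "LOWEST" | "IDLE";
const PRIORITY_ORDER: Priority[] = ["HIGHEST", "MEDIUM", "LOW", "LOWEST", "IDLE"];

interface PendingRequest {
  url: string;
  priority: Priority;
}

// Returns URLs in the order a simple priority queue would dispatch them.
// Requests with equal priority keep their discovery order.
function dispatchOrder(requests: PendingRequest[]): string[] {
  const indexed = requests.map((request, index) => ({ request, index }));
  indexed.sort((a, b) => {
    const byPriority =
      PRIORITY_ORDER.indexOf(a.request.priority) - PRIORITY_ORDER.indexOf(b.request.priority);
    return byPriority !== 0 ? byPriority : a.index - b.index;
  });
  return indexed.map((entry) => entry.request.url);
}
```

In this model an image discovered first in the markup is still dispatched after a stylesheet and a script discovered later, which matches the slide's ordering of CSS, scripts, and images.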
  52. 52. Pre-loader - Chunked Transfer ● Chunked encoding + scripts at the end ○ Pre-loader can't see it until the chunk is sent ■ Your script download will be after other resources ○ Can use async and defer to put scripts earlier ■ async - Doesn't block parser for download, will pause to execute ● This will put it in LOWEST priority ■ defer - Doesn't block for download, execute after parser is completed (keeps order)
  53. 53. Scripts at footer can load earlier than images
  54. 54. chrome://net-internals#events
  55. 55. Warning: JS module loaders ● Cannot take advantage of the pre-loader ● Script loaders will only load scripts after: ○ Your bootstrap script downloads and is executed ○ Bootstrap script execution will be blocked by any previous CSS or javascript ● Same applies to CSS loaders ○ Especially damaging since CSS is typically fetched at MEDIUM priority
  56. 56. Dynamic Page Transitions
  57. 57. Dynamic page transitions ● Faster than full page loads ○ We can make them even faster ● Fewer network requests ● Higher expectations for mobile web apps ○ Animate between page loads ■ Must be careful to avoid janking the animations. ● We want to hide latency and make it feel seamless ○ Ideally instantaneous if you prefetch
  58. 58. Dynamic page load Jankiness can be introduced at almost every step. When the user expresses an intent is where it starts. User Input, can be a touch, click, scroll, anything that signifies a dynamic page load should occur. User Input Time
  59. 59. Dynamic page load Often, page transitions require you to clean up or animate some stuff. This can involve removing event listeners, turning off animations, removing DOM elements, etc. We typically will want to fetch the page data and then run dispose because why wait? User Input Dispose XHR Time
  60. 60. Dynamic page load Note that network happens on a separate thread and does not block the render thread. The XHR however is on the same thread and cannot run in parallel with dispose. User Input Dispose XHR Time Network Manager hl.co/next-page
  61. 61. Dynamic page load After the response comes back an event handler for the XHR will run and mutate the web page. After modifying the page, javascript code for the specific page may need to be initialized. User Input Dispose XHR Time Network Manager XHR.onload Mutate DOM Init Page JS hl.co/next-page
  62. 62. DOM mutation causes work
  63. 63. Dynamic page load If you inject HTML, the browser must parse the text into a DOM tree and append it. The part of the DOM that was mutated needs to go through style recalculation and layout. Then the DOM needs to be re-painted. User Input Dispose XHR Time Network Manager hl.co/next-page XHR.onload Mutate DOM Parse RecalcStyle Layout Init Page JS Paint
  64. 64. Dynamic page load Your dispose call could have modified the DOM as well. If so, parse, layout, and paint will also occur after dispose. Your JS + browser work can delay the handling of network responses. User Input Dispose XHR Time Network Manager hl.co/next-page XHR.onload Mutate DOM Init Page JS Parse RecalcStyle Layout Paint Parse RecalcStyle Layout Paint
  65. 65. Dynamic page load From the user's perspective, the latency of going from one page to the next is the equivalent of starting with the user input all the way till the last paint. So, we are done right? User Input Dispose XHR Time Network Manager XHR.onload Mutate DOM Init Page JS hl.co/next-page Parse RecalcStyle Layout Paint Parse RecalcStyle Layout Paint
  66. 66. Dynamic Page Load Deals with intra-site navigation. Typically AJAX driven. Starts when an action (typically user driven) decides to load a page. This can be before the actual request. Ends when the important parts of the requested page are viewable and ready. "Warm"
  67. 67. Dynamic page load Just because the DOM was changed via your XHR doesn't mean the page is ready. You may have loaded additional stylesheets, more images, etc. User Input Dispose XHR Time Network Manager XHR.onload Mutate DOM Init Page JS hl.co/next-page Parse RecalcStyle Layout Paint Parse RecalcStyle Layout Paint
  68. 68. Dynamic page load Just because the DOM was changed via your XHR doesn't mean the page is ready. Often there are CSS files or images in your new page that need to be loaded. When the CSS comes in, the page needs to be reflowed and repainted. Time Network Manager XHR.onload Mutate DOM Init Page JS css image.png image2.png Parse RecalcStyle Layout Paint RecalcStyle Layout Paint
  69. 69. Dynamic page load Sometimes there is additional data to load, maybe from third party sources, or just delay-loaded components. Be aware of where and when your javascript is running during these critical moments. Time Network Manager XHR.onload Mutate DOM Init Page JS moredata css image.png image2.png moreData.onLoad RecalcStyle Layout Paint Parse RecalcStyle Layout Paint
  70. 70. Keep the render thread idle
  71. 71. Break up long running Javascript ● Delay non-essential javascript ● Debounce event handlers ○ readyStateChange/scroll/resize events fire far more often than you need ■ These events will queue up behind slow js handlers ● Javascript should run in short chunks for animations ○ For 60fps animations, limit javascript to < 16ms (including browser render time!) ○ requestAnimationFrame
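One way to picture "run in short chunks" is to plan the work up front so that no slice exceeds a time budget. The sketch below is a simplified model under stated assumptions: the per-task cost estimates are hypothetical inputs you would have to measure yourself, and `planSlices` is an illustrative name, not a browser API.

```typescript
// Groups task indices into slices whose estimated cost stays within budgetMs.
// A task that alone exceeds the budget still gets a slice of its own, so the
// plan always makes progress.
function planSlices(taskCostsMs: number[], budgetMs: number): number[][] {
  const slices: number[][] = [];
  let current: number[] = [];
  let usedMs = 0;
  for (let i = 0; i < taskCostsMs.length; i++) {
    const cost = taskCostsMs[i];
    if (current.length > 0 && usedMs + cost > budgetMs) {
      slices.push(current); // close the slice before it overruns the budget
      current = [];
      usedMs = 0;
    }
    current.push(i);
    usedMs += cost;
  }
  if (current.length > 0) {
    slices.push(current);
  }
  return slices;
}
```

In a page, each slice would run in its own `requestAnimationFrame` or `setTimeout` callback so the render thread can paint and handle input in between. Because the 16 ms frame budget includes browser render time, a budget noticeably below 16 ms is the safer choice; `planSlices([5, 5, 5, 5, 20, 3], 10)` yields `[[0, 1], [2, 3], [4], [5]]`, with the 20 ms task isolated as the one that still needs to be broken up further.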
  72. 72. Don't run js for long periods of time.
  73. 73. RAIL: Response, Animation, Idle, Load. https://developers.google.com/web/tools/chrome-devtools/profile/evaluate-performance/rail ● 0 - 16ms: Threshold for animations ● 0 - 100ms: User input response ● 100 - 300ms: Detectable delay ● 300 - 1000ms: Keeps user focus ● 1000+ms: Users lose focus ● 10s+: Abandonment
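The thresholds on the RAIL slide can be turned into a small classifier, for example to bucket measured latencies in a dashboard. The bucket labels come from the slide; the function name `classifyDelay` and the choice of which side of each boundary is inclusive are assumptions.

```typescript
// Maps a measured delay in milliseconds to the bucket named on the slide.
// Boundary handling (<= versus <) is an assumption of this sketch.
function classifyDelay(delayMs: number): string {
  if (delayMs <= 16) {
    return "within the animation threshold";
  }
  if (delayMs <= 100) {
    return "user input response";
  }
  if (delayMs <= 300) {
    return "detectable delay";
  }
  if (delayMs <= 1000) {
    return "keeps user focus";
  }
  if (delayMs < 10000) {
    return "users lose focus";
  }
  return "abandonment";
}
```

For instance, the 289 ms round trip quoted earlier in the deck already falls into the "detectable delay" bucket on its own.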
  74. 74. RecalcStyle Layout Paint Reduce/eliminate parse time ● Using innerHTML requires the browser to parse HTML text into DOM ○ Often faster to use alternatives: ■ createElement/cloneNode/appendChild ■ <template> importNode ● innerHTML is also a security issue ● Try not to block parse with scripts. Parse
  75. 75. Parse time blocks on inline scripts.
  76. 76. Make recalcstyle and layout faster ● Mutate as few DOM elements as possible ○ The browser can optimize for subtree changes ● Reduce the size of your DOM ○ Rule of thumb: no more than ~3000 elements on a page ● Reduce the number of CSS rules and the complexity of selectors ● Hide things that are not visible ○ display: none ● Avoid mutating stylesheets. Parse RecalcStyle Layout Paint RecalcStyle Layout
  77. 77. Recalculate Style + Layout in Devtools
  78. 78. Speeding up paints ● Use GPU rasterization on mobile web ○ Use meta viewport tag containing width=device-width and minimum-scale=1.0 ● Use layers where it makes sense ○ Do research, be careful where you do it ■ Don't create too many layers ○ Creates separate textures ■ can eliminate the need to paint at all Parse RecalcStyle Layout Paint Paint
  79. 79. Networking for dynamic pages ● Layout blocking mode isn't active anymore ● Typically fewer requests for dynamic transitions ○ Be careful cancelling a request with HTTP/1: the socket is closed and has to be re-established ● Do you value images or AJAX? ○ Image priorities are LOWEST/IDLE vs. XHR which is LOW
  80. 80. Summary ● Profile your page loads ○ Test with webpagetest.org ● Make your page transitions seamless ○ Keep your render thread idle and happy ● Understand how the browser loads your page ○ Both first page load and intra-site page loads ○ Analyze your render thread!
  81. 81. Questions?
  82. 82. Extra Slides
  83. 83. Real User Metrics ● What does your website feel like to the average user? ○ Are you optimizing the correct area? ● Fully understand the flow of your website ○ What actions/events are important to your users? ■ When does this action/event start? ○ What are the blocking steps? ■ Are they blocked on network? CPU? User input? ● Need large data sets ○ Server side performance tends to be consistent ○ Client side performance is noisy and requires large data sets.
  84. 84. Measuring start and stop ● Start as early as possible. ○ Use the navigation/resource timing APIs. ● Where to stop depends on your site. ○ Content above the fold is typical. ■ onLoad is not a good measure. ○ If you're a video site, you may care more about the video playing.
  85. 85. Navigation/Resource Timing API
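As a sketch of how the start and stop points from the previous slide might be combined, the function below derives a few durations from a navigation timing sample plus an application-defined "ready" mark. The field names match properties of the browser's `PerformanceNavigationTiming` entry; the `summarizePageLoad` function, the `PageLoadSummary` shape, and the `readyAtMs` mark are hypothetical.

```typescript
// Subset of fields found on a PerformanceNavigationTiming entry.
// All values are milliseconds relative to the start of the navigation.
interface NavigationTimingSample {
  startTime: number;
  requestStart: number;
  responseStart: number;
  responseEnd: number;
  loadEventEnd: number;
}

interface PageLoadSummary {
  firstByteMs: number; // request sent until first byte received
  downloadMs: number;  // first byte until the document finished downloading
  readyMs: number;     // navigation start until the site-specific "ready" mark
  loadEventMs: number; // navigation start until onload finished
}

// readyAtMs is a hypothetical, site-defined mark, e.g. the time at which
// above-the-fold content was painted or a video started playing.
function summarizePageLoad(sample: NavigationTimingSample, readyAtMs: number): PageLoadSummary {
  return {
    firstByteMs: sample.responseStart - sample.requestStart,
    downloadMs: sample.responseEnd - sample.responseStart,
    readyMs: readyAtMs - sample.startTime,
    loadEventMs: sample.loadEventEnd - sample.startTime,
  };
}
```

In a browser, the sample could come from `performance.getEntriesByType("navigation")[0]`. Reporting `readyMs` next to `loadEventMs` makes it visible when onload fires long after (or before) the page is actually usable, which is why the slide calls onLoad a poor measure.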
  86. 86. Profiling tools I use ● DevTools ○ All types of goodies ○ High overhead, accessible, (mostly) easy to use ● chrome://tracing ○ Gives you insight to Chrome ○ Hard to use ○ High overhead ● Web Tracing Framework ○ Setup required ○ Low overhead instrumented profiling
  87. 87. Chrome DevTools - Network Panel - Throttling
  88. 88. Chrome DevTools - Timeline ● Arguably most powerful tool in devtools ● High overhead ● Observe render thread blocking operations ○ What javascript functions were called ○ How much time did parse/recalculate style/layout/paint take? ● Style Invalidation Tracing/Paint Inspection/Memory profiling ● Lots more stuff!
  89. 89. Chrome DevTools - Timeline
  90. 90. chrome://tracing ● A very detailed view into Chrome ● Hard to use ○ A lot of data ■ Collects data from all of chrome, not just your tab ○ Limited collection buffer size ○ Limited amounts of app specific JS information available ● Best run in its own Chrome instance
  91. 91. chrome://tracing
  92. 92. Web Tracing Framework ● Written for Google Maps ○ http://google.github.io/tracing-framework/ ○ Made for tracking FPS and inspecting WebGL ● Instrumented profiler ○ Low overhead ○ Setup cost ■ Minified code exports readable names to profiler ○ Does not integrate with Chrome DevTools ○ No network view
  93. 93. Web Tracing Framework