From nothing to a video under 2 seconds / Mikhail Sychev (YouTube)

What does it take to achieve sub-two-second video playback latency on the 3rd largest website in the world?

We will peek under the hood of the Watch page and explore the common problems being solved by YouTube's Desktop team and the interesting solutions that had to be implemented to achieve this goal.

We will discuss how page loads are classified and what specific treatment each type requires, what tools and technologies are used in the stack, how being one of the largest image-serving websites affects our approach to thumbnails, and how we maintain and monitor our latency goals.

Published in: Engineering

  1. From nothing to a video under 2 seconds
     Mikhail Sychev, Software Engineer Unicorn at Google
  2. Who am I?
     ● Software Engineer at YouTube
     ● Evangelist of modern web technologies
     ● Member of the “Make YouTube Faster” group
  3. What we will talk about
     ● Types of page load, associated challenges and our approach to handling them
     ● Tools that we use to build and monitor YouTube
     ● Tricks we learned along the way
  4. 1 Second
     “...users notice the short delay, they stay focused on their current train of thought during the one-second interval … this means that new pages must display within 1 second for users to feel like they're navigating freely; any slower and they feel held back by the computer and don't click as readily.”
     Jakob Nielsen, http://www.nngroup.com/articles/response-times-3-important-limits/
  5. YouTube is a video streaming service; starting video playback as early as possible is the most important task of the Watch page
     ...yet everything else is important too
  6. [Diagram of the Watch page; labels: Video Player, Guide, Metadata, Actions, Playlist and related videos, Comments, Metadata, Masthead]
  7. Navigation types
     Cold load / Warm load / Hot load
  8. Users opening a YouTube page with a plain HTTP(S) request
     ● Full HTML page downloaded and parsed
     ● Some or no static resources in cache
     ● Some DNS cache
     ● Thumbnails have to be downloaded
  9. ● Various “pre-browse” techniques
       ○ http://www.slideshare.net/souders/prebrowsing-velocity-ny-2013
       ○ http://www.slideshare.net/MilanAryal/preconnect-prefetch-prerender
     ● Browsers are really good at rendering HTML soup
       ○ Because this is what most of the internet is
     But that’s pretty much it...
  10. Good news: we have plenty of room for optimizations (bad news: it is up to us to do all of it)
     ● How fast can we send data to the browser?
     ● How much data should be downloaded?
     ● How much data should be processed?
     ● Do we have CPU/thread or network congestion?
  11. Basic things you would expect
     ● HTTP2/SPDY (for HTTP) + QUIC (for video)
     ● gzip
     ● Image crushing
     ● JS compiled by Google Closure and modularized
     ● Icon sprites (WebP where supported)
     ● Minified CSS
     ● CDN
     https://developers.google.com/speed/pagespeed/
  12. Closure compiler
     ● Why should you care?
       ○ Typed JavaScript
         ■ Absolutely critical for large teams
       ○ A set of advanced optimizations
       ○ HUGE size savings and dead code removal
       ○ Kind of hard to set up, and writing annotations is time-consuming
  13. ● Try the compiler online: https://closure-compiler.appspot.com/
     ● Docs and examples: https://developers.google.com/closure/compiler/docs/api-tutorial2
  14. Chunking
  15. Typical request lifetime
     [Timeline diagram; labels: GET /, INIT, RPC, DATA TRANSFER, PAGE RENDERING..., TEMPLATE, STATIC]
     ● Browser performs a request
     ● Backend parses this request and fetches the necessary data / calls RPCs
     ● Page is rendered via some templating language and sent to the browser
     ● Browser starts to render while fetching the page and downloads external resources
  16. ● Browser has nothing to do while we are blocked by RPCs on the backend
     ● But when we start sending the data, it’s forced to render the page while downloading and executing all external resources
     ● Result: CPU/thread and bandwidth congestion
  17. Chunking approach
     ● Render and send to the browser a first chunk containing references to external resources
     ● Browser starts to fetch them while the connection is still open
     ● Send extra chunks of data as RPCs are completed
     ● Serialize data as JSON to be used later if the UI is blocking
     [Timeline diagram; labels: GET /, INIT, T0, DATA TRANSFER, RNDR..., RPC1, STATIC, T1, RPC2, T2, RNDR, RNDR]
  18. Chunking approach
     Works for client-side applications too
     ● Send links to application resources early
     ● Render the application chrome, do not wait for the page onload event
     ● Append data as JSON to the end of the page
     ● Be careful of timing issues
       ○ You can’t predict whether the application is initialized first or the page is completely downloaded first
     [Timeline diagram; labels: GET /, INIT, T0, DATA TRANSFER, RNDR..., RPC, STATIC, JSON, RNDR]
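The chunking approach on slides 17 and 18 can be sketched as follows. This is a minimal illustration, not YouTube's implementation: `createChunkedPage`, the section names, and the `/static/...` paths are hypothetical, and `write`/`end` stand in for whatever sends bytes to the client (for example the `write` and `end` methods of a Node.js `http.ServerResponse`).

```javascript
// Hypothetical helper: streams a page in chunks instead of waiting for all RPCs.
// `write(str)` sends a string to the client; `end()` closes the response.
// `pendingSections` lists the page sections that each depend on one RPC.
function createChunkedPage(write, end, pendingSections) {
  const remaining = new Set(pendingSections);

  // First chunk: sent before any RPC finishes, so the browser can start
  // fetching static resources while the backend is still working.
  // The resource paths are placeholders.
  write(
    '<!doctype html><html><head>' +
    '<link rel="stylesheet" href="/static/app.css">' +
    '<script src="/static/app.js" async></script>' +
    '</head><body>'
  );

  return {
    // Call this when the RPC backing `section` completes.
    sectionReady(section, html) {
      if (!remaining.delete(section)) return; // unknown or already sent
      write(`<div id="${section}">${html}</div>`);
      if (remaining.size === 0) {
        // Last section delivered: close the document and the response.
        write('</body></html>');
        end();
      }
    },
  };
}
```

Sections are written in the order their RPCs complete, so the markup or client code has to tolerate out-of-order arrival; that is one form of the timing issue slide 18 warns about.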
  19. Here comes the player
     350 KB of compiled and compressed JS
  20. Player
     ● Player is large
       ○ It’s not just a video tag
       ○ Ads, format selection logic, UI, annotations, etc.
       ○ Sometimes we have to fall back to Flash
     ● Just executing all the necessary JS is a significant CPU task
     ● We really don’t want to do this on every page load (but there is nothing we can do for cold loads)
     ● Player can be blocked by the OS (video and audio init)
  21. Player
     ● No silver bullet for cold load
     ● Have to carefully profile and optimize the code
     ● Send the player early and init early
     ● But the page may still be downloading/painting
     ● So try not to get in the way, e.g. asking the page for the container size of the player may trigger a relayout, blocking the browser for 10x ms
  22. Player
     ● tl;dr: Efficient video playback is HARD, we have a whole team dedicated to making it fast
     ● Pick your battles and don’t try to support every browser/platform unless you absolutely have to
     ● Focus on HTML5 if possible, Flash is slowly going away
  23. Thumbnails
  24. Thumbnails
     ● 10+ thumbnails above the fold on the Watch page
     ● Some of the pages are mostly thumbnails
     ● Important for users to decide what to watch; we want thumbnails as fast as possible unless they are in the path of the video
  25. Thumbnails
  26. Delay/Lazy loading
     ● Only images above the fold are important on initial page display, everything else can be loaded later
     ● But some extra ones can still be preloaded to prevent thumbnail popping
  27. Delay/Lazy loading
     ● Can’t start loading the images until JS can be executed, and it forces a re-layout
     ● Not the best solution if most of the images are above the fold
     ● We use a hybrid approach: do not use the preloader for thumbnails that are always above the fold and affect user behavior
  28. ● Visible to the user on page load
     ● Invisible, but preloaded
     ● Invisible to the user, not loaded
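The three thumbnail states on slide 28 can be expressed as a small classification step. The sketch below is an assumption about how such a decision could be made, not the deck's actual code: `classifyThumbnails`, the `top` offsets, and the `preloadMargin` parameter are all hypothetical names.

```javascript
// Hypothetical sketch: decide how each thumbnail should be loaded based on
// its vertical offset (`top`, in pixels from the top of the page).
//   'visible'  - inside the initial viewport, load immediately
//   'preload'  - just below the fold, load early to avoid thumbnail popping
//   'deferred' - far below the fold, load only when the user scrolls near it
function classifyThumbnails(thumbs, viewportHeight, preloadMargin) {
  return thumbs.map(({ id, top }) => {
    let state;
    if (top < viewportHeight) {
      state = 'visible';
    } else if (top < viewportHeight + preloadMargin) {
      state = 'preload';
    } else {
      state = 'deferred';
    }
    return { id, state };
  });
}
```

In a browser the offsets would have to be measured or known from the layout in advance; measuring them at load time can itself trigger the relayout cost that slide 21 mentions.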
  29. “WebP is a new image format that provides lossless and lossy compression for images on the web. WebP lossless images are 26% smaller in size compared to PNGs. WebP lossy images are 25-34% smaller in size...”
  30. ● Chrome (+mobile)
     ● Opera (+mobile)
     ● FF and IE through WebPJS
     ● Android 4.0+
     ● iOS (through 3rd party libraries)
     ● WebKit based console applications
     http://caniuse.com/webp
  31. Use WebP for sprites, and for thumbnails if you can afford the extra space and want to serve native clients.
     Mileage may vary, but expect 10% faster page load
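One common way to serve WebP only "where supported" is server-side negotiation on the `Accept` request header, since browsers that decode WebP natively generally advertise `image/webp` there. The deck does not say how YouTube does this; the sketch below is an assumption, and `acceptsWebp`, `thumbnailUrl`, and the file extensions are hypothetical.

```javascript
// Hypothetical sketch: pick a thumbnail format from the Accept request header.
// Returns true when the header lists image/webp with a non-zero quality value.
function acceptsWebp(acceptHeader) {
  return (acceptHeader || '').split(',').some((part) => {
    const [type, ...params] = part.trim().split(';').map((s) => s.trim());
    if (type.toLowerCase() !== 'image/webp') return false;
    const q = params.find((p) => p.startsWith('q='));
    // A missing q parameter means full preference; q=0 means "not acceptable".
    return q === undefined || parseFloat(q.slice(2)) > 0;
  });
}

// Builds the URL of the variant to serve; storing both variants is the
// "extra space" cost mentioned on slide 31.
function thumbnailUrl(basePath, acceptHeader) {
  return basePath + (acceptsWebp(acceptHeader) ? '.webp' : '.jpg');
}
```

Responses negotiated this way normally also need a `Vary: Accept` header so that shared caches do not hand a WebP image to a client that cannot decode it.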
  32. Delay loading
  33. ● Browsers already have priorities
     ● Sometimes we want more control
     ● Fetching of video bytes should be more important than thumbnails
     ● But this requires a large codebase refactoring
       ○ Triggers potential race conditions and issues of various kinds
     ● setInterval and setTimeout hijacking as a simple introduction of scheduling
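The last bullet of slide 33 mentions hijacking `setTimeout` as a simple form of scheduling. One possible reading, sketched below under that assumption, is a wrapper that holds back timers registered during a critical phase (such as video start) and releases them afterwards. `createGatedTimers` and `openGate` are hypothetical names, and this is not the scheduler described in the talk.

```javascript
// Hypothetical sketch: a gated replacement for setTimeout.
// Timers requested while the gate is closed are held; once the gate opens
// they are handed to the real timer function with their original delay.
function createGatedTimers(realSetTimeout) {
  let gateOpen = false;
  const held = [];

  return {
    setTimeout(callback, delay, ...args) {
      if (gateOpen) return realSetTimeout(callback, delay, ...args);
      held.push([callback, delay, args]);
      // Held timers have no real id yet, so they cannot be cleared
      // in this simplified version.
      return undefined;
    },
    // Call once the critical work (e.g. playback start) is done.
    openGate() {
      gateOpen = true;
      for (const [callback, delay, args] of held.splice(0)) {
        realSetTimeout(callback, delay, ...args);
      }
    },
  };
}
```

In a page this could be installed by keeping a reference to the original `window.setTimeout`, passing it in as `realSetTimeout`, and assigning the wrapper to `window.setTimeout`. Code that relies on precise delays or on `clearTimeout` would behave differently, which is the kind of race condition the slide warns about.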
  34. Cold load / Warm load / Hot load
  35. ● A navigation from one YouTube page to another in the same tab/window
     ● We have full control, can use rel=”pre-whatever”
     ● Can we do better?
       ○ Only transfer JSON data?
         ■ Requires rewriting all backend templates
         ■ Browsers are actually really good at rendering HTML soup
  36. http://youtube.github.io/spfjs/
     https://www.youtube.com/watch?v=yNe_HdayTtY
     aka “that red bar on top of the page”
     ● State and history management
     ● In-memory caching
     ● Chunked JSON responses
     ● Partial DOM updates
  37. ● Lightweight alternative to rewriting all of the backend templates
     ● Chunks of the page are sent from the backend as chunked JSON
       ○ Some overhead on escaping (don’t mess up your JSON)
       ○ Overall JSON responses are smaller
     ● Player preinit on non-Watch pages and a persistent player across page boundaries
     ● Fewer DOM changes on navigation
     ● Custom caching
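The idea of chunked JSON responses driving partial DOM updates can be illustrated with a small incremental parser. This sketch does not use SPF's API or wire format: it assumes a hypothetical newline-delimited stream in which each line is a JSON object of the form `{"id": ..., "html": ...}`, and `createFragmentStream` is an invented name.

```javascript
// Hypothetical sketch: consume a chunked response where each complete line
// is one JSON fragment, and apply fragments as soon as they are complete.
// Network chunks may split a fragment anywhere, so input is buffered until
// a newline is seen.
function createFragmentStream(applyFragment) {
  let buffer = '';
  return {
    push(text) {
      buffer += text;
      let newline;
      while ((newline = buffer.indexOf('\n')) !== -1) {
        const line = buffer.slice(0, newline).trim();
        buffer = buffer.slice(newline + 1);
        if (line) applyFragment(JSON.parse(line));
      }
    },
  };
}
```

In a browser, `applyFragment` could set `document.getElementById(fragment.id).innerHTML = fragment.html`, which updates only the affected part of the page. The HTML has to be correctly escaped when it is serialized into JSON on the backend (for example with a standard JSON encoder), which is the escaping overhead slide 37 refers to.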
  38. Critical Stages of Playback
  39. With player preinit
  40. 13% QPS decrease
     40% faster playback
  41. Cold load / Warm load / Hot load
  42. ● Can we do better? Cache?
       ○ Visited pages/history: necessary to get back/forward right
       ○ Tricky, first page load problem
         ■ If the first page is rendered on the backend, how do we go back?
       ○ How does this affect the metrics?
       ○ Can we do even better? Cache pages that would be visited with high probability (the next video in autoplay, for example)?
       ○ But what if the user does not go to the next page?
         ■ Even more metric craziness
         ■ More QPS
  43. Monitoring
     ● Every latency-impacting change goes through A/B testing
     ● Monitor both latency impact and user behaviors
     ● Making things fast is good, but sometimes we have to revisit experiments due to behavior changes (especially for delay loading)
  44. ● Regressions happen all the time
       ○ Some are expected, like YouTube logo doodles
       ○ Some are real issues
     ● A lot of things can change
       ○ Sizes of common static resources
       ○ Number of images on the page
       ○ Latency of server responses
  45. ● We log timestamps of important events, server time, aft, qoe, etc.
     ● Browsers react differently, so we collect data per browser
       ○ Version, background state, etc.
     ● http://www.webpagetest.org
     ● In addition, we use many in-house tools for monitoring and notification
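Logging timestamps of important events together with browser information, as slide 45 describes, might look like the sketch below. The event names, the `context` fields, and `createTimeline` are hypothetical; the deck does not show YouTube's logging code.

```javascript
// Hypothetical sketch: record when important events happen relative to the
// start of a page load, tagged with context such as browser and version.
// `now` is a clock function returning milliseconds (e.g. Date.now).
function createTimeline(now, context) {
  const start = now();
  const marks = new Map();

  return {
    // Records the first occurrence of an event; repeats are ignored so a
    // re-fired event cannot overwrite the original timing.
    mark(name) {
      if (!marks.has(name)) marks.set(name, now() - start);
    },
    // Produces a plain object suitable for sending to a logging backend.
    report() {
      return { ...context, marks: Object.fromEntries(marks) };
    },
  };
}
```

Keeping the browser context on every report is what allows the data to be split per browser and version, which matters when comparing the two arms of an A/B experiment.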
  46. Summary
  47. ● Aim for an average page load latency of 1 second or better
     ● Different types of page loads may require a different approach
     ● Use WebP for sprites (usually) and thumbnails (if possible)
     ● Google Closure Compiler is awesome, but hard to set up; use it if you can
     ● Minimize the amount of work done on every load (persistent player)
  48. ● Understand how the browser works
     ● SPF is a reasonable alternative to replacing everything with client-side templating; it saves QPS and makes everything faster
     ● Chunking unblocks the browser, but requires the backend to support it
     ● Monitor everything: A/B testing is important, profiling is critical
  49. Questions?
