Web Performance Lessons at DDD Sydney May 2016

For two years Jason has been improving the performance and scalability of websites he didn't build and often can't change. This session will cover some of the challenges encountered, how they can be addressed, and approaches to avoid them in the first place.

Published in: Technology
  • Hi. I am Jason Stangroome. I work for section.io.

    Several years ago, section.io started by providing a fully-managed web performance service for established websites that were losing business due to slow pages. We would ask the site owners to change their website DNS records to resolve to our systems instead and we would do the rest.

    Our toolset could be distilled to just the web browser development tools, and a series of content-rewriting HTTP proxies. We don’t get to make changes to the origin web servers, we don’t get to see the origin source code, at best we might get access to the New Relic portal if the agent software was already installed on the web servers.

    We’ve learnt a great deal providing this service, and now, building on this experience, section.io also offers a new class of Content Delivery Network (CDN) with a self-service model and a developer-focused workflow.

    Today I’d like to share with you some of the website performance issues we see, why they exist and how to address them.

  • Before I get too detailed, I should clarify the kinds of websites I’ve worked with.

    At the moment, section.io serves over 130 fully-managed websites.

    The majority are e-commerce sites built on a variety of platforms and languages. Some are off-the-shelf, like Magento (PHP), with some plugins and some site-specific customisations, others are completely bespoke ASP.NET solutions, and there is everything else in between.

    There are some characteristics common to these fully-managed sites:
    Mostly backend-rendered content, ie minimal use of Angular, or other client-side view engines
    Most content changes infrequently (ie minutes or hours or more, but not seconds)
    User authentication is only applicable to particular areas of the site
    Traffic has big spikes (eg Click Frenzy, general EDMs) and has a large impact on revenue and brand reputation
    The site owner has limited skills and/or resources to improve the backend code (or at least not as cost effectively as we can do it)
  • From our perspective, ie providing a service that lives between the user’s web browser and the site owner’s web servers, there are several aspects to making a website “fast”.

    But there is one aspect that is crucial to us: availability. Or put differently, as a visitor trying to use your website, the page that doesn’t load is the ultimate slow page. You might be able to serve that “503 Service Unavailable” response in milliseconds but that doesn’t help.

    I mentioned spikey traffic on the last slide. Three weeks ago the premiere of Shark Tank Australia Season 2 aired. One of the entrepreneurs came to us to improve his site. Before his business’ segment broadcast, the site received less than 5,000 requests per hour. In the hour that Shark Tank was viewed along the east coast, the site received 750,000 requests and the median page-load time remained stable.

    It is one thing to provide a fast experience to a handful of users but it is much more important to stay fast, and online, for thousands of concurrent users.
  • Those of you who share my passion for security may have heard of WebGoat. WebGoat is an intentionally-insecure website provided by the Open Web Application Security Project (OWASP) for demonstrating web security issues.

    I do not have a WebGoat-equivalent for demonstrating poor web performance. The issues I am discussing today have been addressed for our customers’ websites. Also, most people don’t like their work used as an example of what not to do.

    Instead I’ll do my best to describe the scenarios, and I’ll be happy to go into more detail if you want to chat after the presentation.

    Waterfall diagram from: http://www.webpagetest.org/result/160526_C7_9f22a556d269a7ae325e128c2c5cb3f9/
  • I want to start with caching in the browser.

    Every request that can be avoided is going to save at least one network round-trip and maybe five round-trips if a new HTTPS connection is involved and that’s before considering the time to generate and download the response. And that’s also a request your web servers don’t need to handle.

    This includes requests with `If-None-Match` and `If-Modified-Since` headers that result in a `304 Not Modified` response.

    Unfortunately, there are two hard problems in computer science: naming, cache invalidation, and off-by-one errors. Often when people have trouble with cache invalidation, they disable caching. That’s when you see a `Cache-Control` response header specifying `max-age=0` or `no-cache` or some other nonsensical combination on resources which clearly should be cacheable. The same goes for `Expires` response headers with timestamps in the past, or the ridiculously near future. This is the first thing we encounter and have to fix for almost every customer.
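
    A minimal sketch of a more deliberate caching policy, written in Python purely for illustration. The extension list, the lifetimes, and the function name `cache_control_for` are all assumptions, and the one-year lifetime only makes sense if static asset URLs change whenever their content changes.

```python
from pathlib import PurePosixPath

# Hypothetical values; tune them to how often each kind of content really changes.
STATIC_EXTENSIONS = {".css", ".js", ".png", ".jpg", ".gif", ".svg", ".woff"}
ONE_YEAR = 365 * 24 * 60 * 60
FIVE_MINUTES = 5 * 60


def cache_control_for(path: str, is_user_specific: bool = False) -> str:
    """Pick a Cache-Control value for a response (illustrative policy only)."""
    if is_user_specific:
        # Per-user responses must never be stored by shared caches.
        return "private, no-store"
    if PurePosixPath(path).suffix.lower() in STATIC_EXTENSIONS:
        # Assumes static asset URLs are versioned, so a long lifetime is safe.
        return f"public, max-age={ONE_YEAR}"
    # Pages that change every few minutes or hours can still be cached briefly.
    return f"public, max-age={FIVE_MINUTES}"
```

    The point of the sketch is that every response gets an explicit, considered answer instead of a blanket `max-age=0`.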

    --

    Query strings on the end of the URL are another great way to miss the cache. For most web servers, the order of the key-value pairs in the query string is irrelevant and often lost when presented to the application, but for an HTTP cache, the order of the keys is the difference between a cache hit and a network request – be consistent when composing URLs with query strings.
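
    One way to stay consistent is to run every generated URL through a single normalising helper. The sketch below uses only the Python standard library; the function name `normalise_url` and the example URLs are assumptions.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def normalise_url(url: str) -> str:
    """Return the URL with its query-string pairs sorted by key, then value."""
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    return urlunsplit(parts._replace(query=urlencode(sorted(pairs))))


# Two orderings of the same criteria collapse to one cacheable URL.
a = normalise_url("https://example.com/search?size=9&colour=red")
b = normalise_url("https://example.com/search?colour=red&size=9")
```

    Both `a` and `b` come out as the same string, so a cache keyed on the URL sees a single entry.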

    You’ve probably seen, or even implemented, a site where a timestamp is added to the query string as an intentional cache-buster to force clients to fetch the latest resources when you deploy a new version. One good way is when the timestamp is the time the deployment happened. A terrible way we discovered on one website was to use JavaScript to append the current system time (in seconds) to the query string of all the image `src` URLs on the page.
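
    A sketch of the deployment-scoped approach, assuming a hypothetical `DEPLOY_VERSION` value that the deployment pipeline sets once. The parameter name `v` and the helper name `versioned` are also assumptions.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical value, fixed at deployment time and never read from the clock per request.
DEPLOY_VERSION = "20160526T0900"


def versioned(url: str, version: str = DEPLOY_VERSION) -> str:
    """Append a deployment-scoped `v` parameter, replacing any existing one."""
    parts = urlsplit(url)
    pairs = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True) if k != "v"]
    pairs.append(("v", version))
    return urlunsplit(parts._replace(query=urlencode(pairs)))
```

    Because the value only changes when a deployment happens, every page view between deployments produces the same URL and the browser cache keeps working.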

    --

    So, who thinks protocol-relative URLs are cool? … I used to think so, especially compared to the other technique of detecting whether the page was loaded over HTTP or HTTPS and then dynamically updating all the inline URLs to avoid the dreaded mixed-mode security warnings.

    But here’s the problem – the user navigates around your site caching all the common stylesheets, scripts, sprites, the logo, then they visit the first HTTPS page on your site (eg to register). Suddenly the user gets a page full of new HTTPS-prefixed URLs. None of these resources are cached and the browser has to fetch all of them again. With very rare exceptions, the responses the browser gets back are identical to the cached HTTP versions.

    The simple answer is this: if a resource is available via a secure HTTPS URL, always use that URL even from a non-secure page. You avoid the HTTPS-detection logic, you avoid mixed-mode warnings, you reduce requests to your webserver, and the secure pages load faster.

    --

    There are various semantic differences between the GET and POST methods but I only want to focus on two right now:

    1. Often POST is chosen so that more data can be sent in the request body than can be sent in the URL… and to be fair, older browsers had some fairly tight restrictions on URL length. However, don’t just assume your data size demands using POST; test it, because of number 2.

    2. A cached response will not be used to satisfy a POST request, i.e. POST equals no caching.

    One of our customers had a product search page on their site. All the content on the page was the result of an AJAX POST. Also, all of the site’s category pages were just a friendly URL to the search page preloaded with specific criteria. The search page allowed sorting and filtering by lots of facets and so the POST was used to jam all the criteria in the request body.

    We investigated and found that the most complex filtering still fit in the URL but the origin web server only accepted POST requests. Our solution was to rewrite the JavaScript on the wire en route to the browser so it would use a GET request and the query string instead, then when the GET request came back through our platform, we’d serve a cached response if we had one, otherwise we’d rewrite the request to a POST and forward it to the origin webserver. This was an enormous win – especially given how expensive search functionality can be to process on the origin servers.
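
    The proxy-side half of that flow can be sketched roughly as follows. This is not our production code: the names `OriginRequest`, `to_origin_post` and `fetch_search` are hypothetical, the cache is a plain dictionary, and the sketch assumes the origin reads the criteria from a form-encoded body.

```python
from dataclasses import dataclass
from typing import Callable, Dict
from urllib.parse import urlsplit


@dataclass(frozen=True)
class OriginRequest:
    method: str
    path: str
    body: str
    content_type: str


def to_origin_post(url: str) -> OriginRequest:
    """Turn the browser's GET URL back into the POST the origin expects."""
    parts = urlsplit(url)
    # Assumption: the origin accepts the criteria as a form-encoded body.
    return OriginRequest("POST", parts.path, parts.query, "application/x-www-form-urlencoded")


def fetch_search(url: str, cache: Dict[str, str], send: Callable[[OriginRequest], str]) -> str:
    """Serve from cache when possible; otherwise forward as a POST and store the result."""
    if url in cache:
        return cache[url]
    response = send(to_origin_post(url))
    cache[url] = response
    return response
```

    Only the first request for a given set of criteria reaches the origin; repeats are answered from the cache.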
  • I’ve included this slide for completeness but I really don’t want to spend much time on it. This is the performance stuff you see everywhere. Yes, it’s important, but it’s boring.

    Sadly, we also find so much of it has not been applied to many sites. Add this to your build pipeline and move on. For our platform we do this on the wire and cache the results.

    For images, often the problem stems from a content management system. When back-office users upload images for the site, they’re accepted as-is. The user is expected to know which file format to use, which resolution is appropriate, and how to strip the Adobe metadata out.

    The only point I’ll call out here is to make sure image references in the markup include the width and height attributes as it can significantly improve page render times.
  • If the marketing team has input on how a website operates you can bet it will quickly end up with many third-party scripts on all your pages. The official documentation for many of these provides an HTML <script>-element snippet and instructions to paste it in the document head. And this snippet will then load synchronously, blocking HTML parsing while the script is downloaded and executed. This is *the* biggest cause of slow pages across all sites in our experience.

    Wherever we can we ask the third-party for the deferred version of the script and we rewrite the HTML on the wire to replace the original <script>-block and move it out of the <head> and to the end of the <body>. Sometimes we write the deferred version ourselves because the third-party doesn’t provide one. The reason we can do this, is that many of these scripts have no visual component, or at least nothing critical to page-layout or usability.
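
    A deliberately simplified sketch of on-the-wire rewriting: it only adds the `defer` attribute to external scripts from an allowlisted host and does not move the tag or swap in a vendor-supplied deferred snippet. The host name, the `add_defer` function and the regular expression are assumptions, and a regular expression is not an HTML parser, so treat this as an illustration of the idea only.

```python
import re

# Hypothetical allowlist of third-party hosts whose scripts are safe to defer.
DEFERRABLE_HOSTS = ("tracker.example.com",)

# Matches an opening <script> tag with an absolute or protocol-relative src.
SCRIPT_TAG = re.compile(
    r'<script\b([^>]*\bsrc="(?:https?:)?//([^/"]+)[^"]*"[^>]*)>',
    re.IGNORECASE,
)


def add_defer(html: str) -> str:
    """Add `defer` to allowlisted third-party script tags that lack defer/async."""

    def rewrite(match: "re.Match[str]") -> str:
        attrs, host = match.group(1), match.group(2)
        already = re.search(r"\b(defer|async)\b", attrs, re.IGNORECASE)
        if host.lower() in DEFERRABLE_HOSTS and not already:
            return f"<script defer{attrs}>"
        return match.group(0)  # Leave every other script untouched.

    return SCRIPT_TAG.sub(rewrite, html)
```

    The check for an existing `defer` or `async` is crude (it also trips on a URL containing either word), but it fails safe by leaving the tag unchanged.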

    --

    However, there is a trade-off – typically what these scripts do is gather data then ship it back to some tracking endpoint. By delaying the loading of these scripts, some users will see what they want on the page and click to navigate before the script has fully loaded or sent its data. This will impact the quality of the data you’re capturing. When our customers come to us, it is speed they want (or at least the benefits that speed brings) so this trade-off is acceptable but it may not be for everyone and it’s also why the official docs encourage the scripts to be loaded upfront.

    There is a new Web API slowly coming to browsers that will help here. It’s called “sendBeacon” and basically allows the browser to send data to an endpoint “whenever it has an opportunity”, even after the user has navigated away from the page.

    --

    One of the reasons why these scripts have such an impact on page-load time is due to how they’re hosted. Typically they’re on a third-party domain so that can mean another DNS-resolve, another connection handshake, and – worst case – an under-resourced web server handling the request.

    Ideally though, these scripts are on a shared CDN, which is fast, highly-available, versioned, and using the same URL in your HTML as is being used by other websites that your visitors also frequent so it’s already in their browser cache.

    --

    When these ideals are not met you may want to consider hosting the script on your own domain, if supported. And this isn’t limited to just tracking scripts, even the location of framework libraries like jQuery should be considered. Sadly the performance effect of this decision isn’t as clear-cut as others and is best verified against your site’s user-base.

    Performance aside, choosing to bring a script on-domain can be driven by how dependent your first-party script is on these other scripts. We’ve seen new, buggy, versions of scripts pushed to non-versioned URLs and break sites that depend on them. We’ve seen ISP-specific network issues leave visitors able to access a website but not able to reach some of the third-party resources.

    If your site can continue to function, perhaps with a few error messages in the console, then this isn’t a big deal. If your code’s critical path depends on these libraries, you may want to host them on-domain.

    --

    Cite: http://www.growingwiththeweb.com/2014/02/async-vs-defer-attributes.html
    Cite: https://developer.mozilla.org/en-US/docs/Web/API/Navigator/sendBeacon
  • So far, I’ve been focused on predominantly client-side concerns but the server-side certainly has its fair share too. In fact, we could spend days looking at the ways to optimise server-side code but instead I’d like to just highlight a few scenarios I’ve personally found interesting.

    --

    Websites get crawled. Often by popular search engines which is desirable but also by many other bots – specialist, niche search engines, screen scrapers, suspicious vulnerability scanners, everything. Often these bots request URLs that don’t exist on your site.

    How much does it cost your server to serve a 404 Not Found page? How much does it cost to serve thousands of 404 Not Found pages for different URLs?

    We have worked with sites where serving a 404 required hitting their application code, querying the database to see if the URL corresponded to a product, or a category, or some CMS article page, getting zero results, then rendering their custom 404 page by querying the database again to find popular products that the visitor might have been interested in. That gets resource intensive quickly when hit repeatedly.

    When these requests come in once-each for a different URL, having our CDN in front caching them doesn’t help. However, when the origin web server is doing some page caching of its own, it gets worse. In one instance, the sheer volume of recently accessed 404 pages being cached by the origin web server’s own page cache quickly reached the cache’s capacity and started evicting all the other objects from the cache, forcing the server to recompute results for legitimate requests from real users.
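
    One possible mitigation, shown here as an assumption and not as what that site actually did, is for the origin’s page cache to refuse 404 responses so that one-off bot URLs cannot push real pages out. The `PageCache` class below is a hypothetical, minimal least-recently-used cache.

```python
from collections import OrderedDict
from typing import Optional, Tuple


class PageCache:
    """Tiny LRU page cache that refuses to store 404 responses."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self._entries: "OrderedDict[str, Tuple[int, str]]" = OrderedDict()

    def get(self, url: str) -> Optional[Tuple[int, str]]:
        if url not in self._entries:
            return None
        self._entries.move_to_end(url)  # Mark as most recently used.
        return self._entries[url]

    def put(self, url: str, status: int, body: str) -> None:
        if status == 404:
            return  # Never let not-found pages compete with real content.
        self._entries[url] = (status, body)
        self._entries.move_to_end(url)
        if len(self._entries) > self.capacity:
            self._entries.popitem(last=False)  # Evict the least recently used.
```

    This does nothing about the cost of generating each 404, which still needs a cheap code path of its own.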

    --

    And just when you think it won’t get any worse, remember that many bots are dumb. They’re not running a full web browser, or even PhantomJS – they just use cURL or their favourite programming language’s HTTP request API and they’re not honouring cookies.

    Meanwhile, you have a web server that issues a session cookie for every request that comes in without one and stores some initial session data in a database for that new session. In one case, we worked with a site where the server stored the timestamp and the URL of the last request for every session in a database table using an UPDATE-or-INSERT approach. With a cookie-ignorant bot crawling the site, this quickly filled the session table to the point that the site was blocking all requests while waiting for database locks.

    --

    But it isn’t just bots. Hordes of excited customers racing to give you money can be equally devastating – although arguably a good problem to have.

    One customer had established a pattern of releasing several new products, every week or two. Always on the same day of the week, at the same time. There was only limited stock and their customers knew they had to get in quick. It was normal to see their site sell all their new stock in the half-hour of that new product launch. But they had a hard time getting the site to stay online under the load. As their CDN provider we could offload all the traffic up until the users began the checkout process – it’s user specific, it has to go to the origin.

    One of their biggest issues was database contention trying to copy the records from the shopping cart tables to the completed order tables and update the remaining inventory rows. Every time the database server would grind to a halt.

    Eventually they switched to a data store with a more suitable concurrency model for each concern – inventory levels, and complete orders.

    --

    Think about the revenue path through your site, and how much of what is happening is critical to complete the transaction and what could be postponed until later. If your payment provider is a bottleneck, can you complete the web transaction early, and later contact the handful of users whose payment bounced?

    Consider what controls you can give to operations staff during high-traffic events to disable optional features while still enabling your site to deliver its key capability. Hypothetically, YouTube might decide to disable comments or related-videos to keep streaming online.
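
    Such controls can be as simple as a handful of switches consulted at render time. The sketch below is illustrative only; the flag names, section names and the `page_sections` function are assumptions.

```python
from typing import Dict, List

# Hypothetical switches that operations staff could flip at runtime.
DEFAULT_FLAGS: Dict[str, bool] = {"related_products": True, "reviews": True}


def page_sections(flags: Dict[str, bool]) -> List[str]:
    """List the sections to render; the core sections are never optional."""
    sections = ["product_details", "add_to_cart"]
    for feature in ("related_products", "reviews"):
        # A missing flag is treated as "off" so a bad config degrades safely.
        if flags.get(feature, False):
            sections.append(feature)
    return sections
```

    With every optional flag switched off the page still carries the sections on the revenue path.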
  • I mentioned earlier that you should forget about protocol-relative URLs and just use HTTPS all the time. Historically the argument against this is that HTTPS is too slow or that it requires too much computation. And that’s the best place for that argument – in history.

    Today, a well-configured HTTPS deployment is both fast and efficient. Improved processor power has a lot to do with it but there have also been improvements in the TLS protocol itself. TLS session resumption (through session caching or tickets) eliminates an extra round-trip for return visitors. TLS False Start can eliminate the extra round-trip for new visitors.

    Dynamic TLS record sizing reduces the latency of the first bytes sent on a new connection, OCSP Stapling allows the browser to check certificate revocation without another network call, and HSTS can pre-empt the typical 301 Redirect from the HTTP site to HTTPS.

    And for me, the greatest evidence that HTTPS is fast is that Netflix is now streaming all their full HD video over HTTPS.

    --

    While HTTPS is fast, HTTP/2 is faster. The HTTP/2 specifications allow HTTP/2 over insecure plain-text connections just like HTTP/1.1 but all the browser implementations only allow HTTP/2 over HTTPS by default so going full-HTTPS is your first step regardless.

    The first benefit you get is multiplexing. I’m sure you’ve noticed HTTP/1.1 only supports a single request and response at a time on a connection so your browser opens multiple connections so it can request multiple resources in parallel, paying the connection handshake round-trip cost on each of them. HTTP/2 allows concurrent requests and responses to be interleaved on a single connection, avoiding those extra handshakes, and the multiplexer also has a priority model to request that more important resources (eg those required for page rendering) are sent sooner.

    HTTP/2 is already supported today in Chrome, Safari, Edge, and Firefox.

    --

    But wait, that’s not all. HTTP/2 also introduces Server Push. Let’s imagine a really simple site – a page consists of the HTML document, a reference to one all-encompassing stylesheet, and a reference to a single JavaScript file. Normally the browser requests the document, waits for the HTML to start arriving, then it begins parsing the markup and discovers these stylesheet and script references. The browser realises it doesn’t have them cached locally, so it sends the requests for each of them to the server, and knowing that it can’t render the page without at least receiving the stylesheet, it does whatever else it can while it waits for those responses.

    However, the server knew all along that it was sending HTML with references to these other resources that would inevitably be requested. With HTTP/2 the server can take this opportunity to begin sending these resources to the browser, multiplexed with the HTML document response, before even seeing a request for them. With Server Push, when the browser reaches the point that it discovers these references in the markup, it will also immediately discover that it already has these resources too and can begin rendering much sooner.

    Server Push is supported by the same browsers that support HTTP/2 with the exception of Safari.

    --

    There are several other advantages to HTTP/2 but I want to finish on just one note which is relevant while your site’s visitors use a mix of HTTP/2-capable browsers and incapable browsers. Before HTTP/2, sites with many resources often employed domain sharding – this is where you create additional subdomains so that browsers will make many more connections to download more resources in parallel (eg having www.site.com and images.site.com or assets1. and assets2. and so on). This approach is suboptimal for HTTP/2 – it works better with a single multiplexed connection. Domain sharding with HTTP/2 just costs you more unnecessary connection handshakes.

    Thankfully, there is special configuration that HTTP/2 browsers will recognise and avoid creating additional connections even though you’re specifying different domain names, and that configuration requires these two criteria are met:
    1. All the different domain names need to DNS resolve to the same IP addresses.
    2. The HTTPS certificate provided by the server on the first connection (eg www) needs to also include the names of the sharded domains – this can be via a wildcard or by a discrete list of Subject Alternative Names.
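
    The two criteria above can be checked offline with a small helper. This is a simplified sketch: the function names are hypothetical, the DNS answers and certificate names are passed in as plain values, and "same IP addresses" is approximated as the two hosts sharing at least one address.

```python
from typing import Iterable, Set


def cert_covers(hostname: str, cert_names: Iterable[str]) -> bool:
    """True if any certificate name matches; a wildcard covers exactly one label."""
    host = hostname.lower()
    for name in cert_names:
        name = name.lower()
        if name == host:
            return True
        if name.startswith("*.") and "." in host:
            # "*.site.com" covers "images.site.com" but not "a.b.site.com".
            if host.split(".", 1)[1] == name[2:]:
                return True
    return False


def can_share_connection(
    shard: str, cert_names: Iterable[str], first_ips: Set[str], shard_ips: Set[str]
) -> bool:
    """Both criteria: overlapping DNS answers and a certificate covering the shard."""
    return bool(first_ips & shard_ips) and cert_covers(shard, cert_names)
```

    For example, `can_share_connection("images.site.com", ["*.site.com"], {"203.0.113.10"}, {"203.0.113.10"})` is true, while the same call with a different shard address is false.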

    --

    Cite: https://istlsfastyet.com
  • I work for a CDN company, so inevitably this slide was coming.

    Over half of the Alexa top thousand websites use a CDN and so do over a third of the top 10,000 websites. There are some problems better addressed by a CDN than by throwing more hardware at your web servers, and some problems you can only solve with a CDN.

    If you find yourself with a site with a lot of traffic, or even just a spikey traffic profile, you should consider using a CDN. It doesn’t need to be our CDN, do your research, pick the right one for you.

    --

    Everything I said up front about caching in the browser applies much the same, if not more, when using a CDN. On top of that, there are some additional points when the cached responses are not used by just a single user’s browser, but by many users.

    Cookies are death. Essentially, the presence of a cookie in a request or a response indicates that the response should not be cached for sharing with other users. Unless of course you’d like to leak user data like Steam did in December or Menulog did last week.

    The problem is that there’s usually only one important cookie identifying a user – the session cookie – but this cookie has a different name for basically every server-side web framework, and is customisable. Additionally, many third-party client-side scripts also set cookies – cookies that are only intended to be consumed client-side and are totally ignored by the server. And while cookies can be flagged as `httponly` so that JavaScript cannot access them, there is no inverse to stop a cookie from being sent to the server. The Web Storage API is probably the closest to such an idea but isn’t broadly used in this way for various reasons.

    This is why many CDNs are only used for static resources guaranteed to be safe to share among users, and even put on subdomains instead of the site’s primary domain.

    A good CDN will let you tailor which cookies can be ignored and which signal potentially user-specific content. And if you have a web server that insists on issuing session cookies to every anonymous user it sees, a great CDN will let you specify which requests that session cookie is relevant to.
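
    The first of those controls can be sketched as a check on the request’s `Cookie` header. The cookie names in `USER_SPECIFIC_COOKIES` are examples only (the right list depends on the site’s framework and configuration), and the function names are hypothetical.

```python
from typing import Set

# Example names only; the session cookie differs per framework and is configurable.
USER_SPECIFIC_COOKIES: Set[str] = {"PHPSESSID", "ASP.NET_SessionId"}


def cookie_names(cookie_header: str) -> Set[str]:
    """Extract the cookie names from a request's Cookie header value."""
    names = set()
    for part in cookie_header.split(";"):
        name = part.split("=", 1)[0].strip()
        if name:
            names.add(name)
    return names


def is_shareable(cookie_header: str, user_cookies: Set[str] = USER_SPECIFIC_COOKIES) -> bool:
    """True when no cookie on the request identifies a particular user."""
    return cookie_names(cookie_header).isdisjoint(user_cookies)
```

    Requests carrying only client-side analytics cookies stay eligible for the shared cache, while anything carrying a listed session cookie goes to the origin.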

    --

    There are many approaches to building a site for both desktop browsers and mobile browsers. One is to put the mobile site on a separate `m.` subdomain, another is to go completely responsive and let the browser decide how to render the content. We often see the situation where the web server does User-Agent sniffing to send different responses to desktop versus mobile for what is essentially the same request from both. I won’t debate the pros and cons of each style today but I will say this User-Agent sniffing approach is bad for CDN caching.

    When matching a request to a potentially cached response, a CDN can only use what has been provided by the browser in the request. By default, the cache key is the absolute URL. This default behaviour would lead to a mix of desktop and mobile content being cached for various URLs which are then often served to the wrong device type.

    The HTTP specification has a mechanism for extending the cache key – the `Vary` response header. Sites serving mixed form-factor content leverage it by sending `Vary: User-Agent`, which a well-behaved CDN will then honour, and the correct content now goes to the correct devices. The problem is this obliterates your cache effectiveness. There is so much variety in User-Agent strings, care of OS versions, browser versions, plugins, processor architecture, and so on, that instead of having two buckets, desktop and mobile (or maybe three, with tablet), you get hundreds or more, massively reducing the likelihood that a browser is going to request a resource which is keyed by the same User-Agent and has not yet expired. Not to mention that this cache fragmentation means you have much more data in your cache and you can begin evicting other items due to cache pressure.

    For our customers, we solve this by bringing the User-Agent sniffing into the CDN. We can then map a User-Agent string to one of two cache keys, and if there’s a cache miss, we forward the determination of mobile or desktop to their web server so they can provide the corresponding response.
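
    The mapping step might look roughly like this. The pattern is deliberately crude and is an assumption, as are the function names; real device detection needs far more care, and this is not the detection logic we actually run.

```python
import re

# Deliberately crude pattern for illustration only.
MOBILE_PATTERN = re.compile(r"Mobi|Android|iPhone|iPod", re.IGNORECASE)


def device_bucket(user_agent: str) -> str:
    """Collapse any User-Agent string into one of two buckets."""
    return "mobile" if MOBILE_PATTERN.search(user_agent) else "desktop"


def cache_key(url: str, user_agent: str) -> str:
    """Key the cache on the bucket plus the URL, not on the raw User-Agent."""
    return f"{device_bucket(user_agent)}:{url}"
```

    However many distinct User-Agent strings arrive, each URL now has at most two cache entries.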

    --

    Who knows CSRF (cross-site request forgery)? For those that aren’t sure, CSRF is the idea that a third-party can present to a user a page with a form that will covertly submit to another site, and if the browser has session cookies for that other site, they’ll be sent with the request, so it becomes indistinguishable from an intentional action of the user. This could be tricking a user into liking a post on Facebook, or completing a financial transaction, and is naturally something we need to protect against.

    A very effective approach is to use Anti-CSRF Tokens. These typically appear as a hidden field in the form on the original site containing a user-session-specific secret key. When the form is submitted, the web server looks for this key and compares it against the same key stored in the server-side session data corresponding to the user’s session cookie. If the key is absent or doesn’t match, the request is rejected. This works because browsers make it very difficult to obtain this token across origin boundaries.

    The problem is that various web frameworks strongly encourage Anti-CSRF tokens to be used on every POST request. Imagine an e-commerce site where a page of the most popular products has a convenient “Add to cart” button beside every product. If the web server expects a session cookie and a corresponding token for an add-to-cart POST request for an unauthenticated visitor then that page cannot be cached and shared with other users.

    For a CDN to be as effective as possible, CSRF needs to be configured only on privileged actions. Beginning the checkout process with items in your cart – check for the token. A logged in user adding an item to their wish list – probably check for token. A new visitor adding an item to their cart – probably better mitigated with an Origin or Referer request header check instead of the token.
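
    A minimal sketch of the header check for low-privilege actions. The function names and the allowed origin are assumptions, the headers are taken as a plain mapping with canonical names, and rejecting requests that carry neither header is a design choice of this sketch.

```python
from typing import Mapping, Optional, Set
from urllib.parse import urlsplit


def request_source(headers: Mapping[str, str]) -> Optional[str]:
    """Return scheme://host[:port] from Origin, falling back to Referer."""
    origin = headers.get("Origin")
    if origin and origin != "null":
        return origin.lower()
    referer = headers.get("Referer")
    if referer:
        parts = urlsplit(referer)
        if parts.scheme and parts.netloc:
            return f"{parts.scheme}://{parts.netloc}".lower()
    return None


def is_same_site_request(headers: Mapping[str, str], allowed_origins: Set[str]) -> bool:
    """Accept only requests whose Origin or Referer names one of our own origins."""
    source = request_source(headers)
    return source is not None and source in allowed_origins
```

    Because the check needs no per-user token in the page, the product listing with its add-to-cart buttons can be cached and shared.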

    --

    Hopefully a pattern has started to emerge in my presentation today. Wherever possible, serve HTTP responses that are either entirely user-abstracted content that can be shared with anyone, or only contain the data that is specific to a particular user. Then cache that shared content as much as possible.

    For example, picture the “Hi Bob” that appears on the header of every page of a site when you’re logged in. Many sites we work with still render that user’s name into the HTML page along with everything else that is otherwise not specific to Bob. Instead, the site could omit that text from the HTML page and use an AJAX call to discover the user’s name from the session data and render that in the browser separately. The delay on the extra round-trip for that piece of data won’t affect the user experience but it will mean the server is spending much less time generating whole pages for everybody individually.

    Even better would be to just cache the user’s name in browser local storage and forget the AJAX call. Greeting the user by their first name is rarely a privileged operation.

    At every opportunity, ask yourself, do we really need to ask the server for this information right now?

    Or even tell the server right now? Staying with the shopping cart example – many sites would behave the same if adding items to the cart was purely a client-side operation, only informing the server when the user was ready to begin checkout, or perhaps when the user leaves the site, and you want to capture abandoned-cart data.

    --

    Whichever CDN you choose, ask them how you can test the CDN caching and rewriting in your pre-production environments.


    Cite: http://dyn.com/blog/dyn-research-cdn-adoption-by-the-numbers/
  • Thank you all for listening.

    I hope you found at least some of what I’ve shared today to be useful.

    If you have any questions about web performance, HTTPS, Content Delivery Networks, or anything else I’ve touched on, I’ll be around all afternoon.

    Or you can look me up on Twitter, or my blog.
  • Web Performance Lessons at DDD Sydney May 2016

    1. Web Performance Lessons: Solving performance challenges on other people’s websites
    2. Context: E-commerce • Retail • Insurance • Automotive • News and forums • Brochureware
    3. Performance and Efficiency: Browser rendering performance • Delivering responses to the browser • Rendering responses on the server • Availability
    4. WebGoat
    5. Cache in the browser: Cache-Control and Expires headers • Query strings • Protocol-relative URLs • Prefer GET instead of POST
    6. Appropriate resources: Scripts and Stylesheets • Concatenate • Minify/compress • Images • Resolution (and in HTML) • Strip metadata • Sprite / Inline
    7. Scripts: <script defer> • Page-load time versus quality metrics • Shared CDN versus on-domain • Beware coupling
    8. Web Server: Dynamic pages not found • Server-side session state • Database contention • Degrade gracefully
    9. HTTPS and HTTP/2: TLS is fast • Multiplexing • Server Push • Domain sharding
    10. Content Delivery Network: Cookies • Vary: User-Agent • Cross-Site Request Forgery • User-abstracted HTML
    11. Thank you: Jason Stangroome • Twitter: @jstangroome • https://blog.stangroome.com
    12. 1-5 August DDD Sydney thanks our sponsors
