HCLT Whitepaper: Accelerated Web Content Delivery



Sanjeet Joshi
Architecture Technology Services
HCL Technologies Ltd.
© 2010, HCL Technologies Ltd.
November 2010

The author would like to thank Dr. Usha Thakur of ATS for her valuable help in content formatting and content enhancement.

NON-DISCLOSURE OBLIGATIONS AND DISCLAIMER
The data, information or material provided herein is confidential and proprietary to HCL and shall not be disclosed, duplicated or used in whole or in part for any purpose other than as approved by an authorized official of HCL in writing. The recipient agrees to maintain complete confidentiality of the information and data received and shall take all reasonable precautions/steps in maintaining confidentiality of the same, however in any event not less than the precautions/steps taken for its own confidential material. If you are not the intended recipient of this information, you are not authorized to read, forward, print, retain, copy or disseminate this document or any part of it. Any statements in this presentation that are not historical facts may include forward-looking statements that involve risks and uncertainties; actual results may differ from the forward-looking statements.
Background

In 1995, when the Internet was still in its infancy, there were about 16 million users worldwide, compared to about 2 billion users today1. Over the last fifteen years, not only has the number of people using the Internet grown exponentially, but we have also witnessed an evolution of technology standards, protocols, and information consumption patterns. The Internet is no longer limited to desktop and laptop computers; an increasing number of people on the go use handheld devices to access their preferred websites. This ease of access has resulted in a significant increase in Web traffic.

Today, when designing a Web application or a website that is expected to generate a lot of interest, one has to ensure that it has the right design and infrastructure to handle the extra load; otherwise the website is likely to experience difficulties. For instance, the highly popular micro-blogging website twitter.com faced stability issues for a long time after its launch because it was not designed to handle a large amount of traffic.

The performance of a Web application is determined by multiple factors such as design and application architecture, quality of code, and hardware infrastructure. Performance needs to be built into every layer of the technology stack to get a solid finished product.

This paper focuses on the Web content caching aspect of website performance.

Purpose

Web caching is not a new idea. It has been in use for quite some time, and current browsers, caching proxies, and Web servers all provide support for it. However, Web caching is often an overlooked aspect when designing the technology stack of a Web application.

Web content caching can be implemented by content consumers (end users) to improve their Internet browsing experience, or by content providers to reduce the load on their origin infrastructure and give their customers a better Web surfing experience.

Caching at the content consumer's end is handled by Web browsers such as Internet Explorer and Firefox. This is done automatically, and end users have limited control over how and what will be cached. Some organizations also install caching proxies to cache incoming Web content and to apply security policies.

This paper focuses on caching solutions from a content provider's point of view and the various ways in which content caching can enhance a website's performance.

1 See http://www.internetworldstats.com/stats.htm (accessed November 2010).
The paper is technical in nature and assumes that the reader is familiar with Web standards such as HTTP and HTML. It is targeted at technology architects and solution designers.

Web Caching Concepts

The concept of caching has been widely used since the early days of computing and is implemented at various layers in a technology stack. For example, the processor chip layer has a hardware cache that is used for storing the most frequently accessed instructions. Irrespective of where a cache is used, its main function is to store the most frequently accessed data (information or instructions), and its main goal is to improve performance by reducing read/computation cycle times.

It is common knowledge that application-level caching can be extremely beneficial in saving multiple expensive database reads or expensive repetitive computations, thereby improving overall application performance.

HTTP caching, or Web caching, goes one layer above and caches entire static Web resources (e.g., HTML pages, CSS files, etc.) either at the client side (browser cache) or at the server side (origin cache infrastructure).

Let us take a quick look at some of the common terms used with respect to caching in general and Web caching in particular.

Origin server (or origin infrastructure) is the server infrastructure where Web servers or application servers are hosted. These servers are responsible for serving fresh content upon request.

Time to live (TTL) is the validity period of cacheable data, beyond which the data is considered stale. It is a critical parameter because a very low TTL makes caching ineffective, while a very high TTL results in stale data being served to clients.

Cache hit occurs each time an HTTP request is served from cache.

Cache hit ratio is the percentage of all requests that result in cache hits.

Cache miss occurs when a request cannot be served from cache.
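To make these terms concrete, the following is a minimal, illustrative TTL cache in Python (added for this discussion, not part of the original paper); the class and method names are hypothetical. It shows how a TTL turns requests into cache hits or misses and how a hit ratio can be computed.

```python
import time

class TTLCache:
    """Minimal illustrative cache: entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self.store = {}      # key -> (value, time_stored)
        self.hits = 0
        self.misses = 0

    def get(self, key, fetch_from_origin):
        entry = self.store.get(key)
        now = time.time()
        if entry is not None and now - entry[1] < self.ttl:
            self.hits += 1            # cache hit: served without touching the origin
            return entry[0]
        self.misses += 1              # cache miss (or stale entry): go to the origin server
        value = fetch_from_origin(key)
        self.store[key] = (value, now)
        return value

    def hit_ratio(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0

# Example usage with a dummy "origin"
cache = TTLCache(ttl_seconds=60)
origin = lambda key: "<html>content for %s</html>" % key
cache.get("/index.html", origin)   # miss: fetched from the origin
cache.get("/index.html", origin)   # hit: served from cache within the TTL
print("Cache hit ratio: %.0f%%" % (cache.hit_ratio() * 100))
```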
Controlling Caching Behavior of Your Content

Web browsers and caching proxies depend on the HTML and HTTP headers of the delivered content to determine whether the content can be cached and, if so, for how long. These cache headers can be tuned to define the cache behavior of a Web application or website.

Cache Headers

HTML authors can use tags in the <HEAD> section of an HTML page to dictate the caching behavior of that page. However, these HTML cache tags have no well-defined standard, and hence not all browsers or caches honor them. For example, using <Pragma: no-cache> does not guarantee that the content will never be cached. It is therefore not advisable to rely on HTML cache headers.

A more reliable approach is to use HTTP headers. HTTP headers are created by the Web server and sent in response to a request. They help the caching layer decide whether the content can be cached, for how long it can be served, and when it needs to be refreshed from the origin server. Some important HTTP headers that control caching are as follows:

Expires: Gives the date and time after which the response is considered stale. For example, Expires: Sat, 06 Aug 2011 10:00:00 GMT.

Cache-Control: Provides multiple directives for controlling the caching mechanism:
  max-age=[seconds] — specifies the maximum time for which a resource will be considered fresh. Unlike Expires, this directive is relative to the time of the request rather than an absolute date.
  s-maxage=[seconds] — similar to max-age, except that it only applies to shared (e.g., proxy) caches.
  public — marks authenticated responses as cacheable; normally, if HTTP authentication is required, responses are automatically private.
  private — allows caches that are specific to one user (e.g., in a browser) to store the response; shared caches (e.g., in a proxy) may not.
  no-cache — forces the cache to submit each request back to the origin server for validation before releasing a cached copy. This is useful for ensuring that authentication has been respected (in combination with public) and for maintaining freshness without sacrificing all of the benefits of caching.
  no-store — instructs caches not to keep a copy of the representation under any conditions.
  must-revalidate — tells the cache that it must strictly obey any freshness information given for a representation; without it, HTTP allows a cache to serve stale representations under special conditions.
  proxy-revalidate — similar to must-revalidate, except that it only applies to proxy caches.

Note: One important point to remember here is that not all types of content can be cached. For instance, dynamic content generated using server-side scripting cannot be cached under normal conditions. However, dynamically assembled content that does not change frequently can be cached by making those scripts return valid cache headers (see the sketch below).
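To illustrate the note above, here is a minimal sketch (not from the original paper) of a server-side script that returns valid cache headers so that its dynamically assembled output becomes cacheable. It uses Python's standard wsgiref and email.utils modules; the one-hour lifetime and the application itself are arbitrary choices made for the example.

```python
import time
from email.utils import formatdate
from wsgiref.simple_server import make_server

def application(environ, start_response):
    # Dynamically assembled content that changes infrequently, so we
    # declare it cacheable for one hour by both private and shared caches.
    body = ("<html><body>Generated at %d</body></html>" % int(time.time())).encode("utf-8")
    headers = [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(body))),
        ("Cache-Control", "public, max-age=3600, s-maxage=3600"),
        ("Expires", formatdate(time.time() + 3600, usegmt=True)),  # absolute expiry time
    ]
    start_response("200 OK", headers)
    return [body]

if __name__ == "__main__":
    # Browsers, caching proxies, and CDNs may now reuse the response for
    # up to an hour instead of hitting this script on every request.
    make_server("", 8000, application).serve_forever()
```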
Content Delivery Networks

Content Delivery Networks (CDNs) are established commercial solutions that provide a Web content caching layer. These networks provide a transparent caching layer between Web clients and the origin infrastructure and intercept every request going to the origin server. Typically, CDNs have their cache servers distributed around the world and use smart algorithms to deliver cached content from the nearest (in terms of network hops) cache location. CDNs take a major chunk of the content-serving load away from the origin infrastructure, thus reducing its load. CDNs are also used for delivering rich multimedia content such as audio and video files.

Figure 1 illustrates where a CDN fits in the overall workflow: Web clients send HTTP requests to the CDN, which sits in front of the origin server infrastructure.

Figure 1: Positioning a CDN

Although commercial CDNs deliver huge value, they may not be suitable for small organizations with limited budgets because they are expensive to hire. A custom CDN is recommended for organizations that want more control over the caching behavior of their content; in such cases, it not only works out cheaper to implement but also gives immense control over caching.
SQUID Proxy in Server Acceleration Mode

Squid is an open source caching proxy licensed under the GNU GPL. It is one of the most widely used, robust, and feature-rich open source products available, and is used by websites such as Wikipedia.org that experience very high traffic volumes.

Squid can be installed as a forward proxy to improve client-side Web surfing performance, apply security and filtering mechanisms, and enforce organizational policies by monitoring outgoing requests. Squid can also be installed in reverse proxy mode to improve server-side content delivery performance; this is also known as server accelerator mode. A reverse proxy is set up close to the origin Web servers to serve incoming requests rather than outgoing requests.

Figure 2: Squid as Reverse Proxy (Web clients send HTTP requests to the Squid reverse proxies, which sit in front of the origin server infrastructure)

A reverse proxy acts as an intermediary between a Web client and the origin Web server(s). It receives all content requests and delivers valid content available in its cache. If the requested content is not available, the reverse proxy requests it from the origin server. This reduces the TCP connection and content rendering load on the origin servers, making them available for other important tasks.

Some key benefits of the afore-mentioned architecture are as follows.
 1. LOAD BALANCING: If the Web server infrastructure requires expensive server hardware, Squid can be installed on a number of inexpensive commodity hardware boxes, thereby reducing the number of expensive origin servers required.
 2. SECURITY: This can also provide an effective security solution because the origin server infrastructure is hidden behind the Squid infrastructure layer. Hence any attack on the website is limited to the Squid infrastructure, and any damage is limited to the cached content.
 3. PERFORMANCE: A correctly tuned Squid installation can provide significant performance gains, as the proxy is designed to serve cached content at very high speeds (see the sketch below for one way to check this). It uses in-memory caching for better performance. Squid also provides various cache replacement policies that play a major role in determining the performance of a Squid server.
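As a quick way to confirm that a reverse proxy is actually serving responses from cache, the hedged sketch below (not part of the original paper) fetches a URL twice and prints cache-related response headers. It assumes the proxy adds an Age header and an X-Cache style header, as Squid commonly does; the URL is a placeholder.

```python
import urllib.request

URL = "http://www.example.com/index.html"  # placeholder: a resource served via the reverse proxy

def show_cache_headers(url):
    # Fetch the resource and print the headers that reveal caching behavior.
    with urllib.request.urlopen(url) as response:
        for name in ("Age", "X-Cache", "Cache-Control", "Expires"):
            value = response.headers.get(name)
            if value is not None:
                print("%s: %s" % (name, value))
    print("---")

# The first request may be a cache miss (no Age header, or an X-Cache MISS);
# a repeat request within the TTL should show a growing Age value or a HIT.
show_cache_headers(URL)
show_cache_headers(URL)
```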
Squid Cache Replacement Policies

A cache replacement policy determines which objects in the cache can be replaced by new objects that are more likely to be served, thereby improving the cache hit ratio. This is an important choice because it helps optimize disk and memory usage: the most popular objects should not be removed from the cache, while the least accessed cached objects should be replaced by more popular ones.

Squid offers several replacement policies. Below we provide a brief introduction to each of them. There is no single recommended or best policy; the right policy is chosen after studying the content and how it is accessed. (A minimal sketch of the LRU idea follows these descriptions.)

LRU (Least Recently Used)
LRU is a common and effective choice for most cache implementations. It removes the objects that have gone the longest without being accessed, i.e., cached objects that have not been accessed for a long time are the prime candidates for replacement. LRU works well when objects that were accessed most recently have a greater likelihood of being accessed again in the near future.

LFUDA (Least Frequently Used with Dynamic Aging)
LFU is another commonly used policy that keeps a count of object references and removes the least used objects. LFUDA is a variant of LFU that uses a dynamic aging policy to accommodate shifts in the set of popular objects. In the dynamic aging policy, a cache age factor is added to the reference count when an object is added to the cache or an existing object is modified. This prevents previously popular documents from polluting the cache.

GDSF (Greedy Dual-Size Frequency)
GDSF is an enhancement of GDS (Greedy Dual-Size) that takes into account the size of the cached object and the cost associated with retrieving it, as well as the frequency of reference. This policy is optimized for more popular, smaller objects in order to maximize the object hit rate.
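To illustrate the LRU idea described above, the following Python snippet (a conceptual sketch added here, not Squid's actual implementation) evicts the entry that has gone the longest without being accessed once the cache is full.

```python
from collections import OrderedDict

class LRUCache:
    """Conceptual LRU cache: evicts the least recently used entry when full."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()   # oldest access first, newest access last

    def get(self, key):
        if key not in self.entries:
            return None                          # cache miss
        self.entries.move_to_end(key)            # mark as most recently used
        return self.entries[key]

    def put(self, key, value):
        if key in self.entries:
            self.entries.move_to_end(key)
        self.entries[key] = value
        if len(self.entries) > self.capacity:
            evicted_key, _ = self.entries.popitem(last=False)  # drop least recently used
            print("evicting", evicted_key)

# Example: with capacity 2, re-accessing "/a" keeps it fresh, so "/b" is evicted.
cache = LRUCache(capacity=2)
cache.put("/a", "page A")
cache.put("/b", "page B")
cache.get("/a")
cache.put("/c", "page C")   # evicts /b
```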
Squid Deployment Topologies

Multiple Squid servers can be configured to work together to improve cache hit ratios or to handle additional load. Squid caches installed in such a group share either a sibling relationship or a parent relationship. A Squid server running as a parent can have multiple sibling nodes communicating with it, essentially forming a hierarchy, while a flat topology includes Squid servers with only sibling relationships. If a request results in a cache miss on a sibling node, it is transferred to the parent node. If the parent also returns a cache miss, the parent contacts the origin server for fresh content.

Squid Capacity Planning

Squid's hardware requirements are generally modest. Memory is often the most important resource: a memory shortage significantly reduces performance. Higher hit ratios are obtained by caching more objects, and caching more objects requires more disk space, so disk space is also an important factor to consider. Fast disks and interfaces are also beneficial in improving disk access time; SCSI performs better than ATA and may be chosen if the higher cost can be justified. While fast CPUs are nice, they are not critical to good performance.

Squid allocates a small amount of memory for each cached resource (up to 24 bytes per resource). As a rule of thumb, it requires about 32 MB of RAM for each GB of disk cache. So a server with 512 MB of RAM can serve a disk cache of 16 GB, while a 300 GB disk cache would need approximately 10 GB of RAM. (A worked version of this rule of thumb appears in the sketch below.)
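The small helper below (added for illustration; the 32 MB per GB figure is this paper's rule of thumb, and actual Squid memory use varies with object count and configuration) reproduces the arithmetic behind the sizing examples above.

```python
MB_RAM_PER_GB_DISK = 32   # rule of thumb quoted in this paper

def ram_needed_mb(disk_cache_gb):
    """RAM (MB) suggested for a given disk cache size."""
    return disk_cache_gb * MB_RAM_PER_GB_DISK

def max_disk_cache_gb(ram_mb):
    """Largest disk cache (GB) a given amount of RAM can support."""
    return ram_mb / MB_RAM_PER_GB_DISK

print(max_disk_cache_gb(512))      # 16.0 GB disk cache for 512 MB RAM
print(ram_needed_mb(300) / 1024)   # ~9.4 GB RAM for a 300 GB disk cache
```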
Conclusion

 - Using reverse proxies for Web caching is a non-intrusive way of improving content delivery performance.
 - Reverse proxy based Web caching can be implemented as a cost-effective replacement for commercial CDNs.
 - A customized CDN gives better control over the caching infrastructure and helps meet the specific performance needs of an enterprise, as compared to an expensive commercial CDN which may provide limited configuration options.
 - CDNs can take considerable load off the origin servers, thus freeing up origin server resources for other tasks.

Appendix A – Case Study

"Squid Implementation for a Leading Global Entertainment Content Company"

The customer uses the Akamai Edge Server Platform for improved content delivery. The Edge Server Platform's design helps improve content availability and reduce request response time. This should translate into less Web traffic coming directly to the Web servers (origin servers), thus improving the overall efficiency of the infrastructure and reducing infrastructure costs. Ironically, though, it was observed that the origin servers were receiving increased Web traffic from the Akamai Edge servers themselves. A solution had to be put in place to tackle that problem with minimal impact on existing applications and content.

Problem Context

The Akamai Edge Platform offers a robust design for highly efficient content delivery across the globe. This is achieved by deploying several thousand servers (edge servers) at data centers all over the world and then replicating the content to be delivered onto the appropriate servers. The key is to route all content requests from clients to the nearest (in terms of network hops) available server, resulting in minimal response time and higher availability. Each edge server acts as a caching proxy that requests content from the origin server and then serves the cached copy until its expiry, at which point a fresh copy is again requested from the origin server. Akamai uses a hierarchical architecture for its edge platform to avoid thousands of edge servers making multiple refresh requests to the origin server.

The problem is that the 'innermost' edge servers still need to make refresh requests to get new content from the origin server, and the origin server has to serve each of these requests separately. This was the root cause of the problem.

Figure 3 – High-level Problem Representation (multiple Akamai edge servers requesting the same resource, e.g., Foo.htm, from the origin server infrastructure)
The customer summarized the problem at hand thus:
 - High-traffic documents such as home pages were being requested from the origin servers as many as 70 times within a single TTL interval, which meant that there were that many innermost Akamai servers in the hierarchy.
 - Far too many requests were being received for pages, XML documents, dynamically generated JS, CSS, etc.

The customer felt that if the above-mentioned problems were addressed, the availability of the origin servers would rise close to 99.99%.

Solution Approaches Considered by HCL

Below is a brief summary of the approaches evaluated by the HCL team and its assessment of those approaches.

Approach 1: Custom Solution - Application Server Side

The first approach called for intercepting incoming content refresh requests from the Akamai servers to the origin servers, queuing and prioritizing them, and then rendering the highest priority content.

HCL Assessment of Approach 1
 - The solution was workable but complex, and many race conditions would have to be considered before its effectiveness became known.
 - The robustness and performance of such a solution were not obvious.
 - The solution mandated changes to the application layer, which could have had a cascading effect on the underlying layers.

Approach 2: Using Pre-fetch Settings Provided by Akamai

The second approach called for asynchronous content refresh. When this feature is enabled in Akamai, content refresh requests are sent even before the content becomes stale. Akamai servers continue to serve the existing content even after sending refresh requests, thereby refreshing content asynchronously.

HCL Assessment of Approach 2
 - The solution seemed like a perfect fit for the problem at hand, but it would not provide a complete solution.
 - It would work well only when content was requested within the threshold set by the pre-fetch settings. For example, if pre-fetch was set to 90%, Akamai servers would send refresh requests to the origin only after 90% of the TTL had elapsed.
 - The core problem of receiving multiple requests for the same content would remain unaddressed.
HCL's Squid Reverse Proxy-Based Solution

The HCL solution was based on the following design principles:
 1. Minimal or no changes to the application layer
 2. No rework for content producers or brand owners
 3. Once installed, the solution should work transparently (without any other layers being aware of its existence)
 4. The solution should be repeatable/reusable

Using Squid as a Reverse Proxy

The goal of HCL's solution was to minimize the number of requests going to the origin servers while still serving content that is as fresh as possible.

As a first step, the HCL team proposed the installation of Squid in reverse proxy mode on separate infrastructure. This introduced an additional caching layer between the Akamai servers and the origin servers. Once set up, it cached all the relevant content and served it whenever requested by Akamai. The team used the advanced cache control settings provided by Squid (v2.7) to control the number of redundant requests for a single resource and to also support asynchronous refresh (the sketch at the end of this appendix illustrates the request-collapsing idea).

Goals Achieved

The solution proposed by the HCL team passed rigorous performance checks, with over 90% load reduction.
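The paper does not show the Squid settings that were used, but the core idea of collapsing redundant requests for a single resource (Squid calls this collapsed forwarding) can be sketched conceptually as follows. This is an illustration added for this discussion, not Squid's implementation: concurrent refresh requests for the same URL share one fetch from the origin instead of each hitting it.

```python
import threading
import time

class CollapsingFetcher:
    """Conceptual sketch: concurrent requests for the same key share a
    single in-flight fetch from the origin instead of each hitting it."""

    def __init__(self, fetch_from_origin):
        self.fetch_from_origin = fetch_from_origin
        self.lock = threading.Lock()
        self.in_flight = {}   # key -> (done event, result holder)

    def get(self, key):
        with self.lock:
            pending = self.in_flight.get(key)
            if pending is None:
                event, holder = threading.Event(), {}
                self.in_flight[key] = (event, holder)
                is_leader = True
            else:
                event, holder = pending
                is_leader = False
        if is_leader:
            holder["value"] = self.fetch_from_origin(key)   # the only origin fetch
            with self.lock:
                del self.in_flight[key]
            event.set()
        else:
            event.wait()              # reuse the leader's result when it arrives
        return holder["value"]

# Example: ten concurrent requests for the same page cause one origin fetch.
origin_calls = []

def slow_origin(key):
    origin_calls.append(key)
    time.sleep(0.1)                   # simulate origin latency
    return "<html>fresh copy of %s</html>" % key

fetcher = CollapsingFetcher(slow_origin)
threads = [threading.Thread(target=fetcher.get, args=("/home",)) for _ in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print("origin fetches:", len(origin_calls))   # expected: 1
```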