World Wide Web Caching: Trends and Techniques, by Ersan Bilik, 11 May 2005
Why caching? Bandwidth savings, server load balancing, network latency reduction, and content availability. Why did we need caching? Because of the growth of the Internet: in the 1990s FTP accounted for 44% of usage, whereas today HTTP accounts for between 75% and 80%.
Proxy Caching: The proxy intercepts the HTTP request. If the requested object is found in the cache, it is returned. Otherwise the proxy goes to the object's home server, caches the object, and returns the cached copy.
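
The lookup-or-fetch flow of a proxy cache can be sketched in a few lines. This is a minimal in-memory illustration, not a real proxy: the `ProxyCache` class, the `fetch_from_origin` callback, and `fake_origin` are hypothetical names introduced here, and a real cache would also handle expiry, size limits, and HTTP headers.

```python
from typing import Callable, Dict


class ProxyCache:
    """Toy in-memory proxy cache keyed by URL (illustrative only)."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes]) -> None:
        # fetch_from_origin is a hypothetical stand-in for contacting
        # the object's home server.
        self._fetch = fetch_from_origin
        self._store: Dict[str, bytes] = {}
        self.hits = 0
        self.misses = 0

    def get(self, url: str) -> bytes:
        # Requested object found? Then return it.
        if url in self._store:
            self.hits += 1
            return self._store[url]
        # Else go to the home server, cache the object, return it.
        self.misses += 1
        body = self._fetch(url)
        self._store[url] = body
        return body


def fake_origin(url: str) -> bytes:
    # Pretend origin server used only for this demonstration.
    return f"content of {url}".encode()


cache = ProxyCache(fake_origin)
cache.get("http://example.com/a")  # miss: fetched from the origin
cache.get("http://example.com/a")  # hit: served from the cache
print(cache.hits, cache.misses)
```

The second request never reaches the origin, which is where the bandwidth and latency savings come from.
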
Disadvantages of Proxy Caching: When the cache server is unavailable, the clients it serves cannot reach content. Web browsers must be manually configured to use the appropriate proxy cache, and whenever the proxy server changes, all browsers must be manually reconfigured.
So, what to do? Locate nearby proxies with browser auto-configuration. The Web Proxy Auto-Discovery Protocol (WPAD), proposed by the Internet Engineering Task Force (IETF), relies on DNS records and DHCP to locate an automatic proxy configuration (APC) file.
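
As a rough illustration of the DNS side of WPAD, the sketch below lists the URLs a client might probe for its configuration file. It assumes the commonly described behaviour of prefixing `wpad.` to progressively shorter parent domains and requesting `/wpad.dat`; the function name `wpad_candidate_urls` is hypothetical, stopping at two-label domains is a simplification made here, and the DHCP path is not shown.

```python
from typing import List


def wpad_candidate_urls(client_fqdn: str) -> List[str]:
    """Return candidate configuration-file URLs for a client host name.

    Simplified assumption: drop the host label, then try each parent
    domain that still has at least two labels.
    """
    labels = client_fqdn.strip(".").split(".")
    domain = labels[1:]  # remove the host's own label
    urls: List[str] = []
    while len(domain) >= 2:
        urls.append("http://wpad." + ".".join(domain) + "/wpad.dat")
        domain = domain[1:]  # move one level up the DNS tree
    return urls


for url in wpad_candidate_urls("pc.dept.example.com"):
    print(url)
```
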
Reverse Proxy Cache: Content is cached near its origin instead of near the clients. This is good for servers that receive a huge number of requests, e.g. hosting farms. It is not an alternative to client-side proxy caching.
Transparent Proxy Caching: HTTP requests are intercepted and redirected to web cache servers. After that, it works like ordinary proxy caching. The intercepting element works like a router.
Advantages & Disadvantages of Transparent Proxy Caching: No need to configure web browsers manually. Additional network traffic. No acknowledgement. Can be used with a Layer 4 (L4) switch.
Adaptive Web Caching: The aim is to dynamically bring proxy servers closer to "hot spots". What is a hot spot? Highly requested information. Two protocols are involved: the Cache Group Management Protocol (CGMP) and the Content Routing Protocol (CRP).
How does the CGMP algorithm work? Nodes "learn" the environment, which sounds like artificial intelligence (genetic algorithms). Nodes vote on the other nodes. The higher a node's fitness value, the more nodes will join its mesh (sub-network).
Push Caching: The aim is to keep requested data close to clients. It is not the same as adaptive caching: adaptive caching focuses on boundaries that can enlarge, while push caching focuses on the content data itself.
Active Caching: The problem is that 30% of the information requested by clients is dynamic data (such as cookies), and in the future the share will be higher. The approach is to use applets for dynamic content: cache the dynamic content and use it locally (at the cache).
Cache Deployment Options: Consumer oriented (proxy caching, transparent proxy caching); provider oriented (reverse proxy caching); strategic points in the network (adaptive caching). Advantages and disadvantages?
Hierarchical Caching: The aim is to have a series of caches arranged hierarchically in a tree-like structure. When a request arrives, the caches leverage each other: child caches query their parent caches, and children query each other.
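
The tree-shaped lookup can be sketched as follows. This is a simplified illustration under stated assumptions: the `CacheNode` class and its method names are hypothetical, siblings are checked before the parent (one possible ordering), and only the root is given an `origin` callback standing in for the home server.

```python
from typing import Callable, Dict, List, Optional


class CacheNode:
    """One cache in a hypothetical hierarchy (illustrative only)."""

    def __init__(
        self,
        name: str,
        parent: Optional["CacheNode"] = None,
        origin: Optional[Callable[[str], str]] = None,
    ) -> None:
        self.name = name
        self.parent = parent
        self.origin = origin  # only needed on the root cache
        self.siblings: List["CacheNode"] = []
        self.store: Dict[str, str] = {}

    def peek(self, url: str) -> Optional[str]:
        # Local lookup only; used when another cache queries this one.
        return self.store.get(url)

    def get(self, url: str) -> str:
        body = self.peek(url)
        if body is None:
            # Children query each other first.
            for sibling in self.siblings:
                body = sibling.peek(url)
                if body is not None:
                    break
        if body is None:
            if self.parent is not None:
                # Then the child queries its parent, which may recurse.
                body = self.parent.get(url)
            elif self.origin is not None:
                # The root falls back to the home server.
                body = self.origin(url)
            else:
                raise LookupError(f"{self.name}: no parent or origin for {url}")
        self.store[url] = body
        return body


origin_calls: List[str] = []


def origin(url: str) -> str:
    origin_calls.append(url)
    return "body:" + url


root = CacheNode("root", origin=origin)
a = CacheNode("a", parent=root)
b = CacheNode("b", parent=root)
a.siblings, b.siblings = [b], [a]

a.get("/x")  # misses everywhere, so the root contacts the origin once
b.get("/x")  # answered by sibling a without touching the origin
print(len(origin_calls))
```

Each extra level a miss has to climb adds a round trip, which is why tree depth and latency are related.
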
Intercache Communication: It is desirable for caches to query each other. Five well-known protocols deal with this issue: ICP, cache digests, CRP, CARP, and WCCP. ICP is the oldest and most mature; it queries other caches to determine the best way to respond to a request for an object. There is a relation between the depth of the tree and latency.
Hash-Based Request Routing: The aim is to perform load balancing in cache clusters. Why use a long string when you can use 128 bits to identify everything? Microsoft's CARP does not query other caches; it routes each request by applying a hash function to the URL.
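
A hash-routed cluster can be sketched as below. This is not the actual CARP hash function: it is a highest-score ("rendezvous" style) illustration that substitutes MD5, chosen here only because it yields a 128-bit value, and the names `score`, `pick_cache`, and the member names are hypothetical.

```python
import hashlib
from typing import List


def score(member: str, url: str) -> int:
    """Combine a cache member name and a URL into one 128-bit number."""
    digest = hashlib.md5(f"{member}|{url}".encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


def pick_cache(members: List[str], url: str) -> str:
    """Route the URL to the member with the highest combined score."""
    return max(members, key=lambda member: score(member, url))


members = ["cache-a", "cache-b", "cache-c"]
url = "http://example.com/index.html"
print(pick_cache(members, url))
```

Every client computes the same answer for the same URL without asking any cache, and removing a member that was not chosen leaves the routing of that URL unchanged.
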
Optimized Disk I/O: Data structures, such as hash tables, are used to optimize caching. Reducing I/O costs improves performance.
Microkernel Architectures: Resource allocation, task execution, disk access, and transfer times. Windows NT and UNIX are not suitable operating systems for web caching.
Content Prefetching: Retrieving data from remote servers in anticipation of client requests. What should be prefetched next? A smart algorithm can reduce latency by up to 50%.
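
One simple way to decide "what to prefetch next" is to count which URL most often followed each URL in past sessions. The sketch below is an assumption made for illustration, not the algorithm behind the 50% figure; `NextRequestPredictor` and its methods are hypothetical names.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Optional


class NextRequestPredictor:
    """Predicts the next URL from first-order transition counts."""

    def __init__(self) -> None:
        self._followers: Dict[str, Counter] = defaultdict(Counter)

    def train(self, session: Iterable[str]) -> None:
        # Count each (previous URL -> next URL) pair in the session.
        previous: Optional[str] = None
        for url in session:
            if previous is not None:
                self._followers[previous][url] += 1
            previous = url

    def predict(self, url: str) -> Optional[str]:
        # Return the most frequent follower, or None if never seen.
        followers = self._followers.get(url)
        if not followers:
            return None
        return followers.most_common(1)[0][0]


predictor = NextRequestPredictor()
predictor.train(["/index", "/news", "/index", "/news", "/index", "/sports"])
print(predictor.predict("/index"))
```

A cache could fetch the predicted URL while the client is still reading the current page, at the cost of wasted bandwidth when the prediction is wrong.
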
Cache Consistency: What happens to out-of-date objects? Instead of checking when a request comes in, check periodically. But who checks whom: server to proxy, or proxy to server? Time to Live (TTL).
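
A TTL-based freshness check with a periodic sweep might look like the following sketch. The `CachedObject` type and `expired_urls` helper are hypothetical names, and the sketch only identifies stale entries; whether the proxy then revalidates them with the server or the server notifies the proxy is left open.

```python
import time
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class CachedObject:
    body: bytes
    fetched_at: float   # seconds since the epoch when the copy was stored
    ttl_seconds: float  # how long the copy is considered fresh

    def is_fresh(self, now: Optional[float] = None) -> bool:
        current = time.time() if now is None else now
        return current - self.fetched_at < self.ttl_seconds


def expired_urls(
    store: Dict[str, CachedObject], now: Optional[float] = None
) -> List[str]:
    """Periodic sweep: list the URLs whose TTL has run out."""
    current = time.time() if now is None else now
    return [url for url, obj in store.items() if not obj.is_fresh(current)]


store = {
    "/a": CachedObject(b"a", fetched_at=0.0, ttl_seconds=60.0),
    "/b": CachedObject(b"b", fetched_at=0.0, ttl_seconds=600.0),
}
print(expired_urls(store, now=120.0))
```

Passing `now` explicitly keeps the example deterministic; a running cache would call the sweep on a timer with the real clock.
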
Conclusion, Questions & Comments: Web caching is an important technology. Bandwidth savings. Network latency reduction.
