Improving Web Browser Access Latency Using Content Aliasing in Proxy Cache
International Journal of Computer Engineering and Technology (IJCET), ISSN 0976-6367 (Print),
ISSN 0976-6375 (Online), Volume 4, Issue 2, March – April (2013), © IAEME
IMPROVING ACCESS LATENCY OF WEB BROWSER BY USING
CONTENT ALIASING IN PROXY CACHE SERVER
Sachin Chavan¹, Nitin Chavan²
¹ Department of Computer Engineering, MPSTME, NMIMS, Shirpur
² Department of Information Technology, MPSTME, NMIMS, Shirpur
ABSTRACT
The web community is growing so quickly that the number of clients accessing web
servers is increasing rapidly. This growth affects several characteristics of the web:
available network bandwidth falls, latency rises, and response time increases for users
who require large-scale web services. This paper considers different types of proxy
actions, studies how they influence browser display time, and proposes a design and
methodology to address these issues. It also discusses acceptable loading times and the
scope of cacheable objects. The methodology works by analysing the content in the proxy
cache, identifying content aliasing, suppressing duplicates, and creating the
corresponding soft links. The solution makes intelligent use of the proxy cache server to
overcome these problems. Proxies were originally designed to let network administrators
control internet access from within an intranet. When a proxy cache is used, however, the
problem of aliasing arises: the same content is stored in the cache several times. The
present methodology improves access latency and browser response time and, at the same
time, avoids storing the same content in the cache multiple times, which wastes storage
space.
KEYWORDS: Access Latency, Cache, Web Proxy, Mirroring, Duplicate Suppression,
Content Aliasing.
INTERNATIONAL JOURNAL OF COMPUTER ENGINEERING
& TECHNOLOGY (IJCET)
ISSN 0976 – 6367(Print)
ISSN 0976 – 6375(Online)
Volume 4, Issue 2, March – April (2013), pp. 356-365
© IAEME: www.iaeme.com/ijcet.asp
Journal Impact Factor (2013): 6.1302 (Calculated by GISI)
www.jifactor.com
1. INTRODUCTION
In the field of web server management, researchers have long studied aliasing in proxy
server caches. Web caching consists of storing frequently referenced objects on a caching
server instead of the origin server, which makes better use of network bandwidth, reduces
the workload on servers, and improves the response time for users. Aliasing means giving
multiple names to the same thing.
The proxy cache also stores the images and other sub-files of visited pages, so if the
user moves to a new page within the same site that uses, for example, the same images,
the proxy cache already holds them and can deliver them to the user's browser more
quickly than they could be retrieved from the remote web server. Aliasing in proxy server
caches occurs when the same content is stored in the cache multiple times. On the World
Wide Web, aliasing commonly occurs when a client makes two requests for different URLs
and both responses carry the same payload. Currently, browsers perform cache lookups
using Uniform Resource Locators (URLs) as identifiers.
Websites that contain the same content are called mirrors. Mirrors are redundancy
mechanisms built into the web to serve pages faster, but they cost cache space. As web
traffic increases, efficient utilization of network bandwidth becomes increasingly
important. The technique therefore needs to analyse web traffic to understand its
characteristics; this analysis makes it possible to optimize the use of network
bandwidth, reduce network latency, and improve response time for users [8].
A proxy cache is a shared network device that can undertake web transactions on
behalf of a client and, like the browser, stores the content. Subsequent requests for
this content, by this or any other client of the cache, trigger the cache to deliver the
locally stored copy, avoiding a repeat of the download from the original content
source [4].
Figure 1. Concept of Caching (Proxy Cache) [figure labels: Proxy Cache Server; Bandwidth Saving and Traffic Reduction]
1.1 Advantages of Caching
1. Web caching reduces the workload of the remote web server.
2. A client can obtain a cached copy from the proxy if the remote server is not available.
3. It provides a chance to analyse an organization's usage patterns.
1.2 Disadvantages of Caching
1. A client might see stale data if the proxy is not updated properly.
2. Access latency may increase on a cache miss because of the extra proxy
processing.
3. A single proxy cache can become a bottleneck.
4. A single proxy is a single point of failure.
2. RELATED WORK
2.1 The Access Latency
Latency is defined as the delay between a request for a web page and receiving that
page in its entirety. The latency problem occurs when users judge the download to be too
long. Unacceptable latency does not only adversely affect user satisfaction: web pages
that load faster are also judged to be significantly more interesting than their slower
counterparts [12].
Studies on human cognition revealed that a response time shorter than 0.1 second is
unnoticeable and that a delay of 1 second matches the pace of interactive dialog. Table 1
shows the transfer rates of different connection types.
Table 1. Transfer Rates for Different Connection Types
Connection Type                        Slow      Normal    Maximum
Modem 33k6                             <2.734    ≈3        ≈3.65
Modem 56k                              <4.199    ≈5        ≈6.08
ISDN 64k                               <5.469    ≈6        ≈6.94
Cable                                  <9.766    ≈17       by provider
ADSL                                   <12.21    ≈24       ≈732
Ethernet 10Base-T (10 Megabits/sec)    <73       ≈195      ≈977
Table 1 shows parameters that affect the access time of the browser: the type of
connection used and the condition of that connection. The time at which the internet is
used also affects access latency, because bandwidth is shared.
2.2 Web Traffic
Web traffic is the amount of data sent and received by visitors to a website. It is
analysed to determine the popularity of websites and of individual pages or sections
within a site. Web traffic can be analysed by viewing the traffic statistics found in the
web server log file, an automatically generated list of all the pages served.
Traffic analysis is conducted using access logs from the web proxy server. Each entry in
the access log records the URL of the document requested, the date and time of the
request, the name of the client host making the request, the number of bytes returned to
the requesting client, and information that describes how the proxy treated the client's
request [1].
Processing these log entries can produce useful summary statistics about workload volume,
document types and sizes, document popularity, and proxy cache performance [5].
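The kind of log processing described above can be sketched as follows. This is a minimal illustration, not the tooling used in the paper: the five-field line layout, the HIT/MISS marker, and the function name `summarize` are assumptions made for the example, and a real proxy log would need its own parser.

```python
from collections import Counter


def summarize(log_lines):
    """Summarize a proxy access log.

    Assumed (hypothetical) line layout, whitespace separated:
        <timestamp> <client_host> <cache_result> <bytes> <url>
    where cache_result is "HIT" or "MISS".
    """
    requests = 0
    total_bytes = 0
    hits = 0
    url_counts = Counter()
    for line in log_lines:
        fields = line.split()
        if len(fields) != 5:
            continue  # skip entries that do not match the assumed layout
        _timestamp, _client, result, size, url = fields
        requests += 1
        total_bytes += int(size)
        if result == "HIT":
            hits += 1
        url_counts[url] += 1
    return {
        "requests": requests,                          # workload volume
        "bytes": total_bytes,                          # bytes returned to clients
        "hit_ratio": hits / requests if requests else 0.0,
        "popular": url_counts.most_common(3),          # document popularity
    }


sample_log = [
    "1362130000 host1 MISS 5120 http://www.example.com/index.html",
    "1362130005 host2 HIT 5120 http://www.example.com/index.html",
    "1362130009 host1 MISS 2048 http://www.example.com/logo.png",
]
stats = summarize(sample_log)
print(stats)
```

The summary covers the statistics named in the text (volume, popularity, and cache performance); document type could be added by grouping on the URL extension.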
2.3 Static Caching
Static caching is an approach to web caching that uses yesterday's log to predict
today's user requests. The static caching algorithm defines a fixed set of URLs by
analysing the logs of previous periods. It calculates a value for each unique URL,
arranges the URLs in descending order of value, and selects the URLs with the highest
values. This set of URLs is known as the working set. When a user requests a document
that is present in the working set, the request is fulfilled from the cache. Otherwise,
the request is fulfilled from the origin server [6].
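A minimal sketch of the working-set idea follows. The text does not define how the value of a URL is computed, so this example assumes the value is simply the request count in the previous period; the function names and the `capacity` parameter are likewise illustrative.

```python
from collections import Counter


def build_working_set(previous_urls, capacity):
    """Select the `capacity` highest-valued URLs from the previous period's log.

    Assumption: the value of a URL is its request count.
    """
    counts = Counter(previous_urls)
    return {url for url, _count in counts.most_common(capacity)}


def serve(url, working_set):
    """Report where a request would be fulfilled from."""
    return "cache" if url in working_set else "origin"


yesterday = ["/a", "/b", "/a", "/c", "/a", "/b"]
working_set = build_working_set(yesterday, capacity=2)
print(serve("/a", working_set), serve("/c", working_set))
```

The working set stays fixed for the whole period, which is what distinguishes this scheme from replacement policies that update the cache on every request.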
2.4 Dynamic Caching
Dynamic caching is more complex than static caching and requires detailed
knowledge of the application. Candidates for dynamic caching must be considered
carefully since, by its very nature, dynamically generated content can differ depending
on the state of the application. It is therefore important to consider under what
conditions dynamically generated content can be cached while still returning the correct
response. This requires knowledge of the application, its possible states, and other
data, such as the parameters that ensure the dynamic data is generated in a deterministic
manner [3].
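One way to picture the role of those parameters is a cache key built only from the inputs that determine the generated response. The sketch below is an illustration under that assumption, not a method taken from [3]; `DynamicCache`, `cache_key`, and the `relevant` parameter list are hypothetical names.

```python
def cache_key(path, params, relevant):
    """Build a key from only the parameters assumed to determine the response."""
    chosen = sorted((k, v) for k, v in params.items() if k in relevant)
    return (path, tuple(chosen))


class DynamicCache:
    def __init__(self, generate, relevant):
        self._generate = generate        # function (path, params) -> response
        self._relevant = set(relevant)   # parameter names that affect the output
        self._store = {}
        self.generated = 0               # how many times content was regenerated

    def get(self, path, params):
        key = cache_key(path, params, self._relevant)
        if key not in self._store:
            self._store[key] = self._generate(path, params)
            self.generated += 1
        return self._store[key]


def render(path, params):
    # Stand-in for an application page whose output depends only on "lang".
    return f"{path} rendered in {params['lang']}"


cache = DynamicCache(render, relevant=["lang"])
cache.get("/news", {"lang": "en", "session": "1"})
cache.get("/news", {"lang": "en", "session": "2"})  # same key, served from cache
cache.get("/news", {"lang": "fr", "session": "3"})  # different key, regenerated
print(cache.generated)
```

If a parameter that does influence the output is left out of `relevant`, the cache returns a wrong response, which is why the text stresses knowledge of the application.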
2.5 MD5 Algorithm
MD5, published by Ron Rivest in 1992, is a cryptographic hash algorithm that
succeeded the MD4 algorithm. MD5 takes an input of any length and generates a digest of
fixed length (128 bits, or 32 hexadecimal characters). Because MD5 is deterministic, a
particular data string always generates the same MD5 hash.
The MD5 cryptographic hash offers several advantages over its predecessors (such as MD4)
and its competitors (such as SHA and SHA-1). One of these advantages is that MD5 is a
one-way cryptographic hash. Another is that MD5 accepts inputs of any length but still
generates a fixed-length output. MD5 is fast, and it is highly unlikely that two
different input strings hash to the same digest by accident. Furthermore, MD5 is reliable
in the sense that the same input string always yields the same output digest [11].
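The properties listed above can be checked with Python's standard `hashlib` module. The helper name `md5_digest` and the sample payloads are chosen for this example only.

```python
import hashlib


def md5_digest(payload: bytes) -> str:
    """Return the MD5 digest of a payload as 32 hexadecimal characters."""
    return hashlib.md5(payload).hexdigest()


short = md5_digest(b"logo")
long_ = md5_digest(b"x" * 1_000_000)

print(len(short), len(long_))          # both digests have the same fixed length
print(short == md5_digest(b"logo"))    # same input, same digest
print(short == md5_digest(b"Logo"))    # different input, different digest
```

Because equal payloads always give equal digests, the digest can serve as a name for the content itself, independent of the URL under which it was fetched.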
3. EXPERIMENTAL SETUP
3.1 Changing the Proxy Server
In most organizations or institutions the main server does not support proxy
caching, so it is difficult to use the main server as the cache server. The proxy server
therefore has to be changed from the main server to another server [2].
The following steps switch a machine to the other proxy:
1. Open the browser, for example Internet Explorer.
2. In Internet Explorer, pull down the Tools menu and click Internet Options...
3. Click the Connections tab.
4. Click the LAN Settings... button.
5. In the Address: box, change "proxy1 Address" to "proxy2 Address" or vice versa and
click OK.
6. Click OK on the Internet Options dialogue box to return to the browser screen;
external sites are now reachable through the selected proxy.
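For scripted measurements, the same switch can be made per client program instead of in the browser dialog. The following sketch uses `urllib.request.ProxyHandler` from the Python standard library; the two proxy addresses are placeholders invented for the example and must be replaced by the real "proxy1 Address" and "proxy2 Address".

```python
import urllib.request

# Placeholder addresses (assumptions for illustration only).
PROXY1 = "http://proxy1.example:3128"
PROXY2 = "http://proxy2.example:3128"


def make_opener(proxy_address):
    """Build an opener that sends HTTP and HTTPS requests through one proxy."""
    handler = urllib.request.ProxyHandler(
        {"http": proxy_address, "https": proxy_address}
    )
    return urllib.request.build_opener(handler)


# Switch from the main proxy to the caching proxy.
opener = make_opener(PROXY2)

# A request such as opener.open("http://www.example.com/") would now be
# routed through PROXY2; it is not executed here because the addresses
# above are placeholders.
```

Passing an explicit `ProxyHandler` overrides any proxy settings picked up from the environment for requests made through this opener.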
3.2 Duplication of Data
Duplication of data means storing multiple copies of the same data object. When an
object or web page is cached, it is stored in cache memory; when different users request
the same page, multiple copies of that object or page may be stored, which wastes storage
space. Because maintaining a cache is expensive, such wastage is not affordable. A
duplicate suppression mechanism is used to avoid this duplication of data objects or web
pages [7]. Duplicate copies saved in the proxy cache occupy additional storage; the
analysis reported in [4] shows the effect of duplication on cache space.
3.3 Duplicate Suppression
Storage space requirements can be reduced by avoiding duplicate copies of the same
data. Content Engine provides the option to suppress storage of duplicate content
elements. Duplicate suppression applies to any kind of content. Incoming content is not
added to the storage area if identical content already exists there; only unique content
is added [14].
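A minimal sketch of duplicate suppression with soft links is given below. It is an in-memory illustration of the idea, not the implementation evaluated in this paper: the class name `AliasingCache` and its methods are hypothetical, the MD5 digest of the payload is assumed to be the content identifier, and payloads with equal digests are assumed to be identical.

```python
import hashlib


class AliasingCache:
    """Store each distinct payload once and map every URL to it by digest."""

    def __init__(self):
        self._bodies = {}  # digest -> payload, stored only once
        self._links = {}   # url -> digest (plays the role of a soft link)

    def put(self, url, payload):
        digest = hashlib.md5(payload).hexdigest()
        if digest not in self._bodies:
            self._bodies[digest] = payload  # only unique content is added
        self._links[url] = digest           # aliases just point at the stored copy

    def get(self, url):
        digest = self._links.get(url)
        return None if digest is None else self._bodies[digest]

    def stored_copies(self):
        return len(self._bodies)


cache = AliasingCache()
cache.put("http://www.example.com/logo.png", b"PNGDATA")
cache.put("http://mirror.example.com/logo.png", b"PNGDATA")  # alias of the same content
cache.put("http://www.example.com/index.html", b"<html></html>")
print(cache.stored_copies())
```

Three URLs are cached, but only two payloads occupy storage; the mirror URL resolves to the copy already held for the original URL.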
Because the network is large, there are many pages on the web, and most of them will
not be referenced multiple times by any one cache. Page references follow a distribution
similar to Zipf's law: the probability with which the Kth most popular page is referenced
is proportional to 1/K [9].
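The 1/K relationship can be made concrete with a small calculation. The sketch assumes a pure Zipf distribution over a finite set of `n` pages, which is a simplification of the Zipf-like behaviour reported in [9]; the function name is illustrative.

```python
def zipf_probabilities(n):
    """Reference probability of the K-th most popular page, K = 1..n,
    assuming probability proportional to 1/K."""
    weights = [1.0 / k for k in range(1, n + 1)]
    total = sum(weights)  # normalizing constant (the n-th harmonic number)
    return [w / total for w in weights]


probabilities = zipf_probabilities(1000)
top_ten_share = sum(probabilities[:10])
print(f"share of references going to the 10 most popular pages: {top_ten_share:.2f}")
```

Under this assumption a small number of popular pages attracts a large share of the references, while the long tail of pages is rarely referenced again, which is the behaviour the paragraph above describes.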
3.5 Experimental Results
The experiments were carried out in the laboratory of our institute. Several
popular websites were considered and analysed for browser access latency under different
conditions. Keyword-based searches were also used to measure latency by type of content,
with either a text search or an image search.
Table 2. Response Time of Search Engine for Text and Image Search
             Text Search                   Image Search
Keywords     From          From            From          From
             Web Server    Cache Server    Web Server    Cache Server
SVKM         250           140             230           200
NMIMS        140           130             300           100
RCPIT        250           120             350           150
CANNON       240           130             250           100
SAMSUNG      210           140             640           200
NOKIA        250           190             240           120
MATLAB       240           160             280           160
OPERA        250           150             120           120
SIEMENS      230           160             310           100
MICROMAX     160           140             190           110
MPSC         170           140             180           100
UPSC         210           150             150           140
IRCTC        160           140             330           90
RRB          260           120             310           70
Table 2 shows the reduction in browser response time when a page is fetched from
the cache server instead of the web server. The first column lists the keywords used for
the analysis; the same keywords are used for the text search and the image search. The
table contains the browser response time for both kinds of search. It shows a
considerable reduction in access latency when the page is fetched from the proxy cache.
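The reduction for each keyword can be computed directly from Table 2 as (web server time - cache server time) / web server time. The sketch below copies the values from the table; the names `TABLE_2` and `reduction_percent` are chosen for the example.

```python
# Response times copied from Table 2 (units as reported there).
# keyword: (text from web, text from cache, image from web, image from cache)
TABLE_2 = {
    "SVKM": (250, 140, 230, 200),
    "NMIMS": (140, 130, 300, 100),
    "RCPIT": (250, 120, 350, 150),
    "CANNON": (240, 130, 250, 100),
    "SAMSUNG": (210, 140, 640, 200),
    "NOKIA": (250, 190, 240, 120),
    "MATLAB": (240, 160, 280, 160),
    "OPERA": (250, 150, 120, 120),
    "SIEMENS": (230, 160, 310, 100),
    "MICROMAX": (160, 140, 190, 110),
    "MPSC": (170, 140, 180, 100),
    "UPSC": (210, 150, 150, 140),
    "IRCTC": (160, 140, 330, 90),
    "RRB": (260, 120, 310, 70),
}


def reduction_percent(from_web, from_cache):
    """Relative reduction, in percent, when the page is served from the cache."""
    return 100.0 * (from_web - from_cache) / from_web


text_reduction = {k: reduction_percent(v[0], v[1]) for k, v in TABLE_2.items()}
image_reduction = {k: reduction_percent(v[2], v[3]) for k, v in TABLE_2.items()}

for keyword in TABLE_2:
    print(f"{keyword:10s} text {text_reduction[keyword]:5.1f}%  "
          f"image {image_reduction[keyword]:5.1f}%")
```

A negative result would indicate that the cached fetch was slower than the fetch from the web server.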
Figure 2 compares, for the text search and a selection of keywords, the response
time when the page is fetched from the main server with the response time when it is
fetched from the proxy cache. The figure shows a considerable reduction in response time.
For each keyword, the first bar shows the response time when the page is fetched from the
web server, and the second bar shows the response time when it is fetched from the local
proxy cache server on which the content aliasing algorithm is implemented.
Figure 2. Response Time of Search Engine for Text Search
According to Table 2, the text search response time is reduced by 40 percent or more
for SVKM, RCPIT, CANNON, OPERA, and RRB, and by more than 50 percent for RCPIT and RRB.
For NMIMS, MICROMAX, and IRCTC the reduction is small, between about 7 and 13 percent.
The smaller reductions are attributed to dynamic content in the search results.
Figure 3 compares, for the image search and the given keywords, the response time
when the page is fetched from the main server with the response time when it is fetched
from the proxy cache. The figure shows that the two response times differ. For each
keyword, the first bar shows the response time when the page is fetched from the web
server, and the second bar shows the response time when it is fetched from the local
proxy cache server on which the content aliasing algorithm is implemented.
Figure 3. Response time of Search engine for Image Search
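According to Table 2, the reduction in image search response time varies widely: it
exceeds 40 percent for most keywords and 70 percent for IRCTC and RRB, while for SVKM and
UPSC it is at most about 13 percent and for OPERA there is no reduction. The small
reductions are attributed to images being more dynamic than text.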
Table 3. Connection Time and Response Time of Browser for Some Websites
                            From Web Server          From Cache Server
WEBSITE                     Connection   Response    Connection   Response
www.nmims.edu               7000         44000       3000         14000
www.rcpit.ac.in             6120         26140       3920         10310
www.mpsc.gov.in             5800         25700       3200         6390
www.upsc.gov.in             1890         4760        320          690
www.unipune.ac.in           2480         8600        1130         1580
www.wipro.com               2300         24750       900          3780
www.infosys.com             1710         18180       770          1980
www.techmahindra.com        990          18000       1260         7250
www.jaihindcollege.com      1210         13230       500          1170
www.jaihindcollege.ac.in    1800         15930       540          1040
www.msbte.com               1800         10170       810          1130
www.msbshse.ac.in           1530         4550        540          1040
www.cbse.nic.in             1130         5580        630          900
www.irctc.com               1710         12960       1670         3240
Table 3 shows the connection time and response time of the browser for various
sites. It compares the connection time and response time when a page is fetched from the
cache server instead of the web server. The first column lists the websites used for the
analysis. The table shows a considerable reduction in access latency when the page is
fetched from the proxy cache instead of the main server.
Figure 4. Connection Time for Different Websites
Figure 4 shows the effect of content aliasing on web browser access time in terms
of connection time. In most cases the connection time is reduced by more than 50 percent,
and in some cases by 30 to 50 percent. For the IRCTC website the reduction in connection
time is negligible, and for the TECHMAHINDRA website the connection time increased. This
is attributed to the larger share of dynamic content on those websites.
Figure 5. Response Time for Different Websites
Figure 5 shows a comparative graph of browser response time for different websites.
When the web page is fetched from the cache server, the response time is lower. The
reduction in response time is about 60 percent or more in every case, and in some cases
it exceeds 90 percent. Using content aliasing in the proxy cache server therefore saves a
significant amount of time in terms of both response time and connection time.
It is clear that a considerable amount of user time is saved by using content
aliasing. The reduction in access latency was achieved while also considering other
parameters such as cache size and stale data.
4. CONCLUSION
The analysis of the experimental results demonstrates the need for a methodology
that improves web access performance to enhance bandwidth utilization and connection
speed. The suggested design improves web performance in terms of reduced latency,
improved user response time, and better use of the existing bandwidth through web
caching. Content aliasing was successfully detected using a web-based application,
database queries, and file system calls. A considerable amount of duplicate storage can
be avoided through the suggested methodology, which makes it a very useful mechanism for
web proxy caches. Moreover, the solution keeps cached pages synchronized with the pages
on the web server, checking for new pages when needed. This work can be further optimized
with a daemon process designed to run periodically and check the consistency of the
cached data against the data at the web server. The process can be scheduled during slack
time, when traffic is low, so that it adds no significant load on the bandwidth, and it
can also update the Time to Live (TTL) period of the cached data.
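One pass of such a consistency check might look like the sketch below. It is only an outline of the proposed future work under stated assumptions: the cache is modelled as a dictionary of entries holding a content digest and an expiry time, `fetch_digest` is a hypothetical callable that returns the current digest of a URL at the web server, and the default TTL of 3600 seconds is an arbitrary example value.

```python
import time


def refresh_expired(cache, fetch_digest, now=None, ttl_seconds=3600):
    """Revalidate expired entries and renew their TTL.

    cache: dict mapping url -> {"digest": str, "expires": float}
    fetch_digest: callable url -> digest of the current content at the web
        server (hypothetical; it stands in for a real revalidation request)
    Returns the list of URLs whose content had changed.
    """
    now = time.time() if now is None else now
    changed = []
    for url, entry in cache.items():
        if entry["expires"] > now:
            continue  # still fresh, nothing to check
        current = fetch_digest(url)
        if current != entry["digest"]:
            entry["digest"] = current
            changed.append(url)
        entry["expires"] = now + ttl_seconds  # renew the Time to Live
    return changed


cache = {
    "/index.html": {"digest": "aaa", "expires": 100.0},  # expired, changed at origin
    "/logo.png": {"digest": "bbb", "expires": 100.0},    # expired, unchanged
    "/news.html": {"digest": "ccc", "expires": 900.0},   # still fresh
}
origin = {"/index.html": "zzz", "/logo.png": "bbb", "/news.html": "new"}
changed = refresh_expired(cache, origin.get, now=500.0, ttl_seconds=600)
print(changed)
```

A scheduler such as cron could invoke this pass during low-traffic hours, so that only expired entries generate requests to the web server.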
REFERENCES
[1] Kartik Bommepally, Glisa T. K., Jeena J. Prakash, Sanasam Ranbir Singh and Hema A
Murthy “Internet Activity Analysis through Proxy Log” IEEE, 2010.
[2] E-Services Team, “Changing Proxy Server” by the Robert Gordon University, School
hill, Aberdeen, Scotland-2006.
[3] Chen, W.; Martin, P.; Hassanein, H.S., "Caching dynamic content on the Web,"
Canadian Conference on Electrical and Computer Engineering, 2003, vol.2, no., pp.
947- 950 vol.2, 4-7 May 2003.
[4] Sadhna Ahuja, Tao Wu and Sudhir Dixit “On the Effects of Content Compression on
Web Cache Performance,” Proceedings of the International Conference on Information
Technology: Computers and Communications, 2003.
[5] Mark S. Squillante, David D. Yao and Li Zhang “Web Traffic Modeling and Web
Server Performance Analysis” Proceedings of the 38th Conference on Decision &
Control, Phoenix, Arizona, USA, December 1999.
[6] C. E. Wills and M. Mikhailov, “Studying the Impact of More Complete Server
Information on Web Caching,” Computer Communications, vol. 24, no. 2, pp. 184-190,
May 2000.
[7] J Wang “A Survey of Web Caching Schemes for the Internet” - Cornell Network
Research Group (C/NRG), Department of Computer Science, Cornell University 1999.
[8] N. Shivakumar and H. Garcia-Molina, “Finding near Replicas of Documents on the
Web” Proc. Workshop on Web Databases, Mar. 1998.
[9] L. Breslau, P. Cao, L. Fan, G. Phillips, and S. Shenker. Web caching and Zipf like
Distributions: Evidence and Implications. In Proc. Infocom ’99. New York, NY, March,
1999.
[10] Guerrero, C.; Juiz, C.; Puigjaner, R.; "Web Performance and Behavior Ontology,"
Complex, Intelligent and Software Intensive Systems, 2008. CISIS 2008. International
Conference on, vol., no., pp.219-225, 4-7 March 2008.
[11] Kimmo Jarvinen, Matti Tommiska and Jorma Skytta, “Hardware Implementation
Analysis of the MD5 Hash Algorithm,” IEEE Computer Society. 2005.
[12] Andrzej Sieminski, “The impact of Proxy caches on Browser Latency” International
Journal of Computer Science & Applications, 2005, Vol. II, No. II, pp. 5 – 21.
[13] S B Patil, Sachin Chavan, Preeti Patil; “High quality design to enhance and improve
performance of large scale web applications” International Journal of Computer
Engineering and Technology (IJCET), Volume 3, Issue 1, January- June (2012),
pp. 198-205, ISSN Print: 0976 – 6367, ISSN Online: 0976 – 6375.
[14] S.Vikram Phaneendra, “Minimizing Client-Server Traffic Based on AJAX”,
International journal of Computer Engineering & Technology (IJCET), Volume 3,
Issue 1, 2012, pp. 10 - 16, ISSN Print: 0976 – 6367, ISSN Online: 0976 – 6375.
[15] A. Suganthy, G.S.Sumithra, J.Hindusha, A.Gayathri and S.Girija, “Semantic Web
Services and its Challenges”, International journal of Computer Engineering &
Technology (IJCET), Volume 1, Issue 2, 2010, pp. 26 - 37, ISSN Print: 0976 – 6367,
ISSN Online: 0976 – 6375.