International Journal of Computer Engineering and Technology (IJCET), ISSN 0976-
6367(Print), ISSN 0976 – 6375(Online) Volume 4, Issue 2, March – April (2013), © IAEME
356
IMPROVING ACCESS LATENCY OF WEB BROWSER BY USING
CONTENT ALIASING IN PROXY CACHE SERVER
Sachin Chavan1, Nitin Chavan2
1 Department of Computer Engineering, MPSTME, NMIMS, Shirpur
2 Department of Information Technology, MPSTME, NMIMS, Shirpur
ABSTRACT
The web community is growing so quickly that the number of clients accessing web
servers is increasing rapidly. This growth affects several characteristics of the web:
available network bandwidth shrinks, latency increases, and response times rise for users
who require large-scale web services. This paper considers different types of proxy actions,
studies how they influence browser display time, and proposes a novel design and
methodology to address these issues. It also discusses acceptable loading times and the
scope of cacheable objects. The methodology works by analysing content in the proxy cache,
identifying content aliasing, suppressing duplicates, and creating the respective soft
links. The present solution makes intelligent use of the proxy cache server to overcome
these problems. Proxies were originally designed to enable network administrators to
control internet access from within an intranet. When a proxy cache is used, however, the
problem of aliasing arises: aliasing in proxy server caches occurs when the same content is
stored in the cache several times. The present methodology improves access latency and
browser response time and, at the same time, avoids storing the same content in the cache
multiple times, which would waste storage space.
KEYWORDS: Access Latency, Cache, Web Proxy, Mirroring, Duplicate Suppression,
Content Aliasing.
INTERNATIONAL JOURNAL OF COMPUTER ENGINEERING
& TECHNOLOGY (IJCET)
ISSN 0976 – 6367(Print)
ISSN 0976 – 6375(Online)
Volume 4, Issue 2, March – April (2013), pp. 356-365
© IAEME: www.iaeme.com/ijcet.asp
Journal Impact Factor (2013): 6.1302 (Calculated by GISI)
www.jifactor.com
1. INTRODUCTION
Aliasing in proxy server caches has long been a focus of research in web server
management. Web caching consists of storing frequently referenced objects on a caching
server rather than fetching them from the origin server each time, so that network
bandwidth is used more efficiently, the workload on web servers is reduced, and the
response time for users improves. Aliasing means giving multiple names to the same thing.
The proxy cache also stores the images and other embedded files of visited pages. If
the user moves to a new page within the same site that uses, for example, the same images,
the proxy cache already holds them and can deliver them to the user's browser faster than
they could be retrieved from the remote web server. Aliasing in proxy server caches occurs
when the same content is stored in the cache multiple times. On the World Wide Web,
aliasing commonly occurs when a client makes two requests for different URLs and both
responses carry the same payload. Currently, browsers perform cache lookups using Uniform
Resource Locators (URLs) as identifiers, so identical content reached through different
URLs is cached separately.
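As a minimal sketch of the aliasing problem described above, the following Python
fragment fills a URL-keyed cache with two mirror URLs that return the same bytes. The
URLs and payloads are hypothetical and chosen only for illustration; the point is that a
URL-keyed lookup cannot tell that two entries are copies of each other.

```python
# Hypothetical mirror URLs; the first two serve byte-identical content.
responses = {
    "http://mirror-a.example/logo.png": b"PNG-bytes",
    "http://mirror-b.example/logo.png": b"PNG-bytes",
    "http://mirror-a.example/index.html": b"<html>home</html>",
}

# A cache keyed by URL keeps one entry per URL, even for identical payloads.
url_cache = dict(responses)

# Grouping the entries by payload exposes the aliases.
by_payload = {}
for url, body in url_cache.items():
    by_payload.setdefault(body, []).append(url)

aliased_groups = [urls for urls in by_payload.values() if len(urls) > 1]
stored_bytes = sum(len(body) for body in url_cache.values())
unique_bytes = sum(len(body) for body in by_payload)

print(aliased_groups)
print(stored_bytes, unique_bytes)
```

The difference between `stored_bytes` and `unique_bytes` is the space spent on aliased
copies in this toy cache.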
Websites that contain the same content are called mirrors. Mirrors are redundancy
mechanisms built into the web to serve pages faster, but they come at a cost in cache
space. As the amount of web traffic increases, efficient utilization of network bandwidth
becomes increasingly important. Web traffic must therefore be analysed to understand its
characteristics, so that the use of network bandwidth can be optimized to reduce network
latency and improve response time for users [8].
A proxy cache is a shared network device that can undertake web transactions on
behalf of a client and, like the browser, stores the content. Subsequent requests for this
content, by this or any other client of the cache, trigger the cache to deliver the locally
stored copy, avoiding a repeat download from the original content source [4].
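The hit and miss behaviour of a shared proxy cache can be sketched in a few lines of
Python. The class name `ProxyCache`, the `fake_origin` function, and the URL are
assumptions made for this illustration; they do not describe the implementation used in
the experiments.

```python
class ProxyCache:
    """Toy shared cache: contacts the origin only on a miss."""

    def __init__(self, fetch_from_origin):
        self._fetch = fetch_from_origin   # callable: url -> bytes
        self._store = {}                  # url -> cached body
        self.hits = 0
        self.misses = 0

    def get(self, url):
        if url in self._store:
            self.hits += 1
            return self._store[url]
        self.misses += 1
        body = self._fetch(url)
        self._store[url] = body
        return body


origin_calls = []

def fake_origin(url):
    """Stand-in for the remote web server; records every download."""
    origin_calls.append(url)
    return f"content of {url}".encode()


cache = ProxyCache(fake_origin)
cache.get("http://example.org/a")   # first client: miss, fetched from origin
cache.get("http://example.org/a")   # second client: hit, served locally
print(cache.hits, cache.misses, len(origin_calls))
```

The second request never reaches the origin, which is the source of the bandwidth
saving shown in Figure 1.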
Figure 1. Concept of Caching (Proxy Cache). Figure labels: Bandwidth Saving and Traffic Reduction; Proxy Cache Server.
1.1 Advantages of Caching
1. Web caching reduces the workload of the remote web server.
2. A client can obtain a cached copy from the proxy if the remote server is not available.
3. It provides an opportunity to analyse an organization's usage patterns.
1.2 Disadvantages of Caching
1. A client might be looking at stale data if the proxy is not updated properly.
2. The access latency may increase in the case of a cache miss because of the extra proxy
processing.
3. A single proxy cache can become a bottleneck.
4. A single proxy is a single point of failure.
2. RELATED WORK
2.1 The Access Latency
Latency is defined as the delay between a request for a web page and receiving that
page in its entirety. The latency problem occurs when users judge the download to take too
long. Unacceptable latency not only adversely affects user satisfaction; web pages that
load faster are also judged to be significantly more interesting than their slower
counterparts [12].
Studies on human cognition have shown that a response time shorter than 0.1 second is
unnoticeable and that a delay of 1 second matches the pace of interactive dialog. The
following table shows the transfer rates of different connection types.
Table 1. Transfer rates for different connection types
Connection Type                        Slow      Normal    Maximum
Modem 33k6                             <2.734    ≈3        ≈3.65
Modem 56k                              <4.199    ≈5        ≈6.08
ISDN 64k                               <5.469    ≈6        ≈6.94
Cable                                  <9.766    ≈17       by provider
ADSL                                   <12.21    ≈24       ≈732
Ethernet 10Base-T (10 Megabits/sec)    <73       ≈195      ≈977
Table 1 lists parameters that affect the access time of the browser: the type of
connection used and the condition of that connection. The time of day at which the internet
is used also affects access latency, because bandwidth is shared.
2.2 Web Traffic
Web traffic is the amount of data sent and received by visitors to a website. It is
analysed to gauge the popularity of websites and of individual pages or sections within a
site. Web traffic can be analysed by viewing the traffic statistics found in the web server
log file, an automatically generated list of all the pages served.
Traffic analysis is conducted using access logs from the web proxy server. Each entry
in the access logs records the URL of the document requested, the date and time of the
request, the name of the client host making the request, the number of bytes returned to
the requesting client, and information describing how the proxy handled the client's
request [1].
Processing these log entries can produce useful summary statistics about workload volume,
document types and sizes, document popularity, and proxy cache performance [5].
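As an illustration of how such summary statistics can be derived, the sketch below
parses a small access log in Python. The log layout (timestamp, client, cache result,
bytes, URL, separated by whitespace) and the sample entries are assumptions made for this
example; real proxy servers use their own formats, and the parser would need to be
adapted accordingly.

```python
from collections import Counter

# Hypothetical layout: "<timestamp> <client> <cache_result> <bytes> <url>"
LOG = """\
1362130000 10.0.0.5 MISS 14200 http://example.org/index.html
1362130004 10.0.0.7 HIT 14200 http://example.org/index.html
1362130009 10.0.0.5 MISS 52100 http://example.org/logo.png
1362130015 10.0.0.9 HIT 52100 http://example.org/logo.png
"""


def summarize(log_text):
    """Return workload volume, hit ratio, and the most requested URLs."""
    requests, bytes_total = 0, 0
    results, popularity = Counter(), Counter()
    for line in log_text.splitlines():
        _timestamp, _client, result, size, url = line.split()
        requests += 1
        bytes_total += int(size)
        results[result] += 1
        popularity[url] += 1
    return {
        "requests": requests,
        "bytes": bytes_total,
        "hit_ratio": results["HIT"] / requests if requests else 0.0,
        "top_urls": popularity.most_common(2),
    }


summary = summarize(LOG)
print(summary)
```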
2.3 Static Caching
Static caching is an approach to web caching that uses yesterday's log to predict
today's user requests. The static caching algorithm defines a fixed set of URLs by
analysing the logs of previous periods. It calculates a value for each unique URL, arranges
the URLs in descending order of value, and selects those with the highest values. This set
of URLs is known as the working set. When a user requests a document that is present in the
working set, the request is fulfilled from the cache; otherwise, it is fulfilled from the
origin server [6].
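The selection of a working set can be sketched as follows. The text does not specify how
the value of a URL is computed, so this example assumes, purely for illustration, that
the value is the number of requests in the previous day's log; the function names and
sample paths are likewise hypothetical.

```python
from collections import Counter


def build_working_set(yesterdays_urls, capacity):
    """Rank URLs by request count (the assumed 'value') and keep the top ones."""
    counts = Counter(yesterdays_urls)
    return {url for url, _count in counts.most_common(capacity)}


def serve(url, working_set):
    """Report where a request would be fulfilled from."""
    return "cache" if url in working_set else "origin"


yesterday = ["/a", "/b", "/a", "/c", "/a", "/b"]
working_set = build_working_set(yesterday, capacity=2)

print(serve("/a", working_set))
print(serve("/c", working_set))
```

Because the set is fixed for the whole day, no replacement decisions are needed at
request time; the cost is that pages that become popular today are not cached until the
next rebuild.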
2.4 Dynamic Caching
Dynamic caching is more complex than static caching and requires detailed
knowledge of the application. One must consider the candidates for dynamic caching
carefully since, by its very nature, dynamically generated content can be different based on
the state of the application. Therefore, it is important to consider under what conditions
dynamically generated content can be cached while still returning the correct response. This requires
knowledge of the application, its possible states, and other data, such as parameters that
ensure the dynamic data is generated in a deterministic manner [3].
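One way to picture this requirement is a cache key that includes every parameter that
influences the generated output. The sketch below is an assumption-laden illustration:
`render` stands in for a deterministic page generator, and the path and parameter names
are hypothetical.

```python
def cache_key(path, params):
    """Build a key from the path plus every parameter that affects the output."""
    return (path, tuple(sorted(params.items())))


render_calls = 0

def render(path, params):
    """Stand-in for a deterministic dynamic page generator."""
    global render_calls
    render_calls += 1
    return f"{path}?{sorted(params.items())}"


dynamic_cache = {}

def get_page(path, params):
    key = cache_key(path, params)
    if key not in dynamic_cache:
        dynamic_cache[key] = render(path, params)
    return dynamic_cache[key]


get_page("/search", {"q": "cache", "lang": "en"})
get_page("/search", {"lang": "en", "q": "cache"})   # same state, different order
get_page("/search", {"q": "proxy", "lang": "en"})   # different state
print(render_calls, len(dynamic_cache))
```

If any state that changes the output (for example a user session) were left out of the
key, the cache could return a page generated for a different state, which is exactly the
risk the paragraph above describes.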
2.5 MD5 Algorithm
MD5, published by Ron Rivest in 1992, is a cryptographic hash algorithm that
succeeded the MD4 algorithm. MD5 takes an input of any length and generates a digest of
fixed length (128 bits, written as 32 hexadecimal characters). Because the algorithm is
deterministic, a particular data string always generates the same MD5 hash.
MD5 is reported to offer several advantages over its predecessors (such as MD4) and its
competitors (such as SHA and SHA-1) [11]. It is a one-way cryptographic hash. It accepts
inputs of any length but still generates a fixed-length output. It is fast, and it is
highly unlikely that two different input strings that were not deliberately constructed to
collide hash to the same digest. Practical collision attacks on MD5 are known, so it should
not be relied on against adversarial inputs. Finally, MD5 is reliable in the sense that the
same input string always yields the same output digest.
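The properties listed above can be checked with Python's standard `hashlib` module. The
helper name `md5_hex` and the sample page are illustrative only.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """Return the MD5 digest of data as a hexadecimal string."""
    return hashlib.md5(data).hexdigest()


page = b"<html><body>Hello</body></html>"
digest = md5_hex(page)

print(len(digest))                      # 32 hex characters = 128 bits
print(digest == md5_hex(page))          # True: same input, same digest
print(digest == md5_hex(page + b" "))   # False: one extra byte changes the digest
```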
3. EXPERIMENTAL SETUP
3.1 Changing of proxy server
In most organizations and institutions, the main server does not support proxy
caching, so it is difficult to use the main server as the cache server. The proxy therefore
has to be changed from the main server to another server [2].
The steps to switch a machine to the other proxy are as follows:
1. Open the browser, for example Internet Explorer.
2. In Internet Explorer, pull down the Tools menu and click Internet Options...
3. Click the Connections tab.
4. Click the LAN Settings... button.
5. In the Address: box, change "proxy1 Address" to "proxy2 Address" or vice versa and
click OK.
6. Click OK on the Internet Options dialogue box to return to the browser screen; external
sites can now be reached through the selected proxy.
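The steps above configure a browser by hand. For a scripted client, the same switch can
be sketched with Python's standard `urllib.request` module. The proxy address below is a
placeholder, not the address used in the experiments, and must be replaced with the real
host and port of the caching proxy.

```python
import urllib.request

# Hypothetical address of the second (caching) proxy; replace with the real one.
PROXY2 = "http://proxy2.example:3128"

# Route both HTTP and HTTPS requests made through this opener via PROXY2.
proxy_handler = urllib.request.ProxyHandler({"http": PROXY2, "https": PROXY2})
opener = urllib.request.build_opener(proxy_handler)

# A call such as opener.open("http://www.example.org/") would now go through
# PROXY2. It is left commented out here because the placeholder host does not exist.
```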
3.2 Duplication of Data
Duplication of data means storing multiple copies of the same data object. When a
web page or object is cached, it is stored in cache memory; when different users request
the same page, multiple copies of it can end up in the cache, which wastes storage space.
Since maintaining a cache is expensive, such waste is not affordable. To avoid the
duplication of data objects or web pages, a duplicate suppression mechanism is used [7]. A
duplicate copy saved in the proxy cache occupies additional storage; the analysis in [4]
shows the effect of duplication on cache space.
3.3 Duplicate Suppression
Storage space requirements can be reduced by avoiding duplicate copies of the same
data. Content Engine provides the option to suppress storage of duplicate content
elements. Duplicate suppression applies to any kind of content. Incoming content is not
added to the storage area if identical content already exists there; only unique content is
added [14].
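A minimal sketch of duplicate suppression with soft links is shown below. It assumes,
for illustration, that identical content is detected by comparing MD5 digests and that a
soft link is simply a mapping from a URL to the digest of its content. The class name
`DedupCache` and the URLs are hypothetical, and a production cache might additionally
compare the bytes themselves to guard against digest collisions.

```python
import hashlib


class DedupCache:
    """Stores each distinct body once; URLs act as soft links to a digest."""

    def __init__(self):
        self.blobs = {}   # digest -> body, stored once
        self.links = {}   # url -> digest, the "soft link"

    def put(self, url, body):
        """Cache body under url; return True only if new content was stored."""
        digest = hashlib.md5(body).hexdigest()
        is_new = digest not in self.blobs
        if is_new:
            self.blobs[digest] = body
        self.links[url] = digest
        return is_new

    def get(self, url):
        digest = self.links.get(url)
        return None if digest is None else self.blobs[digest]


cache = DedupCache()
first = cache.put("http://mirror-a.example/file", b"same bytes")
second = cache.put("http://mirror-b.example/file", b"same bytes")
print(first, second, len(cache.blobs), len(cache.links))
```

Both URLs remain retrievable, but the payload occupies storage only once.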
Because the network is large, there are many pages on the web, and most of them will not
be referenced multiple times by any one cache. Page references follow a distribution
similar to Zipf's law: the probability with which the K-th most popular page is referenced
is proportional to 1/K [9].
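To make the 1/K relationship concrete, the following sketch computes reference
probabilities under a pure Zipf distribution. The catalogue size of 1000 pages is an
arbitrary assumption, and the probabilities are normalized so that they sum to one.

```python
def zipf_probabilities(n):
    """P(rank k) = (1/k) / H_n, where H_n normalizes over n pages."""
    harmonic = sum(1.0 / k for k in range(1, n + 1))
    return [(1.0 / k) / harmonic for k in range(1, n + 1)]


probs = zipf_probabilities(1000)
top_10_share = sum(probs[:10])
print(f"share of requests going to the 10 most popular pages: {top_10_share:.3f}")
```

Under these assumptions a small set of popular pages attracts a large share of
requests, while the long tail of remaining pages is rarely referenced again.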
3.5 Experimental Results
The experiments were carried out in the laboratory of our institute. Some popular
websites were selected and used to analyse the access latency of the browser under
different conditions. Keyword-based searches were also used to measure latency by type of
content, for both text search and image search.
Table 2. Response time of search engine for Text and Image Search.
             Text Search                     Image Search
Keywords     From Web       From Cache       From Web       From Cache
             Server         Server           Server         Server
SVKM         250            140              230            200
NMIMS        140            130              300            100
RCPIT        250            120              350            150
CANNON       240            130              250            100
SAMSUNG      210            140              640            200
NOKIA        250            190              240            120
MATLAB       240            160              280            160
OPERA        250            150              120            120
SIEMENS      230            160              310            100
MICROMAX     160            140              190            110
MPSC         170            140              180            100
UPSC         210            150              150            140
IRCTC        160            140              330            90
RRB          260            120              310            70
Table 2 shows the reduction in browser response time when a page is fetched from
the cache server instead of the web server. The first column lists the keywords used for
the analysis; the same keywords are used for text search and image search. The table gives
the response time of the browser for both kinds of search. It shows a considerable
reduction in access latency when the page is fetched from the proxy cache.
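The percentage reduction for each keyword can be computed directly from the values in
Table 2. The sketch below copies the table values as printed; the helper name
`reduction_percent` is introduced here for illustration.

```python
# (keyword, text_web, text_cache, image_web, image_cache), copied from Table 2
TABLE2 = [
    ("SVKM", 250, 140, 230, 200), ("NMIMS", 140, 130, 300, 100),
    ("RCPIT", 250, 120, 350, 150), ("CANNON", 240, 130, 250, 100),
    ("SAMSUNG", 210, 140, 640, 200), ("NOKIA", 250, 190, 240, 120),
    ("MATLAB", 240, 160, 280, 160), ("OPERA", 250, 150, 120, 120),
    ("SIEMENS", 230, 160, 310, 100), ("MICROMAX", 160, 140, 190, 110),
    ("MPSC", 170, 140, 180, 100), ("UPSC", 210, 150, 150, 140),
    ("IRCTC", 160, 140, 330, 90), ("RRB", 260, 120, 310, 70),
]


def reduction_percent(from_web, from_cache):
    """Reduction in response time relative to the web-server fetch."""
    return 100.0 * (from_web - from_cache) / from_web


for keyword, tw, tc, iw, ic in TABLE2:
    print(f"{keyword:10s} text {reduction_percent(tw, tc):5.1f}%  "
          f"image {reduction_percent(iw, ic):5.1f}%")
```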
Figure 2 compares, for text search on selected keywords, the response time when
the page is fetched from the main web server with the response time when it is fetched from
the proxy cache. For each keyword, the first bar shows the response time from the web
server and the second bar shows the response time from the local proxy cache server, on
which the content aliasing algorithm is implemented. The figure shows a considerable
reduction in response time.
Figure 2. Response time of search engine for Text Search
According to Figure 2, text search for a keyword yields a reduction in response
time of 40 percent or more. For some keywords, such as Samsung, IRCTC, RRB, and Siemens,
the response time is reduced by more than 70 percent, whereas for Opera, SVKM, and UPSC
the reduction is negligible or at most 10 percent. This is attributed to dynamic content
in the search results.
Figure 3 compares, for image search on the same keywords, the response time when
the page is fetched from the main web server with the response time when it is fetched from
the proxy cache. For each keyword, the first bar shows the response time from the web
server and the second bar shows the response time from the local proxy cache server, on
which the content aliasing algorithm is implemented. The figure shows a difference between
the two response times.
Figure 3. Response time of Search engine for Image Search
According to Figure 3, image search for a keyword yields only a small reduction in
response time, because images are more dynamic than text.
Table 3. Connection time and response time of browser for some websites.
                             From Web Server           From Cache Server
WEBSITE                      Connection   Response     Connection   Response
www.nmims.edu                7000         44000        3000         14000
www.rcpit.ac.in              6120         26140        3920         10310
www.mpsc.gov.in              5800         25700        3200         6390
www.upsc.gov.in              1890         4760         320          690
www.unipune.ac.in            2480         8600         1130         1580
www.wipro.com                2300         24750        900          3780
www.infosys.com              1710         18180        770          1980
www.techmahindra.com         990          18000        1260         7250
www.jaihindcollege.com       1210         13230        500          1170
www.jaihindcollege.ac.in     1800         15930        540          1040
www.msbte.com                1800         10170        810          1130
www.msbshse.ac.in            1530         4550         540          1040
www.cbse.nic.in              1130         5580         630          900
www.irctc.com                1710         12960        1670         3240
Table 3 shows the connection time and response time of the browser for various
sites, comparing a fetch from the cache server with a fetch from the web server. The first
column lists the websites used for the analysis. The table shows a considerable reduction
in access latency when the page is fetched from the proxy cache instead of the main
server.
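The values in Table 3 can also be screened for the sites that benefit least from the
cache. The sketch below copies the table values as printed and identifies any site whose
connection was slower through the cache, together with the site showing the smallest
relative gain in response time. The variable names are introduced here for illustration.

```python
# (site, conn_web, resp_web, conn_cache, resp_cache), copied from Table 3
TABLE3 = [
    ("www.nmims.edu", 7000, 44000, 3000, 14000),
    ("www.rcpit.ac.in", 6120, 26140, 3920, 10310),
    ("www.mpsc.gov.in", 5800, 25700, 3200, 6390),
    ("www.upsc.gov.in", 1890, 4760, 320, 690),
    ("www.unipune.ac.in", 2480, 8600, 1130, 1580),
    ("www.wipro.com", 2300, 24750, 900, 3780),
    ("www.infosys.com", 1710, 18180, 770, 1980),
    ("www.techmahindra.com", 990, 18000, 1260, 7250),
    ("www.jaihindcollege.com", 1210, 13230, 500, 1170),
    ("www.jaihindcollege.ac.in", 1800, 15930, 540, 1040),
    ("www.msbte.com", 1800, 10170, 810, 1130),
    ("www.msbshse.ac.in", 1530, 4550, 540, 1040),
    ("www.cbse.nic.in", 1130, 5580, 630, 900),
    ("www.irctc.com", 1710, 12960, 1670, 3240),
]

# Sites whose connection time was higher through the cache than from the web server.
slower_connection = [site for site, cw, _rw, cc, _rc in TABLE3 if cc > cw]

# Site with the smallest relative reduction in response time.
smallest_gain = min(TABLE3, key=lambda row: (row[2] - row[4]) / row[2])
smallest_gain_percent = 100.0 * (smallest_gain[2] - smallest_gain[4]) / smallest_gain[2]

print(slower_connection)
print(smallest_gain[0], round(smallest_gain_percent, 1))
```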
Figure 4. Connection time for different websites
Figure 4 shows the effect of content aliasing on the access time of the web browser
in terms of connection time. In most cases the reduction in connection time is more than 50
percent, and in some cases it is 30 to 50 percent. For the IRCTC website the reduction in
connection time is negligible, and for the Tech Mahindra website the connection time
increased. This is attributed to the larger amount of dynamic content on that website.
Figure 5. Response time for different websites
Figure 5 compares the response time of the browser for different websites. When the
web page is fetched from the cache server, the response time is lower. The reduction in
response time is roughly 60 percent or more in every case, and in some cases it is more
than 90 percent. Using content aliasing in the proxy cache server therefore saves a
significant amount of time in terms of both response time and connection time.
It is clear that a considerable amount of user time is saved by using content
aliasing. The reduction in access latency was achieved while also considering other
parameters such as cache size and stale data.
4. CONCLUSION
The experimental results demonstrate the need for a methodology that improves web
access performance, enhances bandwidth utilization, and provides greater connectivity
speed. The suggested design improves web performance in terms of reduced latency, improved
user response time, and optimal use of the existing bandwidth through web caching. Content
aliasing was successfully detected using a web-based application, database queries, and
file system calls. A considerable amount of duplicate storage can be avoided through the
suggested methodology; it is, therefore, a very useful mechanism for web proxy caches.
Moreover, the solution is able to keep cached pages synchronized with the pages on the web
server, checking for new pages when needed. This work can be further optimized by a daemon
process that runs periodically to check the consistency of the cached data against the data
at the web server. It can be scheduled during slack time, when traffic is low, so that it
does not add to the load on the bandwidth during busy periods, and it can also update the
TTL (Time to Live) period of the cached data.
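One maintenance pass of such a daemon might look like the sketch below. It is an
illustration under stated assumptions rather than the proposed design: the cache is
modelled as a dictionary of entries with a body, an MD5 digest, and an expiry time;
`fetch_from_origin` is a hypothetical callable that returns the current content for a
URL; and scheduling the pass during slack time is left to an external scheduler.

```python
import hashlib
import time


def revalidate(cache, fetch_from_origin, ttl_seconds, now=None):
    """Refresh expired entries and renew their TTL.

    cache maps url -> {"body": bytes, "digest": str, "expires": float}.
    Returns the URLs whose content had changed at the origin.
    """
    now = time.time() if now is None else now
    changed = []
    for url, entry in cache.items():
        if entry["expires"] > now:
            continue                        # still fresh, nothing to do
        body = fetch_from_origin(url)
        digest = hashlib.md5(body).hexdigest()
        if digest != entry["digest"]:
            entry["body"], entry["digest"] = body, digest
            changed.append(url)
        entry["expires"] = now + ttl_seconds
    return changed


# Hypothetical origin content and an initially expired cache.
origin = {"/a": b"new", "/b": b"same"}
cache = {
    "/a": {"body": b"old", "digest": hashlib.md5(b"old").hexdigest(), "expires": 0.0},
    "/b": {"body": b"same", "digest": hashlib.md5(b"same").hexdigest(), "expires": 0.0},
}
changed = revalidate(cache, origin.__getitem__, ttl_seconds=3600, now=100.0)
print(changed)
```

This sketch downloads the full body to compare digests; a deployment concerned with
bandwidth would more likely ask the origin whether the content has changed before
transferring it.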
REFERENCES
[1] Kartik Bommepally, Glisa T. K., Jeena J. Prakash, Sanasam Ranbir Singh and Hema A
Murthy “Internet Activity Analysis through Proxy Log” IEEE, 2010.
[2] E-Services Team, “Changing Proxy Server” by the Robert Gordon University, School
hill, Aberdeen, Scotland-2006.
[3] Chen, W.; Martin, P.; Hassanein, H.S., "Caching dynamic content on the Web,"
Canadian Conference on Electrical and Computer Engineering, 2003, vol.2, no., pp.
947- 950 vol.2, 4-7 May 2003.
[4] Sadhna Ahuja, Tao Wu and Sudhir Dixit “On the Effects of Content Compression on
Web Cache Performance,” Proceedings of the International Conference on Information
Technology: Computers and Communications, 2003.
[5] Mark S. Squillante, David D. Yao and Li Zhang “Web Traffic Modeling and Web
Server Performance Analysis” Proceedings of the 38th Conference on Decision &
Control, Phoenix, Arizona, USA, December 1999.
[6] C. E. Wills and M. Mikhailov, “Studying the Impact of More Complete Server
Information on Web Caching,” Computer Communications, vol. 24, no. 2, pp. 184-190,
May 2000.
[7] J Wang “A Survey of Web Caching Schemes for the Internet” - Cornell Network
Research Group (C/NRG), Department of Computer Science, Cornell University 1999.
[8] N. Shivakumar and H. Garcia-Molina, “Finding near Replicas of Documents on the
Web” Proc. Workshop on Web Databases, Mar. 1998.
[9] L. Breslau, P. Cao, L. Fan, G. Phillips, and S. Shenker. Web caching and Zipf like
Distributions: Evidence and Implications. In Proc. Infocom ’99. New York, NY, March,
1999.
[10] Guerrero, C.; Juiz, C.; Puigjaner, R.; "Web Performance and Behavior Ontology,"
Complex, Intelligent and Software Intensive Systems, 2008. CISIS 2008. International
Conference on, vol., no., pp.219-225, 4-7 March 2008.
[11] Kimmo Jarvinen, Matti Tommiska and Jorma Skytta, “Hardware Implementation
Analysis of the MD5 Hash Algorithm,” IEEE Computer Society. 2005.
[12] Andrzej Sieminski, “The impact of Proxy caches on Browser Latency” International
Journal of Computer Science & Applications, 2005, Vol. II, No. II, pp. 5 – 21.
[13] S B Patil, Sachin Chavan, Preeti Patil; “High quality design to enhance and improve
performance of large scale web applications” International Journal of Computer
Engineering and Technology (IJCET), Volume 3, Issue 1, January- June (2012),
pp. 198-205, ISSN Print: 0976 – 6367, ISSN Online: 0976 – 6375.
[14] S.Vikram Phaneendra, “Minimizing Client-Server Traffic Based on AJAX”,
International journal of Computer Engineering & Technology (IJCET), Volume 3,
Issue 1, 2012, pp. 10 - 16, ISSN Print: 0976 – 6367, ISSN Online: 0976 – 6375.
[15] A. Suganthy, G.S.Sumithra, J.Hindusha, A.Gayathri and S.Girija, “Semantic Web
Services and its Challenges”, International journal of Computer Engineering &
Technology (IJCET), Volume 1, Issue 2, 2010, pp. 26 - 37, ISSN Print: 0976 – 6367,
ISSN Online: 0976 – 6375.

Improving access latency of web browser by using content aliasing in

  • 1.
    International Journal ofComputer Engineering and Technology (IJCET), ISSN 0976- 6367(Print), ISSN 0976 – 6375(Online) Volume 4, Issue 2, March – April (2013), © IAEME 356 IMPROVING ACCESS LATENCY OF WEB BROWSER BY USING CONTENT ALIASING IN PROXY CACHE SERVER Sachin Chavan1 , Nitin Chavan2 1 Department of Computer Engineering, MPSTME, NMIMS, Shirpur 2 Department of Information Technology, MPSTME, NMIMS, Shirpur ABSTRACT The web community is growing so quickly that the number of clients accessing web servers is increasing nearly tremendously. This rapid increase of web clients affected several aspects and characteristics of web such as reduced network bandwidth, increased latency, and higher response time for users who require large scale web services. This paper considers different types of proxy actions and proposes a novel design and methodology to address these issues. Focused on studies in what way they influence the browser display time. It discusses also acceptable loading times and the scope of cacheable objects. The methodology works by analysing content in the proxy cache, identifying content aliasing, duplicate suppression and by the creation of the respective soft links. The present solution makes intelligent use of the proxy cache server to overcome these problems. In this study proxies were designed to enable network administrators to control internet access from within intranet. But when proxy cache is used, there develops the problem of Aliasing. Aliasing in proxy server caches occurs when the same content is stored in the cache several times. The present methodology improves performance in case of access latency and browser response time at the same time it avoids storing the same content in cache multiple times those results in wastage of storage space. KEYWORDS: Access Latency, Cache, Web Proxy, Mirroring, and Duplicate Suppression, Content aliasing. 
INTERNATIONAL JOURNAL OF COMPUTER ENGINEERING & TECHNOLOGY (IJCET) ISSN 0976 – 6367(Print) ISSN 0976 – 6375(Online) Volume 4, Issue 2, March – April (2013), pp. 356-365 © IAEME: www.iaeme.com/ijcet.asp Journal Impact Factor (2013): 6.1302 (Calculated by GISI) www.jifactor.com IJCET © I A E M E
  • 2.
    International Journal ofComputer Engineering and Technology (IJCET), ISSN 0976- 6367(Print), ISSN 0976 – 6375(Online) Volume 4, Issue 2, March – April (2013), © IAEME 357 1. INTRODUCTION In the field of web server management, researchers have focused on aliasing in proxy server caches for a long time. Web caching consists of storing frequently referred objects on a caching server instead of the original server, so that web servers can make better use of network bandwidth, reduce the workload on servers, and improve the response time for users. Aliasing means giving multiple names to the same thing. The proxy cache also stores all of the images and sub files for the visited pages, so if the user jumps to a new page within the same site that uses, for example, the same images, the proxy cache has them already stored and can load them into the user's browser quicker than having to retrieve them from the Web site server's remote site. Aliasing in proxy server caches occurs when the same content is stored in cache multiple times. On the World Wide Web, aliasing commonly occurs when a client makes two requests, and both the requests have the same payload. Currently, browsers perform cache lookups using Uniform Resource Locators (URLs) as identifiers. Websites that contain the same content are called mirrors. Mirrors are redundancy mechanisms built into the web space to serve web pages faster, but they cost in terms of cache space. As the amount of web traffic increases, the efficient utilization of network bandwidth increasingly becomes more important. The Technique needs to analyse web traffic to understand its characteristics. That will optimize the use of network bandwidth to reduce network latency and to improve response time for users [8]. A proxy cache is a shared network device that can undertake Web transactions on behalf of a client, and, like the browser, the proxy cache stores the content. 
Subsequent requests for this content, by this or any other client of the cache will trigger the cache to deliver the locally stored copy of the content, avoiding a repeat of the download from the original content source [4]. Figure 1. Concept of Caching (Proxy Cache) Bandwidth Saving and Traffic Reduction Proxy Cache Server
  • 3.
    International Journal ofComputer Engineering and Technology (IJCET), ISSN 0976- 6367(Print), ISSN 0976 – 6375(Online) Volume 4, Issue 2, March – April (2013), © IAEME 358 1.1 Advantages of Caching 1. Web caching reduces the workload of the remote Web server 2. Client can obtain a cached copy at the proxy if the remote server is not available. 3. It provides us a chance to analyze an organization usage patterns. 1.2 Disadvantages of using a caching: 1. A client might be looking at stale data due to the lack of proper proxy updating. 2. The access latency may increase in the case of a cache miss due to the extra proxy processing. 3. A single proxy cache is always a bottleneck. 4. A single proxy is a single point of failure. 2. RELATED WORK 2.1 The Access Latency Latency is defined as the delay between a request for a Web page and receiving that page in its entirety. The latency problem occurs when users judge the download as too long. Unacceptable latency does not only adversely effects user satisfaction. Web pages that are loaded faster are judged to be significantly more interesting than their slower counterparts [12]. Studies on human cognition revealed that the response time shorter than 0.1 second is unnoticeable and the delay of 1 second matches the pace of interactive dialog. Following table shows the transfer rate of different connection types. Table 1. Transfer Rates for different connection Type Connection Type Slow Normal Maximum Modem 33k6 <2.734 ≈3 ≈3.65 Modem 56k <4.199 ≈5 ≈6.08 ISDN 64k <5.469 ≈6 ≈6.94 Cable <9.766 ≈17 by provider ADSL <12.21 ≈24 ≈732 Ethernet 10Base-T (10 Megabits/sec) <73 ≈195 ≈977 Table shows the different parameters that affects the access time of browser. The different parameters are type of connection used by the user and the condition of connection. The timing of internet use also affects on access latency due to bandwidth sharing. 2.2 Web Traffic The amount of data sent and received by visitors to a website is web traffic. 
It is analysis to see the popularity of web sites and individual pages or sections within a site. Web traffic can be analyzed by viewing the traffic statistics found in the web server log file, an automatically generated list of all the pages served. Traffic analysis is conducted using access logs from web proxy server. Each entry in access logs records the URL of document being requested, date and time of the request, the name of the client host making the request, number of bytes returns to requesting client, and information that describe how the clients request was treated as proxy [1]. Processing these log entries can produce useful summary statistics about workload volume, document type and sizes, popularity of document and proxy cache performance [5].
  • 4.
    International Journal ofComputer Engineering and Technology (IJCET), ISSN 0976- 6367(Print), ISSN 0976 – 6375(Online) Volume 4, Issue 2, March – April (2013), © IAEME 359 2.3 Static Caching It is a new approach of web caching which uses yesterday’s log to predict the today’s user request. The static caching algorithm defines a fixed set of URLs by analyzing the logs of previous periods. It then calculates the value of the unique URL. Depending on the value, URLs are arranged in the descending order, and the URL with the highest value is selected. This set of URLs is known as the working set. When a user requests a document and the document is present in the working set, the request is fulfilled from the cache. Otherwise, the user request is fulfilled from the origin server [6]. 2.4 Dynamic Caching Dynamic caching is more complex than static caching and requires detailed knowledge of the application. One must consider the candidates for dynamic caching carefully since, by its very nature, dynamically generated content can be different based on the state of the application. Therefore, it is important to consider under what conditions dynamically generated content can be cached returning the correct response. This requires knowledge of the application, its possible states, and other data, such as parameters that ensure the dynamic data is generated in a deterministic manner [3]. 2.5 MD5 Algorithm MD5, developed by Ron Rives in 1992, is a comparison cryptographic hash algorithm that succeeded the MD4 algorithm. MD5 takes an input of any length and generates an MD5 digest of fixed length (128 bits or 32 characters). Because MD5 uses the same algorithm every time, a particular data string always generates the same MD5 hash every time. MD5 cryptographic hash offers several advantages over its predecessors (such as MD4) and its competitors (such as, SHA and SHA.1). One of these advantages is that MD5 is a one way cryptographic hash. 
Another advantage is that MD5 can accept inputs of any length but still generates a fixed length output. MD5 is fast, and it is highly unlikely that two different strings can hash to the same digest. Moreover, with MD5 it is also highly unlikely that two different input strings can hash to the same digest. Furthermore, MD5 is reliable in the sense that the same input string always yields the same output digest every time [11]. 3. EXPERIMENTAL SETUP 3.1 Changing of proxy server In most of the organization’s or institution server does not support the proxy cache, so it is difficult to use main server as cache server so we have to change the proxy server from main server to other server [2]. Following are the steps to switch machine to other proxy: 1. Open the browser for ex. Internet Explorer 2. In internet explorer pull down the Tools menu and click Internet Options... 3. Click the Connections tab: 4. click the LAN Settings... button: 5. In the Address: box change "proxy1 Address" to "proxy2 Address" or vice versa and click OK. 6. Click OK on the Internet Options dialogue box to get back to the browser screen and you will now be able to get external sites.
3.2 Duplication of Data

Duplication of data means storing multiple copies of the same data object. When an object or web page is cached, it is stored in cache memory. When different users then request the same page, multiple copies of that object or web page end up in cache memory, which wastes storage space. Because maintaining a cache is expensive, such waste is not affordable. To avoid duplicated data objects or web pages, a duplicate suppression mechanism is used [7]. Duplicate copies saved in the proxy cache occupy additional storage space; the analysis given in [4] shows the effect of duplication on cache space.

3.3 Duplicate Suppression

Storage space requirements can be reduced by avoiding duplicate copies of the same data. Content Engine provides the option to suppress storage of duplicate content elements. Duplicate suppression applies to any kind of content: incoming content is not added to the storage area if identical content already exists there, so only unique content is added [14].

Because the network is large, there are many pages on the web, and most of those pages will not be referenced multiple times by any one cache. The probability with which the Kth most popular page is referenced is proportional to 1/K; that is, re-references follow a distribution similar to Zipf's law [9].

3.5 Experimental Results

The experiments were carried out in the laboratory of our institute. Several popular websites were selected and used to analyse the access latency of the browser under different conditions. Keyword-based search was also used to measure latency by type of content, either text search or image search.

Table 2. Response time of search engine for Text and Image Search.

             Text Search                           Image Search
Keywords     From Web Server   From Cache Server   From Web Server   From Cache Server
SVKM         250               140                 230               200
NMIMS        140               130                 300               100
RCPIT        250               120                 350               150
CANNON       240               130                 250               100
SAMSUNG      210               140                 640               200
NOKIA        250               190                 240               120
MATLAB       240               160                 280               160
OPERA        250               150                 120               120
SIEMENS      230               160                 310               100
MICROMAX     160               140                 190               110
MPSC         170               140                 180               100
UPSC         210               150                 150               140
IRCTC        160               140                 330               90
RRB          260               120                 310               70
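
The reductions implied by Table 2 can be recomputed directly from its rows. The sketch below assumes the usual definition, reduction = (web server time - cache server time) / web server time x 100, and uses a few rows copied from the table. The function name `percent_reduction` and the choice of rows are illustrative only.

```python
def percent_reduction(from_web, from_cache):
    """Percentage by which the cache-server time is below the web-server time."""
    return (from_web - from_cache) / from_web * 100.0


# (keyword, text web, text cache, image web, image cache), copied from Table 2
rows = [
    ("SVKM", 250, 140, 230, 200),
    ("RCPIT", 250, 120, 350, 150),
    ("OPERA", 250, 150, 120, 120),
    ("RRB", 260, 120, 310, 70),
]

for keyword, text_web, text_cache, image_web, image_cache in rows:
    text_gain = percent_reduction(text_web, text_cache)
    image_gain = percent_reduction(image_web, image_cache)
    print(f"{keyword:8s} text {text_gain:5.1f}%   image {image_gain:5.1f}%")
```

Extending `rows` with the remaining keywords gives the per-keyword reduction for the whole table.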
Table 2 shows the reduction in browser response time when a page is fetched from the cache server instead of the web server. The first column lists the keywords used for the analysis; the same keywords are used for both text search and image search. The table gives the browser response time for text search as well as image search. From the table, there is a considerable reduction in access latency when the page is fetched from the proxy cache.

Figure 2 compares, for text search on selected keywords, the response time when the page is fetched from the main server with the response time when it is fetched from the proxy cache. For each keyword, the first bar shows the response time when the page is fetched from the web server, and the second bar shows the response time when it is fetched from the local proxy cache server on which the content aliasing algorithm is implemented. Figure 2 shows a considerable reduction in response time.

Figure 2. Response time of Search engine for Text Search

From Figure 2 it is clear that text search by keyword yields, in general, a reduction in response time of 40 percent or more. For some keywords, such as Samsung, IRCTC, RRB, and Siemens, the response time is reduced by more than 70 percent, whereas for Opera, SVKM, and UPSC the reduction is negligible or at most 10 percent. This is due to dynamic content in the search results.

Figure 3 compares, for image search on the given keywords, the response time when the page is fetched from the main server with the response time when it is fetched from the proxy cache. For each keyword, the first bar shows the response time when the page is fetched from the web server, and the second bar shows the response time when it is fetched from the local proxy cache server on which the content aliasing algorithm is implemented. Figure 3 shows that the two response times differ.
Figure 3. Response time of Search engine for Image Search

From Figure 3 it is clear that image search by keyword yields only a small reduction in response time, because images are more dynamic than text.

Table 3. Connection Time and Response time of browser for some Websites.

                           From Web Server           From Cache Server
WEBSITE                    Connection   Response     Connection   Response
www.nmims.edu              7000         44000        3000         14000
www.rcpit.ac.in            6120         26140        3920         10310
www.mpsc.gov.in            5800         25700        3200         6390
www.upsc.gov.in            1890         4760         320          690
www.unipune.ac.in          2480         8600         1130         1580
www.wipro.com              2300         24750        900          3780
www.infosys.com            1710         18180        770          1980
www.techmahindra.com       990          18000        1260         7250
www.jaihindcollege.com     1210         13230        500          1170
www.jaihindcollege.ac.in   1800         15930        540          1040
www.msbte.com              1800         10170        810          1130
www.msbshse.ac.in          1530         4550         540          1040
www.cbse.nic.in            1130         5580         630          900
www.irctc.com              1710         12960        1670         3240

Table 3 shows the connection time and response time of the browser for various sites. It compares the connection time and response time when a page is fetched from the cache server instead of the web server. The first column lists the websites used for the analysis. From the table, there is a considerable reduction in access latency when the page is fetched from the proxy cache instead of the main server.
Figure 4. Connection time for different Websites

Figure 4 shows the effect of content aliasing on the access time of the web browser in terms of connection time. In most cases the connection time is reduced by more than 50 percent; in some cases the reduction is 30-50 percent. For the IRCTC website the reduction in connection time is negligible, whereas for the 'TECHMAHINDRA' website the connection time increased. This is because that website carries more dynamic content.

Figure 5. Response time for Different Websites

Figure 5 is a comparative graph of the browser response time for different websites. When the web page is fetched from the cache server, the response time is lower. From the graph, the reduction in response time is more than 60 percent in each case, and in some cases it is more than 90 percent. Using content aliasing in the proxy cache server therefore saves a significant amount of time in terms of both response time and connection time.
It is clear that a significant amount of user time is saved by using content aliasing. The reduction in access latency was achieved while also considering other parameters such as cache size and stale data.

4. CONCLUSION

The experimental results and their analysis demonstrate the need for a methodology that improves web access performance, enhancing bandwidth utilization and connectivity speed. The suggested design improves web performance in terms of reduced latency, improved user response time, and optimal use of the existing bandwidth through web caching. Content aliasing was successfully detected using a web-based application, database queries, and file system calls. A considerable amount of duplicate storage can be avoided through the suggested methodology, which makes it a very useful mechanism for web proxy caches. Moreover, the solution successfully keeps cached pages synchronized with the pages on the web server, checking for new pages when needed.

This work can be further optimized with a daemon process designed to run periodically and check the consistency of the cached data against the data at the web server. The daemon can be scheduled during slack time, when traffic is low, so that it places no additional load on the bandwidth, and it can also update the TTL (Time to Live) period of the cached data.

REFERENCES

[1] Kartik Bommepally, Glisa T. K., Jeena J. Prakash, Sanasam Ranbir Singh and Hema A. Murthy, "Internet Activity Analysis through Proxy Log," IEEE, 2010.
[2] E-Services Team, "Changing Proxy Server," The Robert Gordon University, Schoolhill, Aberdeen, Scotland, 2006.
[3] W. Chen, P. Martin and H. S. Hassanein, "Caching dynamic content on the Web," Canadian Conference on Electrical and Computer Engineering, 2003, vol. 2, pp. 947-950, 4-7 May 2003.
[4] Sadhna Ahuja, Tao Wu and Sudhir Dixit, "On the Effects of Content Compression on Web Cache Performance," Proceedings of the International Conference on Information Technology: Computers and Communications, 2003.
[5] Mark S. Squillante, David D. Yao and Li Zhang, "Web Traffic Modeling and Web Server Performance Analysis," Proceedings of the 38th Conference on Decision & Control, Phoenix, Arizona, USA, December 1999.
[6] C. E. Wills and M. Mikhailov, "Studying the Impact of More Complete Server Information on Web Caching," Computer Communications, vol. 24, no. 2, pp. 184-190, May 2000.
[7] J. Wang, "A Survey of Web Caching Schemes for the Internet," Cornell Network Research Group (C/NRG), Department of Computer Science, Cornell University, 1999.
[8] N. Shivakumar and H. Garcia-Molina, "Finding near Replicas of Documents on the Web," Proc. Workshop on Web Databases, Mar. 1998.
[9] L. Breslau, P. Cao, L. Fan, G. Phillips and S. Shenker, "Web caching and Zipf like Distributions: Evidence and Implications," Proc. Infocom '99, New York, NY, March 1999.
[10] C. Guerrero, C. Juiz and R. Puigjaner, "Web Performance and Behavior Ontology," International Conference on Complex, Intelligent and Software Intensive Systems (CISIS 2008), pp. 219-225, 4-7 March 2008.
[11] Kimmo Jarvinen, Matti Tommiska and Jorma Skytta, "Hardware Implementation Analysis of the MD5 Hash Algorithm," IEEE Computer Society, 2005.
[12] Andrzej Sieminski, "The impact of Proxy caches on Browser Latency," International Journal of Computer Science & Applications, 2005, Vol. II, No. II, pp. 5-21.
[13] S. B. Patil, Sachin Chavan and Preeti Patil, "High quality design to enhance and improve performance of large scale web applications," International Journal of Computer Engineering and Technology (IJCET), Volume 3, Issue 1, January-June 2012, pp. 198-205, ISSN Print: 0976-6367, ISSN Online: 0976-6375.
[14] S. Vikram Phaneendra, "Minimizing Client-Server Traffic Based on AJAX," International Journal of Computer Engineering & Technology (IJCET), Volume 3, Issue 1, 2012, pp. 10-16, ISSN Print: 0976-6367, ISSN Online: 0976-6375.
[15] A. Suganthy, G. S. Sumithra, J. Hindusha, A. Gayathri and S. Girija, "Semantic Web Services and its Challenges," International Journal of Computer Engineering & Technology (IJCET), Volume 1, Issue 2, 2010, pp. 26-37, ISSN Print: 0976-6367, ISSN Online: 0976-6375.