International Journal of Computer Engineering and Technology (IJCET), ISSN 0976-6367(Print),
ISSN 0976 - 6375(Online), Volume 5, Issue 4, April (2014), pp. 210-217 © IAEME
NETWORK ARCHITECTURE AND DESIGN FOR OPTIMIZED WEB PAGE
CLUSTERING WITH CUSTOMIZED LOCAL PROXY SERVER TO REDUCE
USER-PERCEIVED LATENCY AND NETWORK RESOURCE
REQUIREMENTS IN THE WORLD WIDE WEB
Dr. Suryakant B Patil (1), Mr. Sangramsinh Deshmukh (2), Mr. Amey Redkar (3), Dr. Preeti Patil (4)
(1) Professor, JSPM’s Imperial College of Engineering & Research, Wagholi, Pune
(2, 3) Research Scholar, JSPM’s ICOER, Wagholi, Pune
(4) Dean (SA), HOD & Professor, KIT’s COE, Kolhapur
ABSTRACT
This paper shows that a customized local proxy server can optimize the prefetching of multimedia
and other information from web page clusters. Information retrieved from different sources often
contains the same content, links, pages, or multimedia, and such duplicated information can likewise
be optimized at the customized local proxy server. The underlying premise of our approach is that,
in cluster accesses, the next page, multimedia object, or document requested by users of the web
server typically depends on the current and previous ones requested. Furthermore, if the requested
pages contain many links to a particular page, multimedia object, or document, that item has a
higher probability of being the next one requested. An experimental evaluation of the prefetching
mechanism is presented using real server logs.
Categories and Subject Descriptors
C.2.1 [Network Architecture and Design]: Computer-Communication Networks.
GENERAL TERMS
Design, Performance, Reliability, Experimentation, Algorithms, Standardization.
Keywords: Web Page Cluster, Traffic Optimization, Customized Proxy Server.
INTERNATIONAL JOURNAL OF COMPUTER ENGINEERING &
TECHNOLOGY (IJCET)
ISSN 0976 – 6367(Print)
ISSN 0976 – 6375(Online)
Volume 5, Issue 4, April (2014), pp. 210-217
© IAEME: www.iaeme.com/ijcet.asp
Journal Impact Factor (2014): 8.5328 (Calculated by GISI)
www.jifactor.com
I. INTRODUCTION
In the field of web server management, researchers have long studied aliasing in proxy server
caches. Web caching consists of storing frequently referenced objects on a caching server instead
of the original server, so that network bandwidth is used more efficiently, the workload on web
servers is reduced, and the response time for users improves. Aliasing means giving multiple names
to the same thing [9].
Proxies are emerging as an important way to reduce user-perceived latency and network
resource requirements in the Internet. While relaying traffic between servers and clients, a proxy can
cache resources in the hope of satisfying future client requests directly at the proxy. However,
existing techniques for caching text and images are not appropriate for the rapidly growing number
of continuous media streams [10]. In addition, high latency and loss rates in the Internet make it
difficult to stream audio and video without introducing a large playback delay. Our results show
that information fetched from different sources or web pages often contains the same links,
multimedia, or content [11]. To address these problems, we propose that, instead of caching entire
pages, multimedia objects, or documents (which may be quite large), the proxy should store this
recurring information at our customized local proxy server.
II. LITERATURE SURVEY
Ngamsuriyaroj et al. evaluated the performance of load-balanced web proxies [1]. Other work
considered approximate mirroring, or “syntactic similarity” [3]. Although its authors introduce
sophisticated measures of document similarity, they report that most “clusters” of similar documents
in a large crawler data set contain only identical documents. Shivakumar and Garcia-Molina
investigated mirroring in a large crawler data set and reported that in the WebTV client trace far
more aliasing happens than expected. In fact, they reported that 36% of reply bodies are accessible
through more than one URL [7]. Similarly, other authors surveyed techniques for identifying mirrors
on the Internet [3], and a study of mirroring in a large crawler data set reported that roughly 10%
of popular hosts are mirrored to some extent [3].
Large-scale web applications experience high latency across the Internet. Latency grows along
with response time, especially for large-scale web applications accessed by a huge number of
users. This paper proposes a solution to this problem: a high-quality design technique that reduces
the response time required by large-scale web applications.
This improves the performance of web applications, especially ETL in data warehouses with
sizes in terabytes and e-commerce with online payments, and enhances the browsing experience,
especially on social networking sites. The proxy cache is used to synchronise the various systems
at a single point of contact for such large-scale web applications. HTTP header verification and
comparison enable the proposed system to optimize the proxy cache data, and this optimization
improves the response time. The analysis and experimentation indicate an improvement in
performance through reduced latency.
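The header verification and comparison step can be illustrated with a minimal sketch. The paper does not state which HTTP headers are compared, so the use of the standard ETag and Last-Modified validators is an assumption here, and the names CachedEntry and is_still_fresh are hypothetical.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class CachedEntry:
    """A cached response body plus the validators seen when it was stored."""
    body: bytes
    etag: Optional[str]
    last_modified: Optional[str]


def is_still_fresh(entry: CachedEntry, origin_headers: Mapping[str, str]) -> bool:
    """Return True when the origin's validators match the cached ones.

    Assumption: ETag is preferred and Last-Modified is the fallback.
    If neither can be compared, the cached copy is treated as stale.
    """
    # HTTP header names are case-insensitive, so normalise them first.
    headers = {name.lower(): value for name, value in origin_headers.items()}

    etag = headers.get("etag")
    if etag is not None and entry.etag is not None:
        return etag == entry.etag

    modified = headers.get("last-modified")
    if modified is not None and entry.last_modified is not None:
        return modified == entry.last_modified

    return False
```

Under these assumptions, the proxy would serve the cached body when is_still_fresh returns True and fetch the full response from the web server otherwise.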
III. EXPERIMENTATION AND RESULTS
A proxy cache acts as a mediator for requests from browsers seeking resources for a web
application on the respective web servers. Requests initiated by the clients pass through the proxy
cache to the respective web servers.
Fig 1: Web Server Vs Proxy Server w.r.t. Data Size
The response returns from the web servers with the data required by the particular web
application, and the proxy cache keeps one copy before delivering it to the respective clients.
Fig 2: Cumulative Results of Proxy & Web Servers
Figure 2 shows the cumulative results of the proxy and web servers for the traffic of three case
studies, which can be optimized further. Aliasing in proxy server caches occurs when the same
content is stored in the cache multiple times. On the World Wide Web, aliasing commonly occurs
when a client makes two requests for different URLs and both replies carry the same payload [3].
The major problem with using a web cache is the storage space required to store the visited
pages with their objects [15]. Commercial browsers are slowly becoming aware of aliasing in
proxy server caches and are therefore encouraging website designers to make cache-friendly web
pages that avoid aliasing [12]. However, website designers and administrators know little about
the effects of aliasing and how it causes repeated transfers of the same payload [14].
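One way to keep aliased content from occupying cache space more than once is to index stored payloads by a digest of their bytes. The following is a minimal sketch of that idea and not the implementation evaluated in this paper; the class name AliasAwareCache, the choice of SHA-256, and the example URLs are assumptions made for illustration.

```python
import hashlib
from typing import Dict, Optional


class AliasAwareCache:
    """Stores each distinct payload once, however many URLs refer to it."""

    def __init__(self) -> None:
        self._bodies: Dict[str, bytes] = {}  # payload digest -> payload bytes
        self._urls: Dict[str, str] = {}      # URL -> payload digest

    def put(self, url: str, body: bytes) -> bool:
        """Cache body for url; return True if the payload was already stored."""
        digest = hashlib.sha256(body).hexdigest()
        already_stored = digest in self._bodies
        self._bodies.setdefault(digest, body)
        self._urls[url] = digest
        return already_stored

    def get(self, url: str) -> Optional[bytes]:
        """Return the cached payload for url, or None on a miss."""
        digest = self._urls.get(url)
        return None if digest is None else self._bodies[digest]

    def stored_bytes(self) -> int:
        """Total size of the distinct payloads held in the cache."""
        return sum(len(body) for body in self._bodies.values())


cache = AliasAwareCache()
cache.put("http://a.example/logo.png", b"PNGDATA")
aliased = cache.put("http://b.example/img/logo.png", b"PNGDATA")  # same payload
```

In this sketch the second put reports an alias and the payload is held only once. Eviction of payloads that no URL refers to any longer is deliberately left out.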
Fig 3: Cumulative Results of Proxy & Web Servers
Fig 3 shows application- and user-specific web traffic, with the cumulative results of the proxy
and web servers during peak hours, when the download rate at the JSPM ICOER campus is below
average, between 2.3 and 3.5 Mbps. Web traffic is the amount of data sent and received by visitors
to a website. It is analysed to gauge the popularity of web sites and of individual pages or sections
within a site. Web traffic can be analysed by viewing the traffic statistics found in the web server
log file, an automatically generated list of all the pages served.
Traffic analysis is conducted using access logs from the web proxy server. Each entry in the
access logs records the URL of the document requested, the date and time of the request, the name
of the client host making the request, the number of bytes returned to the requesting client, and
information describing how the proxy treated the client's request [1].
Processing these log entries can produce useful summary statistics about workload volume,
document types and sizes, document popularity, and proxy cache performance [7].
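The kind of log processing described above can be sketched as follows. The log format is an assumption: each line is taken to hold a timestamp, client host, proxy treatment (HIT or MISS), byte count, and URL separated by whitespace, and the sample entries are invented for illustration rather than drawn from the logs analysed in this paper.

```python
from collections import Counter

# Hypothetical sample entries: timestamp, client, treatment, bytes, URL.
SAMPLE_LOG = """\
2014-04-01T10:00:00 lab-pc-01 MISS 20480 http://example.org/index.html
2014-04-01T10:00:05 lab-pc-02 HIT 20480 http://example.org/index.html
2014-04-01T10:00:09 lab-pc-01 MISS 512000 http://example.org/video.mp4
2014-04-01T10:00:12 lab-pc-03 HIT 20480 http://example.org/index.html
"""


def summarize(log_text: str) -> dict:
    """Compute workload volume, hit rates, and document popularity."""
    requests = hits = total_bytes = hit_bytes = 0
    popularity = Counter()

    for line in log_text.splitlines():
        parts = line.split()
        if len(parts) != 5:
            continue  # skip lines that do not match the assumed format
        _timestamp, _client, treatment, size_text, url = parts
        size = int(size_text)

        requests += 1
        total_bytes += size
        popularity[url] += 1
        if treatment == "HIT":
            hits += 1
            hit_bytes += size

    return {
        "requests": requests,
        "total_bytes": total_bytes,
        "hit_rate": hits / requests if requests else 0.0,
        "byte_hit_rate": hit_bytes / total_bytes if total_bytes else 0.0,
        "most_popular": popularity.most_common(1),
    }


summary = summarize(SAMPLE_LOG)
```

The request hit rate and the byte hit rate are reported separately because a cache that serves many small documents can show a high request hit rate while saving comparatively little bandwidth.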
Fig. 4: Web Server II Traffic Analysis
Figure 4 shows a marginal traffic reduction across the different case studies in the Web Server II
traffic analysis.
Optimization of Web cache: Optimizing web caching can play a valuable role in improving
service quality for a wide range of Internet users. A web cache is a mechanism for the temporary
storage (caching) of web documents, such as HTML pages and images, to reduce bandwidth usage,
server load, and perceived lag. A web cache stores copies of documents passing through it;
subsequent requests may be satisfied from the cache if certain conditions are met. There are two
types of web caches: a browser cache and a proxy cache [9].
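A minimal sketch of such a cache is given below. The text does not enumerate the conditions under which a request may be satisfied from the cache, so a time-to-live (TTL) expiry is assumed here as one such condition. The class name TTLCache and its injectable clock are hypothetical.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TTLCache:
    """Serves a stored document only while its time-to-live has not elapsed."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so that expiry can be tested
        self._entries: Dict[str, Tuple[float, bytes]] = {}  # URL -> (expiry, body)

    def put(self, url: str, body: bytes) -> None:
        """Store a copy of a document passing through the cache."""
        self._entries[url] = (self._clock() + self._ttl, body)

    def get(self, url: str) -> Optional[bytes]:
        """Return the cached body, or None on a miss or an expired entry."""
        entry = self._entries.get(url)
        if entry is None:
            return None
        expires_at, body = entry
        if self._clock() >= expires_at:
            del self._entries[url]  # expired: the next request goes to the server
            return None
        return body
```

A proxy built on this sketch would call get first and contact the web server only when it returns None, storing the fetched document with put.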
Fig. 5: Cumulative Data Traffic Analysis from all Web Servers
Fig. 5 shows the cumulative data traffic analysis from all web servers during peak hours, with
traffic minimized at the customized proxy server level.
Mirroring: Mirroring is defined as keeping multiple copies of the content of a web site or web
pages on different servers using different domain names. A mirrored site is an exact replica of the
original site and is updated frequently to ensure that it reflects the updates made to the content of
the original site. The main purpose of mirroring is to build in redundancy and ensure high
availability of web documents or objects. Mirrored sites also help make access faster when the
original site is geographically distant. Shivakumar and Garcia-Molina investigated mirroring in a
large crawler data set and reported that in the WebTV client trace far more aliasing happens than
expected [7].
Fig. 6: Traffic Analysis of all Proxy servers during peak hours
Fig 6 shows the traffic analysis of all proxy servers during peak hours. The network
architecture and design yield optimized web page clustering with the customized local proxy
server S-I, reducing user-perceived latency and network resource requirements in the World Wide
Web for traffic from web servers S-I and S-II.
IV. CONCLUSION
A proxy server caches web content and returns it quickly on subsequent requests. System
administrators often struggle with delays and excessive bandwidth use, and a proxy server
mitigates these problems by handling requests locally. When a proxy server is deployed in
accelerator mode, requests are handled faster than on ordinary web servers, which improves site
performance.
Proxies are emerging as an important way to reduce user-perceived latency and network
resource requirements in the Internet. While relaying traffic between servers and clients, a proxy can
cache resources in the hope of satisfying future client requests directly at the proxy. However,
existing techniques for caching text and images are not appropriate for the rapidly growing number
of continuous media streams. In addition, high latency and loss rates in the Internet make it difficult
to stream audio and video without introducing a large playback delay. Our results show that
information fetched from different sources or web pages often contains the same links, multimedia,
or content. To address these problems, we propose that, instead of caching entire pages, multimedia
objects, or documents (which may be quite large), the proxy should store this recurring information
at our customized local proxy server. This work can be further optimized by a daemon process,
designed to run periodically and check the consistency of the cached data against the data at the
web server. It can be scheduled during slack time with less traffic, so that it places no additional
load on the bandwidth, and it also updates the TTL (Time to Live) period of the cached data. Upon
receiving a request for a page, multimedia object, or document, the proxy immediately begins
serving similar information from the local customized proxy server while simultaneously requesting
the remaining frames from the server, which also hides the latency between the server and the
proxy. The network architecture and design yield optimized web page clustering with the customized
local proxy server S-I, reducing user-perceived latency and network resource requirements in the
World Wide Web for traffic from web servers S-I and S-II. The results confirm this overlap across
sources, with hit rates of 90% in some cases.
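A single pass of the proposed consistency daemon might look like the following sketch. It is offered as an illustration of the future work and not as a finished design: the function refresh_pass, the dictionary layout of the cache, and the fetch_validator callback (which stands in for a lightweight request to the web server) are all assumptions.

```python
from typing import Callable, Dict, List, Optional


def refresh_pass(cache: Dict[str, dict],
                 fetch_validator: Callable[[str], Optional[str]],
                 now: float,
                 ttl_seconds: float) -> List[str]:
    """Check every cached entry against the web server once.

    cache maps a URL to {"validator": str, "expires_at": float}.
    Entries that still match the server get their TTL extended;
    entries that no longer match are evicted. Returns the evicted URLs.
    """
    evicted = []
    for url in list(cache):  # copy the keys so entries can be deleted safely
        entry = cache[url]
        if fetch_validator(url) == entry["validator"]:
            entry["expires_at"] = now + ttl_seconds  # consistent: renew the TTL
        else:
            del cache[url]                           # inconsistent: drop the copy
            evicted.append(url)
    return evicted
```

A scheduler would invoke refresh_pass during slack hours, so that the validation traffic does not compete with user requests.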
REFERENCES
[1] Ngamsuriyaroj, S. ; Rattidham, P. ; Rassameeroj, I. ; Wongbuchasin, P. ; Aramkul, N. ;
Rungmano, S., “Performance Evaluation of Load Balanced Web Proxies” IEEE, 2011.
[2] Chen, W.; Martin, P.; Hassanein, H.S., "Caching dynamic content on the Web," Canadian
Conference on Electrical and Computer Engineering, 2003, vol. 2, pp. 947-950,
4-7 May 2003.
[3] Srikantha Rao, Preeti Patil, S B Patil; “Enhanced Software Development Strategy implying
High Quality Design for Large Scale Database Projects”, International Conference and
Workshop on Emerging Trends in Technology ICWET 2012, ISBN: 978-0-615-58717-2,
TCET Mumbai, February 22–25, 2012, Pages: 508-513.
[4] Srikantha Rao, Preeti Patil, S B Patil; “Object-Oriented Software Engineering Paradigm: A
Seamless Interface in Software Development Life Cycle”, ACM_Asia_Pacific International
Conference on Advances in Computing (ICAC-2008), Anuradha Engineering College,
Chikhali, Feb 2008.
[5] Sadhna Ahuja, Tao Wu and Sudhir Dixit, “On the Effects of Content Compression on Web
Cache Performance,” Proceedings of the International Conference on Information
Technology: Computers and Communications, 2003.
[6] A. Mahanti, C. Williamson, and D. Eager, “Traffic Analysis of a Web Proxy Caching
Hierarchy,” IEEE Network Magazine, May 2000.
[7] N. Shivakumar and H. Garcia-Molina, “Finding near Replicas of Documents on the Web”
Proc. Workshop on Web Databases, Mar. 1998.
[8] Jeffrey C. Mogul, “A trace-based analysis of duplicate suppression in HTTP,” Compaq
Computer Corporation Western Research Laboratory, Nov. 1999.
[9] S B Patil, Sachin Chavan, Preeti Patil; “High Quality Design and Methodology Aspects To
Enhance Large Scale Web Services”, International Journal of Advances in Engineering
& Technology (IJAET-2012), ISSN: 2231-1963, March 2012, Volume 3, Issue 1,
Pages: 175-185. (Journal Impact Factor: 1.96).
[10] S B Patil, Sachin Chavan, Preeti Patil; “High Quality Design To Enhance and Improve
Performance of Large Scale Web Applications”, International Journal of Computer
Engineering & Technology (IJCET), ISSN 0976 – 6375, Volume 3, Issue 1, January-June
2012, Pages: 198 - 205.(Journal Impact Factor: 1.0425).
[11] S B Patil, D. B. Kulkarni; “Improving web performance through Hierarchical caching &
content aliasing”, The 7th International Conference on “Information Integration and Web-
based Applications & Services”, 19-21 September 2005, Kuala Lumpur, Malaysia.
[12] Kartik Bommepally, Glisa T. K., Jeena J. Prakash, Sanasam Ranbir Singh and Hema A
Murthy “Internet Activity Analysis through Proxy Log” IEEE, 2010.
[13] Jun Wu; Ravindran, K., "Optimization algorithms for proxy server placement in content
distribution networks," Integrated Network Management-Workshops, 2009.
[14] Srikantha Rao, Preeti Patil, S B Patil, Sunita Patil “Customized Approach for Efficient Data
Storing and Retrieving from University Database Using Repetitive Frequency Indexing”,
IEEE INTERNATIONAL CONFERENCE PUBLICATIONS, RAIT 2012, ISM Dhanbad,
Jharkhand, March 15–17, 2012 (Available on IEEE Xplore) Print ISBN: 978-1-4577-0694-3,
Digital Object Identifier: 10.1109/RAIT.2012.6194612 Page(s): 511 – 514.
[15] Suryakant B Patil, Sonal Deshmukh, Anuja Bharate, Preeti Patil; “Network Traffic
Optimization for Performance Improvement in the Web Service Infrastructures by
Categorization of the Web Contents with Size Reduction Approach”, International Journal of
Advanced Research In Engineering & Technology (IJARET), ISSN 0976-6499, Volume 5,
Issue 4, April 2014, Pages: 198-204. (Journal Impact Factor: 7.8273).
[16] Sachin Chavan and Nitin Chavan, “Improving Access Latency of Web Browser by using
Content Aliasing in Proxy Cache Server”, International Journal of Computer Engineering &
Technology (IJCET), ISSN Print: 0976 – 6367, ISSN Online: 0976 – 6375, Volume 4,
Issue 2, 2013, Pages: 356 – 365. (Journal Impact Factor: 6.1302).
[17] A. Catherine Esther Karunya, C. Priyadharsini, D.Daniel and P.Priya, “Proxy Based Solution
for Mitigating Cross-Site Scripting Attack in Client Side”, International Journal of Computer
Engineering & Technology (IJCET), ISSN Print: 0976 – 6367, ISSN Online: 0976 – 6375,
Volume 2, Issue 1, 2011, Pages: 22 – 32. (Journal Impact Factor: 1.0425).