JAVA 2013 IEEE DATAMINING PROJECT Distributed web systems performance forecasting
Distributed Web Systems Performance Forecasting Using Turning Bands Method
With the increasing development of distributed computer systems (DCSs) in networked industrial and
manufacturing applications on the World Wide Web (WWW) platform, including service-oriented architecture
and Web of Things QoS-aware systems, it has become important to predict Web performance. In this paper,
we present Web performance prediction in time and in space by forecasting Web resource download
times using the Turning Bands (TB) geostatistical simulation method. Real-life data for the research
were obtained in an active experiment conducted by our multi-agent measurement system MWING, which
monitored a group of Web servers worldwide from agents placed in different geographical locations in
Poland. The results show good quality of Web performance prediction made by means of the TB method,
especially in the case when European Web servers were monitored by an MWING agent located in Gliwice, Poland.
The aim of this paper is to present a robust spatio-temporal prediction method and algorithm that can provide
efficient forecasting of client-perceived Web performance on the World Wide Web. This may provide
efficient QoS for individual nodes of a Web-based DCS and help improve the operation of the whole system.
The predicted performance characteristics can be used to select the best-performing Web server
in space and in time. Here, we propose to make Web performance prediction with the use of the Turning Bands
(TB) geostatistical method. The main contributions of the paper are as follows. The first is the
introduction of a new spatio-temporal methodological approach to the performance prediction of Internet-based
DCSs, established on the theory and application of geostatistics. The second is a Web performance prediction
algorithm based on the widely proven TB simulation method, which gives efficient and accurate forecasting, as
well as reliable results.
The third contribution is that our analysis uses real-life data collections gathered for various clients
monitoring many Web servers located in different geographic locations on the Internet.
At present, to the best of the authors’ knowledge, the approach presented in this paper is unique, and
there is no similar problem statement in the literature with which to compare.
We also present a comparison of our TB-based Web performance prediction method with other
spatio-temporal prediction approaches, which, like the TB method, were studied
The paper describes the methodology of the proposed approach and the algorithm of the TB method, which is used
for spatio-temporal forecasting of Web system performance (WSP). The basic assumption of the TB method is as
follows: the field to be simulated is second-order stationary and isotropic; at each point, the values of the
field are normally distributed and have zero mean. In other cases, a transformation to Gaussian with subsequent
subtraction of the mean can be applied. The next assumption is knowledge of the covariance C(r) of the
field to be simulated.
The MWING system uses agents implemented in different programming languages, so it can be run in both
Linux and Windows operating environments. Agents perform measurements and monitoring by means of
common system functionalities as well as open developments aimed at specific measurement goals.
Common functionalities include agent management, measurement scheduling, heartbeat (status
and condition of an agent), the data model, synchronization, local databases, and central database support. The
delay perceived by a client comprises the network delay, the Web server latency, and the delay caused by any
special Web infrastructure built on the client-to-server communication path to reduce the response time, if such
infrastructure exists. Ultimately, a Web client always perceives the total delay resulting from all of these
activities.
The information concerns the area of the forecast, the time of the forecast, the geostatistical method, and the
agent from which the datasets were collected.
As a result, one obtains a spatio-temporal database and a raster map, on which variability can be analyzed
over the whole space and not only at given points.
These two methods have been used by us because they use an acceptable amount of casts.
Geostatistical methods are developing significantly in the sciences where geostatistics is traditionally
applied, such as climate studies, geology, ecology, and agriculture.
There are generally two ways of solving the problems caused by the imperfect performance of the Web. The
first is to improve the quality of communication protocols, including the development of real-time
protocols and protocol tuning, as well as to upgrade existing network technologies to support the required
communication. This development is well established for Web-based systems of general usage and
includes, for example, content distribution networks.
Structural Data Analysis
Considering the minimum and maximum values, a rather large data range is observed. Only for data measured at
12:00 a.m. is this difference smaller. Moreover, the high values of the standard deviation and the coefficient of
variation, which is above 100% for each of the considered hours, confirm the variation of the process. However,
the skewness coefficient and kurtosis values indicate that the distribution of the considered Web performance
data should be close to a symmetrical distribution, with only a small right-side asymmetry.
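The statistics named above can be computed as in the following minimal sketch. The function name `describe` and the sample download times are hypothetical and are not taken from the measurements; population (divide-by-n) moments are assumed.

```python
import math

def describe(values):
    """Basic descriptive statistics of a sample of download times.

    Uses population moments (division by n); a sample-based variant
    would divide by n - 1 instead.
    """
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n   # variance
    m3 = sum((v - mean) ** 3 for v in values) / n   # third central moment
    m4 = sum((v - mean) ** 4 for v in values) / n   # fourth central moment
    std = math.sqrt(m2)
    return {
        "min": min(values),
        "max": max(values),
        "mean": mean,
        "std": std,
        "cv_percent": 100.0 * std / mean,           # coefficient of variation
        "skewness": m3 / m2 ** 1.5,                 # > 0 means right-side asymmetry
        "excess_kurtosis": m4 / m2 ** 2 - 3.0,      # 0 for a normal distribution
    }

# Hypothetical download times in seconds, with one slow outlier.
stats = describe([0.5, 0.6, 0.7, 0.8, 5.0])
print(stats["cv_percent"] > 100.0, stats["skewness"] > 0.0)
```

A coefficient of variation above 100% means the standard deviation exceeds the mean, which is the situation reported for each of the considered hours.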
Distributed Web System
For the simulation, the moving neighborhood type was adopted, where the search ellipsoid was 10 km for
the - and -directions and 18 km for the -direction in the case of Web performance at 6:00 a.m. and 12:00 a.m.;
for the -direction at 6:00 p.m., the search ellipsoid was 28 km. The forecast of the download time was
determined on the basis of 100 simulation realizations.
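The idea behind a TB simulation, building a multidimensional field from one-dimensional processes along randomly oriented lines, can be sketched as follows. This is a simplified illustration and not the authors' algorithm: it assumes a Gaussian covariance model C(r) = exp(-(r/a)^2), uses a single cosine wave as the process on each line, and produces unconditional realizations only. All function and parameter names are hypothetical.

```python
import math
import random

def make_lines(n_lines, a, rng):
    """Draw the random lines (bands) used by one realization.

    Each line has a direction uniform on the unit sphere, a frequency,
    and a phase. The frequency law corresponds to the ASSUMED covariance
    C(r) = exp(-(r/a)**2); other covariance models need other laws.
    """
    sigma = math.sqrt(2.0) / a
    lines = []
    for _ in range(n_lines):
        w = [rng.gauss(0.0, sigma) for _ in range(3)]
        k = math.sqrt(sum(c * c for c in w))       # frequency along the line
        u = [c / k for c in w]                     # unit direction of the line
        phase = rng.uniform(0.0, 2.0 * math.pi)
        lines.append((u, k, phase))
    return lines

def simulate_at(point, lines):
    """Value of one realization at a 3-D point (two spatial axes and time)."""
    total = 0.0
    for u, k, phase in lines:
        s = sum(p * c for p, c in zip(point, u))   # project the point onto the line
        total += math.sqrt(2.0) * math.cos(k * s + phase)
    return total / math.sqrt(len(lines))           # zero mean, unit variance

def simulate_many(points, n_real=100, n_lines=100, a=1.0, seed=0):
    """Return n_real independent realizations evaluated at the given points."""
    rng = random.Random(seed)
    realizations = []
    for _ in range(n_real):
        lines = make_lines(n_lines, a, rng)
        realizations.append([simulate_at(p, lines) for p in points])
    return realizations

points = [(0.0, 0.0, 0.0), (0.5, 0.0, 0.0)]
realizations = simulate_many(points, n_real=100)
print(len(realizations), len(realizations[0]))
```

Each realization has zero mean and unit variance by construction, so averaging unconditional realizations only recovers the zero mean. A forecast would typically require conditioning the realizations on the measured data before averaging them, a step omitted from this sketch.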
Formula-based methods use a mathematical formula that expresses a particular performance measure as
a function of the essential independent variables characterizing the studied phenomenon. In history-based
performance prediction, the time series of observations obtained through repeated measurements over time is
analyzed, and this is the approach used in this paper. Two basic prediction approaches are considered, namely
classification and regression.
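As a generic illustration of history-based prediction by regression, the sketch below fits a straight line relating each observation to the previous one and uses it to forecast the next value. It is not the TB method and not the authors' procedure; the function names and the sample series are hypothetical.

```python
def fit_lag1(series):
    """Least-squares fit of y[t] = intercept + slope * y[t-1]."""
    x = series[:-1]                    # previous observations
    y = series[1:]                     # observations one step later
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def forecast_next(series):
    """One-step-ahead forecast from the fitted lag-1 regression."""
    intercept, slope = fit_lag1(series)
    return intercept + slope * series[-1]

# Hypothetical history of download times in seconds.
history = [0.0, 1.0, 1.5, 1.75, 1.875]
print(forecast_next(history))
```

The sample series follows y[t] = 1 + 0.5 * y[t-1] exactly, so the fit recovers those coefficients; real measurements would be noisy and would need a longer history.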
In this paper, an approach to predicting Web performance through the innovative application of the TB
geostatistical simulation method was proposed. A large-scale measurement experiment was performed on the
real-life Internet to gather data characterizing the performance of over 60 Web servers located worldwide, as
perceived by four agents installed in different Internet locations. The possibility of using geostatistics in a new
application, namely Internet network performance prediction, is outlined. Such geostatistical methods have
other applications, for example, the spatial estimation of crime rates. The comparison of spatial regression
analysis (econometric models) with kriging methods clearly indicates the advantage of the former. On the basis
of the conducted research, the authors conclude that further work is needed to improve the forecast accuracy.
Web performance should be analyzed using various measurement data and prediction horizon lengths. The next
step should also be an attempt to use other geostatistical methods that the authors have already used
successfully to forecast loads in power transmission and distribution networks. Furthermore, we direct our
research approach to QoS issues in smart-grid communication technologies.