www.ijmer.com

International Journal of Modern Engineering Research (IJMER)
Vol. 3, Issue. 5, Sep - Oct. 2013 pp-3196-3199
ISSN: 2249-6645

A Novel Approach To Answer Continuous Aggregation Queries
Using Data Aggregations
K. Suresh1, Sk. Karimulla 2, B. Rasagna3
1, M.TECH Scholar, QCET, Nellore
2, Assistant Professor, QCET, Nellore
3, Assistant Professor, MLRITM, Hyderabad

ABSTRACT: Decision making can be regarded as the reflective process that results in the selection of a course of
action among several alternative scenarios. Continuous aggregation queries are used to monitor changes in
time-varying data for online decision making. Usually a user needs to obtain the value of some
aggregation function over distributed data items, for example, to know the value of a client's portfolio, or the
average of temperatures sensed by a set of sensors. In these queries a client specifies a coherency requirement as
part of the query. In this paper we propose a low-cost, scalable technique to answer continuous aggregation queries
using a network of aggregators of dynamic data items. In such a network of data aggregators, each data aggregator
serves a set of data items at specific coherencies. Just as various fragments of a dynamic web page are served by one
or more nodes of a content distribution network, our proposed method involves decomposing a client query into sub-queries
and executing the sub-queries on judiciously chosen data aggregators with their individual sub-query incoherency bounds. We
present a technique for getting the optimal set of sub-queries with their incoherency bounds which satisfies the client
query's coherency requirement with the least number of refresh messages sent from aggregators to the client. For
estimating the number of refresh messages, we build a query cost model which can be used to estimate
the number of messages required to satisfy the client-specified incoherency bound.

Index Terms— Algorithms, Continuous Queries, Data Dissemination, Coherency, Performance, Distributed Query
Processing

I. INTRODUCTION

Applications such as sensor-based monitoring, auctions, personal portfolio valuations for financial decisions, and route
planning based on traffic information make wide use of dynamic data. For such applications, data from one
or more independent data sources may be aggregated to determine if some action is warranted. Given the growing number of
applications that make use of highly dynamic data, there is significant interest in systems that can efficiently
deliver the relevant updates continuously. In these continuous query applications, users are likely to tolerate some inaccuracy
in the results. That is, the exact data values at the corresponding data sources need not be reported as long as the query results
satisfy user-specified accuracy requirements.
Data incoherency: Data accuracy can be specified in terms of the incoherency of a data item, defined as the absolute
difference between the value of the data item at the data source and the value known to a client of the data. Let v_i(t) denote the
value of the ith data item at the data source at time t, and let the value of the data item known to the client be u_i(t). Then, the data
incoherency at the client is given by |v_i(t) - u_i(t)|. For a data item which needs to be refreshed at an incoherency bound C, a
data refresh message is sent to the client as soon as the data incoherency exceeds C, i.e., |v_i(t) - u_i(t)| > C.
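The refresh rule above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function name count_refreshes and the sample values are hypothetical, and the client is assumed to start in sync with the source.

```python
def count_refreshes(source_values, bound):
    """Count the pushes needed to keep |v(t) - u(t)| <= bound at the client."""
    if not source_values:
        return 0
    client_value = source_values[0]  # assumption: client starts in sync
    refreshes = 0
    for value in source_values[1:]:
        # Push only when the incoherency exceeds the bound C.
        if abs(value - client_value) > bound:
            client_value = value
            refreshes += 1
    return refreshes


# Hypothetical source values, for illustration only.
prices = [10.0, 10.4, 11.2, 11.0, 9.7, 9.9]
print(count_refreshes(prices, bound=1.0))  # 2
```

With a bound of 1.0 only the moves to 11.2 and 9.7 trigger a message; the smaller changes stay within the bound and are never sent.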
Network of data aggregators (DA): Data refresh from data sources to clients can be done using push-based or pull-based
mechanisms. In a push-based system, data sources send update messages to clients on their own, whereas in a pull-based
system data sources send messages to the client only when the client makes a request. In this paper we assume a push-based
system for data transfer between data sources and clients. For scalable management of push-based data dissemination,
networks of data aggregators have been proposed.
We assume that each data aggregator maintains its configured incoherency bounds for different data items. From a data
dissemination capability point of view, each data aggregator is characterized by a set of (d_i, c_i) pairs, where d_i is a
data item which the DA can disseminate at an incoherency bound c_i. The configured incoherency bound of a data item at a
data aggregator can be maintained using either of the following methods:
1) The data source refreshes the data value at the DA whenever the DA's incoherency bound is about to be violated. This
method has scalability problems.
2) Data aggregators with tighter incoherency bounds help the DA to maintain its incoherency bound in a scalable manner.
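The (d_i, c_i) characterization of a data aggregator can be pictured as a small data structure. The sketch below is only illustrative: the class name DataAggregator, the method can_serve, and all item names and bounds are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass, field


@dataclass
class DataAggregator:
    """A DA described by its (d_i, c_i) pairs: data item -> configured bound."""
    name: str
    bounds: dict = field(default_factory=dict)

    def can_serve(self, item, required_bound):
        """True if this DA holds `item` at a bound no looser than required."""
        configured = self.bounds.get(item)
        return configured is not None and configured <= required_bound


# Hypothetical aggregators and bounds, for illustration only.
a1 = DataAggregator("a1", {"d1": 0.5, "d3": 0.2})
a2 = DataAggregator("a2", {"d1": 0.1, "d2": 0.2})

# Method 2 in the list above: a2 holds d1 at a tighter bound than a1 is
# configured for, so a2 could keep a1 within its bound for d1.
print(a2.can_serve("d1", a1.bounds["d1"]))  # True
print(a1.can_serve("d2", 0.5))              # False: a1 does not hold d2
```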
1.1. Aggregate Queries and Their Execution
We consider the execution of continuous multi-data aggregation queries using a network of data aggregators, with the objective of
minimizing the number of refreshes from data aggregators to the client. First, we give two motivating scenarios where there
are various options for executing a multi-data aggregation query and one must select a particular option to minimize the
number of messages.
1.2. Problem statement & contributions
Our simulation studies show that for continuous aggregation queries:
• Our method of dividing a query into sub queries and executing them at individual DAs requires less than one third of the
number of refreshes required in the existing schemes.
• To reduce the number of refreshes, more dynamic data items should be part of a sub query involving a larger number of
data items.
Our method of executing queries over a network of data aggregators is practical since it can be implemented using a
mechanism similar to URL rewriting in content distribution networks (CDNs). Just as in a CDN,
the client sends its query to the central site. To find appropriate aggregators (edge nodes) to answer the client query
(web page), the central site first has to determine which data aggregators have the data items required for the client query. If
the client query cannot be answered by a single data aggregator, the query is divided into sub queries (fragments) and each sub
query is assigned to a single data aggregator. In the case of a CDN, a web page's division into fragments is
a page design issue, whereas for continuous aggregation queries the issue has to be handled on a per-query basis by
considering the data dissemination capabilities of the data aggregators.
This strategy ignores the fact that the client is interested only in the aggregated value of the data items and that various
aggregators can disseminate more than one data item. Second, if a single data aggregator (DA) can disseminate all three
data items required to answer the client query, the DA can construct a compound data item corresponding to the
client query and disseminate the result to the client so that the query incoherency bound is not violated. If
we obtain the query result from a single DA, the number of refreshes will be lowest (as data item updates may cancel
each other out, thereby keeping the query result within the incoherency bound). Further, even if an aggregator can
refresh all the data items, it may not be able to satisfy the query coherency requirements. In such cases the query has to be
executed with data from multiple aggregators.
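The decomposition step performed by the central site, in which a query that no single aggregator can answer is split into sub queries with one aggregator each, can be sketched as follows. This is a simple greedy cover written for illustration and is not claimed to be the paper's selection algorithm; the function name decompose and the aggregator contents are hypothetical, and incoherency bounds are ignored here.

```python
def decompose(query_items, aggregators):
    """Greedily split a query into sub queries, one per aggregator.

    aggregators maps an aggregator name to the set of items it can serve.
    Returns {aggregator name: set of query items assigned to it}.
    """
    remaining = set(query_items)
    plan = {}
    while remaining:
        # Pick the aggregator covering the most still-unassigned items.
        best = max(aggregators,
                   key=lambda name: len(aggregators[name] & remaining),
                   default=None)
        covered = (aggregators[best] & remaining) if best is not None else set()
        if not covered:
            raise ValueError(f"no aggregator serves: {sorted(remaining)}")
        plan[best] = covered
        remaining -= covered
    return plan


# Hypothetical aggregators, for illustration only.
aggregators = {"a1": {"d1", "d3"}, "a2": {"d1", "d2"}}
print(decompose(["d1", "d2", "d3"], aggregators))
```

Here no single aggregator holds all three items, so the query is split into one sub query on a1 (d1 and d3) and one on a2 (d2). A query that one aggregator can answer entirely yields a single sub query.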

II. DATA DISSEMINATION COST MODEL

In this section we present a model to estimate the number of refreshes required to disseminate a data item while maintaining
a certain incoherency bound. Two important factors affect the number of messages needed to maintain the coherency
requirement:
1) The coherency requirement itself.
2) The dynamics of the data.
2.1. Incoherency Bound Model
Consider a data item which needs to be disseminated at an incoherency bound C; that is, a new value of the data item will
be pushed if the value deviates by more than C from the last pushed value. Thus, the number of dissemination messages will
be proportional to the probability of |v(t) - u(t)| being greater than C, for data value v(t) at the source/aggregator and u(t) at the
client, at time t. A data item can be modeled as a discrete-time random process where each step is correlated with its
previous step. In push-based dissemination, a data source can follow one of the following schemes:
1. The data source pushes the data value whenever it differs from the last pushed value by an amount more than C.
2. The client estimates the data value based on server-specified parameters. The source pushes the new data value whenever it
differs from the (client) estimated value by an amount more than C.

Fig.1: Number of pushes versus incoherency bounds
In both these cases, the value at the source can be modeled as a random process whose mean is the value known at the
client. In case 2, the server and the client estimate the data value as the mean of the modeled random process, whereas in
case 1 the difference from the last pushed value can be modeled as a zero-mean process.
2.2. Data dynamics Model
We consider two possible options to model data dynamics. As a first option, the data dynamics can be quantified based on the
standard deviation of the data item values. Suppose two data items, d1 and d2, are both disseminated with an incoherency bound of 3. It can
be seen that the number of messages required for maintaining the incoherency bound will be 7 and 1 for data items d1 and
d2, respectively, whereas both data items have the same standard deviation. Thus, we need a measure which captures data
changes along with their temporal properties. This motivates us to examine the second measure. As a second option we
considered the Fast Fourier Transform (FFT), which is used in the digital signal processing domain to characterize a digital
signal. FFT captures the number of changes in the data value, the amount of each change, and their timings. Thus, FFT can be used to model
data dynamics, but it has a problem: to estimate the number of refreshes required to disseminate a data item we need a
function over FFT coefficients which returns a scalar value. The number of FFT coefficients can be as high as the number
of changes in the data value. Among FFT coefficients, the 0th-order coefficient identifies the average value of the data item,
whereas higher-order coefficients represent transient changes in the value of the data item.
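The shortcoming of the standard deviation as a measure of data dynamics can be reproduced with a small experiment. The two series below are hypothetical and are not the data behind the paper's example; they were chosen so that both have the same standard deviation. The text does not define sumdiff, so the sketch assumes it is the sum of absolute differences between consecutive values.

```python
from statistics import pstdev


def sumdiff(values):
    """Assumed definition: sum of absolute consecutive differences."""
    return sum(abs(b - a) for a, b in zip(values, values[1:]))


def count_refreshes(values, bound):
    """Pushes needed when a value is sent once it drifts more than `bound`."""
    last_pushed = values[0]  # assumption: client starts in sync
    pushes = 0
    for value in values[1:]:
        if abs(value - last_pushed) > bound:
            last_pushed = value
            pushes += 1
    return pushes


# Hypothetical series with identical value distributions but different timing.
d1 = [0, 4, 0, 4, 0, 4, 0, 4]   # oscillates at every step
d2 = [0, 0, 0, 0, 4, 4, 4, 4]   # changes once

for name, series in (("d1", d1), ("d2", d2)):
    print(name, pstdev(series), sumdiff(series), count_refreshes(series, bound=3))
```

Both series have the same standard deviation, yet with a bound of 3 the oscillating series needs 7 pushes and the other needs 1. The assumed sumdiff (28 versus 4) separates them, which is the kind of temporal sensitivity the paragraph above asks for.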
Validating the hypothesis: We ran simulations with different stocks being disseminated with incoherency bound values of
$0.001, $0.01, and $0.1. This range is 0.1 to 10 times the average standard deviation of the share price values. The number of
refresh messages is plotted against data sumdiff (in $); a linear relationship appears to exist for all incoherency bound
values. To quantify the degree of linearity we used the Pearson product moment correlation coefficient (PPMCC), a widely
used measure of association that measures the degree of linearity between two variables.

Fig.2: Number of pushes versus data sumdiff (a) C = 0.001, (b) C = 0.01, and (c) C = 0.1.
It is calculated by summing up the products of the deviations of the data item values from their mean. PPMCC
varies between -1 and 1, with higher (absolute) values signifying that the data points can be considered linear with more
confidence. For the three incoherency bound values 0.001, 0.01, and 0.1, the PPMCC values were 0.94, 0.96, and 0.90,
respectively, and the average deviation from linearity was in the range of 5 percent for low values of C and 10 percent for high
values of C. Thus, we can conclude that, for lower values of the incoherency bound, a linear relationship between data sumdiff
and the number of refresh messages can be assumed with more confidence.
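For readers who want to repeat this kind of check on their own traces, the PPMCC can be computed directly from its definition. The sketch below uses the standard Pearson formula; the function name pearson and the sample numbers are hypothetical and are not the paper's measurements.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson product moment correlation coefficient of two samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / sqrt(var_x * var_y)


# Hypothetical (sumdiff, refresh count) observations, for illustration only.
sumdiffs = [0.5, 1.0, 1.5, 2.0]
refreshes = [12, 19, 33, 41]
print(round(pearson(sumdiffs, refreshes), 3))
```

A value close to 1 indicates that the refresh count grows nearly linearly with sumdiff; a value close to -1 would indicate a decreasing linear relationship.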
2.3 Combining Data Dissemination Models
The number of refresh messages is proportional to the data sumdiff R_s and inversely proportional to the square of the
incoherency bound (C^2). Further, we need not disseminate any message when either the data value is not
changing (R_s = 0) or the incoherency bound is unlimited (1/C^2 = 0). Thus, for a given data item S, disseminated with an
incoherency bound C, the data dissemination cost is proportional to R_s/C^2. In the next section, we use this data dissemination
cost model for developing a cost model for additive aggregation queries.
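The combined model is a one-line formula, shown here as a small Python function. The name estimated_refreshes and the proportionality constant k (defaulting to 1) are assumptions made for illustration, as are the numbers in the example.

```python
import math


def estimated_refreshes(sumdiff, bound, k=1.0):
    """Cost model: number of refreshes is taken as k * R_s / C^2."""
    if bound <= 0:
        raise ValueError("incoherency bound must be positive")
    return k * sumdiff / bound ** 2


# Hypothetical numbers: halving the bound quadruples the estimated cost.
print(estimated_refreshes(2.0, 0.5))        # 8.0
print(estimated_refreshes(2.0, 0.25))       # 32.0
# The two limiting cases mentioned above both give zero messages.
print(estimated_refreshes(0.0, 0.5))        # 0.0
print(estimated_refreshes(2.0, math.inf))   # 0.0
```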

III. QUERY PLANNING

A query plan is an ordered set of steps used to access or modify information in a SQL relational database
management system. This is a specific case of the relational model concept of access plans. Since SQL is declarative, there
are typically a large number of alternative ways to execute a given query, with widely varying performance. When a
query is submitted to the database, the query optimizer evaluates some of the different, correct possible plans for executing
the query and returns what it considers the best alternative. In our setting, to get a query plan we need to perform the following tasks:
1. Determining sub queries: For the client query, get the sub query for each data aggregator.
2. Dividing incoherency bound: Divide the query incoherency bound among the sub queries to get the incoherency bound of each sub query.
Optimization objective:
The number of refresh messages is to be minimized. For a sub query, the estimated number of refresh messages is given by
the ratio of the sumdiff of the sub query to the square of the incoherency bound assigned to it, multiplied by the
proportionality factor k. Thus the total number of refresh messages, Z_q, is estimated as the summation of this ratio over
all sub queries of a given query.
Constraint 1: q_k is executable at a_k: Each DA has the data items required to execute the sub query allocated to it, i.e.,
each data item d_qki required for the sub query q_k is available at a_k.
Constraint 2: Query incoherency bound is satisfied: The query incoherency should be less than or equal to the query incoherency
bound. For additive aggregation queries, the value of the client query is the sum of the sub query values. As different
sub queries are disseminated by different data aggregators, we need to ensure that the sum of the sub query incoherency bounds
is less than or equal to the query incoherency bound.
Constraint 3: Sub query incoherency bound is satisfied: The data incoherency bounds at a_k should be such that the sub query
incoherency bound can be satisfied at that data aggregator. That is, the incoherency bound assigned to q_k must be no tighter
than the tightest incoherency bound which the data aggregator a_k can satisfy for the given sub query q_k.
This is a constrained optimization problem; determining sub queries while minimizing Z_q, the estimated total number of
refresh messages, is NP-hard.
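The three constraints can be turned into a feasibility check for a candidate plan. The sketch below is an illustration under stated assumptions and is not the paper's algorithm: the function name plan_is_feasible and the sample data are hypothetical, and for Constraint 3 the tightest bound an aggregator can guarantee for an unweighted additive sub query is assumed to be the sum of its configured bounds for the items involved.

```python
def plan_is_feasible(plan, aggregators, query_bound, tol=1e-12):
    """Check the three constraints for a candidate query plan.

    plan: list of (aggregator name, list of items, sub query bound).
    aggregators: {name: {item: configured incoherency bound}}.
    """
    total = 0.0
    for name, items, sub_bound in plan:
        configured = aggregators.get(name, {})
        # Constraint 1: the aggregator must hold every item of its sub query.
        if any(item not in configured for item in items):
            return False
        # Constraint 3 (assumed form): the sub query bound may not be tighter
        # than the sum of the aggregator's configured bounds for its items.
        tightest = sum(configured[item] for item in items)
        if sub_bound + tol < tightest:
            return False
        total += sub_bound
    # Constraint 2: sub query bounds must add up to at most the query bound.
    return total <= query_bound + tol


# Hypothetical aggregators and plan, for illustration only.
aggregators = {"a1": {"d1": 0.5, "d3": 0.2}, "a2": {"d1": 0.1, "d2": 0.2}}
plan = [("a1", ["d1", "d3"], 1.0), ("a2", ["d2"], 0.5)]
print(plan_is_feasible(plan, aggregators, query_bound=1.5))  # True
print(plan_is_feasible(plan, aggregators, query_bound=1.0))  # False
```

A check of this kind only verifies a given plan. Finding the plan that minimizes the estimated number of refreshes is the hard part of the problem.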

IV. PERFORMANCE EVALUATION

4.1 Comparison of Algorithms
We consider various other query plan options. Each query can be executed by disseminating individual data items or
by getting sub query values from the network of data aggregators. The set of sub queries can be selected using sumdiff-based
approaches or by random selection. The sub query (or data) incoherency bound can either be predecided or optimally
allocated. Various combinations of these dimensions are covered in the following algorithms:
1. No sub query, equal incoherency bound (naive): In this algorithm, the client query is executed with each data item being
disseminated to the client independently of the other data items in the query. The incoherency bound is divided equally among the
data items. This algorithm serves as a baseline.
2. No sub query, optimal incoherency bound (optc): In this algorithm as well, data items are disseminated independently, but the
incoherency bound is divided among the data items so that the total number of refreshes is minimized.
3. Random sub query selection (random): In this case, sub queries are obtained by randomly selecting a data
aggregator in each iteration of the greedy algorithm. This algorithm is designed to see how random selection performs
in comparison to the sumdiff-based algorithms.
4. Sub query selection while minimizing sumdiff (mincost).
5. Sub query selection while maximizing gain (maxgain).
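The difference between the naive and optc allocations can be illustrated under the R_s/C^2 cost model of Section 2.3. The sketch below is not the paper's implementation. The cube-root rule used for optc is not stated in the text; it follows from minimizing the sum of R_i/C_i^2 subject to the bounds adding up to C (via a Lagrange multiplier), and it ignores the aggregator-side constraints. All function names and numbers are hypothetical.

```python
def estimated_messages(sumdiffs, bounds, k=1.0):
    """Cost model: sum of k * R_i / C_i^2 over items that actually change."""
    return sum(k * r / c ** 2 for r, c in zip(sumdiffs, bounds) if r > 0)


def naive_bounds(sumdiffs, query_bound):
    """naive: divide the query bound equally among the data items."""
    n = len(sumdiffs)
    return [query_bound / n] * n if n else []


def optc_bounds(sumdiffs, query_bound):
    """optc (derived rule, an assumption): C_i proportional to R_i^(1/3)."""
    weights = [r ** (1.0 / 3.0) for r in sumdiffs]
    total = sum(weights)
    if total == 0:
        return naive_bounds(sumdiffs, query_bound)
    return [query_bound * w / total for w in weights]


# Hypothetical sumdiffs for two data items and a query bound of 3.
sumdiffs = [8.0, 1.0]
query_bound = 3.0
for label, allocate in (("naive", naive_bounds), ("optc", optc_bounds)):
    bounds = allocate(sumdiffs, query_bound)
    print(label, bounds, estimated_messages(sumdiffs, bounds))
```

For these numbers the equal split estimates about 4 messages and the cube-root split about 3, because the more dynamic item is given the looser bound.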

Fig. 5: Performance evaluation of algorithms.

V. CONCLUSION

This paper presents a cost-based approach to minimize the number of refreshes required to execute an
incoherency-bounded continuous query. We assume the existence of a network of data aggregators, where each data aggregator is
capable of disseminating a set of data items at their prespecified incoherency bounds. We developed an important
measure of data dynamics in the form of sumdiff, which is a more appropriate measure than the widely used
standard deviation based measures. For optimal query execution we divide the query into sub queries and evaluate
each sub query at a judiciously chosen data aggregator.


Gc3310851089Gc3310851089
Gc3310851089
 
Gc3310851089
Gc3310851089Gc3310851089
Gc3310851089
 
Unit 4 part 1
Unit 4 part 1Unit 4 part 1
Unit 4 part 1
 
A Survey: Hybrid Job-Driven Meta Data Scheduling for Data storage with Intern...
A Survey: Hybrid Job-Driven Meta Data Scheduling for Data storage with Intern...A Survey: Hybrid Job-Driven Meta Data Scheduling for Data storage with Intern...
A Survey: Hybrid Job-Driven Meta Data Scheduling for Data storage with Intern...
 
A compendium on load forecasting approaches and models
A compendium on load forecasting approaches and modelsA compendium on load forecasting approaches and models
A compendium on load forecasting approaches and models
 
Unit 5
Unit 5Unit 5
Unit 5
 
P2P Cache Resolution System for MANET
P2P Cache Resolution System for MANETP2P Cache Resolution System for MANET
P2P Cache Resolution System for MANET
 
IRJET- Secure Data Access on Distributed Database using Skyline Queries
IRJET- Secure Data Access on Distributed Database using Skyline QueriesIRJET- Secure Data Access on Distributed Database using Skyline Queries
IRJET- Secure Data Access on Distributed Database using Skyline Queries
 

More from IJMER

Fabrication & Characterization of Bio Composite Materials Based On Sunnhemp F...
Fabrication & Characterization of Bio Composite Materials Based On Sunnhemp F...Fabrication & Characterization of Bio Composite Materials Based On Sunnhemp F...
Fabrication & Characterization of Bio Composite Materials Based On Sunnhemp F...
IJMER
 
Intrusion Detection and Forensics based on decision tree and Association rule...
Intrusion Detection and Forensics based on decision tree and Association rule...Intrusion Detection and Forensics based on decision tree and Association rule...
Intrusion Detection and Forensics based on decision tree and Association rule...
IJMER
 
Material Parameter and Effect of Thermal Load on Functionally Graded Cylinders
Material Parameter and Effect of Thermal Load on Functionally Graded CylindersMaterial Parameter and Effect of Thermal Load on Functionally Graded Cylinders
Material Parameter and Effect of Thermal Load on Functionally Graded Cylinders
IJMER
 

More from IJMER (20)

A Study on Translucent Concrete Product and Its Properties by Using Optical F...
A Study on Translucent Concrete Product and Its Properties by Using Optical F...A Study on Translucent Concrete Product and Its Properties by Using Optical F...
A Study on Translucent Concrete Product and Its Properties by Using Optical F...
 
Developing Cost Effective Automation for Cotton Seed Delinting
Developing Cost Effective Automation for Cotton Seed DelintingDeveloping Cost Effective Automation for Cotton Seed Delinting
Developing Cost Effective Automation for Cotton Seed Delinting
 
Study & Testing Of Bio-Composite Material Based On Munja Fibre
Study & Testing Of Bio-Composite Material Based On Munja FibreStudy & Testing Of Bio-Composite Material Based On Munja Fibre
Study & Testing Of Bio-Composite Material Based On Munja Fibre
 
Hybrid Engine (Stirling Engine + IC Engine + Electric Motor)
Hybrid Engine (Stirling Engine + IC Engine + Electric Motor)Hybrid Engine (Stirling Engine + IC Engine + Electric Motor)
Hybrid Engine (Stirling Engine + IC Engine + Electric Motor)
 
Fabrication & Characterization of Bio Composite Materials Based On Sunnhemp F...
Fabrication & Characterization of Bio Composite Materials Based On Sunnhemp F...Fabrication & Characterization of Bio Composite Materials Based On Sunnhemp F...
Fabrication & Characterization of Bio Composite Materials Based On Sunnhemp F...
 
Geochemistry and Genesis of Kammatturu Iron Ores of Devagiri Formation, Sandu...
Geochemistry and Genesis of Kammatturu Iron Ores of Devagiri Formation, Sandu...Geochemistry and Genesis of Kammatturu Iron Ores of Devagiri Formation, Sandu...
Geochemistry and Genesis of Kammatturu Iron Ores of Devagiri Formation, Sandu...
 
Experimental Investigation on Characteristic Study of the Carbon Steel C45 in...
Experimental Investigation on Characteristic Study of the Carbon Steel C45 in...Experimental Investigation on Characteristic Study of the Carbon Steel C45 in...
Experimental Investigation on Characteristic Study of the Carbon Steel C45 in...
 
Non linear analysis of Robot Gun Support Structure using Equivalent Dynamic A...
Non linear analysis of Robot Gun Support Structure using Equivalent Dynamic A...Non linear analysis of Robot Gun Support Structure using Equivalent Dynamic A...
Non linear analysis of Robot Gun Support Structure using Equivalent Dynamic A...
 
Static Analysis of Go-Kart Chassis by Analytical and Solid Works Simulation
Static Analysis of Go-Kart Chassis by Analytical and Solid Works SimulationStatic Analysis of Go-Kart Chassis by Analytical and Solid Works Simulation
Static Analysis of Go-Kart Chassis by Analytical and Solid Works Simulation
 
High Speed Effortless Bicycle
High Speed Effortless BicycleHigh Speed Effortless Bicycle
High Speed Effortless Bicycle
 
Integration of Struts & Spring & Hibernate for Enterprise Applications
Integration of Struts & Spring & Hibernate for Enterprise ApplicationsIntegration of Struts & Spring & Hibernate for Enterprise Applications
Integration of Struts & Spring & Hibernate for Enterprise Applications
 
Microcontroller Based Automatic Sprinkler Irrigation System
Microcontroller Based Automatic Sprinkler Irrigation SystemMicrocontroller Based Automatic Sprinkler Irrigation System
Microcontroller Based Automatic Sprinkler Irrigation System
 
On some locally closed sets and spaces in Ideal Topological Spaces
On some locally closed sets and spaces in Ideal Topological SpacesOn some locally closed sets and spaces in Ideal Topological Spaces
On some locally closed sets and spaces in Ideal Topological Spaces
 
Intrusion Detection and Forensics based on decision tree and Association rule...
Intrusion Detection and Forensics based on decision tree and Association rule...Intrusion Detection and Forensics based on decision tree and Association rule...
Intrusion Detection and Forensics based on decision tree and Association rule...
 
Natural Language Ambiguity and its Effect on Machine Learning
Natural Language Ambiguity and its Effect on Machine LearningNatural Language Ambiguity and its Effect on Machine Learning
Natural Language Ambiguity and its Effect on Machine Learning
 
Evolvea Frameworkfor SelectingPrime Software DevelopmentProcess
Evolvea Frameworkfor SelectingPrime Software DevelopmentProcessEvolvea Frameworkfor SelectingPrime Software DevelopmentProcess
Evolvea Frameworkfor SelectingPrime Software DevelopmentProcess
 
Material Parameter and Effect of Thermal Load on Functionally Graded Cylinders
Material Parameter and Effect of Thermal Load on Functionally Graded CylindersMaterial Parameter and Effect of Thermal Load on Functionally Graded Cylinders
Material Parameter and Effect of Thermal Load on Functionally Graded Cylinders
 
Studies On Energy Conservation And Audit
Studies On Energy Conservation And AuditStudies On Energy Conservation And Audit
Studies On Energy Conservation And Audit
 
An Implementation of I2C Slave Interface using Verilog HDL
An Implementation of I2C Slave Interface using Verilog HDLAn Implementation of I2C Slave Interface using Verilog HDL
An Implementation of I2C Slave Interface using Verilog HDL
 
Discrete Model of Two Predators competing for One Prey
Discrete Model of Two Predators competing for One PreyDiscrete Model of Two Predators competing for One Prey
Discrete Model of Two Predators competing for One Prey
 

Recently uploaded

+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
?#DUbAI#??##{{(☎️+971_581248768%)**%*]'#abortion pills for sale in dubai@
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
panagenda
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Safe Software
 

Recently uploaded (20)

presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation Strategies
 
Top 10 Most Downloaded Games on Play Store in 2024
Top 10 Most Downloaded Games on Play Store in 2024Top 10 Most Downloaded Games on Play Store in 2024
Top 10 Most Downloaded Games on Play Store in 2024
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 

A Novel Approach To Answer Continuous Aggregation Queries Using Data Aggregations

www.ijmer.com
International Journal of Modern Engineering Research (IJMER)
Vol. 3, Issue. 5, Sep - Oct. 2013 pp-3196-3199
ISSN: 2249-6645

A Novel Approach To Answer Continuous Aggregation Queries Using Data Aggregations
K. Suresh1, Sk. Karimulla2, B. Rasagna3
1, M.TECH Scholar, QCET, Nellore
2, Assistant Professor, QCET, Nellore
3, Assistant Professor, MLRITM, Hyderabad

ABSTRACT: Decision making can be regarded as the deliberative process that results in the selection of a course of action among several alternative scenarios. Continuous aggregation queries are used to monitor changes in time-varying data for online decision making. Typically a user needs to obtain the value of some aggregation function over distributed data items, for example, to know the value of a client's portfolio, or the average of temperatures sensed by a set of sensors. In these queries a client specifies a coherency requirement as part of the query. In this paper we propose a low-cost, scalable technique to answer continuous aggregation queries using a network of aggregators of dynamic data items. In such a network of data aggregators, each data aggregator serves a set of data items at specific coherencies. Just as various fragments of a dynamic web page are served by one or more nodes of a content distribution network, our technique involves decomposing a client query into sub-queries and executing the sub-queries on judiciously chosen data aggregators with their individual sub-query incoherency bounds. We present a technique for obtaining the optimal set of sub-queries with their incoherency bounds which satisfies the client query's coherency requirement with the least number of refresh messages sent from aggregators to the client.
For estimating the number of refresh messages, we build a query cost model which can be used to estimate the number of messages required to satisfy the client-specified incoherency bound.

Index Terms: Algorithms, Continuous Queries, Data Dissemination, Coherency, Performance, Distributed Query Processing

I. INTRODUCTION
Applications such as sensor-based monitoring, auctions, personal portfolio valuations for financial decisions, and route planning based on traffic information make wide use of dynamic data. For such applications, data from one or more independent data sources may be aggregated to determine whether some action is warranted. Given the growing number of applications that make use of highly dynamic data, there is significant interest in systems that can efficiently deliver the relevant updates automatically. In these continuous query applications, users are likely to tolerate some inaccuracy in the results. That is, the exact data values at the corresponding data sources need not be reported as long as the query results satisfy user-specified accuracy requirements.

Data incoherency: Data accuracy can be specified in terms of the incoherency of a data item, defined as the absolute difference between the value of the data item at the data source and the value known to a client of the data. Let v_i(t) denote the value of the ith data item at the data source at time t, and let the value of the data item known to the client be u_i(t). Then, the data incoherency at the client is given by |v_i(t) - u_i(t)|. For a data item which needs to be refreshed at an incoherency bound C, a data refresh message is sent to the client as soon as the data incoherency exceeds C, i.e., |v_i(t) - u_i(t)| > C.

Network of data aggregators (DA): Data refresh from data sources to clients can be done using push-based or pull-based mechanisms.
In a push-based system data sources send update messages to clients on their own, whereas in a pull-based system data sources send messages to the client only when the client makes a request. In this paper we assume a push-based system for data transfer between data sources and clients. For scalable management of push-based data dissemination, networks of data aggregators have been proposed. We assume that each data aggregator maintains its configured incoherency bounds for different data items. From a data dissemination capability point of view, each data aggregator is characterized by a set of (d_i, c_i) pairs, where d_i is a data item which the DA can disseminate at an incoherency bound c_i. The configured incoherency bound of a data item at a data aggregator can be maintained using either of the following methods:
1) The data source refreshes the data value at the DA whenever the DA's incoherency bound is about to be violated. This method has scalability problems.
2) Data aggregators with tighter incoherency bounds help the DA to maintain its incoherency bound in a scalable manner.

1.1. Aggregate Queries and Their Execution
We consider the execution of continuous multi-data aggregation queries using a network of data aggregators, with the objective of minimizing the number of refreshes from data aggregators to the client. First, we give two motivating scenarios where there are various options for executing a multi-data aggregation query and one must select a particular option to minimize the number of messages.
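The push rule described in the introduction (send a refresh as soon as |v_i(t) - u_i(t)| > C) can be illustrated with a minimal Python sketch. The function name `count_refreshes` and the sample values are illustrative assumptions and do not come from the paper; the sketch also assumes the client starts out knowing the first source value.

```python
def count_refreshes(source_values, bound):
    """Count push refreshes needed to keep |v(t) - u(t)| <= bound.

    source_values: successive values v(t) of one data item at the source.
    bound: the incoherency bound C for that data item.
    """
    if not source_values:
        return 0
    known = source_values[0]  # assumption: client initially holds the current value
    refreshes = 0
    for v in source_values[1:]:
        if abs(v - known) > bound:  # incoherency exceeds C: push the new value
            known = v
            refreshes += 1
    return refreshes


# Hypothetical trace of one data item, checked against two different bounds.
values = [10.0, 10.4, 11.2, 11.0, 9.5, 9.9]
loose = count_refreshes(values, 1.0)
tight = count_refreshes(values, 0.3)
print(loose, tight)
```

A tighter bound can only increase the number of refreshes for the same trace, which is the trade-off the cost model in Section II tries to capture.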
1.2. Problem statement & contributions
Our simulation studies show that for continuous aggregation queries:
- Our method of dividing a query into sub-queries and executing them at individual DAs requires less than one third of the number of refreshes required by the existing schemes.
- To reduce the number of refreshes, more dynamic data items should be part of a sub-query involving a larger number of data items.

Our method of executing queries over a network of data aggregators is practical since it can be implemented using a mechanism similar to URL rewriting in content distribution networks (CDNs). Just as in a content distribution network, the client sends its query to the central site. To find appropriate aggregators (edge nodes) to answer the client query (webpage), the central site first has to determine which data aggregators have the data items required for the client query. If the client query cannot be answered by a single data aggregator, the query is divided into sub-queries (fragments) and each sub-query is assigned to a single data aggregator. In a content distribution network, the division of a webpage into fragments is a page design issue, whereas for continuous aggregation queries the issue has to be handled on a per-query basis by considering the data dissemination capabilities of the data aggregators.

First, the strategy of obtaining each data item separately ignores the fact that the client is interested only in the aggregated value of the data items and that various aggregators can disseminate more than one data item. Second, if a single Data Aggregator (DA) can disseminate all three data items required to answer the client query, the Data Aggregator can construct a compound data item corresponding to the client query and disseminate the result to the client so that the query incoherency bound is not violated.
It is obvious that if we obtain the query result from a single Data Aggregator, the number of refreshes will be minimal (as data item updates may cancel each other out, thereby keeping the query result within the incoherency bound). Further, even if an aggregator can refresh all the data items, it may not be able to satisfy the query coherency requirements. In such cases the query has to be executed with data from multiple aggregators.

II. DATA DISSEMINATION COST MODEL
This section presents a model to estimate the number of refreshes required to disseminate a data item while maintaining a certain incoherency bound. There are two main factors affecting the number of messages needed to maintain the coherency requirement:
1) The coherency requirement itself
2) The dynamics of the data

2.1. Incoherency Bound Model
Consider a data item which needs to be disseminated at an incoherency bound C; that is, a new value of the data item will be pushed if the value deviates by more than C from the last pushed value. Thus, the number of dissemination messages will be proportional to the probability of |v(t) - u(t)| being greater than C for data value v(t) at the source/aggregator and u(t) at the client, at time t. A data item can be modeled as a discrete-time random process where each step is correlated with its previous step. In a push-based dissemination, a data source can follow one of the following schemes:
1. The data source pushes the data value whenever it differs from the last pushed value by an amount more than C.
2. The client estimates the data value based on server-specified parameters. The source pushes the new data value whenever it differs from the (client) estimated value by an amount more than C.

Fig.1: Number of pushes versus incoherency bounds

In both these cases, the value at the source can be modeled as a random process whose mean is the value known at the client.
In case 2, the server and the client estimate the data value as the mean of the modeled random process, whereas in case 1 the difference from the last pushed value can be modeled as a zero-mean process.

2.2. Data dynamics Model
There are two possible options to model data dynamics. As a first option, the data dynamics can be quantified based on the standard deviation of the data item values. Suppose both data items are disseminated with an incoherency bound of 3. It can be seen that the number of messages required for maintaining the incoherency bound will be 7 and 1 for data items d1 and d2, respectively, whereas both data items have the same standard deviation. Thus, we need a measure which captures data changes along with their temporal properties. This motivates us to examine the second measure.

As a second option we considered the Fast Fourier Transform (FFT), which is used in the digital signal processing domain to characterize a digital signal. FFT captures the number of changes in the data value, the amount of change, and their timings. Thus, FFT can be used to model data dynamics, but it has a problem. To estimate the number of refreshes required to disseminate a data item we need a function over the FFT coefficients which returns a scalar value. The number of FFT coefficients can be as high as the number of changes in the data value. Among the FFT coefficients, the 0th-order coefficient identifies the average value of the data item, whereas higher-order coefficients represent transient changes in the value of the data item.

Validating the hypothesis: We ran simulations with different stocks being disseminated with incoherency bound values of $0.001, $0.01, and $0.1. This range is 0.1 to 10 times the average standard deviation of the share price values. The number of refresh messages is plotted against data sumdiff (in $); a linear relationship appears to exist for all incoherency bound values. To quantify the degree of linearity we used the Pearson product moment correlation coefficient (PPMCC), a widely used measure of association that measures the degree of linearity between two variables.

Fig.2: Number of pushes versus data sumdiff (a) C = 0.001, (b) C = 0.01, and (c) C = 0.1.

It is calculated by summing up the products of the deviations of the data item values from their mean. PPMCC varies between -1 and 1, with higher (absolute) values signifying that the data points can be considered linear with more confidence. For the three values of the incoherency bound, 0.001, 0.01, and 0.1, the PPMCC values were 0.94, 0.96, and 0.90, respectively, and the average deviation from linearity was in the range of 5 percent for low values of C and 10 percent for high values of C.
Thus, we can conclude that, for lower values of the incoherency bound, a linear relationship between data sumdiff and the number of refresh messages can be assumed with more confidence.

2.3 Combining Data Dissemination Models
The number of refresh messages is proportional to the data sumdiff R_s and inversely proportional to the square of the incoherency bound (C^2). Further, we need not disseminate any message when either the data value is not changing (R_s = 0) or the incoherency bound is unlimited (1/C^2 = 0). Thus, for a given data item S, disseminated with an incoherency bound C, the data dissemination cost is proportional to R_s/C^2. In the next section, we use this data dissemination cost model to develop a cost model for additive aggregation queries.

III. QUERY PLANNING
A query plan is an ordered set of steps used to access or modify information in a SQL relational database management system. This is a specific case of the relational model concept of access plans. Since SQL is declarative, there are typically a large number of alternative ways to execute a given query, with widely varying performance. When a query is submitted to the database, the query optimizer evaluates some of the different, correct possible plans for executing the query and returns what it considers the best alternative. To get a query plan in our setting we need to perform the following tasks:
1. Determining sub-queries: For the client query, obtain the sub-queries for each data aggregator.
2. Dividing the incoherency bound: Divide the query incoherency bound among the sub-queries to get the incoherency bound of each sub-query.

Optimization objective: The number of refresh messages is minimized. For a sub-query, the estimated number of refresh messages is given by the proportionality factor k times the ratio of the sumdiff of the sub-query to the square of the incoherency bound assigned to it. Thus the total number of refresh messages for a given query is estimated as the sum, over its sub-queries, of the ratio of each sub-query's sumdiff to the square of its incoherency bound.
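The cost model of Section 2.3 and the optimization objective can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation. It uses the R_s/C^2 form stated in Section 2.3. The helper `sumdiff` assumes that sumdiff is the sum of absolute differences between consecutive samples; this text does not give the exact definition, so that helper, the function names, and the default k = 1.0 are all hypothetical.

```python
def sumdiff(values):
    """ASSUMPTION: sumdiff = sum of absolute differences between consecutive
    samples. The paper text here does not define the measure explicitly."""
    return sum(abs(b - a) for a, b in zip(values, values[1:]))


def estimated_refreshes(r, c, k=1.0):
    """Estimated refresh messages for sumdiff r and incoherency bound c,
    following the cost form k * r / c**2 from Section 2.3."""
    if c <= 0:
        raise ValueError("incoherency bound must be positive")
    return k * r / (c * c)


def plan_cost(subqueries, k=1.0):
    """Total estimated refreshes for a plan given as (sumdiff, bound) pairs,
    one pair per sub-query."""
    return sum(estimated_refreshes(r, c, k) for r, c in subqueries)


# Hypothetical plan with two sub-queries.
r1 = sumdiff([1, 3, 2, 2, 5])          # sumdiff of the first sub-query's value trace
cost = plan_cost([(r1, 2.0), (4.0, 1.0)])
print(r1, cost)
```

Because the bound enters as 1/C^2, halving a sub-query's incoherency bound quadruples its estimated refresh count under this model.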
Constraint 1: q_k is executable at a_k. Each DA has the data items required to execute the sub-query allocated to it, i.e., each data item required for the sub-query q_k can be disseminated by the data aggregator a_k.

Constraint 2: Query incoherency bound is satisfied. The query incoherency should be less than or equal to the query incoherency bound. For additive aggregation queries, the value of the client query is the sum of the sub-query values. As different sub-queries are disseminated by different data aggregators, we need to ensure that the sum of the sub-query incoherencies is less than or equal to the query incoherency bound.

Constraint 3: Sub-query incoherency bound is satisfied. The data incoherency bounds at a_k should be such that the sub-query incoherency bound can be satisfied at that data aggregator. That is, the incoherency bound assigned to the sub-query q_k must be no tighter than the tightest incoherency bound which the data aggregator a_k can satisfy for q_k.

The following is the outline of our approach for solving this constrained optimization problem. Determining the sub-queries while minimizing Z_q, the estimated total number of refresh messages specified by the optimization objective, is NP-hard.
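To illustrate the step of dividing the query incoherency bound, the sketch below assumes that the sub-queries are already fixed, that the cost of sub-query k is R_k/C_k^2 as in Section 2.3, and that Constraint 3 is ignored. Under those assumptions, minimizing the sum of R_k/C_k^2 subject to the bounds summing to the query bound C gives, by a Lagrange multiplier argument, 2R_k/C_k^3 equal for all k, so C_k is proportional to the cube root of R_k. This is a derivation from the stated cost model, not necessarily the allocation procedure used by the authors; the function names are hypothetical.

```python
def allocate_bounds(sumdiffs, query_bound):
    """Split query_bound among sub-queries so that sum(R_k / C_k**2) is
    minimized subject to sum(C_k) == query_bound (Constraint 2 with equality).

    sumdiffs: non-negative sumdiff values R_k, one per sub-query.
    ASSUMPTION: Constraint 3 (per-aggregator tightest bound) is not enforced.
    """
    if query_bound <= 0:
        raise ValueError("query_bound must be positive")
    if not sumdiffs:
        return []
    weights = [r ** (1.0 / 3.0) for r in sumdiffs]  # C_k proportional to R_k^(1/3)
    total = sum(weights)
    if total == 0:
        # No data item is changing: any split works, so split evenly.
        return [query_bound / len(sumdiffs)] * len(sumdiffs)
    return [query_bound * w / total for w in weights]


def total_cost(sumdiffs, bounds):
    """Sum of R_k / C_k**2, skipping static sub-queries (R_k == 0)."""
    return sum(r / (c * c) for r, c in zip(sumdiffs, bounds) if r > 0)


# Hypothetical example: two sub-queries with different dynamics.
rs = [8.0, 1.0]
optimal = allocate_bounds(rs, 3.0)
equal = [1.5, 1.5]
print(total_cost(rs, optimal), total_cost(rs, equal))
```

In this example the more dynamic sub-query receives the looser bound, and the estimated cost is lower than with an equal split, which mirrors the difference between the optimal and equal incoherency bound options compared in the evaluation.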
IV. PERFORMANCE EVALUATION
4.1 Comparison of Algorithms
We consider various other query plan options. Each query can be executed by disseminating individual data items or by getting sub-query values from a network of data aggregators. The set of sub-queries can be selected using sumdiff-based approaches or by random selection. The sub-query (or data) incoherency bound can either be predecided or optimally allocated. Various combinations of these dimensions are covered in the following algorithms:
1. No sub-query, equal incoherency bound (naive): The client query is executed with each data item being disseminated to the client independently of the other data items in the query. The incoherency bound is divided equally among the data items. This algorithm serves as a baseline.
2. No sub-query, optimal incoherency bound (optc): In this algorithm as well, data items are disseminated independently, but the incoherency bound is divided among the data items so that the total number of refreshes is minimized.
3. Random sub-query selection (random): In this case, sub-queries are obtained by randomly selecting a data aggregator in each iteration of the greedy algorithm. This algorithm is designed to see how random selection performs in comparison to the sumdiff-based algorithms.
4. Sub-query selection while minimizing sumdiff (mincost)
5. Sub-query selection while maximizing gain (maxgain)

Fig. 5: Performance evaluation of algorithms.

V. CONCLUSION
This paper presents a cost-based approach to minimize the number of refreshes required to execute an incoherency-bounded continuous query. We assume the existence of a network of data aggregators, where each Data Aggregator is capable of disseminating a set of data items at their prespecified incoherency bounds.
We developed sumdiff, a measure of data dynamics that is more appropriate than the widely used standard-deviation-based measures. For optimal query execution, we divide the query into sub-queries and evaluate each sub-query at a judiciously chosen data aggregator.
REFERENCES
[1] A. Davis, J. Parikh, and W. Weihl, “Edge Computing: Extending Enterprise Applications to the Edge of the Internet,” Proc. 13th Int’l World Wide Web Conf. Alternate Track Papers & Posters (WWW), 2004.
[2] D. VanderMeer, A. Datta, K. Dutta, H. Thomas, and K. Ramamritham, “Proxy-Based Acceleration of Dynamically Generated Content on the World Wide Web,” ACM Trans. Database Systems, vol. 29, pp. 403-443, June 2004.
[3] J. Dilley, B. Maggs, J. Parikh, H. Prokop, R. Sitaraman, and B. Weihl, “Globally Distributed Content Delivery,” IEEE Internet Computing, vol. 6, no. 5, pp. 50-58, Sept. 2002.
[4] S. Rangarajan, S. Mukerjee, and P. Rodriguez, “User Specific Request Redirection in a Content Delivery Network,” Proc. Eighth Int’l Workshop Web Content Caching and Distribution (IWCW), 2003.
[5] S. Shah, K. Ramamritham, and P. Shenoy, “Maintaining Coherency of Dynamic Data in Cooperating Repositories,” Proc. 28th Int’l Conf. Very Large Data Bases (VLDB), 2002.
[6] T.H. Cormen, C.E. Leiserson, R.L. Rivest, and C. Stein, Introduction to Algorithms. MIT Press and McGraw-Hill, 2001.
[7] Y. Zhou, B. Chin Ooi, and K.-L. Tan, “Disseminating Streaming Data in a Dynamic Environment: An Adaptive and Cost Based Approach,” The Int’l J. Very Large Data Bases, vol. 17, pp. 1465-1483, 2008.
[8] “Query Cost Model Validation for Sensor Data,” www.cse.iitb.ac.in/~grajeev/sumdiff/RaviVijay_BTP06.pdf, 2011.
[9] R. Gupta, A. Puri, and K. Ramamritham, “Executing Incoherency Bounded Continuous Queries at Web Data Aggregators,” Proc. 14th Int’l Conf. World Wide Web (WWW), 2005.
[10] A. Papoulis, Probability, Random Variables and Stochastic Processes. McGraw-Hill, 1991.
[11] C. Olston, J. Jiang, and J. Widom, “Adaptive Filters for Continuous Queries over Distributed Data Streams,” Proc. ACM SIGMOD Int’l Conf. Management of Data, 2003.
[12] S. Shah, K. Ramamritham, and C. Ravishankar, “Client Assignment in Content Dissemination Networks for Dynamic Data,” Proc. 31st Int’l Conf. Very Large Data Bases (VLDB), 2005.
[13] NEFSC Scientific Computer System, http://sole.wh.whoi.edu/~jmanning//cruise/serve1.cgi, 2011.