ISSN: 1694-2507 (Print)
ISSN: 1694-2108 (Online)
International Journal of Computer Science
and Business Informatics
(IJCSBI.ORG)
VOL 8, NO 1
DECEMBER 2013
Table of Contents VOL 8, NO 1 DECEMBER 2013
An Integrated Distributed Clustering Algorithm for Large Scale WSN
S. R. Boselin Prabhu, S. Sophia, S. Arthi and K. Vetriselvi

An Efficient Connection between Statistical Software and Database Management System
Sunghae Jun

Pragmatic Approach to Component Based Software Metrics Based on Static Methods
S. Sagayaraj and M. Poovizhi

SDI System with Scalable Filtering of XML Documents for Mobile Clients
Yi Yi Myint and Hninn Aye Thant

An Easy yet Effective Method for Detecting Spatial Domain LSB Steganography
Minati Mishra and Flt. Lt. Dr. M. C. Adhikary

Minimizing the Time of Detection of Large (Probably) Prime Numbers
Dragan Vidakovic, Dusko Parezanovic and Zoran Vucetic

Design of ATL Rules for Transforming UML 2 Sequence Diagrams into Petri Nets
Elkamel Merah, Nabil Messaoudi, Dalal Bardou and Allaoua Chaoui
An Integrated Distributed Clustering
Algorithm for Large Scale WSN
S. R. BOSELIN PRABHU
Assistant Professor, Department of Electronics and Communication Engineering
SVS College of Engineering, Coimbatore, India.
S. SOPHIA
Professor, Department of Electronics and Communication Engineering
Sri Krishna College of Engineering and Technology, Coimbatore, India.
S. ARTHI & K. VETRISELVI
UG Students, Department of Electronics and Communication Engineering
SVS College of Engineering, Coimbatore, India.
Abstract
Recent advances in wireless communications and electronics have enabled the
development of low-cost wireless sensor nodes. Clustering is an effective
topology control approach that can prolong the lifetime and improve the
scalability of wireless sensor networks. The common criteria for a clustering
methodology are to select cluster heads with higher residual energy and to
rotate them periodically. Sensors at heavy-traffic locations quickly deplete
their energy resources and die much earlier, leaving behind an energy hole and
a partitioned network. In this paper, a model of a distributed layer-based
clustering algorithm is proposed based on three concepts. First, the
aggregated data is forwarded from a cluster head to the base station through
the cluster head of the next higher layer with the shortest distance between
the cluster heads. Second, the cluster head is elected based on the clustering
factor, which combines the residual energy and the number of neighbors of a
particular node within a cluster. Third, each cluster has a crisis hindrance
node, which performs the function of the cluster head when the cluster head
fails to carry out its work under critical conditions. The key aim of the
proposed algorithm is to achieve energy efficiency and to prolong the network
lifetime. The proposed distributed clustering algorithm is compared with the
existing clustering algorithm LEACH.
Keywords: Wireless sensor network (WSN), distributed clustering
algorithm, cluster head, residual energy, energy efficiency, network lifetime.
International Journal of Computer Science and Business Informatics
IJCSBI.ORG
ISSN: 1694-2108 | Vol. 8, No. 1. DECEMBER 2013 2
1. INTRODUCTION
A wireless sensor network (WSN) is a collection of a large number of small,
low-power and low-cost electronic devices called sensor nodes. Each sensor
node consists of four major units, namely sensing, processing, power and
communication, which are responsible for sensing, processing and wireless
communication (Figure 1). The nodes gather relevant data from the environment
and then transfer the gathered data to the base station (BS). Since WSNs have
many advantages, such as self-organization, freedom from fixed infrastructure,
fault tolerance and locality, they have a wide variety of potential
applications, including border security and surveillance, environmental
monitoring and forecasting, wildlife protection, home automation, and disaster
management and control. Because sensor nodes are usually deployed in remote
locations, it is impossible to recharge their batteries. Therefore, using the
limited energy resource wisely to extend the lifetime of the network is a
demanding research issue for these sensor networks.
Figure 1: Various components of a wireless sensor node
Clustering [2-7] is an effective topology control approach that can prolong
the lifetime and increase the scalability of these sensor networks. The
popular criterion for a clustering technique (Figure 2) is to select a cluster
head (CH) with more residual energy and to rotate it periodically. The basic
idea of clustering algorithms is to use the data aggregation [8-11] mechanism
in the cluster head to reduce the amount of data transmission. Clustering
offers advantages such as network scalability, localized
route setup, efficient use of communication bandwidth [17] and prolonged
network lifetime [12-16]. Through the data aggregation process, unnecessary
communication between the sensor nodes, the cluster head and the base station
is avoided. In this paper, a well-defined model of a distributed layer-based
clustering algorithm is proposed based on three concepts: the aggregated data
is forwarded from the cluster head to the base station through the cluster
head of the next higher layer with the shortest distance between the cluster
heads; the cluster head is elected based on the clustering factor; and the
crisis hindrance node performs the function of the cluster head when the
cluster head fails to carry out its work. The prime aim of the proposed
algorithm is to attain energy efficiency and increased network lifetime.
Figure 2: Cluster formation in a wireless sensor network
The rest of this paper is structured as follows. Section 2 reviews existing
distributed clustering algorithms and discusses their advantages and
shortcomings. Section 3 evaluates the existing clustering algorithm LEACH
(Low Energy Adaptive Clustering Hierarchy) and briefly describes the basic
concept behind it. Section 4 presents the model of the proposed distributed
layer-based clustering algorithm and the concepts behind it. Finally, the
last section concludes the paper.
2. A REVIEW OF EXISTING CLUSTERING ALGORITHMS
Bandyopadhyay and Coyle proposed EEHC [18], a randomized clustering algorithm
that organizes the sensor nodes into a hierarchy of clusters with the
objective of minimizing the total energy spent in the system to communicate
the information gathered by the sensors to the information processing center.
It has a variable cluster count; the stationary cluster head aggregates and
relays the data to the BS, and the algorithm is valid for extensive
large-scale networks. Its particular drawback is that some nodes remain
un-clustered throughout the clustering process.
Barker, Ephremides and Flynn proposed LCA [19], which was chiefly developed to
avoid communication collisions among the nodes by using TDMA time slots. It
uses a single-hop scheme, thereby attaining a high degree of connectivity when
the CH is selected randomly. A restructured version of LCA, LCA2, was
implemented to reduce the number of cluster heads produced compared with the
original LCA algorithm. The key drawback of this algorithm is that single-hop
clustering leads to the creation of a larger number of clusters.
Nagpal and Coore proposed CLUBS [20], which forms overlapping clusters with a
maximum cluster diameter of two hops. The clusters are created by local
broadcasting, and convergence depends on the local density of the wireless
sensor nodes. The algorithm can be implemented in an asynchronous environment
without losing efficiency. The main difficulty is the overlapping of clusters:
when clusters have their CHs within one-hop range of each other, both clusters
collapse and the CH election process is restarted.
Demirbas, Arora and Mittal proposed FLOC [21], which exploits the double-band
nature of the wireless radio model for communication. Nodes can communicate
reliably with nodes in the inner band and unreliably with nodes in the outer
band. The chief disadvantage of the algorithm is that communication between
nodes in the outer band is unreliable, so messages have a high probability of
being lost during communication.
Ye, Li, Chen and Wu proposed EECS [22], which is based on the assumption that
all CHs can communicate directly with the BS. The clusters have variable
sizes: those closer to the BS are larger and those farther from the BS are
smaller. It is energy efficient in intra-cluster communication and shows an
excellent improvement in network lifetime.
EEUC aims at uniform energy consumption within the sensor network. It forms
unequal clusters, with the assumption that each cluster can have a variable
size. The probabilistic selection of CHs is the focal shortcoming of this
algorithm, and a few nodes may be left without being part of any cluster.
Yu, Li and Levy proposed DECA, which selects the CH based on residual energy,
connectivity and a node identifier. It is highly energy efficient, as it uses
fewer messages for CH selection. The main trouble with this algorithm is the
high risk of wrong CH selection, which leads to the discarding of every packet
sent by the wireless sensor node.
Ding, Holliday and Celik proposed DWEHC, which elects CHs on the basis of a
weight that combines a node's residual energy and its distance to the
neighboring nodes. It produces well-balanced clusters, independent of network
topology. The node possessing the largest weight in a cluster is designated as
CH. The algorithm constructs multilevel clusters, and the nodes in every
cluster reach the CH by relaying through other intermediate nodes. The
foremost problem is the high energy utilization of the several iterations
needed until the nodes settle into the most energy-efficient topology.
HEED is a well-distributed clustering algorithm in which CH selection takes
into account both the residual energy of the nodes and the intra-cluster
communication cost, leading to a prolonged network lifetime. It can have a
variable cluster count and supports heterogeneous sensors. The problems with
HEED are that its application is limited to static networks and that, even
though it prevents random selection of the CH, it employs complex methods and
multiple clustering messages per node for CH selection.
3. AN EVALUATION OF LEACH ALGORITHM
LEACH [1] is one of the most popular clustering mechanisms for WSNs and is
considered the representative energy-efficient protocol. In this protocol,
sensor nodes are grouped together to form clusters. In each cluster, one
sensor node is chosen arbitrarily to act as the cluster head (CH), which
collects data from its member nodes, aggregates them and then forwards the
result to the base station. The operation is divided into many rounds, and
each round consists of two phases: the set-up phase and the steady phase.
During the set-up phase, initial clusters are formed and cluster heads are
selected. Each wireless sensor node generates a random number between 0 and 1.
If the number is less than the threshold, the node selects itself as the
cluster head for the present round. The threshold for cluster head selection
in LEACH for a particular round is given in equation (1). After selecting
itself as a CH, the sensor node broadcasts an advertisement message that
contains its own ID. The non-cluster-head nodes then decide which cluster to
join based on the strength of the received advertisement signal. After the
decision is made, every non-cluster-head node transmits a join-request message
to the chosen cluster head to indicate that it will be a member of that
cluster. After receiving all the join-request messages, the cluster head
creates and broadcasts a time division multiple access (TDMA) schedule so that
data can be exchanged with the non-cluster-head sensor nodes without
collision.
T(n) =
\begin{cases}
\dfrac{p}{1 - p\left(r \bmod \dfrac{1}{p}\right)} & \text{if } n \in G \\
0 & \text{otherwise}
\end{cases}
\qquad (1)
where p is the preferred percentage of cluster heads, r is the current round
number and G is the set of nodes which have not been chosen as cluster
head for the last 1/p rounds.
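For illustration, the threshold test of equation (1) can be expressed in a few
lines of R. This is a minimal sketch, not taken from [1]; the bookkeeping
argument rounds_since_ch is a hypothetical name used only to track whether the
node still belongs to G.

leach_is_cluster_head <- function(p, r, rounds_since_ch) {
  # A node is in G only if it has not served as CH in the last 1/p rounds
  if (rounds_since_ch < 1 / p) return(FALSE)
  threshold <- p / (1 - p * (r %% (1 / p)))   # equation (1)
  runif(1) < threshold                        # uniform draw in [0,1]; below threshold -> elect itself
}

# Example: 5% desired cluster heads, round 7, node not recently a CH
leach_is_cluster_head(p = 0.05, r = 7, rounds_since_ch = 20)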
The steady phase commences after the clusters are formed and the TDMA
schedules are broadcast. Each sensor node transmits its data to the cluster
head once per round during its allotted transmission slot in the TDMA
schedule; at other times it turns off its radio in order to reduce energy
consumption. The cluster heads, however, must stay awake all the time so that
they can receive all data from the nodes within their own clusters. On
receiving the data from the cluster, the cluster head carries out data
aggregation and forwards the result to the base station directly. This is the
entire mechanism of the steady-state phase. After a certain predefined time,
the network steps into the next round. LEACH is the basic cluster-based
protocol, and it can prolong the network lifetime in comparison with other
multi-hop and static routing schemes. However, some hidden problems remain to
be considered.
LEACH does not take the residual energy into account when electing cluster
heads and constructing the clusters. As a result, nodes with less energy may
be elected as cluster heads and then die much earlier. Moreover, since a node
selects itself as a cluster head only according to the value of the calculated
probability, it is hard to guarantee the number of cluster heads and their
distribution. Also, because the cluster heads are selected randomly in LEACH,
the weaker nodes drain easily. To overcome
these shortcomings of LEACH, a model of a distributed layer-based clustering
algorithm is proposed in which clusters are arranged into hierarchical layers.
Instead of the cluster heads sending the aggregated data directly to the base
station, each sends it to the nearest cluster head in the next layer. These
cluster heads send their own data, along with the data received from
lower-level cluster heads, to the nearest cluster heads of the next layer.
This cumulative process is repeated until the data from all layers reach the
base station. The proposed model focuses on reduced energy utilization and
improved network lifetime of the sensor network.
4. THE PROPOSED CLUSTERING ALGORITHM
The proposed clustering algorithm is well distributed, where the sensor
nodes are deployed randomly to sense the target environment. The nodes are
divided into clusters, each having a CH. The nodes send their data during
their allotted TDMA time slots to their respective CH, which fuses the data
through data aggregation to avoid redundant information.
The aggregated data is forwarded to the BS. Compared to the existing
algorithms, the proposed algorithm has three distinguishing features. First,
the aggregated data is forwarded from the cluster head to the base station
through cluster head of the next higher layer with shortest distance between
the cluster heads. Second, cluster head is elected based on the clustering
factor, which is the combination of residual energy and the number of
neighbors of a particular node within a cluster. Third, each cluster has a
crisis hindrance node, which performs the function of the cluster head when
the cluster head fails to carry out its work under certain conditions.
Figure 3: Aggregated data forwarding in the proposed algorithm
A. Aggregated Data Forwarding
In a network of N nodes, each node is assigned a unique Node Identity (NID).
The NID serves only to identify a node and has no relationship with location
or clustering. Within a cluster, the CH is placed at the center and the member
nodes are organized into several layers around it. All clusters are arranged
into hierarchical layers, and a layer number is assigned to each cluster: the
cluster farthest from the base station is designated as the lowest layer and
the cluster nearest to the base station as the highest layer. A characteristic
feature of the proposed algorithm is that the lowest-layer cluster head
forwards only its own aggregated data to the next-layer cluster head, whereas
the highest layer forwards all the aggregated data from the preceding cluster
heads to the base station (Figure 3). Thus a lower workload is assigned to the
lower layers and a greater workload to the higher layers. The workload
assigned to a particular cluster head is directly proportional to its energy
utilization. To balance the energy utilization among the cluster heads,
variable transmission power is employed, where the transmission power
decreases as the layer number increases. In LEACH, each cluster head forwards
the aggregated data directly to the base station, which consumes much energy.
The proposed algorithm forwards data from the cluster heads to the base
station in a multi-hop fashion, resulting in reduced energy utilization.
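As an illustration of this forwarding rule, the following R sketch picks the
next hop of each cluster head as the nearest cluster head in the next higher
layer, with the highest layer sending directly to the base station. The data
frame ch, its columns and the coordinates are hypothetical names chosen for
the example, not taken from the paper.

# Hypothetical layout: three cluster heads with coordinates and layer numbers
ch <- data.frame(id = c("CH1", "CH2", "CH3"),
                 x = c(10, 40, 80),
                 y = c(10, 35, 70),
                 layer = c(1, 2, 3))          # layer 3 is nearest to the base station

next_hop <- function(ch, i) {
  higher <- ch[ch$layer == ch$layer[i] + 1, ]
  if (nrow(higher) == 0) return("BS")         # highest layer forwards directly to the BS
  d <- sqrt((higher$x - ch$x[i])^2 + (higher$y - ch$y[i])^2)
  as.character(higher$id[which.min(d)])       # nearest CH in the next higher layer
}

sapply(seq_len(nrow(ch)), function(i) next_hop(ch, i))   # "CH2" "CH3" "BS"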
Figure 4: Mechanism of cluster head selection in the proposed algorithm
B. Cluster Head Selection
The cluster head is elected based on the clustering factor (Figure 4), which
is the combination of the residual energy and the number of neighbors of a
particular node within a cluster. Residual energy is defined as the energy
remaining within a particular node after some number of rounds; it is one of
the main parameters for CH selection in the proposed algorithm. A neighboring
node is a node that lies within one-hop distance of a particular node. LEACH
selects the cluster head randomly, without considering residual energy or the
number of neighbors, whereas the proposed algorithm includes both parameters
in order to elect the cluster head properly and thereby reduce the node death
rate. A further characteristic of the proposed algorithm compared with LEACH
is that the base station is not involved in the clustering process, directly
or indirectly. The node with the highest clustering factor is selected as the
cluster head for the current round. This is particularly significant in mobile
environments: when the sensor nodes move, the number
of neighbors varies and should be taken into account, which the LEACH
clustering mechanism does not consider.
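A minimal R sketch of this election rule is given below. The paper does not
specify how residual energy and neighbor count are combined into the
clustering factor, so a normalized weighted sum with hypothetical weights is
assumed here purely for illustration.

elect_cluster_head <- function(residual_energy, neighbor_count,
                               w_energy = 0.5, w_neighbors = 0.5) {
  # Clustering factor: assumed normalized weighted sum (not specified in the paper)
  cf <- w_energy * residual_energy / max(residual_energy) +
        w_neighbors * neighbor_count / max(neighbor_count)
  which.max(cf)   # the node with the highest clustering factor becomes CH
}

# Four nodes in one cluster: residual energies and one-hop neighbor counts
elect_cluster_head(residual_energy = c(0.8, 0.6, 0.9, 0.4),
                   neighbor_count  = c(3, 5, 4, 2))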
C. Alternate Crisis Hindrance Node
In a cluster with a large number of nodes, a cluster-head crisis does not
greatly affect the overall performance of the wireless sensor system, but in a
network with fewer nodes it does. Care should therefore be taken during the
cluster head selection process by providing an alternate recovery mechanism.
In addition to the regular cluster head, one additional cluster node is
assigned the task of secondary cluster head; this node is called the crisis
hindrance node. Generally the cluster collapses when the cluster head fails;
in such situations, the crisis hindrance node acts as the cluster head and
recovers the cluster. A characteristic feature of the proposed algorithm is
that the crisis hindrance node performs only the recovery function and does
not take part in the sensing process. In LEACH, the distribution and loading
of CHs across the nodes in the network is not uniform even though the cluster
heads are switched periodically. Hence there is a high probability of a
cluster collapsing, which the proposed algorithm avoids with the help of the
crisis hindrance node.
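The recovery role can be sketched in R as follows, assuming (as an
illustration only; the paper does not state the selection rule) that the node
with the second-highest clustering factor is kept aside as the crisis
hindrance node and is promoted when the elected cluster head fails.

cf <- c(0.74, 0.71, 0.90, 0.35)                  # clustering factors of the cluster members
ord <- order(cf, decreasing = TRUE)
roles <- list(ch = ord[1], hindrance = ord[2])   # elected CH and crisis hindrance node

active_head <- function(roles, ch_alive) {
  if (ch_alive) roles$ch else roles$hindrance    # failover when the CH fails
}

active_head(roles, ch_alive = FALSE)             # the hindrance node (node 1) takes over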
5. CONCLUSION AND FUTURE WORK
This paper gives a brief introduction to the clustering process in wireless
sensor networks. A study of the well-evaluated distributed clustering
algorithm Low Energy Adaptive Clustering Hierarchy (LEACH) is described. To
overcome the drawbacks of the existing LEACH algorithm, a model of a
distributed layer-based clustering algorithm is proposed for clustering the
wireless sensor nodes. In the proposed distributed clustering algorithm, the
aggregated data is forwarded from the cluster head to the base station through
the cluster head of the next higher layer with the shortest distance between
the cluster heads. The selection of the cluster head is based on the
clustering factor, which combines the residual energy and the number of
neighbors of a particular node within a cluster. In addition, each cluster has
a crisis hindrance node. In future work, the algorithm will be simulated using
a network simulator, and the simulation results will be compared with two or
three existing distributed clustering algorithms.
6. ACKNOWLEDGMENTS
Our sincere gratitude goes to the management of SVS Educational Institutions
and to the research supervisor Dr. S. Sophia, who served as a guiding light
for this research work.
REFERENCES
[1] W.B.Heinzelman, A.P.Chandrakasan, H.Balakrishnan, (2002), “An application specific
protocol architecture for wireless microsensor networks”, IEEE Transactions on Wireless
Communication Volume 1, Number 4, Pages 660-670.
[2] O.Younis, S.Fahmy, (2004), “HEED: A hybrid energy-efficient distributed clustering
approach for adhoc sensor networks”, IEEE Transactions on Mobile Computing, Volume 3,
Number 4, Pages 366-379.
[3] S.Zairi, B.Zouari, E.Niel, E.Dumitrescu, (2012), “Nodes self-scheduling approach for
maximizing wireless sensor network lifetime based on remaining energy” IET Wireless
Sensor Systems, Volume 2, Number 1, Pages 52-62.
[4] I.Akyildiz, W.Su, Y.Sankarasubramaniam, E.Cayirci, (2002), “A Survey on sensor
networks”, IEEE Communications Magazine, Pages 102-114.
[5] G.J.Pottie, W.J.Kaiser, (2000), “Embedding the internet: wireless integrated network
sensors”, Communications of the ACM, Volume 43, Number 5, Pages 51-58.
[6] J.H.Chang, L.Tassiulas, (2004), “Maximum lifetime routing in wireless sensor
networks”, IEEE/ACM Transactions on Networking, Volume 12, Number 4, Pages 609-
619.
[7] S.R.Boselin Prabhu, S.Sophia, (2011), “A survey of adaptive distributed clustering
algorithms for wireless sensor networks”, International Journal of Computer Science and
Engineering Survey, Volume 2, Number 4, Pages 165-176.
[8] S.R.Boselin Prabhu, S.Sophia, (2012), “A Research on decentralized clustering
algorithms for dense wireless sensor networks”, International Journal of Computer
Applications , Volume 57, Number 20, Pages 0975-0987.
[9] S.R.Boselin Prabhu, S.Sophia, (2013), “Mobility assisted dynamic routing for mobile
wireless sensor networks”, International Journal of Advanced Information Technology ,
Volume 3, Number 1, Pages 09-19.
[10] S.R.Boselin Prabhu, S.Sophia, (2013), “A review of energy efficient clustering
algorithm for connecting wireless sensor network fields”, International Journal of
Engineering Research & Technology, Volume 1, Number 4, Pages 477–481.
[11] S.R.Boselin Prabhu, S.Sophia, (2013), “Capacity based clustering model for dense
wireless sensor networks”, International Journal of Computer Science and Business
Informatics, Volume 5, Number 1.
[12] J.Deng, Y.S.Han, W.B.Heinzelman, P.K.Varshney, (2005), “Balanced-energy sleep
scheduling scheme for high density cluster-based sensor networks”, Elsevier Computer
Communications Journal, Special Issue on ASWN04, Pages 1631-1642.
[13] C.Y.Wen, W.A.Sethares, (2005), “Automatic decentralized clustering for wireless
sensor networks”, EURASIP Journal of Wireless Communication Networks, Volume 5,
Number 5, Pages 686-697.
[14] S.D.Murugananthan, D.C.F.Ma, R.I.Bhasin, A.O.Fapojuwo, (2005) “A centralized
energy-efficient routing protocol for wireless sensor networks”, IEEE Transactions on
Communication Magazine, Volume 43, Number 3, Pages S8-13.
[15] F.Bajaber, I.Awan, (2009), “Centralized dynamic clustering for wireless sensor
networks”, Proceedings of the International Conference on Advanced Information
Networking and Applications.
[16] Pedro A. Forero, Alfonso Cano, Georgios B.Giannakis, (2011), “Distributed clustering
using wireless sensor networks”, IEEE Journal of Selected Topics in Signal Processing,
Volume 5, Pages 707-724.
[17] Lianshan Yan, Wei Pan, Bin Luo, Xiaoyin Li, Jiangtao Liu, (2011), “Modified energy-
efficient protocol for wireless sensor networks in the presence of distributed optical fiber
sensor link”, IEEE Sensors Journal, Volume 11, Number 9, Pages 1815-1819.
[18] S.Bandyopadhay, E.Coyle, (2003), “An energy-efficient hierarchical clustering
algorithm for wireless sensor networks”, Proceedings of the 22nd
Annual Joint Conference
of the IEEE Computer and Communications Societies (INFOCOM 2003), San Francisco,
California.
[19] D.J.Barker, A.Ephremides, J.A.Flynn, (1984), “The design and simulation of a mobile
radio network with distributed control”, IEEE Journal on Selected Areas in
Communications, Pages 226-237.
[20] R.Nagpal, D.Coore, (2002), “An algorithm for group formation in an amorphous
computer”, Proceedings of IEEE Military Communications Conference (MILCOM 2002),
Anaheim, CA.
[21] M.Demirbas, A.Arora, V.Mittal, (2004), “FLOC: A fast local clustering service for
wireless sensor networks”, Proceedings of Workshop on Dependability Issues in Wireless
Ad Hoc Networks and Sensor Networks (DIWANS’04), Italy.
[22] M.Ye, C.F.Li, G.H.Chen, J.Wu, (2005), “EECS: An energy efficient clustering scheme
in wireless sensor networks”, Proceedings of the Second IEEE International Performance
Computing and Communications Conference (IPCCC), Pages 535-540.
An Efficient Connection between
Statistical Software and Database
Management System
Sunghae Jun
Department of Statistics, Cheongju University
Chungbuk 360-764 Korea
ABSTRACT
In the big data era, we need to manipulate and analyze big data. As a first
step of big data manipulation, we can consider a traditional database
management system. To discover novel knowledge from a big data environment, we
should analyze the big data. Many statistical methods have been applied to big
data analysis, and most statistical analysis work depends on statistical
software such as SAS, SPSS, or the R project. In addition, a considerable
portion of big data is stored in diverse database systems. But the data types
of general statistical software are different from those of database systems
such as Oracle or MySQL. So, many approaches for connecting statistical
software to a database management system (DBMS) have been introduced. In this
paper, we study an efficient connection between statistical software and a
DBMS. To show its performance, we carry out a case study using a real
application.
Keywords
Statistical software, Database management system, Big data analysis, Database connection,
MySQL, R project.
1. INTRODUCTION
Every day, huge amounts of data are created in diverse fields and stored in
computer systems. These big data are extremely large and complex [1], so it is
very difficult to manage and analyze them. Yet big data analysis is an
important issue in many fields such as marketing, finance, technology and
medicine. Big data analysis is based on statistics and machine learning
algorithms. In addition, data analysis depends on statistical software, and
the data are stored in database systems. So, for big data analysis, we should
manage statistical software and database systems effectively. In this paper,
we consider the R project as the statistical software. R is an environment for
statistical computing, including statistical analysis and graphical display of
data [2], and it provides most of the statistical and machine learning methods
needed for big data analysis. We use MySQL as the database system connected
from R. MySQL is a database management system (DBMS) product that is the most
popular open source database in the world and, like R, it is free software
[3]. So, in our research, we use R and MySQL for an efficient connection
between statistical software and a DBMS. There was earlier work on DB access
through R [4]. It covered
the DB access problems of R and presented the ODBC (open database
connectivity) drivers for connecting R to DBMSs such as MySQL, PostgreSQL, and
Oracle. The authors also introduced the installation and technological
environment for DB access. However, they did not illustrate detailed
approaches for real applications; that is, their work was a conceptual
suggestion for access from R to MySQL. So, in this paper, we perform a more
specific study of the connection between the statistical software R and the
DBMS MySQL. In our case study, we show a detailed and efficient connection of
R to MySQL using a specific data set from the University of California, Irvine
(UCI) machine learning repository [5]. We cover the research background in the
next section. In Section 3, our proposed methodology is presented, and a case
study of the connection is given in Section 4. Lastly, we conclude our study
and offer future work on statistical database systems.
2. RESEARCH BACKGROUND
2.1 Statistical Software
To analyze data, we can consider diverse approaches using statistical
software, and these days there are many statistical software products. SAS
(Statistical Analysis System) is the most popular software for statistical
analysis [6], but it is expensive, so few companies other than large ones use
it. SPSS (Statistical Package for the Social Sciences) is another
representative product [7], but it is also expensive. Minitab [8] and
S-Plus [9] are widely used statistics packages, and none of them is free.
Recently, R has been used in many works for statistical data analysis, and it
is free. In addition, R provides most of the statistical functions included in
SAS or SPSS. R is an open source program, so we can modify R functions for our
own statistical computing, which is a very useful advantage of R. Therefore,
we consider R for the connection to the database system in this research.
2.2 Database Management System
A database is a collection of data, and a database management system (DBMS)
is software for managing a database using the structured query language (SQL)
[10],[11]. Oracle is one of the most popular DBMS products [12], but it is
expensive. MySQL is another DBMS and is the most widely used open source
database in the world [3]; most functions of MySQL are similar to those of
Oracle [3]. So, in this paper, we use MySQL as the DBMS connected to the
statistical software R. To use the MySQL DBMS efficiently, we use the RODBC
package provided through the R CRAN in our research [13].
3. STATISTICAL DATABASE SYSTEM
The main goal of our study is to solve the cost problem of constructing a
statistical database system, because we normally have to buy an additional
product to connect statistical software to a DBMS. For example, for the
connection
between SAS and a DBMS, we need the 'SAS/Access' product as supplementary
software, which is in general expensive. So we tried to make the connection
between statistical software and DBMS without cost; the 'efficiency' in this
paper therefore refers to cost. There are many approaches to connect
statistical software to a DBMS, but most of them require additional products
and only a few are free. So we looked for an approach that connects
statistical software and DBMS without cost. In this paper, we study an
efficient connection between a DBMS and statistical software. We select MySQL
as the DBMS and the R project as the statistical software, because they are
free and provide good functionality. In addition, R and MySQL have strong
performance in statistical computing and database management, respectively,
for constructing a statistical database system [14],[15],[16],[17]. In
general, big data are transformed into a structured data type for statistical
analysis as follows:
Figure 1. From big data to statistical analysis
First, big data are stored in the DB by creating tables. Second, big data are
changed into structured data by preprocessing based on text mining. The data
from both the DB and text mining are then analyzed statistically. We find that
the text mining process is hard work for data preprocessing [18], so table
creation is the more effective approach for big data analysis. To construct a
MySQL DB, we use the console or a graphical user interface (GUI) environment
as follows:
Figure 2. User interface of MySQL
In this paper, we use SQL code in the MySQL console. We also use RODBC as an
ODBC database interface between R and MySQL [13]. In the R system, a package
is a set of additional R functions. R packages are not installed in the basic
R system; if we need to use a package, we have to add it to the R system. We
can search for all packages on the R CRAN and install them from the CRAN [19].
The RODBC package provides efficient functions for ODBC database access, so
our research is based on the RODBC package to connect R to MySQL. To install
RODBC in the R system, we select an R CRAN mirror site. After installing
RODBC, we load the package in the R system as follows;
>library(RODBC)
The R system uses the 'library' function to load a package. With this R code,
we can use all the functions provided by the RODBC package, such as
odbcConnect, sqlFetch, and sqlQuery; they are used in our research for DB
access and connection. To connect to the MySQL DB, we use the 'odbcConnect'
function of the RODBC package as follows;
>db_con = odbcConnect("stat_MySQL")   # User, Password and Database are specified in the DSN configuration
The DSN is 'stat_MySQL', and the 'db_con' object in the R system holds the
result of the connection. In this connection process, we specify the user
name, the password, and the database. If R and MySQL are connected to each
other, we can list the tables of the MySQL DB using the 'sqlTables' function
as follows;
>sqlTables(db_con)
TABLE_CAT TABLE_SCHEM TABLE_NAME TABLE_TYPE
REMARKS
The result of this function is information about the connected DB and its tables.
3.1 Structure of DB Connection Software
In general, to connect a DBMS to application software, we should use an ODBC
connector [20]. R, as statistical software, also needs an ODBC driver to
access the MySQL DBMS. In this paper, we consider the RODBC package for an
efficient connection between R and MySQL. Figure 3 shows the ODBC connection
between DBMS and statistical software, together with their specific products.
Figure 3. Connection between DBMS and statistical software
Oracle and MySQL are representative DBMS products, and SAS and the R system
are popular software for statistical analysis. A general ODBC program is used
for connecting application software to a DBMS, so there are many ODBC drivers
for diverse DBMS and application products. Our work is focused on the
connection between R and MySQL, and we select RODBC as the ODBC driver. RODBC
is one of many R packages for DB access. RMySQL is another R package for
connecting R and MySQL [21]; this package is also an R interface for accessing
the MySQL DBMS. In addition to RODBC and RMySQL, there are other packages for
connecting R to MySQL. In this paper, we use RODBC for MySQL access. It plays
the role of an ODBC driver, like the SAS connection to a DBMS shown below.
Figure 4. Connection between MySQL/Oracle and SAS
SAS uses ODBC drivers for diverse DBMSs such as MySQL and Oracle, and the
drivers use a data source name (DSN). In this research, we also use a DSN for
the RODBC package. Next, we show the connection between R and MySQL in more
detail.
3.2 Efficient Connection between R and MySQL
The RODBC package of the R system is an efficient ODBC connector. It includes
diverse functions for accessing a DBMS, as follows;
•odbcConnect: opens a connection to an ODBC data source
•sqlFetch: reads a table from the DB into an R data frame
•sqlQuery: submits an SQL query to the DB and returns the result
•sqlSave: writes an R data frame to a table in the DB
We can also use further functions for accessing and manipulating the MySQL DB
with the RODBC package. The process of connecting R and MySQL is as follows;
Figure 5. Connecting process between R and MySQL
Using the RODBC package, the R system gets the necessary data from the MySQL
DB, and we analyze the retrieved data. The R system also accesses MySQL
through the sqlQuery function of RODBC and can create a table for storing the
analysis results produced by the R system. Our process of connection between R
and MySQL is shown below;
Figure 6. Connecting process between R and MySQL
A table of the MySQL DB is transformed into an object in R by the RODBC
connector, so we are able to analyze the object data from the DB table. We can
also perform online analytical processing (OLAP) for data summarization and
visualization. Next, we carry out a case study to verify our work.
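A minimal R sketch of this round trip is shown below, assuming a DSN named
"stat_MySQL" has been configured; the table name mytable, the credentials and
the toy analysis are hypothetical and serve only to show the fetch-analyze-save
pattern with real RODBC functions.

library(RODBC)
con <- odbcConnect("stat_MySQL", uid = "user", pwd = "password")

dat <- sqlFetch(con, "mytable")                       # DB table -> R data frame
result <- data.frame(variable = names(dat),           # a toy analysis in R:
                     missing  = colSums(is.na(dat)))  # missing values per column
sqlSave(con, result, tablename = "analysis_result",   # write the result back to the DB
        rownames = FALSE)
odbcClose(con)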
4. CASE STUDY
To illustrate a case study on a real problem, we used the 'RODBC' package from
the R project [13]. This is software for an ODBC database connection between R
and a DBMS such as MySQL. We also carried out an experiment using an example
data set from the UCI machine learning repository [5].
4.1 UCI Machine Learning Repository
For our case study, we used the "Abalone" data set from the UCI machine
learning repository [5]. This data set consists of 8 variables (columns) and
4,177 observations (rows). The main goal of the data is to predict the age of
abalone from physical measurements. The following table shows the variables
and their values [5].
Table 1. Variables of the Abalone data set
Variable         Data type    Description
Sex              Nominal      M (male), F (female), I (infant)
Length           Continuous   Longest shell measurement
Diameter         Continuous   Perpendicular to length
Height           Continuous   With meat in shell
Whole_weight     Continuous   Whole abalone
Shucked_weight   Continuous   Weight of meat
Viscera_weight   Continuous   Gut weight (after bleeding)
Shell_weight     Continuous   After being dried
Rings            Discrete     +1.5 gives the age in years
The last variable (Rings) is the target variable, and the others are all input
variables. We constructed a MySQL DB using this data set. The original data
file from the UCI machine learning repository is separated by commas, but
MySQL needs a tab-separated file for DB loading. So we transformed the data
type using Excel as follows (an equivalent conversion in R is sketched after
Figure 7).
Figure 7. Data transformation for MySQL loading
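As an alternative to the Excel step, the same conversion can be scripted in R.
The sketch below is only an illustration; the file paths are hypothetical.

# Read the comma-separated UCI file and write it out tab-separated for LOAD DATA INFILE
abalone <- read.csv("abalone.data", header = FALSE)
write.table(abalone, "d:/data/abalone.txt", sep = "\t",
            quote = FALSE, row.names = FALSE, col.names = FALSE)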
To load the text data file into MySQL, we need a table in which to store the
data, so we create the table in the next step.
4.2 DB Creation
We used SQL to create a table for loading the Abalone data set into the MySQL
DBMS as follows;
• CREATE DATABASE case_study;
• USE case_study;
• CREATE TABLE abalone( Sex CHAR(3), Length FLOAT(10), Diameter
FLOAT(10), Height FLOAT(10), Whole_weight FLOAT(10),
Shucked_weight FLOAT(10), Viscera_weight FLOAT(10), Shell_weight
FLOAT(10), Rings INT(5));
• LOAD DATA INFILE 'd:/data/abalone.txt' INTO TABLE abalone;
• SELECT * FROM abalone;
Using the above SQL code, we constructed a table for the Abalone data in the
MySQL DB (case_study). Next, we connected the abalone table in the case_study
DB to the R system.
4.3 Connecting R to MySQL
We used the RODBC package for connecting R to MySQL as follows;
>library(RODBC)
>abalone_con=odbcConnect("abalone_ODBC")
>sqlTables(abalone_con)
TABLE_SCHEM TABLE_NAME TABLE_TYPE
case_study abalone TABLE
>vars=sqlQuery(abalone_con, "SELECT sex, diameter, rings FROM
abalone")
Sex Diameter Rings
1 M 0.365 15
2 M 0.265 7
3 F 0.420 9
4 M 0.365 10
5 I 0.255 7
…
Using the above R code, we saved three variables of the abalone data set into
the 'vars' R object. From the result of the SQL query executed by the sqlQuery
function, we found that the abalone table had been created correctly. This
function enables the use of SQL within the R system, so we can analyze the
abalone data using the analytical functions of the R system. Next, the results
of the data analysis are shown.
4.4 Data Analysis
First, we performed data summarization of the three variables using the
'summary' function of the R system as follows;
>summary(vars)
sex diameter rings
F:1307 Min. :0.0550 Min. : 1.000
I:1342 1st Qu.:0.3500 1st Qu.: 8.000
M:1528 Median :0.4250 Median : 9.000
Mean :0.4079 Mean : 9.934
3rd Qu.:0.4800 3rd Qu.:11.000
Max. :0.6500 Max. :29.000
This function provides a frequency table or descriptive statistics according
to the data type (continuous or nominal). For example, diameter is a
continuous variable, so we get its minimum, 25th percentile, median, mean,
75th percentile, and maximum values. Next, we carried out data visualization
as follows;
>boxplot(vars$diameter)
Figure 8. Boxplot: data visualization of MySQL table
This shows a boxplot of the diameter variable of the abalone table. Using the
graphical functions supported by the R system, we can also obtain diverse
visualization results such as histograms, plots, and so on. Lastly, we
constructed a regression model using the 'lm' function as follows;
>regression_result=lm(rings~diameter, data=vars)
>summary(regression_result)
Residuals:
Min 1Q Median 3Q Max
-5.19 -1.69 -0.72 0.91 16.00
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 2.3186 0.1727 13.42 <2e-16 ***
diameter 18.6699 0.4115 45.37 <2e-16 ***
R-squared: 0.3302, Adj. R-squared: 0.3301
Regression is a popular model in statistical analysis. The dependent and
independent variables are 'rings' and 'diameter', respectively, so we obtained
the following regression equation: Rings = 2.3186 + 18.6699 × Diameter. Thus,
our case study illustrated the connection between R and MySQL in a real
application.
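As outlined in Section 3.2, the analysis result can also be written back to
the MySQL database; a short sketch is given below. The table name
regression_coef is our own choice, while regression_result and abalone_con are
the objects created above.

# Store the regression coefficients in a new table of the case_study database
coef_table <- data.frame(term     = names(coef(regression_result)),
                         estimate = coef(regression_result))
sqlSave(abalone_con, coef_table, tablename = "regression_coef", rownames = FALSE)
odbcClose(abalone_con)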
5. CONCLUSION
In this paper, we studied an efficient connection between a DBMS and
statistical software. We used the R system and MySQL as the statistical
software and the DBMS, respectively, and the RODBC package was used for the DB
connection. After connecting R and MySQL, we analyzed the data of a MySQL
table, and this approach can be expanded to big data analysis. In our
case study, we illustrated how our approach can be applied in a real
application, using the Abalone data set from the UCI machine learning
repository. Our result contributes to work related to big data analysis. In
addition, we can directly analyze the data stored in a DBMS with statistical
methods. In future work, we will expand the scope of the connection between
DBMS and statistical software to more products.
6. DISCUSSION
The biggest problem of a statistical database system is the cost of connecting
the statistical software and the DBMS. For example, we should buy the
'SAS/Access' product additionally and install it on the SAS base system to
connect SAS to a DBMS. Generally this supplementary product is expensive, so
most users have had difficulty using a statistical database system. In this
paper, we selected the R system as the statistical software instead of SAS,
and we used RODBC as the ODBC connector instead of SAS/Access, because R and
RODBC are both free. Yet their performance is similar to that of SAS, and for
newer analytical functions such as statistical learning theory and machine
learning algorithms they even surpass SAS.
REFERENCES
[1] Sathi, A. Big Data Analytics. An Article from IBM Corporation, 2012.
[2] Heiberger, R. M., and Neuwirth, E.R through Excel – A Spreadsheet Interface for
Statistics, Data Analysis, and Graphics. Springer, 2009.
[3] MySQL, The World’s most popular open source database. http://www.mysql.com,
accessed on October 2013.
[4] Sim, S., Kang, H., and Lee, Y. Access to Database through the R-Language. The
Korean Communications in Statistics, 15, 1 (2008), 51-64.
[5] UCI Machine Learning Repository, http://archive.ics.uci.edu/ml, accessed on October
2013.
[6] SAS, http://www.sas.com,accessed on October 2013.
[7] SPSS, http://www-01.ibm.com/software/analytics/spss/, accessed on October 2013.
[8] Minitab, http://www.minitab.com, accessed on October 2013.
[9] S-Plus, http://solutionmetrics.com.au/products/splus/, accessed on October 2013.
[10]Wikipedia, the free encyclopedia. http://en.wikipedia.org, accessed on October 2013.
[11] Date, C. J. An Introduction to Database Systems. 7th edition, Addison-Wesley, 2000.
[12]Oracle, http://www.oracle.com, accessed on October 2013.
[13]Ripley, B.Package RODBC. CRAN R-Project, 2013.
[14]R-bloggers, On R versus SAS. http://www.r-bloggers.com/on-r-versus-sas/, accessed on
December, 2013.
[15] LinkedIn, Advanced Business Analytics, Data Mining and Predictive Modeling.
http://www.linkedin.com/groups/SAS-versus-R-35222.S.65098787, accessed on
December, 2013.
[16]Clever Logic, MySQL vs. Oracle Security, http://cleverlogic.net/articles/mysql-vs-
oracle, accessed on December, 2013.
[17]Find The Best, Oracle vs MySQL, http://database-management-
systems.findthebest.com/saved_compare/Oracle-vs-MySQL, accessed on December,
2013.
[18]Han, J., and Kamber, M. Data Mining Concepts and Techniques. Morgan Kaufmann,
2001.
[19]R system, The R Project for Statistical Computing. http://www.r-project.org, accessed
on October 2013.
[20]Spector, P. Data Manipulation with R, Springer, 2008.
[21]James, D. A., and DebRoy, S.Package RMySQL. CRAN R-Project, 2013.
Pragmatic Approach to Component Based
Software Metrics Based on Static Methods
S. Sagayaraj
Department of Computer Science
Sacred Heart College, Tirupattur
M. Poovizhi
Department of Computer Science
Sacred Heart College, Tirupattur
ABSTRACT
Component-Based Software Engineering (CBSE) is an emerging technique for the
reuse of software. This paper presents component-based software metrics by
investigating improved measurement techniques. Two types of metrics are used:
static metrics and dynamic metrics. This research work presents the measured
metric values for the complexity metrics and the criticality metrics. The
static metrics are applied to an E-healthcare application that is developed
from reusable software components, and the value of each metric is analyzed
for the application. The measured metric values provide evidence for the
reusability and good maintainability of the component-based software system.
Keywords
Component Based Software Engineering, Component Based Software Metrics, Component
Based Software System.
1. INTRODUCTION
The demand for new software applications is currently increasing at an
exponential rate, while the number of qualified and experienced professionals
required for creating new software applications is not increasing
commensurately [1]. In software reuse, applications are built from existing
components, primarily by assembling and replacing interoperable parts.
Software professionals have therefore recognized reuse as a powerful means of
potentially overcoming this software crisis, and it promises significant
improvements in software productivity and quality [2].
There are two approaches to code reuse: develop the reusable code from
scratch, or identify and extract the reusable code from already developed code
[3]. Even for organizations with experience in developing software, there is
an extra cost in developing reusable components from scratch to build and
strengthen their reusable software reservoir. The cost of developing the
software from scratch can be saved by identifying and extracting the reusable
components from already developed and existing software systems or legacy
systems [4]. However, the problem of how to recognize reusable components in
existing systems has remained relatively unexplored. In
both cases, whether the organization is developing software from scratch or
reusing code from already developed projects, there is a need to evaluate the
quality of the potentially reusable piece of software. Metrics are essential
to demonstrate the quality of the components [5].
Software metrics are an essential part of the state of the practice in
software engineering. Goodman describes software metrics as "the continuous
application of measurement-based techniques to the software development
process and its products to supply meaningful and timely management
information, together with the use of those techniques to improve that process
and its products" [6]. Software metrics can serve one of four functions: to
understand, evaluate, control, or predict.
Various attributes that determine the quality of software include
maintainability, defect density, fault proneness, normalized rework,
understandability, reusability, etc. [5]. To achieve both the quality and the
productivity objectives, it is always recommended to go for software reuse,
which not only saves the time taken to develop the product from scratch but
also delivers almost error-free code, as the code has already been tested many
times during its software development [7].
During the last decade, the software reuse and software engineering
communities have come to a better understanding of component-based software
engineering. The development of a reuse process and repository produces a base
of knowledge that improves in quality after every reuse, minimizing the amount
of development work necessary for future projects and ultimately reducing the
risk of new projects that are based on repository knowledge [8].
Component-based software development (CBSD) centers on building large software
systems by integrating previously existing software components. By enhancing
the flexibility and maintainability of systems, this approach can potentially
be used to reduce software development costs, assemble systems rapidly, and
reduce the spiraling maintenance burden associated with the support and
upgrade of large systems [9].
The paper is organized as follows. Related work on component-based software
metrics is provided in Section 2. The component-based static and dynamic
metrics are listed in Section 3. The details of the implementation are
presented in Section 4, and the analysis of the complexity and criticality
metrics is described in Section 5. Finally, the last section concludes the
paper and offers further research directions in this area.
2. RELATED WORKS
Many works have been carried out in the area of component-based software
metrics. Some of them are listed below.
Nael Salman (2006) focused mainly on the complexity that results from factors
related to system structure and connectivity [10]. A new set of properties
that a component-oriented complexity metric must possess was also defined, and
the metrics were evaluated using these properties. A case study was conducted
to determine the power of the complexity metrics in predicting integration and
maintenance effort; its results revealed that component-oriented complexity
metrics can be of great value in predicting both integration and maintenance
efforts.
Arun Sharma, Rajesh Kumar, and P. S. Grover (2007) surveyed a few existing
component-based reusability metrics [11]. These metrics give a broad view of a
component's understandability, adaptability, and portability. The work also
expresses the analysis in terms of quality factors related to reusability, in
an approach that helps significantly in assessing existing components for
reusability.
V. Lakshmi Narasimhan, P. T. Parthasarathy, and M. Das (2009) analyzed,
evaluated and benchmarked a series of metrics proposed by various researchers,
using several large-scale openly available software systems [12]. A systematic
analysis of the values of the various metrics was carried out and several key
inferences were drawn from them. A number of useful conclusions were drawn
from the metric evaluations, including inferences on the complexity,
reusability, testability, modularity and stability of the underlying
components.
Misook Choi, Injoo J. Kim, Jiman Hong and Jungyeop Kim (2009) suggested
component-based metrics applying the strength of dependency between classes;
to increase the quality of components, they proposed metrics that apply the
strength of dependency between classes so as to measure it precisely [13]. In
addition, they proved the theoretical soundness of the proposed metrics using
the axioms of Briand et al. and showed the accuracy and practicality of the
proposed metrics through a comparison with conventional metrics in the
component development phase.
Majdi Abdellatief, Abu Bakar Md Sultan, Abdul Azim Abd Ghani and Marzanah A.
Jaba (2011) considered the dependency between components to be one of the most
important issues affecting the structural design of a Component-
Based Software System (CBSS) [14]. Two sets of metrics, namely Component
Information Flow Metrics and Component Coupling Metrics, are proposed based on
the concept of component information flow from the CBSS designer's point of
view.
Jianguo Chen, Hui Wang, Yongxia Zhou and Stefan D. Bruda (2011) presented such
efforts by investigating improved measurement tools and techniques, i.e.,
effective software metrics [15]. Coupling, cohesion and interface metrics are
newly proposed and evaluated.
The previous research describes the work done with a variety of
component-based software metrics. This paper deals with the static and dynamic
metrics of component-based software. The work is extended by developing an
E-healthcare application, and the results are obtained for the static metrics.
3. COMPONENT BASED SOFTWARE METRICS
The traditional software metrics focus on non-CBSS and are inappropriate
to CBSS mainly because the component size is normally not known in
advance. Inaccessibility of the source code for some components prevents
comprehensive testing. So, the component based metrics are defined to
evaluate the component based application.
There are two types of metrics considered in this paper for measuring the
values.
 Static Metric
Static metrics cover the complexity and the criticality within an integrated
component. They are collected from a static analysis of the component
assembly; the complexity and criticality metrics are intended to be used
early, during the design stage. The list of static metrics [16] is provided in
Table 1.
 Dynamic metric
Dynamic metrics are gathered during execution of the complete application
and are meant to be used at the implementation stage. The dynamic metrics
are listed in Table 2 [15].
Table 1. Static Metrics
Sl.no Metric Name Formulae
1 Component Packing Density Metric CPD = #constituents / #components
2 Component Interaction Density Metric CID = #I / #Imax
3 Component Incoming Interaction Density CIID = #I in / #Imax in
4 Component Outgoing Interaction Density COID = #I out / #Imax out
5 Component Average Interaction Density CAID = (sum of CID of each component) / #components
6 Bridge Criticality Metric CRIT bridge = #bridge_component
7 Inheritance Criticality Metric CRIT inheritance = #root_component
8 Link Criticality Metric CRIT link = #link_component
9 Size Criticality Metric CRIT size = #size_component
10 #Criticality Metric CRIT all = CRIT bridge + CRIT inheritance + CRIT link + CRIT size
Table 2. Dynamic Metrics
Sl.no Metric Name Formulae
1 Number of Cycles (NC) NC = #cycles
2 Average Number of Active Components
3 Active Component Density (ACD)
4 Average Active Component Density
5 Peak Number of Active Components ACΔt = max {AC1, ..., ACn}
4. IMPLEMENTATION
The E-Healthcare application is developed to measure the static metrics.
The application is designed with a number of components; the metrics are
applied to the application and the values are measured. There are five
modules in the E-Healthcare application.
4.1 Admin
The Admin module stores the user, doctor and admin details. The admin is
responsible for managing every record in the database.
4.2 Appointments and payments
This module is used to add or drop doctor details and helps users get
appointments. The admin is the person responsible for adding new doctor
details; existing doctors can also be deleted by the admin.
4.3 Diagnosis and health
The Diagnosis and Health module is used to retrieve users' diagnosis
details. The information of users who take treatment through the
application is stored in the database.
4.4 First aid and E-certificate
This module is used to get blood bank details for a required blood group.
First aid medicine details for a particular disease are provided to the users,
and the user can get the treatment type, which helps in emergencies.
4.5 Symptoms and alerts
The Symptoms and Alerts module is used to check the BP level of the user.
The patient information is retrieved from the database, and the symptoms
and causes of a disease help users to prevent it.
The pictorial representation of the modules in the application is shown in
Figure 1.
Figure 1. Modules in E-Healthcare Application (Admin, Appointments and Payments,
Diagnosis and Health, First Aid and E-Certificates, Symptoms and Alerts)
Components are created to develop the whole application. The components
(admin, appointments and payments, diagnosis and health, firstaid and
e-certificate, symptoms and alerts, DBHelper, EhealthBL) are required to
complete the component-based application called E-Healthcare. The static
metrics are applied to these components, and each component's value is
measured according to the metric formula. The analysis of the metrics is
carried out manually on the application; the metric values are calculated
with the help of the database tables, web page forms and components.
5. ANALYSIS
The analysis is made to show that the CBSS has good reusability,
maintainability and independence.
The analyses of the Component Packing Density metric, the Component
Interaction Density metrics (incoming, outgoing, average) and the
criticality metrics are as follows:
5.1 Component Packing Density Metric
CPD measures the number of operations that each component contains.
The CPD is defined as the ratio of #constituents (LOC, objects/classes,
operations, classes and/or modules) to #components, where
#constituent = one of the following: LOC, objects/classes, operations,
classes and/or modules
#component = number of components
For this metric, the number of operations of each component is listed in Table 3.
Table 3. Component packing Density
S.No Component Name No. of operations
1 Admin 3
2 Appointments and payments 4
3 Diagnosis and health 4
4 Firstaid and e-certificate 6
5 Symptoms and alerts 5
6 DBHelper 1
7 EhealthBL 19
CPD = (3 + 4 + 4 + 6 + 5 + 1 + 19) / 7 = 42 / 7 = 6
Hence, the CPD metric gives the average number of operations that each
component contains.
5.2 Component Interaction Density Metric
The CID is defined as the ratio of actual interactions to potential ones. A
higher interaction density causes a higher complexity in the interaction [17].
The CID metric is applied to the E-Healthcare application. The measured
value of the actual interactions in each component of E-Healthcare is
illustrated in Table 4.
#I = no. of actual interactions
#Imax = no. of maximum available interactions.
Table 4. Actual interactions
S.No Name of the page No. of actual interactions
1 Registration.aspx 4
2 Postquestion.aspx 2
3 Search.aspx 5 i/p, 5 o/p
4 Doctormanagement.aspx 6
5 Diagnosis.aspx 1
6 Searchmedicine.aspx 2 i/p, 3 o/p
7 Medicine.aspx 5
8 Bloodbank.aspx 4
9 Firstaidsuggestion.aspx 2
10 Medicalcertificate.aspx 3
11 Treatmenttype.aspx 1 i/p, 2 o/p
12 Symptoms.aspx 1 i/p, 3 o/p
Total 51
The actual interaction value between the components is 51, and the
maximum number of available interactions with other components is 87.
CID = 51 / 87 = 0.586
This metric brings out the number of incoming and outgoing interactions
available in each component and helps to identify which components have
the greatest connectivity with the other components.
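A minimal Python sketch of this ratio, using the totals measured for the E-Healthcare application (illustrative only, not the authors' tooling):

# Illustrative sketch: Component Interaction Density as the ratio of actual
# interactions to the maximum available interactions.
def interaction_density(actual: int, maximum: int) -> float:
    return actual / maximum

# Totals measured for the E-Healthcare application (Table 4).
cid = interaction_density(actual=51, maximum=87)
print(f"CID = {cid:.3f}")  # CID = 0.586

The same ratio form underlies the CIID and COID metrics in the next two subsections, computed from the incoming and outgoing interaction counts respectively.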
5.3 Component Incoming Interaction Density
CIID is defined as the ratio of the number of incoming interactions to the
maximum number of incoming interactions. A higher interaction density
causes a higher complexity in the interaction. The number of actual
incoming interactions in each component is shown in Table 5, where
#I in = no. of incoming interactions
#Imax in = maximum no. of available incoming interactions.
Table 5. Incoming Interactions
S.No Name of the page No. of incoming interactions
1 Registration.aspx 4
2 Postquestion.aspx 1
3 Search.aspx 5
4 Doctormanagement.aspx 4
5 Diagnosis.aspx 1
6 Searchmedicine.aspx 2
7 Medicine.aspx 5
8 Bloodbank.aspx 4
9 Firstaidsuggestion.aspx 1
10 Medicalcertificate.aspx 2
11 Treatmenttype.aspx 4
12 Symptoms.aspx 4
Total 37
The number of incoming interactions is 37, while the maximum number of
available incoming interactions is 51; out of the 51 possible interactions,
only 37 actually link to other components.
CIID = 37 / 51 = 0.725
The CIID value of 0.725 clearly states that the incoming interaction density
with the other components is very high.
5.4 Component Outgoing interaction Density
COID is defined as the ratio of the number of outgoing interactions to the
maximum number of outgoing interactions. A higher interaction density
causes a higher complexity in the interaction. The number of outgoing
interactions in each component is shown in Table 6, where
#I out = no. of outgoing interactions
#Imax out = maximum no. of available outgoing interactions.
Table 6. Outgoing Interactions
S.No Name of the page No. of outgoing interactions
1 Registration.aspx 2
2 Postquestion.aspx 1
3 Search.aspx 5
4 Doctormanagement.aspx 1
5 Diagnosis.aspx 3
6 Searchmedicine.aspx 3
7 Medicine.aspx 1
8 Bloodbank.aspx 3
9 Firstaidsuggestion.aspx 1
10 Medicalcertificate.aspx 1
11 Treatmenttype.aspx 4
12 Symptoms.aspx 3
Total 28
The number of outgoing interactions is 28, while the maximum number of
available outgoing interactions is 46; only 28 outgoing interactions are
actually connected to other components.
COID = 28 / 46 = 0.608
The calculated value of 0.608 shows that there are substantial outgoing
interactions between the components.
5.5 Component Average Interaction Density
CAID is the sum of the CID of each component divided by the number of
components:
CAID = (sum of the interaction density of the n components) / #components
where #components is the number of components in the system.
Admin: The actual interfaces (incoming and outgoing) of the admin
component are listed below. The sum of the interaction density values for
the admin component is shown in Table 7.
Table 7. Sum of CID for admin component
S.No Name of the page Sum of CID for admin component
1 Registration.aspx 4 out of 13 (only 4 of its 13 interfaces interact with other components)
2 Login.aspx 2 out of 2
3 Postquestion.aspx 1 out of 1
The summation of CID for the admin component is 7/16: seven actual
interactions out of sixteen interfaces. This component has good reliability.
Appointments and payments: The sum of the interaction density of the
appointments and payments component is shown in Table 8. The sum
considers both the incoming and outgoing interfaces of the component.
Table 8. Sum of CID for appointments and payments component
S.No Name of the page Sum of CID for appointments and payments component
1 Search.aspx : 2 out of 2
2 To get appointment : 4 out of 4
3 Doctormanagement.aspx : 4 out of 6
The summation of CID for the appointments and payments component is
10/12: 10 of its 12 interfaces link to other components.
Diagnosis and health: The sum of the interaction density of the diagnosis
and health component is shown in Table 9.
Table 9. Sum of CID for diagnosis and health component
S.No Name of the page Sum of CID for diagnosis and health component
1 Diagnosis.aspx : 1 out of 2
2 Searchmedicine.aspx : 2 out of 2
3 Medicine.aspx : 4 out of 5
The summation of CID for the diagnosis and health component is 7/9: 7 of
its 9 interfaces interact with other components.
Firstaid and e-certificate: Table 10 shows the sum of CID values for the
firstaid and e-certificates component.
Table 10. Sum of CID for firstaid and e-certificates
S.No Name of the page Sum of CID for firstaid and e-certificate
1 Bloodbank.aspx : 1 out of 1; : 3 out of 3
2 Firstaidsuggestion.aspx : 1 out of 1
3 Medicalcertificate.aspx : 2 out of 4
4 Treatmenttype.aspx : 1 out of 1; : 3 out of 7
The summation of CID for the firstaid and e-certificates component is
11/17: out of 17 interfaces, only 11 interactions are connected with the rest
of the components.
Symptoms and alerts: Table 11 shows the sum of CID values for the
symptoms and alerts component.
Table 11. Sum of CID for symptoms and alerts
S.No Name of the page Sum of CID for symptoms and alerts
1 Searchpatient.aspx : 1 out of 1; : 3 out of 3
The summation of CID for the symptoms and alerts component is 4/4: this
component is completely connected with the other components.
The Component Average Interaction Density metric takes the ratio between
the sum of the CID of each component and the number of existing
components:
CAID = (7/16 + 10/12 + 7/9 + 11/17 + 4/4) / 7 = 0.528
The measured value for this metric indicates good interaction density, and
hence good reliability, across the components.
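The following Python sketch (illustrative only) recomputes CAID from the per-component CID sums tabulated above, assuming, as in the text, a system of seven components:

from fractions import Fraction

# Illustrative sketch: CAID as the sum of each component's CID divided by
# the number of components in the system.
cid_per_component = [
    Fraction(7, 16),   # Admin
    Fraction(10, 12),  # Appointments and payments
    Fraction(7, 9),    # Diagnosis and health
    Fraction(11, 17),  # Firstaid and e-certificates
    Fraction(4, 4),    # Symptoms and alerts
]

total_components = 7  # DBHelper and EhealthBL contribute no listed interfaces
caid = float(sum(cid_per_component)) / total_components
print(f"CAID = {caid:.3f}")  # CAID = 0.528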
5.6 Bridge Criticality Metric
The bridge criticality metric is used to identify bridge components. A
component that acts as a bridge between other components is a bridge
component:
CRIT bridge = #bridge_component
Out of the 7 components, EhealthBL acts as a bridge between the other
components and between the code-behind and the database; it contains all
the queries to store and retrieve information.
So, the bridge_component value is 1. This value explicitly tells that one
component operates as a bridge component for all the other components.
5.7 Inheritance Criticality Metric
Inheritance is deriving a new component from an existing component; the
existing component is called the root component:
CRIT inheritance = #root_component
An interface is inherited from the existing component. The root components
are:
 Symptoms and alerts (patient info is inherited by the diagnosis component)
 EhealthBL (the query is inherited from the base query)
So, the root component value is 2, which shows that object-oriented
programming concepts are utilized between the components.
5.8 Link Criticality Metric
The link criticality metric is used to identify link components. A component
that provides a link to other components is called a link component:
CRIT link = #link_component
The link component value is 1 (DBHelper). This value shows that the
component acts as a link between the code-behind pages and the database.
5.9 Size Criticality Metric
The size criticality metric is used to identify components that exceed the
critical size level; such a component is called a size component:
CRIT size = #size_component
The size critical level is 60 lines per component. No component exceeds
this level, so the size component value is 0.
5.10 # Criticality Metric
The sum of the bridge criticality, inheritance criticality, link criticality and
size criticality is known as the criticality metric:
CRIT all = CRIT bridge + CRIT inheritance + CRIT link + CRIT size
= 1 + 2 + 1 + 0 = 4
The compound value of 4 shows that considerable criticality arises in the
system.
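A minimal sketch of this summation, using the counts measured above (illustrative Python, not the authors' tooling):

# Illustrative sketch: the compound criticality metric as the sum of the four
# criticality counts measured for the E-Healthcare components.
crit_bridge = 1       # EhealthBL bridges the other components and the database
crit_inheritance = 2  # Symptoms and alerts, EhealthBL act as root components
crit_link = 1         # DBHelper links the code-behind pages to the database
crit_size = 0         # no component exceeds the 60-line critical level

crit_all = crit_bridge + crit_inheritance + crit_link + crit_size
print(f"CRIT_all = {crit_all}")  # CRIT_all = 4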
Threshold Value
The threshold value is fixed at 0.5 and is used to compare against the
computed value of each metric. The comparison with this threshold value
checks whether the metric value is increasing or decreasing with respect to
reusability and maintainability. Table 12 shows the result of the comparison
with the threshold value.
Table 12. Comparison with threshold value.
Metric Name Comparison with Threshold Value
Component Packing Density Metric Increasing
Component Interaction Density Metric Increasing
Component Incoming Interaction Density Increasing
Component Outgoing Interaction Density Increasing
Component Average Interaction Density Increasing
Bridge Criticality Metrics Increasing
Inheritance Criticality Metrics Increasing
Link Criticality Metrics Increasing
Size Criticality Metrics Decreasing
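For illustration, a short Python sketch (not part of the original study) reproducing the comparison in Table 12, assuming the metric values computed in the preceding subsections:

# Illustrative sketch: comparing each computed metric value against the
# fixed threshold of 0.5, as summarized in Table 12.
THRESHOLD = 0.5

metric_values = {
    "Component Packing Density": 6.0,
    "Component Interaction Density": 0.586,
    "Component Incoming Interaction Density": 0.725,
    "Component Outgoing Interaction Density": 0.608,
    "Component Average Interaction Density": 0.528,
    "Bridge Criticality": 1,
    "Inheritance Criticality": 2,
    "Link Criticality": 1,
    "Size Criticality": 0,
}

for name, value in metric_values.items():
    trend = "Increasing" if value > THRESHOLD else "Decreasing"
    print(f"{name}: {trend}")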
6. CONCLUSIONS
Building software systems with reusable components brings many
advantages to organizations. Reusability has several direct and indirect
factors such as cost, effort, and time. This paper discussed various aspects
of reusability for component-based systems and gave an insight into various
reusability metrics for component-based systems. The quality of the
components is measured by applying the metrics to an E-Healthcare
application in the electronic commerce domain. The component-based
metrics help improve the quality of the design components and develop
component-based systems with good maintainability, reusability, and
independence.
Most of the metrics can be enhanced in the future, and such enhancements
help to add features later. The demand for new software applications is
currently increasing at an exponential rate, so future enhancements will
help to fulfil those requirements. The dynamic metric analysis can be
applied to the component-based software application and validated, and
based on the applications, enhanced metrics can be proposed for
component-based software systems.
REFERENCES
[1] Dr. Nedhal A. Al Saiyd, Dr. Intisar A. Al Said, Ahmed H. Al Takrori, Semantic-Based
Retrieving Model of Reuse Software Component, IJCSNS International Journal of
Computer Science and Network Security, VOL.10 No.7, July 2010.
[2] Joaquina Martín-Albo, Manuel F. Bertoa, Coral Calero, Antonio Vallecillo, Alejandra
Cechich and Mario Piattini, CQM: A Software Component Metric Classification
Model, IEEE Transactions.
[3] Anas Bassam AL-Badareen, Mohd Hasan Selamat, Marzanah A. Jabar, Jamilah Din,
Sherzod Turaev, Reusable Software Component Life Cycle, International Journal of
Computers, Issue 2, Volume 5, 2011.
[4] Chintakindi Srinivas, Dr.C.V.Guru rao, Software Reusable Components With
Repository System, International Journal of Computer Science & Informatics, Volume-
1, Issue-1,2011
[5] Parvinder S.Sandhu, Harpreet Kaur, and Amanpreet Singh, Modeling of Reusability of
Object oriented Software System, World Academy of Science, Engineering and
Technology 56 2009.
[6] Sarbjeet Singh, Manjit Thapa, Sukhvinder Singh and Gurpreet Singh, International
Journal of Computer Applications (0975 – 8887), Volume 8, No. 12, October 2010.
[7] Linda L. Westfall, Seven steps to designing a software metrics, Principles of software
measurement services.
[8] K.S. Jasmine and R.Vasantha, DRE – A Quality metric for Component Based Software
Products, World Academy of Science, Engineering and Technology 34 2007.
[9] Iqbaldeep Kaur, Parvinder S. Sandhu, Hardeep Singh, and Vandana Saini, Analytical
Study of Component Based Software Engineering, World Academy of Science,
Engineering and Technology 50 2009.
[10]Nael Salman, Complexity metrics as predicators of maintainability and integrability of
software components, Journal of arts and science, May 2006.
[11]Arun Sharma, Rajesh Kumar, and P. S. Grover, A critical survey of reusability aspects
for component-Based systems, World academy of science, Engineering and
Technology 33 2007.
[12]V. Lakshmi Narasimhan, P. T. Parthasarathy, and M. Das, Evaluation of a suite of
metrics for CBSE, Issues in informing science and information technology, Vol 6,
2009.
[13]Misook Choi, Injoo J. Kim, Jiman Hong, Jungyeop Kim, Component-Based Metrics
Applying the Strength of Dependency between Classes, ACM Journal, March 2009.
[14]Majdi Abdellatief, Abu Bakar Md Sultan, Abdul Azim Abd Ghani, Marzanah A.Jabar,
Component-based Software System Dependency Metrics based on Component
Information Flow Measurements, ICSEA 2011.
[15]Jianguo Chen, Hui Wang, Yongxia Zhou, Stefan D.Bruda, Complexity Metrics for
Component-based Software Systems, International Journal of Digital Content
Technology and its Applications. Vol.5, No.3, March 2011.
[16]V. Lakshmi Narasimhan, and Bayu Hendradjaya, Theoretical Considerations for Software
Component Metrics, World Academy of Science, Engineering and Technology 10
2005.
[17]E. S. Cho, M.S. Kim, S.D. Kim, Component Metrics to Measure Component Quality,
the 8th Asia-Pacific Software Engineering Conference (APSEC), Macau, 2001, pp.
419-426.
SDI System with Scalable Filtering of
XML Documents for Mobile Clients
Yi Yi Myint
Department of Information and Communication Technology
University of Technology (Yatanarpon Cyber City)
Pyin Oo Lwin, Mandalay Division, Myanmar
Hninn Aye Thant
Department of Information and Communication Technology
University of Technology (Yatanarpon Cyber City)
Pyin Oo Lwin, Mandalay Division, Myanmar
ABSTRACT
As the number of users grows and the amount of available information becomes ever larger,
information dissemination applications are gaining popularity for distributing data to end
users. A Selective Dissemination of Information (SDI) system delivers the right
information to the right users based upon their profiles. Typically, profiles are represented
using the Extensible Markup Language (XML), and XML query languages support
query-indexing techniques in SDI systems. As a consequence of these advances, mobile
information retrieval is crucial for sharing the vast information available from diverse data
sources. However, the inherent limitations of mobile devices require the information
delivered to mobile clients to be highly personalized and consistent with their profiles. In
this paper, we address the issue of scalable filtering of XML documents for mobile clients.
We describe an efficient indexing mechanism that enhances the XFilter algorithm with a
modified Finite State Machine (FSM) approach and can quickly locate and evaluate
relevant profiles. Finally, our experimental results show that the proposed indexing method
outperforms the original XFilter algorithm in terms of filtering time.
Keywords
XML, FSM, scalable filtering, SDI.
1. INTRODUCTION
Nowadays the SDI system is becoming an increasingly important research
area and industrial topic. There is a clear trend towards creating new
applications for small and light computing devices such as cell phones and
PDAs. Amongst these new applications, mobile information dissemination
applications (e.g. personalized electronic newspaper delivery, e-commerce
site monitoring, headline news, and alerting services for digital libraries)
deserve special attention.
Recently, there have been a number of efforts to build efficient large-scale
XML filtering systems. In an XML filtering system [4], constantly arriving
streams of XML documents are passed through a filtering engine that
matches documents to queries and routes the matched documents
accordingly. XML filtering techniques comprise a key component of
modern SDI applications.
XML [3] is becoming a standard for information exchange and a textual
representation of data designed to describe content, especially on the
internet. The basic mechanism used to describe user profiles in XML
format is the XPath query language, a language for addressing parts of an
XML document. However, this technique often has a restricted capability to
express user interests and is unable to properly capture the semantics of the
user requirements. Expressing deeply personalized profiles therefore
requires a querying power similar to what SQL provides on relational
databases. Moreover, as user profiles are complex in a mobile environment,
a language more powerful than XPath is needed. In this case, the choice is
XML-QL. XML-QL [7] has more expressive power than XPath and is
considered the most powerful among all XML query languages. XML-QL's
querying power and its elaborate CONSTRUCT statement allow the format
of the query results to be specified.
The rest of the paper is organized as follows: Section 2 briefly summarizes
the related works. Section 3 describes the proposed system architecture and
its components. The operation of the system that is how the query index is
created, the operation of the finite state machine and the generation of the
customized results are explained in Section 4. Section 5 gives the
performance evaluation of the system. Finally Section 6 concludes the
paper.
2. RELATED WORKS
We now introduce some existing XML filtering methods. XFilter [1] was
one of the early works. The XFilter system is designed and implemented for
pushing XML documents to users according to their profiles expressed in
XML Path Language (XPath). XFilter employs a separate FSM per path
query and a novel indexing mechanism to allow all of the FSMs to be
executed simultaneously during the processing of a document. A major
drawback of XFilter is its lack of expressiveness.
In addition, XFilter does not execute the XPath queries to generate partial
results; as a result, the whole document is pushed to the user when it
matches a user's profile. This prevents XFilter from being used in mobile
environments, because the limited capability of mobile devices is not
enough to handle the entire document. XFilter also does not exploit
commonalities between queries, i.e. it produces one FSM per query.
This observation motivated us to develop mechanisms that employ only a
single FSM for queries which share a common element structure.
YFilter [2] overcomes the disadvantage of XFilter by using
Nondeterministic Finite Automata (NFA) to emphasize prefix sharing. The
resulting shared processing provided tremendous improvements to the
performance of structure matching but complicated the handling of value-
based predicates. However, the ancestor/descendant relationship introduces
more matching states, which may result in the number of active states
increasing exponentially. Post processing is required for YFilter.
FoXtrot [5] is an efficient XML filtering system which integrates the
strengths of automata and distributed hash tables to create a fully distributed
system. FoXtrot also describes different methods for evaluating value-based
predicates. The performance evaluation demonstrates that it can index
millions of queries and attain an excellent filtering throughput. However,
FoXtrot requires extensions of its query language to reach full XPath or
sufficient expressiveness for user profiles.
The NiagaraCQ system [6] uses XML-QL to express user profiles. It
provides scalability through query grouping and caching techniques.
However, its query grouping is derived from execution plans, which differs
from our proposed method, and the execution times of its queries do not
make such planning a viable candidate for mobile environments.
Accordingly, our system addresses the above problems and reduces the
filtering time as much as possible.
3. PROPOSED SYSTEM ARCHITECTURE
We first present a high-level overview of our XML filtering system and
then describe the XML-QL language that we use to specify the user profiles
in this work. The overall architecture of the system is depicted in Figure 1.
Figure 1. Overall architecture of the system
User profiles describe the information preferences of individual users. These
profiles may be created by the users themselves, e.g., by choosing items in a
Graphical User Interface (GUI) via their mobile phones. The user profiles
are automatically converted into an XML-QL format that can be efficiently
stored in the profile database and evaluated by the filtering system. These
profiles are effectively "standing queries" that are applied to all incoming
documents. The filtered engine first creates query indices for the user
profiles and then parses the incoming XML documents to obtain the query
results. The results are stored in a special content list, so that the whole
document need not be sent; extracting parts of an XML document saves
bandwidth in a mobile environment. After that, the filtered engine sends the
filtered XML documents to the related mobile clients.
3.1 Defining User Profiles with XML-QL
XML-QL has an SQL-like WHERE ... CONSTRUCT construct that can
express queries to extract pieces of data from XML documents. It can also
specify transformations that, for example, map XML data between
Document Type Definitions (DTDs) and integrate XML data from different
sources. Profiles defined through a GUI are transformed into XML
documents which contain XML-QL queries, as shown in Figure 2.
<Profile>
<XML-QL>
WHERE<course>
<major>
<name>ICT</name>
<program>First Year</program>
<syllabus>$n</syllabus>
</major></course> IN “course.xml”
CONSTRUCT<result><syllabus>$n</syllabus></result>
</XML-QL>
<PushTo> <address>…</address> </PushTo>
</Profile>
Figure 2. Profile syntax represented in XML containing XML-QL query
3.2 Filtered Engine
The basic components of the filtered engine are: 1) an event-based XML
parser, implemented using the SAX API, for the XML documents; 2) a
profile parser that contains an XML-QL parser for user profiles and creates
the Query Index; 3) a Query Execution Engine that uses the Query Index,
which is associated with Finite State Machines, to query the XML
documents; and 4) a Delivery Component that pushes the results to the
related mobile clients (see Figure 3).
Figure 3. Filtered engine (components: the Profile Parser with its XML-QL Parser, the
Query Index, the Query Execution Engine, the XML Parser, and the Delivery component)
4. OPERATION OF THE SYSTEM
The system operates as follows: the subscriber informs the filtered engine
when a new profile is created or updated; the profiles are stored in an XML
file that contains the XML-QL queries and the addresses to which results
are transmitted (see Figure 2). Profiles are parsed by the profile parser
component, and the XML-QL queries in each profile are parsed by an
XML-QL parser. While parsing the queries, the XML-QL parser generates
an FSM representation for each query if the query does not match any
existing query group; otherwise, the FSM of the corresponding query group
is used for the input query. The FSM representation contains a state node
for each element name in the queries, and these nodes are stored in the
Query Index.
When a new document arrives, the system alerts the filtered engine to parse
the related XML document. The event-based XML parser sends the events
it encounters to the query execution engine. The handlers in the query
execution engine move the FSMs to their next states after the current states
pass level checking or character data matching. Meanwhile, the data in the
document that match the variables are kept in content lists, so that all the
partial data necessary for producing the results are formatted and pushed to
the related mobile clients when the FSM reaches its final state.
4.1 Creating Query Index
Consider an example XML document and its DTD given in Figure 4.
<!-- DTD for Course -->
<!ELEMENT root (course*)>
<!ELEMENT course (degree, major*)>
<!ELEMENT degree (#PCDATA)>
<!ELEMENT major (name, program, semester, syllabus*)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT program (#PCDATA)>
<!ELEMENT semester (#PCDATA)>
<!ELEMENT syllabus (sub-code, sub-title, instructor)>
<!ELEMENT sub-code (#PCDATA)>
<!ELEMENT sub-title (#PCDATA)>
<!ELEMENT instructor (#PCDATA)>
<root> <course>
<degree>Bachelor</degree>
<major><name>ICT</name>
<program>First Year</program>
<semester>First Semester</semester>
<syllabus>
<sub-code>EM-101</sub-code>
<sub-title>English</sub-title>
<instructor>Dr. Thiri</instructor>
</syllabus>
</major>
</course>…</root>
Figure 4. An example XML document and its DTD (course.xml)
The example queries and their FSM representations are shown in Figure 5.
Note that there is a node in the FSM representation corresponding to each
element in the query, and the FSM representation’s tree structure follows
from XML-QL query structure.
Query 1: Retrieve all syllabuses of first year program for ICT major.
WHERE <major> <name>ICT</><program>First Year</><syllabus>$n</>
</> IN “course.xml”
CONSTRUCT<result><syllabus>$n</></>
FSM for Query 1 (query nodes Q1.1, Q1.2, Q1.3, Q1.4)
Query 2: Find the instructor name of the subject code EM-101.
WHERE <syllabus> <sub-code>EM-101</><instructor>$s</>
</> IN “course.xml”
CONSTRUCT<result><syllabus>$s</></>
FSM for Query 2 (query nodes Q2.1, Q2.2, Q2.3)
Query 3: Retrieve all the instructors for first year program in ICT major.
WHERE<major> <name>ICT</><program>First Year</><syllabus> <instructor>$s</></>
</> IN “course.xml”
CONSTRUCT<result><syllabus>$s</></>
FSM for Query 3 (query nodes Q3.1, Q3.2, Q3.3, Q3.4, Q3.5)
Figure 5. Example queries and their FSM representations
We also substitute constants in a query with parameters to create
syntactically equivalent queries, which leads to the use of the same FSM for
them. The state changes of an FSM are handled through the two lists
associated with each node in the Query Index (see Figure 6). The current
nodes of each query are placed on the Candidate List (CL) of their related
element name. In addition, all of the nodes representing the future states are
stored in the Wait Lists (WL) of their related element name. A state
transition in the FSM is represented by copying a query node from the WL
to the CL. Notice that the node copied to the CL also remains in the WL, so
that it can be reused by the FSM in future executions of the query, as the
same element name may reappear at another level in the XML document.
When the query index is initialized, the first node of each query tree is
placed on the CL of the index entry of its relevant element name. The
remaining elements in the query tree are placed in the relevant WLs. Query
nodes in the CL indicate that the state of the query might change when the
XML parser processes the relevant elements of these nodes. When the
XML parser encounters a start element tag, the immediate child elements of
this node in the Query Index are copied from the WL to the CL if a node in
the CL of the element satisfies level checking or character data matching.
The purpose of the level checking is to make sure that this element name
may possibly reappear in the document.
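To make this bookkeeping concrete, the following Python sketch (an illustration of the idea described above, not the system's actual implementation) models a query index in which each element name carries a CL and a WL, query nodes are represented by hypothetical (query_id, position) pairs, and a transition copies a node from the WL to the CL while leaving it in the WL for reuse.

from collections import defaultdict

# Illustrative sketch of the Query Index: each element name maps to a
# Candidate List (CL) of current states and a Wait List (WL) of future states.
class QueryIndex:
    def __init__(self):
        self.cl = defaultdict(list)
        self.wl = defaultdict(list)

    def add_query(self, query_id, elements):
        """elements is the element-name path of the query, e.g.
        ["major", "name", "program", "syllabus"] for Query 1."""
        self.cl[elements[0]].append((query_id, 0))   # first node is a candidate
        for pos, name in enumerate(elements[1:], start=1):
            self.wl[name].append((query_id, pos))    # remaining nodes wait

    def transition(self, element, node):
        """A state transition copies a node from the WL to the CL; the node
        remains in the WL so it can be reused if the element reappears."""
        if node in self.wl[element]:
            self.cl[element].append(node)

index = QueryIndex()
index.add_query("Q1", ["major", "name", "program", "syllabus"])
index.add_query("Q2", ["syllabus", "sub-code", "instructor"])
print(index.cl["major"], index.wl["syllabus"])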
Figure 6. Initial states of the query index for the example queries (each element name,
i.e. instructor, major, name, program, syllabus and sub-code, has a CL and a WL holding
the corresponding query nodes Q1.1-Q1.4, Q2.1-Q2.3 and Q3.1-Q3.5)
4.2 Operation of the Finite State Machine
When a new XML document activates the SAX parser, it starts generating
events. The following event handlers process these events:
Table 1. Sample SAX API
An XML Document SAX API Events
<?xml version=”1.0”>
<course>
<major>
<name>
ICT
</name>
</major>
</course>
start document
start element: course
start element: major
start element: name
characters: ICT
end element: name
end element: major
end element: course
end document
The Start Element Handler checks whether the query element matches the
element in the document. For this purpose it performs a level check and an
attribute check. If these are satisfied, it either enables data comparison or
starts variable content generation. As the next step, the nodes in the WL
that are the immediate successors of this node are moved to the CL.
The End Element Handler evaluates the state of a node by considering the
states of its successor nodes. Moreover, it generates the output when the
root node is reached. It also deletes from the CL the nodes that were
inserted by the start element handler of this node; this provides
"backtracking" in the FSM.
The Element Data Handler is implemented for data comparison in the
query. If the expression is true, the state of the node is set to true, and this
value is used by the End Element Handler of the current element node.
The End Document Handler signals the end of result generation and passes
the results to the Delivery Component.
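As an illustration of this handler structure, the sketch below uses Python's xml.sax API; the original system's handlers additionally perform level and attribute checks and drive the query FSMs, so the class name and reduced logic here are illustrative assumptions only.

import xml.sax

# Minimal sketch of the event handlers described above, using Python's SAX API.
# It only records the element path and character data; the real filtered engine
# would perform level checking, data comparison and FSM transitions here.
class ProfileFilterHandler(xml.sax.ContentHandler):
    def __init__(self):
        super().__init__()
        self.path = []          # current element path, used for level checking
        self.matches = []       # character data captured for result generation

    def startElement(self, name, attrs):
        self.path.append(name)  # Start Element Handler: level/attribute checks go here

    def characters(self, content):
        if content.strip():     # Element Data Handler: data comparison goes here
            self.matches.append(("/".join(self.path), content.strip()))

    def endElement(self, name):
        self.path.pop()         # End Element Handler: backtracking in the FSM

    def endDocument(self):
        # End Document Handler: pass collected results to the Delivery Component
        for path, text in self.matches:
            print(path, "->", text)

xml.sax.parseString(b"<course><major><name>ICT</name></major></course>",
                    ProfileFilterHandler())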
4.3 Generating Customized Results
Results are generated when the end element of the root node of the query is
encountered. The content lists of the variable nodes are then traversed to
obtain content groups, which are further processed to produce the results.
This process is repeated until the end of the document is reached. The
results are formatted as defined in the CONSTRUCT clause, and finally the
query results are sent to the related mobile clients.
5. PERFORMANCE EVALUATION
In this section, we conducted three sets of experiments to demonstrate the
performance of the architecture for different document sizes and query
workloads. The graph in Figure 7 contains the results for different query
groups, that is, queries that have the same FSM representation but different
constants, for the document course.xml (1 MB). When the number of
queries on the same XML document is very large, the probability of having
queries with the same FSM representation increases considerably.
Figure 7. Comparing the performance by varying the number of queries
The above experiment indicates that our proposed architecture is highly
scalable, that a very important factor in the performance is the number of
query groups, and that generating a single FSM per query group rather than
per query is well justified.
Figure 8. Comparing the performance by varying depth
The depth of XML documents and queries in the user profiles varies
according to application characteristics. Figure 8 shows the execution time
for evaluating the performance of the system as the maximum depth is
varied. Here, we fixed the number of profiles at 25000 and varied the
maximum depth of the XML document and queries from 1 to 10.
Figure 9. Execution time of queries for different numbers of query groups and
document sizes
Figure 9 shows the execution times of the queries as the number of query
groups and the size of the documents are varied. The results indicate that
performance is more sensitive to document size when the number of query
groups increases; this result also confirms the importance of query
grouping.
As a final conclusion, we can say that the FSM approach proposed in this
paper for executing XML-QL queries on XML documents is a very
promising approach for mobile environments.
6. CONCLUSIONS
Mobile communication is booming and access to the Internet from mobile
devices has become possible. Given this new technology, researchers and
developers are in the process of figuring out what users really want to do
anytime, from anywhere, and determining how to make this possible. In
addition, a high degree of personalization is a very important requirement
for developing SDI services in a mobile environment, as the limited
capability of mobile devices is not enough to handle entire documents. This
paper attempts to develop an efficient and scalable SDI system that serves
mobile clients based upon their profiles. We anticipate that one of the
common uses of mobile devices will be to deliver personalized information
from XML sources. We believe that querying power is necessary for
expressing highly personalized user profiles, and for the system to be used
by millions of mobile users it has to be scalable. Since the critical issue is
the number of profiles compared to the number of documents, indexing
queries rather than documents makes sense. We expect that the performance
of the system will still be acceptable in mobile environments for millions of
queries, since the results of the experiments show that the system is highly
scalable.
7. ACKNOWLEDGMENTS
The authors wish to acknowledge Dr. Soe Khaing for her useful comments
on earlier drafts of the paper. Our heart-felt thanks to our family, friends and
colleagues who have helped us for the completion of this work.
REFERENCES
[1] M. Altinel and M. Franklin, “Efficient filtering of XML documents for selective
dissemination of information,” Proc of the Int’l Conf on VLDB, pp. 53-64, Sept 2000.
[2] Y. Diao, M. Altinel, M. Franklin, H. Zhang and P.M. Fischer, “Path sharing and
predicate evaluation for high-performance XML filtering,” ACM Trans. Database
Syst., 28(4), Dec 2003, pp. 467–516.
[3] Extensible Markup Language, http://www.w3.org/XML/.
[4] I. Miliaraki, Distributed Filtering and Dissemination of XML Data in Peer-to-Peer
Systems, PhD Thesis, Department of Informatics and Telecommunications, National
and Kapodistrian University of Athens, July 2011.
[5] I. Miliaraki and M. Koubarakis, “FoXtrot: distributed structural and value XML
filtering”, ACM Transactions on the Web, Vol. 6, No. 3, Article 12, Publication date:
September 2012.
[6] J. Chen, D. DeWitt, F. Tian and Y. Wang, “NiagaraCQ: a scalable continuous query
system for internet databases”, ACM SIGMOD, Texas, USA, June 2000, pp.379-390.
[7] XML-QL: A Query Language for XML, http://www.w3.org/TR/1998/NOTE-xml-ql-
19980819.
Cluster Head Selection for in Wireless Sensor NetworksCluster Head Selection for in Wireless Sensor Networks
Cluster Head Selection for in Wireless Sensor Networks
 

More from ijcsbi

Vol 17 No 2 - July-December 2017
Vol 17 No 2 - July-December 2017Vol 17 No 2 - July-December 2017
Vol 17 No 2 - July-December 2017ijcsbi
 
Vol 17 No 1 - January June 2017
Vol 17 No 1 - January June 2017Vol 17 No 1 - January June 2017
Vol 17 No 1 - January June 2017ijcsbi
 
Vol 16 No 2 - July-December 2016
Vol 16 No 2 - July-December 2016Vol 16 No 2 - July-December 2016
Vol 16 No 2 - July-December 2016ijcsbi
 
Vol 16 No 1 - January-June 2016
Vol 16 No 1 - January-June 2016Vol 16 No 1 - January-June 2016
Vol 16 No 1 - January-June 2016ijcsbi
 
Vol 15 No 6 - November 2015
Vol 15 No 6 - November 2015Vol 15 No 6 - November 2015
Vol 15 No 6 - November 2015ijcsbi
 
Vol 15 No 5 - September 2015
Vol 15 No 5 - September 2015Vol 15 No 5 - September 2015
Vol 15 No 5 - September 2015ijcsbi
 
Vol 15 No 4 - July 2015
Vol 15 No 4 - July 2015Vol 15 No 4 - July 2015
Vol 15 No 4 - July 2015ijcsbi
 
Vol 15 No 3 - May 2015
Vol 15 No 3 - May 2015Vol 15 No 3 - May 2015
Vol 15 No 3 - May 2015ijcsbi
 
Vol 15 No 2 - March 2015
Vol 15 No 2 - March 2015Vol 15 No 2 - March 2015
Vol 15 No 2 - March 2015ijcsbi
 
Vol 15 No 1 - January 2015
Vol 15 No 1 - January 2015Vol 15 No 1 - January 2015
Vol 15 No 1 - January 2015ijcsbi
 
Vol 14 No 3 - November 2014
Vol 14 No 3 - November 2014Vol 14 No 3 - November 2014
Vol 14 No 3 - November 2014ijcsbi
 
Vol 14 No 2 - September 2014
Vol 14 No 2 - September 2014Vol 14 No 2 - September 2014
Vol 14 No 2 - September 2014ijcsbi
 
Vol 14 No 1 - July 2014
Vol 14 No 1 - July 2014Vol 14 No 1 - July 2014
Vol 14 No 1 - July 2014ijcsbi
 
Vol 13 No 1 - May 2014
Vol 13 No 1 - May 2014Vol 13 No 1 - May 2014
Vol 13 No 1 - May 2014ijcsbi
 
Vol 12 No 1 - April 2014
Vol 12 No 1 - April 2014Vol 12 No 1 - April 2014
Vol 12 No 1 - April 2014ijcsbi
 
Vol 11 No 1 - March 2014
Vol 11 No 1 - March 2014Vol 11 No 1 - March 2014
Vol 11 No 1 - March 2014ijcsbi
 
Vol 10 No 1 - February 2014
Vol 10 No 1 - February 2014Vol 10 No 1 - February 2014
Vol 10 No 1 - February 2014ijcsbi
 
Vol 9 No 1 - January 2014
Vol 9 No 1 - January 2014Vol 9 No 1 - January 2014
Vol 9 No 1 - January 2014ijcsbi
 
Vol 7 No 1 - November 2013
Vol 7 No 1 - November 2013Vol 7 No 1 - November 2013
Vol 7 No 1 - November 2013ijcsbi
 
Vol 6 No 1 - October 2013
Vol 6 No 1 - October 2013Vol 6 No 1 - October 2013
Vol 6 No 1 - October 2013ijcsbi
 

More from ijcsbi (20)

Vol 17 No 2 - July-December 2017
Vol 17 No 2 - July-December 2017Vol 17 No 2 - July-December 2017
Vol 17 No 2 - July-December 2017
 
Vol 17 No 1 - January June 2017
Vol 17 No 1 - January June 2017Vol 17 No 1 - January June 2017
Vol 17 No 1 - January June 2017
 
Vol 16 No 2 - July-December 2016
Vol 16 No 2 - July-December 2016Vol 16 No 2 - July-December 2016
Vol 16 No 2 - July-December 2016
 
Vol 16 No 1 - January-June 2016
Vol 16 No 1 - January-June 2016Vol 16 No 1 - January-June 2016
Vol 16 No 1 - January-June 2016
 
Vol 15 No 6 - November 2015
Vol 15 No 6 - November 2015Vol 15 No 6 - November 2015
Vol 15 No 6 - November 2015
 
Vol 15 No 5 - September 2015
Vol 15 No 5 - September 2015Vol 15 No 5 - September 2015
Vol 15 No 5 - September 2015
 
Vol 15 No 4 - July 2015
Vol 15 No 4 - July 2015Vol 15 No 4 - July 2015
Vol 15 No 4 - July 2015
 
Vol 15 No 3 - May 2015
Vol 15 No 3 - May 2015Vol 15 No 3 - May 2015
Vol 15 No 3 - May 2015
 
Vol 15 No 2 - March 2015
Vol 15 No 2 - March 2015Vol 15 No 2 - March 2015
Vol 15 No 2 - March 2015
 
Vol 15 No 1 - January 2015
Vol 15 No 1 - January 2015Vol 15 No 1 - January 2015
Vol 15 No 1 - January 2015
 
Vol 14 No 3 - November 2014
Vol 14 No 3 - November 2014Vol 14 No 3 - November 2014
Vol 14 No 3 - November 2014
 
Vol 14 No 2 - September 2014
Vol 14 No 2 - September 2014Vol 14 No 2 - September 2014
Vol 14 No 2 - September 2014
 
Vol 14 No 1 - July 2014
Vol 14 No 1 - July 2014Vol 14 No 1 - July 2014
Vol 14 No 1 - July 2014
 
Vol 13 No 1 - May 2014
Vol 13 No 1 - May 2014Vol 13 No 1 - May 2014
Vol 13 No 1 - May 2014
 
Vol 12 No 1 - April 2014
Vol 12 No 1 - April 2014Vol 12 No 1 - April 2014
Vol 12 No 1 - April 2014
 
Vol 11 No 1 - March 2014
Vol 11 No 1 - March 2014Vol 11 No 1 - March 2014
Vol 11 No 1 - March 2014
 
Vol 10 No 1 - February 2014
Vol 10 No 1 - February 2014Vol 10 No 1 - February 2014
Vol 10 No 1 - February 2014
 
Vol 9 No 1 - January 2014
Vol 9 No 1 - January 2014Vol 9 No 1 - January 2014
Vol 9 No 1 - January 2014
 
Vol 7 No 1 - November 2013
Vol 7 No 1 - November 2013Vol 7 No 1 - November 2013
Vol 7 No 1 - November 2013
 
Vol 6 No 1 - October 2013
Vol 6 No 1 - October 2013Vol 6 No 1 - October 2013
Vol 6 No 1 - October 2013
 

Recently uploaded

FSB Advising Checklist - Orientation 2024
FSB Advising Checklist - Orientation 2024FSB Advising Checklist - Orientation 2024
FSB Advising Checklist - Orientation 2024Elizabeth Walsh
 
OSCM Unit 2_Operations Processes & Systems
OSCM Unit 2_Operations Processes & SystemsOSCM Unit 2_Operations Processes & Systems
OSCM Unit 2_Operations Processes & SystemsSandeep D Chaudhary
 
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdf
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdfUnit 3 Emotional Intelligence and Spiritual Intelligence.pdf
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdfDr Vijay Vishwakarma
 
On_Translating_a_Tamil_Poem_by_A_K_Ramanujan.pptx
On_Translating_a_Tamil_Poem_by_A_K_Ramanujan.pptxOn_Translating_a_Tamil_Poem_by_A_K_Ramanujan.pptx
On_Translating_a_Tamil_Poem_by_A_K_Ramanujan.pptxPooja Bhuva
 
How to Create and Manage Wizard in Odoo 17
How to Create and Manage Wizard in Odoo 17How to Create and Manage Wizard in Odoo 17
How to Create and Manage Wizard in Odoo 17Celine George
 
HMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptx
HMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptxHMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptx
HMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptxmarlenawright1
 
dusjagr & nano talk on open tools for agriculture research and learning
dusjagr & nano talk on open tools for agriculture research and learningdusjagr & nano talk on open tools for agriculture research and learning
dusjagr & nano talk on open tools for agriculture research and learningMarc Dusseiller Dusjagr
 
Transparency, Recognition and the role of eSealing - Ildiko Mazar and Koen No...
Transparency, Recognition and the role of eSealing - Ildiko Mazar and Koen No...Transparency, Recognition and the role of eSealing - Ildiko Mazar and Koen No...
Transparency, Recognition and the role of eSealing - Ildiko Mazar and Koen No...EADTU
 
How to setup Pycharm environment for Odoo 17.pptx
How to setup Pycharm environment for Odoo 17.pptxHow to setup Pycharm environment for Odoo 17.pptx
How to setup Pycharm environment for Odoo 17.pptxCeline George
 
How to Add a Tool Tip to a Field in Odoo 17
How to Add a Tool Tip to a Field in Odoo 17How to Add a Tool Tip to a Field in Odoo 17
How to Add a Tool Tip to a Field in Odoo 17Celine George
 
Exploring_the_Narrative_Style_of_Amitav_Ghoshs_Gun_Island.pptx
Exploring_the_Narrative_Style_of_Amitav_Ghoshs_Gun_Island.pptxExploring_the_Narrative_Style_of_Amitav_Ghoshs_Gun_Island.pptx
Exploring_the_Narrative_Style_of_Amitav_Ghoshs_Gun_Island.pptxPooja Bhuva
 
Economic Importance Of Fungi In Food Additives
Economic Importance Of Fungi In Food AdditivesEconomic Importance Of Fungi In Food Additives
Economic Importance Of Fungi In Food AdditivesSHIVANANDaRV
 
Accessible Digital Futures project (20/03/2024)
Accessible Digital Futures project (20/03/2024)Accessible Digital Futures project (20/03/2024)
Accessible Digital Futures project (20/03/2024)Jisc
 
Wellbeing inclusion and digital dystopias.pptx
Wellbeing inclusion and digital dystopias.pptxWellbeing inclusion and digital dystopias.pptx
Wellbeing inclusion and digital dystopias.pptxJisc
 
NO1 Top Black Magic Specialist In Lahore Black magic In Pakistan Kala Ilam Ex...
NO1 Top Black Magic Specialist In Lahore Black magic In Pakistan Kala Ilam Ex...NO1 Top Black Magic Specialist In Lahore Black magic In Pakistan Kala Ilam Ex...
NO1 Top Black Magic Specialist In Lahore Black magic In Pakistan Kala Ilam Ex...Amil baba
 
HMCS Max Bernays Pre-Deployment Brief (May 2024).pptx
HMCS Max Bernays Pre-Deployment Brief (May 2024).pptxHMCS Max Bernays Pre-Deployment Brief (May 2024).pptx
HMCS Max Bernays Pre-Deployment Brief (May 2024).pptxEsquimalt MFRC
 
AIM of Education-Teachers Training-2024.ppt
AIM of Education-Teachers Training-2024.pptAIM of Education-Teachers Training-2024.ppt
AIM of Education-Teachers Training-2024.pptNishitharanjan Rout
 
Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...
Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...
Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...Pooja Bhuva
 

Recently uploaded (20)

FSB Advising Checklist - Orientation 2024
FSB Advising Checklist - Orientation 2024FSB Advising Checklist - Orientation 2024
FSB Advising Checklist - Orientation 2024
 
Our Environment Class 10 Science Notes pdf
Our Environment Class 10 Science Notes pdfOur Environment Class 10 Science Notes pdf
Our Environment Class 10 Science Notes pdf
 
OSCM Unit 2_Operations Processes & Systems
OSCM Unit 2_Operations Processes & SystemsOSCM Unit 2_Operations Processes & Systems
OSCM Unit 2_Operations Processes & Systems
 
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdf
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdfUnit 3 Emotional Intelligence and Spiritual Intelligence.pdf
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdf
 
On_Translating_a_Tamil_Poem_by_A_K_Ramanujan.pptx
On_Translating_a_Tamil_Poem_by_A_K_Ramanujan.pptxOn_Translating_a_Tamil_Poem_by_A_K_Ramanujan.pptx
On_Translating_a_Tamil_Poem_by_A_K_Ramanujan.pptx
 
How to Create and Manage Wizard in Odoo 17
How to Create and Manage Wizard in Odoo 17How to Create and Manage Wizard in Odoo 17
How to Create and Manage Wizard in Odoo 17
 
HMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptx
HMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptxHMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptx
HMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptx
 
dusjagr & nano talk on open tools for agriculture research and learning
dusjagr & nano talk on open tools for agriculture research and learningdusjagr & nano talk on open tools for agriculture research and learning
dusjagr & nano talk on open tools for agriculture research and learning
 
Transparency, Recognition and the role of eSealing - Ildiko Mazar and Koen No...
Transparency, Recognition and the role of eSealing - Ildiko Mazar and Koen No...Transparency, Recognition and the role of eSealing - Ildiko Mazar and Koen No...
Transparency, Recognition and the role of eSealing - Ildiko Mazar and Koen No...
 
How to setup Pycharm environment for Odoo 17.pptx
How to setup Pycharm environment for Odoo 17.pptxHow to setup Pycharm environment for Odoo 17.pptx
How to setup Pycharm environment for Odoo 17.pptx
 
How to Add a Tool Tip to a Field in Odoo 17
How to Add a Tool Tip to a Field in Odoo 17How to Add a Tool Tip to a Field in Odoo 17
How to Add a Tool Tip to a Field in Odoo 17
 
Exploring_the_Narrative_Style_of_Amitav_Ghoshs_Gun_Island.pptx
Exploring_the_Narrative_Style_of_Amitav_Ghoshs_Gun_Island.pptxExploring_the_Narrative_Style_of_Amitav_Ghoshs_Gun_Island.pptx
Exploring_the_Narrative_Style_of_Amitav_Ghoshs_Gun_Island.pptx
 
Economic Importance Of Fungi In Food Additives
Economic Importance Of Fungi In Food AdditivesEconomic Importance Of Fungi In Food Additives
Economic Importance Of Fungi In Food Additives
 
Accessible Digital Futures project (20/03/2024)
Accessible Digital Futures project (20/03/2024)Accessible Digital Futures project (20/03/2024)
Accessible Digital Futures project (20/03/2024)
 
Wellbeing inclusion and digital dystopias.pptx
Wellbeing inclusion and digital dystopias.pptxWellbeing inclusion and digital dystopias.pptx
Wellbeing inclusion and digital dystopias.pptx
 
NO1 Top Black Magic Specialist In Lahore Black magic In Pakistan Kala Ilam Ex...
NO1 Top Black Magic Specialist In Lahore Black magic In Pakistan Kala Ilam Ex...NO1 Top Black Magic Specialist In Lahore Black magic In Pakistan Kala Ilam Ex...
NO1 Top Black Magic Specialist In Lahore Black magic In Pakistan Kala Ilam Ex...
 
HMCS Max Bernays Pre-Deployment Brief (May 2024).pptx
HMCS Max Bernays Pre-Deployment Brief (May 2024).pptxHMCS Max Bernays Pre-Deployment Brief (May 2024).pptx
HMCS Max Bernays Pre-Deployment Brief (May 2024).pptx
 
AIM of Education-Teachers Training-2024.ppt
AIM of Education-Teachers Training-2024.pptAIM of Education-Teachers Training-2024.ppt
AIM of Education-Teachers Training-2024.ppt
 
Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...
Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...
Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...
 
OS-operating systems- ch05 (CPU Scheduling) ...
OS-operating systems- ch05 (CPU Scheduling) ...OS-operating systems- ch05 (CPU Scheduling) ...
OS-operating systems- ch05 (CPU Scheduling) ...
 

Vol 8 No 1 - December 2013

  • 4. International Journal of Computer Science and Business Informatics IJCSBI.ORG ISSN: 1694-2108 | Vol. 8, No. 1. DECEMBER 2013 2 1. INTRODUCTION Wireless sensor network (WSN) is a collection of huge number of small, low-power and low-cost electronic devices called sensor nodes. Each sensor node consists of four major blocks: sensing, processing, power and communication unit and they are responsible for sensing, processing and wireless communications (figure 1). These nodes bring together the relevant data from the environment and then transfer the gathered data to base station (BS). Since WSNs has many advantages like self organization, infrastructure-free, fault-tolerance and locality, they have a wide variety of potential applications like border security and surveillance, environmental monitoring and forecasting, wildlife animal protection and home automation, disaster management and control. Considering that sensor nodes are usually deployed in remote locations, it is impossible to recharge their batteries. Therefore, ways to utilize the limited energy resource wisely to extend the lifetime of sensor networks is a very demanding research issue for these sensor networks. Figure 1: Various components of a wireless sensor node Clustering [2-7] is an effectual topology control approach, which can prolong the lifetime and increase scalability for these sensor networks. The popular criterion for clustering technique (figure 2) is to select a cluster head (CH) with more residual energy and to spin them periodically. The basic idea of clustering algorithms is to use the data aggregation [8-11] mechanism in the cluster head to lessen the amount of data transmission. Clustering goes behind some advantages like network scalability, localizing
  • 5. International Journal of Computer Science and Business Informatics IJCSBI.ORG ISSN: 1694-2108 | Vol. 8, No. 1. DECEMBER 2013 3 route setup, uses communication bandwidth [17] efficiently and takes advantage of network lifetime [12-16]. By the data aggregation process, unnecessary communication between sensor nodes, cluster head and the base station is evaded. In this paper, a well-defined model of distributed layer-based clustering algorithm is proposed based of three concepts: the aggregated data is forwarded from the cluster head to the base station through cluster head of the next higher layer with shortest distance between the cluster heads, cluster head is elected based on the clustering factor and the crisis hindrance node does the function of cluster head when the cluster head fails to carry out its work. The prime aim of the proposed algorithm is to attain energy efficiency and increased network lifetime. Figure 2: Cluster formation in a wireless sensor network The rest of this paper is structured as follows. A literature review of existing distributed clustering algorithms, talking about their projected advantages and shortcomings is profoundly conversed in Section 2. An evaluation of the existing clustering algorithm LEACH (Low Energy Adaptive Clustering Hierarchy) and the basic concept behind this algorithm is briefed in Section 3. Section 4 sketches a precise model of the proposed distributed layer- based clustering algorithm, enumerating the precious hiding concepts behind it. Finally, the last section gives the conclusion creatively.
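As a small illustration of the data aggregation step described above (not the exact scheme of the proposed algorithm), the following R sketch shows a cluster head collapsing its members' readings into a single value before one uplink transmission; the readings and the use of a plain average are assumptions for illustration only.
  # Illustrative sketch of in-cluster data aggregation: the cluster head forwards
  # one aggregated value to the base station instead of one packet per member.
  member_readings <- c(24.1, 23.8, 24.5, 24.0)   # hypothetical sensed values
  aggregate_at_ch <- function(readings) mean(readings)
  aggregate_at_ch(member_readings)               # single value sent upstream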
  • 6. International Journal of Computer Science and Business Informatics IJCSBI.ORG ISSN: 1694-2108 | Vol. 8, No. 1. DECEMBER 2013 4 2. A REVIEW OF EXISTING CLUSTERING ALGORITHMS Bandyopadhyay and Coyle anticipated EEHC [18], which is a randomized clustering algorithm which categorizes the sensor nodes into hierarchy of clusters with an objective of minimizing the total energy spent in the system to communicate the information gathered by the sensors to the information processing center. It has variable cluster count, the immobile cluster head aggregates and relays the data to the BS. It is valid for extensive large scale networks. The peculiar negative aspect of this algorithm is that, some nodes remain un-clustered throughout the clustering process. Barker, Ephremides and Flynn proposed LCA [19], which is chiefly developed to avoid the communication collisions among the nodes by using a TDMA time-slot. It makes utilization of single-hop scheme thereby attaining high degree of connectivity when CH is selected randomly. The restructured version of LCA, the LCA2 was implemented to lessen the number of nodes compared to the original LCA algorithm. The key drawback of this algorithm is that, the single-hop clustering leads to the creation of more number of clusters. Nagpal and Coore proposed CLUBS [20], which is executed with an idea to form overlapping clusters with maximum cluster diameter of two hops. The clusters are created by local broadcasting and its convergence depends on the local density of the wireless sensor nodes. This algorithm can be implemented in asynchronous environment without dropping efficiency. The main difficulty is the overlapping of clusters, clusters having their CHs within one hop range of each other, thereby both the clusters will collapse and CH election process will get restarted. Demirbas, Arora and Mittal brought out FLOC [21], which shows double- band nature of wireless radio-model for communication. The nodes can commune reliably with the nodes in the inner-band and unreliably with the nodes that are in the outer-band. The chief disadvantage of the algorithm is, the communication between the nodes in the outer band is unreliable and the messages have maximum probability of getting lost during communication. Ye, Li, Chen and Wu proposed EECS [22], which is based on a supposition that all CHs can communicate directly with the BS. The clusters have variable size, those closer to the CH are larger in size and those farther from CH are smaller in size. It is really energy efficient in intra-cluster communication and shows an excellent improvement in network lifetime.
  • 7. International Journal of Computer Science and Business Informatics IJCSBI.ORG ISSN: 1694-2108 | Vol. 8, No. 1. DECEMBER 2013 5 EEUC is anticipated for uniform energy consumption within the sensor network. It forms dissimilar clusters, with a guessing that each cluster can have variable sizes. Probabilistic selection of CH is the focal shortcoming of this algorithm. Few nodes will be gone without being part of any cluster. Yu, Li and Levy proposed DECA, which selects CH based on residual energy, connectivity and a node identifier. It is greatly energy efficient, as it uses lesser messages for CH selection. The main trouble with this algorithm is that high risk of wrong CH selection which leads to the discarding of every packets sent by the wireless sensor node. Ding, Holliday and Celik proposed DWEHC, which elects CH on the basis of weight, a combination of nodes’ residual energy and its distance to the neighboring nodes. It produces well balanced clusters, independent of network topology. A node possessing largest weight in a cluster is designated as CH. The algorithm constructs multilevel clusters and the nodes in every cluster reach CH by relaying through other intermediate nodes. The foremost problem occurs due to much energy utilization by several iterations until the nodes settle in most energy efficient topology. HEED is a well distributed clustering algorithm in which CH selection is done by taking into account the residual energy of the nodes and intra- cluster communication cost leading to prolonged network lifetime. It is clear that it can have variable cluster count and supports heterogeneous sensors. The problems with HEED are its application narrowed only to static networks, the employment of complex methods and multiple clustering messages per node for CH selection even though it prevents random selection of CH. 3. AN EVALUATION OF LEACH ALGORITHM LEACH [1] is one of the most well-liked clustering mechanisms for WSNs and it is considered as the representative energy efficient protocol. In this protocol, sensor nodes are unified together to form a cluster. In each cluster, one sensor node is chosen arbitrarily to act as a cluster head (CH), which collects data from its member nodes, aggregates them and then forwards to the base station. It disperses the operation unit into many rounds and each round consists of two phases: the set-up phase and the steady phase. During the set-up phase, initial clusters are fashioned and cluster heads are selected. All the wireless sensor nodes produce a random number between 0 and 1. If the number is lesser than the threshold, then the node selects itself as the cluster head for the present round. The threshold for cluster head selection in LEACH for a particular round is given in equation 1. Gone selecting
  • 8. International Journal of Computer Science and Business Informatics IJCSBI.ORG ISSN: 1694-2108 | Vol. 8, No. 1. DECEMBER 2013 6 itself as a CH, the sensor node broadcasts an advertisement message containing its own ID. The non-cluster head nodes then decide which cluster to join based on the strength of the received advertisement signal. After the decision is made, every non-cluster head node transmits a join-request message to the chosen cluster head to indicate that it will be a member of that cluster. After receiving all the join-request messages, the cluster head creates and broadcasts a time division multiple access (TDMA) schedule so that data can be exchanged with the non-cluster-head sensor nodes without collision.
T(n) = p / (1 - p * (r mod (1/p))) if n ∈ G, and T(n) = 0 otherwise (1)
where p is the desired percentage of cluster heads, r is the current round number and G is the set of nodes which have not been chosen as cluster head during the last 1/p rounds. The steady phase commences after the clusters are formed and the TDMA schedules are broadcast. All sensor nodes transmit their data to the cluster head once per round during their allotted transmission slot based on the TDMA schedule, and at all other times they turn off their radios in order to trim down energy consumption. The cluster heads, however, must stay awake all the time so that they can receive all data from the nodes within their own clusters. On receiving the data from the cluster, the cluster head carries out the data aggregation mechanism and forwards the result to the base station directly. This is the entire mechanism of the steady-state phase. After a certain predefined time, the network steps into the next round. LEACH is the basic clustering protocol built on the cluster approach, and it can prolong the network lifetime in comparison with other multi-hop and static routing schemes. However, there are still some hidden problems that should be considered. LEACH does not take the residual energy into account when electing cluster heads and constructing the clusters. As a result, nodes with lesser energy may be elected as cluster heads and then die much earlier. Moreover, since a node selects itself as a cluster head only according to the value of the calculated probability, it is hard to guarantee the number of cluster heads and their distribution. Also, since the cluster heads in LEACH are selected randomly, the weaker nodes drain easily. To rise above
  • 9. International Journal of Computer Science and Business Informatics IJCSBI.ORG ISSN: 1694-2108 | Vol. 8, No. 1. DECEMBER 2013 7 these shortcomings in LEACH, a model of distributed layer-based clustering algorithm is proposed, where clusters are arranged in to hierarchical layers. Instead of cluster heads directly sending the aggregated data to the base station, sends them to their next layer nearer cluster heads. These cluster heads send their data along with that received from lower level cluster heads to the next layer nearer cluster heads. The cumulative process gets repeated and finally the data from all the layers reach the base station. The proposed model is dedicated with some expensive designs, focusing on reduced energy utilization and improved network lifetime of the sensor network. 4. THE PROPOSED CLUSTERING ALGORITHM The proposed clustering algorithm is well distributed, where the sensor nodes are deployed randomly to sense the target environment. The nodes are divided into clusters with each cluster having a CH. The nodes throw the information during their TDMA timeslot to their respective CH which fuses the data to avoid redundant information by the process of data aggregation. The aggregated data is forwarded to the BS. Compared to the existing algorithms, the proposed algorithm has three distinguishing features. First, the aggregated data is forwarded from the cluster head to the base station through cluster head of the next higher layer with shortest distance between the cluster heads. Second, cluster head is elected based on the clustering factor, which is the combination of residual energy and the number of neighbors of a particular node within a cluster. Third, each cluster has a crisis hindrance node, that does the function of cluster head when the cluster head fails to carry out its work in some conditions. Figure 3: Aggregated data forwarding in the proposed algorithm
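To make the forwarding rule of Figure 3 concrete, the R sketch below picks, for a given cluster head, the nearest cluster head in the next higher layer as its next hop, with the highest layer forwarding directly to the base station. The coordinates, layer numbers and Euclidean-distance metric are assumptions for illustration; the paper only states that the next-higher-layer cluster head with the shortest distance is chosen.
  # Sketch of the layer-based forwarding rule: a layer-k cluster head relays its
  # aggregated data to the nearest cluster head in layer k + 1; the highest layer
  # sends directly to the base station. All numbers here are hypothetical.
  cluster_heads <- data.frame(
    id    = 1:4,
    layer = c(1, 2, 2, 3),
    x     = c(10, 25, 40, 60),
    y     = c(10, 30, 15, 50)
  )
  next_hop <- function(ch_id, chs) {
    me <- chs[chs$id == ch_id, ]
    if (me$layer == max(chs$layer)) return("base station")
    upper <- chs[chs$layer == me$layer + 1, ]
    d <- sqrt((upper$x - me$x)^2 + (upper$y - me$y)^2)  # distance to next-layer CHs
    upper$id[which.min(d)]                              # nearest next-layer CH
  }
  next_hop(1, cluster_heads)   # cluster head 1 relays through its nearest layer-2 CH
In a full protocol the same rule would be applied hop by hop until the aggregated data reaches the highest layer and then the base station.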
  • 10. International Journal of Computer Science and Business Informatics IJCSBI.ORG ISSN: 1694-2108 | Vol. 8, No. 1. DECEMBER 2013 8 A. Aggregated Data Forwarding In a network of N nodes, each node is assigned with an exclusive Node Identity (NID). The NID just serves as a recognition of the nodes and has no relationship with location or clustering. The CH will be placed at the center and the nodes will be organized in to several layers around the CH. Every clusters are arranged into hierarchical layers and layer numbers are assigned to each clusters. The cluster that is far away from the base station is designated as the lowest layer and the cluster nearer to the base station is designated as the highest layer. The main characteristic feature of the proposed algorithm is that the lowest layer cluster head forwards only its own aggregated data to the next layer cluster head but the highest layer forwards all the aggregated data from the preceding cluster heads to the base station (figure 3). Thus lower workload is assigned to the lower layers but the higher layers are assigned with greater workload. The workload assigned to a particular cluster head is directly proportional to the energy utilization of the cluster head. In order to balance the energy utilization among the cluster head, the concept of variable transmission power is employed, where the transmission power reduces with increase in layer numbers. In LEACH, each cluster head forwards the aggregated data to the base station directly which uses much energy. The proposed algorithm uses a multi-hop fashion of data forwarding from cluster head to the base station resulting in reduced energy utilization.
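The second and third features, detailed in the next two subsections, can also be sketched briefly. The paper describes the clustering factor only as a combination of residual energy and neighbor count, so the weighted sum below (with weight w) is an assumption, as is taking the runner-up node as the crisis hindrance node.
  # Sketch of clustering-factor-based selection (weighted sum assumed; the exact
  # combination is not fixed in the text). The highest-factor node becomes the
  # cluster head; here the runner-up serves as the crisis hindrance node.
  nodes <- data.frame(
    id              = 1:5,
    residual_energy = c(0.9, 0.6, 0.8, 0.4, 0.7),  # normalized, hypothetical
    n_neighbors     = c(3, 5, 4, 2, 6)
  )
  clustering_factor <- function(energy, neighbors, w = 0.5) {
    w * energy + (1 - w) * neighbors / max(neighbors)  # both terms scaled to [0, 1]
  }
  nodes$factor <- clustering_factor(nodes$residual_energy, nodes$n_neighbors)
  ranked <- nodes[order(-nodes$factor), ]
  cluster_head          <- ranked$id[1]
  crisis_hindrance_node <- ranked$id[2]   # promoted if the cluster head fails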
  • 11. International Journal of Computer Science and Business Informatics IJCSBI.ORG ISSN: 1694-2108 | Vol. 8, No. 1. DECEMBER 2013 9 Figure 4: Mechanism of cluster head selection in the proposed algorithm B. Cluster Head Selection The cluster head is elected based on the clustering factor (figure 4), which is the combination of residual energy and the number of neighbors of a particular node within a cluster. Residual energy is defined as the energy remaining within a particular node after some number of rounds. This is generally believed as one of the main parameter for CH selection in the proposed algorithm. A neighboring node is a node that remains closer to a particular node within one hop distance. LEACH selects cluster head only based on residual energy, but in the proposed algorithm an additional parameter is included basically to elect the cluster head properly, thereby to reduce the node death rate. The main characteristic feature of the proposed algorithm compared to LEACH is that, the base station does not involve in clustering process directly or indirectly. A node with highest clustering factor is selected as cluster head for the current round. This is generally significant in mobile environment, when the sensor nodes move, the number
  • 12. International Journal of Computer Science and Business Informatics IJCSBI.ORG ISSN: 1694-2108 | Vol. 8, No. 1. DECEMBER 2013 10 of neighbors vary which should be taken into account but it is barely not concentrated in the LEACH clustering mechanism. C. Alternate Crisis Hindrance Node In a cluster with large number of nodes, cluster crisis does not affect the overall performance of the wireless sensor system. But in the case of network with less number of nodes, cluster crisis greatly affects the wireless sensor system. Care should be done when cluster head selection process by applying alternate recovery mechanisms. In addition to the regular cluster head, additional cluster node is assigned the task of secondary cluster head, and the particular node is called as crisis hindrance node. Generally the cluster collapses when the cluster head fails. In such situations, crisis hindrance node act as cluster head and recovers the cluster. The main characteristic feature of the proposed algorithm is that, the crisis hindrance node solely performs the function of recovery mechanism and does not involve in sensing process. In case of LEACH, the distribution and the loading of CHs to all nodes in the networks is not uniform by switching the cluster heads periodically. Hence, there is a maximum probability of a cluster to be collapsed easily, but it can be avoided in the proposed algorithm with the help of crisis hindrance node. 6. CONCLUSION AND FUTURE WORK This paper gives a brief introduction on clustering process in wireless sensor networks. A study on the well evaluated distributed clustering algorithm Low Energy Adaptive Clustering Hierarchy (LEACH) is described artistically. To overcome the drawbacks of the existing LEACH algorithm, a model of distributed layer-based clustering algorithm is proposed for clustering the wireless sensor nodes. The proposed distributed clustering algorithm is based on the aggregated data being forwarded from the cluster head to the base station through cluster head of the next higher layer with shortest distance between the cluster heads. The selection of cluster head is based on the clustering factor, which is the combination of residual energy and the number of neighbors of a particular node within a cluster. Also each cluster has a crisis hindrance node. In future, the algorithm will be simulated using the network simulator and the simulated results will be compared with two or three existing distributed clustering algorithms. 7. ACKNOWLEDGMENTS Our sincere gratitude to the management of SVS Educational Institutions and my Research Supervisor Dr. S. Sophia who served as a guiding light to come out with this amazing research work.
  • 13. International Journal of Computer Science and Business Informatics IJCSBI.ORG ISSN: 1694-2108 | Vol. 8, No. 1. DECEMBER 2013 11 REFERENCES [1] W.B.Heinzelman, A.P.Chandrakasan, H.Balakrishnan, (2002), “An application specific protocol architecture for wireless microsensor networks”, IEEE Transactions on Wireless Communication Volume 1, Number 4, Pages 660-670. [2] O.Younis, S.Fahmy, (2004), “HEED: A hybrid energy-efficient distributed clustering approach for adhoc sensor networks”, IEEE Transactions on Mobile Computing, Volume 3, Number 4, Pages 366-379. [3] S.Zairi, B.Zouari, E.Niel, E.Dumitrescu, (2012), “Nodes self-scheduling approach for maximizing wireless sensor network lifetime based on remaining energy” IET Wireless Sensor Systems, Volume 2, Number 1, Pages 52-62. [4] I.Akyildiz, W.Su, Y.Sankarasubramaniam, E.Cayirci, (2002), “A Survey on sensor networks”, IEEE Communications Magazine, Pages 102-114. [5] G.J.Pottie, W.J.Kaiser, (2000), “Embedding the internet: wireless integrated network sensors”, Communications of the ACM, Volume 43, Number 5, Pages 51-58. [6] J.H.Chang, L.Tassiulas, (2004), “Maximum lifetime routing in wireless sensor networks”, IEEE/ACM Transactions on Networking, Volume 12, Number 4, Pages 609- 619. [7] S.R.Boselin Prabhu, S.Sophia, (2011), “A survey of adaptive distributed clustering algorithms for wireless sensor networks”, International Journal of Computer Science and Engineering Survey, Volume 2, Number 4, Pages 165-176. [8] S.R.Boselin Prabhu, S.Sophia, (2012), “A Research on decentralized clustering algorithms for dense wireless sensor networks”, International Journal of Computer Applications , Volume 57, Number 20, Pages 0975-0987. [9] S.R.Boselin Prabhu, S.Sophia, (2013), “Mobility assisted dynamic routing for mobile wireless sensor networks”, International Journal of Advanced Information Technology , Volume 3, Number 1, Pages 09-19. [10] S.R.Boselin Prabhu, S.Sophia, (2013), “A review of energy efficient clustering algorithm for connecting wireless sensor network fields”, International Journal of Engineering Research & Technology, Volume 1, Number 4, Pages 477–481. [11] S.R.Boselin Prabhu, S.Sophia, (2013), “Capacity based clustering model for dense wireless sensor networks”, International Journal of Computer Science and Business Informatics, Volume 5, Number 1. [12] J.Deng, Y.S.Han, W.B.Heinzelman, P.K.Varshney, (2005), “Balanced-energy sleep scheduling scheme for high density cluster-based sensor networks”, Elsevier Computer Communications Journal, Special Issue on ASWN04, Pages 1631-1642. [13] C.Y.Wen, W.A.Sethares, (2005), “Automatic decentralized clustering for wireless sensor networks”, EURASIP Journal of Wireless Communication Networks, Volume 5, Number 5, Pages 686-697. [14] S.D.Murugananthan, D.C.F.Ma, R.I.Bhasin, A.O.Fapojuwo, (2005) “A centralized energy-efficient routing protocol for wireless sensor networks”, IEEE Transactions on Communication Magazine, Volume 43, Number 3, Pages S8-13. [15] F.Bajaber, I.Awan, (2009), “Centralized dynamic clustering for wireless sensor networks”, Proceedings of the International Conference on Advanced Information Networking and Applications.
  • 14. International Journal of Computer Science and Business Informatics IJCSBI.ORG ISSN: 1694-2108 | Vol. 8, No. 1. DECEMBER 2013 12 [16] Pedro A. Forero, Alfonso Cano, Georgios B.Giannakis, (2011), “Distributed clustering using wireless sensor networks”, IEEE Journal of Selected Topics in Signal Processing, Volume 5, Pages 707-724. [17] Lianshan Yan, Wei Pan, Bin Luo, Xiaoyin Li, Jiangtao Liu, (2011), “Modified energy- efficient protocol for wireless sensor networks in the presence of distributed optical fiber sensor link, IEEE Sensors Journal, Volume 11, Number 9, Pages 1815-1819. [18] S.Bandyopadhay, E.Coyle, (2003), “An energy-efficient hierarchical clustering algorithm for wireless sensor networks”, Proceedings of the 22nd Annual Joint Conference of the IEEE Computer and Communications Societies (INFOCOM 2003), San Francisco, California. [19] D.J.Barker, A.Ephremides, J.A.Flynn, (1984), “The design and simulation of a mobile radio network with distributed control”, IEEE Journal on Selected Areas in Communications, Pages 226-237. [20] R.Nagpal, D.Coore, (2002), “An algorithm for group formation in an amorphous computer”, Proceedings of IEEE Military Communications Conference (MILCOM 2002), Anaheim, CA. [21] M.Demirbas, A.Arora, V.Mittal, (2004), “FLOC: A fast local clustering service for wireless sensor networks”, Proceedings of Workshop on Dependability Issues in Wireless Ad Hoc Networks and Sensor Networks (DIWANS’04), Italy. [22] M.Ye, C.F.Li, G.H.Chen, J.Wu, (2005), “EECS: An energy efficient clustering scheme in wireless sensor networks”, Proceedings of the Second IEEE International Performance Computing and Communications Conference (IPCCC), Pages 535-540.
  • 15. International Journal of Computer Science and Business Informatics IJCSBI.ORG ISSN: 1694-2108 | Vol. 8, No. 1. DECEMBER 2013 1 An Efficient Connection between Statistical Software and Database Management System Sunghae Jun Department of Statistics, Cheongju University Chungbuk 360-764 Korea ABSTRACT In big data era, we need to manipulate and analyze the big data. For the first step of big data manipulation, we can consider traditional database management system. To discover novel knowledge from the big data environment, we should analyze the big data. Many statistical methods have been applied to big data analysis, and most works of statistical analysis are dependent on diverse statistical software such as SAS, SPSS, or R project. In addition, a considerable portion of big data is stored in diverse database systems. But, the data types of general statistical software are different from the database systems such as Oracle, or MySQL. So, many approaches to connect statistical software to database management system (DBMS) were introduced. In this paper, we study on an efficient connection between the statistical software and DBMS. To show our performance, we carry out a case study using real application. Keywords Statistical software, Database management system, Big data analysis, Database connection, MySQL, R project. 1. INTRODUCTION Every day, huge data are created from diverse fields, and stored in computer systems. These big data are extremely large and complex [1]. So, it is very difficult to manage and analyze them. But, big data analysis is important issue in many fields such as marketing, finance, technology, or medicine. Big data analysis is based on statistics and machine learning algorithms. In addition, data analysis is depended on statistical software, and the data are stored in database systems. So, for big data analysis, we should manage statistical software and database system effectively. In this paper, we consider R project system as statistical software. R is an environment for statistical computing including statistical analysis and graphical display of data [2]. This program provides most of statistical and machine learning methods for big data analysis. We use MySQL for connecting database system from R project. The MySQL is a database management system (DBMS) product that is the most popular open source database in the world, in addition, this is a free software like R system [3]. So, in our research, we use R and MySQL for an efficient connection between statistical software and DBMS. There was a work about DB access through R [4]. This covered
  • 16. International Journal of Computer Science and Business Informatics IJCSBI.ORG ISSN: 1694-2108 | Vol. 8, No. 1. DECEMBER 2013 2 the DB access problems of R, and showed the ODBC (open database connectivity) drivers for connecting R and DBMS such as MySQL, PostgreSQL, and Oracle. Also, the authors of this paper introduced the installation and technological environment for the DB access. But, they did not illustrate detailed approaches for real applications. That is, their work was about a conceptual suggestion for the access of R to MySQL. So, in this paper, we perform more specific study for connection between statistical software, R to DBMS, MySQL. In our case study, we will show detailed and efficient connection of R to MySQL using specific data set from the University of California, Irvine (UCI) machine learning repository [5]. We will cover our research background in next section. In section 3, our proposed methodology will be shown. We also introduce an efficient connection between statistical database and DBMS in section 4. Lastly we conclude our study and offer our future works for statistical database system. 2. RESEARCH BACKGROUND 2.1 Statistical Software To analyze data, we can consider diverse approaches using statistical software. These days, there are so many products for statistical software. SAS (statistical analysis system) is the most popular software for statistical analysis [6]. But, this is expensive, so there are not many companies using SAS except large size companies. SPSS (statistical analysis in social science) is another representative software [7], but this is also expensive. Minitab [8] and S-Plus [9] are well used statistics packages and these are all not free. Recently, R has been used in many works for statistical data analysis, and this is free. In addition, R also provides most of statistical functions included in SAS, or SPSS. R is open source program, so we can modify R functions for our statistical computing. This is very useful advantage of R. Therefore, we consider R for connection to database system in this research. 2.2 Database Management System Database is a collection of data, and database management system (DBMS) is a software for managing database using structured query language (SQL) [10],[11]. Oracle is one of popular DBMS products [12], but it is expensive. MySQL is another DBMS, which is widely used open source software in the world [3]. Also, most functions of MySQL are similar to Oracle [3]. So, in this paper, we use MySQL for DBMS connecting to statistical software, R. Using MySQL DBMS efficiently, we use RODBC package supported by R CRAN in our research [13]. 3. STATISTICAL DATABASE SYSTEM The main goal of our study is to solve the cost problem for constructing statistical database system, because we should buy additional product to connect statistical software to DBMS. For example, for the connection
  • 17. International Journal of Computer Science and Business Informatics IJCSBI.ORG ISSN: 1694-2108 | Vol. 8, No. 1. DECEMBER 2013 3 between SAS and DBMS, we need „SAS/Access‟ product as supplementary software. In general, this is expensive. So, we tried to make the connection between statistical software and DBMS without cost. The „efficient‟ of our paper was about „cost‟. There are many approaches to connect statistical software and DBMS. To use most of them, we should buy additional products. But, there are few free approaches. So, we find an approach to connect statistical software and DBMS without cost. In this paper, we study an efficient connection between DBMS and statistical software. We select the MySQL as a DBMS for our research, and use R project as statistical software because not only they are free but also they have good functions. In addition, the R and MySQL have strong performance in statistical computing and DBMS respectively for constructing statistical database system [14],[15],[16],[17]. In general, big data are transformed to structured data type for statistical analysis as follow; Figure 1. From big data to statistical analysis First, big data are stored in DB by creating table. Second, big data are changed to structured data by preprocessing based on text mining. All data by DB and text mining are analyzed by statistical analysis. We find that text mining process is hard work for data preprocessing [18]. So, we know that table creation is more effective approach for big data analysis. To construct MySQL DB, we use console or graphic user interface (GUI) environments as follow; Figure 2. User interface of MySQL
  • 18. International Journal of Computer Science and Business Informatics IJCSBI.ORG ISSN: 1694-2108 | Vol. 8, No. 1. DECEMBER 2013 4 In this paper, we use SQL codes in the MySQL console. Also, we use RODBC as an ODBC database interface between R and MySQL [13]. In general R system, package is a set of additional R functions. R packages are not installed in basic R system. If we need to use a package, we have to add the package to the R system. Also we can search all packages from the R CRAN, and install them from the CRAN [19]. The RODBC package provides efficient functions for ODBC database access. So, our research is based on RODBC package to connect R to MySQL. To install RODBC in R system, we should select R CRAN mirror site. After RODBC installation, we load this package on R system as follow; >library (RODBC) The R system uses „library‟ function for loading a package. By this R code, we can use all functions provided by RODBC package such as odbcConnect, sqlFetch, and sqlQuery. They are used in our research for DB accessing and connecting. To connect MySQL DB, we use „odbcConnect‟ function of RODBC package as follow; >db_con =odbcConnect("stat_MySQL") User = , Password = , Database = The DSN is „stat_MySQL‟ and the „db_con‟ object of R system includes the connecting result. Also, in this connecting process, we decide user name, password, and determined database. If R and MySQL are connected each other, we can show the tables of MySQL DB using „sqlTables‟ function as follow; >sqlTables(con) TABLE_CAT TABLE_SCHEM TABLE_NAME TABLE_TYPE REMARKS The result of this function is the information of connected DB and its tables. 3.1 Structure of DB Connection Software In general, for connecting DBMS to application software, we should use ODBC connector [20]. R as a statistical software is also needed to ODBC driver to access MySQL DBMS. In this paper, we consider RODBC package for efficient connection between R and MySQL. Figure 3 shows the ODBC connection between DBMS and statistical software, and their specific products.
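A consolidated version of the connection steps above is sketched below; the DSN name stat_MySQL follows the text, while the user name and password are placeholders, and a working ODBC data source pointing at the MySQL server is assumed.
  # Minimal RODBC connection sketch (DSN, user and password are placeholders;
  # an ODBC data source for the MySQL server must already be configured).
  # install.packages("RODBC")            # one-time installation from CRAN
  library(RODBC)
  db_con <- odbcConnect("stat_MySQL", uid = "user", pwd = "password")
  sqlTables(db_con)                      # list the tables of the connected DB
  odbcClose(db_con)                      # release the connection when finished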
  • 19. International Journal of Computer Science and Business Informatics IJCSBI.ORG ISSN: 1694-2108 | Vol. 8, No. 1. DECEMBER 2013 5 Figure 3. Connection between DBMS and statistical software Oracle and MySQL are representative DBMS products, and SAS and R system are popular software for statistical analysis. General ODBC program is used for connecting application software to DBMS. So, there are so many ODBC drivers for diverse DBMS and application products. Our work is focused on the connection R and MySQL, and we select RODBC as an ODBC driver. The RODBC is a package of many R packages for DB accessing. RMySQL is another R package for R and MySQL [21]. This package is also R interface to access the MySQL DBMS. In addition to RODBC and RMySQL, there are some packages for connecting R to MySQL. In this paper, we use RODBC for MySQL accessing. This is an ODBC driver like SAS connection to DBMS as follow. Figure 4. Connection between MySQL/Oracle and SAS SAS uses some ODBC drivers for diverse DBMS such as MySQL and Oracle. Also, the drivers use their data source name (DSN). In this research, we also use DSN for RODBC package. Next, we show more detailed connection between R and MySQL. 3.2 Efficient Connection between R and MySQL The RODBC package of R system is an efficient ODBC connector. This includes diverse functions to access DBMS as follow; •odbcConnect: function for open connections to ODBC •sqlFetch: function for fetching tables from DB
  • 20. International Journal of Computer Science and Business Informatics IJCSBI.ORG ISSN: 1694-2108 | Vol. 8, No. 1. DECEMBER 2013 6 •sqlQuery: function for SQL query •sqlSave: function for writing data frame to table in DB Also, we can use more functions for accessing and manipulating MySQL DB by RODBC packages. The process of connection between R and MySQL is as follow; Figure 5. Connecting process between R and MySQL Using RODBC package, R system get necessary data from MySQL DB, and we analyze the connected data. Also, R system accesses to MySQL by sqlQuery function of RODBC, and create a table for storing analysis result using R system. Our process of connection between R and MySQL is shown as follow; Figure 6. Connecting process between R and MySQL A table of MySQL DB is transformed to an object in R by RODBC connector. So, we are able to analyze the object data from the DB table. We also perform online transaction processing (OLAP) for data summarization and visualization. Next, we carry out a case study for verifying our work.
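Under the same assumptions, the sketch below exercises the listed functions end to end: fetching a table, running a query, and writing a result table back to the database; the table and column names are hypothetical.
  # Round-trip sketch with the RODBC functions listed above (table and column
  # names are hypothetical; the DSN "stat_MySQL" is assumed to exist).
  library(RODBC)
  con <- odbcConnect("stat_MySQL", uid = "user", pwd = "password")
  readings <- sqlFetch(con, "sensor_readings")      # whole table as a data frame
  averages <- sqlQuery(con, "SELECT node_id, AVG(value) AS avg_value
                             FROM sensor_readings GROUP BY node_id")
  sqlSave(con, averages, tablename = "node_averages", rownames = FALSE)
  odbcClose(con)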
  • 21. International Journal of Computer Science and Business Informatics IJCSBI.ORG ISSN: 1694-2108 | Vol. 8, No. 1. DECEMBER 2013 7 4. CASE STUDY To illustrate a case study on a real problem, we used the 'RODBC' package from R-project [13]. This is the software for ODBC database connection between R and a DBMS such as MySQL. Also, we made the experiment using an example data set from the UCI machine learning repository [5]. 4.1 UCI Machine Learning Repository For our case study, we used the "Abalone" data set from the UCI machine learning repository [5]. This data set consists of 9 variables (columns) and 4,177 observations (rows). The main goal of the data is to predict the age of abalone from the physical measurements. The next table shows the variables and their values [5].
Table 1. Variables of the Abalone data set
Variable - Data type - Description
Sex - Nominal - M (male), F (female), I (infant)
Length - Continuous - longest shell measurement
Diameter - Continuous - perpendicular to length
Height - Continuous - with meat in shell
Whole_weight - Continuous - whole abalone
Shucked_weight - Continuous - weight of meat
Viscera_weight - Continuous - gut weight (after bleeding)
Shell_weight - Continuous - after being dried
Rings - Discrete - +1.5 gives the age in years
The last variable (Rings) is the target variable, and the others are all input variables. We constructed the MySQL DB using this data set. The original data file from the UCI machine learning repository is comma-separated, but MySQL needs a tab-separated file for DB loading. So, we transformed the data format using Excel as follows.
  • 22. International Journal of Computer Science and Business Informatics IJCSBI.ORG ISSN: 1694-2108 | Vol. 8, No. 1. DECEMBER 2013 8 Figure 7. Data transformation for MySQL loading To load text data file on MySQL, we should make a table to save these data. So, we create the table in next step. 4.2 DB Creation We used SQL to create table for loading Abalone data set on MySQL DBMS as follow; • CREATE DATABASE case_study; • USE case_study; • CREATE TABLE abalone( Sex CHAR(3), Length FLOAT(10), Diameter FLOAT(10), Height FLOAT(10), Whole_weight FLOAT(10), Shucked_weight FLOAT(10), Viscera_weight FLOAT(10), Shell_weight FLOAT(10), Rings INT(5)); • LOAD DATA INFILE 'd:/data/abalone.txt' INTO TABLE abalone; • SELECT * FROM abalone; Using above SQL codes, we constructed a table of Abalone data in MySQL DB(case_study). Next, we connected the table of abalone in case_study DB to R system. 4.3 Connecting R to MySQL We used RODBC package for connecting R to MySQL as follow;
  • 23. International Journal of Computer Science and Business Informatics IJCSBI.ORG ISSN: 1694-2108 | Vol. 8, No. 1. DECEMBER 2013 9 >library(RODBC) >abalone_con=odbcConnect("abalone_ODBC") >sqlTables(abalone_con) TABLE_SCHEM TABLE_NAME TABLE_TYPE case_study abalone TABLE >vars=sqlQuery(abalone_con, "SELECT sex, diameter, rings FROM abalone") Sex Diameter Rings 1 M 0.365 15 2 M 0.265 7 3 F 0.420 9 4 M 0.365 10 5 I 0.255 7 … Using above R codes, we saved three variables of abalone data set to „vars‟ R object. We found the abalone table was created well from the SQL query result by sqlQuery function. This function enabled the usage of SQL in R system. So, we analyzed abalone data using analytical functions of R system. Next, the result of data analysis is shown. 4.4 Data Analysis First, we performed data summarization of three variables using „summary‟ function of R system as follow; >summary(vars) sex diameter rings F:1307 Min. :0.0550 Min. : 1.000 I:1342 1st Qu.:0.3500 1st Qu.: 8.000 M:1528 Median :0.4250 Median : 9.000 Mean :0.4079 Mean : 9.934 3rd Qu.:0.4800 3rd Qu.:11.000 Max. :0.6500Max. :29.000 This function provided frequency or descriptive statistic according to data type (continuous or nominal). For example diameter is continuous variable, so we got minimum, 25 percentile, median, mean, 75 percentile, and maximum values. Next we carried out data visualization as follow; >boxplot(vars$diameter)
Figure 8. Boxplot: data visualization of MySQL table
This is the boxplot of the diameter variable of the abalone table. Using the graphical functions supported by the R system, we can also obtain diverse visualizations such as histograms, scatter plots, and so on. Lastly, we constructed a regression model using the 'lm' function as follows;
>regression_result=lm(rings~diameter, data=vars)
>summary(regression_result)
Residuals:
Min 1Q Median 3Q Max
-5.19 -1.69 -0.72 0.91 16.00
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 2.3186 0.1727 13.42 <2e-16 ***
diameter 18.6699 0.4115 45.37 <2e-16 ***
R-squared: 0.3302, Adj. R-squared: 0.3301
Regression is a popular model in statistical analysis. The dependent and independent variables are 'rings' and 'diameter' respectively, so we obtained the following regression equation: Rings = 2.3186 + 18.6699 * diameter. This completes the illustration of the connection between R and MySQL in our case study.
5. CONCLUSION
In this paper, we studied the efficient connection between a DBMS and statistical software. We used the R system and MySQL as the statistical software and DBMS respectively, and the RODBC package for the database connection. After connecting R and MySQL, we analyzed the data of a MySQL table, and this approach can be expanded to big data analysis. In our
case study, we illustrated how our approach can be applied in a real application. We selected the Abalone data set from the UCI machine learning repository for the case study. Our result contributes to work related to big data analysis; in addition, the data held in a DBMS can be analyzed directly by statistical methods. In future work, we will expand the scope of the connection between DBMS and statistical software to more products.
6. DISCUSSION
The biggest problem of a statistical database system is the cost of connecting the statistical software to the DBMS. For example, the 'SAS/Access' product must be purchased separately and installed on top of the SAS base system to connect SAS to a DBMS. This supplementary product is generally expensive, so many users have had difficulty using a statistical database system. In this paper, we selected the R system as the statistical software instead of SAS, and we used RODBC as the ODBC connector instead of SAS/Access, because R and RODBC are both free. Their performance is nevertheless similar to that of SAS, and for newer analytical functions such as statistical learning theory and machine learning algorithms they surpass SAS.
REFERENCES
[1] Sathi, A. Big Data Analytics. An Article from IBM Corporation, 2012.
[2] Heiberger, R. M., and Neuwirth, E. R through Excel – A Spreadsheet Interface for Statistics, Data Analysis, and Graphics. Springer, 2009.
[3] MySQL, The World's most popular open source database. http://www.mysql.com, accessed on October 2013.
[4] Sim, S., Kang, H., and Lee, Y. Access to Database through the R-Language. The Korean Communications in Statistics, 15, 1 (2008), 51-64.
[5] UCI Machine Learning Repository, http://archive.ics.uci.edu/ml, accessed on October 2013.
[6] SAS, http://www.sas.com, accessed on October 2013.
[7] SPSS, http://www-01.ibm.com/software/analytics/spss/, accessed on October 2013.
[8] Minitab, http://www.minitab.com, accessed on October 2013.
[9] S-Plus, http://solutionmetrics.com.au/products/splus/, accessed on October 2013.
[10] Wikipedia, the free encyclopedia. http://en.wikipedia.org, accessed on October 2013.
[11] Date, C. J. An Introduction to Database Systems. 7th edition, Addison-Wesley, 2000.
[12] Oracle, http://www.oracle.com, accessed on October 2013.
[13] Ripley, B. Package RODBC. CRAN R-Project, 2013.
[14] R-bloggers, On R versus SAS. http://www.r-bloggers.com/on-r-versus-sas/, accessed on December 2013.
[15] LinkedIn, Advanced Business Analytics, Data Mining and Predictive Modeling. http://www.linkedin.com/groups/SAS-versus-R-35222.S.65098787, accessed on December 2013.
[16] Clever Logic, MySQL vs. Oracle Security, http://cleverlogic.net/articles/mysql-vs-oracle, accessed on December 2013.
  • 26. International Journal of Computer Science and Business Informatics IJCSBI.ORG ISSN: 1694-2108 | Vol. 8, No. 1. DECEMBER 2013 12 [17]Find The Best, Oracle vs MySQL, http://database-management- systems.findthebest.com/saved_compare/Oracle-vs-MySQL, accessed on December, 2013. [18]Han, J., and Kamber, M. Data Mining Concepts and Techniques. Morgan Kaufmann, 2001. [19]R system, The R Project for Statistical Computing. http://www.r-project.org, accessed on October 2013. [20]Spector, P. Data Manipulation with R, Springer, 2008. [21]James, D. A., and DebRoy, S.Package RMySQL. CRAN R-Project, 2013.
  • 27. International Journal of Computer Science and Business Informatics IJCSBI.ORG ISSN: 1694-2108 | Vol. 8, No. 1. DECEMBER 2013 1 Pragmatic Approach to Component Based Software Metrics Based on Static Methods S. Sagayaraj Department of Computer Science Sacred Heart College, Tirupattur M. Poovizhi Department of Computer Science Sacred Heart College, Tirupattur ABSTRACT Component-Based Software Engineering (CBSE) is an emerging technique for reuse of software. This paper presents the component based software metrics by investigating the improved measurement techniques. Two types of metrics are used: static metrics and dynamic metrics. This research work presents the measured metric value for Complexity metrics and Criticality metric. The static metrics applied to the E-healthcare application which is developed with the reusable software components. The value of each metric is analyzed with the application. The metric measured value is the evidence for the reusability, good maintainability of component based software system. Keywords Component Based Software Engineering, Component Based Software Metrics, Component Based Software System. 1. INTRODUCTION The demand for new software applications is currently increasing at the exponential rate. The number of qualified and experienced professionals required for creating new software/applications is not increasing commensurably [1]. Software Reuse applications are built from existing components, primarily by assembling and replacing interoperable parts. So, software professionals have recognized reuse as a powerful means of potentially overcoming the above said software crisis and it promises significant improvements in software productivity and quality [2]. There are two approaches for reuse of code: develop the reusable code from scratch or identify and extract the reusable code from already developed code [3]. The organizations have experience in developing software, there exists extra cost to develop the reusable components from scratch to build and strengthen their reusable software reservoir. The cost of developing the software from scratch can be saved by identifying and extracting the reusable components from already developed and existing software systems or legacy systems [4]. But the problem of how to recognize reusable components from existing systems has remained relatively unexplored. In
  • 28. International Journal of Computer Science and Business Informatics IJCSBI.ORG ISSN: 1694-2108 | Vol. 8, No. 1. DECEMBER 2013 2 both the cases, whether the organization is developing software from scratch or reusing code from already developed projects, there is a need of evaluating the quality of the potentially reusable piece of software. Metrics is very essential to prove the quality of the components [5]. Software metrics are an essential part of the state-of-the-practice in software engineering. Goodman describes software metrics as: "The continuous application of measurement-based techniques to the software development process and its products to supply meaningful and timely management information, together with the use of those techniques to improve that process and its products"[6].Software metrics can do one of four functions such as understand, evaluate, control, predict. Various attributes, which determine the quality of the software, include maintainability, defect density, fault proneness, normalized rework, understandability, reusability etc [5]. To achieve both the quality and productivity objectives it is always recommended to go for the software reuse that not only saves the time taken to develop the product from scratch but also delivers the almost error free code, as the code is already tested many times during its software development [7]. During the last decade, the software reuse and software engineering communities have come to better understanding on component-based software engineering. The development of a reuse process and repository produces a base of knowledge that improves in excellence after every reuse, minimizing the amount of development work necessary for future projects, and ultimately reducing the risk of new projects that are based on repository knowledge [8]. CBSD centers on building large software systems by integrating previously existing software components. By enhancing the flexibility and maintainability of systems, this approach can potentially be used to reduce software development costs, assemble systems rapidly, and reduce the spiraling maintenance burden associated with the support and upgrade of large systems [9]. The paper is organized as follows: The related work on component based software metric is provided in Section 2. The list of Component based static and dynamic metrics in section 3. The detail of implementation is presented in Section 4. The analysis of complexity metrics and criticality metrics is described in section 5. Finally, the last section concludes the paper and offers further research in this area.
2. RELATED WORKS
Many works have been carried out in the area of component based software metrics. Some of them are summarized below.
In 2006, Nael Salman focused mainly on the complexity that results from factors related to system structure and connectivity [10]. A new set of properties that a component-oriented complexity metric must possess is also defined, and the metrics have been evaluated against these properties. A case study was conducted to assess the power of the complexity metrics in predicting integration and maintenance efforts. The results of the study revealed that component-oriented complexity metrics can be of great value in predicting both integration and maintenance efforts.
In 2007, Arun Sharma, Rajesh Kumar, and P. S. Grover surveyed a few existing component-based reusability metrics [11]. These metrics gave a broader view of a component's understandability, adaptability, and portability. The work also expresses the analysis, in terms of quality factors related to reusability, within an approach that helps significantly in assessing existing components for reusability.
In 2009, V. Lakshmi Narasimhan, P. T. Parthasarathy, and M. Das analyzed, evaluated and benchmarked a series of metrics proposed by various researchers using several large-scale openly available software systems [12]. A systematic analysis of the metric values was carried out and several key inferences were drawn from them, including inferences on the complexity, reusability, testability, modularity and stability of the underlying components.
In 2009, Misook Choi, Injoo J. Kim, Jiman Hong and Jungyeop Kim suggested component-based metrics applying the strength of dependency between classes; to increase the quality of components, they proposed these metrics so that measurement can be performed more precisely [13]. In addition, they proved the theoretical soundness of the proposed metrics by the axioms of Briand et al. and demonstrated their accuracy and practicality through a comparison with conventional metrics in the component development phase.
Majdi Abdellatief, Abu Bakar Md Sultan, Abdul Azim Abd Ghani and Marzanah A. Jabar observed that dependency between components is one of the most important issues affecting the structural design of a Component-
  • 30. International Journal of Computer Science and Business Informatics IJCSBI.ORG ISSN: 1694-2108 | Vol. 8, No. 1. DECEMBER 2013 4 Based Software System (CBSS) in 2011 [14]. Two sets of metrics that is, Component Information Flow Metrics and Component Coupling Metrics are proposed based on the concept of Component Information Flow from CBSS designer’s point of view. Jianguo Chen, Hui Wang, Yongxia Zhou, Stefan D. Bruda presented some such efforts by investigating the improved measurement tools and techniques, i.e., through the effective software metrics in 2011 [15]. Coupling, Cohesion and interface metrics are proposed newly and evaluated those metrics. The previous research explained the work done with varieties of Component Based Software Metrics. This paper deals about the static and dynamic metrics of component based software. This work is extended by developing the E-Healthcare application and the results are carried out for the static metrics. 3. COMPONENT BASED SOFTWARE METRICS The traditional software metrics focus on non-CBSS and are inappropriate to CBSS mainly because the component size is normally not known in advance. Inaccessibility of the source code for some components prevents comprehensive testing. So, the component based metrics are defined to evaluate the component based application. There are two types of metrics considered in this paper for measuring the values.  Static Metric Static metrics cover the complexity and the criticality within an integrated component. Static metrics are collected from static analysis of component assembly. The complexity and criticality metrics are intended to be used early during the design stage. The list of static metrics [16] is provided in Table 1.  Dynamic metric Dynamic metrics are gathered during execution of complete application. Dynamic metrics are meant to be used at implementation stage. The dynamic metrics are listed in Table 2 [15].
Table 1. Static Metrics
Sl.no | Metric Name | Formula
1 | Component Packing Density Metric | CPD = #constituents / #components
2 | Component Interaction Density Metric | CID = #I / #Imax
3 | Component Incoming Interaction Density | CIID = #Iin / #Imax_in
4 | Component Outgoing Interaction Density | COID = #Iout / #Imax_out
5 | Component Average Interaction Density | CAID = (sum of CID over all components) / #components
6 | Bridge Criticality Metric | CRIT bridge = #bridge_component
7 | Inheritance Criticality Metric | CRIT inheritance = #root_component
8 | Link Criticality Metric | CRIT link = #link_component
9 | Size Criticality Metric | CRIT size = #size_component
10 | #Criticality Metric | CRIT all = CRIT bridge + CRIT inheritance + CRIT link + CRIT size
Table 2. Dynamic Metrics
Sl.no | Metric Name | Formula
1 | Number of Cycles (NC) | NC = #cycles
2 | Average Number of Active Components |
3 | Active Component Density (ACD) |
4 | Average Active Component Density |
5 | Peak Number of Active Components | ACΔt = max { AC1, ..., ACn }
4. IMPLEMENTATION
The E-Healthcare application is developed to measure the static metrics. The application is designed as a number of components; the metrics are applied to the application and their values are measured. There are five modules in the E-healthcare application.
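Before walking through the application modules, the density formulas in Table 1 can be made concrete with a small sketch. This is only an illustration (the E-Healthcare application itself is not written in R); the helper names are ours, and the counts are the ones reported in the analysis of Section 5.

# density metrics of Table 1 expressed as simple ratios (illustrative helpers, not the paper's code)
cpd <- function(constituents, components) sum(constituents) / components   # Component Packing Density
density <- function(actual, maximum) actual / maximum                      # CID, CIID and COID share this form

ops <- c(3, 4, 4, 6, 5, 1, 19)   # operations per component, as listed later in Table 3
cpd(ops, 7)                      # 42 / 7 = 6 operations per component on average (Section 5.1)
density(51, 87)                  # Component Interaction Density, about 0.586 (Section 5.2)
density(37, 51)                  # Component Incoming Interaction Density, about 0.725 (Section 5.3)
density(28, 46)                  # Component Outgoing Interaction Density, about 0.609 (Section 5.4)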
4.1 Admin
The Admin module stores user, doctor and admin details. The admin is responsible for managing every record in the database.
4.2 Appointments and payments
This module is used to add and drop doctor details and to help users get appointments. The admin is the person responsible for adding new doctor details; an existing doctor can also be deleted by the admin.
4.3 Diagnosis and health
The Diagnosis and Health module is used to retrieve a user's diagnosis details. The information of users who take treatment through the application is stored in the database.
4.4 First aid and E-certificate
This module is used to get blood bank details for a required blood group. First aid medicine details for a particular disease are provided to the users, and the user can look up the treatment type, which helps in an emergency.
4.5 Symptoms and alerts
The Symptoms and alerts module is used to check the blood pressure (BP) level of the user. Patient information is retrieved from the database, and the symptoms and causes of a disease help users to prevent it. The pictorial representation of the modules of the application is shown in Figure 1.
Figure 1. Modules in E-healthcare Application (Admin, Appointments and payments, Diagnosis and health, First aid and E-certificates, Symptoms and Alerts)
Components are created to develop the whole application. The components (admin, appointments and payments, diagnosis and health, firstaid and e-certificate, symptoms and alerts, DBHelper, EhealthBL) are required to complete the component based application called E-Healthcare. The static metrics are applied to these components, and each component's value is measured according to the metric formula. The metric analysis is carried
out manually with the application. The metric values are calculated with the help of the database tables, web page forms and components.
5. ANALYSIS
The analysis is made to show that the CBSS has good reusability, maintainability and independence. The analyses of the Component Packing Density Metric, the Component Interaction Density Metrics (incoming, outgoing, average) and the Criticality Metrics are as follows:
5.1 Component Packing Density Metric
CPD is used to measure the number of operations that each component contains. The CPD is defined as the ratio of #constituents (LOC, objects/classes, operations, classes and/or modules) to #components.
#Constituent = one of the following: LOC, objects/classes, operations, classes and/or modules
#Component = number of components
For this metric, the number of operations of each component is listed in Table 3.
Table 3. Component Packing Density
S.No | Component Name | No. of operations
1 | Admin | 3
2 | Appointments and payments | 4
3 | Diagnosis and health | 4
4 | Firstaid and e-certificate | 6
5 | Symptoms and alerts | 5
6 | DBHelper | 1
7 | EhealthBL | 19
CPD = (3 + 4 + 4 + 6 + 5 + 1 + 19) / 7 = 42 / 7 = 6
Hence, the CPD metric tells us the average number of operations that each component contains.
5.2 Component Interaction Density Metric
The CID is defined as the ratio of actual interactions over potential ones. A higher interaction density causes a higher complexity in the interaction [17]. The CID metric is applied to the E-Healthcare application. The measured value of the actual interactions in each component of E-Healthcare is illustrated in Table 4.
  • 34. International Journal of Computer Science and Business Informatics IJCSBI.ORG ISSN: 1694-2108 | Vol. 8, No. 1. DECEMBER 2013 8 #I = no. of actual interactions #Imax = no. of maximum available interactions. Table 4. Actual interactions S.No Name of the page No. of actual interactions 1 Registration.aspx 4 2 Postquestion.aspx 2 3 Search.aspx 5 i/p, 5 o/p 4 Doctormanagement.aspx 6 5 Diagnosis.aspx 1 6 Searchmedicine.aspx 2 i/p, 3 o/p 7 Medicine.aspx 5 8 Bloodbank.aspx 4 9 Firstaidsuggestion.aspx 2 10 Medicalcertificate.aspx 3 11 Treatmenttype.aspx 1 i/p, 2 o/p 12 Symptoms.aspx 1 i/p, 3 o/p Total 51 The actual interaction value between other components is 51. The maximum no. of available interaction with other component is 87 =51/87 = 0.586 This metric brings out the number of incoming and outgoing interactions available in each component. This metric helps to know which component has greater connectivity with other component. 5.3 Component Incoming Interaction Density CIID is defined as a ratio of number of incoming interactions and maximum number of incoming interactions. A higher interaction density causes a higher complexity in the interaction. The no. of actual incoming interactions in each component is shown in the Table 5. #I in = no. of incoming interactions #Imax in = maximum no. of available incoming interactions.
Table 5. Incoming Interactions
S.No | Name of the page | No. of incoming interactions
1 | Registration.aspx | 4
2 | Postquestion.aspx | 1
3 | Search.aspx | 5
4 | Doctormanagement.aspx | 4
5 | Diagnosis.aspx | 1
6 | Searchmedicine.aspx | 2
7 | Medicine.aspx | 5
8 | Bloodbank.aspx | 4
9 | Firstaidsuggestion.aspx | 1
10 | Medicalcertificate.aspx | 2
11 | Treatmenttype.aspx | 4
12 | Symptoms.aspx | 4
Total | | 37
The number of incoming interactions is 37, and the maximum number of available incoming interactions is 51; out of those 51 interactions, only 37 actually link to other components.
CIID = 37/51 = 0.725
The CIID value of 0.725 clearly states that the density of incoming interactions with other components is high.
5.4 Component Outgoing Interaction Density
COID is defined as the ratio of the number of outgoing interactions to the maximum number of outgoing interactions. A higher interaction density causes a higher complexity in the interaction. The number of outgoing interactions in each component is shown in Table 6.
#I out = no. of outgoing interactions
#Imax out = maximum no. of available outgoing interactions
Table 6. Outgoing Interactions
S.No | Name of the page | No. of outgoing interactions
1 | Registration.aspx | 2
2 | Postquestion.aspx | 1
3 | Search.aspx | 5
4 | Doctormanagement.aspx | 1
5 | Diagnosis.aspx | 3
6 | Searchmedicine.aspx | 3
7 | Medicine.aspx | 1
8 | Bloodbank.aspx | 3
9 | Firstaidsuggestion.aspx | 1
10 | Medicalcertificate.aspx | 1
11 | Treatmenttype.aspx | 4
12 | Symptoms.aspx | 3
Total | | 28
The number of outgoing interactions is 28, and the maximum number of available outgoing interactions is 46; only 28 outgoing interactions are actually connected with other components.
COID = 28/46 = 0.608
The calculated value of 0.608 shows that there is a substantial density of outgoing interactions among the components.
5.5 Component Average Interaction Density
CAID represents the sum of the CID of each component divided by the number of components.
#components = number of components in the system
CAID = (sum of the interaction densities of the n components) / (no. of existing components)
Admin: The actual interfaces (incoming and outgoing) of the admin component are listed. The sum of the interaction density values for the admin component is shown in Table 7.
Table 7. Sum of CID for admin component
S.No | Name of the page | Sum of CID for admin component
1 | Registration.aspx | 4 out of 13 (only 4 of its 13 interfaces interact with other components)
2 | Login.aspx | 2 out of 2
3 | Postquestion.aspx | 1 out of 1
The summation of CID for the Admin component is 7/16: seven actual interactions out of sixteen possible ones. This component has good reliability.
Appointments and payments: The sum of the interaction density of the appointments and payments component is shown in Table 8. The sum considers both the incoming and outgoing interfaces of the component.
Table 8. Sum of CID for appointments and payments component
S.No | Name of the page | Sum of CID for appointments and payments component
1 | Search.aspx | 2 out of 2
2 | To get appointment | 4 out of 4
3 | Doctormanagement.aspx | 4 out of 6
The summation of CID for the Appointments and payments component is 10/12: 10 of its 12 interfaces have links with other components.
Diagnosis and health: The sum of the interaction density of the diagnosis and health component is shown in Table 9.
Table 9. Sum of CID for diagnosis and health component
S.No | Name of the page | Sum of CID for diagnosis and health component
1 | Diagnosis.aspx | 1 out of 2
2 | Searchmedicine.aspx | 2 out of 2
3 | Medicine.aspx | 4 out of 5
The summation of CID for the Diagnosis and health component is 7/9: 7 of its 9 interfaces take part in interactions with other components.
Firstaid and e-certificate: Table 10 shows the sum of CID for the component called firstaid and e-certificates.
Table 10. Sum of CID for firstaid and e-certificates
S.No | Name of the page | Sum of CID for firstaid and e-certificate
1 | Bloodbank.aspx | 1 out of 1, 3 out of 3
2 | Firstaidsuggestion.aspx | 1 out of 1
3 | Medicalcertificate.aspx | 2 out of 4
4 | Treatmenttype.aspx | 1 out of 1, 3 out of 7
The summation of CID for the Firstaid and E-certificates component is 11/17: out of 17 interfaces, only 11 interactions are connected with the rest of the components.
Symptoms and alerts: Table 11 shows the sum of CID for the component called symptoms and alerts.
Table 11. Sum of CID for symptoms and alerts
S.No | Name of the page | Sum of CID for symptoms and alerts
1 | Searchpatient.aspx | 1 out of 1, 3 out of 3
The summation of CID for the Symptoms and alerts component is 4/4; this component is completely connected with the other components.
The Component Average Interaction Density metric takes the ratio between the sum of the CID of each component and the number of existing components.
CAID = (7/16 + 10/12 + 7/9 + 11/17 + 4/4) / 7 = 0.5279
The measured value of this metric indicates good reliability of the components.
5.6 Bridge Criticality Metric
The bridge criticality metric is used to identify the bridge component; the component which acts as a bridge between other components is the bridge component.
CRIT bridge = #bridge_component
Out of the 7 components, EhealthBL acts as a bridge component between the other components and from the code-behind to the database. It contains all the queries to store and retrieve the information.
So, the bridge_component value is 1. This value explicitly tells us that one component operates as a bridge component for all the other components.
5.7 Inheritance Criticality Metric
Inheritance is deriving a new component from an existing component; the existing component is called the root component.
CRIT inheritance = #root_component
The interface is inherited from the existing/derived component. Root components:
• Symptoms and alerts (patient info inherited by the diagnosis component)
• EhealthBL (the query is inherited from the base query)
So, the root component value is 2, which shows that object-oriented programming concepts are utilized between the components.
5.8 Link Criticality Metric
The link criticality metric is used to identify the link component; the component which provides a link to other components is called the link component.
CRIT link = #link_component
The link component value is 1 (DBHelper). This value shows that the component acts as a link between the code-behind pages and the database.
5.9 Size Criticality Metric
The size criticality metric is used to identify the components which exceed the critical size level; such a component is called a size component.
CRIT size = #size_component
The size component value is 0. The critical size level is 60 lines per component, and no component exceeds this level.
5.10 # Criticality Metric
The sum of the bridge criticality, inheritance criticality, link criticality and size criticality is known as the Criticality Metric.
CRIT all = CRIT bridge + CRIT inheritance + CRIT link + CRIT size
CRIT all = 1 + 2 + 1 + 0 = 4
The combined value of 4 shows that a considerable degree of criticality is present.
Threshold Value
The threshold value is fixed at 0.5 and is used to compare the computed value of each metric. The comparison with this threshold value checks
whether the metric value increases or decreases with respect to reusability and good maintainability. Table 12 shows the result of the comparison with the threshold value.
Table 12. Comparison with threshold value
Metric Name | Comparison with Threshold Value
Component Packing Density Metric | Increasing
Component Interaction Density Metric | Increasing
Component Incoming Interaction Density | Increasing
Component Outgoing Interaction Density | Increasing
Component Average Interaction Density | Increasing
Bridge Criticality Metric | Increasing
Inheritance Criticality Metric | Increasing
Link Criticality Metric | Increasing
Size Criticality Metric | Decreasing
6. CONCLUSIONS
Building software systems with reusable components brings many advantages to organizations. Reusability may have several direct or indirect factors such as cost, effort, and time. This paper discussed various aspects of reusability for component-based systems and gave an insight into various reusability metrics for such systems. The qualities of the components are measured by applying the metrics to an e-healthcare application in an electronic commerce domain. The component-based metrics help to improve the quality of the designed components and to develop a component based system with good maintainability, reusability, and independence. Most of the metrics allow for future enhancements, which will help to add features later. The demand for new software applications is currently increasing at an exponential rate, and future enhancements will help to fulfill those requirements. The dynamic metric analysis can also be applied to the component based software application and validated, and based on further applications, enhanced metrics can be proposed for component based software systems.
REFERENCES
[1] Dr. Nedhal A. Al Saiyd, Dr. Intisar A. Al Said, Ahmed H. Al Takrori, Semantic-Based Retrieving Model of Reuse Software Component, IJCSNS International Journal of Computer Science and Network Security, Vol. 10, No. 7, July 2010.
[2] Joaquina Martín-Albo, Manuel F. Bertoa, Coral Calero, Antonio Vallecillo, Alejandra Cechich and Mario Piattini, CQM: A Software Component Metric Classification Model.
[3] Anas Bassam AL-Badareen, Mohd Hasan Selamat, Marzanah A. Jabar, Jamilah Din, Sherzod Turaev, Reusable Software Component Life Cycle, International Journal of Computers, Issue 2, Volume 5, 2011.
[4] Chintakindi Srinivas, Dr. C. V. Guru Rao, Software Reusable Components With Repository System, International Journal of Computer Science & Informatics, Volume 1, Issue 1, 2011.
[5] Parvinder S. Sandhu, Harpreet Kaur, and Amanpreet Singh, Modeling of Reusability of Object Oriented Software System, World Academy of Science, Engineering and Technology 56, 2009.
[6] Sarbjeet Singh, Manjit Thapa, Sukhvinder Singh and Gurpreet Singh, International Journal of Computer Applications (0975 – 8887), Volume 8, No. 12, October 2010.
[7] Linda L. Westfall, Seven steps to designing a software metrics, Principles of software measurement services.
[8] K. S. Jasmine and R. Vasantha, DRE – A Quality Metric for Component Based Software Products, World Academy of Science, Engineering and Technology 34, 2007.
[9] Iqbaldeep Kaur, Parvinder S. Sandhu, Hardeep Singh, and Vandana Saini, Analytical Study of Component Based Software Engineering, World Academy of Science, Engineering and Technology 50, 2009.
[10] Nael Salman, Complexity Metrics as Predictors of Maintainability and Integrability of Software Components, Journal of Arts and Science, May 2006.
[11] Arun Sharma, Rajesh Kumar, and P. S. Grover, A Critical Survey of Reusability Aspects for Component-Based Systems, World Academy of Science, Engineering and Technology 33, 2007.
[12] V. Lakshmi Narasimhan, P. T. Parthasarathy, and M. Das, Evaluation of a Suite of Metrics for CBSE, Issues in Informing Science and Information Technology, Vol. 6, 2009.
[13] Misook Choi, Injoo J. Kim, Jiman Hong, Jungyeop Kim, Component-Based Metrics Applying the Strength of Dependency between Classes, ACM Journal, March 2009.
[14] Majdi Abdellatief, Abu Bakar Md Sultan, Abdul Azim Abd Ghani, Marzanah A. Jabar, Component-based Software System Dependency Metrics based on Component Information Flow Measurements, ICSEA 2011.
[15] Jianguo Chen, Hui Wang, Yongxia Zhou, Stefan D. Bruda, Complexity Metrics for Component-based Software Systems, International Journal of Digital Content Technology and its Applications, Vol. 5, No. 3, March 2011.
[16] V. Lakshmi Narasimhan and Bayu Hendradjaya, Theoretical Considerations for Software Component Metrics, World Academy of Science, Engineering and Technology 10, 2005.
[17] E. S. Cho, M. S. Kim, S. D. Kim, Component Metrics to Measure Component Quality, the 8th Asia-Pacific Software Engineering Conference (APSEC), Macau, 2001, pp. 419-426.
  • 42. International Journal of Computer Science and Business Informatics IJCSBI.ORG ISSN: 1694-2108 | Vol. 8, No. 1. DECEMBER 2013 1 SDI System with Scalable Filtering of XML Documents for Mobile Clients Yi Yi Myint Department of Information and Communication Technology University of Technology (Yatanarpon Cyber City) Pyin Oo Lwin, Mandalay Division, Myanmar Hninn Aye Thant Department of Information and Communication Technology University of Technology (Yatanarpon Cyber City) Pyin Oo Lwin, Mandalay Division, Myanmar ABSTRACT As the number of user grows and the amount of information available becomes even bigger, the information dissemination applications are gaining popularity in distributing data to the end users. Selective Dissemination of Information (SDI) system distributes the right information to the right users based upon their profiles. Typically, the exploitation of Extensible Markup Language (XML) representation entails the profile representation, and the utilization of the XML query languages assist the employment of queries indexing techniques in SDI systems. As a consequence of these advances, mobile information retrieval is crucial to share the vast information from diverse data sources. However, the inherent limitations of mobile devices require information to be delivered to mobile clients to be highly personalized consistent with their profiles. In this paper, we address the issue of scalable filtering of XML documents for mobile clients. We describe an efficient indexing mechanism by enhancing XFilter algorithm based on a modified Finite State Machine (FSM) approach that can quickly locate and evaluate relevant profiles. Finally, our experimental results show that the proposed indexing method outperforms the previous XFilter algorithm in time aspect. Keywords XML, FSM, scalable filtering, SDI. 1. INTRODUCTION Nowadays the SDI System becomes increasingly an important research area and industrial topic. Obviously, there is a trend to create new applications for small and light computing devices such as cell phones and PDAs. Amongst the new applications, mobile information dissemination applications (e.g. electronic personalized newspapers delivery, ecommerce site monitoring, headline news, alerting services for digital libraries, etc.) deserve special attention. Recently, there have been a number of efforts to build efficient large-scale XML filtering systems. In an XML filtering system [4], constantly arriving streams of XML documents are passed through a filtering engine that matches documents to queries and routes the matched documents
  • 43. International Journal of Computer Science and Business Informatics IJCSBI.ORG ISSN: 1694-2108 | Vol. 8, No. 1. DECEMBER 2013 2 accordingly. XML filtering techniques comprise a key component of modern SDI applications. XML [3] is becoming a standard for information exchange and a textual representation of data that is designed for the description of the content, especially on the internet. The basic mechanism used to describe user profiles in XML format is through the XPath query language. XPath is a query language for addressing parts of an XML document. However, this technique often suffers from restricted capability to express user interests, being unable to rightly capture the semantics of the user requirements. Therefore, expressing deeply personalized profiles require a querying power just like SQL provides on relational databases. Moreover, as the user profiles are complex in mobile environment, a more powerful language than XPath is needed. In this case, the choice is XML-QL. XML-QL [7] has more expressive power compared to XPath and it is also measured the most powerful among all XML query languages. XML-QL’s querying power and its elaborate CONSTRUCT statement allows the format of the query results to be specified. The rest of the paper is organized as follows: Section 2 briefly summarizes the related works. Section 3 describes the proposed system architecture and its components. The operation of the system that is how the query index is created, the operation of the finite state machine and the generation of the customized results are explained in Section 4. Section 5 gives the performance evaluation of the system. Finally Section 6 concludes the paper. 2. RELATED WORKS We now introduce some existing XML filtering methods. XFilter [1] was one of the early works. The XFilter system is designed and implemented for pushing XML documents to users according to their profiles expressed in XML Path Language (XPath). XFilter employs a separate FSM per path query and a novel indexing mechanism to allow all of the FSMs to be executed simultaneously during the processing of a document. A major drawback of XFilter is its lack of expressiveness. In addition, XFilter does not execute the XPath queries to generate partial results. As a result, the whole document is pushed to the user when a document matches a user’s profile. This feature prevents XFilter to be used in mobile environments because the limited capability of the mobile devices is not enough to handle the entire document. Also XFilter does not utilize the commonalities between the queries, i.e. it produces a FSM per query. This observation motivated us to develop mechanisms that employ only a single FSM for the queries which have common element structure.
YFilter [2] overcomes the disadvantage of XFilter by using a Nondeterministic Finite Automaton (NFA) to exploit prefix sharing. The resulting shared processing provided tremendous improvements to the performance of structure matching but complicated the handling of value-based predicates. Moreover, the ancestor/descendant relationship introduces more matching states, which may cause the number of active states to increase exponentially, and post-processing is required for YFilter.
FoXtrot [5] is an efficient XML filtering system which integrates the strengths of automata and distributed hash tables to create a fully distributed system. FoXtrot also describes different methods for evaluating value-based predicates. Its performance evaluation demonstrates that it can index millions of queries and attain an excellent filtering throughput. However, FoXtrot requires extensions of the query language to reach full XPath or the expressive power needed for user profiles.
The NiagaraCQ system [6] uses XML-QL to express user profiles. It provides scalability through query groups and caching techniques. However, its query grouping ability is derived from execution plans, which is different from our proposed method, and the execution times of its queries do not make such planning a feasible candidate for mobile environments.
Accordingly, our system aims to solve the above problems and reduce the filtering time as much as possible.
3. PROPOSED SYSTEM ARCHITECTURE
We first present a high-level overview of our XML filtering system. We then describe the XML-QL language that we use to specify the user profiles in this work. The overall architecture of the system is depicted in Figure 1.
Figure 1. Overall architecture of the system
User profiles describe the information preferences of individual users. These profiles may be created by the users themselves, e.g., by choosing items in a Graphical User Interface (GUI) via their mobile phones. The user profiles
  • 45. International Journal of Computer Science and Business Informatics IJCSBI.ORG ISSN: 1694-2108 | Vol. 8, No. 1. DECEMBER 2013 4 are automatically converted into a XML-QL format that can be efficiently stored in the profile database and evaluated by the filtering system. These profiles are effectively “standing queries”, which are applied to all incoming documents. Filtered engine first creates query indices for user profiles and then parses the incoming XML documents to obtain the query results. The results are stored in a special content list, so that the whole document need not be sent. Extracting parts of an XML document can save bandwidth in a mobile environment. After that, filtered engine sends the filtered XML documents to the related mobile clients. 3.1 Defining User Profiles with XML-QL XML-QL has a SELECT WHERE construct, like SQL, that can express queries, to extract pieces of data from XML documents. It can also specify transformations that, for example, can map XML data between Document Type Definitions (DTDs) and integrate XML data from different sources. Profiles defined through a GUI are transformed into XML documents which contain XML-QL queries as shown in Figure 2. <Profile> <XML-QL> WHERE<course> <major> <name>ICT</name> <program>First Year</program> <syllabus>$n</syllabus> </major></course> IN “course.xml” CONSTRUCT<result><syllabus>$n</syllabus></result> </XML-QL> <PushTo> <address>…</address> </PushTo> </Profile> Figure 2. Profile syntax represented in XML containing XML-QL query 3.2 Filtered Engine The basic components of the filtered engine are 1) An event-based XML parser which is implemented using SAX API for XML documents; 2) A profile parser that has an XML-QL parser for user profiles and creates the Query Index; 3) A Query Execution Engine which contains the Query Index which is associated with Finite State Machines to query the XML documents; 4) Delivery Component which pushes the results to the related mobile clients (see Figure 3).
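As a minimal illustration of the event-based parsing performed by component 1), the sketch below prints SAX-style events for the course.xml document used later in Section 4. It assumes R with the XML package installed; the actual filtered engine is not implemented in R, so this only mimics the stream of events the query execution engine consumes. Figure 3 below shows how the four components fit together.

# Illustrative only: SAX-style event parsing of course.xml with R's XML package
library(XML)
handlers <- list(
  startElement = function(name, attrs, ...) cat("start element:", name, "\n"),
  text         = function(content, ...) if (nzchar(trimws(content))) cat("characters:", content, "\n"),
  endElement   = function(name, ...) cat("end element:", name, "\n")
)
invisible(xmlEventParse("course.xml", handlers = handlers))   # emits the kind of event stream shown in Table 1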
Figure 3. Filtered engine
4. OPERATION OF THE SYSTEM
The system operates as follows: the subscriber informs the filtered engine when a new profile is created or updated; the profiles are stored in an XML file that contains the XML-QL queries and the addresses to which the results are transmitted (see Figure 2). Profiles are parsed by the profile parser component, and the XML-QL queries in the profile are parsed by an XML-QL parser. While parsing the queries, the XML-QL parser generates an FSM representation for each query if the query does not match any existing query group. Otherwise, the FSM of the corresponding query group is used for the input query. The FSM representation contains state nodes for each element name in the queries, which are stored in the Query Index. When a new document arrives, the system alerts the filtered engine to parse the related XML document. The event-based XML parser sends the events it encounters to the query execution engine. The handlers in the query execution engine move the FSMs to their next states after the current states have passed level checking or character data matching. Meanwhile, the data in the document which match the variables are kept in content lists, so that all the partial data necessary for producing the results are formatted and pushed to the related mobile clients when the FSM reaches its final state.
4.1 Creating Query Index
Consider an example XML document and its DTD given in Figure 4.
<!-- DTD for Course -->
<!ELEMENT root (course*)>
  • 47. International Journal of Computer Science and Business Informatics IJCSBI.ORG ISSN: 1694-2108 | Vol. 8, No. 1. DECEMBER 2013 6 <!ELEMENT course (degree, major*)> <!ELEMENT degree (#PCDATA)> <!ELEMENT major(name, program, semester, syllabus*)> <!ELEMENT name (#PCDATA)> <!ELEMENT program (#PCDATA)> <!ELEMENT semester (#PCDATA)> <!ELEMENT syllabus (sub-code, sub-title, instructor)> <!ELEMENT sub-code (#PCDATA)> <!ELEMENT sub-title (#PCDATA)> <!ELEMENT instructor (#PCDATA)> <root> <course> <degree>Bachelor</degree> <major><name>ICT</name> <program>First Year</program> <semester>First Semester</semester> <syllabus> <sub-code>EM-101</sub-code> <sub-title>English</sub-title> <instructor>Dr. Thiri</instructor> </syllabus> </major> </course>…</root> Figure 4. An example XML document and its DTD (course.xml) The example queries and their FSM representations are shown in Figure 5. Note that there is a node in the FSM representation corresponding to each element in the query, and the FSM representation’s tree structure follows from XML-QL query structure. Query 1: Retrieve all syllabuses of first year program for ICT major. WHERE <major> <name>ICT</><program>First Year</><syllabus>$n</> </> IN “course.xml” CONSTRUCT<result><syllabus>$n</></> Q1.1 Q1.2 Q1.3 Q1.4 Q1.1 Q1.2 Q1.3 Q1.4 FSM for Query 1 Query 2: Find the instructor name of the subject code EM-101.
WHERE <syllabus> <sub-code>EM-101</><instructor>$s</> </> IN "course.xml"
CONSTRUCT<result><syllabus>$s</></>
FSM for Query 2 (states Q2.1, Q2.2, Q2.3)
Query 3: Retrieve all the instructors for the first year program in the ICT major.
WHERE<major> <name>ICT</><program>First Year</><syllabus> <instructor>$s</></> </> IN "course.xml"
CONSTRUCT<result><syllabus>$s</></>
FSM for Query 3 (states Q3.1, Q3.2, Q3.3, Q3.4, Q3.5)
Figure 5. Example queries and their FSM representations
We also substitute the constants in a query with parameters to create syntactically equivalent queries, which leads to the use of the same FSM for them. The state changes of an FSM are handled through the two lists associated with each node in the Query Index (see Figure 6). The current nodes of each query are placed on the Candidate List (CL) of their related element name. In addition, all of the nodes representing the future states are stored in the Wait Lists (WL) of their related element name. A state transition in the FSM is represented by copying a query node from the WL to the CL. Notice that the node copied to the CL also remains in the WL so that it can be reused by the FSM in future executions of the query, as the same element name may reappear at another level in the XML document. When the query index is initialized, the first node of each query tree is placed on the CL of the index entry of its relevant element name. The remaining elements in the query tree are placed in the relevant WLs. Query nodes in the CL designate that the state of the query might change when the XML parser processes the relevant elements of these nodes. When the XML parser catches a start element tag, the immediate child elements of this node in the Query Index are copied from the WL to the CL if a node in the CL of the element satisfies level checking or character data matching. The purpose of the level checking is to make sure that this element name may reappear in the document.
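The paper does not give code for the query index, but purely as an illustration of the CL/WL bookkeeping just described, the following is a minimal sketch in R (the data layout and helper name are ours). The initial entries mirror the initial states of Figure 6 below, derived from the three example queries.

# Illustrative only: query index keyed by element name, each entry holding a Candidate List (CL)
# of current query states and a Wait List (WL) of future states
index <- list(
  major      = list(CL = c("Q1.1", "Q3.1"), WL = character(0)),
  name       = list(CL = character(0),      WL = c("Q1.2", "Q3.2")),
  program    = list(CL = character(0),      WL = c("Q1.3", "Q3.3")),
  syllabus   = list(CL = "Q2.1",            WL = c("Q1.4", "Q3.4")),
  "sub-code" = list(CL = character(0),      WL = "Q2.2"),
  instructor = list(CL = character(0),      WL = c("Q2.3", "Q3.5"))
)

# a state transition copies a query node from an element's WL to its CL; the node stays
# on the WL so the same element name can be matched again at another level of the document
promote <- function(index, elem, node) {
  if (node %in% index[[elem]]$WL)
    index[[elem]]$CL <- union(index[[elem]]$CL, node)
  index
}

# e.g. when a <major> start tag satisfies the checks for Q1.1, its child nodes become candidates
index <- promote(index, "name", "Q1.2")
index <- promote(index, "program", "Q1.3")
index <- promote(index, "syllabus", "Q1.4")

In the real system each CL/WL entry would carry a full query-node record (expected level, attribute predicates, variable bindings) rather than a plain label.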
Figure 6. Initial states of the query index for the example queries
4.2 Operation of the Finite State Machine
When a new XML document activates the SAX parser, it starts generating events. The following event handlers handle these events:
Table 1. Sample SAX API
An XML Document: <?xml version="1.0"> <course> <major> <name> ICT </name> </major> </course>
SAX API Events: start document, start element: course, start element: major, start element: name, characters: ICT, end element: name, end element: major, end element: course, end document
Start Element Handler checks whether the query element matches the element in the document. For this purpose it performs a level check and an attribute check. If these are satisfied, it either enables data comparison or starts variable content generation. As the next step, the nodes in the WL that are the immediate successors of this node are moved to the CL.
  • 50. International Journal of Computer Science and Business Informatics IJCSBI.ORG ISSN: 1694-2108 | Vol. 8, No. 1. DECEMBER 2013 9 End Element Handler evaluates the state of a node by considering the states of its successor nodes. Moreover, it generates the output when the root node is reached. It also deletes the nodes from CL which are inserted in the start element handler of the node. This provides “backtracking” in the FSM. Element Data Handler is implemented for data comparison in the query. If the expression is true, the state of the node is set to true and this value is used by the End Element Handler of the current element node. End Document Handler signals the end of result generation and passes the results to the Delivery Component. 4.3 Generating Customized Results Results are generated when the end element of the root node of the query is encountered. Therefore, content lists of the variable nodes are traversed to obtain content groups. These content groups are further processed to produce results. This process is repeated until the end of the document is reached. The results require to be formatted as defined in the CONSTRUCT clause. After all, the queries results are sent to the related mobile clients. 5. PERFORMANCE EVALUATION In this section, we conducted three sets of experiments to demonstrate the performance of the architecture for different document sizes and query workloads. The graph shown in Figure 7 contains the results for different query groups, that is, the queries have the same FSM representation but different constants, for the document course.xml (1MB). When the number of queries on the same XML document is very large, the probability of having queries with the same FSM representation increases considerably. Figure 7. Comparing the performance by varying the number of queries The above experiment indicates that our proposed architecture is highly scalable, and a very important factor on the performance is the number of
  • 51. International Journal of Computer Science and Business Informatics IJCSBI.ORG ISSN: 1694-2108 | Vol. 8, No. 1. DECEMBER 2013 10 query groups and that generating a single FSM per query group rather than per query is well justified. Figure 8. Comparing the performance by varying depth The depth of XML documents and queries in the user profiles varies according to application characteristics. Figure 8 shows the execution time for evaluating the performance of the system as the maximum depth is varied. Here, we fixed the number of profiles at 25000 and varied the maximum depth of the XML document and queries from 1 to 10. Figure 9. Execution time of queries for different number of query groups and document sizes Figure 9 shows the results for the execution times of queries which are varied the number of query groups and the size of different documents. The results indicate that performance is more sensitive to document size when the number of query groups increases. Therefore, this result also confirms the importance of the query grouping. As final conclusion we can say that FSM approach proposed in this paper for executing XML-QL queries on XML documents is a very promising approach to be used in the mobile environments.
  • 52. International Journal of Computer Science and Business Informatics IJCSBI.ORG ISSN: 1694-2108 | Vol. 8, No. 1. DECEMBER 2013 11 6. CONCLUSIONS Mobile communication is blooming and access to Internet from mobile devices has become possible. Given this new technology, researchers and developers are in the process of figuring out what users really want to do anytime from anywhere and determining how to make this possible. In addition, highly personalization is a very important requirement for developing SDI services in mobile environment as the limited capability of mobile devices is not enough to handle the entire documents. This paper attempts to develop an efficient and scalable SDI system for mobile clients based upon their profiles. We anticipate that one of the common uses of mobile devices will be to deliver the personalized information from XML sources. We believe that a querying power is necessary for expressing highly personalized user profiles and for the system to be used for millions of mobile users, it has to be scalable. Since the critical issue is the number of profiles compared to the number of documents, indexing queries rather than documents makes sense. We expect that the performance of the system will still be acceptable for mobile environments for millions of queries since the results of the experiments show that the system is highly scalable. 7. ACKNOWLEDGMENTS The authors wish to acknowledge Dr. Soe Khaing for her useful comments on earlier drafts of the paper. Our heart-felt thanks to our family, friends and colleagues who have helped us for the completion of this work. REFERENCES [1] M. Altinel and M. Franklin, “Efficient filtering of XML documents for selective dissemination of information,” Proc of the Int’l Conf on VLDB, pp. 53-64, Sept 2000. [2] Y. Diao, M. Altinel, M. Franklin, H. Zhang and P.M. Fischer, “Path sharing and predicate evaluation for high-performance XML filtering,” ACM Trans. Database Syst., 28(4), Dec 2003, pp. 467–516. [3] Extensible Markup Language, http://www.w3.org/XML/. [4] I. Miliaraki, Distributed Filtering and Dissemination of XML Data in Peer-to-Peer Systems, PhD Thesis, Department of Informatics and Telecommunications, National and Kapodistrian University of Athens, July 2011. [5] I. Miliaraki and M. Koubarakis, “FoXtrot: distributed structural and value XML filtering”, ACM Transactions on the Web, Vol. 6, No. 3, Article 12, Publication date: September 2012. [6] J. Chen, D. DeWitt, F. Tian and Y. Wang, “NiagaraCQ: a scalable continuous query system for internet databases”, ACM SIGMOD, Texas, USA, June 2000, pp.379-390. [7] XML-QL: A Query Language for XML, http://www.w3.org/TR/1998/NOTE-xml-ql- 19980819.